Aug 30
2011

GitHub, your network graph sucks!

If you follow me on Twitter you've probably seen me going off about this for a couple of days now.

I recently ran into an issue with a Ruby project for background processing. Unfortunately, this project has fallen out of fashion with the Ruby community, and the original maintainer has all but abandoned it. I'm the kind of person who would rather fix issues than invest time rewriting something from the ground up - so I went looking for a solution that might already exist.

Like any Ruby developer in this situation - I immediately went to GitHub to track down a fix. For the uninitiated - GitHub is a source code hosting site that allows developers to collaborate on open source projects. GitHub users "fork" a project, meaning they copy and modify it without changing the original source. A GitHub fanboy might tell you "just find an updated fork and use that". Unfortunately, doing that is extremely hard with a project like this.

Workling, the project in question, has 77 forks at current count. This makes the GitHub network graph not only slow, but extremely confusing.

If you're not familiar with it - the network graph shows all forks of a particular project. The dots on the lines represent source code modifications these people have made over time.

Sadly, this is the only tool GitHub offers to explore the forks, and there are a number of problems with it.

Problem 1 - The graph is slow and unreliable.

This network graph took up to five minutes to display - when it actually worked. I understand there are calculations to be made, and data to pull here... but really, is this acceptable?

Problem 2 - Which fork fixes my issue?

It's incredibly difficult to figure out exactly which of these projects solves my particular need. I have to mouse over each and every dot to figure out what people are doing, then click the dot to view the exact changes made.

From my limited viewing of everyone's forks, it seems this lack of visibility has caused multiple people to work on the same bug, each solving it in their own way. This is exactly the kind of wasted effort that the distributed open source model is supposed to cure.

Why can't I search all commits on forks from this project for specific keywords? This would go a long way in helping my problem (and everyone else's).
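A rough sketch of what such a search could look like, in Python. It assumes the commit messages for each fork have already been fetched (for example through the GitHub API) into a plain dictionary. The fork names, SHAs, and messages below are made up for illustration, and `search_fork_commits` is a hypothetical helper, not something GitHub provides.

```python
def search_fork_commits(commits_by_fork, keywords):
    """Return (fork, sha, message) for each commit whose message mentions any keyword.

    commits_by_fork: dict mapping a fork name to a list of
    {"sha": ..., "message": ...} dicts (an assumed, simplified shape).
    """
    wanted = [keyword.lower() for keyword in keywords]
    hits = []
    for fork, commits in commits_by_fork.items():
        for commit in commits:
            message = commit["message"]
            # Case-insensitive substring match against every keyword
            if any(keyword in message.lower() for keyword in wanted):
                hits.append((fork, commit["sha"], message))
    return hits


# Hypothetical sample data, not real forks or commits
sample_commits = {
    "alice/workling": [
        {"sha": "a1b2c3d", "message": "Fix reconnect after queue timeout"},
        {"sha": "d4e5f6a", "message": "Update README"},
    ],
    "bob/workling": [
        {"sha": "0f9e8d7", "message": "Reconnect to the queue when the socket drops"},
    ],
}

matches = search_fork_commits(sample_commits, ["reconnect"])
for fork, sha, message in matches:
    print(fork, sha, message)
```

With the sample data, both the `alice` and `bob` forks show up as having touched the same problem, which is the duplicated effort a search like this would expose.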

Problem 3 - Which fork has the most momentum?

A rational coder would suspect that the fork with the most watchers, commits, or forks would be the new "unofficial" project everyone is working from.

I wasn't even sure how to begin choosing the proper fork to try out and continue working on. GitHub has stats on the number of "watchers", "commits", and forks for each of these. Why aren't these numbers displayed here? At least show them on a hover of the fork name.
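To illustrate how little it would take, here is a minimal Python sketch that ranks forks by those stats. It assumes each fork is a dictionary with the fields `full_name`, `stargazers_count` (watchers), `forks_count`, and `pushed_at`, which is how the GitHub REST API appears to describe a repository; treat the field names as assumptions. Commit counts are left out because the fork listing does not seem to include them. The sample numbers are invented, and `rank_forks` is a hypothetical helper.

```python
def rank_forks(forks):
    """Sort forks by watchers, then forks-of-the-fork, then most recent push.

    pushed_at is assumed to be an ISO 8601 UTC timestamp string, which
    sorts chronologically when compared as plain text.
    """
    return sorted(
        forks,
        key=lambda fork: (
            fork["stargazers_count"],
            fork["forks_count"],
            fork["pushed_at"],
        ),
        reverse=True,
    )


# Hypothetical sample data, not real stats
sample_forks = [
    {"full_name": "alice/workling", "stargazers_count": 3,
     "forks_count": 0, "pushed_at": "2010-05-01T12:00:00Z"},
    {"full_name": "bob/workling", "stargazers_count": 12,
     "forks_count": 4, "pushed_at": "2011-06-15T08:30:00Z"},
    {"full_name": "carol/workling", "stargazers_count": 3,
     "forks_count": 1, "pushed_at": "2009-11-20T17:45:00Z"},
]

ranked = rank_forks(sample_forks)
for fork in ranked:
    print(fork["full_name"], fork["stargazers_count"], fork["forks_count"])
```

The ordering here is only one possible definition of "momentum"; weighting recent pushes more heavily than watchers would be just as defensible.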

I ended up trying the forks with the most commits until I found one that solved my issue. I think I had to go through about three before I found the one I'm moving forward with.

Does GitHub even care?

The issues I bring up could easily be solved by better information visualization and an interaction designer's eye.

I've tweeted @GitHub, but nobody really seems to care. Maybe the problem isn't widespread enough, or perhaps the code heads over there expect someone to write an API client to handle this type of issue. I'm not quite sure.

Whatever the issue, GitHub - your network graph sucks!

Written by Seth B

As Principal of Subimage LLC, Seth spends most of his days improving Cashboard. Occasionally he finds time to write about music, design, startups, and technology.

Tagged: design, interface, critique