[I started writing this in 2016 and unearthed it amongst some old drafts. But 6 years have only intensified my feelings here, so here it is updated and finished]
I’m sure you’ve all heard the statement from Arthur C. Clarke that “Any sufficiently advanced technology is indistinguishable from magic.” But I had an exchange which convinced me of a variation of that: “Any sufficiently hyped technology is indistinguishable from religion.”
The case in point was a discussion with someone who seemed to think that Git was the only version control system which had checkin hooks. I informed him that every modern version control system, indeed, every one of them I’ve used in the last two decades, has such things (in one way or another). Yet he repeated the same thing later on in the conversation, as if unable to process this new piece of information which contradicted established dogma.
Another interaction was with someone who asserted that Git was “more secure”. When I questioned him as to exactly how it is more secure, he was unable to articulate anything meaningful. Then I pointed out that it was trivial to forge checkins (and even demonstrated it in front of him by doing a checkin in his name), but this didn’t faze him, and he returned, mindlessly, to his original point.
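A minimal sketch of how such a forgery can work, assuming a repository where commit signing is not enforced. The name and email below are hypothetical, and the original demonstration may have been done differently; the point is only that Git records whatever identity it is told.

```shell
# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
git init --quiet "$repo"

# Hypothetical identity: Git does not verify that it belongs to the person typing.
git -C "$repo" \
    -c user.name="Someone Else" \
    -c user.email="someone.else@example.com" \
    commit --quiet --allow-empty -m "A commit I never made"

# Show who the history claims made the commit.
git -C "$repo" log -1 --format='%an <%ae>'
# prints: Someone Else <someone.else@example.com>
```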
I have nothing against Git on the whole; I use it myself every day. I have minor gripes with it, mainly having to do with the arcane, counter-intuitive interface (like this). But my biggest gripe is the religious fervor with which it is hyped, the irrational one-size-fits-all, Maslow’s Hammer wielding, be-all and end-all, the perfect final pinnacle of version control. For ever and ever. Amen!
There have been many “religious wars” in the software world over the years. Long ago I defected to the Emacs camp, so I know it well. But with all of those wars, there were always competing technologies; differing views on how to approach a problem. In this case, there is only one left; all others have been shouted down into irrelevancy. At the time of Git’s ascendance, I was managing at least 6 different version control systems, and I thought this would just be one more to add to the mix. But I was mystified as my team was quickly sidelined as everyone mindlessly rushed to Git.
I am a believer in using the right tool for the right job, and Git is certainly the right tool in a lot of cases, but not every problem is a nail. Sometimes you need different tools.
I notice that I wrote about this earlier, but I am now taking the long view of this: every ascendant technology eventually declines as the next shiny thing attracts everyone’s attention. I look forward to the day something new comes along and pushes Git to the sidelines.
I ran into a situation today which was quite astonishing: Git creates new repositories in an inconsistent state. Until the first checkin is done, the repository has HEAD pointing to a nonexistent location. I discovered this because I was replicating several other teams’ Git servers for backup purposes. In experimenting with this I came up with this reproduction:
$ git init --bare foo.git
Initialized empty Git repository in /tmp/foo.git/
$ git clone --bare foo.git foo2.git
Cloning into bare repository 'foo2.git'...
warning: You appear to have cloned an empty repository.
$ cd foo2.git
$ git fetch
fatal: Couldn't find remote ref HEAD
fatal: The remote end hung up unexpectedly
While doing clone and fetch from an empty repository is a silly thing to do, it isn’t worthy of a fatal error. No other systems I work with have this flaw. So now I have to modify my replication scripts to detect such repositories and avoid them.
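A minimal sketch of what that detection might look like, assuming the repositories are bare and sit together in one directory. The function names are hypothetical and this is not the actual replication script. It treats a repository as empty when it contains no refs at all, which is the state a freshly initialized repository is in.

```shell
# Succeeds (exit 0) when the repository at $1 has no refs at all, i.e. nothing
# has ever been checked in or pushed. Errors are suppressed, so a path that is
# not a Git repository is also reported as empty and therefore skipped.
is_empty_repo() {
    [ -z "$(git --git-dir="$1" for-each-ref --count=1 2>/dev/null)" ]
}

# Print each bare repository under directory $1 that has at least one ref,
# one path per line, so a replication loop can safely fetch from it.
replicable_repos() {
    for repo in "$1"/*.git; do
        [ -d "$repo" ] || continue
        is_empty_repo "$repo" || printf '%s\n' "$repo"
    done
}
```

A replication script could then read the output of `replicable_repos` line by line and run its clone or fetch only on those paths, leaving empty repositories to be picked up once they receive their first checkin.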
I spotted this today in a discussion about Subversion, and a workaround for a situation which ended up corrupting the workspace:
Arr… the reason I’m trying to get us to switch to Git. Less of this funny business.
I will admit, I’m not a big fan of Git. But my biggest problem with it is the born-again fervor of some of its fans like the one above. I will freely admit that Git has some advantages over Subversion (but there are disadvantages as well). But to claim that workspace corruption and the attendant workarounds are something Git (or any version control system) is immune to is an indicator that someone drank too much Kool-Aid.
Maybe it’s just me, but I’ve been working with computers so long that I believe that every piece of software that has ever existed (or ever will exist) has its share of “funny business”. But salesmen and evangelists could never admit such a thing.
The original issue I ran into had to do with the dysfunctional practice of checking in enormous binary files. Every version control system is going to have issues with this, though to varying degrees. In the course of researching the issue, I found this passage about Git:
The primary reason git can’t handle huge files is that it runs them through xdelta, which generally means it tries to load the entire contents of a file into memory at once. If it didn’t do this, it would have to store the entire contents of every single revision of every single file, even if you only changed a few bytes of that file. That would be a terribly inefficient use of disk space, and git is well known for its amazingly efficient repository format.
I was with him up until that last phrase. We were having a serious technical discussion, and suddenly a salesman crashed the party! This “amazingly efficient” repository format is largely thanks to xdelta. The salesman neglected to mention that xdelta is the same mechanism used by Subversion. We could certainly have a serious, quantitative, technical discussion about the tradeoffs of various mechanisms for storing versioned data, or about the ways to manage those deltas. But something tells me that the salesmen and evangelists will crash that party as well.
That last phrase could have been more accurate and less obnoxious had it been phrased “and any modern version control system worth using would not do so.”
I’m not sure why I didn’t think of this earlier, but I just put Defensive Omnivore Bingo onto GitHub. So if you have any contributions, feel free to send me a pull request. Of course, email still works.
After attending Subversion & Git Live 2013 in Boston, I was thinking that I need to get this written down, and where better than a rarely updated and even more rarely read blog?
In my job, I work with numerous version control systems. When I was first exposed to Mercurial and Git, I realized they very nicely solved a problem which had plagued the free software community for years.
Several years ago, I wanted to contribute fixes to p42svn, which is in a Subversion repository (though what I am about to say applies to most any pre-DVCS system). I created my workspace, and then set to work on my fix. When I was done, I could not check in, since I didn’t have checkin access, and didn’t expect to be granted such access since the owner of the project had no idea who I was or whether I could be trusted to check in to the repository. So, I had to do a diff and then mail that to the owner. But for my next change it got trickier. Since I only wanted to send the owner the changes I had made since the last patch, I had to manually save a copy of what I sent last time (since I couldn’t check in). At this point I could have just created my own repository, and started checking in, which would have simplified this. Fortunately, I never needed to do that as the owner kindly gave me access to the repository.
But now I started experiencing the other side of this problem. I found patches on the forum for a number of fixes. But with each patch, I had a puzzle: what version did each one base their work on? If they didn’t say, I had to either ask, or work it out myself, which is quite tedious. Once I worked that out, I would create a branch, run patch and check in their change. Now I had their change recorded in relation to the history of the code, and I could properly evaluate the change and merge it in. Of course, the first steps could have been entirely avoided if the change had been checked into a branch directly by the contributor. But they can’t do that since I don’t know who they are or if they can be trusted (sound familiar?).
The beauty of Mercurial and Git is that a potential contributor can make their changes, check them in and then “push” them to me, thus preserving the relationship of their change to the rest of the code. Of course, this is just what GitHub and its ilk enable. This is fantastic! All that monkeying around with diff and patch is gone! Now I can focus on the important problems.
But, here’s the rub. In the enterprise, this is not a problem we face in any way. First off, it is exceedingly rare for someone on an unrelated project to contribute a fix to my code. Even if such a thing did happen, I would know exactly who that person is. So if their ‘fix’ fouls up the code, contains malicious code, etc, they can be held accountable for that.
In my opinion, being able to solve the unknown contributor problem discussed above is truly revolutionary and is the most important reason for using Mercurial or Git. But it’s a problem we don’t have in the enterprise.