Saturday, January 27, 2007

MIXing it up

Is there any value to reading The Art of Computer Programming anymore?

The series is a classic, and that means it carries loads of baggage along with its merits. I don't know of any universally accepted definition for a classic other than the tongue-in-cheek one:

A classic is something that everybody wants to have read and nobody wants to read. -- Mark Twain

It sure strikes close to home for someone with a liberal arts education. Most software developers I know speak of the book with a mix of reverent respect and pitying affection; "it was an amazing achievement for its time, but it's so outdated now that it's more of a historical document than a useful guide for modern developers." My training has conditioned me to distrust this kind of attitude, if for no other reason than I spent my entire college career reading books that fit just this description. Plus, when I want to understand something through a book, I want the source; I hate textbooks and Reader's Digest-style summaries. I don't need any intermediaries between me and the original idea. I can handle it on its own merits. (Well, at least I SHOULD be able to.)

So I decided I wanted to read it. I got it for Christmas, and I'm up to about page 150. The first volume starts with some really intense math, and although I was always good at math, I'll admit frankly that I didn't understand most of it. Most "classics" are like that, though; the first time through your reaction is "Huh?", and the juicy nuggets only reveal themselves through repeated readings. So I pressed on and was treated to MIX, the ideal machine Knuth designed to illustrate algorithms in code.

MIX strongly resembles computers of the 60s, and its guts are unlike those of any modern machine. It's got registers and memory cells but no operating system; programs are written directly on the hardware layer in raw words of five bytes plus a sign. Bytes in MIX are not eight binary digits; the machine can be binary or decimal(!), and the only guarantee the programmer has is that a byte can hold at least 64 distinct values (0-63) and at most 100 (0-99). This is weird enough, as I've been thinking in binary forever. When dealing with values 64-99, you have to use two bytes to avoid undefined results; on a binary machine a single byte tops out at 63, so if you store 90 and only copy one byte, you don't get 90 back.
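To make the byte-size business concrete, here's a rough Python sketch of my own (not MIX code, and nothing from the book). It splits a number into bytes the way a machine with a given byte size would, treating each byte as one digit in base 64 or base 100. The function name to_mix_bytes and its parameters are made up for illustration.

```python
def to_mix_bytes(value, byte_size, width):
    """Split a non-negative value into `width` bytes, most significant first.

    `byte_size` is the number of distinct values one byte can hold:
    64 on a binary MIX, 100 on a decimal one (anything in between is allowed).
    """
    if not 64 <= byte_size <= 100:
        raise ValueError("a MIX byte holds between 64 and 100 distinct values")
    if value < 0:
        raise ValueError("this sketch ignores the sign of the word")

    digits = []
    for _ in range(width):
        # Peel off one byte at a time, least significant first.
        value, digit = divmod(value, byte_size)
        digits.append(digit)

    if value:
        # Something is left over: the value did not fit in `width` bytes.
        raise OverflowError("value does not fit in the requested number of bytes")
    return digits[::-1]


if __name__ == "__main__":
    print(to_mix_bytes(90, 64, 2))   # binary machine:  [1, 26]
    print(to_mix_bytes(90, 100, 2))  # decimal machine: [0, 90]
```

On the decimal machine 90 sits in a single byte, but on the binary one it spills into a second byte, and asking for it in one byte (to_mix_bytes(90, 64, 1)) raises an error. That's why a program that wants to run on both kinds of MIX can't count on one byte holding anything above 63.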

I haven't gotten to program this thing seriously yet (there's an assembly language called MIXAL that comes next), but it's radically different from a higher-level language. The machine does next to nothing for you. You get to implement algorithms at the lowest level, which of course is the point, but I haven't implemented linked lists in assembly before.

Anyway, is implementing basic algorithms on an ideal machine really going to make me a better programmer? I don't know. I need to get a little further.

Where the hell does this class go?

Software development, however you want to define it, is really hard.

I've been doing this professionally for 10 years now, and recently I've been struck forcefully by how, well, bad a lot of software is. It breaks, it crashes, it doesn't do what it needs to, it decides to destroy unrelated data innocently minding its own business, it lies to your face about what will happen when you push the button. It doesn't matter who the developers are, or how big the team is, or how much time and effort are put into it; the goal, asymptotically approached, is making something that doesn't suck *that* bad.

Some developers are idiots. I have my moments, and I think there's a lot of truth in the simple "try not to make mistakes" approach as opposed to "find the best possible solution". But some are just straight-up, envy-inducing geniuses, and they work on really powerful and complex systems that wind up being terrible. How much effort has Microsoft spent on developing the Windows family? Is the quality of the end product proportional to the amount of effort invested in it? How many brilliant, motivated people worked on it through the years?

I got The Art of Computer Programming for Christmas, and I eagerly tore into it. All right! The undisputed classic that lays out the guts of classical computer science! Surely in here I'll find my answers, clearly and cleverly presented in inky-black awesomeness, explaining why I can't architect an entire J2EE application correctly on the first attempt!

...yeah, it does sound pretty dumb, but I've always been the kind of person who figured that if he read just one more book, or if he found the right teacher, or if he learned the proper technique, then the formerly difficult and frustrating task would become easy, clear and fun. There was a time when this was true. When I first taught myself how to program, I was able to make the computer do damn near anything I wanted (so I thought). This part of Real Programmers Don't Use Pascal strikes too close to home for comfort:

When I got out of school, I thought I was the best programmer in the world. I could write an unbeatable tic-tac-toe program, use five different computer languages, and create 1000 line programs that WORKED (Really!). Then I got out into the Real World. My first task in the Real World was to read and understand a 200,000 line Fortran program, then speed it up by a factor of two. Any Real Programmer will tell you that all the Structured Coding in the world won't help you solve a problem like that-- it takes actual talent.

Well, I don't know whether I've got talent or not. I know I'm not a genius, because if I were I'd be off creating awesome things already. And I want to succeed in the Real World, not in the Happytime Sugar World Where No Challenge Trips You While You Saunter About Listening To The Sound Of How Awesome You Are. If a genius is just someone with supreme talent, like a Mozart or a Nietzsche, they don't need to develop skills in the same way we mortals do.

I'll save what I've found in TAOCP for the future, but suffice it to say that it wasn't exactly what I was expecting.

I read other programs and I can discern their structure; if it's Java (and it usually is, since that's my job), I can grok a method on one reading. I can understand an entire class in 15 minutes or so. After an hour I can fix bugs in the javadoc, write unit tests, and leave it a much better place than when I found it. With a whole package, it's harder to do. Some packages are just a velcro strip for sticking loosely related functional units together (java.text), while others provide a (hopefully simplified) abstraction over a much larger problem (javax.swing). The former are much easier to understand than the latter, which is fine since the latter are much more complex. But what do I do when I have to write something that's somewhere in the middle? How do you go top-down in architecture as opposed to bottom-up?

I want to be a better developer. To do that I must become at least competent in top-down design. I can't do it well, so I look for help in books. No help there. Can't learn it from geniuses since they don't have to think about what they're doing in the same way that I do. Can't read existing code because even if it is designed well it doesn't tell me anything about where it came from. So... what the hell do I do?