Stephen Ramsay

The Man-Finger Aftermath

There was a spirited discussion about The Mythical Man-Finger yesterday in the comment thread and on Hacker News, and I want to thank everyone for some very thoughtful insights.

A couple themes emerged that I want to touch on.

In my view, the power of textual interfaces has been neglected, because we’ve all more or less accepted the gui as the best way to do things — especially for “non-expert” users. One of the things that’s potentially confusing in this conversation is that we tend to conflate the unix command line with textual interfaces in general. That conflation is a defect in my own treatment of the subject, and I probably should have done a better job separating the two. I think there are many excellent textual interfaces in the unix environment, but taken as a whole, it’s probably not cogent to suggest that the unix cli, as currently conceived, is a model interface for ordinary computing. I still insist that some very routine, userland things are considerably easier to do on the command line than with a gui, but that’s not the same as saying that we would all be better off in Bash.

For me, though, the most fascinating part of the discussion thread involved trying to tease out what is lacking in the unix cli. Amanda French began by noting (and her thoughts were widely echoed) that it’s easier to “figure it out” with a gui. If you want to do something, you can usually poke around the menus and get something going (even if you’ve never used the program before). I think there are limits to this. Very complicated graphical systems like Photoshop can tie you in knots just as efficiently as any complicated textual interface; the thing you want to do is greyed out because you didn’t open the dialogue to set the alpha-channel… But I take the main point. guis are a lot easier to approach if you’re a new user. Many noted that the “play” icon on a desktop will often display a musical note, a speaker, or the standard transport-control symbol for “play”; the blinking cursor gives no indication whatsoever of how to proceed with anything.

On the other hand, that same approachability may become a liability when you get used to the program. Once you know what you’re doing, the endless mousing around can feel unduly laborious. Many complicated systems have “macro” facilities (a textual interface) for automating tasks. They also often have some kind of “record action” facility, which I (and, I suspect, many others) find frustratingly difficult to use — not because such features don’t work, but because it’s hard to get through a complete mouse-click sequence without making a mistake. I suspect such systems are not used as much as they might be, because not much thought has been devoted to the usability of such interfaces (they’re for “experts,” after all).

Some noted that it is precisely the sense of interface affordance that is lacking, and this seems to me one of the most important points. How do I know what is possible? Several mentioned autocompletion as a partial solution, and noted that such systems are not nearly as deeply implemented as they might be. Is it possible to imagine systems that look at your commands and show you options that are relevant to the particular thing you’re trying to do (based on analysis of the current state of the command)? What about hybrid systems that combine text with some visualization of context-sensitive possibilities?
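To make that question a bit more concrete, here is a minimal sketch in Python — my own illustration, not something proposed in the thread — of a tab completer that looks at the tokens already typed and offers only candidates that make sense at that point. The “convert” command, its flags, and their values are all invented.

```python
# A minimal sketch of context-sensitive completion: the completer inspects
# the state of the line typed so far and offers only the options that are
# relevant to that state. The command and its flags are hypothetical.
import readline

COMMANDS = ["convert"]
# Which completions make sense after which token.
OPTIONS = {
    "convert": ["--to", "--quality", "--output"],
    "--to": ["png", "jpeg", "webp"],
    "--quality": ["low", "medium", "high"],
}

def complete(text, state):
    """Offer candidates based on the previous token, not a flat word list."""
    tokens = readline.get_line_buffer().split()
    if text and tokens:
        tokens = tokens[:-1]              # drop the partial word being completed
    if not tokens:
        candidates = COMMANDS             # nothing typed yet: offer command names
    else:
        # Complete values for the last flag if we know them; otherwise offer
        # the flags that belong to the command at the start of the line.
        candidates = OPTIONS.get(tokens[-1]) or OPTIONS.get(tokens[0], [])
    matches = [c for c in candidates if c.startswith(text)]
    return matches[state] if state < len(matches) else None

readline.set_completer_delims(" \t\n")    # treat "--to" as a single token
readline.set_completer(complete)
readline.parse_and_bind("tab: complete")

if __name__ == "__main__":
    line = input("demo> ")                # try: convert --to <TAB>
    print("You typed:", line)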
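```

A real system would obviously need a far richer model of command state, but even this toy version shows options appearing as a function of context rather than as a flat word list.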

I suppose my main point here (now offered with the benefit of everyone else’s reflections) is that we would do well to approach these limitations as research problems, and not as settled truths about hci. No one denies (I think) that there are many advantages to textual interfaces along with various deficiencies. The problem, as I see it, is that we’ve largely consigned this entire method of interaction to the dustbin of history (and to the workstations of so-called “power users”).

Several mentioned Ubiquity — a system that I greatly admire, and that has inspired me to think about different ways to interact with systems that are right now “gui-only.” I’ve also been inspired by Quicksilver (and, on Linux, GNOME Do and others) — a hybrid text-graphical app launcher that is very high on the list of things I wish I’d thought of.

But actually, the application that got me on this jag in the first place is, of all things, Xtranormal. Xtranormal is extremely clever, of course, and many people have done clever (not to mention hilarious) things with it. Not many, though, have remarked on its quite radical approach to interface. “If you can type, you can make movies” (as it says on the home page), and it’s really true. I doubt that people spend a lot of time trying to figure out how to use it, and yet it is, at some basic level, a “scripting” language that uses batch execution.

When I first saw that — and then saw a group of technologically skittish K-12 teachers use it to great effect — it really changed the way I think about hci. Granted, it’s not a command line. But it also doesn’t do the usual thing of trying to make sure that the entire system is picture-and-index-finger. Some people in the thread noted that some systems are just impossible without guis (presentation software, spreadsheets, video editing, 3d modeling), but I have to admit that I’m not sure that’s the case. I, for one, would like to see a “script”-based equivalent to PowerPoint (or even iMovie) that works like Xtranormal; or a 3d system that uses a simplified dsl; or an alternative to spreadsheets (an already pretty textual interface) that tries to incorporate natural language awareness. In fact, I’d like to devote the rest of my life to building systems like that.
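To give a flavor of what such a “script”-based slide tool might feel like, here is a toy sketch. It is entirely hypothetical: the verbs slide, say, and show are invented, and have nothing to do with what Xtranormal or PowerPoint actually accept.

```python
# A toy, invented mini-language for slides: plain text in, slide structure out.
SCRIPT = """\
slide: Why textual interfaces?
say: Typing can be faster than mousing.
say: Scripts are repeatable and shareable.

slide: Open questions
say: How do we make affordances visible?
show: affordances-diagram.png
"""

def parse(script):
    """Turn the screenplay-like text into a list of slide dictionaries."""
    slides = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        verb, _, rest = line.partition(":")
        if verb == "slide":
            slides.append({"title": rest.strip(), "bullets": [], "images": []})
        elif verb == "say" and slides:
            slides[-1]["bullets"].append(rest.strip())
        elif verb == "show" and slides:
            slides[-1]["images"].append(rest.strip())
    return slides

for number, slide in enumerate(parse(SCRIPT), start=1):
    print(f"Slide {number}: {slide['title']}")
    for bullet in slide["bullets"]:
        print("  -", bullet)
    for image in slide["images"]:
        print("  [image:", image + "]")
```

The point is not this particular vocabulary, but the Xtranormal-like bargain: if you can type a description, the system does the visual work.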

And so it has been bracing to read these comments, which seem to me very perceptive in the way they crystallize both the power and the remaining challenges of textual interfaces.
