Internet, discourse and interaction potential

Keynote to appear at First Asia Pacific Conference on Human-Computer Interaction.

Harold Thimbleby
Middlesex University



The conventions of drama present the planned as spontaneous, stimulating the audience to imagine greater interaction potential than there is. This paper argues for a distinction between design for demonstration and design for interaction. The distinction is needed on the Internet, which supports the greatest range of discourse -- spontaneous to planned -- and therefore wide scope for confusing dramatic presentation with effective interaction.


Design, discourse, drama, human-computer interaction, hypertext, scenarios

1. Introduction

There are two contrasting modes of human communication, which correspond to the spoken and written word. In spoken discourse, communication is sequential, spontaneous and conversational. In written discourse, communication is planned, and neither its creation nor consumption need be sequential. Interesting interaction arises when the source of communication is in one mode, but it is received in another mode. Historically, drama and recording spanned these styles of discourse; more recently, the Internet and hypermedia span them.

HCI -- human computer interaction -- is about design, which is planned interaction: it is concerned with the effective planned 'writing' of a design to be later consumed as 'conversation.' HCI, viewed like this, is a transformation from one discourse style to another. Yet so is drama; arguably, this leads to confusion when an interactive system is demonstrated. Is there more to the demonstration than is presented, or is the demonstration all there is? Drama encourages the audience to believe there is more than there is on the surface; in contrast, HCI requires capabilities beyond the demonstration. This potential interaction must be carefully planned, to be substantial rather than drawn out of the user's imagination. Although good HCI can be demonstrated, a good demonstration does not imply good HCI.

Despite the obviousness of this, it seems many demonstrations of prototypes are converted to products too speedily. Understandably, managers, after viewing an impressive performance, may tighten production schedules. Users are left with a void between their expectations (after all, the performance exercised their imagination!) and what they can actually do.

HCI is about interaction, and interaction can be characterised as moving around in a conceptual network of possibilities -- just like hypermedia. We come full circle with the World Wide Web as hypermedia, blurring the distinction between users and designers. All are communicators. Despite the scope for interaction breakdown on an unprecedented scale, the Web certainly enables humans to engage in new styles of discourse. As HCI professionals we must try to ensure that users are empowered to explore this new medium, their relation to it, and their roles in it, without interaction breakdowns escalating beyond control.

"Writing, when properly managed, (as you may be sure I think mine is) is but a different name for conversation." Tristram Shandy, Book I.

2. Discourse and the Internet

Humans have been communicating with each other ever since there have been humans. Writing was invented around 5000 years ago, and had a very different use from spoken language. It was much more formal, used for records, religious material and legal purposes. However, the status of written language changed dramatically after the invention and widespread use of movable type. Before printing, scribes spent all their effort copying authorities, leaving no spare energy for their own thought; now people could pursue truth rather than spend their time laboriously copying, because printing was so much more efficient. A much more critical attitude developed, leading to the Renaissance and the overthrow of medieval thinking. The technology of printing enabled many radical developments; even such 'simple' ideas as page numbers and indexes became practical -- because every time a book was printed it had the same pagination. Partly thanks to the ease of printing illustrations, personal knowledge became based, not on rote verbal learning, but on flexible visual abilities (Thimbleby, 1991).

If we ignore broadcast (newspapers, radio, TV) the next major innovation in communication was the telephone. The telephone did not change anything; it just allowed people talking to each other to be further apart. In fact, many countries passed laws to emphasise the distinction between spoken (spontaneous, transient) and written (formal, permanent) language. In England, for example, it is normally illegal to record a telephone conversation. In other words, the law -- that is, society -- wishes to distinguish between the transient spoken form of communication and the written record. There is something different about the use of recorded (usually written) language that changes its status. A tape recording of a private conversation on a telephone is an invasion of privacy. The law recognises this; or, put another way, society finds the conventions important. People like to place clear boundaries between different forms of communication.

Special forms of communication were developed for special purposes. For example, drama was developed to present the intimate as public, to present the spontaneous as repeatable, to engage the audience in issues they could literally walk away from at the end of a performance. An actor speaking personal thoughts aloud uses a convention to communicate to the audience; whereas a person walking along a road in a public space and talking aloud is easily thought to be mad. Indeed, talking to yourself is the first sign of madness -- unless you are on stage, or using a mobile phone (in both cases you are talking to other people).

Notably, the broadcast medium of television naturally adopted the conventions of drama (though, earlier, it was with some effort that cinema separated from stage [1]). Formal material on television is rare, and is usually taken especially seriously -- with few exceptions, it is only really successful when it is edutainment or politically motivated investigative journalism.

In the last few years the Internet introduced the first truly new form of human discourse for five thousand years. Email, just one style of interaction on the net, is a mixture of spontaneous and planned written communication. So-called 'flames' are instant thoughts captured and reinterpreted as formal written accusations. The World Wide Web allows individuals to create home pages, 'speaking aloud' about themselves to the whole world. In real life, one is mad to speak to nobody. Yet on the net, it is legitimate to speak to nobody now, because the spontaneity is recorded and can be listened to later, perhaps by millions around the world. It may be a coincidence that many personal home pages are crazy, or maybe people import dramatic conventions of self-expression into the new medium. Nevertheless there are some things in the Web that are serious, and they try to be taken at face value, not as drama that can be taken or left.

The net is a plastic medium that merges spontaneous, recorded, broadcast and personal discourse. I can type about as fast as I can speak; is, then, my email spoken or written? Sometimes it is one, sometimes the other. The recipients of my email may read it as spontaneous spoken or as formal written. They can do new things with my email that cannot be done with conversation: they can forward it to others -- it can be treated as a recorded object; they can save it and reply months later, reinstating the historical conversation to the present. To substitute for the conversational manoeuvres of speech, new written conventions are employed: such as ':-)' to label humour; turn-taking by '>' quoting (i.e., making the history of the conversation explicit); and identity by digital signatures (cf. Zimmermann, 1995).

With email I can send a simple yes or no, or I can send an entire book. Who would appear on television just to say no? Who would read a book over a telephone? The flexibility is exactly why email is so compelling.

The distinctions we are making are important. We give some examples:

The table shows (and briefly expands on) the major points we have made; for further discussion of text linguistics see standard references (Crystal, 1987). For clarity, we ignored self-referential expression: obviously humans are complex, and if we considered humour, say, we would be allowed to break all sorts of conventions with impunity! We can also raise ourselves above a discourse to discuss it. The common "this conversation is getting us nowhere" or Laurence Sterne's playful -- but planned! -- interactions with the reader of The Life and Opinions of Tristram Shandy, Gentleman (1759-67) are examples.

Spoken
  Source: Spontaneous. Private. Composed in order, no revocation, except by more speech. Free. Always new and free of history. Requires a present audience. The speaker represents themselves.
  Destination: Chosen by speaker and hearer. Transient. Interactive. Involved, though hard to reflect without making the reflection part of the conversation. Cannot be reused. Can only be consumed in given order. Free.
  Exceptions: Drama is rehearsed spontaneity, and is often presented to a large, non-interactive audience. Recorded speech is not interactive, can be reused, and is typically copyrighted.

Written
  Source: Considered. Can be edited before commitment to communicate. Formal. Expensive. Steeped in history, often explicitly building on other written sources. Done without a present audience. The writer can represent many characters.
  Destination: Chosen by readers. Not interactive. Can be saved and reused (though there are copyright conventions). Can be consumed in any order. Priced. Easy to reflect and argue. Destroying written information is highly symbolic.
  Exceptions: Transcripts are written recordings of speech -- they look chaotic (Chapanis, 1981). Speeches and plays are written but intended to be used as spoken. (Hypertext is discussed in the body of the paper.)

Net
  Source: Any of the above. Typing is writing at the speed of speech.
  Destination: Any of the above, though destroying electronic information has little significance.
  Exceptions: None.

Though separation in time is important, spatial distance is not; indeed telephone technology permits both spoken discourse and written, using faxes, both as if speaker and listener were adjacent. The hiding of distance may encourage some users to under-estimate cultural diversity: engaged in interacting with like-minded people around the world, they may think that this select group of individuals is more representative than it is. Certainly the notion of 'local' neighbourhood is widened.

When we look at the discourse on the Internet there is surprising flexibility. Social conventions have not been established. English law -- even if we suppose this to represent the UK's distilled social conventions -- seems quite happy to hide behind the excuse of not understanding the 'new' technology. More positively, the fluidity is something we should exploit so that new forms of discourse can develop that go beyond current socio-legal traditions, and which may stretch our neurological dispositions on which social habits have been founded [2].

Newsgroups (e.g., the Usenet) are at once written and shared, and at the same time, the lively spoken views of like-minded people. Someone coming into a newsgroup culture can be subject to strong forces to conform. They are directed to FAQs to be initiated into the group's customs and shared history. Someone who does not share the group's views might well see the FAQs etc. as having an identification role like myths.

The table makes clear that drama breaks down normal discourse conventions: the dramatic context (usually clearly flagged, with stage, masks, etc.) allows -- sometimes contentiously -- a spontaneous or private communication to become public. There is even the distinguished profession of theatre critic; a role that would not be tolerated intruding into normal conversation! Recording has a shorter cultural history, and only two uses seem permitted: one is that the recording is a recording of a dramatic performance (including music), the other is the use of the recording for a formal or legal purpose. Most people feel betrayed if their spoken communications are recorded in the wrong context.

3. Human-Computer Interaction

We have set the scene -- to use that simple dramatic metaphor -- to emphasise the new styles of discourse evolving on the net. The ease of storing, copying, editing, and generating material leads to new ways of constructing it and using it. Sometimes the way material is used is not the way it was intended to be used; flames are one example of this, as are the powerful effects of anonymous email, and virtual forms of communication that would have been thought impossible, such as netsex.

These new styles of discourse are not just of specialist interest. The net is the largest collaboration of humans the world has ever seen. Its power for good is phenomenal. Unfortunately, especially when building bridges between cultures -- spanning different conventions of discourse -- the net provides opportunities for misunderstanding, expression of anger, intimidation or destructiveness on a scale and speed never before imagined.

It is our duty, then, as HCI professionals, to ensure that the technology of the net itself does not contribute to misunderstanding. It is our duty to understand the transformation that happens between minds, as computers broadcast, record, and exchange ideas. To a large extent, what people do is their own responsibility, but if misunderstanding is increased between people by the lack of an undo function (for example), that would be something worth understanding and planning for, or designing to avoid.

Whereas drama converts intimacy into theatre, and indeed gains some attraction by doing so, HCI gains its attraction by converting static plans into effective interaction. A computer program is an object that the user brings to life, and to just that sort of life planned for it by its designer. Although drama may convert a script into a living experience, the users of the experience are the audience: they are disconnected from the 'lives' of the characters. In the terms of Winograd and Flores (1986), the audience of a drama is out of control of their thrownness; and their thrownness is easy to confuse with potential readiness-to-hand that is imagined to be the case in an actual interaction. Drama is unlike HCI, where the user is involved in the act, and has a personal commitment to the outcome. (Notwithstanding this, some HCI is about better drama; see also Carroll's classic paper (1970) on the analogies between entertainment and work.)

HCI, as the field that concerns itself with communication from designer to user, recapitulates the field of discourse we saw in all human communication, and on the Internet in particular.

3.1. Hypertext

Before the World Wide Web, we thought the design of hypertext was a problem for professionals and a subject of HCI research. Now, however, users create enormous hypertexts of their own, and they surf in the largest hypertext ever. The scale of the World Wide Web dwarfs any traditional hypertext.

Everything is hypertext. At once this is the strength of the idea, and its limitation. No longer does writing require any planned structure before it can be released to its users, for users are now supposed to make what they will of it. Many writers can engage in the text in a way that is both 'structured' and more flexible than any other style of written text. The texts can be linked together with no respect for any conventions (such as story or lexical order, required for most encyclopaedias). So even the nature of writing and co-authoring is transformed.

Though the World Wide Web must be one of the HCI design successes [3], hiding the complexities of using the net and making it extremely easy to use, it does not hide the complexity of information. The issue of getting lost in hyperspace has changed: the Web has become a place where surfing means treating the medium as spoken discourse, rather than written discourse with a plan that one could get lost in. The new systemic issues must be clearly distinguished from conventional HCI, which seems, to my mind, to be excessively concerned with simple low level details (e.g., Kellogg & Richards, 1995) that in any other technology would be dismissed as outrageous and feeble engineering.

As HCI professionals we must assume that complexity imposed on users is our responsibility. Yet the designers' problem of creating usable material is harder than the readers'. A reader has only one course of action, namely the one that he or she takes. The designer has to cater for more than one user on one occasion. Each choice that any user is given at least doubles the size of the design problem. After only ten alternatives for one user (or one alternative each for ten users) the design space is a thousand times larger. Ten choices is trivial and provides little scope for interaction: a more realistic interactive system would allow for thousands of choices, representing an astronomically large design space. If writing books is difficult -- certainly, not everyone is a successful book author -- then writing good hypertext documents is very much harder.
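The arithmetic behind this growth can be sketched in a few lines. The sketch assumes the simplest case, in which every choice is independent and offers a fixed number of alternatives (two by default, matching 'at least doubles'); the function name is illustrative and not part of any established method.

```python
def design_space_size(choices: int, alternatives: int = 2) -> int:
    """Count the distinct interactions a designer must plan for.

    Assumption: the user makes `choices` independent decisions, each
    among `alternatives` options, so the count is alternatives ** choices.
    """
    return alternatives ** choices


if __name__ == "__main__":
    for n in (1, 10, 20, 1000):
        size = design_space_size(n)
        # Report only the order of magnitude; the exact figure is unwieldy.
        print(f"{n:>5} choices -> roughly 10^{len(str(size)) - 1} interactions")
```

Ten binary choices give 1024 possible interactions, which is the 'thousand times larger' design space; with thousands of choices the count is far beyond anything that could be checked by demonstration alone.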

It is useful to introduce new terms to be clear. Interaction is what a user does. Interaction potential is what the designer has to plan; the design allows for many potential interactions, but a user only experiences one, namely 'the' interaction. Even over a period of time, the sequence of user interactions explores the potential, but is still a single interaction, just longer. Thus we may say that Laurel's Computers as Theatre (1991) emphasised interaction, not interaction potential [4].

Users do not experience the designer's problems, because they are in the flow of their own experience. The alternatives they did not follow -- which the designer should have planned for them just in case they did -- are hidden from them. Not just hidden: because any interaction is simpler than any interaction potential, the designer's work appears simpler than it is. Thus people think the design of hypertext is very much easier than it really is. This lack of reciprocity between user and designer is important: on the World Wide Web, users are designers, but the lack of reciprocity remains.

There are several consequences.

First, 'everyone' thinks design is easy. The result is that designers are under enormous pressure from marketing, management, and everyone else, to deliver complex products faster than is possible consistent with doing a good design. Conversely, designers become disassociated from users since their job is not easy; hence the relevance of phenomenological views like Ehn (1988), Laurel (1991), Suchman (1987), and Winograd and Flores (1986).

Secondly, good design can be faked. If we know beforehand which choices a user will make, then the system can be constructed implementing only those choices. It is more like a film than a computer program. (Indeed there is an established professional interest in emphasising the presentation aspects of the medium.) The problem is that a film is ideal for a system demonstration, and will all too easily give management or marketing the idea that the product is far closer to market than it really is. So, again, presentation (that is, drama) can masquerade as interaction potential. Scenarios instantiate interaction potential as short takes from 'spoken' interaction discourse: running the risk of confusing realism with generality [5] -- but see below.
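How little of the interaction potential a demonstration needs can be made concrete with a small sketch. Everything here is hypothetical: the transition table `POTENTIAL`, the state and button names, and the `demo` script are invented for illustration and do not model any real product. The table stands for the interaction potential; a script of button presses stands for one interaction.

```python
# Hypothetical device: (state, button) -> next state. Invented for
# illustration only; it does not describe any real product.
POTENTIAL = {
    ("off", "power"): "on",
    ("on", "power"): "off",
    ("on", "menu"): "menu",
    ("on", "up"): "on",
    ("on", "down"): "on",
    ("menu", "menu"): "on",
    ("menu", "up"): "menu",
    ("menu", "down"): "menu",
    ("menu", "power"): "off",
}


def run(script, start="off", table=POTENTIAL):
    """Follow one interaction and return the set of transitions it used.

    Raises KeyError if the script presses a button that the table does
    not define in the current state.
    """
    state, used = start, set()
    for button in script:
        next_state = table[(state, button)]
        used.add((state, button))
        state = next_state
    return used


def coverage(script, table=POTENTIAL):
    """Fraction of the interaction potential exercised by one script."""
    return len(run(script, table=table)) / len(table)


# A persuasive demonstration: switch on, open the menu, close it, switch off.
demo = ["power", "menu", "menu", "power"]

# A 'faked' system implements only the transitions the demo needs.
faked = {t: POTENTIAL[t] for t in run(demo)}
```

The demonstration runs identically on `faked` and on the full table, although it exercises only four of the nine transitions; any press outside the script (for example `["power", "up"]`) fails on the faked system. A viewer of the demonstration cannot tell the two apart.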

The dramatic confusion of presentation (encouraging people to imagine interaction potential) with actual content leads to the ascendancy of superficial fashion in interactive systems: most computer 'solutions' are chosen because they are attractive and fashionable rather than effective. Design professionals competent at presentation are adept at exploiting users' imagination to fill in the interaction with potential -- potential that may not be there.

So: the world fills up with poor hypertext. Users then become less demanding. It then seems easier still to create hypertext. Standards plummet. And the gulf widens between what users do and what theories designers have to use in their work. HCI seems even less comprehensible.

3.2. Sony TVs

Hypertext is a new medium for reading and writing. As browsers on the Web make clear, it is a style of interaction. Conventional interactive devices, such as video recorders, can also be seen as hypertexts, albeit with rather simple text, and rather complex structure. Addison & Thimbleby show elsewhere (1996) how to exploit the direct correspondence between interaction and hypertext. Many design issues of hypertext and of interactive devices coincide.

As a specific example of poor interactive device design, consider the Sony KV-M1421U type TV with its remote control, the RM-694. The figure shows two statecharts (Harel, 1987) specifying how the user potentially interacts with each device. We do not need to understand statecharts to see that the devices are very different; even the corresponding buttons do different things. Some features can be done on the TV alone, some can be done on the remote control alone, some features available on both are done differently on each device. It is not clear that the complexity is justified. Although there is only one application, namely the TV, the user has two user interfaces to understand, with their own rules; this, in turn, requires a user manual of double the thickness. Moreover if the user becomes skilled with one device then their skill is of little use with the other. Pity the user who loses their remote control!

Figure: two statecharts. Television: statechart showing all features available using the television's control panel. Remote control: statechart showing only buttons that correspond to those on the television; the remote control provides additional features not shown here, such as teletext.

There are many 'sensible' interaction paths through the design, and they could be demonstrated persuasively. Somehow Sony created an object for interaction, apparently without thinking how it would be used. The overall interaction potential is bizarre; I myself can see no justification for the obscurity, for the specious inconsistencies, for the timeouts [6]. Sony were, I imagine, encouraged by uncritical acceptance of a demonstration. A mock-up might have been produced and demonstrated to executives, who were duly impressed. The executives may then have imposed unrealistic production deadlines. And so the market got another gadget that looks cute but is difficult to use -- because it was only designed to be used in one way, and that was for the demonstration. The rest of the user interface has not been worked out. Even if anyone realised the stupidity of this schedule, it is the way everything is designed: standards are low.

Unfortunately the Web is bigger. If Sony have trouble (I'm not sure they realised it) with such a trivial design, how much worse will world wide hypertext become? Ironic that Thompson (1961) writes, "people like being sold rubbish and enjoy being deceived by advertising [...] they continue because the consumer accepts his disappointment philosophically or puts it down to his own misjudgement." Let us hope that discourse on the net can yet be worthwhile, not another medium swamped by the lowest common denominator.

4. Free associations

4.1. Interaction and physical space

The logical design space of the TV, made visible by a statechart, is normally a transient interaction that is invisible. Normal use of a television does not create an 'object' that the user can explore like a reader can explore a book: the discourse the user has with a TV is spoken discourse. The 'spoken' spontaneous can be 'pre-recorded' or objectified in a design formalism such as a statechart (or perhaps a mathematical specification). Since formal 'written' discourse is so much more valuable for making design choices -- or for exposing design quirks -- one surmises that Sony did not use a formal method before or during the design process: for the statechart (as typical of formal methods) makes manifest some peculiar design features. Even without a full critique, it is clear that the 'written' discourse allows greater exploration and planning than the 'spoken' -- it provides an overview that the designer can explore in any order without being limited to particular paths users take in particular interactions -- this is entirely consistent with our expectations of the different styles of discourse.
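Once the interaction is written down, quirks of this kind can be found mechanically rather than stumbled upon in use. As a minimal sketch, suppose a design is recorded as a transition table (a much cruder object than a statechart). The table `DEVICE`, its state and event names, and the function below are hypothetical, invented for illustration; they are not the Sony design. The check looks for states the user can leave only by doing nothing until a timeout.

```python
# Hypothetical transition table: state -> {event: next state}.
# "timeout" stands for the user doing nothing for some period.
# Invented for illustration; this is not the Sony statechart.
DEVICE = {
    "watching": {"channel+": "watching", "volume": "adjusting", "timeout": "watching"},
    "adjusting": {"timeout": "watching"},
    "searching": {"channel+": "searching", "timeout": "watching"},
}


def timeout_only_states(device):
    """Return states that no button press can leave.

    A state qualifies when the only event leading to a different state
    is "timeout"; buttons there either do nothing or are undefined.
    """
    trapped = []
    for state, moves in device.items():
        exits = {event for event, nxt in moves.items() if nxt != state}
        if exits == {"timeout"}:
            trapped.append(state)
    return sorted(trapped)
```

On this invented table the check reports `adjusting` and `searching`. The point is not the particular result but that the 'written' form of the design can be explored in any order, independently of the paths any one user happens to take.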

One reason the TV is awkward is that a few buttons do many things. Visible physical space has been traded for hidden interaction potential. Instead, at the other extreme, the TV might have been designed to have 400 buttons each doing exactly one thing. Such visible physical complexity would have been ridiculous.

The point is that knowing it is ridiculous is easier when a design is physical. Hence we can make better HCI judgements in objectified media. Statecharts make the TV interaction issues clear because they are a suitable object medium for planning interaction potential. Interestingly, the advance of formalism historically has been attributed to the writing down of the arts, particularly 'theatres for the mind' (Yates, 1966).

4.2. Faking or delaying?

A demonstration of an interactive system that is presented as 'the real thing' is a fake. One can fake for two reasons: to make life easier (e.g., to disguise an unfinished product), or to explore part of the interaction potential as if the rest of it was there. This may be necessary for user evaluation. Thimbleby (1990) discusses deliberate use of design faking, and a computer science technique -- delaying commitment -- that allows designers to postpone some design decisions systematically without recourse to faking.

4.3. Getting lost in the wavefront of change

There are many sites "under construction" on the Web. Rapid growth means "under construction" is an inevitable wavefront. The wavefront of change means that users are restricted in the ways in which they can reuse information: at a 'higher level' the very fluidity of the net makes it harder for users to perform the operations (editing, forwarding, etc.) typical of written language. For example, copying a web page gives a user some information that is not only localised (i.e., referring to objects local to its source, not where it is now), but also information that quickly dates. There are many sites that were accessible last week but not this week. Is getting lost in the wavefront of change itself a temporary phenomenon, or a permanent state that requires new thinking and new tools to manage? Is it something the technology should make transparent, or is it something to exploit? -- Probably both. The wavefront may itself be a new style of discourse.

4.4. Commercial pressures

Broadcast media sink to the level that upsets nobody, because they want universal appeal (Thompson, 1965), leading to larger markets. Johnson (1996) warns very clearly that commercial interests may change the nature of the Web. So far, the user has been in control of their browsing (reading rather than listening type discourse); Johnson calls this pull. But commercial interests want to make money by pushing into captive markets, and that means controlling what users consume, and their interactivity evaporates -- they now listen to sequential drama (e.g., pay-per-view films).

4.5. Design rationale and computing science

The essence of design rationale is reification of the design process (Carroll, 1993). In other words, design rationale is the recording of 'spoken' design discourse. Carroll further writes that, "codifying a design discourse creates an audit trail." In contrast, computing 'goes the other way' from codification to interaction. Both views, however, encourage greater reflection on the 'written' design medium, arguably with formal methods from computer science having the edge -- except that people who are good at formal methods often do not appreciate human factors. This is one reason why scenarios are successful: they stimulate imagining human issues.

5. Conclusions

The World Wide Web is a successful HCI design job that improved the Internet and made it accessible to almost everyone. Like printing before it, it extends the power of all users. Information became cheap: readers could afford printed books to annotate with their own ideas; now users can freely create their own web sites.

Yet the Web does not conceal the fundamental problems of complexity, nor of a new medium finding its niche. Design on the web characterises all HCI design: the designer's job is far harder than anyone -- even the designer! -- can imagine. Users think so highly of demonstrations that they demand production systems before designers can honestly create them with the appropriate quality.

On the net, no longer are user and designer different, with designers out-numbered by users -- for all are designers, all are users [7]. We can contemplate that only 0.0000001% of these designer/users are aware of any HCI principles. The new discourse of the net (MUD, Web, IRC, etc.) makes new design issues manifest.

At least we saw it coming.

What can we contribute? A start will be to be aware of these issues, to consider, discriminate, and be selective in our choices for the future. As the First Asia Pacific Conference on Human Computer Interaction proceeds, ask yourselves where each presentation stands -- and where it is going.

Don't be impressed with demonstrations, passively imagining the interaction potential; get involved and interact to explore their actual potential. Specifically, when you see a demonstration, ask: is the interaction potential in your imagination (i.e., it is good drama) or is it actual (i.e., it is a good system)? What is the reusable (recorded, written) thing it is contributing to the world -- and how are we going to take advantage of it?

Theatre is fun, but let us criticise it wherever it pretends to be our future, relying on our imagination to do the creative work that should have been done by designers for the users.


The ideas here developed in conversations with my colleagues, particularly with Matt Jones, Gary Marsden, David Pullinger, Sherry Turkle, and with my wife, Prue Thimbleby.


  1. The goal of realism, accurate simulation of context in a time and place other than the experience, makes clear discussion particularly fraught.

  2. We do not have space to discuss brain specialisation, nor developmental issues (e.g., everyone starts off verbally, only later learning to read and write) and how this may impact education on the net. Note that Western and Eastern scripts invoke different cortical functions; this may produce interesting cross-cultural effects, as pictorial and virtual reality information becomes widespread.

  3. Browsers hide protocols from the user -- HTTP etc. -- to present the same interaction style, unifying the earlier diversity in incompatible protocols. They successfully hide the computers' conversational discourse from the user.

  4. Laurel briefly mentions an AI research project to provide interaction potential. Her stance is to make the 'spoken' interaction more effective, rather than to reflect on the distinction between discourse as I outline in this paper.

  5. The trouble is, realism is 'easy' -- there are a lot of really good artists who are very creative with computer media -- but, unfortunately, generality is still a hard research issue. Formal computer science would use specification techniques to generalise scenarios from collections of instances to general, coherent, interaction potential.

  6. The television has a class of states (seen bottom right in the statechart) that cannot be left except by doing nothing until the timeout.

  7. Continual innovations (like Java) change the leaders. There will always be push/pull as new technologies lead to new aspirations, and as popular tools catch up with technique. Note that so-called 'universal literacy' -- the pull of the printed word's push -- took four centuries from the invention of printing.