Re: [r6rs-discuss] Case sensitivity

Joe Marshall Mon, 23 Feb 2009 12:16:00 -0800

Oh Boy!!!!   One of my favorite pet peeves!

Allow me to clear things up.  Eli has provided the best argument for
case sensitivity.  Nonetheless, Guillermo is correct.  The default
should not be case sensitive.


The reason is simple:  Case sensitivity breaks abstraction.

There are two concepts in play here.  The first is orthography, which
is the way in which words are represented in writing.  The second is
typography, which is the way in which letters are represented as
visual glyphs.

The meaning of a written word is highly dependent upon orthography.
Inflections of words change the morphemes, and the morphemes are
usually indicated by simple character sequences.  To give an example,
making something plural in English is usually done by adding an `s'
(as a sound when speaking, as a letter when writing).  Another
example is making something past tense by adding `-ed'.

The meaning of a written word is highly independent of typography.
Walk down any commercial street and look at the signs.  There are
all sorts of fonts, styles, colors, and sizes of letters, yet they are all
readable.  Typography is used for a number of reasons.  On a sign,
it may be used as a distinctive way to make the text stand out or
to be easy to read at high speed (highway signs, for example).  In
an academic paper, italic types may be used when a new term is
defined or when a term is borrowed from another language.  In a chat
room, ALL CAPS INDICATES SHOUTING.  In an end-user license
agreement, the warranty and liability clause is in all caps.
In a sentence, it is customary to capitalize the first word (the word
`Typography' in the fourth sentence means the same as the word
`typography' in the first).

Homographs are an interesting problem.  There are sets of words
in a language that are spelled the same, yet mean different things.
In some cases, it is possible to distinguish between the two meanings
with typographic conventions (`buffalo' and `Buffalo'), but in general,
this is not possible (`conduct', `fair', `pen').

Majuscule and minuscule letters are a typographic variation that is
found in the Roman, Greek, Cyrillic, and Armenian alphabets, but
is not present in most writing systems.  Just as some typefaces
don't have italic or bold variants, some typefaces don't have majuscule
or minuscule variants.

In Lisp, one of the most important concepts is that of a SYMBOL.
Symbols are used to name things and represent abstract concepts
that are not necessarily stored in the machine.  A common use is
as an identifier.  Identifiers don't need to be symbolic (de Bruijn indices)
or mnemonic (`temp'), but good programmers use identifiers that
have a meaning beyond that of being a unique string of characters.
There are `obfuscation' programs that attempt to hide meaning by
replacing meaningful identifiers with arbitrary unique strings of random
characters.  (A better one might replace identifiers with real words
that are unrelated to the actual use!)  A symbol is very much like
a `word' in human speech.

Symbols need to be distinguished by orthography.  It is how words
are distinguished.

Symbols do not need to be distinguished by typography.  Words are
not so distinguished, and making meaning independent of typography
allows us to use typography for other purposes.  As we do with words,
we can use typographic conventions for distinguishing differences
that have nothing to do with the orthography of the words.  A good
example is in quasiquote templates.  A practice I have seen is to
make the static part of the template in upper case and the dynamic
part lower:  `(,n SCORE AND ,m YEARS AGO)
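The template convention can be sketched as follows.  This is only an
illustration: the bindings `n' and `m' and the name `phrase' are
made up for the example, and the result of the final test depends on
whether the reader folds case.

```scheme
;; Hypothetical bindings, chosen only for illustration.
(define n 4)
(define m 7)

;; Upper case marks the fixed words of the template; lower case marks
;; the parts filled in by unquote.
(define phrase `(,n SCORE AND ,m YEARS AGO))

;; With a case-insensitive reader (R5RS style), SCORE and score name
;; the same symbol, so this test is true.
;; With a case-sensitive reader (R6RS), they are distinct symbols, so
;; the same test is false and the convention changes the data.
(eq? (cadr phrase) 'score)
```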

Earlier I said that ``Case sensitivity breaks abstraction.''

Elevating case, a typographic convention, to the same status as
spelling, an orthographic convention, breaks the abstraction between
orthography and typography.

Breaking an abstraction such as this will have benefits and costs.
The benefits are easy to enumerate:

  1.  It will be possible to use case to distinguish between homographs.
    For programs that deal with bovines in upstate New York, or with
    metalworkers from Eastern Europe, this will be a much-awaited
    change.  (Light-skinned people who are members of the judiciary
    will be out of luck, though.)

  2.  It will open up a much larger number of short strings to use as
    identifiers.  Instead of a simple `if', we will have `if', `If',
    `iF', and `IF'.

  3.  It will make it easier to map symbols to identifiers in other
    computer languages.

  4.  It is easier to implement.

On the other hand, these are the costs:

  1.  It makes it impossible to use typographic conventions for
      things other than symbol identity.

      This is a big drawback because there is a long history
      of using typographic conventions in books and IDEs to
      highlight keywords, separate comments, distinguish between
      variables and compile-time constants, etc.

      It will no longer be correct to use typography to denote literal
      Scheme expressions in text:  (EQ? 'A 'B) => false.

  2.  Legacy code that used typography for things like templates
      will no longer work.

  3.  It will be much harder to describe code in situations where
      there are no typographic cues (verbally, or in typefaces that
      do not have majuscule and minuscule variants).

  4.  It will be harder to remember the names of identifiers.
      (Was that Define-Foo, Define-foo, or define-foo?)

Neither the cost nor the benefit is very high, and I think this barely
rises to the level of `trivial'.  I am *annoyed* at case sensitivity, but
it isn't a showstopper by any means.

By the way, the problem with the #!case-sensitive marker is that
calls to READ do not respect the lexically visible setting of the marker!
This caused a number of problems for me when PLT Scheme switched
its case sensitivity.
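A rough sketch of the READ problem, under some loud assumptions: it
uses the R7RS `#!fold-case' directive as a stand-in for the
`#!case-sensitive' marker under discussion, and `string->datum' is a
hypothetical helper, not part of any standard.

```scheme
#!fold-case
;; Assumption: R7RS, where #!fold-case applies only to data read from
;; the port that contains the directive (here, this source file).
(import (scheme base) (scheme read))

;; Hypothetical helper: read one datum from a string at run time.
(define (string->datum str)
  (read (open-input-string str)))

;; The literal 'FOO is source text, so the directive folds it to foo.
;; The string port is a separate port that never saw the directive, so
;; "Foo" is read as written.  The two symbols would therefore be
;; expected to differ, even though the file looks case-insensitive.
(eq? (string->datum "Foo") 'FOO)
```

The marker governs how the file's own text is read, but data that the
program reads at run time arrives through a different port and follows
that port's setting instead.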

-- 
~jrm
