Nick Nicholas asked:

> I'm wondering if I could get some clarification on the Unicode corner
> brackets.
> 
> Unicode encodes the following sets:
> 
> 2308, 2309, 230A, 230B: LEFT CEILING, RIGHT CEILING, LEFT FLOOR, RIGHT
> FLOOR
> 
> 231C, 231D, 231E, 231F: TOP LEFT CORNER, TOP RIGHT CORNER, BOTTOM LEFT
> CORNER, BOTTOM RIGHT CORNER

This is the set of quine corners (cf. TUS 3.0, p. 302), and it might be
questionable if pressed into service as brackets.

My own preference would be to go with the ceilings and floors for
textual corner bracketing.

> 
> 300C, 300D: LEFT CORNER BRACKET, RIGHT CORNER BRACKET

These are clearly punctuation, but they function primarily as quotation
marks in a CJK context.

You should also be aware that two more sets of graphical bracketing
pieces are in the works for Unicode 3.2:

23A1, 23A3, 23A4, 23A6: LEFT SQUARE BRACKET UPPER CORNER, LEFT SQUARE
BRACKET LOWER CORNER, RIGHT SQUARE BRACKET UPPER CORNER, RIGHT SQUARE
BRACKET LOWER CORNER

Those are terminal graphics pieces, designed to be used with vertical
extenders to create large square brackets on terminals or similar
contexts. But as symbols, they will look much like the ceilings and
floors.

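23BE, 23BF, 23CB, 23CC: DENTISTRY SYMBOL LIGHT VERTICAL AND TOP RIGHT,
DENTISTRY SYMBOL LIGHT VERTICAL AND BOTTOM RIGHT, DENTISTRY SYMBOL LIGHT
VERTICAL AND TOP LEFT, DENTISTRY SYMBOL LIGHT VERTICAL AND BOTTOM LEFT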

These are part of a dentistry notation system that has pieces similar
to the box drawing characters. But again, as isolated symbols, they
look much like the ceilings and floors.

> 
> However, the bottom corner brackets are absent from CJK. The appropriate
> shapes *are* present, both as CEILING/FLOORs and CORNERS. I presume they
> can be interspersed with text, but I'd like to double check that.

They certainly could be, as could any other symbols.

> 
> In other words, for brackets intended as punctuation, is it better to:
> 
> (a) propose BOTTOM LEFT CORNER BRACKET and BOTTOM RIGHT CORNER BRACKET, to
> complement LEFT CORNER BRACKET and RIGHT CORNER BRACKET? or

This would definitely not work. The problem is that while the CJK
left/right corner brackets are clearly bracketing punctuation, you
have to contend with their other properties as CJK punctuation. Most
systems will default them to wide behavior, giving them spacing
properties appropriate for CJK text, but inappropriate for the kind
of bracketing you would expect in typical alphabetic text.
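
The width issue can be checked against the East_Asian_Width property.
What follows is a minimal sketch using Python's standard unicodedata
module; the helper name is_wide is made up for illustration, and the
values reported depend on the Unicode data bundled with the Python
build in use.

```python
import unicodedata

def is_wide(ch: str) -> bool:
    """True if East_Asian_Width is Wide (W) or Fullwidth (F)."""
    return unicodedata.east_asian_width(ch) in ("W", "F")

# Compare the CJK corner brackets with the ceiling and floor symbols.
for cp in (0x300C, 0x300D, 0x2308, 0x230A):
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.name(ch)}: wide={is_wide(ch)}")
```

A renderer that honors this property would typically set the CJK corner
brackets in a full-width cell, which is the behavior that makes them
awkward in alphabetic text.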

> 
> (b) to assume that functionality is covered by CEILING/FLOORs or CORNERS
> in the Misc. Technical block, even though these are not necessarily
> intended to be used as punctuation?

This is probably better, although you would have to teach your
applications to recognize them as bracket pairs, rather than as
undifferentiated symbols. However, since epigraphic conventions
presumably could go beyond standard brackets to include somewhat
arbitrary start/end symbol combinations to indicate various conditions
of the text, I don't think this should be too much out of the ordinary.
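
Teaching an application about such pairs can be as simple as a lookup
table. The sketch below is one possible approach, not a description of
any existing tool: BRACKET_PAIRS and bracketed_spans are hypothetical
names, and the table would hold whatever start/end combinations a given
convention calls for.

```python
# Hypothetical application-level table: opening character -> closing character.
BRACKET_PAIRS = {
    "\u2308": "\u2309",  # LEFT CEILING / RIGHT CEILING
    "\u230A": "\u230B",  # LEFT FLOOR / RIGHT FLOOR
}
CLOSERS = set(BRACKET_PAIRS.values())

def bracketed_spans(text):
    """Return (opener, content) for each bracketed span, innermost first.

    Raises ValueError if a closer does not match the most recent opener
    or if an opener is never closed.
    """
    stack = []   # (opener, index) for brackets that are still open
    spans = []
    for i, ch in enumerate(text):
        if ch in BRACKET_PAIRS:
            stack.append((ch, i))
        elif ch in CLOSERS:
            if not stack or BRACKET_PAIRS[stack[-1][0]] != ch:
                raise ValueError(f"unmatched {ch!r} at index {i}")
            opener, start = stack.pop()
            spans.append((opener, text[start + 1:i]))
    if stack:
        opener, start = stack[-1]
        raise ValueError(f"unclosed {opener!r} at index {start}")
    return spans
```

Adding another convention means adding another entry to the table; the
matching logic does not care what category the characters have.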

> 
> I know concerns like spacing and width are secondary and font-related;
> the same goes for their look (I've seen them both rectangular-looking,
> like CEILING/FLOOR/CORNER BRACKET, and square, like CORNER ---
> obviously just a matter of typographical preference.) But
> I would like any such brackets used to be recognised as punctuation, and
> I'm not sure the extant symbols in Misc. Technical necessarily would. 

By default, no, since they have Sm or So categories, rather than the
P general categories associated with punctuation. However, applications
can override such character properties to follow other textual conventions
or produce certain effects. There is nothing stopping me, for example,
from using U+261B/U+261A (the black right/left pointing index symbols)
or U+25B7/U+25C1 (white right/left-pointing triangles) as bracketing
characters. The trick is getting your application to recognize any
such conventions appropriately.
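
An override of this kind might look like the following sketch, which
layers an application table over Python's standard unicodedata module.
CATEGORY_OVERRIDES, effective_category and is_punctuation are
hypothetical names, and the default categories returned depend on the
version of the Unicode Character Database bundled with the Python
build, so they may not match the categories quoted above.

```python
import unicodedata

# Hypothetical per-application overrides: character -> category to use
# instead of the default General_Category.
CATEGORY_OVERRIDES = {
    "\u231C": "Ps",  # TOP LEFT CORNER treated as an opening bracket
    "\u231D": "Pe",  # TOP RIGHT CORNER treated as a closing bracket
    "\u231E": "Ps",  # BOTTOM LEFT CORNER
    "\u231F": "Pe",  # BOTTOM RIGHT CORNER
    "\u261B": "Ps",  # BLACK RIGHT POINTING INDEX as an opener
    "\u261A": "Pe",  # BLACK LEFT POINTING INDEX as a closer
}

def effective_category(ch: str) -> str:
    """Application-level category: the override if present, else the default."""
    return CATEGORY_OVERRIDES.get(ch, unicodedata.category(ch))

def is_punctuation(ch: str) -> bool:
    """True if the effective category is one of the P* categories."""
    return effective_category(ch).startswith("P")
```

With this table in place, is_punctuation(chr(0x231C)) is true even
though the default category for that character is a symbol category.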

If you are in the position of controlling your own parsing, or knowing
the person who knows the person who writes the parser... ;-), then
almost any character can be given any special function you want.

--Ken


