Re: High dot/dot above punctuation?

2010-07-31 Thread Asmus Freytag

On 7/29/2010 1:15 AM, Khaled Hosny wrote:

> I don't buy into Unicode's idea of
> encoding different sets of decimal digits separately; they are all
> different graphical presentations of the same thing.

  

Two observations:

  1) During rendering, everything turns into a graphical representation.

  The decision one has to make is where to keep the information
  about what graphical representation to use. The choices are
  a) document content (character encoding)
  b) document format (CSS etc.)
  c) local environment at display time (fonts available,
   locale settings, etc.).

  Now you can always argue which (combination) of these factors
  works best.

  2) For better or for worse, Unicode has made a decision, and
   picked a) for representing numbers.

   This choice allows the text content to carry the information,
   and documents will normally not change based on how they
   are viewed and on whether style information is present.

   For some tasks, this may not be the best choice, but given the
   importance of numbers for the text content, it's definitely a
   defensible choice.

The most important bit here is that the choice has been
made and given the goal of an immutable encoding, that's what
you are stuck with.

In some cases, you can make your own choice. The display of
programmatically generated numbers is one such case: because it
may not actually involve interchange of character codes, you are free
to use your own solution.

A./



Re: High dot/dot above punctuation?

2010-07-29 Thread Martin J. Dürst

Hello Juanma,

On 2010/07/30 12:05, Juanma Barranquero wrote:

> On Fri, Jul 30, 2010 at 04:52, "Martin J. Dürst"  wrote:
>
>> It's very clear that we would get nowhere if we wanted to encode
>> all these.
>
> The comment I responded to talked about characters that are already encoded.

Sorry, I didn't get that.

>> In simpler words, you cannot use the needs of discussions about encoding
>> (the meta-level) to determine encodings.
>
> Discussing Arabic versus Latin numerals is no more meta-level than
> talking about upper vs. lowercase.


Yes indeed. If these distinctions were only necessary when talking 
*about* these characters (meta-level) rather than when just using them 
(non-meta), then I would indeed agree that there is no reason to encode 
them separately.


Regards,    Martin.

--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:due...@it.aoyama.ac.jp



Re: High dot/dot above punctuation?

2010-07-29 Thread Juanma Barranquero
On Fri, Jul 30, 2010 at 04:52, "Martin J. Dürst"  wrote:

> It's very clear that we would get nowhere if we wanted to encode
> all these.

The comment I responded to talked about characters that are already encoded.

> In simpler words, you cannot use the needs of discussions about encoding
> (the meta-level) to determine encodings.

Discussing Arabic versus Latin numerals is no more meta-level than
talking about upper vs. lowercase.

    Juanma




Re: High dot/dot above punctuation?

2010-07-29 Thread Martin J. Dürst



On 2010/07/29 19:51, Juanma Barranquero wrote:

> On Thu, Jul 29, 2010 at 10:15, Khaled Hosny  wrote:
>
>> Also, I don't buy into Unicode's idea of
>> encoding different sets of decimal digits separately; they are all
>> different graphical presentations of the same thing.
>
> Not in a document where the author is discussing the differences
> between them, for example.


The "where the author is discussing the differences" doesn't help in 
deciding whether to encode one or two characters. A document may discuss 
the roman and italic versions of a character, or the Times and Palatino 
versions of a character, or different versions of Times fonts for the 
same character, and so on. It's very clear that we would get nowhere if 
we wanted to encode all these.


In simpler words, you cannot use the needs of discussions about encoding 
(the meta-level) to determine encodings.


Regards,    Martin.


--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:due...@it.aoyama.ac.jp



Re: High dot/dot above punctuation?

2010-07-29 Thread Juanma Barranquero
On Thu, Jul 29, 2010 at 10:15, Khaled Hosny  wrote:

> Also, I don't buy into Unicode's idea of
> encoding different sets of decimal digits separately; they are all
> different graphical presentations of the same thing.

Not in a document where the author is discussing the differences
between them, for example.

    Juanma




Re: High dot/dot above punctuation?

2010-07-29 Thread Khaled Hosny
On Thu, Jul 29, 2010 at 10:01:37AM +0200, Kent Karlsson wrote:
> 
> Den 2010-07-29 08.47, skrev "Khaled Hosny" :
> 
> > I have a few fonts where I implemented a 'locl' OpenType feature that maps
> > European to Arabic digits, and a contextual substitution feature that
> > replaces the dot with the Arabic decimal separator when it comes between two
> > Arabic numbers, so I think it is doable.
> 
> Doable is not the same thing as a good idea. Your example here is one of the
> not-at-all-good ideas.

This was done for a GUI font; the main aim is to have Arabic numbers in
Arabic contexts and vice versa. Since the numbers here are generated on
the fly (dates, percentages, etc.), it is not possible (or even
desirable) to change the input. Also, I don't buy into Unicode's idea of
encoding different sets of decimal digits separately; they are all
different graphical presentations of the same thing.
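
The shared identity is at least recorded in the character properties: each
separately encoded digit carries the same decimal value. A minimal Python
sketch, assuming only the standard unicodedata module (the helper name
digit_values is made up for illustration):

```python
import unicodedata

def digit_values(chars):
    """Return the decimal value Unicode assigns to each digit character."""
    return [unicodedata.decimal(ch) for ch in chars]

# DIGIT SEVEN in ASCII, Arabic-Indic, Extended Arabic-Indic and Devanagari.
sevens = "7\u0667\u06F7\u096D"
for ch in sevens:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {unicodedata.decimal(ch)}")

# Python's int() accepts a run of decimal digits from any script.
print(int("\u0661\u0662\u0663"))
```

So software that computes with numbers can ignore which digit set was typed,
while the text itself still records the author's choice.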

Regards,
 Khaled

-- 
 Khaled Hosny
 Arabic localiser and member of Arabeyes.org team
 Free font developer



Re: High dot/dot above punctuation?

2010-07-29 Thread André Szabolcs Szelp
2010/7/28 Asmus Freytag 

> On 7/28/2010 9:30 AM, André Szabolcs Szelp wrote:
>
>> You really all say, that general property Sk (DOT ABOVE) rather than Po
>> (FULL STOP, COMMA, MIDDLE DOT) (compared with all other decimal point
>> characters) can not cause any problems ever in certain algorithms?
>>
> No, we say that this is equivalent to a decimal comma - it's the same as
> regular comma, and well-designed algorithms can tell the difference.
>
> Distinguishing identically looking punctuation marks by their function in
> text on the level of character encoding is not something that has proven
> workable.
>


Well,

I have replied to Ken Whistler privately on a question of his. However, in the
particular case I'm trying to digitize, comma is used as
millions separator, period as thousands separator, and high dot as decimal
separator
(formally: #,###.###˙##(###...) ) (I can post the example if necessary).

Seeing a number encoded as 1.000, even knowing this locale, you cannot
tell whether it's "thousand" or "one" with explicit post-decimal zeros, IF
you encode both the period and the high dot as FULL STOP.
This poses a problem for _any_ contextual alternates-based approach in
display as well.

So in this case, actually, I think it's well arguable that encoding the
decimal point with FULL STOP and treating it as a glyphic variant is not
viable.
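
The ambiguity can be made concrete with a small Python sketch. It assumes
U+02D9 DOT ABOVE stands in for the high dot; parse_distinct and unify are
hypothetical helper names, not part of any library:

```python
from decimal import Decimal

HIGH_DOT = "\u02D9"  # DOT ABOVE, assumed here to encode the high decimal dot

def parse_distinct(text):
    """Parse #,###.###<high dot>## where ',' and '.' only group digits."""
    integer, _, fraction = text.partition(HIGH_DOT)
    integer = integer.replace(",", "").replace(".", "")
    return Decimal(f"{integer}.{fraction}") if fraction else Decimal(integer)

def unify(text):
    """Encode the high dot as FULL STOP, as the unification would require."""
    return text.replace(HIGH_DOT, ".")

thousand = "1.000"            # period groups thousands
one = "1" + HIGH_DOT + "000"  # high dot marks the decimals

print(parse_distinct(thousand), parse_distinct(one))
print(unify(thousand) == unify(one))
```

With distinct characters the two readings parse to different values; after
unification both strings are the same code point sequence, so no amount of
context analysis on the string alone can recover which one was meant.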

/Szabolcs


Re: High dot/dot above punctuation?

2010-07-29 Thread Kent Karlsson

Den 2010-07-29 08.47, skrev "Khaled Hosny" :

> I have a few fonts where I implemented a 'locl' OpenType feature that maps
> European to Arabic digits, and a contextual substitution feature that
> replaces the dot with the Arabic decimal separator when it comes between two
> Arabic numbers, so I think it is doable.

Doable is not the same thing as a good idea. Your example here is one of the
not-at-all-good ideas.

/Kent K





Re: High dot/dot above punctuation?

2010-07-29 Thread Khaled Hosny
On Wed, Jul 28, 2010 at 11:37:28AM -0700, Asmus Freytag wrote:
> On 7/28/2010 10:09 AM, Murray Sargent wrote:
> >Contextual rendering is getting to be more common thanks to
> >adoption of OpenType features. For example, both MS Publisher 2010
> >and MS Word 2010 support various contextually dependent OpenType
> >features at the user's discretion. The choice of glyph for U+002E
> >could be chosen according to an OpenType style.
> I know that the technology exists that (in principle) can overcome
> an early limitation of 1:1 relation between characters and glyphs in
> a single font. I also know that this technology has been implemented
> for certain (but not all) types of mappings that are not 1:1.
> >It's worth remembering that plain text is a format that was introduced due 
> >to the limitations of early computers. Books have always been rendered with 
> >at least some degree of rich text. And due to the complexity of Unicode, 
> >even Unicode plain text often needs to be rendered with more than one font.
> However, the question I raised here is whether such mechanisms have
> been implemented to date for FULL STOP. Which implementation makes
> the required context analysis to determine whether 002E is part of a
> number during layout? If it does make this determination, which
> OpenType feature does it invoke? Which font supports this particular
> OpenType feature?

I have a few fonts where I implemented a 'locl' OpenType feature that maps
European to Arabic digits, and a contextual substitution feature that
replaces the dot with the Arabic decimal separator when it comes between two
Arabic numbers, so I think it is doable.
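
What the font does at the glyph level can be imitated at the text level, which
may help to see the logic. The following Python sketch is only a simulation
under stated assumptions: it is not the OpenType feature code, and the name
simulate_font_substitution is invented for this example.

```python
import re

# Step 1 (the 'locl'-style mapping): European digits to Arabic-Indic digits.
TO_ARABIC_INDIC = str.maketrans(
    "0123456789", "".join(chr(0x0660 + i) for i in range(10))
)

# Step 2 (the contextual substitution): a dot between two Arabic-Indic digits
# becomes U+066B ARABIC DECIMAL SEPARATOR.
ARABIC_DECIMAL_SEPARATOR = "\u066B"
DOT_BETWEEN_ARABIC_DIGITS = re.compile(r"(?<=[\u0660-\u0669])\.(?=[\u0660-\u0669])")

def simulate_font_substitution(text):
    localized = text.translate(TO_ARABIC_INDIC)
    return DOT_BETWEEN_ARABIC_DIGITS.sub(ARABIC_DECIMAL_SEPARATOR, localized)

print(simulate_font_substitution("12.5%"))  # dot sits between two digits
print(simulate_font_substitution("v. 3"))   # dot follows a letter, left alone
```

In the font the underlying character codes stay untouched and only the glyphs
change; the sketch rewrites the string purely to make the two steps visible.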

Regards,
 Khaled

-- 
 Khaled Hosny
 Arabic localiser and member of Arabeyes.org team
 Free font developer



RE: Plain text (was: Re: High dot/dot above punctuation?)

2010-07-28 Thread Murray Sargent
Doug comments:

> Murray Sargent  wrote:

>> It's worth remembering that plain text is a format that was introduced 
>> due to the limitations of early computers. Books have always been 
>> rendered with at least some degree of rich text. And due to the 
>> complexity of Unicode, even Unicode plain text often needs to be 
>> rendered with more than one font.

> I disagree with this assessment of plain text.  When you consider the basic 
> equivalence of the "same" text
> written in longhand by different people, typed on a typewriter, 
> finger-painted by a child, spray-painted
> through a stencil, etc., it's clear that the "sameness" is an attribute of 
> the underlying plain text.  None of
> these examples has anything to do with computers, old or new.

> I do agree that rich text has existed for a long time, possibly as long as 
> plain text (though I doubt that, when
> you consider really early writing technologies like palm leaves), but I don't 
> think that refutes the independent
> existence of plain text.  And I don't think the need to use more than one 
> font to render some Unicode text
> implies it isn't plain text.  I think that has more to do with aesthetics (a 
> rich-text concept) and technical limits
> on font size.

My comments were to some degree hyperbole, in the hope that people fixated on 
plain text would be encouraged to think a little more broadly. Plain text 
underlies all rich text and in that capacity, it's been around since mankind 
started scribing. And plain text can have exotic formatting, e.g., gradient 
color; it's just that the formatting has to be uniform for all the text, rather 
than for parts (runs) of the text. One can regard the need for more than one 
font to render Unicode text as an implementation detail. But as a practical 
matter, it means that rendering/editing engines need to be able to handle a 
fair amount of richness. The RichEdit library used in Windows and Office takes 
advantage of that fact in providing plain-text controls as well as rich-text 
controls.

Murray




Plain text (was: Re: High dot/dot above punctuation?)

2010-07-28 Thread Doug Ewell

Murray Sargent  wrote:

> It's worth remembering that plain text is a format that was introduced
> due to the limitations of early computers. Books have always been
> rendered with at least some degree of rich text. And due to the
> complexity of Unicode, even Unicode plain text often needs to be
> rendered with more than one font.


I disagree with this assessment of plain text.  When you consider the 
basic equivalence of the "same" text written in longhand by different 
people, typed on a typewriter, finger-painted by a child, spray-painted 
through a stencil, etc., it's clear that the "sameness" is an attribute 
of the underlying plain text.  None of these examples has anything to do 
with computers, old or new.


I do agree that rich text has existed for a long time, possibly as long 
as plain text (though I doubt that, when you consider really early 
writing technologies like palm leaves), but I don't think that refutes 
the independent existence of plain text.  And I don't think the need to 
use more than one font to render some Unicode text implies it isn't 
plain text.  I think that has more to do with aesthetics (a rich-text 
concept) and technical limits on font size.


--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ is dot gd slash 2kf0s






Re: High dot/dot above punctuation?

2010-07-28 Thread Andrew West
On 28 July 2010 18:41, Michael Everson  wrote:
>
>> Contextual rendering is getting to be more common thanks to adoption of 
>> OpenType features. For example, both MS Publisher 2010 and MS Word 2010 
>> support various contextually dependent OpenType features at the user's 
>> discretion. The choice of glyph for U+002E could be chosen according to an 
>> OpenType style.
>>
>> It's worth remembering that plain text is a format that was introduced due 
>> to the limitations of early computers. Books have always been rendered with 
>> at least some degree of rich text. And due to the complexity of Unicode, 
>> even Unicode plain text often needs to be rendered with more than one font.
>
> Are or will be OT features supported in, say, filenames?

They are on my Windows Vista machine ... if you configure it to use an
appropriate font. For example, when configured to use Code2000 the
filenames "∪︀.txt" (U+222A UNION plus VS1) and "insec‍t.txt" (with a
ZWJ between the c and t) both use OT features in Code2000 to render
the filenames in Windows Explorer differently compared with plain
"∪.txt" and "insect.txt".

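To see what actually distinguishes these filenames, one can list their code
points. A minimal Python sketch (describe is a hypothetical helper; the
strings are rebuilt from escapes rather than copied from the mail):

```python
import unicodedata

def describe(name):
    """List each code point of a filename with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in name]

with_vs1 = "\u222A\uFE00.txt"   # UNION followed by VARIATION SELECTOR-1
plain_union = "\u222A.txt"
with_zwj = "insec\u200Dt.txt"   # ZERO WIDTH JOINER between c and t
plain_insect = "insect.txt"

for name in (with_vs1, plain_union, with_zwj, plain_insect):
    print(describe(name))

# Each pair differs as a string, so these are four distinct filenames.
print(with_vs1 != plain_union, with_zwj != plain_insect)
```
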
Andrew




RE: High dot/dot above punctuation?

2010-07-28 Thread Murray Sargent
> Michael asks, "Are or will be OT features supported in, say, filenames?" The 
> answer depends on the
> renderer. For example, if you display filenames in NotePad using the Calibri 
> font, default English
> ligatures are used automatically using OpenType table info.

> I meant on the desktop or in the Finder or Explorer.

I don't see them used in the Windows 7 Explorer, but that's no guarantee they 
won't be used in the next version :-) Here I'm assuming you mean OT features 
for English text. OpenType features are used extensively in shaping complex 
script text in general and in complex-script filenames in particular.

Murray




Re: High dot/dot above punctuation?

2010-07-28 Thread Khaled Hosny
On Wed, Jul 28, 2010 at 09:32:57PM +0100, Michael Everson wrote:
> On 28 Jul 2010, at 21:25, Murray Sargent wrote:
> 
> > Michael asks, "Are or will be OT features supported in, say, filenames?" 
> > The answer depends on the renderer. For example, if you display filenames 
> > in NotePad using the Calibri font, default English ligatures are used 
> > automatically using OpenType table info.
> 
> I meant on the desktop or in the Finder or Explorer.

In GTK+ based applications (e.g. Gnome desktop in Linux) you get
OpenType everywhere, including filenames in the file manager; it is
limited to the features that are on by default, though (i.e. no
contextual alternates for Latin script).

-- 
 Khaled Hosny
 Arabic localiser and member of Arabeyes.org team
 Free font developer



Re: High dot/dot above punctuation?

2010-07-28 Thread Michael Everson
On 28 Jul 2010, at 21:25, Murray Sargent wrote:

> Michael asks, "Are or will be OT features supported in, say, filenames?" The 
> answer depends on the renderer. For example, if you display filenames in 
> NotePad using the Calibri font, default English ligatures are used 
> automatically using OpenType table info.

I meant on the desktop or in the Finder or Explorer.

Michael Everson * http://www.evertype.com/





RE: High dot/dot above punctuation?

2010-07-28 Thread Murray Sargent

Michael asks, "Are or will be OT features supported in, say, filenames?" The 
answer depends on the renderer. For example, if you display filenames in 
NotePad using the Calibri font, default English ligatures are used 
automatically using OpenType table info.

Murray







RE: High dot/dot above punctuation?

2010-07-28 Thread Murray Sargent
Asmus asks, "Which implementation makes the required context analysis to 
determine whether 002E is part of a number during layout? If it does make this 
determination, which OpenType feature does it invoke? Which font supports this 
particular OpenType feature?"

I haven't looked to see if our various OpenType engines analyze the context of 
002E to treat numerical contexts in a special way. But both 002E and 002C 
(COMMA) are handled contextually in the build up of UTN #28 linear format 
mathematical expressions and in the rendering thereof. In particular, note that 
in nonnumerical contexts, period and comma are followed by some extra spacing 
in math zones, but as parts of numbers, that extra spacing is omitted. Also in 
RtL math, the period and comma are displayed LtR when part of a number, but RtL 
otherwise. So contextual analysis of these characters is quite important.
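
A rough idea of what such contextual analysis involves, as a Python sketch.
This is a naive heuristic written for illustration only (classify_separators
is an invented name); it is not how the math build-up described above is
implemented.

```python
def classify_separators(text):
    """Label each '.' or ',' as 'numeric' when a decimal digit sits on both sides."""
    result = []
    for i, ch in enumerate(text):
        if ch in ".,":
            inside_number = (
                0 < i < len(text) - 1
                and text[i - 1].isdecimal()
                and text[i + 1].isdecimal()
            )
            result.append((i, ch, "numeric" if inside_number else "punctuation"))
    return result

for item in classify_separators("f(1,5) = 2.5, ok."):
    print(item)
```

Note that the comma in f(1,5) is labelled numeric although it separates two
arguments, which shows why a real implementation needs more context than the
neighbouring characters.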

Murray








Re: High dot/dot above punctuation?

2010-07-28 Thread Asmus Freytag

On 7/28/2010 10:09 AM, Murray Sargent wrote:
> Contextual rendering is getting to be more common thanks to adoption of
> OpenType features. For example, both MS Publisher 2010 and MS Word 2010
> support various contextually dependent OpenType features at the user's
> discretion. The choice of glyph for U+002E could be chosen according to an
> OpenType style.
  
I know that the technology exists that (in principle) can overcome an 
early limitation of 1:1 relation between characters and glyphs in a 
single font. I also know that this technology has been implemented for 
certain (but not all) types of mappings that are not 1:1.

> It's worth remembering that plain text is a format that was introduced due to
> the limitations of early computers. Books have always been rendered with at
> least some degree of rich text. And due to the complexity of Unicode, even
> Unicode plain text often needs to be rendered with more than one font.
  
However, the question I raised here is whether such mechanisms have been 
implemented to date for FULL STOP. Which implementation makes the 
required context analysis to determine whether 002E is part of a number 
during layout? If it does make this determination, which OpenType 
feature does it invoke? Which font supports this particular OpenType 
feature?


A./






Re: High dot/dot above punctuation?

2010-07-28 Thread Michael Everson
On 28 Jul 2010, at 18:09, Murray Sargent wrote:

> Contextual rendering is getting to be more common thanks to adoption of 
> OpenType features. For example, both MS Publisher 2010 and MS Word 2010 
> support various contextually dependent OpenType features at the user's 
> discretion. The choice of glyph for U+002E could be chosen according to an 
> OpenType style. 
> 
> It's worth remembering that plain text is a format that was introduced due to 
> the limitations of early computers. Books have always been rendered with at 
> least some degree of rich text. And due to the complexity of Unicode, even 
> Unicode plain text often needs to be rendered with more than one font.

Are or will be OT features supported in, say, filenames?

Michael Everson * http://www.evertype.com/





RE: High dot/dot above punctuation?

2010-07-28 Thread Murray Sargent
Contextual rendering is getting to be more common thanks to adoption of 
OpenType features. For example, both MS Publisher 2010 and MS Word 2010 support 
various contextually dependent OpenType features at the user's discretion. The 
choice of glyph for U+002E could be chosen according to an OpenType style. 

It's worth remembering that plain text is a format that was introduced due to 
the limitations of early computers. Books have always been rendered with at 
least some degree of rich text. And due to the complexity of Unicode, even 
Unicode plain text often needs to be rendered with more than one font.

Murray





Re: High dot/dot above punctuation?

2010-07-28 Thread Asmus Freytag

On 7/28/2010 9:30 AM, André Szabolcs Szelp wrote:
> You really all say, that general property Sk (DOT ABOVE) rather than
> Po (FULL STOP, COMMA, MIDDLE DOT) (compared with all other decimal
> point characters) can not cause any problems ever in certain algorithms?

No, we say that this is equivalent to a decimal comma - it's the same as
regular comma, and well-designed algorithms can tell the difference.


Distinguishing identically looking punctuation marks by their function 
in text on the level of character encoding is not something that has 
proven workable.


A./




Re: High dot/dot above punctuation?

2010-07-28 Thread André Szabolcs Szelp
You really all say, that general property Sk (DOT ABOVE) rather than Po
(FULL STOP, COMMA, MIDDLE DOT) (compared with all other decimal point
characters) can not cause any problems ever in certain algorithms?

Szabolcs


Re: High dot/dot above punctuation?

2010-07-28 Thread Kent Karlsson

Den 2010-07-28 17.09, skrev "Jukka K. Korpela" :

> Kent Karlsson wrote:
> 
>> And the Nameslist says:
>> 002E FULL STOP
>>    = period, dot, decimal point
>>    * may be rendered as a raised decimal point in old style numbers
> 
> Right, I remembered there is such a comment somewhere but did not remember
> where.
> 
>> However, I think that is a bad idea: firstly the digits here aren't
>> necessarily "old style" (indeed, André wrote "lining", i.e. NOT
>> old style). And even if they are old style, it seems to me to be a
>> bad idea to make this a contextual rendering change for FULL STOP
>> (and it also says "may" not "shall" so there is no way of knowing
>> which rendering you should get even with old style digits).
> 
> I don't think the comment suggests the kind of contextual rendering you seem
> to be thinking. It just says "may", without specifying what controls the
> rendering and restricting raised rendering to old style numbers.
>
> But admittedly, if you wish to use raised dot rendering, you will need
> either some programmed logic that applies such style to FULL STOP in certain
> contexts but not others (and this is nontrivial) or manual work that sets
> the style of each FULL STOP used as decimal point.

That's "contextual rendering", in the general sense. And as Asmus says, I
don't think anyone has implemented any automatic contextual rendering as
hinted in that annotation (and in the chapter text you referenced), nor do
I think anyone should implement that.
 
>> Better stay with the MIDDLE DOT for the raised decimal dot.
>> 
>> Further, I don't see any major problem with using U+02D9 DOT ABOVE
>> for "high dot" in this case.
> 
> I see several problems with both approaches. The rendering will depend on
> font and may not be at all suitable for a raised dot. These characters have
> properties different from those of FULL STOP, and you never know what this
> may imply. Software that handles characters by their Unicode properties may
> do unexpected and unsuitable things if some character just "looks adequate"
> but has properties different from those of the character(s) that would be
> semantically (more) correct.

Such as exactly which problems here? If any, those are surely dwarfed
by several orders of magnitude compared to that some locales use "." as
thousands separator and others use it as decimal separator, and some
locales use "," as thousands separator and other use it as decimal
separator.

/kent k
 
> Jukka 
> 
> 






Re: High dot/dot above punctuation?

2010-07-28 Thread Asmus Freytag

On 7/28/2010 2:02 AM, Kent Karlsson wrote:
> Den 2010-07-28 09.50, skrev "Jukka K. Korpela" :
>
>> André Szabolcs Szelp wrote:
>>
>>> Generally, . (U+002E FULL STOP) and , (U+002C COMMA) are used for the
>>> decimal point in the SI world. However, earlier conventions could use
>>> different notation, such as the common British raised dot, which
>>> centers with the lining digits (i.e. that would be U+00B7 MIDDLE DOT).
>>
>> The different dot-like characters are quite a mess, but the case of British
>> raised dot is simple: it is regarded as typographic variant of FULL STOP.
>>
>> Ref.: http://unicode.org/uni2book/ch06.pdf (second page, paragraph with
>> run-in heading "Typographic variation").
>
> And the Nameslist says:
> 002E FULL STOP
>    = period, dot, decimal point
>    * may be rendered as a raised decimal point in old style numbers
>
> However, I think that is a bad idea: firstly the digits here aren't
> necessarily "old style" (indeed, André wrote "lining", i.e. NOT
> old style). And even if they are old style, it seems to me to be a
> bad idea to make this a contextual rendering change for FULL STOP
> (and it also says "may" not "shall" so there is no way of knowing
> which rendering you should get even with old style digits).
> Better stay with the MIDDLE DOT for the raised decimal dot.
  
The real problem I have with this annotation is that it recommends a 
practice that I strongly suspect has never been implemented in the 
entire 20 years since it's been on the books. (If anyone knows of an 
implementation that has contextual rendering of FULL STOP, I'd like to 
learn about it here.)


If a particular text uses both raised periods and raised decimal points, 
then I see use in being able to use 002E for this and make it change by 
using a font with a different glyph. But if it applies only to the 
decimal point, overloading 002E would require a degree of context 
analysis that I believe is unimplemented (see above). If my suspicion is 
true, then, at the minimum, the annotation should be reworded so that it 
doesn't seem to imply a practice that doesn't exist.

> Further, I don't see any major problem with using U+02D9 DOT ABOVE
> for "high dot" in this case.
  
Me neither - if it's positioned right, then it should be used. 
Duplicating "dots" by function is definitely a no-no. However, unifying 
punctuation characters with definite differences in appearance only 
works well if these differences are systematically applied with a 
type-style (font) selection and then apply to the entire text in each 
font. Such as the use of a double oblique glyph for HYPHEN (and 
HYPHEN-MINUS) in Fraktur fonts.


A./




Re: High dot/dot above punctuation?

2010-07-28 Thread Jukka K. Korpela

Kent Karlsson wrote:


> And the Nameslist says:
> 002E FULL STOP
>    = period, dot, decimal point
>    * may be rendered as a raised decimal point in old style numbers


Right, I remembered there is such a comment somewhere but did not remember 
where.



> However, I think that is a bad idea: firstly the digits here aren't
> necessarily "old style" (indeed, André wrote "lining", i.e. NOT
> old style). And even if they are old style, it seems to me to be a
> bad idea to make this a contextual rendering change for FULL STOP
> (and it also says "may" not "shall" so there is no way of knowing
> which rendering you should get even with old style digits).


I don't think the comment suggests the kind of contextual rendering you seem 
to be thinking. It just says "may", without specifying what controls the 
rendering and restricting raised rendering to old style numbers.


But admittedly, if you wish to use raised dot rendering, you will need 
either some programmed logic that applies such style to FULL STOP in certain 
contexts but not others (and this is nontrivial) or manual work that sets 
the style of each FULL STOP used as decimal point.



> Better stay with the MIDDLE DOT for the raised decimal dot.
>
> Further, I don't see any major problem with using U+02D9 DOT ABOVE
> for "high dot" in this case.


I see several problems with both approaches. The rendering will depend on 
font and may not be at all suitable for a raised dot. These characters have 
properties different from those of FULL STOP, and you never know what this 
may imply. Software that handles characters by their Unicode properties may 
do unexpected and unsuitable things if some character just "looks adequate" 
but has properties different from those of the character(s) that would be 
semantically (more) correct.


Jukka 





Re: High dot/dot above punctuation?

2010-07-28 Thread Kent Karlsson



Den 2010-07-28 09.50, skrev "Jukka K. Korpela" :

> André Szabolcs Szelp wrote:
> 
>> Generally, . (U+002E FULL STOP) and , (U+002C COMMA) are used for the
>> decimal point in the SI world. However, earlier conventions could use
>> different notation, such as the common British raised dot, which
>> centers with the lining digits (i.e. that would be U+00B7 MIDDLE DOT).
> 
> The different dot-like characters are quite a mess, but the case of British
> raised dot is simple: it is regarded as typographic variant of FULL STOP.
> 
> Ref.: http://unicode.org/uni2book/ch06.pdf (second page, paragraph with
> run-in heading "Typographic variation").

And the Nameslist says:
002E FULL STOP
   = period, dot, decimal point
   * may be rendered as a raised decimal point in old style numbers

However, I think that is a bad idea: firstly the digits here aren't
necessarily "old style" (indeed, André wrote "lining", i.e. NOT
old style). And even if they are old style, it seems to me to be a
bad idea to make this a contextual rendering change for FULL STOP
(and it also says "may" not "shall" so there is no way of knowing
which rendering you should get even with old style digits).
Better stay with the MIDDLE DOT for the raised decimal dot.

Further, I don't see any major problem with using U+02D9 DOT ABOVE
for "high dot" in this case.

/Kent K

> Yucca
> 
> 






Re: High dot/dot above punctuation?

2010-07-28 Thread Jukka K. Korpela

André Szabolcs Szelp wrote:


> Generally, . (U+002E FULL STOP) and , (U+002C COMMA) are used for the
> decimal point in the SI world. However, earlier conventions could use
> different notation, such as the common British raised dot, which
> centers with the lining digits (i.e. that would be U+00B7 MIDDLE DOT).


The different dot-like characters are quite a mess, but the case of British 
raised dot is simple: it is regarded as typographic variant of FULL STOP.


Ref.: http://unicode.org/uni2book/ch06.pdf (second page, paragraph with 
run-in heading "Typographic variation").


Yucca




High dot/dot above punctuation?

2010-07-28 Thread André Szabolcs Szelp

Dear Colleagues,


In processing a document, I came across a punctuation character which  
I was not able to find in Unicode. As I find it hard to believe that  
the character has not been encoded yet, I must think my search was  
incomplete, and I'd be hoping that you can point me to the correct  
character to use.


Generally, . (U+002E FULL STOP) and , (U+002C COMMA) are used for the
decimal point in the SI world. However, earlier conventions could use
different notation, such as the common British raised dot, which
centers with the lining digits (i.e. that would be U+00B7 MIDDLE DOT).


Now I came across a bilingual document (from the 1930s), which uses
the aforementioned MIDDLE DOT in the one language, and a clearly
distinct "raised dot/high dot" in the other: it's in line with
the *top* edge of lining numerals and takes basically the same position
as U+2019 RIGHT SINGLE QUOTATION MARK or U+02BC MODIFIER
LETTER APOSTROPHE, but has the shape of a FULL STOP.


While U+02D9 DOT ABOVE (from the spacing modifier letters) would seem
a correct _graphical_ representative, I believe its use might be
objectionable:
Given that FULL STOP, COMMA and MIDDLE DOT are all of general
category Po (Punctuation, Other), whereas DOT ABOVE has general
category Sk (Symbol, Modifier) (and it also has the binary property
"Diacritic"), the choice seems wrong. The use of the DOT ABOVE as a
decimal separator would be a "misuse" of a character, not unlike the
parallel situation when the MODIFIER LETTER APOSTROPHE is used instead
of RIGHT SINGLE QUOTATION MARK.
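
The category difference is easy to check. A minimal Python sketch, assuming
the standard unicodedata module (general_categories is a name invented for
this example; as far as I know the module does not expose the Diacritic
binary property, so only the general category is shown):

```python
import unicodedata

CANDIDATES = {
    "FULL STOP": "\u002E",
    "COMMA": "\u002C",
    "MIDDLE DOT": "\u00B7",
    "DOT ABOVE": "\u02D9",
}

def general_categories():
    """Map each candidate decimal separator to its Unicode general category."""
    return {name: unicodedata.category(ch) for name, ch in CANDIDATES.items()}

for name, category in general_categories().items():
    print(f"{name}: {category}")
```

Any software that selects punctuation by general category would therefore
treat DOT ABOVE differently from the other three.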


In view of these facts I was wondering, whether DOT ABOVE was the  
right character to use as the decimal point in the given context, or  
whether there is some other "dot above/high dot" character with the  
property "Po" which I missed.



Thank you,
   Szabolcs