Unicode in 'NFG' formation ?

2009-05-16 Thread John M. Dlugosz
I was going over S02, and found that it opens with, "By default Perl presents 
Unicode in NFG formation, where each grapheme counts as one character."


I looked up NFG, and found it to be an invention of this group, but 
didn't find any details when I tried to chase down the links.


This opens a whole bunch of questions for me.  If you mean that, by default, 
the individual items in a string are graphemes, OK, but what does that have 
to do with parsing source code?  Even so, that's not something that would be 
called a Normalization Form.
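
To make the distinction concrete, here is a minimal sketch in Python (not Perl 6, and not how NFG itself is defined or implemented). It assumes that "grapheme" can be roughly approximated as a base character plus any combining marks that follow it; real grapheme segmentation follows more rules than that. The helper names are made up for illustration.

```python
import unicodedata

def code_points(s: str) -> int:
    # Python's len() on a str counts code points, not graphemes.
    return len(s)

def approx_graphemes(s: str) -> int:
    # Rough approximation (an assumption of this sketch): count only the
    # code points whose canonical combining class is 0, i.e. treat each
    # base character plus its trailing combining marks as one unit.
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)

samples = {
    "e + combining acute": "e\u0301",  # has a precomposed form (U+00E9)
    "q + combining tilde": "q\u0303",  # no precomposed form to my knowledge
}

for label, text in samples.items():
    nfc = unicodedata.normalize("NFC", text)
    print(label,
          "| code points:", code_points(text),
          "| code points after NFC:", code_points(nfc),
          "| approx. graphemes:", approx_graphemes(text))
```

The second sample is the interesting one: a standard Normalization Form such as NFC cannot reduce it to a single code point, yet a reader would still count it as one character. That gap is what "each grapheme counts as one character" has to cover.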


Character set encodings and the like are one of my strengths.  I'd like to 
straighten this out, and can certainly straighten out the wording, but 
first need to know what you meant by that.


Can someone catch me up on the particulars?

--John



Re: Unicode in 'NFG' formation ?

2009-05-16 Thread Darren Duncan

I noticed and asked about this a few months ago.  As you say, NFG was invented 
for Perl 6 and/or Parrot.


See http://docs.parrot.org/parrot/latest/html/docs/pdds/pdd28_strings.pod.html 
for all the formal details that exist to my knowledge.


Back at the time I raised the issue, it was said that we need to take that 
Parrot PDD 28 and derive the initial Perl 6 Synopsis 15 from it.  Such a 
Synopsis could basically just start out as a clone of the Parrot document.  I 
said that someday I might have the round-tuit for this, but as yet I haven't.


Since you seem eager, I recommend you start by porting the Parrot PDD 28 to a 
new Perl 6 Synopsis 15, and continue from there.


-- Darren Duncan