Re: Unicode Normalization Forms

2001-08-16 Thread Martin Duerst
Hello Bjoern,

The wrong lines are easy to verify: they contain the same character code twice after some of the normalizations, which is a clear error. The fact that you got nine of them, and that your code worked for 3.1.1, already seems enough to tell me that your implementation passes all the necessar

Re: Unicode Normalization Forms

2001-08-16 Thread Bjoern Hoehrmann
* SADAHIRO Tomoyuki wrote:
>NormalizationTest-3.1.0.txt seems to have a few bugs.
>(exactly speaking, on nine lines)

I've written up an ANSI C implementation of NFD and NFC. For NFC my implementation fails for lines

16856 16858 16860 16862 16864 16870 16872 16874 16876

It pass

Re: Unicode Normalization Forms

2001-08-10 Thread SADAHIRO Tomoyuki
On Thu, 09 Aug 2001 22:30:16 +0200 Bjoern Hoehrmann <[EMAIL PROTECTED]> wrote:

> * SADAHIRO Tomoyuki wrote:
> >How about the following interface?
> >
> >| $normalized_string = normalize($raw_string)
> >|
> >| You can use this function only if the normalization form
> >| you require is specified

Re: Unicode Normalization Forms

2001-08-09 Thread Martin Duerst
At 09:39 01/08/09 -0500, Jarkko Hietaniemi wrote:
> > NormalizationTest-3.1.0.txt seems to have a few bugs.
> > (exactly speaking, on nine lines)
>
>Have you reported this? If not, please do so as soon as possible
>so that Unicode 3.1.1 will have them fixed.

I reported them quite a while ago and

Re: Unicode Normalization Forms

2001-08-09 Thread Bjoern Hoehrmann
* SADAHIRO Tomoyuki wrote:
>How about the following interface?
>
>| $normalized_string = normalize($raw_string)
>|
>| You can use this function only if the normalization form
>| you require is specified in the C statement:
>|
>| use Text::Unicode::Normalize 'C'; # Normalization Form C

Also fin

Re: Unicode Normalization Forms

2001-08-09 Thread Jarkko Hietaniemi
On Thu, Aug 09, 2001 at 11:57:24PM +0900, SADAHIRO Tomoyuki wrote:
>
> On Thu, 9 Aug 2001 09:39:41 -0500
> Jarkko Hietaniemi <[EMAIL PROTECTED]> wrote:
>
> > > NormalizationTest-3.1.0.txt seems to have a few bugs.
> > > (exactly speaking, on nine lines)
> >
> > Have you reported this? If not,

Re: Unicode Normalization Forms

2001-08-09 Thread SADAHIRO Tomoyuki
On Thu, 9 Aug 2001 09:39:41 -0500 Jarkko Hietaniemi <[EMAIL PROTECTED]> wrote:

> > NormalizationTest-3.1.0.txt seems to have a few bugs.
> > (exactly speaking, on nine lines)
>
> Have you reported this? If not, please do so as soon as possible
> so that Unicode 3.1.1 will have them fixed.
> >

Re: Unicode Normalization Forms

2001-08-09 Thread SADAHIRO Tomoyuki
> > use Text::Unicode::Normalize;
> >
> > $stringNFD = NFD($string); # Normalization Form D
> > $stringNFC = NFC($string); # Normalization Form C
> > $stringNFKD = NFKD($string); # Normalization Form KD
> > $stringNFKC = NFKC($string); # Normalization Form KC
>
> a normalize function in

Re: Unicode Normalization Forms

2001-08-09 Thread Jarkko Hietaniemi
> NormalizationTest-3.1.0.txt seems to have a few bugs.
> (exactly speaking, on nine lines)

Have you reported this? If not, please do so as soon as possible
so that Unicode 3.1.1 will have them fixed.

http://www.unicode.org/unicode/standard/versions/beta.html

> This module requires the follow

Re: Unicode Normalization Forms

2001-08-09 Thread Jarkko Hietaniemi
On Thu, Aug 09, 2001 at 10:31:14AM +0100, Nick Ing-Simmons wrote:
> Bjoern Hoehrmann <[EMAIL PROTECTED]> writes:
> >* SADAHIRO Tomoyuki wrote:
> >>Now a pre-release module to get Unicode Normalization Forms
> >>(UAX #15) is available.
> > >

Re: Unicode Normalization Forms

2001-08-08 Thread Bjoern Hoehrmann
* SADAHIRO Tomoyuki wrote:
>Now a pre-release module to get Unicode Normalization Forms
>(UAX #15) is available.

Cool! :-)

>NAME (a temporary name)
>
>Text::Unicode::Normalize - normalized forms of Unicode text

I'd suggest Unicode::Normalize and

>SYNOPSIS
>
>

Unicode Normalization Forms

2001-08-08 Thread SADAHIRO Tomoyuki
Now a pre-release module to get Unicode Normalization Forms
(UAX #15) is available.

see http://homepage1.nifty.com/nomenclator/perl/indexE.htm

NAME (a temporary name)

Text::Unicode::Normalize - normalized forms of Unicode text

SYNOPSIS

use Text::Unicode::Normalize;

$stringNFD = NFD
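
For readers following the thread, here is a minimal sketch of what the four functions named in the synopsis do. It is not the pre-release module discussed above: it assumes the Unicode::Normalize module as later distributed with Perl (the name suggested in the thread), and the sample strings and the code_points helper are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use Unicode::Normalize;   # assumption: the later core module; exports NFD, NFC, NFKD, NFKC

# Hypothetical helper: render a string as a space-separated list of code points.
sub code_points {
    my ($s) = @_;
    return join ' ', map { sprintf 'U+%04X', ord } split //, $s;
}

my $e_acute = "\x{00E9}";   # LATIN SMALL LETTER E WITH ACUTE
my $fi_lig  = "\x{FB01}";   # LATIN SMALL LIGATURE FI

my $nfd  = NFD($e_acute);   # canonical decomposition
my $nfc  = NFC($nfd);       # canonical decomposition, then composition
my $nfkd = NFKD($fi_lig);   # compatibility decomposition
my $nfkc = NFKC($fi_lig);   # compatibility decomposition, then composition

print "NFD:  ", code_points($nfd),  "\n";   # U+0065 U+0301
print "NFC:  ", code_points($nfc),  "\n";   # U+00E9
print "NFKD: ", code_points($nfkd), "\n";   # U+0066 U+0069
print "NFKC: ", code_points($nfkc), "\n";   # U+0066 U+0069
```

The canonical forms (D and C) round-trip the accented letter, while the compatibility forms (KD and KC) replace the ligature with its plain letters, which is why the thread treats the four forms as separate functions.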