HTML Validation (was Re: Clean and Unicode compliance)

2001-12-16 Thread James Kass
The HTML validation service from W3C at http://validator.w3.org has been commended on this list and appears to be sophisticated and fast. Tests run on non-BMP text show no problem for Plane One using UTF-8 encoding, but error messages are generated when these characters are referenced as NCRs.
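For readers unfamiliar with the two forms being compared, here is a minimal Python sketch (not from the thread; the Plane One character U+10330 GOTHIC LETTER AHSA is chosen arbitrarily) showing the same character written as raw UTF-8 bytes and as a hexadecimal NCR, both of which a conforming validator should accept:

    # Minimal sketch (not from the thread): one Plane One character,
    # U+10330 GOTHIC LETTER AHSA, written the two ways being compared.
    ahsa = "\U00010330"

    # As raw UTF-8 bytes, the form the validator accepted:
    utf8_bytes = ahsa.encode("utf-8")
    print(" ".join(f"{b:02x}" for b in utf8_bytes))   # f0 90 8c b0

    # As a hexadecimal numeric character reference, the form that drew errors:
    ncr = "&#x{:X};".format(ord(ahsa))
    print(ncr)                                        # &#x10330;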

Re: HTML Validation (was Re: Clean and Unicode compliance)

2001-12-16 Thread Elliotte Rusty Harold
At 3:07 AM -0800 12/16/01, James Kass wrote: Tests run on non-BMP text show no problem for Plane One using UTF-8 encoding, but error messages are generated when these characters are referenced as NCRs. I suspect there are a lot of random mistakes like this waiting to be discovered. I recently

Re: HTML Validation (was Re: Clean and Unicode compliance)

2001-12-16 Thread James Kass
Elliotte Rusty Harold wrote, I suspect a lot of our tools haven't been thoroughly tested with Plane-1 and are likely to have these sorts of bugs in them. Since Plane One is still fairly new, this is understandable. I'm also having trouble getting Plane Zero pages to validate. Spent

Re: HTML Validation (was Re: Clean and Unicode compliance)

2001-12-16 Thread Martin Duerst
Hello James (and everybody else), Can you please send comments and bug reports on the validator to [EMAIL PROTECTED]? Sending bug reports to the right address seriously increases the chance that they get fixed. Regards, Martin. At 14:46 01/12/16 -0800, James Kass wrote: Elliotte Rusty Harold

Re: Clean and Unicode compliance

2001-12-16 Thread Martin Duerst
At 07:16 01/12/14 -0800, James Kass wrote: Having an HTML validator, like Tidy.exe, which generates errors or warnings every time it encounters a UTF-8 sequence is unnerving. It's especially irritating when the validator automatically converts each string making up a single UTF-8 character into two
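To see concretely what James describes, here is an illustrative Python sketch (an assumption about the failure mode, not Tidy's actual code path): a tool that misreads UTF-8 bytes as a single-byte encoding turns one two-byte character into two characters.

    # Illustrative sketch, not Tidy's actual behaviour: misreading UTF-8
    # bytes as a single-byte encoding splits one character into two.
    e_acute = "\u00e9"                  # U+00E9 LATIN SMALL LETTER E WITH ACUTE
    utf8 = e_acute.encode("utf-8")      # b'\xc3\xa9' - two bytes
    misread = utf8.decode("latin-1")    # 'Ã©' - now two characters
    print(len(e_acute), len(misread))   # 1 2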

Re: Clean and Unicode compliance

2001-12-16 Thread Martin Duerst
As the person who implemented UTF-8 checking for http://validator.w3.org, I beg to disagree. In order to validate correctly, the validator has to make sure it correctly interprets the incoming byte sequence as a sequence of characters. For this, it has to know the character encoding. As an
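As a rough illustration of Martin's point (a sketch only, not the validator's actual implementation; the function name is invented), strict decoding under the declared encoding is the step that both yields characters and catches ill-formed byte sequences:

    # Rough sketch, not the W3C validator's actual code: interpreting the
    # incoming bytes under the declared encoding is what turns them into
    # characters, and it is also where ill-formed sequences are caught.
    def decode_document(raw: bytes, declared_encoding: str = "utf-8") -> str:
        # errors="strict" makes ill-formed sequences fatal rather than
        # silently replaced.
        return raw.decode(declared_encoding, errors="strict")

    decode_document(b"\xf0\x90\x8c\xb0")   # well-formed UTF-8 for U+10330
    # decode_document(b"\xf0\x90\x8c")     # truncated sequence: raises UnicodeDecodeError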

Re: Clean and Unicode compliance

2001-12-16 Thread James Kass
Martin Duerst wrote, This is really bad. Have you made sure you have the right options? Tidy has a lot of options. It sure does, one of which is -utf8. Using this option (tidy -utf8 -f output.txt -m input.htm) works like a charm, directing the errors and warnings for an HTML file called

Re: Clean and Unicode compliance

2001-12-16 Thread James Kass
Martin Duerst wrote, As the person who implemented UTF-8 checking for http://validator.w3.org, I beg to disagree. In order to validate correctly, the validator has to make sure it correctly interprets the incoming byte sequence as a sequence of characters. For this, it has to know the

Clean and Unicode compliance

2001-12-14 Thread W4z5m4
Hello, Does the Clean development team plan to make Concurrent Clean partially or fully Unicode compliant in their future releases? This is crucial for those of us who use non-European writing systems, and more generally for those who develop truly global applications. Thanks in

Re: Clean and Unicode compliance

2001-12-14 Thread James Kass
Welé Negga wrote, Does the Clean development team plan to make Concurrent Clean partially or fully Unicode compliant in their future releases? This is crucial for those of us who use non-European writing systems, and more generally for those who develop truly global applications. It is

Re: Clean and Unicode compliance

2001-12-14 Thread Asmus Freytag
W3C's HTML validation service seems to have no such problems. We've been using it to validate all the files on the Unicode site regularly. A validator *should* look between the tags in order to catch invalid entity references, esp. invalid NCRs. For UTF-8, it would ideally also check that no ill-formed, and therefore illegal, sequences are part of the UTF-8.
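The NCR check Asmus calls for might look roughly like the following Python sketch (the rules shown, rejecting surrogate code points and values above U+10FFFF, are an assumption about what "invalid" means here, and the pattern and function name are invented for illustration):

    import re

    # Sketch only, with assumed rules: an NCR is treated as invalid if it
    # refers to a surrogate code point or to a value beyond U+10FFFF.
    NCR_PATTERN = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")

    def invalid_ncrs(markup):
        bad = []
        for match in NCR_PATTERN.finditer(markup):
            hex_digits, dec_digits = match.groups()
            cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
            if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
                bad.append(match.group(0))
        return bad

    print(invalid_ncrs("&#x10330; &#xD800; &#1114112;"))   # ['&#xD800;', '&#1114112;']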

Re: Clean and Unicode compliance

2001-12-14 Thread James Kass
Asmus Freytag wrote, A validator *should* look between the tags in order to catch invalid entity references, esp. invalid NCRs. For UTF-8, it would ideally also check that no ill-formed, and therefore illegal, sequences are part of the UTF-8. You've made a good point about invalid NCRs

Re: Clean and Unicode compliance

2001-12-14 Thread Asmus Freytag
James, NCRs *are* markup. And validating that the encoding matches the declaration (e.g. that UTF-8 is not ill-formed) has nothing whatsoever to do with content, but everything to do with verifying that the file conforms to the HTML specification. All this is completely different from spelling and grammar

Re: Clean and Unicode compliance

2001-12-14 Thread James Kass
Asmus Freytag wrote, NCRs *are* markup. Whether they are called markup or macros, they are certainly part of HTML, and I was not disagreeing with you that they should be checked by the validator. And validating that the encoding matches the declaration (e.g. that UTF-8 is not ill-formed)