Martin Duerst wrote,
> As the person who implemented UTF-8 checking for http://validator.w3.org,
> I beg to disagree. In order to validate correctly, the validator has
> to make sure it correctly interprets the incoming byte sequence as
> a sequence of characters. For this, it has to know the c
Martin Duerst wrote,
>
> This is really bad. Have you made sure you have the right
> options? Tidy has a lot of options.
>
It sure does. One of which is "-utf8". Using this option
(tidy -utf8 -f output.txt -m input.htm)
works like a charm, directing the errors and warnings for
an HTML file c
As the person who implemented UTF-8 checking for http://validator.w3.org,
I beg to disagree. In order to validate correctly, the validator has
to make sure it correctly interprets the incoming byte sequence as
a sequence of characters. For this, it has to know the character
encoding. As an exampl
At 07:16 01/12/14 -0800, James Kass wrote:
>Having an HTML validator, like Tidy.exe, which generates errors
>or warnings every time it encounters a UTF-8 sequence is
>unnerving. It's especially irritating when the validator
>automatically converts each string making a single UTF-8
>character into
Hello James (and everybody else),
Can you please send comments and bug reports on the validator to
[EMAIL PROTECTED]? Sending bug reports to the right address
seriously increases the chance that they get fixed.
Regards, Martin.
At 14:46 01/12/16 -0800, James Kass wrote:
>Elliotte Rusty Harold
Elliotte Rusty Harold wrote,
>
> I suspect a lot of our tools haven't been thoroughly tested with
> Plane-1 and are likely to have these sorts of bugs in them.
Since Plane One is still fairly new, this is understandable.
I'm also having trouble getting Plane Zero pages to validate.
Spent seve
At 3:07 AM -0800 12/16/01, James Kass wrote:
>Tests run on non-BMP text show no problem for Plane One using
>UTF-8 encoding but error messages are generated when these
>characters are referenced as NCRs.
>
I suspect there are a lot of random mistakes like this waiting to be
discovered. I recently
The HTML validation service from W3C at:
http://validator.w3.org
has been commended on this list and appears to be sophisticated
and fast.
Tests run on non-BMP text show no problem for Plane One using
UTF-8 encoding but error messages are generated when these
characters are referenced as NCRs.
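
To make the two forms being compared concrete, here is a minimal sketch in Python (not part of the original message) that writes one Plane One character both as raw UTF-8 bytes and as a hexadecimal NCR. The sample character U+10330 and the helper name `ncr` are illustrative assumptions; the message does not say which characters were tested or what caused the validator's errors.

```python
import html


def ncr(ch: str) -> str:
    """Return the hexadecimal numeric character reference for one character.

    Hypothetical helper for illustration only.
    """
    return f"&#x{ord(ch):X};"


# Assumed sample: U+10330 GOTHIC LETTER AHSA, a Plane One character.
sample = "\U00010330"

# Form 1: the character encoded directly as UTF-8 (four bytes).
print(sample.encode("utf-8").hex())   # f0908cb0

# Form 2: the same character referenced as an NCR.
print(ncr(sample))                    # &#x10330;

# Both forms denote the same code point once the NCR is resolved.
print(html.unescape(ncr(sample)) == sample)   # True
```

The NCR names the code point itself (0x10330), so a tool that handles the UTF-8 form correctly can still mishandle the NCR form if its reference parsing assumes values of at most four hex digits.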
Asmus Freytag wrote,
>
> NCRs *are* markup.
Whether they are called "mark-up" or "macros", they are
certainly part of HTML and I was not disagreeing with you
that they should be checked by the validator.
> And validating that the encoding matches
> the declaration (e.g. UTF-8 is not ill-for
James,
NCRs *are* markup. And validating that the encoding matches
the declaration (e.g. UTF-8 is not ill-formed) has nothing
whatsoever to do with content, but all with verifying that
the file conforms to the HTML specification.
All this is completely different from spelling and grammar
checkin
Asmus Freytag wrote,
> A validator *should* look between the > and < in order to
> catch invalid entity references, esp. invalid NCRs.
>
> For UTF-8, it would ideally also check that no ill-formed,
> and therefore illegal, sequences are part of the UTF-8.
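
The check described in the quoted paragraph can be sketched as follows. This is a minimal illustration in Python that relies on the language's strict UTF-8 decoder; it is not the method used by Tidy or by the W3C validator, and the function name `find_ill_formed_utf8` is a hypothetical one chosen for this example.

```python
def find_ill_formed_utf8(data: bytes):
    """Return (offset, reason) for the first ill-formed UTF-8 sequence.

    Returns None when the whole byte string is well-formed UTF-8.
    Hypothetical helper for illustration only.
    """
    try:
        # Strict decoding rejects overlong forms, encoded surrogates,
        # stray continuation bytes and truncated sequences.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        return exc.start, exc.reason
    return None


# Well-formed input: nothing to report.
print(find_ill_formed_utf8("Welé Negga".encode("utf-8")))   # None

# Overlong encoding of U+0000 (C0 80) after two ASCII bytes.
print(find_ill_formed_utf8(b"ab\xc0\x80"))

# An encoded surrogate (ED A0 80) is also ill-formed in UTF-8.
print(find_ill_formed_utf8(b"\xed\xa0\x80"))
```

Reporting the byte offset rather than a plain yes/no makes it possible to point the author at the offending position in the file.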
You've made a good point about invalid
W3C's HTML validation service seems to have no such problems.
We've been using it to validate all the files on the unicode
site regularly.
A validator *should* look between the > and < in order to
catch invalid entity references, esp. invalid NCRs.
For UTF-8, it would ideally also check that no i
Welé Negga wrote,
> Does the Clean development team plan to make Concurrent
> Clean partially or fully Unicode compliant in their future
> releases, as this is crucial for those of us who use non-European
> writing systems, and more generally for those who develop
> truly global applications.
I
19940405
Hello,
Does the Clean development team plan to make Concurrent Clean partially or fully Unicode compliant in their future releases, as this is crucial for those of us who use non-European writing systems, and more generally for those who develop truly global applications.
Thanks in adva
14 matches