On 3/21/2013 4:22 PM, Philippe Verdy wrote:
2013/3/21 Richard Wordingham richard.wording...@ntlworld.com:
Further, the code chart glyphs for the ANO TELEIA and the MIDDLE DOT
differ, see attachment. If they are canonically equivalent, and one
is a mandatory decomposition of the other, why do
This one is incredible:
https://bugzilla.redhat.com/show_bug.cgi?id=922433
2013/3/22 Asmus Freytag asm...@ix.netcom.com:
Semantic selectors are pure pseudo-coding, because if the semantic
differentiation is needed it is needed in plain text - and then it should be
expressible in plain character codes.
We don't disagree, that's exactly what I meant here : plain
2013/3/22 Asmus Freytag asm...@ix.netcom.com:
If you need to annotate text with the results of semantic analysis as
performed by a human reader, then you either need XML, or some other format
that can express that particular intent.
Absolutely NO. If this encodes semantics, this is part of
2013/3/22 Asmus Freytag asm...@ix.netcom.com:
The number of conventions that can be applicable to certain punctuation
characters is truly staggering, and it seems unlikely that Unicode is the
right place to
a) discover all of them or
b) standardize an expression for them.
My intent is
But how do we know whether the bug is there all the time?
On Fri, Mar 22, 2013 at 4:45 PM, Stephane Bortzmeyer bortzme...@nic.fr wrote:
This one is incredible:
https://bugzilla.redhat.com/show_bug.cgi?id=922433
On 3/22/2013 4:02 AM, Philippe Verdy wrote:
2013/3/22 Asmus Freytag asm...@ix.netcom.com:
Semantic selectors are pure pseudo-coding, because if the semantic
differentiation is needed it is needed in plain text - and then it should be
expressible in plain character codes.
We don't disagree,
On 3/22/2013 4:08 AM, Philippe Verdy wrote:
2013/3/22 Asmus Freytag asm...@ix.netcom.com:
If you need to annotate text with the results of semantic analysis as
performed by a human reader, then you either need XML, or some other format
that can express that particular intent.
Absolutely NO. If
Is there an implementation of a regular expression engine with full
Unicode Level 3 support as per UTS #18?
Kind regards,
Martinho
On 3/22/2013 4:16 AM, Philippe Verdy wrote:
2013/3/22 Asmus Freytag asm...@ix.netcom.com:
The number of conventions that can be applicable to certain punctuation
characters is truly staggering, and it seems unlikely that Unicode is the
right place to
a) discover all of them or
b) standardize an
On Fri, 22 Mar 2013 12:08:14 +0100
Philippe Verdy verd...@wanadoo.fr wrote:
adding new variants of existing characters like what was done
specifically for maths is not a stable long term solution; solutions
similar to variant selectors however are much more meaningful, and
will allow for
On 03/21/2013 04:48 PM, Richard Wordingham wrote:
For linguistic analysis, you need the normalisation appropriate to the
task. This is a case where Unicode normalisation generally throws away
information (namely, how the author views the characters), whereas in
analysing Burmese you may want to
This one is incredible:
https://bugzilla.redhat.com/show_bug.cgi?id=922433
This sort of failure to perform input validation and/or escaping is also
a sign of bad software engineering in general. I recall an important CGI
form of my university refusing to let me submit because I input an
And how many web forms forget to check the presence of a percent sign
and are executing SQL searches without checking it using clauses
similar to WHERE table.field LIKE :parameter by binding directly the
submitted form value to the parameter variable placeholder, ignoring
the fact that the percent
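
[Illustration, not part of the original message.] The LIKE problem described above can be sketched with Python's standard sqlite3 module. The table `items`, its rows, and the helpers `escape_like`, `search_naive` and `search_escaped` are hypothetical names invented for this example; the backslash escape character is also just one possible choice. Binding the form value directly lets a submitted "%" act as a wildcard, while escaping it first makes it match literally.

```python
import sqlite3

def escape_like(value, escape="\\"):
    """Escape LIKE wildcards so that the value is matched literally."""
    return (value.replace(escape, escape + escape)  # escape the escape char first
                 .replace("%", escape + "%")
                 .replace("_", escape + "_"))

# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany("INSERT INTO items VALUES (?)",
                 [("100% wool",), ("cotton",), ("50% off",)])

def search_naive(term):
    # Binds the submitted value directly: '%' and '_' stay active wildcards.
    sql = "SELECT name FROM items WHERE name LIKE :parameter ORDER BY name"
    return [row[0] for row in conn.execute(sql, {"parameter": term})]

def search_escaped(term):
    # Substring search in which the submitted value is treated literally.
    sql = ("SELECT name FROM items "
           "WHERE name LIKE :parameter ESCAPE '\\' ORDER BY name")
    pattern = "%" + escape_like(term) + "%"
    return [row[0] for row in conn.execute(sql, {"parameter": pattern})]

print(search_naive("%"))    # every row matches
print(search_escaped("%"))  # only rows containing a literal percent sign
```

Parameter binding alone prevents SQL injection, but it does not neutralise LIKE wildcards; that needs the separate escaping step shown in `search_escaped`.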
On 3/22/2013 12:08 PM, Karl Williamson wrote:
On 03/21/2013 04:48 PM, Richard Wordingham wrote:
For linguistic analysis, you need the normalisation appropriate to the
task.
Linguistic analysis (in general) being a hugely complex undertaking,
mere normalization pales in comparison, so
On Fri, 22 Mar 2013 13:08:01 -0600
Karl Williamson pub...@khwilliamson.com wrote:
This is the first time I've heard someone suggest that one can
tailor normalizations.
I think the officially acceptable term is 'folding'. One would
not be 'tailoring a Unicode normalisation', but subverting the
On Fri, 22 Mar 2013 18:01:14 -0700
Asmus Freytag asm...@ix.netcom.com wrote:
On 03/21/2013 04:48 PM, Richard Wordingham wrote:
However, distinguishing U+00B7 and U+0387 would fail spectacularly
if the text had been converted to form NFC before you received it.
That's a claim for which
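
[Illustration, not part of the original message.] The quoted point about NFC can be checked with Python's standard unicodedata module. U+0387 GREEK ANO TELEIA has a singleton canonical decomposition to U+00B7 MIDDLE DOT, so normalizing to NFC replaces it and the distinction cannot be recovered afterwards. The helper `survives_nfc` and the sample text are hypothetical, chosen only for this sketch.

```python
import unicodedata

ANO_TELEIA = "\u0387"  # GREEK ANO TELEIA
MIDDLE_DOT = "\u00b7"  # MIDDLE DOT

def survives_nfc(ch):
    """Return True if the string is unchanged by NFC normalization."""
    return unicodedata.normalize("NFC", ch) == ch

# Hypothetical sample: a Greek word followed by an ano teleia.
text = "και" + ANO_TELEIA
normalized = unicodedata.normalize("NFC", text)

print(unicodedata.name(normalized[-1]))
print(survives_nfc(ANO_TELEIA), survives_nfc(MIDDLE_DOT))
```

Any processing that needs to tell the two characters apart would therefore have to run before normalization, or rely on context rather than on the code point.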
On 3/22/2013 6:17 PM, Richard Wordingham wrote:
On Fri, 22 Mar 2013 18:01:14 -0700
Asmus Freytag asm...@ix.netcom.com wrote:
On 03/21/2013 04:48 PM, Richard Wordingham wrote:
However, distinguishing U+00B7 and U+0387 would fail spectacularly
if the text had been converted to form NFC before
18 matches