You mean like this?
The following is the zodiac, shown twice [ U+2648 ... U+2653 ]
Mortbats Zodiac:
1234567890-=
[ Needs Mortbats font to display, http://www.dingbatpages.com ]
Unicode Zodiac:
♈♉♊♋♌♍♎♏♐♑♒♓
[ Needs e.g. Arial Unicode MS to display ]
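A minimal sketch, in Python, of how the Unicode zodiac line can be produced from the code point range quoted above (U+2648 ... U+2653). The function name zodiac_signs is only an illustration. The Mortbats line, by contrast, is plain ASCII ("1234567890-="), so without that font it reads as digits and punctuation.

```python
import unicodedata

ZODIAC_FIRST = 0x2648  # first code point of the range quoted in the message
ZODIAC_LAST = 0x2653   # last code point of the range quoted in the message


def zodiac_signs() -> str:
    """Return the twelve zodiac characters as one string."""
    return "".join(chr(cp) for cp in range(ZODIAC_FIRST, ZODIAC_LAST + 1))


if __name__ == "__main__":
    # List each sign with its code point and character name.
    for ch in zodiac_signs():
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```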
The upper of these two zodiacs will give wron
I don't know about others, but my filters place messages in different folders,
EXCEPT when my name is on the cc or to list.
In that case, the message is left in my inbox for more immediate review and
possible response.
The Unicode lists are also slow to send mail so there can be a significant
dela
Peter Constable wrote,
> Sure, but why do we want to place so much demand on plain text when the
> vast majority of content we interchange is in some form of marked-up or
> rich text? Let's let plain text be that -- plain -- and look to the markup
> conventions that we've invested so much in and
Asmus Freytag wrote,
> Variation selectors also can be ignored based on their code
> point values, but unlike p14 tags, they don't become invalid
> when text is cut&paste from the middle of a string.
Excellent point.
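A rough Python sketch of what "ignored based on their code point values" could look like in practice. The ranges are my reading of the Unicode code charts (variation selectors at U+FE00..U+FE0F and U+E0100..U+E01EF, the Plane 14 Tags block at U+E0000..U+E007F), and the function names are hypothetical.

```python
def is_variation_selector(ch: str) -> bool:
    """True for VS1..VS16 and for the supplementary variation selectors."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def is_plane14_tag(ch: str) -> bool:
    """True for any code point in the Plane 14 Tags block."""
    return 0xE0000 <= ord(ch) <= 0xE007F


def strip_ignorables(text: str) -> str:
    """Drop variation selectors and Plane 14 tag characters from text."""
    return "".join(
        ch for ch in text
        if not (is_variation_selector(ch) or is_plane14_tag(ch))
    )
```

A variation selector travels with the single character it follows, so a substring cut from the middle of a string keeps it. A tag sequence at the start of a run is simply lost when only the middle of the run is copied.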
> Unicode 4.0 will be quite specific: P14 tags are "reserved for
> use with p
At 16:47 -0500 2003-02-05, Jim Allan wrote:
There are often conflicting orthographic usages within a language.
Language tagging alone does not indicate whether German text is to
be rendered in Roman or Fraktur, whether Gaelic text is to be
rendered in Roman or Uncial, and if Uncial, a modern U
James Kass posted:
The advantages of using P14 tags (...equals lang IDs mark-up) is
that runs of text could be tagged *in a standard fashion* and
preserved in plain-text.
But this still would not necessarily handle orthographic variations.
See Peter Constable's discussion of language classifca
Erik followed up:
> From what I'm hearing from you all is that a null
> in UTF-8 is for termination and termination only.
> Is this correct?
Not quite. A null byte (0x00) in UTF-8 is only a
representation of the NULL character (U+0000). It can
be present in UTF-8 for whatever purposes one might
I'm replying to myself, here.
Thank you all for so many quick and helpful responses.
As most of you pointed out, I misread the documentation -- which is doc for multi-byte
strings only (and not wide strings).
So I was brain dead when I asked about encodings other than UTF-8.
The doc states (in
On 02/05/2003 12:24:39 PM jameskass wrote:
>The advantages of using P14 tags (...equals lang IDs mark-up) is
>that runs of text could be tagged *in a standard fashion* and
>preserved in plain-text.
Sure, but why do we want to place so much demand on plain text when the
vast majority of content w
At 06:24 PM 2/5/03 +0000, [EMAIL PROTECTED] wrote:
The advantages of using P14 tags (...equals lang IDs mark-up) is
that runs of text could be tagged *in a standard fashion* and
preserved in plain-text.
The minute you have scoped tagging, you are no longer using
plain text.
The P14 tags are no
[EMAIL PROTECTED] wrote:
I'm dealing with an API that claims it doesn't support unicode characters with embedded nulls.
...
Test all constituent bytes for 0x00.
This depends on the encoding form you are using (and the API is expecting):
- UTF-8 encodes a Unicode string into a sequence of by
Erik Ostermueller wrote:
> I'm dealing with an API that claims it doesn't support
> unicode characters with embedded nulls.
> I'm trying to figure out how much of a liability this is.
If by "embedded nulls" they mean bytes of value zero, that library can
*only* work with UTF-8. The other two UTF'
Are you sure the API doesn't support Unicode _characters_ with embedded
NULs? Or does it fail to support Unicode _strings_ with embedded NULs?
If it really is the former, no character in UTF-8 (except, of course,
U+0000) will include a NUL byte. In UTF-16, it will be any character of the
form U+00
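One way to check the point made in these replies is to encode a character in each encoding form and look for 0x00 bytes. The following is a small Python sketch under that assumption. The helper names are hypothetical, and the big-endian codec names are used only so that no byte order mark is added.

```python
# Encoding forms without a byte order mark, so only the code units are checked.
ENCODINGS = ("utf-8", "utf-16-be", "utf-32-be")


def has_zero_byte(ch: str, encoding: str) -> bool:
    """Return True if encoding the single character yields any 0x00 byte."""
    return 0 in ch.encode(encoding)


def report(ch: str) -> dict:
    """Map each encoding form to whether ch contains a zero byte in it."""
    return {enc: has_zero_byte(ch, enc) for enc in ENCODINGS}


if __name__ == "__main__":
    for ch in ("\u0000", "A", "\u0100", "\u010c", "\u2648"):
        print(f"U+{ord(ch):04X}", report(ch))
```

In UTF-8 only U+0000 produces a zero byte. In UTF-16 any code unit of the form 00xx or xx00 does, and in UTF-32 every code point does, because the most significant byte is always zero.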
Peter Constable wrote,
> The plain-text file would be legible without that -- I don't think this is
> an argument in favour of plane 14 tag characters. Preserving
> culturally-preferred appearance would certainly require markup of some
> form, whether lang IDs or for font-face and perhaps font-f
Andrew C. West wrote,
> Is this not what the variation selectors are available for ?
>
> And now that we are soon to have 256 of them, perhaps Unicode ought not to be shy
> about using them for characters other than mathematical symbols.
>
Yes, there seem to be additional variation selectors coming
On 02/05/2003 04:05:44 AM "Andrew C. West" wrote:
>> If these alternate forms were needed to be displayed in a single
>> multi-lingual plain-text file, wouldn't we need some method of
>> tagging the runs of Latin text for their specific languages?
>
>Is this not what the variation selectors are a
On 02/04/2003 02:52:25 PM jameskass wrote:
>If these alternate forms were needed to be displayed in a single
>multi-lingual plain-text file, wouldn't we need some method of
>tagging the runs of Latin text for their specific languages?
The plain-text file would be legible without that -- I don't
Hello, all.
I'm dealing with an API that claims it doesn't support unicode characters with
embedded nulls.
I'm trying to figure out how much of a liability this is.
What is my best plan of attack for discovering precisely which code points have
embedded nulls given a particular encoding? Didn'
SRIDHARAN Aravind wrote:
> I have Czech special characters in an excel file.
> I copy them into Notepad.
> I save them.
>
> Now I use native2ascii convertor that is available with JDK.
> After I run this utility, I am getting some other unicode values or
> sometimes only whitespaces come out.
> I
* [EMAIL PROTECTED]
|
| Please forgive me and others who are on similar set-ups if this is
| all just too much of a pain!
It is hard for people to avoid giving others two copies of replies to
on-list messages. In my case I've solved this, since my email client
(Gnus) detects duplicate messages (
> > How to get unicode values for special characters in Java?
> > I have a set of Czech special characters?
Use SC Unipad: http://www.unipad.org/
You can paste Unicode text into it or convert from any other encoding using
the File / Import option. Then, you can use Edit / Select All and Edit /
Co
- Original Message -
From: "SRIDHARAN Aravind" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, February 05, 2003 8:27 AM
Subject: How to convert special characters into unicode?
> How to get unicode values for special characters in Java?
> I have a set of Czech special char
Thomas Chan wrote:
> I've typed up my notes on the Vietnamese and Korean ones here, for you:
Many thanks, that is extremely useful, as I don't know either of these languages.
Over night I added Persian (thanks to Roozbeh) and Japanese to the list, and so
with Korean and Vietnamese it should be q
On Wed, 05 Feb 2003 02:00:30 -0800 (PST), [EMAIL PROTECTED] wrote:
> If these alternate forms were needed to be displayed in a single
> multi-lingual plain-text file, wouldn't we need some method of
> tagging the runs of Latin text for their specific languages?
Is this not what the variation sel
How to get unicode values for special characters in Java?
I have a set of Czech special characters?
For LATIN CAPITAL LETTER C WITH CARON, the Unicode values are 010c and 010d
(upper and lower case, respectively).
I got these values from a PDF chart (U0100.pdf) at www.unicode.org
I have Czech special char
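As an illustration of the values mentioned in this question, here is a short sketch (in Python rather than Java) that lists the code point of each character and writes non-ASCII characters as \uXXXX escapes of the kind Java source and property files accept. The function name to_java_escapes is hypothetical, and this is not the native2ascii tool itself.

```python
def to_java_escapes(text: str) -> str:
    """Replace every non-ASCII character with \\uXXXX escapes."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)  # ASCII passes through unchanged
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04x}")
        else:
            # Supplementary characters need a UTF-16 surrogate pair.
            v = cp - 0x10000
            out.append(f"\\u{0xD800 + (v >> 10):04x}\\u{0xDC00 + (v & 0x3FF):04x}")
    return "".join(out)


if __name__ == "__main__":
    sample = "\u010c\u010d"  # C WITH CARON, capital and small
    for ch in sample:
        print(f"U+{ord(ch):04X}")
    print(to_java_escapes(sample))
```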
Kent Karlsson wrote:
> Consider English. If I write "", that may well be a spell error.
Or even "Ŋŋŋŋ!", as Michael Everson wrote in WG2 N2306.
-Doug Ewell
Fullerton, California