Re: Another take on the English apostrophe in Unicode

2015-06-11 Thread Bill Poser
e single quotes won't be usable and you'll use > chevrons like in this ‹demo’› and not single or double quotes which are > difficult to discriminate. > > > 2015-06-11 19:47 GMT+02:00 Bill Poser : > >> To add a factor that I think hasn't been mentioned, there a

Re: Another take on the English apostrophe in Unicode

2015-06-11 Thread Bill Poser
To add a factor that I think hasn't been mentioned, there are languages in which apostrophe is used both as a letter by itself and as part of a complex letter. Most of the native languages of British Columbia write glottalized consonants as C+', e.g. for an ejective alveolar stop, and many use apo

Re: "Unicode of Death"

2015-05-28 Thread Bill Poser
No doubt the evil Unicode Consortium is in league with the Trilateral Commission, the Elders of Zion, and the folks at NASA who faked the moon landing :) On Thu, May 28, 2015 at 7:53 AM, Doug Ewell wrote: > Unicode is in the news today as some folks with waaay too much time on > their hands h

Re: Swift

2014-06-05 Thread Bill Poser
A few years ago there was a company in Australia that was developing a multilingual language called Protium Blue. The lead was someone named Diarmuid Pigott. As far as I can tell, the project has come to an end, but one can still find bits about the project, e.g. this: http://www.qualitytesting.in

Re: Websites in Hindi

2014-03-02 Thread Bill Poser
In my experience the problem with Hindi web sites is that many of them used encodings other than Unicode, frequently encodings designed for a particular font. Some fonts did not use anything like a normal encoding. We encountered a newspaper that used a font with 8,000-some glyphs each representing

Re: What to backup after corruption of code units?

2013-08-27 Thread Bill Poser
"backup" in this context refers to moving to previous bytes in order to find the boundary between the previous, valid character, and the corrupted character that you have encountered. In other words if you have a string consisting of N bytes and at byte K you determine that the current sequence of
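The backing-up step described above can be sketched as follows. This is only an illustration and assumes the data is UTF-8 (the snippet does not name the encoding); the function name backup_to_boundary is hypothetical. It steps back from byte K over continuation bytes until it reaches a byte that could begin a character.

```python
def backup_to_boundary(data: bytes, k: int) -> int:
    """Return the index of the nearest byte at or before k that is not a
    UTF-8 continuation byte (bit pattern 10xxxxxx).

    Assumption: data is meant to be UTF-8. A well-formed UTF-8 character
    has at most three continuation bytes, so we never step back more
    than three positions.
    """
    i = k
    while i > 0 and k - i < 3 and (data[i] & 0xC0) == 0x80:
        i -= 1
    return i


# "a", "é" (2 bytes) and "€" (3 bytes) occupy indices 0, 1-2 and 3-5.
sample = "a\u00e9\u20ac".encode("utf-8")
start = backup_to_boundary(sample, 5)
```

The returned index is only a candidate boundary: the byte found there still has to be checked as a valid lead byte for the number of bytes that follow it.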

Re: pIqaD in actual use

2013-02-20 Thread Bill Poser
If we assume that television sitcoms reflect reality, one can find native speakers of Klingon via their synagogues. :) On Wed, Feb 20, 2013 at 5:26 PM, Phil Carter wrote: > > From: Richard Wordingham > >To: Unicode Mailing List > >Sent: Wednesday, February 20, 2013 6:38 PM > >Subject: Re: pIqa

Re: Navajo

2013-02-13 Thread Bill Poser
I am familiar with written Navajo. The writing system is almost pure ASCII. The only additional characters needed are combining acute accent for high tone, combining ogonek for nasalization, and the upper- and lower-case barred l's for the voiceless lateral fricative. The list below looks complete

Re: Destruction in Timbuktu

2013-01-31 Thread Bill Poser
It is indeed sad. However, I have now seen reports that most of the manuscripts were stored elsewhere and that it is believed that most of the collection has survived. I hope this is true. http://allafrica.com/stories/201301301130.html On Thu, Jan 31, 2013 at 1:07 PM, Ed Trager wrote: > Hi, eve

Re: End of story character

2013-01-24 Thread Bill Poser
There's also the venerable U+0003 "end of text". It has the virtue (?) of having no associated glyph and so can be realized however one likes. On Thu, Jan 24, 2013 at 4:41 PM, Richard Wordingham < richard.wording...@ntlworld.com> wrote: > On Thu, 24 Jan 2013 20:05:41 -0300 > Andrés Sanhueza wrot

Re: Why is "endianness" relevant when storing data on disks but not when in memory?

2013-01-05 Thread Bill Poser
Endianness of data stored in memory is relevant but only if you are working at a very low level. Suppose that you have UTF-32 data stored as unsigned C integers. On pretty much any modern computer, each codepoint will occupy four 8-bit bytes. So long as you deal with that data via C, as unsigned 32
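A small sketch of the point about byte order, written in Python rather than C for brevity. The code point chosen is arbitrary. Treated as a 32-bit integer the value is the same everywhere; only when it is serialized to individual bytes does the order become visible.

```python
import struct

code_point = 0x1F600  # arbitrary example value

# The same 32-bit integer laid out as four bytes in each byte order.
little = struct.pack("<I", code_point)  # least significant byte first
big = struct.pack(">I", code_point)     # most significant byte first

# Reading the bytes back with the matching byte order recovers the integer;
# reading them with the wrong order yields a different number.
same = struct.unpack("<I", little)[0]
wrong = struct.unpack(">I", little)[0]
```

Code that only ever handles the value as an integer never sees the difference between little and big; code that writes the bytes to a file or socket does.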

Re: Tool to convert characters to character names

2012-12-19 Thread Bill Poser
If by "online" you mean "on the web" then this isn't what you want, but the uniname utility in my unidesc package converts characters to names. I haven't yet updated the data but will soon. http://billposer.org/Software/unidesc.html On Wed, Dec 19, 2012 at 9:03 PM, "Martin J. Dürst" wrote: > I'm
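For readers without the uniname utility at hand, the same kind of character-to-name lookup can be sketched with Python's standard unicodedata module. This is a stand-in, not the unidesc implementation; the function name describe and the "<unnamed>" placeholder are hypothetical, and the names returned depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Pair each character's code point (as U+XXXX) with its Unicode name.

    Characters with no name in the database (for example control
    characters) get the placeholder "<unnamed>".
    """
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


for code, name in describe("A\u0332"):
    print(code, name)
```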

Re: problem with combining diacritcs in HTML5

2012-10-09 Thread Bill Poser
Yes, precisely. It's the combining behaviour that matters, not the distinction between the two slightly different low lines. On Tue, Oct 9, 2012 at 10:51 AM, Jukka K. Korpela wrote: > 2012-10-09 20:32, Bill Poser wrote: > > No, I was contrasting the behaviour of s followed by U+0

Re: problem with combining diacritcs in HTML5

2012-10-09 Thread Bill Poser
No, I was contrasting the behaviour of s followed by U+0332, for which there is no precomposed letter, with U+1E95, which is the precomposed equivalent of z followed by U+0332. On Tue, Oct 9, 2012 at 10:13 AM, Andreas Prilop wrote: > On Sat, 6 Oct 2012, Bill Poser wrote: > > > Chara
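The contrast between a sequence with no precomposed form and a precomposed letter can be checked with Python's unicodedata module. This is a sketch for inspection only; the helper name composes_to_single is hypothetical, and the results reflect the Unicode data shipped with the interpreter. Printing the recorded decomposition of U+1E95 also shows exactly which combining mark it contains.

```python
import unicodedata


def composes_to_single(sequence: str) -> bool:
    """True if NFC normalization turns the sequence into one code point."""
    return len(unicodedata.normalize("NFC", sequence)) == 1


# s followed by U+0332 COMBINING LOW LINE: check for a precomposed form.
s_low_line = composes_to_single("s\u0332")

# The canonical decomposition recorded for U+1E95, as hex code points.
z_decomposition = unicodedata.decomposition("\u1e95")

print(s_low_line, z_decomposition)
```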

Re: problem with combining diacritcs in HTML5

2012-10-07 Thread Bill Poser
Voice of G-d. :) On Sun, Oct 7, 2012 at 1:51 AM, Michael Everson wrote: > On 7 Oct 2012, at 08:37, Jukka K. Korpela wrote: > > > 2012-10-07 8:38, Bill Poser wrote: > > > >> I have a web page that writes into an HTML5 textarea via the javascript > >> do

problem with combining diacritcs in HTML5

2012-10-06 Thread Bill Poser
I have a web page that writes into an HTML5 textarea via the JavaScript DOM interface. U+0332 COMBINING LOW LINE is incorrectly rendered as a spacing low line in both Mozilla Firefox and Google Chrome, which is peculiar since they use different rendering engines. Characters with a combining low line

Re: texteditors that can process and save in different encodings

2012-10-04 Thread Bill Poser
Another editor that can read and save in a variety of encodings is vim, the gussied-up successor to the Unix vi editor: http://www.vim.org It is available for MS Windows, Mac OS X, Linux, and a variety of other systems.

Re: Compiling a list of Semitic transliteration characters

2012-09-07 Thread Bill Poser
There is another reason for romanizing, namely where it is desired to represent constituents below the level of analysis of the native writing system, e.g. individual segments when the native system is suprasegmental. For example, many Japanese verbs have stems that end in consonants that are not p

Re: Compiling a list of Semitic transliteration characters

2012-09-05 Thread Bill Poser
It is also at least logically possible for there to be transliterations from Semitic writing systems to non-Roman writing systems. I'm not aware of such a thing, but one can imagine, for example, Russian work using a Cyrillic-based transliteration. Even if such things are not in scholarly use, I be

Re: [unicode] Re: Canadian aboriginal syllabics in vertical writing mode

2012-05-02 Thread Bill Poser
In the case of the Carrier syllabics, I have never seen an example of vertical text so there is no native usage to go by. However, as others have said, rotated text is very difficult to read because of the role of orientation. It's true that the small characters provide evidence as to the direction

A new character to encode from the Onion? :)

2012-04-30 Thread Bill Poser
Digital typography has reached *The Onion*: http://www.theonion.com/articles/errant-keystroke-produces-character-never-before-s,28030/ .

Re: Civil suit; ftp shutdown; mailing list shutdown

2011-10-07 Thread Bill Poser
There's a discussion of the lawsuit on Slashdot: http://yro.slashdot.org/story/11/10/06/1743226/civil-suit-filed-involving-the-time-zone-database On Thu, Oct 6, 2011 at 10:14 PM, "Martin J. Dürst" wrote: > [By accident, I sent this only to Ken first; he recommended I send it to > both Unicode and

Re: searching for PUA characters

2011-08-25 Thread Bill Poser
On Thu, Aug 25, 2011 at 1:17 PM, Lorna Priest wrote: > The recent discussion on PUA characters reminded me of a question I've had. > I am wondering if anyone has a tool whereby we could search for all > documents on a local computer (or server) that use PUA codepoints. I suppose > what I'd like i
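One way to approach the question quoted above is sketched below. It is not the answer given in the thread, which is cut off here; the names PUA_RANGES and find_pua are hypothetical. It assumes the document text has already been decoded to a Python string and reports every character that falls in one of the three Private Use ranges.

```python
# The three Private Use ranges: the BMP Private Use Area and the
# private-use planes 15 and 16 (their last two code points are
# noncharacters and are excluded).
PUA_RANGES = (
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
)


def find_pua(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for each private-use character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if any(lo <= ord(ch) <= hi for lo, hi in PUA_RANGES)
    ]


hits = find_pua("a\ue000b\U000f0000")
```

Searching a whole machine would mean walking the directory tree, decoding each file with the right encoding, and calling a function like this on the result; that part depends on the file formats involved and is left out.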

Re: Code pages and Unicode

2011-08-19 Thread Bill Poser
Even if we do encounter extraterrestrials with writing, they are likely to be so far away that communication will be all but impossible. The universe is large in comparison to the speed of light. On Fri, Aug 19, 2011 at 2:53 PM, Benjamin M Scarborough < benjamin.scarboro...@utdallas.edu> wrote: >

Re: Unifon

2011-06-28 Thread Bill Poser
Here is a document by Bennett that describes the use of Unifon for Hupa, Tolowa, Yurok and Karok: http://eric.ed.gov/ERICWebPortal/contentdelivery/servlet/ERICServlet?accno=ED310889 On Tue, Jun 28, 2011 at 11:05 AM, Jean-François Colson wrote: > On 28/06/11 19:22, Bill Poser wrote: > >

Re: Unifon

2011-06-28 Thread Bill Poser
archValue_0=Hupa&eric_displayStartCount=1&_pageLabel=ERICSearchResult&ERICExtSearch_SearchType_0=kw None of the more recent material in Hupa is in Unifon. On Tue, Jun 28, 2011 at 11:05 AM, Jean-François Colson wrote: > On 28/06/11 19:22, Bill Poser wrote: > >> Unifon was used

Re: Unifon

2011-06-28 Thread Bill Poser
Unifon was used at one point to write several languages in northern California, so it has seen practical application. I'm not sure how much material was published in this form. I don't think that any of these tribes is still using Unifon.

Re: Application that displays CJK text in Normalization Form D

2010-11-13 Thread Bill Poser
On Sat, Nov 13, 2010 at 4:46 PM, Jim Monty wrote: > Is there even a single software application that properly displays CJK text > in > Normalization Form D? > > I just tried your examples in Yudit (http://www.yudit.org) and they seem to work: the NFD text looks the same as the NFC text.

Re: ? Reasonable to propose stability policy on numeric type = decimal

2010-07-24 Thread Bill Poser
> Bill, > Michael is no programmer, hence he doesn't have first hand understanding why > programmers distinguish between character set mapping (normally requiring > look-up tables) and digit conversion (normally done by offset calculations). > > That said, there are enough programmers on the committ

Fwd: ? Reasonable to propose stability policy on numeric type = decimal

2010-07-24 Thread Bill Poser
-- Forwarded message -- From: Bill Poser Date: Sat, Jul 24, 2010 at 6:02 PM Subject: Re: ? Reasonable to propose stability policy on numeric type = decimal To: Michael Everson On Sat, Jul 24, 2010 at 4:25 PM, Michael Everson wrote: > On 24 Jul 2010, at 23:00, Bill Poser wr

Re: ? Reasonable to propose stability policy on numeric type = decimal

2010-07-24 Thread Bill Poser
On Sat, Jul 24, 2010 at 1:00 PM, Michael Everson wrote: > Digits can be scattered randomly about the code space and it wouldn't make > any difference. Having written a library for performing conversions between Unicode strings and numbers, I disagree. While it is not all that hard to deal with
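Why contiguous digits matter to a conversion library can be illustrated with a short sketch. This is not the library mentioned above; the function name and the choice of Devanagari as the example script are assumptions made for illustration. When the ten digits of a script occupy consecutive code points in order, each digit's value is simply its offset from the script's zero, and no look-up table is needed.

```python
DEVANAGARI_ZERO = 0x0966  # DEVANAGARI DIGIT ZERO; digits run to U+096F


def devanagari_to_int(digits: str) -> int:
    """Convert a string of Devanagari decimal digits to an integer
    using only an offset from the code point of the script's zero."""
    value = 0
    for ch in digits:
        d = ord(ch) - DEVANAGARI_ZERO
        if not 0 <= d <= 9:
            raise ValueError(f"not a Devanagari digit: U+{ord(ch):04X}")
        value = value * 10 + d
    return value


number = devanagari_to_int("\u0967\u0968\u0969")
```

If the digits were scattered around the code space, the subtraction would have to be replaced by a per-character table for every script.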

Re: Indian Rupee Sign (U+20B9) proposal - copyright/licencing issue

2010-07-20 Thread Bill Poser
The Indian copyright office site is: http://copyright.gov.in/. There are links to the legislation and regulations, an FAQ, etc.

Re: Indian Rupee Sign (U+20B9) proposal - copyright/licencing issue

2010-07-20 Thread Bill Poser
A quick check of the Indian government web site indicates that the Government of India does claim copyright in government works (unlike the US federal government), so under Indian law an explicit license may be necessary.