Re: RTL PUA?

2011-08-20 Thread Asmus Freytag

On 8/20/2011 6:44 PM, Doug Ewell wrote:

> Would that really be a better default? I thought the main RTL needs for the
> PUA would be for unencoded scripts, not for even more Arabic letters. (How
> many more are there anyway?)
>
> In any case, either 'R' or 'AL' as the Plane 16 default would be an
> improvement over having 'L' for the entire PUA.




The best default would be an explicit "PU" - undefined behavior in the 
absence of a private agreement.


However, it helps to remember why the PUAs exist to begin with. The
demand came from East Asian character sets, which had long had such
private use areas. In their case, the issue of properties did not
seriously arise, because the vast bulk of private characters were
ideographs.


I bet this remains true, and so the original motivation for the 
suggestion of "L" as the default would still apply - no matter how 
unsatisfactory this is from a formal point of view.


If maintaining the "L" default were to fail on the cliff of political
correctness (or the "fairness" argument that has been made), the only
proper solution is to use a value of "unknown" (i.e., the hypothetical PU
value) for all private use code points.


There are some properties where stability guarantees prevent adding a 
new value. In that case, the documentation should point out that the 
intended effect was to have a PU value, but for historical / stability 
reasons, the tables contain a different entry.


Suggesting a "structure" on the private use area, by suggesting 
different default properties, ipso facto makes the PUA less private. 
That should be a non-starter.


A./
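
For readers who want to check the current default discussed in this message,
the following is a minimal sketch using Python's standard unicodedata module.
The sample code points and the helper name pua_defaults are illustrative
choices, not anything proposed in the thread. With the Unicode data shipped in
current Python releases, each private use range should report General_Category
Co and Bidi_Class L, which is the "L for the entire PUA" situation under
discussion.

```python
import unicodedata

# First code point of each private use range: BMP PUA, Plane 15, Plane 16.
# These samples are illustrative; any code point in the ranges would do.
PUA_SAMPLES = {
    "BMP PUA (U+E000)": 0xE000,
    "Plane 15 (U+F0000)": 0xF0000,
    "Plane 16 (U+100000)": 0x100000,
}


def pua_defaults():
    """Return {label: (general category, bidi class)} for the sample code points."""
    return {
        label: (unicodedata.category(chr(cp)), unicodedata.bidirectional(chr(cp)))
        for label, cp in PUA_SAMPLES.items()
    }


if __name__ == "__main__":
    for label, (gc, bc) in pua_defaults().items():
        print(f"{label}: General_Category={gc}, Bidi_Class={bc}")
```

A hypothetical "PU" value, as suggested above, has no counterpart in this
data; the tables can only report one of the existing Bidi_Class values.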




Re: RTL PUA?

2011-08-20 Thread Doug Ewell
Would that really be a better default? I thought the main RTL needs for the PUA 
would be for unencoded scripts, not for even more Arabic letters. (How many 
more are there anyway?)

In any case, either 'R' or 'AL' as the Plane 16 default would be an improvement 
over having 'L' for the entire PUA.

--Original Message--
From: Richard Wordingham
Sender: unicode-bou...@unicode.org
To: unicode@unicode.org
Subject: Re: RTL PUA?
Sent: Aug 20, 2011 19:18

On Sun, 21 Aug 2011 00:21:28 +
"Doug Ewell"  wrote:

> The more I think of it, the more I like the idea of reassigning the
> default BC of Plane 16 to 'R'. What would the arguments against this
> be?

BC of 'AL'?

Richard.


--
Doug Ewell • d...@ewellic.org
Sent via BlackBerry by AT&T




Re: RTL PUA?

2011-08-20 Thread Richard Wordingham
On Sun, 21 Aug 2011 00:21:28 +
"Doug Ewell"  wrote:

> The more I think of it, the more I like the idea of reassigning the
> default BC of Plane 16 to 'R'. What would the arguments against this
> be?

BC of 'AL'?

Richard.



Re: Code pages and Unicode

2011-08-20 Thread Richard Wordingham
On Fri, 19 Aug 2011 17:03:41 -0700
Ken Whistler  wrote:

> O.k., so apparently we have awhile to go before we have to start
> worrying about the Y2K or IPv4 problem for Unicode. Call me again in
> the year 2851, and we'll still have 5 years left to design a new
> scheme and plan for the transition. ;-)

It'll be much easier to extend UTF-16 if there are still enough
contiguous points available.  Set that wake-up call for 2790, or
whenever plane 13 (better, plane 12) is about to come into use.

Richard.
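
As background to the remark about extending UTF-16, here is a minimal sketch
of the standard surrogate-pair arithmetic. It shows why the existing scheme
stops at plane 16: 1024 high surrogates times 1024 low surrogates address
exactly 16 supplementary planes. The function names are illustrative. That an
extension would need a further contiguous run of unassigned code points to
serve as new surrogate-like units is my reading of the message, not something
it spells out.

```python
HIGH_START = 0xD800          # first high (lead) surrogate
LOW_START = 0xDC00           # first low (trail) surrogate
SURROGATES_PER_HALF = 0x400  # 1024 code units in each surrogate range


def to_surrogates(cp):
    """Split a supplementary code point (U+10000..U+10FFFF) into a UTF-16 pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000
    return HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)


def supplementary_capacity():
    """Number of code points reachable through surrogate pairs."""
    return SURROGATES_PER_HALF * SURROGATES_PER_HALF


if __name__ == "__main__":
    high, low = to_surrogates(0x10FFFF)
    print(f"U+10FFFF -> {high:04X} {low:04X}")
    print("supplementary planes:", supplementary_capacity() // 0x10000)
```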



Re: RTL PUA?

2011-08-20 Thread Doug Ewell
The more I think of it, the more I like the idea of reassigning the default BC 
of Plane 16 to 'R'. What would the arguments against this be?


--
Doug Ewell • d...@ewellic.org
Sent via BlackBerry by AT&T




Re: RTL PUA?

2011-08-20 Thread Richard Wordingham
On Fri, 19 Aug 2011 22:14:17 +0700
Martin Hosken  wrote:

> Therefore, I would suggest that a carefully allocated set of columns
> for non L directionality PUA characters be encoded. This PUA doesn't
> have to be big, with probably 1 column allocated per directionality.
> I'm no expert in the bidi algorithm, but my guess is that we only
> need a maximum of 5 columns and perhaps much less.

The Ancient Egyptian hieroglyphic script, an RTL script of Bidi
mirrored characters, has a Unicode LTR script of Bidi unmirrored
characters for its modern representation that currently contains 1071
characters. However, I've seen a quote of about 6,000 different
Graeco-Roman period hieroglyphs, so actually writing hieroglyphic
Ancient Egyptian in plain text would need about 24 columns of
characters.

> I would value some input from Bidi experts on this.

I hope my input helps in the meantime.  I for one couldn't say whether
the characters of the demotic Egyptian script should have a Bidi-class
of R or AL.

Richard.
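
The size estimate in this message can be checked with a little arithmetic.
The sketch below assumes the figure of about 6,000 hieroglyphs quoted above
and tries two unit sizes: 256 code points (a "row" in ISO/IEC 10646 terms) and
16 code points (a column of the Unicode code charts). Which unit the message
intended by "columns" is an assumption on my part; the figure of about 24
matches the 256-code-point unit.

```python
def units_needed(characters, unit_size):
    """Smallest number of fixed-size units that can hold `characters` code points."""
    return -(-characters // unit_size)  # ceiling division on integers


GRAECO_ROMAN_ESTIMATE = 6000  # figure quoted in the message, not an exact count

if __name__ == "__main__":
    print("units of 256:", units_needed(GRAECO_ROMAN_ESTIMATE, 256))
    print("units of 16:", units_needed(GRAECO_ROMAN_ESTIMATE, 16))
```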



Re: Code pages and Unicode

2011-08-20 Thread Doug Ewell
It sounds like you’re trying to encode glyphs or glyph fragments, not 
characters.  There is a virtually endless repertoire of “shapes” that could be 
encoded, but unless each of these is a character actually used in a writing 
system (not just hypothetically), it’s probably not appropriate for a character 
encoding.

--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell



From: srivas sinnathurai 
Sent: Saturday, August 20, 2011 3:35
To: Christoph Päper 
Cc: unicode@unicode.org 
Subject: Re: Code pages and Unicode

About the research works.

I alone (with my colleagues) am researching the fact that
Sumerian is Tamil / Tamil is Sumerian.
This requires quite a lot of space.

Additionally, I do research on the Tamil alphabet as based on scientific
definitions: it only represents the mechanical parts, i.e., it only represents
the places of articulation as an alphabet and is not sound-based. And what is
called a mathematical multiplier theory on expanding the alphabets leads not
just to long-mathematics (nedung kaNaku), but also to extra long mathematics.

This is just a sample requirement from me and my colleagues. How many others 
are there who would require Unicode support? Do you think allocating 32,000 to 
the code page model would help?

Regards
Sinnathurai 



Timetable for PDAM 1.2, the next PDAM (presumably PDAM 1.3), and DAM 1 of ISO/IEC 10646:2012

2011-08-20 Thread Karl Pentzlin
At the last SC2/WG2 meeting this June in Helsinki, according to
resolution M58.24 (see http://std.dkuug.dk/JTC1/SC2/WG2/docs/n4104.pdf ),
there will be "a discussion list and teleconferencing facilities to
arrive at dispositions to ballot comments, and issuing of any PDAM ballots
(within the scope of current SC2 projects and its subdivisions), between
WG2 face to face meetings".

According to the discussions, the intent was that, because the prolonged
DAM/DIS period puts eight months between WG2 meetings, there is an
opportunity to resolve PDAM issues through the usual commenting process
within this interval, and to create a new edition of the PDAM, which in
turn is to be commented on before the next meeting, where the outcome of
the disposition is progressed to DAM.

Now, we have PDAM 1.2 (L2/11-316 = SC2 N4201, derived from WG2 N4107,
http://std.dkuug.dk/JTC1/SC2/WG2/docs/n4107.pdf ).
In accordance with the issues mentioned above, it states:

> Status:
> In accordance with Resolution M17.06 adopted at the SC 2 Plenary
> Meeting held in Helsinki, Finland, 2011-06-10, this document is
> circulated to the SC 2 national bodies for a second PDAM ballot for a
> 3-month period. Please vote and comment via the Electronic balloting
> system as soon as possible but not later than 2011-10-29.

If I understand it correctly, this means:
- On 2011-10-29, all comments are collected.
- A few days later, the comments will be published on the WG2 list.
- At the same time, the discussion list starts working, and by a
  reasonable date, maybe two weeks later, the dispositions (at least
  those which are unanimous) are collected, and the editor updates
  PDAM 1.2 into a PDAM 1.3.
  A reasonable point in time for the publication of such a PDAM 1.3,
  as far as I presume, is the end of November of 2011.

Then, this PDAM is to be commented on in the usual way in time for the next
WG2 meeting, scheduled for 2012-02-13/17 at Mountain View, CA, USA.

Thus, the commenting time for PDAM 1.3 is less than three months.
- Is this in line with the ISO rules?
- Otherwise, is it intended to squeeze the whole disposition process
  for PDAM 1.2 into two weeks, so that PDAM 1.3 can be published before
  2011-11-13, leaving a three-month period until the start of the
  February meeting?

- Karl
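
The date arithmetic behind these questions can be sketched as follows. The
dates are the ones given in the message; the helper months_before and the use
of 2011-11-30 for "the end of November" are illustrative assumptions, and
nothing here reflects actual ISO ballot rules.

```python
from datetime import date

BALLOT_CLOSE = date(2011, 10, 29)      # end of the PDAM 1.2 ballot
MEETING_START = date(2012, 2, 13)      # first day of the next WG2 meeting
END_OF_NOVEMBER = date(2011, 11, 30)   # assumed PDAM 1.3 publication date


def months_before(d, months):
    """Same day of the month, `months` earlier.

    Only valid when that day exists in the target month, as it does here.
    """
    total = d.year * 12 + (d.month - 1) - months
    return date(total // 12, total % 12 + 1, d.day)


if __name__ == "__main__":
    latest_publication = months_before(MEETING_START, 3)
    print("latest publication for a 3-month window:", latest_publication)
    print("days available for dispositions:",
          (latest_publication - BALLOT_CLOSE).days)
    print("commenting days if published end of November:",
          (MEETING_START - END_OF_NOVEMBER).days)
```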





Re: Code pages and Unicode

2011-08-20 Thread srivas sinnathurai
About the research works.

I alone (with my colleagues) am researching the fact that
Sumerian is Tamil / Tamil is Sumerian.
This requires quite a lot of space.

Additionally, I do research on the Tamil alphabet as based on scientific
definitions: it only represents the mechanical parts, i.e., it only
represents the places of articulation as an alphabet and is not sound-based.
And what is called a mathematical multiplier theory on expanding the
alphabets leads not just to long-mathematics (nedung kaNaku), but also to
extra long mathematics.

This is just a sample requirement from me and my colleagues. How many others
are there who would require Unicode support? Do you think allocating 32,000
to the code page model would help?

Regards
Sinnathurai

On 20 August 2011 09:31, Christoph Päper wrote:

> Mark Davis ☕:
>
> > Under the original design principles of Unicode, the goal was a bit more
> > limited; we envisioned […] a generative mechanism for infrequent CJK
> > ideographs,
>
> I'd still like having that as an option.
>
>


Re: Code pages and Unicode

2011-08-20 Thread Christoph Päper
Mark Davis ☕:

> Under the original design principles of Unicode, the goal was a bit more 
> limited; we envisioned […] a generative mechanism for infrequent CJK 
> ideographs,

I'd still like having that as an option.