Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-12 Thread vanisaac
William, I think you have an unreasonable idea of what a standard actually is. 
You have already made a standard and published it - I've seen all the posts at 
the FCP forum. All you have to do is let people use it. If a user community is 
going to exchange data, they will do so, and it just plain doesn't matter if 
some other user community were to exchange completely different data 
coincidentally using the same sequence of bytes.

The problem is that you don't want a /standard/ - you already have one. You 
want legitimacy for your ideas that they haven't earned, and you are trying 
to borrow that legitimacy from Unicode and ISO. What you don't understand is 
that the legitimacy you want to borrow is intimately tied to the fact that 
Unicode has policies and procedures that it follows, one of which is that it 
does not recognize scripts that haven't met the criteria for inclusion.

From: William_J_G Overington 

> A feature of using the Private Use Area is that code point allocations are 
> made by a person or entity that is not a standards organization. Also, 
> Private Use Area code point assignments are not unique.

Which has not kept other PUA standards like MUFI and CSUR from successfully 
exchanging data. In fact, both have demonstrated usage to the point that 
scripts have subsequently been encoded for public use.

> In many cases, neither of those features presents a problem for successful 
> use of a Private Use Area encoding.

> However, although one can often not be concerned with the fact that the code 
> point assignment is not unique, the fact that it is not made by a standards 
> organization is a big problem if one is seeking to have a system that one has 
> invented taken up by people and companies generally.

In other words, you want legitimacy that the idea has not earned.

> For one of my present uses of the Private Use Area I am seeking to have a 
> system that I have invented taken up by people and companies generally.

Then publish the standard and let them do it. If the idea is useful, then 
others will adopt it; if not, they won't.

> However, I feel that there is no chance of a system that I have invented 
> being taken up by people and companies generally using a Private Use Area 
> encoding. Thus, I feel that I will not be able to present an encoding 
> proposal document showing existing widespread usage.

This /feeling/ is specifically contradicted by the evidence of language 
communities adopting the MUFI and CSUR standards.

> However, if the Unicode Technical Committee and the ISO Committee were to 
> agree in principle to encoding my inventions in plane 13, not necessarily 
> using the particular items or symbols that I am at present using in my 
> research, and were then to work out how to form a committee or subcommittee 
> to decide what to encode, then I feel that a group project with lots of 
> people contributing ideas could produce a wonderful system encoded into 
> plane 13 that could be of great usefulness to many people.

If it is so wonderful and useful, there is no reason why you wouldn't be able 
to bring together a group of people to develop the standard in Plane 15 just as 
easily. If you can't do that, it's a pretty good indication that it's not as 
useful as you think it is.

> My present goal is to have the opportunity to write a document requesting 
> that agreement in principle and for the document to be considered and 
> discussed by the committees and a formal decision made.

The formal decision will be "no", because you have shown zero actual usage.

> William Overington
 
> 12 November 2012 

-Van Anderson




Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-12 Thread William_J_G Overington
On Saturday 10 November 2012, John Knightley  wrote:
 
> Whilst using the PUA is far from perfect, at the end of the day it is better 
> than the alternative of not using the PUA.
 
Yes. The Private Use Area is a very useful facility in that it allows 
characters of one's own designation to be added to a personally made font as 
one wishes. With many software applications, one can then use the font and the 
characters much as one would use a commercial font that has just regular 
Unicode characters.
 
Here are links to some forum posts where I have used Private Use Area 
characters in various circumstances.
 
http://forum.high-logic.com/viewtopic.php?p=9655#p9655
 
http://forum.high-logic.com/viewtopic.php?p=16813#p16813
 
http://forum.high-logic.com/viewtopic.php?p=16746#p16746
 
http://forum.high-logic.com/viewtopic.php?p=16264#p16264
 
http://forum.high-logic.com/viewtopic.php?p=17499#p17499
 
http://forum.high-logic.com/viewtopic.php?p=17556#p17556
 
A feature of using the Private Use Area is that code point allocations are made 
by a person or entity that is not a standards organization. Also, Private Use 
Area code point assignments are not unique.
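
As a concrete illustration, here is a minimal Python sketch of that 
non-uniqueness (the code point U+E000 and the two example "agreements" below 
are hypothetical and are not taken from MUFI, CSUR or any other published 
scheme):

    import unicodedata

    ch = "\uE000"  # first code point of the Basic Multilingual Plane PUA

    # The character is valid Unicode, but the standard gives it no name or meaning.
    print(unicodedata.category(ch))                    # 'Co' -- Private Use
    print(unicodedata.name(ch, "<no standard name>"))  # falls back to the default

    # Two hypothetical private agreements may assign the same code point
    # different meanings; the interchanged byte sequence is identical either way.
    agreement_one = {"\uE000": "private symbol meaning A"}
    agreement_two = {"\uE000": "private symbol meaning B"}
    print("\uE000".encode("utf-8"))                    # b'\xee\x80\x80' in both

Whether a recipient sees "meaning A" or "meaning B" depends entirely on which 
private agreement, and which font, the two parties happen to share.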
 
In many cases, neither of those features presents a problem for successful use 
of a Private Use Area encoding.
 
However, although one can often not be concerned with the fact that the code 
point assignment is not unique, the fact that it is not made by a standards 
organization is a big problem if one is seeking to have a system that one has 
invented taken up by people and companies generally.
 
For one of my present uses of the Private Use Area I am seeking to have a 
system that I have invented taken up by people and companies generally.
 
However, I feel that there is no chance of a system that I have invented being 
taken up by people and companies generally using a Private Use Area encoding. 
Thus, I feel that I will not be able to present an encoding proposal document 
showing existing widespread usage.
 
However, if the Unicode Technical Committee and the ISO Committee were to agree 
in principle to encoding my inventions in plane 13, not necessarily using the 
particular items or symbols that I am at present using in my research, and were 
then to work out how to form a committee or subcommittee to decide what to 
encode, then I feel that a group project with lots of people contributing ideas 
could produce a wonderful system encoded into plane 13 that could be of great 
usefulness to many people.
 
My present goal is to have the opportunity to write a document requesting that 
agreement in principle and for the document to be considered and discussed by 
the committees and a formal decision made.
 
William Overington
 
12 November 2012






Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-10 Thread Philippe Verdy
2012/11/10 john knightley :
>   Whilst using the PUA is far from perfect, at the end of the day it is
> better than the alternative of not using the PUA.
>
> Regards
> John
>
> On 10 Nov 2012 17:37, "William_J_G Overington" 
> wrote:
>>
>> On Thursday 8 November 2012, Philippe Verdy  wrote:
>>
>> > 2012/11/8 William_J_G Overington :
>> > > However, an encoding using a Private Use Area encoding has great
>> > > problems in being implemented as a widespread system.
>>
>> > Wrong: this is what has been done for centuries, if not millennia!
>>
>> Well, the point that I am trying to make is that when a new glyph is used
>> in an electronic communication system based on the ISO/IEC 10646 character
>> encoding, with the new glyph encoded using a Private Use Area code point,
>> then of necessity a code point is associated with the glyph. In a
>> handwritten or printed document, communication uses the glyph alone, so the
>> same problem does not exist.
>>
>>
>> > This is still true today: even if you define new glyphs, and as long as
>> > you do not explicitly give permission to others to reuse those glyphs or
>> > variants of them, these glyphs remain private in terms of copyright
>> > restrictions on their designs.
>>
>> Yes, you are right. I do hope that that is not going to be a problem over
>> people trying out the glyphs that I have devised. I need to think about
>> that. Feedback from readers on this issue is invited please.
>>
>>
>> > Making your publication "public" by depositing it with a national library
>> > is not a situation where you grant an open licence to others: the legal
>> > deposit made at a national library is instead used as proof of the date of
>> > your work, to support your copyright claim on this work.
>>
>> Well, in the United Kingdom, copyright in a work exists from when the work
>> is put into permanent form.
>>
>> The reason that I use the voluntary deposit facility of the British
>> Library is so that there is a permanent archived record of what I produce
>> for as long as the British Library exists.
>>
>> As I understand it, the deposit at the British Library is because the work
>> has been published, or is on the point of being published and actually
>> becomes published: the depositing at the British Library is not the
>> publishing action.

This is true, but the act of publishing something is not by itself
proof of authorship and ownership of copyright; you need a formal
registration, and this is the purpose of legal deposit (perhaps in the
UK, but certainly in France, where it is required before someone can
claim copyright on a published work).

The Internet has changed the situation a bit, but theoretically, in
France at least, websites must carry a notice or a link to a page
explaining the rights and copyrights attached, with the legal names and
points of contact of the publisher, who will act on behalf of the
authors to reply to legal requests. In some countries, web hosting
companies must also be able to act legally on behalf of the publishers
of websites (and some laws require maintaining legal logs to help
identify the publishers of websites or any contributor to a website).

Websites are usually not deposited with a legal deposit library (though
in some countries website publishers can do so if they wish, using
electronic copies of their documents; this legal deposit is not
necessarily free, and it is not always performed at a national public
library: there may be legally approved trusted proxies, such as
notarial offices or national IP agencies). The main purpose of such a
deposit is to register a claim of copyright at a proven date (to
protect the work from competing claims that would come later).

The mere act of publishing something is generally not enough when
copyright claims start being disputed in court (because the act of
publication could have been illegal, and that illegal act might go
undiscovered for a long time by the legitimate owners, or because the
owners may not be able to find who performed the illegal publication):
this is even harder to prove when the first publication was made on the
Internet only, as documents that are not electronically signed.



Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-10 Thread john knightley
  Whilst using the PUA is far from perfect, at the end of the day it is
better than the alternative of not using the PUA.

Regards
John
On 10 Nov 2012 17:37, "William_J_G Overington" 
wrote:

> On Thursday 8 November 2012, Philippe Verdy  wrote:
>
> > 2012/11/8 William_J_G Overington :
> > > However, an encoding using a Private Use Area encoding has great
> problems in being implemented as a widespread system.
>
> > Wrong: this is what has been done for centuries, if not millennia!
>
> Well, the point that I am trying to make is that when a new glyph is used
> in an electronic communication system based on the ISO/IEC 10646 character
> encoding, with the new glyph encoded using a Private Use Area code point,
> then of necessity a code point is associated with the glyph. In a
> handwritten or printed document, communication uses the glyph alone, so the
> same problem does not exist.
>
> > This is still true today: even if you define new glyphs, and as long as
> you do not explicitly give permission to others to reuse those glyphs or
> variants of them, these glyphs remain private in terms of copyright
> restrictions on their designs.
>
> Yes, you are right. I do hope that that is not going to be a problem over
> people trying out the glyphs that I have devised. I need to think about
> that. Feedback from readers on this issue is invited please.
>
> > Making your publication "public" by depositing it with a national library
> is not a situation where you grant an open licence to others: the legal
> deposit made at a national library is instead used as proof of the date of
> your work, to support your copyright claim on this work.
>
> Well, in the United Kingdom, copyright in a work exists from when the work
> is put into permanent form.
>
> The reason that I use the voluntary deposit facility of the British
> Library is so that there is a permanent archived record of what I produce
> for as long as the British Library exists.
>
> As I understand it, the deposit at the British Library is because the work
> has been published, or is on the point of being published and actually
> becomes published: the depositing at the British Library is not the
> publishing action.
>
> > > Also, I feel that implementation other than for research purposes
> using a Private Use Area encoding would cause problems for the future: I
> feel that a formal encoding is needed from the start.
>
> > Certainly not. For widespread use you first need to create a work, claim
> ownership of the copyright, make a legal deposit to prove it, publish an
> explicit open licence statement allowing reuse of your design by other
> authors, and then do your own work of convincing others to reuse these
> glyphs (or derived variants of them) in a way similar to yours.
>
> If one coins a new word into the English language (for example, I coined
> the word telesoftware in 1974 for my then new broadcasting invention), the
> word will only become included in the Oxford English Dictionary if it is
> used to an assessed extent by people other than the coiner. That is fine,
> because the word stands on its own and the spelling of the word that is put
> into the dictionary is the same spelling as the word that is in use.
>
> However, with a new symbol that is used both as a glyph and with a Private
> Use Area encoding in interoperable situations, such as email exchanges or
> searchable documents, if the character is later encoded into Unicode and
> ISO/IEC 10646 then the glyph is carried over yet the code point changes.
> There are then issues of legacy data.
>
> > It is when there are similar reuses by others, in their own
> publications, and when people start communicating with them in a
> sizeable community, over a long enough period (more than the year of your
> initial publication), that the emerging "abstract" character will be said
> to exist (Unicode and ISO won't consider these characters until people
> other than you alone have been shown to use these designs legally in their
> published interchanges, printed or not, over a period of more than one
> year and by much more than just one independent author).
>
> Well, whether Unicode and ISO accept the characters for encoding is a
> decision that they can make. However, in relation to considering them for
> acceptance, I feel that it would be fair and reasonable for there to be a
> facility that allows me the opportunity to place a document before the
> committees making the case for encoding without there being widespread
> existing usage using a Private Use Area encoding, with the decision as to
> whether to accept the encoding being made after reading the document and
> after discussion.
>
> > > I feel that the rules for encoding such new symbols are out of date
> and not suitable for present day use.
>
> > > Unfortunately, it seems that there is not a way available for me to
> request formal consideration of the possibility of changing the rules.
>
> > > Technology has changed since the rules were made.

Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-10 Thread William_J_G Overington
On Thursday 8 November 2012, Philippe Verdy  wrote:
 
> 2012/11/8 William_J_G Overington :
> > However, an encoding using a Private Use Area encoding has great problems 
> > in being implemented as a widespread system.

> Wrong: this is what has been done for centuries, if not millennia!
 
Well, the point that I am trying to make is that when a new glyph is used in an 
electronic communication system based on the ISO/IEC 10646 character encoding, 
with the new glyph encoded using a Private Use Area code point, then of 
necessity a code point is associated with the glyph. In a handwritten or 
printed document, communication uses the glyph alone, so the same problem does 
not exist.
 
> This is still true today: even if you define new glyphs, and as long as you 
> do not explicitly give permission to others to reuse those glyphs or variants 
> of them, these glyphs remain private in terms of copyright restrictions on 
> their designs.
 
Yes, you are right. I do hope that that is not going to be a problem over 
people trying out the glyphs that I have devised. I need to think about that. 
Feedback from readers on this issue is invited please.
 
> Making your publication "public" by depositing it with a national library is 
> not a situation where you grant an open licence to others: the legal deposit 
> made at a national library is instead used as proof of the date of your work, 
> to support your copyright claim on this work.
 
Well, in the United Kingdom, copyright in a work exists from when the work is 
put into permanent form.
 
The reason that I use the voluntary deposit facility of the British Library is 
so that there is a permanent archived record of what I produce for as long as 
the British Library exists.
 
As I understand it, the deposit at the British Library is because the work has 
been published, or is on the point of being published and actually becomes 
published: the depositing at the British Library is not the publishing action.
 
> > Also, I feel that implementation other than for research purposes using a 
> > Private Use Area encoding would cause problems for the future: I feel that 
> > a formal encoding is needed from the start.
 
> Certainly not. For widespread use you first need to create a work, claim 
> ownership of the copyright, make a legal deposit to prove it, publish an 
> explicit open licence statement allowing reuse of your design by other 
> authors, and then do your own work of convincing others to reuse these glyphs 
> (or derived variants of them) in a way similar to yours.
 
If one coins a new word into the English language (for example, I coined the 
word telesoftware in 1974 for my then new broadcasting invention), the word 
will only become included in the Oxford English Dictionary if it is used to an 
assessed extent by people other than the coiner. That is fine, because the word 
stands on its own and the spelling of the word that is put into the dictionary 
is the same spelling as the word that is in use.
 
However, with a new symbol that is used both as a glyph and with a Private Use 
Area encoding in interoperable situations, such as email exchanges or 
searchable documents, if the character is later encoded into Unicode and 
ISO/IEC 10646 then the glyph is carried over yet the code point changes. There 
are then issues of legacy data.
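
To make the legacy data point concrete, here is a minimal Python sketch of the 
kind of one-off remapping that existing documents would then need. The values 
are purely hypothetical: U+E701 stands for a privately chosen Private Use Area 
assignment, and NEW_CP is only a placeholder for whatever code point the 
committees would actually assign.

    # Hypothetical values for this sketch only.
    OLD_CP = 0xE701    # a code point someone might have chosen privately in the PUA
    NEW_CP = 0xE0200   # arbitrary placeholder for the eventual formal assignment

    def migrate(text: str) -> str:
        """Map the legacy private-use code point to its standardized successor."""
        return text.translate({OLD_CP: chr(NEW_CP)})

    legacy = "Forecast: \uE701 expected tomorrow."
    print(migrate(legacy))

Every document, font and search index produced before the formal encoding would 
need this kind of remapping, or a fallback, which is the legacy data issue 
referred to above.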
 
> It is when there are similar reuses by others, in their own publications, 
> and when people start communicating with them in a sizeable community, over 
> a long enough period (more than the year of your initial publication), that 
> the emerging "abstract" character will be said to exist (Unicode and ISO 
> won't consider these characters until people other than you alone have been 
> shown to use these designs legally in their published interchanges, printed 
> or not, over a period of more than one year and by much more than just one 
> independent author).
 
Well, whether Unicode and ISO accept the characters for encoding is a decision 
that they can make. However, in relation to considering them for acceptance, I 
feel that it would be fair and reasonable for there to be a facility that 
allows me the opportunity to place a document before the committees making the 
case for encoding without there being widespread existing usage using a Private 
Use Area encoding, with the decision as to whether to accept the encoding being 
made after reading the document and after discussion.
 
> > I feel that the rules for encoding such new symbols are out of date and not 
> > suitable for present day use.
 
> > Unfortunately, it seems that there is not a way available for me to request 
> > formal consideration of the possibility of changing the rules.
 
> > Technology has changed since the rules were made.
 
> Maybe, but this has just extended the number of technical media usable for 
> publications (even if copyright issues have been restricting legal reuse a 
> bit more tightly). There are still lots of documents produced on various 
> media.

Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-09 Thread Asmus Freytag

On 11/9/2012 7:14 PM, Philippe Verdy wrote:

2012/11/9 Asmus Freytag :

Actually, there are certain instances where characters are encoded based on 
expected usage. Currency symbols are a well known case for that, but there have 
been instances of phonetic characters encoded in order to facilitate creation 
and publication of certain databases for specialists, without burdening them 
with instant obsolescence (if they had used PUA characters).

But work is still being performed to implement the characters and
start using them massively, even if they are not encoded.


I think this entire line of discussion is rather drifting into 
irrelevant details. Yes, I agree that it should matter whether serious 
resources have been committed in support of a new symbol or new piece of 
notation. That forms part of the evidence that marks some of these 
exceptional cases as viable standardized characters - despite lack of 
prior, widespread use. That, somehow, was my point.


However, I find it pointless to speculate about the details. Exceptions 
are exceptions, and the most important issue is to reserve the 
flexibility to deal with them, when they arise.


After they have arisen, they are best dealt with on a case-by-case basis 
(or in the case of currency symbols, we now have an entire category for 
which there is consensus that it merits exceptional treatment).


A./



Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-09 Thread Philippe Verdy
2012/11/9 Asmus Freytag :
> Actually, there are certain instances where characters are encoded based on
> expected usage.
> Currency symbols are a well known case for that, but there have been
> instances of phonetic characters encoded in order to facilitate creation and
> publication of certain databases for specialists, without burdening them
> with instant obsolescence (if they had used PUA characters).

But work is still being performed to implement the characters and
start using them massively, even if they are not encoded. Currency
symbols are among these: their design does NOT need an initial encoding
as a character. It starts with a graphic design, using graphic tools.
Then those tools are used to design and print banknotes and coins.

Many documents will be produced to introduce the currency and its
expected symbol. They will use graphic representations rather than
plain text. Plain text, however, is expected to become an urgent need
for currencies that are to become legal tender in an area as large as a
country or group of countries, because currency units are used every
day, many times a day, by lots of people, even if they don't always
need to create new documents with the symbol (in fact the first use
will be to name the currency; the symbol will be preprinted on cheque
forms, on banknotes and coins, or in commercial advertising documents
that are never limited to plain text: plain text is certainly not the
best medium for such announcements).

>
> If an important publisher of mathematical works (or publisher of important
> mathematical works) made a case for adding a recently created symbol so that
> they can go ahead and make it part of their standard repertoire, I would
> think it churlish to require them to create portability problems for their
> users by first creating documents with PUA encoding.

If the work is really important, it is because it has been the subject
of serious research for a long enough time, and of publication for peer
review. In scientific domains, most electronic publications are NOT
made in plain text, but using PDFs. For computing purposes, if there's
a need to program the symbol, scientists are used to creating specific
notations in programming languages. This is not a limitation, as such
programs don't actually need the symbol themselves, except to render
the result (but software is not limited to returning results in plain
text, so this is not a serious limitation).

In other words, there's no chicken-and-egg problem for scientific
symbols: the usage starts expanding first, and at some point the symbol
will be used by enough people that they MAY want it to be supported in
plain text (this won't always happen, notably for scientific documents,
where plain text is already a very poor medium which requires specific
conventions and notations that are extremely technical and not always
very readable and usable in practice, except by machines, like
programming code in computer languages).

Computer languages are not in scope for the UCS anyway. Neither is
representation in media other than plain text (and notably not graphic
file formats).

In addition, the UCS is used in plain text to allow things that would
NOT be permitted in the initial definition (and actual usage) of the
symbol: transformations like changes of letter case, sorting/collation
(which may not make sense for the notation using the symbol itself),
variability of glyphs, and even most character properties (the
classification in Unicode will make absolutely no sense in a scientific
notation that certainly does not want this flexibility, when the actual
notation has its own very precise requirements to be meaningful in
well-defined contexts).
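
To make this concrete, here is a minimal Python sketch of how standard 
properties and transformations attach automatically to a formally encoded 
character, whereas a Private Use Area code point carries only default 
properties. MICRO SIGN is used here simply as a stand-in for any symbol that 
has been through encoding and unification:

    import unicodedata

    micro = "\u00B5"   # MICRO SIGN, a formally encoded symbol
    pua = "\uE000"     # a Private Use Area code point

    # An encoded character immediately participates in generic text processes,
    # whether or not the originating notation wants that.
    print(unicodedata.category(micro))           # 'Ll' -- treated as a lowercase letter
    print(micro.upper())                         # 'Μ'  -- case mapping applies
    print(unicodedata.normalize("NFKC", micro))  # 'μ'  -- compatibility mapping applies

    # A PUA code point gets only default properties and no mappings at all.
    print(unicodedata.category(pua))             # 'Co'
    print(pua.upper() == pua)                    # True -- unchanged

Once a symbol is encoded, this default behaviour applies in every conforming 
process, which is exactly the kind of flexibility a specialist notation may 
not want.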

Encoding the symbol in the UCS would immediately permit reuse in
contexts other than the initial one. In fact it would be useful and
acceptable to encode it ONLY if there are such derived usages, outside
the initial scientific definition and context, by people who don't even
need to know the original meaning of the symbol to reuse it in a
"fancy" way to mean something else. If scientists see this usage
developing, with some disallowed variations, they will still prefer to
maintain their own precise definition, which won't match the definition
that will be encoded in the UCS for more general use (due to the
"unification" process prior to encoding).

In other words, for a long time the initial scientific community will
continue to use its existing definition and conventions, its own
standards, and the encoded character may create an ambiguity that does
not exist in its initial convention. It will reject the result of the
"unification" in the UCS and will still consider its symbol to be
different from the currently encoded one.

At least until the general public starts recognizing that the unified
use is actually different, and a request is then made by scientific
people to disunify the scientific symbol from the character encoded for
general use.

Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-08 Thread Mark E. Shoulson

On 11/08/2012 09:00 PM, Asmus Freytag wrote:

On 11/8/2012 4:39 PM, Mark E. Shoulson wrote:
I stand by it: we don't encode what would be cool to have.  We encode 
what people *use*.




Actually, there are certain instances where characters are encoded 
based on expected usage.



...
What these examples have in common is that they reflect a small number 
of characters with an "instant" user community that's well defined and 
understood (and appropriate to the type of character). The main reason 
for the restriction to "encode what people use" is that characters 
cannot be retracted if the hoped for enthusiasm for them doesn't 
materialize.


The other reason is that the Unicode Standard is a standard - what it 
encodes needs to be worthy of standardization. There are exceptional 
instances where "leading" standardization can be justified - they are 
few and far between, but they exist. As exceptions prove the rule - 
the majority of characters will continue to be cases where 
standardization follows demonstrated use.


Well said; and I accept the correction to my position.  It does happen, 
but not very often and not without good reason.


~mark




Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-08 Thread Asmus Freytag

On 11/8/2012 4:39 PM, Mark E. Shoulson wrote:

On 11/08/2012 01:48 AM, William_J_G Overington wrote:

Michael Everson  wrote:

< ... collect examples of these in print ...

Mark E. Shoulson  wrote:

We don't encode "it would be nice/useful."  We encode *characters*, 
glyphs that people use (yes, I know I conflated glyphs and 
characters there.)

...
Unicode isn't a system for encoding ratings. It's a system for 
encoding what people write and print.


I have at various times, as research has progressed, deposited with 
the British Library pdf documents that I have produced and published 
and I have deposited with the British Library TrueType fonts that I 
have produced and published and I have received email receipts for them.


Some of the pdf publications contain new symbols, used intermixed 
with text in a plain text situation. I have used Private Use Area 
encodings for the symbols.


Yet the publications have not been published in hardcopy form.


I think you may be taking me too literally.  A PDF document which is 
essentially a proxy for a printed page (only cheaper to copy and 
produce) would count, to me, as usage "in print."  I don't make the 
rules, but I think some of the Unicoders who do would agree.  The 
charge of the rules being "out of date" because they demand usage is 
not an accurate one, and pointing to printing vs electronic usage is a 
red herring.


I have long complained about another writing system which I felt had 
trouble being encoded due to chicken-and-egg issues (Klingon), but 
even so people have been using it in the PUA; see 
http://qurgh.blogspot.com/ (now defunct, apparently, but the site is 
still there), and the KLI's collection of Qo'noS QonoS is available in 
Latin letters or in pIqaD in PUA.


I agree that there is something to the charge of chicken-and-egg 
issues with encoding writing systems (you can't write it until it's 
encoded, you can't encode it until it's written), but probably more 
with the amount of usage that has to be seen, not with the requirement 
that there be SOME usage.


I stand by it: we don't encode what would be cool to have.  We encode 
what people *use*.




Actually, there are certain instances where characters are encoded based 
on expected usage.


Currency symbols are a well known case for that, but there have been 
instances of phonetic characters encoded in order to facilitate creation 
and publication of certain databases for specialists, without burdening 
them with instant obsolescence (if they had used PUA characters).


If an important publisher of mathematical works (or publisher of 
important mathematical works) made a case for adding a recently created 
symbol so that they can go ahead and make it part of their standard 
repertoire, I would think it churlish to require them to create 
portability problems for their users by first creating documents with 
PUA encoding.


What these examples have in common is that they reflect a small number 
of characters with an "instant" user community that's well defined and 
understood (and appropriate to the type of character). The main reason 
for the restriction to "encode what people use" is that characters 
cannot be retracted if the hoped for enthusiasm for them doesn't 
materialize.


The other reason is that the Unicode Standard is a standard - what it 
encodes needs to be worthy of standardization. There are exceptional 
instances where "leading" standardization can be justified - they are 
few and far between, but they exist. As exceptions prove the rule - the 
majority of characters will continue to be cases where standardization 
follows demonstrated use.


A./



Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-08 Thread Mark E. Shoulson

On 11/08/2012 01:48 AM, William_J_G Overington wrote:

Michael Everson  wrote:

< ... collect examples of these in print ...

Mark E. Shoulson  wrote:


We don't encode "it would be nice/useful."  We encode *characters*, glyphs that 
people use (yes, I know I conflated glyphs and characters there.)

...

Unicode isn't a system for encoding ratings. It's a system for encoding what 
people write and print.


I have at various times, as research has progressed, deposited with the British 
Library pdf documents that I have produced and published and I have deposited 
with the British Library TrueType fonts that I have produced and published and 
I have received email receipts for them.

Some of the pdf publications contain new symbols, used intermixed with text in 
a plain text situation. I have used Private Use Area encodings for the symbols.

Yet the publications have not been published in hardcopy form.


I think you may be taking me too literally.  A PDF document which is 
essentially a proxy for a printed page (only cheaper to copy and 
produce) would count, to me, as usage "in print."  I don't make the 
rules, but I think some of the Unicoders who do would agree.  The charge 
of the rules being "out of date" because they demand usage is not an 
accurate one, and pointing to printing vs electronic usage is a red herring.


I have long complained about another writing system which I felt had 
trouble being encoded due to chicken-and-egg issues (Klingon), but even 
so people have been using it in the PUA; see http://qurgh.blogspot.com/ 
(now defunct, apparently, but the site is still there), and the KLI's 
collection of Qo'noS QonoS is available in Latin letters or in pIqaD in PUA.


I agree that there is something to the charge of chicken-and-egg issues 
with encoding writing systems (you can't write it until it's encoded, 
you can't encode it until it's written), but probably more with the 
amount of usage that has to be seen, not with the requirement that there 
be SOME usage.


I stand by it: we don't encode what would be cool to have.  We encode 
what people *use*.


~mark




Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-08 Thread Asmus Freytag

I'm not sure I follow this analysis.

A./


On 11/8/2012 1:30 AM, Philippe Verdy wrote:

2012/11/8 William_J_G Overington :

However, an encoding using a Private Use Area encoding has great problems in 
being implemented as a widespread system.

Wrong: this is what has been done for centuries, if not millennia!
Initially there was a private-use design, not "encoded", which found
its way into widespread use once it was adopted by other editors
(possibly as glyph variants, including variants also introduced by the
initial author, depending on the publisher he used and the amount paid
for the publication).

This is still true today: even if you define new glyphs, and as long
as you do not explicitly give permission to others to reuse those
glyphs or variants of them, these glyphs remain private in terms of
copyright restrictions on their designs. Making your publication
"public" by depositing it with a national library is not a situation
where you grant an open licence to others: the legal deposit made at a
national library is instead used as proof of the date of your work, to
support your copyright claim on this work.


Also, I feel that implementation other than for research purposes using a 
Private Use Area encoding would cause problems for the future: I feel that a 
formal encoding is needed from the start.

Certainly not. For widespread use you first need to create a work,
claim ownership of the copyright, make a legal deposit to prove it,
publish an explicit open licence statement allowing reuse of your
design by other authors, and then do your own work of convincing
others to reuse these glyphs (or derived variants of them) in a way
similar to yours.

It is when there are similar reuses by others, in their own
publications, and when people start communicating with them in a
sizeable community, over a long enough period (more than the year of
your initial publication), that the emerging "abstract" character will
be said to exist (Unicode and ISO won't consider these characters
until people other than you alone have been shown to use these designs
legally in their published interchanges, printed or not, over a period
of more than one year and by much more than just one independent
author).


I feel that the rules for encoding such new symbols are out of date and not 
suitable for present day use.

Unfortunately, it seems that there is not a way available for me to request 
formal consideration of the possibility of changing the rules.
Technology has changed since the rules were made.

Maybe, but this has just extended the number of technical media usable
for publications (even if copyright issues have been restricting legal
reuse a bit more tightly). There are still lots of documents produced
on various media (much the same ones as for millennia). Electronic
documents are just newer media for publication, but they certainly
don't create a new restriction on widespread use.

Also, as the world population has grown a lot, the minimum size of the
community needed to demonstrate a character's existence has grown
proportionally (the requirements are larger for more recent documents
than for historic characters, whose community of users has nevertheless
grown with age, because we can also include the new reusers of historic
documents over the much longer period during which those characters
have not been forgotten, allowing more documents to reuse them, up to
documents produced today).


Is it possible for formal consideration to be given to the possibility of 
changing the rules please?

Not the way you describe it. You are trying to put the egg before the
chicken. But you forget that both the chicken and the egg have a common
creator and are in fact exactly the same thing.

So even if you are still required to use a private encoding, this is
not what is limiting the birth of an abstract character from your
glyphs. What is important is the number of documents reusing them, over
a long enough period, by a community of legally recognized authors
(either because they have been dead long enough that their work has
fallen into the public domain, allowing a significant increase in the
number of reusers, or because the exclusive copyright claims have been
relaxed by an open licence so that others are allowed to reuse your
design, or variants of it, in their publications, on various media,
not just electronic ones).







Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-08 Thread john knightley
One key criterion for inclusion in Unicode is that a character or symbol be
in circulation, whether handwritten, printed or electronic. If one creates a
new character then one must first get others to use it, and this takes time.

John
On 8 Nov 2012 14:57, "William_J_G Overington" 
wrote:

> Michael Everson  wrote:
>
> < ... collect examples of these in print ...
>
> Mark E. Shoulson  wrote:
>
> > We don't encode "it would be nice/useful."  We encode *characters*,
> glyphs that people use (yes, I know I conflated glyphs and characters
> there.)
> ...
> > Unicode isn't a system for encoding ratings. It's a system for encoding
> what people write and print.
>
> An interesting situation is that the British Library collects pure
> electronic publications by a system of voluntary deposit. A publisher sends
> an email to a specified email address with the pure electronic publication
> or publications attached to the email. The British Library sends, upon
> request, an email receipt for such deposited items.
>
> http://www.bl.uk/aboutus/stratpolprog/legaldep/index.html
>
> I have at various times, as research has progressed, deposited with the
> British Library pdf documents that I have produced and published and I have
> deposited with the British Library TrueType fonts that I have produced and
> published and I have received email receipts for them.
>
> Some of the pdf publications contain new symbols, used intermixed with
> text in a plain text situation. I have used Private Use Area encodings for
> the symbols.
>
> Yet the publications have not been published in hardcopy form.
>
> A problem that exists with the ISO/IEC 10646 encoding process, in my
> opinion, is that there is not a way for new symbols for electronic
> communication systems to be considered for encoding unless there is already
> widespread use of them using a Private Use Area encoding.
>
> However, an encoding using a Private Use Area encoding has great problems
> in being implemented as a widespread system.
>
> Also, I feel that implementation other than for research purposes using a
> Private Use Area encoding would cause problems for the future: I feel that
> a formal encoding is needed from the start.
>
> I feel that the rules for encoding such new symbols are out of date and
> not suitable for present day use.
>
> Unfortunately, it seems that there is not a way available for me to
> request formal consideration of the possibility of changing the rules.
>
> Technology has changed since the rules were made.
>
> Is it possible for formal consideration to be given to the possibility of
> changing the rules please?
>
> William Overington
>
> 8 November 2012


Re: The rules of encoding (from Re: Missing geometric shapes)

2012-11-08 Thread Philippe Verdy
2012/11/8 William_J_G Overington :
> However, an encoding using a Private Use Area encoding has great problems in 
> being implemented as a widespread system.

Wrong: this is what has been done for centuries, if not millennia!
Initially there was a private-use design, not "encoded", which found
its way into widespread use once it was adopted by other editors
(possibly as glyph variants, including variants also introduced by the
initial author, depending on the publisher he used and the amount paid
for the publication).

This is still true today: even if you define new glyphs, and as long
as you do not explicitly give permission to others to reuse those
glyphs or variants of them, these glyphs remain private in terms of
copyright restrictions on their designs. Making your publication
"public" by depositing it with a national library is not a situation
where you grant an open licence to others: the legal deposit made at a
national library is instead used as proof of the date of your work, to
support your copyright claim on this work.

> Also, I feel that implementation other than for research purposes using a 
> Private Use Area encoding would cause problems for the future: I feel that a 
> formal encoding is needed from the start.

Certainly not. For widespread use you first need to create a work,
claim ownership of the copyright, make a legal deposit to prove it,
publish an explicit open licence statement allowing reuse of your
design by other authors, and then do your own work of convincing
others to reuse these glyphs (or derived variants of them) in a way
similar to yours.

It is when there are similar reuses by others, in their own
publications, and when people start communicating with them in a
sizeable community, over a long enough period (more than the year of
your initial publication), that the emerging "abstract" character will
be said to exist (Unicode and ISO won't consider these characters
until people other than you alone have been shown to use these designs
legally in their published interchanges, printed or not, over a period
of more than one year and by much more than just one independent
author).

> I feel that the rules for encoding such new symbols are out of date and not 
> suitable for present day use.
>
> Unfortunately, it seems that there is not a way available for me to request 
> formal consideration of the possibility of changing the rules.
> Technology has changed since the rules were made.

Maybe, but this has just extended the number of technical media usable
for publications (even if copyright issues have been restricting legal
reuse a bit more tightly). There are still lots of documents produced
on various media (much the same ones as for millennia). Electronic
documents are just newer media for publication, but they certainly
don't create a new restriction on widespread use.

Also, as the world population has grown a lot, the minimum size of the
community needed to demonstrate a character's existence has grown
proportionally (the requirements are larger for more recent documents
than for historic characters, whose community of users has nevertheless
grown with age, because we can also include the new reusers of historic
documents over the much longer period during which those characters
have not been forgotten, allowing more documents to reuse them, up to
documents produced today).

> Is it possible for formal consideration to be given to the possibility of 
> changing the rules please?

Not the way you describe it. You are trying to put the egg before the
chicken. But you forget that both the chicken and the egg have a common
creator and are in fact exactly the same thing.

So even if you are still required to use a private encoding, this is
not what is limiting the birth of an abstract character from your
glyphs. What is important is the number of documents reusing them, over
a long enough period, by a community of legally recognized authors
(either because they have been dead long enough that their work has
fallen into the public domain, allowing a significant increase in the
number of reusers, or because the exclusive copyright claims have been
relaxed by an open licence so that others are allowed to reuse your
design, or variants of it, in their publications, on various media,
not just electronic ones).



The rules of encoding (from Re: Missing geometric shapes)

2012-11-07 Thread William_J_G Overington
Michael Everson  wrote:

< ... collect examples of these in print ...

Mark E. Shoulson  wrote:

> We don't encode "it would be nice/useful."  We encode *characters*, glyphs 
> that people use (yes, I know I conflated glyphs and characters there.)
... 
> Unicode isn't a system for encoding ratings. It's a system for encoding what 
> people write and print.

An interesting situation is that the British Library collects pure electronic 
publications by a system of voluntary deposit. A publisher sends an email to a 
specified email address with the pure electronic publication or publications 
attached to the email. The British Library sends, upon request, an email 
receipt for such deposited items.

http://www.bl.uk/aboutus/stratpolprog/legaldep/index.html

I have at various times, as research has progressed, deposited with the British 
Library pdf documents that I have produced and published and I have deposited 
with the British Library TrueType fonts that I have produced and published and 
I have received email receipts for them.

Some of the pdf publications contain new symbols, used intermixed with text in 
a plain text situation. I have used Private Use Area encodings for the symbols.

Yet the publications have not been published in hardcopy form.

A problem that exists with the ISO/IEC 10646 encoding process, in my opinion, 
is that there is not a way for new symbols for electronic communication systems 
to be considered for encoding unless there is already widespread use of them 
using a Private Use Area encoding.

However, an encoding using a Private Use Area encoding has great problems in 
being implemented as a widespread system.

Also, I feel that implementation other than for research purposes using a 
Private Use Area encoding would cause problems for the future: I feel that a 
formal encoding is needed from the start.

I feel that the rules for encoding such new symbols are out of date and not 
suitable for present day use.

Unfortunately, it seems that there is not a way available for me to request 
formal consideration of the possibility of changing the rules.

Technology has changed since the rules were made.

Is it possible for formal consideration to be given to the possibility of 
changing the rules please?

William Overington

8 November 2012