RE: Keys. (derives from Re: Sequences of combining characters.)

2002-09-30 Thread Marco Cimarosti

Doug Ewell wrote:
 Marco Cimarosti marco dot cimarosti at essetre dot it wrote:
 
  He said that he didn't understand how this detail could help us but,
  anyway, he obtained the child's name and address from the parent:
 
  Daniel Zubeispiel
  Hauptkirchestrasse, 26
  Zürich, Switzerland
 
 Is this a pseudonym?  I am thinking of the German word Beispiel
 meaning example.

Of course. AFAIK, Zu Beispiel means e.g., for example.
Hauptkirchestrasse is a made-up road name meaning cathedral street.
Zurich is the only real piece of the address.

_ Marco




My German blunders (was Keys. (derives from Re: Sequences of combining characters.))

2002-09-30 Thread Marco Cimarosti

I (Marco Cimarosti) wrote:
 Of course. AFAIK, Zu Beispiel means e.g., for example.
 Hauptkirchestrasse is a made-up road name meaning 
 cathedral street.
 Zurich is the only real piece of the address.

But a native German speaker patiently explained, in a private message:

| If it's an example, it's one not constructed by a native speaker ;-)
| For one, it's zum Beispiel and the name of the street (road 
| would have been ...landstrasse) is either missing an 'n', or
| possibly has an extra 'e'.
| As it stands, it's decidedly odd looking.
| 
| Although, it's supposed to be Swiss. That could explain a lot.

Thanks for the corrections.

I should not have retained the city where it actually happened: if only I had
set the scene in Lugano...

_ Marco




RE: Keys. (derives from Re: Sequences of combining characters.)

2002-09-30 Thread Michael Everson

At 09:56 +0200 2002-09-30, Marco Cimarosti wrote:

Of course. AFAIK, Zu Beispiel means e.g., for example.

Recte Zum Beispiel.
-- 
Michael Everson * * Everson Typography *  * http://www.evertype.com
48B Gleann na Carraige; Cill Fhionntain; Baile Átha Cliath 13; Éire
Telephone +353 86 807 9169 * * Fax +353 1 832 2189 (by arrangement)




Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-28 Thread Peter_Constable


[Still off-topic, but I'm hopeful that progress can be made, so am
continuing a little farther]


On 09/27/2002 10:26:36 AM William Overington wrote:

XML is the way to go.

Maybe, maybe not.  The issue of U+003C being used to mean LESS-THAN SIGN in
documents which mix ordinary text and markup may or may not, depending upon
the application, be a problem.

It really isn't a problem. XML provides other means to represent that
character when it is needed as part of the content rather than as part of
the markup. It is the job of an XML parser to sort that out, and there are
various XML parsers that all handle this without a hitch and that are
freely available. Someone made reference to MathML, which is a markup
language built on XML (XML is a spec for building markup languages), and
clearly mathematicians need to be able to represent this character within
content, and the special use of U+003C for markup in XML was not seen in
any way to be an obstacle.

Your proposed markup convention would also need a parser to identify the
pieces in a stream of data. If someone wants to use U+2604 in content, you
would probably need some indirect way to represent it in a data stream.
(E.g. One can imagine a hypothetical message "My favourite Unicode
character is P1" into which someone might want to insert the COMET
character.) So, I expect you'll have to deal with the same problem anyway.
But this parser doesn't yet exist; some software developer will have to
create it. On the other hand, XML parsers exist today. If you had been
pursuing an XML-based approach, you might already be testing live
prototypes rather than discussing a hypothetical system.

Also, in an earlier message, you mentioned that you wanted to be able to
use this messaging system on the Web. And, of course, you want to be able
to represent U+003C directly in content. Did you realise that those two are
contradictory? HTML has the same heredity as XML (both are implementations
of SGML). It also uses U+003C for markup, and provides the same alternative
means to represent that character as part of content. So, if one of the
contexts within which you want your system to work is the Web, then you're
going to have to deal with indirect representation of U+003C anyway. Since
it's already a magic character, why not let it be the magic character for
your proposed protocol?

XML really *is* the way to go. Please believe us. You don't need to believe
me; believe Tex, Ken, Marco and the others who have offered you this
recommendation. They really are among the most well-informed contributors
to this list.



BTW, my mail client (Lotus Notes, for better or worse) reports what time in
*my* time zone an author wrote the given message. Such reporting of time in
international communications is problematic; time zones need to be stated
explicitly. We discovered this quite a while ago after scheduling a
tele-conference; the half of the dept. in the UK assumed the time they saw
was Dallas time (or maybe they suggested the time and we were reading it),
but Notes had silently done a time zone conversion.
- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]










Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-28 Thread Barry Caplan

At 12:24 PM 9/27/2002 +0100, William Overington wrote:
You tell me which one is more
likely to result in productive work and adoption by others.

Likelihood of success and what actually happens are not the same thing.  I
do not know which is more likely as I do not know of what has happened
already.  

Well, as you mentioned, the nature of scholarly research demands that you are familiar 
with the basis for the arguments being presented.
If your goal is merely to build such a system, I am sure everyone is willing to 
concede that it is technically feasible, even bordering on trivial. It is not 
interesting in a scholarly sense at all, so it is only your ego that is going to 
benefit.

If your goal is to enjoy some commercial success, well, that may be possible too. The 
utility of the application will be strongly limited by its lack of interoperability 
with other existing systems, many of which are used by the likely community of users 
for your system. That community has these choices:

- Not use your system
- Use your system and never interchange data
- Use your system and roll their own tools to do data interchange
- Use your system and demand data interchange tools from their other vendors
- Use your tool and demand data interchange tools from you
- Create a closed source functional near-equivalent of your tool with data 
interchange facilities
- Create an open source functional near-equivalent of your tool with data 
interchange facilities

Ponder very carefully the implications of each of these upon: the utility (usefulness 
and value) of your software, the effects on your limited resources of needing to 
support an extra layer of data interchange, and the effects on other vendors' limited 
resources of being asked to support data interchange with a proprietary format in 
limited use.

If you want to share this program with a handful of folks, your proposal might fly. If 
you want real people in real places on earth to contribute text, then I predict issues 
will arise and you will lose all control because the last item in the list above will 
occur. 

Just to give you my sense of how much work it would take to do that, I think about an 
intense week or so for any experienced open source programmer for each type of UI is 
about right (GNOME, Web, etc), based on your description of the functionality and the 
availability of major modules such as XML, message catalog, UI, database and Web 
support.


Some people may have deleted the email, some may have read it and
disregarded it, yet it is possible that some people might have tried to
produce a comet circumflex button on the screen using an all-Unicode font
and might be considering the possibilities of how the system could be
applied or might even be writing an experimental software program which can
take comet circumflex sequences and process them through a database.

Speaking of reading the sources, you might want to read Richard Dawkins' The Selfish 
Gene and other related works on memes to get a sense of why any alternative to XML for 
data interchange is likely to fail in the marketplaces of business and of ideas even 
if technically feasible.


The topic of keys generally which I have introduced 


Why are you claiming credit for a system which has been a core part of programming 
APIs since probably the 1960s? You can search for the documentation online for the 
printf function and its relatives for *nix, or resource APIs for Windows and Mac for 
a good start.

Any translator who has done localization is familiar with the use of parameterized 
sentences that you describe, and why they are a problem when it comes to translation. 
I am sure I am not the only localization consultant on the list that preaches a very 
limited use of them (what I call "constructed sentences").
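
As a rough illustration of the point about parameterized sentences, here is a minimal Python sketch. The catalog, the `render` helper, and the sample sentences are all hypothetical and are not taken from any system named in this thread. Named placeholders let each translation choose its own word order, but a single template per locale still cannot adapt to the number it is given.

```python
# Hypothetical message catalog: one parameterized sentence per locale.
CATALOG = {
    "en": "{user} sent {count} files to {place}.",
    "de": "{user} hat {count} Dateien nach {place} gesendet.",
}


def render(locale: str, **params: object) -> str:
    """Fill the parameterized sentence for the given locale."""
    return CATALOG[locale].format(**params)


# Word order differs between locales, which named placeholders handle.
print(render("en", user="Anna", count=3, place="London"))
print(render("de", user="Anna", count=3, place="London"))

# The weakness: with count=1 the fixed template is ungrammatical.
print(render("en", user="Anna", count=1, place="London"))
```

The last call shows the kind of problem translators run into: the sentence is assembled from parts, so agreement with the inserted value is lost.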

is potentially a
far-reaching development in the application of markup in Unicode based
systems.  

It's been done to death in the past. See Trados, Uniscape, GlobalSight, and countless 
in-house systems. The only revolutionary aspect is that you want to throw away all the 
experience and consensus that has been developing in the sw development, i18n, l10n, 
and translation communities about proper workflow and data interchange. If you had come to 
me with such a tool in 1990, Unicode notwithstanding, it might have been useful. But 
now, standalone tools are much less useful for a lot of reasons I won't go into here.

My own comet circumflex system may be highly useful in business
communications and distance education.  

May be, but most likely not. That you think so indicates you are after a commercial 
market, and I refer you to the discussion above of likely outcomes.

I am happy to respond to questions
and to consider documents which people suggest.

I have suggested a lot in a message yesterday and a lot more here. I hope your future 
messages will take the material I have suggested into account.


XML exists and it uses 

Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-28 Thread Barry Caplan

At 12:23 PM 9/27/2002 +0100, William Overington wrote:
Are you perhaps trying to make a deduction by the fallacy of the
undistributed middle, along the following lines.

William's need is a markup system.
XML is a markup system.

William's need is XML.

I think what is being suggested is not nearly so obvious as that. It is more along the 
lines of:

William's need is a product of which data interchange is a key feature
Said product needs an architecture and a business model
Data interchange happens both externally and internally within the program
The business model chosen may indeed require a non-xml system
XML data interchange is better supported than any proprietary system.
If non-xml is chosen for the outside system, it should be converted to xml as early as 
possible for inbound, and as late as possible for outbound interchange in order to 
capitalize on xml tools

Of course, if the system is closed on the outside, and useful, it will be quickly 
duplicated by someone using open interchange formats anyway,  but that advice on how 
to handle that situation only comes at a price :)


I am simply saying that XML, as I understand it, does not suit my specific
need.

It may be, that you don't understand your need well enough to understand why XML for 
outside interchange is an extremely strong contender.

text cannot be used directly.  For me, that is a major limitation of XML.

Why is it a major limitation of XML? Have not already over a million applications 
and web sites been implemented using XML technology? Is there a record of anyone ever 
griping about this limitation at all?

legacy issue of which I do not want to have the problem with my research in
language translation and distance education. 

How so? A single line of code will automatically escape any characters as needed.
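
For what it is worth, the "single line of code" can be sketched in Python with the standard library function `xml.sax.saxutils.escape`. The sample string is made up for illustration, and other languages and toolkits have their own equivalents.

```python
from xml.sax.saxutils import escape

# Sample content containing characters that are special in XML.
text = "if a < b && b > c"

# The one line that does the escaping.
escaped = escape(text)

print(escaped)  # if a &lt; b &amp;&amp; b &gt; c
```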


 Maybe one day Unicode will
encode special XML opening and closing angle brackets so that XML can
operate without that problem.  

This is not up to Unicode to decide; it is XML's choice to specify the way its tags 
are constructed. XML's family tree starting with SGML (or earlier for all I know) and 
going through HTML pretty much constrains it. Trillions of people know < as the tag 
delimiter. Earlier markup languages used a . PERIOD in the first character in a line 
as a delimiter - I think RTF is of this heritage. When was the last time someone 
mentioned they were creating or editing an RTF file compared to *ML?

However, as XML uses the U+003C character in
that manner at the moment, for me it is a problem and it has led me to use
the key method using a comet circumflex key.

Instead of typing a trivial escape character in the rare case of a < in the content, 
you want to force people to type weird Unicode characters in every tag?


Also, I do not need to have all those < characters and = characters and /
characters within messages.

Have you not thought the problem all the way through? Why on earth would you want your 
message creators typing raw XML anyway? You are going to need some other UI, right? 
And that message editor can generate the XML, complete with escapes, using existing 
code you can have for free. This frees your time from having to create your own wheel 
and maintain it.
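
A minimal sketch of that idea, assuming Python's standard `xml.etree.ElementTree` module stands in for the "existing code you can have for free". The `build_message` helper, the element name `S`, and the attribute name `C` are illustrative assumptions, not part of any agreed format. The editor passes plain text in, and the library produces the markup and the escapes.

```python
import xml.etree.ElementTree as ET


def build_message(code: str, param: str) -> str:
    """Serialize one message; the library escapes special characters."""
    elem = ET.Element("S", {"C": code})  # hypothetical element and attribute
    elem.text = param                    # raw text, no manual escaping
    return ET.tostring(elem, encoding="unicode")


xml_text = build_message("12001", "x < y & y > z")
print(xml_text)  # <S C="12001">x &lt; y &amp; y &gt; z</S>

# Parsing gives back exactly the text the user typed.
print(ET.fromstring(xml_text).text)  # x < y & y > z
```

The person writing the message never sees or types the escaped form.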


Well, U+2604 U+0302 U+20E3 is not ridiculous.  It is entirely permissible
within the Unicode specification.  

He is not saying it is ridiculous because it is not within the specification. He is 
saying it is ridiculous because the development community as a whole (a very large 
whole), both closed source and open source advocates, is rallied around XML as a 
basis for data interchange. If you ever wanted to move your comet files to another 
system, or create them from data in an existing system (such as Trados or another 
translation memory), you will need a two-way XML-Comet converter anyway. Why bother?

you think it ridiculous then maybe that is good evidence of its originality
as a piece of creativity.  

I am sure it will create a pretty glyph. But software creation is about way more than 
pretty glyphs.

A comet circumflex key could be viewed as a piece
of original art.  I specifically designed it so as to be a design which
involves an inventive leap so as to produce something new and unexpected,
which someone skilled in the art would not produce as the application of
skill in the existing art without invention, yet which would display
properly using an all-Unicode font.

This sounds a lot like you are planning to trademark or patent a character. I would 
personally travel to the ends of the earth to testify that all possible combining 
sequences are described as prior art in the description of how to create them in the 
Unicode specification and thus can never be proprietary. Now if you want to have a 
graphic artist draw a logo of a comet with a box around it, that is your prerogative. 
But the idea that combining characters in any fashion is somehow proprietary is not 
ridiculous; it is just a waste of time. In case you think 

Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-28 Thread Doug Ewell

Marco Cimarosti marco dot cimarosti at essetre dot it wrote:

 He said that he didn't understand how this detail could help us but,
 anyway, he obtained the child's name and address from the parent:

 Daniel Zubeispiel
 Hauptkirchestrasse, 26
 Zürich, Switzerland

Is this a pseudonym?  I am thinking of the German word Beispiel
meaning example.

A very funny story, whether or not any names were changed to protect the
innocent.

-Doug Ewell
 Fullerton, California





Re: XML Primer (was Keys. (derives from Re: Sequences of combining characters.))

2002-09-27 Thread William Overington

Shawn Steele wrote to the [EMAIL PROTECTED] list, not directly to me, yet
began by writing.

Mr. Overington,

There is then a long document of very helpful information, for which I am
grateful.

Mr Steele then concludes with the following.

I hope that this example improves your understanding of XML and how it may
be applied to your inventions.  As others have mentioned, this topic is
digressing from the purpose of this message board and would be best
discussed off line or in a different forum.

Well, a letter addressed to me could have been sent by private email.

- Shawn

Shawn Steele
Software Developer Engineer
Microsoft

Unfortunately, this is then followed by the following.

My comments in no way endorse the original

Well, that is fine, the letter has been posted to the Unicode list from a
Microsoft address, so a clarification makes the situation clear just in case
anyone had thought that in some way it might.

and are not intended to confer legitimacy,

Ah!  That is not fine.  The original is entirely legitimate and there is no
need for legitimacy to be conferred at all, also the conferring of
legitimacy is not something which is within the powers of Microsoft to
confer, as Microsoft is a corporation and does not vote in public elections,
let alone have jurisdiction in such matters.  Mentioning legitimacy in that
way in a document from Microsoft, a member of the Unicode Consortium, is
very unfair.

rather they are merely intended to be educational.

Well, they are merely intended to be educational.  No rather about it.

This posting is provided AS IS with no warranties,

Well, that is fine, the letter has been posted to the Unicode list from a
Microsoft address, so a clarification makes the situation clear just in case
anyone had thought that in some way it might.

and confers no rights.

What rights are being referred to here?

William Overington

27 September 2002











Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-27 Thread William Overington

Peter Constable commented as follows.


On 09/26/2002 06:05:45 AM William Overington wrote:

Dallas is 6 hours behind England on the clock.


I'm going to refrain from commenting on anything beyond the markup issues

As you wish.  Though did you stick to that even in the same sentence?

-- and I'm continuing with that only because it's an easy follow-on to what
I already wrote,

As you wish.

even though there is every indication that the sensibility
of it will be ignored.

This did not appear to have meaning.

I checked on the meaning of the word sensibility just to make sure.

Did you intend to convey the meaning the good sense of what I write rather
than the sensibility of it?

Yet what indication whatsoever do you have that I ignore what you write?

I do not always agree with you, yet where specific references to documents
on the web are made I always attempt to obtain them and study the points you
make.

Certainly, I may not agree with you.  Sometimes I agree, sometimes I do not
agree and sometimes I am undecided in a matter.  That surely is the nature
of critical scholarship and research.



A document would contain a sequence such as follows.

U+2604 U+0302 U+20E3 12001 U+2460 London U+2604 U+0302 U+20E2


You could just as easily have used

<S C="12001">London</S>

or

<S C="12001" P1="London"/>

which are only slightly more verbose, but which follow a widely-implemented
standard that can be parsed by lots of existing software, for which there
are a large number of tools available, and which a vast number of
individuals, businesses and other agencies have an interest in. Your markup
convention is completely proprietary,

Thank you.  That is excellent.  I designed the comet circumflex key with the
specific intention that it was creatively original whilst being expressible
using a standard all-Unicode font.

it has no existing software support,
and nobody but you has any interest in it.
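
To illustrate the claim about existing software, here is a small Python sketch using the standard `xml.etree.ElementTree` parser. It assumes the two suggested forms are an `S` element carrying the code in a `C` attribute, with the parameter either as element text or in a `P1` attribute. The `read_message` helper is hypothetical.

```python
import xml.etree.ElementTree as ET


def read_message(doc: str) -> tuple[str, str]:
    """Return (code, parameter) from either assumed form of the S element."""
    elem = ET.fromstring(doc)
    code = elem.get("C")
    # The parameter is either the P1 attribute or the element's text.
    param = elem.get("P1") or elem.text
    return code, param


for doc in ('<S C="12001">London</S>', '<S C="12001" P1="London"/>'):
    print(read_message(doc))  # ('12001', 'London') both times
```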

You have no basis whatsoever for claiming that nobody other than me has any
interest in it.  Maybe you are not interested, maybe some people you know
are not interested, yet I feel that it is unfair for you to make such a
statement without evidence when writing from an established organization as
that remark may prejudice people from taking an interest in helping to
develop the idea because of a political dimension of going against the tide.
You have your position and I feel that you should allow someone who does not
have such a position an even-handed chance to put forward an idea and have
it considered on its merits.

You tell me which one is more
likely to result in productive work and adoption by others.

Likelihood of success and what actually happens are not the same thing.  I
do not know which is more likely as I do not know of what has happened
already.  Some people may have deleted the email, some may have read it and
disregarded it, yet it is possible that some people might have tried to
produce a comet circumflex button on the screen using an all-Unicode font
and might be considering the possibilities of how the system could be
applied or might even be writing an experimental software program which can
take comet circumflex sequences and process them through a database.

Look, for example, at The Respectfully Experiment in the Unicode mailing
list archives.  There a result was assumed and something different was
observed in practice.


that it is
because I am an inventor, interested in pushing the envelope as to what is
possible scientifically and technologically.

Marco asked me a specific question, so I answered what he had asked.


Perhaps there is an [EMAIL PROTECTED] list somewhere where you might
find greater interest in your ideas than here.

That is unfair of you.  You have chosen to respond to my posts and I have
answered the questions which you asked.

You even stated in the same post.

quote

I'm going to refrain from commenting on anything beyond the markup issues

end quote

The topic of keys generally which I have introduced is potentially a
far-reaching development in the application of markup in Unicode based
systems.  My own comet circumflex system may be highly useful in business
communications and distance education.  I am happy to respond to questions
and to consider documents which people suggest.

None of us here mind
invention, but I think most would believe that inventiveness is most
productive when building off the advancement of others rather than
reinventing wheels or widgets. XML exists, and it works.

XML exists and it uses U+003C in a way that makes using U+003C with the
meaning LESS-THAN SIGN in body text intermixed with markup sections awkward.
That feature of XML may not matter for situations involving encoding simply
literary works, yet for a comprehensive system which can include the U+003C
character with the meaning LESS-THAN SIGN in body text and in markup
parameters, it does not suit my need.


Beside the fact that your proposed markup convention is not a good idea, it
has nothing 

Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-27 Thread William Overington

Peter Constable wrote as follows.


On 09/26/2002 03:42:16 AM William Overington wrote:


Well, it might have been 03:42:16 AM where you are, indeed it probably was,
as Dallas is six hours behind England on the clock, but I would not want
people to think that I write my posts in the middle of the night!


On the one hand, you say

XML does not suit my specific need as far as I can tell.


But you also said

Documents with the code sequence are intended to be sent over the internet
as email, used as web pages and broadcast in multimedia broadcasts over a
direct broadcast satellite system, so the codes which you suggest would be
unsuitable.

In that quote, "the codes which you suggest" was your list of specific
Unicode code points as follows.

quote

Sorry to be blunt, but that's silly. If you need a special-purpose
character (a code-sequence, to be more precise) for use within your
specialised application, use one of FDD0..FDEF, FFFE, FFFF, 1FFFE, 1FFFF,
2FFFE...  10FFFE, 10FFFF. They are non-characters available for exactly
this use.


end quote
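
For reference, the set of non-characters in the quoted list can be expressed as a short test. This is only a sketch in Python, and the function name `is_noncharacter` is made up here. It covers U+FDD0..U+FDEF plus the last two code points of each of the 17 planes.

```python
def is_noncharacter(cp: int) -> bool:
    """True if cp is one of the 66 Unicode noncharacter code points."""
    if not 0 <= cp <= 0x10FFFF:
        return False
    # U+FDD0..U+FDEF, or a code point ending in FFFE or FFFF in any plane.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


print(is_noncharacter(0xFDD0))    # True
print(is_noncharacter(0x10FFFF))  # True
print(is_noncharacter(0x2604))    # False (COMET is an ordinary character)
```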

I maintain that they are unsuitable for use in documents which are to be
sent from one end user to another.

Yet the first part of my sentence which you have quoted could by going to
the final comma and converting it to a full stop form a sentence on its own
as follows.

Documents with the code sequence are intended to be sent over the internet
as email, used as web pages and broadcast in multimedia broadcasts over a
direct broadcast satellite system.

So, I will reason from that.

You also quote me as stating the following sentence.

XML does not suit my specific need as far as I can tell.

I am happy with that.

The two sentences are entirely consistent.

Are you perhaps trying to make a deduction by the fallacy of the
undistributed middle, along the following lines.

William's need is a markup system.
XML is a markup system.

William's need is XML.

It may well be that XML could be used to carry the comet circumflex code
numbers which I am devising.  I am not saying that it could not be so used.

I am simply saying that XML, as I understand it, does not suit my specific
need.

For example, if I understand it correctly, XML uses U+003C in a document in
such a manner that its use for the meaning LESS-THAN SIGN in the body of the
text cannot be used directly.  For me, that is a major limitation of XML.
Now, I am not trying to make some big issue out of this by criticising XML
as I am not trying to criticise XML, yet to my mind that is a very big
legacy issue of which I do not want to have the problem with my research in
language translation and distance education.  Maybe one day Unicode will
encode special XML opening and closing angle brackets so that XML can
operate without that problem.  However, as XML uses the U+003C character in
that manner at the moment, for me it is a problem and it has led me to use
the key method using a comet circumflex key.

Also, I do not need to have all those < characters and = characters and /
characters within messages.

One of the things that is especially useful about XML and related
technologies is the facility with which data can be repurposed. You have
one schema for marking up data, and stylesheets that transform it as needed
for different publishing / usage contexts.

Also, I don't see how it can be that a character sequence such as U+003C
U+0061 U+003E can't be useful to you when some ridiculous character
sequence like U+2604 U+0302 U+20E3 is.


Well, U+2604 U+0302 U+20E3 is not ridiculous.  It is entirely permissible
within the Unicode specification.  I have used combining characters
productively, in accordance with the rules set out in the specification.
Please see section 7.9.  The button displays using an all-Unicode font.  If
you think it ridiculous then maybe that is good evidence of its originality
as a piece of creativity.  A comet circumflex key could be viewed as a piece
of original art.  I specifically designed it so as to be a design which
involves an inventive leap so as to produce something new and unexpected,
which someone skilled in the art would not produce as the application of
skill in the existing art without invention, yet which would display
properly using an all-Unicode font.

The sequence U+003C U+0061 U+003E is unsuitable because it begins with a
U+003C character and I do not want the use of U+003C to mean LESS-THAN SIGN
to be unavailable in a simple direct manner.  I want to be able to use the
comet circumflex translation system in documents which contain mathematics
and software listings as well as literary text.  So, I have decided to use a
straightforward system which allows me to do that without problems.

An added bonus of using the comet circumflex key is that documents
containing comet circumflex codes do not necessarily need to contain any
characters from the Latin alphabet.

William Overington

27 September 2002










Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-27 Thread John Cowan

William Overington scripsit:

 Well, it depends what one is trying to do.  If one wishes to establish a
 system whereby proprietary intellectual property rights exist, then a
 proprietary coding can be a good idea.

That is the function of encryption.

 XML is the way to go.
 
 Maybe, maybe not.  The issue of U+003C being used to mean LESS-THAN SIGN in
 documents which mix ordinary text and markup may or may not, depending upon
 the application, be a problem.

Since there are several standard ways to represent the semantic LESS-THAN
SIGN in XML (&lt; is most typical, but &#x3C; also works), there is
no problem, only a little extra work as tradeoff.  After all, why not
invent your own character code as well as your own markup language?
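
A quick way to see that these spellings are equivalent is to run them through a parser. The sketch below uses Python's standard `xml.etree.ElementTree`. The element name `m` and the sample content are arbitrary, and a CDATA section is added as a third standard option.

```python
import xml.etree.ElementTree as ET

# Three standard ways to write a literal LESS-THAN SIGN in XML content.
docs = [
    "<m>a &lt; b</m>",            # predefined entity reference
    "<m>a &#x3C; b</m>",          # numeric character reference
    "<m><![CDATA[a < b]]></m>",   # CDATA section
]

texts = [ET.fromstring(d).text for d in docs]
print(texts)  # ['a < b', 'a < b', 'a < b']
```

In every case the application receives the plain character U+003C. The escaping exists only in the serialized file.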

 The keys idea is pushing the envelope.  As spin off from this discussion,
 maybe the XML people, and the Unicode Technical Committee, will do something
 about having special characters for the XML tags rather than using U+003C
 and thereby help people wanting to place mathematics and software listings
 in the same file as markup.  

MathML is a markup standard for mathematical text that is an application of
XML, so people wanting to place etc. need no further help.

Don't hold your breath, and don't *mutcheh* us about it.

 What is wrong with private encodings?

Interchanging them does not scale.

 People may ignore them if they wish.  

They will, they will.

 High level application semantics assigned to particular code points are
 potentially very useful.  I have published various documents on the web
 about them with Private Use Area allocations for various items such as
 colour and point size for text.

Of course you can use the Private Use Area for whatever you like.  A character
standard, however, is intended for encoding *characters*.  It is not intended
as a source of useful integers -- for that, apply to Dedekind.

-- 
John Cowan   [EMAIL PROTECTED]
You need a change: try Canada  You need a change: try China
--fortune cookies opened by a couple that I know




Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-27 Thread Peter_Constable


[This is entirely off-topic.]

On 09/27/2002 06:24:27 AM William Overington wrote:

Yet what indication whatsoever do you have that I ignore what you write?

The fact that you have been given recommendations from several people on
this list not to invent new markup conventions but to take advantage of the
existing, state-of-the-art technologies for this purpose, yet you have
consistently rejected those recommendations.



I do not always agree with you,

I doubt there's anyone on this list that always agrees with me (I certainly
hope not; after the passage of time, I often don't agree with myself :-).



it has no existing software support,
and nobody but you has any interest in it.

You have no basis whatsoever for claiming that nobody other than me has any
interest in it.

It's only a claim, a hypothesis that I happen to consider to have enough
probability of validity to make me feel confident in stating in a public
forum. Of course, I may be wrong.



Maybe you are not interested, maybe some people you know
are not interested, yet I feel that it is unfair for you to make such a
statement without evidence when writing from an established organization as
that remark may prejudice people from taking an interest in helping to
develop the idea because of a political dimension of going against the tide.

I feel there is evidence: take a look at any serial publication related to
the software industry from the past three years and look for references to
XML. It comes up again and again and again. The evidence very strongly
points in favour of XML if one is needing a markup convention for some
protocol. There may well be some situation in which XML isn't appropriate;
e.g. one might have valid reasons for wanting to maintain a binary file
format as the native storage representation for a word-processing or
spreadsheet app. But if one is going to use a *character*-based markup
convention, I think you'd be hard pressed to come up with good reasons at
this point for using something other than XML.



Perhaps there is an [EMAIL PROTECTED] list somewhere where you might
find greater interest in your ideas than here.

That is unfair of you.

If I offended, then I apologize. I merely wished to suggest that your ideas
regarding markup are what I think the vast majority on this list would
consider eccentric, and to also suggest that it's all off-topic for this
list and really should be taken up elsewhere.




You even stated in the same post.

quote

I'm going to refrain from commenting on anything beyond the markup issues

end quote

And I believe I did so.



The topic of keys generally which I have introduced is potentially a
far-reaching development in the application of markup in Unicode based
systems.  My own comet circumflex system may be highly useful in business
communications and distance education.  I am happy to respond to questions
and to consider documents which people suggest.

But please, not on this list. This is not the comet circumflex list.



XML exists and it uses U+003C in a way that makes using U+003C with the
meaning LESS-THAN SIGN in body text intermixed with markup sections
awkward.

Not significantly so, as evidenced by the fact that many have needed to
represent the character < within content yet this has not impeded the
widespread -- near ubiquitous -- adoption of XML.


That feature of XML may not matter for situations involving encoding simply
literary works, yet for a comprehensive system which can include the U+003C
character with the meaning LESS-THAN SIGN in body text and in markup
parameters, it does not suit my need.

Then I think you're making decisions about design of a protocol using the
wrong criteria.



Actually, I was rather hoping that, with your specific interest in languages
that you would have wished to have a try at using the comet circumflex
system as one of the features of the comet circumflex system is that it
could be used with minority languages as easily as with the major languages
of the world.

Actually, one of the things that I chose *not* to comment on in the
previous message was the very significant issues the comet circumflex
system raises in relation to internationalisation and localisation. As
someone else pointed out, your system has a problem in that a parameter
such as London needs to be localised. There are a range of
internationalisation issues that your system doesn't address. It isn't
always safe to assume that one can define a matrix statement that can be
translated into multiple languages and into which parameter strings can be
inserted; issues such as grammatical concord may be a problem. I don't want
to get into such a discussion (especially on this list). My point is, I see
many potential problems in terms of multilingual application of the system.
Also, the users I support are not dealing with text involving a set of
short, pre-defined messages, so this system isn't all that relevant for my
work.



- Peter



Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-27 Thread John H. Jenkins


On Friday, September 27, 2002, at 09:52 AM, [EMAIL PROTECTED] 
wrote:

 I doubt there's anyone on this list that always agrees with me


I think you're wrong, there, Peter.  I *never* disagree with you.  :-)

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://www.tejat.net/





Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-27 Thread Tex Texin



William Overington wrote:
 Message catalogs are not new.
 
 I had not heard the description Message catalog previously, so I can
 search for that too.
 
 I have previously searched under telegraphic code and language and
 translation.

look for: software localization, message, catalog, resource files,
perhaps localisation ;-)

 
 An email correspondent drew my attention to the following list of numbered
 radiograms.
 
 http://www.arrl.org/FandES/field/forms/fsd3.html
 
 That is an interesting document.
 
 I have not yet found any example oriented to language translation.  I have
 not yet found any example oriented to carrying on a complete conversation.

A new prisoner sits down for his first lunch. Someone shouts out 53.
Everyone laughs.
Another shouts 26. More laughter. He asks his neighbor what's going
on... The neighbor explains they have all been there so long they have
heard all the jokes told very many times. Finally they just gave them
numbers. So when someone shouts out a number they remember the joke and
laugh.
After a bit the new guy shouts out: 42! Dead silence. He asks his
neighbor what went wrong. He turns to him and says That one is not
funny..

This is a very old joke. It is an indication of how old the idea of
numbered messages might be. ;-)

The ARRL list was missing quite a few. 73 & 88 were common for Best
regards, and love and kisses.
I was rather surprised therefore when the Target products with 88 were
recently pulled from the market because they signaled the neo-nazi
movement. I thought it meant Love and kisses.

 
 A proprietary coding system is a bad idea.
 
 Well, it depends what one is trying to do. 


Yes, for the problem you described, given the availability of an open
system with lots of tool support, a proprietary system, for which you
could not create nearly as many tools as the open-based systems have,
would not be competitive. You would really have to build in
some significant market advantage. Given your lack of familiarity with
what exists in the market, and a presumption of a one-man shop (limited
resources), we speculated it was a mistake.

 XML is the way to go.
 
 Maybe, maybe not.  The issue of U+003C being used to mean LESS-THAN SIGN in
 documents which mix ordinary text and markup may or may not, depending upon
 the application, be a problem.

You can use the character with some minor escaping. It is a smaller
issue than trying to create all the various tools and benefits you would
get from XML.
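
As a rough illustration of that "minor escaping", here is a minimal Python
sketch using only the standard library. The <listing> element name and the
sample text are assumptions made up for the example; nothing in this thread
defines them.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Body text that uses U+003C with its ordinary LESS-THAN SIGN meaning.
body = "if a < b then swap(a, b)"

# escape() rewrites the characters that collide with markup: & < >
escaped = escape(body)

# Wrap the escaped text in a hypothetical <listing> element and parse it back.
element = ET.fromstring("<listing>" + escaped + "</listing>")

print(escaped)       # if a &lt; b then swap(a, b)
print(element.text)  # if a < b then swap(a, b)
```

The parser hands back the original text unchanged, so the escaped form only
exists in the serialized file.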

 
 but as Peter and others have already defined
 several times where the envelope needs pushing (e.g. XML), and in
 particular where they should not (private encodings, and hi level
 application semantics assigned to particular code points), continued
 attempts to do so are not welcome.
 
 What is wrong with private encodings?  The Private Use Area is there to be
 used. 

Sure, but use them privately and discuss them privately with people who
have an interest in those particular purposes.
This is not the place. I know this has been stated before.

I think Suzanne or Barry even created a list for purposes of PUA
discussion:
http://groups.yahoo.com/group/CharMan/
Or start a list of your own.

You are welcome (as are others) to send announcements here saying- Hey
I have these PUA ideas, and would like to discuss them here and here.

It is really quite unfair to the members of the list to cause it to go
over the same ground.

hth
tex
-- 
-
Tex Texin   cell: +1 781 789 1898   mailto:[EMAIL PROTECTED]
Xen Master  http://www.i18nGuy.com
 
XenCraft    http://www.XenCraft.com
Making e-Business Work Around the World
-




Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-27 Thread John Cowan

John H. Jenkins scripsit:

 I think you're wrong, there, Peter.  I *never* disagree with you.  :-)

Hmm.  Has anyone ever seen Peter and John together?  :-)

-- 
John Cowan  [EMAIL PROTECTED]  www.ccil.org/~cowan  www.reutershealth.com
In the sciences, we are now uniquely privileged to sit side by side
with the giants on whose shoulders we stand.
--Gerald Holton




Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-27 Thread John Cowan

Tex Texin scripsit:

 After a bit the new guy shouts out: 42! Dead silence. He asks his
 neighbor what went wrong. He turns to him and says That one is not
 funny..

Other punchlines I have heard:

(about a third party):  Steve should know he can't handle Swedish dialect.

(after uproarious laughter):  Hey, we've never heard that one before!

(after silence): I guess you just don't know how to tell a joke.

 This is a very old joke. It is an indication of how old the idea of
 numbered messages might be. ;-)

As William mentions, commercial telegraph codes are almost as old as the
telegraph itself; when the five-letter-code principle was eventually
accepted internationally, it became possible to use a single group to
represent things as complex as We are shipping to you, care of your
agent in X, our product Y where all possible combinations of X and Y
were given individual codes.  This of course was a code commissioned by
a private company; public codes necessarily had to be more inclusive and
thus more verbose.  Several of them were indeed published in multilingual
editions, so that the same code sequence could be read as English,
French, German, 

In the case of public codes, company code clerks became quite adept
at reading the more frequent codes without reference to the code book.
On one occasion, a code clerk got a cable from an agent located halfway
around the planet reading simply AHXNO, a code entirely unfamiliar to him.
Unfortunately, when he looked it up, he found the reading to be:

Met with a fatal accident.

-- 
John Cowan    http://www.ccil.org/~cowan  [EMAIL PROTECTED]
Please leave your values|   Check your assumptions.  In fact,
   at the front desk.   |  check your assumptions at the door.
 --sign in Paris hotel  |--Miles Vorkosigan




Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-27 Thread Barry Caplan

At 04:26 PM 9/27/2002 +0100, William Overington wrote:
I had not heard the description Message catalog previously, so I can
search for that too.

I have previously searched under telegraphic code and language and
translation.

An email correspondent drew my attention to the following list of numbered
I have not yet found any example oriented to language translation.  

Key Unix libraries have used message catalogs as part of the API since time 
immemorial. Hence any Unix application with even a whiff of a chance of being 
internationalized is likely to have used those functions.


I have
not yet found any example oriented to carrying on a complete conversation.

I would look for the earliest references to machine translation in the 1940s and 50s, 
up to the work with Eliza at MIT in the 60s. I think there is an enormous project 
whose name I don't recall right now going on in Texas, perhaps Austin, which is 
spiritually derived from Eliza and focused on sending whole, previously composed 
sentences back conversational style.

If you want to find the whole of the literature in this area, I suggest searching 
Turing Test.


A proprietary coding system is a bad idea.

Well, it depends what one is trying to do.  If one wishes to establish a
system whereby proprietary intellectual property rights exist, then a
proprietary coding can be a good idea.  Various large companies use
proprietary coding systems for files used with their software packages.  If,
however, one is trying to establish an open system, then you might well be
right.

Or if you want to minimize the amount of reinventing the wheel you do internally. You 
can easily use a proprietary format outside and XML inside, just as you can use SJIS 
outside and Unicode for internal processing.
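
A minimal Python sketch of that boundary pattern follows, assuming the
external data is Shift_JIS. The function names are invented for the
illustration.

```python
def count_characters(external_bytes, external_encoding="shift_jis"):
    """Decode at the boundary, work on Unicode text, and report its length."""
    text = external_bytes.decode(external_encoding)  # outside -> inside
    return len(text)                                 # counted in code points


def to_external(text, external_encoding="shift_jis"):
    """Encode only when the data leaves the program again."""
    return text.encode(external_encoding)            # inside -> outside


sjis_data = to_external("日本語")   # stands in for bytes read from an SJIS file
print(len(sjis_data))               # 6 (bytes in the external form)
print(count_characters(sjis_data))  # 3 (characters in the internal form)
```

The same shape would apply to a proprietary file format outside and XML
inside: convert once on the way in and once on the way out.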


Failure to investigate the state of the art, (especially where google is
so effortless), means this idea is not pushing any envelope.

Well, if you have any specific suggestions of what keywords to use in a
search, that would be very helpful.


I have given you some. Rather than focusing on pseudo-scientific terms like 
radiogram, I suggest starting with a familiarity with the history of computer 
science, both pure and applied research.


The keys idea is pushing the envelope.  


No it is not. 

As spin off from this discussion,
maybe the XML people, and the Unicode Technical Committee, will do something
about having special characters for the XML tags rather than using U+003C
and thereby help people wanting to place mathematics and software listings
in the same file as markup.  Is using U+003C a legacy from ASCII days?

Why is it not possible to use < signs in XML? 


Most of my postings in this thread are in response to people asking me
specific questions and raising interesting points.  That is surely why a
discussion group exists.

But most of the answers you get are based on a shared technical and educational 
background which you don't have and/or don't seem to value. It is difficult to describe, but 
a lot of early computer science research was about how to effectively decompose 
functionality and data. Sadly, I think a lot of this is being lost. For a more 
technical starting point, look for the works of Edsger Dijkstra starting in the 1960s. 
For a less technical point of view, look for The Mythical Man-Month from the mid 70s 
(recently updated), and its spiritual followups by Ed Yourdon and Tom DeMarco. 

When I read the responses you get, I have the feeling that the authors have 
internalized the lessons of these important texts (even if they may not have studied 
them explicitly). Once you internalize the lessons also, then you will have a better 
understanding of the points of view you are consistently receiving with friction.


I am hoping that I can publish some web pages with some comet circumflex
codes and sentences about asking about the weather conditions and
temperatures at the message recipient's location together with codes and
sentences for making replies so that hopefully people who might be
interested in some concept proving experiments can hopefully have a go at
some fascinating experiments with this technology.  Unicode can be used to
encode many languages and it will be interesting to find out what can be
achieved using the comet circumflex system.

That might be an interesting web site in its own right, but the technology is nothing 
special and has been done a million times under a million names and ten million times 
with no name at all.

Barry Caplan
Publisher, www.i18n.com





RE: Keys. (derives from Re: Sequences of combining characters.)

2002-09-27 Thread Marco Cimarosti

Tex Texin wrote:
 What's funny to me about this message, is a product message catalog I
 was responsible for localizing had messages created by software
 developers, such as (paraphrasing from memory):
 
 The client is dead.
 The client has been killed.
 You killed the client.
 
 Some of the translators were horrified. We had to explain that the
 client was software used by the user, and that to kill it
 meant the software was no longer operating, not that the 
 product caused
 the death of the user. And then we had to get the developers to change
 the message, since even in english they were not the most effective
 messages.
 
 Lucky too, that support couldn't cause someone on the phone to give a
 command that could kill the client...

Years ago, I was in charge of supporting a software system composed of a main
module, called the parent (task), and of a number of secondary modules,
called child (tasks). Each child was identified with a name and a (task)
address.

One day, the IT manager reported that the system started having problems
after a child had turned off the computer.

I explained that, according to my knowledge, that was impossible: children
ran in a protected area, so the parent would have stopped them before they
had any chance of turning off the computer. But he replied that he saw the
child turning off the system with his own eyes, and the parent could not
stop it.

This guy was such an idiot, and I was quite surprised to discover that he
could use the utility called Children Monitor. So, I asked him to let me
know the child's name and address.

He said that he didn't understand how this detail could help us but, anyway,
he obtained the child's name and address from the parent:

Daniel Zubeispiel
Hauptkirchestrasse, 26
Zürich, Switzerland

(Seven-year-old Daniel, the son of a system engineer, was in the laboratory
that day because his school was closed for maintenance.)

Ciao.
Marco




Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-26 Thread William Overington

Peter Constable commented as follows.

On 09/25/2002 05:55:02 AM William Overington wrote:

For example, I am looking at using the following sequence so as to produce a
special purpose key within documents.

U+2604 U+0302 U+20E3

Hopefully that sequence will be so unlikely to occur other than in my
specialised application that the sequence can be used uniquely for that
specialised application.

Sorry to be blunt, but that's silly. If you need a special-purpose
character (a code-sequence, to be more precise) for use within your
specialised application, use one of FDD0..FDEF, FFFE, FFFF, 1FFFE, 1FFFF,
2FFFE...  10FFFE, 10FFFF. They are non-characters available for exactly
this use.
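
For reference, the set of non-characters listed above can be tested for with
a few lines of Python. This is only a sketch; the function name is invented
for the illustration.

```python
def is_noncharacter(code_point):
    """Return True for the 66 code points Unicode reserves as noncharacters."""
    if 0xFDD0 <= code_point <= 0xFDEF:
        return True
    # The last two code points of every plane: U+FFFE, U+FFFF, U+1FFFE, ... U+10FFFF.
    return 0 <= code_point <= 0x10FFFF and (code_point & 0xFFFE) == 0xFFFE


noncharacters = [cp for cp in range(0x110000) if is_noncharacter(cp)]
print(len(noncharacters))       # 66
print(is_noncharacter(0x2604))  # False: U+2604 COMET is an ordinary character
```

The count is 32 code points in FDD0..FDEF plus two at the end of each of the
17 planes.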


Documents with the code sequence are intended to be sent over the internet
as email, used as web pages and broadcast in multimedia broadcasts over a
direct broadcast satellite system, so the codes which you suggest would be
unsuitable.

If you need real character sequences for markup, there's this thing called
XML. Perhaps you've heard of it. It's worth taking a look at; I think it
really might catch on some day.

I have heard of XML, though I know little about it.

I have read some introductory documents about XML.

XML does not suit my specific need as far as I can tell.

William Overington

26 September 2002






Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-26 Thread William Overington

Marco Cimarosti asked about what key caps have to do with mark up or text
files.

My idea is as follows.

A document would contain a sequence such as follows.

U+2604 U+0302 U+20E3 12001 U+2460 London U+2604 U+0302 U+20E2

This would have a meaning such as follows.

It was a pleasure to welcome you to our stand at the recent exhibition in
London.

Please now consider the following sequence.

U+2604 U+0302 U+20E3 12001 U+2460 Rome U+2604 U+0302 U+20E2

This would have the following meaning.

It was a pleasure to welcome you to our stand at the recent exhibition in
Rome.

This being because my published dictionary would state that sentence 12001
within the Comet Circumflex system has one parameter and has the meaning as
follows.

It was a pleasure to welcome you to our stand at the recent exhibition in
P1.
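
As a minimal sketch of how a receiving program might expand such a sequence,
here is some Python. It assumes the pieces are simply concatenated (the
description above does not say whether separators are used) and it uses a
hypothetical one-entry dictionary; the names START, END, CATALOG and expand
are invented for the illustration.

```python
import re

START = "\u2604\u0302\u20E3"  # COMET, COMBINING CIRCUMFLEX ACCENT, COMBINING ENCLOSING KEYCAP
END = "\u2604\u0302\u20E2"    # same base sequence, closed with COMBINING ENCLOSING SCREEN

# Hypothetical dictionary holding only the one sentence quoted above.
CATALOG = {
    "en": {
        12001: "It was a pleasure to welcome you to our stand "
               "at the recent exhibition in {P1}.",
    },
}

# One encoded sentence: start key, decimal sentence number, parameters, end key.
PATTERN = re.compile(re.escape(START) + r"\s*(\d+)(.*?)" + re.escape(END), re.DOTALL)
# Each parameter is introduced by a circled number: U+2460 means P1, U+2461 means P2, ...
PARAM = re.compile("([\u2460-\u2473])([^\u2460-\u2473]*)")


def expand(text, lang="en"):
    """Replace every encoded sentence in text with its dictionary wording."""
    def replace(match):
        code = int(match.group(1))
        template = CATALOG.get(lang, {}).get(code)
        if template is None:
            return match.group(0)  # unknown sentence: leave the encoded form untouched
        params = {}
        for mark, value in PARAM.findall(match.group(2)):
            params["P%d" % (ord(mark) - 0x2460 + 1)] = value.strip()
        return template.format(**params)
    return PATTERN.sub(replace, text)


print(expand(START + "12001\u2460Rome" + END))
# It was a pleasure to welcome you to our stand at the recent exhibition in Rome.
```

A second language would be one more entry in CATALOG keyed by the same
sentence number.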

The idea is based upon the telegraphic codes of days gone by, as used, in
particular, on railway systems, except that this idea is for automated
computer translation of preset sentences with one or more parameters.  For
example, someone in, say, Japan, who does not speak English (or does not
speak it well enough to produce a professional quality translation) could
communicate over the internet with someone in England who does not speak
Japanese by using sentence C_C+12001 as above, provided that both sender and
recipient have a dictionary for the Comet Circumflex system in his or her
own language.  The system needs the sender to encode the document.  A
recipient could, with an automated system, simply read the message in his or
her own language.  However, it will hopefully be possible to have a computer
assisted encoding system whereby an end user may select sentences from topic
areas and an encoded document be produced.

In a computer system which does not have translation software installed, or
has it installed but only uses it when specifically requested, the message
would appear with a button at the start, provided that a font which carries
the characters is being used.  The message could then be translated, either
automatically if translation software with a local database of C_C sentences
in the local language is available, or manually from a dictionary of
sentences.  I expect that, whatever the potential for automation, to get
started translations will be done manually.  What languages will be used in
early experiments will depend largely on whether any people who are fluent
in a language other than English and can also translate from English into
that language will want to try the system out, and thus upon whatever those
languages happen to be.  Ultimately, if no one is interested, I can get some
translations done into a few languages by paying a professional bureau to do
the work for me.  However, the scope is there that the sentences could
potentially be translated into many languages, both major languages and
minority languages.

Although I am preparing the sentences in English, it would not be necessary
for either a sender or a recipient to know English, as, once the sentences
have been translated once into their respective languages, then the code
numbers could be used directly without using English in the sending and
receiving of the messages.

I have it in mind that I might author and publish, as shareware, a
collection of sentences which could be used in business communications,
hopefully gaining shareware royalties.  For example, sentences making an
enquiry about an item shown on someone's website, where the part number of
the item is a parameter of the sentence.  I am also interested in producing
a set of sentences which might be useful in a distance education context.  I
am thinking of producing a few sentences asking about and commenting about
the weather as a convenient way to experiment with a few sentences.  For
example, a sentence such as It is raining. would not have a parameter, a
sentence such as The temperature in this room is P1 degrees Celsius. would
have one parameter.

There would clearly need to be lots of sentences encoded.  However, I am
hoping that meaningful communication will be possible with a collection of
sentences which can be used with modern computing equipment.

By using the U+2604 U+0302 U+20E3 sequence the system can be used within an
email so that some special sentences are either translated manually or left
in the original language.  That, however, is only useful for one-to-one
correspondence, for general publication of learning material only encoded
sentences could be used, though that could, in conjunction with
illustrations be potentially useful for some purposes.

I am not envisaging doing any of the translation myself, as my linguistic
knowledge is insufficient for professional quality translation work.

Certainly, sentences for this Comet Circumflex system will need to be
carefully designed so as to cover the needs of business communication
without causing problems for a translation engine inserting parameters, so
parameters will need to be either 

Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-26 Thread Peter_Constable


On 09/26/2002 03:42:16 AM William Overington wrote:


On the one hand, you say

XML does not suit my specific need as far as I can tell.


But you also said

Documents with the code sequence are intended to be sent over the internet
as email, used as web pages and broadcast in multimedia broadcasts over a
direct broadcast satellite system, so the codes which you suggest would be
unsuitable.

One of the things that is especially useful about XML and related
technologies is the facility with which data can be repurposed. You have
one schema for marking up data, and stylesheets that transform it as needed
for different publishing / usage contexts.

Also, I don't see how it can be that a character sequence such as U+003C
U+0061 U+003E can't be useful to you when some ridiculous character
sequence like U+2604 U+0302 U+20E3 is.




- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]







Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-26 Thread Peter_Constable


On 09/26/2002 06:05:45 AM William Overington wrote:

I'm going to refrain from commenting on anything beyond the markup issues
-- and I'm continuing with that only because it's an easy follow-on to what
I already wrote, even though there is every indication that the sensibility
of it will be ignored.


A document would contain a sequence such as follows.

U+2604 U+0302 U+20E3 12001 U+2460 London U+2604 U+0302 U+20E2


You could just as easily have used

<S C="12001">London</S>

or

<S C="12001" P1="London"/>

which are only slightly more verbose, but which follow a widely-implemented
standard that can be parsed by lots of existing software, for which there
are a large number of tools available, and which a vast number of
individuals, businesses and other agencies have an interest in. Your markup
convention is completely proprietary, it has no existing software support,
and nobody but you has any interest in it. You tell me which one is more
likely to result in productive work and adoption by others.
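
To give a sense of the existing software support, here is a minimal Python
sketch that reads both forms with the standard library parser. It assumes
the two alternatives are written as well-formed XML with quoted attribute
values; the function name read_sentence is invented for the illustration.

```python
import xml.etree.ElementTree as ET


def read_sentence(xml_text):
    """Return (sentence code, first parameter) from either form of the S element."""
    element = ET.fromstring(xml_text)
    code = int(element.get("C"))
    # The parameter may be an attribute (second form) or the element content (first form).
    parameter = element.get("P1", element.text)
    return code, parameter


print(read_sentence('<S C="12001">London</S>'))     # (12001, 'London')
print(read_sentence('<S C="12001" P1="London"/>'))  # (12001, 'London')
```

No custom tokenizer is needed; the parser already deals with encodings,
escaping and nesting.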




that it is
because I am an inventor, interested in pushing the envelope as to what is
possible scientifically and technologically.

Perhaps there is an [EMAIL PROTECTED] list somewhere where you might
find greater interest in your ideas than here. None of us here mind
invention, but I think most would believe that inventiveness is most
productive when building off the advancement of others rather than
reinventing wheels or widgets. XML exists, and it works.

Beside the fact that your proposed markup convention is not a good idea, it
has nothing whatsoever to do with the development of Unicode. This
discussion really ought to be taken elsewhere.



- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]







Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-26 Thread Tex Texin

So that Peter's comments cannot be perceived as strictly Peter's view, I
am seconding them.
Message catalogs are not new. 
A proprietary coding system is a bad idea. 
XML is the way to go.
Failure to investigate the state of the art, (especially where google is
so effortless), means this idea is not pushing any envelope.

New ideas are welcome, but as Peter and others have already defined
several times where the envelope needs pushing (e.g. XML), and in
particular where they should not (private encodings, and hi level
application semantics assigned to particular code points), continued
attempts to do so are not welcome.

tex

[EMAIL PROTECTED] wrote:
...Your markup
 convention is completely proprietary, it has no existing software support,
 and nobody but you has any interest in it. You tell me which one is more
 likely to result in productive work and adoption by others.
 
... None of us here mind
 invention, but I think most would believe that inventiveness is most
 productive when building off the advancement of others rather than
 reinventing wheels or widgets. XML exists, and it works.
 
 Beside the fact that your proposed markup convention is not a good idea, it
 has nothing whatsoever to do with the development of Unicode. This
 discussion really ought to be taken elsewhere.
 
 - Peter

-- 
-
Tex Texin   cell: +1 781 789 1898   mailto:[EMAIL PROTECTED]
Xen Master  http://www.i18nGuy.com
 
XenCraft    http://www.XenCraft.com
Making e-Business Work Around the World
-




RE: XML Primer (was Keys. (derives from Re: Sequences of combining characters.))

2002-09-26 Thread Shawn Steele

Mr. Overington,

Peter didn't specifically mention that his suggestion is an example of XML, although 
he alluded to that fact.  As many people have mentioned before on this list, XML is a 
more appropriate mechanism for many of your inventions, and it is also a standard.

One of the neatest things about XML is that you can invent your own tags, as Peter's 
example did below.  Of course applications still must agree on the meanings of those 
tags, but your suggestion has the same limitation.

A big advantage of XML is that even when the tags are not understood, they can still 
be safely ignored without fear that other information is lost, garbled or otherwise 
mangled.

Some other examples of how your XML tags may have been chosen are:

<CometCircumflex SentenceCode="12001">London</CometCircumflex>

or 

<CometCircumflex SentenceCode="12001" Parameter1="London"/>

or

<CometCircumflex SentenceCode="12001" Parameter1="London">Thanks for visiting 
our stand in London.</CometCircumflex>

or

<CometCircumflex SentenceCode="12001">Thanks for visiting our <Parameter 
Number="1">London</Parameter> stand.</CometCircumflex>

Notice that in the last 2 examples an English string appears, so a reader without your 
translation system will still have understandable text if your XML tags are ignored 
(as most programs do when they don't understand XML.)

Also, even though English is provided in the last 2 strings, the other necessary 
information (Sentence=12001 and Parameter #1=London) is included for your translation 
algorithm.  The author chose to use slightly different text than your standard It was 
a pleasure to welcome you to our stand at the recent exhibition in P1.  That allows 
the author to make minor deviations to customize his text for native speakers, yet the 
author could still communicate with non-native speakers.
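
The fallback behaviour described here can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the markup is
taken to be well-formed XML with a nested Parameter element, the catalog is
a hypothetical one-entry dictionary standing in for a translated one, and
the names CATALOG, SAMPLE and render are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; the English standard sentence stands in for a translation.
CATALOG = {
    12001: "It was a pleasure to welcome you to our stand "
           "at the recent exhibition in {P1}.",
}

SAMPLE = ('<CometCircumflex SentenceCode="12001">Thanks for visiting our '
          '<Parameter Number="1">London</Parameter> stand.</CometCircumflex>')


def render(xml_text, catalog=None):
    """Use the catalog wording when available, else the author's embedded text."""
    element = ET.fromstring(xml_text)
    code = int(element.get("SentenceCode"))
    if catalog and code in catalog:
        params = {"P" + p.get("Number"): p.text for p in element.iter("Parameter")}
        return catalog[code].format(**params)
    # No catalog: ignore the tags and keep the text between them.
    return "".join(element.itertext())


print(render(SAMPLE))           # Thanks for visiting our London stand.
print(render(SAMPLE, CATALOG))
# It was a pleasure to welcome you to our stand at the recent exhibition in London.
```

A reader without the catalog still gets a readable sentence, while a reader
with it gets the standard wording in whatever language the catalog holds.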

I should also mention that your proposed system still has some limitations.  For 
example if the conference were in Cologne, Germany, a Deutsch speaker would expect the 
city name Köln instead.
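
A sketch of what localising the parameter itself might involve, in Python.
The table below is a hypothetical stand-in; a real system would need a much
larger, maintained source of place names.

```python
# Hypothetical table of local-language city names, keyed by the English name.
CITY_NAMES = {
    "Cologne": {"de": "Köln"},
    "London": {"fr": "Londres", "it": "Londra"},
}


def localise_city(name, lang):
    """Return the local-language name of a city, or the name unchanged if none is known."""
    return CITY_NAMES.get(name, {}).get(lang, name)


print(localise_city("Cologne", "de"))  # Köln
print(localise_city("Rome", "de"))     # Rome (no entry, so the English name leaks through)
```

The second call shows the limitation: any parameter missing from the table
reaches the reader untranslated.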

I hope that this example improves your understanding of XML and how it may be applied 
to your inventions.  As others have mentioned, this topic is digressing from the 
purpose of this message board and would be best discussed off line or in a different 
forum.

- Shawn

Shawn Steele
Software Developer Engineer
Microsoft

My comments in no way endorse the original and are not intended to confer legitimacy, 
rather they are merely intended to be educational.

This posting is provided AS IS with no warranties, and confers no rights.

-Original Message-
A document would contain a sequence such as follows.

U+2604 U+0302 U+20E3 12001 U+2460 London U+2604 U+0302 U+20E2


You could just as easily have used

<S C="12001">London</S>

or

<S C="12001" P1="London"/>





Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-26 Thread Kenneth Whistler

Peter responded:

 A document would contain a sequence such as follows.
 
 U+2604 U+0302 U+20E3 12001 U+2460 London U+2604 U+0302 U+20E2
 
 
 You could just as easily have used
 
 <S C="12001">London</S>
 
 or
 
 <S C="12001" P1="London"/>

or even:

<cometcircumflex messageId="12001">London</cometcircumflex>

if one likes the ring of comet circumflex for one's tags.

 which are only slightly more verbose, but which follow a widely-implemented
 standard 

namely, XML, which I think effectively gainsays William's earlier
comment:

 XML does not suit my specific need as far as I can tell.

And as far as the idea of having parameterized messages, with
translation catalogs, I would join the chorus inviting William
to investigate state of the art before attempting to invent
something that already exists in many forms.

Or, to further mangle Marco's musical metaphor, as you
go round and around on this topic, make sure that
you don't mix up the apples *for* the horses with the
horseapples *from* the horses.

--Ken ;-)




Keys. (derives from Re: Sequences of combining characters.)

2002-09-25 Thread William Overington

The recent discussion on sequences has led me to have a look through the
various combining characters and I have found the following.

U+20E3 COMBINING ENCLOSING KEYCAP

It has occurred to me that the use of a sequence of a base character, then
one or more combining characters so as to produce a sequence which would be
otherwise unlikely, followed by U+20E3 might be a very effective way to
include specialised markup systems within a plain text file without
disrupting the normal textual information conveying capabilities of a file.
An all-Unicode font would then produce a graphic representation of the key,
without any prior arrangement being necessary, so that such marked-up
sequences could be produced using just a regular all-Unicode plain text
editor.  A receiving program with a specialized plug-in could then decode
the markup, or it could be decoded manually in some cases.

For example, I am looking at using the following sequence so as to produce a
special purpose key within documents.

U+2604 U+0302 U+20E3

Hopefully that sequence will be so unlikely to occur other than in my
specialised application that the sequence can be used uniquely for that
specialised application.

I am also thinking in terms of using the following sequence to indicate the
end of the markup sequence.

U+2604 U+0302 U+20E2

I have it in mind that characters in the range U+2460 through to U+2473
could be used before parameters within the markup system.
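
As a rough sketch only, the scheme described above could be encoded and
decoded along the following lines. The function names, the choice of
Python, and the assumption that the code and parameters are written
with no separating spaces are all illustrative guesses, not part of any
existing system.

```python
import re

# Key sequences as described above.
# U+2604 COMET, U+0302 COMBINING CIRCUMFLEX ACCENT,
# U+20E3 COMBINING ENCLOSING KEYCAP, U+20E2 COMBINING ENCLOSING SCREEN.
START_KEY = "\u2604\u0302\u20E3"
END_KEY = "\u2604\u0302\u20E2"

# U+2460..U+2473 (CIRCLED DIGIT ONE .. CIRCLED NUMBER TWENTY)
# precede parameters 1 to 20.
PARAM_FIRST = 0x2460
PARAM_LAST = 0x2473

_MARKUP = re.compile(
    re.escape(START_KEY) + "(.*?)" + re.escape(END_KEY), re.DOTALL
)
_PARAM = re.compile("([\u2460-\u2473])")


def encode(code, params):
    """Build one markup sequence from a code and a list of parameters."""
    if len(params) > PARAM_LAST - PARAM_FIRST + 1:
        raise ValueError("at most 20 parameters are supported")
    parts = [START_KEY, code]
    for index, value in enumerate(params):
        parts.append(chr(PARAM_FIRST + index) + value)
    parts.append(END_KEY)
    return "".join(parts)


def decode(text):
    """Return (code, {parameter number: value}) for each sequence in text."""
    found = []
    for match in _MARKUP.finditer(text):
        # Splitting on a capturing group keeps the circled numbers,
        # so tokens alternate: code, marker, value, marker, value, ...
        tokens = _PARAM.split(match.group(1))
        params = {
            ord(tokens[i]) - PARAM_FIRST + 1: tokens[i + 1]
            for i in range(1, len(tokens), 2)
        }
        found.append((tokens[0], params))
    return found
```

Ordinary text around the sequences is left untouched by decode(), which
only picks out what lies between the start and end keys.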



Also, I have noticed that in the document U02D0.pdf that U+20E4 is shown, in
the listing, in magenta whereas U+20DF is shown in black.  Could someone say
what significance the magenta colouring in the document has please?  Is it
perhaps to indicate additions since the previous issue of the document?

William Overington

25 September 2002








Re: Keys. (derives from Re: Sequences of combining characters.)

2002-09-25 Thread Doug Ewell

William Overington WOverington at ngo dot globalnet dot co dot uk
wrote:

 Also, I have noticed that in the document U02D0.pdf

(actually U20D0.pdf)

 that U+20E4 is
 shown, in the listing, in magenta whereas U+20DF is shown in black.
 Could someone say what significance the magenta colouring in the
 document has please?  Is it perhaps to indicate additions since the
 previous issue of the document?

Since the previous release of Unicode.  The magenta characters are those
added in Unicode 3.2.  They were marked specially in the draft copies of
the code charts to indicate the changes (and probably to highlight the
fact that the assignments were still tentative), and left that way after
3.2 went live.  Whether this was intentional or not, I don't know.

-Doug Ewell
 Fullerton, California





RE: Keys. (derives from Re: Sequences of combining characters.)

2002-09-25 Thread Marco Cimarosti

William Overington wrote:
 The recent discussion on sequences has led me to have a look 
 through the various combining characters and I have found
 the following.
 
 U+20E3 COMBINING ENCLOSING KEYCAP
 
 It has occurred to me that the use of a sequence of a base 
 character, then one or more combining characters so as to
 produce a sequence which would be otherwise unlikely,
 followed by U+20E3 might be a very effective way to
 include specialised markup systems within a plain text
 file [...]

What the hell do key caps have to do with markup or text files!!??

Mr. Overington, why do you have this irresistible compulsion to mix up
apples and horses? (I feel that the usual apples and oranges is not enough
to convey the idea fully.)

Regards.

_ Marco