Re: [XeTeX] TeX in the modern World. (goes OT) Was: Re: Whitespace in input

2011-11-19 Thread Philip TAYLOR



bhutex wrote:


> I don't really understand why this discussion is taking place.
>
> Haven't you read the article in TUGboat? Don is planning to bring out a new
> TeX called iTeX*. Actually, the * is the sound of a bell: ding!!
>
> It may not be free, but it can handle all types of output formats, all
> languages, etc. Don presented this paper at the TUG meeting last year -
> because the TUGboat volume is a proceedings volume.


I have heard rumours that DEK has had to put work on ĭ-TeX
on hold while he addresses a more important (and previously
unsolved) problem in computer science.  Apparently, realising
that computerised embroidery is sadly deficient when compared
to the real thing, DEK has, it is rumoured, decided that
as a computer scientist it is his duty to write the ultimate,
definitive, computerised embroidery software.  It is alleged
that initially this will be solely for his, and Jill's,
private use, but as word of this remarkable tool leaks out
into academe, ever more would-be computer embroiderers are
expected to plead for a copy, and, if the rumours are true,
DEK is therefore likely to find himself distracted from his
ground-breaking work on ĭ-TeX as he responds to bug reports
(few) and feature requests (many) from his early adopters.

Philip Taylor



--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] TeX in the modern World. (goes OT) Was: Re: Whitespace in input

2011-11-18 Thread bhutex
I don't really understand why this discussion is taking place.

Haven't you read the article in TUGboat? Don is planning to bring out a
new TeX called iTeX*. Actually, the * is the sound of a bell: ding!!

It may not be free, but it can handle all types of output formats, all
languages, etc. Don presented this paper at the TUG meeting last year -
because the TUGboat volume is a proceedings volume.

D Venu Gopal

-- 
Happy TeXing
The BHU TeX Group
Can you see this?
It means your computer understands
Unicode. What are you waiting for? Write
your letters in Hindi.




Re: [XeTeX] TeX in the modern World. (goes OT) Was: Re: Whitespace in input

2011-11-18 Thread Adam Twardoch (List)

> Yet, it remains one of the most
> powerful and cheapest typesetting systems to date.
"Cheap" in terms of initial investment -- surely, as it's open-source
and free.

"Cheap" in terms of implementing -- not quite so, because you need to
format your sources in a very specific, "isolated" syntax.

I initially tried to implement TeX in some projects of my own, and
switched to Prince XML (http://princexml.com/).

I found it much easier to start with, as it takes HTML or XML + Unicode
+ CSS + SVG/bitmaps + OpenType fonts as input, executes JavaScript
during processing, has a rather high-quality, constantly improving
line-breaking algorithm, and produces reliable PDFs. Some aspects of it
are not quite as powerful as TeX's, but other aspects greatly surpass
TeX -- especially in terms of ease of use and quick implementation while
maintaining acceptably high quality.

So I ended up with Prince XML as my tool of choice because it natively
supports my "preferred input formats", i.e. the web formats. A
commercial server license costs 3800 USD, which may sound like a lot,
but I found it a fair price to pay for the comfort of being able to use
my content directly and without much debugging/converting/fine-tuning.

Best,
Adam

-- 

May success attend your efforts,
-- Adam Twardoch
(Remove "list." from e-mail address to contact me directly.)





Re: [XeTeX] TeX in the modern World. (goes OT) Was: Re: Whitespace in input

2011-11-18 Thread Herbert Schulz

On Nov 18, 2011, at 7:57 AM, Arthur Reutenauer wrote:

> On Fri, Nov 18, 2011 at 10:16:31AM +0100, Keith J. Schultz wrote in
> reply to Ross Moore:
>>  You are probably a little young to know this, but TeX's original output 
>> format was a dvi file.
> 
>  I think I'll have this one framed and sent to Ross for his next
> birthday.
> 
>   Arthur


Howdy,

I'll split the cost with you! :-)

Good Luck,

Herb Schulz
(herbs at wideopenwest dot com)








Re: [XeTeX] TeX in the modern World. (goes OT) Was: Re: Whitespace in input

2011-11-18 Thread Arthur Reutenauer
On Fri, Nov 18, 2011 at 10:16:31AM +0100, Keith J. Schultz wrote in
reply to Ross Moore:
>   You are probably a little young to know this, but TeX's original output 
> format was a dvi file.

  I think I'll have this one framed and sent to Ross for his next
birthday.

Arthur




Re: [XeTeX] TeX in the modern World. (goes OT) Was: Re: Whitespace in input

2011-11-18 Thread Zdenek Wagner
2011/11/18 Keith J. Schultz :
> Hi All,
> Sorry, I go OT here, but in order to debate it is necessary.
> Please forgive.

Hi all,
I agree with Keith, I have just a few comments.

> I have to side more with Philip.
> What most are forgetting is what (Xe)TeX is intended for.
> It is, for most, a typesetting program (you do mention this below).
> It was not designed to handle different languages or to truly
> do word processing in the modern sense.
> Due to the power of the TeX engine, it evolved to deal with different
> languages and newer output methods and encodings. The problem with TeX is
> that the basic engine has not been redesigned to handle these new
> developments well.
> The internals need to be completely revamped.
> Am 17.11.2011 um 20:36 schrieb Ross Moore:
>
> Hi Phil,
> On 17/11/2011, at 23:53, Philip TAYLOR  wrote:
>
> Keith J. Schultz wrote:
>
> You mention in a later post that you do consider a space as a printable
> character.
>
> This line should read as:
>
>  You mention in a later post that you consider a space as a
> non-printable character.
>
> No, I don't think of it as a "character" at all, when we are talking
> about typeset output (as opposed to ASCII (or Unicode) input).
>
> This is fine, when all that you require of your output is that it be visible
> on a printed page. But modern communication media go much beyond that.
> A machine needs to be able to tell where words and lines end, reflowing
> paragraphs when appropriate and able to produce a flat extraction of all the
> text, perhaps also with some indication of the purpose of that text (e.g. by
> structural tagging).
>
OK, tagged PDF is an option, but it is an optional feature; it is not
enforced. You can never be sure that the PDF you get as input will
be tagged. Even if spaces were stored as glyphs, the original structure
would be lost. I typeset documents where even a paragraph is originally
a nested structure of elements...

> I would agree with you, but TeX was not designed as a communications
> program; it was designed for creating printed media.
> Furthermore, it may be desirable in the Modern World to have every program's
> output used as input for another program.
> This ideal is utopian. If you need the output from one program (medium) to
> go to another, then you will need an intermediate program/filter
> in order to reformat/convert the differences. As with all types of
> communication, there will be structures missing/lacking in the other
> system. So a one-to-one conversion will not be possible. You will need to
> use some kind of heuristics or, in modern terms, intelligence.
>
> In short, what is output for one format should also be able to serve as
> input for another.
>
> This assertion is completely idealistic. Then again, it is true. It is
> possible, today, to design a system that goes from audio, to TeX, to printed
> documents, to audio again. Yet you will need a lot of effort, and most likely
> the results will be far from perfect. It is workable, though it requires
> considerable resources.
>
> Thus the space certainly does play the role of an output character - though
> the presence of a gap in the positioning of visible letters may serve this
> role in many, but not all, circumstances.
>
> This depends on what you are outputting. For a printed page that is consumed
> by a human it does not matter, because humans do not process space
> characters, just space, and they even
> at times ignore them completely, because it is irrelevant for their natural
> language processing.
> For computers, on the other hand, the use of a space character can be very
> relevant.
> In the early days of TeX and LaTeX I have known people to create their e-mail
> with TeX. So you can see TeX is capable of producing character-based
> output.
> Furthermore, TeX could be used to produce any form of character-based
> format as its output.
>
> Clearly
> it is a character on input, but unless it generates a glyph in the
> output stream (which TeX does not, for normal spaces) then it is not
> a character (/qua/ character) on output but rather a formatting
> instruction not dissimilar to (say) end-of-line.
>
> But a formatting instruction for one program cannot serve as reliable input
> for another.
> A heuristic is then needed, to attempt to infer that a programming
> instruction must have been used, and guess what kind of instruction it might
> have been. This is not 100% reliable, so is deprecated in modern methods of
> data storage and document formats.
>
> Are you not contradicting yourself here? See above.
>
> XML-based formats use tagging, rather than programming instructions. This is
> the modern way, which is used extensively for communicating data between
> different software systems.
>
> True, it is used for communicating data. Yet you are mistaken in
> thinking that it truly solves any of the problems involving different data
> types or content!
> You can get a parse tree of the data, yet if a program can not understan

[XeTeX] TeX in the modern World. (goes OT) Was: Re: Whitespace in input

2011-11-18 Thread Keith J. Schultz
Hi All,

Sorry, I go OT here, but in order to debate it is necessary.
Please forgive.

I have to side more with Philip.

What most are forgetting is what (Xe)TeX is intended for.
It is, for most, a typesetting program (you do mention this below).
It was not designed to handle different languages or to truly
do word processing in the modern sense.

Due to the power of the TeX engine, it evolved to deal with different languages
and newer output methods and encodings. The problem with TeX is that the basic
engine has not been redesigned to handle these new developments well.
The internals need to be completely revamped.

Am 17.11.2011 um 20:36 schrieb Ross Moore:

> Hi Phil,
> 
> On 17/11/2011, at 23:53, Philip TAYLOR  wrote:
> 
>> Keith J. Schultz wrote:
>>>> You mention in a later post that you do consider a space as a printable
>>>> character.
>>> This line should read as:
>>>   You mention in a later post that you consider a space as a
>>> non-printable character.
>> 
>> No, I don't think of it as a "character" at all, when we are talking
>> about typeset output (as opposed to ASCII (or Unicode) input).  
> 
> This is fine, when all that you require of your output is that it be visible
> on a printed page. But modern communication media go much beyond that.
> A machine needs to be able to tell where words and lines end, reflowing 
> paragraphs when appropriate and able to produce a flat extraction of all the 
> text, perhaps also with some indication of the purpose of that text (e.g. by 
> structural tagging).
I would agree with you, but TeX was not designed as a communications
program; it was designed for creating printed media.
Furthermore, it may be desirable in the Modern World to have every
program's output used as input for another program.
This ideal is utopian. If you need the output from one program (medium) to
go to another, then you will need an intermediate program/filter
in order to reformat/convert the differences. As with all types of
communication, there will be structures missing/lacking in the other
system. So a one-to-one conversion will not be possible. You will need
to use some kind of heuristics or, in modern terms, intelligence.
> 
> In short, what is output for one format should also be able to serve as input 
> for another.
This assertion is completely idealistic. Then again, it is true. It is
possible, today, to design a system that goes from audio, to TeX, to printed
documents, to audio again. Yet you will need a lot of effort, and most likely
the results will be far from perfect. It is workable, though it requires
considerable resources.
> 
> Thus the space certainly does play the role of an output character – though 
> the presence of a gap in the positioning of visible letters may serve this 
> role in many, but not all, circumstances.
This depends on what you are outputting. For a printed page that is
consumed by a human it does not matter, because humans do not process space
characters, just space, and they even
at times ignore them completely, because it is irrelevant for their
natural language processing.
For computers, on the other hand, the use of a space character can be
very relevant.

In the early days of TeX and LaTeX I have known people to create their
e-mail with TeX. So you can see TeX is capable of producing character-based
output.
Furthermore, TeX could be used to produce any form of character-based
format as its output.
> 
>> Clearly
>> it is a character on input, but unless it generates a glyph in the
>> output stream (which TeX does not, for normal spaces) then it is not
>> a character (/qua/ character) on output but rather a formatting
>> instruction not dissimilar to (say) end-of-line.
> 
> But a formatting instruction for one program cannot serve as reliable input 
> for another.
> A heuristic is then needed, to attempt to infer that a programming 
> instruction must have been used, and guess what kind of instruction it might 
> have been. This is not 100% reliable, so is deprecated in modern methods of 
> data storage and document formats.
Are you not contradicting yourself here? See above.
> XML-based formats use tagging, rather than programming instructions. This is
> the modern way, which is used extensively for communicating data between 
> different software systems.
True, it is used for communicating data. Yet you are mistaken in
thinking that it truly solves any of the problems involving different data types
or content!
You can get a parse tree of the data, yet if a program cannot
understand or process the data/content it is useless.
Agreed, the XML file contains information about its structure and is
human readable, yet it does NOTHING to convert from one format to another.
You still need a parser/filter to
convert into another format.
Do not forget you can put pract