Re: [XeTeX] centering using geometry package

2011-11-19 Thread Kevin Godby
On Sat, Nov 19, 2011 at 10:39 PM, Daniel Greenhoe  wrote:
> But one thing that concerns me is that there
> is an extra vertical line that appears about 2.5mm to the right of the
> text body frame box. Can somebody tell me, what is that line? Can I
> eliminate it somehow? Here is a somewhat minimal example:

The vertical line on the far right shows where the marginpar area
starts: it is the left margin of the \marginpar notes.
The distance between the right edge of your box and that vertical line
is \marginparsep.  The width of the marginpar area (i.e., the width of
the margin notes) is \marginparwidth.

You won't need to worry about either of those values as long as you're
not using \marginpars on that page.
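
If the extra rule bothers you even with showframe on, one untested sketch is to collapse the marginpar area with geometry's own keys, so that its rule coincides with the text edge (the zero values are an illustrative assumption, not taken from the original poster's setup):

```latex
% Preamble fragment: keep showframe but zero out the marginpar area.
\usepackage{geometry}
\geometry{
  showframe,
  marginparwidth=0pt, % width of the margin-note column
  marginparsep=0pt    % gap between the text body and the margin notes
}
```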

--Kevin Godby


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] centering using geometry package

2011-11-19 Thread Daniel Greenhoe
On Sun, Nov 20, 2011 at 2:57 PM, Axel E. Retif  wrote:
> It's the showframe option.
> There is also a very thin horizontal line at the top of the page.
> Commenting out showframe, both disappear.

But I want the showframe option. In particular, I want the geometry
package to put a frame around the text area. It is a kind of check to
see whether my understanding of where the text area should be matches
the geometry package's understanding of where it should be.

Dan



Re: [XeTeX] centering using geometry package

2011-11-19 Thread Axel E. Retif

On 11/19/2011 10:39 PM, Daniel Greenhoe wrote:

> But one thing that concerns me is that there
> is an extra vertical line that appears about 2.5mm to the right of the
> text body frame box. Can somebody tell me, what is that line? Can I
> eliminate it somehow?



It's the showframe option. There is also a very thin horizontal line at 
the top of the page. Commenting out showframe, both disappear.


Best

Axel





[XeTeX] centering using geometry package

2011-11-19 Thread Daniel Greenhoe
I am using pstricks to produce a book cover. Before sending it off to
the print house, I want it "exactly" (or with a very tight tolerance,
anyway) centered on an A3-sized page. To help with that, I use the
geometry package. In an effort to check whether everything is really
centered, I use the "showframe" option. I have reason to believe it
may be working correctly. But one thing that concerns me is that there
is an extra vertical line that appears about 2.5mm to the right of the
text body frame box. Can somebody tell me what that line is? Can I
eliminate it somehow? Here is a somewhat minimal example:

\documentclass{book}
\setlength{\parskip}{0mm}%
\setlength{\parindent}{0mm}%
\usepackage{geometry}
\usepackage{pstricks}
\usepackage{pstricks-add}
\geometry{
  xetex,
  paper=a3paper,landscape,
  centering,twoside=false,
  ignoreall,
  textheight=284mm,textwidth=400mm,
  truedimen,
  showframe
  }
\begin{document}%
\psset{unit=1mm}%
\begin{pspicture}(-200,-142)(200,142)%
  \psframe[fillstyle=none,linestyle=dotted,linecolor=blue](-200,-142)(200,142)%
\end{pspicture}%
\end{document}%

Many thanks in advance,
Dan




Re: [XeTeX] cmyk encoded files

2011-11-19 Thread Daniel Greenhoe
2011/11/20 Zdenek Wagner :
> Printed colour samples are commercially available.
> They are printed on different types of papers and CMYK values are given.

Is there any such thing available in book form? That is, could you
make a recommendation? Here in Taiwan, there is something commonly
sold called Pantone彩色聖經 (Pantone Cai3Se4 Sheng4Jing1 = Pantone Color
Bible). I did finally locate one in a bookstore yesterday, but it was
sealed up and I wasn't allowed to open it without buying it.

Dan





Re: [XeTeX] Whitespace in input

2011-11-19 Thread Chris Travers
On Sat, Nov 19, 2011 at 5:19 AM, Keith J. Schultz  wrote:
> OUCH! I have been hit by a veteran truck drivers truck. ;-))
>
> I concede!
>
> I am curious if many still know what a XX-bit word is. Is that term even 
> still used?

It will fade out of use until someone decides we need 128-bit words,
and then it will pop up again ;-)

Best Wishes,
Chris Travers




Re: [XeTeX] cmyk encoded files

2011-11-19 Thread Zdenek Wagner
2011/11/20 Daniel Greenhoe :
> 2011/11/20 Zdenek Wagner :
>> No.
>
>> LCMS is a good choice.
> LCMS is "Little Color Management System"?
> (http://www.color.org/opensource.xalter)?
>
Yes.

>> 1. It ensures that the colours you specify in the document will be converted 
>> to cmyk.
>> However, the corrections are wrong.
>> 2. xcolor does not look into inserted graphics,...
>
> But what if I hand define all my colors using cmyk syntax like this for 
> example
>     \definecolor{magenta}{cmyk}{0,1,0,0}
> and create all my graphics using pstricks and related packages (with
> no inserted graphics)?
> Then won't the resulting pdf be cmyk compliant and contain exactly the
> colors I defined?
>
That's what I do. Printed colour samples are commercially available.
They are printed on different types of paper, and CMYK values are
given. Thus you select the required colour on the proper paper and use
it. Sometimes I select the colour in gimp and then use LCMS to convert
the values from RGB to CMYK. Scanned images are also easy: I keep them
as TIFF, use LCMS to convert them to CMYK, and then tiff2pdf to produce
a PDF that can be included with \includegraphics.



-- 
Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz





Re: [XeTeX] cmyk encoded files

2011-11-19 Thread Daniel Greenhoe
2011/11/20 Zdenek Wagner :
> No.

> LCMS is a good choice.
LCMS is "Little Color Management System"?
(http://www.color.org/opensource.xalter)?

> 1. It ensures that the colours you specify in the document will be converted 
> to cmyk.
> However, the corrections are wrong.
> 2. xcolor does not look into inserted graphics,...

But what if I hand define all my colors using cmyk syntax like this for example
 \definecolor{magenta}{cmyk}{0,1,0,0}
and create all my graphics using pstricks and related packages (with
no inserted graphics)?
Then won't the resulting pdf be cmyk compliant and contain exactly the
colors I defined?

Dan






Re: [XeTeX] cmyk encoded files

2011-11-19 Thread Daniel Greenhoe
On Sun, Nov 20, 2011 at 7:34 AM, Peter Dyballa  wrote:
> It seems so!
> XeTeX/XeLaTeX can be invoked with --no-pdf.
> The created XDV file gives hints that CMYK is used (color push cmyk <4 
> values>).

That is good news. And that was a clever method for checking. I did
not think of trying that myself.

Thank you!
Dan



Re: [XeTeX] cmyk encoded files

2011-11-19 Thread Zdenek Wagner
2011/11/19 Daniel Greenhoe :
> Print shops often require pdf files containing color to be encoded
> using CMYK colorspace values.
>
> Version 2.11 of the xcolor package says that cmyk is "supported by
> Postscripts directly" (page 8). So if I simply specify
>  \usepackage[cmyk]{xcolor}
> in the preamble and compile with XeTeX/XeLaTeX, is that sufficient to
> ensure the resulting pdf is cmyk encoded?
>
No.

1. It ensures that the colours you specify in the document will be
converted to cmyk. However, the conversions are not accurate. If you wish
to convert the colours properly, you have to use colour profiles. LCMS is
a good choice. Useful ICC profiles come with different products such as
Adobe Reader, colour printers, scanners, etc. They can also be
downloaded from the web. Calculations in the xcolor package can only
be used if you are satisfied with approximate colours. It is written
in the documentation that the conversions are device dependent.

2. xcolor does not look into inserted graphics; you have to convert
your images to cmyk separately. Again, LCMS is a good tool for this
purpose.

> Secondly, is there any free utility available for checking the
> colorspace encoding of pdf files (maybe similar to foolab's pdffonts
> for checking embedded fonts)?
>
I have not found any. Since I produce PDF files for printing very
often, I calculated that commercial Adobe Acrobat is cheaper than the
risk of paying for unusable books, so I have bought it.
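
For hand-specified colours, the cmyk model in xcolor takes the four components directly. A minimal sketch (the colour name and values here are illustrative, not taken from the thread; in practice take the values from a printed colour sample or an LCMS profile conversion):

```latex
\documentclass{article}
\usepackage[cmyk]{xcolor}
% Illustrative values -- choose them from a printed sample book or an
% LCMS conversion, not by eye from an RGB screen.
\definecolor{coverblue}{cmyk}{1,0.45,0,0.14}
\begin{document}
{\color{coverblue}This text uses a hand-specified CMYK colour.}
\end{document}
```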




-- 
Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz





Re: [XeTeX] cmyk encoded files

2011-11-19 Thread Peter Dyballa

Am 19.11.2011 um 23:03 schrieb Daniel Greenhoe:

> Version 2.11 of the xcolor package says that cmyk is "supported by
> Postscripts directly" (page 8). So if I simply specify
>  \usepackage[cmyk]{xcolor}
> in the preamble and compile with XeTeX/XeLaTeX, is that sufficient to
> ensure the resulting pdf is cmyk encoded?

It seems so! XeTeX/XeLaTeX can be invoked with --no-pdf. The created XDV file
gives hints that CMYK is used (color push cmyk <4 values>). pdftops from the
xpdf suite produces a PS file which also gives hints that colour is used in the
CMYK model. Mac OS X's sips says it uses the RGB model...

--
Greetings

  Pete

This is a signature virus.  Add me to your signature and help me to live!






[XeTeX] cmyk encoded files

2011-11-19 Thread Daniel Greenhoe
Print shops often require pdf files containing color to be encoded
using CMYK colorspace values.

Version 2.11 of the xcolor package says that cmyk is "supported by
Postscripts directly" (page 8). So if I simply specify
  \usepackage[cmyk]{xcolor}
in the preamble and compile with XeTeX/XeLaTeX, is that sufficient to
ensure the resulting pdf is cmyk encoded?

Secondly, is there any free utility available for checking the
colorspace encoding of pdf files (maybe similar to foolab's pdffonts
for checking embedded fonts)?

Many thanks in advance,
Dan




Re: [XeTeX] Whitespace in input

2011-11-19 Thread Zdenek Wagner
2011/11/19 Pander :
> On 2011-11-19 14:25, Keith J. Schultz wrote:
>
> Perhaps this can be of use:
>  https://github.com/wspr/fontspec/issues/121
>
As Khaled wrote, it belongs to the engine. ZWJ and ZWNJ are used in
Indic scripts, and they have worked fine since I started using XeTeX in 2008.




-- 
Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz





Re: [XeTeX] Whitespace in input

2011-11-19 Thread Pander
On 2011-11-19 14:25, Keith J. Schultz wrote:

Perhaps this can be of use:
  https://github.com/wspr/fontspec/issues/121






Re: [XeTeX] Whitespace in input

2011-11-19 Thread Keith J. Schultz

Am 19.11.2011 um 13:51 schrieb Zdenek Wagner:

> 2011/11/19 Keith J. Schultz :
> 
>> As for getting junk when copying unicode, just copy between two texts
>> using different fonts, where one font does
>> not contain the glyph.
>> 
> When performing copy&paste or text search in PDF, I am not interested
> in glyphs but in characters. I do not care what glyphs will be
> displayed. If I copy the text to OpenOffice, I can change the font
> later and if the codepoints were transferred correctly, I will see the
As you say if transferred correctly!

> text (it was true even with OpenOffice 1.x, I tried many years ago).
> If I copy the text to gedit, ontconfig will automatically find a font
> for displaying the characters not present in the current font. I still
> have to read the fontconfig manual in order to find how to configure
> its searching algorithm. Arabic fonts may be a problem especially if
> you wish to use Arabic, Persian and Urdu. Now I know that I have to
> force fontonfic to select automatically SIL Scheherezade because it
> contains all characters. I can thus use both U+0643 and U+06A. When
> writing Akbar, I can write it both in Arabic and in Urdu/Farsi

[snip, snip]

>>   The only advice I can give is choose your tools wisely.
>> 

regards
Keith.






Re: [XeTeX] Whitespace in input

2011-11-19 Thread Keith J. Schultz
OUCH! I have been hit by a veteran truck driver's truck. ;-))

I concede!

I am curious whether many people still know what an XX-bit word is. Is that
term even still used?

In turn, Unicode needs to be cleaned up; it has become too fragmented.

regards
Keith.

Am 19.11.2011 um 09:39 schrieb Philip TAYLOR:

> 
> 
> Keith J. Schultz wrote:
> 
>>  I do not think anybody disputes the fact that characters are not glyphs.
>> 
>>  The confusion arises that a character in CS is well defined and has a 
>> history.
>>  To be more exact it is just one byte in size so that there can be only 
>> 256 characters.
> 
> Sorry, Keith, this is patently untrue.  Replace "is" by "was once" and
> you get a little closer to the truth, but you still completely ignore
> issues such as the difference between (say) EBCDIC and ASCII.  CDC machines
> used a 60-bit word, and one character was six bits, not eight.  And before
> the advent of the extended character set, a character consisted of seven
> bits plus a parity bit, thus yielding at most 128 characters of which
> 32 were reserved for control functions.
>   
>>  The average user considers a glyph to be the same as a "letter" and 
>> thereby a character.
> 
> It is rarely safe to believe that one knows what the average user thinks ...
> 
>>  Now, in order to process the glyphs with a computer it must be 
>> decomposed back to unicode.
> 
> But one rarely, if ever, "processes glyphs"; the glyphs are the end result,
> not the input.  Glyph processing does become necessary in languages such
> as Arabic, where context has a major impact on the way in which the
> individual glyphs are presented, but in Western languages the nearest we
> get to "glyph processing" is in the formation of ligature digraphs.
> 
>>  How well this is done depends of the system its self. If the system is 
>> not fully unicode aware and
>>  implements in properly then there will be problems. What adds to the 
>> complexity of the problem is that
>>  not all fonts used for displaying unicode contain all code points, 
>> Thereby, creating your many to many
>>  decomposition.
>> 
>>  As for getting junk when copying unicode, just copy between to text 
>> using different fonts, where one font does
>>  not contain the glyph.
>> 
>>  The only true way to master this problem is if the computer world would 
>> go completely full unicode with
>>  fonts support the full unicode code set!
> 
> I personally hope that this does not happen, and that before then
> we have an "Omnicode consortium" to review the mistakes of Unicode
> and to address them in a future, more orthogonal, more consistent,
> specification.
> 
> Philip Taylor
> 
> 
> --
> Subscriptions, Archive, and List information, etc.:
> http://tug.org/mailman/listinfo/xetex






Re: [XeTeX] Whitespace in input

2011-11-19 Thread Karljurgen Feuerherm



Karljürgen G. Feuerherm, PhD
Undergraduate Advisor
Department of Archaeology and Classical Studies
Wilfrid Laurier University
75 University Avenue West
Waterloo, Ontario N2L 3C5
Tel. (519) 884-1970 x3193
Fax (519) 883-0991 (ATTN Arch. & Classics)




>>> On Sat, Nov 19, 2011 at  3:39 AM, in message <4ec76b33.2060...@rhul.ac.uk>,
Philip TAYLOR  wrote: 
 
> I personally hope that this does not happen, and that before then
> we have an "Omnicode consortium" to review the mistakes of Unicode
> and to address them in a future, more orthogonal, more consistent,
> specification.

Hear, hear! (is that the right spelling?)

Wisdom is of course 20/20 hindsight--and the Omnicodists will make their own 
mistakes... it's inevitable. But still, one should try.

K






[XeTeX] Free Erler Dingbats Unicode font

2011-11-19 Thread R (Chandra) Chandrasekhar

Dear Folks,

I have just received an email about a free Unicode font, Erler Dingbats.
The web page at


http://ffdingbatsfont.com/erler/index.html

states, inter alia,

 Quote 

For the first time in the entire history of Unicode standard, the full 
encoding range for dingbats (U + 2700 – U + 27BF) is now covered by a 
complete, contemporary quality font. Erler Dingbats is a spin-off of the 
distinguished FF Dingbats 2.0 family, and was designed as a special 
collaboration between designers Johannes Erler and Henning Skibbe.


---

To help support and encourage the use of the Unicode standard, publisher 
FSI FontShop International is giving away the Erler Dingbats completely 
free of charge.


---

FSI invites not only private users, but also OS developers, to include 
the Erler Dingbats in their font collection.


 Unquote 

Perhaps these fonts could be made available to the TeX community as 
suggested above. Hence this email.
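
For XeLaTeX users, accessing the dingbat range should only need fontspec once the font is installed. A sketch, assuming the installed font reports the family name "Erler Dingbats" (the name is an assumption; use whatever name the font actually reports on your system):

```latex
\documentclass{article}
\usepackage{fontspec}
% "Erler Dingbats" is an assumed family name.
\newfontfamily\dingbats{Erler Dingbats}
\begin{document}
{\dingbats \char"2702\ \char"2708}% U+2702 scissors, U+2708 airplane
\end{document}
```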


Chandra




Re: [XeTeX] Whitespace in input

2011-11-19 Thread Zdenek Wagner
2011/11/19 Ulrike Fischer :
> Am Sat, 19 Nov 2011 00:30:58 +0100 schrieb Zdenek Wagner:
>
>>> /ActualText is your friend here.
>>> You tag the content and provide the string that you want to appear
>>> with Copy/Paste as the value associated to a dictionary key.
>
>> I do not know whether the PDF specification has evolved since I read
>> it the last time. /ActualText allows only single-byte characters, ie
>> those with codes between 0 and 255, not arbitrary Unicode characters.
>
> This here works fine with pdflatex + xetex:
>
Thank you, the package looks useful.

> \documentclass{article}
> \usepackage{accsupp}
> \begin{document}
> \BeginAccSupp{method=hex,unicode,ActualText=20AC}%
>  Euro%
> \EndAccSupp{}%
>
> \BeginAccSupp{method=hex,unicode,ActualText=03B1}%
>  alpha%
> \EndAccSupp{}%
> \end{document}
>



-- 
Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz





Re: [XeTeX] Whitespace in input

2011-11-19 Thread Ulrike Fischer
Am Sat, 19 Nov 2011 00:30:58 +0100 schrieb Zdenek Wagner:

>> /ActualText is your friend here.
>> You tag the content and provide the string that you want to appear
>> with Copy/Paste as the value associated to a dictionary key.

> I do not know whether the PDF specification has evolved since I read
> it the last time. /ActualText allows only single-byte characters, ie
> those with codes between 0 and 255, not arbitrary Unicode characters.

This here works fine with pdflatex + xetex:

\documentclass{article}
\usepackage{accsupp}
\begin{document}
\BeginAccSupp{method=hex,unicode,ActualText=20AC}%
 Euro%
\EndAccSupp{}%

\BeginAccSupp{method=hex,unicode,ActualText=03B1}%
 alpha%
\EndAccSupp{}%
\end{document}

-- 
Ulrike Fischer 





Re: [XeTeX] Whitespace in input

2011-11-19 Thread Zdenek Wagner
2011/11/19 Keith J. Schultz :
> Hi Zdenek,
>
>        I do not think anybody disputes the fact that characters are not
> glyphs.
>
>        The confusion arises because a character in CS is well defined and
> has a history. To be more exact, it is just one byte in size, so there can
> be only 256 characters.
>
>        Unicode has changed all this, and we have a Unicode character which
> is of different sizes depending on the Unicode encoding used.
>
>        It gets even hairier, as in Unicode several Unicode characters can
> be combined (composed). The result to be output is known as a glyph!
>
>        The average user considers a glyph to be the same as a "letter" and
> thereby a character.
>
>        Now, in order to process the glyphs with a computer, they must be
> decomposed back to Unicode. How well this is done depends on the system
> itself. If the system is not fully Unicode aware and implements it
> improperly, then there will be problems. What adds to the complexity of
> the problem is that not all fonts used for displaying Unicode contain all
> code points, thereby creating your many-to-many decomposition.
>
No, conversion of a sequence of glyphs to a sequence of Unicode
codepoints has little to do with fonts. The position of the RU ligature
in the font may differ, but it is handled easily by the toUnicode map.
The conjunct STA may also occupy different positions in different fonts,
but it can always be printed using two glyphs, half-SA + TA. In general,
the half forms should be decoded as the full form followed by VIRAMA.
This makes the toUnicode table smaller and leads to correct results.
The only problem is correct ordering of a few characters.
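The "full form followed by VIRAMA" decoding described here amounts to a small 1:many table, in the spirit of a PDF toUnicode CMap. A minimal sketch (the glyph names are invented for illustration; real fonts use their own names):

```python
# Illustrative glyph-name -> character-sequence table (1:1 and 1:many
# mappings, like a toUnicode CMap). Glyph names are hypothetical.
TO_UNICODE = {
    "ka":      "\u0915",              # KA
    "ta":      "\u0924",              # TA
    "ru_lig":  "\u0930\u0941",        # RU ligature -> RA + U matra
    "tra_lig": "\u0924\u094D\u0930",  # TRA -> TA + VIRAMA + RA
    "sa_half": "\u0938\u094D",        # half SA -> full SA + VIRAMA
}

def glyphs_to_text(glyphs):
    # expand each glyph to its character sequence
    return "".join(TO_UNICODE[g] for g in glyphs)

# half-SA + TA renders the conjunct STA; it decodes to SA+VIRAMA+TA
print(glyphs_to_text(["sa_half", "ta"]))
```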

>        As for getting junk when copying Unicode, just copy between two
> texts using different fonts, where one font does not contain the glyph.
>
When performing copy&paste or text search in PDF, I am not interested
in glyphs but in characters. I do not care what glyphs will be
displayed. If I copy the text to OpenOffice, I can change the font
later and if the codepoints were transferred correctly, I will see the
text (it was true even with OpenOffice 1.x, I tried many years ago).
If I copy the text to gedit, fontconfig will automatically find a font
for displaying the characters not present in the current font. I still
have to read the fontconfig manual in order to find how to configure
its searching algorithm. Arabic fonts may be a problem especially if
you wish to use Arabic, Persian and Urdu. Now I know that I have to
force fontconfig to automatically select SIL Scheherazade because it
contains all the characters. I can thus use both U+0643 and U+06A9. When
writing Akbar, I can write it both in Arabic and in Urdu/Farsi.

>        The only true way to master this problem is if the computer world
> would go completely full Unicode, with fonts supporting the full Unicode
> code set!
>
>        That is impractical for the time being.
>
fontconfig currently has the solution and usually works out of the box.
>        The only advice I can give is: choose your tools wisely.
>
>        regards
>                Keith.
>
> Am 18.11.2011 um 23:51 schrieb Zdenek Wagner:
>
>> 2011/11/18 maxwell :
>>> On Fri, 18 Nov 2011 13:52:56 +0100, Zdenek Wagner
>>> 
>>> wrote:
 2011/11/18 Philip TAYLOR :
> Is it safe to assume that these "code listings"
> are restricted to the ASCII character set ?  If
> so, yes, spaces are likely to be a problem, but
> if the code listing can also include ligature-
> digraphs, then these are likely to prove even
> more problematic.
>
 If the code listing is typeset in a fixed width font, it is usually no
 problem. I copied a few code samples from books in PDF, most of them
 were typeset by TeX. If I want to copy text in Devanagari, it is
 almost impossible.
>>>
>>> Besides TeX, Dr. Knuth also invented Literate Programming.  In our own
>>> project, we use LP to extract the code listings from the original source
>>> code, rather than from the PDF.  One advantage is that in addition to the
>>> re-ordering at the character level (mentioned in part of Zdenek's email
>>> that I didn't copy over), this allows re-ordering at any arbitrary level,
>>
>> This is a demonstration that glyphs are not the same as characters. I
>> will start with a simpler case and will not put Devanagari into the
>> mail message. If you wish to write a syllable RU, you have to add a
>> dependent vowel (matra) U to a consonant RA. There is a ligature RU,
>> so in PDF you will not see RA consonant with U matra but a RU glyph.
>> Similarly, TRA is a single glyph representing the following
>> characters: TA+VIRAMA+RA. The toUnicode map supports 1:1 and 1:many
>> mappings thus it is possible to handle these cases when copying text
>> from a PDF or when searching. A more difficult case is the I matra (short
>> dependent vowel I). As a character it must always follow a consonant

Re: [XeTeX] Whitespace in input

2011-11-19 Thread Zdenek Wagner
2011/11/19 Ross Moore :
> Hi Zdenek,
>
> On 19/11/2011, at 10:30 AM, Zdenek Wagner wrote:
>
>>> /ActualText is your friend here.
>>> You tag the content and provide the string that you want to appear
>>> with Copy/Paste as the value associated to a dictionary key.
>>>
>> I do not know whether the PDF specification has evolved since I read
>> it the last time. /ActualText allows only single-byte characters, ie
>> those with codes between 0 and 255, not arbitrary Unicode characters.
>
> That is most certainly not true.
> You code up UTF-16BE as Hex strings.
>
> Here is a snippet of the (tagged-pdfLaTeX) source coding from
> the main example that I showed in my  TUG2011 talk.
> The URL for the video of the talk is given in several of my previous emails:
>
Thank you for the sample. I will try again when I have more time.
Maybe there is a stupid bug in my old code. As a matter of fact, when
playing with /ActualText I knew much less than now.

    \SMC attr{/ActualText\TPDFaloud{1D44F}} noendtext 254 
 {mi}%
  b%
    _{\noEMC%
   \TPDFsub
    \SMC attr{/ActualText\TPDFaloud{1D458}} noendtext 255 
 {mi}%
  k%
    \EMC
  }^{\EMC
    \SMC attr{/ActualText( )} noendtext 256 {Span}%
  \pdffakespace
    \EMC
  }%
    \TPDFpopbrack
    \SMC attr{/ActualText\TPDFaloud{0029}} noendtext 257 {mo}%
  \Bigr)%
>
>
> Inside the resulting PDF, this content looks like:
>
 1 0 0 1 4.902 2.463 cm
 /mi >BDC
 BT
 /F11 9.9626 Tf
  [(b)]TJ
 ET
 EMC
 1 0 0 1 4.276 4.114 cm
 /Span <>>> >>BDC
 BT
 /F103 1 Tf
  [( )]TJ
 ET
 EMC
 1 0 0 1 0 -6.577 cm
 /mi >BDC
 BT
 /F10 6.9738 Tf
  [(k)]TJ
 ET
 EMC
 1 0 0 1 4.901 2.463 cm
 /mo <>>> >>BDC
>
>
> The full PDF passes all of Adobe's validation tests for
> correct PDF syntax, Accessible Content, PDF/A-1b compliance.
>
> More particularly:
>
>  /mi   >>BDC
>  BT
>  /F11 9.9626 Tf
>   [(b)]TJ
>  ET
>  EMC
>
> expresses a math-italic 'b' as :
>
>  1.  the glyph in the position of letter 'b' (in CMMI10  font);
>
>  2.  to be spoken aloud as  " , b , "  where commas indicate a slight pause
>
>  3.  to Copy/Paste as the surrogate pair  U+D835 U+DC4F,
>      equivalent to the Plane-1 math-italic character 'b'.
>
> The /MCID key is necessary for tagged PDF, but the /Alt and /ActualText
> should work independently of full tagging.
> The '/mi' is immaterial; it could equally well be  '/Span'.
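The surrogate pair for U+1D44F that Ross mentions can be checked with the standard UTF-16 arithmetic. A quick sketch (plain Python, nothing from the thread assumed):

```python
def surrogate_pair(cp):
    """Split a supplementary-plane codepoint into its UTF-16 surrogates."""
    assert cp > 0xFFFF, "BMP codepoints need no surrogates"
    cp -= 0x10000
    return 0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)

# U+1D44F MATHEMATICAL ITALIC SMALL B, as used in the /ActualText above
hi, lo = surrogate_pair(0x1D44F)
print(f"{hi:04X} {lo:04X}")  # D835 DC4F
```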
>
>
>> /ActualText is demonstrated on German hyphenated words such as Zucker
>> which is hyphenated as Zuk- ker. I have tried to put /ActualText
>> manually via a special, I could see it in the PDF file but it did not
>> work.
>
> Yes, because it is quite important to position the tagging pieces
> correctly within the PDF content stream. It has to balance correctly
> with BT ... ET  and the BDC ... EMC  operator pairs, and there may
> be other subtle requirements.
>
> Certainly it cannot be done with just a single \special .
> There needs to be stuff both before and after the content
> that causes actual glyphs to be displayed.
>
>
> Just using \pdfliteral  is not sufficient with pdfTeX; we needed
> a special modification that allowed the  /mi <<...>>BDC
> and  EMC to fit snugly around the  BT ... ET .
>
> There could be a similar problem with XeTeX's
>     \special{pdf:literal ... }
> (or whatever is the syntax).
> This is the issue that I was trying to discuss with JK in 2009 or 2010.
>
>
>>
>> When converting a white space to a space character some [complex]
>> heuristics is needed while proper conversion of glyphs to characters
>> of Indic scripts require just a few strict rules. The ligatures as TRA
>> have to appear in the toUnicode map, otherwise its meaning will be
>> unclear. If you see the I-matra, go to the last consonant in the
>> sequence and put the I-matra character there. If you see the RA glyph
>> at the right edge of a syllable, go back to the leftmost consonant in
>> the group and prepend RA+VIRAMA there. This is all what has to be done
>> with Devanagari. Other Indic scripts contain two-part vowels but the
>> rules will be similarly simple. We should not be forced to double the
>> size of the PDF file. AR and other PDF rendering programs should learn
>> these simple rules and use them when extracting text.
>
> If you can provide the  UTF-16BE Hex representation of these,
> I can create a PDF using it as the /ActualText  replacement for
> some arbitrary string of letters.
>
> This will test whether this is a viable approach for Devanagari.
> If so, then it is a matter of working out how to expand this
> for a full solution.
>
>
>>
>>> There is a macro package that can do this with pdfTeX, and it is
>>> a vital part of my Tagged PDF work for mathematics.
>>> Also, I have an example where the CJK.sty package is extended
>>> to tag Chinese characters built from multiple glyphs so

Re: [XeTeX] TeX in the modern World. (goes OT) Was: Re: Whitespace in input

2011-11-19 Thread Philip TAYLOR



bhutex wrote:


I don't really understand why this discussion is happening.

Have you not read the article in TUGboat? Don is planning to bring out a new
TeX called iTeX*. (Actually, the * is the sound of a bell: ding!)

It may not be free, but it can handle all types of output formats, all
languages, etc. Don presented this paper at the TUG meeting last year,
because the TUG volume is a proceedings volume.


I have heard rumours that DEK has had to put work on ĭ-TeX
on hold while he addresses a more important (and previously
unsolved problem) in computer science.  Apparently, realising
that computerised embroidery is sadly deficient when compared
to the real thing, DEK has, it is rumoured, decided that
as a computer scientist it is his duty to write the ultimate,
definitive, computerised embroidery software.  It is alleged
that initially this will be solely for his, and Jill's,
private use, but as word  of this remarkable tool leaks out
into academe, ever more would-be computer embroiderers are
expected to plead for a copy, and, if the rumours are true,
DEK is therefore likely to find himself distracted from his
ground-breaking work on ĭ-TeX as he responds to bug reports
(few) and feature requests (many) from his early adopters.

Philip Taylor





Re: [XeTeX] Whitespace in input

2011-11-19 Thread Philip TAYLOR



Keith J. Schultz wrote:


I do not think anybody disputes the fact that characters are not glyphs.

The confusion arises because a character in CS is well defined and has a
history. To be more exact, it is just one byte in size, so there can be only
256 characters.


Sorry, Keith, this is patently untrue.  Replace "is" by "was once" and
you get a little closer to the truth, but you still completely ignore
issues such as the difference between (say) EBCDIC and ASCII.  CDC machines
used a 60-bit word, and one character was six bits, not eight.  And before
the advent of the extended character set, a character consisted of seven
bits plus a parity bit, thus yielding at most 128 characters of which
32 were reserved for control functions.


The average user considers a glyph to be the same as a "letter" and 
thereby a character.


It is rarely safe to believe that one knows what the average user thinks ...


Now, in order to process the glyphs with a computer, they must be
decomposed back to Unicode.


But one rarely, if ever, "processes glyphs"; the glyphs are the end result,
not the input.  Glyph processing does become necessary in languages such
as Arabic, where context has a major impact on the way in which the
individual glyphs are presented, but in Western languages the nearest we
get to "glyph processing" is in the formation of ligature digraphs.


How well this is done depends on the system itself. If the system is
not fully Unicode aware and implements it improperly, then there will be
problems. What adds to the complexity of the problem is that not all fonts
used for displaying Unicode contain all code points, thereby creating your
many-to-many decomposition.

As for getting junk when copying Unicode, just copy between two texts
using different fonts, where one font does not contain the glyph.

The only true way to master this problem is if the computer world would
go completely full Unicode, with fonts supporting the full Unicode code set!


I personally hope that this does not happen, and that before then
we have an "Omnicode consortium" to review the mistakes of Unicode
and to address them in a future, more orthogonal, more consistent,
specification.

Philip Taylor




Re: [XeTeX] (OT) Re: TeX in the modern World. (goes OT) Was: Re: Whitespace in input

2011-11-19 Thread Philip TAYLOR



Keith J. Schultz wrote:


Me, I am almost 50 and have been around computers since the 80s.
My first was an Apple IIe; at the university we used a mainframe.


My first computer was a Clary 404, with 8K of magnetic core memory,
a magnetic card reader and/or teletype as input device, and an
IBM golf ball "Selectric" typewriter for output.  A 3rd-year
undergraduate, working under my supervision, wrote a chess end-game
solver that would run on this machine and solve end-game problems
in reasonable time.  I wonder how many programmers today could do
the same with 125 000 times as much memory (1 GB)?

** Phil.




Re: [XeTeX] Whitespace in input

2011-11-19 Thread Keith J. Schultz
Hi Zdenek,

I do not think anybody disputes the fact that characters are not glyphs.

The confusion arises because a character in CS is well defined and has a
history. To be more exact, it is just one byte in size, so there can be
only 256 characters.

Unicode has changed all this, and we have a Unicode character which is
of different sizes depending on the Unicode encoding used.

It gets even hairier, as in Unicode several Unicode characters can be
combined (composed). The result to be output is known as a glyph!

The average user considers a glyph to be the same as a "letter" and
thereby a character.

Now, in order to process the glyphs with a computer, they must be
decomposed back to Unicode. How well this is done depends on the system
itself. If the system is not fully Unicode aware and implements it
improperly, then there will be problems. What adds to the complexity of
the problem is that not all fonts used for displaying Unicode contain all
code points, thereby creating your many-to-many decomposition.

As for getting junk when copying Unicode, just copy between two texts
using different fonts, where one font does not contain the glyph.

The only true way to master this problem is if the computer world would
go completely full Unicode, with fonts supporting the full Unicode code set!

That is impractical for the time being.

The only advice I can give is: choose your tools wisely.

regards
Keith.

Am 18.11.2011 um 23:51 schrieb Zdenek Wagner:

> 2011/11/18 maxwell :
>> On Fri, 18 Nov 2011 13:52:56 +0100, Zdenek Wagner
>> 
>> wrote:
>>> 2011/11/18 Philip TAYLOR :
 Is it safe to assume that these "code listings"
 are restricted to the ASCII character set ?  If
 so, yes, spaces are likely to be a problem, but
 if the code listing can also include ligature-
 digraphs, then these are likely to prove even
 more problematic.
 
>>> If the code listing is typeset in a fixed width font, it is usually no
>>> problem. I copied a few code samples from books in PDF, most of them
>>> were typeset by TeX. If I want to copy text in Devanagari, it is
>>> almost impossible.
>> 
>> Besides TeX, Dr. Knuth also invented Literate Programming.  In our own
>> project, we use LP to extract the code listings from the original source
>> code, rather than from the PDF.  One advantage is that in addition to the
>> re-ordering at the character level (mentioned in part of Zdenek's email
>> that I didn't copy over), this allows re-ordering at any arbitrary level,
> 
> This is a demonstration that glyphs are not the same as characters. I
> will start with a simpler case and will not put Devanagari into the
> mail message. If you wish to write a syllable RU, you have to add a
> dependent vowel (matra) U to a consonant RA. There is a ligature RU,
> so in PDF you will not see RA consonant with U matra but a RU glyph.
> Similarly, TRA is a single glyph representing the following
> characters: TA+VIRAMA+RA. The toUnicode map supports 1:1 and 1:many
> mappings thus it is possible to handle these cases when copying text
> from a PDF or when searching. A more difficult case is the I matra (short
> dependent vowel I). As a character it must always follow a consonant
> (this is a general rule for all dependent vowels) but visually (as a
> glyph) it precedes the consonant group after which it is pronounced.
> The sample word was kitab (it means a book). In Unicode (as
> characters) the order is KA+I-matra+TA+A-matra(long)+BA. Visually
> I-matra precedes KA. XeTeX (knowing that it works with a Devanagari
> script) runs the character sequence through ICU and the result is the
> glyph sequence. The original sequence is lost so that when the text is
> copied from PDF, we get (not exactly) i*katab. Microsoft suggested
> what additional characters should appear in Indic OpenType fonts. One
> of them is a dotted ring which denotes a missing consonant. I-matra
> must always follow a consonant (in character order). If it is moved to
> the beginning of a word, it is wrong. If you paste it to a text
> editor, the OpenType rendering engine should display a missing
> consonant as a dotted ring (if it is present in the font). In
> character order the dotted ring will precede I-matra but in visual
> (glyph) order it will be just opposite. Thus the asterisk shows the
> place where you will see the dotted circle. This is just one simple
> case. I-matra may follow a consonant group, such as in word PRIY
> (dear) which is PA+VIRAMA+RA+I-matra+YA or STRIYOCIT (good for women)
> which is SA+VIRAMA+TA+VIRAMA+RA+I-matra+YA+O-matra+CA+I-matra+TA. Both
> words will start with the I-matra glyph. The latter will contain two
> ordering bugs after copy&paste. Consider also word MURTI (statue)
> which is a sequence of characters
> MA+U-matra(long)+RA+VIRAMA+TA+I-matra.
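The I-matra rule Zdenek describes (in glyph order the matra precedes the consonant group; logically it must follow the last consonant of that group) is mechanical enough to sketch. This is only an illustration of the reordering at the character level, not a full extractor; it handles just the consonant (VIRAMA consonant)* clusters mentioned, and the repha rule (a RA glyph at the right edge moved to the front of the cluster as RA+VIRAMA) would be analogous:

```python
# Devanagari codepoints used below
I_MATRA = "\u093F"  # DEVANAGARI VOWEL SIGN I
VIRAMA  = "\u094D"  # DEVANAGARI SIGN VIRAMA

def fix_i_matra(glyph_order):
    """Move each I matra from visual (glyph) order to logical order,
    i.e. after the consonant cluster C (VIRAMA C)* that follows it."""
    out, i, n = [], 0, len(glyph_order)
    while i < n:
        c = glyph_order[i]
        if c == I_MATRA and i + 1 < n:
            # collect the consonant cluster that follows the matra
            j = i + 1
            cluster = [glyph_order[j]]
            while j + 2 < n and glyph_order[j + 1] == VIRAMA:
                cluster += [glyph_order[j + 1], glyph_order[j + 2]]
                j += 2
            out += cluster + [I_MATRA]
            i = j + 1
        else:
            out.append(c)
            i += 1
    return "".join(out)

# 'kitab' extracted in glyph order: I, KA, TA, AA, BA -> KA, I, TA, AA, BA
print(fix_i_matra("\u093F\u0915\u0924\u093E\u092C") ==
      "\u0915\u093F\u0924\u093E\u092C")  # True
```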