Re: [help-texinfo] how does one encode a middle dot? and other questions

2015-04-06 Thread Benno Schulenberg

On Sun, Apr 5, 2015, at 17:32, Eli Zaretskii wrote:
  From: Benno Schulenberg bensb...@justemail.net
  
  First, is there a way to encode U+00B7 (middle dot) in a texi
  file, in a way similar to @guillemetright{} and @bullet{}?
 
 Not clear what you are asking.  A simple answer is just use that
 character in the Texinfo source, [...]

Well, the actual UTF-8 characters are what we have in the .texi
source file now.  But 'svn blame' complains that it is a binary
file.  Of course I could use --force or do a propset, but I
realized that I wish to have the source file in pure plain ASCII.

So I would like to write as ASCII things like @guillemetright{}
and @bullet{} and @middledot{}, and have them come out as actual
UTF-8 characters when makeinfo is run in a UTF-8 locale, and have
them reduced to something vaguely similar in locale encodings that
don't have that specific character in their character set.
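
A quick way to confirm that a source file really is plain ASCII is to strip every 7-bit byte and see whether anything is left. This is only a sketch: the function name is_pure_ascii and the file name manual.texi are made up for illustration and are not part of Texinfo or Subversion.

```shell
# Hypothetical helper (not part of Texinfo or Subversion):
# succeeds when the file named by $1 contains only 7-bit ASCII bytes.
is_pure_ascii() {
    # Delete all bytes in the range 0-127, then count whatever remains.
    leftover=$(LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' ')
    [ "$leftover" -eq 0 ]
}

# Example use, assuming a source file called manual.texi:
#   is_pure_ascii manual.texi || echo "manual.texi has non-ASCII bytes"
```

A check like this could run before a commit, so that a stray literal guillemet or bullet is noticed early.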

 The next release will have a feature in the stand-alone Info reader to
 replace the characters that cannot be displayed by suitable ASCII art.

Does that mean that when the .info file contains an actual UTF-8
character, say a right guillemet (U+00BB), and info is run in a plain
POSIX locale, the character would be shown as >> or something?

Hmm, testing it...  Yes, that appears to work.  Cool.

So I could use @documentencoding UTF-8 in some future, and rest
assured that it won't result in garbage in other locales.

Benno





Re: [help-texinfo] how does one encode a middle dot? and other questions

2015-04-06 Thread Eli Zaretskii
 From: Benno Schulenberg bensb...@justemail.net
 Cc: help-texinfo@gnu.org
 Date: Mon, 06 Apr 2015 13:10:54 +0200
 
   First, is there a way to encode U+00B7 (middle dot) in a texi
   file, in a way similar to @guillemetright{} and @bullet{}?
  
  Not clear what you are asking.  A simple answer is just use that
  character in the Texinfo source, [...]
 
 Well, the actual UTF-8 characters are what we have in the .texi
 source file now.  But 'svn blame' complains that it is a binary
 file.  Of course I could use --force or do a propset, but I
 realized that I wish to have the source file in pure plain ASCII.
 
 So I would like to write as ASCII things like @guillemetright{}
 and @bullet{} and @middledot{}, and have them come out as actual
 UTF-8 characters when makeinfo is run in a UTF-8 locale, and have
 them reduced to something vaguely similar in locale encodings that
 don't have that specific character in their character set.

Others have pointed out the @U feature in the next release.

  The next release will have a feature in the stand-alone Info reader to
  replace the characters that cannot be displayed by suitable ASCII art.
 
 Does that mean that when the .info file contains an actual UTF-8
 character, say a right guillemet (U+00BB), and info is run in a plain
 POSIX locale, the character would be shown as >> or something?

Nitpicking: the Info file will contain a UTF-8 _sequence_ for the
U+00BB character.
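
To make the distinction concrete, here is a small sketch that prints the bytes a UTF-8 file holds for two of the characters discussed in this thread. The variable names are illustrative; the octal escapes are simply the UTF-8 encodings written out by hand.

```shell
# Print the UTF-8 byte sequence, in hex, for two characters from this thread.
#   U+00BB (right guillemet) -> 0xC2 0xBB -> \302\273
#   U+00B7 (middle dot)      -> 0xC2 0xB7 -> \302\267
guillemet_bytes=$(printf '\302\273' | od -An -tx1 | tr -d ' \n')
middledot_bytes=$(printf '\302\267' | od -An -tx1 | tr -d ' \n')

echo "U+00BB: $guillemet_bytes"
echo "U+00B7: $middledot_bytes"
```

Each character occupies two bytes (c2bb and c2b7 respectively), which is why a reader in a non-UTF-8 locale needs to recognize the whole sequence in order to substitute something for it.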

 Hmm, testing it...  Yes, that appears to work.  Cool.
 
 So I could use @documentencoding UTF-8 in some future, and rest
 assured that it won't result in garbage in other locales.

Please note that this treatment is reserved to certain Unicode
characters that the stand-alone reader knows about, not to any
arbitrary character out there.  Basically, the characters emitted by
makeinfo as part of formatting are the only ones supported.



Re: [help-texinfo] how does one encode a middle dot? and other questions

2015-04-06 Thread Benno Schulenberg

Hi Karl,

  Sixth, how do I run texi2any without installing it?
 
 I use a one-line shell script:
 exec /path/to/texinfo/checkout/tp/texi2any.pl "$@"

Aaah... one has to run *texi2any.pl* directly, and not texi2any.
Okay, that works.

 In my further experience, the best answer is, don't use
 @documentencoding UTF-8 unless it is really needed

Well, it must be set in order for the new character-set reducing
feature of stand-alone info to work.  Or at least, the tail of the
.info file must contain coding: utf-8.

 Not that anyone cares, but personally I deplore the current trend of
 randomly forcing all manuals to UTF-8.  The plethora of resulting
 Unicode quotes makes the manuals unreadable in non-UTF-8 environments.

But not with the new stand-alone info reader: the left and right
single quote signs both get reduced to an apostrophe here.

(By the way: thanks for getting rid of the backtick as a left single
quote mark.  Phew!  That makes things much more readable for me.)

 I wish to check whether makeinfo still produces no blank line
 between the items of a bulleted list (which 5.1 doesn't but
 4.13 did).  
 
 With the test file below, there is no blank line between the items with
 either svn texinfo or makeinfo 4.13.

My makeinfo 4.13 messes up the list: it makes "This is..." into
a list item, instead of "i.".

 I seem to recall precisely that 4.13 was inconsistent in this regard,
 sometimes but not always adding blank lines.  And so when Patrice
 discovered this, it seemed like the most user-controllable and
 -understandable behavior was to preserve blank lines in the input
 between items, but not have the program add blank lines sometimes but
 not others.  Pretty sure other manuals used the
 no-blank-line-insertion behavior for short lists.
 
 I don't imagine that answer will make you happy, [...]

But it does.  :)  Now I can control whether list items are contiguous
or separated.   I will just add blank lines to the source, and the
result should be the same in all makeinfos.

 And whether @bullet{} still produces only a * (U+002A)
 instead of a real [binary garbage] (U+2022) in an info file.
 
 With the test file below, I get some multibyte character for the bullet.
 I can't tell what it is, but it's probably the one you want.

It is.  So that has improved, too.  Cool.  And also info no longer
sees the AltGr key as an Alt, so now I can also search for accented
characters, guillemets, and anything else I can type.  Thanks!

Benno


 \input texinfo
 @setfilename listutf.info
 @documentencoding UTF-8
 
 @itemize
 @item i.
 @item j.
 @end itemize
 
 @bye





Re: [help-texinfo] how does one encode a middle dot? and other questions

2015-04-05 Thread Karl Berry
at some point, it is envisioned to have a
command like @U{} to allow inserting any Unicode code point.

We have @U for the upcoming release.  In fact I implemented it
specifically after the discussion of middle dot for Catalan in January
and February.

The dots are sometimes inside, sometimes outside the square brackets.

Ok, will look.

checking for a french Unicode locale... none
Why does it check for that? Is it just for running a test?
Or would it enable stuff?

It comes from gnulib/m4/locale-fr.m4.  I don't know why or what
dependency pulled it in.  We did not request it explicitly.  The test
result has no particular effect on Texinfo's behavior, so far as I know.

 Sixth, how do I run texi2any without installing it?

I use a one-line shell script:
exec /path/to/texinfo/checkout/tp/texi2any.pl "$@"
I suppose an alias would do as well, if preferred.
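
For those who prefer something in their shell startup file over a separate script, a shell function is one option. This is a sketch only: the name texi2any_dev and the variable TEXINFO_SRC are invented here, and the path is the same placeholder as above.

```shell
# Assumed location of the source checkout (placeholder, adjust as needed).
TEXINFO_SRC=${TEXINFO_SRC:-/path/to/texinfo/checkout}

# Hypothetical wrapper name. Runs the uninstalled script from the checkout.
# Quoting "$@" passes each argument through unchanged, including file
# names that contain spaces.
texi2any_dev() {
    "$TEXINFO_SRC/tp/texi2any.pl" "$@"
}

# Example use:
#   texi2any_dev --version
```

Unlike a plain alias, the function keeps the checkout path in one variable, so switching between checkouts only means changing TEXINFO_SRC.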

 but what will happen on a machine that does not use a UTF-8
 locale?  Will the resulting info file still be readable when the
 above command is used?  How will the guillemet get rendered there?

In my experience, the usual answer is that it will appear as binary
garbage.

In my further experience, the best answer is, don't use
@documentencoding UTF-8 unless it is really needed (e.g., the manual is
not written in English).

Not that anyone cares, but personally I deplore the current trend of
randomly forcing all manuals to UTF-8.  The plethora of resulting
Unicode quotes makes the manuals unreadable in non-UTF-8 environments.
I have taken to replacing the three Unicode bytes with SPC SPC ` and '
SPC SPC so I can use them.

I wish to check whether makeinfo still produces no blank line
between the items of a bulleted list (which 5.1 doesn't but
4.13 did).  

With the test file below, there is no blank line between the items with
either svn texinfo or makeinfo 4.13.

I seem to recall precisely that 4.13 was inconsistent in this regard,
sometimes but not always adding blank lines.  And so when Patrice
discovered this, it seemed like the most user-controllable and
-understandable behavior was to preserve blank lines in the input
between items, but not have the program add blank lines sometimes but
not others.  Pretty sure other manuals used the
no-blank-line-insertion behavior for short lists.

I don't imagine that answer will make you happy, but it's where we are.

And whether @bullet{} still produces only a * (U+002A)
instead of a real [binary garbage] (U+2022) in an info file.

With the test file below, I get some multibyte character for the bullet.
I can't tell what it is, but it's probably the one you want.

best,
karl


\input texinfo
@setfilename listutf.info
@documentencoding UTF-8

@itemize
@item i.
@item j.
@end itemize

@bye



Re: [help-texinfo] how does one encode a middle dot? and other questions

2015-04-05 Thread Gavin Smith
On 5 April 2015 at 10:18, Benno Schulenberg bensb...@justemail.net wrote:
 First, is there a way to encode U+00B7 (middle dot) in a texi
 file, in a way similar to @guillemetright{} and @bullet{}?
Either use the character itself directly, in the encoding of the file,
or use the new @U command in the upcoming release: @U{00B7}.  For Info
output, this gives the centre dot only when the output encoding is UTF-8.
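
As a sketch of how @U{00B7} might look in practice, the following writes a small test file whose source stays pure ASCII. The file name and the sample sentence are illustrative only, and the file needs a release that supports @U to be processed.

```shell
# Write a small, pure-ASCII Texinfo file that uses @U for the middle dot.
# The file name and the sample word are illustrative only.
cat > middledot.texi <<'EOF'
\input texinfo
@setfilename middledot.info
@documentencoding UTF-8

Catalan uses the middle dot in words such as paral@U{00B7}lel.

@bye
EOF

# With a release that supports @U, the file can then be processed with:
#   makeinfo middledot.texi
```

The quoted 'EOF' delimiter keeps the shell from touching the backslash and the @ commands, so the file is written exactly as shown.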


 Second, in the manual it says that @documentencoding sets the
 input encoding.  But I find that an @guillemetright{} only gets
 rendered as » (instead of >>) when I set @documentencoding
 to UTF-8.  So it's more like that command sets the output encoding,
 no?

From the latest revision of the manual: The '@documentencoding'
command declares the input document encoding, and can also affect the
encoding of the output.

 Sixth, how do I run texi2any without installing it?
 Running tp/texi2any fails with:
 Can't locate Texinfo/Convert/Texinfo.pm in @INC (@INC contains: 
 /usr/local/share/texinfo/lib/Text-Unidecode/lib 
 /usr/local/share/texinfo/lib/Unicode-EastAsianWidth/lib 
 /usr/local/share/texinfo/lib/libintl-perl/lib /etc/perl 
 /usr/local/lib/perl/5.10.1 /usr/local/share/perl/5.10.1 /usr/lib/perl5 
 /usr/share/perl5 /usr/lib/perl/5.10 /usr/share/perl/5.10 
 /usr/local/lib/site_perl . /usr/local/share/texinfo) at tp/texi2any line 106.
 BEGIN failed--compilation aborted at tp/texi2any line 106.

Doing cd tp first has always worked for me. I haven't researched how
to get it to run as tp/texi2any. A related problem that has caught
me out several times is trying to run an installed makeinfo (e.g.
/usr/local/bin/makeinfo) when the present working directory is tp
in the source directory - it will use the modules from the source
directory (which may have uninstalled modifications) instead of the
installed modules.

 (By the way, the gluing together of failed--compilation is ugly.)

AFAIK this is a Perl error message.



Re: [help-texinfo] how does one encode a middle dot? and other questions

2015-04-05 Thread Patrice Dumas
On Sun, Apr 05, 2015 at 11:18:34AM +0200, Benno Schulenberg wrote:
 
 Hi,
 
 Several things.
 
 First, is there a way to encode U+00B7 (middle dot) in a texi
 file, in a way similar to @guillemetright{} and @bullet{}?

Not that I know of, but at some point, it is envisioned to have a
command like @U{} to allow inserting any Unicode code point.

 Second, in the manual it says that @documentencoding sets the
 input encoding.  But I find that an @guillemetright{} only gets
 rendered as » (instead of >>) when I set @documentencoding
 to UTF-8.  So it's more like that command sets the output encoding,
 no?  

Indeed, the output encoding is set to the input encoding in the
default case.

 But what will happen on a machine that does not use a UTF-8
 locale?  Will the resulting info file still be readable when the
 above command is used?  How will the guillemet get rendered there?

Eli answered this question already on the info reader side.

It is also possible to set the output encoding explicitly, by using 
  -c OUTPUT_ENCODING_NAME=ascii
for instance.  Also setting --disable-encoding could have the effect you
are looking for.
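
The two invocations above could be wrapped as follows. This is a sketch: the function names are invented here, only the options come from the text above, and the comment on --disable-encoding reflects my understanding of the option rather than a tested result.

```shell
# Hypothetical helper names; the options themselves are the ones named above.

# Ask texi2any for ASCII output via the customization variable.
info_ascii() {
    texi2any -c OUTPUT_ENCODING_NAME=ascii "$@"
}

# Alternatively, ask for ASCII transliterations instead of encoded
# special characters (as I understand --disable-encoding).
info_no_encoding() {
    texi2any --disable-encoding "$@"
}

# Example use, assuming a source file called manual.texi:
#   info_ascii manual.texi
#   info_no_encoding manual.texi
```

Either way the source can keep @documentencoding UTF-8, and the choice of output encoding is made at build time.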

-- 
Pat