On Sun, Dec 21, 2025 at 10:08:58PM +0000, Gavin Smith wrote:
> Here's a patch:

Here's a more complete patch.  To avoid changing the output for
HTML, DocBook and one other output format ("Texinfo XML") when the
input was not UTF-8, I had to remove the default OUTPUT_ENCODING_NAME
UTF-8 setting.  Otherwise these formats would be forced to UTF-8 as
well.

I'm slightly worried this may have some unintended effect (possibly on
some setup different from mine).  For example, OUTPUT_ENCODING_NAME
might end up unset.  I expect the output encoding to always be set,
as it should be propagated from the input encoding, which should always
be set.
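
As a rough illustration of the intended precedence, here is a minimal
sketch in C.  It is not the actual code: sketch_options, sketch_document
and sketch_set_output_encoding are hypothetical, simplified stand-ins for
OPTIONS, DOCUMENT and set_output_encoding, keeping only the two fields
that matter here.

```c
#include <stddef.h>

/* Hypothetical stand-in for OPTIONS: only the output encoding is kept. */
typedef struct {
  const char *output_encoding_name;   /* NULL means "not set" */
} sketch_options;

/* Hypothetical stand-in for DOCUMENT: only the input encoding is kept. */
typedef struct {
  const char *input_encoding_name;    /* NULL means "not known" */
} sketch_document;

/* Copy the input encoding to the output encoding, but only if no output
   encoding was set beforehand (for example by a converter default). */
void
sketch_set_output_encoding (sketch_options *options,
                            const sketch_document *document)
{
  if (options
      && !options->output_encoding_name
      && document && document->input_encoding_name)
    options->output_encoding_name = document->input_encoding_name;
}
```

With this logic, a converter that sets a default (as LaTeX would with
'utf-8') keeps it, a converter with no default inherits the input
encoding, and the output encoding stays unset only if the input encoding
is also missing, which is the case I am unsure about.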

I don't think that the OUTPUT_ENCODING_NAME defaults did very much,
but I'm not certain.  It's possible these default values stemmed from
a time before UTF-8 was the default input encoding for Texinfo.  (For
example, "git blame" tracks the setting in DocBook.pm to a commit on
2012-09-14 (49aa00da6ae37), whereas UTF-8 only became the default input
encoding in 2019.)


diff --git a/ChangeLog b/ChangeLog
index b4185d5f7a..9f549d6155 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,24 @@
+2025-12-23  Gavin Smith <[email protected]>
+
+       UTF-8 by default for LaTeX output
+
+       * tta/perl/Texinfo/Convert/LaTeX.pm (%defaults):
+       Set OUTPUT_ENCODING_NAME to 'utf-8'.
+
+       * tta/perl/Texinfo/Common.pm (set_output_encoding),
+       * tta/C/main/document.c (set_output_encoding): Only propagate
+       encoding name from input encoding to output encoding if output
+       encoding is not already set.
+       * tta/perl/Texinfo/Convert/Text.pm: update comments
+       
+       * tta/data/converters_defaults.txt (html_converter),
+       * tta/perl/Texinfo/Convert/DocBook.pm (%defaults),
+       * tta/perl/Texinfo/Convert/HTML.pm (%defaults),
+       * tta/perl/Texinfo/Convert/TexinfoXML.pm (%defaults):
+       Remove OUTPUT_ENCODING_NAME utf-8 default.
+
+       * NEWS: update
+
 2025-12-23 Patrice Dumas  <[email protected]>
 
        * tta/C/convert/convert_html.c (html_conversion_finalization),
diff --git a/NEWS b/NEWS
index a709f21184..cddc981fd6 100644
--- a/NEWS
+++ b/NEWS
@@ -70,6 +70,9 @@ See the manual for detailed information.
   . Info output:
     . new (experimental) variable INFO_MATH_IMAGES allows outputting
       images for mathematics notation
+  . LaTeX output:
+     . use UTF-8 encoding for output by default, regardless of input
+       encoding.  override with OUTPUT_ENCODING_NAME.
   . Remove the Texinfo::TeX4HT customization package.
   . XML output:
       . use HTML entities names for @H and @dotaccent accents types
diff --git a/tta/C/main/document.c b/tta/C/main/document.c
index f554ad555a..bea53f01a7 100644
--- a/tta/C/main/document.c
+++ b/tta/C/main/document.c
@@ -208,6 +208,7 @@ void
 set_output_encoding (OPTIONS *customization_information, DOCUMENT *document)
 {
   if (customization_information
+      && !customization_information->OUTPUT_ENCODING_NAME.o.string
       && document && document->global_info.input_encoding_name) {
     option_set_conf (&customization_information->OUTPUT_ENCODING_NAME, -1,
                      document->global_info.input_encoding_name);
diff --git a/tta/data/converters_defaults.txt b/tta/data/converters_defaults.txt
index bd7b4f8dae..29ae0e9439 100644
--- a/tta/data/converters_defaults.txt
+++ b/tta/data/converters_defaults.txt
@@ -114,7 +114,6 @@ NO_CSS                 0
 NO_NUMBER_FOOTNOTE_SYMBOL  *
 NODE_NAME_IN_MENU      1
 OPEN_QUOTE_SYMBOL      undef
-OUTPUT_ENCODING_NAME   utf-8
 SECTION_NAME_IN_TITLE  0
 SHORT_TOC_LINK_TO_TOC  1
 SHOW_TITLE             undef
diff --git a/tta/perl/Texinfo/Common.pm b/tta/perl/Texinfo/Common.pm
index 4054ba4321..7387802a6a 100644
--- a/tta/perl/Texinfo/Common.pm
+++ b/tta/perl/Texinfo/Common.pm
@@ -1338,10 +1338,13 @@ sub set_output_encoding($$) {
   if (defined($document)) {
     $document_information = $document->global_information();
   }
-  $customization_information->set_conf('OUTPUT_ENCODING_NAME',
-               $document_information->{'input_encoding_name'})
-     if (defined($document_information)
-         and exists($document_information->{'input_encoding_name'}));
+
+  if (!$customization_information->get_conf('OUTPUT_ENCODING_NAME')
+      and defined($document_information)
+      and exists($document_information->{'input_encoding_name'})) {
+    $customization_information->set_conf('OUTPUT_ENCODING_NAME',
+                 $document_information->{'input_encoding_name'})
+  }
 }
 
 # $DOCUMENT is the parsed Texinfo document.  It is optional, but it
diff --git a/tta/perl/Texinfo/Convert/DocBook.pm b/tta/perl/Texinfo/Convert/DocBook.pm
index 604cca6678..49c838f31c 100644
--- a/tta/perl/Texinfo/Convert/DocBook.pm
+++ b/tta/perl/Texinfo/Convert/DocBook.pm
@@ -57,7 +57,6 @@ my %defaults = (
   # Customization option variables
   'FORMAT_MENU'          => 'nomenu',
   'EXTENSION'            => 'xml', # dbk?
-  'OUTPUT_ENCODING_NAME' => 'utf-8',
   'SPLIT'                => '',
   'OPEN_QUOTE_SYMBOL'    => '&#'.hex('2018').';',
   'CLOSE_QUOTE_SYMBOL'   => '&#'.hex('2019').';',
diff --git a/tta/perl/Texinfo/Convert/LaTeX.pm b/tta/perl/Texinfo/Convert/LaTeX.pm
index 3bfc247e39..3c8b0fecac 100644
--- a/tta/perl/Texinfo/Convert/LaTeX.pm
+++ b/tta/perl/Texinfo/Convert/LaTeX.pm
@@ -819,6 +819,7 @@ my %defaults = (
   'FORMAT_MENU'          => 'nomenu',
   'EXTENSION'            => 'tex',
   'paragraphindent'      => undef, # global default is for Info/Plaintext
+  'OUTPUT_ENCODING_NAME' => 'utf-8'
 );
 
 
diff --git a/tta/perl/Texinfo/Convert/TexinfoXML.pm b/tta/perl/Texinfo/Convert/TexinfoXML.pm
index 3d99e26bbf..8040f5c741 100644
--- a/tta/perl/Texinfo/Convert/TexinfoXML.pm
+++ b/tta/perl/Texinfo/Convert/TexinfoXML.pm
@@ -45,7 +45,6 @@ my %defaults = (
   # Customization option variables
   'FORMAT_MENU'          => 'menu',
   'EXTENSION'            => 'xml',
-  'OUTPUT_ENCODING_NAME' => 'utf-8',
   'SPLIT'                => '',
 );
 
diff --git a/tta/perl/Texinfo/Convert/Text.pm b/tta/perl/Texinfo/Convert/Text.pm
index 7408985f8b..20462aa2a4 100644
--- a/tta/perl/Texinfo/Convert/Text.pm
+++ b/tta/perl/Texinfo/Convert/Text.pm
@@ -954,7 +954,7 @@ sub convert($$) {
   if (defined($document)) {
     $global_info = $document->global_information();
 
-    # same as Texinfo::Common::set_output_encoding
+    # similar to Texinfo::Common::set_output_encoding
     $self->{'OUTPUT_ENCODING_NAME'} = $global_info->{'input_encoding_name'}
       if (defined($global_info)
           and exists($global_info->{'input_encoding_name'}));
@@ -991,7 +991,7 @@ sub output($$) {
   if ($document) {
     $global_info = $document->global_information();
 
-    # same as Texinfo::Common::set_output_encoding
+    # similar to Texinfo::Common::set_output_encoding
     $self->{'OUTPUT_ENCODING_NAME'} = $global_info->{'input_encoding_name'}
       if (defined($global_info)
           and exists($global_info->{'input_encoding_name'}));

