gnulib-tool.py: Stop using codecs.open
It seems that codecs.open is frowned upon, nowadays [1],
and that the Python 3 way of opening a file is a built-in function 'open' [2].

Let's use this consistently. With newline='\n' in order to match what
gnulib-tool.sh does.

Specifying encoding='utf-8' is what makes the most sense today. If a package
still has a configure.ac or Makefile.am in ISO-8859-1 encoding, this will
probably fail. Let's see if people report a problem with this; it should be
very rare.

[1] https://stackoverflow.com/questions/5250744/difference-between-open-and-codecs-open-in-python
[2] https://docs.python.org/3/library/functions.html#open


2024-04-13  Bruno Haible

	gnulib-tool.py: Stop using codecs.open.
	* pygnulib/*.py: To open a file, consistently use
	open(..., mode='[rwa]', newline='\n', encoding='utf-8').

diff --git a/gnulib-tool.py.TODO b/gnulib-tool.py.TODO
index 1eada15429..160f7b7465 100644
--- a/gnulib-tool.py.TODO
+++ b/gnulib-tool.py.TODO
@@ -10,9 +10,5 @@ Optimize:
 Various other refactorings, as deemed useful:
 - Use an enum for 'all', 'old', 'new', 'added', 'removed' in GLImport.py.
- - go through all the open() and codecs.open() calls and turn them into
-with open(file_name, 'r', newline='\n', encoding='utf-8') as file:
-or
-with open(file_name, 'w', newline='\n', encoding='utf-8') as file:
diff --git a/pygnulib/GLEmiter.py b/pygnulib/GLEmiter.py
index ad164515dc..6edad6619e 100644
--- a/pygnulib/GLEmiter.py
+++ b/pygnulib/GLEmiter.py
@@ -20,7 +20,6 @@ from __future__ import annotations
 #===
 import os
 import re
-import codecs
 import subprocess as sp
 from collections.abc import Callable
 from . import constants
@@ -895,7 +894,7 @@ AC_DEFUN([%V1%_LIBSOURCES], [
 if makefile_name:
 path = joinpath(sourcebase, 'Makefile.am')
 if isfile(path):
-with codecs.open(path, 'rb', 'UTF-8') as file:
+with open(path, mode='r', newline='\n', encoding='utf-8') as file:
 data = file.read()
 if pattern.findall(data):
 lib_gets_installed = True
diff --git a/pygnulib/GLFileSystem.py b/pygnulib/GLFileSystem.py
index 6c40586574..ee78181d82 100644
--- a/pygnulib/GLFileSystem.py
+++ b/pygnulib/GLFileSystem.py
@@ -354,10 +354,10 @@ class GLFileAssistant:
 transformer = sed_transform_testsrelated_lib_file
 if transformer != None:
 # Read the file that we looked up.
-with open(lookedup, 'r', newline='\n', encoding='utf-8') as file:
+with open(lookedup, mode='r', newline='\n', encoding='utf-8') as file:
 src_data = file.read()
 # Write the transformed data to the temporary file.
-with open(tmpfile, 'w', newline='\n', encoding='utf-8') as file:
+with open(tmpfile, mode='w', newline='\n', encoding='utf-8') as file:
 file.write(re.sub(transformer[0], transformer[1], src_data))
 path = joinpath(self.config['destdir'], rewritten)
 if isfile(path):
diff --git a/pygnulib/GLImport.py b/pygnulib/GLImport.py
index 3d694a7919..02a49f8512 100644
--- a/pygnulib/GLImport.py
+++ b/pygnulib/GLImport.py
@@ -20,7 +20,6 @@ from __future__ import annotations
 #===
 import os
 import re
-import codecs
 import subprocess as sp
 from . import constants
 from .GLError import GLError
@@ -97,7 +96,7 @@ class GLImport:
 os.rmdir(self.cache['tempdir'])
 # Read configure.{ac,in}.
-with codecs.open(self.config.getAutoconfFile(), 'rb', 'UTF-8') as file:
+with open(self.config.getAutoconfFile(), mode='r', newline='\n', encoding='utf-8') as file:
 data = file.read()
 # Get cached auxdir and libtool from configure.{ac,in}.
@@ -127,7 +126,7 @@ class GLImport:
 # Get other cached variables.
 path = joinpath(self.config['m4base'], 'gnulib-cache.m4')
 if isfile(path):
-with codecs.open(path, 'rb', 'UTF-8') as file:
+with open(path, mode='r', newline='\n', encoding='utf-8') as file:
 data = file.read()
 # gl_LGPL
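
As an illustration of the open() convention adopted in this patch, here is a minimal sketch. The helper names write_text and read_text, the file names, and the sample contents are made up for the example and are not part of pygnulib. It shows that newline='\n' writes a bare LF byte regardless of platform, and that reading an ISO-8859-1 file with encoding='utf-8' raises UnicodeDecodeError, which is the failure mode anticipated above.

```python
import os
import tempfile


def write_text(path: str, text: str) -> None:
    """Write UTF-8 text without translating newlines to the platform's line ending."""
    with open(path, mode='w', newline='\n', encoding='utf-8') as file:
        file.write(text)


def read_text(path: str) -> str:
    """Read UTF-8 text, returning line endings exactly as they are stored."""
    with open(path, mode='r', newline='\n', encoding='utf-8') as file:
        return file.read()


with tempfile.TemporaryDirectory() as tmpdir:
    # Round trip: the bytes on disk contain a bare LF on every platform.
    utf8_path = os.path.join(tmpdir, 'Makefile.am')
    write_text(utf8_path, 'EXTRA_DIST += foo.h\n')
    with open(utf8_path, mode='rb') as file:
        raw = file.read()
    round_trip = read_text(utf8_path)

    # A file stored in ISO-8859-1: the byte 0xFC (u-umlaut) is not valid UTF-8.
    latin1_path = os.path.join(tmpdir, 'configure.ac')
    with open(latin1_path, mode='wb') as file:
        file.write('dnl Maintainer: J\u00fcrgen\n'.encode('iso-8859-1'))
    try:
        read_text(latin1_path)
        decode_failed = False
    except UnicodeDecodeError:
        decode_failed = True
```

A package hitting the second case would presumably need to convert the affected file to UTF-8; the thread does not discuss a fallback.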
Re: gnulib-tool.py: Stop using codecs.open
Hi Bruno,

On 4/13/24 4:08 AM, Bruno Haible wrote:
> It seems that codecs.open is frowned upon, nowadays [1],
> and that the Python 3 way of opening a file is a built-in function 'open' [2].

Thanks for this patch. When I started working on gnulib-tool.py I
didn't even know the codecs module existed, since I had never used
Python 2.

> Let's use this consistently. With newline='\n' in order to match what
> gnulib-tool.sh does.

Sounds good. You mentioned removing the constants.nlconvert() stuff in
an earlier email [1]. How about these two patches?

Patch 0001 removes nlconvert and all of its calls. That should make
sure gnulib-tool.py and gnulib-tool.sh deal with newlines the same
way.

Patch 0002 removes the 'NL' constant. I assume this was to match
gnulib-tool.sh's use of "$nl"? But there it is used to expand to a
newline character without breaking lines, right? For example, this:

    func_append license_incompatibilities "$module $license$nl"

vs. this:

    func_append license_incompatibilities "$module $license
"

Since we use a mix of '\n' and constants.NL, I would just like to use
'\n' everywhere consistently.

> Specifying encoding='utf-8' is what makes the most sense today. If a package
> still has a configure.ac or Makefile.am in ISO-8859-1 encoding, this will
> probably fail. Let's see if people report a problem with this; it should be
> very rare.

Sounds good.

[1] https://lists.gnu.org/archive/html/bug-gnulib/2024-03/msg00370.html

Collin

From fbdd4e7783b8a4b6e637bac2b392024b0e880622 Mon Sep 17 00:00:00 2001
From: Collin Funk
Date: Sun, 14 Apr 2024 10:18:03 -0700
Subject: [PATCH 1/2] gnulib-tool.py: Don't perform newline conversions.

* pygnulib/constants.py (nlconvert): Remove function. Remove unused
platform import.
* pygnulib/GLImport.py (GLImport.gnulib_cache): Remove calls to
nlconvert().
* pygnulib/GLModuleSystem.py
(GLModule.getAutomakeSnippet_Unconditional): Likewise.
* pygnulib/GLTestDir.py (GLTestDir.execute, GLMegaTestDir.execute):
Likewise.
---
 ChangeLog                  | 12 ++++++++++++
 pygnulib/GLImport.py       |  2 +-
 pygnulib/GLModuleSystem.py |  1 -
 pygnulib/GLTestDir.py      |  6 ------
 pygnulib/constants.py      | 10 ----------
 5 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 9b40f3d9d8..b5c16eae4e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,15 @@
+2024-04-14  Collin Funk
+
+	gnulib-tool.py: Don't perform newline conversions.
+	* pygnulib/constants.py (nlconvert): Remove function. Remove unused
+	platform import.
+	* pygnulib/GLImport.py (GLImport.gnulib_cache): Remove calls to
+	nlconvert().
+	* pygnulib/GLModuleSystem.py
+	(GLModule.getAutomakeSnippet_Unconditional): Likewise.
+	* pygnulib/GLTestDir.py (GLTestDir.execute, GLMegaTestDir.execute):
+	Likewise.
+
 2024-04-14  Collin Funk

 	gnulib-tool.py: Remove some unused variables.
diff --git a/pygnulib/GLImport.py b/pygnulib/GLImport.py
index eadf45828d..eaff520fa4 100644
--- a/pygnulib/GLImport.py
+++ b/pygnulib/GLImport.py
@@ -598,7 +598,7 @@ def gnulib_cache(self) -> str:
 if vc_files != None:
 # Convert Python bools to shell (True -> true).
 emit += 'gl_VC_FILES([%s])\n' % str(vc_files).lower()
-return constants.nlconvert(emit)
+return emit
 def gnulib_comp(self, filetable: dict[str, list[str]], gentests: bool) -> str:
 '''Emit the contents of generated $m4base/gnulib-comp.m4 file.
diff --git a/pygnulib/GLModuleSystem.py b/pygnulib/GLModuleSystem.py
index 2551e316e1..11cd6ef78a 100644
--- a/pygnulib/GLModuleSystem.py
+++ b/pygnulib/GLModuleSystem.py
@@ -645,7 +645,6 @@ def getAutomakeSnippet_Unconditional(self) -> str:
 for filename in buildaux_files.split(constants.NL) ]
 result += 'EXTRA_DIST += %s' % ' '.join(buildaux_files)
 result += '\n\n'
-result = constants.nlconvert(result)
 self.cache['makefile-unconditional'] = result
 return self.cache['makefile-unconditional']
diff --git a/pygnulib/GLTestDir.py b/pygnulib/GLTestDir.py
index dc70e8d304..a7709a1259 100644
--- a/pygnulib/GLTestDir.py
+++ b/pygnulib/GLTestDir.py
@@ -404,7 +404,6 @@ def execute(self) -> None:
 if file.startswith('m4/'):
 file = constants.substart('m4/', '', file)
 emit += 'EXTRA_DIST += %s\n' % file
-emit = constants.nlconvert(emit)
 with open(destfile, mode='w', newline='\n', encoding='utf-8') as file:
 file.write(emit)
@@ -522,7 +521,6 @@ def execute(self) -> None:
 emit += 'AH_TOP([#include \"../config.h\"])\n\n'
 emit += 'AC_CONFIG_FILES([Makefile])\n'
 emit += 'AC_OUTPUT\n'
-emit = constants.nlconvert(emit)
 path = joinpath(self.testdir, testsbase, 'configure.ac')
 with open(path, mode='w', newline='\n', encoding='utf-8') as file:
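
To illustrate why a separate conversion step is unnecessary once every file is written with newline='\n', here is a small sketch. The helper encode_text is hypothetical and is not part of pygnulib. It uses io.TextIOWrapper over an in-memory buffer, with newline='\r\n' standing in for what the default newline=None would do on a platform whose os.linesep is '\r\n'.

```python
import io


def encode_text(text: str, newline: str) -> bytes:
    """Return the bytes a text-mode file would contain for the given newline setting."""
    buffer = io.BytesIO()
    wrapper = io.TextIOWrapper(buffer, encoding='utf-8', newline=newline)
    wrapper.write(text)
    wrapper.flush()
    data = buffer.getvalue()
    wrapper.detach()  # Leave the buffer open; the wrapper is no longer needed.
    return data


emit = 'AC_CONFIG_FILES([Makefile])\nAC_OUTPUT\n'

# newline='\n': no translation on output, the same bytes on every platform.
lf_bytes = encode_text(emit, newline='\n')

# newline='\r\n': every '\n' written becomes CR LF.
crlf_bytes = encode_text(emit, newline='\r\n')
```

Since the output format is fixed at the open() call, strings such as emit can be built with plain '\n' and written unchanged.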
Re: gnulib-tool.py: Stop using codecs.open
Hi Collin,

> You mentioned removing the constants.nlconvert() stuff in
> an earlier email [1]. How about these two patches?

I verified that on Cygwin, the test suite passes; this is because
  - Cygwin programs produce LF as line terminator,
  - Python's platform.system() returns "CYGWIN_NT-10.0".

Regarding native Windows, I don't think there's a realistic use-case,
as users would have to have a working autoconf and automake first,
which is unlikely for that platform. So, I don't even spend time
testing gnulib-tool on native Windows.

And regarding z/OS — its newline conventions are that much of a mess [1]
that it's better not to even think about it.

> Patch 0001 removes nlconvert and all of its calls. That should make
> sure gnulib-tool.py and gnulib-tool.sh deal with newlines the same
> way.
>
> Patch 0002 removes the 'NL' constant.

Thanks! Both patches applied.

> I assume this was to match gnulib-tool.sh's use of "$nl"?

Yes. In a shell script, use of literal newlines produces hard-to-read code.

Bruno

[1] https://lists.gnu.org/archive/html/bug-gnu-libiconv/2023-04/msg00010.html
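
The Cygwin observation can be sketched as follows. The body of the removed nlconvert() is not shown in this thread, so nlconvert_sketch below is a hypothetical stand-in that assumes the helper converted to CR LF only when platform.system() reported native Windows. Under that assumption, a system name such as "CYGWIN_NT-10.0" leaves LF untouched, which would be consistent with the test suite passing on Cygwin both before and after the removal.

```python
import platform


def nlconvert_sketch(text: str, system: str) -> str:
    """Hypothetical stand-in for the removed helper (an assumption, not the real code)."""
    # Normalize to LF first, then convert only for native Windows.
    text = text.replace('\r\n', '\n')
    if system.lower() == 'windows':
        text = text.replace('\n', '\r\n')
    return text


emit = 'AC_OUTPUT\n'

# The name Bruno reports for Cygwin does not match 'windows': LF is kept.
on_cygwin = nlconvert_sketch(emit, 'CYGWIN_NT-10.0')

# platform.system() returns 'Windows' on native Windows: LF becomes CR LF.
on_windows = nlconvert_sketch(emit, 'Windows')

# Behavior on whatever platform runs this sketch.
on_this_platform = nlconvert_sketch(emit, platform.system())
```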
Re: gnulib-tool.py: Stop using codecs.open
Hi Bruno,

On 4/14/24 3:16 PM, Bruno Haible wrote:
> I verified that on Cygwin, the test suite passes; this is because
>   - Cygwin programs produce LF as line terminator,
>   - Python's platform.system() returns "CYGWIN_NT-10.0".
>
> Regarding native Windows, I don't think there's a realistic use-case,
> as users would have to have a working autoconf and automake first,
> which is unlikely for that platform. So, I don't even spend time
> testing gnulib-tool on native Windows.

Thanks for testing that. As far as native Windows goes, that makes
sense. I know Perl and Python work there, but I have never tried
Autoconf and Automake. I think in the past I used MSYS2 to test
builds [1].

> And regarding z/OS — its newline conventions are that much of a mess [1]
> that it's better not to even think about it.

Interesting read. I've never used z/OS, or many of the other
proprietary operating systems for that matter (e.g. HP-UX, AIX).

[1] https://www.msys2.org

Collin