Hi Stefan,

On Sun, Sep 24, 2023, 06:38 Stefan H. Holek <[email protected]> wrote:

> Hi All,
>
> There appears to be an issue with a recent addition to
> rl_filename_completion_function. It now applies rl_filename_rewrite_hook to
> the filename part of "what the user has typed". This seems wrong. Let me
> explain.
>

This was my patch (submitted to bug-bash, the original thread is at [1]) so
I'll defend the motivation for it -- though I think you're right that the
implementation was too narrowly focused on addressing the issue described
there and can violate assumptions in existing code.

> The rl_filename_rewrite_hook exists to convert data read from the
> filesystem to a representation that works in the terminal. E.g. on macOS
> the filesystem returns decomposed UTF-8, which must be converted to fully
> composed UTF-8 before comparing it to a string the user has typed.


Side note: APFS preserves normalization -- so we get both composed and
decomposed entries to compare against.  But that doesn't really affect this
feature.

For background, with either filesystem (HFS+ or APFS), macOS filenames are not the usual
opaque byte strings that they are on other platforms but rather
normalization-insensitive UTF-8 text, i.e.:
* it's not possible to have two distinct directory entries that normalize
equally
* a file can be accessed using any name that normalizes the same as the
filename
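
To make the byte-level difference concrete, here is a minimal sketch in C.
The names cafe_nfc, cafe_nfd and same_bytes are made up for illustration.
The composed and decomposed spellings of "é" are different byte strings, so
a plain strcmp treats them as different names even though macOS resolves
both to the same file:

```c
#include <string.h>

/* "é" as precomposed U+00E9 (NFC) and as "e" + U+0301 combining
   acute accent (NFD). Both render identically in a terminal. */
static const char cafe_nfc[] = "caf\xc3\xa9";
static const char cafe_nfd[] = "cafe\xcc\x81";

/* Illustrative helper: byte-wise equality, which is all that a
   normalization-unaware comparison amounts to. */
static int same_bytes(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Here same_bytes(cafe_nfc, cafe_nfd) is false, which is exactly the mismatch
a completion matcher runs into.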

> Now, the section below (in complete.c) appears to apply
> rl_filename_rewrite_hook to a string in TERMINAL representation ('filename'
> is the rightmost part of the path the user has typed):


While text literally typed in will likely be NFC, any filenames pasted into
the terminal (or placed there by glob-complete-word, etc) will retain the
normalization stored on the filesystem -- which is usually _not_ NFC. See
examples in the thread at [2].

> I struggle to find this useful and in fact think it's dangerous and should
> be backed out.


So without normalizing the input text, it's not possible to reuse filenames
read from the filesystem (`ls` output, etc.) as input to readline
completion code.  Or rather, it would be possible, but Readline normalizes
the directory entries so it only makes sense to normalize the text to match
against them as well.
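
As a rough illustration (not the actual complete.c code), the sketch below
mimics a prefix match with strncmp. toy_compose is a made-up stand-in that
only composes "e" + U+0301; a real hook would call a proper normalization
routine. is_prefix is likewise a hypothetical helper:

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for NFC normalization: rewrites only "e" + U+0301
   (bytes 65 CC 81) to U+00E9 (bytes C3 A9) and copies everything
   else unchanged. Returns malloc'd memory, or NULL on failure. */
static char *toy_compose(const char *s)
{
    size_t n = strlen(s);
    char *out = malloc(n + 1);   /* output is never longer than input */
    size_t i = 0, o = 0;

    if (out == NULL)
        return NULL;
    while (i < n) {
        if (s[i] == 'e' && i + 2 < n &&
            (unsigned char)s[i + 1] == 0xCC &&
            (unsigned char)s[i + 2] == 0x81) {
            out[o++] = (char)0xC3;
            out[o++] = (char)0xA9;
            i += 3;
        } else {
            out[o++] = s[i++];
        }
    }
    out[o] = '\0';
    return out;
}

/* Prefix test in the style of a completion matcher: does the
   directory entry start with the text being completed? */
static int is_prefix(const char *text, const char *entry)
{
    return strncmp(text, entry, strlen(text)) == 0;
}
```

With the directory entry normalized but the pasted text left decomposed,
is_prefix fails; once the text goes through the same normalization, it
matches.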

If I have an rl_filename_rewrite_hook that works in Readline 8.2, it may
> just fail in 8.3 because it is applied to a string that is not in the
> expected filesystem representation!
>
> Readline has so far worked fine in scenarios where the terminal encoding
> differs from the filesystem encoding. I can use rl_directory_rewrite_hook
> and rl_filename_stat_hook to go from terminal -> filesystem encoding, and
> rl_filename_rewrite_hook to go from filesystem -> terminal encoding. It is
> my understanding that these hooks have been added to support this use-case
> in the first place.
>

Is this an existing application or a hypothetical one? I'm not sure how
this can work as described -- rl_directory_rewrite_hook only modifies the
directory portion of the text, not the part after the final slash, and
rl_filename_stat_hook is applied only after completion matches have already
been generated.

What was missing was a way to modify the filename portion of the text
before generating completion matches. (Well, rl_filename_dequoting_function
does that, but that only gets called if the name is quoted).

Maybe a better approach is a separate variable (e.g.
rl_filename_completion_hook) to serve this purpose since an application may
want to perform different transformations on generated filenames vs input
text.
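
A very rough sketch of what I mean, using local stand-ins rather than real
Readline symbols. demo_hook_t mirrors the (name, length) shape of the
function type behind rl_filename_rewrite_hook; demo_completion_hook plays
the role of the proposed variable, and demo_apply is a hypothetical helper:

```c
#include <stddef.h>

/* Same shape as the function type readline uses for
   rl_filename_rewrite_hook: (name, length in bytes) -> converted name.
   By the documented convention, a hook returns either its argument
   unchanged or a newly allocated string. */
typedef char *demo_hook_t(char *, int);

/* Local stand-ins, NOT readline symbols. demo_completion_hook stands in
   for the proposed (hypothetical) rl_filename_completion_hook. */
static demo_hook_t *demo_rewrite_hook = NULL;    /* applied to directory entries */
static demo_hook_t *demo_completion_hook = NULL; /* applied to the input text */

/* Run a hook if one is installed; otherwise pass the name through. */
static char *demo_apply(demo_hook_t *hook, char *name, int len)
{
    return hook ? hook(name, len) : name;
}
```

An application that only wants the existing behaviour would leave the
second variable unset, so its rewrite hook never sees terminal-side text.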

[1]: https://lists.gnu.org/archive/html/bug-bash/2023-07/msg00050.html
[2]: https://lists.gnu.org/archive/html/bug-bash/2023-07/msg00081.html
