On Wed, May 15, 2013 at 07:52:30AM -0600, Todd C. Miller wrote:
> On Wed, 15 May 2013 16:16:53 +0300, Arto Jonsson wrote:
>
> > I asked stsp@ about the multibyte support yesterday and it was his
> > opinion that it's not currently needed.
>
> Seemed like it might be useful for the future but I suppose we can
> add things like this when we have better multibyte support.

Just to clarify: In my opinion multibyte support is needed in general.
But we don't seem to have a clear consensus on which tools in the base
system should support multibyte characters, and in what way.

The biggest open question is where to draw the line between tools that
do support multibyte and those that don't. There was a discussion about
adding such support to ls(1). It ended with the question of whether
adding this to ls means that virtually every other base tool that
outputs data in column-aligned tables (e.g. df, du, ...) needs similar
modifications, for consistency.

Doing this consistently is a lot of work, more than I have time to do.
And it would cause a lot of code churn by introducing use of wchar_t
and/or UTF-8 handling in many places. Churn isn't very popular and
requires lots of review to catch new bugs.

So currently, most tools in base do not support multibyte, with a few
exceptions which were mostly imported from or based on third-party
software.

However, nl(1) is a new tool, and it doesn't seem to format data in
columns, so it could be considered a separate case.

Still, I'm not sure what the project generally wants. There are
contradictory opinions depending on who you ask. Some prefer concise
code in base that only deals with ASCII, others would like multibyte
features in base which are currently only available in ports.

So I told Arto, who asked me about this, that I don't think adding
multibyte to his nl diff is a very important thing to do at present.
But now that the work has already been done, I don't see why not.