On Sun, Mar 24, 2013 at 08:01:03PM +0900, Charles Plessy wrote:
> after more than one month of discussion, we have not reached a conclusion.

Thanks for the ping.

> In the current situation there is no policy, which means that everything is
> allowed. Indeed, there is at least one package with filenames using more than
> one set of non-ASCII characters, so no user can see correctly the names of
> every file in this package at the same time.

Some more data here. I checked sid main amd64 binary packages. The only ones
containing invalid UTF-8 sequences (and thus violating the current proposal)
would be aspell-is and jpilot. This suggests that UTF-8 is already a de facto
standard. Fixing two packages shouldn't be that hard. I have filed a wishlist
bug #704446 against lintian to check for this regardless of the outcome of
this bug.

> On my side, I made a proposal with actionable items: fix the few packages that
> are not using UTF-8, and modify the Policy to reflect the current practice
> of using ASCII in most of the times and other UTF-8 characters parcimoniously.

I am in favour of this solution.

 * Requiring any subset of UTF-8 has the direct benefit of being able to
   interpret all filenames used without guesswork.
 * This is in line with Fedora's policy.
 * I saw very little disagreement about whether to permit non-UTF-8 sequences.
   Discussion seemed mostly to be around which subset to require.

> I understand very well the arguments against having any UTF-8 character at
> all, but we currently have such packages in our archive, so if there is no
> plan to modify these packages, then we can not plan to solve this bug.

I see little benefit in restricting to ASCII compared to the benefit of
restricting to UTF-8. Remember that the goal of this bug was to make filenames
machine readable. I think that further restrictions should happen in the
context of #99933. I asked that these issues not be merged, because I would
like to keep the scope of this issue limited and thus implementable.

> Can others comment how they would like to see this bug solved ?

Any proposal that limits to a subset of UTF-8 and a superset of printable ASCII
is fine with me. My preferred choice would be just UTF-8. I have no objections
to recommending the use of a subset of printable ASCII either.

To me it appears to be a matter of wording right now. Consensus is basically
there. Implementing it would cause two policy violations (aspell-is and
jpilot), which imo is little impact.

Helmut
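
For illustration only, the kind of check discussed above (does a filename
consist of valid UTF-8 sequences?) could be sketched roughly as follows. This
is not the script used for the archive scan and not the lintian check requested
in #704446; the function names are hypothetical, and how the raw filenames are
extracted from a binary package is deliberately left out.

```python
def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode as strict UTF-8."""
    try:
        name.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def invalid_names(names):
    """Return the filenames (as bytes) that are not valid UTF-8.

    `names` is assumed to be an iterable of raw filenames as bytes,
    e.g. as read from a package's file listing.
    """
    return [n for n in names if not is_valid_utf8(n)]


if __name__ == "__main__":
    # Made-up sample names: the second one is Latin-1 encoded, not UTF-8.
    sample = [b"usr/share/doc/caf\xc3\xa9", b"usr/share/doc/caf\xe9"]
    for bad in invalid_names(sample):
        print("not valid UTF-8:", bad)
```

Working on bytes rather than decoded strings matters here: once a name has been
decoded with some assumed encoding, the information about whether it was valid
UTF-8 in the first place may already be lost.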