On Sat, Nov 14, 2015 at 4:14 AM, Yuri Pankov <yuri.pan...@nexenta.com> wrote:
> I'm trying to understand the idea behind the "normalization" property in
> ZFS.
>
> What's the original idea behind the normalization when "normalization" is
> set to "none" - is it "Or we could choose to be normalization-insensitive
> on LOOKUP and normalization-preserving on CREATE." as described in [1]?

According to the zfs.1m manpage: "File names are always stored unmodified, names are normalized as part of any comparison process."

In general, normalization and casesensitivity work similarly: we always store the specified bytes, but depending on the settings, some byte sequences may be considered "identical" from the point of view of lookup and create operations (in terms of determining whether an entry exists). Therefore:

 - when you list the entries, you will always see the byte sequence you used to create a file.
 - when you look up a byte sequence, it may match a file whose name is a different byte sequence, but which is considered equivalent according to the normalization and casesensitivity properties. (E.g. with casesensitivity=insensitive, if there is a file named "foo" and you look up "Foo", it will match the existing file.)
 - when you create a file, it may fail with EEXIST if there is a file with a name that is equivalent according to the normalization and casesensitivity properties.

normalization=none means that we do not do normalization, so even if two characters look the same, if they use different byte sequences, they will be considered distinct. (Analogous to casesensitivity=sensitive.)

Hopefully the answers to your specific questions below are obvious given the above principles:

> When comparing filenames for other "normalization" values, which part of
> the comparison do we normalize - the stored filename, or the one in lookup
> request?

We normalize on lookup.
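
To make the list/lookup/create principles above concrete, here is a minimal Python sketch of a toy directory. It is not how ZFS is implemented; the `Directory` class and its method names are hypothetical, Python's `unicodedata.normalize` stands in for the normalization forms, and `str.casefold` is only a stand-in for whatever case folding casesensitivity=insensitive actually uses. The one point it tries to show is that the stored bytes are never modified, while existence checks go through a derived comparison key.

```python
import unicodedata


class Directory:
    """Toy directory: names are stored as given, compared via a derived key."""

    def __init__(self, normalization=None, case_insensitive=False):
        # normalization: None (like normalization=none) or a form name
        # understood by unicodedata.normalize, e.g. "NFC" or "NFD".
        self.normalization = normalization
        self.case_insensitive = case_insensitive
        self._entries = {}  # comparison key -> name bytes exactly as created

    def _key(self, name: bytes) -> bytes:
        text = name.decode("utf-8")  # assumes valid UTF-8, as with utf8only=on
        if self.normalization is not None:
            text = unicodedata.normalize(self.normalization, text)
        if self.case_insensitive:
            text = text.casefold()  # stand-in, not ZFS's actual case folding
        return text.encode("utf-8")

    def create(self, name: bytes) -> None:
        key = self._key(name)
        if key in self._entries:
            raise FileExistsError(name)  # plays the role of EEXIST
        self._entries[key] = name  # stored unmodified

    def lookup(self, name: bytes) -> bytes:
        # Returns the stored name; raises KeyError (think ENOENT) if absent.
        return self._entries[self._key(name)]

    def listing(self):
        # Listing always shows the bytes used at creation time.
        return list(self._entries.values())
```

For example, with `Directory(case_insensitive=True)`, creating `b"foo"` and then looking up `b"Foo"` returns `b"foo"`, and a second `create(b"Foo")` raises `FileExistsError`. With `Directory()` (no normalization, case sensitive) every distinct byte sequence is a distinct name.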
>
> Currently I'm seeing that "normalization-preserving on CREATE" part is
> there, but "normalization-insensitive on LOOKUP" is not:
> # zfs create -o mountpoint=/norm/n -o utf8only=on -o normalization=none rpool/formN

That's because you requested that it not be, by setting normalization=none.

> # cd /norm/n
> # touch $( echo "\xc3\xbc" )
> # touch $( echo "\x75\xcc\x88" )
> # ls
> ü ü
> # LC_ALL=C ls -b
> u\314\210 \303\274
>
> What of the following is correct per design, not as currently implemented
> (given we have the "same" filename with "ü" character in NFC and NFD forms
> as "fileC" and "fileD"):
>
> A. for all normalization settings the filename itself is NOT modified.

Correct.

> B. normalization=none
> - creating either of fileC OR fileD is OK, creating another form when one
> exists is NOT.

Incorrect. The names are not equivalent according to normalization=none, so you can create both names.

> C. normalization=formC
> - creating either of fileC OR fileD is OK

Correct.

> - C1. fileC exists, creating fileD is OK;

Incorrect. These names are equivalent according to normalization=formC, so you will get EEXIST when creating fileD.

> fileD exists, creating fileC isn't OK - normalizing stored filename.

Correct, creating fileC will get EEXIST.

> - OR
> - C2. fileC exists, creating fileD isn't OK;

Correct, creating fileD will get EEXIST.

> fileD exists, creating fileC is OK - normalizing the looked up filename.

Incorrect, creating fileC will get EEXIST.

> D. normalization=formD, same as C, swapping the fileC and fileD.

Same as with formC, because you said the names are equivalent under both formC and formD.

> 1. https://blogs.oracle.com/nico/entry/filesystem_i18n

That post seems accurate to me.
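
The case analysis for B through D can be checked mechanically. The sketch below is only an illustration in Python, not anything from ZFS: `FILE_C`, `FILE_D`, `equivalent` and `second_create` are hypothetical names, and `unicodedata.normalize` with "NFC"/"NFD" is assumed to correspond to formC/formD. The two byte strings are the ones from the `touch` commands quoted above, and the octal escapes in the comments are what `ls -b` printed for them.

```python
import unicodedata

FILE_C = b"\xc3\xbc"      # U+00FC, precomposed form (NFC); ls -b: \303\274
FILE_D = b"\x75\xcc\x88"  # "u" + U+0308 combining diaeresis (NFD); ls -b: u\314\210


def equivalent(a: bytes, b: bytes, form) -> bool:
    """Compare two UTF-8 names under a normalization form (None = raw bytes)."""
    if form is None:
        return a == b

    def norm(raw: bytes) -> str:
        return unicodedata.normalize(form, raw.decode("utf-8"))

    return norm(a) == norm(b)


def second_create(existing: bytes, new: bytes, form) -> str:
    """Outcome of creating `new` while `existing` is already in the directory."""
    return "EEXIST" if equivalent(existing, new, form) else "ok"


# Assumed mapping from the ZFS property values to Python form names.
for label, form in [("none", None), ("formC", "NFC"), ("formD", "NFD")]:
    print(label,
          "| C then D:", second_create(FILE_C, FILE_D, form),
          "| D then C:", second_create(FILE_D, FILE_C, form))
```

Under these assumptions, normalization=none allows both names in either order, while formC and formD both reject the second create in either order. The order of creation does not matter, because the two names either are equivalent under the chosen setting or are not.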
--matt
_______________________________________________
developer mailing list
developer@open-zfs.org
http://lists.open-zfs.org/mailman/listinfo/developer