On Oct 5, 2012, at 5:30 AM, Joerg Schilling wrote:

> Joerg Schilling <joerg.schill...@fokus.fraunhofer.de> wrote:
>
>>> As mentioned there, I'm also missing documentation for
>>> star's xattr format. In particular, I'm not clear how it handles
>>> non-ASCII bytes in the attributes name.
>>
>> Thank you for this hint, I thought that I did document everything.....
>
> Mmmmm, could you explain, what information you are missing in this text:
>
>     SCHILY.xattr.attr
>         A POSIX.1-2001 coded version of the Linux extended file
>         attributes. Linux extended file attributes are
>         name/value pairs. Every attribute name results in a
>         SCHILY.xattr.name tag and the value of the extended
>         attribute is used as the value of the POSIX.1-2001
>         header tag. Note that this way of coding is not portable
>         across platforms. A version for BSD may be
>         created but Solaris includes far more features with
>         extended attribute files than Linux does.
>
>         A future version of star will implement a similar
>         method as the tar program on Solaris currently uses.
>         When this implementation is ready, the
>         SCHILY.xattr.name feature may be removed in favor of a
>         truly portable implementation that supports Solaris
>         also.

What do you do if the name includes byte 0x3d ('=')?

> BTW: converting name/value pairs with unknown meaning into something
> different (as UTF-8) may cause problems …

This is why bsdtar URL-encodes the bytes of the xattr property name
when building the pax extended header record name. The result is
ASCII and therefore won't be damaged if the reader does any sort of
character set translation.

> … and is not needed as the meta data in extended
> tar headers is binary clean due to it's length tag.

POSIX.1-2008 is very clear: "The <keyword> fields can be any UTF-8
characters." The *value* is allowed to be binary in specific
instances (gname, uname, path, and link path in the presence of a
hdrcharset header), but the keyword should always be UTF-8.

Tim
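
For illustration only, here is a minimal Python sketch of the URL-encoding idea described above. It is not bsdtar's actual code. The function names and the `EXAMPLE.xattr.` keyword prefix are hypothetical, chosen just for this example. It shows how percent-encoding the raw bytes of an attribute name gives a pure-ASCII keyword in which 0x3d can no longer be confused with the '=' separator of a pax record, and how the leading length field counts the whole record, including its own digits.

```python
from urllib.parse import quote, unquote_to_bytes

# Hypothetical keyword prefix, used only for this sketch.
PREFIX = "EXAMPLE.xattr."


def encode_xattr_name(name: bytes) -> str:
    """Percent-encode every byte outside the unreserved ASCII set.

    With safe="" the bytes '=' (0x3d), '%', '/' and all non-ASCII
    bytes are written as %XX, so the result is plain ASCII.
    """
    return quote(name, safe="")


def decode_xattr_name(encoded: str) -> bytes:
    """Recover the original attribute-name bytes."""
    return unquote_to_bytes(encoded)


def pax_record(keyword: str, value: bytes) -> bytes:
    """Build one pax extended header record: "<len> <keyword>=<value>\\n".

    <len> is the decimal byte length of the entire record, including
    the length digits themselves, so it is found by iterating until
    the digit count stops changing.
    """
    body = b" " + keyword.encode("utf-8") + b"=" + value + b"\n"
    length = len(body)
    while True:
        total = len(body) + len(str(length))
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


if __name__ == "__main__":
    raw_name = b"user.a=b"  # attribute name containing byte 0x3d
    keyword = PREFIX + encode_xattr_name(raw_name)
    print(keyword)
    print(pax_record(keyword, b"v"))
```

Because the encoded keyword contains no literal '=', a reader can split each record at the first '=' and then decode the keyword back to the original bytes.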