If I have an untrusted byte stream (MP3 and MP4 tags) that I'm using to create file names, I should do the following before allowing the file to be created:

1. Truncate the name to 255 bytes.
2. Replace the characters '\', '\t', '\n', '\r', ':', and '/' with a substitute character, say '_'.
3. Replace byte sequences that are not valid UTF-8 with a substitute character, say '_'.
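Here is a minimal Python sketch of those three steps, just to make the idea concrete. The names `sanitize_filename`, `FORBIDDEN`, `SUBSTITUTE`, and `MAX_NAME_BYTES` are my own, not from any library. It makes a few assumptions beyond the list above: it also replaces the NUL byte, it maps the degenerate names "", ".", and ".." to the substitute, and it truncates last rather than first so that the cut never leaves half of a multi-byte UTF-8 character behind.

```python
# Characters to replace: backslash, tab, newline, carriage return,
# colon, slash, plus NUL (NUL is an addition, not in the original list).
FORBIDDEN = '\\\t\n\r:/\0'
SUBSTITUTE = "_"
MAX_NAME_BYTES = 255  # assumed per-name limit, counted in bytes

_TABLE = str.maketrans({ch: SUBSTITUTE for ch in FORBIDDEN})


def sanitize_filename(raw: bytes) -> str:
    """Turn untrusted tag bytes into a conservative file name."""
    # Invalid UTF-8 sequences decode to U+FFFD, which is then swapped
    # for the substitute. A genuine U+FFFD in the input is swapped too.
    text = raw.decode("utf-8", errors="replace").replace("\ufffd", SUBSTITUTE)

    # Replace the forbidden characters one for one.
    text = text.translate(_TABLE)

    # Truncate by bytes; errors="ignore" drops a character that the
    # slice cut in half, so the result is still valid UTF-8.
    encoded = text.encode("utf-8")[:MAX_NAME_BYTES]
    text = encoded.decode("utf-8", errors="ignore")

    # Assumption: empty, "." and ".." are not usable as file names.
    if text in ("", ".", ".."):
        return SUBSTITUTE
    return text


print(sanitize_filename(b"AC/DC: Back\tIn Black.mp3"))
# prints: AC_DC_ Back_In Black.mp3
```

Truncating after the replacements is a design choice: every substitution here yields a single one-byte '_', so the name can only get shorter, and the final byte count is the one that actually reaches the file system.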
I'm basing that process on these assumptions:

1. A *nix file system (ext2, ext3, ext4, ZFS) stores a name as an array of bytes.
2. Of those bytes, the only ones that CANNOT be part of a file name are '/' and the NUL byte (and ':' on Mac and ZFS, I believe). Windows is stricter: it rejects '\', '*', '?', and a few others, but will handle '`'.
3. '\\', '\t', '\n', '\r', and the terminal bell character, although annoying, are all valid bytes in a file name.
4. Whatever displays the file name (the shell, a file manager) will try to decode it as UTF-8 if it can, but otherwise just shows something like <?> to signify an unrecognized byte sequence.
5. '*', '?', ''', '"', '`', '!', '#', and all other characters are valid bytes in a file name.

AJ ONeal
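P.S. Assumption 4 can be illustrated without touching the disk. This is only a sketch of what a display tool might do: `display_name` is a hypothetical helper, and real tools such as ls pick their own placeholder for bytes they cannot decode.

```python
def display_name(raw: bytes) -> str:
    """Decode a file name for display, marking undecodable bytes."""
    # errors="replace" substitutes U+FFFD (the replacement character)
    # for every byte sequence that is not valid UTF-8.
    return raw.decode("utf-8", errors="replace")


valid = display_name("na\u00efve.mp3".encode("utf-8"))  # decodes cleanly
broken = display_name(b"caf\xe9.mp3")  # 0xE9 is Latin-1, not valid UTF-8 here
```

The bytes on disk stay exactly as they were written; only the displayed string differs.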
--------------------
BYU Unix Users Group http://uug.byu.edu/
