If I have an untrusted byte stream (MP3 and MP4 tags) that I'm using to
create file names, I should do this before allowing the file to be created:
1. truncate the name to 255 bytes (the limit is counted in bytes, not
characters, so multi-byte UTF-8 characters use it up faster)
2. replace the characters '\', '\t', '\n', '\r', ':', and '/' with a
substitute character, say '_'
3. replace byte sequences that are not valid UTF-8 with a substitute
character, say '_'
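
The three steps above could be sketched roughly as follows. This is only a minimal sketch, not a vetted implementation: `sanitize_filename`, `NAME_MAX`, and `FORBIDDEN` are hypothetical names, the 255-byte limit is assumed rather than queried from the filesystem, and NUL is added to the forbidden set on my own initiative. Truncation is done last here so that the substitutions cannot push the name back over the limit and so that a multi-byte character is never cut in half.

```python
NAME_MAX = 255  # assumed per-name byte limit; not queried from the filesystem
FORBIDDEN = set('\\\t\n\r:/\0')  # the characters from step 2, plus NUL


def sanitize_filename(raw: bytes, substitute: str = "_") -> str:
    """Turn untrusted tag bytes into a conservative file name (hypothetical helper)."""
    # Step 3: decode as UTF-8; undecodable bytes come back as U+FFFD.
    text = raw.decode("utf-8", errors="replace")
    # Steps 2 and 3: swap forbidden characters and U+FFFD for the substitute.
    cleaned = "".join(
        substitute if ch in FORBIDDEN or ch == "\ufffd" else ch for ch in text
    )
    # Step 1, done last: cut to NAME_MAX bytes; errors="ignore" drops the
    # tail of a multi-byte character that the cut would otherwise split.
    encoded = cleaned.encode("utf-8")[:NAME_MAX]
    name = encoded.decode("utf-8", errors="ignore")
    # The empty string, "." and ".." cannot be used as new file names.
    if name in ("", ".", ".."):
        name = substitute
    return name
```

For example, `sanitize_filename(b"AC/DC: Back\tin Black")` gives `"AC_DC_ Back_in Black"`. A tag that legitimately contains U+FFFD would also have it replaced, which seems acceptable for this purpose.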

I'm basing that process on these assumptions:
1. A *nix file system (ext2, ext3, ext4, zfs) stores a name as an array of
bytes.
2. Of those bytes, the only ones that CANNOT be part of the file name are
'/' and the NUL byte (and ':' on mac and zfs, I believe; Windows rejects
'\', ':', '*', '?', '"', '<', '>', and '|', but will handle '`', '!', '#',
etc)
3. '\', '\t', '\n', '\r', and even the terminal bell ('\a'), although
annoying, are all valid bytes in a file name.
4. The filesystem itself does not decode the name; programs that display it
(ls, file managers) will try to decode it as UTF-8 if they can, but
otherwise just show <?> to signify an unrecognized byte sequence.
5. *, ?, ', ", `, !, #, and all other characters are valid bytes for a
file name.
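
One way to check these assumptions on a particular filesystem is simply to try creating the files. The sketch below is an assumption-laden probe rather than a definitive test: `probe_names` is a hypothetical helper, it only exercises whatever filesystem backs the system temp directory, and a filesystem that normalizes or case-folds names may report a name as missing even though a file was created.

```python
import os
import tempfile


def probe_names(candidates):
    """Try to create one empty file per candidate name (given as bytes).

    Returns a dict mapping each candidate to True if a file with exactly
    that name shows up in the directory listing afterwards.
    """
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        tmp_bytes = os.fsencode(tmp)
        for name in candidates:
            try:
                # Byte paths avoid any decoding on the Python side.
                with open(os.path.join(tmp_bytes, name), "wb"):
                    pass
            except (OSError, ValueError):
                # OSError: the OS refused the path; ValueError: embedded NUL.
                pass
            results[name] = name in os.listdir(tmp_bytes)
    return results
```

Calling it with names such as `b"tab\there"`, `b"star*?"`, `b"a/b"`, and `b"\xff\xfe"` shows which ones the local filesystem accepts. A '/' is always treated as a path separator, so `b"a/b"` never becomes a single file name.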

AJ ONeal
--------------------
BYU Unix Users Group 
http://uug.byu.edu/ 
