You'll get more predictable results by focusing on what you should allow than 
by focusing on what you should strip.
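
A minimal sketch of the allowlist idea in Python, under a few assumptions that are mine and not from your message: the allowed set is ASCII letters, digits, space, and a handful of punctuation marks; the function name safe_filename is made up for illustration; and the limit is taken as 255 bytes, which is what ext2/3/4 enforce per name. Anything outside the allowed set, including bytes that don't decode as UTF-8, becomes '_'.

```python
import re

# Hypothetical allowlist: ASCII letters, digits, space, and . _ ( ) -
# Everything else is replaced, so '/', ':', '\\', control characters
# and undecodable bytes never need to be listed one by one.
_DISALLOWED = re.compile(r"[^A-Za-z0-9 ._()-]")

MAX_BYTES = 255  # assumed per-name limit (true for ext2/3/4)


def safe_filename(raw: bytes, substitute: str = "_") -> str:
    """Turn untrusted tag bytes into a conservative file name."""
    # Invalid UTF-8 sequences decode to U+FFFD, which is not allowed
    # and therefore gets substituted in the next step.
    text = raw.decode("utf-8", errors="replace")
    text = _DISALLOWED.sub(substitute, text)
    # The result is pure ASCII, so characters and bytes are the same count.
    text = text[:MAX_BYTES]
    # Drop leading/trailing dots and spaces so '.' and '..' cannot survive.
    text = text.strip(" .")
    return text or substitute


if __name__ == "__main__":
    print(safe_filename(b"AC/DC: Back\tin Black.mp3"))
```

This is deliberately stricter than it has to be: accented letters and other non-ASCII text are replaced too. If you want to keep those, widen the allowed set, but then truncate on the encoded byte length and not on the character count.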

Richard

On Sat 12 December 2009 11:14:48 AJ ONeal <[email protected]> wrote:
> If I have an untrusted byte-stream (mp3 and mp4 tags) which I'm using to
> create file names I should do this before allowing the file to be created:
> 1. truncate the name to 255 characters
> 2. replace the characters '\', '\t', '\n', '\r', ':', and '/' with a
> substitute character, say '_'
> 3. replace non-utf8 characters with a substitute character, say '_'
> 
> I'm basing that process on these assumptions:
> 1. A *nix file system (ext2, ext3, ext4, zfs) stores a name as an array of
> bytes.
> 2. Of those bytes, the only one that CANNOT be part of the file name is '/'
> (and ':' on Mac and zfs, I believe, and Windows doesn't like '\\' but will
> handle '*', '?', '`', etc)
> 3. \\, \t, \n, \r, the bell sound of the terminal, although annoying, are
> all valid byte strings for a filename.
> 4. The filesystem will try to decode the filename as utf8 if it can, but
> otherwise just show <?> to signify an unrecognized byte-sequence.
> 5. *, ?, ', ", `, !, #, all other characters are all valid bytes for a
> filename
> 
> AJ ONeal
--------------------
BYU Unix Users Group 
http://uug.byu.edu/ 

The opinions expressed in this message are the responsibility of their
author.  They are not endorsed by BYU, the BYU CS Department or BYU-UUG. 
___________________________________________________________________
List Info (unsubscribe here): http://uug.byu.edu/mailman/listinfo/uug-list
