Re: suggestions please what should i watch for/guard against' in a file upload situation?
On 06/10/2010 21:01, Martin Gregorie wrote:
> On Wed, 06 Oct 2010 09:02:21 -0700, geekbuntu wrote:
>> in general, what are things i would want to 'watch for/guard against' in a file upload situation?
>>
>> i have my file upload working (in the self-made framework @ work without any concession for multipart form uploads), but was told to make sure it's cleansed and cannot do any harm inside the system.
>
> Off the top of my head, and assuming that you get passed the exact filename that the user entered:
>
> - The user may need to use an absolute pathname to upload a file that isn't in his current directory, so retain only the basename by discarding the rightmost slash and everything to the left of it:
>     /home/auser/photos/my_photo.jpg ===> my_photo.jpg
>     c:\My Photos\My Photo.jpg ===> My Photo.jpg
>
> - If your target system doesn't like spaces in names or you want to be on the safe side there, replace spaces in the name with underscores:
>     My Photo.jpg ===> My_Photo.jpg
>
> - reject any filenames that could cause the receiving system to do dangerous things, e.g. .EXE or .SCR if the upload target is Windows. This list will be different for each upload target, so make it configurable. You can't assume anything else about the extension. .py .c .txt and .html are all valid in the operating systems I use and so are their capitalised equivalents.

A whitelist is better than a blacklist; instead of rejecting what you know could be dangerous, accept what you know _isn't_ dangerous.

> - check whether the file already exists. You need rules about what to do if it exists (do you reject the upload, silently overwrite, or alter the name, e.g. by adding a numeric suffix to make the name unique):
>     my_photo.jpg ===> my_photo-01.jpg
>
> - run the application in your upload target directory and put the uploaded file there or, better, into a configured uploads directory by prepending it to the file name:
>     my_photo.jpg ===> /home/upload_user/uploads/my_photo.jpg
>
> - make sure you document the process so that a user can work out what has happened to his file and why if you have to reject it or alter its name.
>
>> not sure but any suggestions or examples are most welcome :)
>
> There's probably something I've forgotten, but that list should get you going.

Maximum file size, perhaps?

-- http://mail.python.org/mailman/listinfo/python-list
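The filename steps in the checklist above (keep only the basename, replace spaces, accept only whitelisted extensions, add a numeric suffix on a clash) can be sketched in Python roughly as follows. This is a minimal sketch, not code from the thread: the names `ALLOWED_EXTENSIONS`, `clean_filename` and `unique_name` are made up for illustration, and the extension set is just an example of a configurable whitelist.

```python
import os
import re

# Hypothetical, configurable whitelist; the thread only says "like jpg gif etc".
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png"}


def clean_filename(raw_name):
    """Return a safe basename for raw_name, or raise ValueError."""
    # Treat both "/" and "\" as separators, whatever OS the server runs on,
    # and keep only what follows the rightmost one.
    base = re.split(r"[/\\]", raw_name)[-1]
    base = base.strip().replace(" ", "_")
    stem, ext = os.path.splitext(base)
    # Whitelist: accept only extensions known to be wanted.
    if not stem or ext.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("file name not accepted: %r" % raw_name)
    return base


def unique_name(name, existing):
    """Append -01, -02, ... until name is not in the collection `existing`."""
    if name not in existing:
        return name
    stem, ext = os.path.splitext(name)
    n = 1
    while True:
        candidate = "%s-%02d%s" % (stem, n, ext)
        if candidate not in existing:
            return candidate
        n += 1
```

Checking `existing` and then writing the file leaves a window in which two uploads can pick the same name, so real code would probably also open the target file in exclusive-create mode rather than rely on the check alone.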
Re: suggestions please what should i watch for/guard against' in a file upload situation?
On 10/6/2010 12:02 PM, geekbuntu wrote:
> in general, what are things i would want to 'watch for/guard against' in a file upload situation?
>
> i have my file upload working (in the self-made framework @ work without any concession for multipart form uploads), but was told to make sure it's cleansed and cannot do any harm inside the system.
>
> my checklist so far is basically to check the extension - ensure it has 3 places, ensure it's in the allowed list (like jpg gif etc...).
>
> not sure what else i could do to guard against anything bad happening. maybe the file name itself could cause grief?
>
> not sure but any suggestions or examples are most welcome :)

I am not sure whether anyone mentioned limiting the file size, checking the incoming header, and aborting an upload if it goes over anyway. Most sites do not want 10 gigabyte files ;-).

--
Terry Jan Reedy
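"Aborting an upload if it goes over anyway" can be done by counting bytes while copying, whatever the header claimed. The following is a minimal sketch under assumptions: `MAX_UPLOAD_BYTES`, `UploadTooLarge` and `copy_with_limit` are invented names, and `src` and `dst` are any binary file-like objects.

```python
import io

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # hypothetical limit, pick your own


class UploadTooLarge(Exception):
    """Raised when the body turns out to be larger than allowed."""


def copy_with_limit(src, dst, limit=MAX_UPLOAD_BYTES, chunk_size=64 * 1024):
    """Copy src to dst in chunks; raise UploadTooLarge once limit is passed."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return total  # number of bytes copied
        total += len(chunk)
        if total > limit:
            # Stop before writing the chunk that crosses the limit.
            raise UploadTooLarge("upload exceeds %d bytes" % limit)
        dst.write(chunk)


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 1000)
    sink = io.BytesIO()
    print(copy_with_limit(source, sink, limit=4096))
```

The caller is still responsible for deleting whatever was partly written to `dst` when the exception is raised.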
suggestions please what should i watch for/guard against' in a file upload situation?
in general, what are things i would want to 'watch for/guard against' in a file upload situation?

i have my file upload working (in the self-made framework @ work without any concession for multipart form uploads), but was told to make sure it's cleansed and cannot do any harm inside the system.

my checklist so far is basically to check the extension - ensure it has 3 places, ensure it's in the allowed list (like jpg gif etc...).

not sure what else i could do to guard against anything bad happening. maybe the file name itself could cause grief?

not sure but any suggestions or examples are most welcome :)
Re: suggestions please what should i watch for/guard against' in a file upload situation?
On 2010-10-06, geekbuntu gmi...@gmail.com wrote:
> in general, what are things i would want to 'watch for/guard against' in a file upload situation?

This question has virtually nothing to do with Python, which means you may not get very good answers.

> my checklist so far is basically to check the extension - ensure it has 3 places, ensure it's in the allowed list (like jpg gif etc...).

This strikes me as 100% irrelevant. Who cares what the extension is?

> not sure what else i could do to guard against anything bad happening. maybe the file name itself could cause grief?

Obvious things:

* File name causes files to get created outside some particular upload directory (../foo)
* File name has spaces
* Crazy stuff like null bytes in file name
* File names which might break things if a user carelessly interacts with them, such as "foo.jpg /etc/passwd bar.jpg" (all one file name including two spaces).

Basically, the key question is, could a hostile user come up with input to your script which could break something?

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet-nos...@seebs.net
http://www.seebs.net/log/ -- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) -- get educated!
I am not speaking for my employer, although they do rent some of my opinions.
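The first and third items (names like ../foo that escape the upload directory, and null bytes) can be checked by resolving the final path and confirming it still lies under the upload directory. A minimal sketch, assuming a helper name `safe_join` that is not from the thread; `os.path.realpath` and `os.path.commonpath` are standard library calls.

```python
import os


def safe_join(upload_dir, filename):
    """Return the resolved path for filename inside upload_dir, or raise ValueError."""
    if "\x00" in filename:
        raise ValueError("NUL byte in file name")
    root = os.path.realpath(upload_dir)
    # realpath collapses ".." components and follows symlinks before we compare.
    target = os.path.realpath(os.path.join(root, filename))
    if target == root or os.path.commonpath([root, target]) != root:
        raise ValueError("file name escapes the upload directory: %r" % filename)
    return target
```

This check only enforces containment, so a name containing a slash may still resolve to a subdirectory of the upload directory; reducing the name to its basename first closes that off as well.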
Re: suggestions please what should i watch for/guard against' in a file upload situation?
On 10/06/10 12:14, Seebs wrote:
>> not sure what else i could do to guard against anything bad happening. maybe the file name itself could cause grief?
>
> Obvious things:
>
> * File name causes files to get created outside some particular upload directory (../foo)
> * File name has spaces
> * Crazy stuff like null bytes in file name
> * File names which might break things if a user carelessly interacts with them, such as "foo.jpg /etc/passwd bar.jpg" (all one file name including two spaces).

And depending on the system, Win32 chokes on filenames like nul, con, com1...comN, lpt1...lptN, and a bunch of others.

-tkc
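A check for those Win32 device names might look like the sketch below. The set is the commonly cited subset (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9) and is an assumption rather than a complete list; Microsoft's file-naming documentation is the authority. The function name is made up for illustration.

```python
# Commonly cited Win32 device names; treat this set as an assumption and
# check Microsoft's documentation for the authoritative list.
RESERVED_WINDOWS_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {"COM%d" % i for i in range(1, 10)}
    | {"LPT%d" % i for i in range(1, 10)}
)


def is_reserved_windows_name(filename):
    """True if filename, with or without extensions, names a Win32 device."""
    # "nul.txt" and "nul.tar.gz" are as troublesome as "nul", so compare
    # only the part before the first dot, case-insensitively.
    stem = filename.split(".", 1)[0]
    return stem.rstrip(" ").upper() in RESERVED_WINDOWS_NAMES
```

For example, `is_reserved_windows_name("com1.jpg")` is true while `is_reserved_windows_name("console.jpg")` is false, because only the whole stem is compared.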
Re: suggestions please what should i watch for/guard against' in a file upload situation?
Seebs usenet-nos...@seebs.net writes:
> On 2010-10-06, geekbuntu gmi...@gmail.com wrote:
>> in general, what are things i would want to 'watch for/guard against' in a file upload situation?
>
> This question has virtually nothing to do with Python, which means you may not get very good answers.

In contrast to comp.super.web.experts? There are quite a few people with web-experience here I'd say.

>> my checklist so far is basically to check the extension - ensure it has 3 places, ensure it's in the allowed list (like jpg gif etc...).
>
> This strikes me as 100% irrelevant. Who cares what the extension is?

Given that most people are not computer savvy (always remember, the default for windows is to hide extensions..), using it client-side can be valuable to prevent long uploads that eventually need to be rejected otherwise (no mom, you can't upload word-docs as profile pictures).

>> not sure what else i could do to guard against anything bad happening. maybe the file name itself could cause grief?
>
> Obvious things:
>
> * File name causes files to get created outside some particular upload directory (../foo)

Or rather just store that as a simple meta-info, as allowing even the best-intended me-in-cool-pose.jpg to overwrite that of the one other cool guy using the website isn't gonna fly anyway.

> * File name has spaces

See above, but other than that - everything but shell-scripts deals well with it.

> * Crazy stuff like null bytes in file name
> * File names which might break things if a user carelessly interacts with them, such as "foo.jpg /etc/passwd bar.jpg" (all one file name including two spaces).

Your strange focus on file-names that are pure meta information is a little bit concerning...

> Basically, the key question is, could a hostile user come up with input to your script which could break something?

Certainly advice. But that's less focussed on filenames or file-uploads, but on the whole subject of processing HTTP requests. Which would make a point for *not* using a home-grown framework. But then, Python is a bit less likely to suffer from buffer overflow or similar kinds of attacks.

Diez
Re: suggestions please what should i watch for/guard against' in a file upload situation?
On Wed, 06 Oct 2010 09:02:21 -0700, geekbuntu wrote:
> in general, what are things i would want to 'watch for/guard against' in a file upload situation?
>
> i have my file upload working (in the self-made framework @ work without any concession for multipart form uploads), but was told to make sure it's cleansed and cannot do any harm inside the system.

Off the top of my head, and assuming that you get passed the exact filename that the user entered:

- The user may need to use an absolute pathname to upload a file that isn't in his current directory, so retain only the basename by discarding the rightmost slash and everything to the left of it:
    /home/auser/photos/my_photo.jpg ===> my_photo.jpg
    c:\My Photos\My Photo.jpg ===> My Photo.jpg

- If your target system doesn't like spaces in names or you want to be on the safe side there, replace spaces in the name with underscores:
    My Photo.jpg ===> My_Photo.jpg

- reject any filenames that could cause the receiving system to do dangerous things, e.g. .EXE or .SCR if the upload target is Windows. This list will be different for each upload target, so make it configurable. You can't assume anything else about the extension. .py .c .txt and .html are all valid in the operating systems I use and so are their capitalised equivalents.

- check whether the file already exists. You need rules about what to do if it exists (do you reject the upload, silently overwrite, or alter the name, e.g. by adding a numeric suffix to make the name unique):
    my_photo.jpg ===> my_photo-01.jpg

- run the application in your upload target directory and put the uploaded file there or, better, into a configured uploads directory by prepending it to the file name:
    my_photo.jpg ===> /home/upload_user/uploads/my_photo.jpg

- make sure you document the process so that a user can work out what has happened to his file and why if you have to reject it or alter its name.

> not sure but any suggestions or examples are most welcome :)

There's probably something I've forgotten, but that list should get you going.

--
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |
Re: suggestions please what should i watch for/guard against' in a file upload situation?
On 2010-10-06, Diez B. Roggisch de...@web.de wrote:
> Seebs usenet-nos...@seebs.net writes:
>> On 2010-10-06, geekbuntu gmi...@gmail.com wrote:
>>> in general, what are things i would want to 'watch for/guard against' in a file upload situation?
>>
>> This question has virtually nothing to do with Python, which means you may not get very good answers.
>
> In contrast to comp.super.web.experts? There are quite a few people with web-experience here I'd say.

Oh, certainly. But in general, I try to ask questions in a group focused on their domain, rather than merely a group likely to contain people who would for other reasons have the relevant experience. I'm sure that a great number of Python programmers have experience with sex, that doesn't make this a great newsgroup for sex tips. (Well, maybe it does.)

> Given that most people are not computer savvy (always remember, the default for windows is to hide extensions..), using it client-side can be valuable to prevent long uploads that eventually need to be rejected otherwise (no mom, you can't upload word-docs as profile pictures).

That's a good point. On the other hand, there's a corollary; you may want to look at the contents of the file in case they're not really what they're supposed to be.

> Your strange focus on file-names that are pure meta information is a little bit concerning...

If you're uploading files into a directory, then it is quite likely that you're getting file names from somewhere. Untrusted file names are a much more effective attack vector, in most cases, than EXIF information.

> Certainly advice. But that's less focussed on filenames or file-uploads, but on the whole subject of processing HTTP requests. Which would make a point for *not* using a home-grown framework.

Well, yeah. I was assuming that the home-grown framework was mandatory for some reason. Possibly a very important reason, such as "otherwise we won't have written it ourselves".

-s
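"Look at the contents of the file" can start with comparing the leading bytes against the well-known signatures of the formats you accept. A minimal sketch, assuming only JPEG, PNG and GIF are wanted; `SIGNATURES` and `sniff_image_type` are illustrative names.

```python
# Leading "magic" bytes of three common image formats.
SIGNATURES = {
    "jpeg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def sniff_image_type(head):
    """Return 'jpeg', 'png' or 'gif' based on the first bytes, else None."""
    for kind, prefixes in SIGNATURES.items():
        # bytes.startswith accepts a tuple of alternatives.
        if head.startswith(prefixes):
            return kind
    return None
```

A matching signature only shows that the file claims to be that format. It does not show that the rest of the file is well-formed or harmless to whatever library later decodes it.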
Re: suggestions please what should i watch for/guard against' in a file upload situation?
On Wed, 06 Oct 2010 09:02:21 -0700, geekbuntu wrote:
> in general, what are things i would want to 'watch for/guard against' in a file upload situation?
>
> i have my file upload working (in the self-made framework @ work without any concession for multipart form uploads), but was told to make sure it's cleansed and cannot do any harm inside the system.

Make sure *what* is cleansed? Your code? The uploaded files? Define "cleansed". Do you have to block viruses, malware, spybots, illegal pornography, legal pornography, illegal content, warez, copyright violations, stolen trade secrets, dirty words, pictures of cats?

What operating system are you uploading to?

What happens if somebody tries to upload a 1 TB file to your server? What happens if they try to upload a billion 1 KB files instead?

> my checklist so far is basically to check the extension - ensure it has 3 places, ensure it's in the allowed list (like jpg gif etc...).

Do you have something against file extensions like .gz or .jpeg? I'm not sure why you think you need to check the file extension.

> not sure what else i could do to guard against anything bad happening. maybe the file name itself could cause grief?

You think? :)

What happens if the file name has characters in it that your file system can't deal with? Bad unicode, binary bytes, slashes, colons, question marks, asterisks, etc. What about trying to break out of your file storage area using .. paths?

Without knowing what your file upload code actually does, it's hard to give specific advice.

--
Steven
Re: suggestions please what should i watch for/guard against' in a file upload situation?
Seebs usenet-nos...@seebs.net writes:
> On 2010-10-06, Diez B. Roggisch de...@web.de wrote:
>> Seebs usenet-nos...@seebs.net writes:
>>> On 2010-10-06, geekbuntu gmi...@gmail.com wrote:
>>>> in general, what are things i would want to 'watch for/guard against' in a file upload situation?
>>>
>>> This question has virtually nothing to do with Python, which means you may not get very good answers.
>>
>> In contrast to comp.super.web.experts? There are quite a few people with web-experience here I'd say.
>
> Oh, certainly. But in general, I try to ask questions in a group focused on their domain, rather than merely a group likely to contain people who would for other reasons have the relevant experience. I'm sure that a great number of Python programmers have experience with sex, that doesn't make this a great newsgroup for sex tips. (Well, maybe it does.)

As the OP asked about a Python web framework (self written or not), I think all advice that can be given is certainly more related to Python than to airy references to general web programming such as "oh, make sure your server side application environment hasn't any security issues".

Or, to be more concrete: what NG would you suggest for frameworks or webapps written in python to ask this question?

>> Given that most people are not computer savvy (always remember, the default for windows is to hide extensions..), using it client-side can be valuable to prevent long uploads that eventually need to be rejected otherwise (no mom, you can't upload word-docs as profile pictures).
>
> That's a good point. On the other hand, there's a corollary; you may want to look at the contents of the file in case they're not really what they're supposed to be.

For sure. But the focus of you and others seems to be the file-name, as if that was anything especially dangerous. As a matter of fact, it's a parameter of a multipart/form-data encoded request body part, and as such is rather locked down in terms of null-bytes and such. So you are pretty safe as long as you

- use standard library request parsing modules such as cgi. If one insists on reading streams bytewise and using ctypes to poke the results into memory, you can of course provoke unimaginable havoc..
- don't use the filename for anything but meta-info.

And usually, they are simply regarded as "nice that you've provided us with it, we try to do our best to fill an img alt attribute with the basename". But not more.

Worth pointing out to the OP to do that. But this is *not* a matter of mapping HTTP-request paths to directories I'd wager to say.

Something that is of much more importance (I should have mentioned earlier, shame on me) is of course file-size. Denying requests that come with CONTENT_LENGTH over a specified limit, of course respecting CONTENT_LENGTH and not reading beyond it, and possibly dealing with chunked-encodings in similarly safe ways (I have to admit I haven't yet dealt with one of those myself on a visceral level - but as they are part of the HTTP-spec...) is important, as otherwise DOS attacks are possible.

>> Your strange focus on file-names that are pure meta information is a little bit concerning...
>
> If you're uploading files into a directory, then it is quite likely that you're getting file names from somewhere. Untrusted file names are a much more effective attack vector, in most cases, than EXIF information.

The "into a directory" quote coming from where? And given that EXIF information is probably read by some C-lib, I'd say it is much more dangerous. This is a gut feeling only, but fed by problems with libpng a year or two ago.

>> Certainly advice. But that's less focussed on filenames or file-uploads, but on the whole subject of processing HTTP requests. Which would make a point for *not* using a home-grown framework.
>
> Well, yeah. I was assuming that the home-grown framework was mandatory for some reason. Possibly a very important reason, such as "otherwise we won't have written it ourselves".

In Python, it's usually more along the lines of "well, we kinda started, and now we have it, and are reluctant to switch". But of course one never knows...

Diez
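Denying requests whose CONTENT_LENGTH is over a limit, and never reading beyond it, could look like the sketch below. It assumes a CGI/WSGI-style `environ` dict with the body stream under the `wsgi.input` key, which is how WSGI names it; the OP's home-grown framework may expose these differently. `MAX_BODY_BYTES` and `read_request_body` are invented names, and chunked transfer encoding is not handled here.

```python
import io

MAX_BODY_BYTES = 2 * 1024 * 1024  # hypothetical limit


def read_request_body(environ, limit=MAX_BODY_BYTES):
    """Read exactly CONTENT_LENGTH bytes from environ's body stream, within limit."""
    raw = environ.get("CONTENT_LENGTH", "")
    if not raw.isdigit():
        raise ValueError("missing or malformed CONTENT_LENGTH")
    length = int(raw)
    if length > limit:
        # Refuse before reading a single byte of the body.
        raise ValueError("body of %d bytes exceeds limit of %d" % (length, limit))
    body = environ["wsgi.input"].read(length)  # never read past the declared length
    if len(body) != length:
        raise ValueError("body shorter than declared CONTENT_LENGTH")
    return body


if __name__ == "__main__":
    demo = {"CONTENT_LENGTH": "5", "wsgi.input": io.BytesIO(b"hello, and more")}
    print(read_request_body(demo))
```

A real server may hand back fewer bytes than requested from a single `read` on a slow connection, so production code would loop until `length` bytes have arrived or the stream ends.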
Re: suggestions please what should i watch for/guard against' in a file upload situation?
Martin Gregorie mar...@address-in-sig.invalid writes:
> On Wed, 06 Oct 2010 09:02:21 -0700, geekbuntu wrote:
>> in general, what are things i would want to 'watch for/guard against' in a file upload situation?
>>
>> i have my file upload working (in the self-made framework @ work without any concession for multipart form uploads), but was told to make sure it's cleansed and cannot do any harm inside the system.
>
> Off the top of my head, and assuming that you get passed the exact filename that the user entered:
>
> - The user may need to use an absolute pathname to upload a file that isn't in his current directory, so retain only the basename by discarding the rightmost slash and everything to the left of it:
>     /home/auser/photos/my_photo.jpg ===> my_photo.jpg
>     c:\My Photos\My Photo.jpg ===> My Photo.jpg
>
> - If your target system doesn't like spaces in names or you want to be on the safe side there, replace spaces in the name with underscores:
>     My Photo.jpg ===> My_Photo.jpg
>
> - reject any filenames that could cause the receiving system to do dangerous things, e.g. .EXE or .SCR if the upload target is Windows. This list will be different for each upload target, so make it configurable.

Erm, this assumes that the files are executed in some way. Why should they be? It's perfectly fine to upload *anything*, and of course filenames mean nothing wrt the actual file contents ("Are you sure you want to change the extension of this file?").

It might make no sense for the user, because you can't show an exe as profile image. But safe-guarding against that has nothing to do with OS. And even "safe" file formats such as PNGs have been attack vectors. Precisely because they are processed client-side in the browser through some library with security issues.

For serving the files, one could rely on the file-command or similar means to determine the mime-type. So far, I've never done that - as faking the extension for something else doesn't buy you anything unless there is a documented case of internet explorer ignoring the mime-type and executing the downloaded file as a program.

> You can't assume anything else about the extension. .py .c .txt and .html are all valid in the operating systems I use and so are their capitalised equivalents.
>
> - check whether the file already exists. You need rules about what to do if it exists (do you reject the upload, silently overwrite, or alter the name, e.g. by adding a numeric suffix to make the name unique):
>     my_photo.jpg ===> my_photo-01.jpg

Better, associate the file with the uploader and/or its hash. Use the name as pure meta-information only.

> There's probably something I've forgotten, but that list should get you going.

Dealing with too large upload requests I'd say is much more important, as careless reading of streams into memory has at least the potential for a DOS-attack.

Diez
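Storing the file under its hash and keeping the client's name purely as meta-information might be sketched as below. The function name `store_by_hash` and the shape of the returned record are assumptions for illustration; SHA-256 is one reasonable choice of hash, not something the thread specifies.

```python
import hashlib
import os
import tempfile


def store_by_hash(data, original_name, uploader, store_dir):
    """Write data under its SHA-256 digest in store_dir; return a metadata record."""
    digest = hashlib.sha256(data).hexdigest()
    path = os.path.join(store_dir, digest)
    if not os.path.exists(path):
        # Write to a temporary file first so a partial write never
        # appears under the final name, then rename it into place.
        fd, tmp_path = tempfile.mkstemp(dir=store_dir)
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, path)
    # The client-supplied name never touches the file system; it is
    # kept only as descriptive metadata next to the uploader.
    return {
        "sha256": digest,
        "uploader": uploader,
        "original_name": original_name,
        "size": len(data),
    }
```

Because the on-disk name is derived from the content, two users uploading me-in-cool-pose.jpg cannot overwrite each other, and identical uploads are stored once.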
Re: suggestions please what should i watch for/guard against' in a file upload situation?
In message 2ce3860b-ae21-48ae-9abc-cb169a6f1...@e20g2000vbn.googlegroups.com, geekbuntu wrote:
> in general, what are things i would want to 'watch for/guard against' in a file upload situation?

If you stored the file contents as a blob in a database field, you wouldn’t have to worry about filename problems.
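The blob approach can be sketched with the standard library's sqlite3 module. The table layout and the function names are assumptions for illustration, and any database with a binary column type would serve the same purpose.

```python
import sqlite3


def open_store(db_path=":memory:"):
    """Open (or create) a database with a hypothetical 'uploads' table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS uploads ("
        " id INTEGER PRIMARY KEY,"
        " original_name TEXT NOT NULL,"
        " content BLOB NOT NULL)"
    )
    return conn


def save_upload(conn, original_name, data):
    """Insert one upload; the name is bound as a parameter, never spliced into SQL."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO uploads (original_name, content) VALUES (?, ?)",
            (original_name, data),
        )
    return cur.lastrowid


def load_upload(conn, upload_id):
    """Return (original_name, content) for upload_id, or None if absent."""
    row = conn.execute(
        "SELECT original_name, content FROM uploads WHERE id = ?", (upload_id,)
    ).fetchone()
    return None if row is None else (row[0], bytes(row[1]))
```

The file name is then just a text column, so slashes, spaces and ".." in it have no effect on where the bytes end up. Size limits still matter, since the whole blob passes through memory here.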