Re: suggestions please what should i watch for/guard against' in a file upload situation?

2010-10-07 Thread MRAB

On 06/10/2010 21:01, Martin Gregorie wrote:

On Wed, 06 Oct 2010 09:02:21 -0700, geekbuntu wrote:


in general, what are things i would want to 'watch for/guard against' in
a file upload situation?

i have my file upload working (in the self-made framework @ work without
any concession for multipart form uploads), but was told to make sure
it's cleansed and cannot do any harm inside the system.


Off the top of my head, and assuming that you get passed the exact
filename that the user entered:

- The user may need to use an absolute pathname to upload a file
   that isn't in his current directory, so retain only the basename
   by discarding the rightmost slash and everything to the left of it:
 /home/auser/photos/my_photo.jpg   ===>  my_photo.jpg
 c:\My Photos\My Photo.jpg         ===>  My Photo.jpg

- If your target system doesn't like spaces in names or you want to be
   on the safe side there, replace spaces in the name with underscores:
 My Photo.jpg ===> My_Photo.jpg

- reject any filenames that could cause the receiving system to do
   dangerous things, e.g. .EXE or .SCR if the upload target is Windows.
   This list will be different for each upload target, so make it
   configurable.

   You can't assume anything else about the extension.
   .py .c .txt and .html are all valid in the operating systems I use
   and so are their capitalised equivalents.


A whitelist is better than a blacklist; instead of rejecting what you
know could be dangerous, accept what you know _isn't_ dangerous.
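
A minimal sketch of the allowlist idea in Python. The set of accepted extensions and the names ALLOWED_EXTENSIONS and extension_allowed are illustrative assumptions, not something from this thread; the real list depends on what the site is meant to accept.

```python
import os

# Hypothetical allowlist; adjust to whatever the application really accepts.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def extension_allowed(filename, allowed=ALLOWED_EXTENSIONS):
    """Return True only if the file's last extension is on the allowlist."""
    # splitext() looks at the final extension only, so "photo.jpg.exe" -> ".exe".
    ext = os.path.splitext(filename)[1].lower()
    return ext in allowed
```

Lower-casing the extension means the capitalised equivalents are treated the same way, and a name with no extension at all is rejected because the empty string is not in the set.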


- check whether the file already exists. You need
   rules about what to do if it exists (do you reject the upload,
   silently overwrite, or alter the name, e.g. by adding a numeric
   suffix to make the name unique?):

  my_photo.jpg  ===>   my_photo-01.jpg

- run the application in your upload target directory and put the
   uploaded file there or, better, into a configured uploads directory
   by prepending it to the file name:

 my_photo.jpg   ===>   /home/upload_user/uploads/my_photo.jpg

- make sure you document the process so that a user can work out
   what has happened to his file and why if you have to reject it
   or alter its name.


not sure but any suggestions or examples are most welcome :)


There's probably something I've forgotten, but that list should get you
going.


Maximum file size, perhaps?
--
http://mail.python.org/mailman/listinfo/python-list


Re: suggestions please what should i watch for/guard against' in a file upload situation?

2010-10-07 Thread Terry Reedy

On 10/6/2010 12:02 PM, geekbuntu wrote:

in general, what are things i would want to 'watch for/guard against'
in a file upload situation?

i have my file upload working (in the self-made framework @ work
without any concession for multipart form uploads), but was told to
make sure it's cleansed and cannot do any harm inside the system.

my checklist so far is basically to check the extension - ensure it
has 3 places, ensure it's in the allowed list (like jpg gif etc...).

not sure what else i could do to guard against anything bad
happening.  maybe the file name itself could cause grief?

not sure but any suggestions or examples are most welcome :)


I am not sure whether anyone mentioned limiting the file size, checking 
the incoming header, and aborting an upload if it goes over anyway. Most 
sites do not want 10 gigabyte files ;-).
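
One way to abort an upload that goes over the limit regardless of what the header claimed is to copy the body in chunks and count as you go. This is only a sketch: MAX_UPLOAD_BYTES, UploadTooLarge and copy_limited are made-up names, and the 5 MB figure is an arbitrary assumption.

```python
MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # hypothetical limit, pick your own


class UploadTooLarge(Exception):
    """Raised when more data arrives than the configured limit allows."""


def copy_limited(src, dst, limit=MAX_UPLOAD_BYTES, chunk_size=64 * 1024):
    """Copy src to dst chunk by chunk; stop as soon as the limit is exceeded."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return total  # end of stream, total bytes written
        total += len(chunk)
        if total > limit:
            # Nothing from the offending chunk is written to dst.
            raise UploadTooLarge("upload exceeds %d bytes" % limit)
        dst.write(chunk)
```

The caller would still need to delete the partially written file after catching UploadTooLarge.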


--
Terry Jan Reedy



suggestions please what should i watch for/guard against' in a file upload situation?

2010-10-06 Thread geekbuntu
in general, what are things i would want to 'watch for/guard against'
in a file upload situation?

i have my file upload working (in the self-made framework @ work
without any concession for multipart form uploads), but was told to
make sure it's cleansed and cannot do any harm inside the system.

my checklist so far is basically to check the extension - ensure it
has 3 places, ensure it's in the allowed list (like jpg gif etc...).

not sure what else i could do to guard against anything bad
happening.  maybe the file name itself could cause grief?

not sure but any suggestions or examples are most welcome :)


Re: suggestions please what should i watch for/guard against' in a file upload situation?

2010-10-06 Thread Seebs
On 2010-10-06, geekbuntu gmi...@gmail.com wrote:
 in general, what are things i would want to 'watch for/guard against'
 in a file upload situation?

This question has virtually nothing to do with Python, which means you
may not get very good answers.

 my checklist so far is basically to check the extension - ensure it
 has 3 places, ensure it's in the allowed list (like jpg gif etc...).

This strikes me as 100% irrelevant.  Who cares what the extension is?

 not sure what else i could do to guard against anything bad
 happening.  maybe the file name itself could cause grief?

Obvious things:

* File name causes files to get created outside some particular
  upload directory (../foo)
* File name has spaces
* Crazy stuff like null bytes in file name
* File names which might break things if a user carelessly interacts
  with them, such as foo.jpg /etc/passwd bar.jpg (all one file name
  including two spaces).

Basically, the key question is, could a hostile user come up with
input to your script which could break something?
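
For the first item on that list (names such as ../foo escaping the upload directory) and the null-byte case, a check along these lines is one possibility. It is a sketch under assumptions: safe_join is a hypothetical helper name, and it presumes the final path is built from a configured upload directory plus the client-supplied name.

```python
import os


def safe_join(upload_dir, filename):
    """Return an absolute path strictly inside upload_dir, or raise ValueError."""
    if "\x00" in filename:
        raise ValueError("null byte in file name")
    base = os.path.realpath(upload_dir)
    # realpath() collapses ".." components and resolves symlinks.
    target = os.path.realpath(os.path.join(base, filename))
    # The resolved target must live below base and must not be base itself.
    if target == base or os.path.commonpath([base, target]) != base:
        raise ValueError("file name escapes the upload directory")
    return target
```

Names containing spaces are not rejected here; they mainly cause trouble when a path is later interpolated into a shell command line, which is better avoided altogether.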

-s
-- 
Copyright 2010, all wrongs reversed.  Peter Seebach / usenet-nos...@seebs.net
http://www.seebs.net/log/ -- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) -- get educated!
I am not speaking for my employer, although they do rent some of my opinions.


Re: suggestions please what should i watch for/guard against' in a file upload situation?

2010-10-06 Thread Tim Chase

On 10/06/10 12:14, Seebs wrote:

not sure what else i could do to guard against anything bad
happening.  maybe the file name itself could cause grief?


Obvious things:

* File name causes files to get created outside some particular
   upload directory (../foo)
* File name has spaces
* Crazy stuff like null bytes in file name
* File names which might break things if a user carelessly interacts
   with them, such as foo.jpg /etc/passwd bar.jpg (all one file name
   including two spaces).


And depending on the system, Win32 chokes on filenames like 
nul, con, com1...comN, lpt1...lptN, and a bunch of 
others.
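
A rough check for those device names might look like the following. The helper name is_windows_reserved is made up for illustration, and the set covers only the commonly documented names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9); treat it as an assumption rather than a complete list.

```python
# Device names Windows reserves regardless of extension (commonly documented set).
WINDOWS_RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {"COM%d" % i for i in range(1, 10)}
    | {"LPT%d" % i for i in range(1, 10)}
)


def is_windows_reserved(filename):
    """True if the part before the first dot is a reserved device name."""
    # "nul.txt" is as troublesome as "nul", so compare only the stem,
    # ignoring case and trailing spaces.
    stem = filename.split(".", 1)[0].rstrip(" ").upper()
    return stem in WINDOWS_RESERVED
```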


-tkc






Re: suggestions please what should i watch for/guard against' in a file upload situation?

2010-10-06 Thread Diez B. Roggisch
Seebs usenet-nos...@seebs.net writes:

 On 2010-10-06, geekbuntu gmi...@gmail.com wrote:
 in general, what are things i would want to 'watch for/guard against'
 in a file upload situation?

 This question has virtually nothing to do with Python, which means you
 may not get very good answers.

In contrast to comp.super.web.experts? There are quite a few people
with web-experience here I'd say. 


 my checklist so far is basically to check the extension - ensure it
 has 3 places, ensure it's in the allowed list (like jpg gif etc...).

 This strikes me as 100% irrelevant.  Who cares what the extension is?

Given that most people are not computer savvy (always remember, the
default for windows is to hide extensions..), using it client-side can
be valuable to prevent long uploads that eventually need to be rejected
otherwise (no mom, you can't upload word-docs as profile pictures).

 not sure what else i could do to guard against anything bad
 happening.  maybe the file name itself could cause grief?

 Obvious things:

 * File name causes files to get created outside some particular
   upload directory (../foo)

Or rather just store that as a simple meta-info, as allowing even the
best-intended me-in-cool-pose.jpg to overwrite that of the one other
cool guy using the website isn't gonna fly anyway.

 * File name has spaces

See above, but other than that - everything but shell-scripts deals well
with it.

 * Crazy stuff like null bytes in file name
 * File names which might break things if a user carelessly interacts
   with them, such as foo.jpg /etc/passwd bar.jpg (all one file name
   including two spaces).

Your strange focus on file-names that are pure meta information is a
little bit concerning... 

 Basically, the key question is, could a hostile user come up with
 input to your script which could break something?

Certainly advice. But that's less focussed on filenames or file-uploads, but
on the whole subject of processing HTTP-requests. Which would make a
point for *not* using a home-grown framework.

But then, Python is a bit less likely to suffer from buffer overflow or 
similar kind of attacks.

Diez


Re: suggestions please what should i watch for/guard against' in a file upload situation?

2010-10-06 Thread Martin Gregorie
On Wed, 06 Oct 2010 09:02:21 -0700, geekbuntu wrote:

 in general, what are things i would want to 'watch for/guard against' in
 a file upload situation?
 
 i have my file upload working (in the self-made framework @ work without
 any concession for multipart form uploads), but was told to make sure
 it's cleansed and cannot do any harm inside the system.

Off the top of my head, and assuming that you get passed the exact 
filename that the user entered:

- The user may need to use an absolute pathname to upload a file
  that isn't in his current directory, so retain only the basename
  by discarding the rightmost slash and everything to the left of it:
      /home/auser/photos/my_photo.jpg   ===> my_photo.jpg
      c:\My Photos\My Photo.jpg         ===> My Photo.jpg

- If your target system doesn't like spaces in names or you want to be
  on the safe side there, replace spaces in the name with underscores:
      My Photo.jpg ===> My_Photo.jpg

- reject any filenames that could cause the receiving system to do
  dangerous things, e.g. .EXE or .SCR if the upload target is Windows.
  This list will be different for each upload target, so make it 
  configurable.

  You can't assume anything else about the extension.
  .py .c .txt and .html are all valid in the operating systems I use
  and so are their capitalised equivalents. 

- check whether the file already exists. You need
  rules about what to do if it exists (do you reject the upload,
  silently overwrite, or alter the name, e.g. by adding a numeric
  suffix to make the name unique?):

      my_photo.jpg  ===>  my_photo-01.jpg

- run the application in your upload target directory and put the
  uploaded file there or, better, into a configured uploads directory
  by prepending it to the file name:

      my_photo.jpg   ===>  /home/upload_user/uploads/my_photo.jpg

- make sure you document the process so that a user can work out
  what has happened to his file and why if you have to reject it
  or alter its name.
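
The renaming steps in the list above could be sketched roughly as follows. clean_name and unique_path are hypothetical helper names, and the -01 suffix format simply mirrors the example given; none of this is a definitive implementation.

```python
import ntpath
import os


def clean_name(client_name):
    """Keep only the basename and replace spaces with underscores."""
    # ntpath.basename() splits on both "\\" and "/", so it copes with
    # Windows-style and Unix-style client paths alike.
    name = ntpath.basename(client_name)
    return name.replace(" ", "_")


def unique_path(upload_dir, name):
    """Return a path in upload_dir, appending -01, -02, ... if the name is taken."""
    stem, ext = os.path.splitext(name)
    candidate = os.path.join(upload_dir, name)
    counter = 0
    while os.path.exists(candidate):
        counter += 1
        candidate = os.path.join(upload_dir, "%s-%02d%s" % (stem, counter, ext))
    return candidate
```

Note that checking for existence and then creating the file leaves a small race window between two simultaneous uploads; opening the file with os.open() and the os.O_CREAT | os.O_EXCL flags is one way to close it.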

 not sure but any suggestions or examples are most welcome :)

There's probably something I've forgotten, but that list should get you 
going.
 


-- 
martin@   | Martin Gregorie
gregorie. | Essex, UK
org   |


Re: suggestions please what should i watch for/guard against' in a file upload situation?

2010-10-06 Thread Seebs
On 2010-10-06, Diez B. Roggisch de...@web.de wrote:
 Seebs usenet-nos...@seebs.net writes:
 On 2010-10-06, geekbuntu gmi...@gmail.com wrote:
 in general, what are things i would want to 'watch for/guard against'
 in a file upload situation?

 This question has virtually nothing to do with Python, which means you
 may not get very good answers.

 In contrast to comp.super.web.experts? There are quite a few people
 with web-experience here I'd say. 

Oh, certainly.  But in general, I try to ask questions in a group focused
on their domain, rather than merely a group likely to contain people who
would for other reasons have the relevant experience.  I'm sure that a great
number of Python programmers have experience with sex, that doesn't make
this a great newsgroup for sex tips.  (Well, maybe it does.)

 Given that most people are not computer savvy (always remember, the
 default for windows is to hide extensions..), using it client-side can
 be valuable to prevent long uploads that eventually need to be rejected
 otherwise (no mom, you can't upload word-docs as profile pictures).

That's a good point.  On the other hand, there's a corollary; you may want
to look at the contents of the file in case they're not really what they're
supposed to be.
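
A cheap first look at the contents is to compare the leading bytes with the well-known signatures of the formats you accept. The sketch below covers JPEG, PNG and GIF only; MAGIC and sniff_image_type are illustrative names, and a matching signature says nothing about whether the rest of the file is well-formed or harmless.

```python
# Leading "magic" bytes of a few common image formats.
MAGIC = {
    "jpeg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def sniff_image_type(data):
    """Return 'jpeg', 'png' or 'gif' based on the first bytes, else None."""
    for kind, signatures in MAGIC.items():
        # bytes.startswith() accepts a tuple of alternatives.
        if data.startswith(signatures):
            return kind
    return None
```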

 Your strange focus on file-names that are pure meta information is a
 little bit concerning... 

If you're uploading files into a directory, then it is quite likely that
you're getting file names from somewhere.  Untrusted file names are a much
more effective attack vector, in most cases, than EXIF information.

 Certainly advice. But that's less focussed on filenames or file-uploads, but
 on the whole subject of processing HTTP-requests. Which would make a
 point for *not* using a home-grown framework.

Well, yeah.  I was assuming that the home-grown framework was mandatory for
some reason.  Possibly a very important reason, such as "otherwise we won't
have written it ourselves".

-s
-- 
Copyright 2010, all wrongs reversed.  Peter Seebach / usenet-nos...@seebs.net
http://www.seebs.net/log/ -- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) -- get educated!
I am not speaking for my employer, although they do rent some of my opinions.


Re: suggestions please what should i watch for/guard against' in a file upload situation?

2010-10-06 Thread Steven D'Aprano
On Wed, 06 Oct 2010 09:02:21 -0700, geekbuntu wrote:

 in general, what are things i would want to 'watch for/guard against' in
 a file upload situation?
 
 i have my file upload working (in the self-made framework @ work without
 any concession for multipart form uploads), but was told to make sure
 it's cleansed and cannot do any harm inside the system.

Make sure *what* is cleansed? Your code? The uploaded files? Define
"cleansed".

Do you have to block viruses, malware, spybots, illegal pornography, 
legal pornography, illegal content, warez, copyright violations, stolen 
trade secrets, dirty words, pictures of cats?

What operating system are you uploading to?

What happens if somebody tries to upload a 1 TB file to your server?

What happens if they try to upload a billion 1 KB files instead?


 
 my checklist so far is basically to check the extension - ensure it has
 3 places, ensure it's in the allowed list (like jpg gif etc...).

Do you have something against file extensions like .gz or .jpeg ?

I'm not sure why you think you need to check the file extension.

 
 not sure what else i could do to guard against anything bad happening. 
 maybe the file name itself could cause grief?

You think? :)

What happens if the file name has characters in it that your file system 
can't deal with? Bad unicode, binary bytes, slashes, colons, question 
marks, asterisks, etc.

What about trying to break out of your file storage area using .. paths?
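
If the name does have to end up on disk, one conservative policy is to replace everything outside a small set of known-harmless characters. The sketch below is one such policy, not the only reasonable one: sanitize_name, the 100-character cap and the "upload" fallback are all assumptions made for illustration, and an ASCII-only rule will mangle legitimate non-English names.

```python
import re
import unicodedata


def sanitize_name(name, max_length=100):
    """Reduce a client-supplied name to letters, digits, dot, dash and underscore."""
    # Fold compatibility forms (e.g. full-width letters) to their plain equivalents.
    name = unicodedata.normalize("NFKC", name)
    # Slashes, colons, question marks, control bytes etc. all become "_".
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name)
    # Strip leading dots: no hidden files, and "." or ".." cannot survive.
    name = name.lstrip(".")
    return name[:max_length] or "upload"
```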

Without knowing what your file upload code actually does, it's hard to 
give specific advice.


-- 
Steven


Re: suggestions please what should i watch for/guard against' in a file upload situation?

2010-10-06 Thread Diez B. Roggisch
Seebs usenet-nos...@seebs.net writes:

 On 2010-10-06, Diez B. Roggisch de...@web.de wrote:
 Seebs usenet-nos...@seebs.net writes:
 On 2010-10-06, geekbuntu gmi...@gmail.com wrote:
 in general, what are things i would want to 'watch for/guard against'
 in a file upload situation?

 This question has virtually nothing to do with Python, which means you
 may not get very good answers.

 In contrast to comp.super.web.experts? There are quite a few people
 with web-experience here I'd say. 

 Oh, certainly.  But in general, I try to ask questions in a group focused
 on their domain, rather than merely a group likely to contain people who
 would for other reasons have the relevant experience.  I'm sure that a great
 number of Python programmers have experience with sex, that doesn't make
 this a great newsgroup for sex tips.  (Well, maybe it does.)

As the OP asked about a Python web framework (self written or not), I
think all advice that can be given is certainly more related to Python
than to airy references to general web programming such as
"oh, make sure your server side application environment hasn't any
security issues".

Or, to be more concrete: what NG would you suggest for frameworks or webapps
written in python to ask this question?

 Given that most people are not computer savvy (always remember, the
 default for windows is to hide extensions..), using it client-side can
 be valuable to prevent long uploads that eventually need to be rejected
 otherwise (no mom, you can't upload word-docs as profile pictures).

 That's a good point.  On the other hand, there's a corollary; you may want
 to look at the contents of the file in case they're not really what they're
 supposed to be.

For sure. But the focus of you and others seems to be the file-name,
as if that was anything especially dangerous. As a matter of fact, it's a
parameter to a multipart/form-data encoded request body parameter
definition, and as such is rather locked down in terms of
null-bytes and such. So you are pretty safe as long as you

 - use standard library request parsing modules such as cgi. If
   one insists on reading streams bytewise and using ctypes to poke the
   results into memory, you can of course provoke unimaginable havoc..

 - don't use the filename for anything but meta-info. And usually, they
   are simply regarded as "nice that you've provided us with it, we try
   to do our best to fill an img alt attribute with the basename".
   But not more. Worth pointing out to the OP to do that. But this is
   *not* a matter of mapping HTTP-request paths to directories I'd wager
   to say. 

Something that is of much more importance (I should have mentioned
earlier, shame on me) is of course file-size. Denying requests that come
with CONTENT_LENGTH over a specified limit, of course respecting
CONTENT_LENGTH and not reading beyond it, and possibly dealing with
chunked-encodings in similarly safe ways (I have to admit I haven't yet
dealt with  one of those myself on a visceral level - 
but as they are part of the HTTP-spec...) is important, 
as otherwise DOS attacks are possible.
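
As a sketch of the CONTENT_LENGTH handling described above, assuming a CGI/WSGI-style environ dictionary and a file-like input stream. declared_length, read_body and MAX_BODY_BYTES are made-up names; requests without a usable CONTENT_LENGTH (chunked ones, for instance) are simply refused here rather than handled.

```python
MAX_BODY_BYTES = 5 * 1024 * 1024  # hypothetical limit


def declared_length(environ, limit=MAX_BODY_BYTES):
    """Return the declared body length; raise ValueError if absent, bad or too big."""
    raw = environ.get("CONTENT_LENGTH", "")
    if not raw.isdigit():
        raise ValueError("missing or malformed CONTENT_LENGTH")
    length = int(raw)
    if length > limit:
        raise ValueError("request body too large")
    return length


def read_body(environ, stream, limit=MAX_BODY_BYTES):
    """Read exactly the declared number of bytes and never beyond it."""
    length = declared_length(environ, limit)
    body = stream.read(length)
    if len(body) != length:
        raise ValueError("body shorter than declared")
    return body
```

For real uploads the body would be streamed to disk in chunks rather than read into memory in one go; the point here is only the order of the checks.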

 Your strange focus on file-names that are pure meta information is a
 little bit concerning... 

 If you're uploading files into a directory, then it is quite likely that
 you're getting file names from somewhere.  Untrusted file names are a much
 more effective attack vector, in most cases, than EXIF information.

The "into a directory" quote coming from where? And given that EXIF
information is probably read by some C-lib, I'd say it is much more
dangerous. This is a gut feeling only, but fed by problems with libpng a
year or two ago.

 Certainly advice. But that's less focussed on filenames or file-uploads, but
 on the whole subject of processing HTTP-requests. Which would make a
 point for *not* using a home-grown framework.

 Well, yeah.  I was assuming that the home-grown framework was mandatory for
 some reason.  Possibly a very important reason, such as "otherwise we won't
 have written it ourselves".

In Python, it's usually more along the lines of "well, we kinda started,
and now we have it, and are reluctant to switch".

But of course one never knows...

Diez


Re: suggestions please what should i watch for/guard against' in a file upload situation?

2010-10-06 Thread Diez B. Roggisch
Martin Gregorie mar...@address-in-sig.invalid writes:

 On Wed, 06 Oct 2010 09:02:21 -0700, geekbuntu wrote:

 in general, what are things i would want to 'watch for/guard against' in
 a file upload situation?
 
 i have my file upload working (in the self-made framework @ work without
 any concession for multipart form uploads), but was told to make sure
 it's cleansed and cannot do any harm inside the system.

 Off the top of my head, and assuming that you get passed the exact 
 filename that the user entered:

 - The user may need to use an absolute pathname to upload a file
   that isn't in his current directory, so retain only the basename
   by discarding the rightmost slash and everything to the left of it:
 /home/auser/photos/my_photo.jpg   ===> my_photo.jpg
 c:\My Photos\My Photo.jpg         ===> My Photo.jpg

 - If your target system doesn't like spaces in names or you want to be
   on the safe side there, replace spaces in the name with underscores:
 My Photo.jpg ===> My_Photo.jpg

 - reject any filenames that could cause the receiving system to do
   dangerous things, e.g. .EXE or .SCR if the upload target is Windows.
   This list will be different for each upload target, so make it 
   configurable.

Erm, this assumes that the files are executed in some way. Why should
they? It's perfectly fine to upload *anything*, and of course filenames
mean nothing wrt the actual file contents ("Are you sure you want to
change the extension of this file?").

It might make no sense for the user, because you can't show an exe as a profile
image. But safe-guarding against that has nothing to do with the OS. And
even "safe" file formats such as PNGs have been attack
vectors. Precisely because they are processed client-side in the browser
through some library with security issues.

For serving the files, one could rely on the file-command or similar
means to determine the mime-type. So far, I've never done that - as
faking the extension for something else doesn't buy you anything unless
there is a documented case of Internet Explorer ignoring the mime-type and
executing a downloaded file as a program.


   You can't assume anything else about the extension.
   .py .c .txt and .html are all valid in the operating systems I use
   and so are their capitalised equivalents. 

 - check whether the file already exists. You need
   rules about what to do if it exists (do you reject the upload,
   silently overwrite, or alter the name, e.g. by adding a numeric
   suffix to make the name unique?):

  my_photo.jpg  ===>  my_photo-01.jpg

Better, associate the file with the uploader and/or its hash. Use the
name as pure meta-information only.
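
Roughly what that could look like, as a sketch: the storage key is derived from the uploader and a SHA-256 of the content, and the client's file name is kept only as a field in the record. storage_key, make_record and the record layout are invented for illustration, and uploader_id is assumed to come from the application's own session handling, not from the request body.

```python
import hashlib


def storage_key(uploader_id, data):
    """Derive the storage name from the uploader and a hash of the content."""
    digest = hashlib.sha256(data).hexdigest()
    return "%s/%s" % (uploader_id, digest)


def make_record(uploader_id, client_name, data):
    """Build the metadata record; the client's name is never used as a path."""
    return {
        "key": storage_key(uploader_id, data),
        "original_name": client_name,  # display purposes only
        "size": len(data),
    }
```

Two users uploading a file with the same name no longer collide, and a hostile name such as ../evil.jpg is just a string in a record.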

 There's probably something I've forgotten, but that list should get you 
 going.

Dealing with too large upload requests I'd say is much more important, as
careless reading of streams into memory has at least the potential for a
DOS-attack.

Diez


Re: suggestions please what should i watch for/guard against' in a file upload situation?

2010-10-06 Thread Lawrence D'Oliveiro
In message
2ce3860b-ae21-48ae-9abc-cb169a6f1...@e20g2000vbn.googlegroups.com, 
geekbuntu wrote:

 in general, what are things i would want to 'watch for/guard against'
 in a file upload situation?

If you stored the file contents as a blob in a database field, you wouldn’t 
have to worry about filename problems.
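
A minimal sketch of the blob approach using the standard library's sqlite3 module. The table layout and the helper names open_store, save_upload and load_upload are assumptions made for illustration; any database with a binary column type would do.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small upload store with one BLOB column."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS uploads ("
        "id INTEGER PRIMARY KEY, original_name TEXT, content BLOB)"
    )
    return conn


def save_upload(conn, original_name, data):
    """Store the bytes as a BLOB; the name is bound as a parameter, not a path."""
    cur = conn.execute(
        "INSERT INTO uploads (original_name, content) VALUES (?, ?)",
        (original_name, data),
    )
    conn.commit()
    return cur.lastrowid


def load_upload(conn, upload_id):
    """Return (original_name, content) for a stored upload."""
    row = conn.execute(
        "SELECT original_name, content FROM uploads WHERE id = ?", (upload_id,)
    ).fetchone()
    return row[0], bytes(row[1])
```

Size limits still matter with this approach, since the whole upload has to pass through memory on its way into the database.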