[issue15564] cgi.FieldStorage should not call read_multi on files

2017-03-13 Thread Joshua Shields

Joshua Shields added the comment:

I ran into this issue as well. I think it is something cgi.py will need to 
handle correctly when this type of file is uploaded from a browser's file input.

--
nosy: +jshields

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue15564] cgi.FieldStorage should not call read_multi on files

2015-11-18 Thread Mark Bordas

Mark Bordas added the comment:

Was this ever addressed or resolved? I just ran into this bug, and it looks like 
a solution was proposed but never applied.

--
nosy: +Mark Bordas
versions: +Python 2.7 -Python 3.2




[issue15564] cgi.FieldStorage should not call read_multi on files

2013-01-01 Thread Christian Boos

Christian Boos added the comment:

I think that reverting to a read_single() when the read_multi() fails could do 
the trick here. At least this approach seems to work for uploading .mht files. 
See also http://trac.edgewall.org/ticket/9880.
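
A minimal sketch of that fallback idea, written against the email package rather than cgi.py itself. The read_upload name and its list-of-bytes return shape are invented for illustration, and this is not the patch from the Trac ticket: it only shows the "try the multipart parse, fall back to a single opaque value" pattern.

```python
import email


def read_upload(content_type, body):
    """Return the payloads of the parts, or [body] if multipart parsing fails.

    Hypothetical helper: it mimics "try read_multi(), fall back to
    read_single()" using the email package instead of cgi.FieldStorage.
    """
    if content_type.lower().startswith("multipart/"):
        # Rebuild a minimal MIME document so the email parser sees the type.
        raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
        msg = email.message_from_bytes(raw)
        # The default policy records problems in msg.defects instead of
        # raising, so both conditions are checked explicitly.
        if msg.is_multipart() and not msg.defects:
            return [part.get_payload(decode=True) for part in msg.get_payload()]
    # Fallback: keep the upload as a single opaque value.
    return [body]


# An upload labelled multipart/related but carrying no boundary parameter.
broken = read_upload("multipart/related", b"# dit is een test\r\nDit is een regel\r\n")

# A well-formed two-part body for comparison.
good = read_upload(
    'multipart/related; boundary="b"',
    b"--b\r\nContent-Type: text/plain\r\n\r\none\r\n"
    b"--b\r\nContent-Type: text/plain\r\n\r\ntwo\r\n--b--\r\n",
)
```

In this sketch the malformed upload comes back untouched as one value, while the well-formed body is split into its parts. A nested multipart part would need recursion, which is left out here.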

--
nosy: +cboos




[issue15564] cgi.FieldStorage should not call read_multi on files

2012-08-13 Thread patrick vrijlandt

patrick vrijlandt added the comment:

I must admit my use case is a hack, but the summary is: view a page on one 
computer, process it on another computer; like sending the page to a friend, 
with friend -> self and send -> upload.

I found one other victim in python 
(https://groups.google.com/d/topic/web2py/ixeUUWryZh0/discussion) but only an 
occasional reference to other languages; most posts relate to security issues 
with mht files.

My previous example only served to show that the mime-type is a necessary 
condition for the problem to occur; you are right that this input would be 
expected to throw an exception.

So I went on and created a complete testcase/example (attached). The 
PatchedFieldStorage class parses the mht file correctly into parts. However, 
the names of the parts are in "content-location" headers inside  
the mht file and get lost. Also the code is ugly.

Trying to make better reuse of existing code, as in ExperimentalFieldStorage, has 
not been successful so far: the MIME prologue is parsed as one of the parts, and 
the outer boundary is not respected, losing a data element "next to" the file. The 
print() calls show that the next line may be valuable (like a header) or not so 
much (like a boundary), but so far the class has no provision for look-ahead, I 
think.

email.message_from_binary_file parses my mht files correctly, so a completely 
different approach might be to rely more on that package for parsing 
MIME-encoded data.
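
As a rough illustration of that approach (not the attached test_cgi4.py), the sketch below feeds a tiny hand-written multipart/related document to the email package and collects each part together with its Content-Location header. The sample document, its boundary, the example.invalid URLs, and the list_parts name are all made up for this example; a real .mht file saved by a browser will be larger and may use transfer encodings.

```python
import email

# A small, hand-written stand-in for an .mht file (assumed content).
MHT = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/related; boundary="sample-boundary"\r\n'
    b"\r\n"
    b"This is the MIME prologue; it is not a part.\r\n"
    b"--sample-boundary\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Location: http://example.invalid/index.html\r\n"
    b"\r\n"
    b"<html><body>hello</body></html>\r\n"
    b"--sample-boundary\r\n"
    b"Content-Type: text/css\r\n"
    b"Content-Location: http://example.invalid/style.css\r\n"
    b"\r\n"
    b"body { color: black; }\r\n"
    b"--sample-boundary--\r\n"
)


def list_parts(raw):
    """Return (content_location, content_type, payload_bytes) per leaf part."""
    msg = email.message_from_bytes(raw)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip containers; only leaf parts carry data
        parts.append((
            part["Content-Location"],
            part.get_content_type(),
            part.get_payload(decode=True),
        ))
    return parts


parts = list_parts(MHT)
```

Here the prologue ends up in the message's preamble rather than being returned as a part, and the part names survive in the Content-Location headers.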

--
Added file: http://bugs.python.org/file26780/test_cgi4.py




[issue15564] cgi.FieldStorage should not call read_multi on files

2012-08-11 Thread R. David Murray

R. David Murray added the comment:

I'd like to weigh in on this, but I need time to do research on the question 
first.  It may be a bit before I get that time.

--




[issue15564] cgi.FieldStorage should not call read_multi on files

2012-08-11 Thread Glenn Linderman

Glenn Linderman added the comment:

I forgot to mention that the file you provided in your test doesn't look like a 
well-formed MHTML file, and so an exception would be expected in this case.

--




[issue15564] cgi.FieldStorage should not call read_multi on files

2012-08-11 Thread Glenn Linderman

Glenn Linderman added the comment:

I didn't call the current behaviour of browsers in assigning MIME types 
automatically based on file extension a bug; I would consider it more of a 
missing capability, an oversight due to the rareness of attempts to upload 
MHTML files. This is similar to the situation of email clients automatically 
choosing the Content-Disposition for attachments (which is just a 
recommendation) about whether to suggest they be displayed inline, or provided 
as attachments to be saved. Most automatically select a Content-Disposition 
based on their own capability to deal with an attachment of a particular MIME 
type, rather than the (unknown) capability of the email client of the ultimate 
recipient. I think in both cases, the default behavior works well enough for a 
large enough subset of cases, that there has been little demand for increased 
functionality, even though one can contrive reasonable sounding cases for that 
functionality.

As a point of discussion, my perception is that MHTML files have two uses: to 
email an image of a web page (something typically done implicitly by bundled 
email/web-browser client software, and not generally explicit in the creation 
of a standalone MHTML file), and to archive a web page for local reference. 
Neither of these uses involves uploading MHTML files to web sites, although saving 
a web page, and then attempting to email it to a friend as an attachment via a 
web mail client might encounter the same difficulty you are having.

Another use I have heard discussed (but I've forgotten where, so have no 
references), is as a source for custom browsers to prepackage responses for 
particular WEB forms.  In that case, I think it would be the custom browser's 
responsibility to supply the MHTML file content as a response to the form 
request, rather than to supply it as an uploaded file, expecting the server to 
dissect it... 

I think it is obvious that my personal, first reaction is that the parsing 
problem should be fixed... if the MIME type states it is multipart, it should 
be dissected into its parts... and if that is not the desired behavior, then the 
MIME type should be different.  Email standards, the source of MIME type 
specifications, certainly use and support nested multipart dissection, although 
various email software performs it in various manners and to various levels. 
Naturally, if the content syntax of the multipart file is incorrect, it should 
produce an exception, the same as if the multipart content a (buggy) browser 
produced from an HTML form were syntactically incorrect.

Given that browsers offer no way to specify the MIME type (this 
is .mht, but treat it as application/octet-stream rather than 
multipart/related), it does seem that web server toolkits such as 
cgi.FieldStorage might want to offer an option or hook to allow an application 
to disable the otherwise automatic parsing of multipart/* files.

This is a rather murky area, indeed. Research into whether and how other web 
toolkits handle such a situation would be interesting in deciding how to 
proceed. While there is no need for Python to slavishly follow the lead of any 
other particular web toolkit, it would be interesting to know if any actually 
successfully parse such files, and it would be interesting to know if any 
ignore the MIME type for uploaded files, and it would be interesting to know if 
any support options for handling uploaded files with multipart/* MIME types.

--




[issue15564] cgi.FieldStorage should not call read_multi on files

2012-08-11 Thread patrick vrijlandt

patrick vrijlandt added the comment:

I would not know how to set the MIME-type of a file during upload. This is 
apparently set by the browser based on the filename (extension). Even (or: 
especially) if this is a bug in all the current browsers, python should provide 
the tools to adapt to this situation.

I could perhaps request the whole form to be "application/octet-stream", but 
the current "multipart/form-data" is appropriate for a form.

You are right about renaming. The innocent test file "test2.txt" can be 
uploaded, but the same file renamed to "test2.mht" causes an exception.

Below is a dump of the posted data (using Chrome in this case); attached a 
script (requiring bottle.py - www.bottlepy.org or PyPI) that demonstrates the 
problem.

There is no doubt that parsing fails; an exception cannot be the result of 
successful parsing. The input may be wrong, but python should offer the 
flexibility to handle wrong input.

Instead, are you sure it is appropriate to *automatically* dissect a file? It 
should be fairly easy to handle for the scripter if he really wants to dig 
deeper.

Headers

Origin: http://localhost:10080
Referer: http://localhost:10080/url-get
Content-Length: 349
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cache-Control: max-age=0
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.75 Safari/537.1
Host: localhost:10080
Accept-Encoding: gzip,deflate,sdch
Accept-Language: nl-NL,nl;q=0.8,en-US;q=0.6,en;q=0.4,en-GB;q=0.2
Content-Type: multipart/form-data; boundary=WebKitFormBoundaryBsBVBYDTxou89uBj

Body

--WebKitFormBoundaryBsBVBYDTxou89uBj
Content-Disposition: form-data; name="data"; filename="test2.mht"
Content-Type: multipart/related

# dit is een test
Dit is een regel
Dit is het einde.
#


--WebKitFormBoundaryBsBVBYDTxou89uBj
Content-Disposition: form-data; name="value"

abc123
--WebKitFormBoundaryBsBVBYDTxou89uBj--
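
For reference, the dump above can be replayed outside the server. The sketch below is an assumption-laden stand-in: it uses the email package, not cgi.py or the attached cgibug.py, and the describe_fields name is invented. It shows that the file field arrives labelled multipart/related without any boundary parameter, so there is nothing for a nested multipart parse to split on.

```python
import email

# The Content-Type header and body from the dump, with CRLF line endings.
CONTENT_TYPE = "multipart/form-data; boundary=WebKitFormBoundaryBsBVBYDTxou89uBj"
BODY = (
    b"--WebKitFormBoundaryBsBVBYDTxou89uBj\r\n"
    b'Content-Disposition: form-data; name="data"; filename="test2.mht"\r\n'
    b"Content-Type: multipart/related\r\n"
    b"\r\n"
    b"# dit is een test\r\n"
    b"Dit is een regel\r\n"
    b"Dit is het einde.\r\n"
    b"#\r\n"
    b"\r\n"
    b"\r\n"
    b"--WebKitFormBoundaryBsBVBYDTxou89uBj\r\n"
    b'Content-Disposition: form-data; name="value"\r\n'
    b"\r\n"
    b"abc123\r\n"
    b"--WebKitFormBoundaryBsBVBYDTxou89uBj--\r\n"
)


def describe_fields(content_type, body):
    """Summarise each form field of a multipart/form-data body."""
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    msg = email.message_from_bytes(raw)
    fields = []
    for part in msg.get_payload():
        fields.append({
            "name": part.get_param("name", header="content-disposition"),
            "filename": part.get_filename(),
            "type": part.get_content_type(),
            "boundary": part.get_boundary(),      # None when not declared
            "nested_parts": part.is_multipart(),  # True only if sub-parts were parsed
        })
    return fields


fields = describe_fields(CONTENT_TYPE, BODY)
```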

--
Added file: http://bugs.python.org/file26764/cgibug.py




[issue15564] cgi.FieldStorage should not call read_multi on files

2012-08-10 Thread Glenn Linderman

Glenn Linderman added the comment:

So the issue you perceive is that a correctly MIME-typed .mht file has a MIME 
type of multipart/related -- but that for the purposes of uploading the file, 
you don't want to treat it as that MIME type, but rather as an opaque data file.

Just give it a different MIME type at the time of upload, like 
application/octet-stream. That is appropriate, if your application wants to 
treat the data as an opaque data stream.

But, you say, none of the browsers support user-specified or user-selectable 
MIME types, but rather they infer the MIME type from the file extension.  So 
that sounds like a bug in the browsers... but also gives an out... change the 
name of the file before uploading it.

The only bug I see here is your comment that the parsing fails.

--




[issue15564] cgi.FieldStorage should not call read_multi on files

2012-08-10 Thread Senthil Kumaran

Changes by Senthil Kumaran :


--
nosy: +orsenthil




[issue15564] cgi.FieldStorage should not call read_multi on files

2012-08-10 Thread Glenn Linderman

Changes by Glenn Linderman :


--
nosy: +v+python




[issue15564] cgi.FieldStorage should not call read_multi on files

2012-08-06 Thread R. David Murray

Changes by R. David Murray :


--
nosy: +r.david.murray




[issue15564] cgi.FieldStorage should not call read_multi on files

2012-08-06 Thread patrick vrijlandt

New submission from patrick vrijlandt:

.mht is an archive format created by Microsoft IE 8 when saving a webpage. It 
is essentially a MIME multipart message.

My problem occurred when I uploaded such a file to a cgi-based server. The 
posted data would be fed to cgi.FieldStorage. (I can't post the file 
unfortunately)

As it turns out, cgi.FieldStorage tries to recursively parse the postdata, 
thereby splitting up the uploaded file; this fails. However, this (automatic) 
recursive behaviour seems unwanted for an uploaded file.

My proposal is thus to adapt cgi.py (line number for Python 3.2), so that in 
FieldStorage.__init__, line 542, read_multi would not be invoked in this case.

Currently it says:

elif ctype[:10] == 'multipart/':
    self.read_multi(environ, keep_blank_values, strict_parsing)

Change this to:

elif ctype[:10] == 'multipart/' and not self.filename:
    self.read_multi(environ, keep_blank_values, strict_parsing)

(I apologise for not submitting a test case. When I try to create one, it is 
either very complicated or not easily recognizable as valid. Moreover, my 
server used third-party software (bottlepy.org: bottle.py).)
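
The effect of the extra condition can be sketched in isolation. The should_read_multi function below is a hypothetical helper, not part of cgi.py; it only mirrors the proposed elif test and assumes that a field without an uploaded file has a filename of None.

```python
def should_read_multi(ctype, filename):
    """Hypothetical helper mirroring the proposed elif condition."""
    return ctype[:10] == 'multipart/' and not filename


# The outer form has no filename, so it would still be split into fields.
outer_form = should_read_multi('multipart/form-data', None)

# An uploaded .mht file has a filename, so it would be left in one piece.
uploaded_mht = should_read_multi('multipart/related', 'test2.mht')
```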

--
components: Library (Lib)
messages: 167548
nosy: patrick.vrijlandt
priority: normal
severity: normal
status: open
title: cgi.FieldStorage should not call read_multi on files
type: behavior
versions: Python 3.2
