[issue10231] SimpleHTTPRequestHandler directory bugs

2017-11-26 Thread Berker Peksag

Berker Peksag  added the comment:

Note that the macpath module has been deprecated in Python 3.7 in issue 9850 
and it's going to be removed in Python 3.8 (and Python 3.6 is the only Python 3 
release that accepts bug fixes at the moment) Perhaps it's not worth to fix the 
macpath case at all.

--
nosy: +berker.peksag

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2017-11-25 Thread Martin Panter

Martin Panter  added the comment:

I read in PEP 11 that Mac OS 9 support was dropped in Python 2.4.

I agree that eliminating “.” and “..” components makes sense, since that is how 
they should be handled when resolving relative URLs. But it seems low priority, 
since this doesn’t happen on current supported platforms, and would only be 
triggered by an invalid HTTP request.

--
priority: normal -> low
stage: resolved -> needs patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2017-11-25 Thread Hallvard B Furuseth

Hallvard B Furuseth  added the comment:

On 26/11/17 04:59, Martin Panter wrote:
> That leaves the fourth complaint, which I don’t understand: ‘translate_path() 
> does not handle initial "."/".." on non-Posix systems’.
>
> As far as I know, in 2010 (and still in 2017) the only non-Posix system 
> Python supported was Windows. But Windows has os.curdir = "." and os.pardir = 
> "..", just like Posix.

os.macpath has ":" and "::".

I don't remember if that's what I was thinking though.  Maybe just
"non-posixpath.py".  A generic problem - you have to think about
each os.path implementation to see if the translate_path()
is valid.  If you someday add support for another OS, that can
break a working translate_path().  My proposed code would fix that,
at least for that particular code.

> There is a quirk where requests like “GET .” and “GET ../path” will retain 
> the dots after passing through “posixpath.normpath”. If there was a platform 
> where a single or double dot was a legal file or directory name, the server 
> will access the corresponding file or directory in these cases. But this does 
> not seem like a problem.

More generally, translate_path() ought to escape characters and
character combinations which have special meaning in the filesystem.
But that can be hairy, as the *url2path.py modules demonstrate,
and it would break compatibility with people's existing directory
structures.  And with ospath->URL transation elsewhere, I'm sure.

--
status: pending -> open

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2017-11-25 Thread Martin Panter

Martin Panter  added the comment:

The first two bugs ("foo/dir?baz" and "foo/dir?baz/") were solved by Issue 
23112. The third (".../foo.html/") was solved by Issue 17324.

That leaves the fourth complaint, which I don’t understand: ‘translate_path() 
does not handle initial "."/".." on non-Posix systems’.

As far as I know, in 2010 (and still in 2017) the only non-Posix system Python 
supported was Windows. But Windows has os.curdir = "." and os.pardir = "..", 
just like Posix.

There is a quirk where requests like “GET .” and “GET ../path” will retain the 
dots after passing through “posixpath.normpath”. If there was a platform where 
a single or double dot was a legal file or directory name, the server will 
access the corresponding file or directory in these cases. But this does not 
seem like a problem.

I propose to close this, unless there really is a bug with non-Posix systems.

--
nosy: +martin.panter
resolution:  -> out of date
stage:  -> resolved
status: open -> pending
superseder:  -> SimpleHTTPServer/http.server adds trailing slash after query 
string

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-11-22 Thread Hallvard B Furuseth

Hallvard B Furuseth  added the comment:

Senthil Kumaran writes:
> I have doubts on the validity of this bug itself.
> 
> - First is, query and fragment are usually for the file being served
> from the webserver, not on the directories. If there are characters such
> as '?' and '#' in the directory names, which may get featured in the
> path, then those should be quoted in the request. So foo/dir?baz is
> wrong where as foo/dir%3Fbaz it the correct request.

That's backwards.  Start with the URL spec (RFC 3986), not with
thinking of filesystem paths.  If '?' or '#' do occur in the URL, they
are not part of the path.  That is the case this bug report is about.

That's because it reserves these characters for query and fragment.
So yes, the if filesystem path contains '?' or '#', these must be
escaped in the URL.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-11-21 Thread Jeremy Thurgood

Jeremy Thurgood  added the comment:

On Sun, Nov 21, 2010 at 17:11, Senthil Kumaran wrote:

>>I can't see any situation in which redirecting
>> "/something?foo" to "/something?foo/" is the correct behaviour.

> As I explained, in the previous post, this would *not happen* in
> practical scenarios, because code won't reach that point for valid
> URLs.

It reaches that point in the tests I added, and the results confirm
the first two points in the original bug report. Am I mistaken?

"/something?foo" is a valid URL. If "/something" is translated to a
file on the filesystem, the content of that file is returned. If it is
translated to a directory on the filesystem, a 301 to
"/something?foo/" is returned.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-11-21 Thread Senthil Kumaran

Senthil Kumaran  added the comment:

On Sun, Nov 21, 2010 at 12:12:08PM +, Jeremy Thurgood wrote:
> I see your point now, but I don't agree with it completely. It seems
> reasonable to allow query parameters to specify things like sort
> order for a directory listing or have a fragment to focus the
> browser on a particular entry.

Can you please point me to some examples where such a kind of behavior
is exhibited or designed?

> On the other hand, if we don't want to support the
> redirect with a fragment or query parameters, we should instead return
> a 400 response. 

SimpleHTTPRequestHandler does not support REDIRECT on *a path* (any
path, directory or file, for that matter).  This bug was about a
primitive case where directory in the file system is not specified
with '/', it does a Hardcoded 301 redirect and adds a '/'.

>I can't see any situation in which redirecting
> "/something?foo" to "/something?foo/" is the correct behaviour.

As I explained, in the previous post, this would *not happen* in
practical scenarios, because code won't reach that point for valid
URLs.

> A 401 is "Unauthorized", which means the server is asking for
> authentication -- I don't think that's relevant here.

I am sorry, this was a typo.
It fails with -> urllib.error.HTTPError: HTTP Error 400: Bad Request

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-11-21 Thread Jeremy Thurgood

Jeremy Thurgood  added the comment:

On Sun, Nov 21, 2010 at 10:37, Senthil Kumaran  wrote:
> Now, what happens when you type "http://bugs.python.org?10231"; [1] in
> your browser? According to this bug report, the server should 301
> redirect it to "http://bugs.python.org/?10231";. If you try this, this
> does not happen. The browser (client) is in fact, changing it to the
> corrected URL (because the original is invalid) and the server is just
> ignoring the so called query portion).

I see your point now, but I don't agree with it completely. It seems
reasonable to allow query parameters to specify things like sort order
for a directory listing or have a fragment to focus the browser on a
particular entry. On the other hand, if we don't want to support the
redirect with a fragment or query parameters, we should instead return
a 400 response. I can't see any situation in which redirecting
"/something?foo" to "/something?foo/" is the correct behaviour.

> If you use, urllib2 to request the above [1], you will find that it
> will fail with 401 error.

A 401 is "Unauthorized", which means the server is asking for
authentication -- I don't think that's relevant here.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-11-21 Thread Senthil Kumaran

Senthil Kumaran  added the comment:

On Sat, Nov 20, 2010 at 07:09:58PM +, Jeremy Thurgood wrote:
> There are two separate things here: the URL and the filesystem path. 
> The bug is that the fragment and query parameters are stripped in
> translate_path(), but are *not* stripped when manipulating the URL
> for the redirect.

Thanks for the explanation. My understanding of the bug (and the
patch) is same as yours.

> This means that when the URL is "/something?foo" and the cwd is

Now, lets stop here for a moment to reflect the meaning of this URL.

Here is the Query string article from Wikipedia:
http://en.wikipedia.org/wiki/Query_string

It gives a generic definition:

A typical URL containing a query string is as follows:

http://server/path/program?query_string

When a server receives a request for such a page, it runs a
program (if configured to do so), passing the query_string
unchanged to the program. The question mark is used as a separator
and is not part of the query string.

Given this, in the URL /something?foo , can 'something' be a directory
in the filesystem? To which program will query string be sent? ( You
can say that a program by name index{.py,cgi,pl) inside the something
directory *can* handle it, but I don't see such a practical scenario
or CGI server configuration)

- Same argument a #fragment_identifier. This is  portion of webpage
  (anchor tag), a file. It cannot start immediately after a
  file-system directory path.

In the http.server code, you will see that 301 redirection part is
entered only when the last part is directory, but in cases where a?q
and a#f, ('a' wont be a directory, so that redirection part is not
entered at all).

Let's come to some real world scenarios.

Now, what happens when you type "http://bugs.python.org?10231"; [1] in
your browser? According to this bug report, the server should 301
redirect it to "http://bugs.python.org/?10231";. If you try this, this
does not happen. The browser (client) is in fact, changing it to the
corrected URL (because the original is invalid) and the server is just
ignoring the so called query portion). 

If you use, urllib2 to request the above [1], you will find that it
will fail with 401 error.

These are the reasons, why I consider this report is not really a bug.
Any suggestions or thoughts on the explanation?

In your attached patch, I think we should still be able to use some
tests without any change in the http.server code.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-11-20 Thread Jeremy Thurgood

Jeremy Thurgood  added the comment:

Updated patch as per previous comment.

--
Added file: http://bugs.python.org/file19701/issue10231_v2.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-11-20 Thread Jeremy Thurgood

Jeremy Thurgood  added the comment:

Thanks for the comments.

There are two separate things here: the URL and the filesystem path. The only 
part of the URL we care about is the path section, but the fragment ("#anchor") 
and query parameters ("?foo") are valid -- SimpleHTTPRequestHandler just 
ignores them. translate_path() turns the URL into the filesystem path, which 
may be a file or a directory, by extracting the URL path and mapping it onto 
the filesystem.

The bug is that the fragment and query parameters are stripped in 
translate_path(), but are *not* stripped when manipulating the URL for the 
redirect.

This means that when the URL is "/something?foo" and the cwd is "/tmp", the 
filesystem path is "/tmp/something" (which is a directory) and therefore the 
response needs to be a redirect. The redirect needs to modify the path section 
of the URL (which is "/something") to add a slash. This means the redirect 
needs to be to "/something/" (or "/something/?foo" if you want to preserve the 
query parameters) rather than "/something?foo/" which is what the current 
implementation does.

translate_path() unescapes the URL path before mapping it to the filesystem, 
which means that "/something%3Ffoo" (and even "/something%3Ffoo?bar") will be 
turned into the filesystem path "/tmp/something?foo".

I'll add some tests for the "/something%3Ffoo" case and possibly update 
send_head() to preserve the fragment and query parameters on redirect.

--
status: pending -> open

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-11-20 Thread Senthil Kumaran

Senthil Kumaran  added the comment:

I have doubts on the validity of this bug itself.

- First is, query and fragment are usually for the file being served from the 
webserver, not on the directories. If there are characters such as '?' and '#'  
in the directory names, which may get featured in the path, then those should 
be quoted in the request. So foo/dir?baz is wrong where as foo/dir%3Fbaz it the 
correct request.

We see the 301 redirection code in http.server code, the for the cases wherein 
it the "directory" is not specified with '/'. It just adds the '/' to get to 
the path. The code explicitly checks that path is directory before doing '/' 
added 301 redirection.

In the patch's first case:

+response = self.request(self.tempdir_name + '?foo')

This is wrong because /tmp/somthing?foo (Is invalid path - It should be quoted 
for it be a PATH and follow the 301 redirection to list its contents by 
appending '/')

To verify the above points, just create a file foo?bar and directory foo?baz 
and serve those using http.server, you come to know that the interpretation by 
the OP does not come up here.

If you any counter arguments to the above, please provide good examples or a 
better yet, the test_httpservers patch.

--
status: open -> pending

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-11-20 Thread Georg Brandl

Changes by Georg Brandl :


--
assignee:  -> orsenthil

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-11-20 Thread Jeremy Thurgood

Jeremy Thurgood  added the comment:

Attached a patch to test for and fix the first two issues described in this 
ticket.

Basically, it modifies SimpleHTTPRequestHandler.send_head() to operate on a 
path already stripped of the query string and fragment rather than the 
completely unparsed URL.

--
keywords: +patch
nosy: +jerith
Added file: http://bugs.python.org/file19654/issue10231.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-11-04 Thread Hallvard B Furuseth

Changes by Hallvard B Furuseth :


--
versions: +Python 2.7, Python 3.1

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-11-04 Thread STINNER Victor

Changes by STINNER Victor :


--
nosy: +haypo

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-10-29 Thread R. David Murray

Changes by R. David Murray :


--
nosy: +orsenthil

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10231] SimpleHTTPRequestHandler directory bugs

2010-10-29 Thread Hallvard B Furuseth

New submission from Hallvard B Furuseth :

SimpleHTTPRequestHandler directory bugs

Running 3.2a3 http/server.py or 2.7 SimpleHTTPServer.py as a script:

* Redirection appends "/" to the unparsed URL instead of to the
  pathname component of the parsed URL: "foo/dir?baz" => "foo/dir?baz/".

* The unparsed URL is also used to check if the URL ends with "/".
  Thus "foo/dir?baz/" gives a directory listing instead of redirecting,
  which makes the files relative to "foo/" instead of to "foo/dir/".

* translate_path("directory/") produces filenames without a final "/".
  Not sure if that is correct for CGI env['PATH_TRANSLATED'].  Anyway:
  This means a non-directory file with a final slash is accepted, but
  again relative URLs in that file will refer to the wrong absolute URL.
  ".../foo.html/" + relative URL "bar.html" -> ".../foo.html/bar.html".

  However if translate_path("...foo/") is changed and you use stat() on
  the result, I do not know if all relevant directory operations work
  with the final directory separator on all OSes.  I seem to remember
  getting errors in some OS for stat("dirname/", &st) in C.

* translate_path() does not handle initial "."/".." on non-Posix systems.
  If that's wrong, it can (ignoring other issues listed here) do this:
  drop = frozenset((os.curdir, os.pardir, '', '.', '..'))
  for ...:
  if word not in drop: os.path.join(path, word)
  Though it looks a bit quicker to do
  words, drop = [], frozenset((os.curdir, os.pardir, '', '.', '..'))
  for word in filter(None, path.split('/')):
  word = os.path.split(os.path.splitdrive(word)[1])[1]
  if word not in drop: words.append(word)
  return os.path.join(os.getcwd(), *words)
  unless that can somehow produce a different result.

--
components: Library (Lib)
messages: 119899
nosy: hfuru
priority: normal
severity: normal
status: open
title: SimpleHTTPRequestHandler directory bugs
type: behavior
versions: Python 3.2

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com