[issue18119] urllib.FancyURLopener does not treat URL fragments correctly

2019-04-26 Thread Mark Lawrence


Change by Mark Lawrence :


--
nosy:  -BreamoreBoy

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue18119] urllib.FancyURLopener does not treat URL fragments correctly

2014-10-07 Thread karl

karl added the comment:

OK I fixed the code. The issue is here
https://hg.python.org/cpython/file/1e1c6e306eb4/Lib/urllib/request.py#l656

newurl = urlunparse(urlparts)

Basically it reinjects the fragment in the new url. The fix is easy.

if urlparts.fragment:
urlparts = list(urlparts)
urlparts[5] = 
newurl = urlunparse(urlparts)

I was trying to make a test for it, but failed. Could someone help me for the 
test so I can complete the patch?

Added the code patch only.

--
keywords: +patch
Added file: http://bugs.python.org/file36832/issue18119-code-only.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18119
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue18119] urllib.FancyURLopener does not treat URL fragments correctly

2014-10-06 Thread karl

karl added the comment:

This is the correct behavior
  GET http://example.com/foo
with a response containing 
  302 and Location: /bar#test
must trigger
  http://example.com/bar#test



Location is defined in 
http://tools.ietf.org/html/rfc7231#section-7.1.2

7.1.2. Location


   The Location header field is used in some responses to refer to a
   specific resource in relation to the response.  The type of
   relationship is defined by the combination of request method and
   status code semantics.

 Location = URI-reference

   The field value consists of a single URI-reference.  When it has the
   form of a relative reference ([RFC3986], Section 4.2), the final
   value is computed by resolving it against the effective request URI
   ([RFC3986], Section 5).

A bit after in the spec.


   If the Location value provided in a 3xx (Redirection) response does
   not have a fragment component, a user agent MUST process the
   redirection as if the value inherits the fragment component of the
   URI reference used to generate the request target (i.e., the
   redirection inherits the original reference's fragment, if any).

   For example, a GET request generated for the URI reference
   http://www.example.org/~tim; might result in a 303 (See Other)
   response containing the header field:

 Location: /People.html#tim

   which suggests that the user agent redirect to
   http://www.example.org/People.html#tim;

   Likewise, a GET request generated for the URI reference
   http://www.example.org/index.html#larry; might result in a 301
   (Moved Permanently) response containing the header field:

 Location: http://www.example.net/index.html

   which suggests that the user agent redirect to
   http://www.example.net/index.html#larry;, preserving the original
   fragment identifier.

--
nosy: +karlcow

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18119
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue18119] urllib.FancyURLopener does not treat URL fragments correctly

2014-10-06 Thread Shuhei Takahashi

Shuhei Takahashi added the comment:

Hi karl,

Of course it is correct that the user agent is redirected to 
http://example.com/bar#test when it got such response. However, it never means 
UA can send an HTTP request containing fragment part.

In RFC7230 section 3.1.1, HTTP request line is defined as:
request-line   = method SP request-target SP HTTP-version CRLF
where request-target is defined in section 5.3. In usual case, it is 
origin-form:
origin-form= absolute-path [ ? query ]
which never includes fragment part.

This means current urllib behavior violates HTTP/1.1 spec.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18119
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue18119] urllib.FancyURLopener does not treat URL fragments correctly

2014-10-06 Thread karl

karl added the comment:

Takahashi-san,

Ah sorry misunderstood which part your were talking about. I assume wrongly you 
were talking about navigation. 

Yes for the request which is sent to the server it should be
http://tools.ietf.org/html/rfc7230#section-5.3.1
So refactoring your example.

1st request:

GET /foo HTTP/1.1
Accept: text/html
Host: example.com

server response
HTTP/1.1 302 Found
Location: /bar#test

second request must be
GET /bar HTTP/1.1
Accept: text/html
Host: example.com

As for the navigation context is indeed part of the piece of code taking in 
charge the document after being parsed and not the one doing the HTTP request. 
(putting it here just that people understand)


(to be tested)
For server side receiving invalid Request-line 
http://tools.ietf.org/html/rfc7230#section-3.1.1

   Recipients of an invalid request-line SHOULD respond with either a
   400 (Bad Request) error or a 301 (Moved Permanently) redirect with
   the request-target properly encoded.  A recipient SHOULD NOT attempt
   to autocorrect and then process the request without a redirect, since
   the invalid request-line might be deliberately crafted to bypass
   security filters along the request chain.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18119
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue18119] urllib.FancyURLopener does not treat URL fragments correctly

2014-10-06 Thread karl

karl added the comment:

In class urlopen_HttpTests
https://hg.python.org/cpython/file/4f314dedb84f/Lib/test/test_urllib.py#l191

there is a test for invalid redirects

def test_invalid_redirect(self):
https://hg.python.org/cpython/file/4f314dedb84f/Lib/test/test_urllib.py#l247

And one for fragments
def test_url_fragment(self):
https://hg.python.org/cpython/file/4f314dedb84f/Lib/test/test_urllib.py#l205

which refers to http://bugs.python.org/issue11703

code in 
https://hg.python.org/cpython/file/d5688a94a56c/Lib/urllib.py

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18119
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue18119] urllib.FancyURLopener does not treat URL fragments correctly

2014-10-03 Thread Mark Lawrence

Mark Lawrence added the comment:

Slipped under the radar?

--
nosy: +BreamoreBoy

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18119
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue18119] urllib.FancyURLopener does not treat URL fragments correctly

2013-06-08 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti, orsenthil

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18119
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue18119] urllib.FancyURLopener does not treat URL fragments correctly

2013-06-04 Thread Mher Movsisyan

Changes by Mher Movsisyan mher.movsis...@gmail.com:


--
nosy: +mher.movsisyan

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18119
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue18119] urllib.FancyURLopener does not treat URL fragments correctly

2013-06-02 Thread Shuhei Takahashi

New submission from Shuhei Takahashi:

When urllib.FancyURLopener encounters 302 redirection to a URL with fragments, 
it sends wrong URL to servers.

For example, if we run:
  urllib.urlopen('http://example.com/foo')
and the server responds like following.
  HTTP/1.1 302 Found
  Location: /bar#test
Then urllib tries next to fetch http://example.com/bar#test, instead of 
http://example.com/bar.

Correctly, urllib should strip fragment part of URL before issuing requests.

--
components: Library (Lib)
messages: 190480
nosy: takahashi.shuhei
priority: normal
severity: normal
status: open
title: urllib.FancyURLopener does not treat URL fragments correctly
type: behavior
versions: Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18119
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com