[issue23350] Content-length is incorrect when request body is a list or tuple

2015-04-14 Thread Vincent Alquier

Vincent Alquier added the comment:

Martin: You're right, it's the same issue, and only related to Python 2's 
old-style classes. Sorry for the useless noise.

Demian: My problem is that `len(obj)` raises an AttributeError in Python 2 
(with obj being an old-style class instance). It's Python 2.x specific and I 
don't think it is going to be addressed (reading comments in #15267). But 
thanks for your code rework.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23350
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23350] Content-length is incorrect when request body is a list or tuple

2015-04-13 Thread Martin Panter

Martin Panter added the comment:

Vincent: That sounds more like a case of Issue 15267, or have you found a way 
to trigger the AttributeError in Python 3 as well?

--




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-04-13 Thread Demian Brecht

Demian Brecht added the comment:

Vincent: The logic to determine content length is undergoing a bit of an 
overhaul as part of #12319, which I'm hoping to wrap up in the next week or so.

--




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-04-13 Thread Vincent Alquier

Vincent Alquier added the comment:

Another issue should be addressed by the patch...

When trying to guess the content-length, here is the code you find...

    try:
        thelen = str(len(body))
    except TypeError as te:
        [...]

The call to `len` will raise a `TypeError` in case of a C file object. But if 
body is a python file-like object, it will often raise an `AttributeError`.

So, the code should be replaced by (in both Python 2.7 and 3):

    try:
        thelen = str(len(body))
    except (TypeError, AttributeError):
        [...]
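
A minimal sketch of the fallback described above, assuming a hypothetical helper named `guess_content_length` (it is not part of http.client). `NoLenBody` is an illustrative stand-in whose `__len__` raises `AttributeError`, imitating what an old-style Python 2 instance does; on Python 3 an object without `__len__` raises `TypeError` instead.

```python
import io
import os
import tempfile


def guess_content_length(body):
    """Return the body size in bytes, or None if it cannot be determined.

    Hypothetical helper for illustration, not the http.client implementation.
    """
    try:
        return len(body)
    except (TypeError, AttributeError):
        # No usable __len__: fall back to fstat() on a real file descriptor.
        try:
            return os.fstat(body.fileno()).st_size
        except (AttributeError, OSError):
            # Not backed by a real file either: give up on guessing.
            return None


class NoLenBody:
    """Stand-in for a file-like object whose len() raises AttributeError."""

    def __len__(self):
        raise AttributeError("NoLenBody instance has no attribute '__len__'")

    def read(self, size=-1):
        return b""
```

With this shape, a bytes body reports its length, a real file reports its size from fstat, and both `io.BytesIO` and `NoLenBody` end up with no guessed length, so the caller would have to supply Content-Length explicitly.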

--
nosy: +Pinz




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-04-12 Thread R. David Murray

R. David Murray added the comment:

See also issue 12327 for length issue using StringIO.

--




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-03-31 Thread Demian Brecht

Demian Brecht added the comment:

The computation of Content-Length has also undergone some refactoring as part 
of #12319. Setting this as pending until #12319 has been accepted or rejected. 
If rejected, the implementation specific to generating Content-Length should be 
migrated here.

--
status: open -> pending




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-03-31 Thread Demian Brecht

Demian Brecht added the comment:

If #12319 is accepted, the implementation for Content-Length should also likely 
be migrated to this issue to be applied to maintenance branches as a bug fix.

--
status: pending -> open




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-03-22 Thread R. David Murray

R. David Murray added the comment:

Well, the current reality not counting the bug reported in this issue.  So, I 
documented it as if the fix here is to not set the length when body is an 
iterator.

--




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-03-22 Thread R. David Murray

R. David Murray added the comment:

I just updated the docs to what I think is the current reality.  See issue 
23740 for what I think are problems with the current implementation, aside from 
any enhancement of computing a length for tuple or list.  Since the latter 
cannot be done reliably unless we know the list or tuple is all bytes, I 
propose that we don't do it at all (since I'd like to see iterables of text 
strings supported).

--
nosy: +r.david.murray




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-03-22 Thread Martin Panter

Martin Panter added the comment:

David: Calculating the length of a list or tuple of Latin-1 text strings should 
actually be straight-forward and reliable, because it is a single-byte encoding.

However I am starting to think adding a new special case for lists and tuples 
is a bad idea. There are already way too many special cases.

--




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-03-22 Thread Demian Brecht

Demian Brecht added the comment:

@Serhiy:
> Content-Length shouldn't be calculated for lists, tuples, and other 
> non-bytes-compatible sequences.

I'd agree with this if it wasn't relatively trivial to calculate. There's no 
reason that I can think of to exclude the auto-generated Content-Length header 
for data types for which the size is known.


@Martin:
> Technically I don’t think there is a bug.

The bug is that the Content-Length header is currently added for bodies that 
are lists and tuples and the value is incorrect. For example:

con.request('POST', '/', ['aaa', 'bbb', 'ccc'])

results in 

Host: example.com
Accept-Encoding: identity
Content-Length: 3
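
To make the discrepancy concrete, here is a small sketch with no network involved. It compares what `len()` reports for that body with the number of bytes that would go on the wire, assuming each text chunk is Latin-1 encoded before sending (which is what the patch in this issue does).

```python
body = ['aaa', 'bbb', 'ccc']

# len() counts the list elements; this is the value that ends up in the header.
reported = len(body)

# The payload is the concatenation of the encoded chunks.
actual = len(b''.join(chunk.encode('iso-8859-1') for chunk in body))

print(reported, actual)  # prints: 3 9
```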


@David:
> Since the latter cannot be done reliably unless we know the list or tuple is 
> all bytes, I propose that we don't do it at all (since I'd like to see 
> iterables of text strings supported).

The patch here adds support for iterables of text strings (as well as iterables 
comprised of both bytes and strings). Content-Length can be computed reliably 
as the size of a latin1-encoded string will be identical to the original string.

--




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-03-21 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

There are two issues. The first is that the calculated Content-Length is not 
correct for lists, tuples, and other types (such as deque or array.array). The 
right solution is to calculate size using a technique used in urllib. 
Content-Length shouldn't be calculated for lists, tuples, and other 
non-bytes-compatible sequences. This is a bug, and the patch should be applied 
to all maintained releases.

The second issue is a feature request: allow calculating Content-Length for 
lists and tuples.

--
nosy: +serhiy.storchaka




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-03-21 Thread Martin Panter

Martin Panter added the comment:

Technically I don’t think there is a bug. The documentation says [the] 
“Content-Length header should be explicitly provided”, so if you don’t set it 
you could argue that you’re using the library wrong.

For this issue I think Demian was trying to add support (i.e. new feature) for 
implicit Content-Length with tuples and lists of bytes (or strings). He has 
also added support for iterables of Latin-1 encodable text strings.

What you are suggesting, Serhiy, sounds like a separate new feature to support 
bodies of arbitrary bytes-like objects (or lists or tuples of them). According 
to the documentation, only byte and Latin-1 text strings, file objects 
supporting stat(), and iterables are currently supported. It does not say, but 
before this patch I think the iterables had to be of bytes-like objects.

--




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-02-20 Thread Berker Peksag

Changes by Berker Peksag berker.pek...@gmail.com:


--
nosy: +berker.peksag
stage:  -> patch review




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-02-13 Thread Demian Brecht

Demian Brecht added the comment:

Thanks for the review Martin, I've addressed your comments.

> The length of an encoded Latin-1 string should equal the length of the 
> unencoded text string, since it is a one-to-one character-to-byte encoding.
Once in a while, I want to stop what I'm doing, put my head in my hands
and think to myself "how did that escape me?!" Of course you're right
and thanks for the catch. I've reverted the handling to how it was being
done in the previous patch.

> Though I’m not particularly excited by silently Latin-1 encoding text bodies 
> in the first place.
Truth be told, I'm more fond of only accepting pre-encoded byte strings
as input. However, that backwards incompatible change would likely break
many things. Request bodies can currently be strings, byte strings,
iterables or file objects. In the cases of string and file objects,
encoding is already supported. The change I made makes handling
iterables consistent with the other accepted data types.

I'm not sure why, but the auto-encoding of the raw string input object
was being done higher up in the general use case callstack
(Lib/http/client.py:1064). I've moved this handling to send() for
consistency with the auto-encoding of other input types. This also
ensures consistent behavior between calling request() with a string body
and calling send() directly.
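
As a rough illustration of the consistency described above, the following sketch normalizes any supported in-memory body to a stream of bytes chunks, Latin-1 encoding text wherever it appears. `iter_encoded` is a hypothetical name used only here; the patch performs the equivalent steps inline inside send().

```python
def iter_encoded(body):
    """Yield the body as bytes chunks, Latin-1 encoding any text.

    Hypothetical helper for illustration; file objects are not handled.
    """
    if isinstance(body, (str, bytes, bytearray)):
        # Treat a single string as a one-element iterable.
        body = [body]
    for chunk in body:
        if isinstance(chunk, str):
            chunk = chunk.encode('iso-8859-1')
        yield chunk
```

Because a plain string and a list of strings go through the same encoding step, the bytes produced do not depend on which entry point received the text.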

--
Added file: http://bugs.python.org/file38130/list_content_length_3.patch

diff -r e548ab4ce71d Doc/library/http.client.rst
--- a/Doc/library/http.client.rst   Mon Feb 09 19:49:00 2015 +
+++ b/Doc/library/http.client.rst   Fri Feb 13 07:45:43 2015 -0800
@@ -212,8 +212,10 @@
    contents of the file is sent; this file object should support ``fileno()``
    and ``read()`` methods. The header Content-Length is automatically set to
    the length of the file as reported by stat. The *body* argument may also be
-   an iterable and Content-Length header should be explicitly provided when the
-   body is an iterable.
+   an iterable. If the iterable is a tuple or list, the Content-Length will
+   automatically be set if not already supplied in the request headers.
+   In all other iterable cases, the Content-Length header should be explicitly
+   provided.
 
    The *headers* argument should be a mapping of extra HTTP
    headers to send with the request.
@@ -221,6 +223,10 @@
    .. versionadded:: 3.2
       *body* can now be an iterable.
 
+   .. versionadded:: 3.5
+      The Content-Length header will be set when *body* is a list or tuple.
+
+
 .. method:: HTTPConnection.getresponse()
 
    Should be called after a request is sent to get the response from the server.
diff -r e548ab4ce71d Lib/http/client.py
--- a/Lib/http/client.py   Mon Feb 09 19:49:00 2015 +
+++ b/Lib/http/client.py   Fri Feb 13 07:45:43 2015 -0800
@@ -836,11 +836,19 @@
                     datablock = datablock.encode("iso-8859-1")
                 self.sock.sendall(datablock)
             return
+
+        if isinstance(data, str):
+            # RFC 2616 Section 3.7.1 says that text default has a
+            # default charset of iso-8859-1.
+            data = data.encode('iso-8859-1')
+
         try:
             self.sock.sendall(data)
         except TypeError:
             if isinstance(data, collections.Iterable):
                 for d in data:
+                    if hasattr(d, 'encode'):
+                        d = d.encode('iso-8859-1')
                     self.sock.sendall(d)
             else:
                 raise TypeError("data should be a bytes-like object "
@@ -1031,20 +1039,25 @@
 
     def _set_content_length(self, body):
         # Set the content-length based on the body.
-        thelen = None
-        try:
-            thelen = str(len(body))
-        except TypeError as te:
-            # If this is a file-like object, try to
-            # fstat its file descriptor
+        size = None
+        if isinstance(body, (list, tuple)):
+            # the body will either be already encoded or will be latin-1
+            # encoded when being sent. as latin-1 and ascii strings are of
+            # equal size, there isn't a need to make a distinction here.
+            size = sum(len(line) for line in body)
+        else:
             try:
-                thelen = str(os.fstat(body.fileno()).st_size)
-            except (AttributeError, OSError):
-                # Don't send a length if this failed
-                if self.debuglevel > 0: print("Cannot stat!!")
+                size = len(body)
+            except TypeError:
+                try:
+                    size = os.fstat(body.fileno()).st_size
+                except (AttributeError, OSError):
+                    if self.debuglevel > 0:
+                        print("Cannot stat!!")
+                    size = None
 
-        if thelen is not None:

[issue23350] Content-length is incorrect when request body is a list or tuple

2015-02-13 Thread Martin Panter

Martin Panter added the comment:

New patch looks good I think. Making the encoding code more consistent is nice.

--




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-02-12 Thread Demian Brecht

Demian Brecht added the comment:

Thanks for the clarification Martin. After giving this some further thought, I 
think that the best way to go is to /only/ calculate and add the Content-Length 
header if each element in the list or tuple is pre-encoded. If it's mixed or 
only strings, then there is one of three options:

1. Don't try to compute the Content-Length automatically as encoding may change 
the number of bytes being sent across as the body (which is what the 
Content-Length represents)
2. Encode the entire body twice, once during the computation of the 
Content-Length and again on the fly as the body is being written to the socket. 
This will incur 2x the CPU cost, which can be costly if the body is 
sufficiently large.
3. Encode the entire body once and store it in memory. Given body sizes can be 
quite large, I don't think that duplicating the body would be a good approach.

The attached patch uses option 1 in order to not add any CPU or memory cost to 
the operation, but still fix the Content-Length issue as reported. I've also 
updated the docs to indicate as much.
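
A minimal sketch of option 1, assuming a hypothetical helper named `content_length_if_preencoded` (the name and exact shape are illustrative and not taken from the attached patch). It reports a length only when every element is already a byte string, and otherwise leaves the header for the caller to supply.

```python
def content_length_if_preencoded(body):
    """Return the total size of a list/tuple of byte strings, else None.

    Hypothetical helper illustrating option 1; not the patch's code.
    """
    if not isinstance(body, (list, tuple)):
        return None
    if not all(isinstance(chunk, (bytes, bytearray)) for chunk in body):
        # Mixed or text-only bodies: do not guess, so no extra encoding
        # pass and no in-memory copy of the body is needed.
        return None
    return sum(len(chunk) for chunk in body)
```

The check is a single pass over the elements and never encodes anything, which is the point of choosing option 1 over options 2 and 3.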

--
Added file: http://bugs.python.org/file38125/list_content_length_2.patch




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-02-12 Thread Martin Panter

Martin Panter added the comment:

The length of an encoded Latin-1 string should equal the length of the 
unencoded text string, since it is a one-to-one character-to-byte encoding. So 
encoding should not actually be needed to determine the Latin-1 encoded length. 
Though I’m not particularly excited by silently Latin-1 encoding text bodies in 
the first place.
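
A short sketch of that property: for text that is Latin-1 encodable, the character count and the encoded byte count agree, whereas a multi-byte encoding such as UTF-8 can differ. The sample strings are arbitrary choices for illustration.

```python
text = 'caf\xe9'  # "café"; every character is within the Latin-1 range

latin1_len = len(text.encode('iso-8859-1'))
utf8_len = len(text.encode('utf-8'))

print(len(text), latin1_len, utf8_len)  # prints: 4 4 5

# Characters outside Latin-1 (here the euro sign) cannot be encoded at all,
# so the length shortcut only applies to text that is actually encodable.
try:
    '\u20ac'.encode('iso-8859-1')
    encodable = True
except UnicodeEncodeError:
    encodable = False
```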

--




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-01-30 Thread Demian Brecht

Demian Brecht added the comment:

On 2015-01-29 9:51 PM, Martin Panter wrote:
> The documentation currently says “Content-Length header should be explicitly 
> provided when the body is an iterable”. See Lib/urllib/request.py:1133 for 
> how it is done for urlopen(), using memoryview(), which is probably more 
> correct.

Sure, entirely disabling computing the content length if the body is an
iterable is one way to address it, but I'm not convinced that it's
better. If the ability is there to compute the content length, why not
do so?

The current implementation /should/ be correct whether elements are
bytes or strings (the default encoding doesn't allow for multi-byte
chars, so len(raw_string) should equal len(encoded_string)) when encoded
using the new block of encoding code I added in the patch.

Is there something that I'm missing? I could possibly see an argument
for performance as you're iterating over each element in the list, but
that would be entirely circumvented if the user defines the content
length up front.

--




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-01-30 Thread Martin Panter

Martin Panter added the comment:

Sorry my comment was a bit rushed. I wasn’t saying this feature shouldn’t be 
added. I guess I was pointing out two things:

1. Someone should update the documentation to say that Content-Length no 
longer has to be explicitly provided for lists and tuples.

2. Perhaps you could consider using the same len(memoryview) * 
memoryview.itemsize technique used in urllib, so that the length of e.g. 
array.array("I", range(3)) is correct. But that is tangential to what you are 
trying to achieve, and now I realize coping with Latin-1 encoding at the same 
time might make it a bit too complicated, so perhaps don’t worry about it :)
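
A small sketch of the technique mentioned in point 2, using only the standard library. It shows why `len()` alone is the wrong measure for a buffer whose items are wider than one byte; the item size of type code "I" is platform dependent, so no specific byte count is assumed.

```python
import array

data = array.array('I', range(3))
view = memoryview(data)

element_count = len(data)               # number of items, not bytes
byte_count = len(view) * view.itemsize  # size of the underlying buffer

# memoryview.nbytes reports the same total directly.
print(element_count, byte_count, view.nbytes)
```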

--




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-01-29 Thread Demian Brecht

New submission from Demian Brecht:

Rather than summing the value of each element of a list or tuple to use as the 
value of the content-length header, the length of the list or tuple is used.

--
files: list_content_length.patch
keywords: patch
messages: 235012
nosy: demian.brecht
priority: normal
severity: normal
status: open
title: Content-length is incorrect when request body is a list or tuple
Added file: http://bugs.python.org/file37913/list_content_length.patch




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-01-29 Thread Demian Brecht

Changes by Demian Brecht demianbre...@gmail.com:


--
components: +Library (Lib)
type:  -> behavior
versions: +Python 3.5




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-01-29 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
nosy: +orsenthil




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-01-29 Thread Martin Panter

Changes by Martin Panter vadmium...@gmail.com:


--
nosy: +vadmium




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-01-29 Thread Martin Panter

Martin Panter added the comment:

The documentation currently says “Content-Length header should be explicitly 
provided when the body is an iterable”. See Lib/urllib/request.py:1133 for how 
it is done for urlopen(), using memoryview(), which is probably more correct.

--




[issue23350] Content-length is incorrect when request body is a list or tuple

2015-01-29 Thread Demian Brecht

Demian Brecht added the comment:

Updated patch based on review.

--
Added file: http://bugs.python.org/file37915/list_content_length_1.patch
