Well, I will repeat the question:
Can I find out how many bytes a string object has, independently of its encoding?
Is the len function the right way to get it?
Laci, look at the following code:
import urllib2
request = urllib2.Request(url= 'http://localhost:6000')
data = 'data to send\n'.encode('utf_8')
request.add_data(data)
request.add_header('content-length', str(len(data)))
request.add_header('content-encoding', 'UTF-8')
file = urllib2.urlopen(request)
Is it always true that the size of the entity-body is len(data),
independently of the encoding of data?
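
To make the question concrete, here is a minimal sketch of the same request written for Python 3, where urllib2 became urllib.request. It only builds the request object and never sends it. Two things in it are assumptions and not part of the code above: the charset is declared in a Content-Type header (Content-Encoding is normally used for codings such as gzip, not for character sets), and the text/plain media type is just a placeholder.

```python
import urllib.request

# Text to send; encode it first so that we measure bytes, not characters.
text = 'data to send\n'
body = text.encode('utf-8')

# Build the request without sending it. The URL is the one from the
# question; the Content-Type value is an assumption for illustration.
request = urllib.request.Request(
    url='http://localhost:6000',
    data=body,
    headers={
        'Content-Type': 'text/plain; charset=utf-8',
        'Content-Length': str(len(body)),  # number of octets in the body
    },
)

# urllib stores header names capitalized as 'Content-length'.
content_length = request.get_header('Content-length')
print(content_length)
```

Because len() is applied to the encoded bytes, the value matches the number of octets that would go on the wire, whatever encoding was chosen.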
-----Original Message-----
From: Laszlo Zsolt Nagy [mailto:[EMAIL PROTECTED]
Sent: Monday, June 06, 2005 1:43 PM
To: Frank Abel Cancio Bello; python-list@python.org
Subject: Re: About size of Unicode string
Frank Abel Cancio Bello wrote:
Hi all!
I need to know the size of a string object independently of its encoding. For
example:
len('123') == len('123'.encode('utf_8'))
while the size of the '123' object is different from the size of
'123'.encode('utf_8')
More:
I need to send a string in an HTTP request. So I need to know the length of the
string to set the Content-Length header, independently of its encoding.
Any idea?
This is from the RFC:
The Content-Length entity-header field indicates the size of the
entity-body, in decimal number of OCTETs, sent to the recipient or, in
the case of the HEAD method, the size of the entity-body that would
have been sent had the request been a GET.
Content-Length    = "Content-Length" ":" 1*DIGIT
An example is
Content-Length: 3495
Applications SHOULD use this field to indicate the transfer-length of
the message-body, unless this is prohibited by the rules in section
4.4 http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4.
Any Content-Length greater than or equal to zero is a valid value.
Section 4.4 describes how to determine the length of a message-body if
a Content-Length is not given.
It looks to me as if the Content-Length header has nothing to do with the
encoding as such. It is very low-level stuff. The content length is given in
OCTETs and it represents the size of the body as it is sent. It does not
care about MIME types, charsets etc. It is about the number of bytes
(octets) transferred in the body. Try writing your encoded string into a
StringIO and taking its length.
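
A small sketch of that idea, assuming Python 3, where io.BytesIO plays the role of StringIO for byte data. The helper name octet_count and the sample text are made up for illustration. It shows that the character count stays the same while the octet count depends on the encoding used.

```python
import io


def octet_count(text, encoding):
    """Hypothetical helper: bytes the text occupies once encoded."""
    buffer = io.BytesIO()
    buffer.write(text.encode(encoding))  # write the encoded bytes
    return len(buffer.getvalue())        # size of the buffer in octets


# Sample text: 10 characters, two of them non-ASCII.
text = 'na\u00efve caf\u00e9'

for encoding in ('latin-1', 'utf-8', 'utf-16-le'):
    print(encoding, len(text), octet_count(text, encoding))
```

For pure ASCII text such as '123', the character count and the UTF-8 octet count happen to coincide, which is why the two len() calls in the original example compare equal.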
Laci