To repeat the question: can I get the number of bytes in a string object, independently of its encoding? Is the "len" function the right way to get it?
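For reference, a minimal sketch of the distinction being asked about. It assumes Python 3, where str holds characters and bytes holds octets (the code in this thread is Python 2); the sample text and the helper name encoded_size are made up for illustration. len() on the text counts characters, and len() on the encoded result counts bytes, so the byte count depends on which encoding is chosen:

```python
# Assumption: Python 3 (str = characters, bytes = octets).
text = "caf\u00e9"  # 4 characters; the last one is non-ASCII


def encoded_size(s, encoding):
    """Return the number of octets s occupies once encoded (hypothetical helper)."""
    return len(s.encode(encoding))


# The same text has a different byte length under each encoding.
sizes = {enc: encoded_size(text, enc) for enc in ("utf-8", "utf-16-le", "utf-32-le")}

print(len(text))  # number of characters
print(sizes)      # number of bytes per encoding
```

For pure ASCII text such as '123', the character count and the UTF-8 byte count coincide, which is why the two len() calls in the original example compare equal.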
Laci, look at the following code:

    import urllib2

    request = urllib2.Request(url='http://localhost:6000')
    # Encode first, then measure the encoded data.
    data = 'data to send\n'.encode('utf_8')
    request.add_data(data)
    request.add_header('content-length', str(len(data)))
    request.add_header('content-encoding', 'UTF-8')
    file = urllib2.urlopen(request)

Is it always true that "the size of the entity-body" is "len(data)", independently of the encoding of "data"?

> -----Original Message-----
> From: Laszlo Zsolt Nagy [mailto:[EMAIL PROTECTED]
> Sent: Monday, June 06, 2005 1:43 PM
> To: Frank Abel Cancio Bello; python-list@python.org
> Subject: Re: About size of Unicode string
>
> Frank Abel Cancio Bello wrote:
>
> >Hi all!
> >
> >I need to know the size of a string object independently of its encoding.
> >For example:
> >
> >    len('123') == len('123'.encode('utf_8'))
> >
> >while the size of the '123' object is different from the size of
> >'123'.encode('utf_8')
> >
> >More:
> >I need to send a string in an HTTP request. So I need to know the length
> >of the string to set the "content-length" header independently of its
> >encoding.
> >
> >Any idea?
> >
>
> This is from the RFC:
>
> > The Content-Length entity-header field indicates the size of the
> > entity-body, in decimal number of OCTETs, sent to the recipient or, in
> > the case of the HEAD method, the size of the entity-body that would
> > have been sent had the request been a GET.
> >
> >     Content-Length = "Content-Length" ":" 1*DIGIT
> >
> > An example is
> >
> >     Content-Length: 3495
> >
> > Applications SHOULD use this field to indicate the transfer-length of
> > the message-body, unless this is prohibited by the rules in section
> > 4.4 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4>.
> >
> > Any Content-Length greater than or equal to zero is a valid value.
> > Section 4.4 describes how to determine the length of a message-body if
> > a Content-Length is not given.
>
> It looks to me like the Content-Length header has nothing to do with the
> encoding. It is very low-level stuff. The content length is given in
> OCTETs and it represents the size of the body. Clearly, it has nothing
> to do with MIME/encoding etc. It is about the number of octets transferred
> in the body. Try to write your unicode strings into a StringIO and take
> its length....
>
>    Laci
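The point about octets can be sketched as follows. This is only an illustration under stated assumptions: it uses Python 3's urllib.request rather than the urllib2 module from the thread, the helper name build_request and the sample body are hypothetical, and the request is built but never sent. The body is encoded first and Content-Length is taken from the encoded bytes, so it counts octets whatever the charset:

```python
import urllib.request


def build_request(url, text, charset="utf-8"):
    """Build (but do not send) a POST request whose Content-Length
    is the size in octets of the encoded body. Hypothetical helper."""
    body = text.encode(charset)  # bytes that will actually go on the wire
    request = urllib.request.Request(url, data=body, method="POST")
    # The charset is declared in Content-Type.
    request.add_header("Content-Type", "text/plain; charset=" + charset)
    # Length of the encoded bytes, not of the original text.
    request.add_header("Content-Length", str(len(body)))
    return request


sample = "data to send \u2713\n"  # 15 characters, one of them non-ASCII
req = build_request("http://localhost:6000", sample)

print(len(sample))                       # characters in the text
print(req.get_header("Content-length"))  # octets in the body
```

The sketch declares the charset in Content-Type instead of Content-Encoding, because Content-Encoding names content codings such as gzip rather than character sets.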