Re: best way to create warning for obsolete functions and call new one

2012-03-28 Thread Gelonida N
Hi Dan,

On 03/26/2012 11:24 PM, Dan Sommers wrote:
> On Mon, 26 Mar 2012 22:26:11 +0200
> Gelonida N  wrote:
> 
>> As these modules are used by quite some projects and as I do not want
>> to force everybody to rename immediately I just want to warn users,
>> that they call functions, that have been renamed and that will be
>> obsoleted.
> 
> You want a  DeprecationWarning.


Yes, this is a good idea.
In my case I had to combine it with
 logging.captureWarnings()
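
A minimal sketch of that combination, assuming Python 2.7 or later (where logging.captureWarnings() exists). The names get_time and get_thyme are borrowed from elsewhere in this thread purely for illustration, and the filter setup is an assumption about how one might want to see the warnings:

```python
import logging
import warnings


def get_time(a="high"):
    return a + "noon"


def get_thyme(*args, **kwargs):
    """Old name, kept so that existing callers keep working."""
    warnings.warn(
        "get_thyme() is deprecated, use get_time() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this wrapper
    )
    return get_time(*args, **kwargs)


# Send warnings to the 'py.warnings' logger instead of plain stderr output.
logging.basicConfig(level=logging.WARNING)
logging.captureWarnings(True)
# DeprecationWarning is often filtered out by default, so enable it explicitly.
warnings.simplefilter("always", DeprecationWarning)

print(get_thyme())
```

With this in place the old name still returns the same result, and each call is reported through the logging configuration.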



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: OAuth 2.0 implementation

2012-03-28 Thread Mark Hammond

On 28/03/2012 1:18 AM, Roy Smith wrote:

In article
<7909491.0.1332826232743.JavaMail.geo-discussion-forums@pbim5>,
  Demian Brecht  wrote:


>> OAuth 2.0 is still in draft status (draft 25 is the current one I believe)
>> and yes, unfortunately every single server available at this point has
>> varying degrees of separation from the actual spec. It's not a
>> pseudo-standard, it's just not observed to the letter. Google is the closest
>> and Facebook seems to be the farthest away (Stack Exchange is in close second
>> due to building theirs to work like Facebook's).
>
> In practice, OAuth is all about getting your site to work with Facebook.
> That is all most web sites care about today because that's where the
> money is.  The fact that other sites also use OAuth is of mostly
> academic interest at this point.
>
> The next player on the list is Twitter, and they're not even up to using
> their own incompatible version of OAuth 2.0.  They're still using OAuth
> 1.0 (although, I understand, they're marching towards 2.0).


Almost all "social" or "sharing" sites implement OAuth - either 1.0 or
2.0.  Facebook is clearly the big winner here but not the only player.
It's also used extensively by Google (e.g., even their SMTP server
supports using OAuth credentials to send email).


I'd go even further - most sites which expose an API use OAuth for 
credentials with that API.


Mark


Re: best way to create warning for obsolete functions and call new one

2012-03-28 Thread Gelonida N
Hi Chris,

On 03/26/2012 11:50 PM, Chris Angelico wrote:
> On Tue, Mar 27, 2012 at 7:26 AM, Gelonida N  wrote:
>> One option I thought of would be:
>>
>> def obsolete_func(func):
>>     def call_old(*args, **kwargs):
>>         print "func is old, pls use new one"
>>         return func(*args, **kwargs)
>>     return call_old
>>
>> and
>>
>> def get_time(a='high'):
>>   return a + 'noon'
> 
> That's a reasonable idea. Incorporate Dan's suggestion of using
> DeprecationWarning.
> 
Will do that.

> You may want to try decorator syntax:
> 
> def was(oldname):
>     def _(func):
>         globals()[oldname]=func
>         return func
>     return _
> 
> @was("get_thyme")
> def get_time(a='high'):
>     return a + 'noon'
> 
> That won't raise DeprecationWarning, though. It's a very simple
> assignment. The was() internal function could be enhanced to do a bit
> more work, but I'm not sure what version of Python you're using and
> what introspection facilities you have. But if you're happy with the
> old versions coming up with (*args,**kwargs) instead of their
> parameter lists, it's not difficult:

I'm using python 2.6 and sometimes still 2.5 (the latter is not
important for this question though)


Good idea about the decorators.
I had overlooked that, if done properly, a decorator adds no call-time
overhead when the function is called by its new name.

> 
> def was(oldname):
>     def _(func):
>         def bounce(*args,**kwargs):
>             # raise DeprecationWarning
>             return func(*args,**kwargs)
>         globals()[oldname]=bounce
>         return func
>     return _
> 
> I've never actually used the Python warnings module, but any line of
> code you fill in at the comment will be executed any time the old name
> is used. In any case, it's still used with the same convenient
> decorator. You could even unify multiple functions under a single new
> name:
> 
> @was("foo")
> @was("bar")
> def quux(spam,ham):
>     return ham.eat()
> 
Now the next step will be to write a decorator that works for class methods.
I think I have a solution, but it would require passing the class as an
additional parameter to each decorator.

It would be
setattr(cls, oldname, func)

> Hope that helps!
It does. Thanks again.
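
For reference, a hedged sketch of how the was() decorator from the quoted text could be completed with the warnings module. The optional namespace parameter and the use of func.__globals__ (so the alias lands in the module that defines the function rather than the one that defines the decorator) are assumptions, not something proposed in the thread:

```python
import functools
import warnings


def was(oldname, namespace=None):
    """Register oldname as a deprecated alias for the decorated function."""
    def decorator(func):
        @functools.wraps(func)
        def bounce(*args, **kwargs):
            warnings.warn(
                "%s() is deprecated, use %s() instead" % (oldname, func.__name__),
                DeprecationWarning,
                stacklevel=2,  # point at the code that used the old name
            )
            return func(*args, **kwargs)

        # 'namespace' is a hypothetical extension; by default the alias is
        # stored in the globals of the module where func was defined.
        target = namespace if namespace is not None else func.__globals__
        target[oldname] = bounce
        return func  # the new name stays undecorated, so it has no overhead
    return decorator


@was("get_thyme")
def get_time(a="high"):
    return a + "noon"
```

Calling get_time() goes straight to the original function, while get_thyme() emits a DeprecationWarning and then forwards the call.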



"convert" string to bytes without changing data (encoding)

2012-03-28 Thread Peter Daum
Hi,

is there any way to convert a string to bytes without
interpreting the data in any way? Something like:

s='abcde'
b=bytes(s, "unchanged")

Regards,
  Peter


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Chris Angelico
On Wed, Mar 28, 2012 at 7:56 PM, Peter Daum  wrote:
> Hi,
>
> is there any way to convert a string to bytes without
> interpreting the data in any way? Something like:
>
> s='abcde'
> b=bytes(s, "unchanged")

What is a string? It's not a series of bytes. You can't convert it
without encoding those characters into bytes in some way.

ChrisA


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Stefan Behnel
Peter Daum, 28.03.2012 10:56:
> is there any way to convert a string to bytes without
> interpreting the data in any way? Something like:
> 
> s='abcde'
> b=bytes(s, "unchanged")

If you can tell us what you actually want to achieve, i.e. why you want to
do this, we may be able to tell you how to do what you want.

Stefan



question about file handling with "with"

2012-03-28 Thread Jabba Laci
Hi,

Is the following function correct? Is the input file closed in order?

def read_data_file(self):
    with open(self.data_file) as f:
        return json.loads(f.read())

Thanks,

Laszlo


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Peter Daum
On 2012-03-28 11:02, Chris Angelico wrote:
> On Wed, Mar 28, 2012 at 7:56 PM, Peter Daum  wrote:
>> is there any way to convert a string to bytes without
>> interpreting the data in any way? Something like:
>>
>> s='abcde'
>> b=bytes(s, "unchanged")
> 
> What is a string? It's not a series of bytes. You can't convert it
> without encoding those characters into bytes in some way.

... in my example, the variable s points to a "string", i.e. a series of
bytes, (0x61,0x62 ...) interpreted as ascii/unicode characters.

b=bytes(s,'ascii') # or ('utf-8', 'latin1', ...)

would of course work in this case, but in general, if s holds any
data with bytes > 127, the actual data will be changed according
to the provided encoding.

What I am looking for is a general way to just copy the raw data
from a "string" object to a "byte" object without any attempt to
"decode" or "encode" anything ...

Regards,
Peter


Re: question about file handling with "with"

2012-03-28 Thread Peter Otten
Jabba Laci wrote:

> Is the following function correct? 

Yes, though I'd use json.load(f) instead of json.loads().

> Is the input file closed in order?
> 
> def read_data_file(self):
>     with open(self.data_file) as f:
>         return json.loads(f.read())


The file will be closed when the with-block is left. That is before 
read_data_file() returns, even in a Python implementation that doesn't use 
ref-counting. Think of

with open(...) as f:
    # whatever

as roughly equivalent to

f = open(...)
try:
    # whatever
finally:
    f.close()

See the "specification" section of

http://www.python.org/dev/peps/pep-0343/

for the gory details.
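
A small sketch that checks this behaviour directly. The temporary file and the extra "seen" list are assumptions added only so the file object can be inspected after the function has returned:

```python
import json
import os
import tempfile


def read_data_file(path, seen):
    with open(path) as f:
        seen.append(f)       # keep a reference so we can inspect it later
        return json.load(f)  # returning from inside the block still closes f


# Create a throwaway JSON file to read back.
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as out:
    json.dump({"age": 38}, out)

seen = []
data = read_data_file(path, seen)
os.remove(path)
print(data, seen[0].closed)
```

The closed attribute of the file object is true once the with-block has been left, even though the block was left through a return statement.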



Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Heiko Wundram

On 28.03.2012 11:43, Peter Daum wrote:
> ... in my example, the variable s points to a "string", i.e. a series of
> bytes, (0x61,0x62 ...) interpreted as ascii/unicode characters.


No; a string contains a series of codepoints from the unicode plane, 
representing natural language characters (at least in the simplistic 
view, I'm not talking about surrogates). These can be encoded to 
different binary storage representations, of which ascii is (a common) 
one.



> What I am looking for is a general way to just copy the raw data
> from a "string" object to a "byte" object without any attempt to
> "decode" or "encode" anything ...


There is "logically" no raw data in the string, just a series of 
codepoints, as stated above. You'll have to specify the encoding to use 
to get at "raw" data, and from what I gather you're interested in the 
latin-1 (or iso-8859-15) encoding, as you're specifically referencing 
chars >= 0x80 (which hints at your mindset being in LATIN-land, so to 
speak).
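
A short sketch of why latin-1 is the usual answer here, assuming Python 3 str and bytes objects. latin-1 maps code points 0 to 255 onto the byte with the same value, so it round-trips every possible byte, but it is still an encoding and fails for code points above U+00FF:

```python
s = "abcde\xe9\xff"
b = s.encode("latin-1")  # each code point becomes the byte with the same value

# Every byte value survives a decode/encode round trip through latin-1.
all_bytes = bytes(range(256))
round_trip = all_bytes.decode("latin-1").encode("latin-1")

# Code points above 255 have no latin-1 byte, so encoding them fails.
failed = False
try:
    "\u20ac".encode("latin-1")  # EURO SIGN
except UnicodeEncodeError:
    failed = True

print(list(b), round_trip == all_bytes, failed)
```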


--
--- Heiko.


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Stefan Behnel
Peter Daum, 28.03.2012 11:43:
> What I am looking for is a general way to just copy the raw data
> from a "string" object to a "byte" object without any attempt to
> "decode" or "encode" anything ...

That's why I asked about your use case - where does the data come from and
why is it contained in a character string in the first place? If you could
provide that information, we can help you further.

Stefan



errors building python 2.7.3

2012-03-28 Thread Alexey Luchko

Hi!

I've tried to build Python 2.7.3rc2 on cygwin and got the following errors:

$ CFLAGS=-I/usr/include/ncursesw/ CPPFLAGS=-I/usr/include/ncursesw/ ./configure
$ make
...
gcc -shared -Wl,--enable-auto-image-base 
build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/bufferedio.o 
build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/bytesio.o 
build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/fileio.o 
build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/iobase.o 
build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/_iomodule.o 
build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/stringio.o 
build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/textio.o 
-L/usr/local/lib -L. -lpython2.7 -o build/lib.cygwin-1.7.11-i686-2.7/_io.dll
build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/bufferedio.o: 
In function `_set_BlockingIOError':
/Python-2.7.3rc2/Modules/_io/bufferedio.c:579: undefined reference to 
`__imp__PyExc_BlockingIOError'
/Python-2.7.3rc2/Modules/_io/bufferedio.c:579: undefined reference to 
`__imp__PyExc_BlockingIOError'
build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/bufferedio.o: 
In function `_buffered_check_blocking_error':
/Python-2.7.3rc2/Modules/_io/bufferedio.c:595: undefined reference to 
`__imp__PyExc_BlockingIOError'

collect2: ld returned 1 exit status

building '_curses' extension
gcc -fno-strict-aliasing -I/usr/include/ncursesw/ -DNDEBUG -g -fwrapv -O3 
-Wall -Wstrict-prototypes -I. -IInclude -I./Include 
-I/usr/include/ncursesw/ -I/Python-2.7.3rc2/Include -I/Python-2.7.3rc2 -c 
/Python-2.7.3rc2/Modules/_cursesmodule.c -o 
build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_cursesmodule.o
/Python-2.7.3rc2/Modules/_cursesmodule.c: In function 
‘PyCursesWindow_EchoChar’:
/Python-2.7.3rc2/Modules/_cursesmodule.c:810:18: error: dereferencing 
pointer to incomplete type
/Python-2.7.3rc2/Modules/_cursesmodule.c: In function 
‘PyCursesWindow_NoOutRefresh’:
/Python-2.7.3rc2/Modules/_cursesmodule.c:1238:22: error: dereferencing 
pointer to incomplete type

/Python-2.7.3rc2/Modules/_cursesmodule.c: In function ‘PyCursesWindow_Refresh’:
/Python-2.7.3rc2/Modules/_cursesmodule.c:1381:22: error: dereferencing 
pointer to incomplete type

/Python-2.7.3rc2/Modules/_cursesmodule.c: In function ‘PyCursesWindow_SubWin’:
/Python-2.7.3rc2/Modules/_cursesmodule.c:1448:18: error: dereferencing 
pointer to incomplete type

/Python-2.7.3rc2/Modules/_cursesmodule.c: In function ‘PyCursesWindow_Refresh’:
/Python-2.7.3rc2/Modules/_cursesmodule.c:1412:1: warning: control reaches 
end of non-void function
/Python-2.7.3rc2/Modules/_cursesmodule.c: In function 
‘PyCursesWindow_NoOutRefresh’:
/Python-2.7.3rc2/Modules/_cursesmodule.c:1270:1: warning: control reaches 
end of non-void function
/Python-2.7.3rc2/Modules/_cursesmodule.c: In function 
‘PyCursesWindow_EchoChar’:
/Python-2.7.3rc2/Modules/_cursesmodule.c:817:1: warning: control reaches 
end of non-void function


...

Failed to build these modules:
_curses            _io



Then I tried to see whether the problem was solved: I fetched the source from
https://bitbucket.org/python_mirrors/releasing-2.7.3 and got another one:


$ CFLAGS=-I/usr/include/ncursesw/ CPPFLAGS=-I/usr/include/ncursesw/ ./configure
$ make
gcc -Wno-unused-result -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes 
-I/usr/include/ncursesw/   -I. -I./Include -I/usr/include/ncursesw/ 
-DPy_BUILD_CORE  -c ./Modules/signalmodule.c -o Modules/signalmodule.o

./Modules/signalmodule.c: In function ‘fill_siginfo’:
./Modules/signalmodule.c:734:5: error: ‘siginfo_t’ has no member named 
‘si_band’

Makefile:1456: recipe for target `Modules/signalmodule.o' failed
make: *** [Modules/signalmodule.o] Error 1


Reporting here, because bugs.python.org refuses connections currently.

Just in case
CYGWIN_NT-6.1-WOW64 ... 1.7.11(0.260/5/3) 2012-02-24 14:05 i686 Cygwin
gcc version 4.5.3 (GCC)

--
Alex


unittest: assertRaises() with an instance instead of a type

2012-03-28 Thread Ulrich Eckhardt

Hi!

I'm currently writing some tests for the error handling of some code. In 
this scenario, I must make sure that both the correct exception is 
raised and that the contained error code is correct:



  try:
      foo()
  except MyException as e:
      self.assertEqual(e.errorcode, SOME_FOO_ERROR)
  except Exception:
      self.fail('unexpected exception raised')
  else:
      self.fail('exception not raised')


This is tedious to write and read. The docs mention this alternative:


   with self.assertRaises(MyException) as cm:
       foo()
   self.assertEqual(cm.exception.errorcode, SOME_FOO_ERROR)


This is shorter, but I think there's an alternative syntax possible that 
would be even better:



with self.assertRaises(MyException(SOME_FOO_ERROR)):
    foo()


Here, assertRaises() is not called with an exception type but with an 
exception instance. I'd implement it something like this:



def assertRaises(self, exception, ...):
    # divide input parameter into type and instance
    if isinstance(exception, Exception):
        exception_type = type(exception)
    else:
        exception_type = exception
        exception = None
    # call testee and verify results
    try:
        ...call function here...
    except exception_type as e:
        if exception is not None:
            self.assertEqual(e, exception)


This of course requires the exception to be equality-comparable.


Questions here:
1. Does this sound like a useful extension or am I missing another 
obvious solution to my problem?
2. The assertRaises() sketch above tries to auto-detect whether the 
given parameter is the type or an instance. Given the highly dynamic 
nature of Python, an object can be both instance and type, is the above 
detection mechanism reliable?



Of course I'm open to other suggestions to solve my problem. One that I 
thought of but which I haven't really looked into was to modify __init__ 
or __new__ of my exception class to return an instance of a derived 
class that uniquely identifies the error. I.e. 
MyException(SOME_FOO_ERROR) would not create a MyException instance but 
a MyFooErrorException instance (which has MyException as a baseclass). 
In that case, the existing code that checks for the right exception type 
would suffice for my needs.
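
A runnable sketch of the context-manager form with the existing unittest API, plus one possible way to tell an exception class from an exception instance. MyException, foo(), the error code value and the split_expected() helper are all hypothetical stand-ins for illustration, not part of unittest:

```python
import unittest

SOME_FOO_ERROR = 17  # hypothetical error code


class MyException(Exception):
    def __init__(self, errorcode):
        super().__init__(errorcode)
        self.errorcode = errorcode


def foo():
    raise MyException(SOME_FOO_ERROR)


def split_expected(expected):
    """Return (exception_type, instance_or_None) for a class or an instance."""
    if isinstance(expected, BaseException):
        return type(expected), expected
    if isinstance(expected, type) and issubclass(expected, BaseException):
        return expected, None
    raise TypeError("expected an exception class or instance")


class FooTest(unittest.TestCase):
    def test_errorcode(self):
        # The context manager stores the caught exception in cm.exception.
        with self.assertRaises(MyException) as cm:
            foo()
        self.assertEqual(cm.exception.errorcode, SOME_FOO_ERROR)

    def test_split_expected(self):
        self.assertEqual(split_expected(MyException), (MyException, None))
        instance = MyException(SOME_FOO_ERROR)
        self.assertEqual(split_expected(instance), (MyException, instance))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(FooTest)
result = unittest.TextTestRunner().run(suite)
```

Checking for an instance first and for a class second avoids relying on a single isinstance() test; anything that is neither is rejected with a TypeError.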



Cheers everybody!

Uli


Difference between json.load() and json.loads() [From: RE: question about file handling with "with"]

2012-03-28 Thread Nadir Sampaoli
Hello everyone (my first message in the mailing list),


> > Is the following function correct?
> Yes, though I'd use json.load(f) instead of json.loads().
>

The docs aren't very
clear (at least for me) about the difference between json.load() and
json.loads() (and about dump() and dumps() too). Could you clarify it
for me?

Thanks in advance.


Re: Difference between json.load() and json.loads() [From: RE: question about file handling with "with"]

2012-03-28 Thread ian douglas
On Mar 28, 2012 6:54 AM, "Nadir Sampaoli"  wrote:
>
> Hello everyone (my first message in the mailing list),
>
>>
>> > Is the following function correct?
>> Yes, though I'd use json.load(f) instead of json.loads().
>
>
> The docs aren't very clear (at least for me) about the difference between
> json.load() and json.loads (and about "dump()" and "dumps()" too). Could
> you clarify it for me?
>

The functions with an s take string parameters. The others take file
streams.

foo = '{"age": 38}'
my_json = json.loads(foo)
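
A side-by-side sketch of all four functions. io.StringIO stands in for a real file here, which is an assumption made only to keep the example self-contained:

```python
import io
import json

text = '{"age": 38}'

# The "s" variants work on strings.
from_string = json.loads(text)
as_string = json.dumps(from_string)

# The variants without "s" work on file-like objects.
from_file = json.load(io.StringIO(text))
buffer = io.StringIO()
json.dump(from_file, buffer)

print(from_string, from_file, as_string, buffer.getvalue())
```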


Re: python segfault

2012-03-28 Thread Kiuhnm

On 3/28/2012 8:16, Michael Poeltl wrote:

> yeah - of course 'while True' was the first, most obvious best way... ;-)
> but I was asked if there was a way without 'while True'
> and so I started the 'recursive function'
>
> and quick quick; RuntimeError-Exception ->  not thinking much ->  just adding
> two zeros to the default limit (quick and dirty) ->  segfault ==>  subject:
> python segfault ;-)


You give up too easily! Here's another way:

--->
def get_steps2(pos=0, steps=0, level = 100):
    if steps == 0:
        pos = random.randint(-1,1)
    if pos == 0:
        return steps
    steps += 2
    pos += random.randint(-1,1)

    if level == 0:
        return (pos, steps)
    res = get_steps2(pos,steps, level-1)
    if not isinstance(res, tuple):
        return res
    return get_steps2(res[0], res[1], level-1)

import random
for i in range(200):
    print ( get_steps2() )

print("done")
input("")
<---

Now the limit is 1267650600228229401496703205376. I hope that's enough.
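
The number comes from the shape of the recursion: each call makes at most two recursive calls one level down, so a budget of 100 levels allows up to 2**100 steps while the stack never gets deeper than about 100 frames. A deterministic sketch of the same trick follows; the function run_bounded and its parameters are hypothetical names, not taken from the code above:

```python
def run_bounded(state, step, done, level):
    """Apply step up to 2**level times using at most level + 1 stack frames."""
    if done(state):
        return state, True
    if level == 0:
        return step(state), False  # one unit of work at the bottom level
    # First half of the budget.
    state, finished = run_bounded(state, step, done, level - 1)
    if finished:
        return state, True
    # Second half of the budget, started only after the first half returned.
    return run_bounded(state, step, done, level - 1)


limit = 2 ** 100
# Count to 1000 with a budget of 2**12 = 4096 steps and a stack depth of 13.
final, finished = run_bounded(0, lambda n: n + 1, lambda n: n >= 1000, 12)
print(limit, final, finished)
```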

Kiuhnm


Re: Difference between json.load() and json.loads() [From: RE: question about file handling with "with"]

2012-03-28 Thread Nadir Sampaoli
2012/3/28 ian douglas
>
> The functions with an s take string parameters. The others take file
> streams.
>
> foo = '{"age": 38}'
> my_json = json.loads(foo)
>
I see, it makes perfect sense now. Thanks for clearing it up.


Work

2012-03-28 Thread Alicja Krzyżanowska

Hello
My name is Alicja Krzyżanowska and I represent the Software Press company.
We are creating a new version of the popular magazine PHP Solution (English version), which will be
available online. We are looking for a specialist in Python who would be interested in writing some
articles on this subject.
While looking for interesting ways of writing about programming I found your website. The articles are really interesting, and I think that people who know almost nothing about programming may be able to understand them (even if only a little bit…).
Would you be interested in sharing your knowledge with us through such a cooperation? Please send me more information about yourself and your experience in this subject.
I am looking forward to hearing from you.




Re: errors building python 2.7.3

2012-03-28 Thread Alexey Luchko

On 28.03.2012 14:50, Alexey Luchko wrote:

> I've tried to build Python 2.7.3rc2 on cygwin and got the following errors:
> [...]
> Failed to build these modules:
> _curses _io


The same happens with Python 2.7.2.



CYGWIN_NT-6.1-WOW64 ... 1.7.11(0.260/5/3) 2012-02-24 14:05 i686 Cygwin
gcc version 4.5.3 (GCC)



--
Alex


RE: RE: Advise of programming one of my first programs

2012-03-28 Thread Prasad, Ramit
> >> The use of eval is dangerous if you are not *completely* sure what is
> >> being passed in. Try using pickle instead:
> >> http://docs.python.org/release/2.5.2/lib/pickle-example.html
> >
> >
> > Um, at least by my understanding, the use of Pickle is also dangerous if
> you
> > are not completely sure what is being passed in:
> 
> Oh goodness yes. pickle is exactly as unsafe as eval is. Try running this
> code:
> 
> from pickle import loads
> loads("c__builtin__\neval\n(c__builtin__\nraw_input\n(S'py>'\ntRtR.")

It might be as dangerous, but which is more likely to cause problems in
real world scenarios?
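
For readers wondering why pickle is in the same category as eval at all, here is a deliberately harmless sketch, assuming Python 3. The class name Demo is made up; the point is only that unpickling calls whatever callable the data names:

```python
import pickle


class Demo(object):
    def __reduce__(self):
        # Tell pickle to "rebuild" this object by calling len("abc").
        return (len, ("abc",))


payload = pickle.dumps(Demo())
result = pickle.loads(payload)  # calls len("abc") instead of building a Demo
print(result)
```

Here the callable is len, but the same mechanism lets crafted data name any importable callable, which is why pickles from untrusted sources should not be loaded.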

Ramit


Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
712 Main Street | Houston, TX 77002
work phone: 713 - 216 - 5423



Re: Question about collections.defaultdict

2012-03-28 Thread Steven W. Orr

On 3/26/2012 11:52 AM, Robert Kern wrote:
> On 3/26/12 4:33 PM, Steven W. Orr wrote:
>> On 3/26/2012 9:44 AM, Robert Kern wrote:
>>> On 3/26/12 2:33 PM, Steven W. Orr wrote:
>>>> I created a new class called CaseInsensitiveDict (by stealing from code I
>>>> found on the web, thank you very much). The new class inherits from dict.
>>>> It makes it so that if the key has a 'lower' method, it will always access
>>>> the key using lower.
>>>>
>>>> I'd like to change the place where I previously declared a dict
>>>>
>>>> self.lookup = defaultdict(list)
>>>>
>>>> so that the new code will allow this new dict to be used instead. But then I
>>>> realized I may have painted myself into a small corner:
>>>>
>>>> Is there a way to use defaultdict so that I can override what *kind* of
>>>> dict it will use?
>>>
>>> No.
>>>
>>>> I would like the value to still be a list by default, but it seems like I
>>>> can't tell defaultdict to use *my* new dict.
>>>>
>>>> Do I give up on defaultdict?
>>>
>>> Assuming that your CaseInsensitiveDict subclasses from dict or UserDict, it's
>>> relatively easy to make a subclass of your CaseInsensitiveDict act like a
>>> defaultdict. Just implement the __missing__(key) method appropriately (and
>>> modify the constructor to take the callable, of course).
>>>
>>> http://docs.python.org/library/stdtypes.html#dict
>>> http://docs.python.org/library/collections.html#collections.defaultdict.__missing__
>>
>> I'm not quite getting what you're telling me, but I'm sure you have the right
>> idea. Here's the beginning of my class:
>>
>> class CaseInsensitiveDict(dict):
>>     def __init__(self, init=None):
>>         if isinstance(init, (dict, list, tuple)):
>>             for kk, vv in init.items():
>>                 self[self.key_has_lower(kk)] = vv
>>
>> It sounds like you want me to subclass defaultdict to create something like
>> this?
>>
>> class CaseInsensitiveDictDef(defaultdict):
>>     def __init__(self, init=None):
>>         super(CaseInsensitiveDictDef, self).__init__(list)
>>         self.__missing__ = list
>>
>> I think I'm way off base. I'm not clear on what the calling sequence is for
>> defaultdict or how to get it to use my CaseInsensitiveDict instead of
>> regular dict.
>>
>> Can you help?
>
> You need to make a subclass of CaseInsensitiveDict, implement the
> __missing__(key) method, and override the __init__() method to take the
> factory function as an argument instead of data. defaultdict is just a
> subclass of dict that does this.
>
> class CaseInsensitiveDictDef(CaseInsensitiveDict):
>     def __init__(self, default_factory):
>         super(CaseInsensitiveDictDef, self).__init__()
>         self.default_factory = default_factory
>
>     def __missing__(self, key):
>         return self.default_factory()



Many thanks. This was a great learning experience as well as ending up with 
exactly what I wanted.


Python is rich with "Ah ha!" moments. This was definitely one of them.

In my feeble attempt to give back, here's the answer:

class CaseInsensitiveDefaultDict(CaseInsensitiveDict):
    def __init__(self, default_factory=None, init=None):
        if not callable(default_factory):
            raise TypeError('First argument must be callable')
        super(CaseInsensitiveDefaultDict, self).__init__(init)
        self.default_factory = default_factory

    def __missing__(self, key):
        self[key] = val = self.default_factory()
        return val

    def __getitem__(self, key):
        try:
            return super(CaseInsensitiveDefaultDict, self).__getitem__(key)
        except KeyError:
            return self.__missing__(key)
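
To try the idea end to end, here is a self-contained sketch assuming Python 3. The CaseInsensitiveDict below is a hypothetical stand-in (the thread only shows the beginning of the real class), and the _fold helper is a made-up name:

```python
class CaseInsensitiveDict(dict):
    """Hypothetical stand-in: folds keys that have a lower() method."""

    @staticmethod
    def _fold(key):
        return key.lower() if hasattr(key, "lower") else key

    def __init__(self, init=None):
        super().__init__()
        if init is not None:
            for key, value in dict(init).items():
                self[key] = value

    def __setitem__(self, key, value):
        super().__setitem__(self._fold(key), value)

    def __getitem__(self, key):
        # dict.__getitem__ calls __missing__ on subclasses when the key is absent.
        return super().__getitem__(self._fold(key))

    def __contains__(self, key):
        return super().__contains__(self._fold(key))


class CaseInsensitiveDefaultDict(CaseInsensitiveDict):
    def __init__(self, default_factory, init=None):
        if not callable(default_factory):
            raise TypeError("first argument must be callable")
        super().__init__(init)
        self.default_factory = default_factory

    def __missing__(self, key):
        # Store the new default so later lookups return the same object.
        self[key] = value = self.default_factory()
        return value


lookup = CaseInsensitiveDefaultDict(list)
lookup["Spam"].append(1)
lookup["SPAM"].append(2)
print(dict(lookup))
```

Because dict itself consults __missing__ for subclasses, this stand-in does not need a separate __getitem__ override in the default-dict class; whether that also holds for the real CaseInsensitiveDict depends on how it implements lookups.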


--
Time flies like the wind. Fruit flies like a banana. Stranger things have  .0.
happened but none stranger than this. Does your driver's license say Organ ..0
Donor?Black holes are where God divided by zero. Listen to me! We are all- 000
individuals! What if this weren't a hypothetical question?
steveo at syslang.net


Re: errors building python 2.7.3

2012-03-28 Thread Colton Myers
>
> Reporting here, because bugs.python.org refuses connections currently.
>

bugs.python.org seems to be back up, I'd repost there if you haven't
already.

--
Colton Myers


Re: errors building python 2.7.3

2012-03-28 Thread David Robinow
On Wed, Mar 28, 2012 at 7:50 AM, Alexey Luchko  wrote:
> I've tried to build Python 2.7.3rc2 on cygwin and got the following errors:
>
> $ CFLAGS=-I/usr/include/ncursesw/ CPPFLAGS=-I/usr/include/ncursesw/
> ./configure
 I haven't tried 2.7.3 yet, so I'll describe my experience with 2.7.2
 I use /usr/include/ncurses   rather than /usr/include/ncursesw
 I don't remember what the difference is but ncurses seems to work.

> $ make
> ...
> gcc -shared -Wl,--enable-auto-image-base
> build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/bufferedio.o
> build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/bytesio.o
> build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/fileio.o
> build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/iobase.o
> build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/_iomodule.o
> build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/stringio.o
> build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/textio.o
> -L/usr/local/lib -L. -lpython2.7 -o build/lib.cygwin-1.7.11-i686-2.7/_io.dll
> build/temp.cygwin-1.7.11-i686-2.7/Python-2.7.3rc2/Modules/_io/bufferedio.o:
> In function `_set_BlockingIOError':
> /Python-2.7.3rc2/Modules/_io/bufferedio.c:579: undefined reference to
> `__imp__PyExc_BlockingIOError'

In Modules/_io/_iomodule.h, use:
PyObject *PyExc_BlockingIOError;
instead of:
PyAPI_DATA(PyObject *) PyExc_BlockingIOError;

> Failed to build these modules:
> _curses            _io
>

But please note that Cygwin does not support Python-2.7. There may be
other reasons.
I don't really use cygwin Python for anything important. It's just
nice to have around since I spend a lot of time in the bash shell.
It would probably be helpful to ask on the Cygwin mailing list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Chris Angelico   wrote:
>What is a string? It's not a series of bytes.

Of course it is.  Conceptually you're not supposed to think of it that
way, but a string is stored in memory as a series of bytes.

What he's asking for may not be very useful or practical, but if that's
your problem here then that's what you should be addressing, not
pretending that it's fundamentally impossible.

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  rri...@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   


Re: Tools for refactoring/obfuscation

2012-03-28 Thread Javier
Yes, in general I follow clear guidelines for writing code.  I just use
modules with functions in the same directory and clear use of name
spaces. I almost never use classes.  I wonder if you use some tool for
refactoring.  I am mainly interested in scripting tools, not Eclipse-style
GUIs.

Just let me know if you use some scripting tool.

And, as somebody pointed out in this thread, obfuscating and refactoring the
code are very different things, but they can be done with the same tools.
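
As a rough illustration of the "dictionary of {"old": "new"}" idea, here is a sketch built on the standard tokenize module, assuming Python 3. The function rename_identifiers is a hypothetical name; it renames every matching NAME token without any understanding of scopes, attributes or imports, so it is only safe on code that follows strict naming discipline:

```python
import io
import tokenize


def rename_identifiers(source, mapping):
    """Rename NAME tokens in source according to mapping (old -> new)."""
    lines = source.splitlines(keepends=True)
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Work backwards so earlier column offsets on the same line stay valid.
    for tok in reversed(tokens):
        if tok.type == tokenize.NAME and tok.string in mapping:
            row, col = tok.start
            line = lines[row - 1]
            end = col + len(tok.string)
            lines[row - 1] = line[:col] + mapping[tok.string] + line[end:]
    return "".join(lines)


source = (
    "def get_thyme(a='high'):\n"
    "    return a + 'noon'\n"
    "\n"
    "label = 'get_thyme'\n"
    "print(get_thyme())\n"
)
renamed = rename_identifiers(source, {"get_thyme": "get_time"})
print(renamed)
```

Working on tokens rather than on raw text means string literals and comments that happen to contain the old name are left alone.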

Javier


Vladimir Ignatov  wrote:
> Hi,
> 
> (sorry for replying to the old topic)
> 
> On Tue, Mar 6, 2012 at 10:29 PM, Javier  wrote:
>> I am looking for an automated tool for refactoring/obfuscation.
>> Something that changes names of functions, variables, or which would
>> merge all the functions of various modules in a single module.
>> The closest I have seen is http://bicyclerepair.sourceforge.net/
>>
>> Does somebody know of something that can work from the command line or
>> simple data structures/text files?, like a python dictionary of functions
>> {"old":"new",...}
> 
> I am not surprised that nobody answers. I think that such a tool is
> nearly impossible given Python's dynamic nature.
> But if you put a little discipline/restrictions in your source code,
> then doing obfuscation could be much easier. Almost trivial I
> would say.
> 
> Javier, you can contact me directly if you are interested in this topic.
> 
> Vladimir Ignatov
> kmisoft at gmail com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Chris Angelico
On Thu, Mar 29, 2012 at 2:36 AM, Ross Ridge  wrote:
> Chris Angelico   wrote:
>>What is a string? It's not a series of bytes.
>
> Of course it is.  Conceptually you're not supposed to think of it that
> way, but a string is stored in memory as a series of bytes.

Note that distinction. I said that a string "is not" a series of
bytes; you say that it "is stored" as bytes.

> What he's asking for may not be very useful or practical, but if that's
> your problem here then that's what you should be addressing, not
> pretending that it's fundamentally impossible.

That's equivalent to taking a 64-bit integer and trying to treat it as
a 64-bit floating point number. They're all just bits in memory, and
in C it's quite easy to cast a pointer to a different type and
dereference it. But a Python Unicode string might be stored in several
ways; for all you know, it might actually be stored as a sequence of
apples in a refrigerator, just as long as they can be referenced
correctly. There's no logical Python way to turn that into a series of
bytes.
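
A minimal sketch of the integer/float analogy, using the standard struct
module (the variable names are only illustrative). The same eight bytes are
read first as a 64-bit integer and then as a 64-bit float:

```python
import struct

# Pack the integer 1 into eight little-endian bytes ...
raw = struct.pack('<q', 1)

# ... and reinterpret those same eight bytes as an IEEE 754 double.
as_float = struct.unpack('<d', raw)[0]

print(raw)        # the eight raw bytes
print(as_float)   # a tiny subnormal number, nothing like 1.0
```

The bits are identical in both cases; only the interpretation changes, which
is why "just copy the bytes" does not give a meaningful conversion.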

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


ResponseNotReady in httplib

2012-03-28 Thread Manu
Hi
I am trying to access a web site and it raises this exception:
"ResponseNotReady". I don't know what the root of the problem is or
how to sort it out.
I am using the excellent python requests library to access the web
site, but it ultimately relies on httplib.


Could someone explain the problem to me and how to sort it out?

Thx
Dave
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Grant Edwards
On 2012-03-28, Chris Angelico  wrote:

> for all you know, it might actually be stored as a sequence of
> apples in a refrigerator

[...]

> There's no logical Python way to turn that into a series of bytes.

There's got to be a joke there somewhere about how to eat an apple...

-- 
Grant Edwards               grant.b.edwards        Yow! Somewhere in DOWNTOWN
                                  at               BURBANK a prostitute is
                              gmail.com            OVERCOOKING a LAMB CHOP!!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Dave Angel

On 03/28/2012 04:56 AM, Peter Daum wrote:

> Hi,
>
> is there any way to convert a string to bytes without
> interpreting the data in any way? Something like:
>
> s='abcde'
> b=bytes(s, "unchanged")
>
> Regards,
>    Peter



You needed to specify that you are using Python 3.x .  In python 2.x, a 
string is indeed a series of bytes.  But in Python 3.x, you have to be 
much more specific.


For example, if that string is coming from a literal, then you usually 
can convert it back to bytes simply by encoding using the same method as 
the one specified for the source file.  So look at the encoding line at 
the top of the file.
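
As a rough illustration, assuming the source file is saved as UTF-8 (the
Python 3 default when there is no encoding line; the name source_encoding is
only a placeholder for whatever your file declares):

```python
# Assumption: this source file is saved as UTF-8, the Python 3 default.
source_encoding = 'utf-8'

s = 'abcde'
b = s.encode(source_encoding)   # the bytes the literal occupied in the file

print(b)
```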




--

DaveA

--
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Peter Daum
On 2012-03-28 12:42, Heiko Wundram wrote:
> Am 28.03.2012 11:43, schrieb Peter Daum:
>> ... in my example, the variable s points to a "string", i.e. a series of
>> bytes, (0x61,0x62 ...) interpreted as ascii/unicode characters.
> 
> No; a string contains a series of codepoints from the unicode plane,
> representing natural language characters (at least in the simplistic
> view, I'm not talking about surrogates). These can be encoded to
> different binary storage representations, of which ascii is (a common) one.
> 
>> What I am looking for is a general way to just copy the raw data
>> from a "string" object to a "byte" object without any attempt to
>> "decode" or "encode" anything ...
> 
> There is "logically" no raw data in the string, just a series of
> codepoints, as stated above. You'll have to specify the encoding to use
> to get at "raw" data, and from what I gather you're interested in the
> latin-1 (or iso-8859-15) encoding, as you're specifically referencing
> chars >= 0x80 (which hints at your mindset being in LATIN-land, so to
> speak).

... I was under the illusion that python (like e.g. perl) stored
strings internally in utf-8. In this case the "conversion" would simply
mean to re-label the data. Unfortunately, as I meanwhile found out, this
is not the case (nor the "apple encoding" ;-), so it would indeed be
pretty useless.

The longer story of my question is: I am new to python (obviously), and
since I am not familiar with either one, I thought it would be advisable
to go for python 3.x. The biggest problem that I am facing is that I
am often dealing with data that is basically text, but it can contain
8-bit bytes. In this case, I can not safely assume any given encoding,
but I actually also don't need to know - for my purposes, it would be
perfectly good enough to deal with the ascii portions and keep anything
else unchanged.

As it seems, this would be far easier with python 2.x. With python 3
and its strict distinction between "str" and "bytes", things get
syntactically pretty awkward and error-prone (something as
innocent-looking as "s=s+'/'" hidden in a rarely reached branch, and a
seemingly correct program will crash with a TypeError 2 years
later ...)
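
The failure mode described here can be reproduced in a few lines. This is
only a sketch; the variable names and the sample value are illustrative:

```python
data = b'some/path'          # bytes, e.g. read from a binary file

try:
    data = data + '/'        # mixing bytes and str
except TypeError as exc:
    error = exc              # Python 3 refuses to guess an encoding
    print('TypeError:', exc)

data = data + b'/'           # both operands are bytes, so this works
print(data)
```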

Regards,
 Peter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Steven D'Aprano
On Wed, 28 Mar 2012 11:36:10 -0400, Ross Ridge wrote:

> Chris Angelico   wrote:
>>What is a string? It's not a series of bytes.
> 
> Of course it is.  Conceptually you're not supposed to think of it that
> way, but a string is stored in memory as a series of bytes.

You don't know that. They might be stored as a tree, or a rope, or some 
even more complex data structure. In fact, in Python, they are stored as 
an object.

But even if they were stored as a simple series of bytes, you don't know 
what bytes they are. That is an implementation detail of the particular 
Python build being used, and since Python doesn't give direct access to 
memory (at least not in pure Python) there's no way to retrieve those 
bytes using Python code.

Saying that strings are stored in memory as bytes is no more sensible 
than saying that dicts are stored in memory as bytes. Yes, they are. So 
what? Taken out of context in a running Python interpreter, those bytes 
are pretty much meaningless.


> What he's asking for may not be very useful or practical, but if that's
> your problem here then that's what you should be addressing, not
> pretending that it's fundamentally impossible.

The right way to convert bytes to strings, and vice versa, is via 
encoding and decoding operations. What the OP is asking for is as silly 
as somebody asking to turn a float 1.3792 into a string without calling 
str() or any equivalent float->string conversion. They're both made up of 
bytes, right? Yeah, they are. So what?

Even if you do a hex dump of float 1.3792, the result will NOT be the 
string "1.3792". And likewise, even if you somehow did a hex dump of the 
memory representation of a string, the result will NOT be the equivalent 
sequence of bytes except *maybe* for some small subset of possible 
strings.
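
A short sketch of that point, using the standard struct module to obtain one
possible memory layout of the float (big-endian IEEE 754 is an assumption
here; the layout inside any given interpreter may differ):

```python
import struct

x = 1.3792

memory_bytes = struct.pack('>d', x)    # eight bytes of IEEE 754 data
text_bytes = str(x).encode('ascii')    # the characters '1.3792' as ASCII

print(memory_bytes.hex())
print(text_bytes)
```

The two byte sequences have different lengths and different contents, even
though both "are" the number 1.3792 in some sense.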



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unittest: assertRaises() with an instance instead of a type

2012-03-28 Thread Steven D'Aprano
On Wed, 28 Mar 2012 14:28:08 +0200, Ulrich Eckhardt wrote:

> Hi!
> 
> I'm currently writing some tests for the error handling of some code. In
> this scenario, I must make sure that both the correct exception is
> raised and that the contained error code is correct:
> 
> 
>    try:
>        foo()
>        self.fail('exception not raised')
>    catch MyException as e:
>        self.assertEqual(e.errorcode, SOME_FOO_ERROR)
>    catch Exception:
>        self.fail('unexpected exception raised')

First off, that is not Python code. "catch Exception" gives a syntax 
error.

Secondly, that is not the right way to do this unit test. You are testing 
two distinct things, so you should write it as two separate tests:


def testFooRaisesException(self):
    # Test that foo() raises an exception.
    self.assertRaises(MyException, foo)


If foo does *not* raise an exception, the unittest framework will handle 
the failure for you. If it raises a different exception, the framework 
will also handle that too.

Then write a second test to check the exception code:

def testFooExceptionCode(self):
    # Test that foo()'s exception has the right error code.
    try:
        foo()
    except MyException as err:
        self.assertEqual(err.errorcode, SOME_FOO_ERROR)


Again, let the framework handle any unexpected cases.

If you have lots of functions to test, write a helper function:

def catch(exception, func, *args, **kwargs):
    try:
        func(*args, **kwargs)
    except exception as err:
        return err
    raise RuntimeError('no error raised')


and then the test becomes:

def testFooExceptionCode(self):
    # Test that foo()'s exception has the right error code.
    self.assertEqual(
        catch(MyException, foo).errorcode, SOME_FOO_ERROR
    )
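
Put together, the two tests and the helper could look like the following
self-contained sketch. MyException, foo and SOME_FOO_ERROR are hypothetical
stand-ins for the code under test, which is not shown in this thread:

```python
import unittest

SOME_FOO_ERROR = 42          # hypothetical error code


class MyException(Exception):
    """Hypothetical exception carrying an error code."""
    def __init__(self, errorcode):
        super().__init__(errorcode)
        self.errorcode = errorcode


def foo():
    """Hypothetical function under test; it always fails."""
    raise MyException(SOME_FOO_ERROR)


def catch(exception, func, *args, **kwargs):
    """Call func and return the exception it raises."""
    try:
        func(*args, **kwargs)
    except exception as err:
        return err
    raise RuntimeError('no error raised')


class FooTests(unittest.TestCase):
    def testFooRaisesException(self):
        self.assertRaises(MyException, foo)

    def testFooExceptionCode(self):
        self.assertEqual(catch(MyException, foo).errorcode, SOME_FOO_ERROR)


# Run the tests without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FooTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```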



(By the way, I have to question the design of an exception with error 
codes. That seems pretty poor design to me. Normally the exception *type* 
acts as equivalent to an error code.)


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Heiko Wundram

Am 28.03.2012 19:43, schrieb Peter Daum:

> As it seems, this would be far easier with python 2.x. With python 3
> and its strict distinction between "str" and "bytes", things get
> syntactically pretty awkward and error-prone (something as
> innocent-looking as "s=s+'/'" hidden in a rarely reached branch, and a
> seemingly correct program will crash with a TypeError 2 years
> later ...)


It seems that you're mixing things up wrt. the string/bytes 
distinction; it's not as "complicated" as it might seem.


1) Strings

s = "This is a test string"
s = 'This is another test string with single quotes'
s = """
And this is a multiline test string.
"""
s = 'c' # This is also a string...

all create/refer to string objects. How Python internally stores them 
is none of your concern (actually, that's rather complicated anyway, at 
least with the upcoming Python 3.3), and processing a string basically 
means that you'll work on the natural language characters present in the 
string. Python strings can store (pretty much) all characters and 
surrogates that unicode allows, and when the python interpreter/compiler 
reads strings from input (I'm talking about source files), a default 
encoding defines how the bytes in your input file get interpreted as 
unicode codepoint encodings (generally, it depends on your system locale 
or file header indications) to construct the internal string object 
you're using to access the data in the string.


There is no such thing as a type for a single character; single 
characters are simply strings of length 1 (and so indexing also returns 
a [new] string object).


Single and double quotes work no differently.

The internal encoding used by the Python interpreter is of no concern 
to you.


2) Bytes

s = b'this is a byte-string'
s = b'\x22\x33\x44'

The above define bytes. Think of the bytes type as arrays of 8-bit 
integers, only representing a buffer which you can process as an array 
of fixed-width integers. Reading from a file opened in binary mode (or 
from sys.stdin.buffer) gets you bytes, and not a string, because Python 
cannot automagically guess what format the input is in.


Indexing the bytes type returns an integer (which is the clearest 
distinction between string and bytes).
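
The indexing difference is easy to check. A small sketch with illustrative
names:

```python
text = 'abc'
data = b'abc'

first_char = text[0]    # a str of length 1
first_byte = data[0]    # an int between 0 and 255

print(repr(first_char), type(first_char).__name__)
print(repr(first_byte), type(first_byte).__name__)
```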


Being able to input "string-looking" data in source files as bytes is a 
debatable "feature" (IMHO; see the first example), simply because it 
breaks the semantic difference between the two types in the eye of the 
programmer looking at source.


3) Conversions

To get from bytes to string, you have to decode the bytes buffer, 
telling Python what kind of character data is contained in the array of 
integers. After decoding, you'll get a string object which you can 
process using the standard string methods. For decoding to succeed, you 
have to tell Python how the natural language characters are encoded in 
your array of bytes:


b'hello'.decode('iso-8859-15')

To get from string back to bytes (you want to write the natural 
language character data you've processed to a file), you have to encode 
the data in your string buffer, which gets you an array of 8-bit 
integers to write to the output:


'hello'.encode('iso-8859-15')

Most output methods will happily do the encoding for you, using a 
standard encoding, and if that happens to be ASCII, you're getting 
UnicodeEncodeErrors which tell you that a character in your string 
source is unsuited to be transmitted using the encoding you've 
specified.
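
A minimal sketch of the round trip and of the error case. The sample bytes
are an assumption chosen for illustration (a German word with u-umlaut and
sharp s, encoded as ISO-8859-15):

```python
raw = b'gr\xfc\xdfe'                       # sample ISO-8859-15 data

text = raw.decode('iso-8859-15')           # bytes -> str
round_trip = text.encode('iso-8859-15')    # str -> bytes, same encoding

try:
    text.encode('ascii')                   # ASCII cannot represent \xfc, \xdf
except UnicodeEncodeError as exc:
    encode_error = exc
    print('cannot encode:', exc.reason)
```

Decoding and re-encoding with the same encoding returns the original bytes,
while encoding to ASCII fails on the two non-ASCII characters.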


If the above doesn't make the string/bytes-distinction and usage 
clearer, and you have a C#-background, check out the distinction between 
byte[] (which the System.IO-streams get you), and how you have to use a 
System.Encoding-derived class to get at actual System.String objects to 
manipulate character data. Python's type system wrt. character data is 
pretty much similar, except for missing the "single character" type 
(char).


Anyway, back to what you wrote: how are you getting the input data? Why 
are "high bytes" in there which you do not know the encoding for? 
Generally, from what I gather, you'll decode data from some source, 
process it, and write it back using the same encoding which you used for 
decoding, which should do exactly what you want and not get you into any 
trouble with encodings.


--
--- Heiko.
--
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Ross Ridge wrote:
> Of course it is.  Conceptually you're not supposed to think of it that
> way, but a string is stored in memory as a series of bytes.

Chris Angelico   wrote:
>Note that distinction. I said that a string "is not" a series of
>bytes; you say that it "is stored" as bytes.

The distinction is meaningless.  I'm not going to argue with you about what
you or I meant by the word "is".

>But a Python Unicode string might be stored in several
>ways; for all you know, it might actually be stored as a sequence of
>apples in a refrigerator, just as long as they can be referenced
>correctly.

But it is in fact only stored in one particular way, as a series of bytes.

>There's no logical Python way to turn that into a series of bytes.

Nonsense.  Play all the semantic games you want, it already is a series
of bytes.

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  rri...@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Steven D'Aprano
On Wed, 28 Mar 2012 11:43:52 +0200, Peter Daum wrote:

> ... in my example, the variable s points to a "string", i.e. a series of
> bytes, (0x61,0x62 ...) interpreted as ascii/unicode characters.

No. Strings are not sequences of bytes (except in the trivial sense that 
everything in computer memory is made of bytes). They are sequences of 
CODE POINTS. (Roughly speaking, code points are *almost* but not quite 
the same as characters.)

I suggest that you need to reset your understanding of strings and bytes. 
I suggest you start by reading this:

http://www.joelonsoftware.com/articles/Unicode.html

Then come back and try to explain what actual problem you are trying to 
solve.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Jussi Piitulainen
Peter Daum writes:

> ... I was under the illusion that python (like e.g. perl) stored
> strings internally in utf-8. In this case the "conversion" would simply
> mean to re-label the data. Unfortunately, as I meanwhile found out, this
> is not the case (nor the "apple encoding" ;-), so it would indeed be
> pretty useless.
> 
> The longer story of my question is: I am new to python (obviously), and
> since I am not familiar with either one, I thought it would be advisable
> to go for python 3.x. The biggest problem that I am facing is that I
> am often dealing with data that is basically text, but it can contain
> 8-bit bytes. In this case, I can not safely assume any given encoding,
> but I actually also don't need to know - for my purposes, it would be
> perfectly good enough to deal with the ascii portions and keep anything
> else unchanged.

You can read as bytes and decode as ASCII but ignoring the troublesome
non-text characters:

>>> print(open('text.txt', 'br').read().decode('ascii', 'ignore'))
Das fr ASCII nicht benutzte Bit kann auch fr Fehlerkorrekturzwecke
(Parittsbit) auf den Kommunikationsleitungen oder fr andere
Steuerungsaufgaben verwendet werden. Heute wird es aber fast immer zur
Erweiterung von ASCII auf einen 8-Bit-Code verwendet. Diese
Erweiterungen sind mit dem ursprnglichen ASCII weitgehend kompatibel,
so dass alle im ASCII definierten Zeichen auch in den verschiedenen
Erweiterungen durch die gleichen Bitmuster kodiert werden. Die
einfachsten Erweiterungen sind Kodierungen mit sprachspezifischen
Zeichen, die nicht im lateinischen Grundalphabet enthalten sind.

The paragraph is from the German Wikipedia on ASCII, in UTF-8.
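
The same effect on a single word, as a small sketch (the sample text is
illustrative). The two UTF-8 bytes of the umlaut are dropped, which is why
"fr" appears in the output above:

```python
raw = 'f\u00fcr ASCII'.encode('utf-8')      # u-umlaut becomes two bytes

stripped = raw.decode('ascii', 'ignore')    # non-ASCII bytes are dropped

print(raw)
print(stripped)
```

Note that this discards the non-ASCII data instead of keeping it unchanged.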
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Prasad, Ramit
> As it seems, this would be far easier with python 2.x. With python 3
> and its strict distinction between "str" and "bytes", things get
> syntactically pretty awkward and error-prone (something as
> innocent-looking as "s=s+'/'" hidden in a rarely reached branch, and a
> seemingly correct program will crash with a TypeError 2 years
> later ...)

Just a small note, as you are new to Python: repeated string concatenation
can be expensive (quadratic time). The Python (2.x and 3.x) idiom for 
frequent string concatenation is to append to a list and then join 
the pieces, as in the following (linear time). 

>>> lst = ['Hi,']
>>> lst.append('how')
>>> lst.append('are')
>>> lst.append('you?')
>>> sentence = ' '.join(lst)  # use a space to separate each element
>>> print(sentence)
Hi, how are you?

You can use join on an empty string, but then they will not be 
separated by spaces.

>>> sentence = ''.join(lst)  # empty string so no separation
>>> print(sentence)
Hi,howareyou?

You can use any string as a separator, length does not matter.

>>> sentence = '@-Q'.join(lst)
>>> print(sentence)
Hi,@-Qhow@-Qare@-Qyou?


Ramit


Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
712 Main Street | Houston, TX 77002
work phone: 713 - 216 - 5423

--

This email is confidential and subject to important disclaimers and
conditions including on offers for the purchase or sale of
securities, accuracy and completeness of information, viruses,
confidentiality, legal privilege, and legal entity disclaimers,
available at http://www.jpmorgan.com/pages/disclosures/email.  
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Prasad, Ramit
> You can read as bytes and decode as ASCII but ignoring the troublesome
> non-text characters:
> 
> >>> print(open('text.txt', 'br').read().decode('ascii', 'ignore'))
> Das fr ASCII nicht benutzte Bit kann auch fr Fehlerkorrekturzwecke
> (Parittsbit) auf den Kommunikationsleitungen oder fr andere
> Steuerungsaufgaben verwendet werden. Heute wird es aber fast immer zur
> Erweiterung von ASCII auf einen 8-Bit-Code verwendet. Diese
> Erweiterungen sind mit dem ursprnglichen ASCII weitgehend kompatibel,
> so dass alle im ASCII definierten Zeichen auch in den verschiedenen
> Erweiterungen durch die gleichen Bitmuster kodiert werden. Die
> einfachsten Erweiterungen sind Kodierungen mit sprachspezifischen
> Zeichen, die nicht im lateinischen Grundalphabet enthalten sind.
> 
> The paragraph is from the German Wikipedia on ASCII, in UTF-8.

I see no non-ASCII characters, not sure if that is because the source
has none or something else. From this example I would not say that
the rest of the text is "unchanged".  Decode converts to Unicode,
did you mean encode?

I think "ignore" will remove non-translatable characters and not 
leave them in the returned string.

Ramit


Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
712 Main Street | Houston, TX 77002
work phone: 713 - 216 - 5423

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ian Kelly
On Wed, Mar 28, 2012 at 11:43 AM, Peter Daum  wrote:
> ... I was under the illusion that python (like e.g. perl) stored
> strings internally in utf-8. In this case the "conversion" would simply
> mean to re-label the data. Unfortunately, as I meanwhile found out, this
> is not the case (nor the "apple encoding" ;-), so it would indeed be
> pretty useless.

No, unicode strings can be stored internally as any of UCS-1, UCS-2,
UCS-4, C wchar strings, or even plain ASCII.  And those are all
implementation details that could easily change in future versions of
Python.

> The longer story of my question is: I am new to python (obviously), and
> since I am not familiar with either one, I thought it would be advisable
> to go for python 3.x. The biggest problem that I am facing is that I
> am often dealing with data that is basically text, but it can contain
> 8-bit bytes. In this case, I can not safely assume any given encoding,
> but I actually also don't need to know - for my purposes, it would be
> perfectly good enough to deal with the ascii portions and keep anything
> else unchanged.

You can't generally just "deal with the ascii portions" without
knowing something about the encoding.  Say you encounter a byte
greater than 127.  Is it a single non-ASCII character, or is it the
leading byte of a multi-byte character?  If the next character is less
than 127, is it an ASCII character, or a continuation of the previous
character?  For UTF-8 you could safely assume ASCII, but without
knowing the encoding, there is no way to be sure.  If you just assume
it's ASCII and manipulate it as such, you could be messing up
non-ASCII characters.
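
A small sketch of both ambiguities. The sample bytes and the choice of
Shift JIS as the multi-byte example are assumptions made for illustration:

```python
data = b'\xc3\xa9'

as_utf8 = data.decode('utf-8')       # one character: e with acute accent
as_latin1 = data.decode('latin-1')   # two characters: A-tilde, copyright sign

# In Shift JIS the second byte of a two-byte character can be below 128.
kanji = '\u8868'                     # a common kanji
sjis = kanji.encode('shift_jis')     # second byte is 0x5c, ASCII backslash

print(len(as_utf8), len(as_latin1), sjis)
```

Code that treats every byte below 128 as ASCII would see a backslash in the
Shift JIS data that is really the tail of another character.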

Cheers,
Ian
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Terry Reedy



On 3/28/2012 11:36 AM, Ross Ridge wrote:

> Chris Angelico  wrote:
>> What is a string? It's not a series of bytes.
>
> Of course it is.  Conceptually you're not supposed to think of it that
> way, but a string is stored in memory as a series of bytes.


*If* it is stored in byte memory. If you execute a 3.x program mentally 
or on paper, then there are no bytes.


If you execute a 3.3 program on a byte-oriented computer, then the 'a' 
in the string might be represented by 1, 2, or 4 bytes, depending on the 
other characters in the string. The actual logical bit pattern will 
depend on the big versus little endianness of the system.
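
The effect of width and byte order can be seen with explicit encodings. This
sketch does not show the interpreter's internal representation; it only
illustrates how the same character maps to different byte patterns:

```python
one_byte = 'a'.encode('latin-1')        # 1 byte per character
two_le = 'a'.encode('utf-16-le')        # 2 bytes, little-endian
two_be = 'a'.encode('utf-16-be')        # 2 bytes, big-endian
four_le = 'a'.encode('utf-32-le')       # 4 bytes, little-endian

for label, value in [('latin-1', one_byte), ('utf-16-le', two_le),
                     ('utf-16-be', two_be), ('utf-32-le', four_le)]:
    print(label, value)
```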


My impression is that if you go down to the physical bit level, then 
again there are, possibly, no 'bytes' as a physical construct as the 
bits, possibly, are stored in parallel on multiple ram chips.



> What he's asking for may not be very useful or practical, but if that's
> your problem here then that's what you should be addressing, not
> pretending that it's fundamentally impossible.


The python-level way to get the bytes of an object that supports the 
buffer interface is memoryview(). 3.x strings intentionally do not 
support the buffer interface as there is not any particular 
correspondence between characters (codepoints) and bytes.


The OP could get the ordinal for each character and decide how *he* 
wants to convert them to bytes.


ba = bytearray()
for c in s:
    i = ord(c)
    # ... convert i to one or more bytes as *you* see fit and add them to ba

To get the particular bytes used for a particular string on a particular 
system, OP should use the C API, possibly through ctypes.
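
One possible completion of the loop, as a sketch only. The policy of
rejecting code points above 255 and the function name ordinals_to_bytes are
assumptions, not something prescribed in this thread:

```python
def ordinals_to_bytes(s):
    """Map each code point below 256 to the byte with the same value."""
    ba = bytearray()
    for c in s:
        i = ord(c)
        if i > 255:
            # Policy choice (an assumption): refuse wider code points.
            raise ValueError('code point %d does not fit in one byte' % i)
        ba.append(i)
    return bytes(ba)

result = ordinals_to_bytes('abcde')
print(result)
```

With this particular policy the result matches s.encode('latin-1') for every
string that it accepts.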


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: unittest: assertRaises() with an instance instead of a type

2012-03-28 Thread Terry Reedy

On 3/28/2012 8:28 AM, Ulrich Eckhardt wrote:

> Hi!
>
> I'm currently writing some tests for the error handling of some code. In
> this scenario, I must make sure that both the correct exception is
> raised and that the contained error code is correct:
>
> try:
> foo()
> self.fail('exception not raised')
> catch MyException as e:
> self.assertEqual(e.errorcode, SOME_FOO_ERROR)
> catch Exception:
> self.fail('unexpected exception raised')
>
> This is tedious to write and read. The docs mention this alternative:
>
> with self.assertRaises(MyException) as cm:
> foo()
> self.assertEqual(cm.exception.errorcode, SOME_FOO_ERROR)


Exceptions can have multiple attributes. This allows the tester to 
exactly specify what attributes to test.
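
A self-contained sketch of the context-manager form. MyException, foo and
SOME_FOO_ERROR are hypothetical stand-ins for the code under test; the caught
exception is available as the exception attribute of the context manager:

```python
import unittest

SOME_FOO_ERROR = 42                      # hypothetical error code

class MyException(Exception):            # hypothetical exception type
    def __init__(self, errorcode):
        super().__init__(errorcode)
        self.errorcode = errorcode

def foo():                               # hypothetical function under test
    raise MyException(SOME_FOO_ERROR)

class FooTests(unittest.TestCase):
    def test_error_code(self):
        with self.assertRaises(MyException) as cm:
            foo()
        # The caught exception is stored on the context manager.
        self.assertEqual(cm.exception.errorcode, SOME_FOO_ERROR)

# Run the test without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FooTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```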



> This is shorter, but I think there's an alternative syntax possible that
> would be even better:
>
> with self.assertRaises(MyException(SOME_FOO_ERROR)):
> foo()


I presume that if this worked the way you want, all attributes would 
have to match. The message part of builtin exceptions is allowed to 
change, so hard-coding an exact expected message makes tests fragile. 
This is a problem with doctest.



> Here, assertRaises() is not called with an exception type but with an
> exception instance. I'd implement it something like this:
>
> def assertRaises(self, exception, ...):
> # divide input parameter into type and instance
> if isinstance(exception, Exception):
> exception_type = type(exception)
> else:
> exception_type = exception
> exception = None
> # call testee and verify results
> try:
> ...call function here...
> except exception_type as e:
> if not exception is None:
> self.assertEqual(e, exception)


Did you use tabs? They do not get preserved indefinitely, so they are 
bad for posting.



> This of course requires the exception to be equality-comparable.


Equality comparison is by id unless the exception class defines __eq__, so
this code will not do what you want.

You can, of course, write a custom AssertX subclass that at least works 
for your custom exception class.


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Steven D'Aprano
On Wed, 28 Mar 2012 19:43:36 +0200, Peter Daum wrote:

> The longer story of my question is: I am new to python (obviously), and
> since I am not familiar with either one, I thought it would be advisable
> to go for python 3.x. The biggest problem that I am facing is that I am
> often dealing with data that is basically text, but it can contain
> 8-bit bytes. 

All bytes are 8-bit, at least on modern hardware. I think you have to go 
back to the 1950s to find 10-bit or 12-bit machines.

> In this case, I can not safely assume any given encoding,
> but I actually also don't need to know - for my purposes, it would be
> perfectly good enough to deal with the ascii portions and keep anything
> else unchanged.

Well you can't do that, because *by definition* you are changing a 
CHARACTER into ONE OR MORE BYTES. So the question you have to ask is, 
*how* do you want to change them?

You can use an error handler to convert any untranslatable characters 
into question marks, or to ignore them altogether:

bytes = string.encode('ascii', 'replace')
bytes = string.encode('ascii', 'ignore')

When going the other way, from bytes to strings, it can sometimes be 
useful to use the Latin-1 encoding, which essentially cannot fail:

string = bytes.decode('latin1')

although the non-ASCII chars that you get may not be sensible or 
meaningful in any way. But if there are only a few of them, and you don't 
care too much, this may be a simple approach.

But in a nutshell, it is physically impossible to map the millions of 
Unicode characters to just 256 possible bytes without either throwing 
some characters away, or performing an encoding.
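
A short sketch of the three options mentioned here. The sample text is an
assumption chosen for illustration:

```python
text = 'na\u00efve caf\u00e9'                 # two accented characters

replaced = text.encode('ascii', 'replace')  # accents become b'?'
ignored = text.encode('ascii', 'ignore')    # accents are dropped

# Latin-1 maps every byte value 0..255 to a code point, so decoding
# arbitrary bytes never fails and re-encoding returns the same bytes.
all_bytes = bytes(range(256))
round_trip = all_bytes.decode('latin1').encode('latin1')

print(replaced, ignored)
```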



> As it seems, this would be far easier with python 2.x. 

It only seems that way until you try.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Steven D'Aprano   wrote:
>The right way to convert bytes to strings, and vice versa, is via 
>encoding and decoding operations.

If you want to dictate to the original poster the correct way to do
things then you don't need to do anything more than that.  You don't need to
pretend like Chris Angelico that there isn't a direct mapping from
his Python 3 implementation's internal representation of strings
to bytes in order to label what he's asking for as being "silly".

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  rri...@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ethan Furman

Peter Daum wrote:

> On 2012-03-28 12:42, Heiko Wundram wrote:
>> Am 28.03.2012 11:43, schrieb Peter Daum:
>>> ... in my example, the variable s points to a "string", i.e. a series of
>>> bytes, (0x61,0x62 ...) interpreted as ascii/unicode characters.
>>
>> No; a string contains a series of codepoints from the unicode plane,
>> representing natural language characters (at least in the simplistic
>> view, I'm not talking about surrogates). These can be encoded to
>> different binary storage representations, of which ascii is (a common) one.
>>
>>> What I am looking for is a general way to just copy the raw data
>>> from a "string" object to a "byte" object without any attempt to
>>> "decode" or "encode" anything ...
>>
>> There is "logically" no raw data in the string, just a series of
>> codepoints, as stated above. You'll have to specify the encoding to use
>> to get at "raw" data, and from what I gather you're interested in the
>> latin-1 (or iso-8859-15) encoding, as you're specifically referencing
>> chars >= 0x80 (which hints at your mindset being in LATIN-land, so to
>> speak).
>
> The longer story of my question is: I am new to python (obviously), and
> since I am not familiar with either one, I thought it would be advisable
> to go for python 3.x. The biggest problem that I am facing is that I
> am often dealing with data that is basically text, but it can contain
> 8-bit bytes. In this case, I can not safely assume any given encoding,
> but I actually also don't need to know - for my purposes, it would be
> perfectly good enough to deal with the ascii portions and keep anything
> else unchanged.


Where is the data coming from?  Files?  In that case, it sounds like you 
will want to decode/encode using 'latin-1', as the bulk of your text is 
plain ascii and you don't really care about the upper-ascii chars.
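
A sketch of that approach with a temporary file. The sample bytes and the
edit (appending a slash) are assumptions made for illustration:

```python
import os
import tempfile

# Hypothetical input: mostly ASCII text with two bytes of unknown encoding.
original = b'name=\xe4\xf6;path=/tmp'

fd, filename = tempfile.mkstemp()
os.close(fd)
try:
    with open(filename, 'wb') as f:
        f.write(original)

    # Decode as Latin-1, edit only the ASCII part, encode as Latin-1 again.
    with open(filename, encoding='latin-1', newline='') as f:
        text = f.read()
    text = text + '/'
    with open(filename, 'w', encoding='latin-1', newline='') as f:
        f.write(text)

    with open(filename, 'rb') as f:
        rewritten = f.read()
finally:
    os.remove(filename)

print(rewritten)
```

The high bytes come back unchanged because Latin-1 maps each byte to exactly
one code point and back; the characters seen in between may be meaningless
if the data was really in some other encoding.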


~Ethan~
--
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Tim Chase

On 03/28/12 13:05, Ross Ridge wrote:

>> But a Python Unicode string might be stored in several
>> ways; for all you know, it might actually be stored as a sequence of
>> apples in a refrigerator, just as long as they can be referenced
>> correctly.
>
> But it is in fact only stored in one particular way, as a series of bytes.
>
>> There's no logical Python way to turn that into a series of bytes.
>
> Nonsense.  Play all the semantic games you want, it already is a series
> of bytes.


Internally, they're a series of bytes, but they are MEANINGLESS 
bytes unless you know how they are encoded internally.  Those 
bytes could be UTF-8, UTF-16, UTF-32, or any of a number of other 
possible encodings[1].  If you get the internal byte stream, 
there's no way to meaningfully operate on it unless you also know 
how it's encoded (or you're willing to sacrifice the ability to 
reliably get the string back).
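To make the point about "meaningless bytes" concrete, here is a minimal sketch; the sample word is an arbitrary choice and not something from the thread. It encodes one string three ways and then decodes the UTF-8 bytes with the wrong codec:

```python
# One string, three byte sequences: the bytes only make sense
# together with the name of the encoding that produced them.
text = "na\u00efve"   # "naive" with a diaeresis on the i (illustrative sample)

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
utf32 = text.encode("utf-32-le")

# Same five characters, three different lengths in bytes.
print(len(utf8), len(utf16), len(utf32))

# Decoding with the matching codec recovers the text ...
recovered = utf8.decode("utf-8")

# ... while a mismatched codec raises no error and quietly yields other text.
mojibake = utf8.decode("latin-1")
print(repr(recovered), repr(mojibake))
```

Nothing in the byte sequence itself says which of the three codecs produced it, which is why the encoding has to travel alongside the bytes.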


-tkc

[1]
http://docs.python.org/library/codecs.html#standard-encodings




--
http://mail.python.org/mailman/listinfo/python-list


Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Evan Driscoll

On 01/-10/-28163 01:59 PM, Ross Ridge wrote:

Steven D'Aprano  wrote:

The right way to convert bytes to strings, and vice versa, is via
encoding and decoding operations.


If you want to dictate to the original poster the correct way to do
things then you don't need to do anything more than that.  You don't need to
pretend like Chris Angelico that there isn't a direct mapping from
his Python 3 implementation's internal representation of strings
to bytes in order to label what he's asking for as being "silly".


That mapping may as well be:

  def get_bytes(some_string):
      import random
      # Pick an arbitrary length and fill it with arbitrary byte values.
      length = random.randint(len(some_string), 5*len(some_string))
      bytes = [0] * length
      for i in range(length):
          bytes[i] = random.randint(0, 255)
      return bytes

Of course this is hyperbole, but that is essentially as much of a 
guarantee as you have about what the result is.


As many others have said, the encoding isn't defined, and I would guess 
varies between implementations. (E.g. if Jython and IronPython use their 
host platforms' native strings, both have 16-bit chars and thus probably 
use UTF-16 encoding. I am not sure what CPython uses, but I bet it's 
*not* that.)


It's not even guaranteed that the byte representation won't change! If 
something is lazily evaluated or you have a COW string or something, the 
bytes backing it will differ.



So yes, you can say that pretending there's not a mapping of strings to 
internal representation is silly, because there is. However, there's 
nothing you can say about that mapping.
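As an illustration of how little can be assumed, here is a sketch that only observes memory use. It assumes CPython 3.3 or later, where PEP 393 picks 1, 2 or 4 bytes per character depending on the widest character in the string; other implementations, and older CPython builds, may behave differently.

```python
import sys

ascii_only = "a" * 1000
bmp = "\u20ac" * 1000          # euro sign, needs 16 bits per character
astral = "\U0001F600" * 1000   # outside the Basic Multilingual Plane

# All three strings have the same length in characters, but the
# interpreter is free to store them with different widths.
for label, s in [("ascii", ascii_only), ("bmp", bmp), ("astral", astral)]:
    print(label, len(s), sys.getsizeof(s))
```

The sizes are an implementation detail and not part of the language, which is the reason for going through `.encode()` when bytes are needed.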


Evan
--
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Tim Chase   wrote:
>Internally, they're a series of bytes, but they are MEANINGLESS 
>bytes unless you know how they are encoded internally.  Those 
>bytes could be UTF-8, UTF-16, UTF-32, or any of a number of other 
>possible encodings[1].  If you get the internal byte stream, 
>there's no way to meaningfully operate on it unless you also know 
>how it's encoded (or you're willing to sacrifice the ability to 
>reliably get the string back).

In practice the number of ways that CPython (the only Python 3
implementation) represents strings is much more limited.  Pretending
otherwise really isn't helpful.

Still, if Chris Angelico had used your much less misleading explanation,
then this could've been resolved much quicker.  The original poster
didn't buy Chris's bullshit for a minute, instead he had to find out on
his own that the internal representation of strings wasn't what he
expected it to be.

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  rri...@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Albert W. Hopkins
On Wed, 2012-03-28 at 14:05 -0400, Ross Ridge wrote:
> Ross Ridge wrote:
> > Of course it is.  Conceptually you're not supposed to think of it that
> > way, but a string is stored in memory as a series of bytes.
> 
> Chris Angelico   wrote:
> >Note that distinction. I said that a string "is not" a series of
> >bytes; you say that it "is stored" as bytes.
> 
> The distinction is meaningless.  I'm not going to argue with you about what
> you or I meant by the word "is".
> 

Off topic, but obligatory:

https://www.youtube.com/watch?v=j4XT-l-_3y0


-- 
http://mail.python.org/mailman/listinfo/python-list


RE: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Prasad, Ramit
> >The right way to convert bytes to strings, and vice versa, is via
> >encoding and decoding operations.
> 
> If you want to dictate to the original poster the correct way to do
> things then you don't need to do anything more than that.  You don't need to
> pretend like Chris Angelico that there isn't a direct mapping from
> his Python 3 implementation's internal representation of strings
> to bytes in order to label what he's asking for as being "silly".

It might be technically possible to recreate internal implementation,
or get the byte data. That does not mean it will make any sense or
be understood in a meaningful manner. I think Ian summarized it
very well:

>You can't generally just "deal with the ascii portions" without
>knowing something about the encoding.  Say you encounter a byte
>greater than 127.  Is it a single non-ASCII character, or is it the
>leading byte of a multi-byte character?  If the next character is less
>than 127, is it an ASCII character, or a continuation of the previous
>character?  For UTF-8 you could safely assume ASCII, but without
>knowing the encoding, there is no way to be sure.  If you just assume
>it's ASCII and manipulate it as such, you could be messing up
>non-ASCII characters.
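A sketch of the hazard Ian describes, using Shift_JIS as an assumed example encoding (the thread does not name one). In that encoding the second byte of a two-byte character can fall in the ASCII range, so "fixing" ASCII bytes without knowing the encoding damages the text:

```python
# Katakana "so" (U+30BD) is two bytes in Shift_JIS, and the second
# byte has the same value as an ASCII backslash.
raw = "\u30bd".encode("shift_jis")
print(raw, hex(raw[1]))

# Treating the data as "ASCII plus unknown high bytes" and replacing
# backslashes corrupts the two-byte character.
cleaned = raw.replace(b"\\", b"/")

# The damaged bytes no longer decode to the original character.
damaged = cleaned.decode("shift_jis", errors="replace")
print(repr(damaged))
```

With UTF-8 this particular accident cannot happen, because every byte of a multi-byte sequence is 128 or above, which is the distinction the quoted paragraph draws.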

Technically, ASCII goes up to 256 but they are not A-z letters.

Ramit


Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
712 Main Street | Houston, TX 77002
work phone: 713 - 216 - 5423

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ethan Furman

Prasad, Ramit wrote:

You can read as bytes and decode as ASCII but ignoring the troublesome
non-text characters:


print(open('text.txt', 'br').read().decode('ascii', 'ignore'))

Das fr ASCII nicht benutzte Bit kann auch fr Fehlerkorrekturzwecke
(Parittsbit) auf den Kommunikationsleitungen oder fr andere
Steuerungsaufgaben verwendet werden. Heute wird es aber fast immer zur
Erweiterung von ASCII auf einen 8-Bit-Code verwendet. Diese
Erweiterungen sind mit dem ursprnglichen ASCII weitgehend kompatibel,
so dass alle im ASCII definierten Zeichen auch in den verschiedenen
Erweiterungen durch die gleichen Bitmuster kodiert werden. Die
einfachsten Erweiterungen sind Kodierungen mit sprachspezifischen
Zeichen, die nicht im lateinischen Grundalphabet enthalten sind.

The paragraph is from the German Wikipedia on ASCII, in UTF-8.


I see no non-ASCII characters, not sure if that is because the source
has none or something else.


The 'ignore' argument to .decode() caused all non-ascii characters to be 
removed.
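A small sketch of what the `'ignore'` handler does; the sample phrase is made up from words in the quoted paragraph. For contrast it also shows `'replace'`, which keeps a visible marker where bytes were dropped:

```python
original = "f\u00fcr Parit\u00e4tsbit"   # sample text with two umlauts
data = original.encode("utf-8")

# 'ignore' silently drops every byte that is not valid ASCII.
stripped = data.decode("ascii", "ignore")
print(stripped)

# 'replace' substitutes U+FFFD, so the loss stays visible.
replaced = data.decode("ascii", "replace")
print(replaced)
```

Either way the original bytes are gone from the resulting string, so neither handler is suitable when the non-ASCII data has to be written back unchanged.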


~Ethan~
--
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread John Nagle

On 3/28/2012 10:43 AM, Peter Daum wrote:

On 2012-03-28 12:42, Heiko Wundram wrote:

Am 28.03.2012 11:43, schrieb Peter Daum:



The longer story of my question is: I am new to python (obviously), and
since I am not familiar with either one, I thought it would be advisory
to go for python 3.x. The biggest problem that I am facing is, that I
am often dealing with data, that is basically text, but it can contain
8-bit bytes. In this case, I can not safely assume any given encoding,
but I actually also don't need to know - for my purposes, it would be
perfectly good enough to deal with the ascii portions and keep anything
else unchanged.


   So why let the data get into a "str" type at all? Do everything
end to end with "bytes" or "bytearray" types.
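A minimal sketch of the bytes-only approach, with invented sample data. The ASCII parts are handled through bytes literals and bytes methods, and bytes of 128 and above pass through untouched:

```python
# Mostly-ASCII data with two high bytes whose encoding is unknown.
data = b"Name: M\xfcller\nCity: K\xf6ln\n"

# Split and parse using bytes methods only; note the b prefix everywhere.
fields = dict(line.split(b": ", 1) for line in data.splitlines())

# bytes.upper() changes ASCII letters only, so the unknown bytes survive.
upper_keys = {key.upper(): value for key, value in fields.items()}
print(upper_keys)

# Rebuild the record; the high bytes are exactly as they came in.
rebuilt = b"".join(k + b": " + v + b"\n" for k, v in upper_keys.items())
print(rebuilt)
```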

John Nagle
--
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Grant Edwards
On 2012-03-28, Steven D'Aprano  wrote:
> On Wed, 28 Mar 2012 19:43:36 +0200, Peter Daum wrote:
>
>> The longer story of my question is: I am new to python (obviously), and
>> since I am not familiar with either one, I thought it would be advisory
>> to go for python 3.x. The biggest problem that I am facing is, that I am
>> often dealing with data, that is basically text, but it can contain
>> 8-bit bytes. 
>
> All bytes are 8-bit, at least on modern hardware. I think you have to
> go back to the 1950s to find 10-bit or 12-bit machines.

Well, on anything likely to run Python that's true.  There are modern
DSP-oriented CPUs where a byte is 16 or 32 bits (and so is an int and
a long, and a float and a double).

>> As it seems, this would be far easier with python 2.x. 
>
> It only seems that way until you try.

It's easy as long as you deal with nothing but ASCII and Latin-1. ;)

-- 
Grant Edwards               grant.b.edwards        Yow! Somewhere in Tenafly,
                                  at               New Jersey, a chiropractor
                              gmail.com            is viewing "Leave it to
                                                   Beaver"!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread MRAB

On 28/03/2012 20:02, Prasad, Ramit wrote:

 >The right way to convert bytes to strings, and vice versa, is via
 >encoding and decoding operations.

 If you want to dictate to the original poster the correct way to do
 things then you don't need to do anything more than that.  You don't need to
 pretend like Chris Angelico that there isn't a direct mapping from
 his Python 3 implementation's internal representation of strings
 to bytes in order to label what he's asking for as being "silly".


It might be technically possible to recreate internal implementation,
or get the byte data. That does not mean it will make any sense or
be understood in a meaningful manner. I think Ian summarized it
very well:


You can't generally just "deal with the ascii portions" without
knowing something about the encoding.  Say you encounter a byte
greater than 127.  Is it a single non-ASCII character, or is it the
leading byte of a multi-byte character?  If the next character is less
than 127, is it an ASCII character, or a continuation of the previous
character?  For UTF-8 you could safely assume ASCII, but without
knowing the encoding, there is no way to be sure.  If you just assume
it's ASCII and manipulate it as such, you could be messing up
non-ASCII characters.


Technically, ASCII goes up to 256 but they are not A-z letters.


Technically, ASCII is 7-bit, so it goes up to 127.
--
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Grant Edwards
On 2012-03-28, Prasad, Ramit  wrote:
> 
>>You can't generally just "deal with the ascii portions" without
>>knowing something about the encoding.  Say you encounter a byte
>>greater than 127.  Is it a single non-ASCII character, or is it the
>>leading byte of a multi-byte character?  If the next character is less
>>than 127, is it an ASCII character, or a continuation of the previous
>>character?  For UTF-8 you could safely assume ASCII, but without
>>knowing the encoding, there is no way to be sure.  If you just assume
>>it's ASCII and manipulate it as such, you could be messing up
>>non-ASCII characters.
> 
> Technically, ASCII goes up to 256

No, ASCII only defines 0-127.  Values >=128 are not ASCII.

From https://en.wikipedia.org/wiki/ASCII:

  ASCII includes definitions for 128 characters: 33 are non-printing
  control characters (now mostly obsolete) that affect how text and
  space is processed and 95 printable characters, including the space
  (which is considered an invisible graphic).
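The 0-127 limit is easy to check from Python. In this sketch `is_ascii` is a hypothetical helper name and not a library function:

```python
def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the range ASCII defines (0-127)."""
    return all(byte < 128 for byte in data)

print(is_ascii(b"plain text"))
print(is_ascii(b"caf\xe9"))

# All 128 defined values decode; the first value past the range does not.
all_ascii = bytes(range(128)).decode("ascii")
try:
    bytes([128]).decode("ascii")
    rejected = False
except UnicodeDecodeError:
    rejected = True
print(len(all_ascii), rejected)
```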

-- 
Grant Edwards               grant.b.edwards        Yow! Used staples are good
                                  at               with SOY SAUCE!
                              gmail.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Evan Driscoll   wrote:
>So yes, you can say that pretending there's not a mapping of strings to 
>internal representation is silly, because there is. However, there's 
>nothing you can say about that mapping.

I'm not the one labeling anything as being silly.  I'm the one labeling
the things as bullshit, and that's what you're doing here.  I can in
fact say what the internal byte string representation of strings is in any
given build of Python 3.  Just because I can't say what it would be in
an imaginary hypothetical implementation doesn't mean I can never say
anything about it.

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  rri...@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Mark Lawrence

On 28/03/2012 20:43, Ross Ridge wrote:

Evan Driscoll  wrote:

So yes, you can say that pretending there's not a mapping of strings to
internal representation is silly, because there is. However, there's
nothing you can say about that mapping.


I'm not the one labeling anything as being silly.  I'm the one labeling
the things as bullshit, and that's what you're doing here.  I can in
fact say what the internal byte string representation of strings is any
given build of Python 3.  Just because I can't say what it would be in
an imaginary hypothetical implementation doesn't mean I can never say
anything about it.

Ross Ridge



Bytes is bytes and strings is strings
And the wrong one I have chose
Let's go where they keep on wearin'
Those frills and flowers and buttons and bows
Rings and things and buttons and bows.

No guessing the tune.

--
Cheers.

Mark Lawrence.

--
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Neil Cerutti
On 2012-03-28, Ross Ridge  wrote:
> Evan Driscoll   wrote:
>> So yes, you can say that pretending there's not a mapping of
>> strings to internal representation is silly, because there is.
>> However, there's nothing you can say about that mapping.
>
> I'm not the one labeling anything as being silly.  I'm the one
> labeling the things as bullshit, and that's what you're doing
> here.  I can in fact say what the internal byte string
> representation of strings is any given build of Python 3.  Just
> because I can't say what it would be in an imaginary
> hypothetical implementation doesn't mean I can never say
> anything about it.

I am in a similar situation vis-à-vis my wife's undergarments.

-- 
Neil Cerutti
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Terry Reedy

On 3/28/2012 1:43 PM, Peter Daum wrote:


The longer story of my question is: I am new to python (obviously), and
since I am not familiar with either one, I thought it would be advisory
to go for python 3.x.


I strongly agree with that unless you have reason to use 2.7. Python 3.3 
(.0a1 is nearly out) has an improved unicode implementation, among other 
things.


The biggest problem that I am facing is, that I
am often dealing with data, that is basically text, but it can contain
8-bit bytes. In this case, I can not safely assume any given encoding,
but I actually also don't need to know - for my purposes, it would be
perfectly good enough to deal with the ascii portions and keep anything
else unchanged.


You are assuming, or must assume, that the text is in an 
ascii-compatible encoding, meaning that bytes 0-127 really represent 
ascii chars. Otherwise, you cannot reliably interpret anything, let 
alone change it.


This problem of knowing that much but not the specific encoding is 
unfortunately common. It has been discussed among core developers and 
others the last few months. Different people prefer one of the following 
approaches.


1. Keep the bytes as bytes and use bytes literals and bytes functions as 
needed. The danger, as you noticed, is forgetting the 'b' prefix.


2. Decode as if the text were latin-1 and ignore the non-ascii 'latin-1' 
chars. When done, encode back to 'latin-1' and the non-ascii chars will 
be as they originally were. The danger is forgetting the pretense, and 
perhaps passing the string on (as a string, not bytes) to other 
modules that will not know the pretense.


3. Decode using encoding='ascii', errors='surrogateescape'. This 
reversibly encodes the unknown non-ascii chars as 'illegal' non-chars 
(using the surrogate-pair second-half code units). This is probably the 
safest in that invalid operations on the non-chars should raise an 
exception. Re-encoding with the same setting will reproduce the original 
hi-bit chars. The main danger is passing the illegal strings out of your 
local sandbox.
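Approaches 2 and 3 from the list above can be sketched side by side. The sample bytes and the key/name replacement are invented for illustration; the `surrogateescape` error handler is the standard one from PEP 383.

```python
# Mostly ASCII, one byte >= 0x80, real encoding unknown.
data = b"key=v\xe4lue\n"

# Approach 2: pretend the data is latin-1. Every byte maps to exactly
# one code point, so encoding back with latin-1 restores the input.
text = data.decode("latin-1")
text = text.replace("key", "name")          # touch only the ASCII part
via_latin1 = text.encode("latin-1")

# Approach 3: decode as ASCII and smuggle unknown bytes through as
# lone surrogates; encoding with the same handler restores them.
text = data.decode("ascii", errors="surrogateescape")
escaped_char = text[5]                      # the stand-in for byte 0xE4
text = text.replace("key", "name")
via_escape = text.encode("ascii", errors="surrogateescape")

print(via_latin1, via_escape, repr(escaped_char))
```

Both routes give back the original high byte. The difference is that the latin-1 string looks like ordinary text, while the escaped string fails loudly if it is encoded without the handler.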


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Need Help Using list items as output table names in MsACCESS

2012-03-28 Thread Cathy James
Dear Python folks,
I need your help on using list items as output table names in
MsACCESS-new to Python- simple would be better:

import arcpy, os
outSpace = "c:\\data\\Info_Database.mdb\\"
arcpy.overwriteOutput = True
SDE = "Database Connections\\SDE_ReadOnly.sde\\"

inFcList = [(SDE + "sde.GIS.Parcel"),
            (SDE + "sde.GIS.Residence"),
            (SDE + "sde.GIS.Park"),
            (SDE + "sde.GIS.Field"),
            (SDE + "sde.GIS.Business"),
            (SDE + "sde.GIS.Facility"),
            (SDE + "sde.GIS.Tertiary"),
            (SDE + "sde.GIS.KiddieClub")]

# I'd like to create output tables in the MDB whose names correspond to
# input names such that
# "sde.GIS.Parcel" becomes "sde.GIS.Parcel_Buffer_500"
for fc in inFcList:
    arcpy.overwriteOutput = True
    arcpy.Buffer_analysis(fc, (outSpace + fc + "_Buffer_500"), "500 Feet",
                          "FULL", "ROUND", "ALL", "")
print "Finished"
# Thanks in advance
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Advise of programming one of my first programs

2012-03-28 Thread Anatoli Hristov
> You are correct it is not. :) Your code is overly complex making it harder
> to understand. Try and reduce the problem to the least number of tasks you
> need.
> From the Zen of Python, "Simple is better than complex." It is a good
> programming mentality.


Complex is better than complicated. :p


> 2. A system to navigate your program.
> def mmenu():
>     # load tbook here
>     while True:
>         choicem = get_menu_choice()
>         if choicem == 'e' or choicem == 'E':
>             book = get_book_name()
>             edit( tbook, book )
>         elif choicem == 'd' or choicem == 'D':
>             book = get_book_name()
>             details( tbook, book )
>         elif choicem =='Q' or choicem == 'q':
>             break # end loop to exit program
>         else:
>             print 'Selection {0} not understood.'.format( choicem )
>
> I have given you more functions

With the main navigation menu I will only have the option to select a
nickname and when a nickname is selected then it loads Details of the
contact and from loaded details I can choice Edit or back to main screen,
like I did it the first time, or else I can do it => when 'e' pressed to
ask for a nickname and then edit it.


> 3. function to write an edited book
> def write_book(tbook):
>     write_book = open('c:/Python27/Toli/myfile.txt', 'w')
>     write_book.write(str(tbook))
>     # I think repr would be more accurate than str here.
>
I'm not that far along, as you already know :)  I hope I will learn everything
as soon as I can, as I'm not able to read Python every day.
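On the quoted remark that repr would be more accurate than str: a minimal sketch of the round trip, assuming the book is a plain dict of strings and lists. The `tbook` contents here are invented, and `ast.literal_eval` is a swapped-in way of reading the text back that the thread does not mention.

```python
import ast

# Hypothetical phone book: nickname -> [first name, last name, phone].
tbook = {"toli": ["Anatoli", "Hristov", "555-0100"]}

# repr() produces text that is valid Python source for simple containers.
saved = repr(tbook)
print(saved)

# literal_eval() parses literals only; unlike eval() it will not run code.
restored = ast.literal_eval(saved)
print(restored == tbook)
```

In the real program the `saved` string would be what gets written to the file, and the file contents would be what `literal_eval` receives.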


> I do not think you need any other functions. Now you just need to finish
> all the functions
> and put it all together.
>
I will finish it; I just need to understand functions and complex
arguments better.


Thanks

Anatoli
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: RE: Advise of programming one of my first programs

2012-03-28 Thread Anatoli Hristov
> > > Um, at least by my understanding, the use of Pickle is also dangerous if
> > > you are not completely sure what is being passed in:
> >
> > Oh goodness yes. pickle is exactly as unsafe as eval is. Try running this
> > code:
> >
> > from pickle import loads
> > loads("c__builtin__\neval\n(c__builtin__\nraw_input\n(S'py>'\ntRtR.")
>
> It might be as dangerous, but which is more likely to cause problems in
> real world scenarios?


Guys this is really something  that is not that important at this time for
me
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Number of languages known [was Re: Python is readable] - somewhat OT

2012-03-28 Thread Tim Delaney
On 25 March 2012 11:03, Tim Chase  wrote:

> On 03/24/12 17:08, Tim Delaney wrote:
>
>> Absolutely. 10 years ago (when I was just a young lad) I'd say that I'd
>> *forgotten* at least 20 programming languages. That number has only
>> increased.
>>
>
> And in the case of COBOL for me, it wasn't just forgotten, but actively
> repressed ;-)
>

2 weeks on work experience in year 10 (16 years old) was enough for me.
Although I did have a functional book catalogue program by the end of it.
Apparently the feedback was that if I'd wanted a job there I could have had
one ...

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Steven D'Aprano
On Wed, 28 Mar 2012 15:43:31 -0400, Ross Ridge wrote:

> I can in
> fact say what the internal byte string representation of strings is any
> given build of Python 3.

Don't keep us in suspense! Given:

Python 3.2.2 (default, Mar  4 2012, 10:50:33)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-51)] on linux2

what *is* the internal byte representation of the string "a∫©πz"?

(lowercase a, integral sign, copyright symbol, lowercase Greek pi, 
lowercase z)


And more importantly, given that internal byte representation, what could 
you do with it?
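What can be stated with certainty is what the string looks like under an explicitly chosen encoding. A sketch for the five-character string in the question, using three common codecs (the choice of codecs is mine):

```python
# a, integral sign, copyright sign, Greek small pi, z
s = "a\u222b\u00a9\u03c0z"

codecs_to_try = ("utf-8", "utf-16-le", "utf-32-le")
lengths = {codec: len(s.encode(codec)) for codec in codecs_to_try}

for codec in codecs_to_try:
    # Same five code points, different bytes for every codec.
    print(codec, lengths[codec], s.encode(codec).hex())
```

None of these is necessarily what the interpreter holds in memory; they are external representations produced on request.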


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: RE: Advise of programming one of my first programs

2012-03-28 Thread Chris Angelico
On Thu, Mar 29, 2012 at 9:36 AM, Anatoli Hristov  wrote:
>> > > Um, at least by my understanding, the use of Pickle is also dangerous
>> > > if you are not completely sure what is being passed in:
>> >
>> > Oh goodness yes. pickle is exactly as unsafe as eval is. Try running
>> > this code:
>> >
>> > from pickle import loads
>> > loads("c__builtin__\neval\n(c__builtin__\nraw_input\n(S'py>'\ntRtR.")
>>
>> It might be as dangerous, but which is more likely to cause problems in
>> real world scenarios?
>
> Guys this is really something  that is not that important at this time for
> me

Maybe not, but it's still worth being aware of. Even if today your
strings will never include apostrophes, it's still important to
understand the risks of SQL injection and properly escape them before
inserting them into an SQL statement. Just docket the information in
the back of your mind "Don't use pickle with untrusted data" and move
on. :)
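For the SQL side of that advice, a sketch using the standard `sqlite3` module with parameter binding, which is a swapped-in alternative to escaping by hand. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (nickname TEXT, phone TEXT)")

# An apostrophe in the data would break a statement built by string
# formatting; with "?" placeholders the driver handles it.
nickname = "O'Brien"
conn.execute("INSERT INTO contacts VALUES (?, ?)", (nickname, "555-0100"))

rows = conn.execute(
    "SELECT nickname, phone FROM contacts WHERE nickname = ?", (nickname,)
).fetchall()
print(rows)
conn.close()
```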

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Evan Driscoll
On 3/28/2012 14:43, Ross Ridge wrote:
> Evan Driscoll   wrote:
>> So yes, you can say that pretending there's not a mapping of strings to 
>> internal representation is silly, because there is. However, there's 
>> nothing you can say about that mapping.
> 
> I'm not the one labeling anything as being silly.  I'm the one labeling
> the things as bullshit, and that's what you're doing here.  I can in
> fact say what the internal byte string representation of strings is any
> given build of Python 3.  Just because I can't say what it would be in
> an imaginary hypothetical implementation doesn't mean I can never say
> anything about it.

People like you -- who write to assumptions which are not even remotely
guaranteed by the spec -- are part of the reason software sucks.

People like you hold back progress, because system implementers aren't
free to make changes without breaking backwards compatibility. Enormous
amounts of effort are expended to test programs and diagnose problems
which are caused by unwarranted assumptions like "the encoding of a
string is UTF-8". In the worst case, assumptions like that lead to
security fixes that don't go as far as they could, like the recent
discussion about hashing.

Python is definitely closer to the "willing to break backwards
compatibility to improve" end of the spectrum than some other projects
(*cough* Windows *cough*), but that still doesn't mean that you can make
assumptions like that.


This email is a bit harsher than it deserves -- but I feel not by much.

Evan
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Number of languages known [was Re: Python is readable] - somewhat OT

2012-03-28 Thread Rodrick Brown
At my current firm we hire people who are proficient in one of the following 
languages and familiar with another: C#, Java, C++, Perl, Python or Ruby.

We then expect developers to quickly pick up any of the languages we use in 
house, and the range is very broad. In our source repository, not including the 
languages I've already listed above, I've seen Fortran, Erlang, Groovy, HTML, 
CSS, JavaScript, MATLAB, C, K, R, S, Q, Excel, PHP, Bash, Ksh, PowerShell 
and CUDA.

We do heavy computational and statistical analysis type work, so developers need 
to be able to use a vast array of programming tools to tackle the various 
workloads we're faced with on a daily basis. 

The best skill any developer can have is the ability to pick up languages very 
quickly and know what tools work well for which task.

On Mar 22, 2012, at 3:14 PM, Chris Angelico  wrote:

> On Fri, Mar 23, 2012 at 4:44 AM, Steven D'Aprano
>  wrote:
>> The typical developer knows three, maybe four languages
>> moderately well, if you include SQL and regexes as languages, and might
>> have a nodding acquaintance with one or two more.
> 
> I'm not entirely sure what you mean by "moderately well", nor
> "languages", but I'm of the opinion that a good developer should be
> able to learn a new language very efficiently. Do you count Python 2
> and 3 as the same language? What about all the versions of the C
> standard?
> 
> In any case, though, I agree that there's a lot of people
> professionally writing code who would know about the 3-4 that you say.
> I'm just not sure that they're any good at coding, even in those few
> languages. All the best people I've ever known have had experience
> with quite a lot of languages.
> 
> ChrisA
> -- 
> http://mail.python.org/mailman/listinfo/python-list
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Number of languages known [was Re: Python is readable] - somewhat OT

2012-03-28 Thread Chris Angelico
On Thu, Mar 29, 2012 at 11:59 AM, Rodrick Brown  wrote:
> The best skill any developer can have is the ability to pickup languages very 
> quickly and know what tools work well for which task.

Definitely. Not just languages but all tools. The larger your toolkit
and the better you know it, the more easily you'll be able to grasp
the tool you need.

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unittest: assertRaises() with an instance instead of a type

2012-03-28 Thread Ben Finney
Steven D'Aprano  writes:

> (By the way, I have to question the design of an exception with error 
> codes. That seems pretty poor design to me. Normally the exception *type* 
> acts as equivalent to an error code.)

Have a look at Python's built-in OSError. The various errors from the
operating system can only be distinguished by the numeric code the OS
returns, so that's what to test on in one's unit tests.
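A sketch of testing on the numeric code, using `assertRaises` as a context manager so the caught exception instance can be inspected. The test class name and the missing path are made up for illustration.

```python
import errno
import unittest


class OpenMissingFileTest(unittest.TestCase):
    def test_errno_is_enoent(self):
        # assertRaises checks the type; the context object exposes
        # the instance so its errno can be checked as well.
        with self.assertRaises(OSError) as caught:
            open("/no/such/directory/file.txt")
        self.assertEqual(caught.exception.errno, errno.ENOENT)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(OpenMissingFileTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```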

-- 
 \  “In the long run, the utility of all non-Free software |
  `\  approaches zero. All non-Free software is a dead end.” —Mark |
_o__)Pilgrim, 2006 |
Ben Finney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Evan Driscoll   wrote:
>People like you -- who write to assumptions which are not even remotely
>guaranteed by the spec -- are part of the reason software sucks.
...
>This email is a bit harsher than it deserves -- but I feel not by much.

I don't see how you could feel the least bit justified.  Well meaning,
if unhelpful, lies about the nature of Python strings in order to try to
convince someone to follow what you think are good programming practices
is one thing.  Maliciously lying about someone else's code that you've
never seen is another thing entirely.

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  rri...@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Chris Angelico
On Thu, Mar 29, 2012 at 2:04 PM, Ross Ridge  wrote:
> Evan Driscoll   wrote:
>>People like you -- who write to assumptions which are not even remotely
>>guaranteed by the spec -- are part of the reason software sucks.
> ...
>>This email is a bit harsher than it deserves -- but I feel not by much.
>
> I don't see how you could feel the least bit justified.  Well meaning,
> if unhelpful, lies about the nature of Python strings in order to try to
> convince someone to follow what you think are good programming practices
> is one thing.  Maliciously lying about someone else's code that you've
> never seen is another thing entirely.

Actually, he is justified. It's one thing to work in C or assembly and
write code that depends on certain bit-pattern representations of data
(although even that causes trouble - assuming that
sizeof(int)==sizeof(int*) isn't good for portability), but in a high
level language, you cannot assume any correlation between objects and
bytes. Any code that depends on implementation details is risky.

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Re: Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Ross Ridge
Chris Angelico   wrote:
>Actually, he is justified. It's one thing to work in C or assembly and
>write code that depends on certain bit-pattern representations of data
>(although even that causes trouble - assuming that
>sizeof(int)==sizeof(int*) isn't good for portability), but in a high
>level language, you cannot assume any correlation between objects and
>bytes. Any code that depends on implementation details is risky.

How does that in any way justify Evan Driscoll maliciously lying about
code he's never seen?

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  rri...@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   
-- 
http://mail.python.org/mailman/listinfo/python-list


CFG for python

2012-03-28 Thread J. Mwebaze
Does anyone know how to create a control-flow graph for Python? After searching
around, I found this article,
http://www.python.org/dev/peps/pep-0339/#ast-to-cfg-to-bytecode and also a
reference to http://doc.pypy.org/en/latest/objspace.html#the-flow-model

However, I still can't figure out how to create the CFG from the
two references.
Regards
-- 
*Mob UG: +256 (0) 70 1735800 | NL +31 (0) 6 852 841 38 | Gtalk: jmwebaze |
skype: mwebazej | URL: www.astro.rug.nl/~jmwebaze

/* Life runs on code */*
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Mark Lawrence

On 29/03/2012 04:58, Ross Ridge wrote:

> Chris Angelico wrote:
>
>> Actually, he is justified. It's one thing to work in C or assembly and
>> write code that depends on certain bit-pattern representations of data
>> (although even that causes trouble - assuming that
>> sizeof(int)==sizeof(int*) isn't good for portability), but in a high
>> level language, you cannot assume any correlation between objects and
>> bytes. Any code that depends on implementation details is risky.
>
> How does that in any way justify Evan Driscoll maliciously lying about
> code he's never seen?
>
> Ross Ridge



We appear to have a case of "would you stand up please, your voice is 
rather muffled".  I can hear all the *plonks* from miles away.


--
Cheers.

Mark Lawrence.

--
http://mail.python.org/mailman/listinfo/python-list


Re: unittest: assertRaises() with an instance instead of a type

2012-03-28 Thread Steven D'Aprano
On Thu, 29 Mar 2012 12:55:13 +1100, Ben Finney wrote:

> Steven D'Aprano  writes:
> 
>> (By the way, I have to question the design of an exception with error
>> codes. That seems pretty poor design to me. Normally the exception
>> *type* acts as equivalent to an error code.)
> 
> Have a look at Python's built-in OSError. The various errors from the
> operating system can only be distinguished by the numeric code the OS
> returns, so that's what to test on in one's unit tests.

I'm familiar with OSError. It is necessary because OSError is a high-level
interface to low-level C errors. I wouldn't call it a good design, though;
I certainly wouldn't choose it if we were developing an error system from
scratch and weren't constrained by compatibility with a more primitive
error model (error codes instead of exceptions).

The new, revamped exception hierarchy in Python 3.3 will rationalise much 
(but not all) of this, unifying IOError and OSError and making error 
codes much less relevant:


http://www.python.org/dev/peps/pep-3151/
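
As a rough illustration of the difference, here is a sketch that assumes
Python 3.3 or later with PEP 3151 in place. The test class name and the file
name are made up for the example. The first test checks the numeric code, as
Ben describes; the second relies on the exception type alone:

```python
import errno
import os
import tempfile
import unittest


class MissingFileTest(unittest.TestCase):
    def setUp(self):
        # A name inside a fresh, empty temporary directory cannot exist yet.
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.missing = os.path.join(self._tmp.name, "no-such-file")

    def test_by_error_code(self):
        # Error-code style: catch the broad type, then inspect errno.
        with self.assertRaises(OSError) as ctx:
            open(self.missing)
        self.assertEqual(ctx.exception.errno, errno.ENOENT)

    def test_by_exception_type(self):
        # PEP 3151 style: the subclass itself identifies the error.
        with self.assertRaises(FileNotFoundError):
            open(self.missing)
```

Both tests exercise the same failure; the second simply doesn't need to know
the error number.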



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "convert" string to bytes without changing data (encoding)

2012-03-28 Thread Steven D'Aprano
On Wed, 28 Mar 2012 23:58:53 -0400, Ross Ridge wrote:

> How does that in any way justify Evan Driscoll maliciously lying about
> code he's never seen?

You are perfectly justified to complain about Evan making sweeping 
generalisations about your code when he has not seen it; you are NOT 
justified in making your own sweeping generalisations that he is not just 
lying but *maliciously* lying. He might be just confused by the strength 
of his emotions and so making an honest mistake. Or he might have guessed 
perfectly accurately about your code, and you are the one being 
dishonest. Who knows?

Evan's impassioned rant is based on his estimate of your mindset, namely 
that you are the sort of developer who writes code making assumptions 
about implementation details even when explicitly told not to by the 
library authors. I have no idea whether Evan's estimate is right or not, 
but I don't think it is justified based on the little we've seen 
of you.

Your reaction is to make an equally unjustified estimate of Evan's 
mindset, namely that he is not just wrong about you, but *deliberately 
and maliciously* lying about you in the full knowledge that he is wrong. 
If anything, I would say that you have less justification for calling 
Evan a malicious liar than he has for calling you the sort of person who 
would write to an implementation instead of an interface.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list