Re: base64.b64encode(data)

2016-06-12 Thread Marko Rauhamaa
Random832 :
> base64 characters are *characters*, not *bytes*

Ok, I ran into the same surprise just two days ago. But is this a big
deal?


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: base64.b64encode(data)

2016-06-12 Thread Random832
On Mon, Jun 13, 2016, at 01:16, Steven D'Aprano wrote:
> Suppose instead it returned the Unicode string 'AUERFg=='. That's all
> well and good, but what are you going to do with it? You can't
> transmit it over a serial cable, because that almost surely is going
> to expect bytes, so you have to encode it. You can't embed it in an
> email, because that also expects bytes.

Unless you're using a library that expects to receive strings and encode
them itself. Such as, in the example you so helpfully provide, a file
opened in text mode.
 
> You could write it to a file. If the file is opened in binary mode,
> you have to encode the Unicode string to bytes before you can write
> it. If the file is opened in text mode, Python will accept your
> Unicode string and encode it for you, which could introduce non-
> base64 characters into the file. Consider if the file was opened
> using UTF-16:
> 
> \x00A\x00U\x00E\x00R\x00F\x00g\x00=\x00=
> 
> hardly counts as base64 in any meaningful sense.

Why do you say these things like you assume I will agree with them. It
doesn't, in fact, introduce non-base64 characters because base64
characters are *characters*, not *bytes* and UTF-16 (or EBCDIC or
whatever) is a perfectly valid encoding of those *characters*, and the
recipient will, naturally, open that file in text mode in the same
encoding, and receive the same string, which it can then decode as
base64.
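
A minimal sketch of that round trip, assuming both sides agree on UTF-16 and using a temporary file as a stand-in for the shared file (the payload and variable names are only illustrative):

```python
import base64
import os
import tempfile

payload = b"\x01A\x11\x16"
text = base64.b64encode(payload).decode("ascii")  # the characters AUERFg==

# Temporary file standing in for whatever file the two sides share.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    # Sender: text mode, UTF-16. Python encodes the characters on write.
    with open(path, "w", encoding="utf-16") as f:
        f.write(text)

    # Recipient: text mode, same encoding. The same characters come back.
    with open(path, "r", encoding="utf-16") as f:
        received = f.read()
finally:
    os.remove(path)

recovered = base64.b64decode(received)
print(received == text, recovered == payload)
```

The bytes on disk are not ASCII, but the characters survive, and b64decode accepts them as a str.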


Re: base64.b64encode(data)

2016-06-12 Thread Steven D'Aprano
On Mon, 13 Jun 2016 01:20 pm, Random832 wrote:

> On Sun, Jun 12, 2016, at 22:22, Steven D'Aprano wrote:
>> That's because base64 is a bytes-to-bytes transformation. It has
>> nothing to do with unicode encodings.
> 
> Nonsense. base64 is a binary-to-text encoding scheme. The output range
> is specifically chosen to be safe to transmit in text protocols.

"Safe to transmit in text protocols" surely should mean "any Unicode code
point", since all of Unicode is text. What's so special about the base64
ones?

Well, that depends on your context. For somebody who cares about sending
bits over a physical wire, their idea of "text" is not Unicode, but a
subset of ASCII *bytes*.

The end result is that after you've base64ed your "binary" data, to
get "text" data, what are you going to do with it? Treat it as Unicode code
points? Probably not. Squirt it down a wire as bytes? Almost certainly.
Looking at this from the high-level perspective of Python, that makes it
conceptually bytes not text.

Yes, I know that there's a terminology clash between communication engineers
and the programmers who work in their world, and the rest of us. We
use "text" to mean Unicode[1], they use "text" to mean "roughly 100 of the
128 bytes with the high-bit cleared, interpreted as ASCII".

But those folks are unlikely to be asking why base64 encoding a bunch of
bytes returns bytes. They *want* it to return bytes, because that's what
they're going to squirt down the wire. If you gave them Unicode, encoded
using (say) UTF-16 or UTF-32, they're likely to say "WTF are you giving me
this binary data for? Look at all these NUL bytes, what am I supposed to do
with them?!?!". (If they could cope with arbitrary bytes, they wouldn't
have base64 encoded it.) And if you gave them UTF-8, well, how would anyone
know? With base64 encoded data, it's all a subset of ASCII.

Python defines a nice clean separation between text (Unicode) and binary
data (bytes). Under that model, base64 is a transformation between
unrestricted bytes 0...255 to a restricted subset of bytes that matches
some ASCII encoded text. It shouldn't return a Unicode string, because
that's an abstract text format and we can't make any assumptions about the
implementation. Say you base64 encode some binary data:

py> base64.b64encode(b'\x01A\x11\x16')
b'AUERFg=='

Suppose instead it returned the Unicode string 'AUERFg=='. That's all well
and good, but what are you going to do with it? You can't transmit it over
a serial cable, because that almost surely is going to expect bytes, so you
have to encode it. You can't embed it in an email, because that also
expects bytes.

You could write it to a file. If the file is opened in binary mode, you have
to encode the Unicode string to bytes before you can write it. If the file
is opened in text mode, Python will accept your Unicode string and encode
it for you, which could introduce non-base64 characters into the file.
Consider if the file was opened using UTF-16:

\x00A\x00U\x00E\x00R\x00F\x00g\x00=\x00=

hardly counts as base64 in any meaningful sense.
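
The byte sequence above can be reproduced with a short sketch. It assumes big-endian UTF-16 without a byte order mark, which is what the listing shows; the variable names are only illustrative:

```python
import base64

encoded = base64.b64encode(b"\x01A\x11\x16")  # bytes, as b64encode returns them
as_text = encoded.decode("ascii")             # the hypothetical str result
utf16 = as_text.encode("utf-16-be")           # what a UTF-16 text file would hold

print(encoded)
print(utf16)
print(len(encoded), len(utf16))
```

Every base64 character becomes two bytes, one of them NUL, so the file contents are no longer restricted to the base64 byte range.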

So while I completely accept your comment about "text protocols" in the
context of the networking world, we're not in the networking world. We're
in the high-level programming language world of Python, where text does not
mean a subset of ASCII bytes, it means Unicode. And in *our* world, having
base64 return text is a mistake.






[1] Or at least we should, since the idea that only American English[2]
counts as text cannot possibly survive in the 21st Century when we're
connected to the entire world of different languages. Although I'd allow
TRON as well, if you can actually find any TRON users outside of Japan.[3]

[2] And only a subset of American English at that.

[3] Or inside Japan for that matter.

-- 
Steven



Re: how to search item in list of list

2016-06-12 Thread Jussi Piitulainen
meInvent bbird writes:

> once a nested list have a word "node" then true else false
>
> def search(current_item):
>     if isinstance(current_item, list):
>         if len(current_item)==4:
>             if [item for item in current_item if item[4] == "node"] != []:
>                 return True
>         if True in [search(item) for item in current_item]:
>             return True
>         else:
>             return False
>
> search(mresult)
>
> but it return false

Your mresult is not really nested in the way you are trying to treat it.
It looks like a list of lists of length 1, always containing a tuple of
length 5, where the last element is a string that may be 'node'. You
could just test each top-level item for item[0][4] == 'node'.

(It might help to make your case analysis more explicit. Now there's an
implicit return if current_item is not a list, and a fall-through to
another test if the length of current_item is 4. The structure of the
function should match the structure of the data.)
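
A minimal sketch of that top-level test, using a cut-down stand-in for mresult with the same shape (the name has_node and the sample data are made up for illustration):

```python
# Cut-down stand-in for mresult: a list of one-element lists, each
# holding a 5-tuple whose last field is either 'start' or 'node'.
sample = [
    [(2, {"00": 0, "11": 1}, ["000", "111"], "xy", "start")],
    [(1, {"00": 0, "11": 1}, ["00", "11"], "xy", "node")],
]


def has_node(result):
    """Return True if any top-level item's tuple is tagged 'node'."""
    return any(item[0][4] == "node" for item in result)


print(has_node(sample))      # one entry is tagged 'node'
print(has_node(sample[:1]))  # only the 'start' entry remains
```

No recursion is needed because the data has a fixed depth; any() also stops at the first match.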

Below is the output of pprint.pprint(mresult). Much easier to read (also
less likely to break in transit - I had to fix an unfortunate line break
to make the original valid Python at all).

[[(2,
   {'00': 0, '01': 1, '10': 1, '11': 1},
   ['000', '001', '010', '011', '100', '101', '110', '111'],
   'xy',
   'start')],
 [(2,
   {'00': 0, '01': 1, '10': 1, '11': 1},
   ['000', '001', '010', '011', '100', '101', '110', '111'],
   'yz',
   'start')],
 [(2,
   {'00': 0, '01': 1, '10': 1, '11': 1},
   ['000', '001', '010', '011', '100', '101 ', '110', '111'],
   'xz',
   'start')],
 [(2,
   {'00': 0, '01': 0, '10': 0, '11': 1},
   ['000', '001', '010', '011', '100', '101', '110', '111'],
   'xy',
   'start')],
 [(2,
   {' 11': 1, '00': 0, '01': 0, '10': 0},
   ['000', '001', '010', '011', '100', '101', ' 110', '111'],
   'yz',
   'start')],
 [(2,
   {'00': 0, '01': 0, '10': 0, '11': 1},
   ['000', '001', '010', '011', '100', '101', '110', '111'],
   'xz',
   'start')],
 [(2,
   {'00': 1, '01': 1, '10': 0, '11': 1},
   ['000', '001', '010', '011', '100', '101', '110', '111'],
   'xy',
   'start')],
 [(2,
   {'00': 1, '01': 1, '10': 0, '11': 1},
   ['000', '0 01', '010', '011', '100', '101', '110', '111'],
   'yz',
   'start')],
 [(2,
   {'00': 1, '01': 1, '10': 0, '11': 1},
   ['000', '001', '010', '011', '100', '101', '110', '1 11'],
   'xz',
   'start')],
 [(1,
   {'00': 0, '01': 1, '10': 1, '11': 1},
   ['00', '01', ' 11', '11', '10', '11', '11', '11'],
   'xy',
   'node')],
 [(1,
   {'00': 0, '01': 1, '10': 1, '11': 1},
   ['00', '01', '10', '11', '11', '11', '11', '11'],
   'xy',
   'node')],
 [(1,
   {'00': 0, '01': 1, '10': 1, '11': 1},
   ['00', '00', '10', '10', '10', '10', '11', '11'],
   'xy',
   'node')],
 [(1,
   {'00': 0, '01': 1, '10': 1, '11': 1},
   ['00', '00', '10', '11', '10', '10', '10', '11'],
   'xy',
   'node')],
 [(1,
   {'00': 0, '01': 1, '10': 1, '11': 1},
   ['00', '00', '10', '10', '10', '11', '10', '11'],
   'xy',
   'node')]]


Overriding methods inherited from a superclass with methods from a mixin

2016-06-12 Thread alanqueiros
Hello there.

I'm trying to override methods inherited from a superclass by methods defined 
in a mixin class.
Here's an sscce:
https://bpaste.net/show/6c7d8d590658 (never expires)

I've had problems finding the proper way to do that, since at first the base 
class wasn't to the right and I've assumed the correct order was from left to 
right. It was previously suggested on IRC that the mixin should be a subclass 
of "Base"; that worked but I wasn't happy with it because the B class basically 
serves the purpose of "holding" a list of methods to be inherited and that 
should override the methods inherited from the "A" classes, so I didn't see why 
it should derive from "Base".

I eventually found in an article that the problem was the ordering of the 
superclasses I was deriving from, which should be from right to left, the only 
article I could find that states that is this one: 
https://www.ianlewis.org/en/mixins-and-python

Every single example of mixins in Python that I've read -except that one- (and 
I've seen literally dozens) has the base class to the left, although the other 
classes aren't overriding any methods (at least in most of them).

That bpaste code is working perfectly for me and makes sense, but I don't 
really like it, and the people on IRC couldn't convince me the code is fine.

I haven't used Python for some time so I don't feel confident to judge that 
code, and perhaps there's a better way to achieve that result. However, what 
really scared me is the obscurity of the mixins usage in Python, and the fact 
that every example except that single one gets it "wrong", including from 
notable pythonistas.

Perhaps you guys could help me either convincing me that the bpaste code is OK, 
or perhaps coming up with a better solution for that problem. What matters to 
me is code re-usability in this case. I surely could re-implement the overrides 
in all "Z" classes separately, but that's what I'm trying to avoid. The 
requirements are:
1. I can't touch the "classes I don't have control over" (as per comment in 
code).
2. I don't want to pass the superclasses as parameters in the constructor. I 
see how you could solve the problem that way, but that would only increase the 
complexity of the code (PEP20).
3. I absolutely need to override methodX, I can't use composition and access 
the members another way unless I override methodX and access them there. This 
is to interface properly with other modules.
4. I need to be able to access A#.methodX in the "Z" classes methods.
5. I want to avoid using a factory. It's one of the most over-used patterns in 
my opinion, and I really don't like that.

Please note that the bpaste code is just an example. The real problem is much 
bigger and involves multiple methods to override and more classes, so the 
solution has to scale accordingly.

Thank you in advance.


Re: for / while else doesn't make sense

2016-06-12 Thread Rustom Mody
On Monday, June 13, 2016 at 7:42:25 AM UTC+5:30, Steven D'Aprano wrote:
> On Mon, 13 Jun 2016 04:44 am, Michael Selik wrote:
> 
> > On Sun, Jun 12, 2016 at 6:11 AM Steven D'Aprano  wrote:
> > 
> >> - run the for block
> >> - THEN unconditionally run the "else" block
> >>
> > 
> > Saying "unconditionally" is a bit misleading here. As you say, it's
> > conditioned on completing the loop without break/return/raise.
> 
> It's also conditional on the OS not killing the Python process, conditional
> on the CPU not catching fire, conditional on the user not turning the power
> off, and conditional on the sun not exploding and disintegrating the entire
> earth.
> 
> In the absence of any event which interferes with the normal execution of
> code by the Python VM, and in the absence of one of a very few
> explicit "JUMP" statements which explicitly jump out of the compound
> for...else statement, the else clause is unconditionally executed after the
> for clause.
> 
> Happy now?

Wholesale business of strawmen doing brisk business?



[RELEASE] Python 2.7.12 release candidate 1

2016-06-12 Thread Benjamin Peterson
Python 2.7.12 release candidate 1 is now available for download. This is
a preview release of the next bugfix release in the Python 2.7.x series.
Assuming no horrible regressions are located, a final release will
follow in two weeks.

Downloads for 2.7.12rc1 can be found on python.org:
https://www.python.org/downloads/release/python-2712rc1/

The complete changelog may be viewed at
https://hg.python.org/cpython/raw-file/v2.7.12rc1/Misc/NEWS

Please test the pre-release and report any bugs to
   https://bugs.python.org

Servus,
Benjamin


how to search item in list of list

2016-06-12 Thread meInvent bbird
I want True as soon as a nested list contains the word "node", otherwise False.

def search(current_item):
    if isinstance(current_item, list):
        if len(current_item)==4:
            if [item for item in current_item if item[4] == "node"] != []:
                return True
        if True in [search(item) for item in current_item]:
            return True
        else:
            return False

search(mresult)

but it returns False.


mresult = [[(2, {'11': 1, '10': 1, '00': 0, '01': 1}, ['000', '001', '010', 
'011', '100',
'101', '110', '111'], 'xy', 'start')], [(2, {'11': 1, '10': 1, '00': 0, '01': 1}
, ['000', '001', '010', '011', '100', '101', '110', '111'], 'yz', 'start')], [(2
, {'11': 1, '10': 1, '00': 0, '01': 1}, ['000', '001', '010', '011', '100', '101
', '110', '111'], 'xz', 'start')], [(2, {'11': 1, '10': 0, '00': 0, '01': 0}, ['
000', '001', '010', '011', '100', '101', '110', '111'], 'xy', 'start')], [(2, {'
11': 1, '10': 0, '00': 0, '01': 0}, ['000', '001', '010', '011', '100', '101', '
110', '111'], 'yz', 'start')], [(2, {'11': 1, '10': 0, '00': 0, '01': 0}, ['000'
, '001', '010', '011', '100', '101', '110', '111'], 'xz', 'start')], [(2, {'11':
 1, '10': 0, '00': 1, '01': 1}, ['000', '001', '010', '011', '100', '101', '110'
, '111'], 'xy', 'start')], [(2, {'11': 1, '10': 0, '00': 1, '01': 1}, ['000', '0
01', '010', '011', '100', '101', '110', '111'], 'yz', 'start')], [(2, {'11': 1,
'10': 0, '00': 1, '01': 1}, ['000', '001', '010', '011', '100', '101', '110', '1
11'], 'xz', 'start')], [(1, {'11': 1, '10': 1, '00': 0, '01': 1}, ['00', '01', '
11', '11', '10', '11', '11', '11'], 'xy', 'node')], [(1, {'11': 1, '10': 1, '00'
: 0, '01': 1}, ['00', '01', '10', '11', '11', '11', '11', '11'], 'xy', 'node')],
 [(1, {'11': 1, '10': 1, '00': 0, '01': 1}, ['00', '00', '10', '10', '10', '10',
 '11', '11'], 'xy', 'node')], [(1, {'11': 1, '10': 1, '00': 0, '01': 1}, ['00',
'00', '10', '11', '10', '10', '10', '11'], 'xy', 'node')], [(1, {'11': 1, '10':
1, '00': 0, '01': 1}, ['00', '00', '10', '10', '10', '11', '10', '11'], 'xy', 'n
ode')]]


Re: base64.b64encode(data)

2016-06-12 Thread Random832
On Sun, Jun 12, 2016, at 22:22, Steven D'Aprano wrote:
> That's because base64 is a bytes-to-bytes transformation. It has
> nothing to do with unicode encodings.

Nonsense. base64 is a binary-to-text encoding scheme. The output range
is specifically chosen to be safe to transmit in text protocols.

> > That is, the b64_encoded_data variable is of type 'bytes' and when
> > you peek inside it's a string (made up of what seems to be only
> > characters  that exist in Base 64).
>
> If you print or otherwise display bytes, for the convenience of human
> beings, those bytes are displayed as if they were ASCII. E.g. the byte
> 0x61 is displayed as 'a'. Good idea? Bad idea? I can see arguments
> either way, but that's how it is.

There's absolutely no rational basis for choosing "0x41-0x5A, 0x61-0x7A,
0x30-0x39, 0x2B, 0x2F" as the output range except for what characters
those values represent in ASCII. And if you needed to smuggle some
binary data through an EBCDIC system in the same manner, you would
naturally wish to convert it to the EBCDIC bytes corresponding to those
same characters.
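
A minimal sketch of the EBCDIC case, assuming Python's cp500 codec as the EBCDIC variant (the thread does not name one):

```python
import base64

text = base64.b64encode(b"\x01A\x11\x16").decode("ascii")

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp500")  # cp500 is one of Python's EBCDIC codecs

print(ascii_bytes != ebcdic_bytes)           # different bytes on the wire
print(ebcdic_bytes.decode("cp500") == text)  # same characters at the far end
print(base64.b64decode(ebcdic_bytes.decode("cp500")))
```

The byte values differ between the two encodings, yet the decoded characters and the recovered binary payload are identical.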


[RELEASED] Python 3.4.5rc1 and Python 3.5.2rc1 are now available

2016-06-12 Thread Larry Hastings


On behalf of the Python development community and the Python 3.4 and 
Python 3.5 release teams, I'm pleased to announce the availability of 
Python 3.4.5rc1 and Python 3.5.2rc1.


Python 3.4 is now in "security fixes only" mode.  This is the final 
stage of support for Python 3.4.  All changes made to Python 3.4 since 
Python 3.4.4 should be security fixes only; conventional bug fixes are 
not accepted.  Also, Python 3.4.5rc1 and all future releases of Python 
3.4 will only be released as source code--no official binary installers 
will be produced.


Python 3.5 is still in active "bug fix" mode.  Python 3.5.2rc1 contains 
many incremental improvements over Python 3.5.1.


Both these releases are "release candidates".  They should not be 
considered the final releases, although the final releases should 
contain only minor differences.  Python users are encouraged to test 
with these releases and report any problems they encounter.



You can find Python 3.4.5rc1 here:

   https://www.python.org/downloads/release/python-345rc1/

And you can find Python 3.5.2rc1 here:

   https://www.python.org/downloads/release/python-352rc1/ 




Python 3.4.5 final and Python 3.5.2 final are both scheduled for release 
on June 26th, 2016.


Happy Pythoneering,


//arry/


Re: base64.b64encode(data)

2016-06-12 Thread Steven D'Aprano
On Mon, 13 Jun 2016 04:56 am, Marcin Rak wrote:

> Hi to everyone.
> 
> Let's say I have some binary data, be it whatever, in the 'data' variable.
>  After calling the following line
> 
> b64_encoded_data = base64.b64encode(data)
> 
> my b64_encoded_data variables holds, would you believe it, a string as
> bytes!.

That's because base64 is a bytes-to-bytes transformation. It has nothing to
do with unicode encodings.

> That is, the b64_encoded_data variable is of type 'bytes' and when you
> peek inside it's a string (made up of what seems to be only characters
> that exist in Base 64).  

If you print or otherwise display bytes, for the convenience of human
beings, those bytes are displayed as if they were ASCII. E.g. the byte 0x61
is displayed as 'a'. Good idea? Bad idea? I can see arguments either way,
but that's how it is.

Naturally after base64 encoding some bytes, you will be left with only bytes
in base64. That's the whole point of it.


> Why isn't it a string yet?

*shrug* For backwards compatibility, probably, or for historical reasons, or
because the person who wrote the base64 module thought that this was the
most useful behaviour.

I can promise you that had he chosen the opposite behaviour, that it returns
a str instead of bytes, there would be people complaining "why do I have to
use encode('ascii') to get bytes?".


> In fact, I now on  
> that variable have to apply the decode('utf-8') method to get a string
> object holding the exact same sequence of characters as was held by
> b64_encoded_data bytes variable.

You could also use decode('ascii'), which is probably more "correct", as the
base64 data shouldn't include anything which isn't ASCII.
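
For example, a short sketch of going from the bytes result to a str and back (the sample data is arbitrary):

```python
import base64

data = b"\x01A\x11\x16"

b64_bytes = base64.b64encode(data)    # bytes, as the module returns them
b64_text = b64_bytes.decode("ascii")  # str, safe because base64 output is ASCII

print(type(b64_bytes).__name__, type(b64_text).__name__)
print(base64.b64decode(b64_text) == data)  # b64decode also accepts an ASCII str
```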



-- 
Steven



Re: for / while else doesn't make sense

2016-06-12 Thread Steven D'Aprano
On Mon, 13 Jun 2016 04:44 am, Michael Selik wrote:

> On Sun, Jun 12, 2016 at 6:11 AM Steven D'Aprano <
> steve+comp.lang.pyt...@pearwood.info> wrote:
> 
>> - run the for block
>> - THEN unconditionally run the "else" block
>>
> 
> Saying "unconditionally" is a bit misleading here. As you say, it's
> conditioned on completing the loop without break/return/raise.

It's also conditional on the OS not killing the Python process, conditional
on the CPU not catching fire, conditional on the user not turning the power
off, and conditional on the sun not exploding and disintegrating the entire
earth.

In the absence of any event which interferes with the normal execution of
code by the Python VM, and in the absence of one of a very few
explicit "JUMP" statements which explicitly jump out of the compound
for...else statement, the else clause is unconditionally executed after the
for clause.

Happy now?



-- 
Steven



yatel install error

2016-06-12 Thread meInvent bbird
sudo apt-get install python-dev libatlas-base-dev gfortran
pip install yatel


Cleaning up...
Command /usr/bin/python -c "import setuptools, 
tokenize;__file__='/tmp/pip_build_martin/yatel/setup.py';exec(compile(getattr(tokenize,
 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" 
install --record /tmp/pip-HwGsHy-record/install-record.txt 
--single-version-externally-managed --compile failed with error code 1 in 
/tmp/pip_build_martin/yatel
Traceback (most recent call last):
  File "/usr/bin/pip", line 9, in <module>
    load_entry_point('pip==1.5.4', 'console_scripts', 'pip')()
  File "/usr/lib/python2.7/dist-packages/pip/__init__.py", line 235, in main
    return command.main(cmd_args)
  File "/usr/lib/python2.7/dist-packages/pip/basecommand.py", line 161, in main
    text = '\n'.join(complete_log)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 72: ordinal not in range(128)


how to use yatel with redis?

2016-06-12 Thread meInvent bbird
my code is

https://gist.github.com/hoyeunglee/58df4c41a63a2f37e153cbdbc03c16bf


I would like itertools.combinations to work through redis, so that the hard
disk is used as memory instead of RAM (real memory).

def edge_gen(self):
    # we combine haplotypes by two
    for hap0, hap1 in itertools.combinations(self.haplotypes_cache.values(), 2):
        w = weight.weight("hamming", hap0, hap1)
        haps_id = hap0.hap_id, hap1.hap_id
        yield dom.Edge(w, haps_id)


Re: loading trees...

2016-06-12 Thread Michael Selik
On Sun, Jun 12, 2016 at 3:01 PM Fillmore 
wrote:

> What's my best way to achieve this?
>

What are your criteria for "best"?


> The idea is that I'll receive a bit of data, determine which tree is
> suitable for handling it, and dispatch the data to the right tree for
> further processing.
>

How do you determine which tree is suitable? Does it require knowledge of
the whole tree? How big are the trees? How long does pickle take to load
those trees? How frequently does data arrive? How long does it take you to
determine the tree? How long is acceptable?


Re: Indentation example?

2016-06-12 Thread Marc Dietz
On Sun, 12 Jun 2016 08:10:27 -0700 ICT Ezy wrote:

> Pl explain with an example the following phase "Indentation cannot be
> split over multiple physical lines using backslashes; the whitespace up
> to the first backslash determines the indentation" (in 2.1.8.
> Indentation of Tutorial.)
> I want to teach my student that point using some examples.
> Pl help me any body?

Hi!

This is my very first post on Usenet. I hope I understood this
correctly, so here is my answer. :)

I assume, that you do understand the concept of indentation inside Python 
code. You can concatenate lines with a backslash. These lines work as if 
they were only one line. For example:

>>> print ("This is a very long"\
... " line, that got "\
... "divided into three lines.")
This is a very long line, that got divided into three lines.
>>>

Because the lines get concatenated, one might think, that you could 
divide for example 16 spaces of indentation into one line of 8 spaces 
with a backslash and one line with 8 spaces and the actual code.
Your posted text tells you though, that you can't do this. Instead the 
indentation would be considered to be only 8 spaces wide.

I hope this helped a little. :)

Cheers 
Marc.


Re: the global keyword:

2016-06-12 Thread BartC

On 12/06/2016 20:25, Ned Batchelder wrote:
> On Sunday, June 12, 2016 at 3:08:01 PM UTC-4, BartC wrote:
>> On 12/06/2016 00:44, Marcin Rak wrote:
>>> from Test import some_function, my_print
>>> from Test import test_var
>>>
>>> some_function()
>>> my_print()
>>> print(test_var)
>>> *
>>>
>>> and I have the following Test.py:
>>> *
>>> test_var = 5
>>
>>   from Test import a,b,c
>>
>> is equivalent to:
>>
>>   import Test
>>
>>   a = Test.a
>>   b = Test.b
>>   c = Test.c
>>
>> which I hadn't been aware of. Then the link between a and Test.a (eg.
>> Test.test_var) is broken (unless Test.a is something like a list so both
>> still refer to the same data. But assignment to either - not an in-place
>> mod - will break the connection).
>
> Just to clarify: there is no link directly between a and Test.a, except that
> both refer to the same object.

OK, but I meant the link there would have been if 'a' was in fact a synonym
for Test.a.

> Just as here there is no link between x
> and y:
>
> x = 12
> y = x

(And that's a good illustration of why 'y' isn't a name reference to
'x', referring to the "...ducks limp" thread. But best not to rake it up
again...)


--
bartc


Re: base64.b64encode(data)

2016-06-12 Thread Marko Rauhamaa
Marcin Rak :

> b64_encoded_data = base64.b64encode(data)
>
> my b64_encoded_data variables holds, would you believe it, a string as
> bytes!.

It doesn't much matter one way or another. The logic is that whenever
you encode objects, you typically want the output as bytes. However,
it's trivial to decode the bytes into a string if that's what you need.


Marko


Re: the global keyword:

2016-06-12 Thread Ned Batchelder
On Sunday, June 12, 2016 at 3:08:01 PM UTC-4, BartC wrote:
> On 12/06/2016 00:44, Marcin Rak wrote:
> > Hi to all.
> >
> > I have the following file named Solver.py:
> > *
> > from Test import some_function, my_print
> > from Test import test_var
> >
> > some_function()
> > my_print()
> > print(test_var)
> > *
> >
> > and I have the following Test.py:
> > *
> > test_var = 5
> >
> > def some_function():
> >     global test_var
> >     test_var = 44
> >     print("f {0}".format(test_var))
> >
> > def my_print():
> >     print(test_var)
> > *
> >
> > Would you believe it that when I run Solver.py I get the following output:
> > f 44
> > 44
> > 5
> >
> > So my question is, how the heck is it possible that I get 5 as the last 
> > value printed? the global test_var (global to Test.py) I set to 44 when I 
> > ran some_function()???  does anyone have a clue they could throw my way?
> 
> I was puzzled too. Apparently importing stuff using 'from':
> 
>   from Test import a,b,c
> 
> is equivalent to:
> 
>   import Test
> 
>   a = Test.a
>   b = Test.b
>   c = Test.c
> 
> which I hadn't been aware of. Then the link between a and Test.a (eg. 
> Test.test_var) is broken (unless Test.a is something like a list so both 
> still refer to the same data. But assignment to either - not an in-place 
> mod - will break the connection).

Just to clarify: there is no link directly between a and Test.a, except that
both refer to the same object.  Just as here there is no link between x
and y:

x = 12
y = x

As I explained elsewhere in this thread, import statements behave exactly
like assignments.  The same reasoning that applies to multiple variables
referring to integers applies to multiple names being imported across
modules.
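
A self-contained sketch of the same effect. To keep it in one file, a module object named Test is built in memory and registered in sys.modules as a stand-in for the Test.py file from the thread; that construction is an assumption of this example, not something the original code did:

```python
import sys
import types

# In-memory stand-in for Test.py.
test_module = types.ModuleType("Test")
exec(
    "test_var = 5\n"
    "def some_function():\n"
    "    global test_var\n"
    "    test_var = 44\n",
    test_module.__dict__,
)
sys.modules["Test"] = test_module

from Test import some_function, test_var  # binds a local name to the object 5
import Test

some_function()       # rebinds Test.test_var inside the module

print(test_var)       # the importing module's own name, never rebound
print(Test.test_var)  # the attribute looked up on the module at call time
```

The imported name and the module attribute start out referring to the same int, and some_function() rebinds only the attribute.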

--Ned.


Re: how to handle surrogate encoding: read from fs write to database

2016-06-12 Thread Marko Rauhamaa
Random832 :

> On Sun, Jun 12, 2016, at 12:50, Steven D'Aprano wrote:
> >> I think Windows also gets it almost right: NTFS uses UTF-16, and (I
> >> think) only allows valid Unicode file names.
>
> Nope. Windows allows any sequence of 16-bit units (except for a dozen or
> so ASCII characters) in filenames.

Also, somewhat related, Python allows strings to contain non-Unicode
code points, namely code points in the surrogate hole. Thus, Python's
native character set is a superset of Unicode.
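
A short sketch of what that looks like in practice. The surrogateescape error handler shown here is the related mechanism Python uses for undecodable filename bytes; the sample byte string is made up:

```python
lone = "\udc80"  # a lone surrogate code point, legal inside a Python str
print(len(lone))

# A strict UTF-8 encoder refuses it, since it is not a Unicode scalar value.
try:
    lone.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
print(strict_ok)

# surrogateescape maps an undecodable byte to a lone surrogate and back.
name = b"caf\xe9".decode("utf-8", "surrogateescape")
print(ascii(name))
print(name.encode("utf-8", "surrogateescape") == b"caf\xe9")
```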


Marko


Re: the global keyword:

2016-06-12 Thread BartC

On 12/06/2016 00:44, Marcin Rak wrote:
> Hi to all.
>
> I have the following file named Solver.py:
> *
> from Test import some_function, my_print
> from Test import test_var
>
> some_function()
> my_print()
> print(test_var)
> *
>
> and I have the following Test.py:
> *
> test_var = 5
>
> def some_function():
>     global test_var
>     test_var = 44
>     print("f {0}".format(test_var))
>
> def my_print():
>     print(test_var)
> *
>
> Would you believe it that when I run Solver.py I get the following output:
> f 44
> 44
> 5
>
> So my question is, how the heck is it possible that I get 5 as the last value
> printed? the global test_var (global to Test.py) I set to 44 when I ran
> some_function()???  does anyone have a clue they could throw my way?


I was puzzled too. Apparently importing stuff using 'from':

 from Test import a,b,c

is equivalent to:

 import Test

 a = Test.a
 b = Test.b
 c = Test.c

which I hadn't been aware of. Then the link between a and Test.a (eg. 
Test.test_var) is broken (unless Test.a is something like a list so both 
still refer to the same data. But assignment to either - not an in-place 
mod - will break the connection).


Your code could be rewritten as:

from Test import some_function, my_print
import Test

some_function()
my_print()
print(Test.test_var)


Anyway, it shows Python doesn't have true cross-module globals.

--
Bartc


Re: for / while else doesn't make sense

2016-06-12 Thread Michael Selik
On Sun, Jun 12, 2016 at 6:11 AM Steven D'Aprano <
steve+comp.lang.pyt...@pearwood.info> wrote:

> - run the for block
> - THEN unconditionally run the "else" block
>

Saying "unconditionally" is a bit misleading here. As you say, it's
conditioned on completing the loop without break/return/raise.
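
A small sketch of the behaviour under discussion (the function name and data are invented for the example):

```python
def find_first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break      # leaves the loop and skips the else block
    else:
        result = None  # runs only when the loop finished without break
    return result


print(find_first_negative([3, -1, 4]))  # break taken, else skipped
print(find_first_negative([3, 1, 4]))   # loop exhausted, else runs
print(find_first_negative([]))          # empty loop, else still runs
```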


base64.b64encode(data)

2016-06-12 Thread Marcin Rak
Hi to everyone.

Let's say I have some binary data, be it whatever, in the 'data' variable.  
After calling the following line

b64_encoded_data = base64.b64encode(data)

my b64_encoded_data variable holds, would you believe it, a string as bytes!
That is, the b64_encoded_data variable is of type 'bytes' and when you peek
inside it's a string (made up of what seems to be only characters that exist in
Base 64).  Why isn't it a string yet?  In fact, I now have to apply the
decode('utf-8') method to that variable to get a string object holding the
exact same sequence of characters as was held by the b64_encoded_data bytes
variable.

I'm a little confused as to why I would even have to apply the
.decode('utf-8') method - why doesn't base64.b64encode provide us with a result 
that is a 'str'?



loading trees...

2016-06-12 Thread Fillmore


Hi, problem for today. I have a batch file that creates "trees of data".
I can save these trees in the form of python code or serialize them with 
something
like pickle.

I then need to run a program that loads the whole forest in the form of a dict()
where each item will point to a dynamically loaded tree.

What's my best way to achieve this? Pickle? Or creating custom Python code?

or maybe I am just reinventing the wheel and there are better ways to achieve 
this?

The idea is that I'll receive a bit of data, determine which tree is suitable 
for handling it,
and dispatch the data to the right tree for further processing...
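
One possible shape for the pickle route, as a rough sketch only. The tree
contents, the tree names and the dispatch() helper are all hypothetical, and
the pickled bytes are kept in memory here instead of being written to a file:

```python
import pickle

# Hypothetical trees: nested dicts, one per tree name.
forest = {
    'fruit': {'apple': {'colour': 'red'}, 'lemon': {'colour': 'yellow'}},
    'tools': {'hammer': {'weight_kg': 0.5}},
}

blob = pickle.dumps(forest)      # what the batch job would save
loaded = pickle.loads(blob)      # what the consuming program would load


def dispatch(forest, tree_name, key):
    """Pick the tree by name, then look the key up in that tree."""
    tree = forest[tree_name]
    return tree.get(key)


result = dispatch(loaded, 'fruit', 'lemon')
```

Note that pickle should only be used on data you trust, since unpickling can
execute arbitrary code. If the trees are plain dicts, lists, strings and
numbers, the json module is a safer alternative.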

Thanks
--
https://mail.python.org/mailman/listinfo/python-list


Re: the global keyword:

2016-06-12 Thread Random832
On Sat, Jun 11, 2016, at 23:15, Lawrence D’Oliveiro wrote:
> On Sunday, June 12, 2016 at 11:51:11 AM UTC+12, Random832 wrote:
> > Importing a variable from a module copies its value into your own
> > module's variable.
> 
> Every name in Python is a variable, and can be assigned to to change its
> value at any time.

Sure, but that doesn't really help explain this situation, since your
statement doesn't make clear the fact that by importing foo.a into bar
you are really merely assigning to bar.a rather than having continuous
access to foo.a, unlike how importing things works in static languages.
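
A small sketch of that point. The module name foo and its attributes are
made up for the illustration, and the module is built in memory so the
example is self-contained:

```python
import sys
import types

# Build a throwaway module called "foo" holding two variables.
foo = types.ModuleType('foo')
foo.a = 1
foo.items = []
sys.modules['foo'] = foo

from foo import a, items    # binds our own names a and items

foo.a = 2                   # rebinding foo.a leaves our a alone
after_rebind = (a, foo.a)

foo.items.append(3)         # in-place change: both names still share one list
shared = items
```

After this runs, a is still 1 while foo.a is 2, but items and foo.items are
the same list object, so the appended 3 is visible through both names.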
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: the global keyword:

2016-06-12 Thread Marcin Rak
Much thanks to all for their time, but Ned in particular...I learned something 
new about Python!!

On Saturday, 11 June 2016 22:48:32 UTC-5, Ned Batchelder  wrote:
> On Saturday, June 11, 2016 at 11:38:33 PM UTC-4, Steven D'Aprano wrote:
> > On Sun, 12 Jun 2016 11:26 am, Random832 wrote:
> > 
> > > On Sat, Jun 11, 2016, at 20:09, MRAB wrote:
> > >> Not true. Importing doesn't copy the value.
> > >> 
> > >> Importing a name creates a new name in the local scope that refers to
> > >> the same object that the imported name referred to.
> > 
> > MRAB is correct here.
> > 
> > 
> > > Yes, the value of a variable is a reference to an object. Can we not
> > > have another round of this right now?
> > 
> > Sure, if you stop spreading misinformation about variables in Python and
> > cease the insanity of claiming that the value of a variable is not the
> > value you assign to it, but some invisible, unreachable "reference".
> > 
> > x = 999
> > 
> > The value of x is 999, not some invisible reference.
> > 
> > x = []
> > 
> > The value of x is an empty list, not some invisible reference.
> 
> We just went through all this.  It's clear to me that there are different
> ways of looking at these underlying mechanisms, and different people find
> truth in different ways of describing them.  The virtual world we live in
> is complex because of the differing levels of abstraction that are possible.
> Some of this disagreement is really a matter of choosing different
> abstractions to focus on.
> 
> Most importantly, it's clear to me that we aren't going to come to some
> simple consensus, certainly not by throwing around words like "insanity."
> 
> Perhaps at least in this thread we can limit ourselves to addressing the
> OP and their question directly, rather than fighting with each other over
> which answer is correct?
> 
> --Ned.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: is there a big in python 3.5.1 shell?

2016-06-12 Thread Joel Goldstick
On Sun, Jun 12, 2016 at 11:39 AM, Listo Amugongo
 wrote:
> In my attachment above is a screenshot of my session.
> Screenshot description:
> In my first elif block (first code block) I deliberately introduced an 
> indentation error. I did it on purpose to make my point clear: the 
> Python 3.5.1 shell* gives a syntax error after I run the elif block, 
> no matter how correctly I indent the code. However, this is 
> not the case when I use the Python 3.5.1 that I open from the Windows 
> command line prompt (c:\python).
>
> I’m new to Python and I’m wondering if there’s something I don’t know or 
> am doing wrong. Furthermore, the Python shell won’t display the three-dot 
> continuation prompt (…), which is also not the case in Python 3.5.1.
> Any feedback will be highly regarded.
>
> Sent from Mail for Windows 10
>
> --
> https://mail.python.org/mailman/listinfo/python-list

Welcome.  You can't do attachments in this list.  Most people won't
see them.  Just cut and paste your code in your text

-- 
Joel Goldstick
http://joelgoldstick.com/blog
http://cc-baseballstats.info/stats/birthdays
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Altering sys.argv on startup in Python 2

2016-06-12 Thread Random832
On Sun, Jun 12, 2016, at 13:51, Random832 wrote:
>     if edit_done:
>         return

this should of course be raise ImportError - an artifact of some
refactoring I did on the example.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: how to handle surrogate encoding: read from fs write to database

2016-06-12 Thread Random832
On Sun, Jun 12, 2016, at 12:50, Steven D'Aprano wrote:
> I think Windows also gets it almost right: NTFS uses UTF-16, and (I
> think) only allows valid Unicode file names.

Nope. Windows allows any sequence of 16-bit units (except for a dozen or
so ASCII characters) in filenames.

Of course, you're not particularly _likely_ to encounter invalid
surrogates, since nothing is going to create them without deliberately
setting out to (unlike Linux where 'invalid' filenames will be created
by any program using the 'wrong' locale).
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Altering sys.argv on startup in Python 2

2016-06-12 Thread Random832
On Sun, Jun 12, 2016, at 08:56, Adam Bartoš wrote:
> Hello,
> 
> I'm trying to employ code like
> https://code.activestate.com/recipes/572200/
> at Python 2 startup. The problem is that sys.argv isn't populated yet
> when sitecustomize is executed. Is there any way around this? I was
> thinking about something like temporarily changing sys.modules['sys'] or
> sys.__dict__ to something that runs my code on PySys_SetArgv, but that
> doesn't work since PySys_SetArgv doesn't invoke any hooks like
> __setitem__ on sys.__dict__. So is there any way to automatically run my
> code after sys.argv has been set but before executing the main script
> (in Python 2)?

From what I can tell, if you create a path hook in sitecustomize.py,
argv will have been set before the hook is called for the script.

So, for example:

import sys

edit_done = False
def argv_setter_hook(path):
    global edit_done
    if edit_done:
        return
    try:
        argv = sys.argv
    except AttributeError:
        pass
    else:
        argv[:] = get_unicode_argv()
        edit_done = True
    raise ImportError # let the real import machinery do its work

sys.path_hooks[:0] = [argv_setter_hook]
-- 
https://mail.python.org/mailman/listinfo/python-list


is there a big in python 3.5.1 shell?

2016-06-12 Thread Listo Amugongo
In my attachment above is a screenshot of my session.
Screenshot description:
In my first elif block (first code block) I deliberately introduced an 
indentation error. I did it on purpose to make my point clear: the 
Python 3.5.1 shell* gives a syntax error after I run the elif block, 
no matter how correctly I indent the code. However, this is 
not the case when I use the Python 3.5.1 that I open from the Windows command 
line prompt (c:\python).

I’m new to Python and I’m wondering if there’s something I don’t know or am doing 
wrong. Furthermore, the Python shell won’t display the three-dot continuation 
prompt (…), which is also not the case in Python 3.5.1.
Any feedback will be highly regarded.

Sent from Mail for Windows 10

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: how to handle surrogate encoding: read from fs write to database

2016-06-12 Thread Steven D'Aprano
On Sun, 12 Jun 2016 10:09 pm, Peter Volkov wrote:

> Hi, everybody.
> 
> What is a best practice to deal with filenames in python3? The problem is
> that os.walk(src_dir), os.listdir(src_dir), ... return "surrogate" strings
> as filenames.

Can you give an example?



> It is impossible to assume that they are normal strings that 
> could be print()'ed on unicode terminal or saved as as string into
> database (mongodb) as they'll issue UnicodeEncodeError on surrogate
> character. So, how to handle this situation?

Use a better OS :-)

I believe that Mac OS X handles this the right way. If I understand
correctly, its preferred file system, HFS+, will only allow valid UTF-8
strings as file names. So you cannot get invalid Unicode strings containing
surrogates on Apple systems (unless you read from a non-HFS+ disk).

I think Windows also gets it almost right: NTFS uses UTF-16, and (I think)
only allows valid Unicode file names.

It's only Unix file systems, including Linux, that allow arbitrary bytes
(except for / and \0) in file names, so file names can be invalid Unicode,
including surrogates.

All the terminals I know of on Linux will print "bad" file names. They will
be ugly, with control characters inside them, or invisible characters that
cannot even be seen, but they will print.

> The first solution I found was to convert filenames to bytes and use them.

I think that's the only real solution. The file names on disk actually are
bytes, and they're invalid Unicode, so it shouldn't surprise you if the
only way to deal with them losslessly is to keep them as bytes.
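
To make the lossless round trip concrete, here is a small sketch. It assumes
a POSIX system whose file system encoding is UTF-8, and the file name is a
made-up example. The error handler is spelled out explicitly so the result
does not depend on the platform; os.fsdecode() and os.fsencode() apply the
same idea using the real file system encoding:

```python
raw_name = b'caf\xe9.txt'    # Latin-1 bytes, not valid UTF-8 (hypothetical name)

# PEP 383: the undecodable byte 0xE9 becomes the lone surrogate U+DCE9.
as_text = raw_name.decode('utf-8', errors='surrogateescape')

# Encoding with the same handler restores the original bytes exactly.
round_trip = as_text.encode('utf-8', errors='surrogateescape')

# A strict encode, as print() or a database driver would attempt, fails.
try:
    as_text.encode('utf-8')
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

So the str form is fine for passing back to os functions, but it has to be
turned into bytes (or cleaned up) before it goes anywhere that insists on
valid Unicode.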

Another way is to simply refuse to process those files. Filenames with
broken Unicode are, arguably, broken and shouldn't be allowed. It's your
application, you can specify how files have to be named. Even if your
operating system allows it, your application can refuse to deal with them:
either just skip those files, or raise an error, or insist that the user
renames them to something usable. Perhaps you can even repair the file name
yourself, by deleting or replacing the surrogates.


> But that's not nice. Once I need to compare filename with some string I'll
> have to convert strings to bytes.

That's not hard. 'my filename.txt'.encode('utf-8') # or whatever encoding
your file system uses


> Also Bytes() objects are base64 encoded 
> in mongo shell and thus they are hard to read, 

Use a better database :-)


> *e.g. "binary" : 
> BinData(0,"c29tZSBiaW5hcnkgdGV4dA==")*. Finally PEP 383 states that using
> bytes does not work in windows (btw, why?).

Windows file systems, at least NTFS, use UTF-16 file names. But that
shouldn't matter to you: all that means is that when you read file names
from Windows, you should never see any surrogates. (I think. I don't have
access to a Windows machine I can test this on.)


> Another option I found is to work with filenames as surrogate strings but
> enc them to 'latin-1' before printing/saving into database:
> filename.encode(fse, errors='surrogateescape').decode('latin-1')

Do you want mojibake? Because that's how you get mojibake.

py> 'µPy'.encode('utf-8').decode('latin1')
'ÂµPy'
py> 'αω'.encode('utf-8').decode('latin1')
'Î±Ï\x89'

You are mapping the full range of 1114112 distinct Unicode code points into
just 256 Latin-1 characters. Bad Things happen.

> This way I like more since latin symbols are clearly visible in mongo
> shell. Yet I doubt this is best solution.

It certainly isn't.


> Ideally I would like to send surrogate strings to database or to terminal
> as is and let db/terminal handle them. IOW let terminal print garbage
> where surrogate letters appear. Is this possible in python?

That's nothing to do with Python, it depends on the database, and the
terminal.

> So what do you think: is  usage unicode strings and explicit conversion to
> latin-1 a good option?

Absolutely not.



> Also related question: is it possible to detect surrogate symbols in
> strings?

any('\uD800' <= c <= '\uDFFF' for c in the_string)
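
Wrapped up as helpers, that check might look like the sketch below. The
function names are made up for the illustration, and replacing surrogates
with U+FFFD is only one possible repair (it is lossy, so keep the original
name if you still need to open the file):

```python
def has_surrogates(s):
    """Return True if s contains any code point in U+D800..U+DFFF."""
    return any('\ud800' <= c <= '\udfff' for c in s)


def printable(s):
    """Replace every surrogate with U+FFFD so the result encodes cleanly."""
    return ''.join('\ufffd' if '\ud800' <= c <= '\udfff' else c for c in s)


name = 'caf\udce9.txt'              # hypothetical name with an escaped byte
clean = printable(name)
clean_bytes = clean.encode('utf-8') # no UnicodeEncodeError any more
```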



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Indentation example?

2016-06-12 Thread ICT Ezy
On Sunday, June 12, 2016 at 9:36:16 PM UTC+5:30, Steven D'Aprano wrote:
> On Mon, 13 Jun 2016 01:10 am, ICT Ezy wrote:
> 
> > Pl explain with an example the following phase
> > "Indentation cannot be split over multiple physical lines using
> > backslashes; the whitespace up to the first backslash determines the
> > indentation" (in 2.1.8. Indentation of Tutorial.) I want to teach my
> > student that point using some examples. Pl help me any body?
> 
> 
> Good indentation:
> 
> def function():
>     # four spaces per indent
>     print("hello")
>     print("goodbye")
> 
> 
> Bad indentation:
> 
> def function():
>     # four spaces per indent
>     print("hello")  # four spaces
>   \
>   print("goodbye")  # two spaces, then backslash, then two more
> 
> 
> The second example will be a SyntaxError.
> 
> 
> 
> -- 
> Steven
Thank you very much for your example.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Indentation example?

2016-06-12 Thread ICT Ezy
On Sunday, June 12, 2016 at 9:46:00 PM UTC+5:30, Ned Batchelder wrote:
> On Sunday, June 12, 2016 at 11:10:39 AM UTC-4, ICT Ezy wrote:
> > Pl explain with an example the following phase
> > "Indentation cannot be split over multiple physical lines using 
> > backslashes; the whitespace up to the first backslash determines the 
> > indentation" (in 2.1.8. Indentation of Tutorial.)
> > I want to teach my student that point using some examples.
> > Pl help me any body?
> 
> For what it's worth, that sentence isn't in the tutorial, it's in the
> reference manual, which has to mention all sorts of edge cases that most
> people will likely never encounter.
> 
> I've never seen someone try to make indentation work in that way with a
> backslash.  If I were you, I wouldn't mention it to your students, it might
> just confuse them further.
> 
> --Ned.
Thank you for your advice.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Indentation example?

2016-06-12 Thread Ned Batchelder
On Sunday, June 12, 2016 at 11:10:39 AM UTC-4, ICT Ezy wrote:
> Pl explain with an example the following phase
> "Indentation cannot be split over multiple physical lines using backslashes; 
> the whitespace up to the first backslash determines the indentation" (in 
> 2.1.8. Indentation of Tutorial.)
> I want to teach my student that point using some examples.
> Pl help me any body?

For what it's worth, that sentence isn't in the tutorial, it's in the
reference manual, which has to mention all sorts of edge cases that most
people will likely never encounter.

I've never seen someone try to make indentation work in that way with a
backslash.  If I were you, I wouldn't mention it to your students, it might
just confuse them further.

--Ned.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Indentation example?

2016-06-12 Thread Steven D'Aprano
On Mon, 13 Jun 2016 01:10 am, ICT Ezy wrote:

> Pl explain with an example the following phase
> "Indentation cannot be split over multiple physical lines using
> backslashes; the whitespace up to the first backslash determines the
> indentation" (in 2.1.8. Indentation of Tutorial.) I want to teach my
> student that point using some examples. Pl help me any body?


Good indentation:

def function():
    # four spaces per indent
    print("hello")
    print("goodbye")


Bad indentation:

def function():
    # four spaces per indent
    print("hello")  # four spaces
  \
  print("goodbye")  # two spaces, then backslash, then two more


The second example will be a SyntaxError.



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: i'm a python newbie & wrote my first script, can someone critique it?

2016-06-12 Thread mad scientist jr
Thanks for your reply!

On Saturday, June 11, 2016 at 2:41:39 PM UTC-4, MRAB wrote:
> Drop the next 3 comment lines. They add visual clutter, and no useful info.
> > ###
> > # REFERENCE MODULES
> > ###

I'm not going to argue minor points at length, because others have said the 
same thing here, but I will push back a little and explain that what you guys 
consider visual clutter, I find helps me to visually break up and quickly 
identify sections in my code, which is why I like those separators.

> There's an easier way to make the repeated string: "+" * 79
> > 
> > print("+++")

Thanks, that's a good tip. In this case I might prefer showing the full line of 
+s because that way, wysiwyg, it's just easier to visualize the output.

Thanks again for the input. . . I will further digest what you all said and 
study some more (including the docstrings)
-- 
https://mail.python.org/mailman/listinfo/python-list


Indentation example?

2016-06-12 Thread ICT Ezy
Please explain, with an example, the following phrase:
"Indentation cannot be split over multiple physical lines using backslashes; 
the whitespace up to the first backslash determines the indentation" (in 2.1.8. 
Indentation of Tutorial.)
I want to teach my students that point using some examples.
Can anybody help me?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: pytz and Python timezones

2016-06-12 Thread Carl Meyer
Hi Johannes,

On 06/11/2016 05:37 AM, Johannes Bauer wrote:
> I try to create a localized timestamp
> in the easiest possible way. So, intuitively, I did this:
> 
> datetime.datetime(2016,1,1,0,0,0,tzinfo=pytz.timezone("Europe/Berlin"))

That is indeed intuitive, but unfortunately (due to a misunderstanding
between the original authors of Python's datetime module and the author
of pytz about how timezone-aware datetimes should work in Python) it is
not correct. The correct way to create a localized datetime using pytz
is this:

tz = pytz.timezone('Europe/Berlin')
dt = tz.localize(datetime.datetime(2016, 1, 1, 0, 0, 0))

This is documented prominently in the pytz documentation:
http://pytz.sourceforge.net/

> Which gives me:
> 
> datetime.datetime(2016, 1, 1, 0, 0, tzinfo=<DstTzInfo 'Europe/Berlin' LMT+0:53:00 STD>)
> 
> Uh... what?

When you create a pytz timezone object, it encompasses all historical
UTC offsets that have ever been in effect in that location. When you
pass a datetime to the `localize()` method of that timezone object, it
is able to figure out which actual UTC offset was in effect at that
local time in that location, and apply the correct "version" of itself
to that datetime.

However, no such logic is built into the datetime module itself. So when
you just apply a pytz timezone directly to the tzinfo property of a
datetime, pytz by default falls back to the first entry in its
historical table of UTC offsets for that location. For most locations,
that is something called "LMT" or Local Mean Time, which is the
customary time in use at that location prior to the standardization of
timezones. And in most locations, LMT is offset from UTC by a strange
number of minutes. That's why you see "LMT" and the odd 53-minute offset
above.

> This here:
> 
> pytz.timezone("Europe/Berlin").localize(datetime.datetime(2016,1,1))
> 
> Gives me the expected result of:
> 
> datetime.datetime(2016, 1, 1, 0, 0, tzinfo=<DstTzInfo 'Europe/Berlin' CET+1:00:00 STD>)
> 
> Can someone explain what's going on here and why I end up with the weird
> "00:53" timezone? Is this a bug or am I doing things wrong?

It is not a bug in pytz or in datetime, in that it is intended behavior,
although that behavior is unfortunately obscure, bug-prone, and
little-understood.

If you are masochistic enough to want to understand how this bad
situation came to be, and what might be done about it, you can read
through PEPs 431 and 495.

Carl



-- 
https://mail.python.org/mailman/listinfo/python-list


how to handle surrogate encoding: read from fs write to database

2016-06-12 Thread Peter Volkov
Hi, everybody.

What is a best practice to deal with filenames in python3? The problem is
that os.walk(src_dir), os.listdir(src_dir), ... return "surrogate" strings
as filenames. It is impossible to assume that they are normal strings that
could be print()'ed on a Unicode terminal or saved as a string into a database
(mongodb), as they'll raise UnicodeEncodeError on a surrogate character. So,
how to handle this situation?

The first solution I found was to convert filenames to bytes and use them.
But that's not nice. Once I need to compare filename with some string I'll
have to convert strings to bytes. Also Bytes() objects are base64 encoded
in mongo shell and thus they are hard to read, *e.g. "binary" :
BinData(0,"c29tZSBiaW5hcnkgdGV4dA==")*. Finally PEP 383 states that using
bytes does not work in windows (btw, why?).

Another option I found is to work with filenames as surrogate strings but
enc them to 'latin-1' before printing/saving into database:
filename.encode(fse, errors='surrogateescape').decode('latin-1')
This way I like more since latin symbols are clearly visible in mongo
shell. Yet I doubt this is best solution.

Ideally I would like to send surrogate strings to database or to terminal
as is and let db/terminal handle them. IOW let terminal print garbage where
surrogate letters appear. Is this possible in python?

So what do you think: is  usage unicode strings and explicit conversion to
latin-1 a good option?

Also related question: is it possible to detect surrogate symbols in
strings? I found a suggestion to use re.compile('[\ud800-\uefff]+'). Yet all
this stuff feels too hacky for me, so I would like some confirmation that
this is the right way.

Thanks in advance and sorry for touching this matter again. Too many
discussions and not evident what is the current state of art here.
--
Peter.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: the global keyword:

2016-06-12 Thread Lawrence D’Oliveiro
On Sunday, June 12, 2016 at 11:51:11 AM UTC+12, Random832 wrote:
> Importing a variable from a module copies its value into your own
> module's variable.

Every name in Python is a variable, and can be assigned to to change its value 
at any time.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: pytz and Python timezones

2016-06-12 Thread Lawrence D’Oliveiro
On Saturday, June 11, 2016 at 11:37:38 PM UTC+12, Johannes Bauer wrote:
> I try to create a localized timestamp in the easiest possible way.

Localized timestamps are perhaps not as easy as you think.

> So, intuitively, I did this:
> 
> datetime.datetime(2016,1,1,0,0,0,tzinfo=pytz.timezone("Europe/Berlin"))
> 
> Which gives me:
> 
> datetime.datetime(2016, 1, 1, 0, 0, tzinfo=<DstTzInfo 'Europe/Berlin' LMT+0:53:00 STD>)
> 
> Uh... what?

A careful reading of 
 indicates 
that those classes expect their attributes to already be in local time. I don’t 
think they have the smarts to examine the tzinfo you pass them and decide which 
of possibly several sections of historical information might apply to them. So 
you ended up with it being interpreted according to the oldest section of the 
“Europe/Berlin” timezone data (the one up to April 1893).

Anyway, that’s my guess.

> This here:
> 
> pytz.timezone("Europe/Berlin").localize(datetime.datetime(2016,1,1))
> 
> Gives me the expected result of:
> 
> datetime.datetime(2016, 1, 1, 0, 0, tzinfo=<DstTzInfo 'Europe/Berlin' CET+1:00:00 STD>)

Clearly pytz has much more smarts about handling local times.

My general rule about dates/times is: always work in UTC. Convert to local 
dates/times only for display purposes.
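
A stdlib-only sketch of that rule follows. The fixed +01:00 offset stands in
for Berlin winter time and is an assumption made to keep the example
self-contained; real code would take the offset from a time zone database
such as pytz:

```python
from datetime import datetime, timedelta, timezone

# Store and compute in UTC.
created = datetime(2016, 1, 1, 0, 0, tzinfo=timezone.utc)
deadline = created + timedelta(hours=36)

# Convert only when displaying. The fixed offset below is an assumption
# for this sketch, not a real "Europe/Berlin" zone with DST rules.
cet = timezone(timedelta(hours=1), 'CET')
shown = deadline.astimezone(cet).isoformat()

print(shown)
```

Because the arithmetic happens in UTC, the 36-hour step is unaffected by any
local clock changes; only the final rendering depends on the zone.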
-- 
https://mail.python.org/mailman/listinfo/python-list


Altering sys.argv on startup in Python 2

2016-06-12 Thread Adam Bartoš
Hello,

I'm trying to employ code like https://code.activestate.com/recipes/572200/
at Python 2 startup. The problem is that sys.argv isn't populated yet when
sitecustomize is executed. Is there any way around this? I was thinking about
something like temporarily changing sys.modules['sys'] or sys.__dict__ to
something that runs my code on PySys_SetArgv, but that doesn't work since
PySys_SetArgv doesn't invoke any hooks like __setitem__ on sys.__dict__. So
is there any way to automatically run my code after sys.argv has been set
but before executing the main script (in Python 2)?

Regards, Adam Bartoš
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: for / while else doesn't make sense

2016-06-12 Thread Steven D'Aprano
On Sunday 12 June 2016 17:01, pavlovevide...@gmail.com wrote:

> On Thursday, May 19, 2016 at 9:43:56 AM UTC-7, Herkermer Sherwood wrote:
>> Most keywords in Python make linguistic sense, but using "else" in for and
>> while structures is kludgy and misleading. I am under the assumption that
>> this was just utilizing an already existing keyword. Adding another like
>> "andthen" would not be good.
[...]
>> I think perhaps "finally" should be added to for and while to do the same
>> thing as "else". What do you think?
> 
> I agree it's not the clearest name, but it does behave consistent with
> if...else.  

Er, not really.

if condition:
    print "A"
else:
    print "B"


Exactly one of "A" or "B", but NEVER both, will be printed.

for item in sequence:
    print "A"
else:
    print "B"

Normally we would expect that both "A" and "B" will be printed. There will be a 
variable number of "A"s printed, zero or more, and exactly one "B", but the 
point is that in general BOTH will run, which is the opposite of if...else.

The actual semantics are:

- run the for block
- THEN unconditionally run the "else" block

The only way to skip running the "else" block is to jump out of the entire 
for...else statement, using `return`, `raise` or `break`.



> Here's how I make sense of for...else.  Consider this loop:
> 
> for item in seq:
>     if pred(item):
>         use(item)
>         break
> else:
>     not_found()

That's the major intended use of "else", but note that it is NOT paired with 
the `if`. You can have:

def func():
    for item in seq:
        if pred(item):
            use(item)
            if condition:
                return something
        else:
            different_use(item)
            continue
        do_more()
        break
    else:
        not_found()


or any of an infinite number of other combinations. You can't really understand 
for...else correctly if you think of the "else" being partnered with an "if" 
inside the loop. What if there is no "if"? Or ten of them? Or if they all 
already have "else" clauses?



> This particular loop functions as a kind of a dynamic if...elif...else
> statement.  You can see that if you unroll the loop:
> 
> 
> if pred(seq[0]):
>     use(seq[0])
> elif pred(seq[1]):
>     use(seq[1])
> elif pred(seq[2]):
>     use(seq[2])
> else:
>     not_found()

They are only equivalent because of the "break". Take the break out, unroll the 
loop, and what you have is:

if pred(seq[0]):
    use(seq[0])
if pred(seq[1]):
    use(seq[1])
if pred(seq[2]):
    use(seq[2])
not_found()

Put the break back in, and you have:


if pred(seq[0]):
    use(seq[0])
    GOTO foo  # not actual Python syntax, but see below
if pred(seq[1]):
    use(seq[1])
    GOTO foo
if pred(seq[2]):
    use(seq[2])
    GOTO foo
not_found()
label: foo

Admittedly GOTO isn't Python syntax, but this actually is the way that the 
bytecode is done: the else clause is executed unconditionally, and a break 
jumps past the else clause.

In Python 3.3, "break" is compiled to a JUMP_ABSOLUTE bytecode. You can see 
this for yourself using the dis module, e.g.:


import dis
code = compile("""
for i in seq:
    this()
    if condition:
        break
    that()
else:
    another()
print("done")
""", "", "exec")
dis.dis(code)



> You will note that the else block is the same in both the rolled and unrolled
> versions, and has exactly the same meaning and usage.

But not in the general case. Only certain specific uses of for...else behave as 
you suggest.


> As for a more appropriate keyword, I don't like the examples I saw skimming
> this thread; neither "finally" nor "then" communicates that the block would
> executed conditionally.

The block isn't executed conditionally. The block is executed UNCONDITIONALLY. 
The only way to avoid executing the "else" block is to jump past it, using a 
return, raise or break.


> If you want my opinion, you might as well use something explicit and
> unambiguous, like "if_exhausted", for the block; 

But that is exactly what the else clause is NOT. That is an incredibly common 
mistake that people make, thinking that the "else" clause executes when the 
sequence is exhausted. At first, it *seems* to be the case:

py> seq = []
py> for item in seq:
...     print("not empty!")
... else:
...     print("empty")
... 
empty

but that's wrong. 

py> seq = [1, 2]
py> for item in seq:
...     print("not empty!")
... else:
...     print("empty")
... 
not empty!
not empty!
empty
py> print(seq)  # not exhausted
[1, 2]


The else clause has nothing to do with whether or not the for block runs, or 
whether it is empty, or whether the iterable is exhausted after the loop is 
complete. The else clause simply runs directly after the for block, unless you 
skip it by using the Python equivalent of a GOTO.
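
A short sketch that records which path was taken makes the rule easy to
check. The find_first() and is_even() helpers are made up for the
illustration:

```python
def find_first(pred, seq):
    """Return (item, log); log records whether break or else was reached."""
    log = []
    for item in seq:
        if pred(item):
            log.append('break')
            break               # jumps past the else clause
    else:
        item = None             # reached whenever the loop ends without break
        log.append('else')
    return item, log


def is_even(n):
    return n % 2 == 0


found = find_first(is_even, [1, 4, 5])
not_found = find_first(is_even, [1, 3])
empty = find_first(is_even, [])
```

The else clause runs for the non-empty list with no match and for the empty
list alike; only the break in the first call skips it.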


> If you really want my opinion, it probably shouldn't be in the language at
> all, even though I happily use it from time to time, and my code is better
> for it.

o_O



> But it's not usefu

Re: AttributeError into a bloc try-except AttributeError

2016-06-12 Thread Vincent Vande Vyvre

Le 12/06/16 09:20, Vincent Vande Vyvre a écrit :

Hi,

I have a strange behaviour in my code.

In an interactive session, the result is as expected:

Python 3.4.3 (default, Oct 14 2015, 20:28:29)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = None
>>> try:
...     _ = a.value
... except AttributeError:
...     print('OK')
...
OK
>>>

But not in my code:

def call_settings_dialog(self):
    try:
        _ = self.video.category
        self.core.artelive.configure_downloading(self.video)
    except AttributeError:
        self.core.artetv.configure_downloading(self.video)

and ...

Traceback (most recent call last):
  File "/home/vincent/qarte-3/trunk/loadingscheduler.py", line 240, in call_settings_dialog
    _  = self.video.category
AttributeError: 'TVItem' object has no attribute 'category'

I have two types of video, one with an attribute category handled by a 
module 'artelive' and an other without this attribute handled by an 
other module, that's the reason of this code.



... I have just rewritten the line "_ = self.video.category" and the 
problem has disappeared.


Vincent
--
https://mail.python.org/mailman/listinfo/python-list


AttributeError into a bloc try-except AttributeError

2016-06-12 Thread Vincent Vande Vyvre

Hi,

I'm seeing some strange behaviour in my code.

In an interactive session, the result is as expected:

Python 3.4.3 (default, Oct 14 2015, 20:28:29)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = None
>>> try:
...     _ = a.value
... except AttributeError:
...     print('OK')
...
OK
>>>

But not in my code:

def call_settings_dialog(self):
    try:
        _ = self.video.category
        self.core.artelive.configure_downloading(self.video)
    except AttributeError:
        self.core.artetv.configure_downloading(self.video)

and ...

Traceback (most recent call last):
  File "/home/vincent/qarte-3/trunk/loadingscheduler.py", line 240, in call_settings_dialog
    _  = self.video.category
AttributeError: 'TVItem' object has no attribute 'category'

I have two types of video: one with a 'category' attribute, handled by the 
module 'artelive', and another without this attribute, handled by a 
different module. That's the reason for this code.



Re: for / while else doesn't make sense

2016-06-12 Thread pavlovevidence
On Thursday, May 19, 2016 at 9:43:56 AM UTC-7, Herkermer Sherwood wrote:
> Most keywords in Python make linguistic sense, but using "else" in for and
> while structures is kludgy and misleading. I am under the assumption that
> this was just utilizing an already existing keyword. Adding another like
> "andthen" would not be good.
> 
> But there is already a reserved keyword that would work great here.
> "finally". It is already a known keyword used in try blocks, but would work
> perfectly here. Best of all, it would actually make sense.
> 
> Unfortunately, it wouldn't follow the semantics of try/except/else/finally.
> 
> Is it better to follow the semantics used elsewhere in the language, or
> have the language itself make sense semantically?
> 
> I think perhaps "finally" should be added to for and while to do the same
> thing as "else". What do you think?

I agree it's not the clearest name, but it does behave consistently with 
if...else.  "finally" has a strong connotation for code that is guaranteed to 
be executed on the way out regardless of an exception, so it wouldn't be 
appropriate for this even if it were clearer (though it isn't IMO).

Here's how I make sense of for...else.  Consider this loop:

for item in seq:
    if pred(item):
        use(item)
        break
else:
    not_found()


This particular loop functions as a kind of dynamic if...elif...else 
statement.  You can see that if you unroll the loop:


if pred(seq[0]):
    use(seq[0])
elif pred(seq[1]):
    use(seq[1])
elif pred(seq[2]):
    use(seq[2])
else:
    not_found()


You will note that the else block is the same in both the rolled and unrolled 
versions, and has exactly the same meaning and usage.
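
A quick runnable check of that equivalence, as a sketch only. The function 
names rolled(), unrolled() and is_even() are made up for illustration, the 
calls to use() and not_found() are replaced by return values so the two 
versions can be compared, and the unrolled version assumes a sequence of 
exactly three items.

```python
def rolled(seq, pred):
    """The for...else form of the search loop."""
    for item in seq:
        if pred(item):
            result = ("use", item)
            break
    else:
        result = ("not_found", None)
    return result


def unrolled(seq, pred):
    """The same logic written out by hand for exactly three items."""
    if pred(seq[0]):
        return ("use", seq[0])
    elif pred(seq[1]):
        return ("use", seq[1])
    elif pred(seq[2]):
        return ("use", seq[2])
    else:
        return ("not_found", None)


def is_even(n):
    return n % 2 == 0


for seq in ([1, 4, 6], [1, 3, 5]):
    # Both forms pick the first match, or fall through to the else.
    assert rolled(seq, is_even) == unrolled(seq, is_even)
    print(seq, rolled(seq, is_even))
```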

As for a more appropriate keyword, I don't like the examples I saw skimming 
this thread; neither "finally" nor "then" communicates that the block would 
be executed conditionally.

If you want my opinion, you might as well use something explicit and 
unambiguous, like "if_exhausted", for the block; and you might as well add an 
"if_broken" block while you're at it (though it's not quite as useful since in 
most cases you could just put the code before the break).  Since it's only 
occasionally useful, there's no real need to make the keyword short.
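
To illustrate with today's syntax (a sketch; "if_exhausted" and "if_broken" 
are hypothetical keywords, and the function search() and its log strings are 
invented for the example): the existing else plays the role of 
"if_exhausted", and the "if_broken" code simply goes before the break.

```python
def search(seq, pred):
    """Record which of the two hypothetical blocks would have run."""
    log = []
    for item in seq:
        if pred(item):
            # What an "if_broken" block would do: put it before the break.
            log.append("broken at %r" % (item,))
            break
    else:
        # Today's else is the "if_exhausted" block.
        log.append("exhausted")
    return log


print(search([1, 2, 3], lambda n: n == 2))  # ['broken at 2']
print(search([1, 2, 3], lambda n: n > 5))   # ['exhausted']
```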

If you really want my opinion, it probably shouldn't be in the language at all, 
even though I happily use it from time to time, and my code is better for it.  
But it's not useful enough that the language would really suffer without it, 
and it would save some users from something that can be quite confusing.


-- 
Carl Banks