[RELEASE] Python 3.6.0b4 is now available

2016-11-21 Thread Ned Deily
On behalf of the Python development community and the Python 3.6 release
team, I'm pleased to announce the availability of Python 3.6.0b4. 3.6.0b4
is the last planned beta release of Python 3.6, the next major release of
Python.

Among the major new features in Python 3.6 are (a brief, unofficial
illustration of a few of them follows this list):

* PEP 468 - Preserving the order of **kwargs in a function
* PEP 487 - Simpler customization of class creation
* PEP 495 - Local Time Disambiguation
* PEP 498 - Literal String Formatting
* PEP 506 - Adding A Secrets Module To The Standard Library
* PEP 509 - Add a private version to dict
* PEP 515 - Underscores in Numeric Literals
* PEP 519 - Adding a file system path protocol
* PEP 520 - Preserving Class Attribute Definition Order
* PEP 523 - Adding a frame evaluation API to CPython
* PEP 524 - Make os.urandom() blocking on Linux (during system startup)
* PEP 525 - Asynchronous Generators (provisional)
* PEP 526 - Syntax for Variable Annotations (provisional)
* PEP 528 - Change Windows console encoding to UTF-8
* PEP 529 - Change Windows filesystem encoding to UTF-8
* PEP 530 - Asynchronous Comprehensions
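
A quick, unofficial illustration of a few of the features listed above
(f-strings, underscores in numeric literals, variable annotations, and the
secrets module); this snippet is not part of the announcement itself:

import secrets                      # PEP 506: secrets module

population: int = 7_400_000_000     # PEP 526 annotation, PEP 515 underscores
token = secrets.token_hex(8)        # cryptographically strong random token
print(f"population={population:,}, token={token}")   # PEP 498 f-string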

Please see "What’s New In Python 3.6" for more information:

https://docs.python.org/3.6/whatsnew/3.6.html

You can find Python 3.6.0b4 here:

https://www.python.org/downloads/release/python-360b4/

Beta releases are intended to give the wider community the opportunity
to test new features and bug fixes and to prepare their projects to
support the new feature release. We strongly encourage maintainers of
third-party Python projects to test with 3.6 during the beta phase and
report issues found to bugs.python.org as soon as possible. While the
release is feature complete entering the beta phase, it is possible that
features may be modified or, in rare cases, deleted up until the start
of the release candidate phase (2016-12-05). Our goal is to have no changes
after rc1. To achieve that, it will be extremely important to get as
much exposure for 3.6 as possible during the beta phase. Please keep in
mind that this is a preview release and its use is not recommended for
production environments.

The next pre-release of Python 3.6 will be 3.6.0rc1, the release candidate,
currently scheduled for 2016-12-05. The official release of Python 3.6.0
is currently scheduled for 2016-12-16.  More information about the release
schedule can be found here:

https://www.python.org/dev/peps/pep-0494/

--
  Ned Deily
  n...@python.org -- []

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Numpy slow at vector cross product?

2016-11-21 Thread Paul Rubin
Steven D'Aprano  writes:
> if we knew we should be doing it, and if we could be bothered to run
> multiple trials and gather statistics and keep a close eye on the
> deviation between measurements. But who wants to do that by hand?

You might like this, for Haskell:

   http://www.serpentine.com/criterion/tutorial.html

I've sometimes thought of wrapping it around other languages.

Make sure to click on the graphs: it's impressive work.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Can somebody tell me what's wrong wrong with my code? I don't understand

2016-11-21 Thread Steven D'Aprano
On Tuesday 22 November 2016 14:10, rmjbr...@gmail.com wrote:

> Hi! This is my first post! I'm having trouble understanding my code. I get
> "SyntaxError:invalid syntax" on line 49.

Sometimes a syntax error can only be reported on the line *following* the line 
with the actual error. So you may have something like this:


x = func(y  # oops forgot to close the brackets
for i in range(x):  # error is reported here
    ...


So when you get a syntax error on a line, and cannot see anything wrong with 
that line, work your way backwards until you find the actual location of the 
error.


[...]
> elif raceNum==3:
>   print("Nice fur. I don't see too many of your kind 'round here. Maybe
>   that's a good thing...") 
>   print('')
>   classNum=int(input("What's your profession mate?")
>   
> elif raceNum==4: #this line has an error for some reason

This is the line where the error is reported. The error is actually on the 
previous line. Do you see it now?
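
Hint, if it still isn't jumping out: count the closing parentheses at the end
of that int(input(...)) call. The fixed line would be:

classNum=int(input("What's your profession mate?"))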





-- 
Steven
299792.458 km/s — not just a good idea, it’s the law!

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Numpy slow at vector cross product?

2016-11-21 Thread Steven D'Aprano
On Tuesday 22 November 2016 14:00, Steve D'Aprano wrote:

> Running a whole lot of loops can, sometimes, mitigate some of that
> variation, but not always. Even when running in a loop, you can easily get
> variation of 10% or more just at random.

I think that needs to be emphasised: there's a lot of random noise in these 
measurements.

For big, heavyweight functions that do a lot of work, the noise is generally a 
tiny proportion, and you can safely ignore it. (At least for CPU bound tasks; 
for I/O bound tasks, the noise in I/O is potentially very high.)

For really tiny operations, the noise *may* be small, depending on the 
operation.  But small is not insignificant. Consider a simple operation like 
addition:

# Python 3.5
import statistics
from timeit import Timer
t = Timer("x + 1", setup="x = 0")
# ten trials, of one million loops each
results = t.repeat(repeat=10)
best = min(results)
average = statistics.mean(results)
std_error = statistics.stdev(results)/statistics.mean(results)


Best: 0.09761243686079979
Average: 0.0988507878035307
Std error: 0.02260956789268462

So this suggests that on my machine, doing no expensive virus scans or 
streaming video, the random noise in something as simple as integer addition is 
around two percent.

So that's your baseline: even simple operations repeated thousands of times 
will show random noise of a few percent.

Consequently, if you're doing one trial (one loop of, say, a million 
operations):

import time

x = 0
start = time.time()
for i in range(1000000):   # one loop of a million operations, as above
    x + 1
elapsed = time.time() - start


and compare the time taken with another trial, and the difference is of the 
order of a few percentage points, then you have *no* reason to believe the 
result is real. You ought to repeat your test multiple times -- the more the 
better.

timeit makes it easy to repeat your tests. It automatically picks the best 
timer for your platform and avoids serious gotchas from using the wrong timer. 
When called from the command line, it will automatically select the best number 
of loops to ensure reliable timing, without wasting time doing more loops than 
needed.
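
For example, the integer addition measurement above can be run straight from
the shell (illustrative; the numbers it reports will vary per machine):

python -m timeit -s "x = 0" "x + 1"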

timeit isn't magic. It's not doing anything that you or I couldn't do by hand, 
if we knew we should be doing it, and if we could be bothered to run multiple 
trials and gather statistics and keep a close eye on the deviation between 
measurements. But who wants to do that by hand?



-- 
Steven
299792.458 km/s — not just a good idea, it’s the law!

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Can somebody tell me what's wrong wrong with my code? I don't understand

2016-11-21 Thread Chris Angelico
On Tue, Nov 22, 2016 at 2:10 PM,   wrote:
> Hi! This is my first post! I'm having trouble understanding my code. I get 
> "SyntaxError:invalid syntax" on line 49. I'm trying to code a simple 
> text-based rpg on repl.it. Thank you for reading.
>
>
> elif raceNum==3:
>   print("Nice fur. I don't see too many of your kind 'round here. Maybe 
> that's a good thing...")
>   print('')
>   classNum=int(input("What's your profession mate?")
>
> elif raceNum==4: #this line has an error for some reason
>   print("Your a 'Mongo eh? I thought you lads were extinct...Just keep your 
> tongue in ya mouth and we'll get along fine mate.")
>   classNum=int(input("What's your profession?"))

Welcome to the community! I've trimmed your code to highlight the part
I'm about to refer to.

One of the tricks to understanding these kinds of errors is knowing
how the code is read, which is: top to bottom, left to right, exactly
the same as in English. Sometimes, a problem with one line of code is
actually discovered on the next line of code. (Occasionally further
down.) When you get a syntax error at the beginning of a line, it's
worth checking the previous line to see if it's somehow unfinished.

Have a look at your two blocks of code here. See if you can spot a
difference. There is one, and it's causing your error.

I'm hinting rather than overtly pointing it out, so you get a chance
to try this for yourself. Have at it!

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Can somebody tell me what's wrong wrong with my code? I don't understand

2016-11-21 Thread Larry Martell
On Mon, Nov 21, 2016 at 10:10 PM,   wrote:
> Hi! This is my first post! I'm having trouble understanding my code. I get 
> "SyntaxError:invalid syntax" on line 49. I'm trying to code a simple 
> text-based rpg on repl.it. Thank you for reading.
>
>
>
> print("Welcome to Gladiator Game! Choose your character race, class, and 
> starting equipment!")
>
> print('')
>
> print("Race selection: ")
>
> print('')
>
> print("(1) Orcs. Known for their very wide, robust physiques. They are the 
> strongest of all the races in Polaris.")
>
> print('')
>
> print("(2) Elves. Thin and wiry. They are known for their amazing agility and 
> hand-eye coordiation. They originate from the desert island of Angolia.")
>
> print('')
>
> print("(3) Silverbacks. A hairy, ape-like race from Nothern Polaris. Their 
> metal fur provides them with much needed protection.")
>
> print('')
>
> print("(4) Pomongos. An amphibian race believed to inhabit the wet jungles of 
> Central Polaris. Legends say they have highly corrosive spit...")
>
> print('')
>
> raceNum=int(input("Select your character's race by entering the corresponding 
> number. Then press enter: "))
>
> print('')
>
> while raceNum<1 or raceNum>4:
>   raceNum=int(input('Invalid input. Try again: '))
>
> print('')
>
> if raceNum==1:
>   print("You're an orc, eh? I won't be sayin' anything mean about you...")
>   print('')
>   classNum=int(input("What's your profession big fella?"))
>
> elif raceNum==2:
>   print("I never liked you elven folk...Let's get on with this.")
>   print('')
>   classNum=int(input("What's your profession ? Do ye even have one ?"))
>
> elif raceNum==3:
>   print("Nice fur. I don't see too many of your kind 'round here. Maybe 
> that's a good thing...")
>   print('')
>   classNum=int(input("What's your profession mate?")
>
> elif raceNum==4: #this line has an error for some reason
>   print("Your a 'Mongo eh? I thought you lads were extinct...Just keep your 
> tongue in ya mouth and we'll get along fine mate.")
>   classNum=int(input("What's your profession?"))

Hint: look on line 47
-- 
https://mail.python.org/mailman/listinfo/python-list


Can somebody tell me what's wrong wrong with my code? I don't understand

2016-11-21 Thread rmjbros3
Hi! This is my first post! I'm having trouble understanding my code. I get 
"SyntaxError:invalid syntax" on line 49. I'm trying to code a simple text-based 
rpg on repl.it. Thank you for reading.



print("Welcome to Gladiator Game! Choose your character race, class, and 
starting equipment!")

print('')

print("Race selection: ")

print('')

print("(1) Orcs. Known for their very wide, robust physiques. They are the 
strongest of all the races in Polaris.")

print('')

print("(2) Elves. Thin and wiry. They are known for their amazing agility and 
hand-eye coordiation. They originate from the desert island of Angolia.")

print('')

print("(3) Silverbacks. A hairy, ape-like race from Nothern Polaris. Their 
metal fur provides them with much needed protection.")

print('')

print("(4) Pomongos. An amphibian race believed to inhabit the wet jungles of 
Central Polaris. Legends say they have highly corrosive spit...")

print('')

raceNum=int(input("Select your character's race by entering the corresponding 
number. Then press enter: "))

print('')

while raceNum<1 or raceNum>4:
  raceNum=int(input('Invalid input. Try again: '))

print('')

if raceNum==1:
  print("You're an orc, eh? I won't be sayin' anything mean about you...")
  print('')
  classNum=int(input("What's your profession big fella?"))

elif raceNum==2:
  print("I never liked you elven folk...Let's get on with this.")
  print('')
  classNum=int(input("What's your profession ? Do ye even have one ?"))

elif raceNum==3:
  print("Nice fur. I don't see too many of your kind 'round here. Maybe that's 
a good thing...")
  print('')
  classNum=int(input("What's your profession mate?")
  
elif raceNum==4: #this line has an error for some reason
  print("Your a 'Mongo eh? I thought you lads were extinct...Just keep your 
tongue in ya mouth and we'll get along fine mate.") 
  classNum=int(input("What's your profession?"))
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Numpy slow at vector cross product?

2016-11-21 Thread Steve D'Aprano
On Tue, 22 Nov 2016 12:45 pm, BartC wrote:

> On 21/11/2016 14:50, Steve D'Aprano wrote:
>> On Mon, 21 Nov 2016 11:09 pm, BartC wrote:
> 
>> Modern machines run multi-tasking operating systems, where there can be
>> other processes running. Depending on what you use as your timer, you may
>> be measuring the time that those other processes run. The OS can cache
>> frequently used pieces of code, which allows it to run faster. The CPU
>> itself will cache some code.
> 
> You get to know after while what kinds of processes affect timings. For
> example, streaming a movie at the same time. 

Really, no.

You'll just have to take my word on this, but I'm not streaming any movies
at the moment. I don't even have a web browser running. And since I'm
running Linux, I don't have an anti-virus scanner that might have just
triggered a scan.

(But since I'm running Linux, I do have a web server, mail server, a DNS
server, cron, and about 300 other processes running, any of which might
start running for a microsecond or ten in the middle of a job.)

py> with Stopwatch():
... x = math.sin(1.234)
...
elapsed time is very small; consider using the timeit module for
micro-timings of small code snippets
time taken: 0.007164 seconds


And again:

py> with Stopwatch():
... x = math.sin(1.234)
...
elapsed time is very small; consider using the timeit module for
micro-timings of small code snippets
time taken: 0.14 seconds


Look at the variation in the timing: 0.007164 versus 0.14 second. That's
the influence of a cache, or more than one cache, somewhere. But if I run
it again:

py> with Stopwatch():
... x = math.sin(1.234)
...
elapsed time is very small; consider using the timeit module for
micro-timings of small code snippets
time taken: 0.13 seconds

there's a smaller variation, this time "only" 7%, for code which hasn't
changed. That's what you're up against.

Running a whole lot of loops can, sometimes, mitigate some of that
variation, but not always. Even when running in a loop, you can easily get
variation of 10% or more just at random.


> So when you need to compare timings, you turn those off.
> 
>> The shorter the code snippet, the more these complications are relevant.
>> In this particular case, we can be reasonably sure that the time it takes
>> to create a list range(1) and the overhead of the loop is *probably*
>> quite a small percentage of the time it takes to perform 10 vector
>> multiplications. But that's not a safe assumption for all code snippets.
> 
> Yes, it was one of those crazy things that Python used to have to do,
> creating a list of N numbers just in order to be able to count to N.

Doesn't matter. Even with xrange, you're still counting the cost of looking
up xrange, passing one or more arguments to it, parsing those arguments,
creating an xrange object, and iterating over that xrange object
repeatedly. None of those things are free.

You might *hope* that the cost of those things are insignificant compared to
what you're actually interested in timing, but you don't know. And you're
resisting the idea of using a tool that is specifically designed to avoid
measuring all that overhead.

It's okay that your intuitions about the cost of executing Python code are
inaccurate. What's not okay is your refusal to listen to those who have a
better idea of what's involved.


[...]
>> The timeit module automates a bunch of tricky-to-get-right best practices for
>> timing code. Is that a problem?
> 
> The problem is it substitutes a bunch of tricky-to-get-right options and
> syntax which has to be typed /at the command line/. And you really don't
> want to have to write code at the command line (especially if sourced
> from elsewhere, which means you have to transcribe it).

You have to transcribe it no matter what you do. Unless you are given
correctly written timing code.

You don't have to use timeit from the command line. But you're mad if you
don't: the smaller the code snippet, the more convenient it is.

[steve@ando ~]$ python2.7 -m timeit -s "x = 257" "3*x"
1000 loops, best of 3: 0.106 usec per loop
[steve@ando ~]$ python3.5 -m timeit -s "x = 257" "3*x"
1000 loops, best of 3: 0.137 usec per loop


That's *brilliant* and much simpler than anything you are doing with loops
and clocks and whatnot. Its simple, straightforward, and tells me exactly
what I expected to see. (Python 3.6 will be even better.)

For the record, the reason Python 3.5 is so much slower here is because it
is a debugging build.


>> But if you prefer doing it "old school" from within Python, then:
>>
>> from timeit import Timer
>> t = Timer('np.cross(x, y)',  setup="""
>> import numpy as np
>> x = np.array([1, 2, 3])
>> y = np.array([4, 5, 6])
>> """)
>>
>> # take five measurements of 10 calls each, and report the fastest
>> result = min(t.repeat(number=10, repeat=5))/10
>> print(result)  # time in seconds per call
> 
>> Better?
> 
> A bit, but the code is now inside a string!

As oppo

Re: Numpy slow at vector cross product?

2016-11-21 Thread Steve D'Aprano
On Tue, 22 Nov 2016 05:43 am, BartC wrote:

> The fastest I can get compiled, native code to do this is at 250 million
> cross-products per second.


Yes, yes, you're awfully clever, and your secret private language is so much
more efficient than even C that the entire IT industry ought to hang their
head in shame.

I'm only being *half* sarcastic here, for what its worth. I remember the
days when I could fit an entire operating system, plus applications, on a
400K floppy disk, and they would run at acceptable speed on something like
an 8 MHz CPU. Code used to be more efficient, with less overhead. But given
that your magic compiler runs only on one person's PC in the entire world,
it is completely irrelevant.



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Numpy slow at vector cross product?

2016-11-21 Thread BartC

On 21/11/2016 14:50, Steve D'Aprano wrote:

On Mon, 21 Nov 2016 11:09 pm, BartC wrote:



Modern machines run multi-tasking operating systems, where there can be
other processes running. Depending on what you use as your timer, you may
be measuring the time that those other processes run. The OS can cache
frequently used pieces of code, which allows it to run faster. The CPU
itself will cache some code.


You get to know after while what kinds of processes affect timings. For 
example, streaming a movie at the same time. So when you need to compare 
timings, you turn those off.



The shorter the code snippet, the more these complications are relevant. In
this particular case, we can be reasonably sure that the time it takes to
create a list range(1) and the overhead of the loop is *probably* quite
a small percentage of the time it takes to perform 10 vector
multiplications. But that's not a safe assumption for all code snippets.


Yes, it was one of those crazy things that Python used to have to do, 
creating a list of N numbers just in order to be able to count to N.


But that's not significant here. Either experience, or a preliminary 
test with an empty loop, or using xrange, or using Py3, will show that 
the loop overheads for N iterations in this case are small in comparison 
to executing the bodies of the loops.



This is why the timeit module exists: to do the right thing when it matters,
so that you don't have to think about whether or not it matters. The timeit
module works really really hard to get good quality, accurate timings,
minimizing any potential overhead.

The timeit module automates a bunch of tricky-to-get-right best practices for
timing code. Is that a problem?


The problem is it substitutes a bunch of tricky-to-get-right options and 
syntax which has to be typed /at the command line/. And you really don't 
want to have to write code at the command line (especially if sourced 
from elsewhere, which means you have to transcribe it).




But if you prefer doing it "old school" from within Python, then:

from timeit import Timer
t = Timer('np.cross(x, y)',  setup="""
import numpy as np
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
""")

# take five measurements of 10 calls each, and report the fastest
result = min(t.repeat(number=10, repeat=5))/10
print(result)  # time in seconds per call



Better?


A bit, but the code is now inside a string!

Code will normally exist as a proper part of a module, not on the 
command line, in a command history, or in a string, so why not test it 
running inside a module?


But I've done a lot of benchmarking and actually measuring execution 
time is just part of it. This test I ran from inside a function for 
example, not at module-level, as that is more typical.


Are the variables inside a time-it string globals or locals? It's just a 
lot of extra factors to worry about, and extra things to get wrong.


The loop timings used by the OP showed one took considerably longer than 
the other. And that was confirmed by others. There's nothing wrong with 
that method.


--
Bartc

--
https://mail.python.org/mailman/listinfo/python-list


Re: MemoryError and Pickle

2016-11-21 Thread Steve D'Aprano
On Tue, 22 Nov 2016 10:27 am, Fillmore wrote:

> 
> Hi there, Python newbie here.
> 
> I am working with large files. For this reason I figured that I would
> capture the large input into a list and serialize it with pickle for
> later (faster) usage.
> Everything has worked beautifully until today when the large data (1GB)
> file caused a MemoryError :(

At what point do you run out of memory? When building the list? If so, then
you need more memory, or smaller lists, or avoid creating a giant list in
the first place.

If you can successfully build the list, but then run out of memory when
trying to pickle it, then you may need another approach.

But as always, to really be sure what is going on, we need to see the full
traceback (not just the "MemoryError" part) and preferably a short, simple
example that replicates the error:

http://www.sscce.org/



> Question for experts: is there a way to refactor this so that data may
> be filled/written/released as the scripts go and avoid the problem?

I'm not sure what you are doing with this data. I guess you're not just:

- read the input, one line at a time
- create a giant data list
- pickle the list

and then never look at the pickle again.

I imagine that you want to process the list in some way, but how and where
and when is a mystery. But most likely you will later do:

- unpickle the list, creating a giant data list again
- process the data list

So I'm not sure what advantage the pickle is, except as make-work. Maybe
I've missed something, but if you're running out of memory processing the
giant list, perhaps a better approach is:

- read the input, one line at a time
- process that line


and avoid building the giant list or the pickle at all.


> code below.
> 
> Thanks
>
> data = list()
> for line in sys.stdin:
>  try:
>  parts = line.strip().split("\t")
>  t = parts[0]
>  w = parts[1]
>  u = parts[2]
>  #let's retain in-memory copy of data
>  data.append({"ta": t,
>   "wa": w,
>   "ua": u
>  })
>  except IndexError:
>  print("Problem with line :"+line, file=sys.stderr)
>  pass
> 
> #time to save data object into a pickle file
> 
> fileObject = open(filename,"wb")
> pickle.dump(data,fileObject)
> fileObject.close()

Let's re-write some of your code to make it better:

data = []
for line in sys.stdin:
    try:
        t, w, u = line.strip().split("\t")
    except ValueError as err:
        print("Problem with line:", line, file=sys.stderr)
        continue  # skip lines that don't have exactly three fields
    data.append({"ta": t, "wa": w, "ua": u})

with open(filename, "wb") as fileObject:
    pickle.dump(data, fileObject)


Its not obvious where you are running out of memory, but my guess is that it
is most likely while building the giant list. You have a LOT of small
dicts, each one with exactly the same set of keys. You can probably save a
lot of memory by using a tuple, or better, a namedtuple.

py> from collections import namedtuple
py> struct = namedtuple("struct", "ta wa ua")
py> x = struct("abc", "def", "ghi")
py> y = {"ta": "abc", "wa": "def", "ua": "ghi"}
py> sys.getsizeof(x)
36
py> sys.getsizeof(y)
144


So each of those little dicts {"ta": t, "wa": w, "ua": u} in your list
potentially use as much as four times the memory as a namedtuple would use.
So using namedtuple might very well save enough memory to avoid the
MemoryError altogether.


from collections import namedtuple
struct = namedtuple("struct", "ta wa ua")
data = []
for line in sys.stdin:
    try:
        t, w, u = line.strip().split("\t")
    except ValueError as err:
        print("Problem with line:", line, file=sys.stderr)
        continue  # skip malformed lines
    data.append(struct(t, w, u))

with open(filename, "wb") as fileObject:
    pickle.dump(data, fileObject)


And as a bonus, when you come to use the record, instead of having to write:

line["ta"]

to access the first field, you can write:

line.ta



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Enigma 1140 - link problem

2016-11-21 Thread Terry Reedy

On 11/21/2016 5:45 PM, BlindAnagram wrote:

Hi Jim,

...

  thanks
 Brian


'Brian', you sent this to python-list instead of Jim.  If this is not 
spam, try again with a different 'To:'



--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: MemoryError and Pickle

2016-11-21 Thread Steve D'Aprano
On Tue, 22 Nov 2016 11:40 am, Peter Otten wrote:

> Fillmore wrote:
> 
>> Hi there, Python newbie here.
>> 
>> I am working with large files. For this reason I figured that I would
>> capture the large input into a list and serialize it with pickle for
>> later (faster) usage.
> 
> But is it really faster? If the pickle is, let's say, twice as large as
> the original file it should take roughly twice as long to read the data...

But the code is more complex, therefore faster.

That's how it works, right?

*wink*




-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: MemoryError and Pickle

2016-11-21 Thread Chris Kaynor
On Mon, Nov 21, 2016 at 3:43 PM, John Gordon  wrote:
> In  Fillmore  
> writes:
>
>
>> Question for experts: is there a way to refactor this so that data may
>> be filled/written/released as the scripts go and avoid the problem?
>> code below.
>
> That depends on how the data will be read.  Here is one way to do it:
>
> fileObject = open(filename, "w")
> for line in sys.stdin:
> parts = line.strip().split("\t")
> fileObject.write("ta: %s\n" % parts[0])
> fileObject.write("wa: %s\n" % parts[1])
> fileObject.write("ua: %s\n" % parts[2])
> fileObject.close()
>
> But this doesn't use pickle format, so your reader program would have to
> be modified to read this format.  And you'll run into the same problem if
> the reader expects to keep all the data in memory.

If you want to keep using pickle, you should be able to pickle each
item of the list to the file one at a time. As long as the file is
kept open (or seeked to the end), you should be able to dump without
overwriting the old data, and read starting at the end of the previous
pickle stream.

I haven't tested it, so there may be issues (if it fails, you can try
using dumps and writing to the file by hand):

Writing:
with open(filename, 'wb') as fileObject:
    for line in sys.stdin:
        pickle.dump(line, fileObject)

Reading:
with open(filename, 'rb') as fileObject:   # note: 'rb', not 'wb', for reading
    while True:
        try:
            line = pickle.load(fileObject)
        except EOFError:                   # no more pickled records in the file
            break
        # do something with line


It should also be noted that if you do not need to support multiple
Python versions, you may want to specify a protocol to pickle.dump to
use a better version of the format. -1 will use the latest (best if
you only care about one version of Python.); 4 is currently the latest
version (added in 3.4), which may be useful if you need
forward-compatibility but not backwards-compatibility. 2 is the latest
version available in Python 2 (added in Python 2.3) See
https://docs.python.org/3.6/library/pickle.html#data-stream-format for
more information.
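
For example, the writing loop above with the protocol pinned (illustrative;
"data.pkl" is just a stand-in filename):

import pickle, sys

filename = "data.pkl"                      # hypothetical output path
with open(filename, 'wb') as fileObject:
    for line in sys.stdin:
        pickle.dump(line, fileObject, protocol=-1)   # -1 = highest protocol
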
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: MemoryError and Pickle

2016-11-21 Thread Peter Otten
Fillmore wrote:

> Hi there, Python newbie here.
> 
> I am working with large files. For this reason I figured that I would
> capture the large input into a list and serialize it with pickle for
> later (faster) usage.

But is it really faster? If the pickle is, let's say, twice as large as the 
original file it should take roughly twice as long to read the data...


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: MemoryError and Pickle

2016-11-21 Thread John Gordon
In  Fillmore  
writes:


> Question for experts: is there a way to refactor this so that data may 
> be filled/written/released as the scripts go and avoid the problem?
> code below.

That depends on how the data will be read.  Here is one way to do it:

fileObject = open(filename, "w")
for line in sys.stdin:
    parts = line.strip().split("\t")
    fileObject.write("ta: %s\n" % parts[0])
    fileObject.write("wa: %s\n" % parts[1])
    fileObject.write("ua: %s\n" % parts[2])
fileObject.close()

But this doesn't use pickle format, so your reader program would have to
be modified to read this format.  And you'll run into the same problem if
the reader expects to keep all the data in memory.
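
If the reader processes one record at a time, it can avoid holding everything
in memory. A rough sketch (hypothetical; it assumes each record is exactly
three "key: value" lines with no embedded newlines in the values):

with open(filename) as fileObject:        # same filename the writer used
    while True:
        chunk = [fileObject.readline() for _ in range(3)]
        if not chunk[0]:                  # end of file
            break
        ta, wa, ua = (line.split(": ", 1)[1].rstrip("\n") for line in chunk)
        print(ta, wa, ua)                 # stand-in for real per-record work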

-- 
John Gordon   A is for Amy, who fell down the stairs
gor...@panix.com  B is for Basil, assaulted by bears
-- Edward Gorey, "The Gashlycrumb Tinies"

-- 
https://mail.python.org/mailman/listinfo/python-list


MemoryError and Pickle

2016-11-21 Thread Fillmore


Hi there, Python newbie here.

I am working with large files. For this reason I figured that I would 
capture the large input into a list and serialize it with pickle for 
later (faster) usage.
Everything has worked beautifully until today when the large data (1GB) 
file caused a MemoryError :(


Question for experts: is there a way to refactor this so that data may 
be filled/written/released as the scripts go and avoid the problem?

code below.

Thanks

data = list()
for line in sys.stdin:

    try:
        parts = line.strip().split("\t")
        t = parts[0]
        w = parts[1]
        u = parts[2]

        #let's retain in-memory copy of data
        data.append({"ta": t,
                     "wa": w,
                     "ua": u
                    })

    except IndexError:
        print("Problem with line :"+line, file=sys.stderr)
        pass

#time to save data object into a pickle file

fileObject = open(filename,"wb")
pickle.dump(data,fileObject)
fileObject.close()
--
https://mail.python.org/mailman/listinfo/python-list


Enigma 1140 - link problem

2016-11-21 Thread BlindAnagram
Hi Jim,

In my comment/solution for this Enigma I tried to post a link to my
number theory library but my HTML got removed.

Could you please replace the first sentence with:

A solution using my <a href="http://173.254.28.24/~brgladma/number_theory.py">number theory library</a>:

(without the line wrapping).

  thanks

 Brian
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Working around multiple files in a folder

2016-11-21 Thread Emile van Sebille

On 11/21/2016 11:27 AM, subhabangal...@gmail.com wrote:

I have a python script where I am trying to read from a list of files in a 
folder and trying to process something.
As I try to take out the output I am presently appending to a list.

But I am trying to write the result of individual files in individual list or 
files.

The script is as follows:

import glob
def speed_try():
 #OPENING THE DICTIONARY
 a4=open("/python27/Dictionaryfile","r").read()
 #CONVERTING DICTIONARY INTO WORDS
 a5=a4.lower().split()
 list1=[]
 for filename in glob.glob('/Python27/*.txt'):
 a1=open(filename,"r").read()
 a2=a1.lower()
 a3=a2.split()
 for word in a3:
 if word in a5:
 a6=a5.index(word)
 a7=a6+1
 a8=a5[a7]
 a9=word+"/"+a8
 list1.append(a9)
 elif word not in a5:
 list1.append(word)
 else:
 print "None"

 x1=list1
 x2=" ".join(x1)
 print x2

Till now, I have tried to experiment over the following solutions:

a) def speed_try():
   #OPENING THE DICTIONARY
   a4=open("/python27/Dictionaryfile","r").read()
   #CONVERTING DICTIONARY INTO WORDS
   a5=a4.lower().split()
   list1=[]
   for filename in glob.glob('/Python27/*.txt'):
  a1=open(filename,"r").read()
  a2=a1.lower()
  a3=a2.split()
   list1.append(a3)


 x1=list1
 print x1

Looks very close but I am unable to fit the if...elif...else part.

b) import glob
def multi_filehandle():
 list_of_files = glob.glob('/Python27/*.txt')
 for file_name in list_of_files:
 FI = open(file_name, 'r')
 FI1=FI.read().split()
 FO = open(file_name.replace('txt', 'out'), 'w')
 for line in FI:


at this point, there's nothing left to be read from FI having been fully 
drained to populate FI1 -- maybe you want to loop over FI1 instead?


Emile



 FO.write(line)

 FI.close()
 FO.close()

I could write output but failing to do processing of the files between opening 
and writing.

I am trying to get examples from fileinput.

If anyone of the learned members may kindly suggest how may I proceed.

I am using Python2.x on MS-Windows.

The practices are scripts and not formal codes so I have not followed style 
guides.

Apology for any indentation error.

Thanking in advance.





--
https://mail.python.org/mailman/listinfo/python-list


Working around multiple files in a folder

2016-11-21 Thread subhabangalore
I have a python script where I am trying to read from a list of files in a 
folder and trying to process something. 
As I try to take out the output I am presently appending to a list.

But I am trying to write the result of individual files in individual list or 
files.

The script is as follows:

import glob
def speed_try():
    #OPENING THE DICTIONARY
    a4=open("/python27/Dictionaryfile","r").read()
    #CONVERTING DICTIONARY INTO WORDS
    a5=a4.lower().split()
    list1=[]
    for filename in glob.glob('/Python27/*.txt'):
        a1=open(filename,"r").read()
        a2=a1.lower()
        a3=a2.split()
        for word in a3:
            if word in a5:
                a6=a5.index(word)
                a7=a6+1
                a8=a5[a7]
                a9=word+"/"+a8
                list1.append(a9)
            elif word not in a5:
                list1.append(word)
            else:
                print "None"

    x1=list1
    x2=" ".join(x1)
    print x2

Till now, I have tried to experiment over the following solutions:

a) def speed_try():
       #OPENING THE DICTIONARY
       a4=open("/python27/Dictionaryfile","r").read()
       #CONVERTING DICTIONARY INTO WORDS
       a5=a4.lower().split()
       list1=[]
       for filename in glob.glob('/Python27/*.txt'):
           a1=open(filename,"r").read()
           a2=a1.lower()
           a3=a2.split()
           list1.append(a3)

       x1=list1
       print x1

Looks very close but I am unable to fit the if...elif...else part. 

b) import glob
   def multi_filehandle():
       list_of_files = glob.glob('/Python27/*.txt')
       for file_name in list_of_files:
           FI = open(file_name, 'r')
           FI1=FI.read().split()
           FO = open(file_name.replace('txt', 'out'), 'w')
           for line in FI:
               FO.write(line)

           FI.close()
           FO.close()

I could write output but failing to do processing of the files between opening 
and writing.

I am trying to get examples from fileinput.

If anyone of the learned members may kindly suggest how may I proceed.

I am using Python2.x on MS-Windows. 

The practices are scripts and not formal codes so I have not followed style 
guides.

Apology for any indentation error.

Thanking in advance.


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Which of two variants of code is better?

2016-11-21 Thread Victor Porton
Ned Batchelder wrote:

> On Monday, November 21, 2016 at 12:48:25 PM UTC-5, Victor Porton wrote:
>> Which of two variants of code to construct an "issue comment" object
>> (about BitBucket issue comments) is better?
>> 
>> 1.
>> 
>> obj = IssueComment(Issue(IssueGroup(repository, 'issues'), id1), id2)
>> 
>> or
>> 
>> 2.
>> 
>> list = [('issues', IssueGroup), (id1, Issue), (id2, IssueComment)]
>> obj = construct_subobject(repository, list)
>> 
>> (`construct_subobject` is to be defined in such as way that "1" and "2"
>> do the same.)
>> 
>> Would you advise me to make such function construct_subobject function or
>> just to use the direct coding as in "1"?
> 
> Neither of these seem very convenient. I don't know what an IssueGroup is,

It is a helper object which helps to paginate issues.

> so I don't know why I need to specify it.  To create a comment on an
> issue, why do I need id2, which seems to be the id of a comment?

It does not create a comment. It is preparing to load the comment.

> How about this:
> 
> obj = IssueComment(repo=repository, issue=id1)
> 
> or:
> 
> obj = repository.create_issue_comment(issue=id1)

Your code is too specialized. I want to make all my code following the same 
patterns. (And I am not going to define helper methods like yours, because 
we do not use it often enough to be worth of a specific method.)

-- 
Victor Porton - http://portonvictor.org
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Numpy slow at vector cross product?

2016-11-21 Thread BartC

On 21/11/2016 17:04, Nobody wrote:

On Mon, 21 Nov 2016 14:53:35 +, BartC wrote:


Also that the critical bits were not implemented in Python?


That is correct. You'll notice that there aren't any loops in numpy.cross.
It's just a wrapper around a bunch of vectorised operations (*, -, []).

If you aren't taking advantage of vectorisation, there's no reason to
expect numpy to be any faster than primitive operations, any more than
you'd expect

(numpy.array([1]) + numpy.array([2]))[0]

to be faster than "1+2".

Beyond that, you'd expect a generic function to be at a disadvantage
compared to a function which makes assumptions about its arguments.
Given what it does, I wouldn't expect numpy.cross() to be faster for
individual vectors if it was written in C.


The fastest I can get compiled, native code to do this is at 250 million 
cross-products per second.


The fastest using pure Python executed with Python 2.7 is 0.5 million 
per second.


With pypy, around 8 million per second. (Results will vary by machine, 
version, and OS so this is just one set of timings.)


So numpy, at 0.03 million per second [on a different OS and different 
version], has room for improvement I think!


(In all cases, the loop has been hobbled so that one component 
increments per loop, and one component of the result is summed and then 
displayed at the end.


This is to stop gcc, and partly pypy, from optimising the code out of 
existence; usually you are not calculating the same vector product 
repeatedly. Without the restraint, pypy leaps to 100 million per second, 
and gcc to an infinite number.)


The tests were with values assumed to be vectors, assumed to have 3 
components, and without any messing about with axes, whatever that code 
does. It's just a pure, streamlined, vector cross product (just as I use 
in my own code).


Such a streamlined version can also be written in Python. (Although it 
would be better with a dedicated 3-component vector type rather than a 
general purpose list or even numpy array.)

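For illustration, a minimal sketch of such a streamlined pure-Python version
(using plain length-3 sequences, no axis handling; not the exact code used for
the timings quoted here):

def cross3(a, b):
    # 3-component cross product of two plain sequences
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])
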

It's still a puzzle why directly executing the code that numpy uses was 
still faster than numpy itself, when both were run with CPython. Unless 
numpy is perhaps using extra wrappers around numpy.cross.


--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Which of two variants of code is better?

2016-11-21 Thread Ned Batchelder
On Monday, November 21, 2016 at 12:48:25 PM UTC-5, Victor Porton wrote:
> Which of two variants of code to construct an "issue comment" object (about 
> BitBucket issue comments) is better?
> 
> 1.
> 
> obj = IssueComment(Issue(IssueGroup(repository, 'issues'), id1), id2)
> 
> or
> 
> 2.
> 
> list = [('issues', IssueGroup), (id1, Issue), (id2, IssueComment)]
> obj = construct_subobject(repository, list)
> 
> (`construct_subobject` is to be defined in such as way that "1" and "2" do 
> the same.)
> 
> Would you advise me to make such function construct_subobject function or 
> just to use the direct coding as in "1"?

Neither of these seem very convenient. I don't know what an IssueGroup is,
so I don't know why I need to specify it.  To create a comment on an issue,
why do I need id2, which seems to be the id of a comment?

How about this:

obj = IssueComment(repo=repository, issue=id1)

or:

obj = repository.create_issue_comment(issue=id1)

--Ned.
-- 
https://mail.python.org/mailman/listinfo/python-list


Which of two variants of code is better?

2016-11-21 Thread Victor Porton
Which of two variants of code to construct an "issue comment" object (about 
BitBucket issue comments) is better?

1.

obj = IssueComment(Issue(IssueGroup(repository, 'issues'), id1), id2)

or

2.

list = [('issues', IssueGroup), (id1, Issue), (id2, IssueComment)]
obj = construct_subobject(repository, list)

(`construct_subobject` is to be defined in such as way that "1" and "2" do 
the same.)

Would you advise me to make such function construct_subobject function or 
just to use the direct coding as in "1"?

-- 
Victor Porton - http://portonvictor.org
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Numpy slow at vector cross product?

2016-11-21 Thread Nobody
On Mon, 21 Nov 2016 14:53:35 +, BartC wrote:

> Also that the critical bits were not implemented in Python?

That is correct. You'll notice that there aren't any loops in numpy.cross.
It's just a wrapper around a bunch of vectorised operations (*, -, []).

If you aren't taking advantage of vectorisation, there's no reason to
expect numpy to be any faster than primitive operations, any more than
you'd expect

(numpy.array([1]) + numpy.array([2]))[0]

to be faster than "1+2".

Beyond that, you'd expect a generic function to be at a disadvantage
compared to a function which makes assumptions about its arguments.
Given what it does, I wouldn't expect numpy.cross() to be faster for
individual vectors if it was written in C.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Numpy slow at vector cross product?

2016-11-21 Thread Skip Montanaro
Perhaps your implementation isn't as general as numpy's? I pulled out
the TestCross class from numpy.core.tests.test_numeric and replaced
calls to np.cross with calls to your function. I got an error in
test_broadcasting_shapes:

ValueError: operands could not be broadcast together with shapes (1,2) (5,)

Skip
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the difference between class Foo(): and class Date(object):

2016-11-21 Thread Veek M
Steve D'Aprano wrote:

> On Mon, 21 Nov 2016 11:15 pm, Veek M wrote:
> 
> class Foo():
>> ...  pass
>> ...
> class Bar(Foo):
>> ...  pass
>> ...
> b = Bar()
> type(b)
>> 
> [...]
> 
>> What is going on here? Shouldn't x = EuroDate();  type(x) give
>> 'instance'?? Why is 'b' an 'instance' and 'x' EuroDate?
>> Why isn't 'b' Bar?
> 
> 
> It looks like you are running Python 2, and have stumbled across an
> annoyance from the earliest days of Python: the "classic", or
> "old-style", class.
> 
> Before Python 2.2, custom classes and built-in types like int, float,
> dict and list were different. You have just discovered one of the ways
> they were different: instances of custom classes all had the same
> type, even if the class was different:
> 
> # Python 2
> 
> py> class Dog:
> ... pass
> ...
> py> class Cat:
> ... pass
> ...
> py> lassie = Dog()
> py> garfield = Cat()
> py> type(lassie) is type(garfield)
> True
> py> type(lassie)
> 
> 
> 
> This is just the most obvious difference between "classic classes" and
> types. Some of the other differences:
> 
> - The method resolution order (MRO) is different: the classic class
>   MRO is buggy for diamond-shaped multiple inheritance, and special
>   dunder methods like __eq__ are resolved slightly differently.
> 
> - super, properties, class methods and static methods don't work for
>   classic classes.
> 
> - The metaclass of classic classes is different:
> 
> py> type(Dog)
> 
> py> type(float)
> 
> 
> - Attribute lookup for classic classes is slightly different; in
>   particular, the special __getattribute__ method doesn't work.
> 
> 
> In Python 2.2, the built-in types (list, dict, float etc) were unified
> with the class mechanism, but for backwards compatibility the
> old-style classes had to be left in. So Python had two class
> mechanisms:
> 
> - "New-style classes", or types, inherit from object, or some
>   other built-in type, and support properties, etc.
> 
> - "Old-style classes", don't inherit from object, don't support
>   properties etc.
> 
> 
> So in Python 2, when you write:
> 
> class Foo:
> 
> or
> 
> class Foo():
> 
> 
> you get an old-style class. But when you inherit from object, you get
> a new-style class. Classic classes are an obsolete feature from Python
> 2. They are removed in Python 3, and things are much simpler. In
> Python 3, it doesn't matter whether you write:
> 
> class Foo:
> class Foo():
> class Foo(object):
> 
> the result is the same: a new-style class, or type.
> 
> The best thing to do in Python 2 is to always, without exception,
> write
> 
> class Foo(object):
> 
> to define your base classes. That will ensure that property, super,
> classmethod, staticmethod, __getattribute__, etc. will all work
> correctly, and you will avoid the surprises of classic classes.
> 
> 
> 
Thanks guys, got it!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Numpy slow at vector cross product?

2016-11-21 Thread BartC

On 21/11/2016 12:44, Peter Otten wrote:


After a look into the source this is no longer a big surprise (numpy 1.8.2):

if axis is not None:
axisa, axisb, axisc=(axis,)*3
a = asarray(a).swapaxes(axisa, 0)
b = asarray(b).swapaxes(axisb, 0)




The situation may be different when you process vectors in bulk, i. e.
instead of [cross(a, bb) for bb in b] just say cross(a, b).


I thought that numpy was supposed to be fast? Also that the critical 
bits were not implemented in Python?


Anyway I tried your code (put into a function as shown below) in the 
same test program, and it was *still* 3 times as fast as numpy! 
(mycross() was 3 times as fast as np.cross().)


Explain that...


---

def mycross(a,b,axisa=-1, axisb=-1, axisc=-1, axis=None):
if axis is not None:
axisa, axisb, axisc=(axis,)*3
a = np.asarray(a).swapaxes(axisa, 0)
b = np.asarray(b).swapaxes(axisb, 0)
msg = "incompatible dimensions for cross product\n"\
  "(dimension must be 2 or 3)"
if (a.shape[0] not in [2, 3]) or (b.shape[0] not in [2, 3]):
raise ValueError(msg)
if a.shape[0] == 2:
if (b.shape[0] == 2):
cp = a[0]*b[1] - a[1]*b[0]
if cp.ndim == 0:
return cp
else:
return cp.swapaxes(0, axisc)
else:
x = a[1]*b[2]
y = -a[0]*b[2]
z = a[0]*b[1] - a[1]*b[0]
elif a.shape[0] == 3:
if (b.shape[0] == 3):
x = a[1]*b[2] - a[2]*b[1]
y = a[2]*b[0] - a[0]*b[2]
z = a[0]*b[1] - a[1]*b[0]
else:
x = -a[2]*b[1]
y = a[2]*b[0]
z = a[0]*b[1] - a[1]*b[0]
cp = np.array([x, y, z])
if cp.ndim == 1:
return cp
else:
return cp.swapaxes(0, axisc)

---
Tested as:

x=np.array([1,2,3])
y=np.array([4,5,6])

start=time.clock()
for i in range(loops):
z=mycross(x,y)
print "Calc, %s loops: %.2g seconds" %(loops,time.clock()-start)


--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Numpy slow at vector cross product?

2016-11-21 Thread Steve D'Aprano
On Mon, 21 Nov 2016 11:09 pm, BartC wrote:

> On 21/11/2016 02:48, Steve D'Aprano wrote:
[...]
>> However, your code is not a great way of timing code. Timing code is
>> *very* difficult, and can be effected by many things, such as external
>> processes, CPU caches, even the function you use for getting the time.
>> Much of the time you are timing here will be in creating the range(loops)
>> list, especially if loops is big.
> 
> But both loops are the same size. And that overhead can quickly be
> disposed of by measuring empty loops in both cases. (On my machine,
> about 0.006/7 seconds for loops of 100,000.)

No, you cannot make that assumption, not in general. On modern machines, you
cannot assume that the time it takes to execute foo() immediately followed
by bar() is the same as the time it takes to execute foo() and bar()
separately.

Modern machines run multi-tasking operating systems, where there can be
other processes running. Depending on what you use as your timer, you may
be measuring the time that those other processes run. The OS can cache
frequently used pieces of code, which allows it to run faster. The CPU
itself will cache some code.

The shorter the code snippet, the more these complications are relevant. In
this particular case, we can be reasonably sure that the time it takes to
create a list range(1) and the overhead of the loop is *probably* quite
a small percentage of the time it takes to perform 10 vector
multiplications. But that's not a safe assumption for all code snippets.

This is why the timeit module exists: to do the right thing when it matters,
so that you don't have to think about whether or not it matters. The timeit
module works really really hard to get good quality, accurate timings,
minimizing any potential overhead.

The timeit module automates a bunch of tricky-to-get-right best practices for
timing code. Is that a problem?



>> The best way to time small snippets of code is to use the timeit module.
>> Open a terminal or shell (*not* the Python interactive interpreter, the
>> operating system's shell: you should expect a $ or % prompt) and run
>> timeit from that. Copy and paste the following two commands into your
>> shell prompt:
>>
>>
>> python2.7 -m timeit --repeat 5 -s "import numpy as np" \
>> -s "x = np.array([1, 2, 3])" -s "y = np.array([4, 5, 6])" \
>> -- "np.cross(x, y)"
[...]

> Yes, I can see that typing all the code out again, and remembering all
> those options and putting -s, -- and \ in all the right places, is a
> much better way of doing it! Not error prone at all.

Gosh Bart, how did you manage to write that sentence? How did you remember
all those words, and remember to put the punctuation marks in the right
places?

You even used sarcasm! You must be a genius. (Oh look, I can use it too.)

Seriously Bart? You've been a programmer for how many decades, and you can't
work out how to call a command from the shell? This is about working
effectively with your tools, and a basic understanding of the shell is an
essential tool for programmers.

This was a *simple* command. It was a LONG command, but don't be fooled by
the length, and the fact that it went over multiple lines, it was dirt
simple. I'm not saying that every programmer needs to be a greybeard Unix
guru (heaven knows that I'm not!), but they ought to be able to run simple
commands from the command line.

Those who don't are in the same position as carpenters who don't know the
differences between the various kinds of hammer or saws. Sure, you can
still do a lot of work using just one kind of hammer and one kind of saw,
but you'll work better, faster and smarter with the right kind. You don't
use a rip saw to make fine cuts, and you don't use a mash hammer to drive
tacks.

The -m option lets you run a module without knowing the precise location of
the source file. Some of the most useful commands to learn:

python -m unittest ...
python -m doctest ...
python -m timeit ...

The biggest advantage of calling the timeit module is that it will
automatically select the number of iterations you need to run to get good
timing results, without wasting time running excessive loops.

(The timeit module is *significantly* improved in Python 3.6, but even in
older versions its pretty good.)

But if you prefer doing it "old school" from within Python, then:

from timeit import Timer
t = Timer('np.cross(x, y)',  setup="""
import numpy as np
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
""")

# take five measurements of 10 calls each, and report the fastest
result = min(t.repeat(number=10, repeat=5))/10
print(result)  # time in seconds per call



Better?




-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the difference between class Foo(): and class Date(object):

2016-11-21 Thread Steve D'Aprano
On Mon, 21 Nov 2016 11:15 pm, Veek M wrote:

 class Foo():
> ...  pass
> ...
 class Bar(Foo):
> ...  pass
> ...
 b = Bar()
 type(b)
> 
[...] 

> What is going on here? Shouldn't x = EuroDate();  type(x) give
> 'instance'?? Why is 'b' an 'instance' and 'x' EuroDate?
> Why isn't 'b' Bar?


It looks like you are running Python 2, and have stumbled across an
annoyance from the earliest days of Python: the "classic", or "old-style",
class.

Before Python 2.2, custom classes and built-in types like int, float, dict
and list were different. You have just discovered one of the ways they were
different: instances of custom classes all had the same type, even if the
class was different:

# Python 2

py> class Dog:
... pass
...
py> class Cat:
... pass
...
py> lassie = Dog()
py> garfield = Cat()
py> type(lassie) is type(garfield)
True
py> type(lassie)
<type 'instance'>


This is just the most obvious difference between "classic classes" and
types. Some of the other differences:

- The method resolution order (MRO) is different: the classic class
  MRO is buggy for diamond-shaped multiple inheritance, and special
  dunder methods like __eq__ are resolved slightly differently.

- super, properties, class methods and static methods don't work for
  classic classes.

- The metaclass of classic classes is different:

py> type(Dog)
<type 'classobj'>
py> type(float)
<type 'type'>

- Attribute lookup for classic classes is slightly different; in
  particular, the special __getattribute__ method doesn't work.


In Python 2.2, the built-in types (list, dict, float etc) were unified with
the class mechanism, but for backwards compatibility the old-style classes
had to be left in. So Python had two class mechanisms:

- "New-style classes", or types, inherit from object, or some 
  other built-in type, and support properties, etc.

- "Old-style classes", don't inherit from object, don't support
  properties etc.


So in Python 2, when you write:

class Foo:

or

class Foo():


you get an old-style class. But when you inherit from object, you get a
new-style class. Classic classes are an obsolete feature from Python 2.
They are removed in Python 3, and things are much simpler. In Python 3, it
doesn't matter whether you write:

class Foo:
class Foo():
class Foo(object):

the result is the same: a new-style class, or type.

The best thing to do in Python 2 is to always, without exception, write

class Foo(object):

to define your base classes. That will ensure that property, super,
classmethod, staticmethod, __getattribute__, etc. will all work correctly,
and you will avoid the surprises of classic classes.
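
For comparison, the same experiment with a new-style class behaves as you
would expect (Python 2):

py> class Dog(object):
...     pass
...
py> lassie = Dog()
py> type(lassie)
<class '__main__.Dog'>
py> type(lassie) is Dog
True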



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Numpy slow at vector cross product?

2016-11-21 Thread eryk sun
On Mon, Nov 21, 2016 at 1:38 AM, BartC  wrote:
> On 20/11/2016 20:46, DFS wrote:
>>
>> import sys, time, numpy as np
>> loops=int(sys.argv[1])
>>
>> x=np.array([1,2,3])
>> y=np.array([4,5,6])
>> start=time.clock()

In Unix, time.clock doesn't measure wall-clock time, but rather an
approximation to the CPU time used by the current process. On the
other hand, time.time calls gettimeofday, if available, which has a
resolution of 1 microsecond. Python 2 timing tests should use
time.time on Unix.

In Windows, time.time calls GetSystemTimeAsFileTime, which has a
default resolution of only about 15 ms, adjustable down to about 1 ms.
On other hand, time.clock calls QueryPerformanceCounter, which has a
resolution of about 100 nanoseconds. Python 2 timing tests should use
time.clock on Windows.

In Python 3.3+, timing tests should use time.perf_counter. In Linux
this calls clock_gettime using a monotonic clock with a resolution of
1 nanosecond, and in Windows it calls QueryPerformanceCounter.

In any case, timeit.default_timer selects the best function to call
for a given platform.
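
For example, a small sketch of portable timing via timeit.default_timer
(the workload is just a stand-in):

    from timeit import default_timer   # time.clock on Windows, time.time on
                                        # Unix for Python 2; time.perf_counter
                                        # on Python 3.3+

    start = default_timer()
    total = sum(i * i for i in range(100000))   # stand-in workload
    print("stand-in workload: %.6f seconds" % (default_timer() - start))
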
-- 
https://mail.python.org/mailman/listinfo/python-list


ANN: eGenix PyRun - One file Python Runtime 2.2.3

2016-11-21 Thread eGenix Team: M.-A. Lemburg


ANNOUNCING

eGenix PyRun - One file Python Runtime

Version 2.2.3


   An easy-to-use single file relocatable Python run-time -
  available for Linux, Mac OS X and Unix platforms,
  with support for Python 2.6, 2.7, 3.4 and
* also for Python 3.5 *


This announcement is also available on our web-site for online reading:
http://www.egenix.com/company/news/eGenix-PyRun-2.2.3-GA.html



INTRODUCTION

eGenix PyRun is our open source, one file, no installation version of
Python, making the distribution of a Python interpreter to run based
scripts and applications to Unix based systems as simple as copying a
single file.

eGenix PyRun's executable only needs 11MB for Python 2 and 13MB for
Python 3, but still supports most Python applications and scripts - and
it can be compressed to just 3-4MB using upx, if needed.

Compared to a regular Python installation of typically 100MB on disk,
eGenix PyRun is ideal for applications and scripts that need to be
distributed to several target machines, client installations or
customers.

It makes "installing" Python on a Unix based system as simple as
copying a single file.

eGenix has been using eGenix PyRun internally in the mxODBC Connect
Server product since 2008 with great success and decided to make it
available as a stand-alone open-source product.

We provide both the source archive to build your own eGenix PyRun, as
well as pre-compiled binaries for Linux, FreeBSD and Mac OS X, as 32-
and 64-bit versions. The binaries can be downloaded manually, or you
can let our automatic install script install-pyrun take care of the
installation: ./install-pyrun dir and you're done.

Please see the product page for more details:

http://www.egenix.com/products/python/PyRun/



NEWS

This minor level release of eGenix PyRun comes with the following
enhancements:

Enhancements / Changes
--

 * Removed lzma module from PyRun for Python 3.x again, since this
   caused too many issues with incompatible/missing liblzma.so
   references. The module is still being built as optional add-on and
   can be used if the necessary libs are available, but it will no
   longer prevent PyRun from working altogether.

install-pyrun Quick Install Enhancements
-

eGenix PyRun includes a shell script called install-pyrun, which
greatly simplifies installation of PyRun. It works much like the
virtualenv shell script used for creating new virtual environments
(except that there's nothing virtual about PyRun environments).

https://downloads.egenix.com/python/install-pyrun

With the script, an eGenix PyRun installation is as simple as running:

./install-pyrun targetdir

This will automatically detect the platform, download and install the
right pyrun version into targetdir.

We have updated this script since the last release:

 * Updated install-pyrun to default to eGenix PyRun 2.2.3 and its
   feature set.

For a complete list of changes, please see the eGenix PyRun Changelog:

http://www.egenix.com/products/python/PyRun/changelog.html



LICENSE

eGenix PyRun is distributed under the eGenix.com Public License 1.1.0
which is an Open Source license similar to the Python license. You can
use eGenix PyRun in both commercial and non-commercial settings
without fee or charge.

Please see our license page for more details:

http://www.egenix.com/products/python/PyRun/license.html

The package comes with full source code.



DOWNLOADS

The download archives and instructions for installing eGenix PyRun can
be found at:

http://www.egenix.com/products/python/PyRun/

As always, we are providing pre-built binaries for all common
platforms: Windows 32/64-bit, Linux 32/64-bit, FreeBSD 32/64-bit, Mac
OS X 32/64-bit. Source code archives are available for installation on
other platforms, such as Solaris, AIX, HP-UX, etc.

___

SUPPORT

Commercial support for this product is available from eGenix.com.
Please see

http://www.egenix.com/services/support/

for details about our support offerings.



MORE INFORMATION

For more information about eGenix PyRun, licensing and download
instructions, please visit our web-site:

http://www.egenix.com/products/python/PyRun/


About eGenix (http://www.egenix.com/):

eGenix is a Python software project, consulting and product
company delivering expert services and professional quality
products f

Re: Generic dictionary

2016-11-21 Thread Peter Otten
Thorsten Kampe wrote:

>> def GenericDict(dict_or_items):
>>     if isinstance(dict_or_items, dict):
>>         return dict(dict_or_items)
>>     else:
>>         return SimpleGenericDictWithOnlyTheFalseBranchesImplemented(
>>             dict_or_items
>>         )
> 
> That would be a kind of factory function for the class?

Yes, one if...else to pick a dedicated class instead of one if...else per 
method in the all-things-to-all-people class.
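
A rough sketch of what I mean, with made-up class names (MappingView and
ItemsView stand in for whatever the real implementations look like):

    class MappingView(object):
        """Backed by a real dict."""
        def __init__(self, mapping):
            self._data = dict(mapping)
        def get(self, key, default=None):
            return self._data.get(key, default)

    class ItemsView(object):
        """Backed by a list of (key, value) pairs; keys may repeat."""
        def __init__(self, items):
            self._data = list(items)
        def get(self, key, default=None):
            for k, v in self._data:
                if k == key:
                    return v
            return default

    def GenericDict(dict_or_items):
        # The only isinstance check lives here, not in every method.
        if isinstance(dict_or_items, dict):
            return MappingView(dict_or_items)
        return ItemsView(dict_or_items)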


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Setting the exit status from sys.excepthook

2016-11-21 Thread Peter Otten
Steven D'Aprano wrote:

> I have a script with an exception handler that takes care of writing the
> traceback to syslog, and I set it as the global exceptionhook:
> 
> sys.excepthook = my_error_handler
> 
> 
> When my script raises, my_error_handler is called, as expected, and the
> process exits with status 1.
> 
> How can I change the exit status to another value, but only for exceptions
> handled by my_error_handler?

Why not just put

try:
   ...
except:
   ...

around the main function?

That said, it looks like you can exit() from the errorhandler:

$ cat bend_exit.py
import sys

def my_error_handler(etype, exception, traceback):
    print("unhandled", exception)
    sys.exit(2)

sys.excepthook = my_error_handler

1/int(sys.argv[1])
print("bye")
$ python3 bend_exit.py 0; echo $?
unhandled division by zero
2
$ python3 bend_exit.py 1; echo $?
bye
0

If "bad things" can happen I don't know...

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Numpy slow at vector cross product?

2016-11-21 Thread Peter Otten
Steve D'Aprano wrote:

> On Mon, 21 Nov 2016 07:46 am, DFS wrote:
> 
>> import sys, time, numpy as np
>> loops=int(sys.argv[1])
>> 
>> x=np.array([1,2,3])
>> y=np.array([4,5,6])
>> start=time.clock()
>> for i in range(loops):
>>  np.cross(x,y)
>> print "Numpy, %s loops: %.2g seconds" %(loops,time.clock()-start)
> 
> [...]
>> $ python vector_cross.py
>> Numpy, 10 loops: 2.5 seconds
>> Calc,  10 loops: 0.13 seconds
>> 
>> 
>> Did I do something wrong, or is numpy slow at this?
> 
> I can confirm similar results.
> 
> However, your code is not a great way of timing code. Timing code is
> *very* difficult, and can be affected by many things, such as external
> processes, CPU caches, even the function you use for getting the time.
> Much of the time you are timing here will be in creating the range(loops)
> list, especially if loops is big.
> 
> The best way to time small snippets of code is to use the timeit module.
> Open a terminal or shell (*not* the Python interactive interpreter, the
> operating system's shell: you should expect a $ or % prompt) and run
> timeit from that. Copy and paste the following two commands into your
> shell prompt:
> 
> 
> python2.7 -m timeit --repeat 5 -s "import numpy as np" \
> -s "x = np.array([1, 2, 3])" -s "y = np.array([4, 5, 6])" \
> -- "np.cross(x, y)"
> 
> 
> python2.7 -m timeit --repeat 5 -s "x = [1, 2, 3]" \
> -s "y = [4, 5, 6]" -s "z = [0, 0, 0]" \
> -- "z[0] = x[1]*y[2] - x[2]*y[1]; z[1] = x[2]*y[0] - \
> x[0]*y[2]; z[2] = x[0]*y[1] - x[1]*y[0]"
> 
> 
> The results I get are:
> 
> 1 loops, best of 5: 30 usec per loop
> 
> 100 loops, best of 5: 1.23 usec per loop
> 
> 
> So on my machine, np.cross() is about 25 times slower than multiplying by
> hand.

After a look into the source this is no longer a big surprise (numpy 1.8.2):

    if axis is not None:
        axisa, axisb, axisc=(axis,)*3
    a = asarray(a).swapaxes(axisa, 0)
    b = asarray(b).swapaxes(axisb, 0)
    msg = "incompatible dimensions for cross product\n"\
          "(dimension must be 2 or 3)"
    if (a.shape[0] not in [2, 3]) or (b.shape[0] not in [2, 3]):
        raise ValueError(msg)
    if a.shape[0] == 2:
        if (b.shape[0] == 2):
            cp = a[0]*b[1] - a[1]*b[0]
            if cp.ndim == 0:
                return cp
            else:
                return cp.swapaxes(0, axisc)
        else:
            x = a[1]*b[2]
            y = -a[0]*b[2]
            z = a[0]*b[1] - a[1]*b[0]
    elif a.shape[0] == 3:
        if (b.shape[0] == 3):
            x = a[1]*b[2] - a[2]*b[1]
            y = a[2]*b[0] - a[0]*b[2]
            z = a[0]*b[1] - a[1]*b[0]
        else:
            x = -a[2]*b[1]
            y = a[2]*b[0]
            z = a[0]*b[1] - a[1]*b[0]
    cp = array([x, y, z])
    if cp.ndim == 1:
        return cp
    else:
        return cp.swapaxes(0, axisc)

The situation may be different when you process vectors in bulk, i. e. 
instead of [cross(a, bb) for bb in b] just say cross(a, b).
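
A rough sketch of that bulk case (array sizes and timings are only
indicative):

    import numpy as np
    from timeit import default_timer

    a = np.array([1.0, 2.0, 3.0])
    b = np.random.rand(10000, 3)        # 10,000 vectors to cross with a

    start = default_timer()
    looped = np.array([np.cross(a, bb) for bb in b])
    t_loop = default_timer() - start

    start = default_timer()
    bulk = np.cross(a, b)               # one call, broadcast over the rows of b
    t_bulk = default_timer() - start

    print("loop: %.3fs  bulk: %.4fs  same result: %s"
          % (t_loop, t_bulk, np.allclose(looped, bulk)))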


-- 
https://mail.python.org/mailman/listinfo/python-list


What is the difference between class Foo(): and class Date(object):

2016-11-21 Thread Veek M
>>> class Foo():
...  pass
... 
>>> class Bar(Foo):
...  pass
... 
>>> b = Bar()
>>> type(b)
<type 'instance'>

>>> class Date(object):
...  pass
... 
>>> class EuroDate(Date):
...  pass
... 
>>> x = EuroDate()
>>> type(x)
<class '__main__.EuroDate'>


What is going on here? Shouldn't x = EuroDate();  type(x) give 
'instance'?? Why is 'b' an 'instance' and 'x' EuroDate?
Why isn't 'b' Bar?


Guhh! (I am reading @classmethod from Beazley - I know I have two open 
threads but I'll get to that - will require even more reading)


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Numpy slow at vector cross product?

2016-11-21 Thread BartC

On 21/11/2016 02:48, Steve D'Aprano wrote:

On Mon, 21 Nov 2016 07:46 am, DFS wrote:



start=time.clock()
for i in range(loops):
 np.cross(x,y)
print "Numpy, %s loops: %.2g seconds" %(loops,time.clock()-start)



However, your code is not a great way of timing code. Timing code is *very*
difficult, and can be affected by many things, such as external processes,
CPU caches, even the function you use for getting the time. Much of the
time you are timing here will be in creating the range(loops) list,
especially if loops is big.


But both loops are the same size. And that overhead can quickly be 
disposed of by measuring empty loops in both cases. (On my machine, 
about 0.006-0.007 seconds for loops of 100,000.)



The best way to time small snippets of code is to use the timeit module.
Open a terminal or shell (*not* the Python interactive interpreter, the
operating system's shell: you should expect a $ or % prompt) and run timeit
from that. Copy and paste the following two commands into your shell
prompt:


python2.7 -m timeit --repeat 5 -s "import numpy as np" \
-s "x = np.array([1, 2, 3])" -s "y = np.array([4, 5, 6])" \
-- "np.cross(x, y)"


python2.7 -m timeit --repeat 5 -s "x = [1, 2, 3]" \
-s "y = [4, 5, 6]" -s "z = [0, 0, 0]" \
-- "z[0] = x[1]*y[2] - x[2]*y[1]; z[1] = x[2]*y[0] - \
x[0]*y[2]; z[2] = x[0]*y[1] - x[1]*y[0]"


Yes, I can see that typing all the code out again, and remembering all 
those options and putting -s, -- and \ in all the right places, is a 
much better way of doing it! Not error prone at all.
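
For what it's worth, here is a sketch of the same comparison driven from a
script via timeit.repeat, so the options and quoting only have to be written
once (the loop counts are guesses):

    import timeit

    numpy_setup = ("import numpy as np; "
                   "x = np.array([1, 2, 3]); y = np.array([4, 5, 6])")
    plain_setup = "x = [1, 2, 3]; y = [4, 5, 6]; z = [0, 0, 0]"
    plain_stmt = ("z[0] = x[1]*y[2] - x[2]*y[1]; "
                  "z[1] = x[2]*y[0] - x[0]*y[2]; "
                  "z[2] = x[0]*y[1] - x[1]*y[0]")

    for label, setup, stmt, number in [
            ("numpy", numpy_setup, "np.cross(x, y)", 10000),
            ("plain", plain_setup, plain_stmt, 1000000)]:
        best = min(timeit.repeat(stmt, setup, repeat=5, number=number))
        print("%s: %.3g usec per loop" % (label, best / number * 1e6))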


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Clean way to return error codes

2016-11-21 Thread Peter Otten
Steven D'Aprano wrote:

> I have a script that can be broken up into four subtasks. If any of those
> subtasks fail, I wish to exit with a different exit code and error.
> 
> Assume that the script is going to be run by system administrators who
> know no Python and are terrified of tracebacks, and that I'm logging the
> full traceback elsewhere (not shown).
> 
> I have something like this:
> 
> 
> try:
>     begin()
> except BeginError:
>     print("error in begin")
>     sys.exit(3)
> 
> try:
>     cur = get_cur()
> except FooError:
>     print("failed to get cur")
>     sys.exit(17)
> 
> try:
>     result = process(cur)
>     print(result)
> except (FooError, BarError):
>     print("error in processing")
>     sys.exit(12)
> 
> try:
>     cleanup()
> except BazError:
>     print("cleanup failed")
>     sys.exit(8)
> 
> 
> 
> It's not awful, but I don't really like the look of all those try...except
> blocks. Is there something cleaner I can do, or do I just have to suck it
> up?

def run():
    # Each yield announces how the statements up to the next yield
    # should be reported if they fail.
    yield BeginError, "error in begin", 3
    begin()

    yield FooError, "failed to get cur", 17
    cur = get_cur()

    yield (FooError, BarError), "error in processing", 12
    result = process(cur)
    print(result)

    yield BazError, "cleanup failed", 8
    cleanup()

try:
    # Driving the generator runs the steps; when one raises, the most
    # recently yielded Errors/message/exitcode are still bound here.
    for Errors, message, exitcode in run():
        pass
except Errors:
    print(message)
    sys.exit(exitcode)


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Guido? Where are you?

2016-11-21 Thread Ethan Furman

For those new to the list:

Thomas 'PointedEars' Lahn 

has been banned from this mailing list for (at least) the rest of this year.  
Unfortunately, we do not have any control over the comp.lang.python newsgroup.  
If you access Python List from the newsgroup you may want to add him to your 
personal kill-file so you don't have to see his posts.

--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list


Re: Guido? Where are you?

2016-11-21 Thread Bob Martin
in 767657 20161121 041134 Thomas 'PointedEars' Lahn  wrote:
>Tristan B. Kildaire wrote:
>
>> Is Guido active on this newsgroup.
>
>That is not even a question.
>
>> Sorry for the off-topic ness.
>
>There is no excuse for (such) stupidity.

Stop posting then.

>
><http://catb.org/esr/faqs/smart-questions.html>
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Clean way to return error codes

2016-11-21 Thread Chris Angelico
On Mon, Nov 21, 2016 at 6:09 PM, Steven D'Aprano
 wrote:
> try:
>     begin()
> except BeginError:
>     print("error in begin")
>     sys.exit(3)

Do you control the errors that are getting thrown?

class BeginExit(SystemExit, BeginError): pass

It'll behave like SystemExit, but still be catchable as BeginError.
(Or if BeginError isn't used anywhere else, it can itself be redefined
to inherit from SystemExit.)
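
A quick sketch of how that behaves (BeginError here is a stand-in for
whatever the real script defines):

    import sys

    class BeginError(Exception):
        pass

    class BeginExit(SystemExit, BeginError):
        pass

    def begin():
        raise BeginExit(3)          # acts like sys.exit(3) if nobody catches it

    try:
        begin()
    except BeginError as exc:       # still catchable as BeginError
        print("caught, exit status would have been %s" % exc.code)

    begin()                         # uncaught: process exits with status 3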

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Setting the exit status from sys.excepthook

2016-11-21 Thread Steven D'Aprano
I have a script with an exception handler that takes care of writing the 
traceback to syslog, and I set it as the global exceptionhook:

sys.excepthook = my_error_handler


When my script raises, my_error_handler is called, as expected, and the process 
exits with status 1.

How can I change the exit status to another value, but only for exceptions 
handled by my_error_handler?



-- 
Steven
299792.458 km/s — not just a good idea, it’s the law!

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Clean way to return error codes

2016-11-21 Thread Steven D'Aprano
On Monday 21 November 2016 19:01, Ben Finney wrote:

[...]
> The next improvement I'd make is to use the “strategy” pattern, and come
> up with some common key that determines what exit status you want. Maybe
> the key is a tuple of (function, exception):
> 
> exit_status_by_problem_key = {
>     (begin, BeginError): 3,
>     (get_cur, FooError): 17,
>     (process, FooError): 12,
>     (process, BarError): 12,
>     (cleanup, BazError): 8,
> }

Indeed. Probably not worth the overhead in extra complexity for just four sub-
tasks, but worth keeping in mind.


-- 
Steven
299792.458 km/s — not just a good idea, it’s the law!

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Clean way to return error codes

2016-11-21 Thread Steven D'Aprano
On Monday 21 November 2016 18:39, Jussi Piitulainen wrote:

> Steven D'Aprano writes:
[...]
>> It's not awful, but I don't really like the look of all those
>> try...except blocks. Is there something cleaner I can do, or do I just
>> have to suck it up?
> 
> Have the exception objects carry the message and the exit code?
> 
> try:
>     begin()
>     cur = get_cur()
>     result = process(cur)
>     print(result)
>     cleanup()
> except (BeginError, FooError, BarError, BazError) as exn:
>     print("Steven's script:", message(exn))
>     sys.exit(code(exn))

Oooh, nice!
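
A possible shape for those exceptions - just a sketch, with message() and
code() replaced by an attribute (ScriptError is a made-up base class):

    import sys

    class ScriptError(Exception):
        """Base class for errors that carry their own exit status."""
        code = 1

    class BeginError(ScriptError):
        code = 3

    class FooError(ScriptError):
        code = 17

    def begin():
        raise BeginError("error in begin")

    try:
        begin()
    except ScriptError as exn:
        print("Steven's script: %s" % exn)
        sys.exit(exn.code)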



-- 
Steven
299792.458 km/s — not just a good idea, it’s the law!

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Clean way to return error codes

2016-11-21 Thread Ben Finney
Steven D'Aprano  writes:

> I have a script that can be broken up into four subtasks. If any of those 
> subtasks fail, I wish to exit with a different exit code and error.
>
> Assume that the script is going to be run by system administrators who
> know no Python and are terrified of tracebacks, and that I'm logging
> the full traceback elsewhere (not shown).

The first improvement I'd make (and you likely already know this one,
but for the benefit of later readers):

    try:
        cur = get_cur()
    except FooError as exc:
        print("failed to get cur: {message}".format(message=exc))
        raise SystemExit(17) from exc

That preserves the full traceback for anything that instruments that
program to try to debug what went wrong. It will also emit the message
from the underlying problem, for the user who only sees the program
output.


The next improvement I'd make is to use the “strategy” pattern, and come
up with some common key that determines what exit status you want. Maybe
the key is a tuple of (function, exception):

    exit_status_by_problem_key = {
        (begin, BeginError): 3,
        (get_cur, FooError): 17,
        (process, FooError): 12,
        (process, BarError): 12,
        (cleanup, BazError): 8,
    }

Or you might need to define the strategy as (message_template,
exit_status):

    exit_status_by_problem_key = {
        (begin, BeginError): ("error in begin: {message}", 3),
        (get_cur, FooError): ("failed to get cur: {message}", 17),
        (process, FooError): ("error in processing: {message}", 12),
        (process, BarError): ("error in processing: {message}", 12),
        (cleanup, BazError): ("cleanup failed: {message}", 8),
    }
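
One possible way to drive that table - just a sketch: it assumes the step
functions and exception classes from the original script exist, and it
glosses over passing cur/result between the steps:

    import sys

    def run(steps, table):
        for func in steps:
            try:
                func()
            except Exception as exc:
                for (f, exc_type), (template, status) in table.items():
                    if f is func and isinstance(exc, exc_type):
                        print(template.format(message=exc))
                        sys.exit(status)
                raise       # no strategy registered: let the traceback escape

    run([begin, get_cur, process, cleanup], exit_status_by_problem_key)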

-- 
 \  “God forbid that any book should be banned. The practice is as |
  `\  indefensible as infanticide.” —Dame Rebecca West |
_o__)  |
Ben Finney

-- 
https://mail.python.org/mailman/listinfo/python-list