Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Raymond Hettinger
[Steve Howell]
> Why wouldn't you get a competent C programmer simply make
> list_ass_slice smart enough to make list.pop(0) O(1)?

When this suggestion was discussed on python-dev years ago,
it was rejected.  One reason is that it was somewhat
common for C code to access the list data structure directly
(bypassing API accessor functions).  Changing the list to have
a starting offset would break existing C extensions.

Another reason is that Guido is intolerant of space or time
trade-offs for lists and tuples because they pervade the language
and are heavily used internally.  Any additional space or time
requirement however small would impact the language performance
as a whole.  FWIW, that is also the reason that lists are not
weak-referenceable (it would cost one extra pointer field per
instance and that wasn't deemed to be worth it).


> The brilliant computer scientist, Christian Heimes, provides the
> answers, and I am paraphrasing here, of course:

IMHO, Christian IS a brilliant computer scientist, so I'll ignore
the rude intention and take the sentence literally.


>   1) You can save 64 bits for every list by not storing an extra
> pointer!
>   2) You can simplify the CPython implementation!
>   3) You can avoid the oh-so-expensive check "if ilow == 1" for all
> operations that don't need this optimization!
>
> Sounds like two micro-optimizations to me (and a copout to boot).

Micro or not, these reasons were part of Guido's rationale.
Tim Peters and I also participated in the conversation and did not
disagree.

So, collections.deque() was born and the problem stopped being an
issue.
Also, Daniel Stutzbach has published a blist implementation that offers
high-performance insertions and deletions and fast handling of sparse
lists.
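The asymptotic difference that deque solved is easy to demonstrate with
timeit; this is an illustrative sketch, and the timings vary by machine
and Python version:

```python
from timeit import timeit

n = 20000

# list.pop(0) shifts every remaining element down with memmove: O(n) per pop.
t_list = timeit('while lst: lst.pop(0)',
                setup='lst = list(range(%d))' % n, number=1)

# deque.popleft() unlinks from the head of a block list: O(1) per pop.
t_deque = timeit('while dq: dq.popleft()',
                 setup='from collections import deque; dq = deque(range(%d))' % n,
                 number=1)

print(t_list, t_deque)   # emptying the deque is far cheaper
```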


Raymond
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to generate random numbers that satisfy certain distribution

2010-01-23 Thread Steven D'Aprano
On Sat, 23 Jan 2010 14:10:10 -0800, Paul Rubin wrote:

> Steven D'Aprano  writes:
>> The Box-Muller transform is reasonably simple, but you can't be serious
>> that it is simpler than adding twelve random numbers and subtracting
>> six!
> 
> If you want a normal distribution, using the Box-Muller transform is
> simpler because it spares you the complication of figuring out whether
> the 12-trial binomial approximation is close enough to produce reliable
> results for your specific application, which you obviously have to do if
> you are using the approximation for anything serious.

But using Box-Muller gives you the complication of figuring out whether 
you care about thread safety, which you have to do if you're doing 
anything serious. (See the comments in the random module for the gauss 
method.)


> It also involves
> writing less code than that list comprehension, since it is already
> implemented in the random module so you can just call it directly.

By that logic, the Linux kernel is simpler than a function to add one to 
the argument, because the Linux kernel has already been implemented but 
you have to write your own add_one function.

(Except in Forth, which usually comes with a word to add one to the 
number at the top of the stack.)

We can agree that, given that the random module already has two normal 
distributions, there's no real point in using the binomial approximation. 
Everything else is quibbling.

By the way, does anyone know why there are no Poisson random numbers in 
the module? The implementation is quite simple (but not as simple as the 
Linux kernel *wink*):


import math
import random

def poisson(lambda_=1):
    # Knuth's method: multiply uniforms until the product drops
    # below exp(-lambda_).
    L = math.exp(-lambda_)
    k = 0
    p = 1
    while 1:
        k += 1
        p *= random.random()
        if p <= L:
            break
    return k - 1


although this can be improved upon for large values of lambda_.
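One common improvement for large lambda_ (a sketch, not from the thread;
`poisson_large` is an illustrative name): the product of uniforms
underflows once exp(-lambda_) rounds to 0.0, somewhere past
lambda_ ~ 745, so sum exponential inter-arrival times instead.

```python
import random

def poisson_large(lam):
    # Sum exponential inter-arrival times of a rate-1 process until they
    # pass lam; the number of complete arrivals is Poisson(lam).  Same
    # idea as multiplying uniforms, but done in "log space", so it does
    # not underflow when exp(-lam) would round to 0.0.
    total = 0.0
    k = 0
    while True:
        total += random.expovariate(1.0)
        if total > lam:
            return k
        k += 1

print(poisson_large(1000.0))   # a sample; the product method fails here
```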





-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: this customize sort did not work ,what's wrong?

2010-01-23 Thread Stephen Hansen
On Sat, Jan 23, 2010 at 12:57 PM, thinke365  wrote:

> def sort_by_list(E1, E2):
>     print len(E1), len(E2)
>     return len(list(E1)) > len(list(E2))
>
> l.sort(cmp=sort_by_list)
>

The cmp function is defined as returning a negative number, zero, or a positive number (conventionally -1, 0 and 1).

You are returning true or false, so things aren't getting sorted because
you're not giving cmp what it is looking for. You could rewrite that as
return cmp(len(list(E1)), len(list(E2))) if you want. However, cmp is going
away in Py3.

You can just do, instead:

l.sort(key=len)

And you'll get precisely what you're looking for.
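For the concrete list from the original post, that looks like this
(illustrative):

```python
l = [[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]
l.sort(key=len)   # each element is compared by its length, computed once
print(l)          # [[1, 2], [1, 2, 3, 4], [1, 2, 3, 4, 5]]
```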

--S
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Roy Smith
In article ,
 Terry Reedy  wrote:

> >> The page you should probably be looking at is
> >> http://wiki.python.org/moin/TimeComplexity
> >
> > I was not aware of this page; thanks for pointing it out.
> 
> Perhaps you could suggest on the tracker a place or places in the doc 
> where this relatively new wiki page could be referred to. Perhaps in the 
> introductory paragraphs of the Built-in Type section of the lib ref. 
> Where would you have liked to have found it?

I think the most logical place would have been right at the table of 
operations (http://tinyurl.com/cdbwog).
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: this customize sort did not work ,what's wrong?

2010-01-23 Thread Terry Reedy

On 1/23/2010 3:57 PM, thinke365 wrote:


l = list()
l1 = list((1, 2, 3, 4))
l2 = list((1,2))
l3 = list((1, 2, 3, 4, 5))
l.append(l1)
l.append(l2)
l.append(l3)
print l

def sort_by_list(E1, E2):
    print len(E1), len(E2)
    return len(list(E1)) > len(list(E2))

l.sort(cmp=sort_by_list)
print l

output:
[[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]
2 4
5 2
[[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]

the order of the elements in the list did not change!


Here is what is right!

>>> l = [ (1,2,3), (1,2), (1,2,3,4,5), () ]
>>> l.sort(key=len)
>>> l
[(), (1, 2), (1, 2, 3), (1, 2, 3, 4, 5)]

The cmp param is gone in 3.x in favor of key. Use the latter.

Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: how can i know if a python object have a attribute such as 'attr1'?

2010-01-23 Thread Terry Reedy

On 1/23/2010 10:56 AM, Arnaud Delobelle wrote:

thinke365  writes:


for example, i may define a python class:
class A:
    def sayHello(self):
        print 'hello'

a = A()
a.attr1 = 'hello'
a.attr2 = 'bb'

b = A()
a.attr2 = 'aa'

how can i know whether an object have an attribute named attr1?


hasattr(a, 'attr1')


or
try: a.attr1
except AttributeError: pass
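Both idioms side by side, plus getattr with a default, which is often
the most convenient of the three (illustrative sketch):

```python
class A(object):
    pass

a = A()
a.attr1 = 'hello'

# LBYL: look before you leap
print(hasattr(a, 'attr1'))              # True

# EAFP: just try the access and handle the failure
try:
    value = a.attr2
except AttributeError:
    value = None

# Or supply a default in a single call
print(getattr(a, 'attr2', 'missing'))   # missing
```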

Terry Jan Reedy




--
http://mail.python.org/mailman/listinfo/python-list


Re: enumerated while loop

2010-01-23 Thread Terry Reedy

On 1/23/2010 9:44 AM, Roald de Vries wrote:

Dear all,

I sometimes want to use an infinite while loop with access to the loop
index, like this:

def naturals():
    i = 0
    while True:
        yield i
        i += 1

for i in naturals():
    print(i)

I assume a function like 'naturals' already exists, or a similar
construction for the same purpose. But what is it called?


itertools.count
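For example (a sketch; itertools.count also accepts optional start and
step arguments):

```python
import itertools

# itertools.count(start=0, step=1) yields start, start+step, ... forever,
# replacing the hand-rolled naturals() generator.
for i in itertools.count():
    print(i)
    if i >= 3:       # an infinite iterator needs its own stopping condition
        break
```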

--
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Terry Reedy

On 1/23/2010 2:02 PM, Roy Smith wrote:

In article,
  Duncan Booth  wrote:


Roy Smith  wrote:


I'm looking at http://tinyurl.com/cdbwog.  It shows all the operations
of a list.  What it does not show is their cost.  For pop(), it has a
note:

"The pop() method is only supported by the list and array types. The
optional argument i defaults to -1, so that by default the last item
is removed and returned".


The page you should probably be looking at is
http://wiki.python.org/moin/TimeComplexity


I was not aware of this page; thanks for pointing it out.


Perhaps you could suggest on the tracker a place or places in the doc 
where this relatively new wiki page could be referred to. Perhaps in the 
introductory paragraphs of the Built-in Type section of the lib ref. 
Where would you have liked to have found it?


The page was added in response to threads like this one, but it 
obviously is more obscure than it should be.


Terry Jan Reedy


--
http://mail.python.org/mailman/listinfo/python-list


Re: this customize sort did not work ,what's wrong?

2010-01-23 Thread MRAB

thinke365 wrote:

jesus, now i fixed it, using odd lambda sort.
l.sort(lambda x,y: cmp(len(x), len(y)))
print l

BUT I AM STILL CONFUSED WHY COSTOMIZED SORT FAILED TO SORT AS IT IS 
PROGRAMMER!



thinke365 wrote:

i mean the output i want is:
[ [1, 2], [1, 2, 3, 4], [1, 2, 3, 4, 5]], that is sort according to the
length of the list element

thinke365 wrote:

l = list()
l1 = list((1, 2, 3, 4))
l2 = list((1,2))
l3 = list((1, 2, 3, 4, 5))
l.append(l1)
l.append(l2)
l.append(l3)
print l

def sort_by_list(E1, E2):
    print len(E1), len(E2)
    return len(list(E1)) > len(list(E2))

l.sort(cmp=sort_by_list)
print l

output:
[[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]
2 4
5 2
[[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]

the order of the elements in the list did not change!


When comparing [1, 2] with [1, 2, 3, 4] you want the function to return
-1 to say that the first is shorter, so the pair should be swapped, but
you're returning 0 (False), which says they compare equal, so they're
not swapped.

This also works:

l.sort(lambda x,y: len(x) - len(y))

because the sort really only looks at the sign.
--
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Terry Reedy

On 1/23/2010 12:17 PM, Steve Howell wrote:


Terry Reedy said:

'''
If you try writing a full patch, as I believe someone did, or at least
a
prototype thereof, when the idea was discussed, you might have a
better
idea of what the tradeoffs are and why it was rejected.
'''

I have to run, but tomorrow I will try to dig through python-dev
archives and find the patch.  If anybody has hints on where to look
for it (anybody remember the author, for example?), it would be much
appreciated.


The approach you outlined in your other response to me is, I believe, 
what was considered, investigated, and then rejected (by Guido, with 
agreement). The discussion may have been on the now-closed and 
(misspelled) pyk3 (?) list, or maybe on python-ideas, but my memory 
favors the former. I am sure that Raymond H. was involved also.



If the patch looks simple, I will try to pitch the idea that its time
has come.  Now that the specification of the language itself is
frozen, I think there might be more room for improving
implementations.  Also, I might be able to make the argument that
tradeoffs of memory vs. CPU vs. code complexity have different forces
in the 2010s.


I am not opposed to a possible change, just hasty, ill-informed 
criticism. If there is not a PEP on this issue, it would be good to have 
one that recorded the proposal and the pros and cons, regardless of the 
outcome, so there would be something to refer people to. If that had 
been already done, it would have shortened this thread considerably.


Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


ISO module for binomial coefficients, etc.

2010-01-23 Thread kj


Before I go off to re-invent a thoroughly invented wheel, I thought
I'd ask around for some existing module for computing binomial
coefficient, hypergeometric coefficients, and other factorial-based
combinatorial indices.  I'm looking for something that can handle
fairly large factorials (on the order of 1!), using floating-point
approximations as needed, and is smart about optimizations,
memoizations, etc.

TIA!

~K

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PyArg_ParseTupleAndKeywords

2010-01-23 Thread Mr.M

MRAB ha scritto:

I think you're right.



I have rewritten my code, a piece at a time, and (and this is very 
annoying) now it works fine.


I really can't understand what went wrong with my old code.

Luca.
--
http://mail.python.org/mailman/listinfo/python-list


Re: how to generate random numbers that satisfy certain distribution

2010-01-23 Thread Paul Rubin
Steven D'Aprano  writes:
> The Box-Muller transform is reasonably simple, but you can't be serious 
> that it is simpler than adding twelve random numbers and subtracting six!

If you want a normal distribution, using the Box-Muller transform is
simpler because it spares you the complication of figuring out whether
the 12-trial binomial approximation is close enough to produce reliable
results for your specific application, which you obviously have to do if
you are using the approximation for anything serious.  It also involves
writing less code than that list comprehension, since it is already
implemented in the random module so you can just call it directly.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to generate random numbers that satisfy certain distribution

2010-01-23 Thread Steven D'Aprano
On Sat, 23 Jan 2010 12:29:22 -0800, Paul Rubin wrote:

> Peter Chant  writes:
>> I remember being told that adding up 12 random numbers in the range 0-1
>> (which is what most computer random number generators at the time
>> chucked out) and subtracting 6 gives a pretty good normal distribution. 
>> I think I did try it once and it failed, but I must have done something
>> odd.
> 
> That gives you a binomial distribution on 12 trials, which approximates
> a normal distribution when the number of trials is large.  12 isn't too
> bad.  But there's a simpler way, the Box-Muller transform, that gives
> you a pair drawn from a precisely normal distribution from two uniform
> random samples:
> 
>   http://en.wikipedia.org/wiki/Box-Muller_transform


The Box-Muller transform is reasonably simple, but you can't be serious 
that it is simpler than adding twelve random numbers and subtracting six!

def almost_normal():
    return sum([random.random() for _ in xrange(12)], -6)


Not that I'm recommending that anyone use this binomial approximation for 
anything but the quickest and dirtiest uses, particularly since the 
random module already has two excellent normal distributions (gauss and 
normalvariate). 
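A quick sanity check of those two generators (illustrative sketch;
normalvariate is pure Python and thread-safe, while gauss caches state
between calls and is not):

```python
import random

random.seed(0)
# random.normalvariate(mu, sigma) is pure Python and thread-safe;
# random.gauss(mu, sigma) is a little faster but shares state between
# calls, so it is not safe across threads without a lock.
xs = [random.normalvariate(0.0, 1.0) for _ in range(10000)]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
print(mean, var)  # should come out near 0 and 1
```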



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to generate random numbers that satisfy certain distribution

2010-01-23 Thread Steven D'Aprano
On Sat, 23 Jan 2010 11:10:52 -0800, thinke365 wrote:

> of course i have tried random package, but can this package generate
> random sequences that satisfy Poisson distribution, normal distribution
> and uniform distribution?

Please don't talk garbage. If you had really tried the random module, you 
would know the answer. What do you think random.uniform does?

In an interactive Python session:

>>> import random
>>> help(random)

and read. When you've done that, if you still have any specific 
questions, please ask.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: this customize sort did not work ,what's wrong?

2010-01-23 Thread Duncan Booth
thinke365  wrote:

> jesus, now i fixed it, using odd lambda sort.
> l.sort(lambda x,y: cmp(len(x), len(y)))
> print l
> 
> BUT I AM STILL CONFUSED WHY COSTOMIZED SORT FAILED TO SORT AS IT IS 
> PROGRAMMER!

Try reading the documentation:

>>> help(list.sort)
Help on method_descriptor:

sort(...)
L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*;
cmp(x, y) -> -1, 0, 1

Your comparison function returned True or False, not -1, 0, +1
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: this customize sort did not work ,what's wrong?

2010-01-23 Thread Alf P. Steinbach

* thinke365:

jesus, now i fixed it, using odd lambda sort.
l.sort(lambda x,y: cmp(len(x), len(y)))
print l


This uses 'cmp'. Your earlier code, quoted below, used '>'.


BUT I AM STILL CONFUSED WHY COSTOMIZED SORT FAILED TO SORT AS IT IS 
PROGRAMMER!


Please don't shout.



thinke365 wrote:

i mean the output i want is:
[ [1, 2], [1, 2, 3, 4], [1, 2, 3, 4, 5]], that is sort according to the
length of the list element

thinke365 wrote:

l = list()
l1 = list((1, 2, 3, 4))
l2 = list((1,2))
l3 = list((1, 2, 3, 4, 5))
l.append(l1)
l.append(l2)
l.append(l3)
print l

def sort_by_list(E1, E2):
    print len(E1), len(E2)
    return len(list(E1)) > len(list(E2))

l.sort(cmp=sort_by_list)
print l

output:
[[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]
2 4
5 2
[[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]

the order of the elements in the list did not change!



Cheers & hth.,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list


Re: this customize sort did not work ,what's wrong?

2010-01-23 Thread thinke365

jesus, now i fixed it, using odd lambda sort.
l.sort(lambda x,y: cmp(len(x), len(y)))
print l

BUT I AM STILL CONFUSED WHY COSTOMIZED SORT FAILED TO SORT AS IT IS 
PROGRAMMER!


thinke365 wrote:
> 
> i mean the output i want is:
> [ [1, 2], [1, 2, 3, 4], [1, 2, 3, 4, 5]], that is sort according to the
> length of the list element
> 
> thinke365 wrote:
>> 
>> l = list()
>> l1 = list((1, 2, 3, 4))
>> l2 = list((1,2))
>> l3 = list((1, 2, 3, 4, 5))
>> l.append(l1)
>> l.append(l2)
>> l.append(l3)
>> print l
>> 
>> def sort_by_list(E1, E2):
>>     print len(E1), len(E2)
>>     return len(list(E1)) > len(list(E2))
>> 
>> l.sort(cmp=sort_by_list)
>> print l
>> 
>> output:
>> [[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]
>> 2 4
>> 5 2
>> [[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]
>> 
>> the order of the elements in the list did not change!
>> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/this-customize-sort-did-not-work-%2Cwhat%27s-wrong--tp27289860p27290014.html
Sent from the Python - python-list mailing list archive at Nabble.com.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: this customize sort did not work ,what's wrong?

2010-01-23 Thread thinke365

i mean the output i want is:
[ [1, 2], [1, 2, 3, 4], [1, 2, 3, 4, 5]], that is sort according to the
length of the list element

thinke365 wrote:
> 
> l = list()
> l1 = list((1, 2, 3, 4))
> l2 = list((1,2))
> l3 = list((1, 2, 3, 4, 5))
> l.append(l1)
> l.append(l2)
> l.append(l3)
> print l
> 
> def sort_by_list(E1, E2):
>     print len(E1), len(E2)
>     return len(list(E1)) > len(list(E2))
> 
> l.sort(cmp=sort_by_list)
> print l
> 
> output:
> [[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]
> 2 4
> 5 2
> [[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]
> 
> the order of the elements in the list did not change!
> 

-- 
View this message in context: 
http://old.nabble.com/this-customize-sort-did-not-work-%2Cwhat%27s-wrong--tp27289860p27289922.html
Sent from the Python - python-list mailing list archive at Nabble.com.

-- 
http://mail.python.org/mailman/listinfo/python-list


this customize sort did not work ,what's wrong?

2010-01-23 Thread thinke365

l = list()
l1 = list((1, 2, 3, 4))
l2 = list((1,2))
l3 = list((1, 2, 3, 4, 5))
l.append(l1)
l.append(l2)
l.append(l3)
print l

def sort_by_list(E1, E2):
    print len(E1), len(E2)
    return len(list(E1)) > len(list(E2))

l.sort(cmp=sort_by_list)
print l

output:
[[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]
2 4
5 2
[[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5]]

the order of the elements in the list did not change!
-- 
View this message in context: 
http://old.nabble.com/this-customize-sort-did-not-work-%2Cwhat%27s-wrong--tp27289860p27289860.html
Sent from the Python - python-list mailing list archive at Nabble.com.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to generate random numbers that satisfy certain distribution

2010-01-23 Thread Paul Rubin
Peter Chant  writes:
> I remember being told that adding up 12 random numbers in the range 0-1 
> (which is what most computer random number generators at the time chucked 
> out) and subtracting 6 gives a pretty good normal distribution.  I think I 
> did try it once and it failed, but I must have done something odd.

That gives you a binomial distribution on 12 trials, which approximates
a normal distribution when the number of trials is large.  12 isn't too
bad.  But there's a simpler way, the Box-Muller transform, that gives
you a pair drawn from a precisely normal distribution from two uniform
random samples:

  http://en.wikipedia.org/wiki/Box-Muller_transform 
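A minimal sketch of the transform (the function name and the guard
against log(0) are illustrative additions; random.gauss already
implements this, so there is no need to hand-roll it in practice):

```python
import math
import random

def box_muller():
    # Two uniforms in, two independent standard-normal samples out.
    u1 = random.random()
    while u1 == 0.0:                 # guard log(0); random() can return 0.0
        u1 = random.random()
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)
```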
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to generate random numbers that satisfy certain distribution

2010-01-23 Thread Paul Rubin
thinke365  writes:
> of course i have tried random package, but can this package generate random
> sequences that satisfy Poisson distribution, normal distribution and uniform
> distribution?

Did you look at the documentation?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to generate random numbers that satisfy certain distribution

2010-01-23 Thread Arnaud Delobelle
thinke365  writes:

> such as uniform distribution, Normal distribution or poisson distribution.
> is there any package that can be used to generate such random numbers.

It's all in the standard library, the module is called -surprisingly-
'random'.

- use random.uniform for the uniform distributions
- use random.normalvariate for normal distributions

There isn't a Poisson distribution function, but there is an expovariate
function.  I think you can get Poisson like this, but take it with a
grain of salt because my probability skills are very rusty!

import random
import itertools

def poisson(l):
    e = random.expovariate
    acc = 0.0
    for n in itertools.count():
        acc += e(l)
        if acc >= 1.0:
            return n

-- 
Arnaud
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: installing psycopg2-2.0.13 with python3.1

2010-01-23 Thread Sibylle Koczian

Iain Barnett schrieb:

On 21 Jan 2010, at 00:11, Gabriel Genellina wrote:


If you insist on using Python 3.1, there is another interface to PostgreSQL 
called pg8000 that claims to be Python 3.x compatible (I've not actually tested 
it).

[1] http://mail.python.org/pipermail/python-porting/2008-December/04.html
[2] http://pybrary.net/pg8000/



Or there is py-postgresql:

http://python.projects.postgresql.org/

I've not yet used it very much, but it seems to work quite well, is 
reasonably documented and the mailing list is small and helpful.


HTH
Sibylle
--
http://mail.python.org/mailman/listinfo/python-list


Re: how to generate random numbers that satisfy certain distribution

2010-01-23 Thread Peter Chant
thinke365 wrote:

> 
> such as uniform distribution, Normal distribution or poisson distribution.
> is there any package that can be used to generate such random numbers.
> 

I remember being told that adding up 12 random numbers in the range 0-1 
(which is what most computer random number generators at the time chucked 
out) and subtracting 6 gives a pretty good normal distribution.  I think I 
did try it once and it failed, but I must have done something odd.

-- 
http://www.petezilla.co.uk

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PyArg_ParseTupleAndKeywords

2010-01-23 Thread MRAB

Mr.M wrote:

MRAB ha scritto:


Did you specify that the method accepts keywords arguments with
METH_KEYWORDS? The function would take parameters for the instance
(self), the positional arguments (args) and the keyword arguments
(kwargs).

http://docs.python.org/c-api/structures.html

If you don't use METH_KEYWORDS then it'll think that all the arguments
are positional, which is what seems to be happening:



Thank you MRAB for your reply.
No, i didn't specify METH_KEYWORDS flag, but i think init method (the 
one that you put in tp_init slot, it is "initproc" type) doesn't have to 
appear in the PyMethodDef structure.


Maybe i'm wrong?


I think you're right.

--
http://mail.python.org/mailman/listinfo/python-list


Re: how to generate random numbers that satisfy certain distribution

2010-01-23 Thread thinke365



Bugzilla from ra.ravi@gmail.com wrote:
> 
> On Jan 23, 10:37 pm, thinke365  wrote:
>> such as uniform distribution, Normal distribution or poisson
>> distribution.
>> is there any package that can be used to generate such random numbers.
>>
>> --
>> View this message in
>> context:http://old.nabble.com/how-to-generate-random-numbers-that-satisfy-cer...
>> Sent from the Python - python-list mailing list archive at Nabble.com.
> 
> Did you try random package?
> -- 
> http://mail.python.org/mailman/listinfo/python-list
> 
> 

of course i have tried random package, but can this package generate random
sequences that satisfy Poisson distribution, normal distribution and uniform
distribution?

-- 
View this message in context: 
http://old.nabble.com/how-to-generate-random-numbers-that-satisfy-certain-distribution-tp27288180p27288996.html
Sent from the Python - python-list mailing list archive at Nabble.com.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Roy Smith
In article ,
 Duncan Booth  wrote:

> Roy Smith  wrote:
> 
> > I'm looking at http://tinyurl.com/cdbwog.  It shows all the operations
> > of a list.  What it does not show is their cost.  For pop(), it has a
> > note: 
> > 
> > "The pop() method is only supported by the list and array types. The 
> > optional argument i defaults to -1, so that by default the last item
> > is removed and returned".
> 
> The page you should probably be looking at is 
> http://wiki.python.org/moin/TimeComplexity

I was not aware of this page; thanks for pointing it out.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to generate random numbers that satisfy certain distribution

2010-01-23 Thread Ravi
On Jan 23, 10:37 pm, thinke365  wrote:
> such as uniform distribution, Normal distribution or poisson distribution.
> is there any package that can be used to generate such random numbers.
>
> --
> View this message in 
> context:http://old.nabble.com/how-to-generate-random-numbers-that-satisfy-cer...
> Sent from the Python - python-list mailing list archive at Nabble.com.

Did you try random package?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Aahz
In article <8e4d3fe2-c4bd-4a73-9c50-7a336dab2...@o28g2000yqh.googlegroups.com>,
Steve Howell   wrote:
>On Jan 22, 11:10 pm, a...@pythoncraft.com (Aahz) wrote:
>>
>>>I know Python's number one concern will never be speed, but if Python
>>>makes an O(1) operation into an unnecessarily O(N) operation for no
>>>good reasons other than "it's too complicated, " or it "adds another
>>>pointer to the structure," or "it adds another conditional check to
>>>list_ass_slice for operations that aren't popping off the top," I
>>>think it's reasonable to challenge the design philosophy.
>>
>> "Rough consensus and running code."
>>
>> You have a good point, but nobody will ever give your idea serious
>> attention until there's a patch and benchmarks.
>
>Here is a benchmark of O(N*N) vs. O(N) for two C programs.  One does
>memmove; the other simply advances the pointer.

You should provide pybench numbers and probably also use the benchmarks
produced by the Unladen Swallow project.  The next step is to file a
patch on bugs.python.org and request review.
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

import antigravity
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ISC License

2010-01-23 Thread Aahz
In article <00eb248d-c9c9-430f-bc83-41ac865c5...@e11g2000yqe.googlegroups.com>,
Joan Miller   wrote:
>
>There is a license approved by the OSI, the ISC License [1], which
>should be included in the PyPi classifiers [2].
>
>[1] https://www.isc.org/software/license
>http://www.opensource.org/licenses/isc-license.txt
>[2] http://pypi.python.org/pypi?%3Aaction=list_classifiers

http://pypi.python.org/pypi has a link named "Bug Reports"
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

import antigravity
-- 
http://mail.python.org/mailman/listinfo/python-list


counting character occurrences

2010-01-23 Thread Wilbert Berendsen
Op vrijdag 22 januari 2010 schreef Arnaud:

> Why not just start with (untested):
> 
> import codecs
> from collections import defaultdict
> 
> tcounters = defaultdict(int)
> f = codecs.open('/home/gavron/git/screen/src/screen.c', 'r', "utf-8")
> 
> for c in f.read():
>     tcounters[c] += 1
> 
> for c, n in tcounters.iteritems():
>     print "%r\t%i" % (c, n)

Or using Counter from Python3.1 collections:

import codecs
from collections import Counter

filename = '/home/gavron/git/screen/src/screen.c'
with codecs.open(filename, 'r', 'utf-8') as f:
    counted = Counter(f.read())

for c, n in counted.items():
    print(c, n, sep='\t')
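Counter can also report frequencies already sorted, via most_common
(an illustrative sketch on an in-memory string):

```python
from collections import Counter

counted = Counter("abracadabra")     # counts each character of the string
print(counted['a'])                  # 5
print(counted.most_common(2))        # most frequent characters first
```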


with best regards,
Wilbert Berendsen

-- 
http://www.wilbertberendsen.nl/
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PyArg_ParseTupleAndKeywords

2010-01-23 Thread Mr.M

MRAB ha scritto:


Did you specify that the method accepts keywords arguments with
METH_KEYWORDS? The function would take parameters for the instance
(self), the positional arguments (args) and the keyword arguments
(kwargs).

http://docs.python.org/c-api/structures.html

If you don't use METH_KEYWORDS then it'll think that all the arguments
are positional, which is what seems to be happening:



Thank you MRAB for your reply.
No, i didn't specify METH_KEYWORDS flag, but i think init method (the 
one that you put in tp_init slot, it is "initproc" type) doesn't have to 
appear in the PyMethodDef structure.


Maybe i'm wrong?

Luca.
--
http://mail.python.org/mailman/listinfo/python-list


Re: PyArg_ParseTupleAndKeywords

2010-01-23 Thread MRAB

Mr.M wrote:

Carl Banks ha scritto:

(some declarations omitted here)


You probably shouldn't have, that could be where the error is  I'd
include the whole function up to the call that raises the exception.


Thank you so much Carl for your help, i'll provide more info so that you
can try to fix my errors!
Thank you again!

First of all i'll try to explain what i've discovered over a night i
spent awake trying to find where the bug is.

My init behaves differently in this situations:

[code]
# this works
dummy = mymodule.myclass()

# this works
dummy = mymodule.myclass("string for service")

# this also works
dummy = mymodule.myclass("string for service", "string for event_type")
# and it works if i provide arguments in the right sequence, of course

# this won't work
dummy = mymodule.myclass("string for service", "string for event_type",
free_text = "string for free_text")

# but this works
dummy = mymodule.myclass(service = "string for service")

# the worst thing: this doesn't work but it did and i can't
# understand what is changed

dummy = mymodule.myclass(free_text = "text for free_text")

ok, every time the code fails i get this exception:

TypeError: expected string or Unicode object, tuple found

so, i think for some reason PyArg_ParseTupleAndKeywords receives "args"
as a touple instead of something else.

This is everything i managed to discover.

This is the code:

[code]
/* DEBUG INFO */
printf("event INIT\n");
if (args)
{
    int i;
    printf("args = \"%s\"\n", PyString_AsString(args));
    printf("type = %s\n",
           PyString_AsString(PyObject_Repr(PyObject_Type(args))));
    printf("tuple size = %d\n", PyTuple_Size(args));
    for (i = 0; i < PyTuple_Size(args); i++)
    {
        printf("%d) %s\n", i,
               PyString_AsString(PyTuple_GetItem(args, i)));
    }
}
else
{
    printf("args = NULL\n");
}
printf("dict:\n");
if (keywords)
{
    printf("keywords = %s\n",
           PyString_AsString(PyObject_Repr(keywords)));
    printf("type = %s\n",
           PyString_AsString(PyObject_Repr(PyObject_Type(keywords))));
}
else
{
    printf("keywords = NULL\n");
}

char* service  = NULL;
char* event_type   = NULL;
char* free_text= NULL;
char* group_uid= NULL;
char* remote_name  = NULL;
char* remote_abook_uid = NULL;

time_t start_time   = -1L;
time_t end_time = -1L;
time_t storage_time = -1L;

int flags = 0;

int bytes_sent = -1;
int bytes_received = -1;

PyObject* temp;

static char* keywordlist[] = {"service",
  "event_type",
  "free_text",
  "group_uid",
  "remote_name",
  "remote_abook_uid",
  "start_time",
  "end_time",
  "storage_time",
  "flags",
  "bytes_sent",
  "bytes_received",
  NULL};

if (!PyArg_ParseTupleAndKeywords(args, keywords, "|ssii",
keywordlist,
 &service,
 &event_type,
 &free_text,
 &group_uid,
 &remote_name,
 &remote_abook_uid,
 &start_time,
 &end_time,
 &storage_time,
 &flags,
 &bytes_sent,
 &bytes_received))
{
return -1;
}

printf("PyArg_ParseTupleAndKeywords worked fine\n");

[/code]

(sorry if my code is a little messy and my english rather bad!)


Are you sure that PyArg_ParseTupleAndKeywords is what's raising the
error?


Yes, i think so. I have a lot of printf in my code!


Are you subclassing this type in Python, and passing one of the string
parameters a tuple by mistake?  For instance, did you do something
like this:

class myclass(_mycmodule._myctype):
def __init__(self,*args,**kwargs):
log_something_here()
super(myclass,self).__init__(args,**kwargs)

Note the missing * on args.
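In pure Python, the effect of that missing star can be sketched like this
(myclass_init is a hypothetical stand-in for the C initializer; the real
type check is done by PyArg_ParseTupleAndKeywords's "s" format unit):

```python
def myclass_init(service=None, event_type=None):
    # stand-in for the C initializer: an "s" format unit rejects
    # anything that is not a string
    if service is not None and not isinstance(service, str):
        raise TypeError("expected string or Unicode object, "
                        "%s found" % type(service).__name__)

args = ("string for service", "string for event_type")

myclass_init(*args)     # correct: the tuple is unpacked into two strings

try:
    myclass_init(args)  # bug: the whole tuple becomes the first argument
except TypeError as e:
    print(e)            # expected string or Unicode object, tuple found
```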


no, i'm not subclassing.
I have to admit that i had some testcase and a few weeks ago they worked
like a charm... i can't understand what i changed, of course.





I found that if i write this everything works fine:

[code]
import mymodule
a = mymodule.myclass(service = "blabla", event_type = "e

Re: iterating lists

2010-01-23 Thread Steven D'Aprano
On Sat, 23 Jan 2010 18:29:33 +0100, Roel Schroeven wrote:

>> for w in l1[:]: #use copy of l1 for iteration
>>  print(l1.pop()) #decomposite list
> 
> I would prefer:
> 
> while l1:
> print(l1.pop())


I would prefer:

for x in reversed(l1):
print(x)
l1[:] = []


And garbage dispose of the entire list in one go, instead of an item at a 
time.
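A quick sketch of that approach, showing that the items still come out in
the same order repeated pop() would give:

```python
l1 = ["ready", "steady", "go"]

consumed = []
for x in reversed(l1):   # iterate back-to-front, matching pop() order
    consumed.append(x)
l1[:] = []               # discard the whole list in one step

print(consumed)  # ['go', 'steady', 'ready']
print(l1)        # []
```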



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Steve Howell
On Jan 23, 7:54 am, Steven D'Aprano  wrote:
> On Sat, 23 Jan 2010 09:57:04 -0500, Roy Smith wrote:
> > In article ,
> >  "Alf P. Steinbach"  wrote:
>
> >> But it would IMHO have been better if it wasn't called "list", which
> >> brings in the wrong associations for someone used to other languages.
>
> > +1.
>
> > When I first started using Python (back in the 1.4 days), I assumed a
> > list was a singly-linked list.
>
> Why would you do that? I can think of at least eight different
> implementations of the abstract list data structure:
>
> constant-size array
> variable-size array
> variable-size array with amortised O(1) appends
> singly-linked list
> singly-linked list with CDR coding
> doubly-linked list
> skip list
> indexable skip list
>
> One can reasonably disregard constant-sized arrays as a possibility,
> given that Python lists aren't fixed size (pity the poor Pascal and
> Fortran coders who are stuck with static arrays!), but the rest are all
> reasonable possibilities. Why assume one specific implementation in the
> absence of documentation promising certain performance characteristics?
>
> > Oddly enough, I was going to write in the above paragraph, "like a C++
> > STL list", until I happened to glance at the STL docs and refreshed my
> > memory that an STL list is doubly-linked.  Which just goes to show that
> > making assumptions based on names is a bad idea.
>
> Exactly :)
>
> > So, we're right back to my statement earlier in this thread that the
> > docs are deficient in that they describe behavior with no hint about
> > cost. Given that, it should be no surprise that users make incorrect
> > assumptions about cost.
>
> There are quite a few problems with having the documentation specify cost:
>
> (1) Who is going to do it? Any volunteers?
>
> (2) Big-oh notation can be misleading, especially for naive users, or
> those whose intuition for what's fast has been shaped by other languages.
> Big-oh doesn't tell you whether something is fast or slow, only how it
> scales -- and sometimes not even then.
>
> (3) Having documented a particular performance, that discourages
> implementation changes. Any would-be patch or new implementation not only
> has to consider that the functional behaviour doesn't change, but that
> the performance doesn't either.
>
> In practice the Python developers are unlikely to make an implementation
> change which leads to radically worse performance, particularly for
> critical types like list and dict. But in other cases, they might choose
> to change big-oh behaviour, and not wish to be tied down by documentation
> of the cost of operations.
>
> (4) How much detail is necessary? What about degenerate cases? E.g. dict
> lookup in CPython is typically O(1) amortised, but if all the keys hash
> to the same value, it falls to O(N).
>
> (5) Should the language guarantee such degenerate behaviour? Who decides
> which costs are guaranteed and which are not?
>
> (6) Such performance guarantees should be implementation specific, not
> language specific. CPython is only one implementation of the language out
> of many.
>

Bringing this thread full circle, does it make sense to strike this
passage from the tutorial?:

'''
It is also possible to use a list as a queue, where the first element
added is the first element retrieved (“first-in, first-out”); however,
lists are not efficient for this purpose. While appends and pops from
the end of list are fast, doing inserts or pops from the beginning of
a list is slow (because all of the other elements have to be shifted
by one).
'''

I think points #3 and #6 possibly apply. Regarding points #2 and #4,
the tutorial is at least not overly technical or specific; it just
explains the requirement to shift other elements one by one in simple
layman's terms.
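For comparison, a rough sketch of the structure the tutorial passage points
to instead of a list-as-queue:

```python
from collections import deque

# list: pop(0) shifts every remaining element left, so draining an
# n-element list from the front costs O(n**2) overall
queue = list(range(5))
drained = []
while queue:
    drained.append(queue.pop(0))

# deque: popleft() is O(1) at either end
dq = deque(range(5))
drained_dq = []
while dq:
    drained_dq.append(dq.popleft())

assert drained == drained_dq == [0, 1, 2, 3, 4]
```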

-- 
http://mail.python.org/mailman/listinfo/python-list


how to generate random numbers that satisfy certain distribution

2010-01-23 Thread thinke365

such as uniform distribution, Normal distribution or poisson distribution.
is there any package that can be used to generate such random numbers.
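The standard library's random module covers several of these directly; a
small sketch (note the stdlib has no Poisson generator, though numpy does):

```python
import random

random.seed(0)  # deterministic for demonstration only

u = random.uniform(0.0, 10.0)  # uniform on [0.0, 10.0]
g = random.gauss(0.0, 1.0)     # normal (Gaussian), mean 0, stddev 1
e = random.expovariate(1.5)    # exponential with rate 1.5

print(u, g, e)
# for Poisson: numpy.random.poisson(lam) provides one
```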

-- 
View this message in context: 
http://old.nabble.com/how-to-generate-random-numbers-that-satisfy-certain-distribution-tp27288180p27288180.html
Sent from the Python - python-list mailing list archive at Nabble.com.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: iterating lists

2010-01-23 Thread Stefan Behnel
ceciliasei...@gmx.de, 23.01.2010 17:29:
> Arnaud Delobelle wrote:
>> ceciliasei...@gmx.de writes:
>>
>>> As you were talking about list.pop()...
>>>
>>> Is anyone able to reproduce the following and explain why this happens
>>> by chance? (Using 3.1.1)
>>>
>>> l1 = ["ready", "steady", "go"]
>>> l2 = ["one", "two", "tree"]
>>> l3 = ["lift off"]
>>>
>>> for w in l1:
> 
> Ouch... thanks Arnaud... The stable way would've been
> 
> for w in l1[:]: #use copy of l1 for iteration
> print(l1.pop()) #decomposite list

If the intention is to exhaust the list during the iteration, I'd go for this:

while l1:
print(l1.pop())

Stefan
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: iterating lists

2010-01-23 Thread Roel Schroeven
Op 2010-01-23 17:29, ceciliasei...@gmx.de schreef:
> 
> 
> Arnaud Delobelle schrieb:
>> ceciliasei...@gmx.de writes:
>>
>>> As you were talking about list.pop()...
>>>
>>> Is anyone able to reproduce the following and explain why this happens
>>> by chance? (Using 3.1.1)
>>>
>>> l1 = ["ready", "steady", "go"]
>>> l2 = ["one", "two", "tree"]
>>> l3 = ["lift off"]
>>>
>>> for w in l1:
> 
> Ouch... thanks Arnaud... The stable way would've been
> 
> for w in l1[:]: #use copy of l1 for iteration
>   print(l1.pop()) #decomposite list

I would prefer:

while l1:
print(l1.pop())

-- 
The saddest aspect of life right now is that science gathers knowledge
faster than society gathers wisdom.
  -- Isaac Asimov

Roel Schroeven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Steve Howell
On Jan 23, 4:12 am, Steven D'Aprano  wrote:
> On Fri, 22 Jan 2010 21:42:43 -0800, Steve Howell wrote:
> > This innocent program here literally moves about a million bytes of
> > memory around for no good reason:
>
> >     lst = []
> >     for i in range(2000):
> >         lst.append(i)
> >     while lst:
> >         print lst.pop(0)
>
> > Why?  Because list.pop(0) is implemented in O(N) instead of O(1).
>
> > Why wouldn't you get a competent C programmer simply make list_ass_slice
> > smart enough to make list.pop(0) O(1)?
>
> Because there are always trade-offs, and the competent C programmers who
> coded the implementation for lists choose different tradeoffs to the ones
> you would prefer.
>
> Seems to me that the simple solution to your problem is for you to
> implement your own data structure that makes whatever trade-offs you
> like. If it is good enough and popular enough, it might even replace the
> existing list implementation.
>

The data structure that would make the tradeoffs I want would be
implemented within CPython itself.  I give a sketch of the changes
elsewhere in this thread.

Terry Reedy said:

'''
If you try writing a full patch, as I believe someone did, or at least a
prototype thereof, when the idea was discussed, you might have a better
idea of what the tradeoffs are and why it was rejected.
'''

I have to run, but tomorrow I will try to dig through python-dev
archives and find the patch.  If anybody has hints on where to look
for it (anybody remember the author, for example?), it would be much
appreciated.

If the patch looks simple, I will try to pitch the idea that its time
has come.  Now that the specification of the language itself is
frozen, I think there might be more room for improving
implementations.  Also, I might be able to make the argument that
tradeoffs of memory vs. CPU vs. code complexity have different forces
in the 2010s.

Thanks for your reply.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Steve Howell
On Jan 23, 4:12 am, Steven D'Aprano  wrote:
>
>
> An alternative would be to do exactly what you want lists to do: track
> the start of the list. Untested:
>
>     def recurse(prefix_lines):
>         start = 0
>         end = len(prefix_lines)
>         while start < end:
>             prefix, line = prefix_lines[start]
>             if line == '':
>                 start += 1
>                 append('')
>             else:
>                 block_size = get_block(prefix_lines)
>                 if block_size == 1:
>                     start += 1
>                     if line == pass_syntax:
>                         pass
>                     elif line.startswith(flush_left_syntax):
>                         append(line[len(flush_left_syntax):])
>                     elif line.startswith(flush_left_empty_line):
>                         append('')
>                     else:
>                         append(prefix + leaf_method(line))
>                 else:
>                     block = prefix_lines[:block_size]
>                     start = block_size
>                     branch_method(output, block, recurse)
>         return
>
> No more O(N) deletions. Problem solved.
>

A minor modification of your solution does work, but it also slightly
complicates the implementation.  Keeping track of the start variable
requires bookkeeping not just in recurse(), but also in methods that
it calls.  This is a pretty small program, so it's acceptable to pass
around an offset variable to anybody else who might want to be
consuming the list.
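The bookkeeping pattern in question, reduced to a minimal sketch:

```python
items = list(range(5))

out = []
start = 0                     # tracks the logical front of the list
while start < len(items):
    out.append(items[start])  # O(1) read instead of O(n) items.pop(0)
    start += 1

print(out)  # [0, 1, 2, 3, 4]
```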

  Generic indentation stuff follows

-def get_indented_block(prefix_lines):
-prefix, line = prefix_lines[0]
+def get_indented_block(prefix_lines, start):
+prefix, line = prefix_lines[start]
 len_prefix = len(prefix)
 i = 1
-while i < len(prefix_lines):
-new_prefix, line = prefix_lines[i]
+while i + start  < len(prefix_lines):
+new_prefix, line = prefix_lines[start+i]
 if line and len(new_prefix) <= len_prefix:
 break
 i += 1
-while i-1 > 0 and prefix_lines[i-1][1] == '':
+while i-1 > 0 and prefix_lines[start+i-1][1] == '':
 i -= 1
 return i

@@ -190,15 +190,16 @@
 ):
 append = output.append
 def recurse(prefix_lines):
-while prefix_lines:
-prefix, line = prefix_lines[0]
+start = 0
+while start < len(prefix_lines):
+prefix, line = prefix_lines[start]
 if line == '':
-prefix_lines.pop(0)
+start += 1
 append('')
 else:
-block_size = get_block(prefix_lines)
+block_size = get_block(prefix_lines, start)
 if block_size == 1:
-prefix_lines.pop(0)
+start += 1
 if line == pass_syntax:
 pass
 elif line.startswith(flush_left_syntax):
@@ -208,8 +209,8 @@
 else:
 append(prefix + leaf_method(line))
 else:
-block = prefix_lines[:block_size]
-prefix_lines = prefix_lines[block_size:]
+block = prefix_lines[start:start+block_size]
+start += block_size
 branch_method(output, block, recurse)
 return
 prefix_lines = map(indentation_method, lines)



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Steve Howell
On Jan 23, 6:40 am, Roy Smith  wrote:
> In article
> ,
>  Steve Howell  wrote:
>
> > This innocent program here literally moves about a million bytes of
> > memory around for no good reason:
>
> >     lst = []
> >     for i in range(2000):
> >         lst.append(i)
> >     while lst:
> >         print lst.pop(0)
>
> > Why?  Because list.pop(0) is implemented in O(N) instead of O(1).
>
> I think you're being a little pedantic here.  Yes, it is true that pop(0)
> is O(n), and that if you put an O(n) operation in a loop, you get O(n^2)
> run time.
>
> The problem is that while it is well-known that putting something that's
> O(n) in a loop gets you O(n^2), it's not well known that pop(0) for a
> Python list is O(n).  This is where you and I apparently start to differ in
> what we think about this.
>

The source for Python is open.  It pretty clearly shows that you move
N bytes when you pop from the top of the list.

Less clear is how linear the performance of memmove is.  My benchmarks
on the C program show that, at least on my computer, the results do
not seem to contradict the "roughly linear" assumption.

> You are arguing that this is a bug in the implementation of list.  While I
> suppose there's some validity to that argument, I disagree.  What I would
> argue (and have done so several times over the years, with little success)
> is that this is a bug in the documentation!
>
> I'm looking at http://tinyurl.com/cdbwog.  It shows all the operations of a
> list.  What it does not show is their cost.  For pop(), it has a note:
>
> "The pop() method is only supported by the list and array types. The
> optional argument i defaults to -1, so that by default the last item is
> removed and returned".
>
> There's nothing there that gives any hint that pop(0) is any more expensive
> than pop(-1).  That is "secret knowledge", which you only get by following
> the newsgroup discussions or looking at the implementation.  You shouldn't
> have to do either.  There's lots of possible ways list could be
> implemented.  Without knowing the details, I'm left to guess about
> important stuff like the cost of operations.
>
> Every one of these operations should list the cost.  Even if it's something
> as vague as, "While not guaranteed by the language spec, in the current
> implemenation of CPython ...".

I agree with that.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Steve Howell
On Jan 23, 5:46 am, Christian Heimes  wrote:
> Steve Howell wrote:
> > Another benchmark is that deques are slower than lists for accessing
> > elements.
>
> deques are optimized for accessing, inserting and removing data from
> both ends. For anything else it's slower than the list type. The fact
> was explained in this very thread yesterday.
>

And the benchmark confirmed it.  The slowness is fairly negligible,
though.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Alf P. Steinbach

* Steven D'Aprano:

On Sat, 23 Jan 2010 09:57:04 -0500, Roy Smith wrote:


In article ,
 "Alf P. Steinbach"  wrote:


But it would IMHO have been better if it wasn't called "list", which
brings in the wrong associations for someone used to other languages.

+1.

When I first started using Python (back in the 1.4 days), I assumed a
list was a singly-linked list.


Why would you do that? I can think of at least eight different 
implementations of the abstract list data structure:


constant-size array
variable-size array
variable-size array with amortised O(1) appends
singly-linked list
singly-linked list with CDR coding
doubly-linked list
skip list
indexable skip list

One can reasonably disregard constant-sized arrays as a possibility, 
given that Python lists aren't fixed size (pity the poor Pascal and 
Fortran coders who are stuck with static arrays!), but the rest are all 
reasonable possibilities.


A linked list implementation would yield O(n) indexing. A great many loops in 
e.g. Python library code that now run in linear time would then take quadratic 
time, O(n^2). Those libraries would be effectively unusable without extensive 
rewriting: one version for ordinary Python and one for 'list-as-list' Pythons...


Thus, the linked list implementations are IMO *not* reasonable.

And the reason is precisely the implied complexity guarantees, especially on 
indexing  --  which could reasonably be O(log n), but not worse than that.



Why assume one specific implementation in the 
absence of documentation promising certain performance characteristics?




Oddly enough, I was going to write in the above paragraph, "like a C++
STL list", until I happened to glance at the STL docs and refreshed my
memory that an STL list is doubly-linked.  Which just goes to show that
making assumptions based on names is a bad idea.


Exactly :)




So, we're right back to my statement earlier in this thread that the
docs are deficient in that they describe behavior with no hint about
cost. Given that, it should be no surprise that users make incorrect
assumptions about cost.


There are quite a few problems with having the documentation specify cost:

(1) Who is going to do it? Any volunteers?


This problem must have been addressed at each time the documentation for some 
version of Python was written or updated.



(2) Big-oh notation can be misleading, especially for naive users, or 
those whose intuition for what's fast has been shaped by other languages. 
Big-oh doesn't tell you whether something is fast or slow, only how it 
scales -- and sometimes not even then.


It's how things scale that is of interest. :-)

Big-oh tells you an upper asymptotic limit.

That's sufficient for e.g. the C++ standard  --  which, by the way, constitutes 
a concrete example of the practicality of specifying complexity.



(3) Having documented a particular performance, that discourages 
implementation changes. Any would-be patch or new implementation not only 
has to consider that the functional behaviour doesn't change, but that 
the performance doesn't either.


In practice the Python developers are unlikely to make an implementation 
change which leads to radically worse performance, particularly for 
critical types like list and dict. But in other cases, they might choose 
to change big-oh behaviour, and not wish to be tied down by documentation 
of the cost of operations.


Say that there was an O(log n) documented worst complexity for 'list' indexing. 
Above you have described it as "reasonable" to break that, having O(n) 
complexity... But in light of my comments on that, and especially thinking a bit 
about maintainance of two or more! versions of various libraries, don't you 
agree that it would be Just Bad(TM)?



(4) How much detail is necessary? What about degenerate cases? E.g. dict 
lookup in CPython is typically O(1) amortised, but if all the keys hash 
to the same value, it falls to O(N).


From N1745, the Technical Report 1 on C++ library extensions (will be part of 
the C++0x standard), table 21 specifying general requirements of unordered 
associative containers:


expression:  b.find(k)
return type: iterator;
assertion:   Returns an iterator pointing to an element with key equivalent
 to k, or b.end() if no such element exists.
complexity:  Average case O(1), worst case O(b.size()).


(5) Should the language guarantee such degenerate behaviour? Who decides 
which costs are guaranteed and which are not?


I think the C++ standard (the latest draft of C++0x is freely available as PDF 
from the commitee pages) provides good guidance in this regard. :-)



(6) Such performance guarantees should be implementation specific, not 
language specific. CPython is only one implementation of the language out 
of many.


Disagree Very Strongly. An implementation may offer stricter guarantees. But 
what matters regarding e.g. avoiding having to maintain two or three or umpteen 
versions of a library

Re: py2exe deal with python command line inside a program

2010-01-23 Thread im_smialing
On Jan 22, 2:35 pm, susan_kij...@yahoo.ca wrote:
> Hi,
>
> I need to create a python subprogress, like this:
> myProcess = subprocess.Popen([sys.executable, 'C:\myscript.py'],
>                                        env=env, stdin=subprocess.PIPE,
>                                        stdout=subprocess.PIPE)
>
> sys.executable was printed out as ''C:\\Python25\\python.exe'', how
> can I make this work in executable package through py2exe?
>
> I have to fix the following problems:
> -Source code shouldn't exposed in an executable program
> -Since python  environment is not required when running an executable
> program, how to deal with the situation that "C:\\Python25\
> \python.exe" is required as part of command?
>
> Thanks in advance!

Anyone can help? The problem took me two days and still no solutions
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Duncan Booth
Roy Smith  wrote:

> I'm looking at http://tinyurl.com/cdbwog.  It shows all the operations
> of a list.  What it does not show is their cost.  For pop(), it has a
> note: 
> 
> "The pop() method is only supported by the list and array types. The 
> optional argument i defaults to -1, so that by default the last item
> is removed and returned".

The page you should probably be looking at is 
http://wiki.python.org/moin/TimeComplexity
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Re: iterating lists

2010-01-23 Thread ceciliaseidel



Arnaud Delobelle schrieb:

ceciliasei...@gmx.de writes:


As you were talking about list.pop()...

Is anyone able to reproduce the following and explain why this happens
by chance? (Using 3.1.1)

l1 = ["ready", "steady", "go"]
l2 = ["one", "two", "tree"]
l3 = ["lift off"]

for w in l1:


Ouch... thanks Arnaud... The stable way would've been

for w in l1[:]: #use copy of l1 for iteration
print(l1.pop()) #decomposite list

(just in case any other newbie is reading this...)



   print(l1.pop())  #prints only "go steady" - why not "ready"??



I suggest you simulate the loop above using pen and paper, writing the
value of w and the value of l1 at each iteration.  The behaviour you are
observing should then be clearly explained.  And you should also realise
that it's a really bad idea to mutate a list that you are iterating on!

HTH



--
http://mail.python.org/mailman/listinfo/python-list


Re: A.x vs. A["x"]

2010-01-23 Thread Steve Holden
Martin Drautzburg wrote:
> Terry Reedy wrote:
> 
>> On 1/22/2010 2:29 PM, Martin Drautzburg wrote:
>>> This has probably been asekd a million times, but if someone could
>>> give a short answer anyways I's be most grateful.
>>>
>>> What is it that allows one to write A.x? If I have a variable A,
>  
> 
>>> I know you can do this with classes, but not with plain objects, but
>>> why is that so?
>> You exact meaning here is not clear, but I suspect it is somewhat
>> incorrect, at least for built-in classes.
> 
> You're right. I used to think you could do this to classes:
> 
> G = Strum
> G.x=1
> 
> But not to objects (instances):
> 
> g = Strum()
> g.y = 2
> 
> But in fact both work. Thanks for clarifying

Both work, in fact, because a class is (like almost any other Python
object), an instance of some class. In Python the classes that are
defined in the interpreter are usually called types, but since
"new-style objects" were introduced into Python in 2.2 you can define
your own types (and subclass the system types).

The name for the particular types whose instances are classes is
"metaclass".  But once you realize that the "regular classes" you
already know about are just instances of their metaclass (the metaclass
type, in many cases) it might seem less surprising.

There is a necessary but strange and surprising relationship between
types and objects that can be summarized as

  isinstance(object, type) == isinstance(type,object) == True

Because the built-in types are implemented as a part of the interpreter
they are somewhat more restricted. Consequently not only can you not add
attributes to the classes, you can't add them to the instances either:

>>> int.x = 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'int'
>>> i = int()
>>> i.y = "hello"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'y'
>>>
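A short sketch of the class/instance relationships described above (Strum is
the hypothetical class from the thread):

```python
class Strum(object):
    pass

Strum.x = 1          # attribute set on the class itself
g = Strum()
g.y = 2              # attribute set on one instance

assert g.x == 1 and g.y == 2    # instances also see class attributes
assert isinstance(g, Strum)
assert isinstance(Strum, type)  # the class is an instance of its metaclass
assert isinstance(type, object) and isinstance(object, type)
```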

regards
 Steve
-- 
Steve Holden   +1 571 484 6266   +1 800 494 3119
PyCon is coming! Atlanta, Feb 2010  http://us.pycon.org/
Holden Web LLC http://www.holdenweb.com/
UPCOMING EVENTS:http://holdenweb.eventbrite.com/

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how can i know if a python object have a attribute such as 'attr1'?

2010-01-23 Thread Arnaud Delobelle
thinke365  writes:

> for example, i may define a python class:
> class A:
>  def sayHello():
>   print 'hello'
>
> a = A()
> a.attr1 = 'hello'
> a.attr2 = 'bb'
>
> b = A()
> a.attr2 = 'aa'
>
> how can i know whether an object have an attribute named attr1?

hasattr(a, 'attr1')
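A quick illustration, reusing the class from the question (hedged: getattr
with a default is a common alternative, not something the question asked for):

```python
class A(object):
    def sayHello(self):
        print('hello')

a = A()
a.attr1 = 'hello'
a.attr2 = 'bb'

print(hasattr(a, 'attr1'))             # True
print(hasattr(a, 'attr3'))             # False
print(getattr(a, 'attr3', 'missing'))  # missing
```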

-- 
Arnaud
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Steven D'Aprano
On Sat, 23 Jan 2010 09:57:04 -0500, Roy Smith wrote:

> In article ,
>  "Alf P. Steinbach"  wrote:
> 
>> But it would IMHO have been better if it wasn't called "list", which
>> brings in the wrong associations for someone used to other languages.
> 
> +1.
> 
> When I first started using Python (back in the 1.4 days), I assumed a
> list was a singly-linked list.

Why would you do that? I can think of at least eight different 
implementations of the abstract list data structure:

constant-size array
variable-size array
variable-size array with amortised O(1) appends
singly-linked list
singly-linked list with CDR coding
doubly-linked list
skip list
indexable skip list

One can reasonably disregard constant-sized arrays as a possibility, 
given that Python lists aren't fixed size (pity the poor Pascal and 
Fortran coders who are stuck with static arrays!), but the rest are all 
reasonable possibilities. Why assume one specific implementation in the 
absence of documentation promising certain performance characteristics?


> Oddly enough, I was going to write in the above paragraph, "like a C++
> STL list", until I happened to glance at the STL docs and refreshed my
> memory that an STL list is doubly-linked.  Which just goes to show that
> making assumptions based on names is a bad idea.

Exactly :)



> So, we're right back to my statement earlier in this thread that the
> docs are deficient in that they describe behavior with no hint about
> cost. Given that, it should be no surprise that users make incorrect
> assumptions about cost.

There are quite a few problems with having the documentation specify cost:

(1) Who is going to do it? Any volunteers?

(2) Big-oh notation can be misleading, especially for naive users, or 
those whose intuition for what's fast has been shaped by other languages. 
Big-oh doesn't tell you whether something is fast or slow, only how it 
scales -- and sometimes not even then.

(3) Having documented a particular performance, that discourages 
implementation changes. Any would-be patch or new implementation not only 
has to consider that the functional behaviour doesn't change, but that 
the performance doesn't either.

In practice the Python developers are unlikely to make an implementation 
change which leads to radically worse performance, particularly for 
critical types like list and dict. But in other cases, they might choose 
to change big-oh behaviour, and not wish to be tied down by documentation 
of the cost of operations.

(4) How much detail is necessary? What about degenerate cases? E.g. dict 
lookup in CPython is typically O(1) amortised, but if all the keys hash 
to the same value, it falls to O(N).

(5) Should the language guarantee such degenerate behaviour? Who decides 
which costs are guaranteed and which are not?

(6) Such performance guarantees should be implementation specific, not 
language specific. CPython is only one implementation of the language out 
of many.



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


how can i know if a python object have a attribute such as 'attr1'?

2010-01-23 Thread thinke365

for example, i may define a python class:
class A:
 def sayHello():
  print 'hello'

a = A()
a.attr1 = 'hello'
a.attr2 = 'bb'

b = A()
a.attr2 = 'aa'

how can i know whether an object have an attribute named attr1?

-- 
View this message in context: 
http://old.nabble.com/how-can-i-know-if-a-python-object-have-a-attribute-such-as-%27attr1%27--tp27286937p27286937.html
Sent from the Python - python-list mailing list archive at Nabble.com.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: iterating lists

2010-01-23 Thread Alf P. Steinbach

* ceciliasei...@gmx.de:

As you were talking about list.pop()...

Is anyone able to reproduce the following and explain why this happens 
by chance? (Using 3.1.1)


l1 = ["ready", "steady", "go"]
l2 = ["one", "two", "tree"]
l3 = ["lift off"]

for w in l1:
   print(l1.pop())  #prints only "go steady" - why not "ready"??

for w in range(len(l2)):
   print(l2.pop())  #prints "three two one" as expected

for w in l3:
   print(l3.pop()) #prints "lift off" - inconsistent to first case...


At least for 2.2.3 I found the first way to iterate the list as default, 
I guess it is deprecated now but still what happens seems weird to me...


Arnaud Delobelle has already answered your question, but you might alternatively 
try this angle:



l1 = ["ready", "steady", "go"]
l2 = ["one", "two", "tree"]
l3 = ["lift off"]

for w in l1:
   print( w, l1 )
   print(l1.pop())  #prints only "go steady" - why not "ready"??

for w in range(len(l2)):
   print(l2.pop())  #prints "three two one" as expected
for w in l3:
   print( w, l3 )
   print(l3.pop()) #prints "lift off" - inconsistent to first case...


If the list has at least one item you always get into the first iteration of the 
loop. I.e. there's no inconsistency, unless you count the lack of an exception 
as an inconsistency. I don't know whether the behavior is clearly defined or 
not; there is a possibility that it might be well-defined.



Cheers & hth.,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list


Re: enumerated while loop

2010-01-23 Thread Roald de Vries

On Jan 23, 2010, at 3:58 PM, Mark Dickinson wrote:


On Jan 23, 2:44 pm, Roald de Vries  wrote:

I assume a function like 'naturals' already exists, or a similar
construction for the same purpose. But what is it called?


itertools.count()




On Jan 23, 2010, at 4:04 PM, Jan Kaliszewski wrote:


itertools.count() -- see 
http://docs.python.org/library/itertools.html#itertools.count


for i, item in enumerate(iterable):



That's completely beside the point.


Thanks guys
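For the record, a minimal sketch of count() in that role:

```python
from itertools import count

seen = []
for i in count():   # yields 0, 1, 2, ... indefinitely
    seen.append(i)
    if i >= 2:
        break

print(seen)  # [0, 1, 2]
# count(10) starts at 10; count(2, 3) yields 2, 5, 8, ...
```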


--
http://mail.python.org/mailman/listinfo/python-list


Re: iterating lists

2010-01-23 Thread Arnaud Delobelle
ceciliasei...@gmx.de writes:

> As you were talking about list.pop()...
>
> Is anyone able to reproduce the following and explain why this happens
> by chance? (Using 3.1.1)
>
> l1 = ["ready", "steady", "go"]
> l2 = ["one", "two", "three"]
> l3 = ["lift off"]
>
> for w in l1:
>print(l1.pop())  #prints only "go steady" - why not "ready"??
>

I suggest you simulate the loop above using pen and paper, writing the
value of w and the value of l1 at each iteration.  The behaviour you are
observing should then be clearly explained.  And you should also realise
that it's a really bad idea to mutate a list that you are iterating on!
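
The pen-and-paper simulation can also be done in code. A sketch that spells out
what "for w in l1" does under the hood (CPython's list iterator just keeps a
bare index into the list):

```python
l1 = ["ready", "steady", "go"]

it = iter(l1)                # list iterator: holds an internal index, starting at 0
popped = []
while True:
    try:
        w = next(it)         # reads l1[index], then advances the index
    except StopIteration:    # raised as soon as index >= len(l1)
        break
    popped.append(l1.pop())  # pop() shrinks the list from the right end

print(popped)  # ['go', 'steady']
print(l1)      # ['ready'] -- the index reached 2 while len(l1) was 1, so the loop ended
```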

HTH

-- 
Arnaud
-- 
http://mail.python.org/mailman/listinfo/python-list


iterating lists

2010-01-23 Thread ceciliaseidel

As you were talking about list.pop()...

Is anyone able to reproduce the following and explain why this happens 
by chance? (Using 3.1.1)


l1 = ["ready", "steady", "go"]
l2 = ["one", "two", "three"]
l3 = ["lift off"]

for w in l1:
   print(l1.pop())  #prints only "go steady" - why not "ready"??

for w in range(len(l2)):
   print(l2.pop())  #prints "three two one" as expected

for w in l3:
   print(l3.pop()) #prints "lift off" - inconsistent to first case...


At least for 2.2.3 I found the first way to iterate the list as default, 
I guess it is deprecated now but still what happens seems weird to me...

--
http://mail.python.org/mailman/listinfo/python-list


Re: enumerated while loop

2010-01-23 Thread Roald de Vries


On Jan 23, 2010, at 3:49 PM, Diez B. Roggisch wrote:


Am 23.01.10 15:44, schrieb Roald de Vries:

Dear all,

I sometimes want to use an infinite while loop with access to the  
loop

index, like this:

def naturals():
    i = 0
    while True:
        yield i
        i += 1

for i in naturals():
    print(i)

I assume a function like 'naturals' already exists, or a similar
construction for the same purpose. But what is it called?


for i, item in enumerate(iterable):
   


This only moves the problem to the variable 'iterable'. And moreover  
is more complex than needed; it give '(i, item)', whereas I only need  
the index 'i'.



--
http://mail.python.org/mailman/listinfo/python-list


Re: enumerated while loop

2010-01-23 Thread Roald de Vries


On Jan 23, 2010, at 3:50 PM, Grant Edwards wrote:


On 2010-01-23, Roald de Vries  wrote:

Dear all,

I sometimes want to use an infinite while loop with access to the  
loop

index, like this:

def naturals():
    i = 0
    while True:
        yield i
        i += 1

for i in naturals():
    print(i)

I assume a function like 'naturals' already exists, or a similar
construction for the same purpose. But what is it called?


xrange(), except that you need to provide an upper limit.


I'm sorry, Grant, but you exactly miss my point ;-). The essence of  
this is that it's endless.




--
http://mail.python.org/mailman/listinfo/python-list


Re: enumerated while loop

2010-01-23 Thread Jan Kaliszewski

23-01-2010 o 15:49:23 Diez B. Roggisch  wrote:


Am 23.01.10 15:44, schrieb Roald de Vries:

Dear all,

I sometimes want to use an infinite while loop with access to the loop
index, like this:

def naturals():
    i = 0
    while True:
        yield i
        i += 1

for i in naturals():
    print(i)

I assume a function like 'naturals' already exists, or a similar
construction for the same purpose. But what is it called?


itertools.count() -- see  
http://docs.python.org/library/itertools.html#itertools.count



for i, item in enumerate(iterable):
 


That's completely beside the point.

Regards,
*j
--
http://mail.python.org/mailman/listinfo/python-list


Re: enumerated while loop

2010-01-23 Thread Mark Dickinson
On Jan 23, 2:44 pm, Roald de Vries  wrote:
> I assume a function like 'naturals' already exists, or a similar  
> construction for the same purpose. But what is it called?

itertools.count()
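
For reference, a minimal demonstration of itertools.count(); islice() is used
only to keep the output finite, since count() itself never ends:

```python
from itertools import count, islice

# count() yields 0, 1, 2, ... endlessly -- the infinite loop index asked for
first_five = list(islice(count(), 5))
print(first_five)      # [0, 1, 2, 3, 4]

# it also accepts a start value and (since 2.7/3.1) a step
offset = list(islice(count(10, 2), 3))
print(offset)          # [10, 12, 14]
```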

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Roy Smith
In article ,
 "Alf P. Steinbach"  wrote:

> But it would IMHO have been better if it wasn't called "list", which brings 
> in the wrong associations for someone used to other languages.

+1.

When I first started using Python (back in the 1.4 days), I assumed a list 
was a singly-linked list.  Which, of course, leads to the assumption that 
pop(0) is O(1), and lots of other wrong thinking(*).

Oddly enough, I was going to write in the above paragraph, "like a C++ STL 
list", until I happened to glance at the STL docs and refreshed my memory 
that an STL list is doubly-linked.  Which just goes to show that making 
assumptions based on names is a bad idea.

So, we're right back to my statement earlier in this thread that the docs 
are deficient in that they describe behavior with no hint about cost.  
Given that, it should be no surprise that users make incorrect assumptions 
about cost.

(*) I suspect somebody is going to point out that list.pop was added in 
some version later than 1.4, but that's a detail.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: medians for degree measurements

2010-01-23 Thread Arnaud Delobelle
Steve Howell  writes:

> I just saw the thread for medians, and it reminded me of a problem
> that I need to solve.  We are writing some Python software for
> sailing, and we need to detect when we've departed from the median
> heading on the leg.  Calculating arithmetic medians is
> straightforward, but compass bearings add a twist.
>
> The numerical median of 1, 2, 3, 4, 5, 6, 359 is 4.  But for
> navigational purposes you would actually order the numbers 359, 1, 2,
> 3, 4, 5, 6, so the desired median heading of the boat is actually 3.
>
> Of course, you could express 359 better as -1 degrees to north, then
> the sequence would be -1, 1, 2, 3, 4, 5, and 6.  And you'd be good.
>
> But that trick does not generalize if you go south instead, as you
> have similar issues with -179, 174, 175, 176, 177, 178, and 179.
>
> Has anybody solved this in Python, either for compass bearings or a
> different domain?  I can think of kind of a brute force solution where
> you keep rotating the sequence until the endpoints are closest
> together mod 360, but I wonder if there is something more elegant.

Here's a simple (too simple?) way to do it:

   1. sort the bearings in ascending order
   2. find the largest gap between two consecutive bearings
   3. cut the circle at this point and now find the median the
   normal way

In Python:

>>> def median_bearing(bearings):
... bearings = sorted(bearings)
... n = len(bearings)
... i = max(xrange(n), key=lambda i: (bearings[(i+1)%n] - bearings[i])%360)
... return bearings[(i + (n+1)//2)%n]
... 
>>> median_bearing([1,2,3,4,5,6,359])
3
>>> median_bearing([-179, 174, 175, 176, 177, 178, 179])
177

I guess there may be cases where the results that it gives are still not
very good, as in general there isn't a good notion of median for cyclic
data.  E.g. what would be the median of [0, 180] - it could equally be
090 or 270.  Or the median of [0, 120, 240] has the same problem.  But I
suppose your data couldn't be like this as then it would be ill-advised
to try to apply a notion of median to it.

But it will work well I believe with quite localized data set
with a few, even wildly inaccurate, outliers.  E.g.

>>> median_bearing([123, 124, 125, 125, 126, 10, 240])
125
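
For anyone on Python 3, here is the same function with range in place of
xrange -- a direct port, not a change in behaviour:

```python
def median_bearing(bearings):
    # Cut the circle at the largest gap between consecutive sorted bearings,
    # then take the ordinary median of the resulting linear sequence.
    bearings = sorted(bearings)
    n = len(bearings)
    i = max(range(n), key=lambda i: (bearings[(i + 1) % n] - bearings[i]) % 360)
    return bearings[(i + (n + 1) // 2) % n]

print(median_bearing([1, 2, 3, 4, 5, 6, 359]))               # 3
print(median_bearing([-179, 174, 175, 176, 177, 178, 179]))  # 177
print(median_bearing([123, 124, 125, 125, 126, 10, 240]))    # 125
```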

HTH

-- 
Arnaud
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: enumerated while loop

2010-01-23 Thread Grant Edwards
On 2010-01-23, Roald de Vries  wrote:
> Dear all,
>
> I sometimes want to use an infinite while loop with access to the loop  
> index, like this:
>
> def naturals():
>     i = 0
>     while True:
>         yield i
>         i += 1
>
> for i in naturals():
>     print(i)
>
> I assume a function like 'naturals' already exists, or a similar  
> construction for the same purpose. But what is it called?

xrange(), except that you need to provide an upper limit.

-- 
Grant

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: enumerated while loop

2010-01-23 Thread Diez B. Roggisch

Am 23.01.10 15:44, schrieb Roald de Vries:

Dear all,

I sometimes want to use an infinite while loop with access to the loop
index, like this:

def naturals():
    i = 0
    while True:
        yield i
        i += 1

for i in naturals():
    print(i)

I assume a function like 'naturals' already exists, or a similar
construction for the same purpose. But what is it called?


for i, item in enumerate(iterable):



Diez
--
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Roy Smith
In article 
,
 Steve Howell  wrote:

> This innocent program here literally moves about a million bytes of
> memory around for no good reason:
> 
> lst = []
> for i in range(2000):
> lst.append(i)
> while lst:
> print lst.pop(0)
> 
> Why?  Because list.pop(0) is implemented in O(N) instead of O(1).

I think you're being a little pedantic here.  Yes, it is true that pop(0) 
is O(n), and that if you put an O(n) operation in a loop, you get O(n^2) 
run time.

The problem is that while it is well-known that putting something that's 
O(n) in a loop gets you O(n^2), it's not well known that pop(0) for a 
Python list is O(n).  This is where you and I apparently start to differ in 
what we think about this.

You are arguing that this is a bug in the implementation of list.  While I 
suppose there's some validity to that argument, I disagree.  What I would 
argue (and have done so several times over the years, with little success) 
is that this is a bug in the documentation!

I'm looking at http://tinyurl.com/cdbwog.  It shows all the operations of a 
list.  What it does not show is their cost.  For pop(), it has a note:

"The pop() method is only supported by the list and array types. The 
optional argument i defaults to -1, so that by default the last item is 
removed and returned".

There's nothing there that gives any hint that pop(0) is any more expensive 
than pop(-1).  That is "secret knowledge", which you only get by following 
the newsgroup discussions or looking at the implementation.  You shouldn't 
have to do either.  There's lots of possible ways list could be 
implemented.  Without knowing the details, I'm left to guess about 
important stuff like the cost of operations.

Every one of these operations should list the cost.  Even if it's something 
as vague as, "While not guaranteed by the language spec, in the current 
implementation of CPython ...".
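
The hidden cost is easy to surface empirically. A rough sketch (absolute
numbers are machine-dependent; only the ratio is meaningful):

```python
from timeit import timeit

setup = "from collections import deque; data = list(range(10000))"

# Drain 10000 items from the front: O(n) per pop for a list, O(1) for a deque.
t_pop0 = timeit("d = list(data)\nwhile d: d.pop(0)", setup=setup, number=5)
t_popleft = timeit("d = deque(data)\nwhile d: d.popleft()", setup=setup, number=5)

print("list.pop(0):     %.4fs" % t_pop0)
print("deque.popleft(): %.4fs" % t_popleft)  # typically far smaller
```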
-- 
http://mail.python.org/mailman/listinfo/python-list


enumerated while loop

2010-01-23 Thread Roald de Vries

Dear all,

I sometimes want to use an infinite while loop with access to the loop  
index, like this:


def naturals():
    i = 0
    while True:
        yield i
        i += 1

for i in naturals():
    print(i)

I assume a function like 'naturals' already exists, or a similar  
construction for the same purpose. But what is it called?


Kind regards, Roald



--
http://mail.python.org/mailman/listinfo/python-list


Re: Consume an iterable

2010-01-23 Thread Peter Otten
Muhammad Alkarouri wrote:

> On 23 Jan, 13:46, Peter Otten <__pete...@web.de> wrote:

>> def consume_islice(n, items):
>> next(islice(items, n, n), None)

> I submitted the bug report before considering this alternative, which
> is better. I may add this later in the day, but I have to wait a
> little as it seems you are going to optimize/improve the function
> almost out of existence:)

One problem: the above function doesn't consume the entire iterator like the 
original example does for n=None. Passing sys.maxint instead is not pretty.
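
One way to get both behaviours from a single helper -- n=None exhausting the
iterator, an integer n advancing exactly n items (n=0 is a no-op). This is
essentially the consume() recipe that the itertools documentation settled on:

```python
from collections import deque
from itertools import islice

def consume(iterator, n=None):
    """Advance the iterator n steps ahead; if n is None, consume entirely."""
    if n is None:
        deque(iterator, maxlen=0)           # feed a zero-length deque: O(1) space
    else:
        next(islice(iterator, n, n), None)  # skips n items, yields nothing

it = iter(range(10))
consume(it, 3)
print(next(it))  # 3
consume(it)      # drain the rest
print(list(it))  # []
```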

Peter

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Consume an iterable

2010-01-23 Thread Muhammad Alkarouri
On 23 Jan, 13:46, Peter Otten <__pete...@web.de> wrote:
> Peter Otten wrote:
> > Duncan Booth wrote:
>
> >> Peter Otten <__pete...@web.de> wrote:
>
> >>> With next(islice(...), None) I seem to have found a variant that beats
> >>> both  competitors.
>
> >> It has different behaviour for n==0 but I'm sure that's easily fixed.
>
> > "Different behaviour" being a euphemism for broken ;)
>
> > def consume_islice(n, items):
> >     if n == 0:
> >         return
> >     next(islice(items, n-1, None), None)
>
> Even better:
>
> def consume_islice(n, items):
>     next(islice(items, n, n), None)
>
> Peter

I submitted the bug report before considering this alternative, which
is better. I may add this later in the day, but I have to wait a
little as it seems you are going to optimize/improve the function
almost out of existence:)
If you are happy with this one, and can add the comment on the issue
(http://bugs.python.org/issue7764) yourself, please do so.

What I can say is, I am definitely very happy that I asked the
question. You live and learn:)

Cheers,

Muhammad
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Consume an iterable

2010-01-23 Thread Muhammad Alkarouri
On 23 Jan, 13:32, Peter Otten <__pete...@web.de> wrote:
> Muhammad Alkarouri wrote:
> > The next function performs much better. It is also much more direct
> > for the purposes of consume and much more understandable (at least for
> > me) as it doesn't require a specialized data structure which is
> > subsequently not used as such.
> > I am thus inclined to report it as a python documentation enhancement
> > (bug) request. Any comments?
>
> I would support that, as the deque(..., maxlen=0) trick is a bit too clever
> for my taste.
>
> Peter
>
> PS: Remember to include the bug fix for n=0 if you proceed.

Done. http://bugs.python.org/issue7764

Thanks for all the effort.

Muhammad
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Consume an iterable

2010-01-23 Thread Peter Otten
Peter Otten wrote:

> Duncan Booth wrote:
> 
>> Peter Otten <__pete...@web.de> wrote:
>> 
>>> With next(islice(...), None) I seem to have found a variant that beats
>>> both  competitors.
>>> 
>> It has different behaviour for n==0 but I'm sure that's easily fixed.
> 
> "Different behaviour" being a euphemism for broken ;)
> 
> def consume_islice(n, items):
> if n == 0:
> return
> next(islice(items, n-1, None), None)

Even better:

def consume_islice(n, items):
next(islice(items, n, n), None)

Peter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Christian Heimes
Steve Howell wrote:
> Another benchmark is that deques are slower than lists for accessing
> elements.

deques are optimized for accessing, inserting and removing data from
both ends. For anything else it's slower than the list type. The fact
was explained in this very thread yesterday.

Christian
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Consume an iterable

2010-01-23 Thread Peter Otten
Muhammad Alkarouri wrote:

> The next function performs much better. It is also much more direct
> for the purposes of consume and much more understandable (at least for
> me) as it doesn't require a specialized data structure which is
> subsequently not used as such.
> I am thus inclined to report it as a python documentation enhancement
> (bug) request. Any comments?

I would support that, as the deque(..., maxlen=0) trick is a bit too clever 
for my taste.

Peter

PS: Remember to include the bug fix for n=0 if you proceed.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Consume an iterable

2010-01-23 Thread Peter Otten
Duncan Booth wrote:

> Peter Otten <__pete...@web.de> wrote:
> 
>> With next(islice(...), None) I seem to have found a variant that beats
>> both  competitors.
>> 
> It has different behaviour for n==0 but I'm sure that's easily fixed.

"Different behaviour" being a euphemism for broken ;)

def consume_islice(n, items):
if n == 0:
return
next(islice(items, n-1, None), None)

Peter

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Consume an iterable

2010-01-23 Thread Peter Otten
Duncan Booth wrote:

> Peter Otten <__pete...@web.de> wrote:
> 
>> With next(islice(...), None) I seem to have found a variant that beats
>> both  competitors.
>> 
> It has different behaviour for n==0 but I'm sure that's easily fixed.

"Different behaviour" being a euphemism for broken ;)

def consume_islice(n, items):
if n == 0:
return
next(islice(items, n-1, None), None)

Peter

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Consume an iterable

2010-01-23 Thread Muhammad Alkarouri
On 23 Jan, 12:45, Peter Otten <__pete...@web.de> wrote:
> Muhammad Alkarouri wrote:
> > Thanks everyone, but not on my machine (Python 2.6.1, OS X 10.6) it's
> > not:
>
> > In [1]: from itertools import count, islice
>
> > In [2]: from collections import deque
>
> > In [3]: i1=count()
>
> > In [4]: def consume1(iterator, n):
> >    ...:     deque(islice(iterator, n), maxlen=0)
> >    ...:
> >    ...:
>
> > In [5]: i2=count()
>
> > In [6]: def consume2(iterator, n):
> >    ...:     for _ in islice(iterator, n): pass
> >    ...:
> >    ...:
>
> > In [7]: timeit consume1(i1, 10)
> > 100 loops, best of 3: 1.63 us per loop
>
> > In [8]: timeit consume2(i2, 10)
> > 100 loops, best of 3: 846 ns per loop
>
> > Can somebody please test whether it is only my machine or is this
> > reproducible?
>
> I can reproduce it. The deque-based approach has a bigger constant overhead
> but better per-item performance. Its asymptotic behaviour is therefore
> better.
>
> $ python consume_timeit.py
> consume_deque
>     10: 1.77500414848
>    100: 3.7001137
>   1000: 24.7235469818
>
> consume_forloop
>     10: 1.22008490562
>    100: 5.86271500587
>   1000: 52.2449371815
>
> consume_islice
>     10: 0.897439956665
>    100: 1.51542806625
>   1000: 7.70061397552
>
> $ cat consume_timeit.py
> from collections import deque
> from itertools import islice, repeat
>
> def consume_deque(n, items):
>     deque(islice(items, n), maxlen=0)
>
> def consume_forloop(n, items):
>     for _ in islice(items, n):
>         pass
>
> def consume_islice(n, items):
>     next(islice(items, n-1, None), None)
>
> def check(fs):
>     for consume in fs:
>         items = iter(range(10))
>         consume(3, items)
>         rest = list(items)
>         assert rest == range(3, 10), consume.__name__
>
> if __name__ == "__main__":
>     fs = consume_deque, consume_forloop, consume_islice
>     check(fs)
>
>     items = repeat(None)
>
>     from timeit import Timer
>     for consume in fs:
>         print consume.__name__
>         for n in (10, 100, 1000):
>             print "%6d:" % n,
>             print Timer("consume(%s, items)" % n,
>                         "from __main__ import consume, items").timeit()
>         print
> $
>
> With next(islice(...), None) I seem to have found a variant that beats both  
> competitors.
>
> Peter

Thanks Peter, I got more or less the same result on my machine (Python
2.6.1, x86_64, OS X 10.6):

~/tmp> python consume_timeit.py
consume_deque
    10: 1.3138859272
   100: 3.54495286942
  1000: 24.9603481293

consume_forloop
    10: 0.658113002777
   100: 2.85697007179
  1000: 24.6610429287

consume_islice
    10: 0.637741088867
   100: 1.09042882919
  1000: 5.44473600388

The next function performs much better. It is also much more direct
for the purposes of consume and much more understandable (at least for
me) as it doesn't require a specialized data structure which is
subsequently not used as such.
I am thus inclined to report it as a python documentation enhancement
(bug) request. Any comments?

Cheers,

Muhammad
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Consume an iterable

2010-01-23 Thread Duncan Booth
Peter Otten <__pete...@web.de> wrote:

> With next(islice(...), None) I seem to have found a variant that beats
> both  competitors.
> 
It has different behaviour for n==0 but I'm sure that's easily fixed.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Consume an iterable

2010-01-23 Thread Peter Otten
Muhammad Alkarouri wrote:

> Thanks everyone, but not on my machine (Python 2.6.1, OS X 10.6) it's
> not:
> 
> 
> In [1]: from itertools import count, islice
> 
> In [2]: from collections import deque
> 
> In [3]: i1=count()
> 
> In [4]: def consume1(iterator, n):
>...: deque(islice(iterator, n), maxlen=0)
>...:
>...:
> 
> In [5]: i2=count()
> 
> In [6]: def consume2(iterator, n):
>...: for _ in islice(iterator, n): pass
>...:
>...:
> 
> In [7]: timeit consume1(i1, 10)
> 100 loops, best of 3: 1.63 us per loop
> 
> In [8]: timeit consume2(i2, 10)
> 100 loops, best of 3: 846 ns per loop
> 
> 
> Can somebody please test whether it is only my machine or is this
> reproducible?

I can reproduce it. The deque-based approach has a bigger constant overhead 
but better per-item performance. Its asymptotic behaviour is therefore 
better.

$ python consume_timeit.py
consume_deque
    10: 1.77500414848
   100: 3.7001137
  1000: 24.7235469818

consume_forloop
    10: 1.22008490562
   100: 5.86271500587
  1000: 52.2449371815

consume_islice
    10: 0.897439956665
   100: 1.51542806625
  1000: 7.70061397552

$ cat consume_timeit.py
from collections import deque
from itertools import islice, repeat

def consume_deque(n, items):
deque(islice(items, n), maxlen=0)

def consume_forloop(n, items):
for _ in islice(items, n):
pass

def consume_islice(n, items):
next(islice(items, n-1, None), None)

def check(fs):
for consume in fs:
items = iter(range(10))
consume(3, items)
rest = list(items)
assert rest == range(3, 10), consume.__name__

if __name__ == "__main__":
fs = consume_deque, consume_forloop, consume_islice
check(fs)


items = repeat(None)

from timeit import Timer
for consume in fs:
print consume.__name__
for n in (10, 100, 1000):
print "%6d:" % n,
print Timer("consume(%s, items)" % n,
"from __main__ import consume, items").timeit()
print
$

With next(islice(...), None) I seem to have found a variant that beats both  
competitors.

Peter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Arnaud Delobelle
Dave Angel  writes:

> Arnaud Delobelle wrote:
>> Steve Howell  writes:
>>
>>   
>>> On Jan 22, 12:14 pm, Chris Rebert  wrote:
>>> 
>> I made the comment you quoted.  In CPython, it is O(n) to delete/insert
>> an element at the start of the list - I know it because I looked at the
>> implementation a while ago.  This is why collections.deque exists I
>> guess.  I don't know how they are implemented but insertion/deletion at
>> either end are O(1) and so is random access.  So they are the data
>> structure that you want.
>>
>>   
> Not according to the 2.6 docs.
>
> Indexed access is O(1) at both ends but slows to O(n) in the
> middle. For fast random access, use lists instead.

Yes you are correct.  This will teach me (again!) to check my facts.

>
> That sounds to me like a doubly-linked list implementation.

I've just looked and it is a doubly-linked list of 'blocks' of size
BLOCKLEN, which is 62 on the source I have (I guess it's 62 because then
the whole block structure is 64 exactly, 62 + 1 for each link).  So a
small list will have near constant random access, in a way.
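
That block layout is observable from pure Python. A quick sanity check
(BLOCKLEN is an implementation detail, so only the trend -- fast ends, slower
middle -- is meaningful, not the absolute numbers):

```python
from collections import deque
from timeit import timeit

d = deque(range(1000000))

# Indexing near the ends touches at most one block; indexing the middle has to
# walk roughly n / (2 * BLOCKLEN) links from the nearer end.
t_ends = timeit("d[0]; d[-1]", globals=globals(), number=10000)
t_mid = timeit("d[500000]", globals=globals(), number=10000)

print("ends:   %.4fs" % t_ends)
print("middle: %.4fs" % t_mid)  # typically orders of magnitude slower
```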

-- 
Arnaud
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A.x vs. A["x"]

2010-01-23 Thread Martin Drautzburg
Terry Reedy wrote:

> On 1/22/2010 2:29 PM, Martin Drautzburg wrote:
>> This has probably been asekd a million times, but if someone could
>> give a short answer anyways I's be most grateful.
>>
>> What is it that allows one to write A.x? If I have a variable A,
 

>> I know you can do this with classes, but not with plain objects, but
>> why is that so?
> 
> You exact meaning here is not clear, but I suspect it is somewhat
> incorrect, at least for built-in classes.

You're right. I used to think you could do this to classes:

G = Strum
G.x=1

But not to objects (instances):

g = Strum()
g.y = 2

But in fact both work. Thanks for clarifying
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Steven D'Aprano
On Fri, 22 Jan 2010 21:42:43 -0800, Steve Howell wrote:

> This innocent program here literally moves about a million bytes of
> memory around for no good reason:
> 
> lst = []
> for i in range(2000):
> lst.append(i)
> while lst:
> print lst.pop(0)
> 
> Why?  Because list.pop(0) is implemented in O(N) instead of O(1).
> 
> Why wouldn't you get a competent C programmer simply make list_ass_slice
> smart enough to make list.pop(0) O(1)?

Because there are always trade-offs, and the competent C programmers who 
coded the implementation for lists choose different tradeoffs to the ones 
you would prefer.

Seems to me that the simple solution to your problem is for you to 
implement your own data structure that makes whatever trade-offs you 
like. If it is good enough and popular enough, it might even replace the 
existing list implementation.



>> That is, you give me the impression that you prefer this:
>>
>> while alist:
>>     x = alist.pop(0)
>>     process(x)
>>
>> over this:
>>
>> for x in alist:
>>     process(x)
>> # allow alist to be garbage collected when it goes out of scope
>>
>>
> No, to be more precise, I prefer this implementation of a recursive
> parser (using lists) to one that would have to use deque's because of
> the lameness of Python's list implementation:
> 
> https://bitbucket.org/showell/shpaml_website/src/tip/shpaml.py


That's a lot of code. Am I supposed to study the whole module, or can you 
give me a hint as to what you're referring to? The lack of docstrings and 
comments doesn't fill me with enthusiasm for reading it.

Nevertheless, on the basis of a quick scan, I suppose that you're 
probably talking about the nested function called recurse:

def recurse(prefix_lines):
while prefix_lines:
prefix, line = prefix_lines[0]
if line == '':
prefix_lines.pop(0)
append('')
else:
block_size = get_block(prefix_lines)
if block_size == 1:
prefix_lines.pop(0)
if line == pass_syntax:
pass
elif line.startswith(flush_left_syntax):
append(line[len(flush_left_syntax):])
elif line.startswith(flush_left_empty_line):
append('')
else:
append(prefix + leaf_method(line))
else:
block = prefix_lines[:block_size]
prefix_lines = prefix_lines[block_size:]
branch_method(output, block, recurse)
return


Since you're not even looking at the results of the pop, why don't you 
just call del prefix_lines[0]? It probably won't perform any better, but 
it is more idiomatic.

An alternative would be to do exactly what you want lists to do: track 
the start of the list. Untested:


def recurse(prefix_lines):
start = 0
end = len(prefix_lines)
while start < end:
prefix, line = prefix_lines[start]
if line == '':
start += 1
append('')
else:
block_size = get_block(prefix_lines)
if block_size == 1:
start += 1
if line == pass_syntax:
pass
elif line.startswith(flush_left_syntax):
append(line[len(flush_left_syntax):])
elif line.startswith(flush_left_empty_line):
append('')
else:
append(prefix + leaf_method(line))
else:
            block = prefix_lines[start:start + block_size]
            start += block_size
branch_method(output, block, recurse)
return


No more O(N) deletions. Problem solved.




-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: medians for degree measurements

2010-01-23 Thread Steven D'Aprano
On Fri, 22 Jan 2010 22:09:54 -0800, Steve Howell wrote:

> On Jan 22, 5:12 pm, MRAB  wrote:
>> Steve Howell wrote:
>> > I just saw the thread for medians, and it reminded me of a problem
>> > that I need to solve.  We are writing some Python software for
>> > sailing, and we need to detect when we've departed from the median
>> > heading on the leg.  Calculating arithmetic medians is
>> > straightforward, but compass bearings add a twist.
[...]
> I like this implementation, and it would probably work 99.% of the
> time for my particular use case.  The only (very contrived) edge case
> that I can think of is when you have 10 bearings to SSW, 10 bearings to
> SSE, and the two outliers are unfortunately in the NE and NW quadrants. 
> It seems like the algorithm above would pick one of the outliers.

The trouble is that median of angular measurements is not a meaningful 
concept. The median depends on the values being ordered, but angles can't 
be sensibly ordered. Which is larger, 1 degree north or 359 degrees? Is 
the midpoint between them 0 degree or 180 degree?

The median of the number line 0 <= x <= 360 is 180, but what's the median 
of the circle 0 deg <= theta <= 360 deg? With no end points, it's 
meaningless to talk about the middle.

For this reason, I expect that taking the median of bearing measurements 
isn't well defined. You can probably get away with it so long as the 
bearings are "close", where "close" itself is ill-defined.

In any case, this isn't a programming problem, it's a mathematics 
problem. I think you're better off taking it to a maths or statistics 
forum, and see what they say.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Using dictionary key as a regular expression class

2010-01-23 Thread Steve Holden
Chris Jones wrote:
> On Fri, Jan 22, 2010 at 05:07:13PM EST, Arnaud Delobelle wrote:
> 
> [..]
> 
>> import codecs
>> from collections import defaultdict
>>
>> tcounters = defaultdict(int)
>> f = codecs.open('/home/gavron/git/screen/src/screen.c', 'r', "utf-8")
>>
>> for c in f.read():
>> tcounters[c] += 1
>>
>> for c, n in tcounters.iteritems():
>> print "%r\t%i" % (c, n)
> 
> Ah, yes.. much better - I grew suspicious of my 'effort' when I realized
> I could have written a shorter version in C. :-)
> 
Congratulations. That perception shows a sound appreciation of Python's
design.

regards
 Steve
-- 
Steve Holden   +1 571 484 6266   +1 800 494 3119
PyCon is coming! Atlanta, Feb 2010  http://us.pycon.org/
Holden Web LLC http://www.holdenweb.com/
UPCOMING EVENTS:http://holdenweb.eventbrite.com/

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Mastering Python 3 I/O - Special Preview - Feb 5, 2010 (Chicago)

2010-01-23 Thread Steve Holden
Chris Colbert wrote:
> oops :)
> 
Yes, but only a little oops. I think it's probably fairly well-known
that David Beazley and I both supply training, and you'd hardly expect
profit to be off the list of objectives for US trainers. But my failure
to hit "reply to sender only" certainly did show that it's been a long
week. Sorry for the noise!

regards
 Steve
-- 
Steve Holden   +1 571 484 6266   +1 800 494 3119
PyCon is coming! Atlanta, Feb 2010  http://us.pycon.org/
Holden Web LLC http://www.holdenweb.com/
UPCOMING EVENTS:http://holdenweb.eventbrite.com/

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Consume an iterable

2010-01-23 Thread Muhammad Alkarouri
Thanks everyone, but not on my machine (Python 2.6.1, OS X 10.6) it's
not:


In [1]: from itertools import count, islice

In [2]: from collections import deque

In [3]: i1=count()

In [4]: def consume1(iterator, n):
   ...: deque(islice(iterator, n), maxlen=0)
   ...:
   ...:

In [5]: i2=count()

In [6]: def consume2(iterator, n):
   ...: for _ in islice(iterator, n): pass
   ...:
   ...:

In [7]: timeit consume1(i1, 10)
100 loops, best of 3: 1.63 us per loop

In [8]: timeit consume2(i2, 10)
100 loops, best of 3: 846 ns per loop


Can somebody please test whether it is only my machine or is this
reproducible?

(Thanks Jan for making me actually carry the test)
-- 
http://mail.python.org/mailman/listinfo/python-list


ISC License

2010-01-23 Thread Joan Miller
There is a license approved by the OSI, the ISC License [1], which
should be included in the PyPi classifiers [2].


[1] https://www.isc.org/software/license
http://www.opensource.org/licenses/isc-license.txt
[2] http://pypi.python.org/pypi?%3Aaction=list_classifiers
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: counting lines of code

2010-01-23 Thread Michele Simionato
On Jan 22, 10:30 pm, Steve Holden  wrote:
> >> Oh, sorry, did I have the wrong opinion?
>
> > You had a condescending attitude.
>
> Towards someone who is fairly obviously not a Python neophyte.
>
> Please don't think we are telling you you can't have any opinion you
> like. Just don't expect to get away with it when you are wrong ;-)

Come on, we are all friends here! The guy just said

> In my experience with Python codebases that big...
>
> ...how many of those lines are duplicated, and might merge together
> into a better design?
>
> The LOC would go down, too.

and it was even partially right. It is certainly right for the part I
was working on.
If it was good code from the beginning we would not have started this
refactoring project,
right? Peace,

   Michele
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: counting lines of code

2010-01-23 Thread Michele Simionato
On Jan 22, 7:51 pm, Phlip  wrote:
> On Jan 21, 9:00 pm, Michele Simionato 
> wrote:
>
> > Just for fun I have run cloc on our trunk:
>
> > SUM:                8743    272238    215871   1470139 x   1.84 =
> > 2708354.95
>
> Nice!
>
> My favorite version of a cloc system can distinguish test from
> production code. That's why I always use executable cloc to measure
> the ratio of test to production code (where 1.2:1 is almost
> comfortable and 2:1 is sacred).

Most of this code base is old, before we started using automatic
tests, so tests are not a significant fraction of the code. And in any
case I consider tests as code, since you have to maintain them,
refactor them, etc.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Steve Howell
On Jan 23, 1:24 am, Terry Reedy  wrote:
> On 1/23/2010 3:23 AM, Steve Howell wrote:
>
> > On Jan 23, 12:13 am, Terry Reedy  wrote:
>
> >> Challenge yes, mock no.
>
> >> Part of writing good basic data structures is not adding needless
> >> complication from featuritis and not penalizing 99.99% of access to
> >> satisfy a .01% need better satisfied another way.
>
> > I would like to challenge your assertion that advancing ob_item
> > instead of doing memmove during list_ass_slice would impact the
> > performance of list accesses in any way.  It would only slow down
> > operations that add/insert items into the list by, and then only by a
> > single conditional statement, and those add/insert operations are
> > already O(N) to begin with.
>
> If you try writing a full patch, as I believe someone did, or at least a
> prototype thereof, when the idea was discussed, you might have a better
> idea of what the tradeoffs are and why it was rejected.
>
> For instance, when you append to a full list, it is resized. I believe
> it is now doubled, but the policy has varied over the years. If you turn
> list from essentially a stack to a deque, you complicate the resizing
> policy and have to consider the addition of a shift policy. Do you split
> the over-allocated fore and aft or increase the overallocation from
> double to, say, triple? If the former, then for normal usage that never
> uses the fore part, the over-allocation has been effectively reduced
> from 2x to 1.5x (which is about what it once was, I believe). This means
> more frequent resizings and copyings, which means slower operation for
> most use cases. Adding extra usually wasted space is also a nuisance.
>

It looks like most of the complication would be in list_resize.

I'm gonna oversimplify a bit, but tell me if this is the gist of it.

I would have ob_item continue to always refer to first element of the
list, and then I'd have to introduce another variable to refer to the
start of our allocated memory, ob_start_memory, whenever you do a
realloc/free/malloc.  I'd have a notion of fore_wastage, which would
either be a variable I maintain or something that I just calculate as
needed from the difference of ob_item and ob_start_memory.

In deciding whether I want to give memory back to the memory manager,
I just need to adjust my calculations to account for fore and aft
wastage to see if it's time to do a shrink, and before shrinking, I do
the memmove.

On growth, I would just always do a memmove right away if there is
fore_wastage, and then do the normal procedure for aft wastage.

For the most common scenario of append, append, append, the only
penalty is having to skip over fore_wastage logic by checking for
fore_wastage == 0 or ob_item == ob_start_memory.

For the scenario of several appends followed by several pops, I get
the big win of only doing log 2 N memmoves instead of N as I shrink
the list down to zero.

If I start alternating between pops and appends, it's basically a
wash...instead of doing the memmove on the pop, I do it on the next
append.

If I were to pop the top element and then prepend a new element, to be
truly efficient, I'd want to use reserved right away, but for
simplicity, I would probably not complicate list_ass_slice and just do
the normal resize() dance, which would give me memmove in one
direction followed immediately by a memmove in the other direction
when I come back to list_ass_slice.  (But it would all still be a
wash, since I would have to do the same number of memmoves in the
current implementation.)

A lot of the essential complexity here seems to come from the fact
that realloc() isn't a very robust abstraction.  It seems to be
expensive to tell it you want to shrink, and it also does not have an
interface to tell it to give you a little growing room.  On the other
hand, the code within list_resize() actually provides a nice framework
for amortizing memmoves exponentially.
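To make the bookkeeping above concrete, here is a toy pure-Python model of
the scheme (the names fore_wastage/ob_start_memory come from this thread;
this is an illustration of the amortization idea, not the proposed C patch):

```python
class OffsetList(object):
    """Toy model of a list whose pop-from-front just advances an offset."""

    def __init__(self, items=()):
        self._buf = list(items)  # plays the role of ob_start_memory
        self._off = 0            # fore_wastage: index of the first live item

    def __len__(self):
        return len(self._buf) - self._off

    def __getitem__(self, i):
        # The only cost added to indexing: one extra addition.
        return self._buf[self._off + i]

    def append(self, x):
        self._buf.append(x)

    def pop0(self):
        # O(1) except when we compact; the compaction is the "memmove".
        x = self._buf[self._off]
        self._off += 1
        if self._off * 2 >= len(self._buf):
            # Reclaim the fore wastage once it reaches half the buffer,
            # so N consecutive pops trigger only O(log N) compactions.
            del self._buf[:self._off]
            self._off = 0
        return x
```

The compaction threshold (half the buffer) is an arbitrary choice here; any
constant fraction gives the amortized behaviour described above.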
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Efficient Running Median

2010-01-23 Thread Bearophile
Very nice. I have added a comment at the bottom, using a circular
queue instead of a deque may be faster.

Bye,
bearophile
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PyArg_ParseTupleAndKeywords

2010-01-23 Thread Mr.M

Carl Banks ha scritto:

(some declarations omitted here)


You probably shouldn't have, that could be where the error is. I'd
include the whole function up to the call that raises the exception.


Thank you so much Carl for your help, i'll provide more info so that you
can try to fix my errors!
Thank you again!

First of all i'll try to explain what i've discovered over a night i
spent awake trying to find where the bug is.

My init behaves differently in this situations:

[code]
# this works
dummy = mymodule.myclass()

# this works
dummy = mymodule.myclass("string for service")

# this also works
dummy = mymodule.myclass("string for service", "string for event_type")
# and it works if i provide arguments in the right sequence, of course

# this won't work
dummy = mymodule.myclass("string for service", "string for event_type",
free_text = "string for free_text")

# but this works
dummy = mymodule.myclass(service = "string for service")

# the worst thing: this doesn't work but it did and i can't
# understand what is changed

dummy = mymodule.myclass(free_text = "text for free_text")

ok, every time the code fails i get this exception:

TypeError: expected string or Unicode object, tuple found

so, i think for some reason PyArg_ParseTupleAndKeywords receives "args"
as a tuple instead of something else.

This is everything i managed to discover.

This is the code:

[code]
/* DEBUG INFO */
printf("event INIT\n");
if (args)
{
int i;
printf("args= \"%s\"\n", PyString_AsString(args));
printf("type= %s\n",
PyString_AsString(PyObject_Repr(PyObject_Type(args))));
printf("tuple size  = %d\n", PyTuple_Size(args));
for (i = 0; i < PyTuple_Size(args); i++)
{
printf("%d) %s\n", i,
PyString_AsString(PyTuple_GetItem(args, i)));
}
}
else
{
printf("args = NULL\n");
}
printf("dict:\n");
if (keywords)
{
printf("keywords = %s\n",
PyString_AsString(PyObject_Repr(keywords)));
printf("type = %s\n",
PyString_AsString(PyObject_Repr(PyObject_Type(keywords))));
}
else
{
printf("keywords = NULL\n");
}

char* service  = NULL;
char* event_type   = NULL;
char* free_text= NULL;
char* group_uid= NULL;
char* remote_name  = NULL;
char* remote_abook_uid = NULL;

time_t start_time   = -1L;
time_t end_time = -1L;
time_t storage_time = -1L;

int flags = 0;

int bytes_sent = -1;
int bytes_received = -1;

PyObject* temp;

static char* keywordlist[] = {"service",
  "event_type",
  "free_text",
  "group_uid",
  "remote_name",
  "remote_abook_uid",
  "start_time",
  "end_time",
  "storage_time",
  "flags",
  "bytes_sent",
  "bytes_received",
  NULL};

if (!PyArg_ParseTupleAndKeywords(args, keywords, "|ssii",
keywordlist,
 &service,
 &event_type,
 &free_text,
 &group_uid,
 &remote_name,
 &remote_abook_uid,
 &start_time,
 &end_time,
 &storage_time,
 &flags,
 &bytes_sent,
 &bytes_received))
{
return -1;
}

printf("PyArg_ParseTupleAndKeywords worked fine\n");

[/code]

(sorry if my code is a little messy and my english rather bad!)


Are you sure that PyArg_ParseTupleAndKeywords is what's raising the
error?


Yes, i think so. I have a lot of printf in my code!


Are you subclassing this type in Python, and passing one of the string
parameters a tuple by mistake?  For instance, did you do something
like this:

class myclass(_mycmodule._myctype):
    def __init__(self, *args, **kwargs):
        log_something_here()
        super(myclass, self).__init__(args, **kwargs)

Note the missing * on args.


no, i'm not subclassing.
I have to admit that i had some testcase and a few weeks ago they worked
like a charm... i can't understand what i changed, of course.
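As an aside, the missing-star scenario Carl describes is easy to reproduce
in pure Python; the classes below are hypothetical stand-ins for the C type,
not the poster's actual module, but they produce the same kind of message:

```python
class Base(object):
    """Stand-in for the C extension type (hypothetical)."""

    def __init__(self, service=None):
        if service is not None and not isinstance(service, str):
            # Mimics PyArg_ParseTupleAndKeywords' complaint.
            raise TypeError("expected string or Unicode object, %s found"
                            % type(service).__name__)
        self.service = service

class Child(Base):
    def __init__(self, *args, **kwargs):
        # Bug: the args tuple itself is passed as the first positional
        # argument (note the missing * on args).
        super(Child, self).__init__(args, **kwargs)
```

Calling Child("string for service") then fails with "... tuple found",
because Base receives the whole tuple where one string was expected.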





I found that if i write this everything works fine:

[code]
import mymodule
a = mymodule.myclass(service = "blabla", event_type = "event", free_t

Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Alf P. Steinbach

* Steve Howell:

On Jan 23, 12:32 am, "Alf P. Steinbach"  wrote:

* Steve Howell:


On Jan 23, 12:13 am, Terry Reedy  wrote:

Challenge yes, mock no.
Part of writing good basic data structures is not adding needless
complication from featuritis and not penalizing 99.99% of access to
satisfy a .01% need better satisfied another way.

I would like to challenge your assertion that advancing ob_item
instead of doing memmove during list_ass_slice would impact the
performance of list accesses in any way.  It would only slow down
operations that add/insert items into the list by, and then only by a
single conditional statement, and those add/insert operations are
already O(N) to begin with.

I'm sorry, no, the last part is incorrect.

Appending to a 'list' can currently be constant time, if OS reallocation is
constant time (as the string '+' optimization relies on).



That's true, but it's also irrelevant, as the pop optimization would
happen in a branch of code that never gets executed during list
appending:

if (d < 0) { /* Delete -d items */
    /* ADD CODE HERE TO AVOID memmove
       when ilow == 0 (i.e. ihigh + d == 0)
    */
    memmove(&item[ihigh+d], &item[ihigh],
            (Py_SIZE(a) - ihigh)*sizeof(PyObject *));
    list_resize(a, Py_SIZE(a) + d);
    item = a->ob_item;
}



With the pop optimization it can no longer be constant time without risking an
accumulation of unused memory, a memory leak, although it can be amortized
constant time, at the cost of wasting some percentage of memory.


list_resize already overallocates memory to allow room for growth.
Whenever you did an append to the list that would force a realloc, you
could first check to see if there is unused stuff at the front and do
the memmove then and reclaim the unfreed memory.  So instead of
paying for memmove on every pop [at front], you only pay for it when
the list goes to size 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, etc.


Oh. If 'list' already uses a doubling buffer then the only overhead from the 
optimization would, AFAICS, be a single add in every indexing. Which might be 
acceptable (at least it sounds very reasonable in the context of Python).


Re terminology: I write "doubling buffer" to mean increase of buffer size by a 
factor. It's often 2, but might be e.g. 1.5, whatever. The point of using a 
constant factor is to avoid quadratic time for loops doing appending, i.e. the 
constant factor size increase yields amortized constant time per append.


The sizes that you quote above, on the other hand, look like rather arbitrary 
buffer size increases where the size to increase by is limited to a certain 
small range. With copying or moving of everything at each buffer size increase 
that would not yield amortized constant time. It yield linear time, and 
quadratic time for a loop doing appends.


But, anyway, if 'list' already uses a doubling buffer then the only overhead 
from the pop optimization would, AFAICS, be a single add in every indexing.


On the third & gripping hand, however, a proof-of-concept actual implementation 
(patch) would be needed to ensure that one doesn't overlook any showstopper or 
serious problem, and to provide timings. And the special case would have to be 
documented as a special case. Hm, it would be nice if the Python docs offered 
complexity (time) guarantees in general...



Cheers,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Steve Howell
On Jan 23, 12:13 am, Terry Reedy  wrote:
> On 1/23/2010 12:58 AM, Steve Howell wrote:
>
> > I really want to use list *normally* with all its perfectly good
> > semantics and reasonable implementation, except for its blind spot
> > with respect to popping the first element off the list.
>
> It was not designed for that. .pop() was added to lists about 10 years
> ago because I asked for it (with no parameter, pop off end only) and
> wrote what would now be a PEP -- and because Tim Peters later supported
> the idea. Adding the optional parameter was something of an afterthought
> (never publicly discussed as far as I know) for occasional use for
> 'short' lists where O(n) is tolerable. You have half persuaded me
> that the parameter addition was a mistake. Perhaps it is too attractive
> a nuisance for some people ;=).
>

pop(0) is a useful idiom in parsers.  You can see examples in
ElementTree and lib2to3.

Even without pop(0), people would still write code like this, found in
pstats.py:

arg = args[0]
args = args[1:]

It is sometimes overkill (and even inappropriate) to use a queue when
really you just want a list.  Iterators are great, but they also have
slightly different semantics than the list itself.

There is nothing wrong with a language specification that allows users
to do insert, delete, and pop on a list.  Once you freeze the language
specification, then you can turn your attention to improving the
implementation.
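For what it's worth, the two idioms can be put side by side (toy helper
names, not from pstats itself):

```python
def shift_slice(args):
    # pstats.py style: rebinds a copy of the remaining items, O(N) per call,
    # and leaves the caller's list untouched.
    arg = args[0]
    args = args[1:]
    return arg, args

def shift_pop(args):
    # list.pop(0) style: mutates in place (still O(N) in current CPython).
    return args.pop(0), args

tokens = ["if", "x", ":"]
head, rest = shift_slice(tokens)
```

Both are O(N) per step today; the whole argument in this thread is about
whether the second one could be made O(1).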

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Terry Reedy

On 1/23/2010 3:23 AM, Steve Howell wrote:

On Jan 23, 12:13 am, Terry Reedy  wrote:


Challenge yes, mock no.

Part of writing good basic data structures is not adding needless
complication from featuritis and not penalizing 99.99% of access to
satisfy a .01% need better satisfied another way.



I would like to challenge your assertion that advancing ob_item
instead of doing memmove during list_ass_slice would impact the
performance of list accesses in any way.  It would only slow down
operations that add/insert items into the list by, and then only by a
single conditional statement, and those add/insert operations are
already O(N) to begin with.


If you try writing a full patch, as I believe someone did, or at least a 
prototype thereof, when the idea was discussed, you might have a better 
idea of what the tradeoffs are and why it was rejected.


For instance, when you append to a full list, it is resized. I believe 
it is now doubled, but the policy has varied over the years. If you turn 
list from essentially a stack to a deque, you complicate the resizing 
policy and have to consider the addition of a shift policy. Do you split 
the over-allocated fore and aft or increase the overallocation from 
double to, say, triple? If the former, then for normal usage that never 
uses the fore part, the over-allocation has been effectively reduced 
from 2x to 1.5x (which is about what it once was, I believe). This means 
more frequent resizings and copyings, which means slower operation for 
most use cases. Adding extra usually wasted space is also a nuisance.


Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: Using dictionary key as a regular expression class

2010-01-23 Thread Chris Jones
On Sat, Jan 23, 2010 at 02:45:41AM EST, Terry Reedy wrote:
> On 1/22/2010 9:58 PM, Chris Jones wrote:

>> Do you mean I should just read the file one character at a time?
>
> Whoops, my misdirection (you can .read(1), but this is s  l   o   w).
> I meant to suggest processing it a char at a time.

Right.. that's how I understood it - i.e. asking python for the next
character, and not worrying about how much is retrieved from the disk in
one pass.

> 1. If not too big,
>
> for c in open(x, 'rb').read() # left .read() off
> # 'b' will get bytes, though ord(c) same for ascii chars for  byte or  
> unicode
>
> 2. If too big for that,
>
> for line in open():
>   for c in line:# or left off this part

Well the script is not going to process anything larger that a few
KiloBytes, but all the same that's something I want to understand
better.

Isn't there any way I can tell python to retrieve a fairly large chunk
of the disk file, like 4-8K, maybe.. and increment a pointer behind the
scenes while I iterate so that I have access to characters one at a
time. I mean, that should be pretty fast, since disk access would be
minimal, and no data would actually be copied.. I would have thought
that 1. above would cause python to do something like that behind the
scenes.
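Buffered file objects already do roughly this behind the scenes, but if you
want the chunking to be explicit, a small generator does it (the chunk size
here is an arbitrary choice):

```python
import io

def chars(f, chunk_size=4096):
    # Read a sizeable chunk from the file object, then hand it back one
    # character at a time; the file is only touched once per chunk.
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        for c in chunk:
            yield c

# Works on any file-like object, e.g. an in-memory one:
assert list(chars(io.StringIO(u"abc"), chunk_size=2)) == [u"a", u"b", u"c"]
```

In practice `for c in f.read()` or iterating line by line, as Terry
suggests, is usually enough; this just makes the chunk size visible.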

[..]

>> Thanks much for the snippet, let me play with it and see if I can
>> come up with a Unicode/utf-8 version.. since while I'm at it I might
>> as well write something a bit more general than C code.
>>
>> Since utf-8 is backward-compatible with 7bit ASCII, this shouldn't be
>> a problem.

> For any extended ascii, 

You mean 8-bit encodings, like latin1 right?

> use larger array without decoding (until print,  if need be). For
> unicode, add encoding to open and 'c in line' will  return unicode
> chars. Then use *one* dict or defaultdict. I think  something like

> from collections import defaultdict
> d = defaultdict(int)
> ...
> d[c] += 1 # if c is new, d[c] defaults to int() == 0

I don't know python, so I basically googled for the building blocks of
my little script.. and I remember seeing defaultdict(int) mentioned some
place or other, but somehow I didn't understand what it did. 

Cool feature.

Even if it's a bit wasteful, with unicode/utf-8, it looks like working
with the code points, either as dictionary keys or as index values into
an array might make the logic simpler - i.e. for each char, obtain its
code point 'cp' and add one to dict[cp] or array[cp] - and then loop and
print all non-zero values when the end-of-file condition is reached.

Food for thought in any case.
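A minimal sketch of that code-point-counting idea, using the defaultdict
shown above (function name mine):

```python
from collections import defaultdict

def count_code_points(text):
    # Missing keys default to int() == 0, so each character is one line.
    counts = defaultdict(int)
    for ch in text:
        counts[ord(ch)] += 1
    return counts

counts = count_code_points(u"na\xefve")  # u'naïve'
```

Printing the non-zero entries at end-of-file is then just a loop over
counts.items().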

CJ
-- 
http://mail.python.org/mailman/listinfo/python-list


www.phptutorial.me

2010-01-23 Thread groups_ads12
www.phptutorial.me 

Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Steve Howell
On Jan 23, 12:32 am, "Alf P. Steinbach"  wrote:
> * Steve Howell:
>
> > On Jan 23, 12:13 am, Terry Reedy  wrote:
> >> Challenge yes, mock no.
>
> >> Part of writing good basic data structures is not adding needless
> >> complication from featuritis and not penalizing 99.99% of access to
> >> satisfy a .01% need better satisfied another way.
>
> > I would like to challenge your assertion that advancing ob_item
> > instead of doing memmove during list_ass_slice would impact the
> > performance of list accesses in any way.  It would only slow down
> > operations that add/insert items into the list by, and then only by a
> > single conditional statement, and those add/insert operations are
> > already O(N) to begin with.
>
> I'm sorry, no, the last part is incorrect.
>
> Appending to a 'list' can currently be constant time, if OS reallocation is
> constant time (as the string '+' optimization relies on).
>

That's true, but it's also irrelevant, as the pop optimization would
happen in a branch of code that never gets executed during list
appending:

if (d < 0) { /* Delete -d items */
    /* ADD CODE HERE TO AVOID memmove
       when ilow == 0 (i.e. ihigh + d == 0)
    */
    memmove(&item[ihigh+d], &item[ihigh],
            (Py_SIZE(a) - ihigh)*sizeof(PyObject *));
    list_resize(a, Py_SIZE(a) + d);
    item = a->ob_item;
}


> With the pop optimization it can no longer be constant time without risking an
> accumulation of unused memory, a memory leak, although it can be amortized
> constant time, at the cost of wasting some percentage of memory.

list_resize already overallocates memory to allow room for growth.
Whenever you did an append to the list that would force a realloc, you
could first check to see if there is unused stuff at the front and do
the memmove then and reclaim the unfreed memory.  So instead of
paying for memmove on every pop, you only pay for it when the list
goes to size 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, etc.


>
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Alf P. Steinbach

* Steve Howell:

On Jan 23, 12:13 am, Terry Reedy  wrote:

Challenge yes, mock no.

Part of writing good basic data structures is not adding needless
complication from featuritis and not penalizing 99.99% of access to
satisfy a .01% need better satisfied another way.



I would like to challenge your assertion that advancing ob_item
instead of doing memmove during list_ass_slice would impact the
performance of list accesses in any way.  It would only slow down
operations that add/insert items into the list by, and then only by a
single conditional statement, and those add/insert operations are
already O(N) to begin with.


I'm sorry, no, the last part is incorrect.

Appending to a 'list' can currently be constant time, if OS reallocation is 
constant time (as the string '+' optimization relies on).


With the pop optimization it can no longer be constant time without risking an 
accumulation of unused memory, a memory leak, although it can be amortized 
constant time, at the cost of wasting some percentage of memory.



Cheers & hth.,

- Alf

--
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Steve Howell
On Jan 23, 12:13 am, Terry Reedy  wrote:
>
> Challenge yes, mock no.
>
> Part of writing good basic data structures is not adding needless
> complication from featuritis and not penalizing 99.99% of access to
> satisfy a .01% need better satisfied another way.
>

I would like to challenge your assertion that advancing ob_item
instead of doing memmove during list_ass_slice would impact the
performance of list accesses in any way.  It would only slow down
operations that add/insert items into the list by, and then only by a
single conditional statement, and those add/insert operations are
already O(N) to begin with.



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: medians for degree measurements

2010-01-23 Thread Terry Reedy

On 1/23/2010 1:09 AM, Steve Howell wrote:

On Jan 22, 5:12 pm, MRAB  wrote:



I like this implementation, and it would probably work 99.% of the
time for my particular use case.  The only (very contrived) edge case
that I can think of is when you have 10 bearings to SSW, 10 bearings
to SSE, and the two outliers are unfortunately in the NE and NW
quadrants.  It seems like the algorithm above would pick one of the
outliers.

Maybe the refinement to the algorithm above would be to find the mean
first, by summing sines and cosines of the bearings, taking the
quotient, and applying the arctangent.  Then use the resulting angle
as the equivalent of "due north" and adjust angles to be within (-180,
180) respect to the mean, pretty much as you do in the code above,
with minor modifications.


I was going to suggest this. Let us know if it seems to work.


I realize the problem as I stated it as sort of ill-defined.


Yes, I think this is one reason stat books typically ignore directional 
data. I think it is an unfortunate omission.
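For the record, the recentering recipe quoted above can be sketched like
this (function name mine; degenerate inputs such as two exactly opposed
bearings remain ill-defined, as discussed):

```python
import math

def circular_median(bearings):
    """Median of compass bearings in degrees, via the recentering trick."""
    # 1. Mean direction from the vector sum of unit vectors.
    s = sum(math.sin(math.radians(b)) for b in bearings)
    c = sum(math.cos(math.radians(b)) for b in bearings)
    mean = math.degrees(math.atan2(s, c))
    # 2. Re-express each bearing as an offset in [-180, 180) from the mean.
    offsets = sorted((b - mean + 180.0) % 360.0 - 180.0 for b in bearings)
    # 3. Ordinary median of the offsets, shifted back and normalized.
    n = len(offsets)
    if n % 2:
        med = offsets[n // 2]
    else:
        med = (offsets[n // 2 - 1] + offsets[n // 2]) / 2.0
    return (mean + med) % 360.0
```

E.g. bearings of 350 and 10 degrees give a median near 0, not 180, which is
exactly the wrap-around case that trips up a naive sorted median.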


Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Terry Reedy

On 1/23/2010 12:58 AM, Steve Howell wrote:


I really want to use list *normally* with all its perfectly good
semantics and reasonable implementation, except for its blind spot
with respect to popping the first element off the list.


It was not designed for that. .pop() was added to lists about 10 years 
ago because I asked for it (with no parameter, pop off end only) and 
wrote what would now be a PEP -- and because Tim Peters later supported 
the idea. Adding the optional parameter was something of an afterthought 
(never publicly discussed as far as I know) for occasional use for 
'short' lists where O(n) is tolerable. You have half persuaded me 
that the parameter addition was a mistake. Perhaps it is too attractive 
a nuisance for some people ;=).



 The whole
reason I use CPython vs. C in the first place is that CPython
programmers can generally program basic data structures better than I
can.


They have given us three options other than .pop(0).

1. listiterator
2. queue.Queue
3. collections.deque

Why are you so stubborn about not using a data structure intended for 
your use case instead of one that was not?


There is also
4. a two-list design for queues: iterate through one (or pop() from a 
reversed version thereof) while appending to another; switch when the 
first is empty. I plan to write it up with tests some day this year.
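A minimal sketch of that two-list design (naming mine; popping from an
empty queue is left to raise the underlying list's IndexError):

```python
class TwoListQueue(object):
    """Amortized O(1) FIFO queue built from two Python lists."""

    def __init__(self):
        self._back = []   # receives appends
        self._front = []  # reversed; pop() from its end is the queue head

    def append(self, x):
        self._back.append(x)

    def popleft(self):
        if not self._front:
            # Switch: move everything over, reversed, in one O(k) step.
            self._front = self._back[::-1]
            self._back = []
        return self._front.pop()
```

Each element is moved at most once between the lists, so a sequence of N
appends and N pops costs O(N) total, even though an individual popleft can
occasionally be O(k).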



I know Python's number one concern will never be speed, but if Python
makes an O(1) operation into an unnecessarily O(N) operation for no
good reasons other than "it's too complicated, " or it "adds another
pointer to the structure," or "it adds another conditional check to
list_ass_slice for operations that aren't popping off the top," I
think it's reasonable to challenge the design philosophy.


Challenge yes, mock no.

Part of writing good basic data structures is not adding needless 
complication from featuritis and not penalizing 99.99% of access to 
satisfy a .01% need better satisfied another way.


Terry Jan Reedy


--
http://mail.python.org/mailman/listinfo/python-list


Re: list.pop(0) vs. collections.dequeue

2010-01-23 Thread Steve Howell
On Jan 22, 11:10 pm, a...@pythoncraft.com (Aahz) wrote:
> In article <83082e19-9130-45a8-91f2-8601c1fda...@22g2000yqr.googlegroups.com>,
> Steve Howell   wrote:
>
>
>
>
>
> >I really want to use list *normally* with all its perfectly good
> >semantics and reasonable implementation, except for its blind spot
> >with respect to popping the first element off the list.  The whole
> >reason I use CPython vs. C in the first place is that CPython
> >programmers can generally program basic data structures better than I
> >can.  But list.pop(0) is the exception.  And, with the possible
> >exception of dicts, lists are the most fundamental data structures
> >that Python has.
>
> >I know Python's number one concern will never be speed, but if Python
> >makes an O(1) operation into an unnecessarily O(N) operation for no
> >good reasons other than "it's too complicated, " or it "adds another
> >pointer to the structure," or "it adds another conditional check to
> >list_ass_slice for operations that aren't popping off the top," I
> >think it's reasonable to challenge the design philosophy.
>
> "Rough consensus and running code."
>
> You have a good point, but nobody will ever give your idea serious
> attention until there's a patch and benchmarks.

Another benchmark is that deques are slower than lists for accessing
elements.

show...@showell-laptop:~$ python foo.py
0.0215361118317 <- list
0.0429010391235 <- deque


import time
from collections import deque

n = 4
lst = []
for i in range(n):
    lst.append(i)

t = time.time()
for i in range(n):
    lst[i]
print time.time() - t

lst = deque(lst)
t = time.time()
for i in range(n):
    lst[i]
print time.time() - t

So substituting deque for list suffers not just in convenience, but
also in performance.

-- 
http://mail.python.org/mailman/listinfo/python-list

