Re: Selecting k smallest or largest elements from a large list in python; (benchmarking)
Dmitry Chichkov writes:

> Given: a large list (10,000,000) of floating point numbers;
> Task: fastest python code that finds k (small, e.g. 10) smallest
> items, preferably with item indexes;
> Limitations: in python, using only standard libraries (numpy & scipy
> is Ok);
>
> I've tried several methods. With N = 10,000,000, K = 10 The fastest so
> far (without item indexes) was pure python implementation
> nsmallest_slott_bisect (using bisect/insert). And with indexes
> nargsmallest_numpy_argmin (argmin() in the numpy array k times).
>
> Anyone up to the challenge beating my code with some clever selection
> algorithm?
>
> Current Table:
> 1.66864395142 mins_heapq(items, n):
> 0.946580886841 nsmallest_slott_bisect(items, n):
> 1.38014793396 nargsmallest(items, n):
> 10.0732769966 sorted(items)[:n]:
> 3.17916202545 nargsmallest_numpy_argsort(items, n):
> 1.31794500351 nargsmallest_numpy_argmin(items, n):
> 2.37499308586 nargsmallest_numpy_array_argsort(items, n):
> 0.524670124054 nargsmallest_numpy_array_argmin(items, n):
>
> 0.0525538921356 numpy argmin(items): 1892997
> 0.364673852921 min(items): 10.026786

I think without numpy, nsmallest_slott_bisect is almost optimal.
There is a slight improvement:

1.3386270 nsmallest_slott_bisect(items, n): [10.11643188717, 10.17791492528]
0.883894920349 nsmallest_slott_bisect2(items, n): [10.11643188717, 10.17791492528]

code:

from bisect import insort
from itertools import islice

def nsmallest_slott_bisect(iterable, n, insort=insort):
    it = iter(iterable)
    mins = sorted(islice(it, n))
    for el in it:
        if el <= mins[-1]:  # NOTE: equal sign is to preserve duplicates
            insort(mins, el)
            mins.pop()
    return mins

def nsmallest_slott_bisect2(iterable, n, insort=insort):
    it = iter(iterable)
    mins = sorted(islice(it, n))
    maxmin = mins[-1]
    for el in it:
        if el <= maxmin:  # NOTE: equal sign is to preserve duplicates
            insort(mins, el)
            mins.pop()
            maxmin = mins[-1]
    return mins

import time
from random import randint, random

test_data = [randint(10, 50) + random() for i in range(1000)]
K = 10

init = time.time()
mins = nsmallest_slott_bisect(test_data, K)
print time.time() - init, 'nsmallest_slott_bisect(items, n):', mins[:2]

init = time.time()
mins = nsmallest_slott_bisect2(test_data, K)
print time.time() - init, 'nsmallest_slott_bisect2(items, n):', mins[:2]

--
Arnaud
--
http://mail.python.org/mailman/listinfo/python-list
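The original post also asks for item indexes, which none of the bisect variants return. The standard library alone can produce them by feeding heapq.nsmallest (index, value) pairs keyed on the value; a small sketch (the function name nargsmallest_heapq is my own, not from the thread):

```python
import heapq
from operator import itemgetter

def nargsmallest_heapq(iterable, n):
    # Pair each value with its index, then let heapq keep the n
    # smallest pairs ordered by value (ascending).
    return heapq.nsmallest(n, enumerate(iterable), key=itemgetter(1))

data = [5.0, 1.5, 3.25, 0.75, 2.0]
print(nargsmallest_heapq(data, 2))  # [(3, 0.75), (1, 1.5)]
```

This keeps a bounded heap of n pairs, so memory stays O(n) even for the 10,000,000-element input; whether it beats the cached-maxmin bisect version on wall-clock time would need benchmarking.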
Re: PyPy and RPython
On 9/1/2010 10:49 AM, sarvi wrote:
> Is there a plan to adopt PyPy and RPython under the python foundation
> in an attempt to standardize both.
>
> I have been watching PyPy and RPython evolve over the years.
>
> PyPy seems to have momentum and is rapidly gaining followers and
> performance. PyPy JIT and performance would be a good thing for the
> Python Community. And it seems to be well ahead of Unladen Swallow in
> performance and in a position to improve quite a bit.
>
> Secondly I have always fantasized of never having to write C code yet
> get its compiled performance. With RPython (a strict subset of Python),
> I can actually compile it to C/Machine code.
>
> These 2 seem like spectacular advantages for Python to pick up on. And
> all this by just showing the PyPy and the Python foundation's support
> and direction to adopt them.
>
> Yet I see this forum relatively quiet on PyPy or Rpython? Any
> reasons???
>
> Sarvi

The winner on performance, by a huge margin, is Shed Skin, the optimizing type-inferring compiler for a restricted subset of Python. PyPy and Unladen Swallow have run into the problem that if you want to keep some of the less useful dynamic semantics of Python, the heavy-duty optimizations become extremely difficult.

However, if we defined a High Performance Python language, with some restrictions, the problem becomes much easier. The necessary restrictions are roughly this:

-- Functions, once defined, cannot be redefined. (Inlining and redefinition do not play well together.)

-- Variables are implicitly typed for the base types: integer, float, bool, and everything else. The compiler figures this out automatically. (Shed Skin does this now.)

-- Unless a class uses a "setattr" function or has a __setattr__ method, its entire list of attributes is known at compile time. (In other words, you can't patch in new attributes from outside the class unless the class indicates it supports that. You can subclass, of course.)
-- Mutable objects (other than some form of synchronized object) cannot be shared between threads. This is the key step in getting rid of the Global Interpreter Lock.

-- "eval" must be restricted to the form that has a list of the variables it can access.

-- Import after startup probably won't work.

Those are the essential restrictions. With those, Python could go 20x to 60x faster than CPython. The failures of PyPy and Unladen Swallow to get any significant performance gains over CPython demonstrate the futility of trying to make the current language go fast.

Reference counts aren't a huge issue. With some static analysis, most reference count updates can be optimized out. (As for how this is done, the key issue is to determine whether each function "keeps" a reference to each parameter. For any function which does not, that parameter doesn't have to have reference count updates within the function. Most math library functions have this property. You do have to analyze the entire program globally, though.)

John Nagle
--
http://mail.python.org/mailman/listinfo/python-list
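The fixed-attribute-list restriction described above is something CPython programmers can already opt into today with __slots__, which pins a class's attribute set at class-creation time; a minimal sketch (the Point class is my own illustration, not from the post):

```python
class Point(object):
    # __slots__ fixes the attribute list at class-creation time;
    # instances get no per-instance __dict__, and attempts to patch
    # in new attributes from outside the class are rejected.
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)
try:
    p.z = 3.0  # not declared in __slots__
except AttributeError as e:
    print("rejected:", e)
```

This gives a compiler (or a reader) exactly the guarantee the post asks for: the complete set of attributes is known before any instance exists.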
Re: PyPy and RPython
sarvi, 02.09.2010 07:06:
> Look at all the alternatives we have. Cython? Shedskin? I'll take PyPy
> anyday instead of them

Feel free to do so, but don't forget that the choice of a language always depends on the specific requirements at hand. Cython has proven its applicability in a couple of large projects, for example. And it has a lot more third party libraries available than both PyPy and Shedskin together: all Python libraries, pure Python and CPython binary extensions, as well as tons of code written in Cython, C, C++, Fortran, and then some. And you don't have to give up one bit of CPython compatibility to use all of that. That alone counts as a pretty huge advantage to some people.

Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Re: importing excel data into a python matrix?
On Sep 1, 7:45 pm, Chris Rebert wrote:
> On Wed, Sep 1, 2010 at 4:35 PM, patrick mcnameeking wrote:
> > I'm working on a project where I have been given
> > a 1000 by 1000 cell excel spreadsheet and I would
> > like to be able to access the data using Python.
> > Does anyone know of a way that I can do this?
>
> "xlrd 0.7.1 - Library for developers to extract data from Microsoft
> Excel (tm) spreadsheet files": http://pypi.python.org/pypi/xlrd

While I heartily recommend xlrd, it only works with "traditional" Excel files (extension .xls, not .xlsx). If the data really is 1000 columns wide, it must be in the new (Excel 2007 or later) format, because the old format only supported up to 256 columns.

The most promising-looking Python package to handle .xlsx files is openpyxl. There are also a couple of older .xlsx readers (openpyxl can write as well). I have not tried any of these.

John
--
http://mail.python.org/mailman/listinfo/python-list
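The split John describes, xlrd for the old format and openpyxl for the new one, can be captured in a small dispatch helper; a sketch (pick_excel_reader is my own name, and the mapping simply encodes the rule above rather than any library API):

```python
import os

def pick_excel_reader(path):
    # .xls is the pre-2007 binary (BIFF) format, which xlrd reads;
    # .xlsx is the Excel 2007+ Office Open XML format, openpyxl territory.
    ext = os.path.splitext(path)[1].lower()
    if ext == '.xls':
        return 'xlrd'
    if ext == '.xlsx':
        return 'openpyxl'
    raise ValueError('not a recognized Excel extension: %r' % path)

print(pick_excel_reader('budget.xls'))   # xlrd
print(pick_excel_reader('budget.xlsx'))  # openpyxl
```

Deferring the actual imports to whichever branch is taken means a script keeps working on machines where only one of the two libraries is installed.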
Re: Queue cleanup
On 8/30/2010 12:22 AM, Paul Rubin wrote:
> I guess that is how the so-called smart pointers in the Boost C++
> template library work. I haven't used them so I don't have personal
> experience with how convenient or reliable they are, or what kinds of
> constraints they imposed on programming style. I've always felt a bit
> suspicious of them though, and I seem to remember Alex Martelli (I hope
> he shows up here again someday) advising against using them.

"Smart pointers" in C++ have never quite worked right. They almost work. But there always seems to be something that needs access to a raw C pointer, which breaks the abstraction. The mold keeps creeping through the wallpaper.

Also, since they are a bolt-on at the macro level in C++, reference count updates aren't optimized and hoisted out of loops. (They aren't in CPython either, but there have been reference counted systems that optimize out most reference count updates.)

John Nagle
--
http://mail.python.org/mailman/listinfo/python-list
Re: dirty problem 3 lines
bussiere bussiere wrote:
> it's just as it seems :
> i want to know how does ti works to get back an object from a string in
> python :
> pickle.loads("""b'\x80\x03]q\x00(K\x00K\x01e.'""") #doesn't work

Repeating the question without providing any further information doesn't really help.

This is a byte string:

    b'\x80\x03]q\x00(K\x00K\x01e.'

As MRAB points out, you can unpickle a byte string directly.

This is a triple-quoted string:

    """note the triplet of double quotes"""

What you have is a triple-quoted string that appears to contain a byte string:

    """b'\x80\x03]q\x00(K\x00K\x01e.'"""

So the question for you is: what is putting the byte string inside of a triple-quoted string? If you can stop that from happening, then you'll have a byte string you can directly unpickle.

Now, if you _don't_ have control over whatever is handing you the dumped string, then you can just use string manipulation to reproduce the byte string:

>>> dump = """b'\x80\x03]q\x00(K\x00K\x01e.'"""
>>> badump = dump[2:-1].encode()[1:]
>>> pickle.loads(badump)
[0, 1]

So:
- dump[2:-1] strips off the string representation of the byte string (b'...')
- .encode() turns it into an actual byte string
- [1:] strips a unicode blank from the start of the byte string (not entirely sure how that gets there...)

After that it should be fine to unpickle.
--
http://mail.python.org/mailman/listinfo/python-list
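If the string arrives as the *source text* of a bytes literal (escapes still written out as backslash sequences, rather than already interpreted as in the triple-quoted case above), ast.literal_eval parses it the same way the compiler would and hands back a real bytes object, with none of the encode/slice guesswork; a sketch:

```python
import ast
import pickle

# The received text: the source form of a bytes literal. The raw-string
# prefix here means the backslashes are literal characters, as they
# would be in text read from a file or socket.
dump = r"b'\x80\x03]q\x00(K\x00K\x01e.'"

# literal_eval safely evaluates literal expressions only (no arbitrary
# code), so it is the standard alternative to eval() for this job.
data = ast.literal_eval(dump)
print(pickle.loads(data))  # [0, 1]
```

Note this only applies when the escapes are uninterpreted source text; for the already-interpreted string in the post above, the manual slicing shown there (or fixing the producer) is still needed.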
Re: PyPy and RPython
On Sep 1, 6:49 pm, Benjamin Peterson wrote:
> sarvi gmail.com> writes:
> > Secondly I have always fantasized of never having to write C code yet
> > get its compiled performance.
> > With RPython (a strict subset of Python), I can actually compile it to
> > C/Machine code
>
> RPython is not supposed to be a general purpose language. As a PyPy developer
> myself, I can testify that it is no fun.

Can it be worse than writing C/C++? Compared to Java, having the interpreter during development is huge.

I actually think yall at PyPy are hugely underestimating RPython.
http://olliwang.com/2009/12/20/aes-implementation-in-rpython/
http://alexgaynor.net/2010/may/15/pypy-future-python/

Look at all the alternatives we have. Cython? Shedskin? I'll take PyPy anyday instead of them.

We make performance tradeoffs all the time. Look at Mercurial: 90% python and 5% C. Wouldn't you rather this be 90% Python and 5% RPython???

Add to that the possibility of writing Python extension modules in RPython. You could be winning a whole group of developer mindshare.

> > Yet I see this forum relatively quiet on PyPy or Rpython ? Any
> > reasons???
>
> You should post to the PyPy list instead. (See pypy.org)

I tried. got bounced. Just subscribed. Will try again.

Sarvi
--
http://mail.python.org/mailman/listinfo/python-list
Re: Private variables
On 2 September 2010 12:22, Ryan Kelly wrote:
> On Thu, 2010-09-02 at 12:06 +1000, Ryan Kelly wrote:
>> On Thu, 2010-09-02 at 11:10 +1000, Rasjid Wilcox wrote:
>> > Hi all,
>> >
>> > I am aware that private variables are generally done via convention
>> > (leading underscore), but I came across a technique in Douglas
>> > Crockford's book "Javascript: The Good Parts" for creating private
>> > variables in Javascript, and I thought I'd see how it translated to
>> > Python. Here is my attempt.
>> >
>> > def get_config(_cache=[]):
>> >     private = {}
>> >     private['a'] = 1
>> >     private['b'] = 2
>> >     if not _cache:
>> >         class Config(object):
>> >             @property
>> >             def a(self):
>> >                 return private['a']
>> >             @property
>> >             def b(self):
>> >                 return private['b']
>> >         config = Config()
>> >         _cache.append(config)
>> >     else:
>> >         config = _cache[0]
>> >     return config
>> >
>> > >>> c = get_config()
>> > >>> c.a
>> > 1
>> > >>> c.b
>> > 2
>> > >>> c.a = 10
>> > Traceback (most recent call last):
>> >   File "", line 1, in
>> > AttributeError: can't set attribute
>> > >>> dir(c)
>> > ['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
>> > '__getattribute__', '__hash__', '__init__', '__module__', '__new__',
>> > '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
>> > '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'a', 'b']
>> > >>> d = get_config()
>> > >>> d is c
>> > True
>> >
>> > I'm not really asking 'is it a good idea' but just 'does this work'?
>> > It seems to work to me, and is certainly 'good enough' in the sense
>> > that it should be impossible to accidentally change the variables of
>> > c.
>> >
>> > But is it possible to change the value of c.a or c.b with standard
>> > python, without resorting to ctypes level manipulation?
>>
>> It's not easy, but it can be done by introspecting the property object
>> you created and munging the closed-over dictionary object:
>>
>> >>> c = get_config()
>> >>> c.a
>> 1
>> >>> c.__class__.__dict__['a'].fget.func_closure[0].cell_contents['a'] = 7
>> >>> c.a
>> 7

Ah! That is what I was looking for.

> Heh, and of course I miss the even more obvious trick of just clobbering
> the property with something else:
>
> >>> c.a
> 1
> >>> setattr(c.__class__, "a", 7)
> >>> c.a
> 7

Well, that is just cheating! :-)

Anyway, thanks for that. I still think it is 'good enough' for those cases where private variables are 'required'. In both cases one has to go out of one's way to modify the attribute.

OTOH, I guess it depends on what the use case is. If it is for storing a secret password that no other part of the system should have access to, then perhaps not 'good enough' at all.

Cheers,
Rasjid.
--
http://mail.python.org/mailman/listinfo/python-list
Re: killing all subprocess childrens
Chris Rebert wrote:
> import os
> import psutil # http://code.google.com/p/psutil/
>
> # your piece of code goes here
>
> myself = os.getpid()
> for proc in psutil.process_iter():
>     if proc.ppid == myself:
>         proc.kill()
>
> Cheers,
> Chris
> --
> http://blog.rebertia.com

Is there a way to do this without psutil or installing any external modules or doing it from python2.5? Just wondering. Thanks again
--
http://mail.python.org/mailman/listinfo/python-list
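On POSIX systems at least, the standard library alone can do this: put the child in its own process group and signal the whole group, so the child and all its descendants die together. A sketch (this uses the Python 3 spellings start_new_session and wait(timeout=...); on Python 2.5 the rough equivalents are preexec_fn=os.setsid and a poll loop — and none of this works on Windows, where a job object would be needed instead):

```python
import os
import signal
import subprocess
import sys

def run_and_kill_tree(argv):
    # start_new_session=True gives the child a fresh session and
    # process group, so one killpg() reaches every descendant.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        proc.wait(timeout=0.2)
    except subprocess.TimeoutExpired:
        os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
        proc.wait()
    return proc.returncode

if os.name == 'posix':
    rc = run_and_kill_tree([sys.executable, '-c', 'import time; time.sleep(60)'])
    print('child exited with', rc)  # negative value == killed by signal
```

The 0.2-second timeout is only for the demonstration; in the original scenario you would signal the group when the main java process exits.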
Re: killing all subprocess childrens
On Wed, Sep 1, 2010 at 8:12 PM, Astan Chee wrote:
> Hi,
> I have a piece of code that looks like this:
>
> import subprocess
> retcode = subprocess.call(["java","test","string"])
> print "Exited with retcode " + str(retcode)
>
> What I'm trying to do (and wondering if its possible) is to make sure that
> any children (and any descendants) of this process is killed when the main
> java process is killed (or dies).
> How do I do this in windows, linux and OSX?

Something /roughly/ like:

import os
import psutil # http://code.google.com/p/psutil/

# your piece of code goes here

myself = os.getpid()
for proc in psutil.process_iter():
    if proc.ppid == myself:
        proc.kill()

Cheers,
Chris
--
http://blog.rebertia.com
--
http://mail.python.org/mailman/listinfo/python-list
killing all subprocess childrens
Hi,
I have a piece of code that looks like this:

import subprocess
retcode = subprocess.call(["java", "test", "string"])
print "Exited with retcode " + str(retcode)

What I'm trying to do (and wondering if it's possible) is to make sure that any children (and any descendants) of this process are killed when the main java process is killed (or dies). How do I do this in windows, linux and OSX?

Thanks
Astan
--
http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
On 09/01/2010 04:51 PM, Raymond Hettinger wrote:
> On Aug 30, 6:03 am, a...@pythoncraft.com (Aahz) wrote:
>> That reminds me: one co-worker (who really should have known better ;-)
>> had the impression that sets were O(N) rather than O(1). Although
>> writing that off as a brain-fart seems appropriate, it's also the case
>> that the docs don't really make that clear; it's implied from requiring
>> elements to be hashable. Do you agree that there should be a comment?
>
> There probably ought to be a HOWTO or FAQ entry on algorithmic
> complexity that covers classes and functions where the algorithms are
> interesting. That will concentrate the knowledge in one place where
> performance is a main theme and where the various alternatives can be
> compared and contrasted.
>
> I think most users of sets rarely read the docs for sets. The few lines
> in the tutorial are enough so that most folks "just get it" and don't
> read more detail unless they are attempting something exotic.

I think that attitude is very dangerous. There is a long history in this world of one group of people presuming what another group of people does or does not do or think. This seems to be a characteristic of human beings and is often used to promote one's own ideology. And even if you have hard evidence for what you say, why should 60% of people who don't read docs justify providing poor quality docs to the 40% that do?

So while you may "think" most people rarely read the docs for basic language features and objects (I presume you don't mean to restrict your statement to only sets), I and most people I know *do* read them. And when I read them I expect them, as any good reference documentation does, to completely and accurately describe the behavior of the item I am reading about. If big-O performance is deemed an intrinsic behavior of an (operation of) an object, it should be described in the documentation for that object.

Your use of the word "exotic" is also suspect.
I learned long ago to always click the "advanced options" box on dialogs because most developers/designers really don't have a clue about what users need access to.

> Our docs have gotten
> somewhat voluminous,

No they haven't (relative to what they attempt to describe). The biggest problem with the docs is that they are too terse. They often appear to have been written by people playing a game of "who can describe X in the minimum number of words that can still be defended as correct." While that may be fun, good docs are produced by considering how to describe something to the reader, completely and accurately, as effectively as possible. The test is not how few words were used, but how quickly the reader can understand the object or find the information being sought about the object.

> so it's unlikely that adding that particular
> needle to the haystack would have cured your colleague's "brain-fart"
> unless he had been focused on a single document talking about the
> performance characteristics of various data structures.

I don't know the colleague any more than you do, so I feel comfortable saying that having it very likely *would* have cured that brain-fart. That is, he or she very likely would have needed to check some behavior of sets at some point and would have either noted the big-O characteristics in passing, or would have noted that such information was available, and would have returned to the documentation when the need for that information arose. The reference description of sets is the *one* canonical place to look for information about sets.

There are people who don't read documentation, but one has to be very careful not to use the existence of such people as an excuse to justify sub-standard documentation. So I think relegating algorithmic complexity information to some remote document far from the description of the object it pertains to, is exactly the wrong approach.
This is not to say that a performance HOWTO or FAQ in addition to the reference manual would not be good. -- http://mail.python.org/mailman/listinfo/python-list
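The O(1)-versus-O(N) point the thread keeps circling is easy to demonstrate empirically with timeit; a small sketch (the sizes and repetition count are arbitrary choices of mine):

```python
import timeit

n = 100000
setup = "data = list(range(%d)); s = set(data)" % n

# Membership test for an element near the "end" of the data: the list
# must scan linearly, while the set hashes straight to its bucket.
list_t = timeit.timeit("(%d - 1) in data" % n, setup=setup, number=100)
set_t = timeit.timeit("(%d - 1) in s" % n, setup=setup, number=100)

print("list: %.6fs  set: %.6fs" % (list_t, set_t))
```

The gap widens linearly with n for the list and stays flat for the set, which is exactly the behavioral fact the co-worker had backwards.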
Re: PyPy and RPython
On Sep 2, 3:49 am, sarvi wrote: > Yet I see this forum relatively quite on PyPy or Rpython ? Any > reasons??? For me, it's two major ones: 1. PyPy only recently hit a stability/performance point that makes it worth checking out, 2. Using non-pure-python modules wasn't straightforward (at least when I last looked) However, I've always felt the PyPy project was far more promising than Unladen Swallow. -- http://mail.python.org/mailman/listinfo/python-list
Re: Problem checking an existing browser cookie
On 31 Αύγ, 11:07, Nik the Greek wrote:
> On 30 Αύγ, 20:50, MRAB wrote:
> > On 30/08/2010 18:16, Nik the Greek wrote:
> > > On 30 Αύγ, 19:41, MRAB wrote:
> > >> On 30/08/2010 04:33, Nik the Greek wrote:
> > >>> On 30 Αύγ, 06:12, MRAB wrote:
> >
> > This part:
> >
> > ( not mycookie or mycookie.value != 'nikos' )
> >
> > is false but this part:
> >
> > re.search( r'(msn|yandex|13448|spider|crawl)', host ) is None
> >
> > is true because host doesn't contain any of those substrings.
> >
> > >>> So, the if code does executed because one of the condition is true?
> > >>> How should i write it?
> > >>> I cannot think clearly on this at all.
> > >>> I just wan to tell it to get executed ONLY IF
> > >>> the cookie values is not 'nikos'
> > >>> or ( don't knwo if i have to use and or 'or' here)
> > >>> host does not contain any of the substrings.
> > >>> What am i doign wrong?!
> >
> > >> It might be clearer if you reverse the condition and say:
> > >>
> > >> me_visiting = ...
> > >> if not me_visiting:
> > >>     ...
> >
> > > I don't understand what are you trying to say
> > > Please provide a full example.
> > >
> > > You mean i should try it like this?
> > >
> > > unless ( visitor and visitor.value == 'nikos' ) or re.search( r'(msn|
> > > yandex|13448|spider|crawl)', host ) not None:
> > >
> > > But isnt it the same thing like the if?
> >
> > My point is that the logic might be clearer to you if you think first
> > about how you know when you _are_ the visitor.
>
> Well my idea was to set a cookie on my browser with the name visitor
> and a value of "nikos" and then check each time that cookie. if value
> is "nikos" then dont count!
>
> I could also pass an extra url string like http://webville.gr?show=nikos
> and check that but i dont like the idea very much of giving an extra
> string each time i want to visit my webpage.
> So from the 2 solutions mentioned the 1st one is better but can't come
> into action for some reason.
> Apart from those two solutions i can't think of anything else that
> would identify me and filter me out of the actual guests of my website.
>
> I'm all ears if you can think of something else.

Is there any other way for the webpage to identify me and filter me out except checking a cookie or attaching an extra url string to the address bar?
--
http://mail.python.org/mailman/listinfo/python-list
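What the thread is circling around is that the two exclusion tests have to be combined so that *either* one is enough to skip the count; phrased positively, count only when the visitor is neither the owner nor a crawler. A sketch of that logic (should_count and the cookie handling are my own stand-ins for the site's actual code; the regex is the one from the thread):

```python
import re

BOT_RE = re.compile(r'(msn|yandex|13448|spider|crawl)')

def should_count(cookie_value, host):
    # Skip counting if EITHER the owner's cookie is present
    # OR the host looks like a known crawler.
    is_owner = (cookie_value == 'nikos')
    is_bot = BOT_RE.search(host) is not None
    return not (is_owner or is_bot)

print(should_count('nikos', 'example.com'))        # False: site owner
print(should_count(None, 'crawl-66.example.net'))  # False: crawler
print(should_count(None, 'dsl-athens.example'))    # True: real visitor
```

Writing the positive case first (per MRAB's "how do you know when you _are_ the visitor" suggestion) is what makes the and/or choice fall out naturally.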
Re: Private variables
On Thu, 2010-09-02 at 12:06 +1000, Ryan Kelly wrote:
> On Thu, 2010-09-02 at 11:10 +1000, Rasjid Wilcox wrote:
> > Hi all,
> >
> > I am aware that private variables are generally done via convention
> > (leading underscore), but I came across a technique in Douglas
> > Crockford's book "Javascript: The Good Parts" for creating private
> > variables in Javascript, and I thought I'd see how it translated to
> > Python. Here is my attempt.
> >
> > def get_config(_cache=[]):
> >     private = {}
> >     private['a'] = 1
> >     private['b'] = 2
> >     if not _cache:
> >         class Config(object):
> >             @property
> >             def a(self):
> >                 return private['a']
> >             @property
> >             def b(self):
> >                 return private['b']
> >         config = Config()
> >         _cache.append(config)
> >     else:
> >         config = _cache[0]
> >     return config
> >
> > >>> c = get_config()
> > >>> c.a
> > 1
> > >>> c.b
> > 2
> > >>> c.a = 10
> > Traceback (most recent call last):
> >   File "", line 1, in
> > AttributeError: can't set attribute
> > >>> dir(c)
> > ['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
> > '__getattribute__', '__hash__', '__init__', '__module__', '__new__',
> > '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
> > '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'a', 'b']
> > >>> d = get_config()
> > >>> d is c
> > True
> >
> > I'm not really asking 'is it a good idea' but just 'does this work'?
> > It seems to work to me, and is certainly 'good enough' in the sense
> > that it should be impossible to accidentally change the variables of
> > c.
> >
> > But is it possible to change the value of c.a or c.b with standard
> > python, without resorting to ctypes level manipulation?
>
> It's not easy, but it can be done by introspecting the property object
> you created and munging the closed-over dictionary object:
>
> >>> c = get_config()
> >>> c.a
> 1
> >>> c.__class__.__dict__['a'].fget.func_closure[0].cell_contents['a'] = 7
> >>> c.a
> 7

Heh, and of course I miss the even more obvious trick of just clobbering the property with something else:

>>> c.a
1
>>> setattr(c.__class__, "a", 7)
>>> c.a
7

Ryan
--
Ryan Kelly
http://www.rfk.id.au | This message is digitally signed. Please visit
r...@rfk.id.au | http://www.rfk.id.au/ramblings/gpg/ for details

signature.asc
Description: This is a digitally signed message part
--
http://mail.python.org/mailman/listinfo/python-list
Re: Private variables
On Thu, 2010-09-02 at 11:10 +1000, Rasjid Wilcox wrote:
> Hi all,
>
> I am aware that private variables are generally done via convention
> (leading underscore), but I came across a technique in Douglas
> Crockford's book "Javascript: The Good Parts" for creating private
> variables in Javascript, and I thought I'd see how it translated to
> Python. Here is my attempt.
>
> def get_config(_cache=[]):
>     private = {}
>     private['a'] = 1
>     private['b'] = 2
>     if not _cache:
>         class Config(object):
>             @property
>             def a(self):
>                 return private['a']
>             @property
>             def b(self):
>                 return private['b']
>         config = Config()
>         _cache.append(config)
>     else:
>         config = _cache[0]
>     return config
>
> >>> c = get_config()
> >>> c.a
> 1
> >>> c.b
> 2
> >>> c.a = 10
> Traceback (most recent call last):
>   File "", line 1, in
> AttributeError: can't set attribute
> >>> dir(c)
> ['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
> '__getattribute__', '__hash__', '__init__', '__module__', '__new__',
> '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
> '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'a', 'b']
> >>> d = get_config()
> >>> d is c
> True
>
> I'm not really asking 'is it a good idea' but just 'does this work'?
> It seems to work to me, and is certainly 'good enough' in the sense
> that it should be impossible to accidentally change the variables of
> c.
>
> But is it possible to change the value of c.a or c.b with standard
> python, without resorting to ctypes level manipulation?

It's not easy, but it can be done by introspecting the property object you created and munging the closed-over dictionary object:

>>> c = get_config()
>>> c.a
1
>>> c.__class__.__dict__['a'].fget.func_closure[0].cell_contents['a'] = 7
>>> c.a
7

Cheers,
Ryan
--
Ryan Kelly
http://www.rfk.id.au | This message is digitally signed.
Please visit
r...@rfk.id.au | http://www.rfk.id.au/ramblings/gpg/ for details

signature.asc
Description: This is a digitally signed message part
--
http://mail.python.org/mailman/listinfo/python-list
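The closure-munging line above is spelled with Python 2 attribute names; under Python 3, func_closure became __closure__ (and func_closure is gone). A condensed sketch of the same trick in the newer spelling (get_config here is abbreviated from the thread's version, without the cache):

```python
def get_config():
    private = {'a': 1, 'b': 2}
    class Config(object):
        @property
        def a(self):
            return private['a']
    return Config()

c = get_config()
print(c.a)  # 1

# Reach through the property's getter into the closed-over dict.
# Python 3 spelling: fget.__closure__ rather than fget.func_closure.
cell = type(c).__dict__['a'].fget.__closure__[0]
cell.cell_contents['a'] = 7
print(c.a)  # 7
```

The getter's only free variable is `private`, so __closure__[0] is the cell holding that dict, which is why mutating cell_contents changes what the "private" property reports.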
Re: PyPy and RPython
sarvi gmail.com> writes:
> Is there a plan to adopt PyPy and RPython under the python foundation
> in an attempt to standardize both.

There is not.

> Secondly I have always fantasized of never having to write C code yet
> get its compiled performance.
> With RPython (a strict subset of Python), I can actually compile it to
> C/Machine code

RPython is not supposed to be a general purpose language. As a PyPy developer myself, I can testify that it is no fun.

> Yet I see this forum relatively quiet on PyPy or Rpython ? Any
> reasons???

You should post to the PyPy list instead. (See pypy.org)
--
http://mail.python.org/mailman/listinfo/python-list
Private variables
Hi all,

I am aware that private variables are generally done via convention (leading underscore), but I came across a technique in Douglas Crockford's book "Javascript: The Good Parts" for creating private variables in Javascript, and I thought I'd see how it translated to Python. Here is my attempt.

def get_config(_cache=[]):
    private = {}
    private['a'] = 1
    private['b'] = 2
    if not _cache:
        class Config(object):
            @property
            def a(self):
                return private['a']
            @property
            def b(self):
                return private['b']
        config = Config()
        _cache.append(config)
    else:
        config = _cache[0]
    return config

>>> c = get_config()
>>> c.a
1
>>> c.b
2
>>> c.a = 10
Traceback (most recent call last):
  File "", line 1, in
AttributeError: can't set attribute
>>> dir(c)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
'__getattribute__', '__hash__', '__init__', '__module__', '__new__',
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
'__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'a', 'b']
>>> d = get_config()
>>> d is c
True

I'm not really asking 'is it a good idea' but just 'does this work'? It seems to work to me, and is certainly 'good enough' in the sense that it should be impossible to accidentally change the variables of c.

But is it possible to change the value of c.a or c.b with standard python, without resorting to ctypes level manipulation?

Cheers,
Rasjid.
--
http://mail.python.org/mailman/listinfo/python-list
Selecting k smallest or largest elements from a large list in python; (benchmarking)
Given: a large list (10,000,000) of floating point numbers;
Task: fastest python code that finds k (small, e.g. 10) smallest items, preferably with item indexes;
Limitations: in python, using only standard libraries (numpy & scipy is Ok);

I've tried several methods. With N = 10,000,000, K = 10 the fastest so far (without item indexes) was the pure python implementation nsmallest_slott_bisect (using bisect/insert). And with indexes, nargsmallest_numpy_argmin (argmin() in the numpy array k times).

Anyone up to the challenge beating my code with some clever selection algorithm?

Current Table:
1.66864395142 mins_heapq(items, n):
0.946580886841 nsmallest_slott_bisect(items, n):
1.38014793396 nargsmallest(items, n):
10.0732769966 sorted(items)[:n]:
3.17916202545 nargsmallest_numpy_argsort(items, n):
1.31794500351 nargsmallest_numpy_argmin(items, n):
2.37499308586 nargsmallest_numpy_array_argsort(items, n):
0.524670124054 nargsmallest_numpy_array_argmin(items, n):

0.0525538921356 numpy argmin(items): 1892997
0.364673852921 min(items): 10.026786

Code:

import heapq
import time
from random import randint, random
from bisect import insort
from itertools import islice
from operator import itemgetter

def mins_heapq(items, n):
    nlesser_items = heapq.nsmallest(n, items)
    return nlesser_items

def nsmallest_slott_bisect(iterable, n, insort=insort):
    it = iter(iterable)
    mins = sorted(islice(it, n))
    for el in it:
        if el <= mins[-1]:  # NOTE: equal sign is to preserve duplicates
            insort(mins, el)
            mins.pop()
    return mins

def nargsmallest(iterable, n, insort=insort):
    it = enumerate(iterable)
    mins = sorted(islice(it, n), key=itemgetter(1))
    loser = mins[-1][1]  # largest of smallest
    for el in it:
        if el[1] <= loser:  # NOTE: equal sign is to preserve dupl
            mins.append(el)
            mins.sort(key=itemgetter(1))
            mins.pop()
            loser = mins[-1][1]
    return mins

def nargsmallest_numpy_argsort(iter, k):
    distances = N.asarray(iter)
    return [(i, distances[i]) for i in distances.argsort()[0:k]]

def nargsmallest_numpy_array_argsort(array, k):
    return [(i, array[i]) for i in array.argsort()[0:k]]

def nargsmallest_numpy_argmin(iter, k):
    distances = N.asarray(iter)
    mins = []
    for i in xrange(k):
        j = distances.argmin()
        mins.append((j, distances[j]))
        distances[j] = float('inf')
    return mins

def nargsmallest_numpy_array_argmin(distances, k):
    mins = []
    for i in xrange(k):
        j = distances.argmin()
        mins.append((j, distances[j]))
        distances[j] = float('inf')
    return mins

test_data = [randint(10, 50) + random() for i in range(1000)]
K = 10

init = time.time()
mins = mins_heapq(test_data, K)
print time.time() - init, 'mins_heapq(items, n):', mins[:2]

init = time.time()
mins = nsmallest_slott_bisect(test_data, K)
print time.time() - init, 'nsmallest_slott_bisect(items, n):', mins[:2]

init = time.time()
mins = nargsmallest(test_data, K)
print time.time() - init, 'nargsmallest(items, n):', mins[:2]

init = time.time()
mins = sorted(test_data)[:K]
print time.time() - init, 'sorted(items)[:n]:', mins[:2]

import numpy as N

init = time.time()
mins = nargsmallest_numpy_argsort(test_data, K)
print time.time() - init, 'nargsmallest_numpy_argsort(items, n):', mins[:2]

init = time.time()
mins = nargsmallest_numpy_argmin(test_data, K)
print time.time() - init, 'nargsmallest_numpy_argmin(items, n):', mins[:2]

array = N.asarray(test_data)

init = time.time()
mins = array.argmin()
print time.time() - init, 'numpy argmin(items):', mins

init = time.time()
mins = min(test_data)
print time.time() - init, 'min(items):', mins
--
http://mail.python.org/mailman/listinfo/python-list
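On NumPy versions that have it (argpartition arrived in 1.8, well after this thread), the k-smallest-with-indexes task can be done with a single O(N) selection pass plus a sort of only the k winners, which would likely beat every entry in the table above; a sketch (the function name is mine):

```python
import numpy as np

def nargsmallest_argpartition(values, k):
    # argpartition puts the indexes of the k smallest values (in
    # arbitrary order) in the first k slots; sorting just those k
    # values then yields ascending order.
    arr = np.asarray(values)
    idx = np.argpartition(arr, k)[:k]
    idx = idx[np.argsort(arr[idx])]
    return [(int(i), float(arr[i])) for i in idx]

data = [5.0, 1.5, 3.25, 0.75, 2.0, 9.0]
print(nargsmallest_argpartition(data, 3))  # [(3, 0.75), (1, 1.5), (4, 2.0)]
```

Unlike the argmin()-k-times approach in the table, this never mutates the input array and does not depend on k being tiny.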
Re: Python libs on Windows ME
Damn Small Linux could work. If even that won't work, perhaps it's time to scrap your old fossil for parts and buy a modern computer. Even a netbook would probably be an improvement based on your situation. -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
Robert Kern writes: > On 9/1/10 4:40 PM, John Bokma wrote: >> Arnaud Delobelle writes: >> >>> Terry Reedy writes: [...] >>> I don't understand what you're trying to say. Aahz didn't claim that >>> random list element access was constant time, he said it was O(1) (and >>> that it should be part of the Python spec that it is). >> >> Uhm, O(1) /is/ constant time, see page 45 of Introduction to Algorithms, >> 2nd edition. > > While we often use the term "constant time" as a synonym for O(1) > complexity of an algorithm, Arnaud and Terry are using the term here > to mean "an implementation takes roughly the same amount of wall-clock > time every time". Now that's confusing in a discussion that earlier on provided a link to a page using big O notation. At least for people following this partially, like I do. -- John Bokma j3b Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma Freelance Perl & Python Development: http://castleamber.com/ -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
Terry Reedy writes: > On 9/1/2010 5:40 PM, John Bokma wrote: [..] > Yes, I switched, because 'constant time' is a comprehensible claim > that can be refuted and because that is how some will interpret O(1) > (see below for proof;-). You now make it sound as if this interpretation is incorrect or out of place. People who have bothered to read ItA will use O(1) and constant time interchangeably while talking of the order of growth of the running time of algorithms, and most of those are aware that 'big oh' hides a constant, and that in the real world an O(log n) algorithm can outperform an O(1) algorithm for small values of n. >> Uhm, O(1) /is/ constant time, see page 45 of Introduction to Algorithms, >> 2nd edition. -- John Bokma j3b Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma Freelance Perl & Python Development: http://castleamber.com/ -- http://mail.python.org/mailman/listinfo/python-list
Re: DeprecationWarning
On Wed, Sep 1, 2010 at 8:58 AM, cerr wrote: > Hi There, > > I would like to create an scp handle and download a file from a > client. I have following code: > but what i'm getting is this and no file is downloaded...: > /opt/lampp/cgi-bin/attachment.py:243: DeprecationWarning: > BaseException.message has been deprecated as of Python 2.6 > chan.send('\x01'+e.message) > 09/01/2010 08:53:56 : Downloading P-file failed. > > What does that mean and how do i resolve this? http://stackoverflow.com/questions/1272138/baseexception-message-deprecated-in-python-2-6 As the warning message says, line 243 of /opt/lampp/cgi-bin/attachment.py is the cause of the warning. However, that's only a warning (albeit probably about a small part of some error-raising code), not an error itself, so it's not the cause of the download failure. Printing out the IOError encountered would be the first step in debugging the download failure. Cheers, Chris -- http://blog.rebertia.com -- http://mail.python.org/mailman/listinfo/python-list
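For what it's worth, the usual replacement for the deprecated e.message pattern flagged above is to format the exception itself; this sketch (format_error is a hypothetical stand-in for the code around line 243) behaves the same on Python 2.6+ and 3.x:

```python
# str(e) returns the exception's message portably, so the send call can
# avoid the deprecated BaseException.message attribute.
def format_error(e):
    return '\x01' + str(e)   # instead of '\x01' + e.message

try:
    raise IOError("Downloading P-file failed")
except IOError as e:
    print(repr(format_error(e)))  # control byte followed by the message
```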
Re: importing excel data into a python matrix?
On Wed, Sep 1, 2010 at 4:35 PM, patrick mcnameeking wrote: > Hello list, > I've been working with Python now for about a year using it primarily for > scripting in the Puredata graphical programming environment. I'm working on > a project where I have been given a 1000 by 1000 cell excel spreadsheet and > I would like to be able to access the data using Python. Does anyone know > of a way that I can do this? "xlrd 0.7.1 - Library for developers to extract data from Microsoft Excel (tm) spreadsheet files": http://pypi.python.org/pypi/xlrd If requiring the user to re-save the file as .CSV instead of .XLS is feasible, then you /can/ avoid the third-party dependency and use just the std lib instead: http://docs.python.org/library/csv.html Cheers, Chris -- http://blog.rebertia.com -- http://mail.python.org/mailman/listinfo/python-list
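To make the csv route concrete, here is a minimal sketch under the stated assumption that the spreadsheet has been re-saved as CSV; the io.StringIO text stands in for the exported file:

```python
import csv
import io

# Read CSV text into a list-of-lists "matrix", indexable as matrix[row][col].
csv_text = "a,b,c\n1,2,3\n"                 # stand-in for open('sheet.csv')
matrix = list(csv.reader(io.StringIO(csv_text)))
print(matrix[0])     # ['a', 'b', 'c']
print(matrix[1][2])  # '3' - csv yields strings; convert to numbers as needed
```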
Re: importing excel data into a python matrix?
On Wed, Sep 1, 2010 at 4:35 PM, patrick mcnameeking wrote: > Hello list, > I've been working with Python now for about a year using it primarily for > scripting in the Puredata graphical programming environment. I'm working on > a project where I have been given a 1000 by 1000 cell excel spreadsheet and > I would like to be able to access the data using Python. Does anyone know > of a way that I can do this? > Thanks, > Pat http://tinyurl.com/2eqqjxv ;) Geremy Condra -- http://mail.python.org/mailman/listinfo/python-list
importing excel data into a python matrix?
Hello list, I've been working with Python now for about a year using it primarily for scripting in the Puredata graphical programming environment. I'm working on a project where I have been given a 1000 by 1000 cell excel spreadsheet and I would like to be able to access the data using Python. Does anyone know of a way that I can do this? Thanks, Pat -- 'Given enough eyeballs, all bugs are shallow.' -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
On 9/1/2010 5:40 PM, John Bokma wrote: Arnaud Delobelle writes: Terry Reedy writes: On 9/1/2010 11:40 AM, Aahz wrote: I think that any implementation that doesn't have O(1) for list element access is fundamentally broken, Whereas I think that that claim is fundamentally broken in multiple ways. and we should probably document that somewhere. I agree that *current* algorithmic behavior of parts of CPython on typical *current* hardware should be documented not just 'somewhere' (which I understand it is, in the Wiki) but in a CPython doc included in the doc set distributed with each release. Perhaps someone or some group could write a HowTo on Programming with CPython's Builtin Classes that would describe both the implementation and performance and also the implications for coding style. In particular, it could compare CPython's array lists and tuples to singly linked lists (which are easily created in Python also). But such a document, after stating that array access may be thought of as constant time on current hardware to a useful first approximation, should also state that repeated sequential accesses may be *much* faster than repeated random accesses. People in the high-performance computing community are quite aware of this difference between simplified lies and messy truth. Because of this, array algorithms are (should be) written differently in Fortran and C because Fortran stores arrays by columns and C by rows and because it is usually much faster to access the next item than one far away. I don't understand what you're trying to say. Most generally, that I view Python as a general algorithm language and not just as a von Neumann machine programming language. More specifically, that O() claims can be inapplicable, confusing, misleading, incomplete, or false, especially when applied to real time and to real systems with finite limits. 
Aahz didn't claim that random list element access was constant time, he said it was O(1) (and that it should be part of the Python spec that it is). Yes, I switched, because 'constant time' is a comprehensible claim that can be refuted and because that is how some will interpret O(1) (see below for proof;-). If one takes O(1) to mean bounded, which I believe is the usual technical meaning, then all Python built-in sequence operations take bounded time because of the hard size limit. If sequences were not bounded in length, then access time would not be bounded either. My most specific point is that O(1), interpreted as more-or-less constant time across a range of problem sizes, can be either a virtue or a vice depending on whether the constancy is a result of speeding up large problems or slowing down small problems. I furthermore contend that Python sequences on current hardware exhibit both virtue and vice, that it would be absurd to reject a system that kept the virtue without the vice, and that such absurdity should not be built into the language definition. My fourth point is that we can meet the reasonable goal of helping some people make better use of current Python/CPython on current hardware without big-O controversy and without screwing around with the language definition and locking out the future. Uhm, O(1) /is/ constant time, see page 45 of Introduction to Algorithms, 2nd edition. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
On Aug 30, 6:03 am, a...@pythoncraft.com (Aahz) wrote: > That reminds me: one co-worker (who really should have known better ;-) > had the impression that sets were O(N) rather than O(1). Although > writing that off as a brain-fart seems appropriate, it's also the case > that the docs don't really make that clear, it's implied from requiring > elements to be hashable. Do you agree that there should be a comment? There probably ought to be a HOWTO or FAQ entry on algorithmic complexity that covers classes and functions where the algorithms are interesting. That will concentrate the knowledge in one place where performance is a main theme and where the various alternatives can be compared and contrasted. I think most users of sets rarely read the docs for sets. The few lines in the tutorial are enough so that most folks "just get it" and don't read more detail unless they are attempting something exotic. Our docs have gotten somewhat voluminous, so it's unlikely that adding that particular needle to the haystack would have cured your colleague's "brain-fart" unless he had been focused on a single document talking about the performance characteristics of various data structures. Raymond -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
On 9/1/10 4:40 PM, John Bokma wrote: Arnaud Delobelle writes: Terry Reedy writes: But such a document, after stating that array access may be thought of as constant time on current hardware to a useful first approximation, should also state that repeated sequential accesses may be *much* faster than repeated random accesses. People in the high-performance computing community are quite aware of this difference between simplified lies and messy truth. Because of this, array algorithms are (should be) written differently in Fortran and C because Fortran stores arrays by columns and C by rows and because it is usually much faster to access the next item than one far away. I don't understand what you're trying to say. Aahz didn't claim that random list element access was constant time, he said it was O(1) (and that it should be part of the Python spec that it is). Uhm, O(1) /is/ constant time, see page 45 of Introduction to Algorithms, 2nd edition. While we often use the term "constant time" as a synonym for O(1) complexity of an algorithm, Arnaud and Terry are using the term here to mean "an implementation takes roughly the same amount of wall-clock time every time". -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list
Re: what is this kind of string: b'string' ?
On 09/01/2010 02:32 PM, Stef Mientki wrote: in winpdb I see strings like this: a = b'string' a 'string' type(a) what's the "b" doing in front of the string ? thanks, Stef Mientki In Python 2 the b is meaningless (but allowed for compatibility and future-proofing purposes), while in Python 3 it creates a bytes object (a byte string, technically an object of type bytes) rather than a string (of unicode).

Python 2:
>>> type(b'abc')
<type 'str'>
>>> type('abc')
<type 'str'>

Python 3:
>>> type(b'abc')
<class 'bytes'>
>>> type('abc')
<class 'str'>

-- http://mail.python.org/mailman/listinfo/python-list
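A short Python 3 sketch of the distinction described above: b'...' builds raw bytes, '...' builds unicode text, and encode()/decode() convert between the two:

```python
raw = b'string'    # bytes: a sequence of raw octets
text = 'string'    # str: unicode text

print(type(raw).__name__)            # bytes
print(type(text).__name__)           # str
print(raw.decode('ascii') == text)   # True
print(text.encode('ascii') == raw)   # True
```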
Email Previews
Hello, I'm currently trying to write a quick script that takes email message objects and generates quick snippet previews (like the iPhone does when you are in the menu) but I'm struggling. I was just wondering before I started to put a lot of work in this if there were any existing scripts out there that did it, as it seems a bit pointless spending a lot of time reinventing the wheel if something already exists. Thanks for your help, Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
Arnaud Delobelle writes: > Terry Reedy writes: > >> On 9/1/2010 11:40 AM, Aahz wrote: >>> I think that any implementation >>> that doesn't have O(1) for list element access is fundamentally broken, >> >> Whereas I think that that claim is fundamentally broken in multiple ways. >> >>> and we should probably document that somewhere. >> >> I agree that *current* algorithmic behavior of parts of CPython on >> typical *current* hardware should be documented not just 'somewhere' >> (which I understand it is, in the Wiki) but in a CPython doc included >> in the doc set distributed with each release. >> >> Perhaps someone or some group could write a HowTo on Programming with >> CPython's Builtin Classes that would describe both the implementation >> and performance and also the implications for coding style. In >> particular, it could compare CPython's array lists and tuples to >> singly linked lists (which are easily created in Python also). >> >> But such a document, after stating that array access may be thought of >> as constant time on current hardware to a useful first approximation, >> should also state that repeated sequential accesses may be *much* >> faster than repeated random accesses. People in the high-performance >> computing community are quite aware of this difference between >> simplified lies and messy truth. Because of this, array algorithms are >> (should be) written differently in Fortran and C because Fortran >> stores arrays by columns and C by rows and because it is usually much >> faster to access the next item than one far away. > > I don't understand what you're trying to say. Aahz didn't claim that > random list element access was constant time, he said it was O(1) (and > that it should be part of the Python spec that it is). Uhm, O(1) /is/ constant time, see page 45 of Introduction to Algorithms, 2nd edition. 
-- John Bokma j3b Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma Freelance Perl & Python Development: http://castleamber.com/ -- http://mail.python.org/mailman/listinfo/python-list
Re: what is this kind of string: b'string' ?
On 9/1/10 4:32 PM, Stef Mientki wrote: in winpdb I see strings like this: a = b'string' a 'string' type(a) what's the "b" doing in front of the string ? http://docs.python.org/py3k/library/stdtypes.html#bytes-and-byte-array-methods -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list
what is this kind of string: b'string' ?
in winpdb I see strings like this:

>>> a = b'string'
>>> a
'string'
>>> type(a)
<type 'str'>

what's the "b" doing in front of the string ? thanks, Stef Mientki -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
On 9/1/2010 2:42 AM, Paul Rubin wrote: Terry Reedy writes: Does anyone seriously think that an implementation should be rejected as an implementation if it intelligently did seq[n] lookups in log2(n)/31 time units for all n (as humans would do), instead of stupidly taking 1 time unit for all n < 2**31 and rejecting all larger values (as 32-bit CPython does)? Er, how can one handle n > 2**31 at all, in 32-bit CPython? I am not sure of what you mean by 'handle'. Ints (longs in 2.x) are not limited, but indexes are. 2**31 and bigger are summarily rejected as impossibly too large, even though they might not actually be so these days.

>>> s = b''
>>> s[1]
Traceback (most recent call last):
  File "", line 1, in
    s[1]
IndexError: index out of range
>>> s[2**32]
Traceback (most recent call last):
  File "", line 1, in
    s[2**32]
IndexError: cannot fit 'int' into an index-sized integer

As far as I know, this is undocumented. In any case, this means that if it were possible to create a byte array longer than 2**31 on an otherwise loaded 32-bit linux machine with 2**32 memory, then indexing the end elements would not be possible, which is to say, O(1) would jump to O(INF). I do not have such a machine to test whether big = open('2.01.gigabytes', 'rb').read() executes or raises an exception. Array size limits are also not documented. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
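The "index-sized integer" in the second traceback is CPython's C-level Py_ssize_t; its upper bound is exposed as sys.maxsize (2**31 - 1 on 32-bit builds, 2**63 - 1 on 64-bit ones), and an index beyond it is rejected before the sequence is even consulted:

```python
import sys

# Any index that does not fit in a Py_ssize_t raises IndexError up front,
# regardless of the sequence's actual length.
s = b''
try:
    s[sys.maxsize + 1]
except IndexError as e:
    print('IndexError:', e)
```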
Re: C++ - Python API
Thanks for the answer On 1 Sep., 22:29, Thomas Jollans wrote: > On Wednesday 01 September 2010, it occurred to Markus Kraus to exclaim: > > > So the feature overview: > > First, the obligatory things you don't want to hear: Have you had a look at > similar efforts? A while ago, Aahz posted something very similar on this very > list. You should be able to find it in any of the archives without too much > trouble. > The most prominent example of this is obviously Boost.Python. I searched in Aahz's posts but I didn't find anything related. About Boost.Python: I worked with it but (for me) it seems more as if it's meant to create pyd modules. > > For C++ classes: > > - "translating" it into a python object > > How do you handle memory management ? As long as the c++ instance itself exists, the python object exists too. If you delete the c++ instance the python one is also deleted (in a multithreaded environment you'll get a "This object has already been deleted" error). > > - complete reflexion (attributes and methods) of the c++ instance > > - call c++ methods nearly directly from python > > - method-overloading (native python doesn't support it (!)) > > > Modules: > > - the API allows to create hardcoded python modules without having > > any knowledge about the python C-API > > - Adding attributes to the module (long/char*/PyObject*) > > char*... > Unicode? Somewhere? wchar_t* maybe, or std::wstring? No? Also -- double? (I'm > just being pedantic now, at least double should be trivial to add) I haven't worked too much on this yet but I'll add support for all common c++ types. > > General: > > -runs on any platform and doesn't need an installed python > > Which platforms did you test it on? Which compilers did you test? Are you sure > your C++ is portable? My C++ code is not platform dependent so it should (haven't tested it yet) be portable. > > -runs in multithreaded environments (requires python > 2.3) > > How do you deal with the GIL?
> How do you handle calling to Python from multiple C++ threads? Since python 2.3 there are the PyGILState_Ensure and PyGILState_Release functions which do the whole GIL stuff for you :). > > -support for python 3.x > > -no need of any python C-API knowledge (maybe for coding modules but > > then only 2 or 3 functions) > > -the project is a VC2010 one and there is also an example module + > > class > > Again, have you tested other compilers? Don't have the ability for it (could need a linux guy who knows how to create a makefile). > > If there is any interest in testing this or using this for your own > > project, please post; in that case i'll release it now instead of > > finishing the inheritance support before releasing it (this may take a > > few days though). > > Just publish a bitbucket or github repository ;-) I'll set up a googlecode site :P -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
Terry Reedy writes: > On 9/1/2010 11:40 AM, Aahz wrote: >> I think that any implementation >> that doesn't have O(1) for list element access is fundamentally broken, > > Whereas I think that that claim is fundamentally broken in multiple ways. > >> and we should probably document that somewhere. > > I agree that *current* algorithmic behavior of parts of CPython on > typical *current* hardware should be documented not just 'somewhere' > (which I understand it is, in the Wiki) but in a CPython doc included > in the doc set distributed with each release. > > Perhaps someone or some group could write a HowTo on Programming with > CPython's Builtin Classes that would describe both the implementation > and performance and also the implications for coding style. In > particular, it could compare CPython's array lists and tuples to > singly linked lists (which are easily created in Python also). > > But such a document, after stating that array access may be thought of > as constant time on current hardware to a useful first approximation, > should also state that repeated sequential accesses may be *much* > faster than repeated random accesses. People in the high-performance > computing community are quite aware of this difference between > simplified lies and messy truth. Because of this, array algorithms are > (should be) written differently in Fortran and C because Fortran > stores arrays by columns and C by rows and because it is usually much > faster to access the next item than one far away. I don't understand what you're trying to say. Aahz didn't claim that random list element access was constant time, he said it was O(1) (and that it should be part of the Python spec that it is). -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: C++ - Python API
On Wednesday 01 September 2010, it occurred to Markus Kraus to exclaim: > So the feature overview: First, the obligatory things you don't want to hear: Have you had a look at similar efforts? A while ago, Aahz posted something very similar on this very list. You should be able to find it in any of the archives without too much trouble. The most prominent example of this is obviously Boost.Python. > For C++ classes: > - "translating" it into a python object How do you handle memory management ? > - complete reflexion (attributes and methods) of the c++ instance > - call c++ methods nearly directly from python > - method-overloading (native python doesn't support it (!)) > > Modules: > - the API allows to create hardcoded python modules without having > any knowledge about the python C-API > - Adding attributes to the module (long/char*/PyObject*) char*... Unicode? Somewhere? wchar_t* maybe, or std::wstring? No? Also -- double? (I'm just being pedantic now, at least double should be trivial to add) > > General: > -runs on any platform and doesn't need an installed python Which platforms did you test it on? Which compilers did you test? Are you sure your C++ is portable? > -runs in multithreaded environments (requires python > 2.3) How do you deal with the GIL? How do you handle calling to Python from multiple C++ threads? > -support for python 3.x > -no need of any python C-API knowledge (maybe for coding modules but > then only 2 or 3 functions) > -the project is a VC2010 one and there is also an example module + > class Again, have you tested other compilers? > If there is any interest in testing this or using this for your own > project, please post; in that case i'll release it now instead of > finishing the inheritance support before releasing it (this may take a > few days though). Just publish a bitbucket or github repository ;-) -- http://mail.python.org/mailman/listinfo/python-list
Re: parsing string into dict
Tim Arnold writes: > Hi, > I have a set of strings that are *basically* comma separated, but with > the exception that if a comma occurs inside curly braces it is not a > delimiter. Here's an example: > > [code=one, caption={My Analysis for \textbf{t}, Version 1}, continued] > > I'd like to parse that into a dictionary (note that 'continued' gets > the value 'true'): > {'code':'one', 'caption':'{My Analysis for \textbf{t}, Version > 1}','continued':'true'} > > I know and love pyparsing, but for this particular code I need to rely > only on the standard library (I'm running 2.7). Here's what I've got, > and it works. I wonder if there's a simpler way? > thanks, > --Tim Arnold > FWIW, here's how I would do it:

def parse_key(s, start):
    pos = start
    while s[pos] not in ",=]":
        pos += 1
    return s[start:pos].strip(), pos

def parse_value(s, start):
    pos, nesting = start, 0
    while nesting or s[pos] not in ",]":
        nesting += {"{": 1, "}": -1}.get(s[pos], 0)
        pos += 1
    return s[start:pos].strip(), pos

def parse_options(s):
    options, pos = {}, 0
    while s[pos] != "]":
        key, pos = parse_key(s, pos + 1)
        if s[pos] == "=":
            value, pos = parse_value(s, pos + 1)
        else:
            value = 'true'
        options[key] = value
    return options

test = r"[code=one, caption={My Analysis for \textbf{t}, Version 1}, continued]"  # raw string so \t isn't read as a tab

>>> parse_options(test)
{'caption': '{My Analysis for \\textbf{t}, Version 1}', 'code': 'one', 'continued': 'true'}

-- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
On 9/1/2010 11:40 AM, Aahz wrote: I think that any implementation that doesn't have O(1) for list element access is fundamentally broken, Whereas I think that that claim is fundamentally broken in multiple ways. and we should probably document that somewhere. I agree that *current* algorithmic behavior of parts of CPython on typical *current* hardware should be documented not just 'somewhere' (which I understand it is, in the Wiki) but in a CPython doc included in the doc set distributed with each release. Perhaps someone or some group could write a HowTo on Programming with CPython's Builtin Classes that would describe both the implementation and performance and also the implications for coding style. In particular, it could compare CPython's array lists and tuples to singly linked lists (which are easily created in Python also). But such a document, after stating that array access may be thought of as constant time on current hardware to a useful first approximation, should also state that repeated sequential accesses may be *much* faster than repeated random accesses. People in the high-performance computing community are quite aware of this difference between simplified lies and messy truth. Because of this, array algorithms are (should be) written differently in Fortran and C because Fortran stores arrays by columns and C by rows and because it is usually much faster to access the next item than one far away. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
C++ - Python API
Hi guys, I worked on this for several days (or even weeks?!) now, but I'm nearly finished with it: A complete C++ to Python API which allows you to use python as a scripting language for your C++ projects. Simple example:

--- python code ---
def greet( player ):
    print( "Hello player " + player.getName() + " !" )
---

--- c++ code ---
class CPlayer
{
    REGISTER_CLASS( CPlayer, CLASS_METHOD("getName", GetName) )
private:
    string m_Name;
public:
    CPlayer( string nName )
    {
        m_Name = nName;
        INITIALIZE("player");
    }
    string GetName( ) { return m_Name; }
};
---

If you call the python function (look into the example in the project to see how to do this) this results in (assume you have CPlayer("myPlayerName")) "Hello player myPlayerName!".

So the feature overview:

For C++ classes:
- "translating" it into a python object
- complete reflexion (attributes and methods) of the c++ instance
- call c++ methods nearly directly from python
- method-overloading (native python doesn't support it (!))

Modules:
- the API allows to create hardcoded python modules without having any knowledge about the python C-API
- adding attributes to the module (long/char*/PyObject*)

General:
- runs on any platform and doesn't need an installed python
- runs in multithreaded environments (requires python > 2.3)
- support for python 3.x
- no need of any python C-API knowledge (maybe for coding modules, but then only 2 or 3 functions)
- the project is a VC2010 one and there is also an example module + class

If there is any interest in testing this or using this for your own project, please post; in that case I'll release it now instead of finishing the inheritance support before releasing it (this may take a few days though). -- http://mail.python.org/mailman/listinfo/python-list
Re: Source code for itertools
On 1 sep, 06:30, Tim Roberts wrote: > vsoler wrote: > >On 31 ago, 04:42, Paul Rubin wrote: > >> vsoler writes: > >> > I was expecting an itertools.py file, but I don't see it in your list. > >> >> ./python3.1-3.1.2+20100829/Modules/itertoolsmodule.c > > >> looks promising. Lots of stdlib modules are written in C for speed or > >> access to system facilities. > > >Lawrence, Paul, > > >You seem to be running a utility I am not familiar with. Perhaps this > >is because I am using Windows, and most likely you are not. > > >How could I have found the answer in a windows environment? > > Did you take the time to understand what he did? It's not that hard to > figure out. He fetched the Python source code, unpacked it, then searched > for filenames that contained the string "itertools." > > The equivalent in Windows, after unpacking the source archive, would have > been: > dir /s *itertools* > -- > Tim Roberts, t...@probo.com > Providenza & Boekelheide, Inc. Thank you Tim, understood!!! -- http://mail.python.org/mailman/listinfo/python-list
Re: Source code for itertools
On 31 ago, 05:33, Rolando Espinoza La Fuente wrote: > On Mon, Aug 30, 2010 at 11:06 PM, vsoler wrote: > > On 31 ago, 04:42, Paul Rubin wrote: > >> vsoler writes: > >> > I was expecting an itertools.py file, but I don't see it in your list. > >> >> ./python3.1-3.1.2+20100829/Modules/itertoolsmodule.c > > >> looks promising. Lots of stdlib modules are written in C for speed or > >> access to system facilities. > > > Lawrence, Paul, > > > You seem to be running a utility I am not familiar with. Perhaps this > > is because I am using Windows, and most likely you are not. > > > How could I have found the answer in a windows environment? > > Hard question. They are using standard unix utilities. > > But you can find the source file of a python module within python: > > >>> import itertools > >>> print(itertools.__file__) > > /usr/lib/python2.6/lib-dynload/itertools.so > > Yours should point to a windows path. If the file ends with a ".py", > you can open the file > with any editor. If it ends with ".so" or something else, it is likely a > compiled module in C > and you should search in the source distribution, not the binary distribution. > > Hope it helps. > > Regards, > > Rolando Espinoza La Fuente www.insophia.com

Thank you Rolando for your contribution. Following your piece of advice I got:

>>> import itertools
>>> print(itertools.__file__)
Traceback (most recent call last):
  File "", line 1, in
    print(itertools.__file__)
AttributeError: 'module' object has no attribute '__file__'
>>>

So, I understand that the module is written in C. Vicente Soler -- http://mail.python.org/mailman/listinfo/python-list
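On Python 3 the same question can be answered without touching __file__ at all; this is a hedged sketch using importlib, not something from the thread itself:

```python
import importlib.util

# For C modules compiled into the interpreter, the module spec's origin
# is the string 'built-in' rather than a filesystem path.
spec = importlib.util.find_spec("itertools")
print(spec.origin)  # 'built-in' on CPython 3

# A pure-Python stdlib module reports its .py source file instead.
spec_py = importlib.util.find_spec("bisect")
print(spec_py.origin.endswith(".py"))  # True
```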
Installation problem: Python 2.6.6 (32-Bit) on Windows 7 (32-Bit)
Has anyone else had problems running the msi for Python 2.6.6 on Windows 7 Professional? If I don't check "Compile .py to byte code", the installer completes without error. Checking "Compile .py to byte code" causes the following to be displayed: "There is a problem with the windows installer package. A program run as part of setup did not complete as expected"

1. I have GB of disk space available.
2. I have admin privileges.
3. The MD5 checksum of the downloaded installer matches the MD5 checksum on python.org.
4. Run As Administrator is not available when I Shift-Right Click (probably because my login already has admin privileges).

I'm also having a similar issue with the PythonWin32 extensions installer on the same machine. -- http://mail.python.org/mailman/listinfo/python-list
Re: Dumb Stupid Question About List and String
2010/9/2 Alban Nona
> Hello Xavier, working great ! thank you very much ! :p
> Do you know by any chance if dictionnary can be sorted asthis:

Look at the sorted() global function in the Python API. ;]

Cheers,
Xav -- http://mail.python.org/mailman/listinfo/python-list
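Xavier's pointer in one concrete snippet (a sketch; the keys and values here are made up): a dict itself has no order to set, but its items can be sorted into a list, by key or by value.

```python
d = {'ELM002': 2, 'ELM001': 1, 'ELM003': 3}   # illustrative data

by_key = sorted(d.items())                                   # sort on the keys
by_value = sorted(d.items(), key=lambda kv: kv[1], reverse=True)  # sort on the values

print(by_key)
print(by_value)
```

sorted() always returns a new list of (key, value) pairs and leaves the dict itself untouched.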
Re: Windows vs. file.read
On Sep 1, 12:31 pm, MRAB wrote: > You should open the files in binary mode, not text mode, ie file(path, > "rb"). Text mode is the default. Not a problem on *nix because the line > ending is newline. Thanks. That was it. -- http://mail.python.org/mailman/listinfo/python-list
PyPy and RPython
Is there a plan to adopt PyPy and RPython under the Python foundation in an attempt to standardize both? I have been watching PyPy and RPython evolve over the years. PyPy seems to have momentum and is rapidly gaining followers and performance. The PyPy JIT and its performance would be a good thing for the Python community, and it seems to be well ahead of Unladen Swallow in performance and in a position to improve quite a bit. Secondly, I have always fantasized about never having to write C code yet getting its compiled performance. With RPython (a strict subset of Python), I can actually compile it to C/machine code. These two seem like spectacular advantages for Python to pick up on, and all it would take is the PyPy project and the Python foundation showing the support and direction to adopt them. Yet I see this forum relatively quiet on PyPy and RPython. Any reasons? Sarvi -- http://mail.python.org/mailman/listinfo/python-list
parsing string into dict
Hi, I have a set of strings that are *basically* comma separated, but with the exception that if a comma occurs inside curly braces it is not a delimiter. Here's an example:

[code=one, caption={My Analysis for \textbf{t}, Version 1}, continued]

I'd like to parse that into a dictionary (note that 'continued' gets the value 'true'):

{'code':'one', 'caption':'{My Analysis for \textbf{t}, Version 1}','continued':'true'}

I know and love pyparsing, but for this particular code I need to rely only on the standard library (I'm running 2.7). Here's what I've got, and it works. I wonder if there's a simpler way?

thanks, --Tim Arnold

The 'line' is like my example above but it comes in without the ending bracket, so I append one on the 6th line.

def parse_options(line):
    options = dict()
    if not line:
        return options
    active = ['[', '=', ',', '{', '}', ']']
    line += ']'
    key = ''
    word = ''
    inner = 0
    for c in list(line):
        if c in active:
            if c == '{':
                inner += 1
            elif c == '}':
                inner -= 1
            if inner:
                word += c
            else:
                if c == '=':
                    (key, word) = (word, '')
                    options[key.strip()] = True
                elif c in [',', ']']:
                    if not key:
                        options[word.strip()] = True
                    else:
                        options[key.strip()] = word.strip()
                    (key, word) = (False, '')
        else:
            word += c
    return options
-- http://mail.python.org/mailman/listinfo/python-list
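One shorter way to do the same job, offered as a sketch (it is not from the thread): first split the line on commas at brace depth zero, then partition each piece on '='. The helper names are made up.

```python
def split_top_level(s, sep=','):
    # Split on `sep` only where we are outside all curly braces (depth counting).
    parts, depth, start = [], 0, 0
    for i, c in enumerate(s):
        if c == '{':
            depth += 1
        elif c == '}':
            depth -= 1
        elif c == sep and depth == 0:
            parts.append(s[start:i])
            start = i + 1
    parts.append(s[start:])
    return parts

def parse_options(line):
    # Accepts the line with or without the closing bracket, like the original.
    line = line.strip().lstrip('[').rstrip(']')
    options = {}
    for part in split_top_level(line):
        key, sep, value = part.partition('=')
        # A bare word (no '=') becomes a True flag, as in the original code.
        options[key.strip()] = value.strip() if sep else True
    return options

print(parse_options(r'[code=one, caption={My Analysis for \textbf{t}, Version 1}, continued'))
```

Braces are kept in the value (so the caption keeps its surrounding {}), and nesting like \textbf{t} is handled by the depth counter.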
Re: Fibonacci: How to think recursively
The most straightforward method would be to apply the formula directly. Loop on j, computing Fj along the way:

if n <= 1:
    return n
Fold = 0
Fnew = 1
for j in range(2, n + 1):
    Fold, Fnew = Fnew, Fold + Fnew
return Fnew

(The loop has to include n itself, hence range(2, n + 1).) Even simpler:

return round(((1 + sqrt(5.)) / 2)**n / sqrt(5.))
-- http://mail.python.org/mailman/listinfo/python-list
Re: Windows vs. file.read
On Wed, Sep 1, 2010 at 1:03 PM, Mike wrote: > I have a ppm file that python 2.5 on Windows XP cannot read > completely. > Python on linux can read the file with no problem > Python on Windows can read similar files. > I've placed test code and data here: > http://www.cs.ndsu.nodak.edu/~hennebry/ppm_test.zip > Within the directory ppm_test, type > python ppm_test.py > The chunk size commentary occurs only if file.read cannot read enough > bytes. > The commentary only occurs for the last file. > Any ideas? > Any ideas that don't require getting rid of Windows? > It's not my option. Open the files in binary mode. i.e., x=Ppm(file("ff48x32.ppm",'rb')) x=Ppm(file("bw48x32.ppm",'rb')) x=Ppm(file("bisonfootball.ppm",'rb')) You were just lucky on the first two files. -- http://mail.python.org/mailman/listinfo/python-list
Re: Windows vs. file.read
On 01/09/2010 18:03, Mike wrote: I have a ppm file that python 2.5 on Windows XP cannot read completely. Python on linux can read the file with no problem Python on Windows can read similar files. I've placed test code and data here: http://www.cs.ndsu.nodak.edu/~hennebry/ppm_test.zip Within the directory ppm_test, type python ppm_test.py The chunk size commentary occurs only if file.read cannot read enough bytes. The commentary only occurs for the last file. Any ideas? Any ideas that don't require getting rid of Windows? It's not my option. You should open the files in binary mode, not text mode, ie file(path, "rb"). Text mode is the default. Not a problem on *nix because the line ending is newline. -- http://mail.python.org/mailman/listinfo/python-list
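The advice above can be checked with a small self-contained round trip; the payload bytes here are made up to include exactly the characters Windows text mode mangles:

```python
import os
import tempfile

# Bytes that text mode can mangle on Windows: b'\x1a' (Ctrl-Z, treated as
# end-of-file on text-mode reads) and b'\r\n' (collapsed to b'\n').
payload = b'P6\n2 2\n255\n\x1a\r\n\x00\xff\x00\xff'

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, 'wb') as f:
        f.write(payload)
    with open(path, 'rb') as f:      # 'rb', not 'r'
        data = f.read()
    assert data == payload           # binary mode is byte-exact on every platform
finally:
    os.remove(path)
```

Reading the same file in text mode on Windows would stop at the Ctrl-Z byte, which is why the ppm read appeared truncated only there.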
Re: Dumb Stupid Question About List and String
On 01/09/2010 17:49, Alban Nona wrote:
> Hello Xavier, Thank you :) Well what I am trying to generate is that kind of result:
>
> listn1 = ['ELM001_DIF', 'ELM001_SPC', 'ELM001_RFL', 'ELM001_SSS', 'ELM001_REFR', 'ELM001_ALB', 'ELM001_AMB', 'ELM001_NRM', 'ELM001_MVE', 'ELM001_DPF', 'ELM001_SDW', 'ELM001_MAT', 'ELM001_WPP']
>
> listn2 = ['ELM002_DIF', 'ELM002_SPC', 'ELM002_RFL', 'ELM002_SSS', 'ELM002_REFR', 'ELM002_ALB', 'ELM002_AMB', 'ELM002_NRM', 'ELM002_MVE', 'ELM002_DPF', 'ELM002_SDW', 'ELM002_MAT', 'ELM002_WPP']
>
> etc...
>
> The thing is, the first list will be generated automatically. (so there will be unknow versions of ELM00x) that why Im trying to figure out how to genere variable and list in an automatic way. Can you tell me if its not clear please ? :P my english still need improvement when Im trying to explain scripting things.
[snip]

Create a dict in which the key is the "ELM" part and the value is a list of those entries which begin with that "ELM" part. For example, if the entry is 'ELM001_DIF' then the key is 'ELM001', which is the first 6 characters of entry, or entry[:6]. Something like this:

elem_dict = {}
for entry in list_of_entries:
    key = entry[:6]
    if key in elem_dict:
        elem_dict[key].append(entry)
    else:
        elem_dict[key] = [entry]
-- http://mail.python.org/mailman/listinfo/python-list
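The same grouping comes out a little shorter with collections.defaultdict, and splitting on the underscore avoids hard-coding the prefix length; a sketch using entry names from the thread:

```python
from collections import defaultdict

entries = ['ELM001_DIF', 'ELM001_SPC', 'ELM002_DIF', 'ELM002_SPC']

groups = defaultdict(list)               # missing keys start out as empty lists
for entry in entries:
    prefix = entry.partition('_')[0]     # 'ELM001', 'ELM002', ...
    groups[prefix].append(entry)

print(dict(groups))
```

defaultdict(list) removes the `if key in ... else ...` branch entirely: the first append for a new prefix creates its list.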
Re: Dumb Stupid Question About List and String
On 2 September 2010 02:49, Alban Nona wrote: > Well what Iam trying to generate is that kind of result: > > listn1=['ELM001_DIF', 'ELM001_SPC', 'ELM001_RFL', 'ELM001_SSS', > 'ELM001_REFR', 'ELM001_ALB', 'ELM001_AMB', 'ELM001_NRM', 'ELM001_MVE', > 'ELM001_DPF', 'ELM001_SDW', 'ELM001_MAT', 'ELM001_WPP'] > > listn2 = ['ELM002_DIF', 'ELM002_SPC', 'ELM002_RFL', 'ELM002_SSS', > 'ELM002_REFR', 'ELM002_ALB', 'ELM002_AMB', 'ELM002_NRM', 'ELM002_MVE', > 'ELM002_DPF', 'ELM002_SDW', 'ELM002_MAT', 'ELM002_WPP'] > > etc... > Have a look at http://www.ideone.com/zlBeB . I took some liberty and renamed some of your variables. I wanted to show you what I (personally) think as good practices in python, from naming conventions to how to use the list and dictionary, and so on. Also, 4-spaces indent. I noticed you have 5 for some reason, but that's none of my business now. I hope my comments explain what they do, and why they are that way. > The thing is, the first list will be generated automatically. (so there > will be unknow versions of ELM00x) > that why Im trying to figure out how to genere variable and list in an > automatic way. > Yes, that's totally possible. See range() (and xrange(), possibly) in the Python API. -- http://mail.python.org/mailman/listinfo/python-list
Windows vs. file.read
I have a ppm file that python 2.5 on Windows XP cannot read completely. Python on linux can read the file with no problem Python on Windows can read similar files. I've placed test code and data here: http://www.cs.ndsu.nodak.edu/~hennebry/ppm_test.zip Within the directory ppm_test, type python ppm_test.py The chunk size commentary occurs only if file.read cannot read enough bytes. The commentary only occurs for the last file. Any ideas? Any ideas that don't require getting rid of Windows? It's not my option. -- http://mail.python.org/mailman/listinfo/python-list
scp with paramiko
Hi There, I want to download a file from a client using paramiko. I found plenty of resources using Google on how to send a file, but none that describe how to download files from a client. Help would be appreciated! Thanks a lot! Ron -- http://mail.python.org/mailman/listinfo/python-list
Re: Optimising literals away
On 01/09/2010 14:25, Lie Ryan wrote: On 09/01/10 17:06, Stefan Behnel wrote: MRAB, 31.08.2010 23:53: On 31/08/2010 21:18, Terry Reedy wrote: On 8/31/2010 12:33 PM, Aleksey wrote: On Aug 30, 10:38 pm, Tobias Weber wrote: Hi, whenever I type an "object literal" I'm unsure what optimisation will do to it. Optimizations are generally implentation dependent. CPython currently creates numbers, strings, and tuple literals just once. Mutable literals must be created each time as they may be bound and saved. def m(arg): if arg& set([1,2,3]): set() is a function call, not a literal. When m is called, who knows what 'set' will be bound to? In Py3, at least, you could write {1,2,3}, which is much faster as it avoids creating and deleting a list. On my machine, .35 versus .88 usec. Even then, it must be calculated each time because sets are mutable and could be returned to the calling code. There's still the possibility of some optimisation. If the resulting set is never stored anywhere (bound to a name, for example) then it could be created once. When the expression is evaluated there could be a check so see whether 'set' is bound to the built-in class, and, if it is, then just use the pre-created set. What if the set is mutated by the function? That will modify the global cache of the set; one way to prevent mutation is to use frozenset, but from the back of my mind, I think there was a discussion that rejects set literals producing a frozen set instead of regular set. [snip] I was talking about a use case like the example code, where the set is created, checked, and then discarded. -- http://mail.python.org/mailman/listinfo/python-list
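For what it's worth, recent CPython already performs a version of the optimisation discussed above for membership tests: a set literal on the right-hand side of `in` is compiled to a frozenset constant, built once at compile time (so mutation is impossible and the cache is safe). A quick way to see it, checked on CPython 3; other implementations may differ:

```python
def m(arg):
    return arg in {1, 2, 3}

# On recent CPython the peephole optimizer stores the literal as a
# frozenset constant in the function's code object.
print(any(isinstance(c, frozenset) for c in m.__code__.co_consts))
```

This only applies when the set cannot escape, exactly the "never stored anywhere" case described above; `s = {1, 2, 3}` as a statement still builds a fresh set each time.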
Re: Newby Needs Help with Python code
Nally Kaunda-Bukenya wrote:
> I hope someone can help me. I am new to Python and trying to achive the
> following:
> 1) I would like to populate the Tot_Ouf_Area field with total area of
> each unique outfall_id (code attempted below, but Tot_Ouf_Area not
> populating)
> 2) I would also like to get the user input of Rv (each landuse type will
> have a specific Rv value). For example the program should ask the user
> for Rv value of Low Density Residential (user enters 0.4 in example
> below and that value must be stored in the Rv field), and so on as shown
> in the 2nd table below…

I don't know arcgis, so the following is just guesswork. I iterate over the Outfalls_ND table twice, the first time to calculate the sums per OUTFALL_ID and put them into a dict. With the second pass the Tot_Outf_Area column is updated:

import arcgisscripting

def rows(cur):
    while True:
        row = cur.Next()
        if row is None:
            break
        yield row

gp = arcgisscripting.create()
gp.Workspace = "C:\\NPDES\\NPDES_PYTHON.mdb"

TABLE = "Outfalls_ND"
GROUP = "OUTFALL_ID"
SUM = "AREA_ACRES"
TOTAL = "Tot_Outf_Area"

aggregate = {}
cur = gp.UpdateCursor(TABLE)
for row in rows(cur):
    group = row.GetValue(GROUP)
    amount = row.GetValue(SUM)
    aggregate[group] = aggregate.get(group, 0.0) + amount

cur = gp.UpdateCursor(TABLE)
for row in rows(cur):
    group = row.GetValue(GROUP)
    row.SetValue(TOTAL, aggregate[group])
    cur.UpdateRow(row)

As this is written into the blue it is unlikely that it runs successfully without changes. Just try and report back the results.

Peter -- http://mail.python.org/mailman/listinfo/python-list
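The two-pass pattern itself can be tried without arcgis at all; here it is with plain dictionaries standing in for the table rows (field names copied from the post, data values made up):

```python
# Pass 1: sum AREA_ACRES per OUTFALL_ID; pass 2: write the total back to rows.
table = [
    {'OUTFALL_ID': 'A', 'AREA_ACRES': 1.5},
    {'OUTFALL_ID': 'A', 'AREA_ACRES': 2.0},
    {'OUTFALL_ID': 'B', 'AREA_ACRES': 0.5},
]

totals = {}
for row in table:
    key = row['OUTFALL_ID']
    totals[key] = totals.get(key, 0.0) + row['AREA_ACRES']

for row in table:
    row['Tot_Outf_Area'] = totals[row['OUTFALL_ID']]

print(totals)
```

The arcgis version above is the same shape, with GetValue/SetValue/UpdateRow replacing the dict accesses.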
Re: Dumb Stupid Question About List and String
Hello Xavier, Thank you :)

Well what I am trying to generate is that kind of result:

listn1 = ['ELM001_DIF', 'ELM001_SPC', 'ELM001_RFL', 'ELM001_SSS', 'ELM001_REFR', 'ELM001_ALB', 'ELM001_AMB', 'ELM001_NRM', 'ELM001_MVE', 'ELM001_DPF', 'ELM001_SDW', 'ELM001_MAT', 'ELM001_WPP']

listn2 = ['ELM002_DIF', 'ELM002_SPC', 'ELM002_RFL', 'ELM002_SSS', 'ELM002_REFR', 'ELM002_ALB', 'ELM002_AMB', 'ELM002_NRM', 'ELM002_MVE', 'ELM002_DPF', 'ELM002_SDW', 'ELM002_MAT', 'ELM002_WPP']

etc...

The thing is, the first list will be generated automatically (so there will be unknown versions of ELM00x); that's why I'm trying to figure out how to generate variables and lists in an automatic way. Can you tell me if it's not clear please? :P My English still needs improvement when I'm trying to explain scripting things.

2010/9/1 Xavier Ho
> On 2 September 2010 01:11, Alban Nona wrote:
> > Hello,
> >
> > seems to have the same error with python.
> > In fact I was coding within nuke, a 2d compositing software (not the best)
> > unfortunately, I dont see how I can use dictionnary to do what I would
> > like to do.
>
> Hello Alban,
>
> The reason it's printing only the ELM004 elements is because the variable,
> first, is 'ELM004' when your code goes to line 29.
>
> I noticed you're using variables created from the for loop out of its block
> as well. Personally I wouldn't recommend it as good practice. There are ways
> around it.
>
> Could you explain briefly what you want to achieve with this program?
> What's the desired sample output?
>
> Cheers,
> Xav
-- http://mail.python.org/mailman/listinfo/python-list
Re: Dumb Stupid Question About List and String
On 2 September 2010 01:11, Alban Nona wrote: > Hello, > > seems to have the same error with python. > In fact I was coding within nuke, a 2d compositing software (not the best) > unfortunately, I dont see how I can use dictionnary to do what I would like > to do. > Hello Alban, The reason it's printing only the ELM004 elements is because the variable, first, is 'ELM004' when your code goes to line 29. I noticed you're using variables created from the for loop out of its block as well. Personally I wouldn't recommend it as good practice. There are ways around it. Could you explain briefly what you want to achieve with this program? What's the desired sample output? Cheers, Xav -- http://mail.python.org/mailman/listinfo/python-list
Better multiprocessing and data persistence with C level serialisation
I was thinking about this for a while. Owing to a lack of forking or START/STOP signals, all process interchange in CPython requires serialisation, usually pickling. But what if that could be done within the interpreter core instead of by the script, creating a complete internal representation that can then be read by the child interpreter. Any comments/ideas/suggestions? -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
Aahz, 01.09.2010 17:40: I still think that making a full set of algorithmic guarantees is a Bad Idea, but I think that any implementation that doesn't have O(1) for list element access is fundamentally broken, and we should probably document that somewhere. +1 Stefan -- http://mail.python.org/mailman/listinfo/python-list
DeprecationWarning
Hi There, I would like to create an scp handle and download a file from a client. I have the following code:

import sys, os, paramiko, time
from attachment import SCPClient

transport = paramiko.Transport((prgIP, 22))
try:
    transport.connect(username='root', password=prgPass)
except IOError:
    print "Transport connect timed out"
    writelog(" Transport connect timed out. \n")
    sys.exit()

scp = SCPClient(transport)
writelog("Succesfully created scp transport handle to get P-file \n")

# Create './PRGfiles' if it does not exist.
if not os.access('./PRGfiles', os.F_OK):
    os.mkdir('./PRGfiles')

try:
    scp.get("/usr/share/NovaxTSP/P0086_2003.xml", "./PRGfiles/P0086_2003.xml")
    writelog("succesfully downloaded P-file \n")
except IOError:
    writelog("Downloading P-file failed. \n")

but what I'm getting is this and no file is downloaded:

/opt/lampp/cgi-bin/attachment.py:243: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  chan.send('\x01'+e.message)
09/01/2010 08:53:56 : Downloading P-file failed.

What does that mean and how do I resolve this? Thank you! Ron -- http://mail.python.org/mailman/listinfo/python-list
Re: [ANN] git peer-to-peer bittorrent experiment: first milestone reached
Luke Kenneth Casson Leighton, 01.09.2010 17:14: this is to let people know that a first milestone has been reached in an experiment to combine git with a file-sharing protocol, thus making it possible to use git for truly distributed software development Basically, BitTorrent only works well when there are enough people who share a common interest at the same time. Why would you think that is the case for software development, and what minimum project size would you consider reasonable to make this tool a valid choice? If you're more like targeting in-house development, it could become a little boring to be the first who arrives in the morning ... Stefan -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
In article , Jerry Hill wrote: >On Tue, Aug 31, 2010 at 10:09 AM, Aahz wrote: >> >> I suggest that we should agree on these guarantees and document them in >> the core. > >I can't get to the online python-dev archives from work (stupid >filter!) so I can't give you a link to the archives, but the original >thread that resulted in the creation of that wiki page was started on >March 9th, 2008 and was titled "Complexity documentation request". http://mail.python.org/pipermail/python-dev/2008-March/077499.html >At the time, opposition to formally documenting this seemed pretty >widespread, including from yourself and Guido. You've obviously >changed your mind on the subject, so maybe it's something that would >be worth revisiting, assuming someone wants to write the doc change. Looking back at that thread, it's less that I've changed my mind as that I've gotten a bit more nuanced. I still think that making a full set of algorithmic guarantees is a Bad Idea, but I think that any implementation that doesn't have O(1) for list element access is fundamentally broken, and we should probably document that somewhere. -- Aahz (a...@pythoncraft.com) <*> http://www.pythoncraft.com/ "...if I were on life-support, I'd rather have it run by a Gameboy than a Windows box." --Cliff Wells -- http://mail.python.org/mailman/listinfo/python-list
[ANN] git peer-to-peer bittorrent experiment: first milestone reached
http://gitorious.org/python-libbittorrent/pybtlib this is to let people know that a first milestone has been reached in an experiment to combine git with a file-sharing protocol, thus making it possible to use git for truly distributed software development and other file-revision-management operations (such as transparently turning git-configured ikiwiki and moinmoin wikis into peer-to-peer ones). the milestone reached is to transfer git commit "pack objects", as if they were ordinary files, over a bittorrent network, and have them "unpacked" at the far end. the significance of being able to transfer git commit pack objects is that this is the core of the "git fetch" command. the core of this experiment comprises a python-based VFS layer, providing alternatives to os.listdir, os.path.exists, open and so on - sufficient to make an interesting experiment itself by combining that VFS layer with e.g. python-fuse. the bittornado library, also available at the above URL, has been modified to take a VFS module as an argument to all operations, such that it would be conceivable to share maildir mailboxes, mailing list archives, .tar.gz archives, .deb and .rpm archives and so on, as if they were files and directories within a file-sharing network. as the core code has only existed for under three days, and is only 400 lines long, there are rough edges: * all existing commit objects are unpacked at startup time and are stored in-memory (!). this is done so as to avoid significant modification of the bittorrent library, which will be required. * all transferred commit objects are again stored in-memory before being unpacked. so, killing the client will lose all transfers received up to that point. on the roadmap: * make things efficient! requires modification of the bittornado library. * create some documentation! * explore how to make git use this code as a new URI type so that it will be possible to just do "git pull" * explore how to use PGP/GPG to sign commits(?) 
or perhaps just tags(?) in order to allow commits to be pulled only from trusted parties. * share all branches and tags as well as just refs/heads/* * make "git push" re-create the .torrent (make_torrent.py) and work out how to notify seeders of a new HEAD (name the torrent after the HEAD ref, and just create a new one rather than delete the old?) so there is quite a bit to do, with the priority being on making a new URI type and a new "git-remote-{URI}" command, so that this becomes actually useable rather than just an experiment, and the project can be self-hosting as a truly distributed peer-to-peer development effort. if anyone would like to assist, you only have to ask and (ironically) i will happily grant access to the gitorious-hosted repository. if anyone would like to sponsor this project, that would be very timely, as if i don't get some money soon i will be unable to pay for food and rent. l. -- http://mail.python.org/mailman/listinfo/python-list
Re: Dumb Stupid Question About List and String
Hello,

seems to have the same error with python. In fact I was coding within nuke, a 2d compositing software (not the best); unfortunately, I don't see how I can use a dictionary to do what I would like to do.

2010/9/1 Xavier Ho
> On 2 September 2010 00:47, Alban Nona wrote:
> > Hello,
> >
> > So I figure out this night how to create automatically varibales via
> > vars(), the script seems to work, exept that where it should give me a list
> > like :
> > [ELM004_DIF, ELM004_SPC, ELM004_RFL, ELM004_SSS, ELM004_REFR, ELM004_ALB,
> > etc...] it gave me just one entry in my list, and the last one [ELM004_WPP]
> > Any Ideas why that please ?
> >
> > http://pastebin.com/7CDbVgdD
>
> Some comments:
>
> 1) Avoid overwriting global functions like list as a variable name. If you
> do that, you won't be able to use list() later in your code, and nor can
> anyone else who imports your code.
> 2) I'm a bit iffy about automatic variable generations. Why not just use a
> dictionary? What do others on comp.lang.python think?
> 3) I'm getting an error from your code, and it doesn't match with what you
> seem to get:
>
> # output
> ELM004_DIF
> ELM004_SPC
> ELM004_RFL
> ELM004_SSS
> ELM004_REFR
> ELM004_ALB
> ELM004_AMB
> ELM004_NRM
> ELM004_MVE
> ELM004_DPF
> ELM004_SDW
> ELM004_MAT
> ELM004_WPP
> Traceback (most recent call last):
>   File "Test.py", line 33, in
>     print ELM001
> NameError: name 'ELM001' is not defined
>
> Did you get any compiler errors? I'm using Python 2.7
>
> Cheers,
> Xav
-- http://mail.python.org/mailman/listinfo/python-list
Re: Dumb Stupid Question About List and String
On 2 September 2010 00:47, Alban Nona wrote:
> Hello,
>
> So I figure out this night how to create automatically varibales via
> vars(), the script seems to work, exept that where it should give me a list
> like :
> [ELM004_DIF, ELM004_SPC, ELM004_RFL, ELM004_SSS, ELM004_REFR, ELM004_ALB,
> etc...] it gave me just one entry in my list, and the last one [ELM004_WPP]
> Any Ideas why that please ?
>
> http://pastebin.com/7CDbVgdD

Some comments:

1) Avoid overwriting global functions like list as a variable name. If you do that, you won't be able to use list() later in your code, and nor can anyone else who imports your code.
2) I'm a bit iffy about automatic variable generations. Why not just use a dictionary? What do others on comp.lang.python think?
3) I'm getting an error from your code, and it doesn't match with what you seem to get:

# output
ELM004_DIF
ELM004_SPC
ELM004_RFL
ELM004_SSS
ELM004_REFR
ELM004_ALB
ELM004_AMB
ELM004_NRM
ELM004_MVE
ELM004_DPF
ELM004_SDW
ELM004_MAT
ELM004_WPP
Traceback (most recent call last):
  File "Test.py", line 33, in
    print ELM001
NameError: name 'ELM001' is not defined

Did you get any compiler errors? I'm using Python 2.7

Cheers,
Xav -- http://mail.python.org/mailman/listinfo/python-list
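On the "why not just use a dictionary" point: here is a sketch of generating those per-element lists into a single dict keyed by prefix, instead of creating listn1, listn2, ... variables dynamically with vars(). The ELM numbers chosen here are arbitrary stand-ins for whatever turns up:

```python
suffixes = ['DIF', 'SPC', 'RFL', 'SSS', 'REFR', 'ALB', 'AMB',
            'NRM', 'MVE', 'DPF', 'SDW', 'MAT', 'WPP']

passes = {}                          # one dict instead of listn1, listn2, ... variables
for i in (1, 2, 4):                  # whichever ELM numbers are discovered at runtime
    prefix = 'ELM%03d' % i
    passes[prefix] = ['%s_%s' % (prefix, s) for s in suffixes]

print(passes['ELM001'][:3])
```

Any "variable name" then becomes a dict lookup (passes['ELM004']), which also makes it easy to iterate over all elements or test whether one exists.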
Re: Dumb Stupid Question About List and String
Hello, So I figure out this night how to create automatically varibales via vars(), the script seems to work, exept that where it should give me a list like : [ELM004_DIF,ELM004_SPC,ELM004_RFL,ELM004_SSS, ELM004_REFR, ELM004_ALB, etc...] it gave me just one entry in my list, and the last one [ELM004_WPP] Any Ideas why that please ? http://pastebin.com/7CDbVgdD 2010/9/1 Xavier Ho > On 1 September 2010 12:00, Alban Nona wrote: > >> @Xavier: ShaDoW, WorldPositionPoint (which is the same thing as >> WordPointCloud passe) :) >> > > Aha! That's what I was missing. > > Cheers, > Xav > -- http://mail.python.org/mailman/listinfo/python-list
Re: Fibonacci: How to think recursively
On 2010-09-01, Albert van der Horst wrote:
> [Didn't you mean: I don't understand what you mean by
> overlapping recursions? You're right about the base case, so
> clearly the OP uses some confusing terminology.]
>
> I see a problem with overlapping recursions. Unless automatic
> memoizing is on, they are unduly inefficient, as each call
> splits into two calls.
>
> If one insists on recursion (untested code, just for the idea):
>
> def fib2( n ):
>     ' return #rabbits last year, #rabbits before last '
>     if n ==1 :
>         return (1,1)
>     else:
>         penult, ult = fib2( n-1 )
>         return ( ult, ult+penult)
>
> def fub( n ):
>     return fib2(n)[1]
>
> Try fib and fub for largish numbers (>1000) and you'll feel the
> problem.

There are standard tricks for converting a recursive iteration into a tail-recursive one. It's usually done by adding the necessary parameters, e.g.:

def fibr(n):
    def fib_helper(fibminus2, fibminus1, i, n):
        if i == n:
            return fibminus2 + fibminus1
        else:
            return fib_helper(fibminus1, fibminus1 + fibminus2, i+1, n)
    if n < 2:
        return 1
    else:
        return fib_helper(1, 1, 2, n)

Once you've got a tail-recursive solution, you can usually convert it to loop iteration for languages like Python that favor them. The need for a temporary messed me up.

def fibi(n):
    if n < 2:
        return 1
    else:
        fibminus2 = 1
        fibminus1 = 1
        i = 2
        while i < n:
            fibminus2, fibminus1 = fibminus1, fibminus2 + fibminus1
            i += 1
        return fibminus2 + fibminus1

It's interesting that the loop iterative solution is, for me, harder to think up without doing the tail-recursive one first. -- Neil Cerutti -- http://mail.python.org/mailman/listinfo/python-list
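A third route, since the thread's complaint is about overlapping recursive calls: keep the naive two-branch recursion but memoize it, for example with functools.lru_cache (Python 3.2+), which collapses the exponential call tree to linear. This sketch uses the same indexing as fibr/fibi above, with fib(0) == fib(1) == 1:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Each fib(k) is computed once and cached, so the two-branch
    # recursion no longer re-solves overlapping subproblems.
    if n < 2:
        return 1
    return fib(n - 1) + fib(n - 2)

print(fib(30))   # 1346269
```

Unlike the tail-recursive rewrite, this keeps the "how to think recursively" shape of the definition intact; the only remaining limit is Python's recursion depth for very large n.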
Re: fairly urgent request: paid python (or other) work required
lkcl writes: > i apologise for having to contact so many people but this is fairly > urgent, and i'm running out of time and options. […] I sympathise with your situation; work for skilled practicioners is scarce in many places right now. For that reason, many people are likely to be in your position. For the sake of keeping this forum habitable, I have to point out to anyone reading: It's not cool to post requests for work here. There are, as you noted, other appropriate forums for that, of which this is not one. I wish you success in finding gainful work, but all readers should please note that this is *not* the place to look for it. -- \ “Sittin' on the fence, that's a dangerous course / You can even | `\ catch a bullet from the peace-keeping force” —Dire Straits, | _o__) _Once Upon A Time In The West_ | Ben Finney -- http://mail.python.org/mailman/listinfo/python-list
Re: Reversing a List
Victor Subervi wrote:
> Hi;
> I have this code:
>
> cursor.execute('describe products;')
> cols = [item[0] for item in cursor]
> cols = cols.reverse()
> cols.append('Delete')
> cols = cols.reverse()
>
> Unfortunately, the list doesn't reverse. If I print cols after the first
> reverse(), it prints None. Please advise. Also, is there a way to append to
> the front of the list directly?
> TIA,
> beno

The reverse() method reverses that cols object just fine, in place. Unfortunately, you immediately assign it a new value of None. Just remove the cols= and it'll work fine. If you want to understand the problem better, read up on reverse() and reversed(). They're very different.

In answer to your second question, you could combine the last three lines as:

cols.insert(0, 'Delete')

DaveA -- http://mail.python.org/mailman/listinfo/python-list
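The points made in this thread, condensed into one runnable snippet (the column names are made up):

```python
cols = ['a', 'b', 'c']

assert cols.reverse() is None            # in-place: the return value is None
assert cols == ['c', 'b', 'a']

cols.insert(0, 'Delete')                 # prepend directly to the front
assert cols == ['Delete', 'c', 'b', 'a']

assert list(reversed(cols)) == ['a', 'b', 'c', 'Delete']  # reversed() returns a new iterator
assert cols == ['Delete', 'c', 'b', 'a']                  # ...and leaves the list untouched
```

So `cols = cols.reverse()` throws the list away, while `cols.reverse()` alone, or `reversed(cols)` wrapped in list(), does what was intended.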
Re: Performance: sets vs dicts.
Lie Ryan, 01.09.2010 15:46: On 09/01/10 00:09, Aahz wrote: However, I think there are some rock-bottom basic guarantees we can make regardless of implementation. Does anyone seriously think that an implementation would be accepted that had anything other than O(1) for index access into tuples and lists? Dicts that were not O(1) for access with non-pathological hashing? That we would accept sets having O() performance worse than dicts? I suggest that we should agree on these guarantees and document them in the core. While I think documenting them would be great for all programmers that care about practical and theoretical execution speed; I think including these implementation details in core documentation as a "guarantee" would be a bad idea for the reasons Terry outlined. One way of resolving that is by having two documentations (or two separate sections in the documentation) for: - Python -- the language -- documenting Python as an abstract language, this is the documentation which can be shared across all Python implementations. This will also be the specification for Python Language which other implementations will be measured to. - CPython -- the Python interpreter -- documents implementation details and performance metrics. It should be properly noted that these are not part of the language per se. This will be the playground for CPython experts that need to fine tune their applications to the last drop of blood and don't mind their application going nuts with the next release of CPython. I disagree. I think putting the "obvious" guarantees right into the normal documentation will actually make programmers aware that there *are* different implementations (and differences between implementations), simply because it wouldn't just say "O(1)" but "the CPython implementation of this method has an algorithmic complexity of O(1), other Python implementations are known to perform alike at the time of this writing". 
Maybe without the last half of the sentence if we really don't know how other implementations work here, or if we expect that there may well be a reason they may choose to behave different, but in most cases, it shouldn't be hard to make that complete statement. After all, we basically know what other implementations there are, and we also know that they tend to match the algorithmic complexities at least for the major builtin types. It seems quite clear to me as a developer that the set of builtin types and "collections" types was chosen in order to cover a certain set of algorithmic complexities and not just arbitrary interfaces. Stefan -- http://mail.python.org/mailman/listinfo/python-list
Re: Reversing a List
On Wed, Sep 1, 2010 at 6:45 PM, Matt Saxton wrote:
> On Wed, 1 Sep 2010 09:00:03 -0400 Victor Subervi wrote:
> > Hi;
> > I have this code:
> >
> > cursor.execute('describe products;')
> > cols = [item[0] for item in cursor]
> > cols = cols.reverse()
> > cols.append('Delete')
> > cols = cols.reverse()
> >
> > Unfortunately, the list doesn't reverse. If I print cols after the first
> > reverse(), it prints None. Please advise.
>
> The reverse() method modifies the list in place, but returns None, so just use
> >>> cols.reverse()
> rather than
> >>> cols = cols.reverse()

Alternatively you can do:

>>> cols = reversed(cols)

(note that reversed() returns an iterator, not a list)

> > Also, is there a way to append to the front of the list directly?
> > TIA,
> > beno
>
> The insert() method can do this, i.e.
> >>> cols.insert(0, 'Delete')
>
> --
> Matt Saxton

-- ~l0nwlf -- http://mail.python.org/mailman/listinfo/python-list
Re: Reversing a List
On Wed, Sep 1, 2010 at 9:17 AM, Shashank Singh < shashank.sunny.si...@gmail.com> wrote: > reverse reverses in-place > > >>> l = [1, 2, 3] > >>> r = l.reverse() > >>> r is None > True > >>> l > [3, 2, 1] > >>> > Ah. Thanks! beno -- http://mail.python.org/mailman/listinfo/python-list
Re: Reversing a List
reverse reverses in-place >>> l = [1, 2, 3] >>> r = l.reverse() >>> r is None True >>> l [3, 2, 1] >>> On Wed, Sep 1, 2010 at 6:30 PM, Victor Subervi wrote: > Hi; > I have this code: > > cursor.execute('describe products;') > cols = [item[0] for item in cursor] > cols = cols.reverse() > cols.append('Delete') > cols = cols.reverse() > > Unfortunately, the list doesn't reverse. If I print cols after the first > reverse(), it prints None. Please advise. Also, is there a way to append to > the front of the list directly? > TIA, > beno > > -- > http://mail.python.org/mailman/listinfo/python-list > > -- Regards Shashank Singh Senior Undergraduate, Department of Computer Science and Engineering Indian Institute of Technology Bombay shashank.sunny.si...@gmail.com http://www.cse.iitb.ac.in/~shashanksingh -- http://mail.python.org/mailman/listinfo/python-list
Re: Reversing a List
On Wed, 1 Sep 2010 09:00:03 -0400 Victor Subervi wrote: > Hi; > I have this code: > > cursor.execute('describe products;') > cols = [item[0] for item in cursor] > cols = cols.reverse() > cols.append('Delete') > cols = cols.reverse() > > Unfortunately, the list doesn't reverse. If I print cols after the first > reverse(), it prints None. Please advise. The reverse() method modifies the list in place, but returns None, so just use >>> cols.reverse() rather than >>> cols = cols.reverse() > Also, is there a way to append to > the front of the list directly? > TIA, > beno The insert() method can do this, i.e. >>> cols.insert(0, 'Delete') -- Matt Saxton -- http://mail.python.org/mailman/listinfo/python-list
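Condensing the thread's advice into one runnable sketch:

```python
cols = ['a', 'b', 'c']

# list.reverse() reverses in place and returns None:
result = cols.reverse()
assert result is None and cols == ['c', 'b', 'a']

# reversed() returns an iterator over the opposite order,
# leaving the original list untouched:
cols2 = list(reversed(cols))
assert cols2 == ['a', 'b', 'c'] and cols == ['c', 'b', 'a']

# insert(0, ...) appends to the front directly:
cols.insert(0, 'Delete')
assert cols == ['Delete', 'c', 'b', 'a']
```

The trap in the original code is the `cols = cols.reverse()` assignment, which rebinds `cols` to the `None` return value.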
Re: Saving (unusual) linux filenames
In article , Grant Edwards wrote: >On 2010-08-31, MRAB wrote: >> On 31/08/2010 17:58, Grant Edwards wrote: >>> On 2010-08-31, MRAB wrote: On 31/08/2010 15:49, amfr...@web.de wrote: > Hi, > > i have a script that reads and writes linux paths in a file. I save the > path (as unicode) with 2 other variables. I save them separated by "," > and the "packets" by newlines. So my file looks like this: > path1, var1A, var1B > path2, var2A, var2B > path3, var3A, var3B > > > this works for "normal" paths but as soon as i have a path that does > include a "," it breaks. The problem now is that (afaik) linux allows > every char (aside from "/" and null) to be used in filenames. The only > solution i can think of is using null as a separator, but there has to be > a cleaner version ? You could use a tab character '\t' instead. >>> >>> That just breaks with a different set of filenames. >>> >> How many filenames contain control characters? > >How many filenames contain ","? Not many, but the OP wants his >program to be bulletproof. Can't fault him for that. As appending ",v" is the convention for rcs / cvs archives, I would say: a lot. Enough to guarantee that all my backup tar's contain at least a few. > >If I had a nickel for every Unix program or shell-script that failed >when a filename had a space in it I'd rather have it fail for spaces than for commas. > >> Surely that's a bad idea. > >Of course it's a bad idea. That doesn't stop people from doing it. > >-- >Grant Edwards grant.b.edwards at gmail.com Yow! Now I understand advanced MICROBIOLOGY and th' new TAX REFORM laws!! -- -- Albert van der Horst, UTRECHT,THE NETHERLANDS Economic growth -- being exponential -- ultimately falters. alb...@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst -- http://mail.python.org/mailman/listinfo/python-list
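For what it's worth, the standard-library csv module sidesteps the whole separator question by quoting: a quoted field may contain the delimiter, the quote character, and even newlines, so arbitrary path names survive the round trip. A sketch (path and variable names invented for illustration):

```python
import csv
import io

# Paths containing the separator itself, a quote, and a newline:
rows = [
    ("/home/user/odd,name", "var1A", "var1B"),
    ('/home/user/has"quote', "var2A", "var2B"),
    ("/home/user/line\nbreak", "var3A", "var3B"),
]

buf = io.StringIO()
csv.writer(buf).writerows(rows)   # quotes only the fields that need it
text = buf.getvalue()             # safe to write out to the data file

# Reading it back restores the exact paths:
restored = [tuple(r) for r in csv.reader(io.StringIO(text))]
assert restored == rows
```

The same would work against a real file object; only NUL bytes, which POSIX filenames cannot contain anyway at the syscall level but csv cannot carry, remain off the table.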
Re: [Pickle]dirty problem 3 lines
it's just as it seems : i want to know how it works to get back an object from a string in python : pickle.loads("""b'\x80\x03]q\x00(K\x00K\x01e.'""") #doesn't work Google Fan boy On Wed, Sep 1, 2010 at 5:23 AM, MRAB wrote: > On 01/09/2010 03:33, bussiere bussiere wrote: >> >> i know it's dirty, i know i should use json but i want to know, it's >> quite late here : >> import pickle >> dump = """b'\x80\x03]q\x00(K\x00K\x01e.'""" >> print(pickle.loads(dump)) >> >> how can i get back my object from this string ? >> the string is : b'\x80\x03]q\x00(K\x00K\x01e.' >> and i am using python3 >> help will be appreciated i am chewing on this for a long time now. > > Well, pickle.loads(b'\x80\x03]q\x00(K\x00K\x01e.') works. > > That, of course, is not the same as """b'\x80\x03]q\x00(K\x00K\x01e.'""". > > Do you mean r"""b'\x80\x03]q\x00(K\x00K\x01e.'"""? > > (It's also late here, well, actually, so late it's early... Time to > sleep. :-)) > -- > http://mail.python.org/mailman/listinfo/python-list > -- http://mail.python.org/mailman/listinfo/python-list
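For the record, the confusion here is between a bytes object and a str holding that object's repr. A sketch of both the direct round trip and one way to recover the bytes when all you have is the repr text:

```python
import ast
import pickle

# The bytes from the post unpickle directly, as MRAB notes:
assert pickle.loads(b'\x80\x03]q\x00(K\x00K\x01e.') == [0, 1]

# The usual round trip produces and consumes bytes, never str:
obj = [0, 1]
blob = pickle.dumps(obj)
assert pickle.loads(blob) == obj

# If the pickle was stored as the *repr* of the bytes in a text file
# (i.e. a str that begins with the characters b and '), evaluate the
# literal back into a bytes object first:
text = repr(blob)
assert pickle.loads(ast.literal_eval(text)) == obj
```

Passing the triple-quoted str itself to `pickle.loads()` fails because the `b'...'` prefix and quotes are literal characters of the string, not Python syntax.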
Reversing a List
Hi; I have this code: cursor.execute('describe products;') cols = [item[0] for item in cursor] cols = cols.reverse() cols.append('Delete') cols = cols.reverse() Unfortunately, the list doesn't reverse. If I print cols after the first reverse(), it prints None. Please advise. Also, is there a way to append to the front of the list directly? TIA, beno -- http://mail.python.org/mailman/listinfo/python-list
YAMI4 v. 1.1.0 - messaging solution for distributed systems
I am pleased to announce that the new version of YAMI4, 1.1.0, has been just released and is available for download. http://www.inspirel.com/yami4/ This new version extends the coverage of supported programming languages with a completely new Python3 module, which allows full integration of built-in dictionary objects as message payloads. Thanks to this level of language integration, the API is very easy to learn and natural in use. Please check code examples in the src/python/examples directory to see complete client-server systems. The users of other programming languages will also benefit from the ability to transmit raw binary messages, which in addition to support high-performance scenarios can be used as a hook for custom serialization routines. The API of the whole library was also extended a bit to allow better control of automatic reconnection and to ensure low jitter in communication involving many receivers even in case of partial system failure. Last but not least, a number of fixes and improvements have been introduced - please see the changelog.txt file, which is part of the whole package, for a detailed description of all improvements. -- Maciej Sobczak * http://www.inspirel.com -- http://mail.python.org/mailman/listinfo/python-list
fairly urgent request: paid python (or other) work required
i apologise for having to contact so many people but this is fairly urgent, and i'm running out of time and options. i'm a free software programmer, and i need some paid work - preferably python - fairly urgently, so that i can pay for food and keep paying rent, and so that my family doesn't get deported or have to leave the country. i really would not be doing this unless it was absolutely, absolutely essential that i get money. so that both i and the list are not unnecessarily spammed, please don't reply with recommendations of "where to get jobs", unless they are guaranteed to result in immediate work and money. if you have need of a highly skilled and experienced python-preferring free-software-preferring software engineer, please simply contact me, and tell me what you need doing: there's no need for you to read the rest of this message. so that people are not offended by me asking on such a high-volume list for work, here are some questions and answers: Q: who are you? A: luke leighton. free software developer, free software project leader, and "unusual cross-project mash-up-er" (meaning: i spot the value of joining one or more bits of disparate "stuff" to make something that's more powerful than its components). Q: where's your CV? A: executive version of CV is at http://lkcl.net/exec_cv.txt - please don't ask for a proprietary microsoft word version, as a refusal and referral to the "sylvester response" often offends. Q: what can you do? A: python programming, c programming, web development, networking, cryptography, reverse-engineering, IT security, etc. etc. preferably involving free software. Q: what do you need? A: money to pay rent and food. at the ABSOLUTE MINIMUM, i need as little as £1500 per month to pay everything, and have been earning approx £800 per month for the past year. 
a £5000 inheritance last year which i was not expecting has delayed eviction and bankruptcy for me and my family, and deportation for my partner and 17 month old daughter (marie is here in the UK on a FLR/M visa) Q: why are you asking here? A: because it's urgent that i get money really really soon; my family members are refusing to assist, and the few friends that i have do not have any spare money to lend. Q: why here and not "monster jobs" or "python-jobs list" or the various "recruitment agencies"? A: those are full-time employment positions, which i have been frequently applying for and get rejected for various reasons, and i'm running out of time and money. further interviews cost money, and do not result in guaranteed work. i need work - and money - _now_. Q: why here and not "peopleperhour.com"? A: if you've ever bid on peopleperhour.com you will know that you are bidding against "offshore" contractors and even being undercut by 1st world country bidders who, insanely, appear to be happy to do work for as little as £2 / hour. Q: why are you getting rejected from interviews? A: that's complex. a) i simply don't interview well. people with the classic symptoms of asperger's just don't. b) my daughter is 17 months old. when i go away for as little as 3 days, which i've done three times now, she is extremely upset both when i am away and when i return. i think what would happen if i was doing some sort of full-time job, away from home, and... i can't do it. subconsciously that affects how i react when speaking to interviewers. Q: why do you not go "get a job at tesco's" or "drive a truck"? A: tescos and HGV driving etc. pay around £12 per hour. £12 per hour after tax comes down to about £8 to £9 per hour. £9 per hour requires 35 hours per week to earn as little as £1500. 
35 hours per week is effectively full-time, and means that a) my programming and software engineering skills are utterly, utterly wasted b) my daughter gets extremely upset because i won't be at home. so you get the gist, and thank you for putting up with me needing to take this action. l. -- http://mail.python.org/mailman/listinfo/python-list
Re: Fibonacci: How to think recursively
In article , Mel wrote: >Baba wrote: > >> Level: beginner >> >> I would like to know how to approach the following Fibonacci problem: >> How many rabbits do i have after n months? >> >> I'm not looking for the code as i could Google that very easily. I'm >> looking for a hint to put me on the right track to solve this myself >> without looking it up. >> >> my brainstorming so far brought me to a standstill as i can't seem to >> imagine a recursive way to code this: >> >> my attempted rough code: >> >> def fibonacci(n): >> # base case: >> result = fibonacci (n-1) + fibonacci (n-2) this will end up in a mess as it will create overlapping recursions > >I don't think this is the base case. The base case would be one or more >values of `n` that you already know the fibonacci number for. Your >recursive function can just test for those and return the right answer right >away. The expression you've coded contains a good way to handle the >non-base cases. There's no such problem as "overlapping recursions". [Didn't you mean: I don't understand what you mean by overlapping recursions? You're right about the base case, so clearly the OP uses some confusing terminology.] I see a problem with overlapping recursions. Unless automatic memoizing is on, they are unduly inefficient, as each call splits into two calls. If one insists on recursion (untested code, just for the idea): def fib2( n ): ' return #rabbits last year, #rabbits before last ' if n == 1: return (1,1) else: penult, ult = fib2( n-1 ) return ( ult, ult+penult) def fub( n ): return fib2(n)[1] Try fib and fub for largish numbers (>1000) and you'll feel the problem. > > Mel. > Groetjes Albert -- -- Albert van der Horst, UTRECHT,THE NETHERLANDS Economic growth -- being exponential -- ultimately falters. alb...@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst -- http://mail.python.org/mailman/listinfo/python-list
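Albert's pair-returning recursion, written out runnably; each call recurses only once, so the cost is linear in n rather than exponential:

```python
def fib2(n):
    """Return (#rabbits the month before, #rabbits this month)."""
    if n == 1:
        return (1, 1)
    penult, ult = fib2(n - 1)
    return (ult, ult + penult)

def fib(n):
    return fib2(n)[1]

# With fib(1) == 1 and fib(2) == 2, the sequence runs 1, 2, 3, 5, 8, ...
assert [fib(n) for n in range(1, 8)] == [1, 2, 3, 5, 8, 13, 21]
```

The naive two-branch version recomputes the same subproblems over and over; this one threads the last two values through a single recursive chain instead.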
Re: Performance: sets vs dicts.
On 09/01/10 00:09, Aahz wrote: > In article , > Jerry Hill wrote: >> On Mon, Aug 30, 2010 at 7:42 PM, Aahz wrote: >>> >>> Possibly; IMO, people should not need to run timeit to determine basic >>> algorithmic speed for standard Python datatypes. >> >> http://wiki.python.org/moin/TimeComplexity takes a stab at it. IIRC, >> last time this came up, there was some resistance to making promises >> about time complexity in the official docs, since that would make >> those numbers part of the language, and thus binding on other >> implementations. > > I'm thoroughly aware of that page and updated it yesterday to make it > easier to find. ;-) > > However, I think there are some rock-bottom basic guarantees we can make > regardless of implementation. Does anyone seriously think that an > implementation would be accepted that had anything other than O(1) for > index access into tuples and lists? Dicts that were not O(1) for access > with non-pathological hashing? That we would accept sets having O() > performance worse than dicts? > > I suggest that we should agree on these guarantees and document them in > the core. While I think documenting them would be great for all programmers that care about practical and theoretical execution speed; I think including these implementation details in core documentation as a "guarantee" would be a bad idea for the reasons Terry outlined. One way of resolving that is by having two documentations (or two separate sections in the documentation) for: - Python -- the language -- documenting Python as an abstract language, this is the documentation which can be shared across all Python implementations. This will also be the specification for Python Language which other implementations will be measured to. - CPython -- the Python interpreter -- documents implementation details and performance metrics. It should be properly noted that these are not part of the language per se. 
This will be the playground for CPython experts that need to fine tune their applications to the last drop of blood and don't mind their application going nuts with the next release of CPython. -- http://mail.python.org/mailman/listinfo/python-list
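The difference such guarantees describe is easy to observe without any documentation at all, e.g. for membership testing, where a list scan is O(n) but a set probe is amortized O(1):

```python
import timeit

# Worst case for the list: the sought element is at the very end.
setup = "xs = list(range(100000)); s = set(xs)"
t_list = timeit.timeit("99999 in xs", setup=setup, number=200)
t_set = timeit.timeit("99999 in s", setup=setup, number=200)

# The list lookup walks all 100000 items; the set lookup is a hash
# probe, so it wins by orders of magnitude:
assert t_set < t_list
```

A guarantee in the core docs would make it explicit that this gap is something programmers may rely on across implementations, which is exactly the point under debate.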
Re: Optimising literals away
On 09/01/10 17:06, Stefan Behnel wrote: > MRAB, 31.08.2010 23:53: >> On 31/08/2010 21:18, Terry Reedy wrote: >>> On 8/31/2010 12:33 PM, Aleksey wrote: On Aug 30, 10:38 pm, Tobias Weber wrote: > Hi, > whenever I type an "object literal" I'm unsure what optimisation > will do > to it. >>> >>> Optimizations are generally implentation dependent. CPython currently >>> creates numbers, strings, and tuple literals just once. Mutable literals >>> must be created each time as they may be bound and saved. >>> > def m(arg): > if arg& set([1,2,3]): >>> >>> set() is a function call, not a literal. When m is called, who knows >>> what 'set' will be bound to? In Py3, at least, you could write {1,2,3}, >>> which is much faster as it avoids creating and deleting a list. On my >>> machine, .35 versus .88 usec. Even then, it must be calculated each time >>> because sets are mutable and could be returned to the calling code. >>> >> There's still the possibility of some optimisation. If the resulting >> set is never stored anywhere (bound to a name, for example) then it >> could be created once. When the expression is evaluated there could be >> a check so see whether 'set' is bound to the built-in class, and, if it >> is, then just use the pre-created set. What if the set is mutated by the function? That will modify the global cache of the set; one way to prevent mutation is to use frozenset, but from the back of my mind, I think there was a discussion that rejects set literals producing a frozen set instead of regular set. > Cython applies this kind of optimistic optimisation in a couple of other > cases and I can affirm that it often makes sense to do that. However, > drawback here: the set takes up space while not being used (not a huge > problem if literals are expected to be small), and the global lookup of > "set" still has to be done to determine if it *is* the builtin set type. > After that, however, the savings should be considerable. 
> > Another possibility: always cache the set and create a copy on access. > Copying a set avoids the entire eval loop overhead and runs in a C loop > instead, using cached item instances with (most likely) cached hash > values. So even that will most likely be much faster than the > spelled-out code above. I think that these kind of optimizations would probably be out-of-character for CPython, which values implementation simplicity above small increase in speed. Sets are not that much used and optimizing this particular case seems to be prone to create many subtle issues (especially with multithreading). In other word, these optimizations makes sense for Python implementations that are heavily geared for speed (e.g. Unladen Swallow, Stackless Python, PyPy[1], Cython); but probably may only have a minuscule chance of being implemented in CPython. [1] yes, their goal was to be faster than CPython (and faster than the speed of photon in vacuum), though AFAICT they have yet to succeed. -- http://mail.python.org/mailman/listinfo/python-list
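As a data point, CPython already applies exactly this kind of optimisation in one narrow case: a set literal on the right-hand side of an `in` test can neither escape nor be mutated, so the compiler folds it into a cached frozenset constant (an implementation detail, not a language guarantee):

```python
def m(arg):
    return arg in {1, 2, 3}

# The literal was folded into a constant at compile time;
# dis.dis(m) would show a LOAD_CONST of frozenset({1, 2, 3}).
assert any(isinstance(c, frozenset) for c in m.__code__.co_consts)
assert m(2) and not m(4)
```

When the set is used in any way that could let it leak out of the expression, as in the `arg & {1, 2, 3}` example above, CPython falls back to building it on every call, which is the case being discussed.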
Re: Newby Needs Help with Python code
Hi Esther, On Wed, Sep 1, 2010 at 13:29, Nally Kaunda-Bukenya wrote: > #THE PROGRAM: > import arcgisscripting > gp=arcgisscripting.create() > gp.Workspace = "C:\\NPDES\\NPDES_PYTHON.mdb" > fc = "Outfalls_ND" > > try: > # Set the field to create a list of unique values > fieldname = "OUTFALL_ID" > > # Open a Search Cursor to identify all unique values > cur = gp.UpdateCursor(fc) > row = cur.Next() > > # Set a list variable to hold all unique values > L = [] > > # Using a while loop, cursor through all records and append unique > #values to the list variable > while row <> None: > value = row.GetValue(fieldname) > if value not in L: > L.append(value) > row = cur.Next() > row.SetValue(Tot_Outf_Area, sum(row.AREA_ACRES)) #total area of > each outfall=sum of all area 4 each unique outfallid > cur.UpdateRow(row) #to commit changes > row=cur.Next() > print row.Tot_Outf_Area > # Sort the list variable > L.sort() > > # If a value in the list variable is blank, remove it from the list > variable > #to filter out diffuse outfalls > if ' ' in L: > L.remove(' ') > > except: > # If an error occurred while running a tool, print the messages > print gp.GetMessages() Have you tried running this code? I suspect it won't work at all -- and because you are catching all possible exceptions in your try...except, you won't even know why. Here are the things that I'd suggest, just from glancing over the code: - Remove the try...except for now. Getting an exception, and understanding why it occurred and how best to deal with it, is IMHO very helpful when prototyping and debugging. - Take another look at your while loop. I don't know ArcGIS, so I don't know if the UpdateCursor object supports the iterator protocol, but the normal Python way of looping through all rows would be a for loop: for row in cur: # code For example, you are calling cur.Next() twice inside the loop -- is that what you want? Hope that helps, Rami > > > > #Please Help!!! 
> > #Esther > > > > -- > http://mail.python.org/mailman/listinfo/python-list > > -- Rami Chowdhury "Never assume malice when stupidity will suffice." -- Hanlon's Razor 408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD) -- http://mail.python.org/mailman/listinfo/python-list
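Rami's `for row in cur` suggestion, sketched with a stand-in cursor class (the real `gp.UpdateCursor` API may well differ; `FakeCursor` here is purely illustrative of the iterator protocol replacing the manual `Next()`/`None` checks):

```python
class FakeCursor:
    """Stand-in for an UpdateCursor: yields rows, then stops."""
    def __init__(self, rows):
        self._rows = iter(rows)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._rows)   # raises StopIteration when exhausted

cur = FakeCursor([{"OUTFALL_ID": "ALD03001"}, {"OUTFALL_ID": "ALD03002"}])

seen = []
for row in cur:                   # no cur.Next() calls, no None sentinel
    seen.append(row["OUTFALL_ID"])

assert seen == ["ALD03001", "ALD03002"]
```

If the ArcGIS cursor does not support iteration, the equivalent explicit loop is `row = cur.Next()` followed by `while row is not None: ...; row = cur.Next()`, with exactly one advance per pass, which is where the original code goes wrong by calling `Next()` twice in some branches.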
Newby Needs Help with Python code
Dear Python experts, I hope someone can help me. I am new to Python and trying to achive the following: 1) I would like to populate the Tot_Ouf_Area field with total area of each unique outfall_id (code attempted below,but Tot_Ouf_Area not populating) 2) I would also like to get the user input of Rv ( each landuse type will have a specific Rv value). For example the program should ask the user for Rv value of Low Density Residential (user enters 0.4 in example below and that value must be stored in the Rv field), and so on as shown in the 2nd table below… Below is my original table (comma-delimited) "OBJECTID","OUTFALL_ID","LANDUSE","AREA_ACRES","Rv","Tot_Outf_Area" 16,"ALD06001","High Density Residential",6.860922,0.00,0.00 15,"ALD06001","General Commercial",7.520816,0.00,0.00 14,"ALD05002","Low Density Residential",7.255491,0.00,0.00 13,"ALD05002","Forest",37.090473,0.00,0.00 12,"ALD05001","Low Density Residential",16.904560,0.00,0.00 11,"ALD05001","Forest",84.971686,0.00,0.00 10,"ALD04002","Urban Open",1.478677,0.00,0.00 9,"ALD04002","Transportation",0.491887,0.00,0.00 8,"ALD04002","Low Density Residential",25.259720,0.00,0.00 7,"ALD04002","Forest",0.355659,0.00,0.00 6,"ALD04001","Recreational",0.013240,0.00,0.00 5,"ALD04001","Low Density Residential",34.440130,0.00,0.00 4,"ALD04001","Forest",10.229973,0.00,0.00 3,"ALD03002","Low Density Residential",23.191538,0.00,0.00 2,"ALD03002","Forest",1.853920,0.00,0.00 1,"ALD03001","Low Density Residential",6.828130,0.00,0.00 21,"ALD06001","Water.dgn",0.013951,0.00,0.00 20,"ALD06001","Urban Open",10.382900,0.00,0.00 19,"ALD06001","Transportation",2.064454,0.00,0.00 18,"ALD06001","Recreational",0.011007,0.00,0.00 17,"ALD06001","Low Density Residential",0.752509,0.00,0.00 Below is my desired output table (comma delimited): "OBJECTID","OUTFALL_ID","LANDUSE","AREA_ACRES","Rv","Tot_Outf_Area" 16,"ALD06001","High Density Residential",6.860922,0.00,27.606562 15,"ALD06001","General Commercial",7.520816,0.00,27.606562 
14,"ALD05002","Low Density Residential",7.255491,0.40,44.345966 13,"ALD05002","Forest",37.090473,0.30,44.345966 11,"ALD05001","Forest",84.971686,0.30,101.876247 12,"ALD05001","Low Density Residential",16.904560,0.40,101.876247 10,"ALD04002","Urban Open",1.478677,0.00,27.585945 9,"ALD04002","Transportation",0.491887,0.00,27.585945 8,"ALD04002","Low Density Residential",25.259720,0.40,27.585945 7,"ALD04002","Forest",0.355659,0.30,27.585945 6,"ALD04001","Recreational",0.013240,0.00,44.683345 5,"ALD04001","Low Density Residential",34.440130,0.40,44.683345 4,"ALD04001","Forest",10.229973,0.30,44.683345 3,"ALD03002","Low Density Residential",23.191538,0.40,25.045460 2,"ALD03002","Forest",1.853920,0.30,25.045460 1,"ALD03001","Low Density Residential",6.828130,0.40,6.828130 21,"ALD06001","Water.dgn",0.013951,0.00,27.606562 20,"ALD06001","Urban Open",10.382900,0.00,27.606562 19,"ALD06001","Transportation",2.064454,0.00,27.606562 18,"ALD06001","Recreational",0.011007,0.00,27.606562 17,"ALD06001","Low Density Residential",0.752509,0.40,27.606562 Below is my code so far for updating rows with total area (Tot_Ouf_Area): #THE PROGRAM: import arcgisscripting gp=arcgisscripting.create() gp.Workspace = "C:\\NPDES\\NPDES_PYTHON.mdb" fc = "Outfalls_ND" try: # Set the field to create a list of unique values fieldname = "OUTFALL_ID" # Open a Search Cursor to identify all unique values cur = gp.UpdateCursor(fc) row = cur.Next() # Set a list variable to hold all unique values L = [] # Using a while loop, cursor through all records and append unique #values to the list variable while row <> None: value = row.GetValue(fieldname) if value not in L: L.append(value) row = cur.Next() row.SetValue(Tot_Outf_Area, sum(row.AREA_ACRES)) #total area of each outfall=sum of all area 4 each unique outfallid cur.UpdateRow(row) #to commit changes row=cur.Next() print row.Tot_Outf_Area # Sort the list variable L.sort() # If a value in the list variable is blank, remove it from the list variable #to filter 
out diffuse outfalls if ' ' in L: L.remove(' ') except: # If an error occurred while running a tool, print the messages print gp.GetMessages() #Please Help!!! #Esther -- http://mail.python.org/mailman/listinfo/python-list
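Setting ArcGIS aside, the Tot_Outf_Area computation itself is a plain group-and-sum that can be prototyped on the comma-delimited table with just the standard library (column names taken from the post; a three-row excerpt stands in for the full data):

```python
import csv
import io
from collections import defaultdict

table = """\
OBJECTID,OUTFALL_ID,LANDUSE,AREA_ACRES
3,ALD03002,Low Density Residential,23.191538
2,ALD03002,Forest,1.853920
1,ALD03001,Low Density Residential,6.828130
"""

rows = list(csv.DictReader(io.StringIO(table)))

# First pass: sum AREA_ACRES per unique OUTFALL_ID.
totals = defaultdict(float)
for row in rows:
    totals[row["OUTFALL_ID"]] += float(row["AREA_ACRES"])

# Second pass: write the group total back onto every row.
for row in rows:
    row["Tot_Outf_Area"] = round(totals[row["OUTFALL_ID"]], 6)

assert abs(rows[0]["Tot_Outf_Area"] - 25.045458) < 1e-6
assert abs(rows[2]["Tot_Outf_Area"] - 6.828130) < 1e-6
```

Once this logic is settled, the two passes map onto two cursor traversals (a read to accumulate `totals`, then an update writing `Tot_Outf_Area` back), and the Rv prompt per landuse type can be handled the same way with a `{landuse: rv}` dict filled from user input.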
Re: Queue cleanup
Lawrence D'Oliveiro writes: >> Refcounting is susceptable to the same pauses for reasons already >> discussed. > > Doesn’t seem to happen in the real world, though. def f(n): from time import time a = [1] * n t0 = time() del a t1 = time() return t1 - t0 for i in range(9): print i, f(10**i) on my system prints: 0 2.86102294922e-06 1 2.14576721191e-06 2 3.09944152832e-06 3 1.00135803223e-05 4 0.000104904174805 5 0.00098991394043 6 0.00413608551025 7 0.037693977356 8 0.362598896027 Looks pretty linear as n gets large. 0.36 seconds (the last line) is a noticable pause. -- http://mail.python.org/mailman/listinfo/python-list
Re: Optimising literals away
MRAB, 31.08.2010 23:53: On 31/08/2010 21:18, Terry Reedy wrote: On 8/31/2010 12:33 PM, Aleksey wrote: On Aug 30, 10:38 pm, Tobias Weber wrote: Hi, whenever I type an "object literal" I'm unsure what optimisation will do to it. Optimizations are generally implentation dependent. CPython currently creates numbers, strings, and tuple literals just once. Mutable literals must be created each time as they may be bound and saved. def m(arg): if arg& set([1,2,3]): set() is a function call, not a literal. When m is called, who knows what 'set' will be bound to? In Py3, at least, you could write {1,2,3}, which is much faster as it avoids creating and deleting a list. On my machine, .35 versus .88 usec. Even then, it must be calculated each time because sets are mutable and could be returned to the calling code. There's still the possibility of some optimisation. If the resulting set is never stored anywhere (bound to a name, for example) then it could be created once. When the expression is evaluated there could be a check so see whether 'set' is bound to the built-in class, and, if it is, then just use the pre-created set. Cython applies this kind of optimistic optimisation in a couple of other cases and I can affirm that it often makes sense to do that. However, drawback here: the set takes up space while not being used (not a huge problem if literals are expected to be small), and the global lookup of "set" still has to be done to determine if it *is* the builtin set type. After that, however, the savings should be considerable. Another possibility: always cache the set and create a copy on access. Copying a set avoids the entire eval loop overhead and runs in a C loop instead, using cached item instances with (most likely) cached hash values. So even that will most likely be much faster than the spelled-out code above. Stefan -- http://mail.python.org/mailman/listinfo/python-list
Re: Queue cleanup
Lawrence D'Oliveiro writes: > Whereas garbage collection will happen at some indeterminate time long after > the last access to the object, when it very likely will no longer be in the > cache, and have to be brought back in just to be freed, GC's for large systems generally don't free (or even examine) individual garbage objects. They copy the live objects to a new contiguous heap without ever touching the garbage, and then they release the old heap. That has the effect of improving locality, since the new heap is compacted and has no dead objects. The algorithms are generational (they do frequent gc's on recently-created objects and less frequent ones on older objects), so "minor" gc operations are on regions that fit in cache, while "major" ones might have cache misses but are infrequent. Non-compacting reference counting (or simple mark/sweep gc) has much worse fragmentation and consequently worse cache locality than copying-style gc. -- http://mail.python.org/mailman/listinfo/python-list