Re: Selecting k smallest or largest elements from a large list in python; (benchmarking)
Dmitry Chichkov writes:

> Given: a large list (10,000,000) of floating point numbers;
> Task: fastest python code that finds k (small, e.g. 10) smallest
> items, preferably with item indexes;
> Limitations: in python, using only standard libraries (numpy & scipy
> is Ok);
>
> I've tried several methods. With N = 10,000,000, K = 10 The fastest so
> far (without item indexes) was pure python implementation
> nsmallest_slott_bisect (using bisect/insert). And with indexes
> nargsmallest_numpy_argmin (argmin() in the numpy array k times).
>
> Anyone up to the challenge beating my code with some clever selection
> algorithm?
>
> Current Table:
> 1.66864395142 mins_heapq(items, n):
> 0.946580886841 nsmallest_slott_bisect(items, n):
> 1.38014793396 nargsmallest(items, n):
> 10.0732769966 sorted(items)[:n]:
> 3.17916202545 nargsmallest_numpy_argsort(items, n):
> 1.31794500351 nargsmallest_numpy_argmin(items, n):
> 2.37499308586 nargsmallest_numpy_array_argsort(items, n):
> 0.524670124054 nargsmallest_numpy_array_argmin(items, n):
>
> 0.0525538921356 numpy argmin(items): 1892997
> 0.364673852921 min(items): 10.026786

I think without numpy, nsmallest_slott_bisect is almost optimal.
There is a slight improvement:

1.3386270 nsmallest_slott_bisect(items, n): [10.11643188717, 10.17791492528]
0.883894920349 nsmallest_slott_bisect2(items, n): [10.11643188717, 10.17791492528]

code:

from bisect import insort
from itertools import islice

def nsmallest_slott_bisect(iterable, n, insort=insort):
    it = iter(iterable)
    mins = sorted(islice(it, n))
    for el in it:
        if el <= mins[-1]:  # NOTE: equal sign is to preserve duplicates
            insort(mins, el)
            mins.pop()
    return mins

def nsmallest_slott_bisect2(iterable, n, insort=insort):
    it = iter(iterable)
    mins = sorted(islice(it, n))
    maxmin = mins[-1]
    for el in it:
        if el <= maxmin:  # NOTE: equal sign is to preserve duplicates
            insort(mins, el)
            mins.pop()
            maxmin = mins[-1]
    return mins

import time
from random import randint, random

test_data = [randint(10, 50) + random() for i in range(1000)]
K = 10

init = time.time()
mins = nsmallest_slott_bisect(test_data, K)
print time.time() - init, 'nsmallest_slott_bisect(items, n):', mins[:2]

init = time.time()
mins = nsmallest_slott_bisect2(test_data, K)
print time.time() - init, 'nsmallest_slott_bisect2(items, n):', mins[:2]

--
Arnaud
--
http://mail.python.org/mailman/listinfo/python-list
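The original post also asks for item indexes, which none of the bisect variants return. The standard library alone can produce them by feeding heapq.nsmallest (index, value) pairs keyed on the value; a small sketch (the function name nargsmallest_heapq is my own, not from the thread):

```python
import heapq
from operator import itemgetter

def nargsmallest_heapq(iterable, n):
    # Pair each value with its index, then let heapq keep the n
    # smallest pairs ordered by value (ascending).
    return heapq.nsmallest(n, enumerate(iterable), key=itemgetter(1))

data = [5.0, 1.5, 3.25, 0.75, 2.0]
print(nargsmallest_heapq(data, 2))  # [(3, 0.75), (1, 1.5)]
```

This keeps a bounded heap of n pairs, so memory stays O(n) even for the 10,000,000-element input; whether it beats the cached-maxmin bisect version on wall-clock time would need benchmarking.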
Re: PyPy and RPython
On 9/1/2010 10:49 AM, sarvi wrote:
> Is there a plan to adopt PyPy and RPython under the python foundation
> in an attempt to standardize both.
>
> I have been watching PyPy and RPython evolve over the years.
>
> PyPy seems to have momentum and is rapidly gaining followers and
> performance. PyPy JIT and performance would be a good thing for the
> Python Community. And it seems to be well ahead of Unladen Swallow in
> performance and in a position to improve quite a bit.
>
> Secondly I have always fantasized of never having to write C code yet
> get its compiled performance. With RPython (a strict subset of Python),
> I can actually compile it to C/Machine code.
>
> These 2 seem like spectacular advantages for Python to pick up on. And
> all this by just showing the PyPy and the Python foundation's support
> and direction to adopt them.
>
> Yet I see this forum relatively quiet on PyPy or Rpython? Any
> reasons???
>
> Sarvi

The winner on performance, by a huge margin, is Shed Skin, the optimizing type-inferring compiler for a restricted subset of Python. PyPy and Unladen Swallow have run into the problem that if you want to keep some of the less useful dynamic semantics of Python, the heavy-duty optimizations become extremely difficult.

However, if we defined a High Performance Python language, with some restrictions, the problem becomes much easier. The necessary restrictions are roughly this:

-- Functions, once defined, cannot be redefined. (Inlining and redefinition do not play well together.)

-- Variables are implicitly typed for the base types: integer, float, bool, and everything else. The compiler figures this out automatically. (Shed Skin does this now.)

-- Unless a class uses a "setattr" function or has a __setattr__ method, its entire list of attributes is known at compile time. (In other words, you can't patch in new attributes from outside the class unless the class indicates it supports that. You can subclass, of course.)
-- Mutable objects (other than some form of synchronized object) cannot be shared between threads. This is the key step in getting rid of the Global Interpreter Lock.

-- "eval" must be restricted to the form that has a list of the variables it can access.

-- Import after startup probably won't work.

Those are the essential restrictions. With those, Python could go 20x to 60x faster than CPython. The failures of PyPy and Unladen Swallow to get any significant performance gains over CPython demonstrate the futility of trying to make the current language go fast.

Reference counts aren't a huge issue. With some static analysis, most reference count updates can be optimized out. (As for how this is done, the key issue is to determine whether each function "keeps" a reference to each parameter. For any function which does not, that parameter doesn't have to have reference count updates within the function. Most math library functions have this property. You do have to analyze the entire program globally, though.)

John Nagle
--
http://mail.python.org/mailman/listinfo/python-list
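The fixed-attribute-list restriction described above is something CPython programmers can already opt into today with __slots__, which pins a class's attribute set at class-creation time; a minimal sketch (the Point class is my own illustration, not from the post):

```python
class Point(object):
    # __slots__ fixes the attribute list at class-creation time;
    # instances get no per-instance __dict__, and attempts to patch
    # in new attributes from outside the class are rejected.
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)
try:
    p.z = 3.0  # not declared in __slots__
except AttributeError as e:
    print("rejected:", e)
```

This gives a compiler (or a reader) exactly the guarantee the post asks for: the complete set of attributes is known before any instance exists.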
Re: PyPy and RPython
sarvi, 02.09.2010 07:06:
> Look at all the alternatives we have. Cython? Shedskin? I'll take PyPy
> anyday instead of them

Feel free to do so, but don't forget that the choice of a language always depends on the specific requirements at hand. Cython has proven its applicability in a couple of large projects, for example. And it has a lot more third party libraries available than both PyPy and Shedskin together: all Python libraries, pure Python and CPython binary extensions, as well as tons of code written in Cython, C, C++, Fortran, and then some. And you don't have to give up one bit of CPython compatibility to use all of that. That alone counts as a pretty huge advantage to some people.

Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Re: importing excel data into a python matrix?
On Sep 1, 7:45 pm, Chris Rebert wrote:
> On Wed, Sep 1, 2010 at 4:35 PM, patrick mcnameeking wrote:
> > I'm working on a project where I have been given
> > a 1000 by 1000 cell excel spreadsheet and I would
> > like to be able to access the data using Python.
> > Does anyone know of a way that I can do this?
>
> "xlrd 0.7.1 - Library for developers to extract data from Microsoft
> Excel (tm) spreadsheet files": http://pypi.python.org/pypi/xlrd

While I heartily recommend xlrd, it only works with "traditional" Excel files (extension .xls, not .xlsx). If the data really is 1000 columns wide, it must be in the new (Excel 2007 or later) format, because the old format only supported up to 256 columns.

The most promising-looking Python package to handle .xlsx files is openpyxl. There are also a couple of older .xlsx readers (openpyxl can write as well). I have not tried any of these.

John
--
http://mail.python.org/mailman/listinfo/python-list
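The split John describes, xlrd for the old format and openpyxl for the new one, can be captured in a small dispatch helper; a sketch (pick_excel_reader is my own name, and the mapping simply encodes the rule above rather than any library API):

```python
import os

def pick_excel_reader(path):
    # .xls is the pre-2007 binary (BIFF) format, which xlrd reads;
    # .xlsx is the Excel 2007+ Office Open XML format, openpyxl territory.
    ext = os.path.splitext(path)[1].lower()
    if ext == '.xls':
        return 'xlrd'
    if ext == '.xlsx':
        return 'openpyxl'
    raise ValueError('not a recognized Excel extension: %r' % path)

print(pick_excel_reader('budget.xls'))   # xlrd
print(pick_excel_reader('budget.xlsx'))  # openpyxl
```

Deferring the actual imports to whichever branch is taken means a script keeps working on machines where only one of the two libraries is installed.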
Re: Queue cleanup
On 8/30/2010 12:22 AM, Paul Rubin wrote:
> I guess that is how the so-called smart pointers in the Boost C++
> template library work. I haven't used them so I don't have personal
> experience with how convenient or reliable they are, or what kinds of
> constraints they imposed on programming style. I've always felt a bit
> suspicious of them though, and I seem to remember Alex Martelli (I hope
> he shows up here again someday) advising against using them.

"Smart pointers" in C++ have never quite worked right. They almost work. But there always seems to be something that needs access to a raw C pointer, which breaks the abstraction. The mold keeps creeping through the wallpaper.

Also, since they are a bolt-on at the macro level in C++, reference count updates aren't optimized and hoisted out of loops. (They aren't in CPython either, but there have been reference counted systems that optimize out most reference count updates.)

John Nagle
--
http://mail.python.org/mailman/listinfo/python-list
Re: dirty problem 3 lines
bussiere bussiere wrote:
> it's just as it seems :
> i want to know how does ti works to get back an object from a string in
> python :
> pickle.loads("""b'\x80\x03]q\x00(K\x00K\x01e.'""") #doesn't work

Repeating the question without providing any further information doesn't really help.

This is a byte string:

    b'\x80\x03]q\x00(K\x00K\x01e.'

As MRAB points out, you can unpickle a byte string directly.

This is a triple-quoted string:

    """note the triplet of double quotes"""

What you have is a triple-quoted string that appears to contain a byte string:

    """b'\x80\x03]q\x00(K\x00K\x01e.'"""

So the question for you is: what is putting the byte string inside of a triple-quoted string? If you can stop that from happening, then you'll have a byte string you can directly unpickle.

Now, if you _don't_ have control over whatever is handing you the dumped string, then you can just use string manipulation to reproduce the byte string:

>>> dump = """b'\x80\x03]q\x00(K\x00K\x01e.'"""
>>> badump = dump[2:-1].encode()[1:]
>>> pickle.loads(badump)
[0, 1]

So:
- dump[2:-1] strips off the string representation of the byte string (b'...')
- .encode() turns it into an actual byte string
- [1:] strips a unicode blank from the start of the byte string (not entirely sure how that gets there...)

After that it should be fine to unpickle.
--
http://mail.python.org/mailman/listinfo/python-list
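If the string arrives as the *source text* of a bytes literal (escapes still written out as backslash sequences, rather than already interpreted as in the triple-quoted case above), ast.literal_eval parses it the same way the compiler would and hands back a real bytes object, with none of the encode/slice guesswork; a sketch:

```python
import ast
import pickle

# The received text: the source form of a bytes literal. The raw-string
# prefix here means the backslashes are literal characters, as they
# would be in text read from a file or socket.
dump = r"b'\x80\x03]q\x00(K\x00K\x01e.'"

# literal_eval safely evaluates literal expressions only (no arbitrary
# code), so it is the standard alternative to eval() for this job.
data = ast.literal_eval(dump)
print(pickle.loads(data))  # [0, 1]
```

Note this only applies when the escapes are uninterpreted source text; for the already-interpreted string in the post above, the manual slicing shown there (or fixing the producer) is still needed.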
Re: PyPy and RPython
On Sep 1, 6:49 pm, Benjamin Peterson wrote:
> sarvi gmail.com> writes:
> > Secondly I have always fantasized of never having to write C code yet
> > get its compiled performance.
> > With RPython (a strict subset of Python), I can actually compile it to
> > C/Machine code
>
> RPython is not supposed to be a general purpose language. As a PyPy developer
> myself, I can testify that it is no fun.

Can it be worse than writing C/C++? Compared to Java, having the interpreter during development is huge.

I actually think yall at PyPy are hugely underestimating RPython.
http://olliwang.com/2009/12/20/aes-implementation-in-rpython/
http://alexgaynor.net/2010/may/15/pypy-future-python/

Look at all the alternatives we have. Cython? Shedskin? I'll take PyPy anyday instead of them.

We make performance tradeoffs all the time. Look at Mercurial: 90% python and 5% C. Wouldn't you rather this be 90% Python and 5% RPython???

Add to that the possibility of writing Python extension modules in RPython. You could be winning a whole group of developer mindshare.

> > Yet I see this forum relatively quiet on PyPy or Rpython ? Any
> > reasons???
>
> You should post to the PyPy list instead. (See pypy.org)

I tried. got bounced. Just subscribed. Will try again.

Sarvi
--
http://mail.python.org/mailman/listinfo/python-list
Re: Private variables
On 2 September 2010 12:22, Ryan Kelly wrote:
> On Thu, 2010-09-02 at 12:06 +1000, Ryan Kelly wrote:
>> On Thu, 2010-09-02 at 11:10 +1000, Rasjid Wilcox wrote:
>> > Hi all,
>> >
>> > I am aware that private variables are generally done via convention
>> > (leading underscore), but I came across a technique in Douglas
>> > Crockford's book "Javascript: The Good Parts" for creating private
>> > variables in Javascript, and I thought I'd see how it translated to
>> > Python. Here is my attempt.
>> >
>> > def get_config(_cache=[]):
>> >     private = {}
>> >     private['a'] = 1
>> >     private['b'] = 2
>> >     if not _cache:
>> >         class Config(object):
>> >             @property
>> >             def a(self):
>> >                 return private['a']
>> >             @property
>> >             def b(self):
>> >                 return private['b']
>> >         config = Config()
>> >         _cache.append(config)
>> >     else:
>> >         config = _cache[0]
>> >     return config
>> >
>> > >>> c = get_config()
>> > >>> c.a
>> > 1
>> > >>> c.b
>> > 2
>> > >>> c.a = 10
>> > Traceback (most recent call last):
>> >   File "", line 1, in
>> > AttributeError: can't set attribute
>> > >>> dir(c)
>> > ['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
>> > '__getattribute__', '__hash__', '__init__', '__module__', '__new__',
>> > '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
>> > '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'a', 'b']
>> > >>> d = get_config()
>> > >>> d is c
>> > True
>> >
>> > I'm not really asking 'is it a good idea' but just 'does this work'?
>> > It seems to work to me, and is certainly 'good enough' in the sense
>> > that it should be impossible to accidentally change the variables of
>> > c.
>> >
>> > But is it possible to change the value of c.a or c.b with standard
>> > python, without resorting to ctypes level manipulation?
>>
>> It's not easy, but it can be done by introspecting the property object
>> you created and munging the closed-over dictionary object:
>>
>> >>> c = get_config()
>> >>> c.a
>> 1
>> >>> c.__class__.__dict__['a'].fget.func_closure[0].cell_contents['a'] = 7
>> >>> c.a
>> 7

Ah! That is what I was looking for.

> Heh, and of course I miss the even more obvious trick of just clobbering
> the property with something else:
>
> >>> c.a
> 1
> >>> setattr(c.__class__, "a", 7)
> >>> c.a
> 7

Well, that is just cheating! :-)

Anyway, thanks for that. I still think it is 'good enough' for those cases where private variables are 'required'. In both cases one has to go out of one's way to modify the attribute.

OTOH, I guess it depends on what the use case is. If it is for storing a secret password that no other part of the system should have access to, then perhaps not 'good enough' at all.

Cheers,
Rasjid.
--
http://mail.python.org/mailman/listinfo/python-list
Re: killing all subprocess childrens
Chris Rebert wrote:
> import os
> import psutil # http://code.google.com/p/psutil/
>
> # your piece of code goes here
>
> myself = os.getpid()
> for proc in psutil.process_iter():
>     if proc.ppid == myself:
>         proc.kill()
>
> Cheers,
> Chris
> --
> http://blog.rebertia.com

Is there a way to do this without psutil or installing any external modules or doing it from python2.5? Just wondering. Thanks again
--
http://mail.python.org/mailman/listinfo/python-list
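On POSIX systems at least, the standard library alone can do this: put the child in its own process group and signal the whole group, so the child and all its descendants die together. A sketch (this uses the Python 3 spellings start_new_session and wait(timeout=...); on Python 2.5 the rough equivalents are preexec_fn=os.setsid and a poll loop — and none of this works on Windows, where a job object would be needed instead):

```python
import os
import signal
import subprocess
import sys

def run_and_kill_tree(argv):
    # start_new_session=True gives the child a fresh session and
    # process group, so one killpg() reaches every descendant.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        proc.wait(timeout=0.2)
    except subprocess.TimeoutExpired:
        os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
        proc.wait()
    return proc.returncode

if os.name == 'posix':
    rc = run_and_kill_tree([sys.executable, '-c', 'import time; time.sleep(60)'])
    print('child exited with', rc)  # negative value == killed by signal
```

The 0.2-second timeout is only for the demonstration; in the original scenario you would signal the group when the main java process exits.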
Re: killing all subprocess childrens
On Wed, Sep 1, 2010 at 8:12 PM, Astan Chee wrote:
> Hi,
> I have a piece of code that looks like this:
>
> import subprocess
> retcode = subprocess.call(["java","test","string"])
> print "Exited with retcode " + str(retcode)
>
> What I'm trying to do (and wondering if its possible) is to make sure that
> any children (and any descendants) of this process is killed when the main
> java process is killed (or dies).
> How do I do this in windows, linux and OSX?

Something /roughly/ like:

import os
import psutil # http://code.google.com/p/psutil/

# your piece of code goes here

myself = os.getpid()
for proc in psutil.process_iter():
    if proc.ppid == myself:
        proc.kill()

Cheers,
Chris
--
http://blog.rebertia.com
--
http://mail.python.org/mailman/listinfo/python-list
killing all subprocess childrens
Hi,
I have a piece of code that looks like this:

import subprocess
retcode = subprocess.call(["java", "test", "string"])
print "Exited with retcode " + str(retcode)

What I'm trying to do (and wondering if it's possible) is to make sure that any children (and any descendants) of this process are killed when the main java process is killed (or dies). How do I do this in windows, linux and OSX?

Thanks
Astan
--
http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
On 09/01/2010 04:51 PM, Raymond Hettinger wrote:
> On Aug 30, 6:03 am, a...@pythoncraft.com (Aahz) wrote:
>> That reminds me: one co-worker (who really should have known better ;-)
>> had the impression that sets were O(N) rather than O(1). Although
>> writing that off as a brain-fart seems appropriate, it's also the case
>> that the docs don't really make that clear; it's implied from requiring
>> elements to be hashable. Do you agree that there should be a comment?
>
> There probably ought to be a HOWTO or FAQ entry on algorithmic
> complexity that covers classes and functions where the algorithms are
> interesting. That will concentrate the knowledge in one place where
> performance is a main theme and where the various alternatives can be
> compared and contrasted.
>
> I think most users of sets rarely read the docs for sets. The few lines
> in the tutorial are enough so that most folks "just get it" and don't
> read more detail unless they are attempting something exotic.

I think that attitude is very dangerous. There is a long history in this world of one group of people presuming what another group of people does or does not do or think. This seems to be a characteristic of human beings and is often used to promote one's own ideology. And even if you have hard evidence for what you say, why should 60% of people who don't read docs justify providing poor quality docs to the 40% that do?

So while you may "think" most people rarely read the docs for basic language features and objects (I presume you don't mean to restrict your statement to only sets), I and most people I know *do* read them. And when I read them I expect them, as any good reference documentation does, to completely and accurately describe the behavior of the item I am reading about. If big-O performance is deemed an intrinsic behavior of an (operation of) an object, it should be described in the documentation for that object.

Your use of the word "exotic" is also suspect.
I learned long ago to always click the "advanced options" box on dialogs because most developers/designers really don't have a clue about what users need access to.

> Our docs have gotten
> somewhat voluminous,

No they haven't (relative to what they attempt to describe). The biggest problem with the docs is that they are too terse. They often appear to have been written by people playing a game of "who can describe X in the minimum number of words that can still be defended as correct." While that may be fun, good docs are produced by considering how to describe something to the reader, completely and accurately, as effectively as possible. The test is not how few words were used, but how quickly the reader can understand the object or find the information being sought about the object.

> so it's unlikely that adding that particular
> needle to the haystack would have cured your colleague's "brain-fart"
> unless he had been focused on a single document talking about the
> performance characteristics of various data structures.

I don't know the colleague any more than you do, so I feel comfortable saying that having it very likely *would* have cured that brain-fart. That is, he or she very likely would have needed to check some behavior of sets at some point and would have either noted the big-O characteristics in passing, or would have noted that such information was available, and would have returned to the documentation when the need for that information arose. The reference description of sets is the *one* canonical place to look for information about sets.

There are people who don't read documentation, but one has to be very careful not to use the existence of such people as an excuse to justify sub-standard documentation. So I think relegating algorithmic complexity information to some remote document far from the description of the object it pertains to, is exactly the wrong approach.
This is not to say that a performance HOWTO or FAQ in addition to the reference manual would not be good. -- http://mail.python.org/mailman/listinfo/python-list
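The O(1)-versus-O(N) point the thread keeps circling is easy to demonstrate empirically with timeit; a small sketch (the sizes and repetition count are arbitrary choices of mine):

```python
import timeit

n = 100000
setup = "data = list(range(%d)); s = set(data)" % n

# Membership test for an element near the "end" of the data: the list
# must scan linearly, while the set hashes straight to its bucket.
list_t = timeit.timeit("(%d - 1) in data" % n, setup=setup, number=100)
set_t = timeit.timeit("(%d - 1) in s" % n, setup=setup, number=100)

print("list: %.6fs  set: %.6fs" % (list_t, set_t))
```

The gap widens linearly with n for the list and stays flat for the set, which is exactly the behavioral fact the co-worker had backwards.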
Re: PyPy and RPython
On Sep 2, 3:49 am, sarvi wrote: > Yet I see this forum relatively quite on PyPy or Rpython ? Any > reasons??? For me, it's two major ones: 1. PyPy only recently hit a stability/performance point that makes it worth checking out, 2. Using non-pure-python modules wasn't straightforward (at least when I last looked) However, I've always felt the PyPy project was far more promising than Unladen Swallow. -- http://mail.python.org/mailman/listinfo/python-list
Re: Problem checking an existing browser cookie
On 31 Αύγ, 11:07, Nik the Greek wrote:
> On 30 Αύγ, 20:50, MRAB wrote:
> > On 30/08/2010 18:16, Nik the Greek wrote:
> > > On 30 Αύγ, 19:41, MRAB wrote:
> > >> On 30/08/2010 04:33, Nik the Greek wrote:
> > >>> On 30 Αύγ, 06:12, MRAB wrote:
> >
> > This part:
> >
> > ( not mycookie or mycookie.value != 'nikos' )
> >
> > is false but this part:
> >
> > re.search( r'(msn|yandex|13448|spider|crawl)', host ) is None
> >
> > is true because host doesn't contain any of those substrings.
> >
> > >>> So, the if code does executed because one of the condition is true?
> > >>> How should i write it?
> > >>> I cannot think clearly on this at all.
> > >>> I just wan to tell it to get executed ONLY IF
> > >>> the cookie values is not 'nikos'
> > >>> or ( don't knwo if i have to use and or 'or' here)
> > >>> host does not contain any of the substrings.
> > >>> What am i doign wrong?!
> >
> > >> It might be clearer if you reverse the condition and say:
> > >>
> > >> me_visiting = ...
> > >> if not me_visiting:
> > >>     ...
> >
> > > I don't understand what are you trying to say
> > > Please provide a full example.
> > >
> > > You mean i should try it like this?
> > >
> > > unless ( visitor and visitor.value == 'nikos' ) or re.search( r'(msn|
> > > yandex|13448|spider|crawl)', host ) not None:
> > >
> > > But isnt it the same thing like the if?
> >
> > My point is that the logic might be clearer to you if you think first
> > about how you know when you _are_ the visitor.
>
> Well my idea was to set a cookie on my browser with the name visitor
> and a value of "nikos" and then check each time that cookie. if value
> is "nikos" then dont count!
>
> I could also pass an extra url string like http://webville.gr?show=nikos
> and check that but i dont like the idea very much of giving an extra
> string each time i want to visit my webpage.
> So from the 2 solutions mentioned the 1st one is better but can't come
> into action for some reason.
> Apart from those two solutions i can't think of anything else that
> would identify me and filter me out of the actual guests of my website.
>
> I'm all ears if you can think of something else.

Is there any other way for the webpage to identify me and filter me out except checking a cookie or attaching an extra url string to the address bar?
--
http://mail.python.org/mailman/listinfo/python-list
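What the thread is circling around is that the two exclusion tests have to be combined so that *either* one is enough to skip the count; phrased positively, count only when the visitor is neither the owner nor a crawler. A sketch of that logic (should_count and the cookie handling are my own stand-ins for the site's actual code; the regex is the one from the thread):

```python
import re

BOT_RE = re.compile(r'(msn|yandex|13448|spider|crawl)')

def should_count(cookie_value, host):
    # Skip counting if EITHER the owner's cookie is present
    # OR the host looks like a known crawler.
    is_owner = (cookie_value == 'nikos')
    is_bot = BOT_RE.search(host) is not None
    return not (is_owner or is_bot)

print(should_count('nikos', 'example.com'))        # False: site owner
print(should_count(None, 'crawl-66.example.net'))  # False: crawler
print(should_count(None, 'dsl-athens.example'))    # True: real visitor
```

Writing the positive case first (per MRAB's "how do you know when you _are_ the visitor" suggestion) is what makes the and/or choice fall out naturally.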
Re: Private variables
On Thu, 2010-09-02 at 12:06 +1000, Ryan Kelly wrote:
> On Thu, 2010-09-02 at 11:10 +1000, Rasjid Wilcox wrote:
> > Hi all,
> >
> > I am aware that private variables are generally done via convention
> > (leading underscore), but I came across a technique in Douglas
> > Crockford's book "Javascript: The Good Parts" for creating private
> > variables in Javascript, and I thought I'd see how it translated to
> > Python. Here is my attempt.
> >
> > def get_config(_cache=[]):
> >     private = {}
> >     private['a'] = 1
> >     private['b'] = 2
> >     if not _cache:
> >         class Config(object):
> >             @property
> >             def a(self):
> >                 return private['a']
> >             @property
> >             def b(self):
> >                 return private['b']
> >         config = Config()
> >         _cache.append(config)
> >     else:
> >         config = _cache[0]
> >     return config
> >
> > >>> c = get_config()
> > >>> c.a
> > 1
> > >>> c.b
> > 2
> > >>> c.a = 10
> > Traceback (most recent call last):
> >   File "", line 1, in
> > AttributeError: can't set attribute
> > >>> dir(c)
> > ['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
> > '__getattribute__', '__hash__', '__init__', '__module__', '__new__',
> > '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
> > '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'a', 'b']
> > >>> d = get_config()
> > >>> d is c
> > True
> >
> > I'm not really asking 'is it a good idea' but just 'does this work'?
> > It seems to work to me, and is certainly 'good enough' in the sense
> > that it should be impossible to accidentally change the variables of
> > c.
> >
> > But is it possible to change the value of c.a or c.b with standard
> > python, without resorting to ctypes level manipulation?
>
> It's not easy, but it can be done by introspecting the property object
> you created and munging the closed-over dictionary object:
>
> >>> c = get_config()
> >>> c.a
> 1
> >>> c.__class__.__dict__['a'].fget.func_closure[0].cell_contents['a'] = 7
> >>> c.a
> 7

Heh, and of course I miss the even more obvious trick of just clobbering the property with something else:

>>> c.a
1
>>> setattr(c.__class__, "a", 7)
>>> c.a
7

Ryan
--
Ryan Kelly
http://www.rfk.id.au | This message is digitally signed. Please visit
r...@rfk.id.au | http://www.rfk.id.au/ramblings/gpg/ for details

signature.asc
Description: This is a digitally signed message part
--
http://mail.python.org/mailman/listinfo/python-list
Re: Private variables
On Thu, 2010-09-02 at 11:10 +1000, Rasjid Wilcox wrote:
> Hi all,
>
> I am aware that private variables are generally done via convention
> (leading underscore), but I came across a technique in Douglas
> Crockford's book "Javascript: The Good Parts" for creating private
> variables in Javascript, and I thought I'd see how it translated to
> Python. Here is my attempt.
>
> def get_config(_cache=[]):
>     private = {}
>     private['a'] = 1
>     private['b'] = 2
>     if not _cache:
>         class Config(object):
>             @property
>             def a(self):
>                 return private['a']
>             @property
>             def b(self):
>                 return private['b']
>         config = Config()
>         _cache.append(config)
>     else:
>         config = _cache[0]
>     return config
>
> >>> c = get_config()
> >>> c.a
> 1
> >>> c.b
> 2
> >>> c.a = 10
> Traceback (most recent call last):
>   File "", line 1, in
> AttributeError: can't set attribute
> >>> dir(c)
> ['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
> '__getattribute__', '__hash__', '__init__', '__module__', '__new__',
> '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
> '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'a', 'b']
> >>> d = get_config()
> >>> d is c
> True
>
> I'm not really asking 'is it a good idea' but just 'does this work'?
> It seems to work to me, and is certainly 'good enough' in the sense
> that it should be impossible to accidentally change the variables of
> c.
>
> But is it possible to change the value of c.a or c.b with standard
> python, without resorting to ctypes level manipulation?

It's not easy, but it can be done by introspecting the property object you created and munging the closed-over dictionary object:

>>> c = get_config()
>>> c.a
1
>>> c.__class__.__dict__['a'].fget.func_closure[0].cell_contents['a'] = 7
>>> c.a
7

Cheers,
Ryan
--
Ryan Kelly
http://www.rfk.id.au | This message is digitally signed.
Please visit
r...@rfk.id.au | http://www.rfk.id.au/ramblings/gpg/ for details

signature.asc
Description: This is a digitally signed message part
--
http://mail.python.org/mailman/listinfo/python-list
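The closure-munging line above is spelled with Python 2 attribute names; under Python 3, func_closure became __closure__ (and func_closure is gone). A condensed sketch of the same trick in the newer spelling (get_config here is abbreviated from the thread's version, without the cache):

```python
def get_config():
    private = {'a': 1, 'b': 2}
    class Config(object):
        @property
        def a(self):
            return private['a']
    return Config()

c = get_config()
print(c.a)  # 1

# Reach through the property's getter into the closed-over dict.
# Python 3 spelling: fget.__closure__ rather than fget.func_closure.
cell = type(c).__dict__['a'].fget.__closure__[0]
cell.cell_contents['a'] = 7
print(c.a)  # 7
```

The getter's only free variable is `private`, so __closure__[0] is the cell holding that dict, which is why mutating cell_contents changes what the "private" property reports.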
Re: PyPy and RPython
sarvi gmail.com> writes:
> Is there a plan to adopt PyPy and RPython under the python foundation
> in an attempt to standardize both.

There is not.

> Secondly I have always fantasized of never having to write C code yet
> get its compiled performance.
> With RPython (a strict subset of Python), I can actually compile it to
> C/Machine code

RPython is not supposed to be a general purpose language. As a PyPy developer myself, I can testify that it is no fun.

> Yet I see this forum relatively quiet on PyPy or Rpython ? Any
> reasons???

You should post to the PyPy list instead. (See pypy.org)
--
http://mail.python.org/mailman/listinfo/python-list
Private variables
Hi all,

I am aware that private variables are generally done via convention (leading underscore), but I came across a technique in Douglas Crockford's book "Javascript: The Good Parts" for creating private variables in Javascript, and I thought I'd see how it translated to Python. Here is my attempt.

def get_config(_cache=[]):
    private = {}
    private['a'] = 1
    private['b'] = 2
    if not _cache:
        class Config(object):
            @property
            def a(self):
                return private['a']
            @property
            def b(self):
                return private['b']
        config = Config()
        _cache.append(config)
    else:
        config = _cache[0]
    return config

>>> c = get_config()
>>> c.a
1
>>> c.b
2
>>> c.a = 10
Traceback (most recent call last):
  File "", line 1, in
AttributeError: can't set attribute
>>> dir(c)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
'__getattribute__', '__hash__', '__init__', '__module__', '__new__',
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
'__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'a', 'b']
>>> d = get_config()
>>> d is c
True

I'm not really asking 'is it a good idea' but just 'does this work'? It seems to work to me, and is certainly 'good enough' in the sense that it should be impossible to accidentally change the variables of c.

But is it possible to change the value of c.a or c.b with standard python, without resorting to ctypes level manipulation?

Cheers,
Rasjid.
--
http://mail.python.org/mailman/listinfo/python-list
Selecting k smallest or largest elements from a large list in python; (benchmarking)
Given: a large list (10,000,000) of floating point numbers;
Task: fastest python code that finds k (small, e.g. 10) smallest items, preferably with item indexes;
Limitations: in python, using only standard libraries (numpy & scipy is Ok);

I've tried several methods. With N = 10,000,000, K = 10 the fastest so far (without item indexes) was the pure python implementation nsmallest_slott_bisect (using bisect/insert). And with indexes, nargsmallest_numpy_argmin (argmin() in the numpy array k times).

Anyone up to the challenge beating my code with some clever selection algorithm?

Current Table:
1.66864395142 mins_heapq(items, n):
0.946580886841 nsmallest_slott_bisect(items, n):
1.38014793396 nargsmallest(items, n):
10.0732769966 sorted(items)[:n]:
3.17916202545 nargsmallest_numpy_argsort(items, n):
1.31794500351 nargsmallest_numpy_argmin(items, n):
2.37499308586 nargsmallest_numpy_array_argsort(items, n):
0.524670124054 nargsmallest_numpy_array_argmin(items, n):

0.0525538921356 numpy argmin(items): 1892997
0.364673852921 min(items): 10.026786

Code:

import heapq
import time
from random import randint, random
from bisect import insort
from itertools import islice
from operator import itemgetter

def mins_heapq(items, n):
    nlesser_items = heapq.nsmallest(n, items)
    return nlesser_items

def nsmallest_slott_bisect(iterable, n, insort=insort):
    it = iter(iterable)
    mins = sorted(islice(it, n))
    for el in it:
        if el <= mins[-1]:  # NOTE: equal sign is to preserve duplicates
            insort(mins, el)
            mins.pop()
    return mins

def nargsmallest(iterable, n, insort=insort):
    it = enumerate(iterable)
    mins = sorted(islice(it, n), key=itemgetter(1))
    loser = mins[-1][1]  # largest of smallest
    for el in it:
        if el[1] <= loser:  # NOTE: equal sign is to preserve dupl
            mins.append(el)
            mins.sort(key=itemgetter(1))
            mins.pop()
            loser = mins[-1][1]
    return mins

def nargsmallest_numpy_argsort(iter, k):
    distances = N.asarray(iter)
    return [(i, distances[i]) for i in distances.argsort()[0:k]]

def nargsmallest_numpy_array_argsort(array, k):
    return [(i, array[i]) for i in array.argsort()[0:k]]

def nargsmallest_numpy_argmin(iter, k):
    distances = N.asarray(iter)
    mins = []
    for i in xrange(k):
        j = distances.argmin()
        mins.append((j, distances[j]))
        distances[j] = float('inf')
    return mins

def nargsmallest_numpy_array_argmin(distances, k):
    mins = []
    for i in xrange(k):
        j = distances.argmin()
        mins.append((j, distances[j]))
        distances[j] = float('inf')
    return mins

test_data = [randint(10, 50) + random() for i in range(1000)]
K = 10

init = time.time()
mins = mins_heapq(test_data, K)
print time.time() - init, 'mins_heapq(items, n):', mins[:2]

init = time.time()
mins = nsmallest_slott_bisect(test_data, K)
print time.time() - init, 'nsmallest_slott_bisect(items, n):', mins[:2]

init = time.time()
mins = nargsmallest(test_data, K)
print time.time() - init, 'nargsmallest(items, n):', mins[:2]

init = time.time()
mins = sorted(test_data)[:K]
print time.time() - init, 'sorted(items)[:n]:', mins[:2]

import numpy as N

init = time.time()
mins = nargsmallest_numpy_argsort(test_data, K)
print time.time() - init, 'nargsmallest_numpy_argsort(items, n):', mins[:2]

init = time.time()
mins = nargsmallest_numpy_argmin(test_data, K)
print time.time() - init, 'nargsmallest_numpy_argmin(items, n):', mins[:2]

array = N.asarray(test_data)

init = time.time()
mins = array.argmin()
print time.time() - init, 'numpy argmin(items):', mins

init = time.time()
mins = min(test_data)
print time.time() - init, 'min(items):', mins
--
http://mail.python.org/mailman/listinfo/python-list
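On NumPy versions that have it (argpartition arrived in 1.8, well after this thread), the k-smallest-with-indexes task can be done with a single O(N) selection pass plus a sort of only the k winners, which would likely beat every entry in the table above; a sketch (the function name is mine):

```python
import numpy as np

def nargsmallest_argpartition(values, k):
    # argpartition puts the indexes of the k smallest values (in
    # arbitrary order) in the first k slots; sorting just those k
    # values then yields ascending order.
    arr = np.asarray(values)
    idx = np.argpartition(arr, k)[:k]
    idx = idx[np.argsort(arr[idx])]
    return [(int(i), float(arr[i])) for i in idx]

data = [5.0, 1.5, 3.25, 0.75, 2.0, 9.0]
print(nargsmallest_argpartition(data, 3))  # [(3, 0.75), (1, 1.5), (4, 2.0)]
```

Unlike the argmin()-k-times approach in the table, this never mutates the input array and does not depend on k being tiny.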
Re: Python libs on Windows ME
Damn Small Linux could work. If even that won't work, perhaps it's time to scrap your old fossil for parts and buy a modern computer. Even a netbook would probably be an improvement based on your situation. -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
Robert Kern writes: > On 9/1/10 4:40 PM, John Bokma wrote: >> Arnaud Delobelle writes: >> >>> Terry Reedy writes: [...] >>> I don't understand what you're trying to say. Aahz didn't claim that >>> random list element access was constant time, he said it was O(1) (and >>> that it should be part of the Python spec that it is). >> >> Uhm, O(1) /is/ constant time, see page 45 of Introduction to Algorithms, >> 2nd edition. > > While we often use the term "constant time" as a synonym for O(1) > complexity of an algorithm, Arnaud and Terry are using the term here > to mean "an implementation takes roughly the same amount of wall-clock > time every time". Now that's confusing in a discussion that earlier on provided a link to a page using big O notation. At least for people following this partially, like I do. -- John Bokma j3b Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma Freelance Perl & Python Development: http://castleamber.com/ -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
Terry Reedy writes: > On 9/1/2010 5:40 PM, John Bokma wrote: [..] > Yes, I switched, because 'constant time' is a comprehensible claim > that can be refuted and because that is how some will interpret O(1) > (see below for proof;-). You now make it sound as if this interpretation is incorrect or out of place. People who have bothered to read ItA will use O(1) and constant time interchangeably while talking of the order of growth of the running time of algorithms, and most of those are aware that 'big oh' hides a constant, and that in the real world an O(log n) algorithm can outperform an O(1) algorithm for small values of n. >> Uhm, O(1) /is/ constant time, see page 45 of Introduction to Algorithms, >> 2nd edition. -- John Bokma j3b Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma Freelance Perl & Python Development: http://castleamber.com/ -- http://mail.python.org/mailman/listinfo/python-list
Re: DeprecationWarning
On Wed, Sep 1, 2010 at 8:58 AM, cerr wrote: > Hi There, > > I would like to create an scp handle and download a file from a > client. I have following code: > but what i'm getting is this and no file is downloaded...: > /opt/lampp/cgi-bin/attachment.py:243: DeprecationWarning: > BaseException.message has been deprecated as of Python 2.6 > chan.send('\x01'+e.message) > 09/01/2010 08:53:56 : Downloading P-file failed. > > What does that mean and how do i resolve this? http://stackoverflow.com/questions/1272138/baseexception-message-deprecated-in-python-2-6 As the warning message says, line 243 of /opt/lampp/cgi-bin/attachment.py is the cause of the warning. However, that's only a warning (albeit probably about a small part of some error-raising code), not an error itself, so it's not the cause of the download failure. Printing out the IOError encountered would be the first step in debugging the download failure. Cheers, Chris -- http://blog.rebertia.com -- http://mail.python.org/mailman/listinfo/python-list
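For what it's worth, the usual replacement for the deprecated e.message pattern flagged above is to format the exception itself; this sketch (format_error is a hypothetical stand-in for the code around line 243) behaves the same on Python 2.6+ and 3.x:

```python
# str(e) returns the exception's message portably, so the send call can
# avoid the deprecated BaseException.message attribute.
def format_error(e):
    return '\x01' + str(e)   # instead of '\x01' + e.message

try:
    raise IOError("Downloading P-file failed")
except IOError as e:
    print(repr(format_error(e)))  # control byte followed by the message
```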
Re: importing excel data into a python matrix?
On Wed, Sep 1, 2010 at 4:35 PM, patrick mcnameeking wrote: > Hello list, > I've been working with Python now for about a year using it primarily for > scripting in the Puredata graphical programming environment. I'm working on > a project where I have been given a 1000 by 1000 cell excel spreadsheet and > I would like to be able to access the data using Python. Does anyone know > of a way that I can do this? "xlrd 0.7.1 - Library for developers to extract data from Microsoft Excel (tm) spreadsheet files": http://pypi.python.org/pypi/xlrd If requiring the user to re-save the file as .CSV instead of .XLS is feasible, then you /can/ avoid the third-party dependency and use just the std lib instead: http://docs.python.org/library/csv.html Cheers, Chris -- http://blog.rebertia.com -- http://mail.python.org/mailman/listinfo/python-list
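To make the csv route concrete, here is a minimal sketch under the stated assumption that the spreadsheet has been re-saved as CSV; the io.StringIO text stands in for the exported file:

```python
import csv
import io

# Read CSV text into a list-of-lists "matrix", indexable as matrix[row][col].
csv_text = "a,b,c\n1,2,3\n"                 # stand-in for open('sheet.csv')
matrix = list(csv.reader(io.StringIO(csv_text)))
print(matrix[0])     # ['a', 'b', 'c']
print(matrix[1][2])  # '3' - csv yields strings; convert to numbers as needed
```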
Re: importing excel data into a python matrix?
On Wed, Sep 1, 2010 at 4:35 PM, patrick mcnameeking wrote: > Hello list, > I've been working with Python now for about a year using it primarily for > scripting in the Puredata graphical programming environment. I'm working on > a project where I have been given a 1000 by 1000 cell excel spreadsheet and > I would like to be able to access the data using Python. Does anyone know > of a way that I can do this? > Thanks, > Pat http://tinyurl.com/2eqqjxv ;) Geremy Condra -- http://mail.python.org/mailman/listinfo/python-list
importing excel data into a python matrix?
Hello list, I've been working with Python now for about a year using it primarily for scripting in the Puredata graphical programming environment. I'm working on a project where I have been given a 1000 by 1000 cell excel spreadsheet and I would like to be able to access the data using Python. Does anyone know of a way that I can do this? Thanks, Pat -- 'Given enough eyeballs, all bugs are shallow.' -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
On 9/1/2010 5:40 PM, John Bokma wrote: Arnaud Delobelle writes: Terry Reedy writes: On 9/1/2010 11:40 AM, Aahz wrote: I think that any implementation that doesn't have O(1) for list element access is fundamentally broken, Whereas I think that that claim is fundamentally broken in multiple ways. and we should probably document that somewhere. I agree that *current* algorithmic behavior of parts of CPython on typical *current* hardware should be documented not just 'somewhere' (which I understand it is, in the Wiki) but in a CPython doc included in the doc set distributed with each release. Perhaps someone or some group could write a HowTo on Programming with CPython's Builtin Classes that would describe both the implementation and performance and also the implications for coding style. In particular, it could compare CPython's array lists and tuples to singly linked lists (which are easily created in Python also). But such a document, after stating that array access may be thought of as constant time on current hardware to a useful first approximation, should also state that repeated sequential accesses may be *much* faster than repeated random accesses. People in the high-performance computing community are quite aware of this difference between simplified lies and messy truth. Because of this, array algorithms are (should be) written differently in Fortran and C because Fortran stores arrays by columns and C by rows and because it is usually much faster to access the next item than one far away. I don't understand what you're trying to say. Most generally, that I view Python as a general algorithm language and not just as a von Neumann machine programming language. More specifically, that O() claims can be inapplicable, confusing, misleading, incomplete, or false, especially when applied to real time and to real systems with finite limits. 
Aahz didn't claim that random list element access was constant time, he said it was O(1) (and that it should be part of the Python spec that it is). Yes, I switched, because 'constant time' is a comprehensible claim that can be refuted and because that is how some will interpret O(1) (see below for proof;-). If one takes O(1) to mean bounded, which I believe is the usual technical meaning, then all Python built-in sequence operations take bounded time because of the hard size limit. If sequences were not bounded in length, then access time would not be bounded either. My most specific point is that O(1), interpreted as more-or-less constant time across a range of problem sizes, can be either a virtue or a vice depending on whether the constancy is a result of speeding up large problems or slowing down small problems. I furthermore contend that Python sequences on current hardware exhibit both virtue and vice, that it would be absurd to reject a system that kept the virtue without the vice, and that such absurdity should not be built into the language definition. My fourth point is that we can meet the reasonable goal of helping some people make better use of current Python/CPython on current hardware without big-O controversy and without screwing around with the language definition and locking out the future. Uhm, O(1) /is/ constant time, see page 45 of Introduction to Algorithms, 2nd edition. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
On Aug 30, 6:03 am, a...@pythoncraft.com (Aahz) wrote: > That reminds me: one co-worker (who really should have known better ;-) > had the impression that sets were O(N) rather than O(1). Although > writing that off as a brain-fart seems appropriate, it's also the case > that the docs don't really make that clear, it's implied from requiring > elements to be hashable. Do you agree that there should be a comment? There probably ought to be a HOWTO or FAQ entry on algorithmic complexity that covers classes and functions where the algorithms are interesting. That will concentrate the knowledge in one place where performance is a main theme and where the various alternatives can be compared and contrasted. I think most users of sets rarely read the docs for sets. The few lines in the tutorial are enough so that most folks "just get it" and don't read more detail unless they are attempting something exotic. Our docs have gotten somewhat voluminous, so it's unlikely that adding that particular needle to the haystack would have cured your colleague's "brain-fart" unless he had been focused on a single document talking about the performance characteristics of various data structures. Raymond -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
On 9/1/10 4:40 PM, John Bokma wrote: Arnaud Delobelle writes: Terry Reedy writes: But such a document, after stating that array access may be thought of as constant time on current hardware to a useful first approximation, should also state that repeated sequential accesses may be *much* faster than repeated random accesses. People in the high-performance computing community are quite aware of this difference between simplified lies and messy truth. Because of this, array algorithms are (should be) written differently in Fortran and C because Fortran stores arrays by columns and C by rows and because it is usually much faster to access the next item than one far away. I don't understand what you're trying to say. Aahz didn't claim that random list element access was constant time, he said it was O(1) (and that it should be part of the Python spec that it is). Uhm, O(1) /is/ constant time, see page 45 of Introduction to Algorithms, 2nd edition. While we often use the term "constant time" as a synonym for O(1) complexity of an algorithm, Arnaud and Terry are using the term here to mean "an implementation takes roughly the same amount of wall-clock time every time". -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list
Re: what is this kind of string: b'string' ?
On 09/01/2010 02:32 PM, Stef Mientki wrote: in winpdb I see strings like this: a = b'string' a 'string' type(a) what's the "b" doing in front of the string ? thanks, Stef Mientki In Python 2 the b is meaningless (but allowed for compatibility and future-proofing purposes), while in Python 3 it creates a bytes object (a byte string, technically an object of type bytes) rather than a string (of unicode).

Python 2:
>>> type(b'abc')
<type 'str'>
>>> type('abc')
<type 'str'>

Python 3:
>>> type(b'abc')
<class 'bytes'>
>>> type('abc')
<class 'str'>

-- http://mail.python.org/mailman/listinfo/python-list
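A short Python 3 sketch of the distinction described above: b'...' builds raw bytes, '...' builds unicode text, and encode()/decode() convert between the two:

```python
raw = b'string'    # bytes: a sequence of raw octets
text = 'string'    # str: unicode text

print(type(raw).__name__)            # bytes
print(type(text).__name__)           # str
print(raw.decode('ascii') == text)   # True
print(text.encode('ascii') == raw)   # True
```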
Email Previews
Hello, I'm currently trying to write a quick script that takes email message objects and generates quick snippet previews (like the iPhone does when you are in the menu) but I'm struggling. I was just wondering before I started to put a lot of work in this if there were any existing scripts out there that did it, as it seems a bit pointless spending a lot of time reinventing the wheel if something already exists. Thanks for your help, Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
Arnaud Delobelle writes: > Terry Reedy writes: > >> On 9/1/2010 11:40 AM, Aahz wrote: >>> I think that any implementation >>> that doesn't have O(1) for list element access is fundamentally broken, >> >> Whereas I think that that claim is fundamentally broken in multiple ways. >> >>> and we should probably document that somewhere. >> >> I agree that *current* algorithmic behavior of parts of CPython on >> typical *current* hardware should be documented not just 'somewhere' >> (which I understand it is, in the Wiki) but in a CPython doc included >> in the doc set distributed with each release. >> >> Perhaps someone or some group could write a HowTo on Programming with >> CPython's Builtin Classes that would describe both the implementation >> and performance and also the implications for coding style. In >> particular, it could compare CPython's array lists and tuples to >> singly linked lists (which are easily created in Python also). >> >> But such a document, after stating that array access may be thought of >> as constant time on current hardware to a useful first approximation, >> should also state that repeated sequential accesses may be *much* >> faster than repeated random accesses. People in the high-performance >> computing community are quite aware of this difference between >> simplified lies and messy truth. Because of this, array algorithms are >> (should be) written differently in Fortran and C because Fortran >> stores arrays by columns and C by rows and because it is usually much >> faster to access the next item than one far away. > > I don't understand what you're trying to say. Aahz didn't claim that > random list element access was constant time, he said it was O(1) (and > that it should be part of the Python spec that it is). Uhm, O(1) /is/ constant time, see page 45 of Introduction to Algorithms, 2nd edition. 
-- John Bokma j3b Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma Freelance Perl & Python Development: http://castleamber.com/ -- http://mail.python.org/mailman/listinfo/python-list
Re: what is this kind of string: b'string' ?
On 9/1/10 4:32 PM, Stef Mientki wrote: in winpdb I see strings like this: a = b'string' a 'string' type(a) what's the "b" doing in front of the string ? http://docs.python.org/py3k/library/stdtypes.html#bytes-and-byte-array-methods -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list
what is this kind of string: b'string' ?
in winpdb I see strings like this:

>>> a = b'string'
>>> a
'string'
>>> type(a)
<type 'str'>

what's the "b" doing in front of the string ? thanks, Stef Mientki -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
On 9/1/2010 2:42 AM, Paul Rubin wrote: Terry Reedy writes: Does anyone seriously think that an implementation should be rejected as an implementation if it intelligently did seq[n] lookups in log2(n)/31 time units for all n (as humans would do), instead of stupidly taking 1 time unit for all n < 2**31 and rejecting all larger values (as 32-bit CPython does)? Er, how can one handle n > 2**31 at all, in 32-bit CPython? I am not sure of what you mean by 'handle'. Ints (longs in 2.x) are not limited, but indexes are. 2**31 and bigger are summarily rejected as impossibly too large, even though they might not actually be so these days.

>>> s = b''
>>> s[1]
Traceback (most recent call last):
  File "", line 1, in
    s[1]
IndexError: index out of range
>>> s[2**32]
Traceback (most recent call last):
  File "", line 1, in
    s[2**32]
IndexError: cannot fit 'int' into an index-sized integer

As far as I know, this is undocumented. In any case, this means that if it were possible to create a byte array longer than 2**31 on an otherwise loaded 32-bit linux machine with 2**32 memory, then indexing the end elements would not be possible, which is to say, O(1) would jump to O(INF). I do not have such a machine to test whether big = open('2.01.gigabytes', 'rb').read() executes or raises an exception. Array size limits are also not documented. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
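The "index-sized integer" in the second traceback is CPython's C-level Py_ssize_t; its upper bound is exposed as sys.maxsize (2**31 - 1 on 32-bit builds, 2**63 - 1 on 64-bit ones), and an index beyond it is rejected before the sequence is even consulted:

```python
import sys

# Any index that does not fit in a Py_ssize_t raises IndexError up front,
# regardless of the sequence's actual length.
s = b''
try:
    s[sys.maxsize + 1]
except IndexError as e:
    print('IndexError:', e)
```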
Re: C++ - Python API
Thanks for the answer On 1 Sep., 22:29, Thomas Jollans wrote: > On Wednesday 01 September 2010, it occurred to Markus Kraus to exclaim: > > > So the feature overview: > > First, the obligatory things you don't want to hear: Have you had a look at > similar efforts? A while ago, Aahz posted something very similar on this very > list. You should be able to find it in any of the archives without too much > trouble. > The most prominent example of this is obviously Boost.Python. I searched in Aahz's posts but I didn't find anything related. About Boost.Python: I worked with it but (for me) it seems more as if it's meant to create pyd modules. > > For C++ classes: > > - "translating" it into a python object > > How do you handle memory management ? As long as the c++ instance itself exists, the python object exists too. If you delete the c++ instance the python one is also deleted (in a multithreaded environment you'll get a "This object has already been deleted" error). > > - complete reflexion (attributes and methods) of the c++ instance > > - call c++ methods nearly directly from python > > - method-overloading (native python doesn't support it (!)) > > > Modules: > > - the API allows to create hardcoded python modules without having > > any knowledge about the python C-API > > - Adding attributes to the module (long/char*/PyObject*) > > char*... > Unicode? Somewhere? wchar_t* maybe, or std::wstring? No? Also -- double? (I'm > just being pedantic now, at least double should be trivial to add) I haven't worked too much on this yet but I'll add support for all common c++ types. > > General: > > -runs on any platform and doesn't need an installed python > > Which platforms did you test it on? Which compilers did you test? Are you sure > your C++ is portable? My C++ code is not platform dependent so it should (haven't tested it yet) be portable. > > -runs in multithreaded environments (requires python > 2.3) > > How do you deal with the GIL?
> How do you handle calling to Python from multiple C++ threads? Since python 2.3 there are the PyGILState_Ensure and PyGILState_Release functions which do the whole GIL stuff for you :). > > -support for python 3.x > > -no need of any python C-API knowledge (maybe for coding modules but > > then only 2 or 3 functions) > > -the project is a VC2010 one and there is also an example module + > > class > > Again, have you tested other compilers? Don't have the ability for it (could need a linux guy who knows how to create a makefile). > > If there is any interest in testing this or using this for your own > > project, please post; in that case i'll release it now instead of > > finishing the inheritance support before releasing it (this may take a > > few days though). > > Just publish a bitbucket or github repository ;-) I'll set up a googlecode site :P -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
Terry Reedy writes: > On 9/1/2010 11:40 AM, Aahz wrote: >> I think that any implementation >> that doesn't have O(1) for list element access is fundamentally broken, > > Whereas I think that that claim is fundamentally broken in multiple ways. > >> and we should probably document that somewhere. > > I agree that *current* algorithmic behavior of parts of CPython on > typical *current* hardware should be documented not just 'somewhere' > (which I understand it is, in the Wiki) but in a CPython doc included > in the doc set distributed with each release. > > Perhaps someone or some group could write a HowTo on Programming with > CPython's Builtin Classes that would describe both the implementation > and performance and also the implications for coding style. In > particular, it could compare CPython's array lists and tuples to > singly linked lists (which are easily created in Python also). > > But such a document, after stating that array access may be thought of > as constant time on current hardware to a useful first approximation, > should also state that repeated sequential accesses may be *much* > faster than repeated random accesses. People in the high-performance > computing community are quite aware of this difference between > simplified lies and messy truth. Because of this, array algorithms are > (should be) written differently in Fortran and C because Fortran > stores arrays by columns and C by rows and because it is usually much > faster to access the next item than one far away. I don't understand what you're trying to say. Aahz didn't claim that random list element access was constant time, he said it was O(1) (and that it should be part of the Python spec that it is). -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: C++ - Python API
On Wednesday 01 September 2010, it occurred to Markus Kraus to exclaim: > So the feature overview: First, the obligatory things you don't want to hear: Have you had a look at similar efforts? A while ago, Aahz posted something very similar on this very list. You should be able to find it in any of the archives without too much trouble. The most prominent example of this is obviously Boost.Python. > For C++ classes: > - "translating" it into a python object How do you handle memory management ? > - complete reflexion (attributes and methods) of the c++ instance > - call c++ methods nearly directly from python > - method-overloading (native python doesn't support it (!)) > > Modules: > - the API allows to create hardcoded python modules without having > any knowledge about the python C-API > - Adding attributes to the module (long/char*/PyObject*) char*... Unicode? Somewhere? wchar_t* maybe, or std::wstring? No? Also -- double? (I'm just being pedantic now, at least double should be trivial to add) > > General: > -runs on any platform and doesn't need an installed python Which platforms did you test it on? Which compilers did you test? Are you sure your C++ is portable? > -runs in multithreaded environments (requires python > 2.3) How do you deal with the GIL? How do you handle calling to Python from multiple C++ threads? > -support for python 3.x > -no need of any python C-API knowledge (maybe for coding modules but > then only 2 or 3 functions) > -the project is a VC2010 one and there is also an example module + > class Again, have you tested other compilers? > If there is any interest in testing this or using this for your own > project, please post; in that case i'll release it now instead of > finishing the inheritance support before releasing it (this may take a > few days though). Just publish a bitbucket or github repository ;-) -- http://mail.python.org/mailman/listinfo/python-list
Re: parsing string into dict
Tim Arnold writes: > Hi, > I have a set of strings that are *basically* comma separated, but with > the exception that if a comma occurs inside curly braces it is not a > delimiter. Here's an example: > > [code=one, caption={My Analysis for \textbf{t}, Version 1}, continued] > > I'd like to parse that into a dictionary (note that 'continued' gets > the value 'true'): > {'code':'one', 'caption':'{My Analysis for \textbf{t}, Version > 1}','continued':'true'} > > I know and love pyparsing, but for this particular code I need to rely > only on the standard library (I'm running 2.7). Here's what I've got, > and it works. I wonder if there's a simpler way? > thanks, > --Tim Arnold > FWIW, here's how I would do it:

def parse_key(s, start):
    pos = start
    while s[pos] not in ",=]":
        pos += 1
    return s[start:pos].strip(), pos

def parse_value(s, start):
    pos, nesting = start, 0
    while nesting or s[pos] not in ",]":
        nesting += {"{": 1, "}": -1}.get(s[pos], 0)
        pos += 1
    return s[start:pos].strip(), pos

def parse_options(s):
    options, pos = {}, 0
    while s[pos] != "]":
        key, pos = parse_key(s, pos + 1)
        if s[pos] == "=":
            value, pos = parse_value(s, pos + 1)
        else:
            value = 'true'
        options[key] = value
    return options

test = r"[code=one, caption={My Analysis for \textbf{t}, Version 1}, continued]"  # raw string so \t isn't read as a tab

>>> parse_options(test)
{'caption': '{My Analysis for \\textbf{t}, Version 1}', 'code': 'one', 'continued': 'true'}

-- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
On 9/1/2010 11:40 AM, Aahz wrote: I think that any implementation that doesn't have O(1) for list element access is fundamentally broken, Whereas I think that that claim is fundamentally broken in multiple ways. and we should probably document that somewhere. I agree that *current* algorithmic behavior of parts of CPython on typical *current* hardware should be documented not just 'somewhere' (which I understand it is, in the Wiki) but in a CPython doc included in the doc set distributed with each release. Perhaps someone or some group could write a HowTo on Programming with CPython's Builtin Classes that would describe both the implementation and performance and also the implications for coding style. In particular, it could compare CPython's array lists and tuples to singly linked lists (which are easily created in Python also). But such a document, after stating that array access may be thought of as constant time on current hardware to a useful first approximation, should also state that repeated sequential accesses may be *much* faster than repeated random accesses. People in the high-performance computing community are quite aware of this difference between simplified lies and messy truth. Because of this, array algorithms are (should be) written differently in Fortran and C because Fortran stores arrays by columns and C by rows and because it is usually much faster to access the next item than one far away. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
C++ - Python API
Hi guys, I worked on this for several days (or even weeks?!) now, but I'm nearly finished with it: A complete C++ to Python API which allows you to use python as a scripting language for your C++ projects. Simple example:

--- python code ---
def greet( player ):
    print( "Hello player " + player.getName() + " !" )
---

--- c++ code ---
class CPlayer
{
    REGISTER_CLASS( CPlayer, CLASS_METHOD("getName", GetName) )
private:
    string m_Name;
public:
    CPlayer( string nName )
    {
        m_Name = nName;
        INITIALIZE("player");
    }
    string GetName( ) { return m_Name; }
};
---

If you call the python function (look into the example in the project to see how to do this) this results in (assume you have CPlayer("myPlayerName")) "Hello player myPlayerName!".

So the feature overview:

For C++ classes:
- "translating" it into a python object
- complete reflexion (attributes and methods) of the c++ instance
- call c++ methods nearly directly from python
- method-overloading (native python doesn't support it (!))

Modules:
- the API allows to create hardcoded python modules without having any knowledge about the python C-API
- adding attributes to the module (long/char*/PyObject*)

General:
- runs on any platform and doesn't need an installed python
- runs in multithreaded environments (requires python > 2.3)
- support for python 3.x
- no need of any python C-API knowledge (maybe for coding modules, but then only 2 or 3 functions)
- the project is a VC2010 one and there is also an example module + class

If there is any interest in testing this or using this for your own project, please post; in that case I'll release it now instead of finishing the inheritance support before releasing it (this may take a few days though). -- http://mail.python.org/mailman/listinfo/python-list
Re: Source code for itertools
On 1 sep, 06:30, Tim Roberts wrote: > vsoler wrote: > >On 31 ago, 04:42, Paul Rubin wrote: > >> vsoler writes: > >> > I was expecting an itertools.py file, but I don't see it in your list. > >> >> ./python3.1-3.1.2+20100829/Modules/itertoolsmodule.c > > >> looks promising. Lots of stdlib modules are written in C for speed or > >> access to system facilities. > > >Lawrence, Paul, > > >You seem to be running a utility I am not familiar with. Perhaps this > >is because I am using Windows, and most likely you are not. > > >How could I have found the answer in a windows environment? > > Did you take the time to understand what he did? It's not that hard to > figure out. He fetched the Python source code, unpacked it, then searched > for filenames that contained the string "itertools." > > The equivalent in Windows, after unpacking the source archive, would have > been: > dir /s *itertools* > -- > Tim Roberts, t...@probo.com > Providenza & Boekelheide, Inc. Thank you Tim, understood!!! -- http://mail.python.org/mailman/listinfo/python-list
Re: Source code for itertools
On 31 ago, 05:33, Rolando Espinoza La Fuente wrote: > On Mon, Aug 30, 2010 at 11:06 PM, vsoler wrote: > > On 31 ago, 04:42, Paul Rubin wrote: > >> vsoler writes: > >> > I was expecting an itertools.py file, but I don't see it in your list. > >> >> ./python3.1-3.1.2+20100829/Modules/itertoolsmodule.c > > >> looks promising. Lots of stdlib modules are written in C for speed or > >> access to system facilities. > > > Lawrence, Paul, > > > You seem to be running a utility I am not familiar with. Perhaps this > > is because I am using Windows, and most likely you are not. > > > How could I have found the answer in a windows environment? > > Hard question. They are using standard unix utilities. > > But you can find the source file of a python module within python: > > >>> import itertools > >>> print(itertools.__file__) > > /usr/lib/python2.6/lib-dynload/itertools.so > > Yours should point to a windows path. If the file ends with a ".py", > you can open the file > with any editor. If it ends with ".so" or something else, it is likely a > compiled module in C > and you should search in the source distribution, not the binary distribution. > > Hope it helps. > > Regards, > > Rolando Espinoza La Fuente www.insophia.com

Thank you Rolando for your contribution. Following your piece of advice I got:

>>> import itertools
>>> print(itertools.__file__)
Traceback (most recent call last):
  File "", line 1, in
    print(itertools.__file__)
AttributeError: 'module' object has no attribute '__file__'
>>>

So, I understand that the module is written in C. Vicente Soler -- http://mail.python.org/mailman/listinfo/python-list
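On Python 3 the same question can be answered without touching __file__ at all; this is a hedged sketch using importlib, not something from the thread itself:

```python
import importlib.util

# For C modules compiled into the interpreter, the module spec's origin
# is the string 'built-in' rather than a filesystem path.
spec = importlib.util.find_spec("itertools")
print(spec.origin)  # 'built-in' on CPython 3

# A pure-Python stdlib module reports its .py source file instead.
spec_py = importlib.util.find_spec("bisect")
print(spec_py.origin.endswith(".py"))  # True
```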
Installation problem: Python 2.6.6 (32-Bit) on Windows 7 (32-Bit)
Has anyone else had problems running the msi for Python 2.6.6 on Windows 7 Professional? If I don't check "Compile .py to byte code", the installer completes without error. Checking "Compile .py to byte code" causes the following to be displayed: "There is a problem with the windows installer package. A program run as part of setup did not complete as expected"

1. I have GB of disk space available.
2. I have admin privileges.
3. The MD5 checksum of the downloaded installer matches the MD5 checksum on python.org.
4. Run As Administrator is not available when I Shift-Right Click (probably because my login already has admin privileges).

I'm also having a similar issue with the PythonWin32 extensions installer on the same machine. -- http://mail.python.org/mailman/listinfo/python-list
Re: Dumb Stupid Question About List and String
2010/9/2 Alban Nona
> Hello Xavier, working great ! thank you very much ! :p
> Do you know by any chance if dictionnary can be sorted asthis:

Look at the sorted() global function in the Python API. ;]

Cheers,
Xav -- http://mail.python.org/mailman/listinfo/python-list
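Xavier's pointer in one concrete snippet (a sketch; the keys and values here are made up): a dict itself has no order to set, but its items can be sorted into a list, by key or by value.

```python
d = {'ELM002': 2, 'ELM001': 1, 'ELM003': 3}   # illustrative data

by_key = sorted(d.items())                                   # sort on the keys
by_value = sorted(d.items(), key=lambda kv: kv[1], reverse=True)  # sort on the values

print(by_key)
print(by_value)
```

sorted() always returns a new list of (key, value) pairs and leaves the dict itself untouched.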
Re: Windows vs. file.read
On Sep 1, 12:31 pm, MRAB wrote: > You should open the files in binary mode, not text mode, ie file(path, > "rb"). Text mode is the default. Not a problem on *nix because the line > ending is newline. Thanks. That was it. -- http://mail.python.org/mailman/listinfo/python-list
PyPy and RPython
Is there a plan to adopt PyPy and RPython under the Python foundation in an attempt to standardize both? I have been watching PyPy and RPython evolve over the years. PyPy seems to have momentum and is rapidly gaining followers and performance. The PyPy JIT and its performance would be a good thing for the Python community, and it seems to be well ahead of Unladen Swallow in performance and in a position to improve quite a bit. Secondly, I have always fantasized about never having to write C code yet getting its compiled performance. With RPython (a strict subset of Python), I can actually compile it to C/machine code. These two seem like spectacular advantages for Python to pick up on, and all it would take is the PyPy project and the Python foundation showing the support and direction to adopt them. Yet I see this forum relatively quiet on PyPy and RPython. Any reasons? Sarvi -- http://mail.python.org/mailman/listinfo/python-list
parsing string into dict
Hi, I have a set of strings that are *basically* comma separated, but with the exception that if a comma occurs inside curly braces it is not a delimiter. Here's an example:

[code=one, caption={My Analysis for \textbf{t}, Version 1}, continued]

I'd like to parse that into a dictionary (note that 'continued' gets the value 'true'):

{'code':'one', 'caption':'{My Analysis for \textbf{t}, Version 1}','continued':'true'}

I know and love pyparsing, but for this particular code I need to rely only on the standard library (I'm running 2.7). Here's what I've got, and it works. I wonder if there's a simpler way?

thanks, --Tim Arnold

The 'line' is like my example above but it comes in without the ending bracket, so I append one on the 6th line.

def parse_options(line):
    options = dict()
    if not line:
        return options
    active = ['[', '=', ',', '{', '}', ']']
    line += ']'
    key = ''
    word = ''
    inner = 0
    for c in list(line):
        if c in active:
            if c == '{':
                inner += 1
            elif c == '}':
                inner -= 1
            if inner:
                word += c
            else:
                if c == '=':
                    (key, word) = (word, '')
                    options[key.strip()] = True
                elif c in [',', ']']:
                    if not key:
                        options[word.strip()] = True
                    else:
                        options[key.strip()] = word.strip()
                    (key, word) = (False, '')
        else:
            word += c
    return options
-- http://mail.python.org/mailman/listinfo/python-list
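One shorter way to do the same job, offered as a sketch (it is not from the thread): first split the line on commas at brace depth zero, then partition each piece on '='. The helper names are made up.

```python
def split_top_level(s, sep=','):
    # Split on `sep` only where we are outside all curly braces (depth counting).
    parts, depth, start = [], 0, 0
    for i, c in enumerate(s):
        if c == '{':
            depth += 1
        elif c == '}':
            depth -= 1
        elif c == sep and depth == 0:
            parts.append(s[start:i])
            start = i + 1
    parts.append(s[start:])
    return parts

def parse_options(line):
    # Accepts the line with or without the closing bracket, like the original.
    line = line.strip().lstrip('[').rstrip(']')
    options = {}
    for part in split_top_level(line):
        key, sep, value = part.partition('=')
        # A bare word (no '=') becomes a True flag, as in the original code.
        options[key.strip()] = value.strip() if sep else True
    return options

print(parse_options(r'[code=one, caption={My Analysis for \textbf{t}, Version 1}, continued'))
```

Braces are kept in the value (so the caption keeps its surrounding {}), and nesting like \textbf{t} is handled by the depth counter.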
Re: Fibonacci: How to think recursively
The most straightforward method would be to apply the formula directly. Loop on j, computing Fj along the way:

if n <= 1:
    return n
Fold = 0
Fnew = 1
for j in range(2, n + 1):
    Fold, Fnew = Fnew, Fold + Fnew
return Fnew

(The loop has to include n itself, hence range(2, n + 1).) Even simpler:

return round(((1 + sqrt(5.)) / 2)**n / sqrt(5.))
-- http://mail.python.org/mailman/listinfo/python-list
Re: Windows vs. file.read
On Wed, Sep 1, 2010 at 1:03 PM, Mike wrote: > I have a ppm file that python 2.5 on Windows XP cannot read > completely. > Python on linux can read the file with no problem > Python on Windows can read similar files. > I've placed test code and data here: > http://www.cs.ndsu.nodak.edu/~hennebry/ppm_test.zip > Within the directory ppm_test, type > python ppm_test.py > The chunk size commentary occurs only if file.read cannot read enough > bytes. > The commentary only occurs for the last file. > Any ideas? > Any ideas that don't require getting rid of Windows? > It's not my option. Open the files in binary mode. i.e., x=Ppm(file("ff48x32.ppm",'rb')) x=Ppm(file("bw48x32.ppm",'rb')) x=Ppm(file("bisonfootball.ppm",'rb')) You were just lucky on the first two files. -- http://mail.python.org/mailman/listinfo/python-list
Re: Windows vs. file.read
On 01/09/2010 18:03, Mike wrote: I have a ppm file that python 2.5 on Windows XP cannot read completely. Python on linux can read the file with no problem Python on Windows can read similar files. I've placed test code and data here: http://www.cs.ndsu.nodak.edu/~hennebry/ppm_test.zip Within the directory ppm_test, type python ppm_test.py The chunk size commentary occurs only if file.read cannot read enough bytes. The commentary only occurs for the last file. Any ideas? Any ideas that don't require getting rid of Windows? It's not my option. You should open the files in binary mode, not text mode, ie file(path, "rb"). Text mode is the default. Not a problem on *nix because the line ending is newline. -- http://mail.python.org/mailman/listinfo/python-list
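The advice above can be checked with a small self-contained round trip; the payload bytes here are made up to include exactly the characters Windows text mode mangles:

```python
import os
import tempfile

# Bytes that text mode can mangle on Windows: b'\x1a' (Ctrl-Z, treated as
# end-of-file on text-mode reads) and b'\r\n' (collapsed to b'\n').
payload = b'P6\n2 2\n255\n\x1a\r\n\x00\xff\x00\xff'

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, 'wb') as f:
        f.write(payload)
    with open(path, 'rb') as f:      # 'rb', not 'r'
        data = f.read()
    assert data == payload           # binary mode is byte-exact on every platform
finally:
    os.remove(path)
```

Reading the same file in text mode on Windows would stop at the Ctrl-Z byte, which is why the ppm read appeared truncated only there.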
Re: Dumb Stupid Question About List and String
On 01/09/2010 17:49, Alban Nona wrote:
> Hello Xavier, Thank you :) Well what I am trying to generate is that kind of result:
>
> listn1 = ['ELM001_DIF', 'ELM001_SPC', 'ELM001_RFL', 'ELM001_SSS', 'ELM001_REFR', 'ELM001_ALB', 'ELM001_AMB', 'ELM001_NRM', 'ELM001_MVE', 'ELM001_DPF', 'ELM001_SDW', 'ELM001_MAT', 'ELM001_WPP']
>
> listn2 = ['ELM002_DIF', 'ELM002_SPC', 'ELM002_RFL', 'ELM002_SSS', 'ELM002_REFR', 'ELM002_ALB', 'ELM002_AMB', 'ELM002_NRM', 'ELM002_MVE', 'ELM002_DPF', 'ELM002_SDW', 'ELM002_MAT', 'ELM002_WPP']
>
> etc...
>
> The thing is, the first list will be generated automatically. (so there will be unknow versions of ELM00x) that why Im trying to figure out how to genere variable and list in an automatic way. Can you tell me if its not clear please ? :P my english still need improvement when Im trying to explain scripting things.
[snip]

Create a dict in which the key is the "ELM" part and the value is a list of those entries which begin with that "ELM" part. For example, if the entry is 'ELM001_DIF' then the key is 'ELM001', which is the first 6 characters of entry, or entry[:6]. Something like this:

elem_dict = {}
for entry in list_of_entries:
    key = entry[:6]
    if key in elem_dict:
        elem_dict[key].append(entry)
    else:
        elem_dict[key] = [entry]
-- http://mail.python.org/mailman/listinfo/python-list
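The same grouping comes out a little shorter with collections.defaultdict, and splitting on the underscore avoids hard-coding the prefix length; a sketch using entry names from the thread:

```python
from collections import defaultdict

entries = ['ELM001_DIF', 'ELM001_SPC', 'ELM002_DIF', 'ELM002_SPC']

groups = defaultdict(list)               # missing keys start out as empty lists
for entry in entries:
    prefix = entry.partition('_')[0]     # 'ELM001', 'ELM002', ...
    groups[prefix].append(entry)

print(dict(groups))
```

defaultdict(list) removes the `if key in ... else ...` branch entirely: the first append for a new prefix creates its list.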
Re: Dumb Stupid Question About List and String
On 2 September 2010 02:49, Alban Nona wrote: > Well what Iam trying to generate is that kind of result: > > listn1=['ELM001_DIF', 'ELM001_SPC', 'ELM001_RFL', 'ELM001_SSS', > 'ELM001_REFR', 'ELM001_ALB', 'ELM001_AMB', 'ELM001_NRM', 'ELM001_MVE', > 'ELM001_DPF', 'ELM001_SDW', 'ELM001_MAT', 'ELM001_WPP'] > > listn2 = ['ELM002_DIF', 'ELM002_SPC', 'ELM002_RFL', 'ELM002_SSS', > 'ELM002_REFR', 'ELM002_ALB', 'ELM002_AMB', 'ELM002_NRM', 'ELM002_MVE', > 'ELM002_DPF', 'ELM002_SDW', 'ELM002_MAT', 'ELM002_WPP'] > > etc... > Have a look at http://www.ideone.com/zlBeB . I took some liberty and renamed some of your variables. I wanted to show you what I (personally) think as good practices in python, from naming conventions to how to use the list and dictionary, and so on. Also, 4-spaces indent. I noticed you have 5 for some reason, but that's none of my business now. I hope my comments explain what they do, and why they are that way. > The thing is, the first list will be generated automatically. (so there > will be unknow versions of ELM00x) > that why Im trying to figure out how to genere variable and list in an > automatic way. > Yes, that's totally possible. See range() (and xrange(), possibly) in the Python API. -- http://mail.python.org/mailman/listinfo/python-list
Windows vs. file.read
I have a ppm file that python 2.5 on Windows XP cannot read completely. Python on linux can read the file with no problem Python on Windows can read similar files. I've placed test code and data here: http://www.cs.ndsu.nodak.edu/~hennebry/ppm_test.zip Within the directory ppm_test, type python ppm_test.py The chunk size commentary occurs only if file.read cannot read enough bytes. The commentary only occurs for the last file. Any ideas? Any ideas that don't require getting rid of Windows? It's not my option. -- http://mail.python.org/mailman/listinfo/python-list
scp with paramiko
Hi There, I want to download a file from a client using paramiko. I found plenty of resources using Google on how to send a file, but none that describe how to download files from a client. Help would be appreciated! Thanks a lot! Ron -- http://mail.python.org/mailman/listinfo/python-list
Re: Optimising literals away
On 01/09/2010 14:25, Lie Ryan wrote: On 09/01/10 17:06, Stefan Behnel wrote: MRAB, 31.08.2010 23:53: On 31/08/2010 21:18, Terry Reedy wrote: On 8/31/2010 12:33 PM, Aleksey wrote: On Aug 30, 10:38 pm, Tobias Weber wrote: Hi, whenever I type an "object literal" I'm unsure what optimisation will do to it. Optimizations are generally implentation dependent. CPython currently creates numbers, strings, and tuple literals just once. Mutable literals must be created each time as they may be bound and saved. def m(arg): if arg& set([1,2,3]): set() is a function call, not a literal. When m is called, who knows what 'set' will be bound to? In Py3, at least, you could write {1,2,3}, which is much faster as it avoids creating and deleting a list. On my machine, .35 versus .88 usec. Even then, it must be calculated each time because sets are mutable and could be returned to the calling code. There's still the possibility of some optimisation. If the resulting set is never stored anywhere (bound to a name, for example) then it could be created once. When the expression is evaluated there could be a check so see whether 'set' is bound to the built-in class, and, if it is, then just use the pre-created set. What if the set is mutated by the function? That will modify the global cache of the set; one way to prevent mutation is to use frozenset, but from the back of my mind, I think there was a discussion that rejects set literals producing a frozen set instead of regular set. [snip] I was talking about a use case like the example code, where the set is created, checked, and then discarded. -- http://mail.python.org/mailman/listinfo/python-list
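For what it's worth, recent CPython already performs a version of the optimisation discussed above for membership tests: a set literal on the right-hand side of `in` is compiled to a frozenset constant, built once at compile time (so mutation is impossible and the cache is safe). A quick way to see it, checked on CPython 3; other implementations may differ:

```python
def m(arg):
    return arg in {1, 2, 3}

# On recent CPython the peephole optimizer stores the literal as a
# frozenset constant in the function's code object.
print(any(isinstance(c, frozenset) for c in m.__code__.co_consts))
```

This only applies when the set cannot escape, exactly the "never stored anywhere" case described above; `s = {1, 2, 3}` as a statement still builds a fresh set each time.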
Re: Newby Needs Help with Python code
Nally Kaunda-Bukenya wrote:
> I hope someone can help me. I am new to Python and trying to achive the
> following:
> 1) I would like to populate the Tot_Ouf_Area field with total area of
> each unique outfall_id (code attempted below, but Tot_Ouf_Area not
> populating)
> 2) I would also like to get the user input of Rv (each landuse type will
> have a specific Rv value). For example the program should ask the user
> for Rv value of Low Density Residential (user enters 0.4 in example
> below and that value must be stored in the Rv field), and so on as shown
> in the 2nd table below…

I don't know arcgis, so the following is just guesswork. I iterate over the Outfalls_ND table twice, the first time to calculate the sums per OUTFALL_ID and put them into a dict. With the second pass the Tot_Outf_Area column is updated:

import arcgisscripting

def rows(cur):
    while True:
        row = cur.Next()
        if row is None:
            break
        yield row

gp = arcgisscripting.create()
gp.Workspace = "C:\\NPDES\\NPDES_PYTHON.mdb"

TABLE = "Outfalls_ND"
GROUP = "OUTFALL_ID"
SUM = "AREA_ACRES"
TOTAL = "Tot_Outf_Area"

aggregate = {}
cur = gp.UpdateCursor(TABLE)
for row in rows(cur):
    group = row.GetValue(GROUP)
    amount = row.GetValue(SUM)
    aggregate[group] = aggregate.get(group, 0.0) + amount

cur = gp.UpdateCursor(TABLE)
for row in rows(cur):
    group = row.GetValue(GROUP)
    row.SetValue(TOTAL, aggregate[group])
    cur.UpdateRow(row)

As this is written into the blue it is unlikely that it runs successfully without changes. Just try and report back the results.

Peter -- http://mail.python.org/mailman/listinfo/python-list
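The two-pass pattern itself can be tried without arcgis at all; here it is with plain dictionaries standing in for the table rows (field names copied from the post, data values made up):

```python
# Pass 1: sum AREA_ACRES per OUTFALL_ID; pass 2: write the total back to rows.
table = [
    {'OUTFALL_ID': 'A', 'AREA_ACRES': 1.5},
    {'OUTFALL_ID': 'A', 'AREA_ACRES': 2.0},
    {'OUTFALL_ID': 'B', 'AREA_ACRES': 0.5},
]

totals = {}
for row in table:
    key = row['OUTFALL_ID']
    totals[key] = totals.get(key, 0.0) + row['AREA_ACRES']

for row in table:
    row['Tot_Outf_Area'] = totals[row['OUTFALL_ID']]

print(totals)
```

The arcgis version above is the same shape, with GetValue/SetValue/UpdateRow replacing the dict accesses.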
Re: Dumb Stupid Question About List and String
Hello Xavier, Thank you :)

Well what I am trying to generate is that kind of result:

listn1 = ['ELM001_DIF', 'ELM001_SPC', 'ELM001_RFL', 'ELM001_SSS', 'ELM001_REFR', 'ELM001_ALB', 'ELM001_AMB', 'ELM001_NRM', 'ELM001_MVE', 'ELM001_DPF', 'ELM001_SDW', 'ELM001_MAT', 'ELM001_WPP']

listn2 = ['ELM002_DIF', 'ELM002_SPC', 'ELM002_RFL', 'ELM002_SSS', 'ELM002_REFR', 'ELM002_ALB', 'ELM002_AMB', 'ELM002_NRM', 'ELM002_MVE', 'ELM002_DPF', 'ELM002_SDW', 'ELM002_MAT', 'ELM002_WPP']

etc...

The thing is, the first list will be generated automatically (so there will be unknown versions of ELM00x); that's why I'm trying to figure out how to generate variables and lists in an automatic way. Can you tell me if it's not clear please? :P My English still needs improvement when I'm trying to explain scripting things.

2010/9/1 Xavier Ho
> On 2 September 2010 01:11, Alban Nona wrote:
> > Hello,
> >
> > seems to have the same error with python.
> > In fact I was coding within nuke, a 2d compositing software (not the best)
> > unfortunately, I dont see how I can use dictionnary to do what I would
> > like to do.
>
> Hello Alban,
>
> The reason it's printing only the ELM004 elements is because the variable,
> first, is 'ELM004' when your code goes to line 29.
>
> I noticed you're using variables created from the for loop out of its block
> as well. Personally I wouldn't recommend it as good practice. There are ways
> around it.
>
> Could you explain briefly what you want to achieve with this program?
> What's the desired sample output?
>
> Cheers,
> Xav
-- http://mail.python.org/mailman/listinfo/python-list
Re: Dumb Stupid Question About List and String
On 2 September 2010 01:11, Alban Nona wrote: > Hello, > > seems to have the same error with python. > In fact I was coding within nuke, a 2d compositing software (not the best) > unfortunately, I dont see how I can use dictionnary to do what I would like > to do. > Hello Alban, The reason it's printing only the ELM004 elements is because the variable, first, is 'ELM004' when your code goes to line 29. I noticed you're using variables created from the for loop out of its block as well. Personally I wouldn't recommend it as good practice. There are ways around it. Could you explain briefly what you want to achieve with this program? What's the desired sample output? Cheers, Xav -- http://mail.python.org/mailman/listinfo/python-list
Better multiprocessing and data persistence with C level serialisation
I was thinking about this for a while. Owing to a lack of forking or START/STOP signals, all process interchange in CPython requires serialisation, usually pickling. But what if that could be done within the interpreter core instead of by the script, creating a complete internal representation that can then be read by the child interpreter. Any comments/ideas/suggestions? -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
Aahz, 01.09.2010 17:40: I still think that making a full set of algorithmic guarantees is a Bad Idea, but I think that any implementation that doesn't have O(1) for list element access is fundamentally broken, and we should probably document that somewhere. +1 Stefan -- http://mail.python.org/mailman/listinfo/python-list
DeprecationWarning
Hi There, I would like to create an scp handle and download a file from a client. I have the following code:

import sys, os, paramiko, time
from attachment import SCPClient

transport = paramiko.Transport((prgIP, 22))
try:
    transport.connect(username='root', password=prgPass)
except IOError:
    print "Transport connect timed out"
    writelog(" Transport connect timed out. \n")
    sys.exit()

scp = SCPClient(transport)
writelog("Succesfully created scp transport handle to get P-file \n")

# Create './PRGfiles' if it does not exist.
if not os.access('./PRGfiles', os.F_OK):
    os.mkdir('./PRGfiles')

try:
    scp.get("/usr/share/NovaxTSP/P0086_2003.xml", "./PRGfiles/P0086_2003.xml")
    writelog("succesfully downloaded P-file \n")
except IOError:
    writelog("Downloading P-file failed. \n")

but what I'm getting is this and no file is downloaded:

/opt/lampp/cgi-bin/attachment.py:243: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  chan.send('\x01'+e.message)
09/01/2010 08:53:56 : Downloading P-file failed.

What does that mean and how do I resolve this? Thank you! Ron -- http://mail.python.org/mailman/listinfo/python-list
Re: [ANN] git peer-to-peer bittorrent experiment: first milestone reached
Luke Kenneth Casson Leighton, 01.09.2010 17:14: this is to let people know that a first milestone has been reached in an experiment to combine git with a file-sharing protocol, thus making it possible to use git for truly distributed software development Basically, BitTorrent only works well when there are enough people who share a common interest at the same time. Why would you think that is the case for software development, and what minimum project size would you consider reasonable to make this tool a valid choice? If you're more like targeting in-house development, it could become a little boring to be the first who arrives in the morning ... Stefan -- http://mail.python.org/mailman/listinfo/python-list
Re: Performance: sets vs dicts.
In article , Jerry Hill wrote: >On Tue, Aug 31, 2010 at 10:09 AM, Aahz wrote: >> >> I suggest that we should agree on these guarantees and document them in >> the core. > >I can't get to the online python-dev archives from work (stupid >filter!) so I can't give you a link to the archives, but the original >thread that resulted in the creation of that wiki page was started on >March 9th, 2008 and was titled "Complexity documentation request". http://mail.python.org/pipermail/python-dev/2008-March/077499.html >At the time, opposition to formally documenting this seemed pretty >widespread, including from yourself and Guido. You've obviously >changed your mind on the subject, so maybe it's something that would >be worth revisiting, assuming someone wants to write the doc change. Looking back at that thread, it's less that I've changed my mind as that I've gotten a bit more nuanced. I still think that making a full set of algorithmic guarantees is a Bad Idea, but I think that any implementation that doesn't have O(1) for list element access is fundamentally broken, and we should probably document that somewhere. -- Aahz (a...@pythoncraft.com) <*> http://www.pythoncraft.com/ "...if I were on life-support, I'd rather have it run by a Gameboy than a Windows box." --Cliff Wells -- http://mail.python.org/mailman/listinfo/python-list
[ANN] git peer-to-peer bittorrent experiment: first milestone reached
http://gitorious.org/python-libbittorrent/pybtlib this is to let people know that a first milestone has been reached in an experiment to combine git with a file-sharing protocol, thus making it possible to use git for truly distributed software development and other file-revision-management operations (such as transparently turning git-configured ikiwiki and moinmoin wikis into peer-to-peer ones). the milestone reached is to transfer git commit "pack objects", as if they were ordinary files, over a bittorrent network, and have them "unpacked" at the far end. the significance of being able to transfer git commit pack objects is that this is the core of the "git fetch" command. the core of this experiment comprises a python-based VFS layer, providing alternatives to os.listdir, os.path.exists, open and so on - sufficient to make an interesting experiment itself by combining that VFS layer with e.g. python-fuse. the bittornado library, also available at the above URL, has been modified to take a VFS module as an argument to all operations, such that it would be conceivable to share maildir mailboxes, mailing list archives, .tar.gz archives, .deb and .rpm archives and so on, as if they were files and directories within a file-sharing network. as the core code has only existed for under three days, and is only 400 lines long, there are rough edges: * all existing commit objects are unpacked at startup time and are stored in-memory (!). this is done so as to avoid significant modification of the bittorrent library, which will be required. * all transferred commit objects are again stored in-memory before being unpacked. so, killing the client will lose all transfers received up to that point. on the roadmap: * make things efficient! requires modification of the bittornado library. * create some documentation! * explore how to make git use this code as a new URI type so that it will be possible to just do "git pull" * explore how to use PGP/GPG to sign commits(?) 
or perhaps just tags(?) in order to allow commits to be pulled only from trusted parties. * share all branches and tags as well as just refs/heads/* * make "git push" re-create the .torrent (make_torrent.py) and work out how to notify seeders of a new HEAD (name the torrent after the HEAD ref, and just create a new one rather than delete the old?) so there is quite a bit to do, with the priority being on making a new URI type and a new "git-remote-{URI}" command, so that this becomes actually useable rather than just an experiment, and the project can be self-hosting as a truly distributed peer-to-peer development effort. if anyone would like to assist, you only have to ask and (ironically) i will happily grant access to the gitorious-hosted repository. if anyone would like to sponsor this project, that would be very timely, as if i don't get some money soon i will be unable to pay for food and rent. l. -- http://mail.python.org/mailman/listinfo/python-list
Re: Dumb Stupid Question About List and String
Hello,

seems to have the same error with python. In fact I was coding within nuke, a 2d compositing software (not the best); unfortunately, I don't see how I can use a dictionary to do what I would like to do.

2010/9/1 Xavier Ho
> On 2 September 2010 00:47, Alban Nona wrote:
> > Hello,
> >
> > So I figure out this night how to create automatically varibales via
> > vars(), the script seems to work, exept that where it should give me a list
> > like :
> > [ELM004_DIF, ELM004_SPC, ELM004_RFL, ELM004_SSS, ELM004_REFR, ELM004_ALB,
> > etc...] it gave me just one entry in my list, and the last one [ELM004_WPP]
> > Any Ideas why that please ?
> >
> > http://pastebin.com/7CDbVgdD
>
> Some comments:
>
> 1) Avoid overwriting global functions like list as a variable name. If you
> do that, you won't be able to use list() later in your code, and nor can
> anyone else who imports your code.
> 2) I'm a bit iffy about automatic variable generations. Why not just use a
> dictionary? What do others on comp.lang.python think?
> 3) I'm getting an error from your code, and it doesn't match with what you
> seem to get:
>
> # output
> ELM004_DIF
> ELM004_SPC
> ELM004_RFL
> ELM004_SSS
> ELM004_REFR
> ELM004_ALB
> ELM004_AMB
> ELM004_NRM
> ELM004_MVE
> ELM004_DPF
> ELM004_SDW
> ELM004_MAT
> ELM004_WPP
> Traceback (most recent call last):
>   File "Test.py", line 33, in
>     print ELM001
> NameError: name 'ELM001' is not defined
>
> Did you get any compiler errors? I'm using Python 2.7
>
> Cheers,
> Xav
-- http://mail.python.org/mailman/listinfo/python-list
Re: Dumb Stupid Question About List and String
On 2 September 2010 00:47, Alban Nona wrote:
> Hello,
>
> So I figure out this night how to create automatically varibales via
> vars(), the script seems to work, exept that where it should give me a list
> like :
> [ELM004_DIF, ELM004_SPC, ELM004_RFL, ELM004_SSS, ELM004_REFR, ELM004_ALB,
> etc...] it gave me just one entry in my list, and the last one [ELM004_WPP]
> Any Ideas why that please ?
>
> http://pastebin.com/7CDbVgdD

Some comments:

1) Avoid overwriting global functions like list as a variable name. If you do that, you won't be able to use list() later in your code, and nor can anyone else who imports your code.
2) I'm a bit iffy about automatic variable generations. Why not just use a dictionary? What do others on comp.lang.python think?
3) I'm getting an error from your code, and it doesn't match with what you seem to get:

# output
ELM004_DIF
ELM004_SPC
ELM004_RFL
ELM004_SSS
ELM004_REFR
ELM004_ALB
ELM004_AMB
ELM004_NRM
ELM004_MVE
ELM004_DPF
ELM004_SDW
ELM004_MAT
ELM004_WPP
Traceback (most recent call last):
  File "Test.py", line 33, in
    print ELM001
NameError: name 'ELM001' is not defined

Did you get any compiler errors? I'm using Python 2.7

Cheers,
Xav -- http://mail.python.org/mailman/listinfo/python-list
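On the "why not just use a dictionary" point: here is a sketch of generating those per-element lists into a single dict keyed by prefix, instead of creating listn1, listn2, ... variables dynamically with vars(). The ELM numbers chosen here are arbitrary stand-ins for whatever turns up:

```python
suffixes = ['DIF', 'SPC', 'RFL', 'SSS', 'REFR', 'ALB', 'AMB',
            'NRM', 'MVE', 'DPF', 'SDW', 'MAT', 'WPP']

passes = {}                          # one dict instead of listn1, listn2, ... variables
for i in (1, 2, 4):                  # whichever ELM numbers are discovered at runtime
    prefix = 'ELM%03d' % i
    passes[prefix] = ['%s_%s' % (prefix, s) for s in suffixes]

print(passes['ELM001'][:3])
```

Any "variable name" then becomes a dict lookup (passes['ELM004']), which also makes it easy to iterate over all elements or test whether one exists.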
Re: Dumb Stupid Question About List and String
Hello, So I figure out this night how to create automatically varibales via vars(), the script seems to work, exept that where it should give me a list like : [ELM004_DIF,ELM004_SPC,ELM004_RFL,ELM004_SSS, ELM004_REFR, ELM004_ALB, etc...] it gave me just one entry in my list, and the last one [ELM004_WPP] Any Ideas why that please ? http://pastebin.com/7CDbVgdD 2010/9/1 Xavier Ho > On 1 September 2010 12:00, Alban Nona wrote: > >> @Xavier: ShaDoW, WorldPositionPoint (which is the same thing as >> WordPointCloud passe) :) >> > > Aha! That's what I was missing. > > Cheers, > Xav > -- http://mail.python.org/mailman/listinfo/python-list
Re: Fibonacci: How to think recursively
On 2010-09-01, Albert van der Horst wrote:
> [Didn't you mean: I don't understand what you mean by
> overlapping recursions? You're right about the base case, so
> clearly the OP uses some confusing terminology.]
>
> I see a problem with overlapping recursions. Unless automatic
> memoizing is on, they are unduly inefficient, as each call
> splits into two calls.
>
> If one insists on recursion (untested code, just for the idea):
>
> def fib2( n ):
>     ' return #rabbits last year, #rabbits before last '
>     if n ==1 :
>         return (1,1)
>     else:
>         penult, ult = fib2( n-1 )
>         return ( ult, ult+penult)
>
> def fub( n ):
>     return fib2(n)[1]
>
> Try fib and fub for largish numbers (>1000) and you'll feel the
> problem.

There are standard tricks for converting a recursive iteration into a tail-recursive one. It's usually done by adding the necessary parameters, e.g.:

def fibr(n):
    def fib_helper(fibminus2, fibminus1, i, n):
        if i == n:
            return fibminus2 + fibminus1
        else:
            return fib_helper(fibminus1, fibminus1 + fibminus2, i+1, n)
    if n < 2:
        return 1
    else:
        return fib_helper(1, 1, 2, n)

Once you've got a tail-recursive solution, you can usually convert it to loop iteration for languages like Python that favor them. The need for a temporary messed me up.

def fibi(n):
    if n < 2:
        return 1
    else:
        fibminus2 = 1
        fibminus1 = 1
        i = 2
        while i < n:
            fibminus2, fibminus1 = fibminus1, fibminus2 + fibminus1
            i += 1
        return fibminus2 + fibminus1

It's interesting that the loop iterative solution is, for me, harder to think up without doing the tail-recursive one first. -- Neil Cerutti -- http://mail.python.org/mailman/listinfo/python-list
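A third route, since the thread's complaint is about overlapping recursive calls: keep the naive two-branch recursion but memoize it, for example with functools.lru_cache (Python 3.2+), which collapses the exponential call tree to linear. This sketch uses the same indexing as fibr/fibi above, with fib(0) == fib(1) == 1:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Each fib(k) is computed once and cached, so the two-branch
    # recursion no longer re-solves overlapping subproblems.
    if n < 2:
        return 1
    return fib(n - 1) + fib(n - 2)

print(fib(30))   # 1346269
```

Unlike the tail-recursive rewrite, this keeps the "how to think recursively" shape of the definition intact; the only remaining limit is Python's recursion depth for very large n.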
Re: fairly urgent request: paid python (or other) work required
lkcl writes: > i apologise for having to contact so many people but this is fairly > urgent, and i'm running out of time and options. […] I sympathise with your situation; work for skilled practicioners is scarce in many places right now. For that reason, many people are likely to be in your position. For the sake of keeping this forum habitable, I have to point out to anyone reading: It's not cool to post requests for work here. There are, as you noted, other appropriate forums for that, of which this is not one. I wish you success in finding gainful work, but all readers should please note that this is *not* the place to look for it. -- \ “Sittin' on the fence, that's a dangerous course / You can even | `\ catch a bullet from the peace-keeping force” —Dire Straits, | _o__) _Once Upon A Time In The West_ | Ben Finney -- http://mail.python.org/mailman/listinfo/python-list
Re: Reversing a List
Victor Subervi wrote:
> Hi;
> I have this code:
>
> cursor.execute('describe products;')
> cols = [item[0] for item in cursor]
> cols = cols.reverse()
> cols.append('Delete')
> cols = cols.reverse()
>
> Unfortunately, the list doesn't reverse. If I print cols after the first
> reverse(), it prints None. Please advise. Also, is there a way to append to
> the front of the list directly?
> TIA,
> beno

The reverse() method reverses that cols object just fine, in place. Unfortunately, you immediately assign it a new value of None. Just remove the cols= and it'll work fine. If you want to understand the problem better, read up on reverse() and reversed(). They're very different.

In answer to your second question, you could combine the last three lines as:

cols.insert(0, 'Delete')

DaveA -- http://mail.python.org/mailman/listinfo/python-list
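The points made in this thread, condensed into one runnable snippet (the column names are made up):

```python
cols = ['a', 'b', 'c']

assert cols.reverse() is None            # in-place: the return value is None
assert cols == ['c', 'b', 'a']

cols.insert(0, 'Delete')                 # prepend directly to the front
assert cols == ['Delete', 'c', 'b', 'a']

assert list(reversed(cols)) == ['a', 'b', 'c', 'Delete']  # reversed() returns a new iterator
assert cols == ['Delete', 'c', 'b', 'a']                  # ...and leaves the list untouched
```

So `cols = cols.reverse()` throws the list away, while `cols.reverse()` alone, or `reversed(cols)` wrapped in list(), does what was intended.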
Re: Performance: sets vs dicts.
Lie Ryan, 01.09.2010 15:46: On 09/01/10 00:09, Aahz wrote: However, I think there are some rock-bottom basic guarantees we can make regardless of implementation. Does anyone seriously think that an implementation would be accepted that had anything other than O(1) for index access into tuples and lists? Dicts that were not O(1) for access with non-pathological hashing? That we would accept sets having O() performance worse than dicts? I suggest that we should agree on these guarantees and document them in the core. While I think documenting them would be great for all programmers that care about practical and theoretical execution speed; I think including these implementation details in core documentation as a "guarantee" would be a bad idea for the reasons Terry outlined. One way of resolving that is by having two documentations (or two separate sections in the documentation) for: - Python -- the language -- documenting Python as an abstract language, this is the documentation which can be shared across all Python implementations. This will also be the specification for Python Language which other implementations will be measured to. - CPython -- the Python interpreter -- documents implementation details and performance metrics. It should be properly noted that these are not part of the language per se. This will be the playground for CPython experts that need to fine tune their applications to the last drop of blood and don't mind their application going nuts with the next release of CPython. I disagree. I think putting the "obvious" guarantees right into the normal documentation will actually make programmers aware that there *are* different implementations (and differences between implementations), simply because it wouldn't just say "O(1)" but "the CPython implementation of this method has an algorithmic complexity of O(1), other Python implementations are known to perform alike at the time of this writing". 
Maybe without the last half of the sentence if we really don't know how other implementations work here, or if we expect that there may well be a reason they may choose to behave different, but in most cases, it shouldn't be hard to make that complete statement. After all, we basically know what other implementations there are, and we also know that they tend to match the algorithmic complexities at least for the major builtin types. It seems quite clear to me as a developer that the set of builtin types and "collections" types was chosen in order to cover a certain set of algorithmic complexities and not just arbitrary interfaces. Stefan -- http://mail.python.org/mailman/listinfo/python-list
Re: Reversing a List
On Wed, Sep 1, 2010 at 6:45 PM, Matt Saxton wrote:
> On Wed, 1 Sep 2010 09:00:03 -0400 Victor Subervi wrote:
> > Hi;
> > I have this code:
> >
> > cursor.execute('describe products;')
> > cols = [item[0] for item in cursor]
> > cols = cols.reverse()
> > cols.append('Delete')
> > cols = cols.reverse()
> >
> > Unfortunately, the list doesn't reverse. If I print cols after the first
> > reverse(), it prints None. Please advise.
>
> The reverse() method modifies the list in place, but returns None, so just use
> >>> cols.reverse()
> rather than
> >>> cols = cols.reverse()

Alternatively you can do:

>>> cols = reversed(cols)

(note that reversed() returns an iterator, not a list)

> > Also, is there a way to append to the front of the list directly?
> > TIA,
> > beno
>
> The insert() method can do this, i.e.
> >>> cols.insert(0, 'Delete')
>
> --
> Matt Saxton

-- ~l0nwlf -- http://mail.python.org/mailman/listinfo/python-list
Re: Reversing a List
On Wed, Sep 1, 2010 at 9:17 AM, Shashank Singh < shashank.sunny.si...@gmail.com> wrote: > reverse reverses in-place > > >>> l = [1, 2, 3] > >>> r = l.reverse() > >>> r is None > True > >>> l > [3, 2, 1] > >>> > Ah. Thanks! beno -- http://mail.python.org/mailman/listinfo/python-list
Re: Reversing a List
reverse reverses in-place >>> l = [1, 2, 3] >>> r = l.reverse() >>> r is None True >>> l [3, 2, 1] >>> On Wed, Sep 1, 2010 at 6:30 PM, Victor Subervi wrote: > Hi; > I have this code: > > cursor.execute('describe products;') > cols = [item[0] for item in cursor] > cols = cols.reverse() > cols.append('Delete') > cols = cols.reverse() > > Unfortunately, the list doesn't reverse. If I print cols after the first > reverse(), it prints None. Please advise. Also, is there a way to append to > the front of the list directly? > TIA, > beno > > -- > http://mail.python.org/mailman/listinfo/python-list > > -- Regards Shashank Singh Senior Undergraduate, Department of Computer Science and Engineering Indian Institute of Technology Bombay shashank.sunny.si...@gmail.com http://www.cse.iitb.ac.in/~shashanksingh -- http://mail.python.org/mailman/listinfo/python-list
Re: Reversing a List
On Wed, 1 Sep 2010 09:00:03 -0400 Victor Subervi wrote: > Hi; > I have this code: > > cursor.execute('describe products;') > cols = [item[0] for item in cursor] > cols = cols.reverse() > cols.append('Delete') > cols = cols.reverse() > > Unfortunately, the list doesn't reverse. If I print cols after the first > reverse(), it prints None. Please advise. The reverse() method modifies the list in place, but returns None, so just use >>> cols.reverse() rather than >>> cols = cols.reverse() > Also, is there a way to append to > the front of the list directly? > TIA, > beno The insert() method can do this, i.e. >>> cols.insert(0, 'Delete') -- Matt Saxton -- http://mail.python.org/mailman/listinfo/python-list
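Condensing the thread's advice into one runnable sketch:

```python
cols = ['a', 'b', 'c']

# list.reverse() reverses in place and returns None:
result = cols.reverse()
assert result is None and cols == ['c', 'b', 'a']

# reversed() returns an iterator over the opposite order,
# leaving the original list untouched:
cols2 = list(reversed(cols))
assert cols2 == ['a', 'b', 'c'] and cols == ['c', 'b', 'a']

# insert(0, ...) appends to the front directly:
cols.insert(0, 'Delete')
assert cols == ['Delete', 'c', 'b', 'a']
```

The trap in the original code is the `cols = cols.reverse()` assignment, which rebinds `cols` to the `None` return value.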
Re: Saving (unusual) linux filenames
In article , Grant Edwards wrote: >On 2010-08-31, MRAB wrote: >> On 31/08/2010 17:58, Grant Edwards wrote: >>> On 2010-08-31, MRAB wrote: On 31/08/2010 15:49, amfr...@web.de wrote: > Hi, > > i have a script that reads and writes linux paths in a file. I save the > path (as unicode) with 2 other variables. I save them separated by "," > and the "packets" by newlines. So my file looks like this: > path1, var1A, var1B > path2, var2A, var2B > path3, var3A, var3B > > > this works for "normal" paths but as soon as i have a path that does > include a "," it breaks. The problem now is that (afaik) linux allows > every char (aside from "/" and null) to be used in filenames. The only > solution i can think of is using null as a separator, but there has to be > a cleaner version ? You could use a tab character '\t' instead. >>> >>> That just breaks with a different set of filenames. >>> >> How many filenames contain control characters? > >How many filenames contain ","? Not many, but the OP wants his >program to be bulletproof. Can't fault him for that. As appending ",v" is the convention for rcs / cvs archives, I would say: a lot. Enough to guarantee that all my backup tar's contain at least a few. > >If I had a nickel for every Unix program or shell-script that failed >when a filename had a space in it I'd rather have it fail for spaces than for commas. > >> Surely that's a bad idea. > >Of course it's a bad idea. That doesn't stop people from doing it. > >-- >Grant Edwards grant.b.edwards at gmail.com Yow! Now I understand advanced MICROBIOLOGY and th' new TAX REFORM laws!! -- -- Albert van der Horst, UTRECHT,THE NETHERLANDS Economic growth -- being exponential -- ultimately falters. alb...@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst -- http://mail.python.org/mailman/listinfo/python-list
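For what it's worth, the standard-library csv module sidesteps the whole separator question by quoting: a quoted field may contain the delimiter, the quote character, and even newlines, so arbitrary path names survive the round trip. A sketch (path and variable names invented for illustration):

```python
import csv
import io

# Paths containing the separator itself, a quote, and a newline:
rows = [
    ("/home/user/odd,name", "var1A", "var1B"),
    ('/home/user/has"quote', "var2A", "var2B"),
    ("/home/user/line\nbreak", "var3A", "var3B"),
]

buf = io.StringIO()
csv.writer(buf).writerows(rows)   # quotes only the fields that need it
text = buf.getvalue()             # safe to write out to the data file

# Reading it back restores the exact paths:
restored = [tuple(r) for r in csv.reader(io.StringIO(text))]
assert restored == rows
```

The same would work against a real file object; only NUL bytes, which POSIX filenames cannot contain anyway at the syscall level but csv cannot carry, remain off the table.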
Re: [Pickle]dirty problem 3 lines
it's just as it seems : i want to know how it works to get back an object from a string in python : pickle.loads("""b'\x80\x03]q\x00(K\x00K\x01e.'""") #doesn't work Google Fan boy On Wed, Sep 1, 2010 at 5:23 AM, MRAB wrote: > On 01/09/2010 03:33, bussiere bussiere wrote: >> >> i know it's dirty, i know i should use json but i want to know, it's >> quite late here : >> import pickle >> dump = """b'\x80\x03]q\x00(K\x00K\x01e.'""" >> print(pickle.loads(dump)) >> >> how can i get back my object from this string ? >> the string is : b'\x80\x03]q\x00(K\x00K\x01e.' >> and i am using python3 >> help will be appreciated i am chewing on this for a long time now. > > Well, pickle.loads(b'\x80\x03]q\x00(K\x00K\x01e.') works. > > That, of course, is not the same as """b'\x80\x03]q\x00(K\x00K\x01e.'""". > > Do you mean r"""b'\x80\x03]q\x00(K\x00K\x01e.'"""? > > (It's also late here, well, actually, so late it's early... Time to > sleep. :-)) > -- > http://mail.python.org/mailman/listinfo/python-list > -- http://mail.python.org/mailman/listinfo/python-list
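For the record, the confusion here is between a bytes object and a str holding that object's repr. A sketch of both the direct round trip and one way to recover the bytes when all you have is the repr text:

```python
import ast
import pickle

# The bytes from the post unpickle directly, as MRAB notes:
assert pickle.loads(b'\x80\x03]q\x00(K\x00K\x01e.') == [0, 1]

# The usual round trip produces and consumes bytes, never str:
obj = [0, 1]
blob = pickle.dumps(obj)
assert pickle.loads(blob) == obj

# If the pickle was stored as the *repr* of the bytes in a text file
# (i.e. a str that begins with the characters b and '), evaluate the
# literal back into a bytes object first:
text = repr(blob)
assert pickle.loads(ast.literal_eval(text)) == obj
```

Passing the triple-quoted str itself to `pickle.loads()` fails because the `b'...'` prefix and quotes are literal characters of the string, not Python syntax.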
Reversing a List
Hi; I have this code: cursor.execute('describe products;') cols = [item[0] for item in cursor] cols = cols.reverse() cols.append('Delete') cols = cols.reverse() Unfortunately, the list doesn't reverse. If I print cols after the first reverse(), it prints None. Please advise. Also, is there a way to append to the front of the list directly? TIA, beno -- http://mail.python.org/mailman/listinfo/python-list
YAMI4 v. 1.1.0 - messaging solution for distributed systems
I am pleased to announce that the new version of YAMI4, 1.1.0, has been just released and is available for download. http://www.inspirel.com/yami4/ This new version extends the coverage of supported programming languages with a completely new Python3 module, which allows full integration of built-in dictionary objects as message payloads. Thanks to this level of language integration, the API is very easy to learn and natural in use. Please check code examples in the src/python/examples directory to see complete client-server systems. The users of other programming languages will also benefit from the ability to transmit raw binary messages, which in addition to support high-performance scenarios can be used as a hook for custom serialization routines. The API of the whole library was also extended a bit to allow better control of automatic reconnection and to ensure low jitter in communication involving many receivers even in case of partial system failure. Last but not least, a number of fixes and improvements have been introduced - please see the changelog.txt file, which is part of the whole package, for a detailed description of all improvements. -- Maciej Sobczak * http://www.inspirel.com -- http://mail.python.org/mailman/listinfo/python-list
fairly urgent request: paid python (or other) work required
i apologise for having to contact so many people but this is fairly urgent, and i'm running out of time and options. i'm a free software programmer, and i need some paid work - preferably python - fairly urgently, so that i can pay for food and keep paying rent, and so that my family doesn't get deported or have to leave the country. i really would not be doing this unless it was absolutely, absolutely essential that i get money. so that both i and the list are not unnecessarily spammed, please don't reply with recommendations of "where to get jobs", unless they are guaranteed to result in immediate work and money. if you have need of a highly skilled and experienced python-preferring free-software-preferring software engineer, please simply contact me, and tell me what you need doing: there's no need for you to read the rest of this message. so that people are not offended by me asking on such a high-volume list for work, here are some questions and answers: Q: who are you? A: luke leighton. free software developer, free software project leader, and "unusual cross-project mash-up-er" (meaning: i spot the value of joining one or more bits of disparate "stuff" to make something that's more powerful than its components). Q: where's your CV? A: executive version of CV is at http://lkcl.net/exec_cv.txt - please don't ask for a proprietary microsoft word version, as a refusal and referral to the "sylvester response" often offends. Q: what can you do? A: python programming, c programming, web development, networking, cryptography, reverse-engineering, IT security, etc. etc. preferably involving free software. Q: what do you need? A: money to pay rent and food. at the ABSOLUTE MINIMUM, i need as little as £1500 per month to pay everything, and have been earning approx £800 per month for the past year. 
a £5000 inheritance last year which i was not expecting has delayed eviction and bankruptcy for me and my family, and deportation for my partner and 17 month old daughter (marie is here in the UK on a FLR/M visa) Q: why are you asking here? A: because it's urgent that i get money really really soon; my family members are refusing to assist, and the few friends that i have do not have any spare money to lend. Q: why here and not "monster jobs" or "python-jobs list" or the various "recruitment agencies"? A: those are full-time employment positions, which i have been frequently applying for and get rejected for various reasons, and i'm running out of time and money. further interviews cost money, and do not result in guaranteed work. i need work - and money - _now_. Q: why here and not "peopleperhour.com"? A: if you've ever bid on peopleperhour.com you will know that you are bidding against "offshore" contractors and even being undercut by 1st world country bidders who, insanely, appear to be happy to do work for as little as £2 / hour. Q: why are you getting rejected from interviews? A: that's complex. a) i simply don't interview well. people with the classic symptoms of asperger's just don't. b) my daughter is 17 months old. when i go away for as little as 3 days, which i've done three times now, she is extremely upset both when i am away and when i return. i think what would happen if i was doing some sort of full-time job, away from home, and... i can't do it. subconsciously that affects how i react when speaking to interviewers. Q: why do you not go "get a job at tesco's" or "drive a truck"? A: tescos and HGV driving etc. pay around £12 per hour. £12 per hour after tax comes down to about £8 to £9 per hour. £9 per hour requires 35 hours per week to earn as little as £1500. 
35 hours per week is effectively full-time, and means that a) my programming and software engineering skills are utterly, utterly wasted b) my daughter gets extremely upset because i won't be at home. so you get the gist, and thank you for putting up with me needing to take this action. l. -- http://mail.python.org/mailman/listinfo/python-list
Re: Fibonacci: How to think recursively
In article , Mel wrote: >Baba wrote: > >> Level: beginner >> >> I would like to know how to approach the following Fibonacci problem: >> How many rabbits do i have after n months? >> >> I'm not looking for the code as i could Google that very easily. I'm >> looking for a hint to put me on the right track to solve this myself >> without looking it up. >> >> my brainstorming so far brought me to a standstill as i can't seem to >> imagine a recursive way to code this: >> >> my attempted rough code: >> >> def fibonacci(n): >> # base case: >> result = fibonacci (n-1) + fibonacci (n-2) this will end up in a mess as it will create overlapping recursions > >I don't think this is the base case. The base case would be one or more >values of `n` that you already know the fibonacci number for. Your >recursive function can just test for those and return the right answer right >away. The expression you've coded contains a good way to handle the >non-base cases. There's no such problem as "overlapping recursions". [Didn't you mean: I don't understand what you mean by overlapping recursions? You're right about the base case, so clearly the OP uses some confusing terminology.] I see a problem with overlapping recursions. Unless automatic memoizing is on, they are unduly inefficient, as each call splits into two calls. If one insists on recursion (untested code, just for the idea): def fib2( n ): ' return #rabbits last year, #rabbits before last ' if n == 1: return (1,1) else: penult, ult = fib2( n-1 ) return ( ult, ult+penult) def fub( n ): return fib2(n)[1] Try fib and fub for largish numbers (>1000) and you'll feel the problem. > > Mel. > Groetjes Albert -- -- Albert van der Horst, UTRECHT,THE NETHERLANDS Economic growth -- being exponential -- ultimately falters. alb...@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst -- http://mail.python.org/mailman/listinfo/python-list
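Albert's pair-returning recursion, written out runnably; each call recurses only once, so the cost is linear in n rather than exponential:

```python
def fib2(n):
    """Return (#rabbits the month before, #rabbits this month)."""
    if n == 1:
        return (1, 1)
    penult, ult = fib2(n - 1)
    return (ult, ult + penult)

def fib(n):
    return fib2(n)[1]

# With fib(1) == 1 and fib(2) == 2, the sequence runs 1, 2, 3, 5, 8, ...
assert [fib(n) for n in range(1, 8)] == [1, 2, 3, 5, 8, 13, 21]
```

The naive two-branch version recomputes the same subproblems over and over; this one threads the last two values through a single recursive chain instead.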
Re: Performance: sets vs dicts.
On 09/01/10 00:09, Aahz wrote: > In article , > Jerry Hill wrote: >> On Mon, Aug 30, 2010 at 7:42 PM, Aahz wrote: >>> >>> Possibly; IMO, people should not need to run timeit to determine basic >>> algorithmic speed for standard Python datatypes. >> >> http://wiki.python.org/moin/TimeComplexity takes a stab at it. IIRC, >> last time this came up, there was some resistance to making promises >> about time complexity in the official docs, since that would make >> those numbers part of the language, and thus binding on other >> implementations. > > I'm thoroughly aware of that page and updated it yesterday to make it > easier to find. ;-) > > However, I think there are some rock-bottom basic guarantees we can make > regardless of implementation. Does anyone seriously think that an > implementation would be accepted that had anything other than O(1) for > index access into tuples and lists? Dicts that were not O(1) for access > with non-pathological hashing? That we would accept sets having O() > performance worse than dicts? > > I suggest that we should agree on these guarantees and document them in > the core. While I think documenting them would be great for all programmers that care about practical and theoretical execution speed; I think including these implementation details in core documentation as a "guarantee" would be a bad idea for the reasons Terry outlined. One way of resolving that is by having two documentations (or two separate sections in the documentation) for: - Python -- the language -- documenting Python as an abstract language, this is the documentation which can be shared across all Python implementations. This will also be the specification for Python Language which other implementations will be measured to. - CPython -- the Python interpreter -- documents implementation details and performance metrics. It should be properly noted that these are not part of the language per se. 
This will be the playground for CPython experts that need to fine tune their applications to the last drop of blood and don't mind their application going nuts with the next release of CPython. -- http://mail.python.org/mailman/listinfo/python-list
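The difference such guarantees describe is easy to observe without any documentation at all, e.g. for membership testing, where a list scan is O(n) but a set probe is amortized O(1):

```python
import timeit

# Worst case for the list: the sought element is at the very end.
setup = "xs = list(range(100000)); s = set(xs)"
t_list = timeit.timeit("99999 in xs", setup=setup, number=200)
t_set = timeit.timeit("99999 in s", setup=setup, number=200)

# The list lookup walks all 100000 items; the set lookup is a hash
# probe, so it wins by orders of magnitude:
assert t_set < t_list
```

A guarantee in the core docs would make it explicit that this gap is something programmers may rely on across implementations, which is exactly the point under debate.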
Re: Optimising literals away
On 09/01/10 17:06, Stefan Behnel wrote: > MRAB, 31.08.2010 23:53: >> On 31/08/2010 21:18, Terry Reedy wrote: >>> On 8/31/2010 12:33 PM, Aleksey wrote: On Aug 30, 10:38 pm, Tobias Weber wrote: > Hi, > whenever I type an "object literal" I'm unsure what optimisation > will do > to it. >>> >>> Optimizations are generally implentation dependent. CPython currently >>> creates numbers, strings, and tuple literals just once. Mutable literals >>> must be created each time as they may be bound and saved. >>> > def m(arg): > if arg& set([1,2,3]): >>> >>> set() is a function call, not a literal. When m is called, who knows >>> what 'set' will be bound to? In Py3, at least, you could write {1,2,3}, >>> which is much faster as it avoids creating and deleting a list. On my >>> machine, .35 versus .88 usec. Even then, it must be calculated each time >>> because sets are mutable and could be returned to the calling code. >>> >> There's still the possibility of some optimisation. If the resulting >> set is never stored anywhere (bound to a name, for example) then it >> could be created once. When the expression is evaluated there could be >> a check so see whether 'set' is bound to the built-in class, and, if it >> is, then just use the pre-created set. What if the set is mutated by the function? That will modify the global cache of the set; one way to prevent mutation is to use frozenset, but from the back of my mind, I think there was a discussion that rejects set literals producing a frozen set instead of regular set. > Cython applies this kind of optimistic optimisation in a couple of other > cases and I can affirm that it often makes sense to do that. However, > drawback here: the set takes up space while not being used (not a huge > problem if literals are expected to be small), and the global lookup of > "set" still has to be done to determine if it *is* the builtin set type. > After that, however, the savings should be considerable. 
> > Another possibility: always cache the set and create a copy on access. > Copying a set avoids the entire eval loop overhead and runs in a C loop > instead, using cached item instances with (most likely) cached hash > values. So even that will most likely be much faster than the > spelled-out code above. I think that these kind of optimizations would probably be out-of-character for CPython, which values implementation simplicity above small increase in speed. Sets are not that much used and optimizing this particular case seems to be prone to create many subtle issues (especially with multithreading). In other word, these optimizations makes sense for Python implementations that are heavily geared for speed (e.g. Unladen Swallow, Stackless Python, PyPy[1], Cython); but probably may only have a minuscule chance of being implemented in CPython. [1] yes, their goal was to be faster than CPython (and faster than the speed of photon in vacuum), though AFAICT they have yet to succeed. -- http://mail.python.org/mailman/listinfo/python-list
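As a data point, CPython already applies exactly this kind of optimisation in one narrow case: a set literal on the right-hand side of an `in` test can neither escape nor be mutated, so the compiler folds it into a cached frozenset constant (an implementation detail, not a language guarantee):

```python
def m(arg):
    return arg in {1, 2, 3}

# The literal was folded into a constant at compile time;
# dis.dis(m) would show a LOAD_CONST of frozenset({1, 2, 3}).
assert any(isinstance(c, frozenset) for c in m.__code__.co_consts)
assert m(2) and not m(4)
```

When the set is used in any way that could let it leak out of the expression, as in the `arg & {1, 2, 3}` example above, CPython falls back to building it on every call, which is the case being discussed.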
Re: Newby Needs Help with Python code
Hi Esther, On Wed, Sep 1, 2010 at 13:29, Nally Kaunda-Bukenya wrote: > #THE PROGRAM: > import arcgisscripting > gp=arcgisscripting.create() > gp.Workspace = "C:\\NPDES\\NPDES_PYTHON.mdb" > fc = "Outfalls_ND" > > try: > # Set the field to create a list of unique values > fieldname = "OUTFALL_ID" > > # Open a Search Cursor to identify all unique values > cur = gp.UpdateCursor(fc) > row = cur.Next() > > # Set a list variable to hold all unique values > L = [] > > # Using a while loop, cursor through all records and append unique > #values to the list variable > while row <> None: > value = row.GetValue(fieldname) > if value not in L: > L.append(value) > row = cur.Next() > row.SetValue(Tot_Outf_Area, sum(row.AREA_ACRES)) #total area of > each outfall=sum of all area 4 each unique outfallid > cur.UpdateRow(row) #to commit changes > row=cur.Next() > print row.Tot_Outf_Area > # Sort the list variable > L.sort() > > # If a value in the list variable is blank, remove it from the list > variable > #to filter out diffuse outfalls > if ' ' in L: > L.remove(' ') > > except: > # If an error occurred while running a tool, print the messages > print gp.GetMessages() Have you tried running this code? I suspect it won't work at all -- and because you are catching all possible exceptions in your try...except, you won't even know why. Here are the things that I'd suggest, just from glancing over the code: - Remove the try...except for now. Getting an exception, and understanding why it occurred and how best to deal with it, is IMHO very helpful when prototyping and debugging. - Take another look at your while loop. I don't know ArcGIS, so I don't know if the UpdateCursor object supports the iterator protocol, but the normal Python way of looping through all rows would be a for loop: for row in cur: # code For example, you are calling cur.Next() twice inside the loop -- is that what you want? Hope that helps, Rami > > > > #Please Help!!! 
> > #Esther > > > > -- > http://mail.python.org/mailman/listinfo/python-list > > -- Rami Chowdhury "Never assume malice when stupidity will suffice." -- Hanlon's Razor 408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD) -- http://mail.python.org/mailman/listinfo/python-list
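Rami's `for row in cur` suggestion, sketched with a stand-in cursor class (the real `gp.UpdateCursor` API may well differ; `FakeCursor` here is purely illustrative of the iterator protocol replacing the manual `Next()`/`None` checks):

```python
class FakeCursor:
    """Stand-in for an UpdateCursor: yields rows, then stops."""
    def __init__(self, rows):
        self._rows = iter(rows)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._rows)   # raises StopIteration when exhausted

cur = FakeCursor([{"OUTFALL_ID": "ALD03001"}, {"OUTFALL_ID": "ALD03002"}])

seen = []
for row in cur:                   # no cur.Next() calls, no None sentinel
    seen.append(row["OUTFALL_ID"])

assert seen == ["ALD03001", "ALD03002"]
```

If the ArcGIS cursor does not support iteration, the equivalent explicit loop is `row = cur.Next()` followed by `while row is not None: ...; row = cur.Next()`, with exactly one advance per pass, which is where the original code goes wrong by calling `Next()` twice in some branches.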
Newby Needs Help with Python code
Dear Python experts, I hope someone can help me. I am new to Python and trying to achive the following: 1) I would like to populate the Tot_Ouf_Area field with total area of each unique outfall_id (code attempted below,but Tot_Ouf_Area not populating) 2) I would also like to get the user input of Rv ( each landuse type will have a specific Rv value). For example the program should ask the user for Rv value of Low Density Residential (user enters 0.4 in example below and that value must be stored in the Rv field), and so on as shown in the 2nd table below… Below is my original table (comma-delimited) "OBJECTID","OUTFALL_ID","LANDUSE","AREA_ACRES","Rv","Tot_Outf_Area" 16,"ALD06001","High Density Residential",6.860922,0.00,0.00 15,"ALD06001","General Commercial",7.520816,0.00,0.00 14,"ALD05002","Low Density Residential",7.255491,0.00,0.00 13,"ALD05002","Forest",37.090473,0.00,0.00 12,"ALD05001","Low Density Residential",16.904560,0.00,0.00 11,"ALD05001","Forest",84.971686,0.00,0.00 10,"ALD04002","Urban Open",1.478677,0.00,0.00 9,"ALD04002","Transportation",0.491887,0.00,0.00 8,"ALD04002","Low Density Residential",25.259720,0.00,0.00 7,"ALD04002","Forest",0.355659,0.00,0.00 6,"ALD04001","Recreational",0.013240,0.00,0.00 5,"ALD04001","Low Density Residential",34.440130,0.00,0.00 4,"ALD04001","Forest",10.229973,0.00,0.00 3,"ALD03002","Low Density Residential",23.191538,0.00,0.00 2,"ALD03002","Forest",1.853920,0.00,0.00 1,"ALD03001","Low Density Residential",6.828130,0.00,0.00 21,"ALD06001","Water.dgn",0.013951,0.00,0.00 20,"ALD06001","Urban Open",10.382900,0.00,0.00 19,"ALD06001","Transportation",2.064454,0.00,0.00 18,"ALD06001","Recreational",0.011007,0.00,0.00 17,"ALD06001","Low Density Residential",0.752509,0.00,0.00 Below is my desired output table (comma delimited): "OBJECTID","OUTFALL_ID","LANDUSE","AREA_ACRES","Rv","Tot_Outf_Area" 16,"ALD06001","High Density Residential",6.860922,0.00,27.606562 15,"ALD06001","General Commercial",7.520816,0.00,27.606562 
14,"ALD05002","Low Density Residential",7.255491,0.40,44.345966 13,"ALD05002","Forest",37.090473,0.30,44.345966 11,"ALD05001","Forest",84.971686,0.30,101.876247 12,"ALD05001","Low Density Residential",16.904560,0.40,101.876247 10,"ALD04002","Urban Open",1.478677,0.00,27.585945 9,"ALD04002","Transportation",0.491887,0.00,27.585945 8,"ALD04002","Low Density Residential",25.259720,0.40,27.585945 7,"ALD04002","Forest",0.355659,0.30,27.585945 6,"ALD04001","Recreational",0.013240,0.00,44.683345 5,"ALD04001","Low Density Residential",34.440130,0.40,44.683345 4,"ALD04001","Forest",10.229973,0.30,44.683345 3,"ALD03002","Low Density Residential",23.191538,0.40,25.045460 2,"ALD03002","Forest",1.853920,0.30,25.045460 1,"ALD03001","Low Density Residential",6.828130,0.40,6.828130 21,"ALD06001","Water.dgn",0.013951,0.00,27.606562 20,"ALD06001","Urban Open",10.382900,0.00,27.606562 19,"ALD06001","Transportation",2.064454,0.00,27.606562 18,"ALD06001","Recreational",0.011007,0.00,27.606562 17,"ALD06001","Low Density Residential",0.752509,0.40,27.606562 Below is my code so far for updating rows with total area (Tot_Ouf_Area): #THE PROGRAM: import arcgisscripting gp=arcgisscripting.create() gp.Workspace = "C:\\NPDES\\NPDES_PYTHON.mdb" fc = "Outfalls_ND" try: # Set the field to create a list of unique values fieldname = "OUTFALL_ID" # Open a Search Cursor to identify all unique values cur = gp.UpdateCursor(fc) row = cur.Next() # Set a list variable to hold all unique values L = [] # Using a while loop, cursor through all records and append unique #values to the list variable while row <> None: value = row.GetValue(fieldname) if value not in L: L.append(value) row = cur.Next() row.SetValue(Tot_Outf_Area, sum(row.AREA_ACRES)) #total area of each outfall=sum of all area 4 each unique outfallid cur.UpdateRow(row) #to commit changes row=cur.Next() print row.Tot_Outf_Area # Sort the list variable L.sort() # If a value in the list variable is blank, remove it from the list variable #to filter 
out diffuse outfalls if ' ' in L: L.remove(' ') except: # If an error occurred while running a tool, print the messages print gp.GetMessages() #Please Help!!! #Esther -- http://mail.python.org/mailman/listinfo/python-list
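Setting ArcGIS aside, the Tot_Outf_Area computation itself is a plain group-and-sum that can be prototyped on the comma-delimited table with just the standard library (column names taken from the post; a three-row excerpt stands in for the full data):

```python
import csv
import io
from collections import defaultdict

table = """\
OBJECTID,OUTFALL_ID,LANDUSE,AREA_ACRES
3,ALD03002,Low Density Residential,23.191538
2,ALD03002,Forest,1.853920
1,ALD03001,Low Density Residential,6.828130
"""

rows = list(csv.DictReader(io.StringIO(table)))

# First pass: sum AREA_ACRES per unique OUTFALL_ID.
totals = defaultdict(float)
for row in rows:
    totals[row["OUTFALL_ID"]] += float(row["AREA_ACRES"])

# Second pass: write the group total back onto every row.
for row in rows:
    row["Tot_Outf_Area"] = round(totals[row["OUTFALL_ID"]], 6)

assert abs(rows[0]["Tot_Outf_Area"] - 25.045458) < 1e-6
assert abs(rows[2]["Tot_Outf_Area"] - 6.828130) < 1e-6
```

Once this logic is settled, the two passes map onto two cursor traversals (a read to accumulate `totals`, then an update writing `Tot_Outf_Area` back), and the Rv prompt per landuse type can be handled the same way with a `{landuse: rv}` dict filled from user input.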
Re: Queue cleanup
Lawrence D'Oliveiro writes: >> Refcounting is susceptable to the same pauses for reasons already >> discussed. > > Doesn’t seem to happen in the real world, though. def f(n): from time import time a = [1] * n t0 = time() del a t1 = time() return t1 - t0 for i in range(9): print i, f(10**i) on my system prints: 0 2.86102294922e-06 1 2.14576721191e-06 2 3.09944152832e-06 3 1.00135803223e-05 4 0.000104904174805 5 0.00098991394043 6 0.00413608551025 7 0.037693977356 8 0.362598896027 Looks pretty linear as n gets large. 0.36 seconds (the last line) is a noticable pause. -- http://mail.python.org/mailman/listinfo/python-list
Re: Optimising literals away
MRAB, 31.08.2010 23:53: On 31/08/2010 21:18, Terry Reedy wrote: On 8/31/2010 12:33 PM, Aleksey wrote: On Aug 30, 10:38 pm, Tobias Weber wrote: Hi, whenever I type an "object literal" I'm unsure what optimisation will do to it. Optimizations are generally implentation dependent. CPython currently creates numbers, strings, and tuple literals just once. Mutable literals must be created each time as they may be bound and saved. def m(arg): if arg& set([1,2,3]): set() is a function call, not a literal. When m is called, who knows what 'set' will be bound to? In Py3, at least, you could write {1,2,3}, which is much faster as it avoids creating and deleting a list. On my machine, .35 versus .88 usec. Even then, it must be calculated each time because sets are mutable and could be returned to the calling code. There's still the possibility of some optimisation. If the resulting set is never stored anywhere (bound to a name, for example) then it could be created once. When the expression is evaluated there could be a check so see whether 'set' is bound to the built-in class, and, if it is, then just use the pre-created set. Cython applies this kind of optimistic optimisation in a couple of other cases and I can affirm that it often makes sense to do that. However, drawback here: the set takes up space while not being used (not a huge problem if literals are expected to be small), and the global lookup of "set" still has to be done to determine if it *is* the builtin set type. After that, however, the savings should be considerable. Another possibility: always cache the set and create a copy on access. Copying a set avoids the entire eval loop overhead and runs in a C loop instead, using cached item instances with (most likely) cached hash values. So even that will most likely be much faster than the spelled-out code above. Stefan -- http://mail.python.org/mailman/listinfo/python-list
Re: Queue cleanup
Lawrence D'Oliveiro writes: > Whereas garbage collection will happen at some indeterminate time long after > the last access to the object, when it very likely will no longer be in the > cache, and have to be brought back in just to be freed, GC's for large systems generally don't free (or even examine) individual garbage objects. They copy the live objects to a new contiguous heap without ever touching the garbage, and then they release the old heap. That has the effect of improving locality, since the new heap is compacted and has no dead objects. The algorithms are generational (they do frequent gc's on recently-created objects and less frequent ones on older objects), so "minor" gc operations are on regions that fit in cache, while "major" ones might have cache misses but are infrequent. Non-compacting reference counting (or simple mark/sweep gc) has much worse fragmentation and consequently worse cache locality than copying-style gc. -- http://mail.python.org/mailman/listinfo/python-list