Re: GC is very expensive: am I doing something wrong?
On Mon, 22 Mar 2010 22:05:40 -0700, Paul Rubin wrote:
> Antoine Pitrou writes:
>> "Orders of magnitude worse", in any case, sounds very exaggerated.
>
> The worst case can lose orders of magnitude if a lot of values hash to
> the same bucket.

Well, perhaps one order of magnitude.

>>> for i in xrange(100):
...     n = 32*i+1
...     assert hash(2**n) == hash(2)
...
>>> d1 = dict.fromkeys(xrange(100))
>>> d2 = dict.fromkeys([2**(32*i+1) for i in xrange(100)])
>>>
>>> from timeit import Timer
>>> setup = "from __main__ import d1, d2"
>>> t1 = Timer("for k in d1.keys(): x = d1[k]", setup)
>>> t2 = Timer("for k in d2.keys(): x = d2[k]", setup)
>>>
>>> min(t1.repeat(number=1000, repeat=5))
0.026707887649536133
>>> min(t2.repeat(number=1000, repeat=5))
0.33103203773498535

--
Steven
--
http://mail.python.org/mailman/listinfo/python-list
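The collision trick above relies on the modulus CPython uses for long-integer hashes (2**32-1 on the 32-bit Python 2 builds of that era). A build-independent version can read the modulus from sys.hash_info (available since Python 3.2); this is an illustrative sketch, not part of the original post:

```python
import sys

# CPython hashes non-negative ints modulo a prime of the form 2**b - 1
# (2**61 - 1 on typical 64-bit builds), so 2**(b*i + 1) hashes to the
# same value as 2 for every i >= 0.
b = sys.hash_info.modulus.bit_length()
colliders = [2 ** (b * i + 1) for i in range(5)]
assert all(hash(c) == hash(2) for c in colliders)
print("all %d values collide with hash(2)" % len(colliders))
```

Feeding such colliders into a dict forces every lookup to probe through the same collision chain, which is what the timings above measure.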
Re: Castrated traceback in sys.exc_info()
En Mon, 22 Mar 2010 15:20:39 -0300, Pascal Chambon escribió:

Allright, here is more concretely the problem :

ERROR:root:An error
Traceback (most recent call last):
  File "C:/Users/Pakal/Desktop/aaa.py", line 7, in c
    return d()
  File "C:/Users/Pakal/Desktop/aaa.py", line 11, in d
    def d(): raise ValueError
ValueError
>>>

As you see, the traceback only starts from function c, which handles the
exception. It doesn't show main(), a() and b(), which might however be
(and are, in my case) critical to diagnose the severity of the problem
(since many different paths would lead to calling c()).

So the question is : is that possible to enforce, by a way or another,
the retrieval of the FULL traceback at exception raising point, instead
of that incomplete one ?

Thanks for bringing this topic! I learned a lot trying to understand what
happens.

The exception traceback (what sys.exc_info()[2] returns) is *not* a
complete stack trace. The sys module documentation is wrong [1] when it
says "...encapsulates the call stack at the point where the exception
originally occurred." The Language Reference is more clear [2]:

"Traceback objects represent a stack trace of an exception. A traceback
object is created when an exception occurs. When the search for an
exception handler unwinds the execution stack, at each unwound level a
traceback object is inserted in front of the current traceback. When an
exception handler is entered, the stack trace is made available to the
program."

That is, a traceback holds only the *forward* part of the stack: the
frames already exited when looking for an exception handler. Frames going
from the program starting point up to the current execution point are
*not* included. Conceptually, it's like having two lists: stack and
traceback. The complete stack trace is always stack+traceback.
At each step (when unwinding the stack, looking for a frame able to
handle the current exception) an item is popped from the top of the stack
(last item) and inserted at the head of the traceback.

The traceback holds the "forward" path (from the current execution point,
to the frame where the exception was actually raised). It's a linked
list: its tb_next attribute holds a reference to the next item; None
marks the last one.

The "back" path (going from the current execution point to its caller and
all the way to the program entry point) is a linked list of frames; the
f_back attribute points to the previous one, or None.

In order to show a complete stack trace, one should combine both. The
traceback module contains several useful functions: extract_stack() +
extract_tb() are a starting point.

The simplest way I could find to make the logging module report a
complete stack is to monkey patch logging.Formatter.formatException so it
uses format_exception() and format_stack() combined (in fact it is
simpler than the current implementation using a StringIO object):

import logging
import traceback

def formatException(self, ei):
    """
    Format and return the specified exception information as a string.

    This implementation builds the complete stack trace, combining
    traceback.format_exception and traceback.format_stack.
    """
    lines = traceback.format_exception(*ei)
    if ei[2]:
        lines[1:1] = traceback.format_stack(ei[2].tb_frame.f_back)
    return ''.join(lines)

# monkey patch the logging module
logging.Formatter.formatException = formatException

def a(): return b()
def b(): return c()

def c():
    try:
        return d()
    except:
        logging.exception("An error")
        raise

def d(): raise ValueError

def main():
    a()

main()

Output:

ERROR:root:An error
Traceback (most recent call last):
  File "test_logging.py", line 32, in <module>
    main()
  File "test_logging.py", line 30, in main
    a()
  File "test_logging.py", line 19, in a
    def a(): return b()
  File "test_logging.py", line 20, in b
    def b(): return c()
  File "test_logging.py", line 23, in c
    return d()
  File "test_logging.py", line 27, in d
    def d(): raise ValueError
ValueError
Traceback (most recent call last):
  File "test_logging.py", line 32, in <module>
    main()
  File "test_logging.py", line 30, in main
    a()
  File "test_logging.py", line 19, in a
    def a(): return b()
  File "test_logging.py", line 20, in b
    def b(): return c()
  File "test_logging.py", line 23, in c
    return d()
  File "test_logging.py", line 27, in d
    def d(): raise ValueError
ValueError

Note that both tracebacks are identical: the first comes from the patched
logging module, the second is the standard Python one.

[1] http://docs.python.org/library/sys.html#sys.exc_info
[2] http://docs.python.org/reference/datamodel.html#the-standard-type-hierarchy

--
Gabriel Genellina
--
http://mail.python.org/mailman/listinfo/python-list
Re: GC is very expensive: am I doing something wrong?
Antoine Pitrou writes:
> "Orders of magnitude worse", in any case, sounds very exaggerated.

The worst case can lose orders of magnitude if a lot of values hash to
the same bucket.
--
http://mail.python.org/mailman/listinfo/python-list
Re: device identification
Omer Ihsan wrote:
>
> i have installed pyusb now and run the sample usbenum.py. i have 3
> usb ports on my PC but the results show 6 outputs to dev.filename..
> they are numbers like 001 or 005 etc and they changed when i plugged
> in devices...(i am no good with the usb standards)... i just want to
> identify each device/port... what parameter in the example would help me

You can't identify the ports.[1] What good would it do you? The ports on
your PC are not numbered.

You certainly CAN identify the devices, by their VID and PID (or idVendor
and idProduct). You identify by function, not by location. When you plug
in a USB drive, you don't want to worry about where it's plugged in.

===
[1]: OK, technically, it is not impossible to identify the port numbers,
but it is quite tedious. You need to chase through the sysfs expansion of
your bus's hub/port tree and find a match for your device. It's not worth
the trouble.
--
Tim Roberts, t...@probo.com
Providenza & Boekelheide, Inc.
--
http://mail.python.org/mailman/listinfo/python-list
Re: google token
On Monday 22 March 2010 17:23:27 Thufir wrote:
> On Mar 20, 3:12 am, Steven D'Aprano wrote:
>> On Sat, 20 Mar 2010 09:17:14 +, Thufir wrote:
>>> I'd like to acquire a token, as below, but from Java:
>>
>> Perhaps you should be asking a Java discussion group? This group is
>> for discussing Python.
>>
>> --
>> Steven
>
> What I meant to ask is, how is that token being acquired? Is that
> just a GET?

Looks like it -- a urllib2 request, anyway -- the self._web object is
defined here:

http://pyrfeed.googlecode.com/svn/trunk/lib/web/web.py

> thanks,
>
> Thufir

Rami Chowdhury
"Any sufficiently advanced incompetence is indistinguishable from
malice." -- Grey's Law
408-597-7068 (US) / 07875-841-046 (UK) / 01819-245544 (BD)
--
http://mail.python.org/mailman/listinfo/python-list
Re: short-circuiting any/all ?
In Steven D'Aprano writes:

>On Mon, 22 Mar 2010 22:19:57 +, kj wrote:
>In any case, the once-off cost of creating or importing a function is
>usually quite cheap. As usual, the best advice is not to worry about
>optimization until you have profiled the code and learned where the
>actual bottlenecks are. Write what reads best, not what you guess might
>be faster, until you really know you need the speed and that it is an
>optimization and not a pessimization.

My preference for map in this case is not due to performance
considerations, but to avoid unnecessary code-clutter. I just find, e.g.,

  x = map(int, y)

slightly easier on the eyes than

  x = [int(z) for z in y]

This tiny improvement in readability gets negated if one needs to define
a function in order to use map. Hence, e.g., I prefer

  x = [_[0] for _ in y]

over

  x = map(lambda _: _[0], y)

and certainly over

  def _first(seq):
      return seq[0]
  x = map(_first, y)

Arguably, Knuth's "premature optimization is the root of all evil"
applies even to readability (e.g. "what's the point of making code
optimally readable if one is going to change it completely next day?")
If there were the equivalent of a profiler for code clutter, I guess I
could relax my readability standards a bit...

~K
--
http://mail.python.org/mailman/listinfo/python-list
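For this particular case the operator module already provides the function, so map needs neither a lambda nor a def; a small sketch (not from the thread):

```python
from operator import itemgetter

y = [(1, 'a'), (2, 'b'), (3, 'c')]

# equivalent to [_[0] for _ in y], with no lambda in sight
x = list(map(itemgetter(0), y))
assert x == [1, 2, 3]
```

itemgetter, attrgetter and methodcaller cover most of the tiny lambdas that otherwise clutter map calls.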
Re: GC is very expensive: am I doing something wrong?
Le Mon, 22 Mar 2010 23:40:16 +, tan a écrit :
>
>>Remember that the original use case was to load a dictionary from a
>>text file. For this use case, a trie can be very wasteful in terms of
>>memory and rather CPU cache unfriendly on traversal, whereas hash
>>values are a) rather fast to calculate for a string, and b) often just
>>calculated once and then kept alive in the string object for later
>>reuse.
>
> You still have to walk the bucket in a hash map/table. Performance may
> be orders of magnitude worse than for trees.

"walk the bucket" shouldn't be a significant cost factor here, especially
if you are doing meaningful work with the traversed items. In the CPython
implementation the total hash table size is less than a constant times
the number of actual items. Moreover, it's a linear scan over an array
rather than having to dereference pointers as in a tree.

"Orders of magnitude worse", in any case, sounds very exaggerated.

(and, of course, as the OP said, there's the major difference that the
dict type is implemented in C, which makes constant factors an order of
magnitude smaller than for a Python trie implementation)
--
http://mail.python.org/mailman/listinfo/python-list
Re: short-circuiting any/all ?
On Mon, 22 Mar 2010 22:19:57 +, kj wrote:

> In Tim Golden writes:
>
>>On 22/03/2010 18:30, kj wrote:
>>> Thanks! I'm glad to know that one can get the short circuiting using
>>> a map-type idiom. (I prefer map over comprehensions when I don't need
>>> to define a function just for the purpose of passing it to it.)
>
>>In what way does "map" over "comprehensions" save you defining a
>>function?
>
>>any (map (is_invalid, L))
>>any (is_invalid (i) for i in L)
>
> I was talking in the *general* case. map at the very least requires a
> lambda expression, which is a one-time function definition.

But keep in mind that instead of this:

  map(lambda x, y: x+y, list_a, list_b)

you can do this:

  import operator
  map(operator.add, list_a, list_b)

In any case, the once-off cost of creating or importing a function is
usually quite cheap. As usual, the best advice is not to worry about
optimization until you have profiled the code and learned where the
actual bottlenecks are. Write what reads best, not what you guess might
be faster, until you really know you need the speed and that it is an
optimization and not a pessimization.

--
Steven
--
http://mail.python.org/mailman/listinfo/python-list
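A runnable illustration of the operator.add suggestion (the sample lists are mine): map called with a two-argument function consumes two sequences in lockstep, so no lambda is needed.

```python
from operator import add

a = [1, 2, 3]
b = [10, 20, 30]

# pairwise sums without defining any function
assert list(map(add, a, b)) == [11, 22, 33]
```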
Re: individually updating unicodedata db?
Vlastimil Brom wrote:
> Hi all,
> I just tried to find some information about the unicodedata database
> and the possibilities of updating it to the latest version of the
> unicode standards (currently 5.2, while python supports 5.1 in the
> latest versions).
> [...]
> I guess, I am stuck here, as I use the precompiled version supplied in
> the windows installer and can't compile python from source to obtain
> the needed unicodedata.pyd.
> Or are there any possibilities I missed to individually upgrade the
> unicodedata database?

From the look of it the Unicode data is compiled into the DLL, but I
don't see any reason, other than speed, why preprocessed data couldn't be
read from a file at startup by the DLL, provided that the format hasn't
changed, e.g. new fields added, without affecting the DLL's interface to
the rest of Python.
--
http://mail.python.org/mailman/listinfo/python-list
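As a quick way to see which Unicode table a given interpreter was compiled with, the module exposes the version string directly (a sketch; the attribute is standard, the printed value depends on the build):

```python
import unicodedata

# version of the Unicode Character Database compiled into this build
# ("5.1.0" on the Python 2.6 releases discussed in this thread)
print(unicodedata.unidata_version)
```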
Re: google token
On Mar 20, 3:12 am, Steven D'Aprano wrote:
> On Sat, 20 Mar 2010 09:17:14 +, Thufir wrote:
>> I'd like to acquire a token, as below, but from Java:
>
> Perhaps you should be asking a Java discussion group? This group is for
> discussing Python.
>
> --
> Steven

What I meant to ask is, how is that token being acquired? Is that just a
GET?

thanks,

Thufir
--
http://mail.python.org/mailman/listinfo/python-list
individually updating unicodedata db?
Hi all,

I just tried to find some information about the unicodedata database and
the possibilities of updating it to the latest version of the unicode
standards (currently 5.2, while python supports 5.1 in the latest
versions). An option to update this database individually might be
useful, as the unicode standard updates seem to be more frequent than the
official python releases (and not every release is updated to the latest
available unicode db version either).

Am I right that this is not possible without recompiling python from
source?

I eventually found the promising file
...Python-src--2.6.5\Python-2.6.5\Tools\unicode\makeunicodedata.py
which required the following files from the unicode database to be in the
same folder:

EastAsianWidth-3.2.0.txt
UnicodeData-3.2.0.txt
CompositionExclusions-3.2.0.txt
UnicodeData.txt
EastAsianWidth.txt
CompositionExclusions.txt

and also

Modules/unicodedata_db.h
Modules/unicodename_db.h
Objects/unicodetype_db.h

After a minor correction - adding the missing "import re" - the script
was able to run and recreate the above .h files.

I guess I am stuck here, as I use the precompiled version supplied in the
windows installer and can't compile python from source to obtain the
needed unicodedata.pyd. Or are there any possibilities I missed to
individually upgrade the unicodedata database?

(Using Python 2.6.5, Win XPh SP3)

Thanks in advance for any hints,
   vbr
--
http://mail.python.org/mailman/listinfo/python-list
Re: GC is very expensive: am I doing something wrong?
In article , Stefan Behnel wrote:

>Lawrence D'Oliveiro, 22.03.2010 00:36:
>> Terry Reedy wrote:
>>> No one has discovered a setting of the internal tuning parameters for
>>> which there are no bad patterns and I suspect there are not any such.
>>> This does not negate Xavier's suggestion that a code change might
>>> also solve your problem.
>>
>> Could it be that for implementing a structure like a trie as the OP
>> is, where a lot of CPU cycles can be spent manipulating the structure,
>> a high-level language like Python, Perl or Ruby just gets in the way?
>
>I would rather say that the specific problem of the trie data structure
>is that it has extremely little benefit over other available data
>structures.

Not true.

>There may still be a couple of niches where it makes sense to consider
>it as an alternative, but given that dicts are so heavily optimised in
>Python, it'll be hard for tries to compete even when written in a
>low-level language.

It depends. If your data is not in nearly sorted order, trees are some of
the best mechanisms available.

>Remember that the original use case was to load a dictionary from a
>text file. For this use case, a trie can be very wasteful in terms of
>memory and rather CPU cache unfriendly on traversal, whereas hash
>values are a) rather fast to calculate for a string, and b) often just
>calculated once and then kept alive in the string object for later
>reuse.

You still have to walk the bucket in a hash map/table. Performance may be
orders of magnitude worse than for trees.

>> My feeling would be, try to get the language to do as much of the work
>> for you as possible. If you can't do that, then you might be better
>> off with a lower-level language.
>
>I agree with the first sentence, but I'd like to underline the word
>'might' in the second. As this newsgroup shows, very often it's enough
>to look for a better algorithmic approach first.
>
>Stefan

--
You want to know who you are?
http://oshosearch.net/Convert/search.php
Most Osho books on line: http://oshosearch.net
--
http://mail.python.org/mailman/listinfo/python-list
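For reference, the structure being debated can be sketched in a few lines with nested dicts. The sketch (not the OP's implementation) makes the cost model visible: each lookup pays one dict probe per character, versus a single hash-and-probe for a flat dict of whole words.

```python
class Trie:
    """Minimal dict-of-dicts trie: one node per character."""

    _END = object()  # sentinel marking end-of-word

    def __init__(self, words=()):
        self.root = {}
        for w in words:
            self.insert(w)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[self._END] = True

    def __contains__(self, word):
        node = self.root
        for ch in word:
            if ch not in node:
                return False
            node = node[ch]
        return self._END in node

t = Trie(["cat", "car", "dog"])
assert "cat" in t and "car" in t
assert "ca" not in t   # prefix only, not a stored word
assert "cow" not in t
```

The per-character indirection is exactly the "CPU cache unfriendly on traversal" cost mentioned above; what a trie buys in exchange is cheap prefix queries, which a hash table cannot offer.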
Re: DreamPie - The Python shell you've always dreamed about!
Luis M. González wrote:
> On Feb 21, 6:40 pm, Mensanator wrote:
>> On Feb 21, 12:14 pm, Paul Boddie wrote:
>>> On 21 Feb, 17:32, Mensanator wrote:
>>>> On Feb 21, 10:30 am, Mensanator wrote:
>>>>> What versions of Python does it suuport?
>>>> What OS are supported?
>>>
>>> From the Web site referenced in the announcement
>>> (http://dreampie.sourceforge.net/):
>>>
>>> """
>>> # Supports Python 2.5, Python 2.6, Jython 2.5, IronPython 2.6 and
>>> Python 3.1.
>>> # Works on Windows and Linux.
>>> """
>>
>> Yeah, I saw that. Funny that something important like that wasn't part
>> of the announcement. I notice no mention of Mac OS, so visiting the
>> website was a complete waste of time on my part, wasn't it?
>>
>>> Paul
>
> Geez... someone spends many hours (or days or even months) writing a
> program to be offered for free to the world, and you get annoyed by
> losing two seconds while checking it out?

And if it's open source there's always the possibility of doing a Mac
port and contributing the code back.

regards
Steve
--
Steve Holden   +1 571 484 6266   +1 800 494 3119
See PyCon Talks from Atlanta 2010  http://pycon.blip.tv/
Holden Web LLC  http://www.holdenweb.com/
UPCOMING EVENTS:  http://holdenweb.eventbrite.com/
--
http://mail.python.org/mailman/listinfo/python-list
Re: short-circuiting any/all ?
In Tim Golden writes:

>On 22/03/2010 18:30, kj wrote:
>> Thanks! I'm glad to know that one can get the short circuiting
>> using a map-type idiom. (I prefer map over comprehensions when I
>> don't need to define a function just for the purpose of passing it
>> to it.)

>In what way does "map" over "comprehensions" save you defining a
>function?

>any (map (is_invalid, L))
>any (is_invalid (i) for i in L)

I was talking in the *general* case. map at the very least requires a
lambda expression, which is a one-time function definition.

~K
--
http://mail.python.org/mailman/listinfo/python-list
Re: How to automate accessor definition?
In <4ba79040$0$22397$426a7...@news.free.fr> Bruno Desthuilliers writes:

>kj a écrit :
>> PS: BTW, this is not the first time that attempting to set an
>> attribute (in a class written by me even) blows up on me. It's
>> situations like these that rattle my grasp of attributes, hence my
>> original question about boring, plodding, verbose Java-oid accessors.
>> For me these Python attributes are still waaay too mysterious and
>> unpredictable to rely on.

>Somehow simplified, here's what you have to know:
...
>As I said, this is a somehow simplified description of the process - I
>skipped the parts about __slots__, __getattribute__ and __setattr__, as
>well as the part about how function class attributes become methods.
>this should be enough to get an idea of what's going on.

Thank you, sir! That was quite the education. (Someday I really should
read carefully the official documentation for the stuff you described,
assuming it exists.)

Thanks also for your code suggestions.

~K
--
http://mail.python.org/mailman/listinfo/python-list
Re: StringChain -- a data structure for managing large sequences of chunks of bytes
On Mon, Mar 22, 2010 at 2:07 AM, Steven D'Aprano wrote:
>
> Perhaps you should have said that it was a wrapper around deque giving
> richer functionality, rather than giving the impression that it was a
> brand new data structure invented by you. People are naturally going to
> be more skeptical about a newly invented data structure than one based
> on a reliable component like deque.

The fact that StringChain uses deque to hold the queue of strings isn't
that important. I just benchmarked it by swapping in the deque for a
list, and using the list costs about one third of a nanosecond per byte
at the scales that the benchmark exercises (namely 10,000,000 bytes in
about 10,000 strings). A third of a nanosecond per byte is about 4% of
the runtime.

I also implemented and benchmarked a simpler deque-based scheme which
puts all the actual bytes from the strings into a deque with
self.d.extend(newstr). As you would expect, this shows good asymptotic
performance but the constants are relatively bad, so that at all of the
actual loads measured it was three orders of magnitude worse than
StringChain and than StringChain-with-a-list-instead-of-a-deque. Moral:
the constants matter!

Those benchmarks are appended. You can run the benchmarks yourself per
the README.txt.

But anyway, I take your point and I updated the StringChain README to
explain that it is a pretty simple data structure that holds a list
(actually a deque) of strings and isn't anything too clever or novel.

By the way, we could further micro-optimize this kind of thing if
''.join() would accept either strings or buffers instead of requiring
strings:

>>> ''.join([buffer("abc"), "def"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected string, buffer found

Then whenever StringChain needs to slice a string into leading and
trailing parts, it could construct a buffer() viewing each part instead
of making a copy of each part.

> Maybe you should consider linking to it on PyPI and seeing if there
> is any interest from others?

http://pypi.python.org/pypi/stringchain

Regards,
Zooko

impl: StringChain
task: _accumulate_then_one_gulp
    1 best: 5.698e+00, 3th-best: 7.486e+00, mean: 7.758e+00,
   10 best: 4.640e+00, 3th-best: 4.690e+00, mean: 7.674e+00,
  100 best: 3.789e+00, 3th-best: 3.806e+00, mean: 3.995e+00,
 1000 best: 4.095e+00, 1th-best: 4.095e+00, mean: 4.095e+00,
task: _alternate_str
    1 best: 1.378e+01, 3th-best: 1.390e+01, mean: 1.500e+01,
   10 best: 9.198e+00, 3th-best: 9.248e+00, mean: 9.385e+00,
  100 best: 8.715e+00, 3th-best: 8.755e+00, mean: 8.808e+00,
 1000 best: 8.738e+00, 1th-best: 8.738e+00, mean: 8.738e+00,

impl: StringChainWithList
task: _accumulate_then_one_gulp
    1 best: 3.600e+00, 3th-best: 3.695e+00, mean: 4.129e+00,
   10 best: 4.070e+00, 3th-best: 4.079e+00, mean: 4.162e+00,
  100 best: 3.662e+00, 3th-best: 3.688e+00, mean: 3.721e+00,
 1000 best: 3.902e+00, 1th-best: 3.902e+00, mean: 3.902e+00,
      1th-worst: 3.902e+00, worst: 3.902e+00 (of 1)
task: _alternate_str
    1 best: 1.369e+01, 3th-best: 1.380e+01, mean: 1.442e+01,
   10 best: 9.251e+00, 3th-best: 9.289e+00, mean: 9.416e+00,
  100 best: 8.809e+00, 3th-best: 8.876e+00, mean: 8.943e+00,
 1000 best: 9.095e+00, 1th-best: 9.095e+00, mean: 9.095e+00,

impl: Dequey
task: _accumulate_then_one_gulp
    1 best: 2.772e+02, 3th-best: 2.785e+02, mean: 2.911e+02,
   10 best: 2.314e+02, 3th-best: 2.334e+02, mean: 2.422e+02,
  100 best: 2.282e+02, 3th-best: 2.288e+02, mean: 2.370e+02,
 1000 best: 2.587e+02, 1th-best: 2.587e+02, mean: 2.587e+02,
task: _alternate_str
    1 best: 1.576e+03, 3th-best: 1.580e+03, mean: 1.608e+03,
   10 best: 1.301e+03, 3th-best: 1.303e+03, mean: 1.306e+03,
  100 best: 1.275e+03, 3th-best: 1.276e+03, mean: 1.278e+03,
 1000 best: 1.280e+03, 1th-best: 1.280e+03, mean: 1.280e+03,
--
http://mail.python.org/mailman/listinfo/python-list
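The core idea is small enough to sketch: keep the chunks in a deque and pay for one big join only when the accumulated bytes are consumed. This is an illustrative reimplementation, not Zooko's actual StringChain code:

```python
from collections import deque

class ChunkChain:
    """Sketch of the StringChain idea: O(1) appends of whole strings,
    one ''.join() when the accumulated data is actually read."""

    def __init__(self):
        self._chunks = deque()
        self._len = 0

    def append(self, s):
        self._chunks.append(s)
        self._len += len(s)

    def __len__(self):
        return self._len

    def popall(self):
        """Return everything accumulated so far as one string."""
        s = ''.join(self._chunks)
        self._chunks.clear()
        self._len = 0
        return s

cc = ChunkChain()
for part in ("GET ", "/index.html ", "HTTP/1.1\r\n"):
    cc.append(part)
assert len(cc) == 26
assert cc.popall() == "GET /index.html HTTP/1.1\r\n"
assert len(cc) == 0
```

Contrast this with the "Dequey" scheme benchmarked above, which extends the deque with individual bytes: both are O(n) overall, but joining a handful of chunks touches far less per-byte machinery, which is where the three-orders-of-magnitude constant factor comes from.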
Problem with sys.path when embedding Python3 in C
Hi, I've recently begun experimenting with embedding python and i got a
small problem. This is my current testing code (basically all from python
docs):

int main(int argc, char *argv[])
{
    PyObject *pModuleName, *pTestModule, *pTestFunc, *pTestResult, *pTestArgs;

    PyImport_AppendInittab("node", &PyInit_node);
    Py_Initialize();

The following line here is the ugly hack I had to do to make it work;
nothing else I know of makes it possible to import modules from the
startup directory. So my question is: Is there a prettier way to do this?

    PyRun_SimpleString("import sys\nsys.path.append(\"\")");
    PyRun_SimpleString("import sys\nprint(sys.path)");

    pModuleName = PyUnicode_FromString("stuff");
    pTestModule = PyImport_Import(pModuleName);
    Py_DECREF(pModuleName);

    if (pTestModule != NULL)
    {
        ...

The whole code is here: http://pastebin.com/805BSY8f

You only need a file in the same directory called stuff.py containing a
function def for a function called do_stuff
--
http://mail.python.org/mailman/listinfo/python-list
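For comparison, the string handed to PyRun_SimpleString is just ordinary Python. A slightly more explicit variant of that embedded snippet inserts the actual startup directory at the front of the path instead of appending ""; this is a sketch of the Python side only, not a C-API alternative:

```python
import os
import sys

# equivalent in effect to the embedded sys.path.append("") hack, but
# explicit about which directory becomes importable
startup_dir = os.getcwd()
if startup_dir not in sys.path:
    sys.path.insert(0, startup_dir)

assert sys.path[0] == startup_dir
```

Inserting at the front also makes the lookup order predictable: modules in the startup directory win over same-named modules later on the path, which appending "" does not guarantee.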
Re: Castrated traceback in sys.exc_info()
Gabriel Genellina a écrit :

> En Wed, 17 Mar 2010 09:42:06 -0300, Pascal Chambon escribió:
>> traceback functions indeed allow the manipulation of exception
>> tracebacks, but the root problem is that anyway, since that traceback
>> is incomplete, your "traceback.format_exc().splitlines()" will only
>> provide frames for callee (downward) functions, not caller (upward)
>> ones, starting from the exception catching frame.
>
> Either I don't understand what you mean, or I can't reproduce it:

Allright, here is more concretely the problem :

import logging

def a(): return b()
def b(): return c()

def c():
    try:
        return d()
    except:
        logging.exception("An error")

def d(): raise ValueError

def main():
    logging.basicConfig(level=logging.DEBUG)
    a()

main()

OUTPUT:

ERROR:root:An error
Traceback (most recent call last):
  File "C:/Users/Pakal/Desktop/aaa.py", line 7, in c
    return d()
  File "C:/Users/Pakal/Desktop/aaa.py", line 11, in d
    def d(): raise ValueError
ValueError
>>>

As you see, the traceback only starts from function c, which handles the
exception. It doesn't show main(), a() and b(), which might however be
(and are, in my case) critical to diagnose the severity of the problem
(since many different paths would lead to calling c()).

So the question is : is that possible to enforce, by a way or another,
the retrieval of the FULL traceback at exception raising point, instead
of that incomplete one ?

Thank you for your help,
regards,
Pascal
--
http://mail.python.org/mailman/listinfo/python-list
Re: short-circuiting any/all ?
On 22/03/2010 18:30, kj wrote:
> Thanks! I'm glad to know that one can get the short circuiting using
> a map-type idiom. (I prefer map over comprehensions when I don't need
> to define a function just for the purpose of passing it to it.)

In what way does "map" over "comprehensions" save you defining a
function?

any (map (is_invalid, L))
any (is_invalid (i) for i in L)

TJG
--
http://mail.python.org/mailman/listinfo/python-list
Re: short-circuiting any/all ?
In <291d82b7-b13b-4f49-901c-8194f3e07...@e7g2000yqf.googlegroups.com> nn
writes:

>If you are in Python 3 "any(map(is_invalid, L))" should short circuit.
>If you are in Python 2 use "from itertools import imap;
>any(imap(is_invalid, L))"

Thanks! I'm glad to know that one can get the short circuiting using a
map-type idiom. (I prefer map over comprehensions when I don't need to
define a function just for the purpose of passing it to it.) And thanks
also to the other repliers for pointing out that the comprehension
version does what I was asking for.

~K
--
http://mail.python.org/mailman/listinfo/python-list
Re: short-circuiting any/all ?
On Mar 22, 7:45 am, kj wrote:
> I have a list of items L, and a test function is_invalid that checks
> the validity of each item. To check that there are no invalid
> items in L, I could check the value of any(map(is_invalid, L)).
> But this approach is suboptimal in the sense that, no matter what
> L is, is_invalid will be executed for all elements of L, even though
> the value returned by any() is fully determined by the first True
> in its argument. In other words, all calls to is_invalid after
> the first one to return True are superfluous. Is there a
> short-circuiting counterpart to any(map(is_invalid, L)) that avoids
> these superfluous calls?
>
> OK, there's this one, of course:
>
> def _any_invalid(L):
>     for i in L:
>         if is_invalid(i):
>             return True
>     return False
>
> But is there anything built-in? (I imagine that a lazy version of
> map *may* do the trick, *if* any() will let it be lazy.)

Yes, that will work:

  from itertools import imap   # lazy version of map
  any(imap(is_invalid, L))     # short-circuits on first True

Yet another approach (slightly faster):

  from itertools import ifilter
  any(ifilter(is_invalid, L))

Raymond
--
http://mail.python.org/mailman/listinfo/python-list
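In Python 3 the imap/ifilter spellings are unnecessary, because map and filter are already lazy. Recording the calls makes the short-circuiting visible; this is a sketch with a toy is_invalid, not code from the thread:

```python
calls = []

def is_invalid(x):
    calls.append(x)
    return x < 0  # toy validity test: negatives are invalid

L = [1, 2, -3, 4, 5]

# any() stops pulling from the lazy map as soon as it sees a true value
assert any(map(is_invalid, L)) is True
assert calls == [1, 2, -3]  # 4 and 5 were never tested
```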
Re: DreamPie - The Python shell you've always dreamed about!
On Feb 21, 6:40 pm, Mensanator wrote:
> On Feb 21, 12:14 pm, Paul Boddie wrote:
>> On 21 Feb, 17:32, Mensanator wrote:
>>> On Feb 21, 10:30 am, Mensanator wrote:
>>>> What versions of Python does it suuport?
>>> What OS are supported?
>>
>> From the Web site referenced in the announcement
>> (http://dreampie.sourceforge.net/):
>>
>> """
>> # Supports Python 2.5, Python 2.6, Jython 2.5, IronPython 2.6 and
>> Python 3.1.
>> # Works on Windows and Linux.
>> """
>
> Yeah, I saw that. Funny that something important like that wasn't part
> of the announcement. I notice no mention of Mac OS, so visiting the
> website was a complete waste of time on my part, wasn't it?
>
>> Paul

Geez... someone spends many hours (or days or even months) writing a
program to be offered for free to the world, and you get annoyed by
losing two seconds while checking it out?
--
http://mail.python.org/mailman/listinfo/python-list
Re: How to automate accessor definition?
On 3/22/2010 11:44 AM, Bruno Desthuilliers wrote:
> Another (better IMHO) solution is to use a plain property, and store
> the computed value as an implementation attribute :
>
> @property
> def foo(self):
>     cached = self.__dict__.get('_foo_cache')
>     if cached is None:
>         self._foo_cache = cached = self._some_time_consuming_operation()
>     return cached

There's no need to access __dict__ directly. I believe this is
equivalent (and clearer):

  @property
  def foo(self):
      try:
          cached = self._foo_cache
      except AttributeError:
          self._foo_cache = cached = self._time_consuming_op()
      return cached

-John
--
http://mail.python.org/mailman/listinfo/python-list
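A runnable demonstration of the try/except variant (the class, its names, and the call counter standing in for the expensive operation are all mine):

```python
class Report:
    """Toy class demonstrating the caching property above; the
    expensive operation is simulated with a call counter."""

    def __init__(self):
        self.calls = 0

    def _time_consuming_op(self):
        self.calls += 1
        return 42

    @property
    def foo(self):
        try:
            cached = self._foo_cache
        except AttributeError:
            self._foo_cache = cached = self._time_consuming_op()
        return cached

r = Report()
assert r.foo == 42
assert r.foo == 42
assert r.calls == 1  # the expensive operation ran only once
```

One behavioral difference worth noting: the try/except version also caches a result that happens to be None, whereas the __dict__.get version would recompute it on every access.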
Re: How to automate accessor definition?
kj a écrit : In Dennis Lee Bieber writes: On Sun, 21 Mar 2010 16:57:40 + (UTC), kj declaimed the following in gmane.comp.python.general: Regarding properties, is there a built-in way to memoize them? For example, suppose that the value of a property is obtained by parsing the contents of a file (specified in another instance attribute). It would make no sense to do this parsing more than once. Is there a standard idiom for memoizing the value once it is determined for the first time? Pickle, Shelve? Maybe in conjunction with SQLite3... I was thinking of something less persistent; in-memory, that is. Maybe something in the spirit of: @property def foo(self): # up for some "adaptive auto-redefinition"? self.foo = self._some_time_consuming_operation() return self.foo ...except that that assignment won't work! It bombs with "AttributeError: can't set attribute". ~K PS: BTW, this is not the first time that attempting to set an attribute (in a class written by me even) blows up on me. It's situations like these that rattle my grasp of attributes, hence my original question about boring, plodding, verbose Java-oid accessors. For me these Python attributes are still waaay too mysterious and unpredictable to rely on. Somehow simplified, here's what you have to know: 1/ there are instance attributes and class attributes. Instance attributes lives in the instance's __dict__, class attributes lives in the class's __dict__ or in a parent's class __dict__. 2/ when looking up an attribute on an instance, the rules are * first, check if there's a key by that name in the instance's __dict__. If yes, return the associated value * else, check if there's a class or parent class attribute by that name. 
* if yes:
** if the attribute has a '__get__' method, call the __get__ method
with class and instance as arguments, and return the result (this is
known as the "descriptor protocol" and provides support for computed
attributes, including methods and properties)
** else return the attribute itself
* else (if nothing has been found yet), look for a __getattr__ method
in the class and its parents. If found, call this __getattr__ method
with the attribute name and return the result
* else, give up and raise an AttributeError

3/ When binding an attribute on an instance, the rules are:
* first, check if there's a class (or parent class) attribute by that
name that has a '__set__' method. If yes, call this class attribute's
__set__ method with instance and value as arguments. This is the
second part of the "descriptor protocol", as used by the property
type.
* else, add the attribute's name and value to the instance's __dict__

As I said, this is a somewhat simplified description of the process -
I skipped the parts about __slots__, __getattribute__ and __setattr__,
as well as the part about how function class attributes become
methods. But this should be enough to get an idea of what's going on.

In your above case, you defined a "foo" property class attribute. The
property type implements both __get__ and __set__, but you only
defined a callback for the __get__ method (the function you decorated
with 'property'), so when you try to rebind "foo", the default
property type's __set__ implementation is invoked, whose behaviour is
to forbid setting the attribute. If you want a settable property, you
have to provide a setter too.
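The lookup and binding rules above can be observed directly with two toy descriptors (the class names here are illustrative, not from the thread): one with only __get__ (non-data, so an instance __dict__ entry shadows it) and one with both __get__ and __set__ (data, like property):

```python
class NonData(object):
    def __get__(self, instance, cls):
        return 'from descriptor'

class Data(object):
    def __get__(self, instance, cls):
        return 'from descriptor'
    def __set__(self, instance, value):
        # mimics a property with no setter
        raise AttributeError("can't set attribute")

class C(object):
    nd = NonData()
    d = Data()

c = C()
c.nd = 'from instance dict'  # no __set__, so this lands in c.__dict__
print(c.nd)                  # the __dict__ entry shadows the descriptor
print(c.d)                   # from descriptor

try:
    c.d = 1                  # __set__ intercepts the binding and refuses
except AttributeError as e:
    print(e)
```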
Now if you want a "replaceable" property-like attribute, you could
define your own computed attribute (aka "descriptor") type _without_ a
__set__ method:

class replaceableprop(object):
    def __init__(self, fget):
        self._fget = fget
    def __get__(self, instance, cls):
        if instance is None:
            return self
        return self._fget(instance)

@replaceableprop
def foo(self):
    # will add 'foo' into self.__dict__, shadowing the descriptor
    self.foo = self._some_time_consuming_operation()
    return self.foo

Another (better IMHO) solution is to use a plain property, and store
the computed value as an implementation attribute :

@property
def foo(self):
    cached = self.__dict__.get('_foo_cache')
    if cached is None:
        self._foo_cache = cached = self._some_time_consuming_operation()
    return cached

> Sometimes one can set them, sometimes not, and I can't quite tell the
> two situations apart. It's all very confusing to the Noob. (I'm sure
> this is all documented *somewhere*, but this does not make using
> attributes any more intuitive or straightforward. I'm also sure that
> *eventually*, with enough Python experience under one's belt, this
> all becomes second nature. My point is that Python attributes are not
> as transparent and natural to the uninitiated as some of you folks
> seem to think.)

I agree that the introduction of the descriptor protocol added some
more complexity to an already somewhat unusual object model.

HTH.
--
http://mail.python.org/mailman/listinfo/python-list
Re: nonuniform sampling with replacement
On 2010-03-21 05:11 AM, Jah_Alarm wrote:
> I've got a vector length n of integers (some of them are repeating),

I recommend reducing it down to unique integers first.

> and I got a selection probability vector of the same length. How will
> I sample with replacement k (<=n) values with the probability vector.
> In Matlab this function is randsample. I couldn't find anything to
> this extent in Scipy or Numpy.

In [19]: from scipy.stats import rv_discrete

In [20]: p = rv_discrete(name='adhoc', values=([0, 1, 2], [0.5, 0.25, 0.25]))

In [21]: p.rvs(size=100)
Out[21]:
array([0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 2, 2, 2, 1, 0, 0, 2, 0, 0, 1, 0,
       0, 2, 2, 0, 1, 2, 1, 0, 0, 2, 1, 1, 1, 1, 1, 2, 1, 2, 0, 2, 0, 2, 0,
       0, 2, 0, 1, 0, 2, 2, 1, 0, 0, 1, 0, 2, 1, 0, 0, 1, 0, 2, 1, 2, 1, 0,
       1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 2, 0, 1,
       2, 1, 1, 0, 0, 0, 1, 0])

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
--
http://mail.python.org/mailman/listinfo/python-list
Re: short-circuiting any/all ?
kj wrote: > I have a list of items L, and a test function is_invalid that checks > the validity of each item. To check that there are no invalid > items in L, I could check the value of any(map(is_invalid, L)). > But this approach is suboptimal in the sense that, no matter what > L is, is_invalid will be executed for all elements of L, even though > the value returned by any() is fully determined by the first True > in its argument. In other words, all calls to is_invalid after > the first one to return True are superfluous. Is there a > short-circuiting counterpart to any(map(is_invalid, L)) that avoids > these superfluous calls? > > OK, there's this one, of course: > > def _any_invalid(L): > for i in L: > if is_invalid(i): > return True > return False > > But is there anything built-in? (I imagine that a lazy version of > map *may* do the trick, *if* any() will let it be lazy.) > > TIA! > > ~K If you are in Python 3 "any(map(is_invalid, L))" should short circuit. If you are in Python 2 use "from itertools import imap; any(imap(is_invalid, L))" -- http://mail.python.org/mailman/listinfo/python-list
Re: short-circuiting any/all ?
On Mon, 2010-03-22 at 14:45 +, kj wrote:
> I have a list of items L, and a test function is_invalid that checks
> the validity of each item. To check that there are no invalid
> items in L, I could check the value of any(map(is_invalid, L)).
> But this approach is suboptimal in the sense that, no matter what
> L is, is_invalid will be executed for all elements of L,

any(is_invalid(a) for a in L)

... the generator expression will be computed lazily.

Tim
--
http://mail.python.org/mailman/listinfo/python-list
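The laziness Tim describes can be seen directly with a stand-in is_invalid that records each call (this test function is illustrative, not from the thread):

```python
calls = []

def is_invalid(x):
    calls.append(x)
    return x > 2  # illustrative validity test

L = [1, 2, 3, 4, 5]
print(any(is_invalid(i) for i in L))  # True
print(calls)  # [1, 2, 3] -- evaluation stopped at the first True
```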
Re: short-circuiting any/all ?
kj wrote:
> I have a list of items L, and a test function is_invalid that checks
> the validity of each item. To check that there are no invalid
> items in L, I could check the value of any(map(is_invalid, L)).
> But this approach is suboptimal in the sense that, no matter what
> L is, is_invalid will be executed for all elements of L, even though
> the value returned by any() is fully determined by the first True
> in its argument. In other words, all calls to is_invalid after
> the first one to return True are superfluous. Is there a
> short-circuiting counterpart to any(map(is_invalid, L)) that avoids
> these superfluous calls?
>
> OK, there's this one, of course:
>
> def _any_invalid(L):
>     for i in L:
>         if is_invalid(i):
>             return True
>     return False
>
> But is there anything built-in? (I imagine that a lazy version of
> map *may* do the trick, *if* any() will let it be lazy.)
>
> TIA!
>
> ~K

Sounds like unnecessary optimization. Just write

def _any_valid(L):
    return bool([i for i in L if is_valid(i)])

If you really care about speed, meaning the user actually experiences
a noticeable delay, then the solution you proposed is fine.

JM
--
http://mail.python.org/mailman/listinfo/python-list
Re: short-circuiting any/all ?
On 22/03/2010 14:45, kj wrote:
> I have a list of items L, and a test function is_invalid that checks
> the validity of each item. To check that there are no invalid
> items in L, I could check the value of any(map(is_invalid, L)).
> But this approach is suboptimal in the sense that, no matter what
> L is, is_invalid will be executed for all elements of L, even though
> the value returned by any() is fully determined by the first True
> in its argument. In other words, all calls to is_invalid after
> the first one to return True are superfluous. Is there a
> short-circuiting counterpart to any(map(is_invalid, L)) that avoids
> these superfluous calls?
>
> OK, there's this one, of course:
>
> def _any_invalid(L):
>     for i in L:
>         if is_invalid(i):
>             return True
>     return False
>
> But is there anything built-in? (I imagine that a lazy version of
> map *may* do the trick, *if* any() will let it be lazy.)

Have I missed the point of your question, perhaps? This seems to work
as lazily as you'd like...

def less_than_five(x):
    print "testing", x
    return x < 5

L = range(10)

print any(less_than_five(i) for i in L)
print all(less_than_five(i) for i in L)  # for symmetry

TJG
--
http://mail.python.org/mailman/listinfo/python-list
Re: add an entry to twentyquestions.org (please)
Ben Finney writes: > twenty questions writes: > >> add an entry to http:// > > Don't spam groups with your off-topic begging for a closed database silo > (please) Don't repeat spam -- John Bokma j3b Hacking & Hiking in Mexico - http://johnbokma.com/ http://castleamber.com/ - Perl & Python Development -- http://mail.python.org/mailman/listinfo/python-list
short-circuiting any/all ?
I have a list of items L, and a test function is_invalid that checks the validity of each item. To check that there are no invalid items in L, I could check the value of any(map(is_invalid, L)). But this approach is suboptimal in the sense that, no matter what L is, is_invalid will be executed for all elements of L, even though the value returned by any() is fully determined by the first True in its argument. In other words, all calls to is_invalid after the first one to return True are superfluous. Is there a short-circuiting counterpart to any(map(is_invalid, L)) that avoids these superfluous calls? OK, there's this one, of course: def _any_invalid(L): for i in L: if is_invalid(i): return True return False But is there anything built-in? (I imagine that a lazy version of map *may* do the trick, *if* any() will let it be lazy.) TIA! ~K -- http://mail.python.org/mailman/listinfo/python-list
Re: How to automate accessor definition?
* kj:
> In Dennis Lee Bieber writes:
>> On Sun, 21 Mar 2010 16:57:40 + (UTC), kj declaimed the following in
>> gmane.comp.python.general:
>>> Regarding properties, is there a built-in way to memoize them? For
>>> example, suppose that the value of a property is obtained by parsing
>>> the contents of a file (specified in another instance attribute).
>>> It would make no sense to do this parsing more than once. Is there
>>> a standard idiom for memoizing the value once it is determined for
>>> the first time?
>>
>> Pickle, Shelve? Maybe in conjunction with SQLite3...
>
> I was thinking of something less persistent; in-memory, that is.
> Maybe something in the spirit of:
>
> @property
> def foo(self):
>     # up for some "adaptive auto-redefinition"?
>     self.foo = self._some_time_consuming_operation()
>     return self.foo
>
> ...except that that assignment won't work! It bombs with
> "AttributeError: can't set attribute".

Since foo is a read-only property you can't assign to it. But it
doesn't matter: if it worked technically it wouldn't give you what
you're after, the once-only evaluation.

A simple way to do that, in the sense of copying code and having it
work, is to use a generator that, after evaluating the expensive op,
loops forever yielding the resulting value.

A probably more efficient way, and anyway one that is perhaps easier
to understand, is as follows:

from __future__ import print_function

class LazyEval:
    def __init__(self, f):
        self._f = f
        self._computed = False

    @property
    def value(self):
        if not self._computed:
            self._value = self._f()
            self._computed = True
        return self._value

class HiHo:
    def _expensive_op(self):
        print("Expensive op!")
        return 42

    def __init__(self):
        self._foo = LazyEval(self._expensive_op)

    @property
    def foo(self):
        return self._foo.value

o = HiHo()
for i in range(5):
    print(o.foo)

Cheers & hth.,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list
Re: How to automate accessor definition?
In Dennis Lee Bieber writes:
>On Sun, 21 Mar 2010 16:57:40 + (UTC), kj
>declaimed the following in gmane.comp.python.general:
>> Regarding properties, is there a built-in way to memoize them? For
>> example, suppose that the value of a property is obtained by parsing
>> the contents of a file (specified in another instance attribute).
>> It would make no sense to do this parsing more than once. Is there
>> a standard idiom for memoizing the value once it is determined for
>> the first time?
>>
> Pickle, Shelve? Maybe in conjunction with SQLite3...

I was thinking of something less persistent; in-memory, that is.
Maybe something in the spirit of:

@property
def foo(self):
    # up for some "adaptive auto-redefinition"?
    self.foo = self._some_time_consuming_operation()
    return self.foo

...except that that assignment won't work! It bombs with
"AttributeError: can't set attribute".

~K

PS: BTW, this is not the first time that attempting to set an
attribute (in a class written by me even) blows up on me. It's
situations like these that rattle my grasp of attributes, hence my
original question about boring, plodding, verbose Java-oid accessors.
For me these Python attributes are still waaay too mysterious and
unpredictable to rely on. Sometimes one can set them, sometimes not,
and I can't quite tell the two situations apart. It's all very
confusing to the Noob. (I'm sure this is all documented *somewhere*,
but this does not make using attributes any more intuitive or
straightforward. I'm also sure that *eventually*, with enough Python
experience under one's belt, this all becomes second nature. My point
is that Python attributes are not as transparent and natural to the
uninitiated as some of you folks seem to think.)
--
http://mail.python.org/mailman/listinfo/python-list
Re: chroot fails with mount point passed to subprocess.Popen?
* newton10471:
> Hi, I'm trying to use subprocess.Popen() to do a Linux chroot to a
> mount point passed in as a parameter to the following function:
>
> def getInstalledKernelVersion(mountPoint):
>     linuxFsRoot = mountPoint + "/root"
>     print "type of linuxFsRoot is %s" % type(linuxFsRoot)
>     installedKernelVersionResult = subprocess.Popen(['chroot', linuxFsRoot, 'rpm', '-q', 'kernel-xen'])
>     return installedKernelVersionResult
>
> and it dies with the following:
>
> type of linuxFsRoot is
> chroot: cannot change root directory to
> /storage/mounts/mnt_3786314034939740895.mnt/root: No such file or directory
>
> When I explicitly set linuxFsRoot =
> "/storage/mounts/mnt_3786314034939740895.mnt/root", it works fine.
>
> I also tried this to concatenate the mountpoint + /root, and it
> failed in the same way:
>
> linuxFsRoot = ("%s/root") % mountPoint

Use the os.path functions.

> Anyone know what might be happening here?

Since the computed and literal paths /look/ identical and have the
same type, the only thing I can imagine is that there is some
invisible character. Try comparing the computed and literal path
character by character. Print the difference, or if they're identical,
that they are. Possibly you have a GIGO problem.

Cheers & hth.,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list
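Both of Alf's suggestions can be sketched as follows (the mount-point value here is made up for illustration, not the poster's real path): build the path with os.path.join, print the repr() so invisible characters become visible, and compare character by character:

```python
import os.path

mount_point = '/storage/mounts/mnt_example.mnt'  # illustrative value
computed = os.path.join(mount_point, 'root')
literal = '/storage/mounts/mnt_example.mnt/root'

print(repr(computed))  # repr() exposes any invisible characters

# character-by-character comparison of the two paths
for i, (a, b) in enumerate(zip(computed, literal)):
    if a != b:
        print('differ at index %d: %r vs %r' % (i, a, b))
        break
else:
    if len(computed) == len(literal):
        print('identical')
    else:
        print('one path is a prefix of the other')
```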
Re: subtraction is giving me a syntax error
In article <56597268-3472-4fd9-a829-6d9cf51cf...@e7g2000yqf.googlegroups.com>, Joel Pendery wrote: >So I am trying to write a bit of code and a simple numerical >subtraction > >y_diff = y_diff-H > >is giving me the error > >Syntaxerror: Non-ASCII character '\x96' in file on line 70, but no >encoding declared. > >Even though I have deleted some lines before it and this line is no >longer line 70, I am still getting the error every time. I have tried >to change the encoding of the file to utf-8 but to no avail, I still >am having this issue. Any ideas? Make a hex-dump of your file. How does line 70 look? If you see 0900 66 66 96 48 0A -- ff.H. you know you are doing something illegal. > >Thanks in advance Groetjes Albert P.S. With all due respect, error messages come not any clearer than that! -- -- Albert van der Horst, UTRECHT,THE NETHERLANDS Economic growth -- being exponential -- ultimately falters. alb...@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst -- http://mail.python.org/mailman/listinfo/python-list
chroot fails with mount point passed to subprocess.Popen?
Hi, I'm trying to use subprocess.Popen() to do a Linux chroot to a
mount point passed in as a parameter to the following function:

def getInstalledKernelVersion(mountPoint):
    linuxFsRoot = mountPoint + "/root"
    print "type of linuxFsRoot is %s" % type(linuxFsRoot)
    installedKernelVersionResult = subprocess.Popen(['chroot', linuxFsRoot, 'rpm', '-q', 'kernel-xen'])
    return installedKernelVersionResult

and it dies with the following:

type of linuxFsRoot is
chroot: cannot change root directory to
/storage/mounts/mnt_3786314034939740895.mnt/root: No such file or directory

When I explicitly set linuxFsRoot =
"/storage/mounts/mnt_3786314034939740895.mnt/root", it works fine.

I also tried this to concatenate the mountpoint + /root, and it failed
in the same way:

linuxFsRoot = ("%s/root") % mountPoint

Anyone know what might be happening here?

Thanks in advance,
Matt Newton
--
http://mail.python.org/mailman/listinfo/python-list
Re: What is pkg-config for ?
In message , Gabriel Genellina wrote: > I fail to see how is this relevant to Python... Well, so many of the questions in this noisegroup seem to be about Windows problems, not Python ones... :) -- http://mail.python.org/mailman/listinfo/python-list
Re: accessing variable of the __main__ module
News123 wrote:
> Hi,
>
> I wondered about the best way for a module's function to determine
> the existence and value of variables in the __main__ module.
>
> What I came up with is:
>
> ### main.py ##
> import mod
> A = 4
> if __name__ == "__main__":
>     mod.f()
>
> ### mod.py ##
> def f():
>     try:
>         from __main__ import A
>     except ImportError as e:
>         A = "does not exist"
>     print "__main__.A", A
>
> Is there anything better / more pythonic?
>
> Thanks in advance and bye
>
> N

The 'was I imported from that module' check is usually a sign of bad
design. I can't say more without further detail of what you're trying
to achieve. But since what you have is working, I wouldn't bother more
than that, because no matter what you try, it will be ugly :o).

JM
--
http://mail.python.org/mailman/listinfo/python-list
Re: Tuples vs. variable-length argument lists
Spencer Pearson wrote:
> Hi! This might be more of a personal-preference question than
> anything, but here goes: when is it appropriate for a function to
> take a list or tuple as input, and when should it allow a varying
> number of arguments? It seems as though the two are always
> interchangeable. For a simple example...
>
> def subtract(x, nums):
>     return x - sum(nums)
>
> ... works equally well if you define it as "subtract(x, *nums)" and
> put an asterisk in front of any lists/tuples you pass it. I can't
> think of any situation where you couldn't convert from one form to
> the other with just a star or a pair of parentheses.
>
> Is there a generally accepted convention for which method to use? Is
> there ever actually a big difference between the two that I'm not
> seeing?

FYI some linters report the usage of * as bad practice, I don't know
the reason though. Pylint reports it as using 'magic'. Anyway the form
without * is commonly used.

JM
--
http://mail.python.org/mailman/listinfo/python-list
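The interchangeability Spencer describes can be shown side by side (subtract is the example from the quoted post; subtract_var is the *-variant):

```python
def subtract(x, nums):        # takes one sequence argument
    return x - sum(nums)

def subtract_var(x, *nums):   # takes a varying number of arguments
    return x - sum(nums)

print(subtract(10, [1, 2, 3]))       # 4
print(subtract_var(10, 1, 2, 3))     # 4

# a star (or a pair of parentheses) converts one form into the other
print(subtract_var(10, *[1, 2, 3]))  # 4
print(subtract(10, (1, 2, 3)))       # 4
```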
Re: execute bash builtins in python
Actually, using the -i param in the command to subprocess doesn't seem
to work as well as setting PS1 to some garbage: it starts a new
interactive shell, thereby kicking me out of python. :/

Thank you,
-Alex Goretoy
--
http://mail.python.org/mailman/listinfo/python-list
Re: StringChain -- a data structure for managing large sequences of chunks of bytes
On Sun, 21 Mar 2010 23:09:46 -0600, Zooko O'Whielacronx wrote:

> But the use case that I am talking about is where you need to accumulate
> new incoming strings into your buffer while alternately processing
> leading prefixes of the buffer.
[...]
> Below are the abbreviated results of the benchmark. You can easily run
> this benchmark yourself by following the README.txt in the StringChain
> source package [3]. These results show that for the load imposed by this
> benchmark StringChain is the only one of the four that scales and that
> it is also nice and fast.

I was reading this email, and thinking "Why do you need this
StringChain data structure? From the description it sounds like a job
for collections.deque." And funnily enough, following the links you
supplied I came across this:

"You can get the package from http://pypi.python.org/pypi/stringchain
or with darcs get http://tahoe-lafs.org/source/stringchain/trunk. It
has unit tests. It is in pure Python (it uses collections.deque and
string)."

http://tahoe-lafs.org/trac/stringchain

Perhaps you should have said that it was a wrapper around deque giving
richer functionality, rather than giving the impression that it was a
brand new data structure invented by you. People are naturally going
to be more skeptical about a newly invented data structure than one
based on a reliable component like deque.

In any case, it looks to me that the StringChain data structure itself
is a little too application-specific for me to be personally
interested in it. Maybe you should consider linking to it on PyPI and
seeing if there is any interest from others?

Regards,
Steven
--
http://mail.python.org/mailman/listinfo/python-list
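For reference, the access pattern Zooko describes (append whole incoming chunks at the tail, consume byte prefixes from the head) can be sketched on top of collections.deque alone. This is a simplified toy for illustration, not the StringChain package itself:

```python
from collections import deque

class ChunkBuffer(object):
    """Toy buffer: O(1) appends of whole chunks, cheap pops of leading bytes."""
    def __init__(self):
        self._chunks = deque()
        self._len = 0

    def append(self, chunk):
        self._chunks.append(chunk)
        self._len += len(chunk)

    def __len__(self):
        return self._len

    def popleft_bytes(self, n):
        """Remove and return the first n bytes (requires n <= len(self))."""
        parts = []
        while n > 0:
            chunk = self._chunks.popleft()
            if len(chunk) <= n:
                # consume the whole chunk
                parts.append(chunk)
                n -= len(chunk)
            else:
                # split the chunk: keep the tail at the head of the deque
                parts.append(chunk[:n])
                self._chunks.appendleft(chunk[n:])
                n = 0
        out = ''.join(parts)
        self._len -= len(out)
        return out

buf = ChunkBuffer()
buf.append('hello ')
buf.append('world')
print(buf.popleft_bytes(5))  # hello
print(len(buf))              # 6
```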
Re: execute bash builtins in python
For the broken pipe error: perhaps there's a different way I can get
shell output other than using subprocess? I need the output of the
alias command in a string, and the output of the declare command in a
string as well. I would also like to avoid creating a one-liner script
to make this happen, if at all possible.

Thank you,
-Alex Goretoy
--
http://mail.python.org/mailman/listinfo/python-list
Re: GC is very expensive: am I doing something wrong?
Lawrence D'Oliveiro, 22.03.2010 00:36:
> Terry Reedy wrote:
>> No one has discovered a setting of the internal tuning parameters for
>> which there are no bad patterns and I suspect there are not any such.
>> This does not negate Xavier's suggestion that a code change might
>> also solve your problem.
>
> Could it be that for implementing a structure like a trie as the OP
> is, where a lot of CPU cycles can be spent manipulating the structure,
> a high-level language like Python, Perl or Ruby just gets in the way?

I would rather say that the specific problem of the trie data
structure is that it has extremely little benefit over other available
data structures. There may still be a couple of niches where it makes
sense to consider it as an alternative, but given that dicts are so
heavily optimised in Python, it'll be hard for tries to compete even
when written in a low-level language.

Remember that the original use case was to load a dictionary from a
text file. For this use case, a trie can be very wasteful in terms of
memory and rather CPU cache unfriendly on traversal, whereas hash
values are a) rather fast to calculate for a string, and b) often just
calculated once and then kept alive in the string object for later
reuse.

> My feeling would be, try to get the language to do as much of the work
> for you as possible. If you can't do that, then you might be better
> off with a lower-level language.

I agree with the first sentence, but I'd like to underline the word
'might' in the second. As this newsgroup shows, very often it's enough
to look for a better algorithmic approach first.

Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Re: execute bash builtins in python
I do have a problem however that I don't know how to solve. My application dies abruptly at random times because of this and I get this output error in the terminal: bash: line 0: declare: write error: Broken pipe and sometimes it crashes and I get this output error; this one maybe gtk related, yes? *** glibc detected *** /usr/bin/python: double free or corruption (fasttop): 0xb650fa78 *** === Backtrace: = /lib/tls/i686/cmov/libc.so.6[0x17aff1] /lib/tls/i686/cmov/libc.so.6[0x17c6f2] /lib/tls/i686/cmov/libc.so.6(cfree+0x6d)[0x17f7cd] /lib/libglib-2.0.so.0(g_free+0x36)[0x7a3196] /usr/lib/libgdk-x11-2.0.so.0[0x27faba] /usr/lib/libgdk-x11-2.0.so.0(gdk_region_union+0x8e)[0x28129e] /usr/lib/libgdk-x11-2.0.so.0[0x28e26c] /usr/lib/libgdk-x11-2.0.so.0(gdk_window_invalidate_maybe_recurse+0x243)[0x28eb33] /usr/lib/libgdk-x11-2.0.so.0(gdk_window_invalidate_maybe_recurse+0x206)[0x28eaf6] /usr/lib/libgtk-x11-2.0.so.0[0x28b5893] /usr/lib/libgtk-x11-2.0.so.0[0x28b6cff] /usr/lib/libgtk-x11-2.0.so.0(gtk_widget_queue_resize+0x75)[0x28bcaa5] /usr/lib/libgtk-x11-2.0.so.0[0x288b557] /usr/lib/libgobject-2.0.so.0(g_cclosure_marshal_VOID__BOXED+0x88)[0x320068] /usr/lib/libgobject-2.0.so.0(g_closure_invoke+0x1b2)[0x313072] /usr/lib/libgobject-2.0.so.0[0x3287a8] /usr/lib/libgobject-2.0.so.0(g_signal_emit_valist+0x7bd)[0x329b2d] /usr/lib/libgobject-2.0.so.0(g_signal_emit+0x26)[0x329fb6] /usr/lib/libgtk-x11-2.0.so.0(gtk_tree_model_row_deleted+0x9a)[0x28751da] /usr/lib/libgtk-x11-2.0.so.0(gtk_list_store_remove+0x114)[0x278e9e4] /usr/lib/libgtk-x11-2.0.so.0(gtk_list_store_clear+0x90)[0x278eab0] /usr/lib/pymodules/python2.6/gtk-2.0/gtk/_gtk.so[0x10b64a1] /usr/bin/python(PyEval_EvalFrameEx+0x4175)[0x80dbfd5] /usr/bin/python(PyEval_EvalFrameEx+0x5524)[0x80dd384] /usr/bin/python(PyEval_EvalCodeEx+0x7d2)[0x80dddf2] /usr/bin/python[0x816014c] /usr/bin/python(PyObject_Call+0x4a)[0x806120a] /usr/bin/python[0x80684ac] /usr/bin/python(PyObject_Call+0x4a)[0x806120a] 
/usr/bin/python(PyEval_CallObjectWithKeywords+0x42)[0x80d6ef2] /usr/bin/python(PyObject_CallObject+0x20)[0x80612a0] /usr/lib/pymodules/python2.6/gtk-2.0/gobject/_gobject.so[0xd4503e] /usr/lib/libgobject-2.0.so.0(g_closure_invoke+0x1b2)[0x313072] /usr/lib/libgobject-2.0.so.0[0x3287a8] /usr/lib/libgobject-2.0.so.0(g_signal_emit_valist+0x7bd)[0x329b2d] /usr/lib/libgobject-2.0.so.0(g_signal_emit+0x26)[0x329fb6] /usr/lib/libgtk-x11-2.0.so.0[0x26fb8dc] /usr/lib/libgobject-2.0.so.0(g_cclosure_marshal_VOID__BOXED+0x88)[0x320068] /usr/lib/libgobject-2.0.so.0(g_closure_invoke+0x1b2)[0x313072] /usr/lib/libgobject-2.0.so.0[0x3287a8] /usr/lib/libgobject-2.0.so.0(g_signal_emit_valist+0x7bd)[0x329b2d] /usr/lib/libgobject-2.0.so.0(g_signal_emit+0x26)[0x329fb6] /usr/lib/libgtk-x11-2.0.so.0(gtk_tree_model_row_deleted+0x9a)[0x28751da] /usr/lib/libgtk-x11-2.0.so.0(gtk_list_store_remove+0x114)[0x278e9e4] /usr/lib/libgtk-x11-2.0.so.0(gtk_list_store_clear+0x90)[0x278eab0] /usr/lib/pymodules/python2.6/gtk-2.0/gtk/_gtk.so[0x10b64a1] /usr/bin/python(PyEval_EvalFrameEx+0x4175)[0x80dbfd5] /usr/bin/python(PyEval_EvalFrameEx+0x5524)[0x80dd384] /usr/bin/python[0x815de2f] /usr/bin/python(PyEval_EvalFrameEx+0x95f)[0x80d87bf] /usr/bin/python(PyEval_EvalCodeEx+0x7d2)[0x80dddf2] /usr/bin/python[0x816022f] /usr/bin/python(PyObject_Call+0x4a)[0x806120a] /usr/bin/python(PyEval_EvalFrameEx+0x30b9)[0x80daf19] /usr/bin/python(PyEval_EvalFrameEx+0x5524)[0x80dd384] /usr/bin/python(PyEval_EvalFrameEx+0x5524)[0x80dd384] /usr/bin/python(PyEval_EvalCodeEx+0x7d2)[0x80dddf2] /usr/bin/python[0x816014c] /usr/bin/python(PyObject_Call+0x4a)[0x806120a] /usr/bin/python[0x80684ac] /usr/bin/python(PyObject_Call+0x4a)[0x806120a] /usr/bin/python(PyEval_CallObjectWithKeywords+0x42)[0x80d6ef2] /usr/bin/python[0x8107d88] === Memory map: 0011-0024e000 r-xp 08:06 688127 /lib/tls/i686/cmov/ libc-2.10.1.so 0024e000-0024f000 ---p 0013e000 08:06 688127 /lib/tls/i686/cmov/ libc-2.10.1.so 0024f000-00251000 r--p 0013e000 08:06 688127 
/lib/tls/i686/cmov/ libc-2.10.1.so 00251000-00252000 rw-p 0014 08:06 688127 /lib/tls/i686/cmov/ libc-2.10.1.so 00252000-00255000 rw-p 00:00 0 00255000-002e7000 r-xp 08:06 198090 /usr/lib/libgdk-x11-2.0.so.0.1800.3 002e7000-002e9000 r--p 00092000 08:06 198090 /usr/lib/libgdk-x11-2.0.so.0.1800.3 002e9000-002ea000 rw-p 00094000 08:06 198090 /usr/lib/libgdk-x11-2.0.so.0.1800.3 002ea000-00305000 r-xp 08:06 198967 /usr/lib/libatk-1.0.so.0.2809.1 00305000-00306000 r--p 0001b000 08:06 198967 /usr/lib/libatk-1.0.so.0.2809.1 00306000-00307000 rw-p 0001c000 08:06 198967 /usr/lib/libatk-1.0.so.0.2809.1 00307000-00308000 rwxp 00:00 0 00308000-00344000 r-xp 08:06 196888 /usr/lib/libgobject-2.0.so.0.2200.3 00344000-00345000 r--p 0003b000 08:06 196888 /usr/lib/libgobject-2.0.so.0.2200.3 00345000-00346000 rw-p 0003c000 08:06 196888 /usr/lib/libgobject-2.0.so.0.2200.3 00346000-00351000 r-xp 08:06 199652 /usr/lib/libpangocairo-1.0.s
Re: nonuniform sampling with replacement
On 22 Mar, 01:27, Peter Otten <__pete...@web.de> wrote:
> Jah_Alarm wrote:
>> I've got a vector length n of integers (some of them are repeating),
>> and I got a selection probability vector of the same length. How will
>> I sample with replacement k (<=n) values with the probability vector.
>> In Matlab this function is randsample. I couldn't find anything to
>> this extent in Scipy or Numpy.
>
> If all else fails you can do it yourself:
>
> import random
> import bisect
>
> def iter_sample_with_replacement(values, weights):
>     _random = random.random
>     _bisect = bisect.bisect
>
>     acc_weights = []
>     sigma = 0
>     for w in weights:
>         sigma += w
>         acc_weights.append(sigma)
>     while 1:
>         yield values[_bisect(acc_weights, _random()*sigma)]
>
> def sample_with_replacement(k, values, weights):
>     return list(islice(iter_sample_with_replacement(values, weights), k))
>
> if __name__ == "__main__":
>     from itertools import islice
>     N = 10**6
>     values = range(4)
>     weights = [2, 3, 4, 1]
>
>     histo = [0] * len(values)
>     for v in islice(iter_sample_with_replacement(values, weights), N):
>         histo[v] += 1
>     print histo
>     print sample_with_replacement(30, values, weights)
>
> Peter

thanks a lot,
Alex
--
http://mail.python.org/mailman/listinfo/python-list
Re: nonuniform sampling with replacement
On 22 мар, 01:28, "Alf P. Steinbach" wrote: > * Alf P. Steinbach: > > > > > * Jah_Alarm: > >> I've got a vector length n of integers (some of them are repeating), > >> and I got a selection probability vector of the same length. How will > >> I sample with replacement k (<=n) values with the probabilty vector. > >> In Matlab this function is randsample. I couldn't find anything to > >> this extent in Scipy or Numpy. > > > > [snip] > > > > > Disclaimer: I just cooked it up and just cooked up binary searches > > usually have bugs. They usually need to be exercised and fixed. But I > > think you get the idea. Note also that division differs in Py3 and Py2. > > This is coded for Py3. > > Sorry, I realized this just now: the name "p" in the choice() method is > utterly > misleading, which you can see from the call; it's a random number not a > probability. I guess my fingers just repeated what they typed earlier. > > Cheeers, > > - Alf (repeat typist) thanks a lot alex -- http://mail.python.org/mailman/listinfo/python-list