[ANNOUNCE] Twisted 10.2.0 Released
Twisted 10.2.0, the third Twisted release of 2010, has emerged from the mysterious depths of Twisted Matrix Labs, as so many releases before it. Survivors of the release process - what few there were of them - have been heard to claim that this version is awesome, even more robust, fun-sized and oven fresh.

Crossing several things that shouldn't ought to be crossed, including the streams and the rubicon, I have assumed the triple responsibilities of feature author, project leader, *and* release manager for 10.2: with this dark and terrible power - a power which no man ought to wield alone - I have wrought a release which contains many exciting new features, including:

- A plug-in API for adding new types of endpoint descriptions. http://tm.tl/4695
- A new, simpler, substantially more robust CoreFoundation reactor. http://tm.tl/1833
- Improvements to the implementation of Deferred which should both improve performance and fix certain runtime errors with long callback chains. http://tm.tl/411
- Deferred.setTimeout is (finally) gone. To quote the author of this change: "A new era of peace has started." http://tm.tl/1702
- NetstringReceiver is substantially faster. http://tm.tl/4378

And, of course, nearly one hundred smaller bug fixes, documentation updates, and general improvements. See the NEWS file included in the release for more details.

Look upon our Twisted, ye mighty, and make your network applications event-driven: get it now, from:

http://twistedmatrix.com/

... or simply install the 'Twisted' package from PyPI.

Many thanks to Christopher Armstrong, for his work on release-automation tools that made this possible; to Jonathan Lange, for thoroughly documenting the process and thereby making my ascent to the throne of release manager possible; and to Jean-Paul Calderone for his tireless maintenance of our build and test infrastructure as well as his help with the release.
Most of all, thanks to everyone who contributed a patch, reported a bug or reviewed a ticket for 10.2. Not including those already thanked, there are 41 of you, so it would be a bit tedious to go through everyone, but you know who you are and we absolutely couldn't do it without you! Thanks a ton! -- http://mail.python.org/mailman/listinfo/python-announce-list Support the Python Software Foundation: http://www.python.org/psf/donations/
Possible to determine number of rows affected by a SQLite update or delete command?
Is there a cursor or connection property that returns the number of rows affected by a SQLite update or delete command? Or, if we want this information, do we have to pre-query our database for a count of records that will be affected by an operation? Thank you, Malcolm -- http://mail.python.org/mailman/listinfo/python-list
Re: Possible to determine number of rows affected by a SQLite update or delete command?
On Tue, Nov 30, 2010 at 2:29 PM, pyt...@bdurham.com wrote:

> Is there a cursor or connection property that returns the number of rows affected by a SQLite update or delete command? Or, if we want this information, do we have to pre-query our database for a count of records that will be affected by an operation?

The cursor has a rowcount attribute. The documentation of the sqlite3 module says the implementation is quirky. You might take a look at it and see if it fits your needs.

-- regards, kushal
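A minimal sketch of the rowcount suggestion (table and column names here are invented for the example): in the sqlite3 module, cursor.rowcount reports the rows touched by the most recent UPDATE or DELETE.

```python
import sqlite3

# Throwaway in-memory database, purely for demonstration
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE items (name TEXT, price REAL)")
cur.executemany("INSERT INTO items VALUES (?, ?)",
                [("apple", 1.0), ("banana", 2.0), ("cherry", 3.0)])

# rowcount reflects how many rows the last UPDATE/DELETE touched
cur.execute("UPDATE items SET price = price * 2 WHERE price >= 2.0")
print(cur.rowcount)  # 2 -- banana and cherry matched the WHERE clause
```

One quirk the documentation warns about: for statements other than UPDATE, DELETE and INSERT, rowcount may simply be -1, so don't rely on it for SELECTs.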
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
Dan Stromberg drsali...@gmail.com wrote:

> I've got a couple of programs that read filenames from stdin, and then open those files and do things with them. These programs sort of do the *ix xargs thing, without requiring xargs. In Python 2, these work well. Irrespective of how filenames are encoded, things are opened OK, because it's all just a stream of single byte characters. In Python 3, I'm finding that I have encoding issues with characters with their high bit set. Things are fine with strictly ASCII filenames. With high-bit-set characters, even if I change stdin's encoding with:
>
> import io
> STDIN = io.open(sys.stdin.fileno(), 'r', encoding='ISO-8859-1')
>
> ...even with that, when I read a filename from stdin with a single-character Spanish n~, the program cannot open that filename because the n~ is apparently internally converted to two bytes, but remains one byte in the filesystem. I decided to try ISO-8859-1 with Python 3, because I have a Java program that encountered a similar problem until I used en_US.ISO-8859-1 in an environment variable to set the JVM's encoding for stdin. Python 2 shows the n~ as 0xf1 in an os.listdir('.'). Python 3 with an encoding of ISO-8859-1 wants it to be 0xc3 followed by 0xb1. Does anyone know what I need to do to read filenames from stdin with Python 3.1 and subsequently open them, when some of those filenames include characters with their high bit set? TIA!

Try using sys.stdin.buffer instead of sys.stdin. It gives you bytes instead of strings. Also use byte literals instead of string literals for paths, i.e. os.listdir(b'.').

Marc
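The byte-level mismatch described in this thread is easy to see in Python 3 itself: the n~ (U+00F1) is one byte under Latin-1 but two bytes under UTF-8.

```python
# U+00F1 LATIN SMALL LETTER N WITH TILDE, the "n~" from the post
n_tilde = "\u00f1"

# One byte under ISO-8859-1, two bytes under UTF-8 -- which is why a
# filename stored on disk as the single byte 0xf1 cannot round-trip
# through a UTF-8 text stream unchanged.
print(n_tilde.encode("iso-8859-1"))  # b'\xf1'
print(n_tilde.encode("utf-8"))       # b'\xc3\xb1'
```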
Memory issues when storing as List of Strings vs List of List
Hi all,

I have a big file 1.5GB in size, with about 6 million lines of tab-delimited data. I have to perform some filtration on the data and keep the good data. After filtration, I have about 5.5 million data rows remaining. As you might have already guessed, I have to read them in batches, and I did so using .readlines(1). After reading each batch, I split each line (a string) into a list using .split("\t") and then check several conditions, after which, if all conditions are satisfied, I store the list into a matrix. The code is as follows:

-----Start-----
a=open(bigfile)
matrix=[]
while True:
    lines = a.readlines(1)
    for line in lines:
        data=line.split("\t")
        if several_conditions_are_satisfied:
            matrix.append(data)
    print "Number of lines read:", len(lines), "matrix.__sizeof__:", matrix.__sizeof__()
    if len(lines)==0:
        break
-----End-----

Results:
Number of lines read: 461544 matrix.__sizeof__: 1694768
Number of lines read: 449840 matrix.__sizeof__: 3435984
Number of lines read: 455690 matrix.__sizeof__: 5503904
Number of lines read: 451955 matrix.__sizeof__: 6965928
Number of lines read: 452645 matrix.__sizeof__: 8816304
Number of lines read: 448555 matrix.__sizeof__: 9918368
Traceback (most recent call last):
MemoryError

The peak memory usage at the task manager is 2GB, which results in the memory error. However, if I modify the code to store a list of strings rather than a list of lists, by changing the append statement stated above to matrix.append("\t".join(data)), then I do not run out of memory.
Results:
Number of lines read: 461544 matrix.__sizeof__: 1694768
Number of lines read: 449840 matrix.__sizeof__: 3435984
Number of lines read: 455690 matrix.__sizeof__: 5503904
Number of lines read: 451955 matrix.__sizeof__: 6965928
Number of lines read: 452645 matrix.__sizeof__: 8816304
Number of lines read: 448555 matrix.__sizeof__: 9918368
Number of lines read: 453455 matrix.__sizeof__: 12552984
Number of lines read: 432440 matrix.__sizeof__: 14122132
Number of lines read: 432921 matrix.__sizeof__: 15887424
Number of lines read: 464259 matrix.__sizeof__: 17873376
Number of lines read: 450875 matrix.__sizeof__: 20107572
Number of lines read: 458552 matrix.__sizeof__: 20107572
Number of lines read: 453261 matrix.__sizeof__: 22621044
Number of lines read: 413456 matrix.__sizeof__: 22621044
Number of lines read: 166464 matrix.__sizeof__: 25448700
Number of lines read: 0 matrix.__sizeof__: 25448700

In this case, the peak memory according to the task manager is about 1.5 GB.

Does anyone know why there is such a big difference in memory usage when storing the matrix as a list of lists versus as a list of strings? According to __sizeof__ though, the values are the same in both cases. Is there any method by which I can store all the info in a list of lists? I have tried creating such a matrix of equivalent size and it only uses 35mb of memory, but I am not sure why, when using the code above, the memory usage shot up so fast and exceeded 2GB. Any advice is greatly appreciated.

Regards, Jinxiang
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
Dan Stromberg wrote:

> I've got a couple of programs that read filenames from stdin, and then open those files and do things with them. These programs sort of do the *ix xargs thing, without requiring xargs. In Python 2, these work well. Irrespective of how filenames are encoded, things are opened OK, because it's all just a stream of single byte characters.

I think you're wrong. The filenames' encoding as they are read from stdin must be the same as the encoding used by the file system. If the file system expects UTF-8 and you feed it ISO-8859-1 you'll run into errors. You always have to know either (a) both the file system's and stdin's actual encoding, or (b) that both encodings are the same. If byte strings work you are in situation (b) or just lucky. I'd guess the latter ;)

> In Python 3, I'm finding that I have encoding issues with characters with their high bit set. Things are fine with strictly ASCII filenames. With high-bit-set characters, even if I change stdin's encoding with:
>
> import io
> STDIN = io.open(sys.stdin.fileno(), 'r', encoding='ISO-8859-1')

I suppose you can handle (b) with

STDIN = sys.stdin.buffer

or

STDIN = io.TextIOWrapper(sys.stdin.buffer, encoding=sys.getfilesystemencoding())

in Python 3. I'd prefer the latter because it makes your assumptions explicit. (Disclaimer: I'm not sure whether I'm using the io API as Guido intended it)

> ...even with that, when I read a filename from stdin with a single-character Spanish n~, the program cannot open that filename because the n~ is apparently internally converted to two bytes, but remains one byte in the filesystem. I decided to try ISO-8859-1 with Python 3, because I have a Java program that encountered a similar problem until I used en_US.ISO-8859-1 in an environment variable to set the JVM's encoding for stdin. Python 2 shows the n~ as 0xf1 in an os.listdir('.'). Python 3 with an encoding of ISO-8859-1 wants it to be 0xc3 followed by 0xb1.
> Does anyone know what I need to do to read filenames from stdin with Python 3.1 and subsequently open them, when some of those filenames include characters with their high bit set? TIA!
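The bytes-only approach discussed in this thread can be sketched as follows; an io.BytesIO stands in for sys.stdin.buffer so the example is self-contained, and the filename is invented:

```python
import io

# Stand-in for sys.stdin.buffer carrying a Latin-1-encoded filename;
# a real program would iterate over sys.stdin.buffer directly.
fake_stdin = io.BytesIO(b"espa\xf1a.txt\n")

for raw_line in fake_stdin:
    name = raw_line.rstrip(b"\n")  # stay in bytes, no decoding step
    # open() also accepts bytes paths, preserving the exact on-disk bytes:
    #     with open(name, "rb") as f: ...
    print(name)
```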
Re: Memory issues when storing as List of Strings vs List of List
OW Ghim Siong wrote:

> I have a big file 1.5GB in size, with about 6 million lines of tab-delimited data.

How many fields are there on each line?

> I have to perform some filtration on the data and keep the good data. After filtration, I have about 5.5 million data left remaining. As you might already guessed, I have to read them in batches and I did so using .readlines(1).

I'd have guessed differently. Typically, I would say that you read one line, apply whatever operation you want to it and then write out the result. At least that is the typical operation of filtering.

> a=open(bigfile)

I guess you are on MS Windows. There, you have different handling of textual and non-textual files with regards to the handling of line endings. Generally, using non-textual input is easier, because it doesn't require any translations. However, textual input is the default, therefore:

a = open(bigfile, "rb")

Or, even better:

with open(bigfile, "rb") as a:

to make sure the file is closed correctly and in time.

> matrix=[]
> while True:
>     lines = a.readlines(1)
>     for line in lines:

I believe you could do

for line in a:
    # use line here

> data=line.split("\t")

Question here: How many elements does each line contain? And what is their content? The point is that each object has its overhead, and if the content is just e.g. an integral number or a short string, the ratio of interesting content to overhead is rather bad! Compare this to storing a longer string with just the overhead of a single string object instead, and it should be obvious.

> However, if I modify the code, to store as a list of string rather than a list of list by changing the append statement stated above to matrix.append("\t".join(data)), then I do not run out of memory.

You already have the result of that join:

matrix.append(line)

> Does anyone know why is there such a big difference memory usage when storing the matrix as a list of list, and when storing it as a list of string?
> According to __sizeof__ though, the values are the same whether storing it as a list of list, or storing it as a list of string.

I can barely believe that. How are you using __sizeof__? Why aren't you using sys.getsizeof() instead? Are you aware that the size of a list doesn't include the size of its content (even though it grows with the number of elements), while the size of a string does?

> Is there any methods how I can store all the info into a list of list? I have tried creating such a matrix of equivalent size and it only uses 35mb of memory but I am not sure why when using the code above, the memory usage shot up so fast and exceeded 2GB.

The size of an empty list is 20 here, plus 4 per element (makes sense on a 32-bit machine), excluding the elements themselves. That means that you have around 6M elements (25448700/4). The point is that your 35mb don't include any content, probably just a single interned integer or None, so that all elements of your list are the same and only require memory once. In your real-world application that is obviously not so.

My suggestions:
1. Find out what exactly is going on here, in particular why our interpretations of the memory usage differ.
2. Redesign your code to really use a filtering design, i.e. don't keep the whole data in memory.
3. If you still have memory issues, take a look at the array library, which should make storage of large arrays a bit more efficient.

Good luck!

Uli

-- Domino Laser GmbH Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
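The distinction drawn above between a list's shallow size and the size of its contents can be sketched with sys.getsizeof (Python 3; the field values are invented):

```python
import sys

row = ["12345", "67890", "hello"]
as_lists = [row[:] for _ in range(3)]          # list of lists
as_strs = ["\t".join(row) for _ in range(3)]   # list of joined strings

# The two outer lists have identical shallow sizes: each just holds
# three pointers, regardless of what those pointers refer to.
print(sys.getsizeof(as_lists) == sys.getsizeof(as_strs))  # True

# Summing over the contents tells a different story: every inner list
# adds its own object overhead on top of the strings it refers to.
deep_lists = (sys.getsizeof(as_lists)
              + sum(sys.getsizeof(r) for r in as_lists)
              + sum(sys.getsizeof(s) for r in as_lists for s in r))
deep_strs = (sys.getsizeof(as_strs)
             + sum(sys.getsizeof(s) for s in as_strs))
print(deep_lists > deep_strs)  # True
```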
ANNOUNCE: NHI1-0.10, PLMK-1.8 and libmsgque-4.8
Dear User,

ANNOUNCE: Major Feature Release

libmsgque: Application-Server-Toolkit for C, C++, JAVA, C#, Go, TCL, PERL, PHP, PYTHON, RUBY, VB.NET
PLMK: Programming-Language-Microkernel
NHI1: Non-Human-Intelligence #1

STATEMENT
=========
"It takes 2 years and a team of qualified software developers to implement a new programming language, but it takes only 2 weeks to add a micro-kernel" - aotto1968

SUMMARY
=======
Add support for the programming language Go from Google

LINKS
=====
UPDATE - PLMK definition: http://openfacts2.berlios.de/wikien/index.php/BerliosProject:NHI1_-_TheKernel
ChangeLog: http://nhi1.berlios.de/theLink/changelog.htm
libmsgque including PHP documentation: http://nhi1.berlios.de/theLink/index.htm
NHI1: http://nhi1.berlios.de/
DOWNLOAD: http://developer.berlios.de/projects/nhi1/

Go man pages:
reference: gomsgqueref.n
tutorial: gomsgquetut.n

Regards, Andreas Otto (aotto1968)
Re: Memory issues when storing as List of Strings vs List of List
OW Ghim Siong wrote:

> Hi all, I have a big file 1.5GB in size, with about 6 million lines of tab-delimited data. I have to perform some filtration on the data and keep the good data. After filtration, I have about 5.5 million data left remaining. As you might already guessed, I have to read them in batches and I did so using .readlines(1). After reading each batch, I will split the line (in string format) to a list using .split("\t") and then check several conditions, after which if all conditions are satisfied, I will store the list into a matrix. The code is as follows:
>
> -----Start-----
> a=open(bigfile)
> matrix=[]
> while True:
>     lines = a.readlines(1)
>     for line in lines:
>         data=line.split("\t")
>         if several_conditions_are_satisfied:
>             matrix.append(data)
>     print "Number of lines read:", len(lines), "matrix.__sizeof__:", matrix.__sizeof__()
>     if len(lines)==0:
>         break
> -----End-----

As Ulrich says, don't use readlines(); use

for line in a:
    ...

That way you have only one line in memory at a time instead of the huge lines list.

> Results:
> Number of lines read: 461544 matrix.__sizeof__: 1694768
> Number of lines read: 449840 matrix.__sizeof__: 3435984
> Number of lines read: 455690 matrix.__sizeof__: 5503904
> Number of lines read: 451955 matrix.__sizeof__: 6965928
> Number of lines read: 452645 matrix.__sizeof__: 8816304
> Number of lines read: 448555 matrix.__sizeof__: 9918368
> Traceback (most recent call last):
> MemoryError
>
> The peak memory usage at the task manager is 2GB which results in the memory error. However, if I modify the code, to store as a list of string rather than a list of list by changing the append statement stated above to matrix.append("\t".join(data)), then I do not run out of memory.
> Results:
> Number of lines read: 461544 matrix.__sizeof__: 1694768
> Number of lines read: 449840 matrix.__sizeof__: 3435984
> Number of lines read: 455690 matrix.__sizeof__: 5503904
> Number of lines read: 451955 matrix.__sizeof__: 6965928
> Number of lines read: 452645 matrix.__sizeof__: 8816304
> Number of lines read: 448555 matrix.__sizeof__: 9918368
> Number of lines read: 453455 matrix.__sizeof__: 12552984
> Number of lines read: 432440 matrix.__sizeof__: 14122132
> Number of lines read: 432921 matrix.__sizeof__: 15887424
> Number of lines read: 464259 matrix.__sizeof__: 17873376
> Number of lines read: 450875 matrix.__sizeof__: 20107572
> Number of lines read: 458552 matrix.__sizeof__: 20107572
> Number of lines read: 453261 matrix.__sizeof__: 22621044
> Number of lines read: 413456 matrix.__sizeof__: 22621044
> Number of lines read: 166464 matrix.__sizeof__: 25448700
> Number of lines read: 0 matrix.__sizeof__: 25448700
>
> In this case, the peak memory according to the task manager is about 1.5 GB. Does anyone know why is there such a big difference memory usage when storing the matrix as a list of list, and when storing it as a list of string? According to __sizeof__ though, the values are the same whether storing it as a list of list, or storing it as a list of string.

__sizeof__ gives you the shallow size of the list, basically the memory to hold C pointers to the items in the list. A better approximation for the total size of a list of lists of strings is

>>> from sys import getsizeof as sizeof
>>> matrix = [["alpha", "beta"], ["gamma", "delta"]]
>>> sizeof(matrix), sum(sizeof(row) for row in matrix), sum(sizeof(entry) for row in matrix for entry in row)
(88, 176, 179)
>>> sum(_)
443

As you can see, the outer list requires only a small portion of the total memory, and its relative size will decrease as the matrix grows. The above calculation may still be wrong because some of the strings could be identical. Collapsing identical strings into a single object is also a way to save memory if you have a significant number of repetitions. Try

matrix = []
with open(...) as f:
    for line in f:
        data = line.split("\t")
        if ...:
            matrix.append(map(intern, data))

to see whether it sufficiently reduces the amount of memory needed.

> Is there any methods how I can store all the info into a list of list? I have tried creating such a matrix of equivalent size and it only uses 35mb of memory but I am not sure why when using the code above, the memory usage shot up so fast and exceeded 2GB. Any advice is greatly appreciated. Regards, Jinxiang
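The intern trick above works because interned equal strings become the very same object; note that in Python 3 intern moved to sys.intern (in Python 2 it is a builtin). A sketch with an invented field value:

```python
import sys

# Build the second string at runtime so the interpreter cannot
# constant-fold the two literals into one object by itself.
a = sys.intern("chromosome_12")
b = sys.intern("".join(["chromosome_", "12"]))

# Equal interned strings are the *same* object, so a matrix full of
# repeated field values pays for each distinct string only once.
print(a is b)  # True
```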
Re: remote control firefox with python
On Sunday 28 November 2010, 16:22:33 News123 wrote:

> Hi, I wondered whether there is a simple way to 'remote' control Firefox with python. With remote controlling I mean:
> - enter a url in the title bar and click on it
> - create a new tab
> - enter another url, click on it
> - save the html document of this page
> - probably the most difficult one: emulate a click or 'right click' on a certain button or link of the current page
> - other interesting things would be to be able to enter the master password from a script
> - to enable/disable proxy settings while running
> The reason why I want to stay within Firefox and not use any other 'mechanize' framework is that the pages I want to automate might contain a lot of javascript for the construction of the actual page.

If webkit-based rendering is an option (since its javascript engine is respected by web developers nowadays..), you might want to check out PyQt, based on current versions of Qt. It provides very easy access to a full featured web browser engine without sacrificing low level details. All your requirements are provided easily (if you're able to grok the Qt documentation, e.g. ignore all C++ clutter, you're set). I've transcoded all available QtWebKit examples to python lately, available here: http://www.riverbankcomputing.com/pipermail/pyqt/2010-November/028614.html The attachment is a tar.bz2 archive, btw.

Clicking is achieved by:

webelement.evaluateJavaScript(
    "var event = document.createEvent('MouseEvents');"
    "event.initEvent('click', true, true);"
    "this.dispatchEvent(event);"
)

Cheers, Pete
Re: TDD in python
In article 58fe3680-21f5-42f8-9341-e069cbb88...@r19g2000prm.googlegroups.com, rustom rustompm...@gmail.com wrote:

> Looking around I found this: http://bytes.com/topic/python/answers/43330-unittest-vs-py-test where Raymond Hettinger no less says quite unequivocally that he prefers py.test to the builtin unittest because it is not so heavy-weight. Is this the general consensus nowadays among pythonistas? [Note I tend to agree but I've no experience so asking]

Both frameworks have their fans; I doubt you'll find any consensus. Pick one, learn it, and use it. What's important is that you write tests, write lots of tests, and write good tests. Which framework you use is a detail.
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
On Tue, 2010-11-30 at 11:52 +0100, Peter Otten wrote: Dan Stromberg wrote: I've got a couple of programs that read filenames from stdin, and then open those files and do things with them. These programs sort of do the *ix xargs thing, without requiring xargs. In Python 2, these work well. Irrespective of how filenames are encoded, things are opened OK, because it's all just a stream of single byte characters. I think you're wrong. The filenames' encoding as they are read from stdin must be the same as the encoding used by the file system. If the file system expects UTF-8 and you feed it ISO-8859-1 you'll run into errors. I think this is wrong. In Unix there is no concept of filename encoding. Filenames can have any arbitrary set of bytes (except '/' and '\0'). But the filesystem itself neither knows nor cares about encoding. You always have to know either (a) both the file system's and stdin's actual encoding, or (b) that both encodings are the same. If this is true, then I think that it is wrong to do in Python3. Any language should be able to deal with the filenames that the host OS allows. Anyway, going on with the OP.. can you open stdin so that you can accept arbitrary bytes instead of strings and then open using the bytes as the filename? I don't have that much experience with Python3 to say for sure. -a -- http://mail.python.org/mailman/listinfo/python-list
How does GC affect generator context managers?
I've been reading through the docs for contextlib and PEP 343, and came across this: Note that we're not guaranteeing that the finally-clause is executed immediately after the generator object becomes unused, even though this is how it will work in CPython. ...referring to context managers created via the contextlib.contextmanager decorator containing cleanup code in a finally clause. While I understand that Python-the-language does not specify GC semantics, and different implementations can do different things with that, what I don't get is how GC even relates to a context manager created from a generator. As I understood it, when the with block exits, the __exit__() method is called immediately. This calls the next() method on the underlying generator, which forces it to run to completion (and raise a StopIteration), which includes the finally clause... right? — Jason -- http://mail.python.org/mailman/listinfo/python-list
how to go on learning python
I'm basically a C/C++ programmer who recently came to Python for some web development. Using Django and JavaScript, I dare say I can develop some web applications now. But often I feel I'm not good at Python. I don't know much about generators, descriptors and decorators (although I can use some of them to accomplish things, I don't think I'm capable of knowing their internals). I find my code ugly, and it seems nearly everything has already been done by the libraries. When I want to do something, I just find some libraries or modules and then just finish the work. So I'm a bit tired of just doing this kind of high-level scripting, only to find myself a bad programmer. My question then is: after one has coded some kind of basic app, how can one keep on learning programming using Python? Do some more interesting projects? Read more general books about programming? Or...?
Re: Needed: Real-world examples for Python's Cooperative Multiple Inheritance
Most of the examples presented here can use the decorator pattern instead. Especially the window system On Mon, Nov 29, 2010 at 5:27 PM, Gregory Ewing greg.ew...@canterbury.ac.nzwrote: Paul Rubin wrote: The classic example though is a window system, where you have a window class, and a scroll bar class, and a drop-down menu class, etc. and if you want a window with a scroll bar and a drop-down menu, you inherit from all three of those classes. Not in any GUI library I've ever seen. Normally there would be three objects involved in such an arrangement, a Window, a ScrollBar and a DropDownMenu, connected to each other in some way. -- Greg -- http://mail.python.org/mailman/listinfo/python-list -- http://www.afroblend.com African news as it happens. -- http://mail.python.org/mailman/listinfo/python-list
Re: How does GC affect generator context managers?
Jason jason.hee...@gmail.com wrote:

> As I understood it, when the with block exits, the __exit__() method is called immediately. This calls the next() method on the underlying generator, which forces it to run to completion (and raise a StopIteration), which includes the finally clause... right?

That is true if the with block exits, but if the with block (or try..finally block) contains yield you have a generator. In that case if you simply drop the generator on the floor the cleanup at the end of the with will still happen, but maybe not until the generator is garbage collected.

def foo():
    with open("foo") as foo:
        for line in foo:
            yield line

bar = foo()
print bar.next()
del bar  # May close the file now or maybe later...

-- Duncan Booth http://kupuguy.blogspot.com
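The point in this thread can be made concrete without touching the filesystem: the finally clause (which is what a with block's __exit__ boils down to inside a generator) runs only when the suspended generator is closed or collected. A sketch:

```python
cleaned_up = []

def lines():
    try:
        yield "first"
        yield "second"
    finally:
        # Stand-in for the cleanup a 'with' block would perform
        cleaned_up.append(True)

gen = lines()
next(gen)            # generator is now suspended inside the try block
print(cleaned_up)    # [] -- no cleanup has happened yet

gen.close()          # raises GeneratorExit inside the generator...
print(cleaned_up)    # [True] -- ...so the finally clause has now run
```

Dropping the last reference to gen instead of calling close() has the same effect in CPython, but only when the garbage collector gets to it, which is exactly the caveat the contextlib docs hedge about.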
Programming games in historical linguistics with Python
Hello, Following a discussion that began 3 weeks ago I would like to ask a question regarding substitution of letters according to grammatical rules in historical linguistics. I would like to automate the transformation of words according to complex rules of phonology and integrate that script in a visual environment. Here follows the previous thread: http://groups.google.com/group/comp.lang.python/browse_thread/thread/3c55f9f044c3252f/fe7c2c82ecf0dbf5?lnk=gstq=evolutionary+linguistics#fe7c2c82ecf0dbf5 Is there a way to refer to vowels and consonants as a subcategory of text? Is there a function to remove all vowels? How should one create and order the dictionary file for the rules? How to chain several transformations automatically from multiple rules? Finally can anyone show me what existing python program or phonological software can do this? What function could tag syllables, the word nucleus and the codas? How easy is it to bridge this with a more visual environment where interlinear, aligned text can be displayed with Greek notations and braces as usual in the phonology textbooks? Best regards, Dax Bloom -- http://mail.python.org/mailman/listinfo/python-list
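Setting aside the visual-environment part of the question, the letter-class and rule-chaining parts have straightforward sketches in plain Python; the vowel set and the two rewrite rules below are invented placeholders, not real sound laws:

```python
import re

VOWELS = "aeiou"

def remove_vowels(word):
    # Vowels as a regex character class, deleted in one pass
    return re.sub("[%s]" % VOWELS, "", word)

# An ordered list of (pattern, replacement) pairs -- a list rather than
# a dict, because phonological rules must apply in a fixed feeding order.
RULES = [
    (r"p", "f"),   # illustrative consonant shift, not a real sound law
    (r"f$", "v"),  # illustrative word-final change
]

def apply_rules(word, rules=RULES):
    # Chain the transformations: each rule sees the previous rule's output
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word

print(remove_vowels("pater"))  # ptr
print(apply_rules("pater"))    # fater
```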
Help: problem in setting the background colour ListBox
Hi everyone,

I have a requirement to display my data in a textCtrl-like widget, but I need the data in the row to be clickable, so that when I click the data it fires an event and gives me the selected data value. After a long search I found ListBox to be perfect for my use, but when I try to set the background colour to the colour my application requires, I am not able to do so, though I am able to set the foreground colour. Hope someone will guide me in solving my problem.

Thanks
Re: Memory issues when storing as List of Strings vs List of List
On 11/30/2010 04:29 AM, OW Ghim Siong wrote:

> a=open(bigfile)
> matrix=[]
> while True:
>     lines = a.readlines(1)
>     for line in lines:
>         data=line.split("\t")
>         if several_conditions_are_satisfied:
>             matrix.append(data)
>     print "Number of lines read:", len(lines), "matrix.__sizeof__:", matrix.__sizeof__()
>     if len(lines)==0:
>         break

As others have mentioned, don't use .readlines(), but use the file-object as an iterator instead. This can even be rewritten as a simple list-comprehension:

from csv import reader
matrix = [data
    for data in reader(file('bigfile.txt', 'rb'), delimiter='\t')
    if several_conditions_are_satisfied(data)
    ]

assuming that you're throwing away most of the data (the final matrix fits well within memory, even if the source file doesn't).

-tkc
Re: Python 2.7.1
On Mon, 29 Nov 2010 15:11:28 -0800 (PST) Spider matt...@cuneiformsoftware.com wrote: 2.7 includes many features that were first released in Python 3.1. The faster io module ... I understand that I/O in Python 3.0 was slower than 2.x (due to quite a lot of the code being in Python rather than C, I gather), and that this was fixed up in 3.1. So, io in 3.1 is faster than in 3.0. Is it also true that io is faster in 2.7 than 2.6? That's what the release notes imply, but I wonder whether that comment has been back- ported from the 3.1 release notes, and doesn't actually apply to 2.7. The `io` module, which was backported from 3.1/3.2, is faster than in 2.6, but that's not what is used by default in 2.x when calling e.g. open() or file() (you'd have to use io.open() instead). So, as you suspect, the speed of I/O in 2.7 hasn't changed. The `io` module is available in 2.6/2.7 so that you can experiment with some 3.x features without switching, and in this case it's much faster than 2.6. Regards Antoine. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
On Mon, 29 Nov 2010 21:52:07 -0800 (PST) Yingjie Lan lany...@yahoo.com wrote:

> --- On Tue, 11/30/10, Dan Stromberg drsali...@gmail.com wrote:
> > In Python 3, I'm finding that I have encoding issues with characters with their high bit set. Things are fine with strictly ASCII filenames. With high-bit-set characters, even if I change stdin's encoding with:
> Co-ask. I have also had problems with file names in Chinese characters with Python 3. I unzipped the turtle demo files into the desktop folder (of course, the word 'desktop' is in Chinese; it is a Windows XP system and the localization is Chinese), and all of a sudden some of the demos wouldn't work anymore. But if I move them to a folder whose path contains only English characters, everything goes back to normal.

Can you try the latest 3.2alpha4 (*) and check whether this is fixed? If not, could you please open a bug on http://bugs.python.org ?

(*) http://python.org/download/releases/3.2/

Thank you

Antoine.

-- http://mail.python.org/mailman/listinfo/python-list
Re: Memory issues when storing as List of Strings vs List of List
On Tue, 30 Nov 2010 18:29:35 +0800 OW Ghim Siong o...@bii.a-star.edu.sg wrote:

> Does anyone know why there is such a big difference in memory usage between storing the matrix as a list of lists and storing it as a list of strings?

That's because any object has a fixed overhead (related to metadata and allocation), so storing a matrix line as a sequence of several objects rather than as a single string makes the total overhead larger, especially when the payload of each object is small. If you want to mitigate the issue, you could store your lines as tuples rather than lists, since tuples have a smaller memory footprint:

    matrix.append(tuple(data))

> According to __sizeof__, though, the values are the same whether storing it as a list of lists or as a list of strings.

As mentioned by others, __sizeof__ only gives you the size of the container, not the size of the contained values (which is where the difference is here).

Regards

Antoine.

-- http://mail.python.org/mailman/listinfo/python-list
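The footprint difference Antoine describes is easy to see with sys.getsizeof, applied here to the per-row container rather than the outer list (the row contents are made up):

```python
import sys

# one parsed row of a tab-delimited record, once as a list, once as a tuple
row_list = "a\tb\tc\td".split("\t")
row_tuple = tuple(row_list)

print(sys.getsizeof(row_list), sys.getsizeof(row_tuple))
# the tuple is smaller: no over-allocation slack and a smaller header,
# and the saving is multiplied across millions of rows
```

Note that, as with __sizeof__, this measures only the container itself, not the strings it points to.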
Re: how to go on learning python
Howdy Xavier! [Apologies for the length of this; I didn't expect to write so much!]

I've been a Python programmer for many years now (having come from a PHP, Perl, C, and Pascal background) and I'm constantly learning new idioms and ways of doing things that are more Pythonic; cleaner, more efficient, or simply more beautiful. I learn by coding, rather than by reading books, taking lectures, or sitting idly watching screencasts. I constantly try to break the problems I come up with in my head into smaller and smaller pieces, then write the software for those pieces in as elegant a method as possible. Because of my "turtles all the way down" design philosophy, a lot of my spare-time projects have no immediate demonstrable benefit; I code them for fun! I have a folder full of hundreds of these little projects, the vast majority of which never see a public release. I also collect little snippets of code that I come across[1] or write, and often experiment with performance tests[2] of small Python snippets.

Often I'll assign myself the task of doing something far outside my comfort zone; a recent example is writing an HTTP/1.1 web server. I had no idea how to do low-level socket programming in Python, let alone how HTTP actually worked under the hood, and because my goal wasn't (originally) to produce a production-quality product for others, it gave me the freedom to experiment, rewrite, and break things in as many ways as I wanted. :) I had people trying to convince me that I shouldn't re-invent the wheel ("just use Twisted!") though they misunderstood the reason for my re-invention: to learn. It started as a toy 20-line script to dump a static HTTP/1.0 response on each request and has grown into a ~270 line fully HTTP/1.1 compliant, ultra-performant multi-process HTTP server rivalling pretty much every other pure-Python web server I've tested. (I still don't consider it production ready, though.)
Progressive enhancement as I came up with and implemented ideas meant that sometimes I had to rewrite it from scratch, but I'm quite proud of the result and have learned far more than I expected in the process. While I don't necessarily study books on Python, I did reference HTTP: The Definitive Guide and many websites in developing that server, and I often use the Python Quick Reference[3] when I zone out and forget something basic or need to find something more advanced. In terms of understanding how Python works, or how you can use certain semantics (or even better, why you'd want to!) Python Enhancement Proposals (PEPs) can be an invaluable resource. For example, PEP 318[4] defines what a decorator is, why they're useful, how they work, and how you can write your own. Pretty much everything built into Python after Python 2.0 was first described, reasoned, and discussed in a PEP. If you haven't seen this already, the Zen of Python[5] (a PEP) has many great guidelines. I try to live and breathe the Zen. So that's my story: how I learn to improve my own code. My motto, re-inventing the wheel, every time, is the short version of the above. Of course, for commercial work I don't generally spend so much time on the nitty-gritty details; existing libraries are there for a reason, and, most of the time, Getting Things Done™ is more important than linguistic purity! ;) — Alice. [1] https://github.com/GothAlice/Random/ [2] https://gist.github.com/405354 [3] http://rgruet.free.fr/PQR26/PQR2.6.html [4] http://www.python.org/dev/peps/pep-0318/ [5] http://www.python.org/dev/peps/pep-0020/ -- http://mail.python.org/mailman/listinfo/python-list
Re: remote control firefox with python
On Nov 28, 4:22 pm, News123 news1...@free.fr wrote:

> Hi, I wondered whether there is a simple way to remote-control Firefox with Python. With remote controlling I mean: - enter a URL in the title bar and click on it - create a new tab - enter another URL and click on it - save the HTML document of this page - Probably the most difficult one: emulate a click or right-click on a certain button or link of the current page. - Other interesting things would be to be able to enter the master password from a script - to enable/disable proxy settings while running. The reason why I want to stay within Firefox and not use any other 'mechanize' framework is that the pages I want to automate might contain a lot of JavaScript for the construction of the actual page. Thanks in advance for any pointers/ideas.

I have had some good experience with Sikuli. http://sikuli.org/

Regards, Andreas bal...@gmail.com

-- http://mail.python.org/mailman/listinfo/python-list
Re: Using property() to extend Tkinter classes but Tkinter classes are old-style classes?
Terry Reedy tjre...@udel.edu writes:

> On 11/28/2010 3:47 PM, pyt...@bdurham.com wrote:
> > I had planned on subclassing Tkinter.Toplevel() using property() to wrap access to properties like a window's title. After much head scratching and a peek at the Tkinter.py source, I realized that all Tkinter classes are old-style classes (even under Python 2.7). 1. Is there a technical reason why Tkinter classes are still old-style classes?
> To not break old code. Being able to break code by upgrading all classes in the stdlib was one of the reasons for 3.x.

In 3.x, are Tkinter classes still derived from old-style classes?

-- http://mail.python.org/mailman/listinfo/python-list
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
Albert Hopkins wrote: On Tue, 2010-11-30 at 11:52 +0100, Peter Otten wrote: Dan Stromberg wrote: I've got a couple of programs that read filenames from stdin, and then open those files and do things with them. These programs sort of do the *ix xargs thing, without requiring xargs. In Python 2, these work well. Irrespective of how filenames are encoded, things are opened OK, because it's all just a stream of single byte characters. I think you're wrong. The filenames' encoding as they are read from stdin must be the same as the encoding used by the file system. If the file system expects UTF-8 and you feed it ISO-8859-1 you'll run into errors. I think this is wrong. In Unix there is no concept of filename encoding. Filenames can have any arbitrary set of bytes (except '/' and '\0'). But the filesystem itself neither knows nor cares about encoding. I think you misunderstood what I was trying to say. If you write a list of filenames into files.txt, and use an encoding (ISO-8859-1, say) other than that used by the shell to display file names (on Linux typically UTF-8 these days) and then write a Python script exist.py that reads filenames and checks for the files' existence, $ python3 exist.py files.txt will report that a file b'\xe4\xf6\xfc.txt' doesn't exist. The user looking at his editor with the encoding set to ISO-8859-1 seeing the line äöü.txt and then going to the console typing $ ls äöü.txt will be confused even though everything is working correctly. The system may be shuffling bytes, but the user thinks in codepoints and sometimes assumes that codepoints and bytes are the same. You always have to know either (a) both the file system's and stdin's actual encoding, or (b) that both encodings are the same. If this is true, then I think that it is wrong to do in Python3. Any language should be able to deal with the filenames that the host OS allows. Anyway, going on with the OP.. 
> can you open stdin so that you can accept arbitrary bytes instead of strings and then open using the bytes as the filename?

You can access the underlying stdin.buffer that feeds you the raw bytes with no attempt to shoehorn them into codepoints. You can use filenames that are not valid in the encoding that the system uses to display filenames:

    $ ls
    $ python3
    Python 3.1.1+ (r311:74480, Nov  2 2009, 15:45:00)
    [GCC 4.4.1] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> with open(b"\xe4\xf6\xfc.txt", "w") as f:
    ...     f.write("hello\n")
    ...
    6
    $ ls
    ???.txt

> I don't have that much experience with Python3 to say for sure.

Me neither.

Peter

-- http://mail.python.org/mailman/listinfo/python-list
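A sketch of the idea Peter describes: reading the filenames as raw bytes so no decoding is ever attempted. In a real script the binary stream would be sys.stdin.buffer; a BytesIO stands in here so the example is self-contained:

```python
import io

def read_filenames(stream):
    """Yield filenames as bytes objects, one per line, from a binary stream."""
    for raw in stream:
        name = raw.rstrip(b"\n")
        if name:
            yield name

# in a real script: names = list(read_filenames(sys.stdin.buffer))
fake_stdin = io.BytesIO(b"\xe4\xf6\xfc.txt\nplain.txt\n")
names = list(read_filenames(fake_stdin))
print(names)   # [b'\xe4\xf6\xfc.txt', b'plain.txt']
```

On POSIX these bytes filenames can be passed straight to open() or os.stat(), sidestepping the stdin-encoding question entirely.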
Re: Using property() to extend Tkinter classes but Tkinter classes are old-style classes?
Giacomo Boffi wrote: Terry Reedy tjre...@udel.edu writes: On 11/28/2010 3:47 PM, pyt...@bdurham.com wrote: I had planned on subclassing Tkinter.Toplevel() using property() to wrap access to properties like a window's title. After much head scratching and a peek at the Tkinter.py source, I realized that all Tkinter classes are old-style classes (even under Python 2.7). 1. Is there a technical reason why Tkinter classes are still old-style classes? To not break old code. Being able to break code by upgrading all classes in the stdlib was one of the reasons for 3.x. In 3.x, are Tkinter classes still derived by old-style classes? 3.x does not provide old-style classes. Oh, and the name Tkinter was changed to tkinter: all modules in the standard library have lower case names in 3.x. HTH, -- HansM -- http://mail.python.org/mailman/listinfo/python-list
Re: Using property() to extend Tkinter classes but Tkinter classes are old-style classes?
On 11/30/10 11:00 AM, Giacomo Boffi wrote:

> In 3.x, are Tkinter classes still derived from old-style classes?

No.

    [~]$ python3
    Python 3.1.2 (r312:79360M, Mar 24 2010, 01:33:18)
    [GCC 4.0.1 (Apple Inc. build 5493)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import tkinter
    >>> tkinter.Tk.mro()
    [<class 'tkinter.Tk'>, <class 'tkinter.Misc'>, <class 'tkinter.Wm'>, <class 'object'>]

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

-- http://mail.python.org/mailman/listinfo/python-list
C struct to Python
I am not sure how to proceed. I am writing a Python interface to a C library. The C library uses structures. I was looking at the struct module, but struct.unpack only seems to deal with data that was packed using struct.pack or some other buffer. All I have is the struct itself, a pointer in C. Is there a way to unpack directly from a memory address? Right now on the C side of things I can create a buffer of the struct data like so...

    MyStruct ms;
    unsigned char buffer[sizeof(MyStruct) + 1];
    memcpy(buffer, &ms, sizeof(MyStruct));
    return Py_BuildValue("s#", buffer, sizeof(MyStruct));

Then on the Python side I can unpack it using struct.unpack. I'm just wondering if I need to jump through these hoops of packing it on the C side or if I can do it directly from Python.

Thanks,
~Eric

-- http://mail.python.org/mailman/listinfo/python-list
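For the Python side, struct.unpack can consume the raw bytes handed over from C; a sketch with a made-up struct of an int and a double (the format string must be kept in sync with the real C layout, including any padding):

```python
import struct

# hypothetical C struct: struct MyStruct { int id; double value; };
# "=" requests standard sizes with no padding; use the native "@" formats
# (the default) if the bytes come from a padded native struct.
fmt = "=id"

# simulate the buffer that Py_BuildValue("s#", ...) would return
raw = struct.pack(fmt, 7, 2.5)

ident, value = struct.unpack(fmt, raw)
print(ident, value)   # 7 2.5
```

The usual caveat: sizeof(MyStruct) on the C side and struct.calcsize(fmt) on the Python side should agree, or the unpack will raise.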
Re: C struct to Python
On Tue, Nov 30, 2010 at 10:57 AM, Eric Frederich eric.freder...@gmail.com wrote:
> [snip - question about unpacking a C struct directly from Python]

ctypes[0] sounds like a possible solution, although if you're already writing a C extension it might be better practice to just write a Python object that wraps your C struct appropriately. If you're not wedded to the C extension, though, I've had very good luck writing C interfaces with ctypes and a few useful decorators [1], [2]. Others prefer Cython[3], which I like for speed but which sometimes seems to get in my way when I'm trying to interface with existing code. There's a good, if somewhat dated, overview of a few other strategies here[4].

Geremy Condra

[0]: http://docs.python.org/library/ctypes.html
[1]: http://code.activestate.com/recipes/576734-c-struct-decorator/
[2]: http://code.activestate.com/recipes/576731/
[3]: http://www.cython.org/
[4]: http://www.suttoncourtenay.org.uk/duncan/accu/integratingpython.html

-- http://mail.python.org/mailman/listinfo/python-list
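A sketch of the ctypes route Geremy mentions: mirror the C struct as a ctypes.Structure and overlay it on the raw bytes. The field names and types here are invented for a hypothetical struct of an int and a double; the real _fields_ must match the actual C declaration:

```python
import ctypes
import struct

class MyStruct(ctypes.Structure):
    # hypothetical layout: struct MyStruct { int id; double value; };
    _fields_ = [("id", ctypes.c_int),
                ("value", ctypes.c_double)]

# simulate sizeof(MyStruct) bytes arriving from C; the default (native)
# struct format includes the same compiler padding ctypes uses
raw = struct.pack("id", 7, 2.5)
assert len(raw) == ctypes.sizeof(MyStruct)

ms = MyStruct.from_buffer_copy(raw)
print(ms.id, ms.value)   # 7 2.5
```

ctypes handles the alignment and padding for you, which is the main advantage over hand-maintaining a struct format string.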
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
> Does anyone know what I need to do to read filenames from stdin with Python 3.1 and subsequently open them, when some of those filenames include characters with their high bit set?

If your files on disk use file names encoded in iso-8859-1, don't set your locale to a UTF-8 locale (as you apparently do), but set it to a locale that actually matches the encoding that you use.

Regards, Martin

-- http://mail.python.org/mailman/listinfo/python-list
[Q] get device major/minor number
Hello all, In a script I would like to extract all device info from a block or character device. The stat function gives me most of the info (mode, timestamp, user and group id, ...), however I did not find how to get the device's major and minor numbers. Of course I could do it by calling an external program, but is it possible to stay within Python? In the example below, I would like to get the major (8) and minor (0, 1, 2) numbers of /dev/sda{,1,2}. How can I get them?

    u...@host:~$ ls -l /dev/sda /dev/sda1 /dev/sda2
    brw-rw---- 1 root disk 8, 0 Nov 30 19:10 /dev/sda
    brw-rw---- 1 root disk 8, 1 Nov 30 19:10 /dev/sda1
    brw-rw---- 1 root disk 8, 2 Nov 30 19:10 /dev/sda2
    u...@host:~$ python3.1 -c 'import os
    for el in ["", "1", "2"]:
        print(os.stat("/dev/sda" + el))'
    posix.stat_result(st_mode=25008, st_ino=1776, st_dev=5, st_nlink=1, st_uid=0, st_gid=6, st_size=0, st_atime=1291140641, st_mtime=1291140640, st_ctime=1291140640)
    posix.stat_result(st_mode=25008, st_ino=1780, st_dev=5, st_nlink=1, st_uid=0, st_gid=6, st_size=0, st_atime=1291140644, st_mtime=1291140641, st_ctime=1291140641)
    posix.stat_result(st_mode=25008, st_ino=1781, st_dev=5, st_nlink=1, st_uid=0, st_gid=6, st_size=0, st_atime=1291140644, st_mtime=1291140641, st_ctime=1291140641)

Thanks

Tom

-- http://mail.python.org/mailman/listinfo/python-list
Re: [Q] get device major/minor number
On Tue, 30 Nov 2010 21:09:14 +0100, Thomas Portmann wrote:
> [snip - how do I get a device's major and minor numbers from Python?]

I think the os.major() and os.minor() calls ought to do what you want.

    >>> import os
    >>> s = os.stat('/dev/sda1')
    >>> os.major(s.st_rdev)
    8
    >>> os.minor(s.st_rdev)
    1

    d...@dan:~$ ls -l /dev/sda1
    brw-rw---- 1 root disk 8, 1 2010-11-18 05:41 /dev/sda1

-- http://mail.python.org/mailman/listinfo/python-list
Re: Memory issues when storing as List of Strings vs List of List
OW Ghim Siong o...@bii.a-star.edu.sg writes:

> I have a big file 1.5GB in size, with about 6 million lines of tab-delimited data. I have to perform some filtration on the data and keep the good data. After filtration, I have about 5.5 million data left remaining. As you might already have guessed, I have to read them in batches and I did so using .readlines(1).

Why do you need to handle the batching in your code? Perhaps you're not aware that a file object is already an iterator for the lines of text in the file.

> After reading each batch, I will split the line (in string format) to a list using .split("\t") and then check several conditions, after which if all conditions are satisfied, I will store the list into a matrix.

As I understand it, you don't need a line after moving to the next. So there's no need to maintain a manual buffer of lines at all; please explain if there is something additional requiring a huge buffer of input lines.

> The code is as follows:
>     a=open(bigfile)
>     matrix=[]
>     while True:
>         lines = a.readlines(1)
>         for line in lines:
>             data=line.split("\t")
>             if several_conditions_are_satisfied:
>                 matrix.append(data)
>         print "Number of lines read:", len(lines), "matrix.__sizeof__:", matrix.__sizeof__()
>         if len(lines)==0:
>             break

Using the file's native line iterator::

    infile = open(bigfile)
    matrix = []
    for line in infile:
        record = line.split("\t")
        if several_conditions_are_satisfied:
            matrix.append(record)

> Results:
> Number of lines read: 461544 matrix.__sizeof__: 1694768
> Number of lines read: 449840 matrix.__sizeof__: 3435984
> Number of lines read: 455690 matrix.__sizeof__: 5503904
> Number of lines read: 451955 matrix.__sizeof__: 6965928
> Number of lines read: 452645 matrix.__sizeof__: 8816304
> Number of lines read: 448555 matrix.__sizeof__: 9918368
> Traceback (most recent call last): MemoryError

If you still get a MemoryError, you can use the ‘pdb’ module <URL:http://docs.python.org/library/pdb.html> to debug it interactively.
Another option is to catch the MemoryError and construct a diagnostic message similar to the one you had above::

    import sys
    infile = open(bigfile)
    matrix = []
    for line in infile:
        record = line.split("\t")
        if several_conditions_are_satisfied:
            try:
                matrix.append(record)
            except MemoryError:
                matrix_len = len(matrix)
                sys.stderr.write(
                    "len(matrix): %(matrix_len)d\n" % vars())
                raise

> I have tried creating such a matrix of equivalent size and it only uses 35mb of memory but I am not sure why when using the code above, the memory usage shot up so fast and exceeded 2GB. Any advice is greatly appreciated.

With large data sets, and the manipulation and computation you will likely be wanting to perform, it's probably time to consider the NumPy library <URL:http://numpy.scipy.org/>, which has much more powerful array types, part of the SciPy library <URL:http://www.scipy.org/>.

--
\ “[It's] best to confuse only one issue at a time.” —Brian W. |
`\ Kernighan, Dennis M. Ritchie, _The C programming language_, 1988 |
_o__) |
Ben Finney

-- http://mail.python.org/mailman/listinfo/python-list
SAX unicode and ascii parsing problem
Hi, I'm trying to parse an XML file using SAX. About half-way through the file I get this error:

    Traceback (most recent call last):
      File "C:\Python26\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 325, in RunScript
        exec codeObject in __main__.__dict__
      File "E:\sc\b2.py", line 58, in <module>
        parser.parse(open(r'ppb5.xml'))
      File "C:\Python26\Lib\xml\sax\expatreader.py", line 107, in parse
        xmlreader.IncrementalParser.parse(self, source)
      File "C:\Python26\Lib\xml\sax\xmlreader.py", line 123, in parse
        self.feed(buffer)
      File "C:\Python26\Lib\xml\sax\expatreader.py", line 207, in feed
        self._parser.Parse(data, isFinal)
      File "C:\Python26\Lib\xml\sax\expatreader.py", line 304, in end_element
        self._cont_handler.endElement(name)
      File "E:\sc\b2.py", line 51, in endElement
        d.write(csv+"\n")
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 146-147: ordinal not in range(128)

I'm using ActivePython 2.6. I'm trying to figure out the simplest fix. If there's a Python way to just take the source XML file and convert/process it so this will not happen, that would be best. Or should I just update to Python 3? I tried this but nothing changed; I thought this might convert it and then I'd parse the new file - it didn't work:

    uc = open(r'E:\sc\ppb4.xml').read().decode('utf8')
    ascii = uc.decode('ascii')
    mex9 = open(r'E:\scrapes\ppb5.xml', 'w')
    mex9.write(ascii)
    mex9.close()

Again, I'm looking for something simple even if it's a few more lines of code... or upgrade(?). Thanks, appreciate any help.

-- http://mail.python.org/mailman/listinfo/python-list
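Note that the traceback points at the write, not the parse: d is presumably a file opened with plain open(), so writing a unicode string triggers an implicit ASCII encode. One fix that keeps all characters (rather than dropping them) is to open the output file with an explicit encoding; a sketch, with a made-up path and record:

```python
import codecs
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.csv")

# works on both Python 2 and 3: the stream object encodes for you
d = codecs.open(path, "w", encoding="utf-8")
d.write(u"caf\xe9,r\xe9sum\xe9\n")   # no UnicodeEncodeError
d.close()

with codecs.open(path, "r", encoding="utf-8") as f:
    print(f.read())
```

The same applies to the conversion attempt above: the second line should be an encode, not a decode, and the target encoding has to be able to represent the characters.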
Re: [Q] get device major/minor number
On Tue, Nov 30, 2010 at 9:18 PM, Dan M d...@catfolks.net wrote: On Tue, 30 Nov 2010 21:09:14 +0100, Thomas Portmann wrote: In the example below, I would like to get the major (8) and minor (0, 1, 2) numbers of /dev/sda{,1,2}. How can I get them? I think the os.major() and os.minor() calls ought to do what you want. import os s = os.stat('/dev/sda1') os.major(s.st_rdev) 8 os.minor(s.st_rdev) 1 Thank you very much Dan, this is exactly what I was looking for. Tom -- http://mail.python.org/mailman/listinfo/python-list
Re: [Q] get device major/minor number
On Tue, 30 Nov 2010 21:35:43 +0100, Thomas Portmann wrote: Thank you very much Dan, this is exactly what I was looking for. Tom You're very welcome. -- http://mail.python.org/mailman/listinfo/python-list
Re: SAX unicode and ascii parsing problem
On 11/30/2010 3:43 PM, goldtech wrote:
> [snip - UnicodeEncodeError while writing out data during a SAX parse]

I'm just as stumped as I was when you first asked this question 13 minutes ago. ;-)

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
PyCon 2011 Atlanta March 9-17 http://us.pycon.org/
See Python Video! http://python.mirocommunity.org/
Holden Web LLC http://www.holdenweb.com/

-- http://mail.python.org/mailman/listinfo/python-list
Re: IMAP support
Please, could you give me an example of a raw query to an IMAP server? And why do you focus on how "Nevermind" is so, ekhm... nevermind... Can't you just help? -- http://mail.python.org/mailman/listinfo/python-list
Re: SAX unicode and ascii parsing problem
snip... I'm just as stumped as I was when you first asked this question 13 minutes ago. ;-) regards Steve snip...

Hi Steve, I think I found it. For example:

    line = 'my big string'
    line.encode('ascii', 'ignore')

I processed the problem strings during parsing with this and it works now. Got this from: http://stackoverflow.com/questions/2365411/python-convert-unicode-to-ascii-without-errors Best, Lee :^) -- http://mail.python.org/mailman/listinfo/python-list
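For reference, a minimal sketch of what that fix does (Python 3 syntax shown; in the poster's Python 2.6 the same call applies to unicode objects, such as the strings SAX hands to endElement): encoding to ASCII with the 'ignore' error handler silently drops any characters outside the ASCII range instead of raising UnicodeEncodeError.

```python
# Sketch of the encode('ascii', 'ignore') fix: non-ASCII characters
# (here an e-acute and an en dash) are dropped rather than raising.
line = u'caf\xe9 \u2013 my big string'
cleaned = line.encode('ascii', 'ignore')  # bytes with non-ASCII removed
print(cleaned)  # b'caf  my big string'
```

Note that 'ignore' throws the characters away; if the goal were to keep them, writing the output file as UTF-8 instead (e.g. via codecs.open with encoding='utf-8') would preserve the data.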
Re: SAX unicode and ascii parsing problem
I can't check right now, but are you sure it's the parser and not this line: d.write(csv + '\n') that's failing? What is d? -- http://mail.python.org/mailman/listinfo/python-list
Re: IMAP support
On Tue, 2010-11-30 at 13:03 -0800, pakalk wrote: Please, give me an example of raw query to IMAP server? http://www.devshed.com/c/a/Python/Python-Email-Libraries-part-2-IMAP/2/ I'm not certain what you mean by raw query. And why do you focus on Nevermind is so ekhm... nevermind... ?? Cannot you just help? This list does suffer from a case of attitude. Most programming forums have that; Python attitude has its own special flavor. -- http://mail.python.org/mailman/listinfo/python-list
Reading by positions plain text files
Hi all, Sorry, newbie question: I have a database in a plain text file (could be .txt or .dat, it's the same) that I need to read in Python in order to do some data validation. In other cases I read this kind of file with the split() method, reading line by line. But split() relies on a separator character (I think... all I know is that it works OK). I now have a case in which another file has been provided (besides the database) that tells me in which columns of the file each variable is, because there isn't any blank or tab character that separates the variables; they are stuck together. This second file specifies the variable name and its position:

VARIABLE NAME    POSITION (COLUMN) IN FILE
var_name_1       123-123
var_name_2       124-125
var_name_3       126-126
..
..
var_name_N       512-513 (last positions)

How can I read this so each position in the file is associated with each variable name? Thanks a lot!! Javier -- http://mail.python.org/mailman/listinfo/python-list
How to initialize each multithreading Pool worker with an individual value?
Hi, multithreading.pool Pool has a promising initializer argument in its constructor. However it doesn't look possible to use it to initialize each Pool worker with some individual value (I'd wish to be wrong here). So, how do I initialize each Pool worker with an individual value? The typical use case might be a connection pool of, say, 3 workers, where each of the 3 workers has its own TCP/IP port.

    from multiprocessing.pool import Pool

    def port_initializer(_port):
        global port
        port = _port

    def use_connection(some_packet):
        global _port
        print "sending data over port # %s" % port

    if __name__ == "__main__":
        ports = ((4001, 4002, 4003), )
        p = Pool(3, port_initializer, ports)  # oops... :-)
        some_data_to_send = range(20)
        p.map(use_connection, some_data_to_send)

best regards -- Valery A.Khamenya -- http://mail.python.org/mailman/listinfo/python-list
Re: IMAP support
On 30 Lis, 22:26, Adam Tauno Williams awill...@whitemice.org wrote: On Tue, 2010-11-30 at 13:03 -0800, pakalk wrote: Please, give me an example of raw query to IMAP server? http://www.devshed.com/c/a/Python/Python-Email-Libraries-part-2-IMAP/2/ I'm not certain what you mean by raw query.

    m = imap()
    m.query('UID SORT ...')  # etc.

Thanks for the link :) -- http://mail.python.org/mailman/listinfo/python-list
Re: Reading by positions plain text files
On 2010-11-30, javivd javiervan...@gmail.com wrote: I have a case now in which another file has been provided (besides the database) that tells me in which columns of the file each variable is, because there isn't any blank or tab character that separates the variables; they are stuck together. This second file specifies the variable name and its position: VARIABLE NAME POSITION (COLUMN) IN FILE var_name_1 123-123 var_name_2 124-125 var_name_3 126-126 .. .. var_name_N 512-513 (last positions) I am unclear on the format of these positions. They do not look like what I would expect from absolute references into the data. For instance, 123-123 may contain only one byte, which could change with different encodings and how you mark line endings. Frankly, the use of the word "columns" in the header suggests that the data *is* separated by line endings rather than absolute position, and that the position refers to the line number. In that case, you can use splitlines() to break up the data and then address the proper line by index. Nevertheless, you can use file.seek() to move to an absolute offset in the file, if that really is what you are looking for. -- http://mail.python.org/mailman/listinfo/python-list
Catching user switching and getting current active user from root on linux
I have a situation where I need to be able to get the current active user, and catch user switching, e.g. user1 locks the screen, leaves the computer, user2 comes and logs on. Basically, when there is any type of user switch my script needs to know. -- http://mail.python.org/mailman/listinfo/python-list
Re: Catching user switching and getting current active user from root on linux
On 2010-11-30, mpnordland mpnordl...@gmail.com wrote: I have a situation where I need to be able to get the current active user, and catch user switching, e.g. user1 locks the screen, leaves the computer, user2 comes and logs on. Basically, when there is any type of user switch my script needs to know. Well, you could use inotify to trigger on any changes to /var/log/wtmp. When a change is detected, you could check for deltas in the output of "who -a" to figure out what has changed since the last time wtmp triggered. -- http://mail.python.org/mailman/listinfo/python-list
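A stdlib-only sketch of the delta step described above (the helper name is invented for illustration): snapshot `who` output as a set of (user, tty) pairs and diff successive snapshots. In the full scheme, inotify on /var/log/wtmp would trigger each re-check instead of polling.

```python
import subprocess

def logged_in_users(who_output=None):
    """Return a set of (user, tty) pairs parsed from `who` output."""
    if who_output is None:
        # live snapshot; re-run this whenever wtmp changes
        who_output = subprocess.check_output(["who"]).decode()
    return set(tuple(line.split()[:2])
               for line in who_output.splitlines() if line.strip())

# Simulated before/after snapshots to show the diffing step:
old = logged_in_users("user1 tty7 2010-11-30 09:00\n")
new = logged_in_users("user2 tty7 2010-11-30 10:00\n")
print("logged out:", old - new)  # {('user1', 'tty7')}
print("logged in:", new - old)   # {('user2', 'tty7')}
```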
Re: Catching user switching and getting current active user from root on linux
On Wed, Dec 1, 2010 at 8:54 AM, Tim Harig user...@ilthio.net wrote: Well you could use inotify to trigger on any changes to /var/log/wtmp. When a change is detected, you could check of deltas in the output of who -a to figure out what has changed since the last time wtmp triggered. This is a good idea and you could also make use of the following library: http://pypi.python.org/pypi?:action=searchterm=utmpsubmit=search cheers James -- -- James Mills -- -- Problems are solved by method -- http://mail.python.org/mailman/listinfo/python-list
Re: how to go on learning python
On 11/30/2010 9:37 AM, Xavier Heruacles wrote: I'm basically a c/c++ programmer and recently come to python for some web development. Using django and javascript I'm afraid I can develop some web application now. But often I feel I'm not good at python. I don't know much about generators, descriptors and decorators(although I can use some of it to accomplish something, but I don't think I'm capable of knowing its internals). I find my code ugly, and it seems near everything are already gotten done by the libraries. When I want to do something, I just find some libraries or modules and then just finish the work. So I'm a bit tired of just doing this kind of high level scripting, only to find myself a bad programmer. Then my question is after one coded some kind of basic app, how one can keep on learning programming using python? Do some more interesting projects? Read more general books about programming? or...? You can use both your old C skills and new Python skills by helping to develop Python by working on issues on the tracker bugs.python.org. If you are interested but need help getting started, ask. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
Re: Reading by positions plain text files
On 30/11/2010 21:31, javivd wrote: [snipped: the fixed-width layout question quoted in full] It sounds like a similar problem to this: http://groups.google.com/group/comp.lang.python/browse_thread/thread/53e6f41bfff6/123422d510187dc3?show_docid=123422d510187dc3 -- http://mail.python.org/mailman/listinfo/python-list
Re: Programming games in historical linguistics with Python
2010/11/30 Dax Bloom bloom@gmail.com: Hello, Following a discussion that began 3 weeks ago I would like to ask a question regarding substitution of letters according to grammatical rules in historical linguistics. I would like to automate the transformation of words according to complex rules of phonology and integrate that script in a visual environment. Here follows the previous thread: http://groups.google.com/group/comp.lang.python/browse_thread/thread/3c55f9f044c3252f/fe7c2c82ecf0dbf5?lnk=gstq=evolutionary+linguistics#fe7c2c82ecf0dbf5 Is there a way to refer to vowels and consonants as a subcategory of text? Is there a function to remove all vowels? How should one create and order the dictionary file for the rules? How to chain several transformations automatically from multiple rules? Finally, can anyone show me what existing Python program or phonological software can do this? What function could tag syllables, the word nucleus and the codas? How easy is it to bridge this with a more visual environment where interlinear, aligned text can be displayed with Greek notations and braces as usual in the phonology textbooks? Best regards, Dax Bloom -- http://mail.python.org/mailman/listinfo/python-list

Hi, as far as I know there is no predefined function or library for distinguishing vowels and consonants, but these can be simply implemented individually according to the exact needs. E.g. regular expressions can be used here; to remove vowels, the code could be (example from the interactive prompt):

    >>> import re
    >>> re.sub(r"(?i)[aeiouy]", "", "This is a SAMPLE TEXT")
    'Ths s  SMPL TXT'

See http://docs.python.org/library/re.html or http://www.regular-expressions.info/ for the regexp features. You may eventually try the new development version regex, which adds many interesting new features and removes some limitations: http://bugs.python.org/issue2636 In some cases regular expressions aren't really appropriate or may become too complicated.
Sometimes a parsing library like pyparsing may be a more adequate tool: http://pyparsing.wikispaces.com/ If the rules are simple enough that they can be formulated for single characters or character clusters with a regular expression, you can model the phonological changes as a series of replacements with matching patterns and the respective replacement patterns. For character-wise matching and replacing, regular expressions are very effective; using lookarounds http://www.regular-expressions.info/lookaround.html even some combinatorics for conditional changes can be expressed; however, I would find some more complex conditions, suprasegmentals, morpheme boundaries etc. rather difficult to formalise this way... hth, vbr -- http://mail.python.org/mailman/listinfo/python-list
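To make the "series of replacements" idea concrete, here is a sketch that chains ordered regex substitutions as phonological rules. The two rules are invented purely for illustration, not real sound laws; order matters, as in a classical rule-based derivation.

```python
import re

# Ordered (pattern, replacement) pairs; applied top to bottom.
rules = [
    (r"p(?=[aeiou])", "b"),  # voice p before a vowel (lookahead keeps the vowel)
    (r"b$", "p"),            # devoice word-final b
]

def apply_rules(word, rules):
    """Run a word through each substitution rule in order."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word

print(apply_rules("papab", rules))  # -> babap
```

The same shape scales to a rule file: each line of the file becomes one (pattern, replacement) pair, and the order of the lines fixes the order of the derivation.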
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
On Tue, Nov 30, 2010 at 11:47 AM, Martin v. Loewis mar...@v.loewis.de wrote: Does anyone know what I need to do to read filenames from stdin with Python 3.1 and subsequently open them, when some of those filenames include characters with their high bit set? If your files on disk use file names encoded in iso-8859-1, don't set your locale to a UTF-8 locale (as you apparently do), but set it to a locale that actually matches the encoding that you use. Regards, Martin It'd be great if all programs used the same encoding on a given OS, but at least on Linux, I believe historically filenames have been created with different encodings. IOW, if I pick one encoding and go with it, filenames written in some other encoding are likely to cause problems. So I need something for which a filename is just a blob that shouldn't be monkeyed with. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
On Tue, Nov 30, 2010 at 7:19 AM, Antoine Pitrou solip...@pitrou.net wrote: On Mon, 29 Nov 2010 21:52:07 -0800 (PST) Yingjie Lan lany...@yahoo.com wrote: --- On Tue, 11/30/10, Dan Stromberg drsali...@gmail.com wrote: In Python 3, I'm finding that I have encoding issues with characters with their high bit set. Things are fine with strictly ASCII filenames. With high-bit-set characters, even if I change stdin's encoding with: Co-ask. I have also had problems with file names in Chinese characters with Python 3. I unzipped the turtle demo files into the desktop folder (of course, the word 'desktop' is in Chinese, it is a windows XP system, localization is Chinese), then all in a sudden some of the demos won't work anymore. But if I move it to a folder whose path contains only english characters, everything comes back to normal. Can you try the latest 3.2alpha4 (*) and check if this is fixed? If not, then could you please open a bug on http://bugs.python.org ? (*) http://python.org/download/releases/3.2/ Thank you Antoine. I have the same problem using 3.2alpha4: the word mañana (6 characters long) in a filename causes problems (I'm catching the exception and skipping the file for now) despite using what I believe is an 8-bit, all 256-bytes-are-characters encoding: iso-8859-1. I'm not sure if you wanted both of us to try this, or Yingjie alone, though. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
On Tue, Nov 30, 2010 at 9:53 AM, Peter Otten __pete...@web.de wrote:

    $ ls
    $ python3
    Python 3.1.1+ (r311:74480, Nov  2 2009, 15:45:00)
    [GCC 4.4.1] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> with open(b"\xe4\xf6\xfc.txt", "w") as f:
    ...     f.write("hello\n")
    ...
    6
    >>>
    $ ls
    ???.txt

This sounds like a strong prospect for how to get things working (I didn't realize open would accept a bytes argument for the filename), but I'm also interested in whether reading filenames from stdin and subsequently opening them is supposed to just work given a suitable encoding - like with Java which also uses unicode strings. In Java, I'm told that ISO-8859-1 is supposed to guarantee a roundtrip conversion. -- http://mail.python.org/mailman/listinfo/python-list
Change one list item in place
This works for me:

    def sendList():
        return ["item0", "item1"]

    def query():
        l = sendList()
        return ["Formatting only {0} into a string".format(l[0]), l[1]]

    query()

However, is there a way to bypass the l = sendList() and change one list item in-place? Possibly a list comprehension operating on a numbered item? -- Gnarlie -- http://mail.python.org/mailman/listinfo/python-list
Re: How to initialize each multithreading Pool worker with an individual value?
On Tue, Nov 30, 2010 at 1:35 PM, Valery Khamenya khame...@gmail.com wrote: [snipped: the Pool initializer question, quoted in full] Using an initializer with multiprocessing is something I've never tried. I have used queues with multiprocessing though, and I believe you could use them, at least as a fallback plan if you can't get the initializer to work. If you create in the parent a queue in shared memory (multiprocessing facilitates this nicely), and fill that queue with the values in your ports tuple, then you could have each child in the worker pool extract a single value from this queue so each worker can have its own, unique port value. HTH -- http://mail.python.org/mailman/listinfo/python-list
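A sketch of that queue-based idea (function names mirror the original post; this is one possible arrangement, not the only one): the parent fills a shared queue with one port per worker, and each worker's initializer pops its own value, so every pool process ends up with a distinct port.

```python
import multiprocessing
from multiprocessing.pool import Pool

port = None  # set once per worker process by the initializer

def port_initializer(port_queue):
    global port
    port = port_queue.get()  # each worker takes one unique port

def use_connection(some_packet):
    return "sending packet %s over port %s" % (some_packet, port)

if __name__ == "__main__":
    port_queue = multiprocessing.Queue()
    for p in (4001, 4002, 4003):
        port_queue.put(p)
    pool = Pool(3, port_initializer, (port_queue,))
    print(pool.map(use_connection, range(6)))
    pool.close()
    pool.join()
```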
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
On Mon, 29 Nov 2010 21:26:23 -0800, Dan Stromberg wrote: Does anyone know what I need to do to read filenames from stdin with Python 3.1 and subsequently open them, when some of those filenames include characters with their high bit set? Use bytes rather than str. Everywhere. This means reading names from sys.stdin.buffer (which is a binary stream) rather than sys.stdin (which is a text stream). If you pass a bytes to an I/O function (e.g. open()), it will just pass the bytes directly to the OS without any decoding. But really, if you're writing *nix system utilities, you should probably stick with Python 2.x until the end of time. Using 3.x will just make life difficult for no good reason (e.g. in 3.x, os.environ also contains Unicode strings). -- http://mail.python.org/mailman/listinfo/python-list
Intro to Python slides, was Re: how to go on learning python
On Tue, Nov 30, 2010 at 6:37 AM, Xavier Heruacles xheruac...@gmail.com wrote: [snipped: the original "how to go on learning python" question, quoted in full] You could check out these slides from an Intro to Python talk I'm giving tonight: http://stromberg.dnsalias.org/~dstromberg/Intro-to-Python/ ...perhaps especially the Further Resources section at the end. The Koans might be very nice for you, as might Dive Into Python. BTW, if you're interested in Python and looking into Javascript anew, you might look at Pyjamas. It lets you write web apps in Python that also run on a desktop; you can even call into Raphael from it. Only thing about it is it's kind of a young project compared to most Python implementations. PS: I mostly came from C too - knowing C can be a real advantage for a Python programmer sometimes. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
On Tue, 30 Nov 2010 18:53:14 +0100, Peter Otten wrote: I think this is wrong. In Unix there is no concept of filename encoding. Filenames can have any arbitrary set of bytes (except '/' and '\0'). But the filesystem itself neither knows nor cares about encoding. I think you misunderstood what I was trying to say. If you write a list of filenames into files.txt, and use an encoding (ISO-8859-1, say) other than that used by the shell to display file names (on Linux typically UTF-8 these days) and then write a Python script exist.py that reads filenames and checks for the files' existence, I think you misunderstood. In the Unix kernel, there aren't any encodings. Strings of bytes are /just/ strings of bytes. A text file containing a list of filenames doesn't /have/ an encoding. The filenames passed to API functions don't /have/ an encoding. This is why Unix filenames are case-sensitive: because there isn't any case. The number 65 has no more in common with the number 97 than it does with the number 255. The fact that 65 is the ASCII code for A while 97 is the ASCII code for a doesn't come into it. Case-insensitive filenames require knowledge of the encoding in order to determine when filenames are equivalent. DOS/Windows tried this and never really got it right (it works fine on a standalone system, or within later versions of a Windows-only ecosystem, but becomes a nightmare when files get transferred between systems via older or non-Microsoft channels). Python 3.x's decision to treat filenames (and environment variables) as text even on Unix is, in short, a bug. One which, IMNSHO, will mean that Python 2.x is still around when Python 4 is released. -- http://mail.python.org/mailman/listinfo/python-list
Re: Change one list item in place
On 01/12/2010 01:08, Gnarlodious wrote: [snipped: the sendList/query example, quoted in full] There's this:

    return ["Formatting only {0} into a string".format(x) if i == 0 else x
            for i, x in enumerate(sendList())]

but that's too clever for its own good. Keep it simple. :-) -- http://mail.python.org/mailman/listinfo/python-list
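For completeness, a simpler route to the same result (the names mirror the original post): fetch the list once and mutate element 0 in place, which is two plain lines instead of a comprehension.

```python
def sendList():
    return ["item0", "item1"]

def query():
    l = sendList()
    # change one item of the returned list in place
    l[0] = "Formatting only {0} into a string".format(l[0])
    return l

print(query())  # ['Formatting only item0 into a string', 'item1']
```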
Re: Programming games in historical linguistics with Python
Have you considered entering all this data into an SQLite database? You could do fast searches based on any features you deem relevant to the phoneme. Using an SQLite editor application you can get started building a database right away. You can add columns as you get the inspiration, along with any tags you want. Putting it all in database tables can really make chaotic linguistic data seem manageable. My own linguistics project uses mostly SQLite and a number of OrderedDict's based on .plist files. It is all working very nicely, although I haven't tried to deal with any phonetics (yet). -- Gnarlie http://Sectrum.com -- http://mail.python.org/mailman/listinfo/python-list
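A minimal sketch of what that looks like with the stdlib sqlite3 module (the table layout and feature columns here are invented for illustration): one row per phoneme, with feature columns that can be queried directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
conn.execute("CREATE TABLE phonemes (symbol TEXT, voiced INTEGER, place TEXT)")
conn.executemany(
    "INSERT INTO phonemes VALUES (?, ?, ?)",
    [("p", 0, "bilabial"), ("b", 1, "bilabial"), ("s", 0, "alveolar")],
)
# fast search on any feature column; new columns can be added later with ALTER TABLE
rows = conn.execute(
    "SELECT symbol FROM phonemes WHERE place = 'bilabial' ORDER BY symbol"
).fetchall()
print(rows)  # [('b',), ('p',)]
```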
Re: Change one list item in place
Thanks. Unless someone has a simpler solution, I'll stick with 2 lines. -- Gnarlie -- http://mail.python.org/mailman/listinfo/python-list
Re: Reading by positions plain text files
On Nov 30, 11:43 pm, Tim Harig user...@ilthio.net wrote: [snipped: questions about whether the positions are absolute offsets or line numbers] I work in a survey research firm. The data I'm talking about has a lot of 0-1 variables, meaning yes or no to a lot of questions, so only one character position (not byte) is needed per variable, which explains the 123-123 kind of positions for a lot of the variables. And no, MRAB, it's not a similar problem (at least from what I understood of it). I have to associate the positions this file gives me with the variable names it gives me for those positions. Thank you both, and sorry for my English! J -- http://mail.python.org/mailman/listinfo/python-list
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
On 01/12/2010 01:28, Nobody wrote: [snipped: the discussion of Unix filenames as raw bytes, quoted in full] If the filenames are to be shown to a user then there needs to be a mapping between bytes and glyphs. That's an encoding. If different users use different encodings then exchange of textual data becomes difficult. That's where encodings which can be used globally come in.
By the time Python 4 is released I'd be surprised if Unix hadn't standardised on a single encoding like UTF-8. -- http://mail.python.org/mailman/listinfo/python-list
Re: Reading by positions plain text files
On 01/12/2010 02:03, javivd wrote: [snipped: the quoted exchange about the position format] You just have to parse the second file to build a list (or dict) containing the name, start position and end position of each variable:

    variables = [("var_name_1", 123, 123), ...]
and then work through that list, extracting the data between those positions in the first file and putting the values in another list (or dict). You also need to check whether the positions are 1-based or 0-based (Python uses 0-based). -- http://mail.python.org/mailman/listinfo/python-list
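A sketch of that suggestion (the layout values are made up; positions are assumed here to be 1-based and inclusive, as is common in survey data dictionaries, so adjust the `- 1` if they turn out to be 0-based):

```python
# Each entry: (variable name, first column, last column), 1-based inclusive.
layout = [
    ("var_name_1", 1, 1),
    ("var_name_2", 2, 3),
    ("var_name_3", 4, 4),
]

def parse_record(line, layout):
    """Slice one fixed-width record into a {name: value} dict."""
    return dict((name, line[start - 1:end]) for name, start, end in layout)

print(parse_record("10Y1", layout))
# {'var_name_1': '1', 'var_name_2': '0Y', 'var_name_3': '1'}
```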
Re: How to initialize each multithreading Pool worker with an individual value?
On Wed, Dec 1, 2010 at 7:35 AM, Valery Khamenya khame...@gmail.com wrote: [snipped: the Pool initializer question, quoted in full] I assume you are talking about multiprocessing despite you mentioning multithreading in the mix. Have a look at the source code for multiprocessing.pool and how the Pool object works and what it does with the initializer argument. I'm not entirely sure it does what you expect and yes, documentation on this is lacking... cheers James -- -- James Mills -- -- Problems are solved by method -- http://mail.python.org/mailman/listinfo/python-list
Re: Reading by positions plain text files
On 11/30/2010 08:03 PM, javivd wrote: [snipped: the position-format exchange] MRAB may be referring to my reply in that thread where you can do something like:

    OFFSETS = 'offsets.txt'
    offsets = {}
    f = file(OFFSETS)
    f.next()  # throw away the headers
    for row in f:
        varname, rest = row.split()[:2]
        # sanity check
        if varname in offsets:
            print "[%s] in %s twice?!" % (varname, OFFSETS)
        if '-' not in rest:
            continue
        start, stop = map(int, rest.split('-'))
        offsets[varname] = slice(start, stop + 1)   # 0-based positions
        #offsets[varname] = slice(start - 1, stop)  # 1-based positions
    f.close()

    def do_something_with(data):
        # your real code goes here
        print data['var_name_2']

    for row in file('data.txt'):
        data = dict((name, row[offsets[name]]) for name in offsets)
        do_something_with(data)

There are additional robustness-checks I'd include if your offsets-file isn't controlled by you (people send me daft data). -tkc -- http://mail.python.org/mailman/listinfo/python-list
To Thread or not to Thread....?
Hi there,

I'm currently writing an application to control and take measurements during an experiment. This is to be done on an embedded computer running XPe, so I am happy to have Python available, although I am pretty new to it.

The application basically runs as a state machine, which transitions through its states based on inputs read in from a set of general purpose input/output (GPIO) lines. So when a certain line is pulled low/high, do something and move to another state. All good so far, and since I get through the main loop pretty quickly, I can just read the GPIO lines on each pass through the loop and respond accordingly.

However, in one of the states I have to start reading in, and storing, frames from a camera. In another, I have to start reading accelerometer data from an I2C bus (which operates at 400kHz). I haven't implemented either yet, but I would imagine that, in the case of the camera data, reading a frame would take a large amount of time compared to other operations. Therefore, if I just tried to read one (or one set of) frames on each pass through the loop, I would hold up the rest of the application. Conversely, as the I2C bus will need to be read at such a high rate, I may not be able to get the required data rate even without the camera data.

This naturally leads me to think I need to use threads. As I am no expert in I2C, cameras, Python or threading, I thought I would chance asking for some advice on the subject. Do you think I need threads here, or would I be better off using some other method? I was previously toying with the idea of using generators to create weightless threads (as detailed in http://www.ibm.com/developerworks/library/l-pythrd.html) for reading the GPIOs. Do you think this would work in this situation? Another option would be to write separate programs, perhaps even in C, and spawn these in the background when needed. I'm a little torn as to which way to go.
If it makes a difference, or in case you are wondering: I will be interfacing to the GPIOs, cameras and I2C bus through a set of C DLLs using ctypes.

Any help or suggestions will be greatly appreciated.

Thanks very much,
Jack
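One way to sketch the threaded layout being asked about (all names here are hypothetical, and read_accelerometer() stands in for the real ctypes call): a worker thread polls the fast data source and pushes readings onto a queue, while the main state-machine loop drains the queue without ever blocking on I/O.

```python
import queue
import threading
import time


def read_accelerometer():
    # placeholder for the real ctypes DLL call
    return 42


def i2c_worker(out_queue, stop_event):
    # runs in its own thread, polling at a high rate
    while not stop_event.is_set():
        out_queue.put(read_accelerometer())
        time.sleep(0.001)  # pace the polling loop


def main_loop(iterations=50):
    readings = queue.Queue()
    stop = threading.Event()
    worker = threading.Thread(target=i2c_worker, args=(readings, stop))
    worker.start()
    collected = []
    for _ in range(iterations):
        # non-blocking drain: the state machine never stalls waiting for data
        try:
            while True:
                collected.append(readings.get_nowait())
        except queue.Empty:
            pass
        time.sleep(0.002)  # the rest of the state machine would run here
    stop.set()
    worker.join()
    return collected


samples = main_loop()
```

The same shape works for the camera: a second worker thread grabbing frames into its own queue, so a slow frame read never delays the GPIO polling in the main loop.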
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
On Wed, 2010-12-01 at 02:14 +0000, MRAB wrote:
> If the filenames are to be shown to a user then there needs to be a mapping between bytes and glyphs. That's an encoding. If different users use different encodings then exchange of textual data becomes difficult.

That's presentation; that's separate. Indeed, I have my user encoding set to UTF-8, and if there is a filename that's not valid UTF-8 then my GUI (GNOME) will show "(invalid encoding)" and even allow me to rename it, and my shell (bash) will show '?' next to the invalid characters (and make it a little more challenging to rename ;)). And I can freely copy these invalid files across different (Unix) systems, because the OS doesn't care about encoding.

But that's completely different from the actual name of the file. Unix doesn't care about presentation in filenames. It just cares about the data. There are no glyphs in Unix, only in the UI that runs on top of it. Or to put it another way, Unix's filename encoding is RAW-DATA. It's not textual data. The fact that most filenames contain mainly human-readable text is a convenient convention, but not required or enforced by the OS.

> That's where encodings which can be used globally come in. By the time Python 4 is released I'd be surprised if Unix hadn't standardised on a single encoding like UTF-8.

I have serious doubts about that. At least in the Linux world, the kernel wants to stay out of encoding debates (except where it has to, like Windows filesystems). But the point is that the world does not revolve around Python. Unix filenames have been encoding-agnostic since long before Python was around. If Python 3 does not support this then it's a regression on Python's part.
Re: Change one list item in place
On 11/30/2010 8:28 PM, MRAB wrote:
> On 01/12/2010 01:08, Gnarlodious wrote:
>
> This works for me:
>
>     def sendList():
>         return [item0, item1]
>
>     def query():
>         l = sendList()
>         return ["Formatting only {0} into a string".format(l[0]), l[1]]
>
>     query()
>
> However, is there a way to bypass the l=sendList() and change one list item in-place? Possibly a list comprehension operating on a numbered item?

There's this:

    return ["Formatting only {0} into a string".format(x) if i == 0 else x
            for i, x in enumerate(sendList())]

but that's too clever for its own good. Keep it simple. :-)

I quite agree. That solution is so clever it would be asking for a fight walking into a bar in Glasgow. However, an unpacking assignment can make everything much more comprehensible [pun intended] by removing the index operations. The canonical solution would be something like:

    def query():
        x, y = sendList()
        return ["Formatting only {0} into a string".format(x), y]

regards
Steve

--
Steve Holden  +1 571 484 6266  +1 800 494 3119
PyCon 2011 Atlanta March 9-17  http://us.pycon.org/
See Python Video!  http://python.mirocommunity.org/
Holden Web LLC  http://www.holdenweb.com/
Re: Reading by positions plain text files
On 2010-12-01, javivd javiervan...@gmail.com wrote:
> On Nov 30, 11:43 pm, Tim Harig user...@ilthio.net wrote:
>> On 2010-11-30, javivd javiervan...@gmail.com wrote:
>>> I have a case now in which another file has been provided (besides the database) that tells me in which column of the file every variable is, because there isn't any blank or tab character that separates the variables; they are stuck together. This second file specifies the variable name and its position:
>>>
>>>     VARIABLE NAME   POSITION (COLUMN) IN FILE
>>>     var_name_1      123-123
>>>     var_name_2      124-125
>>>     var_name_3      126-126
>>>     ..
>>>     var_name_N      512-513 (last positions)
>>
>> I am unclear on the format of these positions. They do not look like what I would expect from absolute references in the data. For instance, 123-123 may only contain one byte(?), which could change for different encodings and how you mark line endings. Frankly, the use of the word "columns" in the header suggests that the data *is* separated by line endings rather than absolute position, and the position refers to the line number. In which case, you can use splitlines() to break up the data and then address the proper line by index. Nevertheless, you can use file.seek() to move to an absolute offset in the file, if that really is what you are looking for.
>
> I work in a survey research firm. The data I'm talking about has a lot of 0-1 variables, meaning yes or no to a lot of questions, so only one character position is needed (not one byte), explaining the 123-123 kind of positions of a lot of the variables.

Then file.seek() is what you are looking for; but you need to be aware of line endings and encodings as indicated. Make sure that you open the file using whatever encoding was used when it was generated, or you could have problems with multibyte characters affecting the offsets.
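When the data file is line-oriented, the offsets can also be applied per record line rather than via seek(): convert the 1-based, inclusive "start-stop" ranges into Python slices and cut each line. A small sketch (the field table and sample line are invented for illustration):

```python
# 1-based inclusive column ranges, in the style of "123-123" above
fields = {
    "var_name_1": (1, 1),
    "var_name_2": (2, 3),
}


def extract(line, table):
    # Python slices are 0-based and exclusive at the stop end,
    # so shift the start down by one and keep the stop as-is
    return {name: line[start - 1:stop]
            for name, (start, stop) in table.items()}


record = extract("1AB", fields)
# record["var_name_1"] == "1", record["var_name_2"] == "AB"
```

This sidesteps the line-ending bookkeeping that absolute file offsets require, at the cost of reading each line into memory first.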
Regarding searching directory and to delete it with specific pattern.
Hi all,

I would like to search a list of directories matching a specific pattern and delete some of them. How can I do it?

Example: in /home/jpr/ I have the following directories: 1.2.3-2, 1.2.3-10, 1.2.3-8. I would like to delete the directories other than 1.2.3-10, which has the highest value.

Regards,
JPR.
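A sketch of one way to do this (the version pattern and the numeric sort on the trailing component are assumptions based on the example names): match the trailing number with a regex, keep the directory with the largest number, and shutil.rmtree the rest.

```python
import os
import re
import shutil


def prune_old(base, pattern=r"^1\.2\.3-(\d+)$"):
    """Remove every directory in `base` matching `pattern` except the
    one with the highest trailing number. Returns the removed names."""
    rx = re.compile(pattern)
    matches = [(int(m.group(1)), d)
               for d in os.listdir(base)
               for m in [rx.match(d)] if m]
    if not matches:
        return []
    matches.sort()          # numeric sort: 2 < 8 < 10
    keep = matches[-1][1]   # e.g. "1.2.3-10"
    removed = []
    for _, d in matches:
        if d != keep:
            shutil.rmtree(os.path.join(base, d))
            removed.append(d)
    return removed
```

Note the numeric sort on the extracted integer: a plain string sort would rank "1.2.3-8" above "1.2.3-10".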
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
> It'd be great if all programs used the same encoding on a given OS, but at least on Linux, I believe filenames have historically been created with different encodings. IOW, if I pick one encoding and go with it, filenames written in some other encoding are likely to cause problems. So I need something for which a filename is just a blob that shouldn't be monkeyed with.

In that case, you should use byte strings as file names, not character strings.

Regards,
Martin
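Martin's suggestion in a minimal, self-contained form (the temporary directory and file name below are examples): keep the name as bytes from listing to open, and it round-trips byte-for-byte regardless of what encoding produced it.

```python
import os
import tempfile

d = tempfile.mkdtemp().encode()  # work with bytes paths throughout
name = b"caf\xc3\xa9.txt"        # raw bytes; nothing is decoded anywhere

with open(os.path.join(d, name), "wb") as f:
    f.write(b"data")

# a bytes argument to listdir yields bytes names, untouched by any codec
assert name in os.listdir(d)
```

The same applies to os.stat, os.remove and friends: passing bytes in gets bytes semantics all the way down.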
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
> The world does not revolve around Python. Unix filenames have been encoding-agnostic since long before Python was around. If Python 3 does not support this then it's a regression on Python's part.

Fortunately, Python 3 does support that.

Regards,
Martin
Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename
> This sounds like a strong prospect for how to get things working (I didn't realize open would accept a bytes argument for the filename), but I'm also interested in whether reading filenames from stdin and subsequently opening them is supposed to just work given a suitable encoding - like with Java, which also uses unicode strings. In Java, I'm told that ISO-8859-1 is supposed to guarantee a roundtrip conversion.

It's the same in Python. However, as in Java, Python will *not* necessarily use ISO-8859-1 when you pass a (Unicode) string to open; instead, it will (as will Java) use your locale's encoding.

Regards,
Martin
Re: SAX unicode and ascii parsing problem
goldtech, 30.11.2010 22:15:
> Think I found it, for example:
>
>     line = 'my big string'
>     line.encode('ascii', 'ignore')
>
> I processed the problem strings during parsing with this and it works now.

That's not the right way of dealing with encodings, though. You should open the file with a well defined encoding (using codecs.open() or io.open() in Python >= 2.6), and then write the unicode strings into it just as you get them.

Stefan
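Stefan's advice as a short sketch (the file name and sample text are invented): declare the encoding when opening the output file and write the unicode strings straight through, instead of encode('ascii', 'ignore'), which silently drops every non-ASCII character.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.txt")
text = u"caf\u00e9"  # a unicode string, as a SAX handler would deliver it

# io.open accepts an explicit encoding on both Python 2 (>= 2.6) and 3
with io.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# reading back with the same declared encoding recovers the text exactly
with io.open(path, "r", encoding="utf-8") as f:
    round_tripped = f.read()
```

With 'ignore', the accented character would simply vanish; with a declared encoding, the data survives the round trip intact.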
[issue9639] urllib2's AbstractBasicAuthHandler is limited to 6 requests
Mark Dickinson dicki...@gmail.com added the comment: Grr. Why wasn't this fix backported to the release maintenance branch before 2.6.6 was released? I've just had an application break as a result of upgrading from 2.6.5 to 2.6.6. Oh well, too late now. :-( /grumble -- nosy: +mark.dickinson ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue9639 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue9639] urllib2's AbstractBasicAuthHandler is limited to 6 requests
Senthil Kumaran orsent...@gmail.com added the comment:

Ouch. My mistake. Had not realized then that the code that actually broke things was merged in 2.6.x and it had to be fixed too. :(

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9639
___
[issue9639] urllib2's AbstractBasicAuthHandler is limited to 6 requests
Mark Dickinson dicki...@gmail.com added the comment:

Ah well, it turned out to be fairly easy to work around, at least. :-) Just in case any other urllib2 users have to deal with this in 2.6.6 (and also manage to find their way to this bug report :-): it's easy to monkeypatch your way around the problem. E.g.:

    import sys
    import urllib2

    if sys.version_info[:2] == (2, 6) and sys.version_info[2] >= 6:
        def fixed_http_error_401(self, req, fp, code, msg, headers):
            url = req.get_full_url()
            response = self.http_error_auth_reqed('www-authenticate',
                                                  url, req, headers)
            self.retried = 0
            return response
        urllib2.HTTPBasicAuthHandler.http_error_401 = fixed_http_error_401

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9639
___
[issue10588] imp.find_module raises unexpected SyntaxError
New submission from Emile Anclin emile.anc...@logilab.fr:

Considering the following file:

    $ cat pylint/test/input/func_unknown_encoding.py
    # -*- coding: IBO-8859-1 -*-
    """check correct unknown encoding declaration"""

    __revision__ = ''
    $

When we try to find that module, imp.find_module raises SyntaxError:

    >>> from imp import find_module
    >>> find_module('func_unknown_encoding', None)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    SyntaxError: encoding problem: with BOM

It should be considered a bug, as stated by Brett Cannon: "Considering these semantics changed between Python 2 and 3 w/o a discernible benefit (I would consider it a negative, as finding a module should not be impacted by syntactic correctness; the full act of importing should be the only thing that cares about that), I would consider it a bug that should be filed."

--
messages: 122896
nosy: emile.anclin
priority: normal
severity: normal
status: open
title: imp.find_module raises unexpected SyntaxError
type: behavior
versions: Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10588
___
[issue9709] test_distutils warning: initfunc exported twice on Windows
Stefan Krah stefan-use...@bytereef.org added the comment:

Without the patch, you see the warning if test_build_ext is run in verbose mode. With the patch, the warning disappears.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9709
___
[issue3243] Support iterable bodies in httplib
Xuanji Li xua...@gmail.com added the comment:

pitrou: actually that seems a bit suspect now... you need to handle 'data' differently depending on its type, and while you can determine the type by finding out when 'data' throws certain exceptions, it doesn't seem like what exceptions were meant for.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3243
___
[issue10537] OS X IDLE 2.7rc1 from 64-bit installer hangs when you paste something.
Ned Deily n...@acm.org added the comment:

More data points: using the 2.7.1 release source tarball, the problem is reproducible on 10.6 when dynamically linked to the Apple Tcl/Tk 8.5 and executing in either 64-bit or 32-bit mode. It is not reproducible when using ActiveState Tcl/Tk 8.5.9, AS Tcl/Tk 8.4.19, or Apple Tcl/Tk 8.4 (none of which, of course, is available in 64-bit mode).

Unfortunately, the obvious workaround for the 64-bit/32-bit variant - building with one of the working 32-bit versions - does not result in a working IDLE.app or bin/idle, since IDLE and its subprocesses are all launched in 64-bit mode (where possible) on 10.6. For testing, it is possible to demonstrate 32-bit mode in a 64-/32-bit build with a properly built _tkinter.so by using the -n parameter, which causes IDLE to run with no subprocesses:

    arch -i386 /usr/local/bin/idle2.7 -n

Next step: see if the Issue6075 patches help with the Apple 8.5 Tk and, if not, add stuff to force both IDLE.app and bin/idle and their subprocesses to run only in 32-bit mode: probably either some more lipo-ing and/or adding posix_spawns.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10537
___
[issue10464] netrc module not parsing passwords containing #s.
Xuanji Li xua...@gmail.com added the comment:

bumping... can someone review this? The reported bug seems valid enough.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10464
___
[issue10576] Add a progress callback to gcmodule
Kristján Valur Jónsson krist...@ccpgames.com added the comment:

Hi, as I stated, the original patch was simply our original implementation. Here is a new patch. It is simpler:

1) it exposes a gc.callbacks list where users can register themselves, in the spirit of sys.meta_path
2) one can have multiple callbacks
3) improved error handling
4) the callback is called with a phase argument, currently 0 for start and 1 for the end

Let's start bikeshedding the calling signature. I like having a single callback, since multiple callables are a nuisance to manage. Once we agree, I'll post a patch for the documentation, and unittests.

--
Added file: http://bugs.python.org/file19884/gccallback2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10576
___
[issue10576] Add a progress callback to gcmodule
Antoine Pitrou pit...@free.fr added the comment:

> Let's start bikeshedding the calling signature. I like having a single callback, since multiple callables are a nuisance to manage.

IMO the callback should have a second argument as a dict containing various statistics that we can expand over time. The generation number, for example, should be present. As for the phase number, if 0 means start and 1 means end, you can't decently add another phase anyway (having 2 mean "somewhere between 0 and 1" would be completely confusing).

PS: please don't use C++-style comments in your patch.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10576
___
[issue10576] Add a progress callback to gcmodule
Kristján Valur Jónsson krist...@ccpgames.com added the comment:

You are right, Antoine. How about a string and a dict? The string can be "start" and "stop", and we can add interesting information to the dict as you suggest.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10576
___
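For the record, an interface very close to what is being discussed here (gc.callbacks, a phase string of "start"/"stop", and an info dict carrying at least the generation number) is what eventually landed in CPython. A sketch of how a registered callback behaves under that interface:

```python
import gc

events = []


def gc_callback(phase, info):
    # phase is "start" or "stop"; info carries at least the generation
    events.append((phase, info.get("generation")))


gc.callbacks.append(gc_callback)
gc.collect()  # force a full (generation 2) collection
gc.callbacks.remove(gc_callback)
```

After the forced collection, events contains a ("start", 2) / ("stop", 2) pair for it, plus entries for any automatic collections that happened while the callback was registered.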
[issue3243] Support iterable bodies in httplib
Antoine Pitrou pit...@free.fr added the comment:

> pitrou: actually that seems a bit suspect now... you need to handle 'data' differently depending on its type,

Yes, but you can't know all appropriate types in advance, so it's better to try and catch the TypeError. I don't understand your changes in Lib/urllib/request.py. len(data) will raise anyway.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3243
___
[issue3243] Support iterable bodies in httplib
Xuanji Li xua...@gmail.com added the comment:

I don't fully understand Lib/urllib/request.py either, I just ported it and ran the unittests... it seems that if you send an iterator through as 'data' you can't know the length in advance, and rather than let len(data) raise an exception, catlee thought it better to raise an exception telling the user exactly why the code failed (i.e., because the user sent an iterator and there's no way to meaningfully find the Content-Length of that).

As for catching exceptions vs using isinstance: I thought about it for a while, and something like this feels right to me:

    try:
        self.sock.sendall(data)
    except TypeError:
        if isinstance(data, collections.Iterable):
            for d in data:
                self.sock.sendall(d)
        else:
            raise TypeError("data should be a bytes-like object "
                            "or an iterable, got %r" % type(data))

Anyway, calling iter(data) is equivalent to calling data.__iter__(), so catching the exception is equivalent to hasattr(data, '__iter__'), which is roughly the same as isinstance(data, collections.Iterable). So we try the most straightforward method (sending everything), and if that fails, data is either an iterator or of a wrong type.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3243
___
[issue10588] imp.find_module raises unexpected SyntaxError
Changes by Ron Adam ron_a...@users.sourceforge.net:

--
nosy: +ron_adam

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10588
___
[issue3243] Support iterable bodies in httplib
Davide Rizzo sor...@gmail.com added the comment:

> len(data) will raise anyway.

No, it won't, if the iterable happens to be a sequence.

--
nosy: +davide.rizzo

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3243
___
[issue3243] Support iterable bodies in httplib
Antoine Pitrou pit...@free.fr added the comment:

>> len(data) will raise anyway.
>
> No, it won't, if the iterable happens to be a sequence.

Well, it seems the patch is confused between "iterable" and "iterator". Only iterators have a __next__, but they usually don't have a __len__. The patch should really check for iterables, so it should use:

    if isinstance(data, collections.Iterable):
        raise ValueError  # etc.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3243
___
[issue9873] urllib.parse: Allow bytes in some APIs that use string literals internally
Nick Coghlan ncogh...@gmail.com added the comment:

Committed in r86889. The docs changes should soon be live at: http://docs.python.org/dev/library/urllib.parse.html

If anyone would like to suggest changes to the wording of the docs for post beta1, or finds additional corner cases that the new bytes handling can't cope with, feel free to create a new issue.

--
resolution:  -> accepted
stage: needs patch -> committed/rejected
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9873
___