Re: documentation for the change of Python 2.5
On 6/28/06, bussiere [EMAIL PROTECTED] wrote:
> I've read this documentation: http://docs.python.org/dev/whatsnew/whatsnew25.html
> Is there a way to have it in a more printable form?

Yep: http://www.python.org/ftp/python/doc/2.5b1/
-- http://mail.python.org/mailman/listinfo/python-list
Re: Problem with sets and Unicode strings
On 6/27/06, Dennis Benzinger [EMAIL PROTECTED] wrote:
> Hi! The following program in an UTF-8 encoded file:
>
>     # -*- coding: UTF-8 -*-
>     FIELDS = ("Fächer", )
>     FROZEN_FIELDS = frozenset(FIELDS)
>     FIELDS_SET = set(FIELDS)
>     print u"Fächer" in FROZEN_FIELDS
>     print u"Fächer" in FIELDS_SET
>     print u"Fächer" in FIELDS
>
> gives this output
>
>     False
>     False
>     Traceback (most recent call last):
>       File "test.py", line 9, in ?
>         print u"Fächer" in FIELDS
>     UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
>
> Why do the first two print statements succeed and the third one fails with an exception?

Actually all three statements fail to produce the correct result.

> Why does the use of set/frozenset remove the exception?

Because sets use a hash algorithm to find matches, whereas the last statement directly compares a unicode string with a byte string. Byte strings can only contain ASCII characters, that's why Python raises an exception. The problem is very easy to fix: use unicode strings for all non-ASCII strings.
Re: Python UTF-8 and codecs
On 6/27/06, Mike Currie [EMAIL PROTECTED] wrote:
> I'm trying to write out files that have utf-8 characters 0x85 and 0x08 in them. Every configuration I try I get a
>
>     UnicodeError: ascii codec can't decode byte 0x85 in position 255: ordinal not in range(128)
>
> I've tried using codecs.open('foo.txt', 'rU', 'utf-8', errors='strict') and that doesn't work, and I've also tried wrapping the file in a utf8_writer using codecs.lookup('utf8'). Any clues?

Use unicode strings for non-ASCII characters. The following program works:

    import codecs
    c1 = unichr(0x85)
    # 'w' is enough here; the 'U' (universal newlines) flag only applies to reading
    f = codecs.open('foo.txt', 'w', 'utf-8')
    f.write(c1)
    f.close()

But unichr(0x85) is a control character, are you sure you want it? What is the encoding of your data?
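For readers on a current interpreter, here is a rough Python 3 counterpart of the program above (the thread itself is about Python 2.4). It is only a sketch: the temporary directory and file name are placeholders for this illustration, and it assumes the two control characters mentioned in the question really are the intended data.

```python
import os
import tempfile

# U+0085 (NEXT LINE) and U+0008 (BACKSPACE), the two characters from the question.
text = "\u0085\u0008"

# Hypothetical output location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "foo.txt")

# The text is encoded to UTF-8 when it is written, not before.
with open(path, "w", encoding="utf-8", newline="") as f:
    f.write(text)

# Read the raw bytes back to see what actually landed on disk.
with open(path, "rb") as f:
    raw = f.read()

print(raw)  # U+0085 takes two bytes in UTF-8, U+0008 takes one
```

The point is the same as in the reply: keep the data as text (unicode) inside the program and let the file object do the encoding.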
Re: to py or not to py ?
On 6/27/06, Chandrashekhar kaushik [EMAIL PROTECTED] wrote: HI all I have the following prob. I am to write a parallel vis application . I wud have by default used C++ for the same but somehow thought if py cud help me .. It does as in many things that i would otherwise have written down already exists ... ( like built in classes for sockets , threading etc ) I would be doin the following type of tasks .. 1. sending data across the network the data is going to be huge 2. once data has been sent i will run some vis algos parallely on them and get the results now one thing that i wud req. is serializing my data structures so that they can be sent across the net. pyton does allow this using cPickle , but it bloats the data like anythin !!! for example a class containing 2 integers which i expect will be 8 bytes long .. cPickle.dumps returns a string thats 86 bytes wide ( this is the binary version protocol 1 ) anyway to improve serialization ?? Do it yourself using struct module. also is it actually a good idea to write high perf applications in python ? Take a look at Mercurial http://www.selenic.com/mercurial/ sources. It's a high performance python application. Or watch Bryan O'Sullivan's Mercurial presentation http://video.google.com/videoplay?docid=-7724296011317502612 he talks briefly how they made it work fast. But writing high performance application in python requires self-discipline and attention to details, looking at the way you spell I think it will be a challenge ;) -- http://mail.python.org/mailman/listinfo/python-list
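As a minimal sketch of the "do it yourself using struct" suggestion, written for current Python 3: the record layout (two 32-bit signed integers, little-endian) and the helper names `pack_pair` and `unpack_pair` are assumptions made for this example, not anything from the original application.

```python
import pickle
import struct

# Hypothetical record layout: two 32-bit signed integers.
# "<" selects little-endian byte order with no padding.
PAIR = struct.Struct("<ii")

def pack_pair(a, b):
    """Serialize two integers into exactly PAIR.size bytes."""
    return PAIR.pack(a, b)

def unpack_pair(data):
    """Inverse of pack_pair; returns a tuple (a, b)."""
    return PAIR.unpack(data)

packed = pack_pair(7, -3)
print(len(packed))                                  # 8
print(unpack_pair(packed))                          # (7, -3)
print(len(pickle.dumps((7, -3))) > len(packed))     # pickle carries extra framing
```

The trade-off is that both ends must agree on the format string in advance, whereas pickle describes the data inside the stream.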
Re: Problem with sets and Unicode strings
On 6/27/06, Dennis Benzinger [EMAIL PROTECTED] wrote:
> Serge Orlov wrote:
>> On 6/27/06, Dennis Benzinger [EMAIL PROTECTED] wrote:
>>> Hi! The following program in an UTF-8 encoded file:
>>>
>>>     # -*- coding: UTF-8 -*-
>>>     FIELDS = ("Fächer", )
>>>     FROZEN_FIELDS = frozenset(FIELDS)
>>>     FIELDS_SET = set(FIELDS)
>>>     print u"Fächer" in FROZEN_FIELDS
>>>     print u"Fächer" in FIELDS_SET
>>>     print u"Fächer" in FIELDS
>>>
>>> gives this output
>>>
>>>     False
>>>     False
>>>     Traceback (most recent call last):
>>>       File "test.py", line 9, in ?
>>>         print u"Fächer" in FIELDS
>>>     UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
>>>
>>> Why do the first two print statements succeed and the third one fails with an exception?
>>
>> Actually all three statements fail to produce the correct result.
>
> So this is a bug in Python?

No.

>>> Why does the use of set/frozenset remove the exception?
>>
>> Because sets use a hash algorithm to find matches, whereas the last statement directly compares a unicode string with a byte string. Byte strings can only contain ASCII characters, that's why Python raises an exception. The problem is very easy to fix: use unicode strings for all non-ASCII strings.
>
> No, byte strings contain characters which are at least 8-bit wide http://docs.python.org/ref/types.html.

Yes, but later it's written that non-ASCII characters do not have a universal meaning assigned to them. In other words, if you put byte 0xE4 into a byte string, all Python knows about it is that it's *some* character. If you put character U+00E4 into a unicode string, Python knows it's a Latin small letter a with diaeresis. Comparing *some* character with a specific character is obviously undefined.

> But I don't understand what Python is trying to decode and why the exception says something about the ASCII codec, because my file is encoded with UTF-8.

Because byte strings can come from different sources (network, files, etc.), not only from the source code of your program, Python cannot assume all of them are UTF-8. It assumes they are ASCII, because most widespread text encodings are ASCII-based. Actually it's a guess, since there are UTF-16, UTF-32 and other encodings that are not ASCII-based. If you want to experience life without guesses, put sys.setdefaultencoding("undefined") into site.py.
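The advice "decode to unicode first, then compare" can be sketched as follows. This is written for current Python 3, where byte strings and text are separate types and are never implicitly converted, so it illustrates the principle rather than reproducing the Python 2.4 behaviour discussed above. The variable names are made up for the example.

```python
# Bytes as they might arrive from a file or the network (UTF-8 encoded).
raw = "F\u00e4cher".encode("utf-8")
fields = frozenset([raw])

# Text never compares equal to bytes, so this membership test is False.
print("F\u00e4cher" in fields)

# Decode with the encoding you know the data is in, then compare text to text.
decoded = frozenset(item.decode("utf-8") for item in fields)
print("F\u00e4cher" in decoded)
```

The explicit `decode("utf-8")` is exactly the step Python cannot guess for you: only the program knows where the bytes came from.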
Re: Python UTF-8 and codecs
On 6/27/06, Mike Currie [EMAIL PROTECTED] wrote:
> Okay, here is a sample of what I'm doing:
>
>     Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)] on win32
>     Type "help", "copyright", "credits" or "license" for more information.
>     >>> filterMap = {}
>     >>> for i in range(0,255):
>     ...     filterMap[chr(i)] = chr(i)
>     ...
>     >>> filterMap[chr(9)] = chr(136)
>     >>> filterMap[chr(10)] = chr(133)
>     >>> filterMap[chr(136)] = chr(9)
>     >>> filterMap[chr(133)] = chr(10)

This part is incorrect, it should be:

    filterMap = {}
    for i in range(0,128):
        filterMap[chr(i)] = chr(i)
    filterMap[chr(9)] = unichr(136)
    filterMap[chr(10)] = unichr(133)
    filterMap[unichr(136)] = chr(9)
    filterMap[unichr(133)] = chr(10)
Re: Python UTF-8 and codecs
On 6/27/06, Mike Currie [EMAIL PROTECTED] wrote:
> Well, not really. It doesn't affect the result. I still get the error message. Did you get a different result?

Yes, the program successfully wrote the text file. Without magic abilities to read the screen of your computer, I guess you now get an exception in the print statement. That is because you use the legacy Windows console (I use the unicode-capable console of Lightning Compiler http://www.python.org/pypi/Lightning%20Compiler to run snippets of code). You can either change the console, comment out the print statement, or change your program to print the unicode representation:

    print repr(filteredLine)
Re: Ascii Encoding Error with UTF-8 encoder
On 6/27/06, Mike Currie [EMAIL PROTECTED] wrote:
> Thanks for the thorough explanation. What I am doing is converting data for processing that will be tab (for columns) and newline (for row) delimited. Some of the data contains tabs and newlines so, I have to convert them to something else so the file integrity is good.

Usually it is done by escaping: translate tab -> \t, newline -> \n, backslash -> \\. Python strings already have a method to do it in just one line:

    >>> s=chr(9)+chr(10)+chr(92)
    >>> print s.encode("string_escape")
    \t\n\\

When you're ready to convert it back, you call decode("string_escape").

> Not my idea, I've been left with the implementation however.

The idea is actually not bad as long as you know how to cope with unicode.
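The `string_escape` codec mentioned above is specific to Python 2. As a hedged sketch of the same escaping scheme on current Python 3, the two helpers below handle only the three characters named in the reply (tab, newline, backslash); `escape_field` and `unescape_field` are names invented for this example.

```python
def escape_field(text):
    """Replace backslash, tab and newline with two-character escapes."""
    # The backslash must be escaped first, otherwise the backslashes
    # introduced for \t and \n would be escaped a second time.
    return (text.replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n"))

def unescape_field(text):
    """Inverse of escape_field; unknown escapes are left untouched."""
    mapping = {"t": "\t", "n": "\n", "\\": "\\"}
    out = []
    chars = iter(text)
    for ch in chars:
        if ch != "\\":
            out.append(ch)
            continue
        nxt = next(chars, "")          # character following the backslash
        out.append(mapping.get(nxt, "\\" + nxt))
    return "".join(out)

original = "a\tb\nc\\d"
escaped = escape_field(original)
print(escaped)
print(unescape_field(escaped) == original)
```

Because the escaped form contains no literal tab or newline, it can be written into a tab/newline delimited file without breaking rows or columns.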
Re: Function to prune dictionary keys not working
On 6/27/06, John Machin [EMAIL PROTECTED] wrote:
> | >>> '1.00' >= 0.5
> | True
> | >>> '0.33' >= 0.5
> | True
>
> Python (correctly) does very little (guesswork-based) implicit type conversion.

At the same time, Python (incorrectly :) compares incomparable objects.
Re: nested dictionary assignment goes too far
On 26 Jun 2006 16:56:22 -0700, Jake Emerson [EMAIL PROTECTED] wrote:
> I'm attempting to build a process that helps me to evaluate the performance of weather stations. The script below operates on an MS Access database, brings back some data, and then loops through to pull out statistics. One such stat is the frequency of reports from the stations ('char_freq'). I have a collection of methods that operate on the data to return the 'char_freq' and this works great. However, when the process goes to insert the unique 'char_freq' into a nested dictionary the value gets put into ALL of the sub-keys for all of the weather stations.

It's a sure sign you're sharing an object. In Python, unless explicitly documented otherwise, an assignment-like method doesn't create copies:

    >>> d = dict.fromkeys([1,2,3],[4,5,6])
    >>> id(d[1]) == id(d[2])
    True

Instead of

    rain_raw_dict = dict.fromkeys(distinctID,{'N':-6999,'char_freq':-6999,'tip1':-6999,'tip2':-6999,'tip3':-6999,'tip4':-6999,'tip5':-6999,'tip6':-6999,'lost_rain':-6999})

you should do something like this:

    defaults = {'N':-6999,'char_freq':-6999,'tip1':-6999,'tip2':-6999,'tip3':-6999,'tip4':-6999,'tip5':-6999,'tip6':-6999,'lost_rain':-6999}
    rain_raw_dict = {}
    for ID in [110,140,650,1440]:
        rain_raw_dict[ID] = defaults.copy()
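A small, self-contained demonstration of the sharing problem and the fix, written for current Python 3. The station IDs are the ones from the reply; the shortened default record and the helper `make_defaults` are assumptions for the sake of a compact example.

```python
station_ids = [110, 140, 650, 1440]

def make_defaults():
    """Return a fresh default record on every call (hypothetical helper)."""
    return {"N": -6999, "char_freq": -6999, "lost_rain": -6999}

# dict.fromkeys stores the SAME dict object under every key.
shared = dict.fromkeys(station_ids, make_defaults())
shared[110]["char_freq"] = 15
print(shared[140]["char_freq"])        # 15: the change shows up everywhere

# Building one record per key keeps the stations independent.
independent = {sid: make_defaults() for sid in station_ids}
independent[110]["char_freq"] = 15
print(independent[140]["char_freq"])   # -6999: other stations are untouched
```

Calling `defaults.copy()` per key, as in the reply, has the same effect as long as the record holds only immutable values such as numbers.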
Re: Python database access
On 25 Jun 2006 21:19:18 -0700, arvind [EMAIL PROTECTED] wrote: Hi all, I am going to work on Python 2.4.3 and MSSQL database server on Windows platform. But I don't know how to make the connectivity or rather which module to import. I searched for the modules in the Python library, but I couldn't find which module to go for. The module you're looking for is the first result if you search python mysql on google or if you search mysql on python package index http://www.python.org/pypi -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 314 - requirements for Python itself
On 6/23/06, Mark Nottingham [EMAIL PROTECTED] wrote:
> PEP 314 introduces metadata that explains what packages are required by a particular package. Is there any way to express what version of Python itself is required?

No, but you can do it yourself:

    # do not edit this file, edit actualsetup.py instead
    import sys
    if sys.version_info < (2, 4):
        print "Error: Python 2.4 or greater is required to use this package"
        sys.exit(1)
    import actualsetup

Disclaimer: I haven't actually run or tested this code, but the idea is to write checking code that is compatible with very old Python versions and do the actual work in actualsetup.py.
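The same run-time check can be sketched in a form that also runs on current Python 3. The function name `check_python` and its parameters are made up for this illustration; the minimum version (2, 4) is the one from the message above.

```python
import sys

MINIMUM = (2, 4)   # minimum version taken from the message above

def check_python(version_info=None, minimum=MINIMUM):
    """Return True when the given (or running) interpreter is new enough."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); tuples compare element by element.
    return tuple(version_info[:2]) >= minimum

if not check_python():
    sys.stderr.write("Error: Python %d.%d or greater is required\n" % MINIMUM)
    sys.exit(1)

# From here on it would be safe to import the real setup module.
```

Keeping the check free of newer syntax matters: the file has to at least parse on the old interpreters it is meant to reject.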
Re: PEP 314 - requirements for Python itself
On 6/23/06, Mark Nottingham [EMAIL PROTECTED] wrote: I was looking for some normal (hopefully, machine-readable) way to indicate it so that people can figure out the version of Python required before they download the package. I'm sure writing English text like make sure you have python 2.4 before downloading this package is not abnormal :) How do you expect to prevent users from downloading your package if they don't have python your package needs? It could be useful if there was a tool to silently download and install python, but I'm sure it is a pain to code and support such a tool, so nobody was crazy enough to do it. -- http://mail.python.org/mailman/listinfo/python-list
Re: PEP 314 - requirements for Python itself
On 6/23/06, Mark Nottingham [EMAIL PROTECTED] wrote: I was thinking more about things where people can search for packages that need different versions of python, etc.; not so much for automation. OK, now I see why you need it. I'm sure using virtual package name python to declare python dependence is logical and non-controversial, so you can write to python-dev asking for PEP 314 addendum, I just wanted say that nobody is checking it right now and *right now* run-time checking is the way to go. -- http://mail.python.org/mailman/listinfo/python-list
Re: Porting python to a TI Processor (C64xx)
On 6/21/06, Roland Geibel [EMAIL PROTECTED] wrote:
> Dear all. We want to make python run on DSP processors (C64xx family of TI).

I don't know what C64xx is, but I believe Python needs a general-purpose CPU to run.

> I've already tried to ask [EMAIL PROTECTED] (about his Python for arm-Linux), but didn't get an answer so far. Neither could I find it in the Python tree at sourceforge.

What are you trying to find in the sources? Python is just a C program and you port it just like any C program.
Re: Python at compile - possible to add to PYTHONPATH
On 21 Jun 2006 15:54:56 -0700, rh0dium [EMAIL PROTECTED] wrote: Hi all, Can anyone help me out. I would like to have python automatically look in a path for modules similar to editing the PYTHONPATH but do it at compile time so every user doesn't have to do this.. Soo... I want to add /foo/bar to the PYTHONPATH build so I don't have to add it later on. Is there a way to do this? You don't need to recompile python. Just change sys.path before all import statements. -- http://mail.python.org/mailman/listinfo/python-list
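A minimal sketch of "change sys.path before all import statements", for current Python 3. The directory /foo/bar is the one from the question; the helper `add_module_path` is a hypothetical name, and whether the directory should be searched before or after the standard locations is a choice this example makes arbitrarily (it appends).

```python
import os
import sys

# Directory from the question; it does not need to exist for this to run.
EXTRA_PATH = os.path.join(os.sep, "foo", "bar")

def add_module_path(path, search_path=sys.path):
    """Append path to the module search path unless it is already there."""
    if path not in search_path:
        search_path.append(path)
    return search_path

add_module_path(EXTRA_PATH)
# Imports placed after this point will also look in EXTRA_PATH.
print(EXTRA_PATH in sys.path)
```

The call has to run before the first import that depends on the extra directory, which is why it usually sits at the very top of the start-up script.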
Re: statically linked python
Ralph Butler wrote: Serge Orlov wrote: Ralph Butler wrote: Hi: I have searched the docs and google but have not totally figured out how to accomplish my task: On a linux box, I want to compile and link python so that it uses no shared libraries, but does support import of some extra modules. I have made a few attempts but with limited success. In particular, I have tried things like adding -static to the compiler options in the Makefile. At one point I managed to build a python that was close to what I wanted, e.g. when I ran ldd python, it said: not a dynamic executable In that version, when I do some imports, e.g. sys, os, etc. they load fine. But, when I try to import some other modules, e.g. time, they are not found. I have tried similar procedures while also altering Modules/Setup.local (produced by configure) to contain: time timemodule.c # -lm # time operations and variables There has to be a simple, elegant way to accomplish this which I am simply overlooking. Any help would be appreciated. This has nothing to do with python. glibc doesn't support loading shared libraries into statically linked executables. At least it didn't support in 2002: http://www.cygwin.com/ml/libc-alpha/2002-06/msg00079.html Since it still doesn't work most likely it is still not supported, but you may ask glibc developers what is the problem. I do not want to load them. I want to statically link the code for a module (e.g. time) directly into the statically linked executable. Sorry if that was not clear. OK, so you're asking how to make a module builtin. I haven't done that myself, but let me give you a hint where to look: there is list of builtin modules sys.builtin_module_names if you search the whole python source distribution for some of the names in the list you'll get list of files where to look. I've just searched and found that only two files are involved: PC\config.c and setup.py -- http://mail.python.org/mailman/listinfo/python-list
Re: statically linked python
Ralph Butler wrote: Hi: I have searched the docs and google but have not totally figured out how to accomplish my task: On a linux box, I want to compile and link python so that it uses no shared libraries, but does support import of some extra modules. I have made a few attempts but with limited success. In particular, I have tried things like adding -static to the compiler options in the Makefile. At one point I managed to build a python that was close to what I wanted, e.g. when I ran ldd python, it said: not a dynamic executable In that version, when I do some imports, e.g. sys, os, etc. they load fine. But, when I try to import some other modules, e.g. time, they are not found. I have tried similar procedures while also altering Modules/Setup.local (produced by configure) to contain: time timemodule.c # -lm # time operations and variables There has to be a simple, elegant way to accomplish this which I am simply overlooking. Any help would be appreciated. This has nothing to do with python. glibc doesn't support loading shared libraries into statically linked executables. At least it didn't support in 2002: http://www.cygwin.com/ml/libc-alpha/2002-06/msg00079.html Since it still doesn't work most likely it is still not supported, but you may ask glibc developers what is the problem. -- http://mail.python.org/mailman/listinfo/python-list
Re: BeautifulSoup error
William Xu wrote: Hi, all, This piece of code used to work well. i guess the error occurs after some upgrade. import urllib from BeautifulSoup import BeautifulSoup url = 'http://www.google.com' port = urllib.urlopen(url).read() soup = BeautifulSoup() soup.feed(port) Traceback (most recent call last): File stdin, line 1, in ? File /usr/lib/python2.3/sgmllib.py, line 94, in feed Look at the traceback: you're not calling BeautifulSoup module! In fact, there is no feed method in the current BeautifulSoup documentation. Maybe it used to work well, but now it's definitely going to fail. As I understand documentation you need to write soup = BeautifulSoup(port) -- http://mail.python.org/mailman/listinfo/python-list
Re: memory leak problem with arrays
sonjaa wrote:
> Serge Orlov wrote:
>> sonjaa wrote:
>>> Serge Orlov wrote:
>>>> sonjaa wrote:
>>>>> Hi I'm new to programming in python and I hope that this is the problem. I've created a cellular automata program in python with the numpy array extensions. After each cycle/iteration the memory used to examine and change the array as determined by the transition rules is never freed. I've tried using del on every variable possible, but that hasn't worked.
>>>>
>>>> Python keeps track of the number of references to every object. If the object has more than one reference by the time you use del, the object is not freed; only the number of references is decremented. Print the number of references for all the objects you think should be freed after each cycle/iteration. If it is not equal to 2, that means you are holding extra references to those objects. You can get the number of references to any object by calling sys.getrefcount(obj).
>>>
>>> thanks for the info. I used this on several variables/objects and discovered that little counters i.e. k = k +1 have many references to them, up to 1+. Is there a way to free them?
>>
>> Although it looks suspicious, even if you manage to free it you will gain only 12 bytes. I think you should concentrate on fatter objects ;)
>
> Sent message to the NumPy forum as per Roberts suggestion. An update after implementing the suggestions: After doing this I see that iterative counters used to collect occurrences and nested loop counters (ii jj) as seen in the code example below are the culprits with the worst ones over 1M:

That means you have over 1M references to integers in your program. How did it happen if you're using numpy arrays? If I allocate a numpy array of one million bytes it is not using one million integer objects, whereas a Python list of 1M integers holds 1M references to integer objects:

    >>> import numpy
    >>> a = numpy.zeros((1000000,), numpy.UnsignedInt8)
    >>> import sys
    >>> sys.getrefcount(0)
    632
    >>> b=[0]*1000000
    >>> sys.getrefcount(0)
    1000632

But that doesn't explain why your program doesn't free memory. By the way, are you sure you have enough memory for one iteration of your program?
[OT] Re: Python open proxy honeypot
imcs ee wrote: On 13 Jun 2006 15:09:57 -0700, Serge Orlov [EMAIL PROTECTED] wrote: Alex Reinhart wrote: My spam folder at gmail is not growing anymore for many months (it is about 600-700 spams a month). Have spammers given up spamming gmail.com only or is it global trend? Gmail said messages that have been in Spam more than 30 days will be automatically deleted so may be the speed of spam comes in counterbalanced to the speed spam goes out? Yes, it is. My point was monthly amount is not increasing for me. But I guess if you publish your email everywhere it is increasing: http://egofood.blogspot.com/2006/06/well-spam-is-officially-annoying.html 20,000 a month. Wow. -- http://mail.python.org/mailman/listinfo/python-list
Re: BeautifulSoup error
William Xu wrote: Hi, all, This piece of code used to work well. i guess the error occurs after some upgrade. import urllib from BeautifulSoup import BeautifulSoup url = 'http://www.google.com' port = urllib.urlopen(url).read() soup = BeautifulSoup() soup.feed(port) Traceback (most recent call last): File stdin, line 1, in ? File /usr/lib/python2.3/sgmllib.py, line 94, in feed self.rawdata = self.rawdata + data UnicodeDecodeError: 'ascii' codec can't decode byte 0xb8 in position 565: ordinal not in range(128) Any ideas to solve this? According to the documentation http://www.crummy.com/software/BeautifulSoup/documentation.html chapter Beautiful Soup Gives You Unicode, Dammit Beautiful Soup fully supports unicode so it's probably a bug. version info: Python 2.3.5 (#2, Mar 7 2006, 12:43:17) [GCC 4.0.3 20060212 (prerelease) (Debian 4.0.2-9)] on linux2 python-beautifulsoup: 3.0.1-1 Upgrading python-beautifulsoup is a good idea, since there were two bug fix releases after 3.0.1 -- http://mail.python.org/mailman/listinfo/python-list
Re: memory leak problem with arrays
sonjaa wrote:
> Hi I'm new to programming in python and I hope that this is the problem. I've created a cellular automata program in python with the numpy array extensions. After each cycle/iteration the memory used to examine and change the array as determined by the transition rules is never freed. I've tried using del on every variable possible, but that hasn't worked.

Python keeps track of the number of references to every object. If the object has more than one reference by the time you use del, the object is not freed; only the number of references is decremented. Print the number of references for all the objects you think should be freed after each cycle/iteration. If it is not equal to 2, that means you are holding extra references to those objects. You can get the number of references to any object by calling sys.getrefcount(obj).
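The effect of extra references on sys.getrefcount can be seen in a few lines. This sketch is for current CPython 3 and uses a plain `object()` instead of a small integer, because small integers are shared by the whole interpreter and their counts are not meaningful on recent versions. The names `marker` and `holders` are invented for the example.

```python
import sys

marker = object()                  # an object we want to see freed eventually
before = sys.getrefcount(marker)   # includes the temporary reference made by the call

holders = [marker] * 1000          # 1000 extra references to the same object
during = sys.getrefcount(marker)

del holders                        # drops all 1000 references at once
after = sys.getrefcount(marker)

print(during - before)             # number of references the list was holding
print(after == before)             # back to the starting count
```

As long as something like `holders` is still alive, `del marker` would only decrement the count; the object itself stays in memory.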
Re: memory leak problem with arrays
sonjaa wrote:
> Serge Orlov wrote:
>> sonjaa wrote:
>>> Hi I'm new to programming in python and I hope that this is the problem. I've created a cellular automata program in python with the numpy array extensions. After each cycle/iteration the memory used to examine and change the array as determined by the transition rules is never freed. I've tried using del on every variable possible, but that hasn't worked.
>>
>> Python keeps track of the number of references to every object. If the object has more than one reference by the time you use del, the object is not freed; only the number of references is decremented. Print the number of references for all the objects you think should be freed after each cycle/iteration. If it is not equal to 2, that means you are holding extra references to those objects. You can get the number of references to any object by calling sys.getrefcount(obj).
>
> thanks for the info. I used this on several variables/objects and discovered that little counters i.e. k = k +1 have many references to them, up to 1+. Is there a way to free them?

Although it looks suspicious, even if you manage to free it you will gain only 12 bytes. I think you should concentrate on fatter objects ;)
Re: split with * in string and ljust() puzzles
Sambo wrote: I have just (finally) realized that it is splitting and removing on single space but that seams useless, and split items 1 and 2 are empty strings not spaces?? What is useless for you is worth $1,000,000 for somebody else ;) If you have comma separated list '1,,2'.split(',') naturally returns ['1', '', '2']. I think you can get what you want with a simple regexp. -- http://mail.python.org/mailman/listinfo/python-list
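To make the difference concrete, here is a short sketch in current Python 3 comparing splitting on a single space, splitting on any whitespace run, and the regexp route mentioned above. The sample line is made up for the illustration.

```python
import re

line = "a  b * c"                      # note the two spaces after "a"

# An explicit separator keeps empty strings between adjacent separators,
# exactly like '1,,2'.split(',') does.
by_space = line.split(" ")

# No argument: runs of whitespace count as one separator.
by_whitespace = line.split()

# The regexp equivalent, useful when the separator is more complicated.
by_regex = re.split(r"\s+", line.strip())

print(by_space)
print(by_whitespace)
print(by_regex)
```

The empty strings in the first result are not spaces that failed to be removed; they mark the empty field between two neighbouring separators.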
Re: Bundling an application with third-party modules
Ben Finney wrote:
> Serge Orlov [EMAIL PROTECTED] writes:
>> Ben Finney wrote:
>>> That's a large part of my question. How can I lay out these modules sensibly during installation so they'll be easily available to, but specific to, my application?
>>
>> Put them in a directory "lib" next to the main module and start the main module with the following blurb:
>>
>>     import sys, os
>>     sys.path.insert(1, os.path.join(sys.path[0],"lib"))
>
> The application consists of many separate programs to perform various tasks, some larger than others. There's no sensible place for a main module.

Perhaps I'm using my own jargon. By main module I mean every module used to start any application within your project. If you want a relocatable solution, create an empty .topdir file in the top directory and put this blurb into every application:

    import sys, os
    top_dir = sys.path[0]
    while True:
        if os.path.exists(os.path.join(top_dir,".topdir")):
            break
        top_dir = os.path.dirname(top_dir)
    sys.path.insert(1, os.path.join(top_dir,"lib"))

I don't think you need to worry about duplication, I used this code. It is version 1.0 and it is final :) You won't need to change it.

> There probably will be a library directory for common code, though. Are you suggesting that the third-party libraries should go within the application-native library?

Not really. I was just feeling lazy to type a generic solution, so I assumed one project == one application.

> What's a good way to get from upstream source code (some of which is eggs, some of which expects 'distutils' installation, and some of which is simple one-file modules) to a coherent set of application library code, that is automatable in an install script?

Well, I did it manually; it's not that time consuming if you keep in mind that you also need to test the new version, and by testing I also mean finding integration bugs days later. Anyway, if you feel like automating, I think you can do something using the distutils command install --home=/temp/dir and then copying to your common library directory in a flat manner (the --home option puts files in subdirectories that don't make sense for a bundled lib).
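The .topdir lookup can also be written as a small function, sketched here for current Python 3. Unlike the blurb in the message, this variant stops at the filesystem root instead of looping forever when no marker file exists; that safeguard, the function name `find_top_dir` and the `marker` parameter are additions made for this illustration.

```python
import os

def find_top_dir(start, marker=".topdir"):
    """Walk upward from start until a directory containing marker is found.

    Returns the directory path, or None when the filesystem root is
    reached without finding the marker.
    """
    current = os.path.abspath(start)
    while True:
        if os.path.exists(os.path.join(current, marker)):
            return current
        parent = os.path.dirname(current)
        if parent == current:          # dirname of the root is the root itself
            return None
        current = parent
```

A start-up script could then do something like `top = find_top_dir(sys.path[0])` followed by `sys.path.insert(1, os.path.join(top, "lib"))`, after checking that `top` is not None.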
Re: embedded python and windows multi threading, can't get it to work
freesteel wrote:
> I am trying to run a python programme embedded from C++. I want to run the same python code concurrently in several threads. I read the manual on embedding, especially chapter 8, and searched for relevant info on google all afternoon, but I can't get this to work. What am I doing wrong? I use python2.4 and vc++7 (.net). The first thread seems to work okay, the 2nd thread crashes, but the exception information is not very useful: (An unhandled exception of type 'System.NullReferenceException' occurred in pyembed_test.exe

Running one interpreter in more than one thread is not supported. You need to create one interpreter per thread using Py_NewInterpreter (don't forget to read the "Bugs and caveats" paragraph). I hope you also realize the interpreters won't share objects.
[OT] Re: Python open proxy honeypot
Alex Reinhart wrote: Being deluged by spam like nearly all of us (though fortunately I have a very good spam filter), I also hate spam as much as almost everybody. I know basic Python (enough to make a simple IRC bot) and I figured a good project to help learn Python would be to make a simple proxypot. I've done some research and found one already existing, written in Perl (http://www.proxypot.org/). However, I prefer the syntax and ease of Python (and Proxypot is no longer maintained, as far as I can see), so I decided to write my own. I have just one question: Is running Python's built-in smtpd, pretending to accept and forward all messages, enough to get me noticed by a spammer, or do I have to do something else to advertise my script as an open proxy? I'm hoping to make this proxy script distributed, in that several honeypots are run on different servers, and the results are then collected on a central server that provides statistics and a listing of all spammers caught. So, just out of curiosity, I'd like to know how many people would actually be willing to run a honeypot on their server, and how many are opposed to the idea (just so I know if the concept is even valid). IMHO it's pretty useless, spammers are starting to use botnets, and the more you make inconvenient to them use open proxies, the more of them will move to closed botnets. My spam folder at gmail is not growing anymore for many months (it is about 600-700 spams a month). Have spammers given up spamming gmail.com only or is it global trend? -- http://mail.python.org/mailman/listinfo/python-list
[OT] Re: Python open proxy honeypot
Alex Reinhart wrote: Serge Orlov wrote: IMHO it's pretty useless, spammers are starting to use botnets, and the more you make inconvenient to them use open proxies, the more of them will move to closed botnets. As long as I inconvenience them, or at least catch one or two, I'll be satisfied. What makes you think that spammers won't discover you're blackholing their spam as soon as you start to make some impact on their business? They will just skip your proxypots and move to real open proxies. I think you'll make bigger impact if you implement proxy checking software http://dsbl.org/programs in Python, so it can run on windows too. -- http://mail.python.org/mailman/listinfo/python-list
Re: Intermittent Failure on Serial Port (Trace Result)
H J van Rooyen wrote:
> Note that the point of failure is not the same place in the python file, but it is according to the traceback, again at a flush call...

Yes, the traceback is bogus. Maybe the error is raised during garbage collection, although the strace you've got doesn't show that. The main reason for the failure seems to be a workaround in Python's function new_buffersize: it doesn't clear errno after lseek, and then this errno pops up somewhere else. There are two places I can clearly see that don't clear errno: file_dealloc and get_line. Obviously this stuff needs to be fixed, so you'd better file a bug report. I'm not sure how to work around this bug in the meantime, since it is still not clear where this error is coming from. Try to pinpoint it. For example, if your code relies on garbage collection to call file.close, try to close all files in your program explicitly. It seems like a good idea anyway: since your program is long running, errors during close are not that significant. Instead of the standard close I'd call something like this:

    from sys import stderr

    def soft_close(f):
        try:
            f.close()
        except IOError, e:
            print >>stderr, "Hmm, close of file failed. Error was: %s" % e.errno

> The close failed is explicable - it seems to happen during closedown, with the port already broken..,

It is not clear who calls lseek right before close. lseek is called by new_buffersize, which is called by file.read. But who calls file.read during closedown?
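For reference, the soft_close idea above can be sketched on current Python 3 as follows. The `log` parameter and the boolean return value are additions for this illustration, so that callers and tests can see whether the close succeeded; in Python 3, IOError is an alias of OSError.

```python
import sys

def soft_close(f, log=sys.stderr):
    """Close f, reporting a failure instead of letting it propagate.

    Returns True when close() succeeded and False when it raised OSError.
    """
    try:
        f.close()
        return True
    except OSError as e:
        log.write("Hmm, close of file failed. Error was: %s\n" % e.errno)
        return False
```

Calling this explicitly on every open file at shutdown removes the dependence on garbage collection order, which is the situation the reply suspects.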
Re: Intermittent Failure on Serial Port
H J van Rooyen wrote: Serge Orloff wrote: | H J van Rooyen wrote: | Traceback (most recent call last): |File portofile.py, line 232, in ? | ret_val = main_routine(port, pollstruct, pfifo) |File portofile.py, line 108, in main_routine | send_nak(port, timeout) # so bad luck - comms error |File /home/hvr/Polling/lib/readerpoll.py, line 125, in send_nak | port.flush() | IOError: [Errno 29] Illegal seek | close failed: [Errno 29] Illegal seek | | | | Where can I find out what the Errno 29 really means? | Is this Python, the OS or maybe hardware? | | It is from kernel: grep -w 29 `locate errno` | /usr/include/asm-generic/errno-base.h: #define ESPIPE 29 | /* Illegal seek */ | | man lseek: | | ERRORS: | ESPIPE fildes is associated with a pipe, socket, or FIFO. | | RESTRICTIONS: | Linux specific restrictions: using lseek on a tty device | returns ESPIPE. Thanks for the info - so the Kernel sometimes bombs me out - does anybody know why the python flush sometimes calls lseek? I thought it was your own flush method. If it is file.flush method that makes the issue more complicated, since stdlib file.flush doesn't call lseek method. I suggest you run your program using strace to log system calls, without such log it's pretty hard to say what's going on. The most interesting part is the end, but make sure you have enough space for the whole log, it's going to be big. -- http://mail.python.org/mailman/listinfo/python-list
Re: Intermittent Failure on Serial Port
H J van Rooyen wrote: Traceback (most recent call last): File portofile.py, line 232, in ? ret_val = main_routine(port, pollstruct, pfifo) File portofile.py, line 108, in main_routine send_nak(port, timeout) # so bad luck - comms error File /home/hvr/Polling/lib/readerpoll.py, line 125, in send_nak port.flush() IOError: [Errno 29] Illegal seek close failed: [Errno 29] Illegal seek Where can I find out what the Errno 29 really means? Is this Python, the OS or maybe hardware? It is from kernel: grep -w 29 `locate errno` /usr/include/asm-generic/errno-base.h: #define ESPIPE 29 /* Illegal seek */ man lseek: ERRORS: ESPIPE fildes is associated with a pipe, socket, or FIFO. RESTRICTIONS: Linux specific restrictions: using lseek on a tty device returns ESPIPE. -- http://mail.python.org/mailman/listinfo/python-list
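The same lookup can be done from Python itself instead of grepping the kernel headers. A small sketch using the standard errno and os modules; the helper name describe_errno is made up for this example, and the message text returned by os.strerror depends on the platform.

```python
import errno
import os

def describe_errno(code):
    """Return (symbolic name, platform message) for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)

if __name__ == "__main__":
    # On Linux, 29 is ESPIPE, matching the header quoted above.
    print(describe_errno(errno.ESPIPE))
```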
Re: Getting start/end dates given week-number
Tim Chase wrote:
> I've been trying to come up with a good algorithm for determining the
> starting and ending dates given the week number (as defined by the
> strftime("%W") function).

I think you missed the %U format, since later you write:

> My preference would be for a Sunday-Saturday range rather than a
> Monday-Sunday range. Thus,
> Any thoughts/improvements/suggestions would be most welcome.

If you want to match %U:

    from datetime import date, timedelta

    def weekBoundaries(year, week):
        startOfYear = date(year, 1, 1)
        # last Sunday strictly before Jan 1 (isoweekday: Mon=1 ... Sun=7);
        # week N then starts N weeks after it
        week0 = startOfYear - timedelta(days=startOfYear.isoweekday())
        sun = week0 + timedelta(weeks=week)
        sat = sun + timedelta(days=6)
        return sun, sat
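One way to gain confidence in the function above is to check it against strftime("%U") for every day of a year. The sketch below restates the function with Python 3 style names; check_year is a hypothetical helper added only for this check.

```python
from datetime import date, timedelta

def week_boundaries(year, week):
    """Sunday and Saturday of the given %U week number."""
    start_of_year = date(year, 1, 1)
    # Sunday strictly before Jan 1; week N starts N weeks after it.
    week0 = start_of_year - timedelta(days=start_of_year.isoweekday())
    sunday = week0 + timedelta(weeks=week)
    return sunday, sunday + timedelta(days=6)

def check_year(year):
    """True if every day of the year falls inside its own %U week."""
    day = date(year, 1, 1)
    while day.year == year:
        sunday, saturday = week_boundaries(year, int(day.strftime("%U")))
        if not (sunday <= day <= saturday):
            return False
        day += timedelta(days=1)
    return True
```

Note that for week 0 the returned range can start in the previous year, since week 0 only covers the days before the first Sunday.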
Re: Freezing a static executable
Will Ware wrote:
> I am trying to freeze a static executable. I built a static Python
> executable this way:
>     ./configure --disable-shared --prefix=/usr/local
>     make
>     make install
> Even that didn't give me a really static executable, though:

AFAIK it's not supported, because the interpreter won't be able to load C extensions if compiled statically. There is a bootstrap issue: to build a static python executable you need the extensions built, but to build extensions you need python, so you need an unconventional build procedure. After the python build is finished you get the static library libpython2.4.a. Then you need all the extensions you're going to use built as .a files (I'm not even sure there is a standard way to do it). Then you need to write a loader like the ones in py2exe, exemaker, pyinstaller, etc. that will initialize the python interpreter and the extensions. Those three pieces (libpython2.4.a, extensions, loader) can be linked as a static executable.

> What stupid thing am I doing wrong?

You are just trying to do something nobody was really interested in implementing.
Re: is it possible to find which process dumped core
su wrote: to find which process dumped core at the promt we give $ file core.28424 core.28424: ELF 32-bit LSB core file of 'soffice.bin' (signal 11), Intel 80386, version 1 (SYSV), from 'soffice.bin' from this command we know 'soffice.bin' process dumped core. Now can i do the same using python i.e. finding which process dumped core? if so how can i do it? Parse a core file like the file command does? -- http://mail.python.org/mailman/listinfo/python-list
Re: Get EXE (made with py2exe) path directory name
Andrei B wrote: I need to get absolute path name of a file that's in the same dir as the exe, however the Current Working Directory is changed to somthing else. Use sys.path[0] -- http://mail.python.org/mailman/listinfo/python-list
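A hedged sketch of how this is often written so that it works both for a plain script and for a frozen executable. It assumes the freezing tool sets the sys.frozen attribute (py2exe does); app_dir is a name made up for this example, and whether sys.path[0] alone is enough depends on how the exe was built.

```python
import os
import sys

def app_dir():
    """Directory holding the running program, independent of the cwd."""
    if getattr(sys, "frozen", False):
        # Frozen executable: assume the exe itself marks the install directory.
        return os.path.dirname(os.path.abspath(sys.executable))
    # Plain script: sys.path[0] is the directory of the main module.
    return os.path.abspath(sys.path[0] or os.getcwd())
```

A data file next to the program can then be opened with os.path.join(app_dir(), "data.txt"), where the file name is of course just a placeholder.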
Re: struct: type registration?
John Machin wrote:
> On 2/06/2006 4:18 AM, Serge Orlov wrote:
>> If you want to parse binary data use pyconstruct
>> http://pyconstruct.wikispaces.com/
>
> Looks promising on the legibility and functionality fronts. Can you make
> any comment on the speed?

I don't really know. I used it for small data parsing, and its performance was acceptable. As I understand it, it is implemented right now as pure Python code using struct under the hood. The biggest concern is the lack of comprehensive documentation; if that scares you, it's not for you.

> Reason for asking is that Microsoft Excel files have this weird RK format
> for expressing common float values in 32 bits (refer
> http://sc.openoffice.org, see under Documentation heading). I wrote and
> support the xlrd module (see http://cheeseshop.python.org/pypi/xlrd) for
> reading those files in portable pure Python. Below is a function that
> would plug straight in as an example of Giovanni's custom unpacker
> functions. Some of the files can be very large, and reading rather slow.

I *guess* that the *current* implementation of pyconstruct will make parsing slightly slower. But you have to try to find out.

> from struct import unpack
>
> def unpack_RK(rk_str): # arg is 4 bytes
>     flags = ord(rk_str[0])
>     if flags & 2:
>         # There's a SIGNED 30-bit integer in there!
>         i, = unpack('<i', rk_str)
>         i >>= 2 # div by 4 to drop the 2 flag bits
>         if flags & 1:
>             return i / 100.0
>         return float(i)
>     else:
>         # It's the most significant 30 bits
>         # of an IEEE 754 64-bit FP number
>         d, = unpack('<d', '\0\0\0\0' + chr(flags & 252) + rk_str[1:4])
>         if flags & 1:
>             return d / 100.0
>         return d

I had to look up what that means :) Since nobody except this function cares about the internals of an RK number, you don't need to use pyconstruct to parse at bit level. The code will be almost like you wrote, except you replace unpack('<d', with Construct.LittleFloat64().parse( and plug unpack_RK into the pyconstruct framework by deriving from the Field class.
Sure, nobody is going to raise your paycheck because of this rewrite :) The biggest benefit comes from parsing the whole data file with pyconstruct, not individual fields. -- http://mail.python.org/mailman/listinfo/python-list
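For reference, a sketch of the RK decoder quoted above ported to Python 3, where the input is a bytes object and indexing it already yields integers. The little-endian '<' format prefixes are an assumption on my part (Excel files are little-endian); the bit layout follows the comments in the quoted function.

```python
import struct

def unpack_rk(rk_bytes):
    """Decode a 4-byte RK value into a float."""
    flags = rk_bytes[0]
    if flags & 2:
        # Signed 30-bit integer stored in the upper 30 bits.
        (i,) = struct.unpack('<i', rk_bytes)
        value = float(i >> 2)  # drop the 2 flag bits
    else:
        # Most significant 30 bits of an IEEE 754 double;
        # the missing low-order bits are taken as zero.
        raw = b'\0\0\0\0' + bytes([flags & 252]) + rk_bytes[1:4]
        (value,) = struct.unpack('<d', raw)
    if flags & 1:
        value /= 100.0  # flag bit 0: stored value is scaled by 100
    return value
```

A quick round trip: the integer 5 is stored as (5 << 2) | 2, so unpack_rk(struct.pack('<i', 22)) gives 5.0.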
Re: py2exe qt4/qimage
aljosa wrote: i'm trying to convert python (image resizer script using PyQt4) script to exe but support for jpeg and tiff image formats is located in Qt4.1\plugins\imageformats (dll files) and when script is converted exe file doesn't support jpeg and tiff. i tryed using all file formats in script: tmp1 = QImage('images/type.bmp') tmp2 = QImage('images/type.gif') tmp3 = QImage('images/type.jpg') tmp4 = QImage('images/type.png') tmp5 = QImage('images/type.tif') but it doesn't work when i convert script to exe. any tips on howto include jpeg and tiff image formats support in exe? You need bundle the plugins as data files: http://docs.python.org/dist/node12.html -- http://mail.python.org/mailman/listinfo/python-list
Re: struct: type registration?
Giovanni Bajo wrote: John Machin wrote: I am an idiot, so please be gentle with me: I don't understand why you are using struct.pack at all: Because I want to be able to parse largest chunks of binary datas with custom formatting. Did you miss the whole point of my message: struct.unpack(3liiSiiShh, data) Did you want to write struct.unpack(Sheesh, data) ? Seriously, the main problem of struct is that it uses ad-hoc abbreviations for relatively rarely[1] used functions calls and that makes it hard to read. If you want to parse binary data use pyconstruct http://pyconstruct.wikispaces.com/ [1] Relatively to regular expression and string formatting calls. -- http://mail.python.org/mailman/listinfo/python-list
Re: Are ActivePython scripts compatible with Linux?
A.M wrote: I am planning to develop python applications on windows and run them on Linux. Larry Bates [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Short answer: yes A.M wrote: Thanks alot Larry for your comprehensive answer. Small addition: test, test, test. This is the only way to make sure your program works on another platform. VMware is offering now free virtual machine emulator, vmplayer. You have no excuse not to install linux! :) If you have dual-core processor or an idle machine you can even setup http://buildbot.sf.net to continuously test your source code changes. -- http://mail.python.org/mailman/listinfo/python-list
Re: Is anybody knows about a linkable, quick MD5/SHA1 calculator library ?
DurumDara wrote: Hi ! I need to speedup my MD5/SHA1 calculator app that working on filesystem's files. I use the Python standard modules, but I think that it can be faster if I use C, or other module for it. I use FSUM before, but I got problems, because I move into DOS area, and the parameterizing of outer process maked me very angry (not working). You will see this in this place: http://mail.python.org/pipermail/python-win32/2006-May/004697.html FWIW I looked at what is the problem, apparently fsum converts the name back to unicode, tries to print it and silently corrupts the output. You give it short name XA02BB~1 of the file xAÿ and fsum prints xA Use python module or try another utility. -- http://mail.python.org/mailman/listinfo/python-list
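As a sketch of the "use a python module" route: the hashing itself runs in C inside the standard library, so the usual advice is to read the file in large chunks and feed both digests in one pass. This assumes a Python version that has hashlib (2.5 or later; older releases used the separate md5 and sha modules), and file_digests is a name invented for this example.

```python
import hashlib

def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha1_hex) of a file, reading it only once."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns b"" at EOF
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()
```

Whether this is fast enough compared with an external tool is something only a measurement on the real files can tell.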
Re: why not in python 2.4.3
Rocco wrote: Also with ascii the function does not work. Well, at least you fixed misconfiguration ;) Googling for 1F8B (that's two first bytes from your strange python 2.4 result) gives a hint: it's a beginning of gzip stream. Maybe urllib2 in python 2.4 reports to the server that it supports compressed data but doesn't decompress it when receives the reply? -- http://mail.python.org/mailman/listinfo/python-list
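The 1F8B observation can be turned into a defensive check. This is only a sketch for a modern Python 3 (gzip.decompress does not exist in 2.4), and it does not explain why the reply was compressed in the first place; maybe_gunzip is a hypothetical helper name.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream

def maybe_gunzip(body):
    """Decompress body if it looks like gzip data, else return it unchanged."""
    if body[:2] == GZIP_MAGIC:
        return gzip.decompress(body)
    return body
```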
Re: saving settings
SuperHik wrote: aum wrote: On Mon, 29 May 2006 09:05:36 +0200, SuperHik wrote: Hi, I was wondering how to make a single .exe file, say some kind od clock, and be able to save some settings (alarm for example) into the same file? Basically make code rewrite it self... thanks! Yikes!!! I'd strongly suggest you read the doco for ConfigParser, and load/save your config file to/from os.path.join(os.path.expanduser(~)). Another option - save your stuff in the Windows Registry but if I copy this file on the other computer settings will be lost... Put your program in a writable folder and save configuration right into that folder. Then your can transfer the whole folder. Tip: sys.path[0] always contains the path to the directory where __main__ module is located. -- http://mail.python.org/mailman/listinfo/python-list
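A minimal sketch of the "save it next to the program" suggestion, combined with the ConfigParser module aum mentioned (spelled configparser in Python 3). The file name settings.ini, the section name alarm and both function names are assumptions made for this example.

```python
import configparser
import os
import sys

def _settings_file(base_dir, filename):
    # Default to the directory of the main module, as in the tip above.
    base_dir = base_dir or sys.path[0] or os.getcwd()
    return os.path.join(base_dir, filename)

def save_settings(values, base_dir=None, filename="settings.ini"):
    """Write a dict of string settings into the [alarm] section."""
    parser = configparser.ConfigParser()
    parser["alarm"] = values
    path = _settings_file(base_dir, filename)
    with open(path, "w") as fh:
        parser.write(fh)
    return path

def load_settings(base_dir=None, filename="settings.ini"):
    """Read the [alarm] section back; empty dict if nothing was saved."""
    parser = configparser.ConfigParser()
    parser.read(_settings_file(base_dir, filename))
    return dict(parser["alarm"]) if parser.has_section("alarm") else {}
```

Copying the whole folder to another machine then carries the settings file along with the program.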
Re: why not in python 2.4.3
John Machin wrote: On 29/05/2006 10:47 PM, Serge Orlov wrote: Maybe urllib2 in python 2.4 reports to the server that it supports compressed data but doesn't decompress it when receives the reply? Something funny is happening here. Others reported it working with 2.4.3 and Rocco's original code as posted in this thread -- which works for me on 2.4.2, Windows XP. It works for me too, returning raw uncompressed data. There was one suss thing about Rocco's problem description: First message ended with d=takefeed(url) But next message said print rss Is rss == d? Nope. If you look at html tags, 2.3 code returns feed generator ... whereas 2.4 code returns rss channel generator ... That may explain why 2.3 result is not compressed and 2.4 result is compressed, but that doesn't explain why 2.4 *is* compressed. I looked at python 2.4 httplib, I'm sure it's not a problem, quote from httplib: # we only want a Content-Encoding of identity since we don't # support encodings such as x-gzip or x-deflate. I think there is a web accellerator sitting somewhere between Rocco and Google server that is confused that Rocco is misinforming web server saying he's using Firefox, but at the same time claiming that he cannot handle compressed data. That's why they teach little kids: don't lie :) -- http://mail.python.org/mailman/listinfo/python-list
Re: why not in python 2.4.3
Rocco wrote:
> >>> import sys
> >>> sys.getdefaultencoding()
> 'latin_1'

Don't change the default encoding. It should always be ascii.
Re: q - including manpages in setup.py
aum wrote:
> Hi, What is the best way to incorporate manpages in a distutils setup.py
> script? Is there any distro-independent way to find the most appropriate
> place to put the manpages? For instance, /usr/man/? /usr/share/man?
> /usr/local/man? /usr/local/share/man?

What do you mean by distro? Linux? That should be /usr/local/man, but AFAIK some distros are misconfigured and their man doesn't search /usr/local by default, YMMV.

> Also - I've got .html conversions of the manpages, for the benefit of OSs
> such as Windows which don't natively support manpages. What's the best
> place to put these?

your_tool --html-manual, which uses os.startfile or the webbrowser module to invoke an html viewer. Or your_tool --man, which dumps plain text on the screen.
Re: iteration over non-sequence ,how can I resolve it?
python wrote:
> To BJörn Lindqvist: thank you. How to write the code specifically?
> Could you give an example?

Use the Queue module:

    import threading
    from Queue import Queue

    class PrintThread(threading.Thread):
        def __init__(self, urlList, results_queue):
            threading.Thread.__init__(self)
            self.urllist = urlList
            self.results_queue = results_queue

        def run(self):
            urllink = [self.urllist] * 2
            # hand the result back to the main thread
            self.results_queue.put(urllink)

    results = Queue()
    threadList = []
    for i in range(0, 2):
        thread = PrintThread("Thread" + str(i), results)
        threadList.append(thread)
        thread.start()

    # one get() per started thread; get() blocks until a result arrives
    for i in threadList:
        linkReturned = results.get()
        for j in linkReturned:
            print j
Re: deploying big python applications
AndyL wrote: Hi, let me describe how I do that today. There is standard python taken from python.org installed in a c:\python23 with at least dozen different additional python packages (e.g. SOAPpy, Twisted, wx, many smaller ones etc) included. Also python23.dll moved from c:\windows to c:\python23. This is zipped and available as over 100MB file to anyone to manually unzip on his/her PC. This is a one time step. On top of that there is 30K lines of code with over 100 .py files application laid out within a directory tree. Very specific for the domain, typical application. This again is zipped and available to anyone as much smaller file to unzip and use. This step is per software releases. There is one obvious drawback - I can not separate python from standard libraries easily. True, python releases on windows are forward incompatible with C extensions, so don't even think about that. I'm not even talking about big pure python packages that could probably break because of small subtle changes in python API between releases. So when upgrade to 2.4 comes, I need to reinstall all the packages. Yes, but how much time it will *actually* take? I bet it's 1 hour. Seriously, why don't you *time* it with a stopwatch? And then compare that time to the time needed to debug the new release. In order to address that as well as the Linux port I project following structure: -default python.org installation or one time step on Windows -set of platform dependent libraries in directory A -set of platform independent libraries in directory B -application in directory C I would suggest the same structure I described for deploying over LAN: http://groups.google.com/group/comp.lang.python/msg/2482a93eb7115cb6?hl=en; The only problem is that exemaker cannot find python relative to itself, you will have to mash exemaker, python and application launcher in one directory. 
So the layout is like this: app/ engine/ -- directory with your actual application app.exe -- renamed exemaker.exe app.py -- dispatching module, see below python.exe python24.dll lib -- python stdlib, etc === app.py === from engine import real_application This way file engine/real_application.py is platform independant. On Linux/Unix shell script is an equivalent of exemaker. Or C program like exemaker, but you will have to compile it for all platforms. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python Version Testing Tool?
Michael Yanowitz wrote: Hello: Is there a version testing tool available for Python such that I can check to see if my code will still run in versions 2.2, 2.3, 2.4.3, and 1.1 (for example) (or whatever) without having to install all these different versions on my computer? Such tool will never be reliable, unless it duplicates all the work that went into all python versions. -- http://mail.python.org/mailman/listinfo/python-list
Re: NEWB: reverse traversal of xml file
manstey wrote:
> But will this work if I don't know parts in advance.

Yes, it will work as long as the highest part number in the whole file is not very high. The algorithm only needs to store N records in memory, where N is the highest part number in the whole file.

> I only know parts by reading through the file, which has 450,000 lines.

Lines or records? I created a sequence of 10,000,000 numbers, which is equal to your ten million records, like this:

    def many_numbers():
        for n in xrange(1000000):
            for part in xrange(10):
                yield part

    parts = many_numbers()

and the code processed it in 13 seconds, consuming virtually no memory. That is the advantage of iterators and generators: you can process long sequences without allocating a lot of memory.
Re: NEWB: reverse traversal of xml file
manstey wrote:
> Hi, I have an xml file of about 140Mb like this:
>
> <book>
>   <record> ... <wordpartWTS>1</wordpartWTS> </record>
>   <record> ... <wordpartWTS>2</wordpartWTS> </record>
>   <record> ... <wordpartWTS>1</wordpartWTS> </record>
> </book>
>
> I want to traverse it from bottom to top and add another field to each
> record, <totalWordPart>1</totalWordPart>, which would give the highest
> value of wordpartWTS for each record for each word. So if wordparts for
> the first ten records were
>     1 2 1 1 1 2 3 4 1 2
> I want totalWordPart to be
>     2 2 1 1 4 4 4 4 2 2
> I figure the easiest way to do this is to go thru the file backwards.
> Any ideas how to do this with an xml data file?

You need to iterate from the beginning and use itertools.groupby:

    from itertools import groupby

    def enumerate_words(parts):
        word_num = 0
        prev = 0
        for part in parts:
            # a part number that does not increase starts a new word
            if prev >= part:
                word_num += 1
            prev = part
            yield word_num, part

    def get_word_num(item):
        return item[0]

    parts = 1,2,1,1,1,2,3,4,1,2
    for word_num, word in groupby(enumerate_words(parts), get_word_num):
        parts_list = list(word)
        # the last part of a word carries the highest part number
        max_part = parts_list[-1][1]
        for word_num, part_num in parts_list:
            print max_part, part_num

prints:

    2 1
    2 2
    1 1
    1 1
    4 1
    4 2
    4 3
    4 4
    2 1
    2 2
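The same approach as a Python 3 sketch, wrapped in a generator so it can be fed from any iterable of part numbers (for example one produced while parsing the file incrementally). total_word_parts is a name chosen for this example; only one word's parts are held in memory at a time.

```python
from itertools import groupby

def enumerate_words(parts):
    """Tag each part number with the index of the word it belongs to."""
    word_num = 0
    prev = 0
    for part in parts:
        if prev >= part:  # part numbers restart, so a new word begins
            word_num += 1
        prev = part
        yield word_num, part

def total_word_parts(parts):
    """Yield (total parts of the word, part number) for every record."""
    for _, group in groupby(enumerate_words(parts), key=lambda item: item[0]):
        members = [part for _, part in group]
        total = members[-1]  # last part number of the word is the highest
        for part in members:
            yield total, part
```

Since it is a generator, it also works on a lazily produced sequence of millions of part numbers without building a list of all records.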
Re: No math module??
WIdgeteye wrote:
> I have been trying to run a python program and I get the following error:
> Traceback (most recent call last):
>   File "<string>", line 39, in ?

That doesn't look like a python program; File "<string>" means it's an embedded script. When a script is embedded, it is the responsibility of the caller (the blender application) to set up the correct path to modules.

>   File "/home/Larry/.blender/scripts/bzflag/__init__.py", line 22, in ?
>     import BZfileRead
>   File "/home/Larry/.blender/scripts/bzflag/BZfileRead.py", line 24, in ?
>     import BZsceneWriter
>   File "/home/Larry/.blender/scripts/bzflag/BZsceneWriter.py", line 25, in ?
>     import BZcommon
>   File "/home/Larry/.blender/scripts/bzflag/BZcommon.py", line 24, in ?
>     import math
> ImportError: No module named math

[snip]

> So what's up??:)

Try to insert

    import sys
    print sys.path, sys.version, sys.executable

right before the failing import math. The next step is most likely to RTFM on how to properly set up python embedded into blender. If everything looks as described in the manual, it's a bug in blender.
Re: import woe
[EMAIL PROTECTED] wrote: hello, i have a problem. i would like to import python files above and below my current directory. i'm working on /home/foo/bar/jar.py i would like to import /home/foo/car.py and /home/foo/bar/far.py how can i do this? $ cat ~/.bashrc export PATH=/home/foo/:$PATH $ cat /home/foo/application #!/usr/bin/env python import bar.jar $ chmod +x /home/foo/application $ cd /home/foo/bar $ application all imports work fine ... ps: i want to scale, so i do not want to edit the python path In what sense do you want to scale, working with multiple projects or multiple versions of one project at the same time? Anyway you are to quick to jump to conclusions, if you don't want to edit python path who will do it for you? Python path won't appear out of thin air if your file layout is not supported out of the box. -- http://mail.python.org/mailman/listinfo/python-list
Re: WTF? Printing unicode strings
Ron Garret wrote: In article [EMAIL PROTECTED], Serge Orlov [EMAIL PROTECTED] wrote: Ron Garret wrote: I'm using an OS X terminal to ssh to a Linux machine. In theory it should work out of the box. OS X terminal should set enviromental variable LANG=en_US.utf-8, then ssh should transfer this variable to Linux and python will know that your terminal is utf-8. Unfortunately AFAIK OS X terminal doesn't set that variable and most (all?) ssh clients don't transfer it between machines. As a workaround you can set that variable on linux yourself . This should work in the command line right away: LANG=en_US.utf-8 python -c print unichr(0xbd) Or put the following line in ~/.bashrc and logout/login export LANG=en_US.utf-8 No joy. [EMAIL PROTECTED]:~$ LANG=en_US.utf-8 python -c print unichr(0xbd) Traceback (most recent call last): File string, line 1, in ? UnicodeEncodeError: 'ascii' codec can't encode character u'\xbd' in position 0: ordinal not in range(128) [EMAIL PROTECTED]:~$ What version of python and what shell do you run? What the following commands print: python -V echo $SHELL $SHELL --version [EMAIL PROTECTED]:~$ python -V Python 2.3.4 [EMAIL PROTECTED]:~$ echo $SHELL /bin/bash [EMAIL PROTECTED]:~$ $SHELL --version GNU bash, version 2.05b.0(1)-release (i386-pc-linux-gnu) Copyright (C) 2002 Free Software Foundation, Inc. [EMAIL PROTECTED]:~$ That's recent enough. I guess the distribution you're using set LC_* variables for no good reason. Either unset all enviromental variables starting with LC_ and set LANG variable or overide LC_CTYPE variable: LC_CTYPE=en_US.utf-8 python -c print unichr(0xbd) Should be working now :) -- http://mail.python.org/mailman/listinfo/python-list
Re: WTF? Printing unicode strings
Serge Orlov wrote: Ron Garret wrote: In article [EMAIL PROTECTED], Serge Orlov [EMAIL PROTECTED] wrote: Ron Garret wrote: I'm using an OS X terminal to ssh to a Linux machine. In theory it should work out of the box. OS X terminal should set enviromental variable LANG=en_US.utf-8, then ssh should transfer this variable to Linux and python will know that your terminal is utf-8. Unfortunately AFAIK OS X terminal doesn't set that variable and most (all?) ssh clients don't transfer it between machines. As a workaround you can set that variable on linux yourself . This should work in the command line right away: LANG=en_US.utf-8 python -c print unichr(0xbd) Or put the following line in ~/.bashrc and logout/login export LANG=en_US.utf-8 No joy. [EMAIL PROTECTED]:~$ LANG=en_US.utf-8 python -c print unichr(0xbd) Traceback (most recent call last): File string, line 1, in ? UnicodeEncodeError: 'ascii' codec can't encode character u'\xbd' in position 0: ordinal not in range(128) [EMAIL PROTECTED]:~$ What version of python and what shell do you run? What the following commands print: python -V echo $SHELL $SHELL --version [EMAIL PROTECTED]:~$ python -V Python 2.3.4 [EMAIL PROTECTED]:~$ echo $SHELL /bin/bash [EMAIL PROTECTED]:~$ $SHELL --version GNU bash, version 2.05b.0(1)-release (i386-pc-linux-gnu) Copyright (C) 2002 Free Software Foundation, Inc. [EMAIL PROTECTED]:~$ That's recent enough. I guess the distribution you're using set LC_* variables for no good reason. Either unset all enviromental variables starting with LC_ and set LANG variable or overide LC_CTYPE variable: LC_CTYPE=en_US.utf-8 python -c print unichr(0xbd) Should be working now :) I've pulled myself together and installed linux in vwware player. Apparently there is another way linux distributors can screw up. 
I chose the debian 3.1 minimal network install, and after answering all the installation questions I found that only the ascii and latin-1 English locales were installed:

    $ locale -a
    C
    en_US
    en_US.iso88591
    POSIX

In 2006, I would expect a utf-8 English locale to be present even in a minimal install. I had to edit /etc/locale.gen and run locale-gen as root. After that python started to print unicode characters.
Re: Encode exception for chinese text
Vinayakc wrote: Hi all, I am new to python. I have written one small application which reads data from xml file and tries to encode data using apprpriate charset. I am facing problem while encoding one chinese paragraph with charset gb2312. code is: encoded_str = str_data.encode(gb2312) The type of str_data is type 'unicode' The exception is: UnicodeEncodeError: 'gb2312' codec can't encode character u'\xa0' in position 0: illegal multibyte sequence Hmm, this is 'no-break space' in the very beginning of the text. It look suspiciously like a plain text utf-8 signature which is 'zero width no-break space'. If you strip the first character do you still have encoding errors? -- http://mail.python.org/mailman/listinfo/python-list
Re: Encode exception for chinese text
Vinayakc wrote:
> Yes serge, I have removed the first character but it is still giving
> encoding exception.

Then I guess this character was used as a poor man's indentation tool, at least at the beginning of your text. It's up to you to decide what to do with that character; you have several choices:

* edit the source xml file to get rid of it
* remove it while you process your data
* replace it with an ordinary space
* consider utf-8

Note that there are legitimate use cases for no-break space. For example, one million can be written as 1 000 000, where the spaces are non-breakable. This prevents the number from being broken at the right margin like this:

    1 000
    000

Keep that in mind when you remove or replace no-break space.
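The "replace it with an ordinary space" choice could look like the sketch below, written for Python 3 where every str is unicode. encode_gb2312 is a made-up helper name, and whether a plain space is an acceptable substitute depends on the text, as noted above.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE, which the gb2312 codec cannot encode

def encode_gb2312(text, replacement=" "):
    """Encode text as gb2312 after substituting no-break spaces."""
    return text.replace(NBSP, replacement).encode("gb2312")
```

Passing replacement="" gives the "remove it" variant instead.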
Re: the tostring and XML methods in ElementTree
George Sakkis wrote: I'm currently using (a variation of) the workaround below instead of ET.tostring and it works fine for me: def tostring(element, encoding=None): text = element.text if text: if not isinstance(text, basestring): text2 = str(text) elif isinstance(text, str) and encoding: text2 = text.decode(encoding) element.text = text2 s = ET.tostring(element, encoding) element.text = text return s Why isn't this the standard behaviour ? Because it wouldn't work. What if you wanted to serialize a different encoding than that of the strings you put into the .text fields? How is ET supposed to know what encoding your strings have? And how should it know that you didn't happily mix various different byte encodings in your strings? If you're mixing different encodings, no tool can help you clean up the mess, you're on your own. This is very different though from having nice utf-8 strings everywhere, asking ET.tostring explicitly to print them in utf-8 and getting back garbage. Isn't the most reasonable assumption that the input's encoding is the same with the output, or does this fall under the refuse the temptation to guess motto ? If this is the case, ET could at least accept an optional input encoding parameter and convert everything to unicode internally. This is an optimization. Basically you're delaying decoding. First of all have you measured the impact on your program if you delay decoding? I'm sure for many programs it doesn't matter, so what you're proposing will just pollute their source code with optimization they don't need. That doesn't mean it's a bad idea in general. I'd prefer it implemented in python core with minimal impact on such programs, decoding delayed until you try to access individual characters. 
The code below can be implemented without actual decoding:

    utf8_text_file.write("abc".decode("utf-8") + "def".decode("utf-8"))

But this example will require decoding to be done during the split method:

    a = ("abc".decode("utf-8") + "def".decode("utf-8")).split()

>> Use unicode, that works *and* is portable.
>
> *and* it's not supported by all the 3rd party packages, databases,
> middleware, etc. you have to or want to use.

You can always call the .encode method. Granted, that could be a waste of CPU and memory, but it works.
Re: WTF? Printing unicode strings
Ron Garret wrote: In article [EMAIL PROTECTED], Robert Kern [EMAIL PROTECTED] wrote: Ron Garret wrote: I forgot to mention: sys.getdefaultencoding() 'utf-8' A) You shouldn't be able to do that. What can I say? I can. B) Don't do that. OK. What should I do instead? Exact answer depends on what OS and terminal you are using and what your program is supposed to do, are you going to distribute the program or it's just for internal use. -- http://mail.python.org/mailman/listinfo/python-list
Re: WTF? Printing unicode strings
Ron Garret wrote: In article [EMAIL PROTECTED], Serge Orlov [EMAIL PROTECTED] wrote: Ron Garret wrote: In article [EMAIL PROTECTED], Robert Kern [EMAIL PROTECTED] wrote: Ron Garret wrote: I forgot to mention: sys.getdefaultencoding() 'utf-8' A) You shouldn't be able to do that. What can I say? I can. B) Don't do that. OK. What should I do instead? Exact answer depends on what OS and terminal you are using and what your program is supposed to do, are you going to distribute the program or it's just for internal use. I'm using an OS X terminal to ssh to a Linux machine. In theory it should work out of the box. OS X terminal should set enviromental variable LANG=en_US.utf-8, then ssh should transfer this variable to Linux and python will know that your terminal is utf-8. Unfortunately AFAIK OS X terminal doesn't set that variable and most (all?) ssh clients don't transfer it between machines. As a workaround you can set that variable on linux yourself . This should work in the command line right away: LANG=en_US.utf-8 python -c print unichr(0xbd) Or put the following line in ~/.bashrc and logout/login export LANG=en_US.utf-8 -- http://mail.python.org/mailman/listinfo/python-list
Re: WTF? Printing unicode strings
Ron Garret wrote: I'm using an OS X terminal to ssh to a Linux machine. In theory it should work out of the box. OS X terminal should set enviromental variable LANG=en_US.utf-8, then ssh should transfer this variable to Linux and python will know that your terminal is utf-8. Unfortunately AFAIK OS X terminal doesn't set that variable and most (all?) ssh clients don't transfer it between machines. As a workaround you can set that variable on linux yourself . This should work in the command line right away: LANG=en_US.utf-8 python -c print unichr(0xbd) Or put the following line in ~/.bashrc and logout/login export LANG=en_US.utf-8 No joy. [EMAIL PROTECTED]:~$ LANG=en_US.utf-8 python -c print unichr(0xbd) Traceback (most recent call last): File string, line 1, in ? UnicodeEncodeError: 'ascii' codec can't encode character u'\xbd' in position 0: ordinal not in range(128) [EMAIL PROTECTED]:~$ What version of python and what shell do you run? What the following commands print: python -V echo $SHELL $SHELL --version -- http://mail.python.org/mailman/listinfo/python-list
Re: newb: comapring two strings
manstey wrote:
> Hi, Is there a clever way to see if two strings of the same length vary
> by only one character, and what the character is in both strings. E.g.
> str1="yaqtil" str2="yaqtel": they differ at str1[4] and the difference is
> ('i','e'). But if there was str1="yiqtol" and str2="yaqtel", I am not
> interested. Can anyone suggest a simple way to do this?
>
> My next problem is, I have a list of 300,000+ words and I want to find
> every pair of such strings. I thought I would first sort on length of
> string, but how do I iterate through the following:
>     str1 str2 str3 str4 str5
> so that I compare str1 str2, str1 str3, str1 str4, str1 str5, str2
> str3, str3 str4, str3 str5, str4 str5.

If your strings are pretty short you can do it like this, even without sorting by length first:

    def fuzzy_keys(s):
        # every variant of s with one position blanked out
        for pos in range(len(s)):
            yield s[0:pos]+chr(0)+s[pos+1:]

    def fuzzy_insert(d, s):
        for fuzzy_key in fuzzy_keys(s):
            if fuzzy_key in d:
                strings = d[fuzzy_key]
                if type(strings) is list:
                    # append, not +=, which would add s character by character
                    strings.append(s)
                else:
                    d[fuzzy_key] = [strings, s]
            else:
                d[fuzzy_key] = s

    def gather_fuzzy_matches(d):
        for strings in d.itervalues():
            if type(strings) is list:
                yield strings

    acc = {}
    fuzzy_insert(acc, "yaqtel")
    fuzzy_insert(acc, "yaqtil")
    fuzzy_insert(acc, "oaqtil")
    print list(gather_fuzzy_matches(acc))

prints

    [['yaqtil', 'oaqtil'], ['yaqtel', 'yaqtil']]
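A Python 3 sketch of both halves of the question: a direct comparison that reports the position and the differing characters, and the same blank-one-position bucketing idea for finding all pairs in a word list. The names one_char_diff and near_pairs are invented here, and the bucket key (position, prefix, suffix) is a small variation on the chr(0) placeholder used above.

```python
from collections import defaultdict

def one_char_diff(a, b):
    """Return (index, (char_a, char_b)) if a and b have equal length and
    differ at exactly one position, otherwise None."""
    if len(a) != len(b):
        return None
    diffs = [(i, (x, y)) for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return diffs[0] if len(diffs) == 1 else None

def near_pairs(words):
    """Return the set of word pairs that differ in exactly one position."""
    buckets = defaultdict(list)
    for word in words:
        for pos in range(len(word)):
            # words that agree everywhere except at pos share this key
            buckets[(pos, word[:pos], word[pos + 1:])].append(word)
    pairs = set()
    for group in buckets.values():
        for i, first in enumerate(group):
            for second in group[i + 1:]:
                if first != second:  # skip repeated identical words
                    pairs.add((first, second))
    return pairs
```

The bucketing avoids comparing every word with every other word; the cost grows with the total number of characters rather than with the square of the number of words, as long as individual buckets stay small.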
Re: arrays, even, roundup, odd round down ?
Lance Hoffmeyer wrote: So, I have been using the following to grab numbers from MS Word. I discovered that there is a special rule being used for rounding. If a ??.5 is even the number is to be rounded down (20.5 = 20), if a ??.5 is odd the number is to be rounded up (21.5 = 22).

    Brands = ["B1", "B2"]
    A1 = []
    A1 = [ re.search(r'(?m)(?s)\r%s.*?SECOND.*?(?:(\d{1,3}\.\d)\s+){2}' % i, target_table).group(1) for i in Brands ]
    A1 = [int(float(str(x))+0.5) for x in A1 ]
    print A1

Any solutions for this line with the above conditions?

Seems like a job for Decimal, whose default rounding mode rounds halves to the even neighbour:

    from decimal import Decimal
    numbers = "20.50 21.5".split()
    ZERO_PLACES = Decimal(1)
    print [int(Decimal(num).quantize(ZERO_PLACES)) for num in numbers]

produces

    [20, 22]

-- http://mail.python.org/mailman/listinfo/python-list
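The Decimal answer above relies on the default context rounding mode. A minimal sketch for Python 3 that spells the mode out, so the result does not depend on whatever context is active; the helper name `round_half_even` is an assumption for illustration only.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(text):
    """Round a decimal string to an int, sending exact .5 ties to the even neighbour."""
    return int(Decimal(text).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))


print([round_half_even(num) for num in "20.50 21.5 22.5 20.51".split()])
```

Feeding the strings captured from the document straight into Decimal avoids the float conversion in the original one-liner, where a value such as 20.5 is first turned into binary floating point.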
Re: Strange IO Error when extracting zips to a network location
Hari Sekhon wrote: Hi, I've written a script to run on windows to extract all zips under a given directory path to another directory path as such: python extractzips.py fetch all zips under this dir put all extracted files under this dir The purpose of this script is to retrieve backup files which are individually zipped under a backup directory tree on a backup server. This scripts works nicely and has input validation etc, exiting gracefully and telling you if you gave a non existent start or target path... When running the script as follows python extractzips.py \\backupserver\backupshare\machine\folder d:\unziphere the script works perfectly, but if I do python extractzips.py \\backupserver\backupshare\machine\folder \\anetworkmachine\share\folder then it unzips a lot of files, recreating the directory tree as it goes but eventually fails with the traceback: File extractzips.py, line 41, in zipextract outfile.write(zip.read(x)) IOError: [Errno 22] Invalid argument But I'm sure the code is correct and the argument is passed properly, otherwise a hundred files before it wouldn't have extracted successfully using this exact same piece of code (it loops over it). It always fails on this same file every time. When I extract the same tree to my local drive it works fine without error. I have no idea why pushing to a network share causes an IO Error, shouldn't it be the same as extracting locally from our perspective? It looks like http://support.microsoft.com/default.aspx?scid=kb;en-us;899149 is the answer. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python script windows servcie
Mivabe wrote: Mivabe formulated the question: Google helped me discover that it has something to do with 'CTRL_LOGOFF_EVENT'. I know what it means but I don't know how to solve it. Is that something I have to configure in the script? I'm totally new to Python so maybe someone can point me in the right direction? :D Regards, Mivabe No-one who can help me, or did I visit the wrong group for this 'problem'? Indeed. Next time you'd better ask in a windows specific list: http://mail.python.org/mailman/listinfo/python-win32 You need to ignore CTRL_LOGOFF_EVENT. Take a look for example at http://mail.zope.org/pipermail/zope-checkins/2005-March/029068.html -- http://mail.python.org/mailman/listinfo/python-list
Re: distributing a app frozen by cx_freeze
Flavio wrote: Well I managed to get rid of the undefined symbol message by copying all qt libs to the freeze directory, the problem is that now the package is huge (83MB)! So my question is: is there a way to find out exactly which lib is missing ? I haven't done that myself, but I've had an idea for discovering dependencies for dynamic languages: run your test suite and record which files are loaded (byte code, dlls, data files), then remove from the list all files you know were used only for testing. That's it, now you know all the files that you need to run your application. On linux you can find out which .so files are loaded by looking at the file /proc/self/maps at the end of running your test suite. To find out which python bytecode files were loaded you can use the -v option of python; it prints all files that were loaded to stderr. To separate its output from other stderr stuff, you can redirect sys.stderr to some other file. After you've done all that work, I'm not sure you need cx_freeze. You just need to write a little startup script that will set LD_LIBRARY_PATH and PYTHONPATH and start your main script. -- http://mail.python.org/mailman/listinfo/python-list
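As a complement to the -v suggestion above, the loaded Python modules can also be listed from inside the process. This is a different route from the one in the message, shown as a minimal Python 3 sketch: `sys.modules` and the `__file__` attribute are standard, while the function name `loaded_module_files` is made up here. Built-in modules have no file and are skipped.

```python
import sys


def loaded_module_files():
    """Return the sorted file paths of every module imported so far."""
    files = set()
    # Copy the values first: importing can mutate sys.modules while we iterate.
    for module in list(sys.modules.values()):
        path = getattr(module, "__file__", None)
        if path:
            files.add(path)
    return sorted(files)


if __name__ == "__main__":
    for path in loaded_module_files():
        print(path)
```

Calling this at the very end of a test run gives the Python side of the dependency list; shared libraries loaded by extension modules still have to be found separately, for example through /proc/self/maps as described above.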
Re: cx_freeze and matplotlib
Flavio wrote: I am trying to freeze an application which imports matplotlib. It all works fine on the machine where it was frozen. The executable runs without a glitch. But when I move the directory containing the frozen executable and other libs to a new machine, I get the following error: Traceback (most recent call last): File /home/fccoelho/Downloads/cx_Freeze-3.0.2/initscripts/Console.py, line 26, in ? File epigrass.py, line 5, in ? File Epigrass/manager.py, line 7, in ? File Epigrass/simobj.py, line 4, in ? File /usr/lib/python2.4/site-packages/matplotlib/__init__.py, line 457, in ? try: return float(s) File /usr/lib/python2.4/site-packages/matplotlib/__init__.py, line 245, in wrapper if level not in self.levels: File /usr/lib/python2.4/site-packages/matplotlib/__init__.py, line 319, in _get_data_path Return the string representing the configuration dir. If s is the RuntimeError: Could not find the matplotlib data files Matplotlib can't find its data files. I'm not familiar with cx_freeze, but have you told cx_freeze that you don't want to bundle matplotlib or cx_freeze has decided that matplotlib is not going to be bundled? That fact that matplotlib is loaded from site-package is pretty strange, standalone application are not supposed to depend on non-system packages. -- http://mail.python.org/mailman/listinfo/python-list
Re: Install libraries only without the program itself
Gregor Horvath wrote: Hi, My application is a client/server in a LAN. I want to keep my program's .py files on a central file server serving all clients. The clients should load those over the LAN every time they start the program, since I expect that they are rapidly changing and I don't want to update each client separately.

Don't forget you can screw up running clients if you overwrite the old version with a new one.

On the clients there should only be python and the necessary libraries and third party modules (sqlobject etc.) installed.

I believe it's better to keep *everything* on the file server. Suppose your OS is windows and suppose you want to keep everything in s:/tools. The actions are:

1. Copy python with all 3rd party modules from c:/python24 to s:/tools/python24-win32
2. Grab exemaker from http://effbot.org/zone/exemaker.htm, copy exemaker.exe to s:/tools/win32/client.exe
3. Create a little dispatcher s:/tools/win32/client.py:

    #!s:/tools/python24-win32/python.exe
    import sys
    sys.path[0] = "s:/tools/client-1.0.0"
    import client

4. Create your first version of s:/tools/client-1.0.0/client.py:

    print "I'm a client version 1.0.0"

That's it. Now s:/tools/win32/client.exe is ready to go. I guess it's obvious how to release version 1.0.1. If you need to support other architectures or operating systems you just need to create tiny dispatchers in directories s:/tools/linux, s:/tools/macosx ... -- http://mail.python.org/mailman/listinfo/python-list
Re: Python memory deallocate
Heiko Wundram wrote: Am Donnerstag 11 Mai 2006 15:15 schrieb [EMAIL PROTECTED]: I MUST find a system which deallocate memory... Otherwise, my application crashes not hardly it's arrived to break-point system As was said before: as long as you keep a reference to an object, the object's storage _will not be_ reused by Python for any other objects (which is sensible, or would you like your object to be overwritten by other objects before you're done with them?). Besides, even if Python did free the memory that was used, the operating system wouldn't pick it up (in the general case) anyway (because of fragmentation issues), so Python keeping the memory in an internal free-list for new objects is a sensible choice the Python developers took here. BTW python 2.5 now returns free memory to OS, but if a program keeps allocating more memory with each new iteration in python 2.4, it will not help. -- http://mail.python.org/mailman/listinfo/python-list
Re: Memory leak in Python
[EMAIL PROTECTED] wrote: I ran simulation for 128 nodes and used the following oo = gc.get_objects() print len(oo) on every time step the number of objects are increasing. For 128 nodes I had 1058177 objects. I think I need to revisit the code and remove the referencesbut how to do that. I am still a newbie coder and every help will be greatly appreciated. The next step is to find out what type of objects contributes to the growth most of all, after that print several object of that type that didn't exist on iteration N-1 but exist on iteration N -- http://mail.python.org/mailman/listinfo/python-list
Re: Use subprocesses in simple way...
DurumDara wrote: 10 May 2006 04:57:17 -0700, Serge Orlov [EMAIL PROTECTED]: I thought md5 algorithm is pretty light, so you'll be I/O-bound, then why bother with multi-processor algorithm? This is an assessor utility. The program's architecture must be flexible, because I don't know where it will need to run (the only thing I can do to fix this is write it in the user's guide). But I want to speed up my algorithm with native code and multiprocess code. I have not tested it yet, but I think that 4 subprocesses are quicker than one large process. I believe you need to look at the Queue module. Using Queue will help you avoid the threading hell that you're afraid of (and rightly so!). Create two queues: one for jobs, another one for results. The main thread submits jobs and picks up results from the results queue. As soon as the number of results == the number of jobs, it's time to quit. Submit N special jobs that indicate it's time to exit, where N is the number of worker threads. Then join the main thread with the worker threads and exit the application. -- http://mail.python.org/mailman/listinfo/python-list
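The two-queue design described above can be sketched as follows. This is a minimal Python 3 sketch (the module is called `queue` there, `Queue` in the Python 2 of the thread). The names `hash_all`, `worker` and `STOP` are invented for illustration, and the jobs are in-memory byte strings rather than files so that the example stays self-contained.

```python
import hashlib
import queue
import threading

STOP = None  # the "special job" that tells a worker thread to exit


def worker(jobs, results):
    """Take (name, data) jobs until STOP arrives; put (name, md5 hex) results."""
    while True:
        job = jobs.get()
        if job is STOP:
            break
        name, data = job
        results.put((name, hashlib.md5(data).hexdigest()))


def hash_all(items, num_workers=4):
    """Hash a list of (name, bytes) pairs using num_workers threads."""
    jobs, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for item in items:
        jobs.put(item)
    # Number of results == number of jobs means all the work is done.
    collected = dict(results.get() for _ in range(len(items)))
    for _ in threads:
        jobs.put(STOP)  # one exit marker per worker
    for thread in threads:
        thread.join()
    return collected


print(hash_all([("greeting", b"hello"), ("empty", b"")]))
```

The main thread never touches shared state directly; everything crosses a queue, which is what keeps the locking out of the application code.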
Re: How to encode html and xml tag datas with standard python modules ?
DurumDara wrote: Hi ! I probed this function, but that is not encode the hungarian specific characters, like áéíóüóöőúüű: so the chars above chr(127). Have the python a function that can encode these chars too, like in Zope ? The word encode is ambiguous. What do you mean? The example Fredrik gave to you does encode:

    >>> import cgi
    >>> cgi.escape(u"áéíóüóöőúüű").encode("ascii", "xmlcharrefreplace")
    '&#225;&#233;&#237;&#243;&#252;&#243;&#246;&#337;&#250;&#252;&#369;'

-- http://mail.python.org/mailman/listinfo/python-list
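For readers on Python 3, where `cgi.escape` no longer exists, the same two-step idea looks roughly like this: escape the markup characters, then let the `xmlcharrefreplace` error handler turn everything outside ASCII into numeric character references. `html.escape` is the standard library call; the wrapper name `escape_to_ascii` is an assumption made for this sketch.

```python
import html


def escape_to_ascii(text):
    """Escape &, < and >, then replace non-ASCII characters with &#NNN; references."""
    return html.escape(text, quote=False).encode("ascii", "xmlcharrefreplace")


print(escape_to_ascii("áéíóüóöőúüű & <b>"))
```

The result is a bytes object that is safe to embed in any ASCII-compatible page regardless of the declared charset.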
Re: FTP filename escaping
Almad wrote: OK, after some investigation...problem is in non-latin characters in filenames on ftp. Yes, users should be killed for this, It's futile, users will always find a way to crash your program :) And you can't kill them all, there are too many of them. but I would like to handle it somehow... It depends on what you're actually doing. Did you write the ftp server? Or do you have any information about the server (OS etc...)? Is your client the only client who can upload? Do you care how file names actually look internally in the server? I can't figure out how it's handled by protocol, ftplib seems to just strip those characters... I believe filename == sequence of bytes terminated by newline byte. I doubt ftplib strips bytes over 127. Even if it does, copy it to your private module collection as ftplibng.py, fix it and import ftplibng as ftplib -- http://mail.python.org/mailman/listinfo/python-list
Re: can distutils windows installer invoke another distutils windows installer
timw.google wrote: Hi all. I have a package that uses other packages. I created a setup.py to use 'try:' and import to check if some required packages are installed. I have the tarballs and corresponding windows installers in my sdist distribution, so if I untar my source distribution and do 'python setup.py install', the script either untars the subpackages to a tmp directory and does an os.system('python setup.py install') (Linux), or os.system(bdist_wininst installer) (win32) for the missing subpackage. I believe there are two ways to handle dependances: either you bundle your dependances with your package (they just live in a directory inside your package, you don't install them) or you leave resolution of dependances to the application that uses your package. Handling dependances like you do it (package installs other packages) doesn't seem like a good idea to me. -- http://mail.python.org/mailman/listinfo/python-list
Re: ascii to latin1
Luis P. Mendes wrote: Errors occur when I assign the result of ''.join(cp for cp in de_str if not unicodedata.category(cp).startswith('M')) to a variable. The same happens with de_str. When I print the strings everything is ok. Here's a short example of data: 115448,DAÇÃO 117788,DA 1º DE MO Nº 2 I used the following script to convert the data: # -*- coding: iso8859-15 -*- class Latin1ToAscii: def abreFicheiro(self): import csv self.reader = csv.reader(open(self.input_file, rb)) def converter(self): import unicodedata self.lista_csv = [] for row in self.reader: s = unicode(row[1],latin-1) de_str = unicodedata.normalize(NFD, s) nome = ''.join(cp for cp in de_str if not \ unicodedata.category(cp).startswith('M')) linha_ascii = row[0] + , + nome # * print linha_ascii.encode(ascii) self.lista_csv.append(linha_ascii) def __init__(self): self.input_file = 'nome_latin1.csv' self.output_file = 'nome_ascii.csv' if __name__ == __main__: f = Latin1ToAscii() f.abreFicheiro() f.converter() And I got the following result: $ python latin1_to_ascii.py 115448,DACAO Traceback (most recent call last): File latin1_to_ascii.py, line 44, in ? f.converter() File latin1_to_ascii.py, line 22, in converter print linha_ascii.encode(ascii) UnicodeEncodeError: 'ascii' codec can't encode character u'\xba' in position 11: ordinal not in range(128) The script converted the ÇÃ from the first line, but not the º from the second one. Still in *, I also don't get a list as [115448,DAÇÃO] but a [u'115448,DAÇÃO'] element, which doesn't suit my needs. Would you mind telling me what should I change? Calling this process latin1 to ascii was a misnomer, sorry that I used this phrase. 
It should be called latin1 to search key; there is no requirement that the key must be ascii, so change the corresponding lines in your code:

    linha_key = row[0] + "," + nome
    print linha_key
    self.lista_csv.append(linha_key.encode("latin-1"))

With regards to º, Richie already gave you food for thought. If you want 1 DE MO to match 1º DE MO, remove that symbol from the key (linha_key = linha_key.translate({ord(u"º"): None})); the translate table of a unicode string is keyed by code point numbers, not by characters. If you don't want such fuzzy matching, keep it. -- http://mail.python.org/mailman/listinfo/python-list
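The translate call mentioned above is easy to get wrong because the table must be keyed by integer code points. A minimal sketch in Python 3, where every `str` is unicode; the helper name `drop_chars` and the choice of characters to drop are assumptions for illustration.

```python
def drop_chars(text, unwanted):
    """Remove every character of `unwanted` from `text` using str.translate."""
    table = {ord(ch): None for ch in unwanted}  # keys are code points, None means delete
    return text.translate(table)


print(drop_chars("DA 1º DE MO Nº 2", "ºª"))
```

A value other than None in the table maps the character to a replacement string instead of deleting it, which covers the "map the problem characters to whatever you want" case as well.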
Re: Memory leak in Python
[EMAIL PROTECTED] wrote: I am using Ubuntu Linux. My program is a simulation program with four classes and it mimics bit torrent file sharing systems on 2000 nodes. Now, each node has lot of attributes and my program kinds of tries to keep tab of everything. As I mentioned its a simulation program, it starts at time T=0 and goes on untill all nodes have recieved all parts of the file(BitTorrent concept). The ending time goes to thousands of seconds. In each sec I process all the 2000 nodes. Most likely you keep references to objects you don't need, so python garbage collector cannot remove those objects. If you cannot figure it out looking at the source code, you can gather some statistics to help you, for example use module gc to iterate over all objects in your program (gc.get_objects()) and find out objects of which type are growing with each iteration. -- http://mail.python.org/mailman/listinfo/python-list
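The gc-based statistics suggested above can be sketched like this. It is a minimal Python 3 sketch: `gc.get_objects()` and `collections.Counter` are standard library, while the helper names `type_counts` and `growth` and the `Node` class are invented for the demonstration. Only objects tracked by the garbage collector (containers and class instances) are seen; plain ints and strings are not.

```python
import gc
from collections import Counter


def type_counts():
    """Count the live gc-tracked objects, grouped by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, top=5):
    """Return (type name, increase) pairs for the types that grew the most."""
    delta = after - before  # Counter subtraction keeps only positive differences
    return delta.most_common(top)


class Node:
    pass


leak = []                      # stands in for a forgotten reference
before = type_counts()
leak.extend(Node() for _ in range(1000))
after = type_counts()
print(growth(before, after))
```

Taking one snapshot per simulation step and comparing consecutive snapshots shows which class is piling up; the next step is to look at who still refers to those instances.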
Re: Use subprocesses in simple way...
Dara Durum wrote: [snip design of a multi-processor algorithm] I thought md5 algorithm is pretty light, so you'll be I/O-bound, then why bother with multi-processor algorithm? 2.) Do you know command line to just like FSUM that can compute file hashes (MD5/SHA1), and don't have any problems with unicode alt. file names ? I believe you can wrap the broken program with a simple python wrapper. Use win32api.GetShortPathName to convert non-ascii file names to DOS filenames. -- http://mail.python.org/mailman/listinfo/python-list
Re: data entry tool
Peter wrote: Diez B. Roggisch wrote: Make it a webapp. That will guarantee to make it runnable on the list of OSses you gave. Use Django/TurboGears/ZOPE for the application itself- whichever suits you best. A webapp isn't feasible as most of the users are on dial up (this is in New Zealand and broadband isn't available for lots of people). I don't see connection here, why it's not feasible? I was hoping for a simple tool. Even if it only worked on Windows, it would be a start. It just needs to present a form of text entry fields to the user, and place the data in a plain text file. You can do it using for example Tkinter http://wiki.python.org/moin/TkInter that comes with python distribution for windows. If python can't do this, can anyone suggest another language or approach? Sure you can code that application in Python, the problem is distribution and support of application running on multiple platforms. That's what webapp will help you to avoid. Keep in mind that standalone application for windows will be about 2Mb. Keep also in mind linux is not a platform, it is hmm, how to say it? a snapshot of random programs found on internet, so it's very hard to distribute and support programs for it. -- http://mail.python.org/mailman/listinfo/python-list
Re: data entry tool
[EMAIL PROTECTED] wrote: If the data to be entered is simple and textual you can even think about using a text only interface. The resulting program will be really simple, and probably good enough. FWIW here is size of Hello, world! program distribution using different interfaces: text console: 1.2Mb web.py w/o ssl: 1.5Mb tkinter: 2.1Mb wxpython: 3.0Mb Getting more slim distributions requires some manual work -- http://mail.python.org/mailman/listinfo/python-list
Re: Using time.sleep() in 2 threads causes lockup whenhyper-threading is enabled
Dennis Lee Bieber wrote: On 8 May 2006 15:44:04 -0700, Serge Orlov [EMAIL PROTECTED] declaimed the following in comp.lang.python: The test program in question doesn't require a dedicated machine and it doesn't consume a lot of CPU resources since it sleeps most of the time. Yet... Do we have any evidence that other activity on the machine may or may not affect the situation? There is a big difference between leaving a machine idle for a few hours to see if it hangs, vs doing normal activities with a process in the background (and what about screen savers? network activity?) But what if other processes will actually help to trigger the bug? IMHO both situations (idle or busy) are equal if you don't know what's going on. -- http://mail.python.org/mailman/listinfo/python-list
Re: Embedding Python
gavinpaterson wrote: Dear Pythoners, I am writing as I am having trouble embedding a Python program into a Win XP C++ console application. I have written a script file for importing and I am trying to use the example for Pure Embedding found in the product documentation. The program fails to successfully execute past the line: if (pFunc && PyCallable_Check(pFunc)) I have checked the pFunc pointer at runtime and it is not null so I assume that the PyCallable_Check fails. Looking at the source in Pure Embedding I see that it is supposed to print an error message if PyCallable_Check fails, have you got the message? -- http://mail.python.org/mailman/listinfo/python-list
Re: ascii to latin1
Richie Hindle wrote: [Serge]

    def search_key(s):
        de_str = unicodedata.normalize("NFD", s)
        return ''.join(cp for cp in de_str if not unicodedata.category(cp).startswith('M'))

Lovely bit of code - thanks for posting it! Well, it is not so good. Please read my next message to Luis. You might want to use NFKD to normalize things like LATIN SMALL LIGATURE FI and subscript/superscript characters as well as diacritics. IMHO it is perfectly acceptable to declare that you don't interpret those symbols. After all, they are called *compatibility* code points. I tried a quarter symbol: Google and MSN don't interpret it. Yahoo doesn't support it at all. The NFKD form is also more tricky to use. It loses the semantics of characters: for example, if you have the character digit two followed by superscript digit two, they look like 2 to the power of 2, but NFKD will convert them into 22 (twenty two), which is wrong. So if you want to use NFKD for search, you will have to preprocess your data, for example by inserting a space between the twos. -- http://mail.python.org/mailman/listinfo/python-list
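The difference between the two normalization forms discussed above is easy to check directly. A minimal Python 3 sketch using only `unicodedata.normalize`; the sample strings are chosen here for illustration.

```python
import unicodedata

two_squared = "2\u00b2"   # DIGIT TWO followed by SUPERSCRIPT TWO
ligature = "\ufb01"       # LATIN SMALL LIGATURE FI

for form in ("NFD", "NFKD"):
    print(form,
          ascii(unicodedata.normalize(form, two_squared)),
          ascii(unicodedata.normalize(form, ligature)))
```

NFD leaves both strings alone because they only have compatibility decompositions, while NFKD folds the superscript into an ordinary digit and the ligature into the two letters f and i. That is exactly the "2 squared becomes twenty two" effect described in the message.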
Re: ascii to latin1
Luis P. Mendes wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Richie Hindle escreveu: [Serge] def search_key(s): de_str = unicodedata.normalize(NFD, s) return ''.join(cp for cp in de_str if not unicodedata.category(cp).startswith('M')) Lovely bit of code - thanks for posting it! You might want to use NFKD to normalize things like LATIN SMALL LIGATURE FI and subscript/superscript characters as well as diacritics. Thank you very much for your info. It's a very good aproach. When I used the NFD option, I came across many errors on these and possibly other codes: \xba, \xc9, \xcd. What errors? normalize method is not supposed to give any errors. You mean it doesn't work as expected? Well, I have to admit that using normalize is a far from perfect way to implement search. The most advanced algorithm is published by Unicode guys: http://www.unicode.org/reports/tr10/ If you read it you'll understand it's not so easy. I tried to use NFKD instead, and the number of errors was only about half a dozen, for a universe of 60+ names, on code \xbf. It looks like I have to do a search and substitute using regular expressions for these cases. Or is there a better way to do it? Perhaps you can use unicode translate method to map the characters that still give you problems to whatever you want. -- http://mail.python.org/mailman/listinfo/python-list
Re: hyperthreading locks up sleeping threads
[EMAIL PROTECTED] wrote: Tried importing win32api instead of time and using the win32api.GetTickCount() and win32api.Sleep() methods. What about win32api.SleepEx? What about WaitForMultipleObjects WaitForMultipleObjectsEx WaitForSingleObject WaitForSingleObjectEx when the object is not expected to produce events and the function timeouts? -- http://mail.python.org/mailman/listinfo/python-list
Re: Using time.sleep() in 2 threads causes lockup whenhyper-threading is enabled
Delaney, Timothy (Tim) wrote: [EMAIL PROTECTED] wrote: I am a bit surprised that nobody else has tried running the short Python program above on a hyper-threading or dual core / dual processor system. Does it happen every time? Have you tried it on multiple machines? Is it possible that that one machine is having problems? Does it take the same amount of time each run to replicate - and if so, how long is that (give or take a minute)? Until you can answer these questions with definite answers, people are not going to dedicate machines *that they use* for hours on end trying to replicate it. And from this thread, the time required appears to be minutes to hours. That suggests you have a race condition which results in a deadlock - and of course, that is more likely to occur on a dual-core or dual-cpu machine, as you really have multiple threads executing at once. The test program in question doesn't require a dedicated machine and it doesn't consume a lot of CPU resources since it sleeps most of the time. I'm surprised that so many people have been willing to dedicate as much time as they have, but then again considering the people involved it's not quite so surprising. This problem doesn't require more time than any other in comp.lang.python which I try to help to resolve. In fact, since OP is not a newbie, it took less time than some newbie questions. -- http://mail.python.org/mailman/listinfo/python-list
Re: ascii to latin1
Luis P. Mendes wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi, I'm developing a django based intranet web server that has a search page. Data contained in the database is mixed. Some of the words are accented, some are not but they should be. This is because the collection of data began a long time ago when ascii was the only way to go. The problem is users have to search more than once for some word, because the searched word can be or not be accented. If we consider that some expressions can have several letters that can be accented, the search effort is too much. I've searched the net for some kind of solution but couldn't find. I've just found for the opposite. example: if the word searched is 'televisão', I want that a search by either 'televisao', 'televisão' or even 'télévisao' (this last one doesn't exist in Portuguese) is successful. So, instead of only one search, there will be several used. Is there anything already coded, or will I have to try to do it all by myself?

You need to convert from latin1 to ascii, not from ascii to latin1. The function below does that. Then you need to build the database index not on the latin1 text but on the ascii text. After that, convert the user input to ascii and search.

    import unicodedata

    def search_key(s):
        de_str = unicodedata.normalize("NFD", s)
        return ''.join(cp for cp in de_str if not unicodedata.category(cp).startswith('M'))

    print search_key(u"televisão")
    print search_key(u"télévisao")

Result:

    televisao
    televisao

-- http://mail.python.org/mailman/listinfo/python-list
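The same function carries over to Python 3 almost unchanged, since every `str` is already unicode there. A minimal sketch; the `matches` helper is an assumption added here to show how both the stored text and the user input go through the same key function.

```python
import unicodedata


def search_key(text):
    """Decompose accented letters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed
                   if not unicodedata.category(ch).startswith("M"))


def matches(query, stored):
    """Hypothetical helper: compare two strings by their accent-free keys."""
    return search_key(query) == search_key(stored)


print(search_key("televisão"), search_key("télévisao"))
print(matches("televisao", "televisão"))
```

In a real database the key would be computed once per row and stored in an indexed column, so that only the query has to be converted at search time.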
Re: A critic of Guido's blog on Python's lambda
Ken Tilton wrote: It is vastly more disappointing that an alleged tech genius would sniff at the chance to take undeserved credit for PyCells, something probably better than a similar project on which Adobe (your superiors at software, right?) has bet the ranch. This is the Grail, dude, Brooks's long lost Silver Bullet. And you want to pass? C'mon, Alex, I just want you as co-mentor for your star quality. Of course you won't have to do a thing, just identify for me a True Python Geek and she and I will take it from there. Here's the link in case you lost it: http://www.lispnyc.org/wiki.clp?page=PyCells :) peace, kenny ps. flaming aside, PyCells really would be amazingly good for Python. And so Google. (Now your job is on the line. g) k Perhaps I'm missing something but what's the big deal about PyCells? Here is 22-lines barebones implementation of spreadsheet in Python, later I create 2 cells a and b, b depends on a and evaluate all the cells. The output is a = negate(sin(pi/2)+one) = -2.0 b = negate(a)*10 = 20.0 === spreadsheet.py == class Spreadsheet(dict): def __init__(self, **kwd): self.namespace = kwd def __getitem__(self, cell_name): item = self.namespace[cell_name] if hasattr(item, formula): return item() return item def evaluate(self, formula): return eval(formula, self) def cell(self, cell_name, formula): Create a cell defined by formula def evaluate_cell(): return self.evaluate(formula) evaluate_cell.formula = formula self.namespace[cell_name] = evaluate_cell def cells(self): Yield all cells of the spreadsheet along with current values and formulas for cell_name, value in self.namespace.items(): if not hasattr(value, formula): continue yield cell_name, self[cell_name], value.formula import math def negate(x): return -x sheet1 = Spreadsheet(one=1, sin=math.sin, pi=math.pi, negate=negate) sheet1.cell(a, negate(sin(pi/2)+one)) sheet1.cell(b, negate(a)*10) for name, value, formula in sheet1.cells(): print name, =, formula, =, value -- 
http://mail.python.org/mailman/listinfo/python-list
Re: A critic of Guido's blog on Python's lambda
Bill Atkins wrote: Serge Orlov [EMAIL PROTECTED] writes: Ken Tilton wrote: It is vastly more disappointing that an alleged tech genius would sniff at the chance to take undeserved credit for PyCells, something probably better than a similar project on which Adobe (your superiors at software, right?) has bet the ranch. This is the Grail, dude, Brooks's long lost Silver Bullet. And you want to pass? C'mon, Alex, I just want you as co-mentor for your star quality. Of course you won't have to do a thing, just identify for me a True Python Geek and she and I will take it from there. Here's the link in case you lost it: http://www.lispnyc.org/wiki.clp?page=PyCells :) peace, kenny ps. flaming aside, PyCells really would be amazingly good for Python. And so Google. (Now your job is on the line. g) k Perhaps I'm missing something but what's the big deal about PyCells? Here is 22-lines barebones implementation of spreadsheet in Python, later I create 2 cells a and b, b depends on a and evaluate all the cells. 
The output is a = negate(sin(pi/2)+one) = -2.0 b = negate(a)*10 = 20.0 === spreadsheet.py == class Spreadsheet(dict): def __init__(self, **kwd): self.namespace = kwd def __getitem__(self, cell_name): item = self.namespace[cell_name] if hasattr(item, formula): return item() return item def evaluate(self, formula): return eval(formula, self) def cell(self, cell_name, formula): Create a cell defined by formula def evaluate_cell(): return self.evaluate(formula) evaluate_cell.formula = formula self.namespace[cell_name] = evaluate_cell def cells(self): Yield all cells of the spreadsheet along with current values and formulas for cell_name, value in self.namespace.items(): if not hasattr(value, formula): continue yield cell_name, self[cell_name], value.formula import math def negate(x): return -x sheet1 = Spreadsheet(one=1, sin=math.sin, pi=math.pi, negate=negate) sheet1.cell(a, negate(sin(pi/2)+one)) sheet1.cell(b, negate(a)*10) for name, value, formula in sheet1.cells(): print name, =, formula, =, value I hope Ken doesn't mind me answering for him, but Cells is not a spreadsheet (where did you get that idea?). It's written on the page linked above, second sentence: Think of the slots as cells in a spreadsheet, and you've got the right idea. I'm not claiming that my code is full PyCell implementation. It does apply the basic idea of a spreadsheet to software - that is, instead of updating value when some event occurs, you specify in advance how that value can be computed and then you stop worrying about keeping it updated. The result is the same. Of course, I don't track dependances in such a tiny barebones example. But when you retrieve a cell you will get the same value as with dependances. Adding dependances is left as an exercise. Incidentally, is this supposed to be an example of Python's supposed aesthetic pleasantness? Nope. This is an example that you don't need macros and multi-statements. 
Ken writes: While the absence of macros and multi-statement lambda in Python will make coding more cumbersome. I'd like to see Python code doing the same if the language had macros and multi-statement lambda. Will it be more simple? More expressive? I find it a little hideous, even giving you the benefit of the doubt and pretending there are newlines between each function. There's nothing like a word wrapped in pairs of underscores to totally ruin an aesthetic experience. I don't think anyone who is not a master of a language can judge readability. You're just distracted by insignificant details, they don't matter if you code in that language for many years. I'm not going to tell you how Lisp Cell code looks to me ;) P.S. Is this really a spreadsheet? It looks like it's a flat hashtable... Does it matter if it's flat or 2D? -- http://mail.python.org/mailman/listinfo/python-list
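The claim made in this exchange, that a value defined by a formula stays current without any dependency tracking as long as it is recomputed on every read, can be checked with a small rework for Python 3. This is a sketch under stated assumptions, not the code from the post: the class name `Sheet` and its attributes are invented, formulas are passed to `eval`, and `eval` must never be given untrusted input.

```python
import math


class Sheet:
    """Pull-based sheet: a formula cell is re-evaluated each time it is read."""

    def __init__(self, **constants):
        self.values = dict(constants)   # plain inputs and helper functions
        self.formulas = {}              # cell name -> formula text

    def cell(self, name, formula):
        self.formulas[name] = formula

    def __getitem__(self, name):
        if name in self.formulas:
            # Names used inside the formula are looked up through this method again.
            return eval(self.formulas[name], {"__builtins__": {}}, self)
        return self.values[name]


sheet = Sheet(one=1, sin=math.sin, pi=math.pi, negate=lambda x: -x)
sheet.cell("a", "negate(sin(pi/2)+one)")
sheet.cell("b", "negate(a)*10")
print(sheet["a"], sheet["b"])

sheet.values["one"] = 2   # change an input; no update call is needed
print(sheet["a"], sheet["b"])
```

What dependency tracking would add is caching: here `a` is recomputed every time `b` is read, which is fine for a toy and expensive for long chains of cells.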
Re: the tostring and XML methods in ElementTree
[EMAIL PROTECTED] wrote:
> Question 1: assuming the following:
> a) beforeCtag.text gets assigned a value of 'I\x92m confused'
> b) afterRoot is built using the XML() method where the input to the XML() method is the results of a tostring() method from beforeRoot
> Are there any settings/arguments that could have been modified that would have resulted in afterCtag.text being of type <type 'str'> and afterCtag.text when printed displays: I'm confused ?

The str type (also known as a byte string) is only suitable for ASCII text. chr(0x92) is outside of ASCII, so you should use unicode strings or you\x92ll be confused :)

    >>> print u"I\u2019m not confused"
    I'm not confused

> Question 2: Does the fact that resultToStr is equal to resultToStr2 mean that an encoding of utf-8 is the defacto default when no encoding is passed as an argument to the tostring method, or does it only mean that in this particular example, they happened to be the same?

No. De jure, the default encoding is ASCII; de facto, people try to change it, but that is not a good idea. I'm not sure how you got the strings to be the same, but it is definitely a host-specific result. When I repeat your interactive session I get a different resultToStr at this point:

    >>> afterRoot = ElementTree.XML(resultToStr)
    >>> resultToStr
    '<beforeRoot><C>I&#146;m confused</C></beforeRoot>'

> 3) would it be possible to construct a statement of the form newResult = afterCtag.text.encode(?? some argument ??) where newResult was the same as beforeCtag.text? If so, what should the argument be to the encode method?

Dealing with unicode doesn't require you to pollute your code with encode calls. Just open the file using the codecs module and then write unicode strings directly:

    import codecs
    fileHandle = codecs.open('c:/output1.text', 'w', 'utf-8')
    fileHandle.write(u"I\u2019m not confused, because I'm using unicode")

> 4) what is the second character in encodedCtagtext (the character with an ordinal value of 194)?

That is a byte with value 194; it is not a character. It is the first byte of unicode code point U+0092 when that code point is encoded in utf-8:

    >>> '\xc2\x92'.decode('utf-8')
    u'\x92'

This code point actually has no name, so you shouldn't produce it:

    >>> import unicodedata
    >>> unicodedata.name('\xc2\x92'.decode('utf-8'))
    Traceback (most recent call last):
      File "<pyshell#40>", line 1, in -toplevel-
        unicodedata.name('\xc2\x92'.decode('utf-8'))
    ValueError: no such name

-- http://mail.python.org/mailman/listinfo/python-list
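The difference between the byte 0x92, the code point U+0092 and the character U+2019 can be sketched as follows. This is a rough illustration in present-day Python 3 (the thread itself uses Python 2.4, where str is the byte type), and it assumes the original 0x92 byte came from Windows-1252 text, which the thread does not confirm.

```python
import unicodedata

# Assumption: the byte string was produced by a Windows-1252 source.
raw = b"I\x92m confused"
text = raw.decode("cp1252")

# In cp1252, byte 0x92 maps to U+2019, which does have a name.
name = unicodedata.name(text[1])

# The code point U+0092 is a different thing: a C1 control character.
c1_control = "\x92"
utf8_bytes = c1_control.encode("utf-8")   # two bytes in UTF-8
first_byte = utf8_bytes[0]                # the "ordinal value of 194"

# unicodedata.name() raises ValueError for U+0092 unless a default is given.
fallback = unicodedata.name(c1_control, "<unnamed>")

print(repr(text), name, utf8_bytes, first_byte, fallback)
```

Decoding with the right codec once, at the boundary, is what makes later encode calls unnecessary in the rest of the program.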
Re: the tostring and XML methods in ElementTree
[EMAIL PROTECTED] wrote:
> O/S: Windows XP Home
> Vsn of Python: 2.4

[snip fighting with unicode character U+2019 (RIGHT SINGLE QUOTATION MARK)]

I don't know what console you use, but if it is IDLE you'll get even more confused, because it is buggy and handles that character improperly:

    >>> print repr(u'I’m confused')
    u'I\x92m confused'

I'm using Lightning Compiler http://cheeseshop.python.org/pypi/Lightning%20Compiler to run snippets of code, and in the editor tab it handles that character fine:

    >>> print repr(u'I’m confused')
    u'I\u2019m confused'

But in the console tab it produces the same buggy result :) It looks like handling unicode is rocket science :)
Re: Embedding Python: How to run compiled(*.pyc/*.pyo) files using Python C API?
Shankar wrote:
> Hello,
> I am trying to run compiled Python files (*.pyc and *.pyo) using the Python C API. I am using the method PyRun_FileFlags() for this purpose. The code snippet is as follows:
>
>     PyCompilerFlags myFlags;
>     myFlags.cf_flags = 1; // I tried all values 0, 1 and 2
>     PyRun_FileFlags(script, file, Py_file_input, globals, locals, &myFlags);
>
> But unfortunately I get the following exception:
>
>     DeprecationWarning: Non-ASCII character '\xf2' in file E:\test.pyc on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

Note that this is a warning, not an exception. It appears because PyRun_FileFlags() parses its input as Python source, and a .pyc file contains binary bytecode rather than source.

> When I run the .py file, then things work fine. The .py file contains only one statement,
>
>     print "Hello World"
>
> Which Python C API should I use to run compiled Python files (*.pyc and *.pyo) in the scenario where the source file (*.py) is not present.

I believe it's PyImport_ImportModule("test")
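The idea of loading a compiled file through the import machinery, instead of running it as source, can be sketched at the Python level. This is only an illustration in present-day Python 3 using importlib; it is not the C API call discussed above, and the helper name load_compiled and the module name hello are made up for the example.

```python
import importlib.machinery
import importlib.util
import os
import py_compile
import tempfile


def load_compiled(name, pyc_path):
    """Import a module from a .pyc file when no .py source is present."""
    loader = importlib.machinery.SourcelessFileLoader(name, pyc_path)
    spec = importlib.util.spec_from_loader(name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "hello.py")
    with open(src, "w") as f:
        f.write("GREETING = 'Hello World'\n")
    # Compile to bytecode, then delete the source to mimic the scenario.
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "hello.pyc"), doraise=True)
    os.remove(src)
    hello = load_compiled("hello", pyc)

print(hello.GREETING)
```

The loader reads the bytecode header and unmarshals the code object, which is the step a source-oriented entry point skips.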
Re: Elegent solution to replacing ' and ?
fyleow wrote:
> I'm trying to replace the ' and " characters in the strings I get from feedparser so I can enter it in the database without getting errors. Here's what I have right now.
>
>     self.title = entry.title.encode('utf-8')
>     self.title = self.title.replace('\"', '\\\"')
>     self.title = self.title.replace('\'', '\\\'')
>
> This works just great but is there a more elegant way to do this? It looks like maybe I could use the translate method but I'm not sure.

You should let the execute method construct the SQL statement instead of escaping the values yourself.

This is wrong:

    self.title = entry.title.encode('utf-8')
    self.title = self.title.replace('\"', '\\\"')
    self.title = self.title.replace('\'', '\\\'')
    cursor.execute('select foo from bar where baz=%s ' % self.title)

This is right:

    self.title = entry.title
    cursor.execute("select foo from bar where baz=%s", (self.title,))

The placeholder style differs between db modules; take a look at the paramstyle description in PEP 249: http://www.python.org/dev/peps/pep-0249/
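A self-contained sketch of the parameterized form, using the sqlite3 module from the standard library as a stand-in for whatever database module the original poster uses (that module is not named in the thread). sqlite3 uses the qmark paramstyle, so the placeholder is ? rather than %s; the table layout is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE bar (foo INTEGER, baz TEXT)")

# A title containing both kinds of quotes and a backslash, unescaped.
title = 'He said "it\'s a \\ backslash"'

# The driver handles quoting; the value is passed separately from the SQL text.
cur.execute("INSERT INTO bar (foo, baz) VALUES (?, ?)", (1, title))
cur.execute("SELECT foo FROM bar WHERE baz = ?", (title,))
row = cur.fetchone()
conn.close()

print(sqlite3.paramstyle, row)
```

Note the trailing comma in (title,): the parameters argument is a sequence even when there is only one value.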
Re: Why does built-in set not take keyword arguments?
Jack Diederich wrote:
> On Thu, May 04, 2006 at 02:08:30PM -0400, Steven Watanabe wrote:
>> I'm trying to do something like this in Python 2.4.3:
>>
>>     class NamedSet(set):
>>         def __init__(self, items=(), name=''):
>>             set.__init__(self, items)
>>             self.name = name
>>
>>     class NamedList(list):
>>         def __init__(self, items=(), name=''):
>>             list.__init__(self, items)
>>             self.name = name
>>
>> I can do:
>>
>>     mylist = NamedList(name='foo')
>>
>> but I can't do:
>>
>>     myset = NamedSet(name='bar')
>>     TypeError: set() does not take keyword arguments
>>
>> How come? How would I achieve what I'm trying to do?
>
> setobject.c checks for keyword arguments in its __new__ instead of its __init__. I can't think of a good reason other than to force inheritors to be maximally set-like. We're all adults here so I'd call it a bug. bufferobject, rangeobject, and sliceobject all do this too, but classmethod and staticmethod both check in tp_init. Go figure.
>
> As a workaround use a function to make the set-alike.
>
>     class NamedSet(set): pass
>
>     def make_namedset(vals, name):
>         ob = NamedSet(vals)
>         ob.name = name
>         return ob
>
> Then use make_namedset as a constructor in place of NamedSet(vals, name)

Or use this workaround, which overrides __new__ so that the keyword argument never reaches set's own __new__:

    class NamedSet(set):
        def __new__(cls, iterable=(), name=''):
            return super(NamedSet, cls).__new__(cls)

        def __init__(self, iterable=(), name=''):
            super(NamedSet, self).__init__(iterable)
            self.name = name
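For reference, the same __new__/__init__ workaround written for present-day Python 3 (an assumption; the thread targets 2.4.3), with a couple of calls that exercise the keyword argument. The variable names empty and filled are only for the demonstration.

```python
class NamedSet(set):
    def __new__(cls, iterable=(), name=""):
        # Swallow the extra arguments so set.__new__ only sees the class.
        return super().__new__(cls)

    def __init__(self, iterable=(), name=""):
        super().__init__(iterable)
        self.name = name


empty = NamedSet(name="bar")
filled = NamedSet([1, 2, 2, 3], name="foo")

print(empty.name, len(empty))
print(filled.name, sorted(filled))
```

Keeping the two signatures identical matters, because Python passes the same arguments to __new__ and then to __init__.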
Re: Using time.sleep() in 2 threads causes lockup when hyper-threading is enabled
[EMAIL PROTECTED] wrote:
> Below are 2 files that isolate the problem. Note, both programs hang (stop responding) with hyper-threading turned on (a BIOS setting), but work as expected with hyper-threading turned off.

What do you mean by "stop responding"? Not responding when you press ctrl-c? Or do they stop printing? If you mean they stop printing, try calling sys.stdout.flush() after each print.
Re: Using time.sleep() in 2 threads causes lockup when hyper-threading is enabled
[EMAIL PROTECTED] wrote:
>> What do you mean stop responding?
>
> Both threads print their thread numbers (either 1 or 2) approximately every 10 seconds. However, after a while (minutes to hours) both programs (see above) hang! Pressing ctrl-c (after the printing stops) causes the threads to wake up from their sleep statement. And since the sleep took more than 1 second, the thread number and the duration of the sleep are printed to the screen.
>
> Do you have a hyper-threading/dual/multi core CPU? Did you try this?

I don't have such a CPU, but I ran the first program anyway. It printed:

    C:\pyth.py
    thread 1 started
    sleep time: 0.01
    3.63174649292e-006
    8.43682646817e-005
    0.000164825417756
    thread 2 started
    sleep time: 0.003
    0.000675225482568
    0.000753447714724
    0.00082943502596
    1 1 1 2 1 1 1 2 1 1 1 2 1 1 1 1 2 1 1 1 2 1

I got bored and tried to stop it with ctrl-c, but it didn't respond and kept running and printing the numbers. I had to kill it from the task manager.
Re: Strange result with math.atan2()
Vedran Furac wrote:
> Ben Caradoc-Davies wrote:
>> Vedran Furac wrote:
>>> I think that this results must be the same:
>>>
>>>     In [3]: math.atan2(-0.0,-1)
>>>     Out[3]: -3.1415926535897931
>>>
>>>     In [4]: math.atan2(-0,-1)
>>>     Out[4]: 3.1415926535897931
>>
>> -0 is converted to 0, then to 0.0 for calculation, losing the sign. You might as well write 0.0 instead of -0
>>
>> The behaviour of atan2 conforms to the ISO C99 standard (Python is implemented in C). Changing the sign of the first argument changes the sign of the output, with no special treatment for zero.
>>
>> http://www.ugcs.caltech.edu/manuals/libs/mpfr-2.2.0/mpfr_22.html
>
> Well, here I can read:
>
> Special values are currently handled as described in the ISO C99 standard for the atan2 function (note this may change in future versions):
>
>     * atan2(+0, -0) returns +Pi.
>     * atan2(-0, -0) returns -Pi. /* wrong too */
>     * atan2(+0, +0) returns +0.
>     * atan2(-0, +0) returns -0. /* wrong too */
>     * atan2(+0, x) returns +Pi for x < 0.
>     * atan2(-0, x) returns -Pi for x < 0
>
> And the formula (also from that site):
>
>     if x < 0, atan2(y, x) = sign(y)*(PI - atan (abs(y/x)))
>                             ^^^
>
> So, you can convert -0 to 0, but you must multiply the result with sign of y, which is '-' (minus).

But you miss the fact that 0 is an *integer*, not a float, and an integer -0 doesn't exist: the expression -0 is just the integer 0, so there is no sign left to multiply by. Use this code until you stop passing integers to atan2:

    from math import atan2 as math_atan2

    def atan2(y, x):
        if (isinstance(y, int) and y == 0) or (
                isinstance(x, int) and x == 0):
            raise ValueError("Argument that is an integer zero can "
                             "produce wrong results")
        return math_atan2(y, x)

    print atan2(-0.0, -0.0)
    print atan2(-0, -0)
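The loss of the sign can be checked directly. A small sketch in present-day Python 3 (an assumption; the thread uses Python 2), using math.copysign to expose the sign of each zero before it reaches atan2:

```python
import math

# -0.0 is a float with its sign bit set; -0 is simply the integer 0.
sign_of_float_zero = math.copysign(1.0, -0.0)
sign_of_int_zero = math.copysign(1.0, -0)

# The sign of y decides which side of the branch cut the result lands on.
from_float = math.atan2(-0.0, -1)
from_int = math.atan2(-0, -1)

print(sign_of_float_zero, sign_of_int_zero)
print(from_float, from_int)
```

So both results follow the C99 table quoted above; the two calls simply pass different values of y.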
Re: simultaneous assignment
John Salerno wrote:
> bruno at modulix wrote:
>> Now if I may ask: what is your actual problem ?
>
> Ok, since you're so curious. :)
>
> Here's a scan of the page from the puzzle book: http://johnjsalerno.com/spies.png
>
> Basically I'm reading this book to give me little things to try out in Python. There's no guarantee that this puzzle is even conducive to (or worthy of) a programming solution.

So what you're trying to do is to run over all possible combinations? Anyway, you don't need to worry about identity, since boolean values are immutable. In general, when you see a statement like

    some_var = immutable value

you can be *sure* you're changing *only* some_var.

Warning! Half-spoiler below :) Following is a function run_over_space from my personal utils package for generating all combinations, and an example of how it can be applied to your puzzle:

    def decrement(point, space):
        """Advance point in place to the next point of iteration space.
        Raise StopIteration when the space is exhausted"""
        for coord in range(len(point)):
            if point[coord] > 0:
                point[coord] -= 1
                return
            else:
                point[coord] = space[coord]
                continue
        raise StopIteration

    def run_over_space(space):
        """Yield all points of iteration space.
        Space is a list of maximum values of each dimension"""
        point = space[:]
        while True:
            yield point
            decrement(point, space)

    def describe_point(spy, w, x, y, z):
        if spy:
            print "Spy1 is right, ",
        else:
            print "Spy1 is wrong, ",
        print "w, x, y, z = ", w, x, y, z

    for point in run_over_space([1,1,1,1,1]):
        describe_point(*point)
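The same enumeration can also be sketched with itertools.product from the standard library, written here for present-day Python 3 (an assumption; itertools.product did not exist in the Python of this thread). The helper name all_points is made up. Unlike run_over_space, it counts up from zero rather than down from the maximum, and it returns independent tuples rather than yielding one list that is mutated between steps.

```python
from itertools import product


def all_points(space):
    """Return every point with 0 <= point[i] <= space[i], as tuples."""
    return list(product(*(range(limit + 1) for limit in space)))


points = all_points([1, 1, 1, 1, 1])

# Show the first few combinations in the style of describe_point.
for spy, w, x, y, z in points[:3]:
    verdict = "Spy1 is right," if spy else "Spy1 is wrong,"
    print(verdict, "w, x, y, z =", w, x, y, z)
```

Five two-valued dimensions give 2**5 combinations, which is small enough to check every case of the puzzle by brute force.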
Re: How to prevent this from happening?
[EMAIL PROTECTED] wrote:
> Regarding this expression: 1 << x
>
> I had a bug in my code that made x become Very Large - much larger than I had intended. This caused Python, and my PC, to lock up tight as a drum, and it appeared that the Python task (Windows XP) was happily and rapidly consuming all available virtual memory.
>
> Presumably, Python was trying to create a really really long integer, just as I had asked it.
>
> Is there a way to put a limit on Python, much like there is a stack limit, so that this sort of thing can't get out of hand?

This is a general problem regardless of programming language, and it's better solved by the OS. Windows has an API for limiting resource usage, but it lacks user tools. At least I'm not aware of them; maybe *you* can find them. There is Windows System Resource Manager http://www.microsoft.com/technet/downloads/winsrvr/wsrm.mspx It won't run on Windows XP, but you may take a look at its distribution CD image. If you're lucky, maybe there is a command line tool for Windows XP.

Alternatively you can switch to a better OS ;) Any Unix-like system (Mac OS X, Linux, *BSD, etc.) has resource usage limiting tools out of the box.
Re: stdin: processing characters
Kevin Simmons wrote:
> Thanks for your input. I found an answer that suits my needs, not curses :-), but stty settings and sys.stdin.read(n):
>
>     import os, sys
>
>     while 1:
>         os.system("stty -icanon min 1 time 0")
>         print """
>     Radio computer control program.
>     --
>     Choose a function:
>     po) Power toggle
>     fq) Change frequency
>     cm) Change mode
>     vo) Change volume
>     re) Reset
>     qu) Quit
>     --""",
>         func = sys.stdin.read(2)
>         if func == "po":
>             ... ... rest of menu actions ...
>         elif func == "qu":
>             os.system("stty cooked")
>             sys.exit()

Looks reasonable if you don't need portability. But you may want to refactor it a little bit to make sure the terminal settings are always restored:

    try:
        do_all_the_work()
    finally:
        os.system("stty cooked")

P.S. Maybe it's me, but when I see a call to sys.exit() I always have a gut feeling that this function never returns. But in fact I'm wrong, and sys.exit is more reasonable: it raises an exception. So you can call sys.exit() inside do_all_the_work and you can still be sure that os.system("stty cooked") is always executed at the end.
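The claim that a finally block still runs when sys.exit() is called inside it can be sketched without touching a real terminal. In this illustration, written for present-day Python 3 (an assumption), an events list stands in for the stty calls and SystemExit is caught at the outermost level only so that the script can report what happened.

```python
import sys

events = []


def do_all_the_work():
    events.append("working")
    sys.exit(0)                      # raises SystemExit instead of returning
    events.append("unreachable")     # never executed


def main():
    try:
        do_all_the_work()
    finally:
        # Stand-in for os.system("stty cooked") in the real program.
        events.append("terminal restored")


try:
    main()
except SystemExit as exc:
    exit_code = exc.code

print(events, exit_code)
```

In a real program the outer try/except would be left out, so the exception would propagate and end the process after the cleanup has run.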
Re: Can Python kill a child process that keeps on running?
I. Myself wrote:
> Suppose we spawn a child process with Popen. I'm thinking of an executable file, like a compiled C program. Suppose it is supposed to run for one minute, but it just keeps going and going. Does Python have any way to kill it?
>
> This is not hypothetical; I'm doing it now, and it's working pretty well, but I would like to be able to handle this run-on condition. I'm using Windows 2000, but I want my program to be portable to linux.

On Linux it's pretty easy to do: just set up an alarm signal. On Windows it's not so trivial, to the point that you cannot do it using only the python.org distribution; you will need to poke into the low-level C API using the win32 extensions or ctypes. AFAIK the twisted package http://twistedmatrix.com has some code to help you. Also take a look at the buildbot sources http://buildbot.sf.net, which use twisted. Buildbot has the same problem as you have: it needs to kill runaway or non-responding processes.
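For readers on a current interpreter, a portable timeout-and-kill can be sketched with the subprocess module alone. This relies on Popen.wait(timeout=...) and Popen.kill() from present-day Python 3, which were not available in the Python versions discussed in this thread; the helper name run_with_timeout and the child commands are invented for the example.

```python
import subprocess
import sys


def run_with_timeout(args, timeout_seconds):
    """Run a child process; kill it if it outlives timeout_seconds.

    Returns (returncode, timed_out).
    """
    proc = subprocess.Popen(args)
    try:
        proc.wait(timeout=timeout_seconds)
        return proc.returncode, False
    except subprocess.TimeoutExpired:
        proc.kill()      # forceful termination on both Windows and Unix
        proc.wait()      # reap the child so it does not linger
        return proc.returncode, True


# A child that would run for a minute, given half a second.
runaway = [sys.executable, "-c", "import time; time.sleep(60)"]
code, timed_out = run_with_timeout(runaway, 0.5)

# A child that finishes well within its allowance.
quick = [sys.executable, "-c", "pass"]
quick_code, quick_timed_out = run_with_timeout(quick, 30)

print(timed_out, quick_code, quick_timed_out)
```

The return code of a killed child differs between platforms, so the sketch reports the timeout as a separate flag instead of inferring it from the code.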