Re: List to string
Steven D'Aprano [EMAIL PROTECTED] wrote:
> On Tue, 20 Mar 2007 13:01:36 +0100, Bruno Desthuilliers wrote:

8<--- confusion about left and right

It gets worse. When you work on a lathe, a right-hand cutting tool has its cutting edge on the left... And the worst part is that it's for good reason.

- Hendrik
-- http://mail.python.org/mailman/listinfo/python-list
Re: Daylight saving time question
On Mar 20, 12:53 pm, Mr Pekka Niiranen [EMAIL PROTECTED] wrote:
> Hi,
> is it possible to get the two annual daylight saving times (day, month and time) from Python by giving location in some country/location string ("Europe/Finland" for example).
> I need to ask country in program and calculate daylight saving times for the next few years onwards somehow like this:
>
>     for y in range(2007, 2017):
>         (m1,d1,t1,m2,d2,t2) = daylight_change_epochs("Finland")
>
> -pekka-

A generator defined via recursion:

    import dateutil.rrule, dateutil.tz
    import datetime

    mytz = dateutil.tz.tzfile("/usr/share/zoneinfo/Europe/Helsinki")
    start = datetime.datetime(2007,1,1,0,0,0,tzinfo=mytz)
    end = datetime.datetime(2017,1,1,0,0,0,tzinfo=mytz)

    successively_finer = {
        dateutil.rrule.WEEKLY: dateutil.rrule.DAILY,
        dateutil.rrule.DAILY: dateutil.rrule.HOURLY,
        dateutil.rrule.HOURLY: dateutil.rrule.MINUTELY,
        dateutil.rrule.MINUTELY: dateutil.rrule.SECONDLY
    }

    # find week, then day, then hour, etc. that spans a change in DST
    def sieve (start, end, freq):
        dstflag = start.timetuple()[-1]
        iter = dateutil.rrule.rrule(freq,dtstart=start,until=end)
        tprior = start
        for t in iter:
            if t.timetuple()[-1] != dstflag:
                dstflag = t.timetuple()[-1]
                if freq == dateutil.rrule.SECONDLY:
                    yield tprior, t
                else:
                    yield sieve(tprior, t, successively_finer[freq]).next()
            tprior = t
        raise StopIteration

    for before, after in sieve(start, end, dateutil.rrule.WEEKLY):
        print "%s = %s" % (
            before.strftime("%Y-%m-%d %H:%M:%S (%a) %Z"),
            after.strftime("%Y-%m-%d %H:%M:%S (%a) %Z"))

I get:

    2007-03-25 02:59:59 (Sun) EET = 2007-03-25 03:00:00 (Sun) EEST
    2007-10-28 02:59:59 (Sun) EEST = 2007-10-28 03:00:00 (Sun) EET
    2008-03-30 02:59:59 (Sun) EET = 2008-03-30 03:00:00 (Sun) EEST
    2008-10-26 02:59:59 (Sun) EEST = 2008-10-26 03:00:00 (Sun) EET
    2009-03-29 02:59:59 (Sun) EET = 2009-03-29 03:00:00 (Sun) EEST
    2009-10-25 02:59:59 (Sun) EEST = 2009-10-25 03:00:00 (Sun) EET
    2010-03-28 02:59:59 (Sun) EET = 2010-03-28 03:00:00 (Sun) EEST
    2010-10-31 02:59:59 (Sun) EEST = 2010-10-31 03:00:00 (Sun) EET
    2011-03-27 02:59:59 (Sun) EET = 2011-03-27 03:00:00 (Sun) EEST
    2011-10-30 02:59:59 (Sun) EEST = 2011-10-30 03:00:00 (Sun) EET
    2012-03-25 02:59:59 (Sun) EET = 2012-03-25 03:00:00 (Sun) EEST
    2012-10-28 02:59:59 (Sun) EEST = 2012-10-28 03:00:00 (Sun) EET
    2013-03-31 02:59:59 (Sun) EET = 2013-03-31 03:00:00 (Sun) EEST
    2013-10-27 02:59:59 (Sun) EEST = 2013-10-27 03:00:00 (Sun) EET
    2014-03-30 02:59:59 (Sun) EET = 2014-03-30 03:00:00 (Sun) EEST
    2014-10-26 02:59:59 (Sun) EEST = 2014-10-26 03:00:00 (Sun) EET
    2015-03-29 02:59:59 (Sun) EET = 2015-03-29 03:00:00 (Sun) EEST
    2015-10-25 02:59:59 (Sun) EEST = 2015-10-25 03:00:00 (Sun) EET
    2016-03-27 02:59:59 (Sun) EET = 2016-03-27 03:00:00 (Sun) EEST
    2016-10-30 02:59:59 (Sun) EEST = 2016-10-30 03:00:00 (Sun) EET

-- 
Hope this helps,
Steven
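For readers on a current Python, the same search for DST changes can probably be done without dateutil. The following is only a sketch, not the method from the post: it assumes Python 3.9+ (for the standard zoneinfo module) and a system time zone database, and the names offset_changes and helsinki_changes are made up for illustration. It steps through UTC one hour at a time and reports each point where the zone's UTC offset changes.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # standard library from Python 3.9 on


def offset_changes(tz, start_utc, end_utc, step=timedelta(hours=1)):
    """Yield (before, after) local datetimes around each UTC-offset change.

    start_utc and end_utc must be timezone-aware. The result is only as
    precise as `step` (one hour by default).
    """
    prev = start_utc.astimezone(tz)
    t = start_utc + step
    while t <= end_utc:
        cur = t.astimezone(tz)
        if cur.utcoffset() != prev.utcoffset():
            yield prev, cur
        prev = cur
        t += step


def helsinki_changes(first_year, last_year):
    """Illustrative wrapper; needs tz data for Europe/Helsinki to be installed."""
    tz = ZoneInfo("Europe/Helsinki")
    start = datetime(first_year, 1, 1, tzinfo=timezone.utc)
    end = datetime(last_year + 1, 1, 1, tzinfo=timezone.utc)
    return list(offset_changes(tz, start, end))
```

Calling helsinki_changes(2007, 2016) should give two pairs per year. Because the scan runs in UTC, each pair brackets the real jump in local time rather than a second-by-second boundary.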
Re: When is List Comprehension inappropriate?
On Mar 19, 2:41 pm, Ben [EMAIL PROTECTED] wrote:
> I have recently learned how list comprehension works and am finding it extremely cool. I am worried, however, that I may be stuffing it into places that it does not belong.
>
> What's the most "pythony" way to do this:
>
>     even = []
>     for x in range(0,width,2):
>         for y in range(0,height,2):
>             color = im.getpixel((x,y))
>             even.append(((x,y), color))
>
> versus list comprehension:
>
>     even2 = [((x,y), im.getpixel((x,y))) for x in range(0,width,2) for y in range(0,height,2)]
>
> Is there a computational difference in creating a blank list and appending to it versus doing a list comprehension? Are there advantages to it outside of short and pretty code?
>
> Feel free to tell me a different way to do this, as well.
>
> Thanks,
> Ben

I have found that I have gone too far when I use listcomps for their side effects rather than for the list they produce. For example, the second listcomp below is an expression used as a statement: I don't want the list it produces, just the effect on data.

    >>> # some random ranges
    >>> data = [range(random.randrange(3,7)) for x in range(4)]
    >>> # but I want each range jumbled
    >>> [ random.shuffle(d) for d in data]
    [None, None, None, None]
    >>> data
    [[2, 0, 3, 1], [0, 2, 1], [3, 4, 1, 0, 2], [2, 1, 0, 3]]

(I do know how to re-write it.)

- Paddy.
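One possible rewrite of the shuffle example, as a sketch in Python 3 rather than the poster's own version: build the data with a comprehension, because the list is wanted, and use a plain for loop for the shuffling, because only the side effect is wanted. The function name shuffled_ranges and the seed parameter are invented for illustration.

```python
import random


def shuffled_ranges(count, seed=None):
    """Return `count` lists of 3 to 6 consecutive integers, each shuffled."""
    rng = random.Random(seed)
    # A comprehension fits here: the resulting list is what we want.
    data = [list(range(rng.randrange(3, 7))) for _ in range(count)]
    # A plain loop fits here: shuffle() works in place and returns None.
    for d in data:
        rng.shuffle(d)
    return data
```

The loop version avoids building and discarding a throwaway list of None values, and it makes the intent (mutate each list) visible at a glance.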
How to tell easy_install that a package is already installed
I am not sure if this is the right forum to ask this. If it is not, could someone please point me to a more appropriate newsgroup?

I am attempting to install dap.plugins.netcdf using easy_install on HP-UX 11. As a user, I do not have root access, so I have followed the easy_install recommendation to set up a virtual python, as described at http://peak.telecommunity.com/DevCenter/EasyInstall#creating-a-virtual-python and this is mostly working well.

One of the dependencies, numpy, is already installed, but easy_install does not detect this and tries to install numpy, which fails with many compilation errors (numerous pointer type mismatches, undeclared functions like rintf that do not exist on HP, ...).

I could try to install numpy myself from the tar.gz file, but this seems pointless since it is already installed.

Is there any way to tell easy_install that numpy is installed, while still installing any other dependencies? Would the best option be to just use the --no-deps option and see if it works?

Charles
How to set SA_RESTART flag in Python
Hi, I want to set SA_RESTART flag to restart a system-call. How to do this in Python? Thank you! -- LinuX Power -- http://mail.python.org/mailman/listinfo/python-list
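For what it is worth, later Python versions (2.6 and up, Unix only) expose this through signal.siginterrupt(). The sketch below is an illustration, not an answer given in this thread; the helper names install_restarting_handler and demo are made up. Passing False asks for interrupted system calls to be restarted for that signal. The call has to come after signal.signal(), because installing a handler resets the behaviour.

```python
import signal
import time


def install_restarting_handler(signum, handler):
    """Install `handler` and ask for system calls to be restarted (SA_RESTART)."""
    signal.signal(signum, handler)
    # Must follow signal.signal(), which resets the flag to "interrupt".
    signal.siginterrupt(signum, False)


def demo():
    """Deliver one SIGALRM during a sleep and return the signals seen."""
    seen = []
    install_restarting_handler(signal.SIGALRM,
                               lambda signum, frame: seen.append(signum))
    signal.setitimer(signal.ITIMER_REAL, 0.05)  # fire once after 50 ms
    time.sleep(0.2)
    return seen
```

Note that since Python 3.5 the interpreter itself retries most interrupted calls, so the flag matters mainly for C extensions and older versions.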
Unit testing daemon creation
Howdy all, I'm in the process of creating a daemon program using Python. I know that the program needs to 'os.fork()', then branch based on the return value and either exit (if the parent) or continue (if the child). How can I unit test that? That is to say, how can I programmatically test that functionality of daemonisation from an external module which imports the daemon module for testing? I could be hackish and simply clobber the existing 'os.fork' for the purpose of the unit test, making a stub function that returns the appropriate values for the test. Is there a better way? -- \Beware of and eschew pompous prolixity. -- Charles A. | `\ Beardsley | _o__) | Ben Finney -- http://mail.python.org/mailman/listinfo/python-list
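The "clobber os.fork" idea can be done in a controlled way with unittest.mock, which restores the real function afterwards. This is a sketch under assumptions: become_daemon is a hypothetical stand-in for the code under test, and run_with_fake_fork is an invented helper, neither comes from the post.

```python
import os
from unittest import mock


def become_daemon():
    """Hypothetical code under test: fork, exit in the parent, go on in the child."""
    pid = os.fork()
    if pid > 0:
        os._exit(0)  # parent process leaves immediately
    return pid       # child process continues


def run_with_fake_fork(fake_pid):
    """Call become_daemon() with os.fork and os._exit replaced by stubs."""
    with mock.patch("os.fork", return_value=fake_pid) as fake_fork, \
         mock.patch("os._exit") as fake_exit:
        become_daemon()
    return fake_fork.call_count, fake_exit.called
```

With fake_pid set to 0 the test sees the child branch, and with a positive value it sees the parent branch, all without creating a real process.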
Re: How to tell easy_install that a package is already installed
Charles Sanders [EMAIL PROTECTED] writes:
> I am not sure if this is the right forum to ask this. If it is not, could someone please point me to a more appropriate newsgroup.

You want the distutils SIG mailing list for Python. Setuptools is an extension of Python's standard distutils.

URL:http://mail.python.org/mailman/listinfo/distutils-sig

Be sure to read the information links on that page before posting.

-- 
"If you get invited to your first orgy, don't just show up nude. That's a common mistake. You have to let nudity 'happen.'" -- Jack Handey
Ben Finney
replace illegal xml characters
Hi! I am working with XML exported from InDesign and parsing it in a Python application. I learned here: http://boodebr.org/main/python/all-about-python-and-unicode that there are sets of Unicode characters that are illegal in XML (and hence for every compliant XML parser). I have already implemented a regex solution to replace the characters in question, but I wonder if there is an efficient, out-of-the-box solution somewhere out there for this problem. Does anybody know?

Thanks!
gabriel
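For reference, a regex of the kind described might look like the sketch below (Python 3). It is not the poster's code. The character class is the complement of the Char production in the XML 1.0 specification; the function name and the replacement parameter are invented for illustration.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text, replacement=""):
    """Return `text` with every character XML 1.0 forbids replaced."""
    return _ILLEGAL_XML_CHARS.sub(replacement, text)
```

This runs on decoded text (str), so the input has to be decoded with the right encoding before it is cleaned.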
Re[2]: coverage.py problem
Thank you very much. Just reading the comments at that URL, I found the reason for my problem. It was here:

    --- coverage.py
    +++ coverage.py
    @@ -464,6 +464,8 @@
         def collect(self):
             cache_dir, local = os.path.split(self.cache)
    +        if not cache_dir:  # these two lines were missing
    +            cache_dir = "."
             for file in os.listdir(cache_dir):
                 if not file.startswith(local):
                     continue

With best regards,
Orin
Re: How many connections can accept a 'binded' socket?
On 20 Mar, 17:44, John Nagle [EMAIL PROTECTED] wrote:
> When you ask questions like this, please specify what operating system you're using. Thanks.

That was Ubuntu Linux 6.10. I submitted a bug report on SourceForge:
http://sourceforge.net/tracker/index.php?func=detail&aid=1685000&group_id=5470&atid=105470

Alex Martelli wrote:
> A shell command "ulimit -Hn" should report on the hard limit of the number of open file descriptors; just "ulimit -n" should report on the current soft limit.

Thank you, I'll try it.
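The same two limits can also be read from inside Python with the standard resource module (Unix only). A small sketch; the function name open_file_limits is made up for illustration.

```python
import resource


def open_file_limits():
    """Return (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
print("soft limit:", soft, "hard limit:", hard)
```

Each accepted connection uses one descriptor, so the soft limit is a practical ceiling on simultaneous connections for a single process. A value equal to resource.RLIM_INFINITY means no limit is set.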
Re: replace illegal xml characters
In [EMAIL PROTECTED], killkolor wrote:
> I am working with InDesign exported xml and parse it in a python application. I learned here: http://boodebr.org/main/python/all-about-python-and-unicode that there actually are sets of illegal unicode characters for xml (and henceforth for every compliant xml parser). I already implemented a regex solution to replace the characters in question, but I wonder if there is an efficient and out-of-the-box solution somewhere out there for this problem. does anybody know?

Does InDesign export broken XML documents? What exactly is your problem?

Ciao,
Marc 'BlackJack' Rintsch
Re: How to set SA_RESTART flag in Python
In [EMAIL PROTECTED], Marco wrote:
> I want to set SA_RESTART flag to restart a system-call. How to do this in Python?

Your question seems to lack a bit of context. Are you talking about system calls into the Linux kernel?

Ciao,
Marc 'BlackJack' Rintsch
Re: an enumerate question
[EMAIL PROTECTED] wrote:
> say i want to enumerate lines of a file
> eg
>     for n,l in enumerate(open(file)):
>         # print next line ie

I think you'd find it much easier to move your frame of reference one line forward and think in terms of remembering the previous line, e.g.:

    for n,curr in enumerate(open(file)):
        if n>1:
            print n,curr
            print m,prev
        m,prev = n,curr

Of course, if the file isn't so big, then you could use readlines as you mention.

Cheers,
Terry

-- 
Terry Hancock ([EMAIL PROTECTED])
Anansi Spaceworks http://www.AnansiSpaceworks.com
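The "remember the previous line" idea can be wrapped in a small generator. This is a Python 3 sketch, not Terry's code; the name with_previous is invented. It yields each line together with the one before it, so the caller never needs to look ahead.

```python
import io


def with_previous(lines):
    """Yield (n, previous_line, current_line) for every line after the first."""
    prev = None
    for n, curr in enumerate(lines):
        if n > 0:
            yield n, prev, curr
        prev = curr


# Works on any iterable of lines, for example an open file.
for n, prev, curr in with_previous(io.StringIO("first\nsecond\nthird\n")):
    print(n, repr(prev), repr(curr))
```

Only two lines are held in memory at a time, so this also suits files that are too large for readlines().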
Re: #!/usr/bin/env python 2.4?
In article [EMAIL PROTECTED], Stargaming wrote:
>     from sys import version_info
>     if version_info[0] < 2 or version_info[1] < 4:
>         raise RuntimeError("You need at least python2.4 to run this script")

That'll fail when the major version number is increased (i.e. Python 3.0). You want:

    if sys.hexversion < 0x020400f0:
        ... error ...
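Another way to write the check, offered as a sketch rather than as what either poster proposed, is to compare sys.version_info with a tuple. Tuples compare element by element, so (3, 0) is correctly treated as newer than (2, 4). The function name require_python is made up for illustration.

```python
import sys


def require_python(minimum=(2, 4)):
    """Raise RuntimeError if the running interpreter is older than `minimum`."""
    if sys.version_info < minimum:
        raise RuntimeError(
            "You need at least Python %d.%d to run this script" % minimum
        )


require_python((2, 4))  # passes silently on any interpreter from 2.4 on
```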
Unicode in Excel files
On Mar 21, 11:37 am, Carsten Haese [EMAIL PROTECTED] wrote:
> On Tue, 2007-03-20 at 16:47 -0700, Gerry wrote:
>> I'm still mystified why: qno was ever unicode,
>
> Thus quoth http://www.lexicon.net/sjmachin/xlrd.html "This module presents all text strings as Python unicode objects."

And why would that be? As the next sentence in the referenced docs says, "From Excel 97 onwards, text in Excel spreadsheets has been stored as Unicode."

Gerry, your "Q1" string was converted to Unicode when you wrote it using pyExcelerator's Worksheet.write() method.

HTH,
John
Re: How to calculate a file of equations in python
    r = ''
    for line in file:
        r = str(eval(r + line))

John wrote:
> Hi,
> I have a text file which contains math expressions, like this:
>
>     134
>     +234
>     +234
>
> (i.e. an operation (e.g. '+') and then a number and then a new line).
> Can you please tell me what is the easiest way to calculate that file? For example, the above example should be 134 + 234 + 234 = 602.
>
> Thank you.
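The eval() approach above runs whatever text the file contains, which is risky if the file is not trusted. A more defensive sketch in Python 3 follows; it is an alternative, not the code from the reply. It assumes each line is an optional operator (+, -, * or /) followed by a number, and the name evaluate_lines is invented.

```python
import operator

# Supported operators; a line without an operator is treated as "+".
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate_lines(lines):
    """Fold the operator/number lines into a running total, starting from 0."""
    total = 0
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # skip blank lines
        if text[0] in OPS:
            op, number = OPS[text[0]], text[1:]
        else:
            op, number = operator.add, text
        value = float(number) if "." in number else int(number)
        total = op(total, value)
    return total
```

With the three lines from the question, evaluate_lines(["134", "+234", "+234"]) gives 602. Anything that is not a number raises ValueError instead of being executed.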
Re: Problem with sockets and python 2.5
On Monday, 19 March 2007, Jose Alberto Reguero wrote:
> I had two programs, server.py and client.py (attached).
>
> 1: server.py at i386 python 2.4, client.py at x86_64 python 2.5: works
> 2: server.py at x86_64 python 2.5, client.py at i386 python 2.4: doesn't work
>
> Any ideas?
> Thanks.
> Jose Alberto

The problem was with kernel 2.6.21-rc4. I installed kernel 2.6.20.3 and the programs work well.

Thanks.
Jose Alberto
encoding characters
First: please excuse my limited English! I have a problem in my Python scripts with encoding characters such as àâäéèêëïîôöûùüç. I use the twisted.web package. The problem occurs when inserting data into the database. One of the errors is:

    exceptions.UnicodeEncodeError: 'ascii' codec can't encode characters in position 25-26: ordinal not in range(128)

http://baghera.crs4.it:8080/sources#tbend

Help me! Thanks
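The error message means that text containing non-ASCII characters was converted to bytes with the default ascii codec somewhere along the way. The sketch below, in Python 3, only reproduces that situation and shows an explicit encoding; it says nothing about where the conversion happens in the poster's code, and the helper name to_db_bytes and the choice of UTF-8 are assumptions.

```python
def to_db_bytes(text, encoding="utf-8"):
    """Encode text explicitly instead of relying on an implicit ASCII default."""
    return text.encode(encoding)


sample = "àé"
print(to_db_bytes(sample))          # UTF-8 can represent these characters

try:
    to_db_bytes(sample, "ascii")    # ASCII cannot, so this raises
except UnicodeEncodeError as exc:
    print("ascii failed:", exc)
```

The usual fix is to pick one encoding that the database connection expects and apply it at the boundary, every time.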
Re: Still the __new__ hell ...
Paulo da Silva wrote:
> Bruno Desthuilliers wrote:
>> Paulo da Silva wrote:
>>> ...
>>>     class MyDate(date):
>>>         def __new__(cls,year,month=None,day=None):
>>>             if type(year) is str:
>>
>> And what if it's a unicode string ? The correct idiom here is:
>>             if isinstance(year, basestring):
>
> Thanks.
> If I do type(year) I get either int or str (may be unicode for unicode strings) but never anything like basestring.

It's an abstract class.

> As a relatively inexperienced Python user, how could I know that a 'string' is an instance of basestring?

By reading this newsgroup ?-)

> x=u"xx"; help(x) says this is unicode based on basestring but help does not work for x="".

help(str)

> Maybe the Python tutorial should be upgraded to include these new concepts. Also covering the basics of __init__/__new__ (why have both?)

It's a long story. But it's a GoodThing(tm) IMVHO.
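To make the idiom concrete, here is a sketch of such a date subclass in Python 3, where basestring no longer exists and isinstance(year, str) is the corresponding check. The "YYYY-MM-DD" input format is an assumption, since the original class body is elided in the thread.

```python
from datetime import date


class MyDate(date):
    """date subclass that also accepts a single 'YYYY-MM-DD' string."""

    def __new__(cls, year, month=None, day=None):
        # date is immutable, so the conversion has to happen in __new__,
        # before the object exists; __init__ would be too late.
        if isinstance(year, str):
            year, month, day = (int(part) for part in year.split("-"))
        return super().__new__(cls, year, month, day)


print(MyDate("2007-03-21"), MyDate(2007, 3, 21))
```

This also hints at why both methods exist: __new__ creates the object and is the only hook for immutable types, while __init__ configures an object that already exists.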
Exceptions and unicode messages
This works:

    >>> raise StandardError(u'Wrong type')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    StandardError: Wrong type

but this doesn't, in Finnish:

    >>> raise StandardError(u'Väärä tyyppi')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    StandardError

Any solution in Python?

TV
Re: Wanted: a python24 package for Python 2.3
Gerald Klix wrote:
> Hi,
> You can't import subprocess from __future__; only syntactic and semantic changes that will become standard features in future Python versions can be activated that way.
>
> You can copy the subprocess module from Python 2.4 to somewhere where it will be found by Python 2.3. At least subprocess is importable after that:
>
> --- snip ---
> [EMAIL PROTECTED]:~/ttt cp -av /usr/local/lib/python2.4/subprocess.py .
> »/usr/local/lib/python2.4/subprocess.py« -> »./subprocess.py«
> [EMAIL PROTECTED]:~/ttt python2.3
> Python 2.3.3 (#1, Jun 29 2004, 14:43:40)
> [GCC 3.3 20030226 (prerelease) (SuSE Linux)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import subprocess

You're quite right about the use of __future__. I decided to put subprocess in a package, so that my system can choose which one to find, whether running Python 2.3 or 2.4. (Well, in 2.3 there's no choice, but in 2.4 I don't want the "just for 2.3" module to hide the real 2.4 module.)

The responses I've had indicate that my approach might be a good idea, and might be useful to others. For me, that's enough for now.

-- 
Jonathan
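One common way to let the interpreter choose, sketched here as an assumption about how such a package might be used, is a guarded import: prefer the standard module and fall back to the backport only when it is missing. The package name python24 is taken from the subject line; its layout is hypothetical.

```python
try:
    # Python 2.4 and later ship subprocess in the standard library.
    import subprocess
except ImportError:
    # Hypothetical backport package for Python 2.3, never reached on 2.4+.
    from python24 import subprocess

print(subprocess.__name__)
```

Keeping the backport inside a package means it can never shadow the real module on newer interpreters.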
Re: Exceptions and unicode messages
Hey, anybody, please tell me... how can we get the properties of a file in Python?

On 3/21/07, Tuomas [EMAIL PROTECTED] wrote:
> This works:
>
>     >>> raise StandardError(u'Wrong type')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     StandardError: Wrong type
>
> but this doesn't, in Finnish:
>
>     >>> raise StandardError(u'Väärä tyyppi')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     StandardError
>
> Any solution in Python?
>
> TV
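On the file-properties question, the standard library's os.stat() returns size, timestamps and permission bits. A minimal Python 3 sketch; the function name file_properties and the choice of fields are illustrative only.

```python
import os
import time


def file_properties(path):
    """Return a few basic properties of the file at `path`."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "modified": time.ctime(info.st_mtime),
        "permissions": oct(info.st_mode & 0o777),
    }
```

os.path.getsize() and os.path.getmtime() are shortcuts when only one of these values is needed.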
Re: Unicode in Excel files
On Mar 21, 6:07 am, John Machin [EMAIL PROTECTED] wrote:
> On Mar 21, 11:37 am, Carsten Haese [EMAIL PROTECTED] wrote:
>> On Tue, 2007-03-20 at 16:47 -0700, Gerry wrote:
>>> I'm still mystified why: qno was ever unicode,
>>
>> Thus quoth http://www.lexicon.net/sjmachin/xlrd.html "This module presents all text strings as Python unicode objects."
>
> And why would that be? As the next sentence in the referenced docs says, "From Excel 97 onwards, text in Excel spreadsheets has been stored as Unicode."
>
> Gerry, your "Q1" string was converted to Unicode when you wrote it using pyExcelerator's Worksheet.write() method.
>
> HTH,
> John

John,

That helps a lot. Thanks again!

Gerry
Re: Still the __new__ hell ...
Paulo da Silva [EMAIL PROTECTED] wrote:
> As a relatively inexperienced Python user, how could I know that a 'string' is an instance of basestring?

I think a good tip for anyone learning Python (or upgrading from an earlier version) is to do:

    dir(__builtins__)

at the interactive prompt and read through the list it produces, working out what each builtin does. Some of them you can ignore (1), some you can file away as interesting to know they exist but you don't need to know the details today (2), and others are indispensable (3).

Do that and you'll find basestring (category 2 until you needed to ask your question) between apply (category 1) and bool (category 3).
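The same survey works in a script. The sketch below is for Python 3, where the module is imported as builtins and where basestring and apply are gone, so the list will differ from the 2.x one described above. The helper name public_builtin_names is invented.

```python
import builtins


def public_builtin_names():
    """Return the sorted names of builtins that do not start with an underscore."""
    return sorted(name for name in dir(builtins) if not name.startswith("_"))


for name in public_builtin_names():
    print(name)
```

Following up any unfamiliar name with help(name) at the prompt gives the details.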
flattening/rolling up/aggregating a large sorted text file
Hi,

Given a large ASCII file (delimited or fixed width) with one ID field and dimensions/measures fields, sorted by dimensions, I'd like to "flatten" or "roll up" the file by creating new columns: one for each combination of dimension level, and summing up measures over all records for a given ID.

If the wheel has already been invented, great, please point me in the right direction. If not, please share some pointers on how to think about this problem in order to write efficient code.

Is a hash with dimension level combinations a good approach, with values reset at each new ID level?

I know MySQL, Oracle etc. will do this, but they all have a cap on the number of columns allowed. SAS will allow unlimited columns, but I don't own SAS.

Thanks.

    ID,color,shape,msr1
    --
    001, blue, square, 4
    001, red , circle,5
    001, red, circle,6

    ID, blue_circle, blue_square, red_circle, red_square
    --
    001,0,4,11,0
    002 ...
scipy installations
Hi all,

I was trying to install SciPy (Scientific Python) 0.3.2 on LINUX 3, but I am facing some problems. Please help me in resolving this issue.

Here are the steps that I have followed.

Installed prerequisites for the scipy installation:

1. Python-2.4.1
   $./configure
   $make
   $make install

2. ATLAS
   $./configure
   $make
   $make install arch=Linux_P4ESSE3

3. Numerical Python-24.0b2
   $python setup.py install

4. F2PY-2.45.241_1926 (Fortran to Python interface generator)
   Prerequisite: (i). NumPy_ -1.0
   $python setup.py install

5. C, C++, Fortran 77 compilers - gcc 3.4.6
   $./configure
   $make
   $make install

SciPy_complete-0.3.2
   $python setup.py install

All 5 installations were successful, but I could not install scipy successfully. The following error message is shown:

error: Command "gcc -shared build/temp.linux-x86_64-2.2/gistCmodule.o build/temp.linux-x86_64-2.2/gist.o build/temp.linux-x86_64-2.2/tick.o build/temp.linux-x86_64-2.2/tick60.o build/temp.linux-x86_64-2.2/engine.o build/temp.linux-x86_64-2.2/gtext.o build/temp.linux-x86_64-2.2/draw.o build/temp.linux-x86_64-2.2/draw0.o build/temp.linux-x86_64-2.2/clip.o build/temp.linux-x86_64-2.2/gread.o build/temp.linux-x86_64-2.2/gcntr.o build/temp.linux-x86_64-2.2/hlevel.o build/temp.linux-x86_64-2.2/ps.o build/temp.linux-x86_64-2.2/cgm.o build/temp.linux-x86_64-2.2/eps.o build/temp.linux-x86_64-2.2/style.o build/temp.linux-x86_64-2.2/xfancy.o build/temp.linux-x86_64-2.2/xbasic.o build/temp.linux-x86_64-2.2/dir.o build/temp.linux-x86_64-2.2/files.o build/temp.linux-x86_64-2.2/fpuset.o build/temp.linux-x86_64-2.2/pathnm.o build/temp.linux-x86_64-2.2/timew.o build/temp.linux-x86_64-2.2/uevent.o build/temp.linux-x86_64-2.2/ugetc.o build/temp.linux-x86_64-2.2/umain.o build/temp.linux-x86_64-2.2/usernm.o build/temp.linux-x86_64-2.2/slinks.o build/temp.linux-x86_64-2.2/colors.o build/temp.linux-x86_64-2.2/connect.o build/temp.linux-x86_64-2.2/cursors.o build/temp.linux-x86_64-2.2/errors.o build/temp.linux-x86_64-2.2/events.o build/temp.linux-x86_64-2.2/fills.o build/temp.linux-x86_64-2.2/fonts.o build/temp.linux-x86_64-2.2/images.o build/temp.linux-x86_64-2.2/lines.o build/temp.linux-x86_64-2.2/pals.o build/temp.linux-x86_64-2.2/pwin.o build/temp.linux-x86_64-2.2/resource.o build/temp.linux-x86_64-2.2/rgbread.o build/temp.linux-x86_64-2.2/textout.o build/temp.linux-x86_64-2.2/rect.o build/temp.linux-x86_64-2.2/clips.o build/temp.linux-x86_64-2.2/points.o build/temp.linux-x86_64-2.2/hash.o build/temp.linux-x86_64-2.2/hash0.o build/temp.linux-x86_64-2.2/mm.o build/temp.linux-x86_64-2.2/alarms.o build/temp.linux-x86_64-2.2/pstrcpy.o build/temp.linux-x86_64-2.2/pstrncat.o build/temp.linux-x86_64-2.2/p595.o build/temp.linux-x86_64-2.2/bitrev.o build/temp.linux-x86_64-2.2/bitlrot.o build/temp.linux-x86_64-2.2/bitmrot.o -LLib/xplt/. -LLib/xplt/src -L/usr/X11R6/lib -L/usr/lib -L/usr/X11R6/lib -Lbuild/temp.linux-x86_64-2.2 - -lm -o build/lib.linux-x86_64-2.2/scipy/xplt/gistC.so" failed with exit status 1

Thanks in advance.

Regards,
Balaraju
Re: flattening/rolling up/aggregating a large sorted text file
[EMAIL PROTECTED] wrote:
> Hi,
>
> Given a large ascii file (delimited or fixed width) with one ID field and dimensions/measures fields, sorted by dimensions, I'd like to "flatten" or "rollup" the file by creating new columns: one for each combination of dimension level, and summing up measures over all records for a given ID.
>
> If the wheel has already been invented, great, please point me in the right direction. If not, please share some pointers on how to think about this problem in order to write efficient code.
>
> Is a hash with dimension level combinations a good approach, with values reset at each new ID level?
>
> I know mysql, Oracle etc will do this, but they all have a cap on # of columns allowed. SAS will allow unlimited columns, but I don't own SAS.
>
> Thanks.
>
>     ID,color,shape,msr1
>     --
>     001, blue, square, 4
>     001, red , circle,5
>     001, red, circle,6
>
>     ID, blue_circle, blue_square, red_circle, red_square
>     --
>     001,0,4,11,0
>     002 ...

It seems a bit wrong-headed to force this problem to fit a solution where you define relations with a variable number of columns when the natural way to solve it would seem to be to sum the msr1 values for each unique combination of ID, color and shape. That's a pretty straightforward relational problem.

So, is there some reason the result *has* to have that variable number of columns?

regards
Steve
-- 
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Recent Ramblings http://holdenweb.blogspot.com
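Summing msr1 per unique (ID, color, shape) combination, as suggested above, takes only a dictionary. The following Python 3 sketch uses invented function names and assumes the rows have already been parsed into tuples; the second function shows how the wide layout from the question could be derived afterwards if it is really needed.

```python
from collections import defaultdict


def roll_up(rows):
    """Sum the measure for each unique (ID, color, shape) combination."""
    totals = defaultdict(int)
    for rec_id, color, shape, measure in rows:
        totals[(rec_id, color, shape)] += measure
    return dict(totals)


def widen(totals):
    """Regroup the totals as {ID: {"color_shape": total}}, one entry per ID."""
    wide = defaultdict(dict)
    for (rec_id, color, shape), value in totals.items():
        wide[rec_id]["%s_%s" % (color, shape)] = value
    return dict(wide)


rows = [
    ("001", "blue", "square", 4),
    ("001", "red", "circle", 5),
    ("001", "red", "circle", 6),
]
print(widen(roll_up(rows)))
```

Combinations that never occur for an ID are simply absent and can be filled with 0 when the output is written. For a file too large for memory, the same idea could be applied one ID at a time if the file is grouped by ID.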
Re: On Java's Interface (the meaning of interface in computer programing)
Xah Lee wrote:
> In a functional language, a function can be specified by its name and

Are you sure you know what a functional language is?

> parameter specs. For example:
> f(3)
> f(3, [9,2])
> f("some string")

This is not really typical syntax for a functional language. LISP, for example, has the function name as an element of a list. (Some might argue that LISP isn't exactly a functional language.)

Also, since you are commenting on Java, you should use Java-like syntax rather than [9,2]. What is [9,2] intended to represent? The range of integers decreasing from 9 to 2, inclusive?

> For another example, usually a program needs to talk to another software such as a database software.

Interesting use of the word "software".

> In essence, making the database useful to other software.

This is not a sentence.

> Such a list of function spec is often called API, which stands for Application Programing Interface.

"an API"

> The API terminology is abused by the marketing-loving Sun Microsystems by calling the Java language's documentation as “The Java API”, even though Java the language and its paraphernalia of libraries and hardware-emulation system (all together jargonized as “the Java Platform”) isn't a Application nor Interface. (a [sic] API implies that there are two disparate entities involved, which are allowed to communicate thru [sic] it. In the case of “The Java API”, it's one entity talking to itself.).

This is incorrect in every factual detail. And what's with the editorial comment in the middle of the exposition ("marketing-loving", "jargonized")? How does that help explain the concepts, even if it were supportable by the evidence?

Sun calls the API documentation the "Java API documentation", not the "Java API", and not the language documentation, and the API is indeed an interface. An API need not be, and quite often is not, an application - being an application is in no wise part of being an API. And why in the world did you capitalize "Application" and "Interface"?

It's "an API", not "a API". It's "through", not "thru".

The statement about an API having to do with "two disparate entities" makes no sense. There is certainly nothing in the API that one can characterize as "one entity talking to itself". What entities do you imagine are involved?

> In general, the interface concept in programing is a sort of specification that allows different entities to call and make use of the other [sic], with the implication that the caller need not know what's behind the facade.

There is no antecedent for "the other", and you haven't defined "entities", and the word "interface" has a number of meanings "in general ... in programming". You should focus on the Java meaning (and your grammar).

> In the Object Oriented Programing Paradigm [sic], a new concept arose, that is the “interface” aspect of a class.

Historical citation needed. And an interface is not an aspect of a class.

> As we've seen, a function has parameter spec [sic] that is all there it [sic] is a user needs to know for using it. In Java, this is the method's “signature”. Now, as the methodology of the OOP experience multiplies, it became apparent that the interface concept can be applied to Classes as well. Specifically: the interface of a class is the class's methods.

OK, I've had enough. I'd say you need a good editor to clean up the grammar, but then all you'd have is a better-written incorrect explanation.

-- Lew
Re: replace illegal xml characters
> Does InDesign export broken XML documents? What exactly is your problem?

Yes, unfortunately it does. It uses all possible Unicode characters, though not all are allowed in valid XML (see the link in the first post). In any case, my application should be checking whether the incoming XML is valid and replacing all invalid characters. Is there something out there to do this?
Garbage collection
Hi all

I suspect I may be missing something vital here, but Python's garbage collection doesn't seem to work as I expect it to. Here's a small test program which shows the problem on python 2.4 and 2.5:

    $ python2.5
    Python 2.5 (release25-maint, Dec 9 2006, 15:33:01)
    [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    (at this point, Python is using 15MB)
    >>> a = range(int(1e7))
    (at this point, Python is using 327MB)
    >>> a = None
    (at this point, Python is using 251MB)
    >>> import gc
    >>> gc.collect()
    0
    (at this point, Python is using 252MB)

Is there something I've forgotten to do? Why is Python still using such a lot of memory?

Thanks!

-- 
I'm at CAMbridge, not SPAMbridge
Message (Your message dated Wed, 21 Mar 2007 13:15:54...)
Your message dated Wed, 21 Mar 2007 13:15:54 +0100 with subject Returned mail: see transcript for details has been submitted to the moderator of the CLASS-84 list: Martha Hartfiel '83 [EMAIL PROTECTED]. -- http://mail.python.org/mailman/listinfo/python-list
Re: Exceptions and unicode messages
On Mar 21, 6:03 am, Tuomas [EMAIL PROTECTED] wrote:
> This works:
>
>     >>> raise StandardError(u'Wrong type')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     StandardError: Wrong type
>
> but this doesn't, in Finnish:
>
>     >>> raise StandardError(u'Väärä tyyppi')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     StandardError
>
> Any solution in Python?
>
> TV

When I do this, mine says "StandardError: unprintable StandardError object" on both Python 2.4 and 2.5. I tried setting them to unicode, but that didn't help. One way around this may be to subclass the errors you want and do some custom processing that way.

Mike
Re: Exceptions and unicode messages
On Mar 21, 6:03 am, Tuomas [EMAIL PROTECTED] wrote:
> This works:
>
>     >>> raise StandardError(u'Wrong type')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     StandardError: Wrong type
>
> but this doesn't, in Finnish:
>
>     >>> raise StandardError(u'Väärä tyyppi')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     StandardError
>
> Any solution in Python?
>
> TV

You may even need to change your computer's default character set to make this work. This guy did it for one of his programs: http://aroberge.blogspot.com/2007/01/unicode-headaches-and-solution.html

Mike
Re: On Java's Interface (the meaning of interface in computer programing)
Don't Feed The Trolls :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: Garbage collection
Tom Wright wrote:
> Hi all
>
> I suspect I may be missing something vital here, but Python's garbage collection doesn't seem to work as I expect it to. Here's a small test program which shows the problem on python 2.4 and 2.5:
> ... skip ...
> (at this point, Python is using 252MB)
>
> Is there something I've forgotten to do? Why is Python still using such a lot of memory?
>
> Thanks!

How do you know the amount of memory used by Python? ps, top, or something else?

-- 
Thinker Li - [EMAIL PROTECTED] [EMAIL PROTECTED]
http://heaven.branda.to/~thinker/GinGin_CGI.py
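Besides ps and top, a process can report its own peak memory through the standard resource module (Unix only). A minimal sketch with an invented function name; note that ru_maxrss is in kilobytes on Linux and in bytes on macOS, and that it is a high-water mark, so it never goes down when objects are freed.

```python
import resource


def peak_rss():
    """Return the peak resident set size of this process so far (see units above)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


before = peak_rss()
data = list(range(10 ** 6))  # allocate something sizeable
after = peak_rss()
del data                     # freeing the list does not lower the peak
print(before, after, peak_rss())
```

Figures from ps or top show the current size instead, which may also stay high after a del because freed memory is not always handed back to the operating system.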
Technical Answer - Protecting code in python
Hello All,

I have a hard question. Every time I look for an answer to it, the discussion leaves the technical domain and moves into the moral/social domain.

First, I live in the third world, with bad government, bad education, bad police and a lot of taxes and bills to pay, and yes, I live in a democratic state (corrupt, but democratic). So please, don't try to convince me about the social / economic / open source / give to all / be open / all people are honest until proven otherwise / dance with the rabbits... Remember, I need to pay for bills and security.

Now the technical questions:

1 - Is there a way to write a program in Python and protect it? I am not talking about ultra hard-core protection, just a simple one that will stop 90% of script kiddies.

2 - If I put the code on the web as a web service, how can I protect my code from being ripped? Is there a way to prevent someone from using my site and ripping the .py files?

Thanks, and sorry for the introduction.
Re: encoding characters
Valentina Marotto wrote: The problem is about the insertion data into database. one of errors is: exceptions.UnicodeEncodeError: 'ascii' codec can't encode characters in position 25-26: ordinal not in range(128) http://baghera.crs4.it:8080/sources#tbend help me! Thanks More information, please. For example, which DBMS is used? Please also post the snippet that runs into the error. -- Thinker Li - [EMAIL PROTECTED] http://heaven.branda.to/~thinker/GinGin_CGI.py
Re: Technical Answer - Protecting code in python
flit wrote: 1 - There is a way to make some program in python and protects it? I am not talking about ultra hard-core protection, just a simple one that will stop 90% script kiddies. Put it in an executable? It's more hidden than protected, but it will stop a fair few non-experts. I use and have been happy with pyinstaller, though there are other options. I use it more for ease of distribution to non-techy users, but it's also a simple way to hide your code. 2 - If I put the code in web like a web service, how can I protect my code from being ripped? There is a way to avoid someone using my site and ripping the .py files? Configure your web server properly and it will never serve up the .py files, only the results generated by them. I've not done it with Python, but I have set up a similar thing with Apache and XSLT where it will only give the generated data, not the code which created it. This is true even if there's an error in the code: it will just give HTTP 500 Internal Server Error and dump something a bit more useful to its error log. -- I'm at CAMbridge, not SPAMbridge
Re: Garbage collection
Tom> I suspect I may be missing something vital here, but Python's garbage collection doesn't seem to work as I expect it to. Here's a small test program which shows the problem on python 2.4 and 2.5:
Tom> (at this point, Python is using 15MB)
Tom> >>> a = range(int(1e7))
Tom> >>> a = None
Tom> >>> import gc
Tom> >>> gc.collect()
Tom> 0
Tom> (at this point, Python is using 252MB)
Tom> Is there something I've forgotten to do? Why is Python still using such a lot of memory?

You haven't forgotten to do anything. Your attempts at freeing memory are being thwarted (in part, at least) by Python's int free list. I believe the int free list remains after the 10M individual ints' refcounts drop to zero. The large storage for the list is grabbed in one gulp and thus mmap()d, I believe, so it is reclaimed by being munmap()d, hence the drop from 320+MB to 250+MB. I haven't looked at the int free list or obmalloc implementations in a while, but if the free list does return any of its memory to the system it probably just calls the free() library function. Whether or not the system actually reclaims any memory from your process depends on the details of the malloc/free implementation. That is, the behavior is outside Python's control. Skip
Re: Garbage collection
Thinker wrote: How do you know amount of memory used by Python? ps, top or something?

$ ps up `pidof python2.5`
USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
tew24    26275  0.0 11.9 257592 243988 pts/6    S+   13:10   0:00 python2.5

VSZ is Virtual Memory Size (i.e. total memory used by the application). RSS is Resident Set Size (i.e. non-swapped physical memory). -- I'm at CAMbridge, not SPAMbridge
Re: Technical Answer - Protecting code in python
flit wrote: Hello All, I have a hard question, every time I look for this answer its get out from the technical domain and goes on in the moral/social domain. First, I live in third world with bad gov., bad education, bad police and a lot of taxes and bills to pay, and yes I live in a democratic state (corrupt, but democratic). So please, don't try to convince me about the social / economical / open source / give to all / be open / all people are honest until prove contrary / dance with the rabbits... Remember I need to pay bills and security. Now the technical question: Most of these discussions aren't about open source or morals, but about exactly what you ask: technicalities. A friend of mine is so f**ing fluent with a disassembler that he immediately has whatever amount of credits he wants in your usual simulation-style game. It's just a question of whether the hurdles you put up are high enough for your intended audience, and for some reason people feel that compiled code would be much safer. It's not, unless very special measures are taken (e.g. Skype), but that is then also beyond the common C-compiler run. And what is almost never the case is that you've programmed something that would be interesting for others to rip apart and use in pieces. Sorry, but 99% of all code is just a bit of glue logic, and the reluctance of developers to even use explicitly bought and well-documented libraries instead of rolling out their own customized solution illustrates that adjusting your mindset to that of somebody else is much more of a problem than actually writing amounts of (mostly trivial) code. The only _really_ interesting thing is copy protection. But that's a problem for everyone, including the compiler-camp buddies. 1 - There is a way to make some program in python and protects it? I am not talking about ultra hard-core protection, just a simple one that will stop 90% script kiddies. If you can, just deliver the .pyc files. That should be hard enough for most people. 2 - If I put the code in web like a web service, how can I protect my code from being ripped? There is a way to avoid someone using my site and ripping the .py files? A service doesn't expose those files unless you somehow instruct it to do so. Diez
Re: Garbage collection
Tom Wright wrote: Thinker wrote: How do you know amount of memory used by Python? ps, top or something?

$ ps up `pidof python2.5`
USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
tew24    26275  0.0 11.9 257592 243988 pts/6    S+   13:10   0:00 python2.5

VSZ is Virtual Memory Size (ie. total memory used by the application) RSS is Resident Set Size (ie. non-swapped physical memory)

This is the amount of memory allocated by the process, not by the Python interpreter. It is managed by malloc() of the C library. When you free a block of memory with the free() function, it only returns the memory to the C library for later use; the C library does not always return the memory to the kernel. Since a modern OS has virtual memory, inactive memory will be paged out to disk when more physical memory is needed. It doesn't hurt much if you have enough swap space. What you get from the ps command is the memory allocated by the process; it doesn't mean that all of it is in use by the Python interpreter. -- Thinker Li - [EMAIL PROTECTED] http://heaven.branda.to/~thinker/GinGin_CGI.py
Re: #!/usr/bin/env python 2.4?
Jon Ribbens [EMAIL PROTECTED] wrote: You want: if sys.hexversion < 0x020400f0: ... error ... Readability counts. if sys.version_info < (2, 4): ... error ... -- \S -- [EMAIL PROTECTED] -- http://www.chaos.org.uk/~sion/ "Frankly I have no feelings towards penguins one way or the other" -- Arthur C. Clarke her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump
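A minimal sketch of the tuple-based version check suggested above. The helper name require_python and its optional current parameter are assumptions made for illustration (the parameter only exists so the check can be tried against other version tuples); the comparison itself is ordinary tuple comparison on sys.version_info.

```python
import sys


def require_python(minimum, current=None):
    """Return True if `current` (default: the running interpreter) is at least `minimum`.

    `minimum` is a tuple such as (2, 4). Only as many fields as `minimum`
    contains are compared, so (2, 4) matches 2.4.0, 2.4.3, 2.5 and so on.
    """
    if current is None:
        current = sys.version_info
    return tuple(current[:len(minimum)]) >= tuple(minimum)


# Hypothetical usage at the top of a script.
if not require_python((2, 4)):
    sys.exit("This script needs Python 2.4 or later")
```

Comparing tuples reads more naturally than comparing against a hex constant, which is the point of the "Readability counts" remark.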
Re: Technical Answer - Protecting code in python
On Mar 21, 2:36 pm, flit [EMAIL PROTECTED] wrote: Now the technical question: 1 - There is a way to make some program in python and protects it? I am not talking about ultra hard-core protection, just a simple one that will stop 90% script kiddies. Freeze. That should be hard enough for 99% of users. 2 - If I put the code in web like a web service, how can I protect my code from being ripped? There is a way to avoid someone using my site and ripping the .py files? That's more of a question about the security of that particular server. Normally, if the server is well set up, there is no way that an unauthorized user could access the source code from outside.
Re: Technical Answer - Protecting code in python
On Wed, 2007-03-21 at 06:36 -0700, flit wrote: Hello All, I have a hard question, every time I look for this answer its get out from the technical domain and goes on in the moral/social domain. First, I live in third world with bad gov., bad education, bad police and a lot of taxes and bills to pay, and yes I live in a democratic state (corrupt, but democratic). So please, don't try to convince me about the social / economical / open source / give to all / be open / all people are honest until prove contrary / dance with the rabbits... Remember I need to pay bills and security. Developing open-source code and getting paid are not necessarily mutually exclusive, but I digress... Now the technical question: 1 - There is a way to make some program in python and protects it? I am not talking about ultra hard-core protection, just a simple one that will stop 90% script kiddies. Not providing .py files and instead only providing .pyc files is perfectly viable, really easy to do, and provides adequate protection against casual/accidental code inspection. A sufficiently determined person will be able to retrieve the source code, but that is also true for any other imaginable protection scheme. In order for the user's computer to execute your code, you have to give the user's computer your code. Once that happens it's only a question of how determined you are to obfuscate the code and how determined they are to break your obfuscation. 2 - If I put the code in web like a web service, how can I protect my code from being ripped? There is a way to avoid someone using my site and ripping the .py files? Providing the code as a service instead means that you don't have to give the user your code, since the code runs on your hardware. As long as the server is properly configured, it will never serve the source code. 
You would still have to worry about malicious users trying to gain unauthorized root access to your server, and then they can do whatever they want, including looking at your super secret and super valuable code. It all comes back down to the question of how determined you are to protect your code and how determined your users are to break into it. -Carsten -- http://mail.python.org/mailman/listinfo/python-list
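A minimal sketch of the "ship only .pyc files" suggestion, using the standard library's py_compile module. The file name secret_logic.py, its contents, and the helper compile_to_pyc are made up for illustration. Note that a .pyc file is tied to the Python version that produced it, and that, as said above, it only protects against casual inspection.

```python
import os
import py_compile
import tempfile


def compile_to_pyc(source_path, pyc_path):
    """Byte-compile source_path and write the result to pyc_path."""
    # doraise=True turns compile errors into exceptions instead of printed messages.
    py_compile.compile(source_path, cfile=pyc_path, doraise=True)
    return pyc_path


# Create a throwaway module so the example is self-contained.
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "secret_logic.py")
with open(source_path, "w") as handle:
    handle.write("def answer():\n    return 42\n")

pyc_path = compile_to_pyc(source_path, os.path.join(workdir, "secret_logic.pyc"))
print("compiled file written to", pyc_path)
```

For a whole package, the compileall module does the same job directory by directory.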
writing dictionaries to a file
Hi All, I have two dictionaries:

somex: {'unit': 00, 'type': 'processing'}
somey: {'comment': 'fair', 'code': 'aaa'}
somey: {'comment': 'good', 'code': 'bbb'}
somey: {'comment': 'excellent', 'code': 'ccc'}

Now I would like to write this to a file in the following format (unit, code). The output written to the file should be as follows:

00, aaa
00, bbb
00, ccc

Can someone help me? Regards, Kavitha
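One possible approach, as a sketch only. It assumes the data is held as a single somex dictionary plus a list of somey dictionaries (called someys here), and that the unit is stored as the string '00' so the leading zero survives; neither detail is stated in the question. The helper name write_unit_codes is likewise made up.

```python
import io


def write_unit_codes(somex, someys, out):
    """Write one 'unit, code' line per dictionary in someys to the file-like object out."""
    for entry in someys:
        out.write("%s, %s\n" % (somex["unit"], entry["code"]))


somex = {"unit": "00", "type": "processing"}
someys = [
    {"comment": "fair", "code": "aaa"},
    {"comment": "good", "code": "bbb"},
    {"comment": "excellent", "code": "ccc"},
]

# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
write_unit_codes(somex, someys, buffer)
result = buffer.getvalue()
print(result, end="")
```

To write to disk instead, pass an open file object, for example the one returned by open("output.txt", "w"), in place of the buffer.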
why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
Hi all, I looked through the PEPs and didn't find a proposal to remove brackets and commas to make Python function call syntax Caml- or Tcl-like: instead of

result = myfun(param1, myfun2(param5, param8), param3)

just make it possible to use

result = myfun param1 (myfun2 param5 param8) param3

It would reduce the length of code lines and make them more readable, plus there would be no need to write annoying characters. Maybe it will require more work than I suppose, for example handling things like

result = myfun(param1, myfun2(param5, param8), param3=15, param4=200)

as

result = myfun param1 (myfun2 param5 param8) param3=15 param4=200 #so it needs some more efforts to decode by compiler

but anyway I think it is worth it. Plus, it will not yield incompatibilities with previous Python versions. WBR, D.
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
Hello Dmitrey, I looked to the PEPs didn't find a proposition to remove brackets commas for to make Python func call syntax caml- or tcl- like: instead of result = myfun(param1, myfun2(param5, param8), param3) just make possible using result = myfun param1 (myfun2 param5 param8) param3 If you have result = func1 func2 arg is it result = func1(func2, arg) or result = func1(func2(arg)) Miki [EMAIL PROTECTED] http://pythonwise.blogspot.com -- http://mail.python.org/mailman/listinfo/python-list
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
I think it should result in result = func1(func2, arg). If you want result = func1(func2(arg)), you should use result = func1 (func2 arg). If ... = word1 word2 word3 ..., then only word1 should be a call: a call to func word1 with parameters word2, word3 etc. If you have result = func1 func2 arg is it result = func1(func2, arg) or result = func1(func2(arg)) Miki [EMAIL PROTECTED] http://pythonwise.blogspot.com
Testing Python updates
Hello: What is the methodology for testing the updates to the Python language? Thanks Matthew Harelick -- http://mail.python.org/mailman/listinfo/python-list
Re: Garbage collection
[EMAIL PROTECTED] wrote: You haven't forgotten to do anything. Your attempts at freeing memory are being thwarted (in part, at least) by Python's int free list. I believe the int free list remains after the 10M individual ints' refcounts drop to zero. The large storage for the list is grabbed in one gulp and thus mmap()d I believe, so it is reclaimed by being munmap()d, hence the drop from 320+MB to 250+MB. I haven't looked at the int free list or obmalloc implementations in awhile, but if the free list does return any of its memory to the system it probably just calls the free() library function. Whether or not the system actually reclaims any memory from your process is dependent on the details of the malloc/free implementation's details. That is, the behavior is outside Python's control.

Ah, thanks for explaining that. I'm a little wiser about memory allocation now, but am still having problems reclaiming memory from unused objects within Python. If I do the following:

(memory use: 15 MB)
>>> a = range(int(4e7))
(memory use: 1256 MB)
>>> a = None
(memory use: 953 MB)

...and then I allocate a lot of memory in another process (eg. open a load of files in the GIMP), then the computer swaps the Python process out to disk to free up the necessary space. Python's memory use is still reported as 953 MB, even though nothing like that amount of space is needed. From what you said above, the problem is in the underlying C libraries, but is there anything I can do to get that memory back without closing Python? -- I'm at CAMbridge, not SPAMbridge
Re: Garbage collection
Tom> ...and then I allocate a lot of memory in another process (eg. open a load of files in the GIMP), then the computer swaps the Python process out to disk to free up the necessary space. Python's memory use is still reported as 953 MB, even though nothing like that amount of space is needed. From what you said above, the problem is in the underlying C libraries, but is there anything I can do to get that memory back without closing Python?

Not really. I suspect the unused pages of your Python process are paged out, but that Python has just what it needs to keep going. Memory contention would be a problem if your Python process wanted to keep that memory active at the same time as you were running GIMP. I think the process's resident size is more important here than virtual memory size (as long as you don't exhaust swap space). Skip
Re: replace illegal xml characters
On Mar 21, 8:03 am, killkolor [EMAIL PROTECTED] wrote: Does InDesign export broken XML documents? What exactly is your problem? yes, unfortunately it does. it uses all possible unicode characters, though not all are alowed in valid xml (see link in the first post). in any way for my application i should be checking if the xml that comes in is valid and replace all non-valid characters. is there something out there to do this? You might be able to use Beautiful Soup: http://www.crummy.com/software/BeautifulSoup/ There are also some good examples for parsing XML at http://www.devarticles.com/c/a/XML/Parsing-XML-with-SAX-and-Python/ and the Dive Into Python site. Mike -- http://mail.python.org/mailman/listinfo/python-list
Re: difference between urllib2.urlopen and firefox view 'page source'?
Group: Thank you for all the informative replies, they have helped me figure things out. Next up is learning Beautiful Soup. Thank you for the code example, but I am trying to learn how to 'screen scrape', because Yahoo does make historical stock data available using the CSV format, but they do not do this for stock options, which is what I am ultimately attempting to scrape. Here is what I have so far, I know how broken and ugly it is:

    # Python 2 code: urllib2 and the print statement
    import urllib2, sys
    from BeautifulSoup import BeautifulSoup

    # fetch the options page for the ticker symbol given on the command line
    page = urllib2.urlopen("http://finance.yahoo.com/q/op?s=" + sys.argv[1])
    soup = BeautifulSoup(page)
    # the current price sits in <big><b> inside the table with id "yfncsubtit"
    print soup.find("table", {"id": "yfncsubtit"}).big.b.contents[0]

This actually works, and will print out the current stock price for whatever ticker symbol you supply as the command line argument when you launch this script. Later I will add error checking, etc. Any advice on how I am using Beautiful Soup in the above code? thanks again, cjl
Re: replace illegal xml characters
killkolor wrote: Does InDesign export broken XML documents? What exactly is your problem? yes, unfortunately it does. it uses all possible unicode characters, though not all are alowed in valid xml (see link in the first post). in any way for my application i should be checking if the xml that comes in is valid and replace all non-valid characters. is there something out there to do this? I doubt it. Dealing with broken XML is nothing standard-modules should cope with. The link you provided has all you need - why not just use it? Diez -- http://mail.python.org/mailman/listinfo/python-list
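A minimal sketch of replacing characters that XML 1.0 does not allow. The link from the first post is not reproduced here, so this assumes it refers to the "Char" production of the XML 1.0 specification, which permits only #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD and #x10000-#x10FFFF. The function name and the default replacement character are illustrative choices, and the code is written for Python 3, where str is Unicode.

```python
import re

# Matches every character that is NOT in the XML 1.0 "Char" production.
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def replace_illegal_xml_chars(text, replacement="?"):
    """Return text with every character that XML 1.0 forbids replaced."""
    return _ILLEGAL_XML_CHARS.sub(replacement, text)


# NUL and vertical tab are forbidden; tab is allowed.
cleaned = replace_illegal_xml_chars("a\x00b\x0bc\td")
print(repr(cleaned))
```

This has to run on the decoded text before it is handed to an XML parser, since a conforming parser will reject the document as soon as it meets one of these characters.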
regex reading html
Hello, I have a program I'm working on where we scrape some of our web pages using the Mechanize libraries and then parse what we've scraped using different regex functions. However, I've noticed that no matter what I do with the scraped content, the regex functions can't find what they're trying to match. If I take the same text and manually feed it to the same regex functions, everything works as expected. Is there a specific text transformation I need to apply before feeding the regex functions? Thanks! -- [EMAIL PROTECTED] http://wheelmind.blogspot.com
Re: Garbage collection
[EMAIL PROTECTED] wrote: Tom ...and then I allocate a lot of memory in another process (eg. open Tom a load of files in the GIMP), then the computer swaps the Python Tom process out to disk to free up the necessary space. Python's Tom memory use is still reported as 953 MB, even though nothing like Tom that amount of space is needed. From what you said above, the Tom problem is in the underlying C libraries, but is there anything I Tom can do to get that memory back without closing Python? Not really. I suspect the unused pages of your Python process are paged out, but that Python has just what it needs to keep going. Yes, that's what's happening. Memory contention would be a problem if your Python process wanted to keep that memory active at the same time as you were running GIMP. True, but why does Python hang on to the memory at all? As I understand it, it's keeping a big lump of memory on the int free list in order to make future allocations of large numbers of integers faster. If that memory is about to be paged out, then surely future allocations of integers will be *slower*, as the system will have to: 1) page out something to make room for the new integers 2) page in the relevant chunk of the int free list 3) zero all of this memory and do any other formatting required by Python If Python freed (most of) the memory when it had finished with it, then all the system would have to do is: 1) page out something to make room for the new integers 2) zero all of this memory and do any other formatting required by Python Surely Python should free the memory if it's not been used for a certain amount of time (say a few seconds), as allocation times are not going to be the limiting factor if it's gone unused for that long. 
Alternatively, it could mark the memory as some sort of cache, so that if it needed to be paged out, it would instead be de-allocated (thus saving the time taken to page it back in again when it's next needed) I think the process's resident size is more important here than virtual memory size (as long as you don't exhaust swap space). True in theory, but the computer does tend to go rather sluggish when paging large amounts out to disk and back. Surely the use of virtual memory should be avoided where possible, as it is so slow? This is especially true when the contents of the blocks paged out to disk will never be read again. I've also tested similar situations on Python under Windows XP, and it shows the same behaviour, so I think this is a Python and/or GCC/libc issue, rather than an OS issue (assuming Python for linux and Python for windows are both compiled with GCC). -- I'm at CAMbridge, not SPAMbridge -- http://mail.python.org/mailman/listinfo/python-list
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
dmitrey wrote: it would reduce length of code lines and make them more readable, + no needs to write annoing charecters. IMHO, it's less readable. I suppose I'm not on my own with this opinion. Regards, Björn -- BOFH excuse #34: (l)user error -- http://mail.python.org/mailman/listinfo/python-list
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
dmitrey wrote: Hi all, I looked to the PEPs didn't find a proposition to remove brackets commas for to make Python func call syntax caml- or tcl- like: instead of result = myfun(param1, myfun2(param5, param8), param3) just make possible using result = myfun param1 (myfun2 param5 param8) param3 it would reduce length of code lines and make them more readable, + no needs to write annoing charecters.

This is not true; the code lines are not meaningfully shorter:

foo bar baz
foo(bar,baz)

And the "more readable" part certainly depends on the habits of the user; to me, it's harder to read. Apart from that, even if both statements were true, I doubt there is even the slightest chance of including it. After all, you'd have to keep the old way of doing things around anyway, and thus you'd end up with two styles of coding, which is certainly _not_ something anybody in the Python developer community is interested in. Diez
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
dmitrey [EMAIL PROTECTED] wrote: I think it should result result = func1(func2, arg) if you want result = func1(func2(arg)) you should use result = func1 (func2 arg) if ... = word1 word2 word3 ... then only word word1 should be call to func word1 with parameters word2, word3 etc That depends on whether you want function application to be left-associative or right-associative. For example, in Haskell it is left-associative, which is the more obvious choice because Haskell has currying. -- Piet van Oostrum [EMAIL PROTECTED] URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4] Private email: [EMAIL PROTECTED]
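To make the two readings of "func1 func2 arg" concrete, here is a small sketch in today's Python syntax. All names (func1, func2, curried_func1, arg) are placeholders for illustration. The curried variant mimics left-associative application as found in Haskell, where "func1 func2 arg" means "(func1 func2) arg".

```python
def func2(x):
    return x * 2


def func1(*args):
    # Simply report which arguments were received.
    return args


def curried_func1(first):
    # Curried form: take one argument, return a function awaiting the next one.
    def take_second(second):
        return func1(first, second)
    return take_second


arg = 21

two_arguments = func1(func2, arg)             # reading 1: func1 gets two arguments
nested_call = func1(func2(arg))               # reading 2: func1 gets the result of func2(arg)
left_associative = curried_func1(func2)(arg)  # (func1 func2) arg

print(two_arguments == left_associative)
print(nested_call)
```

With left-associative application the bare juxtaposition corresponds to reading 1, which is why the nested call needs explicit parentheses in the proposed syntax.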
Re: Garbage collection
Tom Wright wrote: [EMAIL PROTECTED] wrote: Tom ...and then I allocate a lot of memory in another process (eg. open Tom a load of files in the GIMP), then the computer swaps the Python Tom process out to disk to free up the necessary space. Python's Tom memory use is still reported as 953 MB, even though nothing like Tom that amount of space is needed. From what you said above, the Tom problem is in the underlying C libraries, but is there anything I Tom can do to get that memory back without closing Python? Not really. I suspect the unused pages of your Python process are paged out, but that Python has just what it needs to keep going. Yes, that's what's happening. Memory contention would be a problem if your Python process wanted to keep that memory active at the same time as you were running GIMP. True, but why does Python hang on to the memory at all? As I understand it, it's keeping a big lump of memory on the int free list in order to make future allocations of large numbers of integers faster. If that memory is about to be paged out, then surely future allocations of integers will be *slower*, as the system will have to: 1) page out something to make room for the new integers 2) page in the relevant chunk of the int free list 3) zero all of this memory and do any other formatting required by Python If Python freed (most of) the memory when it had finished with it, then all the system would have to do is: 1) page out something to make room for the new integers 2) zero all of this memory and do any other formatting required by Python Surely Python should free the memory if it's not been used for a certain amount of time (say a few seconds), as allocation times are not going to be the limiting factor if it's gone unused for that long. Alternatively, it could mark the memory as some sort of cache, so that if it needed to be paged out, it would instead be de-allocated (thus saving the time taken to page it back in again when it's next needed) Easy to say. 
How do you know the memory that's not in use is in a contiguous block suitable for return to the operating system? I can pretty much guarantee it won't be. CPython doesn't use a relocating garbage collection scheme, so objects always stay at the same place in the process's virtual memory unless they have to be grown to accommodate additional data. I think the process's resident size is more important here than virtual memory size (as long as you don't exhaust swap space). True in theory, but the computer does tend to go rather sluggish when paging large amounts out to disk and back. Surely the use of virtual memory should be avoided where possible, as it is so slow? This is especially true when the contents of the blocks paged out to disk will never be read again. Right. So all we have to do is identify those portions of memory that will never be read again and return them to the OS. That should be easy. Not. I've also tested similar situations on Python under Windows XP, and it shows the same behaviour, so I think this is a Python and/or GCC/libc issue, rather than an OS issue (assuming Python for linux and Python for windows are both compiled with GCC). It's probably a dynamic memory issue. Of course if you'd like to provide a patch to switch it over to a relocating garbage collection scheme we'll all await it with bated breath :) regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://del.icio.us/steve.holden Recent Ramblings http://holdenweb.blogspot.com -- http://mail.python.org/mailman/listinfo/python-list
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
On Mar 21, 8:38 am, dmitrey [EMAIL PROTECTED] wrote: Hi all, I looked to the PEPs didn't find a proposition to remove brackets commas for to make Python func call syntax caml- or tcl- like: instead of result = myfun(param1, myfun2(param5, param8), param3) just make possible using result = myfun param1 (myfun2 param5 param8) param3 it would reduce length of code lines and make them more readable, + no needs to write annoing charecters. Maybe it will require more work than I suppose, for example handling of things like result = myfun(param1, myfun2(param5, param8), param3=15, param4=200) to result = myfun param1 (myfun2 param5 param8) param3=15 param4=200 #so it needs some more efforts to decode by compiler but anyway I think it worth. + it will not yield incompabilities with previous Python versions. WBR, D. In my opinion, it is much less readable. That may be due to my experience with TCL, BASH scripting, C, C++, and Python. The parentheses make it very obvious that a function call is going on, and mirror the mathematical notation for applying a function. With touch-typing on an American keyboard, the ()'s are not really any more annoying than any of the various top-row digits. I personally find the backslash character (\) to be far more annoying, as it can have one of several locations depending on the keyboard style. (Most sanely put it above the Enter key.) As others have pointed out, the code that you presented really isn't all that much shorter. Short code isn't really what Python's about. Perl has many ways to write very short, incomprehensible code. A further ambiguity to consider: result = func1 Is the name result bound to the function func1? Or is func1 called, and its result bound to the name result? Good luck with your PEP. --Jason
Re: socket.getfqdn deadlock
On Mar 20, 2:23 pm, [EMAIL PROTECTED] wrote: Hi, I am getting deadlocks (backtrace pasted below) after a while at, presumably, a socket.getfqdn() call in a child process . Fwiw: This child process is created as the result of a pyro call to a Pyro object. Any ideas why this is happening? Are you sure it is not timing out, waiting for a DNS reply? Greg -- http://mail.python.org/mailman/listinfo/python-list
Re: Garbage collection
Tom> True, but why does Python hang on to the memory at all? As I
Tom> understand it, it's keeping a big lump of memory on the int free
Tom> list in order to make future allocations of large numbers of
Tom> integers faster. If that memory is about to be paged out, then
Tom> surely future allocations of integers will be *slower*, as the
Tom> system will have to:
Tom> 1) page out something to make room for the new integers
Tom> 2) page in the relevant chunk of the int free list
Tom> 3) zero all of this memory and do any other formatting required by
Tom>    Python

If your program's behavior is:

    * allocate a list of 1e7 ints
    * delete that list

how does the Python interpreter know your next bit of execution won't be to repeat the allocation? In addition, checking to see that an arena in the free list can be freed is itself not a free operation. From the comments at the top of intobject.c:

    free_list is a singly-linked list of available PyIntObjects, linked
    via abuse of their ob_type members.

Each time an int is allocated, the free list is checked to see if it's got a spare object lying about sloughin' off. If so, it is plucked from the list and reinitialized appropriately. If not, a new block of memory sufficient to hold about 250 ints is grabbed via a call to malloc, which *might* have to grab more memory from the OS. Once that block is allocated, it's strung together into a free list via the above ob_type slot abuse. Then the 250 or so items are handed out one-by-one as needed and stitched back into the free list as they are freed.

Now consider how difficult it is to decide if that block of 250 or so objects is all unused so that we can free() it. We have to walk through the list and check to see if that chunk is in the free list. That's complicated by the fact that the ref count fields aren't initialized to zero until a particular chunk is first used as an allocated int object and would have to be to support this block free operation (= more cost up front).

Still, assume we can semi-efficiently determine that a particular block is composed of all freed int-object-sized chunks. We will then unstitch it from the chain of blocks and call free() to free it. Still, we are left with the behavior of the operating system's malloc/free implementation. It probably won't sbrk() the block back to the OS, so after all that work your process still holds the memory.

Okay, so malloc/free won't work. We could boost the block size up to the size of a page and use mmap() to map a page into memory. I suspect that would become still more complicated to implement, and the block size being probably about eight times larger than the current block size would incur even more cost to determine if it was full of nothing but freed objects.

Tom> If Python freed (most of) the memory when it had finished with it,
Tom> then all the system would have to do is:

That's the rub. Figuring out when it is truly finished with the memory.

Tom> Surely Python should free the memory if it's not been used for a
Tom> certain amount of time (say a few seconds), as allocation times are
Tom> not going to be the limiting factor if it's gone unused for that
Tom> long.

This is generally the point in such discussions where I respond with something like, "patches cheerfully accepted". ;-) If you're interested in digging into this, have a look at the free list implementation in Objects/intobject.c. It might make for a good Google Summer of Code project:

    http://code.google.com/soc/psf/open.html
    http://code.google.com/soc/psf/about.html

but I'm not the guy you want mentoring such a project. There are a lot of people who understand the ins and outs of Python's memory allocation code much better than I do.

Tom> I've also tested similar situations on Python under Windows XP, and
Tom> it shows the same behaviour, so I think this is a Python and/or
Tom> GCC/libc issue, rather than an OS issue (assuming Python for linux
Tom> and Python for windows are both compiled with GCC).

Sure, my apologies. The malloc/free implementation is strictly speaking not part of the operating system. I tend to mentally lump them together because it's uncommon for people to use a malloc/free implementation different than the one delivered with their computer.

Skip
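The free-list behaviour described in this message can be modelled in a few lines of pure Python. This is only a toy sketch, not the C code in Objects/intobject.c: the class name IntFreeList, its method names and the BLOCK_SIZE of 250 are assumptions made for illustration, and "malloc" is simulated with a plain list.

```python
BLOCK_SIZE = 250  # "about 250 ints" per block, as described above (illustrative)


class IntFreeList:
    """Toy model of a block-based free list; all names are hypothetical."""

    def __init__(self):
        self.blocks = []      # every block ever obtained from the simulated malloc
        self.free_slots = []  # (block_index, slot_index) pairs ready for reuse

    def _grow(self):
        # Simulated malloc(): grab a new block and thread all of its
        # slots onto the free list.
        block_index = len(self.blocks)
        self.blocks.append([None] * BLOCK_SIZE)
        self.free_slots.extend(
            (block_index, slot) for slot in reversed(range(BLOCK_SIZE))
        )

    def allocate(self, value):
        # Reuse a spare slot if there is one, otherwise grow by one block.
        if not self.free_slots:
            self._grow()
        block_index, slot = self.free_slots.pop()
        self.blocks[block_index][slot] = value
        return block_index, slot

    def release(self, handle):
        # Freed slots are stitched back into the free list; the block
        # itself is never handed back.
        block_index, slot = handle
        self.blocks[block_index][slot] = None
        self.free_slots.append(handle)

    def block_is_unused(self, block_index):
        # The costly check: walk the whole free list to see whether
        # every slot of this block is on it.
        free_here = sum(1 for b, _ in self.free_slots if b == block_index)
        return free_here == BLOCK_SIZE


pool = IntFreeList()
handles = [pool.allocate(i) for i in range(1000)]
blocks_at_peak = len(pool.blocks)

for handle in handles:
    pool.release(handle)
blocks_after_release = len(pool.blocks)
```

In this model the number of blocks is the same at the peak and after every slot has been released, which mirrors the memory figures reported in the thread. Deciding that a block could be returned requires the linear walk in block_is_unused(), which is the cost the message points at.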
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
dmitrey [EMAIL PROTECTED] wrote: + it will not yield incompatibilities with previous Python versions. So how would you write: func(-3) func(*param) with your scheme? These already have an incompatible meaning: func -3 func *param
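A small sketch of why the two spellings already mean something today: without parentheses, the minus sign and the star are parsed as binary operators. Binding func and param to plain numbers is an assumption made only so the expressions evaluate.

```python
# Assumption for illustration: func and param are numbers, not callables.
func = 10
param = 4

difference = func -3    # parsed as func - 3, not as func(-3)
product = func *param   # parsed as func * param, not as func(*param)
```

Any scheme that reads "func -3" as a call would therefore change the meaning of existing, valid code.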
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
On Mar 21, 3:38 pm, dmitrey [EMAIL PROTECTED] wrote:
> Hi all, I looked at the PEPs and didn't find a proposal to remove brackets and commas to make Python's function call syntax caml- or tcl-like: instead of
>     result = myfun(param1, myfun2(param5, param8), param3)
> just make it possible to use
>     result = myfun param1 (myfun2 param5 param8) param3

How would you write a = b(c())? In my opinion it'll make code extremely obfuscated. The great thing about Python, when compared with e.g. Perl or C, is that code is readable, even if written by an experienced hacker.
Re: Technical Answer - Protecting code in python
On Wed, 21 Mar 2007 06:36:16 -0700, flit wrote: 1 - There is a way to make some program in python and protects it? I am not talking about ultra hard-core protection, just a simple one that will stop 90% script kiddies. Protect it from what? Viruses? Terrorists? The corrupt government? Your ex-wife cutting it up with scissors? People who want to copy it? People who will look at your code and laugh at you for being a bad programmer? Until you tell us what you are trying to protect against, your question is meaningless. Is your program valuable? Is it worth money? Then the 90% of script kiddies will just wait three days, and download the program off the Internet after the real hackers have broken your protection. If it is NOT valuable, then why on earth do you think people will put up with whatever protection you use? Why won't they just use another program? 2 - If I put the code in web like a web service, how can I protect my code from being ripped? There is a way to avoid someone using my site and ripping the .py files? Don't make the .py files available on the web server. [penny drops] Hang on, you want us to believe that you're a serious computer programmer with a seriously valuable program that's worth protecting, and you don't know that? I smell a troll. -- Steven. -- http://mail.python.org/mailman/listinfo/python-list
Re: Garbage collection
On Wed, 21 Mar 2007 15:03:17 +, Tom Wright wrote: [snip] Ah, thanks for explaining that. I'm a little wiser about memory allocation now, but am still having problems reclaiming memory from unused objects within Python. If I do the following: (memory use: 15 MB) a = range(int(4e7)) (memory use: 1256 MB) a = None (memory use: 953 MB) ...and then I allocate a lot of memory in another process (eg. open a load of files in the GIMP), then the computer swaps the Python process out to disk to free up the necessary space. Python's memory use is still reported as 953 MB, even though nothing like that amount of space is needed. Who says it isn't needed? Just because *you* have only one object existing, doesn't mean the Python environment has only one object existing. From what you said above, the problem is in the underlying C libraries, What problem? Nothing you've described seems like a problem to me. It sounds like a modern, 21st century operating system and programming language working like they should. Why do you think this is a problem? You've described an extremely artificial set of circumstances: you create 40,000,000 distinct integers, then immediately destroy them. The obvious solution to that problem of Python caching millions of integers you don't need is not to create them in the first place. In real code, the chances are that if you created 4e7 distinct integers you'll probably need them again -- hence the cache. So what's your actual problem that you are trying to solve? but is there anything I can do to get that memory back without closing Python? Why do you want to manage memory yourself anyway? It seems like a horrible, horrible waste to use a language designed to manage memory for you, then insist on over-riding it's memory management. I'm not saying that there is never any good reason for fine control of the Python environment, but this doesn't look like one to me. -- Steven. -- http://mail.python.org/mailman/listinfo/python-list
Delete a function
After a function has been imported to a shell, how may it be deleted so that after editing it can be reloaded anew? thanx, gtb -- http://mail.python.org/mailman/listinfo/python-list
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
On Wed, 21 Mar 2007 07:55:08 -0700, dmitrey top posted: I think it should result What's it? Please don't top post, it makes your answer hard to understand and makes your readers have to do more work to read your posts. We're not being paid to read your posts, so I'd estimate that about 70% of readers have just clicked Delete at this point and ignored you. For those remaining: The Original Poster, dmitrey, wants to copy caml syntax, because he doesn't like brackets and commas. The problem is that the expression: name1 name2 is ambiguous. Does it mean name1(name2) or (name1, name2)? Add a third name, and the ambiguity increases: there are now at least four ways to interpret name1 name2 name3: (name1, name2, name3) (name1, name2(name3)) name1(name2, name3) name1(name2(name3)) Dmitrey thinks that the third way is the right way to interpret the proposed expression, just because it seems natural to him. But that is illogical: it means that *different* parsing rules are applied to the name2 name3 part than to the name1 * part (where * stands in for anything): Dmitry wants name1 * to equal name1(*) which is fair enough as it stands. But he wants to expand the * part, not by following the same rule, but by following the different rule name2 name3 = (name2, name3) and form a tuple. So what should a b c d be? (a, b, c, d) a(b, c, d) a(b, (c, d)) a(b(c, d)) a(b(c(d))) Have I missed anything? Which is the right way? Who can tell? Who can guess? I don't know how caml resolves these ambiguities, or even if caml resolves them, or if it is a good solution. But I propose that an even better solution is to insist on EXPLICIT function calls and tuple construction, that is to insist on brackets and commas. In other words, to go back to Dmitry's original post where he wrote: it would reduce length of code lines and make them more readable I would change that to say it would reduce length of code lines and make them LESS readable and MORE ambiguous, leading to MORE bugs. -- Steven. 
Re: Exceptions and unicode messages
This seems to work:

>>> import sys, traceback
>>> def excepthook(exctype, value, tb):
...     if tb:
...         lines = traceback.format_tb(tb)
...         for i, line in zip(range(len(lines)), lines):
...             lines[i] = lines[i].decode('utf8')
...         lines.insert(0, u'Traceback (most recent call last):\n')
...     else:
...         lines = []
...     msg = str(exctype).split('.')[-1] + u': ' + unicode(value)
...     lines.append(msg)
...     print u''.join(lines)
...
>>> sys.excepthook = excepthook
>>> class UserError(StandardError):
...     def __str__(self):
...         return self.args[0]
...
>>> raise UserError(u'Väärä tyyppi')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UserError: Väärä tyyppi

Tuomas

Tuomas wrote:
> This works:
>
> >>> raise StandardError(u'Wrong type')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> StandardError: Wrong type
>
> but don't in Finnish:
>
> >>> raise StandardError(u'Väärä tyyppi')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> StandardError
>
> Any solution in Python?
>
> TV
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
Bart Ogryczak wrote:
> On Mar 21, 3:38 pm, dmitrey [EMAIL PROTECTED] wrote:
>> Hi all, I looked at the PEPs and didn't find a proposal to remove brackets and commas to make Python's function call syntax caml- or tcl-like: instead of
>>     result = myfun(param1, myfun2(param5, param8), param3)
>> just make it possible to use
>>     result = myfun param1 (myfun2 param5 param8) param3
>
> How would you write a = b(c())? In my opinion it'll make code extremely obfuscated. The great thing about Python, when compared with e.g. Perl or C, is that code is readable, even if written by an experienced hacker.

Yes, but let's not forget that we are in half-baked idea territory here. The fact that dmitrey didn't twig that the absence of such a proposal was likely for good reasons implies either an intellectual arrogance beyond that of most mere mortals or a goodly dollop of ignorance. Maybe we could omit the leading whitespace as well?

regards
 Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Recent Ramblings http://holdenweb.blogspot.com
Re: Delete a function
gtb wrote: After a function has been imported to a shell how may it be deleted so that after editing it can reloaded anew? Use the built-in reload() function to reload the module that defines the function. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://del.icio.us/steve.holden Recent Ramblings http://holdenweb.blogspot.com -- http://mail.python.org/mailman/listinfo/python-list
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
    foo bar baz
    foo(bar,baz)

The 1st is still shorter by 1 char; considering that the majority of people use a space after the comma, and that the number of parameters can be big, it yields

    foo bar baz bar2 bar3 bar4
vs
    foo(bar, baz, bar2, bar3, bar4)

For the case result = func1, using result = func1() should remain, as should the existing function definitions:

    def myfun(param1, param2,...,paramk, *args, **kwargs)

> How would you write a = b(c())?

    a = b c()

> So what should a b c d be? (a, b, c, d) a(b, c, d) a(b, (c, d)) a(b(c, d)) a(b(c(d)))

I mentioned above that it should be the 2nd case.

> I don't know how caml resolves these ambiguities, or even if caml resolves them

Yes, it does so perfectly, hence Python could have the same abilities. But I don't insist on anything, I only proposed. Since the majority disagreed with the proposal, it isn't worth further discussion.

WBR, D.

P.S. Steven, many people aren't able to speak English as well as you, me included. I hope the majority of readers will forgive me for wasting their costly attention and time.
Making faster scientific calculations...
Ok, this is my first post to this list and I don't know if it's the right one. I'm currently writing a sort of scientific application. Most data structures and algorithmic knots are in C++, and from there I have a set of extension modules for Python. So users of the app don't have to compile or ever see C++ code to customize the app to their ends, except for the inner mathematical part. Regarding that last point, I saw a few messages on this list about Psyco, a JIT for Python, and things like that. Well, I need all the speed the machine can deliver in the mathematical spot (I know from profiling that it is critical), so none of those can do the job. Recently I read something about generating mathematical kernels in Python for this kind of problem (I can't locate the paper right now); the idea is to generate machine code and have it do the work, but it wasn't for x86-based architectures (yes, portability is always a nightmare with tools that generate native code). Does anybody know of a parallel effort for x86? If there isn't such a thing out there, could it be of some interest to people who work with numbers in Python?
Re: Technical Answer - Protecting code in python
First I wanna thanks the all people who gives good contribution to this thread, thank you all.. Now I have something more to say: OK, that kind of answer is what I was trying to avoid.. On Mar 21, 1:23 pm, Steven D'Aprano [EMAIL PROTECTED] wrote: On Wed, 21 Mar 2007 06:36:16 -0700, flit wrote: 1 - There is a way to make some program in python and protects it? I am not talking about ultra hard-core protection, just a simple one that will stop 90% script kiddies. Protect it from what? Viruses? Terrorists? The corrupt government? Your ex-wife cutting it up with scissors? People who want to copy it? People who will look at your code and laugh at you for being a bad programmer? Until you tell us what you are trying to protect against, your question is meaningless. In this time I supposed someone took too much coffee..But will ignore.. Is your program valuable? Is it worth money? Then the 90% of script kiddies will just wait three days, and download the program off the Internet after the real hackers have broken your protection. If it is NOT valuable, then why on earth do you think people will put up with whatever protection you use? Why won't they just use another program? It´s doesn´t matter if it is the next BIG HIT Ultra-fast-next-google thing or a programm to take control of my pet-people-living-in- welfare-trying-to-be-political It´s a technical question, If you can´t answer it ok, I will not suppose that you are it or that, it´s not a personal question or matter. 2 - If I put the code in web like a web service, how can I protect my code from being ripped? There is a way to avoid someone using my site and ripping the .py files? Don't make the .py files available on the web server. Now we have a real contribution to the thread. Thank You [penny drops] Hang on, you want us to believe that you're a serious computer programmer with a seriously valuable program that's worth protecting, and you don't know that? I smell a troll. -- Steven. 
Again, you don´t have to believe, suppose or think anything about me, are you capable to make any contribution? Technical one? Are you sooo good and serious programmer that you did not develop your personal skills, and thinks that winning an argument in internet is the best thing in the world? Thanks all, Flit (the-not-serious-programmer-that-wanna-to-be-a-big-capitalist-and-take- the-money-from all) -- http://mail.python.org/mailman/listinfo/python-list
Re: Garbage collection
On Wed, 21 Mar 2007 15:32:17 +, Tom Wright wrote: Memory contention would be a problem if your Python process wanted to keep that memory active at the same time as you were running GIMP. True, but why does Python hang on to the memory at all? As I understand it, it's keeping a big lump of memory on the int free list in order to make future allocations of large numbers of integers faster. If that memory is about to be paged out, then surely future allocations of integers will be *slower*, as the system will have to: 1) page out something to make room for the new integers 2) page in the relevant chunk of the int free list 3) zero all of this memory and do any other formatting required by Python If Python freed (most of) the memory when it had finished with it, then all the system would have to do is: 1) page out something to make room for the new integers 2) zero all of this memory and do any other formatting required by Python Surely Python should free the memory if it's not been used for a certain amount of time (say a few seconds), as allocation times are not going to be the limiting factor if it's gone unused for that long. Alternatively, it could mark the memory as some sort of cache, so that if it needed to be paged out, it would instead be de-allocated (thus saving the time taken to page it back in again when it's next needed) And increasing the time it takes to re-create the objects in the cache subsequently. Maybe this extra effort is worthwhile when the free int list holds 10**7 ints, but is it worthwhile when it holds 10**6 ints? How about 10**5 ints? 10**3 ints? How many free ints is typical or even common in practice? The lesson I get from this is, instead of creating such an enormous list of integers in the first place with range(), use xrange() instead. 
Fresh running instance of Python 2.5:

$ ps up 9579
USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
steve     9579  0.0  0.2   6500   2752 pts/7    S+   03:42   0:00 python2.5

Run from within Python:

>>> n = 0
>>> for i in xrange(int(1e7)):
...     # create lots of ints, one at a time
...     # instead of all at once
...     n += i # make sure the int is used
...
>>> n
499500L

And the output of ps again:

$ ps up 9579
USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
steve     9579  4.2  0.2   6500   2852 pts/7    S+   03:42   0:11 python2.5

Barely moved a smidgen. For comparison, here's what ps reports after I create a single list with range(int(1e7)), and again after I delete the list:

$ ps up 9579 # after creating list with range(int(1e7))
USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
steve     9579  1.9 15.4 163708 160056 pts/7    S+   03:42   0:11 python2.5

$ ps up 9579 # after deleting list
USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
steve     9579  1.7 11.6 124632 120992 pts/7    S+   03:42   0:12 python2.5

So there is another clear advantage to using xrange instead of range, unless you specifically need all ten million ints all at once.

-- Steven.
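The same contrast can be sketched without ps. The snippet below is written for Python 3, which is an assumption on top of the thread: there, range() is lazy like the xrange() used above, and list(range(...)) plays the role of the old eager range(). N is kept at one million only so the example runs quickly.

```python
import sys

N = 10**6  # smaller than the 1e7 used above, purely to keep the run short

lazy = range(N)          # Python 3 range: produces ints one at a time
eager = list(range(N))   # materializes every int at once, like Python 2 range

total = 0
for i in lazy:
    total += i           # make sure each int is actually used

# getsizeof() reports only the container itself (for the list, its array
# of pointers), not the int objects it refers to.
lazy_size = sys.getsizeof(lazy)
eager_size = sys.getsizeof(eager)
```

The lazy object stays a few dozen bytes regardless of N, while the list grows with N before the int objects themselves are even counted.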
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
dmitrey wrote:
> Hi all, I looked at the PEPs and didn't find a proposal to remove brackets and commas to make Python's function call syntax caml- or tcl-like: instead of
>     result = myfun(param1, myfun2(param5, param8), param3)
> just make it possible to use
>     result = myfun param1 (myfun2 param5 param8) param3
> it would reduce length of code lines

Not by a noticeable amount.

> and make them more readable,

I guess it's a matter of personal taste, experience with other languages and whatnot, but as far as I'm concerned, I find the current syntax more readable - I don't have to think twice to know that it's a function call and what the params are. Also, and FWIW, in Python, the parens are the call operator. Given that a function may return another function, how would you handle the following case:

    result = my_hof(foo, bar)(baaz)

And while we're at it, since Python functions are first class objects, how would you handle this other case:

    def somefunc(arg):
        return 2 * arg

    alias = somefunc
    # some code here
    result = alias(42)

> + no need to write annoying characters.

Could it be that these 'annoying characters' have a good reason to be here? Strange as it might be, Python has not been built randomly.

> Maybe it will require more work than I suppose,

Probably, yes.

> but anyway I think it's worth it.

So try to solve the above problems and come back here with an example working implementation.

> + it will not yield incompatibilities with previous Python versions.

Hmmm. I would not bet my life on this...
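The two cases raised in this message can be made concrete as follows. The names my_hof, somefunc and alias come from the message; the body of my_hof and the string arguments are hypothetical, filled in only so the snippet runs.

```python
def my_hof(first, second):
    """Hypothetical higher-order function: returns another function."""
    def inner(third):
        return (first, second, third)
    return inner


def somefunc(arg):
    return 2 * arg


# The first pair of parens calls my_hof, the second calls what it returned.
chained = my_hof("foo", "bar")("baaz")

# Without parens a function name is just a reference to the object...
alias = somefunc
same_object = alias is somefunc

# ...and the parens are what actually perform the call.
result = alias(42)
```

With juxtaposition as the call syntax, "alias = somefunc" and "my_hof foo bar baaz" would both need extra rules to say whether a call happens and how many.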
Re: Delete a function
On Mar 21, 11:37 am, Steve Holden [EMAIL PROTECTED] wrote:
> gtb wrote:
>> After a function has been imported to a shell, how may it be deleted so that after editing it can be reloaded anew?
>
> Use the built-in reload() function to reload the module that defines the function.
>
> regards
>  Steve
> --
> Steve Holden +44 150 684 7255 +1 800 494 3119
> Holden Web LLC/Ltd http://www.holdenweb.com
> Skype: holdenweb http://del.icio.us/steve.holden
> Recent Ramblings http://holdenweb.blogspot.com

Thanks, tried that now and get a NameError with the following:

    import sys
    sys.path.append("c:\maxq\testScripts")
    from CompactTest import CompactTest
    from compactLogin import dvlogin
    reload(compactLogin)
Re: Garbage collection
Steven D'Aprano wrote: You've described an extremely artificial set of circumstances: you create 40,000,000 distinct integers, then immediately destroy them. The obvious solution to that problem of Python caching millions of integers you don't need is not to create them in the first place. I know it's a very artificial setup - I was trying to make the situation simple to demonstrate in a few lines. The point was that it's not caching the values of those integers, as they can never be read again through the Python interface. It's just holding onto the space they occupy in case it's needed again. So what's your actual problem that you are trying to solve? I have a program which reads a few thousand text files, converts each to a list (with readlines()), creates a short summary of the contents of each (a few floating point numbers) and stores this summary in a master list. From the amount of memory it's using, I think that the lists containing the contents of each file are kept in memory, even after there are no references to them. Also, if I tell it to discard the master list and re-read all the files, the memory use nearly doubles so I presume it's keeping the lot in memory. The program may run through several collections of files, but it only keeps a reference to the master list of the most recent collection it's looked at. Obviously, it's not ideal if all the old collections hang around too, taking up space and causing the machine to swap. but is there anything I can do to get that memory back without closing Python? Why do you want to manage memory yourself anyway? It seems like a horrible, horrible waste to use a language designed to manage memory for you, then insist on over-riding it's memory management. I agree. I don't want to manage it myself. I just want it to re-use memory or hand it back to the OS if it's got an awful lot that it's not using. 
Wouldn't you say it was wasteful if (say) an image editor kept an uncompressed copy of an image around in memory after the image had been closed? -- I'm at CAMbridge, not SPAMbridge -- http://mail.python.org/mailman/listinfo/python-list
Re: Garbage collection
Steve Holden wrote: Easy to say. How do you know the memory that's not in use is in a contiguous block suitable for return to the operating system? I can pretty much guarantee it won't be. CPython doesn't use a relocating garbage collection scheme Fair point. That is difficult and I don't see a practical solution to it (besides substituting a relocating garbage collector, which seems like a major undertaking). Right. So all we have to do is identify those portions of memory that will never be read again and return them to the OS. That should be easy. Not. Well, you have this nice int free list which points to all the bits which will never be read again (they might be written to, but if you're writing without reading then it doesn't really matter where you do it). The point about contiguous chunks still applies though. -- I'm at CAMbridge, not SPAMbridge -- http://mail.python.org/mailman/listinfo/python-list
Re: Delete a function
gtb wrote:
> On Mar 21, 11:37 am, Steve Holden [EMAIL PROTECTED] wrote:
>> gtb wrote:
>>> After a function has been imported to a shell how may it be deleted so that after editing it can reloaded anew?
>>
>> Use the built-in reload() function to reload the module that defines the function.
>>
>> regards
>>  Steve
>> --
>> Steve Holden +44 150 684 7255 +1 800 494 3119
>> Holden Web LLC/Ltd http://www.holdenweb.com
>> Skype: holdenweb http://del.icio.us/steve.holden
>> Recent Ramblings http://holdenweb.blogspot.com
>
> Thanks, tried that now and get nameError: with the following.
>
> import sys
> sys.path.append("c:\maxq\testScripts")

That's an interesting directory name:

>>> print "c:\maxq\testScripts"
c:\maxq	estScripts

> from CompactTest import CompactTest
> from compactLogin import dvlogin

The above line puts dvlogin into the current namespace, not compactLogin; therefore

> reload(compactLogin)

compactLogin is undefined in the reload() call. You need

    import compactLogin # still the old module because it's in the module cache
    reload(compactLogin)
    from compactLogin import dvlogin # update the function
    del compactLogin # optional

Peter
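The directory-name problem pointed out above comes from "\t" being the escape sequence for a tab inside an ordinary string literal. A short sketch of the usual ways around it follows; the variable names are made up for the example, and "\\m" is written with a doubled backslash in the first line only because recent Python versions warn about the unknown escape "\m".

```python
# "\t" in an ordinary literal is a single tab character.
broken = "c:\\maxq\testScripts"

# Three spellings that keep the backslash and the letter t apart:
raw = r"c:\maxq\testScripts"        # raw string: backslashes are literal
escaped = "c:\\maxq\\testScripts"   # every backslash doubled
forward = "c:/maxq/testScripts"     # forward slashes, which Windows APIs generally accept

has_tab = "\t" in broken
```

The raw and doubled-backslash forms compare equal, and both are one character longer than the broken literal, because the tab has become a backslash followed by "t".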
Re: Garbage collection
[EMAIL PROTECTED] wrote: If your program's behavior is: * allocate a list of 1e7 ints * delete that list how does the Python interpreter know your next bit of execution won't be to repeat the allocation? It doesn't know, but if the program runs for a while without repeating it, it's a fair bet that it won't mind waiting the next time it does a big allocation. How long 'a while' is would obviously be open to debate. In addition, checking to see that an arena in the free list can be freed is itself not a free operation. (snip thorough explanation) Yes, that's a good point. It looks like the list is designed for speedy re-use of the memory it points to, which seems like a good choice. I quite agree that it should hang on to *some* memory, and perhaps my artificial situation has shown this as a problem when it wouldn't cause any issues for real programs. I can't help thinking that there are some situations where you need a lot of memory for a short time though, and it would be nice to be able to use it briefly and then hand most of it back. Still, I see the practical difficulties with doing this. -- I'm at CAMbridge, not SPAMbridge -- http://mail.python.org/mailman/listinfo/python-list
Re: Technical Answer - Protecting code in python
On Wed, 21 Mar 2007 09:56:12 -0700, flit wrote: First I wanna thanks the all people who gives good contribution to this thread, thank you all.. But they haven't. They've given answers to an ill-posed question. How can anyone tell you how to protect code when you haven't told us what you want to protect against? Until you tell us what you are trying to protect against, your question is meaningless. In this time I supposed someone took too much coffee..But will ignore.. That is the absolute core of the problem. What are you trying to protect against? If you can't even answer that question, then how do you expect to find a solution? If a customer came to you and offered you money to protect this data, what would you do? Surely the FIRST thing you would need to do is find out, protect it from what? What problem does the customer want you to solve? Does the customer want error correction codes so he can transmit it over a noisy data channel? Does the customer just want an off-site backup he can take home? Does he want it encrypted? Or does he just want it copyrighted, so it is legally protected? Or does he want you to go out and hire a big strong man with a club to stand over the disk and hit people on the head if they get too close? Is your program valuable? Is it worth money? Then the 90% of script kiddies will just wait three days, and download the program off the Internet after the real hackers have broken your protection. If it is NOT valuable, then why on earth do you think people will put up with whatever protection you use? Why won't they just use another program? It´s doesn´t matter if it is the next BIG HIT Ultra-fast-next-google thing or a programm to take control of my pet-people-living-in- welfare-trying-to-be-political It´s a technical question, No it isn't. You only think it is a technical question. You said it yourself: you have to make money. 
How much money are you going to make if you spend all your time solving the technical question of protecting your software, if nobody wants your software? What is the value of the protection? Should you spend a thousand hours protecting it, or a hundred hours, or ten, or one, or one minute, or nothing at all? What's your business model for making money? That is far more important than whether you can send out a .pyc file or how many people know how to use the Python disassembler. Maybe you'll make MORE money by giving the software away for free and charging for services. Would you rather sell ten copies of your software at $20 each, or give away ten thousand copies and charge five hours of consulting services at $100 an hour? The technical problem is the LEAST important part of the real problem, which is how do I make money from this?. -- Steven. -- http://mail.python.org/mailman/listinfo/python-list
Re: Testing Python updates
Matthew [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] | What is the methodology for testing the updates to the Python language? By who? There is an expanding /test directory in the pythonx.y/Lib directory. Core developers run these about once a day on multiple systems. People who build their own executable should run the same. People who install a prebuilt executable may do the same, and should if they suspect a problem with their installation. Application developers increasingly have their own automated test files that they run after running /Lib/test on a new installation. Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
Re: #!/usr/bin/env python 2.4?
In article [EMAIL PROTECTED], Sion Arrowsmith wrote:
> Jon Ribbens [EMAIL PROTECTED] wrote:
>>     if sys.hexversion < 0x020400f0: ... error ...
>
> Readability counts.
>
>     if sys.version_info < (2, 4): ... error ...

Maybe you should suggest a patch to the Python documentation then ;-) The formula I mentioned is the one suggested in the official Python documentation ( http://docs.python.org/lib/module-sys.html#l2h-5143 ) as being the way to check the Python version.
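Both checks discussed here can be wrapped in small helpers and compared side by side. The function names and the default minimum of 2.4 are assumptions for the example; sys.version_info and sys.hexversion are the real attributes being discussed.

```python
import sys

MINIMUM = (2, 4)  # illustrative minimum version, taken from the thread subject


def version_ok(version_info=sys.version_info, minimum=MINIMUM):
    """Tuple comparison: compares (major, minor) element by element."""
    return tuple(version_info[:2]) >= minimum


def hexversion_ok(hexversion=sys.hexversion, minimum=0x020400F0):
    """Integer comparison: 0x020400F0 encodes 2.4.0 final."""
    return hexversion >= minimum


running_is_new_enough = version_ok() and hexversion_ok()
```

The two forms agree; the tuple comparison is simply easier to read, since the hexadecimal constant packs major, minor, micro and release level into one integer.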
parsing tables with beautiful soup?
I am learning Python and Beautiful Soup, and I'm stuck. A web page has a table that contains data I would like to scrape. The table has a unique class, so I can use: soup.find("table", {"class": class_name}) This isolates the table. So far, so good. Next, this table has a certain number of rows (I won't know ahead of time how many), and each row has a set number of cells (which will be constant). I couldn't find example code on how to loop through the contents of the rows and cells of a table using Beautiful Soup. I'm guessing I need an outer loop for the rows and an inner loop for the cells, but I don't know how to iterate over the tags that I want. The Beautiful Soup documentation is a little beyond me at this point. Can anyone point me in the right direction? thanks again, cjl
Re: Delete a function
gtb wrote:
> On Mar 21, 11:37 am, Steve Holden [EMAIL PROTECTED] wrote:
>> gtb wrote:
>>> After a function has been imported to a shell how may it be deleted so that after editing it can reloaded anew?
>>
>> Use the built-in reload() function to reload the module that defines the function.
>
> Thanks, tried that now and get NameError with the following.
>
> import sys
> sys.path.append(r"c:\maxq\testScripts")
> from CompactTest import CompactTest
> from compactLogin import dvlogin
> reload(compactLogin)

The local name compactLogin isn't being bound to the module because of the import form you are using. Here's meta1.py:

    $ more meta1.py
    class MyMeta(type):
        ...

    class MyClass:
        __metaclass__ = MyMeta

Now see what happens when I import the MyMeta class from it:

    >>> from meta1 import MyMeta
    >>> import sys
    >>> sys.modules['meta1']
    <module 'meta1' from 'meta1.py'>
    >>> reload(meta1)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'meta1' is not defined
    >>> reload(sys.modules['meta1'])
    <module 'meta1' from 'meta1.pyc'>

Note, however, that this reload() call does NOT re-bind the local MyMeta, which still references the class defined in the original version of the module.

You'd be better off using

    import compactLogin

and then setting

    dvlogin = compactLogin.dvlogin

after the original import and each reload. Eventually you'll develop a feel for how namespaces work in Python and you'll be able to take liberties.

Finally, note that there isn't much point doing the reload() unless the content of the module has actually changed!

regards
 Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Recent Ramblings http://holdenweb.blogspot.com
-- http://mail.python.org/mailman/listinfo/python-list
Re: Delete a function
gtb wrote:
> On Mar 21, 11:37 am, Steve Holden [EMAIL PROTECTED] wrote:
>> gtb wrote:
>>> After a function has been imported to a shell how may it be deleted so that after editing it can reloaded anew?
>>
>> Use the built-in reload() function to reload the module that defines the function.
>
> Thanks, tried that now and get NameError with the following.
>
> import sys
> sys.path.append(r"c:\maxq\testScripts")
> from CompactTest import CompactTest
> from compactLogin import dvlogin
> reload(compactLogin)

You haven't imported compactLogin, only dvlogin from the compactLogin namespace. Try:

    import compactLogin
    ...do something with compactLogin.dvlogin...
    reload(compactLogin)

cheers
 Paul
-- http://mail.python.org/mailman/listinfo/python-list
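The advice in this thread (import the module itself, reload it, then re-bind any names copied out of it) can be tried end to end. The sketch below is written for Python 3, where the built-in reload() has moved to importlib.reload(); the module name `compact_login_demo` and its contents are stand-ins invented for the demonstration, not gtb's real files.

```python
import importlib
import os
import sys
import tempfile

# Demo-only setting: skip .pyc files so the rewritten source is always re-read.
sys.dont_write_bytecode = True

tmpdir = tempfile.mkdtemp()
sys.path.insert(0, tmpdir)
module_path = os.path.join(tmpdir, "compact_login_demo.py")  # hypothetical module

def write_module(source):
    """Write the stand-in module and let the import system notice the change."""
    with open(module_path, "w") as f:
        f.write(source)
    importlib.invalidate_caches()

write_module("def dvlogin():\n    return 'old'\n")

import compact_login_demo                          # binds the module name itself
from compact_login_demo import dvlogin as stale    # copies one name out of it

write_module("def dvlogin():\n    return 'new version'\n")
importlib.reload(compact_login_demo)               # works: the module name is bound

dvlogin = compact_login_demo.dvlogin               # re-bind after every reload
results = (stale(), dvlogin())                     # stale still calls the old function
```

The name copied out before the reload keeps pointing at the old function object, which is the behaviour Steve describes for MyMeta.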
Re: Garbage collection
Tom Wright wrote:
> Steven D'Aprano wrote:
>> You've described an extremely artificial set of circumstances: you create 40,000,000 distinct integers, then immediately destroy them. The obvious solution to that problem of Python caching millions of integers you don't need is not to create them in the first place.
>
> I know it's a very artificial setup - I was trying to make the situation simple to demonstrate in a few lines. The point was that it's not caching the values of those integers, as they can never be read again through the Python interface. It's just holding onto the space they occupy in case it's needed again.
>
>> So what's your actual problem that you are trying to solve?
>
> I have a program which reads a few thousand text files, converts each to a list (with readlines()), creates a short summary of the contents of each (a few floating point numbers) and stores this summary in a master list. From the amount of memory it's using, I think that the lists containing the contents of each file are kept in memory, even after there are no references to them. Also, if I tell it to discard the master list and re-read all the files, the memory use nearly doubles so I presume it's keeping the lot in memory.

I'd like to bet you are keeping references to them without realizing it. The interpreter won't generally allocate memory that it can get by garbage collection, and reference counting pretty much eliminates the need for garbage collection anyway except when you create cyclic data structures.

> The program may run through several collections of files, but it only keeps a reference to the master list of the most recent collection it's looked at. Obviously, it's not ideal if all the old collections hang around too, taking up space and causing the machine to swap.

We may need to see code here for you to convince us of the correctness of your hypothesis. It sounds pretty screwy to me.

>>> but is there anything I can do to get that memory back without closing Python?
>>
>> Why do you want to manage memory yourself anyway? It seems like a horrible, horrible waste to use a language designed to manage memory for you, then insist on over-riding its memory management.
>
> I agree. I don't want to manage it myself. I just want it to re-use memory or hand it back to the OS if it's got an awful lot that it's not using. Wouldn't you say it was wasteful if (say) an image editor kept an uncompressed copy of an image around in memory after the image had been closed?

Yes, but I'd say it was the programmer's fault if it turned out that the interpreter wasn't doing anything wrong ;-) It could be something inside an exception handler that is keeping a reference to a stack frame or something silly like that.

regards
 Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Recent Ramblings http://holdenweb.blogspot.com
-- http://mail.python.org/mailman/listinfo/python-list
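Tom's code was never posted, so the following is only a guess at the shape of the task: it summarizes a file by iterating over it line by line instead of building a list with readlines(), so that no per-file list exists to be kept alive by accident. The function name `summarize` and the particular summary (line count and mean line length) are assumptions made for illustration.

```python
import os
import tempfile

def summarize(path):
    """Return (line_count, mean_line_length) without holding the lines in a list."""
    count = 0
    total_chars = 0
    with open(path) as f:
        for line in f:              # streams one line at a time
            count += 1
            total_chars += len(line)
    mean = float(total_chars) / count if count else 0.0
    return (count, mean)

# Small self-contained demonstration on a temporary file.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("ab\ncdef\n")

master_list = [summarize(demo_path)]   # only the small tuples are kept
os.remove(demo_path)
```

If memory still grows with a structure like this, a lingering reference elsewhere (as Steve suspects) becomes the more likely explanation.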
Re: why brackets commas in func calls can't be ommited? (maybe it could be PEP?)
dmitrey wrote:
> I looked through the PEPs and didn't find a proposal to remove brackets and commas, to make Python func call syntax caml- or tcl-like: instead of
>     result = myfun(param1, myfun2(param5, param8), param3)
> just make possible using
>     result = myfun param1 (myfun2 param5 param8) param3

You should really post this somewhere that Guido will see it so he can add it to PEP 3099: Things that will Not Change in Python 3000. Really, there's no way this is going to fly, so you might as well drop it or write your own language.

STeVe

P.S. If you use IPython, I believe you can get some of this.
-- http://mail.python.org/mailman/listinfo/python-list
2 Command prompt at the same time
Hello,

I have two batch files and I'm trying to run them in parallel. I haven't been able to find any information on how to make Python open 2 command prompts and then make each prompt run one of the batch files. Can anyone point me in the right direction?

Thanks,
Emile Boudreau
-- http://mail.python.org/mailman/listinfo/python-list
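One possible direction for the question above is the standard subprocess module: start both commands with Popen without waiting in between, then wait for both. This is a sketch rather than a tested Windows recipe. The helper name `run_in_parallel` is made up, and the commands used here are stand-ins that run a short Python one-liner; for real batch files the command lists would be something like `["cmd", "/c", "first.bat"]` (file names hypothetical).

```python
import subprocess
import sys

# CREATE_NEW_CONSOLE exists only on Windows; fall back to 0 (no flag) elsewhere.
NEW_CONSOLE = getattr(subprocess, "CREATE_NEW_CONSOLE", 0)

def run_in_parallel(commands):
    """Start every command at once, then wait for all of them.

    Returns the exit codes in the same order as `commands`.
    """
    procs = [subprocess.Popen(cmd, creationflags=NEW_CONSOLE) for cmd in commands]
    return [proc.wait() for proc in procs]

# Stand-in commands so the sketch runs on any platform.
stand_in = [sys.executable, "-c", "import time; time.sleep(0.2)"]
exit_codes = run_in_parallel([stand_in, stand_in])
```

On Windows the flag should give each child its own console window; on other systems the children simply share the parent's terminal.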
Re: Garbage collection
On Wed, 21 Mar 2007 17:19:23 +, Tom Wright wrote:
>> So what's your actual problem that you are trying to solve?
>
> I have a program which reads a few thousand text files, converts each to a list (with readlines()), creates a short summary of the contents of each (a few floating point numbers) and stores this summary in a master list. From the amount of memory it's using, I think that the lists containing the contents of each file are kept in memory, even after there are no references to them. Also, if I tell it to discard the master list and re-read all the files, the memory use nearly doubles so I presume it's keeping the lot in memory.

Ah, now we're getting somewhere! Python's caching behaviour with strings is almost certainly going to be different to its caching behaviour with ints. (For example, Python caches short strings that look like identifiers, but I don't believe it caches great blocks of text or short strings which include whitespace.)

But again, you haven't really described a problem, just a set of circumstances. Yes, the memory usage doubles. *Is* that a problem in practice? A few thousand 1KB files is one thing; a few thousand 1MB files is an entirely different story. Is the most cost-effective solution to the problem to buy another 512MB of RAM? I don't say that it is. I just point out that you haven't given us any reason to think it isn't.

> The program may run through several collections of files, but it only keeps a reference to the master list of the most recent collection it's looked at. Obviously, it's not ideal if all the old collections hang around too, taking up space and causing the machine to swap.

Without knowing exactly what you're doing with the data, it's hard to tell where the memory is going. I suppose if you are storing huge lists of millions of short strings (words?), they might all be cached. Is there a way you can avoid storing the hypothetical word-lists in RAM, perhaps by writing them straight out to a disk file? That *might* make a difference to the caching algorithm used.

Or you could just have an object leak somewhere. Do you have any complicated circular references that the garbage collector can't resolve? Lists-of-lists? Trees? Anything where objects aren't being freed when you think they are? Are you holding on to references to lists? It's more likely that your code simply isn't freeing lists you think are being freed than it is that Python is holding on to tens of megabytes of random text.

--
Steven.
-- http://mail.python.org/mailman/listinfo/python-list
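The difference between objects freed by reference counting and objects caught in a reference cycle can be observed with weak references. The sketch below assumes CPython (other implementations free objects on a different schedule), and the class name `Node` is made up for the demonstration.

```python
import gc
import weakref

class Node(object):
    """Trivial object that can point at another Node."""
    pass

# 1. No cycle: the object is freed as soon as its last reference goes away.
a = Node()
probe = weakref.ref(a)
del a
freed_by_refcount = probe() is None

# 2. A two-object cycle: reference counts never reach zero on their own.
gc.disable()                      # keep the collector from running mid-demo
x, y = Node(), Node()
x.other, y.other = y, x
cycle_probe = weakref.ref(x)
del x, y
alive_before_collect = cycle_probe() is not None

gc.collect()                      # the cycle detector reclaims both objects
alive_after_collect = cycle_probe() is not None
gc.enable()
```

A weak reference that still returns an object after gc.collect() points at something a live reference is holding, which is the situation both Steves suspect in Tom's program.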
Re: Testing Python updates
Matthew wrote:
> Hello:
>
> What is the methodology for testing the updates to the Python language?

http://www.pybots.org/

regards
 Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Recent Ramblings http://holdenweb.blogspot.com
-- http://mail.python.org/mailman/listinfo/python-list
Re: parsing tables with beautiful soup?
This works:

    for row in soup.find('table', {'class': class_name}):
        for cell in row:
            print cell.contents[0]

Is there a better way to do this?

-cjl
-- http://mail.python.org/mailman/listinfo/python-list
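One variant of the nested loop from this thread is sketched below. To keep it runnable without third-party packages it does not use Beautiful Soup at all: it swaps in the standard library's html.parser, and the class name `TableRows`, the sample HTML and the `stats` class are all invented for illustration. In Beautiful Soup itself the same outer/inner structure would be a loop over the table's `findAll('tr')` with an inner loop over each row's `findAll('td')`.

```python
from html.parser import HTMLParser

class TableRows(HTMLParser):
    """Collect the cell text of every row in the table with a given class.

    Simplifications: nested tables are not handled and the class
    attribute must match exactly.
    """

    def __init__(self, class_name):
        HTMLParser.__init__(self)
        self.class_name = class_name
        self.in_table = False
        self.rows = []        # list of rows, each a list of cell strings
        self.cell = None      # text pieces of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("class") == self.class_name:
            self.in_table = True
        elif self.in_table and tag == "tr":
            self.rows.append([])
        elif self.in_table and tag in ("td", "th"):
            self.cell = []

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False
        elif tag in ("td", "th") and self.cell is not None and self.rows:
            self.rows[-1].append("".join(self.cell).strip())
            self.cell = None

    def handle_data(self, data):
        if self.cell is not None:
            self.cell.append(data)

sample = ('<table class="other"><tr><td>x</td></tr></table>'
          '<table class="stats"><tr><td>a</td><td>1</td></tr>'
          '<tr><td>b</td><td>2</td></tr></table>')

parser = TableRows("stats")
parser.feed(sample)
rows = parser.rows      # outer list: rows; inner lists: cells
```

Asking only for row and cell tags, rather than iterating over every child of the table, avoids tripping over the whitespace strings that sit between tags.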
Re: replace illegal xml characters
killkolor wrote:
>> Does InDesign export broken XML documents? What exactly is your problem?
>
> Yes, unfortunately it does. It uses all possible Unicode characters, though not all are allowed in valid XML (see link in the first post).

Are you sure about this? Could you post a small example? If this is true, don't forget to file a bug report with Adobe too.

--Irmen
-- http://mail.python.org/mailman/listinfo/python-list
Re: Technical Answer - Protecting code in python
On Mar 21, 8:36 am, flit [EMAIL PROTECTED] wrote:
> Hello All,
>
> I have a hard question; every time I look for this answer it leaves the technical domain and goes into the moral/social domain. First, I live in the third world with bad gov., bad education, bad police and a lot of taxes and bills to pay, and yes I live in a democratic state (corrupt, but democratic). So please, don't try to convince me about the social / economical / open source / give to all / be open / all people are honest until proven otherwise / dance with the rabbits... Remember I need to pay bills and security.
>
> Now the technical question:
>
> 1 - Is there a way to write a program in Python and protect it? I am not talking about ultra hard-core protection, just a simple one that will stop 90% of script kiddies.
>
> 2 - If I put the code on the web as a web service, how can I protect my code from being ripped? Is there a way to stop someone using my site from ripping the .py files?
>
> Thanks and sorry for the introduction

Maybe this is an application for PHP. Then any HTML that is visible is not the source but the result of executing the PHP.
-- http://mail.python.org/mailman/listinfo/python-list
Re: why brackets commas in func calls can't be ommited? (maybe it couldbe PEP?)
You asked two questions; the first others have asked also.

Mathematicians sometimes use brackets to indicate function application and sometimes just juxtaposition. When they do the latter and when there are things other than functions (to exclude pure lambda calculus), then there are usually (always?) typographic conventions to differentiate between function names and value names. A common convention is value names one letter, function names multiple letters, as in 'exp cos ln x'. Note that the functions all take one arg. Another is that functions get capital letters, as in 'Fx'.

Without such differentiation, and without declarative typing of *names*, the reader cannot parse a sequence of expressions into function calls until names are resolved into objects. And if calls can return either callable or non-callable objects, as in Python, but rare in mathematics, then one cannot parse until calls are actually made (or at least attempted, and an exception raised).

I mention 'attempted' because in Python there is no certain way to be sure an object is callable except by calling it. It is much more flexible and extensible than most math systems.

Others have alluded to this problem: if you have 'f' mean what 'f()' now means, then you need another way to mean what 'f' now means, such as '`f' (quote f). But again, Python names are not typed. And how would you then indicate 'f()()' instead of 'f()'? The math notations without brackets generally don't have to deal with callables returning callables.

I don't like typing ()s much either, but they seem necessary for Python, and become easier with practice.

As for ,s: they are rather easy to type. More importantly, they turn certain extraneous spaces into syntax errors instead of runtime bugs that might pass silently and give bad answers. For instance, 'x_y' mistyped as 'x y'. Most importantly, making space syntactically significant either within call brackets or everywhere would require prohibiting currently optional spaces. For instance, 'daffodils + crocuses' would have to be written 'daffodils+crocuses', like it or not.

Terry Jan Reedy
-- http://mail.python.org/mailman/listinfo/python-list
parsing combination strings
I need a fast and efficient way to parse a combination string (digits + chars), e.g. s = '12ABA' or '1ACD' or '123CSD' etc.

I want to parse the above strings such that I can grab only the leading digits and ignore the rest of the characters, so if I have s = '12ABA', parser(s) should give me '12' (or '1' or '123' for the other examples).

I can think of a quick and dirty way: check each element in the string with str.isdigit() and stop once it's not a digit. But I would appreciate a more elegant way.
-- http://mail.python.org/mailman/listinfo/python-list
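Two short ways of taking the leading digits are sketched below, both using only the standard library. The function names are made up; returning an empty string when there are no leading digits is an assumption, since the post does not say what should happen in that case.

```python
import re
from itertools import takewhile

_LEADING_DIGITS = re.compile(r"[0-9]+")

def leading_digits(s):
    """Regex version: match() anchors at the start of the string."""
    m = _LEADING_DIGITS.match(s)
    return m.group() if m else ""

def leading_digits_takewhile(s):
    """The 'check each character and stop' idea, written with takewhile."""
    return "".join(takewhile(str.isdigit, s))

examples = ["12ABA", "1ACD", "123CSD", "ABC"]
parsed = [leading_digits(s) for s in examples]
```

Wrap the result in int() if a number rather than a string is wanted, after checking that it is not empty.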
Re: flattening/rolling up/aggregating a large sorted text file
Apparently you want to use this data to know how many blue circles, blue squares, red circles and red squares there are. In other words, I doubt you want to output redundant data columns; you just want this data in a more usable format, and you don't actually need to do multiple passes over it.

This is a fun problem to solve because it uses two very powerful tools: csv.DictReader and bitwise categorization.

Note: your initial data has three records with the same ID. I assume the ID is the unique key, so I changed the data slightly.

[EMAIL PROTECTED] wrote:
> Hi,
>
> Given a large ascii file (delimited or fixed width) with one ID field and dimensions/measures fields, sorted by dimensions, I'd like to flatten or rollup the file by creating new columns: one for each combination of dimension level, and summing up measures over all records for a given ID.
>
> If the wheel has already been invented, great, please point me in the right direction. If not, please share some pointers on how to think about this problem in order to write efficient code.
>
> Is a hash with dimension level combinations a good approach, with values reset at each new ID level?
>
> I know mysql, Oracle etc will do this, but they all have a cap on # of columns allowed. SAS will allow unlimited columns, but I don't own SAS.
>
> Thanks.
>
> ID,color,shape,msr1
> --
> 001, blue, square, 4
> 001, red , circle,5
> 001, red, circle,6
>
> ID, blue_circle, blue_square, red_circle, red_square
> --
> 001,0,4,11,0
> 002 ...

--
Shane Geiger
IT Director
National Council on Economic Education
[EMAIL PROTECTED] | 402-438-8958 | http://www.ncee.net
Leading the Campaign for Economic and Financial Literacy
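For the rollup as the original poster words it (one output column per color/shape combination, measures summed per ID), the "hash keyed by dimension combination, reset at each new ID" idea from the question can be sketched directly. This is not Shane's bitwise approach. It assumes all records for an ID are adjacent in the file, that the dimension levels are known up front, and the function name `rollup` plus the extra 002 row are invented for the example.

```python
import csv
from itertools import groupby

raw = """001, blue, square, 4
001, red , circle,5
001, red, circle,6
002, blue, circle, 7
"""  # the first three rows are from the question; the 002 row is hypothetical

def rollup(lines, colors=("blue", "red"), shapes=("circle", "square")):
    """Return (column_names, rows) with one summed column per color_shape pair."""
    columns = ["%s_%s" % (c, s) for c in colors for s in shapes]
    reader = csv.reader(lines, skipinitialspace=True)
    records = ([field.strip() for field in rec] for rec in reader if rec)
    table = []
    # groupby only merges adjacent records, hence the "IDs are contiguous" assumption
    for rec_id, group in groupby(records, key=lambda rec: rec[0]):
        totals = dict.fromkeys(columns, 0)      # fresh hash for each new ID
        for _, color, shape, measure in group:
            totals["%s_%s" % (color, shape)] += int(measure)
        table.append([rec_id] + [totals[c] for c in columns])
    return columns, table

columns, table = rollup(raw.splitlines())
```

Because only one ID's totals are held at a time, memory use depends on the number of dimension combinations rather than on the size of the file.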
# Note: this is Python 2 code (print statement, generator .next() method).
import string

## BITWISE CATEGORIZATION STUFF

def gNextBit(val=0):
    # Yield successive powers of two: 1, 2, 4, 8, ...
    while True:
        y = 2**val
        val += 1
        yield y

nb = gNextBit()

categories = ['blue','red','square','circle']
#categories_value = ['blue','red','square','circle']

def bitwise_categorize(items):
    # Give each category name its own bit.
    d = {}
    for item in items:
        d[item] = nb.next()
    return d

categories_dict = bitwise_categorize(categories)
#print categories_dict # {'blue': 1, 'circle': 8, 'square': 4, 'red': 2}

def get_properties(category_int):
    # Return the names of all categories whose bit is set in category_int.
    p_list = []
    for k,v in categories_dict.items():
        if category_int & v == v:
            p_list.append(k)
    return p_list

def list_properties():
    # There are 2**n possible bit combinations for n categories.
    for i in range(2**len(categories)):
        print "Properties for something with category_int of",str(i),str(get_properties(i))

#list_properties()

### EXAMPLE DATA

header_fields = ['id','color','shape','msr1']

example_data = """001, blue, square, 4
002, red , circle,5
003, red, circle,6
"""

# write out the example
import os
def writefile(f, data, perms=0750):
    # write() returns None, so chaining it with "and os.chmod(...)" would never run chmod
    open(f, 'w').write(data)
    os.chmod(f, perms)

csv_file = "/Users/shanegeiger/temp.csv"
writefile(csv_file, example_data)

### READING IN THE DATA AND CATEGORIZING IT WITH BITWISE CATEGORIZATION

import csv
reader = csv.DictReader(open(csv_file), [], delimiter=",")
data = []
info = {}
while True:
    try:
        # Read next header line (if there isn't one then exit the loop)
        reader.fieldnames = header_fields
        rdr = reader.next()
        data.append(rdr)
    except StopIteration: break
    categories_int = 0
    #print