Re: file seek is slow
On Mar 10, 6:01 pm, Neil Hodgson wrote:
> Metalone:
>
> > As it turns out each call is only 646 nanoseconds slower than 'C'.
> > However, that is still 80% of the time to perform a file seek,
> > which I would think is a relatively slow operation compared to just
> > making a system call.
>
> A seek may not be doing much beyond setting a current offset value.
> It is likely that fseek(f1, 0, SEEK_SET) isn't even doing a system
> call.

Exactly. If I replace both calls to fseek with gettimeofday (aka
time.time() on my platform in python) I get fairly close results:

$ ./testseek
4.120
$ python2.5 testseek.py
4.170
$ ./testseek
4.080
$ python2.5 testseek.py
4.130

FWIW, my results with fseek aren't as bad as those of the OP. This is
python2.5 on a 2.6.9 Linux OS, with psyco:

$ ./testseek
0.560
$ python2.5 testseek.py
0.750
$ ./testseek
0.570
$ python2.5 testseek.py
0.760
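For reference, a minimal sketch of what a testseek.py-style benchmark
could look like--the actual harness isn't shown in the thread, so the
iteration count, file name, and loop body here are all assumptions:

    # Hypothetical reconstruction of a testseek.py-style benchmark;
    # the real one isn't shown in the thread.
    import time

    N = 1000000                    # assumed iteration count
    f1 = open("testfile", "rb")    # assumed pre-existing test file

    start = time.time()
    for _ in xrange(N):
        f1.seek(0)                 # analogous to fseek(f1, 0, SEEK_SET)
        f1.seek(0)                 # the post mentions two fseek calls
    print "%.3f" % (time.time() - start)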
Re: Evaluate my first python script, please
On Mar 5, 10:53 am, Pete Emerson wrote:
> Thanks for your response, further questions inline.
>
> On Mar 4, 11:07 am, Tim Wintle wrote:
>
> > On Thu, 2010-03-04 at 10:39 -0800, Pete Emerson wrote:
> > > I am looking for advice along the lines of "an easier way to do
> > > this" or "a more python way" (I'm sure that's asking for trouble!)
> > > or "people commonly do this instead" or "here's a slick trick" or
> > > "oh, interesting, here's my version to do the same thing".
>
> > (1) I would wrap it all in a function
>
> > def main():
> >     # your code here
>
> > if __name__ == "__main__":
> >     main()
>
> Is this purely for aesthetic reasons, or will I appreciate this when
> I write my own modules, or something else?

Suppose the above code is in mymodule.py. By wrapping main() you can:

1. Have another module do:

     import mymodule
     ...  (do some stuff, perhaps munge sys.argv)
     mymodule.main()

2. If mymodule has a small function in it, someone else can import it
   and call just that function.

3. You can run pylint, pychecker and other source-code checkers that
   need to be able to import your module to check it (I wouldn't be
   surprised if recent versions of one or the other of those don't
   require imports, and some checkers like pyflakes certainly don't).

4. You can easily have a unit tester call into the module.

A sketch of point 1 follows below.

> > (2) PEP8 (python style guidelines) suggests one import per line
>
> > (3) I'd use four spaces as tab width

+1 on both; it's good to get into the habit of writing standard-
looking Python code.
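To make point 1 concrete, a minimal sketch of the pattern (the module
name and the fake argument are invented for the example):

    # mymodule.py -- hypothetical module using the main() wrapper
    import sys

    def main():
        # real work would go here
        print "args:", sys.argv[1:]

    if __name__ == "__main__":
        main()

    # driver.py -- reuses mymodule without triggering its command-line
    # behavior at import time
    import sys
    import mymodule

    sys.argv = ["mymodule", "--some-option"]   # munge argv, then delegate
    mymodule.main()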
Re: Evaluate my first python script, please
On Mar 4, 1:39 pm, Pete Emerson wrote:
> I've written my first python program, and would love suggestions for
> improvement.
>
> I'm a perl programmer and used a perl version of this program to
> guide me. So in that sense, the python is "perlesque"
>
> This script parses /etc/hosts for hostnames, and based on terms given
> on the command line (argv), either prints the list of hostnames that
> match all the criteria, or uses ssh to connect to the host if the
> number of matches is unique.
>
> I am looking for advice along the lines of "an easier way to do this"
> or "a more python way" (I'm sure that's asking for trouble!) or
> "people commonly do this instead" or "here's a slick trick" or "oh,
> interesting, here's my version to do the same thing".
>
> I am aware that there are performance improvements and error checking
> that could be made, such as making sure the file exists and is
> readable and precompiling the regular expressions and not calculating
> how many sys.argv arguments there are more than once. I'm not hyper
> concerned with performance or idiot proofing for this particular
> script.
>
> Thanks in advance.
>
> #!/usr/bin/python
>
> import sys, fileinput, re, os

'Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems.' -- Jamie Zawinski

Seriously, regexes can be very useful, but there's no need for them
here. Simpler is usually better, and easier to understand.

> filename = '/etc/hosts'
>
> hosts = []
>
> for line in open(filename, 'r'):
>     match = re.search('\d+\.\d+\.\d+\.\d+\s+(\S+)', line)
>     if match is None or re.search('^(?:float|localhost)\.', line):
>         continue
>     hostname = match.group(1)

I find this much clearer without regexes:

    try:
        ip, hostname = line.strip().split(None, 1)
    except ValueError:
        continue
    # I think this is equivalent to your re, but I'm not sure it's what
    # you actually meant...
    #if line.startswith("float.") or line.startswith("localhost."):
    #    continue
    # I'm going with:
    if hostname.startswith("float.") or hostname.startswith("localhost."):
        continue

> count = 0
> for arg in sys.argv[1:]:
>     for section in hostname.split('.'):
>         if section == arg:
>             count = count + 1
>             break
> if count == len(sys.argv) - 1:
>     hosts.append(hostname)

A perfect application of sets--the subset test expresses "matches all
criteria" directly:

    # initialize at program outset
    args = set(sys.argv[1:])
    ...
    hostparts = set(hostname.split("."))
    if args <= hostparts:
        hosts.append(hostname)

Full program:

import sys
import os

filename = '/etc/hosts'
hosts = []
args = set(sys.argv[1:])

for line in open(filename, 'r'):
    # Parse line into ip address and hostname, skipping bogus lines
    try:
        ipaddr, hostname = line.strip().split(None, 1)
    except ValueError:
        continue
    if hostname.startswith("float.") or hostname.startswith("localhost."):
        continue
    # Add to hosts if every argument matches a component of the name
    hostparts = set(hostname.split("."))
    if args <= hostparts:
        hosts.append(hostname)

# If there's only one match, ssh to it--otherwise print out the matches
if len(hosts) == 1:
    os.system("ssh -A %s" % hosts[0])
else:
    for host in hosts:
        print host
Re: When will Java go mainstream like Python?
On Feb 24, 8:05 pm, Lawrence D'Oliveiro wrote:
> In message , Wanja Gayk wrote:
>
> > Reference counting is about the worst technique for garbage
> > collection.
>
> It avoids the need for garbage collection.

That's like saying that driving a VW Beetle avoids the need for an
automobile. Reference counting is a form of garbage collection (like
mark-sweep, copy-collect, and others), not a way of avoiding it.

You're right that ref counting in many implementations is more
deterministic than other common forms of garbage collection; IMO,
Python would be well-served by making the ref-counting semantics it
currently has a guaranteed part of the language spec--or at least
guaranteeing that when a function returns, any otherwise unreferenced
locals are immediately collected. I could be convinced otherwise, but
I _think_ that that change would offer an alternative to all of the
interesting cases where the "with" statement is "useful".
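A small sketch of the behavior being relied on here--note this is how
CPython happens to behave today, not something the language spec
guarantees (which is exactly the point above):

    # In CPython, an object is finalized as soon as its last reference
    # disappears, so the file below is closed the moment read_config
    # returns, without an explicit "with" block.
    def read_config(path):
        f = open(path)       # the file object has a single reference
        return f.read()      # 'f' goes away on return; the refcount
                             # hits 0 and CPython closes the file

    # The "with" statement makes the same promise explicitly, and works
    # on implementations without refcounting (Jython, PyPy, ...):
    def read_config_portable(path):
        with open(path) as f:
            return f.read()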
Re: The future of "frozen" types as the number of CPU cores increases
On Feb 23, 8:03 pm, Nobody wrote:
> On Mon, 22 Feb 2010 22:27:54 -0800, sjdevn...@yahoo.com wrote:
>
> > Basically, multiprocessing is always hard--but it's less hard to
> > start without shared everything. Going with the special case
> > (sharing everything, aka threading) is by far the stupider and more
> > complex way to approach multiprocessing.
>
> Multi-threading hardly shares anything (only dynamically-allocated
> and global data), while everything else (the entire stack) is
> per-thread.
>
> Yeah, I'm being facetious. Slightly.

I'm afraid I may be missing the facetiousness here. The only
definitional difference between threads and processes is that threads
share memory, while processes don't. There are often other important
practical implementation details, but sharing memory vs. not sharing
memory is the key theoretical distinction between threads and
processes.

On most platforms, whether or not you want to share memory (and
abandon memory protection wrt the rest of the program) is the key
factor a programmer should consider when deciding between threads and
processes--the only time that's not true is when the implementation
forces ancillary details upon you.
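The distinction is easy to demonstrate with the stdlib threading and
multiprocessing modules; a quick sketch (Unix-flavored--on Windows the
multiprocessing part would need to live under an
if __name__ == "__main__" guard):

    # Threads share memory: a mutation made in one thread is visible to
    # the rest of the program. Processes don't: the child gets a copy.
    import threading
    import multiprocessing

    data = {"count": 0}

    def bump():
        data["count"] += 1

    t = threading.Thread(target=bump)
    t.start()
    t.join()
    print data["count"]    # 1 -- the thread mutated our dict

    p = multiprocessing.Process(target=bump)
    p.start()
    p.join()
    print data["count"]    # still 1 -- the child mutated its own copy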
Re: The future of "frozen" types as the number of CPU cores increases
On Feb 22, 9:24 pm, John Nagle wrote:
> sjdevn...@yahoo.com wrote:
> > On Feb 20, 9:58 pm, John Nagle wrote:
> >> sjdevn...@yahoo.com wrote:
> >>> On Feb 18, 2:58 pm, John Nagle wrote:
> >>>> Multiple processes are not the answer. That means loading
> >>>> multiple copies of the same code into different areas of memory.
> >>>> The cache miss rate goes up accordingly.
> >>> A decent OS will use copy-on-write with forked processes, which
> >>> should carry through to the cache for the code.
> >> That doesn't help much if you're using the subprocess module. The
> >> C code of the interpreter is shared, but all the code generated
> >> from Python is not.
>
> > Of course. Multithreading also fails miserably if the threads all
> > try to call exec() or the equivalent.
>
> > It works fine if you use os.fork().
>
> Forking in multithreaded programs is iffy.

One more thing: the above statement ("forking in multithreaded
programs is iffy") is absolutely true, but it's also completely
meaningless in modern multiprocessing programs--it's like saying
"gotos in structured programs are iffy". That's true, but it also has
almost no bearing on decently constructed modern programs.
Re: The future of "frozen" types as the number of CPU cores increases
On Feb 22, 9:24 pm, John Nagle wrote:
> sjdevn...@yahoo.com wrote:
> > On Feb 20, 9:58 pm, John Nagle wrote:
> >> sjdevn...@yahoo.com wrote:
> >>> On Feb 18, 2:58 pm, John Nagle wrote:
> >>>> Multiple processes are not the answer. That means loading
> >>>> multiple copies of the same code into different areas of memory.
> >>>> The cache miss rate goes up accordingly.
> >>> A decent OS will use copy-on-write with forked processes, which
> >>> should carry through to the cache for the code.
> >> That doesn't help much if you're using the subprocess module. The
> >> C code of the interpreter is shared, but all the code generated
> >> from Python is not.
>
> > Of course. Multithreading also fails miserably if the threads all
> > try to call exec() or the equivalent.
>
> > It works fine if you use os.fork().
>
> Forking in multithreaded programs is iffy. What happens depends
> on the platform, and it's usually not what you wanted to happen.

Well, yeah. And threading in multiprocess apps is iffy. In the real
world, though, multiprocessing is much more likely to result in a
decent app than multithreading--and if you're not skilled at either,
starting with multiprocessing is by far the smarter way to begin.

Basically, multiprocessing is always hard--but it's less hard to start
without shared everything. Going with the special case (sharing
everything, aka threading) is by far the stupider and more complex way
to approach multiprocessing.

And really, for real-world apps, it's much, much more likely that
fork() will be sufficient than that you'll need to explore the
vagaries of a multithreaded solution. Protected memory rocks, and in
real life, probably 95% of the time threads are only even considered
if the OS can't fork() and otherwise use processes well.
Re: The future of "frozen" types as the number of CPU cores increases
On Feb 20, 9:58 pm, John Nagle wrote:
> sjdevn...@yahoo.com wrote:
> > On Feb 18, 2:58 pm, John Nagle wrote:
> >> Multiple processes are not the answer. That means loading multiple
> >> copies of the same code into different areas of memory. The cache
> >> miss rate goes up accordingly.
>
> > A decent OS will use copy-on-write with forked processes, which
> > should carry through to the cache for the code.
>
> That doesn't help much if you're using the subprocess module. The
> C code of the interpreter is shared, but all the code generated from
> Python is not.

Of course. Multithreading also fails miserably if the threads all try
to call exec() or the equivalent.

It works fine if you use os.fork().
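For anyone who hasn't used it, a bare-bones sketch of the os.fork()
approach being described (Unix only):

    # Unix-only: fork a worker that inherits the parent's pages
    # copy-on-write instead of exec'ing a fresh interpreter.
    import os

    def work():
        print "child %d doing work" % os.getpid()

    pid = os.fork()
    if pid == 0:            # child: everything already loaded is shared
        work()
        os._exit(0)         # leave without running parent cleanup
    else:                   # parent: reap the child
        os.waitpid(pid, 0)
        print "parent %d done" % os.getpid()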
Re: The future of "frozen" types as the number of CPU cores increases
On Feb 18, 2:58 pm, John Nagle wrote:
> Multiple processes are not the answer. That means loading multiple
> copies of the same code into different areas of memory. The cache
> miss rate goes up accordingly.

A decent OS will use copy-on-write with forked processes, which should
carry through to the cache for the code.
Re: Interesting talk on Python vs. Ruby and how he would like Python to have just a bit more syntactic flexibility.
On Feb 20, 1:30 am, Lawrence D'Oliveiro wrote:
> In message , Rhodri James wrote:
>
> > In classic Pascal, a procedure was distinct from a function in that
> > it had no return value. The concept doesn't really apply in Python;
> > there are no procedures in that sense, since if a function
> > terminates without supplying an explicit return value it returns
> > None.
>
> If Python doesn't distinguish between procedures and functions, why
> should it distinguish between statements and expressions?

Because the latter are different in Python (and in Ruby, and in most
modern languages), while the former aren't distinguished in Python or
Ruby or most modern languages? Primarily functional languages are the
main exception, but other than them it's pretty uncommon to find any
modern language that does distinguish procedures and functions, or one
that doesn't distinguish statements and expressions.

You can certainly find exceptions, but distinguishing statements and
expressions is absolutely commonplace in modern languages, and
distinguishing functions and procedures is in the minority.
Re: Interesting talk on Python vs. Ruby and how he would like Python to have just a bit more syntactic flexibility.
On Feb 20, 1:28 am, Lawrence D'Oliveiro wrote:
> In message <87eikjcuzk@benfinney.id.au>, Ben Finney wrote:
>
> > Lawrence D'Oliveiro writes:
>
> >> In message , cjw wrote:
>
> >>> Aren't lambda forms better described as function?
>
> >> Is this a function?
>
> >> lambda : None
>
> >> What about this?
>
> >> lambda : sys.stdout.write("hi there!\n")
>
> > They are both lambda forms in Python. As a Python expression, they
> > evaluate to (they "return") a function object.
>
> So there is no distinction between functions and procedures, then?

Not in most modern languages, no. I think the major places they are
differentiated are in functional languages and in pre-1993ish
languages (give or take a few years), neither of which applies to
Python or Ruby.
Re: Interesting talk on Python vs. Ruby and how he would like Python to have just a bit more syntactic flexibility.
On Feb 18, 10:58 pm, Paul Rubin wrote:
> Steve Howell writes:
> >> But frankly, although there's no reason that you _have_ to name the
> >> content at each step, I find it a lot more readable if you do:
>
> >> def print_numbers():
> >>     tuples = [(n*n, n*n*n) for n in (1,2,3,4,5,6)]
> >>     filtered = [ cube for (square, cube) in tuples
> >>                  if square!=25 and cube!=64 ]
> >>     for f in filtered:
> >>         print f
>
> > The names you give to the intermediate results here are
> > terse--"tuples" and "filtered"--so your code reads nicely.
>
> But that example makes tuples and filtered into completely expanded
> lists in memory. I don't know Ruby so I've been wondering whether the
> Ruby code would run as an iterator pipeline that uses constant memory.

I don't know how Ruby works, either. If it's using constant memory,
switching the Python to generator comprehensions (and getting constant
memory usage) is simply a matter of turning square brackets into
parentheses:

def print_numbers():
    tuples = ((n*n, n*n*n) for n in (1,2,3,4,5,6))
    filtered = ( cube for (square, cube) in tuples
                 if square!=25 and cube!=64 )
    for f in filtered:
        print f

Replace (1,2,3,4,5,6) with an xrange of any size and memory usage
still stays constant.

Though for this particular example, I prefer a strict looping solution
akin to what Jonathan Gardner had upthread:

for n in (1,2,3,4,5,6):
    square = n*n
    cube = n*n*n
    if square == 25 or cube == 64:
        continue
    print cube

> > In a more real world example, the intermediate results would be
> > something like this:
>
> > departments
> > departments_in_new_york
> > departments_in_new_york_not_on_bonus_cycle
> > employees_in_departments_in_new_york_not_on_bonus_cycle
> > names_of_employee_in_departments_in_new_york_not_on_bonus_cycle

I don't think the assertion that the names would be ridiculously long
is accurate, either. Something like:

departments = blah
ny_depts = blah(departments)
non_bonus_depts = blah(ny_depts)
non_bonus_employees = blah(non_bonus_depts)
employee_names = blah(non_bonus_employees)

If the code is at all well-structured, it'll be just as obvious from
the context that each list/generator/whatever is building from the
previous one as it is in the anonymous block case.
Re: Interesting talk on Python vs. Ruby and how he would like Python to have just a bit more syntactic flexibility.
On Feb 18, 11:15 am, Steve Howell wrote:
> def print_numbers()
>   [1, 2, 3, 4, 5, 6].map { |n|
>     [n * n, n * n * n]
>   }.reject { |square, cube|
>     square == 25 || cube == 64
>   }.map { |square, cube|
>     cube
>   }.each { |n|
>     puts n
>   }
> end
>
> IMHO there is no reason that I should have to name the content of
> each of those four blocks of code, nor should I have to introduce the
> "lambda" keyword.

You could do it without intermediate names or lambdas in Python as:

def print_numbers():
    for i in [ cube for (square, cube) in
               [(n*n, n*n*n) for n in [1,2,3,4,5,6]]
               if square!=25 and cube!=64 ]:
        print i

But frankly, although there's no reason that you _have_ to name the
content at each step, I find it a lot more readable if you do:

def print_numbers():
    tuples = [(n*n, n*n*n) for n in (1,2,3,4,5,6)]
    filtered = [ cube for (square, cube) in tuples
                 if square!=25 and cube!=64 ]
    for f in filtered:
        print f
Re: The future of "frozen" types as the number of CPU cores increases
On Feb 17, 2:35 am, Steven D'Aprano wrote:
> On Tue, 16 Feb 2010 21:09:27 -0800, John Nagle wrote:
> > Yes, we're now at the point where all the built-in mutable types
> > have "frozen" versions. But we don't have that for objects. It's
> > generally considered a good thing in language design to offer, for
> > user defined types, most of the functionality of built-in ones.
>
> It's not hard to build immutable user-defined types. Whether they're
> immutable enough is another story :)
>
> >>> class FrozenMeta(type):
> ...     def __new__(meta, classname, bases, classDict):
> ...         def __setattr__(*args):
> ...             raise TypeError("can't change immutable class")
> ...         classDict['__setattr__'] = __setattr__
> ...         classDict['__delattr__'] = __setattr__
> ...         return type.__new__(meta, classname, bases, classDict)
> ...
> >>> class Thingy(object):
> ...     __metaclass__ = FrozenMeta
> ...     def __init__(self, x):
> ...         self.__dict__['x'] = x
> ...
> >>> t = Thingy(45)
> >>> t.x
> 45
> >>> t.x = 42
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 4, in __setattr__
> TypeError: can't change immutable class
>
> It's a bit ad hoc, but it seems to work for me. Unfortunately there's
> no way to change __dict__ to a "write once, read many" dict.

Which makes it not really immutable, as does the relative ease of
using a normal setattr:

>>> t.__dict__['x'] = "foo"
>>> print t.x
foo
>>> object.__setattr__(t, "x", 42)
>>> print t.x
42
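For comparison, the stdlib's collections.namedtuple (new in 2.6) gives
a shallowly immutable user-defined type without a custom metaclass; a
quick sketch (the type and field names here are arbitrary):

    from collections import namedtuple

    Thingy = namedtuple("Thingy", "x")

    t = Thingy(45)
    print t.x              # 45
    try:
        t.x = 42           # fails: 'x' is a read-only property
    except AttributeError as e:
        print e            # can't set attribute

    # And there's no __dict__ to poke at, since namedtuple subclasses
    # tuple and sets __slots__ = (), closing the bypass shown above.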
Re: To (monkey)patch or not to (monkey)patch, that is the question
On Feb 9, 3:54 am, George Sakkis wrote:
> I was talking to a colleague about one rather unexpected/undesired
> (though not buggy) behavior of some package we use. Although there is
> an easy fix (or at least workaround) on our end without any apparent
> side effect, he strongly suggested extending the relevant code by
> hard patching it and posting the patch upstream, hopefully to be
> accepted at some point in the future. In fact we maintain patches
> against specific versions of several packages that are applied
> automatically on each new build. The main argument is that this is
> the right thing to do, as opposed to an "ugly" workaround or a
> fragile monkey patch. On the other hand, I favor practicality over
> purity and my general rule of thumb is that "no patch" > "monkey
> patch" > "hard patch", at least for anything less than a (critical)
> bug fix.

I'd monkey patch for the meantime, but send a hard patch upstream in
the hopes of shifting the maintenance burden to someone else. (Plus
maybe help out the upstream project and other people, I guess.)
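The monkey patch can at least be kept contained; a sketch of the
shape, using json.dumps purely as a stand-in for whatever third-party
function needs its behavior adjusted:

    # Save the original, wrap it, and reinstall it--with a comment
    # pointing at the patch sent upstream so this can be removed later.
    import json

    _original_dumps = json.dumps

    def _patched_dumps(*args, **kwargs):
        kwargs.setdefault("sort_keys", True)   # the behavioral tweak
        return _original_dumps(*args, **kwargs)

    json.dumps = _patched_dumps

    print json.dumps({"b": 1, "a": 2})   # {"a": 2, "b": 1}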
Re: how to run part of my python code as root
On Feb 4, 2:05 pm, Tomas Pelka wrote:
> Hey,
>
> is there possibility how to run part of my code (function for
> example) as superuser.
>
> Or only way how to do this is create a wrapper and run is with Popen
> through sudo (but I have to configure sudo to run "whole" python as
> root).

In decreasing order of desirability:

1. Find a way to not need root access (e.g. grant another user or
   group access to whatever resource you're trying to access).

2. Isolate the stuff that needs root access into a small helper
   program that does strict validation of all input (including
   arguments, environment, etc); when needed, run that process under
   sudo or similar.

2a. Have some sort of well-verified helper daemon that has access to
    the resource you need and mediates use of that resource.

3. Run the process as root, using seteuid() to switch between user and
   root privs. The entire program must be heavily verified and do
   strict validation of all inputs. Any attacker who gets control over
   the process can easily switch to root privs and do damage. This is
   generally a bad idea. (A sketch of the mechanics follows below.)
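A bare sketch of the mechanics behind option 3 (Unix only; the uid and
the resource path are made up, and none of the necessary input
hardening is shown):

    # Assumes the process was started as root, so the real uid is 0
    # and the effective uid can be switched back and forth.
    import os

    UNPRIVILEGED_UID = 1000          # assumed uid of the normal user

    def main():
        os.seteuid(UNPRIVILEGED_UID) # do most work without root privs
        # ... ordinary work handling untrusted input goes here ...
        os.seteuid(0)                # regain root for the privileged bit
        try:
            data = open("/etc/privileged-file").read()  # example action
        finally:
            os.seteuid(UNPRIVILEGED_UID)  # drop again immediately

    if __name__ == "__main__":
        main()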
Re: Python and Ruby
On Feb 2, 5:01 pm, Jonathan Gardner wrote:
> On Feb 1, 6:36 pm, John Bokma wrote:
> > Jonathan Gardner writes:
> > > One of the bad things with languages like perl
>
> > FYI: the language is called Perl, the program that executes a Perl
> > program is called perl.
>
> > > without parentheses is that getting a function ref is not obvious.
> > > You need even more syntax to do so. In perl:
>
> > > foo();        # Call 'foo' with no args.
> > > $bar = foo;   # Call 'foo' with no args, assign to '$bar'
> > > $bar = &foo;  # Don't call 'foo', but assign a pointer to it to '$bar'
> > >               # By the way, this '&' is not the bitwise-and '&'
>
> > It should be $bar = \&foo
> > Your example actually calls foo...
>
> I rest my case. I've been programming perl professionally since 2000,
> and I still make stupid, newbie mistakes like that.
>
> > > One is simple, consistent, and easy to explain. The other one
> > > requires the introduction of advanced syntax and an entirely new
> > > syntax to make function calls with references.
>
> > The syntax follows that of referencing and dereferencing:
>
> > $bar = \@array;  # bar contains now a reference to array
> > $bar->[ 0 ];     # first element of array referenced by bar
> > $bar = \%hash;   # bar contains now a reference to a hash
> > $bar->{ key };   # value associated with key of hash ref. by bar
> > $bar = \&foo;    # bar contains now a reference to a sub
> > $bar->( 45 );    # call sub ref. by bar with 45 as an argument
>
> > Consistent: yes. New syntax? No.
>
> Except for the following symbols and combinations, which are entirely
> new and different from the $@% that you have to know just to use
> arrays and hashes.
>
> \@, ->[ ]
> \%, ->{ }
> \&, ->( )
>
> By the way:
> * How do you do a hashslice on a hashref?
> * How do you invoke a reference to a hash that contains a reference
>   to an array that contains a reference to a function?
>
> Compare with Python's syntax.
>
> # The only way to assign
> a = b

>>> locals().__setitem__('a', 'b')
>>> print a
b

> # The only way to call a function
> b(...)

>>> def b(a):
...     print a*2
...
>>> apply(b, (3,))
6

> # The only way to access a hash or array or string or tuple
> b[...]

>>> b = {}
>>> b[1] = 'a'
>>> print b.__getitem__(1)
a
Re: myths about python 3
On Jan 27, 9:22 am, Daniel Fetchinson wrote:
> >> Hi folks,
>
> >> I was going to write this post for a while because all sorts of
> >> myths periodically come up on this list about python 3. I don't
> >> think the posters mean to spread false information on purpose, they
> >> simply are not aware of the facts.
>
> >> My list is surely incomplete, please feel free to post your
> >> favorite misconception about python 3 that people periodically
> >> state, claim or ask about.
>
> >> 1. Print statement/function creates incompatibility between 2.x
> >> and 3.x!
>
> >> Certainly false or misleading, if one uses 2.6 and 3.x the
> >> incompatibility is not there. Print as a function works in 2.6:
>
> >> Python 2.6.2 (r262:71600, Aug 21 2009, 12:23:57)
> >> [GCC 4.4.1 20090818 (Red Hat 4.4.1-6)] on linux2
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> >>> print( 'hello' )
> >> hello
> >> >>> print 'hello'
> >> hello
>
> >> 2. Integer division creates incompatibility between 2.x and 3.x!
>
> >> Again false or misleading, because one can get the 3.x behavior
> >> with 2.6:
>
> >> Python 2.6.2 (r262:71600, Aug 21 2009, 12:23:57)
> >> [GCC 4.4.1 20090818 (Red Hat 4.4.1-6)] on linux2
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> >>> 6/5
> >> 1
> >> >>> from __future__ import division
> >> >>> 6/5
> >> 1.2
>
> >> Please feel free to post your favorite false or misleading claim
> >> about python 3!
>
> > Well, I see two false or misleading claims just above - namely that
> > the two claims above are false or misleading. They tell just half of
> > the story, and that half is indeed easy. A Python 3 program can be
> > unchanged (in the case of print) or with only trivial modifications
> > (in the case of integer division) be made to run on Python 2.6.
>
> Okay, so we agree that as long as print and integer division is
> concerned, a program can easily be written that runs on both 2.6 and
> 3.x.
>
> My statements are exactly this, so I don't understand why you
> disagree.
>
> > The other way around this is _not_ the case.
>
> What do you mean?
>
> > To say that two things are compatible if one can be used for the
> > other, but the other not for the first, is false or misleading.
>
> I'm not sure what you mean here. Maybe I didn't make myself clear
> enough, but what I mean is this: as long as print and integer
> division is concerned, it is trivial to write code that runs on both
> 2.6 and 3.x. Hence if someone wants to highlight incompatibility
> (which surely exists) between 2.6 and 3.x he/she has to look
> elsewhere.

I think you're misunderstanding what "compatibility" means in a
programming language context. Python 3 and Python 2 are not mutually
compatible, as arbitrary programs written in one will not run in the
other. The most important fallout of that is that Python 3 is not
backwards compatible: existing Python 2 programs won't run unaltered
in Python 3. While it's easy to write a new program in a subset of the
two languages that runs on both Python 2 and 3, the huge existing
codebase of Python 2 code won't run under Python 3.

That there exists an intersection of the two languages that is
compatible with both doesn't make the two languages compatible with
each other--although that subset being fairly large does help mitigate
a sizable chunk of the problems caused by incompatibility (as do tools
like 2to3).

In your example, a python2 program that uses print and division will
fail in python3. The print problem is less significant, since the
failure will probably be a syntax error or a quickly raised exception.
The division problem is more problematic, since a program may appear
to run fine but silently misbehave; such errors are much more likely
to result in significant damage to data or other long-term badness.
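For what it's worth, a short sketch of a module written in that shared
subset, which runs unchanged on 2.6 and 3.x:

    # Runs on both Python 2.6+ and 3.x: print is used as a function
    # and true division is requested explicitly.
    from __future__ import division, print_function

    def average(values):
        return sum(values) / len(values)   # true division on both

    print(average([1, 2, 6]))              # 3.0 under 2.6 and 3.x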
Re: Library support for Python 3.x
On Jan 27, 2:03 pm, Paul Rubin wrote:
> a...@pythoncraft.com (Aahz) writes:
> > From my POV, your question would be precisely identical if you had
> > started your project when Python 2.3 was just released and wanted
> > to know if the libraries you selected would be available for Python
> > 2.6.
>
> I didn't realize 2.6 broke libraries that had worked in 2.3, at least
> on any scale. Did I miss something?

I certainly had to update several modules I use (C extensions) to work
with the new memory management in a recent release (changing PyMem_Del
to Py_DECREF being a pretty common alteration); I can't remember
whether that was for 2.6 or 2.5.
Re: Bare Excepts
On Jan 2, 9:35 pm, Dave Angel wrote:
> Steven D'Aprano wrote:
> > On Sat, 02 Jan 2010 09:40:44 -0800, Aahz wrote:
>
> >> OTOH, if you want to do something different depending on whether
> >> the file exists, you need to use both approaches:
>
> >> if os.path.exists(fname):
> >>     try:
> >>         f = open(fname, 'rb')
> >>         data = f.read()
> >>         f.close()
> >>         return data
> >>     except IOError:
> >>         logger.error("Can't read: %s", fname)
> >>         return ''
> >> else:
> >>     try:
> >>         f = open(fname, 'wb')
> >>         f.write(data)
> >>         f.close()
> >>     except IOError:
> >>         logger.error("Can't write: %s", fname)
> >>         return None
>
> > Unfortunately, this is still vulnerable to the same sort of race
> > condition I spoke about.
>
> > Even more unfortunately, I don't know that there is any fool-proof
> > way of avoiding such race conditions in general. Particularly the
> > problem of "open this file for writing only if it doesn't already
> > exist".
>
> In Windows, there is a way to do it. It's just not exposed to the
> Python built-in function open(). You use the CreateFile() function,
> with /dwCreationDisposition/ of CREATE_NEW.
>
> It's atomic, and fails politely if the file already exists.
>
> No idea if Unix has a similar functionality.

It does. In Unix, you'd pass O_CREAT|O_EXCL to the open(2) system call
(O_CREAT means create a new file, O_EXCL means exclusive mode: fail if
the file exists already).

The python os.open interface looks suspiciously like the Unix system
call as far as invocation goes, but it wraps the Windows functionality
properly for the relevant flags. The following should basically work
on both OSes (though you'd want to specify a Windows filename, and
also set os.O_BINARY or os.O_TEXT as needed on Windows):

import os

def exclusive_open(fname):
    rv = os.open(fname, os.O_RDWR | os.O_CREAT | os.O_EXCL)
    return os.fdopen(rv)

first = exclusive_open("/tmp/test")
print "SUCCESS: ", first
second = exclusive_open("/tmp/test")
print "SUCCESS: ", second

Run that and you should get:

SUCCESS:  <open file '/tmp/test', mode 'r' at 0xb7f72c38>
Traceback (most recent call last):
  File "testopen.py", line 9, in <module>
    second = exclusive_open("/tmp/test")
  File "testopen.py", line 4, in exclusive_open
    rv = os.open(fname, os.O_RDWR | os.O_CREAT | os.O_EXCL)
OSError: [Errno 17] File exists: '/tmp/test'
Re: read text file byte by byte
On Dec 14, 11:44 pm, Terry Reedy wrote:
> On 12/14/2009 7:37 PM, Gabriel Genellina wrote:
> > En Mon, 14 Dec 2009 18:09:52 -0300, Nobody escribió:
> >> On Sun, 13 Dec 2009 22:56:55 -0800, sjdevn...@yahoo.com wrote:
>
> >>> The 3.1 documentation specifies that file.read returns bytes:
>
> >>> Does it need fixing?
>
> >> There are no file objects in 3.x. The file() function no longer
> >> exists. The return value from open() will be an instance of
> >> _io.<something> depending upon the mode, e.g. _io.TextIOWrapper
> >> for 'r', _io.BufferedReader for 'rb', _io.BufferedRandom for
> >> 'w+b', etc.
>
> >> http://docs.python.org/3.1/library/io.html
>
> >> io.IOBase.read() doesn't exist, io.RawIOBase.read(n) reads n bytes,
> >> io.TextIOBase.read(n) reads n characters.
>
> > So basically this section [1] should not exist, or be completely
> > rewritten? At least the references to C stdio library seem wrong to
> > me.
>
> > [1] http://docs.python.org/3.1/library/stdtypes.html#file-objects
>
> I agree. http://bugs.python.org/issue7508
>
> Terry Jan Reedy

Thanks, Terry.
Re: read text file byte by byte
On Dec 14, 4:09 pm, Nobody wrote:
> On Sun, 13 Dec 2009 22:56:55 -0800, sjdevn...@yahoo.com wrote:
>
> > The 3.1 documentation specifies that file.read returns bytes:
>
> > Does it need fixing?
>
> There are no file objects in 3.x.

Then the documentation definitely needs fixing; the excerpt I posted
earlier is from the 3.1 documentation's section about file objects:

http://docs.python.org/3.1/library/stdtypes.html#file-objects

Which begins:

"5.9 File Objects

File objects are implemented using C's stdio package and can be
created with the built-in open() function. File objects are also
returned by some other built-in functions and methods, such as
os.popen() and os.fdopen() and the makefile() method of socket
objects."

(It goes on to describe the read method's operation on bytes that I
quoted upthread.)

Sadly I'm not familiar enough with 3.x to suggest an appropriate edit.
Re: read text file byte by byte
On Dec 14, 1:57 pm, Dennis Lee Bieber wrote:
> On Sun, 13 Dec 2009 22:56:55 -0800 (PST), "sjdevn...@yahoo.com"
> declaimed the following in gmane.comp.python.general:
>
> > The 3.1 documentation specifies that file.read returns bytes:
>
> > file.read([size])
> > Read at most size bytes from the file (less if the read hits EOF
> > before obtaining size bytes). If the size argument is negative or
> > omitted, read all data until EOF is reached. The bytes are returned
> > as a string object. An empty string is returned when EOF is
> > encountered immediately. (For certain files, like ttys, it makes
> > sense to continue reading after an EOF is hit.) Note that this
> > method may call the underlying C function fread() more than once in
> > an effort to acquire as close to size bytes as possible. Also note
> > that when in non-blocking mode, less data than was requested may be
> > returned, even if no size parameter was given.
>
> > Does it need fixing?
>
> I'm still running 2.5 (Maybe next spring I'll see if all the third
> party libraries I have exist in 2.6 versions)... BUT...
>
> "... are returned as a string object..." Aren't "strings" in 3.x now
> unicode? Which would imply, to me, that the interpretation of the
> contents will not be plain bytes.

I'm not even concerned (yet) about how the data is interpreted after
it's read. First I'm trying to clarify what exactly gets read.

The post I was replying to said "In Python 3.x, f.read(1) will read
one character, which may be more than one byte depending on the
encoding." That seems at odds with the documentation saying "Read at
most size bytes from the file"--the fact that it's documented to read
"size" bytes rather than "size" (possibly multibyte) characters is
emphasized by the later language saying that the underlying C fread()
call may be called enough times to read as close to size bytes as
possible.

If the poster I was replying to is correct, it seems like a
documentation update is in order. As a long-time programmer, I would
be very surprised to make a call to f.read(X) and have it return more
than X bytes if I hadn't read this here.
Re: read text file byte by byte
On Dec 13, 5:56 pm, "Rhodri James" wrote:
> On Sun, 13 Dec 2009 06:44:54 -0000, Steven D'Aprano wrote:
> > On Sat, 12 Dec 2009 22:15:50 -0800, daved170 wrote:
>
> >> Thank you all.
> >> Dennis, I really liked your solution for the issue but I have two
> >> questions about it:
> >> 1) My origin file is a text file and not binary
>
> > That's a statement, not a question.
>
> >> 2) I need to read each time 1 byte.
>
> > f = open(filename, 'r')  # open in text mode
> > f.read(1)  # read one byte
>
> The OP hasn't told us what version of Python he's using on what OS.
> On Windows, text mode will compress the end-of-line sequence into a
> single "\n". In Python 3.x, f.read(1) will read one character, which
> may be more than one byte depending on the encoding.

The 3.1 documentation specifies that file.read returns bytes:

file.read([size])
Read at most size bytes from the file (less if the read hits EOF
before obtaining size bytes). If the size argument is negative or
omitted, read all data until EOF is reached. The bytes are returned as
a string object. An empty string is returned when EOF is encountered
immediately. (For certain files, like ttys, it makes sense to continue
reading after an EOF is hit.) Note that this method may call the
underlying C function fread() more than once in an effort to acquire
as close to size bytes as possible. Also note that when in
non-blocking mode, less data than was requested may be returned, even
if no size parameter was given.

Does it need fixing?
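To see the two behaviors the thread is circling side by side, a small
Python 3 sketch (the file name and encoding are arbitrary): text mode
reads characters, binary mode reads bytes:

    # Python 3: one character in text mode may be several bytes on disk.
    with open("sample.txt", "w", encoding="utf-8") as f:
        f.write("\u00e9!")             # e-acute: two bytes in UTF-8

    with open("sample.txt", "r", encoding="utf-8") as f:
        print(repr(f.read(1)))         # one *character* (the e-acute)

    with open("sample.txt", "rb") as f:
        print(repr(f.read(1)))         # b'\xc3' -- one *byte* of it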