Re: file seek is slow

2010-03-10 Thread sjdevn...@yahoo.com
On Mar 10, 6:01 pm, Neil Hodgson 
wrote:
> Metalone:
>
> > As it turns out each call is only
> > 646 nanoseconds slower than 'C'.
> > However, that is still 80% of the time to perform a file seek,
> > which I would think is a relatively slow operation compared to just
> > making a system call.
>
>    A seek may not be doing much beyond setting a current offset value.
> It is likely that fseek(f1, 0, SEEK_SET) isn't even doing a system call.

Exactly.  If I replace both calls to fseek with gettimeofday (aka
time.time() on my platform in python) I get fairly close results:
$ ./testseek
4.120
$ python2.5 testseek.py
4.170
$ ./testseek
4.080
$ python2.5 testseek.py
4.130


FWIW, my results with fseek aren't as bad as those of the OP.  This is
Python 2.5 on a 2.6.9 Linux kernel, with psyco:
$ ./testseek
0.560
$ python2.5 testseek.py
0.750
$ ./testseek
0.570
$ python2.5 testseek.py
0.760
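For reference, here's the shape such a microbenchmark might take (the
actual testseek.py isn't shown in the thread, so the loop count and the
file are my assumptions); written with print() so it runs on 2.6+ and 3.x:

```python
import os
import tempfile
import time

# Create a throwaway file to seek in (assumption: any regular file works).
fd, path = tempfile.mkstemp()
os.write(fd, b"some data")
os.close(fd)

f = open(path, "rb")
n = 100000  # assumed iteration count
start = time.time()
for _ in range(n):
    f.seek(0)  # likely just sets an offset; may not even hit the kernel
elapsed = time.time() - start
f.close()
os.remove(path)

print("%d seeks in %.3f seconds" % (n, elapsed))
```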
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Evaluate my first python script, please

2010-03-05 Thread sjdevn...@yahoo.com
On Mar 5, 10:53 am, Pete Emerson  wrote:
> Thanks for your response, further questions inline.
>
> On Mar 4, 11:07 am, Tim Wintle  wrote:
>
> > On Thu, 2010-03-04 at 10:39 -0800, Pete Emerson wrote:
> > > I am looking for advice along the lines of "an easier way to do this"
> > > or "a more python way" (I'm sure that's asking for trouble!) or
> > > "people commonly do this instead" or "here's a slick trick" or "oh,
> > > interesting, here's my version to do the same thing".
>
> > (1) I would wrap it all in a function
>
> > def main():
> >     # your code here
>
> > if __name__ == "__main__":
> >     main()
>
> Is this purely aesthetic reasons, or will I appreciate this when I
> write my own modules, or something else?

Suppose the above code is in mymodule.py.  By wrapping main() you can:
1. Have another module do:
import mymodule
... (do some stuff, perhaps munge sys.argv)
mymodule.main()
2. If mymodule has a small function in it, someone else can import it
and call that function
3. You can run pylint, pychecker and other source-code checkers that
need to be able to import your module to check it (recent versions of
one or the other may not require imports, and some checkers like
pyflakes certainly don't).
4. You can easily have a unit tester call into the module

etc.
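A refinement I'd add (my own habit, not from the quoted post): have
main() accept argv as a parameter, defaulting to sys.argv, so importers
and unit tests never need to touch the real argv at all:

```python
import sys

def main(argv=None):
    # Callers (and tests) can pass their own argument list; the real
    # command line is only consulted when nothing is passed in.
    if argv is None:
        argv = sys.argv[1:]
    # ... real work here; return a count just for illustration
    return len(argv)

if __name__ == "__main__":
    main()
```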

> > (2) PEP8 (python style guidelines) suggests one import per line
>
> > (3) I'd use four spaces as tab width

+1 on both; it's good to get into the habit of writing standard-
looking Python code.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Evaluate my first python script, please

2010-03-04 Thread sjdevn...@yahoo.com
On Mar 4, 1:39 pm, Pete Emerson  wrote:
> I've written my first python program, and would love suggestions for
> improvement.
>
> I'm a perl programmer and used a perl version of this program to guide
> me. So in that sense, the python is "perlesque"
>
> This script parses /etc/hosts for hostnames, and based on terms given
> on the command line (argv), either prints the list of hostnames that
> match all the criteria, or uses ssh to connect to the host if the
> number of matches is unique.
>
> I am looking for advice along the lines of "an easier way to do this"
> or "a more python way" (I'm sure that's asking for trouble!) or
> "people commonly do this instead" or "here's a slick trick" or "oh,
> interesting, here's my version to do the same thing".
>
> I am aware that there are performance improvements and error checking
> that could be made, such as making sure the file exists and is
> readable and precompiling the regular expressions and not calculating
> how many sys.argv arguments there are more than once. I'm not hyper
> concerned with performance or idiot proofing for this particular
> script.
>
> Thanks in advance.
>
> 
> #!/usr/bin/python
>
> import sys, fileinput, re, os

'Some people, when confronted with a problem, think "I know, I’ll use
regular expressions." Now they have two problems.' — Jamie Zawinski

Seriously, regexes can be very useful but there's no need for them
here.  Simpler is usually better, and easier to understand.

> filename = '/etc/hosts'
>
> hosts = []
>
> for line in open(filename, 'r'):
>         match = re.search('\d+\.\d+\.\d+\.\d+\s+(\S+)', line)
>         if match is None or re.search('^(?:float|localhost)\.', line):
> continue
>         hostname = match.group(1)

I find this much clearer without regexes:

try:
    ip, hostname = line.strip().split(None, 1)
except ValueError:
    continue
# I think this is equivalent to your re, but I'm not sure it's what
# you actually meant...
#if line.startswith("float.") or line.startswith("localhost."):
#    continue
# I'm going with:
if hostname.startswith("float.") or hostname.startswith("localhost."):
    continue


>         count = 0
>         for arg in sys.argv[1:]:
>                 for section in hostname.split('.'):
>                         if section == arg:
>                                 count = count + 1
>                                 break
>         if count == len(sys.argv) - 1:
>                 hosts.append(hostname)

A perfect application of sets.

# initialize at program outset
args = set(sys.argv[1:])
...
hostparts = set(hostname.split("."))
if hostparts & args:
    hosts.append(hostname)
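One caveat worth checking against the OP's stated intent ("match all the
criteria"): the intersection test matches a host when any argument is
present, while the original count-based loop required every argument to
match.  If match-all is what's wanted, a subset test does it (hostnames
below are made up for illustration):

```python
args = set(["ny", "web"])

hostparts = set("web01.ny.example.com".split("."))

# Intersection: true if at least one argument is a hostname component.
any_match = bool(hostparts & args)   # "ny" matches

# Subset: true only if every argument is a hostname component, which
# mirrors the original count == len(sys.argv) - 1 logic.
all_match = args <= hostparts        # "web" is not a component ("web01" is)
```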


Full program:

import sys
import os

filename = '/etc/hosts'

hosts = []
args = set(sys.argv[1:])
for line in open(filename, 'r'):
    # Parse line into ip address and hostname, skipping bogus lines
    try:
        ipaddr, hostname = line.strip().split(None, 1)
    except ValueError:
        continue
    if (hostname.startswith("float.") or
            hostname.startswith("localhost.")):
        continue

    # Add to hosts if it matches at least one argument
    hostparts = set(hostname.split("."))
    if hostparts & args:
        hosts.append(hostname)

# If there's only one match, ssh to it--otherwise print out the matches
if len(hosts) == 1:
    os.system("ssh -A %s" % hosts[0])
else:
    for host in hosts:
        print host
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: When will Java go mainstream like Python?

2010-02-24 Thread sjdevn...@yahoo.com
On Feb 24, 8:05 pm, Lawrence D'Oliveiro  wrote:
> In message , Wanja Gayk wrote:
>
> > Reference counting is about the worst technique for garbage collection.
>
> It avoids the need for garbage collection.

That's like saying that driving a VW Beetle avoids the need for an
automobile.  Reference counting is a form of garbage collection (like
mark-sweep, copy-collect, and others), not a way of avoiding it.

You're right that ref counting in many implementations is more
deterministic than other common forms of garbage collection; IMO,
Python would be well-served by making the ref-counting semantics it
currently has a guaranteed part of the language spec--or at least
guaranteeing that when a function returns, any otherwise unreferenced
locals are immediately collected.

I could be convinced otherwise, but I _think_ that that change would
offer an alternative to all of the interesting cases where the "with"
statement is useful.
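A sketch of what that determinism buys today (this relies on CPython's
refcounting, which the language spec does not guarantee; that gap is
exactly the point):

```python
class Resource(object):
    released = False

    def close(self):
        Resource.released = True

    def __del__(self):
        self.close()

def work():
    r = Resource()   # the local name is the only reference
    # ... use r ...

work()
# In CPython the local's refcount hits zero on return and __del__ runs
# immediately; other implementations may defer this, which is why
# "with" (or a guaranteed-collection rule) matters.
assert Resource.released
```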
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The future of "frozen" types as the number of CPU cores increases

2010-02-23 Thread sjdevn...@yahoo.com
On Feb 23, 8:03 pm, Nobody  wrote:
> On Mon, 22 Feb 2010 22:27:54 -0800, sjdevn...@yahoo.com wrote:
> > Basically, multiprocessing is always hard--but it's less hard to start
> > without shared everything.  Going with the special case (sharing
> > everything, aka threading) is by far the stupider and more complex way
> > to approach multiprocessing.
>
> Multi-threading hardly shares anything (only dynamically-allocated
> and global data), while everything else (the entire stack) is per-thread.
>
> Yeah, I'm being facetious. Slightly.

I'm afraid I may be missing the facetiousness here.

The only definitional difference between threads and processes is that
threads share memory, while processes don't.

There are often other important practical implementation details, but
sharing memory vs not sharing memory is the key theoretical
distinction between threads and processes.  On most platforms, whether
or not you want to share memory (and abandon memory protection wrt the
rest of the program) is the key factor a programmer should consider
when deciding between threads and processes--the only time that's not
true is when the implementation forces ancillary details upon you.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The future of "frozen" types as the number of CPU cores increases

2010-02-22 Thread sjdevn...@yahoo.com
On Feb 22, 9:24 pm, John Nagle  wrote:
> sjdevn...@yahoo.com wrote:
> > On Feb 20, 9:58 pm, John Nagle  wrote:
> >> sjdevn...@yahoo.com wrote:
> >>> On Feb 18, 2:58 pm, John Nagle  wrote:
> >>>>     Multiple processes are not the answer.  That means loading multiple
> >>>> copies of the same code into different areas of memory.  The cache
> >>>> miss rate goes up accordingly.
> >>> A decent OS will use copy-on-write with forked processes, which should
> >>> carry through to the cache for the code.
> >>     That doesn't help much if you're using the subprocess module.  The
> >> C code of the interpreter is shared, but all the code generated from
> >> Python is not.
>
> > Of course.  Multithreading also fails miserably if the threads all try
> > to call exec() or the equivalent.
>
> > It works fine if you use os.fork().
>
>     Forking in multithreaded programs is iffy.

One more thing: the above statement ("forking in multithreaded
programs is iffy"), is absolutely true, but it's also completely
meaningless in modern multiprocessing programs--it's like saying
"gotos in structured programs are iffy".  That's true, but it also has
almost no bearing on decently constructed modern programs.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The future of "frozen" types as the number of CPU cores increases

2010-02-22 Thread sjdevn...@yahoo.com
On Feb 22, 9:24 pm, John Nagle  wrote:
> sjdevn...@yahoo.com wrote:
> > On Feb 20, 9:58 pm, John Nagle  wrote:
> >> sjdevn...@yahoo.com wrote:
> >>> On Feb 18, 2:58 pm, John Nagle  wrote:
> >>>>     Multiple processes are not the answer.  That means loading multiple
> >>>> copies of the same code into different areas of memory.  The cache
> >>>> miss rate goes up accordingly.
> >>> A decent OS will use copy-on-write with forked processes, which should
> >>> carry through to the cache for the code.
> >>     That doesn't help much if you're using the subprocess module.  The
> >> C code of the interpreter is shared, but all the code generated from
> >> Python is not.
>
> > Of course.  Multithreading also fails miserably if the threads all try
> > to call exec() or the equivalent.
>
> > It works fine if you use os.fork().
>
>     Forking in multithreaded programs is iffy.  What happens depends
> on the platform, and it's usually not what you wanted to happen.

Well, yeah.  And threading in multiprocess apps is iffy.  In the real
world, though, multiprocessing is much more likely to result in a
decent app than multithreading--and if you're not skilled at either,
starting with multiprocessing is by far the smarter way to begin.

Basically, multiprocessing is always hard--but it's less hard to start
without shared everything.  Going with the special case (sharing
everything, aka threading) is by far the stupider and more complex way
to approach multiprocessing.

And really, for real-world apps, it's much, much more likely that
fork() will be sufficient than that you'll need to explore the
vagueries of a multithreaded solution.  Protected memory rocks, and in
real life it's probably 95% of the time where threads are only even
considered if the OS can't fork() and otherwise use processes well.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The future of "frozen" types as the number of CPU cores increases

2010-02-20 Thread sjdevn...@yahoo.com
On Feb 20, 9:58 pm, John Nagle  wrote:
> sjdevn...@yahoo.com wrote:
> > On Feb 18, 2:58 pm, John Nagle  wrote:
> >>     Multiple processes are not the answer.  That means loading multiple
> >> copies of the same code into different areas of memory.  The cache
> >> miss rate goes up accordingly.
>
> > A decent OS will use copy-on-write with forked processes, which should
> > carry through to the cache for the code.
>
>     That doesn't help much if you're using the subprocess module.  The
> C code of the interpreter is shared, but all the code generated from
> Python is not.

Of course.  Multithreading also fails miserably if the threads all try
to call exec() or the equivalent.

It works fine if you use os.fork().
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The future of "frozen" types as the number of CPU cores increases

2010-02-20 Thread sjdevn...@yahoo.com
On Feb 18, 2:58 pm, John Nagle  wrote:
>     Multiple processes are not the answer.  That means loading multiple
> copies of the same code into different areas of memory.  The cache
> miss rate goes up accordingly.

A decent OS will use copy-on-write with forked processes, which should
carry through to the cache for the code.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Interesting talk on Python vs. Ruby and how he would like Python to have just a bit more syntactic flexibility.

2010-02-19 Thread sjdevn...@yahoo.com
On Feb 20, 1:30 am, Lawrence D'Oliveiro  wrote:
> In message , Rhodri James wrote:
>
> > In classic Pascal, a procedure was distinct from a function in that it had
> > no return value.  The concept doesn't really apply in Python; there are no
> > procedures in that sense, since if a function terminates without supplying
> > an explicit return value it returns None.
>
> If Python doesn’t distinguish between procedures and functions, why should
> it distinguish between statements and expressions?

Because the latter are different in Python (and in Ruby, and in most
modern languages), while the former aren't distinguished in Python or
Ruby or most modern languages?  Primarily functional languages are the
main exception, but other than them it's pretty uncommon to find any
modern language that does distinguish procedures and functions, or one
that doesn't distinguish statements and expressions.

You can certainly find exceptions, but distinguishing statements and
expressions is absolutely commonplace in modern languages, and
distinguishing functions and procedures is in the minority.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Interesting talk on Python vs. Ruby and how he would like Python to have just a bit more syntactic flexibility.

2010-02-19 Thread sjdevn...@yahoo.com
On Feb 20, 1:28 am, Lawrence D'Oliveiro  wrote:
> In message <87eikjcuzk@benfinney.id.au>, Ben Finney wrote:
>
>
>
> > Lawrence D'Oliveiro  writes:
>
> >> In message , cjw wrote:
>
> >> > Aren't lambda forms better described as function?
>
> >> Is this a function?
>
> >>     lambda : None
>
> >> What about this?
>
> >>     lambda : sys.stdout.write("hi there!\n")
>
> > They are both lambda forms in Python. As a Python expression, they
> > evaluate to (they “return”) a function object.
>
> So there is no distinction between functions and procedures, then?

Not in most modern languages, no.  I think the major places they are
differentiated are in functional languages and in pre-1993ish
languages (give or take a few years), neither of which applies to
Python or Ruby.
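In Python the distinction collapses because a def that never returns a
value still returns None:

```python
def procedure():
    x = 1 + 1        # no return statement: a "procedure" in Pascal terms

def function():
    return 42

# Both are ordinary function objects; the "procedure" just returns None.
assert procedure() is None
assert function() == 42
```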
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Interesting talk on Python vs. Ruby and how he would like Python to have just a bit more syntactic flexibility.

2010-02-18 Thread sjdevn...@yahoo.com
On Feb 18, 10:58 pm, Paul Rubin  wrote:
> Steve Howell  writes:
> >> But frankly, although there's no reason that you _have_ to name the
> >> content at each step, I find it a lot more readable if you do:
>
> >> def print_numbers():
> >>     tuples = [(n*n, n*n*n) for n in (1,2,3,4,5,6)]
> >>     filtered = [ cube for (square, cube) in tuples if square!=25 and
> >> cube!=64 ]
> >>     for f in filtered:
> >>         print f
>
> > The names you give to the intermediate results here are
> > terse--"tuples" and "filtered"--so your code reads nicely.
>
> But that example makes tuples and filtered into completely expanded
> lists in memory.  I don't know Ruby so I've been wondering whether the
> Ruby code would run as an iterator pipeline that uses constant memory.

I don't know how Ruby works, either.  If it's using constant memory,
switching the Python to generator comprehensions (and getting constant
memory usage) is simply a matter of turning square brackets into
parentheses:

def print_numbers():
    tuples = ((n*n, n*n*n) for n in (1,2,3,4,5,6))
    filtered = (cube for (square, cube) in tuples
                if square != 25 and cube != 64)
    for f in filtered:
        print f

Replace (1,2,3,4,5,6) with a large xrange and memory usage still
stays constant.
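To see the laziness concretely, the same pipeline can run over an
unbounded source (itertools.count is my stand-in here) and only ever
compute what's actually consumed:

```python
import itertools

# Unbounded, lazy source of (square, cube) pairs.
tuples = ((n * n, n * n * n) for n in itertools.count(1))
filtered = (cube for (square, cube) in tuples
            if square != 25 and cube != 64)

# Only a handful of elements are ever computed, despite the
# infinite source.
first_five = list(itertools.islice(filtered, 5))
print(first_five)
```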

Though for this particular example, I prefer a strict looping solution
akin to what Jonathan Gardner had upthread:

for n in (1,2,3,4,5,6):
    square = n*n
    cube = n*n*n
    if square == 25 or cube == 64:
        continue
    print cube

> > In a more real world example, the intermediate results would be
> > something like this:
>
> >    departments
> >    departments_in_new_york
> >    departments_in_new_york_not_on_bonus_cycle
> >    employees_in_departments_in_new_york_not_on_bonus_cycle
> >    names_of_employee_in_departments_in_new_york_not_on_bonus_cycle

I don't think the assertion that the names would be ridiculously long
is accurate, either.

Something like:

departments = blah
ny_depts = blah(departments)
non_bonus_depts = blah(ny_depts)
non_bonus_employees = blah(non_bonus_depts)
employee_names = blah(non_bonus_employees)

If the code is at all well-structured, it'll be just as obvious from
the context that each list/generator/whatever is building from the
previous one as it is in the anonymous block case.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Interesting talk on Python vs. Ruby and how he would like Python to have just a bit more syntactic flexibility.

2010-02-18 Thread sjdevn...@yahoo.com
On Feb 18, 11:15 am, Steve Howell  wrote:
>     def print_numbers()
>         [1, 2, 3, 4, 5, 6].map { |n|
>             [n * n, n * n * n]
>         }.reject { |square, cube|
>             square == 25 || cube == 64
>         }.map { |square, cube|
>             cube
>         }.each { |n|
>             puts n
>         }
>     end
>
> IMHO there is no reason that I should have to name the content of each
> of those four blocks of code, nor should I have to introduce the
> "lambda" keyword.

You could do it without intermediate names or lambdas in Python as:
def print_numbers():
    for i in [cube for (square, cube) in
                  [(n*n, n*n*n) for n in [1,2,3,4,5,6]]
              if square != 25 and cube != 64]:
        print i

But frankly, although there's no reason that you _have_ to name the
content at each step, I find it a lot more readable if you do:

def print_numbers():
    tuples = [(n*n, n*n*n) for n in (1,2,3,4,5,6)]
    filtered = [cube for (square, cube) in tuples
                if square != 25 and cube != 64]
    for f in filtered:
        print f
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The future of "frozen" types as the number of CPU cores increases

2010-02-17 Thread sjdevn...@yahoo.com
On Feb 17, 2:35 am, Steven D'Aprano
 wrote:
> On Tue, 16 Feb 2010 21:09:27 -0800, John Nagle wrote:
> >     Yes, we're now at the point where all the built-in mutable types
> > have "frozen" versions.  But we don't have that for objects.  It's
> > generally considered a good thing in language design to offer, for user
> > defined types, most of the functionality of built-in ones.
>
> It's not hard to build immutable user-defined types. Whether they're
> immutable enough is another story :)
>
> >>> class FrozenMeta(type):
>
> ...     def __new__(meta, classname, bases, classDict):
> ...         def __setattr__(*args):
> ...             raise TypeError("can't change immutable class")
> ...         classDict['__setattr__'] = __setattr__
> ...         classDict['__delattr__'] = __setattr__
> ...         return type.__new__(meta, classname, bases, classDict)
> ...
>
> >>> class Thingy(object):
>
> ...     __metaclass__ = FrozenMeta
> ...     def __init__(self, x):
> ...         self.__dict__['x'] = x
> ...
>
> >>> t = Thingy(45)
> >>> t.x
> 45
> >>> t.x = 42
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 4, in __setattr__
> TypeError: can't change immutable class
>
> It's a bit ad hoc, but it seems to work for me. Unfortunately there's no
> way to change __dict__ to a "write once, read many" dict.

Which makes it not really immutable, as does the relative ease of
using a normal setattr:

>>> t.__dict__['x'] = "foo"
>>> print t.x
foo
>>> object.__setattr__(t, "x", 42)
>>> print t.x
42
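For many practical cases a stock immutable user type already exists in
2.6+: collections.namedtuple, which gets real immutability from its
tuple storage rather than from a patchable __setattr__:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)

blocked = False
try:
    p.x = 5          # no instance __dict__ to poke; fields are properties
except AttributeError:
    blocked = True

assert blocked and p.x == 1
```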
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: To (monkey)patch or not to (monkey)patch, that is the question

2010-02-09 Thread sjdevn...@yahoo.com
On Feb 9, 3:54 am, George Sakkis  wrote:
> I was talking to a colleague about one rather unexpected/undesired
> (though not buggy) behavior of some package we use. Although there is
> an easy fix (or at least workaround) on our end without any apparent
> side effect, he strongly suggested extending the relevant code by hard
> patching it and posting the patch upstream, hopefully to be accepted
> at some point in the future. In fact we maintain patches against
> specific versions of several packages that are applied automatically
> on each new build. The main argument is that this is the right thing
> to do, as opposed to an "ugly" workaround or a fragile monkey patch.
> On the other hand, I favor practicality over purity and my general
> rule of thumb is that "no patch" > "monkey patch" > "hard patch", at
> least for anything less than a (critical) bug fix.

I'd monkey patch for the meantime, but send a hard patch in the hopes
of shifting the maintenance burden to someone else.  (Plus maybe help
out the upstream project and other people, I guess)
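For the record, the monkey-patch route usually amounts to rebinding the
attribute while delegating to a saved original.  A toy example against
the stdlib (patching json.dumps to sort keys, purely for illustration):

```python
import json

_orig_dumps = json.dumps

def dumps_sorted(obj, **kwargs):
    # Delegate to the saved original, forcing deterministic key order.
    kwargs.setdefault("sort_keys", True)
    return _orig_dumps(obj, **kwargs)

json.dumps = dumps_sorted   # the monkey patch: every importer now sees it

result = json.dumps({"b": 1, "a": 2})

json.dumps = _orig_dumps    # good hygiene: undo when no longer needed
```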
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to run part of my python code as root

2010-02-04 Thread sjdevn...@yahoo.com
On Feb 4, 2:05 pm, Tomas Pelka  wrote:
>
> Hey,
>
> is there a possibility to run part of my code (a function, for
> example) as superuser?
>
> Or is the only way to create a wrapper and run it with Popen through
> sudo (but then I have to configure sudo to run the "whole" python as
> root)?

In decreasing order of desirability:
1. Find a way to not need root access (e.g. grant another user or
group access to whatever resource you're trying to access).
2. Isolate the stuff that needs root access into a small helper
program that does strict validation of all input (including arguments,
environment, etc); when needed, run that process under sudo or
similar.
2a. Have some sort of well-verified helper daemon that has access to
the resource you need and mediates use of that resource.
3. Run the process as root, using seteuid() to switch between user and
root privs.  The entire program must be heavily verified and do strict
validation of all inputs.  Any attacker who gets control over the
process can easily switch to root privs and do damage.  This is
generally a bad idea.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python and Ruby

2010-02-02 Thread sjdevn...@yahoo.com
On Feb 2, 5:01 pm, Jonathan Gardner 
wrote:
> On Feb 1, 6:36 pm, John Bokma  wrote:
>
>
>
>
>
> > Jonathan Gardner  writes:
> > > One of the bad things with languages like perl
>
> > FYI: the language is called Perl, the program that executes a Perl
> > program is called perl.
>
> > > without parentheses is that getting a function ref is not obvious. You
> > > need even more syntax to do so. In perl:
>
> > >  foo();       # Call 'foo' with no args.
> > >  $bar = foo;  # Call 'foo' with no args, assign to '$bar'
> > >  $bar = &foo; # Don't call 'foo', but assign a pointer to it to '$bar'
> > >               # By the way, this '&' is not the bitwise-and '&'
>
> > It should be $bar = \&foo
> > Your example actually calls foo...
>
> I rest my case. I've been programming perl professionally since 2000,
> and I still make stupid, newbie mistakes like that.
>
> > > One is simple, consistent, and easy to explain. The other one requires
> > > the introduction of advanced syntax and an entirely new syntax to make
> > > function calls with references.
>
> > The syntax follows that of referencing and dereferencing:
>
> > $bar = \@array;       # bar contains now a reference to array
> > $bar->[ 0 ];          # first element of array referenced by bar
> > $bar = \%hash;        # bar contains now a reference to a hash
> > $bar->{ key };        # value associated with key of hash ref. by bar
> > $bar = \&foo;         # bar contains now a reference to a sub
> > $bar->( 45 );         # call sub ref. by bar with 45 as an argument
>
> > Consistent: yes. New syntax? No.
>
> Except for the following symbols and combinations, which are entirely
> new and different from the $, @ and % that you have to know just to use
> arrays and hashes.
>
> \@, ->[ ]
> \%, ->{ }
> \&, ->( )
>
> By the way:
> * How do you do a hashslice on a hashref?
> * How do you invoke reference to a hash that contains a reference to
> an array that contains a reference to a function?
>
> Compare with Python's syntax.
>
> # The only way to assign
> a = b

>>> locals().__setitem__('a', 'b')
>>> print a
b

> # The only way to call a function
> b(...)

>>> def b(a):
...     print a*2
>>> apply(b, (3,))
6

> # The only way to access a hash or array or string or tuple
> b[...]

>>> b={}
>>> b[1] = 'a'
>>> print b.__getitem__(1)
a


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: myths about python 3

2010-01-27 Thread sjdevn...@yahoo.com
On Jan 27, 9:22 am, Daniel Fetchinson 
wrote:
> >> Hi folks,
>
> >> I was going to write this post for a while because all sorts of myths
> >> periodically come up on this list about python 3. I don't think the
> >> posters mean to spread false information on purpose, they simply are
> >> not aware of the facts.
>
> >> My list is surely incomplete, please feel free to post your favorite
> >> misconception about python 3 that people periodically state, claim or
> >> ask about.
>
> >> 1. Print statement/function creates incompatibility between 2.x and 3.x!
>
> >> Certainly false or misleading, if one uses 2.6 and 3.x the
> >> incompatibility is not there. Print as a function works in 2.6:
>
> >> Python 2.6.2 (r262:71600, Aug 21 2009, 12:23:57)
> >> [GCC 4.4.1 20090818 (Red Hat 4.4.1-6)] on linux2
> >> Type "help", "copyright", "credits" or "license" for more information.
> > print( 'hello' )
> >> hello
> > print 'hello'
> >> hello
>
> >> 2. Integer division creates incompatibility between 2.x and 3.x!
>
> >> Again false or misleading, because one can get the 3.x behavior with 2.6:
>
> >> Python 2.6.2 (r262:71600, Aug 21 2009, 12:23:57)
> >> [GCC 4.4.1 20090818 (Red Hat 4.4.1-6)] on linux2
> >> Type "help", "copyright", "credits" or "license" for more information.
> > 6/5
> >> 1
> > from __future__ import division
> > 6/5
> >> 1.2
>
> >> Please feel free to post your favorite false or misleading claim about
> >> python 3!
>
> > Well, I see two false or misleading claims just above - namely that
> > the two claims above are false or misleading. They tell just half of
> > the story, and that half is indeed easy. A Python 3 program can be
> > unchanged (in the case of print) or with only trivial modifications
> > (in the case of integer division) be made to run on Python 2.6.
>
> Okay, so we agree that as long as print and integer division is
> concerned, a program can easily be written that runs on both 2.6 and
> 3.x.
>
> My statements are exactly this, so I don't understand why you disagree.
>
> > The other way around this is _not_ the case.
>
> What do you mean?
>
> > To say that two things are
> > compatible if one can be used for the other, but the other not for the
> > first, is false or misleading.
>
> I'm not sure what you mean here. Maybe I didn't make myself clear
> enough, but what I mean is this: as long as print and integer division
> is concerned, it is trivial to write code that runs on both 2.6 and
> 3.x. Hence if someone wants to highlight incompatibility (which surely
> exists) between 2.6 and 3.x he/she has to look elsewhere.

I think you're misunderstanding what "compatibility" means in a
programming language context.  Python 3 and Python 2 are not mutually
compatible, as arbitrary programs written in one will not run in the
other.  The most important fallout of that is that Python 3 is not
backwards compatible, in that existing Python 2 programs won't run
unaltered in Python 3--while it's easy to write a new program in a
subset of the languages that runs on both Python 2 and 3, the huge
existing codebase of Python 2 code won't run under Python 3.

That there exists an intersection of the two languages that is
compatible with both doesn't make the two languages compatible with
each other--although it being a fairly large subset does help mitigate
a sizable chunk of the problems caused by incompatibility (as do tools
like 2to3).

In your example, a python2 program that uses print and division will
fail in python3.  The print problem is less significant, since the
failure will probably be a syntax error or a quickly raised exception.
The division problem is more problematic, since a program may appear
to run fine but silently misbehave; such errors are much more likely
to result in significant damage to data or other long-term badness.
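The division case in one screenful (the __future__ import is a no-op on
3.x, which is what makes the shared subset work):

```python
from __future__ import division

# 2.6 with the future import, and all of 3.x, agree on both of these:
assert 6 / 5 == 1.2     # "true" division
assert 6 // 5 == 1      # floor division: the spelling that is the same in 2.x and 3.x

# Without the future import, 2.x silently gives 6 / 5 == 1 -- the kind
# of quiet behavior change described above.
```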
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Library support for Python 3.x

2010-01-27 Thread sjdevn...@yahoo.com
On Jan 27, 2:03 pm, Paul Rubin  wrote:
> a...@pythoncraft.com (Aahz) writes:
> > From my POV, your question would be precisely identical if you had
> > started your project when Python 2.3 was just released and wanted to
> > know if the libraries you selected would be available for Python 2.6.
>
> I didn't realize 2.6 broke libraries that had worked in 2.3, at least on
> any scale.  Did I miss something?

I certainly had to update several modules I use (C extensions) to work
with the new memory management in a recent release (changing PyMem_Del
to Py_DECREF being a pretty common alteration); I can't remember
whether that was for 2.6 or 2.5.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Bare Excepts

2010-01-14 Thread sjdevn...@yahoo.com
On Jan 2, 9:35 pm, Dave Angel  wrote:
> Steven D'Aprano wrote:
> > On Sat, 02 Jan 2010 09:40:44 -0800, Aahz wrote:
>
> >> OTOH, if you want to do something different depending on whether the
> >> file exists, you need to use both approaches:
>
> >> if os.path.exists(fname):
> >>     try:
> >>         f = open(fname, 'rb')
> >>         data = f.read()
> >>         f.close()
> >>         return data
> >>     except IOError:
> >>         logger.error("Can't read: %s", fname) return ''
> >> else:
> >>     try:
> >>         f = open(fname, 'wb')
> >>         f.write(data)
> >>         f.close()
> >>     except IOError:
> >>         logger.error("Can't write: %s", fname)
> >>     return None
>
> > Unfortunately, this is still vulnerable to the same sort of race
> > condition I spoke about.
>
> > Even more unfortunately, I don't know that there is any fool-proof way of
> > avoiding such race conditions in general. Particularly the problem of
> > "open this file for writing only if it doesn't already exist".
>
> > 
>
> In Windows, there is  a way to do it.  It's just not exposed to the
> Python built-in function open().  You use the CreateFile() function,
> with a dwCreationDisposition of CREATE_NEW.
>
> It's atomic, and fails politely if the file already exists.
>
> No idea if Unix has a similar functionality.

It does.   In Unix, you'd pass O_CREAT|O_EXCL to the open(2) system
call (O_CREAT means create a new file, O_EXCL means exclusive mode:
fail if the file exists already).

The Python os.open interface looks suspiciously like the Unix system
call as far as invocation goes, but it wraps the Windows functionality
properly for the relevant flags.  The following should basically work
on both OSes (though you'd want to specify a Windows filename, and
also set os.O_BINARY or os.O_TEXT as needed on Windows):

import os

def exclusive_open(fname):
    rv = os.open(fname, os.O_RDWR|os.O_CREAT|os.O_EXCL)
    return os.fdopen(rv)

first = exclusive_open("/tmp/test")
print "SUCCESS: ", first
second = exclusive_open("/tmp/test")
print "SUCCESS: ", second

Run that and you should get:
SUCCESS:  <open file '/tmp/test', mode 'r' at 0xb7f72c38>
Traceback (most recent call last):
  File "testopen.py", line 9, in <module>
    second = exclusive_open("/tmp/test")
  File "testopen.py", line 4, in exclusive_open
    rv = os.open(fname, os.O_RDWR|os.O_CREAT|os.O_EXCL)
OSError: [Errno 17] File exists: '/tmp/test'
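In real code you would normally catch that failure and inspect errno instead of letting the traceback escape. A minimal sketch of that pattern (the function name and path are mine, not from the thread; written to run under both 2.6+ and 3.x):

```python
import errno
import os
import tempfile

def create_new(fname):
    """Create fname exclusively; return a file object, or None if it exists."""
    try:
        fd = os.open(fname, os.O_RDWR | os.O_CREAT | os.O_EXCL)
    except OSError as e:
        if e.errno == errno.EEXIST:
            return None  # lost the race: someone else created the file first
        raise  # any other error (permissions, bad path) propagates
    return os.fdopen(fd, "r+")

path = os.path.join(tempfile.gettempdir(), "testexcl-%d" % os.getpid())
if os.path.exists(path):
    os.remove(path)  # make the demo repeatable

first = create_new(path)   # succeeds: returns a file object
second = create_new(path)  # returns None: the file already exists
```

This keeps the check-and-create atomic at the OS level, which is exactly what the os.path.exists() version upthread cannot guarantee.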
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: read text file byte by byte

2009-12-15 Thread sjdevn...@yahoo.com
On Dec 14, 11:44 pm, Terry Reedy  wrote:
> On 12/14/2009 7:37 PM, Gabriel Genellina wrote:
>
> > En Mon, 14 Dec 2009 18:09:52 -0300, Nobody  escribió:
> >> On Sun, 13 Dec 2009 22:56:55 -0800, sjdevn...@yahoo.com wrote:
>
> >>> The 3.1 documentation specifies that file.read returns bytes:
>
> >>> Does it need fixing?
>
> >> There are no file objects in 3.x. The file() function no longer
> >> exists. The return value from open(), will be an instance of
> >> _io. depending upon the mode, e.g. _io.TextIOWrapper for 'r',
> >> _io.BufferedReader for 'rb', _io.BufferedRandom for 'w+b', etc.
>
> >>http://docs.python.org/3.1/library/io.html
>
> >> io.IOBase.read() doesn't exist, io.RawIOBase.read(n) reads n bytes,
> >> io.TextIOBase.read(n) reads n characters.
>
> > So basically this section [1] should not exist, or be completely rewritten?
> > At least the references to C stdio library seem wrong to me.
>
> > [1]http://docs.python.org/3.1/library/stdtypes.html#file-objects
>
> I agree.http://bugs.python.org/issue7508
>
> Terry Jan Reedy

Thanks, Terry.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: read text file byte by byte

2009-12-14 Thread sjdevn...@yahoo.com
On Dec 14, 4:09 pm, Nobody  wrote:
> On Sun, 13 Dec 2009 22:56:55 -0800, sjdevn...@yahoo.com wrote:
> > The 3.1 documentation specifies that file.read returns bytes:
> > Does it need fixing?
>
> There are no file objects in 3.x.

Then the documentation definitely needs fixing; the excerpt I posted
earlier is from the 3.1 documentation's section about file objects:
http://docs.python.org/3.1/library/stdtypes.html#file-objects

Which begins:

"5.9  File Objects

File objects are implemented using C’s stdio package and can be
created with the built-in open() function. File objects are also
returned by some other built-in functions and methods, such as os.popen
() and os.fdopen() and the makefile() method of socket objects."

(It goes on to describe the read method's operation on bytes that I
quoted upthread.)

Sadly I'm not familiar enough with 3.x to suggest an appropriate edit.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: read text file byte by byte

2009-12-14 Thread sjdevn...@yahoo.com
On Dec 14, 1:57 pm, Dennis Lee Bieber  wrote:
> On Sun, 13 Dec 2009 22:56:55 -0800 (PST), "sjdevn...@yahoo.com"
>  declaimed the following in
> gmane.comp.python.general:
>
> > The 3.1 documentation specifies that file.read returns bytes:
>
> > file.read([size])
> >     Read at most size bytes from the file (less if the read hits EOF
> > before obtaining size bytes). If the size argument is negative or
> > omitted, read all data until EOF is reached. The bytes are returned as
> > a string object. An empty string is returned when EOF is encountered
> > immediately. (For certain files, like ttys, it makes sense to continue
> > reading after an EOF is hit.) Note that this method may call the
> > underlying C function fread() more than once in an effort to acquire
> > as close to size bytes as possible. Also note that when in non-
> > blocking mode, less data than was requested may be returned, even if
> > no size parameter was given.
>
> > Does it need fixing?
>
>         I'm still running 2.5 (Maybe next spring I'll see if all the third
> party libraries I have exist in 2.6 versions)... BUT...
>
>         "... are returned as a string object..." Aren't "strings" in 3.x now
> unicode? Which would imply, to me, that the interpretation of the
> contents will not be plain bytes.

I'm not even concerned (yet) about how the data is interpreted after
it's read.  First I'm trying to clarify what exactly gets read.

The post I was replying to said "In Python 3.x, f.read(1) will read
one character, which may be more than one byte depending on the
encoding."

That seems at odds with the documentation saying "Read at most size
bytes from the file"--the fact that it's documented to read "size"
bytes rather than "size" (possibly multibyte) characters is emphasized
by the later language saying that the underlying C fread() call may be
called enough times to read as close to size bytes as possible.

If the poster I was replying to is correct, it seems like a
documentation update is in order.  As a long-time programmer, I would
be very surprised to make a call to f.read(X) and have it return more
than X bytes if I hadn't read this here.
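For what it's worth, under 3.x the character-vs-byte distinction is easy to demonstrate (a sketch; the filename and the UTF-8 encoding are my assumptions, not from the thread):

```python
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "readdemo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("\u00e9abc")  # 'é' encodes to two bytes in UTF-8

with open(path, "r", encoding="utf-8") as f:
    ch = f.read(1)  # text mode: one *character*, 'é'

with open(path, "rb") as f:
    b = f.read(1)   # binary mode: one *byte*, the first byte of 'é'

os.remove(path)
```

So in text mode read(1) can indeed consume more than one byte from the underlying file, which is what the documentation language about "size bytes" fails to convey.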
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: read text file byte by byte

2009-12-13 Thread sjdevn...@yahoo.com
On Dec 13, 5:56 pm, "Rhodri James" wrote:
> On Sun, 13 Dec 2009 06:44:54 -, Steven D'Aprano wrote:
> > On Sat, 12 Dec 2009 22:15:50 -0800, daved170 wrote:
>
> >> Thank you all.
> >> Dennis I really liked you solution for the issue but I have two question
> >> about it:
> >> 1) My origin file is Text file and not binary
>
> > That's a statement, not a question.
>
> >> 2) I need to read each time 1 byte.
>
> > f = open(filename, 'r')  # open in text mode
> > f.read(1)  # read one byte
>
> The OP hasn't told us what version of Python he's using on what OS.  On  
> Windows, text mode will compress the end-of-line sequence into a single  
> "\n".  In Python 3.x, f.read(1) will read one character, which may be more  
> than one byte depending on the encoding.

The 3.1 documentation specifies that file.read returns bytes:

file.read([size])
Read at most size bytes from the file (less if the read hits EOF
before obtaining size bytes). If the size argument is negative or
omitted, read all data until EOF is reached. The bytes are returned as
a string object. An empty string is returned when EOF is encountered
immediately. (For certain files, like ttys, it makes sense to continue
reading after an EOF is hit.) Note that this method may call the
underlying C function fread() more than once in an effort to acquire
as close to size bytes as possible. Also note that when in non-
blocking mode, less data than was requested may be returned, even if
no size parameter was given.

Does it need fixing?
-- 
http://mail.python.org/mailman/listinfo/python-list