Re: n00b confusion re: local variable referenced before assignment error

2009-06-19 Thread Mike Kazantsev
On Fri, 19 Jun 2009 11:16:38 -0500
Wells Oliver  wrote:

> def save(self, uri, location):
> try:
> handler = urllib2.urlopen(uri)
> except urllib2.HTTPError, e:
> if e.code == 404:
> return
> else:
> print "retrying %s (HTTPError)" % uri
> time.sleep(1)
> self.save(uri, location)
> except urllib2.URLError, e:
> print "retrying %s" % uri
> time.sleep(1)
> self.save(uri, location)
> 
> if not os.path.exists(os.path.dirname(location)):
> os.makedirs(os.path.dirname(location))
> 
> file = open(location, "w")
> file.write(handler.read())
> file.close()


> But what I am seeing is that after a retry (on catching a URLError
> exception), I see bunches of "UnboundLocalError: local variable 'handler'
> referenced before assignment" errors on line 38, which is the
> "file.write(handler.read())" line..
> 
> What gives?

Why not?
The try block fails, the except clause calls the retry, and once that
retry returns, execution continues past the try block to the undefined
"handler" - the assignment never happened in this frame.
You might want to insert a return after each recursive call, or avoid
the (possibly endless) recursion altogether - just wrap the attempt in
a while loop with some counter (aka max_tries).

Also, you can get rid of the code duplication by catching the base
urllib2 exception (URLError), then checking whether it's an
urllib2.HTTPError with code 404, and retrying ("continue" in the loop
case) otherwise.
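
The while-loop version might look like this - fetch_with_retries and
its "opener" argument are names of my own choosing, with opener merely
standing in for urllib2.urlopen:

```python
import time

def fetch_with_retries(opener, uri, max_tries=5, delay=1):
    # "opener" stands in for urllib2.urlopen; a 404 gives up for good,
    # any other error retries up to max_tries times
    last_error = None
    for attempt in range(max_tries):
        try:
            return opener(uri)
        except IOError as e:  # urllib2's URLError/HTTPError subclass IOError
            if getattr(e, 'code', None) == 404:
                return None  # permanent failure, don't retry
            last_error = e
            time.sleep(delay)
    raise last_error  # out of tries
```

Note there's no recursion at all, so no code path can fall through to
use an unassigned "handler".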

-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Calling subprocess with arguments

2009-06-19 Thread Mike Kazantsev
On Fri, 19 Jun 2009 22:00:28 +0600
Mike Kazantsev  wrote:

> On Fri, 19 Jun 2009 08:28:17 -0700
> Tyler Laing  wrote:
> 
> > Thanks mike, the idea that maybe some of the info isn't being passed is
> > certainly interesting.
> > 
> > Here's the output of os.environ and sys.argv:
> >
> ...
> 
> I'm afraid this doesn't make much sense without the output from the
> second run, from py itself. My suggestion was just to compare them -
> pop open a py shell, eval both outputs into two sets, take the diff
> and you'll see it at once.
> If the diff is an empty set then I guess it's pretty safe to assume
> that python creates the subprocess the same way the shell does.

Just thought of one more really simple thing I've missed: vlc might
expect its remote control interface to work with a tty, so when py
shoves it a pipe instead, it automatically switches to non-interactive
mode.

You can remedy that a bit by subclassing subprocess.Popen and replacing
the pipes with a pty, but ptys are quite hard to work with; perhaps the
pexpect module would be of some use there:

  http://pypi.python.org/pypi/pexpect/
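
For the pty route, a rough unix-only sketch with just the stdlib pty
module - the echo command here is only a placeholder for vlc:

```python
import os
import pty
import subprocess

# give the child a pty instead of a pipe, so isatty() is true on its
# end and it stays in interactive mode
master, slave = pty.openpty()
proc = subprocess.Popen(['echo', 'hello'],
                        stdin=slave, stdout=slave, stderr=slave,
                        close_fds=True)
proc.wait()
os.close(slave)  # drop our copy so reads can hit EOF after the child exits
output = os.read(master, 1024)  # the tty layer turns "\n" into "\r\n"
os.close(master)
```

Interactive back-and-forth over the master fd is where it gets hairy,
which is exactly what pexpect wraps up for you.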

-- 
Mike Kazantsev // fraggod.net




Re: Calling subprocess with arguments

2009-06-19 Thread Mike Kazantsev
On Fri, 19 Jun 2009 08:28:17 -0700
Tyler Laing  wrote:

> Thanks mike, the idea that maybe some of the info isn't being passed is
> certainly interesting.
> 
> Here's the output of os.environ and sys.argv:
>
...

I'm afraid this doesn't make much sense without the output from the
second run, from py itself. My suggestion was just to compare them -
pop open a py shell, eval both outputs into two sets, take the diff
and you'll see it at once.
If the diff is an empty set then I guess it's pretty safe to assume
that python creates the subprocess the same way the shell does.
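
To illustrate the diff itself with two invented environment dumps - in
practice these would be the eval()'d repr(os.environ) outputs from the
shell run and the py run:

```python
# made-up environments, just to show the set operations
env_from_shell = {'TERM': 'xterm', 'DISPLAY': ':0', 'SHLVL': '1'}
env_from_py = {'TERM': 'xterm', 'DISPLAY': ':0'}

# symmetric difference: pairs present in one environment but not the other
diff = set(env_from_shell.items()) ^ set(env_from_py.items())
```

Anything that shows up in the diff is a candidate for why the two
invocations behave differently.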

-- 
Mike Kazantsev // fraggod.net




Re: Retrieving column values by column name with MySQLdb

2009-06-19 Thread Mike Kazantsev
On Fri, 19 Jun 2009 10:32:32 -0500
Tim Chase  wrote:

> Mike gave you a good answer, though I think it's MySQL specific. 

I don't have to deal with MySQL frequently, but I remembered that I
somehow got the fields out by name, and now, looking at the code, I
wonder myself why "how" is 1 and what this "how" even is! ;)

I can't seem to find any mention of these methods in the documentation
or even the python source; I guess they are implemented directly in the
underlying C lib.
I hope I've learned to avoid such opaque syntax since then...

-- 
Mike Kazantsev // fraggod.net




Re: Calling subprocess with arguments

2009-06-19 Thread Mike Kazantsev
On Fri, 19 Jun 2009 08:07:29 -0700
Tyler Laing  wrote:

> I can't use communicate, as it waits for the child process to terminate.
> Basically it blocks. I'm trying to have dynamic communication between the
> python program, and vlc.

Unfortunately, the subprocess module doesn't allow it out-of-the-box,
but you can use fcntl calls to perform non-blocking reads/writes on its
pipes, like this:

  import fcntl, os

  flags = fcntl.fcntl(fd, fcntl.F_GETFL)
  fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

After that, you can grab all the available data from the pipe at any
given time w/o blocking.
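
A fuller sketch along those lines - set_nonblocking and read_available
are helper names of my own, and echo stands in for the real child
process:

```python
import errno
import fcntl
import os
import subprocess

def set_nonblocking(fd):
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

def read_available(fd, bufsize=4096):
    # grab whatever is in the pipe right now, without waiting for more
    chunks = []
    while True:
        try:
            chunk = os.read(fd, bufsize)
        except OSError as e:
            if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
                break  # pipe is empty at the moment
            raise
        if not chunk:
            break  # EOF - the writing end is closed
        chunks.append(chunk)
    return b''.join(chunks)

proc = subprocess.Popen(['echo', 'ping'], stdout=subprocess.PIPE)
set_nonblocking(proc.stdout.fileno())
proc.wait()  # in real use you'd call read_available() while it runs
data = read_available(proc.stdout.fileno())
```

The EAGAIN branch is the whole point: instead of blocking, os.read
tells you "nothing yet" and your loop carries on.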

Try this recipe:

  http://code.activestate.com/recipes/576759/

-- 
Mike Kazantsev // fraggod.net




Re: Calling subprocess with arguments

2009-06-19 Thread Mike Kazantsev
On Fri, 19 Jun 2009 07:55:19 -0700
Tyler Laing  wrote:

> I want to execute this command string: vlc -I rc
>
> This allows vlc to be controlled via  a remote interface instead of the
> normal gui interface.
>
> Now, say, I try this from subprocess:
>
> >>>p=subprocess.Popen('vlc -I rc test.avi'.split(' '), shell=False,
> stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
>
> But I don't get the remote interface. I get the normal gui interface. So
> how do I do it? I've tried passing ['vlc', '-I', 'rc'], I've tried ['-I',
> 'rc'] with executable set to 'vlc'. I've had shell=True, I've had
> shell=False. I've tried all these combinations.
>
> What am I doing wrong?

Write a simple script:

  #!/usr/bin/env python
  import sys
  open('/tmp/argv', 'w').write(repr(sys.argv))

And replace 'vlc' with a path to this script, then invoke it from a
shell, compare the results.
If it gets the right stuff, try the same with os.environ (perhaps vlc
keeps a socket location there, just like the ssh/gpg agents?).

-- 
Mike Kazantsev // fraggod.net




Re: Retrieving column values by column name with MySQLdb

2009-06-19 Thread Mike Kazantsev
On Fri, 19 Jun 2009 15:46:46 +0100
jorma kala  wrote:

> Is there a way of retrieving the value of columns in the rows returned by
> fetchall, by column name instead of index on the row?

Try this:

  db = MySQLdb.Connection(host=host,user=user,passwd=passwd,db=database)
  db.query(query)
  result = db.store_result()
  data = result.fetch_row(maxrows=0, how=1)
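
For comparison, the stdlib sqlite3 module offers the same by-name
access through a row factory - not MySQL, but it shows the idea with
something runnable anywhere:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.row_factory = sqlite3.Row  # rows become addressable by column name
conn.execute('CREATE TABLE t (name TEXT, value INTEGER)')
conn.execute("INSERT INTO t VALUES ('answer', 42)")
row = conn.execute('SELECT name, value FROM t').fetchone()
pair = (row['name'], row['value'])  # by name, not by index
```

MySQLdb has a similar facility in MySQLdb.cursors (a dict-returning
cursor class), if you'd rather stay on the DB-API side than use the
low-level query/store_result calls above.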

-- 
Mike Kazantsev // fraggod.net




Re: multiprocessing and process run time

2009-06-19 Thread Mike Kazantsev
On Fri, 19 Jun 2009 07:40:11 -0700 (PDT)
Thomas Robitaille  wrote:

> I'm making use of the multiprocessing module, and I was wondering if there
> is an easy way to find out how long a given process has been running for.
> For example, if I do
> 
> import multiprocessing as mp
> import time
> 
> def time_waster():
> time.sleep(1000)
> 
> p = mp.Process(target=time_waster)
> 
> p.start()
> 
> Is there a way that I can then find how long p has been running for? I
> figured I can use p.pid to get the PID of the process, but I'm not sure
> where to go from there. Is there an easy way to do this?

If you're on a unix-like platform (e.g. linux) you can just use
'ps -e -o pid,start_time | grep <pid>'.
I've used procpy to do ps-like stuff in python, since it's a wrapper
around the same procps library; perhaps it can get the same results as
well, w/o having to invoke shell commands:

  http://code.google.com/p/procpy/
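
On linux you can also scrape /proc directly, no extra libs - a sketch
assuming the usual /proc layout, where starttime is field 22 of
/proc/<pid>/stat, counted in clock ticks since boot:

```python
import os

def process_age(pid):
    # linux-only: age in seconds = uptime - starttime/CLK_TCK
    with open('/proc/%d/stat' % pid) as f:
        stat = f.read()
    # the comm field (2) may contain spaces, so split after its ")"
    fields = stat.rsplit(')', 1)[1].split()
    starttime_ticks = int(fields[19])  # field 22 overall, 19 past comm
    ticks_per_sec = os.sysconf('SC_CLK_TCK')
    with open('/proc/uptime') as f:
        uptime = float(f.read().split()[0])
    return uptime - starttime_ticks / float(ticks_per_sec)

age = process_age(os.getpid())
```

Feed it p.pid and you get how long the child has been running, without
shelling out to ps.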

-- 
Mike Kazantsev // fraggod.net




Re: walking a directory with very many files

2009-06-19 Thread Mike Kazantsev
On Fri, 19 Jun 2009 17:53:40 +1200
Lawrence D'Oliveiro  wrote:

> In message <20090618081423.2e035...@coercion>, Mike Kazantsev wrote:
> 
> > On Thu, 18 Jun 2009 10:33:49 +1200
> > Lawrence D'Oliveiro  wrote:
> > 
> >> In message <20090617214535.10866...@coercion>, Mike Kazantsev
> >> wrote:
> >> 
> >>> On Wed, 17 Jun 2009 23:04:37 +1200
> >>> Lawrence D'Oliveiro  wrote:
> >>> 
> >>>> In message <20090617142431.2b25f...@malediction>, Mike Kazantsev
> >>>> wrote:
> >>>> 
> >>>>> On Wed, 17 Jun 2009 17:53:33 +1200
> >>>>> Lawrence D'Oliveiro  wrote:
> >>>>> 
> >>>>>>> Why not use hex representation of md5/sha1-hashed id as a
> >>>>>>> path, arranging them like /path/f/9/e/95ea4926a4 ?
> >>>>>>> 
> >>>>>>> That way, you won't have to deal with many-files-in-path
> >>>>>>> problem ...
> >>>>>> 
> >>>>>> Why is that a problem?
> >>>>> 
> >>>>> So you can os.listdir them?
> >>>> 
> >>>> Why should you have a problem os.listdir'ing lots of files?
> >>> 
> >>> I shouldn't, and I don't ;)
> >> 
> >> Then why did you suggest that there was a problem being able to
> >> os.listdir them?
> > 
> > I didn't, OP did ...
> 
> Then why did you reply to my question "Why is that a problem?" with
> "So that you can os.listdir them?", if you didn't think there was a
> problem (see above)?

Why do you think that if I didn't suggest there is a problem, I think
there is no problem?

I do think there might be such a problem and even I may have to face it
someday. So, out of sheer curiosity as to how much more ridiculous this
topic can get, I'll try to rephrase and extend what I wrote in the
first place:


Why would you want to listdir them?
I can imagine at least one simple scenario: you had some nasty crash
and you want to check that every file has corresponding, valid db
record.

What's the problem with listdir if there are 10^x of them?
Well, imagine that the db record also holds the file modification time
(say, the files are some kind of cache), so not only do you need to
compare the listdir results with the db, but also os.stat every file,
and some filesystems will do that very slowly with so many of them in
one place.


Now, I think I made this point in the first answer, no?

Of course you can make it more ridiculous with your
I-can't-see-or-solve-it-so-I'll-talk-it-away approach by asking "why
would you want to use such filesystems?", "why do you have to use
FreeBSD?", "why do you have to work for such an employer?", "why do you
have to eat?" etc, but you know, sometimes it's easier and better for
the project/work just to solve the problem than to talk everyone else
out of it, merely because you don't like an otherwise acceptable
solution.

-- 
Mike Kazantsev // fraggod.net




Re: walking a directory with very many files

2009-06-17 Thread Mike Kazantsev
On Thu, 18 Jun 2009 10:33:49 +1200
Lawrence D'Oliveiro  wrote:

> In message <20090617214535.10866...@coercion>, Mike Kazantsev wrote:
> 
> > On Wed, 17 Jun 2009 23:04:37 +1200
> > Lawrence D'Oliveiro  wrote:
> > 
> >> In message <20090617142431.2b25f...@malediction>, Mike Kazantsev wrote:
> >> 
> >>> On Wed, 17 Jun 2009 17:53:33 +1200
> >>> Lawrence D'Oliveiro  wrote:
> >>> 
> >>>>> Why not use hex representation of md5/sha1-hashed id as a path,
> >>>>> arranging them like /path/f/9/e/95ea4926a4 ?
> >>>>> 
> >>>>> That way, you won't have to deal with many-files-in-path problem ...
> >>>> 
> >>>> Why is that a problem?
> >>> 
> >>> So you can os.listdir them?
> >> 
> >> Why should you have a problem os.listdir'ing lots of files?
> > 
> > I shouldn't, and I don't ;)
> 
> Then why did you suggest that there was a problem being able to os.listdir 
> them?

I didn't, the OP did, and that's what the topic "walking a directory
with many files" is about.
I wonder whether you're unable to read past the first line, trying to
make some point, or just the kind of person who keeps interpreting
posts w/o their context like that.

-- 
Mike Kazantsev // fraggod.net




Re: walking a directory with very many files

2009-06-17 Thread Mike Kazantsev
On Wed, 17 Jun 2009 23:04:37 +1200
Lawrence D'Oliveiro  wrote:

> In message <20090617142431.2b25f...@malediction>, Mike Kazantsev wrote:
> 
> > On Wed, 17 Jun 2009 17:53:33 +1200
> > Lawrence D'Oliveiro  wrote:
> > 
> >> > Why not use hex representation of md5/sha1-hashed id as a path,
> >> > arranging them like /path/f/9/e/95ea4926a4 ?
> >> > 
> >> > That way, you won't have to deal with many-files-in-path problem ...
> >> 
> >> Why is that a problem?
> > 
> > So you can os.listdir them?
> 
> Why should you have a problem os.listdir'ing lots of files?

I shouldn't, and I don't ;)

-- 
Mike Kazantsev // fraggod.net




Re: walking a directory with very many files

2009-06-17 Thread Mike Kazantsev
On Wed, 17 Jun 2009 17:53:33 +1200
Lawrence D'Oliveiro  wrote:

> > Why not use hex representation of md5/sha1-hashed id as a path,
> > arranging them like /path/f/9/e/95ea4926a4 ?
> > 
> > That way, you won't have to deal with many-files-in-path problem ...
> 
> Why is that a problem?

So you can os.listdir them?
Don't ask me what for, though, since that's the original question.
Also, not every fs still in use handles this situation effectively -
see my original post.

-- 
Mike Kazantsev // fraggod.net




Re: walking a directory with very many files

2009-06-16 Thread Mike Kazantsev
On Wed, 17 Jun 2009 03:42:02 GMT
Lie Ryan  wrote:

> Mike Kazantsev wrote:
> > In fact, on modern filesystems it doesn't matter whether you
> > accessing /path/f9e95ea4926a4 with million files in /path
> > or /path/f/9/e/95ea with only hundred of them in each path. Former
> > case (all-in-one-path) would even outperform the latter with ext3
> > or reiserfs by a small margin.
> > Sadly, that's not the case with filesystems like FreeBSD ufs2 (at
> > least in sixth branch), so it's better to play safe and create
> > subdirs if the app might be run on different machines than keeping
> > everything in one path.
> 
> It might not matter for the filesystem, but the file explorer (and ls)
> would still suffer. Subfolder structure would be much better, and much
> easier to navigate manually when you need to.

It's an insane idea to navigate any structure with hash-based names
and hundreds of thousands files *manually*: "What do we have here?
Hashies?" ;)

-- 
Mike Kazantsev // fraggod.net




Re: Logging multiple lines

2009-06-16 Thread Mike Kazantsev
On Tue, 16 Jun 2009 22:22:31 -0400
Nikolaus Rath  wrote:

> How do you usually handle multi-line messages? Do you avoid them
> completely (and therefore also the exception logging facilities
> provided by logging)? Or is it possible to tweak the formatter so
> that it inserts the prefix at the beginning of every line? 

I'd log the exception name and timestamp (or id) only, pushing the full
message with the same id to another log or facility (like mailing it to
some dedicated bug-report box).
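
A minimal sketch of that scheme with the stdlib logging module -
log_error and detail_store are names I made up, and the in-memory
store merely stands in for a second log file or a mail handler:

```python
import io
import logging
import uuid

# in-memory stand-ins for the main log and the separate detail facility
main_log = io.StringIO()
detail_store = {}

logger = logging.getLogger('app.errors')
logger.setLevel(logging.ERROR)
logger.propagate = False  # keep the sketch self-contained
handler = logging.StreamHandler(main_log)
handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))
logger.addHandler(handler)

def log_error(exc, details):
    # one-line summary with a short id in the main log...
    err_id = uuid.uuid4().hex[:8]
    logger.error('%s [id=%s]', type(exc).__name__, err_id)
    # ...while the full multi-line message is filed under the same id
    detail_store[err_id] = details
    return err_id

err_id = log_error(ValueError('bad input'), 'line1\nline2\nline3')
```

The main log stays strictly one-line-per-event, and the id lets you
look up the full traceback when you actually need it.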

-- 
Mike Kazantsev // fraggod.net




Re: walking a directory with very many files

2009-06-16 Thread Mike Kazantsev
On Wed, 17 Jun 2009 14:52:28 +1200
Lawrence D'Oliveiro  wrote:

> In message 
> <234b19ac-7baf-4356-9fe5-37d00146d...@z9g2000yqi.googlegroups.com>,
> thebjorn wrote:
> 
> > Not proud of this, but...:
> > 
> > [django] www4:~/datakortet/media$ ls bfpbilder|wc -l
> >  174197
> > 
> > all .jpg files between 40 and 250KB with the path stored in a
> > database field... *sigh*
> 
> Why not put the images themselves into database fields?
> 
> > Oddly enough, I'm a relieved that others have had similar folder
> > sizes ...
> 
> One of my past projects had 40-odd files in a single folder. They
> were movie frames, to allow assembly of movie sequences on demand.

For both scenarios:
Why not use hex representation of md5/sha1-hashed id as a path,
arranging them like /path/f/9/e/95ea4926a4 ?

That way, you won't have to deal with the many-files-in-path problem,
and, since there are thousands of them anyway, name readability
shouldn't matter.

In fact, on modern filesystems it doesn't matter whether you're
accessing /path/f9e95ea4926a4 with a million files in /path
or /path/f/9/e/95ea with only a hundred of them in each path. The
former (all-in-one-path) case would even outperform the latter with
ext3 or reiserfs by a small margin.
Sadly, that's not the case with filesystems like FreeBSD ufs2 (at least
in the 6.x branch), so it's better to play safe and create subdirs if
the app might be run on different machines, rather than keep everything
in one path.
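
The path-building itself is a few lines - hashed_path is my name for
it, and a depth of 3 matches the /f/9/e/ example above:

```python
import hashlib
import os

def hashed_path(root, name, depth=3):
    # spread files over nested subdirs keyed by the first hexdigest chars
    digest = hashlib.sha1(name.encode()).hexdigest()
    parts = list(digest[:depth]) + [digest[depth:]]
    return os.path.join(root, *parts)

p = hashed_path('/path', 'some-file-id')
```

Since sha1 output is uniform, each of the 16^3 leaf dirs ends up with
roughly the same number of files.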

-- 
Mike Kazantsev // fraggod.net




Re: Newbie help for using multiprocessing and subprocess packages for creating child processes

2009-06-16 Thread Mike Kazantsev
On Tue, 16 Jun 2009 23:20:05 +0200
Piet van Oostrum  wrote:

> >>>>> Matt  (M) wrote:
> 
> >M> Try replacing:
> >M> cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
> >M> with:
> >M> cmd = [ “ls”, “/path/to/file/"+staname+"_info.pf" ]
> 
> In addition I would like to remark that -- if the only thing you want
> to do is to start up a new command with subprocess.Popen -- the use
> of the multiprocessing package is overkill. You could use threads as
> well.
> 
> Moreover, if you don't expect any output from these processes and
> don't supply input to them through pipes there isn't even a need for
> these threads. You could just use os.wait() to wait for a child to
> finish and then start a new process if necessary.

And even if there is a need to read/write data from/to the pipes more
than once (aka communicate), using threads or more python subprocesses
seems like hammering a nail with a sledgehammer - just _read_ or
_write_ to the pipes asynchronously.
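
For instance, several children's stdout pipes can be multiplexed from
one plain loop with select() - a sketch, with echo standing in for the
real workers:

```python
import os
import select
import subprocess

# watch all children's stdout pipes from a single thread
procs = [subprocess.Popen(['echo', 'child%d' % i], stdout=subprocess.PIPE)
         for i in range(3)]
pending = dict((p.stdout.fileno(), p) for p in procs)
chunks = []

while pending:
    # block until at least one pipe has data (or EOF) for us
    ready, _, _ = select.select(list(pending), [], [])
    for fd in ready:
        data = os.read(fd, 4096)
        if data:
            chunks.append(data)
        else:  # EOF - that child is done writing
            pending.pop(fd).wait()
```

No threads, no extra processes - just one loop reacting to whichever
pipe is ready.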

-- 
Mike Kazantsev // fraggod.net



Re: persistent composites

2009-06-16 Thread Mike Kazantsev
On Tue, 16 Jun 2009 06:57:13 -0700 (PDT)
Aaron Brady  wrote:

> Making the charitable interpretation that this was the extent of c-l-
> py's support and enthusiasm for my idea, I will now go into mourning.
> Death occurred at oh-eight-hundred.  Rest in peace, support &
> enthusiasm.

I've read this thread from the beginning, being tempted to insert
remarks about the shelve module or ORMs like SQLAlchemy, but that'd be
meaningless without a problem description, which I haven't seen
anywhere. Is it some trick idea like "let's walk on our heads"?

-- 
Mike Kazantsev // fraggod.net




Re: Multi-Threading and KeyboardInterrupt

2009-06-15 Thread Mike Kazantsev
On Mon, 15 Jun 2009 15:43:13 -0400
Matt  wrote:

> I'm going to use the multipocessing library from here forward so I can  
> take advantage of multiple cores and clusters. Either one should work  
> for my use, since in my non-demonstration code each thread spends most  
> of it's time waiting for a separate non-Python subprocess (created  
> with subprocess.Popen) to finish anyway. (I guess Python would see  
> this as IO-blocking) Therefore, if you can fix my toy example with  
> threading, that's fine.
> 
> DB.py, followed by a KeyboardInterrupt yields the output in a.out. I  
> want roughly the output in desired.out.
> 
> What do I need to do to modify this code to get my desired output and  
> corresponding functionality? It would be a shame if this wasn't  
> possible in any pure-Python way.

I don't know how complex your task is, but solving trivial IO blocking
with threads or subprocesses looks like either an ugly hack or overkill
to me.

Why not just use I/O without blocking?
It's not the 80s or 90s anymore, when you had to create a subprocess to
handle every non-synchronous task, and since the main burden will be
pushed onto non-py subprocesses already, why not implement the
controller as a nice, clean and simple single-threaded event loop?

Consider this recipe:
  http://code.activestate.com/recipes/576759/

And if the task before you is complex indeed, involving more than just
two to five child processes with a simple "while True: ..." loop,
consider using the twisted framework - it'll let you do incredible
stuff with any number of sockets in just a few lines of code, in a
clean, abstracted way.
The latter would also mean that you can replace os pipes with network
sockets just by changing the transport name, distributing your app
across any number of machines.

-- 
Mike Kazantsev // fraggod.net




Re: parsing json using simplejson

2009-06-15 Thread Mike Kazantsev
On Mon, 15 Jun 2009 20:01:58 -0700 (PDT)
deostroll  wrote:

> I want to be able to parse it into python objects. Any ideas?

JSON objects behave like python dicts (key:val pairs), so why not just
use them?

Both simplejson and the py2.6 json module (which is quite similar to
the former) do just that, but if you like the JS attribute-like key
access model, you can get it by extending the builtin dict class:


  import json, types, collections


  class AttrDict(dict):
    '''AttrDict - dict with JS-like key=attr access'''
    def __init__(self, *argz, **kwz):
      if len(argz) == 1 and not kwz and isinstance(argz[0], types.StringTypes):
        # treat a lone string argument as a path to a JSON file
        super(AttrDict, self).__init__(json.load(open(argz[0])))
      else:
        super(AttrDict, self).__init__(*argz, **kwz)
      # re-construct all values via factory
      for k,v in self.iteritems(): setattr(self, k, v)

def __val_factory(self, val):
  return AttrDict(val) if isinstance(val, collections.Mapping) else val

def __getattr__(self, k):
  return super(AttrDict, self).__getitem__(k)
__getitem__ = __getattr__

def __setattr__(self, k, v):
  return super(AttrDict, self).__setitem__(k, self.__val_factory(v))
__setitem__ = __setattr__


  if __name__ == '__main__':
import json

data = AttrDict(json.loads('{"host": "docs.python.org",'
  ' "port": 80,'
  ' "references": [ "collections",'
' "json",'
' "types",'
' "data model" ],'
  ' "see_also": { "UserDict": "similar, although'
' less flexible dict implementation." } }'))

print data.references

# You can always use it as a regular dict
print 'port' in data
print data['see_also']

    # Data model propagates itself to any sub-mappings
data.see_also.new_item = dict(x=1, y=2)
print data.see_also.keys()
data.see_also.new_item['z'] = 3
print data.see_also.new_item.z


-- 
Mike Kazantsev // fraggod.net




Re: waling a directory with very many files

2009-06-15 Thread Mike Kazantsev
On Mon, 15 Jun 2009 15:35:04 -0400
Terry Reedy  wrote:

> Christian Heimes wrote:
> > Terry Reedy wrote:
> >> You did not specify version.  In Python3, os.walk has become a generater
> >> function.  So, to answer your question, use 3.1.
> > 
> > I'm sorry to inform you that Python 3.x still returns a list, not a
> > generator.
> 
>  >>> type(os.walk('.'))
> 
> 
> However, it is a generator of directory tuples that include a filename 
> list produced by listdir, rather than a generator of filenames 
> themselves, as I was thinking. I wish listdir had been changed in 3.0 
> along with map, filter, and range, but I made no effort and hence cannot 
> complain.

Why? We have itertools.imap, itertools.ifilter and xrange already.

-- 
Mike Kazantsev // fraggod.net




Re: parsing json using simplejson

2009-06-15 Thread Mike Kazantsev
On Sun, 14 Jun 2009 22:45:38 -0700 (PDT)
deostroll  wrote:

> I need to be able to parse a json data object using the simplejson
> package. First of all I need to know all the task needed for this job.

Note that py2.6 has a bundled json module.

-- 
Mike Kazantsev // fraggod.net




Re: Multi-Threading and KeyboardInterrupt

2009-06-15 Thread Mike Kazantsev
On Mon, 15 Jun 2009 05:37:14 -0700 (PDT)
OdarR  wrote:

> On 13 juin, 07:25, Mike Kazantsev  wrote:
> > There was quite interesting explaination of what happens when you send
> > ^C with threads, posted on concurrency-sig list recently:
> >
> >  http://blip.tv/file/2232410
> >  http://www.dabeaz.com/python/GIL.pdf
> >
> > Can be quite shocking, but my experience w/ threads only confirms that.
> 
> Hi there,
> please read this package page (in 2.6), this is very interesting.
> http://docs.python.org/library/multiprocessing.html
> 
> I tested it : it works. Multi-core cpu's are happy :-)

I'd certainly prefer using processes because they indeed work
flawlessly in that respect, but threads are way simpler and much more
integrated into the language, so I can avoid re-implementing tons of
shared state, IPC and locking by using threads, which basically run in
the same context.
That shared machinery might amount to 90% of the code for a trivial but
parallel task.

Alas, threads don't work flawlessly in all cases, but there are still a
million and one uses for them.

-- 
Mike Kazantsev // fraggod.net




Re: Different types of dicts with letter before the curly braces.

2009-06-14 Thread Mike Kazantsev
On Sun, 14 Jun 2009 04:36:17 -0700 (PDT)
kindly  wrote:

> Python already has it for strings r"foo" or u"bar".  So I do not think
> its going against the grain.



Yes, and there's other syntactic sugar like ";" (barely used), the
string prefixes mentioned, "(element,)", "%s"%var or the curly braces
themselves.

Some of them might even seem unnecessary and redundant, but they should
stay there to support legacy code, at least, and I don't think it's a
good idea to add any more. In fact, py3 favors "{0}".format(var) over
the "%s"%var syntax, and I think it's a good call.

There's only so much sugar you can add before it turns into salt and
you start seeing lines like these:

s**'@z!~;()=~$x>;%x>l;$(,'*e;y*%z),$;@=!;h(l~;*punch jokers;halt;*;print;

I'm happy to use python because it discourages such syntax, among other things.



-- 
Mike Kazantsev // fraggod.net




Re: Different types of dicts with letter before the curly braces.

2009-06-14 Thread Mike Kazantsev
On Sun, 14 Jun 2009 04:02:47 -0700 (PDT)
kindly  wrote:

> Am I crazy to think this is a good idea?  I have not looked deeply
> pythons grammer to see if it conflicts with anything, but on the
> surface it looks fine.

I'd say "on the surface it looks like perl" ;)
I'd prefer to use dict() to declare a dict, not some mix of letters and
incomprehensible symbols, thank you.

-- 
Mike Kazantsev // fraggod.net




Re: Make upof Computer

2009-06-14 Thread Mike Kazantsev
On Sun, 14 Jun 2009 00:46:16 -0700 (PDT)
"Mr . Waqar Akbar"  wrote:

...

Judging by the typo in the last subject, someone indeed types all this
crap in manually! Oh my god...

-- 
Mike Kazantsev // fraggod.net




Re: Exceptions and Object Destruction (was: Problem with apsw and garbage collection)

2009-06-13 Thread Mike Kazantsev
On Fri, 12 Jun 2009 18:33:13 -0400
Nikolaus Rath  wrote:

> Nikolaus Rath  writes:
> > Hi,
> >
> > Please consider this example:
> []
> 
> I think I managed to narrow down the problem a bit. It seems that when
> a function returns normally, its local variables are immediately
> destroyed. However, if the function is left due to an exception, the
> local variables remain alive:
>
...
> 
> Is there a way to have the obj variable (that is created in dostuff())
> destroyed earlier than at the end of the program? As you can see, I
> already tried to explicitly call the garbage collector, but this does
> not help.

Strange that no one has suggested contextlib, which was made _exactly_
for this purpose:


  #!/usr/bin/env python
  import gc

  class testclass(object):
    def __init__(self):
      self.alive = True # just for example
      print "Initializing"

    def __del__(self):
      if self.alive:
        # the "alive" flag makes destruction idempotent,
        # so it won't raise if already done
        print "Destructing"
        self.alive = False

    def __enter__(self): return self # so "as obj" binds the instance, not None
    def __exit__(self, ex_type, ex_val, ex_trace):
      self.__del__()
      return False # don't suppress the original exception, if any

  
  def dostuff(fail):
with testclass() as obj:
  # some stuff
  if fail:
raise TypeError
  # some more stuff
print "success"

  
  print "Calling dostuff"
  dostuff(fail=False)
  print "dostuff returned"

  try:
print "Calling dostuff"
dostuff(fail=True)
  except TypeError:
pass

  gc.collect()
  print "dostuff returned" 


And it doesn't matter where you use "with" - it creates a volatile
context, which is destructed before anything else happens at a higher
level.

Another simplified case, similar to yours is file objects:


  with open(tmp_path, 'w') as file:
# write_ops
  os.rename(tmp_path, path)

So whatever happens inside "with", the file ends up closed, else
os.rename might replace a valid path with a zero-length file.

It should be easy to use a db cursor with contextlib as well; consider
the contextmanager decorator:


  from contextlib import contextmanager

  @contextmanager
  def get_cursor():
    cursor = conn.cursor() # created before "try", so "finally"
    try:                   # can't hit an unassigned name
      yield cursor
    finally:
      cursor.close() # runs on normal exit and on exception alike

  with get_cursor() as cursor:
    # whatever ;)



-- 
Mike Kazantsev // fraggod.net




Re: Multi-Threading and KeyboardInterrupt

2009-06-13 Thread Mike Kazantsev
On Sat, 13 Jun 2009 04:42:16 -0700 (PDT)
koranthala  wrote:

> Are there other videos/audio like this? I am learning more from these
> videos than by experience alone.

Indeed, it is a very interesting presentation, but I'm afraid I've
stumbled upon it just as you did, but on concurrency-sig mailing list.

It's a relatively new list (now hosted on mail.python.org), not
specifically dedicated to podcasts or, for that matter, any
implementation details. I haven't seen any other material like this
there.

> I did find one - http://www.awaretek.com/python/ - are there other
> links?

Thanks for sharing this link, although I prefer such information in
written form - it's easier/faster to work with and much more accessible.

-- 
Mike Kazantsev // fraggod.net




Re: moving Connection/PipeConnection between processes

2009-06-13 Thread Mike Kazantsev
On Sat, 13 Jun 2009 02:23:37 -0500
Randall Smith  wrote:

> I've got a situation in which I'd like to hand one end of a pipe to 
> another process.  First, in case you ask why, a spawner process is 
> created early before many modules are imported.  That spawner process is 
> responsible for creating new processes and giving a proxy to the parent 
> process.
>
...
> 
> Looking at the pickle docs, I wonder if this could be resolved by adding 
> a __getnewargs__ method to _multiprocessing.Connection.  But even if 
> that would work I couldn't do it now since it's an extension module. 
> I've thought about trying to recreate the Connection.  Looks like it 
> should be possible with Connection.fileno().  The Unix portion looks 
> easy, but the win32 portion does not.
> 
> So if it's possible, what's the best way to pass a Connection to another 
> process?

Pickle has nothing to do with the problem, since it lies much deeper: in
the OS.

From the kernel's point of view, every process has its own "descriptor
table", and the integer id of the descriptor is all the process gets, so
when you say "os.pipe()" the kernel actually gives you a number which is
completely meaningless to any other process - it either doesn't exist
in its descriptor table or points to something else.

So what you actually need is to tell the kernel to duplicate the
underlying object into another process' table (with its own numbering
there), which is usually done via a special ancillary message
(SCM_RIGHTS) for sendmsg(2) in C, so you should probably look for a py
implementation of this call, which I haven't stumbled upon but,
admittedly, never looked for.
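For completeness: later python versions (3.3+) did grow the necessary
call - socket.sendmsg/recvmsg with an SCM_RIGHTS ancillary message. A
rough unix-only sketch (helper names send_fd/recv_fd are mine, not any
stdlib API):

```python
import array, os, socket

def send_fd(sock, fd):
    # SCM_RIGHTS asks the kernel to duplicate fd into the receiving
    # process' descriptor table (where it gets its own number).
    sock.sendmsg([b'F'], [( socket.SOL_SOCKET, socket.SCM_RIGHTS,
        array.array('i', [fd]) )])

def recv_fd(sock):
    fds = array.array('i')
    msg, ancdata, flags, addr = sock.recvmsg(
        1, socket.CMSG_SPACE(fds.itemsize) )
    level, ctype, data = ancdata[0]
    fds.frombytes(data[:fds.itemsize])
    return fds[0]

a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
r, w = os.pipe()
send_fd(a, r)  # the "b" end could just as well be in another process
r2 = recv_fd(b)
os.write(w, b'hello')
print(os.read(r2, 5))  # b'hello'
```

Here both ends live in one process for demonstration purposes, but the
same works across fork() with each side holding one end of the pair.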


-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Lexical scope: converting Perl to Python

2009-06-12 Thread Mike Kazantsev
On Fri, 12 Jun 2009 22:02:53 -0700 (PDT)
Andrew Savige  wrote:

> I'd like to convert the following Perl code to Python:
> 
>  use strict;
>  {
>    my %private_hash = ( A=>42, B=>69 );
>    sub public_fn {
>  my $param = shift;
>  return $private_hash{$param};
>    }
>  }
>  print public_fn("A");    # good:  prints 42
>  my $x = $private_hash{"A"};  # error: good, hash not in scope
>
...
> 
> What is the Pythonic equivalent of Perl's lexical scope, as
> illustrated by the code snippet above?

If you're using scope for garbage-collecting purposes, there's "with"
statement and contextlib:

  from contextlib import contextmanager

  @contextmanager
  def get_hash():
    complex_hash = dict(A=42, B=69)
    try: yield complex_hash
    except Exception:
      del complex_hash # complex destructor ;)
      raise

  with get_hash() as hash:
    pass # do stuff with hash

Note that this only makes sense if you need to implement some complex
operation on hash destruction, and do that whatever-happens-inside-with
to close the object, obviously not the case with simple dict above.

And if you want to obfuscate one part of your code from another, you'll
probably have better luck with languages like java, since no one seems
to care about such stuff with python, so it'd be a hack against the
language, at best.
Why would you want to hide the code from itself, anyway? It's not like
you'd be able to accomplish it - code can easily grep its own process
memory and harvest all the "private" values, so I'd suggest getting
some fresh air when you start to feel like doing that.

-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [Tutor] Multi-Threading and KeyboardInterrupt

2009-06-12 Thread Mike Kazantsev
On Thu, 11 Jun 2009 22:35:15 -0700
Dennis Lee Bieber  wrote:

> On Thu, 11 Jun 2009 08:44:24 -0500, "Strax-Haber, Matthew (LARC-D320)"
>  declaimed the following in
> gmane.comp.python.general:
> 
> > I sent this to the Tutor mailing list and did not receive a response.
> > Perhaps one of you might be able to offer some sagely wisdom or pointed
> > remarks?
> > 
> > Please reply off-list and thanks in advance. Code examples are below in
> > plain text.
> >
>   Sorry -- you post to a public forum, expect to get the response on a
> public forum...
> 
> > > My program runs interactively by allowing the user to directly
> > > interact with the python prompt. This program has a runAll() method
> > > that runs a series of subprocesses with a cap on how many instances
> > > are running at a time. My intent is to allow the user to use Ctrl-C to
> > > break these subprocesses. Note that while not reflected in the demo
> 
>   Are they subprocesses or threads? Your sample code seems to be using
> threads.
> 
>   When using threads, there is no assurance that any thread other than
> the main program will receive a keyboard interrupt.

In fact, no thread other than the main one will get the interrupt.


> > def runAll():
> > workers = [ Thread(target = runSingle, args = [i])
> > for i in xrange(MAX_SUBPROCS + 1) ]
> > try:
> > for w in workers:
> > w.start()
> > except KeyboardInterrupt:
> > ## I want this to be shown on a KeyboardInterrupt
> > print '* stopped midway '
> 
>   You are unlikely to see that... After you start the defined worker
> /threads/ (which doesn't take very long -- all threads will be started,
> but some may immediately block on the semaphore) this block will exit
> and you will be at...
> 
> > for w in workers:
> > w.join()
>
>   ... a .join() call, which is the most likely position at which the
> keyboard interrupt will be processed, killing the main program thread
> and probably generating some errors as dangling active threads are
> forceably killed.

There was a quite interesting explanation of what happens when you send
^C with threads, posted on the concurrency-sig list recently:

  http://blip.tv/file/2232410
  http://www.dabeaz.com/python/GIL.pdf

Can be quite shocking, but my experience w/ threads only confirms that.


-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Specify the sorting direction for the various columns/

2009-06-11 Thread Mike Kazantsev
On Thu, 11 Jun 2009 18:54:56 -0700 (PDT)
Oni  wrote:

> Managed to get a dictionary to sort on multiple columns using a tuple
> to set the sort order (see below). However how can I control that
> column "date" orders descending and the column "name" orders
> ascending.
...
> bob = entries
> bob.sort(key=operator.itemgetter(*sortorderarr),reverse=True)
> pp.pprint(bob)

Note that this accomplishes nothing, since bob and entries are the same
object, so entries.sort and bob.sort are the same method of the same
object.
You can use the "copy" module to clone the list and its contents (dict
objects), or just use the list constructor to clone the list structure
only, leaving the contents essentially the same, but in a different
order.
Or, in this case, you can just use the "sorted" function, which
constructs a sorted list from any iterable.
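A quick demonstration of the aliasing and of "sorted" (with plain ints
instead of the dicts above):

```python
entries = [3, 1, 2]
bob = entries                # same list object, not a copy
bob.sort()
print(entries)               # [1, 2, 3] - the original got reordered too

carol = sorted(entries, reverse=True)  # new list, original untouched
print(carol)                 # [3, 2, 1]
print(entries)               # still [1, 2, 3]
```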


As for the question, in addition to Mark's suggestion of doing
sub-sorting, you can also construct complex index (code below).
Dunno which would be more efficient in the particular case...

  import datetime, time

  entries = [
    {'name': 'ZZ2', 'username': 'ZZ3',
      'date': datetime.datetime(2008, 9, 30, 16, 43, 54)},
    {'name': 'ZZ2', 'username': 'ZZ5',
      'date': datetime.datetime(2008, 9, 30, 16, 43, 54)},
    {'name': 'ZZ2', 'username': 'ZZ1',
      'date': datetime.datetime(2007, 9, 30, 16, 43, 54)},
    {'name': 'AA2', 'username': 'AA2',
      'date': datetime.datetime(2007, 9, 30, 16, 43, 54)} ]

  entries.sort(key=lambda x: (x['name'], -time.mktime(x['date'].timetuple())))

Here the time is negated, yielding reverse (descending) sort order for
that column.

-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: do replacement evenly

2009-06-02 Thread Mike Kazantsev
On Tue, 2 Jun 2009 19:10:18 +0800
oyster  wrote:

> I have some strings, and I want to write them into a text files, one
> string one line
> but there is a requirement: every line has a max length of a certain
> number(for example, 10), so I have to replace extra SPACE*3 with
> SPACE*2, at the same time, I want to make the string looks good, so,
> for "I am123456line123456three"(to show the SPACE clearly, I type it
> with a number), the first time, I replace the first SPACE, and get "I
> am23456line123456three", then I must replace at the second SPACE
> block, so I get  "I am23456line23456three", and so on, if no SPACE*3
> is found, I have to aString.replace(SPACE*2, SPACE).
> I hope I have stated my case clear.
> 
> Then the question is, is there a nice solution?

Not so nice, but it should be faster than a whole lot of string
manipulations, especially on longer lines:

  len_line = 55
  line = 'Thats  a whole line   of  some utter  nonsense ;)'

  words = line.split()
  count_space = len_line - len(''.join(words))
  count_span = len(words) - 1
  span_min = (count_space // count_span) * ' '
  count_span_max = count_space - (count_span * len(span_min))

  line = buffer(words[0])
  for word in words[1:]:
    if count_span_max:
      count_span_max -= 1
      line += span_min + ' '
    else: line += span_min
    line += word

  print '%d chars: %r'%(len(line), line)

-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: extract to dictionaries

2009-05-29 Thread Mike Kazantsev
On Thu, 28 May 2009 16:03:45 -0700 (PDT)
Marius Retegan  wrote:

> Hello
> I have simple text file that I have to parse. It looks something like
> this:
> 
> parameters1
>  key1 value1
>  key2 value2
> end
> 
> parameters2
>  key1 value1
>  key2 value2
> end
> 
> So I want to create two dictionaries parameters1={key1:value1,
> key2:value2} and the same for parameters2.


You can use iterators to efficiently parse a file of any size.
The following code relies on line breaks and the 'end' statement rather
than indentation.


  import itertools as it, operator as op, functools as ft
  from string import whitespace as spaces

  with open('test.src') as src:
    lines = it.ifilter(bool, it.imap(lambda x: x.strip(spaces), src))
    sections = ( (lines.next(), dict(it.imap(str.split, lines)))
      for sep,lines in it.groupby(lines, key=lambda x: x == 'end') if not sep )
    data = dict(sections)

  print data
  # { 'parameters2': {'key2': 'value2', 'key1': 'value1'},
  #  'parameters1': {'key2': 'value2', 'key1': 'value1'} }



To save namespace and make it a bit more unreadable you can write it
as a one-liner:

  with open('test.src') as src:
    data = dict(
      (lines.next(), dict(it.imap(str.split, lines)))
      for sep,lines in it.groupby(
        it.ifilter(bool, it.imap(lambda x: x.strip(spaces), src)),
        key=lambda x: x == 'end' ) if not sep )


-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Set a variable as in setter

2009-05-24 Thread Mike Kazantsev
On Sun, 24 May 2009 19:03:26 +0600
Mike Kazantsev  wrote:

> On Sun, 24 May 2009 05:06:13 -0700 (PDT)
> Kless  wrote:
> 
> > Is there any way to simplify the next code? Because I'm setting a
> > variable by default of the same way than it's set in the setter.
> > 
> > ---
> > class Foo(object):
> >def __init__(self, bar):
> >   self._bar = self._change(bar)  # !!! as setter
> 
> Guess it's obvious, but why not use "setattr(self, 'bar', bar)" here, in
> __init__ - it'll just call defined setter.

In fact, "self.bar = bar" is even simpler.
Somehow I thought it wouldn't work here, but it does.

> >@property
> >def bar(self):
> >   return self._bar
> > 
> >@bar.setter
> >def bar(self, bar):
> >   self._bar = self._change(bar)  # !!! as in init
> > 
> >def _change(self, text):
> >   return text + 'any change'
> > ---
> 


-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to get rid of pyc files ?

2009-05-24 Thread Mike Kazantsev
On Sun, 24 May 2009 15:01:51 +0200
Stef Mientki  wrote:

> Moving my entire program section between windows and Ubuntu,
> sometimes causes problems, due to the existence of pyc-files
> (and probably because my program still has hard coded paths).
> 
> Now I want get rid of the pyc-files,
> so I wrote a py-script to remove all pyc-files,
> but because it's run from the same program section,
> a few pyc files are recreated.
> 
> Is there a way to prevent generating pyc-files ?
> Or is there a way to redirect the generated pyc-files to a dedicated 
> location ?

Use a "-B" command-line option or "PYTHONDONTWRITEBYTECODE=x" env var.
You can put either "alias python='python -B'" or
"export PYTHONDONTWRITEBYTECODE=x" to your .bashrc/profile and forget
about .pyc/pyo forever.
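FWIW, later pythons (2.6+) also expose the same switch at runtime, in
case an env var or alias is inconvenient:

```python
import sys

# Equivalent of passing -B, but per-process and togglable at runtime;
# it only affects modules imported after it's set.
sys.dont_write_bytecode = True
print(sys.dont_write_bytecode)  # True
```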

> btw, What commandline switches are available for python ?
> (googling didn't give me any relevant hits )

You might be amazed how much insight "man python" and "python -h" can
yield ;)

-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Set a variable as in setter

2009-05-24 Thread Mike Kazantsev
On Sun, 24 May 2009 05:06:13 -0700 (PDT)
Kless  wrote:

> Is there any way to simplify the next code? Because I'm setting a
> variable by default of the same way than it's set in the setter.
> 
> ---
> class Foo(object):
>def __init__(self, bar):
>   self._bar = self._change(bar)  # !!! as setter

Guess it's obvious, but why not use "setattr(self, 'bar', bar)" here, in
__init__ - it'll just call defined setter.

>@property
>def bar(self):
>   return self._bar
> 
>@bar.setter
>def bar(self, bar):
>   self._bar = self._change(bar)  # !!! as in init
> 
>    def _change(self, text):
>   return text + 'any change'
> ---

-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: While Statement

2009-05-22 Thread Mike Kazantsev
On Fri, 22 May 2009 21:33:05 +1000
Joel Ross  wrote:

> changed it to "float(number)/total*100" and it worked thanks for all 
> your help appreciated

I believe the operator.truediv function also deserves a mention here,
since the line "op.truediv(number, total) * 100" somehow seems to make
more sense to me than an explicit conversion.
There's also "op.itruediv" for the in-place "number /= total" case.

http://docs.python.org/dev/library/operator.html
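For example:

```python
import operator as op

# True division regardless of operand types - no float() cast needed:
print(op.truediv(3, 4))        # 0.75
print(op.truediv(3, 4) * 100)  # 75.0
```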

-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to get Form values in Python code and Send Email

2009-05-20 Thread Mike Kazantsev
On Wed, 20 May 2009 17:49:47 +0530
Kalyan Chakravarthy  wrote:

> Hi
> Now I am able to get the form details into the python code,
> 
> can anyone tell me the format to send form values to one Email
> id ... for this do I require an SMTP setup?

You can use email and smtplib modules for that along with any SMTP
relay you have access to, even if it requires authentication - smtplib
has support for that.

The following example doesn't use authentication for SMTP, so you might
want to add it, or you can indeed set up some SMTP daemon on the local
machine (there are really simple ones like ssmtp or msmtp that can
relay mail to an auth-enabled host).


  import smtplib
  from email.MIMEMultipart import MIMEMultipart
  from email.MIMEBase import MIMEBase
  from email.MIMEText import MIMEText
  from email.Utils import COMMASPACE, formatdate
  from email import Encoders
  import os

  def send(to, subj, body, files=(), from_addr=None, relay='localhost'):
    # wrap a single address string into a tuple
    if isinstance(to, basestring): to = (to,)

    msg = MIMEMultipart()
    msg['From'] = from_addr
    msg['To'] = COMMASPACE.join(to)
    msg['Date'] = formatdate(localtime=True)
    msg['Subject'] = subj

    msg.attach(MIMEText(body))

    for file in files:
      part = MIMEBase('application', 'octet-stream')
      part.set_payload(open(file, 'rb').read())
      Encoders.encode_base64(part)
      part.add_header( 'Content-Disposition',
        'attachment; filename="%s"' % os.path.basename(file) )
      msg.attach(part)

    smtp = smtplib.SMTP(relay)
    smtp.sendmail(from_addr, to, msg.as_string())
    smtp.close()



-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Import and absolute file names, sys.path including ''... or not

2009-05-20 Thread Mike Kazantsev
On Wed, 20 May 2009 22:01:50 +0200
Jean-Michel Pichavant  wrote:

> You are right, but my concern is not the relative path resolution. Let 
> me clarify:
> 
> /home/jeanmichel/test.py:
> "import sys
> print sys.path"
> 
>  >python.exe test.py
> sys.path = ['/home/jeanmichel']
>  > from within a python shell:
> sys.path = ['']
> 
> The unpredictable effect of '' (at least something I did not predict) is 
> that it allows absolute path resolution, while '/home/jeanmichel' cannot.
> Example :
> write a anotherTest.py file:
> "
> __import__('/home/jeanmichel/test')
> "

It works for me with py2.6; what version do you have?

> anotherTest.py will be successfully imported in a python shell ('' + 
> '/home/jeanmichel/test.py' is a valid path), but the "python.exe 
> anotherTest2.py"  form will fail as it will try for '/home/jeanmichel' 
> +'/home/jeanmichel/test.py' which is not a valid path.

I believe python uses the os.path.join algorithm to combine paths,
which discards everything before a component that is itself absolute:
  os.path.join('/some/path', '/home/jeanmichel') == '/home/jeanmichel'
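A quick check of that behaviour:

```python
import os.path

# A later absolute component discards everything before it:
print(os.path.join('/some/path', '/home/jeanmichel'))  # /home/jeanmichel
# A relative component gets appended as usual:
print(os.path.join('/some/path', 'test.py'))           # /some/path/test.py
```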

> So my question is: "why the shell is adding '' when the interpreter is 
> adding the full path ?"

Looks like a solid way to construct relative imports to me.

-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: SpellChecker

2009-05-20 Thread Mike Kazantsev
abosalim wrote:
> I used this code.It works fine,but on word not whole text.I want to
> extend this code to correct
> text file not only a word,but i don't know.If you have any help,please
> inform me.
...
> def correct(word):
> candidates = known([word]) or known(edits1(word)) or known_edits2
> (word) or [word]
> return max(candidates, key=lambda w: NWORDS[w])

Here I assume that "word" is any string consisting of letters; feel free
to add your own check in place of str.isalpha, like word length or case.
Note that simple ops like concatenation work much faster with buffers
than str / unicode.

  text = 'some text to correct (anything, really)'
  result = buffer('')

  word = buffer('')
  for c in text:
    if c.isalpha(): word += c
    else:
      if word:
        result += correct(word)
        word = buffer('')
      result += c
  if word: result += correct(word) # don't drop a trailing word

-- 
Mike Kazantsev // fraggod.net



signature.asc
Description: OpenPGP digital signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Yet another question about class property.

2009-05-20 Thread Mike Kazantsev
Jim Qiu wrote:
> Hi everyone,
> 
> Following is the code i am reading, i don't see anywhere the declaration of
> Message.root object,
> Where is it from?
...

Perhaps it gets assigned by the parent itself?
Like this:

  def spawn_child(self):
child = Message()
child.root = self

-- 
Mike Kazantsev // fraggod.net



signature.asc
Description: OpenPGP digital signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: optparse options

2009-05-20 Thread Mike Kazantsev
Ben Finney wrote:
> icarus  writes:
> 
>>  parser = optparse.OptionParser(usage="%prog [-p dir] [--part=dir] ",
>> version="%prog 1.0")
>>
>>  parser.add_option( "-p", "--part", dest="directory",
>>help="process target directory", metavar="dir")
>>  (options, args) = parser.parse_args()

...

>>  if len(args) != 1:
>>  parser.error("No options specified")
> 
> The message is confusing, since it doesn't match the condition; it would
> be correct to say “Did not specify exactly one non-option argument”.
> 
> In this case, it looks like you don't want to check this at all, and
> should instead operate on the basis of the options only.

I also wanted to note that it looks quite illogical and
counter-intuitive to create "required options", since by definition
they should be optional.
Try using positional arguments instead, with some type-switching flags
if necessary - it should make the CLI more consistent and save some
typing by omitting the otherwise always-required option flag ("--part").

-- 
Mike Kazantsev // fraggod.net



signature.asc
Description: OpenPGP digital signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to convert a list of strings to a tuple of floats?

2009-05-18 Thread Mike Kazantsev
On Mon, 18 May 2009 00:51:43 -0700 (PDT)
"boblat...@googlemail.com"  wrote:

> this is the conversion I'm looking for:
> 
> ['1.1', '2.2', '3.3'] -> (1.1, 2.2, 3.3)

Since itertools are useful in nearly every module and probably are
imported already...

  import itertools as it
  ftuple = tuple(it.imap( float, ['1.1', '2.2', '3.3'] ))
  # ftuple == (1.1, 2.2, 3.3)

-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pushback iterator

2009-05-17 Thread Mike Kazantsev
Somehow, I got the message off the list.

On Sun, 17 May 2009 17:42:43 +0200
Matus  wrote:

> > Sounds to me more like an iterator with a cache - you can't really pull
> > the line from a real iterable like generator function and then just push
> > it back.
> 
> true, that is why you have to implement this iterator wrapper

I fail to see much point in such a dumb cache - in most cases you
shouldn't iterate again and again thru the same sequence, so what good
will hardcoding (and thus encouraging) such a thing do?

Besides, this wrapper breaks iteration order, since its cache is LIFO
instead of FIFO, which should rather be implemented with a deque
instead of a list.
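A rough deque-based sketch of what I mean (FIFO buffer, so multiple
items pushed back come out in the order they were pushed; py3 spelling
with a py2 alias):

```python
import collections

class Pushback(object):
    'Iterator wrapper with a FIFO pushback buffer.'
    def __init__(self, iterable):
        self._it, self._buf = iter(iterable), collections.deque()
    def __iter__(self):
        return self
    def __next__(self):
        # Drain pushed-back items first, in pushback order
        return self._buf.popleft() if self._buf else next(self._it)
    next = __next__  # py2 compatibility
    def pushback(self, item):
        self._buf.append(item)

src = Pushback([1, 2, 3])
print(next(src))   # 1
src.pushback(1)    # didn't want it after all
print(next(src))   # 1 again
print(next(src))   # 2
```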

> > If this "iterator" is really a list then you can use it as such w/o
> > unnecessary in-out operations.
> 
> of course, it is not a list. you can wrap 'real' iterator using this
> wrapper (), and voila, you can use pushback method to 'push back' item
> received by next method. by calling next again, you will get pushed back
> item again, that is actually the point.

The wrapper differs from "list(iterator)" in only one thing: it might
not make it to the end of the iterable, but if "pushing back" is a
common operation, there's a good chance you'll make it to the end of
the iterator during execution anyway, dragging the whole thing along as
a burden each time.

> > And if you're "pushing back" the data for later use you might just as
> > well push it to dict with the right indexing, so the next "pop" won't
> > have to roam thru all the values again but instantly get the right one
> > from the cache, or just get on with that iterable until it depletes.
> > 
> > What real-world scenario am I missing here?
> > 
> 
> ok, I admit that that the file was not good example. better example
> would be just any iterator you use in your code.

Somehow I've always managed to avoid such re-iteration scenarios, but
of course, it could be just my luck ;)

-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pushback iterator

2009-05-17 Thread Mike Kazantsev
On Sun, 17 May 2009 16:39:38 +0200
Matus  wrote:

> I searches web and python documentation for implementation of pushback
> iterator but found none in stdlib.
> 
> problem:
> 
> when you parse a file, often you have to read a line from the parsed file
> before you can decide if you want that line or not. if not, it would
> be a nice feature to be able to push the line back into the iterator, so
> next time when you pull from the iterator you get this 'unused' line.
>  
...
> 
> proposal:
> =
> as this is (as I suppose) common problem, would it be possible to extend
> the stdlib of python (ie itertools module) with a similar solution so
> one do not have to reinvent the wheel every time pushback is needed?  

Sounds to me more like an iterator with a cache - you can't really pull
a line from a real iterable like a generator function and then just
push it back.
If this "iterator" is really a list then you can use it as such w/o
unnecessary in-out operations.

And if you're "pushing back" the data for later use you might just as
well push it to dict with the right indexing, so the next "pop" won't
have to roam thru all the values again but instantly get the right one
from the cache, or just get on with that iterable until it depletes.

What real-world scenario am I missing here?

-- 
Mike Kazantsev // fraggod.net


signature.asc
Description: PGP signature
-- 
http://mail.python.org/mailman/listinfo/python-list