RE: [pylucene-dev] simple threading segfault demo

Ofer Nave Mon, 12 Mar 2007 21:23:50 -0800

Thanks, that worked!

Unfortunately, due to Python threads being user-level threads, it seems my
script doesn't see any performance gains from the second CPU (I/O isn't much
of a bottleneck since I have enough spare RAM on the box to cache the entire
filesystem).


I created another version using popen2's Popen4 class which worked (two
processes indexed all my data in half the time that the previous
single-process version needed), so I'll probably keep using that.  (I tried
to use the subprocess module, as per the recommendation in the Python
Library Reference, but that module seems to be absent from my install, and
my lame attempt at stealing a copy from a source package and dropping it in
the library path only yielded syntax errors.)

-ofer 

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Norbert Wojtowicz
> Sent: Monday, March 12, 2007 8:31 PM
> To: [email protected]
> Subject: Re: [pylucene-dev] simple threading segfault demo
> 
> Hello,
> 
> I'm sure someone can give you more detailed advice, but the 
> general rule with PyLucene and threading is you need to use 
> PyLucene.PythonThread wherever you would normally use a 
> python thread. It's a small wrapper for python's thread that 
> fixes some issues with gcj and the garbage collector. I'm 
> sure someone can explain that better, but I've learned this 
> is the golden rule when working with PyLucene and threads.
> 
> Cheers,
> Norbert
> 
> On Mon, 2007-03-12 at 20:15 -0700, Ofer Nave wrote:
> > Hello.
> > 
> > I wanted to try splitting my index up into two slices and indexing 
> > each in separate threads to see if it would run faster on a 
> dual-proc 
> > box, but my script began segfaulting as soon as threading 
> was added.  
> > This is the first time I've ever used threads in Python, so 
> I might be 
> > doing something obviously stupid.
> > 
> > Anyway, I pared down the script to a minimal test case that still 
> > yields a segfault.  Here is the code:
> > 
> > ---
> > #!/usr/bin/python
> > import os
> > import sys
> > import threading
> > 
> > import PyLucene
> > 
> > class Indexer(object):
> >     def __init__(self, index_dir):
> >         self.index_dir = index_dir
> >         if not os.path.exists(index_dir):
> >             os.mkdir(index_dir)
> > 
> >     def run(self):
> >         worker1 = Worker(self.index_dir + '/1', 1)
> >         worker2 = Worker(self.index_dir + '/2', 2)
> >         worker1.start()
> >         worker2.start()
> >         while (worker1.isAlive() or worker2.isAlive()):
> >             pass
> > 
> > class Worker(threading.Thread):
> >     def __init__(self, index_dir, worker_id):
> >         threading.Thread.__init__(self)
> >         self.index_dir = index_dir
> >         self.worker_id = worker_id
> >         if not os.path.exists(index_dir):
> >             os.mkdir(index_dir)
> > 
> >     def run(self):
> >         print 'woo hoo: ' + self.index_dir
> >         self.store = 
> PyLucene.FSDirectory.getDirectory(self.index_dir, True)
> >         self.store.close()
> > 
> > if __name__ == '__main__':
> >     if len(sys.argv) < 2:
> >         print "Usage: python " + __file__ + " <index_dir>"
> >         sys.exit(1)
> >     print 'PyLucene', PyLucene.VERSION, 'Lucene', 
> PyLucene.LUCENE_VERSION
> >     indexer = Indexer(sys.argv[1])
> >     indexer.run()
> > ---
> > 
> > The output is as follows:
> > 
> > [EMAIL PROTECTED] ~/bin]$ lucene_segfault_demo /tmp PyLucene 
> 2.1.0-1 Lucene 
> > 2.1.0-509013 woo hoo: /tmp/1 Segmentation fault
> > 
> > Any ideas?
> > 
> > -ofer
> > 
> > _______________________________________________
> > pylucene-dev mailing list
> > [email protected]
> > http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
> 
> _______________________________________________
> pylucene-dev mailing list
> [email protected]
> http://lists.osafoundation.org/mailman/listinfo/pylucene-dev

_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev

RE: [pylucene-dev] simple threading segfault demo

Reply via email to