Hello,

Someone else can probably give you more detailed advice, but the general rule
with PyLucene and threading is to use PyLucene.PythonThread wherever you
would normally use a Python thread (threading.Thread). It's a small wrapper
around Python's thread class that fixes some issues with gcj and the garbage
collector. I can't explain the internals well, but I've learned that this is
the golden rule when working with PyLucene and threads. In your script, that
means deriving Worker from PyLucene.PythonThread instead of threading.Thread.
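
Here is a rough sketch of what that change could look like for your test
case. It is untested against your setup and rests on a few assumptions:
that PythonThread can be subclassed the same way as threading.Thread, that
the ThreadBase alias and the run_workers helper are names I made up for
illustration, and that the fallback to threading.Thread is there only so
the sketch runs on a machine without PyLucene. The Lucene call itself is
left as a comment.

```python
import os
import tempfile
import threading

try:
    # gcj-based PyLucene: the thread wrapper recommended above.
    from PyLucene import PythonThread as ThreadBase
except ImportError:
    # Assumption for illustration only: fall back to a plain Python thread
    # so this sketch still runs where PyLucene is not installed.
    ThreadBase = threading.Thread


class Worker(ThreadBase):
    def __init__(self, index_dir, worker_id):
        ThreadBase.__init__(self)
        self.index_dir = index_dir
        self.worker_id = worker_id
        self.done = False

    def run(self):
        if not os.path.exists(self.index_dir):
            os.makedirs(self.index_dir)
        # With PyLucene installed, the Lucene work would go here, e.g.
        # PyLucene.FSDirectory.getDirectory(self.index_dir, True)
        self.done = True


def run_workers(base_dir, count=2):
    """Start `count` workers, one sub-directory each, and wait for all."""
    workers = [Worker(os.path.join(base_dir, str(i)), i)
               for i in range(1, count + 1)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()  # blocks until the worker finishes; no busy-wait loop
    return workers


if __name__ == '__main__':
    finished = run_workers(tempfile.mkdtemp())
    print('workers finished: %d' % len(finished))
```

Using join() instead of polling isAlive() in a loop is a separate, optional
change: it keeps the main thread from spinning at 100% CPU while the
workers run.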

Cheers,
Norbert

On Mon, 2007-03-12 at 20:15 -0700, Ofer Nave wrote:
> Hello.
> 
> I wanted to try splitting my index up into two slices and indexing each in
> separate threads to see if it would run faster on a dual-proc box, but my
> script began segfaulting as soon as threading was added.  This is the first
> time I've ever used threads in Python, so I might be doing something
> obviously stupid.
> 
> Anyway, I pared down the script to a minimal test case that still yields a
> segfault.  Here is the code:
> 
> ---
> #!/usr/bin/python
> import os
> import sys
> import threading
> 
> import PyLucene
> 
> class Indexer(object):
>     def __init__(self, index_dir):
>         self.index_dir = index_dir
>         if not os.path.exists(index_dir):
>             os.mkdir(index_dir)
> 
>     def run(self):
>         worker1 = Worker(self.index_dir + '/1', 1)
>         worker2 = Worker(self.index_dir + '/2', 2)
>         worker1.start()
>         worker2.start()
>         while (worker1.isAlive() or worker2.isAlive()):
>             pass
> 
> class Worker(threading.Thread):
>     def __init__(self, index_dir, worker_id):
>         threading.Thread.__init__(self)
>         self.index_dir = index_dir
>         self.worker_id = worker_id
>         if not os.path.exists(index_dir):
>             os.mkdir(index_dir)
> 
>     def run(self):
>         print 'woo hoo: ' + self.index_dir
>         self.store = PyLucene.FSDirectory.getDirectory(self.index_dir, True)
>         self.store.close()
> 
> if __name__ == '__main__':
>     if len(sys.argv) < 2:
>         print "Usage: python " + __file__ + " <index_dir>"
>         sys.exit(1)
>     print 'PyLucene', PyLucene.VERSION, 'Lucene', PyLucene.LUCENE_VERSION
>     indexer = Indexer(sys.argv[1])
>     indexer.run()
> ---
> 
> The output is as follows:
> 
> [EMAIL PROTECTED] ~/bin]$ lucene_segfault_demo /tmp
> PyLucene 2.1.0-1 Lucene 2.1.0-509013
> woo hoo: /tmp/1
> Segmentation fault
> 
> Any ideas?
> 
> -ofer
> 
> _______________________________________________
> pylucene-dev mailing list
> [email protected]
> http://lists.osafoundation.org/mailman/listinfo/pylucene-dev

