Thank you very much, Andi.

Best,
roman

On Tue, Aug 24, 2010 at 5:36 PM, Andi Vajda <va...@apache.org> wrote:
>
> On Aug 24, 2010, at 8:03, Roman Chyla <roman.ch...@gmail.com> wrote:
>
>> I am trying to understand PyLucene better and to see whether it is faster
>> to retrieve result ids with Java instead of with Python. The use case is
>> to retrieve millions of recids: with Python, 700K ids take about
>> 1.5s (even though the query itself takes just a fraction of that).
>>
>> I wrote some simple Java code (it works in Java) which returns an array of
>> ints. I have wrapped it with jcc and it is visible from inside Python,
>> but calling the static method throws InvalidArgsError (below is an
>> example Python session).
>>
>> JCC is version 2.4, built in shared mode. The DistUtils is in a
>> different package than lucene (i.e. not inside the lucene jars). Can this
>> problem be similar to passing jcc-wrapped objects between different
>> jcc packages? http://search-lucene.com/m/SPgeW1hDtAw1
>>
>> The Java class is very simple:
>>
>> import org.apache.lucene.search.ScoreDoc;
>> import org.apache.lucene.search.TopDocs;
>>
>> public class DumpUtils {
>>     public static int[] GetDocIds(TopDocs topdocs) {
>>         ScoreDoc[] hits = topdocs.scoreDocs;
>>         // scoreDocs holds at most the requested number of hits, which can
>>         // be fewer than totalHits, so size the array by hits.length.
>>         int[] out = new int[hits.length];
>>         for (int i = 0; i < hits.length; i++) {
>>             out[i] = hits[i].doc;
>>         }
>>         return out;
>>     }
>> }
>>
>> Thanks for any help/pointers,
>
> Ah yes, importing separately built extensions that share classes (or
> dependencies) didn't work until support for the --import parameter was added
> in jcc 2.6 to solve the problem of incompatible shared classes. To make this
> work:
>  - first, build PyLucene as usual, with --shared
>  - then, build your DistUtils package with --import lucene and with --shared
>
> That way, instead of generating code and wrapper classes again for the
> lucene classes, jcc will import them at build time thus making a much
> smaller library and faster build. The resulting shared library is linked
> against the lucene one.
>
> See docs and list archives about --import for more examples. Then, when
> running all this, you should also import lucene first, then your other
> package.
>
> Andi..
>
>>
>> roman
>>
>>
>> Here is an example python session:
>>
>> In [1]: import pyjama
>>
>> In [2]: pyjama.initVM(pyjama.CLASSPATH)
>> Out[2]: <jcc.JCCEnv object at 0x00C0E1F0>
>>
>> In [3]: import lucene as lu
>>
>> In [4]: pyjama.DumpUtils
>> Out[4]: <type 'DumpUtils'>
>>
>> In [5]: pyjama.DumpUtils.GetDocIds
>> Out[5]: <built-in method GetDocIds of type object at 0x0189E780>
>>
>> In [6]:
>>
>> In [7]: import newseman.pyjamic.slucene.searcher as se
>>
>> In [8]: s = se.Searcher();s.open('/tmp/whisper/')
>>
>> In [9]: hits = s._search(s._query('key:bo*',None), 50)
>>
>> In [10]: hits
>> Out[10]: <TopDocs: org.apache.lucene.search.topd...@480457>
>>
>> In [11]:
>>
>> In [12]: pyjama.DumpUtils.GetDocIds(hits)
>>
>> ---------------------------------------------------------------------------
>> InvalidArgsError                          Traceback (most recent call last)
>>
>> InvalidArgsError: (<type 'DumpUtils'>, 'GetDocIds', <TopDocs:
>> org.apache.lucene.search.topd...@480457>)
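
For reference, the second of Andi's two build steps quoted above (building the extension with --shared and --import lucene) might be scripted roughly as follows. This is only a sketch under assumptions: the jar name pyjama.jar, the optional classpath entries and the helper name jcc_extension_command are hypothetical, and the exact set of jcc options a real build needs (for example --classpath or --package entries) depends on the project. Older Python versions may also need "-m jcc.__main__" instead of "-m jcc". The helper only assembles the command line; it does not run jcc.

```python
import sys


def jcc_extension_command(jar, module, imports=("lucene",), classpath=()):
    """Assemble a jcc command line for an extension that reuses existing wrappers.

    jar       -- jar holding the extra Java classes (hypothetical name in the demo)
    module    -- name of the Python extension module to generate
    imports   -- already-built jcc modules whose wrappers should be imported
    classpath -- extra classpath entries; whether any are needed is an assumption
    """
    # --shared: build against the shared jcc runtime, as PyLucene was built.
    cmd = [sys.executable, "-m", "jcc", "--shared"]
    # --import: reuse the wrappers of an existing module instead of regenerating them.
    for name in imports:
        cmd += ["--import", name]
    for entry in classpath:
        cmd += ["--classpath", entry]
    cmd += ["--jar", jar, "--python", module, "--build", "--install"]
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it, so it can be checked first.
    print(" ".join(jcc_extension_command("pyjama.jar", "pyjama")))
```

At run time, per Andi's note, lucene would be imported before the extension module (import lucene, then import pyjama).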