Re: About JCC 2.4 Source tar.gz @ pypi

2009-10-28 Thread Andi Vajda


On Thu, 29 Oct 2009, Cheng-Lung Sung wrote:


Hi,

   The tarball seems to lack Type.h/TypeVariable.h. Can you fix it?


Why do you think these files are missing?
Do you see these classes used anywhere in JCC 2.4?

Andi..

ps: please contact pylucene-dev@lucene.apache.org, the dev list for PyLucene
and JCC, instead of me directly. thanks !
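
One way to settle whether a source tarball really lacks a header is to list its members. The following is a minimal sketch using only Python's standard tarfile module; the helper names and the demo archive layout are made up for illustration and do not describe the actual contents of the JCC 2.4 tarball.

```python
import io
import tarfile


def missing_members(tar_bytes, wanted):
    """Return the names in `wanted` that match no member's basename."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as tar:
        present = {name.rsplit("/", 1)[-1] for name in tar.getnames()}
    return sorted(set(wanted) - present)


def make_demo_tarball(names):
    """Build a tiny in-memory .tar.gz so the sketch runs without a download."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in names:
            data = b"// placeholder\n"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Hypothetical layout, for demonstration only.
demo = make_demo_tarball(["demo-1.0/sources/functions.h", "demo-1.0/setup.py"])
print(missing_members(demo, ["functions.h", "Type.h", "TypeVariable.h"]))
```

With a real download, the bytes would come from reading the .tar.gz file instead of make_demo_tarball().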


Re: Pylucene and JCC 2.4.1

2009-10-28 Thread Andi Vajda


On Wed, 28 Oct 2009, Andi Vajda wrote:


On Oct 28, 2009, at 2:45, Manolo Padron Martinez  wrote:


What is the version of your gcc ?
I did the same build today on Ubuntu Gutsy 64 bits without any problem.


gcc (Debian 4.3.2-1.1) 4.3.2
g++ (Debian 4.3.2-1.1) 4.3.2


Here are a few things you could try, in no particular order:


So I installed Debian 5 (Lenny) myself onto a virtual machine and built JCC
and PyLucene 2.9.1 from the trunk of the 2.9 branch.


- gcc 4.2


Using gcc 4.2 worked fine. I had to ensure the /usr/bin/g++ and /usr/bin/gcc 
links point at /usr/bin/g++-4.2 and /usr/bin/gcc-4.2 respectively.


I then moved the links to the 4.3 versions and rebuilt JCC and PyLucene 
again. The build succeeded as well. In other words, the problem didn't 
reproduce.


In both cases, the tests passed (make test).
I guess, I can't reproduce the problem.

I noticed that both compilers use large amounts of memory when compiling
these large C++ files. Maybe you don't have enough memory on your system. I
gave mine 512 MB and it swapped like mad. I then gave it 1 GB of RAM and
the builds completed fine.


If that's indeed the problem, get JCC to generate smaller but more numerous
files by increasing NUM_FILES. I'm just guessing here as to what might be the
problem on your side.
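
To illustrate why a larger NUM_FILES means less work per compiler invocation, here is a rough sketch that spreads a list of wrapper classes over a given number of output files. It is not JCC's actual splitting logic; the function name and the class names are hypothetical.

```python
def split_round_robin(items, num_files):
    """Distribute items over num_files buckets, one item at a time."""
    if num_files < 1:
        raise ValueError("num_files must be at least 1")
    buckets = [[] for _ in range(num_files)]
    for index, item in enumerate(items):
        buckets[index % num_files].append(item)
    return buckets


# Hypothetical wrapper class names standing in for the generated code.
wrappers = ["Class%03d" % i for i in range(10)]
for n in (2, 5):
    sizes = [len(bucket) for bucket in split_round_robin(wrappers, n)]
    print(n, "files ->", sizes)
```

Each bucket stands for one generated C++ file, so more buckets means fewer classes, and presumably less compiler memory, per file.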


Andi..


Re: which Snowball stemmers are in PyLucene?

2009-10-28 Thread Andi Vajda


On Wed, 28 Oct 2009, Marvin Humphrey wrote:


On Wed, Oct 28, 2009 at 12:20:55PM -0700, Andi Vajda wrote:


There may be an API in the Snowball library to do this enumeration.


There's this, from libstemmer.h:

   /** Returns an array of the names of the available stemming algorithms.
    *  Note that these are the canonical names - aliases (ie, other names for
    *  the same algorithm) will not be included in the list.
    *  The list is terminated with a null pointer.
    *
    *  The list must not be modified in any way.
    */
   const char ** sb_stemmer_list(void);


Sorry, I should have said "in the java library".

Andi..


Re: which Snowball stemmers are in PyLucene?

2009-10-28 Thread Marvin Humphrey
On Wed, Oct 28, 2009 at 12:20:55PM -0700, Andi Vajda wrote:

> There may be an API in the Snowball library to do this enumeration. 

There's this, from libstemmer.h:

/** Returns an array of the names of the available stemming algorithms.
 *  Note that these are the canonical names - aliases (ie, other names for
 *  the same algorithm) will not be included in the list.
 *  The list is terminated with a null pointer.
 *
 *  The list must not be modified in any way.
 */
const char ** sb_stemmer_list(void);

Cheers,

Marvin Humphrey



Re: which Snowball stemmers are in PyLucene?

2009-10-28 Thread Andi Vajda


On Oct 28, 2009, at 12:09, Bill Janssen  wrote:


Andi Vajda  wrote:


The snowball JAR comes from this statement in the Makefile:
SNOWBALL_JAR=$(LUCENE)/build/contrib/snowball/lucene-snowball-$(LUCENE_VER).jar


Which means that it's whatever corresponds to the Lucene version
checked out. For PyLucene 2.9.0, that is:
  http://svn.apache.org/repos/asf/lucene/java/tags/lucene_2_9_0

In other words, this is a question best asked on the
java-u...@lucene.apache.org mailing list as PyLucene doesn't do
anything different (at least intentionally).


I've looked through that set of APIs, and don't see anything useful.
This was more of a brainstorming question for the list...

What could we do in Python to enumerate the list?


import lucene
lucene.initVM(classpath=lucene.CLASSPATH)
for n,v in lucene.__dict__.items():
    if n.endswith("Stemmer"):
        print n, lucene.SnowballProgram.instance_(v)


That is checking if a class is an instance of SnowballProgram which is  
probably not what you want. Use isAssignableFrom() maybe ?
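
The difference between the two checks can be shown with a plain-Python analogy. This is only a sketch with stand-in classes, not the real lucene wrappers, and it assumes that instance_() behaves like an instance check on the object it is given, as described above.

```python
class SnowballProgram(object):
    """Stand-in base class; not the real lucene.SnowballProgram."""


class EnglishStemmer(SnowballProgram):
    """Stand-in subclass."""


# Is the class object itself an instance of the base? No.
print(isinstance(EnglishStemmer, SnowballProgram))
# Is an object created from the class an instance of the base? Yes.
print(isinstance(EnglishStemmer(), SnowballProgram))
# Does the class derive from the base? Yes.
print(issubclass(EnglishStemmer, SnowballProgram))
```

The first check corresponds to the loop that printed False for every stemmer: it was handed classes, not instances.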


There may be an API in the Snowball library to do this enumeration. I  
don't know and that's why I suggested asking java-user. Nothing wrong  
with brainstorming here, of course.


Andi..




ItalianStemmer False
FrenchStemmer False
HungarianStemmer False
LovinsStemmer False
RussianStemmer False
FinnishStemmer False
PortugueseStemmer False
KpStemmer False
BrazilianStemmer False
DanishStemmer False
TurkishStemmer False
DutchStemmer False
SwedishStemmer False
German2Stemmer False
EnglishStemmer False
GermanStemmer False
RomanianStemmer False
PorterStemmer False
NorwegianStemmer False
SpanishStemmer False

Seems to me that this should give different results.  Am I using the JCC
"instance_" method improperly?

Bill


Re: which Snowball stemmers are in PyLucene?

2009-10-28 Thread Bill Janssen
Sorry, my mistake.  Should be:

for n,v in lucene.__dict__.items():
    if n.endswith("Stemmer") and issubclass(v, lucene.SnowballProgram):
        print n[:-len("Stemmer")]

That produces the list.
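
The same idea can be wrapped in a reusable function. The sketch below is written for Python 3 and runs against a fake module so that it needs neither PyLucene nor a JVM; stemmer_names() and the fake module contents are hypothetical. The extra isinstance(value, type) guard is an assumption added so that non-class attributes are skipped instead of making issubclass() raise a TypeError.

```python
import types


class SnowballProgram(object):
    """Stand-in for lucene.SnowballProgram."""


def stemmer_names(namespace, base):
    """Return sorted language names of *Stemmer classes deriving from base."""
    suffix = "Stemmer"
    return sorted(
        name[:-len(suffix)]
        for name, value in namespace.items()
        if name.endswith(suffix)
        and isinstance(value, type)
        and issubclass(value, base)
    )


# Hypothetical module contents standing in for lucene.__dict__.
fake = types.ModuleType("fake_lucene")
fake.SnowballProgram = SnowballProgram
fake.EnglishStemmer = type("EnglishStemmer", (SnowballProgram,), {})
fake.DutchStemmer = type("DutchStemmer", (SnowballProgram,), {})
fake.UnrelatedStemmer = type("UnrelatedStemmer", (object,), {})
fake.NotAClassStemmer = "just a string"

print(stemmer_names(vars(fake), SnowballProgram))
```

With PyLucene installed, the call would presumably be stemmer_names(vars(lucene), lucene.SnowballProgram) after initVM().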

Bill


Re: which Snowball stemmers are in PyLucene?

2009-10-28 Thread Bill Janssen
Andi Vajda  wrote:

> The snowball JAR comes from this statement in the Makefile:
> SNOWBALL_JAR=$(LUCENE)/build/contrib/snowball/lucene-snowball-$(LUCENE_VER).jar
> 
> Which means that it's whatever corresponds to the Lucene version
> checked out. For PyLucene 2.9.0, that is:
>   http://svn.apache.org/repos/asf/lucene/java/tags/lucene_2_9_0
> 
> In other words, this is a question best asked on the
> java-u...@lucene.apache.org mailing list as PyLucene doesn't do
> anything different (at least intentionally).

I've looked through that set of APIs, and don't see anything useful.
This was more of a brainstorming question for the list...

What could we do in Python to enumerate the list?

  >>> import lucene
  >>> lucene.initVM(classpath=lucene.CLASSPATH)
  >>> for n,v in lucene.__dict__.items():
  ...     if n.endswith("Stemmer"):
  ...         print n, lucene.SnowballProgram.instance_(v)
  ... 

ItalianStemmer False
FrenchStemmer False
HungarianStemmer False
LovinsStemmer False
RussianStemmer False
FinnishStemmer False
PortugueseStemmer False
KpStemmer False
BrazilianStemmer False
DanishStemmer False
TurkishStemmer False
DutchStemmer False
SwedishStemmer False
German2Stemmer False
EnglishStemmer False
GermanStemmer False
RomanianStemmer False
PorterStemmer False
NorwegianStemmer False
SpanishStemmer False

Seems to me that this should give different results.  Am I using the JCC
"instance_" method improperly?

Bill


Re: which Snowball stemmers are in PyLucene?

2009-10-28 Thread Andi Vajda


On Wed, 28 Oct 2009, Bill Janssen wrote:


Is there a programmatic way to figure out whether the Snowball stemmer
for a particular language X is supported in a particular installation of
PyLucene?


The snowball JAR comes from this statement in the Makefile:
SNOWBALL_JAR=$(LUCENE)/build/contrib/snowball/lucene-snowball-$(LUCENE_VER).jar

Which means that it's whatever corresponds to the Lucene version checked 
out. For PyLucene 2.9.0, that is:

   http://svn.apache.org/repos/asf/lucene/java/tags/lucene_2_9_0

In other words, this is a question best asked on the 
java-u...@lucene.apache.org mailing list as PyLucene doesn't do anything 
different (at least intentionally).
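
Since the stemmers are whatever classes ended up in that JAR, one rough way to see what a given build contains is to list the archive itself. A JAR is a zip file, so the standard zipfile module is enough. This is only a sketch: the function name is hypothetical and the demo archive uses made-up entry paths, not the real layout of the lucene-snowball JAR.

```python
import io
import zipfile


def stemmer_classes_in_jar(jar_file):
    """Return sorted class names ending in 'Stemmer' found in a JAR."""
    suffix = "Stemmer.class"
    with zipfile.ZipFile(jar_file) as jar:
        return sorted(
            name.rsplit("/", 1)[-1][:-len(".class")]
            for name in jar.namelist()
            if name.endswith(suffix)
        )


# Build a tiny in-memory archive with made-up entries so the sketch runs.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as jar:
    jar.writestr("example/ext/EnglishStemmer.class", b"")
    jar.writestr("example/ext/FrenchStemmer.class", b"")
    jar.writestr("example/SnowballProgram.class", b"")
buf.seek(0)

print(stemmer_classes_in_jar(buf))
```

For a real installation, the argument would be the path to the JAR that SNOWBALL_JAR points at.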


Andi..


which Snowball stemmers are in PyLucene?

2009-10-28 Thread Bill Janssen
Is there a programmatic way to figure out whether the Snowball stemmer
for a particular language X is supported in a particular installation of
PyLucene?

Bill


Re: Pylucene and JCC 2.4.1

2009-10-28 Thread Andi Vajda


On Oct 28, 2009, at 2:45, Manolo Padron Martinez wrote:



What is the version of your gcc ?
I did the same build today on Ubuntu Gutsy 64 bits without any problem.


gcc (Debian 4.3.2-1.1) 4.3.2
g++ (Debian 4.3.2-1.1) 4.3.2


Here are a few things you could try, in no particular order:

  - gcc 4.2

  - use JCC from svn trunk (it's got support for java generics and
    the code that didn't compile for you has changed)

  - increase NUM_FILES in PyLucene's Makefile so that you compile
    smaller files.


Let me know what works, if anything.

Thanks !

Andi..



Regards from Canary Islands

Manuel Padrón Martínez


Re: Pylucene and JCC 2.4.1

2009-10-28 Thread Manolo Padron Martinez
> What is the version of your gcc ?
> I did the same build today on Ubuntu Gutsy 64 bits without any problem.

gcc (Debian 4.3.2-1.1) 4.3.2
g++ (Debian 4.3.2-1.1) 4.3.2
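
When comparing compiler versions across machines, the version line above can be turned into a tuple. The following is a small sketch; parse_gcc_version() is a hypothetical helper, and it assumes the version number is the last token on the line, which holds for the two lines above but not for every gcc build.

```python
import re


def parse_gcc_version(first_line):
    """Extract (major, minor, patch) from the end of a version line."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)\s*$", first_line)
    if match is None:
        raise ValueError("no version number found in %r" % first_line)
    return tuple(int(part) for part in match.groups())


print(parse_gcc_version("gcc (Debian 4.3.2-1.1) 4.3.2"))
print(parse_gcc_version("g++ (Debian 4.3.2-1.1) 4.3.2") >= (4, 3, 0))
```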

Regards from Canary Islands

Manuel Padrón Martínez


Re: Pylucene and JCC 2.4.1

2009-10-28 Thread Andi Vajda


On Oct 28, 2009, at 2:04, Manolo Padron Martinez wrote:



Hi:

After compiling JCC 2.4.1 successfully, I tried to compile PyLucene and
got the following error:


What is the version of your gcc ?
I did the same build today on Ubuntu Gutsy 64 bits without any problem.

Andi..



gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall
-Wstrict-prototypes -fPIC -DPYTHON=1 -D_jcc_shared=1
-I/usr/lib/jvm/java-6-openjdk/include
-I/usr/lib/jvm/java-6-openjdk/include/linux -Ibuild/_lucene
-I/usr/lib/python2.5/site-packages/JCC-2.4.1-py2.5-linux-x86_64.egg/jcc/sources
-I/usr/include/python2.5 -c build/_lucene/__wrap03__.cpp -o
build/temp.linux-x86_64-2.5/build/_lucene/__wrap03__.o
-fno-strict-aliasing -Wno-write-strings
cc1plus: warning: command line option "-Wstrict-prototypes" is valid
for Ada/C/ObjC but not for C++
gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall
-Wstrict-prototypes -fPIC -DPYTHON=1 -D_jcc_shared=1
-I/usr/lib/jvm/java-6-openjdk/include
-I/usr/lib/jvm/java-6-openjdk/include/linux -Ibuild/_lucene
-I/usr/lib/python2.5/site-packages/JCC-2.4.1-py2.5-linux-x86_64.egg/jcc/sources
-I/usr/include/python2.5 -c build/_lucene/__wrap02__.cpp -o
build/temp.linux-x86_64-2.5/build/_lucene/__wrap02__.o
-fno-strict-aliasing -Wno-write-strings
cc1plus: warning: command line option "-Wstrict-prototypes" is valid
for Ada/C/ObjC but not for C++
/usr/lib/python2.5/site-packages/JCC-2.4.1-py2.5-linux-x86_64.egg/jcc/sources/functions.h:
In function ‘PyObject* get_iterator_next(T*) [with T =
java::util::t_Iterator, U = java::lang::t_String, V =
java::lang::String]’:
build/_lucene/__wrap02__.cpp:57762: instantiated from here
/usr/lib/python2.5/site-packages/JCC-2.4.1-py2.5-linux-x86_64.egg/jcc/sources/functions.h:116:
error: no match for ‘operator=’ in ‘next =
java::util::Iterator::next() const()’
build/_lucene/java/lang/String.h:28: note: candidates are:
java::lang::String& java::lang::String::operator=(const
java::lang::String&)
error: command 'gcc' failed with exit status 1
make: *** [compile] Error 1


I suppose this is a compatibility problem between PyLucene from SVN and JCC
2.4.1. Again on Linux (Debian Lenny, 64-bit).

Regards from Canary Islands

Manuel Padrón Martínez


Pylucene and JCC 2.4.1

2009-10-28 Thread Manolo Padron Martinez
Hi:

After compiling JCC 2.4.1 successfully, I tried to compile PyLucene and
got the following error:

gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall
-Wstrict-prototypes -fPIC -DPYTHON=1 -D_jcc_shared=1
-I/usr/lib/jvm/java-6-openjdk/include
-I/usr/lib/jvm/java-6-openjdk/include/linux -Ibuild/_lucene
-I/usr/lib/python2.5/site-packages/JCC-2.4.1-py2.5-linux-x86_64.egg/jcc/sources
-I/usr/include/python2.5 -c build/_lucene/__wrap03__.cpp -o
build/temp.linux-x86_64-2.5/build/_lucene/__wrap03__.o
-fno-strict-aliasing -Wno-write-strings
cc1plus: warning: command line option "-Wstrict-prototypes" is valid
for Ada/C/ObjC but not for C++
gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall
-Wstrict-prototypes -fPIC -DPYTHON=1 -D_jcc_shared=1
-I/usr/lib/jvm/java-6-openjdk/include
-I/usr/lib/jvm/java-6-openjdk/include/linux -Ibuild/_lucene
-I/usr/lib/python2.5/site-packages/JCC-2.4.1-py2.5-linux-x86_64.egg/jcc/sources
-I/usr/include/python2.5 -c build/_lucene/__wrap02__.cpp -o
build/temp.linux-x86_64-2.5/build/_lucene/__wrap02__.o
-fno-strict-aliasing -Wno-write-strings
cc1plus: warning: command line option "-Wstrict-prototypes" is valid
for Ada/C/ObjC but not for C++
/usr/lib/python2.5/site-packages/JCC-2.4.1-py2.5-linux-x86_64.egg/jcc/sources/functions.h:
In function ‘PyObject* get_iterator_next(T*) [with T =
java::util::t_Iterator, U = java::lang::t_String, V =
java::lang::String]’:
build/_lucene/__wrap02__.cpp:57762:   instantiated from here
/usr/lib/python2.5/site-packages/JCC-2.4.1-py2.5-linux-x86_64.egg/jcc/sources/functions.h:116:
error: no match for ‘operator=’ in ‘next =
java::util::Iterator::next() const()’
build/_lucene/java/lang/String.h:28: note: candidates are:
java::lang::String& java::lang::String::operator=(const
java::lang::String&)
error: command 'gcc' failed with exit status 1
make: *** [compile] Error 1


I suppose this is a compatibility problem between PyLucene from SVN and JCC
2.4.1. Again on Linux (Debian Lenny, 64-bit).

Regards from Canary Islands

Manuel Padrón Martínez


Re: Problem compiling JCC

2009-10-28 Thread Manolo Padron Martinez
Hi:

Sorry, I forgot the details. Yes, it's Debian Lenny, now with the latest JCC.


>> The attached patch resolves the bad version checking code.
>> Could you please apply it to jcc's setup.py and let me know if it solves
>> your problem ? In particular, as this is a new, unknown, version of
>> setuptools, I'd be interested to know if it works for you.
>> I did a test build on Mac OS X 10.6 with it and all seemed correct.


This patch doesn't seem to apply to the JCC that comes with PyLucene. I mean:

patch -p0 < setup.py.patch
(Stripping trailing CRs from patch.)
patching file setup.py
Hunk #1 FAILED at 109.
1 out of 1 hunk FAILED -- saving rejects to file setup.py.rej

But the latest JCC seems to compile without problems with setuptools
0.6c11 (after patching it with the shared mode patch).

So, thanks :)

Regards from Canary Islands

Manuel Padrón Martínez