Luke for Lucene 2.x ?

2006-08-04 Thread KEGan
Hi, I have read that Andrzej Bialecki mentioned that he would release a new version of Luke based on Lucene 2.0.0 soon. URL here ... http://www.mail-archive.com/java-user@lucene.apache.org/msg08612.html. Does anyone know whether it has been released? Andrzej, if you are reading this, could you pl

Re: Luke for Lucene 2.x ?

2006-08-04 Thread Miles Barr
KEGan wrote: I have read that Andrzej Bialecki mentioned that he would release a new version of Luke based on Lucene 2.0.0 soon. URL here ... http://www.mail-archive.com/java-user@lucene.apache.org/msg08612.html. Does anyone know whether it has been released? Andrzej, if you are reading this, co

Re: Luke for Lucene 2.x ?

2006-08-04 Thread Erick Erickson
I've been using Luke against indexes created with Lucene 2.0 for a while, no problem. I haven't been compiling either Luke or Lucene though... FWIW. Erick On 8/4/06, Miles Barr <[EMAIL PROTECTED]> wrote: KEGan wrote: > I have read that *Andrzej Bialecki *mentioned that he would release new >

Re: EMAIL ADDRESS: Tokenize (i.e. an EmailAnalyzer)

2006-08-04 Thread Michael J. Prichard
Chris Hostetter wrote: : Sure I would love to! Can you ping me at [EMAIL PROTECTED] and : let me know what I need to do? Do I just post it to JIRA? instructions on submitting code can be found in the wiki.. http://wiki.apache.org/jakarta-lucene/HowToContribute note in particular that since

RE: Scoring a document (count?)

2006-08-04 Thread Russell M. Allen
Doron, thanks for the code offer. That would be great. I was able to get a partial implementation working myself, but I ran into some issues (most of which are rooted in a lack of understanding of Lucene internals on my part). I am sure I can learn a few things from your solution to this problem

Re: EMAIL ADDRESS: Tokenize (i.e. an EmailAnalyzer)

2006-08-04 Thread Chris Hostetter
: Oh boy, how embarrassing for me. I have never used any unit tests...I : know I know...don't freak out people :) I pretty much just started : really coding in Java. So, is there anyone out there who has time to : help me add these to the code? I would appreciate it and, on the plus : side, th

Re: EMAIL ADDRESS: Tokenize (i.e. an EmailAnalyzer)

2006-08-04 Thread Simon Willnauer
Hey Michael, you might fall in love with the green bar after using it for a while. If you use Eclipse or some other IDE that includes JUnit, it is so easy and powerful. You might find some help here as well: http://www.onjava.com/pub/a/onjava/2004/02/04/juie.html http://www.junit.org/news/article/index

Stemmer Implementation Strategy - feedback?

2006-08-04 Thread Marios Skounakis
Hi all, The contrib section of Lucene contains a Greek Analyzer, which however only does some letter normalization (capitals to lowercase, accent removal) and basic stop word removal. I am interested in creating a Stemmer for the Greek Language to use with Lucene (i.e. implement it as an anal

Re: scoring formula

2006-08-04 Thread Zhao, Xin
Hi, Erik, What do you think about the difference? Thank you very much for your reply, Xin - Original Message - From: "Erik Hatcher" <[EMAIL PROTECTED]> To: Sent: Wednesday, August 02, 2006 3:56 PM Subject: Re: scoring formula Please disregard my previous quick reply as I did not full

running a lucene indexing app as a windows service on xp, crashing

2006-08-04 Thread Mark Modrall
Hi... Not sure what's the right group for this question. We have a java program running with the tanuki Windows Service Wrapper on XP. This program is using Lucene to do a fair amount of indexing, creating and deleting scads of files in its working directory. The Lucene code is crashing u

Re: running a lucene indexing app as a windows service on xp, crashing

2006-08-04 Thread Michael McCandless
The Lucene code is crashing under circumstances that seem pretty lame. At periodic intervals, lucene tries to File.renameTo(newfile). Sometimes this fails, so Lucene implemented some fall-back code to manually copy the contents of the file from old to new. Our problem is that sometimes *this* f
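The rename-then-copy behaviour described above can be illustrated with a minimal sketch. This is not Lucene's actual code: the class `RenameWithFallback`, its method names, and the retry loop with a short pause are all assumptions added for illustration. The thread itself only says that a failed `File.renameTo` is followed by a manual copy.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of Lucene.
class RenameWithFallback {

    // Try File.renameTo up to `attempts` times, pausing between tries
    // (the retry is an assumption, not something the thread describes).
    // If every attempt fails, fall back to copying the bytes and then
    // deleting the source file.
    static void rename(File from, File to, int attempts, long pauseMillis)
            throws IOException {
        for (int i = 0; i < attempts; i++) {
            if (from.renameTo(to)) {
                return;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("Interrupted while renaming " + from);
            }
        }
        // Fall-back path: manual copy from old to new.
        copy(from, to);
        if (!from.delete()) {
            throw new IOException("Copied " + from + " to " + to
                    + " but could not delete the source");
        }
    }

    // Copy the whole content of `from` into `to`, overwriting `to`.
    static void copy(File from, File to) throws IOException {
        try (InputStream in = new FileInputStream(from);
             OutputStream out = new FileOutputStream(to)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }
}
```

A call such as `RenameWithFallback.rename(oldFile, newFile, 3, 100L)` would try the rename three times before copying. Whether a retry actually helps in the service-wrapper setup discussed here is untested.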

RE: running a lucene indexing app as a windows service on xp, crashing

2006-08-04 Thread Mark Modrall
Hi Michael... Here's the traceback:
[Indexer.java 652] buildFullIndex: Error building full index
java.io.IOException: Cannot rename D:\indexbuild1\contact_index\deleteable.new to D:\indexbuild1\contact_index\deletable
    at org.apache.lucene.store.FSDirectory.renameFile(FSDirectory.java:294)
    at org.

Re: wildcards and spans

2006-08-04 Thread Doron Cohen
A thought - would you (or the project lead;-) consider limiting the 'wildcard expansion'? Assuming a query like: ( uni* near(5) science ) I.e. match docs with any word with prefix "uni" that spans no further than 5 from the word "science". Assume current lexicon has M (say 1200) words st
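One way to picture a capped wildcard expansion is the sketch below. It is only an illustration under stated assumptions: the class `LimitedPrefixExpansion`, the in-memory `SortedSet` standing in for the lexicon, and the choice to keep the first `maxTerms` matches in lexicographic order are all hypothetical. The message is cut off before it says how the limit would be chosen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration, not Lucene code.
class LimitedPrefixExpansion {

    // Return at most maxTerms lexicon entries that start with prefix.
    // In a sorted set, all terms sharing a prefix are contiguous and
    // begin at the first element >= prefix, so the scan can stop at the
    // first non-matching term.
    static List<String> expand(SortedSet<String> lexicon, String prefix, int maxTerms) {
        List<String> expanded = new ArrayList<String>();
        for (String term : lexicon.tailSet(prefix)) {
            if (!term.startsWith(prefix) || expanded.size() >= maxTerms) {
                break;
            }
            expanded.add(term);
        }
        return expanded;
    }

    public static void main(String[] args) {
        SortedSet<String> lexicon = new TreeSet<String>();
        lexicon.add("science");
        lexicon.add("unicorn");
        lexicon.add("uniform");
        lexicon.add("union");
        lexicon.add("university");
        // Expand "uni*" to no more than three terms.
        System.out.println(expand(lexicon, "uni", 3));
    }
}
```

The expanded terms would then feed the span query in place of the full set of M matching words. Picking the cut-off by term frequency rather than by alphabetical order would be an equally plausible choice.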

is there a simple way to make a list of all words in an index?

2006-08-04 Thread Bill Taylor
I note that Luke is able to create and display a list of all words in the dictionary in descending order of frequency, but I would like to be able to get a simple list of all words in the dictionary, preferably in a file. I can clearly modify Luke to do this, but I hoped that someone else had

Re: is there a simple way to make a list of all words in an index?

2006-08-04 Thread Otis Gospodnetic
Luke probably uses Lucene's TermEnum class, which is the class you, too, can use to get the list of terms. Otis - Original Message From: Bill Taylor <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Saturday, August 5, 2006 1:04:21 AM Subject: is there a simple way to make a li

Re: is there a simple way to make a list of all words in an index?

2006-08-04 Thread Doron Cohen
See IndexReader methods - terms() and terms(Term) - and Lucene FAQ -
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html#terms()
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html#terms(org.apache.lucene.index.Term)
http://wiki.apache.org/j

Re: running a lucene indexing app as a windows service on xp, crashing

2006-08-04 Thread eks dev
This is a Windows/JVM issue. Have a look at how Ant is dealing with it, maybe we could give it a try with something like that (I have not noticed Ant having problems). We are not able to reproduce this in our environment systematically, so it would be great if you could patch your lucene with th