tokenizer to strip a set of characters

2013-11-21 Thread Stephane Nicoll
Hi, I am using Lucene 3.6 and I am looking for a tokenizer that would remove certain characters when they are present at the beginning or at the end of a token. I initially used the StandardAnalyzer and switched to the WhitespaceAnalyzer because the former was too aggressive for my use case. A few example
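
The behavior asked for here can be sketched in plain Java, independent of any Lucene API: split on whitespace, then trim a configurable set of characters from both ends of each token. The class name EdgeStripper, its methods, and the character set are hypothetical and chosen only for illustration; in Lucene itself this logic would presumably sit in a custom TokenFilter placed after a whitespace tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: strips a set of characters
// from the beginning and end of whitespace-separated tokens.
final class EdgeStripper {

    // Removes every leading and trailing character of `token` that occurs in `chars`.
    // Characters in the middle of the token are left untouched.
    static String strip(String token, String chars) {
        int start = 0;
        int end = token.length();
        while (start < end && chars.indexOf(token.charAt(start)) >= 0) {
            start++;
        }
        while (end > start && chars.indexOf(token.charAt(end - 1)) >= 0) {
            end--;
        }
        return token.substring(start, end);
    }

    // Splits on whitespace, strips each token, and drops tokens that become empty.
    static List<String> tokenize(String text, String chars) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String stripped = strip(raw, chars);
            if (!stripped.isEmpty()) {
                tokens.add(stripped);
            }
        }
        return tokens;
    }
}
```

With the character set "(),!" the input "(hello), #world! a-b" yields the tokens hello, #world and a-b: only the edges are cleaned, so the inner hyphen and the leading # survive.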

Re: Twitter analyser

2013-11-09 Thread Stephane Nicoll
, 2013 at 9:45 AM, Stephane Nicoll wrote: > Hi, > > This is what I've tried: > https://gist.github.com/anonymous/7383104 > > So far so good except that something is definitely wrong in my code as the > synonym is not emitted as a valid token it seems. This is how my in

Re: Twitter analyser

2013-11-09 Thread Stephane Nicoll
Hi, This is what I've tried: https://gist.github.com/anonymous/7383104 So far so good except that something is definitely wrong in my code as the synonym is not emitted as a valid token it seems. This is how my indexing analyzer is built: private static final class MyIndexAnalyzer extends Reusa

Re: Twitter analyser

2013-11-05 Thread Stephane Nicoll
Hi, Thanks for the reply. It's an index with tweets so any word really is a target for this. This would mean a significant increase in the size of the index. My volumes are really small so that shouldn't be a problem (but performance/scalability is a concern). I have control over the query. Another solut

Re: lucene / hibernate search in cluster

2009-05-04 Thread Stephane Nicoll
On Mon, May 4, 2009 at 3:33 PM, no spam wrote: > 5 seconds seems short to me also but this is what our client wants and so I > need to get as close to this number as possible :) It's a system that > records live video 24x7 and up to date information is extremely important. > I have the hibernate

Re: lucene / hibernate search in cluster

2009-05-03 Thread Stephane Nicoll
Hi, There are many more alternatives to the JMS bridge. There is also the ability to do an incremental copy of the index over a shared filesystem, for instance. That being said, 5 seconds seems really short to me. I read all this in "Hibernate Search In Action" but I suppose the online mater

Re: newbie question on querying on multiple attributes

2008-12-16 Thread Stephane Nicoll
Consider the use of the ClassBridge in Hibernate Search. Very useful. It basically allows you to merge multiple fields of your Hibernate entity into a single Lucene field. Once this is done, you can query this single field from Lucene without the need for a BooleanQuery. HTH, Stéphane On Tue, Dec

Re: confused about an entry in the FAQ

2008-05-24 Thread Stephane Nicoll
s, Stéphane > > Emmanuel > > On May 12, 2008, at 06:13, Stephane Nicoll wrote: > >> Hibernate Search introduces deadlock with multiple threads and the >> lucene integration in spring modules does not seem to do what I want. > > -- L

Re: confused about an entry in the FAQ

2008-05-13 Thread Stephane Nicoll
ping. Sorry for the long email but I prefer to provide all information first. On Mon, May 12, 2008 at 12:13 PM, Stephane Nicoll <[EMAIL PROTECTED]> wrote: > I tried all this and I am confused about the result. I am trying to > implement a hybrid query handler where I fetch th

Re: confused about an entry in the FAQ

2008-05-12 Thread Stephane Nicoll
.doc(int i, FieldSelector fieldSelector) method? > > Could be faster because Lucene doesn't have to "prepare" the whole document. > > Patrick > > > On Sat, May 10, 2008 at 9:35 AM, Stephane Nicoll > <[EMAIL PROTECTED]> wrote: > > > > From the FAQ:

confused about an entry in the FAQ

2008-05-10 Thread Stephane Nicoll
From the FAQ: "Don't iterate over more hits than needed. Iterating over all hits is slow for two reasons. Firstly, the search() method that returns a Hits object re-executes the search internally when you need more than 100 hits. Solution: use the search method that takes a HitCollector instead."

Re: hybrid query (lucene + db)

2008-05-02 Thread Stephane Nicoll
We can use Lucene as Oracle Text, but with many other > features, and using inline pagination we can get better performance > than the latest 11g Text Compound Domain Index. > If you are interested in this implementation simply drop me an email. > Best regards, Marcelo. > > > >

Re: hybrid query (lucene + db)

2008-05-02 Thread Stephane Nicoll
's FieldCache) but it may give you food for thought. > > Cheers > Mark > > > > > - Original Message > From: Stephane Nicoll <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Sent: Thursday, 1 May, 2008 9:00:33 AM > Subject: hybrid

Re: hybrid query (lucene + db)

2008-05-01 Thread Stephane Nicoll
p://issues.apache.org/jira/browse/LUCENE-434 > > > > KeyMap.java embodies the core service which translates from lucene doc ids > > to DB primary keys or vice versa. > > There are a couple of implementations of KeyMap that are not optimal (they > > pre-date Lucene's

hybrid query (lucene + db)

2008-05-01 Thread Stephane Nicoll
Hi there, We're using Lucene with Hibernate Search and we're very happy so far with the performance and the usability of Lucene. We have, however, a specific use case that prevents us from using only Lucene: spatial queries. I already sent a mail to this list a while back about the problem and we starte

Re: HELP...compiling first program for lucene Indexer.java

2008-02-24 Thread Stephane Nicoll
Field.Text was deprecated and removed a while back. You probably found an old code sample on the Internet that is not applicable anymore to 2.3. Just create your Fields with the constructor: new Field("") See the Javadoc for details. HTH, Stéphane On Sun, Feb 24, 2008 at 6:41 AM, sumit

Re: how to safely periodically reopen the IndexReader?

2008-02-21 Thread Stephane Nicoll
On Mon, Feb 18, 2008 at 6:08 PM, <[EMAIL PROTECTED]> wrote: > We have the same situation and use an atomic counter. Basically, we have > a SearcherHolder class and a SearcherManager class. The SearcherHolder > holds the searcher and the number of threads referencing the searcher. > > When the
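
The atomic-counter idea quoted above might look roughly like the following sketch. It is only an illustration under stated assumptions: the quoted message is cut off before it explains what happens on release, so closing the resource once the last reference is dropped is a guess, and the generic type, the method names tryAcquire and release, and the initial owner reference are all hypothetical. The holder wraps any AutoCloseable rather than a real Lucene searcher so that it stays self-contained.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reference-counted holder; not the poster's actual class.
final class SearcherHolder<T extends AutoCloseable> {

    private final T searcher;
    // Starts at 1: the reference owned by whoever created the holder
    // (for instance a manager that later swaps in a fresh searcher).
    private final AtomicInteger refs = new AtomicInteger(1);

    SearcherHolder(T searcher) {
        this.searcher = searcher;
    }

    // Takes a reference for the calling thread.
    // Returns false if the holder has already been fully released.
    boolean tryAcquire() {
        while (true) {
            int current = refs.get();
            if (current == 0) {
                return false;
            }
            if (refs.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    T get() {
        return searcher;
    }

    // Drops one reference; the thread that drops the last one closes the resource.
    // Callers must pair every successful tryAcquire() with exactly one release().
    void release() {
        if (refs.decrementAndGet() == 0) {
            try {
                searcher.close();
            } catch (Exception e) {
                throw new IllegalStateException("failed to close searcher", e);
            }
        }
    }
}
```

A search thread would call tryAcquire(), use get(), and call release() in a finally block. When a new searcher is opened, the owner releases its own reference on the old holder, so the old searcher is closed only after the last in-flight search finishes.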

Re: Using lucene with a Geospatial catalog

2008-02-17 Thread Stephane Nicoll
our corpus. > > There's one thing we never implemented, which was calculating the minimum > distance between two geometries (we almost always have one side of the > comparison as a point). Do you happen to know a reasonably speedy algorithm > to do this? > > Thanks! >

Using lucene with a Geospatial catalog

2008-02-17 Thread Stephane Nicoll
Hi, I've been browsing the archive and the documentation about Lucene. It really seems that it could help implement my use case but I would like to be sure first. What I need is to be able to search data in a "catalog" which is geo-enabled. The data is stored in a database. A record has namely