RE: Using the highlighter from the sandbox with a prefix query.

2005-02-21 Thread Michael Celona
Thank you this helped a lot...

Michael Celona

-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 21, 2005 11:55 AM
To: Lucene Users List
Subject: Re: Using the highlighter from the sandbox with a prefix query.


On Feb 21, 2005, at 10:53 AM, Michael Celona wrote:

> That's the only stack trace I get.  One thing to mention is that I am using
> a MultiSearcher to rewrite the queries. I tried...
>
> query = searcher_last.rewrite( query );
> query = searcher_cur.rewrite( query );
>
> using IndexSearcher and I don't get an error... However, I am not able to
> highlight wildcard queries.

I use Highlighter for lucenebook.com and have two indexes that I search 
with MultiSearcher.  Here's how I highlight:

 IndexReader reader = readers[indexIndex];
 QueryScorer scorer = new QueryScorer(query.rewrite(reader));
 SimpleHTMLFormatter formatter =
 new SimpleHTMLFormatter("",
 "");
 Highlighter highlighter = new Highlighter(formatter, scorer);

I get the appropriate IndexReader for the document being highlighted.  
You can get the index _index_ this way:

 int indexIndex = searcher.subSearcher(hits.id(position));

Hope this helps.

Erik
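
A minimal sketch of why the rewrite step matters, written in plain Java rather
than against the Lucene API. A prefix such as "luc" names no concrete terms
until it is expanded against the terms in an index, and a highlighter can only
mark concrete terms. The class and method names below (PrefixHighlightSketch,
expandPrefix, highlight) are hypothetical, and the bracket markers stand in for
whatever formatter is in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; this is not Lucene code.
final class PrefixHighlightSketch {

    /** Expands a prefix into the concrete index terms that start with it. */
    static List<String> expandPrefix(String prefix, List<String> indexTerms) {
        List<String> matches = new ArrayList<>();
        for (String term : indexTerms) {
            if (term.startsWith(prefix)) {
                matches.add(term);
            }
        }
        return matches;
    }

    /** Wraps each space-separated word that equals one of the terms in [ ]. */
    static String highlight(String text, List<String> terms) {
        StringBuilder out = new StringBuilder();
        for (String word : text.split(" ")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            if (terms.contains(word.toLowerCase(Locale.ROOT))) {
                out.append('[').append(word).append(']');
            } else {
                out.append(word);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> indexTerms = Arrays.asList("lucene", "lucid", "search");
        String text = "Lucene makes search lucid";

        // Without expansion the highlighter sees only the literal "luc*".
        System.out.println(highlight(text, Arrays.asList("luc*")));

        // After expansion it sees the concrete terms and can mark them.
        System.out.println(highlight(text, expandPrefix("luc", indexTerms)));
    }
}
```

With two indexes, the expansion has to run against the terms of the index that
holds the document being highlighted, which is what picking the reader by
sub-searcher index achieves in the message above.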


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]






RE: Using the highlighter from the sandbox with a prefix query.

2005-02-21 Thread Michael Celona
That's the only stack trace I get.  One thing to mention is that I am using
a MultiSearcher to rewrite the queries. I tried...

query = searcher_last.rewrite( query );
query = searcher_cur.rewrite( query );

using IndexSearcher and I don't get an error... However, I am not able to
highlight wildcard queries.

Michael 

-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 21, 2005 10:32 AM
To: Lucene Users List
Subject: Re: Using the highlighter from the sandbox with a prefix query.


On Feb 21, 2005, at 10:20 AM, Michael Celona wrote:

> I am using
>   query = searcher.rewrite( query );
>
> and it is throwing java.lang.UnsupportedOperationException .
>
> Am I able to use the searcher rewrite method like this?

What's the full stack trace?

Erik





RE: Using the highlighter from the sandbox with a prefix query.

2005-02-21 Thread Michael Celona
I am using
query = searcher.rewrite( query );

and it is throwing java.lang.UnsupportedOperationException .

Am I able to use the searcher rewrite method like this?

Thanks,
Michael

-Original Message-
From: Daniel Naber [mailto:[EMAIL PROTECTED] 
Sent: Thursday, February 17, 2005 4:09 AM
To: Lucene Users List
Subject: Re: Using the highlighter from the sandbox with a prefix query.

On Thursday 17 February 2005 08:37, lucuser4851 wrote:

>  We have been using the highlighter from the lucene sandbox, which works
> very nicely most of the time. However when we try and use it with a
> prefix query (which is what you get having parsed a wild-card query), it
> doesn't return any highlighted sections. Has anyone else experienced
> this problem, or found a way around it?

You need to call rewrite() on the query before you pass it to the
highlighter.

Regards
 Daniel




RE: Search Performance

2005-02-18 Thread Michael Celona
Just tried that... works like a charm... thanks...

Michael

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 4:42 PM
To: Lucene Users List; Chris Lamprecht
Subject: Re: Search Performance

Or you could just open a new IndexSearcher, forget the old one, and
have GC collect it when everyone is done with it.

Otis

--- Chris Lamprecht <[EMAIL PROTECTED]> wrote:

> I should have mentioned, the reason for not doing this the obvious,
> simple way (just close the Searcher and reopen it if a new version is
> available) is because some threads could be in the middle of iterating
> through the search Hits.  If you close the Searcher they get a Bad
> file descriptor IOException.  As I found out the hard way :)
> 
> 
> On Fri, 18 Feb 2005 15:03:29 -0600, Chris Lamprecht
> <[EMAIL PROTECTED]> wrote:
> > I recently dealt with the issue of re-using a Searcher with an index
> > that changes often.  I wrote a class that allows my searching classes
> > to "check out" a lucene Searcher, perform a search, and then return
> > the Searcher.  It's similar to a database connection pool, except that
> 
> 
> 





RE: Search Performance

2005-02-18 Thread Michael Celona
Thanks... I am seeing this problem right now.  Has anyone implemented a
better solution?

Michael

-Original Message-
From: Chris Lamprecht [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 4:14 PM
To: Lucene Users List
Subject: Re: Search Performance

I should have mentioned, the reason for not doing this the obvious,
simple way (just close the Searcher and reopen it if a new version is
available) is because some threads could be in the middle of iterating
through the search Hits.  If you close the Searcher they get a Bad
file descriptor IOException.  As I found out the hard way :)


On Fri, 18 Feb 2005 15:03:29 -0600, Chris Lamprecht
<[EMAIL PROTECTED]> wrote:
> I recently dealt with the issue of re-using a Searcher with an index
> that changes often.  I wrote a class that allows my searching classes
> to "check out" a lucene Searcher, perform a search, and then return
> the Searcher.  It's similar to a database connection pool, except that
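
One way the "check out" idea above might look, sketched in plain Java with
reference counting so that a retired searcher is closed only after the last
thread has returned it. This is not the class described in the message (which
is not shown in the thread); RefCounted, checkOut, checkIn and retire are
hypothetical names, and the wrapped resource is any Closeable rather than a
Lucene Searcher.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical holder: closes the wrapped resource when nobody uses it anymore.
final class RefCounted<T extends Closeable> {
    private final T resource;
    private int refs = 1;            // the owner's own reference
    private boolean closed = false;

    RefCounted(T resource) {
        this.resource = resource;
    }

    /** A search thread borrows the resource; must be paired with checkIn(). */
    synchronized T checkOut() {
        if (closed) {
            throw new IllegalStateException("resource already closed");
        }
        refs++;
        return resource;
    }

    /** A search thread is done, including iterating over its results. */
    synchronized void checkIn() {
        release();
    }

    /** The owner calls this once when a newer index version replaces this one. */
    synchronized void retire() {
        release();
    }

    synchronized boolean isClosed() {
        return closed;
    }

    // Closes the resource when the last reference is dropped.
    private void release() {
        if (--refs == 0) {
            closed = true;
            try {
                resource.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

The sketch assumes retire() is called exactly once and that the owner stops
handing out a retired holder; a production version would guard both.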




RE: Search Performance

2005-02-18 Thread Michael Celona
I am using the highlighter... does this matter?



-Original Message-
From: David Spencer [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 2:05 PM
To: Lucene Users List
Subject: Re: Search Performance

Are you using the highlighter or doing anything non-trivial in 
displaying the results?

Are the pages being compressed (mod_gzip or some servlet equivalent)? 
This definitely helps, though to see the effect you may have to make 
sure your simulated users are "remote".

Also consider caching search results if it's reasonable to assume users 
may search for the same things.

I made some measurements on caching on my site:

http://www.searchmorph.com/weblog/index.php?id=41
http://www.searchmorph.com/weblog/index.php?id=40

And I use OSCache:

http://www.searchmorph.com/weblog/index.php?id=38
http://www.opensymphony.com/oscache/





Michael Celona wrote:

> What is the single best way to improve search performance?  I have
> an index in the 2G range stored on the local file system of the searcher.
> Under a load test of 5 simultaneous users my average search time is ~4700
> ms.  Under a load test of 10 simultaneous users my average search time is
> ~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
> Xeons.  Any ideas?
> 
>  
> 
> Michael
> 
> 





RE: Search Performance

2005-02-18 Thread Michael Celona
My index is changing in real time constantly... in this case I guess this
will not work for me.  Any suggestions?

Michael

-Original Message-
From: David Townsend [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 11:50 AM
To: Lucene Users List
Subject: RE: Search Performance

IndexSearchers are thread safe, so you can use the same object on multiple
requests.  If the index is static and not constantly updating, just keep one
IndexSearcher for the life of the app.  If the index changes and you need
that instantly reflected in the results, you need to check if the index has
changed and, if it has, create a new cached IndexSearcher.  To check for
changes, you'll need to monitor the version number of the index obtained via

IndexReader.getCurrentVersion(Index Name)

David
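
The caching scheme described above could be sketched roughly as follows, in
plain Java. VersionedCache is a hypothetical name; the version supplier stands
in for a call such as IndexReader.getCurrentVersion(...) and the factory for
opening a new IndexSearcher, neither of which is wired up here.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical cache: reuses one instance until the reported version changes.
final class VersionedCache<T> {
    private final LongSupplier currentVersion; // stand-in for the index version lookup
    private final Supplier<T> factory;         // stand-in for opening a new searcher
    private T cached;
    private long cachedVersion;

    VersionedCache(LongSupplier currentVersion, Supplier<T> factory) {
        this.currentVersion = currentVersion;
        this.factory = factory;
    }

    /** Returns the cached instance, replacing it first if the version moved. */
    synchronized T get() {
        long version = currentVersion.getAsLong();
        if (cached == null || version != cachedVersion) {
            cached = factory.get();
            cachedVersion = version;
        }
        return cached;
    }

    public static void main(String[] args) {
        long[] version = {1L};
        VersionedCache<Object> cache =
                new VersionedCache<>(() -> version[0], Object::new);
        Object first = cache.get();
        System.out.println(first == cache.get()); // same version, same instance
        version[0] = 2L;
        System.out.println(first == cache.get()); // version changed, new instance
    }
}
```

The sketch deliberately does not close the instance it replaces, because other
threads may still be using it.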

-Original Message-
From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
Sent: 18 February 2005 16:15
To: Lucene Users List
Subject: Re: Search Performance


Try a singleton pattern or a static field.

Stefan

Michael Celona wrote:

>I am creating new IndexSearchers... how do I cache my IndexSearcher...
>
>Michael
>
>-Original Message-
>From: David Townsend [mailto:[EMAIL PROTECTED] 
>Sent: Friday, February 18, 2005 11:00 AM
>To: Lucene Users List
>Subject: RE: Search Performance
>
>Are you creating new IndexSearchers or IndexReaders on each search?  Caching
>your IndexSearchers has a dramatic effect on speed.
>
>David Townsend
>
>-Original Message-
>From: Michael Celona [mailto:[EMAIL PROTECTED]
>Sent: 18 February 2005 15:55
>To: Lucene Users List
>Subject: Search Performance
>
>
>What is the single best way to improve search performance?  I have
>an index in the 2G range stored on the local file system of the searcher.
>Under a load test of 5 simultaneous users my average search time is ~4700
>ms.  Under a load test of 10 simultaneous users my average search time is
>~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
>Xeons.  Any ideas?
>
> 
>
>Michael
>
>
>
>
>  
>





RE: Search Performance

2005-02-18 Thread Michael Celona
I am creating new IndexSearchers... how do I cache my IndexSearcher...

Michael

-Original Message-
From: David Townsend [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 11:00 AM
To: Lucene Users List
Subject: RE: Search Performance

Are you creating new IndexSearchers or IndexReaders on each search?  Caching
your IndexSearchers has a dramatic effect on speed.

David Townsend

-Original Message-
From: Michael Celona [mailto:[EMAIL PROTECTED]
Sent: 18 February 2005 15:55
To: Lucene Users List
Subject: Search Performance


What is the single best way to improve search performance?  I have
an index in the 2G range stored on the local file system of the searcher.
Under a load test of 5 simultaneous users my average search time is ~4700
ms.  Under a load test of 10 simultaneous users my average search time is
~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
Xeons.  Any ideas?

 

Michael





Search Performance

2005-02-18 Thread Michael Celona
What is the single best way to improve search performance?  I have
an index in the 2G range stored on the local file system of the searcher.
Under a load test of 5 simultaneous users my average search time is ~4700
ms.  Under a load test of 10 simultaneous users my average search time is
~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
Xeons.  Any ideas?

 

Michael



java.io.IOException: Stale NFS file handle

2005-02-17 Thread Michael Celona
Has anyone seen this?

 

java.io.IOException: Stale NFS file handle
    at java.io.RandomAccessFile.readBytes(Native Method)
    at java.io.RandomAccessFile.read(RandomAccessFile.java:307)
    at org.apache.lucene.store.FSInputStream.readInternal(FSDirectory.java:420)
    at org.apache.lucene.store.InputStream.refill(InputStream.java:158)
    at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
    at org.apache.lucene.store.InputStream.readBytes(InputStream.java:57)
    at org.apache.lucene.index.CompoundFileReader$CSInputStream.readInternal(CompoundFileReader.java:220)
    at org.apache.lucene.store.InputStream.refill(InputStream.java:158)
    at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
    at org.apache.lucene.store.InputStream.readVInt(InputStream.java:83)
    at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:102)
    at org.apache.lucene.index.MultiTermDocs.next(MultiReader.java:361)
    at org.apache.lucene.index.MultiTermDocs.next(MultiReader.java:366)
    at org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:268)
    at org.apache.lucene.search.FieldCacheImpl.getAuto(FieldCacheImpl.java:343)
    at org.apache.lucene.search.FieldSortedHitQueue.comparatorAuto(FieldSortedHitQueue.java:327)
    at org.apache.lucene.search.FieldSortedHitQueue.getCachedComparator(FieldSortedHitQueue.java:170)
    at org.apache.lucene.search.FieldSortedHitQueue.<init>(FieldSortedHitQueue.java:58)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:122)
    at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:141)
    at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:64)
    at org.apache.lucene.search.Hits.<init>(Hits.java:51)
    at org.apache.lucene.search.Searcher.search(Searcher.java:49)

 

 

I get this during a load test of 5 simultaneous users.  I have the index NFS
mounted from an "indexer box", which holds the index, to an application server
(tomcat).  My index is constantly being added to.  Search performance is in
the 4 second range (queryString of "the") on an index of about 2G (as of
now).  Does anyone know how I can speed this up?

 

Any insight would be greatly appreciated.

 

Michael

 

 



RE: Similarity coord,lengthNorm

2005-02-07 Thread Michael Celona
Would fixing the lengthNorm to 1 fix this problem?

Michael

-Original Message-
From: Michael Celona [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 07, 2005 8:48 AM
To: Lucene Users List
Subject: Similarity coord,lengthNorm

I have varying length text fields which I am searching on.  I would like
relevancy to be dictated predominantly by the number of terms in my query
that match.  Right now I am seeing a high relevancy for a single word
matching in a small document even though not all the terms in my query
match.  Does anyone have an example of a custom Similarity subclass which
overrides the coord and lengthNorm methods?

 

Thanks..

Michael 







Similarity coord,lengthNorm

2005-02-07 Thread Michael Celona
I have varying length text fields which I am searching on.  I would like
relevancy to be dictated predominantly by the number of terms in my query
that match.  Right now I am seeing a high relevancy for a single word
matching in a small document even though not all the terms in my query
match.  Does anyone have an example of a custom Similarity subclass which
overrides the coord and lengthNorm methods?

 

Thanks..

Michael 
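
A small arithmetic sketch of the effect described above, in plain Java rather
than as a Similarity subclass. It assumes the default formulas are
coord = overlap / maxOverlap and lengthNorm = 1 / sqrt(numTerms), and it
ignores every other scoring factor (tf, idf, boosts, query normalization), so
the numbers only illustrate the direction of the change. SimilaritySketch and
its method names are hypothetical.

```java
// Hypothetical illustration only; not a Lucene Similarity implementation.
final class SimilaritySketch {

    /** Assumed default coord: fraction of the query terms that matched. */
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    /** Assumed default length norm: shorter fields get a larger factor. */
    static float defaultLengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    /** Flat length norm: field length no longer affects the score. */
    static float flatLengthNorm(int numTerms) {
        return 1.0f;
    }

    public static void main(String[] args) {
        // Short field (4 terms) matching 1 of 3 query terms.
        float shortDefault = coord(1, 3) * defaultLengthNorm(4);
        float shortFlat = coord(1, 3) * flatLengthNorm(4);

        // Long field (100 terms) matching 3 of 3 query terms.
        float longDefault = coord(3, 3) * defaultLengthNorm(100);
        float longFlat = coord(3, 3) * flatLengthNorm(100);

        System.out.println("default: short=" + shortDefault + " long=" + longDefault);
        System.out.println("flat:    short=" + shortFlat + " long=" + longFlat);
    }
}
```

Under these assumptions the short single-match field outranks the long
full-match field with the default norm, and the order flips once lengthNorm is
fixed at 1. Length norms are, as far as I know, written at index time, so a
change to lengthNorm would only take effect after reindexing.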



text highlighting

2005-01-26 Thread Michael Celona
Does anyone have a working example of the Highlighter class found in the
sandbox?

-Original Message-
From: Jason Polites [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, January 26, 2005 5:34 PM
To: Lucene Users List
Subject: Re: Search Engine review article/book

Also:

http://labs.google.com/papers.html
http://research.microsoft.com/wsm/

- Original Message - 
From: "Stefan Groschupf" <[EMAIL PROTECTED]>
To: "Lucene Users List" 
Sent: Thursday, January 27, 2005 9:27 AM
Subject: Re: Search Engine review article/book


>+  the lucene in action book. :-)
> +  scholar.google.com
> + acm.org ir group
> + ieee.org has ir group as well
> You may find http://searchenginewatch.com/ useful as well.
> 
> HTH
> Stefan
> 
> 
> Am 26.01.2005 um 23:18 schrieb Xiaohong Yang ((Sharon)):
> 
>> Hi all,
>>
>> I am looking for good review articles or books regarding latest search 
>> engine development trend and practices.  Any suggestions would be very 
>> helpful.  Any comments not covered by articles are also welcome.
>>
>> Thanks a lot,
>>
>> Sharon
>>
> ---
> company: http://www.media-style.com
> forum: http://www.text-mining.org
> blog: http://www.find23.net
> 
> 
> 
>
