Re: Index File

2004-11-15 Thread Luke Shannon
Hi Luke;

I implemented the logging like you said. At present I am spending about 678
milliseconds creating a new IndexSearcher.

I am going to implement your scheme to resolve this but at a later point
since I don't think this is a huge time factor to be worried about at
present.

Thanks for all your help,

Luke

- Original Message - 
From: "Luke Francl" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Monday, November 15, 2004 12:18 PM
Subject: Re: Index File


> As long as you are closing your IndexSearchers when you are done with
> them you should not have problems with file handles. When using Lucene
> 1.2 (pre-compound file format) on Windows, I ran into this problem
> because Windows only lets an application open something like 1000 file
> handles. On Unix the number is larger.
>
> To calculate the cost of creating a new IndexSearcher for each search,
> just put a timing statement around your call to IndexSearcher.open and
> IndexSearcher.close. Your performance cost is the time used when opening
> and closing unnecessarily.
>
> Here is a description of the scheme I used to manage this issue:
>
> "Creating a new IndexSearcher for every request opens too many
> files. Searchers are thread-safe, so it is good to keep only one
> Searcher open at a time. However, we must create a new Searcher if
> the index has changed, and it is not safe to close a Searcher while
> it is still in use by a thread. The SearcherManager handles these
> cases.
>
> The SearcherManager keeps a list of the Searchers which are
> currently in use and returns the current one, or a new one if
> necessary. The SearcherManager uses a reference counting scheme to
> keep track of which Searchers are still being used. Callers must
> return their Searcher to the SearcherManager when done using it so
> the Searcher can be closed (releasing its filehandles) if no other
> threads are using it and the index has changed."
>
>
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>






Re: Index File

2004-11-15 Thread Luke Francl
As long as you are closing your IndexSearchers when you are done with
them you should not have problems with file handles. When using Lucene
1.2 (pre-compound file format) on Windows, I ran into this problem
because Windows only lets an application open something like 1000 file
handles. On Unix the number is larger.

To calculate the cost of creating a new IndexSearcher for each search,
just put a timing statement around your call to IndexSearcher.open and
IndexSearcher.close. Your performance cost is the time used when opening
and closing unnecessarily.
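
A minimal sketch of such a timing statement in Java. The helper name
OpenCloseTimer is made up for illustration, and the Runnable stands in for
whatever code opens and closes the searcher in your application; nothing
in it is Lucene API.

```java
/** Hypothetical helper: measures the wall-clock time of one task. */
final class OpenCloseTimer {
    private OpenCloseTimer() {}

    /** Runs the task once and returns the elapsed time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();   // monotonic clock, suited to measuring intervals
        task.run();                       // e.g. open a searcher, then close it
        return (System.nanoTime() - start) / 1_000_000L;
    }
}
```

Logging the returned value for one open/close cycle gives a per-search
figure to compare against the total time of a request.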

Here is a description of the scheme I used to manage this issue:

"Creating a new IndexSearcher for every request opens too many
files. Searchers are thread-safe, so it is good to keep only one
Searcher open at a time. However, we must create a new Searcher if
the index has changed, and it is not safe to close a Searcher while
it is still in use by a thread. The SearcherManager handles these
cases.

The SearcherManager keeps a list of the Searchers which are
currently in use and returns the current one, or a new one if
necessary. The SearcherManager uses a reference counting scheme to
keep track of which Searchers are still being used. Callers must
return their Searcher to the SearcherManager when done using it so
the Searcher can be closed (releasing its filehandles) if no other
threads are using it and the index has changed."
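
The scheme described above could be sketched roughly as follows. This is
not the original SearcherManager code, only an illustration under
assumptions: the class name RefCountedManager, the opener and indexVersion
suppliers, and the use of java.io.Closeable in place of a real Searcher
are all hypothetical.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical reference-counting manager; S stands in for a Searcher. */
final class RefCountedManager<S extends Closeable> {
    private final Supplier<S> opener;        // assumption: opens a searcher on the current index
    private final LongSupplier indexVersion; // assumption: value changes whenever the index changes
    private final Map<S, Integer> refCounts = new IdentityHashMap<>();
    private S current;
    private long currentVersion;

    RefCountedManager(Supplier<S> opener, LongSupplier indexVersion) {
        this.opener = opener;
        this.indexVersion = indexVersion;
    }

    /** Returns the current searcher, opening a new one if the index has changed. */
    synchronized S acquire() {
        long version = indexVersion.getAsLong();
        if (current == null || version != currentVersion) {
            S stale = current;
            current = opener.get();
            currentVersion = version;
            refCounts.put(current, 0);
            if (stale != null) {
                closeIfIdle(stale); // closed right away only if no thread still holds it
            }
        }
        refCounts.merge(current, 1, Integer::sum);
        return current;
    }

    /** Callers hand the searcher back when done so that a stale one can be closed. */
    synchronized void release(S searcher) {
        Integer count = refCounts.get(searcher);
        if (count == null || count == 0) {
            throw new IllegalStateException("searcher is not checked out from this manager");
        }
        refCounts.put(searcher, count - 1);
        if (searcher != current) {
            closeIfIdle(searcher); // stale searcher: close once the last user is done
        }
    }

    private void closeIfIdle(S searcher) {
        if (refCounts.get(searcher) == 0) {
            refCounts.remove(searcher);
            try {
                searcher.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

In this sketch the current searcher stays open even when its count drops
to zero; only a searcher that has been superseded by a newer one is closed.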





Re: Index File

2004-11-15 Thread Luke Shannon
Hi Luke;

I have tried to implement a system where the search method is aware whether
the index folder has been updated or not (so it only creates a new searcher
when required). However, due to the environment this process is running in
it turned out not to be a simple task. I need to move on to some
outstanding document types I still need to handle. But before I do that...

When you said "performance boost", how substantial might this be? For now,
each time I finish with a Searcher I close it and set the reference to null.
The application this is running in monitors the level of memory being
utilized. If utilization gets too high it requests the garbage collector to
run.

I am not sure I fully understand the risks involved with the file handle
issue you brought up, or what kind of burden I may be putting
on performance by not putting better logic around the creation of the
IndexSearcher. I am hoping things will be OK as they are until I can dedicate
more time to the issue.

Are there specific errors or behavior I should look for if I start to run
out of file handles?

Luke
- Original Message - 
From: "Luke Francl" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Monday, November 15, 2004 11:03 AM
Subject: Re: Index File


> On Mon, 2004-11-15 at 09:52, Luke Shannon wrote:
> > Once this was modified to create a new IndexSearcher for every search
> > request, all my problems went away.
>
> Be careful with this. You could conceivably run out of file handles.
> This problem got a lot better in Lucene 1.3 with the compound file
> format, but it could still happen if you have a lot of heap and aren't
> garbage collecting very often. So close the old one when you're done
> with it.
>
> Also, creating a new IndexSearcher only when the index has been modified
> will give you a performance boost because you do not have to open the
> index with every search.
>
> Luke Francl
>
>
>
>






Re: Index File

2004-11-15 Thread Luke Francl
On Mon, 2004-11-15 at 09:52, Luke Shannon wrote:
> Once this was modified to create a new IndexSearcher for every search
> request, all my problems went away.

Be careful with this. You could conceivably run out of file handles.
This problem got a lot better in Lucene 1.3 with the compound file
format, but it could still happen if you have a lot of heap and aren't
garbage collecting very often. So close the old one when you're done
with it.

Also, creating a new IndexSearcher only when the index has been modified
will give you a performance boost because you do not have to open the
index with every search. 

Luke Francl





Re: Index File

2004-11-15 Thread Luke Shannon
Based on Otis's suggestion I was able to resolve this issue. The class I was
integrating with for search created one IndexSearcher when it was
instantiated and kept that same reference throughout the session.

Once this was modified to create a new IndexSearcher for every search
request, all my problems went away.

- Original Message - 
From: "Luke Francl" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Monday, November 15, 2004 10:39 AM
Subject: RE: Index File


> On Fri, 2004-11-12 at 19:07, Richard Greenane wrote:
> > You might want to look at LUKE @ http://www.getopt.org/luke/
> > A great tool for checking the index to make sure that everything is
> > there
>
> There is also a web-based tool that you can run in your servlet
> container called LIMO. I've added some query features to it in CVS,
> which you can check out from Sourceforge:
> http://sourceforge.net/projects/limo
>
> But I will second what Otis said: you must (or rather your colleague
> must) check to see if the index has been updated before a search (use
> IndexReader.getCurrentVersion), and if it has, close the IndexSearcher
> and create a new one.
>
> Luke
>
>
>
>






RE: Index File

2004-11-15 Thread Luke Francl
On Fri, 2004-11-12 at 19:07, Richard Greenane wrote:
> You might want to look at LUKE @ http://www.getopt.org/luke/
> A great tool for checking the index to make sure that everything is
> there

There is also a web-based tool that you can run in your servlet
container called LIMO. I've added some query features to it in CVS,
which you can check out from Sourceforge:
http://sourceforge.net/projects/limo

But I will second what Otis said: you must (or rather your colleague
must) check to see if the index has been updated before a search (use
IndexReader.getCurrentVersion), and if it has, close the IndexSearcher
and create a new one.
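
A simple form of that check could look like the sketch below. It assumes
a single searcher that is not in use by another thread when it gets
replaced. The class name ReopenOnChange and both suppliers are
hypothetical; in a real application the version supplier would wrap a call
such as IndexReader.getCurrentVersion and the opener would create the
IndexSearcher.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical holder that reopens its searcher when the index version changes. */
final class ReopenOnChange<S extends Closeable> {
    private final Supplier<S> opener;        // assumption: opens a searcher on the index
    private final LongSupplier indexVersion; // assumption: reports the current index version
    private S searcher;
    private long openedAtVersion;

    ReopenOnChange(Supplier<S> opener, LongSupplier indexVersion) {
        this.opener = opener;
        this.indexVersion = indexVersion;
    }

    /** Call before each search; returns a searcher that matches the current version. */
    synchronized S get() {
        long now = indexVersion.getAsLong();
        if (searcher == null || now != openedAtVersion) {
            if (searcher != null) {
                try {
                    searcher.close(); // release the old searcher's file handles
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
            searcher = opener.get();
            openedAtVersion = now;
        }
        return searcher;
    }
}
```

Closing the old searcher immediately is only safe when no other thread can
still be searching with it.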

Luke





RE: Index File

2004-11-12 Thread Richard Greenane
You might want to look at LUKE @ http://www.getopt.org/luke/
A great tool for checking the index to make sure that everything is
there

Regards

Richard


-Original Message-
From: Luke Shannon [mailto:[EMAIL PROTECTED] 
Sent: 12 November 2004 23:54
To: Lucene Users List
Subject: Index File


Hi;

Is there some way to determine if specific contents are in the index
folder other than running a query against it?

I see that my document is being indexed. But when I run a query against
the index I get no results returned.

The weird thing is if I restart Tomcat and run the search again the
correct results are found.

Half this system was built by someone else (the execution of the search
query and displaying of the results). I only handle the indexing. I
would like to find a way to definitely ensure that the index file is
properly being created.

Any ideas? Anyone seen this before?

Thanks

Luke





Re: Index File

2004-11-12 Thread Otis Gospodnetic
If you add a Document to the index after you've opened an
IndexSearcher/Reader, your IndexSearcher/Reader will not see it.  You
have to open a new IS/R to see the newly added Documents.  This is
often covered on this list... I must have added this to Lucene FAQ at
jGuru, too.

Otis

--- Luke Shannon <[EMAIL PROTECTED]> wrote:

> Hi;
> 
> Is there some way to determine if specific contents are in the index
> folder other than running a query against it?
> 
> I see that my document is being indexed. But when I run a query
> against the index I get no results returned.
> 
> The weird thing is if I restart Tomcat and run the search again the
> correct results are found.
> 
> Half this system was built by someone else (the execution of the
> search query and displaying of the results). I only handle the
> indexing. I would like to find a way to definitely ensure that the
> index file is properly being created.
> 
> Any ideas? Anyone seen this before?
> 
> Thanks
> 
> Luke





RE: Index-file locking while searching?

2003-02-27 Thread Biswas, Goutam_Kumar
Hi Otis,
I too tried to disable the Lucene locks for searching the index by specifying
"$JAVACMD $TOMCAT_OPTS -DdisableLuceneLocks=true ." in tomcat.sh.
To make sure the variable is set properly I also tried to print its
value (see the line marked by an arrow below).
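
For reference, a check of this kind can be written as the small Java
sketch below. The class name LockPropertyCheck is made up for
illustration; only the property name comes from this thread. It shows
whether the JVM sees the property, which is a separate question from
whether the Lucene release in use consults it.

```java
/** Hypothetical helper that reports the system property discussed in this thread. */
final class LockPropertyCheck {
    static final String PROPERTY = "disableLuceneLocks";

    private LockPropertyCheck() {}

    /** True if the system property exists and equals "true", ignoring case. */
    static boolean locksDisabled() {
        return Boolean.getBoolean(PROPERTY);
    }

    /** One-line summary suitable for a log statement; shows "null" if unset. */
    static String describe() {
        return PROPERTY + " = " + System.getProperty(PROPERTY);
    }
}
```

If describe() reports null inside the servlet, the -D option is not
reaching the JVM that runs Tomcat.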

But I still got the following exception stack trace when searching a read-only
index directory:

INFO: disableLuceneLocks = true    <--
SEVERE: 
java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createNewFile(File.java:827)
at org.apache.lucene.store.FSDirectory$1.obtain(Unknown Source)
at org.apache.lucene.store.Lock$With.run(Unknown Source)
at org.apache.lucene.index.IndexReader.open(Unknown Source)
at org.apache.lucene.index.IndexReader.open(Unknown Source)
at org.apache.lucene.search.IndexSearcher.<init>(Unknown Source)
at deshaw.desearch.search.FileSearcher.search(FileSearcher.java:778)
at deshaw.desearch.search.FileSearcher.doPost(FileSearcher.java:212)
at deshaw.desearch.search.FileSearcher.doGet(FileSearcher.java:95)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:740)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:865)
at org.apache.tomcat.core.ServletWrapper.doService(ServletWrapper.java:404)
at org.apache.tomcat.core.Handler.service(Handler.java:286)
at org.apache.tomcat.core.ServletWrapper.service(ServletWrapper.java:372)
at org.apache.tomcat.core.ContextManager.internalService(ContextManager.java:797)
at org.apache.tomcat.core.ContextManager.service(ContextManager.java:743)
at org.apache.tomcat.service.http.HttpConnectionHandler.processConnection(HttpConnectionHandler.java:210)
at org.apache.tomcat.service.TcpWorkerThread.runIt(PoolTcpEndpoint.java:416)
at org.apache.tomcat.util.ThreadPool$ControlRunnable.run(ThreadPool.java:498)
at java.lang.Thread.run(Thread.java:536)
Feb 28, 2003 12:51:49 PM deshaw.desearch.search.FileSearcher doPost


I suspect there may be a problem with the Lucene version.
I am using Lucene version 1.2-rc4. I guess this is the most recent version.
Let me know which version you tested this with.

Thanks.

Regards,
-goutam- 



-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Monday, February 24, 2003 10:17 AM
To: Lucene Users List
Subject: Re: Index-file locking while searching?


Hello,

I think you didn't set that system property properly, or maybe you are
using some old Lucene release that does not have this functionality.
I just checked the source of FSDirectory, and the code looks right.

Otis

--- "Giri, Sandeep" <[EMAIL PROTECTED]> wrote:
> Hi!
> I don't want to give write permission to the index directory while
> searching. But Lucene needs write permission on the index directory so
> that it can create locks while searching.
> So, I tried to use "-DdisableLuceneLocks=true" but it's not working.
> It gives the following error:
> ---
> SEVERE: 
> java.io.IOException: Permission denied
> at java.io.UnixFileSystem.createFileExclusively(Native Method)
> at java.io.File.createNewFile(File.java:827)
> at org.apache.lucene.store.FSDirectory$1.obtain(Unknown Source)
> at org.apache.lucene.store.Lock$With.run(Unknown Source)
> at org.apache.lucene.index.IndexReader.open(Unknown Source)
> at org.apache.lucene.index.IndexReader.open(Unknown Source)
> at org.apache.lucene.search.IndexSearcher.<init>(Unknown Source)
> at FileSearcherCmdLine.search(FileSearcherCmdLine.java:93)
> at FileSearcherCmdLine.main(FileSearcherCmdLine.java:689)
> [Search Time]: 0.0 secs
> ---
> 
> What is the solution?
> Somebody, please help me out..
> 
> Thanks in advance.
> 
> Best Regards,
> Sandeep Giri
> Member Technical 
> D.E.Shaw India Software Pvt. Ltd. 
> Hyderabad.
> DISCLAIMER :"Any views expressed in this message are those of the
> individual
> sender, except where the sender specifically states them to be the
> views of
> D. E. Shaw India Software Private Limited., or any of its affiliates"
> 
> 
> 
> 



Re: Index-file locking while searching?

2003-02-23 Thread Otis Gospodnetic
Hello,

I think you didn't set that system property properly, or maybe you are
using some old Lucene release that does not have this functionality.
I just checked the source of FSDirectory, and the code looks right.

Otis

--- "Giri, Sandeep" <[EMAIL PROTECTED]> wrote:
> Hi!
> I don't want to give write permission to the index directory while
> searching. But Lucene needs write permission on the index directory so
> that it can create locks while searching.
> So, I tried to use "-DdisableLuceneLocks=true" but it's not working.
> It gives the following error:
> ---
> SEVERE: 
> java.io.IOException: Permission denied
> at java.io.UnixFileSystem.createFileExclusively(Native Method)
> at java.io.File.createNewFile(File.java:827)
> at org.apache.lucene.store.FSDirectory$1.obtain(Unknown Source)
> at org.apache.lucene.store.Lock$With.run(Unknown Source)
> at org.apache.lucene.index.IndexReader.open(Unknown Source)
> at org.apache.lucene.index.IndexReader.open(Unknown Source)
> at org.apache.lucene.search.IndexSearcher.<init>(Unknown Source)
> at FileSearcherCmdLine.search(FileSearcherCmdLine.java:93)
> at FileSearcherCmdLine.main(FileSearcherCmdLine.java:689)
> [Search Time]: 0.0 secs
> ---
> 
> What is the solution?
> Somebody, please help me out..
> 
> Thanks in advance.
> 
> Best Regards,
> Sandeep Giri
> Member Technical 
> D.E.Shaw India Software Pvt. Ltd. 
> Hyderabad.
> DISCLAIMER :"Any views expressed in this message are those of the
> individual
> sender, except where the sender specifically states them to be the
> views of
> D. E. Shaw India Software Private Limited., or any of its affiliates"
> 
> 
> 
> 

