Dear All,
I'm considering building an index for my application with Lucene.
However, the document sources I'm going to index contain many duplicates, so
before adding a document to an IndexWriter I would like to search the existing
index first to see whether a copy of the same document has already been added …
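A minimal sketch of that duplicate check, under these assumptions: each document carries a hypothetical "md5" field holding a content hash, the IndexSearcher is reopened often enough to see earlier adds, and a Lucene 2.4-era API is in use. The class and field names are illustrative, not from the original message.

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;

public class DedupIndexer {
    private final IndexWriter writer;
    private final IndexSearcher searcher;

    public DedupIndexer(IndexWriter writer, IndexSearcher searcher) {
        this.writer = writer;
        this.searcher = searcher;
    }

    // Adds the document only if no document with the same hash is already indexed.
    public void addIfAbsent(Document doc, String md5) throws Exception {
        TermQuery query = new TermQuery(new Term("md5", md5));
        if (searcher.search(query, 1).totalHits == 0) {
            doc.add(new Field("md5", md5, Field.Store.YES, Field.Index.NOT_ANALYZED));
            writer.addDocument(doc);
        }
    }
}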
FWIW, +1 from me on all this: when I started poking at my little problem, I
found, as you said, that there was really no way to trace the issue (one can
use the debugger of course, and I did, which is how I found the problem). So,
getRefCount() would be good!
thanks,
Paul
OK, I understand now. Like I said, anything you deem appropriate.
Best
Erick
On Fri, Mar 6, 2009 at 5:45 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> If we changed the signature (return value) then on dropping in the JAR
> you'd have to recompile your code, which violates our back compat
> goals, ie "drop in JAR and run".
If we changed the signature (return value) then on dropping in the JAR
you'd have to recompile your code, which violates our back compat
goals, ie "drop in JAR and run".
Mike
Erick Erickson wrote:
Why would it break back compat?
They just return void now, so
IndexReader.incRef();
should still compile/run.
I wrote a really basic read-only Directory implementation for indices
contained in zip files. It's read-only because that's what Java's API
supports, and it has no documentation or anything else because I
haven't gotten to that yet. It also claims its package is
org.apache.lucene.store, since that …
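For readers who just need to search an index packed in a zip and don't want to write a full Directory subclass, a simpler (and different) approach is to unpack the zip entries into a RAMDirectory. A minimal sketch, assuming a Lucene 2.x-era RAMDirectory/IndexOutput API; the class name is illustrative:

import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.store.RAMDirectory;

public class ZipIndexLoader {
    // Copies every entry of the zip file into an in-memory Directory.
    public static RAMDirectory load(String zipPath) throws Exception {
        RAMDirectory dir = new RAMDirectory();
        ZipFile zip = new ZipFile(zipPath);
        try {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) continue;
                IndexOutput out = dir.createOutput(entry.getName());
                InputStream in = zip.getInputStream(entry);
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.writeBytes(buf, 0, n);
                }
                in.close();
                out.close();
            }
        } finally {
            zip.close();
        }
        return dir;
    }
}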
MakMak wrote:
> Hey Mike, thanks for the quick response. I tried passing a Directory to
> IndexReader.open() and there were no deadlocks!! I will get rid of the
> synchronizing on FSDirectory too.
Great!
> However, do you think it would be better to modify the docs for FSDirectory
> and remove the sync part …
Hey Mike, thanks for the quick response. I tried passing a Directory to
IndexReader.open() and there were no deadlocks!! I will get rid of the
synchronizing on FSDirectory too.
However, do you think it would be better to modify the docs for FSDirectory
and remove the sync part of "Directories are cached …
Why would it break back compat?
They just return void now, so
IndexReader.incRef();
should still compile/run.
But that's arguing about angels dancing on pins. My real issue
is that by not allowing *some* mechanism to get the refcount,
developers don't have any tools for figuring out that it's a refcount problem …
On Fri, Mar 6, 2009 at 5:01 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
>
> Shalin, did you ever get to the bottom of this?
>
No, I'll try to reproduce this and let you know tomorrow.
--
Regards,
Shalin Shekhar Mangar.
It's not safe for you to synchronize externally on the Directory
instance returned from FSDirectory.getDirectory -- that's what's leading
to the deadlock here, right?
It looks like you passed in a File or String to IndexReader.open? One
workaround (I think -- not tested) might be to pass a Directory
instance …
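A minimal sketch of that workaround, assuming the Lucene 2.4-era API; the index path and class name are illustrative, and the point is simply that the reader is opened from a Directory instance with no external synchronized block around it:

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class OpenByDirectory {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.getDirectory("/path/to/index");  // example path
        IndexReader reader = IndexReader.open(dir);  // no synchronized (dir) wrapper
        try {
            System.out.println("maxDoc=" + reader.maxDoc());
        } finally {
            reader.close();
        }
    }
}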
Hi,
I have the following:
Thread1 -
1. Acquires a lock on the object returned by FSDirectory.getDirectory (not
right, not needed, but it should not be harmful anyway).
2. Issues an IndexReader.reopen() to open the reader and search. This call
waits on acquiring a MultiSegmentReader lock.
Thread2 -
1. Issues …
Yes, ref counts are tricky, though these are expert APIs.
I think changing close, incRef, and decRef to return the refcount would be
good, though that breaks back compat.
How about exposing getRefCount() instead?
Mike
Erick Erickson wrote:
Hmmm, reference counting is always yucky. I looked
the IndexReader …
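A minimal sketch of how the reader sharing under discussion might look, using the incRef()/decRef() expert APIs mentioned above; the wrapper class is illustrative, and the proposed getRefCount() accessor is left as a commented-out debug aid since it may not exist in your Lucene version:

import java.io.IOException;
import org.apache.lucene.index.IndexReader;

public class SharedReader {
    private final IndexReader reader;

    public SharedReader(IndexReader reader) {
        this.reader = reader;
    }

    // A thread that wants to search grabs a reference first.
    public IndexReader acquire() {
        reader.incRef();
        return reader;
    }

    // ... and releases it when done; the reader is really closed only once
    // the internal count drops to zero.
    public void release() throws IOException {
        reader.decRef();
        // Debug aid discussed in the thread (proposed accessor, may not exist):
        // System.out.println("refCount=" + reader.getRefCount());
    }
}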
Hmmm, reference counting is always yucky. I looked
the IndexReader javadocs over and there isn't any help
there for managing refcounts. You can't find the current
refcount, close doesn't indicate the result, etc. Or I
missed, for the Nth time, perfectly obvious documentation.
What do people think …
OK, phew! Thanks for bringing closure.
Mike
rolaren...@earthlink.net wrote:
I did just now double/triple-check: the IndexWriter is definitely
closed.
However (cough), I did have a bogus call to IndexReader.incRef() ...
once I removed that, the call to IndexReader.close() actually worked
I did just now double/triple-check: the IndexWriter is definitely closed.
However (cough), I did have a bogus call to IndexReader.incRef() ... once I
removed that, the call to IndexReader.close() actually worked and then the
deletion did so too. Thanks; sorry to trouble you.
-Paul
Right, I should have included this data in my original message (sorry):
WinXP, Lucene 2.4.
I do have permissions and the files are definitely part of the index being
removed; nothing outside of (my code that uses) Lucene would have a handle on
these files.
-Paul
On Fri, Mar 6, 2009 at 5:43 AM, damu_verse wrote:
> We have also tried with Lucene 2.4.0 (JVM not changed), but the threads are
> still blocking. We are not able to find the root cause...
What is the *full* thread dump?
Some threads blocking is fine and normal - there isn't necessarily
anything wrong …
Hi all,
I'd like to do a fuzzy search with German special characters. For instance, I
want to query for "müller", but terms like "mueller" should also be matched,
as ü can also be written as ue. How could this be done? At index creation time,
I could convert ü to ue and just use the ue version …
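A minimal sketch of that index-time conversion idea: map the umlauts (and ß) to their two-letter forms before analysis, and apply the same mapping to query text so "müller" and "mueller" end up as the same term. The class name is illustrative, and this is a plain string mapping, not a Lucene analyzer:

public class GermanNormalizer {
    // Replace German special characters with their common ASCII spellings.
    public static String normalize(String text) {
        return text
                .replace("ä", "ae").replace("Ä", "Ae")
                .replace("ö", "oe").replace("Ö", "Oe")
                .replace("ü", "ue").replace("Ü", "Ue")
                .replace("ß", "ss");
    }
}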
Whatever you do will be wrong. What you're saying is
that you have structured data that the user wants to search
in an unstructured way, and you want to try to create a
system that intuits what the user meant. Good luck.
Can you back up a bit and talk about the problem you're
trying to solve? If …
See below.
On Fri, Mar 6, 2009 at 1:44 AM, Ganesh wrote:
> Hello all
>
> 1)
> Which is better to use: the Snowball analyzer or a Lucene contrib analyzer?
> Is there no inbuilt stop word list for the Snowball analyzer?
>
What is the "Lucene contrib analyzer"? There are 12 of them...
And regardless, the answer …
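On the stop word question, a minimal sketch of handing SnowballAnalyzer a stop word list yourself, assuming the contrib snowball jar is on the classpath and a Lucene 2.4-era API; reusing StandardAnalyzer's English list is just one convenient choice:

import org.apache.lucene.analysis.snowball.SnowballAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class AnalyzerChoice {
    // SnowballAnalyzer has no built-in stop words, so supply a list explicitly.
    public static SnowballAnalyzer englishSnowball() {
        return new SnowballAnalyzer("English", StandardAnalyzer.STOP_WORDS);
    }
}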
Prior to 2.4, the search runs first and then the filter. Ie, the search
does all the work to produce docIDs that match it, and then per docID
the filter is checked.
As of 2.4, they actually sort of play leap-frog, document by
document. First, we ask the filter for its first matching docID, …
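A minimal sketch of that leap-frog pattern in isolation, using a simplified iterator interface rather than Lucene's internal classes (all names here are illustrative): each side repeatedly jumps forward to the other side's current document until they agree.

import java.util.List;

public class LeapFrog {
    // Simplified stand-in for a sorted stream of docIDs.
    interface DocIdIterator {
        int NO_MORE = Integer.MAX_VALUE;
        int next();               // next docID in increasing order, or NO_MORE
        int advance(int target);  // next docID >= target, or NO_MORE
    }

    static void intersect(DocIdIterator query, DocIdIterator filter, List<Integer> out) {
        int q = query.next();
        int f = filter.next();
        while (q != DocIdIterator.NO_MORE && f != DocIdIterator.NO_MORE) {
            if (q == f) {                  // both sides match this doc: collect it
                out.add(q);
                q = query.next();
                f = filter.next();
            } else if (q < f) {
                q = query.advance(f);      // query is behind: leap it forward
            } else {
                f = filter.advance(q);     // filter is behind: leap it forward
            }
        }
    }
}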
I don't know how QueryParser works behind the scenes, but it looks like
this is at least known behaviour. From the QueryParser javadocs:
setLowercaseExpandedTerms
public void setLowercaseExpandedTerms(boolean lowercaseExpandedTerms)
Whether terms of wildcard, prefix, fuzzy and range queries are to be
automatically lower-cased or not …
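A minimal sketch of using that setting, with an illustrative field name and analyzer; turning it off keeps wildcard, prefix, fuzzy and range terms exactly as typed:

import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class ParserSetup {
    public static Query parse(String q) throws Exception {
        QueryParser parser = new QueryParser("content", new KeywordAnalyzer());
        parser.setLowercaseExpandedTerms(false);  // do not lower-case expanded terms
        return parser.parse(q);
    }
}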
Shalin, did you ever get to the bottom of this?
Mike
Michael McCandless wrote:
You mean on calling IndexWriter.close, with a deletion policy that's
functionally equivalent to KeepOnlyLastCommitDeletionPolicy, you
somehow see the last 2 commits remaining in the Directory once
IndexWriter …
You could look at the docID of each hit, and compare to the .maxDoc()
of each underlying reader.
MultiSearcher logically "concatenates" the docIDs.
However, docIDs are an internal identifier for Lucene, so it's always
possible that a new release of Lucene could change how docIDs are mapped by
MultiSearcher …
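A minimal sketch of the docID-to-subreader mapping described above, with the same caveat that docIDs are Lucene internals; the class and method names are illustrative:

import org.apache.lucene.index.IndexReader;

public class SubIndexOf {
    // Returns the position of the reader that owns a docID from a MultiSearcher
    // hit, by walking the concatenated maxDoc() ranges.
    public static int subIndex(int docId, IndexReader[] readers) {
        int base = 0;
        for (int i = 0; i < readers.length; i++) {
            int next = base + readers[i].maxDoc();
            if (docId < next) {
                return i;
            }
            base = next;
        }
        throw new IllegalArgumentException("docID out of range: " + docId);
    }
}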
This is an interesting challenge! Responses below...
Kieran Topping wrote:
Hello,
I would like to be able to instantiate a RAMDirectory from a
directory that an IndexWriter in another process might currently be
modifying.
Ideally, I would like to do this without any synchronizing or …
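For the simple single-process case, a minimal sketch of copying an on-disk index into a RAMDirectory (Lucene 2.4-era API assumed; the path and class name are illustrative). Note that this by itself does not solve the cross-process timing problem the question is really about:

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;

public class LoadIntoRam {
    public static RAMDirectory load(String path) throws Exception {
        Directory onDisk = FSDirectory.getDirectory(path);
        return new RAMDirectory(onDisk);  // copies every index file into memory
    }
}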
Hi Yonik,
We have also tried with Lucene 2.4.0 (JVM not changed), but the threads are
still blocking. We are not able to find the root cause...
thanks & regards
-damu
Yonik Seeley-2 wrote:
>
> Hmmm, if this is some sort of deadlock, we may need a thread dump of
> all of the threads.
> Do …
Another problem.
Using the PerFieldAnalyzerWrapper solves the case where I have a
simple query, such as the following:
Query query = parser.parse("X");
or
Query query = parser.parse("X OR Y");
but if I use a more complex query like the following:
Query query = parser.parse("[A TO
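A minimal sketch of the kind of PerFieldAnalyzerWrapper setup being described, with illustrative field names and analyzers (the original message does not show its own configuration):

import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class PerFieldExample {
    public static Query parse(String q) throws Exception {
        // StandardAnalyzer for most fields, KeywordAnalyzer for the "id" field.
        PerFieldAnalyzerWrapper wrapper =
                new PerFieldAnalyzerWrapper(new StandardAnalyzer());
        wrapper.addAnalyzer("id", new KeywordAnalyzer());
        QueryParser parser = new QueryParser("content", wrapper);
        return parser.parse(q);
    }
}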
You could have a single index with all the names tagged at the time of
indexing. For the query parsing, you could have a lookup
for common word endings that identify business names (like Corp, Inc,
LLC, Ltd, etc.) and common words (like road, avenue,
street, lane, etc.) for addresses, and separate …
If truly the IndexWriter & all IndexReaders are closed, then they
should no longer be holding open files. Maybe triple check that
you've indeed closed everything.
It's remotely possible that some other process (virus checker, source
control clients, etc) has the file open.
You could try …
Corrected the second line. Please refer to the Javadocs for more help.
document.add(new Field("path", textFiles[i].getPath(), Field.Store.NO,
Field.Index.ANALYZED));
Regards
Ganesh
----- Original Message -----
From: "nitin gopi"
Sent: Friday, March 06, 2009 2:27 PM
Subject: Re: error in code
Can you not make one index with all three types of name and just
search that? Sounds much easier. You might get a few funnies like
business Kingston on McDonald's street, but they'd be the exception.
--
Ian.
On Fri, Mar 6, 2009 at 6:25 AM, Srinivas Bharghav wrote:
> I am trying to evaluate as …
What OS are you running? What version of Lucene? Are you sure that
you have privileges to delete the files that it is failing on? That
they are part of the index you are trying to remove? That something
else doesn't have the files open?
It seems likely that you are on Windows and that something …
I am out of the office until 2009-03-07.
I will check emails at night. For anything urgent, you can call my cell
phone (86) 131 6290 0375.
Note: This is an automated response to your message Re: error in code sent
on 6/3/09 13:31:02.
This is the only notification you will receive while this …
hi Ganesh,
the program still gives an error on the second line. It says *cannot
find symbol*. I think we are initializing the Field class object
two times, and that is why we are getting the error.
document.add(new Field("content", textReader));
document.add(new Field("path", textFiles[i].getP
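For reference, a self-contained sketch of the two adds being discussed, combining the poster's Reader-based content field with the corrected path line from Ganesh's reply; Lucene 2.4-era Field API assumed, and textReader / path are stand-ins for the poster's own variables:

import java.io.Reader;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class AddFields {
    // Adds the two fields from the thread to the given Document.
    static void addFields(Document document, Reader textReader, String path) {
        document.add(new Field("content", textReader));   // tokenized from a Reader, not stored
        document.add(new Field("path", path,
                Field.Store.NO, Field.Index.ANALYZED));   // Ganesh's corrected line
    }
}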