Case?
Hmm. I thought it was case insensitive.
I will re-index and revert all to lower case and see. The language is stored
as "English" not "english"
ian
P.S. Here's what I built with a basic understanding of Lucene.
http://BahaiResearch.com; it's open source and ad-free. It allows people in 20
langu
Hm. Let's see the queries, and query.toString() may give
you some clues. I *suspect* that you really didn't index language.
Did you, perhaps, not re-index all your docs? Or use an analyzer
that didn't fold case when indexing but did when searching (or
vice-versa)?
It's *possible* that you've
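A minimal sketch of the analyzer mismatch Erick describes, assuming the Lucene 2.4-era API (the field name and values are taken from the thread, the analyzer pairing is an illustration): StandardAnalyzer lowercases at index time, so a query built with a non-folding analyzer never finds the term.

```java
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.RAMDirectory;

public class CaseFoldingMismatch {
    public static void main(String[] args) throws Exception {
        RAMDirectory dir = new RAMDirectory();

        // Index with StandardAnalyzer, which lowercases: "English" -> "english".
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        Document doc = new Document();
        doc.add(new Field("language", "English", Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);
        writer.close();

        // Search with WhitespaceAnalyzer, which keeps case: the query term
        // stays "English" and never matches the indexed term "english".
        Query q = new QueryParser("language", new WhitespaceAnalyzer()).parse("English");
        System.out.println(q.toString());   // shows the term actually searched

        IndexSearcher searcher = new IndexSearcher(dir);
        System.out.println(searcher.search(q, 10).totalHits);  // no hits here
        searcher.close();
    }
}
```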
No, not sloth. Making use of the fine work that others have done in
order to help get your product out the door faster/cheaper.
As in "There's no virtue in re-inventing the wheel. No matter how
productive it feels" .
Best
Erick
On Fri, Sep 4, 2009 at 12:19 PM, Shai Erera wrote:
> Thanks Mi
OK, that makes sense. Note that Shashi's solution and mine are essentially
identical.
Shashi's code snippet, I think, pre-supposes that source and dest index
directories are
different. So copying your A/D index off some place else is functionally
equivalent.
I'm not sure what would happen if you h
I have created an index and each document has a contents field and a
language field.
contents has the flags: Indexed Tokenized Stored Vector
language has the flags: Indexed Stored
In Luke I can search contents fine, but when I try to search the language
field, I never get results.
Every doc
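Since the language field's flags show Indexed but not Tokenized, the whole value is indexed as a single term with its original case, e.g. "English". A sketch of why one lookup works and the other doesn't (Lucene 2.4-era API; the index path is a placeholder):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.FSDirectory;

public class ExactCaseSearch {
    public static void main(String[] args) throws Exception {
        IndexSearcher searcher = new IndexSearcher(FSDirectory.getDirectory("index"));

        // A TermQuery bypasses the analyzer entirely, so the term must match
        // the indexed value exactly -- "English", capital E and all.
        Query exact = new TermQuery(new Term("language", "English"));
        System.out.println(searcher.search(exact, 10).totalHits);

        // A QueryParser with a lowercasing analyzer (Luke's usual default)
        // produces the term "english", which does not exist in an
        // un-tokenized field, hence zero results.
        Query folded = new TermQuery(new Term("language", "english"));
        System.out.println(searcher.search(folded, 10).totalHits);

        searcher.close();
    }
}
```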
Paul, no problem.
it is not fully functional right now (incomplete, bugs, etc). patch is
kinda for reading only :)
but if you have other similar issues on your project, feel free to
post links to them on that jira ticket.
this way we can look at what problems you have and if appropriate
maybe they
Robert Muir wrote:
Paul, thanks for the examples. In my opinion, only one of these is a
tokenizer problem :)
none of these will be affected by a unicode upgrade.
Thanks for taking the time to write that response, it will take me a bit
of time to understand all this because I've never used Lucene
Paul, thanks for the examples. In my opinion, only one of these is a
tokenizer problem :)
none of these will be affected by a unicode upgrade.
> Things like:
>
> http://bugs.musicbrainz.org/ticket/1006
in this case, it appears you want to do script conversion, and it
appears from the ticket you a
Robert Muir wrote:
On Fri, Sep 4, 2009 at 11:18 AM, Paul Taylor wrote:
I submitted this https://issues.apache.org/jira/browse/LUCENE-1787 patch to
StandardTokenizerImpl; understandably it hasn't been incorporated into
Lucene (yet) but I need it for the project I'm working on. So would you
reco
Thanks Mike. I did not phrase my understanding of the Cache reload well. I
didn't mean literally as part of the reopen, but *because* of the reopen.
Because FieldCache is tied to an IndexReader instance, after reopen it gets
refreshed. If I keep my own Cache, I'll need to code that logic, and I
prefer
On Fri, Sep 4, 2009 at 11:18 AM, Paul Taylor wrote:
> I submitted this https://issues.apache.org/jira/browse/LUCENE-1787 patch to
> StandardTokenizerImpl; understandably it hasn't been incorporated into
> Lucene (yet) but I need it for the project I'm working on. So would you
> recommend keeping the
Hello,
Many thanks for the sample. I've already written a proof of concept with it.
Cheers,
Francisco
On Sep 4, 2009 3:53 PM, "Shashi Kant" wrote:
Here is some code to help you along. This should leave the source
indices intact and merges them into a destination.
//the index t
I submitted this https://issues.apache.org/jira/browse/LUCENE-1787 patch
to StandardTokenizerImpl; understandably it hasn't been incorporated
into Lucene (yet) but I need it for the project I'm working on. So would
you recommend keeping the same class name, and just putting in the
classpath befo
Hello Erick,
On Fri, Sep 4, 2009 at 3:26 PM, Erick Erickson wrote:
> Sure, copy them first to some other directory
> We might have something more helpful if you'd tell us *why* you want to do
> this? What problem are you trying to solve? Because having two copies of
> your index in the same d
Here is some code to help you along. This should leave the source
indices intact and merges them into a destination.
//the index to hold our merged index
IndexWriter iw = new IndexWriter(dest, new StandardAnalyzer(), true);
String[] sourceIndices;
Sure, copy them first to some other directory
We might have something more helpful if you'd tell us *why* you want to do
this? What problem are you trying to solve? Because having two copies of
your index in the same directory doesn't sound very safe.
Best
Erick
On Fri, Sep 4, 2009 at 5:53 A
Hi,
Apologies for resending this email, but I was just wondering if I could get some
input on the below. I am in the final stages of putting a proof of concept
together, and this is the final piece of the puzzle.
Sorry again for sending this!
Cheers
Amin
On Fri, Sep 4, 2009 at 10:38 AM, Amin Mohammed-
I am closing the readers when not in use.
I tried testing by explicitly not closing the reader and found that the file is
not actually deleted and remains on disk. I remember reading this
information. In my case, readers and searchers are closed and the files no
longer exist on disk but /p
Hi Grant
> Seems like something is closing your InputStreamReader out from under you.
> Is there a concurrency issue, perhaps?
I'm coming to the same conclusion - there must be more than one thread
accessing this index at the same time. Better go figure it out ... :-)
Thanks again
- Chris
Chris Bam
Hello everyone,
As I understood it, merging indexes will lead to the deletion of the
original indexes.
Is there a way to merge indexes while keeping the original indexes intact?
Kind regards,
--
Francisco
Yes, you are doing the reopen right, but my question was: in your code "//
reOpen", which is not visible, do you close the old reader after reopening? If
you do not, it stays open forever. This is what I suggested in my
example.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://w
I'm doing it the following way.
if (!reader.isCurrent()) {
//reOpen
}
I tried debugging and my log shows the correct reference count. Any other ideas?
Regards
Ganesh
- Original Message -
From: "Uwe Schindler"
To:
Sent: Friday, September 04, 2009 1:12 PM
Subject: RE: too many file descript
Hi
I include a testcase to show what I am trying to do. Testcase number 3
fails.
Thanks
Amin
On Fri, Sep 4, 2009 at 10:17 AM, Amin Mohammed-Coleman wrote:
> Hi,
>
> I am looking at applying a security filter for our lucene document and I
> was wondering if I could get feedback on whether the so
On Fri, Sep 4, 2009 at 12:33 AM, Shai Erera wrote:
> 1) Refactor the FieldCache API (and TopFieldCollector) such that one can
> provide its own Cache of native values. I'd hate to rewrite the
> FieldComparators logic just because the current API is not extendable. That
> I agree should be quite st
Hi,
I am looking at applying a security filter for our Lucene documents and I was
wondering if I could get feedback on the solution I have come up with.
First I will explain the scenario, followed by the proposed
solution:
We have a concept of a Layer which is a project whereby a br
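One common shape for such a security filter (a sketch under assumptions, not Amin's actual solution: the field name "layer" and per-document indexing of it are illustrative):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.CachingWrapperFilter;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.QueryWrapperFilter;
import org.apache.lucene.search.TermQuery;

public class SecurityFilters {
    // Build a filter that only lets through documents whose "layer" field
    // matches one of the layers this user is permitted to see.
    public static Filter forLayers(String[] permittedLayers) {
        BooleanQuery allowed = new BooleanQuery();
        for (int i = 0; i < permittedLayers.length; i++) {
            allowed.add(new TermQuery(new Term("layer", permittedLayers[i])),
                        BooleanClause.Occur.SHOULD);
        }
        // CachingWrapperFilter caches the bit set per reader, so the
        // permission check costs almost nothing on repeated searches.
        return new CachingWrapperFilter(new QueryWrapperFilter(allowed));
    }
}
```

It would be applied alongside the user's query, e.g. `searcher.search(userQuery, SecurityFilters.forLayers(layers), 10)`.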
On Thu, Sep 03, 2009 at 03:07:18PM +0200, Jukka Zitting wrote:
> Hi,
>
> On Wed, Sep 2, 2009 at 2:40 PM, David Causse wrote:
> > If I use tika for parsing HTML code and inject parsed String to a lucene
> > analyzer. What about the offset information for KWIC and return to text
> > (like the google
Hey there, I am iterating over a DocSet and for every id I need to get the
value of a field which is analyzed with KeywordAnalyzer and is not stored.
I have noticed two ways of doing it using FieldCache. Can someone please
explain to me the pros and cons of using one or the other?
Using StringIndex:
One general trap with reopen(): Reopen() returns a *new* IndexReader. If
this new IndexReader is different from the actual one, you have to close the
old reader when you are finished working on it. If you only have one thread
working on this IndexReader that is reopened, you can close the old reade
I am having only one process using the Lucene DB. The same process writes and
reads. I do re-open the IndexReader. I am maintaining a ref count for each
reader/searcher and closing it if it is not used. I am not able to understand
why the file descriptor is showing as (deleted)?
I'm guessing some issues? Co
>>It removes the duplicates at query time and not in the results.
Not sure I understand that statement. Do you mean you want index-time
rejection of potentially duplicate inserts?
On 4 Sep 2009, at 07:01, Ganesh wrote:
It removes the duplicates at query time and not in the results.
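If the goal is index-time rejection of duplicates (an assumption; the thread doesn't confirm it), the usual Lucene pattern is to key each document on a unique term and use updateDocument, which deletes any existing document carrying that term before adding the new one:

```java
// 'writer' is an open IndexWriter; "id" is an assumed un-tokenized
// unique-key field present on every document.
writer.updateDocument(new Term("id", uniqueId), doc);
```

With plain addDocument, inserting the same document twice leaves two copies in the index and dedup then has to happen at query time instead.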