Re: How I can find wildcard symbol with WildcardQuery?

2008-08-19 Thread Daniel Noll

I wrote:

What if you need to match a literal wildcard *and* an actual wildcard? :-)


Actually this was a rhetorical question, but there is at least one 
answer: use a regex query instead.  Regexes do support escaping the 
special symbols, so this problem doesn't exist for those.
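
For reference, a minimal sketch of that alternative, assuming the contrib
regex package (org.apache.lucene.search.regex.RegexQuery) is available and
that the whole phrase was indexed as a single term:

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.regex.RegexQuery;

    // '\\*' escapes the star in the regex, so it matches the literal term " Hello w*orld"
    Query query = new RegexQuery(new Term("field", " Hello w\\*orld"));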


Daniel


--
Daniel Noll

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: How I can find wildcard symbol with WildcardQuery?

2008-08-19 Thread Daniel Noll

Kwon, Ohsang wrote:

Why do you use WildcardQuery? You do not need a wildcard (maybe..).
Use a term query.


What if you need to match a literal wildcard *and* an actual wildcard? :-)

Daniel


--
Daniel Noll

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



RE: How I can find wildcard symbol with WildcardQuery?

2008-08-19 Thread Kwon, Ohsang
Why do you use WildcardQuery? You do not need a wildcard (maybe..).
Use a term query.

Term term = new Term("field", "Hello w*orld");
Query query1 = new TermQuery(term);

gimme post
-Original Message-
From: Сергій Карпенко [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, August 19, 2008 10:20 PM
To: java-user@lucene.apache.org
Subject: How I can find wildcard symbol with WildcardQuery?


Hello

For example, we have a text:

" Hello w*orld"
 it's indexed as NO_NORMS, so this phrase is term.

And I have a code:

Query query = new WildcardQuery(new Term("field", " Hello w*orld")); it
works

But I need the symbol '*' as an ordinary symbol, not an escape symbol.

The QueryParser's analogue '\\*'
Query query = new WildcardQuery(new Term("field", " Hello w\\*orld"));
doesn't work.

Thanks



  


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Are there any Lucene optimizations applicable to SSD?

2008-08-19 Thread Cedric Ho
Hi eks,

My index is fully optimized, but I wasn't aware that I can sort it by
fields in Lucene. Could you elaborate on how to do that?

By omitTf(), do you mean Fieldable.setOmitNorms(true)? I'll try that.

Thanks,
Cedric Ho


>
> If you have the possibility to sort your index once in a while on something like 
> DateRange, you will be surprised how well the OS file cache utilizes locality of 
> reference... we had dramatic (ca. 30%) improvements just by having the index 
> sorted once a week on the most-used fields... It depends on the nature of your 
> collection and is not always possible, but when it is possible, it does the job. 
> If this is also only used as a boolean condition to select a range of documents, 
> not affecting the score (I guess not), give omitTf() a try; your index will be 
> smaller as well
>
>
> Send instant messages to your online friends http://uk.messenger.yahoo.com
>
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: How I can find wildcard symbol with WildcardQuery?

2008-08-19 Thread Daniel Noll

 Сергій Карпенко wrote:

Yes, you are correct - NO_NORMS has nothing to do with tokenization;
that means no analyzers are used.


Just to avoid this ambiguous, semi-contradicting wording confusing the 
hell out of anyone...


NO_NORMS *does* have something to do with tokenisation -- it implies 
UN_TOKENIZED.


Source code QFT:

} else if (index == Index.NO_NORMS) {
  this.isIndexed = true;
  this.isTokenized = false;
  this.omitNorms = true;
} ...

Daniel

--
Daniel Noll

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Are there any Lucene optimizations applicable to SSD?

2008-08-19 Thread eks dev
hi Cedric, 
has nothing to do with SSD... but 


> 
> All queries involve a Date Range Filter and a Publication Filter.
> We've used WrappingCachingFilters for the Publication Filter because there
> are only a limited number of combinations for this filter. For the
> Date Range Filter we just let it run every time, which seems to be
> doing fine.


If you have the possibility to sort your index once in a while on something like 
DateRange, you will be surprised how well the OS file cache utilizes locality of 
reference... we had dramatic (ca. 30%) improvements just by having the index sorted 
once a week on the most-used fields... It depends on the nature of your collection and 
is not always possible, but when it is possible, it does the job. If this is also only 
used as a boolean condition to select a range of documents, not affecting the score 
(I guess not), give omitTf() a try; your index will be smaller as well  


Send instant messages to your online friends http://uk.messenger.yahoo.com

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: search for special condition.

2008-08-19 Thread Mr Shore
Thank you :)

2008/8/18 장용석 <[EMAIL PROTECTED]>

> Hi.
> Yes, that method is in lucene.
> I'm sorry that I misunderstood your words.
> I hope that you will find the way you want.
>
> bye.:)
>
>
> 2008/8/16, Mr Shore <[EMAIL PROTECTED]>:
> >
> > Thanks, Jang,
> > but I didn't find the method isTokenChar.
> > Maybe it's in Lucene, right?
> > But I'm using Nutch this time.
> > Thank you all the same :)
> >
> > 2008/8/14 장용석 <[EMAIL PROTECTED]>
> >
> > > Hi. I was very happy that you love the Korean language a lot :)
> > > So do you want to search for special characters?
> > >
> > > If you want to include special characters when indexing, you can override
> > > a method in the CharTokenizer class used by your analyzer.
> > > The method's name is isTokenChar(char c).
> > >
> > > protected boolean isTokenChar(char c) {
> > >return Character.isLetter(c);
> > > }
> > >
> > > As you see, that method returns true when the character c is a
> > > letter ^^
> > >
> > > If you change that method to "return Character.isLetter(c) || c == '.';"
> > > then you will get result tokens that keep special characters like '.'.
> > >
> > > thanks. :)
> > >
> > > Jang.
> > >
> > > 2008/8/14, Mr Shore <[EMAIL PROTECTED]>:
> > > >
> > > > Can Nutch or Lucene support searching for special characters like '.'?
> > > > When I search ".net", many results come back for "net";
> > > > I want to exclude them.
> > > > PS: I love the Korean language a lot.
> > > >
> > > > 2008/8/13 장용석 <[EMAIL PROTECTED]>
> > > >
> > > > > Hi. Thank you for your response.
> > > > >
> > > > > I found the way with your help.
> > > > >
> > > > > There are classes named ConstantScoreRangeQuery and
> NumberTools.
> > > > >
> > > > > Reference site is here.
> > > > >
> > > > >
> > > >
> > >
> >
> http://markmail.org/message/dcirmifoat6uqf7y#query:org.apache.lucene.document.NumberTools+page:1+mid:tld3uekaylmu2cwt+state:results
> > > > >
> > > > >
> > > > > Thanks very much. :)
> > > > >
> > > > >
> > > > >
> > > > > 2008/8/13, Otis Gospodnetic <[EMAIL PROTECTED]>:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Lucene doesn't have the greater than operator.  Perhaps you can
> use
> > > > range
> > > > > > queries to accomplish the same thing.
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://lucene.apache.org/java/2_3_2/queryparsersyntax.html#Range%20Searches
> > > > > >
> > > > > > Otis
> > > > > > --
> > > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > > >
> > > > > >
> > > > > >
> > > > > > - Original Message 
> > > > > > > From: 장용석 <[EMAIL PROTECTED]>
> > > > > > > To: java-user@lucene.apache.org
> > > > > > > Sent: Tuesday, August 12, 2008 6:01:00 AM
> > > > > > > Subject: search for special condition.
> > > > > > >
> > > > > > > hi.
> > > > > > >
> > > > > > > I am searching for lucene api or function like query "FIELD >
> > 1000"
> > > > > > >
> > > > > > > For example, a user wants to search a product which price is
> > bigger
> > > > > than
> > > > > > > user's input.
> > > > > > > If user's input is 1 then result are the products in index
> > just
> > > > > like
> > > > > > > "PRICE > 1"
> > > > > > >
> > > > > > > Is there any way to search like that?
> > > > > > >
> > > > > > > thanks.
> > > > > > > Jang.
> > > > > > > --
> > > > > > > DEV용식
> > > > > > > http://devyongsik.tistory.com
> > > > > >
> > > > > >
> > > > > >
> > -
> > > > > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > > > > For additional commands, e-mail:
> [EMAIL PROTECTED]
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > DEV용식
> > > > > http://devyongsik.tistory.com
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > DEV용식
> > > http://devyongsik.tistory.com
> > >
> >
>
>
>
> --
> DEV용식
> http://devyongsik.tistory.com
>
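
For reference, a minimal sketch of the range-query approach mentioned above
(ConstantScoreRangeQuery plus NumberTools); the field name and bound are
illustrative, and it assumes the PRICE field was indexed with
NumberTools.longToString():

    import org.apache.lucene.document.NumberTools;
    import org.apache.lucene.search.ConstantScoreRangeQuery;
    import org.apache.lucene.search.Query;

    // "PRICE > 1000": lower bound excluded, null means no upper bound
    String lower = NumberTools.longToString(1000L);
    Query priceGreaterThan =
        new ConstantScoreRangeQuery("PRICE", lower, null, false, false);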


RE: Case Sensitivity

2008-08-19 Thread Steven A Rowe
Hi Dino,

I think you'd benefit from reading some FAQ answers, like:

"Why is it important to use the same analyzer type during indexing and search?"


Also, have a look at the AnalysisParalysis wiki page for some hints:


On 08/19/2008 at 8:57 AM, Dino Korah wrote:
> From the discussion here what I could understand was, if I am using
> StandardAnalyzer on TOKENIZED fields, for both Indexing and Querying,
> I shouldn't have any problems with cases.

If by "shouldn't have problems with cases" you mean "can match 
case-insensitively", then this is true.

> But if I have any UN_TOKENIZED fields there will be problems if I do
> not case-normalize them myself before adding them as a field to the
> document.

Again, assuming that by "case-normalize" you mean "downcase", and that you want 
case-insensitive matching, and that you use the StandardAnalyzer (or some other 
downcasing analyzer) at query-time, then this is true.

> In my case I have a mixed scenario. I am indexing emails and the email
> addresses are indexed UN_TOKENIZED. I do have a second set of custom
> tokenized field, which keep the tokens in individual fields
> with same name.
[...]
> Does it mean that wherever I use UN_TOKENIZED, they do not get through
> the StandardAnalyzer before getting indexed, but they do when they are
> searched on?

This is true.

> If that is the case, do I need to normalise them before adding to the
> document?

If you want case-insensitive matching, then yes, you do need to normalize them 
before adding them to the document.
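
For example, a minimal sketch of that normalization (the field name and
address are only illustrative):

    // downcase yourself before adding an UN_TOKENIZED field
    String address = "John.Smith@example.com";
    doc.add(new Field("from", address.toLowerCase(),
                      Field.Store.YES, Field.Index.UN_TOKENIZED));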

> I also would like to know if it is better to employ an EmailAnalyzer
> that makes a TokenStream out of the given email address, rather
> than using a simplistic function that gives me a list of string pieces
> and adding them one by one. With searches, would both the approaches
> give same result?

Yes, both approaches give the same result.  When you add string pieces 
one-by-one, you are adding multiple same-named fields. By contrast, the 
EmailAnalyzer approach would add a single field, and would allow you to control 
positions (via Token.setPositionIncrement()), e.g. to improve phrase handling.  
Also, if you make up an EmailAnalyzer, you can use it to search against your 
tokenized email field, along with other analyzer(s) on other field(s), using 
the PerFieldAnalyzerWrapper.
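
For illustration, a minimal sketch of that wiring (EmailAnalyzer here is a
hypothetical class of your own):

    import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;

    // StandardAnalyzer for most fields, EmailAnalyzer for the tokenized email field
    PerFieldAnalyzerWrapper analyzer =
        new PerFieldAnalyzerWrapper(new StandardAnalyzer());
    analyzer.addAnalyzer("From-tokenized", new EmailAnalyzer());
    // pass this analyzer to both the IndexWriter and the QueryParser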

Steve

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Are there any Lucene optimizations applicable to SSD?

2008-08-19 Thread Cedric Ho
Hi,

Thanks for the reply =)

>
> What aspect of performance do you find lacking? Is it searching or
> indexing? While we've had stellar results for searches, indexing is just
> so-so better than conventional harddisks.

Search response time. We used the search log from our production
system and tested it with the SSD. The results show that 75% of queries
return within 1 second, 90% return within 2.5 seconds, and the remaining 10%
range from 2.5 seconds to less than 100 seconds.

Total number of queries is ~4, so about 1 queries are kind of
slow, 1000 queries are very slow. But those 10% very slow queries are
not from the first 1000 queries. It's more or less evenly distributed.

> As for optimizing towards SSDs, we've found that the CPU is the
> bottleneck for us: The performance keeps climbing markedly for 1-5
> threads on a 4 core system with a single 64GB SSD, nearly identical to
> the same system with a RAID 0 of 4 * 64GB SSD.

I'd guess our CPU is fine because our test is probably different from
yours. We take one day's search log and replay the exact search
queries against the index at the exact times they happened in the search log.
So most of the time the CPU is idle, except maybe for the peak hours.
(I'll remember to take a look at the CPU utilization during the test
in peak hours tomorrow.)

Your test keeps running queries as fast as it can. And since
your queries can return so quickly, I'd guess that's probably why your
CPU gets hot =)


>
> Which SSD did you choose?

It's a single OCZ 64G SSD. We just got it yesterday. Is there a big
difference between different SSDs?


>
> Could you give some more information on the searches? What is a typical
> query, what do you do with the result (e.g. iterate through Hits,
> extracting fields)?

Our search queries are quite complicated sometimes.

All queries involve a Date Range Filter and a Publication Filter.
We've used WrappingCachingFilters for the Publication Filter because there
are only a limited number of combinations for this filter. For the
Date Range Filter we just let it run every time, which seems to be
doing fine.

The queries also range from simple TermQueries to PhraseQueries to
nested SpanQueries. More than 10 search terms is not uncommon.

Sorting by date or publication is the norm, sometimes also sort by score.

There are 3 returned fields, docId, date and publication, all of which
we retrieve through fieldCaches.

And we use this method to do the search:
TopFieldDocs Searcher.search(Query query, Filter filter, int n, Sort sort)
where for the test run n=100
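
A rough sketch of that combination, with purely illustrative field and filter
names (the Publication Filter is wrapped in a CachingWrapperFilter so its bit
set is only computed once per reader):

    Filter pubFilter = new CachingWrapperFilter(
        new QueryWrapperFilter(new TermQuery(new Term("publication", "XYZ"))));
    Sort sort = new Sort(new SortField("date", SortField.STRING, true));
    TopFieldDocs top = searcher.search(query, pubFilter, 100, sort);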


We are targeting to get >90% of queries to return under 1 sec. Of
course the more the better =)


Thanks,
Cedric Ho

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Updating tag-indexes

2008-08-19 Thread Ivan Vasilev

Thanks Michael and Erick,

Yes, we have our own unique IDs for the docs - we use them for other 
purposes, but here we need to deal with Lucene IDs as it is important 
for ParallelReader.
OK, then I will implement the algorithm for updating the small tag-index 
as I intended, and since it is not guaranteed by the API I will take care 
to check whether docIDs remain sequentially assigned in each new Lucene 
release.
I will also take care of deletions - before starting the update process 
for the small index I will keep info about the deleted docs, then 
undelete them. After that I will start the update algorithm. If there are 
deletions when it finishes, this will mean that some exception occurred. 
The same holds if the number of docs is less than the initial number 
(then some collapse of deletions occurred) - I will have to restart the 
process. If everything is OK, I will then delete again the docs I kept 
info about in the beginning.


Thanks Once Again,
Ivan



Michael McCandless wrote:


Yes, docIDs are currently sequentially assigned, starting with 0.

BUT: on hitting an exception (say in your analyzer) it will usually 
use up a docID (and then immediately mark it as deleted).


Also, this behavior isn't "promised" in the API, ie it could in theory 
(though I think it unlikely) change in a future release of Lucene.


And remember when a merge completes (or, optimize), any deleted docs 
will "collapse down" all docIDs after them.


Mike

Ivan Vasilev wrote:


Hi Lucene Guys,

I have a question that is simple but is important for me. I did not 
find the answer in the javadoc so I am asking here.
When adding Document-s by the method IndexWriter.addDocument(doc) 
do the documents obtain Lucene IDs in the order that they are added 
to the IndexWriter? I mean, will the first added doc have Lucene ID 0, 
the second Lucene ID 1, etc?


Below I describe why I am asking this.
We plan to split our index to two separate indexes that will be read 
by ParallelReader class. This is so because the one of them will 
contain field(s) that will be indexed and stored and it will be 
frequently changed. So to have always correct data returned from the 
ParallelReader when changing documents in the small index the Lucene 
IDs of these docs have to remain the same.
To do this Karl Wettin suggests a solution described in *LUCENE-879 
<https://issues.apache.org/jira/browse/LUCENE-879>*. I do not like 
this solution because it is connected to changing Lucene source code, 
and after each refactoring potentially I will have problems. The 
solution is related to optimizing index so it will not be reasonably 
faster than the one that I prefer. And it is:
1. Read the whole index and reconstruct the documents including index 
data by using TermDocs and TermEnum classes;

2. Change the needed documents;
3. Index documents in new index that will replace the initial one.
I can even simplify this algorithm (and the speed) if all the fields 
will always be stored - I can read just the stored data and, based on 
this, reconstruct the content of the docs and re-index them in the new index.


But anyway everything in my approaches will depend on this - are 
Lucene IDs in the index ordered in the same way as docs are added to 
the IndexWriter?


Thanks in Advance,
Ivan

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]








-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re[2]: How I can find wildcard symbol with WildcardQuery?

2008-08-19 Thread Сергій Карпенко
Yes, you are correct - NO_NORMS has nothing to do with tokenization; that means 
no analyzers are used. The string falls into the index as a single term.
But, what about our wildcard symbols?

Re: How I can find wildcard symbol with WildcardQuery?

Before going down this path I'd really recommend you get a copy of Luke
and look at your index. Depending upon the analyzer you're using, you
may or may not have w*orld indexed. You may have the tokens:
w
orld

with the * dropped completely.

As far as I know, NO_NORMS has nothing to do with tokenization, the
critical question is what *analyzer* you're using to index.

And you could always sidestep the issue entirely by pre-processing
your text and query to replace * with something else.

But for escaping, see:
http://lucene.apache.org/java/2_3_2/queryparsersyntax.html

Best
Erick

2008/8/19 Сергій Карпенко <[EMAIL PROTECTED]>

>
> Hello
>
> For example, we have a text:
>
> " Hello w*orld"
> it's indexed as NO_NORMS, so this phrase is term.
>
> And I have a code:
>
> Query query = new WildcardQuery(new Term("field", " Hello w*orld")); it
> works
>
> But I need the symbol '*' as an ordinary symbol, not an escape symbol.
>
> The QueryParser's analogue '\\*'
> Query query = new WildcardQuery(new Term("field", " Hello w\\*orld"));
> doesn't work.
>
> Thanks
>
>
>
>


  

Re: Multiple index performance

2008-08-19 Thread Cyndy

Thanks Anthony,

I understand your comment, and I think it makes sense. The only thing is
that I need to guarantee privacy to the users: if I am able to read the
indexes (when they are not encrypted), then I can pretty much know what a
user says in their documents. That is why I was thinking of encrypting the
whole directory of text files as well as the index files, so that the user,
by giving his password, can decrypt all the files and then Lucene can do
its job. In that sense I will have to open/close the indexes on demand. And
so my concern was: if I have 1000 indexes open at a given moment, would that
hit performance?

Thanks again for your answer.





Antony Bowesman wrote:
> 
> [EMAIL PROTECTED] wrote:
>> Thanks Anthony for your response, I did not know about that field.
> 
> You make your own fields in Lucene, it is not something Lucene gives you.
> 
> 
>> But still I have a problem and it is about privacy. The users are
>> concerned
>> about privacy and so, we thought we could have all their files in a
>> folder
>> and encrypt the whole folder and index with a user key, so then when user
>> logs in, decrypt the folder with the key and so Lucene can reach the
>> documents, so that is why I am concerned about efficiency, since I do not
>> know if Lucene could handle the 10,000 indexes.
> 
> 
> It seems like you may be confusing what Lucene will give you.  The
> original file 
> content and the Lucene indexes are two different things.  It sounds like
> you 
> want to protect access to the original content on some shared storage, but
> that 
> is not related to the searching provided by your Lucene app, or maybe I 
> misunderstood your use case.
> 
> Antony
> 
> 
> 
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Multiple-index-performance-tp19043404p19052392.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



RE: windows file system cache

2008-08-19 Thread Robert Stewart
Thank you for the help.  It seems that just changing "memory usage" setting to 
"programs" from default of "system cache" fixed the issue.  Now it takes only 
about 4 GB of system cache instead of 26 GB, and search performance is back to 
normal (fast).

-Original Message-
From: Mark Miller [mailto:[EMAIL PROTECTED]
Sent: Monday, August 18, 2008 10:03 AM
To: java-user@lucene.apache.org
Subject: Re: windows file system cache

Mark Miller wrote:
> Mark Miller wrote:
>> Robert Stewart wrote:
>>> Anyone else run on Windows?  We have index around 26 GB in size.
>>> Seems file system cache ends up taking up nearly all available RAM
>>> (26 GB out of 32 GB on 64-bit box).  Lucene process is around 5 GB,
>>> so very little left over for queries, etc, and box starts swapping
>>> during searches.  I think changing file system cache size setting is
>>> needed.  Anyone else have same issue?
>>>
>>>
>> Hmmm...get more ram :)
>>
>> Windows 64-bit upped the default file system cache size from 1 gig to
>> 1 terabyte. You're feeling the awesome effects of that upgrade, I think.
>>
>> There is an API call ( SetSystemFilecache() ) to override this - so
>> perhaps code up a C app to set it before running your Lucene app?
>>
>> - Mark
> You may actually be able to do it from the registry as well:
> http://support.microsoft.com/kb/892589 (don't use windows anymore so
> haven't confirmed)
>
> Info showing the change from 1 gig to 1 terabyte:
> http://support.microsoft.com/kb/294418
>
> You just want to set it to a certain percentage of what you got -
> leaving enough to do whatever your lucene app needs to do.
Found a great page about the problem using Domino:
http://www-1.ibm.com/support/docview.wss?uid=swg21270452

They appear to have compiled all the little bits of info that I have
seen elsewhere, and describe the problem being fixed just as you did.

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: How I can find wildcard symbol with WildcardQuery?

2008-08-19 Thread Erick Erickson
Before going down this path I'd really recommend you get a copy of Luke
and look at your index. Depending upon the analyzer you're using, you
may or may not have w*orld indexed. You may have the tokens:
w
orld

with the * dropped completely.

As far as I know, NO_NORMS has nothing to do with tokenization, the
critical question is what *analyzer* you're using to index.

And you could always sidestep the issue entirely by pre-processing
your text and query to replace * with something else.
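
For instance, a minimal sketch of that pre-processing idea (the placeholder
token is only an assumption; pick something your analyzer will keep intact):

    // applied both at index time and at query time for the literal case
    String indexed = text.replace("*", "STARSYMBOL");
    String probe   = userInput.replace("*", "STARSYMBOL");
    Query q = new TermQuery(new Term("field", probe));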

But for escaping, see:
http://lucene.apache.org/java/2_3_2/queryparsersyntax.html

Best
Erick

2008/8/19 Сергій Карпенко <[EMAIL PROTECTED]>

>
> Hello
>
> For example, we have a text:
>
> " Hello w*orld"
>  it's indexed as NO_NORMS, so this phrase is term.
>
> And I have a code:
>
> Query query = new WildcardQuery(new Term("field", " Hello w*orld")); it
> works
>
> But I need the symbol '*' as an ordinary symbol, not an escape symbol.
>
> The QueryParser's analogue '\\*'
> Query query = new WildcardQuery(new Term("field", " Hello w\\*orld"));
> doesn't work.
>
> Thanks
>
>
>
>


Re: Updating tag-indexes

2008-08-19 Thread Erick Erickson
I'd add to Michael's mail the *strong* recommendation that you provide
your own unique doc IDs and use *those* instead. It'll save you a world
of grief. Whenever you need to add a new doc to an existing index, you
can get the maximum of *your* unique IDs and increment it yourself.

One thing to remember is that not all Lucene docs need to have the same
fields. So it's even possible to have a *very special* document that contains
meta-data about your index, say the last used value of your generated IDs, and
keep that meta-data doc up to date. If you put fields in that doc that are NOT
in any other doc, you don't have to worry about accidentally getting this
meta-data doc in your searches.
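
A rough sketch of such a meta-data document (field names are only
illustrative):

    // a marker field that appears in no ordinary document
    Document meta = new Document();
    meta.add(new Field("meta", "index_metadata",
                       Field.Store.NO, Field.Index.UN_TOKENIZED));
    meta.add(new Field("last_used_id", Long.toString(lastUsedId),
                       Field.Store.YES, Field.Index.NO));
    writer.addDocument(meta);

    // later, fetch it back without ever matching normal documents
    Hits hits = searcher.search(new TermQuery(new Term("meta", "index_metadata")));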

Best
Erick

On Tue, Aug 19, 2008 at 8:01 AM, Ivan Vasilev <[EMAIL PROTECTED]> wrote:

> Hi Lucene Guys,
>
> I have a question that is simple but is important for me. I did not find
> the answer in the javadoc so I am asking here.
> When adding Document-s by the method IndexWriter.addDocument(doc) do the
> documents obtain Lucene IDs in the order that they are added to the
> IndexWriter? I mean, will the first added doc have Lucene ID 0, the second
> Lucene ID 1, etc?
>
> Below I describe why I am asking this.
> We plan to split our index to two separate indexes that will be read by
> ParallelReader class. This is so because the one of them will contain
> field(s) that will be indexed and stored and it will be frequently changed.
> So to have always correct data returned from the ParallelReader when
> changing documents in the small index the Lucene IDs of these docs have to
> remain the same.
> To do this Karl Wettin suggests a solution described in *LUCENE-879 <
> https://issues.apache.org/jira/browse/LUCENE-879>*. I do not like this
> solution because it is connected to changing Lucene source code, and after
> each refactoring potentially I will have problems. The solution is related
> to optimizing index so it will not be reasonably faster than the one that I
> prefer. And it is:
> 1. Read the whole index and reconstruct the documents including index data
> by using TermDocs and TermEnum classes;
> 2. Change the needed documents;
> 3. Index documents in new index that will replace the initial one.
> I can even simplify this algorithm (and the speed) if all the fields will
> always be stored - I can read just the stored data and, based on this,
> reconstruct the content of the docs and re-index them in the new index.
>
> But anyway everything in my approaches will depend on this - are
> Lucene IDs in the index ordered in the same way as docs are added to the
> IndexWriter?
>
> Thanks in Advance,
> Ivan
>
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>


Re: Simple Query Question

2008-08-19 Thread Erick Erickson
As Ian says, but you can set the default to AND or OR, see
the API docs.

The 'out of the box' default is OR.

See QueryParser.setDefaultOperator
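
For example, a minimal sketch (the field name is only illustrative; parse()
also throws ParseException):

    QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
    parser.setDefaultOperator(QueryParser.AND_OPERATOR);
    Query q = parser.parse("Eiffel Tower");  // now behaves like "Eiffel AND Tower"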

Best
Erick

On Tue, Aug 19, 2008 at 4:30 AM, Ian Lea <[EMAIL PROTECTED]> wrote:

> No, lucene does not automatically replace spaces with AND.
>
> See http://lucene.apache.org/java/2_3_2/queryparsersyntax.html
>
>
> --
> Ian.
>
>
> On Tue, Aug 19, 2008 at 1:34 AM, DanaWhite <[EMAIL PROTECTED]> wrote:
> >
> > For some reason I am thinking I read somewhere that if you queried
> something
> > like:
> >
> > "Eiffel Tower"
> >
> > Lucene would execute the query "Eiffel AND Tower"
> >
> > Basically I am trying to ask, does lucene automatically replaces spaces
> with
> > the AND operator?
> >
> > Thanks
> > Dana
>
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>


Re: Multiple index performance

2008-08-19 Thread Erick Erickson
Another issue is opening/closing your indexes. When you open an
index for searching, the first few queries you fire invoke considerable
overhead as caches warm up, etc. Plus, you don't get any efficiencies
of scale (that is, pretty soon adding 2X the amount of text to an index
increases the size of the index considerably less than 2X if you're
not storing the text).

So, you either have to keep 10,000 indexes open for efficient searching,
or open/close each one on demand and live with the consequent hit to
your searching performance.

I'd think about keeping it all in a large index, storing the user's name
as a field and appending something like "AND user:cyndy" to each
search. You could also assemble a filter for your user and tack that on
to the query. But the above clause is conceptually simplest.
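
A rough sketch of that single-index approach (the field name and user value
are only illustrative):

    // index time: tag every document with its owner
    doc.add(new Field("user", "cyndy",
                      Field.Store.NO, Field.Index.UN_TOKENIZED));

    // search time: require the owner clause alongside the user's query
    BooleanQuery restricted = new BooleanQuery();
    restricted.add(userQuery, BooleanClause.Occur.MUST);
    restricted.add(new TermQuery(new Term("user", "cyndy")),
                   BooleanClause.Occur.MUST);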

Best
Erick

On Mon, Aug 18, 2008 at 10:34 PM, Cyndy <[EMAIL PROTECTED]> wrote:

>
> Hello, I am new to Lucene and I want to make sure what I am trying to do
> will not hit performance. My scenario is the following:
>
> I want to keep user text files indexed separately, I will have about 10,000
> users and each user may have about 20,000 short files, and I need to keep
> privacy. So the idea is to have one folder with the text files and  index
> for each user, so when search will be done, it will be pointing to the
> corresponding file directory. Would this approach hit performance? is this
> a
> good solution? Any recommendation?
>
> Thanks in advance.
>
>
> --
> View this message in context:
> http://www.nabble.com/Multiple-index-performance-tp19043404p19043404.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>


How I can find wildcard symbol with WildcardQuery?

2008-08-19 Thread Сергій Карпенко

Hello

For example, we have a text:

" Hello w*orld"
 it's indexed as NO_NORMS, so this phrase is term.

And I have a code:

Query query = new WildcardQuery(new Term("field", " Hello w*orld")); it works

But I need the symbol '*' as an ordinary symbol, not an escape symbol.

The QueryParser's analogue '\\*'
Query query = new WildcardQuery(new Term("field", " Hello w\\*orld"));
doesn't work.

Thanks



  

RE: Case Sensitivity

2008-08-19 Thread Dino Korah
Hi Guys,

From the discussion here what I could understand was, if I am using
StandardAnalyzer on TOKENIZED fields, for both Indexing and Querying, I
shouldn't have any problems with cases. But if I have any UN_TOKENIZED
fields there will be problems if I do not case-normalize them myself before
adding them as a field to the document.

In my case I have a mixed scenario. I am indexing emails and the email
addresses are indexed UN_TOKENIZED. I do have a second set of custom
tokenized field, which keep the tokens in individual fields with same name.

For example, if the email had a from address "John Smith"
<[EMAIL PROTECTED]>, my document looks like this

--8<
to: ...   - UN_TOKENIZED
from: [EMAIL PROTECTED]   - UN_TOKENIZED
From-tokenized: John  - UN_TOKENIZED
From-tokenized: Smith - UN_TOKENIZED
From-tokenized: J - UN_TOKENIZED
From-tokenized: Smith - UN_TOKENIZED
From-tokenized: world.net - UN_TOKENIZED
From-tokenized: world - UN_TOKENIZED
From-tokenized: net   - UN_TOKENIZED
Subject: ...  - TOKENIZED
Body: ... - TOKENIZED
--8<

Does it mean that wherever I use UN_TOKENIZED, they do not get through the
StandardAnalyzer before getting indexed, but they do when they are searched
on? If that is the case, do I need to normalise them before adding to the
document?

I also would like to know if it is better to employ an EmailAnalyzer that
makes a TokenStream out of the given email address, rather than using a
simplistic function that gives me a list of string pieces and adding them
one by one. With searches, would both the approaches give same result?

Many thanks,
Dino



-Original Message-
From: Doron Cohen [mailto:[EMAIL PROTECTED] 
Sent: 16 August 2008 21:01
To: java-user@lucene.apache.org
Subject: Re: Case Sensitivity

Hi Sergey, it seems like cases 4 and 5 are equivalent, both meaning case
insensitive, right? Otherwise please explain the difference.

If it is required to support both case sensitive (cases 1,2,3) and case
insensitive (case 4/5) then both forms must be saved in the index - in two
separate fields (as Erick mentioned, I think).
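
For example, a minimal sketch of the two-field approach (the second field name
is only illustrative):

    // index time: original case plus a lowercased copy
    doc.add(new Field("text", value, Field.Store.NO, Field.Index.NO_NORMS));
    doc.add(new Field("text_lc", value.toLowerCase(),
                      Field.Store.NO, Field.Index.NO_NORMS));

    // case-sensitive:    new TermQuery(new Term("text", "Field Without Norms"))
    // case-insensitive:  new TermQuery(new Term("text_lc", "field without norms"))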

Hope this helps,
Doron

On Fri, Aug 15, 2008 at 10:51 AM, Sergey Kabashnyuk
<[EMAIL PROTECTED]>wrote:

> Hello
>
> Here's my use case   content of the field
> Doc1 -
>Field - "text " -   "Field Without Norms"
>
> Doc2 -
>Field - "text " -   "field without norms"
>
> Doc3 -
>Field - "text " -   "FIELD WITHOUT NORMS"
>
>
> Query expected result
> 1. new Term("text","Field Without Norms")   doc1
> 2. new Term("text","field without norms")   doc2
> 3. new Term("text","FIELD WITHOUT NORMS")   doc3


> lowercase("text","field without norms")   doc1, doc2, doc3
> uppercase("text","FIELD WITHOUT NORMS")   doc1, doc2, doc3
>
> I stor "text" field like :
> new Field("text", Field.Store.NO, 
> Field.Index.NO_NORMS,Field.TermVector.NO
> )
> using StandardAnalyzer, and queries 1-3 work perfectly as I need. The 
> question is how to create queries 4-5?
>
> Thanks
>
> Sergey Kabashnyuk
> eXo Platform SAS
>
>
>  Be aware that StandardAnalyzer lowercases all the input,
>> both at index and query times. Field.Store.YES will store the 
>> original text without any transformations, so doc.get() will 
>> return the original text. However, no matter what the Field.Store 
>> value, the *indexed* tokens (using TOKENIZED as you 
>> Field.Index.TOKENIZED) are passed through the analyzer.
>>
>> For instance, indexing "MIXed CasE  TEXT" in a field called "myfield" 
>> with Field.Store.YES, Field.Index.TOKENIZED would index the following 
>> tokens (with StandardAnalyzer).
>> mixed
>> case
>> text
>>
>> and searches (with StandardAnalyzer) would match any case in the 
>> query terms (e.g. MIXED would hit, as would mixed as would CaSE).
>>
>> However, doc.get("myfield") would return "MIXed CasE  TEXT"
>>
>> As Doron said, though, a few use cases would help us provide better 
>> answers.
>>
>> Best
>> Erick
>>
>>
>> On Thu, Aug 14, 2008 at 10:31 AM, Sergey Kabashnyuk 
>> <[EMAIL PROTECTED]
>> >wrote:
>>
>>  Thanks for you  reply Erick.
>>>
>>>
>>>  About the only way to do this that I know of is to
>>>
 index the data three times, once without any case changing, once 
 uppercased and once lowercased.
 You'll have to watch your analyzer, probably making up your own 
 (easily done, see the synonym analyzer in Lucene in Action).

 Your example doesn't tell us anything, since the critical 
 information is the *analyzer* you use, both at query and at index 
 times. The analyzer is responsible for any transformations, like 
 case folding, tokenizing, etc.


>>>
> >>> In the example I want to show that I stored the field as
> >>> Field.Index.NO_NORMS
> >>>
> >>> As I understand it, that means the field contains the original string

Re: Updating tag-indexes

2008-08-19 Thread Michael McCandless


Yes, docIDs are currently sequentially assigned, starting with 0.

BUT: on hitting an exception (say in your analyzer) it will usually  
use up a docID (and then immediately mark it as deleted).


Also, this behavior isn't "promised" in the API, ie it could in theory  
(though I think it unlikely) change in a future release of Lucene.


And remember when a merge completes (or, optimize), any deleted docs  
will "collapse down" all docIDs after them.


Mike

Ivan Vasilev wrote:


Hi Lucene Guys,

I have a question that is simple but is important for me. I did not 
find the answer in the javadoc so I am asking here.
When adding Document-s by the method IndexWriter.addDocument(doc) 
do the documents obtain Lucene IDs in the order that they are 
added to the IndexWriter? I mean, will the first added doc have Lucene 
ID 0, the second Lucene ID 1, etc?


Below I describe why I am asking this.
We plan to split our index to two separate indexes that will be read  
by ParallelReader class. This is so because the one of them will  
contain field(s) that will be indexed and stored and it will be  
frequently changed. So to have always correct data returned from the  
ParallelReader when changing documents in the small index the Lucene  
IDs of these docs have to remain the same.
To do this Karl Wettin suggests a solution described in *LUCENE-879 <https://issues.apache.org/jira/browse/LUCENE-879>*. I do not like this solution because it is connected to changing 
Lucene source code, and after each refactoring potentially I will  
have problems. The solution is related to optimizing index so it  
will not be reasonably faster than the one that I prefer. And it is:
1. Read the whole index and reconstruct the documents including  
index data by using TermDocs and TermEnum classes;

2. Change the needed documents;
3. Index documents in new index that will replace the initial one.
I can even simplify this algorithm (and the speed) if all the fields 
will always be stored - I can read just the stored data and, based on 
this, reconstruct the content of the docs and re-index them in the new index.


But anyway everything in my approaches will depend on this - are 
Lucene IDs in the index ordered in the same way as docs are added to 
the IndexWriter?


Thanks in Advance,
Ivan

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: java.lang.NullPointerExcpetion while indexing on linux

2008-08-19 Thread Michael McCandless


On quick look that code looks fine, though removeField is an expensive  
operation and unnecessary for this.


We really need the full traceback of the exception.

Mike

Aditi Goyal wrote:


Thanks Michael and Ian for your valuable response.
I am attaching a small sample of the code. Please have a look and tell me
where I am going wrong.

import lucene
from lucene import Document, Field, initVM, CLASSPATH

doc = Document()
fieldA = Field('fieldA', "", Field.Store.YES, Field.Index.UN_TOKENIZED)
fieldB = Field('fieldB', "", Field.Store.YES, Field.Index.TOKENIZED)
fieldC = Field('fieldC', "", Field.Store.YES, Field.Index.TOKENIZED)

doc.add(fieldA)
doc.add(fieldB)
doc.add(fieldC)

def get_fields():
    if doc.getField('FieldA') is not None:
        doc.removeField('FieldA')
    if doc.getField('FieldB') is not None:
        doc.removeField('FieldB')
    if doc.getField('FieldC') is not None:
        doc.removeField('FieldC')

    fieldA.setValue("abc")
    doc.add(fieldA)
    fieldB.setValue("xyz")
    doc.add(fieldB)
    fieldC.setValue("123")
    doc.add(fieldC)

    return doc


def add_document():
    doc = get_fields()
    writer = lucene.IndexWriter(index_directory, analyzer, create_path)
    writer.addDocument(doc)
    writer.close()

This writer.addDocument is throwing an exception saying
java.lang.NullPointerException

Thanks,
Aditi

On Tue, Aug 19, 2008 at 3:25 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:



Ian Lea wrote:

I don't think you need to remove the field and then add it again, but

I've no idea if that is relevant to your problem or not.



That's right: just leave the Field there and change its value  
(assuming the

doc you are changing to still uses that field).

A full stack trace would be more help, and maybe an upgrade to 2.3.2,

and maybe a snippet of your code, and what is JCC?



JCC generates the necessary C/C++ glue code for Python to directly  
invoke
Java code.  The Chandler project created this for PyLucene because  
they were

having trouble with GCJ:

  http://blog.chandlerproject.org/author/vajda/

Mike


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Updating tag-indexes

2008-08-19 Thread Ivan Vasilev

Hi Lucene Guys,

I have a question that is simple but is important for me. I did not 
find the answer in the javadoc so I am asking here.
When adding Document-s by the method IndexWriter.addDocument(doc) do 
the documents obtain Lucene IDs in the order that they are added to the 
IndexWriter? I mean, will the first added doc have Lucene ID 0, the second 
Lucene ID 1, etc?


Below I describe why I am asking this.
We plan to split our index to two separate indexes that will be read by 
ParallelReader class. This is so because the one of them will contain 
field(s) that will be indexed and stored and it will be frequently 
changed. So to have always correct data returned from the ParallelReader 
when changing documents in the small index the Lucene IDs of these docs 
have to remain the same.
To do this Karl Wettin suggests a solution described in *LUCENE-879 
<https://issues.apache.org/jira/browse/LUCENE-879>*. I do not like this 
solution because it is connected to changing Lucene source code, and 
after each refactoring potentially I will have problems. The solution is 
related to optimizing index so it will not be reasonably faster than the 
one that I prefer. And it is:
1. Read the whole index and reconstruct the documents including index 
data by using TermDocs and TermEnum classes;

2. Change the needed documents;
3. Index documents in new index that will replace the initial one.
I can even simplify this algorithm (and the speed) if all the fields 
will always be stored - I can read just the stored data and, based on 
this, reconstruct the content of the docs and re-index them in the new index.


But anyway everything in my approaches will depend on this - are 
Lucene IDs in the index ordered in the same way as docs are added to the 
IndexWriter?


Thanks in Advance,
Ivan

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: java.lang.NullPointerExcpetion while indexing on linux

2008-08-19 Thread Aditi Goyal
Thanks Michael and Ian for your valuable response.
I am attaching a small sample of the code. Please have a look and tell me where
I am going wrong.

import lucene
from lucene import Document, Field, initVM, CLASSPATH

doc = Document()
fieldA = Field('fieldA', "", Field.Store.YES, Field.Index.UN_TOKENIZED)
fieldB = Field('fieldB', "", Field.Store.YES, Field.Index.TOKENIZED)
fieldC = Field ('fieldC', "", Field.Store.YES, Field.Index.TOKENIZED)

doc.add(fieldA)
doc.add(fieldB)
doc.add(fieldC)

def get_fields():
    if doc.getField('FieldA') is not None:
        doc.removeField('FieldA')
    if doc.getField('FieldB') is not None:
        doc.removeField('FieldB')
    if doc.getField('FieldC') is not None:
        doc.removeField('FieldC')

    fieldA.setValue("abc")
    doc.add(fieldA)
    fieldB.setValue("xyz")
    doc.add(fieldB)
    fieldC.setValue("123")
    doc.add(fieldC)

    return doc


def add_document():
    doc = get_fields()
    writer = lucene.IndexWriter(index_directory, analyzer, create_path)
    writer.addDocument(doc)
    writer.close()

This writer.addDocument is throwing an exception saying
java.lang.NullPointerException

Thanks,
Aditi

On Tue, Aug 19, 2008 at 3:25 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:

>
> Ian Lea wrote:
>
>  I don't think you need to remove the field and then add it again, but
>> I've no idea if that is relevant to your problem or not.
>>
>
> That's right: just leave the Field there and change its value (assuming the
> doc you are changing to still uses that field).
>
>  A full stack trace would be more help, and maybe an upgrade to 2.3.2,
>> and maybe a snippet of your code, and what is JCC?
>>
>
> JCC generates the necessary C/C++ glue code for Python to directly invoke
> Java code.  The Chandler project created this for PyLucene because they were
> having trouble with GCJ:
>
>http://blog.chandlerproject.org/author/vajda/
>
> Mike
>
>
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>


Re: java.lang.NullPointerExcpetion while indexing on linux

2008-08-19 Thread Michael McCandless


Ian Lea wrote:


I don't think you need to remove the field and then add it again, but
I've no idea if that is relevant to your problem or not.


That's right: just leave the Field there and change its value  
(assuming the doc you are changing to still uses that field).
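
In Java terms, a minimal sketch of that reuse pattern (names and values are
illustrative; the same idea applies through PyLucene):

    Document doc = new Document();
    Field fieldA = new Field("fieldA", "", Field.Store.YES, Field.Index.UN_TOKENIZED);
    doc.add(fieldA);

    for (String value : values) {
        fieldA.setValue(value);   // no removeField()/add() needed
        writer.addDocument(doc);
    }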



A full stack trace would be more help, and maybe an upgrade to 2.3.2,
and maybe a snippet of your code, and what is JCC?


JCC generates the necessary C/C++ glue code for Python to directly  
invoke Java code.  The Chandler project created this for PyLucene  
because they were having trouble with GCJ:


http://blog.chandlerproject.org/author/vajda/

Mike

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: java.lang.NullPointerExcpetion while indexing on linux

2008-08-19 Thread Ian Lea
Hi


I don't think you need to remove the field and then add it again, but
I've no idea if that is relevant to your problem or not.

A full stack trace would be more help, and maybe an upgrade to 2.3.2,
and maybe a snippet of your code, and what is JCC?


--
Ian.


On Tue, Aug 19, 2008 at 10:09 AM, Aditi Goyal <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I am using IndexWriter for adding the documents. I am re-using the document
> as well as the fields for improving index speed as per the link
> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed.
>
> So, for each doc, I am first removing the field using doc.removeField() and then
> field.setValue() for changing the value of the field and finally
> doc.add(field) for adding the field to the document.
>
> It works fine on windows, however it throws  (,
> JavaError(,) when I run
> indexwriter.addDocument(doc) on Linux.
>
> Can anyone please explain why it is happening this way?
> I am using lucene 2.3.1 version and JCC version is 1.8 and Python is 2.5
>
> Thanks,
> Aditi
>

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Are there any Lucene optimizations applicable to SSD?

2008-08-19 Thread Toke Eskildsen
On Tue, 2008-08-19 at 16:22 +0800, Cedric Ho wrote:

[Lucene on SSD]

> However it's still not good enough for our particular case. So I
> wonder if there are any tips for optimizing lucene performance on
> SSDs.

What aspect of performance do you find lacking? Is it searching or
indexing? While we've had stellar results for searches, indexing is just
so-so better than conventional harddisks.

As for optimizing towards SSDs, we've found that the CPU is the
bottleneck for us: The performance keeps climbing markedly for 1-5
threads on a 4 core system with a single 64GB SSD, nearly identical to
the same system with a RAID 0 of 4 * 64GB SSD.

> For example, I saw that Lucene's BufferedIndexInput class will read
> 1024 bytes off the disk each time. This certainly makes sense on a hard
> disk because of the seek latency involved. But would it actually
> hinder performance on SSD?

SSD's still retrieve data in blocks, so my _guess_ is that the 1024
doesn't make much of a difference.

Which SSD did you choose?

> FYI, we were trying to fit an index about 20G in size into a single
> machine with 8G ram. And the searches we receive are vastly different.
> So it's not likely we can depend on the system's file cache to speed
> things up for us.

We've experimented with a 37GB index on a machine with the amount of RAM
varying from 3-24GB of RAM, primarily simple searches. After warmup
(1000 queries), with 8GB and dual core, the performance for SSD is in
the area of 200 queries/sec and rising, as opposed to 50 queries/sec and
rising for conventional harddisks (see the graph under "Warming up" at
http://wiki.statsbiblioteket.dk/summa/Hardware ).

For searches with SSD, the size of the disk cache doesn't affect
performance much, but the first 1000 queries or so aren't representative
at all, no matter if the index is in RAM, on SSD or conventional
harddisks. Of course YMMV.

Could you give some more information on the searches? What is a typical
query, what do you do with the result (e.g. iterate through Hits,
extracting fields)?


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



java.lang.NullPointerExcpetion while indexing on linux

2008-08-19 Thread Aditi Goyal
Hi All,

I am using IndexWriter for adding the documents. I am re-using the document
as well as the fields for improving index speed as per the link
http://wiki.apache.org/lucene-java/ImproveIndexingSpeed.

So, for each doc, I am first removing the field using doc.removeField() and then
field.setValue() for changing the value of the field and finally
doc.add(field) for adding the field to the document.

It works fine on windows, however it throws  (,
JavaError(,) when I run
indexwriter.addDocument(doc) on Linux.

Can anyone please explain why it is happening this way?
I am using lucene 2.3.1 version and JCC version is 1.8 and Python is 2.5

Thanks,
Aditi


Re: Simple Query Question

2008-08-19 Thread Ian Lea
No, lucene does not automatically replace spaces with AND.

See http://lucene.apache.org/java/2_3_2/queryparsersyntax.html


--
Ian.


On Tue, Aug 19, 2008 at 1:34 AM, DanaWhite <[EMAIL PROTECTED]> wrote:
>
> For some reason I am thinking I read somewhere that if you queried something
> like:
>
> "Eiffel Tower"
>
> Lucene would execute the query "Eiffel AND Tower"
>
> Basically I am trying to ask, does lucene automatically replaces spaces with
> the AND operator?
>
> Thanks
> Dana

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Are there any Lucene optimizations applicable to SSD?

2008-08-19 Thread Cedric Ho
Hi all,

We are testing Lucene with SSD. No doubt the performance is much
better than that of a normal hard disk.

However it's still not good enough for our particular case. So I
wonder if there are any tips for optimizing lucene performance on
SSDs.

For example, I saw that Lucene's BufferedIndexInput class will read
1024 bytes off the disk each time. This certainly makes sense on a hard
disk because of the seek latency involved. But would it actually
hinder performance on SSD?


FYI, we were trying to fit an index about 20G in size into a single
machine with 8G ram. And the searches we receive are vastly different.
So it's not likely we can depend on the system's file cache to speed
things up for us.


Any input is appreciated.


Thanks,
Cedric Ho

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]