Re: Porting Analyzer from ver 4.8.1 to ver 6.4.1

2017-02-17 Thread Vincenzo D'Amore
Thank you, works like a charm.


PriorityQueue clarification

2017-02-17 Thread Cristian Lorenzetto
I want to implement a persistent, unbounded priority queue (i.e. not held
entirely in memory) using Lucene. I found the PriorityQueue class in the
documentation, so I am asking for clarification:
1) Does PriorityQueue work entirely in memory or not?
2) If I implement my own class backed by Lucene storage, where I search by
priority, will it have the same performance as the index?
3) Any other suggestions for the best solution, based on your experience
with Lucene?


Re: Searcher Performance

2017-02-17 Thread Chitra R
Thanks a lot, Adrien.

On Fri, Feb 17, 2017 at 10:07 PM, Adrien Grand  wrote:

> Some minimal information about the fields is loaded into memory when you
> open the index reader: things like the list of fields and how they are
> indexed.
>
> However, the vast majority of the data is read from disk lazily; we do
> not warm the filesystem cache or anything like that by default, and we
> do not use direct I/O either. So, say you run a term query: only the
> pages that contain information about that particular field and value
> will be loaded into the cache.
>
> In case you want to warm the filesystem cache explicitly, which could be
> a good idea if you have plenty of filesystem cache for your index (i.e.
> the unused memory of the system is larger than the index), you can look
> into using MMapDirectory.setPreload.
>
> On Fri, Feb 17, 2017 at 15:13, Chitra R wrote:
>
> > Hey, thank you so much. I got it.
> >
> > I have
> >
> >- 10 lakh docs, 30 fields in my index
> >- opening new searcher at initial search and
> >- there will be no filesystem cache for my current index
> >
> > At initial search, I search across only one field out of 30 fields in my
> > index.
> >
> > My question is:
> >
> > *At the initial search, will only the required pages (OS pages of the
> > Lucene index files) for that one field be loaded into the filesystem
> > cache, or will the info for all fields be loaded from disk?*
> >
> >
> > Regards,
> > Chitra
> >
> > On Fri, Feb 17, 2017 at 7:05 PM, Adrien Grand  wrote:
> >
> > > Regarding whether the filesystem cache helps, you could look at
> > > whether there is some disk activity while your queries are running.
> > >
> > > When everything is in the filesystem cache, the latency of search
> > > requests for simple queries (term queries and combinations through
> > > boolean queries) usually mostly depends on the total number of
> > > matches, since Lucene needs to call the collector on every match.
> > >
> > > On Fri, Feb 17, 2017 at 10:09, Chitra R wrote:
> > >
> > > > Hi,
> > > >  While working with Searcher.search, I have noticed a difference
> > > > in search performance. I have 10 lakh documents and 30 fields in
> > > > my index. I performed three searches using different queries, one
> > > > after another. At search time I used MMapDirectory, and the index
> > > > was already open.
> > > >
> > > > *case1: *
> > > >
> > > >    - For the first search, I ran the query (new TermQuery(new
> > > >    Term("name","Chitra"))), which yields 1 lakh documents as the
> > > >    result. Time taken: roughly 50-60 ms.
> > > >    - For the second search, I ran the query (new TermQuery(new
> > > >    Term("animal","lion"))), which also yields 1 lakh documents.
> > > >    Time taken: roughly 50-60 ms.
> > > >    - For the third search, I ran the query (new TermQuery(new
> > > >    Term("bird","peacock"))), which also yields 1 lakh documents.
> > > >    Time taken: roughly 50-60 ms.
> > > >
> > > > In this case, why does searcher.search take nearly the same search
> > > > time for different queries?
> > > >
> > > > *case2:*
> > > >
> > > > If I run the same query twice, the second Searcher.search takes
> > > > less time than the first because of the OS cache.
> > > >
> > > > *Based on the above observations:*
> > > >
> > > > During the initial search, only the required portion of the index
> > > > files is loaded into the I/O cache. For the next search, if the
> > > > required portion is not present in the OS cache, will it take time
> > > > to read those files from disk? If so, is this the reason
> > > > searcher.search takes nearly the same search time for different
> > > > queries?
> > > >
> > > >
> > > > Regards,
> > > > Chitra
> > > >
> > >
> >
>


Re: Searcher Performance

2017-02-17 Thread Adrien Grand
Some minimal information about the fields is loaded into memory when you
open the index reader: things like the list of fields and how they are
indexed.

However, the vast majority of the data is read from disk lazily; we do not
warm the filesystem cache or anything like that by default, and we do not
use direct I/O either. So, say you run a term query: only the pages that
contain information about that particular field and value will be loaded
into the cache.

In case you want to warm the filesystem cache explicitly, which could be a
good idea if you have plenty of filesystem cache for your index (i.e. the
unused memory of the system is larger than the index), you can look into
using MMapDirectory.setPreload.
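
For reference, a minimal sketch of what that preload setup might look like
(the index path is hypothetical):

import java.io.IOException;
import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.MMapDirectory;

public class PreloadExample {
    public static void main(String[] args) throws IOException {
        // Ask for the mapped pages to be loaded eagerly when index files
        // are opened, instead of being faulted in lazily on first access.
        MMapDirectory dir = new MMapDirectory(Paths.get("/path/to/index"));
        dir.setPreload(true);
        DirectoryReader reader = DirectoryReader.open(dir);
        // ... search as usual, then close.
        reader.close();
        dir.close();
    }
}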

On Fri, Feb 17, 2017 at 15:13, Chitra R wrote:

> Hey, thank you so much. I got it.
>
> I have
>
>- 10 lakh docs, 30 fields in my index
>- opening new searcher at initial search and
>- there will be no filesystem cache for my current index
>
> At initial search, I search across only one field out of 30 fields in my
> index.
>
> My question is:
>
> *At the initial search, will only the required pages (OS pages of the
> Lucene index files) for that one field be loaded into the filesystem
> cache, or will the info for all fields be loaded from disk?*
>
>
> Regards,
> Chitra
>
> On Fri, Feb 17, 2017 at 7:05 PM, Adrien Grand  wrote:
>
> > Regarding whether the filesystem cache helps, you could look at
> > whether there is some disk activity while your queries are running.
> >
> > When everything is in the filesystem cache, the latency of search
> > requests for simple queries (term queries and combinations through
> > boolean queries) usually mostly depends on the total number of
> > matches, since Lucene needs to call the collector on every match.
> >
> > On Fri, Feb 17, 2017 at 10:09, Chitra R wrote:
> >
> > > Hi,
> > >  While working with Searcher.search, I have noticed a difference in
> > > search performance. I have 10 lakh documents and 30 fields in my
> > > index. I performed three searches using different queries, one after
> > > another. At search time I used MMapDirectory, and the index was
> > > already open.
> > >
> > > *case1: *
> > >
> > >    - For the first search, I ran the query (new TermQuery(new
> > >    Term("name","Chitra"))), which yields 1 lakh documents as the
> > >    result. Time taken: roughly 50-60 ms.
> > >    - For the second search, I ran the query (new TermQuery(new
> > >    Term("animal","lion"))), which also yields 1 lakh documents. Time
> > >    taken: roughly 50-60 ms.
> > >    - For the third search, I ran the query (new TermQuery(new
> > >    Term("bird","peacock"))), which also yields 1 lakh documents.
> > >    Time taken: roughly 50-60 ms.
> > >
> > > In this case, why does searcher.search take nearly the same search
> > > time for different queries?
> > >
> > > *case2:*
> > >
> > > If I run the same query twice, the second Searcher.search takes less
> > > time than the first because of the OS cache.
> > >
> > > *Based on the above observations:*
> > >
> > > During the initial search, only the required portion of the index
> > > files is loaded into the I/O cache. For the next search, if the
> > > required portion is not present in the OS cache, will it take time
> > > to read those files from disk? If so, is this the reason
> > > searcher.search takes nearly the same search time for different
> > > queries?
> > >
> > >
> > > Regards,
> > > Chitra
> > >
> >
>


Re: Numeric Ranges Faceting

2017-02-17 Thread Chitra R
Hey,
I have indexed the "author" and "module_id" fields as
SortedSetDocValuesFacetField, and the "time", "price" and "salary" fields
as NumericDocValuesField.

My Category looks like:

*module_id
 -> author
*price

module_id and price are parent categories. After selecting any one of the
facets from module_id, the sub-category (the "author" field) will be shown.
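
For context, indexing along those lines might look roughly like this (a
sketch; the values and the addDoc helper are made up):

import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.facet.FacetsConfig;
import org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetField;
import org.apache.lucene.index.IndexWriter;

public class FacetIndexing {
    static void addDoc(IndexWriter writer, FacetsConfig facetsConfig)
            throws IOException {
        Document doc = new Document();
        // String dimensions counted via sorted set doc values facets.
        doc.add(new SortedSetDocValuesFacetField("module_id", "1"));
        doc.add(new SortedSetDocValuesFacetField("author", "someAuthor"));
        // Numeric field, counted later via range facets.
        doc.add(new NumericDocValuesField("price", 100L));
        writer.addDocument(facetsConfig.build(doc));
    }
}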

*Use-case:*

1. I receive path values from the user, e.g. "module_id:1" and "price:100
TO 500", and I also need to perform a drill-sideways search.

*initializing drilldown query*

DrillDownQuery drillDownQuery = new DrillDownQuery(facetsConfig,
userGivenSearchQuery);
drillDownQuery.add("module_id", "1");
drillDownQuery.add("price", NumericRangeQuery.newDoubleRange("price",
100.0, 200.0, range.minInclusive, range.maxInclusive));

* hits and facets computation*

DrillSideways sideways = new DrillSideways(searcher, facetsConfig,
docValuesReaderState);
DrillSideways.DrillSidewaysResult drillResult =
sideways.search(drillDownQuery, booleanFilter, null, 10, sort, doDocScore,
doMaxScore);
int totalHits = drillResult.hits.totalHits;   --> it shows the accurate
total hit count
List<FacetResult> facetResult = drillResult.facets.getAllDims(10);  --> *this
line throws an exception.*


*Exception*

java.lang.IllegalArgumentException: dimension "price" was not indexed

    at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts.getTopChildren(SortedSetDocValuesFacetCounts.java:91)

    at org.apache.lucene.facet.MultiFacets.getAllDims(MultiFacets.java:74)



Did I do anything wrong?


Kindly post your suggestions.

Thanks,
Chitra



On Fri, Feb 17, 2017 at 9:11 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> Hi, how are you instantiating your MultiFacets?  You should be passing
> e.g. a LongRangeFacetCounts instance for your "time" dimension, which
> should prevent that exception.
>
> For DrillSideways, I think you must subclass, and then override
> buildFacetResult to compute your range facets, because that class
> assumes it's either indexed facets or sorted set doc values facets.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Feb 17, 2017 at 9:14 AM, Chitra R  wrote:
> > Any suggestions? Kindly help me to move forward.
> >
> > Regards,
> > Chitra
> >
> > On Wed, Feb 15, 2017 at 9:23 PM, Chitra R  wrote:
> >>
> >> Hi,
> >>   Thanks for the suggestion. But in the case of drill-sideways
> >> search, retrieving all dimensions (using Facets.getAllDims()) threw
> >> an exception, which is shown below...
> >>
> >> 1. While opening the DocValuesReaderState, the global ordinals and
> >> the ordinal range map are computed for the '$facets' field only.
> >> 2. NumericDocValuesField is never indexed under '$facets', so the
> >> ordinal range map will be null for the numeric field, i.e. 'time'.
> >>
>  java.lang.IllegalArgumentException: dimension "time" was not indexed
>
>  at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts.getTopChildren(SortedSetDocValuesFacetCounts.java:91)
>
>  at org.apache.lucene.facet.MultiFacets.getAllDims(MultiFacets.java:74)
> >>
> >>
> >> In my use case,
> >>
> >> Both string pathTraversed and Numeric PathTraversedRanges will occur.
> >> And both faceted search and drill sideways search will be used.
> >>
> >> So how can I add path-traversed numericRanges?
> >>
> >> Have I missed anything?
> >>
> >>
> >> Kindly post your suggestions.
> >>
> >>
> >> Regards,
> >> Chitra
> >>
> >> On Wed, Feb 15, 2017 at 3:28 PM, Michael McCandless
> >>  wrote:
> >>>
> >>> Hi, have a look at the RangeFacetsExample.java under the lucene/demo
> >>> module... it shows how to do this.
> >>>
> >>> Mike McCandless
> >>>
> >>> http://blog.mikemccandless.com
> >>>
> >>>
> >>> On Tue, Feb 14, 2017 at 12:07 PM, Chitra R 
> wrote:
> >>> > Hi,
> >>> >    We plan to implement both string and numeric faceting using
> >>> > doc values fields.
> >>> >
> >>> > For string faceting, we add the path-traversed dimensions to the
> >>> > DrillDownQuery. But for numeric faceting, how and where can we add
> >>> > the path-traversed ranges during the next-level faceted search?
> >>> > Also, which is the better way: adding the path-traversed ranges to
> >>> > a NumericRangeQuery, or adding them to a filter? Or is there any
> >>> > other solution?
> >>> >
> >>> > Thanks & Regards,
> >>> > Chitra
> >>> >
> >>> >
> >>> > Sent from my iPhone
> >>> > 
> -
> >>> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >>> > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >>> >
> >>
> >>
> >
>


Re: Numeric Ranges Faceting

2017-02-17 Thread Michael McCandless
Hi, how are you instantiating your MultiFacets?  You should be passing
e.g. a LongRangeFacetCounts instance for your "time" dimension, which
should prevent that exception.
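
A rough sketch of that wiring (the range bounds and variable names are made
up; fc is the FacetsCollector from the search, state the
SortedSetDocValuesReaderState):

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.facet.Facets;
import org.apache.lucene.facet.FacetsCollector;
import org.apache.lucene.facet.MultiFacets;
import org.apache.lucene.facet.range.LongRange;
import org.apache.lucene.facet.range.LongRangeFacetCounts;
import org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts;
import org.apache.lucene.facet.sortedset.SortedSetDocValuesReaderState;

public class RangeFacetsWiring {
    // Count the "time" dimension with range facets, everything else with
    // sorted set doc values facets.
    static Facets build(SortedSetDocValuesReaderState state,
            FacetsCollector fc) throws IOException {
        Map<String, Facets> byDim = new HashMap<>();
        byDim.put("time", new LongRangeFacetCounts("time", fc,
                new LongRange("past hour", 0L, true, 3600L, false)));
        return new MultiFacets(byDim,
                new SortedSetDocValuesFacetCounts(state, fc));
    }
}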

For DrillSideways, I think you must subclass, and then override
buildFacetResult to compute your range facets, because that class
assumes it's either indexed facets or sorted set doc values facets.
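
A skeleton of that subclass might look like this (a sketch: the protected
hook is named buildFacetsResult in the Lucene sources I know of, so check
the exact name and signature for your version):

import java.io.IOException;
import org.apache.lucene.facet.DrillSideways;
import org.apache.lucene.facet.Facets;
import org.apache.lucene.facet.FacetsCollector;
import org.apache.lucene.facet.FacetsConfig;
import org.apache.lucene.facet.sortedset.SortedSetDocValuesReaderState;
import org.apache.lucene.search.IndexSearcher;

public class RangeDrillSideways extends DrillSideways {
    public RangeDrillSideways(IndexSearcher searcher, FacetsConfig config,
                              SortedSetDocValuesReaderState state) {
        super(searcher, config, state);
    }

    @Override
    protected Facets buildFacetsResult(FacetsCollector drillDowns,
            FacetsCollector[] drillSideways, String[] drillSidewaysDims)
            throws IOException {
        // Compute range facets for the numeric dims here and combine them
        // with the default facets (e.g. via MultiFacets).
        return super.buildFacetsResult(drillDowns, drillSideways,
                drillSidewaysDims);
    }
}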

Mike McCandless

http://blog.mikemccandless.com


On Fri, Feb 17, 2017 at 9:14 AM, Chitra R  wrote:
> Any suggestions? Kindly help me to move forward.
>
> Regards,
> Chitra
>
> On Wed, Feb 15, 2017 at 9:23 PM, Chitra R  wrote:
>>
>> Hi,
>>   Thanks for the suggestion. But in the case of drill-sideways
>> search, retrieving all dimensions (using Facets.getAllDims()) threw an
>> exception, which is shown below...
>>
>> 1. While opening the DocValuesReaderState, the global ordinals and the
>> ordinal range map are computed for the '$facets' field only.
>> 2. NumericDocValuesField is never indexed under '$facets', so the
>> ordinal range map will be null for the numeric field, i.e. 'time'.
>>
java.lang.IllegalArgumentException: dimension "time" was not indexed

    at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts.getTopChildren(SortedSetDocValuesFacetCounts.java:91)

    at org.apache.lucene.facet.MultiFacets.getAllDims(MultiFacets.java:74)
>>
>>
>> In my use case,
>>
>> Both string pathTraversed and Numeric PathTraversedRanges will occur.
>> And both faceted search and drill sideways search will be used.
>>
>> So how can I add path-traversed numericRanges?
>>
>> Have I missed anything?
>>
>>
>> Kindly post your suggestions.
>>
>>
>> Regards,
>> Chitra
>>
>> On Wed, Feb 15, 2017 at 3:28 PM, Michael McCandless
>>  wrote:
>>>
>>> Hi, have a look at the RangeFacetsExample.java under the lucene/demo
>>> module... it shows how to do this.
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>>
>>> On Tue, Feb 14, 2017 at 12:07 PM, Chitra R  wrote:
>>> > Hi,
>>> >    We plan to implement both string and numeric faceting using
>>> > doc values fields.
>>> >
>>> > For string faceting, we add the path-traversed dimensions to the
>>> > DrillDownQuery. But for numeric faceting, how and where can we add
>>> > the path-traversed ranges during the next-level faceted search?
>>> > Also, which is the better way: adding the path-traversed ranges to a
>>> > NumericRangeQuery, or adding them to a filter? Or is there any other
>>> > solution?
>>> >
>>> > Thanks & Regards,
>>> > Chitra
>>> >
>>> >
>>> > Sent from my iPhone
>>> > -
>>> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> > For additional commands, e-mail: java-user-h...@lucene.apache.org
>>> >
>>
>>
>

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: Numeric Ranges Faceting

2017-02-17 Thread Chitra R
Any suggestions? Kindly help me to move forward.

Regards,
Chitra

On Wed, Feb 15, 2017 at 9:23 PM, Chitra R  wrote:

> Hi,
>   Thanks for the suggestion. But in the case of drill-sideways
> search, retrieving all dimensions (using Facets.getAllDims()) threw an
> exception, which is shown below...
>
> 1. While opening the DocValuesReaderState, the global ordinals and the
> ordinal range map are computed for the '$facets' field only.
> 2. NumericDocValuesField is never indexed under '$facets', so the
> ordinal range map will be null for the numeric field, i.e. 'time'.
>
> java.lang.IllegalArgumentException: dimension "time" was not indexed
>
>     at org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts.getTopChildren(SortedSetDocValuesFacetCounts.java:91)
>
>     at org.apache.lucene.facet.MultiFacets.getAllDims(MultiFacets.java:74)
>>
> In my use case,
>
>- Both string pathTraversed and Numeric PathTraversedRanges will
>occur.
>- And both faceted search and drill sideways search will be used.
>
> So how can I add path-traversed numericRanges?
>
> Have I missed anything?
>
>
> Kindly post your suggestions.
>
>
> Regards,
> Chitra
>
> On Wed, Feb 15, 2017 at 3:28 PM, Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> Hi, have a look at the RangeFacetsExample.java under the lucene/demo
>> module... it shows how to do this.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Tue, Feb 14, 2017 at 12:07 PM, Chitra R  wrote:
>> > Hi,
>> >    We plan to implement both string and numeric faceting using
>> > doc values fields.
>> >
>> > For string faceting, we add the path-traversed dimensions to the
>> > DrillDownQuery. But for numeric faceting, how and where can we add
>> > the path-traversed ranges during the next-level faceted search?
>> > Also, which is the better way: adding the path-traversed ranges to a
>> > NumericRangeQuery, or adding them to a filter? Or is there any other
>> > solution?
>> >
>> > Thanks & Regards,
>> > Chitra
>> >
>> >
>> > Sent from my iPhone
>> > -
>> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: java-user-h...@lucene.apache.org
>> >
>>
>
>


Re: Searcher Performance

2017-02-17 Thread Chitra R
Hey, thank you so much. I got it.

I have

   - 10 lakh docs, 30 fields in my index
   - opening new searcher at initial search and
   - there will be no filesystem cache for my current index

At initial search, I search across only one field out of 30 fields in my
index.

My question is:

*At the initial search, will only the required pages (OS pages of the
Lucene index files) for that one field be loaded into the filesystem
cache, or will the info for all fields be loaded from disk?*


Regards,
Chitra

On Fri, Feb 17, 2017 at 7:05 PM, Adrien Grand  wrote:

> Regarding whether the filesystem cache helps, you could look at whether
> there is some disk activity while your queries are running.
>
> When everything is in the filesystem cache, the latency of search requests
> for simple queries (term queries and combinations through boolean queries)
> usually mostly depends on the total number of matches since Lucene needs to
> call the collector on every match.
>
> On Fri, Feb 17, 2017 at 10:09, Chitra R wrote:
>
> > Hi,
> >  While working with Searcher.search, I have noticed a difference in
> > search performance. I have 10 lakh documents and 30 fields in my
> > index. I performed three searches using different queries, one after
> > another. At search time I used MMapDirectory, and the index was
> > already open.
> >
> > *case1: *
> >
> >    - For the first search, I ran the query (new TermQuery(new
> >    Term("name","Chitra"))), which yields 1 lakh documents as the
> >    result. Time taken: roughly 50-60 ms.
> >    - For the second search, I ran the query (new TermQuery(new
> >    Term("animal","lion"))), which also yields 1 lakh documents. Time
> >    taken: roughly 50-60 ms.
> >    - For the third search, I ran the query (new TermQuery(new
> >    Term("bird","peacock"))), which also yields 1 lakh documents. Time
> >    taken: roughly 50-60 ms.
> >
> > In this case, why does searcher.search take nearly the same search
> > time for different queries?
> >
> > *case2:*
> >
> > If I run the same query twice, the second Searcher.search takes less
> > time than the first because of the OS cache.
> >
> > *Based on the above observations:*
> >
> > During the initial search, only the required portion of the index
> > files is loaded into the I/O cache. For the next search, if the
> > required portion is not present in the OS cache, will it take time to
> > read those files from disk? If so, is this the reason searcher.search
> > takes nearly the same search time for different queries?
> >
> >
> > Regards,
> > Chitra
> >
>


Re: unable to delete document via the IndexWriter.deleteDocuments(term) method

2017-02-17 Thread Armnotstrong
Thanks Ian,

   That's what I needed, things now work like a charm.
   Someone really should put this in a blog post or something :D

   good day



2017-02-17 21:16 GMT+08:00 Ian Lea :
> Hi
>
>
> Sounds like you should use FieldType.setTokenized(false).  For the
> equivalent field in some of my lucene indexes I use
>
> FieldType idf = new FieldType();
> idf.setStored(true);
> idf.setOmitNorms(true);
> idf.setIndexOptions(IndexOptions.DOCS);
> idf.setTokenized(false);
> idf.freeze();
>
> There's also PerFieldAnalyzerWrapper, in oal.analysis.miscellaneous for
> version 6.x, although I have a feeling it was elsewhere in earlier
> versions.
>
>
> --
> Ian.
>
>
>
> On Fri, Feb 17, 2017 at 12:26 PM, Armnotstrong  wrote:
>
>> Thanks, Ian:
>>
>> You saved my day!
>>
>> And there is a further question to ask:
>>
>> Since the analyzer can only be configured through the IndexWriter,
>> using different analyzers for different fields is not possible, right?
>> I only want this '_id' field to identify the document in the index, so
>> I can update or delete the specific document when needed; the real
>> search field is a text field, which should be analyzed by the smartcn
>> analyzer.
>>
>> Thus, I think it would be good to have a configuration option such as
>> IndexOptions.NOT_ANALYSED. I remember having that in an old version of
>> Lucene, but it is not in version 5.x.
>>
>> Any suggestion for working around that?
>>
>> Sorry for my bad English.
>>
>> 2017-02-17 19:40 GMT+08:00 Ian Lea :
>> > Hi
>> >
>> >
>> > SimpleAnalyzer uses LetterTokenizer which divides text at non-letters.
>> > Your add and search methods use the analyzer but the delete method
>> doesn't.
>> >
>> > Replacing SimpleAnalyzer with KeywordAnalyzer in your program fixes it.
>> > You'll need to make sure that your id field is left alone.
>> >
>> >
>> > Good to see a small self-contained test program.  A couple of suggestions
>> > to make it even better if there's a next time:
>> >
>> > Use final static String ID = "_id" and ... KEY =
>> > "5836962b0293a47b09d345f1".  Minimises the risk of typos.
>> >
>> > And use RAMDirectory.  Means your program doesn't leave junk on my
>> > disk if I run it, and also means it starts with an empty index each
>> > time.
>> >
>> >
>> > --
>> > Ian.
>> >
>> >
>> > On Fri, Feb 17, 2017 at 10:04 AM, Armnotstrong 
>> wrote:
>> >
>> >> Hi, all:
>> >>
>> >> I am Using version 5.5.4, and find can't delete a document via the
>> >> IndexWriter.deleteDocuments(term) method.
>> >>
>> >> Here is the test code:
>> >>
>> >> import org.apache.lucene.analysis.core.SimpleAnalyzer;
>> >> import org.apache.lucene.document.Document;
>> >> import org.apache.lucene.document.Field;
>> >> import org.apache.lucene.document.FieldType;
>> >> import org.apache.lucene.index.*;
>> >> import org.apache.lucene.queryparser.classic.ParseException;
>> >> import org.apache.lucene.queryparser.classic.QueryParser;
>> >> import org.apache.lucene.search.IndexSearcher;
>> >> import org.apache.lucene.search.Query;
>> >> import org.apache.lucene.search.ScoreDoc;
>> >> import org.apache.lucene.store.Directory;
>> >> import org.apache.lucene.store.FSDirectory;
>> >>
>> >> import java.io.IOException;
>> >> import java.nio.file.Paths;
>> >>
>> >> public class TestSearch {
>> >> static SimpleAnalyzer analyzer = new SimpleAnalyzer();
>> >>
>> >> public static void main(String[] argvs) throws IOException,
>> >> ParseException {
>> >> generateIndex("5836962b0293a47b09d345f1");
>> >> query("5836962b0293a47b09d345f1");
>> >> delete("5836962b0293a47b09d345f1");
>> >> query("5836962b0293a47b09d345f1");
>> >>
>> >> }
>> >>
>> >> public static void generateIndex(String id) throws IOException {
>> >> Directory directory = FSDirectory.open(Paths.get("/
>> >> tmp/test/lucene"));
>> >> IndexWriterConfig config = new IndexWriterConfig(analyzer);
>> >> IndexWriter iwriter = new IndexWriter(directory, config);
>> >> FieldType fieldType = new FieldType();
>> >> fieldType.setStored(true);
>> >> fieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_
>> >> AND_POSITIONS_AND_OFFSETS);
>> >> Field idField = new Field("_id", id, fieldType);
>> >> Document doc = new Document();
>> >> doc.add(idField);
>> >> iwriter.addDocument(doc);
>> >> iwriter.close();
>> >>
>> >> }
>> >>
>> >> public static void query(String id) throws ParseException,
>> IOException
>> >> {
>> >> Query query = new QueryParser("_id", analyzer).parse(id);
>> >> Directory directory = FSDirectory.open(Paths.get("/
>> >> tmp/test/lucene"));
>> >> IndexReader ireader  = DirectoryReader.open(directory);
>> >> IndexSearcher isearcher = new IndexSearcher(ireader);
>> >> ScoreDoc[] scoreDoc = isearcher.search(query, 100).scoreDocs;
>> >> for(ScoreDoc scdoc: 

Re: How FST constructed in lucene?

2017-02-17 Thread Adrien Grand
On Fri, Feb 17, 2017 at 11:17, krish mohan wrote:

> During search, does Lucene use the FST in the .tip file to match against
> the terms? How will changes to the index be reflected in the FST? Will
> it be re-constructed, or will the existing FST be updated in place?
>

Lucene never updates existing files. What happens in practice when you add
or update documents is that Lucene builds new segments, each with their
own FSTs. FSTs are also re-built upon merge.


> In the case of wildcard and fuzzy queries, Lucene needs to test a large
> number of terms. Will the FST help skip some portion of the terms during
> the comparison?
>

Indeed it will. When intersecting the terms dictionary with an automaton,
Lucene basically performs a leap-frog between the terms contained in the
terms dictionary and the terms that are accepted by the automaton. In some
cases, the terms dictionary will lead the iteration and the automaton will
be used to verify that terms match, but in other cases Lucene will ask the
automaton what the next accepted term is and will ask the terms dictionary
to advance on or beyond that term (seekCeil), which uses the FST under the
hood. You can look at AutomatonTermsEnum and FilteredTermsEnum if you would
like to learn more about how it works.
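
A toy illustration of that seekCeil primitive (the field and term are made
up):

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

public class SeekCeilExample {
    // Position the enum on the smallest indexed term >= "foo"; this is
    // the jump that the FST-based terms index accelerates.
    static void demo(IndexReader reader) throws IOException {
        Terms terms = MultiFields.getTerms(reader, "body");
        if (terms == null) return;
        TermsEnum te = terms.iterator();
        if (te.seekCeil(new BytesRef("foo")) != TermsEnum.SeekStatus.END) {
            System.out.println(te.term().utf8ToString());
        }
    }
}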


Re: Searcher Performance

2017-02-17 Thread Adrien Grand
Regarding whether the filesystem cache helps, you could look at whether
there is some disk activity while your queries are running.

When everything is in the filesystem cache, the latency of search requests
for simple queries (term queries and combinations through boolean queries)
usually mostly depends on the total number of matches since Lucene needs to
call the collector on every match.
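
One way to see that per-match cost in isolation is to run the query with a
collector that does nothing but count (a sketch):

import java.io.IOException;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TotalHitCountCollector;

public class CountMatches {
    // The collector is invoked once per matching document, so latency
    // grows with the number of matches even when no scores are computed.
    static int count(IndexSearcher searcher, Query query) throws IOException {
        TotalHitCountCollector collector = new TotalHitCountCollector();
        searcher.search(query, collector);
        return collector.getTotalHits();
    }
}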

On Fri, Feb 17, 2017 at 10:09, Chitra R wrote:

> Hi,
>  While working with Searcher.search, I have noticed a difference in
> search performance. I have 10 lakh documents and 30 fields in my index.
> I performed three searches using different queries, one after another.
> At search time I used MMapDirectory, and the index was already open.
>
> *case1: *
>
>    - For the first search, I ran the query (new TermQuery(new
>    Term("name","Chitra"))), which yields 1 lakh documents as the result.
>    Time taken: roughly 50-60 ms.
>    - For the second search, I ran the query (new TermQuery(new
>    Term("animal","lion"))), which also yields 1 lakh documents. Time
>    taken: roughly 50-60 ms.
>    - For the third search, I ran the query (new TermQuery(new
>    Term("bird","peacock"))), which also yields 1 lakh documents. Time
>    taken: roughly 50-60 ms.
>
> In this case, why does searcher.search take nearly the same search time
> for different queries?
>
> *case2:*
>
> If I run the same query twice, the second Searcher.search takes less
> time than the first because of the OS cache.
>
> *Based on the above observations:*
>
> During the initial search, only the required portion of the index files
> is loaded into the I/O cache. For the next search, if the required
> portion is not present in the OS cache, will it take time to read those
> files from disk? If so, is this the reason searcher.search takes nearly
> the same search time for different queries?
>
>
> Regards,
> Chitra
>


Re: unable to delete document via the IndexWriter.deleteDocuments(term) method

2017-02-17 Thread Ian Lea
Hi


Sounds like you should use FieldType.setTokenized(false).  For the
equivalent field in some of my Lucene indexes I use:

FieldType idf = new FieldType();
idf.setStored(true);
idf.setOmitNorms(true);
idf.setIndexOptions(IndexOptions.DOCS);
idf.setTokenized(false);
idf.freeze();
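
That FieldType would then be used when creating the id field, e.g. (a
fragment continuing the snippet above, with the id value from your test):

Field idField = new Field("_id", "5836962b0293a47b09d345f1", idf);
doc.add(idField);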

There's also PerFieldAnalyzerWrapper, in oal.analysis.miscellaneous for
version 6.x, although I have a feeling it was elsewhere in earlier
versions.
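
Wired up for your case (KeywordAnalyzer for the id field, smartcn for the
text), it would look roughly like this (a sketch, assuming the smartcn
analyzer module is on the classpath):

import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper;

public class PerFieldSetup {
    static Analyzer build() {
        // "_id" must be left untokenized; everything else goes through
        // the smartcn analyzer.
        Map<String, Analyzer> perField = new HashMap<>();
        perField.put("_id", new KeywordAnalyzer());
        return new PerFieldAnalyzerWrapper(new SmartChineseAnalyzer(), perField);
    }
}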


--
Ian.



On Fri, Feb 17, 2017 at 12:26 PM, Armnotstrong  wrote:

> Thanks, Ian:
>
> You saved my day!
>
> And there is a further question to ask:
>
> Since the analyzer can only be configured through the IndexWriter,
> using different analyzers for different fields is not possible, right?
> I only want this '_id' field to identify the document in the index, so
> I can update or delete the specific document when needed; the real
> search field is a text field, which should be analyzed by the smartcn
> analyzer.
>
> Thus, I think it would be good to have a configuration option such as
> IndexOptions.NOT_ANALYSED. I remember having that in an old version of
> Lucene, but it is not in version 5.x.
>
> Any suggestion for working around that?
>
> Sorry for my bad English.
>
> 2017-02-17 19:40 GMT+08:00 Ian Lea :
> > Hi
> >
> >
> > SimpleAnalyzer uses LetterTokenizer which divides text at non-letters.
> > Your add and search methods use the analyzer but the delete method
> doesn't.
> >
> > Replacing SimpleAnalyzer with KeywordAnalyzer in your program fixes it.
> > You'll need to make sure that your id field is left alone.
> >
> >
> > Good to see a small self-contained test program.  A couple of suggestions
> > to make it even better if there's a next time:
> >
> > Use final static String ID = "_id" and ... KEY =
> > "5836962b0293a47b09d345f1".  Minimises the risk of typos.
> >
> > And use RAMDirectory.  Means your program doesn't leave junk on my
> > disk if I run it, and also means it starts with an empty index each
> > time.
> >
> >
> > --
> > Ian.
> >
> >
> > On Fri, Feb 17, 2017 at 10:04 AM, Armnotstrong 
> wrote:
> >
> >> Hi, all:
> >>
> >> I am Using version 5.5.4, and find can't delete a document via the
> >> IndexWriter.deleteDocuments(term) method.
> >>
> >> Here is the test code:
> >>
> >> import org.apache.lucene.analysis.core.SimpleAnalyzer;
> >> import org.apache.lucene.document.Document;
> >> import org.apache.lucene.document.Field;
> >> import org.apache.lucene.document.FieldType;
> >> import org.apache.lucene.index.*;
> >> import org.apache.lucene.queryparser.classic.ParseException;
> >> import org.apache.lucene.queryparser.classic.QueryParser;
> >> import org.apache.lucene.search.IndexSearcher;
> >> import org.apache.lucene.search.Query;
> >> import org.apache.lucene.search.ScoreDoc;
> >> import org.apache.lucene.store.Directory;
> >> import org.apache.lucene.store.FSDirectory;
> >>
> >> import java.io.IOException;
> >> import java.nio.file.Paths;
> >>
> >> public class TestSearch {
> >> static SimpleAnalyzer analyzer = new SimpleAnalyzer();
> >>
> >> public static void main(String[] argvs) throws IOException,
> >> ParseException {
> >> generateIndex("5836962b0293a47b09d345f1");
> >> query("5836962b0293a47b09d345f1");
> >> delete("5836962b0293a47b09d345f1");
> >> query("5836962b0293a47b09d345f1");
> >>
> >> }
> >>
> >> public static void generateIndex(String id) throws IOException {
> >> Directory directory = FSDirectory.open(Paths.get("/
> >> tmp/test/lucene"));
> >> IndexWriterConfig config = new IndexWriterConfig(analyzer);
> >> IndexWriter iwriter = new IndexWriter(directory, config);
> >> FieldType fieldType = new FieldType();
> >> fieldType.setStored(true);
> >> fieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_
> >> AND_POSITIONS_AND_OFFSETS);
> >> Field idField = new Field("_id", id, fieldType);
> >> Document doc = new Document();
> >> doc.add(idField);
> >> iwriter.addDocument(doc);
> >> iwriter.close();
> >>
> >> }
> >>
> >> public static void query(String id) throws ParseException,
> IOException
> >> {
> >> Query query = new QueryParser("_id", analyzer).parse(id);
> >> Directory directory = FSDirectory.open(Paths.get("/
> >> tmp/test/lucene"));
> >> IndexReader ireader  = DirectoryReader.open(directory);
> >> IndexSearcher isearcher = new IndexSearcher(ireader);
> >> ScoreDoc[] scoreDoc = isearcher.search(query, 100).scoreDocs;
> >> for(ScoreDoc scdoc: scoreDoc){
> >> Document doc = isearcher.doc(scdoc.doc);
> >> System.out.println(doc.get("_id"));
> >> }
> >> }
> >>
> >> public static void delete(String id){
> >> try {
> >>  Directory directory =
> >> FSDirectory.open(Paths.get("/tmp/test/lucene"));
> >> IndexWriterConfig config 

Re: Grouping in Lucene queries giving unexpected results

2017-02-17 Thread Michael Peterson
Thanks everyone.

For our use case in Rocana Search, we don't use scoring at all. We always
sort by a timestamp field present in every Document, so for us Lucene query
logic is always truly boolean - we only want exact matches using boolean
logic like you would get from a database query.

That being said, I can see now why +/- operators are useful when wanting
"should" vs. "must" for scoring.

Trejkaz - thanks for the deeper explanation. We will, in fact, modify naked
"NOT x" queries (where x might be a complex clause) to be

(*:* AND NOT x)

as that is exactly the interpretation we want.
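
In query API terms that rewrite is straightforward (x is hypothetical
here):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class PureNegation {
    // (*:* AND NOT x): match everything, then subtract the matches of x.
    static Query notX() {
        Query x = new TermQuery(new Term("field", "value")); // hypothetical x
        return new BooleanQuery.Builder()
                .add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST)
                .add(x, BooleanClause.Occur.MUST_NOT)
                .build();
    }
}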

-Michael Peterson

https://www.rocana.com/


On Thu, Feb 16, 2017 at 8:27 PM, Trejkaz  wrote:

> On Fri, Feb 17, 2017 at 11:14 AM, Erick Erickson
>  wrote:
> > Lucene query logic is not strict Boolean logic; the article above
> > explains why.
>
> tl;dr it mostly comes down to scoring and syntax.
>
> The scoring argument will depend on how much you care. (My care for
> scoring is pretty close to zero, as I don't care whether the better
> results come first, as long as the exact results come back and the
> non-results don't.)
>
> For the syntax:
>
> * The article doesn't really address the (-NOT) problem, where
> essentially Lucene could insert an implicit *:* when there isn't one,
> to make those queries at least get a sane result. You can work around
> this by customising the query parser, possible for both for the
> classic one (subclass it and override the method to create the
> BooleanQuery) and the flexible one (add a processor to the pipeline).
>
> * The article strongly encourages using the +/- syntax instead of
> AND/OR/NOT, but the astute might notice that AND/OR/NOT is three
> operators, whereas +/- is only two, so clearly one of the boolean
> clause types does not have a prefix operator, making it literally
> impossible to specify some queries using the prefix operators alone.
>
> TX
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>


Re: unable to delete document via the IndexWriter.deleteDocuments(term) method

2017-02-17 Thread Armnotstrong
Thanks, Ian:

You saved my day!

And there is a further question to ask:

Since the analyzer can only be configured through the IndexWriter, using
different analyzers for different fields is not possible, right? I only
want this '_id' field to identify the document in the index, so I can
update or delete the specific document when needed; the real search field
is a text field, which should be analyzed by the smartcn analyzer.

Thus, I think it would be good to have a configuration option such as
IndexOptions.NOT_ANALYSED. I remember having that in an old version of
Lucene, but it is not in version 5.x.

Any suggestion for working around that?

Sorry for my bad English.

2017-02-17 19:40 GMT+08:00 Ian Lea :
> Hi
>
>
> SimpleAnalyzer uses LetterTokenizer which divides text at non-letters.
> Your add and search methods use the analyzer but the delete method doesn't.
>
> Replacing SimpleAnalyzer with KeywordAnalyzer in your program fixes it.
> You'll need to make sure that your id field is left alone.
>
>
> Good to see a small self-contained test program.  A couple of suggestions
> to make it even better if there's a next time:
>
> Use final static String ID = "_id" and ... KEY =
> "5836962b0293a47b09d345f1".  Minimises the risk of typos.
>
> And use RAMDirectory.  Means your program doesn't leave junk on my disk if
> I run it, and also means it starts with an empty index each time.
>
>
> --
> Ian.
>
>
> On Fri, Feb 17, 2017 at 10:04 AM, Armnotstrong  wrote:
>
>> Hi, all:
>>
>> I am Using version 5.5.4, and find can't delete a document via the
>> IndexWriter.deleteDocuments(term) method.
>>
>> Here is the test code:
>>
>> import org.apache.lucene.analysis.core.SimpleAnalyzer;
>> import org.apache.lucene.document.Document;
>> import org.apache.lucene.document.Field;
>> import org.apache.lucene.document.FieldType;
>> import org.apache.lucene.index.*;
>> import org.apache.lucene.queryparser.classic.ParseException;
>> import org.apache.lucene.queryparser.classic.QueryParser;
>> import org.apache.lucene.search.IndexSearcher;
>> import org.apache.lucene.search.Query;
>> import org.apache.lucene.search.ScoreDoc;
>> import org.apache.lucene.store.Directory;
>> import org.apache.lucene.store.FSDirectory;
>>
>> import java.io.IOException;
>> import java.nio.file.Paths;
>>
>> public class TestSearch {
>> static SimpleAnalyzer analyzer = new SimpleAnalyzer();
>>
>> public static void main(String[] argvs) throws IOException,
>> ParseException {
>> generateIndex("5836962b0293a47b09d345f1");
>> query("5836962b0293a47b09d345f1");
>> delete("5836962b0293a47b09d345f1");
>> query("5836962b0293a47b09d345f1");
>>
>> }
>>
>> public static void generateIndex(String id) throws IOException {
>> Directory directory = FSDirectory.open(Paths.get("/
>> tmp/test/lucene"));
>> IndexWriterConfig config = new IndexWriterConfig(analyzer);
>> IndexWriter iwriter = new IndexWriter(directory, config);
>> FieldType fieldType = new FieldType();
>> fieldType.setStored(true);
>> fieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_
>> AND_POSITIONS_AND_OFFSETS);
>> Field idField = new Field("_id", id, fieldType);
>> Document doc = new Document();
>> doc.add(idField);
>> iwriter.addDocument(doc);
>> iwriter.close();
>>
>> }
>>
>> public static void query(String id) throws ParseException, IOException
>> {
>> Query query = new QueryParser("_id", analyzer).parse(id);
>> Directory directory = FSDirectory.open(Paths.get("/
>> tmp/test/lucene"));
>> IndexReader ireader  = DirectoryReader.open(directory);
>> IndexSearcher isearcher = new IndexSearcher(ireader);
>> ScoreDoc[] scoreDoc = isearcher.search(query, 100).scoreDocs;
>> for(ScoreDoc scdoc: scoreDoc){
>> Document doc = isearcher.doc(scdoc.doc);
>> System.out.println(doc.get("_id"));
>> }
>> }
>>
>> public static void delete(String id){
>> try {
>>  Directory directory =
>> FSDirectory.open(Paths.get("/tmp/test/lucene"));
>> IndexWriterConfig config = new IndexWriterConfig(analyzer);
>> IndexWriter iwriter = new IndexWriter(directory, config);
>> Term term = new Term("_id", id);
>> iwriter.deleteDocuments(term);
>> iwriter.commit();
>> iwriter.close();
>> }catch (IOException e){
>> e.printStackTrace();
>> }
>> }
>> }
>>
>>
>> --
>> 
>> best regards & a nice day
>> Zhao Ximing
>>
>> -
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>



-- 

best regards & a nice day
Zhao Ximing


Re: unable to delete document via the IndexWriter.deleteDocuments(term) method

2017-02-17 Thread Ian Lea
Hi


SimpleAnalyzer uses LetterTokenizer which divides text at non-letters.
Your add and search methods use the analyzer but the delete method doesn't.

Replacing SimpleAnalyzer with KeywordAnalyzer in your program fixes it.
You'll need to make sure that your id field is left alone.


Good to see a small self-contained test program.  A couple of suggestions
to make it even better if there's a next time:

Use final static String ID = "_id" and ... KEY =
"5836962b0293a47b09d345f1".  Minimises the risk of typos.

And use RAMDirectory.  Means your program doesn't leave junk on my disk if
I run it, and also means it starts with an empty index each time.


--
Ian.


On Fri, Feb 17, 2017 at 10:04 AM, Armnotstrong  wrote:

> Hi, all:
>
> I am Using version 5.5.4, and find can't delete a document via the
> IndexWriter.deleteDocuments(term) method.
>
> Here is the test code:
>
> import org.apache.lucene.analysis.core.SimpleAnalyzer;
> import org.apache.lucene.document.Document;
> import org.apache.lucene.document.Field;
> import org.apache.lucene.document.FieldType;
> import org.apache.lucene.index.*;
> import org.apache.lucene.queryparser.classic.ParseException;
> import org.apache.lucene.queryparser.classic.QueryParser;
> import org.apache.lucene.search.IndexSearcher;
> import org.apache.lucene.search.Query;
> import org.apache.lucene.search.ScoreDoc;
> import org.apache.lucene.store.Directory;
> import org.apache.lucene.store.FSDirectory;
>
> import java.io.IOException;
> import java.nio.file.Paths;
>
> public class TestSearch {
> static SimpleAnalyzer analyzer = new SimpleAnalyzer();
>
> public static void main(String[] argvs) throws IOException,
> ParseException {
> generateIndex("5836962b0293a47b09d345f1");
> query("5836962b0293a47b09d345f1");
> delete("5836962b0293a47b09d345f1");
> query("5836962b0293a47b09d345f1");
>
> }
>
> public static void generateIndex(String id) throws IOException {
> Directory directory = FSDirectory.open(Paths.get("/
> tmp/test/lucene"));
> IndexWriterConfig config = new IndexWriterConfig(analyzer);
> IndexWriter iwriter = new IndexWriter(directory, config);
> FieldType fieldType = new FieldType();
> fieldType.setStored(true);
> fieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_
> AND_POSITIONS_AND_OFFSETS);
> Field idField = new Field("_id", id, fieldType);
> Document doc = new Document();
> doc.add(idField);
> iwriter.addDocument(doc);
> iwriter.close();
>
> }
>
> public static void query(String id) throws ParseException, IOException
> {
> Query query = new QueryParser("_id", analyzer).parse(id);
> Directory directory = FSDirectory.open(Paths.get("/
> tmp/test/lucene"));
> IndexReader ireader  = DirectoryReader.open(directory);
> IndexSearcher isearcher = new IndexSearcher(ireader);
> ScoreDoc[] scoreDoc = isearcher.search(query, 100).scoreDocs;
> for(ScoreDoc scdoc: scoreDoc){
> Document doc = isearcher.doc(scdoc.doc);
> System.out.println(doc.get("_id"));
> }
> }
>
> public static void delete(String id){
> try {
>  Directory directory =
> FSDirectory.open(Paths.get("/tmp/test/lucene"));
> IndexWriterConfig config = new IndexWriterConfig(analyzer);
> IndexWriter iwriter = new IndexWriter(directory, config);
> Term term = new Term("_id", id);
> iwriter.deleteDocuments(term);
> iwriter.commit();
> iwriter.close();
> }catch (IOException e){
> e.printStackTrace();
> }
> }
> }
>
>
> --
> 
> best regards & a nice day
> Zhao Ximing
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>


How FST constructed in lucene?

2017-02-17 Thread krish mohan
During search, does Lucene use the FST in the .tip file to match against
the terms? How will changes to the index be reflected in the FST? Will it
be re-constructed, or will the existing FST be updated in place?

In the case of wildcard and fuzzy queries, Lucene needs to test a large
number of terms. Will the FST help skip some portion of the terms during
the comparison?


unable to delete document via the IndexWriter.deleteDocuments(term) method

2017-02-17 Thread Armnotstrong
Hi, all:

I am using version 5.5.4, and I find I can't delete a document via the
IndexWriter.deleteDocuments(term) method.

Here is the test code:

import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.*;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

import java.io.IOException;
import java.nio.file.Paths;

public class TestSearch {
    static SimpleAnalyzer analyzer = new SimpleAnalyzer();

    public static void main(String[] argvs) throws IOException, ParseException {
        generateIndex("5836962b0293a47b09d345f1");
        query("5836962b0293a47b09d345f1");
        delete("5836962b0293a47b09d345f1");
        query("5836962b0293a47b09d345f1");
    }

    public static void generateIndex(String id) throws IOException {
        Directory directory = FSDirectory.open(Paths.get("/tmp/test/lucene"));
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        IndexWriter iwriter = new IndexWriter(directory, config);
        FieldType fieldType = new FieldType();
        fieldType.setStored(true);
        fieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
        Field idField = new Field("_id", id, fieldType);
        Document doc = new Document();
        doc.add(idField);
        iwriter.addDocument(doc);
        iwriter.close();
    }

    public static void query(String id) throws ParseException, IOException {
        Query query = new QueryParser("_id", analyzer).parse(id);
        Directory directory = FSDirectory.open(Paths.get("/tmp/test/lucene"));
        IndexReader ireader = DirectoryReader.open(directory);
        IndexSearcher isearcher = new IndexSearcher(ireader);
        ScoreDoc[] scoreDoc = isearcher.search(query, 100).scoreDocs;
        for (ScoreDoc scdoc : scoreDoc) {
            Document doc = isearcher.doc(scdoc.doc);
            System.out.println(doc.get("_id"));
        }
    }

    public static void delete(String id) {
        try {
            Directory directory = FSDirectory.open(Paths.get("/tmp/test/lucene"));
            IndexWriterConfig config = new IndexWriterConfig(analyzer);
            IndexWriter iwriter = new IndexWriter(directory, config);
            // deleteDocuments(term) matches the raw term exactly, with no
            // analysis; SimpleAnalyzer tokenizes the id at non-letters when
            // indexing, so nothing matches here (see Ian's reply).
            Term term = new Term("_id", id);
            iwriter.deleteDocuments(term);
            iwriter.commit();
            iwriter.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}


-- 

best regards & a nice day
Zhao Ximing

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Searcher Performance

2017-02-17 Thread Chitra R
Hi,
 While working with Searcher.search, I have noticed a difference in
search performance. I have 10 lakh (1 million) documents and 30 fields in
my index. I performed three searches using different queries, one after
another. At search time I used MMapDirectory, and the index was already
open.

*case1: *

   - For the first search, I ran the query (new TermQuery(new
   Term("name","Chitra"))), which yields 1 lakh documents as the result.
   Time taken: roughly 50-60 ms.
   - For the second search, I ran the query (new TermQuery(new
   Term("animal","lion"))), which also yields 1 lakh documents. Time
   taken: roughly 50-60 ms.
   - For the third search, I ran the query (new TermQuery(new
   Term("bird","peacock"))), which also yields 1 lakh documents. Time
   taken: roughly 50-60 ms.

In this case, why does searcher.search take nearly the same search time
for different queries?

*case2:*

If I run the same query twice, the second Searcher.search takes less time
than the first because of the OS cache.

*Based on the above observations:*

During the initial search, only the required portion of the index files is
loaded into the I/O cache. For the next search, if the required portion is
not present in the OS cache, will it take time to read those files from
disk? If so, is this the reason searcher.search takes nearly the same
search time for different queries?


Regards,
Chitra