Hi Uwe,

Thanks for the info. You mentioned the term dictionary and other index components, but I didn't follow that part.

- What other factors improve the speed of such queries? Could you explain or give some pointers?
- Can we do anything to improve the speed of such queries?
Also, one more observation: indexing time has increased by around 6% in 4.0.

Arun

On Fri, Mar 29, 2013 at 3:25 PM, Uwe Schindler <[email protected]> wrote:

> Hi,
>
> It depends on the type of wildcard query. If you only have a prefix (ab*),
> they rewrite to a simple PrefixQuery and this one is implemented exactly
> like in 3.x, so you only see the speed improvements of Lucene 4.0 in the
> term dictionary and other index components, not related to the query
> itself.
>
> If you have wildcards like ab?xy, then this query will be multiple times
> faster than in 3.x, because the "?" wildcard can only expand to a limited
> set of terms, while in Lucene 3.x, it still scans all terms with prefix
> "ab". The same applies to other wildcard constructs, if they limit more
> than just the prefix.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: [email protected]
>
>
> > -----Original Message-----
> > From: Arun Kumar K [mailto:[email protected]]
> > Sent: Friday, March 29, 2013 10:38 AM
> > To: java-user
> > Subject: Wild Card Query Performance
> >
> > Hi Guys,
> >
> > I have been testing the search time improvement in Lucene 4.0 over Lucene
> > 3.0.2 for Wildcard Queries (with at least 2 leading chars, e.g. ar*).
> >
> > For a 2GB index with 4000000 docs, the following observations were
> > made:
> >
> > Around 3X improvement with and without STRING sort on a sortable field.
> >
> > I guess this improvement is because of the AutomatonQuery by Robert
> > which is used in Wildcard Queries.
> >
> > As per Mike's blog, FuzzyQueries are 100X faster in 4.0 but these
> > wildcard queries are not that much faster by comparison.
> >
> > I have used the default codecs and postings format.
> >
> > Did I miss something or is it the max improvement that we can expect
> > currently for WildCard Queries?
> >
> > Arun
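
To make Uwe's point about ab?xy a bit more concrete, here is a toy sketch in plain Java. It is not Lucene code and does not reflect how Lucene implements its term dictionary or automaton queries; the class name WildcardScanSketch, the methods prefixScan and seekScan, and the sample terms are all made up for illustration. It assumes the term dictionary behaves like a sorted set of strings and handles only patterns of the form prefix + "?" + suffix. The first method walks every term sharing the prefix, as described for 3.x; the second jumps ahead in the sorted set once it knows no later term with the same character in the "?" position can match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Toy model only: a TreeSet stands in for a sorted term dictionary.
class WildcardScanSketch {
    // Number of terms the most recent scan landed on (for comparison only).
    static int visited;

    // Walk every term that shares the prefix and test each one against
    // the pattern prefix + any single char + suffix.
    static List<String> prefixScan(TreeSet<String> terms, String prefix, String suffix) {
        visited = 0;
        List<String> hits = new ArrayList<>();
        for (String t : terms.tailSet(prefix, true)) {
            if (!t.startsWith(prefix)) break; // left the prefix range
            visited++;
            boolean sameLength = t.length() == prefix.length() + 1 + suffix.length();
            if (sameLength && t.endsWith(suffix)) hits.add(t);
        }
        return hits;
    }

    // For each character that actually occurs in the "?" position, look up
    // the single possible match, then seek past all terms with that character.
    static List<String> seekScan(TreeSet<String> terms, String prefix, String suffix) {
        visited = 0;
        List<String> hits = new ArrayList<>();
        String cur = terms.higher(prefix); // first term strictly longer than the prefix
        while (cur != null && cur.startsWith(prefix)) {
            visited++;
            char c = cur.charAt(prefix.length());
            String candidate = prefix + c + suffix;
            if (terms.contains(candidate)) hits.add(candidate);
            if (c == Character.MAX_VALUE) break; // no larger char to seek to
            cur = terms.ceiling(prefix + (char) (c + 1));
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical sample terms.
        TreeSet<String> terms = new TreeSet<>(Arrays.asList(
            "ab", "abaxy", "abba", "abbey", "abbot", "abbxy",
            "abc", "abcde", "abcxy", "about", "above", "ac"));
        System.out.println(prefixScan(terms, "ab", "xy") + " visited=" + visited);
        System.out.println(seekScan(terms, "ab", "xy") + " visited=" + visited);
    }
}
```

Both methods return the same matches; the difference is only in how many terms each one touches. For a pure prefix pattern such as ab* there is nothing to skip, since every term under the prefix is a match, which fits Uwe's remark that prefix queries behave as in 3.x.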
