Query optimization

2016-07-13 Thread Midas A
http://
#:8983/solr/prod/select?q=id_path_ids:166&fq=sort_price:[0%20TO%20*]&fq=status:A&fq=company_status:A&facet=true&facet.mincount=1&facet.field=show_meta_id&facet.field=show_brand&facet.field=product_amount_available&facet.field=by_processor&facet.field=by_system_memory&facet.field=by_screen_size&facet.field=by_operating_system&facet.field=by_laptop_type&facet.field=by_processor_brand&facet.field=by_hard_drive_capacity&facet.field=by_touchscreen&facet.field=by_warranty&facet.field=by_graphic_memory&facet.field=is_trm&facet.field=show_merchant&facet.field=is_cod&facet.field=show_market&facet.query={!ex=p_r%20key=product_rating:[4-5]}product_rating:[4%20TO%205]&facet.query={!ex=p_r%20key=product_rating:[3-5]}product_rating:[3%20TO%205]&facet.query={!ex=p_r%20key=product_rating:[2-5]}product_rating:[2%20TO%205]&facet.query={!ex=p_r%20key=product_rating:[1-5]}product_rating:[1%20TO%205]&facet.query={!ex=m_r%20key=merchant_rating:[4-5]}merchant_rating:[4%20TO%205]&facet.query={!ex=m_r%20key=merchant_rating:[3-5]}merchant_rating:[3%20TO%205]&facet.query={!ex=m_r%20key=merchant_rating:[2-5]}merchant_rating:[2%20TO%205]&facet.query={!ex=m_r%20key=merchant_rating:[1-5]}merchant_rating:[1%20TO%205]&facet.limit=500&stats=true&stats.field=sort_price&start=0&rows=10&sort=product_amount_available%20desc,boost_index%20asc,popularity%20desc,is_cod%20desc


What kind of optimization can we do on the above query? It is taking 2400 ms.


Query optimization

2016-07-28 Thread Midas A
a) My index size is 10 GB, and query responses get slow for higher start values.
What should I do to optimize this query for higher start values?


query optimization

2019-07-02 Thread Midas A
Hi,

How can I optimize the following query? It is taking a long time.

 webapp=/solr path=/search params={
df=ttl&ps=0&hl=true&f.ind.mincount=1&hl.usePhraseHighlighter=true&lowercaseOperators=true&ps2=0&ps3=0&qf=contents^0.05+currdesig^1.5+predesig^1.5+lng^2+ttl+kw_skl+kw_it&sow=false&hl.fl=ttl,kw_skl,kw_it,contents&semantictermsttl=&f.cat.mincount=1&semanticfieldttl=ttl^0.1+currdesig^0.1+predesig^0.1&qs=0&qt=/resumesearch&semantictermsskl="mbbss"+OR+"medicine"&version=2&omitHeader=true&hl.q=mbbs,+"medical+officer",+doctor,+physician+("medical+officer")+"medical+officer"+"physician""+""general+physician""+""physicians""+""consultant+physician""+""house+physician"+"physician"+"doctor"+"mbbs"+"general+physician"+"physicians"+"consultant+physician"+"house+physician"&typeId=(293)&debugQuery=false&bq1=&echoParams=none&fl=id,upt&f.pref.mincount=1&q.op=OR&fq=NOT+contents:("liaise+with+medical+officer"+"worked+with+medical+officer"+"working+with+medical+officer"+"reported+to+medical+officer"+"references+are+medical+officer"+"coordinated+with+medical+officer"+"closely+with+medical+officer"+"signature+of+medical+officer"+"seal+of++medical+officer"+"liaise+with+physician"+"worked+with+physician"+"working+with+physician"+"reported+to+physician"+"references+are+physician"+"coordinated+with+physician"+"closely+with+physician"+"signature+of+physician"+"seal+of++physician"+"liaise+with+doctor"+"worked+with+doctor"+"working+with+doctor"+"reported+to+doctor"+"references+are+doctor"+"coordinated+with+doctor"+"closely+with+doctor"+"signature+of+doctor"+"seal+of++doctor")&fq=NOT+hemp:("xmwxagency"+"xmwxlimited"+"xmwxplacement"+"xmwxplus"+"xmwxprivate"+"xmwxsecurity"+"xmwxz2"+"xmwxand"+"xswxz2+plus+placement+and+security+agency+private+limited"+"xswxz2+plus+placement+and+security+agency+private"+"xswxz2+plus+placement+and+security+agency"+"xswxz2+plus+placement+and+security"+"xswxz2+plus+placement+and"+"xswxz2+plus+placement"+"xswxz2+plus"+"xswxz2")&fq=ctc:[100.0+TO+107.2]+OR+ctc:[-1.0+TO+-1.0]&fq=(dlh:(22))&fq=ind:(24++42++24++8)&fq=(rol:(292+293+294+322))&fq=(cat:(9))&fq=cat:(1000+OR+907+OR
+1+OR+2+OR+3+OR+786+OR+4+OR+5+OR+6+OR+7+OR+8+OR+9+OR+10+OR+11+OR+12+OR+13+OR+14+OR+785+OR+15+OR+16+OR+17+OR+18+OR+908+OR+19+OR+20+OR+21+OR+23+OR+24)&fq=NOT+is_udis:2&bq=is_resume:0^-1000&bq=upt_date:[*+TO+NOW/DAY-36MONTHS]^2&bq=upt_date:[NOW/DAY-36MONTHS+TO+NOW/DAY-24MONTHS]^3&bq=upt_date:[NOW/DAY-24MONTHS+TO+NOW/DAY-12MONTHS]^4&bq=upt_date:[NOW/DAY-12MONTHS+TO+NOW/DAY-9MONTHS]^5&bq=upt_date:[NOW/DAY-9MONTHS+TO+NOW/DAY-6MONTHS]^10&bq=upt_date:[NOW/DAY-6MONTHS+TO+NOW/DAY-3MONTHS]^15&bq=upt_date:[NOW/DAY-3MONTHS+TO+*]^20&bq=_query_:"{!edismax+qf%3Drol^2+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$typeId+q.op%3DOR+bq%3D\$bq1+bf%3D}"&bq=_query_:"{!edismax+qf%3Drol^2+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$typeId+q.op%3DOR+bq%3D\$bq1+bf%3D}"&bq=_query_:"{!edismax+qf%3Drol^2+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$typeId+q.op%3DOR+bq%3D\$bq1+bf%3D}"&bq=dlh:(22)^8&bq={!boost+b%3D4}+_query_:{!edismax+qf%3D"currdesig^8+predesig^6+ttl^3+kw_skl^2+contents"+v%3D"\"doctor\"+\"medical+officer\"+\"physician\""+q.op%3DAND+bq%3D}&bq=_query_:{!edismax+qf%3D"currdesig+predesig+ttl+kw_skl+contents^0.01"+v%3D"\"doctor\"+\"medical+officer\"+\"physician\""+q.op%3DOR+bq%3D}&bq=NOT+country:isoin^-10&facet.query=exp:[+10+TO+11+]&facet.query=exp:[+11+TO+13+]&facet.query=exp:[+13+TO+15+]&facet.query=exp:[+15+TO+17+]&facet.query=exp:[+17+TO+20+]&facet.query=exp:[+20+TO+25+]&facet.query=exp:[+25+TO+109+]&facet.query=ctc:[+100+TO+101+]&facet.query=ctc:[+101+TO+101.5+]&facet.query=ctc:[+101.5+TO+102+]&facet.query=ctc:[+102+TO+103+]&facet.query=ctc:[+103+TO+104+]&facet.query=ctc:[+104+TO+105+]&facet.query=ctc:[+105+TO+107.5+]&facet.query=ctc:[+107.5+TO+110+]&facet.query=ctc:[+110+TO+115+]&facet.query=ctc:[+115+TO+10100+]&f.cl.mincount=1&queryany3=(22)&wt=javabin&queryany2=(293)&queryany1=(294)&queryany0=(322)&facet.field=ind&facet.field=cat&facet.field=rol&facet.field=cl&facet.field=pref&debug=false&f.rol.mincount=1&start=0&rows=40&q=((mbbs+OR+_query_:"{!edisma
x+qf%3Ddlh+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany3+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+((("medical+officer")+OR+"medical+officer"~0)+OR+_query_:"{!edismax+qf%3Drol+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany0+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+(("doctor"+OR+doctor)+OR+_query_:"{!edismax+qf%3Drol+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany2+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+(("physician"+OR+"physicians"+OR+"general+physician"+OR+"house+physician"+OR+"consultant+physician"+OR+physician)+OR+_query_:"{!edismax+qf%3Drol+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D$queryany1+q.op%3DOR+bq%3D$bq1+bf%3D}")+OR+_query_:"{!edismax+qf%3D\$semanticfieldskl+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+pf3%3Did+ps3%3D1+v%3D\$semantictermsskl+q.op%3DOR+bq%3D\$bq1+bf%3D}"+OR+_query_:"{!edismax+qf%3D\$semanticfieldttl+pf%3Did+ps%3D1+pf2%3Did+ps2%3D1+

Re: Query optimization

2016-07-13 Thread sara hajili
Hi.
Facet queries take a long time. You can use a group query instead.
Or, for the fields in the schema that you run facet queries on,
set docValues=true
to get answers more quickly.


Re: Query optimization

2016-07-13 Thread Midas A
Is there any drawback to using docValues=true?



Re: Query optimization

2016-07-13 Thread sara hajili
As far as I know, when you use docValues=true,
Solr, when indexing a document,
also keeps the document and the docValues=true fields in memory, to use them for
faceting and for sorting query results.
So using docValues=true on many fields may use a lot of your system's memory.
But used sensibly, it can improve query response time.



Re: Query optimization

2016-07-13 Thread Erick Erickson
DocValues are now the preferred mechanism
whenever you need to sort, facet or group. It'll
make your on-disk index bigger, but the same
structure would otherwise have been built in Java's memory
if you didn't use DocValues, whereas if you do
it's MMap'd.

So overall, use DocValues by preference.

Best,
Erick
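
As an illustration of the advice above, here is a minimal Python sketch that builds a Schema API `replace-field` request body enabling docValues on one facet field. The field name `show_brand` comes from the original query; the field type `string` and the `indexed`/`stored` settings are assumptions, and so is the choice of the Schema API over editing the schema file by hand.

```python
import json


def docvalues_field_command(name, field_type, indexed=True, stored=False):
    """Build a Schema API 'replace-field' command that enables docValues.

    The field type and the indexed/stored flags are assumptions; take the
    real values from your own schema.
    """
    return {
        "replace-field": {
            "name": name,
            "type": field_type,
            "indexed": indexed,
            "stored": stored,
            "docValues": True,
        }
    }


# 'show_brand' is one of the facet fields in the original query;
# 'string' is an assumed field type.
payload = docvalues_field_command("show_brand", "string")
print(json.dumps(payload, indent=2))
```

The resulting JSON would be POSTed to the collection's `/schema` endpoint. Changing docValues on an existing field only takes effect for documents indexed afterwards, so a full reindex is needed.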



Re: Query optimization

2016-07-13 Thread Midas A
Hi,

One more thing I would like to add: we build facet queries over
dynamic fields, so my questions are:
a) Is there any harm in using docValues=true with dynamic fields?
b) Are there other suggestions we can implement to optimize this query? My index
size is 8 GB and the query is taking more than 3 seconds.

Regards,
Abhishek Tiwari



Re: Query optimization

2016-07-28 Thread Midas A
Please reply.

On Fri, Jul 29, 2016 at 10:26 AM, Midas A  wrote:

> a) my index size is 10 gb   for higher start is query response got slow .
> what should i do to optimize this query for higher start value in query
>


Re: Query optimization

2016-07-28 Thread Ahmet Arslan
Oops, I forgot the link:
http://yonik.com/solr/paging-and-deep-paging/
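
For readers who want a starting point, the cursor-based paging described at that link can be sketched roughly as follows in Python. This is only an outline under assumptions: `fetch` is a hypothetical caller-supplied function that sends the parameters to Solr and returns the parsed JSON response, and the sort is assumed to end with the uniqueKey field (taken here to be `id`), which cursor paging requires as a tie-breaker.

```python
def iter_all_docs(fetch, base_params, rows=100):
    """Yield every matching document using cursorMark paging.

    `fetch` is a hypothetical function: it takes a dict of request
    parameters and returns the parsed JSON response as a dict.
    """
    params = dict(base_params, rows=rows, cursorMark="*")
    # A cursor replaces the start offset, so drop any start parameter.
    params.pop("start", None)
    while True:
        response = fetch(params)
        for doc in response["response"]["docs"]:
            yield doc
        next_cursor = response["nextCursorMark"]
        # Solr returns the same cursor again once the result set is exhausted.
        if next_cursor == params["cursorMark"]:
            break
        params["cursorMark"] = next_cursor


def fake_fetch(params):
    """Stand-in for a real Solr call, serving three canned pages."""
    pages = {"*": (["a", "b"], "c1"), "c1": (["c"], "c2"), "c2": ([], "c2")}
    ids, next_cursor = pages[params["cursorMark"]]
    return {
        "response": {"docs": [{"id": doc_id} for doc_id in ids]},
        "nextCursorMark": next_cursor,
    }


# The sort must include the uniqueKey field; 'id' is an assumption.
query = {"q": "id_path_ids:166", "sort": "popularity desc,id asc", "start": 0}
all_ids = [doc["id"] for doc in iter_all_docs(fake_fetch, query, rows=2)]
print(all_ids)
```

Unlike a large `start` value, each cursor request only has to collect `rows` documents, so the cost per page stays roughly constant however deep the client pages. The trade-off is that a cursor can only move forward page by page; it cannot jump to an arbitrary page.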






Re: Query optimization

2016-07-28 Thread Ahmet Arslan
Hi Midas,

Please search for 'deep paging' in the documentation, mailing list, etc.
Solr Deep Paging and Sorting


Ahmet


Filter query optimization

2009-10-19 Thread Jason Rutherglen
If a filter query matches nothing, then no additional query should be
performed and no results returned?  I don't think we have this today?


Search query optimization

2008-05-29 Thread Yongjun Rong
Hi,
  I have a question about how the Lucene query parser works. For example, I
have the query "A AND B AND C". Will Lucene load all documents that satisfy
condition A into memory and then filter them by conditions B and C? Or will only
the documents satisfying "A AND B AND C" be put into memory? Are
there any articles that discuss how to build an optimized query to
save memory and improve performance?
  Thank you very much.
  Yongjun Rong


Function Query Optimization

2020-12-11 Thread Jae Joo
I have a requirement to create a field, xyz, to be returned based on the
matched result.
Here is the code.

XYZ:concat(
  if(exists(query({!v='field1:12345'})), '12345', ''),
  if(exists(query({!v='field1:23456'})), '23456', ''),
  if(exists(query({!v='field1:34567'})), '34567', ''),
  if(exists(query({!v='field:45678'})), '45678', '')
),

I feel this is very complex, so I am looking for a smarter and
faster approach.

Thanks,

Jae
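
One way to keep an expression like the one above manageable is to generate it from a list of values rather than writing each clause by hand. The Python sketch below does only that: it reproduces the same `concat`/`if`/`exists`/`query` structure and does not by itself make the query faster. The helper name `build_concat_field` is hypothetical, and it assumes every clause tests the same field (`field1`), whereas the last clause in the original tests `field`.

```python
def build_concat_field(alias, field, values):
    """Return an fl pseudo-field that concatenates each value that matched.

    One if(exists(query(...))) clause is emitted per value, mirroring the
    hand-written expression in the message.
    """
    clauses = [
        "if(exists(query({!v='%s:%s'})), '%s', '')" % (field, value, value)
        for value in values
    ]
    return "%s:concat(\n  %s\n)" % (alias, ",\n  ".join(clauses))


# Assumes all four values are looked up in 'field1'.
expression = build_concat_field("XYZ", "field1", ["12345", "23456", "34567", "45678"])
print(expression)
```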


Re: query optimization

2019-07-03 Thread Midas A
Please suggest here


Re: query optimization

2019-07-06 Thread Mikhail Khludnev
https://lucene.apache.org/solr/guide/6_6/common-query-parameters.html#CommonQueryParameters-ThedebugParameter
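
The logged request sets `debug=false`. Re-running it with `debug=timing` makes Solr report how long each search component took, which shows whether the main query, the facets or the highlighting dominates. A rough Python sketch for reading that section of a JSON response follows; the sample response and all of its numbers are invented for illustration and are not measurements from this thread.

```python
def slowest_components(response, phase="process"):
    """Return (component, milliseconds) pairs for one phase, slowest first.

    `response` is the parsed JSON of a request sent with debug=timing.
    """
    timing = response["debug"]["timing"][phase]
    per_component = {
        name: entry["time"]
        for name, entry in timing.items()
        if isinstance(entry, dict)  # skips the phase's own total "time" value
    }
    return sorted(per_component.items(), key=lambda item: item[1], reverse=True)


# Invented sample in the shape Solr uses for the timing section.
sample = {
    "debug": {
        "timing": {
            "time": 2400.0,
            "prepare": {"time": 5.0, "query": {"time": 5.0}},
            "process": {
                "time": 2395.0,
                "query": {"time": 300.0},
                "facet": {"time": 1900.0},
                "highlight": {"time": 150.0},
                "stats": {"time": 45.0},
            },
        }
    }
}

ranking = slowest_components(sample)
print(ranking)
```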



Re: Filter query optimization

2009-10-19 Thread Yonik Seeley
On Mon, Oct 19, 2009 at 2:55 PM, Jason Rutherglen
 wrote:
> If a filter query matches nothing, then no additional query should be
> performed and no results returned?  I don't think we have this today?

No, but this is a fast operation anyway (In Solr 1.4 at least).

Another thing to watch out for is to not try this with filters that
you don't know the size of (or else you may force a popcount on a
BitDocSet that would not otherwise have been needed).

It could also potentially complicate warming queries - need to be
careful that the combination of filters you are warming with matches
something, or it would cause the fieldCache entries to not be
populated.

-Yonik
http://www.lucidimagination.com


Re: Filter query optimization

2009-10-19 Thread Jason Rutherglen
Yonik,

> this is a fast operation anyway

Can you elaborate on why this is a fast operation?

Basically there's a distributed query with a filter, where on a
number of the servers, the filter query isn't matching anything,
however I'm seeing load on those servers (where nothing
matches), so I'm assuming the filter is generated (and cached)
which is fine, then the user query is being performed on a
filter where no documents match. I could be misinterpreting the
data, however, I want to find out about this use case regardless
as it likely will crop up again for us.

-J

On Mon, Oct 19, 2009 at 12:07 PM, Yonik Seeley
 wrote:
> On Mon, Oct 19, 2009 at 2:55 PM, Jason Rutherglen
>  wrote:
>> If a filter query matches nothing, then no additional query should be
>> performed and no results returned?  I don't think we have this today?
>
> No, but this is a fast operation anyway (In Solr 1.4 at least).
>
> Another thing to watch out for is to not try this with filters that
> you don't know the size of (or else you may force a popcount on a
> BitDocSet that would not otherwise have been needed).
>
> It could also potentially complicate warming queries - need to be
> careful that the combination of filters you are warming with matches
> something, or it would cause the fieldCache entries to not be
> populated.
>
> -Yonik
> http://www.lucidimagination.com
>


Re: Filter query optimization

2009-10-19 Thread Yonik Seeley
On Mon, Oct 19, 2009 at 4:45 PM, Jason Rutherglen
 wrote:
> Yonik,
>
>> this is a fast operation anyway
>
> Can you elaborate on why this is a fast operation?

The scorers will never really be used.
The query will be weighted and scorers will be created, but the filter
will be checked first and return NO_MORE_DOCS.
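
A rough sketch of why an empty filter is cheap, assuming the search loop is driven by the filter and the scorer is only consulted for documents the filter accepts. The names CountingScorer and filtered_search, and the sentinel value, are made up for this illustration; this is not Lucene's actual code.

```python
from bisect import bisect_left

# Sentinel meaning "iterator exhausted" (value chosen for this sketch only).
NO_MORE_DOCS = 2**31 - 1


class CountingScorer:
    """Hypothetical scorer over a sorted doc-id list that counts advance() calls."""

    def __init__(self, docs):
        self.docs = sorted(docs)
        self.calls = 0

    def advance(self, target):
        # Return the first matching doc id >= target, or NO_MORE_DOCS.
        self.calls += 1
        i = bisect_left(self.docs, target)
        return self.docs[i] if i < len(self.docs) else NO_MORE_DOCS


def filtered_search(filter_docs, scorer):
    """Walk the filter first; ask the scorer only about docs the filter accepts."""
    hits = []
    filter_iter = iter(sorted(filter_docs))
    doc = next(filter_iter, NO_MORE_DOCS)
    while doc != NO_MORE_DOCS:
        if scorer.advance(doc) == doc:
            hits.append(doc)
        doc = next(filter_iter, NO_MORE_DOCS)
    return hits


empty_case = CountingScorer([1, 5, 9])
no_hits = filtered_search([], empty_case)        # filter matches nothing
normal_case = CountingScorer([1, 5, 9])
some_hits = filtered_search([5, 6, 9], normal_case)
```

With an empty filter the loop body never runs, so the scorer exists but is never advanced. Any load seen on such servers would then have to come from somewhere other than scoring.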

-Yonik
http://www.lucidimagination.com

> Basically there's a distributed query with a filter, where on a
> number of the servers, the filter query isn't matching anything,
> however I'm seeing load on those servers (where nothing
> matches), so I'm assuming the filter is generated (and cached)
> which is fine, then the user query is being performed on a
> filter where no documents match. I could be misinterpreting the
> data, however, I want to find out about this use case regardless
> as it likely will crop up again for us.
>
> -J
>
> On Mon, Oct 19, 2009 at 12:07 PM, Yonik Seeley
>  wrote:
>> On Mon, Oct 19, 2009 at 2:55 PM, Jason Rutherglen
>>  wrote:
>>> If a filter query matches nothing, then no additional query should be
>>> performed and no results returned?  I don't think we have this today?
>>
>> No, but this is a fast operation anyway (In Solr 1.4 at least).
>>
>> Another thing to watch out for is to not try this with filters that
>> you don't know the size of (or else you may force a popcount on a
>> BitDocSet that would not otherwise have been needed).
>>
>> It could also potentially complicate warming queries - need to be
>> careful that the combination of filters you are warming with matches
>> something, or it would cause the fieldCache entries to not be
>> populated.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>


Re: Filter query optimization

2009-10-19 Thread Jason Rutherglen
Ok, thanks, new Lucene 2.9 features.

On Mon, Oct 19, 2009 at 2:33 PM, Yonik Seeley
 wrote:
> On Mon, Oct 19, 2009 at 4:45 PM, Jason Rutherglen
>  wrote:
>> Yonik,
>>
>>> this is a fast operation anyway
>>
>> Can you elaborate on why this is a fast operation?
>
> The scorers will never really be used.
> The query will be weighted and scorers will be created, but the filter
> will be checked first and return NO_MORE_DOCS.
>
> -Yonik
> http://www.lucidimagination.com
>
>> Basically there's a distributed query with a filter, where on a
>> number of the servers, the filter query isn't matching anything,
>> however I'm seeing load on those servers (where nothing
>> matches), so I'm assuming the filter is generated (and cached)
>> which is fine, then the user query is being performed on a
>> filter where no documents match. I could be misinterpreting the
>> data, however, I want to find out about this use case regardless
>> as it likely will crop up again for us.
>>
>> -J
>>
>> On Mon, Oct 19, 2009 at 12:07 PM, Yonik Seeley
>>  wrote:
>>> On Mon, Oct 19, 2009 at 2:55 PM, Jason Rutherglen
>>>  wrote:
>>>> If a filter query matches nothing, then no additional query should be
>>>> performed and no results returned?  I don't think we have this today?
>>>
>>> No, but this is a fast operation anyway (In Solr 1.4 at least).
>>>
>>> Another thing to watch out for is to not try this with filters that
>>> you don't know the size of (or else you may force a popcount on a
>>> BitDocSet that would not otherwise have been needed).
>>>
>>> It could also potentially complicate warming queries - need to be
>>> careful that the combination of filters you are warming with matches
>>> something, or it would cause the fieldCache entries to not be
>>> populated.
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>
>


Re: Search query optimization

2008-05-29 Thread Yonik Seeley
On Thu, May 29, 2008 at 4:05 PM, Yongjun Rong <[EMAIL PROTECTED]> wrote:
>  I have a question about how the lucene query parser. For example, I
> have query "A AND B AND C". Will lucene extract all documents satisfy
> condition A in memory and then filter it with condition B and C?

No, Lucene will try and optimize this the best it can.

It roughly goes like this..
docnum = find_match("A")
docnum = find_first_match_after(docnum, "B")
docnum = find_first_match_after(docnum,"C")
etc...
until the same docnum is returned for "A","B", and "C".

See ConjunctionScorer for the gritty details.
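
The leapfrog loop above can be sketched as runnable code. This is only an illustration under simple assumptions: each clause is a sorted Python list of doc ids, and a binary search stands in for whatever skipping the real postings provide. The function names are invented here; it is not the ConjunctionScorer implementation.

```python
from bisect import bisect_left


def first_match_at_or_after(postings, target):
    """Return the first doc id >= target in a sorted postings list, or None."""
    i = bisect_left(postings, target)
    return postings[i] if i < len(postings) else None


def conjunction(*posting_lists):
    """Yield doc ids that appear in every sorted postings list."""
    if not posting_lists:
        return
    candidate = 0
    while True:
        agreed = True
        for postings in posting_lists:
            doc = first_match_at_or_after(postings, candidate)
            if doc is None:
                return  # one clause is exhausted, so there are no more matches
            if doc != candidate:
                candidate = doc  # jump ahead and re-check every clause
                agreed = False
                break
        if agreed:
            yield candidate  # all clauses returned the same docnum
            candidate += 1


a = [1, 4, 7, 9, 12]
b = [2, 4, 9, 10, 12]
c = [4, 5, 9, 12, 20]
matches = list(conjunction(a, b, c))
```

No intermediate set of "all documents matching A" is ever built; only the current candidate docnum is held at any time.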

-Yonik



> or only
> the documents satisfying "A AND B AND C" will be put into memory? Is
> there any articles discuss about how to build a optimization query to
> save memory and improve performance?
>  Thank you very much.
>  Yongjun Rong
>


RE: Search query optimization

2008-05-29 Thread Yongjun Rong
Hi Yonik,
  Thanks for your quick reply. I'm very new to the Lucene source code.
Can you give me a slightly more detailed explanation of this?
Do you think it will save some memory if docnum = find_match("A") >
docnum = find_match("B") and we put B at the front of the AND query, like "B
AND A AND C"? How about sorting (sort=A,B,C&q=A AND B AND C)? Do you
think the order of the conditions (A,B,C) in a query will affect the
performance of the query?
  Thank you very much.
  Yongjun


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
Seeley
Sent: Thursday, May 29, 2008 4:12 PM
To: solr-user@lucene.apache.org
Subject: Re: Search query optimization

On Thu, May 29, 2008 at 4:05 PM, Yongjun Rong <[EMAIL PROTECTED]>
wrote:
>  I have a question about how the lucene query parser. For example, I 
> have query "A AND B AND C". Will lucene extract all documents satisfy 
> condition A in memory and then filter it with condition B and C?

No, Lucene will try and optimize this the best it can.

It roughly goes like this..
docnum = find_match("A")
docnum = find_first_match_after(docnum, "B")
docnum = find_first_match_after(docnum,"C")
etc...
until the same docnum is returned for "A","B", and "C".

See ConjunctionScorer for the gritty details.

-Yonik



> or only
> the documents satisfying "A AND B AND C" will be put into memory? Is 
> there any articles discuss about how to build a optimization query to 
> save memory and improve performance?
>  Thank you very much.
>  Yongjun Rong
>


Re: Search query optimization

2008-05-29 Thread Walter Underwood
The people working on Lucene are pretty smart, and this sort of
query optimization is a well-known trick, so I would not worry
about it.

A dozen years ago at Infoseek, we checked the count of matches
for each term in an AND, and evaluated the smallest one first.
If any of them had zero matches, we didn't evaluate any of them.

I expect that Doug Cutting and the other Lucene folk know those
same tricks.
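
The ordering trick described above can be sketched in a few lines. This assumes the match count for each term is already known; the function name and the example counts are invented for illustration and do not come from Infoseek or Lucene.

```python
def plan_conjunction(doc_freqs):
    """Order AND clauses rarest first; return [] if any clause matches nothing.

    doc_freqs maps a clause name to its (hypothetical) number of matching docs.
    """
    if any(count == 0 for count in doc_freqs.values()):
        return []  # an AND with an empty clause cannot match anything
    return sorted(doc_freqs, key=doc_freqs.get)


order = plan_conjunction({"A": 412800, "B": 384, "C": 5000})
skipped = plan_conjunction({"A": 412800, "B": 0})
```

Evaluating the rarest clause first keeps the candidate set small for every later clause.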

wunder

On 5/29/08 1:50 PM, "Yongjun Rong" <[EMAIL PROTECTED]> wrote:

> Hi Yonik,
>   Thanks for your quick reply. I'm very new to the lucene source code.
> Can you give me a little more detail explaination about this.
> Do you think it will save some memory if docnum = find_match("A") >
> docnum = find_match("B") and put B in the front of the AND query like "B
> AND A AND C"? How about sorting (sort=A,B,C&q=A AND B AND C)? Do you
> think the order of conditions (A,B,C) in a query will affect the
> performance of the query?
>   Thank you very much.
>   Yongjun
> 
> 
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
> Seeley
> Sent: Thursday, May 29, 2008 4:12 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Search query optimization
> 
> On Thu, May 29, 2008 at 4:05 PM, Yongjun Rong <[EMAIL PROTECTED]>
> wrote:
>>  I have a question about how the lucene query parser. For example, I
>> have query "A AND B AND C". Will lucene extract all documents satisfy
>> condition A in memory and then filter it with condition B and C?
> 
> No, Lucene will try and optimize this the best it can.
> 
> It roughly goes like this..
> docnum = find_match("A")
> docnum = find_first_match_after(docnum, "B")
> docnum = find_first_match_after(docnum,"C")
> etc...
> until the same docnum is returned for "A","B", and "C".
> 
> See ConjunctionScorer for the gritty details.
> 
> -Yonik
> 
> 
> 
>> or only
>> the documents satisfying "A AND B AND C" will be put into memory? Is
>> there any articles discuss about how to build a optimization query to
>> save memory and improve performance?
>>  Thank you very much.
>>  Yongjun Rong
>> 



RE: Search query optimization

2008-06-17 Thread Yongjun Rong
Hi,
  Thanks for your reply. I ran some tests on my test machine.
http://stage.boomi.com:8080/solr/select/?q=account:1&rows=1000
returns 384 results in 3 ms. If I add a new AND condition, as below:
http://stage.boomi.com:8080/solr/select/?q=account:1+AND+recordeddate_dt:[NOW/DAYS-7DAYS+TO+NOW]&rows=1000
it takes 18236 ms to return 21 results. If I use only the recordeddate_dt
condition:
http://stage.boomi.com:8080/solr/select/?q=recordeddate_dt:[NOW/DAYS-7DAYS+TO+NOW]&rows=1000
it takes 20271 ms to return 412800 results. All of the above URLs are
live, so you can test them.

Can anyone explain why this happens if we have the query optimization?
Thank you very much.
Yongjun Rong
 

-----Original Message-----
From: Walter Underwood [mailto:[EMAIL PROTECTED] 
Sent: Thursday, May 29, 2008 4:57 PM
To: solr-user@lucene.apache.org
Subject: Re: Search query optimization

The people working on Lucene are pretty smart, and this sort of query
optimization is a well-known trick, so I would not worry about it.

A dozen years ago at Infoseek, we checked the count of matches for each
term in an AND, and evaluated the smallest one first.
If any of them had zero matches, we didn't evaluate any of them.

I expect that Doug Cutting and the other Lucene folk know those same
tricks.

wunder

On 5/29/08 1:50 PM, "Yongjun Rong" <[EMAIL PROTECTED]> wrote:

> Hi Yonik,
>   Thanks for your quick reply. I'm very new to the lucene source code.
> Can you give me a little more detail explaination about this.
> Do you think it will save some memory if docnum = find_match("A") > 
> docnum = find_match("B") and put B in the front of the AND query like 
> "B AND A AND C"? How about sorting (sort=A,B,C&q=A AND B AND C)? Do 
> you think the order of conditions (A,B,C) in a query will affect the 
> performance of the query?
>   Thank you very much.
>   Yongjun
> 
> 
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik 
> Seeley
> Sent: Thursday, May 29, 2008 4:12 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Search query optimization
> 
> On Thu, May 29, 2008 at 4:05 PM, Yongjun Rong <[EMAIL PROTECTED]>
> wrote:
>>  I have a question about how the lucene query parser. For example, I 
>> have query "A AND B AND C". Will lucene extract all documents satisfy

>> condition A in memory and then filter it with condition B and C?
> 
> No, Lucene will try and optimize this the best it can.
> 
> It roughly goes like this..
> docnum = find_match("A")
> docnum = find_first_match_after(docnum, "B")
> docnum = find_first_match_after(docnum,"C")
> etc...
> until the same docnum is returned for "A","B", and "C".
> 
> See ConjunctionScorer for the gritty details.
> 
> -Yonik
> 
> 
> 
>> or only
>> the documents satisfying "A AND B AND C" will be put into memory? Is 
>> there any articles discuss about how to build a optimization query to

>> save memory and improve performance?
>>  Thank you very much.
>>  Yongjun Rong
>> 



Re: Search query optimization

2008-06-17 Thread Otis Gospodnetic
Hi,

Probably because the [NOW/DAYS-7DAYS+TO+NOW] part gets rewritten as lots of OR 
clauses.  I think that you'll see that if you add &debugQuery=true to the URL.  
Make sure your recordeddate_dt is not too granular (e.g. if you don't need 
minutes, round the values to hours. If you don't need hours, round the values 
to days).


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: Yongjun Rong <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 17, 2008 11:56:06 AM
> Subject: RE: Search query optimization
> 
> Hi,
>   Thanks for your reply. I did some test on my test machine. 
> http://stage.boomi.com:8080/solr/select/?q=account:1&rows=1000. It will
> return resultset 384 in 3ms. If I add a new AND condition as below:
> http://stage.boomi.com:8080/solr/select/?q=account:1+AND+recordeddate_dt
> :[NOW/DAYS-7DAYS+TO+NOW]&rows=1000. It will take 18236 to return 21
> resultset. If I only use the recordedate_dt condition like
> http://stage.boomi.com:8080/solr/select/?q=recordeddate_dt:[NOW/DAYS-7DA
> YS+TO+NOW]&rows=1000. It takes 20271 ms to get 412800 results. All the
> above URL are live, you test it.
> 
> Can anyone give me some explaination why this happens if we have the
> query optimization? Thank you very much.
> Yongjun Rong
> 
> 
> -Original Message-
> From: Walter Underwood [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, May 29, 2008 4:57 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Search query optimization
> 
> The people working on Lucene are pretty smart, and this sort of query
> optimization is a well-known trick, so I would not worry about it.
> 
> A dozen years ago at Infoseek, we checked the count of matches for each
> term in an AND, and evaluated the smallest one first.
> If any of them had zero matches, we didn't evaluate any of them.
> 
> I expect that Doug Cutting and the other Lucene folk know those same
> tricks.
> 
> wunder
> 
> On 5/29/08 1:50 PM, "Yongjun Rong" wrote:
> 
> > Hi Yonik,
> >   Thanks for your quick reply. I'm very new to the lucene source code.
> > Can you give me a little more detail explaination about this.
> > Do you think it will save some memory if docnum = find_match("A") > 
> > docnum = find_match("B") and put B in the front of the AND query like 
> > "B AND A AND C"? How about sorting (sort=A,B,C&q=A AND B AND C)? Do 
> > you think the order of conditions (A,B,C) in a query will affect the 
> > performance of the query?
> >   Thank you very much.
> >   Yongjun
> >
> > 
> > -Original Message-
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik 
> > Seeley
> > Sent: Thursday, May 29, 2008 4:12 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Search query optimization
> > 
> > On Thu, May 29, 2008 at 4:05 PM, Yongjun Rong 
> > wrote:
> >>  I have a question about how the lucene query parser. For example, I 
> >> have query "A AND B AND C". Will lucene extract all documents satisfy
> 
> >> condition A in memory and then filter it with condition B and C?
> > 
> > No, Lucene will try and optimize this the best it can.
> > 
> > It roughly goes like this..
> > docnum = find_match("A")
> > docnum = find_first_match_after(docnum, "B")
> > docnum = find_first_match_after(docnum,"C")
> > etc...
> > until the same docnum is returned for "A","B", and "C".
> > 
> > See ConjunctionScorer for the gritty details.
> > 
> > -Yonik
> > 
> > 
> > 
> >> or only
> >> the documents satisfying "A AND B AND C" will be put into memory? Is 
> >> there any articles discuss about how to build a optimization query to
> 
> >> save memory and improve performance?
> >>  Thank you very much.
> >>  Yongjun Rong
> >> 



RE: Search query optimization

2008-06-17 Thread Yongjun Rong
Thanks for reply. Here is the debugQuery output:

account:1 AND recordeddate_dt:[NOW/DAYS-1DAYS TO NOW]

account:1 AND recordeddate_dt:[NOW/DAYS-1DAYS TO NOW]

+account:1 +recordeddate_dt:[2008-06-16T00:00:00.000Z TO 2008-06-17T17:07:57.420Z]

+account:1 +recordeddate_dt:[2008-06-16T00:00:00.000 TO 2008-06-17T17:07:57.420]

10.88071 = (MATCH) sum of:
  10.788804 = (MATCH) weight(account:1 in 6515410), product of:
    0.9957678 = queryWeight(account:1), product of:
      10.834659 = idf(docFreq=348, numDocs=6515640)
      0.09190578 = queryNorm
    10.834659 = (MATCH) fieldWeight(account:1 in 6515410), product of:
      1.0 = tf(termFreq(account:1)=1)
      10.834659 = idf(docFreq=348, numDocs=6515640)
      1.0 = fieldNorm(field=account, doc=6515410)
  0.09190578 = (MATCH) ConstantScoreQuery(recordeddate_dt:[2008-06-16T00:00:00.000-2008-06-17T17:07:57.420]), product of:
    1.0 = boost
    0.09190578 = queryNorm
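
For readers following the explain output, the arithmetic can be reproduced in a few lines. This is a sketch that assumes the classic Lucene idf formula, 1 + ln(numDocs / (docFreq + 1)); queryNorm is simply copied from the output rather than derived.

```python
import math

# Values copied from the explain output above.
num_docs = 6515640
doc_freq = 348
query_norm = 0.09190578

# Assumed formula: classic Lucene idf = 1 + ln(numDocs / (docFreq + 1)).
idf = 1.0 + math.log(num_docs / (doc_freq + 1))

query_weight = idf * query_norm      # queryWeight(account:1)
field_weight = 1.0 * idf * 1.0       # tf * idf * fieldNorm
term_score = query_weight * field_weight

range_score = 1.0 * query_norm       # ConstantScoreQuery: boost * queryNorm
total = term_score + range_score

print(round(idf, 4), round(term_score, 4), round(total, 4))
```

The date range contributes a constant to the score, so it does not affect ranking here. The output only shows how the score is built and says nothing about where the query time goes.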


 

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 17, 2008 12:43 PM
To: solr-user@lucene.apache.org
Subject: Re: Search query optimization

Hi,

Probably because the [NOW/DAYS-7DAYS+TO+NOW] part gets rewritten as lots of OR 
clauses.  I think that you'll see that if you add &debugQuery=true to the URL.  
Make sure your recorded_date_dt is not too granular (e.g. if you don't need 
minutes, round the values to hours. If you don't need hours, round the values 
to days).


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: Yongjun Rong <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 17, 2008 11:56:06 AM
> Subject: RE: Search query optimization
> 
> Hi,
>   Thanks for your reply. I did some test on my test machine. 
> http://stage.boomi.com:8080/solr/select/?q=account:1&rows=1000. It 
> will return resultset 384 in 3ms. If I add a new AND condition as below:
> http://stage.boomi.com:8080/solr/select/?q=account:1+AND+recordeddate_
> dt :[NOW/DAYS-7DAYS+TO+NOW]&rows=1000. It will take 18236 to return 21 
> resultset. If I only use the recordedate_dt condition like 
> http://stage.boomi.com:8080/solr/select/?q=recordeddate_dt:[NOW/DAYS-7
> DA
> YS+TO+NOW]&rows=1000. It takes 20271 ms to get 412800 results. All the
> above URL are live, you test it.
> 
> Can anyone give me some explaination why this happens if we have the 
> query optimization? Thank you very much.
> Yongjun Rong
> 
> 
> -Original Message-
> From: Walter Underwood [mailto:[EMAIL PROTECTED]
> Sent: Thursday, May 29, 2008 4:57 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Search query optimization
> 
> The people working on Lucene are pretty smart, and this sort of query 
> optimization is a well-known trick, so I would not worry about it.
> 
> A dozen years ago at Infoseek, we checked the count of matches for 
> each term in an AND, and evaluated the smallest one first.
> If any of them had zero matches, we didn't evaluate any of them.
> 
> I expect that Doug Cutting and the other Lucene folk know those same 
> tricks.
> 
> wunder
> 
> On 5/29/08 1:50 PM, "Yongjun Rong" wrote:
> 
> > Hi Yonik,
> >   Thanks for your quick reply. I'm very new to the lucene source code.
> > Can you give me a little more detail explaination about this.
> > Do you think it will save some memory if docnum = find_match("A") > 
> > docnum = find_match("B") and put B in the front of the AND query 
> > like "B AND A AND C"? How about sorting (sort=A,B,C&q=A AND B AND 
> > C)? Do you think the order of conditions (A,B,C) in a query will 
> > affect the performance of the query?
> >   Thank you very much.
> >   Yongjun
> >
> > 
> > -Original Message-
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of 
> > Yonik Seeley
> > Sent: Thursday, May 29, 2008 4:12 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Search query optimization
> > 
> > On Thu, May 29, 2008 at 4:05 PM, Yongjun Rong
> > wrote:
> >>  I have a question about how the lucene query parser. For example, 
> >> I have query "A AND B AND C". Will lucene extract all documents 
> >> satisfy
> 
> >> condition A in memory and then filter it with condition B and C?
> > 
> > No, Lucene will try and optimize this the best it can.
> > 
> > It roughly goes like this..
> > docnum = find_match("A")
> > docnum = find_first_match_after(docnum, "B")
> > docnum = find_first_match_after(docnum,"C")
> > etc...
> > until the same docnum is returned for "A","B", and "C".
> > 
> > See ConjunctionScorer for the gritty details.
> > 
> > -Yonik
> > 
> > 
> > 
> >> or only
> >> the documents satisfying "A AND B AND C" will be put into memory? 
> >> Is there any articles discuss about how to build a optimization 
> >> query to
> 
> >> save memory and improve performance?
> >>  Thank you very much.
> >>  Yongjun Rong
> >> 



Re: Search query optimization

2008-06-17 Thread Otis Gospodnetic
Hi,

This is what I was talking about:

recordeddate_dt:[2008-06-16T00:00:00.000Z TO 2008-06-17T17:07:57.420Z]

Note that the granularity of this date field is down to milliseconds.  You 
should change that to be more coarse if you don't need such precision (e.g. no 
milliseconds, no seconds, no minutes, no hours...)


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: Yongjun Rong <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 17, 2008 1:09:19 PM
> Subject: RE: Search query optimization
> 
> Thanks for reply. Here is the debugQuery output:
> 
> account:1 AND recordeddate_dt:[NOW/DAYS-1DAYS TO NOW]
> 
> account:1 AND recordeddate_dt:[NOW/DAYS-1DAYS TO NOW]
> 
> +account:1 +recordeddate_dt:[2008-06-16T00:00:00.000Z TO 2008-06-17T17:07:57.420Z]
> 
> +account:1 +recordeddate_dt:[2008-06-16T00:00:00.000 TO 2008-06-17T17:07:57.420]
> 
> id=e03dbd92-3d41-4693-8b69-ac9a0d332446-atom-d52484f5-7aa8-40b3-ad6f-ba3a9071999e,internal_docid=6515410
> 
> 10.88071 = (MATCH) sum of:
>   10.788804 = (MATCH) weight(account:1 in 6515410), product of:
>     0.9957678 = queryWeight(account:1), product of:
>       10.834659 = idf(docFreq=348, numDocs=6515640)
>       0.09190578 = queryNorm
>     10.834659 = (MATCH) fieldWeight(account:1 in 6515410), product of:
>       1.0 = tf(termFreq(account:1)=1)
>       10.834659 = idf(docFreq=348, numDocs=6515640)
>       1.0 = fieldNorm(field=account, doc=6515410)
>   0.09190578 = (MATCH) ConstantScoreQuery(recordeddate_dt:[2008-06-16T00:00:00.000-2008-06-17T17:07:57.420]), product of:
>     1.0 = boost
>     0.09190578 = queryNorm
> 
> 
>  
> 
> -----Original Message-
> From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, June 17, 2008 12:43 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Search query optimization
> 
> Hi,
> 
> Probably because the [NOW/DAYS-7DAYS+TO+NOW] part gets rewritten as lots of 
> OR 
> clauses.  I think that you'll see that if you add &debugQuery=true to the 
> URL.  
> Make sure your recorded_date_dt is not too granular (e.g. if you don't need 
> minutes, round the values to hours. If you don't need hours, round the values 
> to 
> days).
> 
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> - Original Message 
> > From: Yongjun Rong 
> > To: solr-user@lucene.apache.org
> > Sent: Tuesday, June 17, 2008 11:56:06 AM
> > Subject: RE: Search query optimization
> > 
> > Hi,
> >   Thanks for your reply. I did some test on my test machine. 
> > http://stage.boomi.com:8080/solr/select/?q=account:1&rows=1000. It 
> > will return resultset 384 in 3ms. If I add a new AND condition as below:
> > http://stage.boomi.com:8080/solr/select/?q=account:1+AND+recordeddate_ 
> > dt :[NOW/DAYS-7DAYS+TO+NOW]&rows=1000. It will take 18236 to return 21 
> > resultset. If I only use the recordedate_dt condition like 
> > http://stage.boomi.com:8080/solr/select/?q=recordeddate_dt:[NOW/DAYS-7
> > DA
> > YS+TO+NOW]&rows=1000. It takes 20271 ms to get 412800 results. All the
> > above URL are live, you test it.
> > 
> > Can anyone give me some explaination why this happens if we have the 
> > query optimization? Thank you very much.
> > Yongjun Rong
> > 
> > 
> > -Original Message-
> > From: Walter Underwood [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, May 29, 2008 4:57 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Search query optimization
> > 
> > The people working on Lucene are pretty smart, and this sort of query 
> > optimization is a well-known trick, so I would not worry about it.
> > 
> > A dozen years ago at Infoseek, we checked the count of matches for 
> > each term in an AND, and evaluated the smallest one first.
> > If any of them had zero matches, we didn't evaluate any of them.
> > 
> > I expect that Doug Cutting and the other Lucene folk know those same 
> > tricks.
> > 
> > wunder
> > 
> > On 5/29/08 1:50 PM, "Yongjun Rong" wrote:
> > 
> > > Hi Yonik,
> > >   Thanks for your quick reply. I'm very new to the lucene source code.
> > > Can you give me a little more detail explaination about this.
> > > Do you think it will save some memory if docnum = find_match("A") > 
> > > docnum = find_match("B") and put B in the fr

Re: Search query optimization

2008-06-17 Thread Chris Hostetter

: Probably because the [NOW/DAYS-7DAYS+TO+NOW] part gets rewritten as lots 
: of OR clauses.  I think that you'll see that if you add &debugQuery=true 
: to the URL.  Make sure your recorded_date_dt is not too granular (e.g. 
: if you don't need minutes, round the values to hours. If you don't need 
: hours, round the values to days).

for the record: it doesn't get rewritten to a lot of OR clauses, it's 
using ConstantScoreRangeQuery.

granularity is definitely important however, both when indexing and when 
querying.  

"NOW" is milliseconds, so every time you execute that query it's different 
and there is almost no caching possible.

if you use [NOW/DAY-7DAYS TO NOW/DAY] or even 
[NOW/DAY-7DAYS TO NOW/HOUR] you'll get a lot better caching behavior.  it 
looks like you are trying to find anything in the past week, so you may 
want [NOW/DAY-7DAYS TO NOW/DAY+1DAY] (to go to the end of the current day)

once you have a less granular date restriction, it can frequently make 
sense to put this in a separate fq clause, so it will get cached 
independently of your main query. 
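
A small sketch of building such a request over plain HTTP, with the date restriction moved into its own fq parameter. The host name and the helper build_select_url are placeholders made up for this example; the field names come from the thread. Note that urlencode percent-encodes a literal "+" as %2B. A raw "+" in a URL is decoded as a space, which would turn NOW/DAY+1DAY into "NOW/DAY 1DAY".

```python
from urllib.parse import urlencode


def build_select_url(base_url, main_query, filter_queries, rows=10):
    """Build a select URL with one q parameter and one fq per filter."""
    params = [("q", main_query)]
    params += [("fq", fq) for fq in filter_queries]
    params.append(("rows", rows))
    return base_url + "?" + urlencode(params)


url = build_select_url(
    "http://localhost:8983/solr/select",  # placeholder host, not from the thread
    "account:1",
    # Both ends rounded to the day, so the filter text is identical all day
    # and its cached result can be reused.
    ["recordeddate_dt:[NOW/DAY-7DAYS TO NOW/DAY+1DAY]"],
    rows=1000,
)
print(url)
```

Because both ends of the range are rounded, every request during the same day sends the same fq string, which is what makes the filter cacheable.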

But Otis's point about reducing granularity can also help when indexing 
... the fewer "unique" dates that appear in your index, the faster range 
queries will be ... if you've got 1000 documents that all have a 
recordeddate of June 11 2008, but at different times, and you're never 
going to care about the times (just the date) then strip those times off 
when indexing so they all have the same field value of 
2008-06-11T00:00:00Z
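
One way to strip the times on the client side before posting documents, sketched under the assumption that the indexing client has each timestamp as a timezone-aware datetime. The helper name round_to_day is invented for this example.

```python
from datetime import datetime, timezone


def round_to_day(timestamp):
    """Drop the time of day and format the date as a UTC date string.

    Expects a timezone-aware datetime; the value is converted to UTC first.
    """
    utc = timestamp.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT00:00:00Z")


recorded = datetime(2008, 6, 11, 17, 7, 57, 420000, tzinfo=timezone.utc)
field_value = round_to_day(recorded)
print(field_value)
```

Every document recorded on the same UTC day then carries the same field value, so the index holds one unique term per day instead of one per millisecond.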

BTW: the solr port you sent out a URL to ... all of its caching is 
turned off (the filterCache and queryResultCache configs are commented out 
of your solrconfig.xml) ... you're going to want to turn on some caching 
or you'll never see really *great* request times.


-Hoss



RE: Search query optimization

2008-06-17 Thread Yongjun Rong
Hi Otis,
  Thanks for your advice. Do you mean that when we add the date data we need 
to carefully choose the granularity of the date field so that it is 
coarser? How can we do this? We access Solr only via HTTP URLs, not the API. If 
you are talking about the query syntax, we already use NOW/DAY to round to the day.
  Yongjun Rong
   

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 17, 2008 1:32 PM
To: solr-user@lucene.apache.org
Subject: Re: Search query optimization

Hi,

This is what I was talking about:

recordeddate_dt:[2008-06-16T00:00:00.000Z TO 2008-06-17T17:07:57.420Z]

Note that the granularity of this date field is down to milliseconds.  You 
should change that to be more coarse if you don't need such precision (e.g. no 
milliseconds, no seconds, no minutes, no hours...)


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: Yongjun Rong <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 17, 2008 1:09:19 PM
> Subject: RE: Search query optimization
> 
> Thanks for reply. Here is the debugQuery output:
> 
> account:1 AND recordeddate_dt:[NOW/DAYS-1DAYS TO NOW]
> 
> account:1 AND recordeddate_dt:[NOW/DAYS-1DAYS TO NOW]
> 
> +account:1 +recordeddate_dt:[2008-06-16T00:00:00.000Z TO 2008-06-17T17:07:57.420Z]
> 
> +account:1 +recordeddate_dt:[2008-06-16T00:00:00.000 TO 2008-06-17T17:07:57.420]
> 
> id=e03dbd92-3d41-4693-8b69-ac9a0d332446-atom-d52484f5-7aa8-40b3-ad6f-ba3a9071999e,internal_docid=6515410
> 
> 10.88071 = (MATCH) sum of:
>   10.788804 = (MATCH) weight(account:1 in 6515410), product of:
>     0.9957678 = queryWeight(account:1), product of:
>       10.834659 = idf(docFreq=348, numDocs=6515640)
>       0.09190578 = queryNorm
>     10.834659 = (MATCH) fieldWeight(account:1 in 6515410), product of:
>       1.0 = tf(termFreq(account:1)=1)
>       10.834659 = idf(docFreq=348, numDocs=6515640)
>       1.0 = fieldNorm(field=account, doc=6515410)
>   0.09190578 = (MATCH) ConstantScoreQuery(recordeddate_dt:[2008-06-16T00:00:00.000-2008-06-17T17:07:57.420]), product of:
>     1.0 = boost
>     0.09190578 = queryNorm
> 
> 
>  
> 
> -Original Message-
> From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, June 17, 2008 12:43 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Search query optimization
> 
> Hi,
> 
> Probably because the [NOW/DAYS-7DAYS+TO+NOW] part gets rewritten as 
> lots of OR clauses.  I think that you'll see that if you add &debugQuery=true 
> to the URL.
> Make sure your recorded_date_dt is not too granular (e.g. if you don't 
> need minutes, round the values to hours. If you don't need hours, 
> round the values to days).
> 
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> - Original Message 
> > From: Yongjun Rong
> > To: solr-user@lucene.apache.org
> > Sent: Tuesday, June 17, 2008 11:56:06 AM
> > Subject: RE: Search query optimization
> > 
> > Hi,
> >   Thanks for your reply. I did some test on my test machine. 
> > http://stage.boomi.com:8080/solr/select/?q=account:1&rows=1000. It 
> > will return resultset 384 in 3ms. If I add a new AND condition as below:
> > http://stage.boomi.com:8080/solr/select/?q=account:1+AND+recordeddate_dt:[NOW/DAYS-7DAYS+TO+NOW]&rows=1000. It will take 18236 to
> > return 21 resultset. If I only use the recordedate_dt condition like
> > http://stage.boomi.com:8080/solr/select/?q=recordeddate_dt:[NOW/DAYS-7DAYS+TO+NOW]&rows=1000. It takes 20271 ms to get 412800 results. All the
> > above URL are live, you test it.
> > 
> > Can anyone give me some explaination why this happens if we have the 
> > query optimization? Thank you very much.
> > Yongjun Rong
> > 
> > 
> > -Original Message-
> > From: Walter Underwood [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, May 29, 2008 4:57 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Search query optimization
> > 
> > The people working on Lucene are pretty smart, and this sort of 
> > query optimization is a well-known trick, so I would not worry about it.
> > 
> > A dozen years ago at Infoseek, we checked the count of matches for 
> > each term in an AND, and evaluated the smallest one first.
> > If any of them had zero matches, we didn't evaluate any of them.
> > 
> > I expect that Doug Cutting and the other

RE: Search query optimization

2008-06-17 Thread Yongjun Rong
Hi Chris,
   Thanks for your suggestions. I did try [NOW/DAY-7DAYS TO
NOW/DAY], but it was no better. When I tried [NOW/DAY-7DAYS TO
NOW/DAY+1DAY], I got the exception below:
org.apache.solr.core.SolrException: Query parsing error: Cannot parse
'account:1 AND recordeddate_dt:[NOW/DAYS-7DAYS TO NOW/DAY 1DAY]':
Encountered "1DAY" at line 1, column 57.
Was expecting:
"]" ...

at org.apache.solr.search.QueryParsing.parseQuery(QueryParsing.java:104)
at org.apache.solr.request.StandardRequestHandler.handleRequestBody(StandardRequestHandler.java:109)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:77)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:658)
at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:66)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1093)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:185)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1084)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:360)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:726)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:206)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:324)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:505)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:828)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:514)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:211)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:380)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:395)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:450)
Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse
'account:1 AND recordeddate_dt:[NOW/DAYS-7DAYS TO NOW/DAY 1DAY]':
Encountered "1DAY" at line 1, column 57.
Was expecting:
"]" ...

at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:152)
at org.apache.solr.search.QueryParsing.parseQuery(QueryParsing.java:94)
... 26 more

And I will try to open the cache and see if I can get better query time.
I will let you know.
Thank you very much.
Yongjun Rong

-----Original Message-----
From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 17, 2008 1:55 PM
To: solr-user@lucene.apache.org
Subject: Re: Search query optimization


: Probably because the [NOW/DAYS-7DAYS+TO+NOW] part gets rewritten as lots
: of OR clauses.  I think that you'll see that if you add &debugQuery=true
: to the URL.  Make sure your recorded_date_dt is not too granular (e.g.
: if you don't need minutes, round the values to hours. If you don't need
: hours, round the values to days).

for the record: it doesn't get rewritten to a lot of OR clauses, it's
using ConstantScoreRangeQuery.

granularity is definitely important however, both when indexing and when
querying.  

"NOW" is milliseconds, so every time you execute that query it's
different and there is almost no caching possible.

if you use [NOW/DAY-7DAYS TO NOW/DAY] or even [NOW/DAY-7DAYS TO
NOW/HOUR] you'll get a lot better caching behavior.  it looks like you
are trying to find anything in the past week, so you may want
[NOW/DAY-7DAYS TO NOW/DAY+1DAY] (to go to the end of the current day)
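
To see why rounding matters for caching, here is a small Python sketch. It is not Solr code: `resolve_range` is a made-up helper that only mimics how NOW and NOW/DAY would resolve at request time. Two requests a few seconds apart resolve to different ranges when NOW is left unrounded, and to the same range when it is rounded to the day, so only the rounded form can act as a stable cache key.

```python
from datetime import datetime, timedelta, timezone


def resolve_range(now, round_to_day):
    """Hypothetical helper: resolve a 7-day window ending at 'now'.

    round_to_day=False mimics [NOW-7DAYS TO NOW];
    round_to_day=True  mimics [NOW/DAY-7DAYS TO NOW/DAY].
    """
    end = now
    if round_to_day:
        # Drop everything below the day, as NOW/DAY does.
        end = now.replace(hour=0, minute=0, second=0, microsecond=0)
    start = end - timedelta(days=7)
    return (start.isoformat(), end.isoformat())


first = datetime(2008, 6, 17, 13, 55, 0, 123000, tzinfo=timezone.utc)
second = first + timedelta(seconds=5)

# Unrounded: each request resolves to a different range.
unrounded_same = resolve_range(first, False) == resolve_range(second, False)
# Rounded to the day: both requests resolve to the same range.
rounded_same = resolve_range(first, True) == resolve_range(second, True)
print(unrounded_same, rounded_same)
```

The same reasoning applies to NOW/HOUR, with the range changing once per hour instead of once per day.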

once you have a less granular date restriction, it can frequently make
sense to put this in a separate fq clause, so it will get cached
independently of your main query. 

But Otis's point about reducing granularity can also help when indexing
... the fewer "unique" dates that appear in your index, the faster range
queries will be ... if you've got 1000 documents that all have a
recordeddate of June 11 2008, but at different times, and you're never
going to care about the times (just the date) then strip those times off
when indexing so they all have the same

RE: Search query optimization

2008-06-17 Thread Chris Hostetter
:Thanks for your suggestions. I did try the [NOW/DAY-7DAYS TO
: NOW/DAY], but it is not better. And I tried [NOW/DAY-7DAYS TO
: NOW/DAY+1DAY], I got some exception as below:
: org.apache.solr.core.SolrException: Query parsing error: Cannot parse
: 'account:1 AND recordeddate_dt:[NOW/DAYS-7DAYS TO NOW/DAY 1DAY]':
: Encountered "1DAY" at line 1, column 57.

you need to properly URL escape the "+" character as %2B in your URLs.
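
As an illustration of the escaping issue, here is a minimal Python sketch using the standard library's `urllib.parse`. The parameter values are taken from this thread; building the query string this way is just one option and is not something the posters described. A raw "+" in a URL is decoded as a space, which is how "NOW/DAY+1DAY" turned into "NOW/DAY 1DAY" in the exception above.

```python
from urllib.parse import urlencode, parse_qs

params = {
    "q": "account:1 AND recordeddate_dt:[NOW/DAY-7DAYS TO NOW/DAY+1DAY]",
    "rows": "1000",
}

# urlencode writes spaces as "+" and the literal "+" as %2B.
query_string = urlencode(params)
print(query_string)

# Decoding the query string recovers the original value, "+" included.
decoded = parse_qs(query_string)
print(decoded["q"][0])
```

Letting a library do the encoding avoids having to remember which characters need escaping by hand.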

: And I will try to open the cache and see if I can get better query time.

the first request won't be any faster.  but the second request will be.  
and if filtering by week is something you expect people to do a lot of, 
you can put it in a newSearcher so it's always warmed up and fast 
for everyone.


-Hoss



RE: Search query optimization

2008-06-17 Thread Yongjun Rong
Hi Chris,
  Thank you very much for the detailed suggestions. I just did the cache
test. If most requests return the same set of data, the cache will
improve query performance. But in our usage, almost all requests
return different data sets, so the cache hit ratio is very low.
That's why we disabled the cache to save memory.  Another
question: 
q=account:1+AND+recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY] will combine
the result sets of account:1 and
recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]. How does Lucene handle it? From
my previous test examples, it seems Lucene does not check the size of
the subconditions (like account:1 or
recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]). q=account:1 will return a
small set of data, but q=recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY] will
return a large set of data. If we combine them with "AND" like
q=account:1+AND+recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY], it should
first get the small set of data and then apply the subcondition
"recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]". But from the response
time, that does not seem to be the case.
Can anyone give me a detailed explanation of this?
Thank you very much.
Yongjun Rong

-----Original Message-----
From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 17, 2008 2:32 PM
To: solr-user@lucene.apache.org
Subject: RE: Search query optimization

:Thanks for your suggestions. I did try the [NOW/DAY-7DAYS TO
: NOW/DAY], but it is not better. And I tried [NOW/DAY-7DAYS TO
: NOW/DAY+1DAY], I got some exception as below:
: org.apache.solr.core.SolrException: Query parsing error: Cannot parse
: 'account:1 AND recordeddate_dt:[NOW/DAYS-7DAYS TO NOW/DAY 1DAY]':
: Encountered "1DAY" at line 1, column 57.

you need to properly URL escape the "+" character as %2B in your URLs.

: And I will try to open the cache and see if I can get better query time.

the first request won't be any faster.  but the second request will be.

and if filtering by week is something you expect people to do a lot of,
you can put it in a newSearcher so it's always warmed up and fast for
everyone.


-Hoss



RE: Search query optimization

2008-06-17 Thread Chris Hostetter
: test. If most of requests return the same set of data, cache will
: improve the query performance. But in our usage, almost all requests
: have different data set to return. The cache hit ratio is very low.

that's why i suggested moving clauses that are likely to be common (ie: 
your "within the last week" clause) into a separate fq param where it can 
be cached independently from the main query.  if you do that *and* you 
have the filterCache turned on then after this query...
  q=account:1&fq=recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]
...these other queries will all be fairly fast because of the cache hit...
  q=account:&fq=recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]
  q=account:&fq=recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]
  q=anything+you+want&fq=recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]
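
The effect described above can be sketched with a toy cache in Python. This is only an illustration of the idea, not how Solr's filterCache is implemented: `ToyFilterCache`, `expensive_range_filter`, and the doc ids are all made up. The point is that the filter's doc set is computed once and then reused by every main query that carries the same fq string.

```python
class ToyFilterCache:
    """Illustrative stand-in for a filter cache: fq string -> set of doc ids."""

    def __init__(self, compute_filter):
        self._compute_filter = compute_filter
        self._cache = {}
        self.misses = 0

    def get(self, fq):
        if fq not in self._cache:
            # Only the first request for this fq pays the computation cost.
            self.misses += 1
            self._cache[fq] = self._compute_filter(fq)
        return self._cache[fq]


def expensive_range_filter(fq):
    # Stands in for walking every term in the date range (hard-coded here).
    return {2, 3, 5, 8, 13}


def search(main_matches, fq, cache):
    # The main query's matches are intersected with the cached filter set.
    return sorted(main_matches & cache.get(fq))


cache = ToyFilterCache(expensive_range_filter)
fq = "recordeddate_dt:[NOW/DAY-7DAYS TO NOW/DAY]"

results = [
    search({1, 2, 3}, fq, cache),    # e.g. q=account:1
    search({5, 6, 7}, fq, cache),    # a different main query
    search({8, 13, 21}, fq, cache),  # yet another main query
]
print(results, cache.misses)
```

Note that the cache key is the fq string itself, which is why a filter containing an unrounded NOW would never be reused.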

: my previous test examples, it seems lucene will not check the size of
: the subconditions (like account:1 or
: recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]). Q=account:1 will return a
: small set of data. But q=recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY] will
: return a large set of data. If we combine them with "AND" like:
: q=account:1+AND+recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]. It should
: return the small set of data and then apply the subcondition
: "recordeddate_dt:[NOW/DAY-7DAYS+TO+NOW/DAY]". But from the response

the ConjunctionScorer will do that (as mentioned earlier in this thread) 
but even if the account:1 clause indicates that it can skip ahead to 
*document* #1234567, the ConstantScoreRangeQuery still 
needs to iterate over all of the *terms* in the specified range before it 
knows what the lowest matching doc id above #1234567 is.

that's why putting "range queries" into separate "fq" params can be a lot 
better ... that term iteration only needs to be done once and can then be 
cached and reused.
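
A rough Python sketch of that cost, using a toy inverted index (a dict from date term to sorted doc ids). This is an assumption-laden illustration and not Lucene's ConstantScoreRangeQuery: `range_postings` and `advance` are hypothetical helpers. It shows that the range has to visit every term before it can answer "what is the first matching doc at or after X", and that coarser terms mean fewer terms to visit for the same documents.

```python
from bisect import bisect_left


def range_postings(term_index, low, high):
    """Union the postings of every term in [low, high].

    Returns (sorted doc ids, number of terms visited).
    """
    docs = set()
    terms_visited = 0
    for term in sorted(term_index):
        if low <= term <= high:
            terms_visited += 1
            docs.update(term_index[term])
    return sorted(docs), terms_visited


def advance(sorted_docs, target):
    """Return the first doc id >= target, or None if there is none."""
    i = bisect_left(sorted_docs, target)
    return sorted_docs[i] if i < len(sorted_docs) else None


# The same six documents indexed two ways (toy data).
fine = {  # one term per distinct timestamp
    "2008-06-11T08:00": [1], "2008-06-11T09:30": [2], "2008-06-11T17:45": [3],
    "2008-06-12T07:15": [4], "2008-06-12T12:00": [5], "2008-06-12T23:59": [6],
}
coarse = {  # times stripped off at index time
    "2008-06-11": [1, 2, 3],
    "2008-06-12": [4, 5, 6],
}

fine_docs, fine_terms = range_postings(fine, "2008-06-11", "2008-06-12T23:59")
coarse_docs, coarse_terms = range_postings(coarse, "2008-06-11", "2008-06-12")
print(fine_terms, coarse_terms, advance(fine_docs, 4))
```

Both indexes return the same documents, but the fine-grained one visits three times as many terms. With real timestamps at millisecond precision the gap is far larger.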



-Hoss



Re: Search query optimization

2008-06-30 Thread wojtekpia

If I know that condition C will eliminate more results than either A or B,
does specifying the query as: "C AND A AND B" make it any faster (than the
original "A AND B AND C")?



Re: Search query optimization

2008-06-30 Thread Chris Hostetter

: If I know that condition C will eliminate more results than either A or B,
: does specifying the query as: "C AND A AND B" make it any faster (than the
: original "A AND B AND C")?

Nope.  Lucene takes care of that for you.
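
One way to picture why clause order need not matter is the Python sketch below. It is a simplified illustration of the general technique (drive the intersection from the most selective list), not Lucene's actual ConjunctionScorer; the `intersect` helper and the postings data are made up.

```python
from bisect import bisect_left
from itertools import permutations


def intersect(*postings):
    """Intersect sorted doc-id lists, driving from the shortest list.

    Returns (matching doc ids, number of lookups into the other lists).
    """
    ordered = sorted(postings, key=len)  # most selective clause first
    lead, others = ordered[0], ordered[1:]
    probes = 0
    result = []
    for doc in lead:
        for other in others:
            probes += 1
            i = bisect_left(other, doc)
            if i == len(other) or other[i] != doc:
                break  # doc is missing from this clause, so skip it
        else:
            result.append(doc)
    return result, probes


a = list(range(0, 1000, 2))  # A: 500 matches
b = list(range(0, 1000, 3))  # B: 334 matches
c = [6, 300, 301, 996]       # C: 4 matches, the most selective clause

# Try every ordering of the three clauses and collect the distinct outcomes.
outcomes = {
    (tuple(docs), probes)
    for docs, probes in (intersect(*order) for order in permutations([a, b, c]))
}
print(outcomes)
```

Because the clauses are reordered internally, "C AND A AND B" and "A AND B AND C" do the same work and return the same documents in this sketch.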



-Hoss



Re: Function Query Optimization

2020-12-12 Thread Vincenzo D'Amore
Hi, looking at this sample it seems you have just one document for '12345',
one for '23456' and so on. If that is true, why not just try
a subquery?
https://lucene.apache.org/solr/guide/6_6/transforming-result-documents.html#TransformingResultDocuments-_subquery_

On Fri, Dec 11, 2020 at 3:31 PM Jae Joo  wrote:

> I have the requirement to create field  - xyz to be returned based on the
> matched result.
> Here Is the code .
>
> XYZ:concat(
>   if(exists(query({!v='field1:12345'})), '12345', ''),
>   if(exists(query({!v='field1:23456'})), '23456', ''),
>   if(exists(query({!v='field1:34567'})), '34567', ''),
>   if(exists(query({!v='field:45678'})), '45678','')
> ),
>
> I am feeling this is very complex, so I am looking for some smart and
> faster ideas.
>
> Thanks,
>
> Jae
>


-- 
Vincenzo D'Amore


Re: Function Query Optimization

2020-12-14 Thread Jae Joo
Should SubQuery be faster than FunctionQuery?

On Sat, Dec 12, 2020 at 10:24 AM Vincenzo D'Amore 
wrote:

> Hi, looking at this sample it seems you have just one document for '12345',
> one for '23456' and so on so forth. If this is true, why don't just try
> with a subquery
>
> https://lucene.apache.org/solr/guide/6_6/transforming-result-documents.html#TransformingResultDocuments-_subquery_
>
> On Fri, Dec 11, 2020 at 3:31 PM Jae Joo  wrote:
>
> > I have the requirement to create field  - xyz to be returned based on the
> > matched result.
> > Here Is the code .
> >
> > XYZ:concat(
> >   if(exists(query({!v='field1:12345'})), '12345', ''),
> >   if(exists(query({!v='field1:23456'})), '23456', ''),
> >   if(exists(query({!v='field1:34567'})), '34567', ''),
> >   if(exists(query({!v='field:45678'})), '45678','')
> > ),
> >
> > I am feeling this is very complex, so I am looking for some smart and
> > faster ideas.
> >
> > Thanks,
> >
> > Jae
> >
>
>
> --
> Vincenzo D'Amore
>