Re: Cursor Performance Issue

2021-01-14 Thread Ajay Sharma
Hi Mike,

Thanks for your reply.

As far as I remember, docValues have been enabled by default since Solr 6.

If that is not the case and I reindex the data with docValues="true" on the
id field, how much will my index size increase?
The index is currently 90 GB.
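
One way to check the current setting instead of relying on memory is to ask the Schema API for the id field with showDefaults=true. The following is only a minimal sketch: the base URL and core name are taken from the URLs in this thread, the helper names are hypothetical, and the response shape (a top-level "field" object) is an assumption about what the Schema API returns.

```python
from urllib.parse import urlencode


def schema_field_url(base_url, core, field):
    """Build a Schema API URL that asks for one field, including inherited defaults."""
    query = urlencode({"showDefaults": "true", "wt": "json"})
    return f"{base_url}/{core}/schema/fields/{field}?{query}"


def has_docvalues(schema_response):
    """Return True if the parsed response reports docValues for the field.

    Assumes the response looks like {"field": {"name": ..., "docValues": ...}}.
    A missing key is treated as False.
    """
    field = schema_response.get("field", {})
    return bool(field.get("docValues", False))
```

Fetching the URL built by schema_field_url("http://localhost:8080/solr", "search", "id") and passing the parsed JSON to has_docvalues would show whether a reindex is needed at all. It says nothing about how much the index would grow.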


On Wed, 13 Jan, 2021, 9:14 pm Mike Drob wrote:

> You should be using docvalues on your id, but note that switching this
> would require a reindex.
>
> On Wed, Jan 13, 2021 at 6:04 AM Ajay Sharma 
> wrote:
>
> > Hi All,
> >
> > I have used cursors to search and export documents in solr according to
> >
> >
> https://lucene.apache.org/solr/guide/6_6/pagination-of-results.html#fetching-a-large-number-of-sorted-results-cursors
> >
> > Solr version: 6.5.0
> > No of Documents: 10 crore
> >
> > Before implementing cursor, I was using the start and rows parameter to
> > fetch records
> > Service response time used to be 2 sec
> >
> > *Before implementing Cursor Solr URL:*
> > http://localhost:8080/solr/search/select?q=bird
> > toy=mapping=3=25=100
> >
> > Request handler Looks like this: fl contains approx 20 fields
> > 
> > 
> > edismax
> > on
> > 0.01
> > 
> > 
> > id,refid,title,smalldesc:""
> > 
> >
> > none
> > json
> > 25
> > 15000
> > smalldesc
> > title_text
> > titlews^3
> > sdescnisq
> > 1
> > 
> > 2-1 470%
> > 
> > 
> >
> > Sharing Response with EchoParams=all > Qtime is 6
> > responseHeader: {
> > status: 0,
> > QTime: 6,
> > params: {
> > ps: "3",
> > echoParams: "all",
> > indent: "on",
> > fl: "id,refid,title,smalldesc:"",
> > tie: "0.01",
> > defType: "edismax",
> > qf: "customphonetic",
> > wt: "json",
> >qs: "1",
> >qt: "mapping",
> >rows: "25",
> >q: "bird toy",
> >timeAllowed: "15000"
> > }
> > },
> > response: {
> > numFound: 17,
> > start: 0,
> > maxScore: 26.616478,
> > docs: [
> >   {
> > id: "22347708097",
> > refid: "152585558",
> > title: "Round BIRD COLOURFUL SWINGING CIRCULAR SITTING TOY",
> > smalldesc: "",
> > score: 26.616478
> >  }
> > ]
> > }
> >
> > I am facing a performance issue now after implementing the cursor.
> Service
> > response time is increased 3 to 4 times .i.e. 8 sec in some cases
> >
> > *After implementing Cursor query is-*
> > localhost:8080/solr/search/select?q=bird
> > toy=cursor=3=1000=100=score desc,id asc=*
> >
> > Just added =score desc,id asc=* to the before query and
> > rows to be fetched is 1000 now and fl contains just a single field
> >
> > Request handler remains same as before just changed the name and made fl
> > change and added df in defaults
> >
> > 
> >
> >   edismax
> >   on
> >   0.01
> >
> >
> >   refid
> >
> >
> >   none
> >   json
> >   1000
> >   smalldesc
> >   title_text
> >   titlews^3
> >   sdescnisq
> >   1
> >   2-1 470%
> >   product_titles
> >
> > 
> >
> > Response with Cursor and echoParams=all-> *Qtime is now 17* i.e approx 3
> > time of previous qtime
> > responseHeader: {
> > status: 0,
> > QTime: 17,
> > params: {
> > df: "product_titles",
> > ps: "3",
> > echoParams: "all",
> > indent: "on",
> > fl: "refid",
> > tie: "0.01",
> > defType: "edismax",
> > qf: "customphonetic",
> > qs: "1",
> > qt: "cursor",
> > sort: "score desc,id asc",
> > rows: "1000",
> > q: "bird toy",
> > cursorMark: "*",
> > }
> > },
> > response: {
> > numFound: 17,
> > start: 0,
> > docs: [
> > {
> > refid: "152585558"
> > },
> > {
> > refid: "157276077"
> > }
> > ]
> > }
> >
> >
> > When i curl http://localhost:8080/solr/search/select?q=bird
> > toy=mapping=3=25=100, i can get results in 3 seconds.
> > When i curl localhost:8080/solr/search/select?q=bird
> > toy=cursor=3=1000=100=score desc,id asc=*
> it
> > consumed 8 seconds to return result even if the result count=0
> >
> > BTW, the id schema definition is used in sort
> >  required="true"
> > omitNorms="true" multiValued="false"/>
> >
> > Is it due to the sort I have applied or I have implemented it in the
> wrong
> > way?
> > Please help or provide the direction to solve this issue
> >
> >
> > Thanks in advance
> >
> > --
> > Thanks & Regards,
> > Ajay Sharma
> > Product Search
> > Indiamart Intermesh Ltd.
> >
> > --
> >
> >
>



Cursor Performance Issue

2021-01-13 Thread Ajay Sharma
Hi All,

I have used cursors to search and export documents in solr according to
https://lucene.apache.org/solr/guide/6_6/pagination-of-results.html#fetching-a-large-number-of-sorted-results-cursors

Solr version: 6.5.0
Number of documents: 10 crore (100 million)

Before implementing the cursor, I was using the start and rows parameters
to fetch records.
The service response time used to be 2 seconds.

*Solr URL before implementing the cursor:*
http://localhost:8080/solr/search/select?q=bird
toy=mapping=3=25=100

The request handler looks like this (fl contains approximately 20 fields):


edismax
on
0.01


id,refid,title,smalldesc:""

   
none
json
25
15000
smalldesc
title_text
titlews^3
sdescnisq
1

2-1 470%



Response with echoParams=all (QTime is 6):
responseHeader: {
  status: 0,
  QTime: 6,
  params: {
    ps: "3",
    echoParams: "all",
    indent: "on",
    fl: "id,refid,title,smalldesc:"",
    tie: "0.01",
    defType: "edismax",
    qf: "customphonetic",
    wt: "json",
    qs: "1",
    qt: "mapping",
    rows: "25",
    q: "bird toy",
    timeAllowed: "15000"
  }
},
response: {
  numFound: 17,
  start: 0,
  maxScore: 26.616478,
  docs: [
    {
      id: "22347708097",
      refid: "152585558",
      title: "Round BIRD COLOURFUL SWINGING CIRCULAR SITTING TOY",
      smalldesc: "",
      score: 26.616478
    }
  ]
}

After implementing the cursor, I am facing a performance issue: the service
response time has increased 3 to 4 times, i.e. to 8 seconds in some cases.

*After implementing the cursor, the query is:*
localhost:8080/solr/search/select?q=bird
toy=cursor=3=1000=100=score desc,id asc=*

I just added the sort (score desc,id asc) and cursorMark (*) parameters to
the earlier query; rows is now 1000 and fl contains just a single field.
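
For reference, the paging loop described in the pagination guide linked above can be sketched roughly as follows. This is not the exact code of my service: fetch_all and fetch_page are hypothetical names, and fetch_page stands for whatever function sends the parameters to Solr and returns the parsed JSON response.

```python
def fetch_all(fetch_page, base_params):
    """Collect all documents by following Solr cursor marks.

    fetch_page(params) is a caller-supplied (hypothetical) function that
    returns a dict shaped like a Solr JSON response, containing
    response.docs and nextCursorMark.
    """
    if "start" in base_params:
        # Cursors replace start-based paging, so the two are not combined here.
        raise ValueError("do not pass 'start' together with cursorMark")

    params = dict(base_params, cursorMark="*")  # "*" requests the first page
    docs = []
    while True:
        page = fetch_page(params)
        docs.extend(page["response"]["docs"])
        next_mark = page["nextCursorMark"]
        # Stop once Solr returns the same mark that was just sent.
        if next_mark == params["cursorMark"]:
            break
        params["cursorMark"] = next_mark
    return docs
```

The sort passed in base_params has to include the uniqueKey field as a tie-breaker, which is why id asc is part of the sort here.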

The request handler remains the same as before; I only changed its name,
changed fl, and added df to the defaults.


   
  edismax
  on
  0.01
   
   
  refid
   
   
  none
  json
  1000
  smalldesc
  title_text
  titlews^3
  sdescnisq
  1
  2-1 470%
  product_titles
   


Response with the cursor and echoParams=all: *QTime is now 17*, i.e.
approximately 3 times the previous QTime.
responseHeader: {
  status: 0,
  QTime: 17,
  params: {
    df: "product_titles",
    ps: "3",
    echoParams: "all",
    indent: "on",
    fl: "refid",
    tie: "0.01",
    defType: "edismax",
    qf: "customphonetic",
    qs: "1",
    qt: "cursor",
    sort: "score desc,id asc",
    rows: "1000",
    q: "bird toy",
    cursorMark: "*",
  }
},
response: {
  numFound: 17,
  start: 0,
  docs: [
    {
      refid: "152585558"
    },
    {
      refid: "157276077"
    }
  ]
}


When I curl http://localhost:8080/solr/search/select?q=bird
toy=mapping=3=25=100, I get results in 3 seconds.
When I curl localhost:8080/solr/search/select?q=bird
toy=cursor=3=1000=100=score desc,id asc=* it
takes 8 seconds to return a result, even if the result count is 0.

BTW, this is the schema definition of the id field, which is used in the sort:
required="true" omitNorms="true" multiValued="false"/>

Is this due to the sort I have applied, or have I implemented it the wrong
way?
Please help or point me in the right direction to solve this issue.


Thanks in advance

-- 
Thanks & Regards,
Ajay Sharma
Product Search
Indiamart Intermesh Ltd.



Re: Solr Highlighting not working

2020-11-30 Thread Ajay Sharma
Hi All,

Pushing this question back to the top.
Does anyone have any idea about it?


On Fri, Nov 27, 2020 at 11:49 AM Ajay Sharma wrote:

> Hi Community,
>
> This is the first time, I am implementing a solr *highlighting *feature.
> I have read the concept via solr documentation
> Link- https://lucene.apache.org/solr/guide/8_2/highlighting.html
>
> To enable highlighting I just have to add *=true=* *in our solr
> query and got the snippet in the solr response and it is working fine in
> most of the cases.
>
> *But highlighting does not work when synonyms came into action*
>
> *Issue:*
> I am searching leopard (q=leopard) in field title (qf=title)
>
> In our synonym file, we have an entry like below
> *leopard,tenduaa,panther*
>
> and in one document id:123456, field title contains below text:
> title:"Jindal Panther TMT Bars
>
> For the query (q=leopard) , i am getting this document (id:123456) in solr
> response
> I could check that due to synonym document is matched  and I confirmed it
> via Solr UI analysis screen where I put Analyse FieldName= title,  Field
> Value (Index) ="Jindal Panther TMT rebars" and Field Value (Query) =
> leopard and I could see in index chain, token panther getting saved as
> leopard also but in highlighting I don't get any matched token and
> getting below response
>
>
>- highlighting:
>{
>   - 123456: { }
>   }
>
>
>
> I just need the matched synonym token like panther in the above case to be
> returned in solr highlighting response
> I have read and re-read the solr documentation, searched on google gone
> through many articles even checked StackOverflow but could not find a
> solution.
> Any help from community members will be highly appreciated.
>
> Thanks in advance.
>
>
> --
> Regards,
> Ajay Sharma
> Software Engineer, Product-Search,
> IndiaMART InterMESH Ltd
>


-- 
Thanks & Regards,
Ajay Sharma
Software Engineer, Product-Search,
IndiaMART InterMESH Ltd



Solr Highlighting not working

2020-11-26 Thread Ajay Sharma
Hi Community,

This is the first time I am implementing the Solr *highlighting* feature.
I have read about the concept in the Solr documentation:
https://lucene.apache.org/solr/guide/8_2/highlighting.html

To enable highlighting, I just had to add *=true=* *to our Solr
query. I get the snippet in the Solr response, and it works fine in
most cases.
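
The parameter names in the line above were lost when this message was archived. As a rough illustration only, a highlighting request of this kind could be built as below, assuming the standard hl=true and hl.fl parameters; which parameters and values the real query used is an assumption, and highlight_query is a hypothetical helper.

```python
from urllib.parse import urlencode


def highlight_query(q, field):
    """Build a query string that searches `field` and asks Solr to highlight it.

    hl=true switches highlighting on; hl.fl names the field to highlight.
    Using the same field for qf and hl.fl is an assumption made for this sketch.
    """
    params = {
        "q": q,
        "defType": "edismax",
        "qf": field,
        "hl": "true",
        "hl.fl": field,
    }
    return urlencode(params)
```

For example, highlight_query("leopard", "title") produces the query string that would be appended to the select URL.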

*But highlighting does not work when synonyms come into play.*

*Issue:*
I am searching for leopard (q=leopard) in the title field (qf=title).

In our synonym file, we have an entry like the one below:
*leopard,tenduaa,panther*

In one document (id:123456), the title field contains the text below:
title:"Jindal Panther TMT Bars"

For the query (q=leopard), I get this document (id:123456) in the Solr
response.
I verified that the document matches because of the synonym: on the Solr UI
Analysis screen I set Analyse Fieldname = title, Field Value (Index) =
"Jindal Panther TMT rebars" and Field Value (Query) = leopard, and I could
see in the index chain that the token panther is also indexed as leopard.
In highlighting, however, I don't get any matched token; I get the response
below:


highlighting: {
  123456: { }
}



I just need the matched synonym token (panther in the above case) to be
returned in the Solr highlighting response.
I have read and re-read the Solr documentation, searched on Google, gone
through many articles, and even checked StackOverflow, but could not find a
solution.
Any help from community members will be highly appreciated.

Thanks in advance.


-- 
Regards,
Ajay Sharma
Software Engineer, Product-Search,
IndiaMART InterMESH Ltd



Re: Increase in Response time when solr fields are merged

2020-11-19 Thread Ajay Sharma
Thank you Shawn for the valuable inputs.

My assumption is the following; please correct me if I am wrong.
Suppose we have two fields, one with a large amount of text and one with
much shorter text, such as description and title.
The number of tokens created for the description field will be much higher
than for the title field.
So a search in the title field will be comparatively faster than a search in
the description field.

If the above is true, then in the case of merged fields the number of tokens
in the single merged field has increased exponentially. Could that be a
possible reason?
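
To make the token-count reasoning concrete, here is a toy sketch. It uses plain whitespace splitting instead of Solr's analysis chain, and the field names and text are made up, so it only illustrates the counting, not real query cost.

```python
def token_counts(doc):
    """Count whitespace-separated tokens per field (a stand-in for real analysis)."""
    return {field: len(text.split()) for field, text in doc.items()}


def merged_token_count(doc):
    """Count tokens when all field values are concatenated into one field."""
    return len(" ".join(doc.values()).split())


# Hypothetical document with a short title and a longer description.
doc = {
    "title": "Round bird toy",
    "description": "A colourful swinging bird toy for cages",
}
per_field = token_counts(doc)
merged = merged_token_count(doc)
```

In this simplified model the merged field holds the sum of the per-field token counts, so the growth is additive. Whether that alone accounts for the slower responses is not something the sketch can show.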


On Thu, Nov 19, 2020 at 5:06 PM Shawn Heisey wrote:

> On 11/19/2020 2:12 AM, Ajay Sharma wrote:
> > Earlier we were searching in 6 fields i.e qf is applied on 6 fields like
> > below
>
> 
>
> > We merged all these 6 fields into one field X and now while searching we
> > using this single filed X
>
> 
>
> > We are able to see a decrease in index size but the response time has
> > increased.
>
> I can't say for sure, but I would imagine that when querying multiple
> fields using edismax, Solr can manage to do some of that work in
> parallel.  But with only one field, any parallel processing is lost.  If
> I have the right idea, that could explain what you are seeing.
>
> Somebody with far more intimate knowledge of edismax will need to
> confirm or refute my thoughts.
>
> Thanks,
> Shawn
>


-- 
Thanks & Regards,
Ajay Sharma
Product Search
+91-8954492245



Increase in Response time when solr fields are merged

2020-11-19 Thread Ajay Sharma
Hi All,

Earlier we were searching in 6 fields, i.e. qf was applied to 6 fields, as
below:

  A
  B
  C
  D
  E
  F


We had assumed that if we reduced the number of fields being searched, both
the index size and the response time would decrease.

We merged all 6 fields into one field X, and we now search using this
single field X.
By merge, I mean that I index the data of all 6 fields into the single field X.

  X


We are able to see a decrease in index size but the response time has
increased.
*Are we missing something? Is our assumption correct?*

Any help will be highly appreciated.


-- 
Thanks & Regards,
Ajay Sharma
Software Engineer, Product-Search,
IndiaMART InterMESH Ltd



Give boosting to a grouped documents in Solr Based on number of results in a group

2020-04-22 Thread Ajay Sharma
Hi Community Members,

There is a logic that I need to implement using Solr.

Suppose there are two suppliers on a site dealing in Mobile Phones

   1. Supplier 1 has 10 products related to mobile phones
   2. Supplier 2 has 20 products related to mobile phones


I need to give a boost to supplier 2 because he has more products related to
mobile phones.

Is there a way in Solr to boost a supplier, i.e. to boost grouped documents
based on the number of results in a group?
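
To illustrate the kind of logic I have in mind, here is a minimal client-side sketch: take per-supplier product counts (for example from a first facet request) and turn them into an edismax bq clause for a second request. The field name supplier_id, the helper names, and the log-scaled formula are all hypothetical; this is not a built-in Solr grouping feature.

```python
import math


def supplier_boosts(counts):
    """Map supplier -> boost, growing with the log of its matching product count."""
    return {s: round(1 + math.log10(n), 2) for s, n in counts.items() if n > 0}


def boost_query(field, boosts):
    """Format the boosts as a space-separated bq clause, e.g. field:"x"^2.0."""
    return " ".join(f'{field}:"{s}"^{b}' for s, b in sorted(boosts.items()))


# Counts from the example above: 10 and 20 products related to mobile phones.
boosts = supplier_boosts({"supplier1": 10, "supplier2": 20})
bq = boost_query("supplier_id", boosts)
```

With these counts supplier 2 ends up with the larger boost, which is the behaviour I am after, but it costs two requests per search.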

Any help will be appreciated.

-- 
Thanks & Regards,
Ajay Sharma
Software Engineer, Product-Search,
IndiaMART InterMESH Ltd,
Mob.: +91-8954492245
