Hi Mike,

Thanks for your reply.

As far as I remember, docValues have been enabled by default since Solr 6.

If they are not, and I reindex the data with docValues="true" on the id field,
how much will my index size increase?
My index is currently 90 GB.


On Wed, 13 Jan, 2021, 9:14 pm Mike Drob, <md...@mdrob.com> wrote:

> You should be using docvalues on your id, but note that switching this
> would require a reindex.
>
> On Wed, Jan 13, 2021 at 6:04 AM Ajay Sharma <aja...@indiamart.com.invalid>
> wrote:
>
> > Hi All,
> >
> > I have used cursors to search and export documents in Solr according to
> >
> > https://lucene.apache.org/solr/guide/6_6/pagination-of-results.html#fetching-a-large-number-of-sorted-results-cursors
> >
> > Solr version: 6.5.0
> > No of Documents: 10 crore
> >
> > Before implementing the cursor, I was using the start and rows parameters
> > to fetch records.
> > Service response time used to be 2 sec.
> >
> > *Before implementing Cursor Solr URL:*
> > http://localhost:8080/solr/search/select?q=bird
> > toy&qt=mapping&ps=3&rows=25&mm=100
> >
> > The request handler looks like this (fl contains approx. 20 fields):
> > <requestHandler name="mapping" class="solr.SearchHandler">
> >     <lst name="invariants">
> >         <str name="defType">edismax</str>
> >         <str name="indent">on</str>
> >         <float name="tie">0.01</float>
> >     </lst>
> >     <lst name="appends">
> >         <str name="fl">id,refid,title,smalldesc:""</str>
> >     </lst>
> >    <lst name="defaults">
> >         <str name="echoParams">none</str>
> >         <str name="wt">json</str>
> >         <int name="rows">25</int>
> >         <str name="timeAllowed">15000</str>
> >         <str name="qf">smalldesc</str>
> >         <str name="qf">title_text</str>
> >         <str name="qf">titlews^3</str>
> >         <str name="qf">sdescnisq</str>
> >         <str name="qs">1</str>
> >         <!-- retrieve the following fields -->
> >         <str name="mm">2&lt;-1 4&lt;70%</str>
> >     </lst>
> > </requestHandler>
> >
> > Response with echoParams=all (QTime is 6):
> > responseHeader: {
> > status: 0,
> > QTime: 6,
> > params: {
> >     ps: "3",
> >     echoParams: "all",
> >     indent: "on",
> >     fl: "id,refid,title,smalldesc:"",
> >     tie: "0.01",
> >     defType: "edismax",
> >     qf: "customphonetic",
> >     wt: "json",
> >    qs: "1",
> >    qt: "mapping",
> >    rows: "25",
> >    q: "bird toy",
> >    timeAllowed: "15000"
> > }
> > },
> > response: {
> > numFound: 17,
> > start: 0,
> > maxScore: 26.616478,
> > docs: [
> >   {
> >     id: "22347708097",
> >     refid: "152585558",
> >     title: "Round BIRD COLOURFUL SWINGING CIRCULAR SITTING TOY",
> >     smalldesc: "",
> >     score: 26.616478
> >  }
> > ]
> > }
> >
> > I am facing a performance issue now after implementing the cursor. Service
> > response time has increased 3 to 4 times, i.e. to 8 sec in some cases.
> >
> > *After implementing the cursor, the query is:*
> > localhost:8080/solr/search/select?q=bird
> > toy&qt=cursor&ps=3&rows=1000&mm=100&sort=score desc,id asc&cursorMark=*
> >
> > I just added &sort=score desc,id asc&cursorMark=* to the earlier query;
> > rows to be fetched is now 1000 and fl contains just a single field.
> >
> > The request handler remains the same as before; I only changed the name,
> > changed fl, and added df to the defaults.
> >
> > <requestHandler name="cursor" class="solr.SearchHandler">
> >    <lst name="invariants">
> >       <str name="defType">edismax</str>
> >       <str name="indent">on</str>
> >       <float name="tie">0.01</float>
> >    </lst>
> >    <lst name="appends">
> >       <str name="fl">refid</str>
> >    </lst>
> >    <lst name="defaults">
> >       <str name="echoParams">none</str>
> >       <str name="wt">json</str>
> >       <int name="rows">1000</int>
> >       <str name="qf">smalldesc</str>
> >       <str name="qf">title_text</str>
> >       <str name="qf">titlews^3</str>
> >       <str name="qf">sdescnisq</str>
> >       <str name="qs">1</str>
> >       <str name="mm">2&lt;-1 4&lt;70%</str>
> >       <str name="df">product_titles</str>
> >    </lst>
> > </requestHandler>
> >
> > Response with cursor and echoParams=all: *QTime is now 17*, i.e. approx.
> > 3 times the previous QTime
> > responseHeader: {
> > status: 0,
> > QTime: 17,
> > params: {
> > df: "product_titles",
> > ps: "3",
> > echoParams: "all",
> > indent: "on",
> > fl: "refid",
> > tie: "0.01",
> > defType: "edismax",
> > qf: "customphonetic",
> > qs: "1",
> > qt: "cursor",
> > sort: "score desc,id asc",
> > rows: "1000",
> > q: "bird toy",
> > cursorMark: "*",
> > }
> > },
> > response: {
> > numFound: 17,
> > start: 0,
> > docs: [
> > {
> > refid: "152585558"
> > },
> > {
> > refid: "157276077"
> > }
> > ]
> > }
> >
> >
> > When I curl http://localhost:8080/solr/search/select?q=bird
> > toy&qt=mapping&ps=3&rows=25&mm=100, I get results in 3 seconds.
> > When I curl localhost:8080/solr/search/select?q=bird
> > toy&qt=cursor&ps=3&rows=1000&mm=100&sort=score desc,id asc&cursorMark=*
> > it takes 8 seconds to return a result, even if the result count is 0.
> >
> > BTW, this is the schema definition of the id field used in the sort:
> > <field name="id" type="string" indexed="true" stored="true" required="true"
> > omitNorms="true" multiValued="false"/>
> >
> > Is it due to the sort I have applied, or have I implemented it the wrong
> > way?
> > Please help or point me in the right direction to solve this issue.
> >
> >
> > Thanks in advance
> >
> > --
> > Thanks & Regards,
> > Ajay Sharma
> > Product Search
> > Indiamart Intermesh Ltd.
> >
> > --
> >
> >
>
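
For reference, the cursor request quoted above can be walked page by page roughly as follows. This is only a minimal sketch, assuming the standard Solr cursor behaviour: each response carries a nextCursorMark, and paging is finished once that value equals the cursorMark that was sent. The URL, the qt=cursor handler, and the sort come from the quoted request; fetch_page and iter_cursor are hypothetical helper names, not part of Solr or of the original setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Taken from the quoted request; adjust for your own host and core.
SOLR_SELECT_URL = "http://localhost:8080/solr/search/select"


def fetch_page(params):
    """Send one select request and return the decoded JSON response."""
    with urlopen(SOLR_SELECT_URL + "?" + urlencode(params)) as resp:
        return json.load(resp)


def iter_cursor(query, fetch=fetch_page, rows=1000):
    """Yield every matching document, following nextCursorMark page by page."""
    cursor = "*"  # "*" starts a new cursor
    while True:
        page = fetch({
            "q": query,
            "qt": "cursor",
            "ps": 3,
            "mm": 100,
            "rows": rows,
            # A cursor sort must include the uniqueKey field (id) as a tie-breaker.
            "sort": "score desc,id asc",
            "cursorMark": cursor,
            "wt": "json",
        })
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:
            # Solr echoes the same mark back once there are no more results.
            break
        cursor = next_cursor
```

Something like `for doc in iter_cursor("bird toy"): print(doc["refid"])` would then export every refid. The fetch argument is there only so the loop can be exercised without a running Solr instance.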
