Re: Sorting issue while using collection parameter

2018-07-20 Thread Vijay Tiwary
Hello Erick

We are using a string field, and the data is lowercased at index time.
We have an alias set up to query multiple collections simultaneously:
alias=collection1, collection2
When we query through the alias, sorting is broken. For example, the results
of a descending sort are shown below (empty lines are documents with no
value for the field being sorted on).
There seems to be an issue in Solr when it merges the results of the
individual shard responses.
d
d
d


b
b
b
c
c

b
b


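For comparison, here is a minimal sketch (plain Python, not Solr code) of the order we would expect from a descending sort when documents without a value come last. The function name is hypothetical, and the missing-last placement is an assumption: in Solr it depends on the sortMissingFirst/sortMissingLast settings of the field type.

```python
def expected_desc_order(values):
    """Sort values descending, keeping missing values (None or "") at the end."""
    # Documents that actually have a value, highest first.
    present = sorted((v for v in values if v), reverse=True)
    # Documents with no value keep their relative order and go last.
    missing = [v for v in values if not v]
    return present + missing


# Values similar to the result list above; "" marks a document with no value.
observed = ["d", "d", "d", "", "", "b", "b", "b", "c", "c", "", "b", "b", ""]
print(expected_desc_order(observed))
```

In the list above, "c" appears after "b" and missing values are interleaved, which is not what a single, globally sorted result should look like.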

On Fri, 29 Jun 2018, 9:16 pm Erick Erickson, 
wrote:

> What _is_ your expectation? You haven't provided any examples of what
> your input and expectations _are_.
>
> You might review: https://wiki.apache.org/solr/UsingMailingLists
>
> string types are case-sensitive for instance, so that's one thing that
> could be happening. You
> can also specify sortMissingFirst/Last to determine where docs with
> missing fields appear in the results.
>
> Best,
> Erick
>
> On Fri, Jun 29, 2018 at 3:13 AM, Vijay Tiwary 
> wrote:
> > Hello Erick,
> >
> > title is a string field
> >
> > On Wed, 27 Jun 2018, 9:21 pm Erick Erickson, 
> > wrote:
> >
> >> what kind of field is title? text_general or something? Sorting on a
> >> tokenized field is usually something you don't want to do. If a field
> >> has aardvark and zebra, how would it sort?
> >>
> >> There's usually something like alphaOnlySort. People often copyField
> >> from "title" to "title_sort" and search on "title" and sort on
> >> title_sort.
> >>
> >> alphaOnlySort uses KeywordTokenizer and LowerCaseFilterFactory.
> >>
> >> Best,
> >> Erick
> >>
> >> On Wed, Jun 27, 2018 at 12:45 AM, Vijay Tiwary <vijaykr.tiw...@gmail.com>
> >> wrote:
> >> > Hello Team,
> >> >
> >> > I have multiple collection on solr (5.4.1) cloud based on year
> >> > content2107
> >> > content2018
> >> >
> >> > Also I have a collection "content" which does not have any data.
> >> >
> >> > Now if I query them as follows
> >> > http://host:port/solr/content/select?q=*:*&collection=content2107,content2108&sort=title asc
> >> >
> >> > Where title is string field then results are not getting sorted as per the
> >> > expectation. Also note value for title is not present for some documents.
> >> >
> >> > Please help.
> >>
>


Re: Sorting issue while using collection parameter

2018-06-29 Thread Vijay Tiwary
Hello Erick,

title is a string field

On Wed, 27 Jun 2018, 9:21 pm Erick Erickson, 
wrote:

> what kind of field is title? text_general or something? Sorting on a
> tokenized field is usually something you don't want to do. If a field
> has aardvark and zebra, how would it sort?
>
> There's usually something like alphaOnlySort. People often copyField
> from "title" to "title_sort" and search on "title" and sort on
> title_sort.
>
> alphaOnlySort uses KeywordTokenizer and LowerCaseFilterFactory.
>
> Best,
> Erick
>
> On Wed, Jun 27, 2018 at 12:45 AM, Vijay Tiwary 
> wrote:
> > Hello Team,
> >
> > I have multiple collection on solr (5.4.1) cloud based on year
> > content2107
> > content2018
> >
> > Also I have a collection "content" which does not have any data.
> >
> > Now if I query them as follows
> > http://host:port/solr/content/select?q=*:*&collection=content2107,content2108&sort=title asc
> >
> > Where title is string field then results are not getting sorted as per the
> > expectation. Also note value for title is not present for some documents.
> >
> > Please help.
>


Sorting issue while using collection parameter

2018-06-27 Thread Vijay Tiwary
Hello Team,

I have multiple collections on SolrCloud (Solr 5.4.1), split by year:
content2107
content2018

I also have a collection "content" which does not have any data.

If I query them as follows:
http://host:port/solr/content/select?q=*:*&collection=content2107,content2108&sort=title asc

where title is a string field, the results are not sorted as expected.
Note also that some documents have no value for title.

Please help.
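
To rule out a malformed request (a stray space in the collection list, or an unencoded space in the sort clause), the URL can be built programmatically. This is only a sketch using Python's standard library; the helper name is hypothetical and host:port is the same placeholder as above.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collections, sort_field, direction="asc"):
    """Build a /select URL that searches several collections and sorts on one field."""
    params = {
        "q": "*:*",
        # Comma-separated list with no spaces between collection names.
        "collection": ",".join(collections),
        # The sort clause is "<field> <direction>"; urlencode escapes the space.
        "sort": f"{sort_field} {direction}",
    }
    return f"{base_url}/select?{urlencode(params)}"


print(build_select_url("http://host:port/solr/content",
                       ["content2107", "content2108"], "title"))
```

If the results are still out of order with a correctly encoded request, the request syntax is not the cause.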


RE: JSON facet performance for aggregations

2017-04-30 Thread Vijay Tiwary
Please enable docValues and try again.
There is a bug in the source code which causes JSON facets on string fields
to run very slowly. On numeric fields they run fine with docValues enabled.

On Apr 30, 2017 1:41 PM, "Mikhail Ibraheem" 
wrote:

> Hi Vijay,
> It is already numeric field.
> It is huge difference between json and flat here. Do you know the reason
> for this? Is there a way to improve it ?
>
> -Original Message-
> From: Vijay Tiwary [mailto:vijaykr.tiw...@gmail.com]
> Sent: Sunday, April 30, 2017 9:58 AM
> To: solr-user@lucene.apache.org
> Subject: Re: JSON facet performance for aggregations
>
> Json facet on string fields run lot slower than on numeric fields. Try and
> see if you can represent studentid as a numeric field.
>
> On Apr 30, 2017 1:19 PM, "Mikhail Ibraheem" 
> wrote:
>
> > Hi,
> >
> > I am trying to do aggregation with JSON faceting but performance is
> > very bad for one of the requests:
> >
> > json.facet={
> >   studentId:{
> >     type:terms,
> >     limit:-1,
> >     field:"studentId",
> >     facet:{
> >       x:"sum(grades)"
> >     }
> >   }
> > }
> >
> >
> >
> > This request finishes in 250 seconds, and we can't paginate for this
> > service for functional reason so we have to use limit:-1, and the
> > cardinality of the studentId is 7500.
> >
> >
> >
> > If I try the same with flat facet it finishes in 3 seconds :
> > stats=true&facet=true&stats.field={!tag=piv1
> > sum=true}grades&facet.pivot={!stats=piv1}studentId
> >
> >
> >
> > We are hoping to use one approach json or flat for all our services.
> > JSON facet performance is better for many case.
> >
> >
> >
> > Please advise on why the performance for this is so bad and if we can
> > improve it. Also what is the default algorithm used for json facet.
> >
> >
> >
> > Thanks
> >
> > Mikhail
> >
>


Re: JSON facet performance for aggregations

2017-04-30 Thread Vijay Tiwary
JSON facets on string fields run a lot slower than on numeric fields. Try to
see if you can represent studentId as a numeric field.
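
To compare the two variants, the facet from the request quoted below can be generated for any field. This is a rough sketch: the helper name is hypothetical, and the q and rows values are assumptions that are not part of the original request.

```python
import json


def terms_sum_facet(bucket_field, sum_field, limit=-1):
    """Build a terms facet on bucket_field with a sum(sum_field) aggregation per bucket."""
    return {
        bucket_field: {
            "type": "terms",
            "field": bucket_field,
            "limit": limit,  # -1 returns all buckets, as in the quoted request
            "facet": {"x": f"sum({sum_field})"},
        }
    }


# Request parameters; q and rows are illustrative assumptions.
params = {
    "q": "*:*",
    "rows": 0,
    "json.facet": json.dumps(terms_sum_facet("studentId", "grades")),
}
print(params["json.facet"])
```

Calling terms_sum_facet with the name of a numeric copy of the field instead of "studentId" gives the second request to time against the first.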

On Apr 30, 2017 1:19 PM, "Mikhail Ibraheem" 
wrote:

> Hi,
>
> I am trying to do aggregation with JSON faceting but performance is very
> bad for one of the requests:
>
> json.facet={
>   studentId:{
>     type:terms,
>     limit:-1,
>     field:"studentId",
>     facet:{
>       x:"sum(grades)"
>     }
>   }
> }
>
>
>
> This request finishes in 250 seconds, and we can't paginate for this
> service for functional reason so we have to use limit:-1, and the
> cardinality of the studentId is 7500.
>
>
>
> If I try the same with flat facet it finishes in 3 seconds :
> stats=true&facet=true&stats.field={!tag=piv1
> sum=true}grades&facet.pivot={!stats=piv1}studentId
>
>
>
> We are hoping to use one approach json or flat for all our services. JSON
> facet performance is better for many case.
>
>
>
> Please advise on why the performance for this is so bad and if we can
> improve it. Also what is the default algorithm used for json facet.
>
>
>
> Thanks
>
> Mikhail
>


Re: Sub faceting on string field using json facet runs extremly slow

2016-05-18 Thread Vijay Tiwary
Can somebody confirm whether the JIRA issue SOLR-8096 also affects JSON facets?
I see sub-faceting with a terms facet on a string field running 5x
slower than on an integer field for the same number of hits and unique terms.
On 17-May-2016 3:33 pm, "Vijay Tiwary"  wrote:

> Below is the request
>
> q=*:*&rows=0&start=0&json.facet={
>   "customer_id": {
>     "type": "terms",
>     "limit": -1,
>     "field": "cid_ti",
>     "mincount": 1,
>     "facet": {
>       "contact_s": {
>         "type": "terms",
>         "limit": 1,
>         "field": "contact_s",
>         "mincount": 1
>       }
>     }
>   }
> }&fq=age_td:[25 TO 50]
>
> On 17-May-2016 2:20 pm, "chandan khatri"  wrote:
>
>> Can you please share the query for sub faceting?
>>
>> On Tue, May 17, 2016 at 2:13 PM, Vijay Tiwary 
>> wrote:
>>
>>> Hello all,
>>> I have an index of 8 shards having 1 replica each distributed across 8 node
>>> solr cloud. Size of index is 300 gb having 30 million documents. Solr json
>>> facet runs extremely slow if I am sub faceting on string field even if
>>> numFound is only around 2 (also I am not returning any rows i.e rows=0).
>>> Is there any way to improve the performance?
>>>
>>> Thanks,
>>> Vijay
>>>
>>
>>


Sub faceting on string field using json facet runs extremly slow

2016-05-17 Thread Vijay Tiwary
Hello all,
I have an index of 8 shards with 1 replica each, distributed across an 8-node
SolrCloud cluster. The index is 300 GB and holds 30 million documents. Solr JSON
facets run extremely slowly when I sub-facet on a string field, even when
numFound is only around 2 (and I am not returning any rows, i.e.
rows=0).
Is there any way to improve the performance?

Thanks,
Vijay


subscription to the mailing list

2015-01-31 Thread Vijay Tiwary
Hi,

I want to subscribe to the mailing list



Regards,
Vijay