Re: SOLR Cloud: 1500+ threads are in TIMED_WAITING status

2018-04-07 Thread Doss
Hi,

Is there any network parameter that we need to fine-tune? Is there any
specific tweaking needed to deploy Solr in virtual machines? We use VMware.

Thanks.





Re: Bad request from solr by just changing the order in the filter query

2018-04-07 Thread Shawn Heisey

On 4/7/2018 1:54 PM, Elier Delgado wrote:
> Thanks Shawn, your query works but the facet counters became inconsistent.


What *exactly* was wrong?  Is it possible that it was working the way it 
was designed to work, but different than you expected?  If you really 
think what you're getting is wrong, we'll take your report seriously, 
but to go anywhere, evidence of a problem will need to be available.



> A colleague found that adding a space between the tag and the field makes
> the query work as expected in any order.
> Also, adding the plus sign forces all conditions to match. Here is the
> working query, with just the space and plus sign added in the fq parameter.
>
> Can someone help to clarify whether this is the expected behavior or just
> a hack that we have found?
>
> json.facet={domains:{type:terms,field:domains,domain:{excludeTags:DOMAIN}},specialties:{type:terms,field:specialties,domain:{excludeTags:SPECIALTY}}}
> fq={!tag=DOMAIN} +domains:100 AND {!tag=SPECIALTY} +specialties:(1043 1023)


You've still got the {!tag} localparam twice in one query string.  
That's the fundamental problem I was highlighting.  Even if you've found 
some syntax that makes it function, I don't think it's right.  And the 
workaround might stop working in a future version, when the feature is 
improved or a bug gets fixed.
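
One way to keep each {!tag} localparam at the very start of its query text is to send every tagged filter as a separate fq parameter. The following is only a minimal sketch in Python of how such a request could be assembled, assuming the facet definition travels in json.facet as in the linked tutorial. The helper name build_params and the q=*:* main query are made up for illustration, and no Solr host or collection is assumed.

```python
from urllib.parse import urlencode

def build_params(filters, facets):
    """Return a URL-encoded query string with one fq per tagged filter.

    filters: list of (tag, query) pairs, e.g. ("DOMAIN", "domains:100").
    facets:  the json.facet value as a plain string.
    """
    # q=*:* is an assumption for this sketch; the thread does not show q.
    params = [("q", "*:*"), ("json.facet", facets)]
    for tag, query in filters:
        # The localparam sits at the very beginning of each fq value.
        params.append(("fq", "{!tag=%s}%s" % (tag, query)))
    return urlencode(params)

facets = (
    "{domains:{type:terms,field:domains,domain:{excludeTags:DOMAIN}},"
    "specialties:{type:terms,field:specialties,domain:{excludeTags:SPECIALTY}}}"
)
query_string = build_params(
    [("DOMAIN", "domains:100"), ("SPECIALTY", "specialties:(1043 1023)")],
    facets,
)
print(query_string)
```

Because the filters are passed as a list of pairs, the order of the fq parameters can be changed without moving a localparam into the middle of a query string.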


Thanks,
Shawn



Re: Bad request from solr by just changing the order in the filter query

2018-04-07 Thread Elier Delgado
On Sat, Mar 31, 2018 at 12:46 PM, Shawn Heisey wrote:

> On 3/31/2018 10:22 AM, Elier Delgado wrote:
>
>> Hi, I'm working with facets in Solr 7.2.1. Basically I'm following this
>> tutorial:
>> http://yonik.com/multi-select-faceting/
>>
>> The following Solr request works just fine:
>>
>> json.facet={domains:{type:terms,field:domains,domain:{excludeTags:DOMAIN}},specialties:{type:terms,field:specialties,domain:{excludeTags:SPECIALTY}}}
>> fq={!tag=SPECIALTY}specialties:(1043 1023) AND {!tag=DOMAIN}domains:100
>>
>> If I just change the order in the fq parameter, Solr responds with 400 Bad
>> Request.
>>
>> json.facet={domains:{type:terms,field:domains,domain:{excludeTags:DOMAIN}},specialties:{type:terms,field:specialties,domain:{excludeTags:SPECIALTY}}}
>> fq={!tag=DOMAIN}domains:100 AND {!tag=SPECIALTY}specialties:(1043 1023)
>>
>> I would like to know if I'm doing something wrong or if it could be a
>> problem in Solr.
>>
>
> You've got {!tag} placed twice in one fq.  I don't think you can do that
> -- localparams (which is what that syntax is using) generally must be at
> the very beginning of query text, and don't work when they're placed in the
> middle.
>
> Try this:
>
> fq={!tag=DOMAIN}domains:100&fq={!tag=SPECIALTY}specialties:(1043 1023)
>
> The first request probably should have also thrown an error, and the fact
> that it didn't MIGHT be a problem.
>
> What is the full error, which should be many lines long? If you don't see
> something that's lots of lines, look in solr.log.
>
> Thanks,
> Shawn
>

Thanks Shawn, your query works but the facet counters became inconsistent.

A colleague found that adding a space between the tag and the field makes
the query work as expected in any order.
Also, adding the plus sign forces all conditions to match. Here is the
working query, with just the space and plus sign added in the fq parameter.

Can someone help to clarify whether this is the expected behavior or just
a hack that we have found?

json.facet={domains:{type:terms,field:domains,domain:{excludeTags:DOMAIN}},specialties:{type:terms,field:specialties,domain:{excludeTags:SPECIALTY}}}
fq={!tag=DOMAIN} +domains:100 AND {!tag=SPECIALTY} +specialties:(1043 1023)

Thanks


Re: ZK CLI script giving IOException doing upconfig

2018-04-07 Thread Shawn Heisey

On 4/4/2018 1:51 PM, Shawn Heisey wrote:
> On 4/4/2018 12:13 PM, Doug Turnbull wrote:
>> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@383] - Exception
>> causing close of session 0x10024db7e280006: *Len error 5327937*
>
> With that information, I think I can tell you what went wrong.


I asked the ZK list a question related to this situation, to figure out 
how to get certain information from the ZK client so Solr could log 
warnings when files are too large.  They told me that the ZK client 
should have indicated what went wrong, as the server log did.


Would it be possible for you to provide entire logs (both the script and 
the ZK server) from an upload that fails this way, so I can review what 
changes we should make and whether we need to file a bug on ZK?


I did come up with a preliminary patch for Solr that logs warnings about 
file sizes.  It doesn't attempt to change or abort the operation, just 
log warnings.  Any error messages would be the same as they are now.
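
The idea of warning about file sizes before an upload can be sketched outside of Solr as well. The snippet below is not the patch linked in this message; it is a minimal Python illustration that walks a config directory and reports files larger than a limit. It assumes the limit in play is ZooKeeper's documented default for jute.maxbuffer (0xfffff bytes, just under 1 MB), and the function name oversized_files is made up for this example. A request also carries some overhead, so a file only slightly below the limit may still be rejected.

```python
import os
import tempfile

# ZooKeeper's documented default for jute.maxbuffer: 0xfffff bytes.
# Assumption: the server in question has not raised this value.
DEFAULT_LIMIT = 0xFFFFF

def oversized_files(conf_dir, limit=DEFAULT_LIMIT):
    """Return sorted (path, size) pairs for files larger than limit bytes."""
    found = []
    for root, _dirs, files in os.walk(conf_dir):
        for name in files:
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            if size > limit:
                found.append((path, size))
    return sorted(found)

# Small demonstration with throwaway files; only the report is printed,
# nothing is changed or aborted.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "small.txt"), "wb") as handle:
        handle.write(b"x" * 10)
    with open(os.path.join(demo_dir, "big.txt"), "wb") as handle:
        handle.write(b"x" * (DEFAULT_LIMIT + 1))
    for path, size in oversized_files(demo_dir):
        print("WARNING: %s is %d bytes, above the %d byte limit"
              % (path, size, DEFAULT_LIMIT))
```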


https://www.dropbox.com/s/uzi607evjt9gacc/upload-zk-size-warnings.patch?dl=0

Thanks,
Shawn



Re: Solr document routing using composite key

2018-04-07 Thread Nawab Zada Asad Iqbal
Thanks Shawn and Erick.

This is what I also ended up finding: I noticed the issue as the number of
buckets increased.

Zheng: I am using Solr 7, but this was only an experiment on the hash, i.e.,
what distribution I should expect from it (as the above gist shows). I
didn't actually index into Solr 7, but I would expect it to do something like
the above if I had actually indexed with these partitions and IDs.
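
A rough way to see the same effect without Solr is to hash IDs into buckets and count. The sketch below is an illustration only: it uses MD5 modulo the bucket count as a stand-in hash, which is an assumption and not how Solr's compositeId router assigns documents to shards, and the function names are made up. When n IDs are hashed uniformly into n buckets, roughly a 1/e fraction (about 37%) of the buckets is expected to stay empty, so many empty shards with 117 IDs is not surprising.

```python
import hashlib
from collections import Counter

def bucket_counts(num_docs, num_buckets):
    """Hash the IDs 0..num_docs-1 into buckets and count documents per bucket."""
    counts = Counter()
    for doc_id in range(num_docs):
        # Stand-in hash (assumption): first 4 bytes of MD5, modulo bucket count.
        digest = hashlib.md5(str(doc_id).encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % num_buckets
        counts[bucket] += 1
    return counts

def summarize(num_docs, num_buckets):
    """Return (number of empty buckets, size of the largest bucket)."""
    counts = bucket_counts(num_docs, num_buckets)
    empty = num_buckets - len(counts)
    largest = max(counts.values())
    return empty, largest

for num_docs in (117, 117_000):
    empty, largest = summarize(num_docs, 117)
    print("docs=%d empty_buckets=%d largest_bucket=%d"
          % (num_docs, empty, largest))
```

Comparing the two runs shows how the relative imbalance shrinks once far more documents than buckets are hashed.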





On Fri, Mar 16, 2018 at 9:24 AM, Erick Erickson wrote:

> What Shawn said. 117 shards and 116 docs tells you absolutely nothing
> useful. I've never seen the number of docs on various shards be off by
> more than 2-3% when enough docs are indexed to be statistically valid.
>
> Best,
> Erick
>
> On Fri, Mar 16, 2018 at 5:34 AM, Shawn Heisey  wrote:
> > On 3/6/2018 11:53 AM, Nawab Zada Asad Iqbal wrote:
> >>
> >> I have 117 shards and I tried to use document ids from zero to 116. I find
> >> that the distribution is very uneven, e.g., the largest bucket receives
> >> a total of 5 documents, and around 38 shards will be empty.  Is it expected?
> >
> >
> > With such a small data set, this fits what I would expect.
> >
> > Choosing buckets by hashing (which is what compositeId does) is not perfect,
> > but if you send it thousands or millions of documents, it will be
> > *generally* balanced.
> >
> > Thanks,
> > Shawn
> >
>


Re: Running an analyzer chain in an update request processor

2018-04-07 Thread Walter Underwood
As I think more about this, we should have a signature processor that uses 
minhash. The MD5 signature processor was really easy to use.
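
To make the idea concrete, here is a minimal Python sketch of a minhash-style signature formatted as hex strings. It is an illustration under stated assumptions, not Lucene's minhash token filter and not an existing Solr signature processor: it shingles whitespace-separated words, uses salted MD5 as the hash family, and the names shingles and minhash_signature are made up.

```python
import hashlib

def shingles(text, size=3):
    """Return the set of word shingles (runs of `size` words) in text."""
    words = text.lower().split()
    if len(words) < size:
        return {" ".join(words)}
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}

def minhash_signature(text, num_hashes=8, shingle_size=3):
    """Return num_hashes hex strings, one minimum hash per salted hash function."""
    grams = shingles(text, shingle_size)
    signature = []
    for seed in range(num_hashes):
        # Salting MD5 with the seed stands in for a family of hash functions.
        salt = seed.to_bytes(4, "big")
        lowest = min(
            int.from_bytes(
                hashlib.md5(salt + gram.encode("utf-8")).digest()[:8], "big"
            )
            for gram in grams
        )
        # 8 bytes rendered as 16 hex characters, suitable for a string field.
        signature.append(format(lowest, "016x"))
    return signature

print(minhash_signature("the quick brown fox jumps over the lazy dog"))
```

Two texts that share many shingles tend to share many signature values, which is what makes such a signature useful for near-duplicate detection in a way that a single MD5 of the whole text is not.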

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Apr 7, 2018, at 4:55 AM, Emir Arnautović wrote:
> 
> Hi Walter,
> I did this sample processor for the purpose of having doc values on an analysed
> field: https://github.com/od-bits/solr-multivaluefield-processor
>
> (+ related blog:
> http://www.od-bits.com/2018/02/solr-docvalues-on-analysed-field.html)
> 
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> 
> 
> 
>> On 6 Apr 2018, at 23:46, Walter Underwood  wrote:
>> 
>> Is there an easy way to define an analyzer chain in schema.xml then run it 
>> in an update request processor?
>> 
>> I want to run a chain ending in the minhash token filter, then take those 
>> minhashes, convert them to hex, and put them in a string field. I’d like the 
>> values stored.
>> 
>> It seems like this could all work in an update request processor. Grab the 
>> text from one field, run it through the chain, format the output tokens and 
>> add them to the field for hashes.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
> 



Re: Running an analyzer chain in an update request processor

2018-04-07 Thread Emir Arnautović
Hi Walter,
I did this sample processor for the purpose of having doc values on an analysed
field: https://github.com/od-bits/solr-multivaluefield-processor

(+ related blog:
http://www.od-bits.com/2018/02/solr-docvalues-on-analysed-field.html)

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 6 Apr 2018, at 23:46, Walter Underwood  wrote:
> 
> Is there an easy way to define an analyzer chain in schema.xml then run it in 
> an update request processor?
> 
> I want to run a chain ending in the minhash token filter, then take those 
> minhashes, convert them to hex, and put them in a string field. I’d like the 
> values stored.
> 
> It seems like this could all work in an update request processor. Grab the 
> text from one field, run it through the chain, format the output tokens and 
> add them to the field for hashes.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>