Re: UnifiedHighlighter returns an error when setting hl.maxAnalyzedChars=-1

2019-01-06 Thread Yasufumi Mizoguchi
Hi,

I opened a JIRA about this.
https://issues.apache.org/jira/browse/SOLR-13121


Thanks,
Yasufumi.

On Fri, Dec 28, 2018 at 13:39, Yasufumi Mizoguchi wrote:

> Hi,
>
> I ran into a UnifiedHighlighter error when setting hl.maxAnalyzedChars=-1 in
> Solr 7.6.
> Here is the procedure for reproducing it.
>
> $ bin/solr -e techproducts
> $ curl -XGET
> "localhost:8983/solr/techproducts/select?hl.fl=name=-1=unified=on=memory=name"
>
> I have written a patch to replace negative values of the parameter with
> Integer.MAX_VALUE - 1
> (Because UnifiedHighlighter seems not to accept
> maxAnalyzedChars=Integer.MAX_VALUE,
> unlike the others...)
>
> Can I open a JIRA about this issue and post my patch to that?
>
> Thanks,
> Yasufumi.
>
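
The workaround described in the quoted message can be sketched as follows. This is only an illustration in Python, not the patch attached to SOLR-13121: the function name `effective_max_analyzed_chars` is hypothetical, and the constant simply mirrors Java's Integer.MAX_VALUE.

```python
# Java's Integer.MAX_VALUE (2**31 - 1), reproduced here for illustration only.
JAVA_INT_MAX = 2**31 - 1


def effective_max_analyzed_chars(requested: int) -> int:
    """Map a negative hl.maxAnalyzedChars to Integer.MAX_VALUE - 1.

    Hypothetical helper: per the message above, UnifiedHighlighter seems
    not to accept Integer.MAX_VALUE itself, hence the "- 1".
    """
    if requested < 0:
        return JAVA_INT_MAX - 1
    return requested


print(effective_max_analyzed_chars(-1))     # 2147483646
print(effective_max_analyzed_chars(51200))  # 51200
```

Non-negative values pass through unchanged, so only the "no limit" case is rewritten.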


Re: Solr relevancy score different on replicated nodes

2019-01-06 Thread Ashish Bisht
Hi Erick,

Thank you for the details, but it doesn't look like a difference in
autocommit timing caused this issue. As I said, when I run either a
retrieve-all query or a keyword query on both servers, they return the
correct number of docs; only the relevancy scores differ.

I waited for a brief period, but the discrepancy persisted (with no indexing
going on). So I went ahead and deleted the follower node (assuming the leader
replica was in the correct state). After adding a new replica, the issue no
longer appears.

We will monitor this in case it appears again.

Regards
Ashish





Solr Replication

2019-01-06 Thread Mannar mannan
Hi All,

I would like to configure master/slave replication between two SolrCloud
clusters (for failover). Below is the scenario:

Solr version : 7.0

Cluster 1:
3 zookeeper instances :   zk1, zk2, zk3
2 solr instances : solr1, solr2

Cluster 2:
1 zookeeper instance : bkpzk1,
1 solr instances : bkpsolr1, bkpsolr2

Master / Slave : solr1 / bkpsolr1
                 solr2 / bkpsolr2

Is it possible to configure master/slave replication between the Solr
instances running in cluster 1 and cluster 2 (for failover)? Kindly let me
know whether this is possible.


How to have the same SOLR cores for both 8983 and 8984 ports

2019-01-06 Thread Muniraj M
Hi,

I am using Apache Solr 6.6.5 as my search engine, running on port 8983. I
wanted to enable SSL for Solr and followed a guide to make it work on
port 8984 with SSL.

My problem is that on port 8984 I cannot see any of the cores that were
already created on port 8983 (the port without SSL).

http://mywebsite.com:8983/solr/#/ ==> This has 3 cores

https://mywebsite.com:8984/solr/#/ ==> This doesn't have any cores

I would really appreciate it if anyone could suggest how to make the same
cores available on both ports 8983 and 8984.

Thanks

-- 
Regards,
*Muniraj M*


Web Server HTTP Header Internal IP Disclosure SOLR port

2019-01-06 Thread Muniraj M
Hi,

I am using Apache Solr 6.6.5 as my search engine. When we ran a security
scan on our server, we got the response below:

*When processing the following request : GET / HTTP/1.0 this web server
leaks the following private IP address : X.X.X.X as found in the following
collection of HTTP headers : HTTP/1.1 302 Found
Location: http://X.X.X.X:8983/solr/
 Content-Length: 0*

I have spent some time looking into this but haven't found a solution. Any
idea how to solve this would be really appreciated.
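
To make the scanner finding concrete, here is a minimal Python sketch of the check it appears to perform: read the Location header of a redirect response and test whether its host is a private IP address. The function name `leaked_private_host` and the sample address are hypothetical; this only detects the symptom and is not a fix for Solr or Jetty.

```python
import ipaddress
from urllib.parse import urlsplit


def leaked_private_host(raw_response: str):
    """Return the Location host if it is a private IP literal, else None."""
    for line in raw_response.splitlines()[1:]:  # skip the status line
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            host = urlsplit(value.strip()).hostname
            try:
                if host and ipaddress.ip_address(host).is_private:
                    return host
            except ValueError:
                return None  # a hostname, not a literal IP address
    return None


# Hypothetical response modelled on the scan report above.
sample = (
    "HTTP/1.1 302 Found\r\n"
    "Location: http://10.0.0.5:8983/solr/\r\n"
    "Content-Length: 0\r\n\r\n"
)
print(leaked_private_host(sample))
```

Running the same check against the raw response to `GET / HTTP/1.0` before and after any configuration change would show whether the redirect still exposes the internal address.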

-- 
Regards,
*Muniraj M*


Re: Debugging Solr Search results & Issues with Distributed IDF

2019-01-06 Thread Lavanya Thirumalaisami
Thank you for the inputs, Doug and Charlie.

On Wednesday, 2 January 2019, 11:39:13 pm AEDT, Doug Turnbull wrote:

On (2): these are BM25 parameters. Several articles discuss BM25 in depth:

https://opensourceconnections.com/blog/2015/10/16/bm25-the-next-generation-of-lucene-relevation/

https://www.elastic.co/blog/practical-bm25-part-2-the-bm25-algorithm-and-its-variables
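
As a rough illustration of how k1 and b enter the score, the Python sketch below applies the classic BM25 formulas as I understand Lucene's BM25Similarity to use them in this Solr generation; treat the exact formulas as an assumption and check them against the articles above. The function names are hypothetical. With norms omitted, b is effectively 0, which matches the "0.0 = parameter b" lines in the quoted debug output.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """Inverse document frequency: rarer terms get a larger weight."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq, k1=1.2, b=0.0, field_len=1.0, avg_field_len=1.0):
    """Term-frequency saturation.

    k1 controls how quickly repeated occurrences stop adding score;
    b controls how strongly long fields are penalised (0 disables it).
    """
    length_factor = 1 - b + b * field_len / avg_field_len
    return freq * (k1 + 1) / (freq + k1 * length_factor)


# Values taken from the "All:2x in 962" clause of the quoted Doc1 explain.
idf = bm25_idf(doc_freq=1346, doc_count=42836)
tf_norm = bm25_tf_norm(freq=2.0)
print(idf, tf_norm, idf * tf_norm)
```

The three printed values should come out close to the 3.4598935, 1.375 and 4.7573533 reported for that clause in the debug output quoted below.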





On Tue, Jan 1, 2019 at 6:04 PM Lavanya Thirumalaisami wrote:

>
> Hi,
>
> I am trying to debug a query to find out why one document gets a higher
> score than the other. Below are two similar products.
>
> Below are the debug results I get from the Solr admin console.
>
>  "Doc1": "\n15.20965 = sum of:\n 4.7573533 = max of:\n    4.7573533 =
> weight(All:2x in 962) [], result of:\n          4.7573533 =
> score(doc=962,freq=2.0 = termFreq=2.0\n), product of:\n      3.4598935 =
> idf(docFreq=1346, docCount=42836)\n        1.375 = tfNorm, computed
> from:\n          2.0 = termFreq=2.0\n          1.2 = parameter
> k1\n          0.0 = parameter b (norms omitted for field)\n  10.452296 = max
> of:\n    5.9166136 = weight(All:powerpoint in 962) [], result of:\n
> 5.9166136 = score(doc=962,freq=2.0 = termFreq=2.0\n), product of:\n
> 4.302992 = idf(docFreq=579, docCount=42836)\n        1.375 = tfNorm, computed
> from:\n          2.0 = termFreq=2.0\n          1.2 = parameter k1\n
> 0.0 = parameter b (norms omitted for field)\n    10.452296 =
> weight(All:\"socket outlet\" in 962) [], result of:\n      10.452296 =
> score(doc=962,freq=2.0 = phraseFreq=2.0\n), product of:\n      7.60167 =
> idf(), sum of:\n        3.5370626 = idf(docFreq=1246,
> docCount=42836)\n          4.064607 =
> idf(docFreq=735, docCount=42836)\n        1.375 = tfNorm, computed
> from:\n          2.0 = phraseFreq=2.0\n          1.2 =
> parameter k1\n          0.0 = parameter b (norms omitted for field)\n",
>
> "Doc15": "\n13.258003 = sum of:\n  5.7317085 = max of:\n    5.7317085 =
> weight(All:doubl in 2122) [], result of:\n      5.7317085 =
> score(doc=2122,freq=2.0 = termFreq=2.0\n), product of:\n        4.168515 =
> idf(docFreq=663, docCount=42874)\n        1.375 = tfNorm, computed
> from:\n          2.0 = termFreq=2.0\n          1.2 = parameter k1\n
> 0.0 = parameter b (norms omitted for field)\n    4.7657394 = weight(All:2x in
> 2122) [], result of:\n    4.7657394 = score(doc=2122,freq=2.0 =
> termFreq=2.0\n), product of:\n        3.4659925 = idf(docFreq=1339,
> docCount=42874)\n      1.375 = tfNorm, computed from:\n        2.0 =
> termFreq=2.0\n          1.2 = parameter k1\n          0.0 = parameter b
> (norms omitted for field)\n    5.390302 = weight(All:2g in 2122) [], result
> of:\n    5.390302 = score(doc=2122,freq=2.0 = termFreq=2.0\n), product
> of:\n        3.9202197 = idf(docFreq=850, docCount=42874)\n        1.375 =
> tfNorm, computed from:\n          2.0 = termFreq=2.0\n          1.2 =
> parameter k1\n          0.0 = parameter b (norms omitted for field)\n
> 7.526294 = max of:\n    5.8597584 = weight(All:powerpoint in 2122) [],
> result of:\n      5.8597584 = score(doc=2122,freq=2.0 = termFreq=2.0\n),
> product of:\n        4.2616425 = idf(docFreq=604, docCount=42874)\n
> 1.375 = tfNorm, computed from:\n          2.0 = termFreq=2.0\n          1.2
> = parameter k1\n          0.0 = parameter b (norms omitted for field)\n
> 7.526294 = weight(All:\"socket outlet\" in 2122) [], result of:\n
> 7.526294 = score(doc=2122,freq=1.0 = phraseFreq=1.0\n), product
> of:\n      7.526294 = idf(), sum of:\n        3.4955401 =
> idf(docFreq=1300, docCount=42874)\n          4.030754 =
> idf(docFreq=761, docCount=42874)\n        1.0 = tfNorm, computed
> from:\n          1.0 = phraseFreq=1.0\n          1.2 =
> parameter k1\n          0.0 = parameter b (norms omitted for field)\n",
>
>
>
> My Questions
>
> 1.      IDF: I understand from the Solr documentation that IDF is
> calculated separately for each shard. I have added the following stats
> cache config to solrconfig.xml and reloaded the collection:
>
> 
>
> But even after that, there is no change in the calculated IDF.
>
> 2.      What are parameter b and parameter k1?
>
> 3.      Why are there many more parameters included for Doc15 than for
> Doc1?
>
> Is there any documentation I can refer to in order to understand the Solr
> query calculations in depth?
>
> We are using Solr 6.1 in cloud mode with 3 ZooKeepers, 3 masters, and 3
> replicas.
>
> Regards,
> Lavanya
>
-- 
Doug Turnbull | CTO | OpenSource Connections, LLC | 240.476.9983
Author: Relevant Search

Re: Is it possible to force solr show all facet values for the field with an enum type?

2019-01-06 Thread Erik Hatcher
How about =-field:[* TO *] as a way to see a count of docs that 
don’t have field?

Erik 

> On Jan 5, 2019, at 04:45, Arvydas Silanskas wrote:
> 
> Hello,
> I have an enum Solr fieldtype. When I do a facet search, I want all
> the enum values to appear in the facet, and setting facet.mincount = 0 is
> not enough. It only works if there exists a document with the matching
> value for the field that was filtered out by the current query (in which
> case I get that facet value back with the count 0). But can I make it also
> return the values that literally none of the documents in the index have,
> i.e. the values that only appear in the enum declaration XML?


Re: Is it possible to force solr show all facet values for the field with an enum type?

2019-01-06 Thread Arvydas Silanskas
I haven't checked if it works, but even if it did, that would kinda defeat
the purpose -- namely not repeating enum values in multiple places, and
having them all contained in a single spot (enumConfig.xml).

On Sun, 2019-01-06 at 11:43, David Santamauro wrote:

> Seeing that the field is an enumeration, couldn't you just use a set of
> facet.query(s)?
>
>   ?q=*:*
>   =user_s:Bar
>   =true
>   =enumfield:A
>   =enumfield:B
>   =0
>
> //


Re: Is it possible to force solr show all facet values for the field with an enum type?

2019-01-06 Thread Arvydas Silanskas
Feared as much, but now at least I know for sure. Thank you.

On Sat, 2019-01-05 at 22:19, Mikhail Khludnev wrote:

> Hello,
> On Sat, Jan 5, 2019 at 12:53 PM Arvydas Silanskas <
> nma.arvydas.silans...@gmail.com> wrote:
>
> > But can I make it to also
> > return the values that literally none of the documents in the index have?
> >
> No.
>
> --
> Sincerely yours
> Mikhail Khludnev
>


Re: Is it possible to force solr show all facet values for the field with an enum type?

2019-01-06 Thread David Santamauro
Seeing that the field is an enumeration, couldn't you just use a set of 
facet.query(s)?

  ?q=*:*
  =user_s:Bar
  =true
  =enumfield:A
  =enumfield:B
  =0

//
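
A request along these lines could be assembled as in the Python sketch below. The parameter names are partly assumptions, because the archived copy of the example above lost them: `facet.query` comes from the text of this message, while `fq`, `facet` and `rows` are my reading of the remaining lines. The helper name `facet_query_params` is hypothetical.

```python
from urllib.parse import urlencode


def facet_query_params(field, values, fq=None):
    """Build a query string with one facet.query per enum value."""
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    if fq:
        params.append(("fq", fq))
    # One facet.query per value, so every value gets a count, even 0.
    params.extend(("facet.query", f"{field}:{value}") for value in values)
    return urlencode(params)


print(facet_query_params("enumfield", ["A", "B"], fq="user_s:Bar"))
```

Each facet.query comes back with its own count in the response's facet_queries section, including a count of 0 for values no document has. Values containing spaces or special characters would need quoting in the query syntax, which this sketch does not handle.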

On 1/5/19, 3:01 PM, "Arvydas Silanskas" wrote:

Thanks for your reply.

No, not exactly what I want.

Consider I have enum defined as


A
B


and correspondingly I have defined a fieldtype "enumType" that uses this
enum, and a field "enumfield" that is of type "enumType". Consider my index
is like this:

[
  {
"name_s":"Doc 1",
"enumfield":"A",
"user_s":"Foo",
"id":"2ebc0754-e7d8-405e-9962-99c6cd1d9275",
"_version_":1621850725207244800},
  {
"name_s":"Doc 2",
"user_s":"Bar",
"id":"0536827a-703a-456e-9087-71b85b63c58b",
"_version_":1621850725397037056}]

Notice how there are no documents that have "enumfield":"B".
Now, if I execute the query
"facet.field=enumfield=on=user_s:Bar=on=*:*=json",
my facet response's fields look like this:

"facet_fields":{
  "enumfield":[
"A",0]}

There is no "B" key -- and that's my problem. It tells me about other
facet values if they're filtered out by fq, but it tells me nothing
about facet values that aren't present in any doc.

My question is how to force the response to be
"facet_fields":{
  "enumfield":[
"A",0,
"B", 0]}


On Sat, 2019-01-05 at 19:42, Erick Erickson wrote:

> So really the results you want are q=*:*=enumField right?
> You could fire that query in parallel and combine the two in your app,
> perhaps caching the result if the index isn't changing very rapidly.
>
> Facets were designed with the idea that they'd only count for docs
> that were hits, so there's no built-in way to do what you want. Which, BTW,
> could be _very_ expensive in the general case. The query would
> have to count up, say, the hits for 100M documents...
>
> Best,
> Erick
>
> On Sat, Jan 5, 2019 at 1:53 AM Arvydas Silanskas wrote:
> >
> > Hello,
> > I have an enum Solr fieldtype. When I do a facet search, I want all
> > the enum values to appear in the facet, and setting facet.mincount = 0 is
> > not enough. It only works if there exists a document with the matching
> > value for the field that was filtered out by the current query (in which
> > case I get that facet value back with the count 0). But can I make it also
> > return the values that literally none of the documents in the index have,
> > i.e. the values that only appear in the enum declaration XML?
>
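
Combining results in the application, as suggested in the thread above, might look like the following Python sketch: read the declared values from the enum configuration and fill in a count of 0 for any value missing from the facet response. The XML layout (`enumsConfig` / `enum` / `value`) is my recollection of Solr's enum configuration format, and the enum name `myEnum` and both function names are hypothetical, so adjust them to the real file.

```python
import xml.etree.ElementTree as ET

# Assumed layout of the enum configuration file; "myEnum" is a made-up name.
ENUMS_XML = """
<enumsConfig>
  <enum name="myEnum">
    <value>A</value>
    <value>B</value>
  </enum>
</enumsConfig>
"""


def enum_values(xml_text, enum_name):
    """Return the declared values of one enum, in declaration order."""
    root = ET.fromstring(xml_text)
    for enum in root.iter("enum"):
        if enum.get("name") == enum_name:
            return [value.text for value in enum.findall("value")]
    return []


def fill_missing(facet_list, all_values):
    """Turn Solr's flat [value, count, ...] list into a dict and add zeros."""
    counts = dict(zip(facet_list[0::2], facet_list[1::2]))
    return {value: counts.get(value, 0) for value in all_values}


# ["A", 0] is the facet_fields list for "enumfield" shown in the thread.
print(fill_missing(["A", 0], enum_values(ENUMS_XML, "myEnum")))
# {'A': 0, 'B': 0}
```

This keeps the enum values defined in a single place, at the cost of the application having to read the configuration file itself.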