Re: Trouble Installing Solr 7.1.0 On Ubuntu 17

2017-10-23 Thread Yasufumi Mizoguchi
Hi,

Maybe you have the wrong path. Try the following:
$ sudo solr-7.1.0/bin/install_solr_service.sh 

Thanks,
Yasufumi.

2017-10-24 12:11 GMT+09:00 Dane Terrell :

> Hi I'm new to apache solr. I'm looking to install apache solr 7.1.0 on my
> localhost computer. I downloaded and extracted the tar file in my tmp
> folder. But when I try to run the script... sudo:
> solr-7.1.0/solr/bin/install_solr_service.sh: command not found
> or
> solr-7.1.0/solr/bin/install_solr_service.sh --strip-components=2
> I get the same error message. Can anyone help?
> Dane
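For reference, the installer script lives at solr-7.1.0/bin/ inside the archive, with no intermediate solr/ directory, which is why the path above fails. If memory serves, the reference guide suggests extracting just the installer, e.g. `tar xzf solr-7.1.0.tgz solr-7.1.0/bin/install_solr_service.sh --strip-components=2`. A small stdlib Python sketch of what that strip does (archive path assumed, not verified against the thread):

```python
import os
import tarfile

def extract_installer(tgz_path, dest="."):
    """Pull install_solr_service.sh out of the tarball, dropping the two
    leading path components solr-7.1.0/bin/ (what --strip-components=2 does)."""
    with tarfile.open(tgz_path) as tar:
        member = tar.getmember("solr-7.1.0/bin/install_solr_service.sh")
        member.name = os.path.basename(member.name)  # strip leading dirs
        tar.extract(member, dest)
    return os.path.join(dest, member.name)
```

After extraction the script is run from the current directory, so the `command not found` from a wrong nested path goes away.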


Re: Solr nodes going into recovery mode and eventually failing

2017-10-23 Thread shamik
Thanks Emir and Zisis.

I added the maxRamMB for filterCache and reduced the size. I could see the
benefit immediately; the hit ratio went to 0.97. Here's the configuration:





It seemed to be stable for a few days; the cache hits and JVM pool utilization
seemed to be well within the expected range. But the OOM issue occurred on one
of the nodes as the heap size reached 30gb. The hit ratios for the query result
cache and document cache at that point were recorded as 0.18 and 0.65. I'm
not sure if the cache caused the memory spike at this point; with the filter
cache restricted to 500mb, it should be negligible. One thing I noticed is
that the eviction rate now (with the addition of maxRamMB) is staying at 0.
Index hard commit happens every 10 min; that's when the cache gets
flushed. Based on the monitoring log, the spike happened on the indexing
side, where almost 8k docs went to a pending state.

From a query performance standpoint, there have been occasional slow queries
(1sec+), but nothing alarming so far. The same goes for deep paging; I haven't
seen any evidence pointing to that.

Based on the hit ratios, I can further scale down the query result and
document caches, and also change to FastLRUCache and add maxRamMB. For the
filter cache, I think this setting should be optimal enough to work on a 30gb
heap unless I'm wrong about the maxRamMB concept. I'll have to get a heap dump
somehow; unfortunately, the whole process (of the node going down) happens
so quickly that I've hardly any time to run a profiler.
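For reference, a back-of-the-envelope way to sanity-check maxRamMB sizing (my numbers and assumptions, not measurements from this thread): a filter cached as a full bitset costs roughly maxDoc / 8 bytes, so the RAM cap implies a maximum number of resident filter entries for a given index size.

```python
# Rough filter-cache arithmetic (illustrative assumption: filters stored
# as full bitsets; real caches may store sparse sets for small results).
def bitset_bytes(max_doc):
    # one bit per document in the core
    return max_doc // 8

def entries_in_budget(max_doc, max_ram_mb):
    """How many full-bitset filters fit under a maxRamMB cap."""
    return (max_ram_mb * 1024 * 1024) // bitset_bytes(max_doc)

# e.g. a 100M-doc core: one filter is ~12.5 MB, so maxRamMB=500 holds ~41
# entries before RAM-based eviction kicks in.
```

This kind of estimate helps decide whether a 500mb cap is plausibly "negligible" against a 30gb heap.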



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr deep paging queries run very slow due to redundant q param

2017-10-23 Thread Sundeep T
Pinging again. Does anyone have ideas on this? Thanks

On Sat, Oct 14, 2017 at 4:52 PM, Sundeep T  wrote:

> Hello,
>
> In our scale environment, we see that deep paging queries using
> cursorMark are running really slow. When we traced the calls, we saw
> that the second query, which retrieves the individual ids of matched pages,
> is sent the q param that was already sent by the first query. If we
> remove the q param and directly query for ids, the query runs really fast.
>
> For example, the initial pagination query is like this with q param on
> timestamp field -
>
> 2017-10-14 12:20:51.647 UTC INFO  (qtp331844619-393343)
> [core='x:c6e422fc3054c475-core-1'] org.apache.solr.core.SolrCore.
> Request@2304 [c6e422fc3054c475-core-1]  webapp=/solr path=/select
> params={distrib=false&df=text&paginatedQuery=true&fl=id&
> shards.purpose=4&start=0&fsv=true&sort=timestamp+desc+,id+asc&shard.url=
> http://ops-data-solr-svc-1.rattle.svc.cluster.local:80/solr/
> c6e422fc3054c475-core-1&*rows=50*&version=2&
> *q=(timestamp:["2017-10-13T18:42:36Z"+TO+"2017-10-13T21:09:00Z"])*&
> shards.tolerant=true&*cursorMark=**&NOW=1507928978918&isShard=
> true&timeAllowed=-1&wt=javabin&trackingId=d5eff5476247487555b7413214648}
> hits=40294067 status=0 QTime=12727
>
> This query results in a second query due to Solr's implementation of deep
> paging, like the one below. In this query, we already know the ids to be
> matched, so there is no reason to pass the q param again. We tried manually
> executing the query below without the q param, just passing the ids
> alone, and it executes in 50ms. So this looks like a bug: Solr is
> passing the q param again. Any ideas if there is a workaround for this
> problem we can use?
>
> 2017-10-14 12:21:09.193 UTC INFO  (qtp331844619-742579)
> [core='x:6d63f95961c46475-core-1'] org.apache.solr.core.SolrCore.
> Request@2304 [6d63f95961c46475-core-1]  webapp=/solr path=/select
> params={distrib=false&df=text&paginatedQuery=true&fl=*,[
> removedocvaluesuffix]&shards.purpose=64&shard.url=http://
> ops-data-solr-svc-1.rattle.svc.cluster.local:80/solr/
> 6d63f95961c46475-core-1&rows=50&version=2&
> *q=(timestamp:["2017-10-14T08:50:16.340Z"+TO+"2017-10-14T19:19:50Z"])*&
> shards.tolerant=true&NOW=1507983581099&ids=00f037832e571941ed46ddd1959205
> 02,145c82e3eaa7678564b9e520822a3de1,09633cfabc6c830dfb44e04c313ba6b4,
> 0032a76ed4ea01207c2891070348ea39,1b5179ee23fe3e17236da37d6b8d991f,
> 04ee42e481b2a657bd3bb3c9f91b5ed5,2a910cf8a259925046a0c9fb5ee013c3,
> 1d1d607b03c18ec59c14c2f9ca0ab47f,034e775c96633dae7e629a1d37da86e6,
> 2759ca26d449d5df9f41689aa8ed3bac,16995a57699a7bb56d5018fe145028ce,
> 0509d16399e679470ffc07a8af22a918,1797ab6e0174c65bf2f6b650b3538407,
> 11c804ec4ae153a31929abe8613de336,11d20ed5dc0cf3d71f57aefc4e4b3ee2,
> 0135baecd2d3ae819909a0c021bbd48b,224b0671196fd141196b15add2e49b91,
> 271088227cf81e3641130d3bd5de8cc6,01f266b9c130239a06b00e45bda277a0,
> 1438bed6ffd956f1c49d765b942f1988,2fc9fef6500124b1b48218169a7cf974,
> 2d85d00593847398bf09e168bb3a190c,10e1c2803df1db3d47e525d3bd8a1868,
> 28b6d72729e79da3ad65ac3879740133,14be34af9995721b358b3fdb0bcb18d7,
> 1f2e0867bd495b8a332b8c8bd8ce2508,12cf1a1c07d9b9550ece4079b15f7583,
> 022cd0b3eef93cd9a6389c0394cf3899,11aa3132e00a96df6a49540612b91c8f,
> 0ff348e0433c9e751f1475db6dcab213,2b48279c9ff833f43a910edfa455a31d,
> 241e002d744ff0215155f214166fdd49,0fee30860c82d9a24bb8389317cd772c,
> 07f04d380832f514b0575866958eebaa,20b0efa5d88e2a9950fa4fd8ba930455,
> 14a9cadb7c75274bfc028bb9ae31236b,1829730aa4ee4750eb242266830b576b,
> 1ad5012e83bd271cf00b0c70ea86a856,0af4247d057bd833753e1f7bef959fc4,
> 0a09767d81cb351ab1598987022b6955,2f166fae9ca809642b8e20cea3020c24,
> 2c4d900575d8594a040c94751af59cb5,03f1c46a004a4e3b995295b512e1e324,
> 2c2aae83afc7426424c7de5301f8c692,034baf21ac1db436a7f3f2cf2cc668b0,
> 1dda29d03fb8611f8de80b90685fd9ee,0632292ab704dcaa606440cb1fee017b,
> 0fbd68f293c6964458a93f3034348625,2cdff46ab2e4d44b42f3381d5e3250b7,
> 1b2c90dce4a51b5e5c344fc2f9ab431d&isShard=true&timeAllowed=-
> 1&wt=javabin&trackingId=d5eff5476247487555b80c9ac7b82} status=0
> QTime=18136
>
> Thanks
> Sundeep
>
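A toy model of the two-phase exchange visible in the logs above (my simplification, not Solr's actual code): phase 1 evaluates q and returns a sorted page of ids plus a cursor value; phase 2 only needs to fetch stored fields for those explicit ids, which is why re-evaluating q in phase 2 looks redundant.

```python
# Hypothetical in-memory "index": doc id -> stored fields.
docs = {i: {"id": i, "ts": i * 10} for i in range(100)}

def phase1_ids(q_pred, rows, cursor):
    """Phase 1: evaluate the query predicate, return the next page of
    sorted ids after the cursor, plus the new cursor."""
    hits = sorted(i for i, d in docs.items() if q_pred(d))
    page = [i for i in hits if i > cursor][:rows]
    return page, (page[-1] if page else cursor)

def phase2_fetch(ids):
    # Phase 2: direct lookup by id -- no query evaluation required.
    return [docs[i] for i in ids]
```

In the model, phase 2 is a pure id lookup; running the q predicate again over the ids would only repeat work already done in phase 1.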


Trouble Installing Solr 7.1.0 On Ubuntu 17

2017-10-23 Thread Dane Terrell
Hi I'm new to apache solr. I'm looking to install apache solr 7.1.0 on my 
localhost computer. I downloaded and extracted the tar file in my tmp folder. 
But when I try to run the script... sudo: 
solr-7.1.0/solr/bin/install_solr_service.sh: command not found
or
solr-7.1.0/solr/bin/install_solr_service.sh --strip-components=2
I get the same error message. Can anyone help?
Dane

How to make use of some features from lucene in SOLR?

2017-10-23 Thread Lisheng Zhang
I need to implement some rather customized sorting in Solr, and I would
appreciate it if you could give some high-level pointers on the following:

1/ Could I make use of a customized Collector (Lucene level) in Solr?

2/ Could I make use of a function query (Lucene level) for customized sorting
in Solr?

3/ I would like to pass some parameters at query time for the Collector &
function query; those parameters are not related to Solr, solely for our
implementation. Could the Solr API accommodate those parameters?

I searched solrconfig.xml and schema.xml but did not see anything (I could
have missed it; a pointer would be great).

Thanks very much for your help, Lisheng
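On 1/: conceptually, a Lucene Collector is handed each matching document (and can consult the scorer) and decides what to keep and in what order; plugging a custom one into Solr typically means a custom SearchComponent or search handler. A language-neutral sketch of the idea (class and method names are mine, not Lucene's API):

```python
class TopNByCustomKey:
    """Model of a collector: observe every (doc_id, score) hit, then rank
    the hits by an arbitrary, caller-supplied sort key."""
    def __init__(self, n, key):
        self.n, self.key, self.hits = n, key, []

    def collect(self, doc_id, score):
        # called once per matching document during the search
        self.hits.append((doc_id, score))

    def top(self):
        # apply the custom ordering and keep the best n
        return sorted(self.hits, key=self.key)[: self.n]
```

The custom query-time parameters in 3/ would then feed into `key`, i.e. the ordering function the collector applies.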


Re: Replacing legacyCloud Behaviour in Solr7

2017-10-23 Thread Marko Babic
Thanks for the quick reply, Erick.

To follow up: 

“
Well, first you can explicitly set legacyCloud=true by using the
Collections API CLUSTERPROP command. I don't recommend this, mind you,
as legacyCloud will not be supported forever.
“

Yes, but like you say: we’ll have to deal with it at some point, so there's not 
much benefit in punting.

“
I'm not following something here though. When you say:
"The desired final state of such a deployment is a fully configured
cluster ready to accept updates."
are there any documents already in the index or is this really a new 
collection?
“

It’s a brand new collection with a new configuration on fresh hardware, which 
we’ll then fully index from a source document store (we do this when we have 
certain schema changes that require re-indexing, or when we want to experiment).

“
Not sure what you mean here. Configuration of what?  Just spinning up
a Solr node pointing to the right ZooKeeper should be sufficient, or
I'm not understanding at all.
“

Apologies, the way I stated that was all wrong: by “requires configuration” I 
just meant the need to specify a shard and a node when adding a replica 
(and not even the node, as you point out below ☺).

“
I suspect you're really talking about the "node" parameter
to ADDREPLICA
“

Ah, yes: that is what I meant, sorry.

It sounds like I haven’t missed too much in the documentation then, I’ll look 
more into replica placement rules.  

Thank you so much again for your time and help.

Marko


On 10/23/17, 4:33 PM, "Erick Erickson"  wrote:

Well, first you can explicitly set legacyCloud=true by using the
Collections API CLUSTERPROP command. I don't recommend this, mind you,
as legacyCloud will not be supported forever.

I'm not following something here though. When you say:
"The desired final state of such a deployment is a fully configured
cluster ready to accept updates."
are there any documents already in the index or is this really a new 
collection?

and "adding new nodes requires explicit configuration"

Not sure what you mean here. Configuration of what?  Just spinning up
a Solr node pointing to the right ZooKeeper should be sufficient, or
I'm not understanding at all.

If not, your proposed outline seems right with one difference:
"if a node needs to be added: provision a machine, start up Solr, use
ADDREPLICA from Collections API passing shard number and coreNodeName"

coreNodeName isn't something you ordinarily need to bother with. I'm
being specific here where coreNodeName is usually something like
core_node7. I suspect you're really talking about the "node" parameter
to ADDREPLICA, something like: 192.168.1.32:8983_solr, the entry from
live_nodes.

Now, all that said you may be better off just letting Solr add the
replica where it wants, it'll usually put a new replica on a node
without replicas so specifying the collection and shard should be
sufficient. Also, note that there are replica placement rules that can
help enforce this kind of thing.

Best,
Erick

On Mon, Oct 23, 2017 at 3:12 PM, Marko Babic  wrote:
> Hi everyone,
>
> I'm working on upgrading a set of clusters from Solr 4.10.4 to Solr 7.1.0.
>
> Our deployment tooling no longer works given that legacyCloud defaults to 
false (SOLR-8256) and I'm hoping to get some advice on what to do going forward.
>
> Our setup is as follows:
>   * we run in AWS with multiple independent Solr clusters, each with its 
own Zookeeper tier
>   * each cluster hosts only a single collection
>   * each machine/node in the cluster has a single core / is a replica for 
one shard in the collection
>
> We bring up new clusters as needed.  This is entirely automated and 
basically works as follows:
>   * we first provision and set up a fresh Zookeeper tier
>   * then, we provision a Solr bootstrapper machine that uploads 
collection config, specifies numShards and starts up
>   * it's then easy to provision the rest of the machines and have them 
automatically join a shard in the collection by hooking them to the right 
Zookeeper cluster and specifying numShards
>   * if a node needs to be added to the cluster we just need to spin a 
machine up and start up Solr
>
> The desired final state of such a deployment is a fully configured 
cluster ready to accept updates.
>
> Now that legacyCloud is false I'm not sure how to preserve this pretty 
nice, hands-off deployment style as the bootstrapping performed by the first 
node provisioned doesn't create a collection and adding new nodes requires 
explicit configuration.
>
> A new deployment procedure that I've worked out using the Collections API 
would look like:
>   * provision Zookeeper tier
>   * provision all the Solr nodes, wait for them all to come up
>   * upload c

Re: Replacing legacyCloud Behaviour in Solr7

2017-10-23 Thread Erick Erickson
Well, first you can explicitly set legacyCloud=true by using the
Collections API CLUSTERPROP command. I don't recommend this, mind you,
as legacyCloud will not be supported forever.

I'm not following something here though. When you say:
"The desired final state of such a deployment is a fully configured
cluster ready to accept updates."
are there any documents already in the index or is this really a new collection?

and "adding new nodes requires explicit configuration"

Not sure what you mean here. Configuration of what?  Just spinning up
a Solr node pointing to the right ZooKeeper should be sufficient, or
I'm not understanding at all.

If not, your proposed outline seems right with one difference:
"if a node needs to be added: provision a machine, start up Solr, use
ADDREPLICA from Collections API passing shard number and coreNodeName"

coreNodeName isn't something you ordinarily need to bother with. I'm
being specific here where coreNodeName is usually something like
core_node7. I suspect you're really talking about the "node" parameter
to ADDREPLICA, something like: 192.168.1.32:8983_solr, the entry from
live_nodes.

Now, all that said you may be better off just letting Solr add the
replica where it wants, it'll usually put a new replica on a node
without replicas so specifying the collection and shard should be
sufficient. Also, note that there are replica placement rules that can
help enforce this kind of thing.

Best,
Erick

On Mon, Oct 23, 2017 at 3:12 PM, Marko Babic  wrote:
> Hi everyone,
>
> I'm working on upgrading a set of clusters from Solr 4.10.4 to Solr 7.1.0.
>
> Our deployment tooling no longer works given that legacyCloud defaults to 
> false (SOLR-8256) and I'm hoping to get some advice on what to do going 
> forward.
>
> Our setup is as follows:
>   * we run in AWS with multiple independent Solr clusters, each with its own 
> Zookeeper tier
>   * each cluster hosts only a single collection
>   * each machine/node in the cluster has a single core / is a replica for one 
> shard in the collection
>
> We bring up new clusters as needed.  This is entirely automated and basically 
> works as follows:
>   * we first provision and set up a fresh Zookeeper tier
>   * then, we provision a Solr bootstrapper machine that uploads collection 
> config, specifies numShards and starts up
>   * it's then easy to provision the rest of the machines and have them 
> automatically join a shard in the collection by hooking them to the right 
> Zookeeper cluster and specifying numShards
>   * if a node needs to be added to the cluster we just need to spin a machine 
> up and start up Solr
>
> The desired final state of such a deployment is a fully configured cluster 
> ready to accept updates.
>
> Now that legacyCloud is false I'm not sure how to preserve this pretty nice, 
> hands-off deployment style as the bootstrapping performed by the first node 
> provisioned doesn't create a collection and adding new nodes requires 
> explicit configuration.
>
> A new deployment procedure that I've worked out using the Collections API 
> would look like:
>   * provision Zookeeper tier
>   * provision all the Solr nodes, wait for them all to come up
>   * upload collection config + solr.xml to Zookeeper
>   * create collection using Collections API
>   * if a node needs to be added: provision a machine, start up Solr, use 
> ADDREPLICA from Collections API passing shard number and coreNodeName
>
> This isn’t a giant deal to build but it adds complexity that I'm not excited 
> about as deployment tooling needs to have some understanding of what the 
> global state of the cluster is before being able to create a collection or 
> when adding/replacing nodes.
>
> The questions I was hoping someone would have some time to help me with are:
>
> * Does the new deployment procedure I've suggested seem reasonable?  Would we 
> be doing anything wrong/fighting best practices?
>   * Is there a way to keep cluster provisioning automated without having to 
> build additional orchestration logic into our deployment tooling (using 
> autoscaling, or triggers, or something I don’t know about)?
>
> Apologies for the wall of text and thanks. :)
>
> Marko
>
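The ADDREPLICA call Erick describes — collection plus shard, optionally a node taken from live_nodes — can be sketched as a request URL. Host, port and names below are placeholders of mine, not values from the thread:

```python
from urllib.parse import urlencode

def addreplica_url(base, collection, shard, node=None):
    """Build a Collections API ADDREPLICA URL; omitting node lets Solr
    choose where to place the replica."""
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard}
    if node:
        # e.g. "192.168.1.32:8983_solr", an entry from live_nodes
        params["node"] = node
    return base + "/admin/collections?" + urlencode(params)
```

Leaving `node` unset matches Erick's suggestion of letting Solr pick a node without replicas.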


How to Efficiently Extract Learning to Rank Similarity Features From Solr?

2017-10-23 Thread Michael Alcorn
Hi,

I'm trying to extract several similarity measures from Solr for use in a
learning to rank model. Doing this mathematically involves taking the dot
product of several different matrices, which is extremely fast for non-huge
data sets (e.g., millions of documents and queries). However, to extract
these similarity features from Solr, I have to perform a Solr query for
each query, which introduces several bottlenecks. Are there more efficient
means of computing these similarity measures for large numbers of queries
(other than increased parallelism)?

Thanks,
Michael A. Alcorn
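The batched computation Michael describes — all query-document similarities via one matrix product instead of one Solr round trip per query — in a dependency-free sketch (a real implementation would use a BLAS-backed library for speed; vectors here are hypothetical term-weight rows):

```python
def matmul(A, B):
    """A is q x d (query vectors), B is d x n (document vectors as columns);
    the result is the q x n matrix of dot-product similarities."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]
```

One such product replaces q separate per-query similarity requests, which is the bottleneck being asked about.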


Replacing legacyCloud Behaviour in Solr7

2017-10-23 Thread Marko Babic
Hi everyone,

I'm working on upgrading a set of clusters from Solr 4.10.4 to Solr 7.1.0.

Our deployment tooling no longer works given that legacyCloud defaults to false 
(SOLR-8256) and I'm hoping to get some advice on what to do going forward.

Our setup is as follows:
  * we run in AWS with multiple independent Solr clusters, each with its own 
Zookeeper tier
  * each cluster hosts only a single collection
  * each machine/node in the cluster has a single core / is a replica for one 
shard in the collection

We bring up new clusters as needed.  This is entirely automated and basically 
works as follows:
  * we first provision and set up a fresh Zookeeper tier
  * then, we provision a Solr bootstrapper machine that uploads collection 
config, specifies numShards and starts up
  * it's then easy to provision the rest of the machines and have them 
automatically join a shard in the collection by hooking them to the right 
Zookeeper cluster and specifying numShards
  * if a node needs to be added to the cluster we just need to spin a machine 
up and start up Solr

The desired final state of such a deployment is a fully configured cluster 
ready to accept updates.

Now that legacyCloud is false I'm not sure how to preserve this pretty nice, 
hands-off deployment style as the bootstrapping performed by the first node 
provisioned doesn't create a collection and adding new nodes requires explicit 
configuration.

A new deployment procedure that I've worked out using the Collections API would 
look like:
  * provision Zookeeper tier
  * provision all the Solr nodes, wait for them all to come up
  * upload collection config + solr.xml to Zookeeper
  * create collection using Collections API
  * if a node needs to be added: provision a machine, start up Solr, use 
ADDREPLICA from Collections API passing shard number and coreNodeName

This isn’t a giant deal to build but it adds complexity that I'm not excited 
about as deployment tooling needs to have some understanding of what the global 
state of the cluster is before being able to create a collection or when 
adding/replacing nodes.

The questions I was hoping someone would have some time to help me with are:

* Does the new deployment procedure I've suggested seem reasonable?  Would we 
be doing anything wrong/fighting best practices?
  * Is there a way to keep cluster provisioning automated without having to 
build additional orchestration logic into our deployment tooling (using 
autoscaling, or triggers, or something I don’t know about)?

Apologies for the wall of text and thanks. :)

Marko



Re: Solr boosting multiple fields using edismax parser.

2017-10-23 Thread ruby
Thanks for your reply.

Can the recip function be used to boost a numeric field here:

recip(ord(rating),100,1,1) 
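For reference, recip(x, m, a, b) evaluates to a / (m*x + b), so it decreases as x grows; applied to ord(rating) it would therefore favour lower ordinals, which may be the opposite of a "boost high ratings" intent (boosting on the field value itself is the more usual route — my reading, worth verifying). A quick sketch of the decay, with the constants from the recency boosts in this thread:

```python
def recip(x, m, a, b):
    # Solr's recip function query: a smooth decay from a/b at x = 0
    return a / (m * x + b)

# With m = 3.16e-11 per millisecond and x = ms(NOW, mod_date), the boost
# starts at 1.0 for a just-modified doc and roughly halves after one year
# (about 3.16e10 ms) -- the standard recency-boost configuration.
```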










--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr boosting multiple fields using edismax parser.

2017-10-23 Thread Aravind Durvasula
You can pass additional bq params in the query.

~Aravind

On Oct 23, 2017 4:10 PM, "ruby"  wrote:

> If I want to boost multiple fields using Edismax query parser, is following
> the correct way of doing it:
>
> 
> 
> edismax
> field1:(apple)^500
> field1:(orange)^400
> field1:(pear)^300
> field2:(4)^500
> field2:(2)^100
> recip(ms(NOW,mod_date),3.16e-11,1,1)
> recip(ms(NOW,creation_date),3.16e-11,1,1)
>
> And if boost is configured in solrconfig.xml, can I still pass additional
> boost queries through boost query?
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Solr boosting multiple fields using edismax parser.

2017-10-23 Thread ruby
If I want to boost multiple fields using Edismax query parser, is following
the correct way of doing it:



edismax
field1:(apple)^500  
field1:(orange)^400  
field1:(pear)^300 
field2:(4)^500  
field2:(2)^100 
recip(ms(NOW,mod_date),3.16e-11,1,1)
recip(ms(NOW,creation_date),3.16e-11,1,1)

And if boost is configured in solrconfig.xml, can I still pass additional
boost queries through boost query?



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Re: Really slow facet performance in 6.6

2017-10-23 Thread Toke Eskildsen
John Davis  wrote:
> We are seeing really slow facet performance with new solr release.
> This is on an index of 2M documents.

I am currently running some performance experiments on simple String faceting, 
comparing Solr 4 & 6. There is definitely a performance difference, but it is 
not trivial to pinpoint where it is. My first thought was that it was tied to 
the Solr version, with Solr 6 being markedly slower than Solr 4. However, 
looking at segment count, I can see that Solr 6 has twice as many segments as 
Solr 4 for my test setup. I tried optimizing down to 10 segments, which flipped 
the result: Suddenly Solr 6 was faster than Solr 4.

I'm still poking at this, but I guess my takeaway for now is to be sure to 
compare on fair terms. The strategy for creating segments can be tweaked and 
(guessing a lot here) it seems that the Solr 6 defaults lean towards faster 
indexing (by having more small segments) at the cost of faceting performance.

These JIRAs seems relevant:
https://issues.apache.org/jira/browse/SOLR-8096
https://issues.apache.org/jira/browse/SOLR-9599

> 1. method=uif however that didn't help much (the facet fields have
> docValues=false since they are multi-valued). Debug info below.

docValues works fine with multi-values (at least for Strings).

- Toke Eskildsen


Re: Merging is not taking place with tiered merge policy

2017-10-23 Thread Erick Erickson
1> merging takes place up until the max segment size is reached (5G in
the default TieredMergePolicy).

2> there are a couple of options, again config changes for TieredMergePolicy
10
might help.

You could also try upping this (the default is 5G).
5000

Best,
Erick


On Mon, Oct 23, 2017 at 10:34 AM, chandrushanmugasundaram
 wrote:
> Thanks Erick.
>
> (Beginner in Solr.) A few questions:
>
> 1. Does merging take place only when we have deleted docs?
> When my segments reach a count of 35+, the search gets slow. Only on
> performing a force merge on the index is the search efficient.
>
> 2. Is there any way we can reduce the number of segments in Solr
> automatically, without any cron job, by just altering some configuration in
> solrconfig.xml?
>
>
>
>
>
>
>
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Facets based on sampling

2017-10-23 Thread John Davis
Docvalues don't work for multivalued fields. I just started a separate
thread with more debug info. It is a bit surprising why facet computation
is so slow even when the query matches hundreds of docs.

On Mon, Oct 23, 2017 at 6:53 AM, alessandro.benedetti 
wrote:

> Hi John,
> first of all, I may state the obvious, but have you tried docValues ?
>
> Apart from that a friend of mine ( Diego Ceccarelli) was discussing a
> probabilistic implementation similar to the hyperloglog[1] to approximate
> facets counting.
> I didn't have time to take a look in details / implement anything yet.
> But it is on our To Do list :)
> He may add some info here.
>
> Cheers
>
>
>
>
> [1]
> https://blog.yld.io/2017/04/19/hyperloglog-a-probabilistic-data-structure/
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>
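The sampling idea floated in this thread (distinct from HyperLogLog, which approximates distinct-value counts) can be sketched as: facet over a random subset of the matched documents and scale the counts up, trading exactness for speed. Purely illustrative, not any existing Solr feature:

```python
import random

def sampled_facet_counts(docs, field, sample_size, seed=0):
    """Estimate facet counts from a uniform random sample of matched docs."""
    rnd = random.Random(seed)
    sample = docs if len(docs) <= sample_size else rnd.sample(docs, sample_size)
    scale = len(docs) / len(sample)   # scale sample counts back up
    counts = {}
    for doc in sample:
        for value in doc.get(field, []):  # multi-valued fields supported
            counts[value] = counts.get(value, 0) + 1
    return {v: round(c * scale) for v, c in counts.items()}
```

The estimate's error shrinks with sample size, so rare facet values are the weak spot of this approach.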


Re: Merging is not taking place with tiered merge policy

2017-10-23 Thread chandrushanmugasundaram
Thanks Erick.

(Beginner in Solr.) A few questions:

1. Does merging take place only when we have deleted docs?
When my segments reach a count of 35+, the search gets slow. Only on
performing a force merge on the index is the search efficient.

2. Is there any way we can reduce the number of segments in Solr
automatically, without any cron job, by just altering some configuration in
solrconfig.xml?










--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Merging is not taking place with tiered merge policy

2017-10-23 Thread chandrushanmugasundaram
Amrit,

Thanks for your reply. I have removed that 


1000
1
15
false
1024

  2
  2

hdfs

1
0








--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


segment merge in solr not happening

2017-10-23 Thread chandrushanmugasundaram
I find that the Lucene segments in the backend are not merging and the segment
count grows very large. I changed the merge policy from
LogByteSizeMergePolicy to TieredMergePolicy.

I tried altering properties according to the Solr documentation, but my
segment count is still high.

I am using solr 6.1.X. **The index data is stored in  HDFS.**

My index config of solrconfig.xml


1000
1
15
false
1024

  10
  1


  10
  10

hdfs

1
0



The only way we can optimize is by force merging, which is IO-costly and also
takes hours to complete.
I have a cluster of three shards with a replication factor of 2.

Can anyone help me with where I am going wrong?
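The trade-off behind segmentsPerTier can be shown with a toy model (my illustration only; Lucene's real TieredMergePolicy also weighs segment bytes, deletes, and maxMergedSegmentMB): segments of roughly equal size form a tier, and a tier merges once it exceeds segmentsPerTier members, so a larger value leaves more live segments in exchange for less merge work.

```python
def simulate(flushes, segments_per_tier):
    """Toy tiered merging: each flush adds a size-1 segment; any group of
    equal-size segments larger than segments_per_tier merges into one."""
    segs = []
    for _ in range(flushes):
        segs.append(1)                      # newly flushed segment
        while True:
            by_size = {}
            for s in segs:
                by_size.setdefault(s, []).append(s)
            tier = next((v for v in by_size.values()
                         if len(v) > segments_per_tier), None)
            if tier is None:
                break
            for s in tier:                  # merge the oversized tier
                segs.remove(s)
            segs.append(sum(tier))
    return len(segs)                        # live segment count at the end
```

In this model, 100 flushes leave far fewer segments with segments_per_tier=1 than with 10, mirroring the high segment counts reported above.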



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Really slow facet performance in 6.6

2017-10-23 Thread John Davis
Hello,

We are seeing really slow facet performance with the new Solr release. This is
on an index of 2M documents. A few things we've tried:

1. method=uif; however, that didn't help much (the facet fields have
docValues=false since they are multi-valued). Debug info below.

2. Changing the query (q=) that selects which documents to compute facets on
didn't help a lot, except that repeating the same query was fast, presumably
due to exact cache hits.

Sample debug info:

"timing": {
    "prepare": {
        "debug": {"time": 0.0},
        "expand": {"time": 0.0},
        "facet": {"time": 0.0},
        "facet_module": {"time": 0.0},
        "highlight": {"time": 0.0},
        "mlt": {"time": 0.0},
        "query": {"time": 0.0},
        "stats": {"time": 0.0},
        "terms": {"time": 0.0},
        "time": 0.0
    },
    "process": {
        "debug": {"time": 87.0},
        "expand": {"time": 0.0},
        "facet": {"time": 9814.0},
        "facet_module": {"time": 0.0},
        "highlight": {"time": 0.0},
        "mlt": {"time": 0.0},
        "query": {"time": 20.0},
        "stats": {"time": 0.0},
        "terms": {"time": 0.0},
        "time": 9922.0
    },
    "time": 9923.0
}
},

"facet-debug": {
"elapse": 8310,
"sub-facet": [
{
"action": "field facet",
"elapse": 8310,
"maxThreads": 2,
"processor": "SimpleFacets",
"sub-facet": [
{},
{
"appliedMethod": "UIF",
"field": "school",
"inputDocSetSize": 476,
"requestedMethod": "UIF"
},
{
"appliedMethod": "UIF",
"elapse": 2575,
"field": "work",
"inputDocSetSize": 476,
"requestedMethod": "UIF"
},
{
"appliedMethod": "UIF",
"elapse": 8310,
"field": "level",
"inputDocSetSize": 476,
"requestedMethod": "UIF"
}
]
}

Thanks
John


Re: Merging is not taking place with tiered merge policy

2017-10-23 Thread Erick Erickson
And please define what you mean by "merging is not working". One
parameter is the max segment size, which defaults to 5G. Segments
at or near that size are not eligible for merging unless they have
around 50% deleted docs.

Best,
Erick

On Mon, Oct 23, 2017 at 3:11 AM, Amrit Sarkar  wrote:
> Chandru,
>
> I didn't try the above config, but why have you defined both "mergePolicy"
> and "mergePolicyFactory", and passed different values for the same parameters?
>
>
> 
>>   10
>>   1
>> 
>> 
>>   10
>>   10
>> 
>>
>
>
> Amrit Sarkar
> Search Engineer
> Lucidworks, Inc.
> 415-589-9269
> www.lucidworks.com
> Twitter http://twitter.com/lucidworks
> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>
> On Mon, Oct 23, 2017 at 11:00 AM, Chandru Shanmugasundaram <
> chandru.shanmugasunda...@exterro.com> wrote:
>
>> The following is my solrconfig.xml
>>
>> 
>> 1000
>> 1
>> 15
>> false
>> 1024
>> 
>>   10
>>   1
>> 
>> 
>>   10
>>   10
>> 
>> hdfs
>> 
>> 1
>> 0
>> 
>>   
>>
>> Please let me know if should I tweak something above
>>
>>
>> --
>> Thanks,
>> Chandru.S
>>


Re: solr core replication

2017-10-23 Thread Erick Erickson
Great, thanks for bringing closure to this!

oh, and one addendum. I wrote:

It'll probably be around forever since replication is used as a fall-back

Forget the "probably" there. In 7.x there are new replica types that
use this as their way of distributing the index, see the PULL replica
type. So forget the "probably" in that statement ;)

Best,
Erick

On Mon, Oct 23, 2017 at 6:45 AM, Hendrik Haddorp
 wrote:
> Hi Erick,
>
> sorry for the slow reply. You are right, the information is not persisted.
> Once I do a restart there is no information about the replication source
> anymore. That explains why I could not find it anywhere persisted ;-) I
> thought I had tested that last week but must have not done so as it worked
> just fine now.
>
> thanks,
> Hendrik
>
> On 20.10.2017 16:39, Erick Erickson wrote:
>>
>> Does that persist even after you restart Solr on the target cluster?
>>
>> And that clears up one bit of confusion I had, I didn't know how you
>> were having each shard on the target cluster use a different master URL
>> given they all use the same solrconfig file. I was guessing some magic
>> with
>> system variables, but it turns out you were way ahead of me and
>> not configuring the replication in solrconfig at all.
>>
>> But no, I know of no API level command that works to do what you're
>> asking.
>> I also don't know where that data is persisted, I'm afraid you'll have to
>> go
>> code-diving for all the help I can be
>>
>> Using fetchindex this way in SolrCloud is something of an edge case. It'll
>> probably be around forever since replication is used as a fall-back when
>> a replica syncs, but there'll be some bits like this hanging around I'd
>> guess.
>>
>> Best,
>> Erick
>>
>> On Thu, Oct 19, 2017 at 11:55 PM, Hendrik Haddorp
>>  wrote:
>>>
>>> Hi Erick,
>>>
>>> that is actually the call I'm using :-)
>>> If you invoke
>>> http://solr_target_machine:port/solr/core/replication?command=details
>>> after
>>> that you can see the replication status. But even after a Solr restart
>>> the
>>> call still shows the replication relation and I would like to remove this
>>> so
>>> that the core looks "normal" again.
>>>
>>> regards,
>>> Hendrik
>>>
>>> On 20.10.2017 02:31, Erick Erickson wrote:

 Little known trick:

 The fetchIndex replication API call can take any parameter you specify
 in your config. So you don't have to configure replication at all on
 your target collection, just issue the replication API command with
 masterUrl, something like:



 http://solr_target_machine:port/solr/core/replication?command=fetchindex&masterUrl=http://solr_source_machine:port/solr/core

 NOTE, "core" above will be something like
 collection1_shard1_replica1

 During the fetchindex, you won't be able to search on the target
 collection although the source will be searchable.

 Now, all that said this is just copying stuff. So let's say you've
 indexed to your source cluster and set up your target cluster (but
 don't index anything to the target or do the replication etc). Now if
 you shut down the target cluster and just copy the entire data dir
 from each source replica to each target replica then start all the
 target Solr instances up you'll be fine.

 Best,
 Erick

 On Thu, Oct 19, 2017 at 1:33 PM, Hendrik Haddorp
  wrote:
>
> Hi,
>
> I want to transfer a Solr collection from one SolrCloud to another one.
> For
> that I create a collection in the target cloud using the same config
> set
> as
> on the source cloud but with a replication factor of one. After that
> I'm
> using the Solr core API with a "replication?command=fetchindex" command
> to
> transfer the data. In the last step I'm increasing the replication
> factor.
> This seems to work fine so far. When I invoke
> "replication?command=details"
> I can see my replication setup and check if the replication is done. In
> the
> end I would like to remove this relation again but there does not seem
> to
> be
> an API call for that. Given that the replication should be a one time
> replication according to the API on
> https://lucene.apache.org/solr/guide/6_6/index-replication.html this
> should
> not be a big problem. It just does not look clean to me to leave this
> in
> the
> system. Is there anything I'm missing?
>
> regards,
> Hendrik
>>>
>>>
>


Re: Goal: reverse chronological display Methods? (1) boost, and/or (2) disable idf

2017-10-23 Thread alessandro.benedetti
In addition: bf=recip(ms(NOW/DAY,unixdate),3.16e-11,5,0.1) is an additive
boost.
I tend to prefer multiplicative ones but that is up to you [1].

You can specify the order of magnitude of the values generated by that
function.
This means that you have control of how much the date will affect the score.
If you decide to go additive be careful with the order of magnitude of the
scores :

Your relevancy score magnitude will vary depending on the query and the
index, while your additive boost is going to be constant.
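To get a feel for the order of magnitude, the recip curve can be evaluated offline; a quick sketch (recip(x,m,a,b) = a/(m*x + b), per the Solr function query docs):

```python
def recip(x, m, a, b):
    # Solr's recip() function query: a / (m * x + b)
    return a / (m * x + b)

MS_PER_YEAR = 365 * 24 * 3600 * 1000  # milliseconds per year

# bf=recip(ms(NOW/DAY,unixdate),3.16e-11,5,0.1): x is the document age in ms
for years in (0, 1, 5, 10):
    boost = recip(years * MS_PER_YEAR, 3.16e-11, 5, 0.1)
    print("%2d years old -> +%.2f added to the score" % (years, boost))
```

A brand-new document gets roughly 50 added, a ten-year-old one roughly 0.5; whether that dominates depends entirely on the magnitude of the raw relevancy scores, which is the point above.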

Regards


[1] https://nolanlawson.com/2012/06/02/comparing-boost-methods-in-solr/



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: LTR feature extraction performance issues

2017-10-23 Thread alessandro.benedetti
It strictly depends on the kind of features you are using.
At the moment there is just one cache for all the features.
This means that even if you have 1 query-dependent feature and 100 document-
dependent features, a different value for the query-dependent one will
invalidate the cache entry for the full vector [1].

You may look to optimise your features ( where possible).

[1]  https://issues.apache.org/jira/browse/SOLR-10448



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Facets based on sampling

2017-10-23 Thread alessandro.benedetti
Hi John, 
first of all, I may state the obvious, but have you tried docValues?

Apart from that, a friend of mine (Diego Ceccarelli) was discussing a
probabilistic approach similar to HyperLogLog [1] to approximate facet
counting.
I didn't have time to take a look in details / implement anything yet.
But it is on our To Do list :)
He may add some info here.

Cheers




[1]
https://blog.yld.io/2017/04/19/hyperloglog-a-probabilistic-data-structure/
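To give a feel for the idea, here is a toy HyperLogLog sketch (illustrative only, not the implementation Diego has in mind):

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog: m = 2^p registers, each keeping the max
    first-one-bit rank seen among hashes routed to that register."""

    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash of the item
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                      # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)         # remaining 64-p bits
        rank = (64 - self.p) - rest.bit_length() + 1  # 1-based position of first 1-bit
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)         # bias correction for m >= 128
        z = 1.0 / sum(2.0 ** -r for r in self.registers)
        estimate = alpha * self.m * self.m * z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:        # small-range (linear counting) fix
            estimate = self.m * math.log(self.m / zeros)
        return int(estimate)


if __name__ == "__main__":
    hll = HyperLogLog(p=10)
    for i in range(10000):
        hll.add("doc-%d" % i)
    print(hll.count())  # close to 10000, within a few percent
```

With 2^10 registers the memory cost is fixed at about 1 KB per counted set, which is what makes this attractive for approximate facet counts.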



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


RE: LTR feature cache performance issues

2017-10-23 Thread Brian Yee
Has anyone had experience tuning feature caches? Do any of the values below 
look unreasonable?



--Brian

-Original Message-
From: Brian Yee [mailto:b...@wayfair.com] 
Sent: Friday, October 20, 2017 1:41 PM
To: solr-user@lucene.apache.org
Subject: LTR feature extraction performance issues

I enabled LTR feature extraction and response times spiked. I suppose that was 
to be expected, but are there any tips regarding performance? I have the 
feature values cache set up as described in the docs:



Do I simply have to wait for the cache to fill up and hope that response times 
go down? Should I make these cache values bigger?
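For reference, the feature-vector cache registration the Ref Guide shows for LTR looks like the following; these are the guide's example values, not necessarily the ones used here, since the XML was stripped from this message:

```xml
<!-- solrconfig.xml: LTR feature-vector cache (Ref Guide example values) -->
<cache name="QUERY_DOC_FV"
       class="solr.search.LRUCache"
       size="4096"
       initialSize="2048"
       autowarmCount="4096"
       regenerator="solr.search.NoOpRegenerator" />
```

Raising size only helps if the same query/document feature vectors actually repeat between commits, so it is worth checking the cache hit ratio before tuning.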


  *   Brian


Re: solr core replication

2017-10-23 Thread Hendrik Haddorp

Hi Erick,

sorry for the slow reply. You are right, the information is not 
persisted. Once I do a restart there is no information about the 
replication source anymore. That explains why I could not find it 
anywhere persisted ;-) I thought I had tested that last week but must 
have not done so as it worked just fine now.


thanks,
Hendrik

On 20.10.2017 16:39, Erick Erickson wrote:

Does that persist even after you restart Solr on the target cluster?

And that clears up one bit of confusion I had, I didn't know how you
were having each shard on the target cluster use a different master URL
given they all use the same solrconfig file. I was guessing some magic with
system variables, but it turns out you were way ahead of me and
not configuring the replication in solrconfig at all.

But no, I know of no API level command that works to do what you're asking.
I also don't know where that data is persisted, I'm afraid you'll have to go
code-diving for all the help I can be.

Using fetchindex this way in SolrCloud is something of an edge case. It'll
probably be around forever since replication is used as a fall-back when
a replica syncs, but there'll be some bits like this hanging around I'd guess.

Best,
Erick

On Thu, Oct 19, 2017 at 11:55 PM, Hendrik Haddorp
 wrote:

Hi Erick,

that is actually the call I'm using :-)
If you invoke
http://solr_target_machine:port/solr/core/replication?command=details after
that you can see the replication status. But even after a Solr restart the
call still shows the replication relation and I would like to remove this so
that the core looks "normal" again.

regards,
Hendrik

On 20.10.2017 02:31, Erick Erickson wrote:

Little known trick:

The fetchIndex replication API call can take any parameter you specify
in your config. So you don't have to configure replication at all on
your target collection, just issue the replication API command with
masterUrl, something like:


http://solr_target_machine:port/solr/core/replication?command=fetchindex&masterUrl=http://solr_source_machine:port/solr/core

NOTE, "core" above will be something like collection1_shard1_replica1

During the fetchindex, you won't be able to search on the target
collection although the source will be searchable.

Now, all that said this is just copying stuff. So let's say you've
indexed to your source cluster and set up your target cluster (but
don't index anything to the target or do the replication etc). Now if
you shut down the target cluster and just copy the entire data dir
from each source replica to each target replica then start all the
target Solr instances up you'll be fine.

Best,
Erick

On Thu, Oct 19, 2017 at 1:33 PM, Hendrik Haddorp
 wrote:

Hi,

I want to transfer a Solr collection from one SolrCloud to another one.
For
that I create a collection in the target cloud using the same config set
as
on the source cloud but with a replication factor of one. After that I'm
using the Solr core API with a "replication?command=fetchindex" command
to
transfer the data. In the last step I'm increasing the replication
factor.
This seems to work fine so far. When I invoke
"replication?command=details"
I can see my replication setup and check if the replication is done. In
the
end I would like to remove this relation again but there does not seem to
be
an API call for that. Given that the replication should be a one time
replication according to the API on
https://lucene.apache.org/solr/guide/6_6/index-replication.html this
should
not be a big problem. It just does not look clean to me to leave this in
the
system. Is there anything I'm missing?

regards,
Hendrik






Re: Solr nodes going into recovery mode and eventually failing

2017-10-23 Thread Emir Arnautović
You mentioned that you are on v6.6, but in case someone else reads this, just 
to add that maxRamMB was added to FastLRUCache in version 6.4.

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 23 Oct 2017, at 14:35, Zisis T.  wrote:
> 
> shamik wrote
>> I was not aware of maxRamMB parameter, looks like it's only available for
>> queryResultCache. Is that what you are referring to? Can you please share
>> your cache configuration?
> 
> I've setup filterCache entry inside solrconfig.xml as follows
> 
> * autowarmCount="0" maxRamMB="120"/>*
> 
> I had a look inside FastLRUCache code and saw that maxRamMB has precedence
> over the size. I can also confirm that I had more than 512 entries inside
> the cache, so the above will work. 
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Sort by field from another collection

2017-10-23 Thread Dmitry Gerasimov
Hi!

I have one main collection of people and a few more collections with
additional data. All search queries are on the main collection with
joins to one or more additional collections. A simple example would
be:

(*:* {!join from=people_person_id to=people_person_id
fromIndex=fundraising_donor_info v='total_donations_1y: [1000 TO
2000]'})


I need to sort results by fields from additional collections (e.g.
"total_donations_1y”) . Is there any way to do that through the common
query parameters? Or the only way is using streaming expressions?
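As far as I know the common sort parameter cannot reference a field from the fromIndex collection, so streaming expressions are the usual route. A sketch of a join-then-sort using the names from the example above (untested here; syntax per the streaming expression docs, and both searches need sortable/exportable fields):

```
sort(
  innerJoin(
    search(people, q="*:*", fl="people_person_id",
           sort="people_person_id asc", qt="/export"),
    search(fundraising_donor_info, q="total_donations_1y:[1000 TO 2000]",
           fl="people_person_id,total_donations_1y",
           sort="people_person_id asc", qt="/export"),
    on="people_person_id"
  ),
  by="total_donations_1y desc"
)
```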

Dmitry


RE: Certificate issue ERR_SSL_VERSION_OR_CIPHER_MISMATCH

2017-10-23 Thread Younge, Kent A - Norman, OK - Contractor

I was able to resolve the issue.  I was adding the certificate, and I had also 
combined my certificate and private key, so when I imported both the standalone 
certificate and the combined certificate-plus-private-key it was breaking.  I 
removed the standalone certificate and that resolved the issue.  With just the 
root certificates and the certificate plus private key, everything started 
working correctly.





Thank you,

Kent Younge
Systems Engineer
USPS MTSC IT Support
600 W. Rock Creek Rd, Norman, OK  73069-8357
O:405 573 2273


-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Friday, October 20, 2017 4:33 PM
To: solr-user@lucene.apache.org
Subject: Re: Certificate issue ERR_SSL_VERSION_OR_CIPHER_MISMATCH

On 10/19/2017 6:30 AM, Younge, Kent A - Norman, OK - Contractor wrote:
> Built a clean Solr server imported my certificates and when I go to the 
> SSL/HTTPS page it tells me that I have ERR_SSL_VERSION_OR_CIPHER_MISMATCH in 
> Chrome and in IE tells me that I need to TURN ON TLS 1.0, TLS 1.1, and TLS 
> 1.2.

What java version?  What Java vendor?  What operating system?  The OS won't 
have a lot of impact on HTTPS, I just ask in case other information is desired, 
so we can tailor the information requests.

I see other messages where you mention Solr 6.6, which requires Java 8.

As Hoss mentioned to you in another thread, *all* of the SSL capability is 
provided by Java.  The Jetty that ships with Solr includes a config for HTTPS.  
The included Jetty config *excludes* a handful of low-quality ciphers that your 
browser probably already refuses to use, but that's the only cipher-specific 
configuration.  If you haven't changed the Jetty config in the Solr download, 
then Jetty defaults and your local Java settings will control everything else.  
As far as I am aware, Solr doesn't influence the SSL config at all.

  <Set name="ExcludeCipherSuites">
    <Array type="String">
      <Item>SSL_RSA_WITH_DES_CBC_SHA</Item>
      <Item>SSL_DHE_RSA_WITH_DES_CBC_SHA</Item>
      <Item>SSL_DHE_DSS_WITH_DES_CBC_SHA</Item>
      <Item>SSL_RSA_EXPORT_WITH_RC4_40_MD5</Item>
      <Item>SSL_RSA_EXPORT_WITH_DES40_CBC_SHA</Item>
      <Item>SSL_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA</Item>
      <Item>SSL_DHE_DSS_EXPORT_WITH_DES40_CBC_SHA</Item>
    </Array>
  </Set>

It is extremely unlikely that Solr itself is causing these problems.  It is 
more likely that there's something about your environment (java version, custom 
java config, custom Jetty config, browser customization, or maybe something 
else) that is resulting in a protocol and cipher list that your browser doesn't 
like.

Thanks,
Shawn



Re: Solr nodes going into recovery mode and eventually failing

2017-10-23 Thread Zisis T.
shamik wrote
> I was not aware of maxRamMB parameter, looks like it's only available for
> queryResultCache. Is that what you are referring to? Can you please share
> your cache configuration?

I've setup filterCache entry inside solrconfig.xml as follows

**

I had a look inside FastLRUCache code and saw that maxRamMB has precedence
over the size. I can also confirm that I had more than 512 entries inside
the cache, so the above will work. 
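(The tag above was stripped by the mailing list; reconstructed from the attributes quoted elsewhere in the thread, it was presumably something like the following, with the exact values being a guess:)

```xml
<!-- reconstruction; exact values are a guess based on the thread -->
<filterCache class="solr.FastLRUCache" size="512"
             autowarmCount="0" maxRamMB="120"/>
```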



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: 3 color jvm memory usage bar

2017-10-23 Thread Toke Eskildsen
On Thu, 2017-10-19 at 08:56 -0700, Nawab Zada Asad Iqbal wrote:
> I see three colors in the JVM usage bar. Dark Gray, light Gray,
> white. (left to right).  Only one dark and one light color made sense
> to me (as i could interpret them as used vs available memory), but
> there is light gray between dark gray and white parts.

The light grey is the amount of memory reserved by the JVM. It is only
visible if you do not specify Xms, so many people do not have that.

Generally the dark grey (the amount of heap that is actively used to
hold data) will fluctuate a lot and I don't find it very usable for
observing and tweaking heap size. The GC-log is better.
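For reference, Solr's stock solr.in.sh enables GC logging with flags along these lines on Java 8 (paraphrased from the shipped script, so check your own version):

```shell
# Java 8 style GC logging flags as in solr.in.sh (Java 9+ uses unified -Xlog:gc*)
GC_LOG_OPTS="-verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution \
  -XX:+PrintGCApplicationStoppedTime"
echo "$GC_LOG_OPTS"
```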

- Toke Eskildsen, Royal Danish Library



Re: Solr nodes going into recovery mode and eventually failing

2017-10-23 Thread Emir Arnautović
Hi Shamik,
I agree that your filter cache is not the reason for OOMs. Can you confirm that 
your fieldCache and fieldValueCache sizes are not consuming too much memory. 
The next on the list would be some heavy faceting with pivots, but you 
mentioned that all fields are low cardinality. Do you see any extremely slow 
queries in your logs? Can you check if there are some deep paging queries?
If nothing else, you can always do heap dump and see what’s in it.

And about your filterCache hit ratio: how frequently do you commit? With 400 
rq/h it can be that filters are not repeated between two commits. Do you have 
high eviction rate?

Thanks,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 20 Oct 2017, at 20:10, shamik  wrote:
> 
> Zisis, thanks for chiming in. This is really interesting information and
> probably in line with what I'm trying to fix. In my case, the facet fields are
> certainly not high cardinal ones. Most of them have a finite set of data,
> the max being 200 (though it has a low usage percentage). Earlier I had
> facet.limit=-1, but then scaled down to 200 to eliminate any performance
> overhead.
> 
> I was not aware of maxRamMB parameter, looks like it's only available for
> queryResultCache. Is that what you are referring to? Can you please share
> your cache configuration? 
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Problem JOIN Solr

2017-10-23 Thread guido.lau...@uksh.de
Hi,

I have a problem with the JOIN function (a query across two collections)

I have two collections

"ColAAA" and "ColBBB"

ColAAA => field ABC fieldtype "text_general" or "string"
ColBBB => fields XYZ and DEF fieldtype "string"

Example of field "ABC" ->"SomeWord 250kg"
With the JOIN function I want to run a subquery on field "DEF" in collection 
ColBBB and use the resulting values of XYZ as a query on field "ABC" of collection ColAAA

Query in collection ColAAA
{!join from=XYZ to=ABC fromIndex=ColBBB}DEF:Something


What works:
When the value of XYZ in ColBBB is "SomeWord" (without the 250kg), the join finds a 
match in ColAAA field ABC "SomeWord 250kg"

What does not work:
When the value of XYZ in ColBBB is "SomeWord 250kg", the join does not find a match in 
ColAAA field ABC "SomeWord 250kg"

What do I miss?

Greetings
Guido


Re: Merging is not taking place with tiered merge policy

2017-10-23 Thread Amrit Sarkar
Chandru,

I didn't try the above config, but why have you defined both "mergePolicy" and
"mergePolicyFactory", and passed different values for the same parameters?



>   10
>   1
> 
> 
>   10
>   10
> 
>
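Since Solr 6.5 the factory form is the supported one (if I recall correctly, the plain mergePolicy element is deprecated there and dropped in 7.0), so only a single definition along these lines should remain; the parameter names here are assumed, since the original tags were stripped from the message:

```xml
<!-- parameter names assumed; the stripped values above suggest 10 / 10 -->
<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
</mergePolicyFactory>
```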


Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2

On Mon, Oct 23, 2017 at 11:00 AM, Chandru Shanmugasundaram <
chandru.shanmugasunda...@exterro.com> wrote:

> The following is my solrconfig.xml
>
> 
> 1000
> 1
> 15
> false
> 1024
> 
>   10
>   1
> 
> 
>   10
>   10
> 
> hdfs
> 
> 1
> 0
> 
>   
>
> Please let me know if I should tweak something above
>
>
> --
> Thanks,
> Chandru.S
>