Re: Solr 7.0.1 Duplicate document appearing in search results

2019-05-14 Thread Erick Erickson
This is indeed strange. First of all, forget about explanations that involve 
the transaction log etc. When Lucene opens a searcher, it is only over closed 
segments; the tlog has nothing to do with that.

Have you ever merged indexes? The MapReduceIndexerTool, if you ever used it, 
does not de-duplicate. Ditto if you ever changed the <uniqueKey>. The fact that 
you say that this clears up when you re-index the document leads me to wonder 
whether you have manipulated the index outside the normal Solr framework.

IOW, I’ve never seen this before, so I suspect there’s something you did in 
your setup that seemed innocent at the time that led to this (temporary) 
situation.
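
If you want to see how widespread this is, one cheap check is a facet on the 
uniqueKey field with facet.mincount=2 -- any id value it returns exists in more 
than one live document. A sketch using your field/core names (note this can be 
expensive on a 300-million-document index):

http://solr:8983/solr/filesearch/select?q=*:*&rows=0&facet=true&facet.field=id&facet.limit=-1&facet.mincount=2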

Best,
Erick

> On May 14, 2019, at 5:43 PM, Adam Walz  wrote:
> 
> In my solr schema I have set a uniqueKey of "id" where the id field is a
> solr.StrField. When querying with this field as a filter I would expect to
> always get 1 or 0 documents as a result. However I am getting back multiple
> documents with the same "id" field, but different internal `docid`s. This
> problem is intermittent and seems to resolve itself when the document is
> updated. This is happening on solr 7.0.1 without SolrCloud and while only
> querying a single shard without routing.
> 
> Any thoughts on what could be causing this behavior? This is a very large
> single shard with 300 million documents and an index size of 750GB. I know
> that is not recommended for a single shard, but could it explain these
> duplicate results possibly because of the time it takes to commit, merge,
> or something with tlogs?
> 
> -- Query --
> http://solr:8983/solr/filesearch/select?fl=id,[docid],score&fq=id:file_382506116&q=*:*
> 
> -- Response --
> 
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":0,
>     "params":{
>       "mm":" 1<-0% ",
>       "q.alt":"*:*",
>       "ps":"100",
>       "echoParams":"all",
>       "fl":"id,[docid],score",
>       "fq":"id:file_413041895994",
>       "sort":"score desc",
>       "rows":"35",
>       "version":"2.2",
>       "q":"*:*",
>       "tie":"0.01",
>       "defType":"edismax",
>       "qf":"id name_combined^10 name_zh-cn^10 name_shingle
>         name_shingle_zh-cn name_token^60 description file_content_en
>         file_content_fr file_content_de file_content_it file_content_es
>         file_content_zh-cn user_name user_email comments tags",
>       "pf":"description name_shingle^100 name_shingle_zh-cn^100 comments tags",
>       "wt":"json",
>       "debugQuery":"off"}},
>   "response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[
>     {
>       "id":"file_382506116",
>       "[docid]":346266675,
>       "score":1.0},
>     {
>       "id":"file_382506116",
>       "[docid]":170442733,
>       "score":1.0}]
>   }}
> 
> 
> -- Schema snippet --
> 
> <field name="id" type="string" indexed="true" stored="true" required="true"/>
> 
> <uniqueKey>id</uniqueKey>
> 
> -- 
> Adam Walz



Re: Seeking advice on SolrCloud production architecture with CDCR

2019-05-14 Thread Shawn Heisey

On 5/14/2019 4:55 PM, Cody Burleson wrote:

I’m worried, for example, about spreading the ZooKeeper cluster between the two 
data centers because of potential latency across the pond. Maybe we keep the ZK 
ensemble on one side of the pond only? I imagined, for instance, 2 ZK nodes on 
one server, and one on the other (in at least one data center). But maybe we 
need 5 ZKs, with 1 on each server in the other data center? Then how about the 
Solr nodes, shards, and replicas? If anybody has done some remotely similar 
setup for production purposes, I would be grateful for any tips (and downright 
giddy for a diagram).


If you're planning a geographically diverse ZooKeeper setup, you cannot 
do it with only two datacenters.  You need at least three.  This is 
inherent to the design of ZK and cannot be changed.  With two data 
centers, one of them must host a majority of the ZK nodes, and if that 
DC goes down, ZK loses quorum.  When ZK loses quorum, SolrCloud loses 
the ability to react to failures and goes read-only.
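
As a concrete sketch of the three-datacenter layout (hostnames are 
placeholders, not from your setup), each ZK node's zoo.cfg lists one server 
per DC, so losing any single DC still leaves a 2-of-3 majority:

server.1=zk-us.example.com:2888:3888
server.2=zk-eu.example.com:2888:3888
server.3=zk-tiebreaker.example.com:2888:3888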


You mentioned CDCR.  This involves two completely separate SolrCloud 
clusters -- a full ZK ensemble in each location.  So you would have 3 ZK 
servers and at least two Solr servers in one data center, and 3 ZK 
servers plus at least two Solr servers in the other data center.


Thanks,
Shawn


Seeking advice on SolrCloud production architecture with CDCR

2019-05-14 Thread Cody Burleson
Hi, all. We’re upgrading an old Solr 3.5 setup (master/slave replication) to 
SolrCloud (v7 or v8), along with the addition of a new data center (for dual 
data centers). I’ve done a lot of homework, but could still use some advice. 
While documentation explains ZooKeeper and SolrCloud pretty well, I don’t get a 
comfortable sense for how to lay everything out physically in the architecture.

At present, we have planned the same physical hardware as what we had for our 
master/slave setup (basically, 2 servers). Now, however, we’re going to 
duplicate that so that we also have the same in another data center: US and 
Europe. For this, Cross Data Center Replication (CDCR; bi-directional) seems 
appropriate, but I’m not confident. Also, for the best fault tolerance and 
high availability, I’m not really sure how to lay out my ZooKeeper nodes and my 
Solr instances/shards/replicas physically across the servers. I’d like to start 
with the simplest possible setup and scale up only if necessary. Our index size 
is relatively small, I guess: ~150,000 documents.

I’m worried, for example, about spreading the ZooKeeper cluster between the two 
data centers because of potential latency across the pond. Maybe we keep the ZK 
ensemble on one side of the pond only? I imagined, for instance, 2 ZK nodes on 
one server, and one on the other (in at least one data center). But maybe we 
need 5 ZKs, with 1 on each server in the other data center? Then how about the 
Solr nodes, shards, and replicas? If anybody has done some remotely similar 
setup for production purposes, I would be grateful for any tips (and downright 
giddy for a diagram).

I know I’m probably not even providing enough information to begin with, but 
perhaps someone will entertain a conversation?

Thanks, in advance, for sharing some of your valuable time and experience.

Cody


Solr 7.0.1 Duplicate document appearing in search results

2019-05-14 Thread Adam Walz
In my solr schema I have set a uniqueKey of "id" where the id field is a
solr.StrField. When querying with this field as a filter I would expect to
always get 1 or 0 documents as a result. However I am getting back multiple
documents with the same "id" field, but different internal `docid`s. This
problem is intermittent and seems to resolve itself when the document is
updated. This is happening on solr 7.0.1 without SolrCloud and while only
querying a single shard without routing.

Any thoughts on what could be causing this behavior? This is a very large
single shard with 300 million documents and an index size of 750GB. I know
that is not recommended for a single shard, but could it explain these
duplicate results possibly because of the time it takes to commit, merge,
or something with tlogs?

-- Query --
http://solr:8983/solr/filesearch/select?fl=id,[docid],score&fq=id:file_382506116&q=*:*

-- Response --

{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "mm":" 1<-0% ",
      "q.alt":"*:*",
      "ps":"100",
      "echoParams":"all",
      "fl":"id,[docid],score",
      "fq":"id:file_413041895994",
      "sort":"score desc",
      "rows":"35",
      "version":"2.2",
      "q":"*:*",
      "tie":"0.01",
      "defType":"edismax",
      "qf":"id name_combined^10 name_zh-cn^10 name_shingle
        name_shingle_zh-cn name_token^60 description file_content_en
        file_content_fr file_content_de file_content_it file_content_es
        file_content_zh-cn user_name user_email comments tags",
      "pf":"description name_shingle^100 name_shingle_zh-cn^100 comments tags",
      "wt":"json",
      "debugQuery":"off"}},
  "response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[
    {
      "id":"file_382506116",
      "[docid]":346266675,
      "score":1.0},
    {
      "id":"file_382506116",
      "[docid]":170442733,
      "score":1.0}]
  }}


-- Schema snippet --

<field name="id" type="string" indexed="true" stored="true" required="true"/>

<uniqueKey>id</uniqueKey>

-- 
Adam Walz


Solr 8.1 issue with collection aliases

2019-05-14 Thread Jörn Franke
Hi,

I tried to upgrade from 8.0 to 8.1. I noticed that there is an issue with
collection aliases, but I am not 100% sure it is due to the upgrade.

Situation:
I have a collection called c_testcollection.
I have an alias called testcollection.
Alias "testcollection" points to "c_testcollection".
On Solr 8.0 there was no issue.

After upgrade to Solr 8.1:
When I do a query on c_testcollection then there is no issue:
http://localhost:8983/solr/c_testcollection/select?q=test
When I do a query on testcollection then I receive the stacktrace below
http://localhost:8983/solr/testcollection/select?q=test

Additionally I observe a strange behavior in the admin UI. When I try to
create an alias (e.g. new) for a new collection (e.g. c_new) then it
creates two aliases:
new => c_new
c_new => c_new
If I then do a query on the alias new, it works without issues. If I remove
the alias from c_new to c_new then I get the same error. Is this desired
behaviour?
It is rather annoying to have unnecessary aliases, because I need to filter
them out in my application when retrieving all aliases.
Is there a related issue?
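
For reference, aliases can also be created, removed, and listed directly via
the Collections API -- a sketch using the names above:

http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=new&collections=c_new
http://localhost:8983/solr/admin/collections?action=DELETEALIAS&name=c_new
http://localhost:8983/solr/admin/collections?action=LISTALIASES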

Here the stacktrace:
{
  "error":{
"trace":"java.lang.NullPointerException\n\tat
java.base/java.util.AbstractCollection.addAll(AbstractCollection.java:351)\n\tat
org.apache.solr.common.cloud.Aliases.resolveAliasesGivenAliasMap(Aliases.java:258)\n\tat
org.apache.solr.common.cloud.Aliases.resolveAliases(Aliases.java:181)\n\tat
org.apache.solr.servlet.HttpSolrCall.resolveCollectionListOrAlias(HttpSolrCall.java:385)\n\tat
org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:273)\n\tat
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:486)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:397)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343)\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)\n\tat
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)\n\tat
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)\n\tat
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
org.eclipse.jetty.server.Server.handle(Server.java:502)\n\tat
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)\n\tat
org.eclipse.jetty.server.HttpChannel.run(HttpChannel.java:305)\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:132)\n\tat
org.eclipse.jetty.http2.HTTP2Connection.produce(HTTP2Connection.java:171)\n\tat
org.eclipse.jetty.http2.HTTP2Connection.onFillable(HTTP2Connection.java:126)\n\tat
org.eclipse.jetty.http2.HTTP2Connection$FillableCallback.succeeded(HTTP2Connection.java:338)\n\tat
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\n\tat
org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:411)\n\tat
org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:305)\n\tat
org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:159)\n\tat
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\n\tat
org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)\n\tat

Re: Solr node goes into recovery mode

2019-05-14 Thread Erick Erickson
I think it’s the wrong question to ask. By using docValues you’ll be able to 
significantly reduce the heap allocated to the Java process, reduce the 
overhead of garbage collection, reduce the possibility of nodes going into 
recovery, and increase stability generally. Compared to those gains, 
performance improvement is a secondary concern.

Best,
Erick 

> On May 14, 2019, at 11:28 AM, Maulin Rathod  wrote:
> 
> Thanks Erick,
> 
> I understand using docValues should improve query performance. Please correct 
> me if my understanding is incorrect.
> 
> Regards,
> 
> Maulin
> 
> 
> 
> On May 14, 2019 19:11, Erick Erickson  wrote:
> Use docValues on all fields you group, facet or sort on.
> 
> NOTE: you _must_ re-index from scratch, I’d index to a new collection and 
> start over. Paradoxically your index size _on disk_ will increase, but your 
> JVM will need drastically less heap. See: 
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> 
> Best,
> Erick
> 
>> On May 14, 2019, at 1:11 AM, Maulin Rathod  wrote:
>> 
>> Thanks for reply.
>> 
>> Our solr node normally uses 30-45 gb and hence we allocated a 60 gb heap.  
>> We analyzed a heap dump and found that around 85% of the heap was used by 
>> org.apache.solr.uninverting.FieldCacheImpl.
>> 
>> One instance of
>> "org.apache.solr.uninverting.FieldCacheImpl" loaded by 
>> "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x48fe5e9b0" occupies 
>> 19,72,04,15,160 (86.28%) bytes. The memory is accumulated in one instance of 
>> "java.util.HashMap$Node[]" loaded by "".
>> 
>> 
>> Please note we are not using any Solr cache, as in our scenario new documents 
>> are added to the index quite fast (at least 10 documents are added to the 
>> index every second) and we need to reopen the searcher to make these new 
>> documents available.
>> 
>> We are not using docValues. As per our understanding, using docValues should 
>> improve query performance and should also reduce memory requirements, as we 
>> are using lots of sorting/faceting in our queries. Please let me know your 
>> thoughts on it. Please also suggest if there are any other ways to reduce the 
>> memory requirement/optimize the performance.
>> 
>> 
>> Regards,
>> 
>> Maulin
>> 
>> -Original Message-
>> From: Shawn Heisey 
>> Sent: 14 May 2019 01:04
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr node goes into recovery mode
>> 
>> On 5/13/2019 8:26 AM, Maulin Rathod wrote:
>>> Recently we are observing issue where solr node (any random node) 
>>> automatically goes into recovery mode and stops responding.
>> 
>> Do you KNOW that these Solr instances actually need a 60GB heap?  That's a 
>> HUGE heap.  When a full GC happens on a heap that large, it's going to be a 
>> long pause, and there's nothing that can be done about it.
>> 
>>> We have enough memory allocated to Solr (60 gb) and system also have enough 
>>> memory (300 gb)...
>> 
>> As just mentioned, unless you are CERTAIN that you need a 60GB heap, which 
>> most users do not, don't set it that high.  Any advice you read that says 
>> "set the heap to XX percent of the installed system memory"
>> will frequently result in a setting that's incorrect for your specific setup.
>> 
>> And if you really DO need a 60GB heap, it would be recommended to either add 
>> more servers and put less of your index on each one, or to split your 
>> replicas between two Solr instances each running 31GB or less -- as Erick 
>> mentioned in his reply.
>> 
>>> We have analyzed GC logs and found that there was GC pause time of 
>>> 29.6583943 second when problem happened. Can this GC Pause lead to make the 
>>> node unavailable/recovery mode? or there could be some another reason ?
>> 
>>> Please note we have set zkClientTimeout to 10 minutes 
>>> (zkClientTimeout=60) so that zookeeper will not consider this node 
>>> unavailable during high GC pause time.
>> 
>> You can't actually set that timeout that high.  I believe that ZooKeeper 
>> limits the session timeout to 20 times the tickTime, which is typically set 
>> to 2 seconds.  So 40 seconds is typically the maximum you can have for that 
>> timeout.  Solr's zkClientTimeout value is used to set ZooKeeper's session 
>> timeout.
>> 
>> And, as Erick also mentioned, there are other ways that a long GC pause can 
>> cause problems other than that specific timeout.  SolrCloud is not going to 
>> work well with a huge heap ... eventually a full GC is going to happen, and 
>> if it takes more than a few seconds, it's going to cause issues.
>> 
>> Thanks,
>> Shawn
>> 
>> 
> 
> 
> 
> 



Re: Writing unit tests to test complex solr queries

2019-05-14 Thread Angie Rabelero
Hi, I’d advise you to extend the class SolrCloudTestCase, which wraps a 
MiniSolrCloudCluster. There’s a hello world example in the Solr source at 
https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/HelloWorldSolrCloudTestCase.java.

Here’s how I set up a cluster, create a collection with my ConfigSet, and index 
a file.

@BeforeClass
public static void setupCluster() throws Exception {

    // Create and configure the cluster
    configureCluster(nodeCount)
        .addConfig(CONFIG_NAME, getFile(CONFIG_DIR).toPath())
        .configure();

    // Create an empty collection from the uploaded ConfigSet
    CollectionAdminRequest.createCollection(COLLECTION, CONFIG_NAME, numShards, numReplicas)
        .setMaxShardsPerNode(maxShardsPerNode)
        .process(cluster.getSolrClient(), COLLECTION);
    AbstractDistribZkTestBase.waitForRecoveriesToFinish(COLLECTION,
        cluster.getSolrClient().getZkStateReader(), true, true, 120);

    // Set the default collection so tests don't have to name it on every request
    cluster.getSolrClient().setDefaultCollection(COLLECTION);

    // Add documents to the collection and hard-commit them
    ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update");
    up.addFile(getFile("testdata/test-data.json"), "application/json");
    up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
    NamedList<Object> result = cluster.getSolrClient().request(up);

    // Print cluster status
    System.out.println("Default Collection: " + cluster.getSolrClient().getDefaultCollection());
    System.out.println("Cluster State: " + cluster.getSolrClient().getZkStateReader().getClusterState());
    System.out.println("Update Result: " + result);

}
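
With that in place, each test is just a query against cluster.getSolrClient(). 
A minimal sketch (the assertions are illustrative):

@Test
public void testDataWasIndexed() throws Exception {
    QueryResponse rsp = cluster.getSolrClient().query(new SolrQuery("*:*"));
    assertEquals(0, rsp.getStatus());
    assertTrue(rsp.getResults().getNumFound() > 0);
}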

I copy the ConfigSet to the resources dir using a Maven plugin configured in 
the POM. The test data file is already in the resources dir.




> On May 14, 2019, at 04:01, Mikhail Khludnev  wrote:
> 
> Hello, Pratick.
> Welcome to the mysterious world of Solr testing. The best way is to find an
> existing test closest to your problem field, copy it, and amend it as necessary.
> What about
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/test/org/apache/solr/client/solrj/io/stream/StreamExpressionTest.java
> ?
> 
> On Fri, May 10, 2019 at 11:36 PM Pratik Patel  wrote:
> 
>> Hello Everyone,
>> 
>> I want to write unit tests for some solr queries which are being triggered
>> through java code. These queries includes complex streaming expressions and
>> faceting queries which requires large number of documents to be present in
>> solr index. I can not create and push so many documents programmatically
>> through my tests.
>> 
>> I am trying to find a way to test these queries without depending on
>> externally running solr instance. I found following approach which is using
>> classes like EmbeddedSolrServer and CoreContainer. We can put index files
>> and solr configuration on classpath and run the tests against them.
>> 
>> https://dzone.com/articles/junit-testing-for-solr-6
>> 
>> However, this seems to be an old approach and I am trying to find a way to
>> do it using latest solr-test-framework. I also can not use old approach
>> because I want to test Streaming Expressions as well and I need
>> SolrCloudClient for that.
>> In solr-test-framework, I found MiniSolrCloudCluster class but I don't know
>> how to use pre-created index files and configuration with that.
>> 
>> Does anyone know how we can use pre-created index files and configuration
>> with latest test-framework? What is the recommended way to do such kind of
>> testing? Any direction with this would be really helpful.
>> 
>> Thanks!
>> Pratik
>> 
> 
> 
> -- 
> Sincerely yours
> Mikhail Khludnev



Re: Solr node goes into recovery mode

2019-05-14 Thread Maulin Rathod
Thanks Erick,

I understand using docValues should improve query performance. Please correct me 
if my understanding is incorrect.

Regards,

Maulin



On May 14, 2019 19:11, Erick Erickson  wrote:
Use docValues on all fields you group, facet or sort on.

NOTE: you _must_ re-index from scratch, I’d index to a new collection and start 
over. Paradoxically your index size _on disk_ will increase, but your JVM will 
need drastically less heap. See: 
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Best,
Erick

> On May 14, 2019, at 1:11 AM, Maulin Rathod  wrote:
>
> Thanks for reply.
>
> Our solr node normally uses 30-45 gb and hence we allocated a 60 gb heap.  We 
> analyzed a heap dump and found that around 85% of the heap was used by 
> org.apache.solr.uninverting.FieldCacheImpl.
> 
> One instance of
> "org.apache.solr.uninverting.FieldCacheImpl" loaded by 
> "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x48fe5e9b0" occupies 
> 19,72,04,15,160 (86.28%) bytes. The memory is accumulated in one instance of 
> "java.util.HashMap$Node[]" loaded by "".
> 
>
> Please note we are not using any Solr cache, as in our scenario new documents 
> are added to the index quite fast (at least 10 documents are added to the 
> index every second) and we need to reopen the searcher to make these new 
> documents available.
>
> We are not using docValues. As per our understanding, using docValues should 
> improve query performance and should also reduce memory requirements, as we 
> are using lots of sorting/faceting in our queries. Please let me know your 
> thoughts on it. Please also suggest if there are any other ways to reduce the 
> memory requirement/optimize the performance.
>
>
> Regards,
>
> Maulin
>
> -Original Message-
> From: Shawn Heisey 
> Sent: 14 May 2019 01:04
> To: solr-user@lucene.apache.org
> Subject: Re: Solr node goes into recovery mode
>
> On 5/13/2019 8:26 AM, Maulin Rathod wrote:
>> Recently we are observing issue where solr node (any random node) 
>> automatically goes into recovery mode and stops responding.
>
> Do you KNOW that these Solr instances actually need a 60GB heap?  That's a 
> HUGE heap.  When a full GC happens on a heap that large, it's going to be a 
> long pause, and there's nothing that can be done about it.
>
>> We have enough memory allocated to Solr (60 gb) and system also have enough 
>> memory (300 gb)...
>
> As just mentioned, unless you are CERTAIN that you need a 60GB heap, which 
> most users do not, don't set it that high.  Any advice you read that says 
> "set the heap to XX percent of the installed system memory"
> will frequently result in a setting that's incorrect for your specific setup.
>
> And if you really DO need a 60GB heap, it would be recommended to either add 
> more servers and put less of your index on each one, or to split your 
> replicas between two Solr instances each running 31GB or less -- as Erick 
> mentioned in his reply.
>
>> We have analyzed GC logs and found that there was GC pause time of 
>> 29.6583943 second when problem happened. Can this GC Pause lead to make the 
>> node unavailable/recovery mode? or there could be some another reason ?
>
>> Please note we have set zkClientTimeout to 10 minutes 
>> (zkClientTimeout=60) so that zookeeper will not consider this node 
>> unavailable during high GC pause time.
>
> You can't actually set that timeout that high.  I believe that ZooKeeper 
> limits the session timeout to 20 times the tickTime, which is typically set 
> to 2 seconds.  So 40 seconds is typically the maximum you can have for that 
> timeout.  Solr's zkClientTimeout value is used to set ZooKeeper's session 
> timeout.
>
> And, as Erick also mentioned, there are other ways that a long GC pause can 
> cause problems other than that specific timeout.  SolrCloud is not going to 
> work well with a huge heap ... eventually a full GC is going to happen, and 
> if it takes more than a few seconds, it's going to cause issues.
>
> Thanks,
> Shawn
>
>






RE: mergeindexes action does not seem to be merging cores.

2019-05-14 Thread Piyush Kumar Nayak
Yes, that worked. 
Thanks Erick, for your help.


-Original Message-
From: Erick Erickson  
Sent: Tuesday, May 14, 2019 7:09 PM
To: solr-user@lucene.apache.org
Subject: Re: mergeindexes action does not seem to be merging cores.

Did you commit afterwards?

> On May 14, 2019, at 8:04 AM, Piyush Kumar Nayak  
> wrote:
> 
> Hi,
> 
> I don't seem to be able to get the merge core feature to work with Solr 7.2.1.
> I'm using the srcCore parameter method documented at 
> https://lucene.apache.org/solr/guide/6_6/coreadmin-api.html#CoreAdminAPI-MERGEINDEXES
> 
> I am making the following GET HTTP call using a browser:
> http://localhost:8991/solr/admin/cores?action=mergeindexes&core=mcol&srcCore=col1&srcCore=col2
> In the call above the following cores are referenced:
> merged core name: mcol
> core 1 name : fc1
> core 2 name : fc2
> All 3 cores preexist.
> 
> The response content I get is :
> {
>  "responseHeader":{
>"status":0,
>"QTime":57}}
> 
> I see the following in the logs:
> 747532 [qtp2028017635-17] INFO  org.apache.solr.servlet.HttpSolrCall  
> - [admin] webapp=null path=/admin/cores 
> params={core=mcx&action=mergeindexes&srcCore=fcl1&srcCore=fc2} 
> status=400 QTime=2
> 756577 [qtp2028017635-18] INFO  
> org.apache.solr.update.DirectUpdateHandler2  - start 
> mergeIndexes{NRTCachingDirectory(MMapDirectory@C:\collections\fc1\data
> \index 
> lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@7c53c896
> ; maxCacheMB=48.0 
> maxMergeSizeMB=4.0),NRTCachingDirectory(MMapDirectory@C:\collections\f
> c2\data\index 
> lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@3e2255a5
> ; maxCacheMB=48.0 maxMergeSizeMB=4.0)}
> 756602 [qtp2028017635-18] INFO  
> org.apache.solr.update.DirectUpdateHandler2  - end_mergeIndexes
> 
> But the number of docs in the merged core does not increase after the merge 
> operation. Searches for keywords that exist in core1 do not return results 
> from the merged core.
> Am I missing something?
> 
> 
> Best regards,
> Piyush.



Re: Solr node goes into recovery mode

2019-05-14 Thread Erick Erickson
Use docValues on all fields you group, facet or sort on.

NOTE: you _must_ re-index from scratch, I’d index to a new collection and start 
over. Paradoxically your index size _on disk_ will increase, but your JVM will 
need drastically less heap. See: 
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
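
In schema.xml that is just the docValues attribute on each such field. A sketch 
with made-up field names:

<field name="category" type="string" indexed="true" stored="true" docValues="true"/>
<field name="modified" type="pdate"  indexed="true" stored="true" docValues="true"/>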

Best,
Erick

> On May 14, 2019, at 1:11 AM, Maulin Rathod  wrote:
> 
> Thanks for reply.
> 
> Our solr node normally uses 30-45 gb and hence we allocated a 60 gb heap.  We 
> analyzed a heap dump and found that around 85% of the heap was used by 
> org.apache.solr.uninverting.FieldCacheImpl.
> 
> One instance of
> "org.apache.solr.uninverting.FieldCacheImpl" loaded by 
> "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x48fe5e9b0" occupies 
> 19,72,04,15,160 (86.28%) bytes. The memory is accumulated in one instance of 
> "java.util.HashMap$Node[]" loaded by "".
> 
> 
> Please note we are not using any Solr cache, as in our scenario new documents 
> are added to the index quite fast (at least 10 documents are added to the 
> index every second) and we need to reopen the searcher to make these new 
> documents available.
> 
> We are not using docValues. As per our understanding, using docValues should 
> improve query performance and should also reduce memory requirements, as we 
> are using lots of sorting/faceting in our queries. Please let me know your 
> thoughts on it. Please also suggest if there are any other ways to reduce the 
> memory requirement/optimize the performance.
> 
> 
> Regards,
> 
> Maulin
> 
> -Original Message-
> From: Shawn Heisey 
> Sent: 14 May 2019 01:04
> To: solr-user@lucene.apache.org
> Subject: Re: Solr node goes into recovery mode
> 
> On 5/13/2019 8:26 AM, Maulin Rathod wrote:
>> Recently we are observing issue where solr node (any random node) 
>> automatically goes into recovery mode and stops responding.
> 
> Do you KNOW that these Solr instances actually need a 60GB heap?  That's a 
> HUGE heap.  When a full GC happens on a heap that large, it's going to be a 
> long pause, and there's nothing that can be done about it.
> 
>> We have enough memory allocated to Solr (60 gb) and system also have enough 
>> memory (300 gb)...
> 
> As just mentioned, unless you are CERTAIN that you need a 60GB heap, which 
> most users do not, don't set it that high.  Any advice you read that says 
> "set the heap to XX percent of the installed system memory"
> will frequently result in a setting that's incorrect for your specific setup.
> 
> And if you really DO need a 60GB heap, it would be recommended to either add 
> more servers and put less of your index on each one, or to split your 
> replicas between two Solr instances each running 31GB or less -- as Erick 
> mentioned in his reply.
> 
>> We have analyzed GC logs and found that there was GC pause time of 
>> 29.6583943 second when problem happened. Can this GC Pause lead to make the 
>> node unavailable/recovery mode? or there could be some another reason ?
> 
>> Please note we have set zkClientTimeout to 10 minutes 
>> (zkClientTimeout=60) so that zookeeper will not consider this node 
>> unavailable during high GC pause time.
> 
> You can't actually set that timeout that high.  I believe that ZooKeeper 
> limits the session timeout to 20 times the tickTime, which is typically set 
> to 2 seconds.  So 40 seconds is typically the maximum you can have for that 
> timeout.  Solr's zkClientTimeout value is used to set ZooKeeper's session 
> timeout.
> 
> And, as Erick also mentioned, there are other ways that a long GC pause can 
> cause problems other than that specific timeout.  SolrCloud is not going to 
> work well with a huge heap ... eventually a full GC is going to happen, and 
> if it takes more than a few seconds, it's going to cause issues.
> 
> Thanks,
> Shawn
> 
> 



Re: mergeindexes action does not seem to be merging cores.

2019-05-14 Thread Erick Erickson
Did you commit afterwards?
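
For example, an explicit commit against the target core (core name taken from 
your mail) should make the merged documents visible:

http://localhost:8991/solr/mcol/update?commit=true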

> On May 14, 2019, at 8:04 AM, Piyush Kumar Nayak  
> wrote:
> 
> Hi,
> 
> I don't seem to be able to get the merge core feature to work with Solr 7.2.1.
> I'm using the srcCore parameter method documented at 
> https://lucene.apache.org/solr/guide/6_6/coreadmin-api.html#CoreAdminAPI-MERGEINDEXES
> 
> I am making the following GET HTTP call using a browser:
> http://localhost:8991/solr/admin/cores?action=mergeindexes&core=mcol&srcCore=col1&srcCore=col2
> In the call above the following cores are referenced:
> merged core name: mcol
> core 1 name : fc1
> core 2 name : fc2
> All 3 cores preexist.
> 
> The response content I get is :
> {
>  "responseHeader":{
>"status":0,
>"QTime":57}}
> 
> I see the following in the logs:
> 747532 [qtp2028017635-17] INFO  org.apache.solr.servlet.HttpSolrCall  - 
> [admin] webapp=null path=/admin/cores 
> params={core=mcx&action=mergeindexes&srcCore=fcl1&srcCore=fc2} status=400 
> QTime=2
> 756577 [qtp2028017635-18] INFO  org.apache.solr.update.DirectUpdateHandler2  
> - start 
> mergeIndexes{NRTCachingDirectory(MMapDirectory@C:\collections\fc1\data\index 
> lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@7c53c896; 
> maxCacheMB=48.0 
> maxMergeSizeMB=4.0),NRTCachingDirectory(MMapDirectory@C:\collections\fc2\data\index
>  lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@3e2255a5; 
> maxCacheMB=48.0 maxMergeSizeMB=4.0)}
> 756602 [qtp2028017635-18] INFO  org.apache.solr.update.DirectUpdateHandler2  
> - end_mergeIndexes
> 
> But the number of docs in the merged core does not increase after the merge 
> operation. Searches for keywords that exist in core1 do not return results 
> from the merged core.
> Am I missing something?
> 
> 
> Best regards,
> Piyush.



mergeindexes action does not seem to be merging cores.

2019-05-14 Thread Piyush Kumar Nayak
Hi,

I don't seem to be able to get the merge core feature to work with Solr 7.2.1.
I'm using the srcCore parameter method documented at 
https://lucene.apache.org/solr/guide/6_6/coreadmin-api.html#CoreAdminAPI-MERGEINDEXES

I am making the following GET HTTP call using a browser:
http://localhost:8991/solr/admin/cores?action=mergeindexes&core=mcol&srcCore=col1&srcCore=col2
In the call above the following cores are referenced:
merged core name: mcol
core 1 name : fc1
core 2 name : fc2
All 3 cores preexist.

The response content I get is :
{
  "responseHeader":{
"status":0,
"QTime":57}}

I see the following in the logs:
747532 [qtp2028017635-17] INFO  org.apache.solr.servlet.HttpSolrCall  - [admin] 
webapp=null path=/admin/cores 
params={core=mcx&action=mergeindexes&srcCore=fcl1&srcCore=fc2} status=400 
QTime=2
756577 [qtp2028017635-18] INFO  org.apache.solr.update.DirectUpdateHandler2  - 
start 
mergeIndexes{NRTCachingDirectory(MMapDirectory@C:\collections\fc1\data\index 
lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@7c53c896; 
maxCacheMB=48.0 
maxMergeSizeMB=4.0),NRTCachingDirectory(MMapDirectory@C:\collections\fc2\data\index
 lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@3e2255a5; 
maxCacheMB=48.0 maxMergeSizeMB=4.0)}
756602 [qtp2028017635-18] INFO  org.apache.solr.update.DirectUpdateHandler2  - 
end_mergeIndexes

But the number of docs in the merged core does not increase after the merge 
operation. Searches for keywords that exist in core1 do not return results from 
the merged core.
Am I missing something?


Best regards,
Piyush.


Re: query keyword but no result (solr 8)

2019-05-14 Thread Charles Sanders
Derrick, 
This makes me think you do not have a default query field identified. Look in 
your solrconfig.xml file for the request handler you are calling and see if it 
has the "df" parameter set. It should look something like: 
<str name="df">text</str> 
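
In context, a minimal sketch of where that line lives in solrconfig.xml (the 
handler name and field name are illustrative): 

<requestHandler name="/select" class="solr.SearchHandler"> 
  <lst name="defaults"> 
    <str name="df">text</str> 
  </lst> 
</requestHandler> 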

Hope this helps! 

- Original Message -

From: "Derrick Cui"  
To: solr-user@lucene.apache.org 
Sent: Monday, May 13, 2019 10:52:13 PM 
Subject: query keyword but no result (solr 8) 

Hi, 

I am trying to set up SolrCloud. I can index a few documents successfully, 
but I cannot get results if I search for a keyword (without a field). If I use 
field:keyword, I can get results. 

any idea why I get this issue? 

Thank you 

-- 
Regards, 

Derrick Cui 
Email: derrick...@gmail.com 



RE: Solr node goes into recovery mode

2019-05-14 Thread Maulin Rathod
Thanks for reply.

Our solr node normally uses 30-45 gb and hence we allocated a 60 gb heap.  We 
analyzed a heap dump and found that around 85% of the heap was used by 
org.apache.solr.uninverting.FieldCacheImpl.

One instance of
"org.apache.solr.uninverting.FieldCacheImpl" loaded by 
"org.eclipse.jetty.webapp.WebAppClassLoader @ 0x48fe5e9b0" occupies 
19,72,04,15,160 (86.28%) bytes. The memory is accumulated in one instance of 
"java.util.HashMap$Node[]" loaded by "".


Please note we are not using any Solr cache, as in our scenario new documents 
are added to the index quite fast (at least 10 documents are added to the index 
every second) and we need to reopen the searcher to make these new documents 
available.

We are not using docValues. As per our understanding, using docValues should 
improve query performance and should also reduce memory requirements, as we are 
using lots of sorting/faceting in our queries. Please let me know your thoughts 
on it. Please also suggest if there are any other ways to reduce the memory 
requirement/optimize the performance.


Regards,

Maulin

-Original Message-
From: Shawn Heisey 
Sent: 14 May 2019 01:04
To: solr-user@lucene.apache.org
Subject: Re: Solr node goes into recovery mode

On 5/13/2019 8:26 AM, Maulin Rathod wrote:
> Recently we are observing issue where solr node (any random node) 
> automatically goes into recovery mode and stops responding.

Do you KNOW that these Solr instances actually need a 60GB heap?  That's a HUGE 
heap.  When a full GC happens on a heap that large, it's going to be a long 
pause, and there's nothing that can be done about it.

> We have enough memory allocated to Solr (60 gb) and system also have enough 
> memory (300 gb)...

As just mentioned, unless you are CERTAIN that you need a 60GB heap, which most 
users do not, don't set it that high.  Any advice you read that says "set the 
heap to XX percent of the installed system memory"
will frequently result in a setting that's incorrect for your specific setup.

And if you really DO need a 60GB heap, it would be recommended to either add 
more servers and put less of your index on each one, or to split your replicas 
between two Solr instances each running 31GB or less -- as Erick mentioned in 
his reply.
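
(As a sketch of where that is set: the heap size lives in solr.in.sh, and 
staying at or below roughly 31GB also lets the JVM keep using compressed 
object pointers, so each GB of heap goes further.)

SOLR_HEAP="31g"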

> We have analyzed GC logs and found that there was GC pause time of 29.6583943 
> second when problem happened. Can this GC Pause lead to make the node 
> unavailable/recovery mode? or there could be some another reason ?

> Please note we have set zkClientTimeout to 10 minutes 
> (zkClientTimeout=60) so that zookeeper will not consider this node 
> unavailable during high GC pause time.

You can't actually set that timeout that high.  I believe that ZooKeeper limits 
the session timeout to 20 times the tickTime, which is typically set to 2 
seconds.  So 40 seconds is typically the maximum you can have for that timeout. 
 Solr's zkClientTimeout value is used to set ZooKeeper's session timeout.
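
(For reference, the relevant knobs in zoo.cfg -- the values shown are the usual 
defaults, not taken from your setup:)

tickTime=2000
# session timeouts are clamped to [2 * tickTime, 20 * tickTime] unless overridden
maxSessionTimeout=40000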

And, as Erick also mentioned, there are other ways that a long GC pause can 
cause problems other than that specific timeout.  SolrCloud is not going to 
work well with a huge heap ... eventually a full GC is going to happen, and if 
it takes more than a few seconds, it's going to cause issues.

Thanks,
Shawn
