Shawn,

Just wondering if you have any other suggestions on what the next steps
should be? Thanks.

On Thu, Oct 16, 2014 at 11:12 PM, S.L <simpleliving...@gmail.com> wrote:

> Shawn,
>
>
>    1. I will upgrade to JVM build 67 shortly.
>    2. This is a new collection. I was facing a similar issue in 4.7 and,
>    based on Erick's recommendation, I upgraded to 4.10.1 and created a new
>    collection.
>    3. Yes, I am hitting the replicas of the same shard, and I can see that
>    the lists are completely non-overlapping. I am using CloudSolrServer to
>    add the documents (a rough sketch of the indexing code follows the
>    clusterstate below).
>    4. I have a 3-node physical cluster, with each node having 16GB of memory.
>    5. I also have a custom request handler defined in my solrconfig.xml,
>    shown below. However, I am not using it; I am only using the default
>    select handler. The MyCustomHandler class has been added to the source
>    and included in the build, but it is not used for any requests yet.
>
>   <requestHandler name="/mycustomselect" class="solr.MyCustomHandler"
> startup="lazy">
>     <lst name="defaults">
>       <str name="df">suggestAggregate</str>
>
>       <str name="spellcheck.dictionary">direct</str>
>       <!--<str name="spellcheck.dictionary">wordbreak</str>-->
>       <str name="spellcheck">on</str>
>       <str name="spellcheck.extendedResults">true</str>
>       <str name="spellcheck.count">10</str>
>       <str name="spellcheck.alternativeTermCount">5</str>
>       <str name="spellcheck.maxResultsForSuggest">5</str>
>       <str name="spellcheck.collate">true</str>
>       <str name="spellcheck.collateExtendedResults">true</str>
>       <str name="spellcheck.maxCollationTries">10</str>
>       <str name="spellcheck.maxCollations">5</str>
>     </lst>
>     <arr name="last-components">
>       <str>spellcheck</str>
>     </arr>
>   </requestHandler>
>
>
>    6. The clusterstate.json is copied below:
>
>   {"dyCollection1":{
>     "shards":{
>       "shard1":{
>         "range":"80000000-d554ffff",
>         "state":"active",
>         "replicas":{
>           "core_node3":{
>             "state":"active",
>             "core":"dyCollection1_shard1_replica1",
>             "node_name":"server3.mydomain.com:8082_solr",
>             "base_url":"http://server3.mydomain.com:8082/solr"},
>           "core_node4":{
>             "state":"active",
>             "core":"dyCollection1_shard1_replica2",
>             "node_name":"server2.mydomain.com:8081_solr",
>             "base_url":"http://server2.mydomain.com:8081/solr",
>             "leader":"true"}}},
>       "shard2":{
>         "range":"d5550000-2aa9ffff",
>         "state":"active",
>         "replicas":{
>           "core_node1":{
>             "state":"active",
>             "core":"dyCollection1_shard2_replica1",
>             "node_name":"server1.mydomain.com:8081_solr",
>             "base_url":"http://server1.mydomain.com:8081/solr",
>             "leader":"true"},
>           "core_node6":{
>             "state":"active",
>             "core":"dyCollection1_shard2_replica2",
>             "node_name":"server3.mydomain.com:8081_solr",
>             "base_url":"http://server3.mydomain.com:8081/solr"}}},
>       "shard3":{
>         "range":"2aaa0000-7fffffff",
>         "state":"active",
>         "replicas":{
>           "core_node2":{
>             "state":"active",
>             "core":"dyCollection1_shard3_replica2",
>             "node_name":"server1.mydomain.com:8082_solr",
>             "base_url":"http://server1.mydomain.com:8082/solr",
>             "leader":"true"},
>           "core_node5":{
>             "state":"active",
>             "core":"dyCollection1_shard3_replica1",
>             "node_name":"server2.mydomain.com:8082_solr",
>             "base_url":"http://server2.mydomain.com:8082/solr"}}}},
>     "maxShardsPerNode":"1",
>     "router":{"name":"compositeId"},
>     "replicationFactor":"2",
>     "autoAddReplicas":"false"}}
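>
>   For reference, here is a rough sketch of the kind of SolrJ 4.10.x code I
>   use to add documents through CloudSolrServer. The ZooKeeper hosts and
>   collection name match my setup above, but the document field value is
>   just a placeholder, not my real schema:
>
>   import org.apache.solr.client.solrj.impl.CloudSolrServer;
>   import org.apache.solr.common.SolrInputDocument;
>
>   public class AddDocs {
>     public static void main(String[] args) throws Exception {
>       // Connect through the external ZooKeeper ensemble listed in the JVM args.
>       CloudSolrServer server = new CloudSolrServer(
>           "server1.mydomain.com:2181,server2.mydomain.com:2181,server3.mydomain.com:2181");
>       server.setDefaultCollection("dyCollection1");
>
>       // Placeholder document; the unique key field is "id".
>       SolrInputDocument doc = new SolrInputDocument();
>       doc.addField("id", "some-unique-id");
>       server.add(doc);
>       server.commit();
>       server.shutdown();
>     }
>   }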
>
>   Thanks!
>
> On Thu, Oct 16, 2014 at 9:02 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 10/16/2014 6:27 PM, S.L wrote:
>>
>>> 1. Java Version: java version "1.7.0_51"
>>> Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
>>> Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
>>>
>>
>> I believe that build 51 is one of those that is known to have bugs
>> related to Lucene.  If you can upgrade this to 67, that would be good, but
>> I don't know that it's a pressing matter.  It looks like the Oracle JVM,
>> which is good.
>>
>>  2. OS
>>> CentOS Linux release 7.0.1406 (Core)
>>>
>>> 3. Everything is 64-bit: OS, Java, and CPU.
>>>
>>> 4. Java Args.
>>>      -Djava.io.tmpdir=/opt/tomcat1/temp
>>>      -Dcatalina.home=/opt/tomcat1
>>>      -Dcatalina.base=/opt/tomcat1
>>>      -Djava.endorsed.dirs=/opt/tomcat1/endorsed
>>>      -DzkHost=server1.mydomain.com:2181,server2.mydomain.com:2181,
>>> server3.mydomain.com:2181
>>>      -DzkClientTimeout=20000
>>>      -DhostContext=solr
>>>      -Dport=8081
>>>      -Dhost=server1.mydomain.com
>>>      -Dsolr.solr.home=/opt/solr/home1
>>>      -Dfile.encoding=UTF8
>>>      -Duser.timezone=UTC
>>>      -XX:+UseG1GC
>>>      -XX:MaxPermSize=128m
>>>      -XX:PermSize=64m
>>>      -Xmx2048m
>>>      -Xms128m
>>>      -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
>>>      -Djava.util.logging.config.file=/opt/tomcat1/conf/
>>> logging.properties
>>>
>>
>> I would not use the G1 collector myself, but with the heap at only 2GB, I
>> don't know that it matters all that much.  Even a worst-case collection
>> probably is not going to take more than a few seconds, and you've already
>> increased the zookeeper client timeout.
>>
>> http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning
>>
>>  5. The Zookeeper ensemble has 3 zookeeper instances, which are external
>>> and are not embedded.
>>>
>>>
>>> 6. Container: I am using Apache Tomcat version 7.0.42
>>>
>>> *Additional Observations:*
>>>
>>> I queried all docs on both replicas with distrib=false&fl=id&sort=id+asc
>>> and compared the two lists. Eyeballing the first few lines of ids in both
>>> lists, I can say that even though each list has an equal number of
>>> documents (96309 each), the document ids in them appear to be *mutually
>>> exclusive*. I did not find even a single common id in those lists (I tried
>>> at least 15 manually), so it looks to me like the replicas are disjoint
>>> sets.
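>>>
>>> Roughly, the check looked like the sketch below (shard1's two replica
>>> cores are used here as an example, and the rows value is just set high
>>> enough to cover the ~96K documents per replica):
>>>
>>>   import java.util.HashSet;
>>>   import java.util.Set;
>>>   import org.apache.solr.client.solrj.SolrQuery;
>>>   import org.apache.solr.client.solrj.impl.HttpSolrServer;
>>>   import org.apache.solr.common.SolrDocument;
>>>
>>>   public class CompareReplicas {
>>>     // Fetch all ids from a single replica core, bypassing distributed search.
>>>     static Set<String> ids(String coreUrl) throws Exception {
>>>       HttpSolrServer server = new HttpSolrServer(coreUrl);
>>>       SolrQuery q = new SolrQuery("*:*");
>>>       q.set("distrib", false);
>>>       q.setFields("id");
>>>       q.setSort("id", SolrQuery.ORDER.asc);
>>>       q.setRows(200000);
>>>       Set<String> ids = new HashSet<String>();
>>>       for (SolrDocument doc : server.query(q).getResults()) {
>>>         ids.add(String.valueOf(doc.getFieldValue("id")));
>>>       }
>>>       server.shutdown();
>>>       return ids;
>>>     }
>>>
>>>     public static void main(String[] args) throws Exception {
>>>       // Both replicas of shard1, taken from the clusterstate.
>>>       Set<String> a = ids("http://server3.mydomain.com:8082/solr/dyCollection1_shard1_replica1");
>>>       Set<String> b = ids("http://server2.mydomain.com:8081/solr/dyCollection1_shard1_replica2");
>>>       Set<String> common = new HashSet<String>(a);
>>>       common.retainAll(b);
>>>       System.out.println(a.size() + " vs " + b.size() + " docs, "
>>>           + common.size() + " ids in common");
>>>     }
>>>   }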
>>>
>>
>> Are you sure you hit both replicas of the same shard number?  If you are,
>> then it sounds like something is going wrong with your document routing, or
>> maybe your clusterstate is really messed up.  Recreating the collection
>> from scratch and doing a full reindex might be a good plan ... assuming
>> this is possible for you.  You could create a whole new collection, and
>> then when you're ready to switch, delete the original collection and create
>> an alias so your app can still use the old name.
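>>
>> As a rough, untested sketch, the swap could be done with the Collections
>> API through SolrJ along these lines. "dyCollection2" is a made-up name for
>> the new collection, and the CREATE parameters (numShards, replicationFactor,
>> maxShardsPerNode, and collection.configName if you have more than one
>> configset in zookeeper) should match whatever layout you actually want:
>>
>>   import org.apache.solr.client.solrj.impl.HttpSolrServer;
>>   import org.apache.solr.client.solrj.request.QueryRequest;
>>   import org.apache.solr.common.params.ModifiableSolrParams;
>>
>>   public class SwapCollection {
>>     // Send one Collections API call (key/value pairs) to any node in the cluster.
>>     static void collectionsApi(HttpSolrServer server, String... keyVals) throws Exception {
>>       ModifiableSolrParams params = new ModifiableSolrParams();
>>       for (int i = 0; i < keyVals.length; i += 2) {
>>         params.set(keyVals[i], keyVals[i + 1]);
>>       }
>>       QueryRequest request = new QueryRequest(params);
>>       request.setPath("/admin/collections");
>>       server.request(request);
>>     }
>>
>>     public static void main(String[] args) throws Exception {
>>       HttpSolrServer server = new HttpSolrServer("http://server1.mydomain.com:8081/solr");
>>
>>       // 1. Create the new collection and fully reindex into it.
>>       collectionsApi(server, "action", "CREATE", "name", "dyCollection2",
>>           "numShards", "3", "replicationFactor", "2", "maxShardsPerNode", "2");
>>       // ... full reindex into dyCollection2, then verify it ...
>>
>>       // 2. Delete the old collection and alias the old name to the new one,
>>       //    so the application can keep using "dyCollection1".
>>       collectionsApi(server, "action", "DELETE", "name", "dyCollection1");
>>       collectionsApi(server, "action", "CREATEALIAS",
>>           "name", "dyCollection1", "collections", "dyCollection2");
>>
>>       server.shutdown();
>>     }
>>   }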
>>
>> How much total RAM do you have on these systems, and how large are those
>> index shards?  With a shard having 96K documents, it sounds like your whole
>> index is probably just shy of 300K documents.
>>
>> Thanks,
>> Shawn
>>
>>
>
