[ 
https://issues.apache.org/jira/browse/DRILL-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463306#comment-16463306
 ] 

ASF GitHub Bot commented on DRILL-6380:
---------------------------------------

ilooner commented on a change in pull request #1249: DRILL-6380: Fix sporadic 
mongo db hangs.
URL: https://github.com/apache/drill/pull/1249#discussion_r185981888
 
 

 ##########
 File path: 
contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/MongoTestSuit.java
 ##########
 @@ -94,7 +94,8 @@ private static void setup() throws Exception {
       configServers.add(crateConfigServerConfig(CONFIG_SERVER_3_PORT));
 
       // creating replicaSets
-      Map<String, List<IMongodConfig>> replicaSets = new HashMap<>();
+      // A TreeMap ensures that the config servers are started first.
+      Map<String, List<IMongodConfig>> replicaSets = new TreeMap<>();
 
 Review comment:
   The config servers need to be launched first. A TreeMap guarantees that the 
CONFIG_REPLICA_SET key is iterated over first, since it sorts lexicographically 
before the other replica set names. So when the flapdoodle library iterates 
over the replica sets, it sees CONFIG_REPLICA_SET first and launches the config 
servers.
   
   I agree that a LinkedHashMap is more intuitive, though, so I have changed 
the code to use that instead.
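
   To illustrate the ordering argument, here is a minimal, self-contained 
sketch. It is not the code from the pull request: the replica set names, the 
port numbers, and the class name are hypothetical, and plain port lists stand 
in for the IMongodConfig values. It only shows that a TreeMap returns keys in 
sorted order regardless of insertion order, while a LinkedHashMap returns them 
in insertion order, so the config servers must be inserted first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ReplicaSetOrderSketch {
    // Hypothetical replica set names, chosen only for illustration.
    static final String CONFIG_REPLICA_SET = "config_servers";
    static final String REPLICA_SET_1 = "shard_1_replicas";
    static final String REPLICA_SET_2 = "shard_2_replicas";

    // TreeMap: keys are iterated in sorted (lexicographic) order,
    // whatever order they were inserted in.
    static List<String> treeMapOrder() {
        Map<String, List<Integer>> replicaSets = new TreeMap<>();
        replicaSets.put(REPLICA_SET_2, Arrays.asList(27023, 27024));
        replicaSets.put(REPLICA_SET_1, Arrays.asList(27021, 27022));
        replicaSets.put(CONFIG_REPLICA_SET, Arrays.asList(27019, 27020));
        return new ArrayList<>(replicaSets.keySet());
    }

    // LinkedHashMap: keys are iterated in insertion order, so the
    // config servers have to be put into the map first.
    static List<String> linkedHashMapOrder() {
        Map<String, List<Integer>> replicaSets = new LinkedHashMap<>();
        replicaSets.put(CONFIG_REPLICA_SET, Arrays.asList(27019, 27020));
        replicaSets.put(REPLICA_SET_2, Arrays.asList(27023, 27024));
        replicaSets.put(REPLICA_SET_1, Arrays.asList(27021, 27022));
        return new ArrayList<>(replicaSets.keySet());
    }

    public static void main(String[] args) {
        System.out.println("TreeMap:       " + treeMapOrder());
        System.out.println("LinkedHashMap: " + linkedHashMapOrder());
    }
}
```

   With a TreeMap the ordering depends on how the key names happen to sort, 
which is easy to break by renaming a replica set. With a LinkedHashMap the 
ordering is stated directly by the order of the put calls.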



> Mongo db storage plugin tests can hang on jenkins.
> --------------------------------------------------
>
>                 Key: DRILL-6380
>                 URL: https://issues.apache.org/jira/browse/DRILL-6380
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Timothy Farkas
>            Assignee: Timothy Farkas
>            Priority: Major
>             Fix For: 1.14.0
>
>
> When running on our Jenkins server, the MongoDB tests hang because the config 
> servers take up to 5 seconds to process each request (see *Error 1*). As a 
> result, the tests never finish within a reasonable span of time. According to 
> reports online, people run into this issue when mixing versions of MongoDB, 
> but that is not happening in our tests. A possible cause is *Error 2*, which 
> seems to indicate that the MongoDB config servers are not completely 
> initialized, since the config servers should have a lockping document when 
> starting up.
> *Error 1*
> {code}
> [mongod output] 2018-05-01T23:38:47.468-0700 I COMMAND  
> [replSetDistLockPinger] command config.lockpings command: findAndModify { 
> findAndModify: "lockpings", query: { _id: "ConfigServer" }, update: { $set: { 
> ping: new Date(1525243123413) } }, upsert: true, writeConcern: { w: 
> "majority", wtimeout: 15000 } } planSummary: IDHACK update: { $set: { ping: 
> new Date(1525243123413) } } keysExamined:0 docsExamined:0 nMatched:0 
> nModified:0 upsert:1 keysInserted:2 numYields:0 reslen:198 locks:{ Global: { 
> acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, 
> Collection: { acquireCount: { w: 1 } }, Metadata: { acquireCount: { w: 1 } }, 
> oplog: { acquireCount: { w: 1 } } } protocol:op_query 4055ms
> [mongod output] 2018-05-01T23:38:47.469-0700 W SHARDING 
> [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused 
> by :: LockStateChangeFailed: findAndModify query predicate didn't match any 
> lock document
> [mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] lock 
> 'balancer' successfully forced
> [mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] 
> distributed lock 'balancer' acquired, ts : 5ae95cd5d1023488104e6282
> [mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] CSRS 
> balancer thread is recovering
> [mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] CSRS 
> balancer thread is recovered
> [mongod output] 2018-05-01T23:38:48.056-0700 I NETWORK  [thread2] connection 
> accepted from 127.0.0.1:50244 #10 (7 connections now open)
> {code}
> *Error 2*
> {code}
> [mongod output] 2018-05-01T23:39:37.690-0700 I COMMAND  [conn7] command 
> config.settings command: find { find: "settings", filter: { _id: "chunksize" 
> }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp 
> 1525243172000|1, t: 1 } }, limit: 1, maxTimeMS: 30000 } planSummary: EOF 
> keysExamined:0 docsExamined:0 cursorExhausted:1 numYields:0 nreturned:0 
> reslen:354 locks:{ Global: { acquireCount: { r: 2 } }, Database: { 
> acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 
> protocol:op_command 4988ms
> {code}



