[ https://issues.apache.org/jira/browse/DRILL-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462916#comment-16462916 ]
ASF GitHub Bot commented on DRILL-6380:
---------------------------------------
vrozov commented on a change in pull request #1249: DRILL-6380: Fix sporadic
mongo db hangs.
URL: https://github.com/apache/drill/pull/1249#discussion_r185900919
##########
File path:
contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/MongoTestSuit.java
##########
@@ -94,7 +94,8 @@ private static void setup() throws Exception {
configServers.add(crateConfigServerConfig(CONFIG_SERVER_3_PORT));
// creating replicaSets
- Map<String, List<IMongodConfig>> replicaSets = new HashMap<>();
+ // A TreeMap ensures that the config servers are started first.
+ Map<String, List<IMongodConfig>> replicaSets = new TreeMap<>();
Review comment:
Does it require `TreeMap` or `LinkedHashMap`? What order needs to be
preserved?
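
To make the question concrete, here is a minimal sketch of how the three map types differ in iteration order. The replica set names are hypothetical and are not taken from MongoTestSuit; the point is only that `TreeMap` iterates in sorted key order, `LinkedHashMap` iterates in insertion order, and `HashMap` makes no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapOrderDemo {

  // Inserts three hypothetical replica set names (deliberately not in sorted
  // order) into the given map and returns the keys in iteration order.
  public static List<String> iterationOrder(Map<String, Integer> map) {
    map.put("shard_2_replicas", 2);
    map.put("config_replicas", 0);
    map.put("shard_1_replicas", 1);
    return new ArrayList<>(map.keySet());
  }

  public static void main(String[] args) {
    // HashMap: order depends on hashing and is not guaranteed.
    System.out.println("HashMap:       " + iterationOrder(new HashMap<>()));
    // LinkedHashMap: keys come back in the order they were inserted.
    System.out.println("LinkedHashMap: " + iterationOrder(new LinkedHashMap<>()));
    // TreeMap: keys come back sorted by natural String ordering.
    System.out.println("TreeMap:       " + iterationOrder(new TreeMap<>()));
  }
}
```

If the startup order only happens to match the sorted key order, `TreeMap` couples correctness to the naming of the keys, whereas `LinkedHashMap` would make the order explicit at the insertion site.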
----------------------------------------------------------------
> Mongo db storage plugin tests can hang on jenkins.
> --------------------------------------------------
>
> Key: DRILL-6380
> URL: https://issues.apache.org/jira/browse/DRILL-6380
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Timothy Farkas
> Assignee: Timothy Farkas
> Priority: Major
>
> When running on our Jenkins server, the MongoDB tests hang because the config
> servers take up to 5 seconds to process each request (see *Error 1*). As a
> result, the tests never finish within a reasonable span of time. According to
> reports online, people run into this issue when mixing versions of MongoDB,
> but that is not happening in our tests. A possible cause is *Error 2*, which
> seems to indicate that the MongoDB config servers are not completely
> initialized, since the config servers should have a lockping document when
> starting up.
> *Error 1*
> {code}
> [mongod output] 2018-05-01T23:38:47.468-0700 I COMMAND
> [replSetDistLockPinger] command config.lockpings command: findAndModify {
> findAndModify: "lockpings", query: { _id: "ConfigServer" }, update: { $set: {
> ping: new Date(1525243123413) } }, upsert: true, writeConcern: { w:
> "majority", wtimeout: 15000 } } planSummary: IDHACK update: { $set: { ping:
> new Date(1525243123413) } } keysExamined:0 docsExamined:0 nMatched:0
> nModified:0 upsert:1 keysInserted:2 numYields:0 reslen:198 locks:{ Global: {
> acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } },
> Collection: { acquireCount: { w: 1 } }, Metadata: { acquireCount: { w: 1 } },
> oplog: { acquireCount: { w: 1 } } } protocol:op_query 4055ms
> [mongod output] 2018-05-01T23:38:47.469-0700 W SHARDING
> [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused
> by :: LockStateChangeFailed: findAndModify query predicate didn't match any
> lock document
> [mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] lock
> 'balancer' successfully forced
> [mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer]
> distributed lock 'balancer' acquired, ts : 5ae95cd5d1023488104e6282
> [mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] CSRS
> balancer thread is recovering
> [mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] CSRS
> balancer thread is recovered
> [mongod output] 2018-05-01T23:38:48.056-0700 I NETWORK [thread2] connection
> accepted from 127.0.0.1:50244 #10 (7 connections now open)
> {code}
> *Error 2*
> {code}
> [mongod output] 2018-05-01T23:39:37.690-0700 I COMMAND [conn7] command
> config.settings command: find { find: "settings", filter: { _id: "chunksize"
> }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp
> 1525243172000|1, t: 1 } }, limit: 1, maxTimeMS: 30000 } planSummary: EOF
> keysExamined:0 docsExamined:0 cursorExhausted:1 numYields:0 nreturned:0
> reslen:354 locks:{ Global: { acquireCount: { r: 2 } }, Database: {
> acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } }
> protocol:op_command 4988ms
> {code}