I made a simple lab environment with Vagrant that demonstrates why we should not use {fsync: True} as our write concern in our queries. The attached tarball contains a directory with a Vagrantfile that configures a simple MongoDB replica set, along with three simple tests that you can run against the set.
The tests themselves are not automated, though it is pretty simple to drive them yourself. The Vagrantfile assumes that your box is already configured for Vagrant + NFS. If it is not, you can use rsync by replacing this line:

    config.vm.synced_folder ".", "/home/vagrant/devel", type: "nfs", nfs_version: 4, nfs_udp: false

with:

    config.vm.synced_folder ".", "/home/vagrant/devel", type: "rsync"

Once you are ready to begin, simply run vagrant up:

    $ time vagrant up
    <snip>
    real    2m14.050s
    user    0m1.898s
    sys     0m0.569s

You now have a VM on your computer with three mongods on ports 2701{7,8,9}. You can ssh to your VM with vagrant ssh:

    $ vagrant ssh

There are some scripts in /home/vagrant/devel. You can configure your hosts to be a replica set with a default write concern of {w: 3} using the replica_set.js file in there:

    [vagrant@replicationtest ~]$ mongo --port 27017 devel/replica_set.js
    MongoDB shell version: 2.4.13
    connecting to: 127.0.0.1:27017/test
    {
        "_id" : "rs0",
        "version" : 1,
        "members" : [
            { "_id" : 0, "host" : "localhost:27017" },
            { "_id" : 1, "host" : "localhost:27018" },
            { "_id" : 2, "host" : "localhost:27019" }
        ],
        "settings" : {
            "getLastErrorDefaults" : { "w" : 3 }
        }
    }

There is a connection.py that all three test scripts use; it defines a simple model and establishes the connection. The connection is established with the NEAREST read_preference, which allows mongoengine to auto-discover the current master (if you are unlucky and connect to a secondary, reads will cause an error without this setting). There are other options for read_preference you can read about[0]. If 27017 happens to be down, it'll try 27018 instead. I've made little while loops in all the tests so that they will keep trying to make their writes until they are connected to a current primary. Sometimes no autoreconnects happen, but the loop is there so you don't get any scary sounding errors if you happen to be connected to a secondary. Spend a little time becoming familiar with each test before we start (there is a rough sketch of the shared connection and retry code below, after test 0).

Go ahead and run the first test, which only writes a single entry that will get replicated to all three nodes:

    [vagrant@replicationtest ~]$ time python devel/test_0.py
    Successfully written in 1 attempts.

    real    0m4.986s
    user    0m0.066s
    sys     0m0.021s

Feel free to connect to one of the mongods to inspect the replication. If you happen to connect to the primary, everything will work as you are used to. If you connect to a secondary, it will not allow you to perform even read queries:

    rs0:SECONDARY> show collections
    Thu May 21 23:27:55.849 error: { "$err" : "not master and slaveOk=false", "code" : 13435 } at src/mongo/shell/query.js:128

You can tell mongo that reads are OK on a secondary like this:

    rs0:SECONDARY> rs.slaveOk()
    rs0:SECONDARY> db.test_document.find()
    { "_id" : ObjectId("555ea1ea99038b0b7d8c0b22"), "test_field" : "test 0" }

By the way, you can also use rs.status() to see the current state of replication when connected to any of the nodes:

    rs0:SECONDARY> rs.status()
    <snip>

Note that our test 0 object has indeed made it to this secondary replica. This is all normal so far.
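For reference, here is a minimal sketch of roughly what connection.py and the retry loop in the test scripts look like. This is a reconstruction, not the verbatim contents of the tarball - the model name, field name, and exact connect() arguments are my assumptions, so trust the attached files over this:

    # connection.py (sketch, assumed names) - NEAREST lets the driver talk
    # to whichever member it can reach and discover the current primary.
    import mongoengine
    from pymongo.read_preferences import ReadPreference

    mongoengine.connect(
        'test',
        host='localhost:27017,localhost:27018,localhost:27019',
        replicaSet='rs0',
        read_preference=ReadPreference.NEAREST)


    class TestDocument(mongoengine.Document):
        test_field = mongoengine.StringField()

    # test_0.py (sketch) - keep retrying the write until we are connected
    # to the current primary, counting attempts like the tests report.
    from pymongo.errors import AutoReconnect

    from connection import TestDocument

    attempts = 0
    while True:
        attempts += 1
        try:
            TestDocument(test_field='test 0').save()
            break
        except AutoReconnect:
            # We were talking to a secondary or a mid-election node; the
            # driver should find the new primary on a later attempt.
            pass
    print('Successfully written in %d attempts.' % attempts)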
Next, let's kill one of the mongods. You can do it with a regular old sigterm if you like. I decided to kill the oldest, but it doesn't really matter which you choose:

    [vagrant@replicationtest ~]$ sudo pkill -o mongod

OK, now we have a degraded replica set, as we've gone from three nodes down to two (feel free to inspect with rs.status() on one of the remaining nodes):

    [vagrant@replicationtest ~]$ pgrep mongod | wc -l
    2

However, we have also configured our replica set to have a default write concern of {w: 3}. This means that by default, our calls to getLastError() will not return success unless our writes have propagated to and been confirmed by three nodes. In this state, let's try to run test_1.py, which will simply try to insert one more document into the database without specifying any write concerns:

    [vagrant@replicationtest ~]$ time python devel/test_1.py
    Successfully written in 1 attempts.

    real    1m30.918s
    user    0m0.159s
    sys     0m0.053s

That code ran for 1.5 minutes, and it only exited because in another shell I started the missing mongod back up like this:

    [vagrant@replicationtest ~]$ sudo mongod --fork --nojournal --syslog --port 27017 --dbpath /home/vagrant/rs0-0 --replSet rs0

If you wish to try that too, make sure you start the one that was missing (so if you didn't kill the one on port 27017, adjust the above port and dbpath to match what the missing one would have been; you can see all of the mongod args in the Vagrantfile). Note that even though it took 1.5 minutes, it wasn't due to AutoReconnects - it was the save() call hanging until the write made it to all three nodes. Now is a good time to play around in the mongo shell to prove to yourself that the replica set is healthy, and that all the nodes have the expected writes.

So far all we've done is prove that we can write to the DB when it's up and that writes hang when a node is down, which is what we expect given how we configured the replica set. Now let's perform the last test. We need to get back to two running mongods again, so if you started your missing one back up like I did, kill one again:

    [vagrant@replicationtest ~]$ sudo pkill -o mongod

You should now have two mongods running again, meaning your replica set is degraded (again, rs.status() will give you more detail):

    [vagrant@replicationtest ~]$ pgrep mongod | wc -l
    2

Now, have a look at test_2.py and note that it is overriding the write concern in the way that has been suggested in pull request #1864[1]:

    [vagrant@replicationtest ~]$ grep write_concern < devel/test_2.py
            t.save(write_concern={'fsync': True})

Note that we've only supplied fsync, and have not provided the w option. Let's run test_2.py:

    [vagrant@replicationtest ~]$ time python devel/test_2.py
    Successfully written in 1 attempts.

    real    0m0.186s
    user    0m0.055s
    sys     0m0.017s

Note that the write went through and returned before hitting all the nodes, even though the replica set had been configured for a safer write concern. Go ahead and check out the replica set again in the mongo shell, and have a look at rs.status() again.
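This is exactly the danger the lab is meant to demonstrate: the write concern options passed to save() replace the replica set's getLastErrorDefaults entirely, so an override that only says fsync silently falls back to w: 1 (acknowledgement from the primary alone), as the timing above shows. If fsync is genuinely wanted, a variant along these lines - a sketch, not code from the tarball or the pull request - would restate w so the three-node acknowledgement is kept:

    # Hypothetical alternative to test_2.py's save() call: restating w
    # alongside fsync means getLastError() still waits for all three
    # nodes to confirm the write before reporting success.
    t.save(write_concern={'w': 3, 'fsync': True})

Of course, hardcoding w: 3 in application code has its own problem (it has to track the cluster size), which is one more argument for leaving the configured default alone.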
One cool thing is that if you start the missing node back up, it should automatically figure out who's master, rejoin the replica set, and get that last write. In my case, the one on port 27018 was the dead one, so I'll start it back up:

    [vagrant@replicationtest ~]$ sudo mongod --fork --nojournal --syslog --port 27018 --dbpath /home/vagrant/rs0-1 --replSet rs0

To end, I'll connect to the one that had been dead and demonstrate that everything is back to normal:

    [vagrant@replicationtest ~]$ mongo --port 27018
    MongoDB shell version: 2.4.13
    connecting to: 127.0.0.1:27018/test
    rs0:SECONDARY> rs.slaveOk()
    rs0:SECONDARY> rs.status()
    {
        "set" : "rs0",
        "date" : ISODate("2015-05-22T03:47:04Z"),
        "myState" : 2,
        "syncingTo" : "localhost:27017",
        "members" : [
            {
                "_id" : 0,
                "name" : "localhost:27017",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 38,
                "optime" : Timestamp(1432266200, 1),
                "optimeDate" : ISODate("2015-05-22T03:43:20Z"),
                "lastHeartbeat" : ISODate("2015-05-22T03:47:04Z"),
                "lastHeartbeatRecv" : ISODate("2015-05-22T03:47:03Z"),
                "pingMs" : 0,
                "syncingTo" : "localhost:27019"
            },
            {
                "_id" : 1,
                "name" : "localhost:27018",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 38,
                "optime" : Timestamp(1432266200, 1),
                "optimeDate" : ISODate("2015-05-22T03:43:20Z"),
                "errmsg" : "syncing to: localhost:27017",
                "self" : true
            },
            {
                "_id" : 2,
                "name" : "localhost:27019",
                "health" : 1,
                "state" : 1,
                "stateStr" : "PRIMARY",
                "uptime" : 38,
                "optime" : Timestamp(1432266200, 1),
                "optimeDate" : ISODate("2015-05-22T03:43:20Z"),
                "lastHeartbeat" : ISODate("2015-05-22T03:47:04Z"),
                "lastHeartbeatRecv" : ISODate("2015-05-22T03:47:03Z"),
                "pingMs" : 0
            }
        ],
        "ok" : 1
    }
    rs0:SECONDARY> db.test_document.find()
    { "_id" : ObjectId("555ea1ea99038b0b7d8c0b22"), "test_field" : "test 0" }
    { "_id" : ObjectId("555ea46799038b149e1e4d7a"), "test_field" : "test 1" }
    { "_id" : ObjectId("555ea5d899038b19d92f5449"), "test_field" : "test 2" }

[0] http://api.mongodb.org/python/current/api/pymongo/read_preferences.html#pymongo.read_preferences.ReadPreference
[1] https://github.com/pulp/pulp/pull/1864

--
Randy Barlow
Attachment: write_concern_test.tar.bz2 (application/bzip)