Responses inline. On Nov 30, 2010, at 4:18 PM, Jon Brisbin wrote:
> I'm still struggling with some unexpected results in running tests for the
> Java support I'm writing for Grails and Spring Data.
>
> As an example, I ran the full Gorm TCK test suite against my local Riak
> server (0.13.0) and had 3 failures out of 103. Not bad, though my goal is 0
> failures in 103 tests. :) The weirdness started happening when I ran the
> test that failed during the full test run manually. It passed with no
> errors. So I got a different result when running it manually than what I
> got when running it in a batch.
>
> Another thing that's actually a little concerning is the following two
> calls to Riak's Map/Reduce. I log all the M/R I execute, so on the test
> that failed, I executed that Javascript. Sure enough, no results, which
> means a failed test. In the second run, I added the call to ejsLog() so I
> could see what's going on, and I got a different result:
>
> +-( ~ ):> curl -v -H "Content-Type: application/json" http://localhost:8098/mapred -d @-
> {"inputs":"grails.gorm.tests.TestEntity","query":[{"map":{"language":"javascript","source":"function(v){ejsLog('/tmp/mapred.log', 'map input: '+JSON.stringify(v)); var row = Riak.mapValuesJson(v); row[0].id = v.key; return row; }"}}]}
>> POST /mapred HTTP/1.1
>> User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
>> Host: localhost:8098
>> Accept: */*
>> Content-Type: application/json
>> Content-Length: 234
>>
> < HTTP/1.1 200 OK
> < Server: MochiWeb/1.1 WebMachine/1.7.2 (participate in the frantic)
> < Date: Tue, 30 Nov 2010 20:57:36 GMT
> < Content-Type: application/json
> < Content-Length: 2
> <
> []
>
> +-( ~ ):> curl -v -H "Content-Type: application/json" http://localhost:8098/mapred -d @-
> {"inputs":"grails.gorm.tests.TestEntity","query":[{"map":{"language":"javascript","source":"function(v){ejsLog('/tmp/mapred.log', 'map input: '+JSON.stringify(v)); var row = Riak.mapValuesJson(v); row[0].id = v.key; ejsLog('/tmp/mapred.log', 'map output: '+JSON.stringify(row)); return row; }"}}]}
>> POST /mapred HTTP/1.1
>> User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
>> Host: localhost:8098
>> Accept: */*
>> Content-Type: application/json
>> Content-Length: 297
>>
> < HTTP/1.1 200 OK
> < Server: MochiWeb/1.1 WebMachine/1.7.2 (participate in the frantic)
> < Date: Tue, 30 Nov 2010 20:58:08 GMT
> < Content-Type: application/json
> < Content-Length: 75
> <
> [{"child":"-5040867138647877277","name":"Bob","id":"-7706240328526746461"}]
>
> It's like just by changing the Javascript enough to get a different hash, I
> got a different result.
>
> What I suspect is happening with the test suite is that M/R scripts are run
> against data sets that aren't yet complete, because the tests load several
> entries into the database, then immediately try to query them back out. It
> looks like if that M/R script is run against this incomplete data, I'll
> keep getting the same incorrect result until the M/R script's hash changes.
> This is a complete WAG. All I know is that the code does what it's supposed
> to when Riak returns the results it's supposed to. :) Is there caching I'm
> running into here?

There is a cache between the MapReduce machinery and the Javascript VMs to reduce demand on the VMs. It is a two-level cache keyed on the bucket/key pair being accessed and the hash of the Javascript function name or source being invoked. Changing either the function or writing to the bucket/key pair should cause the cache to eject entries. If you want to disable the cache, you can do so by adding the following line to the riak_kv section of your app.config:

{vnode_cache_entries, 0}

I will dig into the caching code and see if there's an issue with stale entries not getting ejected.

> If you could help me figure out why some tests are prone to this problem
> when others aren't, I'd be very appreciative.
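[Editor's note: for reference, the vnode_cache_entries setting above would sit inside the riak_kv tuple of app.config; only that one entry comes from this thread, and the surrounding entries in this sketch are illustrative.]

```erlang
%% app.config (excerpt) -- a sketch; the storage_backend entry is
%% illustrative, only vnode_cache_entries is from this thread.
{riak_kv, [
    {storage_backend, riak_kv_bitcask_backend},
    %% Disable the per-vnode MapReduce Javascript cache:
    {vnode_cache_entries, 0}
]}.
```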
> It seems to be related to tests that save multiple objects in a loop. We
> had talked about getting close to doing an M1 release of this sometime
> soon. I'm still not 100% comfortable with moving forward on that until I
> can get consistent and clean test runs. Having tests fail randomly doesn't
> instill a tremendous amount of confidence in me. :/
>
> If the problem is that eventual consistency is biting me, how do you work
> around that in a test suite with dozens of tests that hit the server as
> fast as they can?

What are you using for W or DW on your write calls? With W/DW < N the call can return while the data is still being replicated.

--Kevin

> Thanks for all the help so far! :)
>
> Jon Brisbin
> Portal Webmaster
> NPC International, Inc.
>
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
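[Editor's note: to make Kevin's W/DW point concrete — in the strict-quorum model Riak inherits from Dynamo, a read consulting R replicas is only guaranteed to overlap a write acknowledged by W of N replicas when R + W > N. The sketch below shows just that arithmetic; it deliberately ignores Riak's sloppy quorums and hinted handoff, which weaken the guarantee in practice.]

```python
def read_overlaps_write(n, w, r):
    """In a strict quorum of n replicas, a write acked by w nodes and a
    read consulting r nodes share at least one replica iff r + w > n."""
    return r + w > n

# With n=3, quorum writes and reads (w=2, r=2) always overlap, so a test
# that writes then immediately reads will see its own write:
print(read_overlaps_write(3, 2, 2))   # True
# With w=1 the write can be acked before replication finishes, and an
# r=1 read may consult a replica that has not yet received the data:
print(read_overlaps_write(3, 1, 1))   # False
```

In a test suite this usually translates to writing with W (and DW, if durability matters) set to N, or reading with R = N, so each write is fully replicated before the next assertion runs.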