On Apr 11, 2016, at 3:35 PM, Fred Dushin <fdus...@basho.com> wrote:
Hi Jim,

Interesting problem. That error is occurring here:

https://github.com/basho/yokozuna/blob/2.1.2/src/yz_cover.erl#L275

because length(Mapping) and length(UniqNodes) are unequal:

https://github.com/basho/yokozuna/blob/2.1.2/src/yz_cover.erl#L262

This might be because you are getting timeouts trying to query the port on remote nodes:

https://github.com/basho/yokozuna/blob/2.1.2/src/yz_solr.erl#L324

As you can see, there is a hard-wired 1-second timeout on that RPC call, which could account for why you are seeing this failure during a load run. You might try to rebuild a version of this module with an increased timeout, to see if that gets you over the hump, or consider making the timeout configurable.

Riak 2.1.3 ships with yokozuna 2.1.2, whose Git SHA is 3520d11ec21ee08b7c18478fbbe1b61d7e3d8e0f, so you'd want to branch off that point of the tree, if you care to experiment.

If you rebuild the module, you can place the generated beam file in the lib/basho-patches directory of each of your Riak installs, and restart Riak (or manually reload the module on each node via the Riak console, if you need to keep your Riak nodes up and running).

Let us know what you find or if you need more assistance.

-Fred

On Apr 11, 2016, at 4:11 PM, Jim Raney <jim.ra...@physiq.com> wrote:

Failed to determine Solr port for all nodes in search plan

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Fred,

Thanks for the quick response. After you basically verified that it was a Solr timeout issue, I rebuilt the cluster with 14 nodes to see what would happen. The amount of time it took for the queries to start failing (and for the associated log entries to appear) basically doubled as well.
I -could- try increasing the hard-coded timeout, but I don't think that's the route we want to go, as it is likely this system will have that much data or more being pushed into it, and long query times won't work. I imagine there is probably some Solr tuning we can do. Any ideas on what we could look at that we could pass through the Riak config? I'm going to try an Oracle 1.8 JDK with it later and see if any GC tuning helps, in case there are long GC pauses.

--
Jim Raney
jim.ra...@physiq.com
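
On passing Solr tuning through the Riak config: a minimal sketch, assuming the search.solr.jvm_options setting in riak.conf is what hands JVM flags to the Solr process on each node. The heap sizes and the log path below are placeholder values, not recommendations. The GC logging flags are standard HotSpot options on a 1.8 JDK; they would show whether long GC pauses line up with the failed queries before any tuning is attempted.

```
# riak.conf (illustrative values only; heap sizes and log path are assumptions)
search.solr.jvm_options = -d64 -Xms2g -Xmx2g -XX:+UseCompressedOops -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/riak/solr_gc.log
```

Each node has its own riak.conf, so the change would need to be made on every node, and a node should need a restart before its Solr JVM picks up the new flags.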