On Apr 11, 2016, at 3:35 PM, Fred Dushin <fdus...@basho.com> wrote:

Hi Jim,

Interesting problem.

That error is occurring here:

https://github.com/basho/yokozuna/blob/2.1.2/src/yz_cover.erl#L275

because length(Mapping) and length(UniqNodes) are unequal:

https://github.com/basho/yokozuna/blob/2.1.2/src/yz_cover.erl#L262

This might be because you are getting timeouts when querying the Solr port on 
remote nodes:

https://github.com/basho/yokozuna/blob/2.1.2/src/yz_solr.erl#L324

As you can see, there is a hard-wired 1-second timeout on that RPC call, which 
could explain why you are seeing this failure partway into a load run.
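
To make the failure mode concrete, here is a toy Python model of it. It is not yokozuna's Erlang code: the function names, node names, latencies, and port value are all hypothetical. The only assumption carried over is that a node whose reply misses the timeout is left out of the mapping, so the mapping ends up shorter than the list of unique nodes.

```python
# Toy model only: this is NOT yokozuna's code, and every name here is hypothetical.

def collect_solr_ports(reply_latency_s, solr_ports, timeout_s=1.0):
    """Return {node: port} for the nodes that reply within timeout_s.

    Nodes that are too slow are silently left out, which stands in for
    an RPC call that times out.
    """
    return {
        node: solr_ports[node]
        for node, latency in reply_latency_s.items()
        if latency <= timeout_s
    }


def plan_is_complete(nodes, mapping):
    """True only when every distinct node in the plan has a port."""
    return len(mapping) == len(set(nodes))


nodes = ["riak1", "riak2", "riak3"]            # hypothetical node names
solr_ports = {node: 8093 for node in nodes}    # illustrative port value

idle = {"riak1": 0.02, "riak2": 0.03, "riak3": 0.05}
loaded = {"riak1": 0.02, "riak2": 1.40, "riak3": 0.05}  # riak2 is slow under load

idle_mapping = collect_solr_ports(idle, solr_ports)
loaded_mapping = collect_solr_ports(loaded, solr_ports)

print(plan_is_complete(nodes, idle_mapping))    # True
print(plan_is_complete(nodes, loaded_mapping))  # False: riak2 missed the 1 s budget
print(plan_is_complete(nodes, collect_solr_ports(loaded, solr_ports, timeout_s=5.0)))  # True
```

In this model a single slow node is enough to make the whole plan fail, and a larger timeout only hides the slowness without removing it.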

You might try to rebuild a version of this module with an increased timeout, to 
see if that gets you over the hump, or consider making a configurable timeout.

Riak 2.1.3 ships with yokozuna 2.1.2, whose Git SHA is 
3520d11ec21ee08b7c18478fbbe1b61d7e3d8e0f, so you'd want to branch off that 
point of the tree if you care to experiment.

If you rebuild the module, you can place the generated beam file in the 
lib/basho-patches directory of each of your Riak installs and restart Riak (or 
manually reload the module on each node via the Riak console, if you need to 
keep your Riak nodes up and running).

Let us know what you find or if you need more assistance.

-Fred

On Apr 11, 2016, at 4:11 PM, Jim Raney <jim.ra...@physiq.com> wrote:

Failed to determine Solr port for all nodes in search plan

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Fred,

Thanks for the quick response.  After you basically verified that it was a 
Solr timeout issue, I rebuilt the cluster with 14 nodes to see what would 
happen.  The time it took for the queries to start failing (and for the 
associated log entries to appear) basically doubled as well.

I -could- try increasing the hard-coded timeout, but I don't think that's the 
route we want to go: this system will likely have that much data or more 
pushed into it, and long query times won't work.  I imagine there is probably 
some Solr tuning we can do - any ideas on what we could look at that we could 
pass through the Riak config?

I'm going to try an Oracle 1.8 JDK with it later and see if any GC tuning helps 
in case there are long GC pauses.

--
Jim Raney
jim.ra...@physiq.com
