Re: Why no virtual nodes for Cassandra on EC2?

Eric Stevens Mon, 23 Feb 2015 12:03:13 -0800

That link is the one from the 4.6 New Features page:
http://www.datastax.com/documentation/datastax_enterprise/4.6/datastax_enterprise/newFeatures.html


   - Ability to use virtual nodes (vnodes)
     <http://www.datastax.com/documentation/datastax_enterprise/4.6/datastax_enterprise/ana/anaNdeOps.html#anaNdeOps__implicationsVnodes>
     in Solr nodes. Recommended range: 64 to 256 (overhead increases by
     approximately 30%)

Anyway, thanks for clearing this up, Jack.  This overhead is on queries
only, right?



On Mon, Feb 23, 2015 at 10:03 AM, Jack Krupansky <jack.krupan...@gmail.com>
wrote:

> Thanks for pointing out a mistake in the doc - that statement (for
> Search/Solr) was simply a leftover from before 4.6. Besides, it's in the
> Analytics section, which is not relevant for Search/Solr anyway.
>
> -- Jack Krupansky
>
> On Mon, Feb 23, 2015 at 11:54 AM, Eric Stevens <migh...@gmail.com> wrote:
>
>> 30% overhead is pretty brutal.  I think this is basic support for it, and
>> not necessarily a recommendation to use it.
>>
>> From
>>
>> http://www.datastax.com/documentation/datastax_enterprise/4.6/datastax_enterprise/ana/anaNdeOps.html?scroll=anaNdeOps__implicationsVnodes
>>
>> *DataStax does not recommend turning on vnodes *for other Hadoop use
>> cases *or for Solr nodes*, but you can use vnodes for any Cassandra-only
>> cluster, or a Cassandra-only data center in a mixed Hadoop/Solr/Cassandra
>> deployment. If you have enabled virtual nodes on Hadoop nodes, disable
>> virtual nodes before using the cluster.
>>
>>
>> On Mon, Feb 23, 2015 at 9:34 AM, Jack Krupansky <jack.krupan...@gmail.com>
>> wrote:
>>
>>> DSE 4.6 improved Solr vnode performance dramatically, so vnodes for
>>> Search workloads are no longer officially discouraged. As per the
>>> official doc for improvements: "*Ability to use virtual nodes
>>> (vnodes) in Solr nodes. Recommended range: 64 to 256 (overhead increases by
>>> approximately 30%)*". A vnode token count of 64 or 32 would reduce that
>>> overhead further. And... the new 4.6 feature of being able to direct a Solr
>>> query to a specific partition essentially eliminates that overhead entirely.
>>>
>>> -- Jack Krupansky
>>>
>>> On Mon, Feb 23, 2015 at 11:23 AM, Eric Stevens <migh...@gmail.com>
>>> wrote:
>>>
>>>> Vnodes are officially not recommended for DSE Solr integration (though a
>>>> small number isn't ruinous). That might be why they still don't enable them
>>>> by default.
>>>> On Feb 21, 2015 3:58 PM, "mck" <m...@apache.org> wrote:
>>>>
>>>>> At least the problem of Hadoop and vnodes described in CASSANDRA-6091
>>>>> doesn't apply to Spark.
>>>>> (Spark already allows multiple token ranges per split.)
>>>>>
>>>>> If this is the reason why DSE hasn't enabled vnodes then fingers crossed
>>>>> that'll change soon.
>>>>>
>>>>>
>>>>> > Some of the DataStax videos that I watched discussed how the
>>>>> > Cassandra Spark connector has optimizations to deal with vnodes.
>>>>>
>>>>>
>>>>> Are these videos public? If so, do you have a link to them?
>>>>>
>>>>> ~mck
>>>>>
>>>>
>>>
>>
>
