On 19.01.2012, at 20:15, Narendra Sharma wrote:

> I believe you need to move the nodes on the ring. What was the load on the
> nodes before you added 5 new nodes? It's just that you are getting more data
> in certain token ranges than in others.

With three nodes, the cluster was also imbalanced.
What I don't understand is why the MD5 sums would generate such massive hot
spots. Most of our keys look like this:

    00013270494972450001234567

with the first 16 digits being a timestamp of one of our application servers'
startup times, and the last 10 digits being generated sequentially per user.
There may be a lot of keys that start with e.g. "0001327049497245" (or some
other timestamp). But I was under the impression that MD5 doesn't care about
that and generates a uniform distribution? Then again, I know next to nothing
about MD5. Maybe someone else has better insight into the algorithm?

However, we also use CFs with a date ("yyyymmdd") as key, as well as CFs with
UUIDs as keys. And those CFs in themselves are not balanced either: e.g. node
5 has 12 GB live space used in the CF with the UUID as key, while node 8 has
only 428 MB.

Cheers,
Marcel

> On Thu, Jan 19, 2012 at 3:22 AM, Marcel Steinbach
> <marcel.steinb...@chors.de> wrote:
>
> On 18.01.2012, at 02:19, Maki Watanabe wrote:
>> Are there any significant differences in the number of sstables on each
>> node?
> No, no significant difference there. Actually, node 8 is among those with
> more sstables but with the least load (20 GB).
>
> On 17.01.2012, at 20:14, Jeremiah Jordan wrote:
>> Are you deleting data or using TTLs? Expired/deleted data won't go away
>> until the sstable holding it is compacted. So if compaction has happened
>> on some nodes, but not on others, you will see this. The disparity is
>> pretty big, 400 GB to 20 GB, so this probably isn't the issue, but with
>> our data using TTLs, if I run major compactions a couple of times on that
>> column family it can shrink ~30-40%.
> Yes, we do delete data. But I agree, the disparity is too big to blame only
> the deletions.
>
> Also, we initially started out with 3 nodes and upgraded to 8 a few weeks
> ago. After adding the nodes, we ran compactions and cleanups and still
> didn't have a balanced cluster. So that should have removed outdated data,
> right?
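[Not part of the original thread: a quick sketch one could run to test the
uniformity question above. It hashes 100,000 keys that all share the 16-digit
"timestamp" prefix from the example and buckets the resulting tokens into 8
equal ranges, approximating how the RandomPartitioner derives tokens from
MD5; the prefix and key format are taken from the thread, everything else is
illustrative.]

```python
import hashlib

RING = 2 ** 127   # RandomPartitioner token space is roughly [0, 2**127)
NODES = 8

counts = [0] * NODES
prefix = "0001327049497245"            # shared "timestamp" prefix from the thread
for seq in range(100_000):
    key = prefix + str(seq).zfill(10)  # sequential 10-digit per-user suffix
    # Sketch of token derivation: interpret the MD5 digest as an integer,
    # folded into the ring (the real partitioner differs in minor details).
    token = int.from_bytes(hashlib.md5(key.encode()).digest(), "big") % RING
    counts[token * NODES // RING] += 1  # which of 8 equal ranges owns this key

print(counts)  # each bucket should hold roughly 12,500 keys
```

If MD5 hot-spotted on shared prefixes, some buckets would dominate; in
practice the counts come out close to uniform, supporting the intuition that
the key format itself is not the cause of the imbalance.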
>> 2012/1/18 Marcel Steinbach <marcel.steinb...@chors.de>:
>>> We are running regular repairs, so I don't think that's the problem.
>>> And the data dir sizes match approx. the load from nodetool.
>>> Thanks for the advice, though.
>>>
>>> Our keys are digits only, and all contain a few zeros at the same
>>> offsets. I'm not that familiar with the MD5 algorithm, but I doubt that
>>> it would generate 'hotspots' for those kinds of keys, right?
>>>
>>> On 17.01.2012, at 17:34, Mohit Anchlia wrote:
>>>
>>> Have you tried running repair first on each node? Also, verify using
>>> df -h on the data dirs.
>>>
>>> On Tue, Jan 17, 2012 at 7:34 AM, Marcel Steinbach
>>> <marcel.steinb...@chors.de> wrote:
>>>
>>> Hi,
>>>
>>> we're using the RandomPartitioner and have assigned each node the same
>>> amount of the token space. The cluster looks like this:
>>>
>>> Address  Status  State   Load       Owns    Token
>>>                                             205648943402372032879374446248852460236
>>> 1        Up      Normal  310.83 GB  12.50%  56775407874461455114148055497453867724
>>> 2        Up      Normal  470.24 GB  12.50%  78043055807020109080608968461939380940
>>> 3        Up      Normal  271.57 GB  12.50%  99310703739578763047069881426424894156
>>> 4        Up      Normal  282.61 GB  12.50%  120578351672137417013530794390910407372
>>> 5        Up      Normal  248.76 GB  12.50%  141845999604696070979991707355395920588
>>> 6        Up      Normal  164.12 GB  12.50%  163113647537254724946452620319881433804
>>> 7        Up      Normal  76.23 GB   12.50%  184381295469813378912913533284366947020
>>> 8        Up      Normal  19.79 GB   12.50%  205648943402372032879374446248852460236
>>>
>>> I was under the impression that the RandomPartitioner would distribute
>>> the load more evenly. Our row sizes are 0.5-1 KB, so we don't store huge
>>> rows on a single node. Should we just move the nodes so that the load is
>>> distributed more evenly, or is there something off that needs to be
>>> fixed first?
>>>
>>> Thanks
>>>
>>> Marcel
>>>
>>> chors GmbH
>>> specialists in digital and direct marketing solutions
>>> Haid-und-Neu-Straße 7
>>> 76131 Karlsruhe, Germany
>>> www.chors.com
>>> Managing Directors: Dr. Volker Hatz, Markus Plattner
>>> Amtsgericht Montabaur, HRB 15029
>>
>> --
>> w3m

> --
> Narendra Sharma
> Software Engineer
> http://www.aeris.com
> http://narendrasharma.blogspot.com/
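[Not part of the original thread: the token assignment quoted in the ring
listing can be sanity-checked directly. With the RandomPartitioner, a
perfectly balanced N-node ring has adjacent tokens exactly 2**127 // N
apart; the tokens below are copied verbatim from the listing.]

```python
# Tokens from the nodetool ring output, nodes 1 through 8.
tokens = [
    56775407874461455114148055497453867724,
    78043055807020109080608968461939380940,
    99310703739578763047069881426424894156,
    120578351672137417013530794390910407372,
    141845999604696070979991707355395920588,
    163113647537254724946452620319881433804,
    184381295469813378912913533284366947020,
    205648943402372032879374446248852460236,
]
step = 2 ** 127 // 8                # ideal spacing for 8 equal ranges
gaps = [b - a for a, b in zip(tokens, tokens[1:])]
print(all(g == step for g in gaps))  # True: the ring itself is evenly spaced
```

Since every gap equals the ideal step, the token assignment is not the
problem; the 19 GB vs. 470 GB disparity must come from how the keys map into
those equal ranges, not from the ranges themselves.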