Re: Pre-split table using shell

2012-06-13 Thread Simon Kelly
Thanks Mike, that's pretty much the same reaction I had before. We should be getting another 8 GB shortly, but that's the limit for those servers, and while that's still not a lot, I think we'll manage for now. Unfortunately I'm not the decision maker when it comes to these things so I'm just doing my

Re: Pre-split table using shell

2012-06-12 Thread Michael Segel
"Inferred sigh of despair"? Was it that obvious? :-) I'm not sure what hardware you're running on, so it's hard to say. Here's the problem... On each DN, you're running a DN and a RS. Assuming that you're not going to run a TT or do any M/R to push/pull data in and out of HBase. You don't have

Re: Pre-split table using shell

2012-06-12 Thread Simon Kelly
Using the API to create the splits worked. The data is now evenly spread across all the regions. However, every time I tried to create a table, the HBase master crashed. I used the class listed here http://pastebin.com/i1yFVEwj as follows: ./hbase CreateTable The table gets created but HBase maste
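As an illustration of what evenly spaced split points look like, here is a minimal Python sketch. It is not the class from the pastebin link above. It assumes the row keys are raw fixed-length byte strings (16 bytes for a binary UUID), and the function name even_split_keys is hypothetical.

```python
from typing import List


def even_split_keys(num_regions: int, key_len: int = 16) -> List[bytes]:
    """Return num_regions - 1 split keys that cut the key space into equal ranges.

    Assumption: row keys are raw, fixed-length byte strings that are
    compared as unsigned bytes, most significant byte first.
    """
    if num_regions < 2:
        return []  # a single region needs no split keys
    space = 256 ** key_len  # number of distinct keys of this length
    return [
        (space * i // num_regions).to_bytes(key_len, "big")
        for i in range(1, num_regions)
    ]


# Four regions over a one-byte key space give boundaries at 0x40, 0x80, 0xc0.
one_byte_splits = even_split_keys(4, key_len=1)

# Ten regions over 16-byte keys give nine boundaries, already in sorted order.
uuid_splits = even_split_keys(10)
```

The resulting byte arrays would then be handed to whatever table-creation call is in use. If the keys are stored as hex text rather than raw bytes, the boundaries would have to be generated as text instead.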

Re: Pre-split table using shell

2012-06-12 Thread Simon Kelly
No, this isn't on EC2 and yes, it's (supposed to be) production. Please elaborate on your inferred sigh of despair. On 12 June 2012 15:48, Michael Segel wrote: > Ok... > > Please tell me that this isn't a production system. > > Is this on EC2? > > On Jun 12, 2012, at 6:55 AM, Simon Kelly wro

Re: Pre-split table using shell

2012-06-12 Thread Michael Segel
Ok... Please tell me that this isn't a production system. Is this on EC2? On Jun 12, 2012, at 6:55 AM, Simon Kelly wrote: > Thanks Michael > > I'm 100% sure it's not the UUID distribution that's causing the problem. I'm > going to try using the API to create the table and see if that changes thi

Re: Pre-split table using shell

2012-06-12 Thread Simon Kelly
Thanks Michael. I'm 100% sure it's not the UUID distribution that's causing the problem. I'm going to try using the API to create the table and see if that changes things. The reason I want to pre-split the table is that HBase doesn't handle the initial load to a single regionserver and I can't start

Re: Pre-split table using shell

2012-06-12 Thread Michael Segel
Ok, now that I'm awake, and am drinking my first cup of joe... If you just generate UUIDs you are not going to have an even distribution. Nor are they going to be truly random due to how the machines are generating their random numbers. But this is not important in solving your problem Th

Re: Pre-split table using shell

2012-06-12 Thread Oliver Meyn (GBIF)
Hi Simon, I might be wrong but I'm pretty sure the splits file you specify is assumed to be full of strings. So even though they look like bytes they're being interpreted as the string value (like '\x00') instead of the actual byte \x00. The only way I could get the byte representation of int
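The difference between the text '\x00' and the byte \x00 can be made concrete with a small Python sketch. It only illustrates the string-versus-byte distinction described above and says nothing about how the shell actually parses a splits file; the helper names escaped_text and raw_byte are hypothetical.

```python
def escaped_text(n: int) -> bytes:
    """Bytes of the literal text \\xNN: a backslash, 'x', and two hex digits."""
    return "\\x{:02x}".format(n).encode("ascii")


def raw_byte(n: int) -> bytes:
    """The single byte whose value is n."""
    return bytes([n])


values = (0x40, 0x80, 0xC0)

# What you get if each line of the file is taken as a plain string.
text_splits = [escaped_text(n) for n in values]

# What was presumably intended: one real byte per split point.
byte_splits = [raw_byte(n) for n in values]

# Every text-style split key is 4 bytes long and starts with the
# backslash byte 0x5c, so all of them sort into one narrow band.
first_bytes = {key[0] for key in text_splits}
```

If the split keys were read as text in this way, every boundary would fall between 0x5c 0x78 ... values, and most binary row keys would land in the first or last region.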

Re: Pre-split table using shell

2012-06-12 Thread Simon Kelly
Yes, I'm aware that UUIDs are designed to be unique and not evenly distributed, but I wouldn't expect a big gap in their distribution either. The other thing that is really confusing me is that the region splits aren't lexicographically sorted. Perhaps there is a problem with the way I'm specifying
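One way to check a key distribution against a set of split points offline is to count how many sample keys fall into each range. The sketch below is a rough check under stated assumptions: keys are raw bytes compared lexicographically as unsigned bytes, the sample is made of random version-4 UUIDs from Python's uuid module (which may not match how the real keys are generated), and region_index and region_histogram are hypothetical names.

```python
import bisect
import uuid
from collections import Counter
from typing import Iterable, List


def region_index(row_key: bytes, sorted_splits: List[bytes]) -> int:
    """Index of the range holding row_key; a key equal to a split starts the next range."""
    return bisect.bisect_right(sorted_splits, row_key)


def region_histogram(row_keys: Iterable[bytes], split_keys: List[bytes]) -> List[int]:
    """Count keys per range, for len(split_keys) + 1 ranges."""
    ordered = sorted(split_keys)  # bytes sort lexicographically, unsigned
    counts = Counter(region_index(key, ordered) for key in row_keys)
    return [counts.get(i, 0) for i in range(len(ordered) + 1)]


splits = [b"\x40", b"\x80", b"\xc0"]

# Sample of random UUIDs as raw 16-byte keys (an assumption about the key format).
sample = [uuid.uuid4().bytes for _ in range(10000)]
histogram = region_histogram(sample, splits)

# Small fixed example: two keys land in the second range, none in the third.
fixed = region_histogram([b"\x00", b"\x40", b"\x7f", b"\xff"], splits)
```

A heavily lopsided histogram for correctly sorted byte splits would point at the keys themselves; a lopsided histogram that disappears once the splits are re-sorted would point at how the split points were specified.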

Re: Pre-split table using shell

2012-06-12 Thread Michael Segel
UUIDs are unique but not necessarily random and even in random samplings, you may not see an even distribution except over time. Sent from my iPhone On Jun 12, 2012, at 3:18 AM, "Simon Kelly" wrote: > Hi > > I'm getting some unexpected results with a pre-split table where some of > the regio