Re: Setting up another machine as secondary node

2009-05-14 Thread Ninad Raut
But if we have two masters in the masters file, *both* the master and secondary node processes get started on the two servers listed. Can't we have the master and secondary node started separately on two machines? On Fri, May 15, 2009 at 9:39 AM, jason hadoop wrote: > I agree with billy. conf/m...

Re: Keeping Compute Nodes separate from the region server node -- pros and cons

2009-05-14 Thread Ninad Raut
Hi Andy, thanks for the tip. I have an EC2 cluster with 6 nodes, each a server-grade large instance. I have the mapred and regionserver daemons running on all the nodes. Our deployment will not go beyond 20 clusters in the near future. Which would you suggest: Scenario 1 or 2, as you mentioned? On T...

Re: Problems when executing many (?) HTable.lockRow()

2009-05-14 Thread Nitay
I like this a lot, Guilherme. Perhaps we should open a JIRA with them so we can track these great ideas. On Thu, May 14, 2009 at 7:05 PM, Ryan Rawson wrote: > Given the non-core nature, I think the API should potentially facilitate > this but the code should be contrib. > > On May 14, 2009 5:32 P...

Re: Setting up another machine as secondary node

2009-05-14 Thread jason hadoop
I agree with Billy: conf/masters is misleading as the place for secondary namenodes. On Thu, May 14, 2009 at 8:38 PM, Billy Pearson wrote: > I think the secondary namenode being set in the masters file in the conf > folder is misleading > > Billy > > > > "Rakhi Khatwani" wrote in message > news:3848...

Re: Setting up another machine as secondary node

2009-05-14 Thread Billy Pearson
I think the secondary namenode being set in the masters file in the conf folder is misleading. Billy "Rakhi Khatwani" wrote in message news:384813770905140603g4d552834gcef2db3028a00...@mail.gmail.com... Hi, I want to set up a cluster of 5 nodes in such a way that node1 - master, node2 - secondar...

Re: Problems when executing many (?) HTable.lockRow()

2009-05-14 Thread Ryan Rawson
Given the non-core nature, I think the API should potentially facilitate this, but the code should be contrib. On May 14, 2009 5:32 PM, "Guilherme Germoglio" wrote: On Thu, May 14, 2009 at 3:40 PM, stack wrote: > No consideration has been made f... I think so. If nothing is to be changed on Row...

Re: Problems when executing many (?) HTable.lockRow()

2009-05-14 Thread Guilherme Germoglio
On Thu, May 14, 2009 at 3:40 PM, stack wrote: > No consideration has been made for changes in how locks are done in the new > 0.20.0 API. Want to propose something, Guilherme? Could new zk-arbitrated > locking be done inside the confines of the RowLock object? I think so. If nothing is to be chan...

Re: Problems when executing many (?) HTable.lockRow()

2009-05-14 Thread stack
On Thu, May 14, 2009 at 1:07 PM, Joey Echeverria wrote: > With a 5-server ZK ensemble and an 80% write ratio, you should be able > to support about 10,000 operations per second [1]. That sounds > reasonable to me for most uses that require locks. If you require > higher performance than that, then...

Re: Problems when executing many (?) HTable.lockRow()

2009-05-14 Thread Ryan Rawson
I stand corrected, thanks for the info! -ryan On Thu, May 14, 2009 at 1:07 PM, Joey Echeverria wrote: > With a 5-server ZK ensemble and an 80% write ratio, you should be able > to support about 10,000 operations per second [1]. That sounds > reasonable to me for most uses that require locks. If...

Re: Problems when executing many (?) HTable.lockRow()

2009-05-14 Thread Joey Echeverria
With a 5-server ZK ensemble and an 80% write ratio, you should be able to support about 10,000 operations per second [1]. That sounds reasonable to me for most uses that require locks. If you require higher performance than that, then locking probably isn't for you. Taking advantage of versioning or...

Re: Problems when executing many (?) HTable.lockRow()

2009-05-14 Thread Ryan Rawson
Ah, I hate to be a cloud on a sunny day, but IIRC ZK isn't designed for a high write load. With thousands of requests a second, one could overwhelm the ZK Paxos-style consensus protocol. Another thing to remember is that HBase doesn't "overwrite" values, it just versions them. Perhaps this can be of h...
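
For illustration, a minimal sketch of leaning on versions rather than locks, written against the 0.20-style client API that comes up later in this thread; the table, row, and column names here are placeholders, not anything from the thread:

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class VersionsNotLocks {
      public static void main(String[] args) throws IOException {
        HTable table = new HTable(new HBaseConfiguration(), "mytable");
        // Each put adds a new cell version; nothing is overwritten in place.
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("qual"), Bytes.toBytes("value"));
        table.put(put);
        // Read back up to three versions of the cell instead of racing for a lock.
        Get get = new Get(Bytes.toBytes("row1"));
        get.setMaxVersions(3);
        Result result = table.get(get);
        System.out.println(result);
      }
    }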

Re: Problems when executing many (?) HTable.lockRow()

2009-05-14 Thread stack
No consideration has been made for changes in how locks are done in the new 0.20.0 API. Want to propose something, Guilherme? Could new zk-arbitrated locking be done inside the confines of the RowLock object? St.Ack On Thu, May 14, 2009 at 9:44 AM, Guilherme Germoglio wrote: > This way, HTable coul...
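
For concreteness, a minimal sketch of the explicit-lock path that any zk-arbitrated scheme would sit behind, assuming the 0.20-style HTable/RowLock API under discussion; table, row, and column names are placeholders:

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.RowLock;
    import org.apache.hadoop.hbase.util.Bytes;

    public class LockedUpdate {
      public static void main(String[] args) throws IOException {
        HTable table = new HTable(new HBaseConfiguration(), "mytable");
        // Take an explicit row lock, update under it, then release it.
        RowLock lock = table.lockRow(Bytes.toBytes("row1"));
        try {
          Put put = new Put(Bytes.toBytes("row1"), lock);
          put.add(Bytes.toBytes("cf"), Bytes.toBytes("qual"), Bytes.toBytes("value"));
          table.put(put);
        } finally {
          // Always release; otherwise the region server holds the lock until it times out.
          table.unlockRow(lock);
        }
      }
    }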

Re: Hbase 0.19.2 - Large import results in heavily unbalanced hadoop DFS

2009-05-14 Thread Andrew Purtell
Hi Alexandra, Yes, an HBase 0.19 release is compatible with Hadoop 0.19 with no other guarantees, an HBase 0.20 release is compatible with Hadoop 0.20, and so on. It is hard to say, without actually trying a recompile and deployment, whether HBase 0.19.2 would have any trouble on Hadoop trunk. I wo...

Re: Administration tool for HBase

2009-05-14 Thread Andrew Purtell
NEVER KILL -9 A REGION SERVER!!! - Andy From: Ninad Raut To: hbase-user@hadoop.apache.org Cc: Ranjit Nair Sent: Thursday, May 14, 2009 3:03:06 AM Subject: Re: Administration tool for HBase Ryan, Using bin/hbase-daemon.sh start regionserver and bin/hbase-d...

Re: Keeping Compute Nodes separate from the region server node -- pros and cons

2009-05-14 Thread Andrew Purtell
Hi Ninad, I think the answer depends on the anticipated scale of the deployment. For small clusters (up to a few racks, ~40 servers per rack) I don't think there is any significant performance hit from separating storage and computation. Presumably all servers will share the same large GigE switch...

Re: Problems when executing many (?) HTable.lockRow()

2009-05-14 Thread Guilherme Germoglio
This way, HTable could directly request read or write row locks ( http://hadoop.apache.org/zookeeper/docs/current/recipes.html#Shared+Locks) using a ZooKeeper wrapper. The problem is that the client API would change a little. Would these changes fit into the client API redesign for 0.20 (HBASE-12...
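
A rough sketch of the kind of ZooKeeper-arbitrated lock being proposed, condensed from the linked recipe. This is the simplified exclusive variant (the full recipe watches only the predecessor znode to avoid the herd effect), and the lock path, quorum setup, and class name are assumptions for illustration:

    import java.util.Collections;
    import java.util.List;
    import java.util.concurrent.CountDownLatch;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooDefs.Ids;
    import org.apache.zookeeper.ZooKeeper;

    public class ZkRowLock {
      private final ZooKeeper zk;
      private final String lockDir; // e.g. /hbase-locks/<table>/<row>; must already exist
      private String myNode;

      public ZkRowLock(ZooKeeper zk, String lockDir) {
        this.zk = zk;
        this.lockDir = lockDir;
      }

      public void lock() throws KeeperException, InterruptedException {
        // Ephemeral-sequential: the znode dies with the session, and the
        // sequence number gives a total order over contenders.
        myNode = zk.create(lockDir + "/lock-", new byte[0],
            Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        while (true) {
          List<String> children = zk.getChildren(lockDir, false);
          Collections.sort(children);
          String lowest = lockDir + "/" + children.get(0);
          if (lowest.equals(myNode)) {
            return; // lowest sequence number holds the lock
          }
          // Wait until the current holder's znode changes, then re-check.
          CountDownLatch latch = new CountDownLatch(1);
          if (zk.exists(lowest, event -> latch.countDown()) != null) {
            latch.await();
          }
        }
      }

      public void unlock() throws KeeperException, InterruptedException {
        zk.delete(myNode, -1);
      }
    }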

Re: Problems when executing many (?) HTable.lockRow()

2009-05-14 Thread stack
On Wed, May 13, 2009 at 11:00 PM, Joey Echeverria wrote: > Wouldn't it be better to implement the row locks using zookeeper? THBase was done before ZK was in the mix. Now it's here, we should look into using it. St.Ack

Setting up another machine as secondary node

2009-05-14 Thread Rakhi Khatwani
Hi, I want to set up a cluster of 5 nodes in such a way that node1 - master, node2 - secondary namenode, node3 - slave, node4 - slave, node5 - slave. How do we go about that? There is no property in hadoop-env where I can set the IP address for the secondary namenode. If I set node-1 and node-2 in ma...
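
For reference, the layout in question as the stock 0.19-era start scripts read it: start-dfs.sh starts the namenode on whichever machine it is run from, starts a secondary namenode on every host listed in conf/masters (despite the file's name), and starts datanodes on every host in conf/slaves. Using the node names above, the files would look like this:

    conf/masters  (hosts that run the secondary namenode)
        node2

    conf/slaves   (hosts that run the datanodes)
        node3
        node4
        node5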

Re: Hbase 0.19.2 - Large import results in heavily unbalanced hadoop DFS

2009-05-14 Thread Sasha Dolgy
I tried HBase 0.19.2 on Hadoop 0.20.0 and had problems due to versioning errors (0.19 is v2 and 0.20 is v3 of the protocol, or something along those lines), so I went back down to Hadoop 0.19.1 and HBase 0.19.2 and it all works fine. Maybe I didn't do something right. -sd On Thu, May 14, 2009 at 1:16 PM, A...

Re: Hbase 0.19.2 - Large import results in heavily unbalanced hadoop DFS

2009-05-14 Thread Alexandra Alecu
Hi, Many thanks to all for your answers. Your help is much appreciated. Andy, you were right, fs.trash.interval didn't help. However, on a second run I was lucky enough that the overload got distributed over 2 of my 4 datanodes and it didn't result in an error during the import. I was looking...

Lookup table design and RowCounter job

2009-05-14 Thread Alexandra Alecu
Hi, I am using Hadoop 0.19.1 and HBase 0.19.2. I have ingested data into two tables. One of them, 'observations', is expected to have a large number of records (10^9 for a medium-size data set). Each record in the 'observations' table refers to a certain star observed, which I store in a different...
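
Besides the RowCounter MapReduce job, a plain client-side scan gives a quick, if single-threaded, count. A minimal sketch assuming the 0.19-era client API; the column family name is a placeholder for whichever family the 'observations' table actually has:

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Scanner;
    import org.apache.hadoop.hbase.io.RowResult;

    public class CountRows {
      public static void main(String[] args) throws IOException {
        HTable table = new HTable(new HBaseConfiguration(), "observations");
        // Scan a single (ideally small) column family to keep the count cheap.
        Scanner scanner = table.getScanner(new String[] { "star:" });
        long count = 0;
        for (RowResult row : scanner) {
          count++;
        }
        scanner.close();
        System.out.println("rows: " + count);
      }
    }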

Re: Administration tool for HBase

2009-05-14 Thread Ryan Rawson
Don't kill database servers with kill -9 :-) hbase-daemon.sh stop regionserver does a graceful shutdown, which can take a while as data is flushed. There was a bug, which I indicated, that can prevent root/meta from being reassigned properly. A few other notes: - adding servers is easy, just sta...

Re: Administration tool for HBase

2009-05-14 Thread Ninad Raut
Ryan, Using bin/hbase-daemon.sh start regionserver and bin/hbase-daemon.sh stop regionserver, can we add/remove slave nodes while the cluster is live? How do we handle a "region not serving" exception? In this scenario the daemon.sh script seems to go on forever. We usually kill the HRegion proces...

Keeping Compute Nodes separate from the region server node -- pros and cons

2009-05-14 Thread Ninad Raut
Hi, I want to get a design perspective here as to what the advantages would be of separating region servers and compute nodes (to run mapreduce tasks). Will separating datanodes from compute nodes reduce the load on the servers and avoid swapping problems? Will this separation make map reduce tasks les...

Re: Set hbase configuration when client is on different machine

2009-05-14 Thread Sasha Dolgy
HBaseConfiguration config = new HBaseConfiguration(); config.set("hbase.master", "foo.bar.com:60000"); On Thu, May 14, 2009 at 8:32 AM, monty123 wrote: > > Hi All, > > I am a newbie to hbase. > I am able to set up hbase in pseudo-distributed mode and I have also > integrated it from...
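
A self-contained version of the snippet above, with a trial write to verify the connection. This assumes the 0.19-era client API; foo.bar.com, mytable, row1, and the cf:qual column are placeholders, and 60000 is the default hbase.master port:

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.BatchUpdate;

    public class RemoteClient {
      public static void main(String[] args) throws IOException {
        // Point the client at the remote master instead of localhost.
        HBaseConfiguration config = new HBaseConfiguration();
        config.set("hbase.master", "foo.bar.com:60000");
        HTable table = new HTable(config, "mytable");
        BatchUpdate update = new BatchUpdate("row1");
        update.put("cf:qual", "value".getBytes());
        table.commit(update);
      }
    }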

Set hbase configuration when client is on different machine

2009-05-14 Thread monty123
Hi All, I am a newbie to hbase. I am able to set up hbase in pseudo-distributed mode and I have also integrated it from Java (the java client class and hbase were on the same system). Now, I have no idea how to change the configuration to access hbase from a remote client (like a mysql jdbc conn...