Hbase importtsv fails with java.net.UnknownHostException: unknown host:

2012-07-03 Thread AnandaVelMurugan Chandra Mohan
Hi, I have a 3 node HBase cluster up and running. I could list and scan tables in HBase shell. I am trying to run HBase map-reduce job to load bulk data from TSV file. It fails with 12/07/04 11:42:11 INFO mapred.JobClient: Task Id : attempt_201207031124_0022_m_02_0, Status : FAILED java.lang.

Re: HBASE -- Session Expire ?

2012-07-03 Thread Amandeep Khurana
Jay, You need to modify the zoo.cfg to reflect the quorum. server.0=localhost:2888:3888 will change to something like server.0=zk_host_1:2888:3888 server.1=zk_host_2:2888:3888 server.3=zk_host_3:2888:3888 The same config needs to be on all the zookeeper hosts. Also, I assume it's a self manag
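
The quorum entries described above could be generated along these lines. This is a minimal Python sketch, not something from the thread: the host names and the 2888:3888 ports are taken from the message, while the helper name `quorum_lines` and the sequential server ids starting at 0 are assumptions made for illustration. Each ZooKeeper host also needs a `myid` file whose number matches its `server.N` entry.

```python
def quorum_lines(hosts, peer_port=2888, election_port=3888, first_id=0):
    """Return zoo.cfg 'server.N=host:peer:election' lines, one per host.

    first_id=0 mirrors the numbering used in the message; it is an
    assumption, so adjust it to match the myid files on your hosts.
    """
    return [
        f"server.{first_id + i}={host}:{peer_port}:{election_port}"
        for i, host in enumerate(hosts)
    ]


if __name__ == "__main__":
    # The same lines would go into zoo.cfg on every ZooKeeper host.
    for line in quorum_lines(["zk_host_1", "zk_host_2", "zk_host_3"]):
        print(line)
```

Generating the lines from one host list is one way to keep the file identical across the ensemble, which is the requirement the message stresses.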

RE: HBase table disk usage

2012-07-03 Thread Anoop Sam John
Hi, The KV storage will be like KeyLength (4 bytes) + ValueLength (4 bytes) + RowKeyLength (2 bytes) + rowkey (... bytes) + CF length (1 byte) + CF (... bytes) + Qualifier (... bytes) + timestamp (8 bytes) + type (1 byte) + value (... bytes) If you are using HFile V2 there will be memstoreTS also added wi
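
The field list above can be turned into a quick size estimate. The following is a rough Python sketch under stated assumptions: it counts only the fields named in the message, ignores memstoreTS and any HFile index or block overhead, and the function name `keyvalue_size` is hypothetical. The inputs are the figures from the original question in this thread (33-byte row keys, a one-byte column family 'F', empty qualifiers and values, 1.5 billion rows).

```python
def keyvalue_size(row_len, family_len, qualifier_len, value_len):
    """Bytes for one KeyValue, following the field list in the message."""
    return (
        4                # key length
        + 4              # value length
        + 2              # row key length
        + row_len        # row key
        + 1              # column family length
        + family_len     # column family
        + qualifier_len  # qualifier
        + 8              # timestamp
        + 1              # type
        + value_len      # value
    )


per_kv = keyvalue_size(row_len=33, family_len=1, qualifier_len=0, value_len=0)
total_gb = per_kv * 1_500_000_000 / 1e9  # decimal gigabytes
print(per_kv, total_gb)  # 54 bytes per KeyValue, 81.0 GB in total
```

The result is much closer to the ~82GB reported by "hadoop dfs -du" than the (33+1) bytes-per-row estimate in the question.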

Re: HBASE -- Session Expire ?

2012-07-03 Thread Jay Wilson
First, thank you for looking at this for me. Second, the network is up. It is dedicated to the cluster and it appears stable. Third, I haven't modified the zoo.cfg; however, I have put it on pastebin. I made all my zookeeper changes in hbase-site.xml zoo.cfg -- http://pastebin.com/down

Re: HBASE -- Session Expire ?

2012-07-03 Thread Amandeep Khurana
Can you put your zoo.cfg and hbase-site.xml on pastebin and put the links here? Have you verified that your network is fine? Also, can you put up your RS and ZK logs too? On Tuesday, July 3, 2012 at 5:19 PM, Jay Wilson wrote: > I have reread the sections in the O'Reilly HBase book on cluster >

HBASE -- Session Expire ?

2012-07-03 Thread Jay Wilson
I have reread the sections in the O'Reilly HBase book on cluster configuration and troubleshooting and I am still getting "session expired" after X number of minutes. X being anywhere from 15 to 20 minutes. There is 0 load on the cluster and it's using a dedicated isolated network. No jobs runnin

Rowkey design for time series data

2012-07-03 Thread Bartosz M. Frak
Hey Guys, Before I get to my thoughts on the rowkey design, here's some background info about the problem we are trying to tackle. We are producing about 60TB of data a year (uncompressed). Most of this data is collected continuously from various detectors around our facility. Vast majority o

Re: Blocking Inserts

2012-07-03 Thread Suraj Varma
In your case, you are likely hitting the blocking store files limit (hbase.hstore.blockingStoreFiles, default: 7) and/or hbase.hregion.memstore.block.multiplier - check out http://hbase.apache.org/book/config.files.html for more details on these settings and how they affect your insert performance. O
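
As a rough illustration of the two limits mentioned above, here is a simplified Python model of when a region stops accepting writes. It is a sketch only, not HBase's actual code path: the function name `writes_blocked` is hypothetical, the flush size and multiplier values are placeholder assumptions to be replaced with the values from your own configuration, and real HBase also bounds the store-file wait with hbase.hstore.blockingWaitTime.

```python
def writes_blocked(memstore_bytes, store_file_counts,
                   flush_size=64 * 1024 * 1024,  # assumed; check hbase.hregion.memstore.flush.size
                   block_multiplier=2,           # assumed; check hbase.hregion.memstore.block.multiplier
                   blocking_store_files=7):      # default quoted in the message
    """Simplified check: True if either limit from the message is exceeded."""
    # Limit 1: the memstore has grown past multiplier * flush size.
    memstore_full = memstore_bytes > block_multiplier * flush_size
    # Limit 2: some store holds more files than blockingStoreFiles allows.
    too_many_files = any(n > blocking_store_files for n in store_file_counts)
    return memstore_full or too_many_files


if __name__ == "__main__":
    print(writes_blocked(10 * 1024 * 1024, [3, 8]))  # True: one store has 8 files
    print(writes_blocked(10 * 1024 * 1024, [3, 7]))  # False: neither limit exceeded
```

Under this model, raising either limit trades fewer write stalls for more store files to read through or more memory held before a flush.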

Re: HBase master starting up

2012-07-03 Thread Stack
On Tue, Jul 3, 2012 at 11:02 PM, kasturi wrote: > Hi, > I just upgraded the hbase to 0.90.6. I am using a mapr distribution. > When I start up the master, I get the > following error message: > > 2012-07-03 13:30:01,214 FATAL org.apache.hadoop.hbase.master.HMaster: > Unhandled exception. > Starting

HBase master starting up

2012-07-03 Thread kasturi
Hi, I just upgraded the hbase to 0.90.6. I am using a mapr distribution. When I start up the master, I get the following error message: 2012-07-03 13:30:01,214 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. org.apache.hadoop.ipc.RemoteException: java.io.I

Re: HMASTER -- odd messages ?

2012-07-03 Thread N Keywal
> Would Datanode issues impact the HMaster stability? Yes and no. If you have only a few datanodes down, there should be no issue. When there are enough missing datanodes to make some blocks not available at all in the cluster, there are many tasks that cannot be done anymore (to say the least, a

Re: HMASTER -- odd messages ?

2012-07-03 Thread Amandeep Khurana
On Tuesday, July 3, 2012 at 10:08 AM, Jay Wilson wrote: > 2012-07-03 09:05:00,530 ERROR > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Couldn't close > log at > hdfs://devrackA-00:8020/var/hbase-hadoop/hbase/-ROOT-/70236052/recovered.edits/046.temp > java.net.NoRouteToH

HMASTER -- odd messages ?

2012-07-03 Thread Jay Wilson
My HMaster and HRegionservers start and run for a while. Looking at the messages, there appear to be some Datanodes with some issues, HLogSplitter has some block issues, the HMaster appears to drop off the network (I know, bad), then it comes back, and then the cluster runs for about 10 more min

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread Mohammad Tariq
It would be better to start fresh. Add these props in your core-site.xml file - fs.default.name = hdfs://localhost:9000 and hadoop.tmp.dir = /home/mohammad/hdfs/temp. In hdfs-site.xml -

Re: HBase table disk usage

2012-07-03 Thread Sever Fundatureanu
I was only du'ing the table dir. The tmp dirs only had a couple of hundred bytes in my case. The HFile tool only gives the avgKeyLen=46. This does not include 4 bytes KeyLength + 4 bytes ValueLength. Now indeed I get a total of 54 bytes/KV *1.5 billion ~= 81GB. Probably there are also leftovers fro

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread AnandaVelMurugan Chandra Mohan
That did not help. I still don't see the hbase master and zookeeper processes. I am thinking of starting everything from scratch. Any suggestions? On Tue, Jul 3, 2012 at 8:17 PM, Mohammad Tariq wrote: > Change the value of fs.default.name to "hdfs://localhost:8020" and > restart everything again..It

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread Mohammad Tariq
Change the value of fs.default.name to "hdfs://localhost:8020" and restart everything again..It should be a combination of host:port. Regards, Mohammad Tariq On Tue, Jul 3, 2012 at 8:12 PM, AnandaVelMurugan Chandra Mohan wrote: > For starting Hbase master manually > > I did "cd" to \bin > T

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread AnandaVelMurugan Chandra Mohan
For starting Hbase master manually I did "cd" to \bin Then did ./hbase master start Contents of my hbase-site.xml hbase.rootdir hdfs://localhost:8020/user/eucalyptus/hbase The directory shared by RegionServers. hbase.zookeeper.quorum

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread Mohammad Tariq
What do you mean by "I tried starting Hbase master manually and I got this error."??..By manually do you mean through the shell??How were you trying to do it earlier??And if possible could you please post the modified core-site.xml and hbase-site.xml files. Regards, Mohammad Tariq On Tue, Ju

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread AnandaVelMurugan Chandra Mohan
Thanks for the link. I followed the link and fixed my hdfs url too. But when I start hbase, the hbase master and zookeeper processes are not starting. I tried starting the HBase master manually and I got this error. ERROR master.HMasterCommandLine: Failed to start master java.io.IOException: CRC check f

Re: Advices for HTable schema

2012-07-03 Thread Michael Segel
Comparisons are fine. Try not to think of this in terms of rows and columns, but in terms of records. Think of each record as being atomic. Create a list of all of the components that make up that record. Then combine like components into structures. Like the Street Address. Add in a coupl

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread Mohammad Tariq
Not a prob..I was expecting this after looking at the config file..First of all your "hbase.rootdir" property must contain the "complete" value of the "fs.default.name" property in your hadoop's "core-site.xml" file.(This includes "port no" also)..After that just add the following properties in you

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread AnandaVelMurugan Chandra Mohan
Sorry. I tried list and it returned 0 as no table exists. Then I posted this question. Now when I try to create a table, the shell hangs and I get the following exception org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInt

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread Mohammad Tariq
Are you sure about the Hbase shell???Are you able to create tables , or list the tables through shell?? Regards, Mohammad Tariq On Tue, Jul 3, 2012 at 5:41 PM, Michael Segel wrote: > What's the status of Hadoop and IPV6 vs IPV4? > > On Jul 3, 2012, at 7:07 AM, AnandaVelMurugan Chandra Mohan

Re: Advices for HTable schema

2012-07-03 Thread Jean-Marc Spaggiari
Hi Michael, I'm trying to dive deeply into HBase and forget all my RDBMS knowledge, but sometimes it's difficult not to compare, and I don't yet have the right way of thinking. The more Amandeep replied yesterday, the clearer it became, but it seems I still have a LOT to learn. I will

Re: HBase table disk usage

2012-07-03 Thread Stack
On Tue, Jul 3, 2012 at 2:17 PM, Sever Fundatureanu wrote: > Right, forgot about the timestamps. These should be a long value each, so 8 > bytes. The versioning is set to 1 so it shouldn't count. > Note the column qualifier is also void on each entry. > > So now we get (33+1+8)x1.5*10^9 = 63GB, sti

Re: HBase table disk usage

2012-07-03 Thread Sever Fundatureanu
Right, forgot about the timestamps. These should be a long value each, so 8 bytes. The versioning is set to 1 so it shouldn't count. Note the column qualifier is also void on each entry. So now we get (33+1+8)x1.5*10^9 = 63GB, still a 19GB difference... Thanks, Sever On Tue, Jul 3, 2012 at 1:48

Re: Powered By Page

2012-07-03 Thread Stack
On Tue, Jul 3, 2012 at 1:29 PM, Buckley,Ron wrote: > Stack/Lars, > > Here's an entry for OCLC: > > OCLC (www.worldcat.org) uses HBase as the main data store for WorldCat, > a union catalog which aggregates the collections of 72,000 libraries in > 112 countries and territories. WorldCat is current

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread Michael Segel
What's the status of Hadoop and IPV6 vs IPV4? On Jul 3, 2012, at 7:07 AM, AnandaVelMurugan Chandra Mohan wrote: > Hi, > > These are text from the files > > /etc/hosts > > 127.0.0.1 localhost > > > # The following lines are desirable for IPv6 capable hosts > ::1 localhost ip6-local

Re: Region servers fall after Zookeeper connectivity loss on EC2

2012-07-03 Thread Nicolas Thiébaud
The issue does appear to be due to the leap second, we are investigating. Thanks ! On Mon, Jul 2, 2012 at 5:50 PM, Nicolas Thiébaud wrote: > Hi, > > We have been successfully running a cdh3 HBase cluster on c1.xlarge > instances for over a month, but we recently started hitting what looks like >

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread AnandaVelMurugan Chandra Mohan
Hi, These are text from the files /etc/hosts 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters hbase-site.xml hbase

Re: Advices for HTable schema

2012-07-03 Thread Michael Segel
Hi, You're overthinking this. Take a step back and remember that you can store anything you want as a byte stream in a column. Literally. So you have a record that could be a text blob. Store it in one column. Use JSON to define its structure and fields. The only thing that makes it diff

Re: HBase table disk usage

2012-07-03 Thread Michael Segel
Timestamps on the cells themselves? # Versions? On Jul 3, 2012, at 4:54 AM, Sever Fundatureanu wrote: > Hello, > > I have a simple table with 1.5 billion rows and one column family 'F'. > Each row key is 33 bytes and the cell values are void. By doing the math I > would expect this table to t

RE: Powered By Page

2012-07-03 Thread Buckley,Ron
Stack/Lars, Here's an entry for OCLC: OCLC (www.worldcat.org) uses HBase as the main data store for WorldCat, a union catalog which aggregates the collections of 72,000 libraries in 112 countries and territories. WorldCat is currently comprised of nearly 1 billion records with nearly 2 billion

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread Mohammad Tariq
Can you paste the contents of your /etc/hosts and hbase-site.xml files?? Regards, Mohammad Tariq On Tue, Jul 3, 2012 at 4:06 PM, AnandaVelMurugan Chandra Mohan wrote: > Hi, > > Thanks for the response. Sadly I am still getting same error. > > > On Tue, Jul 3, 2012 at 3:58 PM, Mohammad Tariq

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread AnandaVelMurugan Chandra Mohan
Hi, Thanks for the response. Sadly I am still getting the same error. On Tue, Jul 3, 2012 at 3:58 PM, Mohammad Tariq wrote: > Hello Ananda, > > Add these two lines in your client and see if it works for you : > > config.set("hbase.zookeeper.property.clientPort","2181"); > config.set("hba

Re: Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread Mohammad Tariq
Hello Ananda, Add these two lines in your client and see if it works for you : config.set("hbase.zookeeper.property.clientPort","2181"); config.set("hbase.master", "localhost:6"); Regards, Mohammad Tariq On Tue, Jul 3, 2012 at 3:49 PM, AnandaVelMurugan Chandra Mohan wrote: >

Connection error while using HBase client API for pseudodistributed mode

2012-07-03 Thread AnandaVelMurugan Chandra Mohan
Hi, For development purposes, I have set up HBase in pseudodistributed mode. I have the following line in the hbase-env.sh file: export HBASE_MANAGES_ZK=true. The HBase shell works fine, but the client API is not working. My client code is as follows: Configuration config = HBaseConfiguration.cr

RE: Out of memory error in Hbase

2012-07-03 Thread Prakrati Agrawal
Hi I am using HBase 0.90.6 - cdh3u4. I am inserting data into the HBase table almost every second using my Java code. After running the program for 3-4 days, I am getting heap space error while running Hbase on certain nodes of my cluster. I have allocated 2G as the heapspace of Hbase. I can't

HBase table disk usage

2012-07-03 Thread Sever Fundatureanu
Hello, I have a simple table with 1.5 billion rows and one column family 'F'. Each row key is 33 bytes and the cell values are void. By doing the math I would expect this table to take up (33+1)x1.5*10^9 = 51GB. However if I do a "hadoop dfs -du" I get that the table takes up ~82GB. This is after

Re: Possible unintended use of finalizers in HTablePool

2012-07-03 Thread yuzhihong
You can log a Jira where you attach your patch. Thanks On Jul 2, 2012, at 8:13 PM, Ryan Brush wrote: > While generating some load against a library that makes extensive use of > HTablePool in 0.92, I noticed that the largest heap consumer was > java.lang.ref.Finalizer. Digging in, I discove