how to drop a table forcefully?

2013-01-17 Thread hua beatls
Hi, a mistakenly created table cannot be dropped. First we created it with 'hbase shell', below: [hadoop@hadoop3 ~]$ [hadoop@hadoop3 ~]$ hbase shell 13/01/17 15:49:35 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available HBase Shell;
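
For reference, the sequence the poster is attempting is the standard disable-then-drop; the rest of the thread is about what to do when that fails. A minimal sketch with the 0.94-era Java client API (the table name is taken from later in the thread):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class DropTable {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        String table = "cdrtestsnappy";
        if (admin.isTableEnabled(table)) {
          admin.disableTable(table); // a table must be disabled before it can be dropped
        }
        admin.deleteTable(table);
        admin.close();
      }
    }

The shell equivalent is disable 'cdrtestsnappy' followed by drop 'cdrtestsnappy'.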

Re: How is DataXceiver been used?

2013-01-17 Thread Mohammad Tariq
Hello Raymond, You might find this link helpful: http://blog.cloudera.com/blog/2012/03/hbase-hadoop-xceivers/. It explains the problem and solution in great detail. Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com On Thu, Jan 17, 2013 at 12:14 PM, Liu, Raymond
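
The fix that post describes is raising the DataNode's transceiver limit in hdfs-site.xml and restarting the DataNodes. Note that the property name is historically misspelled in Hadoop 1.x; 4096 is the commonly suggested value:

    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
    </property>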

Re: how to drop a table forcefully?

2013-01-17 Thread Jean-Marc Spaggiari
Hi Beat, Have you tried checking the filesystem using fsck to see if there are any inconsistencies? JM 2013/1/17, hua beatls bea...@gmail.com: hi, A table mistakenly created cannot be dropped? first we create it with 'hbase shell', below: [hadoop@hadoop3 ~]$

Re: how to drop a table forcefully?

2013-01-17 Thread hua beatls
Hi, the file system is OK. Any other ideas? beatls On Thu, Jan 17, 2013 at 7:38 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Beat, Have you tried checking the filesystem using fsck to see if there are any inconsistencies? JM 2013/1/17, hua beatls bea...@gmail.com:

Re: how to drop a table forcefully?

2013-01-17 Thread Jean-Marc Spaggiari
Hi, What do you mean by the file system is OK? Have you run both the HBase and the Hadoop checks, and neither of them returns any errors? Also, what do you have on the webui? Can you see the table? The regions? When you are trying to delete the table, what do you have in your master logs?

Re: how to drop a table forcefully?

2013-01-17 Thread hua beatls
On Thu, Jan 17, 2013 at 8:21 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi, *below is a snapshot of the webui,* tables 'cdrtestsnap' and 'cdrtestsnappy' are on the list, as well as their regions. cdrtestGZ http://192.168.250.106:60010/table.jsp?name=cdrtestGZ {NAME = 'cdrtestGZ',

Re: How to upgrade HBase from 0.90.5 to 0.94

2013-01-17 Thread Ted
With the version gap I don't think bulk load can be used directly. Do the two clusters have the same set of tables? Is the 0.90 cluster actively serving write load? Thanks On Jan 17, 2013, at 1:00 AM, Mickey huanfeng...@gmail.com wrote: Thanks, Ted. If I have another cluster already installed

Re: Constructing rowkeys and HBASE-7221

2013-01-17 Thread Doug Meil
Thanks Aaron! I will take a look at Kiji. And I think it underscores the need for some type of utility for rowkey building/parsing being available in HBase, because one of the first things folks tend to do when they start using HBase is start building their own keybuilder utility (same
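
As an illustration of that pattern (this is not an HBase API; the field layout and names are hypothetical), the kind of fixed-width keybuilder people keep rewriting looks like this:

    import org.apache.hadoop.hbase.util.Bytes;

    public class KeyBuilder {
      // [4-byte hash of userId][8-byte userId][8-byte reversed timestamp]
      public static byte[] build(long userId, long ts) {
        byte[] key = new byte[4 + 8 + 8];
        Bytes.putInt(key, 0, Long.valueOf(userId).hashCode()); // spreads writes across regions
        Bytes.putLong(key, 4, userId);
        Bytes.putLong(key, 12, Long.MAX_VALUE - ts);           // newest rows sort first
        return key;
      }
    }

Every consumer then needs the mirror-image parsing code, which is exactly the duplication a shared utility would remove.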

Re: how to drop a table forcefully?

2013-01-17 Thread Jean-Marc Spaggiari
OK. So your HBase system is inconsistent. And I can't see your table t111 anywhere. In the shell, if you do a list, which tables are you getting? I guess t111 will not be there. The first thing to do is to fix that. Can you provide more logs from the hbase check? It might tell you what the issues

Re: how to drop a table forcefully?

2013-01-17 Thread hua beatls
Hi JM, sorry, the table names are 'cdrtestsnappy' and 'cdrtestsnap'. I can see them from 'list'. hbase(main):001:0 list TABLE SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in

Re: how to drop a table forcefully?

2013-01-17 Thread Jean-Marc Spaggiari
OK, sorry. From the example you sent I thought it was t111. Did you get a chance to paste the entire output of the hbase check on pastebin? Not just the summary, but everything, to see what issue it's reporting. JM 2013/1/17, hua beatls bea...@gmail.com: Hi JM, sorry, the table name is

RE: Hbase as mongodb

2013-01-17 Thread Panshul Whisper
Thank you for the suggestions, but the timestamp is not a unique value in my case, so using it as the key is not advisable. The data is too large, so a full table scan is also not good. Will look into Panthera. Regards, Ouch Whisper 01010101010 On Jan 17, 2013 4:52 AM, Anoop Sam John anoo...@huawei.com

Re: Hbase as mongodb

2013-01-17 Thread Michael Segel
The interface for pulling JSON objects out of HBase will not be the same as MongoDB's. Meaning that you can store JSON objects in HBase (see Avro, or Wibidata's Kiji, for example), but the infrastructure to query and interact with HBase is going to be different from MongoDB's. HTH -Mike On
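
A minimal sketch of the "you can store it, but don't expect Mongo-style queries" point, with the 0.94-era API (table, family, and document names are hypothetical):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class JsonPut {
      public static void main(String[] args) throws Exception {
        HTable table = new HTable(HBaseConfiguration.create(), "docs");
        Put put = new Put(Bytes.toBytes("doc-42"));
        // HBase stores the JSON as an opaque byte[]; it will not parse or index it.
        put.add(Bytes.toBytes("d"), Bytes.toBytes("json"),
                Bytes.toBytes("{\"user\":\"whisper\",\"ts\":1358409600}"));
        table.put(put);
        table.close();
      }
    }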

Re: How to upgrade HBase from 0.90.5 to 0.94

2013-01-17 Thread Mickey
Thanks, Ted. If I have another cluster already installed with hbase 0.94.4, can I migrate the data from the original hbase 0.90 cluster to the new cluster? Is it possible to do something like bulk load? Best regards, Mickey 2013/1/15 Ted yuzhih...@gmail.com You can upgrade to 0.92.2 first,

Loading data, hbase slower than Hive?

2013-01-17 Thread Austin Chungath
Hi, Problem: Hive took 6 mins to load a data set; HBase took 1 hr 14 mins. It's a 20 GB data set, approx. 230 million records. The data is in HDFS, in a single text file. The cluster is 11 nodes, 8 cores. I loaded this into Hive, partitioned by date, bucketed into 32 buckets, and sorted. Time taken was 6 mins.

Re: Loading data, hbase slower than Hive?

2013-01-17 Thread Michael Segel
The writes take longer in HBase. Just how much longer may depend on how well you tuned HBase. Now, having said that... suppose you want to find a single record in either HBase or Hive. Which do you think will be faster? ;-) On Jan 17, 2013, at 10:44 AM, Austin Chungath austi...@gmail.com

Re: Loading data, hbase slower than Hive?

2013-01-17 Thread Anoop John
In the case of Hive, data insertion means placing the file under the table path in HDFS. HBase needs to read the data and convert it into its own format (HFiles), and MR is doing this work. So this makes it clear why HBase will be slower. :) As Michael said, the read operation... -Anoop- On Thu, Jan 17,
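
If the poster's MR job writes with Puts through the normal client path (an assumption; the thread does not say how the job writes), a standard way to pay the HFile-conversion cost more cheaply is to have the job produce HFiles directly and bulk-load them. A sketch against the 0.94-era API; the paths, table name, and mapper are all hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class BulkLoadPrepare {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "prepare-hfiles");
        job.setJarByClass(BulkLoadPrepare.class);
        // A mapper emitting (ImmutableBytesWritable, Put) pairs is required here:
        // job.setMapperClass(ParseMapper.class); // hypothetical
        FileInputFormat.addInputPath(job, new Path("/data/input"));
        FileOutputFormat.setOutputPath(job, new Path("/tmp/hfiles"));
        HTable table = new HTable(conf, "mytable");
        // Wires in the reducer, partitioner and output format needed to write HFiles.
        HFileOutputFormat.configureIncrementalLoad(job, table);
        job.waitForCompletion(true);
        // Then move the HFiles into the table with the completebulkload tool
        // (org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles).
      }
    }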

Re: Loading data, hbase slower than Hive?

2013-01-17 Thread ramkrishna vasudevan
Hive is more for batch processing, and HBase is more for real-time data. Regards Ram On Thu, Jan 17, 2013 at 10:30 PM, Anoop John anoop.hb...@gmail.com wrote: In case of Hive data insertion means placing the file under table path in HDFS. HBase need to read the data and convert it into its format.

Just joined the user group and have a question

2013-01-17 Thread Chalcy Raja
Hi HBASE Gurus, I am Chalcy Raja and I joined the HBase group yesterday. I am already a member of the Hive and Sqoop user groups. Looking forward to learning and sharing information about HBase here! I have a question: we have a cluster where we run Hive jobs and also HBase. There are stability

Re: Just joined the user group and have a question

2013-01-17 Thread Kevin O'dell
Chalcy, Glad to have you aboard. One thing to look at is the max map and reduce slots that you are currently allowing. Typically, we look at the CPU architecture and say that if it is not HT (hyperthreaded) then it is 1:1; if it is using HT, 1:1.5. With a dual quad-core without HT you would be able to use
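
Worked through for the dual quad-core example (the daemon reservation at the end is common practice, not something Kevin states): without HT, 8 cores at 1:1 gives roughly 8 total map+reduce slots; with HT, 8 x 1.5 = 12 slots. On a node that also runs a DataNode, TaskTracker, and RegionServer, it is common to hold back 2 or 3 of those slots for the daemons.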

Re: Just joined the user group and have a question

2013-01-17 Thread Doug Meil
Hi there- If you're absolutely new to HBase, you might want to check out the HBase refGuide, in the architecture, performance, and troubleshooting chapters first. http://hbase.apache.org/book.html In terms of determining why your region servers just die, I think you need to read the background

Re: Loading data, hbase slower than Hive?

2013-01-17 Thread Mohammad Tariq
Just to add to whatever all the heavyweights have said above, your MR job may not be as efficient as the MR job corresponding to your Hive query. You can enhance the performance by setting the mapred config parameters wisely and by tuning your MR job. Warm Regards, Tariq https://mtariq.jux.com/

Re: Just joined the user group and have a question

2013-01-17 Thread anil gupta
Hi Chalcy, In addition to the points others have made, also have a look at your disk I/O load. MapReduce jobs are disk I/O intensive, so when a MapReduce job is running there might be contention for disk I/O, and that contention might lead to request timeouts in HBase. Hence, you will start having

Throttle replication speed in case of datanode failure

2013-01-17 Thread Brennon Church
Hello, Is there a way to throttle the speed at which under-replicated blocks are copied across a cluster? Either limiting the bandwidth or the number of blocks per time period would work. I'm currently running Hadoop v1.0.1. I think the

Re: Throttle replication speed in case of datanode failure

2013-01-17 Thread Jean-Daniel Cryans
Since this is a Hadoop question, it should be sent to u...@hadoop.apache.org (which I'm now sending this to, and I put user@hbase in BCC). J-D On Thu, Jan 17, 2013 at 9:54 AM, Brennon Church bren...@getjar.com wrote: Hello, Is there a way to throttle the speed at which under-replicated blocks are

Re: Throttle replication speed in case of datanode failure

2013-01-17 Thread Brennon Church
Ack. Sorry about that guys. Need to pay more attention when I email. --Brennon On 1/17/13 10:03 AM, Jean-Daniel Cryans wrote: Since this is a Hadoop question, it should be sent u...@hadoop.apache.org (which I'm now sending this to and I put user@hbase in BCC). J-D On Thu, Jan 17, 2013 at

Re: Region status.

2013-01-17 Thread Adrien Mogenet
Perhaps you can dive into the ZK nodes? On Fri, Jan 11, 2013 at 9:32 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Thanks Ram. It was not for a test case, so I will try to find a different way to do what I need. JM 2013/1/10, ramkrishna vasudevan ramkrishna.s.vasude...@gmail.com:

Re: Region status.

2013-01-17 Thread Jean-Marc Spaggiari
That's an idea. But I was looking for something more straightforward, like region.getStatus(). In the end, I will modify the way I'm approaching the problem and anchor on some region-based coprocessors... 2013/1/17, Adrien Mogenet adrien.moge...@gmail.com: Perhaps can you dive into ZK nodes ? On

RE: Just joined the user group and have a question

2013-01-17 Thread Chalcy Raja
Hi Kevin, Thanks for the reply. We are currently using 10 mappers and 10 reducers on each node. With 32 GB of memory, 2 GB allotted for the HBase heapsize, and mapred.map.child.java.opts and mapred.reduce.child.java.opts at 1 GB each, having 10 mappers and 10 reducers looks like not a bad idea. From what
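
For reference, those numbers add up as follows (the daemon and JVM overheads are estimates, not from the thread): 10 mappers x 1 GB + 10 reducers x 1 GB = 20 GB for tasks, plus the 2 GB RegionServer heap = 22 GB, leaving roughly 10 GB of the 32 GB for the DataNode, TaskTracker, OS, and page cache. Since each task JVM uses more than its -Xmx, a fully loaded node is close to the limit, which is consistent with region servers dying under memory pressure.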

RE: Just joined the user group and have a question

2013-01-17 Thread Chalcy Raja
Thanks! Doug. I am not absolutely new to HBase. As in Kevin's email, because of mapred job (Hive) contention, HBase regionservers die and the whole of HBase goes down. I understand that we have to somehow logically or physically separate the clusters. --Chalcy -Original Message- From:

RE: Just joined the user group and have a question

2013-01-17 Thread Chalcy Raja
Thank you, Anil, for your reply. I am beginning to get the feeling that maybe we should not push both onto the same cluster. In three replies, I got that same advice from two of you. Thanks again, Chalcy -Original Message- From: anil gupta [mailto:anilgupt...@gmail.com] Sent: Thursday,

Hbase heap size

2013-01-17 Thread Varun Sharma
Hi, I was wondering how much heap folks typically give to HBase and how much they leave for the file system cache on the region server. I am using HBase 0.94 and running only the region server and data node daemons. I have a system with 15G of RAM. Thanks

Re: Hbase heap size

2013-01-17 Thread lars hofhansl
A good rule of thumb that I have found is to give each region server a Java heap that is roughly 1/100th of the size of the disk space per region server (that is assuming all the default settings: 10G regions, 128M memstores, 40% of heap for memstores, 20% of heap for block cache, 3-way replication)
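
As a sanity check, the 1/100th figure can be re-derived from those defaults (worked through here for a hypothetical 1 TB of disk per region server, not a figure from the thread): 3-way replication leaves ~333 GB of unique data; at 10G per region that is ~33 regions; 33 regions x 128M memstores is ~4.2 GB of memstore; and if memstores may use at most 40% of the heap, the heap must be at least 4.2 / 0.4 = ~10.5 GB, i.e. roughly 1/100th of 1 TB.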

Re: RegionSplitter command

2013-01-17 Thread Jean-Marc Spaggiari
Hi Anil, 'bin/hbase org.apache.hadoop.hbase.util.RegionSplitter -c 60 -f f1 test HexStringSplit' is working for me, so your 2nd line should work. I'm not sure if it's because of the version, since I'm using 0.94.4. Can you upgrade your version and retry? Or do you need to stay with 0.92.1? JM

Re: Hbase heap size

2013-01-17 Thread Varun Sharma
Thanks for the info. I am looking for a balance where I have a write-heavy workload but need excellent read latency. So 40% to block cache for caching, 35% to memstore. But I would also like to reduce the number of HFiles and the amount of compaction activity. So, having a small number of regions and
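
For reference, the 40%/35% split described above maps onto two real hbase-site.xml properties (the values here are the poster's targets, not recommendations; their combined value needs to leave headroom for everything else on the heap):

    <property>
      <name>hfile.block.cache.size</name>
      <value>0.4</value>
    </property>
    <property>
      <name>hbase.regionserver.global.memstore.upperLimit</name>
      <value>0.35</value>
    </property>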

Re: Storing images in Hbase

2013-01-17 Thread Varun Sharma
Hey Jack, Thanks for the useful information. By flush size being 15%, do you mean the memstore flush size? 15% would mean close to 1G; have you seen any issues with flushes taking too long? Thanks Varun On Sun, Jan 13, 2013 at 8:17 AM, Jack Levin magn...@gmail.com wrote: That's right,

RE: How is DataXceiver been used?

2013-01-17 Thread Liu, Raymond
Hi Tariq Thanks, this blog is very informative! I have roughly figured out the usage of xceivers. The weird thing is that, even according to this blog, I should have far more than enough xceivers for my full-table-scan job. Especially if it counts the number of storefiles instead of real blk_ files on

around 500 (CLOSE_WAIT) connection

2013-01-17 Thread Liu, Raymond
Hi I have hadoop 1.1.1 and hbase 0.94.1, with around 300 regions on each region server. Right after the cluster is started, before I do anything, there are already around 500 CLOSE_WAIT connections from the regionserver process to the DataNode process. Is that normal? Seems there are a

Re: RegionSplitter command

2013-01-17 Thread anil gupta
Hi Jean, Yes, I am stuck with 0.92.1. Thanks for your response. I think I will need to dig deeper into this. ~Anil On Thu, Jan 17, 2013 at 3:02 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Anil, bin/hbase org.apache.hadoop.hbase.util.RegionSplitter -c 60 -f f1 test
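
One possible workaround on 0.92.1 (an assumption, not something the thread verified): skip RegionSplitter and pre-split at creation time via the client API, which yields a table roughly comparable to a 60-way HexStringSplit (the split keys are interpolated between the boundaries, so they are close to, though not byte-identical with, HexStringSplit's):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PreSplitCreate {
      public static void main(String[] args) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
        HTableDescriptor desc = new HTableDescriptor("test");
        desc.addFamily(new HColumnDescriptor("f1"));
        // Creates 'test' pre-split into 60 regions over the hex key range.
        admin.createTable(desc, Bytes.toBytes("00000000"),
            Bytes.toBytes("ffffffff"), 60);
      }
    }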

Re: Hbase heap size

2013-01-17 Thread lars hofhansl
You'll need more memory then, or more machines with not much disk attached. You can look at it this way: - The largest useful region size is 20G (at least that is the current common tribal knowledge). - Each region has at least one memstore (one per column family actually, let's just say one

Re: How to use substring in Rowkey

2013-01-17 Thread Ramasubramanian Narayanan
Hi, Thanks!! How do I use/query with this filter inside the HBase shell? regards, Rams On Thu, Dec 27, 2012 at 4:13 PM, Mohammad Tariq donta...@gmail.com wrote: Try RowFilter with RegexStringComparator Filter filter = new RowFilter(CompareFilter.CompareOp.EQUAL,new
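
Completing Tariq's snippet (the regex is a hypothetical example matching rowkeys that start with "cust"):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter;
    import org.apache.hadoop.hbase.filter.Filter;
    import org.apache.hadoop.hbase.filter.RegexStringComparator;
    import org.apache.hadoop.hbase.filter.RowFilter;

    public class RowkeyRegexScan {
      public static void main(String[] args) throws Exception {
        Filter filter = new RowFilter(CompareFilter.CompareOp.EQUAL,
            new RegexStringComparator("^cust.*"));
        Scan scan = new Scan();
        scan.setFilter(filter);
        HTable table = new HTable(HBaseConfiguration.create(), "t1");
        ResultScanner scanner = table.getScanner(scan);
        for (Result r : scanner) {
          System.out.println(r); // each Result holds the matching row's cells
        }
        scanner.close();
        table.close();
      }
    }

On the shell part of the question: from 0.92 on, the shell also accepts the filter as a string via the filter language, along the lines of scan 't1', {FILTER => "RowFilter(=, 'regexstring:^cust.*')"}.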

Re: How to de-normalize for this situation in HBASE Table

2013-01-17 Thread Sonal Goyal
What are your data access patterns? Best Regards, Sonal Real Time Analytics for BigData https://github.com/sonalgoyal/crux Nube Technologies http://www.nubetech.co http://in.linkedin.com/in/sonalgoyal On Fri, Jan 18, 2013 at 9:04 AM, Ramasubramanian Narayanan

Re: How to de-normalize for this situation in HBASE Table

2013-01-17 Thread Ramasubramanian Narayanan
Hi Sonal, 1. We will fetch all demographic details of a customer based on client ID. 2. Fetch a particular type of address along with other demographics for a client; for example, HOME physical address, HOME telephone, or office email address, etc. regards, Rams On Fri, Jan 18, 2013 at

Re: How to de-normalize for this situation in HBASE Table

2013-01-17 Thread Sonal Goyal
How about client id as the rowkey, with column families for physical address, email address, and telephone address? Within each CF, you could have various qualifiers. For example, in physical address you could have home street, office street, etc. Best Regards, Sonal Real Time Analytics for BigData

Re: H-Rider / HTable UI

2013-01-17 Thread Sonal Goyal
If you are looking at HBase data access, querying, charting, filtering and aggregations, feel free to check out Crux[1]. 1. http://github.com/sonalgoyal/crux Best Regards, Sonal Real Time Analytics for BigData https://github.com/sonalgoyal/crux Nube Technologies http://www.nubetech.co

Re: How to de-normalize for this situation in HBASE Table

2013-01-17 Thread Ramasubramanian Narayanan
Hi Sonal, In that case, the problem is how to store multiple sets of physical addresses in the same column family.. what rowkey should be used for this scenario? A physical address will contain the following fields (we need to store multiple physical addresses like this): Physical address type :

Re: How to de-normalize for this situation in HBASE Table

2013-01-17 Thread Sonal Goyal
A rowkey is associated with the complete row, so you could have client id as the rowkey. HBase allows different qualifiers within a column family, so you could potentially do the following: 1. You could have qualifiers like home address street 1, home address street 2, home address city, office
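
A sketch of that layout (all names hypothetical): rowkey = client id, one address family, and the address type encoded in the qualifier:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class AddressPut {
      public static void main(String[] args) throws Exception {
        HTable table = new HTable(HBaseConfiguration.create(), "customer");
        Put put = new Put(Bytes.toBytes("client-1001")); // client id as rowkey
        byte[] cf = Bytes.toBytes("addr");
        put.add(cf, Bytes.toBytes("home.street1"),   Bytes.toBytes("12 Main St"));
        put.add(cf, Bytes.toBytes("home.city"),      Bytes.toBytes("Chennai"));
        put.add(cf, Bytes.toBytes("office.street1"), Bytes.toBytes("1 Tech Park"));
        table.put(put);
        table.close();
      }
    }

Because qualifiers are created on write, a new address type is just a new qualifier prefix rather than a schema change, which also speaks to the concern about future types raised in the next message.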

Re: How to de-normalize for this situation in HBASE Table

2013-01-17 Thread Ramasubramanian Narayanan
Hi, Is there any other way instead of using HOME/Work/etc.? We expect some 10 such types may come in the future, hence asking. regards, Rams On Fri, Jan 18, 2013 at 10:24 AM, Sonal Goyal sonalgoy...@gmail.com wrote: A rowkey is associated with the complete row. So you could have client id as the

ValueFilter and VERSIONS

2013-01-17 Thread Li, Min
Hi all, As you know, ValueFilter will filter data from all versions, so I created a table and specified that it keep only 1 version. However, the old-version record can still be retrieved by ValueFilter. Does anyone know how to create a table with only one version of each record? BTW, I am using hbase 0.92.1.

RE: ValueFilter and VERSIONS

2013-01-17 Thread Anoop Sam John
Can you make use of SingleColumnValueFilter? With this you can specify whether the condition should be checked only on the latest version or not: SCVF#setLatestVersionOnly(true) -Anoop- From: Li, Min [m...@microstrategy.com] Sent: Friday, January 18, 2013
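
Anoop's suggestion as code (0.92-era API; the family, qualifier, and value are hypothetical):

    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class LatestVersionScan {
      public static void main(String[] args) throws Exception {
        SingleColumnValueFilter scvf = new SingleColumnValueFilter(
            Bytes.toBytes("cf"), Bytes.toBytes("q"),
            CompareFilter.CompareOp.EQUAL, Bytes.toBytes("v1"));
        scvf.setLatestVersionOnly(true); // check only the newest version of the column
        scvf.setFilterIfMissing(true);   // skip rows that lack the column entirely
        Scan scan = new Scan();
        scan.setFilter(scvf);
        // pass 'scan' to HTable#getScanner as usual
      }
    }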

RE: ValueFilter and VERSIONS

2013-01-17 Thread Anoop Sam John
ValueFilter works only on the KVs, not at the row level, so something similar is not possible. Setting versions to 1 will make only one version (the latest) come back to the client, but the filtering is done prior to the versioning decision, so filters will see all the versions' values.

RE: ValueFilter and VERSIONS

2013-01-17 Thread Li, Min
Thanks for your explanation. Min -Original Message- From: Anoop Sam John [mailto:anoo...@huawei.com] Sent: Friday, January 18, 2013 2:44 PM To: user@hbase.apache.org Subject: RE: ValueFilter and VERSIONS ValueFilter works only on the KVs not at a row level . So something similar is