Re: data loss due to regionserver going down

2011-07-27 Thread Nico Guba
Very interesting. What is a good value where there is not too much of a trade-off in performance? I'd imagine that setting this too high could create a very 'chatty' cluster. On 28 Jul 2011, at 00:33, Jeff Whiting wrote: > Replication needs to be higher than 1. If you have a node which is run

Reading Performance

2011-07-27 Thread hmchiud
Hi, There are 500 concurrent client requests trying to fetch data from HBase through our 10 application servers. These requests are dispatched evenly across the 10 application servers. In my application server, I am using the HBase Java API to fetch HBase data by scan key and secondary ind

Re: how to solve this problem?

2011-07-27 Thread Gan, Xiyun
Hi, Did you forget to log out and back in again for the changes to take effect? On Wed, Jul 27, 2011 at 7:35 PM, Gan, Xiyun wrote: > http://wiki.apache.org/hadoop/Hbase/FAQ#A6 > > On Wed, Jul 27, 2011 at 7:12 PM, shanmuganathan.r > wrote: >> Hi All, >> >> >>            I am running the hbase

LoadIncrementalHFile doesn't check hfile families?

2011-07-27 Thread David Capwell
Heya, I am testing HBase with bulk loads and I'm seeing something unexpected. I'm generating a set of random KeyValues where key, family, column, and value are all random strings; I then sort them as Arrays.sort(values, KeyValue.COMPARATOR); I wrote this list to a StoreFile.Writer under /tmp/$tabl

Re: data loss due to regionserver going down

2011-07-27 Thread Jeff Whiting
Replication needs to be higher than 1. If you have a node which is running both a DataNode and an HRegionServer and then shut it down, you WILL lose all the data that the DataNode was holding, because no one else on the cluster has it. HBase relies on HDFS for the replication of data and does NOT have it'

Re: Filter Rows on last 4 bytes

2011-07-27 Thread Jean-Daniel Cryans
You will need to use a RowFilter with a Comparator that only looks at the last 4 bytes, http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/RowFilter.html FYI unless you are restricting your scans on a few rows, doing this on a whole table won't scale as it will do a full table scan. J
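
At its core, such a comparator only has to compare the trailing 4 bytes of each row key. A plain-Java sketch of that check, independent of the HBase filter classes (class and method names here are illustrative, not from the HBase API):

```java
import java.util.Arrays;

public class TrailingBytesMatcher {
    // Returns true when the last 4 bytes of rowKey equal the 4-byte target.
    static boolean lastFourBytesMatch(byte[] rowKey, byte[] target) {
        if (rowKey.length < 4 || target.length != 4) {
            return false;
        }
        byte[] tail = Arrays.copyOfRange(rowKey, rowKey.length - 4, rowKey.length);
        return Arrays.equals(tail, target);
    }
}
```

A real RowFilter comparator would wrap this same comparison; as noted above, applied to a whole table it still costs a full scan.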

Re: Design/Schema questions

2011-07-27 Thread Mark
Unfortunately that was the same problem I was having with Flume. I guess I'll try out HBase to see if it works. Thanks for the suggestions. On 7/27/11 10:31 AM, Stack wrote: On Wed, Jul 27, 2011 at 7:29 AM, Mark wrote: Thanks for the info. We are wondering if using syslog to aggregate these

Re: The Isolation and allocation part of HBase is available.

2011-07-27 Thread Stack
Thank you for making this contribution Jia Liu. St.Ack On Wed, Jul 27, 2011 at 3:55 AM, Liujia wrote: > Now The Isolation and allocation part of HBase is available. > The issue is discussed in > https://issues.apache.org/jira/browse/HBASE-4120,the source code and patched > tarball are located i

Re: Design/Schema questions

2011-07-27 Thread Stack
On Wed, Jul 27, 2011 at 7:29 AM, Mark wrote: > Thanks for the info. We are wondering if using syslog to aggregate these > types of "log" files would be a safer alternative. > syslog is lossy in our experience. St.Ack

Re: can readBlock change to this for improving performance

2011-07-27 Thread Stack
What you thinking? That we do the below call twice? ByteBuffer cachedBuf = cache.getBlock(name + block, cacheBlock); Once before we go into the synchronized block (and if it returns a non-null value, return) and then a second time once inside the synchronized block? Have you tried it? Do you s
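
The proposal is the classic double-checked cache lookup: probe once without the lock so hits stay cheap, then re-check inside the synchronized block so a miss is only loaded once. A minimal stand-alone sketch of the pattern (the cache and loader here are stand-ins, not HFile$Reader's actual fields):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class BlockCacheSketch {
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    int loads = 0; // counts how often the expensive load actually ran

    byte[] readBlock(String key) {
        // First check: no lock taken on the common hit path.
        byte[] cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        synchronized (this) {
            // Second check: another thread may have loaded it while we waited.
            cached = cache.get(key);
            if (cached != null) {
                return cached;
            }
            byte[] loaded = loadFromDisk(key);
            cache.put(key, loaded);
            return loaded;
        }
    }

    private byte[] loadFromDisk(String key) {
        loads++; // stand-in for the expensive HDFS read
        return key.getBytes();
    }
}
```

Repeated reads of the same block then hit the unlocked fast path and trigger only one load.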

Re: data loss due to regionserver going down

2011-07-27 Thread Suraj Varma
When you shut down the region server, check the master logs to see if the master has detected this condition. I've seen weird things happen if DNS is not set up correctly - so, check if the master (logs & UI) is correctly detecting that the region server is down after step 2. --Suraj 2011/7/27 吴限 : > Just

Re: data loss due to regionserver going down

2011-07-27 Thread 吴限
Just by repeatedly checking http://master:60010. Before Step 2:

Address                Start Code     Load
server4.yun.com:60030  1311785159202  requests=0, regions=10, usedHeap=32, maxHeap=995
server5.yun.com:60030  1311768553647  requests=18, regions=7, usedHeap=117, maxHeap=995
Total: servers: 2      requests=18, regions=17

Then at Step 2,

Re: data loss due to regionserver going down

2011-07-27 Thread Chris Tarnas
That is strange behavior. How long did you wait between Step 2 and 3, and what are the results of running hbase hbck at step 3? -chris On Jul 27, 2011, at 9:23 AM, 吴限 wrote: > Thx for your reply. But actually later I did another experiment similar to > one which I explained earlier. > Step 1:

Re: data loss due to regionserver going down

2011-07-27 Thread 吴限
Here is my hbase-site.xml:
  hbase.cluster.distributed = true
  hbase.rootdir = hdfs://server3.yun.com:54310/hbase (The directory shared by region servers.)
  hbase.zookeeper.quorum = server3.yun.com

Re: data loss due to regionserver going down

2011-07-27 Thread 吴限
Dear Stack, thanks for your reply. First, I don't know if there is something wrong with the cdh3u0. And thanks for reminding me about the replication property, which I didn't quite understand before but do now. I'll try to correct this mistake. But actually these situations which I have described real

Re: data loss due to regionserver going down

2011-07-27 Thread Stack
This I can not explain. Check blocks directory on the two servers. Maybe they were all under one datanode only. St.Ack 2011/7/27 吴限 : > Thx for your reply. But actually later I did another experiment similar to > one which I explained earlier. > Step 1: I inserted some data into the hbase. > St

data loss due to a region server going down

2011-07-27 Thread 吴限
Hi everyone. I'd like to run the following *data* *loss* scenario by you to see if we are doing something obviously wrong with our setup here. Setup: -cdh3u0 - Hadoop 0.20.2 - HBase 0.90.1 - 1 Master Node running as NameNode & JobTracker -zookeeper quorum - 2 child nodes running

Re: data loss due to regionserver going down

2011-07-27 Thread 吴限
yep~Is there anything wrong with that? 2011/7/28 Stack > On Wed, Jul 27, 2011 at 8:58 AM, 吴限 wrote: > > Setup: > > -cdh3u0 > > - Hadoop 0.20.2 > > You are using the hadoop from cdh3u0? > > > > - dfs.replication is set to 1 > > > > You will lose data if a machine goes away. You have two m

Re: data loss due to regionserver going down

2011-07-27 Thread 吴限
Thanks for your reply. But actually later I did another experiment similar to the one which I explained earlier. Step 1: I inserted some data into HBase. Step 2: I shut down one of the region servers. Step 3: I checked the table and found some data had been lost. Step 4: I disabled the table and then ena

Re: data loss due to regionserver going down

2011-07-27 Thread Stack
On Wed, Jul 27, 2011 at 8:58 AM, 吴限 wrote: > Setup: >   -cdh3u0 >   - Hadoop 0.20.2 You are using the hadoop from cdh3u0? >   - dfs.replication is set to 1 > You will lose data if a machine goes away. You have two machines but only one instance of each data block; think of it as half of your d

RE: data loss due to regionserver going down

2011-07-27 Thread Buttler, David
When replication is set to 1, that means there is only one copy of the data. If you take a node offline, any data on that node will be unavailable. In your scenario, try upping to a replication factor of 2 Dave -Original Message- From: 吴限 [mailto:infinity0...@gmail.com] Sent: Wednesd

Re: data loss due to regionserver going down

2011-07-27 Thread Chris Tarnas
Replication of 1x means no replication. 2x would mean the data exists in two locations (what it looks like you want). Running with a replication of 1x is a very bad idea and is pretty much a guaranteed way to get data loss. -chris On Jul 27, 2011, at 8:58 AM, 吴限 wrote: > Hi everyone. I'd like
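
For reference, the HDFS replication factor is the dfs.replication property in hdfs-site.xml (this sets the cluster-wide default; clients can also override it per file):

```xml
<!-- hdfs-site.xml: keep two copies of each block -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
```

Files already written with replication 1 keep their old factor until it is raised explicitly (e.g. with hadoop fs -setrep).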

RE: Doubt in hbase installation?

2011-07-27 Thread Buttler, David
You probably don't need the data directory in hbase-site.xml, but the rest look important. If you have an independent zookeeper installation, hbase still needs to know how to connect to it. Dave -Original Message- From: shanmuganathan.r [mailto:shanmuganatha...@zohocorp.com] Sent: We

Re: how to solve this problem?

2011-07-27 Thread Chris Tarnas
Did you restart the HBase and HDFS daemons after applying the limits.conf changes? That will be necessary for the changes to take effect. -chris On Jul 27, 2011, at 5:37 AM, shanmuganathan.r wrote: > Hi Gan, thanks for your reply > > >I set the nproc and ulimit in the /e

data loss due to regionserver going down

2011-07-27 Thread 吴限
Hi everyone. I'd like to run the following *data* *loss* scenario by you to see if we are doing something obviously wrong with our setup here. Setup: -cdh3u0 - Hadoop 0.20.2 - HBase 0.90.1 - 1 Master Node running as NameNode & JobTracker -zookeeper quorum - 2 child nodes running

can readBlock change to this for improving performance

2011-07-27 Thread BlueDavy Lin
hi! In hbase 0.90.2, HFile$Reader.readBlock does this:
  synchronized (blockIndex.blockKeys[block]) {
    blockLoads++;
    // Check cache for block. If found return.
    if (cache != null) {
      ByteBuffer cachedBuf = cache.getBlock(name + block, cacheBlock);
      if

Filter Rows on last 4 bytes

2011-07-27 Thread Shuja Rehman
Hi, I have a key consisting of two integers (4+4=8 bytes), and I want to filter the rows on the basis of the second integer, which means I need to compare the last 4 bytes of the key. If the last 4 bytes match the input integer then the row should be included in the returned results. Can you let me know how to do this w
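
With an 8-byte key made of two 4-byte integers, the second integer lives in the last 4 bytes. A plain-Java sketch of packing and recovering it (helper names are illustrative; HBase's Bytes.toBytes(int) produces the same big-endian layout that ByteBuffer uses by default):

```java
import java.nio.ByteBuffer;

public class CompositeKey {
    // Pack two ints into an 8-byte row key (big-endian).
    static byte[] makeKey(int first, int second) {
        return ByteBuffer.allocate(8).putInt(first).putInt(second).array();
    }

    // Recover the second integer from the last 4 bytes of the key.
    static int secondInt(byte[] rowKey) {
        return ByteBuffer.wrap(rowKey, 4, 4).getInt();
    }
}
```

A filter comparator would perform the equivalent comparison on those trailing 4 bytes.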

Re: Design/Schema questions

2011-07-27 Thread Mark
Thanks for the info. We are wondering if using syslog to aggregate these types of "log" files would be a safer alternative. On 7/26/11 8:01 PM, Michael Segel wrote: On Tue, Jul 26, 2011 at 7:39 AM, Mark wrote: So my first question is, would HBase fit our use case? If not can anyone offer

Re: Encoding when using Bytes.toBytes(String)?

2011-07-27 Thread Joey Echeverria
Correct. On Tue, Jul 26, 2011 at 11:07 PM, Steinmaurer Thomas wrote: > Hi! > > Thanks. So, it isn't a fixed width with 2 bytes in general, but rather > depends on the characters? If yes, I think this means I don't have to be > worried about at all? > > Thanks, > Thomas > > -Original Message--
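
The underlying encoding is UTF-8 (what HBase's Bytes.toBytes(String) uses), so the byte width varies per character rather than being a fixed two bytes. A plain-JDK illustration of the point:

```java
import java.nio.charset.StandardCharsets;

public class Utf8Width {
    // Same encoding HBase's Bytes.toBytes(String) applies.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

ASCII characters take one byte each, accented Latin characters like é take two, and characters like € take three.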

Re: how to solve this problem?

2011-07-27 Thread shanmuganathan.r
Hi Gan, thanks for your reply. I set the nproc and ulimit in /etc/security/limits.conf and 'session required pam_limits.so' in /etc/pam.d/common-session on all machines in my cluster. My /etc/security/limits.conf: hadoop(user) -

Re: Design/Schema questions

2011-07-27 Thread Doug Meil
Following up on what Stack said, make sure you read this.. http://hbase.apache.org/book.html#performance ... this chapter also refers to OpenTSDB for certain topics (especially key-design issues). On 7/26/11 4:27 PM, "Stack" wrote: >On Tue, Jul 26, 2011 at 1:08 PM, Mark wrote: >> Any reaso

RE: how to solve this problem?

2011-07-27 Thread Srikanth P. Shreenivas
Refer : http://wiki.apache.org/hadoop/Hbase/FAQ#A6 -Original Message- From: shanmuganathan.r [mailto:shanmuganatha...@zohocorp.com] Sent: Wednesday, July 27, 2011 4:43 PM To: user@hbase.apache.org Subject: how to solve this problem? Hi All, I am running the hbase in fully d

Re: how to solve this problem?

2011-07-27 Thread Gan, Xiyun
http://wiki.apache.org/hadoop/Hbase/FAQ#A6 On Wed, Jul 27, 2011 at 7:12 PM, shanmuganathan.r wrote: > Hi All, > > >            I am running the hbase in fully distributed mode. I want to create > a table with 500 rows . After creation of 122 rows the following exception is > thrown > > org.apac

how to solve this problem?

2011-07-27 Thread shanmuganathan.r
Hi All, I am running HBase in fully distributed mode. I want to create a table with 500 rows. After creation of 122 rows the following exception is thrown: org.apache.hadoop.hbase.ZooKeeperConnectionException: java.io.IOException: Too many open files at org.apache.hadoop.
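
The usual fix for "Too many open files" is to raise the file-descriptor (and process) limits in /etc/security/limits.conf for the user running the HBase/HDFS daemons, then log in again and restart the daemons so the new limits take effect. For example (the user name "hadoop" is illustrative):

```
hadoop  -  nofile  32768
hadoop  -  nproc   32000
```

Verify with `ulimit -n` in a fresh shell as that user before restarting the daemons.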

The Isolation and allocation part of HBase is available.

2011-07-27 Thread Liujia
Now the Isolation and allocation part of HBase is available. The issue is discussed in https://issues.apache.org/jira/browse/HBASE-4120; the source code and patched tarball are located at https://github.com/ICT-Ope/HBase_allocation. Two versions integrated with HBase 0.90.2 and 0.90.3 are availab

Doubt in hbase installation?

2011-07-27 Thread shanmuganathan.r
Hi All, I have some doubts about installing HBase in fully distributed mode. In my configuration HBase does not manage ZooKeeper. I added zoo.cfg as the ZooKeeper configuration file, and also specified some configuration in the hbase-site.xml file. Are the two configurations