Re: Hbase table querying using 'like' concept of SQL

2010-01-07 Thread Jeff Zhang
I think you can use PrefixFilter to do the like operation.
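
A minimal sketch of the PrefixFilter approach Jeff names, assuming the 0.20-era Java client; the "users" table and the "jo" prefix are hypothetical:

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.ResultScanner;
  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.filter.PrefixFilter;
  import org.apache.hadoop.hbase.util.Bytes;

  public class PrefixScan {
    public static void main(String[] args) throws IOException {
      // Rough equivalent of SQL "WHERE rowkey LIKE 'jo%'": start the scan at
      // the prefix and let PrefixFilter stop once rows no longer share it.
      HTable table = new HTable(new HBaseConfiguration(), "users");
      Scan scan = new Scan(Bytes.toBytes("jo"));
      scan.setFilter(new PrefixFilter(Bytes.toBytes("jo")));
      ResultScanner scanner = table.getScanner(scan);
      try {
        for (Result r : scanner) {
          System.out.println(Bytes.toString(r.getRow()));
        }
      } finally {
        scanner.close();
      }
    }
  }

Note this only covers LIKE with a leading literal on the row key; an arbitrary "%term%" match would need a different filter or a secondary index.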

Hbase table querying using 'like' concept of SQL

2010-01-07 Thread Sriram Muthuswamy Chittathoor
Hi: I am trying to evaluate this for an HBase table which has around 30 million records. The table stores user information like {accountName, screenName, firstName, lastName, dateOfBirth} etc. The idea is to be able to query the rows using a "like" clause. So something like select

RE: Hbase loading error -- Trailer 'header' is wrong; does the trailer size match content

2010-01-07 Thread Sriram Muthuswamy Chittathoor
Thanks. I disabled speculative execution and it seems to have worked. Will try it on bigger sizes to see if it is consistent. Currently I write some 20 million rows using an MR job (very simple rows though).
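
A minimal sketch of what "disabled speculative execution" amounts to, assuming the old mapred JobConf API in use at the time; speculative (duplicate) task attempts can leave half-written output files behind:

  import org.apache.hadoop.mapred.JobConf;

  public class NoSpeculation {
    // Turn off duplicate task attempts for a job that writes HFiles
    // directly, so a killed attempt can't leave a partial file.
    static void disable(JobConf conf) {
      conf.setMapSpeculativeExecution(false);
      conf.setReduceSpeculativeExecution(false);
    }
  }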

Re: MR in HBase

2010-01-07 Thread stack
This is a little tough. Do both tables have the same number of regions? Are you walking through the two tables serially in your mapreduce, or do you want to do random lookups into the second table dependent on the row you are currently processing in table one? St.Ack
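
A minimal sketch of the random-lookup variant St.Ack describes, assuming the 0.20 org.apache.hadoop.hbase.mapreduce API; the second table's name ("table2") is hypothetical:

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
  import org.apache.hadoop.hbase.mapreduce.TableMapper;
  import org.apache.hadoop.io.Text;

  public class TwoTableMapper extends TableMapper<ImmutableBytesWritable, Text> {
    private HTable secondTable;

    @Override
    protected void setup(Context context) throws IOException {
      // Open the second table once per task, not once per row.
      secondTable = new HTable(new HBaseConfiguration(), "table2");
    }

    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
        throws IOException, InterruptedException {
      // Point lookup into the second table, keyed by the row being mapped.
      Result other = secondTable.get(new Get(row.get()));
      // ... combine value and other, then context.write(...) as needed ...
    }
  }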

Re: Hbase loading error -- Trailer 'header' is wrong; does the trailer size match content

2010-01-07 Thread stack
I think the failed tasks are leaving around incomplete hfiles. Search for empty hfiles and try to correlate them to the failed tasks. If they match, just remove them before running loadtable.rb. Otherwise, try to figure out how the bad hfiles were written. Maybe the task log will give us a clue?
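
A minimal sketch of that empty-hfile search, assuming the Hadoop FileSystem API and that the job output directory is passed as the first argument:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class FindEmptyHFiles {
    // Print zero-length files under a directory tree; failed/killed task
    // attempts tend to leave these behind in the job output.
    static void scan(FileSystem fs, Path dir) throws Exception {
      for (FileStatus f : fs.listStatus(dir)) {
        if (f.isDir()) {
          scan(fs, f.getPath());
        } else if (f.getLen() == 0) {
          System.out.println("empty: " + f.getPath());
        }
      }
    }

    public static void main(String[] args) throws Exception {
      scan(FileSystem.get(new Configuration()), new Path(args[0]));
    }
  }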

MR in HBase

2010-01-07 Thread john smith
Hi all, my requirement is that I must read two tables (belonging to the same region server) in the same Map. Normally TableMap supports only one table at a time, and right now I am reading the entire second table in every one of the maps, which is a big overhead. So can anyone suggest some modificat

Re: Insert streamed data into hbase

2010-01-07 Thread Andrew Purtell
No, you misunderstand. **Sequential** insertion into HBase is not very efficient for large data. If you are basically inserting a high volume of data with row keys that are all adjacent to each other, this will focus all of the load on a single region server. If the keys for the data being inserted
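
A minimal sketch of one common workaround, assuming it is acceptable to pay with extra scans at read time: prefix the time-ordered key with a small hash bucket so adjacent inserts land on different regions. The eventSource field and bucket count here are hypothetical:

  import org.apache.hadoop.hbase.util.Bytes;

  public class SaltedKey {
    static final int BUCKETS = 16; // roughly match the number of region servers

    // Prefixing a bucket byte spreads otherwise-adjacent timestamp keys
    // across up to BUCKETS regions, at the cost of BUCKETS parallel scans
    // when reading a time range back.
    static byte[] rowKey(String eventSource, long timestamp) {
      byte bucket = (byte) ((eventSource.hashCode() & 0x7fffffff) % BUCKETS);
      return Bytes.add(new byte[] { bucket },
                       Bytes.toBytes(timestamp),
                       Bytes.toBytes(eventSource));
    }
  }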

Insert streamed data into hbase

2010-01-07 Thread kishore g
Hi, I see that inserting into HBase is not very efficient for large data. For event logging I see the solution explained in http://www.mail-archive.com/hbase-user@hadoop.apache.org/msg06010.html If my understanding is correct, this is applicable if the key is a timestamp. Is there a solution to ac

Re: Seeing errors after loading a fair amount of data. KeeperException$NoNodeException, IOException

2010-01-07 Thread Andrew Purtell
Thanks. Not that you might be missing something, but note that we do set up a separate ZooKeeper quorum ensemble using c1.medium instances, so ZK can be resource independent. - Andy

Re: Seeing errors after loading a fair amount of data. KeeperException$NoNodeException, IOException

2010-01-07 Thread Marc Limotte
Thanks, Robert. I'm using Fedora, so it probably works the same way as you suggest. Setting the ulimit and xcievers as described in the troubleshooting page didn't seem to help, but I'm going to try again with your suggestion. Marc

Re: Seeing errors after loading a fair amount of data. KeeperException$NoNodeException, IOException

2010-01-07 Thread Marc Limotte
I'm using my own scripts and methods to construct the EC2 cluster. I'll take a look through the src/contrib scripts, though; maybe there's a clue there about something I'm missing. Marc

Re: Seeing errors after loading a fair amount of data. KeeperException$NoNodeException, IOException

2010-01-07 Thread Andrew Purtell
Robert, Thanks for that. I updated the relevant section of the Troubleshooting page up on the HBase wiki with this advice. Best regards, - Andy

Re: Seeing errors after loading a fair amount of data. KeeperException$NoNodeException, IOException

2010-01-07 Thread Andrew Purtell
Marc, Are you using the HBase EC2 scripts (in src/contrib/ec2/), or is this a set of instances you are setting up using your own methods? Best regards, - Andy

RE: [Announce] HBql

2010-01-07 Thread Wim Van Leuven
Ah ... yes. That's the tech also included in Google App Engine. I'm in the process of looking into this on GAE.

Re: problem of secondary index

2010-01-07 Thread Jean-Daniel Cryans
I'm not an expert on the indexed contrib, but from looking a bit at the code, it just looks like that's how the secondary index table is built. Since the shell doesn't "talk" the same language as the contrib, it shows you the raw data. J-D

Re: HBase reading test

2010-01-07 Thread Jean-Daniel Cryans
Yeah, instantiating an HTable is very expensive since it pings .META. once. Glad you could resolve your issue! J-D 2010/1/6 : > Hi, > > I've found the root cause of why multiple reading users lower the HBase > performance. That's because I always new up an HTable in a shared function, which will make
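
A minimal sketch of the fix being described, assuming a single-threaded caller (HTable is not thread-safe) and a hypothetical "users" table: construct the HTable once and reuse it across calls.

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;

  public class UserReader {
    // Constructed once and reused; each new HTable costs a .META. lookup.
    private final HTable table;

    public UserReader() throws IOException {
      this.table = new HTable(new HBaseConfiguration(), "users");
    }

    public Result read(byte[] row) throws IOException {
      return table.get(new Get(row)); // reuses the cached region locations
    }
  }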

Re: Stargate API

2010-01-07 Thread Jean-Daniel Cryans
Unfortunately I have to punt this to 0.20.4, but looking into the issue, it seems that the row key is parsed in a different way because it's expecting something like this: path := '/' <table> '/' <row> '/' <column> ( ':' <qualifier> )? ( '/' <timestamp> )? I guess that a workaround would be to simply encode the row keys on the
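
A minimal sketch of the suggested workaround, percent-encoding row keys before they go into the request path; the table, column, and row key here are hypothetical:

  import java.net.URLEncoder;

  public class EncodedRowPath {
    public static void main(String[] args) throws Exception {
      // A row key containing '/' would otherwise be split by the path parser.
      String rowKey = "2010/01/07-event42";
      String encoded = URLEncoder.encode(rowKey, "UTF-8"); // "2010%2F01%2F07-event42"
      System.out.println("/mytable/" + encoded + "/info:data");
    }
  }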

Hbase loading error -- Trailer 'header' is wrong; does the trailer size match content

2010-01-07 Thread Sriram Muthuswamy Chittathoor
Hi: I am trying to run an MR job to output HFiles directly, containing 10 million records (very simple, one column family and very small). The job completes with some mention of killed jobs (reduce Failed/Killed Task Attempts > 0). Then I use the script loadtable.rb to load my hfiles into hbase a

RE: Seeing errors after loading a fair amount of data. KeeperException$NoNodeException, IOException

2010-01-07 Thread Gibbon, Robert, VF-Group
Maybe you are running Red Hat? Just changing limits.conf I think won't work, because RH has a maximum total open files across the whole system, which is 4096 by default, unless you do something like this too:

  echo "32768" > /proc/sys/fs/file-max
  service network restart

To make it permanent edit /et

Re: Seeing errors after loading a fair amount of data. KeeperException$NoNodeException, IOException

2010-01-07 Thread Marc Limotte
Thanks for the response, Ryan. I increased the ulimit and xceivers. My load job still dies, but the RegionServers stay up and running, and I can use the hbase shell to retrieve a row, so I guess HBase is still running. It doesn't seem to be swapping (still 2-3 GB free) and CPU usage is low: top - 02:44:02 up