I think you can use PrefixFilter to do the like operation.
On Fri, Jan 8, 2010 at 2:27 PM, Sriram Muthuswamy Chittathoor <
srir...@ivycomptech.com> wrote:
> Hi:
>
>
>
> I am trying to evaluate this for a hbase table which has around 30
> million records. The table stores user information like
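The PrefixFilter suggestion works because HBase stores rows sorted by key, so a prefix match is just a bounded range scan. A pure-Java sketch of that idea, with a TreeMap standing in for a region's sorted rows (in real code you'd set an org.apache.hadoop.hbase.filter.PrefixFilter on a Scan):

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class PrefixScanSketch {
    // Rows sorted by key, as HBase keeps them within a region.
    static NavigableMap<String, String> rows = new TreeMap<>();

    // A "like 'sri%'" query: walk keys from the prefix up to the first
    // key that can no longer start with it.
    static NavigableMap<String, String> scanPrefix(String prefix) {
        // smallest string strictly greater than every key with this prefix
        String stop = prefix.substring(0, prefix.length() - 1)
                + (char) (prefix.charAt(prefix.length() - 1) + 1);
        return rows.subMap(prefix, true, stop, false);
    }

    public static void main(String[] args) {
        rows.put("sriram", "row1");
        rows.put("srinivas", "row2");
        rows.put("stack", "row3");
        System.out.println(scanPrefix("sri").keySet()); // prints [srinivas, sriram]
    }
}
```

Note this only covers `like 'sri%'`; a leading-wildcard `like '%ram'` cannot use the key order and degenerates to a full scan.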
Hi:
I am trying to evaluate this for a hbase table which has around 30
million records. The table stores user information like
{accountName, screenName, firstName, lastName, dateOfBirth}, etc. The
idea is to be able to query the rows using a "like" clause.
So something like select
Thanks. I disabled speculative execution and it seems to have worked.
Will try it at bigger sizes to see if it is consistent. Currently I
write some 20 million rows using an MR job (very simple rows, though).
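For anyone else hitting this, the switch is a plain config knob; a sketch of the mapred-site.xml form (Hadoop 0.20 property names), which can equally be set per-job on the JobConf:

```xml
<!-- stop the framework from launching duplicate attempts that can
     double-write output such as HFiles -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>false</value>
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```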
From: saint@gmail.com [mailto:saint@gmail.com]
This is a little tough. Do both tables have the same number of regions? Are
you walking through the two tables serially in your mapreduce or do you want
to do random lookups into the second table dependent on the row you are
currently processing in table one?
St.Ack
On Thu, Jan 7, 2010 at 7:51 PM,
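The second option mentioned above (random lookups into table two keyed by the current row) usually hinges on opening the side table once per task, not once per record. A stand-in sketch with a plain Map playing the part of the second HTable; names and the join format are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Random lookups into a side table from inside the map. The HashMap
// stands in for an HTable that real code would open a single time in
// the mapper's configure()/setup(), never once per record.
public class TwoTableMapSketch {
    private Map<String, String> sideTable; // pretend HTable for table two

    // configure(): open the handle to table two exactly once
    public void configure(Map<String, String> tableTwo) {
        this.sideTable = tableTwo;
    }

    // map(): for each row of table one, do a point get into table two
    public String map(String rowKey, String valueFromTableOne) {
        String joined = sideTable.get(rowKey); // HTable.get() in real code
        return valueFromTableOne + "|" + (joined == null ? "" : joined);
    }
}
```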
I think the failed tasks are leaving around incomplete hfiles. Search for
empty hfiles, try and correlate them to the failed tasks. If they match,
just remove them before running loadtable.rb. Otherwise, try and figure how
the bad hfiles were written. Maybe the task log will give us a clue?
Di
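A local-filesystem sketch of the empty-hfile hunt described above (against HDFS you would walk the job output directory with the FileSystem API or `hadoop fs -lsr` rather than java.io.File):

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Find zero-length files that killed task attempts may leave behind in
// the MR output dir; these are the candidates to remove before running
// loadtable.rb.
public class EmptyHFileSketch {
    static List<String> findEmptyFiles(File dir) {
        List<String> empties = new ArrayList<>();
        File[] files = dir.listFiles();
        if (files == null) return empties;
        for (File f : files) {
            if (f.isDirectory()) empties.addAll(findEmptyFiles(f));
            else if (f.length() == 0) empties.add(f.getName());
        }
        return empties;
    }

    // Self-contained demo: one empty hfile, one good one.
    static List<String> demo() {
        try {
            File dir = Files.createTempDirectory("hfiles").toFile();
            new File(dir, "hfile-empty").createNewFile();
            try (FileWriter w = new FileWriter(new File(dir, "hfile-good"))) {
                w.write("data");
            }
            return findEmptyFiles(dir);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```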
Hi all,
My requirement is that , I must read two tables (belonging to the same
region server) in the same Map .
Normally TableMap supports only 1 table at a time and right now I am reading
the entire 2nd table in any one
of the maps , This is a big overhead . So can any one suggest some
modificat
No, you misunderstand. **Sequential** inserting into HBase is not very
efficient for large data. Basically, if you are inserting a high volume of
data with row keys that are all adjacent to each other, this will focus
all of the load on one region server only. If the keys for the data being
inserted
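One common way to break up adjacent keys, sketched below, is to prefix each key with a small hash-derived salt; the bucket count and key layout here are illustrative assumptions, and the trade-off is that salting breaks cheap range scans over the original key order:

```java
// Prefix each sequential row key with a salt derived from its hash so
// that adjacent keys land in different regions instead of hammering one
// region server. BUCKETS and the "NN-key" layout are illustrative.
public class SaltedKeySketch {
    static final int BUCKETS = 16;

    static String salt(String sequentialKey) {
        int bucket = Math.floorMod(sequentialKey.hashCode(), BUCKETS);
        // zero-padded so salted keys still sort byte-wise within a bucket
        return String.format("%02d-%s", bucket, sequentialKey);
    }

    public static void main(String[] args) {
        System.out.println(salt("20100108143000-event1"));
        System.out.println(salt("20100108143001-event2"));
    }
}
```

To read everything back you then scan all BUCKETS prefixes and merge, which is why the bucket count should stay small.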
Hi,
I see that inserting into HBase is not very efficient for large data.
For event logging I see the solution explained in
http://www.mail-archive.com/hbase-user@hadoop.apache.org/msg06010.html
If my understanding is correct, this is applicable if the key is a timestamp. Is
there a solution to ac
Thanks.
Not that you might be missing something, but note that we do set up a separate
ZooKeeper quorum ensemble using c1.medium instances, so ZK can be resource
independent.
- Andy
- Original Message
> From: Marc Limotte
> To: hbase-user@hadoop.apache.org
> Sent: Thu, January 7,
Thanks, Robert.
I'm using Fedora, so it probably works the same way as you suggest. Setting
the ulimit and xcievers as described in the troubleshooting didn't seem to
help. But I'm going to try again with your suggestion.
Marc
On Thu, Jan 7, 2010 at 12:56 PM, Andrew Purtell wrote:
> Robert,
I'm using own scripts and methods to construct the EC2 cluster. I'll take a
look through the src/contrib scripts, though, maybe there's a clue there
about something I'm missing.
Marc
On Thu, Jan 7, 2010 at 12:49 PM, Andrew Purtell wrote:
> Marc,
>
> Are you using the HBase EC2 scripts (in src/
Robert,
Thanks for that. I updated the relevant section of the Troubleshooting page
up on the HBase wiki with this advice.
Best regards,
- Andy
- Original Message
> From: "Gibbon, Robert, VF-Group"
> To: hbase-user@hadoop.apache.org
> Sent: Thu, January 7, 2010 5:04:58 AM
> Subj
Marc,
Are you using the HBase EC2 scripts (in src/contrib/ec2/), or is this a set
of instances you are setting up using your own methods?
Best regards,
- Andy
- Original Message
> From: Marc Limotte
> To: hbase-user@hadoop.apache.org
> Sent: Thu, January 7, 2010 12:39:48 AM
> Su
Ah ... yes. That's the tech also included in Google App Engine. I'm in the
process of looking into this on GAE.
-Original Message-
From: Fernando Padilla [mailto:f...@alum.mit.edu]
Sent: woensdag 6 januari 2010 1:40
To: hbase-user@hadoop.apache.org
Subject: Re: [Announce] HBql
I haven't
I'm not an expert on the indexed contrib but by looking a bit a the
code, it just looks like that's how the secondary index table is
built. Since the shell doesn't "talk" the same language as the
contrib, it shows you the raw data.
J-D
2010/1/6 :
>
>
> I've created a table named table1 with a se
Yeah, instantiating an HTable is very expensive since it pings .META.
once. Glad you could resolve your issue!
J-D
2010/1/6 :
> Hi,
>
> I've found the root cause of why multiple reading users lower the hbase
> performance.
> That's because I always created a new HTable in a shared function, which will make
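The fix amounts to building the expensive handle once and reusing it. A pure-Java sketch of the pattern, where ExpensiveHandle stands in for HTable (which is also not safe to share across threads, hence one handle per thread):

```java
// Cache the costly handle instead of constructing one inside every call
// to the shared function. The constructor counter stands in for the
// .META. ping an HTable pays on construction.
public class HandleCacheSketch {
    static class ExpensiveHandle {
        static int constructed = 0; // counts the ".META. pings"
        ExpensiveHandle() { constructed++; }
    }

    private static final ThreadLocal<ExpensiveHandle> HANDLE =
            ThreadLocal.withInitial(ExpensiveHandle::new);

    // Cheap after the first call on each thread.
    static ExpensiveHandle getHandle() {
        return HANDLE.get();
    }
}
```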
Unfortunately I have to punt this to 0.20.4 but, looking into the
issue, it seems that the row key is parsed in a different way because
it's expecting something like this:
path := '/' '/' '/' ( ':' )?
( '/' )?
I guess that a workaround would be to simply encode the row keys on
the
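If the workaround is to encode row keys so they cannot collide with the path's own '/' and ':' delimiters, URL-escaping is one way to do it (an illustrative choice on the client side, not necessarily what 0.20.4 will implement):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// URL-escape row keys before putting them in the request path, so '/'
// and ':' inside a key survive the path grammar; decode on the way out.
public class RowKeyEncodeSketch {
    static String encode(String rowKey) {
        try {
            return URLEncoder.encode(rowKey, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always present
        }
    }

    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e);
        }
    }
}
```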
Hi:
I am trying to run a MR job to output HFiles directly containing 10
million records (very simple 1 column family and very small). The job
completes with some mention of killed jobs (reduce Failed/Killed
Task Attempts > 0). Then I use the script loadtable.rb to load my
hfiles into hbase a
Maybe you are running Red Hat? Just changing limits.conf I think won't
work because RH has a maximum total open files across the whole system,
which is 4096 by default, unless you do something like this too
echo "32768" > /proc/sys/fs/file-max
service network restart
To make it permanent edit /et
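The permanent counterparts usually live in /etc/security/limits.conf and /etc/sysctl.conf; a sketch, where the `hadoop` user name and 32768 limit are assumptions to adapt:

```
# /etc/security/limits.conf -- per-process open file limit for the
# user that runs the daemons ("hadoop" is an assumption)
hadoop  soft  nofile  32768
hadoop  hard  nofile  32768

# /etc/sysctl.conf -- makes the system-wide file-max bump survive reboots
fs.file-max = 32768
```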
Thanks for the response, Ryan.
I increased ulimit and xceivers. My load job still dies, but the
RegionServers stay up and running, and I can use the hbase shell to retrieve
a row, so I guess HBase is still running.
Doesn't seem to be swapping (still 2-3gig free)
CPU usage is low
top - 02:44:02 up