Well, our example should have had that for an import. If you get the
example working, paste it to a JIRA and I'll update our doc.
St.Ack
On Wed, Jun 8, 2011 at 11:32 PM, praveenesh kumar wrote:
> Oh..!!!
> Sorry Sorry.. My mistake..
> I was searching org.apache.hadoop.conf.Configuration in the H
On Wed, Jun 8, 2011 at 11:33 PM, Ted Dunning wrote:
> Otis,
>
> We should talk some time about MapR. We did a test with Stack where we had
> an hbase instance with very active writes going on. We did successive
> snapshots with no interruption or pause in hbase operations and were able to
> demo
Otis,
We should talk some time about MapR. We did a test with Stack where we had
an hbase instance with very active writes going on. We did successive
snapshots with no interruption or pause in hbase operations and were able to
demonstrate that each snapshot was usable to restore hbase to the sta
Oh..!!!
Sorry Sorry.. My mistake..
I was searching for org.apache.hadoop.conf.Configuration in the HBase API.. It's
in the hadoop-core.jar file..
My mistake..
Extremely Sorry.. :-)
On Thu, Jun 9, 2011 at 12:00 PM, praveenesh kumar wrote:
> The link you sent me shows the HBASE 0.91.0-SNAPSHOT API
>
On Wed, Jun 8, 2011 at 4:39 PM, Abhijit Pol wrote:
> Recently we observed that our "get" latencies keep increasing over the
> period (and eventually flatten out at higher value) and if we restart hbase
> server, latencies go back to good state (low values) and start increasing
> again.
>
What hap
The link you sent me shows the HBASE 0.91.0-SNAPSHOT API,
and in the link
http://hbase.apache.org/apidocs/overview-summary.html#overview_description,
I am not able to see org.apache.hadoop.conf.Configuration Class.
If I am following the given example --
http://hbase.apache.org/apidocs/org/apach
We leave it at DEBUG level. Good for figuring out issues. Is there
client tracing in your datanode logs? You might want to disable this
if it's on. Here is what I add to the hadoop conf/log4j.properties:
log4j.logger.org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace=WARN
St.Ack
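For context, the conf/log4j.properties fragment being described might look like this (only the clienttrace line comes from the message above; the rootLogger line is a typical Hadoop default and may differ in your install):

```properties
# Typical Hadoop default: everything at INFO
log4j.rootLogger=INFO,console
# Quiet the per-request datanode client trace lines (very chatty under load)
log4j.logger.org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace=WARN
```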
2011/6/8
Where are you reading? I just checked the javadoc,
http://hbase.apache.org/apidocs/overview-summary.html#overview_description,
and it seems to be current.
St.Ack
On Wed, Jun 8, 2011 at 11:06 PM, praveenesh kumar wrote:
> Hello guys,
>
> I just started doing HBase programming. I am using HBase 0
Stack, Thanks for your reply.
In your production environment, do you set the log level to INFO or WARN?
-Original Message-
From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack
Sent: June 9, 2011 12:36
To: user@hbase.apache.org
Subject: Re: a question about log level
In the conf/log4j.properties
St.Ack
On We
Hello guys,
I just started doing HBase programming. I am using HBase 0.90.3 API.
All tutorials I am getting are based on previous version.
I am not able to create a conf object using the HBase 0.90.3 API..
In the HBase 0.90.3 API link, it says HBaseConfiguration is using
org.apache.hadoop.conf.Conf
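In the 0.90.x API, the deprecated `new HBaseConfiguration()` constructor gave way to a factory method that returns a plain Hadoop `Configuration` — which is why the class lives in hadoop-core.jar rather than the HBase jar. A minimal sketch, assuming an hbase-site.xml on the classpath; the table, family, and qualifier names are invented for illustration, and it needs a running cluster plus the hbase and hadoop-core jars:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class GetExample {
  public static void main(String[] args) throws Exception {
    // Reads hbase-default.xml / hbase-site.xml from the classpath;
    // returns org.apache.hadoop.conf.Configuration (from hadoop-core.jar).
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable"); // "mytable" is a placeholder
    Result r = table.get(new Get(Bytes.toBytes("row1")));
    System.out.println(
        Bytes.toString(r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q"))));
    table.close();
  }
}
```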
Can anyone comment on the performance of the "Cluster Compute Instances" of EC2
which they released lately? They provide 10 Gigabit Ethernet, which was
the main issue with the previous instances, and they have customized these
instances for low-latency inter-node communication.
We are planning to s
On Wed, Jun 8, 2011 at 9:37 PM, praveenesh kumar wrote:
> But my problem is I want to keep the entry of localhost in my /etc/hosts
> file..
> Is there any parameter that we can put in hbase-site.xml so that RPC starts
> listening on regionserver's actual IP rather than default localhost. ??
>
No.
On Wed, Jun 8, 2011 at 9:45 PM, James Ram wrote:
> Is there any way to add a new HQuorum to the cluster dynamically?
>
If HQuorum == HRegionServer, then yes. Just make sure it has the same
config as the other members of the cluster and start it.
St.Ack
On Wed, Jun 8, 2011 at 10:10 PM, James Ram wrote:
> Hi,
> Thanks for your reply. So does HBase automatically reassign to another
> regionserver or do we have to do it manually?
>
It does it automatically.
St.Ack
Hi,
Thanks for your reply. So does HBase automatically reassign to another
regionserver or do we have to do it manually?
On Thu, Jun 9, 2011 at 10:18 AM, Chris Tarnas wrote:
> What is an HQuorum?
>
> If you mean a regionserver then possibly your application is attempting to
> get data that was on
What is an HQuorum?
If you mean a regionserver then possibly your application is attempting to get
data that was on a region hosted by the failed regionserver, and in that case
you need to make sure your application can deal with the connection failure and
wait for the regions to be reassigned to
Hi,
Maybe this has been asked before. I couldn't find much information on this.
We have an application where multiple instances across different machines could
try to insert a new row with the same row key into a global HBase table at the
same time. If the row has been inserted by one instance
Is there any way to add a new HQuorum to the cluster dynamically?
--
With Regards,
Jr.
Hi,
We are running a 5-machine HBase cluster. We have noticed that whenever an
HQuorum fails on one machine, the entire application that is running on
HBase crashes. Is there anything to do about this?
--
With Regards,
Jr.
Hi..
I guess the problem is that one of my regionservers has an entry for localhost
in its /etc/hosts file.
My log says:
2011-06-08 15:24:27,588 INFO org.apache.hadoop.hbase.regionserver.HRegionServer:
Serving as ub8,60020,1307526863668, RPC listening on /127.0.0.1:60020,
sessionid=0x306eacb5
In the conf/log4j.properties
St.Ack
On Wed, Jun 8, 2011 at 9:02 PM, Gaojinchao wrote:
> How should we set the log level for production?
> Does anyone have some experience?
> I want to use INFO.
>
>
>
Thanks, I have seen it. Once I verify a viable solution, I will update this
thread.
On Jun 8, 2011 5:57 PM, "Otis Gospodnetic"
wrote:
> There is this post about HBase backup options
> http://blog.sematext.com/2011/03/11/hbase-backup-options/ . I hope it
> helps.
>
> Otis
>
> Sematext :: htt
How should we set the log level for production?
Does anyone have some experience?
I want to use INFO.
Hi there-
Summary comment:
1) Preference
Several people in this thread have suggested approaches (map-side memory join,
multi-get, temp files), all of which have merit and have advantages in certain
situations. Kudos to the dist-list for chiming in. The "right" approach
depends on the spec
The assumption was that regions were not evenly distributed prior to
restarting.
If they were, the user wouldn't select this policy.
We can make this policy take effect only once - retain assignment is selected
following this new policy.
Of course the dynamic portion of the load balancer needs to select the unde
Jim, I'd be interested in hearing your experience with Whirr when you try
it. I've been testing it the last couple of days and I haven't been able to
get the out-of-the-box hadoop recipe to work when it comes up (the namenode
doesn't have any datanodes configured although they are all up and runnin
Sam, would HBaseWD help you here?
See
http://search-hadoop.com/m/AQ7CG2GkiO/hbasewd&subj=+ANN+HBaseWD+Distribute+Sequential+Writes+in+HBase
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
Hadoop ecosystem search :: http://search-hadoop.com/
- Original Message
I wouldn't rely on any dates. :) I'd look at the number of remaining open JIRA
issues with that target version.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
Hadoop ecosystem search :: http://search-hadoop.com/
- Original Message
> From: "Ma, Ming"
> T
There is this post about HBase backup options
http://blog.sematext.com/2011/03/11/hbase-backup-options/ . I hope it helps.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: Manoj Mur
We are trying to do this online as downtime is not an option. Good point,
nonetheless.
On Jun 8, 2011 3:48 PM, "Joey Echeverria" wrote:
> Can you afford some down time? If so, you could minor compact, disable
> the table, distcp, and then enable the table.
>
> -Joey
>
> On Wed, Jun 8, 2011 at 1:22
Hi,
Where can I find the targeted release date of 0.92.0?
Thanks.
Ming
We are on hbase 0.90 and have been using hbase for a while to perform
high-volume data lookups using the hbase client (no map-reduce involved).
Recently we observed that our "get" latencies keep increasing over the
period (and eventually flatten out at higher value) and if we restart hbase
server, latencies go b
Thank you for the explanation, I think I understand the suggestion now. I
completely agree with you that this would be effective for cases that you can
do the join of the sorted values in memory.
A small tweak would make this more generic and effective for any size. If you
had two separate Map
Can you afford some down time? If so, you could minor compact, disable
the table, distcp, and then enable the table.
-Joey
On Wed, Jun 8, 2011 at 1:22 PM, Manoj Murumkar wrote:
> Hi,
>
> We're trying to come up with the right strategy for backing up HBase tables.
> The assumption is that sizes of tables
I believe this is what Eran is suggesting:
Table A
---
Row1 (has joinVal_1)
Row2 (has joinVal_2)
Row3 (has joinVal_1)
Table B
---
Row4 (has joinVal_1)
Row5 (has joinVal_3)
Row6 (has joinVal_2)
Mapper receives a list of input rows (union of both input tables in any
order), and produces (=
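A minimal in-memory sketch of the grouping step being described (plain Java, no Hadoop dependencies; the row and join-value names match the toy tables above, and the string format of the output is invented for illustration):

```java
import java.util.*;

public class JoinSketch {
    // Returns joined pairs "<rowA>+<rowB> on <joinVal>" for the toy tables above.
    static List<String> join() {
        // Rows tagged with their source table, keyed by join value -- this
        // mirrors what the mapper would emit after the union: (joinVal, taggedRow).
        String[][] rows = {
            {"A", "Row1", "joinVal_1"}, {"A", "Row2", "joinVal_2"}, {"A", "Row3", "joinVal_1"},
            {"B", "Row4", "joinVal_1"}, {"B", "Row5", "joinVal_3"}, {"B", "Row6", "joinVal_2"},
        };
        // Group by join value, A-side and B-side kept separate, as a reducer
        // would see them after the shuffle.
        Map<String, List<String>> sideA = new TreeMap<>(), sideB = new TreeMap<>();
        for (String[] r : rows) {
            (r[0].equals("A") ? sideA : sideB)
                .computeIfAbsent(r[2], k -> new ArrayList<>()).add(r[1]);
        }
        // Emit the per-join-value cross product (an inner join: joinVal_3 has
        // no A-side row, so Row5 drops out).
        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : sideA.entrySet()) {
            for (String a : e.getValue()) {
                for (String b : sideB.getOrDefault(e.getKey(), Collections.emptyList())) {
                    joined.add(a + "+" + b + " on " + e.getKey());
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        System.out.println(join());
        // [Row1+Row4 on joinVal_1, Row3+Row4 on joinVal_1, Row2+Row6 on joinVal_2]
    }
}
```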
Unless I am mistaken... get() requires a row key, right?
And you can join tables on column data which isn't in the row key, right?
So how do you do a get()? :-)
Sure there is more than one way to skin a cat. But if you want to be
efficient... You will create a set of unique keys based on the col
If I understand the history correctly, round-robin was used in .89, but
"retains" is the policy for .90+.
My 2-cents is that if/when region-shuffling is required, I'd rather do that
with another utility and keep that out of cluster startup.
-Original Message-
From: saint@gmail.com [
On Wed, Jun 8, 2011 at 12:40 AM, tsuna wrote:
> On Tue, Jun 7, 2011 at 7:56 PM, Kjew Jned wrote:
> > I was studying the OpenTSDB example, where they also prefix the row keys
> with
> > event id.
> >
> > I further modified my row keys to have this ->
> >
> >
> >
> > The uuid is fairly unique
Let's make a toy example to see if we can capture all of the edge conditions:
Table A
---
Key1 joinVal_1
Key2 joinVal_2
Key3 joinVal_1
Table B
---
Key4 joinVal_1
Key5 joinVal_3
Key6 joinVal_2
Now, assume that we have a mapper that takes two values, one row from A, and
one row from B. Ar
On Wed, Jun 8, 2011 at 12:50 PM, Ted Yu wrote:
> I am thinking of creating a new policy for region assignment at cluster
> startup which assigns regions from each table in round-robin fashion.
>
Don't we want to retain assignments on startup since that will ensure
greatest locality of data? Roun
In trunk this behavior has been improved.
Load balancer would move the youngest region off heavily loaded region
server.
See http://zhihongyu.blogspot.com/2011/04/load-balancer-in-hbase-090.html
I am thinking of creating a new policy for region assignment at cluster
startup which assigns regions
Yes, thanks it worked! Have no idea how I didn't come across the method! Thank
you for the tip!
I'd like to clarify, again, what I'm trying to do and why I still think it's
the best way to do it.
I want to join two large tables. I'm assuming, and this is the key to the
efficiency of this method, that: 1) I'm getting a lot of data from table A,
something which is close enough to a full table s
On Wed, Jun 8, 2011 at 1:44 AM, bijieshan wrote:
> Thanks Suraj.
> Yes, it's a better method, but I haven't tested that yet.
> So if we use HTablePool, it seems we don't need to delete ZK connections
> manually? Is that correct?
>
Yes.
St.Ack
Grep for the missing file in the namenode log and see if you can figure
from mentions therein what happened with this file. Had the master
taken it from you because it was processing a server crash?
St.Ack
2011/6/8 Gaojinchao :
> Two regionservers(My cluster is 7 regionsever / datanode) crashed, saying
On Wed, Jun 8, 2011 at 9:00 AM, Hiller, Dean x66079
wrote:
> We have certain tables with under 10 rows, one under 200 rows and one with
> 1,000,000 rows. We have found out that having a copy/cache on each node is
> EXTREMELY fast for our batch processing since these copies of data are local
>
Looks like you need to copy a commons-configuration jar into hbase's lib; this
version of hadoop seems to depend on it:
java.lang.NoClassDefFoundError: org/apache/commons/configuration/Configuration
And you are clear that this version of hadoop does not have
sync/append so hbase will lose data on crash.
St.Ack
J-D,
Thanks for the info. I copied the appropriate hadoop jar file to the lib
directory (and renamed the original one). I wasn't able to figure out why
zookeeper wasn't running on my master server so I launched zookeeper directly
and set HBASE_MANAGES_ZK to false. (And since I am running
Hi,
We're trying to come up with the right strategy for backing up HBase tables.
The assumption is that table sizes will not grow beyond a few hundred GB.
Currently, we're employing exports (writing onto the HDFS of another cluster
directly), but it is taking too long (~5 hours to export ~5GB of data). Are
the
A problem that will be fixed in 0.90.4 is that once hbck finds one
issue, all checks that follow emit 'INCONSISTENCY'. A quick perusal
of the below has it that hbck is not able to reach a server. Can you
check into that? It's using an IP rather than a hostname. Why is that?
ips in the regionserv
On Wed, Jun 8, 2011 at 10:01 AM, James Hammerton
wrote:
> Thanks Stack,
>
> I take it you mean get hold of check_meta.rb from a recent version and alter
> it to find the HRIs?
>
Yes. Alter it to run in 0.20.6.
St.Ack
Scratch that. You mean alter the find_overlapping_regions script to use
.META. to find the overlapping regions don't you?
James
On Wed, Jun 8, 2011 at 6:01 PM, James Hammerton <
james.hammer...@mendeley.com> wrote:
> Thanks Stack,
>
> I take it you mean get hold of check_meta.rb from a recent ve
Thanks Stack,
I take it you mean get hold of check_meta.rb from a recent version and alter
it to find the HRIs?
Regards,
James
On Wed, Jun 8, 2011 at 5:55 PM, Stack wrote:
> Pull it in. You'll have to massage a little but rather than do the
> indirect HTable.getStartKeys (which turns around
Pull it in. You'll have to massage a little but rather than do the
indirect HTable.getStartKeys (which turns around and reads meta), read
.META. directly and get the HRIs yourself.
St.Ack
On Wed, Jun 8, 2011 at 9:51 AM, James Hammerton
wrote:
> Hi,
>
> I've checked /usr/lib/hbase/bin and it doe
On Wed, Jun 8, 2011 at 9:25 AM, Zhong, Sheng wrote:
> Could anyone give me a suggestion for a Hadoop/HBase upgrade? We're
> currently using apache hadoop 0.20.2 + hbase 0.20.3 + zookeeper-3.2.2.
> Has anyone worked with the latest stable version of hadoop-0.20.203.0rc1 +
> Hbase 0.90.2, and will Hbase 0.90
Hi,
I've checked /usr/lib/hbase/bin and it doesn't have check_meta.rb.
Also, HTable doesn't have getHRegionInfos in 0.20.6.
Regards,
James
On Wed, Jun 8, 2011 at 5:46 PM, Stack wrote:
> Do you have check_meta.rb in 0.20.6 (I don't remember? I think you
> do). Start with that?
>
> Otherwise
Do you have check_meta.rb in 0.20.6 (I don't remember? I think you
do). Start with that?
Otherwise, here:
keys = wanted_table.getStartEndKeys
In 0.20.6 can you get HRegionInfos instead of start keys? That'd be
more useful. They would have the encoded name.
> We'd ideally like to feed the r
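A rough sketch of reading .META. directly for the HRegionInfos, as suggested above (written against the 0.20-era client API from memory and untested; it needs a running cluster, and `getEncodedName` returns an int in 0.20 rather than the String of later versions):

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Writables;

public class ListRegions {
  public static void main(String[] args) throws Exception {
    // Scan the catalog table directly instead of going through
    // HTable.getStartEndKeys, which reads .META. anyway.
    HTable meta = new HTable(new HBaseConfiguration(), ".META.");
    ResultScanner scanner = meta.getScanner(new Scan());
    for (Result row : scanner) {
      byte[] bytes = row.getValue(Bytes.toBytes("info"), Bytes.toBytes("regioninfo"));
      if (bytes == null) continue; // skip rows without a serialized HRegionInfo
      HRegionInfo hri = Writables.getHRegionInfo(bytes);
      // The encoded name is what Merge wants.
      System.out.println(hri.getRegionNameAsString() + " -> " + hri.getEncodedName());
    }
    scanner.close();
  }
}
```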
Thanks, Stack.
The context is that we have a script, find_overlapping_regions.rb at:
https://github.com/Mendeley/hbase-scripts/blob/master/find_overlapping_regions.rb
We'd ideally like to feed the results into another script (to be written)
that will call org.apache.hbase.util.Merge. I've been lo
On Wed, Jun 8, 2011 at 9:22 AM, James Hammerton
wrote:
> Given the tableName, startKey and endKey for a region how do I get hold of
> the encodedName?
>
I suppose it depends on the context.
If reading .META., then if you deserialize the info:regioninfo into an
HRegionInfo instance, then you can
Hey,
Could anyone give me a suggestion for a Hadoop/HBase upgrade? We're
currently using apache hadoop 0.20.2 + hbase 0.20.3 + zookeeper-3.2.2.
Has anyone worked with the latest stable version of hadoop-0.20.203.0rc1 +
Hbase 0.90.2, and will Hbase 0.90.2 have compatibility issues with
hadoop-0.20.203.0rc1? I
Hi,
Given the tableName, startKey and endKey for a region how do I get hold of
the encodedName?
We have code for identifying overlapping regions that outputs triples of the
form tableName, startKey and endKey for each region, but it looks like the
Merge command (we're using 0.20.6) requires the
t
We have certain tables with under 10 rows, one under 200 rows and one with
1,000,000 rows. We have found out that having a copy/cache on each node is
EXTREMELY fast for our batch processing since these copies of data are local
AND in-memory. The issue I am struggling with is the best way to ev
> From: Ted Dunning
> Lots of people are moving towards more spindles per box to
> increase IOP/s
>
> This is particular important for cases where the working
> set gets pushed out of memory.
Indeed.
Our spec is more like 12x 500 GB SATA disks, to push IOPS and more evenly
balance CPUs (fast du
+1 to Matt's opinion (if possible?).
I am interested in your use case, sounds very impressive by the stats you
gave. You said 1000 tables?
Looking forward to see what optimizations/config tweaks you had to do to
cope up with your read/write requirements.
Thanks,
Himanshu
On Wed, Jun 8, 2011 at 8
If it is possible I think any slides or even a video would be very
interesting to some of us that can't travel. I, for one, would love to hear
how you do it.
Thanks!
On Tue, Jun 7, 2011 at 6:07 PM, Jack Levin wrote:
> Hey Guys, I plan to do a tech talk here at ImageShack, on how we store
> and
Re: " With respect to Doug's posts, you can't do a multi-get off the bat"
That's an assumption, but you're entitled to your opinion.
-Original Message-
From: Michael Segel [mailto:michael_se...@hotmail.com]
Sent: Monday, June 06, 2011 10:08 PM
To: user@hbase.apache.org
Subject: RE: How t
Hello guys,
Well.. I am using a 12-node hbase cluster.
I can see all the nodes running on Hbase Web-UI.
But when I am running hbase hbck , I am getting the following output :
hadoop@ub13:/usr/local/hadoop$ hbase hbck
11/06/08 15:30:52 INFO zookeeper.ZooKeeper: Client
environment:zookeeper.version
Hi Geoff,
Since hbase balances at the cluster level rather than per table, it may happen
that all the regions for one table end up on the same region server. The reason
for this may be the way hbase does table splits. If a region exceeds the
configured maximum size the region is split into two, but
Two regionservers (my cluster is 7 regionservers / datanodes) crashed, saying
that a file did not exist and that a lease had expired (log detail below). I
tried to find this in the mailing list; it seems different:
Hbase version: 0.90.3
HDFS version: cloudera 0.20.2+320
OS: swappiness :0 and ulimit
Thanks Suraj.
Yes, it's a better method, but I haven't tested that yet.
So if we use HTablePool, it seems we don't need to delete ZK connections
manually? Is that correct?
Thanks!
Jieshan Bean
--
How about using HTablePool - doesn't that work for you?
http://hbase.apache.org/apidocs/o
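A sketch of the 0.90-era HTablePool usage being suggested (the table name and pool size are invented; it needs a running cluster). The pool shares the underlying connection across checked-out tables, which is why there is no per-HTable ZooKeeper connection to clean up manually:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.HTablePool;

public class PoolExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Cache at most 10 HTable instances per table name.
    HTablePool pool = new HTablePool(conf, 10);
    HTableInterface table = pool.getTable("mytable"); // "mytable" is a placeholder
    try {
      // ... gets/puts as usual ...
    } finally {
      pool.putTable(table); // return to the pool instead of closing
    }
  }
}
```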
On Tue, Jun 7, 2011 at 7:56 PM, Kjew Jned wrote:
> I was studying the OpenTSDB example, where they also prefix the row keys with
> event id.
>
> I further modified my row keys to have this ->
>
>
>
> The uuid is fairly unique and random.
> Is appending a uuid to the event id help the distribut