Instead of using a table, how about using the available ZooKeeper
service itself? It can hold small bits of information pretty well.
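For what it's worth, a rough sketch of that idea with the plain ZooKeeper Java client might look like the following. The connect string, znode path, and payload are all made up for illustration:

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Illustrative only: store a small piece of job metadata in a znode
// instead of an HBase table. "zkhost:2181" and "/jobs/scope" are made up.
public class ZkMetadata {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zkhost:2181", 3000, null);
        byte[] scope = "lastProcessedRow=row-12345".getBytes();
        if (zk.exists("/jobs/scope", false) == null) {
            zk.create("/jobs/scope", scope,
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        } else {
            zk.setData("/jobs/scope", scope, -1); // -1 = ignore version check
        }
        byte[] read = zk.getData("/jobs/scope", false, null);
        System.out.println(new String(read));
        zk.close();
    }
}
```

Keep in mind znode payloads are meant to stay small (the server-side default limit is 1 MB), so this fits "small bits of information" but not bulk data.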
On Sat, Mar 26, 2011 at 12:29 AM, Vishal Kapoor
wrote:
> David,
> how about waking up my second map reduce job as soon as I see some
> rows updated in that table.
I came upon this independently today, actually. Filed ZOOKEEPER-1030
On Fri, Mar 25, 2011 at 3:21 PM, Alex Baranau wrote:
> Right, from the same host (same ip). But in HBase I think the default max
> number of connections is set to 30. Please correct me if I'm wrong. If I'm
> right, then we should probably change either of the defaults. No?
Right, from the same host (same ip). But in HBase I think the default max
number of connections is set to 30. Please correct me if I'm wrong. If I'm
right, then we should probably change either of the defaults. No?
Alex Baranau
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hado
Look at http://yahoo.github.com/oozie/. Maybe it will help you.
2011/3/25 Vishal Kapoor
> Can someone give me a direction on how to start a map reduce based on
> the outcome of another map reduce? (Nothing is common between them apart
> from the fact that the first decides the scope of the second.)
>
> I might also want to set the scope of my second map reduce
I see what you are asking. I'm using a stand-alone ZooKeeper, not the
"internal" one managed by HBase, so it reads its configuration only from
zoo.cfg. And it seems that by default (when maxClientCnxns is absent from
it) it acts like maxClientCnxns=10. I'd expect it to be unlimited when this
property is omitted. At le
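For reference, the setting in question lives in zoo.cfg. A hedged example (the value 60 is only an illustration; size it to your expected per-host connection count):

```
# zoo.cfg
# Max concurrent connections a single client IP may open to this server.
# When the line is absent, the default here appears to be 10, not unlimited.
maxClientCnxns=60
```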
The simplest way to do this is with a thread that executes the jobs you want to
run synchronously
Job job1 = ...
job1.waitForCompletion(true);
Job job2 = ...
job2.waitForCompletion(true);
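Sketched generically below: the same run-in-order logic, with Callables standing in for `job.waitForCompletion(true)` (which is what a real Hadoop driver would call and which returns a boolean success flag). `JobChain` and `runInOrder` are illustrative names, not Hadoop API:

```java
import java.util.List;
import java.util.concurrent.Callable;

// Generic sketch of the sequential-driver pattern. In a real Hadoop driver,
// each Callable would wrap job.waitForCompletion(true); the chain stops as
// soon as a job reports failure, so the second job only runs if the first
// one succeeded.
class JobChain {
    static boolean runInOrder(List<Callable<Boolean>> jobs) throws Exception {
        for (Callable<Boolean> job : jobs) {
            if (!job.call()) {
                return false; // a job failed; skip the rest of the chain
            }
        }
        return true; // all jobs completed successfully
    }
}
```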
-Original Message-
From: Vishal Kapoor [mailto:vishal.kapoor.
I actually created another Configuration object (cfg) within the map()
method itself, so it still worked. Now I have a much better idea of how the
Mapper is called.
Moving the HTable object configuration to the setup() method was the right
call. Thanks!
On Fri, Mar 25, 2011 at 12:01 PM, Buttler
Well there goes my weekend. :-P
> From: buttl...@llnl.gov
> To: user@hbase.apache.org
> Date: Fri, 25 Mar 2011 10:00:26 -0700
> Subject: RE: How could I re-calculate every entries in hbase efficiently
> through mapreduce?
>
> I would certainly find it useful if you wrote such a blog post.
On Fri, Mar 25, 2011 at 12:36 PM, Alex Baranau wrote:
> As far as I know HBase configured to initiate up to 30 connections by
> default, and maxClientCnxns for Zookeeper was meant to be 30 as well.
Yes
I'm not sure how it'd go from 30 to 10 (Is 10 the default connections
for zk?). Is it possibl
Hello,
I set up a test HBase+Hadoop cluster yesterday and got the following
error in the logs while running an MR job (which internally creates an
HTable for the Reducer):
KeeperErrorCode = ConnectionLoss for /hbase
Then I went to Zookeeper logs and found this:
2011-03-24 22:41:49,884 - WARN [NIOServerC
Thanks, J-D, that solved part of the problem. The servers have stopped
crashing and the master now properly detects when a RS goes down. By the
way, since the RS itself detects this event, it may be a good idea to have
it stop the server when this happens; as it stands this is a significant
configuration issue.
However n
What version of hbase?
How many regions?
Can you get a list? (Scan .META.)
You need to close the regions out on each regionserver, remove them
from .META. then remove the table from the filesystem. The first step
can be tricky. If only a few regions, you could try doing each one in
turn sendi
I would suggest that you have each mapper have its own HTable, rather than
having a static HTable in the outer class. Configure it from the setup method
of the mapper.
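A hedged sketch of that suggestion against the 0.90-era client API; the table name, family, and qualifier are made up:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch: each mapper instance opens its own HTable in setup() and closes
// it in cleanup(); nothing is static. Table/column names are illustrative.
public class CopyMapper extends TableMapper<ImmutableBytesWritable, Put> {
    private HTable table;

    @Override
    protected void setup(Context context) throws IOException {
        // one connection per mapper instance, built from the job's config
        table = new HTable(
            HBaseConfiguration.create(context.getConfiguration()), "target");
    }

    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
            throws IOException {
        Put put = new Put(row.get());
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"),
                value.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q")));
        table.put(put); // write directly, bypassing the shuffle
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        table.close();
    }
}
```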
Hmm, I am not exactly sure how the configuration from your HTable is passed
to the mapper in the first place. You are conf
David,
how about waking up my second map reduce job as soon as I see some
rows updated in that table.
any thoughts on observing a column update?
thanks,
Vishal
On Fri, Mar 25, 2011 at 2:56 PM, Buttler, David wrote:
> What about just storing some metadata in a special table?
> Then on your second
What about just storing some metadata in a special table?
Then on your second job's startup you can read that metadata and set your
scan / input splits appropriately?
Dave
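A hedged sketch of that idea: the second job reads a "last processed" marker from a metadata table and uses it as the start row of its Scan. All names here ("jobmeta", the "m:lastRow" column, the row key) are made up:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

// Illustrative only: scope the second job's Scan from a metadata row
// written by the first job.
public class ScopedScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable meta = new HTable(conf, "jobmeta");
        Result r = meta.get(new Get(Bytes.toBytes("second-job")));
        byte[] lastRow = r.getValue(Bytes.toBytes("m"), Bytes.toBytes("lastRow"));
        Scan scan = new Scan();
        if (lastRow != null) {
            scan.setStartRow(lastRow); // resume where the first job left off
        }
        meta.close();
        // ...pass scan to TableMapReduceUtil.initTableMapperJob(...)...
    }
}
```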
-Original Message-
From: Vishal Kapoor [mailto:vishal.kapoor...@gmail.com]
Sent: Friday, March 25, 2011 11:21 AM
To: us
On Thu, Mar 24, 2011 at 7:36 PM, Stanley Xu wrote:
> But I have two doubts here:
> 1. It looks like the partitioner will do a lot of shuffling. I am wondering why
> it couldn't just do the put on the local region, since the read and write on
> the same entry should be on the same region, shouldn't they?
>
T
Can someone give me a direction on how to start a map reduce based on
the outcome of another map reduce? (Nothing is common between them apart
from the fact that the first decides the scope of the second.)
I might also want to set the scope of my second map reduce
(from/after) my first map reduce (scope as in
Hello all,
I wrote a routine that scans an HBase table, and writes to another table
from within the map function using HTable.put(). When I run the job, it
works fine for the first few rows but ZooKeeper starts having issues opening
up a connection after a while.
Am I just overloading the ZK ser
+1 Thank you David for the great explanation. It's complicated.
I am pretty new to this BigData space and found it really interesting and
always want to learn more about it. I will definitely look into OpenTSDB as
suggested. Thanks again :D
On Fri, Mar 25, 2011 at 12:18 PM, Buttler, David wrote:
Hmmm maybe my mental model is deficient. How do you propose building a
secondary index without a transaction?
The reason indexes work is that they store the data in a different way than the
primary table. That implies a second, independent data storage. Without a
transaction you can't be
Ugh. Redo. I added a pointer to David Buttler's response above as an
intro to secondary indexing issues in hbase.
St.Ack
On Fri, Mar 25, 2011 at 10:09 AM, Stack wrote:
> I added a pointer to the message below into our book as an 'intro to
> secondary indexing in hbase'.
> St.Ack
>
> On Fri, Mar 25, 2011 at 8:39 AM,
I added a pointer to the message below into our book as an 'intro to
secondary indexing in hbase'.
St.Ack
On Fri, Mar 25, 2011 at 8:39 AM, Buttler, David wrote:
> Do you know what it means to make secondary indexing a feature? There are
> two reasonable outcomes:
> 1) adding ACID semantics (and thus killing scalability)
I would certainly find it useful if you wrote such a blog post.
Dave
-Original Message-
From: Michael Segel [mailto:michael_se...@hotmail.com]
Sent: Friday, March 25, 2011 8:55 AM
To: user@hbase.apache.org
Subject: RE: How could I re-calculate every entries in hbase efficiently
through m
Thank you so much for the informative reply. It really helps me out.
For secondary indexes, even without transactions, I would think one could
still build a secondary index on another key, especially if we have
row-level locking. Correct me if I am wrong.
Also, I have read about clustered B-Tree used
"During inserts into the table, there was one field that was populated
from hand-crafted HTML that should only have a small range of values
(e.g. a primary color). We wanted to keep a log of all of the unique
values that were found here, and so the values were the map job output
and then sorte
We ran across a use-case this week. During inserts into the table, there was
one field that was populated from hand-crafted HTML that should only have a
small range of values (e.g. a primary color). We wanted to keep a log of all
of the unique values that were found here, and so the values wer
Do you know what it means to make secondary indexing a feature? There are two
reasonable outcomes:
1) adding ACID semantics (and thus killing scalability)
2) allowing the secondary index to be out of date (leading to every naïve user
claiming that there is a serious bug that must be fixed).
Sec
On Fri, Mar 25, 2011 at 1:56 AM, Mohit wrote:
> Why not reconnect back to the zookeeper (at least try once and then abort, if
> unsuccessful) and reset trackers/watchers instead of aborting/killing
> HMaster/HRegionServers, just like it is done in one of the implementations of
> Abortable named
Yeah...
Uhm I don't know of many use cases where you would want or need a reducer step
when dealing with HBase.
I'm sure one may exist, but from past practical experience... you shouldn't
need one.
> From: buttl...@llnl.gov
> To: user@hbase.apache.org
The table is in an inconsistent state, the reason being that it was not
able to locate a few regions.
When I disable this table using the hbase shell, the master log says
RetriesException and the table is in the process of transition. This takes
a lot of time.
Is it possible to force drop this table? Or rather what are
There is no reason to use a reducer in this scenario. I frequently do map-only
update jobs. Skipping the reduce step saves a lot of unnecessary work.
Dave
-Original Message-
From: Stanley Xu [mailto:wenhao...@gmail.com]
Sent: Thursday, March 24, 2011 7:37 PM
To: user@hbase.apache.org
S
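A hedged sketch of such a map-only update job using the standard TableMapReduceUtil idiom; `MyUpdateMapper`, the table names, and the column names are all illustrative:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;

// Illustrative map-only driver: the mapper emits Puts and there is no
// reducer at all, so the shuffle/sort phases are skipped entirely.
public class MapOnlyDriver {
    static class MyUpdateMapper extends TableMapper<ImmutableBytesWritable, Put> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context ctx)
                throws IOException, InterruptedException {
            Put put = new Put(row.get());
            put.add(Bytes.toBytes("cf"), Bytes.toBytes("updated"),
                    Bytes.toBytes(System.currentTimeMillis()));
            ctx.write(row, put);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "map-only-update");
        job.setJarByClass(MapOnlyDriver.class);
        TableMapReduceUtil.initTableMapperJob("source", new Scan(),
                MyUpdateMapper.class, ImmutableBytesWritable.class, Put.class, job);
        // null reducer class: output goes straight to the target table
        TableMapReduceUtil.initTableReducerJob("target", null, job);
        job.setNumReduceTasks(0); // map-only: no shuffle, no sort, no reduce
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```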
Hey folks,
just a short notice for those who haven't noticed: we have only a
limited number of Early-Bird tickets left, and the Early-Bird period
ends on April 7th. If you want to get one of the 30 remaining tickets,
go and get one now here: http://berlinbuzzwords.de/content/tickets
While we are
Hello Users/Authors,
Well, we've observed in our cluster that the HMaster went down due to a
watched event triggered from zookeeper, of type session expired.
Why not reconnect back to the zookeeper (at least try once and then abort, if
unsuccessful) and reset trackers/watchers instead of abort
I need to use secondary indexing too, hopefully this important feature
will be made available soon :)
Sent from my iPhone
On Mar 25, 2011, at 12:48 AM, Stack wrote:
There is no native support for secondary indices in HBase (currently).
You will have to manage it yourself.
St.Ack
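One common way to "manage it yourself" is to write an index row alongside each data row, with the caveat from earlier in the thread that without transactions the two puts are not atomic. A rough sketch against the 0.90-era client API; all table and column names are made up:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Illustrative manual secondary index: one put to the data table, one put
// to an index table keyed by the secondary value. NOT atomic: a crash
// between the two puts leaves the index out of date.
public class ManualIndex {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable data = new HTable(conf, "users");
        HTable index = new HTable(conf, "users-by-email");

        byte[] userId = Bytes.toBytes("user-42");
        byte[] email = Bytes.toBytes("a@example.com");

        Put dataPut = new Put(userId);
        dataPut.add(Bytes.toBytes("info"), Bytes.toBytes("email"), email);
        data.put(dataPut);

        Put indexPut = new Put(email); // index row key = secondary value
        indexPut.add(Bytes.toBytes("ref"), Bytes.toBytes("id"), userId);
        index.put(indexPut);

        data.close();
        index.close();
    }
}
```

If the order is reversed (index first), a crash instead leaves a dangling index entry pointing at missing data; either way the index can drift, which is exactly the trade-off discussed earlier in the thread.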
On Thu, Ma