Here's what I don't get -- how is this different from allocating a
different table for each separate value of the leading field? If I did
that and used the second field as the leading prefix instead, I know no one
would argue that it's a key that won't distribute well. I don't plan on
doing t
Hello all,
I have an endpoint coprocessor running in HBase that I would like to
modify. I previously loaded this coprocessor via the shell, without having
to restart HBase. However, after some experimentation I have not found any
way to reload a new version of the coprocessor without restarting
Uhm...
This isn't very good.
In terms of inserting, you will hit a single region or a small subset of regions.
This may not be that bad if you have enough data and the rows are not all
inserting into the same region.
Since you're hitting an index to pull rows one at a time, you could do this...
if you
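One mitigation that often comes up for the "all inserts hit one region" problem above is salting the row key with a stable hash-derived prefix. A minimal sketch; the bucket count, hash choice, and key format here are my own assumptions, not something from this thread:

```python
import hashlib

def salted_key(row_key: str, buckets: int = 16) -> str:
    """Prefix the row key with a stable, hash-derived bucket number so
    that otherwise-sequential keys spread across `buckets` key ranges."""
    salt = int(hashlib.md5(row_key.encode()).hexdigest(), 16) % buckets
    return f"{salt:02d}-{row_key}"

# Sequential source keys no longer sort next to each other:
print([salted_key(f"row{i:06d}") for i in range(3)])
```

The trade-off is that a range scan over the original key order now has to fan out to every bucket, so this only helps write-heavy, point-read workloads.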
+HBase users.
-- Forwarded message --
From: Dmitriy Ryaboy
Date: 2012/9/4
Subject: Re: Extremely slow when loading small amount of data from HBase
To: "u...@pig.apache.org"
I think the hbase folks recommend something like 40 regions per node
per table, but I might be misrememb
On 9/4/12 3:07 PM, "Stack" wrote:
>On Tue, Sep 4, 2012 at 2:52 PM, Gen Liu wrote:
>> We are running into a case that if the region server that serves the meta
>>table is down, all requests will time out because region lookup is not
>>available.
>
>Only requests to .META. fail (and most of the time,
On Tue, Sep 4, 2012 at 2:52 PM, Gen Liu wrote:
> We are running into a case that if the region server that serves the meta table
> is down, all requests will time out because region lookup is not available.
Only requests to .META. fail (and most of the time, .META. info is
cached so should be relativ
Just today I saw this mentioned in the docs. They said they deliberately
don't replicate those, otherwise "it gets very messy".
Stas
On Tue, Sep 4, 2012 at 10:52 PM, Gen Liu wrote:
> Hi,
>
> We are running into a case that if the region server that serves the meta
> table is down, all requests will
Hi,
We are running into a case that if the region server that serves the meta table is
down, all requests will time out because region lookup is not available. At this
time, the master is also not able to update the meta table.
It seems that regions that serve root and meta are the single point of failure
i
Looks like you need the fix from HBASE-6018
On Mon, Sep 3, 2012 at 7:37 PM, abloz...@gmail.com wrote:
> [zhouhh@h185 ~]$ hbase hbck -fixMeta
> ...
> Number of Tables: 1731
> Number of live region servers: 4
> Number of dead region servers: 0
> Master: h185,61000,1346659732168
> Number of backup m
Is there any more stack exception information? Also, what version is this?
Jon.
On Mon, Sep 3, 2012 at 7:37 PM, abloz...@gmail.com wrote:
> [zhouhh@h185 ~]$ hbase hbck -fixMeta
> ...
> Number of Tables: 1731
> Number of live region servers: 4
> Number of dead region servers: 0
> Master: h185,6100
Hello,
Thank you for your replies. We are using CDH4 HBase 0.92. Good call on the
web interface. The port is blocked so I never really got a chance to test
it. As far as manual re-balancing is concerned I will check the book.
/David
On Tue, Sep 4, 2012 at 5:34 PM, Guillaume Gardey <
guillaume.g
Hi Lin,
Check out the slides about high update workloads and HBaseHUT at:
http://blog.sematext.com/?s=hbasehut
Maybe you could ask Alex Baranau here on the list to share more details.
regards
Chris
From: Lin Ma
To: user@hbase.apache.org; syrious3...@yah
On Sun, Sep 2, 2012 at 2:13 AM, Lin Ma wrote:
> Hello guys,
>
> I am reading the book "HBase, the definitive guide"; at the beginning of
> chapter 3, it is mentioned that in order to reduce the performance impact of
> clients updating the same row (lock contention issues for atomic
> writes), batch upd
On Tue, Sep 4, 2012 at 8:17 AM, Ioakim Perros wrote:
> Hello,
>
> I would be grateful if someone could shed light on the following:
>
> Each M/R map task is reading data from a separate region of a table.
> From the jobtracker's GUI, at the map completion graph, I notice that
> although data re
On Sun, Sep 2, 2012 at 6:38 AM, Richard Tang wrote:
> Hi, I have a connection problem setting up HBase on a remote node. The
> ``hbase`` instance is on a machine ``nodeA``. When I try to use HBase
> on ``nodeA`` from another machine (say ``nodeB``), it complains
>
>> Session 0x0 for server
*How does the data flow into the system? One source at a time?*
Generally, it will be one source at a time, where these rows are index
entries built from MapReduce jobs.
*The second field. Is it sequential?*
No, the index writes from the MapReduce jobs should dump some relatively
small number of ro
There are several different ways.
Running jps as the user that HBase starts as will show you
what's running. You should be able to see HMaster or HRegionServer
running.
If things are running well, the master should have a status HTTP server
up. Going to that should tell you that things are
Here is mine.
But I can't guarantee that it's correct...

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://node3:9000/hbase</value>
  <description>The directory shared by RegionServers.</description>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
  <description>The mode the cluster will be in. Possible values are
    false: standalone and pseudo-distr</description>
</property>
Hello! I would like an example of the *hbase-site.xml* file configured for a
fully distributed cluster.
Regards.
--
[image: terraLab logo] *Igor Muzetti Pereira *
TerraLAB - Earth System Modelling and Simulation Laboratory
Computer Science Department, UFOP - Federal University of Ouro Preto
*Campus Univ
How do I know that HBase is running correctly?
2012/9/4 Igor Muzetti
> Hello! I would like an example of the *hbase-site.xml* file configured for
> a fully distributed cluster.
> Regards.
>
> --
> [image: terraLab logo] *Igor Muzetti Pereira *
> TerraLAB - Earth System Modelling and Simulation Labo
Eric,
So here's the larger question...
How does the data flow into the system? One source at a time?
The second field. Is it sequential? If not sequential, is it going to be
incrementally larger than the previous value? (Are you always inserting to
the left side of the queue?
How a
Longer term .. what's really going to happen is more like I'll have a first
field value of 1, 2, and maybe 3. I won't know 4 - 10 for a while and
the *second* value after each initial value will be, although highly unique,
relatively exclusive for a given first value. This means that even if I di
I think you have to understand what happens as a table splits.
If you have a composite key where the first field has a value between 0-9 and
you pre-split your table, you will have all of your 1's going to a single
region until it splits. But both splits will start on the same node until th
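If the leading-field values are known up front, the table can be created pre-split on them so each prefix starts in its own region rather than one hot region that splits under load. A small sketch of computing the split keys; the string-to-bytes encoding is an assumption about the key schema:

```python
def split_points(prefixes):
    """Return the region split keys for a set of known leading-field
    values. The lowest prefix needs no split key, since the first
    region's range is unbounded on the left."""
    return [str(p).encode() for p in sorted(prefixes)[1:]]

# For a leading field of 0-9: nine split points, hence ten regions.
print(split_points(range(10)))
```

These byte keys would then be handed to the table-creation call (shell or Java API) as the split boundaries.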
You're the man Jean-Marc .. info is much appreciated.
On Tue, Sep 4, 2012 at 1:22 PM, Jean-Marc Spaggiari wrote:
> Hi Eric,
>
> Yes, you can split an existing region. You can do that easily with the
> web interface. After the split, at some point, one of the 2 regions
> will be moved to another
Hi Eric,
Yes, you can split an existing region. You can do that easily with the
web interface. After the split, at some point, one of the 2 regions
will be moved to another server to balance the load. You can also
move it manually.
JM
2012/9/4, Eric Czech :
> Thanks again, both of you.
>
> I'll
Thanks again, both of you.
I'll look at pre-splitting the regions so that there isn't so much initial
contention. The issue I'll have, though, is that I won't know all the prefix
values at first and will have to be able to add them later.
Is it possible to split regions on an existing table? Or i
Thanks, Harsh J, but I have checked the /etc/ dir and HBase's root directory;
there is no zoo.cfg file present in either place...
I am aware that the HBase client will first check ZooKeeper before contacting
HBase itself (for the -ROOT- table and .META. table ...). Is there
any way to test if ZooKeeper can be
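One quick way to test ZooKeeper reachability from a client machine is its "ruok" four-letter command, which a healthy server answers with "imok". A minimal sketch; the host and port are placeholders, and note that newer ZooKeeper releases may require four-letter commands to be whitelisted:

```python
import socket

def zookeeper_ok(host: str, port: int = 2181, timeout: float = 3.0) -> bool:
    """Send ZooKeeper's 'ruok' health probe; True if it answers 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(b"ruok")
            s.shutdown(socket.SHUT_WR)  # signal end of request
            return s.recv(16) == b"imok"
    except OSError:
        return False
```

The same probe works from the shell as `echo ruok | nc <host> 2181`.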
Jerry thank you very much for the links.
Regards,
Ioakim
On 09/04/2012 08:05 PM, Jerry Lam wrote:
Hi Ioakim:
Here is a list of links I would suggest you read (I know it is a lot to
read):
HBase Related:
-
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.html
Hi Ioakim:
Here is a list of links I would suggest you read (I know it is a lot to
read):
HBase Related:
-
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.html
-
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_desc
I understood that locking is at the row level (and that my initial
hypothesis is hopefully false), but I was trying to clarify whether there is
some job configuration I am missing. Perhaps you're right and I am
misinterpreting the jobtracker's map completion graph.
Thanks for answering.
On 09/04/2
I think the issue is that you are misinterpreting what you are seeing and what
Doug was trying to tell you...
The short simple answer is that you're getting one split per region. Each split
is assigned to a specific mapper task and that task will sequentially walk
through the table finding the
Thank you very much for your response and for the excellent reference.
The thing is that I am running jobs in a distributed environment and,
beyond the TableMapReduceUtil settings,
I have just set the scan's caching to the number of rows I expect to
retrieve at each map task, and the scan's
Hi Ioakim:
Sorry, your hypothesis doesn't make sense. I would suggest you read
"Learning HBase Internals" by Lars Hofhansl at
http://www.slideshare.net/cloudera/3-learning-h-base-internals-lars-hofhansl-salesforce-final
to understand how HBase locking works.
Regarding the issue you are
Thank you very much for responding, but this was not exactly what I was
looking for.
I have understood the splitting process when M/R jobs read from HBase
tables (each M/R task reads from exactly one region).
What I would like to clarify, if possible, is whether there is indeed some
"locking"
Hello,
> a) What is the easiest way to get an overview of how a table is distributed
> across regions of a cluster? I guess I could search .META. but I haven't
> figured out how to use filters from shell.
> b) What constitutes a "badly distributed" table and how can I re-balance
> manually?
> c) I
Hi there-
Yes, there is an input split for each region of the source table of a MR
job.
There is a blurb on that in the RefGuide...
http://hbase.apache.org/book.html#splitter
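The one-split-per-region behavior can be pictured as pairing each region's start key with the next region's start key. This is a toy model of how the split boundaries become per-mapper scan ranges, not TableInputFormat's actual code:

```python
def region_scan_ranges(region_start_keys):
    """Model one input split per region: each region scans from its own
    start key up to the next region's start key (b"" = unbounded)."""
    starts = list(region_start_keys)
    stops = starts[1:] + [b""]
    return list(zip(starts, stops))

print(region_scan_ranges([b"", b"g", b"p"]))
# [(b'', b'g'), (b'g', b'p'), (b'p', b'')]
```

Each mapper then walks its own (start, stop) range sequentially, which is exactly the sequential-read pattern visible in the jobtracker graph.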
On 9/4/12 11:17 AM, "Ioakim Perros" wrote:
>Hello,
>
>I would be grateful if someone could shed light on the fo
Hi there,
I'm trying to set up replication in master-slave mode between two clusters,
and once this works, to set up master-master replication. I am following the
replication FAQ step-by-step, but I can't make it work and have no idea how
to troubleshoot. There seems to be only one way to find out whe
Hi Christian,
I read through the link you referred to. It seems HBaseHUT is exactly the
solution I am looking for. Before making the technology choice, I
want to learn a bit more about its internal design and the general idea of
how HBaseHUT improves write throughput. From the discu
> a) What is the easiest way to get an overview of how a table is distributed
> across regions of a cluster?
I usually check via the web interface (host:60010).
Click on a table and scroll down. There will be a region count for this table
across the cluster.
> b) What constitutes a "badly distribut
Hello,
I would be grateful if someone could shed light on the following:
Each M/R map task is reading data from a separate region of a table.
From the jobtracker's GUI, at the map completion graph, I notice that
although the data read by the mappers are different, they read data
sequentially - li
Can you tell us the version of HBase you're using?
The following feature (per-table region balancing) isn't in 0.92.x:
https://issues.apache.org/jira/browse/HBASE-3373
On table.jsp page, you should see region count per region server.
Cheers
On Tue, Sep 4, 2012 at 7:56 AM, David Koch wrote:
>
Hello,
A couple of questions regarding balancing of a table's data in HBase.
a) What is the easiest way to get an overview of how a table is distributed
across regions of a cluster? I guess I could search .META. but I haven't
figured out how to use filters from shell.
b) What constitutes a "badly