Thanks for the clear answer Andy.
The comparison was actually conducted by the Hypertable dev team, so I guess it
wasn't all that fair to HBase.
I have regained my confidence in HBase once more :)
Ed
From mp2893's iPhone
On 2011. 5. 26., at 12:03 AM, Andrew Purtell wrote:
> I think I can spea
Anything listening at 160.110.79.33:3888? Is it reachable from the
server that is trying to connect?
St.Ack
On Wed, May 25, 2011 at 9:37 PM, James Ram wrote:
> Hi,
>
> We are using 1 Master and 4 Slave machines for our Hadoop cluster. When I try to
> start HBase on the Master, the slaves show
Hi,
We are using 1 Master and 4 Slave machines for our Hadoop cluster. When I try to
start HBase on the Master, the slaves show the exception below. Can
you help me?
2011-05-26 09:44:10,531 WARN
org.apache.zookeeper.server.quorum.QuorumCnxManager: Cannot open channel to
1 at election address /1
Hi Harsh / Christian,
Thank you for your reply. The issue is solved. There was a difference between
the master and 2 slaves (Regionservers).
On Wed, May 25, 2011 at 5:41 PM, Kleegrewe, Christian <
christian.kleegr...@siemens.com> wrote:
> Hi Jr.
>
> Have a look at the Exception message:
>
> org.apache
Python is great.
If you can hold your nose a little longer, you are either almost
there, or it's a lost cause, so bear with us a little longer.
Did the configs above make a difference? (Initiating compaction at
65% is conservative -- you'll be burning lots of CPU -- but probably
good to start her
On Wed, May 25, 2011 at 2:44 PM, Wayne wrote:
> What are your write levels? We are pushing 30-40k writes/sec/node on 10
> nodes for 24-36-48-72 hours straight. We have only 4 writers per node so we
> are hardly overwhelming the nodes. Disk utilization runs at 10-20%, load is
> max 50% including so
That may be the best advice I ever got...although I would say 9 months ago we
didn't have 1 line of Python, and now we have a fantastic MPP framework built
with Python by a team most of whom had never written a line of Python before.
But...Java is not Python...
We have shredded our relational past and fr
This may be the most important detail of all.
It is important to go with your deep skills. I would be a round peg in your
square shop and you would be a square one in my round one.
On Wed, May 25, 2011 at 5:55 PM, Wayne wrote:
> We are not a Java shop, and do not want to become one. I think to
We are using standard Thrift from Python. All writes are batched, usually 30k
writes per batch. The writes are small double/varchar(100) type values. Our
current write performance is fine for our needs...our concern is that it is
not sustainable over time given the GC timeouts.
Per the 4 items a
How large are these writes?
Are you using asynchbase or other alternative client implementation?
Are you batching updates?
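For reference, a minimal sketch of what batched updates look like with the
HBase Java client (table, family, and row names here are made up; the thread
itself batches via Thrift from Python):

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedWrites {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "metrics");      // placeholder table name
    table.setAutoFlush(false);                       // buffer puts on the client
    table.setWriteBufferSize(8L * 1024 * 1024);      // flush roughly every 8 MB

    List<Put> batch = new ArrayList<Put>(30000);
    for (int i = 0; i < 30000; i++) {
      Put put = new Put(Bytes.toBytes("row-" + i));  // placeholder row keys
      put.add(Bytes.toBytes("d"), Bytes.toBytes("v"), Bytes.toBytes((double) i));
      batch.add(put);
    }
    table.put(batch);       // queued in the write buffer, sent in large batches
    table.flushCommits();   // push whatever is still buffered
    table.close();
  }
}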
On Wed, May 25, 2011 at 2:44 PM, Wayne wrote:
> What are your write levels? We are pushing 30-40k writes/sec/node on 10
> nodes for 24-36-48-72 hours straight. We have onl
On Wed, May 25, 2011 at 4:49 PM, Matt Corgan wrote:
> I was thinking it would be a nice feature if each time an hfile was written
> it kept a count of the raw bytes (before compression) to make it easy to
> compare to the file size on disk. It could report it in the web interface
> next to the di
I was thinking it would be a nice feature if each time an hfile was written
it kept a count of the raw bytes (before compression) to make it easy to
compare to the file size on disk. It could report it in the web interface
next to the disk size.
2011/5/25 Stack
> Good point Matt. I forgot abo
>> Also is modifying the log4j.properties in the conf directory a good
approach to ...
I assume you have distributed the modified log4j.properties onto the task
tracker machines.
On Wed, May 25, 2011 at 3:52 PM, Himanish Kushary wrote:
> The log4j logging statements work when I run the Map-Reduce
I'd recommend adding -Dlog4j.debug to the JVM args for any JVM that's not
giving you what you expect. In this case, if it's the map/reduce tasks, add
it to mapred.child.java.opts in mapred-site.xml. It should show you what
configuration log4j is actually picking up.
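As a variation (an assumption on my part, not something proposed above), the
same property can also be set per job from the driver code; the heap value
below is just an example, and note that it replaces any cluster-wide default
for that property:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class Log4jDebugJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Passes -Dlog4j.debug to every task JVM for this job only. This value
    // overrides mapred.child.java.opts from mapred-site.xml, so carry over
    // flags such as the heap size (the -Xmx here is only an example).
    conf.set("mapred.child.java.opts", "-Xmx512m -Dlog4j.debug");
    Job job = new Job(conf, "log4j-debug-example");  // placeholder job name
    // ... set mapper, reducer, input and output as usual, then:
    // job.waitForCompletion(true);
  }
}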
Dave
On Wed, May 25, 2011 at
We know several things that are common to both your HBase and your
Cassandra setups:
a) the jvm
b) the machines
c) the os
d) the (necessary) prejudices of the implementors and op staff
On the other hand, we know of other hbase (and cassandra) installations
running similar
volumes on the same JVM.
I
The log4j logging statements work when I run the Map-Reduce job from Eclipse
using the LocalTaskTracker. But the logging is not working when I run the
Map-Reduce job through the hadoop jar command on the cluster. Strangely, only
the logging statements in the main enclosing class (the job class with the main
meth
I've figured it out. It turns out that the other configuration was loaded too
late. Regardless, the following works properly:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <forkMode>once</forkMode>
    <workingDirectory>target</workingDirectory>
    <argLine>-Djava.library.path=/Users/tynt/apps/hadoop/lib/native/Mac</argLine>
  </configuration>
</plugin>
What are your write levels? We are pushing 30-40k writes/sec/node on 10
nodes for 24-36-48-72 hours straight. We have only 4 writers per node so we
are hardly overwhelming the nodes. Disk utilization runs at 10-20%, load is
max 50% including some app code, and memory is the 8g JVM out of 24G. We ru
Good point Matt. I forgot about compression. Let me add a note to the
above referenced section in the book.
St.Ack
On Wed, May 25, 2011 at 7:47 AM, Matt Corgan wrote:
> I have a table that compresses by 30x using gzip, so the default block size
> of 64 KB was writing 2 KB blocks to disk. To re
Hey all,
Running into a situation where I'm trying to use MiniHBaseCluster to test
operations against a table with LZO compression enabled. I have a set of
unit tests that exercise certain capabilities, but the moment I use LZO it
fails. I know exactly what it is: it cannot find the
nativegp
That region is not in the urlhashv2 directory.
I'll grep all the logs to see if it shows up.
-Original Message-
From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack
Sent: Wednesday, May 25, 2011 3:30 PM
To: user@hbase.apache.org
Subject: Re: wrong region exception
Nope, nothing in the logs with that string.
-Original Message-
From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack
Sent: Wednesday, May 25, 2011 3:30 PM
To: user@hbase.apache.org
Subject: Re: wrong region exception
Can you find this region in the filesystem? Look un
In case you are trying it on a virtual machine: I changed my networking
options from NAT to Bridged and it started working. I am guessing it has
problems accessing the network interface otherwise.
Major compaction will not merge old small regions. The change in
table schema makes it so existing ones will grow larger before they in
turn split. To check whether a table is already compacted, do an lsr on the
table's directory in hdfs and see how many storefiles there are. If a major
compaction ran recently, th
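A rough Java equivalent of that lsr check, assuming the default /hbase root
directory and a placeholder table name:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StoreFileCount {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Assumes hbase.rootdir is /hbase; "mytable" is a placeholder table name.
    System.out.println("files under table dir: "
        + countFiles(fs, new Path("/hbase/mytable")));
  }

  // Recursive file count; includes .regioninfo files, so treat the number as a
  // rough upper bound on the store file count.
  static int countFiles(FileSystem fs, Path dir) throws Exception {
    int count = 0;
    FileStatus[] children = fs.listStatus(dir);
    if (children == null) {
      return 0;
    }
    for (FileStatus child : children) {
      count += child.isDir() ? countFiles(fs, child.getPath()) : 1;
    }
    return count;
  }
}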
Can you find this region in the filesystem? Look under the urlhashv2
table directory for a directory named
80116D7E506D87ED39EAFFE784B5B590. Grep your master log to see if you
can figure the history of this region.
St.Ack
On Wed, May 25, 2011 at 1:21 PM, Robert Gonzalez
wrote:
> The detailed er
The detailed error is :
Chain of regions in table urlhashv2 is broken; edges does not contain
80116D7E506D87ED39EAFFE784B5B590
Table urlhashv2 is inconsistent.
How does one fix this?
Thanks,
Robert
-Original Message-
From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of
On Wed, May 25, 2011 at 11:39 AM, Ted Dunning wrote:
> It should be recognized that your experiences are a bit out of the norm
> here. Many hbase installations use more recent JVM's without problems.
Indeed, we run u25 on CentOS 5.6 and over several days uptime it's
common to never see a full GC
Is the .META. allocated? Can you tell why it's stuck? Is it an issue
with notifications not happening on .META. reassign? That timeout
below seems very big. Did we actually wait that long?
St.Ack
On Wed, May 25, 2011 at 3:26 AM, Gaojinchao wrote:
> Root and Meta had assigned finally from log
H
I created a table with a small maxfilesize (64MB) and now the table has lots
of regions, more than 1000.
I changed the maxfilesize to a bigger number (512MB) but I still have lots of
regions even after I major_compact all tables. It seems that major_compact
didn't compact any region.
2011/5/25 Gaojinchao :
> How many regions in the cluster? Do you say 1344 above? How do we get to
> 5041?
>
> In my test cluster: 1 hmasters , 2 regionservers , 3 zookeeper and 5041
> regions
>
> In this scenario:
> 1, Two Zookeeper crashed
What made them crash? Are you doing recovery testing?
I'm not sure why you are asking this question on the hbase user
mailing list, it seems like you have a log4j issue.
J-D
On Wed, May 25, 2011 at 1:03 PM, Himanish Kushary wrote:
> Could anybody please help me with this.
>
> On Tue, May 24, 2011 at 10:17 AM, Himanish Kushary wrote:
>
>> Hi,
>>
>>
Could anybody please help me with this.
On Tue, May 24, 2011 at 10:17 AM, Himanish Kushary wrote:
> Hi,
>
> I have enabled debug for my Map-Reduce package inside the log4j.properties
> under the $HADOOP_HOME/conf directory (using CDH3).
>
> log4j.logger.com.himanish.analytics.mapreduce=DEBUG
>
>
I have restarted with CMS kicking in earlier (65%) and the incremental mode
turned off. We have an 8g heap...should we go to 10g (24g in the box)? More
memory for the JVM has never seemed to be better...though maybe with lots of
hot regions and our flush size we might be pushing it? Should we up the 50%
for mem
On Wed, May 25, 2011 at 10:49 AM, Jinsong Hu wrote
> if we add the root region back in, then essentially the hbck is complaining
> every region is bad,
> which is not true.
>
I did notice and recently fixed an issue where HBCK will print an ERROR
for all regions that follow a bad one, so rather tha
On Wed, May 25, 2011 at 11:08 AM, Wayne wrote:
> I tried to turn off all special JVM settings we have tried in the past.
> Below are link to the requested configs. I will try to find more logs for
> the full GC. We just made the switch and on this node it has
> only occurred once in the scope of t
Most HBase installations also seem to recommend bulk inserts for loading
data. We are pushing it more than most in terms of actually using the client
API to load large volumes of data. We keep delaying putting HBase into
production, as nodes going AWOL for as much as 2+ minutes is something we
cannot accept as
Wayne,
It should be recognized that your experiences are a bit out of the norm
here. Many hbase installations use more recent JVM's without problems.
As such, it may be premature to point the finger at the JVM as opposed to
the workload or environmental factors. Such a premature diagnosis can m
Thanks for the pointer. I read the doc, and somehow had missed that
argument.
-geoff
-Original Message-
From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of
Jean-Daniel Cryans
Sent: Wednesday, May 25, 2011 10:40 AM
To: user@hbase.apache.org
Subject: Re: bulkloader zookeeper c
We have the line commented out with the new ratio. I will turn off the
incremental mode. We do have the cache turned off at the table level and have
it set to 1% for .META. only. We do not use the block cache.
I will keep testing. Frankly, u25 scares us as well; older JVMs seem much
better based on previou
For your GC settings:
- i wouldn't tune newratio or survivor ratio at all
- if you want to tame your young GC pauses, use -Xmn to pick a new
size - eg -Xmn256m
- turn off CMS Incremental Mode if you're running on real server hardware
HBase settings:
- 1% of heap to block cache seems strange. Maybe
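Not from the thread, but a small sketch for double-checking which GC flags and
heap size a running JVM actually picked up (handy when settings appear not to
take effect):

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class ShowGcSettings {
  public static void main(String[] args) {
    // Print the raw JVM arguments (e.g. -Xmn, -XX:+UseConcMarkSweepGC, ...)
    System.out.println("JVM args: "
        + ManagementFactory.getRuntimeMXBean().getInputArguments());
    // Print which collectors are active and their counts so far
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      System.out.println(gc.getName() + ": collections=" + gc.getCollectionCount()
          + ", time=" + gc.getCollectionTime() + "ms");
    }
    System.out.println("Max heap: "
        + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB");
  }
}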
I tried to turn off all special JVM settings we have tried in the past.
Below are link to the requested configs. I will try to find more logs for
the full GC. We just made the switch and on this node it has
only occurred once in the scope of the current log (it may have rolled?).
Thanks.
http://p
Hi, Stack:
You have a point. I checked the non-hbase machine's hbck result, and it
shows:
Summary:
2418 inconsistencies detected.
Status: INCONSISTENT
That number seems very familiar to me, so I went to the master admin
page, and found:
Total: servers: 6 requests=2783, reg
Hi Wayne,
Looks like your RAM might be oversubscribed. Could you paste your
hbase-site.xml and hbase-env.sh files? Also looks like you have some
strange GC settings on (eg perm gen collection which we don't really
need)
If you can paste a larger segment of GC logs (enough to include at
least two
From the doc at http://hbase.apache.org/bulk-loads.html
The -c config-file option can be used to specify a file containing the
appropriate hbase parameters (e.g., hbase-site.xml) if not supplied
already on the CLASSPATH (In addition, the CLASSPATH must contain the
directory that has the zookeeper
HBASE-3721 was integrated to trunk, not 0.90.x
HBASE-3871 is under review.
So I would interpret my answer as tilting toward outputting HFiles that fit
within a single Region.
If, after your effort, there are still some HFiles that don't fit, you can
try my patches.
Thanks
2011/5/25 Panayotis Anto
We switched to u25 and reverted the JVM settings to those recommended. Now
we have concurrent mode failures lasting more than 60 seconds while under
hardly any load.
Below are the entries from the JVM log. Of course we can up the zookeeper
timeout to 2 min or 10 min for that matt
There's no "recommended size", I guess as long as it fits in memory
it's ok given that not all JVMs are given the same amount of heap.
J-D
On Wed, May 25, 2011 at 6:48 AM, Lucian Iordache
wrote:
> To be more specific, I was thinking of a recommended number of deletes per
> batch.
> For example I
As you saw it's family based; there are no "cross-family" schemas.
Can you tell us more about your use case?
J-D
On Wed, May 25, 2011 at 5:58 AM, Oleg Ruchovets wrote:
> Hi ,
> Is it possible to define TTL for hbase row (I found TTL only for column
> family) ?
> In case it is not possible wh
So your answer would be that it is better to have the best possible load
balancing during the reduce phase instead of taking care to output Hfiles that
fit within a single Region, because splitting done by Incremental Load is
rather fast?
> Date: Wed, 25 May 2011 09:20:10 -0700
> Subject: Re:
On Wed, May 25, 2011 at 9:18 AM, Jinsong Hu wrote:
> I tried several other non-hbase machines that has proper configuration, sure
> enough, all of them complain problems.
>
This is interesting Jinsong. For sure the configuration was pointed
at the right filesystem. Do you think there could have
LoadIncrementalHFiles would split an HFile if it doesn't fit within a single
region.
Please refer to the following JIRAs, which speed up LoadIncrementalHFiles:
https://issues.apache.org/jira/browse/HBASE-3871
https://issues.apache.org/jira/browse/HBASE-3721
Note: parallelizing splitting of HFile(s) by
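A hedged sketch of invoking LoadIncrementalHFiles from code (the HFile
directory and table name are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoadDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");            // placeholder table
    Path hfileDir = new Path("/user/me/hfile-output");     // placeholder HFile dir
    // Splits any HFile that spans more than one region before loading it.
    new LoadIncrementalHFiles(conf).doBulkLoad(hfileDir, table);
    table.close();
  }
}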
This is a follow-up on what I have found. I exported the several tables that
were complained about to hdfs, truncated the original tables, imported them
again, and ran hbck, and found that hbck still complains, saying the
hdfs directory is not there. I went to hdfs and took a look, and the region's
Also - how long are your column family names and column qualifiers? They are
added to each row key in the index, so you want to make them as short as
possible.
On Wed, May 25, 2011 at 10:47 AM, Matt Corgan wrote:
> I have a table that compresses by 30x using gzip, so the default block size
> of
I think I can speak for all of the HBase devs when I say that, in our opinion,
this vendor "benchmark" was designed by Hypertable to demonstrate a specific
feature of their system -- autotuning -- in such a way that HBase was,
obviously, not tuned. Nobody from the HBase project was consulted on the results o
Hi Dave,
Thanks for the reply. Actually we are not using TableInputFormat. I will have a
look at the class, and if it makes the logfiles more usable I will use it.
Best regards,
Christian
--8<--
Siemens AG
Corporate Technology
Corporate Research and Technol
Do you have any preference for how this might be accomplished?
Best regards,
- Andy
Problems worthy of attack prove their worth by hitting back. - Piet Hein (via
Tom White)
--- On Wed, 5/25/11, Mark Jarecki wrote:
> From: Mark Jarecki
> Subject: REST & Atomic increment
> To: user@hbase
I have a table that compresses by 30x using gzip, so the default block size
of 64 KB was writing 2 KB blocks to disk. To reduce storefileIndexSize, I
raised the block size to 256 KB, presumably writing ~8KB disk blocks which
is still pretty small. Maybe you could go even higher depending on your
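As a rough illustration of that tuning with the Java API (table and family
names are placeholders, a sketch rather than the actual schema discussed):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.io.hfile.Compression;

public class CreateCompressedTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HColumnDescriptor family = new HColumnDescriptor("d");    // placeholder family
    family.setCompressionType(Compression.Algorithm.GZ);      // gzip on disk
    family.setBlocksize(256 * 1024);                          // 256 KB HFile blocks
    HTableDescriptor table = new HTableDescriptor("mytable"); // placeholder table
    table.addFamily(family);
    new HBaseAdmin(conf).createTable(table);
  }
}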
To be more specific, I was thinking of a recommended number of deletes per
batch.
For example I need to delete 200.000 rows, should I delete them in several
batches or all at once?
(I've noticed that some problems appear for lists containing more than
100.000 deletes)
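A minimal sketch of chunking a large delete into smaller batches with the Java
client (the 10k chunk size and all names are arbitrary assumptions):

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class ChunkedDeletes {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(HBaseConfiguration.create(), "mytable"); // placeholder
    List<Delete> pending = new ArrayList<Delete>();
    for (int i = 0; i < 200000; i++) {                     // 200k rows, as in the thread
      pending.add(new Delete(Bytes.toBytes("row-" + i)));  // placeholder row keys
      if (pending.size() == 10000) {                       // arbitrary 10k chunk size
        table.delete(pending);                             // one batched round trip
        pending = new ArrayList<Delete>();
      }
    }
    if (!pending.isEmpty()) {
      table.delete(pending);
    }
    table.close();
  }
}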
On Wed, May 25, 2011 at 4:27
Hello,
I have to make a lot of deletes from a hbase table, so I use the batch
method, providing a list of Delete objects.
Is there any limit for the number of Deletes to send in a batch?
--
Regards,
Lucian
Are you using TableInputFormat? If so, if you turn on DEBUG level logging
for hbase (or just org.apache.hadoop.hbase.mapreduce.TableInputFormatBase)
you should see lines like this, giving the map task number, region location,
start row, and end row:
getSplits: split -> 0 ->
hslave107:,@G\xA0\xFB\
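If editing log4j.properties is awkward, the same logger can also be raised to
DEBUG from the job driver JVM, which is where getSplits runs; a small sketch
using the log4j 1.x API (whether this fits your setup is an assumption):

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class SplitDebugLogging {
  public static void main(String[] args) {
    // Raise only the TableInputFormatBase logger to DEBUG in this JVM. getSplits
    // runs in the submitting client, which is where the "getSplits: split -> ..."
    // lines are printed.
    Logger.getLogger("org.apache.hadoop.hbase.mapreduce.TableInputFormatBase")
        .setLevel(Level.DEBUG);
    // ... build and submit the MR job as usual after this point.
  }
}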
Hi ,
Is it possible to define a TTL for an HBase row (I found TTL only for column
families)?
If it is not possible, what is the best practice to implement TTL for
HBase rows?
Thanks in advance
Oleg.
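Since TTL can only be set per column family, a row whose cells all live in a
single TTL'd family effectively expires as a whole. A hedged sketch of setting
that up at table-creation time (names and the 7-day value are made up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class TtlFamilyExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HColumnDescriptor family = new HColumnDescriptor("d");    // placeholder family
    family.setTimeToLive(7 * 24 * 60 * 60);                   // expire cells after 7 days
    HTableDescriptor table = new HTableDescriptor("events");  // placeholder table
    table.addFamily(family);
    new HBaseAdmin(conf).createTable(table);
  }
}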
Hi Jr.
Have a look at the Exception message:
org.apache.hadoop.hbase.ClockOutOfSyncException: Server
160.110.79.60,60020,1306320747968 has been rejected; Reported time is too
far out of sync with master. Time difference of 19832277ms > max allowed of
3ms
It seems that the system time of th
Here's an initial thought question: Are your cluster nodes' clocks in
sync with one another? i.e., is ntpd functional on each? HBase/ZK
requires the clocks on all nodes to be synchronized with one another.
On Wed, May 25, 2011 at 5:37 PM, James Ram wrote:
> Hi,
>
> I am using 1 master and 4 slave m
Hi,
I am using a cluster with 1 master and 4 slave machines. I tried to add three
HRegionservers, including the master. But it's throwing the following
exception. The HRegionserver is running on the master but it's not running on
the slaves. I have also included the hbase-site.xml and regionservers files.
Please
Hello,
I am currently working on a MR job that will output HFiles to be bulk
loaded into an HBase table.
According to the HBase site, in order for the bulk loading to be efficient,
each HFile of the MR job should fit within a single region.
In order to achieve that I use the TotalOrderPartiti
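For what it's worth, the Java helper HFileOutputFormat.configureIncrementalLoad
sets up the TotalOrderPartitioner against the table's current region
boundaries; a minimal sketch, with placeholder names and the MR specifics
elided:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.mapreduce.Job;

public class HFileJobSetup {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "hfile-bulk-load");        // placeholder job name
    job.setJarByClass(HFileJobSetup.class);
    // job.setMapperClass(...), input format, etc. -- the mapper should emit
    // ImmutableBytesWritable row keys and Put values.
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);
    HTable table = new HTable(conf, "mytable");        // placeholder table name
    // Wires up the reducer, the TotalOrderPartitioner over the table's current
    // region start keys, and HFileOutputFormat, so each reducer produces HFiles
    // covering a single region's key range.
    HFileOutputFormat.configureIncrementalLoad(job, table);
    job.waitForCompletion(true);
  }
}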
Root and Meta had been assigned finally, from the logs.
I am not sure what's up.
It seems there was some socket exception,
so ServerShutdownHandler couldn't finish. I will try to reproduce it again (I
think it is easy).
2011-05-24 09:09:49,292 WARN
[RegionServer:3;C4C2.site,56262,1306199352333-EventThread]
Hi all,
I would like to figure out which table region is used by a specific map
task. Is there a way to find such information in the hbase logs?
Thanks in advance
Christian
--8<--
Siemens AG
Corporate Technology
Corporate Research and Technol
Hi,
Is there - or are there plans for implementing - an atomic increment method for
a column value in the REST API? I noticed such a method in Thrift.
Just thinking of a way to increment a column value integer in a single
operation - avoiding the need for a GET request followed by a PUT reques
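Not a REST answer, but for comparison this is what the single-call atomic
increment looks like in the Java client (table, row, and column names are
placeholders):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class AtomicIncrementExample {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(HBaseConfiguration.create(), "counters"); // placeholder
    // Atomically adds 1 server-side and returns the new value; no read-modify-write
    // round trip from the client.
    long newValue = table.incrementColumnValue(
        Bytes.toBytes("page-42"), Bytes.toBytes("d"), Bytes.toBytes("hits"), 1L);
    System.out.println("new value: " + newValue);
    table.close();
  }
}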
Region size is 512M
hbase.regionserver.handler.count 50
hbase.regionserver.global.memstore.upperLimit 0.4
hbase.regionserver.global.memstore.lowerLimit 0.35
hbase.hregion.memstore.flush.size 128M
hbase.hregion.max.filesize 512M
hbase.client.scanner.caching 1
hfile.block.cache.size 0.2
hbase.h
Yes, it is the case that the balancer assigned a portion of the total.
How many regions in the cluster? Do you say 1344 above? How do we get to 5041?
In my test cluster: 1 hmasters , 2 regionservers , 3 zookeeper and 5041 regions
In this scenario:
1, Two Zookeeper crashed
2, One Hmaster and one regionserver
I'm planning to use a NoSQL distributed database.
I did some searching and came across a lot of database systems such as
MongoDB, CouchDB, Hbase, Cassandra, Hypertable, etc.
Since what I'll be doing is frequently reading a varying amount of data, and
less frequently writing a massive amount of dat