Re: hbase and hypertable comparison

2011-05-25 Thread Edward Choi
Thanks for the clear answer, Andy. The comparison actually was conducted by the hypertable dev team, so I guess it wasn't all that fair to hbase. I have regained my confidence in hbase once more :) Ed From mp2893's iPhone On 2011. 5. 26., at 12:03 AM, Andrew Purtell wrote: > I think I can spea

Re: Cannot open channel to 1 at election address

2011-05-25 Thread Stack
Anything listening at 160.110.79.33:3888? Is it reachable from the server that is trying to connect? St.Ack On Wed, May 25, 2011 at 9:37 PM, James Ram wrote: > Hi, > > We are using 1 Master and 4 Slave machines for Hadoop cluster. When I try to > start the Hbase in Master, the slaves is showing
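
One way to answer both questions in the message above (is anything listening, and is it reachable) is a plain TCP connection attempt made from the server that is failing to connect. The following is only a minimal sketch: the helper name `can_connect` and the 3-second timeout are illustrative assumptions, not part of ZooKeeper or HBase.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; a False result covers both
    "nothing is listening" and "the host is not reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call, using the election address quoted in the thread:
#   can_connect("160.110.79.33", 3888)
```

A `True` result only shows that some process accepts connections on that port; it does not confirm that the process is the expected ZooKeeper peer.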

Cannot open channel to 1 at election address

2011-05-25 Thread James Ram
Hi, We are using 1 Master and 4 Slave machines for our Hadoop cluster. When I try to start HBase on the Master, the slaves show the exception below. Can you help me? 2011-05-26 09:44:10,531 WARN org.apache.zookeeper.server.quorum.QuorumCnxManager: Cannot open channel to 1 at election address /1

Re: ClockOutOfSyncException with zookeeper

2011-05-25 Thread James Ram
Hi Harsh / Christian, Thank you for your reply. The issue is solved. There was a time difference between the master and 2 slaves (Regionservers). On Wed, May 25, 2011 at 5:41 PM, Kleegrewe, Christian < christian.kleegr...@siemens.com> wrote: > Hi Jr. > > Have a look at the Exception message: > > org.apache

Re: mslab enabled jvm crash

2011-05-25 Thread Stack
Python is great. If you can hold your nose a little longer, you are either almost there, or it's a lost cause, so bear with us a little longer. Did the configs above make a difference? (Initiating compaction at 65% is conservative -- you'll be burning lots of CPU -- but probably good to start her

Re: mslab enabled jvm crash

2011-05-25 Thread Erik Onnen
On Wed, May 25, 2011 at 2:44 PM, Wayne wrote: > What are your write levels? We are pushing 30-40k writes/sec/node on 10 > nodes for 24-36-48-72 hours straight. We have only 4 writers per node so we > are hardly overwhelming the nodes. Disk utilization runs at 10-20%, load is > max 50% including so

Re: mslab enabled jvm crash

2011-05-25 Thread Wayne
That may be the best advice I ever got...although I would say 9 months ago we didn't have 1 line of python and now we have a fantastic mpp framework built with python with a team most of which never wrote a line of python before. But...java is not python... We have shredded our relational past and fr

Re: mslab enabled jvm crash

2011-05-25 Thread Ted Dunning
This may be the most important detail of all. It is important to go with your deep skills. I would be a round peg in your square shop and you would be a square one in my round one. On Wed, May 25, 2011 at 5:55 PM, Wayne wrote: > We are not a Java shop, and do not want to become one. I think to

Re: mslab enabled jvm crash

2011-05-25 Thread Wayne
We are using std thrift from python. All writes are batched into usually 30k writes per batch. The writes are small double/varchar(100) type values. Our current write performance is fine for our needs...our concern is that they are not sustainable over time given the GC timeouts. Per the 4 items a

Re: mslab enabled jvm crash

2011-05-25 Thread Ted Dunning
How large are these writes? Are you using asynchbase or other alternative client implementation? Are you batching updates? On Wed, May 25, 2011 at 2:44 PM, Wayne wrote: > What are your write levels? We are pushing 30-40k writes/sec/node on 10 > nodes for 24-36-48-72 hours straight. We have onl

Re: a question storefileIndexSize

2011-05-25 Thread Stack
On Wed, May 25, 2011 at 4:49 PM, Matt Corgan wrote: > I was thinking it would be a nice feature if each time an hfile was written > it kept a count of the raw bytes (before compression) to make it easy to > compare to the file size on disk.  It could report it in the web interface > next to the di

Re: a question storefileIndexSize

2011-05-25 Thread Matt Corgan
I was thinking it would be a nice feature if each time an hfile was written it kept a count of the raw bytes (before compression) to make it easy to compare to the file size on disk. It could report it in the web interface next to the disk size. 2011/5/25 Stack > Good point Matt. I forgot abo

Re: Log4j changes not working inside static mapper and reducer classes

2011-05-25 Thread Ted Yu
>> Also is modifying the log4j.properties in the conf directory a good approach to ... I assume you have distributed the modified log4j.properties onto the task tracker machines. On Wed, May 25, 2011 at 3:52 PM, Himanish Kushary wrote: > The log4j logging statements work when I run the Map-Reduce

Re: Log4j changes not working inside static mapper and reducer classes

2011-05-25 Thread Dave Latham
I'd recommend adding -Dlog4j.debug to the JVM args for any JVM that's not giving you what you expect. In this case, if it's the map/reduce tasks, add it to mapred.child.java.opts in mapred-site.xml. It should show you what configuration log4j is actually picking up. Dave On Wed, May 25, 2011 at

Re: mslab enabled jvm crash

2011-05-25 Thread Ted Dunning
We know several things that are in common with your hbase and your cassandra. a) the jvm b) the machines c) the os d) the (necessary) prejudices of the implementors and op staff On the other hand, we know of other hbase (and cassandra) installations running similar volumes on the same JVM. I

Re: Log4j changes not working inside static mapper and reducer classes

2011-05-25 Thread Himanish Kushary
The log4j logging statements work when I run the Map-Reduce job from eclipse using the LocalTaskTracker. But the logging is not working when I ran the Map-Reduce through hadoop jar command on the cluster. Strangely only the logging statements in the main enclosing class(the job class with main meth

Re: Using MiniHBaseCluster in Maven Tests with LZO

2011-05-25 Thread Matt Davies
I've figured it out. Turns out that the other configuration was loaded too late. Regardless the following works properly: org.apache.maven.plugins maven-surefire-plugin once target -Djava.library.path=/Users/tynt/apps/hadoop/lib/native/Mac

Re: mslab enabled jvm crash

2011-05-25 Thread Wayne
What are your write levels? We are pushing 30-40k writes/sec/node on 10 nodes for 24-36-48-72 hours straight. We have only 4 writers per node so we are hardly overwhelming the nodes. Disk utilization runs at 10-20%, load is max 50% including some app code, and memory is the 8g JVM out of 24G. We ru

Re: a question storefileIndexSize

2011-05-25 Thread Stack
Good point, Matt. I forgot about compression. Let me add a note to the above-referenced section in the book. St.Ack On Wed, May 25, 2011 at 7:47 AM, Matt Corgan wrote: > I have a table that compresses by 30x using gzip, so the default block size > of 64 KB was writing 2 KB blocks to disk. To re

Using MiniHBaseCluster in Maven Tests with LZO

2011-05-25 Thread Matt Davies
Hey all, Running into a situation where I'm trying to use MiniHBaseCluster to test operations against a table with LZO compression enabled. I have a set of unit tests that exercise certain capabilities, but the moment I use LZO it fails. I know exactly what it is: it cannot find the nativegp

RE: wrong region exception

2011-05-25 Thread Robert Gonzalez
That region is not in the urlhashv2 directory. I'll grep all the logs to see if it shows up. -Original Message- From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack Sent: Wednesday, May 25, 2011 3:30 PM To: user@hbase.apache.org Subject: Re: wrong region exception

RE: wrong region exception

2011-05-25 Thread Robert Gonzalez
Nope, nothing in the logs with that string. -Original Message- From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack Sent: Wednesday, May 25, 2011 3:30 PM To: user@hbase.apache.org Subject: Re: wrong region exception Can you find this region in the filesystem? Look un

Re: HBase Error - assignment of -ROOT- failure

2011-05-25 Thread Pavan
In case you are trying it on a virtual machine: I changed my networking options from NAT to Bridged and it started working. I am guessing it otherwise has problems accessing the network interface.

Re: HBase major_,compaction

2011-05-25 Thread Stack
Major compaction will not merge old small regions. The change in table schema makes it so existing ones will grow larger before they in turn split. To check whether a table is already compacted, do an lsr on the table's directory in hdfs and see how many storefiles there are. If a major compaction ran recently, th

Re: wrong region exception

2011-05-25 Thread Stack
Can you find this region in the filesystem? Look under the urlhashv2 table directory for a directory named 80116D7E506D87ED39EAFFE784B5B590. Grep your master log to see if you can figure out the history of this region. St.Ack On Wed, May 25, 2011 at 1:21 PM, Robert Gonzalez wrote: > The detailed er

RE: wrong region exception

2011-05-25 Thread Robert Gonzalez
The detailed error is : Chain of regions in table urlhashv2 is broken; edges does not contain 80116D7E506D87ED39EAFFE784B5B590 Table urlhashv2 is inconsistent. How does one fix this? Thanks, Robert -Original Message- From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of

Re: mslab enabled jvm crash

2011-05-25 Thread Erik Onnen
On Wed, May 25, 2011 at 11:39 AM, Ted Dunning wrote: > It should be recognized that your experiences are a bit out of the norm > here.  Many hbase installations use more recent JVM's without problems. Indeed, we run u25 on CentOS 5.6 and over several days uptime it's common to never see a full GC

Re: about TestRollingRestart

2011-05-25 Thread Stack
Is the .META. allocated? Can you tell why its stuck? Is it an issue with notifications not happening on .META. reassign? That timeout below seems very big. Did we actually wait that long? St.Ack On Wed, May 25, 2011 at 3:26 AM, Gaojinchao wrote: > Root and Meta had assigned finally from log

HBase major_,compaction

2011-05-25 Thread Miguel Costa
Hi, I created a table with a small maxfilesize (64MB) and now the table has got lots of regions, more than 1000. I changed the maxfilesize to a bigger number (512MB) but I still have lots of regions even after I major_compact all tables. It seems that major_compact didn't compact any region.

Re: About RegionServer checkin

2011-05-25 Thread Stack
2011/5/25 Gaojinchao : > How many regions in the cluster?  Do you say 1344 above?  How do we get to > 5041? > > In my test cluster: 1 hmasters , 2 regionservers , 3 zookeeper and 5041 > regions > > In this scenario: > 1, Two Zookeeper crashed What made them crash? Are you doing recovery testing

Re: Log4j changes not working inside static mapper and reducer classes

2011-05-25 Thread Jean-Daniel Cryans
I'm not sure why you are asking this question on the hbase user mailing list, it seems like you have a log4j issue. J-D On Wed, May 25, 2011 at 1:03 PM, Himanish Kushary wrote: > Could anybody please help me with this. > > On Tue, May 24, 2011 at 10:17 AM, Himanish Kushary wrote: > >> Hi, >> >>

Re: Log4j changes not working inside static mapper and reducer classes

2011-05-25 Thread Himanish Kushary
Could anybody please help me with this. On Tue, May 24, 2011 at 10:17 AM, Himanish Kushary wrote: > Hi, > > I have enabled debug for my Map-Reduce package inside the log4j.properties > under the $HADOOP_HOME/conf directory (using CDH3). > > log4j.logger.com.himanish.analytics.mapreduce=DEBUG > >

Re: mslab enabled jvm crash

2011-05-25 Thread Wayne
I have restarted kicking in CMS earlier (65%) and turning off the incremental. We have an 8g heap...should we go to 10g (24g in box)? More memory for the JVM has never seemed to be better...though maybe with lots of hot regions and our flush size we might be pushing it? Should we up the 50% for mem

Re: hbase hbck error

2011-05-25 Thread Stack
On Wed, May 25, 2011 at 10:49 AM, Jinsong Hu wrote > if we add the root region back in, then  essentially the hbck is complaining > every region is bad, > which is not true. > I did notice and recently fix an issue where HBCK will print an ERROR for all regions that follow a bad one so rather tha

Re: mslab enabled jvm crash

2011-05-25 Thread Stack
On Wed, May 25, 2011 at 11:08 AM, Wayne wrote: > I tried to turn off all special JVM settings we have tried in the past. > Below are link to the requested configs. I will try to find more logs for > the full GC. We just made the switch and on this node it has > only occurred once in the scope of t

Re: mslab enabled jvm crash

2011-05-25 Thread Wayne
Most hbase installations also seem to recommend bulk inserts for loading data. We are pushing it more than most in terms of actually using the client API to load large volumes of data. We keep delaying putting hbase into production as nodes going awol for as much as 2+ minutes we can not accept as

Re: mslab enabled jvm crash

2011-05-25 Thread Ted Dunning
Wayne, It should be recognized that your experiences are a bit out of the norm here. Many hbase installations use more recent JVM's without problems. As such, it may be premature to point the finger at the JVM as opposed to the workload or environmental factors. Such a premature diagnosis can m

RE: bulkloader zookeeper connectString

2011-05-25 Thread Geoff Hendrey
Thanks for the pointer. I read the doc, and somehow had missed that argument. -geoff -Original Message- From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of Jean-Daniel Cryans Sent: Wednesday, May 25, 2011 10:40 AM To: user@hbase.apache.org Subject: Re: bulkloader zookeeper c

Re: mslab enabled jvm crash

2011-05-25 Thread Wayne
We have the line commented out with the new ratio. I will turn off the incremental mode. We do have cache turned off on the table level and have set it to 1% for .meta. only. We do not use the block cache. I will keep testing. Frankly u25 scares us as well, as older JVMs seem much better based on previou

Re: mslab enabled jvm crash

2011-05-25 Thread Todd Lipcon
For your GC settings: - i wouldn't tune newratio or survivor ratio at all - if you want to tame your young GC pauses, use -Xmn to pick a new size - eg -Xmn256m - turn off CMS Incremental Mode if you're running on real server hardware HBase settings: - 1% of heap to block cache seems strange. maybe

Re: mslab enabled jvm crash

2011-05-25 Thread Wayne
I tried to turn off all special JVM settings we have tried in the past. Below are links to the requested configs. I will try to find more logs for the full GC. We just made the switch and on this node it has only occurred once in the scope of the current log (it may have rolled?). Thanks. http://p

Re: hbase hbck error

2011-05-25 Thread Jinsong Hu
Hi, Stack: You have a point. I checked the non-hbase machine's hbck's result, and it shows : Summary: 2418 inconsistencies detected. Status: INCONSISTENT That number seems very familiar to me, so I went to the master admin page, and found: Total: servers: 6 requests=2783, reg

Re: mslab enabled jvm crash

2011-05-25 Thread Todd Lipcon
Hi Wayne, Looks like your RAM might be oversubscribed. Could you paste your hbase-site.xml and hbase-env.sh files? Also looks like you have some strange GC settings on (eg perm gen collection which we don't really need) If you can paste a larger segment of GC logs (enough to include at least two

Re: bulkloader zookeeper connectString

2011-05-25 Thread Jean-Daniel Cryans
From the doc at http://hbase.apache.org/bulk-loads.html: The -c config-file option can be used to specify a file containing the appropriate hbase parameters (e.g., hbase-site.xml) if not supplied already on the CLASSPATH (In addition, the CLASSPATH must contain the directory that has the zookeeper

Re: HFiles that fit within a single region VS better load balancing at reduce phase

2011-05-25 Thread Ted Yu
HBASE-3721 was integrated into trunk, not 0.90.x. HBASE-3871 is under review. So I would interpret my answer as tilting toward outputting HFiles that fit within a single Region. If, after your effort, there are still some HFiles that don't fit, you can try my patches. Thanks 2011/5/25 Panayotis Anto

Re: mslab enabled jvm crash

2011-05-25 Thread Wayne
We switched to u25 and reverted the JVM settings to those recommended. Now we have concurrent mode failures lasting more than 60 seconds while under hardly any load. Below are the entries from the JVM log. Of course we can up the zookeeper timeout to 2 min or 10 min for that matt

Re: HTableInterface.batch actions number limit

2011-05-25 Thread Jean-Daniel Cryans
There's no "recommended size", I guess as long as it fits in memory it's ok given that not all JVMs are given the same amount of heap. J-D On Wed, May 25, 2011 at 6:48 AM, Lucian Iordache wrote: > To be more specific, I was thinking of a recommended number of deletes per > batch. > For example I

Re: hbase row TTL

2011-05-25 Thread Jean-Daniel Cryans
As you saw it's family based, there's no "cross-family" schemas. Can you tell us more about your use case? J-D On Wed, May 25, 2011 at 5:58 AM, Oleg Ruchovets wrote: > Hi , >    Is it possible to define TTL for hbase row  (I found TTL only for column > family) ? >  In case it is not possible wh

RE: HFiles that fit within a single region VS better load balancing at reduce phase

2011-05-25 Thread Panayotis Antonopoulos
So your answer would be that it is better to have the best possible load balancing during the reduce phase instead of taking care to output Hfiles that fit within a single Region, because splitting done by Incremental Load is rather fast? > Date: Wed, 25 May 2011 09:20:10 -0700 > Subject: Re:

Re: hbase hbck error

2011-05-25 Thread Stack
On Wed, May 25, 2011 at 9:18 AM, Jinsong Hu wrote: > I tried several other non-hbase machines that has proper configuration, sure > enough, all of them complain problems. > This is interesting Jinsong. For sure the configuration was pointed at the right filesystem. Do you think there could have

Re: HFiles that fit within a single region VS better load balancing at reduce phase

2011-05-25 Thread Ted Yu
LoadIncrementalHFiles would split HFile if it doesn't fit within a single region. Please refer to the following JIRAs which speedup LoadIncrementalHFiles: https://issues.apache.org/jira/browse/HBASE-3871 https://issues.apache.org/jira/browse/HBASE-3721 Note: parallelizing splitting of HFile(s) by

Re: hbase hbck error

2011-05-25 Thread Jinsong Hu
This is a follow-up on what I have found. I exported the several tables that hbck complained about to hdfs, truncated the original tables, imported them again, and ran hbck, and found that hbck still complains that the hdfs directory is not there. I went to hdfs and took a look, and the region's

Re: a question storefileIndexSize

2011-05-25 Thread Matt Corgan
also - how long are your column family name and column qualifiers? they are added to each row key in the index, so you want to make them as short as possible On Wed, May 25, 2011 at 10:47 AM, Matt Corgan wrote: > I have a table that compresses by 30x using gzip, so the default block size > of

Re: hbase and hypertable comparison

2011-05-25 Thread Andrew Purtell
I think I can speak for all of the HBase devs that in our opinion this vendor "benchmark" was designed by hypertable to demonstrate a specific feature of their system -- autotuning -- in such a way that HBase was, obviously, not tuned. Nobody from the HBase project was consulted on the results o

RE: Mapreduce log question

2011-05-25 Thread Kleegrewe, Christian
Hi Dave, Thanks for the reply. Actually we are not using TableInputFormat. I will have a look at the class and if it makes logfiles more usable I will use it, Best regards, Christian --8<-- Siemens AG Corporate Technology Corporate Research and Technol

Re: REST & Atomic increment

2011-05-25 Thread Andrew Purtell
Do you have any preference for how this might be accomplished? Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White) --- On Wed, 5/25/11, Mark Jarecki wrote: > From: Mark Jarecki > Subject: REST & Atomic increment > To: user@hbase

Re: a question storefileIndexSize

2011-05-25 Thread Matt Corgan
I have a table that compresses by 30x using gzip, so the default block size of 64 KB was writing 2 KB blocks to disk. To reduce storefileIndexSize, I raised the block size to 256 KB, presumably writing ~8KB disk blocks which is still pretty small. Maybe you could go even higher depending on your
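
The block-size arithmetic in the message above can be checked with a few lines. This is only a rough sketch that assumes the 30x compression ratio applies uniformly to every block; the function name `on_disk_block_kb` is a hypothetical helper, not an HBase API.

```python
def on_disk_block_kb(block_size_kb, compression_ratio):
    """Approximate compressed size (KB) of one block, assuming a uniform ratio."""
    return block_size_kb / compression_ratio


# 64 KB is the default block size and 256 KB the raised value from the message;
# 30 is the gzip compression ratio reported there.
for size_kb in (64, 256):
    print(size_kb, "KB block ->", round(on_disk_block_kb(size_kb, 30), 1), "KB on disk")
```

Under that assumption the two sizes come out at roughly 2.1 KB and 8.5 KB on disk, in line with the "2 KB" and "~8KB" figures quoted in the message.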

Re: HTableInterface.batch actions number limit

2011-05-25 Thread Lucian Iordache
To be more specific, I was thinking of a recommended number of deletes per batch. For example I need to delete 200.000 rows, should I delete them in several batches or all at once? (I've noticed that some problems appear for lists containing more than 100.000 deletes) On Wed, May 25, 2011 at 4:27
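
If the deletes are split into several batches, the split itself is simple to express. The sketch below only shows the chunking step; the chunk size of 10,000 is an arbitrary assumption for illustration (the thread does not recommend a number), and the row keys are placeholders.

```python
def chunks(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder row keys standing in for the 200.000 rows mentioned above.
row_keys = [f"row-{i}" for i in range(200_000)]

# Each chunk would be turned into one list of deletes and sent as one batch call.
batches = list(chunks(row_keys, 10_000))
print(len(batches))  # 20
```

Smaller chunks keep each request's memory footprint bounded on both client and server, at the cost of more round trips.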

HTableInterface.batch actions number limit

2011-05-25 Thread Lucian Iordache
Hello, I have to make a lot of deletes from a hbase table, so I use the batch method, providing a list of Delete objects. Is there any limit for the number of Deletes to send in a batch? -- Regards, Lucian

Re: Mapreduce log question

2011-05-25 Thread Dave Latham
Are you using TableInputFormat? If so, if you turn on DEBUG level logging for hbase (or just org.apache.hadoop.hbase.mapreduce.TableInputFormatBase) you should see lines like this, giving the map task number, region location, start row, and end row: getSplits: split -> 0 -> hslave107:,@G\xA0\xFB\

hbase row TTL

2011-05-25 Thread Oleg Ruchovets
Hi , Is it possible to define TTL for hbase row (I found TTL only for column family) ? In case it is not possible what is the best practice to implement TTL for hbase rows? Thanks in advance Oleg.

RE: ClockOutOfSyncException with zookeeper

2011-05-25 Thread Kleegrewe, Christian
Hi Jr. Have a look at the Exception message: org.apache.hadoop.hbase.ClockOutOfSyncException: Server 160.110.79.60,60020,1306320747968 has been rejected; Reported time is too far out of sync with master. Time difference of 19832277ms > max allowed of 3ms It seems that the system time of th
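
To get a feel for the size of the skew reported in the exception, the millisecond figure can be converted to hours, minutes, and seconds. This is a small illustrative helper (the name `split_ms` is an assumption, not an HBase utility).

```python
def split_ms(ms):
    """Split a millisecond count into (hours, minutes, seconds, milliseconds)."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return hours, minutes, seconds, millis


# 19832277 is the time difference quoted in the exception message above.
print(split_ms(19832277))  # (5, 30, 32, 277)
```

A difference of about five and a half hours is far larger than ordinary clock drift, so it may point to a wrongly set clock or time zone on one of the machines rather than to a slow NTP correction.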

Re: ClockOutOfSyncException with zookeeper

2011-05-25 Thread Harsh J
Here's an initial thought question: Is your cluster nodes' time in sync with one another? i.e., is ntpd functional on each? HBase/ZK would require clocks on each node to be synchronized with one another. On Wed, May 25, 2011 at 5:37 PM, James Ram wrote: > Hi, > > I am using 1 master and 4 slave m

ClockOutOfSyncException with zookeeper

2011-05-25 Thread James Ram
Hi, I am using a cluster of 1 master and 4 slave machines. I tried to add three HRegionservers, including the master, but it's throwing the following exception. The HRegionserver is running on the master but it's not running on the slaves. I have also included the hbase-site.xml and regionservers files. Please

HFiles that fit within a single region VS better load balancing at reduce phase

2011-05-25 Thread Panayotis Antonopoulos
Hello, I am currently working on a MR job that will output HFiles that will be bulk loaded in an HBase Table. According to the HBase site in order for the bulk loading to be efficient each HFile of the MR job should fit within a single region. In order to achieve that I use the TotalOrderPartiti

Re: about TestRollingRestart

2011-05-25 Thread Gaojinchao
According to the logs, Root and Meta were finally assigned. I am not sure what's up. It seems there was some socket exception, so ServerShutdownHandler couldn't finish. I will try to reproduce it again (I think it is easy). 2011-05-24 09:09:49,292 WARN [RegionServer:3;C4C2.site,56262,1306199352333-EventThread]

Mapreduce log question

2011-05-25 Thread Kleegrewe, Christian
Hi all, I would like to figure out which table region is used by a specific map task. Is there a possibility to find such information in the hbase logs? Thanks in advance Christian --8<-- Siemens AG Corporate Technology Corporate Research and Technol

REST & Atomic increment

2011-05-25 Thread Mark Jarecki
Hi, Is there - or are there plans for implementing - an atomic increment method for a column value in the REST API? I noticed such a method in Thrift. Just thinking of a way to increment a column value integer in a single operation - avoiding the need for a GET request followed by a PUT reques

Re: a question storefileIndexSize

2011-05-25 Thread Gaojinchao
Region size is 512M hbase.regionserver.handler.count 50 hbase.regionserver.global.memstore.upperLimit 0.4 hbase.regionserver.global.memstore.lowerLimit 0.35 hbase.hregion.memstore.flush.size 128M hbase.hregion.max.filesize 512M hbase.client.scanner.caching 1 hfile.block.cache.size 0.2 hbase.h

Re: About RegionServer checkin

2011-05-25 Thread Gaojinchao
Yes It is case balancer assigned a portion of the total. How many regions in the cluster? Do you say 1344 above? How do we get to 5041? In my test cluster: 1 hmasters , 2 regionservers , 3 zookeeper and 5041 regions In this scenario: 1, Two Zookeeper crashed 2, One Hmaster and one regionserver

hbase and hypertable comparison

2011-05-25 Thread edward choi
I'm planning to use a NoSQL distributed database. I did some searching and came across a lot of database systems such as MongoDB, CouchDB, Hbase, Cassandra, Hypertable, etc. Since what I'll be doing is frequently reading a varying amount of data, and less frequently writing a massive amount of dat