Who said anything about deb :). I do use tarballs... Yes, so what did it was copying that jar to under hbase/lib, and then a full restart. Now here is a funny thing: the master shuddered for about 10 minutes, spewing these messages:
2010-09-20 21:23:45,826 DEBUG org.apache.hadoop.hbase.master.HMaster: Event NodeCreated with state SyncConnected with path /hbase/UNASSIGNED/97999366
2010-09-20 21:23:45,827 DEBUG org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeCreated with path /hbase/UNASSIGNED/97999366
2010-09-20 21:23:45,827 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got zkEvent NodeCreated state:SyncConnected path:/hbase/UNASSIGNED/97999366
2010-09-20 21:23:45,827 DEBUG org.apache.hadoop.hbase.master.RegionManager: Created/updated UNASSIGNED zNode img15,normal052q.jpg,1285001686282.97999366 in state M2ZK_REGION_OFFLINE
2010-09-20 21:23:45,828 INFO org.apache.hadoop.hbase.master.RegionServerOperation: img13,p1000319tq.jpg,1284952655960.812544765 open on 10.103.2.3,60020,1285042333293
2010-09-20 21:23:45,828 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: Got event type [ M2ZK_REGION_OFFLINE ] for region 97999366
2010-09-20 21:23:45,828 DEBUG org.apache.hadoop.hbase.master.HMaster: Event NodeChildrenChanged with state SyncConnected with path /hbase/UNASSIGNED
2010-09-20 21:23:45,828 DEBUG org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeChildrenChanged with path /hbase/UNASSIGNED
2010-09-20 21:23:45,828 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got zkEvent NodeChildrenChanged state:SyncConnected path:/hbase/UNASSIGNED
2010-09-20 21:23:45,830 DEBUG org.apache.hadoop.hbase.master.BaseScanner: Current assignment of img150,,1284859678248.3116007 is not valid; serverAddress=10.103.2.1:60020, startCode=1285038205920 unknown.

Does anyone know what they mean? At first it would kill one of my datanodes, but what helped was changing the heap size to 4GB for the master and 2GB for the datanode that was dying, and after 10 minutes I got into a clean state.

-Jack

On Mon, Sep 20, 2010 at 9:28 PM, Ryan Rawson <[email protected]> wrote:
> yes, on every single machine as well, and restart.
>
> again, not sure how you'd do this in a scalable manner with your
> deb packages... on the source tarball you can just replace it, rsync
> it out and done.
>
> :-)
>
> On Mon, Sep 20, 2010 at 8:56 PM, Jack Levin <[email protected]> wrote:
>> ok, I found that file, do I replace hadoop-core.*.jar under
>> /usr/lib/hbase/lib?
>> Then restart, etc? All regionservers too?
>>
>> -Jack
>>
>> On Mon, Sep 20, 2010 at 8:40 PM, Ryan Rawson <[email protected]> wrote:
>>> Well I don't really run CDH, I disagree with their rpm/deb packaging
>>> policies and I highly recommend not using DEBs to install software...
>>>
>>> So normally, installing from the tarball, the jar is in
>>> <installpath>/hadoop-0.20.0-320/hadoop-core-0.20.2+320.jar
>>>
>>> On the CDH/DEB edition, it's somewhere silly... locate and find will be
>>> your friend. It should be called hadoop-core-0.20.2+320.jar though!
>>>
>>> I'm working on a github publish of SU's production system, which uses
>>> the cloudera maven repo to install the correct JAR in hbase, so when
>>> you type 'mvn assembly:assembly' to build your own hbase-*-bin.tar.gz
>>> (the * being whatever version you specified in pom.xml) the cdh3b2 jar
>>> comes pre-packaged.
>>>
>>> Stay tuned :-)
>>>
>>> -ryan
>>>
>>> On Mon, Sep 20, 2010 at 8:36 PM, Jack Levin <[email protected]> wrote:
>>>> Ryan, the hadoop jar: what is the usual path to the file? I just want
>>>> to be sure, and where do I put it?
>>>>
>>>> -Jack
>>>>
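For illustration, here is a minimal Python sketch of the "replace it, rsync it out" approach Ryan describes above; the hostnames and paths are hypothetical, and you still need to restart HDFS and HBase afterwards.

#!/usr/bin/env python3
"""Sketch: push the CDH hadoop-core jar into hbase/lib on every node.
Hostnames and paths are hypothetical -- adjust to your own layout."""
import subprocess

HOSTS = ["rs01.example.com", "rs02.example.com"]        # hypothetical regionservers
CDH_JAR = "/usr/lib/hadoop/hadoop-core-0.20.2+320.jar"  # jar from the CDH install
HBASE_LIB = "/usr/lib/hbase/lib/"                       # HBase lib dir on each node

for host in HOSTS:
    # Drop the hadoop-core jar HBase shipped with so only the CDH jar remains.
    subprocess.run(["ssh", host, "rm -f %shadoop-core-*.jar" % HBASE_LIB], check=True)
    # Copy the CDH jar into hbase/lib on the remote host.
    subprocess.run(["rsync", "-av", CDH_JAR, "%s:%s" % (host, HBASE_LIB)], check=True)

# After this, restart HDFS and do a full HBase restart, as discussed above.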
>>>> On Mon, Sep 20, 2010 at 8:30 PM, Ryan Rawson <[email protected]> wrote:
>>>>> you need 2 more things:
>>>>>
>>>>> - restart hdfs
>>>>> - make sure the hadoop jar from your install replaces the one we ship with
>>>>>
>>>>>
>>>>> On Mon, Sep 20, 2010 at 8:22 PM, Jack Levin <[email protected]> wrote:
>>>>>> So, I switched to 0.89, and we already had CDH3
>>>>>> (hadoop-0.20-datanode-0.20.2+320-3.noarch); even though I added
>>>>>> <name>dfs.support.append</name> as true to both hdfs-site.xml and
>>>>>> hbase-site.xml, the master still reports this:
>>>>>>
>>>>>> You are currently running the HMaster without HDFS append support
>>>>>> enabled. This may result in data loss. Please see the HBase wiki for
>>>>>> details.
>>>>>>
>>>>>> Master Attributes
>>>>>> Attribute Name        Value                                        Description
>>>>>> HBase Version         0.89.20100726, r979826                       HBase version and svn revision
>>>>>> HBase Compiled        Sat Jul 31 02:01:58 PDT 2010, stack          When HBase version was compiled and by whom
>>>>>> Hadoop Version        0.20.2, r911707                              Hadoop version and svn revision
>>>>>> Hadoop Compiled       Fri Feb 19 08:07:34 UTC 2010, chrisdo        When Hadoop version was compiled and by whom
>>>>>> HBase Root Directory  hdfs://namenode-rd.imageshack.us:9000/hbase  Location of HBase home directory
>>>>>>
>>>>>> Any ideas what's wrong?
>>>>>>
>>>>>> -Jack
>>>>>>
>>>>>>
>>>>>> On Mon, Sep 20, 2010 at 5:47 PM, Ryan Rawson <[email protected]> wrote:
>>>>>>> Hey,
>>>>>>>
>>>>>>> There is actually only 1 active branch of hbase, that being the 0.89
>>>>>>> release, which is based on 'trunk'. We have snapshotted a series of
>>>>>>> 0.89 "developer releases" in hopes that people would try them out and
>>>>>>> start thinking about the next major version. One of these is what SU
>>>>>>> is running prod on.
>>>>>>>
>>>>>>> At this point, tracking 0.89 and which ones are the 'best' patch sets
>>>>>>> to run is a bit of a contact sport, but if you are serious about not
>>>>>>> losing data it is worthwhile. SU is based on the most recent DR with
>>>>>>> a few minor patches of our own concoction brought in. The current DR
>>>>>>> works, but some Master ops are slow, and there are a few patches on
>>>>>>> top of that. I'll poke about and see if it's possible to publish to a
>>>>>>> github branch or something.
>>>>>>>
>>>>>>> -ryan
>>>>>>>
>>>>>>> On Mon, Sep 20, 2010 at 5:16 PM, Jack Levin <[email protected]> wrote:
>>>>>>>> Sounds good; the only reason I ask is because of this:
>>>>>>>>
>>>>>>>> There are currently two active branches of HBase:
>>>>>>>>
>>>>>>>> * 0.20 - the current stable release series, being maintained with
>>>>>>>> patches for bug fixes only. This release series does not support HDFS
>>>>>>>> durability - edits may be lost in the case of node failure.
>>>>>>>> * 0.89 - a development release series with active feature and
>>>>>>>> stability development, not currently recommended for production use.
>>>>>>>> This release does support HDFS durability - cases in which edits are
>>>>>>>> lost are considered serious bugs.
>>>>>>>>
>>>>>>>> Are we talking about data loss in the case of a datanode going down
>>>>>>>> while being written to, or a RegionServer going down?
>>>>>>>>
>>>>>>>> -jack
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Sep 20, 2010 at 4:09 PM, Ryan Rawson <[email protected]> wrote:
>>>>>>>>> We run 0.89 in production @ Stumbleupon. We also employ 3
>>>>>>>>> committers...
>>>>>>>>>
>>>>>>>>> As for safety, you have no choice but to run 0.89.
>>>>>>>>> If you run a 0.20 release you will lose data. You must be on 0.89 and
>>>>>>>>> CDH3/append-branch to achieve data durability, and there really is no
>>>>>>>>> argument around it. If you are doing your tests with 0.20.6 now, I'd
>>>>>>>>> stop and rebase those tests onto the latest DR announced on the list.
>>>>>>>>>
>>>>>>>>> -ryan
>>>>>>>>>
>>>>>>>>> On Mon, Sep 20, 2010 at 3:17 PM, Jack Levin <[email protected]> wrote:
>>>>>>>>>> Hi Stack, see inline:
>>>>>>>>>>
>>>>>>>>>> On Mon, Sep 20, 2010 at 2:42 PM, Stack <[email protected]> wrote:
>>>>>>>>>>> Hey Jack:
>>>>>>>>>>>
>>>>>>>>>>> Thanks for writing.
>>>>>>>>>>>
>>>>>>>>>>> See below for some comments.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Sep 20, 2010 at 11:00 AM, Jack Levin <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Image-Shack gets close to two million image uploads per day, which
>>>>>>>>>>>> are usually stored on regular servers (we have about 700), as
>>>>>>>>>>>> regular files, and each server has its own host name, such as
>>>>>>>>>>>> (img55). I've been researching how to improve our backend design
>>>>>>>>>>>> in terms of data safety and stumbled onto the Hbase project.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Any other requirements other than data safety? (latency, etc).
>>>>>>>>>>
>>>>>>>>>> Latency is the second requirement. We have some services that are
>>>>>>>>>> very short tail and can produce a 95% cache hit rate, so I assume
>>>>>>>>>> this would really put the cache to good use. Some other services,
>>>>>>>>>> however, have about a 25% cache hit ratio, in which case the latency
>>>>>>>>>> should be 'adequate', e.g. if it's only slightly worse than getting
>>>>>>>>>> data off raw disk, then it's good enough. Safety is supremely
>>>>>>>>>> important, then availability, then speed.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> Now, I think hbase is the most beautiful thing to happen to the
>>>>>>>>>>>> distributed DB world :). The idea is to store image files (about
>>>>>>>>>>>> 400KB on average) into HBASE.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I'd guess some images are much bigger than this. Do you ever limit
>>>>>>>>>>> the size of images folks can upload to your service?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> The setup will include the following configuration:
>>>>>>>>>>>>
>>>>>>>>>>>> 50 servers total (2 datacenters), with 8 GB RAM, dual core cpu,
>>>>>>>>>>>> 6 x 2TB disks each.
>>>>>>>>>>>> 3 to 5 Zookeepers
>>>>>>>>>>>> 2 Masters (in a datacenter each)
>>>>>>>>>>>> 10 to 20 Stargate REST instances (one per server, hash loadbalanced)
>>>>>>>>>>>
>>>>>>>>>>> What's your frontend? Why REST? It might be more efficient if you
>>>>>>>>>>> could run with thrift, given REST base64s its payload IIRC (check
>>>>>>>>>>> the src yourself).
>>>>>>>>>>
>>>>>>>>>> For insertion we use Haproxy, and balance curl PUTs across multiple
>>>>>>>>>> REST instances.
>>>>>>>>>> For reading, it's an nginx proxy that does Content-Type modification
>>>>>>>>>> from image/jpeg to octet-stream, and vice versa; it then hits Haproxy
>>>>>>>>>> again, which hits the balanced REST instances.
>>>>>>>>>> Why REST? It was the simplest thing to run, given that it supports
>>>>>>>>>> HTTP. Potentially we could rewrite something for thrift, as long as
>>>>>>>>>> we can still use HTTP to send and receive data (has anyone written
>>>>>>>>>> anything like that, say in Python, C or Java?)
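As an aside on that last question, here is a minimal Python sketch of a single-cell insert through the Stargate REST gateway, along the lines of the curl PUTs described above. The host, table and column names are hypothetical, and it assumes your Stargate build accepts a raw application/octet-stream body for a single-cell PUT; the other cell formats are documented on the Stargate wiki page mentioned later in this thread.

#!/usr/bin/env python3
"""Sketch: store one image as a single cell via the Stargate REST gateway.
Host, table and column names are hypothetical; adjust to your schema."""
import urllib.request

STARGATE = "http://stargate01.example.com:8080"  # hypothetical Stargate instance
TABLE = "img15"                                  # per-host table, as in the thread
ROW = "normal052q.jpg"                           # row key, e.g. the file name
COLUMN = "image:data"                            # family:qualifier (hypothetical schema)

with open("normal052q.jpg", "rb") as f:          # the image to store
    payload = f.read()

req = urllib.request.Request(
    url="%s/%s/%s/%s" % (STARGATE, TABLE, ROW, COLUMN),
    data=payload,
    headers={"Content-Type": "application/octet-stream"},
    method="PUT",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status)  # expect 200 on a successful store

A GET against the same URL with an Accept: application/octet-stream header should return the raw bytes, which is roughly what the nginx/Haproxy read path described above relies on.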
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> 40 to 50 RegionServers (will probably keep masters separate on
>>>>>>>>>>>> dedicated boxes).
>>>>>>>>>>>> 2 Namenode servers (one backup, highly available, will do fsimage
>>>>>>>>>>>> and edits snapshots also)
>>>>>>>>>>>>
>>>>>>>>>>>> So far I've got about 13 servers running, doing about 20
>>>>>>>>>>>> insertions/second (file sizes ranging from a few KB to 2-3MB,
>>>>>>>>>>>> avg. 400KB) via the Stargate API. Our frontend servers receive
>>>>>>>>>>>> files, and I just fork-insert them into stargate via http (curl).
>>>>>>>>>>>> The inserts are humming along nicely, without any noticeable load
>>>>>>>>>>>> on the regionservers; so far we have inserted about 2 TB worth of
>>>>>>>>>>>> images.
>>>>>>>>>>>> I have adjusted the region file size to be 512MB, and the table
>>>>>>>>>>>> block size to about 400KB, trying to match the average access size
>>>>>>>>>>>> to limit HDFS trips.
>>>>>>>>>>>
>>>>>>>>>>> As Todd suggests, I'd go up from 512MB... 1G at least. You'll
>>>>>>>>>>> probably want to up your flush size from 64MB to 128MB or maybe
>>>>>>>>>>> 192MB.
>>>>>>>>>>
>>>>>>>>>> Yep, I will adjust to 1G. I thought flush was controlled by a
>>>>>>>>>> function of memstore HEAP, something like 40%? Or are you talking
>>>>>>>>>> about HDFS block size?
>>>>>>>>>>
>>>>>>>>>>>> So far the read performance has been more than adequate, and of
>>>>>>>>>>>> course write performance is nowhere near capacity.
>>>>>>>>>>>> So right now, all newly uploaded images go to HBASE. But we do plan
>>>>>>>>>>>> to insert about 170 Million images (about 100 days worth), which is
>>>>>>>>>>>> only about 64 TB, or 10% of the planned cluster size of 600TB.
>>>>>>>>>>>> The end goal is to have a storage system that provides data safety,
>>>>>>>>>>>> i.e. the system may go down but data cannot be lost. Our front-end
>>>>>>>>>>>> servers will continue to serve images from their own file system
>>>>>>>>>>>> (we are serving about 16 Gbit/s at peak); however, should we need
>>>>>>>>>>>> to bring any of those down for maintenance, we will redirect all
>>>>>>>>>>>> traffic to Hbase (should be no more than a few hundred Mbps) while
>>>>>>>>>>>> the front-end server is repaired (for example, having its disk
>>>>>>>>>>>> replaced); after the repairs, we quickly repopulate it with the
>>>>>>>>>>>> missing files, while serving the remaining missing ones off Hbase.
>>>>>>>>>>>> All in all it should be a very interesting project, and I am hoping
>>>>>>>>>>>> not to run into any snags; however, should that happen, I am
>>>>>>>>>>>> pleased to know that such a great and vibrant tech group exists
>>>>>>>>>>>> that supports and uses HBASE :).
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> We're definitely interested in how your project progresses. If you
>>>>>>>>>>> are ever up in the city, you should drop by for a chat.
>>>>>>>>>>
>>>>>>>>>> Cool. I'd like that.
>>>>>>>>>>
>>>>>>>>>>> St.Ack
>>>>>>>>>>>
>>>>>>>>>>> P.S. I'm also w/ Todd that you should move to 0.89 and blooms.
>>>>>>>>>>> P.P.S. I updated the wiki on stargate REST:
>>>>>>>>>>> http://wiki.apache.org/hadoop/Hbase/Stargate
>>>>>>>>>>
>>>>>>>>>> Cool, I assume if we move to that it won't kill existing meta tables
>>>>>>>>>> and data? i.e., is it cross-compatible?
>>>>>>>>>> Is 0.89 ready for a production environment?
>>>>>>>>>>
>>>>>>>>>> -Jack
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
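For what it's worth, a quick back-of-the-envelope check of the sizing discussed above; the per-image and per-disk figures come from the thread, while the replication factor of 3 is the HDFS default and only an assumption about this cluster.

#!/usr/bin/env python3
"""Sanity-check the capacity numbers quoted in the thread. The replication
factor is an assumption (HDFS default); everything else is from the emails."""

images = 170_000_000             # ~100 days of uploads
avg_bytes = 400 * 1000           # ~400KB average image
servers, disks_per_server, tb_per_disk = 50, 6, 2
replication = 3                  # assumed HDFS replication factor

dataset_tb = images * avg_bytes / 1e12             # ~68 TB, the "about 64 TB" above
raw_tb = servers * disks_per_server * tb_per_disk  # 600 TB raw, as planned
usable_tb = raw_tb / replication                   # ~200 TB after 3x replication

print("dataset ~%.0f TB; raw %d TB; usable ~%.0f TB" % (dataset_tb, raw_tb, usable_tb))
print("dataset is %.0f%% of raw, %.0f%% of usable" % (100 * dataset_tb / raw_tb,
                                                      100 * dataset_tb / usable_tb))

In other words, the planned 170M images fit comfortably, but the "10% of cluster size" figure refers to raw disk; under the assumed 3x replication the same data would occupy roughly a third of the usable space.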
