Re: Why did I only get 2 live datanodes?
Finally done! We need to remove the filesystem before we try to format it! It seems that the format command will not fully re-format the filesystem if one already exists. Hopefully others will not run into the same problem as me. :=) David David Wei wrote: > I think my config is okay regarding this temp folder. I just used the > default setting of hadoop for the temp folder. > Right now I found that the problem is the following: > > 2008-10-17 01:52:57,167* FATAL *org.apache.hadoop.dfs.StateChange: > BLOCK* NameSystem.getDatanode: Data node 192.168.49.148:50010 is > attempting to report storage ID > DS-2140035130-127.0.0.1-50010-1223898963914. Node 192.168.55.104:50010 > is expected to serve this storage. > > I tried the following steps: > 1. Ensure every node's settings are identical > 2. Format every node again > 3. Remove the temp folders on each node > 4. Ensure the master and slaves can ssh to each other without a password > 5. Restart the whole cluster > > But the same situation... ... :=( > > David > > [EMAIL PROTECTED] wrote: > >> Have you configured the property hadoop.tmp.dir in the configuration >> conf/hadoop-site.xml? >> Some files will be stored in that directory. Maybe you could try to rm >> that directory and >> run "bin/hadoop namenode -format" again. I met the same problem, and I just did >> the same thing. >> It runs ok. >> >> 2008-10-16 >> >> >> >> [EMAIL PROTECTED] >> >> > > > >
Re: out of memory error
H, I should have sent to hbase mailing list. Please ignore this. Sorry for the spam. On Fri, Oct 17, 2008 at 12:30 AM, Jim Kellerman (POWERSET) < [EMAIL PROTECTED]> wrote: > In the future, you will get a more timely response for hbase > questions if you post them on the [EMAIL PROTECTED] > mailing list. > > In order to address your question, it would be helpful to > know your hardware configuration (memory, # of cores), > any changes you have made to hbase-site.xml, how many > file handles are allocated per process, what else is > running on the same machine as the region server and > what versions of hadoop and hbase you are running. > > --- > Jim Kellerman, Powerset (Live Search, Microsoft Corporation) > > > -Original Message- > > From: Rui Xing [mailto:[EMAIL PROTECTED] > > Sent: Thursday, October 16, 2008 4:52 AM > > To: core-user@hadoop.apache.org > > Subject: out of memory error > > > > Hello List, > > > > We encountered an out-of-memory error in data loading. We have 5 data > > nodes > > and 1 name node distributed on 6 machines. Block-level compression was > > used. > > Following is the log output. Seems the problem was caused in compression. > > Is > > there anybody who ever experienced such error? Any helps or clues are > > appreciated. > > > > 2008-10-15 21:44:33,069 FATAL > > [regionserver/0:0:0:0:0:0:0:0:60020.compactor] > > regionserver.HRegionServer$1(579): Set stop flag in > > regionserver/0:0:0:0:0:0:0:0:60020.compactor > > java.lang.OutOfMemoryError > > at sun.misc.Unsafe.allocateMemory(Native Method) > > at java.nio.DirectByteBuffer.(DirectByteBuffer.java:99) > > at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288) > > at > > > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.(ZlibDecompresso > > r.java:108) > > at > > > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.(ZlibDecompresso > > r.java:115) > > at > > > org.apache.hadoop.io.compress.zlib.ZlibFactory.getZlibDecompressor(ZlibFac > > tory.java:104) > > at > > > org.apache.hadoop.io.compress.DefaultCodec.createDecompressor(DefaultCodec > > .java:80) > > at > > > org.apache.hadoop.io.SequenceFile$Reader.getPooledOrNewDecompressor(Sequen > > ceFile.java:1458) > > at > > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1543) > > at > > org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1442) > > at > > org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1431) > > at > > org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1426) > > at org.apache.hadoop.io.MapFile$Reader.open(MapFile.java:292) > > at > > > org.apache.hadoop.hbase.regionserver.HStoreFile$HbaseMapFile$HbaseReader.< > > init>(HStoreFile.java:635) > > at > > > org.apache.hadoop.hbase.regionserver.HStoreFile$BloomFilterMapFile$Reader. 
> > (HStoreFile.java:717) > > at > > > org.apache.hadoop.hbase.regionserver.HStoreFile$HalfMapFileReader.(H > > StoreFile.java:915) > > at > > > org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java: > > 408) > > at > > org.apache.hadoop.hbase.regionserver.HStore.(HStore.java:263) > > at > > > org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.jav > > a:1698) > > at > > org.apache.hadoop.hbase.regionserver.HRegion.(HRegion.java:481) > > at > > org.apache.hadoop.hbase.regionserver.HRegion.(HRegion.java:421) > > at > > > org.apache.hadoop.hbase.regionserver.HRegion.splitRegion(HRegion.java:815) > > at > > > org.apache.hadoop.hbase.regionserver.CompactSplitThread.split(CompactSplit > > Thread.java:133) > > at > > > org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitTh > > read.java:86) > > 2008-10-15 21:44:33,661 FATAL > > [regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher] > > regionserver.Flusher(183): > > Replay of hlog required. Forcing server restart > > org.apache.hadoop.hbase.DroppedSnapshotException: region: > > p4p_test,,1224072139042 > > at > > > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.ja > > va:1087) > > at > > org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:985) > > at > > > org.apache.hadoop.hbase.regionserver.Flusher.flushRegion(Flusher.java:174) > > at > > org.apache.hadoop.hbase.regionserver.Flusher.run(Flusher.java:91) > > Caused by: java.lang.OutOfMemoryError > > at sun.misc.Unsafe.allocateMemory(Native Method) > > at java.nio.DirectByteBuffer.(DirectByteBuffer.java:99) > > at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288) > > at > > > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.(ZlibDecompresso > > r.java:107) > > at > > > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.(ZlibDecompresso > > r.java:115) > > at > > > org.apache.hadoop.io.compress.zlib.ZlibFactory.getZ
Re: Add new data directory during runtime
Since the configuration file is loaded when a datanode starts up, it's not possible to have a change to "dfs.data.dir" applied at runtime. Please let me know if I'm wrong. On Fri, Oct 17, 2008 at 10:08 AM, Jinyeon Lee <[EMAIL PROTECTED]> wrote: > Is it possible to add more data directories by changing the > configuration `dfs.data.dir' during runtime? > > Regards, > Lee, Jin Yeon >
Re: Why did I only get 2 live datanodes?
I think my config is okay regarding this temp folder. I just used the default setting of hadoop for the temp folder. Right now I found that the problem is the following: 2008-10-17 01:52:57,167* FATAL *org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.getDatanode: Data node 192.168.49.148:50010 is attempting to report storage ID DS-2140035130-127.0.0.1-50010-1223898963914. Node 192.168.55.104:50010 is expected to serve this storage. I tried the following steps: 1. Ensure every node's settings are identical 2. Format every node again 3. Remove the temp folders on each node 4. Ensure the master and slaves can ssh to each other without a password 5. Restart the whole cluster But the same situation... ... :=( David [EMAIL PROTECTED] wrote: > Have you configured the property hadoop.tmp.dir in the configuration > conf/hadoop-site.xml? > Some files will be stored in that directory. Maybe you could try to rm > that directory and > run "bin/hadoop namenode -format" again. I met the same problem, and I just did > the same thing. > It runs ok. > > 2008-10-16 > > > > [EMAIL PROTECTED] >
Add new data directory during runtime
Is it possible to add more data directories by changing the configuration `dfs.data.dir' during runtime? Regards, Lee, Jin Yeon
Problem with MapFile.Reader.get()
Firstly, can someone please explain the purpose of the second parameter - Writable val? The Javadocs don't say what this is for and to my understanding this is what the function is supposed to retrieve, isn't it? In my code I'm passing null for the val parameter and seeing the following exception: 08/10/16 20:09:52 INFO mapred.JobClient: Task Id : attempt_200810141831_0036_m_00_0, Status : FAILED java.lang.NullPointerException at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1754) at org.apache.hadoop.io.MapFile$Reader.get(MapFile.java:523) Is this caused by the fact that I'm passing in null for val or is it something else? I'm using hadoop 0.18.1 if that makes any difference.
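A minimal sketch of the intended calling pattern, assuming the MapFile holds Text keys and Text values; the path and key below are placeholders. The second parameter is an output parameter: get() fills in the Writable you pass rather than allocating one, which is why passing null leads to the NullPointerException shown above.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.io.MapFile;
    import org.apache.hadoop.io.Text;

    public class MapFileGetSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // "/path/to/mapfile" is a placeholder for the MapFile directory
        MapFile.Reader reader = new MapFile.Reader(fs, "/path/to/mapfile", conf);
        Text key = new Text("some-key");   // placeholder key
        Text value = new Text();           // pre-allocate the value object
        // get() returns the filled-in value object, or null if the key is absent
        Text found = (Text) reader.get(key, value);
        if (found != null) {
          System.out.println("value = " + found);
        }
        reader.close();
      }
    }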
Re: Distributed cache Design
Thanks Colin/ Owen I will try some of the ideas here and report back. Best Bhupesh On 10/16/08 4:05 PM, "Colin Evans" <[EMAIL PROTECTED]> wrote: > The trick is to amortize your computation over the whole set. So DFS > for a single node will always be faster on an in-memory graph, but > Hadoop is a good tool for computing all-pairs shortest paths in one shot > if you re-frame the algorithm as a belief propagation and message > passing algorithm. > > A lot of the time, the computation still explodes into n^2 or worse, so > you need to use a binning or blocking algorithm, like the one described > here: http://www.youtube.com/watch?v=1ZDybXl212Q > > In the case of graphs, a blocking function would be to find overlapping > strongly connected subgraphs where each subgraph fits in a reasonable > amount of memory. Then within each block, you do your computation and > you pass a summary of that computation to adjacent blocks,which gets > factored into the next computation. > > When we hooked up a Very Big Graph to our Hadoop cluster, we found that > there were a lot of scaling problems, which went away when we started > optimizing for streaming performance. > > -Colin > > > > Bhupesh Bansal wrote: >> Can you elaborate here , >> >> Lets say I want to implement a DFS in my graph. I am not able to picturise >> implementing it with doing graph in pieces without putting a depth bound to >> (3-4). Lets say we have 200M (4GB) edges to start with >> >> Best >> Bhupesh >> >> >> >> On 10/16/08 3:01 PM, "Owen O'Malley" <[EMAIL PROTECTED]> wrote: >> >> >>> On Oct 16, 2008, at 1:52 PM, Bhupesh Bansal wrote: >>> >>> We at Linkedin are trying to run some Large Graph Analysis problems on Hadoop. The fastest way to run would be to keep a copy of whole Graph in RAM at all mappers. (Graph size is about 8G in RAM) we have cluster of 8- cores machine with 8G on each. >>> The best way to deal with it is *not* to load the entire graph in one >>> process. In the WebMap at Yahoo, we have a graph of the web that has >>> roughly 1 trillion links and 100 billion nodes. See >>> http://tinyurl.com/4fgok6 >>> . To invert the links, you process the graph in pieces and resort >>> based on the target. You'll get much better performance and scale to >>> almost any size. >>> >>> Whats is the best way of doing that ?? Is there a way so that multiple mappers on same machine can access a RAM cache ?? I read about hadoop distributed cache looks like it's copies the file (hdfs / http) locally on the slaves but not necessrily in RAM ?? >>> You could mmap the file from distributed cache using MappedByteBuffer. >>> Then there will be one copy between jvms... >>> >>> -- Owen >>> >> >> >
Re: Distributed cache Design
The trick is to amortize your computation over the whole set. So DFS for a single node will always be faster on an in-memory graph, but Hadoop is a good tool for computing all-pairs shortest paths in one shot if you re-frame the algorithm as a belief propagation and message passing algorithm. A lot of the time, the computation still explodes into n^2 or worse, so you need to use a binning or blocking algorithm, like the one described here: http://www.youtube.com/watch?v=1ZDybXl212Q In the case of graphs, a blocking function would be to find overlapping strongly connected subgraphs where each subgraph fits in a reasonable amount of memory. Then within each block, you do your computation and you pass a summary of that computation to adjacent blocks,which gets factored into the next computation. When we hooked up a Very Big Graph to our Hadoop cluster, we found that there were a lot of scaling problems, which went away when we started optimizing for streaming performance. -Colin Bhupesh Bansal wrote: Can you elaborate here , Lets say I want to implement a DFS in my graph. I am not able to picturise implementing it with doing graph in pieces without putting a depth bound to (3-4). Lets say we have 200M (4GB) edges to start with Best Bhupesh On 10/16/08 3:01 PM, "Owen O'Malley" <[EMAIL PROTECTED]> wrote: On Oct 16, 2008, at 1:52 PM, Bhupesh Bansal wrote: We at Linkedin are trying to run some Large Graph Analysis problems on Hadoop. The fastest way to run would be to keep a copy of whole Graph in RAM at all mappers. (Graph size is about 8G in RAM) we have cluster of 8- cores machine with 8G on each. The best way to deal with it is *not* to load the entire graph in one process. In the WebMap at Yahoo, we have a graph of the web that has roughly 1 trillion links and 100 billion nodes. See http://tinyurl.com/4fgok6 . To invert the links, you process the graph in pieces and resort based on the target. You'll get much better performance and scale to almost any size. Whats is the best way of doing that ?? Is there a way so that multiple mappers on same machine can access a RAM cache ?? I read about hadoop distributed cache looks like it's copies the file (hdfs / http) locally on the slaves but not necessrily in RAM ?? You could mmap the file from distributed cache using MappedByteBuffer. Then there will be one copy between jvms... -- Owen
Re: Videos and slides of the HUG meetings?
Very cool. Thanks. Hien From: Ajay Anand <[EMAIL PROTECTED]> To: core-user@hadoop.apache.org Sent: Thursday, October 16, 2008 2:53:57 PM Subject: RE: Videos and slides of the HUG meetings? Actually this is the first time we were able to record the user group presentations. The recording is available now: http://developer.yahoo.net/blogs/hadoop/2008/10/hadoop_user_group_meeting.html Ajay -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Thursday, October 16, 2008 11:10 AM To: core-user@hadoop.apache.org Subject: Videos and slides of the HUG meetings? Apparently Yahoo has been taking video/audio of all the presentations in the past HUG meetings. Are they available somewhere?
Re: Distributed cache Design
On Oct 16, 2008, at 3:09 PM, Bhupesh Bansal wrote: Lets say I want to implement a DFS in my graph. I am not able to picturise implementing it with doing graph in pieces without putting a depth bound to (3-4). Lets say we have 200M (4GB) edges to start with Start by watching the lecture on graph algorithms in map/reduce: http://www.youtube.com/watch?v=BT-piFBP4fE And see if that makes it clearer. If not, ask more questions. *smile* -- Owen
Re: Chukwa Support
Hi Alex, Chukwa has recently been deployed at Yahoo and now we are in the process of building a new series of patches to update the hadoop repository. Along with those patches, we're going to update the twiki and the deployment procedure. For the licensing issue (HICC), we're also working on it but it will take more time since we have to change the library that we are using. Could you give us more information on how you're planning to use Chukwa? Regards, Jerome B. On 10/16/08 12:16 PM, "Ariel Rabkin" <[EMAIL PROTECTED]> wrote: > Hey, glad to see that Chukwa is getting some attention and interest. > > An adaptor is a Java class that implements > org.apache.hadoop.chukwa.datacollection.adaptor.Adaptor. The Adaptor > javadoc should tell you what the methods need to do. > > You start an adaptor by sending a command of the form "add [classname] > [parameters] 0" to the Chukwa agent over TCP. By default, Chukwa > listens on port 9093. > > I don't believe HICC has been publicly released yet, due to annoying > GPL/Apache license incompatibilities. > > On Wed, Oct 15, 2008 at 3:27 PM, Alex Loddengaard > <[EMAIL PROTECTED]> wrote: >> I'm trying to play with Chukwa, but I'm struggling to get anything going. >> >> I've been operating off of the wiki entry (< >> http://wiki.apache.org/hadoop/Chukwa_Quick_Start>), making revisions as I go >> along. It's unclear to me how to 1) create an adapter and 2) start HICC >> (see the wiki for more information). >> >> I've gone through the wiki and created 'Document TODO:' items for each issue >> that I've run in to. Could someone familiar with Chukwa either comment on >> this issues on the mailing list or update the wiki? >> >> Chukwa seems like a great tool, but it's unclear exactly how to get it up >> and running. >>
Re: Distributed cache Design
Can you elaborate here , Lets say I want to implement a DFS in my graph. I am not able to picturise implementing it with doing graph in pieces without putting a depth bound to (3-4). Lets say we have 200M (4GB) edges to start with Best Bhupesh On 10/16/08 3:01 PM, "Owen O'Malley" <[EMAIL PROTECTED]> wrote: > > On Oct 16, 2008, at 1:52 PM, Bhupesh Bansal wrote: > >> We at Linkedin are trying to run some Large Graph Analysis problems on >> Hadoop. The fastest way to run would be to keep a copy of whole >> Graph in RAM >> at all mappers. (Graph size is about 8G in RAM) we have cluster of 8- >> cores >> machine with 8G on each. > > The best way to deal with it is *not* to load the entire graph in one > process. In the WebMap at Yahoo, we have a graph of the web that has > roughly 1 trillion links and 100 billion nodes. See http://tinyurl.com/4fgok6 > . To invert the links, you process the graph in pieces and resort > based on the target. You'll get much better performance and scale to > almost any size. > >> Whats is the best way of doing that ?? Is there a way so that multiple >> mappers on same machine can access a RAM cache ?? I read about hadoop >> distributed cache looks like it's copies the file (hdfs / http) >> locally on >> the slaves but not necessrily in RAM ?? > > You could mmap the file from distributed cache using MappedByteBuffer. > Then there will be one copy between jvms... > > -- Owen
Re: Distributed cache Design
On Oct 16, 2008, at 1:52 PM, Bhupesh Bansal wrote: We at Linkedin are trying to run some Large Graph Analysis problems on Hadoop. The fastest way to run would be to keep a copy of whole Graph in RAM at all mappers. (Graph size is about 8G in RAM) we have cluster of 8- cores machine with 8G on each. The best way to deal with it is *not* to load the entire graph in one process. In the WebMap at Yahoo, we have a graph of the web that has roughly 1 trillion links and 100 billion nodes. See http://tinyurl.com/4fgok6 . To invert the links, you process the graph in pieces and resort based on the target. You'll get much better performance and scale to almost any size. Whats is the best way of doing that ?? Is there a way so that multiple mappers on same machine can access a RAM cache ?? I read about hadoop distributed cache looks like it's copies the file (hdfs / http) locally on the slaves but not necessrily in RAM ?? You could mmap the file from distributed cache using MappedByteBuffer. Then there will be one copy between jvms... -- Owen
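A rough sketch of Owen's mmap suggestion, assuming the graph file has already been added to the distributed cache for the job and fits within the ~2GB limit of a single MappedByteBuffer (a 6-8GB graph would need several mapped regions or a smaller per-node chunk):

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    public class GraphCacheSketch {
      public static MappedByteBuffer mapGraph(JobConf conf) throws Exception {
        // Assumes exactly one file was added to the distributed cache for this job
        Path[] localFiles = DistributedCache.getLocalCacheFiles(conf);
        RandomAccessFile raf = new RandomAccessFile(localFiles[0].toString(), "r");
        try {
          FileChannel channel = raf.getChannel();
          // The mapping is backed by the OS page cache, so task JVMs on the same
          // node that map the same local file share one physical copy of the data.
          return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        } finally {
          raf.close();  // the mapping stays valid after the channel is closed
        }
      }
    }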
RE: Videos and slides of the HUG meetings?
Actually this is the first time we were able to record the user group presentations. The recording is available now: http://developer.yahoo.net/blogs/hadoop/2008/10/hadoop_user_group_meeting.html Ajay -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Thursday, October 16, 2008 11:10 AM To: core-user@hadoop.apache.org Subject: Videos and slides of the HUG meetings? Apparently Yahoo has been taking video/audio of all the presentations in the past HUG meetings. Are they available somewhere?
Re: Distributed cache Design
At Freebase, we're mapping our large graphs into very large files of triples in HDFS and running large queries over them. Hadoop is optimized for processing streaming data off of disk, and we've found that trying to load a multi-GB graph and then access it in a Hadoop task has scaling problems. Mapping the graph to an on-disk representation as a bunch of interlocking or overlapping subgraphs works very well. -Colin Bhupesh Bansal wrote: Hey guys, We at Linkedin are trying to run some Large Graph Analysis problems on Hadoop. The fastest way to run would be to keep a copy of whole Graph in RAM at all mappers. (Graph size is about 8G in RAM) we have cluster of 8-cores machine with 8G on each. Whats is the best way of doing that ?? Is there a way so that multiple mappers on same machine can access a RAM cache ?? I read about hadoop distributed cache looks like it's copies the file (hdfs / http) locally on the slaves but not necessrily in RAM ?? Best Bhupesh
Re: Distributed cache Design
Bhupesh Bansal wrote: Minor correction the graph size is about 6G and not 8G. Ah, that's better. With the jvm reuse feature in 0.19 you should be able to load it once per job into a static, since all tasks of that job can share a JVM. Things will get tight if you try to run two such jobs at once, since JVMs are only shared by a single job. https://issues.apache.org/jira/browse/HADOOP-249 Doug
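A rough sketch of the pattern Doug describes, assuming JVM reuse is enabled for the job (conf.setNumTasksToExecutePerJvm(-1) in 0.19) and that loadGraph() stands in for the user's own deserialization code (a placeholder here):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class GraphMapperSketch extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

      // Shared by every task of this job that runs in the same (reused) JVM
      private static Object graph;

      public void configure(JobConf conf) {
        synchronized (GraphMapperSketch.class) {
          if (graph == null) {
            graph = loadGraph(conf);  // expensive load runs once per JVM, not once per task
          }
        }
      }

      private Object loadGraph(JobConf conf) {
        // placeholder for reading the graph from the distributed cache / local disk
        return new Object();
      }

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, Text> output, Reporter reporter)
          throws IOException {
        // traverse the shared graph here
      }
    }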
Re: Distributed cache Design
Minor correction the graph size is about 6G and not 8G. On 10/16/08 1:52 PM, "Bhupesh Bansal" <[EMAIL PROTECTED]> wrote: > Hey guys, > > > We at Linkedin are trying to run some Large Graph Analysis problems on > Hadoop. The fastest way to run would be to keep a copy of whole Graph in RAM > at all mappers. (Graph size is about 8G in RAM) we have cluster of 8-cores > machine with 8G on each. > > Whats is the best way of doing that ?? Is there a way so that multiple > mappers on same machine can access a RAM cache ?? I read about hadoop > distributed cache looks like it's copies the file (hdfs / http) locally on > the slaves but not necessrily in RAM ?? > > Best > Bhupesh >
Distributed cache Design
Hey guys, We at Linkedin are trying to run some Large Graph Analysis problems on Hadoop. The fastest way to run would be to keep a copy of the whole graph in RAM at all mappers. (Graph size is about 8G in RAM.) We have a cluster of 8-core machines with 8G on each. What is the best way of doing that? Is there a way so that multiple mappers on the same machine can access a RAM cache? I read about the hadoop distributed cache; it looks like it copies the file (hdfs / http) locally onto the slaves, but not necessarily into RAM? Best Bhupesh
Career Opportunity in Hadoop
I posted about a position with a client of mine a few days ago, and got some great responses, people who I think are qualified for the position. Of course, the process takes a little time. Is anyone else interested in a career opportunity as a "Hadoop Guru" for a firm in New York City? If you are, please respond to me by sending me a copy of your resume. Send it to [EMAIL PROTECTED] When I receive your resume, I will call or email you further details about the job. Thanks.
Any one successfully ran the c++ pipes example?
Hi, I was trying to write an application using the pipes api. But it seemed the serialization part is not working correctly. More specifically, I can't deserialize a string from a StringInStream constructed from context.getInputSplit(). Even with the examples bundled in the distribution archive (wordcount-nopipe.cc), it threw exceptions. If anyone has experience with that, please kindly give some advice. P.S. The code that I am suspecting: HadoopUtils::StringInStream stream(context.getInputSplit()); HadoopUtils::deserializeString(fname, stream); fname should be deserialized from that stream, but it actually wasn't.
Help: How to change number of mappers in Hadoop streaming?
Would anybody help me? Can I use -jobconf mapred.map.tasks=50 in the streaming command to change the job's number of mappers? I don't have a hadoop at hand and cannot verify it. Thanks for your help. --- On Wed, 10/15/08, Steve Gao <[EMAIL PROTECTED]> wrote: From: Steve Gao <[EMAIL PROTECTED]> Subject: How to change number of mappers in Hadoop streaming? To: core-user@hadoop.apache.org Cc: [EMAIL PROTECTED] Date: Wednesday, October 15, 2008, 7:25 PM Is there a way to change the number of mappers on the Hadoop streaming command line? I know I can change hadoop-default.xml: mapred.map.tasks 10 The default number of map tasks per job. Typically set to a prime several times greater than number of available hosts. Ignored when mapred.job.tracker is "local". But that's for all jobs. What if I just want each job to have its own number of mappers? Thanks
CloudBase: Data warehouse system build on top of Hadoop
Hi, CloudBase is a data warehouse system built on top of Hadoop. It is developed by Business.com (www.business.com) and is released to open source community under GNU General Public License 2.0 CloudBase provides a database abstraction layer on top of flat log files and allows one to query the log files using ANSI SQL. Some of the salient features of CloudBase are – 1) Supports ANSI SQL as its query language 2) Provides JDBC driver, so you can use any JDBC database manager application (e.g Squirrel) to connect to CloudBase. 3) Allows you to push results of queries into RDBMS using RDBMS JDBC driver 4) Supports String and Date time functions as mentioned in JDBC specifications. 5) Supports regular expressions in LIKE clause 6) Supports sub-queries, VIEWS 7) Supports Order by, Group By, Having clauses CloudBase site: http://sourceforge.net/projects/cloudbase/ CloudBase discussion group: http://groups.google.com/group/cloudbase-users Read more about CloudBase here- http://internap.dl.sourceforge.net/sourceforge/cloudbase/CloudBase_Tutorial.pdf -Taran
Re: Chukwa Support
Hey, glad to see that Chukwa is getting some attention and interest. An adaptor is a Java class that implements org.apache.hadoop.chukwa.datacollection.adaptor.Adaptor. The Adaptor javadoc should tell you what the methods need to do. You start an adaptor by sending a command of the form "add [classname] [parameters] 0" to the Chukwa agent over TCP. By default, Chukwa listens on port 9093. I don't believe HICC has been publicly released yet, due to annoying GPL/Apache license incompatibilities. On Wed, Oct 15, 2008 at 3:27 PM, Alex Loddengaard <[EMAIL PROTECTED]> wrote: > I'm trying to play with Chukwa, but I'm struggling to get anything going. > > I've been operating off of the wiki entry (< > http://wiki.apache.org/hadoop/Chukwa_Quick_Start>), making revisions as I go > along. It's unclear to me how to 1) create an adapter and 2) start HICC > (see the wiki for more information). > > I've gone through the wiki and created 'Document TODO:' items for each issue > that I've run in to. Could someone familiar with Chukwa either comment on > this issues on the mailing list or update the wiki? > > Chukwa seems like a great tool, but it's unclear exactly how to get it up > and running. > -- Ari Rabkin [EMAIL PROTECTED] UC Berkeley Computer Science Department
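For illustration, a small sketch of sending that command to a local agent over TCP, following the "add [classname] [parameters] 0" format and default port described above; the adaptor class and its parameters below are only placeholders:

    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.net.Socket;

    public class AddAdaptorSketch {
      public static void main(String[] args) throws Exception {
        // 9093 is the default port the Chukwa agent listens on (see above)
        Socket socket = new Socket("localhost", 9093);
        try {
          Writer out = new OutputStreamWriter(socket.getOutputStream());
          // "add [classname] [parameters] 0" -- class name and parameters are placeholders
          out.write("add org.example.MyAdaptor /var/log/myapp.log 0\n");
          out.flush();
        } finally {
          socket.close();
        }
      }
    }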
Videos and slides of the HUG meetings?
Apparently Yahoo has been taking video/audio of all the presentations in the past HUG meetings. Are they available somewhere?
Re: Processing large XML file
Hey Holger, Your project sounds interesting. I would place the decompressed file in DFS, then write the mapreduce application to use StreamXmlRecordReader. This is a simple record reader which allows you to specify beginning and end text strings (in your case, <page> and </page>, respectively). The text between the strings would be the value hadoop passes to your MapReduce application. As long as you don't have nested tags, this would work. Brian On Oct 16, 2008, at 10:28 AM, Holger Baumhaus wrote: Hello, I can't wrap my head around the best way to process a 20 GB Wikipedia XML dump file [1] with Hadoop. The content I'm interested in is enclosed in the <page>-tags. Usually a SAX-parser would be the way to go, but since it is event based I don't think there would be a benefit to using a MR based approach. An article is only around a few KBs in size. My other thought was to preprocess the file and split it up into multiple text files with a size of 128 MBs. That step alone takes around 70 minutes on my machine. If the preprocessing step also did the final processing (counting sentences and words in an article), then it wouldn't take much longer. So even if I used the split files with Hadoop, I wouldn't really save much time, since I have to upload the 157 files to HDFS and then start the MR job. Are there other ways to handle XML files with Hadoop? Holger [1] http://download.wikimedia.org/enwiki/20081008/enwiki-20081008-pages-articles.xml.bz2 --
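A sketch of wiring this up from a Java job driver, assuming <page> elements mark the record boundary; the input path is a placeholder and the hadoop-streaming contrib jar must be on the job classpath:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.streaming.StreamInputFormat;

    public class WikiXmlJobSketch {
      public static void configureInput(JobConf conf) {
        // StreamXmlRecordReader hands each <page>...</page> block to the mapper as one record
        conf.setInputFormat(StreamInputFormat.class);
        conf.set("stream.recordreader.class",
                 "org.apache.hadoop.streaming.StreamXmlRecordReader");
        conf.set("stream.recordreader.begin", "<page>");
        conf.set("stream.recordreader.end", "</page>");
        // placeholder path to the (decompressed) dump in HDFS
        FileInputFormat.setInputPaths(conf, new Path("/wikipedia/enwiki-pages-articles.xml"));
      }
    }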
RE: out of memory error
In the future, you will get a more timely response for hbase questions if you post them on the [EMAIL PROTECTED] mailing list. In order to address your question, it would be helpful to know your hardware configuration (memory, # of cores), any changes you have made to hbase-site.xml, how many file handles are allocated per process, what else is running on the same machine as the region server and what versions of hadoop and hbase you are running. --- Jim Kellerman, Powerset (Live Search, Microsoft Corporation) > -Original Message- > From: Rui Xing [mailto:[EMAIL PROTECTED] > Sent: Thursday, October 16, 2008 4:52 AM > To: core-user@hadoop.apache.org > Subject: out of memory error > > Hello List, > > We encountered an out-of-memory error in data loading. We have 5 data > nodes > and 1 name node distributed on 6 machines. Block-level compression was > used. > Following is the log output. Seems the problem was caused in compression. > Is > there anybody who ever experienced such error? Any helps or clues are > appreciated. > > 2008-10-15 21:44:33,069 FATAL > [regionserver/0:0:0:0:0:0:0:0:60020.compactor] > regionserver.HRegionServer$1(579): Set stop flag in > regionserver/0:0:0:0:0:0:0:0:60020.compactor > java.lang.OutOfMemoryError > at sun.misc.Unsafe.allocateMemory(Native Method) > at java.nio.DirectByteBuffer.(DirectByteBuffer.java:99) > at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288) > at > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.(ZlibDecompresso > r.java:108) > at > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.(ZlibDecompresso > r.java:115) > at > org.apache.hadoop.io.compress.zlib.ZlibFactory.getZlibDecompressor(ZlibFac > tory.java:104) > at > org.apache.hadoop.io.compress.DefaultCodec.createDecompressor(DefaultCodec > .java:80) > at > org.apache.hadoop.io.SequenceFile$Reader.getPooledOrNewDecompressor(Sequen > ceFile.java:1458) > at > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1543) > at > org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1442) > at > org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1431) > at > org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1426) > at org.apache.hadoop.io.MapFile$Reader.open(MapFile.java:292) > at > org.apache.hadoop.hbase.regionserver.HStoreFile$HbaseMapFile$HbaseReader.< > init>(HStoreFile.java:635) > at > org.apache.hadoop.hbase.regionserver.HStoreFile$BloomFilterMapFile$Reader. > (HStoreFile.java:717) > at > org.apache.hadoop.hbase.regionserver.HStoreFile$HalfMapFileReader.(H > StoreFile.java:915) > at > org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java: > 408) > at > org.apache.hadoop.hbase.regionserver.HStore.(HStore.java:263) > at > org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.jav > a:1698) > at > org.apache.hadoop.hbase.regionserver.HRegion.(HRegion.java:481) > at > org.apache.hadoop.hbase.regionserver.HRegion.(HRegion.java:421) > at > org.apache.hadoop.hbase.regionserver.HRegion.splitRegion(HRegion.java:815) > at > org.apache.hadoop.hbase.regionserver.CompactSplitThread.split(CompactSplit > Thread.java:133) > at > org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitTh > read.java:86) > 2008-10-15 21:44:33,661 FATAL > [regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher] > regionserver.Flusher(183): > Replay of hlog required. 
Forcing server restart > org.apache.hadoop.hbase.DroppedSnapshotException: region: > p4p_test,,1224072139042 > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.ja > va:1087) > at > org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:985) > at > org.apache.hadoop.hbase.regionserver.Flusher.flushRegion(Flusher.java:174) > at > org.apache.hadoop.hbase.regionserver.Flusher.run(Flusher.java:91) > Caused by: java.lang.OutOfMemoryError > at sun.misc.Unsafe.allocateMemory(Native Method) > at java.nio.DirectByteBuffer.(DirectByteBuffer.java:99) > at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288) > at > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.(ZlibDecompresso > r.java:107) > at > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.(ZlibDecompresso > r.java:115) > at > org.apache.hadoop.io.compress.zlib.ZlibFactory.getZlibDecompressor(ZlibFac > tory.java:104) > at > org.apache.hadoop.io.compress.DefaultCodec.createDecompressor(DefaultCodec > .java:80) > at > org.apache.hadoop.io.SequenceFile$Reader.getPooledOrNewDecompressor(Sequen > ceFile.java:1458) > at > org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1555) > at > org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1442) > at > org.apache.hado
out of memory error
Hello List, We encountered an out-of-memory error in data loading. We have 5 data nodes and 1 name node distributed on 6 machines. Block-level compression was used. Following is the log output. Seems the problem was caused in compression. Is there anybody who ever experienced such error? Any helps or clues are appreciated. 2008-10-15 21:44:33,069 FATAL [regionserver/0:0:0:0:0:0:0:0:60020.compactor] regionserver.HRegionServer$1(579): Set stop flag in regionserver/0:0:0:0:0:0:0:0:60020.compactor java.lang.OutOfMemoryError at sun.misc.Unsafe.allocateMemory(Native Method) at java.nio.DirectByteBuffer.(DirectByteBuffer.java:99) at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288) at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.(ZlibDecompressor.java:108) at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.(ZlibDecompressor.java:115) at org.apache.hadoop.io.compress.zlib.ZlibFactory.getZlibDecompressor(ZlibFactory.java:104) at org.apache.hadoop.io.compress.DefaultCodec.createDecompressor(DefaultCodec.java:80) at org.apache.hadoop.io.SequenceFile$Reader.getPooledOrNewDecompressor(SequenceFile.java:1458) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1543) at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1442) at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1431) at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1426) at org.apache.hadoop.io.MapFile$Reader.open(MapFile.java:292) at org.apache.hadoop.hbase.regionserver.HStoreFile$HbaseMapFile$HbaseReader.(HStoreFile.java:635) at org.apache.hadoop.hbase.regionserver.HStoreFile$BloomFilterMapFile$Reader.(HStoreFile.java:717) at org.apache.hadoop.hbase.regionserver.HStoreFile$HalfMapFileReader.(HStoreFile.java:915) at org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java:408) at org.apache.hadoop.hbase.regionserver.HStore.(HStore.java:263) at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1698) at org.apache.hadoop.hbase.regionserver.HRegion.(HRegion.java:481) at org.apache.hadoop.hbase.regionserver.HRegion.(HRegion.java:421) at org.apache.hadoop.hbase.regionserver.HRegion.splitRegion(HRegion.java:815) at org.apache.hadoop.hbase.regionserver.CompactSplitThread.split(CompactSplitThread.java:133) at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:86) 2008-10-15 21:44:33,661 FATAL [regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher] regionserver.Flusher(183): Replay of hlog required. 
Forcing server restart org.apache.hadoop.hbase.DroppedSnapshotException: region: p4p_test,,1224072139042 at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1087) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:985) at org.apache.hadoop.hbase.regionserver.Flusher.flushRegion(Flusher.java:174) at org.apache.hadoop.hbase.regionserver.Flusher.run(Flusher.java:91) Caused by: java.lang.OutOfMemoryError at sun.misc.Unsafe.allocateMemory(Native Method) at java.nio.DirectByteBuffer.(DirectByteBuffer.java:99) at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288) at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.(ZlibDecompressor.java:107) at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.(ZlibDecompressor.java:115) at org.apache.hadoop.io.compress.zlib.ZlibFactory.getZlibDecompressor(ZlibFactory.java:104) at org.apache.hadoop.io.compress.DefaultCodec.createDecompressor(DefaultCodec.java:80) at org.apache.hadoop.io.SequenceFile$Reader.getPooledOrNewDecompressor(SequenceFile.java:1458) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1555) at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1442) at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1431) at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1426) at org.apache.hadoop.io.MapFile$Reader.open(MapFile.java:292) at org.apache.hadoop.hbase.regionserver.HStoreFile$HbaseMapFile$HbaseReader.(HStoreFile.java:635) at org.apache.hadoop.hbase.regionserver.HStoreFile$BloomFilterMapFile$Reader.(HStoreFile.java:717) at org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java:413) at org.apache.hadoop.hbase.regionserver.HStore.updateReaders(HStore.java:665) at org.apache.hadoop.hbase.regionserver.HStore.internalFlushCache(HStore.java:640) at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:577) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1074) ... 3 more 2008-10-15 21:44:33,661 INFO [regionserver/0:0:0:0:0:0:0:0:60020.cacheFlu
Processing large XML file
Hello, I can't wrap my head around the best way to process a 20 GB Wikipedia XML dump file [1] with Hadoop. The content I'm interested in is enclosed in the <page>-tags. Usually a SAX-parser would be the way to go, but since it is event based I don't think there would be a benefit to using a MR based approach. An article is only around a few KBs in size. My other thought was to preprocess the file and split it up into multiple text files with a size of 128 MBs. That step alone takes around 70 minutes on my machine. If the preprocessing step also did the final processing (counting sentences and words in an article), then it wouldn't take much longer. So even if I used the split files with Hadoop, I wouldn't really save much time, since I have to upload the 157 files to HDFS and then start the MR job. Are there other ways to handle XML files with Hadoop? Holger [1] http://download.wikimedia.org/enwiki/20081008/enwiki-20081008-pages-articles.xml.bz2 --
CanonicalInputFormat use case: Hadoop + MonetDB serving localhost
Hi Group, This is the second of two emails, each raising a related idea. The first canvasses an InputFormat that allows a data-chunk to be targeted to a MR job/cluster node, and is here: http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200810.mbox/[EMAIL PROTECTED] As an aside, I noticed the following Jira item: "Add support for running MapReduce jobs over data residing in a MySQL table." https://issues.apache.org/jira/browse/HADOOP-2536 Some interested in that may like to investigate MonetDB (monetdb.cwi.nl), primarily because its 'out of the box' performance is quite impressive. http://monetdb.cwi.nl/projects/monetdb/SQL/Benchmark/TPCH/index.html I'm not trying to suggest that PostgreSQL or MySQL should be abandoned - and I certainly don't want to start some 'dispute' :) I'm just observing that MonetDB offers impressive performance which, for a use case I think is 'reasonable', suggests adopting a particular configuration - a DB server on each compute node. The use case set out below does not require any built-in MonetDB-on-Hadoop support, or 'MR for MonetDB'. In fact I'm thinking in terms of a user's code being able to CRUD parsed and intermediate data, not just key-value pairs, before writing out final key-value pairs. Hopefully it introduces someone to an alternative DB they find useful, and adds some context/motivation to the additional InputFormat proposed earlier. Specifically, I'm assuming: 1) A DB installation that is not 'tuned/tweaked' (which all three could be), but is just installed. 2) A DB is installed on each cluster node to serve just that node (user scripts would connect to localhost:<port>) 3) The cluster-nodes are sufficiently resourced/'spec'd for the datasets being queried. 4) The user has been able to load into the DB the _complete_ data-chunk required by that node's MR job. See my previous email above. 5) User queries are similar to the queries listed in the benchmark above. Hence, the performance figures are a reasonable representation of the performance some users might experience. Some observations. - I definitely appreciate the sense in delegating queries to a DB - rather than re-implement this functionality in user code. - Further, one could have N slave MySQL servers to handle queries rather than have them all queue on one server. This would remove the CPU and disk load from the cluster-node machines, loads that local MySQL and PostgreSQL installations would impose for significant time periods. - There is additional network congestion with remote queries, especially if query results are large in size and/or number. - However, in the case of MySQL and PostgreSQL the network latency is likely to be dominated by the query delay, suggesting it would be reasonable to hand off queries to dedicated servers, rather than load down cluster nodes for, say, 1-300+ seconds in the case of MySQL. It seems, to me, that apart from the first observation, MonetDB could turn this client-server approach on its head for some use-cases. With MonetDB the network latency would likely dominate the time taken for the queries assumed. Again, for some use cases, this suggests better performance could be achieved if each node has a MonetDB server serving that node. Network congestion is reduced, and CPU and disk load is increased but only for very small intervals of time. Most importantly, results return much faster. However, memory load would increase (an issue on memory-skinny nodes?).
Given that the benchmark figures refer to out-of-the-box installations, the performance indicated should be achievable in a DB-per-node configuration without too much admin effort. If this is the cluster set-up, then it becomes important to have a convenient method of ensuring each node receives the data-chunk it requires to load into the localhost DB. This is where I think my earlier proposal, CanonicalInputFormat, adds considerable value - it offers a flexible way to ensure a complete data-chunk is delivered to a MapReduce task's node - correct? In this situation a user _could_ write a script to query a remote/localhost DB. However, it seems important to have the convenience of the CanonicalInputFormat, to be able to target data-chunks to nodes/MR jobs. I think the most important aspect of the above is that it demonstrates a compelling use case for the CanonicalInputFormat I raised earlier, whether or not there is built-in MR support for MonetDB. Thoughts? Am I imagining the benefits of using MonetDB as described? Is explicit MR support for MonetDB worth raising as a feature request on the Jira? Cheers Mark
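For concreteness, a hedged sketch of the per-node pattern described above, with a map task querying a MonetDB server on its own node via plain JDBC; the driver class name, port, database name, credentials, and query are all assumptions to check against the MonetDB documentation:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class LocalDbQuerySketch {
      public static void main(String[] args) throws Exception {
        // Assumed driver class and URL format -- verify against the MonetDB JDBC docs
        Class.forName("nl.cwi.monetdb.jdbc.MonetDriver");
        Connection conn = DriverManager.getConnection(
            "jdbc:monetdb://localhost:50000/demo", "monetdb", "monetdb");
        try {
          Statement stmt = conn.createStatement();
          // placeholder query over the data-chunk loaded into this node's DB
          ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM my_chunk");
          while (rs.next()) {
            System.out.println(rs.getLong(1));
          }
        } finally {
          conn.close();
        }
      }
    }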
Re: Why did I only get 2 live datanodes?
Have you configured the property hadoop.tmp.dir in the configuration conf/hadoop-site.xml? Some files will be stored in that directory. Maybe you could try to rm that directory and run "bin/hadoop namenode -format" again. I met the same problem, and I just did the same thing. It runs ok. 2008-10-16 [EMAIL PROTECTED]
Re: Why did I only get 2 live datanodes?
Hi, Wei: Do all your nodes have the same directory structure, the same configuration files, and the same hadoop data directory, and do you have access to the data directory? You may have connection problems; MAKE SURE your slave nodes can ssh to the master without a password, and TURN OFF the FIREWALL on your master node. -- [EMAIL PROTECTED] Institute of Computing Technology, Chinese Academy of Sciences, Beijing.
Re: Why did I only get 2 live datanodes?
Almost desperate! After configuring all the machines to connect to each other by hostname, only ONE node (the master itself) is still alive now! Conditions as follows: 1. each machine can access the others with SSH without needing password input 2. every machine has the same /etc/hosts file listing the names of the machines 3. all the machines have the same masters/slaves files listing the names of the machines 4. inside the hadoop-site.xml file, the master had been configured to the right machine name 5. when starting the cluster from the master machine, we found some logs like this: org.apache.hadoop.dfs.SafeModeException: Cannot delete /tmp/hadoop-root/mapred/system. Name node is in safe mode. The ratio of reported blocks 0.8750 has not reached the threshold 0.9990. Safe mode will be turned off automatically. at org.apache.hadoop.dfs.FSNamesystem.deleteInternal(FSNamesystem.java:1494) at org.apache.hadoop.dfs.FSNamesystem.delete(FSNamesystem.java:1466) at org.apache.hadoop.dfs.NameNode.delete(NameNode.java:425) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452) ... ... 6. inside the slaves' logs, it shows that the slaves can not reach the master: STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r 694836; compiled by 'hadoopqa' on Fri Sep 12 23:29:35 UTC 2008 / 2008-10-16 15:35:44,142 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /192.168.52.129:9000. Already tried 0 time(s). ... after many retries, it stopped. We had already tried to delete the hadoop-root folder inside /tmp, but the problem is the same! Can anybody help? Thx! David Wei wrote: > It seems that this is not the point. ;=( > And in this case, my cluster is really easy to crash > > > СP wrote: > >> According to the configuration settings manual on this website >> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster) >> I think it should be >> >> Master >> Slave1 >> Slave2 >> Slave3 >> >> in the slaves configuration >> maybe you should give each IP a hostname for identification >> >> - Original Message - >> From: "David Wei" <[EMAIL PROTECTED]> >> To: "?P" <[EMAIL PROTECTED]> >> Cc: >> Sent: Thursday, October 16, 2008 11:41 AM >> Subject: Re: Why did I only get 2 live datanodes? >> >> >> >> >>> only the IPs of each node. For example: >>> 192.168.52.129 >>> 192.168.49.40 >>> 192.168.55.104 >>> 192.168.49.148 >>> >>> They are totally different in hardware conf., and does that matter? >>> >>> Thx! >>> >>> David >>> >>> ?P wrote: >>> >>> What is your configuration in $HADOOP_HOME/conf/slaves ? - Original Message - From: "David Wei" <[EMAIL PROTECTED]> To: Sent: Thursday, October 16, 2008 11:05 AM Subject: Why did I only get 2 live datanodes? > We had installed hadoop on 4 machines, and one of them has been chosen > to be the master and also a slave. The rest of the machines are configured as > slaves. But it is strange that we can only see 2 live nodes on the web > UI: http://192.168.52.129:50070/dfshealth.jsp (master machine). When we > try to refresh the page, we found that only 2 nodes are listed and one > of them is continually replaced by others. > > After using the command bin/hadoop dfsadmin -report, we got only 2 data > nodes listed.
> > Another thing is we found this in the log > file(hadoop-root-namenode-datacenter5.log > ---datacenter5 is the name of master): > 2008-10-16 02:45:12,384 INFO org.apache.hadoop.dfs.StateChange: BLOCK* > NameSystem.registerDatanode: node 192.168.49.148:50010 is replaced by > 192.168.52.129:50010 with the same storageID > DS-2140035130-127.0.0.1-50010-1223898963914 > ... > 2008-10-16 02:45:12,432 INFO org.apache.hadoop.dfs.StateChange: BLOCK* > NameSystem.registerDatanode: node 192.168.52.129:50010 is replaced by > 192.168.55.104:50010 with the same storageID > DS-2140035130-127.0.0.1-50010-1223898963914 > > 2008-10-16 02:45:13,380 INFO org.apache.hadoop.dfs.StateChange: BLOCK* > NameSystem.registerDatanode: node 192.168.55.104:50010 is replaced by > 192.168.49.148:50010 with the same storageID > DS-2140035130-127.0.0.1-50010-1223898963914 > > Is there any problem with our configuration or maybe we just missed some > options? > > BTW, my machines are total different in hardware, such as memory, disk > and cpu. > > Thx! > > David > > > > > > > >> >> > > > >