Re: Distributed Agent
Take a look at this topic:
http://dsonline.computer.org/portal/site/dsonline/menuitem.244c5fa74f801883f1a516106bbe36ec/index.jsp?&pName=dso_level1_about&path=dsonline/topics/agents&file=about.xml&xsl=generic.xsl&

2009/4/14 Burak ISIKLI:
> Hello everyone;
> I want to write a distributed agent program, but I can't understand one
> thing: what is the difference between a client-server program and an
> agent program? Please help me.
>
> Burak ISIKLI
> Dumlupinar University
> Electric & Electronic - Computer Engineering
>
> http://burakisikli.wordpress.com
> http://burakisikli.blogspot.com

--
M. Raşit ÖZDAŞ
Re: Modeling WordCount in a different way
On Wed, Apr 15, 2009 at 1:26 AM, Sharad Agarwal wrote:
> > I am trying complex queries on Hadoop, and I require more than one job
> > to get the final result. Job 1 captures a few of the query's joins, and
> > I want to pass its results as input to job 2 for further processing to
> > get the final results. The queries are such that I can't do all of the
> > joins and filtering in job 1, so I require two jobs.
> >
> > Right now I write the results of job 1 to HDFS and read them back for
> > job 2, but that takes unnecessary I/O time. So I was looking for a way
> > to keep the results of job 1 in memory and use them as input for job 2.
> >
> > Do let me know if you need any more details.
>
> How big is your input and output data?

My total data is 7.8 GB, of which job 1 uses around 3 GB. The output of job 1 is about 1 GB, and I use that output as input to job 2.

> How many nodes are you using?

Right now, due to lack of resources, I have only 4 nodes, each with a dual-core processor, 1 GB of RAM, and about 80 GB of disk.

> What is your job runtime?

My first job takes a long time after reaching 90% of the reduce phase, as it does an in-memory merge sort, so that is also a big issue. I will have to arrange for more memory for my cluster, I suppose. I will have a look at the JVM reuse feature.

Thanks,
Pankil
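For readers following along, the two-job pattern under discussion (job 1's output becoming job 2's input) can be sketched in plain Python, with toy map/reduce functions standing in for the real Hadoop jobs; the function names and data below are illustrative only, not Hadoop API:

```python
from itertools import groupby
from operator import itemgetter

def run_mapreduce(records, mapper, reducer):
    """Minimal in-memory stand-in for one MapReduce job."""
    # Map phase: each record may emit any number of (key, value) pairs.
    pairs = [kv for rec in records for kv in mapper(rec)]
    # Shuffle phase: group values by key.
    pairs.sort(key=itemgetter(0))
    # Reduce phase: one output record per key.
    return [reducer(k, [v for _, v in grp])
            for k, grp in groupby(pairs, key=itemgetter(0))]

# Job 1: a word count, standing in for "a few joins of the query".
job1_out = run_mapreduce(
    ["a b a", "b c"],
    mapper=lambda line: [(w, 1) for w in line.split()],
    reducer=lambda k, vs: (k, sum(vs)),
)

# Job 2 consumes job 1's output directly (the "keep it in memory" idea);
# on a real cluster this hand-off goes through HDFS instead.
job2_out = run_mapreduce(
    job1_out,
    mapper=lambda kv: [kv] if kv[1] > 1 else [],   # a filtering step
    reducer=lambda k, vs: (k, vs[0]),
)
print(job2_out)   # prints [('a', 2), ('b', 2)]
```

On a real cluster the same chaining is expressed by pointing job 2's input path at job 1's output path; the purely in-memory hand-off above only works when job 1's output fits on a single machine, which is why the thread ends up discussing memory and JVM reuse rather than skipping HDFS entirely.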
Re: fyi: A Comparison of Approaches to Large-Scale Data Analysis: MapReduce vs. DBMS Benchmarks
Not sure if comparing Hadoop to databases is an apples-to-apples comparison. Hadoop is a complete job execution framework, which collocates the computation with the data. I suppose DBMS-X and Vertica do that to a certain extent, by way of SQL, but you're restricted to that. If you want to, say, build a distributed web crawler or a complex data processing pipeline, Hadoop will schedule those processes across a cluster for you, while Vertica and DBMS-X only deal with the storage of the data.

The choice of experiments seemed skewed towards DBMS-X and Vertica. I think everybody is aware that MapReduce is inefficient at handling SQL-like queries and joins.

It's also worth noting that I think 4 out of the 7 authors either currently or at one time worked with Vertica (or C-Store, the precursor to Vertica).

Andy

On Tue, Apr 14, 2009 at 10:16 AM, Guilherme Germoglio wrote:
> (Hadoop is used in the benchmarks)
>
> http://database.cs.brown.edu/sigmod09/
>
> There is currently considerable enthusiasm around the MapReduce (MR)
> paradigm for large-scale data analysis [17]. Although the basic control
> flow of this framework has existed in parallel SQL database management
> systems (DBMS) for over 20 years, some have called MR a dramatically new
> computing model [8, 17]. In this paper, we describe and compare both
> paradigms. Furthermore, we evaluate both kinds of systems in terms of
> performance and development complexity. To this end, we define a
> benchmark consisting of a collection of tasks that we have run on an open
> source version of MR as well as on two parallel DBMSs. For each task, we
> measure each system's performance for various degrees of parallelism on a
> cluster of 100 nodes. Our results reveal some interesting trade-offs.
> Although the process to load data into and tune the execution of parallel
> DBMSs took much longer than the MR system, the observed performance of
> these DBMSs was strikingly better. We speculate about the causes of the
> dramatic performance difference and consider implementation concepts that
> future systems should take from both kinds of architectures.
>
> --
> Guilherme
>
> msn: guigermog...@hotmail.com
> homepage: http://germoglio.googlepages.com
Re: fyi: A Comparison of Approaches to Large-Scale Data Analysis: MapReduce vs. DBMS Benchmarks
I agree with you, Andy. This seems to be a great look into what Hadoop MapReduce is not good at.

Over in the HBase world, we constantly deal with comparisons like this to RDBMSs, trying to determine whether one is better than the other. It's a false choice and completely depends on the use case. Hadoop is not suited for random access, joins, or dealing with subsets of your data; i.e., it is not a relational database! It's designed to distribute a full scan of a large dataset, placing tasks on the same nodes as the data they process. The emphasis is on task scheduling, fault tolerance, and very large datasets; low latency has not been a priority. There are no "indexes" to speak of; indexing is completely orthogonal to what Hadoop does, so of course there is an enormous disparity in cases where an index makes sense. Yes, B-tree indexes are a wonderful breakthrough in data technology :)

In short, I'm using Hadoop (HDFS and MapReduce) for a broad spectrum of applications, including batch log processing, web crawling, and a number of machine learning and natural language processing jobs. These may not be tasks that DBMS-X or Vertica would be good at, if they are even capable of them, but they are all things that I would include under "Large-Scale Data Analysis".

It would have been really interesting to see how things like Pig, Hive, and Cascading would stack up against DBMS-X/Vertica for very complex, multi-join/sort/etc. queries, across a broad spectrum of use cases and dataset/result sizes.

There is a wide variety of solutions to these problems out there. It's important to know the strengths and weaknesses of each, so it's a bit unfortunate that this paper set the stage as it did.

JG

On Wed, April 15, 2009 6:44 am, Andy Liu wrote:
> Not sure if comparing Hadoop to databases is an apples-to-apples
> comparison. Hadoop is a complete job execution framework, which
> collocates the computation with the data. [...]
Re: hadoop-a small doubt
Hey,

You can do that. That system should have the same username as the cluster nodes, and of course it should be able to ssh to the name node. It should also have Hadoop installed, with a hadoop-site.xml matching the cluster's. Then you can access the namenode, HDFS, etc. If you just want to see the web interface, that can be done easily from any system.

deepya wrote:
>
> Hi,
> I am SreeDeepya, doing an MTech at IIIT. I am working on a project named
> "cost effective and scalable storage server". I configured a small
> Hadoop cluster with only two nodes, one namenode and one datanode. I am
> new to Hadoop, and I have a small doubt:
>
> Can a system not in the Hadoop cluster access the namenode or the
> datanode? If yes, can you please tell me the necessary configuration
> that has to be done?
>
> Thanks in advance.
>
> SreeDeepya

--
View this message in context: http://www.nabble.com/hadoop-a-small-doubt-tp22764615p23061794.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
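For what it's worth, the matching hadoop-site.xml on the outside client mainly needs to point at the cluster's namenode (and jobtracker, if you also want to submit jobs). The hostname and ports below are placeholders; they must match whatever the cluster's own hadoop-site.xml uses:

```xml
<?xml version="1.0"?>
<!-- hadoop-site.xml on the external client machine.
     "namenode-host" and the ports are illustrative, not defaults. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:54310</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>namenode-host:54311</value>
  </property>
</configuration>
```

With that in place, `bin/hadoop dfs -ls /` from the outside machine should reach the cluster, provided the namenode's RPC port is reachable from it.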
How to submit a project to Hadoop/Apache
Hi,

Can anyone point me to documentation that explains how to submit a project to Hadoop as a subproject? I would also appreciate it if someone could point me to documentation on how to submit a project as an Apache project.

We have a project that is built on Hadoop. It is released to the open source community under the GPL license, but we are thinking of submitting it as a Hadoop or Apache project. Any help on how to do this is appreciated.

Thanks,
Tarandeep
Re: Map-Reduce Slow Down
The log file hadoop-mithila-datanode-node19.log.2009-04-14 has the following in it:

2009-04-14 10:08:11,499 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
/
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = node19/127.0.0.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.18.3
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r 736250; compiled by 'ndaley' on Thu Jan 22 23:12:08 UTC 2009
/
2009-04-14 10:08:12,915 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
2009-04-14 10:08:13,925 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 1 time(s).
2009-04-14 10:08:14,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 2 time(s).
2009-04-14 10:08:15,945 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 3 time(s).
2009-04-14 10:08:16,955 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 4 time(s).
2009-04-14 10:08:17,965 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 5 time(s).
2009-04-14 10:08:18,975 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 6 time(s).
2009-04-14 10:08:19,985 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 7 time(s).
2009-04-14 10:08:20,995 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 8 time(s).
2009-04-14 10:08:22,005 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 9 time(s).
2009-04-14 10:08:22,008 INFO org.apache.hadoop.ipc.RPC: Server at node18/192.168.0.18:54310 not available yet, Z...
2009-04-14 10:08:24,025 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
2009-04-14 10:08:25,035 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 1 time(s).
2009-04-14 10:08:26,045 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 2 time(s).
2009-04-14 10:08:27,055 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 3 time(s).
2009-04-14 10:08:28,065 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 4 time(s).
2009-04-14 10:08:29,075 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 5 time(s).
2009-04-14 10:08:30,085 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 6 time(s).
2009-04-14 10:08:31,095 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 7 time(s).
2009-04-14 10:08:32,105 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 8 time(s).
2009-04-14 10:08:33,115 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 9 time(s).
2009-04-14 10:08:33,116 INFO org.apache.hadoop.ipc.RPC: Server at node18/192.168.0.18:54310 not available yet, Z...
2009-04-14 10:08:35,135 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
2009-04-14 10:08:36,145 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 1 time(s).
2009-04-14 10:08:37,155 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 2 time(s).

Hmmm I still can't figure it out..
Mithila

On Tue, Apr 14, 2009 at 10:22 PM, Mithila Nagendra wrote:
> Also, would the way the port is accessed change if all these nodes are
> connected through a gateway? I mean in the hadoop-site.xml file? The
> Ubuntu systems we worked with earlier didn't have a gateway.
> Mithila
>
> On Tue, Apr 14, 2009 at 9:48 PM, Mithila Nagendra wrote:
>
>> Aaron: Which log file do I look into - there are a lot of them. Here's
>> what the error looks like:
>> [mith...@node19:~]$ cd hadoop
>> [mith...@node19:~/hadoop]$ bin/hadoop dfs -ls
>> 09/04/14 10:09:29 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
>> 09/04/14 10:09:30 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 1 time(s).
>> 09/04/14 10:09:31 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 2 time(s).
>> [...]
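As a first triage step for a "Retrying connect" loop like the one above, it's worth checking whether anything is listening on the namenode's RPC port at all before digging into Hadoop itself. A small, generic Python sketch (host and port come from your fs.default.name setting, e.g. node18:54310 in this thread):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. port_open("node18", 54310) -- False means the NameNode process is
# down, or listening on a different interface. Note the datanode log above
# resolves its own host to 127.0.0.1; a namenode bound only to the loopback
# interface would produce exactly this kind of endless-retry symptom.
```

The same check can of course be done with telnet or netstat on the namenode box; the point is to separate "process not running" from "process running but unreachable" before reformatting anything.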
Re: Map-Reduce Slow Down
The log file runs into thousands of lines, with the same message being displayed every time.

On Wed, Apr 15, 2009 at 8:10 PM, Mithila Nagendra wrote:
> The log file hadoop-mithila-datanode-node19.log.2009-04-14 has the
> following in it:
>
> 2009-04-14 10:08:11,499 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
> [...]
> 2009-04-14 10:08:12,915 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
> [...]
>
> Hmmm I still can't figure it out..
>
> Mithila
>
> On Tue, Apr 14, 2009 at 10:22 PM, Mithila Nagendra wrote:
>
>> Also, would the way the port is accessed change if all these nodes are
>> connected through a gateway? I mean in the hadoop-site.xml file? The
>> Ubuntu systems we worked with earlier didn't have a gateway.
>> Mithila
>>
>> On Tue, Apr 14, 2009 at 9:48 PM, Mithila Nagendra wrote:
>>
>>> Aaron: Which log file do I look into - there are a lot of them. Here's
>>> what the error looks like:
>>> [mith...@node19:~]$ cd hadoop
>>> [mith...@node19:~/hadoop]$ bin/hadoop dfs -ls
>>> 09/04/14 10:09:29 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
>>> [...]
Re: Map-Reduce Slow Down
Looks like your NameNode is down.
Verify that the Hadoop processes are running (jps should show you all running Java processes).
If the Hadoop processes are running, try restarting them.
I guess this problem is due to your fsimage not being correct; you might have to format your namenode.
Hope this helps.

Thanks,
--
Ravi

On 4/15/09 10:15 AM, "Mithila Nagendra" wrote:

The log file runs into thousands of lines, with the same message being displayed every time.

On Wed, Apr 15, 2009 at 8:10 PM, Mithila Nagendra wrote:

> The log file hadoop-mithila-datanode-node19.log.2009-04-14 has the
> following in it:
>
> 2009-04-14 10:08:11,499 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
> [...]
> 2009-04-14 10:08:12,915 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
> [...]
Re: How to submit a project to Hadoop/Apache
This is how things get into the Apache Incubator: http://incubator.apache.org/
But the rules are, I believe, that you can skip the Incubator and go straight under a project's wing (e.g. Hadoop) if the project PMC approves.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
> From: Tarandeep Singh
> To: core-user@hadoop.apache.org
> Sent: Wednesday, April 15, 2009 1:08:38 PM
> Subject: How to submit a project to Hadoop/Apache
>
> Hi,
>
> Can anyone point me to documentation that explains how to submit a
> project to Hadoop as a subproject? Also, I would appreciate it if someone
> could point me to documentation on how to submit a project as an Apache
> project.
>
> We have a project that is built on Hadoop. It is released to the open
> source community under the GPL license, but we are thinking of submitting
> it as a Hadoop or Apache project. Any help on how to do this is
> appreciated.
>
> Thanks,
> Tarandeep
Re: Using 3rd party Api in Map class
That certainly works, though if you plan to upgrade the underlying library, you'll find that copying files with the correct versions into $HADOOP_HOME/lib rapidly gets tedious, and subtle mistakes (e.g., forgetting one machine) can lead to frustration. When you consider the fact that you're using a Hadoop cluster to process and transfer around GBs of data on the low end, the difference between a 10 MB and a 20 MB job jar starts to look meaningless. Putting other jars in a lib/ directory inside your job jar keeps the versions consistent and doesn't clutter up a shared directory on your cluster (assuming there are other users).

- Aaron

On Tue, Apr 14, 2009 at 11:15 AM, Farhan Husain wrote:
> Hello,
>
> I got another solution for this. I just pasted all the required jar
> files into the lib folder of each Hadoop node. This way the job jar is
> not too big and requires less time to distribute across the cluster.
>
> Thanks,
> Farhan
>
> On Mon, Apr 13, 2009 at 7:22 PM, Nick Cen wrote:
>
> > Create a directory called 'lib' in your project's root dir, then put
> > all the 3rd-party jars in it.
> >
> > 2009/4/14 Farhan Husain
> >
> > > Hello,
> > >
> > > I am trying to use the Pellet library for some OWL inferencing in my
> > > mapper class, but I can't find a way to bundle the library jar files
> > > in my job jar file. I am exporting my project as a jar file from the
> > > Eclipse IDE. Will it work if I create the jar manually and include
> > > all the jar files the Pellet library has? Is there any simpler way to
> > > include 3rd-party library jar files in a Hadoop job jar? Without
> > > being able to include the library jars I am getting
> > > ClassNotFoundException.
> > >
> > > Thanks,
> > > Farhan
> >
> > --
> > http://daily.appspot.com/food/
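Since a jar is just a zip archive, the lib/ layout Nick and Aaron describe can be sketched without any build tooling. Everything below (paths, jar names, file contents) is illustrative; the only point is where the third-party jars sit inside the job jar:

```python
import os
import tempfile
import zipfile

def build_job_jar(jar_path, class_files, lib_jars):
    """Pack compiled classes plus third-party jars under lib/ into a job jar.

    class_files: {archive_path: bytes} for the job's own .class files.
    lib_jars:    {jar_name: bytes} for bundled third-party jars; Hadoop
                 puts lib/*.jar from the job jar onto the task classpath.
    """
    with zipfile.ZipFile(jar_path, "w") as jar:
        for name, data in class_files.items():
            jar.writestr(name, data)
        for name, data in lib_jars.items():
            jar.writestr("lib/" + name, data)

# Example (contents are placeholders, not real bytecode):
tmp = os.path.join(tempfile.mkdtemp(), "myjob.jar")
build_job_jar(
    tmp,
    class_files={"com/example/MyMapper.class": b"..."},
    lib_jars={"pellet.jar": b"..."},
)
print(sorted(zipfile.ZipFile(tmp).namelist()))
# prints ['com/example/MyMapper.class', 'lib/pellet.jar']
```

In practice the same layout comes out of something like `jar cf myjob.jar -C classes . lib/` or an Ant build; the Eclipse jar export just needs to be told to include the lib/ directory.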
Re: Map-Reduce Slow Down
Hi,

I wrote a blog post a while back about connecting nodes via a gateway. See
http://www.cloudera.com/blog/2008/12/03/securing-a-hadoop-cluster-through-a-gateway/

This assumes that the client is outside the gateway and all datanodes/namenode are inside, but the same principles apply. You'll just need to set up ssh tunnels from every datanode to the namenode.

- Aaron

On Wed, Apr 15, 2009 at 10:19 AM, Ravi Phulari wrote:
> Looks like your NameNode is down.
> Verify that the Hadoop processes are running (jps should show you all
> running Java processes).
> If the Hadoop processes are running, try restarting them.
> I guess this problem is due to your fsimage not being correct;
> you might have to format your namenode.
> Hope this helps.
>
> Thanks,
> --
> Ravi
>
> On 4/15/09 10:15 AM, "Mithila Nagendra" wrote:
>
> The log file runs into thousands of lines, with the same message being
> displayed every time.
>
> On Wed, Apr 15, 2009 at 8:10 PM, Mithila Nagendra wrote:
>
> > The log file hadoop-mithila-datanode-node19.log.2009-04-14 has the
> > following in it:
> >
> > 2009-04-14 10:08:11,499 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
> > [...]
> > 2009-04-14 10:08:12,915 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
> > [...]
Re: How to submit a project to Hadoop/Apache
Tarandeep, You might want to start by releasing your project as a "contrib" module for Hadoop. The overhead there is much lower -- just get it compiling in the contrib/ directory, file a JIRA ticket on Hadoop Core, and attach your patch :) - Aaron On Wed, Apr 15, 2009 at 10:29 AM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > > This is how things get into Apache Incubator: http://incubator.apache.org/ > But the rules are, I believe, that you can skip the incubator and go > straight under a project's wing (e.g. Hadoop) if the project PMC approves. > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > - Original Message > > From: Tarandeep Singh > > To: core-user@hadoop.apache.org > > Sent: Wednesday, April 15, 2009 1:08:38 PM > > Subject: How to submit a project to Hadoop/Apache > > > > Hi, > > > > Can anyone point me to a documentation which explains how to submit a > > project to Hadoop as a subproject? Also, I will appreciate if someone > points > > me to the documentation on how to submit a project as Apache project. > > > > We have a project that is built on Hadoop. It is released to the open > source > > community under GPL license but we are thinking of submitting it as a > Hadoop > > or Apache project. Any help on how to do this is appreciated. > > > > Thanks, > > Tarandeep > >
Re: Map-Reduce Slow Down
Hi Aaron, I will look into that, thanks! I spoke to the admin who oversees the cluster. He said that the gateway comes into the picture only when one of the nodes communicates with a node outside of the cluster. But in my case the communication is carried out between the nodes which all belong to the same cluster. Mithila On Wed, Apr 15, 2009 at 8:59 PM, Aaron Kimball wrote: > Hi, > > I wrote a blog post a while back about connecting nodes via a gateway. See > http://www.cloudera.com/blog/2008/12/03/securing-a-hadoop-cluster-through-a-gateway/ > > This assumes that the client is outside the gateway and all > datanodes/namenode are inside, but the same principles apply. You'll just > need to set up ssh tunnels from every datanode to the namenode. > > - Aaron > > > On Wed, Apr 15, 2009 at 10:19 AM, Ravi Phulari wrote: > >> Looks like your NameNode is down . >> Verify if hadoop process are running ( jps should show you all java >> running process). >> If your hadoop process are running try restarting your hadoop process . >> I guess this problem is due to your fsimage not being correct . >> You might have to format your namenode. >> Hope this helps. >> >> Thanks, >> -- >> Ravi >> >> >> On 4/15/09 10:15 AM, "Mithila Nagendra" wrote: >> >> The log file runs into thousands of line with the same message being >> displayed every time. 
>> >> On Wed, Apr 15, 2009 at 8:10 PM, Mithila Nagendra >> wrote: >> >> > The log file : hadoop-mithila-datanode-node19.log.2009-04-14 has the >> > following in it: >> > >> > 2009-04-14 10:08:11,499 INFO org.apache.hadoop.dfs.DataNode: >> STARTUP_MSG: >> > / >> > STARTUP_MSG: Starting DataNode >> > STARTUP_MSG: host = node19/127.0.0.1 >> > STARTUP_MSG: args = [] >> > STARTUP_MSG: version = 0.18.3 >> > STARTUP_MSG: build = >> > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r >> > 736250; compiled by 'ndaley' on Thu Jan 22 23:12:08 UTC 2009 >> > / >> > 2009-04-14 10:08:12,915 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 0 time(s). >> > 2009-04-14 10:08:13,925 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 1 time(s). >> > 2009-04-14 10:08:14,935 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 2 time(s). >> > 2009-04-14 10:08:15,945 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 3 time(s). >> > 2009-04-14 10:08:16,955 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 4 time(s). >> > 2009-04-14 10:08:17,965 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 5 time(s). >> > 2009-04-14 10:08:18,975 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 6 time(s). >> > 2009-04-14 10:08:19,985 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 7 time(s). >> > 2009-04-14 10:08:20,995 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 8 time(s). 
>> > 2009-04-14 10:08:22,005 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 9 time(s). >> > 2009-04-14 10:08:22,008 INFO org.apache.hadoop.ipc.RPC: Server at >> node18/ >> > 192.168.0.18:54310 not available yet, Z... >> > 2009-04-14 10:08:24,025 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 0 time(s). >> > 2009-04-14 10:08:25,035 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 1 time(s). >> > 2009-04-14 10:08:26,045 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 2 time(s). >> > 2009-04-14 10:08:27,055 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 3 time(s). >> > 2009-04-14 10:08:28,065 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 4 time(s). >> > 2009-04-14 10:08:29,075 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 5 time(s). >> > 2009-04-14 10:08:30,085 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 6 time(s). >> > 2009-04-14 10:08:31,095 INFO org.apache.hadoop.ipc.Client: Retrying >> connect >> > to server: node18/192.168.0.18:54310. Already tried 7 time(s). >> > 2009-04-14 10:08:32,105 INFO org.apache.hadoop.ipc.Client: Retryi
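The retry loop above means the datanode never completes a TCP connection to the NameNode at node18:54310 -- either the NameNode process is down (Ravi's suggestion) or something between the hosts blocks the port (Aaron's gateway scenario). A Hadoop-independent probe can separate the two cases. This is a sketch only; PortCheck is a throwaway name, and the host and port are taken from the log:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    // Attempt one TCP connect, which is what the retrying ipc.Client
    // ultimately needs to succeed. Returns false on timeout, connection
    // refusal, or an unresolvable hostname.
    static boolean reachable(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // host and port from the datanode log above
        System.out.println("node18:54310 reachable: "
                + reachable("node18", 54310, 2000));
    }
}
```

If this prints false from a datanode but true when run on node18 itself, the problem is network-level (firewall, gateway, or an /etc/hosts mismatch) rather than HDFS state.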
Re: How to submit a project to Hadoop/Apache
Thanks Aaron... yeah it sounds like a much easier approach :) On Wed, Apr 15, 2009 at 11:00 AM, Aaron Kimball wrote: > Tarandeep, > > You might want to start by releasing your project as a "contrib" module for > Hadoop. The overhead there is much lower -- just get it compiling in the > contrib/ directory, file a JIRA ticket on Hadoop Core, and attach your > patch > :) > > - Aaron > > [earlier messages quoted in full above - trimmed]
Datanode Setup
I'm setting up a Hadoop cluster and I have the name node and job tracker up and running. However, I cannot get any of my datanodes or tasktrackers to start. Here is my hadoop-site.xml file...

<configuration>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/h_temp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/home/hadoop/data</value>
</property>

<property>
  <name>fs.default.name</name>
  <value>192.168.1.10:54310</value>
  <description>The name of the default file system. A URI whose
  scheme and authority determine the FileSystem implementation. The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class. The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
  <final>true</final>
</property>

<property>
  <name>mapred.job.tracker</name>
  <value>192.168.1.10:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at. If "local", then jobs are run in-process as a single map
  and reduce task.</description>
</property>

<property>
  <name>dfs.replication</name>
  <value>0</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.</description>
</property>

</configuration>

and here is the error I'm getting... 
2009-04-15 14:00:48,208 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG: / STARTUP_MSG: Starting DataNode STARTUP_MSG: host = java.net.UnknownHostException: myhost: myhost STARTUP_MSG: args = [] STARTUP_MSG: version = 0.18.3 STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r 736250; compiled by 'ndaley' on Thu Jan 22 23:12:0$ / 2009-04-15 14:00:48,355 ERROR org.apache.hadoop.dfs.DataNode: java.net.UnknownHostException: myhost: myhost at java.net.InetAddress.getLocalHost(InetAddress.java:1353) at org.apache.hadoop.net.DNS.getDefaultHost(DNS.java:185) at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:249) at org.apache.hadoop.dfs.DataNode.(DataNode.java:223) at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:3071) at org.apache.hadoop.dfs.DataNode.instantiateDataNode(DataNode.java:3026) at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:3034) at org.apache.hadoop.dfs.DataNode.main(DataNode.java:3156) 2009-04-15 14:00:48,356 INFO org.apache.hadoop.dfs.DataNode: SHUTDOWN_MSG: / SHUTDOWN_MSG: Shutting down DataNode at java.net.UnknownHostException: myhost: myhost / -- View this message in context: http://www.nabble.com/Datanode-Setup-tp23064660p23064660.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
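The startup failure above is not a networking problem at all: InetAddress.getLocalHost() cannot resolve the machine's own hostname ("myhost"). A minimal reproduction outside Hadoop (a sketch; HostCheck is a throwaway name) -- if it reports the hostname as unresolvable, the fix belongs in /etc/hosts or DNS, not in the Hadoop configuration:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostCheck {
    // Same lookup DataNode.startDataNode performs at startup; if the
    // machine's hostname is not resolvable, the datanode aborts with
    // the UnknownHostException shown in the log above.
    static String localHostStatus() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            return "resolved: " + local.getHostName()
                    + " -> " + local.getHostAddress();
        } catch (UnknownHostException e) {
            return "unresolvable: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(localHostStatus());
    }
}
```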
Re: Datanode Setup
Hi,
The replication factor has to be set to 1. Also for your dfs and job tracker configuration you should insert the name of the node rather than the IP address.

For instance:
192.168.1.10:54310

can be:

master:54310

The nodes can be renamed by renaming them in the hosts file in the /etc folder. It should look like the following:

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost node01
192.168.0.1 node01
192.168.0.2 node02
192.168.0.3 node03

Hope this helps
Mithila

On Wed, Apr 15, 2009 at 9:40 PM, jpe30 wrote:
> [original message with hadoop-site.xml and datanode error quoted in full above - trimmed]
Re: Datanode Setup
That helps a lot actually. I will try setting up my hosts file tomorrow and make the other changes you suggested. Thanks!

Mithila Nagendra wrote:
> [message quoted in full above - trimmed]

-- View this message in context: http://www.nabble.com/Datanode-Setup-tp23064660p23065220.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: Extending ClusterMapReduceTestCase
I got it all up and working, thanks for your help - it was an issue with me not actually setting the log.dir system property before the cluster startup. Can't believe I missed that one :) As a side note (which you might already be aware of), the example class you're using in Chapter 7 (PiEstimator) has changed in Hadoop 0.19.1 such that the example code no longer works. The new one is a little trickier to test. I'm looking forward to seeing the rest of the book. And that delegate test harness when it's available :) jason hadoop wrote: > > btw that stack trace looks like the hadoop.log.dir issue > This is the code out of the init method, in JobHistory > > LOG_DIR = conf.get("hadoop.job.history.location" , > "file:///" + new File( > System.getProperty("hadoop.log.dir")).getAbsolutePath() > + File.separator + "history"); > > looks like the hadoop.log.dir system property is not set, note: not > environment variable, not configuration parameter, but system property. > > Try a *System.setProperty("hadoop.log.dir","/tmp");* in your code before > you > initialize the virtual cluster. > > > > On Tue, Apr 14, 2009 at 5:56 PM, jason hadoop > wrote: > >> >> I have actually built an add on class on top of ClusterMapReduceDelegate >> that just runs a virtual cluster that persists for running tests on, it >> is >> very nice, as you can interact via the web ui. >> Especially since the virtual cluster stuff is somewhat flaky under >> windows. >> >> I have a question in to the editor about the sample code. >> >> >> >> On Tue, Apr 14, 2009 at 8:16 AM, czero wrote: >> >>> >>> I actually picked up the alpha .PDF's of your book, great job. >>> >>> I'm following the example in chapter 7 to the letter now and am still >>> getting the same problem. 2 quick questions (and thanks for your time >>> in >>> advance)... >>> >>> Is the ClusterMapReduceDelegate class available anywhere yet? 
>>> >>> Adding ~/hadoop/libs/*.jar in it's entirety to my pom.xml is a lot of >>> bulk, >>> so I've avoided it until now. Are there any lib's in there that are >>> absolutely necessary for this test to work? >>> >>> Thanks again, >>> bc >>> >>> >>> >>> jason hadoop wrote: >>> > >>> > I have a nice variant of this in the ch7 examples section of my book, >>> > including a standalone wrapper around the virtual cluster for allowing >>> > multiple test instances to share the virtual cluster - and allow an >>> easier >>> > time to poke around with the input and output datasets. >>> > >>> > It even works decently under windows - my editor insisting on word to >>> > recent >>> > for crossover. >>> > >>> > On Mon, Apr 13, 2009 at 9:16 AM, czero wrote: >>> > >>> >> >>> >> Sry, I forgot to include the not-IntelliJ-console output :) >>> >> >>> >> 09/04/13 12:07:14 ERROR mapred.MiniMRCluster: Job tracker crashed >>> >> java.lang.NullPointerException >>> >>at java.io.File.(File.java:222) >>> >>at >>> org.apache.hadoop.mapred.JobHistory.init(JobHistory.java:143) >>> >>at >>> >> org.apache.hadoop.mapred.JobTracker.(JobTracker.java:1110) >>> >>at >>> >> org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:143) >>> >>at >>> >> >>> >> >>> org.apache.hadoop.mapred.MiniMRCluster$JobTrackerRunner.run(MiniMRCluster.java:96) >>> >>at java.lang.Thread.run(Thread.java:637) >>> >> >>> >> I managed to pick up the chapter in the Hadoop Book that Jason >>> mentions >>> >> that >>> >> deals with Unit testing (great chapter btw) and it looks like >>> everything >>> >> is >>> >> in order. He points out that this error is typically caused by a bad >>> >> hadoop.log.dir or missing log4j.properties, but I verified that my >>> dir >>> is >>> >> ok >>> >> and my hadoop-0.19.1-core.jar has the log4j.properties in it. >>> >> >>> >> I also tried running the same test with hadoop-core/test 0.19.0 - >>> same >>> >> thing. 
>>> >> >>> >> Thanks again, >>> >> >>> >> bc >>> >> >>> >> >>> >> czero wrote: >>> >> > >>> >> > Hey all, >>> >> > >>> >> > I'm also extending the ClusterMapReduceTestCase and having a bit of >>> >> > trouble as well. >>> >> > >>> >> > Currently I'm getting : >>> >> > >>> >> > Starting DataNode 0 with dfs.data.dir: >>> >> > build/test/data/dfs/data/data1,build/test/data/dfs/data/data2 >>> >> > Starting DataNode 1 with dfs.data.dir: >>> >> > build/test/data/dfs/data/data3,build/test/data/dfs/data/data4 >>> >> > Generating rack names for tasktrackers >>> >> > Generating host names for tasktrackers >>> >> > >>> >> > And then nothing... just spins on that forever. Any ideas? >>> >> > >>> >> > I have all the jetty and jetty-ext libs in the classpath and I set >>> the >>> >> > hadoop.log.dir and the SAX parser correctly. >>> >> > >>> >> > This is all I have for my test class so far, I'm not even doing >>> >> anything >>> >> > yet: >>> >> > >>> >> > public class TestDoop extends ClusterMapReduceTestCase { >>> >> > >>> >> > @Test >>> >> > public void testDoop() throws Exce
RE: reduce task specific jvm arg
This sounds like a reasonable request. Created https://issues.apache.org/jira/browse/HADOOP-5684 On our clusters, sometimes users want thin mappers and large reducers. Koji -Original Message- From: Jun Rao [mailto:jun...@almaden.ibm.com] Sent: Thursday, April 09, 2009 10:30 AM To: core-user@hadoop.apache.org Subject: reduce task specific jvm arg Hi, Is there a way to set jvm parameters only for reduce tasks in Hadoop? Thanks, Jun IBM Almaden Research Center K55/B1, 650 Harry Road, San Jose, CA 95120-6099 jun...@almaden.ibm.com
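For context on what the JIRA request amounts to: 0.19-era releases have a single knob, mapred.child.java.opts, shared by map and reduce tasks. The per-task variants sketched below are the property names as they eventually landed via this issue in later releases; verify them against the hadoop-default.xml of your actual version:

```xml
<!-- 0.19.x and earlier: one JVM setting shared by map and reduce tasks -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>

<!-- later releases: per-task variants, allowing the thin mappers
     and large reducers Koji describes -->
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx256m</value>
</property>

<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
```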
Re: Directory /tmp/hadoop-hadoop/dfs/name is in an inconsistent state: storage directory does not exist
Data stored to /tmp has no consistency / reliability guarantees. Your OS can delete that data at any time. Configure hadoop-site.xml to store data elsewhere. Grep for "/tmp" in hadoop-default.xml to see all the configuration options you'll have to change. Here's the list I came up with: hadoop.tmp.dir fs.checkpoint.dir dfs.name.dir dfs.data.dir mapred.local.dir mapred.system.dir mapred.temp.dir Again, you need to be storing your data somewhere other than /tmp. Alex On Tue, Apr 14, 2009 at 6:06 PM, Pankil Doshi wrote: > Hello Everyone, > > At time I get following error,when i restart my cluster desktops.(Before > that I shutdown mapred and dfs properly though). > Temp folder contains of the directory its looking for.Still I get this > error. > Only solution I found to get rid with this error is I have to format my dfs > entirely and then load the data again. and start whole process. > > But in that I loose my data on HDFS and I have to reload it. > > Does anyone has any clue abt it? > > Error from log fil e:- > > 2009-04-14 19:40:29,963 INFO org.apache.hadoop.dfs.NameNode: STARTUP_MSG: > / > STARTUP_MSG: Starting NameNode > STARTUP_MSG: host = Semantic002/192.168.1.133 > STARTUP_MSG: args = [] > STARTUP_MSG: version = 0.18.3 > STARTUP_MSG: build = > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r > 736250; > compiled by 'ndaley' on Thu Jan 22 23:12:08 UTC 2009 > / > 2009-04-14 19:40:30,958 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: > Initializing RPC Metrics with hostName=NameNode, port=9000 > 2009-04-14 19:40:30,996 INFO org.apache.hadoop.dfs.NameNode: Namenode up > at: > Semantic002/192.168.1.133:9000 > 2009-04-14 19:40:31,007 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: > Initializing JVM Metrics with processName=NameNode, sessionId=null > 2009-04-14 19:40:31,014 INFO org.apache.hadoop.dfs.NameNodeMetrics: > Initializing NameNodeMeterics using context > object:org.apache.hadoop.metrics.spi.NullCont > ext > 2009-04-14 19:40:31,160 INFO 
org.apache.hadoop.fs.FSNamesystem: > > fsOwner=hadoop,hadoop,adm,dialout,fax,cdrom,floppy,tape,audio,dip,plugdev,scanner,fuse,admin > 2009-04-14 19:40:31,161 INFO org.apache.hadoop.fs.FSNamesystem: > supergroup=supergroup > 2009-04-14 19:40:31,161 INFO org.apache.hadoop.fs.FSNamesystem: > isPermissionEnabled=true > 2009-04-14 19:40:31,183 INFO org.apache.hadoop.dfs.FSNamesystemMetrics: > Initializing FSNamesystemMeterics using context > object:org.apache.hadoop.metrics.spi. > NullContext > 2009-04-14 19:40:31,184 INFO org.apache.hadoop.fs.FSNamesystem: Registered > FSNamesystemStatusMBean > 2009-04-14 19:40:31,248 INFO org.apache.hadoop.dfs.Storage: Storage > directory /tmp/hadoop-hadoop/dfs/name does not exist. > 2009-04-14 19:40:31,251 ERROR org.apache.hadoop.fs.FSNamesystem: > FSNamesystem initialization failed. > org.apache.hadoop.dfs.InconsistentFSStateException: Directory > /tmp/hadoop-hadoop/dfs/name is in an inconsistent state: storage directory > does not exist or is > not accessible. >at > org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:211) >at > org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:80) >at > org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:294) >at org.apache.hadoop.dfs.FSNamesystem.(FSNamesystem.java:273) >at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:148) >at org.apache.hadoop.dfs.NameNode.(NameNode.java:193) >at org.apache.hadoop.dfs.NameNode.(NameNode.java:179) >at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:830) >at org.apache.hadoop.dfs.NameNode.main(NameNode.java:839) > 2009-04-14 19:40:31,261 INFO org.apache.hadoop.ipc.Server: Stopping server > on 9000 > 2009-04-14 19:40:31,262 ERROR org.apache.hadoop.dfs.NameNode: > org.apache.hadoop.dfs.InconsistentFSStateException: Directory > /tmp/hadoop-hadoop/dfs/name is in > an inconsistent state: storage directory does not exist or is not > accessible. 
>at > org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:211) >at > org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:80) >at > org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:294) >at org.apache.hadoop.dfs.FSNamesystem.(FSNamesystem.java:273) >at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:148) >at org.apache.hadoop.dfs.NameNode.(NameNode.java:193) >at org.apache.hadoop.dfs.NameNode.(NameNode.java:179) >at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:830) >at org.apache.hadoop.dfs.NameNode.main(NameNode.java:839) > > 2009-04-14 19:40:31,267 INFO org.apache.hadoop.dfs.NameNode: SHUTDOWN_MSG: > / > : > > Thanks > > Pankil >
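A sketch of the hadoop-site.xml entries implied by Alex's list (paths here are illustrative, not recommendations). In hadoop-default.xml the dfs.* and mapred.* locations all default to subdirectories of hadoop.tmp.dir, so moving hadoop.tmp.dir covers most of them, but the HDFS directories are worth pinning explicitly:

```xml
<property>
  <name>hadoop.tmp.dir</name>
  <value>/var/hadoop/tmp-${user.name}</value>
</property>

<property>
  <name>dfs.name.dir</name>
  <value>/var/hadoop/dfs/name</value>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/var/hadoop/dfs/data</value>
</property>
```

Note that changing dfs.name.dir on an already-formatted cluster requires copying the existing image out of /tmp before the OS removes it, or reformatting -- which, as Pankil found, loses the HDFS contents.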
Re: getting DiskErrorException during map
Alex, Yes, I bounced the Hadoop daemons after I changed the configuration files. I also tried setting $HADOOP_CONF_DIR to the directory where my hadoop-site.xml file resides but it didn't work. However, I'm sure that HADOOP_CONF_DIR is not the issue because other properties that I changed in hadoop-site.xml seem to be properly set. Also, here is a section from my hadoop-site.xml file:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/scratch/local/jim/hadoop-${user.name}</value>
</property>

<property>
  <name>mapred.local.dir</name>
  <value>/scratch/local/jim/hadoop-${user.name}/mapred/local</value>
</property>

I also created /scratch/local/jim/hadoop-jim/mapred/local on each task tracker since I know directories that do not exist are ignored. When I manually ssh to the task trackers, I can see the directory /scratch/local/jim/hadoop-jim/dfs is automatically created, so it seems like hadoop.tmp.dir is set properly. However, hadoop still creates /tmp/hadoop-jim/mapred/local and uses that directory for the local storage. I'm starting to suspect that mapred.local.dir is overwritten to a default value of /tmp/hadoop-${user.name} somewhere inside the binaries. -jim On Tue, Apr 14, 2009 at 4:07 PM, Alex Loddengaard wrote: > First, did you bounce the Hadoop daemons after you changed the > configuration > files? I think you'll have to do this. > > Second, I believe 0.19.1 has hadoop-default.xml baked into the jar. Try > setting $HADOOP_CONF_DIR to the directory where hadoop-site.xml lives. For > whatever reason your hadoop-site.xml (and the hadoop-default.xml you tried > to change) are probably not being loaded. $HADOOP_CONF_DIR should fix > this. > > Good luck! > > Alex > > On Mon, Apr 13, 2009 at 11:25 AM, Jim Twensky > wrote: > > > Thank you Alex, you are right. There are quotas on the systems that I'm > > working. 
However, I tried to change mapred.local.dir as follows: > > > > --inside hadoop-site.xml: > > > > > >mapred.child.tmp > >/scratch/local/jim > > > > > >hadoop.tmp.dir > >/scratch/local/jim > > > > > >mapred.local.dir > >/scratch/local/jim > > > > > > and observed that the intermediate map outputs are still being written > > under /tmp/hadoop-jim/mapred/local > > > > I'm confused at this point since I also tried setting these values > directly > > inside the hadoop-default.xml and that didn't work either. Is there any > > other property that I'm supposed to change? I tried searching for "/tmp" > in > > the hadoop-default.xml file but couldn't find anything else. > > > > Thanks, > > Jim > > > > > > On Tue, Apr 7, 2009 at 9:35 PM, Alex Loddengaard > > wrote: > > > > > The getLocalPathForWrite function that throws this Exception assumes > that > > > you have space on the disks that mapred.local.dir is configured on. > Can > > > you > > > verify with `df` that those disks have space available? You might also > > try > > > moving mapred.local.dir off of /tmp if it's configured to use /tmp > right > > > now; I believe some systems have quotas on /tmp. > > > > > > Hope this helps. > > > > > > Alex > > > > > > On Tue, Apr 7, 2009 at 7:22 PM, Jim Twensky > > wrote: > > > > > > > Hi, > > > > > > > > I'm using Hadoop 0.19.1 and I have a very small test cluster with 9 > > > nodes, > > > > 8 > > > > of them being task trackers. 
I'm getting the following error and my > > jobs > > > > keep failing when map processes start hitting 30%: > > > > > > > > org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find > > any > > > > valid local directory for > > > > > > > > > > > > > > taskTracker/jobcache/job_200904072051_0001/attempt_200904072051_0001_m_00_1/output/file.out > > > >at > > > > > > > > > > > > > > org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:335) > > > >at > > > > > > > > > > > > > > org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124) > > > >at > > > > > > > > > > > > > > org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:61) > > > >at > > > > > > > > > > > > > > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1209) > > > >at > > > > > > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:867) > > > >at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > > > >at org.apache.hadoop.mapred.Child.main(Child.java:158) > > > > > > > > > > > > I googled many blogs and web pages but I could neither understand why > > > this > > > > happens nor found a solution to this. What does that error message > mean > > > and > > > > how can avoid it, any suggestions? > > > > > > > > Thanks in advance, > > > > -jim > > > > > > > > > >
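A possible explanation, though not confirmed in this thread: each TaskTracker reads mapred.local.dir from its own local configuration when the daemon starts, so the value has to be present in the hadoop-site.xml on every node (not just the submitting machine) before the daemons are bounced. Marking it final also protects it from per-job overrides. A sketch of the per-node entry:

```xml
<property>
  <name>mapred.local.dir</name>
  <value>/scratch/local/jim/hadoop-${user.name}/mapred/local</value>
  <final>true</final>
</property>
```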
Re: Directory /tmp/hadoop-hadoop/dfs/name is in an inconsistent state: storage directory does not exist
Thanks Pankil On Wed, Apr 15, 2009 at 5:09 PM, Alex Loddengaard wrote: > Data stored to /tmp has no consistency / reliability guarantees. Your OS > can delete that data at any time. > > Configure hadoop-site.xml to store data elsewhere. Grep for "/tmp" in > hadoop-default.xml to see all the configuration options you'll have to > change. Here's the list I came up with: > > hadoop.tmp.dir > fs.checkpoint.dir > dfs.name.dir > dfs.data.dir > mapred.local.dir > mapred.system.dir > mapred.temp.dir > > Again, you need to be storing your data somewhere other than /tmp. > > Alex > > On Tue, Apr 14, 2009 at 6:06 PM, Pankil Doshi wrote: > > > Hello Everyone, > > > > At times I get the following error when I restart my cluster desktops (before > > that I shut down mapred and dfs properly, though). > > The temp folder contains the directory it's looking for, yet I still get this > > error. > > The only solution I have found to get rid of this error is to format my > dfs > > entirely, load the data again, and start the whole process over. > > > > But in that case I lose my data on HDFS and have to reload it. > > > > Does anyone have any clue about it?
> > > > Error from log fil e:- > > > > 2009-04-14 19:40:29,963 INFO org.apache.hadoop.dfs.NameNode: STARTUP_MSG: > > / > > STARTUP_MSG: Starting NameNode > > STARTUP_MSG: host = Semantic002/192.168.1.133 > > STARTUP_MSG: args = [] > > STARTUP_MSG: version = 0.18.3 > > STARTUP_MSG: build = > > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r > > 736250; > > compiled by 'ndaley' on Thu Jan 22 23:12:08 UTC 2009 > > / > > 2009-04-14 19:40:30,958 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: > > Initializing RPC Metrics with hostName=NameNode, port=9000 > > 2009-04-14 19:40:30,996 INFO org.apache.hadoop.dfs.NameNode: Namenode up > > at: > > Semantic002/192.168.1.133:9000 > > 2009-04-14 19:40:31,007 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: > > Initializing JVM Metrics with processName=NameNode, sessionId=null > > 2009-04-14 19:40:31,014 INFO org.apache.hadoop.dfs.NameNodeMetrics: > > Initializing NameNodeMeterics using context > > object:org.apache.hadoop.metrics.spi.NullCont > > ext > > 2009-04-14 19:40:31,160 INFO org.apache.hadoop.fs.FSNamesystem: > > > > > fsOwner=hadoop,hadoop,adm,dialout,fax,cdrom,floppy,tape,audio,dip,plugdev,scanner,fuse,admin > > 2009-04-14 19:40:31,161 INFO org.apache.hadoop.fs.FSNamesystem: > > supergroup=supergroup > > 2009-04-14 19:40:31,161 INFO org.apache.hadoop.fs.FSNamesystem: > > isPermissionEnabled=true > > 2009-04-14 19:40:31,183 INFO org.apache.hadoop.dfs.FSNamesystemMetrics: > > Initializing FSNamesystemMeterics using context > > object:org.apache.hadoop.metrics.spi. > > NullContext > > 2009-04-14 19:40:31,184 INFO org.apache.hadoop.fs.FSNamesystem: > Registered > > FSNamesystemStatusMBean > > 2009-04-14 19:40:31,248 INFO org.apache.hadoop.dfs.Storage: Storage > > directory /tmp/hadoop-hadoop/dfs/name does not exist. > > 2009-04-14 19:40:31,251 ERROR org.apache.hadoop.fs.FSNamesystem: > > FSNamesystem initialization failed. 
> > org.apache.hadoop.dfs.InconsistentFSStateException: Directory > > /tmp/hadoop-hadoop/dfs/name is in an inconsistent state: storage > directory > > does not exist or is > > not accessible. > >at > > org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:211) > >at > > org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:80) > >at > > org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:294) > >at > org.apache.hadoop.dfs.FSNamesystem.(FSNamesystem.java:273) > >at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:148) > >at org.apache.hadoop.dfs.NameNode.(NameNode.java:193) > >at org.apache.hadoop.dfs.NameNode.(NameNode.java:179) > >at > org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:830) > >at org.apache.hadoop.dfs.NameNode.main(NameNode.java:839) > > 2009-04-14 19:40:31,261 INFO org.apache.hadoop.ipc.Server: Stopping > server > > on 9000 > > 2009-04-14 19:40:31,262 ERROR org.apache.hadoop.dfs.NameNode: > > org.apache.hadoop.dfs.InconsistentFSStateException: Directory > > /tmp/hadoop-hadoop/dfs/name is in > > an inconsistent state: storage directory does not exist or is not > > accessible. > >at > > org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:211) > >at > > org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:80) > >at > > org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:294) > >at > org.apache.hadoop.dfs.FSNamesystem.(FSNamesystem.java:273) > >at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:148) > >at org.apache.hadoop.dfs.NameNode.(NameNode.java:193) > >at org.apache.hadoop.dfs.NameNode.(NameNode.java:179) > >at > org.apache.hadoop.dfs.
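A sketch of the hadoop-site.xml entries that move the storage off /tmp (the /data/hadoop path is an example). Since most of the defaults in Alex's list are derived from hadoop.tmp.dir, setting that one property covers the bulk of them; dfs.name.dir and dfs.data.dir are still worth pinning explicitly:

```xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/tmp</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/hadoop/dfs/data</value>
  </property>
</configuration>
```

With the name directory on a persistent disk, an OS cleanup of /tmp can no longer wipe the fsimage, which is exactly the failure mode in the log above.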
Error reading task output
Hi, I'm getting the following warning when running the simple wordcount and grep examples. 09/04/15 16:54:16 INFO mapred.JobClient: Task Id : attempt_200904151649_0001_m_19_0, Status : FAILED Too many fetch-failures 09/04/15 16:54:16 WARN mapred.JobClient: Error reading task outputhttp://localhost.localdomain:50060/tasklog?plaintext=true&taskid=attempt_200904151649_0001_m_19_0&filter=stdout 09/04/15 16:54:16 WARN mapred.JobClient: Error reading task outputhttp://localhost.localdomain:50060/tasklog?plaintext=true&taskid=attempt_200904151649_0001_m_19_0&filter=stderr The only advice I could find in other posts with similar errors is to set up /etc/hosts with the hostnames and IPs of all the slaves. I did this, but I still get the warning above. The output seems to come out all right, however (I guess that's why it is a warning). I tried running wget on the http:// address in the warning message and got the following back: 2009-04-15 16:53:46 ERROR 400: Argument taskid is required. So perhaps the wrong task ID is being passed in the http request. Any ideas on how to get rid of these warnings? Thanks, Cam
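One thing worth ruling out before blaming the task ID: the tasklog URL contains '&', and on an unquoted shell command line '&' acts as a command separator, so wget sends only plaintext=true and the server quite correctly answers 400 "Argument taskid is required". A small sketch of building the URL so the query string stays intact (the host and task id below are hypothetical placeholders):

```python
# Build a tasktracker log URL with its query string intact. On a shell
# command line an unquoted '&' ends the command, so the server would
# receive only 'plaintext=true'; pass the result to wget quoted.
from urllib.parse import urlencode

params = {
    "plaintext": "true",
    "taskid": "attempt_200904151649_0001_m_000019_0",  # hypothetical id
    "filter": "stdout",
}
url = "http://localhost:50060/tasklog?" + urlencode(params)
print(url)  # then: wget "$url"  (note the quotes)
```

If the fully quoted URL still returns 400, then a wrong or stale task ID really is being passed, and the fetch-failure diagnosis above stands.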
Re: Generating many small PNGs to Amazon S3 with MapReduce
On Tue, Apr 14, 2009 at 2:35 AM, tim robertson wrote: > > I am considering (for better throughput as maps generate huge request > volumes) pregenerating all my tiles (PNG) and storing them in S3 with > cloudfront. There will be billions of PNGs produced each at 1-3KB > each. > Storing billions of PNGs at 1-3KB each in S3 will be perfectly fine; there is no need to generate them all and then push them at once, as long as you store each one in its own S3 object (which they must be, if you intend to fetch them using CloudFront). Each S3 object is unique and can be written fully in parallel. If you are writing to the same S3 object twice, ... well, you're doing it wrong. However, do the math on the costs for S3. We were doing something similar and found that we were spending a fortune on our PUT requests at $0.01 per 1,000, and next to nothing on storage. I've since moved to a more complicated model where I pack many small items into each object and store an index in SimpleDB. You'll need to partition your SimpleDB domains if you do this.
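The request-cost point above is worth making concrete. At the quoted $0.01 per 1,000 PUTs, a billion tiles cost far more to write than to store for a month; the storage rate below (~$0.15 per GB-month, the 2009-era first tier) is an assumption, so treat both figures as back-of-envelope:

```python
# Back-of-envelope S3 costs for a billion small tiles, using the
# $0.01-per-1,000-PUT price from the thread and an assumed ~$0.15/GB-month
# storage rate (2009-era first tier; check current pricing).

def s3_put_cost(num_objects, price_per_1000_puts=0.01):
    """Dollars spent on PUT requests, one PUT per object."""
    return num_objects / 1000 * price_per_1000_puts

def s3_storage_cost(total_gb, price_per_gb_month=0.15):
    """Dollars per month to store total_gb."""
    return total_gb * price_per_gb_month

tiles = 1_000_000_000
print(s3_put_cost(tiles))      # about $10,000 in PUT requests alone
print(s3_storage_cost(2000))   # ~2 TB at ~2 KB/tile: roughly $300/month
```

That roughly 30-to-1 ratio between one-time PUT cost and monthly storage is what makes the pack-many-items-per-object model attractive.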
Re: Map-Reduce Slow Down
Double check that there is no firewall in place. At one point a bunch of new machines were kickstarted and placed in a cluster and they all failed with something similar. It turned out the kickstart script had enabled the firewall with a rule that blocked ports in the 50k range. It took us a while to even think to check that, since it was not part of our normal machine configuration. On Wed, Apr 15, 2009 at 11:04 AM, Mithila Nagendra wrote: > Hi Aaron > I will look into that, thanks! > > I spoke to the admin who oversees the cluster. He said that the gateway > comes into the picture only when one of the nodes communicates with a node > outside of the cluster. But in my case the communication is carried out > between the nodes, which all belong to the same cluster. > > Mithila > > On Wed, Apr 15, 2009 at 8:59 PM, Aaron Kimball wrote: > > > Hi, > > > > I wrote a blog post a while back about connecting nodes via a gateway. > See > > > http://www.cloudera.com/blog/2008/12/03/securing-a-hadoop-cluster-through-a-gateway/ > > > > This assumes that the client is outside the gateway and all > > datanodes/namenode are inside, but the same principles apply. You'll just > > need to set up ssh tunnels from every datanode to the namenode. > > > > - Aaron > > > > > > On Wed, Apr 15, 2009 at 10:19 AM, Ravi Phulari >wrote: > > > >> Looks like your NameNode is down. > >> Verify that the hadoop processes are running (jps should show you all > >> running java processes). > >> If your hadoop processes are running, try restarting them. > >> I guess this problem is due to your fsimage not being correct. > >> You might have to format your namenode. > >> Hope this helps. > >> > >> Thanks, > >> -- > >> Ravi > >> > >> > >> On 4/15/09 10:15 AM, "Mithila Nagendra" wrote: > >> > >> The log file runs into thousands of lines with the same message being > >> displayed every time.
> >> > >> On Wed, Apr 15, 2009 at 8:10 PM, Mithila Nagendra > >> wrote: > >> > >> > The log file : hadoop-mithila-datanode-node19.log.2009-04-14 has the > >> > following in it: > >> > > >> > 2009-04-14 10:08:11,499 INFO org.apache.hadoop.dfs.DataNode: > >> STARTUP_MSG: > >> > / > >> > STARTUP_MSG: Starting DataNode > >> > STARTUP_MSG: host = node19/127.0.0.1 > >> > STARTUP_MSG: args = [] > >> > STARTUP_MSG: version = 0.18.3 > >> > STARTUP_MSG: build = > >> > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r > >> > 736250; compiled by 'ndaley' on Thu Jan 22 23:12:08 UTC 2009 > >> > / > >> > 2009-04-14 10:08:12,915 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 0 time(s). > >> > 2009-04-14 10:08:13,925 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 1 time(s). > >> > 2009-04-14 10:08:14,935 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 2 time(s). > >> > 2009-04-14 10:08:15,945 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 3 time(s). > >> > 2009-04-14 10:08:16,955 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 4 time(s). > >> > 2009-04-14 10:08:17,965 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 5 time(s). > >> > 2009-04-14 10:08:18,975 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 6 time(s). > >> > 2009-04-14 10:08:19,985 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 7 time(s). 
> >> > 2009-04-14 10:08:20,995 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 8 time(s). > >> > 2009-04-14 10:08:22,005 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 9 time(s). > >> > 2009-04-14 10:08:22,008 INFO org.apache.hadoop.ipc.RPC: Server at > >> node18/ > >> > 192.168.0.18:54310 not available yet, Z... > >> > 2009-04-14 10:08:24,025 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 0 time(s). > >> > 2009-04-14 10:08:25,035 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 1 time(s). > >> > 2009-04-14 10:08:26,045 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 2 time(s). > >> > 2009-04-14 10:08:27,055 INFO org.apache.hadoop.ipc.Client: Retrying > >> connect > >> > to server: node18/192.168.0.18:54310. Already tried 3 time(s). > >> > 2009-04-14 10:0
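A quick way to test the firewall hypothesis from each datanode is to attempt a plain TCP connection to the namenode's IPC port (node18:54310 in the log above); a minimal sketch:

```python
# Minimal TCP reachability probe for the namenode IPC port (54310 in the
# log above). Run from each datanode; a False result while the namenode
# process is up suggests a firewall rule or a bind-address mismatch.
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. from node19:  port_open("node18", 54310)
```

Note also that the datanode log reports its own host as node19/127.0.0.1; a hostname resolving to the loopback address in /etc/hosts is another classic cause of exactly this retry loop.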
RE: More Replication on dfs
Hi My problem is not that my data is under replicated. I have 3 data nodes. In my hadoop-site.xml I also set the configuration as: dfs.replication 2 But after this also data is replicated on 3 nodes instead of two nodes. Now, please tell what can be the problem? Thanks & Regards Aseem Puri -Original Message- From: Raghu Angadi [mailto:rang...@yahoo-inc.com] Sent: Wednesday, April 15, 2009 2:58 AM To: core-user@hadoop.apache.org Subject: Re: More Replication on dfs Aseem, Regd over-replication, it is mostly app related issue as Alex mentioned. But if you are concerned about under-replicated blocks in fsck output : These blocks should not stay under-replicated if you have enough nodes and enough space on them (check NameNode webui). Try grep-ing for one of the blocks in NameNode log (and datnode logs as well, since you have just 3 nodes). Raghu. Puri, Aseem wrote: > Alex, > > Ouput of $ bin/hadoop fsck / command after running HBase data insert > command in a table is: > > . > . > . > . > . > /hbase/test/903188508/tags/info/4897652949308499876: Under replicated > blk_-5193 > 695109439554521_3133. Target Replicas is 3 but found 1 replica(s). > . > /hbase/test/903188508/tags/mapfiles/4897652949308499876/data: Under > replicated > blk_-1213602857020415242_3132. Target Replicas is 3 but found 1 > replica(s). > . > /hbase/test/903188508/tags/mapfiles/4897652949308499876/index: Under > replicated > blk_3934493034551838567_3132. Target Replicas is 3 but found 1 > replica(s). > . > /user/HadoopAdmin/hbase table.doc: Under replicated > blk_4339521803948458144_103 > 1. Target Replicas is 3 but found 2 replica(s). > . > /user/HadoopAdmin/input/bin.doc: Under replicated > blk_-3661765932004150973_1030 > . Target Replicas is 3 but found 2 replica(s). > . > /user/HadoopAdmin/input/file01.txt: Under replicated > blk_2744169131466786624_10 > 01. Target Replicas is 3 but found 2 replica(s). > . > /user/HadoopAdmin/input/file02.txt: Under replicated > blk_2021956984317789924_10 > 02. 
Target Replicas is 3 but found 2 replica(s). > . > /user/HadoopAdmin/input/test.txt: Under replicated > blk_-3062256167060082648_100 > 4. Target Replicas is 3 but found 2 replica(s). > ... > /user/HadoopAdmin/output/part-0: Under replicated > blk_8908973033976428484_1 > 010. Target Replicas is 3 but found 2 replica(s). > Status: HEALTHY > Total size:48510226 B > Total dirs:492 > Total files: 439 (Files currently being written: 2) > Total blocks (validated): 401 (avg. block size 120973 B) (Total > open file > blocks (not validated): 2) > Minimally replicated blocks: 401 (100.0 %) > Over-replicated blocks:0 (0.0 %) > Under-replicated blocks: 399 (99.50124 %) > Mis-replicated blocks: 0 (0.0 %) > Default replication factor:2 > Average block replication: 1.3117207 > Corrupt blocks:0 > Missing replicas: 675 (128.327 %) > Number of data-nodes: 2 > Number of racks: 1 > > > The filesystem under path '/' is HEALTHY > Please tell what is wrong. > > Aseem > > -Original Message- > From: Alex Loddengaard [mailto:a...@cloudera.com] > Sent: Friday, April 10, 2009 11:04 PM > To: core-user@hadoop.apache.org > Subject: Re: More Replication on dfs > > Aseem, > > How are you verifying that blocks are not being replicated? Have you > ran > fsck? *bin/hadoop fsck /* > > I'd be surprised if replication really wasn't happening. Can you run > fsck > and pay attention to "Under-replicated blocks" and "Mis-replicated > blocks?" > In fact, can you just copy-paste the output of fsck? > > Alex > > On Thu, Apr 9, 2009 at 11:23 PM, Puri, Aseem > wrote: > >> Hi >>I also tried the command $ bin/hadoop balancer. But still the >> same problem. >> >> Aseem >> >> -Original Message- >> From: Puri, Aseem [mailto:aseem.p...@honeywell.com] >> Sent: Friday, April 10, 2009 11:18 AM >> To: core-user@hadoop.apache.org >> Subject: RE: More Replication on dfs >> >> Hi Alex, >> >>Thanks for sharing your knowledge. 
Till now I have three >> machines and I have to check the behavior of Hadoop so I want >> replication factor should be 2. I started my Hadoop server with >> replication factor 3. After that I upload 3 files to implement word >> count program. But as my all files are stored on one machine and >> replicated to other datanodes also, so my map reduce program takes > input >> from one Datanode only. I want my files to be on different data node > so >> to check functionality of map reduce properly. >> >>Also before starting my Hadoop server again with replication >> factor 2 I formatted all Datanodes and deleted all old data manually. >> >> Please suggest what I should do now. >> >> Regards, >> Aseem Puri >> >> >> -Original Message- >> From: Mithila Nagendra [mailto:mnage...@asu.edu] >> Sent: Friday, April 10, 2009 10:56 AM >> To: core-user@ha
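Two things about dfs.replication are easy to miss in this thread: it is applied by the client at file-creation time, so any client whose configuration still says 3 (HBase, for instance, ships its own configuration files) keeps creating triple-replicated files no matter what the datanodes' site files say; and changing the setting never touches files that already exist. The site entry itself:

```xml
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
```

Files already in HDFS can be changed in place with `bin/hadoop dfs -setrep -w 2 -R /`. Note too that the fsck output above actually reports under-replication, not over-replication: an average of 1.31 replicas against targets of 3, with fsck seeing only 2 datanodes even though 3 were expected.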