Re: Problems with HOD and HDFS
On Tuesday 15 June 2010 04:19 AM, David Milne wrote: [2010-06-15 10:07:52,470] DEBUG/10 torque:147 - pbsdsh command: /opt/torque-2.4.5/bin/pbsdsh /home/dmilne/hadoop/hadoop-0.20.1/contrib/hod/bin/hodring --hodring.tarball-retry-initial-time 1.0 --hodring.cmd-retry-initial-time 2.0 --hodring.cmd-retry-interval 2.0 --hodring.service-id 34350.symphony.cs.waikato.ac.nz --hodring.temp-dir /scratch/local/dmilne/hod --hodring.http-port-range 8000-9000 --hodring.userid dmilne --hodring.java-home /opt/jdk1.6.0_20 --hodring.svcrgy-addr symphony.cs.waikato.ac.nz:36372 --hodring.download-addr h:t --hodring.tarball-retry-interval 3.0 --hodring.log-dir /scratch/local/dmilne/hod/log --hodring.mapred-system-dir-root /mapredsystem --hodring.xrs-port-range 32768-65536 --hodring.debug 4 --hodring.ringmaster-xrs-addr cn71:33771 --hodring.register [2010-06-15 10:07:52,475] DEBUG/10 ringMaster:929 - Returned from runWorkers. //chorus (many times) Did you mean the pbsdsh command itself was printed many times above? That should not happen. I previously thought the hodrings could not start the namenode, but it looks like the hodrings themselves failed to start up. You can do two things: - Check the qstat output, log into the slave nodes where your job was supposed to start, and look at the hodring logs there. - Run the above hodring command yourself directly on those slave nodes and see if it fails with some error. +Vinod
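For example, something along these lines (assuming the service id above is also your Torque job id; cn71 is taken from the ringmaster address in the log, and the log directory is the one passed as --hodring.log-dir):

qstat -n 34350.symphony.cs.waikato.ac.nz   # list the slave nodes assigned to the job
ssh cn71                                   # log into one of them
ls -l /scratch/local/dmilne/hod/log        # check the hodring logs there for errors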
Re: Problems with HOD and HDFS
David Milne wrote: Is there something else I could read about setting up short-lived Hadoop clusters on virtual machines? I have no experience with VMs at all. I see there is quite a bit of material about using them to get Hadoop up and running with a pseudo-cluster on a single machine, but I don't follow how this stretches out to using multiple machines allocated by Torque. My slides are up here http://www.slideshare.net/steve_l/farming-hadoop-inthecloud We've been bringing up Hadoop in a virtual infrastructure: first you ask for the master node containing a NN, a JT and a DN with almost no storage (just enough for the filesystem to go live, to stop the JT blocking). If it comes up you then have a stable hostname for the filesystem which you can use for all the real worker nodes (DN + TT) you want. Some nearby physicists are trying to get Hadoop to co-exist with the grid schedulers; I've added a feature request to make the reporting of task tracker slots something plugins can handle, so that you'd have a set of Hadoop workers which could be used by the grid apps or by Hadoop - with physical Hadoop storage. When they were doing work scheduled outside of Hadoop, they'd report less availability to the Job Tracker, so as not to overload the machines. Dan Templeton of Sun/Oracle has been working on getting Hadoop to coexist with his resource manager - he's worth contacting. Maybe we could persuade him to give a public online talk on the topic. -steve
Using wget to download file from HDFS
Hello, HDFS supports HTTP read-only access to the filesystem. Is it possible to use wget to download a file using some URL like http://<namenode>:<web gui port>/... ? Thanks Jaydeep
Re: Hadoop and IP on InfiniBand (IPoIB)
Thanks, Allen, for responding. So, if I understand you correctly, the dfs.datanode.dns.interface and mapred.tasktracker.dns.interface options may be used to define inbound connections only? Concerning the OS configuration, my /etc/hosts files assign unique host names to the ethernet and IB interfaces. However, even if I specify the IB host names in the masters and slaves files, communication still occurs via ethernet, not via IB. Your recommendation would therefore be to define IB instead of ethernet as the default network interface connection, right? Thanks, Russ On 06/14/10 12:32 PM, Allen Wittenauer wrote: On Jun 14, 2010, at 10:57 AM, Russell Brown wrote: I'm a new user of Hadoop. I have a Linux cluster with both gigabit ethernet and InfiniBand communications interfaces. Could someone please tell me how to switch IP communication from ethernet (the default) to InfiniBand? Thanks. Hadoop will bind inbound connections via the interface settings in the various hadoop configuration files. Outbound connections are unbound and based solely on OS configuration. I filed a jira to fix this, but it is obviously low priority since few people run multi-nic boxes. Best bet is to down the ethernet and up the IB, changing routing, etc, as necessary. -- Russell A. Brown| Oracle russ.br...@oracle.com | UMPK14-260 (650) 786-3011 (office) | 14 Network Circle (650) 786-3453 (fax)| Menlo Park, CA 94025
Re: Using wget to download file from HDFS
Sure you can. An HTTP download option is also provided in the DataNode web interface (default port 50075). Use the streamFile feature of the same. An example follows. If I have a file called 'results' lying as /user/hadoop/results, I'll do: wget http://hostname.top.dom:50075/streamFile?filename=/user/hadoop/results -O results This will get me the file data in the wget-local file 'results'. On Tue, Jun 15, 2010 at 7:12 PM, Jaydeep Ayachit jaydeep_ayac...@persistent.co.in wrote: Hello, HDFS supports HTTP read-only access to the filesystem. Is it possible to use wget to download a file using some URL like http://<namenode>:<web gui port>/... ? Thanks Jaydeep -- Harsh J www.harshj.com
Jobtracker java.lang.NumberFormatException
Hi All, I have a multinode cluster with 1 master (namenode + jobtracker) and 2 slaves (datanode + tasktracker). I can start the namenode and datanodes, but CAN'T start the jobtracker. The log shows java.lang.NumberFormatException. I will be grateful if anybody can tell me what the problem is and why this Java exception is being thrown. Here is the complete log; the contents of all the files are included below (master, slaves, core-site.xml, etc.).

2010-06-15 17:05:12,679 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
STARTUP_MSG: Starting JobTracker
STARTUP_MSG: host = centosxcat1/192.168.15.140
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
2010-06-15 17:05:12,756 INFO org.apache.hadoop.mapred.JobTracker: Scheduler configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1)
2010-06-15 17:05:12,768 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.NumberFormatException: For input string: 54311
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:481)
at java.lang.Integer.parseInt(Integer.java:514)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:146)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:123)
at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:1807)
at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1579)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:183)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:175)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3702)
2010-06-15 17:05:12,769 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
SHUTDOWN_MSG: Shutting down JobTracker at centosxcat1/192.168.15.140

cat conf/master
centosxcat1

cat conf/salves
aadityaxcat3
linux-466z

cat conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/fsname</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/fsdata</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

cat conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>centosxcat1:54311</value>
  </property>
</configuration>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://centosxcat1</value>
  </property>
</configuration>

cat conf/hdfs-site.xml
RE: Using wget to download file from HDFS
Thanks. The data node may not be known up front. Is it possible to direct the URL to the namenode, with the namenode handling the streaming by fetching data from the various data nodes? Regards Jaydeep -----Original Message----- From: Harsh J [mailto:qwertyman...@gmail.com] Sent: Tuesday, June 15, 2010 9:38 PM To: common-user@hadoop.apache.org Subject: Re: Using wget to download file from HDFS Sure you can. An HTTP download option is also provided in the DataNode web interface (default port 50075). Use the streamFile feature of the same. An example follows. If I have a file called 'results' lying as /user/hadoop/results, I'll do: wget http://hostname.top.dom:50075/streamFile?filename=/user/hadoop/results -O results This will get me the file data in the wget-local file 'results'. On Tue, Jun 15, 2010 at 7:12 PM, Jaydeep Ayachit jaydeep_ayac...@persistent.co.in wrote: Hello, HDFS supports HTTP read-only access to the filesystem. Is it possible to use wget to download a file using some URL like http://<namenode>:<web gui port>/... ? Thanks Jaydeep -- Harsh J www.harshj.com
Re: Jobtracker java.lang.NumberFormatException
Hi Ankit, You need to trim your configuration variables so there is no extra whitespace, e.g. <value>foo</value>, not: <value> foo </value>. There's a patch up for this in many of the configs, but I'm not sure if we got mapred.job.tracker. -Todd On Tue, Jun 15, 2010 at 5:55 AM, ankit sharma ankit1984.c...@gmail.com wrote: Hi All, I have a multinode cluster with 1 master (namenode + jobtracker) and 2 slaves (datanode + tasktracker). I can start the namenode and datanodes, but CAN'T start the jobtracker. The log shows java.lang.NumberFormatException. [...] -- Todd Lipcon Software Engineer, Cloudera
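For example, the property should look like this, with no stray whitespace or newlines inside the value element (host and port taken from the mapred-site.xml pasted above):

<property>
  <name>mapred.job.tracker</name>
  <value>centosxcat1:54311</value>
</property>

A value with whitespace or a trailing newline around "centosxcat1:54311" is exactly what makes Integer.parseInt throw the NumberFormatException shown in the log.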
Re: Using wget to download file from HDFS
On Tue, Jun 15, 2010 at 12:30 PM, Jaydeep Ayachit jaydeep_ayac...@persistent.co.in wrote: Thanks. The data node may not be known up front. Is it possible to direct the URL to the namenode, with the namenode handling the streaming by fetching data from the various data nodes? Regards Jaydeep -----Original Message----- From: Harsh J [mailto:qwertyman...@gmail.com] Sent: Tuesday, June 15, 2010 9:38 PM To: common-user@hadoop.apache.org Subject: Re: Using wget to download file from HDFS Sure you can. An HTTP download option is also provided in the DataNode web interface (default port 50075). Use the streamFile feature of the same. An example follows. If I have a file called 'results' lying as /user/hadoop/results, I'll do: wget http://hostname.top.dom:50075/streamFile?filename=/user/hadoop/results -O results This will get me the file data in the wget-local file 'results'. On Tue, Jun 15, 2010 at 7:12 PM, Jaydeep Ayachit jaydeep_ayac...@persistent.co.in wrote: Hello, HDFS supports HTTP read-only access to the filesystem. Is it possible to use wget to download a file using some URL like http://<namenode>:<web gui port>/... ? Thanks Jaydeep -- Harsh J www.harshj.com To accomplish something like this: you have to use the name node web interface and extract the names of the datanodes from the HTML, then follow the above process. :::Edward reaches in his bag of tricks::: Or you can kick up a webserver with Tomcat to serve HDFS. http://www.edwardcapriolo.com/wiki/en/Tomcat_Hadoop
Re: How to use MapFile in mapreduce
Yes, your thought was right! Using SequenceFileInputFormat should work fine (a MapFile is just a specialization of a SequenceFile, a sorted one), so just pass the input paths to it. On Tue, Jun 15, 2010 at 10:43 PM, Asif Jan asif@unige.ch wrote: Hi, any pointers on how to use a MapFile with the new mapreduce API? I did find the corresponding output format, e.g. org.apache.hadoop.mapreduce.lib.output.MapFileOutputFormat, but was not able to see how I can specify a MapFileInputFormat. (Naively I thought that org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat should work for MapFile as well.) Will I have to implement a RecordReader in order to read from a MapFile? Thanks -- Harsh J www.harshj.com
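A rough sketch of the job wiring with the new API (class name, paths and the Text key/value types below are just placeholders, not from your code; adjust them to whatever your MapFile actually stores):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapFileRead {
  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "read-mapfile");
    job.setJarByClass(MapFileRead.class);
    // A MapFile is a directory holding sorted SequenceFiles ("data" plus "index"),
    // so pointing SequenceFileInputFormat at the MapFile directory is enough.
    job.setInputFormatClass(SequenceFileInputFormat.class);
    FileInputFormat.addInputPath(job, new Path("/user/asif/my-mapfile"));
    FileOutputFormat.setOutputPath(job, new Path("/user/asif/out"));
    // Identity map/reduce for illustration; set your own Mapper/Reducer as needed.
    job.setOutputKeyClass(Text.class);    // adjust to the MapFile's key type
    job.setOutputValueClass(Text.class);  // adjust to the MapFile's value type
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}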
Re: Hadoop and IP on InfiniBand (IPoIB)
On Jun 15, 2010, at 7:40 AM, Russell Brown wrote: Thanks, Allen, for responding. So, if I understand you correctly, the dfs.datanode.dns.interface and mapred.tasktracker.dns.interface options may be used to define inbound connections only? Correct. The daemons will bind to those interfaces and use those names as their 'official' connection in. Concerning the OS configuration, my /etc/hosts files assign unique host names to the ethernet and IB interfaces. However, even if I specify the IB host names in the masters and slaves files, communication still occurs via ethernet, not via IB. BTW, are you doing this on Solaris or Linux? Solaris is notorious for not honoring inbound and outbound interfaces. [In other words, just because the packet came in on bge0, that is no guarantee that the reply will go out on bge0 if another route is available. Particularly frustrating with NFS and SunCluster.] Your recommendation would therefore be to define IB instead of ethernet as the default network interface connection, right? Yup. Or at least give it a lower cost in the routing table.
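For reference, setting those options looks roughly like this (ib0 is just an assumed interface name here, use whatever your IB device is actually called; remember these only affect what the daemons bind to and report, not outbound routing):

In hdfs-site.xml:
<property>
  <name>dfs.datanode.dns.interface</name>
  <value>ib0</value>
</property>

In mapred-site.xml:
<property>
  <name>mapred.tasktracker.dns.interface</name>
  <value>ib0</value>
</property>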
Re: Using wget to download file from HDFS
On Jun 15, 2010, at 9:30 AM, Jaydeep Ayachit wrote: Thanks, data node may not be known. Is it possible to direct url to namenode and namenode handling streaming by fetching data from various data nodes? If you access the servlet on the NameNode, it will automatically redirect you to a data node that has some of the data on it. You certainly should not pick a random data node yourself. Also note that in yahoo 0.20.104 or 0.22, you'll need a Kerberos ticket or delegation token to use the servlet. -- Owen
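For example, something like this should work against the namenode directly (the 50070 port and the /data servlet path are assumptions based on the 0.20-era hftp interface, not something stated in this thread; wget follows the redirect to a datanode on its own):

wget "http://<namenode>:50070/data/user/hadoop/results" -O results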
Re: Hadoop and IP on InfiniBand (IPoIB)
FYI, Allen Wittenauer, I'm using Linux not Solaris, but I'll pay attention to your comment about Solaris if I install Solaris on the cluster. Thanks again for your helpful comments. Russ On 06/15/10 11:10 AM, Allen Wittenauer wrote: On Jun 15, 2010, at 7:40 AM, Russell Brown wrote: Thanks, Allen, for responding. So, if I understand you correctly, the dfs.datanode.dns.interface and mapred.tasktracker.dns.interface options may be used to define inbound connections only? Correct. The daemons will bind to those interfaces and use those names as their 'official' connection in. Concerning the OS configuration, my /etc/hosts files assign unique host names to the ethernet and IB interfaces. However, even if I specify the IB host names in the masters and slaves files, communication still occurs via ethernet, not via IB. BTW, are you doing this on Solaris or Linux? Solaris is notorious for not honoring inbound and outbound interfaces. [In other words, just because the packet came in on bge0, that is no guarantee that the reply will go out on bge0 if another route is available. Particularly frustrating with NFS and SunCluster.] Your recommendation would therefore be to define IB instead of ethernet as the default network interface connection, right? Yup. Or at least give it a lower cost in the routing table. -- Russell A. Brown | Oracle russ.br...@oracle.com | UMPK14-260 (650) 786-3011 (office) | 14 Network Circle (650) 786-3453 (fax) | Menlo Park, CA 94025
Re: Problems with HOD and HDFS
Hi David, The original HOD project was integrated with Condor ( http://bit.ly/CondorProject), which Yahoo! was using to schedule clusters. A year or two ago, the Condor project in addition to being open-source w/o costs for licensing, created close integration with Hadoop (as does SGE), as presented by me at a prior Hadoop World, and the Condor team at Condor Week 2010: http://bit.ly/Condor_Hadoop_CondorWeek2010 My company has solutions for deploying Hadoop Clusters on shared infrastructure using CycleServer and schedulers like Condor/SGE/etc. The general deployment strategy is to deploy head nodes (Name/Job Tracker), then execute nodes, and to be careful about how you deal with data/sizing/replication counts. If you're interested in this, please feel free to drop us a line at my e-mail or http://cyclecomputing.com/about/contact Thanks, Jason On Mon, Jun 14, 2010 at 7:45 PM, David Milne d.n.mi...@gmail.com wrote: Unless I am missing something, the Fair Share and Capacity schedulers sound like a solution to a different problem: aren't they for a dedicated Hadoop cluster that needs to be shared by lots of people? I have a general purpose cluster that needs to be shared by lots of people. Only one of them (me) wants to run hadoop, and only wants to run it intermittently. I'm not concerned with data locality, as my workflow is: 1) upload data I need to process to cluster 2) run a chain of map-reduce tasks 3) grab processed data from cluster 4) clean up cluster Mesos sounds good, but I am definitely NOT brave about this. As I said, I am just one user of the cluster among many. I would want to stick with Torque and Maui for resource management. - Dave On Tue, Jun 15, 2010 at 12:37 AM, Amr Awadallah a...@cloudera.com wrote: Dave, Yes, many others have the same situation, the recommended solution is either to use the Fair Share Scheduler or the Capacity Scheduler. These schedulers are much better than HOD since they take data locality into consideration (they don't just spin up 20 TT nodes on machines that have nothing to do with your data). They also don't lock down the nodes just for you, so as TT are freed other jobs can use them immediately (as opposed to no body can use them till your entire job is done). Also, if you are brave and want to try something spanking new, then I recommend you reach out to the Mesos guys, they have a scheduler layer under Hadoop that is data locality aware: http://mesos.berkeley.edu/ -- amr On Sun, Jun 13, 2010 at 9:21 PM, David Milne d.n.mi...@gmail.com wrote: Ok, thanks Jeff. This is pretty surprising though. I would have thought many people would be in my position, where they have to use Hadoop on a general purpose cluster, and need it to play nice with a resource manager? What do other people do in this position, if they don't use HOD? Deprecated normally means there is a better alternative. - Dave On Mon, Jun 14, 2010 at 2:39 PM, Jeff Hammerbacher ham...@cloudera.com wrote: Hey Dave, I can't speak for the folks at Yahoo!, but from watching the JIRA, I don't think HOD is actively used or developed anywhere these days. You're attempting to use a mostly deprecated project, and hence not receiving any support on the mailing list. Thanks, Jeff On Sun, Jun 13, 2010 at 7:33 PM, David Milne d.n.mi...@gmail.com wrote: Anybody? I am completely stuck here. I have no idea who else I can ask or where I can go for more information. Is there somewhere specific where I should be asking about HOD? 
Thank you, Dave On Thu, Jun 10, 2010 at 2:56 PM, David Milne d.n.mi...@gmail.com wrote: Hi there, I am trying to get Hadoop on Demand up and running, but am having problems with the ringmaster not being able to communicate with HDFS. The output from the hod allocate command ends with this, with full verbosity: [2010-06-10 14:40:22,650] CRITICAL/50 hadoop:298 - Failed to retrieve 'hdfs' service address. [2010-06-10 14:40:22,654] DEBUG/10 hadoop:631 - Cleaning up cluster id 34029.symphony.cs.waikato.ac.nz, as cluster could not be allocated. [2010-06-10 14:40:22,655] DEBUG/10 hadoop:635 - Calling rm.stop() [2010-06-10 14:40:22,665] DEBUG/10 hadoop:637 - Returning from rm.stop() [2010-06-10 14:40:22,666] CRITICAL/50 hod:401 - Cannot allocate cluster /home/dmilne/hadoop/cluster [2010-06-10 14:40:23,090] DEBUG/10 hod:597 - return code: 7 I've attached the hodrc file below, but briefly HOD is supposed to provision an HDFS cluster as well as a Map/Reduce cluster, and seems to be failing to do so. The ringmaster log looks like this: [2010-06-10 14:36:05,144] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs [2010-06-10 14:36:05,145] DEBUG/10 ringMaster:487 - getServiceAddr service: hodlib.GridServices.hdfs.Hdfs instance at
exception related to logging (using latest sources)
Hi, I am getting the following exception when running map-reduce jobs:

java.lang.NullPointerException
at org.apache.hadoop.mapred.TaskLogAppender.flush(TaskLogAppender.java:69)
at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:222)
at org.apache.hadoop.mapred.Child$4.run(Child.java:219)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:813)
at org.apache.hadoop.mapred.Child.main(Child.java:211)

I am using the latest sources (0.22.0-SNAPSHOT) that I have built myself. Any ideas? Thanks
Re: Jobtracker java.lang.NumberFormatException
Hi, I found the mistake. It was an unwanted newline in mapred-site.xml, in the mapred.job.tracker value. On Tue, Jun 15, 2010 at 6:25 PM, ankit sharma ankit1984.c...@gmail.com wrote: Hi All, I have a multinode cluster with 1 master (namenode + jobtracker) and 2 slaves (datanode + tasktracker). I can start the namenode and datanodes, but CAN'T start the jobtracker. The log shows java.lang.NumberFormatException. [...]
Re: Hbase tutorial?
Just write it this way in the HBase shell: create 'webtable', 'mycolumn'
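A minimal shell session to check it works might look like this (table and column family names as above; the row key and value are made up):

hbase(main):001:0> create 'webtable', 'mycolumn'
hbase(main):002:0> put 'webtable', 'row1', 'mycolumn:anchor', 'some value'
hbase(main):003:0> scan 'webtable'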
Re: Problems with HOD and HDFS
On Tue, Jun 15, 2010 at 3:10 PM, Jason Stowe jst...@cyclecomputing.com wrote: Hi David, The original HOD project was integrated with Condor (http://bit.ly/CondorProject), which Yahoo! was using to schedule clusters. [...]
mapred.jobtracker.retirejob.interval
Hi, When I ran a job (containing some hundreds of thousands of tasks) over our hadoop-0.19.2 cluster, I got an OutOfMemoryError at the JobTracker. Monitoring memory usage at the JobTracker with Ganglia, it looks like the memory space of the JobTracker is released every 24 hours, which is the default value of mapred.jobtracker.retirejob.interval. What happens if I set mapred.jobtracker.retirejob.interval to a smaller value like 12 hours, or even 0? Does it only control the interval at which jobs are retired from the JobTracker's memory? Are there any other side effects you can think of? Thanks. Manhee
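For reference, the knob itself is just a mapred-site.xml property, in milliseconds if I remember the 0.19 defaults right (the 12-hour figure below is only an example; the stock default corresponds to 24 hours):

<property>
  <name>mapred.jobtracker.retirejob.interval</name>
  <value>43200000</value>
</property>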
Size of intermediate data in the gridmix benchmark
Hello all, If I scale down the size of the input data for the gridmix2 benchmark to 100 MB, what would be the maximum amount of intermediate data that would be generated? Please let me know how I could figure this out before running the benchmark. Thanks, Vikas A Patil