transferring between HDFS clusters which reside in different subnets
Hi,

I have a question for the Hadoop experts: I have two HDFS clusters in different subnets.

HDFS1: 192.168.*.*
HDFS2: 10.10.*.*

The namenode of HDFS2 has two NICs, one connected to 192.168.*.* and the other to 10.10.*.*. Is it possible to transfer data from HDFS1 to HDFS2 and vice versa?

Regards,
Arindam
Re: transferring between HDFS clusters which reside in different subnets
I cannot cross-access HDFS. Although HDFS2 has two NICs, its HDFS is running on the other subnet.

On Fri, May 11, 2012 at 3:57 PM, Shi Yu sh...@uchicago.edu wrote:

> If you could cross-access HDFS from both name nodes, then it should be transferable using the distcp command.
>
> Shi
>
> On 5/11/2012 8:45 AM, Arindam Choudhury wrote:
>> Hi, I have a question for the Hadoop experts: I have two HDFS clusters in different subnets. HDFS1: 192.168.*.*, HDFS2: 10.10.*.*. The namenode of HDFS2 has two NICs, one connected to 192.168.*.* and the other to 10.10.*.*. Is it possible to transfer data from HDFS1 to HDFS2 and vice versa? Regards, Arindam
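For reference, the distcp invocation Shi mentions takes HDFS URIs for the source and destination clusters. A minimal sketch; the namenode hostnames and the port are made-up placeholders, not values from this thread, and must match each cluster's fs.default.name:

  hadoop distcp hdfs://namenode1:8020/user/arindam/data hdfs://namenode2:8020/user/arindam/data

Note that distcp runs as a MapReduce job on whichever cluster launches it, so the task nodes of that cluster need network access to the other cluster's namenode and datanodes, which is exactly the cross-subnet reachability problem being discussed here.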
Re: transferring between HDFS clusters which reside in different subnets
So, hadoop dfs -cp hdfs:// hdfs://... will work then?

On Fri, May 11, 2012 at 4:14 PM, Rajesh Sai T tsairaj...@gmail.com wrote:

> Looks like both are private subnets, so you would have to route via a public default gateway. Try adding a route using the route command if you are on Linux (for Windows I have no idea). Just a thought, I haven't tried it though.
>
> Thanks,
> Rajesh
>
> Typed from mobile, please bear with typos.
>
> On May 11, 2012 10:03 AM, Arindam Choudhury arindamchoudhu...@gmail.com wrote:
>> I cannot cross-access HDFS. Although HDFS2 has two NICs, its HDFS is running on the other subnet.
>>
>> On Fri, May 11, 2012 at 3:57 PM, Shi Yu sh...@uchicago.edu wrote:
>>> If you could cross-access HDFS from both name nodes, then it should be transferable using the distcp command.
>>>
>>> Shi
>>>
>>> On 5/11/2012 8:45 AM, Arindam Choudhury wrote:
>>>> Hi, I have a question for the Hadoop experts: I have two HDFS clusters in different subnets. HDFS1: 192.168.*.*, HDFS2: 10.10.*.*. The namenode of HDFS2 has two NICs, one connected to 192.168.*.* and the other to 10.10.*.*. Is it possible to transfer data from HDFS1 to HDFS2 and vice versa? Regards, Arindam
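A rough illustration of the routing idea Rajesh suggests, assuming a Linux client in the 192.168.*.* subnet and using the dual-homed namenode host as the gateway. The 192.168.1.100 address and the /16 mask are made up for the example, and the gateway host would also need IP forwarding enabled (and the 10.10.*.* side needs return routes):

  # classic net-tools syntax
  route add -net 10.10.0.0 netmask 255.255.0.0 gw 192.168.1.100

  # equivalent iproute2 syntax
  ip route add 10.10.0.0/16 via 192.168.1.100

Once the namenodes and datanodes of both clusters can actually reach each other this way, either a plain hadoop dfs -cp between the two hdfs:// URIs or distcp should work; distcp is generally preferred for bulk copies since it parallelizes the transfer.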
understanding hadoop job submission
Hi,

I am new to Hadoop and I am trying to understand Hadoop job submission. We submit the job using:

  hadoop jar some.jar name input output

This in turn invokes RunJar. But in RunJar I cannot find any JobSubmit() or any call to JobClient. Then how does the job get submitted to the JobTracker?

-Arindam
Re: understanding hadoop job submission
Hi,

The code is:

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }

I understand it now. But is it possible to write a program that uses the JobClient to submit the Hadoop job? To do that I would have to create a JobConf manually. Am I thinking right?

Arindam

On Wed, Apr 25, 2012 at 10:56 AM, Devaraj k devara...@huawei.com wrote:

> Hi Arindam,
>
>   hadoop jar jarFileName MainClassName
>
> The above command does not submit the job by itself. It only executes the jar file using the main class: the Main-Class from the manifest if one is present, otherwise the class name passed as an argument (i.e. MainClassName in the command above). Any additional arguments on the command line are passed to that main class as args. The job submission code can live in the main class or in any other class in the jar file. You can take a look at the WordCount example for job submission info.
>
> Thanks
> Devaraj
>
> From: Arindam Choudhury [arindamchoudhu...@gmail.com]
> Sent: Wednesday, April 25, 2012 2:14 PM
> To: common-user
> Subject: understanding hadoop job submission
>
> Hi,
>
> I am new to Hadoop and I am trying to understand Hadoop job submission. We submit the job using:
>
>   hadoop jar some.jar name input output
>
> This in turn invokes RunJar. But in RunJar I cannot find any JobSubmit() or any call to JobClient. Then how does the job get submitted to the JobTracker?
>
> -Arindam
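No worked example follows in the thread, but the answer to the JobClient question is yes: with the older org.apache.hadoop.mapred API a job is described by a JobConf and handed to a JobClient explicitly. A minimal sketch against the Hadoop 1.x mapred API; the ManualSubmit class name is made up, and the job uses the default identity mapper and reducer so it stays self-contained (a real job would register old-API mapper/reducer classes via conf.setMapperClass()/setReducerClass()):

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.FileInputFormat;
  import org.apache.hadoop.mapred.FileOutputFormat;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.RunningJob;

  public class ManualSubmit {
    public static void main(String[] args) throws Exception {
      // Old-style job description; plays the role of Job in the new API.
      JobConf conf = new JobConf(ManualSubmit.class);
      conf.setJobName("manual submit example");
      // TextInputFormat (the default) produces LongWritable/Text pairs, and the
      // default identity mapper/reducer pass them straight through.
      conf.setOutputKeyClass(LongWritable.class);
      conf.setOutputValueClass(Text.class);
      FileInputFormat.setInputPaths(conf, new Path(args[0]));
      FileOutputFormat.setOutputPath(conf, new Path(args[1]));

      // JobClient talks to the JobTracker. submitJob() returns immediately with
      // a handle to the running job; JobClient.runJob(conf) would block instead.
      JobClient client = new JobClient(conf);
      RunningJob running = client.submitJob(conf);
      System.out.println("Submitted job " + running.getID());
    }
  }

Note that the old mapred API and the new mapreduce API are not interchangeable, so the TokenizerMapper/IntSumReducer classes from the WordCount driver above could not be plugged into this JobConf directly.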
Re: remote job submission
> If you are allowed a remote connection to the cluster's service ports, then you can directly submit your jobs from your local CLI. Just make sure your local configuration points to the right locations.

Can you elaborate in detail, please?

On Fri, Apr 20, 2012 at 2:20 PM, Harsh J ha...@cloudera.com wrote:

> If you are allowed a remote connection to the cluster's service ports, then you can directly submit your jobs from your local CLI. Just make sure your local configuration points to the right locations.
>
> Otherwise, perhaps you can choose to use Apache Oozie (Incubating) (http://incubator.apache.org/oozie/). It does provide a REST interface that launches jobs for you over the supplied clusters, but it is more oriented towards workflow management.
>
> Or perhaps HUE: https://github.com/cloudera/hue
>
> On Fri, Apr 20, 2012 at 5:37 PM, Arindam Choudhury arindamchoudhu...@gmail.com wrote:
>> Hi,
>>
>> Does Hadoop have any web service or other interface so I can submit jobs from a remote machine?
>>
>> Thanks,
>> Arindam
>
> --
> Harsh J
Re: remote job submission
Sorry, but can you give me an example?

On Fri, Apr 20, 2012 at 3:08 PM, Harsh J ha...@cloudera.com wrote:

> Arindam,
>
> If your machine can access the cluster's NN/JT/DN ports, then you can simply run your job from the machine itself.
>
> On Fri, Apr 20, 2012 at 6:31 PM, Arindam Choudhury arindamchoudhu...@gmail.com wrote:
>>> If you are allowed a remote connection to the cluster's service ports, then you can directly submit your jobs from your local CLI. Just make sure your local configuration points to the right locations.
>>
>> Can you elaborate in detail, please?
>>
>> On Fri, Apr 20, 2012 at 2:20 PM, Harsh J ha...@cloudera.com wrote:
>>> If you are allowed a remote connection to the cluster's service ports, then you can directly submit your jobs from your local CLI. Just make sure your local configuration points to the right locations.
>>>
>>> Otherwise, perhaps you can choose to use Apache Oozie (Incubating) (http://incubator.apache.org/oozie/). It does provide a REST interface that launches jobs for you over the supplied clusters, but it is more oriented towards workflow management.
>>>
>>> Or perhaps HUE: https://github.com/cloudera/hue
>>>
>>> On Fri, Apr 20, 2012 at 5:37 PM, Arindam Choudhury arindamchoudhu...@gmail.com wrote:
>>>> Hi,
>>>>
>>>> Does Hadoop have any web service or other interface so I can submit jobs from a remote machine?
>>>>
>>>> Thanks,
>>>> Arindam
>>>
>>> --
>>> Harsh J
>
> --
> Harsh J
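Since the thread ends without a concrete example, here is a minimal sketch of what Harsh describes: the client-side Configuration simply points at the remote cluster, and any driver built on it submits there. The hostnames and ports below are placeholders, the RemoteClient class name is made up, and the property names are the Hadoop 1.x ones (fs.default.name, mapred.job.tracker); the values must match the remote cluster's own configuration:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;

  public class RemoteClient {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      // Placeholder addresses for the remote NameNode and JobTracker endpoints.
      conf.set("fs.default.name", "hdfs://remote-namenode:8020");
      conf.set("mapred.job.tracker", "remote-jobtracker:8021");

      // Any job driver built on this Configuration (for example the WordCount
      // driver from the earlier thread) now submits to the remote JobTracker,
      // provided the NN/JT/DN ports are reachable from this machine.
      FileSystem fs = FileSystem.get(conf);
      System.out.println("Talking to " + fs.getUri());
    }
  }

Equivalently, the same two properties can be placed in the client machine's core-site.xml and mapred-site.xml, in which case a plain "hadoop jar some.jar name input output" from the local CLI submits to the remote cluster.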