transferring between HDFS which reside in different subnet

2012-05-11 Thread Arindam Choudhury
Hi,

I have a question to the hadoop experts:

I have two HDFS clusters, in different subnets.

HDFS1 : 192.168.*.*
HDFS2:  10.10.*.*

the namenode of HDFS2 has two NICs: one connected to 192.168.*.* and another
to 10.10.*.*.

So, is it possible to transfer data from HDFS1 to HDFS2 and vice versa?

Regards,
Arindam


Re: transferring between HDFS which reside in different subnet

2012-05-11 Thread Arindam Choudhury
I cannot cross-access HDFS. Though HDFS2's namenode has two NICs, HDFS is
running on the other subnet.

On Fri, May 11, 2012 at 3:57 PM, Shi Yu sh...@uchicago.edu wrote:

 If you could cross-access HDFS from both name nodes, then it should be
 transferable using the distcp command.

 Shi



Re: transferring between HDFS which reside in different subnet

2012-05-11 Thread Arindam Choudhury
So,

hadoop dfs -cp hdfs:// hdfs://...

will this work?
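For reference, the tool usually suggested for copies between two HDFS clusters is distcp rather than a plain -cp; a hedged sketch, where the namenode host names, ports, and paths are placeholders and not values from this thread:

```shell
# Copy a directory from HDFS1 to HDFS2 with distcp, which runs the copy
# as a MapReduce job. Assumes the client can reach both namenodes.
hadoop distcp hdfs://namenode1:8020/user/data hdfs://namenode2:8020/user/data
```

This only helps if at least one side can resolve and reach the other cluster's namenode and datanodes, which is exactly the cross-subnet question at issue here.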

On Fri, May 11, 2012 at 4:14 PM, Rajesh Sai T tsairaj...@gmail.com wrote:

 Looks like both are private subnets, so you will have to route via a
 default gateway. Try adding a route using the route command if you are on
 Linux (for Windows I have no idea). Just a thought; I haven't tried it though.

 Thanks,
 Rajesh

 Typed from mobile, please bear with typos.
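Rajesh's suggestion, as a concrete but untested sketch; the gateway address below is made up, and the assumption is that the dual-homed namenode host (or some other host on both networks) can forward traffic:

```shell
# Hypothetical: tell a 192.168.*.* host how to reach the 10.10.*.* subnet
# via a gateway that sits on both networks (addresses are placeholders).
sudo route add -net 10.10.0.0 netmask 255.255.0.0 gw 192.168.1.1
```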



understanding hadoop job submission

2012-04-25 Thread Arindam Choudhury
Hi,

I am new to hadoop and I am trying to understand hadoop job submission.

We submit the job using:

hadoop jar some.jar name input output

this in turn invokes RunJar. But in RunJar I cannot find any
JobSubmit() or any call to JobClient.

Then how does the job get submitted to the JobTracker?

-Arindam


Re: understanding hadoop job submission

2012-04-25 Thread Arindam Choudhury
Hi,

The code is:

public static void main(String[] args) throws Exception {
  Configuration conf = new Configuration();
  String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
  if (otherArgs.length != 2) {
    System.err.println("Usage: wordcount <in> <out>");
    System.exit(2);
  }
  Job job = new Job(conf, "word count");
  job.setJarByClass(WordCount.class);
  job.setMapperClass(TokenizerMapper.class);
  job.setCombinerClass(IntSumReducer.class);
  job.setReducerClass(IntSumReducer.class);
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(IntWritable.class);
  FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
  FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
  System.exit(job.waitForCompletion(true) ? 0 : 1);
}

I understand it now. But is it possible to write a program that uses
JobClient to submit the Hadoop job?

To do that, I would have to create a JobConf manually. Am I thinking right?

Arindam
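Yes; a minimal, untested sketch of driving the old mapred API directly via JobConf and JobClient. The mapper/reducer class names are the hypothetical ones from the WordCount snippet above, and Hadoop 1.x jars are assumed on the classpath:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class SubmitWithJobClient {
  public static void main(String[] args) throws Exception {
    // Build the JobConf by hand instead of using the newer Job API.
    JobConf conf = new JobConf(SubmitWithJobClient.class);
    conf.setJobName("word count");
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    // The old API needs mapred.Mapper/Reducer implementations set here, e.g.:
    // conf.setMapperClass(TokenizerMapper.class);  // hypothetical old-API mapper
    // conf.setReducerClass(IntSumReducer.class);   // hypothetical old-API reducer
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    // runJob() submits through JobClient and blocks until completion;
    // JobClient.submitJob(conf) would submit without waiting instead.
    RunningJob job = JobClient.runJob(conf);
  }
}
```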

On Wed, Apr 25, 2012 at 10:56 AM, Devaraj k devara...@huawei.com wrote:

 Hi Arindam,

hadoop jar jarFileName MainClassName

 The above command will not submit the job. It only executes the jar file
 using the main class (the Main-Class from the manifest if available,
 otherwise the class name passed as an argument, i.e. MainClassName in the
 above command). Any additional arguments given on the command line are
 passed to the main class's args.

   We can have job submission code in the main class or in any of the
 classes in the jar file. You can take a look at the WordCount example for
 job submission info.


 Thanks
 Devaraj

 


Re: remote job submission

2012-04-20 Thread Arindam Choudhury
If you are allowed a remote connection to the cluster's service ports,
then you can directly submit your jobs from your local CLI. Just make
sure your local configuration points to the right locations.

Can you elaborate in detail, please?

On Fri, Apr 20, 2012 at 2:20 PM, Harsh J ha...@cloudera.com wrote:

 If you are allowed a remote connection to the cluster's service ports,
 then you can directly submit your jobs from your local CLI. Just make
 sure your local configuration points to the right locations.

 Otherwise, perhaps you can choose to use Apache Oozie (Incubating)
 (http://incubator.apache.org/oozie/). It does provide a REST interface
 that launches jobs for you on the supplied clusters, but it's more
 oriented towards workflow management. Or perhaps HUE:
 https://github.com/cloudera/hue

 On Fri, Apr 20, 2012 at 5:37 PM, Arindam Choudhury
 arindamchoudhu...@gmail.com wrote:
  Hi,
 
  Does Hadoop have any web service or other interface so I can submit
  jobs from a remote machine?
 
  Thanks,
  Arindam



 --
 Harsh J



Re: remote job submission

2012-04-20 Thread Arindam Choudhury
Sorry, but can you give me an example?
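For illustration, the "local configuration points to the right locations" part usually boils down to client-side entries like these for Hadoop 1.x; the host names and ports below are placeholders, not values from this thread:

```xml
<!-- core-site.xml on the submitting machine -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>

<!-- mapred-site.xml on the submitting machine -->
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker.example.com:8021</value>
</property>
```

With these set, the ordinary "hadoop jar some.jar name input output" command run locally submits against the remote cluster, provided those ports are reachable.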

On Fri, Apr 20, 2012 at 3:08 PM, Harsh J ha...@cloudera.com wrote:

 Arindam,

 If your machine can access the clusters' NN/JT/DN ports, then you can
 simply run your job from the machine itself.

 On Fri, Apr 20, 2012 at 6:31 PM, Arindam Choudhury
 arindamchoudhu...@gmail.com wrote:
  If you are allowed a remote connection to the cluster's service ports,
  then you can directly submit your jobs from your local CLI. Just make
  sure your local configuration points to the right locations.
 
  Can you elaborate in detail, please?
 



 --
 Harsh J