Re: Remote Job Submission

2008-05-23 Thread Michael Bieniosek
You could set up an RPC server on a machine that does have Hadoop installed.
Your clients could then submit RPC requests to this machine, and your RPC
server would resubmit the job to Hadoop.

-Michael
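Michael's gateway idea amounts to a small service running on the Hadoop-equipped machine that accepts a request and shells out to the local Hadoop client. A minimal sketch of the command-building piece, in Java (the class name, jar paths, and argument names here are hypothetical illustrations, not from the thread):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Sketch of a job-submission gateway: clients send the gateway a job jar
 * and arguments, and the gateway resubmits them via its local Hadoop install.
 */
public class JobGateway {

    /** Build the command line that resubmits a client-supplied job jar. */
    static List<String> buildSubmitCommand(String jarPath, String mainClass, String... args) {
        List<String> cmd = new ArrayList<String>(
                Arrays.asList("bin/hadoop", "jar", jarPath, mainClass));
        cmd.addAll(Arrays.asList(args));
        return cmd;
    }

    public static void main(String[] argv) throws Exception {
        List<String> cmd = buildSubmitCommand(
                "/incoming/wordcount.jar", "WordCount",
                "/user/senthil/in", "/user/senthil/out");
        // On the gateway host, this command would be executed with:
        //   new ProcessBuilder(cmd).inheritIO().start().waitFor();
        System.out.println(cmd);
    }
}
```

The RPC transport itself (HTTP, raw sockets, or Hadoop's own IPC) is left open here; the point is that only the gateway host needs the Hadoop install and cluster visibility.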

On 5/23/08 2:10 PM, "Natarajan, Senthil" <[EMAIL PROTECTED]> wrote:

> The client machine doesn't have Hadoop installed, and it is not a slave
> machine.
> From the client machine, the data and task nodes are not visible.
>
> In this scenario, how can we load data into HDFS and submit a MapReduce job
> from the client? Is it possible?
>
> If not, what is the minimal setup needed so that data and jobs can be
> submitted remotely from the client machine?
>
> Thanks,
> Senthil
>
> -Original Message-
> From: Ted Dunning [mailto:[EMAIL PROTECTED]
> Sent: Friday, May 23, 2008 4:52 PM
> To: core-user@hadoop.apache.org; '[EMAIL PROTECTED]'
> Subject: Re: Remote Job Submission
>
>
> Both are possible.  You may have to have access to the data and task nodes
> for some operations.  If you can see all of the nodes in your cluster, you
> should be able to do everything.
>
>
> On 5/23/08 1:46 PM, "Natarajan, Senthil" <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>> I was wondering whether it is possible to submit a MapReduce job to a remote
>> Hadoop cluster.
>>
>> That is, submitting the job from a machine that doesn't have Hadoop installed
>> to a different machine where Hadoop is installed.
>> Is it possible to do this?
>>
>> I guess at least data can be uploaded to HDFS remotely through a Java program,
>> right?
>>
>> Thanks,
>> Senthil
>



Re: Remote Job Submission

2008-05-23 Thread Doug Cutting

Ted Dunning wrote:

- in order to submit the job, I think you only need to see the job-tracker.
Somebody should correct me if I am wrong.


No, you also need to be able to write the job.xml, job.jar, and 
job.split into HDFS.  Someday perhaps we'll pass these via RPC to the 
jobtracker and have it store them in HDFS, but currently JobClient 
assumes that the submitter has access to HDFS.


Doug


Re: Remote Job Submission

2008-05-23 Thread Ted Dunning

To write the data, you have a few choices:

- put some kind of proxy in place that can see the cluster and write to it
using DAV or HTTP POST.  It would then do a normal HDFS write.  There was a
DAV client for HDFS at one time.

- make the cluster visible and install the hadoop.jar and configuration file
in order to be able to do the write.

- in order to submit the job, I think you only need to see the job-tracker.
Somebody should correct me if I am wrong.  If I am right, then you just need
the hadoop jar and configuration in order to submit.  Proxying the
submission is probably more complex than proxying the file writing.
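Ted's second option, a client-side hadoop.jar plus configuration, boils down to pointing the client's configuration file at the cluster's NameNode and JobTracker. A sketch of such a hadoop-site.xml, assuming the configuration keys of that era and placeholder hostnames and ports:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- The HDFS NameNode the client should talk to (placeholder host/port). -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
  <!-- The JobTracker that job submissions should go to (placeholder host/port). -->
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:9001</value>
  </property>
</configuration>
```

With this in place, the client still needs network visibility to the NameNode, DataNodes, and JobTracker, as noted above and in Doug's reply.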

On 5/23/08 2:10 PM, "Natarajan, Senthil" <[EMAIL PROTECTED]> wrote:

> The client machine doesn't have Hadoop installed, and it is not a slave
> machine.
> From the client machine, the data and task nodes are not visible.
>
> In this scenario, how can we load data into HDFS and submit a MapReduce job
> from the client? Is it possible?
>
> If not, what is the minimal setup needed so that data and jobs can be
> submitted remotely from the client machine?
> 
> Thanks,
> Senthil
> 
> -Original Message-
> From: Ted Dunning [mailto:[EMAIL PROTECTED]
> Sent: Friday, May 23, 2008 4:52 PM
> To: core-user@hadoop.apache.org; '[EMAIL PROTECTED]'
> Subject: Re: Remote Job Submission
> 
> 
> Both are possible.  You may have to have access to the data and task nodes
> for some operations.  If you can see all of the nodes in your cluster, you
> should be able to do everything.
> 
> 
> On 5/23/08 1:46 PM, "Natarajan, Senthil" <[EMAIL PROTECTED]> wrote:
> 
>> Hi,
>> I was wondering whether it is possible to submit a MapReduce job to a remote
>> Hadoop cluster.
>>
>> That is, submitting the job from a machine that doesn't have Hadoop installed
>> to a different machine where Hadoop is installed.
>> Is it possible to do this?
>>
>> I guess at least data can be uploaded to HDFS remotely through a Java program,
>> right?
>> 
>> Thanks,
>> Senthil
> 



RE: Remote Job Submission

2008-05-23 Thread Natarajan, Senthil
The client machine doesn't have Hadoop installed, and it is not a slave machine.
From the client machine, the data and task nodes are not visible.

In this scenario, how can we load data into HDFS and submit a MapReduce job from
the client?
Is it possible?

If not, what is the minimal setup needed so that data and jobs can be
submitted remotely from the client machine?

Thanks,
Senthil

-Original Message-
From: Ted Dunning [mailto:[EMAIL PROTECTED]
Sent: Friday, May 23, 2008 4:52 PM
To: core-user@hadoop.apache.org; '[EMAIL PROTECTED]'
Subject: Re: Remote Job Submission


Both are possible.  You may have to have access to the data and task nodes
for some operations.  If you can see all of the nodes in your cluster, you
should be able to do everything.


On 5/23/08 1:46 PM, "Natarajan, Senthil" <[EMAIL PROTECTED]> wrote:

> Hi,
> I was wondering whether it is possible to submit a MapReduce job to a remote
> Hadoop cluster.
>
> That is, submitting the job from a machine that doesn't have Hadoop installed
> to a different machine where Hadoop is installed.
> Is it possible to do this?
>
> I guess at least data can be uploaded to HDFS remotely through a Java program,
> right?
>
> Thanks,
> Senthil



Re: Remote Job Submission

2008-05-23 Thread Ted Dunning

Both are possible.  You may have to have access to the data and task nodes
for some operations.  If you can see all of the nodes in your cluster, you
should be able to do everything.


On 5/23/08 1:46 PM, "Natarajan, Senthil" <[EMAIL PROTECTED]> wrote:

> Hi,
> I was wondering whether it is possible to submit a MapReduce job to a remote
> Hadoop cluster.
>
> That is, submitting the job from a machine that doesn't have Hadoop installed
> to a different machine where Hadoop is installed.
> Is it possible to do this?
>
> I guess at least data can be uploaded to HDFS remotely through a Java program,
> right?
> 
> Thanks,
> Senthil



Remote Job Submission

2008-05-23 Thread Natarajan, Senthil
Hi,
I was wondering whether it is possible to submit a MapReduce job to a remote
Hadoop cluster.

That is, submitting the job from a machine that doesn't have Hadoop installed
to a different machine where Hadoop is installed.
Is it possible to do this?

I guess at least data can be uploaded to HDFS remotely through a Java program,
right?

Thanks,
Senthil
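As the replies above confirm, remote uploads work as long as the client can reach the NameNode and DataNodes. A minimal sketch of such an upload using the Hadoop FileSystem API of that era (hostnames, ports, and file paths below are placeholders, not from the thread; the client needs the Hadoop jar on its classpath):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RemoteHdfsUpload {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the remote NameNode (placeholder host/port).
        conf.set("fs.default.name", "hdfs://namenode.example.com:9000");
        FileSystem fs = FileSystem.get(conf);
        // Copy a local file into HDFS; the client streams blocks
        // directly to the DataNodes, which is why they must be reachable.
        fs.copyFromLocalFile(new Path("/tmp/input.txt"),
                             new Path("/user/senthil/input.txt"));
        fs.close();
    }
}
```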