Re: Documentation Link is broken on

2013-02-13 Thread Mayank Bansal
It's still not working for some docs, like 0.20.X.

Thanks,
Mayank

On Wed, Feb 13, 2013 at 2:01 PM, Harsh J  wrote:
> We're back to serving docs now I think. Thanks Doug!
>
> On Thu, Feb 14, 2013 at 1:07 AM, Harsh J  wrote:
>> I see some change work being done, and docs should appear again soon
>> enough: http://svn.apache.org/viewvc/hadoop/common/site/common/publish/docs/.
>> I think we're moving the docs to use a different publishing system - I
>> notice Doug is on this, but I don't have more info. I'll check back in an
>> hour if it is still not live by then.
>>
>> On Thu, Feb 14, 2013 at 1:01 AM, Harsh J  wrote:
>>> Looking at the host publish page, I notice docs/ has gone empty; is
>>> that normal? I don't see a relevant recent commit that would have made
>>> such a change, though.
>>>
>>> :/www/hadoop.apache.org/docs$ ls -l
>>> total 0
>>> :/www/hadoop.apache.org/docs$ svn log | head
>>> 
>>> r1382982 | cutting | 2012-09-10 16:58:57 + (Mon, 10 Sep 2012) | 1 line
>>>
>>> HADOOP-8662. Consolidate former subproject websites into a single site.
>>> 
>>> r1134994 | todd | 2011-06-12 22:00:51 + (Sun, 12 Jun 2011) | 2 lines
>>>
>>> HADOOP-7106. Reorganize SVN layout to combine HDFS, Common, and MR in
>>> a single tree (project unsplit)
>>>
>>> ==
>>>
>>> The most recent change I notice was made on the 7th, the release push of
>>> 0.23.6, but this problem happened today, and that release's diff has no
>>> indication of any deletes either, so it is not that. An INFRA issue?
>>>
>>> On Thu, Feb 14, 2013 at 12:43 AM, Mayank Bansal  wrote:
 Hi guys,

 All the documentation links are broken on Apache:
  http://hadoop.apache.org/docs/r0.20.2/

 Does anybody know how to fix this?

 Thanks,
 Mayank
>>>
>>>
>>>
>>> --
>>> Harsh J
>>
>>
>>
>> --
>> Harsh J
>
>
>
> --
> Harsh J



-- 
Thanks and Regards,
Mayank
Cell: 408-718-9370


Re: Documentation Link is broken on

2013-02-13 Thread Harsh J
We're back to serving docs now I think. Thanks Doug!

On Thu, Feb 14, 2013 at 1:07 AM, Harsh J  wrote:
> I see some change work being done, and docs should appear again soon
> enough: http://svn.apache.org/viewvc/hadoop/common/site/common/publish/docs/.
> I think we're moving the docs to use a different publishing system - I
> notice Doug is on this, but I don't have more info. I'll check back in an
> hour if it is still not live by then.
>
> On Thu, Feb 14, 2013 at 1:01 AM, Harsh J  wrote:
>> Looking at the host publish page, I notice docs/ has gone empty; is
>> that normal? I don't see a relevant recent commit that would have made
>> such a change, though.
>>
>> :/www/hadoop.apache.org/docs$ ls -l
>> total 0
>> :/www/hadoop.apache.org/docs$ svn log | head
>> 
>> r1382982 | cutting | 2012-09-10 16:58:57 + (Mon, 10 Sep 2012) | 1 line
>>
>> HADOOP-8662. Consolidate former subproject websites into a single site.
>> 
>> r1134994 | todd | 2011-06-12 22:00:51 + (Sun, 12 Jun 2011) | 2 lines
>>
>> HADOOP-7106. Reorganize SVN layout to combine HDFS, Common, and MR in
>> a single tree (project unsplit)
>>
>> ==
>>
>> The most recent change I notice was made on the 7th, the release push of
>> 0.23.6, but this problem happened today, and that release's diff has no
>> indication of any deletes either, so it is not that. An INFRA issue?
>>
>> On Thu, Feb 14, 2013 at 12:43 AM, Mayank Bansal  wrote:
>>> Hi guys,
>>>
>>> All the documentation links are broken on Apache:
>>>  http://hadoop.apache.org/docs/r0.20.2/
>>>
>>> Does anybody know how to fix this?
>>>
>>> Thanks,
>>> Mayank
>>
>>
>>
>> --
>> Harsh J
>
>
>
> --
> Harsh J



--
Harsh J


Re: Managing space in Master Node

2013-02-13 Thread Mohammad Tariq
You can adjust the logging level as specified by Charles. But turning logs
off is never a good idea. Logs are really helpful for problem diagnosis,
which you will eventually need.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Thu, Feb 14, 2013 at 1:22 AM, Arko Provo Mukherjee <
arkoprovomukher...@gmail.com> wrote:

> Hi,
>
> Yeah, my NameNode is also doubling as a DataNode.
>
> I would like to "turn off" this feature.
>
> Request help regarding the same.
>
> Thanks & regards
> Arko
>
> On Wed, Feb 13, 2013 at 1:38 PM, Charles Baker  wrote:
> > Hi Arko. Sounds like you may be running a DataNode on the NameNode which
> is
> > not recommended practice. Normally, the only files the NN stores are the
> > image and edits files. It does not store any actual HDFS data. If you
> must
> > run a DN on the NN, try turning down the logging in
> /conf/log4j.properties:
> >
> > #hadoop.root.logger=INFO,console
> > #hadoop.root.logger=WARN,console
> > hadoop.root.logger=ERROR,console
> >
> > Depending on the logging information you require, of course.
> >
> > -Chuck
> >
> >
> > -Original Message-
> > From: Arko Provo Mukherjee [mailto:arkoprovomukher...@gmail.com]
> > Sent: Wednesday, February 13, 2013 11:32 AM
> > To: hdfs-u...@hadoop.apache.org
> > Subject: Managing space in Master Node
> >
> > Hello Gurus,
> >
> > I am managing a Hadoop Cluster to run some experiments.
> >
> > The issue I am continuously facing is that the Master Node runs out of
> disk
> > space due to logs and data files.
> >
> > I can monitor and delete log files. However, I cannot delete the HDFS
> data.
> >
> > Thus, is there a way to force Hadoop not to save any HDFS data in the
> Master
> > Node?
> >
> > Then I can use my master to handle the metadata only and store the logs.
> >
> > Thanks & regards
> > Arko
> >
>


Re: Need Information on Hadoop Cluster Set up

2013-02-13 Thread Yusaku Sako
Hello Seema,

Yes, you can use Apache Ambari to set up and manage a single node cluster.

Yusaku

On Wed, Feb 13, 2013 at 11:48 AM, Hadoop  wrote:

> Hi All, Good to see this thread.
>
> Is this for a single node cluster setup?
>
> I need help with a single node cluster setup. Would really appreciate
> assistance with this.
>
> Thanks,
>
> Seema
>
> On Feb 13, 2013, at 3:19 AM, Yusaku Sako  wrote:
>
> Hi Sandeep,
>
> You may also want to take a look at Apache Ambari for installing,
> managing, and monitoring a Hadoop cluster.
>
> http://incubator.apache.org/ambari/index.html
>
> Yusaku
>
> On Wed, Feb 13, 2013 at 12:20 AM, Alexander Alten-Lorenz <
> wget.n...@gmail.com> wrote:
>
>> Hi,
>>
>>
>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
>> http://mapredit.blogspot.de/p/get-hadoop-cluster-running-in-20.html
>>
>> Just two of many blogs that describe a setup.
>>
>> Hardware:
>> http://www.youtube.com/watch?v=UQJnJvwcsA8
>>
>> http://books.google.de/books?id=Nff49D7vnJcC&pg=PA259&lpg=PA259&dq=hadoop+commodity+hardware+specs&source=bl&ots=IihqWp8zRq&sig=Dse6D7KO8XS5EcXQnCnShAl5Q70&hl=en&sa=X&ei=0UwbUajSFozJswaXxoH4Aw&ved=0CD4Q6AEwAg#v=onepage&q=hadoop%20commodity%20hardware%20specs&f=false
>>
>> - Alex
>>
>> On Feb 13, 2013, at 8:00 AM, Sandeep Jain 
>> wrote:
>>
>> > Dear Team,
>> >
>> > We are in the initial phase of Hadoop learning and wanted to set up
>> > a cluster from a Hadoop administration perspective.
>> > Kindly help us with all the possible options for getting a Hadoop
>> > cluster up and running.
>> > Do let us know the minimum configurations of the machines too.
>> >
>> > Your help is much appreciated.
>> >
>> > Regards,
>> > Sandeep
>> >
>>
>> --
>> Alexander Alten-Lorenz
>> http://mapredit.blogspot.com
>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>
>>
>


Re: Need Information on Hadoop Cluster Set up

2013-02-13 Thread Mohammad Tariq
Hello Seema,

These links cover all kinds of setups. If you still face any
problem, you can go here for a single node setup.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Thu, Feb 14, 2013 at 1:18 AM, Hadoop  wrote:

> Hi All, Good to see this thread.
>
> Is this for a single node cluster setup?
>
> I need help with a single node cluster setup. Would really appreciate
> assistance with this.
>
> Thanks,
>
> Seema
>
> On Feb 13, 2013, at 3:19 AM, Yusaku Sako  wrote:
>
> Hi Sandeep,
>
> You may also want to take a look at Apache Ambari for installing,
> managing, and monitoring a Hadoop cluster.
>
> http://incubator.apache.org/ambari/index.html
>
> Yusaku
>
> On Wed, Feb 13, 2013 at 12:20 AM, Alexander Alten-Lorenz <
> wget.n...@gmail.com> wrote:
>
>> Hi,
>>
>>
>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
>> http://mapredit.blogspot.de/p/get-hadoop-cluster-running-in-20.html
>>
>> Just two of many blogs that describe a setup.
>>
>> Hardware:
>> http://www.youtube.com/watch?v=UQJnJvwcsA8
>>
>> http://books.google.de/books?id=Nff49D7vnJcC&pg=PA259&lpg=PA259&dq=hadoop+commodity+hardware+specs&source=bl&ots=IihqWp8zRq&sig=Dse6D7KO8XS5EcXQnCnShAl5Q70&hl=en&sa=X&ei=0UwbUajSFozJswaXxoH4Aw&ved=0CD4Q6AEwAg#v=onepage&q=hadoop%20commodity%20hardware%20specs&f=false
>>
>> - Alex
>>
>> On Feb 13, 2013, at 8:00 AM, Sandeep Jain 
>> wrote:
>>
>> > Dear Team,
>> >
>> > We are in the initial phase of Hadoop learning and wanted to set up
>> > a cluster from a Hadoop administration perspective.
>> > Kindly help us with all the possible options for getting a Hadoop
>> > cluster up and running.
>> > Do let us know the minimum configurations of the machines too.
>> >
>> > Your help is much appreciated.
>> >
>> > Regards,
>> > Sandeep
>> >
>>
>> --
>> Alexander Alten-Lorenz
>> http://mapredit.blogspot.com
>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>
>>
>


Re: Managing space in Master Node

2013-02-13 Thread Arko Provo Mukherjee
Hi,

Yeah, my NameNode is also doubling as a DataNode.

I would like to "turn off" this feature.

Request help regarding the same.

Thanks & regards
Arko

On Wed, Feb 13, 2013 at 1:38 PM, Charles Baker  wrote:
> Hi Arko. Sounds like you may be running a DataNode on the NameNode which is
> not recommended practice. Normally, the only files the NN stores are the
> image and edits files. It does not store any actual HDFS data. If you must
> run a DN on the NN, try turning down the logging in /conf/log4j.properties:
>
> #hadoop.root.logger=INFO,console
> #hadoop.root.logger=WARN,console
> hadoop.root.logger=ERROR,console
>
> Depending on the logging information you require, of course.
>
> -Chuck
>
>
> -Original Message-
> From: Arko Provo Mukherjee [mailto:arkoprovomukher...@gmail.com]
> Sent: Wednesday, February 13, 2013 11:32 AM
> To: hdfs-u...@hadoop.apache.org
> Subject: Managing space in Master Node
>
> Hello Gurus,
>
> I am managing a Hadoop Cluster to run some experiments.
>
> The issue I am continuously facing is that the Master Node runs out of disk
> space due to logs and data files.
>
> I can monitor and delete log files. However, I cannot delete the HDFS data.
>
> Thus, is there a way to force Hadoop not to save any HDFS data in the Master
> Node?
>
> Then I can use my master to handle the metadata only and store the logs.
>
> Thanks & regards
> Arko
>


Re: Need Information on Hadoop Cluster Set up

2013-02-13 Thread Hadoop
Hi All, Good to see this thread.

Is this for a single node cluster setup?

I need help with a single node cluster setup. Would really appreciate
assistance with this.

Thanks,

Seema

On Feb 13, 2013, at 3:19 AM, Yusaku Sako  wrote:

> Hi Sandeep, 
> 
> You may also want to take a look at Apache Ambari for installing, managing, 
> and monitoring a Hadoop cluster.
> 
> http://incubator.apache.org/ambari/index.html
> 
> Yusaku
> 
> On Wed, Feb 13, 2013 at 12:20 AM, Alexander Alten-Lorenz 
>  wrote:
>> Hi,
>> 
>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
>> http://mapredit.blogspot.de/p/get-hadoop-cluster-running-in-20.html
>> 
>> Just two of many blogs that describe a setup.
>> 
>> Hardware:
>> http://www.youtube.com/watch?v=UQJnJvwcsA8
>> http://books.google.de/books?id=Nff49D7vnJcC&pg=PA259&lpg=PA259&dq=hadoop+commodity+hardware+specs&source=bl&ots=IihqWp8zRq&sig=Dse6D7KO8XS5EcXQnCnShAl5Q70&hl=en&sa=X&ei=0UwbUajSFozJswaXxoH4Aw&ved=0CD4Q6AEwAg#v=onepage&q=hadoop%20commodity%20hardware%20specs&f=false
>> 
>> - Alex
>> 
>> On Feb 13, 2013, at 8:00 AM, Sandeep Jain  wrote:
>> 
>> > Dear Team,
>> >
>> > We are in the initial phase of Hadoop learning and wanted to set up a
>> > cluster from a Hadoop administration perspective.
>> > Kindly help us with all the possible options for getting a Hadoop cluster
>> > up and running.
>> > Do let us know the minimum configurations of the machines too.
>> >
>> > Your help is much appreciated.
>> >
>> > Regards,
>> > Sandeep
>> >
>> 
>> --
>> Alexander Alten-Lorenz
>> http://mapredit.blogspot.com
>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
> 


Re: Installing Hadoop on RHEL 6.2

2013-02-13 Thread Giridharan Kesavan
I think you should give this link a try: "let me google that for you" :)

-Giri


On Wed, Feb 13, 2013 at 11:00 AM, Shah, Rahul1 wrote:

>  Hi,
>
> Can someone help me with the installation of Hadoop on a cluster with RHEL
> 6.2 OS? Any link or steps will be appreciated to get me going. Thanks
>
> -Rahul
>


RE: Managing space in Master Node

2013-02-13 Thread Charles Baker
Hi Arko. Sounds like you may be running a DataNode on the NameNode which is
not recommended practice. Normally, the only files the NN stores are the
image and edits files. It does not store any actual HDFS data. If you must
run a DN on the NN, try turning down the logging in /conf/log4j.properties:

#hadoop.root.logger=INFO,console
#hadoop.root.logger=WARN,console
hadoop.root.logger=ERROR,console

Depending on the logging information you require, of course.
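
If you would rather not run the DN on the NN at all, one approach (a sketch,
assuming the stock 1.x scripts) is to remove the master's hostname from
conf/slaves so that start-dfs.sh no longer starts a DataNode there, and then
stop the daemon that is already running on the master:

  $HADOOP_HOME/bin/hadoop-daemon.sh stop datanode

HDFS will re-replicate the blocks that DataNode held, so make sure the
remaining nodes have capacity for them.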

-Chuck


-Original Message-
From: Arko Provo Mukherjee [mailto:arkoprovomukher...@gmail.com] 
Sent: Wednesday, February 13, 2013 11:32 AM
To: hdfs-u...@hadoop.apache.org
Subject: Managing space in Master Node

Hello Gurus,

I am managing a Hadoop Cluster to run some experiments.

The issue I am continuously facing is that the Master Node runs out of disk
space due to logs and data files.

I can monitor and delete log files. However, I cannot delete the HDFS data.

Thus, is there a way to force Hadoop not to save any HDFS data in the Master
Node?

Then I can use my master to handle the metadata only and store the logs.

Thanks & regards
Arko



Re: Managing space in Master Node

2013-02-13 Thread Mohammad Tariq
Hello Arko,

Add the dfs.data.dir property in your hdfs-site.xml file and point it
to some other location.

For logs, do the same thing by modifying the following line in the
hadoop-env.sh file:

# Where log files are stored.  $HADOOP_HOME/logs by default.
export HADOOP_LOG_DIR=/hadoop/hdfs/logs
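
For example, a minimal hdfs-site.xml entry might look like this (the path is
illustrative):

<property>
  <name>dfs.data.dir</name>
  <value>/data/hdfs/data</value>
</property>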

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Thu, Feb 14, 2013 at 1:02 AM, Arko Provo Mukherjee <
arkoprovomukher...@gmail.com> wrote:

> Hello Gurus,
>
> I am managing a Hadoop Cluster to run some experiments.
>
> The issue I am continuously facing is that the Master Node runs out of
> disk space due to logs and data files.
>
> I can monitor and delete log files. However, I cannot delete the HDFS data.
>
> Thus, is there a way to force Hadoop not to save any HDFS data in the
> Master Node?
>
> Then I can use my master to handle the metadata only and store the logs.
>
> Thanks & regards
> Arko
>


Re: Documentation Link is broken on

2013-02-13 Thread Harsh J
I see some change work being done, and docs should appear again soon
enough: http://svn.apache.org/viewvc/hadoop/common/site/common/publish/docs/.
I think we're moving the docs to use a different publishing system - I
notice Doug is on this, but I don't have more info. I'll check back in an
hour if it is still not live by then.

On Thu, Feb 14, 2013 at 1:01 AM, Harsh J  wrote:
> Looking at the host publish page, I notice docs/ has gone empty; is
> that normal? I don't see a relevant recent commit that would have made
> such a change, though.
>
> :/www/hadoop.apache.org/docs$ ls -l
> total 0
> :/www/hadoop.apache.org/docs$ svn log | head
> 
> r1382982 | cutting | 2012-09-10 16:58:57 + (Mon, 10 Sep 2012) | 1 line
>
> HADOOP-8662. Consolidate former subproject websites into a single site.
> 
> r1134994 | todd | 2011-06-12 22:00:51 + (Sun, 12 Jun 2011) | 2 lines
>
> HADOOP-7106. Reorganize SVN layout to combine HDFS, Common, and MR in
> a single tree (project unsplit)
>
> ==
>
> The most recent change I notice was made on the 7th, the release push of
> 0.23.6, but this problem happened today, and that release's diff has no
> indication of any deletes either, so it is not that. An INFRA issue?
>
> On Thu, Feb 14, 2013 at 12:43 AM, Mayank Bansal  wrote:
>> Hi guys,
>>
>> All the documentation links are broken on Apache:
>>  http://hadoop.apache.org/docs/r0.20.2/
>>
>> Does anybody know how to fix this?
>>
>> Thanks,
>> Mayank
>
>
>
> --
> Harsh J



--
Harsh J


Re: Documentation Link is broken on

2013-02-13 Thread Harsh J
Looking at the host publish page, I notice docs/ has gone empty; is
that normal? I don't see a relevant recent commit that would have made
such a change, though.

:/www/hadoop.apache.org/docs$ ls -l
total 0
:/www/hadoop.apache.org/docs$ svn log | head

r1382982 | cutting | 2012-09-10 16:58:57 + (Mon, 10 Sep 2012) | 1 line

HADOOP-8662. Consolidate former subproject websites into a single site.

r1134994 | todd | 2011-06-12 22:00:51 + (Sun, 12 Jun 2011) | 2 lines

HADOOP-7106. Reorganize SVN layout to combine HDFS, Common, and MR in
a single tree (project unsplit)

==

The most recent change I notice was made on the 7th, the release push of
0.23.6, but this problem happened today, and that release's diff has no
indication of any deletes either, so it is not that. An INFRA issue?

On Thu, Feb 14, 2013 at 12:43 AM, Mayank Bansal  wrote:
> Hi guys,
>
> All the documentation links are broken on Apache:
>  http://hadoop.apache.org/docs/r0.20.2/
>
> Does anybody know how to fix this?
>
> Thanks,
> Mayank



--
Harsh J


Re: Installing Hadoop on RHEL 6.2

2013-02-13 Thread Mohammad Tariq
You can find all the steps here.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Thu, Feb 14, 2013 at 12:47 AM, Shah, Rahul1 wrote:

>  I have the distribution with me so I guess I can bypass the downloading
> step. I have 10 systems on which I have to install Hadoop and get it
> running.
>
> *From:* Mohammad Tariq [mailto:donta...@gmail.com]
> *Sent:* Wednesday, February 13, 2013 12:11 PM
>
> *To:* user@hadoop.apache.org
> *Subject:* Re: Installing Hadoop on RHEL 6.2
>
> Hello Rahul,
>
> The process to configure Apache's distribution is not much different
> from the configuration on Ubuntu, for which you can go here.
> Alternatively, you can use the RPMs provided by Cloudera.
>
> Warm Regards,
>
> Tariq
>
> https://mtariq.jux.com/
>
> cloudfront.blogspot.com
>
> On Thu, Feb 14, 2013 at 12:30 AM, Shah, Rahul1 
> wrote:
>
> Hi,
>
> Can someone help me with the installation of Hadoop on a cluster with RHEL
> 6.2 OS? Any link or steps will be appreciated to get me going. Thanks
>
> -Rahul
>


Re: mapreduce.map.env

2013-02-13 Thread Harsh J
I was incorrect here: MR2 does support this; I failed to look for the
right constant reference and there were two.
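
For example, assuming the job driver goes through ToolRunner, it could be set
per job like this (jar name and value are illustrative; on 1.x the property
would be mapred.map.child.env instead):

  hadoop jar myjob.jar MyDriver \
      -Dmapreduce.map.env=LD_LIBRARY_PATH=/opt/native/lib \
      input output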

On Wed, Feb 13, 2013 at 11:32 PM, Harsh J  wrote:
> What version are you specifically asking about?
>
> MR2 (2.x) does not appear to use this anymore (a regression? not sure
> what the replacement is; will have to figure out), but 1.x has
> mapred.map/reduce.child.env, which you can use to the same effect.
>
> On Wed, Feb 13, 2013 at 11:05 PM, Saptarshi Guha
>  wrote:
>> Hello,
>>
>> Is this still valid? Can I set
>>
>> mapreduce.{map|reduce}.env
>>
>> When I google, I get links
>>
>> http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CDIQFjAA&url=http%3A%2F%2Fhadoop.apache.org%2Fdocs%2Fcurrent%2Fhadoop-mapreduce-client%2Fhadoop-mapreduce-client-core%2Fmapred-default.xml&ei=oc4bUa-GDom4igK974DwDw&usg=AFQjCNEnb_-tEiCRsOndsyuGLqtWOJ8OPw&sig2=JVUiA78CsXC7aIdo1X09Lw&bvm=bv.42261806,d.cGE&cad=rja
>>
>> and
>>
>> http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CDkQFjAB&url=http%3A%2F%2Fhadoop.apache.org%2Fdocs%2Fmapreduce%2Fr0.21.0%2Fapi%2Fconstant-values.html&ei=oc4bUa-GDom4igK974DwDw&usg=AFQjCNFmbVNRmo70tLRRH-m9iwZUhnWDJQ&sig2=7r4odDgIBIpN92JYDfmjew&bvm=bv.42261806,d.cGE&cad=rja
>>
>>
>> none of which work.
>
>
>
> --
> Harsh J



--
Harsh J


RE: Installing Hadoop on RHEL 6.2

2013-02-13 Thread Shah, Rahul1
I have the distribution with me so I guess I can bypass the downloading step. I 
have 10 systems on which I have to install Hadoop and get it running.

From: Mohammad Tariq [mailto:donta...@gmail.com]
Sent: Wednesday, February 13, 2013 12:11 PM
To: user@hadoop.apache.org
Subject: Re: Installing Hadoop on RHEL 6.2

Hello Rahul,

The process to configure Apache's distribution is not much different from
the configuration on Ubuntu, for which you can go here.
Alternatively, you can use the RPMs provided by Cloudera.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com

On Thu, Feb 14, 2013 at 12:30 AM, Shah, Rahul1 
mailto:rahul1.s...@intel.com>> wrote:
Hi,

Can someone help me with the installation of Hadoop on a cluster with RHEL
6.2 OS? Any link or steps will be appreciated to get me going. Thanks

-Rahul




Re: Installing Hadoop on RHEL 6.2

2013-02-13 Thread Mohammad Tariq
What clusters??

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Thu, Feb 14, 2013 at 12:41 AM, Mohammad Tariq  wrote:

> Hello Rahul,
>
> The process to configure Apache's distribution is not much different
> from the configuration on Ubuntu, for which you can go here.
> Alternatively, you can use the RPMs provided by Cloudera.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Thu, Feb 14, 2013 at 12:30 AM, Shah, Rahul1 wrote:
>
>>  Hi,
>>
>> Can someone help me with the installation of Hadoop on a cluster with RHEL
>> 6.2 OS? Any link or steps will be appreciated to get me going. Thanks
>>
>> -Rahul
>>
>
>


Documentation Link is broken on

2013-02-13 Thread Mayank Bansal
Hi guys,

All the documentation links are broken on Apache:
 http://hadoop.apache.org/docs/r0.20.2/

Does anybody know how to fix this?

Thanks,
Mayank


Re: Installing Hadoop on RHEL 6.2

2013-02-13 Thread Hitesh Shah
You could try using Ambari. 

http://incubator.apache.org/ambari/
http://incubator.apache.org/ambari/1.2.0/installing-hadoop-using-ambari/content/index.html

-- Hitesh

On Feb 13, 2013, at 11:00 AM, Shah, Rahul1 wrote:

> Hi,
>
> Can someone help me with the installation of Hadoop on a cluster with RHEL
> 6.2 OS? Any link or steps will be appreciated to get me going. Thanks
>
> -Rahul
>



Re: Installing Hadoop on RHEL 6.2

2013-02-13 Thread Mohammad Tariq
Hello Rahul,

The process to configure Apache's distribution is not much different from
the configuration on Ubuntu, for which you can go here.
Alternatively, you can use the RPMs provided by Cloudera.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Thu, Feb 14, 2013 at 12:30 AM, Shah, Rahul1 wrote:

>  Hi,
>
> Can someone help me with the installation of Hadoop on a cluster with RHEL
> 6.2 OS? Any link or steps will be appreciated to get me going. Thanks
>
> -Rahul
>


RE: Installing Hadoop on RHEL 6.2

2013-02-13 Thread Shah, Rahul1
I have these clusters set up with RHEL 6.2, so I guess it's called fully
distributed mode.

From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
Sent: Wednesday, February 13, 2013 12:07 PM
To: user@hadoop.apache.org
Subject: Re: Installing Hadoop on RHEL 6.2

are you installing fully distributed mode or pseudo distributed mode?

On Thu, Feb 14, 2013 at 12:30 AM, Shah, Rahul1 
mailto:rahul1.s...@intel.com>> wrote:
Hi,

Can someone help me with the installation of Hadoop on a cluster with RHEL
6.2 OS? Any link or steps will be appreciated to get me going. Thanks

-Rahul




--
Nitin Pawar


Re: Installing Hadoop on RHEL 6.2

2013-02-13 Thread Nitin Pawar
are you installing fully distributed mode or pseudo distributed mode?


On Thu, Feb 14, 2013 at 12:30 AM, Shah, Rahul1 wrote:

>  Hi,
>
> Can someone help me with the installation of Hadoop on a cluster with RHEL
> 6.2 OS? Any link or steps will be appreciated to get me going. Thanks
>
> -Rahul
>



-- 
Nitin Pawar


Installing Hadoop on RHEL 6.2

2013-02-13 Thread Shah, Rahul1
Hi,

Can someone help me with the installation of Hadoop on a cluster with RHEL
6.2 OS? Any link or steps will be appreciated to get me going. Thanks

-Rahul



r0.20.2 documentation gone?

2013-02-13 Thread Jean-Marc Spaggiari
Hi,

http://hadoop.apache.org/docs/r0.20.2/hdfs-default.html is not working
anymore. Only the 0.19 pages are still there. Is there any reason? Where
can we find the documentation for 0.20.2?

Thanks,

JM


Re: Java submit job to remote server

2013-02-13 Thread Alex Thieme
It appears this is the full extent of the stack trace. Anything prior to the
org.apache.hadoop calls is from my container, from which hadoop is called.

Caused by: java.io.IOException: Call to /127.0.0.1:9001 failed on local 
exception: java.io.EOFException
at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
at org.apache.hadoop.ipc.Client.call(Client.java:743)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
at org.apache.hadoop.mapred.$Proxy55.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:429)
at org.apache.hadoop.mapred.JobClient.init(JobClient.java:423)
at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:410)
at org.apache.hadoop.mapreduce.Job.<init>(Job.java:50)
at 
com.allenabi.sherlock.graph.OfflineDataTool.run(OfflineDataTool.java:25)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at 
com.allenabi.sherlock.graph.OfflineDataComponent.submitJob(OfflineDataComponent.java:67)
... 64 more
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at 
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)

Alex Thieme
athi...@athieme.com
508-361-2788

On Feb 12, 2013, at 8:16 PM, Hemanth Yamijala  wrote:

> Can you please include the complete stack trace and not just the root cause?
> Also, have you set fs.default.name to an HDFS location like hdfs://localhost:9000?
> 
> Thanks
> Hemanth
> 
> On Wednesday, February 13, 2013, Alex Thieme wrote:
> Thanks for the prompt reply and I'm sorry I forgot to include the exception. 
> My bad. I've included it below. There certainly appears to be a server 
> running on localhost:9001. At least, I was able to telnet to that address. 
> While in development, I'm treating the server on localhost as the remote 
> server. Moving to production, there'd obviously be a different remote server 
> address configured.
> 
> Root Exception stack trace:
> java.io.EOFException
>   at java.io.DataInputStream.readInt(DataInputStream.java:375)
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
> + 3 more (set debug level logging or '-Dmule.verbose.exceptions=true' for 
> everything)
> 
> 
> On Feb 12, 2013, at 4:22 PM, Nitin Pawar  wrote:
> 
>> conf.set("mapred.job.tracker", "localhost:9001");
>> 
>> This means that your jobtracker is on port 9001 on localhost.
>> 
>> If you change it to the remote host, and that's the port it's running on,
>> then it should work as expected.
>> 
>> What's the exception you are getting?
>> 
>> 
>> On Wed, Feb 13, 2013 at 2:41 AM, Alex Thieme  wrote:
>> I apologize for asking what seems to be such a basic question, but I could
>> use some help with submitting a job to a remote server.
>> 
>> I have downloaded and installed hadoop locally in pseudo-distributed mode. I 
>> have written some Java code to submit a job. 
>> 
>> Here's the org.apache.hadoop.util.Tool and 
>> org.apache.hadoop.mapreduce.Mapper I've written.
>> 
>> If I enable the conf.set("mapred.job.tracker", "localhost:9001") line, then 
>> I get the exception included below.
>> 
>> If that line is disabled, then the job is completed. However, in reviewing 
>> the hadoop server administration page 
>> (http://localhost:50030/jobtracker.jsp) I don't see the job as processed by 
>> the server. Instead, I wonder if my Java code is simply running the 
>> necessary mapper Java code, bypassing the locally installed server.
>> 
>> Thanks in advance.
>> 
>> Alex
>> 
>> public class OfflineDataTool extends Configured implements Tool {
>> 
>> public int run(final String[] args) throws Exception {
>> final Configuration conf = getConf();
>> //conf.set("mapred.job.tracker", "localhost:9001");
>> 
>> final Job job = new Job(conf);
>> job.setJarByClass(getClass());
>> job.setJobName(getClass().getName());
>> 
>> job.setMapperClass(OfflineDataMapper.class);
>> 
>> job.setInputFormatClass(TextInputFormat.class);
>> 
>> job.setMapOutputKeyClass(Text.class);
>> job.setMapOutputValueClass(Text.class);
>> 
>> job.setOutputKeyClass(Text.class);
>> job.setOutputValueClass(Text.class);
>> 
>> FileInputFormat.addInputPath(job, new 
>> org.apache.hadoop.fs.Path(args[0]));
>> 
>> final org.apache.hadoop.fs.Path output = new org.a
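
A run() method like the one quoted above would typically finish by setting
the output path and submitting the job; a minimal sketch of the ending (the
argument index and return convention are assumptions, not the original code):

        // Assumed continuation: write results to the path given as args[1]
        // and block until the job finishes, mapping success to exit code 0.
        final org.apache.hadoop.fs.Path output =
                new org.apache.hadoop.fs.Path(args[1]);
        FileOutputFormat.setOutputPath(job, output);
        return job.waitForCompletion(true) ? 0 : 1;
    }
}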



Re: mapreduce.map.env

2013-02-13 Thread Harsh J
What version are you specifically asking about?

MR2 (2.x) does not appear to use this anymore (a regression? not sure
what the replacement is; will have to figure out), but 1.x has
mapred.map/reduce.child.env, which you can use to the same effect.

On Wed, Feb 13, 2013 at 11:05 PM, Saptarshi Guha
 wrote:
> Hello,
>
> Is this still valid? Can I set
>
> mapreduce.{map|reduce}.env
>
> When I google, I get links
>
> http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CDIQFjAA&url=http%3A%2F%2Fhadoop.apache.org%2Fdocs%2Fcurrent%2Fhadoop-mapreduce-client%2Fhadoop-mapreduce-client-core%2Fmapred-default.xml&ei=oc4bUa-GDom4igK974DwDw&usg=AFQjCNEnb_-tEiCRsOndsyuGLqtWOJ8OPw&sig2=JVUiA78CsXC7aIdo1X09Lw&bvm=bv.42261806,d.cGE&cad=rja
>
> and
>
> http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CDkQFjAB&url=http%3A%2F%2Fhadoop.apache.org%2Fdocs%2Fmapreduce%2Fr0.21.0%2Fapi%2Fconstant-values.html&ei=oc4bUa-GDom4igK974DwDw&usg=AFQjCNFmbVNRmo70tLRRH-m9iwZUhnWDJQ&sig2=7r4odDgIBIpN92JYDfmjew&bvm=bv.42261806,d.cGE&cad=rja
>
>
> none of which work.



--
Harsh J


Re: What resources are used by idle NameNode and JobTracker tasks

2013-02-13 Thread Harsh J
The NN has a constant memory use and the CPU usage should also be
patterned in idle mode. The JT usually retires jobs over the days, and
the memory and CPU usage should both go down over time in idle mode,
especially if you indicate a week of idleness. There is, however, the DN
block scanner, which may trigger itself and begin consuming a bit of IO
verifying block data integrity (so that blocks corrupted by bit rot,
etc., on disks get re-replicated); that would probably be the only usage
registered on an idle cluster with no users talking to it.

What exact 'resources' was the complaint about? Having more detail on
that would help us trace or answer better.

On Mon, Feb 11, 2013 at 11:37 PM, Steve Lewis  wrote:
> I have Hadoop running on 8 nodes in a 50 node cluster. The NameNode and
> Jobtrackers have been running for months while jobs were run occasionally -
> usually no more than a dozen a week and more typically none. Recently the
> cluster manager accused Hadoop of consuming a large number of resources in
> the cluster - this occurred during a week in which I was running no jobs.
> While I have now stopped Hadoop, the trackers had been running stably for
> many months.
>   Is there any reason why either of these processes should suddenly, and
> without jobs being run, increase their consumption of resources in a
> serious way?
>
>
> --
> Steven M. Lewis PhD
> 4221 105th Ave NE
> Kirkland, WA 98033
> 206-384-1340 (cell)
> Skype lordjoe_com
>



--
Harsh J


mapreduce.map.env

2013-02-13 Thread Saptarshi Guha
Hello,

Is this still valid? Can I set

mapreduce.{map|reduce}.env

When I google, I get links

http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CDIQFjAA&url=http%3A%2F%2Fhadoop.apache.org%2Fdocs%2Fcurrent%2Fhadoop-mapreduce-client%2Fhadoop-mapreduce-client-core%2Fmapred-default.xml&ei=oc4bUa-GDom4igK974DwDw&usg=AFQjCNEnb_-tEiCRsOndsyuGLqtWOJ8OPw&sig2=JVUiA78CsXC7aIdo1X09Lw&bvm=bv.42261806,d.cGE&cad=rja

and

http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CDkQFjAB&url=http%3A%2F%2Fhadoop.apache.org%2Fdocs%2Fmapreduce%2Fr0.21.0%2Fapi%2Fconstant-values.html&ei=oc4bUa-GDom4igK974DwDw&usg=AFQjCNFmbVNRmo70tLRRH-m9iwZUhnWDJQ&sig2=7r4odDgIBIpN92JYDfmjew&bvm=bv.42261806,d.cGE&cad=rja


none of which work.


Re: configure mapreduce to work with pem files.

2013-02-13 Thread Nitin Pawar
Pedro,

Hadoop does not need SSH for MR itself.

The Hadoop start-up scripts need SSH set up so that you can start all your
services from a single node, and even that is not mandatory. You can log in
to each node and start the individual services that node needs, with the
correct Hadoop configuration.
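
For example, on a 1.x node you could start just the services that node needs
with the stock per-daemon script (paths assume a standard layout):

  $HADOOP_HOME/bin/hadoop-daemon.sh start datanode
  $HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker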


On Wed, Feb 13, 2013 at 7:15 PM, Pedro Sá da Costa wrote:

> So why is it necessary to configure ssh in Hadoop MR?
>
>
> On 13 February 2013 12:58, Harsh J  wrote:
>
>> Hi,
>>
>> Nodes in Hadoop do not communicate using SSH. See
>> http://wiki.apache.org/hadoop/FAQ#Does_Hadoop_require_SSH.3F
>>
>> On Wed, Feb 13, 2013 at 5:16 PM, Pedro Sá da Costa 
>> wrote:
>> > I'm trying to configure ssh for Hadoop mapreduce, but my nodes only
>> > communicate with each other using RSA keys in pem format.
>> >
>> > (It doesn't work)
>> > ssh user@host
>> > Permission denied (publickey).
>> >
>> > (It works)
>> > ssh -i ~/key.pem user@host
>> >
>> > The nodes in mapreduce communicate using ssh. How do I configure ssh,
>> > or mapreduce, to work with the pem file?
>> >
>> >
>> > --
>> > Best regards,
>> > P
>>
>>
>>
>> --
>> Harsh J
>>
>
>
>
> --
> Best regards,
>



-- 
Nitin Pawar


Re: configure mapreduce to work with pem files.

2013-02-13 Thread Pedro Sá da Costa
So why is it necessary to configure ssh in Hadoop MR?

On 13 February 2013 12:58, Harsh J  wrote:

> Hi,
>
> Nodes in Hadoop do not communicate using SSH. See
> http://wiki.apache.org/hadoop/FAQ#Does_Hadoop_require_SSH.3F
>
> On Wed, Feb 13, 2013 at 5:16 PM, Pedro Sá da Costa 
> wrote:
> > I'm trying to configure ssh for Hadoop mapreduce, but my nodes only
> > communicate with each other using RSA keys in pem format.
> >
> > (It doesn't work)
> > ssh user@host
> > Permission denied (publickey).
> >
> > (It works)
> > ssh -i ~/key.pem user@host
> >
> > The nodes in mapreduce communicate using ssh. How do I configure ssh, or
> > mapreduce, to work with the pem file?
> >
> >
> > --
> > Best regards,
> > P
>
>
>
> --
> Harsh J
>



-- 
Best regards,


Re: Child JVM, Distributed Cache and Language Embedding

2013-02-13 Thread David Boyd

Actually, Hadoop places the symlink into the local working directory of
the JVM process. I use this method to push shared objects (CUDA and
OpenCL) to nodes for tasks. There is a section in the Hadoop docs on
shared object loading that should help (although I found I did not need
to, and could not, make the recommended System.loadLibrary call).

You can also bundle the stuff into the JAR file in a subdir, and that
will be unpacked to the local working dir. The nice thing about using
the distributed cache is that the files only need to be pushed to the
cluster once with a copyFromLocal and are then just symlinked at
runtime, so it is much faster.
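
A minimal sketch of that flow (the URI, symlink name, and paths are
illustrative, using the org.apache.hadoop.filecache.DistributedCache API):

// Driver side, before submitting the job: register the already-uploaded
// archive and request a symlink; the '#langdist' fragment names the symlink
// that appears in each task attempt's working directory.
DistributedCache.addCacheArchive(
        new java.net.URI("/cache/langdist.tar.gz#langdist"), conf);
DistributedCache.createSymlink(conf);

// Task side: the unpacked archive is reachable through the symlink, so a
// relative path stays constant for every attempt run by the reused child JVM.
java.io.File langHome = new java.io.File("langdist");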


On 2/13/2013 1:55 AM, Saptarshi Guha wrote:

Hmm,
DistributedCache.getLocalCacheArchives


On Tue, Feb 12, 2013 at 9:28 PM, Saptarshi Guha
mailto:saptarshi.g...@gmail.com>> wrote:

Hello,

> I'm a bit fuzzy on the details here, so I appreciate your help.

I am embedding a language into the JVM. My hadoop job will
instantiate the child JVM once for all tasks assigned
(mapred.job.reuse.jvm.num.tasks = -1)

So if a node can run 6 parallel JVMs, it will and these 6 will churn
through all the tasks assigned to them.

Now, per JVM, the language engine will be instantiated. For this to
work, I will ship the language distribution to the nodes (the nodes
are really bare and installing the language on the node is not an
option) using the distributed cache (as a tar.gz file).

My understanding is that HadoopMapreduce will unarchive this tgz
file and then for every task attempt symlink it into the task
attempt's working folder.

However, for the language engine to be successfully initialized, I
need to know the location of the unarchived file, a location that
will stay constant across all task attempts for that child JVM.

Q: How can i infer this location?

Cheers
Saptarshi




--
= mailto:db...@lorenzresearch.com 
David W. Boyd
Vice President, Operations
Lorenz Research, a Data Tactics corporation
7901 Jones Branch, Suite 610
Mclean, VA 22102
office:   +1-703-506-3735, ext 308
fax: +1-703-506-6703
cell: +1-703-402-7908
== http://www.lorenzresearch.com/ 


The information contained in this message may be privileged
and/or confidential and protected from disclosure.
If the reader of this message is not the intended recipient
or an employee or agent responsible for delivering this message
to the intended recipient, you are hereby notified that any
dissemination, distribution or copying of this communication
is strictly prohibited.  If you have received this communication
in error, please notify the sender immediately by replying to
this message and deleting the material from any computer.




Re: configure mapreduce to work with pem files.

2013-02-13 Thread Harsh J
Hi,

Nodes in Hadoop do not communicate using SSH. See
http://wiki.apache.org/hadoop/FAQ#Does_Hadoop_require_SSH.3F

On Wed, Feb 13, 2013 at 5:16 PM, Pedro Sá da Costa  wrote:
> I'm trying to configure ssh for Hadoop mapreduce, but my nodes only
> communicate with each other using RSA keys in pem format.
>
> (It doesn't work)
> ssh user@host
> Permission denied (publickey).
>
> (It works)
> ssh -i ~/key.pem user@host
>
> The nodes in mapreduce communicate using ssh. How do I configure ssh, or
> mapreduce, to work with the pem file?
>
>
> --
> Best regards,
> P



--
Harsh J


Re: configure mapreduce to work with pem files.

2013-02-13 Thread Chris Embree
You need to configure ssh to use your pem files; by default it looks for
its standard dsa or rsa identity files. Look at man ssh_config.
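
For example, a minimal ~/.ssh/config entry might look like this (the host
pattern is illustrative):

  Host node*
  IdentityFile ~/key.pem

With that in place, a plain "ssh user@host" should pick up the key
automatically.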



On Wed, Feb 13, 2013 at 6:46 AM, Pedro Sá da Costa wrote:

> I'm trying to configure ssh for Hadoop mapreduce, but my nodes only
> communicate with each other using RSA keys in pem format.
>
> (It doesn't work)
> ssh user@host
> Permission denied (publickey).
>
> (It works)
> ssh -i ~/key.pem user@host
>
> The nodes in mapreduce communicate using ssh. How do I configure ssh, or
> mapreduce, to work with the pem file?
>
>
> --
> Best regards,
> P
>


configure mapreduce to work with pem files.

2013-02-13 Thread Pedro Sá da Costa
I'm trying to configure ssh for Hadoop mapreduce, but my nodes only
communicate with each other using RSA keys in pem format.

(It doesn't work)
ssh user@host
Permission denied (publickey).

(It works)
ssh -i ~/key.pem user@host

The nodes in mapreduce communicate using ssh. How do I configure ssh, or
mapreduce, to work with the pem file?


-- 
Best regards,
P


Re: Need Information on Hadoop Cluster Set up

2013-02-13 Thread Yusaku Sako
Hi Sandeep,

You may also want to take a look at Apache Ambari for installing, managing,
and monitoring a Hadoop cluster.

http://incubator.apache.org/ambari/index.html

Yusaku

On Wed, Feb 13, 2013 at 12:20 AM, Alexander Alten-Lorenz <
wget.n...@gmail.com> wrote:

> Hi,
>
>
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
> http://mapredit.blogspot.de/p/get-hadoop-cluster-running-in-20.html
>
> Just two of many blogs that describe a setup.
>
> Hardware:
> http://www.youtube.com/watch?v=UQJnJvwcsA8
>
> http://books.google.de/books?id=Nff49D7vnJcC&pg=PA259&lpg=PA259&dq=hadoop+commodity+hardware+specs&source=bl&ots=IihqWp8zRq&sig=Dse6D7KO8XS5EcXQnCnShAl5Q70&hl=en&sa=X&ei=0UwbUajSFozJswaXxoH4Aw&ved=0CD4Q6AEwAg#v=onepage&q=hadoop%20commodity%20hardware%20specs&f=false
>
> - Alex
>
> On Feb 13, 2013, at 8:00 AM, Sandeep Jain 
> wrote:
>
> > Dear Team,
> >
> > We are in the initial phase of Hadoop learning and wanted to set up a
> > cluster from a Hadoop administration perspective.
> > Kindly help us with all the possible options for getting a Hadoop
> > cluster up and running.
> > Do let us know the minimum configurations of the machines too.
> >
> > Your help is much appreciated.
> >
> > Regards,
> > Sandeep
> >
>
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>
>


Re: Need Information on Hadoop Cluster Set up

2013-02-13 Thread Alexander Alten-Lorenz
Hi,

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
http://mapredit.blogspot.de/p/get-hadoop-cluster-running-in-20.html

Just two of many blogs that describe a setup.

Hardware:
http://www.youtube.com/watch?v=UQJnJvwcsA8
http://books.google.de/books?id=Nff49D7vnJcC&pg=PA259&lpg=PA259&dq=hadoop+commodity+hardware+specs&source=bl&ots=IihqWp8zRq&sig=Dse6D7KO8XS5EcXQnCnShAl5Q70&hl=en&sa=X&ei=0UwbUajSFozJswaXxoH4Aw&ved=0CD4Q6AEwAg#v=onepage&q=hadoop%20commodity%20hardware%20specs&f=false

- Alex

On Feb 13, 2013, at 8:00 AM, Sandeep Jain  wrote:

> Dear Team,
> 
> We are in the initial phase of Hadoop learning and wanted to set up a
> cluster from a Hadoop administration perspective.
> Kindly help us with all the possible options for getting a Hadoop cluster
> up and running.
> Do let us know the minimum configurations of the machines too.
> 
> Your help is much appreciated.
> 
> Regards,
> Sandeep
> 

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF



Re: Child JVM, Distributed Cache and Language Embedding

2013-02-13 Thread Saptarshi Guha
Hmm,
DistributedCache.getLocalCacheArchives


On Tue, Feb 12, 2013 at 9:28 PM, Saptarshi Guha wrote:

> Hello,
>
> I'm a bit fuzzy on the details here, so I appreciate your help.
>
> I am embedding a language into the JVM. My hadoop job will instantiate the
> child JVM once for all tasks assigned (mapred.job.reuse.jvm.num.tasks =
> -1)
>
> So if a node can run 6 parallel JVMs, it will and these 6 will churn
> through all the tasks assigned to them.
>
> Now, per JVM, the language engine will be instantiated. For this to work,
> I will ship the language distribution to the nodes (the nodes are really
> bare and installing the language on the node is not an option) using the
> distributed cache (as a tar.gz file).
>
> My understanding is that HadoopMapreduce will unarchive this tgz file and
> then for every task attempt symlink it into the task attempt's working
> folder.
>
> However, for the language engine to be successfully initialized, I need to
> know the location of the unarchived file, a location that will stay
> constant across all task attempts for that child JVM.
>
> Q: How can i infer this location?
>
> Cheers
> Saptarshi
>
>


Need Information on Hadoop Cluster Set up

2013-02-13 Thread Sandeep Jain
Dear Team,

We are in the initial phase of Hadoop learning and wanted to set up a
cluster from a Hadoop administration perspective.
Kindly help us with all the possible options for getting a Hadoop cluster
up and running.
Do let us know the minimum configurations of the machines too.

Your help is much appreciated.

Regards,
Sandeep
