Re: Copy data from Mainframe to HDFS

2013-07-23 Thread Raj K Singh
On a mainframe you typically have 3 types of data sources:
--flat files
--VSAM files
--DB2/IMS

DB2 and IMS support export utilities to copy the data into flat files,
which you can then pull through ftp/sftp.
VSAM files can be exported using the IDCAMS utility.
Flat files can be retrieved directly using ftp/sftp.
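
For example, a minimal sketch of that flat-file path from a Linux landing box (the host, dataset name and HDFS paths are placeholders, and dd's conv=ascii handles only simple character data; records with packed or binary fields need a copybook-aware converter instead):

cd /tmp
# pull the exported sequential dataset from the mainframe via sftp (lands in the current directory)
sftp user@mainframe.example.com:/u/exports/customer.flat
# convert EBCDIC to ASCII for plain character records
dd if=customer.flat of=customer.ascii conv=ascii
# load the converted file into HDFS
hadoop fs -mkdir /data/mainframe
hadoop fs -put customer.ascii /data/mainframe/customer.txt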


Raj K Singh
http://www.rajkrrsingh.blogspot.com
Mobile  Tel: +91 (0)9899821370


On Tue, Jul 23, 2013 at 12:24 PM, Mohammad Tariq  wrote:

> Hello Sandeep,
>
> You don't have to convert the data in order to copy it into the HDFS. But
> you might have to think about the MR processing of these files because of
> the format of these files.
>
> You could probably make use of Sqoop.
>
> I also came across DMX-H a few days ago while browsing. I don't know
> anything about the licensing and how good it is. Just thought of sharing it
> with you. You can visit their page to see more. They also
> provide a VM (includes CDH) to get started quickly.
>
> Warm Regards,
> Tariq
> cloudfront.blogspot.com
>
>
> On Tue, Jul 23, 2013 at 11:54 AM, Sandeep Nemuri wrote:
>
>> Hi ,
>>
>> "How to copy datasets from Mainframe to HDFS directly?  I know that we
>> can NDM files to Linux box and then we can use simple put command to copy
>> data to HDFS.  But, how to copy data directly from mainframe to HDFS?  I
>> have PS, PDS and VSAM datasets to copy to HDFS for analysis using
>> MapReduce.
>>
>> Also, Do we need to convert data from EBCDIC to ASCII before copy? "
>>
>> --
>> --Regards
>>   Sandeep Nemuri
>>
>
>


Re: Copy data from Mainframe to HDFS

2013-07-23 Thread Balamurali
Hi,

I configured hadoop-1.0.3, hbase-0.92.1 and hive-0.10.0.
Created a table in HBase, inserted records, and I am processing the data using Hive.
I have to show a graph with some points (7 points for 7 days, or 12 for one year). One
day may include anywhere from 1,000 to several lakhs of records, and I need to show the
average of those records. Is there any built-in Hadoop mechanism to process
these records fast?

Also I need to run a Hive query or job (when we run a Hive query a job is
actually submitted) every 1 hour. Is there a scheduling mechanism in
Hadoop to handle this?


Please reply.
Balamurali


On Tue, Jul 23, 2013 at 12:24 PM, Mohammad Tariq  wrote:

> Hello Sandeep,
>
> You don't have to convert the data in order to copy it into the HDFS. But
> you might have to think about the MR processing of these files because of
> the format of these files.
>
> You could probably make use of Sqoop.
>
> I also came across DMX-H a few days ago while browsing. I don't know
> anything about the licensing and how good it is. Just thought of sharing it
> with you. You can visit their page to see more. They also
> provide a VM (includes CDH) to get started quickly.
>
> Warm Regards,
> Tariq
> cloudfront.blogspot.com
>
>
> On Tue, Jul 23, 2013 at 11:54 AM, Sandeep Nemuri wrote:
>
>> Hi ,
>>
>> "How to copy datasets from Mainframe to HDFS directly?  I know that we
>> can NDM files to Linux box and then we can use simple put command to copy
>> data to HDFS.  But, how to copy data directly from mainframe to HDFS?  I
>> have PS, PDS and VSAM datasets to copy to HDFS for analysis using
>> MapReduce.
>>
>> Also, Do we need to convert data from EBCDIC to ASCII before copy? "
>>
>> --
>> --Regards
>>   Sandeep Nemuri
>>
>
>


RE: Copy data from Mainframe to HDFS

2013-07-23 Thread Devaraj k
Hi Balamurali,

As per my knowledge, there is nothing in Hadoop which does exactly what your
requirement asks for.

You can write mapreduce jobs according to your functionality, submit them
hourly/daily/weekly or monthly, and then aggregate the results.

If you want some help regarding Hive, you can ask the same on the Hive mailing list.
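
For the hourly run, one simple option outside of Hadoop itself is a cron entry that invokes the Hive CLI; a minimal sketch, assuming hive is on the PATH of the 'hadoop' user and the aggregation query lives in /home/hadoop/hourly_avg.hql (both are assumptions):

# /etc/cron.d/hourly-hive (hypothetical): run the aggregation at the top of every hour as user 'hadoop'
0 * * * * hadoop hive -f /home/hadoop/hourly_avg.hql >> /var/log/hourly_avg.log 2>&1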



Thanks
Devaraj k

From: Balamurali [mailto:balamurali...@gmail.com]
Sent: 23 July 2013 12:42
To: user
Subject: Re: Copy data from Mainframe to HDFS

Hi,

I configured hadoop-1.0.3, hbase-0.92.1 and hive-0.10.0 .
Created a table in HBase, inserted records, and I am processing the data using Hive.
I have to show a graph with some points (7 points for 7 days, or 12 for one year). One
day may include anywhere from 1,000 to several lakhs of records, and I need to show the
average of those records. Is there any built-in Hadoop mechanism to
process these records fast?
Also I need to run a Hive query or job (when we run a Hive query a job is
actually submitted) every 1 hour. Is there a scheduling mechanism in Hadoop to
handle this?

Please reply.
Balamurali

On Tue, Jul 23, 2013 at 12:24 PM, Mohammad Tariq 
mailto:donta...@gmail.com>> wrote:
Hello Sandeep,

You don't have to convert the data in order to copy it into the HDFS. But you 
might have to think about the MR processing of these files because of the 
format of these files.

You could probably make use of Sqoop.

I also came across DMX-H a few days ago while browsing. I don't know anything 
about the licensing and how good it is. Just thought of sharing it with you. 
You can visit their page to 
see more. They also provide a VM(includes CDH) to get started quickly.

Warm Regards,
Tariq
cloudfront.blogspot.com

On Tue, Jul 23, 2013 at 11:54 AM, Sandeep Nemuri 
mailto:nhsande...@gmail.com>> wrote:
Hi ,

"How to copy datasets from Mainframe to HDFS directly?  I know that we can NDM 
files to Linux box and then we can use simple put command to copy data to HDFS. 
 But, how to copy data directly from mainframe to HDFS?  I have PS, PDS and 
VSAM datasets to copy to HDFS for analysis using MapReduce.

Also, Do we need to convert data from EBCDIC to ASCII before copy? "

--
--Regards
  Sandeep Nemuri




Re: ERROR orm.ClassWriter: Cannot resolve SQL type 1111

2013-07-23 Thread Fatih Haltas
At those columns, I am using the uint type. I tried to cast them via a Sqoop
option, but it still gave the same error.

For other columns having types like int, text etc., I am able to import them, but
I have hundreds of data items of the uint type that I need.

While looking for solutions, I saw that Sqoop does not support the uint
type; is that correct, or is there any update related to the uint type?

Thank you all, especially Jarcec, you helped me a lot ;)



On Mon, Jul 22, 2013 at 7:04 PM, Jarek Jarcec Cecho wrote:

> Hi Fatih,
> per JDBC documentation [1] the code 1111 stands for type OTHER, which
> basically means "unknown". As Sqoop does not know the type, it does not know
> how to transfer it to Hadoop. Would you mind sharing your table definition?
>
> The possible workaround is to use query based import and cast the
> problematic columns to known and supported data types.
>
> Jarcec
>
> Links:
> 1:
> http://docs.oracle.com/javase/6/docs/api/constant-values.html#java.sql.Types.OTHER
>
> On Mon, Jul 22, 2013 at 04:03:42PM +0400, Fatih Haltas wrote:
> > Hi everyone,
> >
> > I am trying to import data from postgre to hdfs but unfortunately, I am
> > taking this error. What should I do?
> > I would be really obliged if you can help. I am struggling more than 3
> days.
> >
> > ---
> > Command that I used
> > ---
> > [hadoop@ADUAE042-LAP-V ~]$ sqoop import-all-tables --direct --connect
> > jdbc:postgresql://192.168.194.158:5432/IMS --username pgsql -P  --
> --schema
> > LiveIPs
> > 
> > Result
> > ---
> > Warning: /usr/lib/hbase does not exist! HBase imports will fail.
> > Please set $HBASE_HOME to the root of your HBase installation.
> > Warning: $HADOOP_HOME is deprecated.
> >
> > 13/07/22 15:01:05 WARN tool.BaseSqoopTool: Setting your password on the
> > command-line is insecure. Consider using -P instead.
> > 13/07/22 15:01:06 INFO manager.SqlManager: Using default fetchSize of
> 1000
> > 13/07/22 15:01:06 INFO manager.PostgresqlManager: We will use schema
> LiveIPs
> > 13/07/22 15:01:06 INFO tool.CodeGenTool: Beginning code generation
> > 13/07/22 15:01:06 INFO manager.SqlManager: Executing SQL statement:
> SELECT
> > t.* FROM "LiveIPs"."2013-04-01" AS t LIMIT 1
> > 13/07/22 15:01:06 ERROR orm.ClassWriter: Cannot resolve SQL type 1111
> > 13/07/22 15:01:06 ERROR orm.ClassWriter: Cannot resolve SQL type 1111
> > 13/07/22 15:01:06 ERROR orm.ClassWriter: No Java type for SQL type 1111 for column ip
> > 13/07/22 15:01:06 ERROR orm.ClassWriter: No Java type for SQL type 1111 for column ip
> > 13/07/22 15:01:06 ERROR orm.ClassWriter: No Java type for SQL type 1111 for column ip
> > 13/07/22 15:01:06 ERROR orm.ClassWriter: No Java type for SQL type 1111 for column ip
> > 13/07/22 15:01:06 ERROR orm.ClassWriter: No Java type for SQL type 1111 for column ip
> > 13/07/22 15:01:06 ERROR orm.ClassWriter: No Java type for SQL type 1111 for column ip
> > 13/07/22 15:01:06 ERROR sqoop.Sqoop: Got exception running Sqoop:
> > java.lang.NullPointerException
> > java.lang.NullPointerException
> > at org.apache.sqoop.orm.ClassWriter.parseNullVal(ClassWriter.java:912)
> > at org.apache.sqoop.orm.ClassWriter.parseColumn(ClassWriter.java:937)
> > at org.apache.sqoop.orm.ClassWriter.generateParser(ClassWriter.java:1011)
> > at
> >
> org.apache.sqoop.orm.ClassWriter.generateClassForColumns(ClassWriter.java:1342)
> > at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1153)
> > at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)
> > at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:390)
> > at
> >
> org.apache.sqoop.tool.ImportAllTablesTool.run(ImportAllTablesTool.java:64)
> > at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
> > at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
> > at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
> > at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
>


Join Operation with Regular Expression

2013-07-23 Thread enes yücer
Hi,

I have 2 data sets: one of them contains string text, and the other table contains
string patterns (which are searched for in the text) and an id.

I have created a quick solution: I create two external Hive tables and take the full
join of them,
and after the full join I use a regex function in the WHERE clause. But it takes too
long, because Hive does not support regex-based join conditions.


How do I do this operation in Hadoop, or have you implemented an MR job (in
Hive, Pig, Java) like this? Or any advice?


thanks.


RE: Join Operation with Regular Expression

2013-07-23 Thread Devaraj k
You can try writing a MapReduce job for this. In the job, you can filter the
records in the Mapper based on the WHERE-condition regex and then perform the join
in the Reducer.

Please refer to the classes present in the hadoop-datajoin module to get an idea of how
to implement the join job.
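
If you prefer not to write the join in Java, a Hadoop Streaming job of the same shape is one option. This is only a sketch: tag_mapper.sh and regex_join_reducer.sh are hypothetical scripts (the mapper would tag each record with its source, the reducer would apply the patterns to the grouped records), and the streaming jar path varies by installation:

hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
  -input /data/texts -input /data/patterns \
  -mapper tag_mapper.sh -reducer regex_join_reducer.sh \
  -file tag_mapper.sh -file regex_join_reducer.sh \
  -output /data/regex_join_out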

Thanks
Devaraj k

From: enes yücer [mailto:enes...@gmail.com]
Sent: 23 July 2013 13:49
To: user@hadoop.apache.org
Subject: Join Operation with Regular Expression

Hi,
I have 2 data sets: one of them contains string text, and the other table contains
string patterns (which are searched for in the text) and an id.
I have created a quick solution: I create two external Hive tables and take the full
join of them,
and after the full join I use a regex function in the WHERE clause. But it takes too long,
because Hive does not support regex-based join conditions.

How do I do this operation in Hadoop, or have you implemented an MR job (in Hive, Pig,
Java) like this? Or any advice?

thanks.


Re: setting mapred.task.timeout programmatically from client

2013-07-23 Thread Ted Yu
For a scheduling mechanism, please take a look at Oozie.
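
For reference, an Oozie coordinator that kicks off a workflow every hour would be submitted roughly like this. This is a sketch only: the server URL and hourly-hive-job.properties are placeholders, and the properties file would point oozie.coord.application.path at a coordinator.xml (with an hourly frequency) already stored in HDFS:

oozie job -oozie http://oozie-server:11000/oozie -config hourly-hive-job.properties -run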

Cheers

On Jul 22, 2013, at 10:37 PM, Balamurali  wrote:

> Hi, 
> 
> I configured hadoop-1.0.3, hbase-0.92.1 and hive-0.10.0 .
> Created a table in HBase, inserted records, and I am processing the data using Hive.
> I have to show a graph with some points (7 points for 7 days, or 12 for one year). One
> day may include anywhere from 1,000 to several lakhs of records, and I need to show the
> average of those records. Is there any built-in Hadoop mechanism to process these
> records fast?
>
> Also I need to run a Hive query or job (when we run a Hive query a job is
> actually submitted) every 1 hour. Is there a scheduling mechanism in Hadoop
> to handle this?
> 
> 
> Please reply.
> Balamurali
> 
> 
> On Tue, Jul 23, 2013 at 10:02 AM, Eugene Koifman  
> wrote:
>> Thank you both
>> 
>> On Mon, Jul 22, 2013 at 8:16 PM, Devaraj k  wrote:
>> > 'mapred.task.timeout' is deprecated configuration. You can use 
>> > 'mapreduce.task.timeout' property to do the same.
>> >
>> > You could set this configuration while submitting the Job using 
>> > org.apache.hadoop.conf.Configuration.setLong(String name, long value) API 
>> > from conf or JobConf.
>> >
>> > Thanks
>> > Devaraj k
>> >
>> > -Original Message-
>> > From: Eugene Koifman [mailto:ekoif...@hortonworks.com]
>> > Sent: 23 July 2013 04:24
>> > To: user@hadoop.apache.org
>> > Subject: setting mapred.task.timeout programmatically from client
>> >
>> > Hello,
>> > is there a way to set mapred.task.timeout programmatically from client?
>> >
>> > Thank you
> 


Re: setting mapred.task.timeout programmatically from client

2013-07-23 Thread Balamurali
Ok thanks


On Tue, Jul 23, 2013 at 3:02 PM, Ted Yu  wrote:

> For scheduling mechanism please take a look at oozie.
>
> Cheers
>
> On Jul 22, 2013, at 10:37 PM, Balamurali  wrote:
>
> Hi,
>
> I configured hadoop-1.0.3, hbase-0.92.1 and hive-0.10.0 .
> Created table in HBase.Inserted records.Processing the data using Hive.
> I have to show a graph with some points ( 7 - 7 days or 12 for one
> year).In one day records may include 1000 - lacks.I need to show average of
> these 1000 - lacks records.is there any built in haddop mechanism to
> process these records fast.
>
> Also I need to run a hive query  or job (when we run a hive query actually
> a job is submitting) in every 1 hour.Is there a scheduling mechanism in
> hadoop to handle thsese
>
>
> Please reply.
> Balamurali
>
>
> On Tue, Jul 23, 2013 at 10:02 AM, Eugene Koifman  > wrote:
>
>> Thank you both
>>
>> On Mon, Jul 22, 2013 at 8:16 PM, Devaraj k  wrote:
>> > 'mapred.task.timeout' is deprecated configuration. You can use
>> 'mapreduce.task.timeout' property to do the same.
>> >
>> > You could set this configuration while submitting the Job using
>> org.apache.hadoop.conf.Configuration.setLong(String name, long value) API
>> from conf or JobConf.
>> >
>> > Thanks
>> > Devaraj k
>> >
>> > -Original Message-
>> > From: Eugene Koifman [mailto:ekoif...@hortonworks.com]
>> > Sent: 23 July 2013 04:24
>> > To: user@hadoop.apache.org
>> > Subject: setting mapred.task.timeout programmatically from client
>> >
>> > Hello,
>> > is there a way to set mapred.task.timeout programmatically from client?
>> >
>> > Thank you
>>
>
>


Re: Copy data from Mainframe to HDFS

2013-07-23 Thread Sandeep Nemuri
Thanks for your replies, guys.
I am looking for an open source option; do we have any?


On Tue, Jul 23, 2013 at 12:53 PM, Devaraj k  wrote:

>  Hi Balamurali,
>
> As per my knowledge, there is nothing in Hadoop which does exactly what your
> requirement asks for.
>
> You can write mapreduce jobs according to your functionality, submit them
> hourly/daily/weekly or monthly, and then aggregate the results.
>
> If you want some help regarding Hive, you can ask the same on the Hive mailing
> list.
>
> Thanks
>
> Devaraj k
>
> *From:* Balamurali [mailto:balamurali...@gmail.com]
> *Sent:* 23 July 2013 12:42
> *To:* user
> *Subject:* Re: Copy data from Mainframe to HDFS
>
> Hi,
>
> I configured hadoop-1.0.3, hbase-0.92.1 and hive-0.10.0.
> Created a table in HBase, inserted records, and I am processing the data using Hive.
>
> I have to show a graph with some points (7 points for 7 days, or 12 for one
> year). One day may include anywhere from 1,000 to several lakhs of records, and
> I need to show the average of those records. Is there any built-in Hadoop
> mechanism to process these records fast?
>
> Also I need to run a Hive query or job (when we run a Hive query a job is
> actually submitted) every 1 hour. Is there a scheduling mechanism in Hadoop to
> handle this?
>
> Please reply.
>
> Balamurali
>
> On Tue, Jul 23, 2013 at 12:24 PM, Mohammad Tariq 
> wrote:
>
> Hello Sandeep,
>
> You don't have to convert the data in order to copy it into the HDFS. But
> you might have to think about the MR processing of these files because of
> the format of these files.
>
> You could probably make use of Sqoop.
>
> I also came across DMX-H a few days ago while browsing. I don't know
> anything about the licensing and how good it is. Just thought of sharing it
> with you. You can visit their page to see more. They also
> provide a VM (includes CDH) to get started quickly.
>
> Warm Regards,
>
> Tariq
>
> cloudfront.blogspot.com
>
> On Tue, Jul 23, 2013 at 11:54 AM, Sandeep Nemuri 
> wrote:
>
> Hi ,
>
> "How to copy datasets from Mainframe to HDFS directly?  I know that we can
> NDM files to Linux box and then we can use simple put command to copy data
> to HDFS.  But, how to copy data directly from mainframe to HDFS?  I have
> PS, PDS and VSAM datasets to copy to HDFS for analysis using MapReduce.
>
> Also, Do we need to convert data from EBCDIC to ASCII before copy? "
>
> --
>
> --Regards
>
>   Sandeep Nemuri
>



-- 
--Regards
  Sandeep Nemuri


Re: ERROR orm.ClassWriter: Cannot resolve SQL type 1111

2013-07-23 Thread Shahab Yunus
I think you will have to write custom code to handle this.
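
One concrete shape of Jarcec's cast workaround is a free-form query import; a sketch only: the split column id is an assumption, the target directory is a placeholder, and whether the cast to bigint works directly depends on the casts defined for the custom uint type (going through text, e.g. ip::text::bigint, is an alternative):

sqoop import \
  --connect jdbc:postgresql://192.168.194.158:5432/IMS \
  --username pgsql -P \
  --query 'SELECT t.id, CAST(t.ip AS bigint) AS ip FROM "LiveIPs"."2013-04-01" AS t WHERE $CONDITIONS' \
  --split-by id \
  --target-dir /user/hadoop/liveips_2013-04-01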

Regards,
Shahab


On Tue, Jul 23, 2013 at 3:50 AM, Fatih Haltas  wrote:

> At those columns, I am using the uint type. I tried to cast them via a Sqoop
> option, but it still gave the same error.
>
>  For other columns having types like int, text etc., I am able to import them, but
> I have hundreds of data items of the uint type that I need.
>
> While looking for solutions, I saw that Sqoop does not support the uint
> type; is that correct, or is there any update related to the uint type?
>
> Thank you all, especially Jarcec, you helped me a lot ;)
>
>
>
> On Mon, Jul 22, 2013 at 7:04 PM, Jarek Jarcec Cecho wrote:
>
>> Hi Fatih,
>> per JDBC documentation [1] the code 1111 stands for type OTHER, which
>> basically means "unknown". As Sqoop does not know the type, it does not know
>> how to transfer it to Hadoop. Would you mind sharing your table definition?
>>
>> The possible workaround is to use query based import and cast the
>> problematic columns to known and supported data types.
>>
>> Jarcec
>>
>> Links:
>> 1:
>> http://docs.oracle.com/javase/6/docs/api/constant-values.html#java.sql.Types.OTHER
>>
>> On Mon, Jul 22, 2013 at 04:03:42PM +0400, Fatih Haltas wrote:
>> > Hi everyone,
>> >
>> > I am trying to import data from postgre to hdfs but unfortunately, I am
>> > taking this error. What should I do?
>> > I would be really obliged if you can help. I am struggling more than 3
>> days.
>> >
>> > ---
>> > Command that I used
>> > ---
>> > [hadoop@ADUAE042-LAP-V ~]$ sqoop import-all-tables --direct --connect
>> > jdbc:postgresql://192.168.194.158:5432/IMS --username pgsql -P  --
>> --schema
>> > LiveIPs
>> > 
>> > Result
>> > ---
>> > Warning: /usr/lib/hbase does not exist! HBase imports will fail.
>> > Please set $HBASE_HOME to the root of your HBase installation.
>> > Warning: $HADOOP_HOME is deprecated.
>> >
>> > 13/07/22 15:01:05 WARN tool.BaseSqoopTool: Setting your password on the
>> > command-line is insecure. Consider using -P instead.
>> > 13/07/22 15:01:06 INFO manager.SqlManager: Using default fetchSize of
>> 1000
>> > 13/07/22 15:01:06 INFO manager.PostgresqlManager: We will use schema
>> LiveIPs
>> > 13/07/22 15:01:06 INFO tool.CodeGenTool: Beginning code generation
>> > 13/07/22 15:01:06 INFO manager.SqlManager: Executing SQL statement:
>> SELECT
>> > t.* FROM "LiveIPs"."2013-04-01" AS t LIMIT 1
>> > 13/07/22 15:01:06 ERROR orm.ClassWriter: Cannot resolve SQL type 1111
>> > 13/07/22 15:01:06 ERROR orm.ClassWriter: Cannot resolve SQL type 1111
>> > 13/07/22 15:01:06 ERROR orm.ClassWriter: No Java type for SQL type 1111 for column ip
>> > 13/07/22 15:01:06 ERROR orm.ClassWriter: No Java type for SQL type 1111 for column ip
>> > 13/07/22 15:01:06 ERROR orm.ClassWriter: No Java type for SQL type 1111 for column ip
>> > 13/07/22 15:01:06 ERROR orm.ClassWriter: No Java type for SQL type 1111 for column ip
>> > 13/07/22 15:01:06 ERROR orm.ClassWriter: No Java type for SQL type 1111 for column ip
>> > 13/07/22 15:01:06 ERROR orm.ClassWriter: No Java type for SQL type 1111 for column ip
>> > 13/07/22 15:01:06 ERROR sqoop.Sqoop: Got exception running Sqoop:
>> > java.lang.NullPointerException
>> > java.lang.NullPointerException
>> > at org.apache.sqoop.orm.ClassWriter.parseNullVal(ClassWriter.java:912)
>> > at org.apache.sqoop.orm.ClassWriter.parseColumn(ClassWriter.java:937)
>> > at
>> org.apache.sqoop.orm.ClassWriter.generateParser(ClassWriter.java:1011)
>> > at
>> >
>> org.apache.sqoop.orm.ClassWriter.generateClassForColumns(ClassWriter.java:1342)
>> > at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1153)
>> > at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)
>> > at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:390)
>> > at
>> >
>> org.apache.sqoop.tool.ImportAllTablesTool.run(ImportAllTablesTool.java:64)
>> > at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
>> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> > at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
>> > at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
>> > at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
>> > at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
>>
>
>


New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Ashish Umrani
Hi There,

First of all, sorry if I am asking a stupid question.  Being new
to the Hadoop environment, I am finding it a bit difficult to figure out why
it's failing.

I have installed hadoop 1.2, based on instructions given in the following
link:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

All went well and I could do the start-all.sh, and the jps command does show
all 5 processes to be present.

However when I try to do

hadoop fs -ls

I get the following error

hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$ hadoop fs
-ls
Warning: $HADOOP_HOME is deprecated.

13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not <property>
13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not <property>
13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not <property>
13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not <property>
13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not <property>
13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not <property>
ls: Cannot access .: No such file or directory.
hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$



Can someone help me figure out what the issue is in my installation?


Regards
ashish


Re: copy files from ftp to hdfs in parallel, distcp failed

2013-07-23 Thread Hao Ren

Hi,

I am just wondering whether I can move data from Ftp to Hdfs via Hadoop 
distcp.


Can someone give me an example ?

In my case, I always encounter the "can not access ftp" error.

I am quite sure that the link, login and password are correct; actually, I
have just copied and pasted the ftp address into Firefox, and it does work.
However, it doesn't work with:

bin/hadoop -ls ftp://

Any workaround here ?
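
For reference, the distcp form being attempted would look roughly like this (a sketch; user, password, host and paths are placeholders, and it only has a chance of working once a plain hadoop fs -ls ftp://... succeeds with the fs.ftp.host settings discussed below):

hadoop distcp ftp://ftpuser:ftppass@ftp.example.com/exports/ hdfs://namenode:9010/data/imports/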

Thank you.

Hao

On 16/07/2013 17:47, Hao Ren wrote:

Hi,

Actually, I tested with my own ftp host at first; however, it doesn't work.

Then I changed it into 0.0.0.0.

But I always get the "can not access ftp" msg.

Thank you .

Hao.

On 16/07/2013 17:03, Ram wrote:

Hi,
Please replace 0.0.0.0 with your ftp host ip address and try it.



From,
Ramesh.




On Mon, Jul 15, 2013 at 3:22 PM, Hao Ren wrote:


Thank you, Ram

I have configured core-site.xml as follows:

<configuration>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/vol/persistent-hdfs</value>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://ec2-23-23-33-234.compute-1.amazonaws.com:9010</value>
  </property>

  <property>
    <name>io.file.buffer.size</name>
    <value>65536</value>
  </property>

  <property>
    <name>fs.ftp.host</name>
    <value>0.0.0.0</value>
  </property>

  <property>
    <name>fs.ftp.host.port</name>
    <value>21</value>
  </property>

</configuration>

Then I tried hadoop fs -ls file:/// , and it works.
But hadoop fs -ls ftp://<user>:<password>@<host>/<path>/ doesn't work as usual:
ls: Cannot access ftp://<user>:<password>@<host>/<path>/: No such file or directory.

When leaving out the <path>, as in:

hadoop fs -ls ftp://<user>:<password>@<host>/

there are no error msgs, but it lists nothing.


I have also checked the rights for my /home/<user>/ directory:

drwxr-xr-x 11 <user> <group> 4096 Jul 11 16:30 <user>

and all the files under /home/<user>/ have rights 755.

I can easily copy the link ftp://<user>:<password>@<host>/<path>/ into Firefox, and it lists all the files as expected.

Any workaround here?

Thank you.

Le 12/07/2013 14:01, Ram a écrit :

Please configure the following in core-site.xml and try.
   Use hadoop fs -ls file:///  -- to display local file system files
   Use hadoop fs -ls ftp://<host>/<path>  -- to display
ftp files; if it is listing files, go for distcp.

reference from

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml

fs.ftp.host       0.0.0.0   FTP filesystem connects to this server
fs.ftp.host.port  21        FTP filesystem connects to fs.ftp.host on this port




-- 
Hao Ren

ClaraVista
www.claravista.fr  





--
Hao Ren
ClaraVista
www.claravista.fr



--
Hao Ren
ClaraVista
www.claravista.fr



Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Yexi Jiang
Maybe a conf file is missing, or there is no privilege to access it, or there is
something wrong with the format of your conf files (hdfs-site, core-site,
mapred-site). You can double-check them. It could also be a typo in a
<property> tag or something like that.


2013/7/23 Ashish Umrani 

> Hi There,
>
> First of all, sorry if I am asking some stupid question.  Myself being new
> to the Hadoop environment , am finding it a bit difficult to figure out why
> its failing
>
> I have installed hadoop 1.2, based on instructions given in the folllowing
> link
>
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>
> All went well and I could do the start-all.sh and the jps command does
> show all 5 process to be present.
>
> However when I try to do
>
> hadoop fs -ls
>
> I get the following error
>
> hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$ hadoop
> fs -ls
> Warning: $HADOOP_HOME is deprecated.
>
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> ls: Cannot access .: No such file or directory.
> hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>
>
>
> Can someone help me figure out whats the issue in my installation
>
>
> Regards
> ashish
>



-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Jitendra Yadav
Hi,

You might have missed some configuration (XML tags). Please check all the
conf files.

Thanks
On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani wrote:

> Hi There,
>
> First of all, sorry if I am asking some stupid question.  Myself being new
> to the Hadoop environment , am finding it a bit difficult to figure out why
> its failing
>
> I have installed hadoop 1.2, based on instructions given in the folllowing
> link
>
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>
> All went well and I could do the start-all.sh and the jps command does
> show all 5 process to be present.
>
> However when I try to do
>
> hadoop fs -ls
>
> I get the following error
>
>  hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$ hadoop
> fs -ls
> Warning: $HADOOP_HOME is deprecated.
>
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> ls: Cannot access .: No such file or directory.
> hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>
>
>
> Can someone help me figure out whats the issue in my installation
>
>
> Regards
> ashish
>


Re: Copy data from Mainframe to HDFS

2013-07-23 Thread Jun Ping Du
Hi Sandeep,
I think Apache Oozie is something you are looking for. It provides workflow
management for Hadoop (and Pig, Hive, etc.) jobs and supports running jobs
continuously over a specific time period. Please refer to
http://oozie.apache.org/docs/3.3.2/ for details.
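
As a concrete sketch of driving such a periodic job (everything here is a placeholder: host names, paths, and the coordinator.xml with the hourly frequency that must already sit at the application path in HDFS):

cat > hourly-hive-job.properties <<'EOF'
nameNode=hdfs://namenode:9010
jobTracker=jobtracker:9001
oozie.coord.application.path=${nameNode}/user/hadoop/apps/hourly-hive
EOF
oozie job -oozie http://oozie-server:11000/oozie -config hourly-hive-job.properties -run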

Thanks, 

Junping 

- Original Message -

From: "Sandeep Nemuri"  
To: user@hadoop.apache.org 
Sent: Tuesday, July 23, 2013 7:04:56 PM 
Subject: Re: Copy data from Mainframe to HDFS 

Thanks for your reply guys , 
i am looking for open source do we have any ?? 


On Tue, Jul 23, 2013 at 12:53 PM, Devaraj k < devara...@huawei.com > wrote:

Hi Balamurali,

As per my knowledge, there is nothing in Hadoop which does exactly what your
requirement asks for.

You can write mapreduce jobs according to your functionality, submit them
hourly/daily/weekly or monthly, and then aggregate the results.

If you want some help regarding Hive, you can ask the same on the Hive mailing
list.

Thanks
Devaraj k

From: Balamurali [mailto: balamurali...@gmail.com ]
Sent: 23 July 2013 12:42
To: user
Subject: Re: Copy data from Mainframe to HDFS

Hi,

I configured hadoop-1.0.3, hbase-0.92.1 and hive-0.10.0.
Created a table in HBase, inserted records, and I am processing the data using Hive.
I have to show a graph with some points (7 points for 7 days, or 12 for one year). One
day may include anywhere from 1,000 to several lakhs of records, and I need to show the
average of those records. Is there any built-in Hadoop mechanism to process these
records fast?

Also I need to run a Hive query or job (when we run a Hive query a job is actually
submitted) every 1 hour. Is there a scheduling mechanism in Hadoop to handle this?

Please reply.
Balamurali

On Tue, Jul 23, 2013 at 12:24 PM, Mohammad Tariq < donta...@gmail.com > wrote:

Hello Sandeep,

You don't have to convert the data in order to copy it into the HDFS. But you
might have to think about the MR processing of these files because of the
format of these files.

You could probably make use of Sqoop.

I also came across DMX-H a few days ago while browsing. I don't know anything
about the licensing and how good it is. Just thought of sharing it with you.
You can visit their page to see more. They also provide a VM (includes CDH) to
get started quickly.

Warm Regards,
Tariq
cloudfront.blogspot.com

On Tue, Jul 23, 2013 at 11:54 AM, Sandeep Nemuri < nhsande...@gmail.com >
wrote:

Hi ,

"How to copy datasets from Mainframe to HDFS directly? I know that we can NDM
files to Linux box and then we can use simple put command to copy data to HDFS.
But, how to copy data directly from mainframe to HDFS? I have PS, PDS and VSAM
datasets to copy to HDFS for analysis using MapReduce.

Also, Do we need to convert data from EBCDIC to ASCII before copy? "

--
--Regards
Sandeep Nemuri

--
--Regards
Sandeep Nemuri



Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Ashish Umrani
Hey, thanks for the response.  I have changed 4 files during installation:

core-site.xml
mapred-site.xml
hdfs-site.xml   and
hadoop-env.sh


I could not find any issues, except that all params in hadoop-env.sh are
commented out.  Only JAVA_HOME is uncommented.

If you have a quick minute, can you please browse through these files in the
email and let me know where the issue could be.

Regards
ashish



I am listing those files below.
*core-site.xml*

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system.  A URI whose
    scheme and authority determine the FileSystem implementation.  The
    uri's scheme determines the config property (fs.SCHEME.impl) naming
    the FileSystem implementation class.  The uri's authority is used to
    determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>


*mapred-site.xml*

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs
    at.  If "local", then jobs are run in-process as a single map
    and reduce task.
    </description>
  </property>
</configuration>


*hdfs-site.xml*

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is
  created.
  The default is used if replication is not specified in create time.
  </description>
</configuration>

*hadoop-env.sh*
# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.  Required.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25

# Extra Java CLASSPATH elements.  Optional.
# export HADOOP_CLASSPATH=


All pther params in hadoop-env.sh are commented








On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav
wrote:

> Hi,
>
> You might have missed some configuration (XML tags ), Please check all the
> Conf files.
>
> Thanks
> On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani wrote:
>
>> Hi There,
>>
>> First of all, sorry if I am asking some stupid question.  Myself being
>> new to the Hadoop environment , am finding it a bit difficult to figure out
>> why its failing
>>
>> I have installed hadoop 1.2, based on instructions given in the
>> folllowing link
>>
>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>>
>> All went well and I could do the start-all.sh and the jps command does
>> show all 5 process to be present.
>>
>> However when I try to do
>>
>> hadoop fs -ls
>>
>> I get the following error
>>
>>  hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>> hadoop fs -ls
>> Warning: $HADOOP_HOME is deprecated.
>>
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> ls: Cannot access .: No such file or directory.
>> hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>>
>>
>>
>> Can someone help me figure out whats the issue in my installation
>>
>>
>> Regards
>> ashish
>>
>
>


Re: Incrementally adding to existing output directory

2013-07-23 Thread Max Lebedev
Hi Devaraj,

Thanks for the advice. That did the trick.

Thanks,
Max Lebedev


On Wed, Jul 17, 2013 at 10:51 PM, Devaraj k  wrote:

>  It seems it is not taking the CustomOutputFormat for the Job. You need
> to set the custom output format class using the
> org.apache.hadoop.mapred.JobConf.setOutputFormat(Class<? extends OutputFormat> theClass) API for your Job.
>
> If we don't set an OutputFormat for the Job, it takes the default
> TextOutputFormat, which internally extends FileOutputFormat; that's why you
> see in the below exception that it is still using the FileOutputFormat.
>
> Thanks
>
> Devaraj k
>
>
> *From:* Max Lebedev [mailto:ma...@actionx.com]
> *Sent:* 18 July 2013 01:03
> *To:* user@hadoop.apache.org
> *Subject:* Re: Incrementally adding to existing output directory
>
> ** **
>
> Hi Devaraj,
>
> Thank you very much for your help. I've created a CustomOutputFormat which
> is almost identical to FileOutputFormat as seen here,
> except I've removed line 125 which throws the FileAlreadyExistsException.
> However, when I try to run my code, I get this error:
>
> Exception in thread "main"
> org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory
> outDir already exists
>
> at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:137)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:887)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
> ...
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
> In my source code, I've changed "FileOutputFormat.setOutputPath" to
> "CustomOutputFormat.setOutputPath"
>
> Is it the case that FileOutputFormat.checkOutputSpecs is happening
> somewhere else, or have I done something wrong?
> I also don't quite understand your suggestion about MultipleOutputs. Would
> you mind elaborating?
>
> Thanks,
> Max Lebedev
>
> ** **
>
> On Tue, Jul 16, 2013 at 9:42 PM, Devaraj k  wrote:
>
> Hi Max,
>
> It can be done by customizing the output format class for your Job
> according to your expectations. You could refer to the
> OutputFormat.checkOutputSpecs(JobContext context) method, which checks the
> output specification. We can override this in your custom OutputFormat. You
> can also see the MultipleOutputs class for implementation details of how it
> could be done.
>
>
> Thanks
>
> Devaraj k
>
>  
>
> *From:* Max Lebedev [mailto:ma...@actionx.com]
> *Sent:* 16 July 2013 23:33
> *To:* user@hadoop.apache.org
> *Subject:* Incrementally adding to existing output directory
>
>  
>
> Hi
>
> I'm trying to figure out how to incrementally add to an existing output
> directory using MapReduce.
>
> I cannot specify the exact output path, as data in the input is sorted
> into categories and then written to different directories based in the
> contents. (in the examples below, token= or token=)
>
> As an example:
>
> When using MultipleOutput and provided that outDir does not exist yet, the
> following will work:
>
> hadoop jar myMR.jar
> --input-path=inputDir/dt=2013-05-03/* --output-path=outDir
>
> The result will be: 
>
> outDir/token=/dt=2013-05-03/
>
> outDir/token=/dt=2013-05-03/
>
> However, the following will fail because outDir already exists. Even
> though I am copying new inputs.
>
> hadoop jar myMR.jar  --input-path=inputDir/dt=2013-05-04/*
> --output-path=outDir
>
> will throw FileAlreadyExistsException
>
> What I would expect is that it adds
>
> outDir/token=/dt=2013-05-04/
>
> outDir/token=/dt=2013-05-04/
>
> Another possibility would be the following hack but it does not seem to be
> very elegant:
>
> hadoop jar myMR.jar --input-path=inputDir/2013-05-04/*
> --output-path=tempOutDir
>
> then copy from tempOutDir to outDir
>
> Is there a better way to address incrementally adding to an existing
> hadoop output directory?
>
> ** **
>


Only log.index

2013-07-23 Thread Ajay Srivastava
Hi,

I see that most of the tasks have only log.index created in the
/opt/hadoop/logs/userlogs/jobId/task_attempt directory.
When does this happen?
Is there a config setting for this, or is this a bug?


Regards,
Ajay Srivastava

Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread bejoy . hadoop
Hi Ashish

In your hdfs-site.xml, within the <configuration> tag you need to have a
<property> tag, and inside a <property> tag you can have <name>, <value> and
<description> tags.
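
For instance, a minimal well-formed hdfs-site.xml can be written from the shell like this (a sketch; the conf path follows the tutorial layout being used):

cat > /usr/local/hadoop/conf/hdfs-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.</description>
  </property>
</configuration>
EOF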


Regards 
Bejoy KS

Sent from remote device, Please excuse typos

-Original Message-
From: Ashish Umrani 
Date: Tue, 23 Jul 2013 09:28:00 
To: 
Reply-To: user@hadoop.apache.org
Subject: Re: New hadoop 1.2 single node installation giving problems

Hey thanks for response.  I have changed 4 files during installation

core-site.xml
mapred-site.xml
hdfs-site.xml   and
hadoop-env.sh


I could not find any issues except that all params in the hadoop-env.sh are
commented out.  Only java_home is un commented.

If you have a quick minute can you please browse through these files in
email and let me know where could be the issue.

Regards
ashish



I am listing those files below.
*core-site.xml *






  
hadoop.tmp.dir
/app/hadoop/tmp
A base for other temporary directories.
  

  
fs.default.name
hdfs://localhost:54310
The name of the default file system.  A URI whose
scheme and authority determine the FileSystem implementation.  The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class.  The uri's authority is used to
determine the host, port, etc. for a filesystem.
  




*mapred-site.xml*






  
mapred.job.tracker
localhost:54311
The host and port that the MapReduce job tracker runs
at.  If "local", then jobs are run in-process as a single map
and reduce task.

  




*hdfs-site.xml   and*






  dfs.replication
  1
  Default block replication.
The actual number of replications can be specified when the file is
created.
The default is used if replication is not specified in create time.
  




*hadoop-env.sh*
# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.  Required.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25

# Extra Java CLASSPATH elements.  Optional.
# export HADOOP_CLASSPATH=


All pther params in hadoop-env.sh are commented








On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav
wrote:

> Hi,
>
> You might have missed some configuration (XML tags ), Please check all the
> Conf files.
>
> Thanks
> On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani wrote:
>
>> Hi There,
>>
>> First of all, sorry if I am asking some stupid question.  Myself being
>> new to the Hadoop environment , am finding it a bit difficult to figure out
>> why its failing
>>
>> I have installed hadoop 1.2, based on instructions given in the
>> folllowing link
>>
>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>>
>> All went well and I could do the start-all.sh and the jps command does
>> show all 5 process to be present.
>>
>> However when I try to do
>>
>> hadoop fs -ls
>>
>> I get the following error
>>
>>  hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>> hadoop fs -ls
>> Warning: $HADOOP_HOME is deprecated.
>>
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> ls: Cannot access .: No such file or directory.
>> hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>>
>>
>>
>> Can someone help me figure out whats the issue in my installation
>>
>>
>> Regards
>> ashish
>>
>
>



Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Yexi Jiang
Seems hdfs-site.xml has no <property> tag.


2013/7/23 Ashish Umrani 

> Hey thanks for response.  I have changed 4 files during installation
>
> core-site.xml
> mapred-site.xml
> hdfs-site.xml   and
> hadoop-env.sh
>
>
> I could not find any issues except that all params in the hadoop-env.sh
> are commented out.  Only java_home is un commented.
>
> If you have a quick minute can you please browse through these files in
> email and let me know where could be the issue.
>
> Regards
> ashish
>
>
>
> I am listing those files below.
> *core-site.xml *
> 
> 
>
> 
>
> 
>   
> hadoop.tmp.dir
> /app/hadoop/tmp
> A base for other temporary directories.
>   
>
>   
> fs.default.name
> hdfs://localhost:54310
> The name of the default file system.  A URI whose
> scheme and authority determine the FileSystem implementation.  The
> uri's scheme determines the config property (fs.SCHEME.impl) naming
> the FileSystem implementation class.  The uri's authority is used to
> determine the host, port, etc. for a filesystem.
>   
> 
>
>
>
> *mapred-site.xml*
> 
> 
>
> 
>
> 
>   
> mapred.job.tracker
> localhost:54311
> The host and port that the MapReduce job tracker runs
> at.  If "local", then jobs are run in-process as a single map
> and reduce task.
> 
>   
> 
>
>
>
> *hdfs-site.xml   and*
> 
> 
>
> 
>
> 
>   dfs.replication
>   1
>   Default block replication.
> The actual number of replications can be specified when the file is
> created.
> The default is used if replication is not specified in create time.
>   
> 
>
>
>
> *hadoop-env.sh*
> # Set Hadoop-specific environment variables here.
>
> # The only required environment variable is JAVA_HOME.  All others are
> # optional.  When running a distributed configuration it is best to
> # set JAVA_HOME in this file, so that it is correctly defined on
> # remote nodes.
>
> # The java implementation to use.  Required.
> export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
>
> # Extra Java CLASSPATH elements.  Optional.
> # export HADOOP_CLASSPATH=
>
>
> All pther params in hadoop-env.sh are commented
>
>
>
>
>
>
>
>
> On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav <
> jeetuyadav200...@gmail.com> wrote:
>
>> Hi,
>>
>> You might have missed some configuration (XML tags ), Please check all
>> the Conf files.
>>
>> Thanks
>> On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani 
>> wrote:
>>
>>> Hi There,
>>>
>>> First of all, sorry if I am asking some stupid question.  Myself being
>>> new to the Hadoop environment , am finding it a bit difficult to figure out
>>> why its failing
>>>
>>> I have installed hadoop 1.2, based on instructions given in the
>>> folllowing link
>>>
>>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>>>
>>> All went well and I could do the start-all.sh and the jps command does
>>> show all 5 process to be present.
>>>
>>> However when I try to do
>>>
>>> hadoop fs -ls
>>>
>>> I get the following error
>>>
>>>  hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>>> hadoop fs -ls
>>> Warning: $HADOOP_HOME is deprecated.
>>>
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> ls: Cannot access .: No such file or directory.
>>> hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>>>
>>>
>>>
>>> Can someone help me figure out whats the issue in my installation
>>>
>>>
>>> Regards
>>> ashish
>>>
>>
>>
>


-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Jitendra Yadav
Hi Ashish,

Please check the <property> tag in hdfs-site.xml.

It is missing.

Thanks.
On Tue, Jul 23, 2013 at 9:58 PM, Ashish Umrani wrote:

> Hey thanks for response.  I have changed 4 files during installation
>
> core-site.xml
> mapred-site.xml
> hdfs-site.xml   and
> hadoop-env.sh
>
>
> I could not find any issues except that all params in the hadoop-env.sh
> are commented out.  Only java_home is un commented.
>
> If you have a quick minute can you please browse through these files in
> email and let me know where could be the issue.
>
> Regards
> ashish
>
>
>
> I am listing those files below.
>  *core-site.xml *
>  
> 
>
> 
>
> 
>   
> hadoop.tmp.dir
> /app/hadoop/tmp
> A base for other temporary directories.
>   
>
>   
> fs.default.name
> hdfs://localhost:54310
> The name of the default file system.  A URI whose
> scheme and authority determine the FileSystem implementation.  The
> uri's scheme determines the config property (fs.SCHEME.impl) naming
> the FileSystem implementation class.  The uri's authority is used to
> determine the host, port, etc. for a filesystem.
>   
> 
>
>
>
> *mapred-site.xml*
>  
> 
>
> 
>
> 
>   
> mapred.job.tracker
> localhost:54311
> The host and port that the MapReduce job tracker runs
> at.  If "local", then jobs are run in-process as a single map
> and reduce task.
> 
>   
> 
>
>
>
> *hdfs-site.xml   and*
>  
> 
>
> 
>
> 
>   dfs.replication
>   1
>   Default block replication.
> The actual number of replications can be specified when the file is
> created.
> The default is used if replication is not specified in create time.
>   
> 
>
>
>
> *hadoop-env.sh*
>  # Set Hadoop-specific environment variables here.
>
> # The only required environment variable is JAVA_HOME.  All others are
> # optional.  When running a distributed configuration it is best to
> # set JAVA_HOME in this file, so that it is correctly defined on
> # remote nodes.
>
> # The java implementation to use.  Required.
> export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
>
> # Extra Java CLASSPATH elements.  Optional.
> # export HADOOP_CLASSPATH=
>
>
> All pther params in hadoop-env.sh are commented
>
>
>
>
>
>
>
>
> On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav <
> jeetuyadav200...@gmail.com> wrote:
>
>> Hi,
>>
>> You might have missed some configuration (XML tags ), Please check all
>> the Conf files.
>>
>> Thanks
>> On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani 
>> wrote:
>>
>>> Hi There,
>>>
>>> First of all, sorry if I am asking some stupid question.  Myself being
>>> new to the Hadoop environment , am finding it a bit difficult to figure out
>>> why its failing
>>>
>>> I have installed hadoop 1.2, based on instructions given in the
>>> folllowing link
>>>
>>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>>>
>>> All went well and I could do the start-all.sh and the jps command does
>>> show all 5 process to be present.
>>>
>>> However when I try to do
>>>
>>> hadoop fs -ls
>>>
>>> I get the following error
>>>
>>>  hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>>> hadoop fs -ls
>>> Warning: $HADOOP_HOME is deprecated.
>>>
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> ls: Cannot access .: No such file or directory.
>>> hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>>>
>>>
>>>
>>> Can someone help me figure out whats the issue in my installation
>>>
>>>
>>> Regards
>>> ashish
>>>
>>
>>
>


Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Ashish Umrani
Thanks Jitendra, Bejoy and Yexi,

I got past that.  And now the ls command says it cannot access the
directory.  I am sure this is a permissions issue.  I am just wondering
which directory I am missing permissions on.

Any pointers?

And once again, thanks a lot

Regards
ashish

hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$ hadoop fs -ls
Warning: $HADOOP_HOME is deprecated.

ls: Cannot access .: No such file or directory.



On Tue, Jul 23, 2013 at 9:42 AM, Jitendra Yadav
wrote:

> Hi Ashish,
>
> Please check the <property> tag in hdfs-site.xml.
>
> It is missing.
>
> Thanks.
> On Tue, Jul 23, 2013 at 9:58 PM, Ashish Umrani wrote:
>
>> Hey thanks for response.  I have changed 4 files during installation
>>
>> core-site.xml
>> mapred-site.xml
>> hdfs-site.xml   and
>> hadoop-env.sh
>>
>>
>> I could not find any issues except that all params in the hadoop-env.sh
>> are commented out.  Only java_home is un commented.
>>
>> If you have a quick minute can you please browse through these files in
>> email and let me know where could be the issue.
>>
>> Regards
>> ashish
>>
>>
>>
>> I am listing those files below.
>>  *core-site.xml *
>>  
>> 
>>
>> 
>>
>> 
>>   
>> hadoop.tmp.dir
>> /app/hadoop/tmp
>> A base for other temporary directories.
>>   
>>
>>   
>> fs.default.name
>> hdfs://localhost:54310
>> The name of the default file system.  A URI whose
>> scheme and authority determine the FileSystem implementation.  The
>> uri's scheme determines the config property (fs.SCHEME.impl) naming
>> the FileSystem implementation class.  The uri's authority is used to
>> determine the host, port, etc. for a filesystem.
>>   
>> 
>>
>>
>>
>> *mapred-site.xml*
>>  
>> 
>>
>> 
>>
>> 
>>   
>> mapred.job.tracker
>> localhost:54311
>> The host and port that the MapReduce job tracker runs
>> at.  If "local", then jobs are run in-process as a single map
>> and reduce task.
>> 
>>   
>> 
>>
>>
>>
>> *hdfs-site.xml   and*
>>  
>> 
>>
>> 
>>
>> 
>>   dfs.replication
>>   1
>>   Default block replication.
>> The actual number of replications can be specified when the file is
>> created.
>> The default is used if replication is not specified in create time.
>>   
>> 
>>
>>
>>
>> *hadoop-env.sh*
>>  # Set Hadoop-specific environment variables here.
>>
>> # The only required environment variable is JAVA_HOME.  All others are
>> # optional.  When running a distributed configuration it is best to
>> # set JAVA_HOME in this file, so that it is correctly defined on
>> # remote nodes.
>>
>> # The java implementation to use.  Required.
>> export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
>>
>> # Extra Java CLASSPATH elements.  Optional.
>> # export HADOOP_CLASSPATH=
>>
>>
>> All pther params in hadoop-env.sh are commented
>>
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav <
>> jeetuyadav200...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> You might have missed some configuration (XML tags ), Please check all
>>> the Conf files.
>>>
>>> Thanks
>>> On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani 
>>> wrote:
>>>
 Hi There,

 First of all, sorry if I am asking some stupid question.  Myself being
 new to the Hadoop environment , am finding it a bit difficult to figure out
 why its failing

 I have installed hadoop 1.2, based on instructions given in the
 folllowing link

 http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

 All went well and I could do the start-all.sh and the jps command does
 show all 5 process to be present.

 However when I try to do

 hadoop fs -ls

 I get the following error

  hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
 hadoop fs -ls
 Warning: $HADOOP_HOME is deprecated.

 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
 
 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
 
 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
 
 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
 
 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
 
 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
 
 ls: Cannot access .: No such file or directory.
 hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$



 Can someone help me figure out whats the issue in my installation


 Regards
 ashish

>>>
>>>
>>
>


Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Shekhar Sharma
It's a warning, not an error...

Create a directory and then do ls (in your case /user/hduser is not
created until the first time you create a directory or put
some file there):

hadoop fs  -mkdir sample

hadoop fs  -ls

If you are getting a permission problem, I would suggest
checking the following:

(1) Did you run the command "hadoop namenode -format" as one user
while you are accessing HDFS as a different user?
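
As a quick way to see what is actually in HDFS (a sketch; the home directory path depends on the user running the shell):

# list the root and the expected home directory explicitly
hadoop fs -ls /
hadoop fs -ls /user
# create the home directory if it is missing, then retry the bare -ls
hadoop fs -mkdir /user/hduser
hadoop fs -ls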

On Tue, Jul 23, 2013 at 10:10 PM,  wrote:

> **
> Hi Ashish
>
> In your hdfs-site.xml, within the <configuration> tag you need to have the
> <property> tag, and inside a <property> tag you can have <name>, <value> and
> <description> tags.
>
> Regards
> Bejoy KS
>
> Sent from remote device, Please excuse typos
> --
> *From: * Ashish Umrani 
> *Date: *Tue, 23 Jul 2013 09:28:00 -0700
> *To: *
> *ReplyTo: * user@hadoop.apache.org
> *Subject: *Re: New hadoop 1.2 single node installation giving problems
>
> Hey thanks for response.  I have changed 4 files during installation
>
> core-site.xml
> mapred-site.xml
> hdfs-site.xml   and
> hadoop-env.sh
>
>
> I could not find any issues except that all params in the hadoop-env.sh
> are commented out.  Only java_home is un commented.
>
> If you have a quick minute can you please browse through these files in
> email and let me know where could be the issue.
>
> Regards
> ashish
>
>
>
> I am listing those files below.
> *core-site.xml *
> 
> 
>
> 
>
> 
>   
> hadoop.tmp.dir
> /app/hadoop/tmp
> A base for other temporary directories.
>   
>
>   
> fs.default.name
> hdfs://localhost:54310
> The name of the default file system.  A URI whose
> scheme and authority determine the FileSystem implementation.  The
> uri's scheme determines the config property (fs.SCHEME.impl) naming
> the FileSystem implementation class.  The uri's authority is used to
> determine the host, port, etc. for a filesystem.
>   
> 
>
>
>
> *mapred-site.xml*
> 
> 
>
> 
>
> 
>   
> mapred.job.tracker
> localhost:54311
> The host and port that the MapReduce job tracker runs
> at.  If "local", then jobs are run in-process as a single map
> and reduce task.
> 
>   
> 
>
>
>
> *hdfs-site.xml   and*
> 
> 
>
> 
>
> 
>   dfs.replication
>   1
>   Default block replication.
> The actual number of replications can be specified when the file is
> created.
> The default is used if replication is not specified in create time.
>   
> 
>
>
>
> *hadoop-env.sh*
> # Set Hadoop-specific environment variables here.
>
> # The only required environment variable is JAVA_HOME.  All others are
> # optional.  When running a distributed configuration it is best to
> # set JAVA_HOME in this file, so that it is correctly defined on
> # remote nodes.
>
> # The java implementation to use.  Required.
> export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
>
> # Extra Java CLASSPATH elements.  Optional.
> # export HADOOP_CLASSPATH=
>
>
> All pther params in hadoop-env.sh are commented
>
>
>
>
>
>
>
>
> On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav <
> jeetuyadav200...@gmail.com> wrote:
>
>> Hi,
>>
>> You might have missed some configuration (XML tags ), Please check all
>> the Conf files.
>>
>> Thanks
>> On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani 
>> wrote:
>>
>>> Hi There,
>>>
>>> First of all, sorry if I am asking some stupid question.  Myself being
>>> new to the Hadoop environment , am finding it a bit difficult to figure out
>>> why its failing
>>>
>>> I have installed hadoop 1.2, based on instructions given in the
>>> folllowing link
>>>
>>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>>>
>>> All went well and I could do the start-all.sh and the jps command does
>>> show all 5 process to be present.
>>>
>>> However when I try to do
>>>
>>> hadoop fs -ls
>>>
>>> I get the following error
>>>
>>>  hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>>> hadoop fs -ls
>>> Warning: $HADOOP_HOME is deprecated.
>>>
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>> 
>>> ls: Cannot access .: No such file or directory.
>>> hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>>>
>>>
>>>
>>> Can someone help me figure out whats the issue in my installation
>>>
>>>
>>> Regards
>>> ashish
>>>
>>
>>
>


Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Mohammad Tariq
Hello Ashish,

Change the permissions of /app/hadoop/tmp to 755 and see if it helps.

Warm Regards,
Tariq
cloudfront.blogspot.com


On Tue, Jul 23, 2013 at 10:27 PM, Ashish Umrani wrote:

> Thanks Jitendra, Bejoy and Yexi,
>
> I got past that.  And now the ls command says it can not access the
> directory.  I am sure this is a permissions issue.  I am just wondering
> which directory and I missing permissions on.
>
> Any pointers?
>
> And once again, thanks a lot
>
> Regards
> ashish
>
> *hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$ hadoop
> fs -ls*
> *Warning: $HADOOP_HOME is deprecated.*
> *
> *
> *ls: Cannot access .: No such file or directory.*
>
>
>
> On Tue, Jul 23, 2013 at 9:42 AM, Jitendra Yadav <
> jeetuyadav200...@gmail.com> wrote:
>
>> Hi Ashish,
>>
>> Please check   in hdfs-site.xml.
>>
>> It is missing.
>>
>> Thanks.
>> On Tue, Jul 23, 2013 at 9:58 PM, Ashish Umrani 
>> wrote:
>>
>>> Hey thanks for response.  I have changed 4 files during installation
>>>
>>> core-site.xml
>>> mapred-site.xml
>>> hdfs-site.xml   and
>>> hadoop-env.sh
>>>
>>>
>>> I could not find any issues except that all params in the hadoop-env.sh
>>> are commented out.  Only java_home is un commented.
>>>
>>> If you have a quick minute can you please browse through these files in
>>> email and let me know where could be the issue.
>>>
>>> Regards
>>> ashish
>>>
>>>
>>>
>>> I am listing those files below.
>>>  *core-site.xml *
>>>  
>>> 
>>>
>>> 
>>>
>>> 
>>>   
>>> hadoop.tmp.dir
>>> /app/hadoop/tmp
>>> A base for other temporary directories.
>>>   
>>>
>>>   
>>> fs.default.name
>>> hdfs://localhost:54310
>>> The name of the default file system.  A URI whose
>>> scheme and authority determine the FileSystem implementation.  The
>>> uri's scheme determines the config property (fs.SCHEME.impl) naming
>>> the FileSystem implementation class.  The uri's authority is used to
>>> determine the host, port, etc. for a filesystem.
>>>   
>>> 
>>>
>>>
>>>
>>> *mapred-site.xml*
>>>  
>>> 
>>>
>>> 
>>>
>>> 
>>>   
>>> mapred.job.tracker
>>> localhost:54311
>>> The host and port that the MapReduce job tracker runs
>>> at.  If "local", then jobs are run in-process as a single map
>>> and reduce task.
>>> 
>>>   
>>> 
>>>
>>>
>>>
>>> *hdfs-site.xml   and*
>>>  
>>> 
>>>
>>> 
>>>
>>> 
>>>   dfs.replication
>>>   1
>>>   Default block replication.
>>> The actual number of replications can be specified when the file is
>>> created.
>>> The default is used if replication is not specified in create time.
>>>   
>>> 
>>>
>>>
>>>
>>> *hadoop-env.sh*
>>>  # Set Hadoop-specific environment variables here.
>>>
>>> # The only required environment variable is JAVA_HOME.  All others are
>>> # optional.  When running a distributed configuration it is best to
>>> # set JAVA_HOME in this file, so that it is correctly defined on
>>> # remote nodes.
>>>
>>> # The java implementation to use.  Required.
>>> export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
>>>
>>> # Extra Java CLASSPATH elements.  Optional.
>>> # export HADOOP_CLASSPATH=
>>>
>>>
>>> All pther params in hadoop-env.sh are commented
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav <
>>> jeetuyadav200...@gmail.com> wrote:
>>>
 Hi,

 You might have missed some configuration (XML tags ), Please check all
 the Conf files.

 Thanks
 On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani >>> > wrote:

> Hi There,
>
> First of all, sorry if I am asking some stupid question.  Myself being
> new to the Hadoop environment , am finding it a bit difficult to figure 
> out
> why its failing
>
> I have installed hadoop 1.2, based on instructions given in the
> folllowing link
>
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>
> All went well and I could do the start-all.sh and the jps command does
> show all 5 process to be present.
>
> However when I try to do
>
> hadoop fs -ls
>
> I get the following error
>
>  hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
> hadoop fs -ls
> Warning: $HADOOP_HOME is deprecated.
>
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> ls: Cannot access .: No such file or directory.
> hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>
>
>
> Can someone help me figure out whats the issue in my in

Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Ashish Umrani
Hey,

Thanks Shekhar.  That worked like a charm.  I appreciate the help from you all.
 Now I will try to put files and run the word count or a similar program.

Regards
ashish


On Tue, Jul 23, 2013 at 10:07 AM, Shekhar Sharma wrote:

> Its warning not error...
>
> Create a directory and then do ls ( In your case /user/hduser is not
> created untill and unless for the first time you create a directory or put
> some file)
>
> hadoop fs  -mkdir sample
>
> hadoop fs  -ls
>
> I would suggest if you are getting pemission problem,
> please check the following:
>
> (1) Have you run the command "hadoop namenode -format" with different user
> and you are accessing the hdfs with different user?
>
> On Tue, Jul 23, 2013 at 10:10 PM,  wrote:
>
>> **
>> Hi Ashish
>>
>> In your hdfs-site.xml, within the <configuration> tag you need to have the
>> <property> tag, and inside a <property> tag you can have <name>, <value> and
>> <description> tags.
>>
>> Regards
>> Bejoy KS
>>
>> Sent from remote device, Please excuse typos
>> --
>> *From: * Ashish Umrani 
>> *Date: *Tue, 23 Jul 2013 09:28:00 -0700
>> *To: *
>> *ReplyTo: * user@hadoop.apache.org
>> *Subject: *Re: New hadoop 1.2 single node installation giving problems
>>
>> Hey thanks for response.  I have changed 4 files during installation
>>
>> core-site.xml
>> mapred-site.xml
>> hdfs-site.xml   and
>> hadoop-env.sh
>>
>>
>> I could not find any issues except that all params in the hadoop-env.sh
>> are commented out.  Only java_home is un commented.
>>
>> If you have a quick minute can you please browse through these files in
>> email and let me know where could be the issue.
>>
>> Regards
>> ashish
>>
>>
>>
>> I am listing those files below.
>> *core-site.xml *
>> 
>> 
>>
>> 
>>
>> 
>>   
>> hadoop.tmp.dir
>> /app/hadoop/tmp
>> A base for other temporary directories.
>>   
>>
>>   
>> fs.default.name
>> hdfs://localhost:54310
>> The name of the default file system.  A URI whose
>> scheme and authority determine the FileSystem implementation.  The
>> uri's scheme determines the config property (fs.SCHEME.impl) naming
>> the FileSystem implementation class.  The uri's authority is used to
>> determine the host, port, etc. for a filesystem.
>>   
>> 
>>
>>
>>
>> *mapred-site.xml*
>> 
>> 
>>
>> 
>>
>> 
>>   
>> mapred.job.tracker
>> localhost:54311
>> The host and port that the MapReduce job tracker runs
>> at.  If "local", then jobs are run in-process as a single map
>> and reduce task.
>> 
>>   
>> 
>>
>>
>>
>> *hdfs-site.xml   and*
>> 
>> 
>>
>> 
>>
>> 
>>   dfs.replication
>>   1
>>   Default block replication.
>> The actual number of replications can be specified when the file is
>> created.
>> The default is used if replication is not specified in create time.
>>   
>> 
>>
>>
>>
>> *hadoop-env.sh*
>> # Set Hadoop-specific environment variables here.
>>
>> # The only required environment variable is JAVA_HOME.  All others are
>> # optional.  When running a distributed configuration it is best to
>> # set JAVA_HOME in this file, so that it is correctly defined on
>> # remote nodes.
>>
>> # The java implementation to use.  Required.
>> export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
>>
>> # Extra Java CLASSPATH elements.  Optional.
>> # export HADOOP_CLASSPATH=
>>
>>
>> All pther params in hadoop-env.sh are commented
>>
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav <
>> jeetuyadav200...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> You might have missed some configuration (XML tags ), Please check all
>>> the Conf files.
>>>
>>> Thanks
>>> On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani 
>>> wrote:
>>>
 Hi There,

 First of all, sorry if I am asking some stupid question.  Myself being
 new to the Hadoop environment , am finding it a bit difficult to figure out
 why its failing

 I have installed hadoop 1.2, based on instructions given in the
 folllowing link

 http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

 All went well and I could do the start-all.sh and the jps command does
 show all 5 process to be present.

 However when I try to do

 hadoop fs -ls

 I get the following error

  hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
 hadoop fs -ls
 Warning: $HADOOP_HOME is deprecated.

 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
 
 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
 
 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
 
 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
 
 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
 
 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
 
 ls: Cannot access .: No such file or directory.
 hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$

Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Ashish Umrani
Thanks,

But the issue was that there was no directory and hence it was not showing
anything.  Adding a directory cleared the warning.

I appreciate your help.

Regards
ashish


On Tue, Jul 23, 2013 at 10:08 AM, Mohammad Tariq  wrote:

> Hello Ashish,
>
> Change the permissions of /app/hadoop/tmp to 755 and see if it helps.
>
> Warm Regards,
> Tariq
> cloudfront.blogspot.com
>
>
> On Tue, Jul 23, 2013 at 10:27 PM, Ashish Umrani 
> wrote:
>
>> Thanks Jitendra, Bejoy and Yexi,
>>
>> I got past that.  And now the ls command says it can not access the
>> directory.  I am sure this is a permissions issue.  I am just wondering
>> which directory and I missing permissions on.
>>
>> Any pointers?
>>
>> And once again, thanks a lot
>>
>> Regards
>> ashish
>>
>> *hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>> hadoop fs -ls*
>> *Warning: $HADOOP_HOME is deprecated.*
>> *
>> *
>> *ls: Cannot access .: No such file or directory.*
>>
>>
>>
>> On Tue, Jul 23, 2013 at 9:42 AM, Jitendra Yadav <
>> jeetuyadav200...@gmail.com> wrote:
>>
>>> Hi Ashish,
>>>
>>> Please check   in hdfs-site.xml.
>>>
>>> It is missing.
>>>
>>> Thanks.
>>> On Tue, Jul 23, 2013 at 9:58 PM, Ashish Umrani 
>>> wrote:
>>>
 Hey thanks for response.  I have changed 4 files during installation

 core-site.xml
 mapred-site.xml
 hdfs-site.xml   and
 hadoop-env.sh


 I could not find any issues except that all params in the hadoop-env.sh
 are commented out.  Only java_home is un commented.

 If you have a quick minute can you please browse through these files in
 email and let me know where could be the issue.

 Regards
 ashish



 I am listing those files below.
  *core-site.xml *
  
 

 

 
   
 hadoop.tmp.dir
 /app/hadoop/tmp
 A base for other temporary directories.
   

   
 fs.default.name
 hdfs://localhost:54310
 The name of the default file system.  A URI whose
 scheme and authority determine the FileSystem implementation.  The
 uri's scheme determines the config property (fs.SCHEME.impl) naming
 the FileSystem implementation class.  The uri's authority is used to
 determine the host, port, etc. for a filesystem.
   
 



 *mapred-site.xml*
  
 

 

 
   
 mapred.job.tracker
 localhost:54311
 The host and port that the MapReduce job tracker runs
 at.  If "local", then jobs are run in-process as a single map
 and reduce task.
 
   
 



 *hdfs-site.xml   and*
  
 

 

 
   dfs.replication
   1
   Default block replication.
 The actual number of replications can be specified when the file is
 created.
 The default is used if replication is not specified in create time.
   
 



 *hadoop-env.sh*
  # Set Hadoop-specific environment variables here.

 # The only required environment variable is JAVA_HOME.  All others are
 # optional.  When running a distributed configuration it is best to
 # set JAVA_HOME in this file, so that it is correctly defined on
 # remote nodes.

 # The java implementation to use.  Required.
 export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25

 # Extra Java CLASSPATH elements.  Optional.
 # export HADOOP_CLASSPATH=


 All pther params in hadoop-env.sh are commented








 On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav <
 jeetuyadav200...@gmail.com> wrote:

> Hi,
>
> You might have missed some configuration (XML tags ), Please check all
> the Conf files.
>
> Thanks
> On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani <
> ashish.umr...@gmail.com> wrote:
>
>> Hi There,
>>
>> First of all, sorry if I am asking some stupid question.  Myself
>> being new to the Hadoop environment , am finding it a bit difficult to
>> figure out why its failing
>>
>> I have installed hadoop 1.2, based on instructions given in the
>> folllowing link
>>
>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>>
>> All went well and I could do the start-all.sh and the jps command
>> does show all 5 process to be present.
>>
>> However when I try to do
>>
>> hadoop fs -ls
>>
>> I get the following error
>>
>>  hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>> hadoop fs -ls
>> Warning: $HADOOP_HOME is deprecated.
>>
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>> 
>> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
>>>

Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Shekhar Sharma
After starting the daemons, I would suggest always checking whether your
NameNode and JobTracker UIs are working, and checking the number of live nodes
in both of the UIs.
Regards,
Som Shekhar Sharma
+91-8197243810


On Tue, Jul 23, 2013 at 10:41 PM, Ashish Umrani wrote:

> Thanks,
>
> But the issue was that there was no directory and hence it was not showing
> anything.  Adding a directory cleared the warning.
>
> I appreciate your help.
>
> Regards
> ashish
>
>
> On Tue, Jul 23, 2013 at 10:08 AM, Mohammad Tariq wrote:
>
>> Hello Ashish,
>>
>> Change the permissions of /app/hadoop/tmp to 755 and see if it helps.
>>
>> Warm Regards,
>> Tariq
>> cloudfront.blogspot.com
>>
>>
>> On Tue, Jul 23, 2013 at 10:27 PM, Ashish Umrani 
>> wrote:
>>
>>> Thanks Jitendra, Bejoy and Yexi,
>>>
>>> I got past that.  And now the ls command says it can not access the
>>> directory.  I am sure this is a permissions issue.  I am just wondering
>>> which directory and I missing permissions on.
>>>
>>> Any pointers?
>>>
>>> And once again, thanks a lot
>>>
>>> Regards
>>> ashish
>>>
>>> *hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>>> hadoop fs -ls*
>>> *Warning: $HADOOP_HOME is deprecated.*
>>> *
>>> *
>>> *ls: Cannot access .: No such file or directory.*
>>>
>>>
>>>
>>> On Tue, Jul 23, 2013 at 9:42 AM, Jitendra Yadav <
>>> jeetuyadav200...@gmail.com> wrote:
>>>
 Hi Ashish,

 Please check   in hdfs-site.xml.

 It is missing.

 Thanks.
 On Tue, Jul 23, 2013 at 9:58 PM, Ashish Umrani >>> > wrote:

> Hey thanks for response.  I have changed 4 files during installation
>
> core-site.xml
> mapred-site.xml
> hdfs-site.xml   and
> hadoop-env.sh
>
>
> I could not find any issues except that all params in the
> hadoop-env.sh are commented out.  Only java_home is un commented.
>
> If you have a quick minute can you please browse through these files
> in email and let me know where could be the issue.
>
> Regards
> ashish
>
>
>
> I am listing those files below.
>  *core-site.xml *
>  
> 
>
> 
>
> 
>   
> hadoop.tmp.dir
> /app/hadoop/tmp
> A base for other temporary directories.
>   
>
>   
> fs.default.name
> hdfs://localhost:54310
> The name of the default file system.  A URI whose
> scheme and authority determine the FileSystem implementation.  The
> uri's scheme determines the config property (fs.SCHEME.impl) naming
> the FileSystem implementation class.  The uri's authority is used
> to
> determine the host, port, etc. for a filesystem.
>   
> 
>
>
>
> *mapred-site.xml*
>  
> 
>
> 
>
> 
>   
> mapred.job.tracker
> localhost:54311
> The host and port that the MapReduce job tracker runs
> at.  If "local", then jobs are run in-process as a single map
> and reduce task.
> 
>   
> 
>
>
>
> *hdfs-site.xml   and*
>  
> 
>
> 
>
> 
>   dfs.replication
>   1
>   Default block replication.
> The actual number of replications can be specified when the file
> is created.
> The default is used if replication is not specified in create time.
>   
> 
>
>
>
> *hadoop-env.sh*
>  # Set Hadoop-specific environment variables here.
>
> # The only required environment variable is JAVA_HOME.  All others are
> # optional.  When running a distributed configuration it is best to
> # set JAVA_HOME in this file, so that it is correctly defined on
> # remote nodes.
>
> # The java implementation to use.  Required.
> export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
>
> # Extra Java CLASSPATH elements.  Optional.
> # export HADOOP_CLASSPATH=
>
>
> All pther params in hadoop-env.sh are commented
>
>
>
>
>
>
>
>
> On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav <
> jeetuyadav200...@gmail.com> wrote:
>
>> Hi,
>>
>> You might have missed some configuration (XML tags ), Please check
>> all the Conf files.
>>
>> Thanks
>> On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani <
>> ashish.umr...@gmail.com> wrote:
>>
>>> Hi There,
>>>
>>> First of all, sorry if I am asking some stupid question.  Myself
>>> being new to the Hadoop environment , am finding it a bit difficult to
>>> figure out why its failing
>>>
>>> I have installed hadoop 1.2, based on instructions given in the
>>> folllowing link
>>>
>>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>>>
>>> All went well and I could do the start-all.sh and the jps command
>>> does show all 5 process to be present.
>>>
>>> However when I try to do

Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Jitendra Yadav
Try..

*hadoop fs -ls /*

Thanks


On Tue, Jul 23, 2013 at 10:27 PM, Ashish Umrani wrote:

> Thanks Jitendra, Bejoy and Yexi,
>
> I got past that.  And now the ls command says it can not access the
> directory.  I am sure this is a permissions issue.  I am just wondering
> which directory and I missing permissions on.
>
> Any pointers?
>
> And once again, thanks a lot
>
> Regards
> ashish
>
>  *hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
> hadoop fs -ls*
> *Warning: $HADOOP_HOME is deprecated.*
> *
> *
> *ls: Cannot access .: No such file or directory.*
>
>
>
> On Tue, Jul 23, 2013 at 9:42 AM, Jitendra Yadav <
> jeetuyadav200...@gmail.com> wrote:
>
>> Hi Ashish,
>>
>> Please check   in hdfs-site.xml.
>>
>> It is missing.
>>
>> Thanks.
>> On Tue, Jul 23, 2013 at 9:58 PM, Ashish Umrani 
>> wrote:
>>
>>> Hey thanks for response.  I have changed 4 files during installation
>>>
>>> core-site.xml
>>> mapred-site.xml
>>> hdfs-site.xml   and
>>> hadoop-env.sh
>>>
>>>
>>> I could not find any issues except that all params in the hadoop-env.sh
>>> are commented out.  Only java_home is un commented.
>>>
>>> If you have a quick minute can you please browse through these files in
>>> email and let me know where could be the issue.
>>>
>>> Regards
>>> ashish
>>>
>>>
>>>
>>> I am listing those files below.
>>>  *core-site.xml *
>>>  
>>> 
>>>
>>> 
>>>
>>> 
>>>   
>>> hadoop.tmp.dir
>>> /app/hadoop/tmp
>>> A base for other temporary directories.
>>>   
>>>
>>>   
>>> fs.default.name
>>> hdfs://localhost:54310
>>> The name of the default file system.  A URI whose
>>> scheme and authority determine the FileSystem implementation.  The
>>> uri's scheme determines the config property (fs.SCHEME.impl) naming
>>> the FileSystem implementation class.  The uri's authority is used to
>>> determine the host, port, etc. for a filesystem.
>>>   
>>> 
>>>
>>>
>>>
>>> *mapred-site.xml*
>>>  
>>> 
>>>
>>> 
>>>
>>> 
>>>   
>>> mapred.job.tracker
>>> localhost:54311
>>> The host and port that the MapReduce job tracker runs
>>> at.  If "local", then jobs are run in-process as a single map
>>> and reduce task.
>>> 
>>>   
>>> 
>>>
>>>
>>>
>>> *hdfs-site.xml   and*
>>>  
>>> 
>>>
>>> 
>>>
>>> 
>>>   dfs.replication
>>>   1
>>>   Default block replication.
>>> The actual number of replications can be specified when the file is
>>> created.
>>> The default is used if replication is not specified in create time.
>>>   
>>> 
>>>
>>>
>>>
>>> *hadoop-env.sh*
>>>  # Set Hadoop-specific environment variables here.
>>>
>>> # The only required environment variable is JAVA_HOME.  All others are
>>> # optional.  When running a distributed configuration it is best to
>>> # set JAVA_HOME in this file, so that it is correctly defined on
>>> # remote nodes.
>>>
>>> # The java implementation to use.  Required.
>>> export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
>>>
>>> # Extra Java CLASSPATH elements.  Optional.
>>> # export HADOOP_CLASSPATH=
>>>
>>>
>>> All pther params in hadoop-env.sh are commented
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav <
>>> jeetuyadav200...@gmail.com> wrote:
>>>
 Hi,

 You might have missed some configuration (XML tags ), Please check all
 the Conf files.

 Thanks
 On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani >>> > wrote:

> Hi There,
>
> First of all, sorry if I am asking some stupid question.  Myself being
> new to the Hadoop environment , am finding it a bit difficult to figure 
> out
> why its failing
>
> I have installed hadoop 1.2, based on instructions given in the
> folllowing link
>
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>
> All went well and I could do the start-all.sh and the jps command does
> show all 5 process to be present.
>
> However when I try to do
>
> hadoop fs -ls
>
> I get the following error
>
>  hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
> hadoop fs -ls
> Warning: $HADOOP_HOME is deprecated.
>
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> 13/07/23 05:55:06 WARN conf.Configuration: bad conf file: element not
> 
> ls: Cannot access .: No such file or directory.
> hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>
>
>
> Can someone help me figure out whats the issue in my installation
>
>
> Regards
> ashish
>


>>>
>>
>


Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Ashish Umrani
Jitendra, Som,

Thanks.  The issue was in not having any file there.  It's working fine now.

I am able to do -ls and could also do -mkdir and -put.

Now it is time to run the jar, and apparently I am getting

no main manifest attribute, in wc.jar


But I believe it's because the Maven pom file does not have the main class
entry.

While I go ahead and change the pom file and build it again, please let me
know if you think of some other reason.

Once again this user group rocks.  I have never seen such a quick response.

Regards
ashish


On Tue, Jul 23, 2013 at 10:21 AM, Jitendra Yadav  wrote:

> Try..
>
> *hadoop fs -ls /*
>
> **
> Thanks
>
>
> On Tue, Jul 23, 2013 at 10:27 PM, Ashish Umrani 
> wrote:
>
>> Thanks Jitendra, Bejoy and Yexi,
>>
>> I got past that.  And now the ls command says it can not access the
>> directory.  I am sure this is a permissions issue.  I am just wondering
>> which directory and I missing permissions on.
>>
>> Any pointers?
>>
>> And once again, thanks a lot
>>
>> Regards
>> ashish
>>
>>  *hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>> hadoop fs -ls*
>> *Warning: $HADOOP_HOME is deprecated.*
>> *
>> *
>> *ls: Cannot access .: No such file or directory.*
>>
>>
>>
>> On Tue, Jul 23, 2013 at 9:42 AM, Jitendra Yadav <
>> jeetuyadav200...@gmail.com> wrote:
>>
>>> Hi Ashish,
>>>
>>> Please check   in hdfs-site.xml.
>>>
>>> It is missing.
>>>
>>> Thanks.
>>> On Tue, Jul 23, 2013 at 9:58 PM, Ashish Umrani 
>>> wrote:
>>>
 Hey thanks for response.  I have changed 4 files during installation

 core-site.xml
 mapred-site.xml
 hdfs-site.xml   and
 hadoop-env.sh


 I could not find any issues except that all params in the hadoop-env.sh
 are commented out.  Only java_home is un commented.

 If you have a quick minute can you please browse through these files in
 email and let me know where could be the issue.

 Regards
 ashish



 I am listing those files below.
  *core-site.xml *
  
 

 

 
   
 hadoop.tmp.dir
 /app/hadoop/tmp
 A base for other temporary directories.
   

   
 fs.default.name
 hdfs://localhost:54310
 The name of the default file system.  A URI whose
 scheme and authority determine the FileSystem implementation.  The
 uri's scheme determines the config property (fs.SCHEME.impl) naming
 the FileSystem implementation class.  The uri's authority is used to
 determine the host, port, etc. for a filesystem.
   
 



 *mapred-site.xml*
  
 

 

 
   
 mapred.job.tracker
 localhost:54311
 The host and port that the MapReduce job tracker runs
 at.  If "local", then jobs are run in-process as a single map
 and reduce task.
 
   
 



 *hdfs-site.xml   and*
  
 

 

 
   dfs.replication
   1
   Default block replication.
 The actual number of replications can be specified when the file is
 created.
 The default is used if replication is not specified in create time.
   
 



 *hadoop-env.sh*
  # Set Hadoop-specific environment variables here.

 # The only required environment variable is JAVA_HOME.  All others are
 # optional.  When running a distributed configuration it is best to
 # set JAVA_HOME in this file, so that it is correctly defined on
 # remote nodes.

 # The java implementation to use.  Required.
 export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25

 # Extra Java CLASSPATH elements.  Optional.
 # export HADOOP_CLASSPATH=


 All pther params in hadoop-env.sh are commented








 On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav <
 jeetuyadav200...@gmail.com> wrote:

> Hi,
>
> You might have missed some configuration (XML tags ), Please check all
> the Conf files.
>
> Thanks
> On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani <
> ashish.umr...@gmail.com> wrote:
>
>> Hi There,
>>
>> First of all, sorry if I am asking some stupid question.  Myself
>> being new to the Hadoop environment , am finding it a bit difficult to
>> figure out why its failing
>>
>> I have installed hadoop 1.2, based on instructions given in the
>> folllowing link
>>
>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>>
>> All went well and I could do the start-all.sh and the jps command
>> does show all 5 process to be present.
>>
>> However when I try to do
>>
>> hadoop fs -ls
>>
>> I get the following error
>>
>>  hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>> hadoop fs -ls
>> Warning: $HADOOP_HOME is depre

Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Shekhar Sharma
hadoop jar wc.jar  inputdata outputdestination


Regards,
Som Shekhar Sharma
+91-8197243810


On Tue, Jul 23, 2013 at 10:58 PM, Ashish Umrani wrote:

> Jitendra, Som,
>
> Thanks.  Issue was in not having any file there.  Its working fine now.
>
> I am able to do -ls and could also do -mkdir and -put.
>
> Now is time to run the jar and apparently I am getting
>
> no main manifest attribute, in wc.jar
>
>
> But I believe its because of maven pom file does not have the main class
> entry.
>
> Which I go ahead and change the pom file and build it again, please let me
> know if you guys think of some other reason.
>
> Once again this user group rocks.  I have never seen this quick a response.
>
> Regards
> ashish
>
>
> On Tue, Jul 23, 2013 at 10:21 AM, Jitendra Yadav <
> jeetuyadav200...@gmail.com> wrote:
>
>> Try..
>>
>> *hadoop fs -ls /*
>>
>> **
>> Thanks
>>
>>
>> On Tue, Jul 23, 2013 at 10:27 PM, Ashish Umrani 
>> wrote:
>>
>>> Thanks Jitendra, Bejoy and Yexi,
>>>
>>> I got past that.  And now the ls command says it can not access the
>>> directory.  I am sure this is a permissions issue.  I am just wondering
>>> which directory and I missing permissions on.
>>>
>>> Any pointers?
>>>
>>> And once again, thanks a lot
>>>
>>> Regards
>>> ashish
>>>
>>>  *hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
>>> hadoop fs -ls*
>>> *Warning: $HADOOP_HOME is deprecated.*
>>> *
>>> *
>>> *ls: Cannot access .: No such file or directory.*
>>>
>>>
>>>
>>> On Tue, Jul 23, 2013 at 9:42 AM, Jitendra Yadav <
>>> jeetuyadav200...@gmail.com> wrote:
>>>
 Hi Ashish,

 Please check   in hdfs-site.xml.

 It is missing.

 Thanks.
 On Tue, Jul 23, 2013 at 9:58 PM, Ashish Umrani >>> > wrote:

> Hey thanks for response.  I have changed 4 files during installation
>
> core-site.xml
> mapred-site.xml
> hdfs-site.xml   and
> hadoop-env.sh
>
>
> I could not find any issues except that all params in the
> hadoop-env.sh are commented out.  Only java_home is un commented.
>
> If you have a quick minute can you please browse through these files
> in email and let me know where could be the issue.
>
> Regards
> ashish
>
>
>
> I am listing those files below.
>  *core-site.xml *
>  
> 
>
> 
>
> 
>   
> hadoop.tmp.dir
> /app/hadoop/tmp
> A base for other temporary directories.
>   
>
>   
> fs.default.name
> hdfs://localhost:54310
> The name of the default file system.  A URI whose
> scheme and authority determine the FileSystem implementation.  The
> uri's scheme determines the config property (fs.SCHEME.impl) naming
> the FileSystem implementation class.  The uri's authority is used
> to
> determine the host, port, etc. for a filesystem.
>   
> 
>
>
>
> *mapred-site.xml*
>  
> 
>
> 
>
> 
>   
> mapred.job.tracker
> localhost:54311
> The host and port that the MapReduce job tracker runs
> at.  If "local", then jobs are run in-process as a single map
> and reduce task.
> 
>   
> 
>
>
>
> *hdfs-site.xml   and*
>  
> 
>
> 
>
> 
>   dfs.replication
>   1
>   Default block replication.
> The actual number of replications can be specified when the file
> is created.
> The default is used if replication is not specified in create time.
>   
> 
>
>
>
> *hadoop-env.sh*
>  # Set Hadoop-specific environment variables here.
>
> # The only required environment variable is JAVA_HOME.  All others are
> # optional.  When running a distributed configuration it is best to
> # set JAVA_HOME in this file, so that it is correctly defined on
> # remote nodes.
>
> # The java implementation to use.  Required.
> export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
>
> # Extra Java CLASSPATH elements.  Optional.
> # export HADOOP_CLASSPATH=
>
>
> All pther params in hadoop-env.sh are commented
>
>
>
>
>
>
>
>
> On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav <
> jeetuyadav200...@gmail.com> wrote:
>
>> Hi,
>>
>> You might have missed some configuration (XML tags ), Please check
>> all the Conf files.
>>
>> Thanks
>> On Tue, Jul 23, 2013 at 6:25 PM, Ashish Umrani <
>> ashish.umr...@gmail.com> wrote:
>>
>>> Hi There,
>>>
>>> First of all, sorry if I am asking some stupid question.  Myself
>>> being new to the Hadoop environment , am finding it a bit difficult to
>>> figure out why its failing
>>>
>>> I have installed hadoop 1.2, based on instructions given in the
>>> folllowing link
>>>
>>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu

Get the tree structure of a HDFS dir, similar to dir/files

2013-07-23 Thread Huy Pham
Hi All,
   Do any of you have, or can you refer me to, some sample Java code that gets
the tree structure of an HDFS directory, similar to the file system?
   For example: I have an HDFS dir called /data; inside it there are
/data/valid and /data/invalid, and so on. I would need to be able to get the
whole tree structure of that and know which entry is a dir and which one is a
file. Both the program and HDFS are LOCAL.
   In other words, what I am looking for is something similar to the File
class in Java, which has isDirectory() and list() to list all the children
(files and dirs) of a dir. I found something on Stack Overflow but it does not work.
Thanks
Huy




Re: New hadoop 1.2 single node installation giving problems

2013-07-23 Thread Ashish Umrani
Thanks Shekhar,

The problem was not in my building of the jar.  It was in fact in the execution.

I was running the command

*hadoop -jar*   input output

The problem was with -jar.  It should be

*hadoop jar*   input output


Thanks for help once again

regards
ashish
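
For reference, the form "hadoop jar wc.jar input output" relies on the jar's
manifest naming a Main-Class; otherwise the class has to be passed explicitly,
as in "hadoop jar wc.jar example.WordCountDriver input output". A minimal
driver of the kind such a jar would contain might look like the sketch below;
the package, class name, and the use of the built-in TokenCounterMapper and
IntSumReducer are illustrative assumptions, not the poster's actual wc.jar code.

// Illustrative sketch only -- not the poster's wc.jar code.
package example;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

public class WordCountDriver {
  public static void main(String[] args) throws Exception {
    // args[0] = input path in HDFS, args[1] = output path (must not already exist)
    Job job = new Job(new Configuration(), "word count");
    job.setJarByClass(WordCountDriver.class);        // ships the containing jar to the cluster
    job.setMapperClass(TokenCounterMapper.class);    // built-in mapper: emits (token, 1)
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

The "no main manifest attribute" message earlier in the thread is the JVM's
complaint when a jar is run via java -jar but its manifest has no Main-Class
entry; with "hadoop jar" the class name can instead be given on the command
line as shown above.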


On Tue, Jul 23, 2013 at 10:31 AM, Shekhar Sharma wrote:

> hadoop jar wc.jar  inputdata outputdestination
>
>
> Regards,
> Som Shekhar Sharma
> +91-8197243810
>
>
> On Tue, Jul 23, 2013 at 10:58 PM, Ashish Umrani 
> wrote:
>
>> Jitendra, Som,
>>
>> Thanks.  Issue was in not having any file there.  Its working fine now.
>>
>> I am able to do -ls and could also do -mkdir and -put.
>>
>> Now is time to run the jar and apparently I am getting
>>
>> no main manifest attribute, in wc.jar
>>
>>
>> But I believe its because of maven pom file does not have the main class
>> entry.
>>
>> Which I go ahead and change the pom file and build it again, please let
>> me know if you guys think of some other reason.
>>
>> Once again this user group rocks.  I have never seen this quick a
>> response.
>>
>> Regards
>> ashish
>>
>>
>> On Tue, Jul 23, 2013 at 10:21 AM, Jitendra Yadav <
>> jeetuyadav200...@gmail.com> wrote:
>>
>>> Try..
>>>
>>> *hadoop fs -ls /*
>>>
>>> **
>>> Thanks
>>>
>>>
>>> On Tue, Jul 23, 2013 at 10:27 PM, Ashish Umrani >> > wrote:
>>>
 Thanks Jitendra, Bejoy and Yexi,

 I got past that.  And now the ls command says it can not access the
 directory.  I am sure this is a permissions issue.  I am just wondering
 which directory and I missing permissions on.

 Any pointers?

 And once again, thanks a lot

 Regards
 ashish

  *hduser@ashish-HP-Pavilion-dv6-Notebook-PC:/usr/local/hadoop/conf$
 hadoop fs -ls*
 *Warning: $HADOOP_HOME is deprecated.*
 *
 *
 *ls: Cannot access .: No such file or directory.*



 On Tue, Jul 23, 2013 at 9:42 AM, Jitendra Yadav <
 jeetuyadav200...@gmail.com> wrote:

> Hi Ashish,
>
> Please check   in hdfs-site.xml.
>
> It is missing.
>
> Thanks.
> On Tue, Jul 23, 2013 at 9:58 PM, Ashish Umrani <
> ashish.umr...@gmail.com> wrote:
>
>> Hey thanks for response.  I have changed 4 files during installation
>>
>> core-site.xml
>> mapred-site.xml
>> hdfs-site.xml   and
>> hadoop-env.sh
>>
>>
>> I could not find any issues except that all params in the
>> hadoop-env.sh are commented out.  Only java_home is un commented.
>>
>> If you have a quick minute can you please browse through these files
>> in email and let me know where could be the issue.
>>
>> Regards
>> ashish
>>
>>
>>
>> I am listing those files below.
>>  *core-site.xml *
>>  
>> 
>>
>> 
>>
>> 
>>   
>> hadoop.tmp.dir
>> /app/hadoop/tmp
>> A base for other temporary directories.
>>   
>>
>>   
>> fs.default.name
>> hdfs://localhost:54310
>> The name of the default file system.  A URI whose
>> scheme and authority determine the FileSystem implementation.  The
>> uri's scheme determines the config property (fs.SCHEME.impl)
>> naming
>> the FileSystem implementation class.  The uri's authority is used
>> to
>> determine the host, port, etc. for a filesystem.
>>   
>> 
>>
>>
>>
>> *mapred-site.xml*
>>  
>> 
>>
>> 
>>
>> 
>>   
>> mapred.job.tracker
>> localhost:54311
>> The host and port that the MapReduce job tracker runs
>> at.  If "local", then jobs are run in-process as a single map
>> and reduce task.
>> 
>>   
>> 
>>
>>
>>
>> *hdfs-site.xml   and*
>>  
>> 
>>
>> 
>>
>> 
>>   dfs.replication
>>   1
>>   Default block replication.
>> The actual number of replications can be specified when the file
>> is created.
>> The default is used if replication is not specified in create
>> time.
>>   
>> 
>>
>>
>>
>> *hadoop-env.sh*
>>  # Set Hadoop-specific environment variables here.
>>
>> # The only required environment variable is JAVA_HOME.  All others are
>> # optional.  When running a distributed configuration it is best to
>> # set JAVA_HOME in this file, so that it is correctly defined on
>> # remote nodes.
>>
>> # The java implementation to use.  Required.
>> export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
>>
>> # Extra Java CLASSPATH elements.  Optional.
>> # export HADOOP_CLASSPATH=
>>
>>
>> All pther params in hadoop-env.sh are commented
>>
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Jul 23, 2013 at 8:38 AM, Jitendra Yadav <
>> jeetuyadav200...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> You might have missed some configuration (XM

Re: Get the tree structure of a HDFS dir, similar to dir/files

2013-07-23 Thread Shahab Yunus
See this
https://sites.google.com/site/hadoopandhive/home/how-to-read-all-files-in-a-directory-in-hdfs-using-hadoop-filesystem-api

and

http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileSystem.html#isDirectory(org.apache.hadoop.fs.Path)

Basically you can write your own function, possibly with recursion, to
iterate over a directory using the ideas and combinations from the above two
links.

Regards,
Shahab
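
A rough sketch of such a recursive walk (the /data default path is just a
placeholder, and the Configuration is assumed to pick up fs.default.name from
core-site.xml):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsTreeWalker {

  // Recursively prints every entry under 'dir', marking directories vs. files.
  static void walk(FileSystem fs, Path dir, String indent) throws IOException {
    for (FileStatus status : fs.listStatus(dir)) {
      if (status.isDir()) {   // isDir() on the 1.x API; newer releases also offer isDirectory()
        System.out.println(indent + "[dir]  " + status.getPath().getName());
        walk(fs, status.getPath(), indent + "  ");
      } else {
        System.out.println(indent + "[file] " + status.getPath().getName());
      }
    }
  }

  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    walk(fs, new Path(args.length > 0 ? args[0] : "/data"), "");
  }
}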


On Tue, Jul 23, 2013 at 2:05 PM, Huy Pham  wrote:

>  Hi All,
>Do any of you have or can refer me to some sample Java code that get
> the tree structure of a HDFS directory, similar to the file system?
>For example: I have a HDFS dir, called /data, inside data, there is
> /data/valid and /data/invalid, and so on, so I would need to be able to get
> the whole tree structure of that and know which is is a dir, which one is a
> file. Both program and HDFS are LOCAL.
>In other words, what I look for is something similar to File class in
> Java, which has isDirectory() and list() to list all the children (files
> and dirs) of a dir. Found something in stackoverflow but it does not work.
> Thanks
> Huy
>
>
>


Saving counters in Mapfile

2013-07-23 Thread Elazar Leibovich
Hi,

A common use case where one wants an ordered structure is saving counters.

Naturally, I wanted to save the counters in a Mapfile:

for (long ix = 0; ix < MAXVALUE; ix++) {
    mapfile.append(new Text("counter key of val " + ix), new LongWritable(ix));
}

This, however, looks a bit inefficient. We'll store two files: a data
(sequence) file and an index file. The index file will contain an offset
(long) into the sequence file, which would contain just a single long.

I'd rather have only the index file, which would store the counter value
instead of offsets.

Is there a way to do that with Mapfile? Perhaps there's a better way to
save searchable counters in HDFS file?

Thanks,
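
For concreteness, the loop above assumes a MapFile.Writer opened roughly as in
the sketch below; the output path and the zero-padding of the key are
assumptions added here, since MapFile requires keys to be appended in sorted
order and plain concatenation of a growing number would violate Text's
lexicographic ordering:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

public class CounterMapFileWriter {
  public static void main(String[] args) throws Exception {
    final long MAXVALUE = 1000L;                // illustrative bound
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // A MapFile is a directory holding a sorted data file plus an index file.
    MapFile.Writer writer = new MapFile.Writer(conf, fs, "/data/counters",
        Text.class, LongWritable.class);
    try {
      for (long ix = 0; ix < MAXVALUE; ix++) {
        // Zero-padded keys keep the required sorted append order.
        writer.append(new Text(String.format("counter key of val %010d", ix)),
            new LongWritable(ix));
      }
    } finally {
      writer.close();
    }
  }
}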


Re: Saving counters in Mapfile

2013-07-23 Thread manishbh...@rocketmail.com
Hi,
If you intend to use those counters in further functions, then I think Hadoop 
will take care of this by itself; you can explore the combiner for the same. In a 
sequence-file-backed MapFile, the data and index files each have their own 
functionality, and as per my understanding the index file expects offsets used to 
move over the map file, so logically you can't use it for the counters.


Sent via Rocket from my HTC 

- Reply message -
From: "Elazar Leibovich" 
To: 
Subject: Saving counters in Mapfile
Date: Wed, Jul 24, 2013 3:27 AM


Hi,

A common use case one want an ordered structure for, is for saving counters.

Naturally, I wanted to save the counters in a Mapfile:

for (long ix = 0; ix < MAXVALUE; ix++) {
mapfile.append(new Text("counter key of val " + ix), new
LongWritable(ix));
}

This however looks a bit inefficient. We'll store two files, and an index
file. The index file will contain an offset (long) to the sequence file,
which would contain a single long.

I'd rather have only the index file, that would store the counter value
instead of offsets.

Is there a way to do that with Mapfile? Perhaps there's a better way to
save searchable counters in HDFS file?

Thanks,


Re: Saving counters in Mapfile

2013-07-23 Thread Michael Segel
Uhm...

You want to save the counters as in counts per job run or something? (Remember 
HDFS == WORM) 

Then you could do a sequence file and then use something like HBase to manage 
the index. 
(Every time you add a set of counters, you have a new file and a new index.) 
Heck you could use HBase for the whole thing but it would be overkill if this 
was all that you were doing. 


On Jul 23, 2013, at 4:57 PM, Elazar Leibovich  wrote:

> Hi,
> 
> A common use case one want an ordered structure for, is for saving counters.
> 
> Naturally, I wanted to save the counters in a Mapfile:
> 
> for (long ix = 0; ix < MAXVALUE; ix++) {
> mapfile.append(new Text("counter key of val " + ix), new 
> LongWritable(ix));
> }
> 
> This however looks a bit inefficient. We'll store two files, and an index 
> file. The index file will contain an offset (long) to the sequence file, 
> which would contain a single long.
> 
> I'd rather have only the index file, that would store the counter value 
> instead of offsets.
> 
> Is there a way to do that with Mapfile? Perhaps there's a better way to save 
> searchable counters in HDFS file?
> 
> Thanks,



Re: Get the tree structure of a HDFS dir, similar to dir/files

2013-07-23 Thread Harsh J
The FileSystem interface provides a recursive option for this. See
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html#listFiles(org.apache.hadoop.fs.Path,%20boolean)
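
A short sketch of that call, assuming a release where FileSystem#listFiles(Path,
boolean) is available (it is in the API Harsh links to); the /data path is a
placeholder. Note that the returned iterator yields files only, so directories
are implied by the paths of the files underneath them:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class ListFilesRecursively {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    // 'true' requests a recursive listing of every file under /data.
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(new Path("/data"), true);
    while (it.hasNext()) {
      LocatedFileStatus status = it.next();
      System.out.println(status.getPath() + "  (" + status.getLen() + " bytes)");
    }
  }
}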

On Tue, Jul 23, 2013 at 11:35 PM, Huy Pham  wrote:
> Hi All,
>Do any of you have or can refer me to some sample Java code that get the
> tree structure of a HDFS directory, similar to the file system?
>For example: I have a HDFS dir, called /data, inside data, there is
> /data/valid and /data/invalid, and so on, so I would need to be able to get
> the whole tree structure of that and know which is is a dir, which one is a
> file. Both program and HDFS are LOCAL.
>In other words, what I look for is something similar to File class in
> Java, which has isDirectory() and list() to list all the children (files and
> dirs) of a dir. Found something in stackoverflow but it does not work.
> Thanks
> Huy
>
>



-- 
Harsh J


Re: Only log.index

2013-07-23 Thread Vinod Kumar Vavilapalli

It could either mean that all those task-attempts are crashing before the 
process itself is getting spawned (check TT logs), or that those logs are getting 
deleted after the fact. I suspect the former.

Thanks,
+Vinod

On Jul 23, 2013, at 9:33 AM, Ajay Srivastava wrote:

> Hi,
> 
> I see that most of the tasks have only log.index created in 
> /opt/hadoop/logs/userlogs/jobId/task_attempt directory.
> When does this happen ?
> Is there a config setting for this OR this is a bug ?
> 
> 
> Regards,
> Ajay Srivastava



Re: Only log.index

2013-07-23 Thread Ajay Srivastava
Hi Vinod,

Thanks. It seems that something else is going on -

Here is the content of log.index -

ajay-srivastava:userlogs ajay.srivastava$ cat 
job_201307222115_0188/attempt_201307222115_0188_r_00_0/log.index
LOG_DIR:/opt/hadoop/bin/../logs/userlogs/job_201307222115_0188/attempt_201307222115_0188_r_08_0
stdout:0 0
stderr:156 0
syslog:995 166247

Looks like the log.index is pointing to another attempt directory.
Is it doing some kind of optimization? What is the purpose of log.index?


Regards,
Ajay Srivastava


On 24-Jul-2013, at 11:09 AM, Vinod Kumar Vavilapalli wrote:

> 
> It could either mean that all those task-attempts are crashing before the 
> process itself is getting spawned (check TT logs) or those logs are getting 
> deleted after the fact. Suspect the earlier.
> 
> Thanks,
> +Vinod
> 
> On Jul 23, 2013, at 9:33 AM, Ajay Srivastava wrote:
> 
>> Hi,
>> 
>> I see that most of the tasks have only log.index created in 
>> /opt/hadoop/logs/userlogs/jobId/task_attempt directory.
>> When does this happen ?
>> Is there a config setting for this OR this is a bug ?
>> 
>> 
>> Regards,
>> Ajay Srivastava
> 



Re: Only log.index

2013-07-23 Thread Vinod Kumar Vavilapalli

Ah, I should've guessed that. You seem to have JVM reuse enabled. When JVMs 
are reused, all the tasks that share a JVM write to the same log files; they only 
have different index files. The same thing happens for what we call the 
TaskCleanup tasks, which are launched for failing/killed tasks.

Thanks,
+Vinod

On Jul 23, 2013, at 10:55 PM, Ajay Srivastava wrote:

> Hi Vinod,
> 
> Thanks. It seems that something else is going on -
> 
> Here is the content of log.index -
> 
> ajay-srivastava:userlogs ajay.srivastava$ cat 
> job_201307222115_0188/attempt_201307222115_0188_r_00_0/log.index
> LOG_DIR:/opt/hadoop/bin/../logs/userlogs/job_201307222115_0188/attempt_201307222115_0188_r_08_0
> stdout:0 0
> stderr:156 0
> syslog:995 166247
> 
> Looks like that the log.index is pointing to another attempt directory.
> Is it doing some kind of optimization ? What is purpose of log.index ?
> 
> 
> Regards,
> Ajay Srivastava
> 
> 
> On 24-Jul-2013, at 11:09 AM, Vinod Kumar Vavilapalli wrote:
> 
>> 
>> It could either mean that all those task-attempts are crashing before the 
>> process itself is getting spawned (check TT logs) or those logs are getting 
>> deleted after the fact. Suspect the earlier.
>> 
>> Thanks,
>> +Vinod
>> 
>> On Jul 23, 2013, at 9:33 AM, Ajay Srivastava wrote:
>> 
>>> Hi,
>>> 
>>> I see that most of the tasks have only log.index created in 
>>> /opt/hadoop/logs/userlogs/jobId/task_attempt directory.
>>> When does this happen ?
>>> Is there a config setting for this OR this is a bug ?
>>> 
>>> 
>>> Regards,
>>> Ajay Srivastava
>> 
> 



Re: Only log.index

2013-07-23 Thread Ajay Srivastava
Yes. That explains it and confirms my guess too :-)

stderr:156 0
syslog:995 166247

What are these numbers? Are they byte offsets into the corresponding files from 
where the logs of this task start?



Regards,
Ajay Srivastava


On 24-Jul-2013, at 12:10 PM, Vinod Kumar Vavilapalli wrote:


Ah, I should've guessed that. You seem to have JVM reuse enabled. Even if JVMs 
are reused, all the tasks write to the same files as they share the JVM. They 
only have different index files. The same thing happens for what we call the 
TaskCleanup tasks which are launched for failing/killed tasks.

Thanks,
+Vinod

On Jul 23, 2013, at 10:55 PM, Ajay Srivastava wrote:

Hi Vinod,

Thanks. It seems that something else is going on -

Here is the content of log.index -

ajay-srivastava:userlogs ajay.srivastava$ cat 
job_201307222115_0188/attempt_201307222115_0188_r_00_0/log.index
LOG_DIR:/opt/hadoop/bin/../logs/userlogs/job_201307222115_0188/attempt_201307222115_0188_r_08_0
stdout:0 0
stderr:156 0
syslog:995 166247

Looks like that the log.index is pointing to another attempt directory.
Is it doing some kind of optimization ? What is purpose of log.index ?


Regards,
Ajay Srivastava


On 24-Jul-2013, at 11:09 AM, Vinod Kumar Vavilapalli wrote:


It could either mean that all those task-attempts are crashing before the 
process itself is getting spawned (check TT logs) or those logs are getting 
deleted after the fact. Suspect the earlier.

Thanks,
+Vinod

On Jul 23, 2013, at 9:33 AM, Ajay Srivastava wrote:

Hi,

I see that most of the tasks have only log.index created in 
/opt/hadoop/logs/userlogs/jobId/task_attempt directory.
When does this happen ?
Is there a config setting for this OR this is a bug ?


Regards,
Ajay Srivastava