About Hadoop

2013-11-25 Thread RajBasha S
Can MapReduce run only on HDFS, or on any other file system as well? Is HDFS mandatory?


Re: About Hadoop

2013-11-25 Thread Nitin Pawar
You don't necessarily have to have HDFS to run MapReduce.

But it's recommended :)




On Mon, Nov 25, 2013 at 3:25 PM, RajBasha S  wrote:

> can Map Reduce will run on HDFS or any other file system ? HDFS is
> Mandatory
>



-- 
Nitin Pawar


Re: About Hadoop

2013-11-28 Thread Adam Kawa
MapReduce is best optimized for HDFS, but you can run MapReduce jobs over
data stored in other file systems, e.g. the local file system (ext3, ext4,
xfs), S3, or CloudStore (formerly Kosmos), to name a few.
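
For illustration, a minimal sketch (the example jar name and the paths/bucket
are hypothetical): the same job can be pointed at a non-HDFS file system simply
by using fully qualified URIs for the input and output paths (S3 access
additionally needs the fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey
properties configured).

hadoop jar hadoop-examples.jar wordcount file:///tmp/input file:///tmp/output
hadoop jar hadoop-examples.jar wordcount s3n://my-bucket/input s3n://my-bucket/output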


2013/11/25 Nitin Pawar 

> you don't necessarily have to have to hdfs to run mapreduce.
>
> But its recommended :)
>
>
>
>
> On Mon, Nov 25, 2013 at 3:25 PM, RajBasha S wrote:
>
>> can Map Reduce will run on HDFS or any other file system ? HDFS is
>> Mandatory
>>
>
>
>
> --
> Nitin Pawar
>


about hadoop upgrade

2014-05-19 Thread ch huang
hi, maillist:
I want to upgrade my cluster. In the doc, one of the steps is to back up the
namenode dfs.namenode.name.dir directory. I have 2 directories defined in
hdfs-site.xml; should I back them both up, or just one of them?


<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///data/namespace/1,file:///data/namespace/2</value>
</property>



Question about Hadoop

2013-08-06 Thread 間々田 剛史
Dear Sir

We are students at Hosei University.
We are studying Hadoop now for research.

We use Hadoop 2.0.0-CDH4.2.1 (MRv2) and its environment is CentOS 6.2.
We can access HDFS from the master and the slaves.
We have some questions.
Master:Hadoop04
Slaves:Hadoop01
   Hadoop02
   Hadoop03


we run the "wordcount" program on "complete distributed mode" ,but it stops
running.
The following is the response for running the program.


13/07/24 16:52:15 INFO input.FileInputFormat: Total input paths to process : 1
13/07/24 16:52:16 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
13/07/24 16:52:16 WARN conf.Configuration: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
13/07/24 16:52:16 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
13/07/24 16:52:16 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
13/07/24 16:52:16 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
13/07/24 16:52:16 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
13/07/24 16:52:16 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
13/07/24 16:52:16 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1374652061495_0002
13/07/24 16:52:16 INFO client.YarnClientImpl: Submitted application application_1374652061495_0002 to ResourceManager at Hadoop04/10.31.185.24:8040
13/07/24 16:52:16 INFO mapreduce.Job: The url to track the job: http://Hadoop04:8088/proxy/application_1374652061495_0002/
13/07/24 16:52:16 INFO mapreduce.Job: Running job: job_1374652061495_0002

It stops running at that point.
Then,
when we access http://Hadoop04:8088/proxy/application_1374652061495_0002/,
it shows "The requested application does not appear to be running yet, and
has not set a tracking URL"

I am looking forward to your reply.

Best wishes,
tadayuki,tsuyoshi,and hideki


about hadoop-2.2.0 "mapred.child.java.opts"

2013-12-03 Thread Henry Hung
Hello,

I have a question.
Is it correct to say that in hadoop-2.2.0, the mapred-site.xml node 
"mapred.child.java.opts" is replaced by two new nodes, "mapreduce.map.java.opts" 
and "mapreduce.reduce.java.opts"?

Best regards,
Henry


The privileged confidential information contained in this email is intended for 
use only by the addressees as indicated by the original sender of this email. 
If you are not the addressee indicated in this email or are not responsible for 
delivery of the email to such a person, please kindly reply to the sender 
indicating this fact and delete all copies of it from your computer and network 
server immediately. Your cooperation is highly appreciated. It is advised that 
any unauthorized use of confidential information of Winbond is strictly 
prohibited; and any information in this email irrelevant to the official 
business of Winbond shall be deemed as neither given nor endorsed by Winbond.


issue about hadoop streaming

2013-12-25 Thread ch huang
hi, maillist:

   I read the doc about Hadoop streaming. Is it possible to construct a job
chain through a pipeline and Hadoop streaming?
If the first job is like this:
first job: hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar
-input /alex/messages -output /alex/stout4 -mapper /bin/cat
-reducer /tmp/mycount.pl -file /tmp/mycount.pl

I want to make the first job's output become the second job's input. If that
is possible, how do I do it? Thanks!
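
For illustration, one rough way to chain the two jobs from a shell script (the
second job's reducer script and output directory below are hypothetical):
because "hadoop jar" blocks until the streaming job finishes, simply run the
jobs one after another and point the second job's -input at the first job's
-output.

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
    -input /alex/messages -output /alex/stout4 \
    -mapper /bin/cat -reducer /tmp/mycount.pl -file /tmp/mycount.pl

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
    -input /alex/stout4 -output /alex/stout5 \
    -mapper /bin/cat -reducer /tmp/secondstep.pl -file /tmp/secondstep.pl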


question about hadoop dfs

2014-01-25 Thread EdwardKing
I use Hadoop 2.2.0 to create a master node and a sub node, as follows:

Live Datanodes : 2
Node    Transferring Address  Last Contact  Admin State  Configured Capacity (GB)  Used (GB)  Non DFS Used (GB)  Remaining (GB)  Used (%)
master  172.11.12.6:50010     1             In Service   16.15                     0.00       2.76               13.39           0.00
node1   172.11.12.7:50010     0             In Service   16.15                     0.00       2.75               13.40           0.00

Then I create an abc.txt file on master 172.11.12.6
[hadoop@master ~]$ pwd
/home/hadoop
[hadoop@master ~]$ echo "This is a test." >> abc.txt
[hadoop@master ~]$ hadoop dfs -copyFromLocal test.txt
[hadoop@master ~]$ hadoop dfs -ls
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

14/01/25 22:07:00 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   2 hadoop supergroup 16 2014-01-25 21:36 abc.txt

[hadoop@master ~]$ rm abc.txt
[hadoop@master ~]$ hadoop dfs -cat abc.txt
This is a test.

My questions are:
1. Is supergroup a directory? Where is it located?
2. I searched for abc.txt on master 172.11.12.6 and node1 172.11.12.7 with the following 
command:
[hadoop@master ~]$ find / -name abc.txt 
But I don't find the abc.txt file. Where is the file abc.txt? After I erase it with the 
rm command, why can I still cat this file? Where is it? My OS is CentOS-5.8.

Thanks.
---
Confidentiality Notice: The information contained in this e-mail and any 
accompanying attachment(s) 
is intended only for the use of the intended recipient and may be confidential 
and/or privileged of 
Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader of 
this communication is 
not the intended recipient, unauthorized use, forwarding, printing,  storing, 
disclosure or copying 
is strictly prohibited, and may be unlawful.If you have received this 
communication in error,please 
immediately notify the sender by return e-mail, and delete the original message 
and all copies from 
your system. Thank you. 
---


Re: about hadoop upgrade

2014-05-19 Thread Harsh J
Backing up one of them is sufficient, but do check whether both their
contents are the same and back up the more recent one (a mount may lag
behind if it had a failure earlier, for example).
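
For example, a minimal sketch using the two directories from the question (the
backup path is hypothetical, and the NameNode should be stopped before taking
the copy for the upgrade):

diff -r /data/namespace/1 /data/namespace/2
tar czf /backup/namenode-meta-$(date +%Y%m%d).tar.gz -C /data/namespace 1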

On Mon, May 19, 2014 at 3:39 PM, ch huang  wrote:
> hi,maillist:
> i want to upgrade my cluster ,in doc,one of step is backup namenode
> dfs.namenode.name.dir directory,i have 2 directories defined in
> hdfs-site.xml,should i backup them all ,or just one of them?
>
> 
> dfs.namenode.name.dir
> file:///data/namespace/1,file:///data/namespace/2
> 



-- 
Harsh J


About hadoop-2.0.5 release

2013-06-11 Thread Ramya S
Hi,
 
When will the stable version of hadoop-2.0.5-alpha be released?
 
 
Thanks & Regards,
Ramya.S


Re: Question about Hadoop

2013-08-06 Thread manish dunani
After checking your error output, I think you entered the wrong map and
reduce classes.

Can you please show me the code?
Then I will tell you exactly where you made the mistake.



On Tue, Aug 6, 2013 at 12:25 PM, 間々田 剛史
wrote:

> Dear Sir
>
> we are students at Hosei University.
> we study hadoop now for reserch.
>
> we use Hadoop2.0.0-CDH4.2.1 MRv2 and its environment is centOS 6.2.
> we can access HDFS from master and slaves.
> We have some questions.
> Master:Hadoop04
> Slaves:Hadoop01
>Hadoop02
>Hadoop03
>
>
> we run the "wordcount" program on "complete distributed mode" ,but it
> stops running.
> The following is the response for running the program.
>
>
> 13/07/24 16:52:15 INFO input.FileInputFormat: Total input paths to process
> : 1
> 13/07/24 16:52:16 WARN conf.Configuration: mapred.job.name is deprecated.
> Instea
> d, use mapreduce.job.name
> 13/07/24 16:52:16 WARN conf.Configuration: mapreduce.reduce.class is
> deprecated.
>  Instead, use mapreduce.job.reduce.class
> 13/07/24 16:52:16 WARN conf.Configuration: mapred.input.dir is deprecated.
> Inste
> ad, use mapreduce.input.fileinputformat.inputdir
> 13/07/24 16:52:16 WARN conf.Configuration: mapred.output.dir is
> deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
> 13/07/24 16:52:16 WARN conf.Configuration: mapred.map.tasks is deprecated.
> Instead, use mapreduce.job.maps
> 13/07/24 16:52:16 WARN conf.Configuration: mapred.output.key.class is
> deprecated. Instead, use mapreduce.job.output.key.class
> 13/07/24 16:52:16 WARN conf.Configuration: mapred.working.dir is
> deprecated. Instead, use mapreduce.job.working.dir
> 13/07/24 16:52:16 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> job_1374652061495_0002
> 13/07/24 16:52:16 INFO client.YarnClientImpl: Submitted application
> application_1374652061495_0002 to ResourceManager at Hadoop04/
> 10.31.185.24:8040
> 13/07/24 16:52:16 INFO mapreduce.Job: The url to track the job:
> http://Hadoop04:
> 8088/proxy/application_1374652061495_0002/
> 13/07/24 16:52:16 INFO mapreduce.Job: Running job: job_1374652061495_0002
>
> it stops running at the point.
> then,
> when we access http://Hadoop04:8088/proxy/application_1374652061495_0002/,
> it shows "The requested application does not appear to be running yet, and
> has not set a tracking URL"
>
> I am looking forward to your reply.
>
> Best wishes,
> tadayuki,tsuyoshi,and hideki
>



-- 
MANISH DUNANI
-THANX
+91 9426881954,+91 8460656443
manishd...@gmail.com


Re: Question about Hadoop

2013-08-06 Thread Tatsuo Kawasaki
Hi Tsuyoshi,

Did you run the "wordcount" sample in hadoop-examples.jar?
Can you share the command that you ran?
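
For reference, a typical invocation on CDH4 (MRv2) looks roughly like the
following; the jar path and the HDFS input/output directories are assumptions:

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount /user/hadoop/input /user/hadoop/output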

Thanks,
--
Tatsuo


On Tue, Aug 6, 2013 at 3:55 PM, 間々田 剛史
wrote:

> Dear Sir
>
> we are students at Hosei University.
> we study hadoop now for reserch.
>
> we use Hadoop2.0.0-CDH4.2.1 MRv2 and its environment is centOS 6.2.
> we can access HDFS from master and slaves.
> We have some questions.
> Master:Hadoop04
> Slaves:Hadoop01
>Hadoop02
>Hadoop03
>
>
> we run the "wordcount" program on "complete distributed mode" ,but it
> stops running.
> The following is the response for running the program.
>
>
> 13/07/24 16:52:15 INFO input.FileInputFormat: Total input paths to process
> : 1
> 13/07/24 16:52:16 WARN conf.Configuration: mapred.job.name is deprecated.
> Instea
> d, use mapreduce.job.name
> 13/07/24 16:52:16 WARN conf.Configuration: mapreduce.reduce.class is
> deprecated.
>  Instead, use mapreduce.job.reduce.class
> 13/07/24 16:52:16 WARN conf.Configuration: mapred.input.dir is deprecated.
> Inste
> ad, use mapreduce.input.fileinputformat.inputdir
> 13/07/24 16:52:16 WARN conf.Configuration: mapred.output.dir is
> deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
> 13/07/24 16:52:16 WARN conf.Configuration: mapred.map.tasks is deprecated.
> Instead, use mapreduce.job.maps
> 13/07/24 16:52:16 WARN conf.Configuration: mapred.output.key.class is
> deprecated. Instead, use mapreduce.job.output.key.class
> 13/07/24 16:52:16 WARN conf.Configuration: mapred.working.dir is
> deprecated. Instead, use mapreduce.job.working.dir
> 13/07/24 16:52:16 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> job_1374652061495_0002
> 13/07/24 16:52:16 INFO client.YarnClientImpl: Submitted application
> application_1374652061495_0002 to ResourceManager at Hadoop04/
> 10.31.185.24:8040
> 13/07/24 16:52:16 INFO mapreduce.Job: The url to track the job:
> http://Hadoop04:
> 8088/proxy/application_1374652061495_0002/
> 13/07/24 16:52:16 INFO mapreduce.Job: Running job: job_1374652061495_0002
>
> it stops running at the point.
> then,
> when we access http://Hadoop04:8088/proxy/application_1374652061495_0002/,
> it shows "The requested application does not appear to be running yet, and
> has not set a tracking URL"
>
> I am looking forward to your reply.
>
> Best wishes,
> tadayuki,tsuyoshi,and hideki
>



-- 
--
Tatsuo Kawasaki
tat...@cloudera.com


Re: Question about Hadoop

2013-08-06 Thread yypvsxf19870706
Hi
   You need to check your ResourceManager log and the logs of the containers
allocated by your RM.
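
For example (the log path is an assumption; the application id comes from the
output above, and "yarn logs" needs log aggregation enabled):

less /var/log/hadoop-yarn/yarn-yarn-resourcemanager-Hadoop04.log
yarn logs -applicationId application_1374652061495_0002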
  

Sent from my iPhone

On 2013-8-6, at 15:30, manish dunani  wrote:

> After checking ur error code.
> I think u entered wrong map and reduce class. 
> 
> can u pls show me code??
> Then i will tell u correctly where u did the mistake..
> 
> 
> 
> On Tue, Aug 6, 2013 at 12:25 PM, 間々田 剛史 
>  wrote:
>> Dear Sir
>> 
>> we are students at Hosei University.
>> we study hadoop now for reserch.
>> 
>> we use Hadoop2.0.0-CDH4.2.1 MRv2 and its environment is centOS 6.2.
>> we can access HDFS from master and slaves.
>> We have some questions.
>> Master:Hadoop04
>> Slaves:Hadoop01
>>Hadoop02
>>Hadoop03
>> 
>> 
>> we run the "wordcount" program on "complete distributed mode" ,but it stops 
>> running.
>> The following is the response for running the program.
>> 
>> 
>> 13/07/24 16:52:15 INFO input.FileInputFormat: Total input paths to process : 
>> 1
>> 13/07/24 16:52:16 WARN conf.Configuration: mapred.job.name is deprecated. 
>> Instea
>> d, use mapreduce.job.name
>> 13/07/24 16:52:16 WARN conf.Configuration: mapreduce.reduce.class is 
>> deprecated.
>>  Instead, use mapreduce.job.reduce.class
>> 13/07/24 16:52:16 WARN conf.Configuration: mapred.input.dir is deprecated. 
>> Inste
>> ad, use mapreduce.input.fileinputformat.inputdir
>> 13/07/24 16:52:16 WARN conf.Configuration: mapred.output.dir is deprecated. 
>> Instead, use mapreduce.output.fileoutputformat.outputdir
>> 13/07/24 16:52:16 WARN conf.Configuration: mapred.map.tasks is deprecated. 
>> Instead, use mapreduce.job.maps
>> 13/07/24 16:52:16 WARN conf.Configuration: mapred.output.key.class is 
>> deprecated. Instead, use mapreduce.job.output.key.class
>> 13/07/24 16:52:16 WARN conf.Configuration: mapred.working.dir is deprecated. 
>> Instead, use mapreduce.job.working.dir
>> 13/07/24 16:52:16 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
>> job_1374652061495_0002
>> 13/07/24 16:52:16 INFO client.YarnClientImpl: Submitted application 
>> application_1374652061495_0002 to ResourceManager at 
>> Hadoop04/10.31.185.24:8040
>> 13/07/24 16:52:16 INFO mapreduce.Job: The url to track the job: 
>> http://Hadoop04:
>> 8088/proxy/application_1374652061495_0002/
>> 13/07/24 16:52:16 INFO mapreduce.Job: Running job: job_1374652061495_0002
>> 
>> it stops running at the point.
>> then, 
>> when we access http://Hadoop04:8088/proxy/application_1374652061495_0002/,
>> it shows "The requested application does not appear to be running yet, and 
>> has not set a tracking URL"
>> 
>> I am looking forward to your reply. 
>> 
>> Best wishes,
>> tadayuki,tsuyoshi,and hideki
> 
> 
> 
> -- 
> MANISH DUNANI
> -THANX
> +91 9426881954,+91 8460656443
> manishd...@gmail.com


question about hadoop HA

2013-08-15 Thread ch huang
hi, all
I have a question that I cannot answer by myself; I hope anyone can help.
If I do not set up HA, the client can query DNS to get the HDFS entry point, but if I
set up NameNode HA, how does the client know which host it should talk to?
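
For context, an HA client does not talk to a fixed host: it addresses a
logical nameservice, and a failover proxy provider finds the active NameNode.
A rough sketch of the client-side hdfs-site.xml (the nameservice and NameNode
ids are hypothetical; fs.defaultFS itself goes into core-site.xml as
hdfs://mycluster):

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>namenode1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>namenode2:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>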


issues about hadoop-0.20.0

2015-07-18 Thread longfei li
Hello!
I built a Hadoop cluster of 12 nodes based on ARM (Cubietruck). I ran the simple 
wordcount program to count how many occurrences of "h" there are in "hello", and it 
runs perfectly. But when I run a multi-step program like pi, I run it like this:
./hadoop jar hadoop-example-0.21.0.jar pi 100 10
and get the following information:
15/07/18 11:38:54 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in 
the classpath. Usage of hadoop-site.xml is deprecated. Instead use 
core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of 
core-default.xml, mapred-default.xml and hdfs-default.xml respectively
15/07/18 11:38:54 INFO security.Groups: Group mapping 
impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=30
15/07/18 11:38:55 WARN conf.Configuration: mapred.task.id is deprecated. 
Instead, use mapreduce.task.attempt.id
15/07/18 11:38:55 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for 
parsing the arguments. Applications should implement Tool for the same.
15/07/18 11:38:55 INFO input.FileInputFormat: Total input paths to process : 1
15/07/18 11:38:58 WARN conf.Configuration: mapred.map.tasks is deprecated. 
Instead, use mapreduce.job.maps
15/07/18 11:38:58 INFO mapreduce.JobSubmitter: number of splits:1
15/07/18 11:38:58 INFO mapreduce.JobSubmitter: adding the following namenodes' 
delegation tokens:null
15/07/18 11:38:59 INFO mapreduce.Job: Running job: job_201507181137_0001
15/07/18 11:39:00 INFO mapreduce.Job:  map 0% reduce 0%
15/07/18 11:39:20 INFO mapreduce.Job:  map 100% reduce 0%
15/07/18 11:39:35 INFO mapreduce.Job:  map 100% reduce 10%
15/07/18 11:39:36 INFO mapreduce.Job:  map 100% reduce 20%
15/07/18 11:39:38 INFO mapreduce.Job:  map 100% reduce 90%
15/07/18 11:39:58 INFO mapreduce.Job:  map 100% reduce 100%
15/07/18 11:49:47 INFO mapreduce.Job:  map 100% reduce 89%
15/07/18 11:49:49 INFO mapreduce.Job:  map 100% reduce 19%
15/07/18 11:49:54 INFO mapreduce.Job: Task Id : 
attempt_201507181137_0001_r_00_0, Status : FAILED
Task attempt_201507181137_0001_r_00_0 failed to report status for 602 
seconds. Killing!
15/07/18 11:49:57 WARN mapreduce.Job: Error reading task outputhadoop-slave7
15/07/18 11:49:57 WARN mapreduce.Job: Error reading task outputhadoop-slave7
15/07/18 11:49:58 INFO mapreduce.Job: Task Id : 
attempt_201507181137_0001_r_02_0, Status : FAILED
Task attempt_201507181137_0001_r_02_0 failed to report status for 601 
seconds. Killing!
15/07/18 11:50:00 WARN mapreduce.Job: Error reading task outputhadoop-slave5
15/07/18 11:50:00 WARN mapreduce.Job: Error reading task outputhadoop-slave5
15/07/18 11:50:00 INFO mapreduce.Job: Task Id : 
attempt_201507181137_0001_r_03_0, Status : FAILED
Task attempt_201507181137_0001_r_03_0 failed to report status for 601 
seconds. Killing!
15/07/18 11:50:03 WARN mapreduce.Job: Error reading task outputhadoop-slave12
15/07/18 11:50:03 WARN mapreduce.Job: Error reading task outputhadoop-slave12
15/07/18 11:50:03 INFO mapreduce.Job: Task Id : 
attempt_201507181137_0001_r_04_0, Status : FAILED
Task attempt_201507181137_0001_r_04_0 failed to report status for 601 
seconds. Killing!
15/07/18 11:50:06 WARN mapreduce.Job: Error reading task outputhadoop-slave8
15/07/18 11:50:06 WARN mapreduce.Job: Error reading task outputhadoop-slave8
15/07/18 11:50:06 INFO mapreduce.Job: Task Id : 
attempt_201507181137_0001_r_07_0, Status : FAILED
Task attempt_201507181137_0001_r_07_0 failed to report status for 601 
seconds. Killing!
15/07/18 11:50:08 WARN mapreduce.Job: Error reading task outputhadoop-slave11
15/07/18 11:50:08 WARN mapreduce.Job: Error reading task outputhadoop-slave11
15/07/18 11:50:08 INFO mapreduce.Job: Task Id : 
attempt_201507181137_0001_r_08_0, Status : FAILED
Task attempt_201507181137_0001_r_08_0 failed to report status for 601 
seconds. Killing!
15/07/18 11:50:11 WARN mapreduce.Job: Error reading task outputhadoop-slave9
15/07/18 11:50:11 WARN mapreduce.Job: Error reading task outputhadoop-slave9
15/07/18 11:50:11 INFO mapreduce.Job: Task Id : 
attempt_201507181137_0001_r_09_0, Status : FAILED
Task attempt_201507181137_0001_r_09_0 failed to report status for 601 
seconds. Killing!
15/07/18 11:50:13 WARN mapreduce.Job: Error reading task outputhadoop-slave4
15/07/18 11:50:13 WARN mapreduce.Job: Error reading task outputhadoop-slave4
15/07/18 11:50:13 INFO mapreduce.Job: Task Id : 
attempt_201507181137_0001_r_06_0, Status : FAILED
Task attempt_201507181137_0001_r_06_0 failed to report status for 601 
seconds. Killing!
15/07/18 11:50:16 WARN mapreduce.Job: Error reading task outputhadoop-slave6
15/07/18 11:50:16 WARN mapreduce.Job: Error reading task outputhadoop-slave6
15/07/18 11:50:16 INFO mapreduce.Job: Task Id : 
attempt_201507181137_0001_r_05_0, Status : FAILED
Task attempt_201507181137_0001_r_05_0 failed to report status for 601 
seconds. Killing!
15/07/18 11:50:18 WARN mapreduce.Job: Error reading task outputhadoop-slave1
15/07/18 11:50:18 WARN mapreduce.

Questions about hadoop-metrics2.properties

2013-10-23 Thread Benyi Wang
1. Does hadoop metrics2 only support the File and Ganglia sinks?
2. Can I expose metrics over JMX, especially customized metrics? I
created some metrics in my mapreduce job and could successfully output
them using a FileSink. But if I use jconsole to access the YARN nodemanager, I
can only see Hadoop metrics, e.g. Hadoop/NodeManager/NodeManagerMetrics,
not mine with the maptask prefix. How do I set things up to see maptask/reducetask
prefix metrics?
3. Is there an example using jmx? I could not find one.

The configuration syntax is:

  [prefix].[source|sink|jmx|].[instance].[option]

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html
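
For example, a rough hadoop-metrics2.properties sketch (the file name and
period are only illustrative) that attaches a FileSink instance to the maptask
prefix:

*.period=10
maptask.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
maptask.sink.file.filename=maptask-metrics.out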


Re: about hadoop-2.2.0 "mapred.child.java.opts"

2013-12-03 Thread Harsh J
Yes but the old property is yet to be entirely removed (removal of
configs is graceful).

These properties were introduced to provide a more fine-tuned way to
configure each type of task separately, but the older value continues
to be accepted if present; the current behaviour is that if the MR
runtime finds mapred.child.java.opts configured, it will override
values of mapreduce.map|reduce.java.opts configs. To configure
mapreduce.map|reduce.java.opts therefore, you should make sure you
aren't passing mapred.child.java.opts (which is also no longer in the
mapred-default.xml intentionally).
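
For example, a minimal mapred-site.xml sketch (the heap sizes are just
placeholders):

<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx2048m</value>
</property>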

On Wed, Dec 4, 2013 at 12:56 PM, Henry Hung  wrote:
> Hello,
>
>
>
> I have a question.
>
> Is it correct to say that in hadoop-2.2.0, the mapred-site.xml node
> "mapred.child.java.opts” is replaced by two new node
> “mapreduce.map.java.opts” and “mapreduce.reduce.java.opts”?
>
>
>
> Best regards,
>
> Henry
>
>
> 
> The privileged confidential information contained in this email is intended
> for use only by the addressees as indicated by the original sender of this
> email. If you are not the addressee indicated in this email or are not
> responsible for delivery of the email to such a person, please kindly reply
> to the sender indicating this fact and delete all copies of it from your
> computer and network server immediately. Your cooperation is highly
> appreciated. It is advised that any unauthorized use of confidential
> information of Winbond is strictly prohibited; and any information in this
> email irrelevant to the official business of Winbond shall be deemed as
> neither given nor endorsed by Winbond.



-- 
Harsh J


Re: about hadoop-2.2.0 "mapred.child.java.opts"

2013-12-04 Thread Harsh J
Actually, it's the other way around (thanks Sandy for catching this
error in my post). The presence of mapreduce.map|reduce.java.opts
overrides mapred.child.java.opts, not the other way round as I had
stated earlier (below).

On Wed, Dec 4, 2013 at 1:28 PM, Harsh J  wrote:
> Yes but the old property is yet to be entirely removed (removal of
> configs is graceful).
>
> These properties were introduced to provide more fine-tuned way to
> configure each type of task separately, but the older value continues
> to be accepted if present; the current behaviour is that if the MR
> runtime finds mapred.child.java.opts configured, it will override
> values of mapreduce.map|reduce.java.opts configs. To configure
> mapreduce.map|reduce.java.opts therefore, you should make sure you
> aren't passing mapred.child.java.opts (which is also no longer in the
> mapred-default.xml intentionally).
>
> On Wed, Dec 4, 2013 at 12:56 PM, Henry Hung  wrote:
>> Hello,
>>
>>
>>
>> I have a question.
>>
>> Is it correct to say that in hadoop-2.2.0, the mapred-site.xml node
>> "mapred.child.java.opts” is replaced by two new node
>> “mapreduce.map.java.opts” and “mapreduce.reduce.java.opts”?
>>
>>
>>
>> Best regards,
>>
>> Henry
>>
>>
>> 
>> The privileged confidential information contained in this email is intended
>> for use only by the addressees as indicated by the original sender of this
>> email. If you are not the addressee indicated in this email or are not
>> responsible for delivery of the email to such a person, please kindly reply
>> to the sender indicating this fact and delete all copies of it from your
>> computer and network server immediately. Your cooperation is highly
>> appreciated. It is advised that any unauthorized use of confidential
>> information of Winbond is strictly prohibited; and any information in this
>> email irrelevant to the official business of Winbond shall be deemed as
>> neither given nor endorsed by Winbond.
>
>
>
> --
> Harsh J



-- 
Harsh J


RE: about hadoop-2.2.0 "mapred.child.java.opts"

2013-12-04 Thread Henry Hung
@Harsh J

Thank you, I intend to upgrade from Hadoop 1.0.4 and this kind of information 
is very helpful.

Best regards,
Henry

-Original Message-
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Wednesday, December 04, 2013 4:20 PM
To: 
Subject: Re: about hadoop-2.2.0 "mapred.child.java.opts"

Actually, its the other way around (thanks Sandy for catching this error in my 
post). The presence of mapreduce.map|reduce.java.opts overrides 
mapred.child.java.opts, not the other way round as I had stated earlier (below).

On Wed, Dec 4, 2013 at 1:28 PM, Harsh J  wrote:
> Yes but the old property is yet to be entirely removed (removal of
> configs is graceful).
>
> These properties were introduced to provide more fine-tuned way to
> configure each type of task separately, but the older value continues
> to be accepted if present; the current behaviour is that if the MR
> runtime finds mapred.child.java.opts configured, it will override
> values of mapreduce.map|reduce.java.opts configs. To configure
> mapreduce.map|reduce.java.opts therefore, you should make sure you
> aren't passing mapred.child.java.opts (which is also no longer in the
> mapred-default.xml intentionally).
>
> On Wed, Dec 4, 2013 at 12:56 PM, Henry Hung  wrote:
>> Hello,
>>
>>
>>
>> I have a question.
>>
>> Is it correct to say that in hadoop-2.2.0, the mapred-site.xml node
>> "mapred.child.java.opts" is replaced by two new node
>> "mapreduce.map.java.opts" and "mapreduce.reduce.java.opts"?
>>
>>
>>
>> Best regards,
>>
>> Henry
>>
>>
>> 
>> The privileged confidential information contained in this email is
>> intended for use only by the addressees as indicated by the original
>> sender of this email. If you are not the addressee indicated in this
>> email or are not responsible for delivery of the email to such a
>> person, please kindly reply to the sender indicating this fact and
>> delete all copies of it from your computer and network server
>> immediately. Your cooperation is highly appreciated. It is advised
>> that any unauthorized use of confidential information of Winbond is
>> strictly prohibited; and any information in this email irrelevant to
>> the official business of Winbond shall be deemed as neither given nor 
>> endorsed by Winbond.
>
>
>
> --
> Harsh J



--
Harsh J

The privileged confidential information contained in this email is intended for 
use only by the addressees as indicated by the original sender of this email. 
If you are not the addressee indicated in this email or are not responsible for 
delivery of the email to such a person, please kindly reply to the sender 
indicating this fact and delete all copies of it from your computer and network 
server immediately. Your cooperation is highly appreciated. It is advised that 
any unauthorized use of confidential information of Winbond is strictly 
prohibited; and any information in this email irrelevant to the official 
business of Winbond shall be deemed as neither given nor endorsed by Winbond.


Re: question about hadoop dfs

2014-01-25 Thread Jeff Zhang
1. Is supergroup a directory? Where is it located?
supergroup is a user group rather than a directory, just like a user group
in Linux.

2. I searched for abc.txt on master 172.11.12.6 and node1 172.11.12.7 with the
following command:
The metadata (file name, file path, and block locations) is kept on the master
(the NameNode); the file data itself is stored on the datanodes.
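
For illustration (the data directory below comes from the follow-up mail;
actual block ids will differ): the datanode stores the file's bytes as
numbered block files, not under the original name, which is why
"find / -name abc.txt" finds nothing. The local "rm abc.txt" only removed the
copy under /home/hadoop; the HDFS copy is removed with the fs shell instead.

[hadoop@node1 ~]$ find /home/software/data -name 'blk_*'
[hadoop@master ~]$ hadoop fs -rm abc.txt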



On Sun, Jan 26, 2014 at 2:22 PM, EdwardKing  wrote:

> I use Hadoop2.2.0 to create a master node and a sub node,like follows:
>
> Live Datanodes : 2
> Node  Transferring Address  Last Contact  Admin State  Configured Capacity
> (GB)  Used(GB)  Non DFS Used (GB)  Remaining(GB)  Used(%)
> master 172.11.12.6:50010 1In Service
> 16.15  0.00  2.76
>  13.39 0.00
> node1 172.11.12.7:50010 0 In Service
> 16.15 0.00   2.75
>  13.40 0.00
>
> Then I create a abc.txt file on master 172.11.12.6
> [hadoop@master ~]$ pwd
> /home/hadoop
> [hadoop@master ~]$ echo "This is a test." >> abc.txt
> [hadoop@master ~]$ hadoop dfs -copyFromLocal test.txt
> [hadoop@master ~]$ hadoop dfs -ls
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
>
> 14/01/25 22:07:00 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> Found 1 items
> -rw-r--r--   2 hadoop supergroup 16 2014-01-25 21:36 abc.txt
>
> [hadoop@master ~]$ rm abc.txt
> [hadoop@master ~]$ hadoop dfs -cat abc.txt
> This is a test.
>
> My question is:
> 1. Is supergroup a directory?  Where does it locate?
> 2. I search abc.txt on master 172.11.12.6 and node1 172.11.12.7 by
> following command:
> [hadoop@master ~]$ find / -name abc.txt
> But I don't find abc.txt file. Where is the file abc.txt? After I erase it
> by rm command, I still cat this file? Where is it? My OS is CentOS-5.8.
>
> Thanks.
>
> ---
> Confidentiality Notice: The information contained in this e-mail and any
> accompanying attachment(s)
> is intended only for the use of the intended recipient and may be
> confidential and/or privileged of
> Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader
> of this communication is
> not the intended recipient, unauthorized use, forwarding, printing,
>  storing, disclosure or copying
> is strictly prohibited, and may be unlawful.If you have received this
> communication in error,please
> immediately notify the sender by return e-mail, and delete the original
> message and all copies from
> your system. Thank you.
>
> ---
>


Re: question about hadoop dfs

2014-01-25 Thread EdwardKing
hdfs-site.xml is as follows:


<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>file:/home/software/name</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:9001</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:/home/software/data</value>
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>master:9002</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>1073741824</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>



[root@master ~]# cd /home
[root@master home]# cd software/
[root@master software]# ls
data   hadoop-2.2.0 jdk1.7.0_02name  test.txt
file:  hadoop-2.2.0.tar.gz  jdk-7u2-linux-i586.tar.gz  temp  tmp

[root@master name]# pwd
/home/software/name
[root@master name]# ls
current  in_use.lock
[root@master name]# 

[root@master software]# pwd
/home/software
[root@master software]# cd data
[root@master data]# ls
current  in_use.lock

>> the meadata(file name, file path and block location) is in master, the file 
>> data itself is in datanode.

Where can I find the abc.txt metadata, such as the file name, file path and block 
location? And is the abc.txt file data itself on master 172.11.12.6 or node1 
172.11.12.7, and in which directory is it located?

Thanks.



- Original Message - 
From: Jeff Zhang 
To: user@hadoop.apache.org 
Sent: Sunday, January 26, 2014 2:30 PM
Subject: Re: question about hadoop dfs


1. Is supergroup a directory?  Where does it locate?
supergroup is user group rather than directory just like the user group of 
linux


2. I search abc.txt on master 172.11.12.6 and node1 172.11.12.7 by following 
command:
the meadata(file name, file path and block location) is in master, the file 
data itself is in datanode.





On Sun, Jan 26, 2014 at 2:22 PM, EdwardKing  wrote:

I use Hadoop2.2.0 to create a master node and a sub node,like follows:

Live Datanodes : 2
Node  Transferring Address  Last Contact  Admin State  Configured Capacity (GB) 
 Used(GB)  Non DFS Used (GB)  Remaining(GB)  Used(%)
master 172.11.12.6:50010 1In Service   16.15
  0.00  2.7613.39   
  0.00
node1 172.11.12.7:50010 0 In Service   16.15
 0.00   2.7513.40   
  0.00

Then I create a abc.txt file on master 172.11.12.6
[hadoop@master ~]$ pwd
/home/hadoop
[hadoop@master ~]$ echo "This is a test." >> abc.txt
[hadoop@master ~]$ hadoop dfs -copyFromLocal test.txt
[hadoop@master ~]$ hadoop dfs -ls
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

14/01/25 22:07:00 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   2 hadoop supergroup 16 2014-01-25 21:36 abc.txt

[hadoop@master ~]$ rm abc.txt
[hadoop@master ~]$ hadoop dfs -cat abc.txt
This is a test.

My question is:
1. Is supergroup a directory?  Where does it locate?
2. I search abc.txt on master 172.11.12.6 and node1 172.11.12.7 by following 
command:
[hadoop@master ~]$ find / -name abc.txt
But I don't find abc.txt file. Where is the file abc.txt? After I erase it by 
rm command, I still cat this file? Where is it? My OS is CentOS-5.8.

Thanks.
---
Confidentiality Notice: The information contained in this e-mail and any 
accompanying attachment(s)
is intended only for the use of the intended recipient and may be confidential 
and/or privileged of
Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader of 
this communication is
not the intended recipient, unauthorized use, forwarding, printing,  storing, 
disclosure or copying
is strictly prohibited, and may be unlawful.If you have received this 
communication in error,please
immediately notify the sender by return e-mail, and delete the original message 
and all copies from
your system. Thank you.
---


Re: question about hadoop dfs

2014-01-27 Thread Jeff Zhang
You can use the fsck command to find the block locations. Here's one example:

hadoop fsck /user/hadoop/graph_data.txt -blocks -locations -files



On Sun, Jan 26, 2014 at 2:48 PM, EdwardKing  wrote:

> hdfs-site.xm is follows:
> 
> 
> dfs.name.dir
> file:/home/software/name
>  
> 
> 
> dfs.namenode.secondary.http-address
> master:9001
> 
> 
> dfs.data.dir
> file:/home/software/data
> 
> 
> dfs.http.address
> master:9002
> 
> 
> dfs.replication
> 2
> 
> 
> dfs.datanode.du.reserved
> 1073741824
> 
> 
> dfs.block.size
> 134217728
> 
> 
> dfs.permissions
> false
> 
> 
>
> [root@master ~]# cd /home
> [root@master home]# cd software/
> [root@master software]# ls
> data   hadoop-2.2.0 jdk1.7.0_02name  test.txt
> file:  hadoop-2.2.0.tar.gz  jdk-7u2-linux-i586.tar.gz  temp  tmp
>
> [root@master name]# pwd
> /home/software/name
> [root@master name]# ls
> current  in_use.lock
> [root@master name]#
>
> [root@master software]# pwd
> /home/software
> [root@master software]# cd data
> [root@master data]# ls
> current  in_use.lock
>
> >> the meadata(file name, file path and block location) is in master, the
> file data itself is in datanode.
>
> Where I can find abc.txt meadata,such as file name, file path and block
> location?  The abc.txt file data itself is in  master 172.11.12.6 or node1
> 172.11.12.7,which directory it locate?
>
> Thanks.
>
>
>
> - Original Message -
> From: Jeff Zhang
> To: user@hadoop.apache.org
> Sent: Sunday, January 26, 2014 2:30 PM
> Subject: Re: question about hadoop dfs
>
>
> 1. Is supergroup a directory?  Where does it locate?
> supergroup is user group rather than directory just like the user
> group of linux
>
>
> 2. I search abc.txt on master 172.11.12.6 and node1 172.11.12.7 by
> following command:
> the meadata(file name, file path and block location) is in master, the
> file data itself is in datanode.
>
>
>
>
>
> On Sun, Jan 26, 2014 at 2:22 PM, EdwardKing  wrote:
>
> I use Hadoop2.2.0 to create a master node and a sub node,like follows:
>
> Live Datanodes : 2
> Node  Transferring Address  Last Contact  Admin State  Configured Capacity
> (GB)  Used(GB)  Non DFS Used (GB)  Remaining(GB)  Used(%)
> master 172.11.12.6:50010 1In Service
> 16.15  0.00  2.76
>  13.39 0.00
> node1 172.11.12.7:50010 0 In Service
> 16.15 0.00   2.75
>  13.40 0.00
>
> Then I create a abc.txt file on master 172.11.12.6
> [hadoop@master ~]$ pwd
> /home/hadoop
> [hadoop@master ~]$ echo "This is a test." >> abc.txt
> [hadoop@master ~]$ hadoop dfs -copyFromLocal test.txt
> [hadoop@master ~]$ hadoop dfs -ls
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
>
> 14/01/25 22:07:00 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> Found 1 items
> -rw-r--r--   2 hadoop supergroup 16 2014-01-25 21:36 abc.txt
>
> [hadoop@master ~]$ rm abc.txt
> [hadoop@master ~]$ hadoop dfs -cat abc.txt
> This is a test.
>
> My question is:
> 1. Is supergroup a directory?  Where does it locate?
> 2. I search abc.txt on master 172.11.12.6 and node1 172.11.12.7 by
> following command:
> [hadoop@master ~]$ find / -name abc.txt
> But I don't find abc.txt file. Where is the file abc.txt? After I erase it
> by rm command, I still cat this file? Where is it? My OS is CentOS-5.8.
>
> Thanks.
>
> ---
> Confidentiality Notice: The information contained in this e-mail and any
> accompanying attachment(s)
> is intended only for the use of the intended recipient and may be
> confidential and/or privileged of
> Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader
> of this communication is
> not the intended recipient, unauthorized use, forwarding, printing,
>  storing, disclosure or copying
> is strictly prohibited, and may be unlawful.If you have received this
> communication in error,please
> immediately notify the sender by return e-mail, and delete the original
> message and all copies from
> your system. Thank you.
>
> ---
>
> ---

Re: About Hadoop Deb file

2013-02-20 Thread Chris Embree
Jokingly I want to say the problem is that you selected Ubuntu (or any
other Debian based Linux) as your platform.

On a more serious note, if you are new to both Linux and Hadoop, you might
be much better off to select CentOS for your Linux as that is the base
development platform for most contributors.

Yes, I am very biased toward RPM based Linux Distributions.  YMMV. :)

On Wed, Feb 20, 2013 at 11:56 PM, Mayur Patil wrote:

> Hello,
>
>I am using Ubuntu 12.04 Desktop.
>
>I had downloaded hadoop-1.1.1-1.deb file and check with md5 check-sum
> it says verified and OK.
>
>But when I try to install on Ubuntu it gives warning
>
>
> *Package is of bad quality.*
>>
>> *This could cause serious problems on your computer*
>>
>> *Lintian check results for /media/abc/hadoop_1.1.1-1_i386.deb:*
>> *Use of uninitialized value $ENV{"HOME"} in concatenation (.) or string
>> at /usr/bin/lintian line 108.*
>> *E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/bin/task-controller*
>> *E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhadoop.a*
>> *E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhadoop.so*
>> *E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhadoop.so.1*
>> *E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhadoop.so.1.0.0*
>> *E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhadooppipes.a*
>> *E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhadooputils.a*
>> *E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhdfs.a*
>> *E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhdfs.so*
>> *E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhdfs.so.0*
>> *E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhdfs.so.0.0.0*
>> *E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/libexec/jsvc.i386 *
>>
>
>   What's gone wrong ??
>
>   Thanks !!
> --
> *Cheers,
> Mayur*.


Re: About Hadoop Deb file

2013-02-20 Thread Harsh J
Try the debs from the Apache Bigtop project 0.3 release; it's a bit of
an older 1.x release but the debs would work well:
http://archive.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/

On Thu, Feb 21, 2013 at 10:26 AM, Mayur Patil  wrote:
> Hello,
>
>I am using Ubuntu 12.04 Desktop.
>
>I had downloaded hadoop-1.1.1-1.deb file and check with md5 check-sum it
> says verified and OK.
>
>But when I try to install on Ubuntu it gives warning
>
>
>> Package is of bad quality.
>>
>> This could cause serious problems on your computer
>>
>> Lintian check results for /media/abc/hadoop_1.1.1-1_i386.deb:
>> Use of uninitialized value $ENV{"HOME"} in concatenation (.) or string at
>> /usr/bin/lintian line 108.
>> E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/bin/task-controller
>> E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhadoop.a
>> E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhadoop.so
>> E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhadoop.so.1
>> E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhadoop.so.1.0.0
>> E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhadooppipes.a
>> E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhadooputils.a
>> E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhdfs.a
>> E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhdfs.so
>> E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhdfs.so.0
>> E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/lib/libhdfs.so.0.0.0
>> E: hadoop: arch-independent-package-contains-binary-or-object
>> usr/libexec/jsvc.i386
>
>
>   What's gone wrong ??
>
>   Thanks !!
> --
> Cheers,
> Mayur.



--
Harsh J


Re: About Hadoop Deb file

2013-02-21 Thread Jean-Marc Spaggiari
Hi Mayur,

Where have you downloaded the DEB files? Are they Debian related? Or
Ubuntu related? Ubuntu is not worse than CentOS. They are just different
choices. Both should work.

JM

Hi Ma
2013/2/21 Harsh J 

> Try the debs from the Apache Bigtop project 0.3 release, its a bit of
> an older 1.x release but the debs would work well:
>
> http://archive.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/
>
> On Thu, Feb 21, 2013 at 10:26 AM, Mayur Patil 
> wrote:
> > Hello,
> >
> >I am using Ubuntu 12.04 Desktop.
> >
> >I had downloaded hadoop-1.1.1-1.deb file and check with md5 check-sum
> it
> > says verified and OK.
> >
> >But when I try to install on Ubuntu it gives warning
> >
> >
> >> Package is of bad quality.
> >>
> >> This could cause serious problems on your computer
> >>
> >> Lintian check results for /media/abc/hadoop_1.1.1-1_i386.deb:
> >> Use of uninitialized value $ENV{"HOME"} in concatenation (.) or string
> at
> >> /usr/bin/lintian line 108.
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/bin/task-controller
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhadoop.a
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhadoop.so
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhadoop.so.1
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhadoop.so.1.0.0
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhadooppipes.a
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhadooputils.a
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhdfs.a
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhdfs.so
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhdfs.so.0
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhdfs.so.0.0.0
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/libexec/jsvc.i386
> >
> >
> >   What's gone wrong ??
> >
> >   Thanks !!
> > --
> > Cheers,
> > Mayur.
>
>
>
> --
> Harsh J
>


Re: About Hadoop Deb file

2013-02-21 Thread Azuryy Yu
Hi JM,
I am just curious, how did you test that Ubuntu is not worse than CentOS?
Thanks.


On Fri, Feb 22, 2013 at 5:31 AM, Jean-Marc Spaggiari <
jean-m...@spaggiari.org> wrote:

> Hi Mayur,
>
> Where have you downloaded the DEB files? Are they Debian related? Or
> Unbuntu related? Unbuntu is not worst than CentOS. They are just different
> choices. Both should work.
>
> JM
>
> Hi Ma
> 2013/2/21 Harsh J 
>
>> Try the debs from the Apache Bigtop project 0.3 release, its a bit of
>> an older 1.x release but the debs would work well:
>>
>> http://archive.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/
>>
>> On Thu, Feb 21, 2013 at 10:26 AM, Mayur Patil 
>> wrote:
>> > Hello,
>> >
>> >I am using Ubuntu 12.04 Desktop.
>> >
>> >I had downloaded hadoop-1.1.1-1.deb file and check with md5
>> check-sum it
>> > says verified and OK.
>> >
>> >But when I try to install on Ubuntu it gives warning
>> >
>> >
>> >> Package is of bad quality.
>> >>
>> >> This could cause serious problems on your computer
>> >>
>> >> Lintian check results for /media/abc/hadoop_1.1.1-1_i386.deb:
>> >> Use of uninitialized value $ENV{"HOME"} in concatenation (.) or string
>> at
>> >> /usr/bin/lintian line 108.
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/bin/task-controller
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadoop.a
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadoop.so
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadoop.so.1
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadoop.so.1.0.0
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadooppipes.a
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadooputils.a
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhdfs.a
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhdfs.so
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhdfs.so.0
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhdfs.so.0.0.0
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/libexec/jsvc.i386
>> >
>> >
>> >   What's gone wrong ??
>> >
>> >   Thanks !!
>> > --
>> > Cheers,
>> > Mayur.
>>
>>
>>
>> --
>> Harsh J
>>
>
>


Re: About Hadoop Deb file

2013-02-21 Thread Mayur Patil
>
>   Hi Mayur,


   Hi! !

Where have you downloaded the DEB files?
>

 I have downloaded them to a USB flash drive.


> Are they Debian related? Or Unbuntu related?
>

   I downloaded it from the Hadoop site, not from the Ubuntu repository,

   and checked it with the MD5 checksum, and it is verified.


> Unbuntu is not worst than CentOS. They are just different choices. Both
> should work.
>
> JM
>

-- 
*Cheers,
Mayur*.


> Hi Mayur
> 2013/2/21 Harsh J 
>
>> Try the debs from the Apache Bigtop project 0.3 release, its a bit of
>> an older 1.x release but the debs would work well:
>>
>> http://archive.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/
>>
>> On Thu, Feb 21, 2013 at 10:26 AM, Mayur Patil 
>> wrote:
>> > Hello,
>> >
>> >I am using Ubuntu 12.04 Desktop.
>> >
>> >I had downloaded hadoop-1.1.1-1.deb file and check with md5
>> check-sum it
>> > says verified and OK.
>> >
>> >But when I try to install on Ubuntu it gives warning
>> >
>> >
>> >> Package is of bad quality.
>> >>
>> >> This could cause serious problems on your computer
>> >>
>> >> Lintian check results for /media/abc/hadoop_1.1.1-1_i386.deb:
>> >> Use of uninitialized value $ENV{"HOME"} in concatenation (.) or string
>> at
>> >> /usr/bin/lintian line 108.
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/bin/task-controller
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadoop.a
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadoop.so
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadoop.so.1
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadoop.so.1.0.0
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadooppipes.a
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadooputils.a
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhdfs.a
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhdfs.so
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhdfs.so.0
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhdfs.so.0.0.0
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/libexec/jsvc.i386
>> >
>> >
>> >   What's gone wrong ??
>> >
>> >   Thanks !!
>> > --
>> > Cheers,
>> > Mayur.
>>
>>
>>
>> --
>> Harsh J
>>
>


Re: About Hadoop Deb file

2013-02-21 Thread Jean-Marc Spaggiari
Mayur,

Have you looked at that?

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

I just created a VM, installed Debian 64 bits, downloaded the .deb file and
installed it without any issue. Are you using Ubuntu 64 bits? Or 32 bits?

JM


2013/2/21 Mayur Patil 

>   Hi Mayur,
>
>
>Hi! !
>
> Where have you downloaded the DEB files?
>>
>
>  I have downloaded them to USB diskflash drive.
>
>
>> Are they Debian related? Or Unbuntu related?
>>
>
>I have downloaded from site of hadoop; not from UBuntu repository
>
>and check with MD5 checksum and it is verified.
>
>
>> Unbuntu is not worst than CentOS. They are just different choices. Both
>> should work.
>>
>> JM
>>
>
> --
> *Cheers,
> Mayur*.
>
>
>>  Hi Mayur
>> 2013/2/21 Harsh J 
>>
>>> Try the debs from the Apache Bigtop project 0.3 release, its a bit of
>>> an older 1.x release but the debs would work well:
>>>
>>> http://archive.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/
>>>
>>> On Thu, Feb 21, 2013 at 10:26 AM, Mayur Patil 
>>> wrote:
>>> > Hello,
>>> >
>>> >I am using Ubuntu 12.04 Desktop.
>>> >
>>> >I had downloaded hadoop-1.1.1-1.deb file and check with md5
>>> check-sum it
>>> > says verified and OK.
>>> >
>>> >But when I try to install on Ubuntu it gives warning
>>> >
>>> >
>>> >> Package is of bad quality.
>>> >>
>>> >> This could cause serious problems on your computer
>>> >>
>>> >> Lintian check results for /media/abc/hadoop_1.1.1-1_i386.deb:
>>> >> Use of uninitialized value $ENV{"HOME"} in concatenation (.) or
>>> string at
>>> >> /usr/bin/lintian line 108.
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/bin/task-controller
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhadoop.a
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhadoop.so
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhadoop.so.1
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhadoop.so.1.0.0
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhadooppipes.a
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhadooputils.a
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhdfs.a
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhdfs.so
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhdfs.so.0
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhdfs.so.0.0.0
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/libexec/jsvc.i386
>>> >
>>> >
>>> >   What's gone wrong ??
>>> >
>>> >   Thanks !!
>>> > --
>>> > Cheers,
>>> > Mayur.
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>


Re: About Hadoop Deb file

2013-02-21 Thread Mayur Patil
I am using 32 bits. I will look at your link, JM sir.

On Fri, Feb 22, 2013 at 8:17 AM, Jean-Marc Spaggiari <
jean-m...@spaggiari.org> wrote:

> Mayur,
>
> Have you looked at that?
>
>
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>
> I just created a VM, installed Debian 64bits, downloaded the .deb file and
> installed it without any issue. Are you using Unbuntu 64bits? Or 32bits?
>
> JM
>
>
>
> 2013/2/21 Mayur Patil 
>
>>   Hi Mayur,
>>
>>
>>Hi! !
>>
>> Where have you downloaded the DEB files?
>>>
>>
>>  I have downloaded them to USB diskflash drive.
>>
>>
>>> Are they Debian related? Or Unbuntu related?
>>>
>>
>>I have downloaded from site of hadoop; not from UBuntu repository
>>
>>and check with MD5 checksum and it is verified.
>>
>>
>>> Unbuntu is not worst than CentOS. They are just different choices. Both
>>> should work.
>>>
>>> JM
>>>
>>
>> --
>> *Cheers,
>> Mayur*.
>>
>>
>>>  Hi Mayur
>>> 2013/2/21 Harsh J 
>>>
 Try the debs from the Apache Bigtop project 0.3 release, its a bit of
 an older 1.x release but the debs would work well:

 http://archive.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/

 On Thu, Feb 21, 2013 at 10:26 AM, Mayur Patil 
 wrote:
 > Hello,
 >
 >I am using Ubuntu 12.04 Desktop.
 >
 >I had downloaded hadoop-1.1.1-1.deb file and check with md5
 check-sum it
 > says verified and OK.
 >
 >But when I try to install on Ubuntu it gives warning
 >
 >
 >> Package is of bad quality.
 >>
 >> This could cause serious problems on your computer
 >>
 >> Lintian check results for /media/abc/hadoop_1.1.1-1_i386.deb:
 >> Use of uninitialized value $ENV{"HOME"} in concatenation (.) or
 string at
 >> /usr/bin/lintian line 108.
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/bin/task-controller
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhadoop.a
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhadoop.so
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhadoop.so.1
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhadoop.so.1.0.0
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhadooppipes.a
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhadooputils.a
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhdfs.a
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhdfs.so
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhdfs.so.0
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhdfs.so.0.0.0
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/libexec/jsvc.i386
 >
 >
 >   What's gone wrong ??
 >
 >   Thanks !!
 > --
 > Cheers,
 > Mayur.



 --
 Harsh J

>>>
>


-- 
*Cheers,
Mayur*.


Re: About Hadoop Deb file

2013-02-22 Thread Jean-Marc Spaggiari
Hi Mayur,

How are you installing the package? Can you install it with dpkg --install?
Are you trying with another command?

I googled the error (*se of uninitialized value $ENV{"HOME"} in
concatenation (.) or string at /usr/bin/lintian*) and found many reference
to it. You might want to take a look too.

JM


2013/2/21 Mayur Patil 

> I am using 32 bits. I will look out for your link JM sir.
>
>
> On Fri, Feb 22, 2013 at 8:17 AM, Jean-Marc Spaggiari <
> jean-m...@spaggiari.org> wrote:
>
>> Mayur,
>>
>> Have you looked at that?
>>
>>
>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>>
>> I just created a VM, installed Debian 64bits, downloaded the .deb file
>> and installed it without any issue. Are you using Unbuntu 64bits? Or 32bits?
>>
>> JM
>>
>>
>>
>> 2013/2/21 Mayur Patil 
>>
>>>   Hi Mayur,
>>>
>>>
>>>Hi! !
>>>
>>> Where have you downloaded the DEB files?

>>>
>>>  I have downloaded them to USB diskflash drive.
>>>
>>>
 Are they Debian related? Or Unbuntu related?

>>>
>>>I have downloaded from site of hadoop; not from UBuntu repository
>>>
>>>and check with MD5 checksum and it is verified.
>>>
>>>
 Unbuntu is not worst than CentOS. They are just different choices. Both
 should work.

 JM

>>>
>>> --
>>> *Cheers,
>>> Mayur*.
>>>
>>>
  Hi Mayur
 2013/2/21 Harsh J 

> Try the debs from the Apache Bigtop project 0.3 release, its a bit of
> an older 1.x release but the debs would work well:
>
> http://archive.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/
>
> On Thu, Feb 21, 2013 at 10:26 AM, Mayur Patil <
> ram.nath241...@gmail.com> wrote:
> > Hello,
> >
> >I am using Ubuntu 12.04 Desktop.
> >
> >I had downloaded hadoop-1.1.1-1.deb file and check with md5
> check-sum it
> > says verified and OK.
> >
> >But when I try to install on Ubuntu it gives warning
> >
> >
> >> Package is of bad quality.
> >>
> >> This could cause serious problems on your computer
> >>
> >> Lintian check results for /media/abc/hadoop_1.1.1-1_i386.deb:
> >> Use of uninitialized value $ENV{"HOME"} in concatenation (.) or
> string at
> >> /usr/bin/lintian line 108.
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/bin/task-controller
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhadoop.a
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhadoop.so
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhadoop.so.1
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhadoop.so.1.0.0
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhadooppipes.a
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhadooputils.a
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhdfs.a
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhdfs.so
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhdfs.so.0
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/lib/libhdfs.so.0.0.0
> >> E: hadoop: arch-independent-package-contains-binary-or-object
> >> usr/libexec/jsvc.i386
> >
> >
> >   What's gone wrong ??
> >
> >   Thanks !!
> > --
> > Cheers,
> > Mayur.
>
>
>
> --
> Harsh J
>

>>
>
>
> --
> *Cheers,
> Mayur*.


Re: About Hadoop Deb file

2013-02-24 Thread Mayur Patil
Hello there,

   Success !! I have installed Hadoop-1.0.4.deb file on Ubuntu 12.04 LTS !!

   In /usr/bin/hadoop file where should I set JAVA_HOME and

   HADOOP_INSTALL??

   I have openjdk 7 and 6 version installed.

   Thanks !!

-- 
*Cheers,
Mayur*.




Hi Mayur,
>
> How are you installing the package? Can you install it with dpkg
> --install? Are you trying with another command?
>

   On Ubuntu, deb packages are installed directly with the Gdebi installer.


>
> I googled the error (*se of uninitialized value $ENV{"HOME"} in
> concatenation (.) or string at /usr/bin/lintian*) and found many
> reference to it. You might want to take a look too.


  I will try to search for Google query.

>
>
> JM
>
>
> 2013/2/21 Mayur Patil 
>
>> I am using 32 bits. I will look out for your link JM sir.
>>
>>
>> On Fri, Feb 22, 2013 at 8:17 AM, Jean-Marc Spaggiari <
>> jean-m...@spaggiari.org> wrote:
>>
>>> Mayur,
>>>
>>> Have you looked at that?
>>>
>>>
>>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>>>
>>> I just created a VM, installed Debian 64bits, downloaded the .deb file
>>> and installed it without any issue. Are you using Unbuntu 64bits? Or 32bits?
>>>
>>> JM
>>>
>>>
>>>
>>> 2013/2/21 Mayur Patil 
>>>
   Hi Mayur,


Hi! !

 Where have you downloaded the DEB files?
>

  I have downloaded them to USB diskflash drive.


> Are they Debian related? Or Unbuntu related?
>

I have downloaded from site of hadoop; not from UBuntu repository

and check with MD5 checksum and it is verified.


> Unbuntu is not worst than CentOS. They are just different choices.
> Both should work.
>
> JM
>

 --
 *Cheers,
 Mayur*.


>  Hi Mayur
> 2013/2/21 Harsh J 
>
>> Try the debs from the Apache Bigtop project 0.3 release, its a bit of
>> an older 1.x release but the debs would work well:
>>
>> http://archive.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/
>>
>> On Thu, Feb 21, 2013 at 10:26 AM, Mayur Patil <
>> ram.nath241...@gmail.com> wrote:
>> > Hello,
>> >
>> >I am using Ubuntu 12.04 Desktop.
>> >
>> >I had downloaded hadoop-1.1.1-1.deb file and check with md5
>> check-sum it
>> > says verified and OK.
>> >
>> >But when I try to install on Ubuntu it gives warning
>> >
>> >
>> >> Package is of bad quality.
>> >>
>> >> This could cause serious problems on your computer
>> >>
>> >> Lintian check results for /media/abc/hadoop_1.1.1-1_i386.deb:
>> >> Use of uninitialized value $ENV{"HOME"} in concatenation (.) or
>> string at
>> >> /usr/bin/lintian line 108.
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/bin/task-controller
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadoop.a
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadoop.so
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadoop.so.1
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadoop.so.1.0.0
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadooppipes.a
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhadooputils.a
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhdfs.a
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhdfs.so
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhdfs.so.0
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/lib/libhdfs.so.0.0.0
>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>> >> usr/libexec/jsvc.i386
>> >
>> >
>> >   What's gone wrong ??
>> >
>> >   Thanks !!
>> > --
>> > Cheers,
>> > Mayur.
>>
>>
>>
>> --
>> Harsh J
>>
>
>>>
>>
>>
>> --
>> *Cheers,
>> Mayur*.
>
>
>


-- 
*Cheers,
Mayur*.


Re: About Hadoop Deb file

2013-02-24 Thread Jean-Marc Spaggiari
Hi Mayur,

Have you looked at the link I sent you?

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

It will show you where to set JAVA_HOME and others.

JM
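
The JAVA_HOME setting normally goes in hadoop-env.sh rather than in
/usr/bin/hadoop itself. A minimal sketch, assuming the 1.0.4 .deb put the
configuration under /etc/hadoop and that OpenJDK 6 sits in the usual Ubuntu
12.04 location; adjust both paths to your machine:

    # /etc/hadoop/hadoop-env.sh  (config path assumed for the .deb layout)
    export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386   # or java-6-openjdk, whichever is installed

Hadoop itself does not read HADOOP_INSTALL; some tutorials export it in
~/.bashrc purely as a convenience pointing at the install prefix, for example
export HADOOP_INSTALL=/usr/share/hadoop (prefix assumed here).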

2013/2/24 Mayur Patil 

> Hello there,
>
>Success !! I have installed Hadoop-1.0.4.deb file on Ubuntu 12.04 LTS !!
>
>In /usr/bin/hadoop file where should I set JAVA_HOME and
>
>HADOOP_INSTALL??
>
>I have openjdk 7 and 6 version installed.
>
>
>Thanks !!
>
> --
> *Cheers,
> Mayur*.
>
>
>
>
> Hi Mayur,
>>
>> How are you installing the package? Can you install it with dpkg
>> --install? Are you trying with another command?
>>
>
>On ubuntu deb packages are directly due to Gdebi installer.
>
>
>>
>> I googled the error (*se of uninitialized value $ENV{"HOME"} in
>> concatenation (.) or string at /usr/bin/lintian*) and found many
>> reference to it. You might want to take a look too.
>
>
>   I will try to search for Google query.
>
>>
>>
>> JM
>>
>>
>> 2013/2/21 Mayur Patil 
>>
>>> I am using 32 bits. I will look out for your link JM sir.
>>>
>>>
>>> On Fri, Feb 22, 2013 at 8:17 AM, Jean-Marc Spaggiari <
>>> jean-m...@spaggiari.org> wrote:
>>>
 Mayur,

 Have you looked at that?


 http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

 I just created a VM, installed Debian 64bits, downloaded the .deb file
 and installed it without any issue. Are you using Unbuntu 64bits? Or 
 32bits?

 JM



 2013/2/21 Mayur Patil 

>   Hi Mayur,
>
>
>Hi! !
>
> Where have you downloaded the DEB files?
>>
>
>  I have downloaded them to USB diskflash drive.
>
>
>> Are they Debian related? Or Unbuntu related?
>>
>
>I have downloaded from site of hadoop; not from UBuntu repository
>
>and check with MD5 checksum and it is verified.
>
>
>> Unbuntu is not worst than CentOS. They are just different choices.
>> Both should work.
>>
>> JM
>>
>
> --
> *Cheers,
> Mayur*.
>
>
>>  Hi Mayur
>> 2013/2/21 Harsh J 
>>
>>> Try the debs from the Apache Bigtop project 0.3 release, its a bit of
>>> an older 1.x release but the debs would work well:
>>>
>>> http://archive.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/
>>>
>>> On Thu, Feb 21, 2013 at 10:26 AM, Mayur Patil <
>>> ram.nath241...@gmail.com> wrote:
>>> > Hello,
>>> >
>>> >I am using Ubuntu 12.04 Desktop.
>>> >
>>> >I had downloaded hadoop-1.1.1-1.deb file and check with md5
>>> check-sum it
>>> > says verified and OK.
>>> >
>>> >But when I try to install on Ubuntu it gives warning
>>> >
>>> >
>>> >> Package is of bad quality.
>>> >>
>>> >> This could cause serious problems on your computer
>>> >>
>>> >> Lintian check results for /media/abc/hadoop_1.1.1-1_i386.deb:
>>> >> Use of uninitialized value $ENV{"HOME"} in concatenation (.) or
>>> string at
>>> >> /usr/bin/lintian line 108.
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/bin/task-controller
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhadoop.a
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhadoop.so
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhadoop.so.1
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhadoop.so.1.0.0
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhadooppipes.a
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhadooputils.a
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhdfs.a
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhdfs.so
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhdfs.so.0
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/lib/libhdfs.so.0.0.0
>>> >> E: hadoop: arch-independent-package-contains-binary-or-object
>>> >> usr/libexec/jsvc.i386
>>> >
>>> >
>>> >   What's gone wrong ??
>>> >
>>> >   Thanks !!
>>> > --
>>> > Cheers,
>>> > Mayur.
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>

>>>
>>>
>>> --
>>> *Cheers,
>>> Mayur*.
>>
>>
>>
>
>
> --
> *Cheers,
> Mayur*.


Re: About Hadoop Deb file

2013-02-24 Thread Mayur Patil
Ok sir. I have seen it superficially but now I will look at it thoroughly.
*
--
Cheers,
Mayur*

On Sun, Feb 24, 2013 at 8:21 PM, Jean-Marc Spaggiari <
jean-m...@spaggiari.org> wrote:

> Hi Mayur,
>
> Have you looked at the link I sent you?
>
>
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>
> It will show you where to set JAVA_HOME and others.
>
> JM
>
>
> 2013/2/24 Mayur Patil 
>
>> Hello there,
>>
>>Success !! I have installed Hadoop-1.0.4.deb file on Ubuntu 12.04 LTS
>> !!
>>
>>In /usr/bin/hadoop file where should I set JAVA_HOME and
>>
>>HADOOP_INSTALL??
>>
>>I have openjdk 7 and 6 version installed.
>>
>>
>>Thanks !!
>>
>> --
>> *Cheers,
>> Mayur*.
>>
>>
>>
>>
>> Hi Mayur,
>>>
>>> How are you installing the package? Can you install it with dpkg
>>> --install? Are you trying with another command?
>>>
>>
>>On ubuntu deb packages are directly due to Gdebi installer.
>>
>>
>>>
>>> I googled the error (*se of uninitialized value $ENV{"HOME"} in
>>> concatenation (.) or string at /usr/bin/lintian*) and found many
>>> reference to it. You might want to take a look too.
>>
>>
>>   I will try to search for Google query.
>>
>>>
>>>
>>> JM
>>>
>>>
>>> 2013/2/21 Mayur Patil 
>>>
 I am using 32 bits. I will look out for your link JM sir.


 On Fri, Feb 22, 2013 at 8:17 AM, Jean-Marc Spaggiari <
 jean-m...@spaggiari.org> wrote:

> Mayur,
>
> Have you looked at that?
>
>
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>
> I just created a VM, installed Debian 64bits, downloaded the .deb file
> and installed it without any issue. Are you using Unbuntu 64bits? Or 
> 32bits?
>
> JM
>
>
>
> 2013/2/21 Mayur Patil 
>
>>   Hi Mayur,
>>
>>
>>Hi! !
>>
>> Where have you downloaded the DEB files?
>>>
>>
>>  I have downloaded them to USB diskflash drive.
>>
>>
>>> Are they Debian related? Or Unbuntu related?
>>>
>>
>>I have downloaded from site of hadoop; not from UBuntu repository
>>
>>and check with MD5 checksum and it is verified.
>>
>>
>>> Unbuntu is not worst than CentOS. They are just different choices.
>>> Both should work.
>>>
>>> JM
>>>
>>
>> --
>> *Cheers,
>> Mayur*.
>>
>>
>>>  Hi Mayur
>>> 2013/2/21 Harsh J 
>>>
 Try the debs from the Apache Bigtop project 0.3 release, its a bit
 of
 an older 1.x release but the debs would work well:

 http://archive.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/

 On Thu, Feb 21, 2013 at 10:26 AM, Mayur Patil <
 ram.nath241...@gmail.com> wrote:
 > Hello,
 >
 >I am using Ubuntu 12.04 Desktop.
 >
 >I had downloaded hadoop-1.1.1-1.deb file and check with md5
 check-sum it
 > says verified and OK.
 >
 >But when I try to install on Ubuntu it gives warning
 >
 >
 >> Package is of bad quality.
 >>
 >> This could cause serious problems on your computer
 >>
 >> Lintian check results for /media/abc/hadoop_1.1.1-1_i386.deb:
 >> Use of uninitialized value $ENV{"HOME"} in concatenation (.) or
 string at
 >> /usr/bin/lintian line 108.
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/bin/task-controller
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhadoop.a
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhadoop.so
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhadoop.so.1
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhadoop.so.1.0.0
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhadooppipes.a
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhadooputils.a
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhdfs.a
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhdfs.so
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhdfs.so.0
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/lib/libhdfs.so.0.0.0
 >> E: hadoop: arch-independent-package-contains-binary-or-object
 >> usr/libexec/jsvc.i386
 >
 >
 >   What's gone wrong ??
 >
 >   Thanks !!
 > --
 > Cheers,
 > Mayur.



 --
 Harsh J
>>

Re: About hadoop-2.0.5 release

2013-06-16 Thread Roman Shaposhnik
On Tue, Jun 11, 2013 at 11:22 PM, Ramya S  wrote:
> Hi,
>
> When will be the release of  stable version of hadoop-2.0.5-alpha?

hadoop-2.0.5-alpha has been released last week and can be obtained
either in its source form:
http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.0.5-alpha/
or packaged form:
http://bigtop01.cloudera.org:8080/view/Releases/job/Bigtop-0.6.0/

Thanks,
Roman.


issue about hadoop hardware choose

2013-08-08 Thread ch huang
hi,all:
My company need build a 10 node hadoop cluster (2 namenode and
8 datanode & node manager ,for both data storage and data analysis ) ,we
have hbase ,hive on the hadoop cluster, 10G data increment per day.
we use CDH4.3 ( for dual - namenode HA),my plan is

   name node  & resource manager
   dual Quad Core
 24G RAM
 2 * 500GB SATA DISK (JBOD)

 datanode & node manager
 dual Quad Core
 24G RAM
 2 * 1TB SATA DISK (JBOD)


my question is
1, if resource manager need a dedicated server? ( i plan to put RM with one
of NN)
2, if the RAM is enough for RM + NN machine?
3,RAID is need for NN machine?
4,is it ok if i place JN on other node(DN or NN)
5, how much zookeeper server node i need?
6,i want to place yarn proxy server and mapreduce history server with
another NN,is it ok?


Re: question about hadoop HA

2013-08-15 Thread bharath vissapragada
Client uses the class dfs.client.failover.proxy.provider.[nameservice ID] to
find the active namenode. Default is ConfiguredFailoverProxyProvider. You
can plug in your own implementation and specify it in the config file.

Regards,
Bharath .V
w:http://researchweb.iiit.ac.in/~bharath.v
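
For reference, the client-side wiring typically looks like the sketch below in
hdfs-site.xml (the nameservice name, hostnames and ports are placeholders);
clients then talk to hdfs://mycluster and the proxy provider picks whichever
NameNode is currently active:

    <property>
      <name>dfs.nameservices</name>
      <value>mycluster</value>
    </property>
    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>namenode1.example.com:8020</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>namenode2.example.com:8020</value>
    </property>
    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>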


On Fri, Aug 16, 2013 at 8:16 AM, ch huang  wrote:

> hi,all
> i have a question that i can not answer by myself,hope any one can help.
> if i do not set up HA,client can query DNS get the hdfs entrance,but if i
> set up namenode HA,how client know which host it should talk?
>
>
>


Re: issues about hadoop-0.20.0

2015-07-18 Thread Harsh J
Apache Hadoop 0.20 and 0.21 are both very old and unmaintained releases at
this point, and may carry some issues unfixed via further releases. Please
consider using a newer release.

Is there a specific reason you intend to use 0.21.0, which came out of a
branch long since abandoned?

On Sat, Jul 18, 2015 at 1:27 PM longfei li  wrote:

> Hello!
> I built a hadoop cluster including 12 nodes which is based on
> arm(cubietruck), I run simple program wordcount to find how many words of h
> in hello, it runs perfectly. But I run a mutiple program like pi,i run like
> this:
> ./hadoop jar hadoop-example-0.21.0.jar pi 100 10
> infomation
> 15/07/18 11:38:54 WARN conf.Configuration: DEPRECATED: hadoop-site.xml
> found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use
> core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of
> core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> 15/07/18 11:38:54 INFO security.Groups: Group mapping
> impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
> cacheTimeout=30
> 15/07/18 11:38:55 WARN conf.Configuration: mapred.task.id is deprecated.
> Instead, use mapreduce.task.attempt.id
> 15/07/18 11:38:55 WARN mapreduce.JobSubmitter: Use GenericOptionsParser
> for parsing the arguments. Applications should implement Tool for the same.
> 15/07/18 11:38:55 INFO input.FileInputFormat: Total input paths to process
> : 1
> 15/07/18 11:38:58 WARN conf.Configuration: mapred.map.tasks is deprecated.
> Instead, use mapreduce.job.maps
> 15/07/18 11:38:58 INFO mapreduce.JobSubmitter: number of splits:1
> 15/07/18 11:38:58 INFO mapreduce.JobSubmitter: adding the following
> namenodes' delegation tokens:null
> 15/07/18 11:38:59 INFO mapreduce.Job: Running job: job_201507181137_0001
> 15/07/18 11:39:00 INFO mapreduce.Job:  map 0% reduce 0%
> 15/07/18 11:39:20 INFO mapreduce.Job:  map 100% reduce 0%
> 15/07/18 11:39:35 INFO mapreduce.Job:  map 100% reduce 10%
> 15/07/18 11:39:36 INFO mapreduce.Job:  map 100% reduce 20%
> 15/07/18 11:39:38 INFO mapreduce.Job:  map 100% reduce 90%
> 15/07/18 11:39:58 INFO mapreduce.Job:  map 100% reduce 100%
> 15/07/18 11:49:47 INFO mapreduce.Job:  map 100% reduce 89%
> 15/07/18 11:49:49 INFO mapreduce.Job:  map 100% reduce 19%
> 15/07/18 11:49:54 INFO mapreduce.Job: Task Id :
> attempt_201507181137_0001_r_00_0, Status : FAILED
> Task attempt_201507181137_0001_r_00_0 failed to report status for 602
> seconds. Killing!
> 15/07/18 11:49:57 WARN mapreduce.Job: Error reading task
> outputhadoop-slave7
> 15/07/18 11:49:57 WARN mapreduce.Job: Error reading task
> outputhadoop-slave7
> 15/07/18 11:49:58 INFO mapreduce.Job: Task Id :
> attempt_201507181137_0001_r_02_0, Status : FAILED
> Task attempt_201507181137_0001_r_02_0 failed to report status for 601
> seconds. Killing!
> 15/07/18 11:50:00 WARN mapreduce.Job: Error reading task
> outputhadoop-slave5
> 15/07/18 11:50:00 WARN mapreduce.Job: Error reading task
> outputhadoop-slave5
> 15/07/18 11:50:00 INFO mapreduce.Job: Task Id :
> attempt_201507181137_0001_r_03_0, Status : FAILED
> Task attempt_201507181137_0001_r_03_0 failed to report status for 601
> seconds. Killing!
> 15/07/18 11:50:03 WARN mapreduce.Job: Error reading task
> outputhadoop-slave12
> 15/07/18 11:50:03 WARN mapreduce.Job: Error reading task
> outputhadoop-slave12
> 15/07/18 11:50:03 INFO mapreduce.Job: Task Id :
> attempt_201507181137_0001_r_04_0, Status : FAILED
> Task attempt_201507181137_0001_r_04_0 failed to report status for 601
> seconds. Killing!
> 15/07/18 11:50:06 WARN mapreduce.Job: Error reading task
> outputhadoop-slave8
> 15/07/18 11:50:06 WARN mapreduce.Job: Error reading task
> outputhadoop-slave8
> 15/07/18 11:50:06 INFO mapreduce.Job: Task Id :
> attempt_201507181137_0001_r_07_0, Status : FAILED
> Task attempt_201507181137_0001_r_07_0 failed to report status for 601
> seconds. Killing!
> 15/07/18 11:50:08 WARN mapreduce.Job: Error reading task
> outputhadoop-slave11
> 15/07/18 11:50:08 WARN mapreduce.Job: Error reading task
> outputhadoop-slave11
> 15/07/18 11:50:08 INFO mapreduce.Job: Task Id :
> attempt_201507181137_0001_r_08_0, Status : FAILED
> Task attempt_201507181137_0001_r_08_0 failed to report status for 601
> seconds. Killing!
> 15/07/18 11:50:11 WARN mapreduce.Job: Error reading task
> outputhadoop-slave9
> 15/07/18 11:50:11 WARN mapreduce.Job: Error reading task
> outputhadoop-slave9
> 15/07/18 11:50:11 INFO mapreduce.Job: Task Id :
> attempt_201507181137_0001_r_09_0, Status : FAILED
> Task attempt_201507181137_0001_r_09_0 failed to report status for 601
> seconds. Killing!
> 15/07/18 11:50:13 WARN mapreduce.Job: Error reading task
> outputhadoop-slave4
> 15/07/18 11:50:13 WARN mapreduce.Job: Error reading task
> outputhadoop-slave4
> 15/07/18 11:50:13 INFO mapreduce.Job: Task Id :
> attempt_201507181137_0001_r_06_0, Status : FAILED
> Task attempt_201507181137_0001_r_06_0 failed to re

Re: issues about hadoop-0.20.0

2015-07-18 Thread Ulul

Hi

I'd say that no matter what version is running, the parameters don't seem to
fit the cluster, which doesn't manage to handle 100 maps that each process a
billion samples: it's hitting the MapReduce timeout of 600 seconds.


I'd try with something like 20 10

Ulul
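
If reduces really are expected to run that long without reporting progress, the
timeout itself can also be raised. A sketch, assuming the post-0.21 property
name (the older name was mapred.task.timeout); note this only hides the symptom
if the cluster is simply overloaded:

    <!-- mapred-site.xml: per-task progress timeout in milliseconds (0 disables it) -->
    <property>
      <name>mapreduce.task.timeout</name>
      <value>1800000</value>
    </property>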

On 18/07/2015 12:17, Harsh J wrote:
Apache Hadoop 0.20 and 0.21 are both very old and unmaintained 
releases at this point, and may carry some issues unfixed via further 
releases. Please consider using a newer release.


Is there a specific reason you intend to use 0.21.0, which came out of 
a branch long since abandoned?


On Sat, Jul 18, 2015 at 1:27 PM longfei li wrote:


Hello!
I built a hadoop cluster including 12 nodes which is based on
arm(cubietruck), I run simple program wordcount to find how many
words of h in hello, it runs perfectly. But I run a mutiple
program like pi,i run like this:
./hadoop jar hadoop-example-0.21.0.jar pi 100 10
infomation
15/07/18 11:38:54 WARN conf.Configuration: DEPRECATED:
hadoop-site.xml found in the classpath. Usage of hadoop-site.xml
is deprecated. Instead use core-site.xml, mapred-site.xml and
hdfs-site.xml to override properties of core-default.xml,
mapred-default.xml and hdfs-default.xml respectively
15/07/18 11:38:54 INFO security.Groups: Group mapping
impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
cacheTimeout=30
15/07/18 11:38:55 WARN conf.Configuration: mapred.task.id
 is deprecated. Instead, use
mapreduce.task.attempt.id 
15/07/18 11:38:55 WARN mapreduce.JobSubmitter: Use
GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
15/07/18 11:38:55 INFO input.FileInputFormat: Total input paths to
process : 1
15/07/18 11:38:58 WARN conf.Configuration: mapred.map.tasks is
deprecated. Instead, use mapreduce.job.maps
15/07/18 11:38:58 INFO mapreduce.JobSubmitter: number of splits:1
15/07/18 11:38:58 INFO mapreduce.JobSubmitter: adding the
following namenodes' delegation tokens:null
15/07/18 11:38:59 INFO mapreduce.Job: Running job:
job_201507181137_0001
15/07/18 11:39:00 INFO mapreduce.Job:  map 0% reduce 0%
15/07/18 11:39:20 INFO mapreduce.Job:  map 100% reduce 0%
15/07/18 11:39:35 INFO mapreduce.Job:  map 100% reduce 10%
15/07/18 11:39:36 INFO mapreduce.Job:  map 100% reduce 20%
15/07/18 11:39:38 INFO mapreduce.Job:  map 100% reduce 90%
15/07/18 11:39:58 INFO mapreduce.Job:  map 100% reduce 100%
15/07/18 11:49:47 INFO mapreduce.Job:  map 100% reduce 89%
15/07/18 11:49:49 INFO mapreduce.Job:  map 100% reduce 19%
15/07/18 11:49:54 INFO mapreduce.Job: Task Id :
attempt_201507181137_0001_r_00_0, Status : FAILED
Task attempt_201507181137_0001_r_00_0 failed to report status
for 602 seconds. Killing!
15/07/18 11:49:57 WARN mapreduce.Job: Error reading task
outputhadoop-slave7
15/07/18 11:49:57 WARN mapreduce.Job: Error reading task
outputhadoop-slave7
15/07/18 11:49:58 INFO mapreduce.Job: Task Id :
attempt_201507181137_0001_r_02_0, Status : FAILED
Task attempt_201507181137_0001_r_02_0 failed to report status
for 601 seconds. Killing!
15/07/18 11:50:00 WARN mapreduce.Job: Error reading task
outputhadoop-slave5
15/07/18 11:50:00 WARN mapreduce.Job: Error reading task
outputhadoop-slave5
15/07/18 11:50:00 INFO mapreduce.Job: Task Id :
attempt_201507181137_0001_r_03_0, Status : FAILED
Task attempt_201507181137_0001_r_03_0 failed to report status
for 601 seconds. Killing!
15/07/18 11:50:03 WARN mapreduce.Job: Error reading task
outputhadoop-slave12
15/07/18 11:50:03 WARN mapreduce.Job: Error reading task
outputhadoop-slave12
15/07/18 11:50:03 INFO mapreduce.Job: Task Id :
attempt_201507181137_0001_r_04_0, Status : FAILED
Task attempt_201507181137_0001_r_04_0 failed to report status
for 601 seconds. Killing!
15/07/18 11:50:06 WARN mapreduce.Job: Error reading task
outputhadoop-slave8
15/07/18 11:50:06 WARN mapreduce.Job: Error reading task
outputhadoop-slave8
15/07/18 11:50:06 INFO mapreduce.Job: Task Id :
attempt_201507181137_0001_r_07_0, Status : FAILED
Task attempt_201507181137_0001_r_07_0 failed to report status
for 601 seconds. Killing!
15/07/18 11:50:08 WARN mapreduce.Job: Error reading task
outputhadoop-slave11
15/07/18 11:50:08 WARN mapreduce.Job: Error reading task
outputhadoop-slave11
15/07/18 11:50:08 INFO mapreduce.Job: Task Id :
attempt_201507181137_0001_r_08_0, Status : FAILED
Task attempt_201507181137_0001_r_08_0 failed to report status
for 601 seconds. Killing!
15/07/18 11:50:11 WARN mapreduce.Job: Error reading task
outputhadoop-slave9
15/07/18 11:50:11 WARN mapreduc

Re: Questions about hadoop-metrics2.properties

2013-10-23 Thread Luke Lu
1. File and Ganglia are the only bundled sinks, though there are
socket/json (for chukwa) and graphite sinks patches in the works.
2. Hadoop metrics (and metrics2) is mostly designed for system/process
metrics, which means you'll need to attach jconsole to your map/reduce task
processes to see your task metrics instrumented via metrics. What you
actually want is probably custom job counters.
3. You don't need any configuration to use JMX to access metrics2, as JMX
is currently on by default. The configuration in hadoop-metrics2.properties
is mostly for optional sink configuration and metrics filtering.

__Luke
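
To illustrate the custom job counters mentioned above, a minimal sketch of a
mapper that increments one (the class, counter and field names are made up;
counters are aggregated by the framework and show up in the job UI and the
history server):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CountingMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
      // a counter is just an enum constant incremented through the task context
      enum MyCounters { EMPTY_LINES }

      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        if (value.toString().trim().isEmpty()) {
          context.getCounter(MyCounters.EMPTY_LINES).increment(1);
          return;
        }
        context.write(new Text(value.toString()), new LongWritable(1L));
      }
    }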



On Wed, Oct 23, 2013 at 4:21 PM, Benyi Wang  wrote:

> 1. Does hadoop metrics2 only support File and Ganglia sink?
> 2. Can I expose metrics as JMX, especially for customized metrics? I
> created some  metrics in my mapreduce job and could successfully output
> them using a FileSink. But if I use jconsole to access YARN nodemanager, I
> can only see hadoop metrics e.g Hadoop/NodeManager/NodeManagerMetrices
> etc.,  not mine with prefix maptask. How to setup to see maptask/reducetask
> prefix metrics?
> 3. Is there an example using jmx? I could not find
>
> The configuration syntax is:
>
>   [prefix].[source|sink|jmx|].[instance].[option]
>
>
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html
>


Re: Questions about hadoop-metrics2.properties

2013-10-25 Thread Benyi Wang
Thanks, Luke. Your answer is really helpful.


On Wed, Oct 23, 2013 at 11:51 PM, Luke Lu  wrote:

> 1. File and Ganglia are the only bundled sinks, though there are
> socket/json (for chukwa) and graphite sinks patches in the works.
> 2. Hadoop metrics (and metrics2) is mostly designed for system/process
> metrics, which means you'll need to attach jconsole to your map/reduce task
> processes to see your task metrics instrumented via metrics. What you
> actually want is probably custom job counters.
> 3. You don't need any configuration to use JMX to access metrics2, as JMX
> is currently on by default. The configuration in hadoop-metrics2.properties
> is mostly for optional sink configuration and metrics filtering.
>
> __Luke
>
>
>
> On Wed, Oct 23, 2013 at 4:21 PM, Benyi Wang  wrote:
>
>> 1. Does hadoop metrics2 only support File and Ganglia sink?
>> 2. Can I expose metrics as JMX, especially for customized metrics? I
>> created some  metrics in my mapreduce job and could successfully output
>> them using a FileSink. But if I use jconsole to access YARN nodemanager, I
>> can only see hadoop metrics e.g Hadoop/NodeManager/NodeManagerMetrices
>> etc.,  not mine with prefix maptask. How to setup to see maptask/reducetask
>> prefix metrics?
>> 3. Is there an example using jmx? I could not find
>>
>> The configuration syntax is:
>>
>>   [prefix].[source|sink|jmx|].[instance].[option]
>>
>>
>> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html
>>
>
>


Questions about Hadoop logs and mapred.local.dir

2014-05-13 Thread sam liu
Hi Experts,

1. The size of mapred.local.dir is big(30 GB), how many methods could clean
it correctly?
2. For logs of NameNode/DataNode/JobTracker/TaskTracker, are they all
rolling type log? What's their max size? I can not find the specific
settings for them in log4j.properties.
3. I find the size of dfs.name.dir and dfs.data.dir is very big now, are
there any files under them could be removed actually? Or all files under
the two folders could not be removed at all?

Thanks!


Two simple question about hadoop 2

2013-07-21 Thread Yexi Jiang
Hi, all,

I have two simple questions about hadoop 2 (or YARN).
1. When will the stable version of hadoop 2.x come out?
2. We currently have a cluster deployed with hadoop 1.x, is there any way
to upgrade it to hadoop 2.x without damaging the existing data in current
HDFS?

Thank you very much!

Regards,
Yexi


Re: issue about hadoop hardware choose

2013-08-08 Thread Azuryy Yu
if you want HA, then do you want to deploy journal node on the DN?
On Aug 8, 2013 5:09 PM, "ch huang"  wrote:

> hi,all:
> My company need build a 10 node hadoop cluster (2 namenode and
> 8 datanode & node manager ,for both data storage and data analysis ) ,we
> have hbase ,hive on the hadoop cluster, 10G data increment per day.
> we use CDH4.3 ( for dual - namenode HA),my plan is
>
>name node  & resource manager
>dual Quad Core
>  24G RAM
>  2 * 500GB SATA DISK (JBOD)
>
>  datanode & node manager
>  dual Quad Core
>  24G RAM
>  2 * 1TB SATA DISK (JBOD)
>
>
> my question is
> 1, if resource manager need a dedicated server? ( i plan to put RM with
> one of NN)
> 2, if the RAM is enough for RM + NN machine?
> 3,RAID is need for NN machine?
> 4,is it ok if i place JN on other node(DN or NN)
> 5, how much zookeeper server node i need?
> 6,i want to place yarn proxy server and mapreduce history server with
> another NN,is it ok?
>
>
>
>
>


Re: issue about hadoop hardware choose

2013-08-08 Thread Mirko Kämpf
Hello Ch Huang,


Do you know this book?
"Hadoop Operations" http://shop.oreilly.com/product/0636920025085.do

I think, it answers most of the questions in detail.

For a production cluster you should consider MRv1.
And I suggest you go with more hard drives per slave node to have a higher
IO bandwidth for MapReduce; give it 4 x 2 TB at least, or even 6.
At least three ZooKeeper servers should be used.

Best wishes
Mirko



2013/8/8 ch huang 

> hi,all:
> My company need build a 10 node hadoop cluster (2 namenode and
> 8 datanode & node manager ,for both data storage and data analysis ) ,we
> have hbase ,hive on the hadoop cluster, 10G data increment per day.
> we use CDH4.3 ( for dual - namenode HA),my plan is
>
>name node  & resource manager
>dual Quad Core
>  24G RAM
>  2 * 500GB SATA DISK (JBOD)
>
>  datanode & node manager
>  dual Quad Core
>  24G RAM
>  2 * 1TB SATA DISK (JBOD)
>
>
> my question is
> 1, if resource manager need a dedicated server? ( i plan to put RM with
> one of NN)
> 2, if the RAM is enough for RM + NN machine?
> 3,RAID is need for NN machine?
> 4,is it ok if i place JN on other node(DN or NN)
> 5, how much zookeeper server node i need?
> 6,i want to place yarn proxy server and mapreduce history server with
> another NN,is it ok?
>
>
>
>
>


Follow Tutorials about Hadoop on Guchex

2012-09-21 Thread Vinicius Melo
Hello everyone,

I would like to invite everybody from Hadoop community to visit our
knowledge sharing platform. You can visit our hadoop page where we are
going to update with Hadoop Tutorials, also we would ask for those
that can share any tutorial to post it.

It is very simple to use, if you want just to have a feed with
tutorials about Hadoop just follow the tag Hadoop on:
http://guchex.com/tag/59/hadoop

If you would like to share something with all followers of Hadoop tag
, join us and create a post using a tag Hadoop.
http://guchex.com/Post/CreatePost


Thanks,
Vinicius Melo


question about hadoop in maven repository

2012-12-16 Thread Yunming Zhang
Hi, 

I am modifying the dependencies for Mahout package, (the open source machine 
learning package built on top of Hadoop), 

I am a bit confused over why there are so many hadoop dependencies in the maven 
project, there are four artifactIds 
1) hadoop-core, 2) hadoop-common, 3)hadoop-mapreduce-client-core, 
4)hadoop-mapreduce-client-common

I am trying to replace the hadoop jar file used to compile with my customized 
version, 

Thanks

Yunming

Question about Hadoop YARN in 2.7.3

2021-05-13 Thread 慧波彭
Hello, we use capacity scheduler to allocate resources in our production
environment, and use node label to isolate resources.
There is a demand that we want to dynamically create node labels and
associate node labels to existing queue without
changing capacity-scheduler.xml.
Does anyone know how to implement it?
I am looking forward to hearing from you.


a question about hadoop source code repo

2014-05-12 Thread Libo Yu
Hi,

Under hadoop-mapreduce-project directory, I notice the following two
directories:
 1hadoop-mapreduce-client/
 2src/

2 src/ can be expanded to
  java/org/apache/hadoop/mapreduce

My question is what the directory is for. And I wonder why mapreduce code
is in hadoop-mapreduce-client.  Thanks.

Libo


Re: Questions about Hadoop logs and mapred.local.dir

2014-05-16 Thread Mohammad Tariq
Hi Sam,

1. I am sorry I didn't quite get "how many methods could clean it correctly?
".

Since this directory contains only the temporary files it should get
cleaned up after your jobs are over. If you still have unnecessary data
present there you can delete it. Make sure no jobs are running while you
clean this directory.

2. All the daemons use log4j and DailyRollingFileAppender, which does not
have retention settings. You can change the behavior by switching to the
appender of your choice in the log4j.properties files under the
HADOOP_HOME/conf directory (a sketch follows at the end of this message). The
associated property is log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender.

3. You must never touch the content of these 2 directories. This is the actual
HDFS data + metadata, which you don't want to lose.

You can find more on log files here.

HTH

*Warm regards,*
*Mohammad Tariq*
*cloudfront.blogspot.com *
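
For the appender change in point 2, a size-capped variant would look roughly
like the following in log4j.properties (the sizes are illustrative; the File
setting mirrors the stock Hadoop properties):

    # replace the daily-rolling appender with a size-capped rolling appender
    log4j.appender.DRFA=org.apache.log4j.RollingFileAppender
    log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
    log4j.appender.DRFA.MaxFileSize=256MB
    log4j.appender.DRFA.MaxBackupIndex=10
    log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
    log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n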


On Wed, May 7, 2014 at 9:10 AM, sam liu  wrote:

> Hi Experts,
>
> 1. The size of mapred.local.dir is big(30 GB), how many methods could
> clean it correctly?
> 2. For logs of NameNode/DataNode/JobTracker/TaskTracker, are they all
> rolling type log? What's their max size? I can not find the specific
> settings for them in log4j.properties.
> 3. I find the size of dfs.name.dir and dfs.data.dir is very big now, are
> there any files under them could be removed actually? Or all files under
> the two folders could not be removed at all?
>
> Thanks!
>


Re: Two simple question about hadoop 2

2013-07-21 Thread Harsh J
Hi,

On Mon, Jul 22, 2013 at 7:26 AM, Yexi Jiang  wrote:
> Hi, all,
>
> I have two simple questions about hadoop 2 (or YARN).
> 1. When will the stable version of hadoop 2.x come out?

No fixed date yet, but probably later this year. There is a formal
beta that should release soon (2.1.0-beta), which should evolve into a
stable version afterwards.

> 2. We currently have a cluster deployed with hadoop 1.x, is there any way to
> upgrade it to hadoop 2.x without damaging the existing data in current HDFS?

Upgrading to 2.x from 1.x is a supported scenario and does not harm
your HDFS data.
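
Roughly, the HDFS side of that upgrade follows the sketch below (a sketch of
the standard upgrade procedure, not an exhaustive runbook; back up the NameNode
metadata in dfs.name.dir before starting):

    # on the running 1.x cluster: stop everything cleanly
    stop-all.sh
    # deploy the 2.x binaries and configuration, then bring HDFS up in upgrade mode
    start-dfs.sh -upgrade
    # once the data is verified and jobs have run for a while, make it permanent
    hdfs dfsadmin -finalizeUpgrade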

> Thank you very much!
>
> Regards,
> Yexi
>
>
>



--
Harsh J


Re: question about hadoop in maven repository

2012-12-17 Thread Jeff Zhang
It looks like that you are using hadoop 1.x which include several
sub-projects. I'm not sure Mahout support 1.x

Suggest you use hadoop 0.20.x
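
If the goal is just to compile against a custom Hadoop build, one sketch (not
Mahout's official recipe; the version and artifact names are illustrative) is
to install the jar into the local repository and pin it via dependencyManagement
in the parent pom:

    mvn install:install-file -Dfile=hadoop-core-1.1.1-custom.jar \
        -DgroupId=org.apache.hadoop -DartifactId=hadoop-core \
        -Dversion=1.1.1-custom -Dpackaging=jar

    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-core</artifactId>
          <version>1.1.1-custom</version>
        </dependency>
      </dependencies>
    </dependencyManagement>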


On Mon, Dec 17, 2012 at 1:31 PM, Yunming Zhang
wrote:

> Hi,
>
> I am modifying the dependencies for Mahout package, (the open source
> machine learning package built on top of Hadoop),
>
> I am a bit confused over why there are so many hadoop dependencies in the
> maven project, there are four artifactIds
> 1) hadoop-core, 2) hadoop-common, 3)hadoop-mapreduce-client-**core,
> 4)hadoop-mapreduce-client-**common
>
> I am trying to replace the hadoop jar file used to compile with my
> customized version,
>
> Thanks
>
> Yunming
>



-- 
Best Regards

Jeff Zhang


issue about hadoop data migrate between IDC

2014-08-06 Thread ch huang
hi,maillist:
   my company signed a new IDC ,i must move the hadoop
data(about 30T data) to new IDC,any good suggestion?


Re: Question about Hadoop YARN in 2.7.3

2021-05-13 Thread HK
You can automate to create a capacity-scheduler.xml based on the
requirement, after that you can deploy it on RM, and refresh the queue.
Is your requirement not to restart the RM, or not to change the capacity
scheduler?
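
For reference, the refresh step mentioned above is done without restarting the
ResourceManager:

    # reload the regenerated capacity-scheduler.xml on the ResourceManager
    $HADOOP_YARN_HOME/bin/yarn rmadmin -refreshQueues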

On Thu, May 13, 2021 at 2:45 PM 慧波彭  wrote:

> Hello, we use capacity scheduler to allocate resources in our production
> environment, and use node label to isolate resources.
> There is a demand that we want to dynamically create node labels and
> associate node labels to existing queue without
> changing capacity-scheduler.xml.
> Does anyone know how to implement it?
> I am looking forward to hearing from you.
>


Re: Question about Hadoop YARN in 2.7.3

2021-05-13 Thread Malcolm McFarland
Heya,
I implemented something similar last year to guarantee resource
provisioning when we deployed to YARN. We stuck to one-label-per-node
to keep things relatively simple. Iirc, these are the basic steps:

- add `yarn.node-labels.configuration-type=centralized` to your yarn-site.xml
- set up your queues with
`yarn.scheduler.capacity.root.queues=QUEUE1,QUEUE2,...`,
`yarn.scheduler.capacity.root.QUEUE1.queues=SUBQUEUE1,SUBQUEUE2`,
etc (we only used one layer)
- add provisioning requirements as needed (ie
`yarn.scheduler.capacity.root.capacity=100`,
`yarn.scheduler.capacity.root.QUEUE1.capacity=X`, etc)
- add accessible labels for the queues, ie
`yarn.scheduler.capacity.root.QUEUE1.accessible-node-labels=LABEL1,LABEL2,...`

With all of this organization in place, you can then allocate nodes
dynamically using the main `yarn rmadmin` command, ie

$HADOOP_YARN_HOME/bin/yarn rmadmin -addToClusterNodeLabels LABEL1,LABEL2,..
$HADOOP_YARN_HOME/bin/yarn rmadmin -replaceLabelsOnNode
"node.dns=LABEL node.dns=LABEL ..."
Malcolm McFarland
Cavulus

Malcolm McFarland
Cavulus


This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus. Any
unauthorized or improper disclosure, copying, distribution, or use of
the contents of this message is prohibited. The information contained
in this message is intended only for the personal and confidential use
of the recipient(s) named above. If you have received this message in
error, please notify the sender immediately and delete the original
message.


On Thu, May 13, 2021 at 3:05 AM HK  wrote:
>
> You can automate to create a capacity-scheduler.xml based on the requirement, 
> after that you can deploy it on RM, and refresh the queue.
> Is it your requirement to not to restart RM or not to change capacity 
> scheduler?
>
> On Thu, May 13, 2021 at 2:45 PM 慧波彭  wrote:
>>
>> Hello, we use capacity scheduler to allocate resources in our production 
>> environment, and use node label to isolate resources.
>> There is a demand that we want to dynamically create node labels and 
>> associate node labels to existing queue without changing 
>> capacity-scheduler.xml.
>> Does anyone know how to implement it?
>> I am looking forward to hearing from you.

-
To unsubscribe, e-mail: user-unsubscr...@hadoop.apache.org
For additional commands, e-mail: user-h...@hadoop.apache.org



Re: Question about Hadoop YARN in 2.7.3

2021-05-13 Thread Malcolm McFarland
Sorry about the formatting on that, I hit send before I'd checked it. Here
it is again, hopefully a bit more legibly (and with a fix):

> I implemented something similar last year to guarantee resource
provisioning when we deployed to YARN. We stuck to one-label-per-node to
keep things relatively simple. Iirc, these are the basic steps:

- add `yarn.node-labels.configuration-type=centralized` to your
yarn-site.xml
- set up your queues with
`yarn.scheduler.capacity.root.queues=QUEUE1,QUEUE2,...`,
`yarn.scheduler.capacity.root.QUEUE1.queues=SUBQUEUE1,SUBQUEUE2`, etc
(we only used one layer)
- add provisioning requirements as needed (ie
`yarn.scheduler.capacity.root.capacity=100`,
`yarn.scheduler.capacity.root.QUEUE1.capacity=X`, etc)
- add accessible labels for the queues, ie
`yarn.scheduler.capacity.root.QUEUE1.accessible-node-labels=LABEL1,LABEL2,...`

> With all of this organization in place, you can then allocate nodes
dynamically using the main `yarn rmadmin` command, ie

$HADOOP_YARN_HOME/bin/yarn rmadmin -addToClusterNodeLabels LABEL1,LABEL2,..
$HADOOP_YARN_HOME/bin/yarn rmadmin -replaceLabelsOnNode "node.dns=LABEL1
node.dns=LABEL1 node.dns=LABEL2 ..." -failOnUnknownNodes

Hth
Malcolm McFarland
Cavulus
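
Put into capacity-scheduler.xml form, the queue and label wiring above looks
roughly like this (queue names, label names and percentages are placeholders;
depending on the version, per-label capacities such as
yarn.scheduler.capacity.root.batch.accessible-node-labels.LABEL1.capacity may
also be required):

    <property>
      <name>yarn.scheduler.capacity.root.queues</name>
      <value>batch,realtime</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.batch.capacity</name>
      <value>60</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.realtime.capacity</name>
      <value>40</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.batch.accessible-node-labels</name>
      <value>LABEL1</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.realtime.accessible-node-labels</name>
      <value>LABEL2</value>
    </property>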


On Thu, May 13, 2021 at 10:38 AM Malcolm McFarland 
wrote:

> Heya,
> I implemented something similar last year to guarantee resource
> provisioning when we deployed to YARN. We stuck to one-label-per-node
> to keep things relatively simple. Iirc, these are the basic steps:
>
> - add `yarn.node-labels.configuration-type=centralized` to your
> yarn-site.xml
> - set up your queues with
> `yarn.scheduler.capacity.root.queues=,,`,
> `yarn.scheduler.capacity.root..queues=,`,
> etc (we only used one layer)
> - add provisioning requirements as needed (ie
> `yarn.scheduler.capacity.root.capacity=100`,
> `yarn.scheduler.capacity.root..capacity=X`, etc)
> - add accessible labels for the queues, ie
>
> `yarn.scheduler.capacity.root..accessible-node-labels=[,,..]`
>
> With all of this organization in place, you can then allocate nodes
> dynamically using the main `yarn rmadmin` command, ie
>
> $HADOOP_YARN_HOME/bin/yarn rmadmin -addToClusterNodeLabels LABEL1,LABEL2,..
> $HADOOP_YARN_HOME/bin/yarn rmadmin -replaceLabelsOnNode
> "node.dns=LABEL node.dns=LABEL ..."
> Malcolm McFarland
> Cavulus
>
> Malcolm McFarland
> Cavulus
>
>
> This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus. Any
> unauthorized or improper disclosure, copying, distribution, or use of
> the contents of this message is prohibited. The information contained
> in this message is intended only for the personal and confidential use
> of the recipient(s) named above. If you have received this message in
> error, please notify the sender immediately and delete the original
> message.
>
>
> On Thu, May 13, 2021 at 3:05 AM HK  wrote:
> >
> > You can automate to create a capacity-scheduler.xml based on the
> requirement, after that you can deploy it on RM, and refresh the queue.
> > Is it your requirement to not to restart RM or not to change capacity
> scheduler?
> >
> > On Thu, May 13, 2021 at 2:45 PM 慧波彭  wrote:
> >>
> >> Hello, we use capacity scheduler to allocate resources in our
> production environment, and use node label to isolate resources.
> >> There is a demand that we want to dynamically create node labels and
> associate node labels to existing queue without changing
> capacity-scheduler.xml.
> >> Does anyone know how to implement it?
> >> I am looking forward to hearing from you.
>


Re: a question about hadoop source code repo

2014-05-16 Thread Ted Yu
Which branch are you looking at ?
In trunk, I see:

$ ls hadoop-mapreduce-project
CHANGES.txt NOTICE.txt conf hadoop-mapreduce-examples target
INSTALL assembly dev-support lib
LICENSE.txt bin hadoop-mapreduce-client pom.xml

$ find hadoop-mapreduce-project -name src
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs-plugins/src
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src
hadoop-mapreduce-project/hadoop-mapreduce-examples/src



On Mon, May 12, 2014 at 7:18 AM, Libo Yu  wrote:

> Hi,
>
> Under hadoop-mapreduce-project directory, I notice the following two
> directories:
>  1hadoop-mapreduce-client/
>  2src/
>
> 2 src/ can be expanded to
>   java/org/apache/hadoop/mapreduce
>
> My question is what the directory is for. And I wonder why mapreduce code
> is in hadoop-mapreduce-client.  Thanks.
>
> Libo
>
>


One simple question about hadoop exit-code 65

2016-04-02 Thread 169517388
to hadoop.org:

Hello hadoop.org. I'm new and learning Hadoop right now. I've built a 5-node
experimental Hadoop environment. When I ran the MR program, the error below
came out.
I searched a lot. Maybe it's the Java runtime environment or something else; I
didn't get it.

16/04/02 16:03:37 INFO mapreduce.Job: Task Id : 
attempt_1459528700872_0003_m_00_1, Status : FAILED 
Exception from container-launch: ExitCodeException exitCode=65: 
ExitCodeException exitCode=65: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538) 
at org.apache.hadoop.util.Shell.run(Shell.java:455) 
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702) 
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
 
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
 
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:745)

I just don't know how to resolve it.

By the way, is there any forum about Hadoop? I want to talk more about
Hadoop with some friends.
So please: I'm already going crazy, because I've been stuck on this problem for
one week.
Please help me, thank you very much.






HY Frank
From China Shanghai


Re: One simple question about hadoop exit-code 65

2016-04-02 Thread Gagan Brahmi
Looks like a problem with the Java executable or the container executor file
on the nodes.

I would recommend you verify the Java executable on all nodes
(/usr/bin/java). It is possible the links are missing.


Regards,
Gagan Brahmi
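
A quick way to run that check across a small cluster (the hostnames below are
placeholders):

    # verify that java exists on each node and resolves to a real JDK/JRE
    for h in node1 node2 node3 node4 node5; do
      echo "== $h =="
      ssh "$h" 'which java && java -version && readlink -f "$(which java)"'
    done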

On Sat, Apr 2, 2016 at 9:40 AM, 169517388 <169517...@qq.com> wrote:
> to hadoop.org:
>
> Hello hadoop.org. I'm the new guy who is learning the hadoop right now.
> I'v bulit a 5 nodes hadoop experimental environment. When I ran the MR
> program, the error came out.
> I searched a lot. Maybe the java runtime environment or anything else. I
> didn't get it.
>
> 16/04/02 16:03:37 INFO mapreduce.Job: Task Id :
> attempt_1459528700872_0003_m_00_1, Status : FAILED
> Exception from container-launch: ExitCodeException exitCode=65:
> ExitCodeException exitCode=65:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
> at org.apache.hadoop.util.Shell.run(Shell.java:455)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
> at
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
> I just don't know how to resolve it.
>
> By the way, is their any forum about hadoop. I want to talk more about
> the hadoop with some friends.
> So please, I'v already crazy, beause I'v been disturb by this problem
> for one week.
> Please help me, thank you very much.
>
>
>
>
> 
> HY Frank
> From China Shanghai

-
To unsubscribe, e-mail: user-unsubscr...@hadoop.apache.org
For additional commands, e-mail: user-h...@hadoop.apache.org



Re: One simple question about hadoop exit-code 65

2016-04-02 Thread Musty Rehmani
Can you share the following info from your environment so that it can better
help us in helping you with this issue? Simply looking at Java exceptions may
not be enough:
- Hadoop version
- Java version
- Memory allocations and heap size for the node manager and containers
- How the job was run: Hive, Spark or Java MR code
- Have you restarted the node manager?
Thanks, Musty

Sent from Yahoo Mail on Android 
 
  On Sat, Apr 2, 2016 at 1:15 PM, 169517388<169517...@qq.com> wrote:   to 
hadoop.org:
    Hello hadoop.org. I'm the new guy who is learning the hadoop right now. I'v 
bulit a 5 nodes hadoop experimental environment. When I ran the MR program, the 
error came out.    I searched a lot. Maybe the java runtime environment or 
anything else. I didn't get it.
    16/04/02 16:03:37 INFO mapreduce.Job: Task Id : 
attempt_1459528700872_0003_m_00_1, Status : FAILED Exception from 
container-launch: ExitCodeException exitCode=65: 
ExitCodeException exitCode=65: 
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
 at org.apache.hadoop.util.Shell.run(Shell.java:455)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
 at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
 at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
    I just don't know how to resolve it.
    By the way, is their any forum about hadoop. I want to talk more about the 
hadoop with some friends.     So please, I'v already crazy, beause I'v been 
disturb by this problem for one week.    Please help me, thank you very much.
    
    
HY FrankFrom China Shanghai  


Noob question about Hadoop job that writes output to HBase

2017-04-20 Thread evelina dumitrescu
Hi,

I am new to Hadoop and Hbase.
I was trying to make a small proof-of-concept Hadoop map reduce job that
reads the data from HDFS and stores the output in Hbase.
I did the setup as presented in this tutorial [1].
Here is the pseudocode from the map reduce code [2].
The problem is that I am unable to contact Hbase from the Hadoop job and
the job gets stuck.
Here are the logs from syslog [3], stderr [4] and console [5].
How should I correctly setup HbaseConfiguration ?
I couldn't find any example online that worked and it's hard for a beginner
to debug the issue.
Any help would be appreciated.

Thank you,
Evelina

[1]
http://www.bogotobogo.com/Hadoop/BigData_hadoop_HBase_Pseudo_Distributed.php
[2]
https://pastebin.com/hUDAMMes
[3]
https://pastebin.com/XxmWAUTf
[4]
https://pastebin.com/fYUYw4Cv
[5]
https://pastebin.com/YJ1hERDe


Question about Hadoop/HDFS files written and maximun file sizes

2020-05-14 Thread J M
Hi,

I don't have much knowledge about Hadoop/HDFS, my question can be simple,
or not...

Then, I have a Hadoop/HDFS environment, but my disks are not very big.

One applicacion is writing in files. But, sometimes the disk is filled with
large file sizes.

Then, my question is:

Exist any form to limitating the maximum file sizes written in HDFS?

I was thinking of something like:
When a file have a size of >= 1Gb, then new data written to this file,
cause that the first data written to this file deleted. In this way the
file size would always be limited, as a rolled file.

Howto do this task?

Regards,
Cesar Jorge


Re: Noob question about Hadoop job that writes output to HBase

2017-04-21 Thread evelina dumitrescu
The Hadoop version that I use is 2.7.1 and the Hbase version is 1.2.5.
I can do any operation from the HBase shell.

On Fri, Apr 21, 2017 at 8:01 AM, evelina dumitrescu <
evelina.a.dumitre...@gmail.com> wrote:

> Hi,
>
> I am new to Hadoop and Hbase.
> I was trying to make a small proof-of-concept Hadoop map reduce job that
> reads the data from HDFS and stores the output in Hbase.
> I did the setup as presented in this tutorial [1].
> Here is the pseudocode from the map reduce code [2].
> The problem is that I am unable to contact Hbase from the Hadoop job and
> the job gets stuck.
> Here are the logs from syslog [3], stderr [4] and console [5].
> How should I correctly setup HbaseConfiguration ?
> I couldn't find any example online that worked and it's hard for a
> beginner to debug the issue.
> Any help would be appreciated.
>
> Thank you,
> Evelina
>
> [1]
> http://www.bogotobogo.com/Hadoop/BigData_hadoop_HBase_Pseudo
> _Distributed.php
> [2]
> https://pastebin.com/hUDAMMes
> [3]
> https://pastebin.com/XxmWAUTf
> [4]
> https://pastebin.com/fYUYw4Cv
> [5]
> https://pastebin.com/YJ1hERDe
>


Re: Noob question about Hadoop job that writes output to HBase

2017-04-22 Thread Ravi Prakash
Hi Evelina!

You've posted the logs for the MapReduce ApplicationMaster . From this I
can see the reducer timed out after 600 secs :
2017-04-21 00:24:07,747 INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics
report from attempt_1492722585320_0001_r_00_0:
AttemptID:attempt_1492722585320_0001_r_00_0 Timed out after 600 secs

To find out why the reducer timed out, you'd have to go look at the logs of
the reducer. These too are available from the RM page (where you got these
logs). just click a few more links deeper.

HTH
Ravi

On Fri, Apr 21, 2017 at 1:58 AM, evelina dumitrescu <
evelina.a.dumitre...@gmail.com> wrote:

> The Hadoop version that I use is 2.7.1 and the Hbase version is 1.2.5.
> I can do any operation from the HBase shell.
>
>
> On Fri, Apr 21, 2017 at 8:01 AM, evelina dumitrescu <
> evelina.a.dumitre...@gmail.com> wrote:
>
>> Hi,
>>
>> I am new to Hadoop and Hbase.
>> I was trying to make a small proof-of-concept Hadoop map reduce job that
>> reads the data from HDFS and stores the output in Hbase.
>> I did the setup as presented in this tutorial [1].
>> Here is the pseudocode from the map reduce code [2].
>> The problem is that I am unable to contact Hbase from the Hadoop job and
>> the job gets stuck.
>> Here are the logs from syslog [3], stderr [4] and console [5].
>> How should I correctly setup HbaseConfiguration ?
>> I couldn't find any example online that worked and it's hard for a
>> beginner to debug the issue.
>> Any help would be appreciated.
>>
>> Thank you,
>> Evelina
>>
>> [1]
>> http://www.bogotobogo.com/Hadoop/BigData_hadoop_HBase_Pseudo
>> _Distributed.php
>> [2]
>> https://pastebin.com/hUDAMMes
>> [3]
>> https://pastebin.com/XxmWAUTf
>> [4]
>> https://pastebin.com/fYUYw4Cv
>> [5]
>> https://pastebin.com/YJ1hERDe
>>
>
>


Re: Noob question about Hadoop job that writes output to HBase

2017-04-22 Thread Sidharth Kumar
I guess even the aggregated logs will contain the error, which can be collected by using:

yarn logs -applicationId  > .log
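
For example (the application id below is inferred from the attempt id in the
AM log above; substitute your own):

yarn logs -applicationId application_1492722585320_0001 > application_1492722585320_0001.log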



Sidharth
Mob: +91 819799
LinkedIn: www.linkedin.com/in/sidharthkumar2792

On 22-Apr-2017 5:40 PM, "Ravi Prakash"  wrote:

> Hi Evelina!
>
> You've posted the logs for the MapReduce ApplicationMaster . From this I
> can see the reducer timed out after 600 secs :
> 2017-04-21 00:24:07,747 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics
> report from attempt_1492722585320_0001_r_00_0: 
> AttemptID:attempt_1492722585320_0001_r_00_0
> Timed out after 600 secs
>
> To find out why the reducer timed out, you'd have to go look at the logs
> of the reducer. These too are available from the RM page (where you got
> these logs). just click a few more links deeper.
>
> HTH
> Ravi
>
> On Fri, Apr 21, 2017 at 1:58 AM, evelina dumitrescu <
> evelina.a.dumitre...@gmail.com> wrote:
>
>> The Hadoop version that I use is 2.7.1 and the Hbase version is 1.2.5.
>> I can do any operation from the HBase shell.
>>
>>
>> On Fri, Apr 21, 2017 at 8:01 AM, evelina dumitrescu <
>> evelina.a.dumitre...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am new to Hadoop and Hbase.
>>> I was trying to make a small proof-of-concept Hadoop map reduce job that
>>> reads the data from HDFS and stores the output in Hbase.
>>> I did the setup as presented in this tutorial [1].
>>> Here is the pseudocode from the map reduce code [2].
>>> The problem is that I am unable to contact Hbase from the Hadoop job and
>>> the job gets stuck.
>>> Here are the logs from syslog [3], stderr [4] and console [5].
>>> How should I correctly setup HbaseConfiguration ?
>>> I couldn't find any example online that worked and it's hard for a
>>> beginner to debug the issue.
>>> Any help would be appreciated.
>>>
>>> Thank you,
>>> Evelina
>>>
>>> [1]
>>> http://www.bogotobogo.com/Hadoop/BigData_hadoop_HBase_Pseudo
>>> _Distributed.php
>>> [2]
>>> https://pastebin.com/hUDAMMes
>>> [3]
>>> https://pastebin.com/XxmWAUTf
>>> [4]
>>> https://pastebin.com/fYUYw4Cv
>>> [5]
>>> https://pastebin.com/YJ1hERDe
>>>
>>
>>
>


Agenda & More Information about Hadoop Community Meetup @ Palo Alto, June 26

2019-06-19 Thread Wangda Tan
Hi All,

I want to let you know that we have confirmed most of the agenda for Hadoop
Community Meetup. It will be a whole day event.

Agenda & Dial-In info: see below. Please RSVP
at https://www.meetup.com/Hadoop-Contributors/events/262055924/

Huge thanks to Daniel Templeton, Wei-Chiu Chuang, Christina Vu for helping
with organizing and logistics.

Please help to promote meetup information on Twitter, LinkedIn, etc.
Appreciated!

Best,
Wangda

AM:

9:00: Arrival and check-in

9:30 - 10:15:
Talk: Hadoop storage in cloud-native environments
Abstract: Hadoop is a mature storage system but designed years before the
cloud-native movement. Kubernetes and other cloud-native tools are emerging
solutions for containerized environments but sometimes they require
different approaches. In this presentation we would like to share our
experiences to run Apache Hadoop Ozone in Kubernetes and the connection
point to other cloud-native ecosystem elements. We will compare the
benefits and drawbacks to use Kubernetes and Hadoop storage together and
show our current achievements and future plans.
Speaker: Marton Elek (Cloudera)

10:20 - 11:00:
Talk: Selective Wire Encryption In HDFS
Abstract: Wire data encryption is a key component of the Hadoop Distributed
File System (HDFS). However, such encryption enforcement comes in as an
all-or-nothing feature. In our use case at LinkedIn, we would like to
selectively expose fast unencrypted access to fully managed internal
clients, which can be trusted, while only expose encrypted access to
clients outside of the trusted circle with higher security risks. That way
we minimize performance overhead for trusted internal clients while still
securing data from potential outside threats. Our design extends HDFS
NameNode to run on multiple ports, connecting to different NameNode ports
would end up with different levels of encryption protection. This
protection then gets enforced for both NameNode RPC and the subsequent data
transfers to/from DataNode. This approach comes with minimum operational
and performance overhead.
Speaker: Konstantin Shvachko (LinkedIn), Chen Liang (LinkedIn)

11:10 - 11:55:
Talk: YuniKorn: Next Generation Scheduling for YARN and K8s
Abstract: We will talk about our open source work - the YuniKorn scheduler
project (Y for YARN, K for K8s, uni- for Unified) brings long-wanted
features such as hierarchical queues, fairness between users/jobs/queues,
preemption to Kubernetes; and it brings service scheduling enhancements to
YARN. Any improvements to this scheduler can benefit both the Kubernetes
and YARN communities.
Speaker: Wangda Tan (Cloudera)

PM:

12:00 - 12:55: Lunch Break (provided by Cloudera)

1:00 - 1:25:
Talk: Yarn Efficiency at Uber
Abstract: We will present the work done at Uber to improve YARN cluster
utilization and job SOA with elastic resource management, low compute
workload on passive datacenter, preemption, larger containers, etc. We will
also go through the YARN upgrade in order to adopt new features and talk
about the challenges.
Speaker: Aihua Xu (Uber), Prashant Golash (Uber)

1:30 - 2:10: One more talk

2:20 - 4:00:
BoF sessions & Breakout Sessions & Group discussions: items like JDK 11
support, next releases (2.10.0, 3.3.0, etc.), Hadoop on Cloud, etc.

4:00: Reception provided by Cloudera.

==========
Join Zoom Meeting: https://cloudera.zoom.us/j/116816195


Re: Question about Hadoop/HDFS files written and maximum file sizes

2020-05-14 Thread Deepak Vohra
The maximum file size is not configurable directly, but other settings can
affect it, such as the maximum number of blocks per file,
dfs.namenode.fs-limits.max-blocks-per-file. This prevents the creation of
extremely large files which can degrade performance:

<property>
  <name>dfs.namenode.fs-limits.max-blocks-per-file</name>
  <value>1048576</value>
  <description>Maximum number of blocks per file, enforced by the Namenode on
    write. This prevents the creation of extremely large files which can
    degrade performance.</description>
</property>

Space quotas and storage type quotas may also be set:
https://hadoop.apache.org/docs/r3.0.3/hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html
https://www.informit.com/articles/article.aspx?p=2755708&seqNum=4
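
For example, a per-directory space quota can be set with dfsadmin (a sketch;
/user/app/logs is a placeholder path):

# Cap the space consumed under /user/app/logs at 10 GB; replicated bytes
# count against the quota, and writes fail once it is exhausted.
hdfs dfsadmin -setSpaceQuota 10g /user/app/logs

# Remove the quota again if needed.
hdfs dfsadmin -clrSpaceQuota /user/app/logs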

On Thursday, May 14, 2020, 09:19:44 a.m. UTC, J M 
 wrote:  
 
 Hi,
I don't have much knowledge about Hadoop/HDFS, my question can be simple, or 
not...
Then, I have a Hadoop/HDFS environment, but my disks are not very big.
One applicacion is writing in files. But, sometimes the disk is filled with 
large file sizes.
Then, my question is:
Exist any form to limitating the maximum file sizes written in HDFS?
I was thinking of something like:
When a file have a size of >= 1Gb, then new data written to this file, cause 
that the first data written to this file deleted. In this way the file size 
would always be limited, as a rolled file.
Howto do this task?
Regards,
Cesar Jorge

Re: Agenda & More Information about Hadoop Community Meetup @ Palo Alto, June 26

2019-06-20 Thread runlin zhang
It's great, I'm really looking forward to this Meetup.

> On Jun 20, 2019, at 7:49 AM, Wangda Tan  wrote:
> 
> Hi All,
> 
> I want to let you know that we have confirmed most of the agenda for Hadoop
> Community Meetup. It will be a whole day event.
> 
> Agenda & Dial-In info because see below, *please RSVP
> at https://www.meetup.com/Hadoop-Contributors/events/262055924/
> *
> 
> Huge thanks to Daniel Templeton, Wei-Chiu Chuang, Christina Vu for helping
> with organizing and logistics.
> 
> *Please help to promote meetup information on Twitter, LinkedIn, etc.
> Appreciated! *
> 
> Best,
> Wangda
> 
> [Full agenda quoted from the original announcement above.]


-
To unsubscribe, e-mail: user-unsubscr...@hadoop.apache.org
For additional commands, e-mail: user-h...@hadoop.apache.org



Re: Agenda & More Information about Hadoop Community Meetup @ Palo Alto, June 26

2019-06-25 Thread Wangda Tan
A friendly reminder,

The meetup will take place tomorrow at 9:00 AM PDT to 4:00 PM PDT.

The address is: 395 Page Mill Rd, Palo Alto, CA 94306
We’ll be in the Bigtop conference room on the 1st floor. Go left after
coming through the main entrance, and it will be on the right.

Zoom: https://cloudera.zoom.us/j/606607666

Please let me know if you have any questions. If you haven't RSVP'd yet,
please go ahead and RSVP so we can better prepare food, seating, etc.

Thanks,
Wangda

On Wed, Jun 19, 2019 at 4:49 PM Wangda Tan  wrote:

> Hi All,
>
> I want to let you know that we have confirmed most of the agenda for
> Hadoop Community Meetup. It will be a whole day event.
>
> Agenda & Dial-In info because see below, *please RSVP
> at https://www.meetup.com/Hadoop-Contributors/events/262055924/
> *
>
> Huge thanks to Daniel Templeton, Wei-Chiu Chuang, Christina Vu for helping
> with organizing and logistics.
>
> *Please help to promote meetup information on Twitter, LinkedIn, etc.
> Appreciated! *
>
> Best,
> Wangda
>
> [Full agenda quoted from the original announcement above.]
>


Re: Agenda & More Information about Hadoop Community Meetup @ Palo Alto, June 26

2019-06-25 Thread Weiwei Yang
Thanks Wangda.
Will this event be recorded? It will be extremely helpful for people who are 
unable to join to catch up.

Thanks
Weiwei
On Jun 26, 2019, 4:12 AM +0800, Wangda Tan , wrote:
A friendly reminder,

The meetup will take place tomorrow at 9:00 AM PDT to 4:00 PM PDT.

The address is: 395 Page Mill Rd, Palo Alto, CA 94306
We’ll be in the Bigtop conference room on the 1st floor. Go left after
coming through the main entrance, and it will be on the right.

Zoom: https://cloudera.zoom.us/j/606607666

Please let me know if you have any questions. If you haven't RSVP yet,
please go ahead and RSVP so we can better prepare food, seat, etc.

Thanks,
Wangda

On Wed, Jun 19, 2019 at 4:49 PM Wangda Tan  wrote:

Hi All,

I want to let you know that we have confirmed most of the agenda for
Hadoop Community Meetup. It will be a whole day event.

Agenda & Dial-In info because see below, *please RSVP
at https://www.meetup.com/Hadoop-Contributors/events/262055924/
*

Huge thanks to Daniel Templeton, Wei-Chiu Chuang, Christina Vu for helping
with organizing and logistics.

*Please help to promote meetup information on Twitter, LinkedIn, etc.
Appreciated! *

Best,
Wangda

[Full agenda quoted from the original announcement above.]



A question about Hadoop 1 job user id used for group mapping, which could lead to performance degradatioin

2014-01-07 Thread Jian Fang
Hi,

I looked at Hadoop 1.X source code and found some logic that I could not
understand.

In the org.apache.hadoop.mapred.Child class, there were two UGIs defined as
follows.

UserGroupInformation current = UserGroupInformation.getCurrentUser();
current.addToken(jt);

UserGroupInformation taskOwner =
    UserGroupInformation.createRemoteUser(firstTaskid.getJobID().toString());
taskOwner.addToken(jt);

But it is taskOwner that is actually passed as the UGI to the task tracker
and then to HDFS. The first one is not referenced anywhere.

final TaskUmbilicalProtocol umbilical =
  taskOwner.doAs(new PrivilegedExceptionAction<TaskUmbilicalProtocol>() {
    @Override
    public TaskUmbilicalProtocol run() throws Exception {
      return (TaskUmbilicalProtocol) RPC.getProxy(TaskUmbilicalProtocol.class,
          TaskUmbilicalProtocol.versionID,
          address,
          defaultConf);
    }
  });

What puzzled me is that the job id is actually passed in as the user name
to the task tracker. On the NameNode side, when it tries to map this
non-existing user name (i.e., the job id) to a group, it always returns an
empty array. As a result, we always see annoying warning messages such as

 WARN org.apache.hadoop.security.UserGroupInformation (IPC Server handler
63 on 9000): No groups available for user job_201401071758_0002

Sometimes the warning messages were logged so fast, hundreds or even
thousands per second on a big cluster, that system performance degraded
dramatically.

Could someone please explain why this logic was designed in this way? Any
benefit to use non-existing user for the group mapping? Or is this a bug?

Thanks in advance,

John


Re: A question about Hadoop 1 job user id used for group mapping, which could lead to performance degradatioin

2014-01-08 Thread Vinod Kumar Vavilapalli
It just seems like lazy code. You can see that, later, there is this:

{code}

for(Token token : UserGroupInformation.getCurrentUser().getTokens()) 
{
  childUGI.addToken(token);
}

{code}

So eventually the JobToken is getting added to the UGI which runs task-code.

>  WARN org.apache.hadoop.security.UserGroupInformation (IPC Server handler 63 
> on 9000): No groups available for user job_201401071758_0002

This seems to be a problem. When the task tries to reach the NameNode, it 
should do so as the user, not the job-id. It is not just logging, I'd be 
surprised if jobs pass. Do you have permissions enabled on HDFS?

Oh, or is this in non-secure mode (i.e. without kerberos)?

+Vinod


On Jan 7, 2014, at 5:14 PM, Jian Fang  wrote:

> Hi,
> 
> I looked at Hadoop 1.X source code and found some logic that I could not 
> understand. 
> 
> In the org.apache.hadoop.mapred.Child class, there were two UGIs defined as 
> follows.
> 
> UserGroupInformation current = UserGroupInformation.getCurrentUser();
> current.addToken(jt);
> 
> UserGroupInformation taskOwner 
>  = 
> UserGroupInformation.createRemoteUser(firstTaskid.getJobID().toString());
> taskOwner.addToken(jt);
> 
> But it is the taskOwner that is actually passed as a UGI to task tracker and 
> then to HDFS. The first one was not referenced any where.
> 
> final TaskUmbilicalProtocol umbilical = 
>   taskOwner.doAs(new PrivilegedExceptionAction() {
> @Override
> public TaskUmbilicalProtocol run() throws Exception {
>   return 
> (TaskUmbilicalProtocol)RPC.getProxy(TaskUmbilicalProtocol.class,
>   TaskUmbilicalProtocol.versionID,
>   address,
>   defaultConf);
> }
> });
> 
> What puzzled me is that the job id is actually passed in as the user name to 
> task tracker. On the Name node side, when it tries to map the non-existing 
> user name, i.e., task id, to a group, it always returns empty array. As a 
> result, we always see annoying warning messages such as
> 
>  WARN org.apache.hadoop.security.UserGroupInformation (IPC Server handler 63 
> on 9000): No groups available for user job_201401071758_0002
> 
> Sometimes, the warning messages were thrown so fast, hundreds or even 
> thousands per second for a big cluster, the system performance was degraded 
> dramatically. 
> 
> Could someone please explain why this logic was designed in this way? Any 
> benefit to use non-existing user for the group mapping? Or is this a bug?
> 
> Thanks in advance,
> 
> John




Re: A question about Hadoop 1 job user id used for group mapping, which could lead to performance degradatioin

2014-01-08 Thread Jian Fang
Thanks Vinod for your quick response. It is running in non-secure mode.

I still don't understand the purpose of using the job id in the UGI. Could
you please explain a bit more?

Thanks,

John


On Wed, Jan 8, 2014 at 10:11 AM, Vinod Kumar Vavilapalli <
vino...@hortonworks.com> wrote:

> It just seems like lazy code. You can see that, later, there is this:
>
> {code}
>
> for(Token token :
> UserGroupInformation.getCurrentUser().getTokens()) {
>   childUGI.addToken(token);
> }
>
> {code}
>
> So eventually the JobToken is getting added to the UGI which runs
> task-code.
>
> >  WARN org.apache.hadoop.security.UserGroupInformation (IPC Server
> handler 63 on 9000): No groups available for user job_201401071758_0002
>
> This seems to be a problem. When the task tries to reach the NameNode, it
> should do so as the user, not the job-id. It is not just logging, I'd be
> surprised if jobs pass. Do you have permissions enabled on HDFS?
>
> Oh, or is this in non-secure mode (i.e. without kerberos)?
>
> +Vinod
>
>
> On Jan 7, 2014, at 5:14 PM, Jian Fang 
> wrote:
>
> > Hi,
> >
> > I looked at Hadoop 1.X source code and found some logic that I could not
> understand.
> >
> > In the org.apache.hadoop.mapred.Child class, there were two UGIs defined
> as follows.
> >
> > UserGroupInformation current = UserGroupInformation.getCurrentUser();
> > current.addToken(jt);
> >
> > UserGroupInformation taskOwner
> >  =
> UserGroupInformation.createRemoteUser(firstTaskid.getJobID().toString());
> > taskOwner.addToken(jt);
> >
> > But it is the taskOwner that is actually passed as a UGI to task tracker
> and then to HDFS. The first one was not referenced any where.
> >
> > final TaskUmbilicalProtocol umbilical =
> >   taskOwner.doAs(new
> PrivilegedExceptionAction() {
> > @Override
> > public TaskUmbilicalProtocol run() throws Exception {
> >   return
> (TaskUmbilicalProtocol)RPC.getProxy(TaskUmbilicalProtocol.class,
> >   TaskUmbilicalProtocol.versionID,
> >   address,
> >   defaultConf);
> > }
> > });
> >
> > What puzzled me is that the job id is actually passed in as the user
> name to task tracker. On the Name node side, when it tries to map the
> non-existing user name, i.e., task id, to a group, it always returns empty
> array. As a result, we always see annoying warning messages such as
> >
> >  WARN org.apache.hadoop.security.UserGroupInformation (IPC Server
> handler 63 on 9000): No groups available for user job_201401071758_0002
> >
> > Sometimes, the warning messages were thrown so fast, hundreds or even
> thousands per second for a big cluster, the system performance was degraded
> dramatically.
> >
> > Could someone please explain why this logic was designed in this way?
> Any benefit to use non-existing user for the group mapping? Or is this a
> bug?
> >
> > Thanks in advance,
> >
> > John
>
>
>


Re: A question about Hadoop 1 job user id used for group mapping, which could lead to performance degradatioin

2014-01-08 Thread Jian Fang
I looked a bit deeper, and it seems this code was introduced by the
following JIRA:

https://issues.apache.org/jira/browse/MAPREDUCE-1457

There is another related JIRA, i.e.,
https://issues.apache.org/jira/browse/MAPREDUCE-4329.

Perhaps the warning messages are a side effect of MAPREDUCE-1457 when the
cluster is running in non-secured mode. Some code path causes the job id to
be treated as the user name in the task tracker or job tracker, and the job
id is then passed to the HDFS name node. This is definitely a big problem,
since the heavy warning logging alone degraded system performance on a
relatively big cluster.

This behavior is very easy to reproduce by simply running terasort on a
cluster.

Any suggestion to fix this problem?




On Wed, Jan 8, 2014 at 11:18 AM, Jian Fang wrote:

> Thanks Vinod for your quick response. It is running in non-secure mode.
>
> I still don't get what is the purpose to use job id in UGI. Could you
> please explain a bit more?
>
> Thanks,
>
> John
>
>
> On Wed, Jan 8, 2014 at 10:11 AM, Vinod Kumar Vavilapalli <
> vino...@hortonworks.com> wrote:
>
>> It just seems like lazy code. You can see that, later, there is this:
>>
>> {code}
>>
>> for(Token token :
>> UserGroupInformation.getCurrentUser().getTokens()) {
>>   childUGI.addToken(token);
>> }
>>
>> {code}
>>
>> So eventually the JobToken is getting added to the UGI which runs
>> task-code.
>>
>> >  WARN org.apache.hadoop.security.UserGroupInformation (IPC Server
>> handler 63 on 9000): No groups available for user job_201401071758_0002
>>
>> This seems to be a problem. When the task tries to reach the NameNode, it
>> should do so as the user, not the job-id. It is not just logging, I'd be
>> surprised if jobs pass. Do you have permissions enabled on HDFS?
>>
>> Oh, or is this in non-secure mode (i.e. without kerberos)?
>>
>> +Vinod
>>
>>
>> On Jan 7, 2014, at 5:14 PM, Jian Fang 
>> wrote:
>>
>> > Hi,
>> >
>> > I looked at Hadoop 1.X source code and found some logic that I could
>> not understand.
>> >
>> > In the org.apache.hadoop.mapred.Child class, there were two UGIs
>> defined as follows.
>> >
>> > UserGroupInformation current =
>> UserGroupInformation.getCurrentUser();
>> > current.addToken(jt);
>> >
>> > UserGroupInformation taskOwner
>> >  =
>> UserGroupInformation.createRemoteUser(firstTaskid.getJobID().toString());
>> > taskOwner.addToken(jt);
>> >
>> > But it is the taskOwner that is actually passed as a UGI to task
>> tracker and then to HDFS. The first one was not referenced any where.
>> >
>> > final TaskUmbilicalProtocol umbilical =
>> >   taskOwner.doAs(new
>> PrivilegedExceptionAction() {
>> > @Override
>> > public TaskUmbilicalProtocol run() throws Exception {
>> >   return
>> (TaskUmbilicalProtocol)RPC.getProxy(TaskUmbilicalProtocol.class,
>> >   TaskUmbilicalProtocol.versionID,
>> >   address,
>> >   defaultConf);
>> > }
>> > });
>> >
>> > What puzzled me is that the job id is actually passed in as the user
>> name to task tracker. On the Name node side, when it tries to map the
>> non-existing user name, i.e., task id, to a group, it always returns empty
>> array. As a result, we always see annoying warning messages such as
>> >
>> >  WARN org.apache.hadoop.security.UserGroupInformation (IPC Server
>> handler 63 on 9000): No groups available for user job_201401071758_0002
>> >
>> > Sometimes, the warning messages were thrown so fast, hundreds or even
>> thousands per second for a big cluster, the system performance was degraded
>> dramatically.
>> >
>> > Could someone please explain why this logic was designed in this way?
>> Any benefit to use non-existing user for the group mapping? Or is this a
>> bug?
>> >
>> > Thanks in advance,
>> >
>> > John
>>
>>
>>
>
>


Re: A question about Hadoop 1 job user id used for group mapping, which could lead to performance degradatioin

2014-02-18 Thread Chris Schneider
Hi John,

My AWS Elastic MapReduce NameNode is also filling its log file with messages 
like the following:

2014-02-18 23:56:52,344 WARN org.apache.hadoop.security.UserGroupInformation 
(IPC Server handler 78 on 9000): No groups available for user 
job_201402182309_0073
2014-02-18 23:56:52,351 WARN org.apache.hadoop.security.UserGroupInformation 
(IPC Server handler 48 on 9000): No groups available for user 
job_201402182309_0073
2014-02-18 23:56:52,356 WARN org.apache.hadoop.security.UserGroupInformation 
(IPC Server handler 38 on 9000): No groups available for user 
job_201402182309_0073

I ran into this same issue in March 2013 and got past it by using an m1.xlarge 
master node (instead of my usual m1.large) when (like right now) I double my 
slave count (to 32 cc2.8xlarge instances) to re-import a lot of my input data. 
Using that m1.xlarge didn't prevent the NameNode from logging messages like 
this, but the beefier instance seemed to weather the load these messages 
represented better.

Unfortunately, even my m1.xlarge master node now seems overwhelmed. The cluster 
starts off fine, efficiently mowing through the jobs in my job flow step for a 
few hours, but it eventually gets into a mode where the copy phase of the 
reduce jobs appear to make no progress at all. At that point, the NameNode 
seems to be spending all of its time writing messages like the ones above.

The issue doesn't seem to be related to the NameNode JVM size (I tried 
increasing it to 4GB before I realized it never used more than ~400MB), nor 
dfs.namenode.handler.count (which I increased from 64 to 96).
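
For reference, that handler-count change is just an hdfs-site.xml override
on the master node; a sketch with the value mentioned above:

<property>
  <name>dfs.namenode.handler.count</name>
  <value>96</value>
  <description>Number of NameNode RPC handler threads; raised from 64 to 96
    as described above.</description>
</property>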

We're currently trying to work around the problem by hacking log4j.properties 
to set the logging level for org.apache.hadoop.security.UserGroupInformation to 
ERROR. We might have to do so for the entire package, as I've also seen the 
following in the NameNode logs:

2014-02-19 01:01:24,184 WARN 
org.apache.hadoop.security.ShellBasedUnixGroupsMapping (IPC Server handler 84 
on 9000): got exception trying to get groups for user job_201402182309_0226
org.apache.hadoop.util.Shell$ExitCodeException: id: job_201402182309_0226: No 
such user

at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
at org.apache.hadoop.util.Shell.run(Shell.java:182)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
at 
org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:78)
at 
org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:53)
at org.apache.hadoop.security.Groups.getGroups(Groups.java:79)
at 
org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1037)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.<init>(FSPermissionChecker.java:50)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5218)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkTraverse(FSNamesystem.java:5201)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2030)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.getFileInfo(NameNode.java:850)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:573)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

I would also be very interested in hearing Jakob Homan and Deveraj Das respond 
to your analysis of the changes made for MAPREDUCE-1457.

Please post again with any further information you're able to glean about this 
problem.

Thanks,

- Chris

On Jan 8, 2014, at 1:26 PM, Jian Fang wrote:

> Looked a bit deeper and seems this code was introduced by the following JIRA.
> 
> https://issues.apache.org/jira/browse/MAPREDUCE-1457
> 
> There is another related JIRA, i.e., 
> https://issues.apache.org/jira/browse/MAPREDUCE-4329.
> 
> Perhaps, the warning message is a side effect of JIRA MAPREDUCE-1457 when the 
> cluster is running in non-secured mode. There should be some code path that 
> caused the job id was treated as user name in task tracker or job tracker. 
> Then the job id was passed to HDFS name node. This is definitely a big 
> problem

Re: A question about Hadoop 1 job user id used for group mapping, which could lead to performance degradatioin

2014-02-21 Thread Chris Schneider
Hi John,

FWIW, setting the log level of org.apache.hadoop.security.UserGroupInformation 
to ERROR seemed to prevent the fatal NameNode slowdown we ran into. Although I 
still saw "no such user" Shell$ExitCodeException messages in the logs, these 
only occurred every few minutes or so. Thus, it seems like this is a reasonable 
workaround until the underlying problem is fixed. I suggest that you file a 
JIRA ticket, though, as nobody seems to be rushing in here to tell us what 
we're doing wrong.
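
For anyone who wants to apply the same workaround, the log4j.properties
lines are just (a sketch; the second, commented-out line silences the whole
security package, including ShellBasedUnixGroupsMapping):

# Drop the per-request "No groups available for user job_..." warnings.
log4j.logger.org.apache.hadoop.security.UserGroupInformation=ERROR

# Or, more broadly:
# log4j.logger.org.apache.hadoop.security=ERROR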

Thanks,

- Chris

On Feb 18, 2014, at 5:54 PM, Chris Schneider wrote:

> Hi John,
> 
> My AWS Elastic MapReduce NameNode is also filling its log file with messages 
> like the following:
> 
> 2014-02-18 23:56:52,344 WARN org.apache.hadoop.security.UserGroupInformation 
> (IPC Server handler 78 on 9000): No groups available for user 
> job_201402182309_0073
> 2014-02-18 23:56:52,351 WARN org.apache.hadoop.security.UserGroupInformation 
> (IPC Server handler 48 on 9000): No groups available for user 
> job_201402182309_0073
> 2014-02-18 23:56:52,356 WARN org.apache.hadoop.security.UserGroupInformation 
> (IPC Server handler 38 on 9000): No groups available for user 
> job_201402182309_0073
> 
> I ran into this same issue in March 2013 and got past it by using an 
> m1.xlarge master node (instead of my usual m1.large) when (like right now) I 
> double my slave count (to 32 cc2.8xlarge instances) to re-import a lot of my 
> input data. Using that m1.xlarge didn't prevent the NameNode from logging 
> messages like this, but the beefier instance seemed to weather the load these 
> messages represented better.
> 
> Unfortunately, even my m1.xlarge master node now seems overwhelmed. The 
> cluster starts off fine, efficiently mowing through the jobs in my job flow 
> step for a few hours, but it eventually gets into a mode where the copy phase 
> of the reduce jobs appear to make no progress at all. At that point, the 
> NameNode seems to be spending all of its time writing messages like the ones 
> above.
> 
> The issue doesn't seem to be related to the NameNode JVM size (I tried 
> increasing it to 4GB before I realized it never used more than ~400MB), nor 
> dfs.namenode.handler.count (which I increased from 64 to 96).
> 
> We're currently trying to work around the problem by hacking log4j.properties 
> to set the logging level for org.apache.hadoop.security.UserGroupInformation 
> to ERROR. We might have to do so for the entire package, as I've also seen 
> the following in the NameNode logs:
> 
> 2014-02-19 01:01:24,184 WARN 
> org.apache.hadoop.security.ShellBasedUnixGroupsMapping (IPC Server handler 84 
> on 9000): got exception trying to get groups for user job_201402182309_0226
> org.apache.hadoop.util.Shell$ExitCodeException: id: job_201402182309_0226: No 
> such user
> 
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
>   at org.apache.hadoop.util.Shell.run(Shell.java:182)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
>   at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
>   at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
>   at 
> org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:78)
>   at 
> org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:53)
>   at org.apache.hadoop.security.Groups.getGroups(Groups.java:79)
>   at 
> org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1037)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.(FSPermissionChecker.java:50)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5218)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkTraverse(FSNamesystem.java:5201)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2030)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.getFileInfo(NameNode.java:850)
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:573)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
> 
> I would also be very interested in hearing Jakob Homan and Deveraj Das 
> respond to your analysis of the changes made for MAPREDUCE-1457.
> 
> Please post again wi