Re: Resource limits with Hadoop and JVM

2013-09-27 Thread Forrest Aldrich

I wanted to elaborate on what happened.

A Hadoop slave was added to a live cluster. It turns out, I think, that the 
mapred-site.xml was not configured with the correct master host. In any case, 
when the following commands were run:



 * $ hadoop mradmin -refreshNodes
 * $ hadoop dfsadmin -refreshNodes


The master went completely berserk, climbing to a system load of 60, at which 
point it froze.


This should never, ever happen, no matter what the underlying issue is. So what 
I'm trying to understand is how to prevent this while still letting 
Hadoop/Java go about its business.


We are using an older version of Hadoop (1.0.1), so maybe we hit a bug; I 
can't really tell.


I read an article about Spotify experiencing issues like this and some of 
their approaches, but it's not clear which of those approaches applies here 
(I'm a newbie).



Thanks.



On 9/16/13 5:04 PM, Vinod Kumar Vavilapalli wrote:
I assume you are on Linux. Also assuming that your tasks are so resource 
intensive that they are taking down nodes, you should enable limits per task; 
see 
http://hadoop.apache.org/docs/stable/cluster_setup.html#Memory+monitoring


What this does is force jobs to declare their resource requirements up front, 
and the TaskTrackers then enforce those limits.
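
As a concrete illustration, here is a minimal job-side sketch, assuming the 
cluster-side mapred.cluster.*.memory.mb limits from that page are already 
configured in mapred-site.xml; the property names are the ones listed in the 
1.x docs, and the values are made up:

import org.apache.hadoop.mapred.JobConf;

public class MemoryAwareJobSetup {

    // Declare per-task memory requirements up front so the TaskTrackers can
    // enforce them (assumes the cluster-side memory-monitoring settings are
    // already in place; 1536/2048 MB are illustrative values only).
    public static JobConf withMemoryLimits(JobConf conf) {
        conf.setLong("mapred.job.map.memory.mb", 1536L);
        conf.setLong("mapred.job.reduce.memory.mb", 2048L);
        return conf;
    }
}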


HTH
+Vinod Kumar Vavilapalli
Hortonworks Inc.
http://hortonworks.com/

On Sep 16, 2013, at 1:35 PM, Forrest Aldrich wrote:

We recently experienced a couple of situations that brought one or more 
Hadoop nodes down (unresponsive). One was related to a bug in a utility we 
use (ffmpeg) that was resolved by compiling a new version. The next, today, 
occurred after attempting to join a new node to the cluster.


A basic start of the (local) tasktracker and datanode did not work -- so, 
based on a reference I found, I issued hadoop mradmin -refreshNodes, which 
was to be followed by hadoop dfsadmin -refreshNodes. The load average 
literally jumped to 60 and the master (which also runs a slave) became 
unresponsive.


Seems to me that this should never happen. But, looking around, I saw an 
article from Spotify which mentioned the need to set certain resource limits 
on the JVM as well as in the system itself (limits.conf; we run RHEL). I (and 
we) are fairly new to Hadoop, so some of these issues are new to us.


I wonder if some of the experts here might be able to comment on this 
issue - perhaps point out settings and other measures we can take to 
prevent this sort of incident in the future.


Our setup is not complicated. We have 3 Hadoop nodes; the first is also a 
master and a slave (and has more resources, too). The underlying system 
splits up tasks and hands them to ffmpeg (which is another issue, as it tends 
to eat resources, but so far with a recompile we are good). We have two more 
hardware nodes to add shortly.



Thanks!







Re: Retrieve and compute input splits

2013-09-27 Thread Jay Vyas
Technically, the block locations are provided by the InputSplit, which in the
FileInputFormat case gets them from the FileSystem interface.

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/InputSplit.html

The thing to realize here is that the FileSystem implementation is provided
at runtime - so the InputSplit class is responsible for creating a FileSystem
implementation using reflection, and then calling getBlockLocations() on the
file or set of files that the input split corresponds to.

I think your confusion lies in the fact that the input splits create a
filesystem; however, they don't know what the filesystem implementation
actually is - they only rely on the abstract contract, which provides a set
of block locations.

See the FileSystem abstract class for details on that.
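
As a rough sketch of that pattern (not the actual FileInputFormat code, just
an illustration of the abstract contract; the path is whatever file you point
it at):

import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationLookup {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path(args[0]); // e.g. hdfs://namenode:8020/data/file.txt

        // The concrete FileSystem (HDFS, local, ftp, ...) is resolved at runtime
        // from the path's URI scheme.
        FileSystem fs = path.getFileSystem(conf);
        FileStatus status = fs.getFileStatus(path);

        // Ask the abstract FileSystem contract for the file's block locations;
        // split computation records these hosts so tasks can be scheduled locally.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            System.out.println(block.getOffset() + "+" + block.getLength()
                    + " on " + Arrays.toString(block.getHosts()));
        }
    }
}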


On Fri, Sep 27, 2013 at 7:02 PM, Peyman Mohajerian wrote:

> For the JobClient to compute the input splits doesn't it need to contact
> Name Node. Only Name Node knows where the splits are, how can it compute it
> without that additional call?
>
>
> On Fri, Sep 27, 2013 at 1:41 AM, Sonal Goyal wrote:
>
>> The input splits are not copied, only the information on the location of
>> the splits is copied to the jobtracker so that it can assign tasktrackers
>> which are local to the split.
>>
>> Check the Job Initialization section at
>>
>> http://answers.oreilly.com/topic/459-anatomy-of-a-mapreduce-job-run-with-hadoop/
>>
>> To create the list of tasks to run, the job scheduler first retrieves
>> the input splits computed by the JobClient from the shared filesystem
>> (step 6). It then creates one map task for each split. The number of reduce
>> tasks to create is determined by the mapred.reduce.tasks property in the
>> JobConf, which is set by the setNumReduceTasks() method, and the
>> scheduler simply creates this number of reduce tasks to be run. Tasks are
>> given IDs at this point.
>>
>> Best Regards,
>> Sonal
>> Nube Technologies 
>>
>>  
>>
>>
>>
>>
>> On Fri, Sep 27, 2013 at 10:55 AM, Sai Sai  wrote:
>>
>>> Hi
>>> I have attached the anatomy of MR from definitive guide.
>>>
>>> In step 6 it says JT/Scheduler  retrieve  input splits computed by the
>>> client from hdfs.
>>>
>>> In the above line it refers to as the client computes input splits.
>>>
>>> 1. Why does the JT/Scheduler retrieve the input splits and what does it
>>> do.
>>> If it is retrieving the input split does this mean it goes to the block
>>> and reads each record
>>> and gets the record back to JT. If so this is a lot of data movement for
>>> large files.
>>> which is not data locality. so i m getting confused.
>>>
>>> 2. How does the client know how to calculate the input splits.
>>>
>>> Any help please.
>>> Thanks
>>> Sai
>>>
>>
>>
>


-- 
Jay Vyas
http://jayunit100.blogspot.com


Re: Retrieve and compute input splits

2013-09-27 Thread Peyman Mohajerian
For the JobClient to compute the input splits, doesn't it need to contact the
NameNode? Only the NameNode knows where the splits are, so how can the client
compute them without that additional call?


On Fri, Sep 27, 2013 at 1:41 AM, Sonal Goyal  wrote:

> The input splits are not copied, only the information on the location of
> the splits is copied to the jobtracker so that it can assign tasktrackers
> which are local to the split.
>
> Check the Job Initialization section at
>
> http://answers.oreilly.com/topic/459-anatomy-of-a-mapreduce-job-run-with-hadoop/
>
> To create the list of tasks to run, the job scheduler first retrieves the
> input splits computed by the JobClient from the shared filesystem (step
> 6). It then creates one map task for each split. The number of reduce tasks
> to create is determined by the mapred.reduce.tasks property in the JobConf,
> which is set by the setNumReduceTasks() method, and the scheduler simply
> creates this number of reduce tasks to be run. Tasks are given IDs at this
> point.
>
> Best Regards,
> Sonal
> Nube Technologies 
>
>  
>
>
>
>
> On Fri, Sep 27, 2013 at 10:55 AM, Sai Sai  wrote:
>
>> Hi
>> I have attached the anatomy of MR from definitive guide.
>>
>> In step 6 it says JT/Scheduler  retrieve  input splits computed by the
>> client from hdfs.
>>
>> In the above line it refers to as the client computes input splits.
>>
>> 1. Why does the JT/Scheduler retrieve the input splits and what does it
>> do.
>> If it is retrieving the input split does this mean it goes to the block
>> and reads each record
>> and gets the record back to JT. If so this is a lot of data movement for
>> large files.
>> which is not data locality. so i m getting confused.
>>
>> 2. How does the client know how to calculate the input splits.
>>
>> Any help please.
>> Thanks
>> Sai
>>
>
>


Calling the JobTracker from Reducer throws InvalidCredentials GSSException

2013-09-27 Thread Manish Verma
I am trying to get the JobTracker counters in my reducer. It works on a
single-node demo Hadoop setup but fails on a real cluster where Kerberos is
used for authentication.


RunningJob parentJob = client.getJob(
    JobID.forName(context.getConfiguration().get("mapred.job.id")));

Counters counters = parentJob.getCounters();


The call to the getCounters() API throws a GSSException (No valid credentials
provided - Failed to find any Kerberos tgt).

I launched this job using the hadoop jar command.
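
For context, the snippet above does not show how client is built. Here is a
minimal sketch of one common way to construct it from the task's own
configuration -- the surrounding reducer class and names are assumptions for
illustration, not taken from the actual job:

import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;
import org.apache.hadoop.mapreduce.Reducer;

public class ParentCounterReducer extends Reducer<Text, Text, Text, NullWritable> {

    @Override
    protected void setup(Context context) throws IOException {
        // Build a JobClient from the task's own configuration and look up the
        // parent job by its id (hypothetical construction, for illustration).
        JobConf jobConf = new JobConf(context.getConfiguration());
        JobClient client = new JobClient(jobConf);

        RunningJob parentJob = client.getJob(
                JobID.forName(context.getConfiguration().get("mapred.job.id")));

        // This RPC to the JobTracker runs as the task user, which on a secure
        // cluster typically holds only delegation tokens rather than a Kerberos
        // TGT -- which may be why the call above cannot find any Kerberos tgt.
        Counters counters = parentJob.getCounters();
        System.out.println("Parent job counters: " + counters);
    }
}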

Any help would be much appreciated.

Thanks
Manish


Re: Can container requests be made in parallel from multiple threads

2013-09-27 Thread Omkar Joshi
My point is: why do you want multiple threads within a single AM talking to the
RM simultaneously? The AMRMProtocol is supposed to be used only by the AM, and
if the requirement is to have multiple requestors asking for resources, then
those requests should be clubbed into one single request and sent to the RM.
One more thing that may be related: when the AM makes a request to the RM,
today it gets resources only if an earlier NM heartbeat resulted in the RM
scheduling one for it. So multiplexing AMRM requests won't help anyway; it will
only complicate things on the RM side. The scheduler is not kicked in
(synchronously) when the AM makes a request.
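
To make the "club them into one request" suggestion concrete, here is a rough
sketch using the AMRMClient helper -- a sketch only: registration with the RM
is omitted, the resource sizes and priority are made up, and the exact factory
methods may differ slightly across 2.x releases:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class SingleThreadedAllocator {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        AMRMClient<ContainerRequest> amrmClient = AMRMClient.createAMRMClient();
        amrmClient.init(conf);
        amrmClient.start();
        // (registerApplicationMaster(...) would normally happen here.)

        // Gather asks from every "requestor" in the AM, then let one thread
        // hand them all to the RM together.
        Priority priority = Priority.newInstance(0);
        Resource capability = Resource.newInstance(1024, 1); // 1 GB, 1 vcore (illustrative)
        for (int i = 0; i < 5; i++) {
            amrmClient.addContainerRequest(new ContainerRequest(capability, null, null, priority));
        }

        // A single allocate call (normally the AM heartbeat) carries all pending
        // requests; containers arrive over subsequent heartbeats as the scheduler
        // assigns them on NM heartbeats.
        AllocateResponse response = amrmClient.allocate(0.0f);
        System.out.println("Containers allocated so far: "
                + response.getAllocatedContainers().size());

        amrmClient.stop();
    }
}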

Thanks,
Omkar Joshi
*Hortonworks Inc.* 


On Fri, Sep 27, 2013 at 11:14 AM, Krishna Kishore Bonagiri <
write2kish...@gmail.com> wrote:

> Hi Omkar,
>
>   Thanks for the quick reply. I have a requirement for sets of containers
> depending on some of my business logic. I found that each of the request
> allocations is taking around 2 seconds, so I am thinking of doing the
> requests at the same from multiple threads.
>
> Kishore
>
>
> On Fri, Sep 27, 2013 at 11:27 PM, Omkar Joshi wrote:
>
>> Hi,
>>
>> I suggest you should not do that. After YARN-744 goes in this will be
>> prevented on RM side. May I know why you want to do this? any advantage/
>> use case?
>>
>> Thanks,
>> Omkar Joshi
>> *Hortonworks Inc.* 
>>
>>
>> On Fri, Sep 27, 2013 at 8:31 AM, Krishna Kishore Bonagiri <
>> write2kish...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>>   Can we submit container requests from multiple threads in parallel to
>>> the Resource Manager?
>>>
>>> Thanks,
>>> Kishore
>>>
>>
>>
>
>
>



Re: Can container requests be made in parallel from multiple threads

2013-09-27 Thread Krishna Kishore Bonagiri
Hi Omkar,

  Thanks for the quick reply. I have a requirement for sets of containers
depending on some of my business logic. I found that each of the request
allocations is taking around 2 seconds, so I am thinking of making the
requests at the same time from multiple threads.

Kishore


On Fri, Sep 27, 2013 at 11:27 PM, Omkar Joshi wrote:

> Hi,
>
> I suggest you should not do that. After YARN-744 goes in this will be
> prevented on RM side. May I know why you want to do this? any advantage/
> use case?
>
> Thanks,
> Omkar Joshi
> *Hortonworks Inc.* 
>
>
> On Fri, Sep 27, 2013 at 8:31 AM, Krishna Kishore Bonagiri <
> write2kish...@gmail.com> wrote:
>
>> Hi,
>>
>>   Can we submit container requests from multiple threads in parallel to
>> the Resource Manager?
>>
>> Thanks,
>> Kishore
>>
>
>


Re: Can container requests be made in parallel from multiple threads

2013-09-27 Thread Omkar Joshi
Hi,

I suggest you not do that. After YARN-744 goes in, this will be prevented on
the RM side. May I know why you want to do this? Any advantage or use case?

Thanks,
Omkar Joshi
*Hortonworks Inc.* 


On Fri, Sep 27, 2013 at 8:31 AM, Krishna Kishore Bonagiri <
write2kish...@gmail.com> wrote:

> Hi,
>
>   Can we submit container requests from multiple threads in parallel to
> the Resource Manager?
>
> Thanks,
> Kishore
>



Can container requests be made in parallel from multiple threads

2013-09-27 Thread Krishna Kishore Bonagiri
Hi,

  Can we submit container requests from multiple threads in parallel to the
Resource Manager?

Thanks,
Kishore


distcp / mv is not working on ftp

2013-09-27 Thread Fabian Zimmermann
Hi,

I'm just trying to back up some files to our FTP server.

hadoop distcp hdfs:///data/ ftp://user:pass@server/data/

returns after some minutes with:

Task TASKID="task_201308231529_97700_m_02" TASK_TYPE="MAP" 
TASK_STATUS="FAILED" FINISH_TIME="1380217916479" ERROR="java\.io\.IOException: 
Cannot rename parent(source): ftp://x:x@backup2/data/, parent(destination):  
ftp://x:x@backup2/data/
at 
org\.apache\.hadoop\.fs\.ftp\.FTPFileSystem\.rename(FTPFileSystem\.java:557)
at 
org\.apache\.hadoop\.fs\.ftp\.FTPFileSystem\.rename(FTPFileSystem\.java:522)
at 
org\.apache\.hadoop\.mapred\.FileOutputCommitter\.moveTaskOutputs(FileOutputCommitter\.java:154)
at 
org\.apache\.hadoop\.mapred\.FileOutputCommitter\.moveTaskOutputs(FileOutputCommitter\.java:172)
at 
org\.apache\.hadoop\.mapred\.FileOutputCommitter\.commitTask(FileOutputCommitter\.java:132)
at 
org\.apache\.hadoop\.mapred\.OutputCommitter\.commitTask(OutputCommitter\.java:221)
at org\.apache\.hadoop\.mapred\.Task\.commit(Task\.java:1000)
at org\.apache\.hadoop\.mapred\.Task\.done(Task\.java:870)
at org\.apache\.hadoop\.mapred\.MapTask\.run(MapTask\.java:329)
at org\.apache\.hadoop\.mapred\.Child$4\.run" TASK_ATTEMPT_ID="" .

I googled a bit and added

fs.ftp.host = backup2
fs.ftp.user.backup2 = user
fs.ftp.password.backup2 = password

to core-site.xml, then I was able to execute:

hadoop fs -ls ftp:///data/
hadoop fs -rm ftp:///data/test.file

but as soon as I try

hadoop fs -mv file:///data/test.file ftp:///data/test2.file
mv: `ftp:///data/test.file': Input/output error

I enabled debug-logging in our ftp-server and got:

Sep 27 15:24:33 backup2 ftpd[38241]: command: LIST /data
Sep 27 15:24:33 backup2 ftpd[38241]: <--- 150
Sep 27 15:24:33 backup2 ftpd[38241]: Opening BINARY mode data connection for 
'/bin/ls'.
Sep 27 15:24:33 backup2 ftpd[38241]: <--- 226
Sep 27 15:24:33 backup2 ftpd[38241]: Transfer complete.
Sep 27 15:24:33 backup2 ftpd[38241]: command: CWD ftp:/data
Sep 27 15:24:33 backup2 ftpd[38241]: <--- 550
Sep 27 15:24:33 backup2 ftpd[38241]: ftp:/data: No such file or directory.
Sep 27 15:24:33 backup2 ftpd[38241]: command: RNFR test.file
Sep 27 15:24:33 backup2 ftpd[38241]: <--- 550

It looks like the generation of the "CWD" command is buggy: Hadoop tries to cd 
into "ftp:/data", but it should use "/data".

Any ideas how to fix this?

Thanks a lot,

Fabian Zimmermann
-- 
IT Engineer, Systemadministrator

xplosion interactive GmbH

Steindamm 71 | Besucher: Steindamm 80
20099 Hamburg

t. + 49 (0) 40.2850 7045
m. + 49 (0) 160 5898835
f. + 49 (0) 40.2850 1922
f.zimmerm...@xplosion.de
http://www.xplosion.de

FOLLOW US ON TWITTER - www.twitter.com/xplosion_de

Registered office: Hamburg
Commercial register: AG Hamburg, HRB 109808
Managing directors: Daniel Neuhaus, Thorsten Lottici 

We are a member of the BVDW (German Federal Association for the Digital Economy) 




Re: FATAL org.apache.hadoop.mapred.JettyBugMonitor question

2013-09-27 Thread 麦树荣
Hi, thank you for your reply.

The Hadoop version is hadoop-0.20.2-cdh3u4, and I guess the Jetty version is 
jetty-6.1.26 (because I see the files "jetty-6.1.26.cloudera.1.jar", 
"jetty-servlet-tester-6.1.26.cloudera.1.jar" and "jetty-util-6.1.26.cloudera.1.jar" 
in $HADOOP_HOME/lib/).

How do I ship a patched Jetty? Can you give me a website?


麦树荣

From: Harsh J
Date: 2013-09-14 18:44
To:  ; user6d6b4dda
Subject: Re: FATAL org.apache.hadoop.mapred.JettyBugMonitor question

What version of jetty are you using? We've not seen this lately (after
we began shipping a patched Jetty), but the check is valid and
protects your MR jobs from getting into a hung or slow state.

On Fri, Sep 13, 2013 at 1:26 PM, 麦树荣  wrote:
> No one gives me help ?
>
> 
> 麦树荣
>
> From: 麦树荣
> Date: 2013-09-09 10:39
> To: user@hadoop.apache.org
> Subject: FATAL org.apache.hadoop.mapred.JettyBugMonitor question
> hi,
>
> Recently,I  encounter “FATAL org.apache.hadoop.mapred.JettyBugMonitor
> ,tasktracker shut down”problem in some hadoop computers. The log information
> is as follows:
> 2013-09-02 19:33:53,015 FATAL org.apache.hadoop.mapred.JettyBugMonitor:
> 
> Jetty CPU usage: 120.6%. This is greater than the fatal threshold
> mapred.tasktracker.jetty.cpu.threshold.fatal. Aborting JVM.
>
> After google, I find the following relative information:
> The TaskTracker now has a thread which monitors for a known Jetty bug in
> which the selector thread starts spinning and map output can no longer be
> served. If the bug is detected, the TaskTracker will shut itself down. This
> feature can be disabled by setting
> mapred.tasktracker.jetty.cpu.check.enabled to false.
>
> How do you solve the problem usually ? Is there a simple method to deal with
> the problem ?
> Thanks.



--
Harsh J


Hadoop Solaris OS compatibility

2013-09-27 Thread Jitendra Yadav
Hi All,

For a few years now I have been working as a Hadoop admin on the Linux
platform, though the majority of our servers are Solaris (Sun SPARC hardware).
Many times I have seen that Hadoop is described as compatible with Linux. Is
that right? If yes, then what do I need in order to run Hadoop on Solaris in
production? Do I need to build the Hadoop source on Solaris?

Thanks
Jitendra


Re: Retrieve and compute input splits

2013-09-27 Thread Sonal Goyal
The input splits themselves are not copied; only the information on the
location of the splits is copied to the jobtracker, so that it can assign
tasktrackers which are local to each split.

Check the Job Initialization section at
http://answers.oreilly.com/topic/459-anatomy-of-a-mapreduce-job-run-with-hadoop/

To create the list of tasks to run, the job scheduler first retrieves the
input splits computed by the JobClient from the shared filesystem (step 6).
It then creates one map task for each split. The number of reduce tasks to
create is determined by the mapred.reduce.tasks property in the JobConf,
which is set by the setNumReduceTasks() method, and the scheduler simply
creates this number of reduce tasks to be run. Tasks are given IDs at this
point.
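
For example, a minimal sketch of the job-side setting described above (the
reduce count is arbitrary):

import org.apache.hadoop.mapred.JobConf;

public class ReduceCountExample {
    public static void main(String[] args) {
        JobConf conf = new JobConf();
        // Equivalent to setting mapred.reduce.tasks: the scheduler creates exactly
        // this many reduce tasks, independent of the number of input splits.
        conf.setNumReduceTasks(10);
        System.out.println("mapred.reduce.tasks = " + conf.get("mapred.reduce.tasks"));
    }
}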

Best Regards,
Sonal
Nube Technologies 






On Fri, Sep 27, 2013 at 10:55 AM, Sai Sai  wrote:

> Hi
> I have attached the anatomy of MR from definitive guide.
>
> In step 6 it says JT/Scheduler  retrieve  input splits computed by the
> client from hdfs.
>
> In the above line it refers to as the client computes input splits.
>
> 1. Why does the JT/Scheduler retrieve the input splits and what does it do.
> If it is retrieving the input split does this mean it goes to the block
> and reads each record
> and gets the record back to JT. If so this is a lot of data movement for
> large files.
> which is not data locality. so i m getting confused.
>
> 2. How does the client know how to calculate the input splits.
>
> Any help please.
> Thanks
> Sai
>


Re: Input Split vs Task vs attempt vs computation

2013-09-27 Thread Sonal Goyal
Inline

Best Regards,
Sonal
Nube Technologies 






On Fri, Sep 27, 2013 at 10:42 AM, Sai Sai  wrote:

> Hi
> I have a few questions i am trying to understand:
>
> 1. Is each input split same as a record, (a rec can be a single line or
> multiple lines).
>

An InputSplit is a chunk of input that is handled by a map task. It will
generally contain multiple records. The RecordReader provides the key/value
pairs to the map task. Check
http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapreduce/InputSplit.html
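
A small sketch of that relationship, assuming the default TextInputFormat: the
split is a byte range of the file, the RecordReader turns it into (byte offset,
line) pairs, and map() is invoked once per record:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class LineCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(LongWritable byteOffset, Text line, Context context)
            throws IOException, InterruptedException {
        // Called once per record; with TextInputFormat a record is one line, so a
        // 5-line split means 5 invocations of this method within a single map task.
        context.write(new Text("lines"), ONE);
    }
}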

>
> 2. Is each Task a collection of few computations or attempts.
>
> For ex: if i have a small file with 5 lines.
> By default there will be 1 line on which each map computation is performed.
> So totally 5 computations r done on 1 node.
>
> This means JT will spawn 1 JVM for 1 Tasktracker on a node
> and another JVM for map task which will instantiate 5 map objects 1 for
> each line.
>
I am not sure what you mean by 5 map objects. But yes, the mapper will be
invoked 5 times, once for each line.


> The MT JVM is called the task which will have 5 attempts for  each line.
> This means attempt is same as computation.
>
> Please let me know if anything is incorrect.
> Thanks
> Sai
>
>