Re: Question about log files

2015-04-06 Thread 杨浩
I think the log information has been lost.

Hadoop is not designed to handle these files being deleted incorrectly.

2015-04-02 11:45 GMT+08:00 煜 韦 :

> Hi there,
> If log files are deleted without restarting the service, it seems that
> subsequent log output is lost, for example on the namenode or datanode.
> Why can't the log files be re-created when they are deleted, by mistake or
> on purpose, while the cluster is running?
>
> Thanks,
> Jared
>


Re: File permission of Hadoop staging directory getting changed time to time

2015-04-06 Thread Inosh Goonewardena
Hi All,

Really appreciate any input on this issue.

Thanks and Regards,
Inosh

On Tue, Jan 20, 2015 at 10:29 PM, Inosh Goonewardena 
wrote:

> Hi All,
>
> We have set up a Hadoop cluster using Hadoop 1.0.4 and we use Hive to
> submit map/reduce jobs. We have noticed that these map/reduce job
> submissions fail from time to time due to a permission issue in Hadoop.
> When this happens, we have to use the hadoop dfs command to change the
> file permissions on the mapreduce staging directory. Following is the
> error that gets logged.
>
> [2015-01-16 05:30:13,314] ERROR {org.apache.hadoop.hive.ql.exec.Task} -
> Job Submission failed with exception 'java.io.IOException(The
> ownership/permissions on the staging directory hdfs://
> hadoop0.test.com:9000/mnt/hadoop_tmp/mapred/staging/inosh/.staging is not
> as expected. It is owned by inosh and permissions are rwxr-xr-x. The
> directory must be owned by the submitter inosh or by inosh and permissions
> must be rwx--)'
> java.io.IOException: The ownership/permissions on the staging directory
> hdfs://hadoop0.test.com:9000/mnt/hadoop_tmp/mapred/staging/inosh/.staging
> is not as expected. It is owned by inosh and permissions are rwxr-xr-x. The
> directory must be owned by the submitter inosh or by inosh and permissions
> must be rwx--
> at
> org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:108)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:798)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:792)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1123)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:792)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:766)
> at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:460)
> at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:129)
> at
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:62)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1351)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1126)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:934)
> at
> org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:201)
> at
> org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:187)
> ...
>
>
> When we get this error we go to the Hadoop namenode and execute the
> following command to set the proper permissions on the folder mentioned in
> the exception:
> ./hadoop dfs -chmod 700 /mnt/hadoop_tmp/mapred/staging/inosh/.staging
>
>
> We would like to know why this happens and how to fix it permanently.
>
>
> Thanks and Regards,
> Inosh
>


Tracking job failure using APIs

2015-04-06 Thread Sanjeev Tripurari
Hi All,

How can I track a job failure on a node or list of nodes using the YARN APIs?
I can get the list of long-running jobs using the YARN client API,
but I need to go further into the AM, NM, and task attempts for map or reduce.

Say I have a job that has been running for a long time (about 4 hours),
possibly because of some task failures.

Please provide the sequence of APIs, or any reference.

Thanks and Regards
-Sanjeev
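
One possible sequence, sketched here with the yarn CLI, which on recent 2.x releases wraps the same YarnClient calls (getApplications, getApplicationAttempts, getContainers). The <applicationId> and <applicationAttemptId> placeholders are illustrative; the attempt and container listings assume the application history / timeline service is enabled, and yarn logs assumes log aggregation is turned on:

yarn application -list -appStates RUNNING        # find the long-running application and note its <applicationId>
yarn applicationattempt -list <applicationId>    # list the AM attempts and their state/diagnostics
yarn container -list <applicationAttemptId>      # list containers per attempt and the node each one ran on
yarn logs -applicationId <applicationId>         # pull the aggregated container logs once the application finishes

For the map/reduce task attempts themselves, the MapReduce layer (mapred job -status, mapred job -list-attempt-ids) or the JobHistory server REST API goes one level deeper than the YARN reports.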



Re: File permission of Hadoop staging directory getting changed time to time

2015-04-06 Thread Harshit Mathur
Try running with sudo hive.

On Mon, Apr 6, 2015 at 1:40 PM, Inosh Goonewardena 
wrote:

> Hi All,
>
> Really appreciate any input on this issue.
>
> Thanks and Regards,
> Inosh
>
> On Tue, Jan 20, 2015 at 10:29 PM, Inosh Goonewardena 
> wrote:
>
>> Hi All,
>>
>> We have set up a Hadoop cluster using Hadoop 1.0.4 and we use Hive to
>> submit map/reduce jobs. We have noticed that these map/reduce job
>> submissions fail from time to time due to a permission issue in Hadoop.
>> When this happens, we have to use the hadoop dfs command to change the
>> file permissions on the mapreduce staging directory. Following is the
>> error that gets logged.
>>
>> [2015-01-16 05:30:13,314] ERROR {org.apache.hadoop.hive.ql.exec.Task} -
>> Job Submission failed with exception 'java.io.IOException(The
>> ownership/permissions on the staging directory hdfs://
>> hadoop0.test.com:9000/mnt/hadoop_tmp/mapred/staging/inosh/.staging is
>> not as expected. It is owned by inosh and permissions are rwxr-xr-x. The
>> directory must be owned by the submitter inosh or by inosh and permissions
>> must be rwx--)'
>> java.io.IOException: The ownership/permissions on the staging directory
>> hdfs://hadoop0.test.com:9000/mnt/hadoop_tmp/mapred/staging/inosh/.staging
>> is not as expected. It is owned by inosh and permissions are rwxr-xr-x. The
>> directory must be owned by the submitter inosh or by inosh and permissions
>> must be rwx--
>> at
>> org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:108)
>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:798)
>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:792)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:415)
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1123)
>> at
>> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:792)
>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:766)
>> at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:460)
>> at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
>> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:129)
>> at
>> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:62)
>> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1351)
>> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1126)
>> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:934)
>> at
>> org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:201)
>> at
>> org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:187)
>> ...
>>
>>
>> When we get this error we go to the Hadoop namenode and execute the
>> following command to set the proper permissions on the folder mentioned in
>> the exception:
>> ./hadoop dfs -chmod 700 /mnt/hadoop_tmp/mapred/staging/inosh/.staging
>>
>>
>> We would like to know why this happens and how to fix it permanently.
>>
>>
>> Thanks and Regards,
>> Inosh
>>
>
>


-- 
Harshit Mathur


Re: Question about log files

2015-04-06 Thread Fabio C.
I noticed that too. I think Hadoop keeps the file open all the time, and
when you delete it, it is simply no longer able to write to it and doesn't try
to recreate it. Not sure if it's a Log4j problem or a Hadoop one...
yanghaogn, what is the *correct* way to delete the Hadoop logs? I didn't
find anything better than deleting the file and restarting the service...

On Mon, Apr 6, 2015 at 9:27 AM, 杨浩  wrote:

> I think the log information has been lost.
>
> Hadoop is not designed to handle these files being deleted incorrectly
>
> 2015-04-02 11:45 GMT+08:00 煜 韦 :
>
>> Hi there,
>> If log files are deleted without restarting the service, it seems that
>> subsequent log output is lost, for example on the namenode or datanode.
>> Why can't the log files be re-created when they are deleted, by mistake or
>> on purpose, while the cluster is running?
>>
>> Thanks,
>> Jared
>>
>
>


unsubscribe

2015-04-06 Thread Brahma Reddy Battula
Kindly send an email to user-unsubscr...@hadoop.apache.org

Subject: Re: Hadoop 2.6 issue
To: user@hadoop.apache.org
From: rapen...@in.ibm.com
Date: Thu, 2 Apr 2015 09:02:09 +0530


Please unsubscribe me from this list.



Regards,

Ravi Prasad Pentakota

India Software Lab, IBM Software Group

Phone: +9180-43328520  Mobile: 919620959477 

e-mail:rapen...@in.ibm.com









From:   Kumar Jayapal 

To: user@hadoop.apache.org

Cc: Anand Murali 

Date:   04/02/2015 07:50 AM

Subject:Re: Hadoop 2.6 issue







$which java



make sure the paths are valid for your installation (change if using 32bit 
version): 

/usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java



/usr/lib/jvm/java-6-openjdk-amd64/bin/javac
Set up update-alternatives:

sudo update-alternatives --install "/usr/bin/java" "java" 
"/usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java" 1

sudo update-alternatives --install "/usr/bin/javac" "javac" 
"/usr/lib/jvm/java-6-openjdk-amd64/bin/javac" 1



sudo update-alternatives --set java 
/usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java

sudo update-alternatives --set javac /usr/lib/jvm/java-6-openjdk-amd64/bin/javac

Alternatively, make sure the correct version is checked for both Java and 
compiler:

sudo update-alternatives --config java

sudo update-alternatives --config javac

List the installed Java alternatives with:

sudo update-alternatives --list java

sudo update-alternatives --list javac



On Wed, Apr 1, 2015 at 10:35 AM, Ravindra Kumar Naik  
wrote:

Hi,



Creating a batch program will not have the same effect. If you put the variables
in /etc/environment then they will be available to all users on the operating
system. HDFS doesn't run with root privileges.

You need to open the file with sudo or with root privileges to modify it.

e.g. if you are using the vi editor it is simply sudo vim /etc/environment
(similarly for other editors), and add the environment variables there.





On Wed, Apr 1, 2015 at 7:38 PM, Anand Murali  wrote:
Mr. Ravindra:



The file is visible, but I am unable to modify it, even though I have admin 
privileges. I am new to the Linux environment and shall be glad of your advice. 
However, as I told you earlier, I have created a batch program which contains 
the JAVA_HOME, HADOOP_INSTALL and PATH settings. I have run this 
file but I am still unable to start the daemons. I am following the installation 
instructions in Tom White's Hadoop: The Definitive Guide.



$ hadoop version works, and I am able to format the namenode, but the daemons 
fail to start.



Reply most welcome.



Thanks

 

Anand Murali  

11/7, 'Anand Vihar', Kandasamy St, Mylapore

Chennai - 600 004, India

Ph: (044)- 28474593/ 43526162 (voicemail)







On Wednesday, April 1, 2015 7:04 PM, Ravindra Kumar Naik  
wrote:





Are you sure it's not there? Could you please check the output of this 
command:



ls /etc/env*







On Wed, Apr 1, 2015 at 6:55 PM, Anand Murali  wrote:
Mr. Ravindra:



I am using Ubuntu 14. Can you please provide the full path? I am logged in as 
root and it is not found in /etc. In any case, I have already tried what you 
suggested by creating a batch file, and it does not work in my installation.



Thanks





 

Anand Murali  

11/7, 'Anand Vihar', Kandasamy St, Mylapore

Chennai - 600 004, India

Ph: (044)- 28474593/ 43526162 (voicemail)







On Wednesday, April 1, 2015 6:50 PM, Ravindra Kumar Naik  
wrote:





I meant /etc/environment. It should be present if you are using Ubuntu.



Regards,

Ravindra



On Wed, Apr 1, 2015 at 6:39 PM, Anand Murali  wrote:
Mr. Ravindra 



I don't find any /etc/environment. Can you be more specific, please? I have done 
whatever you are saying in a user-created batch program and run it, followed by 
running hadoop-env.sh, and it still does not work.



Thanks

 

Anand Murali  

11/7, 'Anand Vihar', Kandasamy St, Mylapore

Chennai - 600 004, India

Ph: (044)- 28474593/ 43526162 (voicemail)







On Wednesday, April 1, 2015 6:10 PM, Ravindra Kumar Naik  
wrote:





Hi,



If you are using Ubuntu then add these lines to /etc/environment 

JAVA_HOME=

PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:$JAVA_HOME/bin"



Please put the actual path to JDK in the first line.



Regards,

Ravindra





On Wed, Apr 1, 2015 at 5:50 PM, roland.depratti  wrote:
Anand,



Sorry about that, I was assuming Redhat/Centos.



For Ubuntu, try sudo update-alternatives --config java.







Sent from my Verizon Wireless 4G LTE smartphone





 Original message 

From: Anand Murali  

Date: 04/01/2015 7:22 AM (GMT-05:00) 

To: user@hadoop.apache.org 

Subject: Re: Hadoop 2.6 issue 



Dear Mr.Roland:



The alternatives command errors out. I have the extracted version of the Oracle 
JDK 7. However, I am ignorant regarding its installation on Ubuntu. Can you poin

Re: Question about log files

2015-04-06 Thread Joep Rottinghuis
This depends on your OS.
When you "delete" a file on Linux, you merely unlink the entry from the 
directory.
The file does not actually get deleted until the last reference (open 
handle) goes away. Note that this can be an interesting way to fill up a 
disk.
You should be able to see the files held open by a process using the lsof command.
The process itself does not know that a dentry has been removed, so there is 
nothing that log4j or the Hadoop code can do about it.
Assuming you have some rolling file appender configured, log4j should start 
logging to a new file at some point; otherwise you have to bounce your daemon process.

Cheers,

Joep

Sent from my iPhone

> On Apr 6, 2015, at 6:19 AM, Fabio C.  wrote:
> 
> I noticed that too. I think Hadoop keeps the file open all the time, and when 
> you delete it, it is simply no longer able to write to it and doesn't try to 
> recreate it. Not sure if it's a Log4j problem or a Hadoop one...
> yanghaogn, what is the *correct* way to delete the Hadoop logs? I didn't 
> find anything better than deleting the file and restarting the service...
> 
>> On Mon, Apr 6, 2015 at 9:27 AM, 杨浩  wrote:
>> I think the log information has been lost.
>> 
>> Hadoop is not designed to handle these files being deleted incorrectly
>> 
>> 2015-04-02 11:45 GMT+08:00 煜 韦 :
>>> Hi there,
>>> If log files are deleted without restarting the service, it seems that 
>>> subsequent log output is lost, for example on the namenode or datanode.
>>> Why can't the log files be re-created when they are deleted, by mistake or 
>>> on purpose, while the cluster is running?
>>> 
>>> Thanks,
>>> Jared
> 
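
To make the lsof suggestion concrete: a file that has been unlinked but is still held open shows up with a link count of 0 (and "(deleted)" in the NAME column). The daemon pid below is a placeholder:

lsof +L1 | grep '\.log'                 # open files whose link count is 0, i.e. deleted but still held open
lsof -p <namenode_pid> | grep deleted   # or inspect one specific Hadoop daemon process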


YARN resource manager in ubuntu 14.04: Problem in starting http server. Server handlers failed

2015-04-06 Thread Daniel Rodriguez
Hi all,

I am trying to get YARN running on Ubuntu 14.04 without much luck,
especially the ResourceManager. See the error at the end of the email.

I am using CDH 5.3, which supports Ubuntu 14.04.

The same settings (using the same Salt formulas for configuration
management) work with no issues on Ubuntu 12.04.

I checked some of the usual suspects, like ports (9026), but everything seems to be OK.

This is the failing line:
http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-common/2.5.0/org/apache/hadoop/http/HttpServer2.java#840

The error message does not help much and the code is not very
specific about which handler is failing.

Any ideas on the cause or how to debug this are welcome.

Error message:

2015-04-06 16:43:06,307 INFO  [main] resourcemanager.ResourceManager
(SignalLogger.java:register(91)) - registered UNIX signal handlers for
[TERM, HUP, INT]
2015-04-06 16:43:06,546 INFO  [main] conf.Configuration
(Configuration.java:getConfResourceAsInputStream(2207)) - found resource
core-site.xml at file:/etc/hadoop/conf/core-site.xml
2015-04-06 16:43:06,799 INFO  [main] security.Groups
(Groups.java:refresh(202)) - clearing userToGroupsMap cache
2015-04-06 16:43:06,992 INFO  [main] conf.Configuration
(Configuration.java:getConfResourceAsInputStream(2207)) - found resource
yarn-site.xml at file:/etc/hadoop/conf/yarn-site.xml
2015-04-06 16:43:07,075 INFO  [main] event.AsyncDispatcher
(AsyncDispatcher.java:register(197)) - Registering class
org.apache.hadoop.yarn.server.resourcemanager.RMFatalEventType for class
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMFatalEventDispatcher
2015-04-06 16:43:07,148 INFO  [main] security.NMTokenSecretManagerInRM
(NMTokenSecretManagerInRM.java:(75)) - NMTokenKeyRollingInterval:
8640ms and NMTokenKeyActivationDelay: 90ms
2015-04-06 16:43:07,158 INFO  [main] security.RMContainerTokenSecretManager
(RMContainerTokenSecretManager.java:(76)) -
ContainerTokenKeyRollingInterval: 8640ms and
ContainerTokenKeyActivationDelay: 90ms
2015-04-06 16:43:07,174 INFO  [main] security.AMRMTokenSecretManager
(AMRMTokenSecretManager.java:(94)) - AMRMTokenKeyRollingInterval:
8640ms and AMRMTokenKeyActivationDelay: 90 ms
2015-04-06 16:43:07,197 INFO  [main] event.AsyncDispatcher
(AsyncDispatcher.java:register(197)) - Registering class
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStoreEventType
for class
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
details.
2015-04-06 16:43:07,537 INFO  [main] event.AsyncDispatcher
(AsyncDispatcher.java:register(197)) - Registering class
org.apache.hadoop.yarn.server.resourcemanager.NodesListManagerEventType for
class org.apache.hadoop.yarn.server.resourcemanager.NodesListManager
2015-04-06 16:43:07,537 INFO  [main] resourcemanager.ResourceManager
(ResourceManager.java:createScheduler(265)) - Using Scheduler:
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
2015-04-06 16:43:07,561 INFO  [main] event.AsyncDispatcher
(AsyncDispatcher.java:register(197)) - Registering class
org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.SchedulerEventType
for class
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher
2015-04-06 16:43:07,562 INFO  [main] event.AsyncDispatcher
(AsyncDispatcher.java:register(197)) - Registering class
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEventType for
class
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher
2015-04-06 16:43:07,563 INFO  [main] event.AsyncDispatcher
(AsyncDispatcher.java:register(197)) - Registering class
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptEventType
for class
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher
2015-04-06 16:43:07,563 INFO  [main] event.AsyncDispatcher
(AsyncDispatcher.java:register(197)) - Registering class
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEventType for
class
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher
2015-04-06 16:43:07,697 WARN  [main] impl.MetricsConfig
(MetricsConfig.java:loadFirst(124)) - Cannot locate configuration: tried
hadoop-metrics2-resourcemanager.properties,hadoop-metrics2.properties
2015-04-06 16:43:07,857 INFO  [main] impl.MetricsSystemImpl
(MetricsSystemImpl.java:startTimer(356)) - Scheduled snapshot period at 10
second(s).
2015-04-06 16:43:07,857 INFO  [main] impl.MetricsSystemImpl
(MetricsSystemImpl.java:start(184)) - ResourceManager metrics system started
2015-04-06 16:43:07,882 INFO  [main] event.AsyncDispatcher
(AsyncDispatcher.java:register(197)) - Registering class
org.apache.hadoop.yarn.server.resou
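
One way to get more detail than the generic "Server handlers failed" message is to run the ResourceManager in the foreground with debug logging. The YARN_ROOT_LOGGER variable below is honoured by the stock bin/yarn script in Hadoop 2.x; CDH's service wrappers may need the equivalent log4j setting instead:

YARN_ROOT_LOGGER=DEBUG,console yarn resourcemanager   # start the RM in the foreground with DEBUG output on the console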

Pin Map/Reduce tasks to specific cores

2015-04-06 Thread George Ioannidis
Hello. My question, which can be found on Stack Overflow as well, regards
pinning map/reduce tasks to specific cores, on either Hadoop v1.2.1 or Hadoop v2.
Specifically, I would like to know whether the end user has any control over
which core executes a specific map/reduce task.

To pin an application on Linux, there's the "taskset" command, but is
anything similar provided by Hadoop? If not, is the Linux scheduler in
charge of allocating tasks to specific cores?

--

Below I am providing two cases to better illustrate my question:

*Case #1:* 2 GiB input size, an HDFS block size of 64 MiB, and 2 compute nodes
available, with 32 cores each.
It follows that 32 map tasks will be launched; let's suppose that
mapred.tasktracker.map.tasks.maximum = 16, so 16 map tasks will be allocated
to each node.

Can I guarantee that each map task will run on a specific core, or is it up
to the Linux scheduler?

--

*Case #2:* The same as case #1, but now the input size is 8 GiB, so there
are not enough slots for all 128 map tasks and multiple tasks will share
the same cores.
Can I control how much "time" each task will spend on a specific core, and
whether it will be reassigned to the same core in the future?

Any information on the above would be highly appreciated.

Kind Regards,
George
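
For reference, the OS-level tool mentioned above can also retarget an already-running task JVM, but that is purely a workaround outside Hadoop: neither the JobTracker nor YARN knows about or preserves the affinity. The pid below is a placeholder:

taskset -cp <task_jvm_pid>        # show the current CPU affinity list of a running map/reduce task JVM
taskset -cp 0-3 <task_jvm_pid>    # restrict that JVM to cores 0-3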


Why replication of Under-Replicated blocks in decommissioned datanodes is so slow

2015-04-06 Thread 麦树荣
version: hadoop-2.2.0

There were 13 nodes in our HDFS cluster and we wanted to decommission 7 of them. 
We tried two methods:

Method 1:
At the beginning, we set the dfs.hosts.exclude parameter and decommissioned all 7 
nodes at once, so there were many under-replicated blocks that needed to be 
replicated. However, after about 20 hours the replication had still not finished; 
we observed that the replication was very slow.

Method 2:
We then gave up on that approach and instead stopped the datanodes one by one. We 
stopped one datanode, waited for the replication of that node's under-replicated 
blocks to finish, and then stopped the next, until all 7 nodes were stopped. This 
took about 12 hours, and the replication was clearly much faster than with method 1.

We expected method 1 to be faster than method 2, but in fact method 2 was much 
faster. Why?


RE: Yarn container out of memory when using large memory mapped file

2015-04-06 Thread 麦树荣
mapreduce.reduce.memory.mb means physical memory, not the JVM heap.
The large mapped files (about 8G total) take you past the 4G limit
(mapreduce.reduce.memory.mb=4096), so you got the error.

From: Yao, York [mailto:york@here.com]
Sent: April 5, 2015 6:36
To: user@hadoop.apache.org
Subject: Yarn container out of memory when using large memory mapped file


Hello,

I am using Hadoop 2.4. The reducer uses several large memory-mapped files (about 
8G total). The reducer itself uses very little memory. To my knowledge, a 
memory-mapped file (FileChannel.map, read-only) also uses little memory (it is 
managed by the OS instead of the JVM).

I got an error similar to this: Container 
[pid=26783,containerID=container_1389136889967_0009_01_02] is running 
beyond physical memory limits. Current usage: 4.2 GB of 4 GB physical memory 
used; 5.2 GB of 8.4 GB virtual memory used. Killing container

Here were my settings:

mapreduce.reduce.java.opts=-Xmx2048m

mapreduce.reduce.memory.mb=4096

So I adjusted the parameters to this and it worked:

mapreduce.reduce.java.opts=-Xmx10240m

mapreduce.reduce.memory.mb=12288

I further adjusted the parameters and got it to work like this:

mapreduce.reduce.java.opts=-Xmx2048m

mapreduce.reduce.memory.mb=10240

My question is: why does the YARN container need about 8G more memory than 
the JVM size? The culprit seems to be the large Java memory-mapped files I use 
(each about 1.5G, summing to about 8G). Aren't memory-mapped files managed by 
the OS, and aren't they supposed to be sharable by multiple processes (e.g. reducers)?

Thanks!

York
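
A rough way to confirm the point above on the NodeManager host: the physical-memory check compares the limit against the container process's resident set size (RSS), and RSS includes the resident pages of the memory-mapped files, not just the JVM heap. The pid below is a placeholder:

ps -o pid,rss,vsz,cmd -p <reducer_jvm_pid>   # RSS grows as the mapped file pages are touched
grep VmRSS /proc/<reducer_jvm_pid>/status    # the same figure straight from the kernel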





RE: Pin Map/Reduce tasks to specific cores

2015-04-06 Thread Rohith Sharma K S
Hi George

In MRv2, YARN supports a CGroups implementation. Using CGroups it is possible to 
run containers on specific cores.

For your detailed reference, some of the useful links
http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2-trunk/bk_system-admin-guide/content/ch_cgroups.html
http://blog.cloudera.com/blog/2013/12/managing-multiple-resources-in-hadoop-2-with-yarn/
http://riccomini.name/posts/hadoop/2013-06-14-yarn-with-cgroups/

P.S.: I could not find any related documentation in the Hadoop YARN docs. I will 
raise a ticket for this in the community.

Hope the above information will help your use case!!!

Thanks & Regards
Rohith Sharma K S

From: George Ioannidis [mailto:giorgio...@gmail.com]
Sent: 07 April 2015 01:55
To: user@hadoop.apache.org
Subject: Pin Map/Reduce tasks to specific cores

Hello. My question, which can be found on Stack Overflow as well, regards pinning 
map/reduce tasks to specific cores, on either Hadoop v1.2.1 or Hadoop v2.
Specifically, I would like to know whether the end user has any control over which 
core executes a specific map/reduce task.

To pin an application on linux, there's the "taskset" command, but is anything 
similar provided by hadoop? If not, is the Linux Scheduler in charge of 
allocating tasks to specific cores?

--
Below I am providing two cases to better illustrate my question:
Case #1: 2 GiB input size, HDFS block size of 64 MiB and 2 compute nodes 
available, with 32 cores each.
As follows, 32 map tasks will be called; let's suppose that 
mapred.tasktracker.map.tasks.maximum = 16, so 16 map tasks will be allocated to 
each node.
Can I guarantee that each Map Task will run on a specific core, or is it up to 
the Linux Scheduler?

--

Case #2: The same as case #1, but now the input size is 8 GiB, so there are not 
enough slots for all map tasks (128), so multiple tasks will share the same 
cores.
Can I control how much "time" each task will spend on a specific core and if it 
will be reassigned to the same core in the future?
Any information on the above would be highly appreciated.
Kind Regards,
George


RE: Pin Map/Reduce tasks to specific cores

2015-04-06 Thread Naganarasimha G R (Naga)
Hi George,

The current CGroups implementation in YARN supports CPU isolation, but not by 
pinning to specific cores (CGroup cpusets); it is based on CPU cycles 
(quota and period).
The admin is given an option to specify what percentage of the CPU can be used 
by YARN containers, and YARN takes care of configuring the CGroup quota and 
period files so that only the configured CPU percentage is used by YARN containers.

Is there any particular need to pin the MR tasks to specific cores, or do you 
just want to ensure that YARN is not using more than the specified percentage of 
CPU on a given node?

Regards,
Naga


From: Rohith Sharma K S [rohithsharm...@huawei.com]
Sent: Tuesday, April 07, 2015 09:23
To: user@hadoop.apache.org
Subject: RE: Pin Map/Reduce tasks to specific cores

Hi George

In MRv2, YARN supports a CGroups implementation. Using CGroups it is possible to 
run containers on specific cores.

For your detailed reference, some of the useful links
http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2-trunk/bk_system-admin-guide/content/ch_cgroups.html
http://blog.cloudera.com/blog/2013/12/managing-multiple-resources-in-hadoop-2-with-yarn/
http://riccomini.name/posts/hadoop/2013-06-14-yarn-with-cgroups/

P.S.: I could not find any related documentation in the Hadoop YARN docs. I will 
raise a ticket for this in the community.

Hope the above information will help your use case!!!

Thanks & Regards
Rohith Sharma K S

From: George Ioannidis [mailto:giorgio...@gmail.com]
Sent: 07 April 2015 01:55
To: user@hadoop.apache.org
Subject: Pin Map/Reduce tasks to specific cores

Hello. My question, which can be found on Stack Overflow as well, regards pinning 
map/reduce tasks to specific cores, on either Hadoop v1.2.1 or Hadoop v2.
Specifically, I would like to know whether the end user has any control over which 
core executes a specific map/reduce task.

To pin an application on linux, there's the "taskset" command, but is anything 
similar provided by hadoop? If not, is the Linux Scheduler in charge of 
allocating tasks to specific cores?

--
Below I am providing two cases to better illustrate my question:
Case #1: 2 GiB input size, HDFS block size of 64 MiB and 2 compute nodes 
available, with 32 cores each.
As follows, 32 map tasks will be called; let's suppose that 
mapred.tasktracker.map.tasks.maximum = 16, so 16 map tasks will be allocated to 
each node.
Can I guarantee that each Map Task will run on a specific core, or is it up to 
the Linux Scheduler?

--

Case #2: The same as case #1, but now the input size is 8 GiB, so there are not 
enough slots for all map tasks (128), so multiple tasks will share the same 
cores.
Can I control how much "time" each task will spend on a specific core and if it 
will be reassigned to the same core in the future?
Any information on the above would be highly appreciated.
Kind Regards,
George
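
To make the quota/period point concrete: on a NodeManager that uses the CGroups resource handler, the relevant cgroup v1 cpu-controller files can be inspected directly. The mount point and the /hadoop-yarn hierarchy below are the common defaults (see yarn.nodemanager.linux-container-executor.cgroups.hierarchy) and may differ on your setup:

cat /sys/fs/cgroup/cpu/hadoop-yarn/cpu.cfs_period_us        # scheduling period for the YARN hierarchy
cat /sys/fs/cgroup/cpu/hadoop-yarn/cpu.cfs_quota_us         # hard cap per period; -1 means no cap is set
cat /sys/fs/cgroup/cpu/hadoop-yarn/container_*/cpu.shares   # per-container relative CPU weight derived from vcores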