Re: Cleanup activity on YARN containers
Hi Rohith,

Thanks for the reply. Mine is a YARN application. I have some files that are local to the nodes where the containers run, and I want to clean them up at the end of the container execution. So, I want to do this cleanup on the same node my container ran on. With what you are suggesting, I can't delete the files local to the container. Is there any other way?

Thanks,
Kishore
RE: Cleanup activity on YARN containers
For cleanup that is local to the container, you can do it in a shutdown hook.

Thanks and Regards,
Rohith Sharma K S
Re: Cleanup activity on YARN containers
Hi Rohith,

Is there something like a shutdown hook for containers? Can you please also tell me how to use that?

Thanks,
Kishore
RE: Cleanup activity on YARN containers
"Is there something like shutdown hook for containers?"

There is no container-specific shutdown hook. I was referring to the Java shutdown hook, i.e. registering one with 'Runtime.getRuntime().addShutdownHook(Thread hook)' when the container JVM starts (Thread: http://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html). The cleanup can be done in the hook.

Thanks and Regards,
Rohith Sharma K S
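A minimal sketch of the Java shutdown hook approach described in this reply, assuming the container process is a JVM that keeps its node-local files under a single scratch directory. The class name `ContainerCleanup` and the helpers `createScratchDir`, `deleteRecursively`, and `registerCleanupHook` are hypothetical names chosen for illustration; only `Runtime.getRuntime().addShutdownHook(Thread)` comes from the thread itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

class ContainerCleanup {

    /** Hypothetical helper: creates a node-local scratch directory for this container. */
    public static Path createScratchDir(String prefix) {
        try {
            return Files.createTempDirectory(prefix);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deletes a directory tree, visiting children before their parents. */
    public static void deleteRecursively(Path root) {
        if (!Files.exists(root)) {
            return; // nothing to clean up
        }
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse order puts deeper paths first, so directories are empty when deleted.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.deleteIfExists(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Registers a JVM shutdown hook that removes the scratch directory on exit. */
    public static Thread registerCleanupHook(Path scratchDir) {
        Thread hook = new Thread(() -> deleteRecursively(scratchDir), "container-cleanup");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        // Register the hook early, at the start of the container JVM.
        Path scratch = createScratchDir("container-scratch-");
        registerCleanupHook(scratch);

        // ... container work that writes files under `scratch` goes here ...
        System.out.println("Working in " + scratch);
    }
}
```

Shutdown hooks run on normal JVM exit and when the process receives SIGTERM, but not when it is killed with SIGKILL, so this should be treated as best-effort cleanup that needs to finish quickly.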
RE: Cleanup activity on YARN containers
Hi Kishore,

Are the jobs submitted through MapReduce, or is it a YARN application?

1. For MapReduce, the framework itself provides a facility for cleanup at the task level.

   "Is there any callback kind of facility, in which I can write some code to be executed on my container at the end of my application or at the end of that particular container execution?"

   You can override setup() and cleanup() to do initialization and cleanup for your task. This facility is provided by the MapReduce framework. The call flow of task execution is: the framework first calls setup(org.apache.hadoop.mapreduce.Mapper.Context), followed by map(Object, Object, Context) / reduce(Object, Iterable, Context) for each key/value pair. Finally, cleanup(Context) is called.

   Note: In cleanup, do not hold the container for longer than mapreduce.task.timeout. Once map/reduce has completed, progress is no longer sent to the ApplicationMaster (a ping is not considered a status update). If your cleanup takes longer than the value configured for mapreduce.task.timeout, the ApplicationMaster considers the task timed out. In that case, you need to increase mapreduce.task.timeout based on your cleanup time.

2. For a YARN application, the list of completed containers is sent to the ApplicationMaster in the heartbeat. There you can do cleanup activities for containers.

Hope this helps. :)

Thanks and Regards,
Rohith Sharma K S

From: Krishna Kishore Bonagiri [mailto:write2kish...@gmail.com]
Sent: 07 April 2014 16:41
To: user@hadoop.apache.org
Subject: Cleanup activity on YARN containers

Hi,

Is there any callback kind of facility, in which I can write some code to be executed on my container at the end of my application or at the end of that particular container execution? I want to do some cleanup activities at the end of my application, and the cleanup is not related to the localized resources that are downloaded from HDFS.

Thanks,
Kishore