Re: Security on YARN
Thanks Yan. I will take a look shortly.

On Sat, Jul 25, 2015 at 1:20 AM, Yan Fang yanfang...@gmail.com wrote:

Hi Chen Song,

If you can work on this issue, it will be great.

1. The related ticket is https://issues.apache.org/jira/browse/SAMZA-727
2. Most of the change will happen in the Yarn AM and Yarn client parts. The code sits in the samza-yarn package: https://github.com/apache/samza/tree/master/samza-yarn/src/main/scala/org/apache/samza/job/yarn
3. When you implement this, make sure it does not affect the non-secure Yarn implementation. The non-secure cluster implementation has been proven to work, while the secure cluster may have the issue Yi Pan mentioned:

    For a long-running Samza job, it does not work. We will need a way to refresh the Kerberos ticket periodically, which is not supported yet.

But I am happy to see that at least we have some support for secure clusters. We can figure the issue out later.

If you want some help in understanding the existing code, let me know.

Thanks,

Fang, Yan
yanfang...@gmail.com

On Fri, Jul 24, 2015 at 7:00 PM, Chen Song chen.song...@gmail.com wrote:

Can someone give some context on this? I can volunteer myself and try working on this.

Chen

On Thu, Jul 2, 2015 at 4:29 AM, Qi Fu q...@talend.com wrote:

Hi Yi, Yan,

Many thanks for your information. I have created a jira for this: https://issues.apache.org/jira/browse/SAMZA-727

I'm willing to test it if someone can work on this.

-Qi

From: Yi Pan nickpa...@gmail.com
Sent: Thursday, July 2, 2015 1:38 AM
To: dev@samza.apache.org
Subject: Re: Security on YARN

Hi, Yan,

Your memory serves as well as mine. :) I remember that Chris and I discussed this Kerberos ticket expiration issue when we were brainstorming on how to access HDFS data in Samza. At a high level, what happens is that the Kerberos ticket to access a secured Hadoop cluster is issued to Samza containers at the job start time, and will expire later. For a long-running Samza job, it does not work. We will need a way to refresh the Kerberos ticket periodically, which is not supported yet. Chris probably can chime in with more details.

-Yi

On Wed, Jul 1, 2015 at 4:08 PM, Yan Fang yanfang...@gmail.com wrote:

Hi Qi,

I think this is caused by the fact that Samza currently does not support Yarn with Kerberos. Feel free to open a ticket for this feature.

But if my memory serves, there was an issue mentioned about Kerberos. It seems that when the Kerberos ticket expires, Samza will have some issues? Cannot find the resource. Anyone remember this?

Cheers,

Fang, Yan
yanfang...@gmail.com

On Wed, Jul 1, 2015 at 3:41 AM, Qi Fu q...@talend.com wrote:

Hi all,

I'm testing Samza on YARN and I have encountered a problem with the security setting of YARN (Kerberos). Here are the details:

1. My cluster is secured by Kerberos, and I deploy my Samza job from one of the cluster nodes.
2. My config files are in ~/.samza/conf/ (yarn-site.xml, core-site.xml, hdfs-site.xml).
3. The job is deployed successfully, and I can get info such as:

    ClientHelper [INFO] set package url to scheme: hdfs port: -1 file: /user/test/samzatest.tar.gz for application_1435680272316_0003
    ClientHelper [INFO] set package size to 212924524 for application_1435680272316_0003

   I think the security setting is correct, as it can get the file size from HDFS.
4. But I get the following error from the YARN job manager:

    Application application_1435680272316_0003 failed 2 times due to AM Container for appattempt_1435680272316_0003_02 exited with exitCode: -1000
    For more detailed output, check application tracking page: http://cdh-namenode:8088/proxy/application_1435680272316_0003/ Then, click on links to logs of each attempt.
    Diagnostics: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: talend-cdh-datanode8/62.210.141.237; destination host is: talend-cdh-namenode:8020;
    java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: cdh-datanode8/62.210.141.237; destination host is: cdh-namenode:8020;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
    ..

Does anyone know how to solve this?

Qi FU

--
Chen Song
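
Editor's sketch (not part of the original thread): Yan's point 3 asks that Kerberos support leave the non-secure Yarn path untouched, and Qi's setup keeps core-site.xml under ~/.samza/conf/. One way to gate the secure code path might be to read `hadoop.security.authentication` from core-site.xml, which Hadoop defaults to `simple` when unset. The class and method names below (`SecurityModeCheck`, `readProperty`, `isKerberosEnabled`) are hypothetical and are not Samza or Hadoop APIs; a real implementation would more likely go through Hadoop's own configuration classes.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, for illustration only.
public class SecurityModeCheck {

    // Returns the value of the named <property> in a Hadoop-style *-site.xml
    // stream, or defaultValue when the property is absent.
    public static String readProperty(InputStream siteXml, String name, String defaultValue)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(siteXml);
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element prop = (Element) props.item(i);
            NodeList names = prop.getElementsByTagName("name");
            NodeList values = prop.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0
                    && name.equals(names.item(0).getTextContent().trim())) {
                return values.item(0).getTextContent().trim();
            }
        }
        return defaultValue;
    }

    // True only when the cluster is configured for Kerberos; a missing
    // property falls back to "simple", i.e. the non-secure path.
    public static boolean isKerberosEnabled(InputStream coreSiteXml) throws Exception {
        String mode = readProperty(coreSiteXml, "hadoop.security.authentication", "simple");
        return "kerberos".equalsIgnoreCase(mode);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<configuration><property>"
                + "<name>hadoop.security.authentication</name>"
                + "<value>kerberos</value>"
                + "</property></configuration>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println("secure cluster: " + isKerberosEnabled(in));
    }
}
```

With a check like this, the secure login logic would only run when the flag is true, so a non-secure cluster would follow exactly the code path it follows today.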
Re: Security on YARN
Can someone give some context on this? I can volunteer myself and try working on this.

Chen
Re: Security on YARN
Hi Chen Song,

If you can work on this issue, it will be great.

1. The related ticket is https://issues.apache.org/jira/browse/SAMZA-727
2. Most of the change will happen in the Yarn AM and Yarn client parts. The code sits in the samza-yarn package: https://github.com/apache/samza/tree/master/samza-yarn/src/main/scala/org/apache/samza/job/yarn
3. When you implement this, make sure it does not affect the non-secure Yarn implementation. The non-secure cluster implementation has been proven to work, while the secure cluster may have the issue Yi Pan mentioned:

    For a long-running Samza job, it does not work. We will need a way to refresh the Kerberos ticket periodically, which is not supported yet.

But I am happy to see that at least we have some support for secure clusters. We can figure the issue out later.

If you want some help in understanding the existing code, let me know.

Thanks,

Fang, Yan
yanfang...@gmail.com
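
Editor's sketch (not part of the original thread): the open problem described here is that a ticket issued at job start expires during a long-running job, so something has to refresh it periodically. A minimal way to picture that is a background scheduler that invokes a re-login callback at a fixed interval. `TicketRefresher` and its `relogin` parameter are hypothetical names, not Samza code, and nothing in the thread says how SAMZA-727 should solve this. On a real cluster the callback might wrap Hadoop's `UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab()`, which assumes a keytab-based login.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, for illustration only.
public class TicketRefresher implements AutoCloseable {
    private final ScheduledExecutorService scheduler;

    // relogin is whatever renews the credentials; it is injected so that this
    // sketch has no Hadoop dependency.
    public TicketRefresher(Runnable relogin, long period, TimeUnit unit) {
        scheduler = Executors.newSingleThreadScheduledExecutor(task -> {
            Thread thread = new Thread(task, "kerberos-ticket-refresher");
            thread.setDaemon(true); // never keep the JVM alive just for refreshes
            return thread;
        });
        scheduler.scheduleAtFixedRate(() -> {
            try {
                relogin.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so log
                // the failure and let the next scheduled attempt retry.
                System.err.println("ticket refresh failed: " + e);
            }
        }, period, period, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger refreshes = new AtomicInteger();
        // A counter stands in for the real re-login call.
        try (TicketRefresher refresher =
                new TicketRefresher(refreshes::incrementAndGet, 50, TimeUnit.MILLISECONDS)) {
            Thread.sleep(300);
        }
        System.out.println("refreshes performed: " + refreshes.get());
    }
}
```

In practice the period would be chosen well below the ticket lifetime configured on the KDC, and the refresher would only be started when the cluster is actually secured, so the non-secure path stays unchanged.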