[jira] [Created] (TEZ-1640) Unable to achieve Secured Impersonation
Subroto Sanyal created TEZ-1640:
---

Summary: Unable to achieve Secured Impersonation
Key: TEZ-1640
URL: https://issues.apache.org/jira/browse/TEZ-1640
Project: Apache Tez
Issue Type: Bug
Affects Versions: 0.5.0
Reporter: Subroto Sanyal

My client runs as user "subroto", and core-site.xml contains the following entries:

{code:xml|title=core-site.xml|borderStyle=solid}
<property>
  <name>hadoop.proxyuser.subroto.groups</name>
  <value>impersonatedgroup</value>
</property>
<property>
  <name>hadoop.proxyuser.subroto.hosts</name>
  <value>*</value>
</property>
{code}

I have a user _qa_ which belongs to the group _impersonatedgroup_.

The following code launches the DAGAppMaster:

{code:java|title=TezClientWrapper.java|borderStyle=solid}
TezClient tezClient = SecureGridMode.executePossiblyImpersonated(conf, new PrivilegedExceptionAction<TezClient>() {
  @Override
  public TezClient run() throws Exception {
    final TezConfiguration tezConf = createTezConf(conf, datameerJobContext);
    if (amSpecificProperties != null) {
      applyAmSpecificProperties(tezConf, amSpecificProperties);
    }
    // Inside the impersonated context this is the proxy user (see the log line below).
    UserGroupInformation currentUser = UserGroupInformation.getCurrentUser();
    LOG.info("Current User:" + currentUser);
    // Write the current user's credentials to a token file and point Tez at it.
    File tokenFile = new File(System.getProperty("java.io.tmpdir"), tezSessionName.replaceAll("[^a-zA-Z0-9]", ""));
    LOG.info("Token File:" + tokenFile.getAbsolutePath());
    currentUser.getCredentials().writeTokenStorageFile(UriUtil.toPath(tokenFile.getAbsoluteFile()), conf);
    tezConf.set(TezConfiguration.TEZ_CREDENTIALS_PATH, tokenFile.getAbsolutePath());
    TezClient tezClient = TezClient.create(tezSessionName, tezConf, createSession, localResourceMap, currentUser.getCredentials());
    tezClient.setAppMasterCredentials(currentUser.getCredentials());
    tezClient.start();
    tezClient.waitTillReady(); // the failure occurs here
    return tezClient;
  }
});
{code}

This code logs:

{noformat}
Current User:qa (auth:PROXY) via subroto@EC2.INTERNAL (auth:KERBEROS)
{noformat}

The code fails in _tezClient.waitTillReady()_. From the Resource Manager UI I can see that an application is launched as user _qa_.

Failure stack trace:

{noformat}
(UserGroupInformation.java:1551) - PriviledgedActionException as:qa (auth:SIMPLE) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
Failed to retrieve AM Status via proxy
com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "ip-10-178-144-254/10.178.144.254"; destination host is: "ip-10-187-33-206":56660;
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
	at com.sun.proxy.$Proxy111.getAMStatus(Unknown Source)
	at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:522)
	at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:597)
	at test.app.TezClientWrapper$1.run(TezClientFacade.java:146)
	at test.app.TezClientWrapper$1.run(TezClientFacade.java:130)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at test.app.Security.doAs(Security.java:65)
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
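For readers unfamiliar with the two proxy-user properties quoted in the report, the rule they express can be sketched roughly as follows. This is a simplified model written only for illustration: the class `ProxyUserCheck` and its methods are hypothetical, it is not Hadoop's own implementation, and the real check may accept more forms of host entries than the exact match and `*` wildcard used here. It shows only that "subroto" may impersonate a user in _impersonatedgroup_ from any host; the failure in the report happens later, when the client asks the AM for its status.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Simplified, hypothetical model of a proxy-user authorization rule (not Hadoop code). */
final class ProxyUserCheck {

    /** Splits a comma-separated config value into a set; a null value yields an empty set. */
    static Set<String> parse(String value) {
        Set<String> out = new HashSet<>();
        if (value == null) {
            return out;
        }
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    /**
     * Returns true if realUser may impersonate a user who belongs to targetGroups
     * when connecting from clientHost. "*" acts as a wildcard for groups or hosts.
     */
    static boolean isAllowed(Map<String, String> conf, String realUser,
                             Set<String> targetGroups, String clientHost) {
        Set<String> groups = parse(conf.get("hadoop.proxyuser." + realUser + ".groups"));
        Set<String> hosts = parse(conf.get("hadoop.proxyuser." + realUser + ".hosts"));
        boolean groupOk = groups.contains("*")
                || targetGroups.stream().anyMatch(groups::contains);
        boolean hostOk = hosts.contains("*") || hosts.contains(clientHost);
        return groupOk && hostOk;
    }

    public static void main(String[] args) {
        // The two entries from the core-site.xml in the report.
        Map<String, String> conf = Map.of(
                "hadoop.proxyuser.subroto.groups", "impersonatedgroup",
                "hadoop.proxyuser.subroto.hosts", "*");

        // qa belongs to impersonatedgroup, so impersonation is permitted.
        System.out.println(isAllowed(conf, "subroto", Set.of("impersonatedgroup"), "10.178.144.254")); // true
        // A user outside the configured group is rejected.
        System.out.println(isAllowed(conf, "subroto", Set.of("othergroup"), "10.178.144.254")); // false
    }
}
```

Under this model the configuration in the report permits the impersonation, which is consistent with the Resource Manager showing the application launched as _qa_.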
[jira] [Updated] (TEZ-1640) Unable to achieve Secured Impersonation
[ https://issues.apache.org/jira/browse/TEZ-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Subroto Sanyal updated TEZ-1640:

Description:
My client runs as user "subroto", and core-site.xml contains the following entries:

{code:xml|title=core-site.xml|borderStyle=solid}
<property>
  <name>hadoop.proxyuser.subroto.groups</name>
  <value>impersonatedgroup</value>
</property>
<property>
  <name>hadoop.proxyuser.subroto.hosts</name>
  <value>*</value>
</property>
{code}

I have a user _qa_ which belongs to the group _impersonatedgroup_.

The following code launches the DAGAppMaster:

{code:java|title=TezClientWrapper.java|borderStyle=solid}
TezClient tezClient = SecureGridMode.executePossiblyImpersonated(conf, new PrivilegedExceptionAction<TezClient>() {
  @Override
  public TezClient run() throws Exception {
    final TezConfiguration tezConf = createTezConf(conf, jobContext);
    if (amSpecificProperties != null) {
      applyAmSpecificProperties(tezConf, amSpecificProperties);
    }
    // Inside the impersonated context this is the proxy user (see the log line below).
    UserGroupInformation currentUser = UserGroupInformation.getCurrentUser();
    LOG.info("Current User:" + currentUser);
    // Write the current user's credentials to a token file and point Tez at it.
    File tokenFile = new File(System.getProperty("java.io.tmpdir"), tezSessionName.replaceAll("[^a-zA-Z0-9]", ""));
    LOG.info("Token File:" + tokenFile.getAbsolutePath());
    currentUser.getCredentials().writeTokenStorageFile(UriUtil.toPath(tokenFile.getAbsoluteFile()), conf);
    tezConf.set(TezConfiguration.TEZ_CREDENTIALS_PATH, tokenFile.getAbsolutePath());
    TezClient tezClient = TezClient.create(tezSessionName, tezConf, createSession, localResourceMap, currentUser.getCredentials());
    tezClient.setAppMasterCredentials(currentUser.getCredentials());
    tezClient.start();
    tezClient.waitTillReady(); // the failure occurs here
    return tezClient;
  }
});
{code}

This code logs:

{noformat}
Current User:qa (auth:PROXY) via subroto@EC2.INTERNAL (auth:KERBEROS)
{noformat}

The code fails in _tezClient.waitTillReady()_. From the Resource Manager UI I can see that an application is launched as user _qa_.

Failure stack trace:

{noformat}
(UserGroupInformation.java:1551) - PriviledgedActionException as:qa (auth:SIMPLE) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
Failed to retrieve AM Status via proxy
com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "ip-10-178-144-254/10.178.144.254"; destination host is: "ip-10-187-33-206":56660;
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
	at com.sun.proxy.$Proxy111.getAMStatus(Unknown Source)
	at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:522)
	at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:597)
	at test.app.TezClientWrapper$1.run(TezClientFacade.java:146)
	at test.app.TezClientWrapper$1.run(TezClientFacade.java:130)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at test.app.Security.doAs(Security.java:65)
{noformat}

was:
My client runs as user "subroto", and core-site.xml contains the following entries:

{code:xml|title=core-site.xml|borderStyle=solid}
<property>
  <name>hadoop.proxyuser.subroto.groups</name>
  <value>impersonatedgroup</value>
</property>
<property>
  <name>hadoop.proxyuser.subroto.hosts</name>
  <value>*</value>
</property>
{code}

I have a user _qa_ which belongs to the group _impersonatedgroup_.

The following code launches the DAGAppMaster:

{code:java|title=TezClientWrapper.java|borderStyle=solid}
TezClient tezClient = SecureGridMode.executePossiblyImpersonated(conf, new PrivilegedExceptionAction<TezClient>() {
  @Override
  public TezClient run() throws Exception {
    final TezConfiguration tezConf = createTezConf(conf, datameerJobContext);
    if (amSpecificProperties != null) {
      applyAmSpecificProperties(tezConf, amSpecificProperties);
    }
    UserGroupInformation currentUser = UserGroupInformation.getCurrentUser();
    LOG.info("Current User:" + currentUser);
    File t
[jira] [Commented] (TEZ-1635) Dag gets stuck intermittently
[ https://issues.apache.org/jira/browse/TEZ-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160664#comment-14160664 ]

Gunther Hagleitner commented on TEZ-1635:
-

[~vikram.dixit], can you answer [~rajesh.balamohan]'s question?

> Dag gets stuck intermittently
> -
>
> Key: TEZ-1635
> URL: https://issues.apache.org/jira/browse/TEZ-1635
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Vikram Dixit K
> Priority: Blocker
> Attachments: Screen Shot 2014-10-05 at 9.46.31 AM.png, syslog_dag_1412109415326_0002_10.gz, tez_smb_1_hung_job.log, tez_smb_1_successful_job.log
>
> Attaching logs for the dag.
[jira] [Updated] (TEZ-1277) Tez Spill handler should truncate files to reserve space on disk
[ https://issues.apache.org/jira/browse/TEZ-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated TEZ-1277:
-

Description:
Occasionally tasks fail due to full disks because the disks had space when the task was allocating via LocalDirAllocator, but the disk space was actually promised to many tasks instead of just one.

This race condition shows up when a 1 GB spill can be done in ~10s or so.

There is no way to do this via the hadoop-fs abstraction - but an SSD-based spill wastes most of the IOPS on journal updates about the file length changing.

was:
Occasionally tasks fail due to full disks because the disks had space when the task was allocating via LocalDirAllocator, but the disk space was actually promised to many tasks instead of just one.

This race condition shows up when a 1 GB spill can be done in ~10s or so.

> Tez Spill handler should truncate files to reserve space on disk
> -
>
> Key: TEZ-1277
> URL: https://issues.apache.org/jira/browse/TEZ-1277
> Project: Apache Tez
> Issue Type: Improvement
> Affects Versions: 0.5.0
> Reporter: Gopal V
> Assignee: Gopal V
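The idea in the issue title, sizing the spill file up front so that its length does not change on every write, can be sketched with plain java.io. This is a minimal illustration under stated assumptions, not the Tez spill handler: the class `SpillReservation` and its `reserve` method are hypothetical names. One caveat matters here: on many filesystems `RandomAccessFile.setLength` only sets the logical length and may create a sparse file, so it does not by itself guarantee that the disk blocks are reserved. A real reservation would likely need a native call such as `fallocate`, which the Java standard library does not expose.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: set a spill file's length before writing to it. */
final class SpillReservation {

    /**
     * Sets the file's length to the expected spill size so that later writes
     * within that range do not change the file length. Depending on the
     * filesystem, the resulting file may be sparse.
     */
    static void reserve(Path file, long bytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(bytes);
        }
    }

    public static void main(String[] args) throws IOException {
        Path spill = Files.createTempFile("spill", ".out");
        try {
            reserve(spill, 1L << 20); // 1 MiB, kept small for the example
            System.out.println(Files.size(spill)); // 1048576
        } finally {
            Files.deleteIfExists(spill);
        }
    }
}
```

A task that fails to set the length (for example because the disk is full) would learn that before spilling rather than partway through, which is the race the description is concerned with.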
[jira] [Commented] (TEZ-1635) Dag gets stuck intermittently
[ https://issues.apache.org/jira/browse/TEZ-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161358#comment-14161358 ]

Bikas Saha commented on TEZ-1635:
-

Rajesh, that was my observation too (see the first comment): the mapper was missing at least one InputDataInformationEvent. Your analysis of the custom vertex manager seems to trace the root cause to a race in the vertex manager. [~vikram.dixit] made some fixes there related to setVertexParallelism() that enforce an ordering in the code. Will those help here?
[jira] [Updated] (TEZ-1600) Swimlanes View for Tez UI
[ https://issues.apache.org/jira/browse/TEZ-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-1600:
-

Attachment: TEZ-1600.patch

[~pramachandran], this patch has a rough version of swimlanes working. It includes a minimal task_attempt model setup.

> Swimlanes View for Tez UI
> -
>
> Key: TEZ-1600
> URL: https://issues.apache.org/jira/browse/TEZ-1600
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Attachments: TEZ-1600.patch
[jira] [Commented] (TEZ-1600) Swimlanes View for Tez UI
[ https://issues.apache.org/jira/browse/TEZ-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161562#comment-14161562 ]

Jonathan Eagles commented on TEZ-1600:
--

I plan to spend more time on the swimlane view, cleaning up both the code and the view presentation.