[jira] [Updated] (YARN-8041) [Router] Federation: routing some missing REST invocations transparently to multiple RMs
[ https://issues.apache.org/jira/browse/YARN-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiran Wu updated YARN-8041: --- Attachment: YARN-8041.005.patch > [Router] Federation: routing some missing REST invocations transparently to > multiple RMs > > > Key: YARN-8041 > URL: https://issues.apache.org/jira/browse/YARN-8041 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, router >Reporter: Yiran Wu >Assignee: Yiran Wu >Priority: Minor > Attachments: YARN-8041.001.patch, YARN-8041.002.patch, > YARN-8041.003.patch, YARN-8041.004.patch, YARN-8041.005.patch > > > This Jira tracks the implementation of some missing REST invocation in > FederationInterceptorREST: > * getAppStatistics > * getNodeToLabels > * getLabelsOnNode > * updateApplicationPriority > * getAppQueue > * updateAppQueue > * getAppTimeout > * getAppTimeouts > * updateApplicationTimeout > * getAppAttempts > * getAppAttempt > * getContainers > * getContainer -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8337) Deadlock Federation Router
[ https://issues.apache.org/jira/browse/YARN-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483763#comment-16483763 ] Yiran Wu commented on YARN-8337: Thanks [~jianchao jia], I have a question: "INSERT IGNORE" will ignore any error. Do we need to ensure the data is inserted successfully, or capture errors and retry the insert? > Deadlock Federation Router > -- > > Key: YARN-8337 > URL: https://issues.apache.org/jira/browse/YARN-8337 > Project: Hadoop YARN > Issue Type: Bug > Components: federation, router >Reporter: Jianchao Jia >Priority: Major > Attachments: YARN-8337.001.patch > > > We use MySQL InnoDB as the state store engine. In the router log we found a deadlock error like below: > {code:java} > [2018-05-21T15:41:40.383+08:00] [ERROR] [IPC Server handler 25 on 8050] : > Unable to insert the newly generated application > application_1526295230627_127402 > com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock > found when trying to get lock; try restarting transaction > at sun.reflect.GeneratedConstructorAccessor107.newInstance(Unknown > Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at com.mysql.jdbc.Util.handleNewInstance(Util.java:425) > at com.mysql.jdbc.Util.getInstance(Util.java:408) > at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952) > at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3973) > at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3909) > at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527) > at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2680) > at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2484) > at > com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858) > at > com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2079) > at > 
com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2013) > at > com.mysql.jdbc.PreparedStatement.executeLargeUpdate(PreparedStatement.java:5104) > at > com.mysql.jdbc.CallableStatement.executeLargeUpdate(CallableStatement.java:2418) > at > com.mysql.jdbc.CallableStatement.executeUpdate(CallableStatement.java:887) > at > com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61) > at > com.zaxxer.hikari.pool.HikariProxyCallableStatement.executeUpdate(HikariProxyCallableStatement.java) > at > org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore.addApplicationHomeSubCluster(SQLFederationStateStore.java:547) > {code} > Use "show engine innodb status;" command to find what happens > {code:java} > 2018-05-21 15:41:40 7f4685870700 > *** (1) TRANSACTION: > TRANSACTION 241131538, ACTIVE 0 sec inserting, thread declared inside InnoDB > 4999 > mysql tables in use 2, locked 2 > LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s) > MySQL thread id 7602335, OS thread handle 0x7f46858f2700, query id 2919792534 > 192.168.1.138 federation executing > INSERT INTO applicationsHomeSubCluster > (applicationId,homeSubCluster) > (SELECT applicationId_IN, homeSubCluster_IN > FROM applicationsHomeSubCluster > WHERE applicationId = applicationId_IN > HAVING COUNT(*) = 0 ) > *** (1) WAITING FOR THIS LOCK TO BE GRANTED: > RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table > `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131538 > lock_mode X locks gap before rec insert intention waiting > Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info > bits 0 > 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; > asc application_1526295230627_1274; (total 31 bytes); > 1: len 6; hex 0ba5f32d; asc -;; > 2: len 7; hex dd00280110; asc ( ;; > 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;; > *** (2) TRANSACTION: > 
TRANSACTION 241131539, ACTIVE 0 sec inserting, thread declared inside InnoDB > 4999 > mysql tables in use 2, locked 2 > 4 lock struct(s), heap size 1184, 2 row lock(s) > MySQL thread id 7600638, OS thread handle 0x7f4685870700, query id 2919792535 > 192.168.1.138 federation executing > INSERT INTO applicationsHomeSubCluster > (applicationId,homeSubCluster) > (SELECT applicationId_IN, homeSubCluster_IN > FROM applicationsHomeSubCluster > WHERE applicationId = applicationId_IN > HAVING COUNT(*) = 0 ) > *** (2) HOLDS THE LOCK(S): > RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table >
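Yiran Wu's question above is about what the caller should do when one of the two racing routers is chosen as the deadlock victim. One option is to capture the transient error and retry with backoff. The sketch below is purely illustrative and is not the patch's code: `TransientDeadlockException` is a local stand-in for MySQL's `MySQLTransactionRollbackException` so the example needs no JDBC driver, and the retry policy (5 attempts, linear backoff) is an assumption.

```java
import java.util.concurrent.Callable;

public class DeadlockRetry {
    // Local stand-in for com.mysql.jdbc...MySQLTransactionRollbackException.
    static class TransientDeadlockException extends Exception {
        TransientDeadlockException(String msg) { super(msg); }
    }

    // Run a unit of work, retrying when it is killed as a deadlock victim.
    static <T> T withRetry(Callable<T> work, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.call();
            } catch (TransientDeadlockException e) {
                last = e;                 // deadlock victim: back off and retry
                Thread.sleep(10L * attempt);
            }
        }
        throw last;                        // all attempts exhausted
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated insert that deadlocks twice, then succeeds.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new TransientDeadlockException(
                    "Deadlock found when trying to get lock; try restarting transaction");
            }
            return "inserted";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Whether retrying is preferable to "INSERT IGNORE" depends on whether the caller needs to distinguish "row already present" from a genuine failure, which is exactly the question raised in the comment.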
[jira] [Commented] (YARN-8297) Incorrect ATS Url used for Wire encrypted cluster
[ https://issues.apache.org/jira/browse/YARN-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483819#comment-16483819 ] Sunil Govindan commented on YARN-8297: -- [~rohithsharma], could you please help to check the patch? > Incorrect ATS Url used for Wire encrypted cluster > - > > Key: YARN-8297 > URL: https://issues.apache.org/jira/browse/YARN-8297 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.1.0 >Reporter: Yesha Vora >Assignee: Sunil Govindan >Priority: Blocker > Fix For: 3.2.0, 3.1.1 > > Attachments: YARN-8297-addendum.patch, YARN-8297.001.patch > > > "Service" page uses incorrect web url for ATS in wire encrypted env. For ATS > urls, it uses https protocol with http port. > This issue causes all ATS calls to fail and UI does not display component > details. > url used: > https://xxx:8198/ws/v2/timeline/apps/application_1526357251888_0022/entities/SERVICE_ATTEMPT?fields=ALL&_=1526415938320 > expected url : > https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/SERVICE_ATTEMPT?fields=ALL&_=1526415938320 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
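The actual fix is in the yarn-ui-v2 Ember code, but the shape of the bug, an https URL assembled with the http port, can be shown with a small sketch. The helper name is hypothetical, and the port numbers (8198 for http, 8199 for https) are taken only from the URLs in the report above:

```java
public class TimelineUrl {
    // Hypothetical helper: the scheme and the port must be chosen together
    // from the same policy, never mixed (the bug paired https with 8198).
    static String timelineBase(String host, boolean httpsEnabled) {
        String scheme = httpsEnabled ? "https" : "http";
        int port = httpsEnabled ? 8199 : 8198;   // ports from the bug report
        return scheme + "://" + host + ":" + port + "/ws/v2/timeline";
    }

    public static void main(String[] args) {
        System.out.println(timelineBase("xxx", true));
    }
}
```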
[jira] [Commented] (YARN-8041) [Router] Federation: routing some missing REST invocations transparently to multiple RMs
[ https://issues.apache.org/jira/browse/YARN-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483838#comment-16483838 ] genericqa commented on YARN-8041: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 35s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 7s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 22s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 53s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 10 new + 18 unchanged - 0 fixed = 28 total (was 18) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 5s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 67m 8s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 23s{color} | {color:red} hadoop-yarn-server-router in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}141m 19s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.router.webapp.TestFederationInterceptorREST | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8041 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924505/YARN-8041.005.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux ee43e60accb6
[jira] [Commented] (YARN-6919) Add default volume mount list
[ https://issues.apache.org/jira/browse/YARN-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483859#comment-16483859 ] Shane Kumpf commented on YARN-6919: --- Thanks for the patch [~ebadger]! I tested this out and it is working as intended. +1 lgtm, I'll commit this later today if there are no objections. > Add default volume mount list > - > > Key: YARN-6919 > URL: https://issues.apache.org/jira/browse/YARN-6919 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Labels: Docker > Attachments: YARN-6919.001.patch, YARN-6919.002.patch > > > Piggybacking on YARN-5534, we should create a default list that bind mounts > selected volumes into all docker containers. This list will be empty by > default -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
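As a rough illustration of what a default bind-mount list involves, the sketch below parses a comma-separated `source:destination` value and yields an empty list when the value is unset, matching the "empty by default" behaviour described in the issue. The entry format and helper names are assumptions for illustration, not the patch's actual configuration keys:

```java
import java.util.ArrayList;
import java.util.List;

public class DefaultMounts {
    // Parse "src:dst,src:dst,..." into mount pairs; empty config => no mounts.
    static List<String[]> parseMounts(String conf) {
        List<String[]> mounts = new ArrayList<>();
        if (conf == null || conf.trim().isEmpty()) {
            return mounts;                       // empty by default
        }
        for (String entry : conf.split(",")) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Bad mount entry: " + entry);
            }
            mounts.add(parts);
        }
        return mounts;
    }

    public static void main(String[] args) {
        System.out.println(parseMounts("/etc/passwd:/etc/passwd,/opt/lib:/opt/lib").size());
    }
}
```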
[jira] [Commented] (YARN-8273) Log aggregation does not warn if HDFS quota in target directory is exceeded
[ https://issues.apache.org/jira/browse/YARN-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483871#comment-16483871 ] Gergo Repas commented on YARN-8273: --- [~rkanter] Thanks for the review! Yes, indeed LogAggregationDFSException can be a checked exception (and a subclass of YarnException), I've updated the patch. > Log aggregation does not warn if HDFS quota in target directory is exceeded > --- > > Key: YARN-8273 > URL: https://issues.apache.org/jira/browse/YARN-8273 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation >Affects Versions: 3.1.0 >Reporter: Gergo Repas >Assignee: Gergo Repas >Priority: Major > Attachments: YARN-8273.000.patch, YARN-8273.001.patch, > YARN-8273.002.patch, YARN-8273.003.patch, YARN-8273.004.patch, > YARN-8273.005.patch, YARN-8273.006.patch > > > It appears that if an HDFS space quota is set on a target directory for log > aggregation and the quota is already exceeded when log aggregation is > attempted, zero-byte log files will be written to the HDFS directory, however > NodeManager logs do not reflect a failure to write the files successfully > (i.e. there are no ERROR or WARN messages to this effect). > An improvement may be worth investigating to alert users to this scenario, as > otherwise logs for a YARN application may be missing both on HDFS and locally > (after local log cleanup is done) and the user may not otherwise be informed. > Steps to reproduce: > * Set a small HDFS space quota on /tmp/logs/username/logs (e.g. 
2MB) > * Write files to HDFS such that /tmp/logs/username/logs is almost 2MB full > * Run a Spark or MR job in the cluster > * Observe that zero byte files are written to HDFS after job completion > * Observe that YARN container logs are also not present on the NM hosts (or > are deleted after yarn.nodemanager.delete.debug-delay-sec) > * Observe that no ERROR or WARN messages appear to be logged in the NM role > log -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
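The review exchange above settles on making LogAggregationDFSException a checked exception and a subclass of YarnException, so DFS write failures such as an exceeded quota cannot be silently dropped. A minimal sketch of that shape is below; it uses a local stand-in for `org.apache.hadoop.yarn.exceptions.YarnException` so the example compiles without Hadoop on the classpath, and the method and message are illustrative only:

```java
public class LogAggregationError {
    // Stand-in for org.apache.hadoop.yarn.exceptions.YarnException (checked).
    static class YarnException extends Exception {
        YarnException(String msg, Throwable cause) { super(msg, cause); }
    }

    // Checked subclass: callers must handle DFS write failures explicitly.
    static class LogAggregationDFSException extends YarnException {
        LogAggregationDFSException(String msg, Throwable cause) { super(msg, cause); }
    }

    // Illustrative writer close: surfaces the quota failure instead of
    // letting a zero-byte log file be written silently.
    static void closeWriter(boolean quotaExceeded) throws LogAggregationDFSException {
        if (quotaExceeded) {
            throw new LogAggregationDFSException(
                "Failed to write aggregated logs: DFS space quota exceeded", null);
        }
    }

    public static void main(String[] args) {
        try {
            closeWriter(true);
        } catch (LogAggregationDFSException e) {
            System.out.println("WARN " + e.getMessage());  // now visible in NM logs
        }
    }
}
```

Because the exception is checked, the compiler forces the NodeManager-side caller to either log or propagate it, which is precisely the missing WARN/ERROR the issue describes.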
[jira] [Updated] (YARN-8173) [Router] Implement missing FederationClientInterceptor#getApplications()
[ https://issues.apache.org/jira/browse/YARN-8173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiran Wu updated YARN-8173: --- Attachment: YARN-8041.004.patch > [Router] Implement missing FederationClientInterceptor#getApplications() > > > Key: YARN-8173 > URL: https://issues.apache.org/jira/browse/YARN-8173 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Yiran Wu >Assignee: Yiran Wu >Priority: Major > Attachments: YARN-8041.004.patch, YARN-8173.001.patch, > YARN-8173.002.patch, YARN-8173.003.patch > > > Implement the oozie-dependent methods: > {code:java} > getApplications() > getDelegationToken() > {code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8173) [Router] Implement missing FederationClientInterceptor#getApplications()
[ https://issues.apache.org/jira/browse/YARN-8173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiran Wu updated YARN-8173: --- Attachment: (was: YARN-8041.004.patch) > [Router] Implement missing FederationClientInterceptor#getApplications() > > > Key: YARN-8173 > URL: https://issues.apache.org/jira/browse/YARN-8173 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Yiran Wu >Assignee: Yiran Wu >Priority: Major > Attachments: YARN-8173.001.patch, YARN-8173.002.patch, > YARN-8173.003.patch > > > Implement the oozie-dependent methods: > {code:java} > getApplications() > getDelegationToken() > {code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8297) Incorrect ATS Url used for Wire encrypted cluster
[ https://issues.apache.org/jira/browse/YARN-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated YARN-8297: - Attachment: YARN-8297-addendum.patch > Incorrect ATS Url used for Wire encrypted cluster > - > > Key: YARN-8297 > URL: https://issues.apache.org/jira/browse/YARN-8297 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.1.0 >Reporter: Yesha Vora >Assignee: Sunil Govindan >Priority: Blocker > Fix For: 3.2.0, 3.1.1 > > Attachments: YARN-8297-addendum.patch, YARN-8297.001.patch > > > "Service" page uses incorrect web url for ATS in wire encrypted env. For ATS > urls, it uses https protocol with http port. > This issue causes all ATS call to fail and UI does not display component > details. > url used: > https://xxx:8198/ws/v2/timeline/apps/application_1526357251888_0022/entities/SERVICE_ATTEMPT?fields=ALL&_=1526415938320 > expected url : > https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/SERVICE_ATTEMPT?fields=ALL&_=1526415938320 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Reopened] (YARN-8297) Incorrect ATS Url used for Wire encrypted cluster
[ https://issues.apache.org/jira/browse/YARN-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan reopened YARN-8297: -- This issue was not handled cleanly; an addendum patch is needed. > Incorrect ATS Url used for Wire encrypted cluster > - > > Key: YARN-8297 > URL: https://issues.apache.org/jira/browse/YARN-8297 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.1.0 >Reporter: Yesha Vora >Assignee: Sunil Govindan >Priority: Blocker > Fix For: 3.2.0, 3.1.1 > > Attachments: YARN-8297-addendum.patch, YARN-8297.001.patch > > > "Service" page uses incorrect web url for ATS in wire encrypted env. For ATS > urls, it uses https protocol with http port. > This issue causes all ATS calls to fail and UI does not display component > details. > url used: > https://xxx:8198/ws/v2/timeline/apps/application_1526357251888_0022/entities/SERVICE_ATTEMPT?fields=ALL&_=1526415938320 > expected url : > https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/SERVICE_ATTEMPT?fields=ALL&_=1526415938320 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8297) Incorrect ATS Url used for Wire encrypted cluster
[ https://issues.apache.org/jira/browse/YARN-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483886#comment-16483886 ] genericqa commented on YARN-8297: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 34s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 36m 16s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 48s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 49m 30s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8297 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924526/YARN-8297-addendum.patch | | Optional Tests | asflicense shadedclient | | uname | Linux 6f1bf834c6e8 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 57c2feb | | maven | version: Apache Maven 3.3.9 | | Max. process+thread count | 312 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/20820/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Incorrect ATS Url used for Wire encrypted cluster > - > > Key: YARN-8297 > URL: https://issues.apache.org/jira/browse/YARN-8297 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Affects Versions: 3.1.0 >Reporter: Yesha Vora >Assignee: Sunil Govindan >Priority: Blocker > Fix For: 3.2.0, 3.1.1 > > Attachments: YARN-8297-addendum.patch, YARN-8297.001.patch > > > "Service" page uses incorrect web url for ATS in wire encrypted env. For ATS > urls, it uses https protocol with http port. > This issue causes all ATS call to fail and UI does not display component > details. 
> url used: > https://xxx:8198/ws/v2/timeline/apps/application_1526357251888_0022/entities/SERVICE_ATTEMPT?fields=ALL&_=1526415938320 > expected url : > https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/SERVICE_ATTEMPT?fields=ALL&_=1526415938320 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8173) [Router] Implement missing FederationClientInterceptor#getApplications()
[ https://issues.apache.org/jira/browse/YARN-8173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiran Wu updated YARN-8173: --- Attachment: YARN-8041.004.patch > [Router] Implement missing FederationClientInterceptor#getApplications() > > > Key: YARN-8173 > URL: https://issues.apache.org/jira/browse/YARN-8173 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Yiran Wu >Assignee: Yiran Wu >Priority: Major > Attachments: YARN-8173.001.patch, YARN-8173.002.patch, > YARN-8173.003.patch > > > Implement the oozie-dependent methods: > {code:java} > getApplications() > getDelegationToken() > {code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8337) Deadlock Federation Router
[ https://issues.apache.org/jira/browse/YARN-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483773#comment-16483773 ] Jianchao Jia edited comment on YARN-8337 at 5/22/18 10:41 AM: -- [~yiran] Thanks for your comment. If the record exists in the table, "row_count()" will return zero; otherwise it will return one. SQLFederationStateStore.java handles the two return values differently. [~giovanni.fumarola] can you review this, or give other advice? was (Author: jianchao jia): [~yiran] Thanks for you comment. If the record exists in the table, "row_count()" will return zero,otherwise will return one. In SQLFederationStateStore.java,have different treatment according to different return values. [~giovanni.fumarola] can you review this ,or gave other advice? > Deadlock Federation Router > -- > > Key: YARN-8337 > URL: https://issues.apache.org/jira/browse/YARN-8337 > Project: Hadoop YARN > Issue Type: Bug > Components: federation, router >Reporter: Jianchao Jia >Priority: Major > Attachments: YARN-8337.001.patch > > > We use MySQL InnoDB as the state store engine. In the router log we found a deadlock error like below: > {code:java} > [2018-05-21T15:41:40.383+08:00] [ERROR] [IPC Server handler 25 on 8050] : > Unable to insert the newly generated application > application_1526295230627_127402 > com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock > found when trying to get lock; try restarting transaction > at sun.reflect.GeneratedConstructorAccessor107.newInstance(Unknown > Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at com.mysql.jdbc.Util.handleNewInstance(Util.java:425) > at com.mysql.jdbc.Util.getInstance(Util.java:408) > at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952) > at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3973) > at 
com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3909) > at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527) > at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2680) > at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2484) > at > com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858) > at > com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2079) > at > com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2013) > at > com.mysql.jdbc.PreparedStatement.executeLargeUpdate(PreparedStatement.java:5104) > at > com.mysql.jdbc.CallableStatement.executeLargeUpdate(CallableStatement.java:2418) > at > com.mysql.jdbc.CallableStatement.executeUpdate(CallableStatement.java:887) > at > com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61) > at > com.zaxxer.hikari.pool.HikariProxyCallableStatement.executeUpdate(HikariProxyCallableStatement.java) > at > org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore.addApplicationHomeSubCluster(SQLFederationStateStore.java:547) > {code} > Use "show engine innodb status;" command to find what happens > {code:java} > 2018-05-21 15:41:40 7f4685870700 > *** (1) TRANSACTION: > TRANSACTION 241131538, ACTIVE 0 sec inserting, thread declared inside InnoDB > 4999 > mysql tables in use 2, locked 2 > LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s) > MySQL thread id 7602335, OS thread handle 0x7f46858f2700, query id 2919792534 > 192.168.1.138 federation executing > INSERT INTO applicationsHomeSubCluster > (applicationId,homeSubCluster) > (SELECT applicationId_IN, homeSubCluster_IN > FROM applicationsHomeSubCluster > WHERE applicationId = applicationId_IN > HAVING COUNT(*) = 0 ) > *** (1) WAITING FOR THIS LOCK TO BE GRANTED: > RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table > `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131538 > 
lock_mode X locks gap before rec insert intention waiting > Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info > bits 0 > 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; > asc application_1526295230627_1274; (total 31 bytes); > 1: len 6; hex 0ba5f32d; asc -;; > 2: len 7; hex dd00280110; asc ( ;; > 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;; > *** (2) TRANSACTION: > TRANSACTION 241131539, ACTIVE 0 sec inserting, thread declared inside InnoDB > 4999 > mysql tables in use 2, locked 2 > 4 lock struct(s), heap size 1184, 2 row lock(s) > MySQL
[jira] [Commented] (YARN-8337) Deadlock Federation Router
[ https://issues.apache.org/jira/browse/YARN-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483773#comment-16483773 ] Jianchao Jia commented on YARN-8337: [~yiran] Thanks for your comment. If the record exists in the table, "row_count()" will return zero; otherwise it will return one. SQLFederationStateStore.java handles the two return values differently. [~giovanni.fumarola] can you review this, or give other advice? > Deadlock Federation Router > -- > > Key: YARN-8337 > URL: https://issues.apache.org/jira/browse/YARN-8337 > Project: Hadoop YARN > Issue Type: Bug > Components: federation, router >Reporter: Jianchao Jia >Priority: Major > Attachments: YARN-8337.001.patch > > > We use MySQL InnoDB as the state store engine. In the router log we found a deadlock error like below: > {code:java} > [2018-05-21T15:41:40.383+08:00] [ERROR] [IPC Server handler 25 on 8050] : > Unable to insert the newly generated application > application_1526295230627_127402 > com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock > found when trying to get lock; try restarting transaction > at sun.reflect.GeneratedConstructorAccessor107.newInstance(Unknown > Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at com.mysql.jdbc.Util.handleNewInstance(Util.java:425) > at com.mysql.jdbc.Util.getInstance(Util.java:408) > at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952) > at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3973) > at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3909) > at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527) > at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2680) > at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2484) > at > com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858) > at > 
com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2079) > at > com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2013) > at > com.mysql.jdbc.PreparedStatement.executeLargeUpdate(PreparedStatement.java:5104) > at > com.mysql.jdbc.CallableStatement.executeLargeUpdate(CallableStatement.java:2418) > at > com.mysql.jdbc.CallableStatement.executeUpdate(CallableStatement.java:887) > at > com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61) > at > com.zaxxer.hikari.pool.HikariProxyCallableStatement.executeUpdate(HikariProxyCallableStatement.java) > at > org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore.addApplicationHomeSubCluster(SQLFederationStateStore.java:547) > {code} > Use "show engine innodb status;" command to find what happens > {code:java} > 2018-05-21 15:41:40 7f4685870700 > *** (1) TRANSACTION: > TRANSACTION 241131538, ACTIVE 0 sec inserting, thread declared inside InnoDB > 4999 > mysql tables in use 2, locked 2 > LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s) > MySQL thread id 7602335, OS thread handle 0x7f46858f2700, query id 2919792534 > 192.168.1.138 federation executing > INSERT INTO applicationsHomeSubCluster > (applicationId,homeSubCluster) > (SELECT applicationId_IN, homeSubCluster_IN > FROM applicationsHomeSubCluster > WHERE applicationId = applicationId_IN > HAVING COUNT(*) = 0 ) > *** (1) WAITING FOR THIS LOCK TO BE GRANTED: > RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table > `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131538 > lock_mode X locks gap before rec insert intention waiting > Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info > bits 0 > 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; > asc application_1526295230627_1274; (total 31 bytes); > 1: len 6; hex 0ba5f32d; asc -;; > 2: len 7; hex dd00280110; asc ( 
;; > 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;; > *** (2) TRANSACTION: > TRANSACTION 241131539, ACTIVE 0 sec inserting, thread declared inside InnoDB > 4999 > mysql tables in use 2, locked 2 > 4 lock struct(s), heap size 1184, 2 row lock(s) > MySQL thread id 7600638, OS thread handle 0x7f4685870700, query id 2919792535 > 192.168.1.138 federation executing > INSERT INTO applicationsHomeSubCluster > (applicationId,homeSubCluster) > (SELECT applicationId_IN, homeSubCluster_IN > FROM applicationsHomeSubCluster > WHERE applicationId = applicationId_IN > HAVING COUNT(*) = 0 ) > *** (2) HOLDS THE LOCK(S): > RECORD LOCKS
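The row_count() semantics described above can be illustrated with a small stand-in. The sketch below uses SQLite's INSERT OR IGNORE (the analogue of MySQL's INSERT IGNORE) against a throwaway copy of the applicationsHomeSubCluster table; the helper name is hypothetical, but the 1-then-0 pattern is the same one MySQL's ROW_COUNT() reports, which is what the state store can branch on instead of retrying:

```python
import sqlite3

# In-memory stand-in for the MySQL state store table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE applicationsHomeSubCluster ("
    "applicationId TEXT PRIMARY KEY, homeSubCluster TEXT)"
)

def add_app_home_subcluster(conn, app_id, sub_cluster):
    """Insert-if-absent: returns 1 if the row was inserted, 0 if it existed."""
    before = conn.total_changes
    conn.execute(
        "INSERT OR IGNORE INTO applicationsHomeSubCluster VALUES (?, ?)",
        (app_id, sub_cluster),
    )
    return conn.total_changes - before

first = add_app_home_subcluster(conn, "application_1526295230627_127402", "hope_cluster1")
second = add_app_home_subcluster(conn, "application_1526295230627_127402", "hope_cluster2")
print(first, second)  # 1 0: the duplicate insert is silently ignored, not an error
```

Because the ignored insert is not an error, the caller never sees a deadlock-style exception; it just observes a zero row count and knows another Router already registered the application's home sub-cluster.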
[jira] [Commented] (YARN-8337) Deadlock Federation Router
[ https://issues.apache.org/jira/browse/YARN-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483772#comment-16483772 ] genericqa commented on YARN-8337: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 48s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 20m 43s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8337 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924514/YARN-8337.001.patch | | Optional Tests | asflicense | | uname | Linux 27b9a4520364 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 57c2feb | | maven | version: Apache Maven 3.3.9 | | Max. process+thread count | 475 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn U: hadoop-yarn-project/hadoop-yarn | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/20819/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. 
[jira] [Commented] (YARN-8320) Add support CPU isolation for latency-sensitive (LS) service
[ https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483813#comment-16483813 ] Weiwei Yang commented on YARN-8320: --- Some updates: I am working with [~yangjiandan] on polishing the design doc and will add more details and explanations this week. Please feel free to comment and share your thoughts. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8335) Privileged docker containers' jobSubmitDir does not get successfully cleaned up
[ https://issues.apache.org/jira/browse/YARN-8335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483863#comment-16483863 ] Shane Kumpf commented on YARN-8335: --- Is this a dupe of YARN-7904? > Privileged docker containers' jobSubmitDir does not get successfully cleaned > up > --- > > Key: YARN-8335 > URL: https://issues.apache.org/jira/browse/YARN-8335 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Priority: Major > Labels: Docker > > The jobSubmitDir directory is owned by root and is being cleaned up as the > submitting user, which appears to be why it is failing to clean up. > {noformat} > 2018-05-21 19:46:15,124 WARN [DeletionService #0] > privileged.PrivilegedOperationExecutor > (PrivilegedOperationExecutor.java:executePrivilegedOperation(174)) - Shell > execution returned exit code: 255. Privileged Execution Operation Stderr: > Stdout: main : command provided 3 > main : run as user is ebadger > main : requested yarn user is ebadger > failed to unlink > /tmp/hadoop-local3/usercache/ebadger/appcache/application_1526931492976_0007/container_1526931492976_0007_01_01/jobSubmitDir/job.split: > Permission denied > failed to unlink > /tmp/hadoop-local3/usercache/ebadger/appcache/application_1526931492976_0007/container_1526931492976_0007_01_01/jobSubmitDir/job.splitmetainfo: > Permission denied > failed to rmdir jobSubmitDir: Directory not empty > Error while deleting > /tmp/hadoop-local3/usercache/ebadger/appcache/application_1526931492976_0007/container_1526931492976_0007_01_01: > 39 (Directory not empty) > Full command array for failed execution: > [/hadoop-3.2.0-SNAPSHOT/bin/container-executor, ebadger, ebadger, 3, > /tmp/hadoop-local3/usercache/ebadger/appcache/application_1526931492976_0007/container_1526931492976_0007_01_01] > 2018-05-21 19:46:15,124 ERROR [DeletionService #0] > nodemanager.LinuxContainerExecutor > (LinuxContainerExecutor.java:deleteAsUser(848)) - DeleteAsUser for > 
/tmp/hadoop-local3/usercache/ebadger/appcache/application_1526931492976_0007/container_1526931492976_0007_01_01 > returned with exit code: 255 > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: > ExitCodeException exitCode=255: > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:180) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:206) > at > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.deleteAsUser(LinuxContainerExecutor.java:844) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.deletion.task.FileDeletionTask.run(FileDeletionTask.java:135) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: ExitCodeException exitCode=255: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:1009) > at org.apache.hadoop.util.Shell.run(Shell.java:902) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:152) > ... 
10 more > {noformat} > {noformat} > [foo@bar hadoop]$ ls -l > /tmp/hadoop-local3/usercache/ebadger/appcache/application_1526931492976_0007/container_1526931492976_0007_01_01/ > total 4 > drwxr-sr-x 2 root users 4096 May 21 19:45 jobSubmitDir > {noformat}
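The failure above follows from POSIX unlink semantics: whether a user may delete a file depends on their write+execute permission on the *containing directory*, not on the file itself. Since jobSubmitDir is root-owned with mode drwxr-sr-x, the submitting user has no write bit on it and every unlink inside fails. A minimal pure-logic sketch of that rule (ignoring root override, ACLs, capabilities, and the sticky bit; the uids/gids are illustrative):

```python
def can_unlink(dir_mode, dir_uid, dir_gid, uid, gids):
    """POSIX-style check: a user may unlink a file iff they have write and
    execute permission on the containing directory; the file's own
    ownership is irrelevant. (Ignores root, ACLs, capabilities, sticky bit.)"""
    if uid == dir_uid:
        perms = (dir_mode >> 6) & 0o7
    elif dir_gid in gids:
        perms = (dir_mode >> 3) & 0o7
    else:
        perms = dir_mode & 0o7
    return perms & 0o3 == 0o3  # need both write (0o2) and execute (0o1)

# jobSubmitDir from the log: drwxr-sr-x (02755), owned by root:users.
# A submitting user (illustrative uid 1000, member of "users", gid 100)
# cannot unlink job.split inside it, while the owner (root) can.
print(can_unlink(0o2755, 0, 100, 1000, {100}))  # False
print(can_unlink(0o2755, 0, 100, 0, {0}))       # True
```

This is why DeleteAsUser as the submitting user reports "Permission denied" on the files and then "Directory not empty" on jobSubmitDir itself.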
[jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483864#comment-16483864 ] Gergo Repas commented on YARN-8191: --- [~haibochen] Thanks for the review! 1) - Good point, I fixed it. 2) - This logic originates from a suggestion by [~wilfreds] (Wilfred - please correct me if I'm wrong about the intentions behind {{getRemovedStaticQueues(), setQueuesToDynamic()}}). The point here is that the set of removed queues can be gathered in {{AllocationReloadListener.onReload()}} outside of the writeLock. It's safe to do so because onReload() is only called from the synchronized {{AllocationFileLoaderService.reloadAllocations()}} method. This way the {{AllocationReloadListener.getRemovedStaticQueues()}} logic is subject to the least amount of locking. Thread safety was indeed missing for {{QueueManager.setQueuesToDynamic()}}; I've added the missing synchronized block. 3) Sorry, what do you mean by "What about the other case where some dynamic queues are not added as static in the new allocation file?"? If you mean dynamic queue creation via application submission, the test case for this (plus the removal) is {{TestQueueManager.testRemovalOfDynamicLeafQueue()}}. 4-5) I have refactored this part of the code: I removed getIncompatibleQueueName() and changed only the return type of removeEmptyIncompatibleQueues() to indicate whether any queue removal was attempted. 6) {{updateAllocationConfiguration()}} is only called when the configuration file has been modified, so if, for example, there is only one configuration modification during the lifetime of the RM, incompatible queues would not be cleaned up until a restart. 
> Fair scheduler: queue deletion without RM restart > - > > Key: YARN-8191 > URL: https://issues.apache.org/jira/browse/YARN-8191 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.0.1 >Reporter: Gergo Repas >Assignee: Gergo Repas >Priority: Major > Attachments: Queue Deletion in Fair Scheduler.pdf, > YARN-8191.000.patch, YARN-8191.001.patch, YARN-8191.002.patch, > YARN-8191.003.patch, YARN-8191.004.patch, YARN-8191.005.patch, > YARN-8191.006.patch, YARN-8191.007.patch, YARN-8191.008.patch, > YARN-8191.009.patch, YARN-8191.010.patch, YARN-8191.011.patch, > YARN-8191.012.patch, YARN-8191.013.patch > > > The Fair Scheduler never cleans up queues even if they are deleted in the > allocation file, or were dynamically created and are never going to be used > again. Queues always remain in memory which leads to two following issues. > # Steady fairshares aren’t calculated correctly due to remaining queues > # WebUI shows deleted queues, which is confusing for users (YARN-4022). > We want to support proper queue deletion without restarting the Resource > Manager: > # Static queues without any entries that are removed from fair-scheduler.xml > should be deleted from memory. > # Dynamic queues without any entries should be deleted. > # RM Web UI should only show the queues defined in the scheduler at that > point in time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
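The locking arrangement described in point 2 above can be sketched as follows. The class and method names echo the patch's Java names, but this is an illustrative Python reduction, not the scheduler code: the removed-queue diff is computed outside the queue lock (safe because {{reloadAllocations()}} is serialized), and only the state flip happens under {{QueueManager}}'s lock.

```python
import threading

class QueueManager:
    """Illustrative stand-in for the Fair Scheduler's QueueManager."""
    def __init__(self, static_queues):
        self._lock = threading.Lock()
        self._static = set(static_queues)
        self._dynamic = set()

    def set_queues_to_dynamic(self, names):
        # The synchronized block added in the patch: mutate shared state under lock.
        with self._lock:
            names = set(names)
            self._static -= names
            self._dynamic |= names

    def is_dynamic(self, name):
        with self._lock:
            return name in self._dynamic

def on_reload(queue_manager, old_static, new_static):
    # Runs only from the serialized reloadAllocations() path, so the diff
    # can be computed without holding the queue lock at all.
    removed = old_static - new_static
    queue_manager.set_queues_to_dynamic(removed)
    return removed

qm = QueueManager({"root.a", "root.b"})
removed = on_reload(qm, {"root.a", "root.b"}, {"root.a"})
print(removed)  # {'root.b'}
```

Queues flipped to dynamic this way can then be garbage-collected once empty, without ever widening the lock scope of the reload path.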
[jira] [Created] (YARN-8337) Deadlock Federation Router
Jianchao Jia created YARN-8337: -- Summary: Deadlock Federation Router Key: YARN-8337 URL: https://issues.apache.org/jira/browse/YARN-8337 Project: Hadoop YARN Issue Type: Bug Components: federation, router Reporter: Jianchao Jia We use mysql innodb as the state store engine,in router log we found dead lock error like below: {code:java} [2018-05-21T15:41:40.383+08:00] [ERROR] [IPC Server handler 25 on 8050] : Unable to insert the newly generated application application_1526295230627_127402 com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction at sun.reflect.GeneratedConstructorAccessor107.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at com.mysql.jdbc.Util.handleNewInstance(Util.java:425) at com.mysql.jdbc.Util.getInstance(Util.java:408) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952) at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3973) at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3909) at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527) at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2680) at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2484) at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858) at com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2079) at com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2013) at com.mysql.jdbc.PreparedStatement.executeLargeUpdate(PreparedStatement.java:5104) at com.mysql.jdbc.CallableStatement.executeLargeUpdate(CallableStatement.java:2418) at com.mysql.jdbc.CallableStatement.executeUpdate(CallableStatement.java:887) at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61) at 
com.zaxxer.hikari.pool.HikariProxyCallableStatement.executeUpdate(HikariProxyCallableStatement.java) at org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore.addApplicationHomeSubCluster(SQLFederationStateStore.java:547) {code} Use "show engine innodb status;" command to find what happens {code:java} 2018-05-21 15:41:40 7f4685870700 *** (1) TRANSACTION: TRANSACTION 241131538, ACTIVE 0 sec inserting, thread declared inside InnoDB 4999 mysql tables in use 2, locked 2 LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s) MySQL thread id 7602335, OS thread handle 0x7f46858f2700, query id 2919792534 192.168.1.138 federation executing INSERT INTO applicationsHomeSubCluster (applicationId,homeSubCluster) (SELECT applicationId_IN, homeSubCluster_IN FROM applicationsHomeSubCluster WHERE applicationId = applicationId_IN HAVING COUNT(*) = 0 ) *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131538 lock_mode X locks gap before rec insert intention waiting Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info bits 0 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; asc application_1526295230627_1274; (total 31 bytes); 1: len 6; hex 0ba5f32d; asc -;; 2: len 7; hex dd00280110; asc ( ;; 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;; *** (2) TRANSACTION: TRANSACTION 241131539, ACTIVE 0 sec inserting, thread declared inside InnoDB 4999 mysql tables in use 2, locked 2 4 lock struct(s), heap size 1184, 2 row lock(s) MySQL thread id 7600638, OS thread handle 0x7f4685870700, query id 2919792535 192.168.1.138 federation executing INSERT INTO applicationsHomeSubCluster (applicationId,homeSubCluster) (SELECT applicationId_IN, homeSubCluster_IN FROM applicationsHomeSubCluster WHERE applicationId = applicationId_IN HAVING COUNT(*) = 0 ) *** (2) HOLDS THE LOCK(S): 
RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131539 lock mode S locks gap before rec Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info bits 0 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; asc application_1526295230627_1274; (total 31 bytes); 1: len 6; hex 0ba5f32d; asc -;; 2: len 7; hex dd00280110; asc ( ;; 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;; *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table `guldan_federationstatestore`.`applicationshomesubcluster` trx id
[jira] [Updated] (YARN-8320) Support CPU isolation for latency-sensitive (LS) service
[ https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8320: -- Description: Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and “cpu.shares” to isolate cpu resource. However, * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler; no support for differentiated latency * Request latency of services running on container may be frequent shake when all containers share cpus, and latency-sensitive services can not afford in our production environment. So we need more fine-grained cpu isolation. Here we propose a solution using cgroup cpuset to binds containers to different processors, this is inspired by the isolation technique in [Borg system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf]. was: Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and “cpu.shares” to isolate cpu resource. However, * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler; no support for differentiated latency * Request latency of services running on container may be frequent shake when all containers share cpus, and latency-sensitive services can not afford in our production environment. So we need more finer cpu isolation. My co-workers and I propose a solution using cgroup cpuset to binds containers to different processors, this is inspired by the isolation technique in [Borg system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf]. Later I will upload a detailed design doc. 
> Support CPU isolation for latency-sensitive (LS) service > > > Key: YARN-8320 > URL: https://issues.apache.org/jira/browse/YARN-8320 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Jiandan Yang >Priority: Major > Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf, > YARN-8320.001.patch > > > Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and > “cpu.shares” to isolate cpu resource. However, > * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler; > no support for differentiated latency > * Request latency of services running on container may be frequent shake > when all containers share cpus, and latency-sensitive services can not afford > in our production environment. > So we need more fine-grained cpu isolation. > Here we propose a solution using cgroup cpuset to binds containers to > different processors, this is inspired by the isolation technique in [Borg > system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf]. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
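To make the proposal above concrete, here is a hedged sketch of what binding latency-sensitive containers to dedicated processors could look like. Everything here is hypothetical (the helper names, the greedy placement policy, and the cgroup path in the comment are illustrative, not the patch's design); the real work is formatting core ids into cpuset.cpus syntax and keeping assignments disjoint:

```python
def format_cpuset(cores):
    """Render core ids in cpuset.cpus syntax, e.g. {0, 1, 2, 5} -> '0-2,5'."""
    cores = sorted(cores)
    ranges, start = [], None
    for i, c in enumerate(cores):
        if start is None:
            start = c
        if i + 1 == len(cores) or cores[i + 1] != c + 1:
            ranges.append(str(start) if start == c else f"{start}-{c}")
            start = None
    return ",".join(ranges)

def assign_exclusive_cores(containers, total_cores):
    """Greedily give each LS container its requested number of dedicated
    cores; raises if the node runs out. Returns {name: cpuset string}."""
    assignment, next_core = {}, 0
    for name, vcores in containers:
        if next_core + vcores > total_cores:
            raise ValueError("not enough cores for exclusive placement")
        assignment[name] = format_cpuset(range(next_core, next_core + vcores))
        next_core += vcores
    return assignment

# An NM with 8 cores and two LS containers; each value would be written to a
# per-container cpuset.cpus file (e.g. under /sys/fs/cgroup/cpuset/...).
print(assign_exclusive_cores([("container_01", 4), ("container_02", 2)], 8))
```

Unlike cpu.shares/cfs_quota_us, which time-share every core, a disjoint cpuset keeps batch containers' scheduling churn off the cores serving latency-sensitive requests.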
[jira] [Comment Edited] (YARN-8320) Support CPU isolation for latency-sensitive (LS) service
[ https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483813#comment-16483813 ] Weiwei Yang edited comment on YARN-8320 at 5/22/18 11:39 AM: - Some updates, I am working with [~yangjiandan] on polishing the design doc, will add more details and explanations this week. We've done some prototype already as the WIP patch [~yangjiandan] has shared. Please feel free to comment and share your thoughts. was (Author: cheersyang): Some updates, I am working with [~yangjiandan] on polishing the design doc, will add more details and explanations this week. Please feel free to comment and share your thoughts. > Support CPU isolation for latency-sensitive (LS) service > > > Key: YARN-8320 > URL: https://issues.apache.org/jira/browse/YARN-8320 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Jiandan Yang >Priority: Major > Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf, > YARN-8320.001.patch > > > Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and > “cpu.shares” to isolate cpu resource. However, > * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler; > no support for differentiated latency > * Request latency of services running on container may be frequent shake > when all containers share cpus, and latency-sensitive services can not afford > in our production environment. > So we need more fine-grained cpu isolation. > Here we propose a solution using cgroup cpuset to binds containers to > different processors, this is inspired by the isolation technique in [Borg > system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf]. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8329) Docker client configuration can still be set incorrectly
[ https://issues.apache.org/jira/browse/YARN-8329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483852#comment-16483852 ] Shane Kumpf commented on YARN-8329: --- Thanks for the review [~jlowe]! {quote}I'm not seeing why the copy is necessary. Eliminating the copy would also eliminate the need to do a token identifier decode to construct an alias.{quote} Good point. I think the original method to extract all the credentials from the token ByteBuffer has value, but not in its current form or location. I'll put up a patch to clean this up. > Docker client configuration can still be set incorrectly > > > Key: YARN-8329 > URL: https://issues.apache.org/jira/browse/YARN-8329 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.2.0, 3.1.1 >Reporter: Shane Kumpf >Assignee: Shane Kumpf >Priority: Major > Labels: Docker > Attachments: YARN-8329.001.patch > > > YARN-7996 implemented a fix to avoid writing an empty Docker client > configuration file, but there are still cases where the {{docker --config}} > argument is set in error. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8191) Fair scheduler: queue deletion without RM restart
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergo Repas updated YARN-8191: -- Attachment: YARN-8191.014.patch > Fair scheduler: queue deletion without RM restart > - > > Key: YARN-8191 > URL: https://issues.apache.org/jira/browse/YARN-8191 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.0.1 >Reporter: Gergo Repas >Assignee: Gergo Repas >Priority: Major > Attachments: Queue Deletion in Fair Scheduler.pdf, > YARN-8191.000.patch, YARN-8191.001.patch, YARN-8191.002.patch, > YARN-8191.003.patch, YARN-8191.004.patch, YARN-8191.005.patch, > YARN-8191.006.patch, YARN-8191.007.patch, YARN-8191.008.patch, > YARN-8191.009.patch, YARN-8191.010.patch, YARN-8191.011.patch, > YARN-8191.012.patch, YARN-8191.013.patch, YARN-8191.014.patch > > > The Fair Scheduler never cleans up queues even if they are deleted in the > allocation file, or were dynamically created and are never going to be used > again. Queues always remain in memory which leads to two following issues. > # Steady fairshares aren’t calculated correctly due to remaining queues > # WebUI shows deleted queues, which is confusing for users (YARN-4022). > We want to support proper queue deletion without restarting the Resource > Manager: > # Static queues without any entries that are removed from fair-scheduler.xml > should be deleted from memory. > # Dynamic queues without any entries should be deleted. > # RM Web UI should only show the queues defined in the scheduler at that > point in time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8273) Log aggregation does not warn if HDFS quota in target directory is exceeded
[ https://issues.apache.org/jira/browse/YARN-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergo Repas updated YARN-8273: -- Attachment: YARN-8273.006.patch > Log aggregation does not warn if HDFS quota in target directory is exceeded > --- > > Key: YARN-8273 > URL: https://issues.apache.org/jira/browse/YARN-8273 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation >Affects Versions: 3.1.0 >Reporter: Gergo Repas >Assignee: Gergo Repas >Priority: Major > Attachments: YARN-8273.000.patch, YARN-8273.001.patch, > YARN-8273.002.patch, YARN-8273.003.patch, YARN-8273.004.patch, > YARN-8273.005.patch, YARN-8273.006.patch > > > It appears that if an HDFS space quota is set on a target directory for log > aggregation and the quota is already exceeded when log aggregation is > attempted, zero-byte log files will be written to the HDFS directory, however > NodeManager logs do not reflect a failure to write the files successfully > (i.e. there are no ERROR or WARN messages to this effect). > An improvement may be worth investigating to alert users to this scenario, as > otherwise logs for a YARN application may be missing both on HDFS and locally > (after local log cleanup is done) and the user may not otherwise be informed. > Steps to reproduce: > * Set a small HDFS space quota on /tmp/logs/username/logs (e.g. 2MB) > * Write files to HDFS such that /tmp/logs/username/logs is almost 2MB full > * Run a Spark or MR job in the cluster > * Observe that zero byte files are written to HDFS after job completion > * Observe that YARN container logs are also not present on the NM hosts (or > are deleted after yarn.nodemanager.delete.debug-delay-sec) > * Observe that no ERROR or WARN messages appear to be logged in the NM role > log -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
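The missing warning described above amounts to a post-upload sanity check: if an aggregated log file landed on HDFS with zero bytes, the quota was likely exhausted and the NM should say so. A minimal sketch of such a check (the function name and the shape of the input are illustrative, not the patch's actual API):

```python
import logging

logger = logging.getLogger("LogAggregationService")

def find_empty_aggregated_logs(path_size_pairs):
    """Flag aggregated log files that ended up empty - the symptom seen when
    the HDFS space quota on the target directory is exceeded. Returns the
    suspicious paths so callers (and tests) can inspect them."""
    suspicious = [path for path, size in path_size_pairs if size == 0]
    for path in suspicious:
        logger.warning(
            "Aggregated log %s is empty; the HDFS quota on the target "
            "directory may be exceeded", path)
    return suspicious

bad = find_empty_aggregated_logs([
    ("/tmp/logs/username/logs/app_1/node1", 0),
    ("/tmp/logs/username/logs/app_1/node2", 4096),
])
print(bad)  # ['/tmp/logs/username/logs/app_1/node1']
```

Surfacing this as a WARN in the NM role log is what the reproduction steps above show is currently missing.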
[jira] [Updated] (YARN-8173) [Router] Implement missing FederationClientInterceptor#getApplications()
[ https://issues.apache.org/jira/browse/YARN-8173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiran Wu updated YARN-8173: --- Attachment: (was: YARN-8041.004.patch) > [Router] Implement missing FederationClientInterceptor#getApplications() > > > Key: YARN-8173 > URL: https://issues.apache.org/jira/browse/YARN-8173 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Yiran Wu >Assignee: Yiran Wu >Priority: Major > Attachments: YARN-8173.001.patch, YARN-8173.002.patch, > YARN-8173.003.patch > > > Implement the Oozie-dependent methods: > {code:java} > getApplications() > getDelegationToken() > {code} > 
[jira] [Updated] (YARN-8259) Revisit liveliness checks for Docker containers
[ https://issues.apache.org/jira/browse/YARN-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Kumpf updated YARN-8259: -- Priority: Blocker (was: Major) > Revisit liveliness checks for Docker containers > --- > > Key: YARN-8259 > URL: https://issues.apache.org/jira/browse/YARN-8259 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.2, 3.2.0, 3.1.1 >Reporter: Shane Kumpf >Assignee: Shane Kumpf >Priority: Blocker > Labels: Docker > Attachments: YARN-8259.001.patch > > > As privileged containers may execute as a user that does not match the YARN > run as user, sending the null signal for liveliness checks could fail. We > need to reconsider how liveliness checks are handled in the Docker case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8259) Revisit liveliness checks for Docker containers
[ https://issues.apache.org/jira/browse/YARN-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483845#comment-16483845 ] Shane Kumpf commented on YARN-8259: --- {quote}System administrator can reserve one cpu core for node manager and all the docker inspect call are counted toward saturating one cpu core{quote} I'm less concerned about the cpu usage and more concerned about docker's client/server model and the potential for hangs (I've seen many of those in the past under load). Personally, I want the /proc route for my systems and am not using hidepid. Losing a container due to an intermittent docker issue isn't really acceptable to me when an alternative exists that avoids the issue. What I could do is implement both the /proc and {{docker inspect}} approaches, with a configuration switch to choose the implementation for systems that use hidepid (or without /proc). Would that be acceptable? I'm also going to make this a blocker, as all privileged containers are leaked on NM restart today.
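The two liveliness strategies under discussion can be sketched side by side. This is an illustrative reduction, not the NM code: the signal probe is the existing null-signal (kill -0) check that breaks for privileged containers, and the /proc probe is the alternative that needs no signalling (a {{docker inspect}} variant would instead shell out to the docker client):

```python
import os
import subprocess
import sys

def alive_via_signal(pid):
    """kill -0 probe. EPERM means the process exists but we may not signal
    it - exactly the run-as-user mismatch problem for privileged containers."""
    try:
        os.kill(pid, 0)
        return True
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but the signalling user lacks permission

def alive_via_proc(pid):
    """/proc probe: no signal needed, but unavailable under hidepid or on
    systems without procfs."""
    return os.path.exists(f"/proc/{pid}")

me = os.getpid()
print(alive_via_signal(me), alive_via_proc(me))

# A child that has exited and been reaped reports dead via the signal probe.
child = subprocess.Popen([sys.executable, "-c", ""])
child.wait()
print(alive_via_signal(child.pid))  # False
```

A config switch between these two implementations, as proposed in the comment above, lets hidepid deployments fall back to the docker-client path while everyone else avoids the daemon round trip.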
[jira] [Commented] (YARN-8273) Log aggregation does not warn if HDFS quota in target directory is exceeded
[ https://issues.apache.org/jira/browse/YARN-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483761#comment-16483761 ] genericqa commented on YARN-8273: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 35s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 3s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 24s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 20s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 48 unchanged - 0 fixed = 49 total (was 48) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 19s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 38s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}105m 26s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8273 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924489/YARN-8273.005.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle | | uname | Linux dc5dec84fe3e 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 57c2feb | | maven |
[jira] [Updated] (YARN-8320) upport CPU isolation for latency-sensitive (LS) service
[ https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8320: -- Summary: upport CPU isolation for latency-sensitive (LS) service (was: Add support CPU isolation for latency-sensitive (LS) service) > upport CPU isolation for latency-sensitive (LS) service > > > Key: YARN-8320 > URL: https://issues.apache.org/jira/browse/YARN-8320 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Jiandan Yang >Priority: Major > Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf, > YARN-8320.001.patch > > > Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and > “cpu.shares” to isolate cpu resource. However, > * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler; > no support for differentiated latency > * Request latency of services running in containers may fluctuate sharply > when all containers share cpus, and latency-sensitive services cannot tolerate > this in our production environment. > So we need finer-grained cpu isolation. > My co-workers and I propose a solution using cgroup cpuset to bind > containers to different processors; this is inspired by the isolation > technique in [Borg > system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf]. > Later I will upload a detailed design doc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8285) Remove unused environment variables from the Docker runtime
[ https://issues.apache.org/jira/browse/YARN-8285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483893#comment-16483893 ] Shane Kumpf commented on YARN-8285: --- [~ebadger], thanks for the patch! +1 lgtm, I'll commit this later today if there are no objections. Note that the patch doesn't apply cleanly, but the conflict is straightforward enough that I will address it. > Remove unused environment variables from the Docker runtime > --- > > Key: YARN-8285 > URL: https://issues.apache.org/jira/browse/YARN-8285 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Shane Kumpf >Assignee: Eric Badger >Priority: Trivial > Labels: Docker > Attachments: YARN-8285.001.patch > > > YARN-7430 enabled user remapping for Docker containers by default. As a > result, YARN_CONTAINER_RUNTIME_DOCKER_RUN_ENABLE_USER_REMAPPING is no longer > used and can be removed. > YARN_CONTAINER_RUNTIME_DOCKER_IMAGE_FILE was added in the original > implementation, but was never used and can be removed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8273) Log aggregation does not warn if HDFS quota in target directory is exceeded
[ https://issues.apache.org/jira/browse/YARN-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484005#comment-16484005 ] genericqa commented on YARN-8273: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 31s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 1s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 10s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 9s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 22s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 49s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 96m 31s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8273 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924535/YARN-8273.006.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle | | uname | Linux 6b3464e78886 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 57c2feb | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_162 | | findbugs | v3.1.0-RC1 | | Test
[jira] [Updated] (YARN-8337) Deadlock In Federation Router
[ https://issues.apache.org/jira/browse/YARN-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianchao Jia updated YARN-8337: --- Summary: Deadlock In Federation Router (was: Deadlock Federation Router) > Deadlock In Federation Router > - > > Key: YARN-8337 > URL: https://issues.apache.org/jira/browse/YARN-8337 > Project: Hadoop YARN > Issue Type: Bug > Components: federation, router >Reporter: Jianchao Jia >Priority: Major > Attachments: YARN-8337.001.patch > > > We use MySQL InnoDB as the state store engine. In the router log we found a > deadlock error like the one below: > {code:java} > [2018-05-21T15:41:40.383+08:00] [ERROR] [IPC Server handler 25 on 8050] : > Unable to insert the newly generated application > application_1526295230627_127402 > com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock > found when trying to get lock; try restarting transaction > at sun.reflect.GeneratedConstructorAccessor107.newInstance(Unknown > Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at com.mysql.jdbc.Util.handleNewInstance(Util.java:425) > at com.mysql.jdbc.Util.getInstance(Util.java:408) > at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952) > at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3973) > at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3909) > at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527) > at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2680) > at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2484) > at > com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858) > at > com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2079) > at > com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2013) > at > 
com.mysql.jdbc.PreparedStatement.executeLargeUpdate(PreparedStatement.java:5104) > at > com.mysql.jdbc.CallableStatement.executeLargeUpdate(CallableStatement.java:2418) > at > com.mysql.jdbc.CallableStatement.executeUpdate(CallableStatement.java:887) > at > com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61) > at > com.zaxxer.hikari.pool.HikariProxyCallableStatement.executeUpdate(HikariProxyCallableStatement.java) > at > org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore.addApplicationHomeSubCluster(SQLFederationStateStore.java:547) > {code} > Use "show engine innodb status;" command to find what happens > {code:java} > 2018-05-21 15:41:40 7f4685870700 > *** (1) TRANSACTION: > TRANSACTION 241131538, ACTIVE 0 sec inserting, thread declared inside InnoDB > 4999 > mysql tables in use 2, locked 2 > LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s) > MySQL thread id 7602335, OS thread handle 0x7f46858f2700, query id 2919792534 > 192.168.1.138 federation executing > INSERT INTO applicationsHomeSubCluster > (applicationId,homeSubCluster) > (SELECT applicationId_IN, homeSubCluster_IN > FROM applicationsHomeSubCluster > WHERE applicationId = applicationId_IN > HAVING COUNT(*) = 0 ) > *** (1) WAITING FOR THIS LOCK TO BE GRANTED: > RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table > `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131538 > lock_mode X locks gap before rec insert intention waiting > Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info > bits 0 > 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; > asc application_1526295230627_1274; (total 31 bytes); > 1: len 6; hex 0ba5f32d; asc -;; > 2: len 7; hex dd00280110; asc ( ;; > 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;; > *** (2) TRANSACTION: > TRANSACTION 241131539, ACTIVE 0 sec inserting, thread declared inside InnoDB > 4999 > 
mysql tables in use 2, locked 2 > 4 lock struct(s), heap size 1184, 2 row lock(s) > MySQL thread id 7600638, OS thread handle 0x7f4685870700, query id 2919792535 > 192.168.1.138 federation executing > INSERT INTO applicationsHomeSubCluster > (applicationId,homeSubCluster) > (SELECT applicationId_IN, homeSubCluster_IN > FROM applicationsHomeSubCluster > WHERE applicationId = applicationId_IN > HAVING COUNT(*) = 0 ) > *** (2) HOLDS THE LOCK(S): > RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table > `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131539 > lock mode S locks gap before rec > Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info
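[Editor's note] When InnoDB chooses a transaction as the deadlock victim, it rolls that transaction back and reports "try restarting transaction", so retrying the insert is safe and avoids silently swallowing errors the way a bare INSERT IGNORE would (the concern raised in the comment thread above). A minimal, hypothetical retry wrapper is sketched below; `DeadlockRetry` and its names are assumptions, not the actual SQLFederationStateStore code, and a runtime exception stands in for MySQL's `MySQLTransactionRollbackException`.

```java
import java.util.function.Supplier;

/** Hedged sketch: retry a state-store write when InnoDB reports a deadlock. */
public class DeadlockRetry {

    /** Stand-in for MySQL's transaction-rollback (deadlock victim) error. */
    static class DeadlockException extends RuntimeException {
        DeadlockException(String msg) { super(msg); }
    }

    // Run op, retrying up to maxAttempts times on a deadlock rollback.
    // InnoDB has already rolled the victim transaction back, so a clean
    // retry re-executes the whole statement rather than ignoring the error.
    static <T> T withRetry(Supplier<T> op, int maxAttempts) {
        DeadlockException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (DeadlockException e) {
                last = e;  // victim rolled back; loop and try again
            }
        }
        throw last;  // still deadlocking after maxAttempts: surface the error
    }

    public static void main(String[] args) {
        // Simulate an insert that deadlocks twice, then succeeds.
        final int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new DeadlockException("Deadlock found when trying to get lock");
            }
            return "inserted";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Unlike INSERT IGNORE, this keeps failures visible: if the operation never succeeds within the retry budget, the caller still sees the original error.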
[jira] [Updated] (YARN-8320) Support CPU isolation for latency-sensitive (LS) service
[ https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8320: -- Summary: Support CPU isolation for latency-sensitive (LS) service (was: upport CPU isolation for latency-sensitive (LS) service) > Support CPU isolation for latency-sensitive (LS) service > > > Key: YARN-8320 > URL: https://issues.apache.org/jira/browse/YARN-8320 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Jiandan Yang >Priority: Major > Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf, > YARN-8320.001.patch > > > Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and > “cpu.shares” to isolate cpu resource. However, > * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler; > no support for differentiated latency > * Request latency of services running in containers may fluctuate sharply > when all containers share cpus, and latency-sensitive services cannot tolerate > this in our production environment. > So we need finer-grained cpu isolation. > My co-workers and I propose a solution using cgroup cpuset to bind > containers to different processors; this is inspired by the isolation > technique in [Borg > system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf]. > Later I will upload a detailed design doc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
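[Editor's note] The cpuset proposal in YARN-8320 above amounts to writing core assignments into a container's cgroup. The sketch below is a minimal illustration under assumed cgroup v1 paths, not the design from the attached PDF; `CpusetSketch`, the contiguous-range policy, and the single memory node ("0") are all assumptions. The `cpuset.cpus` and `cpuset.mems` file names are the standard cgroup v1 cpuset interface.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hedged sketch: pin a container's cgroup to dedicated cores via cpuset. */
public class CpusetSketch {

    // Render a contiguous core allocation in cpuset list syntax,
    // e.g. cores 2..5 -> "2-5", a single core 3 -> "3".
    static String coreRange(int first, int last) {
        return first == last ? String.valueOf(first) : first + "-" + last;
    }

    // Write the cpuset interface files for a container's cgroup directory
    // (e.g. /sys/fs/cgroup/cpuset/hadoop-yarn/<container_id> on cgroup v1).
    // cpuset.mems must also be set or task attachment fails.
    static void pin(Path containerCgroup, int first, int last) throws IOException {
        Files.write(containerCgroup.resolve("cpuset.cpus"),
                coreRange(first, last).getBytes());
        Files.write(containerCgroup.resolve("cpuset.mems"), "0".getBytes());
    }

    public static void main(String[] args) {
        System.out.println(coreRange(2, 5));
    }
}
```

Because cpuset gives a container exclusive cores rather than a CFS share, neighboring containers cannot perturb its scheduling latency, which is the property the LS services above need.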
[jira] [Commented] (YARN-8206) Sending a kill does not immediately kill docker containers
[ https://issues.apache.org/jira/browse/YARN-8206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484093#comment-16484093 ] Hudson commented on YARN-8206: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14250 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14250/]) YARN-8206. Sending a kill does not immediately kill docker containers. (jlowe: rev 5f11288e41fca2e414dcbea130c7702e29d4d610) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/TestDockerContainerRuntime.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java > Sending a kill does not immediately kill docker containers > -- > > Key: YARN-8206 > URL: https://issues.apache.org/jira/browse/YARN-8206 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Labels: Docker > Fix For: 3.2.0, 3.1.1 > > Attachments: YARN-8206.001.patch, YARN-8206.002.patch, > YARN-8206.003.patch, YARN-8206.004.patch, YARN-8206.005.patch, > YARN-8206.006.patch, YARN-8206.007.patch, YARN-8206.008.patch, > YARN-8206.009.patch, YARN-8206.010.patch, YARN-8206.011.patch > > > {noformat} > if (ContainerExecutor.Signal.KILL.equals(signal) > || ContainerExecutor.Signal.TERM.equals(signal)) { > handleContainerStop(containerId, env); > {noformat} > Currently in the code, we are handling both SIGKILL and SIGTERM as equivalent > for docker containers. However, they should actually be separate. When YARN > sends a SIGKILL to a process, it means for it to die immediately and not sit > around waiting for anything. This ensures an immediate reclamation of > resources. 
Additionally, if a SIGTERM is sent before the SIGKILL, the task > might not handle the signal correctly, and will then end up as a failed task > instead of a killed task. This is especially bad for preemption. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
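[Editor's note] The distinction YARN-8206 above draws — SIGKILL must take effect immediately, while SIGTERM may allow graceful shutdown — can be sketched as a mapping to distinct docker CLI invocations. `DockerSignalMapper` is a hypothetical helper, not the committed YARN-8206 change; the `docker kill --signal` and `docker stop --time` flags are standard Docker CLI options.

```java
/** Hedged sketch: map YARN container signals to distinct docker commands. */
public class DockerSignalMapper {

    static String commandFor(String containerId, String signal) {
        switch (signal) {
            case "KILL":
                // SIGKILL: reclaim resources immediately, no grace period.
                return "docker kill --signal=KILL " + containerId;
            case "TERM":
                // SIGTERM: let the process exit cleanly; docker stop sends
                // TERM, then KILL only after the timeout elapses.
                return "docker stop --time=10 " + containerId;
            default:
                // Any other signal is forwarded to the container as-is.
                return "docker kill --signal=" + signal + " " + containerId;
        }
    }

    public static void main(String[] args) {
        System.out.println(commandFor("container_e01_0001_01_000002", "KILL"));
        System.out.println(commandFor("container_e01_0001_01_000002", "TERM"));
    }
}
```

Collapsing both signals onto the graceful path is exactly the bug described above: a preemption KILL would wait out the stop timeout, and a TERM-mishandling task would be recorded as failed rather than killed.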
[jira] [Updated] (YARN-8259) Revisit liveliness checks for Docker containers
[ https://issues.apache.org/jira/browse/YARN-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Kumpf updated YARN-8259: -- Target Version/s: 3.0.2, 3.2.0, 3.1.1 > Revisit liveliness checks for Docker containers > --- > > Key: YARN-8259 > URL: https://issues.apache.org/jira/browse/YARN-8259 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.2, 3.2.0, 3.1.1 >Reporter: Shane Kumpf >Assignee: Shane Kumpf >Priority: Blocker > Labels: Docker > Attachments: YARN-8259.001.patch > > > As privileged containers may execute as a user that does not match the YARN > run as user, sending the null signal for liveliness checks could fail. We > need to reconsider how liveliness checks are handled in the Docker case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8041) [Router] Federation: routing some missing REST invocations transparently to multiple RMs
[ https://issues.apache.org/jira/browse/YARN-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiran Wu updated YARN-8041: --- Attachment: YARN-8041.005.patch > [Router] Federation: routing some missing REST invocations transparently to > multiple RMs > > > Key: YARN-8041 > URL: https://issues.apache.org/jira/browse/YARN-8041 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, router >Reporter: Yiran Wu >Assignee: Yiran Wu >Priority: Minor > Attachments: YARN-8041.001.patch, YARN-8041.002.patch, > YARN-8041.003.patch, YARN-8041.004.patch, YARN-8041.005.patch, > YARN-8041.006.patch > > > This Jira tracks the implementation of some missing REST invocation in > FederationInterceptorREST: > * getAppStatistics > * getNodeToLabels > * getLabelsOnNode > * updateApplicationPriority > * getAppQueue > * updateAppQueue > * getAppTimeout > * getAppTimeouts > * updateApplicationTimeout > * getAppAttempts > * getAppAttempt > * getContainers > * getContainer -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4781) Support intra-queue preemption for fairness ordering policy.
[ https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484049#comment-16484049 ] Sunil Govindan commented on YARN-4781: -- Hi [~eepayne], the latest patch looks good to me. I tried to test this in a local cluster and it looks fine. However, I have not verified the case where the FairOrdering policy is used with weights. Did you get a chance to cross-check that as well? Thanks. Other than this, I am good with committing this patch. > Support intra-queue preemption for fairness ordering policy. > > > Key: YARN-4781 > URL: https://issues.apache.org/jira/browse/YARN-4781 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Wangda Tan >Assignee: Eric Payne >Priority: Major > Attachments: YARN-4781.001.patch, YARN-4781.002.patch, > YARN-4781.003.patch, YARN-4781.004.patch, YARN-4781.005.patch > > > We introduced the fairness queue policy in YARN-3319, which lets large > applications make progress without starving small applications. However, if a > large application takes the queue’s resources and its containers have > long lifespans, small applications could still wait a long time for resources > and SLAs cannot be guaranteed. > Instead of waiting for applications to release resources on their own, we need > to preempt resources in queues with the fairness policy enabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8041) [Router] Federation: routing some missing REST invocations transparently to multiple RMs
[ https://issues.apache.org/jira/browse/YARN-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiran Wu updated YARN-8041: --- Attachment: (was: YARN-8041.005.patch) > [Router] Federation: routing some missing REST invocations transparently to > multiple RMs > > > Key: YARN-8041 > URL: https://issues.apache.org/jira/browse/YARN-8041 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, router >Reporter: Yiran Wu >Assignee: Yiran Wu >Priority: Minor > Attachments: YARN-8041.001.patch, YARN-8041.002.patch, > YARN-8041.003.patch, YARN-8041.004.patch, YARN-8041.005.patch, > YARN-8041.006.patch > > > This Jira tracks the implementation of some missing REST invocation in > FederationInterceptorREST: > * getAppStatistics > * getNodeToLabels > * getLabelsOnNode > * updateApplicationPriority > * getAppQueue > * updateAppQueue > * getAppTimeout > * getAppTimeouts > * updateApplicationTimeout > * getAppAttempts > * getAppAttempt > * getContainers > * getContainer -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8041) [Router] Federation: routing some missing REST invocations transparently to multiple RMs
[ https://issues.apache.org/jira/browse/YARN-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiran Wu updated YARN-8041: --- Attachment: YARN-8041.006.patch > [Router] Federation: routing some missing REST invocations transparently to > multiple RMs > > > Key: YARN-8041 > URL: https://issues.apache.org/jira/browse/YARN-8041 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, router >Reporter: Yiran Wu >Assignee: Yiran Wu >Priority: Minor > Attachments: YARN-8041.001.patch, YARN-8041.002.patch, > YARN-8041.003.patch, YARN-8041.004.patch, YARN-8041.005.patch, > YARN-8041.006.patch > > > This Jira tracks the implementation of some missing REST invocation in > FederationInterceptorREST: > * getAppStatistics > * getNodeToLabels > * getLabelsOnNode > * updateApplicationPriority > * getAppQueue > * updateAppQueue > * getAppTimeout > * getAppTimeouts > * updateApplicationTimeout > * getAppAttempts > * getAppAttempt > * getContainers > * getContainer -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8191) Fair scheduler: queue deletion without RM restart
[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484050#comment-16484050 ] genericqa commented on YARN-8191: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 32s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 9s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 5 new + 88 unchanged - 0 fixed = 93 total (was 88) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 15s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 30s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}119m 6s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | | Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadListener; locked 75% of time Unsynchronized access at AllocationFileLoaderService.java:75% of time Unsynchronized access at AllocationFileLoaderService.java:[line 117] | | | org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.removeEmptyIncompatibleQueues(String, FSQueueType) has Boolean return type and returns explicit null At QueueManager.java:type and returns explicit null At QueueManager.java:[line 399] | | Failed junit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | | | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8191 | | JIRA Patch URL |
[jira] [Commented] (YARN-8108) RM metrics rest API throws GSSException in kerberized environment
[ https://issues.apache.org/jira/browse/YARN-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484199#comment-16484199 ] Eric Yang commented on YARN-8108:
-
[~yzhangal] This is a regression in Hadoop 3.x, hence it is marked as a blocker. Friendly reminder to PMCs to review this patch and bring this to closure.
> RM metrics rest API throws GSSException in kerberized environment
> -
>
> Key: YARN-8108
> URL: https://issues.apache.org/jira/browse/YARN-8108
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Kshitij Badani
> Assignee: Eric Yang
> Priority: Blocker
> Attachments: YARN-8108.001.patch
>
> The test tries to pull up metrics data from SHS after kinit'ing as 'test_user'.
> It throws a GSSException as follows:
> {code:java}
> b2b460b80713|RUNNING: curl --silent -k -X GET -D /hwqe/hadoopqe/artifacts/tmp-94845 --negotiate -u : http://rm_host:8088/proxy/application_1518674952153_0070/metrics/json
> 2018-02-15 07:15:48,757|INFO|MainThread|machine.py:194 - run()||GUID=fc5a3266-28f8-4eed-bae2-b2b460b80713|Exit Code: 0
> 2018-02-15 07:15:48,758|INFO|MainThread|spark.py:1757 - getMetricsJsonData()|metrics:
> Error 403 GSSException: Failure unspecified at GSS-API level (Mechanism level: Request is a replay (34))
> HTTP ERROR 403
> Problem accessing /proxy/application_1518674952153_0070/metrics/json.
> Reason:
> GSSException: Failure unspecified at GSS-API level (Mechanism level: Request is a replay (34))
> {code}
> Root cause: the proxy server on the RM cannot support a Kerberos-enabled cluster because AuthenticationFilter is applied twice in the Hadoop code (once in HttpServer2 for the RM, and a second instance from AmFilterInitializer for the proxy server).
> This will require code changes to the hadoop-yarn-server-web-proxy project.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
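The double-validation failure described above can be modeled in a few lines: a Kerberos acceptor keeps a replay cache, so a second validation of the same SPNEGO authenticator for one request is rejected as a replay. The sketch below is an illustrative model only — `ReplayCache`, `handle_request` and the filter names are invented stand-ins, not Hadoop classes:

```python
# Illustrative model (not Hadoop code): each filter in the chain
# independently validates the same "Negotiate" authenticator against
# a shared replay cache, mirroring the two stacked AuthenticationFilter
# instances (HttpServer2 + AmFilterInitializer).
class ReplayError(Exception):
    pass

class ReplayCache:
    """Models the acceptor-side replay cache: each authenticator is one-use."""
    def __init__(self):
        self._seen = set()

    def validate(self, authenticator: str) -> None:
        if authenticator in self._seen:
            # Mirrors "GSSException: ... Request is a replay (34)"
            raise ReplayError("Request is a replay (34)")
        self._seen.add(authenticator)

def handle_request(filter_chain, cache, authenticator):
    # Every filter in the chain validates the same one-use token.
    for _ in filter_chain:
        cache.validate(authenticator)
    return "200 OK"

# One AuthenticationFilter on the path: the request succeeds.
assert handle_request(["auth-filter"], ReplayCache(), "tok-1") == "200 OK"

# Two stacked filters: the second validation of the same authenticator
# is rejected; the proxy surfaces this as HTTP 403.
try:
    handle_request(["auth-filter", "am-filter"], ReplayCache(), "tok-2")
except ReplayError as e:
    print("403:", e)
```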
[jira] [Updated] (YARN-8232) RMContainer lost queue name when RM HA happens
[ https://issues.apache.org/jira/browse/YARN-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-8232:
-
Fix Version/s: 2.8.5 2.9.2 2.10.0
Thanks, [~ziqian hu]! We recently ran into the same issue on 2.8 as well, so I committed this to branch-3.0, branch-2, branch-2.9, and branch-2.8.
> RMContainer lost queue name when RM HA happens
> --
>
> Key: YARN-8232
> URL: https://issues.apache.org/jira/browse/YARN-8232
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.8.3
> Reporter: Hu Ziqian
> Assignee: Hu Ziqian
> Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 2.8.5
> Attachments: YARN-8232-branch-2.8.3.001.patch, YARN-8232.001.patch, YARN-8232.002.patch, YARN-8232.003.patch
>
> RMContainer has a member variable, queueName, that stores which queue the container belongs to. When RM HA happens and RMContainers are recovered by the scheduler from NM reports, the queue name isn't recovered and is always null.
> This situation causes problems. Here is a case in preemption: preemption uses the container's queue name to deduct preemptable resources when more than one preemption selector is in use (for example, when intra-queue preemption is enabled). The detail is in
> {code:java}
> CapacitySchedulerPreemptionUtils.deductPreemptableResourcesBasedSelectedCandidates(){code}
> If the container's queue name is null, this function throws a YarnRuntimeException when it tries to get the container's TempQueuePerPartition, and the preemption fails.
> Our patch solves this problem by setting the container's queue name when recovering containers. The patch is based on branch-2.8.3.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
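The recovery gap in YARN-8232 can be sketched abstractly: the NM report carries no queue name, so the fix is to stamp the queue name onto each recovered container before the preemption code dereferences it. All names below are invented stand-ins for the RMContainer/scheduler types, not Hadoop's actual API:

```python
# Toy model of container recovery (invented names, not Hadoop's API).
class RMContainer:
    def __init__(self, container_id, queue_name=None):
        self.container_id = container_id
        self.queue_name = queue_name

def recover_containers(nm_reports, app_queue):
    recovered = []
    for report in nm_reports:
        c = RMContainer(report["id"])  # NM report carries no queue name
        c.queue_name = app_queue       # the fix: set it during recovery
        recovered.append(c)
    return recovered

def deduct_preemptable(container):
    # Models the preemption path: a null queue name blows up when the
    # code looks up the container's TempQueuePerPartition.
    if container.queue_name is None:
        raise RuntimeError("YarnRuntimeException: unknown queue for container")
    return container.queue_name

for c in recover_containers([{"id": "c1"}, {"id": "c2"}], "root.a"):
    assert deduct_preemptable(c) == "root.a"
```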
[jira] [Commented] (YARN-8338) TimelineService V1.5 doesn't come up after HADOOP-15406
[ https://issues.apache.org/jira/browse/YARN-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484249#comment-16484249 ] Vinod Kumar Vavilapalli commented on YARN-8338:
---
Full exception trace:
{code:java}
java.lang.NoClassDefFoundError: org/objenesis/Objenesis
at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.<init>(RollingLevelDBTimelineStore.java:174)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2532)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2497)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2593)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2619)
at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.createSummaryStore(EntityGroupFSTimelineStore.java:266)
at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.serviceInit(EntityGroupFSTimelineStore.java:152)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:115)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:177)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:187)
Caused by: java.lang.ClassNotFoundException: org.objenesis.Objenesis
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 15 more
{code}
> TimelineService V1.5 doesn't come up after HADOOP-15406
> ---
>
> Key: YARN-8338
> URL: https://issues.apache.org/jira/browse/YARN-8338
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Vinod Kumar Vavilapalli
> Priority: Critical
>
> TimelineService V1.5 fails with the following:
> {code}
> java.lang.NoClassDefFoundError: org/objenesis/Objenesis
> at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.<init>(RollingLevelDBTimelineStore.java:174)
> {code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8041) [Router] Federation: routing some missing REST invocations transparently to multiple RMs
[ https://issues.apache.org/jira/browse/YARN-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiran Wu updated YARN-8041: --- Attachment: YARN-8041.007.patch > [Router] Federation: routing some missing REST invocations transparently to > multiple RMs > > > Key: YARN-8041 > URL: https://issues.apache.org/jira/browse/YARN-8041 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, router >Reporter: Yiran Wu >Assignee: Yiran Wu >Priority: Minor > Attachments: YARN-8041.001.patch, YARN-8041.002.patch, > YARN-8041.003.patch, YARN-8041.004.patch, YARN-8041.005.patch, > YARN-8041.006.patch, YARN-8041.007.patch > > > This Jira tracks the implementation of some missing REST invocation in > FederationInterceptorREST: > * getAppStatistics > * getNodeToLabels > * getLabelsOnNode > * updateApplicationPriority > * getAppQueue > * updateAppQueue > * getAppTimeout > * getAppTimeouts > * updateApplicationTimeout > * getAppAttempts > * getAppAttempt > * getContainers > * getContainer -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8337) Deadlock In Federation Router
[ https://issues.apache.org/jira/browse/YARN-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484286#comment-16484286 ] Giovanni Matteo Fumarola commented on YARN-8337:
Thanks [~jianchao jia] and [~yiran] for finding the bug and working on it. The same logic is used in *HSQLDBFederationStateStore*. Please update that test as well and check whether the fix works there too.
> Deadlock In Federation Router
> -
>
> Key: YARN-8337
> URL: https://issues.apache.org/jira/browse/YARN-8337
> Project: Hadoop YARN
> Issue Type: Bug
> Components: federation, router
> Reporter: Jianchao Jia
> Priority: Major
> Attachments: YARN-8337.001.patch
>
> We use MySQL InnoDB as the state store engine. In the Router log we found a deadlock error like the one below:
> {code:java}
> [2018-05-21T15:41:40.383+08:00] [ERROR] [IPC Server handler 25 on 8050] : Unable to insert the newly generated application application_1526295230627_127402
> com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
> at sun.reflect.GeneratedConstructorAccessor107.newInstance(Unknown Source)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
> at com.mysql.jdbc.Util.getInstance(Util.java:408)
> at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952)
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3973)
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3909)
> at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527)
> at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2680)
> at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2484)
> at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858)
> at com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2079)
> at > com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2013) > at > com.mysql.jdbc.PreparedStatement.executeLargeUpdate(PreparedStatement.java:5104) > at > com.mysql.jdbc.CallableStatement.executeLargeUpdate(CallableStatement.java:2418) > at > com.mysql.jdbc.CallableStatement.executeUpdate(CallableStatement.java:887) > at > com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61) > at > com.zaxxer.hikari.pool.HikariProxyCallableStatement.executeUpdate(HikariProxyCallableStatement.java) > at > org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore.addApplicationHomeSubCluster(SQLFederationStateStore.java:547) > {code} > Use "show engine innodb status;" command to find what happens > {code:java} > 2018-05-21 15:41:40 7f4685870700 > *** (1) TRANSACTION: > TRANSACTION 241131538, ACTIVE 0 sec inserting, thread declared inside InnoDB > 4999 > mysql tables in use 2, locked 2 > LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s) > MySQL thread id 7602335, OS thread handle 0x7f46858f2700, query id 2919792534 > 192.168.1.138 federation executing > INSERT INTO applicationsHomeSubCluster > (applicationId,homeSubCluster) > (SELECT applicationId_IN, homeSubCluster_IN > FROM applicationsHomeSubCluster > WHERE applicationId = applicationId_IN > HAVING COUNT(*) = 0 ) > *** (1) WAITING FOR THIS LOCK TO BE GRANTED: > RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table > `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131538 > lock_mode X locks gap before rec insert intention waiting > Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info > bits 0 > 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; > asc application_1526295230627_1274; (total 31 bytes); > 1: len 6; hex 0ba5f32d; asc -;; > 2: len 7; hex dd00280110; asc ( ;; > 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;; > *** (2) 
TRANSACTION: > TRANSACTION 241131539, ACTIVE 0 sec inserting, thread declared inside InnoDB > 4999 > mysql tables in use 2, locked 2 > 4 lock struct(s), heap size 1184, 2 row lock(s) > MySQL thread id 7600638, OS thread handle 0x7f4685870700, query id 2919792535 > 192.168.1.138 federation executing > INSERT INTO applicationsHomeSubCluster > (applicationId,homeSubCluster) > (SELECT applicationId_IN, homeSubCluster_IN > FROM applicationsHomeSubCluster > WHERE applicationId = applicationId_IN > HAVING COUNT(*) = 0 ) > *** (2) HOLDS THE LOCK(S): > RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table >
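The deadlock above comes from the conditional {{INSERT ... SELECT ... HAVING COUNT(*) = 0}} pattern: each transaction first scans the index, taking a gap lock on the range around the missing row, and then needs an insert-intention lock inside the gap the other transaction holds, so the two writers block each other. A plain {{INSERT IGNORE}} skips the read-before-write, and the affected-row count lets the caller tell whether the row actually went in (the concern raised earlier about errors being silently ignored). Below is a minimal sketch of the idempotent-insert pattern; note the assumption that SQLite's {{INSERT OR IGNORE}} stands in for MySQL's {{INSERT IGNORE}}, which behaves analogously for duplicate keys:

```python
import sqlite3

# Sketch of the idempotent insert for applicationsHomeSubCluster using
# SQLite's INSERT OR IGNORE (stand-in for MySQL's INSERT IGNORE).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE applicationsHomeSubCluster (
                    applicationId TEXT PRIMARY KEY,
                    homeSubCluster TEXT)""")

def add_application_home(conn, app_id, subcluster):
    cur = conn.execute(
        "INSERT OR IGNORE INTO applicationsHomeSubCluster "
        "(applicationId, homeSubCluster) VALUES (?, ?)",
        (app_id, subcluster))
    # rowcount == 1 -> row inserted; 0 -> a row already existed (the insert
    # was ignored, not an error), so the caller can detect the outcome and
    # read back the existing mapping instead of retrying blindly.
    return cur.rowcount == 1

assert add_application_home(conn, "application_1526295230627_127402",
                            "hope_cluster1")
# A second writer racing on the same applicationId is ignored, not
# deadlocked, and the existing mapping is preserved:
assert not add_application_home(conn, "application_1526295230627_127402",
                                "other_cluster")
row = conn.execute("SELECT homeSubCluster FROM applicationsHomeSubCluster "
                   "WHERE applicationId = ?",
                   ("application_1526295230627_127402",)).fetchone()
assert row[0] == "hope_cluster1"
```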
[jira] [Commented] (YARN-8041) [Router] Federation: routing some missing REST invocations transparently to multiple RMs
[ https://issues.apache.org/jira/browse/YARN-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484295#comment-16484295 ] Giovanni Matteo Fumarola commented on YARN-8041:
[~yiran] thanks for the patch. The latest patch has some problems: checkstyle, whitespace and, most importantly, a failed unit test in the Router. Please fix those.
> [Router] Federation: routing some missing REST invocations transparently to multiple RMs
>
> Key: YARN-8041
> URL: https://issues.apache.org/jira/browse/YARN-8041
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: federation, router
> Reporter: Yiran Wu
> Assignee: Yiran Wu
> Priority: Minor
> Attachments: YARN-8041.001.patch, YARN-8041.002.patch, YARN-8041.003.patch, YARN-8041.004.patch, YARN-8041.005.patch, YARN-8041.006.patch, YARN-8041.007.patch
>
> This Jira tracks the implementation of some missing REST invocations in FederationInterceptorREST:
> * getAppStatistics
> * getNodeToLabels
> * getLabelsOnNode
> * updateApplicationPriority
> * getAppQueue
> * updateAppQueue
> * getAppTimeout
> * getAppTimeouts
> * updateApplicationTimeout
> * getAppAttempts
> * getAppAttempt
> * getContainers
> * getContainer
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8041) [Router] Federation: routing some missing REST invocations transparently to multiple RMs
[ https://issues.apache.org/jira/browse/YARN-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484238#comment-16484238 ] genericqa commented on YARN-8041:
-
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 28s | Docker mode activated. |
|| || Prechecks || ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || trunk Compile Tests || ||
| 0 | mvndep | 0m 10s | Maven dependency ordering for branch |
| +1 | mvninstall | 23m 20s | trunk passed |
| +1 | compile | 2m 17s | trunk passed |
| +1 | checkstyle | 0m 50s | trunk passed |
| +1 | mvnsite | 1m 40s | trunk passed |
| +1 | shadedclient | 11m 58s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 22s | trunk passed |
| +1 | javadoc | 1m 5s | trunk passed |
|| || Patch Compile Tests || ||
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 31s | the patch passed |
| +1 | compile | 2m 13s | the patch passed |
| +1 | javac | 2m 13s | the patch passed |
| -0 | checkstyle | 0m 46s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 10 new + 18 unchanged - 0 fixed = 28 total (was 18) |
| +1 | mvnsite | 1m 30s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 10m 18s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 42s | the patch passed |
| +1 | javadoc | 1m 4s | the patch passed |
|| || Other Tests || ||
| +1 | unit | 2m 13s | hadoop-yarn-server-common in the patch passed. |
| +1 | unit | 68m 38s | hadoop-yarn-server-resourcemanager in the patch passed. |
| -1 | unit | 1m 19s | hadoop-yarn-server-router in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 135m 51s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.router.webapp.TestFederationInterceptorREST |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8041 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924556/YARN-8041.006.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 094db537de3d
[jira] [Created] (YARN-8338) TimelineService V1.5 doesn't come up after HADOOP-15406
Vinod Kumar Vavilapalli created YARN-8338:
-
Summary: TimelineService V1.5 doesn't come up after HADOOP-15406
Key: YARN-8338
URL: https://issues.apache.org/jira/browse/YARN-8338
Project: Hadoop YARN
Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli

TimelineService V1.5 fails with the following:
{code}
java.lang.NoClassDefFoundError: org/objenesis/Objenesis
at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.<init>(RollingLevelDBTimelineStore.java:174)
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
[ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484257#comment-16484257 ] Eric Payne commented on YARN-8292:
--
{quote}Actually this is required after the change.
{quote}
Yes, I see now.
{quote}TestPreemptionForQueueWithPriorities
{quote}
{{TestPreemptionForQueueWithPriorities}} passes for me in my local environment.
{quote}doPreempt = Resources.lessThan(rc, clusterResource,
    Resources.componentwiseMin(toObtainAfterPreemption, Resources.none()),
    Resources.componentwiseMin(toObtainByPartition, Resources.none()));
{quote}
I don't think we want the above code to {{componentwiseMin}} the {{toObtain}} values with 0, since that will set _all_ positive resource entities to 0.
{quote}Can we address this in a separate JIRA if we cannot come up with some simple solution?
{quote}
In my tests, the current implementation of preemption does not seem to work anyway when extensible resources are enabled, so this seems to be a larger problem. You are right that it should be its own JIRA. I give my +1 here. [~jlowe] / [~sunilg], do you have additional comments?
> Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
>
> Key: YARN-8292
> URL: https://issues.apache.org/jira/browse/YARN-8292
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Reporter: Sumana Sathish
> Assignee: Wangda Tan
> Priority: Critical
> Attachments: YARN-8292.001.patch, YARN-8292.002.patch, YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch, YARN-8292.006.patch
>
> This is an example of the problem:
> {code}
> // guaranteed, max, used, pending
> "root(=[30:18:6 30:18:6 12:12:6 1:1:1]);" + // root
> "-a(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // a
> "-b(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // b
> "-c(=[10:6:2 10:6:2 0:0:0 1:1:1])"; // c
> {code}
> There are 3 resource types. The total resource of the cluster is 30:18:6.
> For both a and b, there are 3 containers running, and each container is 2:2:1.
> Queue c uses 0 resources and has 1:1:1 pending.
> Under the existing logic, preemption cannot happen.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
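Eric Payne's objection to clamping with {{componentwiseMin}} can be checked with a toy resource vector. The snippet below is illustrative Python, not Hadoop's Resources/ResourceCalculator API:

```python
# Toy resource vectors as (memory, vcores, gpu) tuples; illustrative only.
def componentwise_min(a, b):
    """Per-component minimum of two resource vectors."""
    return tuple(min(x, y) for x, y in zip(a, b))

ZERO = (0, 0, 0)

to_obtain = (4, 2, -1)        # still need memory/vcores, already over on gpu
to_obtain_after = (2, 1, -1)  # hypothetical value after one preemption step

# Clamping with componentwise_min(v, ZERO) discards every positive entry:
assert componentwise_min(to_obtain, ZERO) == (0, 0, -1)
assert componentwise_min(to_obtain_after, ZERO) == (0, 0, -1)

# Both vectors collapse to the same clamped value, so a "lessThan" test on
# the clamped vectors can no longer see the remaining positive demand
# (4 vs 2 units of memory) -- which is the objection to the proposed check.
```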
[jira] [Commented] (YARN-8259) Revisit liveliness checks for Docker containers
[ https://issues.apache.org/jira/browse/YARN-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484271#comment-16484271 ] Eric Yang commented on YARN-8259:
-
[~shaneku...@gmail.com] The proposal to implement both is okay, but we can make better software with sensible optimization and pick a solution that works for all scenarios without adding extra administration tasks. There is no objection to the current approach. We are aware that the hidepid corner case can create extra system administration work to whitelist the NodeManager so it can access all pids. We also know that the docker inspect approach costs more resources because it forks and execs. Human labor to configure the OS with knowledge of Hadoop internals is usually more expensive than adding processors or RAM. It would be great if the solution could work without an additional configuration flag and without extra hardware resources. This means doing the pid check as a privileged user via container-executor may be the solution preferred by system administrators, since it adds no overhead to their administration chores. Can the /proc pid check work in a Docker-in-Docker environment?
> Revisit liveliness checks for Docker containers
> ---
>
> Key: YARN-8259
> URL: https://issues.apache.org/jira/browse/YARN-8259
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: 3.0.2, 3.2.0, 3.1.1
> Reporter: Shane Kumpf
> Assignee: Shane Kumpf
> Priority: Blocker
> Labels: Docker
> Attachments: YARN-8259.001.patch
>
> As privileged containers may execute as a user that does not match the YARN run-as user, sending the null signal for liveliness checks could fail. We need to reconsider how liveliness checks are handled in the Docker case.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
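The "null signal" liveliness check under discussion is kill(pid, 0): no signal is delivered, the kernel only checks existence and permission. When the container process runs as a different user — the privileged-container case — an unprivileged checker gets EPERM even though the process is alive, which is why the check either has to run privileged (e.g. via container-executor) or treat EPERM as "alive". A POSIX-only illustrative sketch:

```python
import os

def is_alive(pid: int) -> bool:
    """Signal-0 liveliness probe; treats EPERM as 'alive but not ours'."""
    try:
        os.kill(pid, 0)          # signal 0: nothing is sent, just a probe
        return True
    except PermissionError:
        # Process exists but belongs to another user -- exactly the
        # privileged-container case; a naive checker that treated any
        # failure as "dead" would wrongly report a live container dead.
        return True
    except ProcessLookupError:
        return False

assert is_alive(os.getpid())     # our own process is certainly alive
```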
[jira] [Commented] (YARN-8310) Handle old NMTokenIdentifier, AMRMTokenIdentifier, and ContainerTokenIdentifier formats
[ https://issues.apache.org/jira/browse/YARN-8310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486511#comment-16486511 ] genericqa commented on YARN-8310:
-
| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 30s | Docker mode activated. |
|| || Prechecks || ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || trunk Compile Tests || ||
| 0 | mvndep | 0m 21s | Maven dependency ordering for branch |
| +1 | mvninstall | 28m 35s | trunk passed |
| +1 | compile | 30m 19s | trunk passed |
| +1 | checkstyle | 3m 25s | trunk passed |
| +1 | mvnsite | 2m 6s | trunk passed |
| +1 | shadedclient | 17m 4s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 22s | trunk passed |
| +1 | javadoc | 1m 44s | trunk passed |
|| || Patch Compile Tests || ||
| 0 | mvndep | 0m 18s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 30s | the patch passed |
| +1 | compile | 29m 58s | the patch passed |
| +1 | javac | 29m 58s | the patch passed |
| +1 | checkstyle | 3m 13s | the patch passed |
| +1 | mvnsite | 2m 4s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 38s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 41s | the patch passed |
| +1 | javadoc | 1m 49s | the patch passed |
|| || Other Tests || ||
| +1 | unit | 9m 47s | hadoop-common in the patch passed. |
| +1 | unit | 3m 27s | hadoop-yarn-common in the patch passed. |
| +1 | asflicense | 0m 56s | The patch does not generate ASF License warnings. |
| | | 152m 3s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8310 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924026/YARN-8310.003.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux c552e7cde900 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 43be9ab |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20828/testReport/ |
| Max. process+thread count | 1348 (vs. ulimit of 1) |
| modules | C:
[jira] [Commented] (YARN-8310) Handle old NMTokenIdentifier, AMRMTokenIdentifier, and ContainerTokenIdentifier formats
[ https://issues.apache.org/jira/browse/YARN-8310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486637#comment-16486637 ] Miklos Szegedi commented on YARN-8310: -- I will backport this to branches branch-2, branch-3.0 and branch 3.1 > Handle old NMTokenIdentifier, AMRMTokenIdentifier, and > ContainerTokenIdentifier formats > --- > > Key: YARN-8310 > URL: https://issues.apache.org/jira/browse/YARN-8310 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Robert Kanter >Assignee: Robert Kanter >Priority: Major > Attachments: YARN-8310.001.patch, YARN-8310.002.patch, > YARN-8310.003.patch, YARN-8310.branch-2.001.patch, > YARN-8310.branch-2.002.patch, YARN-8310.branch-2.003.patch > > > In some recent upgrade testing, we saw this error causing the NodeManager to > fail to startup afterwards: > {noformat} > org.apache.hadoop.service.ServiceStateException: > com.google.protobuf.InvalidProtocolBufferException: Protocol message > contained an invalid tag (zero). > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:173) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:441) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:834) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:895) > Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol > message contained an invalid tag (zero). 
> at > com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89) > at > com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.(YarnSecurityTokenProtos.java:1860) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.(YarnSecurityTokenProtos.java:1824) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:2016) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:2011) > at > com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.parseFrom(YarnSecurityTokenProtos.java:2686) > at > org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:254) > at > org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:177) > at > org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:322) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:455) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:373) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:316) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > ... 
5 more > {noformat} > The NodeManager fails because it's trying to read a > {{ContainerTokenIdentifier}} in the "old" format before we changed them to > protobufs (YARN-668). This is very similar to YARN-5594 where we ran into a > similar problem with the ResourceManager and RM Delegation Tokens. > To provide a better experience, we should make the code able to read the old > format if it's unable to read it using the new format. We didn't run into > any errors with the other two types of tokens that YARN-668 incompatibly > changed (NMTokenIdentifier and AMRMTokenIdentifier), but we may as well fix > those while we're at it.
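The fallback described above (read the old pre-protobuf layout when the new parse fails) can be sketched generically. This is a hypothetical stand-in, not the actual YARN-8310 patch: the real code parses a protobuf `ContainerTokenIdentifierProto` and falls back to the legacy Writable fields, while this sketch uses a made-up magic byte and `writeUTF` to show the same try-new-then-rewind shape.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class TokenDecoder {
    // Hypothetical marker for the "new" layout; real YARN uses protobuf framing.
    static final byte NEW_FORMAT_MAGIC = 0x01;

    // "New" format: magic byte followed by the identifier string.
    public static byte[] encodeNew(String id) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeByte(NEW_FORMAT_MAGIC);
            out.writeUTF(id);
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "Old" format: the bare identifier string, no marker.
    public static byte[] encodeOld(String id) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeUTF(id);
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Try the new layout first; if that parse fails, rewind and read the
    // old layout instead of letting the service fail to start.
    public static String decode(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            if (in.readByte() != NEW_FORMAT_MAGIC) {
                throw new IOException("not the new format");
            }
            return in.readUTF();
        } catch (IOException newFormatError) {
            try {
                DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
                return in.readUTF();
            } catch (IOException oldFormatError) {
                throw new UncheckedIOException(oldFormatError);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(encodeNew("container_e01_0001")));
        System.out.println(decode(encodeOld("container_e01_0002")));
    }
}
```

Note that this only works when the two layouts are distinguishable from the first bytes; in the real patch the protobuf parser throwing `InvalidProtocolBufferException` is what signals the old format.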
[jira] [Commented] (YARN-7998) RM crashes with NPE during recovering if ACL configuration was changed
[ https://issues.apache.org/jira/browse/YARN-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486638#comment-16486638 ] Wilfred Spiegelenburg commented on YARN-7998: - I don't think we should fail the restore of a running application at all when the ACL was changed. Logging the failure is good, but just killing the application is not the right thing to do. We should either not start up at all and tell the end user to fix the configuration, or allow the application to be restored and finish. An ACL change made on a running RM also does not trigger a review of running applications: we do not kill any running application that is no longer allowed by the ACL when it gets changed, and restore should not behave any differently. Based on the details in YARN-7913 I think we need to close this as a duplicate and come up with a general fix that handles all these cases, rather than one-off changes that fix a specific corner case. > RM crashes with NPE during recovering if ACL configuration was changed > -- > > Key: YARN-7998 > URL: https://issues.apache.org/jira/browse/YARN-7998 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 3.0.0 >Reporter: Oleksandr Shevchenko >Assignee: Oleksandr Shevchenko >Priority: Major > Attachments: YARN-7998.000.patch, YARN-7998.001.patch, > YARN-7998.002.patch, YARN-7998.003.patch > > > RM crashes with an NPE during failover because ACL configurations were changed; > as a result, we no longer have the rights to submit an application to a queue. > Scenario: > # Submit an application. > # Change the ACL configuration for the queue that accepted the application so that > the owner of the application no longer has the rights to submit this > application. > # Restart the RM.
> As a result, we get NPE: > 2018-02-27 18:14:00,968 INFO org.apache.hadoop.service.AbstractService: > Service ResourceManager failed in state STARTED; cause: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplicationAttempt(FairScheduler.java:738) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1286) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:116) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AttemptRecoveredTransition.transition(RMAppAttemptImpl.java:1098) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AttemptRecoveredTransition.transition(RMAppAttemptImpl.java:1044) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385
[jira] [Commented] (YARN-8338) TimelineService V1.5 doesn't come up after HADOOP-15406
[ https://issues.apache.org/jira/browse/YARN-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486678#comment-16486678 ] Vinod Kumar Vavilapalli commented on YARN-8338: --- [~jlowe], never mind, removing the exclusion failed in compilation itself. We will have to declare a version. {code} [INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce (depcheck) @ hadoop-aws --- [WARNING] Dependency convergence error for org.objenesis:objenesis:2.1 paths to dependency are: +-org.apache.hadoop:hadoop-aws:3.1.1-SNAPSHOT +-com.amazonaws:DynamoDBLocal:1.11.86 +-org.mockito:mockito-core:1.10.19 +-org.objenesis:objenesis:2.1 and +-org.apache.hadoop:hadoop-aws:3.1.1-SNAPSHOT +-org.apache.hadoop:hadoop-yarn-server-tests:3.1.1-SNAPSHOT +-org.apache.hadoop:hadoop-yarn-server-resourcemanager:3.1.1-SNAPSHOT +-org.apache.hadoop:hadoop-yarn-server-applicationhistoryservice:3.1.1-SNAPSHOT +-de.ruedigermoeller:fst:2.50 +-org.objenesis:objenesis:2.5.1 [WARNING] Rule 0: org.apache.maven.plugins.enforcer.DependencyConvergence failed with message: Failed while enforcing releasability. See above detailed error message. {code} > TimelineService V1.5 doesn't come up after HADOOP-15406 > --- > > Key: YARN-8338 > URL: https://issues.apache.org/jira/browse/YARN-8338 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Attachments: YARN-8338.txt > > > TimelineService V1.5 fails with the following: > {code} > java.lang.NoClassDefFoundError: org/objenesis/Objenesis > at > org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.(RollingLevelDBTimelineStore.java:174) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
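The enforcer output above shows two transitive versions of org.objenesis:objenesis (2.1 via mockito-core, 2.5.1 via fst). One standard way to "declare a version", as Vinod suggests, is to pin the artifact in the pom's dependencyManagement. This is a hypothetical sketch, not necessarily the shape of the committed YARN-8338 patch; 2.5.1 is simply the newer of the two versions in the error:

```xml
<!-- Hypothetical: pin a single objenesis version so the enforcer's
     DependencyConvergence rule sees one version on all paths. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.objenesis</groupId>
      <artifactId>objenesis</artifactId>
      <version>2.5.1</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```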
[jira] [Commented] (YARN-7899) [AMRMProxy] Stateful FederationInterceptor for pending requests
[ https://issues.apache.org/jira/browse/YARN-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486508#comment-16486508 ] genericqa commented on YARN-7899: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 44s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 26s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 20s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 13 new + 16 unchanged - 0 fixed = 29 total (was 16) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 19s{color} | {color:green} hadoop-yarn-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 21s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 37s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}114m 50s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-7899 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924637/YARN-7899.v1.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux d8722a246fd9 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 43be9ab | |
[jira] [Updated] (YARN-8336) Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils
[ https://issues.apache.org/jira/browse/YARN-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8336: --- Attachment: YARN-8336.v2.patch > Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils > - > > Key: YARN-8336 > URL: https://issues.apache.org/jira/browse/YARN-8336 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8336.v1.patch, YARN-8336.v2.patch > > > A missing ClientResponse.close() or Client.destroy() call can lead to a connection leak.
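The leak in YARN-8336 comes from not releasing the response and client on every path. Here is a minimal runnable sketch of the fix's shape; FakeClient and FakeResponse are illustrative stand-ins for Jersey's Client and ClientResponse (not the real API), and the point is the try/finally structure:

```java
// Sketch of the leak-safe pattern: release in finally, so an exception
// while reading the entity cannot leak the underlying connection.
public class LeakSafeCall {
    static class FakeResponse {
        boolean closed = false;
        String entity() { return "{\"status\":\"ok\"}"; }
        void close() { closed = true; }
    }

    static class FakeClient {
        boolean destroyed = false;
        FakeResponse call() { return new FakeResponse(); }
        void destroy() { destroyed = true; }
    }

    // Always close the response and destroy the client, even if reading
    // the entity throws.
    public static String fetch(FakeClient client, FakeResponse[] observed) {
        FakeResponse response = null;
        try {
            response = client.call();
            observed[0] = response; // let the caller inspect the response
            return response.entity();
        } finally {
            if (response != null) {
                response.close();
            }
            client.destroy();
        }
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient();
        FakeResponse[] observed = new FakeResponse[1];
        System.out.println(fetch(client, observed));
        System.out.println("closed=" + observed[0].closed
            + " destroyed=" + client.destroyed);
    }
}
```

With the real Jersey types, the same shape applies: null-check the ClientResponse, close it, then destroy the Client in the finally block.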
[jira] [Comment Edited] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored
[ https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486530#comment-16486530 ] Eric Yang edited comment on YARN-8342 at 5/23/18 12:58 AM: --- The current behavior is documented in [YARN-7516|https://issues.apache.org/jira/browse/YARN-7516?focusedCommentId=16353125=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16353125]. A non-trusted image is not allowed to supply a launch command to the container for the [reason|https://issues.apache.org/jira/browse/YARN-7516?focusedCommentId=16347441=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16347441] stated by Shane. We don't allow mounting of host disks into an untrusted image, to prevent the image from putting unauthorized files that cannot be erased into the localizer directory. When using an untrusted image in yarn mode, this generates a launch_container.sh that runs an empty bash command and exits immediately, according to Shane. The end result is somewhat unexpected, even though it minimizes the security risks. The solution is to set YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=true in yarn-env.sh, which turns the cluster into docker mode by default. No launch_container.sh is required in docker mode, and we might be able to lift the restriction that drops the launch command. was (Author: eyang): The current behavior is documented in [YARN-7516|https://issues.apache.org/jira/browse/YARN-7516?focusedCommentId=16353125=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16353125]. A non-trusted image is not allowed to supply a launch command to the container for the [reason|https://issues.apache.org/jira/browse/YARN-7516?focusedCommentId=16353125=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16353125] stated by Shane.
We don't allow mounting of host disks into an untrusted image, to prevent the image from putting unauthorized files that cannot be erased into the localizer directory. When using an untrusted image in yarn mode, this generates a launch_container.sh that runs an empty bash command and exits immediately, according to Shane. The end result is somewhat unexpected, even though it minimizes the security risks. The solution is to set YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=true in yarn-env.sh, which turns the cluster into docker mode by default. No launch_container.sh is required in docker mode, and we might be able to lift the restriction that drops the launch command. > Using docker image from a non-privileged registry, the launch_command is not > honored > > > Key: YARN-8342 > URL: https://issues.apache.org/jira/browse/YARN-8342 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Priority: Critical > Labels: Docker > > During testing of the Docker feature, I found that if a container comes from a > non-privileged docker registry, the specified launch command will be ignored. > The container will succeed without any log, which is very confusing to end users, > and this behavior is inconsistent with containers from privileged docker > registries. > cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]
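The workaround Eric describes amounts to a single line in yarn-env.sh. The variable name is taken from the comment above; whether defaulting the cluster to docker mode is appropriate depends on your security posture, so verify against your Hadoop version's documentation:

```shell
# In yarn-env.sh: disable the docker run override so containers run in
# "docker mode" by default (per the comment above).
export YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=true
```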
[jira] [Updated] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix unit tests on Windows
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8344: --- Summary: Missing nm.close() in TestNodeManagerResync to fix unit tests on Windows (was: Missing nm.close() in TestNodeManagerResync) > Missing nm.close() in TestNodeManagerResync to fix unit tests on Windows > > > Key: YARN-8344 > URL: https://issues.apache.org/jira/browse/YARN-8344 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8344.v1.patch > >
[jira] [Commented] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix unit tests on Windows
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486577#comment-16486577 ] Giovanni Matteo Fumarola commented on YARN-8344: The missing nm.close() causes other unit tests to fail, and the @SuppressWarnings("unchecked") annotations are no longer valid. > Missing nm.close() in TestNodeManagerResync to fix unit tests on Windows > > > Key: YARN-8344 > URL: https://issues.apache.org/jira/browse/YARN-8344 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8344.v1.patch > >
[jira] [Commented] (YARN-8336) Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils
[ https://issues.apache.org/jira/browse/YARN-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486607#comment-16486607 ] genericqa commented on YARN-8336: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 15s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 28s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 20s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 19s{color} | {color:green} hadoop-yarn-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 28m 21s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 42s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}120m 5s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8336 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924656/YARN-8336.v2.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 5d892f15fcfe 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 68c7fd8 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_162 | | findbugs | v3.1.0-RC1 | | Test Results |
[jira] [Commented] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
[ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486618#comment-16486618 ] genericqa commented on YARN-8292: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 25s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 9s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 11s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 7 new + 98 unchanged - 0 fixed = 105 total (was 98) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 9s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 27s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 57s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}159m 52s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy | | | hadoop.yarn.server.resourcemanager.monitor.capacity.TestPreemptionForQueueWithPriorities | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8292 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924650/YARN-8292.007.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 86a2cc16e4d0 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64
[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored
[ https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486530#comment-16486530 ] Eric Yang commented on YARN-8342: - The current behavior is documented in [YARN-7516|https://issues.apache.org/jira/browse/YARN-7516?focusedCommentId=16353125=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16353125]. A non-trusted image is not allowed to supply a launch command to the container for the [reason|https://issues.apache.org/jira/browse/YARN-7516?focusedCommentId=16353125=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16353125] stated by Shane. We don't allow mounting of host disks into an untrusted image, to prevent the image from putting unauthorized files that cannot be erased into the localizer directory. When using an untrusted image in yarn mode, this generates a launch_container.sh that runs an empty bash command and exits immediately, according to Shane. The end result is somewhat unexpected, even though it minimizes the security risks. The solution is to set YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=true in yarn-env.sh, which turns the cluster into docker mode by default. No launch_container.sh is required in docker mode, and we might be able to lift the restriction that drops the launch command. > Using docker image from a non-privileged registry, the launch_command is not > honored > > > Key: YARN-8342 > URL: https://issues.apache.org/jira/browse/YARN-8342 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Priority: Critical > Labels: Docker > > During testing of the Docker feature, I found that if a container comes from a > non-privileged docker registry, the specified launch command will be ignored. > The container will succeed without any log, which is very confusing to end users, > and this behavior is inconsistent with containers from privileged docker > registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]
[jira] [Updated] (YARN-8344) Missing nm.close() in TestNodeManagerResync
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8344: --- Attachment: YARN-8344.v1.patch > Missing nm.close() in TestNodeManagerResync > --- > > Key: YARN-8344 > URL: https://issues.apache.org/jira/browse/YARN-8344 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8344.v1.patch > >
[jira] [Created] (YARN-8344) Missing nm.close() in TestNodeManagerResync
Giovanni Matteo Fumarola created YARN-8344: -- Summary: Missing nm.close() in TestNodeManagerResync Key: YARN-8344 URL: https://issues.apache.org/jira/browse/YARN-8344 Project: Hadoop YARN Issue Type: Bug Reporter: Giovanni Matteo Fumarola Assignee: Giovanni Matteo Fumarola
[jira] [Commented] (YARN-8310) Handle old NMTokenIdentifier, AMRMTokenIdentifier, and ContainerTokenIdentifier formats
[ https://issues.apache.org/jira/browse/YARN-8310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486644#comment-16486644 ] Hudson commented on YARN-8310: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14258 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14258/]) YARN-8310. Handle old NMTokenIdentifier, AMRMTokenIdentifier, and (miklos.szegedi: rev 3e5f7ea986600e084fcac723b0423e7de1b3bb8a) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/ContainerTokenIdentifier.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/security/TestYARNTokenIdentifier.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/AMRMTokenIdentifier.java * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/IOUtils.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/NMTokenIdentifier.java > Handle old NMTokenIdentifier, AMRMTokenIdentifier, and > ContainerTokenIdentifier formats > --- > > Key: YARN-8310 > URL: https://issues.apache.org/jira/browse/YARN-8310 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Robert Kanter >Assignee: Robert Kanter >Priority: Major > Attachments: YARN-8310.001.patch, YARN-8310.002.patch, > YARN-8310.003.patch, YARN-8310.branch-2.001.patch, > YARN-8310.branch-2.002.patch, YARN-8310.branch-2.003.patch > > > In some recent upgrade testing, we saw this error causing the NodeManager to > fail to startup afterwards: > {noformat} > org.apache.hadoop.service.ServiceStateException: > com.google.protobuf.InvalidProtocolBufferException: Protocol message > contained an invalid tag (zero). 
> at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:173) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:441) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:834) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:895) > Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol > message contained an invalid tag (zero). > at > com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89) > at > com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.(YarnSecurityTokenProtos.java:1860) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.(YarnSecurityTokenProtos.java:1824) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:2016) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:2011) > at > com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.parseFrom(YarnSecurityTokenProtos.java:2686) > at > 
org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:254) > at > org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:177) > at > org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:322) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:455) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:373) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:316) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > ... 5 more > {noformat} > The NodeManager fails because it's trying
[jira] [Commented] (YARN-7998) RM crashes with NPE during recovering if ACL configuration was changed
[ https://issues.apache.org/jira/browse/YARN-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486643#comment-16486643 ] genericqa commented on YARN-7998: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} YARN-7998 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-7998 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12913213/YARN-7998.003.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/20835/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > RM crashes with NPE during recovering if ACL configuration was changed > -- > > Key: YARN-7998 > URL: https://issues.apache.org/jira/browse/YARN-7998 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 3.0.0 >Reporter: Oleksandr Shevchenko >Assignee: Oleksandr Shevchenko >Priority: Major > Attachments: YARN-7998.000.patch, YARN-7998.001.patch, > YARN-7998.002.patch, YARN-7998.003.patch > > > The RM crashes with an NPE during failover because ACL configurations were changed; > as a result, we no longer have the rights to submit an application to a queue. > Scenario: > # Submit an application > # Change the ACL configuration for the queue that accepted the application so that > the owner of the application no longer has the rights to submit it. > # Restart the RM. 
> As a result, we get NPE: > 2018-02-27 18:14:00,968 INFO org.apache.hadoop.service.AbstractService: > Service ResourceManager failed in state STARTED; cause: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplicationAttempt(FairScheduler.java:738) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1286) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:116) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AttemptRecoveredTransition.transition(RMAppAttemptImpl.java:1098) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AttemptRecoveredTransition.transition(RMAppAttemptImpl.java:1044) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
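The crash above can be avoided with a defensive check: during recovery the application may already have been rejected (for example because queue ACLs changed), so the scheduler's lookup can return null and must be handled rather than dereferenced. A hedged sketch with illustrative names, not the actual FairScheduler code:

```java
import java.util.HashMap;
import java.util.Map;

public class RecoverySketch {
    private final Map<String, String> applications = new HashMap<>();

    public void addApplication(String appId, String queue) {
        applications.put(appId, queue);
    }

    /** Returns false instead of throwing an NPE when the app is unknown. */
    public boolean addApplicationAttempt(String appId) {
        String queue = applications.get(appId);
        if (queue == null) {
            // The app was rejected at submission time; skip the attempt
            // instead of crashing the ResourceManager during recovery.
            return false;
        }
        return true;
    }
}
```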
[jira] [Commented] (YARN-7340) Missing the time stamp in exception message in Class NoOverCommitPolicy
[ https://issues.apache.org/jira/browse/YARN-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486713#comment-16486713 ] Dinesh Chitlangia commented on YARN-7340: - [~yufeigu] - I have updated the log message to reflect the startTime. Kindly review the patch. I have not updated any unit tests for this. Thank you. > Missing the time stamp in exception message in Class NoOverCommitPolicy > --- > > Key: YARN-7340 > URL: https://issues.apache.org/jira/browse/YARN-7340 > Project: Hadoop YARN > Issue Type: Bug > Components: reservation system >Affects Versions: 3.1.0 >Reporter: Yufei Gu >Assignee: Dinesh Chitlangia >Priority: Minor > Labels: newbie++ > Attachments: YARN-7340.001.patch > > > It could be easily figured out by reading code. > {code} > throw new ResourceOverCommitException( > "Resources at time " + " would be overcommitted by " > + "accepting reservation: " + reservation.getReservationId()); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
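A hedged sketch of the fix: interpolate the reservation's actual start time into the message instead of the empty concatenation quoted above. The helper name and example values are illustrative, not the NoOverCommitPolicy code itself:

```java
public class OverCommitMessage {
    // The original concatenation dropped the timestamp:
    //   "Resources at time " + " would be overcommitted by ..."
    // The fix includes the actual time between the two literals.
    static String build(long startTime, String reservationId) {
        return "Resources at time " + startTime
            + " would be overcommitted by accepting reservation: "
            + reservationId;
    }
}
```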
[jira] [Comment Edited] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6
[ https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484721#comment-16484721 ] Hsin-Liang Huang edited comment on YARN-8326 at 5/23/18 12:18 AM: -- [~eyang] this afternoon, I tried the command and the performance was dramatically improved. It used to run 8 seconds, now it ran 3 seconds consistently, then I compared with the other 3.0 cluster which I didn't make the properties changes that you suggested, and it still ran 8 seconds consistently. I am going to run our testcases to see if the performance is also improved there. was (Author: hlhu...@us.ibm.com): [~eyang] this afternoon, I tried the command and the performance was dramatically improved. It used to run 8 seconds, now it ran 3 seconds consistently, then I compared with the other HDP 3.0 cluster which I didn't make the properties changes that you suggested, and it still ran 8 seconds consistently. I am going to run our testcases to see if the performance is also improved there. > Yarn 3.0 seems runs slower than Yarn 2.6 > > > Key: YARN-8326 > URL: https://issues.apache.org/jira/browse/YARN-8326 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0 > Environment: This is the yarn-site.xml for 3.0. 
> {code}
> hadoop.registry.dns.bind-port = 5353
> hadoop.registry.dns.domain-name = hwx.site
> hadoop.registry.dns.enabled = true
> hadoop.registry.dns.zone-mask = 255.255.255.0
> hadoop.registry.dns.zone-subnet = 172.17.0.0
> manage.include.files = false
> yarn.acl.enable = false
> yarn.admin.acl = yarn
> yarn.client.nodemanager-connect.max-wait-ms = 6
> yarn.client.nodemanager-connect.retry-interval-ms = 1
> yarn.http.policy = HTTP_ONLY
> yarn.log-aggregation-enable = false
> yarn.log-aggregation.retain-seconds = 2592000
> yarn.log.server.url = [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs]
> yarn.log.server.web-service.url = [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory]
> yarn.node-labels.enabled = false
> yarn.node-labels.fs-store.retry-policy-spec = 2000, 500
> yarn.node-labels.fs-store.root-dir = /system/yarn/node-labels
> yarn.nodemanager.address = 0.0.0.0:45454
> yarn.nodemanager.admin-env = MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
> yarn.nodemanager.aux-services = mapreduce_shuffle,spark2_shuffle,timeline_collector
> yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
> yarn.nodemanager.aux-services.spark2_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.spark2_shuffle.classpath = /usr/spark2/aux/*
> yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.timeline_collector.class = org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
> yarn.nodemanager.bind-host = 0.0.0.0
> yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
> yarn.nodemanager.container-metrics.unregister-delay-ms = 6
> yarn.nodemanager.container-monitor.interval-ms = 3000
> yarn.nodemanager.delete.debug-delay-sec = 0
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage = 90
> yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb = 1000
> yarn.nodemanager.disk-health-checker.min-healthy-disks = 0.25
> yarn.nodemanager.health-checker.interval-ms = 135000
> yarn.nodemanager.health-checker.script.timeout-ms = 6
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage = false
> yarn.nodemanager.linux-container-executor.group = hadoop
> yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users = false
> yarn.nodemanager.local-dirs = /hadoop/yarn/local
> yarn.nodemanager.log-aggregation.compression-type = gz
> yarn.nodemanager.log-aggregation.debug-enabled = false
> yarn.nodemanager.log-aggregation.num-log-files-per-app = 30
> yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds = 3600
> yarn.nodemanager.log-dirs = /hadoop/yarn/log
> yarn.nodemanager.log.retain-seconds = 604800
> yarn.nodemanager.pmem-check-enabled = false
> yarn.nodemanager.recovery.dir =
> {code}
[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored
[ https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486545#comment-16486545 ] Vinod Kumar Vavilapalli commented on YARN-8342: --- Looks like the name {{docker.privileged-containers.registries}} is very misleading. It doesn't apply only to Docker privileged containers, right? If so, we should fix this name. bq. When using an untrusted image with yarn mode, this will generate a launch_container.sh that runs an empty bash command and exits immediately, according to Shane. Why not simply take the launch command given by the user and let it fail, instead of silently replacing it with an empty bash command? > Using docker image from a non-privileged registry, the launch_command is not > honored > > > Key: YARN-8342 > URL: https://issues.apache.org/jira/browse/YARN-8342 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Priority: Critical > Labels: Docker > > During testing of the Docker feature, I found that if a container comes from a > non-privileged docker registry, the specified launch command will be ignored. > The container will succeed without any logs, which is very confusing to end users. > And this behavior is inconsistent with containers from privileged docker > registries. > cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
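For context, the registry list being debated above is configured in container-executor.cfg. A minimal fragment, with illustrative registry names, might look like:

```ini
[docker]
  module.enabled=true
  docker.privileged-containers.registries=local,library
```

Images pulled from registries outside this list are treated as untrusted, which is why the name is misleading: the property gates far more than privileged containers.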
[jira] [Commented] (YARN-8334) Fix potential connection leak in GPGUtils
[ https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486590#comment-16486590 ] genericqa commented on YARN-8334: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 25m 56s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} YARN-7402 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 10s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 34s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 37s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} YARN-7402 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 58s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 33s{color} | {color:green} hadoop-yarn-server-globalpolicygenerator in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 83m 29s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-8334 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924657/YARN-8334-YARN-7402.v2.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 16db2c1ede7d 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 19:09:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | YARN-7402 / f9c69ca | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20833/testReport/ | | Max. process+thread count | 304 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-globalpolicygenerator U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-globalpolicygenerator | | Console output |
[jira] [Updated] (YARN-4599) Set OOM control for memory cgroups
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-4599: - Attachment: YARN-4599.016.patch > Set OOM control for memory cgroups > -- > > Key: YARN-4599 > URL: https://issues.apache.org/jira/browse/YARN-4599 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.9.0 >Reporter: Karthik Kambatla >Assignee: Miklos Szegedi >Priority: Major > Labels: oct16-medium > Attachments: Elastic Memory Control in YARN.pdf, YARN-4599.000.patch, > YARN-4599.001.patch, YARN-4599.002.patch, YARN-4599.003.patch, > YARN-4599.004.patch, YARN-4599.005.patch, YARN-4599.006.patch, > YARN-4599.007.patch, YARN-4599.008.patch, YARN-4599.009.patch, > YARN-4599.010.patch, YARN-4599.011.patch, YARN-4599.012.patch, > YARN-4599.013.patch, YARN-4599.014.patch, YARN-4599.015.patch, > YARN-4599.016.patch, YARN-4599.sandflee.patch, yarn-4599-not-so-useful.patch > > > YARN-1856 adds memory cgroups enforcing support. We should also explicitly > set OOM control so that containers are not killed as soon as they go over > their usage. Today, one could set the swappiness to control this, but > clusters with swap turned off exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
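For background on what "set OOM control" means here: on cgroup v1, writing 1 to memory.oom_control disables the kernel OOM killer for that cgroup, so over-limit tasks are frozen instead of killed and YARN can decide which container to preempt. A minimal sketch; the cgroup directory path is a hypothetical example, not the NodeManager's actual path-resolution logic:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class OomControlSketch {
    // Illustrative only: sets oom_kill_disable for a container's memory
    // cgroup by writing "1" to its memory.oom_control file (cgroup v1).
    public static void disableOomKiller(Path containerCgroupDir)
            throws IOException {
        Files.write(containerCgroupDir.resolve("memory.oom_control"),
            "1".getBytes());
    }
}
```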
[jira] [Updated] (YARN-8334) Fix potential connection leak in GPGUtils
[ https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8334: --- Attachment: YARN-8334-YARN-7402.v2.patch > Fix potential connection leak in GPGUtils > - > > Key: YARN-8334 > URL: https://issues.apache.org/jira/browse/YARN-8334 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Minor > Attachments: YARN-8334-YARN-7402.v1.patch, > YARN-8334-YARN-7402.v2.patch > > > Missing ClientResponse.close and Client.destroy can lead to a connection leak. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
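The leak described here has the usual close-in-finally fix. A sketch of the pattern, with stub classes standing in for the Jersey 1.x Client and ClientResponse that GPGUtils uses (these stubs are not the real API):

```java
public class CloseInFinallyPattern {

    static class StubResponse {
        boolean closed;
        String getEntity() { return "{}"; }
        void close() { closed = true; }   // releases the pooled connection
    }

    static class StubClient {
        boolean destroyed;
        StubResponse get() { return new StubResponse(); }
        void destroy() { destroyed = true; } // shuts down the client
    }

    public static String fetch(StubClient client) {
        StubResponse response = null;
        try {
            response = client.get();
            return response.getEntity();
        } finally {
            // Without these two calls the HTTP connection leaks even on the
            // success path, which is the bug this issue fixes.
            if (response != null) {
                response.close();
            }
            client.destroy();
        }
    }
}
```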
[jira] [Commented] (YARN-8310) Handle old NMTokenIdentifier, AMRMTokenIdentifier, and ContainerTokenIdentifier formats
[ https://issues.apache.org/jira/browse/YARN-8310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486628#comment-16486628 ] Miklos Szegedi commented on YARN-8310: -- Committed to trunk. Thank you for the patch [~rkanter] and for the review [~grepas]. > Handle old NMTokenIdentifier, AMRMTokenIdentifier, and > ContainerTokenIdentifier formats > --- > > Key: YARN-8310 > URL: https://issues.apache.org/jira/browse/YARN-8310 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Robert Kanter >Assignee: Robert Kanter >Priority: Major > Attachments: YARN-8310.001.patch, YARN-8310.002.patch, > YARN-8310.003.patch, YARN-8310.branch-2.001.patch, > YARN-8310.branch-2.002.patch, YARN-8310.branch-2.003.patch > > > In some recent upgrade testing, we saw this error causing the NodeManager to > fail to startup afterwards: > {noformat} > org.apache.hadoop.service.ServiceStateException: > com.google.protobuf.InvalidProtocolBufferException: Protocol message > contained an invalid tag (zero). > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:173) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:441) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:834) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:895) > Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol > message contained an invalid tag (zero). 
> at > com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89) > at > com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.(YarnSecurityTokenProtos.java:1860) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.(YarnSecurityTokenProtos.java:1824) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:2016) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:2011) > at > com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.parseFrom(YarnSecurityTokenProtos.java:2686) > at > org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:254) > at > org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:177) > at > org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:322) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:455) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:373) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:316) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > ... 
5 more > {noformat} > The NodeManager fails because it's trying to read a > {{ContainerTokenIdentifier}} in the "old" format before we changed them to > protobufs (YARN-668). This is very similar to YARN-5594 where we ran into a > similar problem with the ResourceManager and RM Delegation Tokens. > To provide a better experience, we should make the code able to read the old > format if it's unable to read it using the new format. We didn't run into > any errors with the other two types of tokens that YARN-668 incompatibly > changed (NMTokenIdentifier and AMRMTokenIdentifier), but we may as well fix > those while we're at it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For
[jira] [Commented] (YARN-8310) Handle old NMTokenIdentifier, AMRMTokenIdentifier, and ContainerTokenIdentifier formats
[ https://issues.apache.org/jira/browse/YARN-8310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486626#comment-16486626 ] Miklos Szegedi commented on YARN-8310: -- +1 LGTM. > Handle old NMTokenIdentifier, AMRMTokenIdentifier, and > ContainerTokenIdentifier formats > --- > > Key: YARN-8310 > URL: https://issues.apache.org/jira/browse/YARN-8310 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Robert Kanter >Assignee: Robert Kanter >Priority: Major > Attachments: YARN-8310.001.patch, YARN-8310.002.patch, > YARN-8310.003.patch, YARN-8310.branch-2.001.patch, > YARN-8310.branch-2.002.patch, YARN-8310.branch-2.003.patch > > > In some recent upgrade testing, we saw this error causing the NodeManager to > fail to startup afterwards: > {noformat} > org.apache.hadoop.service.ServiceStateException: > com.google.protobuf.InvalidProtocolBufferException: Protocol message > contained an invalid tag (zero). > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:173) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:441) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:834) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:895) > Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol > message contained an invalid tag (zero). 
> at > com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89) > at > com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.(YarnSecurityTokenProtos.java:1860) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.(YarnSecurityTokenProtos.java:1824) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:2016) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto$1.parsePartialFrom(YarnSecurityTokenProtos.java:2011) > at > com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223) > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) > at > org.apache.hadoop.yarn.proto.YarnSecurityTokenProtos$ContainerTokenIdentifierProto.parseFrom(YarnSecurityTokenProtos.java:2686) > at > org.apache.hadoop.yarn.security.ContainerTokenIdentifier.readFields(ContainerTokenIdentifier.java:254) > at > org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:177) > at > org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerTokenIdentifier(BuilderUtils.java:322) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverContainer(ContainerManagerImpl.java:455) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:373) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:316) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > ... 
5 more > {noformat} > The NodeManager fails because it's trying to read a > {{ContainerTokenIdentifier}} in the "old" format before we changed them to > protobufs (YARN-668). This is very similar to YARN-5594 where we ran into a > similar problem with the ResourceManager and RM Delegation Tokens. > To provide a better experience, we should make the code able to read the old > format if it's unable to read it using the new format. We didn't run into > any errors with the other two types of tokens that YARN-668 incompatibly > changed (NMTokenIdentifier and AMRMTokenIdentifier), but we may as well fix > those while we're at it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
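The compatibility approach discussed in this issue can be sketched as parse-with-fallback: attempt the new protobuf format first and re-read the bytes as the pre-YARN-668 legacy layout when parsing fails. parseProto and parseLegacy below are illustrative stand-ins, not the actual readFields implementation:

```java
import java.io.IOException;

public class TokenFormatFallback {

    static String parseProto(byte[] data) throws IOException {
        // A protobuf field tag of zero is invalid, which is exactly the
        // error in the stack trace above.
        if (data.length == 0 || data[0] == 0) {
            throw new IOException(
                "Protocol message contained an invalid tag (zero).");
        }
        return "proto";
    }

    static String parseLegacy(byte[] data) {
        return "legacy";
    }

    public static String readFields(byte[] data) {
        try {
            return parseProto(data);
        } catch (IOException e) {
            // Old identifiers serialized with Writable start differently;
            // fall back to the legacy reader instead of failing NM startup.
            return parseLegacy(data);
        }
    }
}
```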
[jira] [Commented] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix unit tests on Windows
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486641#comment-16486641 ] genericqa commented on YARN-8344: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 39s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 24s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 29 unchanged - 2 fixed = 30 total (was 31) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 4s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 77m 14s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8344 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924672/YARN-8344.v1.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 1785a9e0357b 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 68c7fd8 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_162 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/20834/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20834/testReport/ | | Max. process+thread count | 341 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U:
[jira] [Created] (YARN-8341) Yarn Service: Integration tests
Chandni Singh created YARN-8341: --- Summary: Yarn Service: Integration tests Key: YARN-8341 URL: https://issues.apache.org/jira/browse/YARN-8341 Project: Hadoop YARN Issue Type: Improvement Reporter: Chandni Singh Assignee: Chandni Singh In order to test the REST API end-to-end, we can add integration tests for the Yarn service API. The integration tests * belong to the junit category {{IntegrationTest}}. * will only be run when triggered by executing {{mvn failsafe:integration-test}} * the surefire plugin for regular tests excludes {{IntegrationTest}} * RM host, user name, and any additional properties which are needed to execute the tests against a cluster can be passed as System properties. E.g. {{mvn failsafe:integration-test -Drm.host=localhost -Duser.name=root}} We can add more integration tests which can check scalability and performance. Having these tests here benefits everyone in the community because anyone can run them against their cluster. Attaching a work-in-progress patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
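The surefire/failsafe split described above can be sketched in the module's pom.xml. This is only an illustration of the mechanism, assuming JUnit 4 categories; the category's fully-qualified class name is an assumption, not taken from the wip patch:

```xml
<!-- Regular `mvn test` runs skip anything annotated
     @Category(IntegrationTest.class); the category class name
     below is hypothetical. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <excludedGroups>org.apache.hadoop.yarn.service.IntegrationTest</excludedGroups>
  </configuration>
</plugin>
<!-- `mvn failsafe:integration-test -Drm.host=... -Duser.name=...`
     runs only the integration-test category. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <configuration>
    <groups>org.apache.hadoop.yarn.service.IntegrationTest</groups>
  </configuration>
</plugin>
```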
[jira] [Updated] (YARN-8341) Yarn Service: Integration tests
[ https://issues.apache.org/jira/browse/YARN-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8341: Attachment: (was: YARN-8341.wip.patch) > Yarn Service: Integration tests > > > Key: YARN-8341 > URL: https://issues.apache.org/jira/browse/YARN-8341 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8341.wip.patch > > > In order to test the REST API end-to-end, we can add integration tests for > the Yarn service API. > The integration tests > * belong to the junit category {{IntegrationTest}}. > * will only be run when triggered by executing {{mvn > failsafe:integration-test}} > * the surefire plugin for regular tests excludes {{IntegrationTest}} > * RM host, user name, and any additional properties which are needed to > execute the tests against a cluster can be passed as System properties. > E.g. {{mvn failsafe:integration-test -Drm.host=localhost -Duser.name=root}} > We can add more integration tests which can check scalability and performance. > Having these tests here benefits everyone in the community because anyone can > run them against their cluster. > Attaching a work-in-progress patch.
[jira] [Updated] (YARN-8290) SystemMetricsPublisher.appACLsUpdated should be invoked after application information is published to ATS to avoid "User is not set in the application report" Exception
[ https://issues.apache.org/jira/browse/YARN-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8290: - Summary: SystemMetricsPublisher.appACLsUpdated should be invoked after application information is published to ATS to avoid "User is not set in the application report" Exception (was: SystemMetricsPublisher.appACLsUpdated should be invoked after application information is published to ATS to avoid ) > SystemMetricsPublisher.appACLsUpdated should be invoked after application > information is published to ATS to avoid "User is not set in the application > report" Exception > > > Key: YARN-8290 > URL: https://issues.apache.org/jira/browse/YARN-8290 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Yesha Vora >Assignee: Eric Yang >Priority: Major > Attachments: YARN-8290.001.patch, YARN-8290.002.patch, > YARN-8290.003.patch, YARN-8290.004.patch > > > Scenario: > 1) Start 5 streaming application in background > 2) Kill Active RM and cause RM failover > After RM failover, The application failed with below error. > {code}18/02/01 21:24:29 WARN client.RequestHedgingRMFailoverProxyProvider: > Invocation returned exception on [rm2] : > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1517520038847_0003' doesn't exist in RM. Please check > that the job submission was successful. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:338) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) > , so propagating back to caller. > 18/02/01 21:24:29 INFO impl.YarnClientImpl: Submitted application > application_1517520038847_0003 > 18/02/01 21:24:30 INFO mapreduce.JobSubmitter: Cleaning up the staging area > /user/hrt_qa/.staging/job_1517520038847_0003 > 18/02/01 21:24:30 ERROR streaming.StreamJob: Error Launching job : User is > not set in the application report > Streaming Command Failed!{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8290) SystemMetricsPublisher.appACLsUpdated should be invoked after application information is published to ATS to avoid "User is not set in the application report" Exception
[ https://issues.apache.org/jira/browse/YARN-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8290: - Priority: Critical (was: Major) > SystemMetricsPublisher.appACLsUpdated should be invoked after application > information is published to ATS to avoid "User is not set in the application > report" Exception > > > Key: YARN-8290 > URL: https://issues.apache.org/jira/browse/YARN-8290 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Yesha Vora >Assignee: Eric Yang >Priority: Critical > Attachments: YARN-8290.001.patch, YARN-8290.002.patch, > YARN-8290.003.patch, YARN-8290.004.patch > > > Scenario: > 1) Start 5 streaming application in background > 2) Kill Active RM and cause RM failover > After RM failover, The application failed with below error. > {code}18/02/01 21:24:29 WARN client.RequestHedgingRMFailoverProxyProvider: > Invocation returned exception on [rm2] : > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1517520038847_0003' doesn't exist in RM. Please check > that the job submission was successful. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:338) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) > , so propagating back to caller. > 18/02/01 21:24:29 INFO impl.YarnClientImpl: Submitted application > application_1517520038847_0003 > 18/02/01 21:24:30 INFO mapreduce.JobSubmitter: Cleaning up the staging area > /user/hrt_qa/.staging/job_1517520038847_0003 > 18/02/01 21:24:30 ERROR streaming.StreamJob: Error Launching job : User is > not set in the application report > Streaming Command Failed!{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8341) Yarn Service: Integration tests
[ https://issues.apache.org/jira/browse/YARN-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8341: Attachment: YARN-8341.wip.patch > Yarn Service: Integration tests > > > Key: YARN-8341 > URL: https://issues.apache.org/jira/browse/YARN-8341 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8341.wip.patch, YARN-8341.wip.patch > > > In order to test the REST API end-to-end, we can add integration tests for > the Yarn service API. > The integration tests > * belong to the junit category {{IntegrationTest}}. > * will only be run when triggered by executing {{mvn > failsafe:integration-test}} > * the surefire plugin for regular tests excludes {{IntegrationTest}} > * RM host, user name, and any additional properties which are needed to > execute the tests against a cluster can be passed as System properties. > E.g. {{mvn failsafe:integration-test -Drm.host=localhost -Duser.name=root}} > We can add more integration tests which can check scalability and performance. > Having these tests here benefits everyone in the community because anyone can > run them against their cluster. > Attaching a work-in-progress patch.
[jira] [Commented] (YARN-8273) Log aggregation does not warn if HDFS quota in target directory is exceeded
[ https://issues.apache.org/jira/browse/YARN-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484638#comment-16484638 ] Hudson commented on YARN-8273: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14255 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14255/]) YARN-8273. Log aggregation does not warn if HDFS quota in target (rkanter: rev b22f56c4719e63bd4f6edc2a075e0bcdb9442255) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/pom.xml * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/filecontroller/LogAggregationDFSException.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestAppLogAggregatorImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/filecontroller/tfile/LogAggregationTFileController.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogFormat.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/filecontroller/LogAggregationFileController.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/TestContainerLogsUtils.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServices.java > Log aggregation does not warn if HDFS quota in target directory is exceeded > --- > > Key: YARN-8273 > URL: 
https://issues.apache.org/jira/browse/YARN-8273 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation >Affects Versions: 3.1.0 >Reporter: Gergo Repas >Assignee: Gergo Repas >Priority: Major > Fix For: 3.2.0 > > Attachments: YARN-8273.000.patch, YARN-8273.001.patch, > YARN-8273.002.patch, YARN-8273.003.patch, YARN-8273.004.patch, > YARN-8273.005.patch, YARN-8273.006.patch > > > It appears that if an HDFS space quota is set on a target directory for log > aggregation and the quota is already exceeded when log aggregation is > attempted, zero-byte log files will be written to the HDFS directory, however > NodeManager logs do not reflect a failure to write the files successfully > (i.e. there are no ERROR or WARN messages to this effect). > An improvement may be worth investigating to alert users to this scenario, as > otherwise logs for a YARN application may be missing both on HDFS and locally > (after local log cleanup is done) and the user may not otherwise be informed. > Steps to reproduce: > * Set a small HDFS space quota on /tmp/logs/username/logs (e.g. 2MB) > * Write files to HDFS such that /tmp/logs/username/logs is almost 2MB full > * Run a Spark or MR job in the cluster > * Observe that zero byte files are written to HDFS after job completion > * Observe that YARN container logs are also not present on the NM hosts (or > are deleted after yarn.nodemanager.delete.debug-delay-sec) > * Observe that no ERROR or WARN messages appear to be logged in the NM role > log -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8334) Fix potential connection leak in GPGUtils
[ https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484651#comment-16484651 ] Botong Huang edited comment on YARN-8334 at 5/22/18 10:03 PM: -- I realized that I confused destroy() with finalize() earlier. +1 on the patch pending the findbugs warning. You can basically remove the if (client != null) check. was (Author: botong): I realized that I confused destroy() with finalize() earlier. +1 on the patch pending on the findbug warning. > Fix potential connection leak in GPGUtils > - > > Key: YARN-8334 > URL: https://issues.apache.org/jira/browse/YARN-8334 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Minor > Attachments: YARN-8334-YARN-7402.v1.patch > > > Missing ClientResponse.close and Client.destroy can lead to a connection leak.
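The cleanup being discussed can be sketched with stand-in classes. The real GPGUtils uses Jersey's Client/ClientResponse; the stubs below are hypothetical and only illustrate the close-then-destroy ordering in a finally block, with the null check on the client dropped because it is created before the try:

```java
public class ConnectionCleanupSketch {
    // Minimal hypothetical stand-ins for Jersey's Client / ClientResponse.
    static class ClientResponse {
        boolean closed;
        void close() { closed = true; }
    }
    static class Client {
        boolean destroyed;
        ClientResponse get() { return new ClientResponse(); }
        void destroy() { destroyed = true; }
    }

    // Guarantee cleanup even when reading the response throws:
    // close the response first (releases the pooled connection),
    // then destroy the client. The client is non-null by construction,
    // so no null check is needed on it.
    static void invoke(Client client) {
        ClientResponse response = null;
        try {
            response = client.get();
            // ... read the entity here ...
        } finally {
            if (response != null) {
                response.close();
            }
            client.destroy();
        }
    }

    public static void main(String[] args) {
        Client c = new Client();
        invoke(c);
        System.out.println(c.destroyed); // true
    }
}
```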
[jira] [Commented] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
[ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484674#comment-16484674 ] Jason Lowe commented on YARN-8292: -- Thanks for updating the patch! Why does isAnyMajorResourceZeroOrNegative explicitly use a floating point zero constant and force the implicit conversion of the getMemorySize() result from a long to a float? This is done in a few other places in DefaultResourceCalculator and they all seem like wasteful conversions to me. The logger that was added to AbstractPreemptableResourceCalculator is not used. Also I'm curious why commons logging was used here instead of SLF4J. stepFactor is a constant that should be precomputed in the AbstractPreemptableResourceCalculator constructor rather than computing it from scratch each time. Do we really want to use Resources.lessThanOrEqual(rc, totGuarant, unassigned, none) here? For DRF that requires computing shares in each resource dimension for both resources which is relatively expensive. I think Resources.fitsIn(unassigned, none) is more along the lines of what is called for here (although fitsIn does some unit checking and conversions we don't want either). Really what we want is something like an isAnyMajorResourceRequested() which returns true if any resource dimension is > 0. Not a fan of the proposed method name, but hopefully it gets across what I'm talking about here. Of course if we're going to always componentwiseMax unassigned with Resources.none() to make sure no resource dimension in unassigned can ever go negative then the check can be simplified to if (Resources.none().equals(unassigned)). Similar "do we really want a full DRF comparison here" comment for the Resources.greaterThan(rc, clusterResource, toObtainByPartition, Resources.none()) check and the Resources.lessThan check that occurs a bit later. The comment says: {code} * When true: *stop preempt container when any resource type < 0 for to- *preempt. 
{code} but the code will stop preempting if any resource dimension <= 0 since it does: {code} if (conservativeDRF) { doPreempt = !Resources.isAnyMajorResourceZeroOrNegative(rc, toObtainByPartition); {code} I agree with Eric that this essentially means conservativeDRF is badly broken if there is a resource dimension that is not requested by every container, and that raises the question of whether it makes sense to make conservativeDRF the default. It would be good to cleanup the unused imports as flagged by checkstyle. > Fix the dominant resource preemption cannot happen when some of the resource > vector becomes negative > > > Key: YARN-8292 > URL: https://issues.apache.org/jira/browse/YARN-8292 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-8292.001.patch, YARN-8292.002.patch, > YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch, > YARN-8292.006.patch > > > This is an example of the problem: > > {code} > // guaranteed, max,used, pending > "root(=[30:18:6 30:18:6 12:12:6 1:1:1]);" + //root > "-a(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // a > "-b(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // b > "-c(=[10:6:2 10:6:2 0:0:0 1:1:1])"; // c > {code} > There're 3 resource types. Total resource of the cluster is 30:18:6 > For both of a/b, there're 3 containers running, each of container is 2:2:1. > Queue c uses 0 resource, and have 1:1:1 pending resource. > Under existing logic, preemption cannot happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
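The checks discussed above can be sketched with pure long comparisons, avoiding the long-to-float conversion flagged in the review. The class and field names below are stand-ins, not Hadoop's actual Resource API:

```java
public class ResourceCheckSketch {
    // Hypothetical stand-in for org.apache.hadoop.yarn.api.records.Resource:
    // one long value per resource type (memory, vcores, ...).
    static class Resource {
        final long[] values;
        Resource(long... values) { this.values = values; }
    }

    // True if any resource dimension is zero or negative -- a plain
    // long comparison, no floating point zero constant needed.
    static boolean isAnyMajorResourceZeroOrNegative(Resource r) {
        for (long v : r.values) {
            if (v <= 0) {
                return true;
            }
        }
        return false;
    }

    // True if at least one dimension is still requested (> 0) -- the
    // cheap alternative to a full DRF lessThanOrEqual comparison,
    // which would compute shares in every dimension.
    static boolean isAnyMajorResourceRequested(Resource r) {
        for (long v : r.values) {
            if (v > 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Resource unassigned = new Resource(4096, 2, 0); // memory, vcores, gpus
        System.out.println(isAnyMajorResourceZeroOrNegative(unassigned)); // true: gpus == 0
        System.out.println(isAnyMajorResourceRequested(unassigned));      // true: memory > 0
    }
}
```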
[jira] [Commented] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
[ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484735#comment-16484735 ] Wangda Tan commented on YARN-8292: -- Thanks [~jlowe], all great comments. Addressed all of them, please let me know if you have any other comments. (007) > Fix the dominant resource preemption cannot happen when some of the resource > vector becomes negative > > > Key: YARN-8292 > URL: https://issues.apache.org/jira/browse/YARN-8292 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-8292.001.patch, YARN-8292.002.patch, > YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch, > YARN-8292.006.patch, YARN-8292.007.patch > > > This is an example of the problem: > > {code} > // guaranteed, max,used, pending > "root(=[30:18:6 30:18:6 12:12:6 1:1:1]);" + //root > "-a(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // a > "-b(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // b > "-c(=[10:6:2 10:6:2 0:0:0 1:1:1])"; // c > {code} > There're 3 resource types. Total resource of the cluster is 30:18:6 > For both of a/b, there're 3 containers running, each of container is 2:2:1. > Queue c uses 0 resource, and have 1:1:1 pending resource. > Under existing logic, preemption cannot happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
[ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8292: - Attachment: YARN-8292.007.patch > Fix the dominant resource preemption cannot happen when some of the resource > vector becomes negative > > > Key: YARN-8292 > URL: https://issues.apache.org/jira/browse/YARN-8292 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-8292.001.patch, YARN-8292.002.patch, > YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch, > YARN-8292.006.patch, YARN-8292.007.patch > > > This is an example of the problem: > > {code} > // guaranteed, max,used, pending > "root(=[30:18:6 30:18:6 12:12:6 1:1:1]);" + //root > "-a(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // a > "-b(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // b > "-c(=[10:6:2 10:6:2 0:0:0 1:1:1])"; // c > {code} > There're 3 resource types. Total resource of the cluster is 30:18:6 > For both of a/b, there're 3 containers running, each of container is 2:2:1. > Queue c uses 0 resource, and have 1:1:1 pending resource. > Under existing logic, preemption cannot happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8290) SystemMetricsPublisher.appACLsUpdated should be invoked after application information is published to ATS to avoid "User is not set in the application report" Exception
[ https://issues.apache.org/jira/browse/YARN-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484586#comment-16484586 ] Hudson commented on YARN-8290: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14254 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14254/]) YARN-8290. SystemMetricsPublisher.appACLsUpdated should be invoked after (wangda: rev bd15d2396ef0c24fb6b60c6393d16b37651b828e) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java > SystemMetricsPublisher.appACLsUpdated should be invoked after application > information is published to ATS to avoid "User is not set in the application > report" Exception > > > Key: YARN-8290 > URL: https://issues.apache.org/jira/browse/YARN-8290 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Yesha Vora >Assignee: Eric Yang >Priority: Critical > Fix For: 3.2.0, 3.1.1 > > Attachments: YARN-8290.001.patch, YARN-8290.002.patch, > YARN-8290.003.patch, YARN-8290.004.patch > > > Scenario: > 1) Start 5 streaming application in background > 2) Kill Active RM and cause RM failover > After RM failover, The application failed with below error. 
> {code}18/02/01 21:24:29 WARN client.RequestHedgingRMFailoverProxyProvider: > Invocation returned exception on [rm2] : > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1517520038847_0003' doesn't exist in RM. Please check > that the job submission was successful. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:338) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347) > , so propagating back to caller. > 18/02/01 21:24:29 INFO impl.YarnClientImpl: Submitted application > application_1517520038847_0003 > 18/02/01 21:24:30 INFO mapreduce.JobSubmitter: Cleaning up the staging area > /user/hrt_qa/.staging/job_1517520038847_0003 > 18/02/01 21:24:30 ERROR streaming.StreamJob: Error Launching job : User is > not set in the application report > Streaming Command Failed!{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8273) Log aggregation does not warn if HDFS quota in target directory is exceeded
[ https://issues.apache.org/jira/browse/YARN-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484588#comment-16484588 ] Robert Kanter commented on YARN-8273: - +1 LGTM > Log aggregation does not warn if HDFS quota in target directory is exceeded > --- > > Key: YARN-8273 > URL: https://issues.apache.org/jira/browse/YARN-8273 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation >Affects Versions: 3.1.0 >Reporter: Gergo Repas >Assignee: Gergo Repas >Priority: Major > Attachments: YARN-8273.000.patch, YARN-8273.001.patch, > YARN-8273.002.patch, YARN-8273.003.patch, YARN-8273.004.patch, > YARN-8273.005.patch, YARN-8273.006.patch > > > It appears that if an HDFS space quota is set on a target directory for log > aggregation and the quota is already exceeded when log aggregation is > attempted, zero-byte log files will be written to the HDFS directory, however > NodeManager logs do not reflect a failure to write the files successfully > (i.e. there are no ERROR or WARN messages to this effect). > An improvement may be worth investigating to alert users to this scenario, as > otherwise logs for a YARN application may be missing both on HDFS and locally > (after local log cleanup is done) and the user may not otherwise be informed. > Steps to reproduce: > * Set a small HDFS space quota on /tmp/logs/username/logs (e.g. 2MB) > * Write files to HDFS such that /tmp/logs/username/logs is almost 2MB full > * Run a Spark or MR job in the cluster > * Observe that zero byte files are written to HDFS after job completion > * Observe that YARN container logs are also not present on the NM hosts (or > are deleted after yarn.nodemanager.delete.debug-delay-sec) > * Observe that no ERROR or WARN messages appear to be logged in the NM role > log -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8332) Incorrect min/max allocation property name in resource types doc
[ https://issues.apache.org/jira/browse/YARN-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484587#comment-16484587 ] Hudson commented on YARN-8332: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14254 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14254/]) YARN-8332. Incorrect min/max allocation property name in resource types (wangda: rev 83f53e5c6236de30c213dc41878cebfb02597e26) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceModel.md > Incorrect min/max allocation property name in resource types doc > > > Key: YARN-8332 > URL: https://issues.apache.org/jira/browse/YARN-8332 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Critical > Fix For: 3.2.0, 3.1.1 > > Attachments: YARN-8332.001.patch > > > It should be > {noformat} > yarn.resource-types..minimum-allocation > yarn.resource-types..maximum-allocation > {noformat} > instead of > {noformat} > yarn.resource-types..minimum > yarn.resource-types..maximum > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8343) YARN should have the ability to run images only from a whitelist of docker registries
Wangda Tan created YARN-8343: Summary: YARN should have the ability to run images only from a whitelist of docker registries Key: YARN-8343 URL: https://issues.apache.org/jira/browse/YARN-8343 Project: Hadoop YARN Issue Type: Bug Reporter: Wangda Tan This is a superset of docker.privileged-containers.registries: admins can specify a whitelist, and all images from registries outside it will be rejected.
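The whitelist idea can be sketched as follows. This is a hypothetical illustration, not the proposed YARN implementation; it uses Docker's convention that the first path component of an image reference is a registry host only if it contains a dot or colon or equals "localhost", otherwise the image comes from the default registry:

```java
import java.util.Set;

public class RegistryWhitelistSketch {
    // Extract the registry host from an image reference, falling back
    // to the default registry for bare names like "library/ubuntu".
    static String registryOf(String image) {
        int slash = image.indexOf('/');
        if (slash < 0) {
            return "docker.io";
        }
        String first = image.substring(0, slash);
        if (first.contains(".") || first.contains(":") || first.equals("localhost")) {
            return first;
        }
        return "docker.io";
    }

    // Reject any image whose registry is not on the admin whitelist.
    static boolean isWhitelisted(String image, Set<String> whitelist) {
        return whitelist.contains(registryOf(image));
    }

    public static void main(String[] args) {
        Set<String> allowed = Set.of("registry.example.com", "docker.io");
        System.out.println(isWhitelisted("registry.example.com/team/app:1.0", allowed)); // true
        System.out.println(isWhitelisted("evil.example.org/app:latest", allowed));       // false
    }
}
```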
[jira] [Commented] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6
[ https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484721#comment-16484721 ] Hsin-Liang Huang commented on YARN-8326:
[~eyang] This afternoon I tried the command, and the performance improved dramatically: it used to take 8 seconds, and now it consistently ran in 3 seconds. I then compared with the other HDP 3.0 cluster, on which I did not make the property changes you suggested, and it still consistently ran in 8 seconds. I am going to run our test cases to see whether the performance improves there as well.
> Yarn 3.0 seems runs slower than Yarn 2.6
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0:
> hadoop.registry.dns.bind-port = 5353
> hadoop.registry.dns.domain-name = hwx.site
> hadoop.registry.dns.enabled = true
> hadoop.registry.dns.zone-mask = 255.255.255.0
> hadoop.registry.dns.zone-subnet = 172.17.0.0
> manage.include.files = false
> yarn.acl.enable = false
> yarn.admin.acl = yarn
> yarn.client.nodemanager-connect.max-wait-ms = 6
> yarn.client.nodemanager-connect.retry-interval-ms = 1
> yarn.http.policy = HTTP_ONLY
> yarn.log-aggregation-enable = false
> yarn.log-aggregation.retain-seconds = 2592000
> yarn.log.server.url = [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs]
> yarn.log.server.web-service.url = [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory]
> yarn.node-labels.enabled = false
> yarn.node-labels.fs-store.retry-policy-spec = 2000, 500
> yarn.node-labels.fs-store.root-dir = /system/yarn/node-labels
> yarn.nodemanager.address = 0.0.0.0:45454
> yarn.nodemanager.admin-env = MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
> yarn.nodemanager.aux-services = mapreduce_shuffle,spark2_shuffle,timeline_collector
> yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
> yarn.nodemanager.aux-services.spark2_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.spark2_shuffle.classpath = /usr/spark2/aux/*
> yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.timeline_collector.class = org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
> yarn.nodemanager.bind-host = 0.0.0.0
> yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
> yarn.nodemanager.container-metrics.unregister-delay-ms = 6
> yarn.nodemanager.container-monitor.interval-ms = 3000
> yarn.nodemanager.delete.debug-delay-sec = 0
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage = 90
> yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb = 1000
> yarn.nodemanager.disk-health-checker.min-healthy-disks = 0.25
> yarn.nodemanager.health-checker.interval-ms = 135000
> yarn.nodemanager.health-checker.script.timeout-ms = 6
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage = false
> yarn.nodemanager.linux-container-executor.group = hadoop
> yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users = false
> yarn.nodemanager.local-dirs = /hadoop/yarn/local
> yarn.nodemanager.log-aggregation.compression-type = gz
> yarn.nodemanager.log-aggregation.debug-enabled = false
> yarn.nodemanager.log-aggregation.num-log-files-per-app = 30
> yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds = 3600
> yarn.nodemanager.log-dirs = /hadoop/yarn/log
> yarn.nodemanager.log.retain-seconds = 604800
> yarn.nodemanager.pmem-check-enabled = false
> yarn.nodemanager.recovery.dir = /var/log/hadoop-yarn/nodemanager/recovery-state
> yarn.nodemanager.recovery.enabled = true
> yarn.nodemanager.recovery.supervised = true
> yarn.nodemanager.remote-app-log-dir = /app-logs
> yarn.nodemanager.remote-app-log-dir-suffix = logs
> yarn.nodemanager.resource-plugins =
> yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices = auto
> yarn.nodemanager.resource-plugins.gpu.docker-plugin = nvidia-docker-v1
[jira] [Commented] (YARN-8316) Diagnostic message should improve when yarn service fails to launch due to ATS unavailability
[ https://issues.apache.org/jira/browse/YARN-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16485766#comment-16485766 ] genericqa commented on YARN-8316:
(/) +1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 43s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 27m 31s | trunk passed |
| +1 | compile | 0m 27s | trunk passed |
| +1 | checkstyle | 0m 20s | trunk passed |
| +1 | mvnsite | 0m 34s | trunk passed |
| +1 | shadedclient | 11m 34s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 40s | trunk passed |
| +1 | javadoc | 0m 19s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 31s | the patch passed |
| +1 | compile | 0m 24s | the patch passed |
| +1 | javac | 0m 24s | the patch passed |
| +1 | checkstyle | 0m 16s | the patch passed |
| +1 | mvnsite | 0m 27s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 39s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 45s | the patch passed |
| +1 | javadoc | 0m 17s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 28m 37s | hadoop-yarn-client in the patch passed. |
| +1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
| | | 85m 48s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8316 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924638/YARN-8316.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 786f4b1ac210 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 43be9ab |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20829/testReport/ |
| Max. process+thread count | 703 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/20829/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
> Diagnostic message should improve when yarn service fails to launch due to
> ATS unavailability
[jira] [Updated] (YARN-8341) Yarn Service: Integration tests
[ https://issues.apache.org/jira/browse/YARN-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8341:
Attachment: YARN-8341.wip.patch
> Yarn Service: Integration tests
>
> Key: YARN-8341
> URL: https://issues.apache.org/jira/browse/YARN-8341
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Chandni Singh
> Assignee: Chandni Singh
> Priority: Major
> Attachments: YARN-8341.wip.patch
>
> In order to test the REST API end-to-end, we can add integration tests for the Yarn service API.
> The integration tests:
> * belong to the junit category {{IntegrationTest}}
> * are only run when triggered by executing {{mvn failsafe:integration-test}}
> * are excluded from regular test runs by the surefire plugin, which skips {{IntegrationTest}}
> * take the RM host, user name, and any additional properties needed to run against a cluster as system properties, e.g. {{mvn failsafe:integration-test -Drm.host=localhost -Duser.name=root}}
> We can add more integration tests to check scalability and performance.
> Having these tests here benefits everyone in the community, because anyone can run them against their own cluster.
> Attaching a work-in-progress patch.
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
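The surefire/failsafe split described in that list can be sketched in the module's pom.xml. This is only a sketch of the described setup, not the contents of YARN-8341.wip.patch; the fully qualified category name and the property wiring are assumptions:
{code:xml}
<build>
  <plugins>
    <!-- Regular test runs: surefire excludes the IntegrationTest category
         (the fully qualified name here is a placeholder). -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-plugin</artifactId>
      <configuration>
        <excludedGroups>org.apache.hadoop.yarn.service.IntegrationTest</excludedGroups>
      </configuration>
    </plugin>
    <!-- mvn failsafe:integration-test runs only that category, passing
         cluster coordinates through as system properties. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-failsafe-plugin</artifactId>
      <configuration>
        <groups>org.apache.hadoop.yarn.service.IntegrationTest</groups>
        <systemPropertyVariables>
          <rm.host>${rm.host}</rm.host>
        </systemPropertyVariables>
      </configuration>
    </plugin>
  </plugins>
</build>
{code}
With wiring of this shape, {{mvn test}} stays fast for everyone while {{mvn failsafe:integration-test -Drm.host=...}} opts in to the cluster tests.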
[jira] [Updated] (YARN-8316) Diagnostic message should improve when yarn service fails to launch due to ATS unavailability
[ https://issues.apache.org/jira/browse/YARN-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi updated YARN-8316:
Attachment: YARN-8316.001.patch
> Diagnostic message should improve when yarn service fails to launch due to
> ATS unavailability
>
> Key: YARN-8316
> URL: https://issues.apache.org/jira/browse/YARN-8316
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn-native-services
> Affects Versions: 3.1.0
> Reporter: Yesha Vora
> Assignee: Billie Rinaldi
> Priority: Major
> Attachments: YARN-8316.001.patch
>
> Scenario:
> 1) Shut down ATS.
> 2) Launch a yarn service.
> The yarn service launch command fails with the stack trace below. There is no diagnostic message available in the response.
> {code:java}
> bash-4.2$ yarn app -launch hbase-sec /tmp/hbase-secure.yar
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of YARN_PID_DIR.
> 18/05/17 13:24:43 INFO client.RMProxy: Connecting to ResourceManager at xxx/xxx:8050
> 18/05/17 13:24:44 INFO client.AHSProxy: Connecting to Application History server at localhost/xxx:10200
> 18/05/17 13:24:44 INFO client.RMProxy: Connecting to ResourceManager at xxx/xxx:8050
> 18/05/17 13:24:44 INFO client.AHSProxy: Connecting to Application History server at localhost/127.0.0.1:10200
> 18/05/17 13:24:44 INFO client.ApiServiceClient: Loading service definition from local FS: /tmp/hbase-secure.yar
> 18/05/17 13:26:06 ERROR client.ApiServiceClient:
> bash-4.2$ echo $?
> 56{code}
> The error message should surface the underlying ConnectionRefused exception.
[jira] [Comment Edited] (YARN-8334) Fix potential connection leak in GPGUtils
[ https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484651#comment-16484651 ] Botong Huang edited comment on YARN-8334 at 5/22/18 10:01 PM:
I realized that I confused destroy() with finalize() earlier. +1 on the patch, pending the findbugs warning.
was (Author: botong): I realized that I confused destroy() with finalize() earlier. +1 on the patch.
> Fix potential connection leak in GPGUtils
>
> Key: YARN-8334
> URL: https://issues.apache.org/jira/browse/YARN-8334
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Giovanni Matteo Fumarola
> Assignee: Giovanni Matteo Fumarola
> Priority: Minor
> Attachments: YARN-8334-YARN-7402.v1.patch
>
> Missing calls to ClientResponse.close() and Client.destroy() can lead to a connection leak.
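For context, the close/destroy pattern being discussed looks like the following Jersey 1.x sketch; the class name, method shape, and URL handling are placeholders, not the actual GPGUtils code:
{code:java}
import com.sun.jersey.api.client.Client;
import com.sun.jersey.api.client.ClientResponse;

// Sketch only: close the response and destroy the client in a finally
// block so the underlying HTTP connection cannot leak on any code path.
public final class RestCallSketch {
  static String fetch(String url) {
    Client client = Client.create();
    ClientResponse response = null;
    try {
      response = client.resource(url).get(ClientResponse.class);
      return response.getEntity(String.class);
    } finally {
      if (response != null) {
        response.close();
      }
      client.destroy();
    }
  }
}
{code}
Relying on finalize() instead of an explicit destroy() would leave the connection's release to the garbage collector, which is exactly the leak this issue is about.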
[jira] [Updated] (YARN-8339) Service AM should localize static/archive resource types to container working directory instead of 'resources'
[ https://issues.apache.org/jira/browse/YARN-8339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8339:
Target Version/s: 3.1.1
Priority: Critical (was: Major)
> Service AM should localize static/archive resource types to container working
> directory instead of 'resources'
>
> Key: YARN-8339
> URL: https://issues.apache.org/jira/browse/YARN-8339
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn-native-services
> Reporter: Suma Shivaprasad
> Assignee: Suma Shivaprasad
> Priority: Critical
> Attachments: YARN-8339.1.patch
>
> This is to address one of the review comments posted by [~wangda] in YARN-8079 at
> https://issues.apache.org/jira/browse/YARN-8079?focusedCommentId=16482065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16482065
[jira] [Commented] (YARN-8339) Service AM should localize static/archive resource types to container working directory instead of 'resources'
[ https://issues.apache.org/jira/browse/YARN-8339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484538#comment-16484538 ] Wangda Tan commented on YARN-8339:
Thanks [~suma.shivaprasad] for the quick fix. The patch LGTM; I just triggered another Jenkins run and will commit it after that.
> Service AM should localize static/archive resource types to container working
> directory instead of 'resources'
>
> Key: YARN-8339
> URL: https://issues.apache.org/jira/browse/YARN-8339
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn-native-services
> Reporter: Suma Shivaprasad
> Assignee: Suma Shivaprasad
> Priority: Major
> Attachments: YARN-8339.1.patch
>
> This is to address one of the review comments posted by [~wangda] in YARN-8079 at
> https://issues.apache.org/jira/browse/YARN-8079?focusedCommentId=16482065=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16482065
[jira] [Commented] (YARN-8338) TimelineService V1.5 doesn't come up after HADOOP-15406
[ https://issues.apache.org/jira/browse/YARN-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16485752#comment-16485752 ] Vinod Kumar Vavilapalli commented on YARN-8338:
[~jlowe], yeah, it wasn't easy to figure out who or what depends on that JAR. There is no easy way to unit test your approach; let me try it in a cluster.
> TimelineService V1.5 doesn't come up after HADOOP-15406
>
> Key: YARN-8338
> URL: https://issues.apache.org/jira/browse/YARN-8338
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Vinod Kumar Vavilapalli
> Priority: Critical
> Attachments: YARN-8338.txt
>
> TimelineService V1.5 fails with the following:
> {code}
> java.lang.NoClassDefFoundError: org/objenesis/Objenesis
> at org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.(RollingLevelDBTimelineStore.java:174)
> {code}
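One generic way to find out which modules pull in the missing JAR is to filter the Maven dependency tree from the source root; this is standard Maven tooling, not a command suggested in the thread:
{code}
# List every module whose dependency tree contains org.objenesis,
# the group behind the NoClassDefFoundError above.
mvn dependency:tree -Dincludes=org.objenesis
{code}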
[jira] [Updated] (YARN-8290) SystemMetricsPublisher.appACLsUpdated should be invoked after application information is published to ATS to avoid
[ https://issues.apache.org/jira/browse/YARN-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8290:
Summary: SystemMetricsPublisher.appACLsUpdated should be invoked after application information is published to ATS to avoid (was: Yarn application failed to recover with "Error Launching job : User is not set in the application report" error after RM restart)
> SystemMetricsPublisher.appACLsUpdated should be invoked after application
> information is published to ATS to avoid
>
> Key: YARN-8290
> URL: https://issues.apache.org/jira/browse/YARN-8290
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.1.1
> Reporter: Yesha Vora
> Assignee: Eric Yang
> Priority: Major
> Attachments: YARN-8290.001.patch, YARN-8290.002.patch, YARN-8290.003.patch, YARN-8290.004.patch
>
> Scenario:
> 1) Start 5 streaming applications in the background.
> 2) Kill the active RM to cause an RM failover.
> After the RM failover, the applications failed with the error below.
> {code}18/02/01 21:24:29 WARN client.RequestHedgingRMFailoverProxyProvider:
> Invocation returned exception on [rm2] :
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application
> with id 'application_1517520038847_0003' doesn't exist in RM. Please check
> that the job submission was successful.
> at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:338)
> at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175)
> at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
> , so propagating back to caller.
> 18/02/01 21:24:29 INFO impl.YarnClientImpl: Submitted application application_1517520038847_0003
> 18/02/01 21:24:30 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/hrt_qa/.staging/job_1517520038847_0003
> 18/02/01 21:24:30 ERROR streaming.StreamJob: Error Launching job : User is not set in the application report
> Streaming Command Failed!{code}
[jira] [Commented] (YARN-8332) Incorrect min/max allocation property name in resource types doc
[ https://issues.apache.org/jira/browse/YARN-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484513#comment-16484513 ] Wangda Tan commented on YARN-8332:
+1, committing ..
> Incorrect min/max allocation property name in resource types doc
>
> Key: YARN-8332
> URL: https://issues.apache.org/jira/browse/YARN-8332
> Project: Hadoop YARN
> Issue Type: Bug
> Components: documentation
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Critical
> Attachments: YARN-8332.001.patch
>
> It should be
> {noformat}
> yarn.resource-types.<resource>.minimum-allocation
> yarn.resource-types.<resource>.maximum-allocation
> {noformat}
> instead of
> {noformat}
> yarn.resource-types.<resource>.minimum
> yarn.resource-types.<resource>.maximum
> {noformat}
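With the corrected names, a yarn-site.xml entry for a hypothetical custom resource named gpu would look like this; the resource name and values are illustrative only:
{code:xml}
<property>
  <name>yarn.resource-types</name>
  <value>gpu</value>
</property>
<!-- Note the -allocation suffix, which the doc currently omits. -->
<property>
  <name>yarn.resource-types.gpu.minimum-allocation</name>
  <value>0</value>
</property>
<property>
  <name>yarn.resource-types.gpu.maximum-allocation</name>
  <value>8</value>
</property>
{code}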
[jira] [Commented] (YARN-8343) YARN should have ability to run images only from a whitelist docker registries
[ https://issues.apache.org/jira/browse/YARN-8343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484695#comment-16484695 ] Wangda Tan commented on YARN-8343:
cc: [~shaneku...@gmail.com], [~eyang], [~ebadger], [~jlowe]
> YARN should have ability to run images only from a whitelist docker registries
>
> Key: YARN-8343
> URL: https://issues.apache.org/jira/browse/YARN-8343
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Wangda Tan
> Priority: Critical
>
> This is a superset of docker.privileged-containers.registries: the admin can specify a whitelist, and all images from registries outside it will be rejected.
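For comparison, today's per-registry control lives in the [docker] section of container-executor.cfg. The sketch below shows the existing privileged-containers key plus a hypothetical whitelist key of the kind this issue proposes; the whitelist key name and registry values are invented here, not decided in the JIRA:
{code}
[docker]
  # Existing: images from these registries may request privileged containers.
  docker.privileged-containers.registries=internal-registry.example.com
  # Hypothetical whitelist key in the spirit of YARN-8343: images from any
  # registry not listed here would be rejected outright.
  docker.allowed-containers.registries=internal-registry.example.com,library
{code}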