[GitHub] spark pull request: [SPARK-12807] [YARN] Spark External Shuffle no...

2016-02-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/10780#discussion_r52205244 --- Diff: network/yarn/pom.xml --- @@ -86,6 +88,15

[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

2016-02-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/8#discussion_r52220649 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala --- @@ -0,0 +1,669 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-12807] [YARN] Spark External Shuffle no...

2016-02-05 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10780#issuecomment-180284513 Can do --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: [SPARK-12807] [YARN] Spark External Shuffle no...

2016-02-05 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10780#issuecomment-180338682 I've just pushed out a new version which leaves only jackson and guava as the relocations: the ones that are most trouble. I could turn off the guava relocate

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2016-02-05 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-180374115 rebased against master; switch from scan of HTML view to REST API to enumerate listings of complete/incomplete apps, add @squito's ? arg redirection and test

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2016-02-04 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-179849703 ...thx for the feedback. I'm fixing the merge which is now triggering a regression —maybe a race condition in test startup— apps should go from

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2016-02-04 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-179850363 ..I should add that it depends on the head attempt on the list being complete; the filter in HistoryServer is very sensitive to ordering. If there's an incomplete

[GitHub] spark pull request: [SPARK-13148] [YARN] [WIP] zero-keytab Oozie a...

2016-02-03 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/11033#issuecomment-179232563 This now works: I've tested it by creating tokens, saving them to a file, pointing to it in the env var, then using spark-submit to bring up a cluster

[GitHub] spark pull request: [SPARK-13148] [YARN] [WIP] zero-keytab Oozie a...

2016-02-02 Thread steveloughran
GitHub user steveloughran opened a pull request: https://github.com/apache/spark/pull/11033 [SPARK-13148] [YARN] [WIP] zero-keytab Oozie application launch This patch looks for the env var `HADOOP_TOKEN_FILE_LOCATION`, and if set skips trying to collect new delegation tokens

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2016-01-28 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-176209826 @squito —have you had a chance to look at the latest version? I think I've addressed all your issues, and it is building and testing against the current master

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2016-01-26 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-175048241 I'm going to add a warning note here pointing to [HDFS-5478](https://issues.apache.org/jira/browse/HDFS-5478), some versions of HDFS aren't picking up changes

[GitHub] spark pull request: [SPARK-12807] [YARN] Spark External Shuffle no...

2016-01-19 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10780#issuecomment-172985031 The latest patch 1. relocates everything except leveldb. Leveldb uses JNI stuff, which doesn't relocate, and on JVMs with multiple classloaders, can

[GitHub] spark pull request: [SPARK-12893][YARN] Fix history URL redirect e...

2016-01-19 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10821#issuecomment-172941838 ..but if you can't get at it, copy & paste is probably the best way to make do. Make the new method private[spark] and we could add a test in the history se

[GitHub] spark pull request: [SPARK-12807] [YARN] Spark External Shuffle no...

2016-01-18 Thread steveloughran
Github user steveloughran closed the pull request at: https://github.com/apache/spark/pull/10782

[GitHub] spark pull request: [SPARK-12807] [YARN] Spark External Shuffle no...

2016-01-18 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10782#issuecomment-172698197 (closing)

[GitHub] spark pull request: [SPARK-12893][YARN] Fix history URL redirect e...

2016-01-18 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10821#issuecomment-172702504 I've been using `HistoryServer.getAttemptURI()` to do this mapping in the timeline integration -but that method is `private[history]`. The code here appears

[GitHub] spark pull request: SPARK-12807] [YARN] Spark External Shuffle not...

2016-01-16 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10782#issuecomment-172280115 Sean: the PR #10780 is the version against /master; it's the one with the full diff. This is just here to show it works for 1.6 too. How about I close this one

[GitHub] spark pull request: SPARK-12807] [YARN] Spark External Shuffle not...

2016-01-16 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/10782#discussion_r49939450 --- Diff: network/yarn/pom.xml --- @@ -96,6 +107,54

[GitHub] spark pull request: SPARK-12807] [YARN] Spark External Shuffle not...

2016-01-16 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/10782#discussion_r49939469 --- Diff: network/yarn/pom.xml --- @@ -86,6 +88,15 @@ + --- End diff

[GitHub] spark pull request: SPARK-12807] [YARN] Spark External Shuffle not...

2016-01-15 Thread steveloughran
GitHub user steveloughran opened a pull request: https://github.com/apache/spark/pull/10782 SPARK-12807] [YARN] Spark External Shuffle not working in Hadoop clusters with Jackson 2.2.3: branch-1.6 patch This is the patch of PR #10780 applied to branch-1.6, to verify it works

[GitHub] spark pull request: [SPARK-12807] [YARN] Spark External Shuffle no...

2016-01-15 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10780#issuecomment-172157447 Jenkins, retest this please

[GitHub] spark pull request: [SPARK-12807] [YARN] [WIP] Spark External Shuf...

2016-01-15 Thread steveloughran
GitHub user steveloughran opened a pull request: https://github.com/apache/spark/pull/10780 [SPARK-12807] [YARN] [WIP] Spark External Shuffle not working in Hadoop clusters with Jackson 2.2.3 Patch to 1. Shade jackson 2.x in spark-yarn-shuffle JAR 2. Use maven failsafe

[GitHub] spark pull request: [SPARK-12807] [YARN] [WIP] Spark External Shuf...

2016-01-15 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10780#issuecomment-172143421 Updated patch which has shading of `com.fasterxml.jackson` -> `org.spark-project.com.fasterxml.jackson` There's a test for this working; it uses the

[GitHub] spark pull request: [SPARK-10873] Support column sort and search f...

2016-01-13 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10648#issuecomment-171525569 I have some opinions too, as with YARN timeline service integration, it's essentially hooked up to a database, both for publishing and retrieval. It might

[GitHub] spark pull request: [SPARK-10149] [CORE] [WIP] Locality Level is a...

2016-01-12 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/8533#issuecomment-170940934 Can I also suggest that the code is "IPv6 ready". Facebook have switched to IPv6 in their DCs, and [HADOOP-11890](https://issues.apache.org/jira/bro

[GitHub] spark pull request: [SPARK-12420][SQL] Have a built-in CSV data so...

2016-01-06 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10615#issuecomment-169314710 Is this going to require the new parser JAR on the classpath everywhere, or will everything excluding CSV parsing still work without it?

[GitHub] spark pull request: [SPARK-10149] [CORE] [WIP] Locality Level is a...

2016-01-06 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/8533#discussion_r48953313 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskLocation.scala --- @@ -35,16 +37,28 @@ case class ExecutorCacheTaskLocation(override val

[GitHub] spark pull request: [SPARK-9104][SPARK-9105][SPARK-9106][SPARK-910...

2016-01-03 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/7753#issuecomment-168497378 One thought here: it'd probably be nice to have a json history with the new events as part of the history server suite regression tests —that'll catch any

[GitHub] spark pull request: [SPARK-1537] [YARN] Add history provider for Y...

2016-01-03 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10545#issuecomment-168496890 I should add that I'm thinking of moving the core `rest` package from `yarn/src/history` to `yarn/src/main`. Why? It adds Hadoop authentication to Jersey Client

[GitHub] spark pull request: [SPARK-1537] [YARN] Add history provider for Y...

2016-01-01 Thread steveloughran
GitHub user steveloughran opened a pull request: https://github.com/apache/spark/pull/10545 [SPARK-1537] [YARN] Add history provider for YARN Application Timeline Server This is the successor to PR #5413; it incorporates SPARK-11315 (PR #8744), which was split out for easier

[GitHub] spark pull request: [SPARK-1537] [YARN] [WiP] Add history provider...

2016-01-01 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/5423#issuecomment-168318509 now succeeded by #10545

[GitHub] spark pull request: [SPARK-1537] [YARN] [WiP] Add history provider...

2016-01-01 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/5423#issuecomment-168315481 yes, it is still relevant, yes it was awaiting review, no I wasn't expecting it to be closed

[GitHub] spark pull request: [SPARK-1537] [YARN] [WiP] Add history provider...

2016-01-01 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/5423#issuecomment-168317338 I'm about to resubmit it. The way the code is structured, the 2.6 specific stuff lives under yarn/src/history, as discussed in earlier points in this PR

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48363721 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala --- @@ -0,0 +1,658 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48364543 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/ApplicationCacheSuite.scala --- @@ -0,0 +1,476 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48367142 --- Diff: docs/monitoring.md --- @@ -69,36 +83,53 @@ follows: +### Spark configuration options + Property

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48368082 --- Diff: docs/monitoring.md --- @@ -69,36 +83,53 @@ follows: +### Spark configuration options + Property

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48364727 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/ApplicationCacheSuite.scala --- @@ -0,0 +1,476 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48364720 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/ApplicationCacheSuite.scala --- @@ -0,0 +1,476 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48364439 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -430,8 +517,55 @@ private[history] class FsHistoryProvider

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48364502 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/ApplicationCacheSuite.scala --- @@ -0,0 +1,476 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48366527 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala --- @@ -21,15 +21,28 @@ import java.net.{HttpURLConnection, URL

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48366891 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala --- @@ -281,6 +296,191 @@ class HistoryServerSuite extends

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48363399 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala --- @@ -0,0 +1,658 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48364066 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -430,8 +517,55 @@ private[history] class FsHistoryProvider

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48366085 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/ApplicationCacheSuite.scala --- @@ -0,0 +1,476 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48366097 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/ApplicationCacheSuite.scala --- @@ -0,0 +1,476 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r48364810 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/ApplicationCacheSuite.scala --- @@ -0,0 +1,476 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-23 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-166966798 I've updated the patch with the comments, and reworked how the updated probe works, removing the need to have provider-specific state cached, returned

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-15 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-164740423 yeah, I found it surprisingly complex too ... if I'd known at the start I'd have steered clear. And don't worry about any delay; there are some other spark PRs

[GitHub] spark pull request: [SPARK-12108] Make event logs smaller

2015-12-11 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10115#issuecomment-164026158 Has this gone into 1.6.0 or 1.6.1? Because the JIRA is still open and doesn't say

[GitHub] spark pull request: [SPARK-11574][Core] Add metrics StatsD sink

2015-12-10 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9518#discussion_r47209179 --- Diff: core/src/main/scala/org/apache/spark/metrics/sink/StatsdReporter.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-12221] add cpu time to metrics

2015-12-09 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10212#issuecomment-163210092 (I may be trusted enough to start a run..let's see)

[GitHub] spark pull request: [SPARK-9926] [SPARK-10340] [SQL] Use S3 bulk l...

2015-12-09 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/8512#discussion_r47081031 --- Diff: core/pom.xml --- @@ -40,6 +40,16 @@ ${avro.mapred.classifier} + com.amazonaws + aws-java-sdk

[GitHub] spark pull request: [SPARK-9926] [SPARK-10340] [SQL] Use S3 bulk l...

2015-12-09 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/8512#discussion_r47081146 --- Diff: core/src/test/scala/org/apache/spark/deploy/SparkS3UtilSuite.scala --- @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-9926] [SPARK-10340] [SQL] Use S3 bulk l...

2015-12-09 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/8512#discussion_r47081737 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkS3Util.scala --- @@ -0,0 +1,342 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

2015-12-09 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9306#issuecomment-163222684 Moving to Hadoop 0.90 [HADOOP-9623](https://issues.apache.org/jira/browse/HADOOP-9623) was what could be described as "an accidental disaster";

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-09 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-163203560 Failing test is pyspark —pretty unlikely to be related ``` == FAIL

[GitHub] spark pull request: [SPARK-12221] add cpu time to metrics

2015-12-09 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/10212#discussion_r47083416 --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala --- @@ -718,6 +719,7 @@ private[spark] object JsonProtocol

[GitHub] spark pull request: [SPARK-12221] add cpu time to metrics

2015-12-09 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/10212#issuecomment-163209990 Jenkins, test this please

[GitHub] spark pull request: [SPARK-11353][IO] Update jets3t version to 0.9...

2015-12-09 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9306#issuecomment-163253041 > However s3a is not a breeze either (even in newer Hadoop 2.7+ versions), especially with Frankfurt buckets, which support only AWS Signature V4. rea

[GitHub] spark pull request: [SPARK-12241] [YARN] Improve failure reporting...

2015-12-09 Thread steveloughran
GitHub user steveloughran opened a pull request: https://github.com/apache/spark/pull/10227 [SPARK-12241] [YARN] Improve failure reporting in Yarn client obtainTokenForHBase() This lines up the HBase token logic with that done for Hive in SPARK-11265: reflection with only CFNE

[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...

2015-12-09 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-163358148 latest branch cut out health checks. They'd be nice, but as they are useful across all of spark, it's probably best to wait for something central to go

[GitHub] spark pull request: [SPARK-11574][Core] Add metrics StatsD sink

2015-12-09 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9518#discussion_r47142484 --- Diff: core/src/main/scala/org/apache/spark/metrics/sink/StatsdReporter.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-11574][Core] Add metrics StatsD sink

2015-12-09 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9518#issuecomment-163373154 (I'm not an admin, so can't verify the patch) I've just compared it to the Hadoop StatsDSink; it looks pretty similar although there's some encouragement

[GitHub] spark pull request: [SPARK-11315] [YARN] Add YARN extension servic...

2015-12-08 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/8744#issuecomment-162963641 Now that its dependency PR is in, I welcome comments and reviews on this. The latest change just adds a version counter to every entity publishing, so that when

[GitHub] spark pull request: [SPARK-9926] [SPARK-10340] [SQL] Use S3 bulk l...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/8512#discussion_r46991325 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkS3Util.scala --- @@ -0,0 +1,336 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-08 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-162962369 FWIW, I've got the SPARK-1537 Yarn history provider hooked up to this in [a branch](https://github.com/steveloughran/spark/tree/history/SPARK-7889%2BSPARK-1537

[GitHub] spark pull request: [SPARK-9926] [SPARK-10340] [SQL] Use S3 bulk l...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/8512#discussion_r46992376 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkS3Util.scala --- @@ -0,0 +1,336 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-9926] [SPARK-10340] [SQL] Use S3 bulk l...

2015-12-08 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/8512#issuecomment-162974033 Has anyone looked at the performance of this versus S3a in Hadoop 2.7+? Because while I do agree this will dramatically improve s3n: and s3: perf, all ongoing

[GitHub] spark pull request: [SPARK-9468][Yarn][Core] Avoid scheduling task...

2015-12-08 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/7786#issuecomment-162965678 Jerry: do you know when YARN visits these decisions about having things pre-emptible? That is: if you are given a warning does that mean the container will go any

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r46993661 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala --- @@ -0,0 +1,648 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r46995031 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala --- @@ -0,0 +1,648 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r46995495 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala --- @@ -0,0 +1,579 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r47010882 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala --- @@ -281,6 +296,202 @@ class HistoryServerSuite extends

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r47012771 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala --- @@ -73,4 +101,17 @@ private[history] abstract class

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r47012758 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala --- @@ -33,7 +33,35 @@ private[spark] case class

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r47012646 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala --- @@ -0,0 +1,648 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r47023063 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala --- @@ -281,6 +296,202 @@ class HistoryServerSuite extends

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r47017364 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -678,6 +827,54 @@ private[history] class FsHistoryProvider

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r47017107 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -430,8 +519,54 @@ private[history] class FsHistoryProvider

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r47018390 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala --- @@ -230,6 +230,13 @@ private[spark] class EventLoggingListener

[GitHub] spark pull request: [SPARK-7889] [CORE] HistoryServer to refresh c...

2015-12-08 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r47018518 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/ApplicationCacheSuite.scala --- @@ -0,0 +1,460 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-04 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r46673276 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -699,12 +853,24 @@ private class FsApplicationAttemptInfo

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-04 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r46672615 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -430,8 +476,50 @@ private[history] class FsHistoryProvider

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-04 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r46672772 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala --- @@ -73,4 +78,34 @@ private[history] abstract class

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-04 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r46672762 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala --- @@ -0,0 +1,579 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-04 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r46673138 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/ApplicationCacheSuite.scala --- @@ -0,0 +1,460 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-04 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r46673155 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -175,18 +180,34 @@ private[history] class

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-04 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r46672796 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala --- @@ -281,6 +296,199 @@ class HistoryServerSuite extends

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-04 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r46672830 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala --- @@ -0,0 +1,579 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-04 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/6935#discussion_r46700913 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -610,11 +701,38 @@ private[history] class

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-03 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-161779559 This is the next iteration; if you look at the intermittent patches I was trying to track the time the filesize changed, but it (a) made the code complex (b

[GitHub] spark pull request: [SPARK-11314] [YARN] add service API and test ...

2015-12-02 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9182#discussion_r46434094 --- Diff: yarn/src/test/scala/org/apache/spark/scheduler/cluster/ExtensionServiceIntegrationSuite.scala --- @@ -0,0 +1,87 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-11314] [YARN] add service API and test ...

2015-12-02 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9182#discussion_r46432466 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala --- @@ -51,6 +51,64 @@ private[spark] abstract class

[GitHub] spark pull request: [SPARK-11314] [YARN] add service API and test ...

2015-12-02 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9182#discussion_r46432648 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala --- @@ -51,6 +51,64 @@ private[spark] abstract class

[GitHub] spark pull request: [SPARK-11314] [YARN] add service API and test ...

2015-12-02 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9182#discussion_r46433390 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/SchedulerExtensionService.scala --- @@ -0,0 +1,158 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-02 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-161240157 The reason the modtime doesn't change is that the log file is kept open; the mod time is set when the output stream is created and not updated as new data is added

[GitHub] spark pull request: [SPARK-7889] [CORE] WiP HistoryServer to refre...

2015-12-01 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/6935#issuecomment-161107339 ... now it's looking like something's up with the modtime info in the local FS; I'm thinking of tracking the file length as well: a bigger file -> more reco

[GitHub] spark pull request: [SPARK-11314] [YARN] add service API and test ...

2015-12-01 Thread steveloughran
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9182#issuecomment-161107529 Thanks, will deal with these on Wednesday.
