spark git commit: [SPARK-24372][BUILD] Add scripts to help with preparing releases.

2018-06-22 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 33e77fa89 -> 4e7d8678a [SPARK-24372][BUILD] Add scripts to help with preparing releases. The "do-release.sh" script asks questions about the RC being prepared, trying to find out as much as possible automatically, and then executes the

spark git commit: [SPARK-24519] Make the threshold for highly compressed map status configurable

2018-06-22 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 92c2f00bd -> 39dfaf2fd [SPARK-24519] Make the threshold for highly compressed map status configurable **Problem** MapStatus uses a hardcoded value of 2000 partitions to determine if it should use highly compressed map status. We should make
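
A minimal sketch of raising the threshold via SparkConf (Scala); the config key shown is an assumption inferred from the commit title and is not quoted in the snippet:

    import org.apache.spark.SparkConf

    // Switch to HighlyCompressedMapStatus only above 5000 partitions instead of the
    // previously hardcoded 2000 (key name assumed, not confirmed by this snippet).
    val conf = new SparkConf()
      .setAppName("map-status-threshold-sketch")
      .set("spark.shuffle.minNumPartitionsToHighlyCompress", "5000")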

spark git commit: [SPARK-22897][CORE] Expose stageAttemptId in TaskContext

2018-06-22 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.2 751b00820 -> a6459 [SPARK-22897][CORE] Expose stageAttemptId in TaskContext. stageAttemptId was added to TaskContext with a corresponding constructor modification. Added a new test in TaskContextSuite; two cases are tested: 1. Normal case
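
A minimal spark-shell style sketch of reading the new field from inside a task; the accessor name stageAttemptNumber is how the API surfaced in released versions and is an assumption here:

    import org.apache.spark.{SparkConf, SparkContext, TaskContext}

    val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("stage-attempt-sketch"))
    // Each task reports which attempt of its stage it ran in (0 on the first attempt).
    val attempts = sc.parallelize(1 to 4, 2)
      .map(_ => TaskContext.get().stageAttemptNumber())
      .collect()
    println(attempts.mkString(","))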

spark git commit: [SPARK-24589][CORE] Correctly identify tasks in output commit coordinator.

2018-06-21 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.2 7bfefc928 -> 751b00820 [SPARK-24589][CORE] Correctly identify tasks in output commit coordinator. When an output stage is retried, it's possible that tasks from the previous attempt are still running. In that case, there would be a new

spark git commit: [SPARK-24589][CORE] Correctly identify tasks in output commit coordinator.

2018-06-21 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.3 8928de3cd -> 3a4b6f3be [SPARK-24589][CORE] Correctly identify tasks in output commit coordinator. When an output stage is retried, it's possible that tasks from the previous attempt are still running. In that case, there would be a new

spark git commit: [SPARK-24589][CORE] Correctly identify tasks in output commit coordinator.

2018-06-21 Thread tgraves
Repository: spark Updated Branches: refs/heads/master b56e9c613 -> c8e909cd4 [SPARK-24589][CORE] Correctly identify tasks in output commit coordinator. When an output stage is retried, it's possible that tasks from the previous attempt are still running. In that case, there would be a new

spark git commit: Preparing development version 2.2.3-SNAPSHOT

2018-06-18 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.2 e2e4d5849 -> 7bfefc928 Preparing development version 2.2.3-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7bfefc92 Tree:

[1/2] spark git commit: Preparing Spark release v2.2.2-rc1

2018-06-18 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.2 090b883fa -> e2e4d5849 Preparing Spark release v2.2.2-rc1 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8ce9e2a4 Tree:

[2/2] spark git commit: Preparing development version 2.2.3-SNAPSHOT

2018-06-18 Thread tgraves
Preparing development version 2.2.3-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e2e4d584 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e2e4d584 Diff:

[spark] Git Push Summary

2018-06-18 Thread tgraves
Repository: spark Updated Tags: refs/tags/v2.2.2-rc1 [created] 8ce9e2a4a

spark git commit: [SPARK-22683][CORE] Add an executorAllocationRatio parameter to throttle the parallelism of the dynamic allocation

2018-04-24 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 4926a7c2f -> 55c4ca88a [SPARK-22683][CORE] Add an executorAllocationRatio parameter to throttle the parallelism of the dynamic allocation ## What changes were proposed in this pull request? By default, the dynamic allocation will request
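
A minimal sketch of the new knob (Scala); the key spark.dynamicAllocation.executorAllocationRatio follows from the commit title but should be treated as an assumption:

    import org.apache.spark.SparkConf

    // Ask for only half the executors that the number of pending tasks would justify,
    // throttling how aggressively dynamic allocation ramps up.
    val conf = new SparkConf()
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.dynamicAllocation.executorAllocationRatio", "0.5")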

spark git commit: [SPARK-21890] Credentials not being passed to add the tokens

2017-09-07 Thread tgraves
Repository: spark Updated Branches: refs/heads/master eea2b877c -> b9ab791a9 [SPARK-21890] Credentials not being passed to add the tokens I observed this while running an oozie job trying to connect to hbase via spark. It looks like the creds are not being passed in

spark git commit: [SPARK-21798] No config to replace deprecated SPARK_CLASSPATH config for launching daemons like History Server

2017-08-28 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.2 0d4ef2f69 -> 59bb7ebfb [SPARK-21798] No config to replace deprecated SPARK_CLASSPATH config for launching daemons like History Server History Server Launch uses SparkClassCommandBuilder for launching the server. It is observed that

spark git commit: [SPARK-21798] No config to replace deprecated SPARK_CLASSPATH config for launching daemons like History Server

2017-08-28 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 0456b4050 -> 24e6c187f [SPARK-21798] No config to replace deprecated SPARK_CLASSPATH config for launching daemons like History Server History Server Launch uses SparkClassCommandBuilder for launching the server. It is observed that

spark git commit: [SPARK-21501] Change CacheLoader to limit entries based on memory footprint

2017-08-23 Thread tgraves
Repository: spark Updated Branches: refs/heads/master d6b30edd4 -> 1662e9311 [SPARK-21501] Change CacheLoader to limit entries based on memory footprint Right now the spark shuffle service has a cache for index files. It is based on a # of files cached

spark git commit: [SPARK-21656][CORE] spark dynamic allocation should not idle timeout executors when there are still tasks to run

2017-08-16 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.2 f1accc851 -> f5ede0d55 [SPARK-21656][CORE] spark dynamic allocation should not idle timeout executors when there are still tasks to run ## What changes were proposed in this pull request? Right now spark lets go of executors when they are idle

spark git commit: [SPARK-21656][CORE] spark dynamic allocation should not idle timeout executors when there are still tasks to run

2017-08-16 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 0bb8d1f30 -> adf005dab [SPARK-21656][CORE] spark dynamic allocation should not idle timeout executors when there are still tasks to run ## What changes were proposed in this pull request? Right now spark lets go of executors when they are idle for

spark git commit: [SPARK-20713][SPARK CORE] Convert CommitDenied to TaskKilled.

2017-08-03 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 13785daa8 -> bb7afb4e1 [SPARK-20713][SPARK CORE] Convert CommitDenied to TaskKilled. ## What changes were proposed in this pull request? In executor, toTaskFailedReason is converted to toTaskCommitDeniedReason to avoid the inconsistency

spark git commit: [SPARK-21585] Application Master marking application status as Failed for Client Mode

2017-08-01 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 253a07e43 -> 97ccc63f7 [SPARK-21585] Application Master marking application status as Failed for Client Mode The fix deployed for SPARK-21541 resulted in the Application Master setting the final status of a spark application as Failed for

spark git commit: [SPARK-21541][YARN] Spark Logs show incorrect job status for a job that does not create SparkContext

2017-07-28 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 784680903 -> 69ab0e4bd [SPARK-21541][YARN] Spark Logs show incorrect job status for a job that does not create SparkContext If you run a spark job without creating the SparkSession or SparkContext, the spark job logs say it succeeded

spark git commit: [SPARK-21243][CORE] Limit no. of map outputs in a shuffle fetch

2017-07-21 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.2 9949fed1c -> 88dccda39 [SPARK-21243][CORE] Limit no. of map outputs in a shuffle fetch For configurations with external shuffle enabled, we have observed that if a very large no. of blocks are being fetched from a remote host, it puts

spark git commit: [SPARK-21243][Core] Limit no. of map outputs in a shuffle fetch

2017-07-19 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 70fe99dc6 -> ef6177558 [SPARK-21243][Core] Limit no. of map outputs in a shuffle fetch ## What changes were proposed in this pull request? For configurations with external shuffle enabled, we have observed that if a very large no. of

spark git commit: [SPARK-21321][SPARK CORE] Spark very verbose on shutdown

2017-07-17 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.2 8e85ce625 -> 0ef98fd43 [SPARK-21321][SPARK CORE] Spark very verbose on shutdown ## What changes were proposed in this pull request? The current code is very verbose on shutdown. The changes I propose is to change the log level when

spark git commit: [SPARK-21321][SPARK CORE] Spark very verbose on shutdown

2017-07-17 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 7047f49f4 -> 0e07a29cf [SPARK-21321][SPARK CORE] Spark very verbose on shutdown ## What changes were proposed in this pull request? The current code is very verbose on shutdown. The changes I propose is to change the log level when the

spark git commit: [SPARK-13669][SPARK-20898][CORE] Improve the blacklist mechanism to handle external shuffle service unavailable situation

2017-06-26 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 5282bae04 -> 9e50a1d37 [SPARK-13669][SPARK-20898][CORE] Improve the blacklist mechanism to handle external shuffle service unavailable situation ## What changes were proposed in this pull request? Currently we are running into an issue

spark git commit: [SPARK-20355] Add per application spark version on the history server headerpage

2017-05-09 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 714811d0b -> 181261a81 [SPARK-20355] Add per application spark version on the history server headerpage ## What changes were proposed in this pull request? Spark Version for a specific application is not displayed on the history page

spark git commit: [SPARK-20426] Lazy initialization of FileSegmentManagedBuffer for shuffle service.

2017-04-27 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.2 92b61f02d -> c69d862b2 [SPARK-20426] Lazy initialization of FileSegmentManagedBuffer for shuffle service. ## What changes were proposed in this pull request? When an application contains a large amount of shuffle blocks, NodeManager

spark git commit: [SPARK-20426] Lazy initialization of FileSegmentManagedBuffer for shuffle service.

2017-04-27 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 561e9cc39 -> 85c6ce619 [SPARK-20426] Lazy initialization of FileSegmentManagedBuffer for shuffle service. ## What changes were proposed in this pull request? When an application contains a large amount of shuffle blocks, NodeManager requires

spark git commit: [SPARK-19812] YARN shuffle service fails to relocate recovery DB acro…

2017-04-26 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 7a365257e -> 7fecf5130 [SPARK-19812] YARN shuffle service fails to relocate recovery DB acro… …ss NFS directories ## What changes were proposed in this pull request? Change from using java Files.move to use Hadoop filesystem

spark git commit: [SPARK-19812] YARN shuffle service fails to relocate recovery DB acro…

2017-04-26 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.2 a2f5ced32 -> 612952251 [SPARK-19812] YARN shuffle service fails to relocate recovery DB acro… …ss NFS directories ## What changes were proposed in this pull request? Change from using java Files.move to use Hadoop filesystem

spark git commit: [SPARK-18750][YARN] Avoid using "mapValues" when allocating containers.

2017-01-25 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 3fdce8143 -> 76db394f2 [SPARK-18750][YARN] Avoid using "mapValues" when allocating containers. That method is prone to stack overflows when the input map is really large; instead, use plain "map". Also includes a unit test that was tested

spark git commit: [SPARK-19179][YARN] Change spark.yarn.access.namenodes config and update docs

2017-01-17 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 6c00c069e -> b79cc7ceb [SPARK-19179][YARN] Change spark.yarn.access.namenodes config and update docs ## What changes were proposed in this pull request? The `spark.yarn.access.namenodes` configuration cannot actually reflect the usage of
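
A minimal sketch of the renamed setting (Scala); the replacement key spark.yarn.access.hadoopFileSystems is an assumption, since the snippet only names the old key:

    import org.apache.spark.SparkConf

    // List every secure filesystem the job needs delegation tokens for,
    // not just HDFS namenodes as the old spark.yarn.access.namenodes key implied.
    val conf = new SparkConf()
      .set("spark.yarn.access.hadoopFileSystems",
           "hdfs://nn1.example.com:8020,webhdfs://nn2.example.com:50070")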

spark git commit: [SPARK-19021][YARN] Generalize HDFSCredentialProvider to support non-HDFS security filesystems

2017-01-11 Thread tgraves
Repository: spark Updated Branches: refs/heads/master a61551356 -> 4239a1081 [SPARK-19021][YARN] Generalize HDFSCredentialProvider to support non-HDFS security filesystems Currently Spark can only get the token renewal interval from secure HDFS (hdfs://), if Spark runs with other security

spark git commit: [SPARK-19033][CORE] Add admin acls for history server

2017-01-06 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 903bb8e8a -> 4a4c3dc9c [SPARK-19033][CORE] Add admin acls for history server ## What changes were proposed in this pull request? The current HistoryServer ACLs are derived from the application event log, which means the newly changed ACLs

spark git commit: [SPARK-19033][CORE] Add admin acls for history server

2017-01-06 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.1 1ecf1a953 -> 4ca178880 [SPARK-19033][CORE] Add admin acls for history server ## What changes were proposed in this pull request? The current HistoryServer ACLs are derived from the application event log, which means the newly changed ACLs

spark git commit: [SPARK-17843][WEB UI] Indicate event logs pending for processing on history server UI

2016-11-11 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.1 51dca6143 -> 00c9c7d96 [SPARK-17843][WEB UI] Indicate event logs pending for processing on history server UI ## What changes were proposed in this pull request? History Server UI's application listing to display information on

spark git commit: [SPARK-18357] Fix yarn files/archive broken issue and unit tests

2016-11-08 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.1 9595a7106 -> 876eee2b1 [SPARK-18357] Fix yarn files/archive broken issue and unit tests ## What changes were proposed in this pull request? The #15627 change broke functionality with yarn; --files/--archives does not accept any files. This

spark git commit: [SPARK-18357] Fix yarn files/archive broken issue and unit tests

2016-11-08 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 9c419698f -> 245e5a2f8 [SPARK-18357] Fix yarn files/archive broken issue and unit tests ## What changes were proposed in this pull request? The #15627 change broke functionality with yarn; --files/--archives does not accept any files. This

spark git commit: [SPARK-18099][YARN] Fail if same files added to distributed cache for --files and --archives

2016-11-03 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.1 3e139e239 -> 569f77a11 [SPARK-18099][YARN] Fail if same files added to distributed cache for --files and --archives ## What changes were proposed in this pull request? During spark-submit, if yarn dist cache is instructed to add same

spark git commit: [SPARK-18099][YARN] Fail if same files added to distributed cache for --files and --archives

2016-11-03 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 16293311c -> 098e4ca9c [SPARK-18099][YARN] Fail if same files added to distributed cache for --files and --archives ## What changes were proposed in this pull request? During spark-submit, if yarn dist cache is instructed to add same

spark git commit: [SPARK-17417][CORE] Fix # of partitions for Reliable RDD checkpointing

2016-10-10 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 7e16c94f1 -> 4bafacaa5 [SPARK-17417][CORE] Fix # of partitions for Reliable RDD checkpointing ## What changes were proposed in this pull request? Currently the no. of partition files is limited to 10000 files (%05d format). If there are

spark git commit: [SPARK-17417][CORE] Fix # of partitions for Reliable RDD checkpointing

2016-10-10 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.0 d27df3579 -> d719e9a08 [SPARK-17417][CORE] Fix # of partitions for Reliable RDD checkpointing ## What changes were proposed in this pull request? Currently the no. of partition files is limited to 10000 files (%05d format). If there

spark git commit: [SPARK-17710][HOTFIX] Fix ClassCircularityError in ReplSuite tests in Maven build: use 'Class.forName' instead of 'Utils.classForName'

2016-09-28 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 7d0923202 -> 7dfad4b13 [SPARK-17710][HOTFIX] Fix ClassCircularityError in ReplSuite tests in Maven build: use 'Class.forName' instead of 'Utils.classForName' ## What changes were proposed in this pull request? Fix ClassCircularityError in

spark git commit: [SPARK-16757] Set up Spark caller context to HDFS and YARN

2016-09-27 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 7f16affa2 -> 6a68c5d7b [SPARK-16757] Set up Spark caller context to HDFS and YARN ## What changes were proposed in this pull request? 1. Pass `jobId` to Task. 2. Invoke Hadoop APIs. * A new function `setCallerContext` is added in

spark git commit: [SPARK-17511] Yarn Dynamic Allocation: Avoid marking released container as Failed

2016-09-14 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.0 6fe5972e6 -> fab77dadf [SPARK-17511] Yarn Dynamic Allocation: Avoid marking released container as Failed Due to race conditions, the `assert(numExecutorsRunning <= targetNumExecutors)` can fail causing `AssertionError`. So removed

spark git commit: [SPARK-17511] Yarn Dynamic Allocation: Avoid marking released container as Failed

2016-09-14 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 040e46979 -> ff6e4cbdc [SPARK-17511] Yarn Dynamic Allocation: Avoid marking released container as Failed ## What changes were proposed in this pull request? Due to race conditions, the ` assert(numExecutorsRunning <=

spark git commit: [SPARK-17433] YarnShuffleService doesn't handle moving credentials levelDb

2016-09-09 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 7098a1294 -> a3981c28c [SPARK-17433] YarnShuffleService doesn't handle moving credentials levelDb The secrets leveldb isn't being moved if you run spark shuffle services without yarn nm recovery on and then turn it on. This fixes that.

spark git commit: [SPARK-16711] YarnShuffleService doesn't re-init properly on YARN rolling upgrade

2016-09-08 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.0 28377da38 -> e169085cd [SPARK-16711] YarnShuffleService doesn't re-init properly on YARN rolling upgrade branch-2.0 version of this patch. The differences are in the YarnShuffleService for finding the location to put the DB.

spark git commit: [SPARK-17243][WEB UI] Spark 2.0 History Server won't load with very large application history

2016-08-31 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.0 bc6c0d9f9 -> 021aa28f4 [SPARK-17243][WEB UI] Spark 2.0 History Server won't load with very large application history ## What changes were proposed in this pull request? back port of #14835 addressing merge conflicts With the new

spark git commit: [SPARK-17243][WEB UI] Spark 2.0 History Server won't load with very large application history

2016-08-30 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 02ac379e8 -> f7beae6da [SPARK-17243][WEB UI] Spark 2.0 History Server won't load with very large application history ## What changes were proposed in this pull request? With the new History Server the summary page loads the application

spark git commit: [SPARK-15083][WEB UI] History Server can OOM due to unlimited TaskUIData

2016-08-24 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 40b30fcf4 -> 891ac2b91 [SPARK-15083][WEB UI] History Server can OOM due to unlimited TaskUIData ## What changes were proposed in this pull request? Based on #12990 by tankkyo Since the History Server currently loads all application's

spark git commit: [SPARK-11227][CORE] UnknownHostException can be thrown when NameNode HA is enabled.

2016-08-19 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.0 e0c60f185 -> d0707c6ba [SPARK-11227][CORE] UnknownHostException can be thrown when NameNode HA is enabled. ## What changes were proposed in this pull request? If the following conditions are satisfied, executors don't load properties

spark git commit: [SPARK-11227][CORE] UnknownHostException can be thrown when NameNode HA is enabled.

2016-08-19 Thread tgraves
Repository: spark Updated Branches: refs/heads/master e98eb2146 -> 071eaaf9d [SPARK-11227][CORE] UnknownHostException can be thrown when NameNode HA is enabled. ## What changes were proposed in this pull request? If the following conditions are satisfied, executors don't load properties in

spark git commit: [SPARK-16673][WEB UI] New Executor Page removed conditional for Logs and Thread Dump columns

2016-08-19 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 67e59d464 -> e98eb2146 [SPARK-16673][WEB UI] New Executor Page removed conditional for Logs and Thread Dump columns ## What changes were proposed in this pull request? When #13670 switched `ExecutorsPage` to use JQuery DataTables it

spark git commit: [SPARK-15703][SCHEDULER][CORE][WEBUI] Make ListenerBus event queue size configurable

2016-07-26 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 0869b3a5f -> 0b71d9ae0 [SPARK-15703][SCHEDULER][CORE][WEBUI] Make ListenerBus event queue size configurable ## What changes were proposed in this pull request? This change adds a new configuration entry to specify the size of the spark
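
A minimal sketch of sizing the queue (Scala); the key spark.scheduler.listenerbus.eventqueue.size is an assumption based on the commit description:

    import org.apache.spark.SparkConf

    // Enlarge the listener bus queue so bursty stages are less likely to drop events
    // (the default was 10000 at the time; the exact key name is assumed here).
    val conf = new SparkConf()
      .set("spark.scheduler.listenerbus.eventqueue.size", "20000")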

spark git commit: [SPARK-15951] Change Executors Page to use datatables to support sorting columns and searching

2016-07-20 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 4b079dc39 -> b9bab4dcf [SPARK-15951] Change Executors Page to use datatables to support sorting columns and searching 1. Create the executorspage-template.html for displaying application information in datables. 2. Added REST API

spark git commit: [SPARK-16505][YARN] Optionally propagate error during shuffle service startup.

2016-07-14 Thread tgraves
Repository: spark Updated Branches: refs/heads/master c4bc2ed84 -> b7b5e1787 [SPARK-16505][YARN] Optionally propagate error during shuffle service startup. This prevents the NM from starting when something is wrong, which would lead to later errors which are confusing and harder to debug.

spark git commit: [SPARK-14963][MINOR][YARN] Fix typo in YarnShuffleService recovery file name

2016-07-14 Thread tgraves
Repository: spark Updated Branches: refs/heads/master e3f8a0336 -> c4bc2ed84 [SPARK-14963][MINOR][YARN] Fix typo in YarnShuffleService recovery file name ## What changes were proposed in this pull request? Due to the changes of

spark git commit: [SPARK-16435][YARN][MINOR] Add warning log if initialExecutors is less than minExecutors

2016-07-13 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.0 7d9bd951b -> 90f0e8132 [SPARK-16435][YARN][MINOR] Add warning log if initialExecutors is less than minExecutors ## What changes were proposed in this pull request? Currently if `spark.dynamicAllocation.initialExecutors` is less than

spark git commit: [SPARK-16435][YARN][MINOR] Add warning log if initialExecutors is less than minExecutors

2016-07-13 Thread tgraves
Repository: spark Updated Branches: refs/heads/master f376c3726 -> d8220c1e5 [SPARK-16435][YARN][MINOR] Add warning log if initialExecutors is less than minExecutors ## What changes were proposed in this pull request? Currently if `spark.dynamicAllocation.initialExecutors` is less than
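
A minimal sketch of a configuration that now triggers the warning (Scala), assuming the standard dynamic-allocation keys:

    import org.apache.spark.SparkConf

    // initialExecutors (2) is below minExecutors (5); Spark starts with the larger
    // value and, after this change, logs a warning about the overridden setting.
    val conf = new SparkConf()
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.dynamicAllocation.initialExecutors", "2")
      .set("spark.dynamicAllocation.minExecutors", "5")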

spark git commit: [SPARK-15990][YARN] Add rolling log aggregation support for Spark on yarn

2016-06-29 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 393db655c -> 272a2f78f [SPARK-15990][YARN] Add rolling log aggregation support for Spark on yarn ## What changes were proposed in this pull request? Yarn supports rolling log aggregation since 2.6, previously log will only be aggregated

spark git commit: [SPARK-13723][YARN] Change behavior of --num-executors with dynamic allocation.

2016-06-23 Thread tgraves
Repository: spark Updated Branches: refs/heads/master a410814c8 -> 738f134bf [SPARK-13723][YARN] Change behavior of --num-executors with dynamic allocation. ## What changes were proposed in this pull request? This changes the behavior of --num-executors and spark.executor.instances when
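
A minimal sketch of the combination whose behavior changed (Scala), treating spark.executor.instances as the conf equivalent of --num-executors:

    import org.apache.spark.SparkConf

    // With this change, an explicit instance count no longer disables dynamic
    // allocation; it only seeds the initial number of executors.
    val conf = new SparkConf()
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.executor.instances", "10")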

spark git commit: [SPARK-15725][YARN] Ensure ApplicationMaster sleeps for the min interval.

2016-06-23 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.0 214676d29 -> b8818d892 [SPARK-15725][YARN] Ensure ApplicationMaster sleeps for the min interval. ## What changes were proposed in this pull request? Update `ApplicationMaster` to sleep for at least the minimum allocation interval

spark git commit: [SPARK-16138] Try to cancel executor requests only if we have at least 1

2016-06-23 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 5eef1e6c6 -> 5bf2889bf [SPARK-16138] Try to cancel executor requests only if we have at least 1 ## What changes were proposed in this pull request? Adding additional check to if statement ## How was this patch tested? I built and deployed

spark git commit: [SPARK-16080][YARN] Set correct link name for conf archive in executors.

2016-06-21 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 93338807a -> bcb0258ae [SPARK-16080][YARN] Set correct link name for conf archive in executors. This makes sure the files are in the executor's classpath as they're expected to be. Also update the unit test to make sure the files are there

spark git commit: [SPARK-16080][YARN] Set correct link name for conf archive in executors.

2016-06-21 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.0 943239bf4 -> 052779a0c [SPARK-16080][YARN] Set correct link name for conf archive in executors. This makes sure the files are in the executor's classpath as they're expected to be. Also update the unit test to make sure the files are

spark git commit: [SPARK-16018][SHUFFLE] Shade netty to load shuffle jar in Nodemanger

2016-06-17 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.0 269b715e4 -> 3457497e0 [SPARK-16018][SHUFFLE] Shade netty to load shuffle jar in Nodemanger ## What changes were proposed in this pull request? Shade the netty.io namespace so that we can use it in shuffle independent of the

spark git commit: [SPARK-16018][SHUFFLE] Shade netty to load shuffle jar in Nodemanger

2016-06-17 Thread tgraves
Repository: spark Updated Branches: refs/heads/master c8809db5a -> 298c4ae81 [SPARK-16018][SHUFFLE] Shade netty to load shuffle jar in Nodemanger ## What changes were proposed in this pull request? Shade the netty.io namespace so that we can use it in shuffle independent of the dependencies

spark git commit: [SPARK-15046][YARN] Parse value of token renewal interval correctly.

2016-06-15 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.0 df9a19fe8 -> 7a0ed75ea [SPARK-15046][YARN] Parse value of token renewal interval correctly. Use the config variable definition both to set and parse the value, avoiding issues with code expecting the value in a different format.

spark git commit: [SPARK-15046][YARN] Parse value of token renewal interval correctly.

2016-06-15 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 0ee9fd9e5 -> 40eeef952 [SPARK-15046][YARN] Parse value of token renewal interval correctly. Use the config variable definition both to set and parse the value, avoiding issues with code expecting the value in a different format. Tested by

spark git commit: [SPARK-13148][YARN] document zero-keytab Oozie application launch; add diagnostics

2016-05-26 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-2.0 9cf34727c -> 0cb69a918 [SPARK-13148][YARN] document zero-keytab Oozie application launch; add diagnostics This patch provides detail on what to do for keytabless Oozie launches of spark apps, and adds some debug-level diagnostics of

spark git commit: [SPARK-13148][YARN] document zero-keytab Oozie application launch; add diagnostics

2016-05-26 Thread tgraves
Repository: spark Updated Branches: refs/heads/master c76457c8e -> 01b350a4f [SPARK-13148][YARN] document zero-keytab Oozie application launch; add diagnostics This patch provides detail on what to do for keytabless Oozie launches of spark apps, and adds some debug-level diagnostics of what

spark git commit: [SPARK-14963][YARN] Using recoveryPath if NM recovery is enabled

2016-05-10 Thread tgraves
Repository: spark Updated Branches: refs/heads/master a019e6efb -> aab99d31a [SPARK-14963][YARN] Using recoveryPath if NM recovery is enabled ## What changes were proposed in this pull request? From Hadoop 2.5+, Yarn NM supports NM recovery, which uses a recovery path for auxiliary services

spark git commit: [SPARK-4224][CORE][YARN] Support group acls

2016-05-04 Thread tgraves
Repository: spark Updated Branches: refs/heads/master abecbcd5e -> a45647746 [SPARK-4224][CORE][YARN] Support group acls ## What changes were proposed in this pull request? Currently only a list of users can be specified for view and modify acls. This change enables a group of
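
A minimal sketch of group-based ACLs (Scala); the *.acls.groups key names mirror the existing user ACL keys and are assumptions here:

    import org.apache.spark.SparkConf

    // Grant UI view and modify rights to whole groups instead of enumerating users.
    val conf = new SparkConf()
      .set("spark.ui.view.acls.groups", "data-science,analytics")
      .set("spark.modify.acls.groups", "platform-admins")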

spark git commit: [SPARK-6735][YARN] Add window based executor failure tracking mechanism for long running service

2016-04-28 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 9e785079b -> 8b44bd52f [SPARK-6735][YARN] Add window based executor failure tracking mechanism for long running service This work is based on twinkle-sachdeva's proposal. In parallel to such mechanism for AM failures, here add similar

spark git commit: [SPARK-13988][CORE] Make replaying event logs multi-threaded in History server to ensure a single large log does not block other logs from being rendered.

2016-04-21 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 4ac6e75cd -> 6fdd0e32a [SPARK-13988][CORE] Make replaying event logs multi-threaded in History server to ensure a single large log does not block other logs from being rendered. ## What changes were proposed in this pull request? The

spark git commit: [SPARK-14572][DOC] Update config docs to allow -Xms in extraJavaOptions

2016-04-14 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 3cf3db17b -> f83ba454a [SPARK-14572][DOC] Update config docs to allow -Xms in extraJavaOptions ## What changes were proposed in this pull request? The configuration docs are updated to reflect the changes introduced with

spark git commit: [SPARK-12384] Enables spark-clients to set the min(-Xms) and max(*.memory config) j…

2016-04-07 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 35e0db2d4 -> 033d80815 [SPARK-12384] Enables spark-clients to set the min(-Xms) and max(*.memory config) j… ## What changes were proposed in this pull request? Currently Spark clients are started with the same memory setting for Xms

spark git commit: [SPARK-14245][WEB UI] Display the user in the application view

2016-04-07 Thread tgraves
Repository: spark Updated Branches: refs/heads/master db75ccb55 -> 35e0db2d4 [SPARK-14245][WEB UI] Display the user in the application view ## What changes were proposed in this pull request? The Spark UI (both active and history) should show the user who ran the application somewhere when

spark git commit: [SPARK-13063][YARN] Make the SPARK YARN STAGING DIR configurable

2016-04-05 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 463bac001 -> bc36df127 [SPARK-13063][YARN] Make the SPARK YARN STAGING DIR configurable ## What changes were proposed in this pull request? Made the SPARK YARN STAGING DIR configurable with the configuration as
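
A minimal sketch of relocating the staging directory (Scala); the key spark.yarn.stagingDir is an assumption inferred from the commit title:

    import org.apache.spark.SparkConf

    // Put YARN upload/staging files on a shared path instead of the submitting
    // user's home directory (key name assumed).
    val conf = new SparkConf()
      .set("spark.yarn.stagingDir", "hdfs:///tmp/spark-staging")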

spark git commit: [SPARK-12864][YARN] initialize executorIdCounter after ApplicationMaster killed for max n…

2016-04-01 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 3e991dbc3 -> bd7b91cef [SPARK-12864][YARN] initialize executorIdCounter after ApplicationMaster killed for max n… Currently, when the number of executor failures reaches `maxNumExecutorFailures`, `ApplicationMaster` will be killed

spark git commit: [SPARK-13642][YARN][1.6-BACKPORT] Properly handle signal kill in ApplicationMaster

2016-03-23 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-1.6 179f6e323 -> 5e9cefc8c [SPARK-13642][YARN][1.6-BACKPORT] Properly handle signal kill in ApplicationMaster ## What changes were proposed in this pull request? This patch is fixing the race condition in ApplicationMaster when receiving

spark git commit: [SPARK-13577][YARN] Allow Spark jar to be multiple jars, archive.

2016-03-11 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 8fff0f92a -> 07f1c5447 [SPARK-13577][YARN] Allow Spark jar to be multiple jars, archive. In preparation for the demise of assemblies, this change allows the YARN backend to use multiple jars and globs as the "Spark jar". The config option

spark git commit: [SPARK-13675][UI] Fix wrong historyserver url link for application running in yarn cluster mode

2016-03-08 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 9bf76ddde -> 9e86e6efd [SPARK-13675][UI] Fix wrong historyserver url link for application running in yarn cluster mode ## What changes were proposed in this pull request? Current URL for each application to access history UI is like:

spark git commit: [SPARK-13459][WEB UI] Separate Alive and Dead Executors in Executor Totals Table

2016-03-04 Thread tgraves
Repository: spark Updated Branches: refs/heads/master b7d414742 -> 5f42c28b1 [SPARK-13459][WEB UI] Separate Alive and Dead Executors in Executor Totals Table ## What changes were proposed in this pull request? Now that dead executors are shown in the executors table (#10058) the totals

spark git commit: [SPARK-13481] Desc order of appID by default for history server page.

2016-02-29 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 236e3c8fb -> 2f91f5ac0 [SPARK-13481] Desc order of appID by default for history server page. ## What changes were proposed in this pull request? Now, by default, it shows appIds in ascending order. We might prefer to display as

spark git commit: [SPARK-12523][YARN] Support long-running of the Spark On HBase and hive meta store.

2016-02-26 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 318bf4115 -> 5c3912e5c [SPARK-12523][YARN] Support long-running of the Spark On HBase and hive meta store. Obtain the hive metastore and hbase token as well as hdfs token in DelegationTokenRenewer to support long-running application of

spark git commit: [SPARK-12316] Wait a minute to avoid cyclic calling.

2016-02-25 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-1.6 e3802a752 -> 5f7440b25 [SPARK-12316] Wait a minute to avoid cyclic calling. When the application ends, the AM will clean the staging dir. But if the driver triggers a delegation token update, it can't find the right token file and

spark git commit: [SPARK-12316] Wait a minute to avoid cyclic calling.

2016-02-25 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 157fe64f3 -> 5fcf4c2bf [SPARK-12316] Wait a minute to avoid cyclic calling. When the application ends, the AM will clean the staging dir. But if the driver triggers a delegation token update, it can't find the right token file and then it

spark git commit: [SPARK-13364] Sort appId as num rather than str in history page.

2016-02-23 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 87d7f8904 -> 4d1e5f92e [SPARK-13364] Sort appId as num rather than str in history page. ## What changes were proposed in this pull request? History page now sorts the appID as a string, which can lead to unexpected order for the case

spark git commit: [SPARK-13163][WEB UI] Column width on new History Server DataTables not getting set correctly

2016-02-10 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 5cf20598c -> 39cc620e9 [SPARK-13163][WEB UI] Column width on new History Server DataTables not getting set correctly The column width for the new DataTables now adjusts for the current page rather than being hard-coded for the entire

spark git commit: [SPARK-13126] fix the right margin of history page.

2016-02-10 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 39cc620e9 -> 4b80026f0 [SPARK-13126] fix the right margin of history page. The right margin of the history page is a little bit off. A simple fix for that issue. Author: zhuol Closes #11029 from zhuoliu/13126.

[2/3] spark git commit: [SPARK-10873] Support column sort and search for History Server.

2016-01-29 Thread tgraves
http://git-wip-us.apache.org/repos/asf/spark/blob/e4c1162b/core/src/main/resources/org/apache/spark/ui/static/jquery.dataTables.1.10.4.min.js -- diff --git

[3/3] spark git commit: [SPARK-10873] Support column sort and search for History Server.

2016-01-29 Thread tgraves
[SPARK-10873] Support column sort and search for History Server. [SPARK-10873] Support column sort and search for History Server using jQuery DataTable and REST API. Before this commit, the history server generated hard-coded HTML and could not support search; sorting was also disabled

[1/3] spark git commit: [SPARK-10873] Support column sort and search for History Server.

2016-01-29 Thread tgraves
Repository: spark Updated Branches: refs/heads/master e51b6eaa9 -> e4c1162b6 http://git-wip-us.apache.org/repos/asf/spark/blob/e4c1162b/core/src/main/resources/org/apache/spark/ui/static/jquery.mustache.js -- diff --git

spark git commit: [SPARK-10911] Executors should System.exit on clean shutdown.

2016-01-26 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 649e9d0f5 -> ae0309a88 [SPARK-10911] Executors should System.exit on clean shutdown. Call system.exit explicitly to make sure non-daemon user threads terminate. Without this, user applications might live forever if the cluster manager

spark git commit: [SPARK-12149][WEB UI] Executor UI improvement suggestions - Color UI

2016-01-25 Thread tgraves
Repository: spark Updated Branches: refs/heads/master ef8fb3612 -> c037d2548 [SPARK-12149][WEB UI] Executor UI improvement suggestions - Color UI Added color coding to the Executors page for Active Tasks, Failed Tasks, Completed Tasks and Task Time. Active Tasks is shaded blue with its

spark git commit: [SPARK-12716][WEB UI] Add a TOTALS row to the Executors Web UI

2016-01-15 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 0bb73554a -> 61c45876f [SPARK-12716][WEB UI] Add a TOTALS row to the Executors Web UI Added a Totals table to the top of the page to display the totals of each applicable column in the executors table. Old Description: ~~Created a TOTALS

spark git commit: [SPARK-12654] sc.wholeTextFiles with spark.hadoop.cloneConf=true fail…

2016-01-08 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 8c70cb4c6 -> 553fd7b91 [SPARK-12654] sc.wholeTextFiles with spark.hadoop.cloneConf=true fail… …s on secure Hadoop https://issues.apache.org/jira/browse/SPARK-12654 So the bug here is that WholeTextFileRDD.getPartitions has: val conf

spark git commit: [SPARK-12654] sc.wholeTextFiles with spark.hadoop.cloneConf=true fail…

2016-01-08 Thread tgraves
Repository: spark Updated Branches: refs/heads/branch-1.6 e4227cb3e -> faf094c7c [SPARK-12654] sc.wholeTextFiles with spark.hadoop.cloneConf=true fail… …s on secure Hadoop https://issues.apache.org/jira/browse/SPARK-12654 So the bug here is that WholeTextFileRDD.getPartitions has: val
