[GitHub] spark pull request: [SPARK-4962] [CORE] Put TaskScheduler.start ba...

2015-09-01 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/3810 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature

[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2015-01-25 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-71405599 @JoshRosen I don't think just calling rdd.partitions on the final RDD could achieve our goal. Furthermore, rdd.partitions has been called before: 470 // Che

[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2015-01-24 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-71308409 @JoshRosen I've brought this up to date with master. Thanks.

[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2015-01-20 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-70628086 @JoshRosen Thanks. I've updated it per your comments. Please review again. However, there are merge conflicts. I will resolve them if this approach

[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2015-01-19 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-70481411 @JoshRosen Thanks for your comments. I've updated it. I directly use getParentStages, which will call RDD's getPartitions before sending the JobSubmitted event

[GitHub] spark pull request: [SPARK-5316] [CORE] DAGScheduler may make shuf...

2015-01-19 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/4105 [SPARK-5316] [CORE] DAGScheduler may make shuffleToMapStage leak if getParentStages fails DAGScheduler may make shuffleToMapStage leak if getParentStages fails. If getParentStages has
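The leak pattern this PR describes can be sketched generically: an entry is registered in a map before a step that can throw, so a failure leaves an orphaned entry unless it is rolled back. This is only an illustrative sketch in plain Scala; `StageRegistry` and its members are invented names, not Spark's actual DAGScheduler internals.

```scala
import scala.collection.mutable

object StageRegistry {
  // stands in for DAGScheduler's shuffleToMapStage; values are stage names here
  val shuffleToMapStage = mutable.HashMap[Int, String]()

  def register(shuffleId: Int, stage: String, computeParents: () => Unit): Unit = {
    shuffleToMapStage(shuffleId) = stage
    try {
      computeParents() // may throw, e.g. when an input path does not exist
    } catch {
      case e: Exception =>
        shuffleToMapStage.remove(shuffleId) // roll back so the entry does not leak
        throw e
    }
  }
}
```

Without the rollback in the catch block, a failed `computeParents()` would leave the shuffle id registered forever, which is the leak being reported.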

[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2015-01-14 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-69916653 @JoshRosen I've updated it. Please review again. Thanks.

[GitHub] spark pull request: [SPARK-4962] [CORE] Put TaskScheduler.start ba...

2015-01-13 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3810#issuecomment-69716974 @srowen I've updated this PR and resolved the conflict. Please review again. Thanks. Let me explain three points: 1. I am not sure the description makes a case

[GitHub] spark pull request: [SPARK-4962] [CORE] Put TaskScheduler.start ba...

2015-01-11 Thread YanTangZhai
Github user YanTangZhai commented on a diff in the pull request: https://github.com/apache/spark/pull/3810#discussion_r22776416 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -333,9 +333,15 @@ class SparkContext(config: SparkConf) extends Logging with

[GitHub] spark pull request: [SPARK-4962] [CORE] Put TaskScheduler.start ba...

2015-01-11 Thread YanTangZhai
Github user YanTangZhai commented on a diff in the pull request: https://github.com/apache/spark/pull/3810#discussion_r22776371 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -333,9 +333,15 @@ class SparkContext(config: SparkConf) extends Logging with

[GitHub] spark pull request: [SPARK-4962] [CORE] Put TaskScheduler.start ba...

2015-01-11 Thread YanTangZhai
Github user YanTangZhai commented on a diff in the pull request: https://github.com/apache/spark/pull/3810#discussion_r22776305 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -55,13 +57,9 @@ private[spark] class Client

[GitHub] spark pull request: [SPARK-5163] [CORE] Load properties from confi...

2015-01-11 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/3963

[GitHub] spark pull request: [SPARK-5163] [CORE] Load properties from confi...

2015-01-11 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3963#issuecomment-69523350 @pwendell Ok. Thank you very much. I close this PR.

[GitHub] spark pull request: [SPARK-5163] [CORE] Load properties from confi...

2015-01-08 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3963 [SPARK-5163] [CORE] Load properties from configuration file for example spark-defaults.conf when creating SparkConf object I create and run a Spark program which does not use SparkSubmit

[GitHub] spark pull request: [SPARK-5007] [CORE] Try random port when start...

2015-01-08 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/3845

[GitHub] spark pull request: [SPARK-5007] [CORE] Try random port when start...

2015-01-08 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3845#issuecomment-69282504 @andrewor14 @rxin Oh, I see. Thank you very much.

[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2014-12-31 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3794#issuecomment-68438167 @JoshRosen Thanks for your comments. I've updated it according to your comments and contrived a simple example as follows: ```scala val input

[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2014-12-31 Thread YanTangZhai
Github user YanTangZhai commented on a diff in the pull request: https://github.com/apache/spark/pull/3794#discussion_r22376680 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -178,7 +178,7 @@ abstract class RDD[T: ClassTag]( // Our dependencies and

[GitHub] spark pull request: [SPARK-4692] [SQL] Support ! boolean logic ope...

2014-12-30 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3555#issuecomment-68425639 @marmbrus I've updated it. Please review again.

[GitHub] spark pull request: [SPARK-5007] [CORE] Try random port when start...

2014-12-30 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3845 [SPARK-5007] [CORE] Try random port when startServiceOnPort to reduce the chance of port collision When multiple Spark programs are submitted on the same node (called a springboard machine). The
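The retry idea can be sketched as follows. This is an assumed shape, not Spark's actual `startServiceOnPort` signature: on bind failure, pick a random port from the ephemeral range instead of scanning `basePort + 1`, `basePort + 2`, ..., so concurrent drivers on one machine rarely race for the same ports.

```scala
import java.net.ServerSocket
import scala.util.{Random, Try}

// Try the base port first; on failure, try random ephemeral ports.
// Returns the first socket that binds successfully, if any.
def startOnRandomRetry(basePort: Int, maxRetries: Int = 16): Option[ServerSocket] = {
  val candidates = basePort +: Seq.fill(maxRetries)(32768 + Random.nextInt(28232))
  candidates.iterator
    .map(p => Try(new ServerSocket(p)).toOption)
    .collectFirst { case Some(s) => s }
}
```

The design point is that sequential probing makes every colliding driver walk the same ladder of ports, while random probing spreads them out.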

[GitHub] spark pull request: [SPARK-4962] [CORE] Put TaskScheduler.start ba...

2014-12-26 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3810 [SPARK-4962] [CORE] Put TaskScheduler.start back in SparkContext to shorten cluster resources occupation period When a SparkContext object is instantiated, the TaskScheduler is started and some

[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2014-12-24 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3794 [SPARK-4961] [CORE] Put HadoopRDD.getPartitions forward to reduce DAGScheduler.JobSubmitted processing time HadoopRDD.getPartitions is lazily evaluated in DAGScheduler.JobSubmitted. If
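The idea can be sketched with the public RDD API (assuming an existing SparkContext `sc` and a purely illustrative input path): `rdd.partitions` is computed lazily, normally for the first time inside the DAGScheduler's JobSubmitted handling, so touching it on the driver beforehand moves the potentially slow HDFS listing out of the scheduler's event loop. This is not runnable standalone; it is a sketch of the access pattern only.

```scala
// Assumes a live SparkContext `sc`; the path is illustrative.
val rdd = sc.textFile("hdfs://namenode/path/to/input")
rdd.partitions   // forces HadoopRDD.getPartitions eagerly, on the driver
rdd.count()      // JobSubmitted handling no longer pays the listing cost
```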

[GitHub] spark pull request: [SPARK-4723] [CORE] To abort the stages which ...

2014-12-24 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/3786

[GitHub] spark pull request: [SPARK-4723] [CORE] To abort the stages which ...

2014-12-24 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3786#issuecomment-68082514 @markhamstra Thanks for your comment. I will analyse why the stage is attempted so many times.

[GitHub] spark pull request: [SPARK-4723] [CORE] To abort the stages which ...

2014-12-23 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3786 [SPARK-4723] [CORE] To abort the stages which have attempted some times For some reason, some stages may be attempted many times. A threshold could be added so that stages which have been attempted more
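The proposed threshold can be sketched like this. All names here are invented for illustration; the threshold is not an actual Spark configuration key in this form.

```scala
import scala.collection.mutable

val maxStageAttempts = 8  // assumed configurable threshold
val attempts = mutable.Map[Int, Int]().withDefaultValue(0)

// Returns true while the stage may still be retried; once the threshold
// is reached, the caller would abort the stage instead of resubmitting it.
def onStageFailure(stageId: Int): Boolean = {
  attempts(stageId) += 1
  attempts(stageId) < maxStageAttempts
}
```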

[GitHub] spark pull request: [SPARK-4946] [CORE] Using AkkaUtils.askWithRep...

2014-12-23 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3785 [SPARK-4946] [CORE] Using AkkaUtils.askWithReply in MapOutputTracker.askTracker to reduce the chance of the communicating problem Using AkkaUtils.askWithReply in MapOutputTracker.askTracker to

[GitHub] spark pull request: [SPARK-3545] Put HadoopRDD.getPartitions forwa...

2014-12-23 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/2409#issuecomment-68021964 @JoshRosen Thanks. I will divide this JIRA/PR into two JIRAs/PRs.

[GitHub] spark pull request: [SPARK-4692] [SQL] Support ! boolean logic ope...

2014-12-22 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3555#issuecomment-67816709 @liancheng I will revert the last space change. Thanks for your comment.

[GitHub] spark pull request: [SPARK-4692] [SQL] Support ! boolean logic ope...

2014-12-18 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3555#issuecomment-67473028 @marmbrus Please review again. Thanks.

[GitHub] spark pull request: [SPARK-4693] [SQL] PruningPredicates may be wr...

2014-12-18 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3556#issuecomment-67472596 @marmbrus Please review again. Thanks.

[GitHub] spark pull request: [WIP] [SPARK-4273] [SQL] Providing ExternalSet...

2014-12-17 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3137#issuecomment-67452153 @marmbrus Thanks. I'm also trying another approach to optimize this operation. I want to discuss it with you later.

[GitHub] spark pull request: [SPARK-4693] [SQL] PruningPredicates may be wr...

2014-12-17 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3556#issuecomment-67437985 @marmbrus Thank you for your comments. I will do it right away.

[GitHub] spark pull request: [SPARK-4693] [SQL] PruningPredicates may be wr...

2014-12-02 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3556 [SPARK-4693] [SQL] PruningPredicates may be wrong if predicates contains an empty AttributeSet() references The sql "select * from spark_test::for_test where abs(20141202) is not null
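The bug follows from set algebra: a predicate like `abs(20141202) IS NOT NULL` references no columns, and an empty reference set is vacuously a subset of the partition keys, so it is wrongly classified as a partition-pruning predicate. A simplified sketch (plain column-name sets standing in for Catalyst AttributeSets) shows the `nonEmpty` guard that fixes this:

```scala
// A pruning predicate must reference at least one column, and all of its
// referenced columns must be partition keys.
def isPruningPredicate(references: Set[String], partitionKeys: Set[String]): Boolean =
  references.nonEmpty && references.subsetOf(partitionKeys)
```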

[GitHub] spark pull request: [SPARK-4692] [SQL] Support ! boolean logic ope...

2014-12-02 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3555 [SPARK-4692] [SQL] Support ! boolean logic operator like NOT Support ! boolean logic operator like NOT in sql as follows select * from for_test where !(col1 > col2) You can merge this p

[GitHub] spark pull request: [SPARK-4677] [WEB] Add hadoop input time in ta...

2014-12-01 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/3539

[GitHub] spark pull request: [SPARK-4677] [WEB] Add hadoop input time in ta...

2014-12-01 Thread YanTangZhai
Github user YanTangZhai commented on a diff in the pull request: https://github.com/apache/spark/pull/3539#discussion_r21140476 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -238,10 +238,13 @@ class HadoopRDD[K, V]( val value: V

[GitHub] spark pull request: [SPARK-4677] [WEB] Add hadoop input time in ta...

2014-12-01 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3539 [SPARK-4677] [WEB] Add hadoop input time in task webui Add hadoop input time in task webui like GC Time to explicitly show the time used by task to read input data. You can merge this pull

[GitHub] spark pull request: [SPARK-4676] [SQL] JavaSchemaRDD.schema may th...

2014-12-01 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3538 [SPARK-4676] [SQL] JavaSchemaRDD.schema may throw NullType MatchError if sql has null val jsc = new org.apache.spark.api.java.JavaSparkContext(sc) val jhc = new

[GitHub] spark pull request: [SPARK-4401] [SQL] RuleExecutor should log tra...

2014-11-14 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/3265#issuecomment-63058643 @srowen Thanks. I close this.

[GitHub] spark pull request: [SPARK-4401] [SQL] RuleExecutor should log tra...

2014-11-14 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/3265

[GitHub] spark pull request: [SPARK-4401] [SQL] RuleExecutor should log tra...

2014-11-14 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3265 [SPARK-4401] [SQL] RuleExecutor should log trace correct iteration num RuleExecutor should log trace correct iteration num You can merge this pull request into a Git repository by running

[GitHub] spark pull request: [WIP] [SPARK-4273] [SQL] Providing ExternalSet...

2014-11-06 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/3137 [WIP] [SPARK-4273] [SQL] Providing ExternalSet to avoid OOM when count(distinct) Some task may OOM when count(distinct) if it needs to process many records. CombineSetsAndCountFunction puts
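The "spill when the in-memory set grows too large" idea behind an ExternalSet can be sketched as follows. This is a hypothetical, much-simplified model: an `ArrayBuffer` of batches stands in for on-disk files, and no serialization or streamed merging is shown.

```scala
import scala.collection.mutable

class SpillableDistinct[T](threshold: Int) {
  private val inMemory = mutable.HashSet[T]()
  private val spilled = mutable.ArrayBuffer[Set[T]]()

  def add(v: T): Unit = {
    inMemory += v
    if (inMemory.size >= threshold) {
      spilled += inMemory.toSet   // in the real proposal: write the batch to disk
      inMemory.clear()
    }
  }

  // Distinct count across the live set and all spilled batches.
  def countDistinct: Long =
    spilled.foldLeft(inMemory.toSet)(_ union _).size.toLong
}
```

The point of the design is that the in-memory footprint is bounded by `threshold`, trading memory for the extra I/O and merge work at counting time.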

[GitHub] spark pull request: [SPARK-4009][SQL]HiveTableScan should use make...

2014-10-21 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/2857#issuecomment-59915528 @marmbrus Thanks. Please disregard it.

[GitHub] spark pull request: [SPARK-4009][SQL]HiveTableScan should use make...

2014-10-21 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/2857

[GitHub] spark pull request: [SPARK-4009][SQL]HiveTableScan should use make...

2014-10-20 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/2857 [SPARK-4009][SQL]HiveTableScan should use makeRDDForTable instead of makeRDDForPartitionedTable for partitioned table when partitionPruningPred is None HiveTableScan should use makeRDDForTable

[GitHub] spark pull request: [SPARK-3545] Put HadoopRDD.getPartitions forwa...

2014-09-16 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/2409 [SPARK-3545] Put HadoopRDD.getPartitions forward and put TaskScheduler.start back to reduce DAGScheduler.JobSubmitted processing time and shorten cluster resources occupation period We have

[GitHub] spark pull request: [SPARK-2714] DAGScheduler logs jobid when runJ...

2014-09-12 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1617#issuecomment-55376375 @andrewor14 Thanks. Please review again.

[GitHub] spark pull request: [SPARK-3003] FailedStage could not be cancelle...

2014-09-12 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1921#issuecomment-55371512 @andrewor14 If a running stage hits a fetch failure, it is moved from runningStages to failedStages. But it is still shown as alive in the web UI. Then I try to kill this

[GitHub] spark pull request: [SPARK-3003] FailedStage could not be cancelle...

2014-09-12 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/1921

[GitHub] spark pull request: [SPARK-2715] ExternalAppendOnlyMap adds max li...

2014-09-11 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/1618

[GitHub] spark pull request: [SPARK-2715] ExternalAppendOnlyMap adds max li...

2014-09-11 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1618#issuecomment-55254537 @andrewor14 Yeah, I see. I will close the PR. If needed, it could be reopened. Thank you very much.

[GitHub] spark pull request: [SPARK-2643] Stages web ui has ERROR when pool...

2014-09-11 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/1854

[GitHub] spark pull request: [SPARK-2643] Stages web ui has ERROR when pool...

2014-09-11 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1854#issuecomment-55254070 @jkbradley I will close this PR. Thank you very much.

[GitHub] spark pull request: [SPARK-3148] Update global variables of HttpBr...

2014-08-20 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/2059#issuecomment-52884506 Hi @JoshRosen SparkContext1 creates a broadcastManager and initializes the HttpBroadcast object. HttpBroadcast creates the HTTP server, broadcastDir and so on. However

[GitHub] spark pull request: Update global variables of HttpBroadcast so th...

2014-08-20 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/2058

[GitHub] spark pull request: Update global variables of HttpBroadcast so th...

2014-08-20 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/2058#issuecomment-52783462 #2059

[GitHub] spark pull request: [SPARK-3148] Update global variables of HttpBr...

2014-08-20 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/2059 [SPARK-3148] Update global variables of HttpBroadcast so that multiple SparkContexts can coexist Update global variables of HttpBroadcast so that multiple SparkContexts can coexist You can

[GitHub] spark pull request: Update global variables of HttpBroadcast so th...

2014-08-20 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/2058 Update global variables of HttpBroadcast so that multiple SparkContexts can coexist Update global variables of HttpBroadcast so that multiple SparkContexts can coexist You can merge this pull

[GitHub] spark pull request: [SPARK-3067] JobProgressPage could not show Fa...

2014-08-15 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/1966 [SPARK-3067] JobProgressPage could not show Fair Scheduler Pools section sometimes JobProgressPage could not show Fair Scheduler Pools section sometimes. SparkContext starts webui and then

[GitHub] spark pull request: [SPARK-3003] FailedStage could not be cancelle...

2014-08-13 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/1921 [SPARK-3003] FailedStage could not be cancelled by DAGScheduler when cancelJob or cancelStage A stage may change from running to failed, and then DAGScheduler cannot cancel it when cancelJob

[GitHub] spark pull request: [SPARK-2643] Stages web ui has ERROR when pool...

2014-08-08 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1854#issuecomment-51604647 Please review again, thanks.

[GitHub] spark pull request: [SPARK-2643] Stages web ui has ERROR when pool...

2014-08-08 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1854#issuecomment-51601604 @srowen Stage 10 will be removed from stageIdToData later, since it will be added into completedStages or failedStages again and will be removed from

[GitHub] spark pull request: [SPARK-2643] Stages web ui has ERROR when pool...

2014-08-08 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1854#issuecomment-51597142 @srowen I see, thanks. I will modify.

[GitHub] spark pull request: [SPARK-2643] Stages web ui has ERROR when pool...

2014-08-08 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1854#issuecomment-51595260 @srowen The completedStages may contain stages as follows ...10, 10, 10, 10, 10, 11, 18... and the activeStages may contain 1, 10, 5 with unique 10 and the
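The duplicate-entry problem described above can be shown in two lines: taking `.size` over a list that repeats a stage id over-counts, so any count derived from it should deduplicate first. A minimal sketch using the example ids from the comment:

```scala
// completedStages as described in the comment: stage 10 appears five times.
val completedStages = Seq(10, 10, 10, 10, 10, 11, 18)
val numCompleted = completedStages.distinct.size  // counts unique stage ids
```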

[GitHub] spark pull request: [SPARK-2643] Stages web ui has ERROR when pool...

2014-08-08 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/1854 [SPARK-2643] Stages web ui has ERROR when pool name is None 14/07/23 16:01:44 WARN servlet.ServletHandler: /stages/ java.util.NoSuchElementException: None.get at scala.None$.get
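The stack trace above is the classic `Option` pitfall: calling `.get` on a `None`. The usual Option-safe fix is `getOrElse` with a fallback; the default name below is purely illustrative, not necessarily what the PR uses.

```scala
val poolName: Option[String] = None
// poolName.get would throw java.util.NoSuchElementException: None.get
val safeName = poolName.getOrElse("default")  // fall back instead of throwing
```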

[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...

2014-08-05 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/1244

[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...

2014-08-05 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1392#issuecomment-51190110 @pwendell Sorry, I'm late. Please disregard this PR since #1734 has been closed.

[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...

2014-08-05 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/1392

[GitHub] spark pull request: [SPARK-2714] DAGScheduler logs jobid when runJ...

2014-07-29 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1617#issuecomment-50564465 Hi @markhamstra When DAGScheduler concurrently runs multiple jobs, SparkContext logs only "Job finished", and all jobs log to the same file, which doesn't

[GitHub] spark pull request: [SPARK-2715] ExternalAppendOnlyMap adds max li...

2014-07-28 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1618#issuecomment-50425621 Hi @andrewor14 The default values of the two max limits are zero, which does not change the original operating mode and does not fail an application that is running
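The claim that zero-valued defaults leave the original behaviour unchanged can be sketched directly. The names below are invented for illustration, not the PR's actual configuration keys: with both limits at their `0` defaults the checks are disabled, so no running application would start failing.

```scala
case class SpillLimits(maxSpillTimes: Long = 0L, maxDiskBytesWritten: Long = 0L) {
  // A limit of 0 means "no limit"; only positive limits are enforced.
  def exceeded(spillTimes: Long, diskBytesWritten: Long): Boolean =
    (maxSpillTimes > 0 && spillTimes > maxSpillTimes) ||
    (maxDiskBytesWritten > 0 && diskBytesWritten > maxDiskBytesWritten)
}
```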

[GitHub] spark pull request: [SPARK-2715] ExternalAppendOnlyMap adds max li...

2014-07-28 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/1618 [SPARK-2715] ExternalAppendOnlyMap adds max limit of times and max limit of disk bytes written for spilling ExternalAppendOnlyMap adds max limit of times and max limit of disk bytes written

[GitHub] spark pull request: [SPARK-2714] DAGScheduler logs jobid when runJ...

2014-07-28 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/1617 [SPARK-2714] DAGScheduler logs jobid when runJob finishes DAGScheduler logs jobid when runJob finishes You can merge this pull request into a Git repository by running: $ git pull https

[GitHub] spark pull request: [SPARK-2647] DAGScheduler plugs other JobSubmi...

2014-07-27 Thread YanTangZhai
Github user YanTangZhai closed the pull request at: https://github.com/apache/spark/pull/1548

[GitHub] spark pull request: [SPARK-2647] DAGScheduler plugs other JobSubmi...

2014-07-27 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1548#issuecomment-50292640 @markhamstra Ok. Thank you very much.

[GitHub] spark pull request: [SPARK-2647] DAGScheduler plugs other JobSubmi...

2014-07-24 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1548#issuecomment-5727 Hi @markhamstra, you are right. I will think of other ways to solve this problem. Thanks.

[GitHub] spark pull request: [SPARK-2647] DAGScheduler plugs other JobSubmi...

2014-07-23 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/1548 [SPARK-2647] DAGScheduler plugs other JobSubmitted events when processing one JobSubmitted event If a few jobs are submitted, DAGScheduler plugs other JobSubmitted events when processing

[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...

2014-07-21 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1392#issuecomment-49584362 Hi @andrewor14, that's OK. Thanks.

[GitHub] spark pull request: [SPARK-2325] Utils.getLocalDir had better chec...

2014-07-13 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1281#issuecomment-48840401 Hi @ash211, I think this change is needed. Since the method Utils.getLocalDir is used by some functions such as HttpBroadcast, which is different from

[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...

2014-07-13 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1244#issuecomment-48839912 I've fixed the compile problem. Please review and test again. Thanks very much.

[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...

2014-07-13 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1392#issuecomment-48839668 fix #1244

[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...

2014-07-13 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1392#issuecomment-48839557 #1244

[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...

2014-07-13 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/1392 [SPARK-2290] Worker should directly use its own sparkHome instead of appDesc.sparkHome when LaunchExecutor Worker should directly use its own sparkHome instead of appDesc.sparkHome when

[GitHub] spark pull request: [SPARK-2325] Utils.getLocalDir had better chec...

2014-07-01 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/1281 [SPARK-2325] Utils.getLocalDir had better check the directory and choose a good one instead of choosing the first one directly If the first directory of spark.local.dir is bad, the application will
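The "check and choose a good directory" idea can be sketched as a scan over the configured entries instead of blindly taking the first one. This is a simplified sketch: real code would also verify writability, e.g. by creating a temporary file in the candidate directory.

```scala
import java.io.File

// Return the first entry that exists as a directory or can be created,
// skipping bad entries rather than failing on the first one.
def chooseLocalDir(localDirs: Seq[String]): Option[String] =
  localDirs.find { p =>
    val dir = new File(p)
    dir.isDirectory || dir.mkdirs()
  }
```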

[GitHub] spark pull request: [SPARK-2324] SparkContext should not exit dire...

2014-07-01 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1274#issuecomment-47737851 Thanks @aarondav. I've modified some code. Please review again.

[GitHub] spark pull request: [SPARK-2324] SparkContext should not exit dire...

2014-07-01 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/1274 [SPARK-2324] SparkContext should not exit directly when spark.local.dir is a list of multiple paths and one of them has error The spark.local.dir is configured as a list of multiple paths as

[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...

2014-06-30 Thread YanTangZhai
Github user YanTangZhai commented on the pull request: https://github.com/apache/spark/pull/1244#issuecomment-47612188 The sparkHome field is taken out of ApplicationDescription entirely. Please review again. Thanks.

[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...

2014-06-27 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/1244 [SPARK-2290] Worker should directly use its own sparkHome instead of appDesc.sparkHome when LaunchExecutor Worker should directly use its own sparkHome instead of appDesc.sparkHome when