[GitHub] spark pull request: [SPARK-13112]CoarsedExecutorBackend register t...

2016-04-04 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/12078#issuecomment-205684324
  
Can I ask a question: why can't we initialize the executor before registering with the driver, as this PR does? Is there any hidden danger?





[GitHub] spark pull request: [SPARK-5210] Support group event log when app ...

2015-12-15 Thread XuTingjun
Github user XuTingjun closed the pull request at:

https://github.com/apache/spark/pull/9246





[GitHub] spark pull request: [SPARK-11652] [CORE] Remote code execution wit...

2015-12-08 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/10198#issuecomment-162851296
  
LGTM, thanks





[GitHub] spark pull request: [SPARK-11652] [CORE] Remote code execution wit...

2015-12-08 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/9731#issuecomment-162828252
  
OK, please fix it as soon as possible, thanks.





[GitHub] spark pull request: [SPARK-11652] [CORE] Remote code execution wit...

2015-12-08 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/9731#issuecomment-162824623
  
I think the groupId should be "commons-collections", not 
"org.apache.commons", right?





[GitHub] spark pull request: [SPARK-11652] [CORE] Remote code execution wit...

2015-12-08 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/9731#issuecomment-162819273
  
@srowen I can only find the commons-collections dependency below:
```
<dependency>
  <groupId>commons-collections</groupId>
  <artifactId>commons-collections</artifactId>
  <version>3.2.2</version>
</dependency>
```





[GitHub] spark pull request: [SPARK-11652] [CORE] Remote code execution wit...

2015-12-08 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/9731#issuecomment-162814698
  
@srowen I can't find this jar file, can you give me a download URL?





[GitHub] spark pull request: [SPARK-12143]When column type is binary, chang...

2015-12-03 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/10139

[SPARK-12143] When column type is binary, change to Array[Byte] instead of String

In Beeline, execute the SQL below:
1. create table bb(bi binary);
2. load data inpath 'tmp/data' into table bb;
3. select * from bb;

Error: java.lang.ClassCastException: java.lang.String cannot be cast to [B (state=, code=0)
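
A minimal sketch of the idea (hypothetical helper, not the actual patch): values of ```BinaryType``` columns should be passed to the client as ```Array[Byte]``` instead of being stringified:
```
import org.apache.spark.sql.types.{BinaryType, DataType}

object BinaryColumnValue {
  // Hypothetical helper: keep binary columns as raw bytes so the client
  // doesn't receive a String where it expects Array[Byte] ([B).
  def toClientValue(value: Any, dataType: DataType): Any = dataType match {
    case BinaryType => value.asInstanceOf[Array[Byte]] // no String conversion
    case _          => value                           // other types unchanged
  }
}
```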

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark patch-3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/10139.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #10139


commit 54c17e7d7a6d00f4eb1406544d0b29709c777880
Author: meiyoula <1039320...@qq.com>
Date:   2015-12-04T02:08:46Z

When column type is binary, change binary to Array[Byte] instead of string







[GitHub] spark pull request: [SPARK-12142]Reply false when container alloc...

2015-12-03 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/10138

[SPARK-12142] Reply false when the container allocator is not ready, and reset the target

With the dynamic allocation feature, when a new AM is starting and the ExecutorAllocationManager sends a RequestExecutors message to the AM, the whole app will hang if the container allocator is not ready.
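
A minimal sketch of the intended behavior (allocator and reply simplified to plain functions, not the actual patch): the AM answers ```false``` instead of leaving the request unanswered, so the requester can reset its target:
```
object AmRequestHandling {
  // Hypothetical AM-side handling: reject executor requests until the
  // container allocator is initialized, instead of leaving the caller
  // waiting forever.
  def handleRequestExecutors(requestedTotal: Int,
                             allocator: Option[Int => Unit],
                             reply: Boolean => Unit): Unit = allocator match {
    case Some(requestTotalExecutors) =>
      requestTotalExecutors(requestedTotal)
      reply(true)   // request accepted
    case None =>
      reply(false)  // lets ExecutorAllocationManager reset its target
  }
}
```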

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/10138.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #10138


commit f0fc7947ab45debda7d5c4c4dc1aac6547f491e9
Author: meiyoula <1039320...@qq.com>
Date:   2015-12-04T01:53:32Z

when request executor fails, reset the numExecutorsTarget

commit 2b02b01ffc43bf989f2ea0bc266894c28c14acd8
Author: meiyoula <1039320...@qq.com>
Date:   2015-12-04T01:59:55Z

reply false







[GitHub] spark pull request: [SPARK-11334] numRunningTasks can't be less th...

2015-11-12 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/9288#issuecomment-156078546
  
@andrewor14, yeah, the ```DAGScheduler``` posts events from a single thread, but the root cause is that the ```DAGScheduler``` receives the ```SparkListenerTaskEnd``` after the ```SparkListenerStageCompleted```.
I tried to resolve the root cause, but, as you said, the scheduler is quite complicated, and my implementation caused many unit tests to fail. So if you are OK with doing a fix on the ```ExecutorAllocationManager``` side, I will update the ```ExecutorAllocationManager``` code.





[GitHub] spark pull request: [SPARK-11334] numRunningTasks can't be less th...

2015-11-12 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/9288#issuecomment-156078133
  
@jerryshao, I tried to implement your suggestion, but many unit tests failed. I think it's too difficult for me; if you can help me, I would be very grateful.





[GitHub] spark pull request: [SPARK-11334] numRunningTasks can't be less th...

2015-10-27 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/9288#issuecomment-151400153
  
Yeah, I know the root cause is the wrong ordering of events.
The code posting these events, in order, is: [kill Task](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1443), [StageCompleted](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1444), [JobEnd](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1458).
Because the TaskEnd is not posted serially with StageCompleted and JobEnd, I think maybe we can't control the ordering.





[GitHub] spark pull request: [SPARK-11334] numRunningTasks can't be less th...

2015-10-26 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/9288

[SPARK-11334] numRunningTasks can't be less than 0, or it will affect executor allocation

With the dynamic allocation feature, when a task has failed more than ```maxFailure``` times, all the dependent jobs, stages, and tasks will be killed or aborted. In this process, the ```SparkListenerTaskEnd``` event arrives after the ```SparkListenerStageCompleted``` and ```SparkListenerJobEnd``` events, as in the event log below (elided fields marked with ...):
```
{"Event":"SparkListenerStageCompleted","Stage Info":{"Stage ID":...}}
{"Event":...}
{"Event":"SparkListenerTaskEnd","Stage ID":20,"Stage Attempt ID":0,"Task Type":"ResultTask","Task End Reason":{"Reason":...},"Task Info":{"Task ID":1955,"Index":88,"Attempt":2,"Launch Time":1444914699763,"Executor ID":"5","Host":"linux-223","Locality":"PROCESS_LOCAL","Speculative":false,"Getting Result Time":0,"Finish Time":1444914699864,"Failed":true,"Accumulables":[]}}
```
Because of that, ```numRunningTasks``` in the ```ExecutorAllocationManager``` class can become less than 0, which will affect executor allocation.
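
A minimal sketch of the guard (simplified, not the actual patch): the counter must not be driven below zero by a late ```SparkListenerTaskEnd```:
```
// Simplified counter: a TaskEnd arriving after StageCompleted/JobEnd
// (for which the stage's counts were already cleared) is ignored.
class RunningTaskCounter {
  private var numRunningTasks = 0

  def onTaskStart(): Unit = synchronized { numRunningTasks += 1 }

  def onTaskEnd(): Unit = synchronized {
    if (numRunningTasks > 0) numRunningTasks -= 1
  }

  def running: Int = synchronized { numRunningTasks }
}
```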

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark SPARK-11334

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9288.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9288


commit e32e68485eaf0ed9eed7d88478154aff8650da62
Author: xutingjun 
Date:   2015-10-27T03:14:56Z

numRunningTasks can't be less than 0, or it will affect executor allocation







[GitHub] spark pull request: [SPARK-5210] Support group event log when app ...

2015-10-23 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/9246

[SPARK-5210] Support grouping the event log when the app is long-running

For long-running Spark applications (e.g. running for days or weeks), the Spark event log may grow very large.

I think grouping the event log by job is an acceptable solution.

1. To group the event log, each application gets two kinds of files: one meta file and many part files. We put the ```StageSubmitted / StageCompleted / TaskResubmit / TaskStart / TaskEnd / TaskGettingResult / JobStart / JobEnd``` events into the meta file and all other events into the part files (see the routing sketch after the screenshot). The event log then looks like this:
```
application_1439246697595_0001-meta
application_1439246697595_0001-part1
application_1439246697595_0001-part2
```
2. In the HistoryServer, every part file is treated as an application, and the meta file is replayed after the part file. Below is how a grouped app displays on the HistoryServer web UI:

![default](https://cloud.githubusercontent.com/assets/7609069/10688491/2d9f9c02-79a7-11e5-9879-c14899a56fb9.png)
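
A minimal sketch of the routing idea from point 1 (writer plumbing assumed, not the actual patch):
```
import org.apache.spark.scheduler._

object EventRouter {
  // Hypothetical router: job/stage/task lifecycle events go to the meta
  // file, everything else to the current part file.
  def route(event: SparkListenerEvent,
            writeMeta: SparkListenerEvent => Unit,
            writePart: SparkListenerEvent => Unit): Unit = event match {
    case _: SparkListenerStageSubmitted | _: SparkListenerStageCompleted |
         _: SparkListenerTaskStart | _: SparkListenerTaskGettingResult |
         _: SparkListenerTaskEnd | _: SparkListenerJobStart |
         _: SparkListenerJobEnd =>
      writeMeta(event)
    case other =>
      writePart(other)
  }
}
```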


  



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark SPARK-5210

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9246.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9246


commit 62c982b0048252d88de27e0791cbafbbc69c6c57
Author: xutingjun 
Date:   2015-10-23T07:52:10Z

add big event log







[GitHub] spark pull request: [SPARK-9585] Delete the input format caching b...

2015-09-22 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/7918#issuecomment-142487300
  
@rxin, my JIRA id is **meiyoula**.





[GitHub] spark pull request: [SPARK-9585] add config to enable inputFormat ...

2015-09-22 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/7918#discussion_r40058843
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -182,17 +182,13 @@ class HadoopRDD[K, V](
   }
 
   protected def getInputFormat(conf: JobConf): InputFormat[K, V] = {
-if (HadoopRDD.containsCachedMetadata(inputFormatCacheKey)) {
-  return HadoopRDD.getCachedMetadata(inputFormatCacheKey).asInstanceOf[InputFormat[K, V]]
-}
-// Once an InputFormat for this RDD is created, cache it so that only one reflection call is
-// done in each local process.
+// Once a constructor of InputFormat for this RDD is created, cache it so that only one
+// reflection call is done in each local process.
--- End diff --

@rxin, Do you mean I should delete the comment here?





[GitHub] spark pull request: [SPARK-9585] add config to enable inputFormat ...

2015-09-15 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/7918#issuecomment-140619178
  
@JoshRosen, can you have a look at this? Thanks.





[GitHub] spark pull request: [SPARK-9585] add config to enable inputFormat ...

2015-09-14 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/7918#issuecomment-140005869
  
@sryza I don't really understand **caching the constructor**.
I find that the method ```ReflectionUtils.newInstance``` already caches the constructor.





[GitHub] spark pull request: [SPARK-10586] fix bug: BlockManager ca't be re...

2015-09-14 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/8741

[SPARK-10586] Fix bug: BlockManager can't be removed when it is re-registered and then disassociates

Scenario: the executor has been removed, but it still shows on the Spark UI.

Process:
1. The driver lost the executor because its heartbeat timed out;
2. The executor received SIGNAL 15: SIGTERM and **re-registered** its BlockManager;
3. The driver lost the executor again because the remote RPC client disassociated, and it attempted to remove the BlockManager but did not succeed.

Reason:
The first time the executor is lost, the [SchedulerBackend calls reviveOffers](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala#L481); when the host also has other executors, it won't [add this executor](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala#L288). So the second time the executor is lost, the DAGScheduler won't remove the BlockManager, because the executorId is already in ```failedEpoch```.
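
A minimal sketch of the fix idea (simplified): when an executor registers again, drop it from ```failedEpoch``` so a later loss can still remove its BlockManager:
```
import scala.collection.mutable

object ExecutorBookkeeping {
  // Simplified DAGScheduler-style bookkeeping.
  val failedEpoch = new mutable.HashMap[String, Long]()

  def handleExecutorAdded(execId: String): Unit = {
    // Forget the stale failure record; otherwise a second executorLost for
    // the same id is ignored and the BlockManager is never removed.
    if (failedEpoch.contains(execId)) {
      failedEpoch -= execId
    }
  }
}
```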


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark SPARK-10586

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8741.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8741


commit bb562a269b954639a8f254d79782a2f7d0e209c9
Author: xutingjun 
Date:   2015-09-14T07:43:29Z

when executor adds, delete it from failedEpoch







[GitHub] spark pull request: [SPARK-10311][Streaming]Reload appId and attem...

2015-08-27 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/8477#issuecomment-135602834
  
Sorry that I didn't state the problem clearly.

When an app starts from a checkpoint file via the [getOrCreate method](https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala#L829), the new AM process creates a new SparkContext object but reuses the [old SparkConf](https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala#L140), so the new attemptId set by the new AM process doesn't take effect.

The appId has the same problem.
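
A minimal sketch of the fix direction (the second config key is illustrative, not necessarily the actual patch): refresh the IDs on the restored SparkConf instead of trusting the checkpointed values:
```
import org.apache.spark.SparkConf

object RefreshIds {
  // Hypothetical: overwrite checkpointed identifiers with the new attempt's
  // values before the SparkContext is created. "spark.app.id" is a real
  // key; "spark.yarn.app.attemptId" is illustrative only.
  def apply(conf: SparkConf, newAppId: String, newAttemptId: String): SparkConf = {
    conf.set("spark.app.id", newAppId)
    conf.set("spark.yarn.app.attemptId", newAttemptId)
  }
}
```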





[GitHub] spark pull request: [SPARK-10311]Reload appId and attemptId when a...

2015-08-26 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/8477

[SPARK-10311] Reload appId and attemptId when a new ApplicationMaster registers



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark streaming-attempt

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8477.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8477


commit 3211a68039b9886e31e6aabf00d6de335f81b4f6
Author: xutingjun 
Date:   2015-08-27T03:31:03Z

reload appId and attemptId when AM is new







[GitHub] spark pull request: [SPARK-10147]App shouldn't show in HistoryServ...

2015-08-20 Thread XuTingjun
Github user XuTingjun closed the pull request at:

https://github.com/apache/spark/pull/8348





[GitHub] spark pull request: [SPARK-10147]App shouldn't show in HistoryServ...

2015-08-20 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/8348

[SPARK-10147] An app shouldn't show in the HistoryServer web UI when its event file has been deleted on HDFS

Phenomenon: the app still shows in the HistoryServer web UI when its event file has been deleted on HDFS.
Cause: the ```log-replay-executor``` thread and the ```clean log``` thread both write to the ```applications``` object, so there is a synchronization problem.

So I think we should delete the ```log-replay-executor``` thread to avoid this synchronization problem.
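
A minimal sketch of the single-writer idea (method names assumed, not the actual patch): run replay and cleanup on one scheduled thread so only that thread mutates ```applications```:
```
import java.util.concurrent.{Executors, TimeUnit}

object SingleWriterReplay {
  private val pool = Executors.newSingleThreadScheduledExecutor()

  // Replay and cleanup run back-to-back on one thread, so the shared
  // `applications` map has a single writer and no race.
  def start(checkForLogs: () => Unit, cleanLogs: () => Unit): Unit = {
    pool.scheduleWithFixedDelay(new Runnable {
      override def run(): Unit = {
        checkForLogs() // replay new/updated event logs (assumed method)
        cleanLogs()    // drop apps whose event files were deleted (assumed)
      }
    }, 0, 10, TimeUnit.SECONDS)
  }
}
```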

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark history_app_clean

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8348.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8348


commit eefe568fda2debd25f331284716db1b9f4c96862
Author: xutingjun 
Date:   2015-06-02T12:27:05Z

Lazy start the scheduler for dynamic allocation #6430

commit f999c7629ee88f313286763540db38d3206595fa
Author: xutingjun 
Date:   2015-08-21T04:03:28Z

delete executor which fetch and parse log files







[GitHub] spark pull request: [SPARK-8366] maxNumExecutorsNeeded should prop...

2015-08-11 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-129747858
  
@andrewor14, I have tested it on a real cluster, and it's OK.





[GitHub] spark pull request: [SPARK-8366] maxNumExecutorsNeeded should prop...

2015-08-10 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-129336529
  
Maybe we can change the code below, right?
```
val numTasksScheduled = stageIdToTaskIndices(stageId).size
val numTasksTotal = stageIdToNumTasks.getOrElse(stageId, -1)
if (numTasksScheduled == numTasksTotal) {
  // No more pending tasks for this stage
  stageIdToNumTasks -= stageId
  if (stageIdToNumTasks.isEmpty) {
allocationManager.onSchedulerQueueEmpty()
  }
}
```
to 
```
if (totalPendingTasks() == 0) {
  allocationManager.onSchedulerQueueEmpty()
}
```





[GitHub] spark pull request: [SPARK-8366] maxNumExecutorsNeeded should prop...

2015-08-09 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-129287244
  
@andrewor14, I understand what you mean.
What I'm considering is that, if many stages run in parallel, just deleting L606 may not be correct.





[GitHub] spark pull request: [SPARK-9585] add config to enable inputFormat ...

2015-08-07 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/7918#issuecomment-128620998
  
Hi all, can you have a look at this? I think it's worthwhile.





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-08-05 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6817#discussion_r36377614
  
--- Diff: 
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -628,6 +621,13 @@ private[spark] class ExecutorAllocationManager(
 allocationManager.onExecutorIdle(executorId)
   }
 }
+
+// If the task failed, we expect it to be resubmitted later.
+if (taskEnd.reason != Success) {
+  stageIdToTaskIndices.get(stageId).foreach { taskIndices =>
+taskIndices.remove(taskIndex)
--- End diff --

Andrew noted before:
> If anything, I would think that we should remove [this line](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L606). If this task fails, then the next attempt would go to the else case of stageIdToNumTasks.getOrElse(stageId, -1), which is not technically correct. It's safe to remove it because we remove it in stageCompleted anyway.

I agree with this option, so I deleted lines 602-610. After deleting these lines, we needn't call ```allocationManager.onSchedulerBacklogged()``` anymore, because ```allocationManager.onSchedulerQueueEmpty()``` is only called when all stages finish.





[GitHub] spark pull request: [SPARK-2017] [UI] Stage page hangs with many t...

2015-08-05 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/7296#issuecomment-128218149
  
@andrewor14, now that [task pagination](https://github.com/apache/spark/pull/7399) has been implemented, does this patch still have any value?





[GitHub] spark pull request: [SPARK-9585] add config to enable inputFormat ...

2015-08-04 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/7918#issuecomment-127892711
  
Thanks all, I have added the documentation.





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-08-04 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-127886475
  
@squito, I have updated the test, thank you very much.





[GitHub] spark pull request: [SPARK-9585] add config to enable inputFormat ...

2015-08-03 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/7918

[SPARK-9585] add config to enable inputFormat cache or not



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark cached_inputFormat

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7918.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7918


commit 7fc31d83848c13c80e75e1c71248a96d1a917687
Author: xutingjun 
Date:   2015-08-04T01:37:11Z

add config to enable inputFormat cache or not







[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-08-03 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-127448769
  
retest please





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-08-03 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-127223219
  
@andrewor14, I have changed the code to what you suggested, please have a 
look.





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-07-22 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6817#discussion_r35285511
  
--- Diff: 
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -553,12 +562,14 @@ private[spark] class ExecutorAllocationManager(
 }
 
 // If this is the last pending task, mark the scheduler queue as 
empty
-stageIdToTaskIndices.getOrElseUpdate(stageId, new mutable.HashSet[Int]) += taskIndex
+stageIdToTaskIndices
+  .getOrElseUpdate(stageId, new mutable.HashSet[String]) += (taskIndex + "." + attemptId)
 val numTasksScheduled = stageIdToTaskIndices(stageId).size
 val numTasksTotal = stageIdToNumTasks.getOrElse(stageId, -1)
 if (numTasksScheduled == numTasksTotal) {
   // No more pending tasks for this stage
   stageIdToNumTasks -= stageId
--- End diff --

Sorry, I got the line numbers wrong; actually they are [these lines](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L567-L574)





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-07-22 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-123588243
  
Yeah, I got it. I think we can add the code below to the ```onTaskEnd``` method, right?
```
stageIdToTaskIndices.get(taskEnd.stageId).get.remove(taskIndex)
```





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-07-22 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6817#discussion_r35186974
  
--- Diff: 
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -553,12 +562,14 @@ private[spark] class ExecutorAllocationManager(
 }
 
 // If this is the last pending task, mark the scheduler queue as 
empty
-stageIdToTaskIndices.getOrElseUpdate(stageId, new mutable.HashSet[Int]) += taskIndex
+stageIdToTaskIndices
+  .getOrElseUpdate(stageId, new mutable.HashSet[String]) += (taskIndex + "." + attemptId)
 val numTasksScheduled = stageIdToTaskIndices(stageId).size
 val numTasksTotal = stageIdToNumTasks.getOrElse(stageId, -1)
 if (numTasksScheduled == numTasksTotal) {
   // No more pending tasks for this stage
   stageIdToNumTasks -= stageId
--- End diff --

Can we delete lines 557-585? I think ```stageCompleted``` also does this.





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-07-15 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-121798632
  
@andrewor14, sorry to bother you again. I think it's really a bug; I wish you would have a look again, thanks!





[GitHub] spark pull request: [SPARK-4598][WebUI][WIP]Task table pagination ...

2015-07-14 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/7399#issuecomment-121448667
  
Can this table sort globally by any field?





[GitHub] spark pull request: [SPARK-8953] SPARK_EXECUTOR_CORES has no effec...

2015-07-09 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/7322

[SPARK-8953] SPARK_EXECUTOR_CORES has no effect to dynamic executor 
allocation function

The ```SPARK_EXECUTOR_CORES``` environment variable isn't put into ```SparkConf```, so it has no effect on dynamic executor allocation.
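
A minimal sketch of the fix idea (simplified): propagate the environment variable into ```SparkConf``` so dynamic allocation can see the per-executor core count:
```
import org.apache.spark.SparkConf

object LoadExecutorCores {
  // Simplified sketch: SPARK_EXECUTOR_CORES only takes effect if it is
  // propagated into SparkConf as spark.executor.cores.
  def apply(conf: SparkConf): SparkConf = {
    sys.env.get("SPARK_EXECUTOR_CORES").foreach { cores =>
      conf.setIfMissing("spark.executor.cores", cores)
    }
    conf
  }
}
```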

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark SPARK_EXECUTOR_CORES

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7322.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7322


commit 2cafa893c4ce1b8d87ad1cb125286aefbf91fa54
Author: xutingjun 
Date:   2015-07-09T12:03:31Z

make SPARK_EXECUTOR_CORES has effect to dynamicAllocation







[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-07-02 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-118210073
  
@andrewor14, have you understood the problem?

```
def totalPendingTasks(): Int = {
  stageIdToNumTasks.map { case (stageId, numTasks) =>
    numTasks - stageIdToTaskIndices.get(stageId).map(_.size).getOrElse(0)
  }.sum
}
```





[GitHub] spark pull request: [SPARK-8560][UI] The Executors page will have ...

2015-06-29 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6950#discussion_r33543095
  
--- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala 
---
@@ -92,15 +92,18 @@ class ExecutorsListener(storageStatusListener: 
StorageStatusListener) extends Sp
 val info = taskEnd.taskInfo
 if (info != null) {
   val eid = info.executorId
-  executorToTasksActive(eid) = executorToTasksActive.getOrElse(eid, 1) 
- 1
-  executorToDuration(eid) = executorToDuration.getOrElse(eid, 0L) + 
info.duration
   taskEnd.reason match {
+case Resubmitted =>
+  return
--- End diff --

I think so, because these ```Resubmitted``` ```SparkListenerTaskEnd``` events come from tasks that already posted a ```Success``` ```SparkListenerTaskEnd```, so all the metrics have already been updated. If we update them here as well, the metrics will be counted twice.





[GitHub] spark pull request: [SPARK-8618] Obtain hbase token retries many t...

2015-06-29 Thread XuTingjun
Github user XuTingjun closed the pull request at:

https://github.com/apache/spark/pull/7007





[GitHub] spark pull request: [SPARK-8618] Obtain hbase token retries many t...

2015-06-29 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/7007#issuecomment-116998896
  
Sorry, I think this patch is not good, so I will close it.





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-06-29 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-116995151
  
@andrewor14, sorry for my poor English. The problem is:
when an executor is lost, the running tasks on it fail, and each posts a ```SparkListenerTaskEnd```. Until ```maxTaskFailures``` is reached, the failed tasks will [re-run with a new task id](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L720). Yeah, the new task will post a ```SparkListenerTaskStart```, and [stageIdToTaskIndices](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L556) will grow. But the [total task num](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L517) is only set on ```StageSubmitted```, so the [numTasksScheduled == numTasksTotal](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L559) branch won't be reached, and the [pending tasks calculation](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L611) will be wrong.






[GitHub] spark pull request: SPARK-7889 Jobs progress of apps on complete p...

2015-06-28 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6935#issuecomment-116404740
  
Yeah, I think this patch also can't refresh. The place where it detaches the handlers is not right.
I think we need to think about when and where to detach the handlers.





[GitHub] spark pull request: SPARK-7889 Jobs progress of apps on complete p...

2015-06-28 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6935#issuecomment-116374449
  
@squito, I agree with what you said. First, about my patch [#6545](https://github.com/apache/spark/pull/6545/files): it will refresh only on a request to ```http://localhost:18080/history/```.





[GitHub] spark pull request: [SPARK-8560][UI] The Executors page will have ...

2015-06-25 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6950#issuecomment-115178185
  
@tdas @pwendell Can you have a look at this patch? Thanks!





[GitHub] spark pull request: SPARK-7889 Jobs progress of apps on complete p...

2015-06-24 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6935#issuecomment-115131256
  
@steveloughran Have you tested it? I don't think it's OK.
Yeah, you implemented refreshing for incomplete apps. But I think that on the second "/history/appid" request, it will not go to the [/history handler](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala#L77) but to the [history/appid handler](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala#L116), so it will not call the [refresh code](https://github.com/steveloughran/spark/blob/stevel/patches/SPARK-7889-history-cache/core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala#L75) and won't refresh.
/cc @squito





[GitHub] spark pull request: [SPARK-8618] add judgment of hbase configurati...

2015-06-24 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/7007

[SPARK-8618] add judgment of hbase configuration



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark hbaseToken

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7007.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7007


commit 541ffedc0d1565bfa4495925fd502f6fa77e4da7
Author: xutingjun 
Date:   2015-06-25T02:26:20Z

add judgment of hbase configuration







[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-06-23 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-114735104
  
@andrewor14 





[GitHub] spark pull request: [SPARK-8560][UI] The Executors page will have ...

2015-06-23 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/6950

[SPARK-8560][UI] The Executors page will show negative values if there are resubmitted tasks

When ```taskEnd.reason``` is ```Resubmitted```, it shouldn't update the statistics, because this task already posted a ```SUCCESS``` task end before.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark pageError

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6950.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6950


commit af35dc339cddd856d3bff2fab63ec4b3f544d02c
Author: xutingjun 
Date:   2015-06-23T08:51:52Z

When taskEnd is Resubmitted, don't do statistics







[GitHub] spark pull request: [SPARK-8391] catch the Throwable and report er...

2015-06-23 Thread XuTingjun
Github user XuTingjun closed the pull request at:

https://github.com/apache/spark/pull/6893





[GitHub] spark pull request: [SPARK-8391] catch the Throwable and report er...

2015-06-22 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6893#discussion_r33005372
  
--- Diff: core/src/main/scala/org/apache/spark/ui/scope/RDDOperationGraph.scala ---
@@ -164,12 +164,20 @@ private[ui] object RDDOperationGraph extends Logging {
    *
    * For the complete DOT specification, see http://www.graphviz.org/Documentation/dotguide.pdf.
    */
-  def makeDotFile(graph: RDDOperationGraph): String = {
+  def makeDotFile(graph: RDDOperationGraph, stageId: String): String = {
     val dotFile = new StringBuilder
     dotFile.append("digraph G {\n")
-    dotFile.append(makeDotSubgraph(graph.rootCluster, indent = "  "))
-    graph.edges.foreach { edge => dotFile.append(s"""  ${edge.fromId}->${edge.toId};\n""") }
-    dotFile.append("}")
+    try {
+      dotFile.append(makeDotSubgraph(graph.rootCluster, indent = "  "))
+      graph.edges.foreach { edge => dotFile.append(s"""  ${edge.fromId}->${edge.toId};\n""") }
+      dotFile.append("}")
+    } catch {
+      case t: Throwable =>
--- End diff --

@andrewor14, ```NonFatal(e)``` will not match ```OutOfMemoryError```, but this 
error can happen here. What should I do? Should I add another case, 
```case e: OutOfMemoryError```?
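
For reference, a small sketch of the behavior in question, assuming only that ```scala.util.control.NonFatal``` excludes ```VirtualMachineError``` (which ```OutOfMemoryError``` extends); the extra case is one option being discussed, not a merged fix:

```scala
import scala.util.control.NonFatal

def renderDotSafely(render: () => String): String =
  try {
    render()
  } catch {
    case NonFatal(e) =>
      // NonFatal matches ordinary exceptions but deliberately not OOM.
      s"digraph G {} /* render failed: ${e.getMessage} */"
    case e: OutOfMemoryError =>
      // Catching OOM is usually unsafe; note the failure and rethrow.
      throw e
  }
```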





[GitHub] spark pull request: SPARK-7889 Jobs progress of apps on complete p...

2015-06-22 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6935#issuecomment-114329092
  
@steveloughran, I think there are extra spaces in a few places.





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-06-22 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-114326614
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-8391] catch the Throwable and report er...

2015-06-22 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6893#discussion_r33002628
  
--- Diff: core/src/main/scala/org/apache/spark/ui/scope/RDDOperationGraph.scala ---
@@ -164,12 +164,20 @@ private[ui] object RDDOperationGraph extends Logging {
    *
    * For the complete DOT specification, see http://www.graphviz.org/Documentation/dotguide.pdf.
    */
-  def makeDotFile(graph: RDDOperationGraph): String = {
+  def makeDotFile(graph: RDDOperationGraph, stageId: String): String = {
     val dotFile = new StringBuilder
     dotFile.append("digraph G {\n")
-    dotFile.append(makeDotSubgraph(graph.rootCluster, indent = "  "))
-    graph.edges.foreach { edge => dotFile.append(s"""  ${edge.fromId}->${edge.toId};\n""") }
-    dotFile.append("}")
+    try {
+      dotFile.append(makeDotSubgraph(graph.rootCluster, indent = "  "))
+      graph.edges.foreach { edge => dotFile.append(s"""  ${edge.fromId}->${edge.toId};\n""") }
+      dotFile.append("}")
+    } catch {
+      case t: Throwable =>
--- End diff --

@srowen @andrewor14, do you have any suggestions on this?





[GitHub] spark pull request: [SPARK-8391] catch the Throwable and report er...

2015-06-19 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6893#discussion_r32809194
  
--- Diff: core/src/main/scala/org/apache/spark/ui/scope/RDDOperationGraph.scala ---
@@ -164,12 +164,20 @@ private[ui] object RDDOperationGraph extends Logging {
    *
    * For the complete DOT specification, see http://www.graphviz.org/Documentation/dotguide.pdf.
    */
-  def makeDotFile(graph: RDDOperationGraph): String = {
+  def makeDotFile(graph: RDDOperationGraph, stageId: String): String = {
     val dotFile = new StringBuilder
     dotFile.append("digraph G {\n")
-    dotFile.append(makeDotSubgraph(graph.rootCluster, indent = "  "))
-    graph.edges.foreach { edge => dotFile.append(s"""  ${edge.fromId}->${edge.toId};\n""") }
-    dotFile.append("}")
+    try {
+      dotFile.append(makeDotSubgraph(graph.rootCluster, indent = "  "))
+      graph.edges.foreach { edge => dotFile.append(s"""  ${edge.fromId}->${edge.toId};\n""") }
+      dotFile.append("}")
+    } catch {
+      case t: Throwable =>
--- End diff --

Here I am not attempting to clean up ```dotFile```; I just want to catch the 
Throwable to prevent the whole page from dying when one dot-file node dies.
I think one dead node shouldn't kill the whole page, since that would hide 
all the other information from the user.





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-06-19 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-113399629
  
I think the failed unit tests are unrelated to this patch; please retest.





[GitHub] spark pull request: [SPARK-7889] [UI] make sure click the "App ID"...

2015-06-18 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6545#issuecomment-113358887
  
@squito, could you take a look at this? Thanks.





[GitHub] spark pull request: [SPARK-8391] catch the Throwable and report er...

2015-06-18 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6893#issuecomment-113350684
  
retest please





[GitHub] spark pull request: [SPARK-8391] catch the Throwable and report er...

2015-06-18 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/6893

[SPARK-8391] catch the Throwable and report error to DAG graph

1. Different Throwables may occur while making the dot file; to prevent the 
whole page from dying, I think a try-catch is a reasonable resolution.
2. When making the dot file throws a Throwable, report the error to the DAG 
graph, to tell the user what happened.
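
As a sketch of point 2 (names are illustrative, not the exact patch): on failure, emit a tiny DOT graph containing a single error node, so the page still renders and the user can see what happened:

```scala
// Hedged sketch: fall back to a one-node DOT graph on any rendering failure.
def makeDotFileOrError(render: () => String, stageId: String): String =
  try {
    render()
  } catch {
    case t: Throwable =>
      // Surface the failure for this stage instead of killing the page.
      s"""digraph G {
         |  error [label="Rendering DAG for stage $stageId failed: ${t.getClass.getSimpleName}"];
         |}""".stripMargin
  }
```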

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark SPARK-8391

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6893.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6893


commit b2afade781a296d15b53848a98b158da8874a46e
Author: xutingjun 
Date:   2015-06-19T02:11:58Z

catch the Throwable and report error to DAG graph







[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-06-17 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6817#discussion_r32694699
  
--- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -553,12 +562,13 @@ private[spark] class ExecutorAllocationManager(
     }
 
     // If this is the last pending task, mark the scheduler queue as empty
-    stageIdToTaskIndices.getOrElseUpdate(stageId, new mutable.HashSet[Int]) += taskIndex
+    stageIdToTaskIndices.getOrElseUpdate(stageId, new mutable.HashSet[String]) += (taskIndex + "." + attemptId)
--- End diff --

Yeah, I understand what you mean. The reason I changed the code is that when a 
task fails, a new attempt is appended, and these two tasks have the same 
```taskIndex```.
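
A minimal sketch of the collision: a retried task reuses its index, so a ```Set[Int]``` dedupes the retry away, while keying by "index.attempt" (as in this patch) keeps both attempts distinct. The helper below is illustrative.

```scala
import scala.collection.mutable

val stageIdToTaskIndices = mutable.HashMap[Int, mutable.HashSet[String]]()

def onTaskStart(stageId: Int, taskIndex: Int, attemptId: Int): Unit = {
  // Keyed by "index.attempt" so a retry is not swallowed by the set.
  stageIdToTaskIndices.getOrElseUpdate(stageId, new mutable.HashSet[String]) +=
    (taskIndex + "." + attemptId)
}

onTaskStart(stageId = 1, taskIndex = 7, attemptId = 0) // first attempt
onTaskStart(stageId = 1, taskIndex = 7, attemptId = 1) // retry still counted
// stageIdToTaskIndices(1) == Set("7.0", "7.1")
```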





[GitHub] spark pull request: [SPARK-8392] RDDOperationGraph: getting cached...

2015-06-17 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6839#issuecomment-113015490
  
@andrewor14, I have updated the title and code, please have a look again, 
thanks.





[GitHub] spark pull request: [SPARK-8392] RDDOperationGraph: getting cached...

2015-06-17 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6839#discussion_r32693244
  
--- Diff: core/src/main/scala/org/apache/spark/ui/scope/RDDOperationGraph.scala ---
@@ -70,6 +70,13 @@ private[ui] class RDDOperationCluster(val id: String, private var _name: String)
   def getAllNodes: Seq[RDDOperationNode] = {
     _childNodes ++ _childClusters.flatMap(_.childNodes)
   }
+
+  /** Return all the nodes which are cached. */
+  def getCachedNodes: Seq[RDDOperationNode] = {
+    val cachedNodes = _childNodes.filter(_.cached)
+    _childClusters.foreach(cluster => cachedNodes ++= cluster._childNodes.filter(_.cached))
--- End diff --

yeah, I think so.





[GitHub] spark pull request: [SPARK-8392] Improve the efficiency

2015-06-16 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6839#issuecomment-112407985
  
Yeah, I think expanding all nodes and then filtering every node is slow and 
costs memory.





[GitHub] spark pull request: [SPARK-8392] Improve the efficiency

2015-06-16 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/6839

[SPARK-8392] Improve the efficiency

def getAllNodes: Seq[RDDOperationNode] = {
  _childNodes ++ _childClusters.flatMap(_.childNodes)
}

When ```_childClusters``` has many nodes, this can hang. I think we can 
improve the efficiency here.
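
A self-contained sketch of the "put the filter inside" idea, with stubbed types (the real ```RDDOperationNode```/```RDDOperationCluster``` carry more fields); it contrasts expand-then-filter with filtering during traversal:

```scala
case class Node(id: Int, cached: Boolean)

class Cluster(val childNodes: Seq[Node], val childClusters: Seq[Cluster]) {
  // Expand-then-filter: materializes every node first, then filters.
  def getAllNodes: Seq[Node] =
    childNodes ++ childClusters.flatMap(_.childNodes)

  def cachedViaGetAll: Seq[Node] = getAllNodes.filter(_.cached)

  // Filter-inside: non-cached nodes are dropped as each child collection
  // is traversed, never accumulated into an intermediate Seq.
  def getCachedNodes: Seq[Node] =
    childNodes.filter(_.cached) ++
      childClusters.flatMap(_.childNodes.filter(_.cached))
}
```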

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark DAGImprove

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6839.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6839


commit 81f9fd247a4ce8c69636495ade2f130f6fa6aa6f
Author: xutingjun 
Date:   2015-06-16T09:09:14Z

put the filter inside







[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-06-15 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6817#discussion_r32484601
  
--- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -537,10 +537,19 @@ private[spark] class ExecutorAllocationManager(
       }
     }
 
+    override def onTaskResubmit(taskResubmit: SparkListenerTaskResubmit): Unit = {
+      val stageId = taskResubmit.stageId
+      allocationManager.synchronized {
+        val num = stageIdToNumTasks.getOrElse(stageId, 0)
+        stageIdToNumTasks.update(stageId, num + 1)
+      }
+    }
+
--- End diff --

@squito, I think when an executor goes down, the stages won't be resubmitted. 
What I mean is that when a task fails it will be retried, so a new task is 
appended; and to let the **ExecutorAllocationManager** know that new tasks 
have been submitted, I add a **SparkListenerTaskResubmit** event.
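
Since ```SparkListenerTaskResubmit``` is this PR's own proposed event (not part of upstream Spark), a sketch of its shape and of the bookkeeping shown in the diff above:

```scala
import scala.collection.mutable

// Hypothetical event proposed by this PR, not an upstream Spark API.
case class SparkListenerTaskResubmit(stageId: Int)

val stageIdToNumTasks = mutable.HashMap[Int, Int]()

// Mirrors the diff: bump the stage's task count on resubmit so the
// allocation manager requests enough executors for the retried task.
def onTaskResubmit(event: SparkListenerTaskResubmit): Unit = {
  val num = stageIdToNumTasks.getOrElse(event.stageId, 0)
  stageIdToNumTasks.update(event.stageId, num + 1)
}
```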





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-06-15 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6817#issuecomment-111985278
  
@sryza  @andrewor14 Can you have a look?





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-06-15 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6817#discussion_r32399852
  
--- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -537,10 +537,19 @@ private[spark] class ExecutorAllocationManager(
       }
     }
 
+    override def onTaskResubmit(taskResubmit: SparkListenerTaskResubmit): Unit = {
+      val stageId = taskResubmit.stageId
+      allocationManager.synchronized {
+        val num = stageIdToNumTasks.getOrElse(stageId, 0)
+        stageIdToNumTasks.update(stageId, num + 1)
+      }
+    }
+
--- End diff --
--- End diff --

When a new task attempt is resubmitted, it should be added to 
```stageIdToNumTasks```.





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-06-15 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6817#discussion_r32399782
  
--- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -553,12 +562,13 @@ private[spark] class ExecutorAllocationManager(
     }
 
     // If this is the last pending task, mark the scheduler queue as empty
-    stageIdToTaskIndices.getOrElseUpdate(stageId, new mutable.HashSet[Int]) += taskIndex
+    stageIdToTaskIndices.getOrElseUpdate(stageId, new mutable.HashSet[String]) += (taskIndex + "." + attemptId)
--- End diff --

If we use ```taskIndex```, then when the new task (attempt id not 0) starts, 
```stageIdToTaskIndices``` will not be updated, because the ```taskIndex``` is 
the same as that of the earlier failed task.





[GitHub] spark pull request: [SPARK-8366] When tasks failed and append new ...

2015-06-15 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/6817#discussion_r32399543
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -732,6 +731,8 @@ private[spark] class TaskSetManager(
         return
       }
     }
+    sched.dagScheduler.taskResubmit(tasks(index))
--- End diff --

When ```numFailures``` is bigger than ```maxTaskFailures```, there is no need 
to append a new task and submit it.
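
A simplified sketch of that ordering (names simplified from ```TaskSetManager```, which does much more): the resubmit notification sits after the failure-count check, so a task that has exhausted its retries aborts the task set instead of being re-appended:

```scala
def handleFailedTask(
    numFailures: Int,
    maxTaskFailures: Int,
    abort: String => Unit,
    resubmit: () => Unit): Unit = {
  if (numFailures >= maxTaskFailures) {
    abort(s"Task failed $numFailures times; giving up")
    return // the resubmit below is never reached
  }
  resubmit()
}
```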





[GitHub] spark pull request: When tasks failed and append new ones, post Sp...

2015-06-14 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/6817

When tasks failed and append new ones, post SparkListenerTaskResubmit event



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark SPARK-8366

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6817.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6817


commit ee673b52ae8f89011ed9c45e4eb050f7d8656c0b
Author: xutingjun 
Date:   2015-06-15T02:15:57Z

fix dynamic bug when task fails







[GitHub] spark pull request: [SPARK-7889] [UI] make sure click the "App ID"...

2015-06-08 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6545#issuecomment-110192207
  
Hi @squito, I think I need your help; I don't clearly know how to write this 
test.





[GitHub] spark pull request: [SPARK-7889] [UI] make sure click the "App ID"...

2015-06-08 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6545#issuecomment-110191346
  
@squito, I think you can help me; I don't clearly know how to write this 
test, thanks.





[GitHub] spark pull request: [SPARK-6973]remove skipped stage ID from compl...

2015-06-08 Thread XuTingjun
Github user XuTingjun closed the pull request at:

https://github.com/apache/spark/pull/5550





[GitHub] spark pull request: [SPARK-6973]remove skipped stage ID from compl...

2015-06-07 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5550#issuecomment-109820313
  
@srowen, the JIRA has been updated to resolved, so I think this patch can be 
merged, right?





[GitHub] spark pull request: [SPARK-8099] set executor cores into system in...

2015-06-05 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6643#issuecomment-109247456
  
please retest





[GitHub] spark pull request: [SPARK-8099] set executor cores into system in...

2015-06-04 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6643#issuecomment-109142206
  
Thanks all, I have updated the code, please have a look again.





[GitHub] spark pull request: set executor cores into system

2015-06-04 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/6643

set executor cores into system



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark SPARK-8099

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6643.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6643


commit 0600861f7e77062be3caab22899547d916bb3e06
Author: xutingjun 
Date:   2015-06-04T12:38:26Z

set executor cores into system







[GitHub] spark pull request: [SPARK-7889] [UI] make sure click the "App ID"...

2015-06-03 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6545#issuecomment-108708777
  
I think the reason is not just the appCache. After debugging the code, I found 
two main reasons:
1. The first time a *"/history/appid"* request arrives, the *handler 
"/history/*"* deals with it. While doing so, that handler adds the SparkUI's 
handlers into the history server, including a *handler "/history/appid"*. So 
the second time a *"/history/appid"* request arrives, the *handler 
"/history/appid"* deals with it instead of the *handler "/history/*"*, and the 
second UI is the same as the first one;

2. In the *handler "/history/*"*, the code uses `appCache.get()` to cache the 
app UI. I think we should use `appCache.refresh()` instead, so that it can 
refresh.
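
A sketch of the `get()` vs `refresh()` distinction on a Guava `LoadingCache` (the history server's appCache is such a cache); the key type and loader body here are simplified stand-ins:

```scala
import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache}

val appCache: LoadingCache[String, String] = CacheBuilder.newBuilder()
  .maximumSize(50) // cf. spark.history.retainedApplications
  .build(new CacheLoader[String, String] {
    override def load(appId: String): String = s"SparkUI rebuilt for $appId"
  })

appCache.get("app-1")     // loads once, then keeps serving the cached UI
appCache.refresh("app-1") // re-invokes the loader, picking up new event data
```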





[GitHub] spark pull request: [SPARK-7889] [UI] make sure click the "App ID"...

2015-06-02 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6545#issuecomment-107912763
  
@tsudukim Sorry, I don't agree with you; I don't think 
"spark.history.retainedApplications" can be 0.





[GitHub] spark pull request: [SPARK-7889] [UI] make sure click the "App ID"...

2015-06-01 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/6545#issuecomment-107774238
  
@tsudukim hey, I think this bug was introduced by #3467, can you have a look?





[GitHub] spark pull request: [SPARK-7889] make sure click the "App ID" on H...

2015-05-31 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/6545

[SPARK-7889] make sure that clicking the "App ID" on HistoryPage refreshes the 
SparkUI.

The bug: when clicking the app on the incomplete page, the task progress shows 
100/2000. After the app finishes, clicking into the app again still shows the 
task progress as 100/2000.

In this patch I override the handle function in UIHandlerCollection. I just 
add lines 137-151, to make sure that every time the "App ID" is clicked, the 
HistoryPage reads the event file on HDFS and refreshes the SparkUI.

There may be two problems:
1. Though this makes the SparkUI refreshable, the refresh only takes effect 
when clicking the "App ID". Once inside the SparkUI, refreshing changes nothing.
2. Every time the "App ID" is clicked, the HDFS file is read, so responses 
will be slower and there is a risk of being attacked.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/XuTingjun/spark SPARK-7889

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6545.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6545


commit 9d5c1baa4acfe66e04dc58a4ece5a97792db8749
Author: Xutingjun 
Date:   2015-06-01T02:01:29Z

To make sure every time click the "App ID" on HistoryPage, the SparkUI will 
be refreshed.







[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

2015-05-19 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5586#issuecomment-103777236
  
@deanchen Can you list the HBase configs needed on the client?





[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

2015-05-18 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5586#issuecomment-103314158
  
@deanchen, I use this patch, but HBase throws the exception below. Can you 
help me?
>java.io.IOException: No secret manager configured for token authentication
>    at org.apache.hadoop.hbase.security.token.TokenProvider.getAuthenticationToken(TokenProvider.java:110)
>    at org.apache.hadoop.hbase.protobuf.generated.AuthenticationProtos$AuthenticationService$1.getAuthenticationToken(AuthenticationProtos.java:4267)
>    at org.apache.hadoop.hbase.protobuf.generated.AuthenticationProtos$AuthenticationService.callMethod(AuthenticationProtos.java:4387)
>    at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7696)
>    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1877)
>    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1859)
>    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)
>    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2131)
>    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:102)
>    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>    at java.lang.Thread.run(Thread.java:745)





[GitHub] spark pull request: [SPARK-6973]remove skipped stage ID from compl...

2015-05-11 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5550#issuecomment-100898085
  
@JoshRosen 





[GitHub] spark pull request: [SPARK-6973]remove skipped stage ID from compl...

2015-05-07 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5550#issuecomment-100074535
  
I agree with the opinion @srowen stated before. In my case, the job details 
page has one completed and one skipped stage, so I think decreasing the 
numerator is better.






[GitHub] spark pull request: [SPARK-6973]remove skipped stage ID from compl...

2015-05-03 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5550#issuecomment-98594154
  
@srowen Can you deal with this patch? Thanks!





[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

2015-04-29 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5586#issuecomment-97653791
  
These days I have been running a select command to read data from HBase with 
the beeline shell, and it always throws the exception:
>java.lang.IllegalStateException: unread block data
>    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2424)
>    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1383)
>    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
>    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
>    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
>    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
>    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:193)
>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>    at java.lang.Thread.run(Thread.java:745)

My cluster information is: /opt/jdk1.8.0_40, Hadoop 2.6.0, HBase 1.0.0, 
ZooKeeper 3.5.0.






[GitHub] spark pull request: [SPARK-6973]remove skipped stage ID from compl...

2015-04-23 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5550#issuecomment-95767885
  
@JoshRosen Can you take a look at this? We need your opinion.





[GitHub] spark pull request: [SPARK-6973]remove skipped stage ID from compl...

2015-04-23 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5550#issuecomment-95559213
  
I have tested it, and it fixed my problem.





[GitHub] spark pull request: [SPARK-6973]remove skipped stage ID from compl...

2015-04-23 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5550#issuecomment-9147
  
@srowen I have updated the code, please have a look.





[GitHub] spark pull request: [SPARK-6973]modify total stages/tasks on the a...

2015-04-23 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/5550#discussion_r28953413
  
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/UIData.scala ---
@@ -63,7 +64,7 @@ private[jobs] object UIData {
     /* Stages */
     var numActiveStages: Int = 0,
     // This needs to be a set instead of a simple count to prevent double-counting of rerun stages:
-    var completedStageIndices: OpenHashSet[Int] = new OpenHashSet[Int](),
+    var completedStageIndices: mutable.HashSet[Int] = new mutable.HashSet[Int](),
--- End diff --

Did you disagree with this? The `OpenHashSet` doesn't support a remove 
operation. Do you have any suggestion on this?
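
A small sketch of why the set type matters here: the fix retracts a stage ID from the completed set when the stage is retried or skipped, and Spark's `OpenHashSet` has no `remove()`, while `scala.collection.mutable.HashSet` does:

```scala
import scala.collection.mutable

val completedStageIndices = mutable.HashSet[Int]()
completedStageIndices += 3     // stage 3 completed once
completedStageIndices -= 3     // stage 3 resubmitted: retract it
completedStageIndices += 3     // counted again only when it re-completes
println(completedStageIndices) // HashSet(3)
```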





[GitHub] spark pull request: [SPARK-6973]modify total stages/tasks on the a...

2015-04-23 Thread XuTingjun
Github user XuTingjun commented on a diff in the pull request:

https://github.com/apache/spark/pull/5550#discussion_r28953027
  
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala ---
@@ -271,7 +271,9 @@ class JobProgressListener(conf: SparkConf) extends SparkListener with Logging {
     ) {
       jobData.numActiveStages -= 1
       if (stage.failureReason.isEmpty) {
-        jobData.completedStageIndices.add(stage.stageId)
+        if (!stage.submissionTime.isEmpty) {
--- End diff --

Yes, like the case described in the jira.





[GitHub] spark pull request: [SPARK-6973]modify total stages/tasks on the a...

2015-04-23 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5550#issuecomment-95526618
  
But I think I have done that in `onStageSubmitted`, removing a stage ID 
from the completed set the moment it is retried. Have you seen that?





[GitHub] spark pull request: [SPARK-6973]modify total stages/tasks on the a...

2015-04-22 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5550#issuecomment-95386082
  
@srowen I may be missing something. Your idea is that we should count the 
last retry's status of a stage into completed/total, right?





[GitHub] spark pull request: [SPARK-6973]modify total stages/tasks on the a...

2015-04-21 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5550#issuecomment-95008837
  
@srowen I have updated the patch code to delete the skipped stage from the 
completed set.





[GitHub] spark pull request: [SPARK-6918][YARN] Secure HBase support.

2015-04-19 Thread XuTingjun
Github user XuTingjun commented on the pull request:

https://github.com/apache/spark/pull/5586#issuecomment-94367970
  
Yeah, LGTM, I need this feature. We can put HBase's config into 
hbase-site.xml, right?





[GitHub] spark pull request: [SPARK-6973]modify total stages/tasks on the a...

2015-04-19 Thread XuTingjun
GitHub user XuTingjun reopened a pull request:

https://github.com/apache/spark/pull/5550

[SPARK-6973]modify total stages/tasks on the allJobsPage

Though totalStages = allStages - skippedStages is understandable, considering 
the problem in [SPARK-6973] I think totalStages = allStages is more 
reasonable. An item like "2/1 (2 failed) (1 skipped)" also shows the skipped 
count, so it is still understandable.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark allJobsPage

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5550.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5550


commit 47525c6138597a01a6cd2408b95b0fdd4387e0c5
Author: Xu Tingjun 
Date:   2015-04-17T06:29:41Z

modify total stages/tasks on the allJobsPage






