[GitHub] spark issue #16324: [SPARK-18910][SQL] Resolve failure to use UDF whose jar file...

2016-12-18 Thread shenh062326
Github user shenh062326 commented on the issue:

https://github.com/apache/spark/pull/16324
  
I'm sorry, @rxin, I don't understand what you mean.




[GitHub] spark issue #16324: [SPARK-18910][SQL] Resolve failure to use UDF whose jar file...

2016-12-17 Thread shenh062326
Github user shenh062326 commented on the issue:

https://github.com/apache/spark/pull/16324
  
Currently, we can create a UDF with a jar located in HDFS, but we fail when using it. 
The Spark driver doesn't download the jar from HDFS; it only adds the path to the 
classloader. 
If we don't support reading the UDF jar directly from HDFS, we should download the 
jar instead. 
I think supporting reading the UDF jar from HDFS is the better option.
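
To make the failure concrete, here is a minimal sketch of what goes wrong today (the jar path is hypothetical; only the hdfs scheme matters): without a URL stream handler registered for hdfs, the JVM cannot even construct the URL that would be handed to the classloader.

```scala
import java.net.{URL, URLClassLoader}

// Hypothetical jar path; the point is the scheme, not the location.
val jarPath = "hdfs://namenode:8020/user/udfs/my-udf.jar"

try {
  // Without a handler for "hdfs", new URL(...) throws before the
  // classloader ever gets a chance to read the jar.
  val loader = new URLClassLoader(Array(new URL(jarPath)))
  println(s"classloader created for $jarPath")
} catch {
  case e: java.net.MalformedURLException =>
    println(s"cannot use jar: ${e.getMessage}") // "unknown protocol: hdfs"
}
```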




[GitHub] spark issue #16324: [SPARK-18910][SQL] Resolve failure to use UDF whose jar file...

2016-12-17 Thread shenh062326
Github user shenh062326 commented on the issue:

https://github.com/apache/spark/pull/16324
  
Should we download the UDF jar from HDFS?




[GitHub] spark pull request #16324: Resolve failure to use UDF whose jar file is in HDFS.

2016-12-16 Thread shenh062326
GitHub user shenh062326 opened a pull request:

https://github.com/apache/spark/pull/16324

Resolve failure to use UDF whose jar file is in HDFS.

## What changes were proposed in this pull request?

In SparkContext, call the setURLStreamHandlerFactory method on URL with an instance 
of FsUrlStreamHandlerFactory, so that the JVM can resolve hdfs:// URLs and UDFs 
whose jar files live in HDFS no longer fail to load.
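
A minimal sketch of the registration this relies on, assuming Hadoop's FsUrlStreamHandlerFactory is on the driver classpath. The wrapper object HdfsUrlSupport is illustrative, not part of the patch; the patch itself makes the call inside SparkContext.

```scala
import java.net.URL
import org.apache.hadoop.fs.FsUrlStreamHandlerFactory

// Illustrative wrapper: URL.setURLStreamHandlerFactory may only be called
// once per JVM (a second call throws an Error), so guard the registration.
object HdfsUrlSupport {
  @volatile private var registered = false

  def register(): Unit = synchronized {
    if (!registered) {
      URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory())
      registered = true
    }
  }
}
```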


## How was this patch tested?

I have tested it in my cluster.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shenh062326/spark SPARK-18910

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16324.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16324


commit a91b08fcbdd76e02c2a244ea4ffca726339fdba8
Author: yuling 
Date:   2016-12-17T05:56:53Z

Resolve failure to use UDF whose jar file is in HDFS.






[GitHub] spark pull request #14557: [SPARK-16709][CORE] Kill the running task if stag...

2016-08-14 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/14557#discussion_r74714601
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -798,6 +798,19 @@ private[spark] class TaskSetManager(
       }
     }
     maybeFinishTaskSet()
+
+    // kill running task if stage failed
+    if (reason.isInstanceOf[FetchFailed]) {
+      killTasks(runningTasksSet, taskInfos)
+    }
+  }
+
+  def killTasks(tasks: HashSet[Long], taskInfo: HashMap[Long, TaskInfo]): Boolean = {
+    tasks.foreach { task =>
+      val executorId = taskInfo(task).executorId
+      sched.sc.schedulerBackend.killTask(task, executorId, true)
--- End diff --

Do you mean to add a parameter to the function?




[GitHub] spark pull request #14574: [SPARK-16985] Change dataFormat from yyyyMMddHHmm...

2016-08-09 Thread shenh062326
GitHub user shenh062326 opened a pull request:

https://github.com/apache/spark/pull/14574

[SPARK-16985] Change dataFormat from yyyyMMddHHmm to yyyyMMddHHmmss

## What changes were proposed in this pull request?

In our cluster, the SQL output is sometimes overwritten. When I submit several SQL 
jobs that all insert into the same table, each taking less than one minute, here is 
the detail:
1. sql1, 11:03, insert into table.
2. sql2, 11:04:11, insert into table.
3. sql3, 11:04:48, insert into table.
4. sql4, 11:05, insert into table.
5. sql5, 11:06, insert into table.

sql3's output file overwrites sql2's output file. Here is the log:
```
16/05/04 11:04:11 INFO hive.SparkHiveHadoopWriter: XXfinalPath=hdfs://tl-sng-gdt-nn-tdw.tencent-distribute.com:54310/tmp/assorz/tdw-tdwadmin/20160504/04559505496526517_-1_1204544348/1/_tmp.p_20160428/attempt_201605041104_0001_m_00_1

16/05/04 11:04:48 INFO hive.SparkHiveHadoopWriter: XXfinalPath=hdfs://tl-sng-gdt-nn-tdw.tencent-distribute.com:54310/tmp/assorz/tdw-tdwadmin/20160504/04559505496526517_-1_212180468/1/_tmp.p_20160428/attempt_201605041104_0001_m_00_1
```

The reason is that the output file name uses SimpleDateFormat("yyyyMMddHHmm"), so if 
two SQL jobs insert into the same table within the same minute, one output 
overwrites the other. I think we should change the date format to "yyyyMMddHHmmss"; 
in our cluster, no SQL job finishes within one second, so second resolution avoids 
the collision.
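
A small sketch showing the collision with the timestamps from the log above; the two format strings are the only Spark-specific part.

```scala
import java.text.SimpleDateFormat
import java.util.{Calendar, TimeZone}

val cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"))
cal.set(2016, Calendar.MAY, 4, 11, 4, 11)  // 11:04:11, like sql2
val t1 = cal.getTime
cal.set(Calendar.SECOND, 48)               // 11:04:48, like sql3
val t2 = cal.getTime

val minuteFmt = new SimpleDateFormat("yyyyMMddHHmm")   // current format
val secondFmt = new SimpleDateFormat("yyyyMMddHHmmss") // proposed format

println(minuteFmt.format(t1) == minuteFmt.format(t2)) // true: file names collide
println(secondFmt.format(t1) == secondFmt.format(t2)) // false: names stay distinct
```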


## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)


(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shenh062326/spark SPARK-16985

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14574.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14574


commit a385a2ac2a659d153532bef0748a0b1134687c8b
Author: hongshen 
Date:   2016-08-10T03:57:56Z

Change dataFormat from yyyyMMddHHmm to yyyyMMddHHmmss






[GitHub] spark pull request #14557: [SPARK-16709][CORE] Kill the running task if stag...

2016-08-09 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/14557#discussion_r74021599
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1564,6 +1564,14 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationClient
     }
   }
 
+  def killTasks(tasks: HashSet[Long], taskInfo: HashMap[Long, TaskInfo]): Boolean = {
--- End diff --

@jerryshao, thanks for the reminder. I will move the method to TaskSetManager.




[GitHub] spark pull request #14557: [SPARK-16709][CORE] Kill the running task if stag...

2016-08-08 Thread shenh062326
GitHub user shenh062326 opened a pull request:

https://github.com/apache/spark/pull/14557

[SPARK-16709][CORE] Kill the running task if stage failed

## What changes were proposed in this pull request?

As described in SPARK-16709, when a stage fails while one of its tasks is still 
running, the retried stage reruns that task; this can cause a 
TaskCommitDeniedException, and the task then retries forever.

Here is the log:
```
16/07/28 05:22:15 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 1.0 (TID 175, 10.215.146.81, partition 1,PROCESS_LOCAL, 1930 bytes)

16/07/28 05:28:35 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 1.1 (TID 207, 10.196.147.232, partition 1,PROCESS_LOCAL, 1930 bytes)

16/07/28 05:28:48 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 1.0 (TID 175) in 393261 ms on 10.215.146.81 (3/50)

16/07/28 05:34:11 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 1.1 (TID 207, 10.196.147.232): TaskCommitDenied (Driver denied task commit) for job: 1, partition: 1, attemptNumber: 207
```

The sequence of events (a sketch of the fix follows the list):
1. Task 1.0 in stage 1.0 starts.
2. Stage 1.0 fails; stage 1.1 starts.
3. Task 1.0 in stage 1.1 starts.
4. Task 1.0 in stage 1.0 finishes.
5. Task 1.0 in stage 1.1 fails with TaskCommitDenied, then retries forever.
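
A minimal sketch of the idea behind the fix; SchedulerBackend and TaskInfo below are simplified stand-ins for Spark's real types, not the patch itself.

```scala
import scala.collection.mutable.{HashMap, HashSet}

// Stand-ins for Spark's real scheduler types.
trait SchedulerBackend {
  def killTask(taskId: Long, executorId: String, interruptThread: Boolean): Unit
}
case class TaskInfo(executorId: String)

// On a fetch failure, kill the task set's still-running tasks so the
// retried stage cannot race them for the output commit.
def killRunningTasks(
    backend: SchedulerBackend,
    runningTasks: HashSet[Long],
    taskInfos: HashMap[Long, TaskInfo]): Unit = {
  runningTasks.foreach { taskId =>
    backend.killTask(taskId, taskInfos(taskId).executorId, interruptThread = true)
  }
}
```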



## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)


(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shenh062326/spark SPARK-16709

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14557.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14557


commit 1a1ea2f598e7db4ab1b856b420dca36b796c2a1c
Author: hongshen 
Date:   2016-08-09T06:44:14Z

Kill the running task if stage failed.






[GitHub] spark pull request: [SPARK-13450][SQL] External spilling when join...

2016-02-25 Thread shenh062326
GitHub user shenh062326 opened a pull request:

https://github.com/apache/spark/pull/11386

[SPARK-13450][SQL] External spilling when join a lot of rows with the same 
key

  SortMergeJoin uses an ArrayBuffer[InternalRow] to store bufferedMatches; if 
the join has a lot of rows with the same key, it will throw an OutOfMemoryError.
  This change adds an ExternalAppendOnlyArrayBuffer to store bufferedMatches instead 
of an ArrayBuffer (a simplified sketch follows).
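
A minimal standalone sketch of the spilling idea; SpillableBuffer here is a simplified illustration, not Spark's ExternalAppendOnlyArrayBuffer, and it uses plain Java serialization instead of Spark's serializers.

```scala
import java.io._
import scala.collection.mutable.ArrayBuffer

// Keeps at most maxInMemory elements in memory; each time the threshold
// is reached, the in-memory chunk is serialized to a temp file and cleared.
class SpillableBuffer[T <: Serializable](maxInMemory: Int) {
  private val inMemory = new ArrayBuffer[T]
  private val spillFiles = new ArrayBuffer[File]

  def append(elem: T): Unit = {
    inMemory += elem
    if (inMemory.length >= maxInMemory) spill()
  }

  private def spill(): Unit = {
    val file = File.createTempFile("spill", ".bin")
    file.deleteOnExit()
    val out = new ObjectOutputStream(new BufferedOutputStream(new FileOutputStream(file)))
    try inMemory.foreach(out.writeObject) finally out.close()
    spillFiles += file
    inMemory.clear()
  }

  // Replays spilled chunks first (each file holds exactly maxInMemory
  // elements), then the current in-memory tail.
  def iterator: Iterator[T] = {
    val spilled = spillFiles.iterator.flatMap { f =>
      val in = new ObjectInputStream(new BufferedInputStream(new FileInputStream(f)))
      try Vector.fill(maxInMemory)(in.readObject().asInstanceOf[T]).iterator
      finally in.close()
    }
    spilled ++ inMemory.iterator
  }
}
```

With this shape, only one spilled chunk is deserialized at a time during iteration, so memory use stays bounded even when one key has millions of matching rows.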

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shenh062326/spark my_change6

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11386.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11386


commit b0c00f42c4a889aabb6a0edd25522c53df9f18ad
Author: hongshen 
Date:   2016-02-26T15:29:45Z

External spilling when join a lot of rows with the same key






[GitHub] spark pull request: [SPARK-10918] [CORE] Prevent task failed for e...

2015-10-04 Thread shenh062326
GitHub user shenh062326 opened a pull request:

https://github.com/apache/spark/pull/8975

[SPARK-10918] [CORE] Prevent task failed for executor kill by driver

  When dynamic allocation is enabled and an executor hits the idle timeout, the 
driver kills it. If a task is offered to that executor at the same time, the task 
fails due to the lost executor.
  

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shenh062326/spark my_change20151005

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8975.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8975


commit 88c9c3ef407cecbe46ced9411d1d14ff70752d65
Author: hongshen 
Date:   2015-10-05T02:12:50Z

Prevent task failed for executor kill by driver






[GitHub] spark pull request: [SPARK-6738] [CORE] Improve estimate the size ...

2015-04-27 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/5608#issuecomment-96870004
  
@srowen  @mateiz  
Thanks for your review.




[GitHub] spark pull request: [SPARK-6738] [CORE] Improve estimate the size ...

2015-04-26 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/5608#issuecomment-96454990
  
I don't know why it hasn't started the build automatically.




[GitHub] spark pull request: [SPARK-6738] [CORE] Improve estimate the size ...

2015-04-26 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/5608#issuecomment-96369283
  
Thanks, I will fix it.




[GitHub] spark pull request: [SPARK-6738] [CORE] Improve estimate the size ...

2015-04-25 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5608#discussion_r29107048
  
--- Diff: core/src/main/scala/org/apache/spark/util/SizeEstimator.scala ---
@@ -204,25 +204,36 @@ private[spark] object SizeEstimator extends Logging {
         }
       } else {
         // Estimate the size of a large array by sampling elements without replacement.
-        var size = 0.0
+        // To exclude the shared objects that the array elements may link to, sample twice
+        // and use the min one to calculate the array size.
         val rand = new Random(42)
-        val drawn = new OpenHashSet[Int](ARRAY_SAMPLE_SIZE)
-        var numElementsDrawn = 0
-        while (numElementsDrawn < ARRAY_SAMPLE_SIZE) {
-          var index = 0
-          do {
-            index = rand.nextInt(length)
-          } while (drawn.contains(index))
-          drawn.add(index)
-          val elem = ScalaRunTime.array_apply(array, index).asInstanceOf[AnyRef]
-          size += SizeEstimator.estimate(elem, state.visited)
-          numElementsDrawn += 1
-        }
-        state.size += ((length / (ARRAY_SAMPLE_SIZE * 1.0)) * size).toLong
+        val drawn = new OpenHashSet[Int](2 * ARRAY_SAMPLE_SIZE)
+        val s1 = sampleArray(array, state, rand, drawn, length)
+        val s2 = sampleArray(array, state, rand, drawn, length)
+        val size = math.min(s1, s2)
+        state.size += math.max(s1, s2) +
+          (size * ((length - ARRAY_SAMPLE_SIZE) / (ARRAY_SAMPLE_SIZE))).toLong
       }
     }
   }
 
+  private def sampleArray(array: AnyRef, state: SearchState,
--- End diff --

OK, thanks for the review.




[GitHub] spark pull request: [SPARK-6738] [CORE] Improve estimate the size ...

2015-04-24 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5608#discussion_r29097662
  
--- Diff: core/src/main/scala/org/apache/spark/util/SizeEstimator.scala ---
@@ -204,25 +204,36 @@ private[spark] object SizeEstimator extends Logging {
         }
       } else {
         // Estimate the size of a large array by sampling elements without replacement.
-        var size = 0.0
+        // To exclude the shared objects that the array elements may link to, sample twice
+        // and use the min one to calculate the array size.
         val rand = new Random(42)
-        val drawn = new OpenHashSet[Int](ARRAY_SAMPLE_SIZE)
-        var numElementsDrawn = 0
-        while (numElementsDrawn < ARRAY_SAMPLE_SIZE) {
-          var index = 0
-          do {
-            index = rand.nextInt(length)
-          } while (drawn.contains(index))
-          drawn.add(index)
-          val elem = ScalaRunTime.array_apply(array, index).asInstanceOf[AnyRef]
-          size += SizeEstimator.estimate(elem, state.visited)
-          numElementsDrawn += 1
-        }
-        state.size += ((length / (ARRAY_SAMPLE_SIZE * 1.0)) * size).toLong
+        val drawn = new OpenHashSet[Int](2 * ARRAY_SAMPLE_SIZE)
--- End diff --

If the array size is >= 400, we only have to sample 100 distinct elements from the 
array, twice; the sampling loop becomes:

+    for (i <- 0 until ARRAY_SAMPLE_SIZE) {






[GitHub] spark pull request: [SPARK-6738] [CORE] Improve estimate the size ...

2015-04-24 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/5608#issuecomment-95908120
  
The sampling strategy does not always work, but sampling twice is more effective 
than only discarding the first non-null sample, and sampling 200 elements will not 
cause performance issues. 
If you think the code shouldn't be written like that, I agree; I will change it.




[GitHub] spark pull request: [SPARK-6738] [CORE] Improve estimate the size ...

2015-04-24 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/5608#issuecomment-95904531
  
@srowen 
The last assertResult I added in the test case covers a case where discarding the 
first non-null sample is not enough: half of the array elements do not link to the 
shared object, so if the first non-null sample (chosen at random) does not link to 
the shared object, we can't exclude it. But if we sample twice, it works even when 
one of the two samples fails to exclude the shared object.




[GitHub] spark pull request: [SPARK-6738] [CORE] Improve estimate the size ...

2015-04-23 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/5608#issuecomment-95767900
  
It seems to always work in my cluster; at least I have not found a case where it 
doesn't. But if I change to the simpler approach, sometimes it doesn't work. 




[GitHub] spark pull request: [SPARK-6738] [CORE] Improve estimate the size ...

2015-04-22 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/5608#issuecomment-95388189
  
@srowen  
At first, I also wanted to exclude shared objects by discarding the first non-null 
sample, but that does not always work, since not all of the objects link to the 
shared objects.




[GitHub] spark pull request: [SPARK-6738] [CORE] Improve estimate the size ...

2015-04-21 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/5608#issuecomment-94995027
  
@mateiz 
In most cases, the first sample's size includes the shared objects and the second 
one's does not. But if the array is large and has only a few non-null objects, it 
can happen that the second sample includes the shared objects while the first does 
not.

In order to exclude the shared objects, I use the smaller of the two sizes to 
calculate the result. It does not go wrong when there are no shared objects; I have 
tested that.
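
A standalone sketch of the estimate; sampleSum is a stand-in for summing the estimated sizes of sampleSize distinct, previously undrawn elements, so a shared object is counted only by whichever sample reaches it first.

```scala
// The smaller of the two sample sums approximates the purely per-element
// (unshared) cost; the larger one accounts for the sampled elements plus
// the shared objects, which are counted once overall.
def estimateArraySize(length: Int, sampleSize: Int, sampleSum: () => Long): Long = {
  val s1 = sampleSum()
  val s2 = sampleSum()
  val minSample = math.min(s1, s2)
  // max(s1, s2) covers the sampled elements (shared objects included once);
  // the min extrapolates the per-element cost over the unsampled remainder.
  math.max(s1, s2) + minSample * ((length - sampleSize) / sampleSize)
}
```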




[GitHub] spark pull request: [SPARK-6738] [CORE] Improve estimate the size ...

2015-04-21 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/5608#issuecomment-94754171
  
No, the change has nothing to do with the check for null.
If the array size is > 200 and the elements link to a shared object, 
SizeEstimator.visitArray is not correct.
For example, take arraySize = 2*10^6, where every element links to a 100MB shared 
object and holds 50B of unshared data. The true size is 
2*10^6 * 50B + 100MB ≈ 2*10^8 B, but currently SizeEstimator.visitArray will return 
about (100 * 50B + 100MB) * (2*10^6 / 100) ≈ 2*10^12 B, more than 100 times the 
true size. Furthermore, the larger the array, the greater the multiplier.





[GitHub] spark pull request: [SPARK-6738] [CORE] Improve estimate the size ...

2015-04-21 Thread shenh062326
GitHub user shenh062326 opened a pull request:

https://github.com/apache/spark/pull/5608

[SPARK-6738] [CORE] Improve estimate the size of a large array

Currently, SizeEstimator.visitArray is not correct in the following case: the array 
size is > 200 and the elements link to a shared object.

When I add a debug log in SizeTracker.scala:

    logInfo(s"numUpdates:$numUpdates, size:$ts, bytesPerUpdate:$bytesPerUpdate, cost time:$b")

I got the following log:
 numUpdates:1, size:262448, bytesPerUpdate:0.0, cost time:35
 numUpdates:2, size:420698, bytesPerUpdate:158250.0, cost time:35
 numUpdates:4, size:420754, bytesPerUpdate:28.0, cost time:32
 numUpdates:7, size:420754, bytesPerUpdate:0.0, cost time:27
 numUpdates:12, size:420754, bytesPerUpdate:0.0, cost time:28
 numUpdates:20, size:420754, bytesPerUpdate:0.0, cost time:25
 numUpdates:32, size:420754, bytesPerUpdate:0.0, cost time:21
 numUpdates:52, size:420754, bytesPerUpdate:0.0, cost time:20
 numUpdates:84, size:420754, bytesPerUpdate:0.0, cost time:20
 numUpdates:135, size:420754, bytesPerUpdate:0.0, cost time:20
 numUpdates:216, size:420754, bytesPerUpdate:0.0, cost time:11
 numUpdates:346, size:420754, bytesPerUpdate:0.0, cost time:6
 numUpdates:554, size:488911, bytesPerUpdate:327.67788461538464, cost time:8
 numUpdates:887, size:2312259426, bytesPerUpdate:6942253.798798799, cost time:198
15/04/21 14:27:26 INFO collection.ExternalAppendOnlyMap: Thread 51 spilling in-memory map of 3.0 GB to disk (1 time so far)
15/04/21 14:27:26 INFO collection.ExternalAppendOnlyMap: /data11/yarnenv/local/usercache/spark/appcache/application_1426746631567_11745/spark-local-20150421142719-c001/30/temp_local_066af981-c2fc-4b70-a00e-110e23006fbc

But in fact the file size is only 162K:

$ ll -h /data11/yarnenv/local/usercache/spark/appcache/application_1426746631567_11745/spark-local-20150421142719-c001/30/temp_local_066af981-c2fc-4b70-a00e-110e23006fbc
-rw-r- 1 spark users 162K Apr 21 14:27 /data11/yarnenv/local/usercache/spark/appcache/application_1426746631567_11745/spark-local-20150421142719-c001/30/temp_local_066af981-c2fc-4b70-a00e-110e23006fbc


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shenh062326/spark my_change5

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5608.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5608


commit 4c28e36f94cef254655d76ee9e290483943e4ba8
Author: Hong Shen 
Date:   2015-04-21T08:40:29Z

Improve estimate the size of a large array






[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-26 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/4363#issuecomment-76145260
  
Sorry for the late reply; I will change it.




[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-26 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4363#discussion_r25413198
  
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,84 @@
 
 package org.apache.spark
 
-import akka.actor.Actor
+import scala.concurrent.duration._
+import scala.collection.mutable
+
+import akka.actor.{Actor, Cancellable}
+
 import org.apache.spark.executor.TaskMetrics
 import org.apache.spark.storage.BlockManagerId
-import org.apache.spark.scheduler.TaskScheduler
+import org.apache.spark.scheduler.{SlaveLost, TaskScheduler}
 import org.apache.spark.util.ActorLogReceive
 
 /**
  * A heartbeat from executors to the driver. This is a shared message used by several internal
- * components to convey liveness or execution information for in-progress tasks.
+ * components to convey liveness or execution information for in-progress tasks. It will also
+ * expire the hosts that have not heartbeated for more than spark.driver.executorTimeoutMs.
  */
 private[spark] case class Heartbeat(
     executorId: String,
     taskMetrics: Array[(Long, TaskMetrics)], // taskId -> TaskMetrics
     blockManagerId: BlockManagerId)
 
+private[spark] case object ExpireDeadHosts
+
 private[spark] case class HeartbeatResponse(reregisterBlockManager: Boolean)
 
 /**
  * Lives in the driver to receive heartbeats from executors..
  */
-private[spark] class HeartbeatReceiver(scheduler: TaskScheduler)
+private[spark] class HeartbeatReceiver(sc: SparkContext, scheduler: TaskScheduler)
--- End diff --

It would not be easy to understand; on the other hand, SparkContext is used in a 
lot of places.




[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-12 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4363#discussion_r24574765
  
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,84 @@
 
 package org.apache.spark
 
-import akka.actor.Actor
+import scala.concurrent.duration._
+import scala.collection.mutable
+
+import akka.actor.{Actor, Cancellable}
+
 import org.apache.spark.executor.TaskMetrics
 import org.apache.spark.storage.BlockManagerId
-import org.apache.spark.scheduler.TaskScheduler
+import org.apache.spark.scheduler.{SlaveLost, TaskScheduler}
 import org.apache.spark.util.ActorLogReceive
 
 /**
  * A heartbeat from executors to the driver. This is a shared message used by several internal
- * components to convey liveness or execution information for in-progress tasks.
+ * components to convey liveness or execution information for in-progress tasks. It will also
+ * expire the hosts that have not heartbeated for more than spark.driver.executorTimeoutMs.
  */
 private[spark] case class Heartbeat(
     executorId: String,
     taskMetrics: Array[(Long, TaskMetrics)], // taskId -> TaskMetrics
     blockManagerId: BlockManagerId)
 
+private[spark] case object ExpireDeadHosts
+
 private[spark] case class HeartbeatResponse(reregisterBlockManager: Boolean)
 
 /**
  * Lives in the driver to receive heartbeats from executors..
  */
-private[spark] class HeartbeatReceiver(scheduler: TaskScheduler)
+private[spark] class HeartbeatReceiver(sc: SparkContext, scheduler: TaskScheduler)
   extends Actor with ActorLogReceive with Logging {
 
+  val executorLastSeen = new mutable.HashMap[String, Long]
+
+  val executorTimeout = sc.conf.getLong("spark.driver.executorTimeoutMs",
+    sc.conf.getLong("spark.storage.blockManagerSlaveTimeoutMs", 120 * 1000))
+
+  val checkTimeoutInterval = sc.conf.getLong("spark.driver.executorTimeoutIntervalMs",
+    sc.conf.getLong("spark.storage.blockManagerTimeoutIntervalMs", 60000))
+
+  var timeoutCheckingTask: Cancellable = null
+
+  override def preStart(): Unit = {
+    import context.dispatcher
+    timeoutCheckingTask = context.system.scheduler.schedule(0.seconds,
+      checkTimeoutInterval.milliseconds, self, ExpireDeadHosts)
+    super.preStart()
+  }
+
   override def receiveWithLogging = {
     case Heartbeat(executorId, taskMetrics, blockManagerId) =>
       val response = HeartbeatResponse(
         !scheduler.executorHeartbeatReceived(executorId, taskMetrics, blockManagerId))
+      executorLastSeen(executorId) = System.currentTimeMillis()
       sender ! response
+    case ExpireDeadHosts =>
+      expireDeadHosts()
   }
+
+  private def expireDeadHosts(): Unit = {
+    logTrace("Checking for hosts with no recent heartbeats in HeartbeatReceiver.")
+    val now = System.currentTimeMillis()
+    val minSeenTime = now - executorTimeout
+    for ((executorId, lastSeenMs) <- executorLastSeen) {
+      if (lastSeenMs < minSeenTime) {
+        logWarning(s"Removing executor $executorId with no recent heartbeats: " +
+          s"${now - lastSeenMs} ms exceeds timeout $executorTimeout ms")
+        scheduler.executorLost(executorId, SlaveLost())
+        if (sc.supportKillExecutor()) {
+          sc.killExecutor(executorId)
+        }
+        executorLastSeen.remove(executorId)
--- End diff --

Because the Akka connection is still alive, we can kill the executor by sending a 
kill message to the ApplicationMaster.




[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-11 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/4363#issuecomment-74023636
  
Hi @sryza, I think this pull request is OK now. Can you merge it into master?




[GitHub] spark pull request: [SPARK-5736][Web UI]Add executor log url to Ex...

2015-02-11 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/4529#issuecomment-73857999
  
Hi @srowen.
We just want to read the executor logs from the UI. Is there an easy way to add the 
executor log URL to the UI?




[GitHub] spark pull request: [SPARK-5736][Web UI]Add executor log url to Ex...

2015-02-11 Thread shenh062326
GitHub user shenh062326 opened a pull request:

https://github.com/apache/spark/pull/4529

[SPARK-5736][Web UI]Add executor log url to Executors page on Yarn

Currently, there is no executor log URL in the Spark UI (on YARN); we have to read 
executor logs by logging in to the machine the executor runs on. I think we should 
add the executor log URL to the Executors page.
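
For reference, a sketch of the URL shape involved: on YARN, the NodeManager serves container logs from its containerlogs endpoint, and the host, port, container id, and user below are placeholders, not values produced by this patch.

```scala
// Hypothetical helper: builds the NodeManager containerlogs URL
// (8042 is the default NodeManager HTTP port).
def executorLogUrl(nmHost: String, nmHttpPort: Int, containerId: String, user: String): String =
  s"http://$nmHost:$nmHttpPort/node/containerlogs/$containerId/$user"

// e.g. executorLogUrl("nm01", 8042, "container_1426746631567_11745_01_000002", "spark")
```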

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shenh062326/spark my_change4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4529.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4529


commit 078ca723802c9a824befff023f4e4b9ba9986253
Author: Hong Shen 
Date:   2015-02-11T09:02:13Z

Add executor log url to Executors page on Yarn






[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-09 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/4363#issuecomment-73638561
  
The failed tests are unrelated to this patch.




[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-09 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4363#discussion_r24381524
  
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,82 @@
 
 package org.apache.spark
 
-import akka.actor.Actor
+import scala.concurrent.duration._
+import scala.collection.mutable
+
+import akka.actor.{Actor, Cancellable}
+
 import org.apache.spark.executor.TaskMetrics
 import org.apache.spark.storage.BlockManagerId
-import org.apache.spark.scheduler.TaskScheduler
+import org.apache.spark.scheduler.{SlaveLost, TaskScheduler}
 import org.apache.spark.util.ActorLogReceive
 
 /**
  * A heartbeat from executors to the driver. This is a shared message used by several internal
- * components to convey liveness or execution information for in-progress tasks.
+ * components to convey liveness or execution information for in-progress tasks. It will also
+ * expire the hosts that have not heartbeated for more than spark.driver.executorTimeoutMs.
  */
 private[spark] case class Heartbeat(
     executorId: String,
     taskMetrics: Array[(Long, TaskMetrics)], // taskId -> TaskMetrics
     blockManagerId: BlockManagerId)
 
+private[spark] case object ExpireDeadHosts
+
 private[spark] case class HeartbeatResponse(reregisterBlockManager: Boolean)
 
 /**
  * Lives in the driver to receive heartbeats from executors..
  */
-private[spark] class HeartbeatReceiver(scheduler: TaskScheduler)
+private[spark] class HeartbeatReceiver(sc: SparkContext, scheduler: TaskScheduler)
   extends Actor with ActorLogReceive with Logging {
 
+  val executorLastSeen = new mutable.HashMap[String, Long]
+
+  val executorTimeout = sc.conf.getLong("spark.driver.executorTimeoutMs",
+    sc.conf.getLong("spark.storage.blockManagerSlaveTimeoutMs", 120 * 1000))
+
+  val checkTimeoutInterval = sc.conf.getLong("spark.driver.executorTimeoutIntervalMs",
+    sc.conf.getLong("spark.storage.blockManagerTimeoutIntervalMs", 60000))
+
+  var timeoutCheckingTask: Cancellable = null
+
+  override def preStart(): Unit = {
+    import context.dispatcher
+    timeoutCheckingTask = context.system.scheduler.schedule(0.seconds,
+      checkTimeoutInterval.milliseconds, self, ExpireDeadHosts)
+    super.preStart
--- End diff --

Ok, I will change it.




[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-09 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/4363#issuecomment-73625458
  
Hi @andrewor14 , @sryza and @rxin. Thanks. I agree with your views. I will 
change sc.killExecutor to not throw an assertion error.





[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-06 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/4363#issuecomment-73220183
  
scheduler.executorLost(executorId, SlaveLost()) will call 
BlockManagerMasterActor.removeBlockManager; the call stack is:
HeartbeatReceiver.expireDeadHosts
TaskSchedulerImpl.executorLost
DAGScheduler.executorLost
DAGScheduler.handleExecutorLost
blockManagerMaster.removeExecutor
BlockManagerMasterActor.removeExecutor
BlockManagerMasterActor.removeBlockManager

Regarding "The documentation needs to be updated. I'm happy to provide wording if 
it would be helpful": which documentation do you mean?






[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-05 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4363#discussion_r24216671
  
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,85 @@
 
 package org.apache.spark
 
-import akka.actor.Actor
+import scala.concurrent.duration._
+import scala.collection.mutable
+
+import akka.actor.{Actor, Cancellable}
+
 import org.apache.spark.executor.TaskMetrics
 import org.apache.spark.storage.BlockManagerId
-import org.apache.spark.scheduler.TaskScheduler
+import org.apache.spark.scheduler.{SlaveLost, TaskScheduler}
 import org.apache.spark.util.ActorLogReceive
 
 /**
  * A heartbeat from executors to the driver. This is a shared message used by several internal
- * components to convey liveness or execution information for in-progress tasks.
+ * components to convey liveness or execution information for in-progress tasks. It will also
+ * expiry the hosts that have no heartbeat for more than spark.executor.heartbeat.timeoutMs.
  */
 private[spark] case class Heartbeat(
     executorId: String,
     taskMetrics: Array[(Long, TaskMetrics)], // taskId -> TaskMetrics
     blockManagerId: BlockManagerId)
 
+private[spark] case object ExpireDeadHosts
+
 private[spark] case class HeartbeatResponse(reregisterBlockManager: Boolean)
 
 /**
  * Lives in the driver to receive heartbeats from executors..
  */
-private[spark] class HeartbeatReceiver(scheduler: TaskScheduler)
+private[spark] class HeartbeatReceiver(sc: SparkContext, scheduler: TaskScheduler)
   extends Actor with ActorLogReceive with Logging {
 
+  val executorLastSeen = new mutable.HashMap[String, Long]
+
+  val slaveTimeout = sc.conf.getLong("spark.executor.heartbeat.timeoutMs", 120 * 1000)
--- End diff --

OK, Thanks.




[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-05 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4363#discussion_r24215867
  
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,85 @@
 
 package org.apache.spark
 
-import akka.actor.Actor
+import scala.concurrent.duration._
+import scala.collection.mutable
+
+import akka.actor.{Actor, Cancellable}
+
 import org.apache.spark.executor.TaskMetrics
 import org.apache.spark.storage.BlockManagerId
-import org.apache.spark.scheduler.TaskScheduler
+import org.apache.spark.scheduler.{SlaveLost, TaskScheduler}
 import org.apache.spark.util.ActorLogReceive
 
 /**
  * A heartbeat from executors to the driver. This is a shared message used by several internal
- * components to convey liveness or execution information for in-progress tasks.
+ * components to convey liveness or execution information for in-progress tasks. It will also
+ * expiry the hosts that have no heartbeat for more than spark.executor.heartbeat.timeoutMs.
  */
 private[spark] case class Heartbeat(
     executorId: String,
     taskMetrics: Array[(Long, TaskMetrics)], // taskId -> TaskMetrics
     blockManagerId: BlockManagerId)
 
+private[spark] case object ExpireDeadHosts
+
 private[spark] case class HeartbeatResponse(reregisterBlockManager: Boolean)
 
 /**
  * Lives in the driver to receive heartbeats from executors..
  */
-private[spark] class HeartbeatReceiver(scheduler: TaskScheduler)
+private[spark] class HeartbeatReceiver(sc: SparkContext, scheduler: TaskScheduler)
   extends Actor with ActorLogReceive with Logging {
 
+  val executorLastSeen = new mutable.HashMap[String, Long]
+
+  val slaveTimeout = sc.conf.getLong("spark.executor.heartbeat.timeoutMs", 120 * 1000)
--- End diff --

Do you mean spark.storage.blockManagerSlaveTimeoutMs and 
spark.storage.blockManagerTimeoutIntervalMs?
Should we duplicate these two configs?




[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-05 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4363#discussion_r24215268
  
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,85 @@
 
 package org.apache.spark
 
-import akka.actor.Actor
+import scala.concurrent.duration._
+import scala.collection.mutable
+
+import akka.actor.{Actor, Cancellable}
+
 import org.apache.spark.executor.TaskMetrics
 import org.apache.spark.storage.BlockManagerId
-import org.apache.spark.scheduler.TaskScheduler
+import org.apache.spark.scheduler.{SlaveLost, TaskScheduler}
 import org.apache.spark.util.ActorLogReceive
 
 /**
  * A heartbeat from executors to the driver. This is a shared message used by several internal
- * components to convey liveness or execution information for in-progress tasks.
+ * components to convey liveness or execution information for in-progress tasks. It will also
+ * expiry the hosts that have no heartbeat for more than spark.executor.heartbeat.timeoutMs.
--- End diff --

OK, I will change it.




[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-04 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/4363#issuecomment-73006178
  
The failed test case is unrelated to this patch.




[GitHub] spark pull request: [SPARK-5529][CORE]Add expireDeadHosts in Heart...

2015-02-04 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4363#discussion_r24138722
  
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -32,18 +33,56 @@ private[spark] case class Heartbeat(
     taskMetrics: Array[(Long, TaskMetrics)], // taskId -> TaskMetrics
     blockManagerId: BlockManagerId)
 
+private[spark] case object ExpireDeadHosts
+
 private[spark] case class HeartbeatResponse(reregisterBlockManager: Boolean)
 
 /**
  * Lives in the driver to receive heartbeats from executors..
--- End diff --

Hi @sryza, thanks for your review.




[GitHub] spark pull request: Add expireDeadHosts in HeartbeatReceiver

2015-02-04 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/4363#issuecomment-72826509
  
add [SPARK-5529]




[GitHub] spark pull request: Add expireDeadHosts in HeartbeatReceiver

2015-02-04 Thread shenh062326
GitHub user shenh062326 opened a pull request:

https://github.com/apache/spark/pull/4363

Add expireDeadHosts in HeartbeatReceiver

If a BlockManager has not sent a heartbeat for more than 120s, 
BlockManagerMasterActor will remove it. But CoarseGrainedSchedulerBackend can only 
remove an executor after a DisassociatedEvent. We should expire dead hosts in 
HeartbeatReceiver (a sketch of the expiry pass follows).
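
A minimal sketch of the expiry pass this PR adds, with the scheduler plumbing abstracted away; onExpired stands in for scheduler.executorLost plus the optional executor kill.

```scala
import scala.collection.mutable

// Walk the last-seen map and expire executors whose most recent heartbeat
// is older than the timeout.
def expireDeadHosts(
    executorLastSeen: mutable.HashMap[String, Long],
    executorTimeoutMs: Long)(onExpired: String => Unit): Unit = {
  val now = System.currentTimeMillis()
  // Collect first so we do not mutate the map while iterating over it.
  val expired = executorLastSeen.collect {
    case (executorId, lastSeenMs) if now - lastSeenMs > executorTimeoutMs => executorId
  }
  expired.foreach { executorId =>
    onExpired(executorId) // e.g. scheduler.executorLost(executorId, SlaveLost())
    executorLastSeen.remove(executorId)
  }
}
```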

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shenh062326/spark my_change3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4363.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4363


commit c922cb067606a0d99070e15a68943d22accb6c3d
Author: Hong Shen 
Date:   2015-02-04T09:41:35Z

Add expireDeadHosts in HeartbeatReceiver






[GitHub] spark pull request: [SPARK-4934][CORE] Print remote address in Con...

2015-01-24 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/4157#issuecomment-71348155
  
I think you are right; there's no need to change it.




[GitHub] spark pull request: SPARK-5199. Input metrics should show up for I...

2015-01-24 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/4050#issuecomment-71347965
  
If we use an InputFormat whose splits are not instances of 
org.apache.hadoop.mapreduce.lib.input.{CombineFileSplit, FileSplit}, then we can't 
get input-metrics information.
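
A sketch of the guard being discussed, assuming the metrics code only needs a byte count; splitBytes and its Option result are illustrative, not Spark's API.

```scala
import org.apache.hadoop.mapreduce.InputSplit
import org.apache.hadoop.mapreduce.lib.input.{CombineFileSplit, FileSplit}

// Input metrics can only report bytes read when the split type exposes a
// length, so unknown split types are skipped instead of failing the task.
def splitBytes(split: InputSplit): Option[Long] = split match {
  case fs: FileSplit         => Some(fs.getLength)
  case cfs: CombineFileSplit => Some(cfs.getLength)
  case _                     => None // e.g. a custom InputFormat's split type
}
```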




[GitHub] spark pull request: [SPARK-5347][CORE] Change FileSplit to InputSp...

2015-01-24 Thread shenh062326
Github user shenh062326 commented on the pull request:

https://github.com/apache/spark/pull/4150#issuecomment-71347933
  
If we use an InputFormat whose splits are not instances of 
org.apache.hadoop.mapreduce.lib.input.{CombineFileSplit, FileSplit}, then we 
can't get any input metrics.



[GitHub] spark pull request: [SPARK-4934][CORE] Print remote address in Con...

2015-01-22 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4157#discussion_r23370062
  
--- Diff: 
core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala ---
@@ -375,16 +375,22 @@ private[nio] class ConnectionManager(
        }
      }
    } else {
-      logInfo("Key not valid ? " + key)
+      logInfo("Key not valid ? " + key + " remote address: " +
+        key.channel().asInstanceOf[SocketChannel].socket
--- End diff --

Thanks, I will change it.



[GitHub] spark pull request: [SPARK-4934][CORE] Print remote address in Con...

2015-01-22 Thread shenh062326
GitHub user shenh062326 opened a pull request:

https://github.com/apache/spark/pull/4157

[SPARK-4934][CORE] Print remote address in ConnectionManager

The connection key in the log is hard to read: "key already cancelled ? 
sun.nio.ch.SelectionKeyImpl@52b0e278". 
It's hard to diagnose a problem from this log alone; it's better to also 
print the remote address.
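
A minimal sketch of how the remote address can be read off a SelectionKey 
for logging (illustrative helper, not the exact patch):

    import java.nio.channels.{SelectionKey, SocketChannel}

    // Resolve a readable peer address for a selection key, falling back
    // gracefully when the channel is not a socket (getRemoteSocketAddress
    // returns null for an unconnected socket; String.valueOf handles it).
    def remoteAddressOf(key: SelectionKey): String = key.channel() match {
      case sc: SocketChannel => String.valueOf(sc.socket().getRemoteSocketAddress)
      case other             => "non-socket channel: " + other.getClass.getName
    }

    // e.g. logInfo("Key not valid ? " + key + " remote address: " + remoteAddressOf(key))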

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shenh062326/spark my_change2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4157.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4157


commit e5ac73e3fd18c3bf5c1a32ce531b15be9feac385
Author: Hong Shen 
Date:   2015-01-22T07:53:47Z

Print remote address in ConnectionManager





[GitHub] spark pull request: [SPARK-5347][CORE] Change FileSplit to InputSp...

2015-01-21 Thread shenh062326
GitHub user shenh062326 opened a pull request:

https://github.com/apache/spark/pull/4150

[SPARK-5347][CORE] Change FileSplit to InputSplit in update inputMetrics

When inputFormatClass is set to CombineFileInputFormat, the input metrics 
show that the input is empty. This did not happen in spark-1.1.0. It's 
because in HadoopRDD, inputMetrics is only set when the split is an instance 
of FileSplit, but CombineFileInputFormat produces splits that are only 
InputSplit. It isn't necessary to require FileSplit; requiring InputSplit is 
enough.
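
A before/after sketch of the type test being relaxed (written against the 
old mapred API that HadoopRDD uses; illustrative, not the exact patch):

    import org.apache.hadoop.mapred.{FileSplit, InputSplit}

    // Before: only FileSplit is recognized, so CombineFileInputFormat's
    // splits fall through and the input metrics stay empty.
    def splitSizeBefore(split: InputSplit): Option[Long] = split match {
      case fs: FileSplit => Some(fs.getLength)
      case _             => None
    }

    // After: every InputSplit already reports its length, which is all
    // the metrics need, so the narrower type test is unnecessary.
    def splitSizeAfter(split: InputSplit): Option[Long] = Some(split.getLength)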


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shenh062326/spark my_change1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4150.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4150


commit 9e04a547115bbcb4c19b55b451ca3afe09955e9f
Author: Hong Shen 
Date:   2015-01-22T00:51:15Z

change FileSplit to InputSplit in update inputMetrics





[GitHub] spark pull request: [Spark Core] SPARK-4380 Edit spilling log from...

2014-11-13 Thread shenh062326
Github user shenh062326 commented on a diff in the pull request:

https://github.com/apache/spark/pull/3243#discussion_r20337096
  
--- Diff: 
core/src/main/scala/org/apache/spark/util/collection/Spillable.scala ---
@@ -105,7 +105,7 @@ private[spark] trait Spillable[C] {
     */
   @inline private def logSpillage(size: Long) {
     val threadId = Thread.currentThread().getId
-    logInfo("Thread %d spilling in-memory map of %d MB to disk (%d time%s so far)"
-      .format(threadId, size / (1024 * 1024), _spillCount, if (_spillCount > 1) "s" else ""))
+    logInfo("Thread %d spilling in-memory map of %d B to disk (%d time%s so far)"
--- End diff --

Thanks srowen, I'll change it to use Utils.bytesToString.
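
For reference, a self-contained approximation of the human-readable 
formatting Utils.bytesToString provides (not Spark's exact implementation):

    def bytesToString(size: Long): String = {
      val KB = 1L << 10
      val MB = 1L << 20
      val GB = 1L << 30
      val TB = 1L << 40
      if (size >= TB)      f"${size.toDouble / TB}%.1f TB"
      else if (size >= GB) f"${size.toDouble / GB}%.1f GB"
      else if (size >= MB) f"${size.toDouble / MB}%.1f MB"
      else if (size >= KB) f"${size.toDouble / KB}%.1f KB"
      else                 s"$size B"
    }

    // e.g. bytesToString(5L * 1024 * 1024) == "5.0 MB"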



[GitHub] spark pull request: [Spark Core] SPARK-4380 Edit spilling log from...

2014-11-13 Thread shenh062326
GitHub user shenh062326 opened a pull request:

https://github.com/apache/spark/pull/3243

[Spark Core] SPARK-4380 Edit spilling log from MB to B

https://issues.apache.org/jira/browse/SPARK-4380

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shenh062326/spark spark_change

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3243.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3243


commit 946351ce11f9178e952c72f23dd4f93891a15c15
Author: Hong Shen 
Date:   2014-11-13T12:38:09Z

Edit spilling log from MB to B



