Github user shenh062326 commented on the issue:
https://github.com/apache/spark/pull/16324
I'm sorry, @rxin, I don't understand what you mean.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project doe
Github user shenh062326 commented on the issue:
https://github.com/apache/spark/pull/16324
Currently, we can create a UDF with a jar in HDFS, but using it fails.
The Spark driver won't download the jar from HDFS; it only adds the path to the
classLoader.
If we don'
Github user shenh062326 commented on the issue:
https://github.com/apache/spark/pull/16324
Should we download the UDF jar from HDFS?
GitHub user shenh062326 opened a pull request:
https://github.com/apache/spark/pull/16324
Resolve failure to use a UDF whose jar file is in HDFS.
## What changes were proposed in this pull request?
In SparkContext, call the setURLStreamHandlerFactory method on URL with an
instance of
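The root cause the truncated description points at can be shown without Hadoop at all: a plain JVM has no protocol handler for the hdfs:// scheme, so a URLClassLoader cannot fetch a UDF jar from an hdfs:// URL until a stream-handler factory is registered once via URL.setURLStreamHandlerFactory. A minimal sketch of the symptom (the hdfs path below is illustrative, not from the PR):

```scala
import java.net.{MalformedURLException, URL}

object HdfsUrlDemo {
  // True if this JVM can construct a URL for the given spec,
  // i.e. a protocol handler for its scheme is registered.
  def canParse(spec: String): Boolean =
    try { new URL(spec); true } catch { case _: MalformedURLException => false }

  def main(args: Array[String]): Unit = {
    println(canParse("file:///tmp/a.jar"))    // built-in handler: parses fine
    println(canParse("hdfs://nn:8020/a.jar")) // no handler until a factory is registered
  }
}
```

Registering Hadoop's FsUrlStreamHandlerFactory flips the second case to true; the JVM allows that registration only once per process, which is presumably why the patch puts it in a central place like SparkContext.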
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14557#discussion_r74714601
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -798,6 +798,19 @@ private[spark] class TaskSetManager
GitHub user shenh062326 opened a pull request:
https://github.com/apache/spark/pull/14574
[SPARK-16985] Change dataFormat from MMddHHmm to MMddHHmmss
## What changes were proposed in this pull request?
In our cluster, the SQL output may sometimes be overridden. When I
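The collision described above can be reproduced with plain SimpleDateFormat: two queries started within the same minute produce the same directory name under a minute-resolution pattern, so the second overrides the first. The patterns below are illustrative stand-ins for the one the patch changes:

```scala
import java.text.SimpleDateFormat
import java.util.Date

object StagingDirNames {
  // Directory name for a query started at `d`, under the given timestamp pattern.
  def dirName(pattern: String, d: Date): String =
    new SimpleDateFormat(pattern).format(d)

  def main(args: Array[String]): Unit = {
    val t1 = new Date(3600000L)          // two query start times 30 seconds apart
    val t2 = new Date(3600000L + 30000L)

    // Minute resolution: the two names collide, so output is overridden.
    println(dirName("MM-dd-HH-mm", t1) == dirName("MM-dd-HH-mm", t2))       // true
    // Adding seconds (the MMddHHmmss change) keeps the names distinct.
    println(dirName("MM-dd-HH-mm-ss", t1) == dirName("MM-dd-HH-mm-ss", t2)) // false
  }
}
```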
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/14557#discussion_r74021599
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1564,6 +1564,14 @@ class SparkContext(config: SparkConf) extends
Logging with
GitHub user shenh062326 opened a pull request:
https://github.com/apache/spark/pull/14557
[SPARK-16709][CORE] Kill the running task if stage failed
## What changes were proposed in this pull request?
In SPARK-16709, when a stage has failed but the running task is still
GitHub user shenh062326 opened a pull request:
https://github.com/apache/spark/pull/11386
[SPARK-13450][SQL] External spilling when join a lot of rows with the same
key
SortMergeJoin uses an ArrayBuffer[InternalRow] to store bufferedMatches; if
the join has a lot of rows with the
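The spilling idea behind this PR can be sketched in miniature: buffer rows in memory up to a threshold, write full batches to temporary files, and replay disk batches before the in-memory tail when iterating. This is a toy, not Spark's actual external-spilling machinery, and the names are made up for illustration:

```scala
import java.io.{File, FileInputStream, FileOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.collection.mutable.ArrayBuffer

// Toy spillable buffer: keeps at most `memoryLimit` rows in memory,
// spilling full batches to temp files via Java serialization.
class SpillableBuffer[T <: java.io.Serializable](memoryLimit: Int) {
  private val inMemory   = ArrayBuffer.empty[T]
  private val spillFiles = ArrayBuffer.empty[File]

  def append(row: T): Unit = {
    inMemory += row
    if (inMemory.size >= memoryLimit) spill()
  }

  private def spill(): Unit = {
    val f = File.createTempFile("spill", ".bin")
    f.deleteOnExit()
    val out = new ObjectOutputStream(new FileOutputStream(f))
    try out.writeObject(inMemory.toList) finally out.close()
    spillFiles += f
    inMemory.clear()
  }

  // Replays spilled batches from disk in order, then the in-memory tail.
  def iterator: Iterator[T] = {
    val spilled = spillFiles.iterator.flatMap { f =>
      val in = new ObjectInputStream(new FileInputStream(f))
      try in.readObject().asInstanceOf[List[T]].iterator finally in.close()
    }
    spilled ++ inMemory.iterator
  }
}

object SpillableBufferDemo {
  def main(args: Array[String]): Unit = {
    val buf = new SpillableBuffer[String](memoryLimit = 3)
    (0 until 10).foreach(i => buf.append("row-" + i))
    println(buf.iterator.size) // all 10 rows survive, most of them on disk
  }
}
```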
GitHub user shenh062326 opened a pull request:
https://github.com/apache/spark/pull/8975
[SPARK-10918] [CORE] Prevent task failures for executors killed by the driver
When dynamicAllocation is enabled and an executor hits its idle timeout, it
will be killed by the driver; if a task is offered to the
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/5608#issuecomment-96870004
@srowen @mateiz
Thanks for your review.
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/5608#issuecomment-96454990
I don't know why the build has not started automatically.
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/5608#issuecomment-96369283
Thanks, I will fix it.
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/5608#discussion_r29107048
--- Diff: core/src/main/scala/org/apache/spark/util/SizeEstimator.scala ---
@@ -204,25 +204,36 @@ private[spark] object SizeEstimator extends Logging
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/5608#discussion_r29097662
--- Diff: core/src/main/scala/org/apache/spark/util/SizeEstimator.scala ---
@@ -204,25 +204,36 @@ private[spark] object SizeEstimator extends Logging
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/5608#issuecomment-95908120
The sampling strategy does not always work, but sampling twice is more effective
than only discarding the first non-null sample. And sampling 200 times will
not cause
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/5608#issuecomment-95904531
@srowen
The last assertResult I added to the test case covers the case where only
discarding the first non-null sample is not enough, because half of the array elem
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/5608#issuecomment-95767900
It seems to always work in my cluster; at least I have not found a case where it
doesn't. But if I change to the simpler one, sometimes it doesn't work.
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/5608#issuecomment-95388189
@srowen
At first, I also wanted to exclude shared objects by discarding the first
non-null sample, but that does not always work, since not all the objects link to the
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/5608#issuecomment-94995027
@mateiz
In most cases, the first sampled size contains the shared objects; the second
will not. But if the array is large and only has a few non-null
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/5608#issuecomment-94754171
No, the change has nothing to do with the check for null.
If the arraySize > 200 and the elements share an object,
SizeEstimator.visitArray is not correct.
GitHub user shenh062326 opened a pull request:
https://github.com/apache/spark/pull/5608
[SPARK-6738] [CORE] Improve estimation of the size of a large array
Currently, SizeEstimator.visitArray is not correct in the following case:
array size > 200,
elem has the share obj
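The failure mode under discussion can be illustrated with a toy cost model (an element's "size" is just its string length, standing in for SizeEstimator's real object-graph accounting). A single sample that happens to hit one large shared object wildly inflates the extrapolated array size; a second sample of ordinary elements lands near the truth, which is why sampling more than once and combining the results is more robust than only discarding the first non-null sample. All names and numbers here are illustrative:

```scala
object SizeSampleDemo {
  // Toy: extrapolate a whole-array "size" from one sample of elements,
  // using string length as the per-element cost.
  def estimateFrom(sample: Seq[String], totalSlots: Int): Double =
    sample.map(_.length.toLong).sum.toDouble * totalSlots / sample.size

  def main(args: Array[String]): Unit = {
    val shared = "x" * 1000 // one big object, referenced from a single slot
    val arr    = Seq.tabulate(1000)(i => if (i == 0) shared else "ab")
    // True total cost: 1000 + 999 * 2 = 2998.

    val hitShared    = estimateFrom(arr.slice(0, 100), arr.size)   // sample contains the big object
    val missedShared = estimateFrom(arr.slice(100, 200), arr.size) // sample of ordinary elements

    println(hitShared)    // 11980.0 -- inflated roughly 4x
    println(missedShared) // 2000.0  -- close to the true 2998
  }
}
```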
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/4363#issuecomment-76145260
Sorry for the late reply, I will change it.
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/4363#discussion_r25413198
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,84 @@
package org.apache.spark
-import
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/4363#discussion_r24574765
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,84 @@
package org.apache.spark
-import
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/4363#issuecomment-74023636
Hi @sryza, I think this pull request is OK now. Can you merge it into
master?
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/4529#issuecomment-73857999
Hi @srowen.
We just want to read executor logs from the UI. Is there an easy way to add
the executor log URL to the UI?
GitHub user shenh062326 opened a pull request:
https://github.com/apache/spark/pull/4529
[SPARK-5736][Web UI]Add executor log url to Executors page on Yarn
Currently, there is no executor log URL in the Spark UI (on Yarn); we have to
read executor logs by logging in to the machine that
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/4363#issuecomment-73638561
The failed tests are unrelated to this patch.
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/4363#discussion_r24381524
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,82 @@
package org.apache.spark
-import
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/4363#issuecomment-73625458
Hi @andrewor14 , @sryza and @rxin. Thanks. I agree with your views. I will
change sc.killExecutor to not throw an assertion error.
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/4363#issuecomment-73220183
scheduler.executorLost(executorId, SlaveLost()) will call
BlockManagerMasterActor.removeBlockManager; the call stack is:
HeartbeatReceiver.expireDeadHosts
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/4363#discussion_r24216671
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,85 @@
package org.apache.spark
-import
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/4363#discussion_r24215867
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,85 @@
package org.apache.spark
-import
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/4363#discussion_r24215268
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,85 @@
package org.apache.spark
-import
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/4363#issuecomment-73006178
The failed test case is unrelated to this patch.
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/4363#discussion_r24138722
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -32,18 +33,56 @@ private[spark] case class Heartbeat(
taskMetrics
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/4363#issuecomment-72826509
add [SPARK-5529]
GitHub user shenh062326 opened a pull request:
https://github.com/apache/spark/pull/4363
Add expireDeadHosts in HeartbeatReceiver
If a blockManager has not sent a heartbeat for more than 120s,
BlockManagerMasterActor will remove it. But CoarseGrainedSchedulerBackend can
only remove
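The expiry logic this PR adds can be sketched without any Spark machinery: remember the last heartbeat time per executor, and periodically report (and forget) the ones that have been silent longer than the timeout, handing them to the scheduler much as expireDeadHosts hands dead executors to scheduler.executorLost. This is a minimal sketch, not the actual HeartbeatReceiver:

```scala
import scala.collection.mutable

// Toy version of the expiry bookkeeping: last heartbeat time per executor.
class HeartbeatTracker(timeoutMs: Long) {
  private val lastSeen = mutable.Map.empty[String, Long]

  def heartbeat(executorId: String, nowMs: Long): Unit =
    lastSeen(executorId) = nowMs

  // Returns the executors considered dead at `nowMs` and forgets them;
  // in the real code these would be reported to the scheduler.
  def expireDeadHosts(nowMs: Long): Seq[String] = {
    val dead = lastSeen.collect {
      case (id, seen) if nowMs - seen > timeoutMs => id
    }.toSeq
    dead.foreach(lastSeen.remove)
    dead
  }
}

object HeartbeatTrackerDemo {
  def main(args: Array[String]): Unit = {
    val tracker = new HeartbeatTracker(timeoutMs = 120000) // 120s, as in the PR text
    tracker.heartbeat("exec-1", nowMs = 0L)
    tracker.heartbeat("exec-2", nowMs = 100000L)
    // At t = 130s, exec-1 has been silent for 130s (> 120s); exec-2 for only 30s.
    println(tracker.expireDeadHosts(nowMs = 130000L)) // List(exec-1)
  }
}
```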
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/4157#issuecomment-71348155
I think you are right; there's no need to change it.
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/4050#issuecomment-71347965
If we use an inputFormat whose splits are not instances of
org.apache.hadoop.mapreduce.lib.input.{CombineFileSplit, FileSplit}, then we
can't get information about input me
Github user shenh062326 commented on the pull request:
https://github.com/apache/spark/pull/4150#issuecomment-71347933
If we use an inputFormat whose splits are not instances of
org.apache.hadoop.mapreduce.lib.input.{CombineFileSplit, FileSplit}, then we
can't get information about input me
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/4157#discussion_r23370062
--- Diff:
core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala ---
@@ -375,16 +375,22 @@ private[nio] class ConnectionManager
GitHub user shenh062326 opened a pull request:
https://github.com/apache/spark/pull/4157
[SPARK-4934][CORE] Print remote address in ConnectionManager
The connection key is hard to read: "key already cancelled?
sun.nio.ch.SelectionKeyImpl@52b0e278".
It's hard to solve the problem by
GitHub user shenh062326 opened a pull request:
https://github.com/apache/spark/pull/4150
[SPARK-5347][CORE] Change FileSplit to InputSplit in update inputMetrics
When inputFormatClass is set to CombineFileInputFormat, input metrics show
that the input is empty. It doesn't appear is
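The shape of the fix discussed across these PRs is to match on the split type and only read byte counts from split kinds that expose a length, instead of assuming every split is a FileSplit. A toy sketch, where the classes are stand-ins for the Hadoop split hierarchy rather than the real org.apache.hadoop.mapreduce.lib.input types:

```scala
// Stand-ins for the Hadoop split hierarchy, for illustration only.
sealed trait InputSplit
case class FileSplit(lengthBytes: Long)       extends InputSplit
case class CombineFileSplit(lengths: Seq[Long]) extends InputSplit
case class OtherSplit()                       extends InputSplit

object InputMetricsDemo {
  // None means "bytes unknown": the metrics are simply left unset,
  // rather than reporting the input as empty.
  def splitBytes(split: InputSplit): Option[Long] = split match {
    case FileSplit(len)         => Some(len)
    case CombineFileSplit(lens) => Some(lens.sum)
    case _                      => None
  }

  def main(args: Array[String]): Unit = {
    println(splitBytes(FileSplit(128L)))                      // Some(128)
    println(splitBytes(CombineFileSplit(Seq(64L, 64L, 32L)))) // Some(160)
    println(splitBytes(OtherSplit()))                         // None
  }
}
```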
Github user shenh062326 commented on a diff in the pull request:
https://github.com/apache/spark/pull/3243#discussion_r20337096
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/Spillable.scala ---
@@ -105,7 +105,7 @@ private[spark] trait Spillable[C
GitHub user shenh062326 opened a pull request:
https://github.com/apache/spark/pull/3243
[Spark Core] SPARK-4380 Edit spilling log from MB to B
https://issues.apache.org/jira/browse/SPARK-4380
You can merge this pull request into a Git repository by running:
$ git pull https