Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-113166545
@andrewor14 @srowen I've already refined it per the comments.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well
Github user suyanNone commented on a diff in the pull request:
https://github.com/apache/spark/pull/4055#discussion_r32600352
--- Diff: core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala
---
@@ -65,4 +65,6 @@ private[spark] class ResultTask[T, U](
override def
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4887#issuecomment-112642422
@srowen eh...give me some time to review this patch...
---
GitHub user suyanNone opened a pull request:
https://github.com/apache/spark/pull/6834
Only show 4096 bytes of content for the executor log instead of all of it
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/suyanNone/spark small-display
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/6586#issuecomment-112055404
I also think it may be better to deallocate the direct buffer when we send
ChunkFetchSuccess to the requester.
I'm not sure whether I'm right or not.
NioManagedBuffer
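The suggestion here — only free the direct buffer once the ChunkFetchSuccess has actually been written out — amounts to reference counting the buffer. A toy sketch of that discipline (the class and method names are illustrative, not Spark's or Netty's API; the real mechanism in Spark's network module is Netty's ByteBuf reference counting):

```java
public class RefCountedBuffer {
    private int refCnt = 1;       // owner's initial reference
    private boolean freed = false;

    void retain() { refCnt++; }

    void release() {
        if (--refCnt == 0) {
            freed = true;         // here the real code would free the direct buffer
        }
    }

    boolean isFreed() { return freed; }

    public static void main(String[] args) {
        RefCountedBuffer chunk = new RefCountedBuffer();
        chunk.retain();   // handed to the transport along with ChunkFetchSuccess
        chunk.release();  // transport finished writing the chunk to the wire
        chunk.release();  // original owner drops its reference
        System.out.println(chunk.isFreed()); // true
    }
}
```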
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/6586#issuecomment-112051808
@JoshRosen
The Netty direct-buffer leak is fixed in Netty 4.0.29.Final, but that version is
not officially released yet.
Although I am also not sure that patch is
Github user suyanNone commented on a diff in the pull request:
https://github.com/apache/spark/pull/6644#discussion_r31885371
--- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsPage.scala
---
@@ -85,7 +88,7 @@ private[ui] class ExecutorsPage
GitHub user suyanNone opened a pull request:
https://github.com/apache/spark/pull/6644
[SPARK-8100][UI]Make able
While a Spark application is still running, information about a lost executor
also disappears from the Spark UI, so it is not easy to see why that executor
was lost or to look up other details
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/6586#issuecomment-108707568
@srowen
Today I spent some time running a performance test.
If I test just one cycle, TestOutPutStream has a minor advantage, maybe due to
the direct buffer
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/6586#issuecomment-108327805
@srowen if it is just a matter of adjusting the memoryOverhead argument, it has
to be customized for every application, and if the data grows day by day it may
have to change yet again
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/6586#issuecomment-108298877
@srowen maybe we can split the buffer into slices to read or write through the
channel.
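The slicing idea can be sketched as follows, assuming a heap ByteBuffer being written through a FileChannel; `writeInSlices` and the 64 KB chunk size are illustrative choices, not the PR's actual code. Writing one bounded slice at a time keeps any temporary direct buffer the JDK allocates for the copy proportional to the slice rather than to the whole block:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class SlicedChannelWrite {
    static final int CHUNK = 64 * 1024; // illustrative slice size

    // Write a (possibly large) heap buffer through the channel one bounded
    // slice at a time, so the temporary copy buffer stays small instead of
    // matching the whole block.
    static void writeInSlices(FileChannel ch, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
            ByteBuffer slice = buf.duplicate();
            slice.limit(Math.min(buf.limit(), buf.position() + CHUNK));
            while (slice.hasRemaining()) {
                ch.write(slice);
            }
            buf.position(slice.position()); // advance past the written slice
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sliced", ".bin");
        byte[] data = new byte[300 * 1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            writeInSlices(ch, ByteBuffer.wrap(data));
        }
        System.out.println(Arrays.equals(data, Files.readAllBytes(tmp))); // true
        Files.delete(tmp);
    }
}
```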
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/6586#issuecomment-108266326
@srowen
if we want to make physical memory usage more reasonable, I think there are two
ways:
1. use FileOutputStream or FileInputStream to write the byte[] directly
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/6586#issuecomment-108174658
@srowen
I think in any system we want physical memory usage to be more controllable.
We use Spark on YARN, and we always encounter the direct buffer running out of
GitHub user suyanNone opened a pull request:
https://github.com/apache/spark/pull/6586
[SPARK-8044][CORE] Avoid using a direct buffer while reading or writing a
disk-level block
1. I found that if we use getChannel to put or get data, it creates a
DirectBuffer anyway, which is not
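A minimal sketch of the stream-based alternative, assuming plain JDK streams (this is not the PR's code; the class and method names are made up for illustration). A `FileOutputStream.write(byte[])` hands the heap array straight to the OS write call, with no ByteBuffer — and hence no temporary direct buffer — in between:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;

public class StreamDiskBlock {
    // Write the byte[] with a plain FileOutputStream, staying on the heap.
    static void writeBlock(File f, byte[] block) throws IOException {
        try (FileOutputStream out = new FileOutputStream(f)) {
            out.write(block);
        }
    }

    // Read it back the same way, filling the heap array directly.
    static byte[] readBlock(File f, int len) throws IOException {
        byte[] back = new byte[len];
        try (FileInputStream in = new FileInputStream(f)) {
            int off = 0;
            while (off < len) {
                int n = in.read(back, off, len - off);
                if (n < 0) {
                    throw new IOException("unexpected end of file");
                }
                off += n;
            }
        }
        return back;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("block", ".bin");
        byte[] block = new byte[128 * 1024];
        Arrays.fill(block, (byte) 42);
        writeBlock(f, block);
        System.out.println(Arrays.equals(block, readBlock(f, block.length))); // true
        f.delete();
    }
}
```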
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4886#issuecomment-97048856
@srowen maybe I have some idea about why we need to try to cache twice:
if a ShuffledRDD needs to be cached with a memory & disk & serialized level,
then when first
Github user suyanNone closed the pull request at:
https://github.com/apache/spark/pull/3582
---
Github user suyanNone closed the pull request at:
https://github.com/apache/spark/pull/4886
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4886#issuecomment-94342126
@srowen I'm fine with closing this patch.
Eh... I'm not sure I fully got your point. I agree this means "a level which
does not use memory" and ` This branch is fo
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4886#issuecomment-93885511
@srowen
I forgot to update the description; it has already been refined.
If the program reaches `if (!putLevel.useMemory) {`, it means we are putting a
disk-level block, or a memory_and_disk-level
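A stand-alone sketch of what reaching that branch means, with a hypothetical `Level` class in place of Spark's StorageLevel (names and return strings are illustrative, not the actual BlockManager code):

```java
public class PutLevelBranch {
    // Minimal stand-in for Spark's StorageLevel: only the two flags the
    // branch in question reads.
    static final class Level {
        final boolean useMemory;
        final boolean useDisk;
        Level(boolean useMemory, boolean useDisk) {
            this.useMemory = useMemory;
            this.useDisk = useDisk;
        }
    }

    static String put(Level putLevel) {
        if (!putLevel.useMemory) {
            // Reached for a disk-only put, or for a MEMORY_AND_DISK block that
            // failed to unroll and is being retried as a disk-level put.
            return "disk-store";
        }
        return "memory-store";
    }

    public static void main(String[] args) {
        System.out.println(put(new Level(false, true))); // disk-store
        System.out.println(put(new Level(true, true)));  // memory-store
    }
}
```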
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-93883077
@andrewor14 that problem occurs when there is a stage retry.
One of our users hit that problem after an executor was lost (killed by YARN or
something), while we
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4886#issuecomment-92709484
Jenkins, test this again.
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4145#issuecomment-88343483
@JoshRosen Can someone verify this patch?
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/5259#issuecomment-87907228
Duplicate of @kayousterhout's SPARK-5360, and 5360 is the better solution.
---
Github user suyanNone closed the pull request at:
https://github.com/apache/spark/pull/5259
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/5259#issuecomment-87905693
@kayousterhout I had only searched "accumulators" across issues before opening
this pull request. I have read your patch, and I also prefer SPARK-5360.
---
GitHub user suyanNone opened a pull request:
https://github.com/apache/spark/pull/5259
Accumulator deserialized twice because the NarrowCoGroupSplitDep contains an
rdd object.
1. With code like the one below, you will find the accumulator deserialized twice.
first:
```
task
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-83422662
This patch has been forgotten by us...
@srowen @markhamstra @kayousterhout
this patch can prevent the endless retry that may occur after an
executor is killed or
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4887#issuecomment-77791515
@srowen it's also my fault, for being too lazy to make the description
clearer... I will update the description so it makes more sense
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4887#issuecomment-77295859
@andrewor14, I think it is not a duplicate.
Put a memory_and_disk-level block:
1) Try to put it in the memory store; the unroll fails. 2) Put it on disk
successfully. 3)
return
GitHub user suyanNone opened a pull request:
https://github.com/apache/spark/pull/4887
Unroll unsuccessful memory_and_disk level block should release reserved ...
Current code:
We want to cache a memory_and_disk-level block:
1. Try to put it in memory; the unroll is unsuccessful
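The fix being argued for can be sketched as a small accounting model (hypothetical names; this is not Spark's MemoryStore code): reserve unroll memory up front, and give the reservation back as soon as the unroll fails, before the block falls through to the disk store:

```java
public class UnrollAccounting {
    private long reservedUnrollBytes = 0;

    // Try to unroll a block into memory, reserving its size up front.
    boolean tryUnroll(long blockBytes, long freeBytes) {
        reservedUnrollBytes += blockBytes;
        if (blockBytes > freeBytes) {
            // The point of the PR: when the unroll of a MEMORY_AND_DISK block
            // fails, the reservation must be released before the block falls
            // through to the disk store, or the memory stays "used" forever.
            reservedUnrollBytes -= blockBytes;
            return false;
        }
        return true;
    }

    long reserved() { return reservedUnrollBytes; }

    public static void main(String[] args) {
        UnrollAccounting mem = new UnrollAccounting();
        boolean ok = mem.tryUnroll(1024, 512); // block too big: unroll fails
        System.out.println(ok + " " + mem.reserved()); // false 0
    }
}
```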
GitHub user suyanNone opened a pull request:
https://github.com/apache/spark/pull/4886
Not cache in memory again if put memory_and_disk level block after put i...
Current code:
We want to cache a memory_and_disk-level block:
1. Try to put it in memory; the unroll is unsuccessful
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3629#issuecomment-76649122
ping @andrewor14
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3629#issuecomment-76369781
@andrewor14
If we put it at the end of `accountingLock.synchronized`, as I described
above:
```
accountingLock.synchronized {
1. ensureFreeSpace
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3629#issuecomment-76324070
@andrewor14, already refined according to the comments, please review~
---
Github user suyanNone commented on a diff in the pull request:
https://github.com/apache/spark/pull/3629#discussion_r25421661
--- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala ---
@@ -328,6 +330,7 @@ private[spark] class MemoryStore(blockManager
Github user suyanNone commented on a diff in the pull request:
https://github.com/apache/spark/pull/4055#discussion_r25414968
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -483,8 +483,9 @@ private[spark] class TaskSetManager
Github user suyanNone commented on a diff in the pull request:
https://github.com/apache/spark/pull/4055#discussion_r25333545
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -483,8 +483,9 @@ private[spark] class TaskSetManager
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-74646530
@srowen @JoshRosen can someone verify this patch?
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-72163353
retest this please
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-72148352
retest this please
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-72148232
@cloud-fan--!
[error]
/home/jenkins/workspace/SparkPullRequestBuilder/core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala:69:
class ResultTask
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-72147882
jenkins retests
---
Github user suyanNone commented on a diff in the pull request:
https://github.com/apache/spark/pull/4055#discussion_r23592779
--- Diff: core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala
---
@@ -65,4 +65,6 @@ private[spark] class ResultTask[T, U](
override def
Github user suyanNone commented on a diff in the pull request:
https://github.com/apache/spark/pull/4055#discussion_r23592724
--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -106,7 +106,21 @@ private[spark] abstract class Task[T](val stageId:
Int, var
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-71579227
@cloud-fan ZhangLei, SunHongLiang, HanLi, ChenXingYu, blabla...I am
ZhangLei's classmate in ZJU.
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-71458822
Last week and this week my work has not been on Spark, so my replies have been
slower
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-71458406
@cloud-fan btw, Do you know HarryZhang?
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-71457733
@cloud-fan
According to the current code, it may not be easy to change it so that tasks in
pendingTasks are not re-submitted. To be honest, the current DAGScheduler is
complicated, but at some
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-71447811
@srowen, the original `hashCode()` was generated by IDEA's auto-generate
hashCode feature. I have now refined it per your comments.
About canEqual(), I just according
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/2471#issuecomment-71309691
@vanzin = =! I got it, sigh~
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/2471#issuecomment-70964880
Is this patch OK to merge?
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3629#issuecomment-70203787
@andrewor14
I will still call releasePendingUnrollMemory() before [this
line](https://github.com/apache/spark/blob/4e1f12d997426560226648d62ee17c90352613e7/core
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3629#issuecomment-70077674
@andrewor14 @liyezhang556520
yes, because the current MemoryStore keeps a `previousMemoryReserved` for the
current thread; that variable is designed to reserve the unroll
GitHub user suyanNone opened a pull request:
https://github.com/apache/spark/pull/4055
[SPARK-5259][CORE]Make sure mapStage.pendingtasks is set() while
MapStage.isAvailable is t...
Add Task equals() and hashCode() to keep stage.pendingTasks accurate when a
stage is retried
You
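The equals()/hashCode() idea can be sketched with a hypothetical task key (the field choices are illustrative; the PR's actual methods live on Spark's Task classes): two attempts of the same partition hash and compare equal, so the pendingTasks set stays accurate across a stage retry:

```java
import java.util.HashSet;
import java.util.Set;

public class PendingTasks {
    // Hypothetical task key: two attempts of the same partition in the same
    // stage compare equal, so a resubmitted task is not counted twice.
    static final class TaskKey {
        final int stageId;
        final int partitionId;
        TaskKey(int stageId, int partitionId) {
            this.stageId = stageId;
            this.partitionId = partitionId;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof TaskKey)) return false;
            TaskKey t = (TaskKey) o;
            return stageId == t.stageId && partitionId == t.partitionId;
        }
        @Override public int hashCode() {
            return 31 * stageId + partitionId;
        }
    }

    public static void main(String[] args) {
        Set<TaskKey> pending = new HashSet<>();
        pending.add(new TaskKey(3, 7)); // original attempt
        pending.add(new TaskKey(3, 7)); // resubmitted after an executor is lost
        System.out.println(pending.size()); // 1
    }
}
```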
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3629#issuecomment-69527640
@andrewor14
`shouldn't we release the pending memory after we actually put the block
(i.e. after this line), not before?`
Agree. I think [this
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3629#issuecomment-69301441
@andrewor14 sorry for my poor English; @liyezhang556520 has explained it well.
Now I will just discuss the situation where we simply remove the memory release
in
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3629#issuecomment-69127678
@andrewor14
Hi, yes, but valueA (unrollMemoryForThisThread) can't be used by others even
after this value has already been added to the memoryStore part. It will be d
GitHub user suyanNone opened a pull request:
https://github.com/apache/spark/pull/3932
[SPARK-5132][Core]Correct the stage attempt id key in stageInfoFromJson
SPARK-5132:
stageInfoToJson writes the key "Stage Attempt Id"
stageInfoFromJson reads the key "Attempt Id"
You can merge this pull request into a Git
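The nature of the bug and its fix can be sketched with a hypothetical map-based round trip (this is not Spark's actual JsonProtocol code): the writer and reader must agree on one key string, which a shared constant enforces:

```java
import java.util.HashMap;
import java.util.Map;

public class StageInfoJsonKey {
    // One shared constant for the key, so the writer and reader cannot drift
    // apart the way "Stage Attempt Id" vs "Attempt Id" did.
    static final String STAGE_ATTEMPT_ID = "Stage Attempt Id";

    static Map<String, Object> toJson(int attemptId) {
        Map<String, Object> json = new HashMap<>();
        json.put(STAGE_ATTEMPT_ID, attemptId);
        return json;
    }

    static int fromJson(Map<String, Object> json) {
        Object v = json.get(STAGE_ATTEMPT_ID); // same key the writer used
        return v == null ? -1 : (Integer) v;   // mismatched keys would hit the default
    }

    public static void main(String[] args) {
        System.out.println(fromJson(toJson(5))); // 5
    }
}
```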
GitHub user suyanNone reopened a pull request:
https://github.com/apache/spark/pull/3629
[SPARK-4777][CORE] Some block memory after unrollSafely is not counted into
used memory (memoryStore.entrys or unrollMemory)
Some memory is not counted into the memory used by memoryStore or unrollMemory
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3629#issuecomment-67127268
@liyezhang556520 I'm not familiar with the pull-request process = =;
OK, I will reopen it...
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3629#issuecomment-67127309
Reopen
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/2134#issuecomment-67126094
yes, it's a duplicate of your patch.
I only saw your patch title "parallel drop to disk"... so I didn't look at the
code in detail. I have already closed my patch.
Github user suyanNone closed the pull request at:
https://github.com/apache/spark/pull/3629
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3629#issuecomment-67126072
It is already resolved in [SPARK-3000][CORE] drop old blocks to disk in
parallel when free memory is not enough for caching new blocks #2134
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3541#issuecomment-66863621
To be frank, I am new to Spark.
If there are local filesystem access errors, should we identify the host and
mark it as failed? Maybe wrap them as ReadFileException or
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3541#issuecomment-66396760
@davies IIUC, the current executor blacklist or host blacklist is all
per-TaskSetManager. There is a chance that two or more TaskSetManagers run on
the same host or
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3574#issuecomment-66224947
@JoshRosen I refined the code according to your comments.
If there are still problems, it is OK for you to fix things up, including the
title and comments; thanks for checking
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3582#issuecomment-66031430
@JoshRosen
I guess:
1. Two threads in the same executor
1.1 Two threads in the same executor;
the executor has 4 cores, and CPUs per task is 1
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3574#issuecomment-65976983
@JoshRosen
yes, it will not cause any error if there is no check in removeBlock
and dropOldBlocks.
I opened this pull request because I think it will be more
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3582#issuecomment-65970969
@JoshRosen
Okay, I will check whether multiple threads can be the writer. I wrote this
code because, while reading the current code, I thought there would be
GitHub user suyanNone opened a pull request:
https://github.com/apache/spark/pull/3629
[SPARK-4777][CORE] Some block memory after unrollSafely is not counted into
used memory (memoryStore.entrys or unrollMemory)
Some memory is not counted into the memory used by memoryStore or unrollMemory
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3582#issuecomment-65891778
Sorry for my poor comments and English.
In summary:
1. we do the put one thread at a time until one thread succeeds.
2. multiple doGetLocal
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3574#issuecomment-65779235
@JoshRosen
Thread A enters removeBlock() and gets the info for blockId1.
Thread B enters dropFromMemory() and gets the info for blockId1.
Now threads A and B both want to get
Github user suyanNone commented on a diff in the pull request:
https://github.com/apache/spark/pull/3574#discussion_r21359660
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -1089,15 +1089,17 @@ private[spark] class BlockManager(
val info
GitHub user suyanNone opened a pull request:
https://github.com/apache/spark/pull/3582
[SPARK-4721][CORE] Improve the logic when the first thread's block put fails
1. Make threads that wait on the old block info try one by one when the first
thread, which created that block, failed.
2. use
GitHub user suyanNone opened a pull request:
https://github.com/apache/spark/pull/3574
[SPARK-4714][CORE]: Add a check for whether info is null or not after having
gotten info.syn
in removeBlock
Github user suyanNone commented on a diff in the pull request:
https://github.com/apache/spark/pull/791#discussion_r21156020
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -837,11 +837,11 @@ private[spark] class BlockManager(
* Drop a
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3340#issuecomment-64158918
@andrewor14, is it OK now?
---
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/3340#issuecomment-63760334
= =! Sorry, I made some mistakes, so I recommitted the changes.
---
GitHub user suyanNone opened a pull request:
https://github.com/apache/spark/pull/3340
Fix SPARK-4471: blockManagerIdFromJson function throws exception while B...
Fix SPARK-4471: the blockManagerIdFromJson function throws an exception when
BlockManagerId is null in