[jira] [Updated] (SPARK-21849) Make the serializer function more robust

2017-08-28 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21849: Issue Type: Improvement (was: Bug) > Make the serializer function more robust >

[jira] [Created] (SPARK-21849) Make the serializer function more robust

2017-08-28 Thread DjvuLee (JIRA)
DjvuLee created SPARK-21849: --- Summary: Make the serializer function more robust Key: SPARK-21849 URL: https://issues.apache.org/jira/browse/SPARK-21849 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-21682) Caching 100k-task RDD GC-kills driver (due to updatedBlockStatuses?)

2017-08-09 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120959#comment-16120959 ] DjvuLee commented on SPARK-21682: - Yes, our company also faced this scalability problem, the driver

[jira] [Commented] (SPARK-21547) Spark cleaner cost too many time

2017-07-29 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106292#comment-16106292 ] DjvuLee commented on SPARK-21547: - OK, I will try it and post the result later. > Spark cleaner cost

[jira] [Commented] (SPARK-21547) Spark cleaner cost too many time

2017-07-27 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103459#comment-16103459 ] DjvuLee commented on SPARK-21547: - Yes, I agree that this has a relationship with the work, but doing

[jira] [Updated] (SPARK-21547) Spark cleaner cost too many time

2017-07-27 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21547: Description: Spark Streaming sometimes spends too much time on cleaning, and this can become worse

[jira] [Updated] (SPARK-21547) Spark cleaner cost too many time

2017-07-27 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21547: Description: Spark Streaming sometimes spends too much time on cleaning, and this can become worse

[jira] [Commented] (SPARK-21547) Spark cleaner cost too many time

2017-07-27 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103005#comment-16103005 ] DjvuLee commented on SPARK-21547: - 17/07/27 11:29:51 INFO TaskSetManager: Finished task 169.0 in stage

[jira] [Updated] (SPARK-21547) Spark cleaner cost too many time

2017-07-27 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21547: Description: Spark Streaming sometimes spends too much time on cleaning, and this can become worse when

[jira] [Created] (SPARK-21547) Spark cleaner cost too many time

2017-07-27 Thread DjvuLee (JIRA)
DjvuLee created SPARK-21547: --- Summary: Spark cleaner cost too many time Key: SPARK-21547 URL: https://issues.apache.org/jira/browse/SPARK-21547 Project: Spark Issue Type: Bug Components:

[jira] [Updated] (SPARK-21383) YARN can allocate too many executors

2017-07-17 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21383: Summary: YARN can allocate too many executors (was: YARN can allocate to many executors) > YARN can

[jira] [Updated] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-14 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21082: Affects Version/s: (was: 2.2.1) 2.3.0 > Consider Executor's memory usage when

[jira] [Commented] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-14 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049965#comment-16049965 ] DjvuLee commented on SPARK-21082: - Data locality, input size for task, scheduling order affect a lot,

[jira] [Comment Edited] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-14 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049933#comment-16049933 ] DjvuLee edited comment on SPARK-21082 at 6/15/17 2:47 AM: -- Not a really fast

[jira] [Commented] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-14 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049933#comment-16049933 ] DjvuLee commented on SPARK-21082: - Not a really fast node and slow node problem. Even all the nodes have

[jira] [Updated] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-14 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21082: Description: The Spark Scheduler does not consider memory usage when dispatching tasks; this can lead to

[jira] [Commented] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-14 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048835#comment-16048835 ] DjvuLee commented on SPARK-21082: - Yes, one of the reasons why Spark does not balance tasks well enough is

[jira] [Comment Edited] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048653#comment-16048653 ] DjvuLee edited comment on SPARK-21082 at 6/14/17 3:15 AM: -- [~srowen] This

[jira] [Commented] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048654#comment-16048654 ] DjvuLee commented on SPARK-21082: - My idea is to consider the BlockManager information when scheduling

[jira] [Commented] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048653#comment-16048653 ] DjvuLee commented on SPARK-21082: - [~srowen] This situation occurred when the partition number is larger

[jira] [Updated] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21082: Description: The Spark Scheduler does not consider memory usage when dispatching tasks; this can lead to

[jira] [Commented] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048146#comment-16048146 ] DjvuLee commented on SPARK-21082: - If this feature is a good suggestion (we encounter this problem in

[jira] [Updated] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21082: Description: The Spark Scheduler does not consider memory usage when dispatching tasks; this can lead to

[jira] [Updated] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21082: Description: The Spark Scheduler does not consider memory usage when dispatching tasks; this can lead to

[jira] [Updated] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21082: Description: When we cache the > Consider Executor's memory usage when scheduling task >

[jira] [Updated] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21082: Component/s: Scheduler > Consider Executor's memory usage when scheduling task >

[jira] [Created] (SPARK-21082) Consider the Executor's Memory usage when scheduling task

2017-06-13 Thread DjvuLee (JIRA)
DjvuLee created SPARK-21082: --- Summary: Consider the Executor's Memory usage when scheduling task Key: SPARK-21082 URL: https://issues.apache.org/jira/browse/SPARK-21082 Project: Spark Issue Type:

[jira] [Updated] (SPARK-21082) Consider Executor's memory usage when scheduling task

2017-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-21082: Summary: Consider Executor's memory usage when scheduling task (was: Consider the Executor's Memory

[jira] [Commented] (SPARK-21064) Fix the default value bug in NettyBlockTransferServiceSuite

2017-06-12 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16046475#comment-16046475 ] DjvuLee commented on SPARK-21064: - The default value for `spark.port.maxRetries` is 100, but we use the

[jira] [Created] (SPARK-21064) Fix the default value bug in NettyBlockTransferServiceSuite

2017-06-12 Thread DjvuLee (JIRA)
DjvuLee created SPARK-21064: --- Summary: Fix the default value bug in NettyBlockTransferServiceSuite Key: SPARK-21064 URL: https://issues.apache.org/jira/browse/SPARK-21064 Project: Spark Issue

[jira] [Commented] (SPARK-18085) Better History Server scalability for many / large applications

2017-06-07 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040633#comment-16040633 ] DjvuLee commented on SPARK-18085: - [~vanzin] the procedure of loading the history summary page is still a

[jira] [Commented] (SPARK-18085) Better History Server scalability for many / large applications

2017-06-05 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16038210#comment-16038210 ] DjvuLee commented on SPARK-18085: - [~vanzin] I want to try your branch. Does all the information is the

[jira] [Updated] (SPARK-18085) Better History Server scalability for many / large applications

2017-03-06 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-18085: Yes, you're right. I just want to impact as few as possible. > Better History Server scalability for many

[jira] [Commented] (SPARK-18085) Better History Server scalability for many / large applications

2017-03-05 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896652#comment-15896652 ] DjvuLee commented on SPARK-18085: - "A separate jar file" means we generate a new jar file for the history

[jira] [Commented] (SPARK-19823) Support Gang Distribution of Task

2017-03-04 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896107#comment-15896107 ] DjvuLee commented on SPARK-19823: - [~zsxwing] Can you have a look? > Support Gang Distribution of

[jira] [Comment Edited] (SPARK-19823) Support Gang Distribution of Task

2017-03-04 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896096#comment-15896096 ] DjvuLee edited comment on SPARK-19823 at 3/5/17 7:19 AM: - When Spark distributes

[jira] [Comment Edited] (SPARK-19823) Support Gang Distribution of Task

2017-03-04 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896096#comment-15896096 ] DjvuLee edited comment on SPARK-19823 at 3/5/17 7:10 AM: - When Spark distributes

[jira] [Comment Edited] (SPARK-19823) Support Gang Distribution of Task

2017-03-04 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896096#comment-15896096 ] DjvuLee edited comment on SPARK-19823 at 3/5/17 7:10 AM: - When Spark distributes

[jira] [Comment Edited] (SPARK-19823) Support Gang Distribution of Task

2017-03-04 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896096#comment-15896096 ] DjvuLee edited comment on SPARK-19823 at 3/5/17 7:10 AM: - When Spark distributes

[jira] [Commented] (SPARK-19823) Support Gang Distribution of Task

2017-03-04 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896096#comment-15896096 ] DjvuLee commented on SPARK-19823: - When Spark distributes tasks to Executors, it uses a Round-Robin way,

[jira] [Commented] (SPARK-19823) Support Gang Distribution of Task

2017-03-04 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896097#comment-15896097 ] DjvuLee commented on SPARK-19823: - If this is good advice, I will submit a pull request. > Support Gang

[jira] [Created] (SPARK-19823) Support Gang Distribution of Task

2017-03-04 Thread DjvuLee (JIRA)
DjvuLee created SPARK-19823: --- Summary: Support Gang Distribution of Task Key: SPARK-19823 URL: https://issues.apache.org/jira/browse/SPARK-19823 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-18085) Better History Server scalability for many / large applications

2017-03-04 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896070#comment-15896070 ] DjvuLee commented on SPARK-18085: - [~vanzin] Thanks for your reply! Will your new solution generate

[jira] [Commented] (SPARK-19821) Throw out the Read-only disk information when create file for Shuffle

2017-03-04 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896067#comment-15896067 ] DjvuLee commented on SPARK-19821: - Currently, when the disk is just read-only, we will just throw out the

[jira] [Updated] (SPARK-19821) Throw out the Read-only disk information when create file for Shuffle

2017-03-04 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-19821: Description: java.io.FileNotFoundException:

[jira] [Created] (SPARK-19821) Throw out the Read-only disk information when create file for Shuffle

2017-03-04 Thread DjvuLee (JIRA)
DjvuLee created SPARK-19821: --- Summary: Throw out the Read-only disk information when create file for Shuffle Key: SPARK-19821 URL: https://issues.apache.org/jira/browse/SPARK-19821 Project: Spark

[jira] [Commented] (SPARK-18085) Better History Server scalability for many / large applications

2017-03-03 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894074#comment-15894074 ] DjvuLee commented on SPARK-18085: - [~vanzin] This is a nice design. There is not much information about

[jira] [Commented] (SPARK-17300) ClosedChannelException caused by missing block manager when speculative tasks are killed

2017-02-05 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853630#comment-15853630 ] DjvuLee commented on SPARK-17300: - [~rdblue] Is there any fix for this issue later? >

[jira] [Created] (SPARK-19327) Improve the partition method when read data from jdbc

2017-01-21 Thread DjvuLee (JIRA)
DjvuLee created SPARK-19327: --- Summary: Improve the partition method when read data from jdbc Key: SPARK-19327 URL: https://issues.apache.org/jira/browse/SPARK-19327 Project: Spark Issue Type:

[jira] [Updated] (SPARK-19239) Check the lowerBound and upperBound whether equal None in jdbc API

2017-01-16 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-19239: Summary: Check the lowerBound and upperBound whether equal None in jdbc API (was: Check the lowerBound

[jira] [Updated] (SPARK-19239) Check the lowerBound and upperBound equal None in jdbc API

2017-01-16 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-19239: Description: When we use the ``jdbc`` in pyspark, if we check the lowerBound and upperBound, we can give a

[jira] [Created] (SPARK-19239) Check the lowerBound and upperBound equal None in jdbc API

2017-01-16 Thread DjvuLee (JIRA)
DjvuLee created SPARK-19239: --- Summary: Check the lowerBound and upperBound equal None in jdbc API Key: SPARK-19239 URL: https://issues.apache.org/jira/browse/SPARK-19239 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-18778) Fix the Scala classpath in the spark-shell

2016-12-07 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731403#comment-15731403 ] DjvuLee edited comment on SPARK-18778 at 12/8/16 7:46 AM: -- When I just run the

[jira] [Commented] (SPARK-18778) Fix the Scala classpath in the spark-shell

2016-12-07 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731455#comment-15731455 ] DjvuLee commented on SPARK-18778: - [~srowen] [~andrewor14] Can you have a look? > Fix the Scala

[jira] [Comment Edited] (SPARK-18778) Fix the Scala classpath in the spark-shell

2016-12-07 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731431#comment-15731431 ] DjvuLee edited comment on SPARK-18778 at 12/8/16 7:38 AM: -- I give a fix in the

[jira] [Comment Edited] (SPARK-18778) Fix the Scala classpath in the spark-shell

2016-12-07 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731431#comment-15731431 ] DjvuLee edited comment on SPARK-18778 at 12/8/16 7:38 AM: -- I give a fix in the

[jira] [Comment Edited] (SPARK-18778) Fix the Scala classpath in the spark-shell

2016-12-07 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731431#comment-15731431 ] DjvuLee edited comment on SPARK-18778 at 12/8/16 7:35 AM: -- I give a fix in the

[jira] [Commented] (SPARK-18778) Fix the Scala classpath in the spark-shell

2016-12-07 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731431#comment-15731431 ] DjvuLee commented on SPARK-18778: - I give a fix in the https://github.com/apache/spark/pull/16210 > Fix

[jira] [Commented] (SPARK-18778) Fix the Scala classpath in the spark-shell

2016-12-07 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731403#comment-15731403 ] DjvuLee commented on SPARK-18778: - When I just run the ./bin/spark-shell under our environment, the

[jira] [Comment Edited] (SPARK-18778) Fix the Scala classpath in the spark-shell

2016-12-07 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731403#comment-15731403 ] DjvuLee edited comment on SPARK-18778 at 12/8/16 7:25 AM: -- When I just run the

[jira] [Updated] (SPARK-18778) Fix the Scala classpath in the spark-shell

2016-12-07 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-18778: Affects Version/s: 1.6.1 2.0.2 > Fix the Scala classpath in the spark-shell >

[jira] [Updated] (SPARK-18778) Fix the Scala classpath in the spark-shell

2016-12-07 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-18778: Description: Failed to initialize compiler: object scala.runtime in compiler mirror not found. ** Note

[jira] [Created] (SPARK-18778) Fix the Scala classpath in the spark-shell

2016-12-07 Thread DjvuLee (JIRA)
DjvuLee created SPARK-18778: --- Summary: Fix the Scala classpath in the spark-shell Key: SPARK-18778 URL: https://issues.apache.org/jira/browse/SPARK-18778 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-18181) Huge managed memory leak (2.7G) when running reduceByKey

2016-11-21 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15685568#comment-15685568 ] DjvuLee commented on SPARK-18181: - [~barrybecker4] can you reproduce this on the spark2.x version? >

[jira] [Commented] (SPARK-18528) limit + groupBy leads to java.lang.NullPointerException

2016-11-21 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15685451#comment-15685451 ] DjvuLee commented on SPARK-18528: - I just tested your example, and it works. >>>

[jira] [Closed] (SPARK-17500) The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is not right

2016-09-12 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee closed SPARK-17500. --- Resolution: Not A Bug > The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is not right >

[jira] [Updated] (SPARK-17500) The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is not right

2016-09-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-17500: Description: The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is increased by file size in

[jira] [Updated] (SPARK-17500) The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is not right

2016-09-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-17500: Summary: The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is not right (was: The

[jira] [Created] (SPARK-17500) The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is wrong

2016-09-11 Thread DjvuLee (JIRA)
DjvuLee created SPARK-17500: --- Summary: The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is wrong Key: SPARK-17500 URL: https://issues.apache.org/jira/browse/SPARK-17500 Project: Spark

[jira] [Issue Comment Deleted] (SPARK-3630) Identify cause of Kryo+Snappy PARSING_ERROR

2016-08-22 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-3630: --- Comment: was deleted (was: How much data did you test with? We encounter this error in our production. Our data

[jira] [Comment Edited] (SPARK-3630) Identify cause of Kryo+Snappy PARSING_ERROR

2016-08-22 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430215#comment-15430215 ] DjvuLee edited comment on SPARK-3630 at 8/22/16 7:10 AM: - Can I know how much data

[jira] [Commented] (SPARK-3630) Identify cause of Kryo+Snappy PARSING_ERROR

2016-08-22 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430215#comment-15430215 ] DjvuLee commented on SPARK-3630: How much data did you test with? We encounter this error in our production.

[jira] [Commented] (SPARK-3630) Identify cause of Kryo+Snappy PARSING_ERROR

2016-08-22 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430217#comment-15430217 ] DjvuLee commented on SPARK-3630: How much data did you test with? We encounter this error in our production.

[jira] [Commented] (SPARK-11054) WARN ReliableDeliverySupervisor: Association with remote system

2015-10-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14952225#comment-14952225 ] DjvuLee commented on SPARK-11054: - Since the default value for spark.akka.heartbeat.interval or

[jira] [Created] (SPARK-11054) WARN ReliableDeliverySupervisor: Association with remote system

2015-10-11 Thread DjvuLee (JIRA)
DjvuLee created SPARK-11054: --- Summary: WARN ReliableDeliverySupervisor: Association with remote system Key: SPARK-11054 URL: https://issues.apache.org/jira/browse/SPARK-11054 Project: Spark Issue

[jira] [Commented] (SPARK-11054) WARN ReliableDeliverySupervisor: Association with remote system

2015-10-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14952653#comment-14952653 ] DjvuLee commented on SPARK-11054: - I am very sorry for setting the wrong Priority! I knew this problem

[jira] [Commented] (SPARK-10717) remove the with Loging in the NioBlockTransferService

2015-09-19 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877397#comment-14877397 ] DjvuLee commented on SPARK-10717: - change to final class NioBlockTransferService(conf: SparkConf,

[jira] [Created] (SPARK-10717) remove the with Loging in the NioBlockTransferService

2015-09-19 Thread DjvuLee (JIRA)
DjvuLee created SPARK-10717: --- Summary: remove the with Loging in the NioBlockTransferService Key: SPARK-10717 URL: https://issues.apache.org/jira/browse/SPARK-10717 Project: Spark Issue Type:

[jira] [Commented] (SPARK-4105) FAILED_TO_UNCOMPRESS(5) errors when fetching shuffle data with sort-based shuffle

2015-08-06 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661207#comment-14661207 ] DjvuLee commented on SPARK-4105: Is there any progress on this bug? I have the same

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316393#comment-14316393 ] DjvuLee commented on SPARK-5739: Yes, 1M may be enough for the KMeans algorithm. But if we

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316430#comment-14316430 ] DjvuLee commented on SPARK-5739: OK, got it. I will look at the code in more detail. I

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317384#comment-14317384 ] DjvuLee commented on SPARK-5739: Yes, I did not explain clearly. What I mean is that we can

[jira] [Updated] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-5739: --- Summary: Size exceeds Integer.MAX_VALUE in File Map (was: Size exceeds Integer.MAX_VALUE in FileMap) Size

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315845#comment-14315845 ] DjvuLee commented on SPARK-5739: The data is generated by the example KMeansDataGenerator

[jira] [Created] (SPARK-5739) Size exceeds Integer.MAX_VALUE in FileMap

2015-02-11 Thread DjvuLee (JIRA)
DjvuLee created SPARK-5739: -- Summary: Size exceeds Integer.MAX_VALUE in FileMap Key: SPARK-5739 URL: https://issues.apache.org/jira/browse/SPARK-5739 Project: Spark Issue Type: Bug Affects

[jira] [Updated] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

2015-02-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-5739: --- Description: I just ran the KMeans algorithm using randomly generated data, but this problem occurred after

[jira] [Updated] (SPARK-5375) Specify more clearly about the max thread meaning in the ConnectionManager

2015-01-22 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-5375: --- Description: In the ConnectionManager.scala file, there are three thread pools: handleMessageExecutor,

[jira] [Created] (SPARK-5375) Specify more clearly about the max thread meaning in the ConnectionManager

2015-01-22 Thread DjvuLee (JIRA)
DjvuLee created SPARK-5375: -- Summary: Specify more clearly about the max thread meaning in the ConnectionManager Key: SPARK-5375 URL: https://issues.apache.org/jira/browse/SPARK-5375 Project: Spark

[jira] [Updated] (SPARK-5375) Specify more clearly about the max thread meaning in the ConnectionManager

2015-01-22 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-5375: --- Description: In the ConnectionManager.scala file, there are three thread pools: handleMessageExecutor,

[jira] [Commented] (SPARK-1112) When spark.akka.frameSize 10, task results bigger than 10MiB block execution

2014-07-24 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073244#comment-14073244 ] DjvuLee commented on SPARK-1112: Has anyone tested this in version 0.9.2? I found it also failed,

[jira] [Commented] (SPARK-2156) When the size of serialized results for one partition is slightly smaller than 10MB (the default akka.frameSize), the execution blocks

2014-07-17 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064965#comment-14064965 ] DjvuLee commented on SPARK-2156: I see this fixed in the spark branch-0.9 in the github,

[jira] [Commented] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-07-15 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061804#comment-14061804 ] DjvuLee commented on SPARK-2138: [~piotrszul] In my opinion, if your task size is bigger

[jira] [Commented] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-07-15 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061843#comment-14061843 ] DjvuLee commented on SPARK-2138: Oh, So can we improve this better? The KMeans

[jira] [Commented] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-07-15 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062026#comment-14062026 ] DjvuLee commented on SPARK-2138: In my experiment, I set the akka.frameSize=200, my data

[jira] [Commented] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-07-15 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062163#comment-14062163 ] DjvuLee commented on SPARK-2138: Oh, I am sorry that I made a mistake in my

[jira] [Commented] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-06-30 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048488#comment-14048488 ] DjvuLee commented on SPARK-2138: @[~piotrszul] Thanks for your test, I will also test this

[jira] [Commented] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-06-27 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046597#comment-14046597 ] DjvuLee commented on SPARK-2138: If this bug is fixed, shall we close this issue? [~mengxr]

[jira] [Commented] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-06-18 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036892#comment-14036892 ] DjvuLee commented on SPARK-2138: Thanks very much! I am very glad to see that I reported

[jira] [Created] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-06-13 Thread DjvuLee (JIRA)
DjvuLee created SPARK-2138: -- Summary: The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger Key: SPARK-2138 URL: https://issues.apache.org/jira/browse/SPARK-2138

[jira] [Updated] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-06-13 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-2138: --- Description: When the algorithm running at certain stage, when running the reduceBykey() algorithm, It can
