[jira] [Commented] (SPARK-14804) Graph vertexRDD/EdgeRDD checkpoint results ClassCastException:

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567714#comment-15567714 ] Apache Spark commented on SPARK-14804: -- User 'apivovarov' has created a pull request for this issue:

[jira] [Resolved] (SPARK-17880) The url linking to `AccumulatorV2` in the document is incorrect.

2016-10-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17880. - Resolution: Fixed Assignee: Kousuke Saruta Fix Version/s: 2.1.0

[jira] [Commented] (SPARK-17846) A bad state of Running Applications with spark standalone HA

2016-10-11 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567662#comment-15567662 ] Saisai Shao commented on SPARK-17846: - I think this issue should be the same as SPARK-14262. > A bad

[jira] [Commented] (SPARK-17344) Kafka 0.8 support for Structured Streaming

2016-10-11 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567457#comment-15567457 ] Cody Koeninger commented on SPARK-17344: Given the choice between rewriting underlying kafka

[jira] [Closed] (SPARK-17837) Disaster recovery of offsets from WAL

2016-10-11 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger closed SPARK-17837. -- Resolution: Duplicate Duplicate of SPARK-17829 > Disaster recovery of offsets from WAL >

[jira] [Resolved] (SPARK-17720) Static configurations in SQL

2016-10-11 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-17720. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15295

[jira] [Assigned] (SPARK-17882) RBackendHandler swallowing errors

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17882: Assignee: Apache Spark > RBackendHandler swallowing errors >

[jira] [Assigned] (SPARK-17882) RBackendHandler swallowing errors

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17882: Assignee: (was: Apache Spark) > RBackendHandler swallowing errors >

[jira] [Commented] (SPARK-17882) RBackendHandler swallowing errors

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567423#comment-15567423 ] Apache Spark commented on SPARK-17882: -- User 'jrshust' has created a pull request for this issue:

[jira] [Commented] (SPARK-17817) PySpark RDD Repartitioning Results in Highly Skewed Partition Sizes

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567418#comment-15567418 ] Apache Spark commented on SPARK-17817: -- User 'viirya' has created a pull request for this issue:

[jira] [Created] (SPARK-17882) RBackendHandler swallowing errors

2016-10-11 Thread James Shuster (JIRA)
James Shuster created SPARK-17882: - Summary: RBackendHandler swallowing errors Key: SPARK-17882 URL: https://issues.apache.org/jira/browse/SPARK-17882 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-17344) Kafka 0.8 support for Structured Streaming

2016-10-11 Thread Jeremy Smith (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567367#comment-15567367 ] Jeremy Smith edited comment on SPARK-17344 at 10/12/16 2:56 AM: {quote}

[jira] [Commented] (SPARK-17344) Kafka 0.8 support for Structured Streaming

2016-10-11 Thread Jeremy Smith (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567367#comment-15567367 ] Jeremy Smith commented on SPARK-17344: -- > By contrast, writing a streaming source shim around the

[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-11 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567362#comment-15567362 ] Liwei Lin commented on SPARK-16845: --- Thanks for the pointer, let me look into this. :-) >

[jira] [Comment Edited] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-11 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567353#comment-15567353 ] Liwei Lin edited comment on SPARK-16845 at 10/12/16 2:51 AM: - [~dondrake]

[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-11 Thread Don Drake (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567358#comment-15567358 ] Don Drake commented on SPARK-16845: --- I can't at the moment, mine is not simple. But this JIRA has

[jira] [Comment Edited] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-11 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567353#comment-15567353 ] Liwei Lin edited comment on SPARK-16845 at 10/12/16 2:50 AM: - -[~dondrake]

[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-11 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567353#comment-15567353 ] Liwei Lin commented on SPARK-16845: --- [~dondrake] [~Utsumi] Could you provide a simple reproducer? I may

[jira] [Comment Edited] (SPARK-11758) Missing Index column while creating a DataFrame from Pandas

2016-10-11 Thread Leandro Ferrado (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567341#comment-15567341 ] Leandro Ferrado edited comment on SPARK-11758 at 10/12/16 2:43 AM: --- Hi

[jira] [Commented] (SPARK-11758) Missing Index column while creating a DataFrame from Pandas

2016-10-11 Thread Leandro Ferrado (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567341#comment-15567341 ] Leandro Ferrado commented on SPARK-11758: - Hi Holden. First, I would add just a single line in

[jira] [Comment Edited] (SPARK-11758) Missing Index column while creating a DataFrame from Pandas

2016-10-11 Thread Leandro Ferrado (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567329#comment-15567329 ] Leandro Ferrado edited comment on SPARK-11758 at 10/12/16 2:39 AM: --- Hi

[jira] [Issue Comment Deleted] (SPARK-11758) Missing Index column while creating a DataFrame from Pandas

2016-10-11 Thread Leandro Ferrado (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leandro Ferrado updated SPARK-11758: Comment: was deleted (was: Hi Holden. First, I would add just a single line in order to

[jira] [Commented] (SPARK-11758) Missing Index column while creating a DataFrame from Pandas

2016-10-11 Thread Leandro Ferrado (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567329#comment-15567329 ] Leandro Ferrado commented on SPARK-11758: - Hi Holden. First, I would add just a single line in

[jira] [Commented] (SPARK-17344) Kafka 0.8 support for Structured Streaming

2016-10-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567321#comment-15567321 ] Michael Armbrust commented on SPARK-17344: -- These are good questions. A few thoughts: bq. How

[jira] [Comment Edited] (SPARK-12484) DataFrame withColumn() does not work in Java

2016-10-11 Thread Ryan Brant (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567271#comment-15567271 ] Ryan Brant edited comment on SPARK-12484 at 10/12/16 2:04 AM: -- Was there a

[jira] [Commented] (SPARK-12484) DataFrame withColumn() does not work in Java

2016-10-11 Thread Ryan Brant (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567271#comment-15567271 ] Ryan Brant commented on SPARK-12484: Was there a resolution to this? I am also getting this issue in

[jira] [Assigned] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17870: Assignee: Apache Spark > ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD)

[jira] [Commented] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567244#comment-15567244 ] Apache Spark commented on SPARK-17870: -- User 'mpjlu' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17870: Assignee: (was: Apache Spark) > ML/MLLIB: ChiSquareSelector based on

[jira] [Commented] (SPARK-17881) Aggregation function for generating string histograms

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567226#comment-15567226 ] Apache Spark commented on SPARK-17881: -- User 'wzhfy' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17881) Aggregation function for generating string histograms

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17881: Assignee: Apache Spark > Aggregation function for generating string histograms >

[jira] [Assigned] (SPARK-17881) Aggregation function for generating string histograms

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17881: Assignee: (was: Apache Spark) > Aggregation function for generating string histograms

[jira] [Created] (SPARK-17881) Aggregation function for generating string histograms

2016-10-11 Thread Zhenhua Wang (JIRA)
Zhenhua Wang created SPARK-17881: Summary: Aggregation function for generating string histograms Key: SPARK-17881 URL: https://issues.apache.org/jira/browse/SPARK-17881 Project: Spark Issue

[jira] [Commented] (SPARK-17853) Kafka OffsetOutOfRangeException on DStreams union from separate Kafka clusters with identical topic names.

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567199#comment-15567199 ] Apache Spark commented on SPARK-17853: -- User 'koeninger' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17853) Kafka OffsetOutOfRangeException on DStreams union from separate Kafka clusters with identical topic names.

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17853: Assignee: (was: Apache Spark) > Kafka OffsetOutOfRangeException on DStreams union

[jira] [Assigned] (SPARK-17853) Kafka OffsetOutOfRangeException on DStreams union from separate Kafka clusters with identical topic names.

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17853: Assignee: Apache Spark > Kafka OffsetOutOfRangeException on DStreams union from separate

[jira] [Commented] (SPARK-17878) Support for multiple null values when reading CSV data

2016-10-11 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567180#comment-15567180 ] Hossein Falaki commented on SPARK-17878: Sure. If passing a list is possible it is the better

[jira] [Commented] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567170#comment-15567170 ] Peng Meng commented on SPARK-17870: --- hi [~avulanov], the question here is not use raw chi2 scores or

[jira] [Commented] (SPARK-17878) Support for multiple null values when reading CSV data

2016-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567149#comment-15567149 ] Hyukjin Kwon commented on SPARK-17878: -- BTW, maybe, I will try to investigate further if it is

[jira] [Comment Edited] (SPARK-17878) Support for multiple null values when reading CSV data

2016-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567124#comment-15567124 ] Hyukjin Kwon edited comment on SPARK-17878 at 10/12/16 12:50 AM: - Oh, I

[jira] [Commented] (SPARK-17878) Support for multiple null values when reading CSV data

2016-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567124#comment-15567124 ] Hyukjin Kwon commented on SPARK-17878: -- Oh, I didn't mean I am against this. I am just wondering if

[jira] [Commented] (SPARK-17878) Support for multiple null values when reading CSV data

2016-10-11 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567104#comment-15567104 ] Hossein Falaki commented on SPARK-17878: That would require API change in SparkSQL. Otherwise, we

[jira] [Commented] (SPARK-17878) Support for multiple null values when reading CSV data

2016-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567046#comment-15567046 ] Hyukjin Kwon commented on SPARK-17878: -- Maybe it'd be nicer if options allow list or nested map (if

[jira] [Commented] (SPARK-4411) Add "kill" link for jobs in the UI

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567028#comment-15567028 ] Apache Spark commented on SPARK-4411: - User 'ajbozarth' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17880) The url linking to `AccumulatorV2` in the document is incorrect.

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17880: Assignee: (was: Apache Spark) > The url linking to `AccumulatorV2` in the document is

[jira] [Commented] (SPARK-17880) The url linking to `AccumulatorV2` in the document is incorrect.

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566936#comment-15566936 ] Apache Spark commented on SPARK-17880: -- User 'sarutak' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17880) The url linking to `AccumulatorV2` in the document is incorrect.

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17880: Assignee: Apache Spark > The url linking to `AccumulatorV2` in the document is incorrect.

[jira] [Created] (SPARK-17880) The url linking to `AccumulatorV2` in the document is incorrect.

2016-10-11 Thread Kousuke Saruta (JIRA)
Kousuke Saruta created SPARK-17880: -- Summary: The url linking to `AccumulatorV2` in the document is incorrect. Key: SPARK-17880 URL: https://issues.apache.org/jira/browse/SPARK-17880 Project: Spark

[jira] [Commented] (SPARK-15621) BatchEvalPythonExec fails with OOM

2016-10-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566858#comment-15566858 ] Davies Liu commented on SPARK-15621: [~rezasafi] We usually do not backport this kind of

[jira] [Commented] (SPARK-15621) BatchEvalPythonExec fails with OOM

2016-10-11 Thread Reza Safi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566752#comment-15566752 ] Reza Safi commented on SPARK-15621: --- Hi [~davies], can the fix be backported to branch-2.0 as well,

[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-11 Thread K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566741#comment-15566741 ] K commented on SPARK-16845: --- We manually wrote parts that were throwing errors (StringIndexer and

[jira] [Resolved] (SPARK-17387) Creating SparkContext() from python without spark-submit ignores user conf

2016-10-11 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-17387. Resolution: Fixed Assignee: Jeff Zhang Fix Version/s: 2.1.0 > Creating

[jira] [Updated] (SPARK-17877) Can not checkpoint connectedComponents resulting graph

2016-10-11 Thread Alexander Pivovarov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated SPARK-17877: Description: The following code demonstrates the issue {code} import

[jira] [Commented] (SPARK-9879) OOM in LIMIT clause with large number

2016-10-11 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566639#comment-15566639 ] Dongjoon Hyun commented on SPARK-9879: -- Hi, All. The PR seems to be closed last December. Can we

[jira] [Created] (SPARK-17879) Don't compact metadata logs constantly into a single compacted file

2016-10-11 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-17879: --- Summary: Don't compact metadata logs constantly into a single compacted file Key: SPARK-17879 URL: https://issues.apache.org/jira/browse/SPARK-17879 Project: Spark

[jira] [Commented] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566586#comment-15566586 ] Sean Owen commented on SPARK-17870: --- If the degrees of freedom are the same across the tests, then

[jira] [Created] (SPARK-17878) Support for multiple null values when reading CSV data

2016-10-11 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-17878: -- Summary: Support for multiple null values when reading CSV data Key: SPARK-17878 URL: https://issues.apache.org/jira/browse/SPARK-17878 Project: Spark

[jira] [Updated] (SPARK-17455) IsotonicRegression takes non-polynomial time for some inputs

2016-10-11 Thread Nic Eggert (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nic Eggert updated SPARK-17455: --- Priority: Major (was: Minor) > IsotonicRegression takes non-polynomial time for some inputs >

[jira] [Commented] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-11 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566541#comment-15566541 ] Hossein Falaki commented on SPARK-17781: [~shivaram] Thanks for looking into it. I think the

[jira] [Updated] (SPARK-17863) SELECT distinct does not work if there is a order by clause

2016-10-11 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-17863: - Description: {code} select distinct struct.a, struct.b from ( select named_struct('a', 1, 'b', 2, 'c',

[jira] [Comment Edited] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2016-10-11 Thread Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566515#comment-15566515 ] Harish edited comment on SPARK-17463 at 10/11/16 8:34 PM: -- My second approach

[jira] [Commented] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2016-10-11 Thread Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566515#comment-15566515 ] Harish commented on SPARK-17463: My second approach was: def testfunc(keys, vals, columnsToStandardize):

[jira] [Assigned] (SPARK-17845) Improve window function frame boundary API in DataFrame

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17845: Assignee: Apache Spark (was: Reynold Xin) > Improve window function frame boundary API

[jira] [Assigned] (SPARK-17845) Improve window function frame boundary API in DataFrame

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17845: Assignee: Reynold Xin (was: Apache Spark) > Improve window function frame boundary API

[jira] [Commented] (SPARK-17845) Improve window function frame boundary API in DataFrame

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566509#comment-15566509 ] Apache Spark commented on SPARK-17845: -- User 'rxin' has created a pull request for this issue:

[jira] [Closed] (SPARK-17857) SHOW TABLES IN schema throws exception if schema doesn't exist

2016-10-11 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun closed SPARK-17857. - Resolution: Not A Problem Although the behavior is changed from 1.x, we had better close this

[jira] [Commented] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Alexander Ulanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566467#comment-15566467 ] Alexander Ulanov commented on SPARK-17870: --

[jira] [Commented] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-11 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566457#comment-15566457 ] Shivaram Venkataraman commented on SPARK-17781: --- [~falaki] I looked at this a bit more

[jira] [Commented] (SPARK-11784) enable Timestamp filter pushdown

2016-10-11 Thread Ian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566439#comment-15566439 ] Ian commented on SPARK-11784: - Yes, I meant TimestampType filter pushdown > enable Timestamp filter pushdown

[jira] [Commented] (SPARK-4411) Add "kill" link for jobs in the UI

2016-10-11 Thread Alex Bozarth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566430#comment-15566430 ] Alex Bozarth commented on SPARK-4411: - I'm currently working on this. I'm updating the original pr to

[jira] [Comment Edited] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2016-10-11 Thread Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566427#comment-15566427 ] Harish edited comment on SPARK-17463 at 10/11/16 8:06 PM: -- No, I don't have any

[jira] [Commented] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2016-10-11 Thread Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566427#comment-15566427 ] Harish commented on SPARK-17463: No, I don't have any code like that. I use PySpark. Please find my code

[jira] [Comment Edited] (SPARK-12216) Spark failed to delete temp directory

2016-10-11 Thread Jerome Scheuring (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566349#comment-15566349 ] Jerome Scheuring edited comment on SPARK-12216 at 10/11/16 7:59 PM:

[jira] [Updated] (SPARK-17877) Can not checkpoint connectedComponents resulting graph

2016-10-11 Thread Alexander Pivovarov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated SPARK-17877: Description: The following code demonstrates the issue {code} import

[jira] [Commented] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2016-10-11 Thread Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566413#comment-15566413 ] Harish commented on SPARK-17463: OK, thanks for the update. Do we have any workaround for the second

[jira] [Updated] (SPARK-17816) Json serialzation of accumulators are failing with ConcurrentModificationException

2016-10-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17816: - Fix Version/s: 2.0.2 > Json serialzation of accumulators are failing with >

[jira] [Updated] (SPARK-17816) Json serialzation of accumulators are failing with ConcurrentModificationException

2016-10-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17816: - Affects Version/s: 2.0.1 > Json serialzation of accumulators are failing with >

[jira] [Comment Edited] (SPARK-17877) Can not checkpoint connectedComponents resulting graph

2016-10-11 Thread Alexander Pivovarov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566385#comment-15566385 ] Alexander Pivovarov edited comment on SPARK-17877 at 10/11/16 7:50 PM:

[jira] [Updated] (SPARK-17812) More granular control of starting offsets (assign)

2016-10-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17812: - Summary: More granular control of starting offsets (assign) (was: More granular control

[jira] [Commented] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2016-10-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566391#comment-15566391 ] Shixiong Zhu commented on SPARK-17463: -- Do you have a reproducer? I saw `at

[jira] [Commented] (SPARK-17812) More granular control of starting offsets

2016-10-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566392#comment-15566392 ] Michael Armbrust commented on SPARK-17812: -- For the seeking back {{X}} offsets use case, I was

[jira] [Updated] (SPARK-17812) More granular control of starting offsets

2016-10-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17812: - Description: Right now you can only run a Streaming Query starting from either the

[jira] [Commented] (SPARK-17877) Can not checkpoint connectedComponents resulting graph

2016-10-11 Thread Alexander Pivovarov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566385#comment-15566385 ] Alexander Pivovarov commented on SPARK-17877: - Another open issue with checkpointing is

[jira] [Commented] (SPARK-17845) Improve window function frame boundary API in DataFrame

2016-10-11 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566380#comment-15566380 ] Timothy Hunter commented on SPARK-17845: I like the {{Window.rowsBetween(Long.MinValue, -3)}}

[jira] [Resolved] (SPARK-15153) SparkR spark.naiveBayes throws error when label is numeric type

2016-10-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-15153. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15431

[jira] [Updated] (SPARK-17877) Can not checkpoint connectedComponents resulting graph

2016-10-11 Thread Alexander Pivovarov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated SPARK-17877: Description: The following code demonstrates the issue {code} import

[jira] [Updated] (SPARK-17877) Can not checkpoint connectedComponents resulting graph

2016-10-11 Thread Alexander Pivovarov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated SPARK-17877: Description: The following code demonstrates the issue {code} import

[jira] [Created] (SPARK-17877) Can not checkpoint connectedComponents resulting graph

2016-10-11 Thread Alexander Pivovarov (JIRA)
Alexander Pivovarov created SPARK-17877: --- Summary: Can not checkpoint connectedComponents resulting graph Key: SPARK-17877 URL: https://issues.apache.org/jira/browse/SPARK-17877 Project: Spark

[jira] [Updated] (SPARK-17877) Can not checkpoint connectedComponents resulting graph

2016-10-11 Thread Alexander Pivovarov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated SPARK-17877: Description: The following code demonstrates the issue {code} import

[jira] [Updated] (SPARK-17877) Can not checkpoint connectedComponents resulting graph

2016-10-11 Thread Alexander Pivovarov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated SPARK-17877: Description: The following code demonstrates an issue {code} import

[jira] [Comment Edited] (SPARK-12216) Spark failed to delete temp directory

2016-10-11 Thread Jerome Scheuring (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566349#comment-15566349 ] Jerome Scheuring edited comment on SPARK-12216 at 10/11/16 7:34 PM:

[jira] [Commented] (SPARK-12216) Spark failed to delete temp directory

2016-10-11 Thread Jerome Scheuring (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566349#comment-15566349 ] Jerome Scheuring commented on SPARK-12216: -- _Note that I am entirely new to the process of

[jira] [Commented] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566299#comment-15566299 ] Sean Owen commented on SPARK-17463: --- No, that change came after, and is part of a different JIRA that

[jira] [Commented] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566300#comment-15566300 ] Sean Owen commented on SPARK-17463: --- No, that change came after, and is part of a different JIRA that

[jira] [Commented] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566284#comment-15566284 ] Sean Owen commented on SPARK-17870: --- OK I get it, they're doing different things really. The scikit

[jira] [Commented] (SPARK-15153) SparkR spark.naiveBayes throws error when label is numeric type

2016-10-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566251#comment-15566251 ] Joseph K. Bradley commented on SPARK-15153: --- Note I'm setting the target version for 2.1, not

[jira] [Updated] (SPARK-15153) SparkR spark.naiveBayes throws error when label is numeric type

2016-10-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-15153: -- Target Version/s: 2.1.0 > SparkR spark.naiveBayes throws error when label is numeric

[jira] [Commented] (SPARK-17344) Kafka 0.8 support for Structured Streaming

2016-10-11 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566244#comment-15566244 ] Cody Koeninger commented on SPARK-17344: How long would it take CDH to distribute 0.10 if there

[jira] [Resolved] (SPARK-17817) PySpark RDD Repartitioning Results in Highly Skewed Partition Sizes

2016-10-11 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-17817. -- Resolution: Fixed Assignee: Liang-Chi Hsieh Fix Version/s: 2.1.0 > PySpark RDD

[jira] [Commented] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2016-10-11 Thread Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566206#comment-15566206 ] Harish commented on SPARK-17463: Is this fix part of the https://github.com/apache/spark/pull/15371
