[jira] [Commented] (SPARK-17344) Kafka 0.8 support for Structured Streaming

2016-10-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566191#comment-15566191 ] Michael Armbrust commented on SPARK-17344: -- I think the fact that CDH is still distributing 0.9

[jira] [Assigned] (SPARK-17876) Write StructuredStreaming WAL to a stream instead of materializing all at once

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17876: Assignee: (was: Apache Spark) > Write StructuredStreaming WAL to a stream instead of

[jira] [Commented] (SPARK-17876) Write StructuredStreaming WAL to a stream instead of materializing all at once

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566181#comment-15566181 ] Apache Spark commented on SPARK-17876: -- User 'brkyvz' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17876) Write StructuredStreaming WAL to a stream instead of materializing all at once

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17876: Assignee: Apache Spark > Write StructuredStreaming WAL to a stream instead of

[jira] [Created] (SPARK-17876) Write StructuredStreaming WAL to a stream instead of materializing all at once

2016-10-11 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-17876: --- Summary: Write StructuredStreaming WAL to a stream instead of materializing all at once Key: SPARK-17876 URL: https://issues.apache.org/jira/browse/SPARK-17876

[jira] [Commented] (SPARK-17709) spark 2.0 join - column resolution error

2016-10-11 Thread Ashish Shrowty (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566160#comment-15566160 ] Ashish Shrowty commented on SPARK-17709: Cool.. thanks. Will do this in next day or two. > spark

[jira] [Commented] (SPARK-17811) SparkR cannot parallelize data.frame with NA or NULL in Date columns

2016-10-11 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566163#comment-15566163 ] Miao Wang commented on SPARK-17811: --- :) Just want to submit a PR and found that you have a fix. Good to

[jira] [Assigned] (SPARK-17875) Remove unneeded direct dependence on Netty 3.x

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17875: Assignee: Apache Spark (was: Sean Owen) > Remove unneeded direct dependence on Netty 3.x

[jira] [Commented] (SPARK-17875) Remove unneeded direct dependence on Netty 3.x

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566136#comment-15566136 ] Apache Spark commented on SPARK-17875: -- User 'srowen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17875) Remove unneeded direct dependence on Netty 3.x

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17875: Assignee: Sean Owen (was: Apache Spark) > Remove unneeded direct dependence on Netty 3.x

[jira] [Created] (SPARK-17875) Remove unneeded direct dependence on Netty 3.x

2016-10-11 Thread Sean Owen (JIRA)
Sean Owen created SPARK-17875: - Summary: Remove unneeded direct dependence on Netty 3.x Key: SPARK-17875 URL: https://issues.apache.org/jira/browse/SPARK-17875 Project: Spark Issue Type:

[jira] [Commented] (SPARK-17709) spark 2.0 join - column resolution error

2016-10-11 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566126#comment-15566126 ] Xiao Li commented on SPARK-17709: - Below is the link:

[jira] [Commented] (SPARK-17709) spark 2.0 join - column resolution error

2016-10-11 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566124#comment-15566124 ] Xiao Li commented on SPARK-17709: - Below is the link:

[jira] [Commented] (SPARK-17709) spark 2.0 join - column resolution error

2016-10-11 Thread Ashish Shrowty (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566114#comment-15566114 ] Ashish Shrowty commented on SPARK-17709: I assume I would need to modify the Spark code and build

[jira] [Commented] (SPARK-17808) BinaryType fails in Python 3 due to outdated Pyrolite

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566107#comment-15566107 ] Sean Owen commented on SPARK-17808: --- I think it could be OK. It's a bug fix, and while it is a minor

[jira] [Commented] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566104#comment-15566104 ] Sean Owen commented on SPARK-17463: --- What do you mean? this has been released already in 2.0.1. >

[jira] [Created] (SPARK-17874) Enabling SSL on HistoryServer should only open one port not two

2016-10-11 Thread Andrew Ash (JIRA)
Andrew Ash created SPARK-17874: -- Summary: Enabling SSL on HistoryServer should only open one port not two Key: SPARK-17874 URL: https://issues.apache.org/jira/browse/SPARK-17874 Project: Spark

[jira] [Updated] (SPARK-17874) Additional SSL port on HistoryServer should be configurable

2016-10-11 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-17874: --- Summary: Additional SSL port on HistoryServer should be configurable (was: Enabling SSL on

[jira] [Updated] (SPARK-17858) Provide option for Spark SQL to skip corrupt files

2016-10-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17858: - Description: In Spark 2.0, corrupt files will fail a SQL query. However, the user may just want

[jira] [Commented] (SPARK-15343) NoClassDefFoundError when initializing Spark with YARN

2016-10-11 Thread Jo Desmet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566019#comment-15566019 ] Jo Desmet commented on SPARK-15343: --- By design we apparently have a very tight coupling of scheduling

[jira] [Commented] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565973#comment-15565973 ] Apache Spark commented on SPARK-17139: -- User 'WeichenXu123' has created a pull request for this

[jira] [Assigned] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17139: Assignee: Apache Spark > Add model summary for MultinomialLogisticRegression >

[jira] [Assigned] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17139: Assignee: (was: Apache Spark) > Add model summary for MultinomialLogisticRegression >

[jira] [Updated] (SPARK-4105) FAILED_TO_UNCOMPRESS(5) errors when fetching shuffle data with sort-based shuffle

2016-10-11 Thread Artur Sukhenko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artur Sukhenko updated SPARK-4105: -- Affects Version/s: 2.0.0 > FAILED_TO_UNCOMPRESS(5) errors when fetching shuffle data with

[jira] [Updated] (SPARK-8425) Add blacklist mechanism for task scheduling

2016-10-11 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-8425: Attachment: DesignDocforBlacklistMechanism.pdf Seems like there is agreement on the design, so I'm

[jira] [Updated] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peng Meng updated SPARK-17870: -- Summary: ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong (was: ML/MLLIB:

[jira] [Assigned] (SPARK-17873) ALTER TABLE ... RENAME TO ... should allow users to specify database in destination table name

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17873: Assignee: Apache Spark (was: Wenchen Fan) > ALTER TABLE ... RENAME TO ... should allow

[jira] [Commented] (SPARK-17873) ALTER TABLE ... RENAME TO ... should allow users to specify database in destination table name

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565623#comment-15565623 ] Apache Spark commented on SPARK-17873: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-10-11 Thread Don Drake (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565592#comment-15565592 ] Don Drake commented on SPARK-16845: --- Unfortunately, it does not work around it. 16/10/10 18:19:47

[jira] [Created] (SPARK-17873) ALTER TABLE ... RENAME TO ... should allow users to specify database in destination table name

2016-10-11 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-17873: --- Summary: ALTER TABLE ... RENAME TO ... should allow users to specify database in destination table name Key: SPARK-17873 URL: https://issues.apache.org/jira/browse/SPARK-17873

[jira] [Commented] (SPARK-17853) Kafka OffsetOutOfRangeException on DStreams union from separate Kafka clusters with identical topic names.

2016-10-11 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565559#comment-15565559 ] Cody Koeninger commented on SPARK-17853: Good, will keep this ticket open at least until

[jira] [Created] (SPARK-17872) aggregate function on dataset with tuples grouped by non sequential fields

2016-10-11 Thread Niek Bartholomeus (JIRA)
Niek Bartholomeus created SPARK-17872: - Summary: aggregate function on dataset with tuples grouped by non sequential fields Key: SPARK-17872 URL: https://issues.apache.org/jira/browse/SPARK-17872

[jira] [Updated] (SPARK-17872) aggregate function on dataset with tuples grouped by non sequential fields

2016-10-11 Thread Niek Bartholomeus (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niek Bartholomeus updated SPARK-17872: -- Description: The following lines where the field index in the tuple used in an

[jira] [Commented] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2016-10-11 Thread Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565493#comment-15565493 ] Harish commented on SPARK-17463: It looks like a show stopper for my current project. Can you please let

[jira] [Comment Edited] (SPARK-17463) Serialization of accumulators in heartbeats is not thread-safe

2016-10-11 Thread Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565493#comment-15565493 ] Harish edited comment on SPARK-17463 at 10/11/16 2:03 PM: -- It looks like a show

[jira] [Commented] (SPARK-17808) BinaryType fails in Python 3 due to outdated Pyrolite

2016-10-11 Thread Pete Fein (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565470#comment-15565470 ] Pete Fein commented on SPARK-17808: --- Any reason this can't be included in the next 2.0.x bug fix

[jira] [Commented] (SPARK-17853) Kafka OffsetOutOfRangeException on DStreams union from separate Kafka clusters with identical topic names.

2016-10-11 Thread Aleksander Ihnatowicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565448#comment-15565448 ] Aleksander Ihnatowicz commented on SPARK-17853: --- Setting different group ids solved the

[jira] [Commented] (SPARK-17853) Kafka OffsetOutOfRangeException on DStreams union from separate Kafka clusters with identical topic names.

2016-10-11 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565400#comment-15565400 ] Cody Koeninger commented on SPARK-17853: Use a different group id. Let me know if that addresses

[jira] [Assigned] (SPARK-17822) JVMObjectTracker.objMap may leak JVM objects

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17822: Assignee: Apache Spark > JVMObjectTracker.objMap may leak JVM objects >

[jira] [Assigned] (SPARK-17822) JVMObjectTracker.objMap may leak JVM objects

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17822: Assignee: (was: Apache Spark) > JVMObjectTracker.objMap may leak JVM objects >

[jira] [Commented] (SPARK-17822) JVMObjectTracker.objMap may leak JVM objects

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565374#comment-15565374 ] Apache Spark commented on SPARK-17822: -- User 'techaddict' has created a pull request for this issue:

[jira] [Created] (SPARK-17871) Dataset joinwith syntax should support specifying the condition in a compile-time safe way

2016-10-11 Thread Jamie Hutton (JIRA)
Jamie Hutton created SPARK-17871: Summary: Dataset joinwith syntax should support specifying the condition in a compile-time safe way Key: SPARK-17871 URL: https://issues.apache.org/jira/browse/SPARK-17871

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565315#comment-15565315 ] Peng Meng commented on SPARK-17870: --- https://github.com/apache/spark/pull/1484#issuecomment-51024568 Hi

[jira] [Commented] (SPARK-17853) Kafka OffsetOutOfRangeException on DStreams union from separate Kafka clusters with identical topic names.

2016-10-11 Thread Piotr Guzik (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565307#comment-15565307 ] Piotr Guzik commented on SPARK-17853: - Hi. We are using version 0-10. We are also using the same

[jira] [Commented] (SPARK-17853) Kafka OffsetOutOfRangeException on DStreams union from separate Kafka clusters with identical topic names.

2016-10-11 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565278#comment-15565278 ] Cody Koeninger commented on SPARK-17853: Which version of DStream are you using, 0-10 or 0-8? Are

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565251#comment-15565251 ] Peng Meng commented on SPARK-17870: --- The scikit learn code is here:

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565238#comment-15565238 ] Sean Owen commented on SPARK-17870: --- I don't quite understand this example, can you point me to the

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565225#comment-15565225 ] Peng Meng commented on SPARK-17870: --- yes, the selectKBest and selectPercentile in scikit learn only use

[jira] [Resolved] (SPARK-17656) Decide on the variant of @scala.annotation.varargs and use consistently

2016-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-17656. -- Resolution: Fixed This was fixed in the PR together. > Decide on the variant of

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565180#comment-15565180 ] Sean Owen commented on SPARK-17870: --- I don't think the raw statistic can be directly compared here

[jira] [Updated] (SPARK-14272) Evaluate GaussianMixtureModel with LogLikelihood

2016-10-11 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-14272: - Component/s: (was: MLlib) ML > Evaluate GaussianMixtureModel with

[jira] [Commented] (SPARK-14272) Evaluate GaussianMixtureModel with LogLikelihood

2016-10-11 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565069#comment-15565069 ] zhengruifeng commented on SPARK-14272: -- Yes, I will a update after SPARK-17847 get merged >

[jira] [Comment Edited] (SPARK-14272) Evaluate GaussianMixtureModel with LogLikelihood

2016-10-11 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565069#comment-15565069 ] zhengruifeng edited comment on SPARK-14272 at 10/11/16 10:07 AM: - Yes, I

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565041#comment-15565041 ] Peng Meng commented on SPARK-17870: --- hi [~srowen], thanks very much for you quickly reply. yes,the

[jira] [Commented] (SPARK-17854) it will failed when do select rand(null)

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565015#comment-15565015 ] Apache Spark commented on SPARK-17854: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-17854) it will failed when do select rand(null)

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17854: Assignee: (was: Apache Spark) > it will failed when do select rand(null) >

[jira] [Assigned] (SPARK-17854) it will failed when do select rand(null)

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17854: Assignee: Apache Spark > it will failed when do select rand(null) >

[jira] [Commented] (SPARK-15153) SparkR spark.naiveBayes throws error when label is numeric type

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565004#comment-15565004 ] Apache Spark commented on SPARK-15153: -- User 'yanboliang' has created a pull request for this issue:

[jira] [Commented] (SPARK-14272) Evaluate GaussianMixtureModel with LogLikelihood

2016-10-11 Thread Lei Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564986#comment-15564986 ] Lei Wang commented on SPARK-14272: -- Is this still in progress? > Evaluate GaussianMixtureModel with

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564959#comment-15564959 ] Sean Owen commented on SPARK-17870: --- Oof, I'm pretty certain you're correct. You can rank on the

[jira] [Comment Edited] (SPARK-17784) Add fromCenters method for KMeans

2016-10-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564946#comment-15564946 ] Nick Pentreath edited comment on SPARK-17784 at 10/11/16 8:59 AM: -- It's

[jira] [Commented] (SPARK-17784) Add fromCenters method for KMeans

2016-10-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564946#comment-15564946 ] Nick Pentreath commented on SPARK-17784: It's actually to create a new `KMeans` estimator I

[jira] [Created] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Peng Meng (JIRA)
Peng Meng created SPARK-17870: - Summary: ML/MLLIB: Statistics.chiSqTest(RDD) is wrong Key: SPARK-17870 URL: https://issues.apache.org/jira/browse/SPARK-17870 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-17854) it will failed when do select rand(null)

2016-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564884#comment-15564884 ] Hyukjin Kwon commented on SPARK-17854: -- It seems this can be quickly fixed. Please let me submit a

[jira] [Commented] (SPARK-15957) RFormula supports forcing to index label

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564876#comment-15564876 ] Apache Spark commented on SPARK-15957: -- User 'yanboliang' has created a pull request for this issue:

[jira] [Updated] (SPARK-17821) Expression Canonicalization should support Add and Or

2016-10-11 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-17821: Assignee: Liang-Chi Hsieh > Expression Canonicalization should support Add and Or >

[jira] [Resolved] (SPARK-17821) Expression Canonicalization should support Add and Or

2016-10-11 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17821. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15388

[jira] [Commented] (SPARK-17219) QuantileDiscretizer does strange things with NaN values

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564828#comment-15564828 ] Sean Owen commented on SPARK-17219: --- Yeah, unless you return some complex object with normal buckets

[jira] [Closed] (SPARK-17869) Connect to Amazon S3 using signature version 4 (only choice in Frankfurt)

2016-10-11 Thread Robin B (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robin B closed SPARK-17869. --- Resolution: Won't Fix You are right [~srowen] > Connect to Amazon S3 using signature version 4 (only choice

[jira] [Assigned] (SPARK-17840) Add some pointers for wiki/CONTRIBUTING.md in README.md and some warnings in PULL_REQUEST_TEMPLATE

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17840: Assignee: Apache Spark > Add some pointers for wiki/CONTRIBUTING.md in README.md and some

[jira] [Assigned] (SPARK-17840) Add some pointers for wiki/CONTRIBUTING.md in README.md and some warnings in PULL_REQUEST_TEMPLATE

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17840: Assignee: (was: Apache Spark) > Add some pointers for wiki/CONTRIBUTING.md in

[jira] [Commented] (SPARK-17840) Add some pointers for wiki/CONTRIBUTING.md in README.md and some warnings in PULL_REQUEST_TEMPLATE

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564814#comment-15564814 ] Apache Spark commented on SPARK-17840: -- User 'srowen' has created a pull request for this issue:

[jira] [Commented] (SPARK-17869) Connect to Amazon S3 using signature version 4 (only choice in Frankfurt)

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564801#comment-15564801 ] Sean Owen commented on SPARK-17869: --- This isn't a Spark issue, right? it's an issue with S3 config in

[jira] [Updated] (SPARK-17869) Connect to Amazon S3 using signature version 4 (only choice in Frankfurt)

2016-10-11 Thread Robin B (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robin B updated SPARK-17869: Description: Connection fails with **400 Bad request** for S3 in Frankfurt region where version 4

[jira] [Created] (SPARK-17869) Connect to Amazon S3 using signature version 4 (only choice in Frankfurt)

2016-10-11 Thread Robin B (JIRA)
Robin B created SPARK-17869: --- Summary: Connect to Amazon S3 using signature version 4 (only choice in Frankfurt) Key: SPARK-17869 URL: https://issues.apache.org/jira/browse/SPARK-17869 Project: Spark

[jira] [Commented] (SPARK-17219) QuantileDiscretizer does strange things with NaN values

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564786#comment-15564786 ] Apache Spark commented on SPARK-17219: -- User 'VinceShieh' has created a pull request for this issue:

[jira] [Resolved] (SPARK-17864) Mark data type APIs as stable, rather than DeveloperApi

2016-10-11 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17864. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15426

[jira] [Resolved] (SPARK-17825) Expose log likelihood of EM algorithm in mllib

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17825. --- Resolution: Duplicate > Expose log likelihood of EM algorithm in mllib >

[jira] [Commented] (SPARK-17825) Expose log likelihood of EM algorithm in mllib

2016-10-11 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564750#comment-15564750 ] zhengruifeng commented on SPARK-17825: -- This jira seems a duplicate of [Spark-14272] > Expose log

[jira] [Commented] (SPARK-17868) Do not use bitmasks during parsing and analysis of CUBE/ROLLUP/GROUPING SETS

2016-10-11 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564751#comment-15564751 ] Jiang Xingbo commented on SPARK-17868: -- Yes, I'll be working on this. Thank you! > Do not use

[jira] [Commented] (SPARK-17487) Configurable bucketing info extraction

2016-10-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564701#comment-15564701 ] Reynold Xin commented on SPARK-17487: - Thanks - that makes sense! > Configurable bucketing info

[jira] [Resolved] (SPARK-17808) BinaryType fails in Python 3 due to outdated Pyrolite

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17808. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15386

[jira] [Updated] (SPARK-17808) BinaryType fails in Python 3 due to outdated Pyrolite

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17808: -- Assignee: Bryan Cutler > BinaryType fails in Python 3 due to outdated Pyrolite >

[jira] [Assigned] (SPARK-17866) Dataset.dropDuplicates (i.e., distinct) should not change the output of child plan

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17866: Assignee: (was: Apache Spark) > Dataset.dropDuplicates (i.e., distinct) should not

[jira] [Commented] (SPARK-17866) Dataset.dropDuplicates (i.e., distinct) should not change the output of child plan

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564616#comment-15564616 ] Apache Spark commented on SPARK-17866: -- User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17867) Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17867: Assignee: Apache Spark > Dataset.dropDuplicates (i.e. distinct) should consider the

[jira] [Commented] (SPARK-17867) Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564617#comment-15564617 ] Apache Spark commented on SPARK-17867: -- User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17866) Dataset.dropDuplicates (i.e., distinct) should not change the output of child plan

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17866: Assignee: Apache Spark > Dataset.dropDuplicates (i.e., distinct) should not change the

[jira] [Assigned] (SPARK-17867) Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17867: Assignee: (was: Apache Spark) > Dataset.dropDuplicates (i.e. distinct) should
