[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2016-04-12 Thread Stephane Maarek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238723#comment-15238723 ] Stephane Maarek commented on SPARK-12741: - can we please re-open the issue? > Da

[jira] [Commented] (SPARK-14587) abstract class Receiver should be explicit about the return type of its methods

2016-04-12 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238714#comment-15238714 ] Liwei Lin commented on SPARK-14587: --- hi [~jlaskowski] would you like to fix or would yo

[jira] [Commented] (SPARK-14591) DDLParser should accept decimal(precision)

2016-04-12 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238711#comment-15238711 ] Yin Huai commented on SPARK-14591: -- btw, is there any way to allow using keywords as col

[jira] [Commented] (SPARK-14591) DDLParser should accept decimal(precision)

2016-04-12 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238708#comment-15238708 ] Yin Huai commented on SPARK-14591: -- OK. Thanks. I found it while looking at the test of

[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238707#comment-15238707 ] Nick Pentreath commented on SPARK-14409: Given the amount of existing code in mll

[jira] [Commented] (SPARK-14593) Make currentVars work with splitExpressions to enable whole stage codegen for large input columns

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238703#comment-15238703 ] Apache Spark commented on SPARK-14593: -- User 'viirya' has created a pull request for

[jira] [Assigned] (SPARK-14593) Make currentVars work with splitExpressions to enable whole stage codegen for large input columns

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14593: Assignee: (was: Apache Spark) > Make currentVars work with splitExpressions to enable

[jira] [Assigned] (SPARK-14593) Make currentVars work with splitExpressions to enable whole stage codegen for large input columns

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14593: Assignee: Apache Spark > Make currentVars work with splitExpressions to enable whole stage

[jira] [Created] (SPARK-14593) Make currentVars work with splitExpressions to enable whole stage codegen for large input columns

2016-04-12 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-14593: --- Summary: Make currentVars work with splitExpressions to enable whole stage codegen for large input columns Key: SPARK-14593 URL: https://issues.apache.org/jira/browse/SPARK-

[jira] [Commented] (SPARK-14591) DDLParser should accept decimal(precision)

2016-04-12 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238701#comment-15238701 ] Herman van Hovell commented on SPARK-14591: --- I think it is still used in a coup

[jira] [Commented] (SPARK-14525) DataFrameWriter's save method should delegate to jdbc for jdbc datasource

2016-04-12 Thread Justin Pihony (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238693#comment-15238693 ] Justin Pihony commented on SPARK-14525: --- I don't see why not since they're just key

[jira] [Issue Comment Deleted] (SPARK-14525) DataFrameWriter's save method should delegate to jdbc for jdbc datasource

2016-04-12 Thread Justin Pihony (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Justin Pihony updated SPARK-14525: -- Comment: was deleted (was: I don't see why not since they're just key/values anyway. Here's the

[jira] [Comment Edited] (SPARK-14591) DDLParser should accept decimal(precision)

2016-04-12 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238686#comment-15238686 ] Yin Huai edited comment on SPARK-14591 at 4/13/16 6:21 AM: --- Ah,

[jira] [Commented] (SPARK-14525) DataFrameWriter's save method should delegate to jdbc for jdbc datasource

2016-04-12 Thread Justin Pihony (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238690#comment-15238690 ] Justin Pihony commented on SPARK-14525: --- I don't see why not since they're just key

[jira] [Commented] (SPARK-11374) skip.header.line.count is ignored in HiveContext

2016-04-12 Thread Stephane Maarek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238689#comment-15238689 ] Stephane Maarek commented on SPARK-11374: - any updates on this? Just some log: {

[jira] [Issue Comment Deleted] (SPARK-11374) skip.header.line.count is ignored in HiveContext

2016-04-12 Thread Stephane Maarek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephane Maarek updated SPARK-11374: Comment: was deleted (was: I may add that more metadata isn't processed, namely TBLPROPERTI

[jira] [Commented] (SPARK-14591) DDLParser should accept decimal(precision)

2016-04-12 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238686#comment-15238686 ] Yin Huai commented on SPARK-14591: -- Ah, we have this. I was using {{org.apache.spark.sq

[jira] [Commented] (SPARK-14154) Simplify the implementation for Kolmogorov–Smirnov test

2016-04-12 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238673#comment-15238673 ] yuhao yang commented on SPARK-14154: result on 4 cores: ||scale ||old

[jira] [Commented] (SPARK-14433) PySpark ml GaussianMixture

2016-04-12 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238671#comment-15238671 ] Miao Wang commented on SPARK-14433: --- Coding started; 30% complete. > PySpark ml Gaussia

[jira] [Commented] (SPARK-14591) DDLParser should accept decimal(precision)

2016-04-12 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238667#comment-15238667 ] Herman van Hovell commented on SPARK-14591: --- [~yhuai] I just tried the followin

[jira] [Commented] (SPARK-14525) DataFrameWriter's save method should delegate to jdbc for jdbc datasource

2016-04-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238666#comment-15238666 ] Reynold Xin commented on SPARK-14525: - is it possible to specify all the options for

[jira] [Commented] (SPARK-11157) Allow Spark to be built without assemblies

2016-04-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238665#comment-15238665 ] Marcelo Vanzin commented on SPARK-11157: Please file a separate bug for that (it

[jira] [Commented] (SPARK-14525) DataFrameWriter's save method should delegate to jdbc for jdbc datasource

2016-04-12 Thread Justin Pihony (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238621#comment-15238621 ] Justin Pihony commented on SPARK-14525: --- I don't mind putting together a PR for thi

[jira] [Commented] (SPARK-14540) Support Scala 2.12 closures and Java 8 lambdas in ClosureCleaner

2016-04-12 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238602#comment-15238602 ] Josh Rosen commented on SPARK-14540: I found a problem which seems to prevent the cle

[jira] [Created] (SPARK-14592) Create table like

2016-04-12 Thread Yin Huai (JIRA)
Yin Huai created SPARK-14592: Summary: Create table like Key: SPARK-14592 URL: https://issues.apache.org/jira/browse/SPARK-14592 Project: Spark Issue Type: Sub-task Components: SQL

[jira] [Created] (SPARK-14591) DDLParser should accept decimal(precision)

2016-04-12 Thread Yin Huai (JIRA)
Yin Huai created SPARK-14591: Summary: DDLParser should accept decimal(precision) Key: SPARK-14591 URL: https://issues.apache.org/jira/browse/SPARK-14591 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-14127) [Table related commands] Describe table

2016-04-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238567#comment-15238567 ] Xiao Li commented on SPARK-14127: - Most of the work duplicates `show table extended`.

[jira] [Updated] (SPARK-14586) SparkSQL doesn't parse decimal like Hive

2016-04-12 Thread Stephane Maarek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephane Maarek updated SPARK-14586: Description: create a test_data.csv with the following {code:none} a, 2.0 ,3.0 {code} (the

[jira] [Commented] (SPARK-14499) Add tests to make sure drop partitions of an external table will not delete data

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238552#comment-15238552 ] Apache Spark commented on SPARK-14499: -- User 'gatorsmile' has created a pull request

[jira] [Assigned] (SPARK-14499) Add tests to make sure drop partitions of an external table will not delete data

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14499: Assignee: Apache Spark > Add tests to make sure drop partitions of an external table will

[jira] [Assigned] (SPARK-14499) Add tests to make sure drop partitions of an external table will not delete data

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14499: Assignee: (was: Apache Spark) > Add tests to make sure drop partitions of an external

[jira] [Assigned] (SPARK-14590) Update pull request template with link to jira

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14590: Assignee: (was: Apache Spark) > Update pull request template with link to jira > -

[jira] [Commented] (SPARK-14590) Update pull request template with link to jira

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238532#comment-15238532 ] Apache Spark commented on SPARK-14590: -- User 'lresende' has created a pull request f

[jira] [Assigned] (SPARK-14590) Update pull request template with link to jira

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14590: Assignee: Apache Spark > Update pull request template with link to jira >

[jira] [Created] (SPARK-14590) Update pull request template with link to jira

2016-04-12 Thread Luciano Resende (JIRA)
Luciano Resende created SPARK-14590: --- Summary: Update pull request template with link to jira Key: SPARK-14590 URL: https://issues.apache.org/jira/browse/SPARK-14590 Project: Spark Issue Ty

[jira] [Assigned] (SPARK-14589) Enhance DB2 JDBC Dialect docker tests

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14589: Assignee: (was: Apache Spark) > Enhance DB2 JDBC Dialect docker tests > --

[jira] [Commented] (SPARK-14589) Enhance DB2 JDBC Dialect docker tests

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238521#comment-15238521 ] Apache Spark commented on SPARK-14589: -- User 'lresende' has created a pull request f

[jira] [Assigned] (SPARK-14589) Enhance DB2 JDBC Dialect docker tests

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14589: Assignee: Apache Spark > Enhance DB2 JDBC Dialect docker tests > -

[jira] [Created] (SPARK-14589) Enhance DB2 JDBC Dialect docker tests

2016-04-12 Thread Luciano Resende (JIRA)
Luciano Resende created SPARK-14589: --- Summary: Enhance DB2 JDBC Dialect docker tests Key: SPARK-14589 URL: https://issues.apache.org/jira/browse/SPARK-14589 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-14311) Model persistence in SparkR

2016-04-12 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238500#comment-15238500 ] Yanbo Liang commented on SPARK-14311: - Sure, I can have a try. Another issue is R `Ob

[jira] [Created] (SPARK-14588) Consider getting column stats from files (wherever feasible) to get better stats for joins

2016-04-12 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created SPARK-14588: Summary: Consider getting column stats from files (wherever feasible) to get better stats for joins Key: SPARK-14588 URL: https://issues.apache.org/jira/browse/SPARK-14588

[jira] [Created] (SPARK-14587) abstract class Receiver should be explicit about the return type of its methods

2016-04-12 Thread Jacek Laskowski (JIRA)
Jacek Laskowski created SPARK-14587: --- Summary: abstract class Receiver should be explicit about the return type of its methods Key: SPARK-14587 URL: https://issues.apache.org/jira/browse/SPARK-14587

[jira] [Assigned] (SPARK-14441) Consolidate DDL tests

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14441: Assignee: (was: Apache Spark) > Consolidate DDL tests > - > >

[jira] [Commented] (SPARK-14441) Consolidate DDL tests

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238480#comment-15238480 ] Apache Spark commented on SPARK-14441: -- User 'bomeng' has created a pull request for

[jira] [Assigned] (SPARK-14441) Consolidate DDL tests

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14441: Assignee: Apache Spark > Consolidate DDL tests > - > >

[jira] [Commented] (SPARK-14554) disable whole stage codegen if there are too many input columns

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238469#comment-15238469 ] Apache Spark commented on SPARK-14554: -- User 'cloud-fan' has created a pull request

[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2016-04-12 Thread Yong Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238462#comment-15238462 ] Yong Tang commented on SPARK-14409: --- Thanks [~mlnick] for the review. I was planning to

[jira] [Updated] (SPARK-14586) SparkSQL doesn't parse decimal like Hive

2016-04-12 Thread Stephane Maarek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephane Maarek updated SPARK-14586: Description: create a test_data.csv with the following {code:none} a, 2.0 ,3.0 {code} (the

[jira] [Updated] (SPARK-14586) SparkSQL doesn't parse decimal like Hive

2016-04-12 Thread Stephane Maarek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephane Maarek updated SPARK-14586: Description: create a test_data.csv with the following {code:none} a, 2.0 ,3.0 {code} (the

[jira] [Updated] (SPARK-14586) SparkSQL doesn't parse decimal like Hive

2016-04-12 Thread Stephane Maarek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephane Maarek updated SPARK-14586: Description: create a test_data.csv with the following {code:none} a, 2.0 ,3.0 {code} (the

[jira] [Updated] (SPARK-14447) Speed up TungstenAggregate w/ keys using AggregateHashMap

2016-04-12 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal updated SPARK-14447: --- Summary: Speed up TungstenAggregate w/ keys using AggregateHashMap (was: Integrate Aggregate

[jira] [Commented] (SPARK-14447) Integrate AggregateHashMap in Aggregates with Keys

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238401#comment-15238401 ] Apache Spark commented on SPARK-14447: -- User 'sameeragarwal' has created a pull requ

[jira] [Updated] (SPARK-14583) SparkSQL doesn't read hive table properly after MSCK REPAIR

2016-04-12 Thread Stephane Maarek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephane Maarek updated SPARK-14583: Summary: SparkSQL doesn't read hive table properly after MSCK REPAIR (was: Spark doesn't r

[jira] [Updated] (SPARK-14586) SparkSQL doesn't parse decimal like Hive

2016-04-12 Thread Stephane Maarek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephane Maarek updated SPARK-14586: Description: create a test_data.csv with the following {code:none} a, 2.0 ,3.0 {code} (the

[jira] [Created] (SPARK-14586) SparkSQL doesn't parse decimal like Hive

2016-04-12 Thread Stephane Maarek (JIRA)
Stephane Maarek created SPARK-14586: --- Summary: SparkSQL doesn't parse decimal like Hive Key: SPARK-14586 URL: https://issues.apache.org/jira/browse/SPARK-14586 Project: Spark Issue Type: Bu

[jira] [Updated] (SPARK-14375) Unit test for spark.ml KMeansSummary

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14375: -- Shepherd: Joseph K. Bradley Assignee: Yanbo Liang Target Version

[jira] [Created] (SPARK-14585) Provide accessor methods for Pipeline stages

2016-04-12 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-14585: - Summary: Provide accessor methods for Pipeline stages Key: SPARK-14585 URL: https://issues.apache.org/jira/browse/SPARK-14585 Project: Spark Issue

[jira] [Updated] (SPARK-14583) Spark doesn't read hive table properly after MSCK REPAIR

2016-04-12 Thread Stephane Maarek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephane Maarek updated SPARK-14583: Description: it seems that Spark forgets or fails to read the metadata tblproperties after

[jira] [Updated] (SPARK-14583) Spark doesn't read hive table properly after MSCK REPAIR

2016-04-12 Thread Stephane Maarek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephane Maarek updated SPARK-14583: Description: it seems that Spark forgets or fails to read the metadata tblproperties after

[jira] [Updated] (SPARK-14084) Parallel training jobs in model selection

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14084: -- Target Version/s: 2.1.0 (was: 2.0.0) > Parallel training jobs in model selection > ---

[jira] [Created] (SPARK-14584) Improve recognition of non-nullability in Dataset transformations

2016-04-12 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-14584: -- Summary: Improve recognition of non-nullability in Dataset transformations Key: SPARK-14584 URL: https://issues.apache.org/jira/browse/SPARK-14584 Project: Spark

[jira] [Resolved] (SPARK-13982) SparkR - KMeans predict: Output column name of features is an unclear, automatic genetared text

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-13982. --- Resolution: Fixed Assignee: Yanbo Liang Fix Version/s: 2.0.0 Resolvin

[jira] [Updated] (SPARK-13982) SparkR - KMeans predict: Output column name of features is an unclear, automatic generated text

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13982: -- Summary: SparkR - KMeans predict: Output column name of features is an unclear, automat

[jira] [Commented] (SPARK-14059) Define R wrappers under org.apache.spark.ml.r

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238358#comment-15238358 ] Joseph K. Bradley commented on SPARK-14059: --- This task looks complete. Can I r

[jira] [Commented] (SPARK-14583) Spark doesn't read hive table properly after MSCK REPAIR

2016-04-12 Thread Stephane Maarek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238352#comment-15238352 ] Stephane Maarek commented on SPARK-14583: - pretty much the same behavior if inste

[jira] [Updated] (SPARK-13982) SparkR - KMeans predict: Output column name of features is an unclear, automatic genetared text

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13982: -- Target Version/s: 2.0.0 > SparkR - KMeans predict: Output column name of features is an

[jira] [Commented] (SPARK-14154) Simplify the implementation for Kolmogorov–Smirnov test

2016-04-12 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238346#comment-15238346 ] yuhao yang commented on SPARK-14154: Got your concern. I'll run some benchmarks. > Si

[jira] [Updated] (SPARK-14509) Add python CountVectorizerExample

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14509: -- Shepherd: Joseph K. Bradley Assignee: zhengruifeng Target Versio

[jira] [Commented] (SPARK-14577) spark.sql.codegen.maxCaseBranches config option

2016-04-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238344#comment-15238344 ] Reynold Xin commented on SPARK-14577: - Yea we shouldn't change the architecture. >

[jira] [Created] (SPARK-14583) Spark doesn't read hive table properly after MSCK REPAIR

2016-04-12 Thread Stephane Maarek (JIRA)
Stephane Maarek created SPARK-14583: --- Summary: Spark doesn't read hive table properly after MSCK REPAIR Key: SPARK-14583 URL: https://issues.apache.org/jira/browse/SPARK-14583 Project: Spark

[jira] [Commented] (SPARK-14577) spark.sql.codegen.maxCaseBranches config option

2016-04-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238323#comment-15238323 ] Dongjoon Hyun commented on SPARK-14577: --- In the current Spark architecture, `sql/co

[jira] [Commented] (SPARK-14582) Increase the parallelism for small tables

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238321#comment-15238321 ] Apache Spark commented on SPARK-14582: -- User 'davies' has created a pull request for

[jira] [Assigned] (SPARK-14582) Increase the parallelism for small tables

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14582: Assignee: Apache Spark (was: Davies Liu) > Increase the parallelism for small tables > --

[jira] [Assigned] (SPARK-14582) Increase the parallelism for small tables

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14582: Assignee: Davies Liu (was: Apache Spark) > Increase the parallelism for small tables > --

[jira] [Resolved] (SPARK-14579) Fix a race condition in StreamExecution.processAllAvailable

2016-04-12 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-14579. -- Resolution: Fixed Fix Version/s: 2.0.0 > Fix a race condition in StreamExecution.process

[jira] [Created] (SPARK-14582) Increase the parallelism for small tables

2016-04-12 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14582: -- Summary: Increase the parallelism for small tables Key: SPARK-14582 URL: https://issues.apache.org/jira/browse/SPARK-14582 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-10386) Model import/export for PrefixSpan

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-10386: -- Shepherd: Joseph K. Bradley (was: Xiangrui Meng) > Model import/export for PrefixSpan

[jira] [Resolved] (SPARK-14578) Can't load a json dataset with nested wide schema

2016-04-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14578. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12338 [https://github.

[jira] [Commented] (SPARK-8514) LU factorization on BlockMatrix

2016-04-12 Thread Jerome (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238300#comment-15238300 ] Jerome commented on SPARK-8514: --- Hello Joseph: Is this JIRA still under consideration? Bes

[jira] [Commented] (SPARK-14529) Consolidate mllib and mllib-local into one mllib folder

2016-04-12 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238291#comment-15238291 ] DB Tsai commented on SPARK-14529: - We still can make graphx depend on mllib-local, and I

[jira] [Updated] (SPARK-5992) Locality Sensitive Hashing (LSH) for MLlib

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5992: - Target Version/s: 2.1.0 (was: 2.0.0) > Locality Sensitive Hashing (LSH) for MLlib > -

[jira] [Updated] (SPARK-12942) Provide option to allow control the precision of numerical type for DataFrameWriter

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12942: -- Target Version/s: 2.0.0 Component/s: (was: ML) > Provide option to allow c

[jira] [Updated] (SPARK-12942) Provide option to allow control the precision of numerical type for DataFrameWriter

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12942: -- Target Version/s: (was: 2.0.0) > Provide option to allow control the precision of num

[jira] [Updated] (SPARK-12942) Provide option to allow control the precision of numerical type for DataFrameWriter

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-12942: -- Target Version/s: (was: 2.0.0) > Provide option to allow control the precision of num

[jira] [Updated] (SPARK-9478) Add class weights to Random Forest

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-9478: - Target Version/s: 2.1.0 (was: 2.0.0) > Add class weights to Random Forest > -

[jira] [Updated] (SPARK-8514) LU factorization on BlockMatrix

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-8514: - Target Version/s: (was: 2.0.0) > LU factorization on BlockMatrix > -

[jira] [Updated] (SPARK-10078) Vector-free L-BFGS

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-10078: -- Target Version/s: (was: 2.0.0) > Vector-free L-BFGS > -- > >

[jira] [Comment Edited] (SPARK-13116) TungstenAggregate though it is supposedly capable of all processing unsafe & safe rows, fails if the input is safe rows

2016-04-12 Thread Martin Brandt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238264#comment-15238264 ] Martin Brandt edited comment on SPARK-13116 at 4/12/16 11:46 PM: --

[jira] [Commented] (SPARK-13116) TungstenAggregate though it is supposedly capable of all processing unsafe & safe rows, fails if the input is safe rows

2016-04-12 Thread Martin Brandt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238264#comment-15238264 ] Martin Brandt commented on SPARK-13116: --- I am seeing what looks like the issue desc

[jira] [Commented] (SPARK-14581) Improve filter push down

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238220#comment-15238220 ] Apache Spark commented on SPARK-14581: -- User 'davies' has created a pull request for

[jira] [Assigned] (SPARK-14581) Improve filter push down

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14581: Assignee: Davies Liu (was: Apache Spark) > Improve filter push down > ---

[jira] [Assigned] (SPARK-14581) Improve filter push down

2016-04-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14581: Assignee: Apache Spark (was: Davies Liu) > Improve filter push down > ---

[jira] [Created] (SPARK-14581) Improve filter push down

2016-04-12 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14581: -- Summary: Improve filter push down Key: SPARK-14581 URL: https://issues.apache.org/jira/browse/SPARK-14581 Project: Spark Issue Type: Improvement Compon

[jira] [Resolved] (SPARK-14363) Executor OOM due to a memory leak in Sorter

2016-04-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14363. Resolution: Fixed Fix Version/s: 1.6.2 2.0.0 Issue resolved by pull reque

[jira] [Updated] (SPARK-14497) Use top instead of sortBy() to get top N frequent words as dict in CountVectorizer

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14497: -- Summary: Use top instead of sortBy() to get top N frequent words as dict in CountVector

[jira] [Commented] (SPARK-12414) Remove closure serializer

2016-04-12 Thread Dubkov Mikhail (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238171#comment-15238171 ] Dubkov Mikhail commented on SPARK-12414: [~srowen], [~andrewor14], As I see, you

[jira] [Commented] (SPARK-14154) Simplify the implementation for Kolmogorov–Smirnov test

2016-04-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238172#comment-15238172 ] Xiangrui Meng commented on SPARK-14154: --- Changed the priority to critical since we

[jira] [Updated] (SPARK-14568) Log instrumentation in logistic regression as a first task

2016-04-12 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14568: -- Component/s: ML > Log instrumentation in logistic regression as a first task >

[jira] [Updated] (SPARK-14154) Simplify the implementation for Kolmogorov–Smirnov test

2016-04-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-14154: -- Priority: Critical (was: Minor) > Simplify the implementation for Kolmogorov–Smirnov test > --

[jira] [Commented] (SPARK-11157) Allow Spark to be built without assemblies

2016-04-12 Thread Sebastian Kochman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238151#comment-15238151 ] Sebastian Kochman commented on SPARK-11157: --- After this change, when I try to s
