[jira] [Assigned] (SPARK-20192) SparkR 2.2.0 migration guide, release note

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20192: Assignee: Apache Spark (was: Felix Cheung) > SparkR 2.2.0 migration guide, release note

[jira] [Assigned] (SPARK-20192) SparkR 2.2.0 migration guide, release note

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20192: Assignee: Felix Cheung (was: Apache Spark) > SparkR 2.2.0 migration guide, release note

[jira] [Commented] (SPARK-20192) SparkR 2.2.0 migration guide, release note

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990594#comment-15990594 ] Apache Spark commented on SPARK-20192: -- User 'felixcheung' has created a pull request for this

[jira] [Resolved] (SPARK-20490) Add eqNullSafe, not and ! to SparkR

2017-04-30 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-20490. -- Resolution: Fixed Assignee: Maciej Szymkiewicz Fix Version/s: 2.3.0

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990548#comment-15990548 ] Liang-Chi Hsieh commented on SPARK-20392: - [~barrybecker4] I created SPARK-20542 to track the

[jira] [Created] (SPARK-20542) Add an API into Bucketizer that can bin a lot of columns all at once

2017-04-30 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-20542: --- Summary: Add an API into Bucketizer that can bin a lot of columns all at once Key: SPARK-20542 URL: https://issues.apache.org/jira/browse/SPARK-20542 Project:

[jira] [Resolved] (SPARK-20442) Fill up documentations for functions in Column API in PySpark

2017-04-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-20442. - Resolution: Fixed Fix Version/s: 2.3.0 > Fill up documentations for functions in Column API in

[jira] [Assigned] (SPARK-20442) Fill up documentations for functions in Column API in PySpark

2017-04-30 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-20442: --- Assignee: Hyukjin Kwon > Fill up documentations for functions in Column API in PySpark >

[jira] [Assigned] (SPARK-20541) SparkR SS should support awaitTermination without timeout

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20541: Assignee: Apache Spark > SparkR SS should support awaitTermination without timeout >

[jira] [Assigned] (SPARK-20541) SparkR SS should support awaitTermination without timeout

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20541: Assignee: (was: Apache Spark) > SparkR SS should support awaitTermination without

[jira] [Commented] (SPARK-20541) SparkR SS should support awaitTermination without timeout

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990524#comment-15990524 ] Apache Spark commented on SPARK-20541: -- User 'felixcheung' has created a pull request for this

[jira] [Created] (SPARK-20541) SparkR SS should support awaitTermination without timeout

2017-04-30 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-20541: Summary: SparkR SS should support awaitTermination without timeout Key: SPARK-20541 URL: https://issues.apache.org/jira/browse/SPARK-20541 Project: Spark

[jira] [Assigned] (SPARK-20015) Document R Structured Streaming (experimental) in R vignettes and R & SS programming guide, R example

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20015: Assignee: Apache Spark (was: Felix Cheung) > Document R Structured Streaming

[jira] [Assigned] (SPARK-20015) Document R Structured Streaming (experimental) in R vignettes and R & SS programming guide, R example

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20015: Assignee: Felix Cheung (was: Apache Spark) > Document R Structured Streaming

[jira] [Commented] (SPARK-20015) Document R Structured Streaming (experimental) in R vignettes and R & SS programming guide, R example

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990506#comment-15990506 ] Apache Spark commented on SPARK-20015: -- User 'felixcheung' has created a pull request for this

[jira] [Commented] (SPARK-20525) ClassCast exception when interpreting UDFs from a String in spark-shell

2017-04-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990505#comment-15990505 ] Sean Owen commented on SPARK-20525: --- This is probably a classloader issue in the end too, which is I

[jira] [Commented] (SPARK-20525) ClassCast exception when interpreting UDFs from a String in spark-shell

2017-04-30 Thread Dave Knoester (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990489#comment-15990489 ] Dave Knoester commented on SPARK-20525: --- This is the binary distribution (official version), I

[jira] [Assigned] (SPARK-20540) Dynamic allocation constantly requests and kills executors

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20540: Assignee: (was: Apache Spark) > Dynamic allocation constantly requests and kills

[jira] [Assigned] (SPARK-20540) Dynamic allocation constantly requests and kills executors

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20540: Assignee: Apache Spark > Dynamic allocation constantly requests and kills executors >

[jira] [Assigned] (SPARK-20540) Dynamic allocation constantly requests and kills executors

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20540: Assignee: (was: Apache Spark) > Dynamic allocation constantly requests and kills

[jira] [Commented] (SPARK-20540) Dynamic allocation constantly requests and kills executors

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990487#comment-15990487 ] Apache Spark commented on SPARK-20540: -- User 'rdblue' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20540) Dynamic allocation constantly requests and kills executors

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20540: Assignee: Apache Spark > Dynamic allocation constantly requests and kills executors >

[jira] [Created] (SPARK-20540) Dynamic allocation constantly requests and kills executors

2017-04-30 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-20540: - Summary: Dynamic allocation constantly requests and kills executors Key: SPARK-20540 URL: https://issues.apache.org/jira/browse/SPARK-20540 Project: Spark Issue

[jira] [Commented] (SPARK-20525) ClassCast exception when interpreting UDFs from a String in spark-shell

2017-04-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990380#comment-15990380 ] Sean Owen commented on SPARK-20525: --- How did you build, how did you run? This isn't a standard

[jira] [Commented] (SPARK-20525) ClassCast exception when interpreting UDFs from a String in spark-shell

2017-04-30 Thread Dave Knoester (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990378#comment-15990378 ] Dave Knoester commented on SPARK-20525: --- I reopened it because the error is reproducible in

[jira] [Updated] (SPARK-20463) Add support for IS [NOT] DISTINCT FROM to SPARK SQL

2017-04-30 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Styles updated SPARK-20463: --- Description: Add support for the SQL standard distinct predicate to SPARK SQL. {noformat}

[jira] [Updated] (SPARK-20463) Add support for IS [NOT] DISTINCT FROM to SPARK SQL

2017-04-30 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Styles updated SPARK-20463: --- Component/s: (was: PySpark) SQL Summary: Add support for IS

[jira] [Updated] (SPARK-20525) ClassCast exception when interpreting UDFs from a String in spark-shell

2017-04-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-20525: -- Priority: Major (was: Critical) [~dknoester]: let committers decide priority. I see no evidence that

[jira] [Updated] (SPARK-20525) ClassCast exception when interpreting UDFs from a String in spark-shell

2017-04-30 Thread Dave Knoester (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Knoester updated SPARK-20525: -- Priority: Critical (was: Major) Thank you for pointing out the priority guidelines. I'm

[jira] [Reopened] (SPARK-20525) ClassCast exception when interpreting UDFs from a String in spark-shell

2017-04-30 Thread Dave Knoester (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Knoester reopened SPARK-20525: --- This issue is different: It is reproducible in spark-shell, which should not be susceptible to

[jira] [Resolved] (SPARK-20535) R wrappers for explode_outer and posexplode_outer

2017-04-30 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-20535. -- Resolution: Fixed Assignee: Maciej Szymkiewicz Fix Version/s: 2.3.0

[jira] [Resolved] (SPARK-20539) support optional dataframe name

2017-04-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20539. --- Resolution: Duplicate > support optional dataframe name > --- > >

[jira] [Created] (SPARK-20539) support optional dataframe name

2017-04-30 Thread PJ Fanning (JIRA)
PJ Fanning created SPARK-20539: -- Summary: support optional dataframe name Key: SPARK-20539 URL: https://issues.apache.org/jira/browse/SPARK-20539 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-20538) Dataset.reduce operator should use withNewExecutionId (as foreach or foreachPartition)

2017-04-30 Thread Jacek Laskowski (JIRA)
Jacek Laskowski created SPARK-20538: --- Summary: Dataset.reduce operator should use withNewExecutionId (as foreach or foreachPartition) Key: SPARK-20538 URL: https://issues.apache.org/jira/browse/SPARK-20538

[jira] [Resolved] (SPARK-20492) Do not print empty parentheses for invalid primitive types in parser

2017-04-30 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-20492. --- Resolution: Fixed Assignee: Hyukjin Kwon Fix Version/s: 2.2.0 > Do

[jira] [Comment Edited] (SPARK-16957) Use weighted midpoints for split values.

2017-04-30 Thread 颜发才
[ https://issues.apache.org/jira/browse/SPARK-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990195#comment-15990195 ] Yan Facai (颜发才) edited comment on SPARK-16957 at 4/30/17 11:28 AM: --- To

[jira] [Commented] (SPARK-16957) Use weighted midpoints for split values.

2017-04-30 Thread 颜发才
[ https://issues.apache.org/jira/browse/SPARK-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990195#comment-15990195 ] Yan Facai (颜发才) commented on SPARK-16957: - To match the other libraries, we use mean value for

[jira] [Assigned] (SPARK-20537) OffHeapColumnVector reallocation may not copy existing data

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20537: Assignee: (was: Apache Spark) > OffHeapColumnVector reallocation may not copy

[jira] [Assigned] (SPARK-20537) OffHeapColumnVector reallocation may not copy existing data

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20537: Assignee: Apache Spark > OffHeapColumnVector reallocation may not copy existing data >

[jira] [Commented] (SPARK-20537) OffHeapColumnVector reallocation may not copy existing data

2017-04-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990191#comment-15990191 ] Apache Spark commented on SPARK-20537: -- User 'kiszk' has created a pull request for this issue:

[jira] [Created] (SPARK-20537) OffHeapColumnVector reallocation may not copy existing data

2017-04-30 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20537: Summary: OffHeapColumnVector reallocation may not copy existing data Key: SPARK-20537 URL: https://issues.apache.org/jira/browse/SPARK-20537 Project: Spark

[jira] [Created] (SPARK-20536) Extend ColumnName to create StructFields with explicit nullable

2017-04-30 Thread Jacek Laskowski (JIRA)
Jacek Laskowski created SPARK-20536: --- Summary: Extend ColumnName to create StructFields with explicit nullable Key: SPARK-20536 URL: https://issues.apache.org/jira/browse/SPARK-20536 Project: Spark

[jira] [Resolved] (SPARK-20525) ClassCast exception when interpreting UDFs from a String in spark-shell

2017-04-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20525. --- Resolution: Duplicate Yes, it is usually a library version mismatch. It's a duplicate of many JIRAs

[jira] [Updated] (SPARK-20525) ClassCast exception when interpreting UDFs from a String in spark-shell

2017-04-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-20525: -- Priority: Major (was: Blocker) Please read http://spark.apache.org/contributing.html -- don't set

[jira] [Assigned] (SPARK-20521) The default of 'spark.worker.cleanup.appDataTtl' should be 604800 in spark-standalone.md.

2017-04-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-20521: - Assignee: guoxiaolongzte > The default of 'spark.worker.cleanup.appDataTtl' should be 604800

[jira] [Assigned] (SPARK-20300) Python API for ALSModel.recommendForAllUsers,Items

2017-04-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-20300: -- Assignee: Nick Pentreath > Python API for ALSModel.recommendForAllUsers,Items >

[jira] [Resolved] (SPARK-20521) The default of 'spark.worker.cleanup.appDataTtl' should be 604800 in spark-standalone.md.

2017-04-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20521. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17798