[jira] [Comment Edited] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Saurabh Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094182#comment-16094182 ] Saurabh Agrawal edited comment on SPARK-21476 at 7/20/17 5:21 AM: -- I'm

[jira] [Comment Edited] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Saurabh Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094182#comment-16094182 ] Saurabh Agrawal edited comment on SPARK-21476 at 7/20/17 5:16 AM: -- I'm

[jira] [Comment Edited] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Saurabh Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094182#comment-16094182 ] Saurabh Agrawal edited comment on SPARK-21476 at 7/20/17 5:15 AM: -- I'm

[jira] [Commented] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Saurabh Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094182#comment-16094182 ] Saurabh Agrawal commented on SPARK-21476: - I'm saying that the trees in the model get serialized

[jira] [Resolved] (SPARK-16542) bugs about types that result an array of null when creating dataframe using python

2017-07-19 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-16542. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18444

[jira] [Assigned] (SPARK-16542) bugs about types that result an array of null when creating dataframe using python

2017-07-19 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin reassigned SPARK-16542: - Assignee: Xiang Gao > bugs about types that result an array of null when creating

[jira] [Updated] (SPARK-21034) Filter not getting pushed down the groupBy clause when first() or last() aggregate function is used

2017-07-19 Thread Abhijit Bhole (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhijit Bhole updated SPARK-21034: -- Description: Here is a sample code - {code:java} from pyspark.sql import functions as F df

[jira] [Updated] (SPARK-21479) Outer join filter pushdown in null supplying table when condition is on one of the joined columns

2017-07-19 Thread Abhijit Bhole (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhijit Bhole updated SPARK-21479: -- Description: Here are two different query plans - {code:java} df1 =

[jira] [Created] (SPARK-21479) Outer join filter pushdown in null supplying table when condition is on one of the joined columns

2017-07-19 Thread Abhijit Bhole (JIRA)
Abhijit Bhole created SPARK-21479: - Summary: Outer join filter pushdown in null supplying table when condition is on one of the joined columns Key: SPARK-21479 URL:

[jira] [Commented] (SPARK-2465) Use long as user / item ID for ALS

2017-07-19 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094106#comment-16094106 ] Peng Meng commented on SPARK-2465: -- I think it is time to revisit this now. Some of our customers, such

[jira] [Commented] (SPARK-9860) Join: Determine the join strategy (broadcast join or shuffle join) at runtime

2017-07-19 Thread Carson Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094076#comment-16094076 ] Carson Wang commented on SPARK-9860: We are working on this and also improving the current

[jira] [Resolved] (SPARK-21437) Java Keyword cannot be used in table schema

2017-07-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21437. -- Resolution: Not A Bug > Java Keyword cannot be used in table schema >

[jira] [Created] (SPARK-21478) Unpersist a DF also unpersist related DFs

2017-07-19 Thread Roberto Mirizzi (JIRA)
Roberto Mirizzi created SPARK-21478: --- Summary: Unpersist a DF also unpersist related DFs Key: SPARK-21478 URL: https://issues.apache.org/jira/browse/SPARK-21478 Project: Spark Issue Type:

[jira] [Updated] (SPARK-21478) Unpersist a DF also unpersists related DFs

2017-07-19 Thread Roberto Mirizzi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roberto Mirizzi updated SPARK-21478: Summary: Unpersist a DF also unpersists related DFs (was: Unpersist a DF also unpersist

[jira] [Resolved] (SPARK-21333) joinWith documents and analysis allow invalid join types

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-21333. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 > joinWith documents and analysis

[jira] [Assigned] (SPARK-21333) joinWith documents and analysis allow invalid join types

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-21333: --- Assignee: Corey Woodfield > joinWith documents and analysis allow invalid join types >

[jira] [Commented] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093892#comment-16093892 ] Sean Owen commented on SPARK-21476: --- I'm not sure what you're suggesting, that something should or

[jira] [Commented] (SPARK-19842) Informational Referential Integrity Constraints Support in Spark

2017-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093887#comment-16093887 ] Reynold Xin commented on SPARK-19842: - Are you guys doing any work here? > Informational Referential

[jira] [Resolved] (SPARK-21456) Make the driver failover_timeout configurable (Mesos cluster mode)

2017-07-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-21456. Resolution: Fixed Assignee: Susan X. Huynh Fix Version/s: 2.3.0 > Make the

[jira] [Updated] (SPARK-21089) Table properties are not shown in DESC EXTENDED/FORMATTED

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21089: Priority: Blocker (was: Critical) > Table properties are not shown in DESC EXTENDED/FORMATTED >

[jira] [Updated] (SPARK-21203) Wrong results of insertion of Array of Struct

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21203: Priority: Blocker (was: Critical) > Wrong results of insertion of Array of Struct >

[jira] [Updated] (SPARK-21258) Window result incorrect using complex object with spilling

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21258: Priority: Blocker (was: Major) > Window result incorrect using complex object with spilling >

[jira] [Resolved] (SPARK-21446) [SQL] JDBC Postgres fetchsize parameter ignored again

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-21446. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 2.1.2 > [SQL]

[jira] [Assigned] (SPARK-21446) [SQL] JDBC Postgres fetchsize parameter ignored again

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-21446: --- Assignee: Albert Jang > [SQL] JDBC Postgres fetchsize parameter ignored again >

[jira] [Assigned] (SPARK-21477) Mark LocalTableScanExec's input data transient

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21477: Assignee: Apache Spark > Mark LocalTableScanExec's input data transient >

[jira] [Commented] (SPARK-21477) Mark LocalTableScanExec's input data transient

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093835#comment-16093835 ] Apache Spark commented on SPARK-21477: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21477) Mark LocalTableScanExec's input data transient

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21477: Assignee: (was: Apache Spark) > Mark LocalTableScanExec's input data transient >

[jira] [Created] (SPARK-21477) Mark LocalTableScanExec's input data transient

2017-07-19 Thread Xiao Li (JIRA)
Xiao Li created SPARK-21477: --- Summary: Mark LocalTableScanExec's input data transient Key: SPARK-21477 URL: https://issues.apache.org/jira/browse/SPARK-21477 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-19743) Exception when creating more than one implicit Encoder in REPL

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Bryński updated SPARK-19743: --- Affects Version/s: 2.1.1 2.2.0 > Exception when creating more than

[jira] [Comment Edited] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093750#comment-16093750 ] Maciej Bryński edited comment on SPARK-21470 at 7/19/17 9:06 PM: - OK. I

[jira] [Resolved] (SPARK-21243) Limit the number of maps in a single shuffle fetch

2017-07-19 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-21243. --- Resolution: Fixed Fix Version/s: 2.3.0 > Limit the number of maps in a single shuffle

[jira] [Assigned] (SPARK-21243) Limit the number of maps in a single shuffle fetch

2017-07-19 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves reassigned SPARK-21243: - Assignee: Dhruve Ashar > Limit the number of maps in a single shuffle fetch >

[jira] [Assigned] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21439: Assignee: (was: Apache Spark) > Cannot use Spark with Python ABCmeta (exception from

[jira] [Commented] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093750#comment-16093750 ] Maciej Bryński commented on SPARK-21470: OK. I think I found the reason. There were no

[jira] [Closed] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Bryński closed SPARK-21470. -- Resolution: Invalid > [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA >

[jira] [Assigned] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21439: Assignee: Apache Spark > Cannot use Spark with Python ABCmeta (exception from

[jira] [Commented] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093746#comment-16093746 ] Apache Spark commented on SPARK-21439: -- User 'maver1ck' has created a pull request for this issue:

[jira] [Commented] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093726#comment-16093726 ] Marcelo Vanzin commented on SPARK-21470: That still looks like some issue in your configuration.

[jira] [Comment Edited] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093716#comment-16093716 ] Maciej Bryński edited comment on SPARK-21470 at 7/19/17 8:05 PM: -

[jira] [Commented] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093716#comment-16093716 ] Maciej Bryński commented on SPARK-21470: [~vanzin] I tried. {code} /etc/hadoop/conf$ grep -A1

[jira] [Created] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Saurabh Agrawal (JIRA)
Saurabh Agrawal created SPARK-21476: --- Summary: RandomForest classification model not using broadcast in transform Key: SPARK-21476 URL: https://issues.apache.org/jira/browse/SPARK-21476 Project:

[jira] [Comment Edited] (SPARK-17333) Make pyspark interface friendly with static analysis

2017-07-19 Thread Assaf Mendelson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091210#comment-16091210 ] Assaf Mendelson edited comment on SPARK-17333 at 7/19/17 7:18 PM: --

[jira] [Commented] (SPARK-21475) Change the usage of FileInputStream/OutputStream to Files.newInput/OutputStream in the critical path

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093589#comment-16093589 ] Apache Spark commented on SPARK-21475: -- User 'jerryshao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21475) Change the usage of FileInputStream/OutputStream to Files.newInput/OutputStream in the critical path

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21475: Assignee: Apache Spark > Change the usage of FileInputStream/OutputStream to >

[jira] [Assigned] (SPARK-21475) Change the usage of FileInputStream/OutputStream to Files.newInput/OutputStream in the critical path

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21475: Assignee: (was: Apache Spark) > Change the usage of FileInputStream/OutputStream to

[jira] [Commented] (SPARK-21474) Make number of parallel fetches from a reducer configurable

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093564#comment-16093564 ] Apache Spark commented on SPARK-21474: -- User 'raajay' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21474) Make number of parallel fetches from a reducer configurable

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21474: Assignee: Apache Spark > Make number of parallel fetches from a reducer configurable >

[jira] [Assigned] (SPARK-21474) Make number of parallel fetches from a reducer configurable

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21474: Assignee: (was: Apache Spark) > Make number of parallel fetches from a reducer

[jira] [Created] (SPARK-21475) Change the usage of FileInputStream/OutputStream to Files.newInput/OutputStream in the critical path

2017-07-19 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-21475: --- Summary: Change the usage of FileInputStream/OutputStream to Files.newInput/OutputStream in the critical path Key: SPARK-21475 URL:

[jira] [Resolved] (SPARK-21455) RpcFailure should be call on RpcResponseCallback.onFailure

2017-07-19 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21455. -- Resolution: Won't Fix > RpcFailure should be call on RpcResponseCallback.onFailure >

[jira] [Resolved] (SPARK-21464) Minimize deprecation warnings caused by ProcessingTime class

2017-07-19 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-21464. --- Resolution: Fixed Fix Version/s: 3.0.0 2.2.1 Issue resolved by

[jira] [Created] (SPARK-21474) Make number of parallel fetches from a reducer configurable

2017-07-19 Thread Raajay Viswanathan (JIRA)
Raajay Viswanathan created SPARK-21474: -- Summary: Make number of parallel fetches from a reducer configurable Key: SPARK-21474 URL: https://issues.apache.org/jira/browse/SPARK-21474 Project:

[jira] [Commented] (SPARK-18226) SparkR displaying vector columns in incorrect way

2017-07-19 Thread Kirti (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093483#comment-16093483 ] Kirti commented on SPARK-18226: --- Hi Felix, After collecting output of predict in R, probability column

[jira] [Commented] (SPARK-18226) SparkR displaying vector columns in incorrect way

2017-07-19 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093466#comment-16093466 ] Felix Cheung commented on SPARK-18226: -- if you collect on what's returned by predict(), you should

[jira] [Comment Edited] (SPARK-21374) Reading globbed paths from S3 into DF doesn't work if filesystem caching is disabled

2017-07-19 Thread Andrey Taptunov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093417#comment-16093417 ] Andrey Taptunov edited comment on SPARK-21374 at 7/19/17 5:19 PM: --

[jira] [Commented] (SPARK-21374) Reading globbed paths from S3 into DF doesn't work if filesystem caching is disabled

2017-07-19 Thread Andrey Taptunov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093417#comment-16093417 ] Andrey Taptunov commented on SPARK-21374: - [~ste...@apache.org] Indeed, while working on PR and

[jira] [Commented] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093408#comment-16093408 ] Marcelo Vanzin commented on SPARK-21470: The HDFS library generally throws that error if your

[jira] [Updated] (SPARK-20725) partial aggregate should behave correctly for sameResult

2017-07-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-20725: Target Version/s: (was: 2.1.2) > partial aggregate should behave correctly for sameResult >

[jira] [Resolved] (SPARK-21473) Running Transform on a bean which has only setters gives NullPointerExcpetion

2017-07-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21473. --- Resolution: Duplicate Given the error, looks like the change for SPARK-19666 already addressed why

[jira] [Updated] (SPARK-21473) Running Transform on a bean which has only setters gives NullPointerExcpetion

2017-07-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21473: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) > Running Transform on a bean

[jira] [Updated] (SPARK-21473) Running Transform on a bean which has only setters gives NullPointerExcpetion

2017-07-19 Thread Aseem Bansal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aseem Bansal updated SPARK-21473: - Description: If I run the following using the Java API {code:java}

[jira] [Assigned] (SPARK-21441) Incorrect Codegen in SortMergeJoinExec results failures in some cases

2017-07-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-21441: --- Assignee: Feng Zhu > Incorrect Codegen in SortMergeJoinExec results failures in some cases

[jira] [Created] (SPARK-21473) Running Transform on a bean which has only setters gives NullPointerExcpetion

2017-07-19 Thread Aseem Bansal (JIRA)
Aseem Bansal created SPARK-21473: Summary: Running Transform on a bean which has only setters gives NullPointerExcpetion Key: SPARK-21473 URL: https://issues.apache.org/jira/browse/SPARK-21473

[jira] [Resolved] (SPARK-21441) Incorrect Codegen in SortMergeJoinExec results failures in some cases

2017-07-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21441. - Resolution: Fixed Fix Version/s: 2.1.2 2.3.0 2.2.1

[jira] [Commented] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093105#comment-16093105 ] Maciej Bryński commented on SPARK-5159: --- Still existed in Spark 2.2.0. Probably duplicate of

[jira] [Commented] (SPARK-21177) df.saveAsTable slows down linearly, with number of appends

2017-07-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093099#comment-16093099 ] Hyukjin Kwon commented on SPARK-21177: -- Could you provide some steps to reproduce? I want to follow

[jira] [Resolved] (SPARK-10216) Avoid creating empty files during overwrite into Hive table with group by query

2017-07-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-10216. -- Resolution: Fixed This is fixed in https://github.com/apache/spark/pull/18654 > Avoid

[jira] [Resolved] (SPARK-21105) Useless empty files in hive table

2017-07-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21105. -- Resolution: Duplicate This is a duplicate of SPARK-10216. > Useless empty files in hive table

[jira] [Assigned] (SPARK-21414) Buffer in SlidingWindowFunctionFrame could be big though window is small

2017-07-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-21414: --- Assignee: jin xing > Buffer in SlidingWindowFunctionFrame could be big though window is

[jira] [Resolved] (SPARK-21414) Buffer in SlidingWindowFunctionFrame could be big though window is small

2017-07-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21414. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 Issue resolved by pull

[jira] [Commented] (SPARK-21177) df.saveAsTable slows down linearly, with number of appends

2017-07-19 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093080#comment-16093080 ] Prashant Sharma commented on SPARK-21177: - I can reproduce it on another system with latest

[jira] [Commented] (SPARK-11248) Spark hivethriftserver is using the wrong user to while getting HDFS permissions

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093055#comment-16093055 ] Maciej Bryński commented on SPARK-11248: I have similar issue in Spark 2.2.0 > Spark

[jira] [Updated] (SPARK-11248) Spark hivethriftserver is using the wrong user to while getting HDFS permissions

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Bryński updated SPARK-11248: --- Affects Version/s: 2.2.0 > Spark hivethriftserver is using the wrong user to while getting

[jira] [Updated] (SPARK-11248) Spark hivethriftserver is using the wrong user to while getting HDFS permissions

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Bryński updated SPARK-11248: --- Affects Version/s: 2.1.1 > Spark hivethriftserver is using the wrong user to while getting

[jira] [Updated] (SPARK-21448) Hi dear guys, I have a question about aggregateByKey of pairrrd.

2017-07-19 Thread qihuagao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qihuagao updated SPARK-21448: - Description: java pair rdd has aggregateByKey, which can avoid full shuffle, so have impressive

[jira] [Updated] (SPARK-21459) Some aggregation functions change the case of nested field names

2017-07-19 Thread David Allsopp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Allsopp updated SPARK-21459: -- Description: When working with DataFrames with nested schemas, the behavior of the

[jira] [Updated] (SPARK-21459) Some aggregation functions change the case of nested field names

2017-07-19 Thread David Allsopp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Allsopp updated SPARK-21459: -- Description: When working with DataFrames with nested schemas, the behavior of the

[jira] [Assigned] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21472: Assignee: Apache Spark > Introduce ArrowColumnVector as a reader for Arrow vectors. >

[jira] [Assigned] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21472: Assignee: (was: Apache Spark) > Introduce ArrowColumnVector as a reader for Arrow

[jira] [Commented] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092921#comment-16092921 ] Apache Spark commented on SPARK-21472: -- User 'ueshin' has created a pull request for this issue:

[jira] [Updated] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-21472: -- Description: Introducing {{ArrowColumnVector}} as a reader for Arrow vectors. It extends

[jira] [Commented] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092919#comment-16092919 ] Hyukjin Kwon commented on SPARK-21439: -- We have a cloudpickle copy in Spark. Would you test that and

[jira] [Updated] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-21472: -- Description: Introducing {{ArrowColumnVector}} as a reader for Arrow vectors. It extends

[jira] [Updated] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-21472: -- Description: Introducing {{ArrowColumnVector}} as a reader for Arrow vectors. This extends

[jira] [Created] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-21472: - Summary: Introduce ArrowColumnVector as a reader for Arrow vectors. Key: SPARK-21472 URL: https://issues.apache.org/jira/browse/SPARK-21472 Project: Spark

[jira] [Commented] (SPARK-20783) Enhance ColumnVector to support compressed representation

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092892#comment-16092892 ] Apache Spark commented on SPARK-20783: -- User 'kiszk' has created a pull request for this issue:

[jira] [Resolved] (SPARK-21471) Read binary file error in Spark Streaming

2017-07-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21471. --- Resolution: Invalid This is better on StackOverflow. JIRA is for reporting researched bugs in Spark

[jira] [Comment Edited] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092888#comment-16092888 ] Maciej Bryński edited comment on SPARK-21439 at 7/19/17 10:15 AM: -- I

[jira] [Commented] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092888#comment-16092888 ] Maciej Bryński commented on SPARK-21439: https://github.com/cloudpipe/cloudpickle/pull/104 >

[jira] [Updated] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Bryński updated SPARK-21470: --- Summary: [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA (was: Spark

[jira] [Updated] (SPARK-21471) Read binary file error in Spark Streaming

2017-07-19 Thread Lê Văn Thanh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lê Văn Thanh updated SPARK-21471: - Description: My client uses GZIPOutputStream to compress the data and pushes it to my server.

[jira] [Created] (SPARK-21471) Read binary file error in Spark Streaming

2017-07-19 Thread Lê Văn Thanh (JIRA)
Lê Văn Thanh created SPARK-21471: Summary: Read binary file error in Spark Streaming Key: SPARK-21471 URL: https://issues.apache.org/jira/browse/SPARK-21471 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-21470) Spark History server doesn't support HDFS HA

2017-07-19 Thread Maciej Bryński (JIRA)
Maciej Bryński created SPARK-21470: -- Summary: Spark History server doesn't support HDFS HA Key: SPARK-21470 URL: https://issues.apache.org/jira/browse/SPARK-21470 Project: Spark Issue Type:

[jira] [Created] (SPARK-21469) Add doc and example for FeatureHasher

2017-07-19 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-21469: -- Summary: Add doc and example for FeatureHasher Key: SPARK-21469 URL: https://issues.apache.org/jira/browse/SPARK-21469 Project: Spark Issue Type:

[jira] [Created] (SPARK-21468) FeatureHasher Python API

2017-07-19 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-21468: -- Summary: FeatureHasher Python API Key: SPARK-21468 URL: https://issues.apache.org/jira/browse/SPARK-21468 Project: Spark Issue Type: New Feature

[jira] [Resolved] (SPARK-21467) Spark SQL does not support the Hive property of "hive.vectorized.execution.enabled"

2017-07-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21467. --- Resolution: Invalid You're referring to a Hive feature -- this can't possibly be considered a Spark

[jira] [Created] (SPARK-21467) Spark SQL does not support the Hive property of "hive.vectorized.execution.enabled"

2017-07-19 Thread Gu Chao (JIRA)
Gu Chao created SPARK-21467: --- Summary: Spark SQL does not support the Hive property of "hive.vectorized.execution.enabled" Key: SPARK-21467 URL: https://issues.apache.org/jira/browse/SPARK-21467 Project:

[jira] [Updated] (SPARK-21440) Refactor ArrowConverters and add DecimalType, ArrayType and StructType support.

2017-07-19 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-21440: -- Description: This is a refactoring of {{ArrowConverters}} and related classes. # Refactor

[jira] [Resolved] (SPARK-21316) Dataset Union output is not consistent with the column sequence

2017-07-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21316. -- Resolution: Not A Problem > Dataset Union output is not consistent with the column sequence >

[jira] [Resolved] (SPARK-21433) Spark SQL should support higher version of Hive metastore

2017-07-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21433. --- Resolution: Invalid > Spark SQL should support higher version of Hive metastore >
