[jira] [Created] (SPARK-21483) Make org.apache.spark.ml.linalg.Vector bean-compliant so it can be used in Encoders.bean(Vector.class)

2017-07-19 Thread Aseem Bansal (JIRA)
Aseem Bansal created SPARK-21483: Summary: Make org.apache.spark.ml.linalg.Vector bean-compliant so it can be used in Encoders.bean(Vector.class) Key: SPARK-21483 URL: https://issues.apache.org/jira/browse/SPARK-2

[jira] [Created] (SPARK-21482) Make LabeledPoint bean-compliant so it can be used in Encoders.bean(LabeledPoint.class)

2017-07-19 Thread Aseem Bansal (JIRA)
Aseem Bansal created SPARK-21482: Summary: Make LabeledPoint bean-compliant so it can be used in Encoders.bean(LabeledPoint.class) Key: SPARK-21482 URL: https://issues.apache.org/jira/browse/SPARK-21482

[jira] [Commented] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094266#comment-16094266 ] Peng Meng commented on SPARK-21476: --- Seems transform should use transformImpl but not u

[jira] [Created] (SPARK-21481) Add indexOf method in ml.feature.HashingTF similar to mllib.feature.HashingTF

2017-07-19 Thread Aseem Bansal (JIRA)
Aseem Bansal created SPARK-21481: Summary: Add indexOf method in ml.feature.HashingTF similar to mllib.feature.HashingTF Key: SPARK-21481 URL: https://issues.apache.org/jira/browse/SPARK-21481 Project

[jira] [Created] (SPARK-21480) Memory leak in org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeNoResult

2017-07-19 Thread Jack Hu (JIRA)
Jack Hu created SPARK-21480: --- Summary: Memory leak in org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeNoResult Key: SPARK-21480 URL: https://issues.apache.org/jira/browse/SPARK-21480 Project: Spa

[jira] [Comment Edited] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Saurabh Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094182#comment-16094182 ] Saurabh Agrawal edited comment on SPARK-21476 at 7/20/17 5:21 AM: -

[jira] [Comment Edited] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Saurabh Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094182#comment-16094182 ] Saurabh Agrawal edited comment on SPARK-21476 at 7/20/17 5:16 AM: -

[jira] [Comment Edited] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Saurabh Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094182#comment-16094182 ] Saurabh Agrawal edited comment on SPARK-21476 at 7/20/17 5:15 AM: -

[jira] [Commented] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Saurabh Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094182#comment-16094182 ] Saurabh Agrawal commented on SPARK-21476: - I'm saying that the trees in the model

[jira] [Resolved] (SPARK-16542) bugs about types that result an array of null when creating dataframe using python

2017-07-19 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-16542. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18444 [https://g

[jira] [Assigned] (SPARK-16542) bugs about types that result an array of null when creating dataframe using python

2017-07-19 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin reassigned SPARK-16542: - Assignee: Xiang Gao > bugs about types that result an array of null when creating datafr

[jira] [Updated] (SPARK-21034) Filter not getting pushed down the groupBy clause when first() or last() aggregate function is used

2017-07-19 Thread Abhijit Bhole (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhijit Bhole updated SPARK-21034: -- Description: Here is a sample code - {code:java} from pyspark.sql import functions as F df

[jira] [Updated] (SPARK-21479) Outer join filter pushdown in null supplying table when condition is on one of the joined columns

2017-07-19 Thread Abhijit Bhole (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhijit Bhole updated SPARK-21479: -- Description: Here are two different query plans - {code:java} df1 = spark.createDataFrame([

[jira] [Created] (SPARK-21479) Outer join filter pushdown in null supplying table when condition is on one of the joined columns

2017-07-19 Thread Abhijit Bhole (JIRA)
Abhijit Bhole created SPARK-21479: - Summary: Outer join filter pushdown in null supplying table when condition is on one of the joined columns Key: SPARK-21479 URL: https://issues.apache.org/jira/browse/SPARK-2147

[jira] [Commented] (SPARK-2465) Use long as user / item ID for ALS

2017-07-19 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094106#comment-16094106 ] Peng Meng commented on SPARK-2465: -- I think it is time to revisit this now. Some of our

[jira] [Commented] (SPARK-9860) Join: Determine the join strategy (broadcast join or shuffle join) at runtime

2017-07-19 Thread Carson Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094076#comment-16094076 ] Carson Wang commented on SPARK-9860: We are working on this and also improving the cur

[jira] [Resolved] (SPARK-21437) Java Keyword cannot be used in table schema

2017-07-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21437. -- Resolution: Not A Bug > Java Keyword cannot be used in table schema > -

[jira] [Created] (SPARK-21478) Unpersist a DF also unpersist related DFs

2017-07-19 Thread Roberto Mirizzi (JIRA)
Roberto Mirizzi created SPARK-21478: --- Summary: Unpersist a DF also unpersist related DFs Key: SPARK-21478 URL: https://issues.apache.org/jira/browse/SPARK-21478 Project: Spark Issue Type: B

[jira] [Updated] (SPARK-21478) Unpersist a DF also unpersists related DFs

2017-07-19 Thread Roberto Mirizzi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roberto Mirizzi updated SPARK-21478: Summary: Unpersist a DF also unpersists related DFs (was: Unpersist a DF also unpersist re

[jira] [Resolved] (SPARK-21333) joinWith documents and analysis allow invalid join types

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-21333. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 > joinWith documents and analysis

[jira] [Assigned] (SPARK-21333) joinWith documents and analysis allow invalid join types

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-21333: --- Assignee: Corey Woodfield > joinWith documents and analysis allow invalid join types > -

[jira] [Commented] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093892#comment-16093892 ] Sean Owen commented on SPARK-21476: --- I'm not sure what you're suggesting, that somethin

[jira] [Commented] (SPARK-19842) Informational Referential Integrity Constraints Support in Spark

2017-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093887#comment-16093887 ] Reynold Xin commented on SPARK-19842: - Are you guys doing any work here? > Informati

[jira] [Resolved] (SPARK-21456) Make the driver failover_timeout configurable (Mesos cluster mode)

2017-07-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-21456. Resolution: Fixed Assignee: Susan X. Huynh Fix Version/s: 2.3.0 > Make the

[jira] [Updated] (SPARK-21089) Table properties are not shown in DESC EXTENDED/FORMATTED

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21089: Priority: Blocker (was: Critical) > Table properties are not shown in DESC EXTENDED/FORMATTED > --

[jira] [Updated] (SPARK-21203) Wrong results of insertion of Array of Struct

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21203: Priority: Blocker (was: Critical) > Wrong results of insertion of Array of Struct > --

[jira] [Updated] (SPARK-21258) Window result incorrect using complex object with spilling

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21258: Priority: Blocker (was: Major) > Window result incorrect using complex object with spilling >

[jira] [Resolved] (SPARK-21446) [SQL] JDBC Postgres fetchsize parameter ignored again

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-21446. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 2.1.2 > [SQL] J

[jira] [Assigned] (SPARK-21446) [SQL] JDBC Postgres fetchsize parameter ignored again

2017-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-21446: --- Assignee: Albert Jang > [SQL] JDBC Postgres fetchsize parameter ignored again >

[jira] [Assigned] (SPARK-21477) Mark LocalTableScanExec's input data transient

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21477: Assignee: Apache Spark > Mark LocalTableScanExec's input data transient >

[jira] [Commented] (SPARK-21477) Mark LocalTableScanExec's input data transient

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093835#comment-16093835 ] Apache Spark commented on SPARK-21477: -- User 'gatorsmile' has created a pull request

[jira] [Assigned] (SPARK-21477) Mark LocalTableScanExec's input data transient

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21477: Assignee: (was: Apache Spark) > Mark LocalTableScanExec's input data transient > -

[jira] [Created] (SPARK-21477) Mark LocalTableScanExec's input data transient

2017-07-19 Thread Xiao Li (JIRA)
Xiao Li created SPARK-21477: --- Summary: Mark LocalTableScanExec's input data transient Key: SPARK-21477 URL: https://issues.apache.org/jira/browse/SPARK-21477 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-19743) Exception when creating more than one implicit Encoder in REPL

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Bryński updated SPARK-19743: --- Affects Version/s: 2.1.1 2.2.0 > Exception when creating more than one

[jira] [Comment Edited] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093750#comment-16093750 ] Maciej Bryński edited comment on SPARK-21470 at 7/19/17 9:06 PM: --

[jira] [Resolved] (SPARK-21243) Limit the number of maps in a single shuffle fetch

2017-07-19 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-21243. --- Resolution: Fixed Fix Version/s: 2.3.0 > Limit the number of maps in a single shuffle

[jira] [Assigned] (SPARK-21243) Limit the number of maps in a single shuffle fetch

2017-07-19 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves reassigned SPARK-21243: - Assignee: Dhruve Ashar > Limit the number of maps in a single shuffle fetch > --

[jira] [Assigned] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21439: Assignee: (was: Apache Spark) > Cannot use Spark with Python ABCmeta (exception from c

[jira] [Commented] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093750#comment-16093750 ] Maciej Bryński commented on SPARK-21470: OK. I think I found the reason. There

[jira] [Closed] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Bryński closed SPARK-21470. -- Resolution: Invalid > [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA > ---

[jira] [Assigned] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21439: Assignee: Apache Spark > Cannot use Spark with Python ABCmeta (exception from cloudpickle)

[jira] [Commented] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093746#comment-16093746 ] Apache Spark commented on SPARK-21439: -- User 'maver1ck' has created a pull request f

[jira] [Commented] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093726#comment-16093726 ] Marcelo Vanzin commented on SPARK-21470: That still looks like some issue in your

[jira] [Comment Edited] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093716#comment-16093716 ] Maciej Bryński edited comment on SPARK-21470 at 7/19/17 8:05 PM: --

[jira] [Commented] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093716#comment-16093716 ] Maciej Bryński commented on SPARK-21470: [~vanzin] I tried. {code} /etc/hadoop/co

[jira] [Created] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-07-19 Thread Saurabh Agrawal (JIRA)
Saurabh Agrawal created SPARK-21476: --- Summary: RandomForest classification model not using broadcast in transform Key: SPARK-21476 URL: https://issues.apache.org/jira/browse/SPARK-21476 Project: Spa

[jira] [Comment Edited] (SPARK-17333) Make pyspark interface friendly with static analysis

2017-07-19 Thread Assaf Mendelson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16091210#comment-16091210 ] Assaf Mendelson edited comment on SPARK-17333 at 7/19/17 7:18 PM: -

[jira] [Commented] (SPARK-21475) Change the usage of FileInputStream/OutputStream to Files.newInput/OutputStream in the critical path

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093589#comment-16093589 ] Apache Spark commented on SPARK-21475: -- User 'jerryshao' has created a pull request

[jira] [Assigned] (SPARK-21475) Change the usage of FileInputStream/OutputStream to Files.newInput/OutputStream in the critical path

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21475: Assignee: Apache Spark > Change the usage of FileInputStream/OutputStream to > Files.newI

[jira] [Assigned] (SPARK-21475) Change the usage of FileInputStream/OutputStream to Files.newInput/OutputStream in the critical path

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21475: Assignee: (was: Apache Spark) > Change the usage of FileInputStream/OutputStream to >

[jira] [Commented] (SPARK-21474) Make number of parallel fetches from a reducer configurable

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093564#comment-16093564 ] Apache Spark commented on SPARK-21474: -- User 'raajay' has created a pull request for

[jira] [Assigned] (SPARK-21474) Make number of parallel fetches from a reducer configurable

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21474: Assignee: Apache Spark > Make number of parallel fetches from a reducer configurable > ---

[jira] [Assigned] (SPARK-21474) Make number of parallel fetches from a reducer configurable

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21474: Assignee: (was: Apache Spark) > Make number of parallel fetches from a reducer configu

[jira] [Created] (SPARK-21475) Change the usage of FileInputStream/OutputStream to Files.newInput/OutputStream in the critical path

2017-07-19 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-21475: --- Summary: Change the usage of FileInputStream/OutputStream to Files.newInput/OutputStream in the critical path Key: SPARK-21475 URL: https://issues.apache.org/jira/browse/SPARK-21475

[jira] [Resolved] (SPARK-21455) RpcFailure should be call on RpcResponseCallback.onFailure

2017-07-19 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21455. -- Resolution: Won't Fix > RpcFailure should be call on RpcResponseCallback.onFailure > --

[jira] [Resolved] (SPARK-21464) Minimize deprecation warnings caused by ProcessingTime class

2017-07-19 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-21464. --- Resolution: Fixed Fix Version/s: 3.0.0 2.2.1 Issue resolved by pull

[jira] [Created] (SPARK-21474) Make number of parallel fetches from a reducer configurable

2017-07-19 Thread Raajay Viswanathan (JIRA)
Raajay Viswanathan created SPARK-21474: -- Summary: Make number of parallel fetches from a reducer configurable Key: SPARK-21474 URL: https://issues.apache.org/jira/browse/SPARK-21474 Project: Spar

[jira] [Commented] (SPARK-18226) SparkR displaying vector columns in incorrect way

2017-07-19 Thread Kirti (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093483#comment-16093483 ] Kirti commented on SPARK-18226: --- Hi Felix, After collecting output of predict in R, probab

[jira] [Commented] (SPARK-18226) SparkR displaying vector columns in incorrect way

2017-07-19 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093466#comment-16093466 ] Felix Cheung commented on SPARK-18226: -- if you collect on what's returned by predict

[jira] [Comment Edited] (SPARK-21374) Reading globbed paths from S3 into DF doesn't work if filesystem caching is disabled

2017-07-19 Thread Andrey Taptunov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093417#comment-16093417 ] Andrey Taptunov edited comment on SPARK-21374 at 7/19/17 5:19 PM: -

[jira] [Commented] (SPARK-21374) Reading globbed paths from S3 into DF doesn't work if filesystem caching is disabled

2017-07-19 Thread Andrey Taptunov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093417#comment-16093417 ] Andrey Taptunov commented on SPARK-21374: - [~ste...@apache.org] Indeed, while w

[jira] [Commented] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093408#comment-16093408 ] Marcelo Vanzin commented on SPARK-21470: The HDFS library generally throws that e

[jira] [Updated] (SPARK-20725) partial aggregate should behave correctly for sameResult

2017-07-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-20725: Target Version/s: (was: 2.1.2) > partial aggregate should behave correctly for sameResult > -

[jira] [Resolved] (SPARK-21473) Running Transform on a bean which has only setters gives NullPointerExcpetion

2017-07-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21473. --- Resolution: Duplicate Given the error, looks like the change for SPARK-19666 already addressed why it

[jira] [Updated] (SPARK-21473) Running Transform on a bean which has only setters gives NullPointerExcpetion

2017-07-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21473: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) > Running Transform on a bean w

[jira] [Updated] (SPARK-21473) Running Transform on a bean which has only setters gives NullPointerExcpetion

2017-07-19 Thread Aseem Bansal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aseem Bansal updated SPARK-21473: - Description: If I run the following using the Java API {code:java} dataset.map(Transformer::tran

[jira] [Assigned] (SPARK-21441) Incorrect Codegen in SortMergeJoinExec results failures in some cases

2017-07-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-21441: --- Assignee: Feng Zhu > Incorrect Codegen in SortMergeJoinExec results failures in some cases >

[jira] [Created] (SPARK-21473) Running Transform on a bean which has only setters gives NullPointerExcpetion

2017-07-19 Thread Aseem Bansal (JIRA)
Aseem Bansal created SPARK-21473: Summary: Running Transform on a bean which has only setters gives NullPointerExcpetion Key: SPARK-21473 URL: https://issues.apache.org/jira/browse/SPARK-21473 Project

[jira] [Resolved] (SPARK-21441) Incorrect Codegen in SortMergeJoinExec results failures in some cases

2017-07-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21441. - Resolution: Fixed Fix Version/s: 2.1.2 2.3.0 2.2.1 I

[jira] [Commented] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093105#comment-16093105 ] Maciej Bryński commented on SPARK-5159: --- Still existed in Spark 2.2.0. Probably dupl

[jira] [Commented] (SPARK-21177) df.saveAsTable slows down linearly, with number of appends

2017-07-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093099#comment-16093099 ] Hyukjin Kwon commented on SPARK-21177: -- Could you provide some steps to reproduce? I

[jira] [Resolved] (SPARK-10216) Avoid creating empty files during overwrite into Hive table with group by query

2017-07-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-10216. -- Resolution: Fixed This is fixed in https://github.com/apache/spark/pull/18654 > Avoid creating

[jira] [Resolved] (SPARK-21105) Useless empty files in hive table

2017-07-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21105. -- Resolution: Duplicate This is a duplicate of SPARK-10216. > Useless empty files in hive table

[jira] [Assigned] (SPARK-21414) Buffer in SlidingWindowFunctionFrame could be big though window is small

2017-07-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-21414: --- Assignee: jin xing > Buffer in SlidingWindowFunctionFrame could be big though window is smal

[jira] [Resolved] (SPARK-21414) Buffer in SlidingWindowFunctionFrame could be big though window is small

2017-07-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21414. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 Issue resolved by pull req

[jira] [Commented] (SPARK-21177) df.saveAsTable slows down linearly, with number of appends

2017-07-19 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093080#comment-16093080 ] Prashant Sharma commented on SPARK-21177: - I can reproduce it on another system w

[jira] [Commented] (SPARK-11248) Spark hivethriftserver is using the wrong user to while getting HDFS permissions

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093055#comment-16093055 ] Maciej Bryński commented on SPARK-11248: I have similar issue in Spark 2.2.0 > S

[jira] [Updated] (SPARK-11248) Spark hivethriftserver is using the wrong user to while getting HDFS permissions

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Bryński updated SPARK-11248: --- Affects Version/s: 2.2.0 > Spark hivethriftserver is using the wrong user to while getting HD

[jira] [Updated] (SPARK-11248) Spark hivethriftserver is using the wrong user to while getting HDFS permissions

2017-07-19 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Bryński updated SPARK-11248: --- Affects Version/s: 2.1.1 > Spark hivethriftserver is using the wrong user to while getting HD

[jira] [Updated] (SPARK-21448) Hi dear guys, I have a question about aggregateByKey of pairrrd.

2017-07-19 Thread qihuagao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qihuagao updated SPARK-21448: - Description: java pair rdd has aggregateByKey, which can avoid full shuffle, so have impressive performa

[jira] [Updated] (SPARK-21459) Some aggregation functions change the case of nested field names

2017-07-19 Thread David Allsopp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Allsopp updated SPARK-21459: -- Description: When working with DataFrames with nested schemas, the behavior of the aggregation

[jira] [Updated] (SPARK-21459) Some aggregation functions change the case of nested field names

2017-07-19 Thread David Allsopp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Allsopp updated SPARK-21459: -- Description: When working with DataFrames with nested schemas, the behavior of the aggregation

[jira] [Assigned] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21472: Assignee: Apache Spark > Introduce ArrowColumnVector as a reader for Arrow vectors. >

[jira] [Assigned] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21472: Assignee: (was: Apache Spark) > Introduce ArrowColumnVector as a reader for Arrow vect

[jira] [Commented] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092921#comment-16092921 ] Apache Spark commented on SPARK-21472: -- User 'ueshin' has created a pull request for

[jira] [Updated] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-21472: -- Description: Introducing {{ArrowColumnVector}} as a reader for Arrow vectors. It extends {{Colu

[jira] [Commented] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092919#comment-16092919 ] Hyukjin Kwon commented on SPARK-21439: -- We have a cloudpickle copy in Spark. Would y

[jira] [Updated] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-21472: -- Description: Introducing {{ArrowColumnVector}} as a reader for Arrow vectors. It extends {{Colu

[jira] [Updated] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-21472: -- Description: Introducing {{ArrowColumnVector}} as a reader for Arrow vectors. This extends {{Co

[jira] [Created] (SPARK-21472) Introduce ArrowColumnVector as a reader for Arrow vectors.

2017-07-19 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-21472: - Summary: Introduce ArrowColumnVector as a reader for Arrow vectors. Key: SPARK-21472 URL: https://issues.apache.org/jira/browse/SPARK-21472 Project: Spark

[jira] [Commented] (SPARK-20783) Enhance ColumnVector to support compressed representation

2017-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092892#comment-16092892 ] Apache Spark commented on SPARK-20783: -- User 'kiszk' has created a pull request for

[jira] [Resolved] (SPARK-21471) Read binary file error in Spark Streaming

2017-07-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21471. --- Resolution: Invalid This is better on StackOverflow. JIRA is for reporting researched bugs in Spark

[jira] [Comment Edited] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-19 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092888#comment-16092888 ] Maciej Bryński edited comment on SPARK-21439 at 7/19/17 10:15 AM: -

[jira] [Commented] (SPARK-21439) Cannot use Spark with Python ABCmeta (exception from cloudpickle)

2017-07-19 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-21439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092888#comment-16092888 ] Maciej Bryński commented on SPARK-21439: https://github.com/cloudpipe/cloudpickle

[jira] [Updated] (SPARK-21470) [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA

2017-07-19 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-21470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Bryński updated SPARK-21470: --- Summary: [SPARK 2.2 Regression] Spark History server doesn't support HDFS HA (was: Spark His

[jira] [Updated] (SPARK-21471) Read binary file error in Spark Streaming

2017-07-19 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-21471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lê Văn Thanh updated SPARK-21471: - Description: My client using GZIPOutputStream to compressed the data and push to my server. Whe

[jira] [Created] (SPARK-21471) Read binary file error in Spark Streaming

2017-07-19 Thread JIRA
Lê Văn Thanh created SPARK-21471: Summary: Read binary file error in Spark Streaming Key: SPARK-21471 URL: https://issues.apache.org/jira/browse/SPARK-21471 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-21470) Spark History server doesn't support HDFS HA

2017-07-19 Thread JIRA
Maciej Bryński created SPARK-21470: -- Summary: Spark History server doesn't support HDFS HA Key: SPARK-21470 URL: https://issues.apache.org/jira/browse/SPARK-21470 Project: Spark Issue Type:

[jira] [Created] (SPARK-21469) Add doc and example for FeatureHasher

2017-07-19 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-21469: -- Summary: Add doc and example for FeatureHasher Key: SPARK-21469 URL: https://issues.apache.org/jira/browse/SPARK-21469 Project: Spark Issue Type: Documen

[jira] [Created] (SPARK-21468) FeatureHasher Python API

2017-07-19 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-21468: -- Summary: FeatureHasher Python API Key: SPARK-21468 URL: https://issues.apache.org/jira/browse/SPARK-21468 Project: Spark Issue Type: New Feature
