[jira] [Updated] (SPARK-22739) Additional Expression Support for Objects

2018-01-15 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal updated SPARK-22739: --- Priority: Major (was: Critical) > Additional Expression Support for Objects >

[jira] [Resolved] (SPARK-23020) Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher

2018-01-15 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal resolved SPARK-23020. Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 20223

[jira] [Assigned] (SPARK-23020) Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher

2018-01-15 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal reassigned SPARK-23020: -- Assignee: Marcelo Vanzin > Flaky Test:

[jira] [Commented] (SPARK-23085) API parity for mllib.linalg.Vectors.sparse

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326795#comment-16326795 ] Apache Spark commented on SPARK-23085: -- User 'zhengruifeng' has created a pull request for this

[jira] [Assigned] (SPARK-23085) API parity for mllib.linalg.Vectors.sparse

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23085: Assignee: (was: Apache Spark) > API parity for mllib.linalg.Vectors.sparse >

[jira] [Assigned] (SPARK-23085) API parity for mllib.linalg.Vectors.sparse

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23085: Assignee: Apache Spark > API parity for mllib.linalg.Vectors.sparse >

[jira] [Updated] (SPARK-23085) API parity for mllib.linalg.Vectors.sparse

2018-01-15 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-23085: - Description: Both {{ML.Vectors#sparse(size: Int, indices:

[jira] [Updated] (SPARK-23085) API parity for mllib.linalg.Vectors.sparse

2018-01-15 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-23085: - Description: Both {ML.Vectors#sparse size: Int, indices:

[jira] [Created] (SPARK-23085) API parity for mllib.linalg.Vectors.sparse

2018-01-15 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-23085: Summary: API parity for mllib.linalg.Vectors.sparse Key: SPARK-23085 URL: https://issues.apache.org/jira/browse/SPARK-23085 Project: Spark Issue Type:

[jira] [Updated] (SPARK-22956) Union Stream Failover Cause `IllegalStateException`

2018-01-15 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22956: - Fix Version/s: (was: 2.3.0) 2.4.0 2.3.1 > Union Stream

[jira] [Resolved] (SPARK-22956) Union Stream Failover Cause `IllegalStateException`

2018-01-15 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22956. -- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 20150

[jira] [Assigned] (SPARK-22956) Union Stream Failover Cause `IllegalStateException`

2018-01-15 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-22956: Assignee: Li Yuanjian > Union Stream Failover Cause `IllegalStateException` >

[jira] [Updated] (SPARK-19700) Design an API for pluggable scheduler implementations

2018-01-15 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anirudh Ramanathan updated SPARK-19700: --- Issue Type: Improvement (was: Bug) > Design an API for pluggable scheduler

[jira] [Commented] (SPARK-23083) Adding Kubernetes as an option to https://spark.apache.org/

2018-01-15 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326711#comment-16326711 ] Anirudh Ramanathan commented on SPARK-23083: opened

[jira] [Resolved] (SPARK-23080) Improve error message for built-in functions

2018-01-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-23080. -- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 20271

[jira] [Assigned] (SPARK-23080) Improve error message for built-in functions

2018-01-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-23080: Assignee: Marco Gaido > Improve error message for built-in functions >

[jira] [Commented] (SPARK-22624) Expose range partitioning shuffle introduced by SPARK-22614

2018-01-15 Thread xubo245 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326707#comment-16326707 ] xubo245 commented on SPARK-22624: - [~smilegator] OK, I will finish it. > Expose range partitioning

[jira] [Commented] (SPARK-23076) When we call cache() on RDD which depends on ShuffleRowRDD, we will get an error result

2018-01-15 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326685#comment-16326685 ] zhoukang commented on SPARK-23076: -- Yes, maybe it should be an improvement? Since I suppose some other may

[jira] [Updated] (SPARK-23076) When we call cache() on RDD which depends on ShuffleRowRDD, we will get an error result

2018-01-15 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-23076: - Issue Type: Improvement (was: Bug) > When we call cache() on RDD which depends on ShuffleRowRDD, we

[jira] [Commented] (SPARK-20120) spark-sql CLI support silent mode

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326673#comment-16326673 ] Apache Spark commented on SPARK-20120: -- User 'wangyum' has created a pull request for this issue:

[jira] [Updated] (SPARK-23035) Fix improper information of TempTableAlreadyExistsException

2018-01-15 Thread xubo245 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated SPARK-23035: Description: Problem: it throws TempTableAlreadyExistsException and outputs "Temporary table '$table'

[jira] [Updated] (SPARK-23035) Fix improper information of TempTableAlreadyExistsException

2018-01-15 Thread xubo245 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated SPARK-23035: Description: Problem: it throws TempTableAlreadyExistsException and outputs "Temporary table '$table'

[jira] [Updated] (SPARK-23035) Fix improper information of TempTableAlreadyExistsException

2018-01-15 Thread xubo245 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated SPARK-23035: Description: Problem: it throws TempTableAlreadyExistsException and outputs "Temporary table '$table'

[jira] [Updated] (SPARK-23035) Fix improper information of TempTableAlreadyExistsException

2018-01-15 Thread xubo245 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated SPARK-23035: Description: Problem: it throws TempTableAlreadyExistsException and outputs "Temporary table '$table'

[jira] [Updated] (SPARK-23035) Fix improper information of TempTableAlreadyExistsException

2018-01-15 Thread xubo245 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated SPARK-23035: Summary: Fix improper information of TempTableAlreadyExistsException (was: Fix warning: TEMPORARY TABLE

[jira] [Commented] (SPARK-23000) Flaky test suite DataSourceWithHiveMetastoreCatalogSuite in Spark 2.3

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326663#comment-16326663 ] Apache Spark commented on SPARK-23000: -- User 'sameeragarwal' has created a pull request for this

[jira] [Updated] (SPARK-23084) Add unboundedPreceding(), unboundedFollowing() and currentRow() to PySpark

2018-01-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23084: Description: Add the new APIs (introduced by https://github.com/apache/spark/pull/18814) to PySpark. Also

[jira] [Updated] (SPARK-23084) Add unboundedPreceding(), unboundedFollowing() and currentRow() to PySpark

2018-01-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23084: Description: Add the new APIs (introduced by https://github.com/apache/spark/pull/18814) to PySpark

[jira] [Updated] (SPARK-23084) Add unboundedPreceding(), unboundedFollowing() and currentRow() to PySpark

2018-01-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23084: Environment: (was: Add the new APIs (introduced by https://github.com/apache/spark/pull/18814) to

[jira] [Created] (SPARK-23084) Add unboundedPreceding(), unboundedFollowing() and currentRow() to PySpark

2018-01-15 Thread Xiao Li (JIRA)
Xiao Li created SPARK-23084: --- Summary: Add unboundedPreceding(), unboundedFollowing() and currentRow() to PySpark Key: SPARK-23084 URL: https://issues.apache.org/jira/browse/SPARK-23084 Project: Spark

[jira] [Commented] (SPARK-23083) Adding Kubernetes as an option to https://spark.apache.org/

2018-01-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326635#comment-16326635 ] Sean Owen commented on SPARK-23083: --- Yes that's fine. If it only makes sense after the 2.3 release then

[jira] [Commented] (SPARK-23083) Adding Kubernetes as an option to https://spark.apache.org/

2018-01-15 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326631#comment-16326631 ] Anirudh Ramanathan commented on SPARK-23083: Thanks. I'll create a PR against that repo.

[jira] [Commented] (SPARK-23083) Adding Kubernetes as an option to https://spark.apache.org/

2018-01-15 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326619#comment-16326619 ] Reynold Xin commented on SPARK-23083: - Here's the website repo:

[jira] [Created] (SPARK-23083) Adding Kubernetes as an option to https://spark.apache.org/

2018-01-15 Thread Anirudh Ramanathan (JIRA)
Anirudh Ramanathan created SPARK-23083: -- Summary: Adding Kubernetes as an option to https://spark.apache.org/ Key: SPARK-23083 URL: https://issues.apache.org/jira/browse/SPARK-23083 Project:

[jira] [Commented] (SPARK-23078) Allow Submitting Spark Thrift Server in Cluster Mode

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326521#comment-16326521 ] Apache Spark commented on SPARK-23078: -- User 'ozzieba' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23078) Allow Submitting Spark Thrift Server in Cluster Mode

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23078: Assignee: Apache Spark > Allow Submitting Spark Thrift Server in Cluster Mode >

[jira] [Assigned] (SPARK-23078) Allow Submitting Spark Thrift Server in Cluster Mode

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23078: Assignee: (was: Apache Spark) > Allow Submitting Spark Thrift Server in Cluster Mode

[jira] [Commented] (SPARK-23074) Dataframe-ified zipwithindex

2018-01-15 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326489#comment-16326489 ] Ruslan Dautkhanov commented on SPARK-23074: --- Yep, there are use cases where ordering is

[jira] [Commented] (SPARK-23074) Dataframe-ified zipwithindex

2018-01-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326479#comment-16326479 ] Sean Owen commented on SPARK-23074: --- Hm, rowNumber requires you to sort the input? I didn't think it

[jira] [Updated] (SPARK-23082) Allow separate node selectors for driver and executors in Kubernetes

2018-01-15 Thread Oz Ben-Ami (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oz Ben-Ami updated SPARK-23082: --- Description: In YARN, we can use {{spark.yarn.am.nodeLabelExpression}} to submit the Spark driver

[jira] [Updated] (SPARK-23082) Allow separate node selectors for driver and executors in Kubernetes

2018-01-15 Thread Oz Ben-Ami (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oz Ben-Ami updated SPARK-23082: --- Description: In YARN, we can use spark.yarn.am.nodeLabelExpression to submit the Spark driver to a

[jira] [Created] (SPARK-23082) Allow separate node selectors for driver and executors in Kubernetes

2018-01-15 Thread Oz Ben-Ami (JIRA)
Oz Ben-Ami created SPARK-23082: -- Summary: Allow separate node selectors for driver and executors in Kubernetes Key: SPARK-23082 URL: https://issues.apache.org/jira/browse/SPARK-23082 Project: Spark

[jira] [Resolved] (SPARK-21108) convert LinearSVC to aggregator framework

2018-01-15 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-21108. Resolution: Fixed > convert LinearSVC to aggregator framework >

[jira] [Commented] (SPARK-12717) pyspark broadcast fails when using multiple threads

2018-01-15 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326459#comment-16326459 ] Bryan Cutler commented on SPARK-12717: -- Hi [~codlife], you can use Spark 2.2.1 which was released in

[jira] [Commented] (SPARK-23074) Dataframe-ified zipwithindex

2018-01-15 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326455#comment-16326455 ] Ruslan Dautkhanov commented on SPARK-23074: --- {quote}You can create a DataFrame from the result

[jira] [Comment Edited] (SPARK-23074) Dataframe-ified zipwithindex

2018-01-15 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326455#comment-16326455 ] Ruslan Dautkhanov edited comment on SPARK-23074 at 1/15/18 5:40 PM:

[jira] [Commented] (SPARK-6305) Add support for log4j 2.x to Spark

2018-01-15 Thread Roque Vassal'lo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326422#comment-16326422 ] Roque Vassal'lo commented on SPARK-6305: Hi, I would like to know the current status of this ticket.

[jira] [Updated] (SPARK-23081) Add colRegex API to PySpark

2018-01-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23081: Summary: Add colRegex API to PySpark (was: Add colRegex to PySpark) > Add colRegex API to PySpark >

[jira] [Created] (SPARK-23081) Add colRegex to PySpark

2018-01-15 Thread Xiao Li (JIRA)
Xiao Li created SPARK-23081: --- Summary: Add colRegex to PySpark Key: SPARK-23081 URL: https://issues.apache.org/jira/browse/SPARK-23081 Project: Spark Issue Type: Improvement Components:

[jira] [Updated] (SPARK-23081) Add colRegex to PySpark

2018-01-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23081: Fix Version/s: (was: 2.3.0) > Add colRegex to PySpark > --- > >

[jira] [Assigned] (SPARK-12139) REGEX Column Specification for Hive Queries

2018-01-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-12139: --- Assignee: jane > REGEX Column Specification for Hive Queries >

[jira] [Commented] (SPARK-23078) Allow Submitting Spark Thrift Server in Cluster Mode

2018-01-15 Thread Oz Ben-Ami (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326390#comment-16326390 ] Oz Ben-Ami commented on SPARK-23078: [~mgaido] no objections, Kubernetes is the one that most needs

[jira] [Resolved] (SPARK-12139) REGEX Column Specification for Hive Queries

2018-01-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-12139. - Resolution: Fixed Fix Version/s: 2.3.0 > REGEX Column Specification for Hive Queries >

[jira] [Assigned] (SPARK-23080) Improve error message for built-in functions

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23080: Assignee: (was: Apache Spark) > Improve error message for built-in functions >

[jira] [Assigned] (SPARK-23080) Improve error message for built-in functions

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23080: Assignee: Apache Spark > Improve error message for built-in functions >

[jira] [Commented] (SPARK-23080) Improve error message for built-in functions

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326383#comment-16326383 ] Apache Spark commented on SPARK-23080: -- User 'mgaido91' has created a pull request for this issue:

[jira] [Commented] (SPARK-22624) Expose range partitioning shuffle introduced by SPARK-22614

2018-01-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326382#comment-16326382 ] Xiao Li commented on SPARK-22624: - cc [~xubo245] Do you want to take this? > Expose range partitioning

[jira] [Commented] (SPARK-23078) Allow Submitting Spark Thrift Server in Cluster Mode

2018-01-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326378#comment-16326378 ] Marco Gaido commented on SPARK-23078: - [~ozzieba] I see that in Kubernetes it might work, but I think

[jira] [Comment Edited] (SPARK-23078) Allow Submitting Spark Thrift Server in Cluster Mode

2018-01-15 Thread Oz Ben-Ami (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326366#comment-16326366 ] Oz Ben-Ami edited comment on SPARK-23078 at 1/15/18 4:02 PM: - [~mgaido] In

[jira] [Commented] (SPARK-23078) Allow Submitting Spark Thrift Server in Cluster Mode

2018-01-15 Thread Oz Ben-Ami (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326366#comment-16326366 ] Oz Ben-Ami commented on SPARK-23078: [~mgaido] In Kubernetes you can just create a Service which

[jira] [Created] (SPARK-23080) Improve error message for built-in functions

2018-01-15 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23080: --- Summary: Improve error message for built-in functions Key: SPARK-23080 URL: https://issues.apache.org/jira/browse/SPARK-23080 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-23050) Structured Streaming with S3 file source duplicates data because of eventual consistency.

2018-01-15 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326330#comment-16326330 ] Steve Loughran edited comment on SPARK-23050 at 1/15/18 3:24 PM: - Quick

[jira] [Commented] (SPARK-23078) Allow Submitting Spark Thrift Server in Cluster Mode

2018-01-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326337#comment-16326337 ] Marco Gaido commented on SPARK-23078: - The problem is: in cluster mode you don't control where the

[jira] [Assigned] (SPARK-23079) Fix query constraints propagation with aliases

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23079: Assignee: (was: Apache Spark) > Fix query constraints propagation with aliases >

[jira] [Commented] (SPARK-23079) Fix query constraints propagation with aliases

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326334#comment-16326334 ] Apache Spark commented on SPARK-23079: -- User 'gengliangwang' has created a pull request for this

[jira] [Assigned] (SPARK-23079) Fix query constraints propagation with aliases

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23079: Assignee: Apache Spark > Fix query constraints propagation with aliases >

[jira] [Resolved] (SPARK-23035) Fix warning: TEMPORARY TABLE ... USING ... is deprecated and use TempViewAlreadyExistsException when create temp view

2018-01-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23035. - Resolution: Fixed Assignee: xubo245 Fix Version/s: 2.3.0 > Fix warning: TEMPORARY TABLE

[jira] [Commented] (SPARK-23050) Structured Streaming with S3 file source duplicates data because of eventual consistency.

2018-01-15 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326330#comment-16326330 ] Steve Loughran commented on SPARK-23050: Quick review of the code. Yes, there's potentially a

[jira] [Created] (SPARK-23079) Fix query constraints propagation with aliases

2018-01-15 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-23079: -- Summary: Fix query constraints propagation with aliases Key: SPARK-23079 URL: https://issues.apache.org/jira/browse/SPARK-23079 Project: Spark Issue

[jira] [Commented] (SPARK-23076) When we call cache() on RDD which depends on ShuffleRowRDD, we will get an error result

2018-01-15 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326304#comment-16326304 ] Wenchen Fan commented on SPARK-23076: - To be clear, you maintain an internal Spark version and cache

[jira] [Created] (SPARK-23078) Allow Submitting Spark Thrift Server in Cluster Mode

2018-01-15 Thread Oz Ben-Ami (JIRA)
Oz Ben-Ami created SPARK-23078: -- Summary: Allow Submitting Spark Thrift Server in Cluster Mode Key: SPARK-23078 URL: https://issues.apache.org/jira/browse/SPARK-23078 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-23070) Bump previousSparkVersion in MimaBuild.scala to be 2.2.0

2018-01-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23070. - Resolution: Fixed > Bump previousSparkVersion in MimaBuild.scala to be 2.2.0 >

[jira] [Commented] (SPARK-18112) Spark2.x does not support read data from Hive 2.x metastore

2018-01-15 Thread JP Moresmau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326303#comment-16326303 ] JP Moresmau commented on SPARK-18112: - I'm using Hive 2.3.2 and Spark 2.2.1, but I still run into

[jira] [Resolved] (SPARK-22995) Spark UI stdout/stderr links point to executors internal address

2018-01-15 Thread Jhon Cardenas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jhon Cardenas resolved SPARK-22995. --- Resolution: Unresolved > Spark UI stdout/stderr links point to executors internal address >

[jira] [Assigned] (SPARK-23029) Doc spark.shuffle.file.buffer units are kb when no units specified

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23029: Assignee: (was: Apache Spark) > Doc spark.shuffle.file.buffer units are kb when no

[jira] [Commented] (SPARK-23029) Doc spark.shuffle.file.buffer units are kb when no units specified

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326260#comment-16326260 ] Apache Spark commented on SPARK-23029: -- User 'ferdonline' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23029) Doc spark.shuffle.file.buffer units are kb when no units specified

2018-01-15 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23029: Assignee: Apache Spark > Doc spark.shuffle.file.buffer units are kb when no units

[jira] [Assigned] (SPARK-21856) Update Python API for MultilayerPerceptronClassifierModel

2018-01-15 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-21856: -- Assignee: Chunsheng Ji > Update Python API for MultilayerPerceptronClassifierModel >

[jira] [Assigned] (SPARK-21856) Update Python API for MultilayerPerceptronClassifierModel

2018-01-15 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-21856: -- Assignee: (was: Weichen Xu) > Update Python API for

[jira] [Assigned] (SPARK-21856) Update Python API for MultilayerPerceptronClassifierModel

2018-01-15 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-21856: -- Assignee: Weichen Xu > Update Python API for MultilayerPerceptronClassifierModel >

[jira] [Resolved] (SPARK-21856) Update Python API for MultilayerPerceptronClassifierModel

2018-01-15 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-21856. Resolution: Fixed > Update Python API for MultilayerPerceptronClassifierModel >

[jira] [Updated] (SPARK-23076) When we call cache() on RDD which depends on ShuffleRowRDD, we will get an error result

2018-01-15 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-23076: - Summary: When we call cache() on RDD which depends on ShuffleRowRDD, we will get an error result (was:

[jira] [Comment Edited] (SPARK-23076) When we call cache() on ShuffleRowRDD, we will get an error result

2018-01-15 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326242#comment-16326242 ] zhoukang edited comment on SPARK-23076 at 1/15/18 1:34 PM: --- As the picture I

[jira] [Commented] (SPARK-23076) When we call cache() on ShuffleRowRDD, we will get an error result

2018-01-15 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326242#comment-16326242 ] zhoukang commented on SPARK-23076: -- As in the picture I posted, I cached MapPartitionsRDD, which depends on

[jira] [Updated] (SPARK-23076) When we call cache() on ShuffleRowRDD, we will get an error result

2018-01-15 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-23076: - Description: For query below: {code:java} select * from csv_demo limit 3; {code} The correct result

[jira] [Comment Edited] (SPARK-17998) Reading Parquet files coalesces parts into too few in-memory partitions

2018-01-15 Thread Fernando Pereira (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326195#comment-16326195 ] Fernando Pereira edited comment on SPARK-17998 at 1/15/18 12:35 PM:

[jira] [Commented] (SPARK-17998) Reading Parquet files coalesces parts into too few in-memory partitions

2018-01-15 Thread Fernando Pereira (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326195#comment-16326195 ] Fernando Pereira commented on SPARK-17998: -- [~sams] Did you have the chance to check

[jira] [Commented] (SPARK-23076) When we call cache() on ShuffleRowRDD, we will get an error result

2018-01-15 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326163#comment-16326163 ] Wenchen Fan commented on SPARK-23076: - How did you cache ShuffleRowRDD? It's kind of an intermedia

[jira] [Resolved] (SPARK-23077) Apache Structured Streaming: Bulk/Batch write support for Hive using streaming dataset

2018-01-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-23077. --- Resolution: Invalid I'm not sure what you're reporting here, but this sounds like a question about

[jira] [Updated] (SPARK-23077) Apache Structured Streaming: Bulk/Batch write support for Hive using streaming dataset

2018-01-15 Thread Pravin Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Agrawal updated SPARK-23077: --- Issue Type: Improvement (was: Bug) > Apache Structured Streaming: Bulk/Batch write support

[jira] [Commented] (SPARK-23076) When we call cache() on ShuffleRowRDD, we will get an error result

2018-01-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326153#comment-16326153 ] Sean Owen commented on SPARK-23076: --- That sounds serious, although if true, I would expect a lot of

[jira] [Commented] (SPARK-22943) OneHotEncoder supports manual specification of categorySizes

2018-01-15 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326151#comment-16326151 ] Nick Pentreath commented on SPARK-22943: Does the new estimator & model version of OHE solve this

[jira] [Updated] (SPARK-23077) Apache Structured Streaming: Bulk/Batch write support for Hive using streaming dataset

2018-01-15 Thread Pravin Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Agrawal updated SPARK-23077: --- Description: Using Apache Spark 2.2: Structured Streaming, Create a program which reads data

[jira] [Updated] (SPARK-23077) Apache Structured Streaming: Bulk/Batch write support for Hive using streaming dataset

2018-01-15 Thread Pravin Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Agrawal updated SPARK-23077: --- Summary: Apache Structured Streaming: Bulk/Batch write support for Hive using streaming

[jira] [Resolved] (SPARK-22993) checkpointInterval param doc should be clearer

2018-01-15 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-22993. Resolution: Fixed > checkpointInterval param doc should be clearer >

[jira] [Updated] (SPARK-23077) Apache Structured Streaming: Unable to write streaming dataset into Hive

2018-01-15 Thread Pravin Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Agrawal updated SPARK-23077: --- Description: Using Apache Spark 2.2: Structured Streaming, I am creating a program which

[jira] [Assigned] (SPARK-22993) checkpointInterval param doc should be clearer

2018-01-15 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-22993: -- Assignee: Seth Hendrickson > checkpointInterval param doc should be clearer >

[jira] [Updated] (SPARK-23077) Apache Structured Streaming: Unable to write streaming dataset into Hive?

2018-01-15 Thread Pravin Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Agrawal updated SPARK-23077: --- Issue Type: Bug (was: Question) > Apache Structured Streaming: Unable to write streaming

[jira] [Updated] (SPARK-23077) Apache Structured Streaming: Unable to write streaming dataset into Hive

2018-01-15 Thread Pravin Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Agrawal updated SPARK-23077: --- Summary: Apache Structured Streaming: Unable to write streaming dataset into Hive (was:

[jira] [Updated] (SPARK-23077) Apache Structured Streaming: Unable to write streaming dataset into Hive?

2018-01-15 Thread Pravin Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Agrawal updated SPARK-23077: --- Summary: Apache Structured Streaming: Unable to write streaming dataset into Hive? (was:
