[jira] [Commented] (SPARK-26823) SBT Build Warnings

2019-02-06 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762438#comment-16762438 ] Hyukjin Kwon commented on SPARK-26823: -- Please make the JIRA title more descriptive

[jira] [Resolved] (SPARK-26823) SBT Build Warnings

2019-02-06 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26823. -- Resolution: Incomplete > SBT Build Warnings > --- > > Key: SP

[jira] [Commented] (SPARK-26828) Coalesce to reduce partitions before writing to hive is not working

2019-02-06 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762436#comment-16762436 ] Hyukjin Kwon commented on SPARK-26828: -- Can you make a self-reproducer so that peop

[jira] [Resolved] (SPARK-26834) ERROR org.apache.spark.deploy.history.HistoryServer: RECEIVED SIGNAL TERM

2019-02-06 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26834. -- Resolution: Invalid > ERROR org.apache.spark.deploy.history.HistoryServer: RECEIVED SIGNAL TER

[jira] [Commented] (SPARK-26834) ERROR org.apache.spark.deploy.history.HistoryServer: RECEIVED SIGNAL TERM

2019-02-06 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762433#comment-16762433 ] Hyukjin Kwon commented on SPARK-26834: -- If you're using CDH, please use a channel f

[jira] [Commented] (SPARK-26835) Document configuration properties of Spark SQL Generic Load/Save Functions

2019-02-06 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762431#comment-16762431 ] Hyukjin Kwon commented on SPARK-26835: -- They are explained in API documentation. "G

[jira] [Updated] (SPARK-26836) Columns get switched in Spark SQL using Avro backed Hive table if schema evolves

2019-02-06 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-26836: - Priority: Major (was: Critical) > Columns get switched in Spark SQL using Avro backed Hive tabl

[jira] [Resolved] (SPARK-26838) Dataframe write Parquet java.lang.NoClassDefFoundError: Could not initialize class ParquetOptions

2019-02-06 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26838. -- Resolution: Invalid Sounds more like a question since all the current Jenkins jobs are running

[jira] [Resolved] (SPARK-21758) `SHOW TBLPROPERTIES` can not get properties start with spark.sql.*

2019-02-06 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21758. -- Resolution: Not A Problem > `SHOW TBLPROPERTIES` can not get properties start with spark.sql.*

[jira] [Commented] (SPARK-21758) `SHOW TBLPROPERTIES` can not get properties start with spark.sql.*

2019-02-06 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762409#comment-16762409 ] Hyukjin Kwon commented on SPARK-21758: -- Look intentionally being filtered out: htt

[jira] [Created] (SPARK-26840) Avoid cost-based join reorder in presence of join hints

2019-02-06 Thread Maryann Xue (JIRA)
Maryann Xue created SPARK-26840: --- Summary: Avoid cost-based join reorder in presence of join hints Key: SPARK-26840 URL: https://issues.apache.org/jira/browse/SPARK-26840 Project: Spark Issue T

[jira] [Resolved] (SPARK-26832) Avoid project creation per record at Python's grouped vectorized UDF

2019-02-06 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26832. -- Resolution: Invalid I misread the codes .. > Avoid project creation per record at Python's g

[jira] [Comment Edited] (SPARK-24657) SortMergeJoin may cause SparkOutOfMemory in execution memory because of not cleanup resource when finished the merge join

2019-02-06 Thread Tao Luo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762280#comment-16762280 ] Tao Luo edited comment on SPARK-24657 at 2/7/19 1:53 AM: - Just r

[jira] [Comment Edited] (SPARK-24657) SortMergeJoin may cause SparkOutOfMemory in execution memory because of not cleanup resource when finished the merge join

2019-02-06 Thread Tao Luo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761607#comment-16761607 ] Tao Luo edited comment on SPARK-24657 at 2/7/19 2:02 AM: - Sure:

[jira] [Comment Edited] (SPARK-24657) SortMergeJoin may cause SparkOutOfMemory in execution memory because of not cleanup resource when finished the merge join

2019-02-06 Thread Tao Luo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761607#comment-16761607 ] Tao Luo edited comment on SPARK-24657 at 2/7/19 2:03 AM: - Sure:

[jira] [Commented] (SPARK-24657) SortMergeJoin may cause SparkOutOfMemory in execution memory because of not cleanup resource when finished the merge join

2019-02-06 Thread Tao Luo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762280#comment-16762280 ] Tao Luo commented on SPARK-24657: - Just reproduced it in standalone mode using [https:/

[jira] [Commented] (SPARK-26839) on JDK11, IsolatedClientLoader must be able to load java.sql classes

2019-02-06 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762219#comment-16762219 ] Imran Rashid commented on SPARK-26839: -- I found that just on a regular scala 2.12 s

[jira] [Created] (SPARK-26839) on J11, IsolatedClientLoader must be able to load java.sql classes

2019-02-06 Thread Imran Rashid (JIRA)
Imran Rashid created SPARK-26839: Summary: on J11, IsolatedClientLoader must be able to load java.sql classes Key: SPARK-26839 URL: https://issues.apache.org/jira/browse/SPARK-26839 Project: Spark

[jira] [Updated] (SPARK-26839) on JDK11, IsolatedClientLoader must be able to load java.sql classes

2019-02-06 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-26839: - Summary: on JDK11, IsolatedClientLoader must be able to load java.sql classes (was: on J11, Iso

[jira] [Assigned] (SPARK-22798) Add multiple column support to PySpark StringIndexer

2019-02-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22798: Assignee: (was: Apache Spark) > Add multiple column support to PySpark StringIndexer

[jira] [Assigned] (SPARK-22798) Add multiple column support to PySpark StringIndexer

2019-02-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22798: Assignee: Apache Spark > Add multiple column support to PySpark StringIndexer > -

[jira] [Comment Edited] (SPARK-19468) Dataset slow because of unnecessary shuffles

2019-02-06 Thread Mitesh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761207#comment-16761207 ] Mitesh edited comment on SPARK-19468 at 2/6/19 7:13 PM: I'm seei

[jira] [Comment Edited] (SPARK-19468) Dataset slow because of unnecessary shuffles

2019-02-06 Thread Mitesh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761207#comment-16761207 ] Mitesh edited comment on SPARK-19468 at 2/6/19 7:13 PM: +1 I'm s

[jira] [Comment Edited] (SPARK-19981) Sort-Merge join inserts shuffles when joining dataframes with aliased columns

2019-02-06 Thread Mitesh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761535#comment-16761535 ] Mitesh edited comment on SPARK-19981 at 2/6/19 7:12 PM: Ping [~m

[jira] [Created] (SPARK-26838) Dataframe write Parquet java.lang.NoClassDefFoundError: Could not initialize class ParquetOptions

2019-02-06 Thread Sanket (JIRA)
Sanket created SPARK-26838: -- Summary: Dataframe write Parquet java.lang.NoClassDefFoundError: Could not initialize class ParquetOptions Key: SPARK-26838 URL: https://issues.apache.org/jira/browse/SPARK-26838

[jira] [Comment Edited] (SPARK-17086) QuantileDiscretizer throws InvalidArgumentException (parameter splits given invalid value) on valid data

2019-02-06 Thread Matias Rotenberg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761953#comment-16761953 ] Matias Rotenberg edited comment on SPARK-17086 at 2/6/19 5:41 PM:

[jira] [Commented] (SPARK-17086) QuantileDiscretizer throws InvalidArgumentException (parameter splits given invalid value) on valid data

2019-02-06 Thread Matias Rotenberg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761953#comment-16761953 ] Matias Rotenberg commented on SPARK-17086: -- I seem to be running into this exac

[jira] [Commented] (SPARK-26509) Parquet DELTA_BYTE_ARRAY is not supported in Spark 2.x's Vectorized Reader

2019-02-06 Thread Nandor Kollar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761927#comment-16761927 ] Nandor Kollar commented on SPARK-26509: --- [~yumwang] here's an example to reproduce

[jira] [Updated] (SPARK-26665) BlockTransferService.fetchBlockSync may hang forever

2019-02-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26665: -- Fix Version/s: (was: 2.3.4) 2.3.3 > BlockTransferService.fetchBlockSync may han

[jira] [Updated] (SPARK-26751) HiveSessionImpl might have memory leak since Operation do not close properly

2019-02-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26751: -- Fix Version/s: (was: 2.3.4) 2.3.3 > HiveSessionImpl might have memory leak sinc

[jira] [Updated] (SPARK-26228) OOM issue encountered when computing Gramian matrix

2019-02-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26228: -- Fix Version/s: (was: 2.3.4) 2.3.3 > OOM issue encountered when computing Gramia

[jira] [Updated] (SPARK-26638) Pyspark vector classes always return error for unary negation

2019-02-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26638: -- Fix Version/s: (was: 2.3.4) 2.3.3 > Pyspark vector classes always return error

[jira] [Updated] (SPARK-26351) Documented formula of precision at k does not match the actual code

2019-02-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26351: -- Fix Version/s: (was: 2.3.4) 2.3.3 > Documented formula of precision at k does n

[jira] [Updated] (SPARK-24740) PySpark tests do not pass with NumPy 0.14.x+

2019-02-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-24740: -- Fix Version/s: (was: 2.3.4) > PySpark tests do not pass with NumPy 0.14.x+ > -

[jira] [Updated] (SPARK-26806) EventTimeStats.merge doesn't handle "zero.merge(zero)" correctly

2019-02-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26806: -- Fix Version/s: (was: 2.3.4) 2.3.3 > EventTimeStats.merge doesn't handle "zero.m

[jira] [Updated] (SPARK-26757) GraphX EdgeRDDImpl and VertexRDDImpl `count` method cannot handle empty RDDs

2019-02-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26757: -- Fix Version/s: (was: 2.3.4) 2.3.3 > GraphX EdgeRDDImpl and VertexRDDImpl `count

[jira] [Resolved] (SPARK-26734) StackOverflowError on WAL serialization caused by large receivedBlockQueue

2019-02-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26734. --- Resolution: Fixed Fix Version/s: 2.3.4 2.4.1 3.0.0 Issu

[jira] [Assigned] (SPARK-26734) StackOverflowError on WAL serialization caused by large receivedBlockQueue

2019-02-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26734: - Assignee: Ross M. Lodge > StackOverflowError on WAL serialization caused by large receivedBlock

[jira] [Assigned] (SPARK-26837) Pruning nested fields from object serializers

2019-02-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26837: Assignee: (was: Apache Spark) > Pruning nested fields from object serializers > -

[jira] [Updated] (SPARK-26836) Columns get switched in Spark SQL using Avro backed Hive table if schema evolves

2019-02-06 Thread Tamas Nemeth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tamas Nemeth updated SPARK-26836: - Priority: Critical (was: Minor) > Columns get switched in Spark SQL using Avro backed Hive tabl

[jira] [Assigned] (SPARK-26837) Pruning nested fields from object serializers

2019-02-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26837: Assignee: Apache Spark > Pruning nested fields from object serializers >

[jira] [Updated] (SPARK-26836) Columns get switched in Spark SQL using Avro backed Hive table if schema evolves

2019-02-06 Thread Tamas Nemeth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tamas Nemeth updated SPARK-26836: - Attachment: doctors.avro original.avsc doctors_evolved.json

[jira] [Created] (SPARK-26837) Pruning nested fields from object serializers

2019-02-06 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-26837: --- Summary: Pruning nested fields from object serializers Key: SPARK-26837 URL: https://issues.apache.org/jira/browse/SPARK-26837 Project: Spark Issue Typ

[jira] [Created] (SPARK-26836) Columns get switched in Spark SQL using Avro backed Hive table if schema evolves

2019-02-06 Thread Tamas Nemeth (JIRA)
Tamas Nemeth created SPARK-26836: Summary: Columns get switched in Spark SQL using Avro backed Hive table if schema evolves Key: SPARK-26836 URL: https://issues.apache.org/jira/browse/SPARK-26836 Proj

[jira] [Updated] (SPARK-26834) ERROR org.apache.spark.deploy.history.HistoryServer: RECEIVED SIGNAL TERM

2019-02-06 Thread Amit Samanta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amit Samanta updated SPARK-26834: - Description: Hi, We are using CDH 5.14.0 phoenix -4.14. Hbase - 1.2 spark2-2.1.0cloudera3

[jira] [Updated] (SPARK-26835) Document configuration properties of Spark SQL Generic Load/Save Functions

2019-02-06 Thread Peter Horvath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Horvath updated SPARK-26835: -- Description: Currently the [Generic Load/Save Functions section of Spark SQL documentation|ht

[jira] [Created] (SPARK-26835) Document configuration properties of Spark SQL Generic Load/Save Functions

2019-02-06 Thread Peter Horvath (JIRA)
Peter Horvath created SPARK-26835: - Summary: Document configuration properties of Spark SQL Generic Load/Save Functions Key: SPARK-26835 URL: https://issues.apache.org/jira/browse/SPARK-26835 Project:

[jira] [Commented] (SPARK-26804) Spark sql carries newline char from last csv column when imported

2019-02-06 Thread Raj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761750#comment-16761750 ] Raj commented on SPARK-26804: - Hi [~hyukjin.kwon] Let me know if you need any more detai

[jira] [Commented] (SPARK-25588) SchemaParseException: Can't redefine: list when reading from Parquet

2019-02-06 Thread Nandor Kollar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761734#comment-16761734 ] Nandor Kollar commented on SPARK-25588: --- [~rdakshin] the stacktrace you get is unr

[jira] [Updated] (SPARK-26834) ERROR org.apache.spark.deploy.history.HistoryServer: RECEIVED SIGNAL TERM

2019-02-06 Thread Amit Samanta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amit Samanta updated SPARK-26834: - Issue Type: Bug (was: Question) > ERROR org.apache.spark.deploy.history.HistoryServer: RECEIVED

[jira] [Created] (SPARK-26834) ERROR org.apache.spark.deploy.history.HistoryServer: RECEIVED SIGNAL TERM

2019-02-06 Thread Amit Samanta (JIRA)
Amit Samanta created SPARK-26834: Summary: ERROR org.apache.spark.deploy.history.HistoryServer: RECEIVED SIGNAL TERM Key: SPARK-26834 URL: https://issues.apache.org/jira/browse/SPARK-26834 Project: Sp

[jira] [Commented] (SPARK-26833) Kubernetes RBAC documentation is unclear on exact RBAC requirements

2019-02-06 Thread Rob Vesse (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761667#comment-16761667 ] Rob Vesse commented on SPARK-26833: --- Although not sure the latter is doable. With {{k

[jira] [Updated] (SPARK-26833) Kubernetes RBAC documentation is unclear on exact RBAC requirements

2019-02-06 Thread Rob Vesse (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rob Vesse updated SPARK-26833: -- Description: I've seen a couple of users get bitten by this in informal discussions on GitHub and Sla

[jira] [Updated] (SPARK-26833) Kubernetes RBAC documentation is unclear on exact RBAC requirements

2019-02-06 Thread Rob Vesse (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rob Vesse updated SPARK-26833: -- Description: I've seen a couple of users get bitten by this in informal discussions on GitHub and Sla

[jira] [Updated] (SPARK-26833) Kubernetes RBAC documentation is unclear on exact RBAC requirements

2019-02-06 Thread Rob Vesse (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rob Vesse updated SPARK-26833: -- Description: I've seen a couple of users get bitten by this in informal discussions on GitHub and Sla

[jira] [Created] (SPARK-26833) Kubernetes RBAC documentation is unclear on exact RBAC requirements

2019-02-06 Thread Rob Vesse (JIRA)
Rob Vesse created SPARK-26833: - Summary: Kubernetes RBAC documentation is unclear on exact RBAC requirements Key: SPARK-26833 URL: https://issues.apache.org/jira/browse/SPARK-26833 Project: Spark

[jira] [Assigned] (SPARK-26832) Avoid project creation per record at Python's grouped vectorized UDF

2019-02-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26832: Assignee: (was: Apache Spark) > Avoid project creation per record at Python's grouped

[jira] [Assigned] (SPARK-26832) Avoid project creation per record at Python's grouped vectorized UDF

2019-02-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26832: Assignee: Apache Spark > Avoid project creation per record at Python's grouped vectorized

[jira] [Created] (SPARK-26832) Avoid project creation per record at Python's grouped vectorized UDF

2019-02-06 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-26832: Summary: Avoid project creation per record at Python's grouped vectorized UDF Key: SPARK-26832 URL: https://issues.apache.org/jira/browse/SPARK-26832 Project: Spark

[jira] [Assigned] (SPARK-26831) bin/pyspark: avoid hardcoded `python` command and improve version checks

2019-02-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26831: Assignee: Apache Spark > bin/pyspark: avoid hardcoded `python` command and improve versio

[jira] [Assigned] (SPARK-26831) bin/pyspark: avoid hardcoded `python` command and improve version checks

2019-02-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26831: Assignee: (was: Apache Spark) > bin/pyspark: avoid hardcoded `python` command and imp

[jira] [Commented] (SPARK-26831) bin/pyspark: avoid hardcoded `python` command and improve version checks

2019-02-06 Thread Stefaan Lippens (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761617#comment-16761617 ] Stefaan Lippens commented on SPARK-26831: - However, as [~hyukjin.kwon] noted in

[jira] [Commented] (SPARK-26831) bin/pyspark: avoid hardcoded `python` command and improve version checks

2019-02-06 Thread Stefaan Lippens (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761615#comment-16761615 ] Stefaan Lippens commented on SPARK-26831: - As an initial improvement I propose t

[jira] [Comment Edited] (SPARK-24657) SortMergeJoin may cause SparkOutOfMemory in execution memory because of not cleanup resource when finished the merge join

2019-02-06 Thread Tao Luo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761607#comment-16761607 ] Tao Luo edited comment on SPARK-24657 at 2/6/19 9:31 AM: - Sure:

[jira] [Created] (SPARK-26831) bin/pyspark: avoid hardcoded `python` command and improve version checks

2019-02-06 Thread Stefaan Lippens (JIRA)
Stefaan Lippens created SPARK-26831: --- Summary: bin/pyspark: avoid hardcoded `python` command and improve version checks Key: SPARK-26831 URL: https://issues.apache.org/jira/browse/SPARK-26831 Projec

[jira] [Commented] (SPARK-24657) SortMergeJoin may cause SparkOutOfMemory in execution memory because of not cleanup resource when finished the merge join

2019-02-06 Thread Tao Luo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761607#comment-16761607 ] Tao Luo commented on SPARK-24657: - Sure: {code:java} from pyspark.sql.functions import r

[jira] [Comment Edited] (SPARK-24657) SortMergeJoin may cause SparkOutOfMemory in execution memory because of not cleanup resource when finished the merge join

2019-02-06 Thread Tao Luo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761607#comment-16761607 ] Tao Luo edited comment on SPARK-24657 at 2/6/19 9:22 AM: - Sure:

[jira] [Comment Edited] (SPARK-24657) SortMergeJoin may cause SparkOutOfMemory in execution memory because of not cleanup resource when finished the merge join

2019-02-06 Thread Tao Luo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761607#comment-16761607 ] Tao Luo edited comment on SPARK-24657 at 2/6/19 9:21 AM: - Sure:

[jira] [Created] (SPARK-26830) Arrow optimization in native R function execution at dapply

2019-02-06 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-26830: Summary: Arrow optimization in native R function execution at dapply Key: SPARK-26830 URL: https://issues.apache.org/jira/browse/SPARK-26830 Project: Spark