[jira] [Updated] (SPARK-25276) OutOfMemoryError: GC overhead limit exceeded when using alias

2018-09-10 Thread Ajith S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated SPARK-25276: Summary: OutOfMemoryError: GC overhead limit exceeded when using alias (was: Redundant constrains when

[jira] [Created] (SPARK-25393) Parsing CSV strings in a column

2018-09-10 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-25393: -- Summary: Parsing CSV strings in a column Key: SPARK-25393 URL: https://issues.apache.org/jira/browse/SPARK-25393 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-25393) Parsing CSV strings in a column

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25393: Assignee: Apache Spark > Parsing CSV strings in a column >

[jira] [Commented] (SPARK-25378) ArrayData.toArray assume UTF8String

2018-09-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608865#comment-16608865 ] Hyukjin Kwon commented on SPARK-25378: -- I also agree with leaving this resolved. >

[jira] [Updated] (SPARK-25367) The column attributes obtained by Spark sql are inconsistent with hive

2018-09-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-25367: - Target Version/s: (was: 2.3.1) > The column attributes obtained by Spark sql are inconsistent

[jira] [Updated] (SPARK-25367) The column attributes obtained by Spark sql are inconsistent with hive

2018-09-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-25367: - Priority: Major (was: Critical) > The column attributes obtained by Spark sql are inconsistent

[jira] [Commented] (SPARK-25367) The column attributes obtained by Spark sql are inconsistent with hive

2018-09-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608899#comment-16608899 ] Hyukjin Kwon commented on SPARK-25367: -- (please avoid to set target version and Critical+ which are

[jira] [Updated] (SPARK-25276) OutOfMemoryError: GC overhead limit exceeded when using alias

2018-09-10 Thread Ajith S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated SPARK-25276: Attachment: test.txt > OutOfMemoryError: GC overhead limit exceeded when using alias >

[jira] [Resolved] (SPARK-24999) Reduce unnecessary 'new' memory operations

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24999. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 21968

[jira] [Assigned] (SPARK-24999) Reduce unnecessary 'new' memory operations

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-24999: --- Assignee: caoxuewen > Reduce unnecessary 'new' memory operations >

[jira] [Updated] (SPARK-25276) OutOfMemoryError: GC overhead limit exceeded when using alias

2018-09-10 Thread Ajith S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated SPARK-25276: Description: Attaching a test to reproduce the issue. The test fails with following message

[jira] [Commented] (SPARK-25377) spark sql dataframe cache is invalid

2018-09-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608896#comment-16608896 ] Hyukjin Kwon commented on SPARK-25377: -- Can you post a self-contained reproducer? > spark sql

[jira] [Updated] (SPARK-25276) OutOfMemoryError: GC overhead limit exceeded when using alias

2018-09-10 Thread Ajith S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated SPARK-25276: Description: OutOfMemoryError: GC overhead limit exceeded when using alias When run the sql. attached

[jira] [Resolved] (SPARK-25365) a better way to handle vector index and sparsity in FeatureHasher implementation ?

2018-09-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-25365. -- Resolution: Invalid > a better way to handle vector index and sparsity in FeatureHasher >

[jira] [Commented] (SPARK-25365) a better way to handle vector index and sparsity in FeatureHasher implementation ?

2018-09-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608903#comment-16608903 ] Hyukjin Kwon commented on SPARK-25365: -- Questions should go to mailing list. Please see

[jira] [Commented] (SPARK-25393) Parsing CSV strings in a column

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608921#comment-16608921 ] Apache Spark commented on SPARK-25393: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25393) Parsing CSV strings in a column

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25393: Assignee: (was: Apache Spark) > Parsing CSV strings in a column >

[jira] [Commented] (SPARK-25393) Parsing CSV strings in a column

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608922#comment-16608922 ] Apache Spark commented on SPARK-25393: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Created] (SPARK-25392) [Spark Job History]Inconsistent behaviour for pool details in spark web UI and history server page

2018-09-10 Thread ABHISHEK KUMAR GUPTA (JIRA)
ABHISHEK KUMAR GUPTA created SPARK-25392: Summary: [Spark Job History]Inconsistent behaviour for pool details in spark web UI and history server page Key: SPARK-25392 URL:

[jira] [Commented] (SPARK-25331) Structured Streaming File Sink duplicates records in case of driver failure

2018-09-10 Thread Mihaly Toth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608933#comment-16608933 ] Mihaly Toth commented on SPARK-25331: - I was thinking about how to make FileStreamSink idempotent

[jira] [Commented] (SPARK-25278) Number of output rows metric of union of views is multiplied by their occurrences

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609106#comment-16609106 ] Apache Spark commented on SPARK-25278: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Created] (SPARK-25394) Expose App status metrics as Source

2018-09-10 Thread Stavros Kontopoulos (JIRA)
Stavros Kontopoulos created SPARK-25394: --- Summary: Expose App status metrics as Source Key: SPARK-25394 URL: https://issues.apache.org/jira/browse/SPARK-25394 Project: Spark Issue

[jira] [Commented] (SPARK-25102) Write Spark version information to Parquet file footers

2018-09-10 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609062#comment-16609062 ] Zoltan Ivanfi commented on SPARK-25102: --- Hi [~npoberezkin], Sorry for answering so late, I was on

[jira] [Updated] (SPARK-24958) Report executors' process tree total memory information to heartbeat signals

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24958: Target Version/s: (was: 2.4.0) > Report executors' process tree total memory information to

[jira] [Resolved] (SPARK-25036) Scala 2.12 issues: Compilation error with sbt

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25036. - Resolution: Fixed Fix Version/s: 2.4.0 > Scala 2.12 issues: Compilation error with sbt >

[jira] [Commented] (SPARK-24958) Report executors' process tree total memory information to heartbeat signals

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609211#comment-16609211 ] Wenchen Fan commented on SPARK-24958: - I'm removing the target version since we are not going to

[jira] [Resolved] (SPARK-25278) Number of output rows metric of union of views is multiplied by their occurrences

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25278. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 22284

[jira] [Assigned] (SPARK-25278) Number of output rows metric of union of views is multiplied by their occurrences

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-25278: --- Assignee: Marco Gaido > Number of output rows metric of union of views is multiplied by

[jira] [Commented] (SPARK-25153) Improve error messages for columns with dots/periods

2018-09-10 Thread Fernando Díaz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609061#comment-16609061 ] Fernando Díaz commented on SPARK-25153: --- I will take a look at it. Quick question: Given a 

[jira] [Updated] (SPARK-25220) [K8S] Split out node selector config between driver and executors.

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-25220: Target Version/s: (was: 2.4.0) > [K8S] Split out node selector config between driver and

[jira] [Commented] (SPARK-25220) [K8S] Split out node selector config between driver and executors.

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609208#comment-16609208 ] Wenchen Fan commented on SPARK-25220: - I'm removing the target version, since it's trivial and we

[jira] [Updated] (SPARK-25357) Add metadata to SparkPlanInfo to dump more information like file path to event log

2018-09-10 Thread Lantao Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lantao Jin updated SPARK-25357: --- Summary: Add metadata to SparkPlanInfo to dump more information like file path to event log (was:

[jira] [Updated] (SPARK-25357) Add metadata to SparkPlanInfo to dump more information like file path to event log

2018-09-10 Thread Lantao Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lantao Jin updated SPARK-25357: --- Description: Field {{metadata}} removed from {{SparkPlanInfo}} in SPARK-17701. Corresponding, this

[jira] [Commented] (SPARK-24647) Sink Should Return Writen Offsets For ProgressReporting

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609218#comment-16609218 ] Wenchen Fan commented on SPARK-24647: - I'm removing the target version, since we are not going to

[jira] [Commented] (SPARK-24353) Add support for pod affinity/anti-affinity

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609226#comment-16609226 ] Wenchen Fan commented on SPARK-24353: - I'm removing the target version, since we are not going to

[jira] [Updated] (SPARK-24091) Internally used ConfigMap prevents use of user-specified ConfigMaps carrying Spark configs files

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24091: Target Version/s: (was: 2.4.0) > Internally used ConfigMap prevents use of user-specified

[jira] [Commented] (SPARK-23647) extend hint syntax to support any expression for Python

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609267#comment-16609267 ] Wenchen Fan commented on SPARK-23647: - I'm removing the target version, since we are not going to

[jira] [Updated] (SPARK-23647) extend hint syntax to support any expression for Python

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-23647: Target Version/s: (was: 2.4.0) > extend hint syntax to support any expression for Python >

[jira] [Commented] (SPARK-20715) MapStatuses shouldn't be redundantly stored in both ShuffleMapStage and MapOutputTracker

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609290#comment-16609290 ] Apache Spark commented on SPARK-20715: -- User 'bersprockets' has created a pull request for this

[jira] [Commented] (SPARK-20715) MapStatuses shouldn't be redundantly stored in both ShuffleMapStage and MapOutputTracker

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609291#comment-16609291 ] Apache Spark commented on SPARK-20715: -- User 'bersprockets' has created a pull request for this

[jira] [Updated] (SPARK-21395) Spark SQL hive-thriftserver doesn't register operation log before execute sql statement

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-21395: Target Version/s: (was: 2.4.0) > Spark SQL hive-thriftserver doesn't register operation log

[jira] [Commented] (SPARK-21395) Spark SQL hive-thriftserver doesn't register operation log before execute sql statement

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609298#comment-16609298 ] Wenchen Fan commented on SPARK-21395: - I'm removing the target version, since we are not going to

[jira] [Commented] (SPARK-23597) Audit Spark SQL code base for non-interpreted expressions

2018-09-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609411#comment-16609411 ] Marco Gaido commented on SPARK-23597: - I haven't, [~hvanhovell]? > Audit Spark SQL code base for

[jira] [Updated] (SPARK-25396) Read array of JSON objects via an Iterator

2018-09-10 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-25396: --- Description: If a JSON file has a structure like below: {code} [ {

[jira] [Commented] (SPARK-25394) Expose App status metrics as Source

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609214#comment-16609214 ] Apache Spark commented on SPARK-25394: -- User 'skonto' has created a pull request for this issue:

[jira] [Commented] (SPARK-24656) SparkML Transformers and Estimators with multiple columns

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609213#comment-16609213 ] Wenchen Fan commented on SPARK-24656: - I'm removing the target version, since no one is working on

[jira] [Assigned] (SPARK-25394) Expose App status metrics as Source

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25394: Assignee: Apache Spark > Expose App status metrics as Source >

[jira] [Updated] (SPARK-24550) Add support for Kubernetes specific metrics

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24550: Target Version/s: (was: 2.4.0) > Add support for Kubernetes specific metrics >

[jira] [Updated] (SPARK-24647) Sink Should Return Writen Offsets For ProgressReporting

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24647: Target Version/s: (was: 2.4.0) > Sink Should Return Writen Offsets For ProgressReporting >

[jira] [Commented] (SPARK-24225) Support closing AutoClosable objects in MemoryStore so Broadcast Variables can be released properly

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609238#comment-16609238 ] Wenchen Fan commented on SPARK-24225: - I'm removing the target version, since we are not going to

[jira] [Commented] (SPARK-24156) Enable no-data micro batches for more eager streaming state clean up

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609241#comment-16609241 ] Wenchen Fan commented on SPARK-24156: - I'm resolving it, since all subtasks are resolved. > Enable

[jira] [Resolved] (SPARK-23899) Built-in SQL Function Improvement

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23899. - Resolution: Fixed Fix Version/s: 2.4.0 > Built-in SQL Function Improvement >

[jira] [Commented] (SPARK-23899) Built-in SQL Function Improvement

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609258#comment-16609258 ] Wenchen Fan commented on SPARK-23899: - I'm resolving it, since there is only one subtask unfinished,

[jira] [Commented] (SPARK-23171) Reduce the time costs of the rule runs that do not change the plans

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609279#comment-16609279 ] Wenchen Fan commented on SPARK-23171: - I'm removing the target version, since no progress so far. >

[jira] [Commented] (SPARK-21972) Allow users to control input data persistence in ML Estimators via a handlePersistence ml.Param

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609296#comment-16609296 ] Wenchen Fan commented on SPARK-21972: - I'm removing the target version, since we are not going to

[jira] [Updated] (SPARK-21972) Allow users to control input data persistence in ML Estimators via a handlePersistence ml.Param

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-21972: Target Version/s: (was: 2.4.0) > Allow users to control input data persistence in ML Estimators

[jira] [Commented] (SPARK-21291) R bucketBy partitionBy API

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609325#comment-16609325 ] Wenchen Fan commented on SPARK-21291: - I'm removing the target version, since no one is working on

[jira] [Updated] (SPARK-21291) R bucketBy partitionBy API

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-21291: Target Version/s: (was: 2.4.0) > R bucketBy partitionBy API > -- > >

[jira] [Assigned] (SPARK-25395) Remove Spark Optional Java API

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25395: Assignee: (was: Apache Spark) > Remove Spark Optional Java API >

[jira] [Commented] (SPARK-25395) Remove Spark Optional Java API

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609356#comment-16609356 ] Apache Spark commented on SPARK-25395: -- User 'mmolimar' has created a pull request for this issue:

[jira] [Commented] (SPARK-25376) Scenarios we should handle but missed in 2.4 for barrier execution mode

2018-09-10 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609391#comment-16609391 ] Imran Rashid commented on SPARK-25376: -- I raised some of my concerns on one of the earlier PRs,

[jira] [Updated] (SPARK-25378) ArrayData.toArray(StringType) assume UTF8String in 2.4

2018-09-10 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-25378: -- Summary: ArrayData.toArray(StringType) assume UTF8String in 2.4 (was: ArrayData.toArray

[jira] [Created] (SPARK-25396) Read array of JSON objects via an Iterator

2018-09-10 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-25396: -- Summary: Read array of JSON objects via an Iterator Key: SPARK-25396 URL: https://issues.apache.org/jira/browse/SPARK-25396 Project: Spark Issue Type:

[jira] [Commented] (SPARK-25396) Read array of JSON objects via an Iterator

2018-09-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609479#comment-16609479 ] Hyukjin Kwon commented on SPARK-25396: -- At that time, there's no multiple mode or json functions.

[jira] [Updated] (SPARK-24656) SparkML Transformers and Estimators with multiple columns

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24656: Target Version/s: (was: 2.4.0) > SparkML Transformers and Estimators with multiple columns >

[jira] [Assigned] (SPARK-25394) Expose App status metrics as Source

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25394: Assignee: (was: Apache Spark) > Expose App status metrics as Source >

[jira] [Updated] (SPARK-24498) Add JDK compiler for runtime codegen

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24498: Target Version/s: (was: 2.4.0) > Add JDK compiler for runtime codegen >

[jira] [Updated] (SPARK-23712) Investigate replacing Code Generated UnsafeRowJoiner with an Interpreted version

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-23712: Target Version/s: (was: 2.4.0) > Investigate replacing Code Generated UnsafeRowJoiner with an

[jira] [Commented] (SPARK-23712) Investigate replacing Code Generated UnsafeRowJoiner with an Interpreted version

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609264#comment-16609264 ] Wenchen Fan commented on SPARK-23712: - I'm removing the target version, since we are not going to

[jira] [Commented] (SPARK-23580) Interpreted mode fallback should be implemented for all expressions & projections

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609271#comment-16609271 ] Wenchen Fan commented on SPARK-23580: - Shall we resolve the ticket? It seems 90% done. >

[jira] [Created] (SPARK-25395) Remove Spark Optional Java API

2018-09-10 Thread Mario Molina (JIRA)
Mario Molina created SPARK-25395: Summary: Remove Spark Optional Java API Key: SPARK-25395 URL: https://issues.apache.org/jira/browse/SPARK-25395 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-25396) Read array of JSON objects via an Iterator

2018-09-10 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609469#comment-16609469 ] Maxim Gekk commented on SPARK-25396: [~hyukjin.kwon] WDYT > Read array of JSON objects via an

[jira] [Commented] (SPARK-24550) Add support for Kubernetes specific metrics

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609220#comment-16609220 ] Wenchen Fan commented on SPARK-24550: - I'm removing the target version, since no one is working on

[jira] [Updated] (SPARK-25394) Expose App status metrics as Source

2018-09-10 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-25394: Description: ApplicationListener in Spark core captures useful metrics like job

[jira] [Commented] (SPARK-24498) Add JDK compiler for runtime codegen

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609223#comment-16609223 ] Wenchen Fan commented on SPARK-24498: - I'm removing the target version, since we are not going to

[jira] [Resolved] (SPARK-24156) Enable no-data micro batches for more eager streaming state clean up

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24156. - Resolution: Fixed Fix Version/s: 2.4.0 > Enable no-data micro batches for more eager

[jira] [Updated] (SPARK-24225) Support closing AutoClosable objects in MemoryStore so Broadcast Variables can be released properly

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24225: Target Version/s: (was: 2.4.0) > Support closing AutoClosable objects in MemoryStore so

[jira] [Commented] (SPARK-24091) Internally used ConfigMap prevents use of user-specified ConfigMaps carrying Spark configs files

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609242#comment-16609242 ] Wenchen Fan commented on SPARK-24091: - I'm removing the target version, since we are not going to

[jira] [Commented] (SPARK-23483) Feature parity for Python vs Scala APIs

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609273#comment-16609273 ] Wenchen Fan commented on SPARK-23483: - Can we resolve this ticket? seems 90% done. > Feature parity

[jira] [Commented] (SPARK-23153) Support application dependencies in submission client's local file system

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609283#comment-16609283 ] Wenchen Fan commented on SPARK-23153: - I'm removing the target version, since no one is working on

[jira] [Updated] (SPARK-23153) Support application dependencies in submission client's local file system

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-23153: Target Version/s: (was: 2.4.0) > Support application dependencies in submission client's local

[jira] [Updated] (SPARK-22798) Add multiple column support to PySpark StringIndexer

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-22798: Target Version/s: (was: 2.4.0) > Add multiple column support to PySpark StringIndexer >

[jira] [Updated] (SPARK-22796) Add multiple column support to PySpark QuantileDiscretizer

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-22796: Target Version/s: (was: 2.4.0) > Add multiple column support to PySpark QuantileDiscretizer >

[jira] [Commented] (SPARK-22796) Add multiple column support to PySpark QuantileDiscretizer

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609286#comment-16609286 ] Wenchen Fan commented on SPARK-22796: - I'm removing the target version, since no progress yet. >

[jira] [Commented] (SPARK-22055) Port release scripts

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609294#comment-16609294 ] Wenchen Fan commented on SPARK-22055: - I'm removing the target version, since we can't make it

[jira] [Updated] (SPARK-22055) Port release scripts

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-22055: Target Version/s: (was: 2.4.0) > Port release scripts > > >

[jira] [Updated] (SPARK-24353) Add support for pod affinity/anti-affinity

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24353: Target Version/s: (was: 2.4.0) > Add support for pod affinity/anti-affinity >

[jira] [Commented] (SPARK-24253) DataSourceV2: Add DeleteSupport for delete and overwrite operations

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609231#comment-16609231 ] Wenchen Fan commented on SPARK-24253: - I'm removing the target version, since we are not going to

[jira] [Updated] (SPARK-24253) DataSourceV2: Add DeleteSupport for delete and overwrite operations

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24253: Target Version/s: (was: 2.4.0) > DataSourceV2: Add DeleteSupport for delete and overwrite

[jira] [Updated] (SPARK-23906) Add UDF trunc(numeric)

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-23906: Target Version/s: (was: 2.4.0) > Add UDF trunc(numeric) > -- > >

[jira] [Commented] (SPARK-23906) Add UDF trunc(numeric)

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609251#comment-16609251 ] Wenchen Fan commented on SPARK-23906: - I'm removing the target version, since we are not going to

[jira] [Commented] (SPARK-23597) Audit Spark SQL code base for non-interpreted expressions

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609269#comment-16609269 ] Wenchen Fan commented on SPARK-23597: - Is it done? cc [~hvanhovell] [~viirya] [~mgaido] > Audit

[jira] [Commented] (SPARK-23243) Shuffle+Repartition on an RDD could lead to incorrect answers

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609289#comment-16609289 ] Apache Spark commented on SPARK-23243: -- User 'bersprockets' has created a pull request for this

[jira] [Commented] (SPARK-21940) Support timezone for timestamps in SparkR

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609297#comment-16609297 ] Wenchen Fan commented on SPARK-21940: - I'm removing the target version, since no one is working on

[jira] [Updated] (SPARK-21940) Support timezone for timestamps in SparkR

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-21940: Target Version/s: (was: 2.4.0) > Support timezone for timestamps in SparkR >

[jira] [Assigned] (SPARK-25395) Remove Spark Optional Java API

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25395: Assignee: Apache Spark > Remove Spark Optional Java API > --

[jira] [Commented] (SPARK-24429) Add support for spark.driver.extraJavaOptions in cluster mode for Spark on K8s

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609225#comment-16609225 ] Wenchen Fan commented on SPARK-24429: - I'm removing the target version, since no one is working on

[jira] [Updated] (SPARK-24429) Add support for spark.driver.extraJavaOptions in cluster mode for Spark on K8s

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24429: Target Version/s: (was: 2.4.0) > Add support for spark.driver.extraJavaOptions in cluster mode

[jira] [Commented] (SPARK-22798) Add multiple column support to PySpark StringIndexer

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609285#comment-16609285 ] Wenchen Fan commented on SPARK-22798: - I'm removing the target version, since no progress yet. >

[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609292#comment-16609292 ] Wenchen Fan commented on SPARK-22632: - Is this still a problem now? > Fix the behavior of timestamp
