[jira] [Commented] (SPARK-22393) spark-shell can't find imported types in class constructors, extends clause

2018-04-02 Thread Joe Pallas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423495#comment-16423495 ] Joe Pallas commented on SPARK-22393: Following up to my previous comment: [~mpetruska

[jira] [Commented] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-02 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423491#comment-16423491 ] Henry Robinson commented on SPARK-23852: Partly, but not completely. If the colum

[jira] [Commented] (SPARK-21870) Split codegen'd aggregation code into small functions for the HotSpot

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423489#comment-16423489 ] Apache Spark commented on SPARK-21870: -- User 'maropu' has created a pull request for

[jira] [Assigned] (SPARK-21870) Split codegen'd aggregation code into small functions for the HotSpot

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21870: Assignee: Apache Spark > Split codegen'd aggregation code into small functions for the Hot

[jira] [Assigned] (SPARK-21870) Split codegen'd aggregation code into small functions for the HotSpot

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21870: Assignee: (was: Apache Spark) > Split codegen'd aggregation code into small functions

[jira] [Updated] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-02 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23852: Labels: correctness (was: ) > Parquet MR bug can lead to incorrect SQL results > -

[jira] [Updated] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-02 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23852: Priority: Blocker (was: Critical) > Parquet MR bug can lead to incorrect SQL results > ---

[jira] [Commented] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423479#comment-16423479 ] Reynold Xin commented on SPARK-23852: - Does turning the flag parquet.filter.stats.ena

[jira] [Comment Edited] (SPARK-21453) Cached Kafka consumer may be closed too early

2018-04-02 Thread Daniel Nitzan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423445#comment-16423445 ] Daniel Nitzan edited comment on SPARK-21453 at 4/3/18 4:05 AM:

[jira] [Comment Edited] (SPARK-21453) Cached Kafka consumer may be closed too early

2018-04-02 Thread Daniel Nitzan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423445#comment-16423445 ] Daniel Nitzan edited comment on SPARK-21453 at 4/3/18 4:04 AM:

[jira] [Commented] (SPARK-21453) Cached Kafka consumer may be closed too early

2018-04-02 Thread Daniel Nitzan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423445#comment-16423445 ] Daniel Nitzan commented on SPARK-21453: --- This seems to be an issue with Continuous

[jira] [Commented] (SPARK-23842) accessing java from PySpark lambda functions

2018-04-02 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423439#comment-16423439 ] Ruslan Dautkhanov commented on SPARK-23842: --- [~hyukjin.kwon] Thanks for the rep

[jira] [Commented] (SPARK-19320) Allow guaranteed amount of GPU to be used when launching jobs

2018-04-02 Thread Susan X. Huynh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423382#comment-16423382 ] Susan X. Huynh commented on SPARK-19320: Oh, looks like it does retain the old be

[jira] [Commented] (SPARK-23842) accessing java from PySpark lambda functions

2018-04-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423378#comment-16423378 ] Hyukjin Kwon commented on SPARK-23842: -- How come it's more a Spark issue. spark sess

[jira] [Commented] (SPARK-23839) consider bucket join in cost-based JoinReorder rule

2018-04-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423372#comment-16423372 ] Hyukjin Kwon commented on SPARK-23839: -- cc [~ZenWzh] too. > consider bucket join in

[jira] [Resolved] (SPARK-19964) Flaky test: SparkSubmitSuite "includes jars passed in through --packages"

2018-04-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-19964. -- Resolution: Fixed Assignee: Marcelo Vanzin Target Version/s: 2.3.1, 2.4.0

[jira] [Commented] (SPARK-23810) Matrix Multiplication is so bad, file I/O to local python is better

2018-04-02 Thread dciborow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423320#comment-16423320 ] dciborow commented on SPARK-23810: -- GitHub code for repo. [https://github.com/dciborow/

[jira] [Updated] (SPARK-23810) Matrix Multiplication is so bad, file I/O to local python is better

2018-04-02 Thread dciborow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dciborow updated SPARK-23810: - Attachment: image-2018-04-02-20-34-44-980.png > Matrix Multiplication is so bad, file I/O to local python

[jira] [Resolved] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-04-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro resolved SPARK-23791. -- Resolution: Duplicate > Sub-optimal generated code for sum aggregating > --

[jira] [Commented] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-04-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423312#comment-16423312 ] Takeshi Yamamuro commented on SPARK-23791: -- I checked the root cause of this tic

[jira] [Commented] (SPARK-23828) PySpark StringIndexerModel should have constructor from labels

2018-04-02 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423305#comment-16423305 ] Huaxin Gao commented on SPARK-23828: I am pretty much done with the code. Will submit

[jira] [Resolved] (SPARK-23690) VectorAssembler should have handleInvalid to handle columns with null values

2018-04-02 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-23690. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20829 [h

[jira] [Assigned] (SPARK-23690) VectorAssembler should have handleInvalid to handle columns with null values

2018-04-02 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-23690: - Assignee: yogesh garg > VectorAssembler should have handleInvalid to handle colu

[jira] [Commented] (SPARK-22883) ML test for StructuredStreaming: spark.ml.feature, A-M

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423291#comment-16423291 ] Apache Spark commented on SPARK-22883: -- User 'jkbradley' has created a pull request

[jira] [Commented] (SPARK-21363) Prevent column name duplication in temporary view

2018-04-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423264#comment-16423264 ] Takeshi Yamamuro commented on SPARK-21363: -- What's the concrete case? I think is

[jira] [Resolved] (SPARK-19128) Refresh Metadata Cache After ALTER TABLE SET LOCATION

2018-04-02 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-19128. - Resolution: Not A Problem > Refresh Metadata Cache After ALTER TABLE SET LOCATION > -

[jira] [Updated] (SPARK-23207) Shuffle+Repartition on an DataFrame could lead to incorrect answers

2018-04-02 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23207: Labels: correctness (was: ) > Shuffle+Repartition on an DataFrame could lead to incorrect answers > --

[jira] [Resolved] (SPARK-23834) Flaky test: LauncherServerSuite.testAppHandleDisconnect

2018-04-02 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-23834. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20950 [https:/

[jira] [Assigned] (SPARK-23834) Flaky test: LauncherServerSuite.testAppHandleDisconnect

2018-04-02 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-23834: -- Assignee: Marcelo Vanzin > Flaky test: LauncherServerSuite.testAppHandleDisconnect > -

[jira] [Commented] (SPARK-23504) Flaky test: RateSourceV2Suite.basic microbatch execution

2018-04-02 Thread Jose Torres (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423137#comment-16423137 ] Jose Torres commented on SPARK-23504: - We've replaced the hacky RateSourceV2 implemen

[jira] [Created] (SPARK-23853) Skip doctests which require hive support built in PySpark

2018-04-02 Thread holdenk (JIRA)
holdenk created SPARK-23853: --- Summary: Skip doctests which require hive support built in PySpark Key: SPARK-23853 URL: https://issues.apache.org/jira/browse/SPARK-23853 Project: Spark Issue Type: B

[jira] [Updated] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-02 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson updated SPARK-23852: --- Description: Parquet MR 1.9.0 and 1.8.2 both have a bug, PARQUET-1217, that means that pushi

[jira] [Created] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-02 Thread Henry Robinson (JIRA)
Henry Robinson created SPARK-23852: -- Summary: Parquet MR bug can lead to incorrect SQL results Key: SPARK-23852 URL: https://issues.apache.org/jira/browse/SPARK-23852 Project: Spark Issue Ty

[jira] [Commented] (SPARK-23848) Structured Streaming fails with nested UDTs

2018-04-02 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423084#comment-16423084 ] Joseph K. Bradley commented on SPARK-23848: --- Whoops! Sorry, I should have caug

[jira] [Resolved] (SPARK-23848) Structured Streaming fails with nested UDTs

2018-04-02 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-23848. --- Resolution: Not A Problem > Structured Streaming fails with nested UDTs > ---

[jira] [Created] (SPARK-23851) Investigate pip install edit mode unicode errors

2018-04-02 Thread holdenk (JIRA)
holdenk created SPARK-23851: --- Summary: Investigate pip install edit mode unicode errors Key: SPARK-23851 URL: https://issues.apache.org/jira/browse/SPARK-23851 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-23851) Investigate pip install edit mode unicode errors

2018-04-02 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423081#comment-16423081 ] holdenk commented on SPARK-23851: - Happening with pip 9.0.3 > Investigate pip install ed

[jira] [Commented] (SPARK-23848) Structured Streaming fails with nested UDTs

2018-04-02 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423071#comment-16423071 ] Shixiong Zhu commented on SPARK-23848: -- To fix your code, you can just change "case

[jira] [Commented] (SPARK-23848) Structured Streaming fails with nested UDTs

2018-04-02 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423070#comment-16423070 ] Shixiong Zhu commented on SPARK-23848: -- [~josephkb] this is unfortunately because Da

[jira] [Commented] (SPARK-21363) Prevent column name duplication in temporary view

2018-04-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423048#comment-16423048 ] Reynold Xin commented on SPARK-21363: - How can user drop the fields or rename them af

[jira] [Commented] (SPARK-23835) When Dataset.as converts column from nullable to non-nullable type, null Doubles are converted silently to -1

2018-04-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423046#comment-16423046 ] Michael Armbrust commented on SPARK-23835: -- I believe the correct semantics are

[jira] [Resolved] (SPARK-23713) Clean-up UnsafeWriter classes

2018-04-02 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-23713. --- Resolution: Fixed Assignee: Kazuaki Ishizaki Fix Version/s: 2.4.0 > C

[jira] [Resolved] (SPARK-23285) Allow spark.executor.cores to be fractional

2018-04-02 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anirudh Ramanathan resolved SPARK-23285. Resolution: Fixed > Allow spark.executor.cores to be fractional > -

[jira] [Assigned] (SPARK-23285) Allow spark.executor.cores to be fractional

2018-04-02 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anirudh Ramanathan reassigned SPARK-23285: -- Assignee: Yinan Li > Allow spark.executor.cores to be fractional > ---

[jira] [Commented] (SPARK-19320) Allow guaranteed amount of GPU to be used when launching jobs

2018-04-02 Thread Susan X. Huynh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423002#comment-16423002 ] Susan X. Huynh commented on SPARK-19320: [~yanji84] What happens if spark.mesos.e

[jira] [Updated] (SPARK-22865) Publish Official Apache Spark Docker images

2018-04-02 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anirudh Ramanathan updated SPARK-22865: --- Priority: Major (was: Minor) > Publish Official Apache Spark Docker images > ---

[jira] [Commented] (SPARK-22865) Publish Official Apache Spark Docker images

2018-04-02 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422992#comment-16422992 ] Anirudh Ramanathan commented on SPARK-22865: [~eje] was working on getting th

[jira] [Commented] (SPARK-23680) entrypoint.sh does not accept arbitrary UIDs, returning as an error

2018-04-02 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422987#comment-16422987 ] Anirudh Ramanathan commented on SPARK-23680: [~felixcheung] helped me set up

[jira] [Comment Edited] (SPARK-23680) entrypoint.sh does not accept arbitrary UIDs, returning as an error

2018-04-02 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422987#comment-16422987 ] Anirudh Ramanathan edited comment on SPARK-23680 at 4/2/18 7:04 PM: ---

[jira] [Resolved] (SPARK-23825) [K8s] Spark pods should request memory + memoryOverhead as resources

2018-04-02 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-23825. Resolution: Fixed Fix Version/s: 2.4.0 > [K8s] Spark pods should request memory + memoryOver

[jira] [Assigned] (SPARK-23849) Tests for the samplingRatio option of json schema inferring

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23849: Assignee: (was: Apache Spark) > Tests for the samplingRatio option of json schema infe

[jira] [Commented] (SPARK-23849) Tests for the samplingRatio option of json schema inferring

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422967#comment-16422967 ] Apache Spark commented on SPARK-23849: -- User 'MaxGekk' has created a pull request fo

[jira] [Assigned] (SPARK-23849) Tests for the samplingRatio option of json schema inferring

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23849: Assignee: Apache Spark > Tests for the samplingRatio option of json schema inferring > ---

[jira] [Commented] (SPARK-23850) We should not redact username|user|url from UI by default

2018-04-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422965#comment-16422965 ] Thomas Graves commented on SPARK-23850: --- ping [~ash...@gmail.com] [~onursatici] [~L

[jira] [Created] (SPARK-23850) We should not redact username|user|url from UI by default

2018-04-02 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-23850: - Summary: We should not redact username|user|url from UI by default Key: SPARK-23850 URL: https://issues.apache.org/jira/browse/SPARK-23850 Project: Spark I

[jira] [Created] (SPARK-23849) Tests for the samplingRatio option of json schema inferring

2018-04-02 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-23849: -- Summary: Tests for the samplingRatio option of json schema inferring Key: SPARK-23849 URL: https://issues.apache.org/jira/browse/SPARK-23849 Project: Spark Issu

[jira] [Commented] (SPARK-23801) Consistent SIGSEGV after upgrading to Spark v2.3.0

2018-04-02 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422955#comment-16422955 ] Kazuaki Ishizaki commented on SPARK-23801: -- If you use WebUI, you can easily see

[jira] [Updated] (SPARK-23848) Structured Streaming fails with nested UDTs

2018-04-02 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-23848: -- Description: While trying to write a test for org.apache.spark.ml.feature.MinHashLSHMod

[jira] [Updated] (SPARK-23836) Support returning StructType to the level support in GroupedMap Arrow's "scalar" UDFS (or similar)

2018-04-02 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-23836: Summary: Support returning StructType to the level support in GroupedMap Arrow's "scalar" UDFS (or similar)

[jira] [Commented] (SPARK-23836) Support returning StructType & MapType in Arrow's "scalar" UDFS (or similar)

2018-04-02 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422942#comment-16422942 ] holdenk commented on SPARK-23836: - Oh wait, I misunderstood our support of structype - I

[jira] [Commented] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2018-04-02 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422938#comment-16422938 ] holdenk commented on SPARK-21187: - So Arrays are listed as crossed off but it seems like

[jira] [Commented] (SPARK-23836) Support returning StructType & MapType in Arrow's "scalar" UDFS (or similar)

2018-04-02 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422926#comment-16422926 ] holdenk commented on SPARK-23836: - I'm going to take a quick crack at this this week. >

[jira] [Commented] (SPARK-23836) Support returning StructType & MapType in Arrow's "scalar" UDFS (or similar)

2018-04-02 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422922#comment-16422922 ] holdenk commented on SPARK-23836: - [~hyukjin.kwon] it's a good question, that one seems to

[jira] [Assigned] (SPARK-23847) Add asc_nulls_first, asc_nulls_last to PySpark

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23847: Assignee: (was: Apache Spark) > Add asc_nulls_first, asc_nulls_last to PySpark > -

[jira] [Assigned] (SPARK-23847) Add asc_nulls_first, asc_nulls_last to PySpark

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23847: Assignee: Apache Spark > Add asc_nulls_first, asc_nulls_last to PySpark >

[jira] [Commented] (SPARK-23847) Add asc_nulls_first, asc_nulls_last to PySpark

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422901#comment-16422901 ] Apache Spark commented on SPARK-23847: -- User 'huaxingao' has created a pull request

[jira] [Created] (SPARK-23848) Structured Streaming fails with nested UDTs

2018-04-02 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-23848: - Summary: Structured Streaming fails with nested UDTs Key: SPARK-23848 URL: https://issues.apache.org/jira/browse/SPARK-23848 Project: Spark Issue T

[jira] [Created] (SPARK-23847) Add asc_nulls_first, asc_nulls_last to PySpark

2018-04-02 Thread Huaxin Gao (JIRA)
Huaxin Gao created SPARK-23847: -- Summary: Add asc_nulls_first, asc_nulls_last to PySpark Key: SPARK-23847 URL: https://issues.apache.org/jira/browse/SPARK-23847 Project: Spark Issue Type: Improv

[jira] [Commented] (SPARK-23847) Add asc_nulls_first, asc_nulls_last to PySpark

2018-04-02 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422887#comment-16422887 ] Huaxin Gao commented on SPARK-23847: I will submit a PR soon. > Add asc_nulls_first,

[jira] [Commented] (SPARK-23747) Add EpochCoordinator unit tests

2018-04-02 Thread Efim Poberezkin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422814#comment-16422814 ] Efim Poberezkin commented on SPARK-23747: - Okay, will do, thank you for clarifica

[jira] [Commented] (SPARK-23823) ResolveReferences loses correct origin

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422781#comment-16422781 ] Apache Spark commented on SPARK-23823: -- User 'JiahuiJiang' has created a pull reques

[jira] [Commented] (SPARK-23828) PySpark StringIndexerModel should have constructor from labels

2018-04-02 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422729#comment-16422729 ] Bryan Cutler commented on SPARK-23828: -- No I'm not working on it, please go ahead [~

[jira] [Comment Edited] (SPARK-23747) Add EpochCoordinator unit tests

2018-04-02 Thread Jose Torres (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422613#comment-16422613 ] Jose Torres edited comment on SPARK-23747 at 4/2/18 3:26 PM: -

[jira] [Commented] (SPARK-23747) Add EpochCoordinator unit tests

2018-04-02 Thread Jose Torres (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422613#comment-16422613 ] Jose Torres commented on SPARK-23747: - I mean testing the internal logic. We'd want t

[jira] [Comment Edited] (SPARK-23839) consider bucket join in cost-based JoinReorder rule

2018-04-02 Thread Xiaoju Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1648#comment-1648 ] Xiaoju Wu edited comment on SPARK-23839 at 4/2/18 3:05 PM: --- Yes

[jira] [Commented] (SPARK-19276) FetchFailures can be hidden by user (or sql) exception handling

2018-04-02 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422588#comment-16422588 ] Imran Rashid commented on SPARK-19276: -- Oh thanks for pointing that out [~xchen12138

[jira] [Assigned] (SPARK-23846) samplingRatio for schema inferring of CSV datasource

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23846: Assignee: Apache Spark > samplingRatio for schema inferring of CSV datasource > --

[jira] [Assigned] (SPARK-23846) samplingRatio for schema inferring of CSV datasource

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23846: Assignee: (was: Apache Spark) > samplingRatio for schema inferring of CSV datasource >

[jira] [Commented] (SPARK-23846) samplingRatio for schema inferring of CSV datasource

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422579#comment-16422579 ] Apache Spark commented on SPARK-23846: -- User 'MaxGekk' has created a pull request fo

[jira] [Created] (SPARK-23846) samplingRatio for schema inferring of CSV datasource

2018-04-02 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-23846: -- Summary: samplingRatio for schema inferring of CSV datasource Key: SPARK-23846 URL: https://issues.apache.org/jira/browse/SPARK-23846 Project: Spark Issue Type:

[jira] [Commented] (SPARK-23840) PySpark error when converting a DataFrame to rdd

2018-04-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422509#comment-16422509 ] Hyukjin Kwon commented on SPARK-23840: -- It would be nicer if we can have error messa

[jira] [Commented] (SPARK-23840) PySpark error when converting a DataFrame to rdd

2018-04-02 Thread Uri Goren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422242#comment-16422242 ] Uri Goren commented on SPARK-23840: --- This error does not occur locally, I have checked

[jira] [Commented] (SPARK-23839) consider bucket join in cost-based JoinReorder rule

2018-04-02 Thread Xiaoju Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1648#comment-1648 ] Xiaoju Wu commented on SPARK-23839: --- Yes, bucketing is one of the cases to say that the

[jira] [Commented] (SPARK-12823) Cannot create UDF with StructType input

2018-04-02 Thread Simeon H.K. Fitch (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422189#comment-16422189 ] Simeon H.K. Fitch commented on SPARK-12823: --- [~gbarna]Nice! Thanks for sharing!

[jira] [Commented] (SPARK-23839) consider bucket join in cost-based JoinReorder rule

2018-04-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422177#comment-16422177 ] Takeshi Yamamuro commented on SPARK-23839: -- Yea, I know the case. I just suggest

[jira] [Commented] (SPARK-23839) consider bucket join in cost-based JoinReorder rule

2018-04-02 Thread Xiaoju Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422116#comment-16422116 ] Xiaoju Wu commented on SPARK-23839: --- [~maropu] My concern is, "bucket join always first

[jira] [Updated] (SPARK-23839) consider bucket join in cost-based JoinReorder rule

2018-04-02 Thread Xiaoju Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoju Wu updated SPARK-23839: -- Description: Since spark 2.2, the cost-based JoinReorder rule is implemented and in Spark 2.3 released

[jira] [Commented] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-04-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422095#comment-16422095 ] Takeshi Yamamuro commented on SPARK-23791: -- Reopened (https://issues.apache.org/j

[jira] [Commented] (SPARK-21870) Split codegen'd aggregation code into small functions for the HotSpot

2018-04-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422093#comment-16422093 ] Takeshi Yamamuro commented on SPARK-21870: -- Reopened by the suggestion of [~mgai

[jira] [Reopened] (SPARK-21870) Split codegen'd aggregation code into small functions for the HotSpot

2018-04-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro reopened SPARK-21870: -- > Split codegen'd aggregation code into small functions for the HotSpot > -

[jira] [Commented] (SPARK-23835) When Dataset.as converts column from nullable to non-nullable type, null Doubles are converted silently to -1

2018-04-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422082#comment-16422082 ] Marco Gaido commented on SPARK-23835: - Actually this is not the first time we see thi

[jira] [Commented] (SPARK-23797) SparkSQL performance on small TPCDS tables is very low when compared to Drill or Presto

2018-04-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422078#comment-16422078 ] Takeshi Yamamuro commented on SPARK-23797: -- I think this ticket seems to be inva

[jira] [Commented] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-04-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422074#comment-16422074 ] Takeshi Yamamuro commented on SPARK-23791: -- sure, I will. > Sub-optimal generat

[jira] [Commented] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-04-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422068#comment-16422068 ] Marco Gaido commented on SPARK-23791: - Yes, I think you're right [~maropu]. Do you wa

[jira] [Commented] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-04-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422064#comment-16422064 ] Marco Gaido commented on SPARK-23791: - Thanks, [~rednikotin]. The error you noticed i

[jira] [Commented] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-04-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422062#comment-16422062 ] Takeshi Yamamuro commented on SPARK-23791: -- Probably, this issue might to be rel

[jira] [Created] (SPARK-23845) Continuous rate source uses different offset format

2018-04-02 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-23845: --- Summary: Continuous rate source uses different offset format Key: SPARK-23845 URL: https://issues.apache.org/jira/browse/SPARK-23845 Project: Spark Issue Type:

[jira] [Commented] (SPARK-19842) Informational Referential Integrity Constraints Support in Spark

2018-04-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422041#comment-16422041 ] Takeshi Yamamuro commented on SPARK-19842: -- Probably, the proposed might comply

[jira] [Commented] (SPARK-23844) Socket Stream recovering from checkpoint will throw exception

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422037#comment-16422037 ] Apache Spark commented on SPARK-23844: -- User 'jerryshao' has created a pull request

[jira] [Assigned] (SPARK-23844) Socket Stream recovering from checkpoint will throw exception

2018-04-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23844: Assignee: Apache Spark > Socket Stream recovering from checkpoint will throw exception > -
