[jira] [Created] (SPARK-48711) OOM killer may leave SparkContext in broken state causing ConnectionRefusedError

2024-06-25 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-48711: - Summary: OOM killer may leave SparkContext in broken state causing ConnectionRefusedError Key: SPARK-48711 URL: https://issues.apache.org/jira/browse/SPARK-48711

[jira] [Commented] (SPARK-44003) DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager

2023-12-11 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795227#comment-17795227 ] Rafal Wojdyla commented on SPARK-44003: --- Reported this issue in Parquet:

[jira] [Comment Edited] (SPARK-44003) DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager

2023-12-10 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795101#comment-17795101 ] Rafal Wojdyla edited comment on SPARK-44003 at 12/10/23 7:09 PM: - Coming

[jira] [Commented] (SPARK-44003) DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager

2023-12-10 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795102#comment-17795102 ] Rafal Wojdyla commented on SPARK-44003: --- In context of {{DynamicPartitionDataConcurrentWriter}}

[jira] [Comment Edited] (SPARK-44003) DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager

2023-12-10 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795101#comment-17795101 ] Rafal Wojdyla edited comment on SPARK-44003 at 12/10/23 7:02 PM: - Coming

[jira] [Commented] (SPARK-44003) DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager

2023-12-10 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795101#comment-17795101 ] Rafal Wojdyla commented on SPARK-44003: --- Coming back briefly to this issue, and to elaborate on

[jira] [Updated] (SPARK-44003) DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager

2023-06-10 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-44003: -- Description: We have a pyspark job that writes to a partitioned parquet dataset via:

[jira] [Commented] (SPARK-44003) DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager

2023-06-08 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730541#comment-17730541 ] Rafal Wojdyla commented on SPARK-44003: --- Another question: is there a strict requirement to have a

[jira] [Commented] (SPARK-44003) DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager

2023-06-07 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730358#comment-17730358 ] Rafal Wojdyla commented on SPARK-44003: --- I believe {{MemoryManager}} has too strict

[jira] [Updated] (SPARK-44003) DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager

2023-06-07 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-44003: -- Description: We have a pyspark job that writes to a partitioned parquet dataset via:

[jira] [Updated] (SPARK-44003) DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager

2023-06-07 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-44003: -- Description: We have a pyspark job that writes to a partitioned parquet dataset via:

[jira] [Commented] (SPARK-44003) DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager

2023-06-07 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730353#comment-17730353 ] Rafal Wojdyla commented on SPARK-44003: --- And here's the [lock

[jira] [Created] (SPARK-44003) DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager

2023-06-07 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-44003: - Summary: DynamicPartitionDataSingleWriter is being starved by Parquet MemoryManager Key: SPARK-44003 URL: https://issues.apache.org/jira/browse/SPARK-44003

[jira] [Commented] (SPARK-40363) Add SQL misc function to assert/check column value

2022-09-06 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601090#comment-17601090 ] Rafal Wojdyla commented on SPARK-40363: --- Hey [~hyukjin.kwon] yea, the implementation is simple,

[jira] [Created] (SPARK-40363) Add SQL misc function to assert/check column value

2022-09-06 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-40363: - Summary: Add SQL misc function to assert/check column value Key: SPARK-40363 URL: https://issues.apache.org/jira/browse/SPARK-40363 Project: Spark Issue

[jira] [Commented] (SPARK-37609) Transient StackOverflowError on DataFrame from Catalyst QueryPlan

2022-08-29 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17597026#comment-17597026 ] Rafal Wojdyla commented on SPARK-37609: --- Experienced another issue like this, this time the query

[jira] [Commented] (SPARK-38904) Low cost DataFrame schema swap util

2022-04-18 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523636#comment-17523636 ] Rafal Wojdyla commented on SPARK-38904: --- [~hyukjin.kwon] ok, will give it a shot, and ping you if

[jira] [Commented] (SPARK-38904) Low cost DataFrame schema swap util

2022-04-14 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17522610#comment-17522610 ] Rafal Wojdyla commented on SPARK-38904: --- [~hyukjin.kwon] thanks for the comment, sounds good to

[jira] [Updated] (SPARK-38904) Low cost DataFrame schema swap util

2022-04-14 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-38904: -- Description: This question is related to [https://stackoverflow.com/a/37090151/1661491].

[jira] [Updated] (SPARK-38904) Low cost DataFrame schema swap util

2022-04-14 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-38904: -- Description: This question is related to [https://stackoverflow.com/a/37090151/1661491].

[jira] [Updated] (SPARK-38904) Low cost DataFrame schema swap util

2022-04-14 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-38904: -- Description: This question is related to [https://stackoverflow.com/a/37090151/1661491].

[jira] [Updated] (SPARK-38904) Low cost DataFrame schema swap util

2022-04-14 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-38904: -- Description: This question is related to [https://stackoverflow.com/a/37090151/1661491].

[jira] [Updated] (SPARK-38904) Low cost DataFrame schema swap util

2022-04-14 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-38904: -- Description: This question is related to [https://stackoverflow.com/a/37090151/1661491].

[jira] [Created] (SPARK-38904) Low cost DataFrame schema swap util

2022-04-14 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-38904: - Summary: Low cost DataFrame schema swap util Key: SPARK-38904 URL: https://issues.apache.org/jira/browse/SPARK-38904 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-38438) Can't update spark.jars.packages on existing global/default context

2022-04-14 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17522176#comment-17522176 ] Rafal Wojdyla commented on SPARK-38438: --- For posterity see the context in

[jira] [Commented] (SPARK-2868) Support named accumulators in Python

2022-03-25 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512540#comment-17512540 ] Rafal Wojdyla commented on SPARK-2868: -- Is there a better issue to track the work on named

[jira] [Comment Edited] (SPARK-38438) Can't update spark.jars.packages on existing global/default context

2022-03-09 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502633#comment-17502633 ] Rafal Wojdyla edited comment on SPARK-38438 at 3/9/22, 12:35 PM: - The

[jira] [Updated] (SPARK-38438) Can't update spark.jars.packages on existing global/default context

2022-03-09 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-38438: -- Description: Reproduction: {code:python} from pyspark.sql import SparkSession # default

[jira] [Updated] (SPARK-38438) Can't update spark.jars.packages on existing global/default context

2022-03-08 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-38438: -- Description: Reproduction: {code:python} from pyspark.sql import SparkSession # default

[jira] [Comment Edited] (SPARK-38438) Can't update spark.jars.packages on existing global/default context

2022-03-07 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502633#comment-17502633 ] Rafal Wojdyla edited comment on SPARK-38438 at 3/8/22, 4:33 AM: The

[jira] [Updated] (SPARK-38438) Can't update spark.jars.packages on existing global/default context

2022-03-07 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-38438: -- Issue Type: Bug (was: New Feature) > Can't update spark.jars.packages on existing

[jira] [Commented] (SPARK-38438) Can't update spark.jars.packages on existing global/default context

2022-03-07 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502633#comment-17502633 ] Rafal Wojdyla commented on SPARK-38438: --- The workaround actually doesn't stop the existing JVM, it

[jira] [Updated] (SPARK-38438) Can't update spark.jars.packages on existing global/default context

2022-03-07 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-38438: -- Description: Reproduction: {code:python} from pyspark.sql import SparkSession # default

[jira] [Created] (SPARK-38438) Can't update spark.jars.packages on existing global/default context

2022-03-07 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-38438: - Summary: Can't update spark.jars.packages on existing global/default context Key: SPARK-38438 URL: https://issues.apache.org/jira/browse/SPARK-38438 Project: Spark

[jira] [Comment Edited] (SPARK-37782) Make DataFrame.transform take the parameters for the function.

2021-12-30 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466982#comment-17466982 ] Rafal Wojdyla edited comment on SPARK-37782 at 12/30/21, 8:05 PM: --

[jira] [Commented] (SPARK-37782) Make DataFrame.transform take the parameters for the function.

2021-12-30 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466982#comment-17466982 ] Rafal Wojdyla commented on SPARK-37782: --- Duplicate of

[jira] [Commented] (SPARK-37609) Transient StackOverflowError on DataFrame from Catalyst QueryPlan

2021-12-13 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17458691#comment-17458691 ] Rafal Wojdyla commented on SPARK-37609: --- [~hyukjin.kwon] yep, understand that, if I have some time

[jira] [Updated] (SPARK-37610) Anonymized/obfuscated query plan?

2021-12-10 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-37610: -- Description: I would like to share a query plan for a specific

[jira] [Commented] (SPARK-37609) Transient StackOverflowError on DataFrame from Catalyst QueryPlan

2021-12-10 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457483#comment-17457483 ] Rafal Wojdyla commented on SPARK-37609: --- [~yumwang] I don't have a public code to share. Also

[jira] [Created] (SPARK-37610) Anonymized/obfuscated query plan?

2021-12-10 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-37610: - Summary: Anonymized/obfuscated query plan? Key: SPARK-37610 URL: https://issues.apache.org/jira/browse/SPARK-37610 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-37609) Transient StackOverflowError on DataFrame from Catalyst QueryPlan

2021-12-10 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-37609: -- Environment: py:3.9 > Transient StackOverflowError on DataFrame from Catalyst QueryPlan >

[jira] [Created] (SPARK-37609) Transient StackOverflowError on DataFrame from Catalyst QueryPlan

2021-12-10 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-37609: - Summary: Transient StackOverflowError on DataFrame from Catalyst QueryPlan Key: SPARK-37609 URL: https://issues.apache.org/jira/browse/SPARK-37609 Project: Spark

[jira] [Updated] (SPARK-37601) Could/should sql.DataFrame.transform accept function parameters?

2021-12-09 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-37601: -- Description: {code:python} def foo(df: DataFrame, p: int) -> DataFrame ... # current:

[jira] [Updated] (SPARK-37601) Could/should sql.DataFrame.transform accept function parameters?

2021-12-09 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-37601: -- Description: {code:python} def foo(df: DataFrame, p: int) -> DataFrame ... # current:

[jira] [Updated] (SPARK-37601) Could/should sql.DataFrame.transform accept function parameters?

2021-12-09 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-37601: -- Description: {code:python} def foo(df: DataFrame, p: int) -> DataFrame ... # current:

[jira] [Updated] (SPARK-37601) Could/should sql.DataFrame.transform accept function parameters?

2021-12-09 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-37601: -- Description: {code:python} def foo(df: DataFrame, p: int) -> DataFrame ... # current:

[jira] [Updated] (SPARK-37601) Could/should sql.DataFrame.transform accept function parameters?

2021-12-09 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-37601: -- Description: {noformat} def foo(df: DataFrame, p: int) -> DataFrame ... # current: from

[jira] [Created] (SPARK-37601) Could/should sql.DataFrame.transform accept function parameters?

2021-12-09 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-37601: - Summary: Could/should sql.DataFrame.transform accept function parameters? Key: SPARK-37601 URL: https://issues.apache.org/jira/browse/SPARK-37601 Project: Spark

[jira] [Commented] (SPARK-37570) mypy breaks on pyspark.pandas.plot.core.Bucketizer

2021-12-08 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17455897#comment-17455897 ] Rafal Wojdyla commented on SPARK-37570: --- A workaround in {{setup.cfg}}: {noformat}

[jira] [Updated] (SPARK-37570) mypy breaks on pyspark.pandas.plot.core.Bucketizer

2021-12-08 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-37570: -- Environment: Mypy version: v0.910-1 Py: 3.9 > mypy breaks on

[jira] [Updated] (SPARK-37570) mypy breaks on pyspark.pandas.plot.core.Bucketizer

2021-12-08 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-37570: -- Environment: Mypy: v0.910-1 Py: 3.9 was: Mypy version: v0.910-1 Py: 3.9 > mypy breaks on

[jira] [Updated] (SPARK-37570) mypy breaks on pyspark.pandas.plot.core.Bucketizer

2021-12-08 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-37570: -- Description: Mypy breaks on a project with pyspark 3.2.0 dependency (worked fine for 3.1.2),

[jira] [Created] (SPARK-37577) ClassCastException: ArrayType cannot be cast to StructType

2021-12-08 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-37577: - Summary: ClassCastException: ArrayType cannot be cast to StructType Key: SPARK-37577 URL: https://issues.apache.org/jira/browse/SPARK-37577 Project: Spark

[jira] [Updated] (SPARK-37570) mypy breaks on pyspark.pandas.plot.core.Bucketizer

2021-12-07 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-37570: -- Description: Mypy breaks on a project with pyspark 3.2.0 dependency (worked fine for 3.1.2),

[jira] [Created] (SPARK-37570) mypy breaks on pyspark.pandas.plot.core.Bucketizer

2021-12-07 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-37570: - Summary: mypy breaks on pyspark.pandas.plot.core.Bucketizer Key: SPARK-37570 URL: https://issues.apache.org/jira/browse/SPARK-37570 Project: Spark Issue

[jira] [Comment Edited] (SPARK-35386) parquet read with schema should fail on non-existing columns

2021-05-16 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345821#comment-17345821 ] Rafal Wojdyla edited comment on SPARK-35386 at 5/17/21, 5:53 AM: -

[jira] [Comment Edited] (SPARK-35386) parquet read with schema should fail on non-existing columns

2021-05-16 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345821#comment-17345821 ] Rafal Wojdyla edited comment on SPARK-35386 at 5/17/21, 5:52 AM: -

[jira] [Comment Edited] (SPARK-35386) parquet read with schema should fail on non-existing columns

2021-05-16 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345821#comment-17345821 ] Rafal Wojdyla edited comment on SPARK-35386 at 5/17/21, 5:51 AM: -

[jira] [Comment Edited] (SPARK-35386) parquet read with schema should fail on non-existing columns

2021-05-16 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345821#comment-17345821 ] Rafal Wojdyla edited comment on SPARK-35386 at 5/17/21, 2:00 AM: -

[jira] [Commented] (SPARK-35386) parquet read with schema should fail on non-existing columns

2021-05-16 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345821#comment-17345821 ] Rafal Wojdyla commented on SPARK-35386: --- {quote} IIRC, it's not documented. We might think about

[jira] [Commented] (SPARK-35386) parquet read with schema should fail on non-existing columns

2021-05-16 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345811#comment-17345811 ] Rafal Wojdyla commented on SPARK-35386: --- {quote} I think the logic is that, when users specify a

[jira] [Commented] (SPARK-35386) parquet read with schema should fail on non-existing columns

2021-05-16 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345645#comment-17345645 ] Rafal Wojdyla commented on SPARK-35386: --- [~hyukjin.kwon] Thanks for the prompt reply. But in the

[jira] [Updated] (SPARK-35386) parquet read with schema should fail on non-existing columns

2021-05-12 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-35386: -- Description: When read schema is specified as a user I would prefer/like if spark failed on

[jira] [Updated] (SPARK-35386) parquet read with schema should fail on non-existing columns

2021-05-12 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-35386: -- Description: When read schema is specified as a user I would prefer/like if spark failed on

[jira] [Created] (SPARK-35386) parquet read with schema should fail on non-existing columns

2021-05-12 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-35386: - Summary: parquet read with schema should fail on non-existing columns Key: SPARK-35386 URL: https://issues.apache.org/jira/browse/SPARK-35386 Project: Spark

[jira] [Commented] (SPARK-34544) pyspark toPandas() should return pd.DataFrame

2021-03-09 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298016#comment-17298016 ] Rafal Wojdyla commented on SPARK-34544: --- [~zero323] yea, I see, in nature that suggestion is

[jira] [Commented] (SPARK-34629) Python type hints improvement

2021-03-04 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295753#comment-17295753 ] Rafal Wojdyla commented on SPARK-34629: --- Thanks [~hyukjin.kwon]! > Python type hints improvement

[jira] [Comment Edited] (SPARK-34544) pyspark toPandas() should return pd.DataFrame

2021-03-01 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17292987#comment-17292987 ] Rafal Wojdyla edited comment on SPARK-34544 at 3/1/21, 4:40 PM: 

[jira] [Commented] (SPARK-34544) pyspark toPandas() should return pd.DataFrame

2021-03-01 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293013#comment-17293013 ] Rafal Wojdyla commented on SPARK-34544: --- [~zero323] I appreciate your prompt answers. Re:

[jira] [Commented] (SPARK-34544) pyspark toPandas() should return pd.DataFrame

2021-03-01 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17292987#comment-17292987 ] Rafal Wojdyla commented on SPARK-34544: ---  [~zero323] > it is more a dev utility than user a

[jira] [Created] (SPARK-34544) pyspark toPandas() should return pd.DataFrame

2021-02-25 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-34544: - Summary: pyspark toPandas() should return pd.DataFrame Key: SPARK-34544 URL: https://issues.apache.org/jira/browse/SPARK-34544 Project: Spark Issue Type:

[jira] [Commented] (SPARK-34540) Add convert_dtypes to the DataFrameLike protocol

2021-02-25 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290893#comment-17290893 ] Rafal Wojdyla commented on SPARK-34540: --- This is related to

[jira] [Created] (SPARK-34540) Add convert_dtypes to the DataFrameLike protocol

2021-02-25 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-34540: - Summary: Add convert_dtypes to the DataFrameLike protocol Key: SPARK-34540 URL: https://issues.apache.org/jira/browse/SPARK-34540 Project: Spark Issue

[jira] [Updated] (SPARK-33605) Add GCS FS/connector config (dependencies?) akin to S3

2020-12-09 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-33605: -- Summary: Add GCS FS/connector config (dependencies?) akin to S3 (was: Add GCS FS/connector

[jira] [Comment Edited] (SPARK-33605) Add GCS FS/connector to the dependencies akin to S3

2020-11-30 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240863#comment-17240863 ] Rafal Wojdyla edited comment on SPARK-33605 at 11/30/20, 5:33 PM: --

[jira] [Updated] (SPARK-33605) Add GCS FS/connector to the dependencies akin to S3

2020-11-30 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-33605: -- Description: Spark comes with some S3 batteries included, which makes it easier to use with

[jira] [Comment Edited] (SPARK-33605) Add GCS FS/connector to the dependencies akin to S3

2020-11-30 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240863#comment-17240863 ] Rafal Wojdyla edited comment on SPARK-33605 at 11/30/20, 4:45 PM: --

[jira] [Comment Edited] (SPARK-33605) Add GCS FS/connector to the dependencies akin to S3

2020-11-30 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240863#comment-17240863 ] Rafal Wojdyla edited comment on SPARK-33605 at 11/30/20, 4:45 PM: --

[jira] [Updated] (SPARK-33605) Add GCS FS/connector to the dependencies akin to S3

2020-11-30 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-33605: -- Description: Spark comes with some S3 batteries included, which makes it easier to use with

[jira] [Commented] (SPARK-33605) Add GCS FS/connector to the dependencies akin to S3

2020-11-30 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240863#comment-17240863 ] Rafal Wojdyla commented on SPARK-33605: --- Actually, the pyspark package includes the config for S3

[jira] [Updated] (SPARK-33605) Add GCS FS/connector to the dependencies akin to S3

2020-11-30 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-33605: -- Description: Spark comes with

[jira] [Updated] (SPARK-33605) Add GCS FS/connector to the dependencies akin to S3

2020-11-30 Thread Rafal Wojdyla (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rafal Wojdyla updated SPARK-33605: -- Description: Spark comes with

[jira] [Created] (SPARK-33605) Add GCS FS/connector to the dependencies akin to S3

2020-11-30 Thread Rafal Wojdyla (Jira)
Rafal Wojdyla created SPARK-33605: - Summary: Add GCS FS/connector to the dependencies akin to S3 Key: SPARK-33605 URL: https://issues.apache.org/jira/browse/SPARK-33605 Project: Spark Issue