Rafal Wojdyla created SPARK-48711:
-
Summary: OOM killer may leave SparkContext in broken state causing
ConnectionRefusedError
Key: SPARK-48711
URL: https://issues.apache.org/jira/browse/SPARK-48711
[
https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795227#comment-17795227
]
Rafal Wojdyla commented on SPARK-44003:
---
Reported this issue in Parquet:
[
https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795101#comment-17795101
]
Rafal Wojdyla edited comment on SPARK-44003 at 12/10/23 7:09 PM:
-
Coming
[
https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795102#comment-17795102
]
Rafal Wojdyla commented on SPARK-44003:
---
In context of {{DynamicPartitionDataConcurrentWriter}}
[
https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795101#comment-17795101
]
Rafal Wojdyla edited comment on SPARK-44003 at 12/10/23 7:02 PM:
-
Coming
[
https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795101#comment-17795101
]
Rafal Wojdyla commented on SPARK-44003:
---
Coming back briefly to this issue, and to elaborate on
[
https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-44003:
--
Description:
We have a pyspark job that writes to a partitioned parquet dataset via:
[
https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730541#comment-17730541
]
Rafal Wojdyla commented on SPARK-44003:
---
Another question: is there a strict requirement to have a
[
https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730358#comment-17730358
]
Rafal Wojdyla commented on SPARK-44003:
---
I believe {{MemoryManager}} has too strict
[
https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-44003:
--
Description:
We have a pyspark job that writes to a partitioned parquet dataset via:
[
https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-44003:
--
Description:
We have a pyspark job that writes to a partitioned parquet dataset via:
[
https://issues.apache.org/jira/browse/SPARK-44003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730353#comment-17730353
]
Rafal Wojdyla commented on SPARK-44003:
---
And here's the [lock
Rafal Wojdyla created SPARK-44003:
-
Summary: DynamicPartitionDataSingleWriter is being starved by
Parquet MemoryManager
Key: SPARK-44003
URL: https://issues.apache.org/jira/browse/SPARK-44003
[
https://issues.apache.org/jira/browse/SPARK-40363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601090#comment-17601090
]
Rafal Wojdyla commented on SPARK-40363:
---
Hey [~hyukjin.kwon] yea, the implementation is simple,
Rafal Wojdyla created SPARK-40363:
-
Summary: Add SQL misc function to assert/check column value
Key: SPARK-40363
URL: https://issues.apache.org/jira/browse/SPARK-40363
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-37609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17597026#comment-17597026
]
Rafal Wojdyla commented on SPARK-37609:
---
Experienced another issue like this, this time the query
[
https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523636#comment-17523636
]
Rafal Wojdyla commented on SPARK-38904:
---
[~hyukjin.kwon] ok, will give it a shot, and ping you if
[
https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17522610#comment-17522610
]
Rafal Wojdyla commented on SPARK-38904:
---
[~hyukjin.kwon] thanks for the comment, sounds good to
[
https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-38904:
--
Description:
This question is related to [https://stackoverflow.com/a/37090151/1661491].
[
https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-38904:
--
Description:
This question is related to [https://stackoverflow.com/a/37090151/1661491].
[
https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-38904:
--
Description:
This question is related to [https://stackoverflow.com/a/37090151/1661491].
[
https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-38904:
--
Description:
This question is related to [https://stackoverflow.com/a/37090151/1661491].
[
https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-38904:
--
Description:
This question is related to [https://stackoverflow.com/a/37090151/1661491].
Rafal Wojdyla created SPARK-38904:
-
Summary: Low cost DataFrame schema swap util
Key: SPARK-38904
URL: https://issues.apache.org/jira/browse/SPARK-38904
Project: Spark
Issue Type: New
[
https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17522176#comment-17522176
]
Rafal Wojdyla commented on SPARK-38438:
---
For posterity see the context in
[
https://issues.apache.org/jira/browse/SPARK-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512540#comment-17512540
]
Rafal Wojdyla commented on SPARK-2868:
--
Is there a better issue to track the work on named
[
https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502633#comment-17502633
]
Rafal Wojdyla edited comment on SPARK-38438 at 3/9/22, 12:35 PM:
-
The
[
https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-38438:
--
Description:
Reproduction:
{code:python}
from pyspark.sql import SparkSession
# default
[
https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-38438:
--
Description:
Reproduction:
{code:python}
from pyspark.sql import SparkSession
# default
[
https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502633#comment-17502633
]
Rafal Wojdyla edited comment on SPARK-38438 at 3/8/22, 4:33 AM:
The
[
https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-38438:
--
Issue Type: Bug (was: New Feature)
> Can't update spark.jars.packages on existing
[
https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502633#comment-17502633
]
Rafal Wojdyla commented on SPARK-38438:
---
The workaround actually doesn't stop the existing JVM, it
[
https://issues.apache.org/jira/browse/SPARK-38438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-38438:
--
Description:
Reproduction:
{code:python}
from pyspark.sql import SparkSession
# default
Rafal Wojdyla created SPARK-38438:
-
Summary: Can't update spark.jars.packages on existing
global/default context
Key: SPARK-38438
URL: https://issues.apache.org/jira/browse/SPARK-38438
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-37782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466982#comment-17466982
]
Rafal Wojdyla edited comment on SPARK-37782 at 12/30/21, 8:05 PM:
--
[
https://issues.apache.org/jira/browse/SPARK-37782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17466982#comment-17466982
]
Rafal Wojdyla commented on SPARK-37782:
---
Duplicate of
[
https://issues.apache.org/jira/browse/SPARK-37609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17458691#comment-17458691
]
Rafal Wojdyla commented on SPARK-37609:
---
[~hyukjin.kwon] yep, understand that, if I have some time
[
https://issues.apache.org/jira/browse/SPARK-37610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-37610:
--
Description: I would like to share a query plan for a specific
[
https://issues.apache.org/jira/browse/SPARK-37609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457483#comment-17457483
]
Rafal Wojdyla commented on SPARK-37609:
---
[~yumwang] I don't have any public code to share. Also
Rafal Wojdyla created SPARK-37610:
-
Summary: Anonymized/obfuscated query plan?
Key: SPARK-37610
URL: https://issues.apache.org/jira/browse/SPARK-37610
Project: Spark
Issue Type: New Feature
[
https://issues.apache.org/jira/browse/SPARK-37609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-37609:
--
Environment: py:3.9
> Transient StackOverflowError on DataFrame from Catalyst QueryPlan
>
Rafal Wojdyla created SPARK-37609:
-
Summary: Transient StackOverflowError on DataFrame from Catalyst
QueryPlan
Key: SPARK-37609
URL: https://issues.apache.org/jira/browse/SPARK-37609
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-37601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-37601:
--
Description:
{code:python}
def foo(df: DataFrame, p: int) -> DataFrame:
    ...
# current:
[
https://issues.apache.org/jira/browse/SPARK-37601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-37601:
--
Description:
{code:python}
def foo(df: DataFrame, p: int) -> DataFrame:
    ...
# current:
[
https://issues.apache.org/jira/browse/SPARK-37601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-37601:
--
Description:
{code:python}
def foo(df: DataFrame, p: int) -> DataFrame:
    ...
# current:
[
https://issues.apache.org/jira/browse/SPARK-37601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-37601:
--
Description:
{code:python}
def foo(df: DataFrame, p: int) -> DataFrame:
    ...
# current:
[
https://issues.apache.org/jira/browse/SPARK-37601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-37601:
--
Description:
{noformat}
def foo(df: DataFrame, p: int) -> DataFrame:
    ...
# current:
from
Rafal Wojdyla created SPARK-37601:
-
Summary: Could/should sql.DataFrame.transform accept function
parameters?
Key: SPARK-37601
URL: https://issues.apache.org/jira/browse/SPARK-37601
Project: Spark
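The question in SPARK-37601 can be illustrated with a minimal sketch. It does not use Spark: `Frame` is a hypothetical stand-in class, and its `transform` forwards extra arguments only to show the two call styles side by side. `foo` and `p` follow the names in the issue description; everything else is an assumption made for illustration.

```python
from functools import partial
from typing import Any, Callable, List


class Frame:
    """Hypothetical stand-in for a DataFrame so the sketch runs without Spark."""

    def __init__(self, rows: List[int]) -> None:
        self.rows = rows

    def transform(self, func: Callable[..., "Frame"], *args: Any, **kwargs: Any) -> "Frame":
        # Forward any extra arguments to func (the behaviour the issue asks about).
        return func(self, *args, **kwargs)


def foo(df: Frame, p: int) -> Frame:
    # Example transformation: scale every row by p.
    return Frame([r * p for r in df.rows])


df = Frame([1, 2, 3])

# Without argument forwarding, the caller binds p up front.
via_partial = df.transform(partial(foo, p=3))

# With argument forwarding, p is passed straight through.
via_kwargs = df.transform(foo, p=3)
```

Both calls produce the same result; the second form just avoids the `partial` wrapper at the call site.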
[
https://issues.apache.org/jira/browse/SPARK-37570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17455897#comment-17455897
]
Rafal Wojdyla commented on SPARK-37570:
---
A workaround in {{setup.cfg}}:
{noformat}
[
https://issues.apache.org/jira/browse/SPARK-37570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-37570:
--
Environment:
Mypy version: v0.910-1
Py: 3.9
> mypy breaks on
[
https://issues.apache.org/jira/browse/SPARK-37570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-37570:
--
Environment:
Mypy: v0.910-1
Py: 3.9
was:
Mypy version: v0.910-1
Py: 3.9
> mypy breaks on
[
https://issues.apache.org/jira/browse/SPARK-37570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-37570:
--
Description:
Mypy breaks on a project with pyspark 3.2.0 dependency (worked fine for 3.1.2),
Rafal Wojdyla created SPARK-37577:
-
Summary: ClassCastException: ArrayType cannot be cast to StructType
Key: SPARK-37577
URL: https://issues.apache.org/jira/browse/SPARK-37577
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-37570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-37570:
--
Description:
Mypy breaks on a project with pyspark 3.2.0 dependency (worked fine for 3.1.2),
Rafal Wojdyla created SPARK-37570:
-
Summary: mypy breaks on pyspark.pandas.plot.core.Bucketizer
Key: SPARK-37570
URL: https://issues.apache.org/jira/browse/SPARK-37570
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345821#comment-17345821
]
Rafal Wojdyla edited comment on SPARK-35386 at 5/17/21, 5:53 AM:
-
[
https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345821#comment-17345821
]
Rafal Wojdyla edited comment on SPARK-35386 at 5/17/21, 5:52 AM:
-
[
https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345821#comment-17345821
]
Rafal Wojdyla edited comment on SPARK-35386 at 5/17/21, 5:51 AM:
-
[
https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345821#comment-17345821
]
Rafal Wojdyla edited comment on SPARK-35386 at 5/17/21, 2:00 AM:
-
[
https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345821#comment-17345821
]
Rafal Wojdyla commented on SPARK-35386:
---
{quote}
IIRC, it's not documented. We might think about
[
https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345811#comment-17345811
]
Rafal Wojdyla commented on SPARK-35386:
---
{quote}
I think the logic is that, when users specify a
[
https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345645#comment-17345645
]
Rafal Wojdyla commented on SPARK-35386:
---
[~hyukjin.kwon]
Thanks for the prompt reply. But in the
[
https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-35386:
--
Description:
When read schema is specified, as a user I would prefer/like if Spark failed on
[
https://issues.apache.org/jira/browse/SPARK-35386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-35386:
--
Description:
When read schema is specified, as a user I would prefer/like if Spark failed on
Rafal Wojdyla created SPARK-35386:
-
Summary: parquet read with schema should fail on non-existing
columns
Key: SPARK-35386
URL: https://issues.apache.org/jira/browse/SPARK-35386
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298016#comment-17298016
]
Rafal Wojdyla commented on SPARK-34544:
---
[~zero323] yea, I see, in nature that suggestion is
[
https://issues.apache.org/jira/browse/SPARK-34629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295753#comment-17295753
]
Rafal Wojdyla commented on SPARK-34629:
---
Thanks [~hyukjin.kwon]!
> Python type hints improvement
[
https://issues.apache.org/jira/browse/SPARK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17292987#comment-17292987
]
Rafal Wojdyla edited comment on SPARK-34544 at 3/1/21, 4:40 PM:
[
https://issues.apache.org/jira/browse/SPARK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293013#comment-17293013
]
Rafal Wojdyla commented on SPARK-34544:
---
[~zero323] I appreciate your prompt answers.
Re:
[
https://issues.apache.org/jira/browse/SPARK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17292987#comment-17292987
]
Rafal Wojdyla commented on SPARK-34544:
---
[~zero323]
> it is more a dev utility than user a
Rafal Wojdyla created SPARK-34544:
-
Summary: pyspark toPandas() should return pd.DataFrame
Key: SPARK-34544
URL: https://issues.apache.org/jira/browse/SPARK-34544
Project: Spark
Issue Type:
[
https://issues.apache.org/jira/browse/SPARK-34540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290893#comment-17290893
]
Rafal Wojdyla commented on SPARK-34540:
---
This is related to
Rafal Wojdyla created SPARK-34540:
-
Summary: Add convert_dtypes to the DataFrameLike protocol
Key: SPARK-34540
URL: https://issues.apache.org/jira/browse/SPARK-34540
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-33605:
--
Summary: Add GCS FS/connector config (dependencies?) akin to S3 (was: Add
GCS FS/connector
[
https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240863#comment-17240863
]
Rafal Wojdyla edited comment on SPARK-33605 at 11/30/20, 5:33 PM:
--
[
https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-33605:
--
Description:
Spark comes with some S3 batteries included, which makes it easier to use with
[
https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240863#comment-17240863
]
Rafal Wojdyla edited comment on SPARK-33605 at 11/30/20, 4:45 PM:
--
[
https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240863#comment-17240863
]
Rafal Wojdyla edited comment on SPARK-33605 at 11/30/20, 4:45 PM:
--
[
https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-33605:
--
Description:
Spark comes with some S3 batteries included, which makes it easier to use with
[
https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240863#comment-17240863
]
Rafal Wojdyla commented on SPARK-33605:
---
Actually, the pyspark package includes the config for S3
[
https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-33605:
--
Description:
Spark comes with
[
https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rafal Wojdyla updated SPARK-33605:
--
Description:
Spark comes with
Rafal Wojdyla created SPARK-33605:
-
Summary: Add GCS FS/connector to the dependencies akin to S3
Key: SPARK-33605
URL: https://issues.apache.org/jira/browse/SPARK-33605
Project: Spark
Issue
83 matches