[jira] [Commented] (SPARK-37425) Inline type hints for python/pyspark/mllib/recommendation.py
[ https://issues.apache.org/jira/browse/SPARK-37425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502847#comment-17502847 ]

dch nguyen commented on SPARK-37425:

I'll raise a PR soon!

> Inline type hints for python/pyspark/mllib/recommendation.py
>
> Key: SPARK-37425
> URL: https://issues.apache.org/jira/browse/SPARK-37425
> Project: Spark
> Issue Type: Sub-task
> Components: MLlib, PySpark
> Affects Versions: 3.3.0
> Reporter: Maciej Szymkiewicz
> Priority: Major
>
> Inline type hints from python/pyspark/mllib/recommendation.pyi to python/pyspark/mllib/recommendation.py

--
This message was sent by Atlassian Jira (v8.20.1#820001)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
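As a rough illustration of what "inlining" type hints means in these sub-tasks, the sketch below puts annotations directly on a function in the .py module instead of in a separate .pyi stub. The function `predict_rating` and its parameters are hypothetical and are not taken from recommendation.py.

```python
from typing import Dict, Tuple, get_type_hints

# Before inlining, annotations would live in a separate stub file, e.g.
#   example.pyi:  def predict_rating(scores, user, product) -> float: ...
# (with full parameter types), while the .py module carries none.


# After inlining, the same annotations sit on the implementation itself.
# `predict_rating` is a hypothetical function used only for illustration.
def predict_rating(
    scores: Dict[Tuple[int, int], float], user: int, product: int
) -> float:
    """Return the stored score for (user, product), or 0.0 if unknown."""
    return scores.get((user, product), 0.0)


# The annotations are now visible at runtime and to type checkers
# without consulting a stub file.
hints = get_type_hints(predict_rating)
```

With the hints on the implementation, a type checker can also check the function body, which a stub-only signature does not allow.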
[jira] [Commented] (SPARK-36969) Inline type hints for SparkContext
[ https://issues.apache.org/jira/browse/SPARK-36969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488577#comment-17488577 ]

dch nguyen commented on SPARK-36969:

[~itholic] this was resolved by https://issues.apache.org/jira/browse/SPARK-37152. Thanks!

> Inline type hints for SparkContext
> --
>
> Key: SPARK-36969
> URL: https://issues.apache.org/jira/browse/SPARK-36969
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
> Fix For: 3.3.0
>
> Many files can remove
> {code:java}
> # type: ignore[attr-defined]
> {code}
> once this file has inline type hints.
[jira] [Commented] (SPARK-37154) Inline type hints for python/pyspark/rdd.py
[ https://issues.apache.org/jira/browse/SPARK-37154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477611#comment-17477611 ]

dch nguyen commented on SPARK-37154:

Can I go for this? [~byronhsu]

> Inline type hints for python/pyspark/rdd.py
> ---
>
> Key: SPARK-37154
> URL: https://issues.apache.org/jira/browse/SPARK-37154
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.2.0
> Reporter: Byron Hsu
> Priority: Major
[jira] [Commented] (SPARK-37930) Fix DataFrame select subset with duplicated columns
[ https://issues.apache.org/jira/browse/SPARK-37930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477529#comment-17477529 ]

dch nguyen commented on SPARK-37930:

I'm working on this. Thanks

> Fix DataFrame select subset with duplicated columns
> ---
>
> Key: SPARK-37930
> URL: https://issues.apache.org/jira/browse/SPARK-37930
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
>
> pandas
> {code:java}
> >>> pdf
>    a
> 0  1
> 1  2
> 2  3
> 3  4
> >>> pdf[['a', 'a']]
>    a  a
> 0  1  1
> 1  2  2
> 2  3  3
> 3  4  4
> {code}
> pandas on spark
> {code:java}
> >>> psdf
>    a
> 0  1
> 1  2
> 2  3
> 3  4
> >>> psdf[['a', 'a']]
> Traceback (most recent call last):
>   File "", line 1, in
>   File "/u02/spark/python/pyspark/pandas/frame.py", line 12077, in __repr__
>     pdf = self._get_or_create_repr_pandas_cache(max_display_count)
>   File "/u02/spark/python/pyspark/pandas/frame.py", line 12068, in _get_or_create_repr_pandas_cache
>     self, "_repr_pandas_cache", {n: self.head(n + 1)._to_internal_pandas()}
>   File "/u02/spark/python/pyspark/pandas/frame.py", line 12063, in _to_internal_pandas
>     return self._internal.to_pandas_frame
>   File "/u02/spark/python/pyspark/pandas/utils.py", line 576, in wrapped_lazy_property
>     setattr(self, attr_name, fn(self))
>   File "/u02/spark/python/pyspark/pandas/internal.py", line 1055, in to_pandas_frame
>     return InternalFrame.restore_index(pdf, **self.arguments_for_restore_index)
>   File "/u02/spark/python/pyspark/pandas/internal.py", line 1156, in restore_index
>     pdf.columns = pd.Index(
>   File "/u02/venv3.9-2/lib/python3.9/site-packages/pandas/core/generic.py", line 5500, in __setattr__
>     return object.__setattr__(self, name, value)
>   File "pandas/_libs/properties.pyx", line 70, in pandas._libs.properties.AxisProperty.__set__
>   File "/u02/venv3.9-2/lib/python3.9/site-packages/pandas/core/generic.py", line 766, in _set_axis
>     self._mgr.set_axis(axis, labels)
>   File "/u02/venv3.9-2/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 216, in set_axis
>     self._validate_set_axis(axis, new_labels)
>   File "/u02/venv3.9-2/lib/python3.9/site-packages/pandas/core/internals/base.py", line 57, in _validate_set_axis
>     raise ValueError(
> ValueError: Length mismatch: Expected axis has 4 elements, new values have 2 elements
> {code}
[jira] [Commented] (SPARK-37930) Fix DataFrame select subset with duplicated columns
[ https://issues.apache.org/jira/browse/SPARK-37930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477528#comment-17477528 ]

dch nguyen commented on SPARK-37930:

{code:java}
>>> import pandas as pd
>>> pdf = pd.DataFrame([1,2,3,4], columns=['a'])
>>> pdf
   a
0  1
1  2
2  3
3  4
>>> pdf = pdf[['a', 'a']]
>>> pdf
   a  a
0  1  1
1  2  2
2  3  3
3  4  4
>>> pdf[['a', 'a']]
   a  a  a  a
0  1  1  1  1
1  2  2  2  2
2  3  3  3  3
3  4  4  4  4
{code}
It seems to come from pandas.
[https://github.com/apache/spark/blob/df7447bc62052e3d7391ba23d7220fb8c9b923fd/python/pyspark/pandas/internal.py#L1146]

> Fix DataFrame select subset with duplicated columns
> ---
>
> Key: SPARK-37930
> URL: https://issues.apache.org/jira/browse/SPARK-37930
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
>
> pandas
> {code:java}
> >>> pdf
>    a
> 0  1
> 1  2
> 2  3
> 3  4
> >>> pdf[['a', 'a']]
>    a  a
> 0  1  1
> 1  2  2
> 2  3  3
> 3  4  4
> {code}
> pandas on spark
> {code:java}
> >>> psdf
>    a
> 0  1
> 1  2
> 2  3
> 3  4
> >>> psdf[['a', 'a']]
> Traceback (most recent call last):
>   File "", line 1, in
>   File "/u02/spark/python/pyspark/pandas/frame.py", line 12077, in __repr__
>     pdf = self._get_or_create_repr_pandas_cache(max_display_count)
>   File "/u02/spark/python/pyspark/pandas/frame.py", line 12068, in _get_or_create_repr_pandas_cache
>     self, "_repr_pandas_cache", {n: self.head(n + 1)._to_internal_pandas()}
>   File "/u02/spark/python/pyspark/pandas/frame.py", line 12063, in _to_internal_pandas
>     return self._internal.to_pandas_frame
>   File "/u02/spark/python/pyspark/pandas/utils.py", line 576, in wrapped_lazy_property
>     setattr(self, attr_name, fn(self))
>   File "/u02/spark/python/pyspark/pandas/internal.py", line 1055, in to_pandas_frame
>     return InternalFrame.restore_index(pdf, **self.arguments_for_restore_index)
>   File "/u02/spark/python/pyspark/pandas/internal.py", line 1156, in restore_index
>     pdf.columns = pd.Index(
>   File "/u02/venv3.9-2/lib/python3.9/site-packages/pandas/core/generic.py", line 5500, in __setattr__
>     return object.__setattr__(self, name, value)
>   File "pandas/_libs/properties.pyx", line 70, in pandas._libs.properties.AxisProperty.__set__
>   File "/u02/venv3.9-2/lib/python3.9/site-packages/pandas/core/generic.py", line 766, in _set_axis
>     self._mgr.set_axis(axis, labels)
>   File "/u02/venv3.9-2/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 216, in set_axis
>     self._validate_set_axis(axis, new_labels)
>   File "/u02/venv3.9-2/lib/python3.9/site-packages/pandas/core/internals/base.py", line 57, in _validate_set_axis
>     raise ValueError(
> ValueError: Length mismatch: Expected axis has 4 elements, new values have 2 elements
> {code}
[jira] [Created] (SPARK-37930) Fix DataFrame select subset with duplicated columns
dch nguyen created SPARK-37930:
--

Summary: Fix DataFrame select subset with duplicated columns
Key: SPARK-37930
URL: https://issues.apache.org/jira/browse/SPARK-37930
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 3.3.0
Reporter: dch nguyen

pandas
{code:java}
>>> pdf
   a
0  1
1  2
2  3
3  4
>>> pdf[['a', 'a']]
   a  a
0  1  1
1  2  2
2  3  3
3  4  4
{code}
pandas on spark
{code:java}
>>> psdf
   a
0  1
1  2
2  3
3  4
>>> psdf[['a', 'a']]
Traceback (most recent call last):
  File "", line 1, in
  File "/u02/spark/python/pyspark/pandas/frame.py", line 12077, in __repr__
    pdf = self._get_or_create_repr_pandas_cache(max_display_count)
  File "/u02/spark/python/pyspark/pandas/frame.py", line 12068, in _get_or_create_repr_pandas_cache
    self, "_repr_pandas_cache", {n: self.head(n + 1)._to_internal_pandas()}
  File "/u02/spark/python/pyspark/pandas/frame.py", line 12063, in _to_internal_pandas
    return self._internal.to_pandas_frame
  File "/u02/spark/python/pyspark/pandas/utils.py", line 576, in wrapped_lazy_property
    setattr(self, attr_name, fn(self))
  File "/u02/spark/python/pyspark/pandas/internal.py", line 1055, in to_pandas_frame
    return InternalFrame.restore_index(pdf, **self.arguments_for_restore_index)
  File "/u02/spark/python/pyspark/pandas/internal.py", line 1156, in restore_index
    pdf.columns = pd.Index(
  File "/u02/venv3.9-2/lib/python3.9/site-packages/pandas/core/generic.py", line 5500, in __setattr__
    return object.__setattr__(self, name, value)
  File "pandas/_libs/properties.pyx", line 70, in pandas._libs.properties.AxisProperty.__set__
  File "/u02/venv3.9-2/lib/python3.9/site-packages/pandas/core/generic.py", line 766, in _set_axis
    self._mgr.set_axis(axis, labels)
  File "/u02/venv3.9-2/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 216, in set_axis
    self._validate_set_axis(axis, new_labels)
  File "/u02/venv3.9-2/lib/python3.9/site-packages/pandas/core/internals/base.py", line 57, in _validate_set_axis
    raise ValueError(
ValueError: Length mismatch: Expected axis has 4 elements, new values have 2 elements
{code}
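The "Length mismatch" in this report may be consistent with the pandas session in the comment on this issue, where selecting a duplicated label from a frame that already has two "a" columns returns four columns. The sketch below is a pure-Python model of that label matching, not pandas or pyspark code; `select_by_labels` is a hypothetical helper written only for this illustration.

```python
from typing import List, Sequence, Tuple

Column = Tuple[str, List[int]]  # (label, values)


def select_by_labels(columns: Sequence[Column], labels: Sequence[str]) -> List[Column]:
    """Hypothetical model: each requested label returns ALL columns with that label."""
    selected: List[Column] = []
    for label in labels:
        selected.extend(col for col in columns if col[0] == label)
    return selected


frame = [("a", [1, 2, 3, 4])]

# The first selection duplicates the single column "a".
once = select_by_labels(frame, ["a", "a"])

# Selecting again from a frame that already has two "a" columns
# matches both columns for each label, giving four columns.
twice = select_by_labels(once, ["a", "a"])

# Renaming four columns with only two names cannot line up.
new_names = ["a", "a"]
mismatch = len(twice) != len(new_names)
```

Under this model, assigning two column names to a four-column result is the kind of situation in which a "has 4 elements, new values have 2 elements" error would be expected.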
[jira] [Commented] (SPARK-37929) Support cascade mode for `dropNamespace` API
[ https://issues.apache.org/jira/browse/SPARK-37929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477052#comment-17477052 ]

dch nguyen commented on SPARK-37929:

I'm working on this.

> Support cascade mode for `dropNamespace` API
> -
>
> Key: SPARK-37929
> URL: https://issues.apache.org/jira/browse/SPARK-37929
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
[jira] [Updated] (SPARK-37929) Support cascade mode for `dropNamespace` API
[ https://issues.apache.org/jira/browse/SPARK-37929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dch nguyen updated SPARK-37929:
---
Description: According to [#cmt,|https://github.com/apache/spark/pull/35202#discussion_r784463563]

> Support cascade mode for `dropNamespace` API
> -
>
> Key: SPARK-37929
> URL: https://issues.apache.org/jira/browse/SPARK-37929
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
>
> According to [#cmt,|https://github.com/apache/spark/pull/35202#discussion_r784463563]
[jira] [Updated] (SPARK-37929) Support cascade mode for `dropNamespace` API
[ https://issues.apache.org/jira/browse/SPARK-37929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dch nguyen updated SPARK-37929:
---
Description: (was: According to [#cmt,|https://github.com/apache/spark/pull/35202#discussion_r784463563] )

> Support cascade mode for `dropNamespace` API
> -
>
> Key: SPARK-37929
> URL: https://issues.apache.org/jira/browse/SPARK-37929
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
[jira] [Created] (SPARK-37929) Support cascade mode for `dropNamespace` API
dch nguyen created SPARK-37929:
--

Summary: Support cascade mode for `dropNamespace` API
Key: SPARK-37929
URL: https://issues.apache.org/jira/browse/SPARK-37929
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.3.0
Reporter: dch nguyen
[jira] [Commented] (SPARK-37479) Migrate DROP NAMESPACE to use V2 command by default
[ https://issues.apache.org/jira/browse/SPARK-37479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475156#comment-17475156 ]

dch nguyen commented on SPARK-37479:

[~imback82] I'm working on this now. I'll create a PR soon. Thank you.

> Migrate DROP NAMESPACE to use V2 command by default
> ---
>
> Key: SPARK-37479
> URL: https://issues.apache.org/jira/browse/SPARK-37479
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
[jira] [Comment Edited] (SPARK-37381) Unify v1 and v2 SHOW CREATE TABLE tests
[ https://issues.apache.org/jira/browse/SPARK-37381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450947#comment-17450947 ]

dch nguyen edited comment on SPARK-37381 at 11/30/21, 8:33 AM:
---
[~xiaopenglei] it's ok. I'll try the other one. Thank you.

was (Author: dchvn):
[~xiaopenglei] it's ok. Thank you.

> Unify v1 and v2 SHOW CREATE TABLE tests
>
> Key: SPARK-37381
> URL: https://issues.apache.org/jira/browse/SPARK-37381
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: PengLei
> Priority: Major
> Fix For: 3.3.0
[jira] [Commented] (SPARK-37381) Unify v1 and v2 SHOW CREATE TABLE tests
[ https://issues.apache.org/jira/browse/SPARK-37381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450947#comment-17450947 ]

dch nguyen commented on SPARK-37381:

[~xiaopenglei] it's ok. Thank you.

> Unify v1 and v2 SHOW CREATE TABLE tests
>
> Key: SPARK-37381
> URL: https://issues.apache.org/jira/browse/SPARK-37381
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: PengLei
> Priority: Major
> Fix For: 3.3.0
[jira] [Created] (SPARK-37495) Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled
dch nguyen created SPARK-37495:
--

Summary: Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled
Key: SPARK-37495
URL: https://issues.apache.org/jira/browse/SPARK-37495
Project: Spark
Issue Type: Sub-task
Components: PySpark
Affects Versions: 3.3.0
Reporter: dch nguyen
[jira] [Created] (SPARK-37491) Fix Series.asof when values of the series is not sorted
dch nguyen created SPARK-37491:
--

Summary: Fix Series.asof when values of the series is not sorted
Key: SPARK-37491
URL: https://issues.apache.org/jira/browse/SPARK-37491
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 3.3.0
Reporter: dch nguyen

https://github.com/apache/spark/pull/34737#discussion_r758223279
[jira] [Created] (SPARK-37482) Skip check monotonic increasing for Series.asof with 'compute.eager_check'
dch nguyen created SPARK-37482:
--

Summary: Skip check monotonic increasing for Series.asof with 'compute.eager_check'
Key: SPARK-37482
URL: https://issues.apache.org/jira/browse/SPARK-37482
Project: Spark
Issue Type: Sub-task
Components: PySpark
Affects Versions: 3.3.0
Reporter: dch nguyen
[jira] [Commented] (SPARK-37055) Apply 'compute.eager_check' across all the codebase
[ https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450210#comment-17450210 ]

dch nguyen commented on SPARK-37055:

thanks! I will try to address them

> Apply 'compute.eager_check' across all the codebase
> ---
>
> Key: SPARK-37055
> URL: https://issues.apache.org/jira/browse/SPARK-37055
> Project: Spark
> Issue Type: Umbrella
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
>
> Following [~hyukjin.kwon]'s guidance:
> 1 Make every input validation like this covered by the new configuration. For example:
> {code:python}
> - a == b
> + def eager_check(f): # Utility function
> +     return not config.compute.eager_check and f()
> +
> + eager_check(lambda: a == b)
> {code}
> 2 We should check if the output makes sense even when the behaviour does not match pandas'. If the output does not make sense, we shouldn't cover it with this configuration.
> 3 Make this configuration enabled by default so we match the behaviour to pandas' by default.
>
> We have to make sure to list which APIs are affected in the description of 'compute.eager_check'
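A minimal sketch of the idea in this umbrella ticket, assuming that a disabled option should skip the validation entirely and that an enabled option should run it. Only the option name 'compute.eager_check' comes from the ticket; the `OPTIONS` dictionary, `asof_like` and the monotonicity helper are hypothetical stand-ins, and the monotonic check is borrowed from the Series.asof sub-task title purely as an example validation.

```python
from typing import Callable, Dict, Sequence

# Hypothetical stand-in for the real option store; only the option name
# 'compute.eager_check' is taken from the ticket.
OPTIONS: Dict[str, bool] = {"compute.eager_check": True}


def eager_check(check: Callable[[], bool]) -> bool:
    """Run `check` only when eager checking is on; otherwise report success."""
    if not OPTIONS["compute.eager_check"]:
        return True  # validation skipped, so the check is never evaluated
    return check()


def is_monotonic_increasing(values: Sequence[int]) -> bool:
    """Return True when every element is >= the one before it."""
    return all(x <= y for x, y in zip(values, values[1:]))


def asof_like(index: Sequence[int]) -> str:
    """Hypothetical operation that validates its input before doing work."""
    if not eager_check(lambda: is_monotonic_increasing(index)):
        raise ValueError("index must be monotonic increasing")
    return "ok"
```

Passing the validation as a lambda keeps it lazy, so the potentially expensive check is not computed at all when the option is turned off.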
[jira] [Commented] (SPARK-37055) Apply 'compute.eager_check' across all the codebase
[ https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450191#comment-17450191 ]

dch nguyen commented on SPARK-37055:

[~hyukjin.kwon], no, not right now. I could not find anywhere else to apply this conf :(

> Apply 'compute.eager_check' across all the codebase
> ---
>
> Key: SPARK-37055
> URL: https://issues.apache.org/jira/browse/SPARK-37055
> Project: Spark
> Issue Type: Umbrella
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
>
> Following [~hyukjin.kwon]'s guidance:
> 1 Make every input validation like this covered by the new configuration. For example:
> {code:python}
> - a == b
> + def eager_check(f): # Utility function
> +     return not config.compute.eager_check and f()
> +
> + eager_check(lambda: a == b)
> {code}
> 2 We should check if the output makes sense even when the behaviour does not match pandas'. If the output does not make sense, we shouldn't cover it with this configuration.
> 3 Make this configuration enabled by default so we match the behaviour to pandas' by default.
>
> We have to make sure to list which APIs are affected in the description of 'compute.eager_check'
[jira] [Commented] (SPARK-37479) Migrate DROP NAMESPACE to use V2 command by default
[ https://issues.apache.org/jira/browse/SPARK-37479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450189#comment-17450189 ]

dch nguyen commented on SPARK-37479:

working on this

> Migrate DROP NAMESPACE to use V2 command by default
> ---
>
> Key: SPARK-37479
> URL: https://issues.apache.org/jira/browse/SPARK-37479
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
[jira] [Created] (SPARK-37479) Migrate DROP NAMESPACE to use V2 command by default
dch nguyen created SPARK-37479:
--

Summary: Migrate DROP NAMESPACE to use V2 command by default
Key: SPARK-37479
URL: https://issues.apache.org/jira/browse/SPARK-37479
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.3.0
Reporter: dch nguyen
[jira] [Commented] (SPARK-37478) Unify v1 and v2 DROP NAMESPACE tests
[ https://issues.apache.org/jira/browse/SPARK-37478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450188#comment-17450188 ]

dch nguyen commented on SPARK-37478:

working on this

> Unify v1 and v2 DROP NAMESPACE tests
>
> Key: SPARK-37478
> URL: https://issues.apache.org/jira/browse/SPARK-37478
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
[jira] [Created] (SPARK-37478) Unify v1 and v2 DROP NAMESPACE tests
dch nguyen created SPARK-37478:
--

Summary: Unify v1 and v2 DROP NAMESPACE tests
Key: SPARK-37478
URL: https://issues.apache.org/jira/browse/SPARK-37478
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.3.0
Reporter: dch nguyen
[jira] [Commented] (SPARK-37381) Unify v1 and v2 SHOW CREATE TABLE tests
[ https://issues.apache.org/jira/browse/SPARK-37381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17449352#comment-17449352 ]

dch nguyen commented on SPARK-37381:

Can I go for it? I want to try to fix it [~xiaopenglei] [~maxgekk]

> Unify v1 and v2 SHOW CREATE TABLE tests
>
> Key: SPARK-37381
> URL: https://issues.apache.org/jira/browse/SPARK-37381
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: PengLei
> Priority: Major
> Fix For: 3.3.0
[jira] [Commented] (SPARK-37343) Implement createIndex and IndexExists in JDBC (Postgres dialect)
[ https://issues.apache.org/jira/browse/SPARK-37343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444210#comment-17444210 ]

dch nguyen commented on SPARK-37343:

I'm working on this.

> Implement createIndex and IndexExists in JDBC (Postgres dialect)
>
> Key: SPARK-37343
> URL: https://issues.apache.org/jira/browse/SPARK-37343
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
[jira] [Created] (SPARK-37343) Implement createIndex and IndexExists in JDBC (Postgres dialect)
dch nguyen created SPARK-37343:
--

Summary: Implement createIndex and IndexExists in JDBC (Postgres dialect)
Key: SPARK-37343
URL: https://issues.apache.org/jira/browse/SPARK-37343
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.3.0
Reporter: dch nguyen
[jira] [Commented] (SPARK-37330) Migrate ReplaceTableStatement to v2 command
[ https://issues.apache.org/jira/browse/SPARK-37330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443609#comment-17443609 ]

dch nguyen commented on SPARK-37330:

I am working on this

> Migrate ReplaceTableStatement to v2 command
> ---
>
> Key: SPARK-37330
> URL: https://issues.apache.org/jira/browse/SPARK-37330
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
[jira] [Created] (SPARK-37330) Migrate ReplaceTableStatement to v2 command
dch nguyen created SPARK-37330:
--

Summary: Migrate ReplaceTableStatement to v2 command
Key: SPARK-37330
URL: https://issues.apache.org/jira/browse/SPARK-37330
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.3.0
Reporter: dch nguyen
[jira] [Comment Edited] (SPARK-37260) PYSPARK Arrow 3.2.0 docs link invalid
[ https://issues.apache.org/jira/browse/SPARK-37260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441494#comment-17441494 ]

dch nguyen edited comment on SPARK-37260 at 11/10/21, 4:03 AM:
---
ping [~hyukjin.kwon], is this issue resolved by [#34475|https://github.com/apache/spark/pull/34475]?

was (Author: dchvn):
[~hyukjin.kwon], is this issue resolved by [#34475|https://github.com/apache/spark/pull/34475]?

> PYSPARK Arrow 3.2.0 docs link invalid
> -
>
> Key: SPARK-37260
> URL: https://issues.apache.org/jira/browse/SPARK-37260
> Project: Spark
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 3.2.0
> Reporter: Thomas Graves
> Priority: Major
>
> [http://spark.apache.org/docs/latest/sql-pyspark-pandas-with-arrow.html]
> links to:
> [https://spark.apache.org/docs/latest/api/python/user_guide/arrow_pandas.html]
> which links to:
> [https://spark.apache.org/docs/latest/api/python/sql/arrow_pandas.rst]
> But that is an invalid link.
> I assume it's supposed to point to:
> https://spark.apache.org/docs/latest/api/python/user_guide/sql/arrow_pandas.html
[jira] [Commented] (SPARK-37260) PYSPARK Arrow 3.2.0 docs link invalid
[ https://issues.apache.org/jira/browse/SPARK-37260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441494#comment-17441494 ]

dch nguyen commented on SPARK-37260:

[~hyukjin.kwon], is this issue resolved by [#34475|https://github.com/apache/spark/pull/34475]?

> PYSPARK Arrow 3.2.0 docs link invalid
> -
>
> Key: SPARK-37260
> URL: https://issues.apache.org/jira/browse/SPARK-37260
> Project: Spark
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 3.2.0
> Reporter: Thomas Graves
> Priority: Major
>
> [http://spark.apache.org/docs/latest/sql-pyspark-pandas-with-arrow.html]
> links to:
> [https://spark.apache.org/docs/latest/api/python/user_guide/arrow_pandas.html]
> which links to:
> [https://spark.apache.org/docs/latest/api/python/sql/arrow_pandas.rst]
> But that is an invalid link.
> I assume it's supposed to point to:
> https://spark.apache.org/docs/latest/api/python/user_guide/sql/arrow_pandas.html
[jira] [Commented] (SPARK-37236) Inline type hints for KernelDensity.pyi, test.py in python/pyspark/mllib/stat/
[ https://issues.apache.org/jira/browse/SPARK-37236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440143#comment-17440143 ]

dch nguyen commented on SPARK-37236:

working on this

> Inline type hints for KernelDensity.pyi, test.py in python/pyspark/mllib/stat/
> --
>
> Key: SPARK-37236
> URL: https://issues.apache.org/jira/browse/SPARK-37236
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
[jira] [Commented] (SPARK-37235) Inline type hints for distribution.py and __init__.py in python/pyspark/mllib/stat
[ https://issues.apache.org/jira/browse/SPARK-37235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440142#comment-17440142 ]

dch nguyen commented on SPARK-37235:

working on this

> Inline type hints for distribution.py and __init__.py in python/pyspark/mllib/stat
> --
>
> Key: SPARK-37235
> URL: https://issues.apache.org/jira/browse/SPARK-37235
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
[jira] [Updated] (SPARK-37235) Inline type hints for distribution.py and __init__.py in python/pyspark/mllib/
[ https://issues.apache.org/jira/browse/SPARK-37235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dch nguyen updated SPARK-37235:
---
Summary: Inline type hints for distribution.py and __init__.py in python/pyspark/mllib/ (was: Inline type hints for python/pyspark/mllib/distribution.py and __init__.py)

> Inline type hints for distribution.py and __init__.py in python/pyspark/mllib/
> --
>
> Key: SPARK-37235
> URL: https://issues.apache.org/jira/browse/SPARK-37235
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
[jira] [Updated] (SPARK-37235) Inline type hints for distribution.py and __init__.py in python/pyspark/mllib/stat
[ https://issues.apache.org/jira/browse/SPARK-37235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dch nguyen updated SPARK-37235:
---
Summary: Inline type hints for distribution.py and __init__.py in python/pyspark/mllib/stat (was: Inline type hints for distribution.py and __init__.py in python/pyspark/mllib/)

> Inline type hints for distribution.py and __init__.py in python/pyspark/mllib/stat
> --
>
> Key: SPARK-37235
> URL: https://issues.apache.org/jira/browse/SPARK-37235
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: dch nguyen
> Priority: Major
[jira] [Updated] (SPARK-37236) Inline type hints for KernelDensity.pyi, test.py in python/pyspark/mllib/stat/
[ https://issues.apache.org/jira/browse/SPARK-37236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37236: --- Summary: Inline type hints for KernelDensity.pyi, test.py in python/pyspark/mllib/stat/ (was: Inline type hints for distribution.py, __init__.py in python/pyspark/mllib/stat/) > Inline type hints for KernelDensity.pyi, test.py in python/pyspark/mllib/stat/ > -- > > Key: SPARK-37236 > URL: https://issues.apache.org/jira/browse/SPARK-37236 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37236) Inline type hints for distribution.py, __init__.py in python/pyspark/mllib/stat/
dch nguyen created SPARK-37236: -- Summary: Inline type hints for distribution.py, __init__.py in python/pyspark/mllib/stat/ Key: SPARK-37236 URL: https://issues.apache.org/jira/browse/SPARK-37236 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37235) Inline type hints for python/pyspark/mllib/distribution.py and __init__.py
dch nguyen created SPARK-37235: -- Summary: Inline type hints for python/pyspark/mllib/distribution.py and __init__.py Key: SPARK-37235 URL: https://issues.apache.org/jira/browse/SPARK-37235 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37234) Inline type hints for python/pyspark/mllib/stat/_statistics.py
[ https://issues.apache.org/jira/browse/SPARK-37234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37234: --- Summary: Inline type hints for python/pyspark/mllib/stat/_statistics.py (was: Inline type hints for python/pyspark/mllib/_statistics.py) > Inline type hints for python/pyspark/mllib/stat/_statistics.py > -- > > Key: SPARK-37234 > URL: https://issues.apache.org/jira/browse/SPARK-37234 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37234) Inline type hints for python/pyspark/mllib/_statistics.py
[ https://issues.apache.org/jira/browse/SPARK-37234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440141#comment-17440141 ] dch nguyen commented on SPARK-37234: working on this. > Inline type hints for python/pyspark/mllib/_statistics.py > - > > Key: SPARK-37234 > URL: https://issues.apache.org/jira/browse/SPARK-37234 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37234) Inline type hints for python/pyspark/mllib/_statistics.py
dch nguyen created SPARK-37234: -- Summary: Inline type hints for python/pyspark/mllib/_statistics.py Key: SPARK-37234 URL: https://issues.apache.org/jira/browse/SPARK-37234 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37233) Inline type hints for files in python/pyspark/mllib
dch nguyen created SPARK-37233: -- Summary: Inline type hints for files in python/pyspark/mllib Key: SPARK-37233 URL: https://issues.apache.org/jira/browse/SPARK-37233 Project: Spark Issue Type: Umbrella Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37180) PySpark.pandas should support __version__
[ https://issues.apache.org/jira/browse/SPARK-37180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438522#comment-17438522 ] dch nguyen commented on SPARK-37180: Since Koalas was merged into PySpark, should pyspark.pandas.__version__ be an alias of spark.version? [~hyukjin.kwon] > PySpark.pandas should support __version__ > - > > Key: SPARK-37180 > URL: https://issues.apache.org/jira/browse/SPARK-37180 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Chuck Connell >Priority: Major > > In regular pandas you can say > {quote}pd.__version__ > {quote} > to get the pandas version number. PySpark pandas should support the same. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
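One way to picture the alias asked about in this comment, as a minimal sketch only: the module objects below are hypothetical stand-ins built with `types.ModuleType` (not the real `pyspark` packages), and the version string is a made-up placeholder.

```python
import types

# Hypothetical stand-in for the parent package; the version string is a placeholder.
parent_pkg = types.ModuleType("parent_pkg")
parent_pkg.__version__ = "0.0.0"

# Hypothetical stand-in for the pandas-API subpackage.
pandas_api = types.ModuleType("parent_pkg.pandas")

# The alias itself: the subpackage re-exports the parent's version string,
# so both names report the same value.
pandas_api.__version__ = parent_pkg.__version__

print(pandas_api.__version__)
```

Whether the subpackage should report the parent's version or its own is exactly the open question in the thread; the sketch only shows the aliasing option.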
[jira] [Comment Edited] (SPARK-37180) PySpark.pandas should support __version__
[ https://issues.apache.org/jira/browse/SPARK-37180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438522#comment-17438522 ] dch nguyen edited comment on SPARK-37180 at 11/4/21, 7:31 AM: -- As Koalas was merged into Pyspark, so Should pyspark.pandas.__version__ be aliased of spark.version ? [~hyukjin.kwon] was (Author: dchvn): As Koalas was merged into Pyspark, so Should pyspark.pandas.__version__ be aliased spark.version ? [~hyukjin.kwon] > PySpark.pandas should support __version__ > - > > Key: SPARK-37180 > URL: https://issues.apache.org/jira/browse/SPARK-37180 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Chuck Connell >Priority: Major > > In regular pandas you can say > {quote}pd.__version__ > {quote} > to get the pandas version number. PySpark pandas should support the same. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37094) Inline type hints for files in python/pyspark
[ https://issues.apache.org/jira/browse/SPARK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435810#comment-17435810 ] dch nguyen commented on SPARK-37094: [~ByronHsu] I worked on some issues (statcounter, storagelevel and util) last week but haven't created PRs yet; I will create them soon. Sorry about this! > Inline type hints for files in python/pyspark > - > > Key: SPARK-37094 > URL: https://issues.apache.org/jira/browse/SPARK-37094 > Project: Spark > Issue Type: Umbrella > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37146) Inline type hints for python/pyspark/__init__.py
[ https://issues.apache.org/jira/browse/SPARK-37146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435341#comment-17435341 ] dch nguyen commented on SPARK-37146: I am working on this > Inline type hints for python/pyspark/__init__.py > > > Key: SPARK-37146 > URL: https://issues.apache.org/jira/browse/SPARK-37146 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37146) Inline type hints for python/pyspark/__init__.py
dch nguyen created SPARK-37146: -- Summary: Inline type hints for python/pyspark/__init__.py Key: SPARK-37146 URL: https://issues.apache.org/jira/browse/SPARK-37146 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37144) Inline type hints for python/pyspark/file.py
dch nguyen created SPARK-37144: -- Summary: Inline type hints for python/pyspark/file.py Key: SPARK-37144 URL: https://issues.apache.org/jira/browse/SPARK-37144 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37142) Add __all__ to pyspark/pandas/*/__init__.py
[ https://issues.apache.org/jira/browse/SPARK-37142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37142: --- Issue Type: Improvement (was: Bug) > Add __all__ to pyspark/pandas/*/__init__.py > --- > > Key: SPARK-37142 > URL: https://issues.apache.org/jira/browse/SPARK-37142 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37142) Add __all__ to pyspark/pandas/*/__init__.py
dch nguyen created SPARK-37142: -- Summary: Add __all__ to pyspark/pandas/*/__init__.py Key: SPARK-37142 URL: https://issues.apache.org/jira/browse/SPARK-37142 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37140) Inline type hints for python/pyspark/resultiterable.py
dch nguyen created SPARK-37140: -- Summary: Inline type hints for python/pyspark/resultiterable.py Key: SPARK-37140 URL: https://issues.apache.org/jira/browse/SPARK-37140 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37139) Inline type hints for python/pyspark/taskcontext.py and python/pyspark/version.py
dch nguyen created SPARK-37139: -- Summary: Inline type hints for python/pyspark/taskcontext.py and python/pyspark/version.py Key: SPARK-37139 URL: https://issues.apache.org/jira/browse/SPARK-37139 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37116) Allow sequences (tuples and lists) as pivot values argument in PySpark
[ https://issues.apache.org/jira/browse/SPARK-37116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434107#comment-17434107 ] dch nguyen commented on SPARK-37116: Working on this :v > Allow sequences (tuples and lists) as pivot values argument in PySpark > -- > > Key: SPARK-37116 > URL: https://issues.apache.org/jira/browse/SPARK-37116 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Minor > > Both tuples and lists are accepted by PySpark at runtime. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37116) Allow sequences (tuples and lists) as pivot values argument in PySpark
dch nguyen created SPARK-37116: -- Summary: Allow sequences (tuples and lists) as pivot values argument in PySpark Key: SPARK-37116 URL: https://issues.apache.org/jira/browse/SPARK-37116 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen Both tuples and lists are accepted by PySpark at runtime. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
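The idea of accepting any sequence for the pivot values can be sketched with a `Sequence` annotation. This is an illustration only: `normalize_pivot_values` is a hypothetical helper, not part of the PySpark API, and the string guard is an added assumption about what callers would want.

```python
from typing import Any, List, Optional, Sequence


def normalize_pivot_values(values: Optional[Sequence[Any]] = None) -> Optional[List[Any]]:
    """Hypothetical helper: accept a list or tuple of pivot values, return a list."""
    if values is None:
        return None
    # A str is also a Sequence, but passing one here is almost certainly a mistake.
    if isinstance(values, (str, bytes)):
        raise TypeError("values must be a list or tuple, not a string")
    return list(values)


print(normalize_pivot_values(("dotNET", "Java")))
print(normalize_pivot_values(["dotNET", "Java"]))
```

Annotating the parameter as `Sequence` rather than `List` lets a type checker accept both call styles, which matches what the issue says already works at runtime.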
[jira] [Created] (SPARK-37107) Inline type hints for files in python/pyspark/status.py
dch nguyen created SPARK-37107: -- Summary: Inline type hints for files in python/pyspark/status.py Key: SPARK-37107 URL: https://issues.apache.org/jira/browse/SPARK-37107 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37095) Inline type hints for files in python/pyspark/broadcast.py
[ https://issues.apache.org/jira/browse/SPARK-37095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17432804#comment-17432804 ] dch nguyen commented on SPARK-37095: working on this > Inline type hints for files in python/pyspark/broadcast.py > -- > > Key: SPARK-37095 > URL: https://issues.apache.org/jira/browse/SPARK-37095 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37095) Inline type hints for files in python/pyspark/broadcast.py
dch nguyen created SPARK-37095: -- Summary: Inline type hints for files in python/pyspark/broadcast.py Key: SPARK-37095 URL: https://issues.apache.org/jira/browse/SPARK-37095 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37083) Inline type hints for python/pyspark/accumulators.py
[ https://issues.apache.org/jira/browse/SPARK-37083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37083: --- Parent: SPARK-37094 Issue Type: Sub-task (was: Bug) > Inline type hints for python/pyspark/accumulators.py > > > Key: SPARK-37083 > URL: https://issues.apache.org/jira/browse/SPARK-37083 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36969) Inline type hints for SparkContext
[ https://issues.apache.org/jira/browse/SPARK-36969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-36969: --- Parent: SPARK-37094 Issue Type: Sub-task (was: Bug) > Inline type hints for SparkContext > -- > > Key: SPARK-36969 > URL: https://issues.apache.org/jira/browse/SPARK-36969 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > > Many files can remove > {code:python} > # type: ignore[attr-defined] > {code} > once the type hints for this file are inlined -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
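Why inlining can make `# type: ignore[attr-defined]` unnecessary is sketched below. The `Context` class and `has_active_context` function are hypothetical and stand in for a module whose stub file previously hid an attribute from the type checker; they are not taken from the PySpark source.

```python
from typing import ClassVar, Optional


class Context:
    """Hypothetical class standing in for a module that used to rely on a stub."""

    # Declared inline, so a type checker can see this attribute.
    _active: ClassVar[Optional["Context"]] = None

    @classmethod
    def get_or_create(cls) -> "Context":
        if cls._active is None:
            cls._active = cls()
        return cls._active


def has_active_context() -> bool:
    # If a stub file omitted `_active`, this access would need
    # `# type: ignore[attr-defined]`; with the inline declaration it does not.
    return Context._active is not None


print(has_active_context())
Context.get_or_create()
print(has_active_context())
```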
[jira] [Updated] (SPARK-36969) Inline type hints for SparkContext
[ https://issues.apache.org/jira/browse/SPARK-36969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-36969: --- Parent: (was: SPARK-36845) Issue Type: Bug (was: Sub-task) > Inline type hints for SparkContext > -- > > Key: SPARK-36969 > URL: https://issues.apache.org/jira/browse/SPARK-36969 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > > Many files can remove > {code:python} > # type: ignore[attr-defined] > {code} > once the type hints for this file are inlined -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37094) Inline type hints for files in python/pyspark
dch nguyen created SPARK-37094: -- Summary: Inline type hints for files in python/pyspark Key: SPARK-37094 URL: https://issues.apache.org/jira/browse/SPARK-37094 Project: Spark Issue Type: Umbrella Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37083) Inline type hints for python/pyspark/accumulators.py
[ https://issues.apache.org/jira/browse/SPARK-37083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37083: --- Parent: (was: SPARK-36845) Issue Type: Bug (was: Sub-task) > Inline type hints for python/pyspark/accumulators.py > > > Key: SPARK-37083 > URL: https://issues.apache.org/jira/browse/SPARK-37083 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37093) Inline type hints python/pyspark/streaming
[ https://issues.apache.org/jira/browse/SPARK-37093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37093: --- Issue Type: Umbrella (was: Bug) > Inline type hints python/pyspark/streaming > -- > > Key: SPARK-37093 > URL: https://issues.apache.org/jira/browse/SPARK-37093 > Project: Spark > Issue Type: Umbrella > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37042) Inline type hints for kinesis.py and listener.py in python/pyspark/streaming
[ https://issues.apache.org/jira/browse/SPARK-37042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37042: --- Parent: SPARK-37093 Issue Type: Sub-task (was: Bug) > Inline type hints for kinesis.py and listener.py in python/pyspark/streaming > > > Key: SPARK-37042 > URL: https://issues.apache.org/jira/browse/SPARK-37042 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37042) Inline type hints for kinesis.py and listener.py in python/pyspark/streaming
[ https://issues.apache.org/jira/browse/SPARK-37042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37042: --- Parent: (was: SPARK-36845) Issue Type: Bug (was: Sub-task) > Inline type hints for kinesis.py and listener.py in python/pyspark/streaming > > > Key: SPARK-37042 > URL: https://issues.apache.org/jira/browse/SPARK-37042 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37015) Inline type hints for python/pyspark/streaming/dstream.py
[ https://issues.apache.org/jira/browse/SPARK-37015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37015: --- Parent: SPARK-37093 Issue Type: Sub-task (was: Bug) > Inline type hints for python/pyspark/streaming/dstream.py > - > > Key: SPARK-37015 > URL: https://issues.apache.org/jira/browse/SPARK-37015 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37014) Inline type hints for python/pyspark/streaming/context.py
[ https://issues.apache.org/jira/browse/SPARK-37014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37014: --- Parent: SPARK-37093 Issue Type: Sub-task (was: Bug) > Inline type hints for python/pyspark/streaming/context.py > - > > Key: SPARK-37014 > URL: https://issues.apache.org/jira/browse/SPARK-37014 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37014) Inline type hints for python/pyspark/streaming/context.py
[ https://issues.apache.org/jira/browse/SPARK-37014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37014: --- Parent: (was: SPARK-36845) Issue Type: Bug (was: Sub-task) > Inline type hints for python/pyspark/streaming/context.py > - > > Key: SPARK-37014 > URL: https://issues.apache.org/jira/browse/SPARK-37014 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37093) Inline type hints python/pyspark/streaming
dch nguyen created SPARK-37093: -- Summary: Inline type hints python/pyspark/streaming Key: SPARK-37093 URL: https://issues.apache.org/jira/browse/SPARK-37093 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36845) Inline type hint files
[ https://issues.apache.org/jira/browse/SPARK-36845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17432755#comment-17432755 ] dch nguyen commented on SPARK-36845: [~ueshin], yes, i will > Inline type hint files > -- > > Key: SPARK-36845 > URL: https://issues.apache.org/jira/browse/SPARK-36845 > Project: Spark > Issue Type: Umbrella > Components: PySpark, SQL >Affects Versions: 3.3.0 >Reporter: Takuya Ueshin >Assignee: Xinrong Meng >Priority: Major > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
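The difference between a stub file and inline hints, as described in the issue above, can be sketched as follows. The `take` function is a hypothetical example and is not taken from the PySpark code base.

```python
from typing import Iterable, List, TypeVar

T = TypeVar("T")

# Stub style: a separate .pyi file would hold only the signature, e.g.
#   def take(items: Iterable[T], n: int) -> List[T]: ...
# while the unannotated body in the .py file typically goes unchecked.


def take(items: Iterable[T], n: int) -> List[T]:
    """Hypothetical function with inline hints, so its body is type-checked too."""
    result: List[T] = []
    for item in items:
        if len(result) >= n:
            break
        result.append(item)
    return result


print(take([1, 2, 3], 2))
```

With the annotations in the function itself, a mistake inside the body (for example, returning a `set` instead of a `list`) becomes visible to a static checker, which is the benefit the umbrella issue is after.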
[jira] [Created] (SPARK-37083) Inline type hints for python/pyspark/accumulators.py
dch nguyen created SPARK-37083: -- Summary: Inline type hints for python/pyspark/accumulators.py Key: SPARK-37083 URL: https://issues.apache.org/jira/browse/SPARK-37083 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37055) Apply 'compute.eager_check' across all the codebase
[ https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37055: --- Description: As [~hyukjin.kwon] suggested: 1. Cover every input validation like this with the new configuration. For example: {code:python} - a == b + def eager_check(f): # Utility function + return not config.compute.eager_check and f() + + eager_check(lambda: a == b) {code} 2. Check whether the output still makes sense even when the behaviour does not match pandas'. If the output does not make sense, we shouldn't cover it with this configuration. 3. Enable this configuration by default so that we match pandas' behaviour by default. We must make sure to list which APIs are affected in the description of 'compute.eager_check'. > Apply 'compute.eager_check' across all the codebase > --- > > Key: SPARK-37055 > URL: https://issues.apache.org/jira/browse/SPARK-37055 > Project: Spark > Issue Type: Umbrella > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > > As [~hyukjin.kwon] suggested: > 1. Cover every input validation like this with the new configuration. > For example: > {code:python} > - a == b > + def eager_check(f): # Utility function > + return not config.compute.eager_check and f() > + > + eager_check(lambda: a == b) > {code} > 2. Check whether the output still makes sense even when the behaviour does not > match pandas'. If the output does not make sense, we shouldn't cover > it with this configuration. > 3. Enable this configuration by default so that we match pandas' behaviour > by default. > > We must make sure to list which APIs are affected in the description of > 'compute.eager_check'. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37055) Apply 'compute.eager_check' across all the codebase
dch nguyen created SPARK-37055: -- Summary: Apply 'compute.eager_check' across all the codebase Key: SPARK-37055 URL: https://issues.apache.org/jira/browse/SPARK-37055 Project: Spark Issue Type: Umbrella Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37042) Inline type hints for kinesis.py and listener.py in python/pyspark/streaming
[ https://issues.apache.org/jira/browse/SPARK-37042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37042: --- Summary: Inline type hints for kinesis.py and listener.py in python/pyspark/streaming (was: Inline type hints for python/pyspark/streaming/kinesis.py) > Inline type hints for kinesis.py and listener.py in python/pyspark/streaming > > > Key: SPARK-37042 > URL: https://issues.apache.org/jira/browse/SPARK-37042 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37042) Inline type hints for python/pyspark/streaming/kinesis.py
dch nguyen created SPARK-37042: -- Summary: Inline type hints for python/pyspark/streaming/kinesis.py Key: SPARK-37042 URL: https://issues.apache.org/jira/browse/SPARK-37042 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37042) Inline type hints for python/pyspark/streaming/kinesis.py
[ https://issues.apache.org/jira/browse/SPARK-37042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429918#comment-17429918 ] dch nguyen commented on SPARK-37042: i am working on this > Inline type hints for python/pyspark/streaming/kinesis.py > - > > Key: SPARK-37042 > URL: https://issues.apache.org/jira/browse/SPARK-37042 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37002) Introduce the 'compute.eager_check' option
[ https://issues.apache.org/jira/browse/SPARK-37002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37002: --- Summary: Introduce the 'compute.eager_check' option (was: Introduce the 'compute.check_identical_indices' option) > Introduce the 'compute.eager_check' option > -- > > Key: SPARK-37002 > URL: https://issues.apache.org/jira/browse/SPARK-37002 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > > https://issues.apache.org/jira/browse/SPARK-36968 > [https://github.com/apache/spark/pull/34235] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37033) Inline type hints for python/pyspark/resource/requests.py
dch nguyen created SPARK-37033: -- Summary: Inline type hints for python/pyspark/resource/requests.py Key: SPARK-37033 URL: https://issues.apache.org/jira/browse/SPARK-37033 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37033) Inline type hints for python/pyspark/resource/requests.py
[ https://issues.apache.org/jira/browse/SPARK-37033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429802#comment-17429802 ] dch nguyen commented on SPARK-37033: working on this! > Inline type hints for python/pyspark/resource/requests.py > - > > Key: SPARK-37033 > URL: https://issues.apache.org/jira/browse/SPARK-37033 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37014) Inline type hints for python/pyspark/streaming/context.py
[ https://issues.apache.org/jira/browse/SPARK-37014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429112#comment-17429112 ] dch nguyen commented on SPARK-37014: working on this > Inline type hints for python/pyspark/streaming/context.py > - > > Key: SPARK-37014 > URL: https://issues.apache.org/jira/browse/SPARK-37014 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major
[jira] [Commented] (SPARK-37015) Inline type hints for python/pyspark/streaming/dstream.py
[ https://issues.apache.org/jira/browse/SPARK-37015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429111#comment-17429111 ] dch nguyen commented on SPARK-37015: working on this > Inline type hints for python/pyspark/streaming/dstream.py > - > > Key: SPARK-37015 > URL: https://issues.apache.org/jira/browse/SPARK-37015 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major
[jira] [Created] (SPARK-37015) Inline type hints for python/pyspark/streaming/dstream.py
dch nguyen created SPARK-37015: -- Summary: Inline type hints for python/pyspark/streaming/dstream.py Key: SPARK-37015 URL: https://issues.apache.org/jira/browse/SPARK-37015 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen
[jira] [Created] (SPARK-37014) Inline type hints for python/pyspark/streaming/context.py
dch nguyen created SPARK-37014: -- Summary: Inline type hints for python/pyspark/streaming/context.py Key: SPARK-37014 URL: https://issues.apache.org/jira/browse/SPARK-37014 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen
[jira] [Commented] (SPARK-36525) DS V2 Index Support
[ https://issues.apache.org/jira/browse/SPARK-36525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429077#comment-17429077 ] dch nguyen commented on SPARK-36525: [~huaxingao] yes, i'd like to > DS V2 Index Support > --- > > Key: SPARK-36525 > URL: https://issues.apache.org/jira/browse/SPARK-36525 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Huaxin Gao >Priority: Major
>
> Many data sources support indexes to improve query performance. In order to take advantage of the index support in a data source, the following APIs will be added for working with indexes:
> {code:java}
> /**
>  * Creates an index.
>  *
>  * @param indexName the name of the index to be created
>  * @param indexType the IndexType of the index to be created
>  * @param table the table on which index to be created
>  * @param columns the columns on which index to be created
>  * @param properties the properties of the index to be created
>  * @throws IndexAlreadyExistsException If the index already exists (optional)
>  * @throws UnsupportedOperationException If create index is not a supported operation
>  */
> void createIndex(String indexName,
>     String indexType,
>     Identifier table,
>     FieldReference[] columns,
>     Map properties)
>     throws IndexAlreadyExistsException, UnsupportedOperationException;
>
> /**
>  * Soft deletes the index with the given name.
>  * Deleted index can be restored by calling restoreIndex.
>  *
>  * @param indexName the name of the index to be deleted
>  * @return true if the index is deleted
>  * @throws NoSuchIndexException If the index does not exist (optional)
>  * @throws UnsupportedOperationException If delete index is not a supported operation
>  */
> default boolean deleteIndex(String indexName)
>     throws NoSuchIndexException, UnsupportedOperationException
>
> /**
>  * Checks whether an index exists.
>  *
>  * @param indexName the name of the index
>  * @return true if the index exists, false otherwise
>  */
> boolean indexExists(String indexName);
>
> /**
>  * Lists all the indexes in a table.
>  *
>  * @param table the table to be checked on for indexes
>  * @throws NoSuchTableException
>  */
> Index[] listIndexes(Identifier table) throws NoSuchTableException;
>
> /**
>  * Hard deletes the index with the given name.
>  * The Index can't be restored once dropped.
>  *
>  * @param indexName the name of the index to be dropped.
>  * @return true if the index is dropped
>  * @throws NoSuchIndexException If the index does not exist (optional)
>  * @throws UnsupportedOperationException If drop index is not a supported operation
>  */
> boolean dropIndex(String indexName) throws NoSuchIndexException, UnsupportedOperationException;
>
> /**
>  * Restores the index with the given name.
>  * Deleted index can be restored by calling restoreIndex, but dropped index can't be restored.
>  *
>  * @param indexName the name of the index to be restored
>  * @return true if the index is restored
>  * @throws NoSuchIndexException If the index does not exist (optional)
>  * @throws UnsupportedOperationException
>  */
> default boolean restoreIndex(String indexName)
>     throws NoSuchIndexException, UnsupportedOperationException
>
> /**
>  * Refreshes index using the latest data. This causes the index to be rebuilt.
>  *
>  * @param indexName the name of the index to be rebuilt
>  * @return true if the index is rebuilt
>  * @throws NoSuchIndexException If the index does not exist (optional)
>  * @throws UnsupportedOperationException
>  */
> default boolean refreshIndex(String indexName)
>     throws NoSuchIndexException, UnsupportedOperationException
>
> /**
>  * Alter Index using the new property. This causes the index to be rebuilt.
>  *
>  * @param indexName the name of the index to be altered
>  * @return true if the index is altered
>  * @throws NoSuchIndexException If the index does not exist (optional)
>  * @throws UnsupportedOperationException
>  */
> default boolean alterIndex(String indexName, Properties properties)
>     throws NoSuchIndexException, UnsupportedOperationException
> {code}
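The difference between the soft `deleteIndex`, `restoreIndex`, and the hard `dropIndex` in the quoted proposal can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, written in Python rather than against Spark: the class `InMemoryIndexCatalog`, its method names, and the two exception classes are hypothetical stand-ins, and the choice that a soft-deleted index reports `index_exists() == False` is an assumption the proposal does not spell out.

```python
from typing import Dict, List, Optional


class IndexAlreadyExists(Exception):
    """Hypothetical stand-in for IndexAlreadyExistsException."""


class NoSuchIndex(Exception):
    """Hypothetical stand-in for NoSuchIndexException."""


class InMemoryIndexCatalog:
    """Toy model of the index lifecycle; not Spark's implementation."""

    def __init__(self) -> None:
        self._active: Dict[str, dict] = {}   # usable indexes
        self._deleted: Dict[str, dict] = {}  # soft-deleted, still restorable

    def create_index(self, name: str, index_type: str, columns: List[str],
                     properties: Optional[dict] = None) -> None:
        # Assumption: a soft-deleted index still reserves its name.
        if name in self._active or name in self._deleted:
            raise IndexAlreadyExists(name)
        self._active[name] = {"type": index_type, "columns": list(columns),
                              "properties": dict(properties or {})}

    def index_exists(self, name: str) -> bool:
        # Assumption: only active (not soft-deleted) indexes count as existing.
        return name in self._active

    def delete_index(self, name: str) -> bool:
        # Soft delete: move the index aside so it can be restored later.
        if name not in self._active:
            raise NoSuchIndex(name)
        self._deleted[name] = self._active.pop(name)
        return True

    def restore_index(self, name: str) -> bool:
        # Only soft-deleted indexes can come back.
        if name not in self._deleted:
            raise NoSuchIndex(name)
        self._active[name] = self._deleted.pop(name)
        return True

    def drop_index(self, name: str) -> bool:
        # Hard delete: remove the index from both maps; nothing to restore.
        if name not in self._active and name not in self._deleted:
            raise NoSuchIndex(name)
        self._active.pop(name, None)
        self._deleted.pop(name, None)
        return True
```

In this model, `delete_index` followed by `restore_index` brings the index back, while `restore_index` after `drop_index` raises `NoSuchIndex`, which mirrors the "dropped index can't be restored" wording in the description.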
[jira] [Commented] (SPARK-36525) DS V2 Index Support
[ https://issues.apache.org/jira/browse/SPARK-36525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428676#comment-17428676 ] dch nguyen commented on SPARK-36525: [~huaxingao], Should we do these functions for supportsIndex in JDBC for the other dialects like Oracle, Postgres, etc.? > DS V2 Index Support > --- > > Key: SPARK-36525 > URL: https://issues.apache.org/jira/browse/SPARK-36525 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Huaxin Gao >Priority: Major
[jira] [Updated] (SPARK-37002) Introduce the 'compute.check_identical_indices' option
[ https://issues.apache.org/jira/browse/SPARK-37002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-37002: --- Summary: Introduce the 'compute.check_identical_indices' option (was: Introduce the 'compute.check_identical_index' option) > Introduce the 'compute.check_identical_indices' option > -- > > Key: SPARK-37002 > URL: https://issues.apache.org/jira/browse/SPARK-37002 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > > https://issues.apache.org/jira/browse/SPARK-36968 > [https://github.com/apache/spark/pull/34235]
[jira] [Created] (SPARK-37002) Introduce the 'compute.check_identical_index' option
dch nguyen created SPARK-37002: -- Summary: Introduce the 'compute.check_identical_index' option Key: SPARK-37002 URL: https://issues.apache.org/jira/browse/SPARK-37002 Project: Spark Issue Type: Improvement Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen https://issues.apache.org/jira/browse/SPARK-36968 [https://github.com/apache/spark/pull/34235]
[jira] [Created] (SPARK-36973) Deduplicate prepare data method for HistogramPlotBase and KdePlotBase
dch nguyen created SPARK-36973: -- Summary: Deduplicate prepare data method for HistogramPlotBase and KdePlotBase Key: SPARK-36973 URL: https://issues.apache.org/jira/browse/SPARK-36973 Project: Spark Issue Type: Improvement Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen
[jira] [Updated] (SPARK-36969) Inline type hints for SparkContext
[ https://issues.apache.org/jira/browse/SPARK-36969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-36969: --- Summary: Inline type hints for SparkContext (was: Inline type hints for python/pyspark/context.py) > Inline type hints for SparkContext > -- > > Key: SPARK-36969 > URL: https://issues.apache.org/jira/browse/SPARK-36969 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > > Many files can remove > {code:java} > # type: ignore[attr-defined] > {code} > once type hints are inlined in this file
[jira] [Commented] (SPARK-36969) Inline type hints for python/pyspark/context.py
[ https://issues.apache.org/jira/browse/SPARK-36969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17426924#comment-17426924 ] dch nguyen commented on SPARK-36969: working on this > Inline type hints for python/pyspark/context.py > --- > > Key: SPARK-36969 > URL: https://issues.apache.org/jira/browse/SPARK-36969 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > > Many files can remove > {code:java} > # type: ignore[attr-defined] > {code} > once type hints are inlined in this file
[jira] [Created] (SPARK-36969) Inline type hints for python/pyspark/context.py
dch nguyen created SPARK-36969: -- Summary: Inline type hints for python/pyspark/context.py Key: SPARK-36969 URL: https://issues.apache.org/jira/browse/SPARK-36969 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen Many files can remove {code:java} # type: ignore[attr-defined] {code} once type hints are inlined in this file
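As a rough illustration of why inlining helps: when a `.pyi` stub sits next to a module, mypy reads only the stub, so attributes that the stub does not declare trigger `attr-defined` errors in callers, which then get silenced with `# type: ignore[attr-defined]`. Once the annotations live in the `.py` file itself, the checker sees the attribute and the suppression can go. The sketch below is a minimal, self-contained example of the inlined form; the `Context` class, its `_active` attribute, and `active_app_name` are hypothetical names, not taken from `python/pyspark/context.py`.

```python
from typing import ClassVar, Optional


class Context:
    """Hypothetical class whose hints are written inline, not in a .pyi stub."""

    # Declared inline, so type checkers can see this "private" attribute.
    _active: ClassVar[Optional["Context"]] = None

    def __init__(self, app_name: str) -> None:
        self.app_name: str = app_name
        Context._active = self


def active_app_name() -> Optional[str]:
    # With a stub that omitted `_active`, this line would have needed
    # "# type: ignore[attr-defined]"; with inline hints it type-checks as is.
    ctx = Context._active
    return ctx.app_name if ctx is not None else None
```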
[jira] [Updated] (SPARK-36968) ps.Series.dot raise "matrices are not aligned" if index is not same
[ https://issues.apache.org/jira/browse/SPARK-36968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-36968: --- Summary: ps.Series.dot raise "matrices are not aligned" if index is not same (was: ps.Series.dot raise ValueError "matrices are not aligned" if index is not same) > ps.Series.dot raise "matrices are not aligned" if index is not same > --- > > Key: SPARK-36968 > URL: https://issues.apache.org/jira/browse/SPARK-36968 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major
[jira] [Updated] (SPARK-36968) ps.Series.dot raise ValueError "matrices are not aligned" if index is not same
[ https://issues.apache.org/jira/browse/SPARK-36968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-36968: --- Summary: ps.Series.dot raise ValueError "matrices are not aligned" if index is not same (was: ps.Series.dot raise ValueError matrices are not aligned if index is not same) > ps.Series.dot raise ValueError "matrices are not aligned" if index is not same > -- > > Key: SPARK-36968 > URL: https://issues.apache.org/jira/browse/SPARK-36968 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major
[jira] [Created] (SPARK-36968) ps.Series.dot raise ValueError matrices are not aligned if index is not same
dch nguyen created SPARK-36968: -- Summary: ps.Series.dot raise ValueError matrices are not aligned if index is not same Key: SPARK-36968 URL: https://issues.apache.org/jira/browse/SPARK-36968 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen
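The behavior named in this issue title can be sketched in plain Python. The example models each series as a dict from index label to value and refuses to compute the dot product when the two label sets differ. It is an illustrative sketch only: `aligned_dot` is a hypothetical helper, not the pandas-on-Spark implementation, and it assumes unique index labels.

```python
from typing import Dict, Hashable


def aligned_dot(left: Dict[Hashable, float], right: Dict[Hashable, float]) -> float:
    """Dot product of two label-indexed series (hypothetical helper)."""
    # The check from the issue title: both operands must carry the same labels.
    if set(left) != set(right):
        raise ValueError("matrices are not aligned")
    # Multiply values that share a label; label order does not matter.
    return sum(value * right[label] for label, value in left.items())
```

For example, `aligned_dot({"a": 1.0, "b": 2.0}, {"b": 3.0, "a": 4.0})` pairs values by label, whereas passing `{"a": 1.0}` and `{"c": 1.0}` raises `ValueError`.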
[jira] [Commented] (SPARK-36952) Inline type hints for python/pyspark/resource/information.py and python/pyspark/resource/profile.py
[ https://issues.apache.org/jira/browse/SPARK-36952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425949#comment-17425949 ] dch nguyen commented on SPARK-36952: working on this > Inline type hints for python/pyspark/resource/information.py and > python/pyspark/resource/profile.py > --- > > Key: SPARK-36952 > URL: https://issues.apache.org/jira/browse/SPARK-36952 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dgd_contributor >Priority: Major > > Inline type hints for python/pyspark/resource/information.py and > python/pyspark/resource/profile.py
[jira] [Created] (SPARK-36945) Inline type hints for python/pyspark/sql/udf.py
dch nguyen created SPARK-36945: -- Summary: Inline type hints for python/pyspark/sql/udf.py Key: SPARK-36945 URL: https://issues.apache.org/jira/browse/SPARK-36945 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen
[jira] [Commented] (SPARK-36945) Inline type hints for python/pyspark/sql/udf.py
[ https://issues.apache.org/jira/browse/SPARK-36945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425301#comment-17425301 ] dch nguyen commented on SPARK-36945: working on this > Inline type hints for python/pyspark/sql/udf.py > --- > > Key: SPARK-36945 > URL: https://issues.apache.org/jira/browse/SPARK-36945 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major
[jira] [Created] (SPARK-36944) Remove unused python/pyspark/sql/__init__.pyi
dch nguyen created SPARK-36944: -- Summary: Remove unused python/pyspark/sql/__init__.pyi Key: SPARK-36944 URL: https://issues.apache.org/jira/browse/SPARK-36944 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen
[jira] [Updated] (SPARK-36938) Inline type hints for group.py in python/pyspark/sql
[ https://issues.apache.org/jira/browse/SPARK-36938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dch nguyen updated SPARK-36938: --- Summary: Inline type hints for group.py in python/pyspark/sql (was: nline type hints for group.py in python/pyspark/sql ) > Inline type hints for group.py in python/pyspark/sql > - > > Key: SPARK-36938 > URL: https://issues.apache.org/jira/browse/SPARK-36938 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major
[jira] [Created] (SPARK-36938) nline type hints for group.py in python/pyspark/sql
dch nguyen created SPARK-36938: -- Summary: nline type hints for group.py in python/pyspark/sql Key: SPARK-36938 URL: https://issues.apache.org/jira/browse/SPARK-36938 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: dch nguyen
[jira] [Commented] (SPARK-36938) Inline type hints for group.py in python/pyspark/sql
[ https://issues.apache.org/jira/browse/SPARK-36938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17424929#comment-17424929 ] dch nguyen commented on SPARK-36938: i am working on this > Inline type hints for group.py in python/pyspark/sql > - > > Key: SPARK-36938 > URL: https://issues.apache.org/jira/browse/SPARK-36938 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major