[jira] [Updated] (SPARK-39574) Better error message when `ps.Index` is used for DataFrame/Series creation

2022-06-23 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39574: - Summary: Better error message when `ps.Index` is used for DataFrame/Series creation (was:

[jira] [Created] (SPARK-39574) Better error message when ps.Index is used for DataFrame/Series creation

2022-06-23 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39574: Summary: Better error message when ps.Index is used for DataFrame/Series creation Key: SPARK-39574 URL: https://issues.apache.org/jira/browse/SPARK-39574 Project:

[jira] [Updated] (SPARK-39494) Support `createDataFrame` from a list of scalars

2022-06-21 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39494: - Description: Currently, DataFrame creation from a list of scalars is unsupported as below: |>>>

[jira] [Updated] (SPARK-39494) Support `createDataFrame` from a list of scalars

2022-06-21 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39494: - Description: Currently, DataFrame creation from a list of scalars is unsupported as below: |>>>

[jira] [Updated] (SPARK-39550) Fix `MultiIndex.value_counts()` when Arrow Execution is enabled

2022-06-21 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39550: - Description:   When Arrow Execution is enabled, {code:java} >>>

[jira] [Commented] (SPARK-39550) Fix `MultiIndex.value_counts()` when Arrow Execution is enabled

2022-06-21 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557145#comment-17557145 ] Xinrong Meng commented on SPARK-39550: -- I am working on that. > Fix `MultiIndex.value_counts()`

[jira] [Created] (SPARK-39550) Fix `MultiIndex.value_counts()` when Arrow Execution is enabled

2022-06-21 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39550: Summary: Fix `MultiIndex.value_counts()` when Arrow Execution is enabled Key: SPARK-39550 URL: https://issues.apache.org/jira/browse/SPARK-39550 Project: Spark

[jira] [Created] (SPARK-39494) Support `createDataFrame` from a list of scalars

2022-06-16 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39494: Summary: Support `createDataFrame` from a list of scalars Key: SPARK-39494 URL: https://issues.apache.org/jira/browse/SPARK-39494 Project: Spark Issue Type:

[jira] [Created] (SPARK-39483) Construct the schema from `np.dtype` when `createDataFrame` from a NumPy array

2022-06-15 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39483: Summary: Construct the schema from `np.dtype` when `createDataFrame` from a NumPy array Key: SPARK-39483 URL: https://issues.apache.org/jira/browse/SPARK-39483

[jira] [Created] (SPARK-39443) Improve docstring of pyspark.sql.functions.col/first

2022-06-10 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39443: Summary: Improve docstring of pyspark.sql.functions.col/first Key: SPARK-39443 URL: https://issues.apache.org/jira/browse/SPARK-39443 Project: Spark Issue

[jira] [Updated] (SPARK-39405) NumPy support in SQL

2022-06-07 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39405: - Description: NumPy is the fundamental package for scientific computing with Python. It is very

[jira] [Updated] (SPARK-39405) NumPy support in SQL

2022-06-07 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39405: - Description: NumPy is the fundamental package for scientific computing with Python. It is very

[jira] [Updated] (SPARK-39406) Accept NumPy array in createDataFrame

2022-06-07 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39406: - Summary: Accept NumPy array in createDataFrame (was: Accept numpy array in createDataFrame) >

[jira] [Created] (SPARK-39406) Accept numpy array in createDataFrame

2022-06-07 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39406: Summary: Accept numpy array in createDataFrame Key: SPARK-39406 URL: https://issues.apache.org/jira/browse/SPARK-39406 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-39405) NumPy support in SQL

2022-06-07 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39405: Summary: NumPy support in SQL Key: SPARK-39405 URL: https://issues.apache.org/jira/browse/SPARK-39405 Project: Spark Issue Type: Umbrella

[jira] [Updated] (SPARK-39262) Correct the behavior of creating DataFrame from an RDD

2022-05-26 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39262: - Description: Correct the behavior of creating DataFrame from an RDD **with `0` or an empty

[jira] [Updated] (SPARK-39262) Correct the behavior of creating DataFrame from an RDD

2022-05-26 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39262: - Summary: Correct the behavior of creating DataFrame from an RDD (was: Correct error messages

[jira] [Updated] (SPARK-39048) Refactor `GroupBy._reduce_for_stat_function` on accepted data types

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39048: - Parent: SPARK-39076 Issue Type: Sub-task (was: Improvement) > Refactor

[jira] [Updated] (SPARK-38880) Implement `numeric_only` parameter of `GroupBy.max/min`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38880: - Parent: SPARK-39076 Issue Type: Sub-task (was: Improvement) > Implement `numeric_only`

[jira] [Updated] (SPARK-39000) Convert bools to ints in basic statistical functions of GroupBy objects

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39000: - Parent: SPARK-39076 Issue Type: Sub-task (was: Improvement) > Convert bools to ints in

[jira] [Updated] (SPARK-39227) Reach parity with pandas boolean cast

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39227: - Parent: SPARK-39076 Issue Type: Sub-task (was: Improvement) > Reach parity with pandas

[jira] [Updated] (SPARK-38952) Implement `numeric_only` of `GroupBy.first` and `GroupBy.last`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38952: - Parent: SPARK-39076 Issue Type: Sub-task (was: Improvement) > Implement `numeric_only`

[jira] [Updated] (SPARK-38763) Pandas API on spark Can`t apply lamda to columns.

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38763: - Parent: SPARK-39199 Issue Type: Sub-task (was: Bug) > Pandas API on spark Can`t apply

[jira] [Updated] (SPARK-38766) Support lambda `column` parameter of `DataFrame.rename`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38766: - Parent: (was: SPARK-39199) Issue Type: Improvement (was: Sub-task) > Support

[jira] [Updated] (SPARK-38387) Support `na_action` and Series input correspondence in `Series.map`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38387: - Parent: SPARK-39199 Issue Type: Sub-task (was: New Feature) > Support `na_action` and

[jira] [Updated] (SPARK-38766) Support lambda `column` parameter of `DataFrame.rename`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38766: - Parent: SPARK-39199 Issue Type: Sub-task (was: Bug) > Support lambda `column`

[jira] [Updated] (SPARK-38400) Enable Series.rename to change index labels

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38400: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Enable Series.rename to

[jira] [Updated] (SPARK-38491) Support `ignore_index` of `Series.sort_values`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38491: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Support `ignore_index`

[jira] [Updated] (SPARK-38518) Implement `skipna` of `Series.all/Index.all` to exclude NA/null values

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38518: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `skipna` of

[jira] [Updated] (SPARK-38441) Support string and bool `regex` in `Series.replace`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38441: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Support string and bool

[jira] [Updated] (SPARK-38479) Add `Series.duplicated` to indicate duplicate Series values.

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38479: - Parent: SPARK-39199 Issue Type: Sub-task (was: New Feature) > Add `Series.duplicated`

[jira] [Updated] (SPARK-38576) Implement `numeric_only` parameter for `DataFrame/Series.rank` to rank numeric columns only

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38576: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `numeric_only`

[jira] [Updated] (SPARK-38608) Implement `bool_only` parameter of `DataFrame.all` and`DataFrame.any`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38608: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `bool_only`

[jira] [Updated] (SPARK-38552) Implement `keep` parameter of `frame.nlargest/nsmallest` to decide how to resolve ties

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38552: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `keep`

[jira] [Updated] (SPARK-38686) Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38686: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `keep`

[jira] [Updated] (SPARK-38704) Support string `inclusive` parameter of `Series.between`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38704: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Support string

[jira] [Updated] (SPARK-38726) Support `how` parameter of `MultiIndex.dropna`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38726: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Support `how` parameter

[jira] [Updated] (SPARK-38765) Implement `inplace` parameter of `Series.clip`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38765: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `inplace`

[jira] [Updated] (SPARK-38837) Implement `dropna` parameter of `SeriesGroupBy.value_counts`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38837: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `dropna`

[jira] [Updated] (SPARK-38863) Implement `skipna` parameter of `DataFrame.all`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38863: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `skipna`

[jira] [Updated] (SPARK-38793) Support `return_indexer` parameter of `Index/MultiIndex.sort_values`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38793: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Support `return_indexer`

[jira] [Updated] (SPARK-38903) Implement `ignore_index` of `Series.sort_values` and `Series.sort_index`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38903: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `ignore_index`

[jira] [Updated] (SPARK-38890) Implement `ignore_index` of `DataFrame.sort_index`.

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38890: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `ignore_index`

[jira] [Updated] (SPARK-38938) Implement `inplace` and `columns` parameters of `Series.drop`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38938: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `inplace` and

[jira] [Updated] (SPARK-38989) Implement `ignore_index` of `DataFrame/Series.sample`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38989: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `ignore_index`

[jira] [Updated] (SPARK-39201) Implement `ignore_index` of `DataFrame.explode` and `DataFrame.drop_duplicates`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39201: - Parent: SPARK-39199 Issue Type: Sub-task (was: Improvement) > Implement `ignore_index`

[jira] [Updated] (SPARK-39201) Implement `ignore_index` of `DataFrame.explode` and `DataFrame.drop_duplicates`

2022-05-24 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39201: - Issue Type: Improvement (was: Umbrella) > Implement `ignore_index` of `DataFrame.explode` and

[jira] [Updated] (SPARK-39262) Correct error messages when creating DataFrame from an RDD with the first element `0`

2022-05-23 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39262: - Description: Correct error messages when creating DataFrame from an RDD with the first element

[jira] [Updated] (SPARK-39262) Correct error messages when creating DataFrame from an RDD with the first element `0`

2022-05-23 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39262: - Summary: Correct error messages when creating DataFrame from an RDD with the first element `0`

[jira] [Created] (SPARK-39262) Correct error messages when creating DataFrame from an RDD with the first row is `0`

2022-05-23 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39262: Summary: Correct error messages when creating DataFrame from an RDD with the first row is `0` Key: SPARK-39262 URL: https://issues.apache.org/jira/browse/SPARK-39262

[jira] [Updated] (SPARK-39199) Implement pandas API missing parameters

2022-05-20 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39199: - Description: pandas API on Spark aims to make pandas code work on Spark clusters without any

[jira] [Updated] (SPARK-37525) Timedelta support in pandas API on Spark

2022-05-18 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-37525: - Summary: Timedelta support in pandas API on Spark (was: Support TimedeltaIndex in pandas API

[jira] [Resolved] (SPARK-37525) Support TimedeltaIndex in pandas API on Spark

2022-05-18 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng resolved SPARK-37525. -- Resolution: Resolved > Support TimedeltaIndex in pandas API on Spark >

[jira] [Created] (SPARK-39228) Implement `skipna` of `Series.argmax`

2022-05-18 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39228: Summary: Implement `skipna` of `Series.argmax` Key: SPARK-39228 URL: https://issues.apache.org/jira/browse/SPARK-39228 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-39227) Reach parity with pandas boolean cast

2022-05-18 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39227: - Description: There are pandas APIs that need boolean casts: all, any. Currently,

[jira] [Created] (SPARK-39227) Reach parity with pandas boolean cast

2022-05-18 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39227: Summary: Reach parity with pandas boolean cast Key: SPARK-39227 URL: https://issues.apache.org/jira/browse/SPARK-39227 Project: Spark Issue Type:

[jira] [Created] (SPARK-39201) Implement `ignore_index` of `DataFrame.explode` and `DataFrame.drop_duplicates`

2022-05-16 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39201: Summary: Implement `ignore_index` of `DataFrame.explode` and `DataFrame.drop_duplicates` Key: SPARK-39201 URL: https://issues.apache.org/jira/browse/SPARK-39201

[jira] [Updated] (SPARK-39199) Implement pandas API missing parameters

2022-05-16 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39199: - Description: pandas API on Spark aims to achieve full pandas API coverage. Currently, most

[jira] [Updated] (SPARK-39199) Implement pandas API missing parameters

2022-05-16 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39199: - Description: pandas API on Spark aims to achieve full pandas API coverage. Currently, most

[jira] [Updated] (SPARK-38608) Implement `bool_only` parameter of `DataFrame.all` and`DataFrame.any`

2022-05-16 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38608: - Summary: Implement `bool_only` parameter of `DataFrame.all` and`DataFrame.any` (was:

[jira] [Created] (SPARK-39199) Implement pandas API missing parameters

2022-05-16 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39199: Summary: Implement pandas API missing parameters Key: SPARK-39199 URL: https://issues.apache.org/jira/browse/SPARK-39199 Project: Spark Issue Type: Umbrella

[jira] [Updated] (SPARK-37525) Support TimedeltaIndex in pandas API on Spark

2022-05-16 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-37525: - Description: Since DayTimeIntervalType is supported in PySpark, we may add TimedeltaIndex

[jira] [Created] (SPARK-39197) Implement `skipna` parameter of `GroupBy.all`

2022-05-16 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39197: Summary: Implement `skipna` parameter of `GroupBy.all` Key: SPARK-39197 URL: https://issues.apache.org/jira/browse/SPARK-39197 Project: Spark Issue Type:

[jira] [Updated] (SPARK-39155) Access to JVM through passed-in GatewayClient during type conversion

2022-05-11 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39155: - Description: Access to JVM through passed-in GatewayClient during type conversion. In

[jira] [Created] (SPARK-39155) Access to JVM through passed-in GatewayClient during type conversion

2022-05-11 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39155: Summary: Access to JVM through passed-in GatewayClient during type conversion Key: SPARK-39155 URL: https://issues.apache.org/jira/browse/SPARK-39155 Project: Spark

[jira] [Created] (SPARK-39154) Remove outdated statements on distributed-sequence default index

2022-05-11 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39154: Summary: Remove outdated statements on distributed-sequence default index Key: SPARK-39154 URL: https://issues.apache.org/jira/browse/SPARK-39154 Project: Spark

[jira] [Commented] (SPARK-38819) Run Pandas on Spark with Pandas 1.4.x

2022-05-10 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534428#comment-17534428 ] Xinrong Meng commented on SPARK-38819: -- Thanks Yikun! > Run Pandas on Spark with Pandas 1.4.x >

[jira] [Created] (SPARK-39133) Mention log level setting in PYSPARK_JVM_STACKTRACE_ENABLED

2022-05-09 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39133: Summary: Mention log level setting in PYSPARK_JVM_STACKTRACE_ENABLED Key: SPARK-39133 URL: https://issues.apache.org/jira/browse/SPARK-39133 Project: Spark

[jira] [Created] (SPARK-39109) Adjust `GroupBy.mean/median` to match pandas 1.4

2022-05-05 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39109: Summary: Adjust `GroupBy.mean/median` to match pandas 1.4 Key: SPARK-39109 URL: https://issues.apache.org/jira/browse/SPARK-39109 Project: Spark Issue Type:

[jira] [Created] (SPARK-39095) Adjust `GroupBy.std` to match pandas 1.4

2022-05-03 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39095: Summary: Adjust `GroupBy.std` to match pandas 1.4 Key: SPARK-39095 URL: https://issues.apache.org/jira/browse/SPARK-39095 Project: Spark Issue Type:

[jira] [Updated] (SPARK-39076) Standardize Statistical Functions of pandas API on Spark

2022-05-02 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39076: - Description: Statistical functions are the most commonly-used functions in Data Engineering

[jira] [Created] (SPARK-39077) Implement `skipna` of basic statistical functions of DataFrame and Series

2022-04-29 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39077: Summary: Implement `skipna` of basic statistical functions of DataFrame and Series Key: SPARK-39077 URL: https://issues.apache.org/jira/browse/SPARK-39077 Project:

[jira] [Created] (SPARK-39076) Standardize Statistical Functions of pandas API on Spark

2022-04-29 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39076: Summary: Standardize Statistical Functions of pandas API on Spark Key: SPARK-39076 URL: https://issues.apache.org/jira/browse/SPARK-39076 Project: Spark

[jira] [Created] (SPARK-39051) Minor refactoring of `python/pyspark/sql/pandas/conversion.py`

2022-04-27 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39051: Summary: Minor refactoring of `python/pyspark/sql/pandas/conversion.py` Key: SPARK-39051 URL: https://issues.apache.org/jira/browse/SPARK-39051 Project: Spark

[jira] [Updated] (SPARK-39048) Refactor `GroupBy._reduce_for_stat_function` on accepted data types

2022-04-27 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-39048: - Summary: Refactor `GroupBy._reduce_for_stat_function` on accepted data types (was: Refactor

[jira] [Created] (SPARK-39048) Refactor GroupBy._reduce_for_stat_function on accepted data types

2022-04-27 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39048: Summary: Refactor GroupBy._reduce_for_stat_function on accepted data types Key: SPARK-39048 URL: https://issues.apache.org/jira/browse/SPARK-39048 Project: Spark

[jira] [Commented] (SPARK-38988) Pandas API - "PerformanceWarning: DataFrame is highly fragmented." get printed many times.

2022-04-26 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528459#comment-17528459 ] Xinrong Meng commented on SPARK-38988: -- Thank you for raising that! I will try muting the warnings

[jira] [Created] (SPARK-39000) Convert bools to ints in basic statistical functions of GroupBy objects

2022-04-22 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39000: Summary: Convert bools to ints in basic statistical functions of GroupBy objects Key: SPARK-39000 URL: https://issues.apache.org/jira/browse/SPARK-39000 Project:

[jira] [Created] (SPARK-38991) Implement `numeric_only` of `GroupBy.mean` and `GroupBy.sum`

2022-04-21 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38991: Summary: Implement `numeric_only` of `GroupBy.mean` and `GroupBy.sum` Key: SPARK-38991 URL: https://issues.apache.org/jira/browse/SPARK-38991 Project: Spark

[jira] [Commented] (SPARK-38991) Implement `numeric_only` of `GroupBy.mean` and `GroupBy.sum`

2022-04-21 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526121#comment-17526121 ] Xinrong Meng commented on SPARK-38991: -- I am working on that. > Implement `numeric_only` of

[jira] [Created] (SPARK-38989) Implement `ignore_index` of `DataFrame/Series.sample`

2022-04-21 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38989: Summary: Implement `ignore_index` of `DataFrame/Series.sample` Key: SPARK-38989 URL: https://issues.apache.org/jira/browse/SPARK-38989 Project: Spark Issue

[jira] [Created] (SPARK-38971) Test anchor frame for in-place `Series.rename_axis`

2022-04-20 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38971: Summary: Test anchor frame for in-place `Series.rename_axis` Key: SPARK-38971 URL: https://issues.apache.org/jira/browse/SPARK-38971 Project: Spark Issue

[jira] [Updated] (SPARK-38953) Document PySpark common exceptions / errors

2022-04-19 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-38953: - Component/s: Documentation > Document PySpark common exceptions / errors >

[jira] [Created] (SPARK-38953) Document PySpark common exceptions / errors

2022-04-19 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38953: Summary: Document PySpark common exceptions / errors Key: SPARK-38953 URL: https://issues.apache.org/jira/browse/SPARK-38953 Project: Spark Issue Type:

[jira] [Created] (SPARK-38952) Implement `numeric_only` of `GroupBy.first` and `GroupBy.last`

2022-04-19 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38952: Summary: Implement `numeric_only` of `GroupBy.first` and `GroupBy.last` Key: SPARK-38952 URL: https://issues.apache.org/jira/browse/SPARK-38952 Project: Spark

[jira] [Created] (SPARK-38940) Test Series' anchor frame for in-place updates on Series

2022-04-18 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38940: Summary: Test Series' anchor frame for in-place updates on Series Key: SPARK-38940 URL: https://issues.apache.org/jira/browse/SPARK-38940 Project: Spark

[jira] [Created] (SPARK-38938) Implement `inplace` and `columns` parameters of `Series.drop`

2022-04-18 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38938: Summary: Implement `inplace` and `columns` parameters of `Series.drop` Key: SPARK-38938 URL: https://issues.apache.org/jira/browse/SPARK-38938 Project: Spark

[jira] [Created] (SPARK-38903) Implement `ignore_index` of `Series.sort_values` and `Series.sort_index`

2022-04-14 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38903: Summary: Implement `ignore_index` of `Series.sort_values` and `Series.sort_index` Key: SPARK-38903 URL: https://issues.apache.org/jira/browse/SPARK-38903 Project:

[jira] [Created] (SPARK-38890) Implement `ignore_index` of `DataFrame.sort_index`.

2022-04-13 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38890: Summary: Implement `ignore_index` of `DataFrame.sort_index`. Key: SPARK-38890 URL: https://issues.apache.org/jira/browse/SPARK-38890 Project: Spark Issue

[jira] [Created] (SPARK-38880) Implement `numeric_only` parameter of `GroupBy.max/min`

2022-04-12 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38880: Summary: Implement `numeric_only` parameter of `GroupBy.max/min` Key: SPARK-38880 URL: https://issues.apache.org/jira/browse/SPARK-38880 Project: Spark

[jira] [Created] (SPARK-38863) Implement `skipna` parameter of `DataFrame.all`

2022-04-11 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38863: Summary: Implement `skipna` parameter of `DataFrame.all` Key: SPARK-38863 URL: https://issues.apache.org/jira/browse/SPARK-38863 Project: Spark Issue Type:

[jira] [Created] (SPARK-38837) Implement `dropna` parameter of `SeriesGroupBy.value_counts`

2022-04-08 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38837: Summary: Implement `dropna` parameter of `SeriesGroupBy.value_counts` Key: SPARK-38837 URL: https://issues.apache.org/jira/browse/SPARK-38837 Project: Spark

[jira] [Created] (SPARK-38793) Support `return_indexer` parameter of `Index/MultiIndex.sort_values`

2022-04-05 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38793: Summary: Support `return_indexer` parameter of `Index/MultiIndex.sort_values` Key: SPARK-38793 URL: https://issues.apache.org/jira/browse/SPARK-38793 Project: Spark

[jira] [Commented] (SPARK-38763) Pandas API on spark Can`t apply lamda to columns.

2022-04-05 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517719#comment-17517719 ] Xinrong Meng commented on SPARK-38763: -- [~bjornjorgensen] For sure :) The fix is in Spark 3.3

[jira] [Commented] (SPARK-38763) Pandas API on spark Can`t apply lamda to columns.

2022-04-01 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516177#comment-17516177 ] Xinrong Meng commented on SPARK-38763: -- I will backport the fix after approved and merged. >

[jira] [Comment Edited] (SPARK-38763) Pandas API on spark Can`t apply lamda to columns.

2022-04-01 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516172#comment-17516172 ] Xinrong Meng edited comment on SPARK-38763 at 4/1/22 11:56 PM: --- Hi

[jira] [Resolved] (SPARK-38766) Support lambda `column` parameter of `DataFrame.rename`

2022-04-01 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng resolved SPARK-38766. -- Resolution: Duplicate > Support lambda `column` parameter of `DataFrame.rename` >

[jira] [Commented] (SPARK-38763) Pandas API on spark Can`t apply lamda to columns.

2022-04-01 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516172#comment-17516172 ] Xinrong Meng commented on SPARK-38763: -- Hi [~bjornjorgensen], thanks for raising that! The

[jira] [Commented] (SPARK-38766) Support lambda `column` parameter of `DataFrame.rename`

2022-04-01 Thread Xinrong Meng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516173#comment-17516173 ] Xinrong Meng commented on SPARK-38766: -- I am working on that. > Support lambda `column` parameter

[jira] [Created] (SPARK-38766) Support lambda `column` parameter of `DataFrame.rename`

2022-04-01 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-38766: Summary: Support lambda `column` parameter of `DataFrame.rename` Key: SPARK-38766 URL: https://issues.apache.org/jira/browse/SPARK-38766 Project: Spark
