[jira] [Commented] (SPARK-38961) Enhance to automatically generate the pandas API support list
[ https://issues.apache.org/jira/browse/SPARK-38961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601358#comment-17601358 ]

Apache Spark commented on SPARK-38961:
--------------------------------------

User 'Yikun' has created a pull request for this issue:
https://github.com/apache/spark/pull/37820

> Enhance to automatically generate the pandas API support list
> -------------------------------------------------------------
>
> Key: SPARK-38961
> URL: https://issues.apache.org/jira/browse/SPARK-38961
> Project: Spark
> Issue Type: Documentation
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Assignee: Hyunwoo Park
> Priority: Major
> Fix For: 3.4.0
>
> Currently, the supported pandas API list is manually maintained, so it would
> be better to make the list automatically generated to reduce the maintenance
> cost.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38961) Enhance to automatically generate the pandas API support list
[ https://issues.apache.org/jira/browse/SPARK-38961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581871#comment-17581871 ]

Apache Spark commented on SPARK-38961:
--------------------------------------

User 'Yikun' has created a pull request for this issue:
https://github.com/apache/spark/pull/37583

> Enhance to automatically generate the pandas API support list
> -------------------------------------------------------------
>
> Key: SPARK-38961
> URL: https://issues.apache.org/jira/browse/SPARK-38961
> Project: Spark
> Issue Type: Documentation
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Assignee: Hyunwoo Park
> Priority: Major
> Fix For: 3.4.0
>
> Currently, the supported pandas API list is manually maintained, so it would
> be better to make the list automatically generated to reduce the maintenance
> cost.
[jira] [Commented] (SPARK-38961) Enhance to automatically generate the pandas API support list
[ https://issues.apache.org/jira/browse/SPARK-38961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17545296#comment-17545296 ]

Apache Spark commented on SPARK-38961:
--------------------------------------

User 'beobest2' has created a pull request for this issue:
https://github.com/apache/spark/pull/36748

> Enhance to automatically generate the pandas API support list
> -------------------------------------------------------------
>
> Key: SPARK-38961
> URL: https://issues.apache.org/jira/browse/SPARK-38961
> Project: Spark
> Issue Type: Documentation
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Assignee: Hyunwoo Park
> Priority: Major
> Fix For: 3.4.0
>
> Currently, the supported pandas API list is manually maintained, so it would
> be better to make the list automatically generated to reduce the maintenance
> cost.
[jira] [Commented] (SPARK-38961) Enhance to automatically generate the pandas API support list
[ https://issues.apache.org/jira/browse/SPARK-38961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534795#comment-17534795 ]

Apache Spark commented on SPARK-38961:
--------------------------------------

User 'beobest2' has created a pull request for this issue:
https://github.com/apache/spark/pull/36509

> Enhance to automatically generate the pandas API support list
> -------------------------------------------------------------
>
> Key: SPARK-38961
> URL: https://issues.apache.org/jira/browse/SPARK-38961
> Project: Spark
> Issue Type: Test
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Priority: Major
>
> Currently, the supported pandas API list is manually maintained, so it would
> be better to make the list automatically generated to reduce the maintenance
> cost.
[jira] [Commented] (SPARK-38961) Enhance to automatically generate the pandas API support list
[ https://issues.apache.org/jira/browse/SPARK-38961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534012#comment-17534012 ]

Hyunwoo Park commented on SPARK-38961:
--------------------------------------

[~hyukjin.kwon] Sure! I'm working on this :)

> Enhance to automatically generate the pandas API support list
> -------------------------------------------------------------
>
> Key: SPARK-38961
> URL: https://issues.apache.org/jira/browse/SPARK-38961
> Project: Spark
> Issue Type: Test
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Priority: Major
>
> Currently, the supported pandas API list is manually maintained, so it would
> be better to make the list automatically generated to reduce the maintenance
> cost.
[jira] [Commented] (SPARK-38961) Enhance to automatically generate the pandas API support list
[ https://issues.apache.org/jira/browse/SPARK-38961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533733#comment-17533733 ]

Hyukjin Kwon commented on SPARK-38961:
--------------------------------------

[~beobest2] please go ahead with a PR! Maybe we can just put the code in a comment in the .rst file.

> Enhance to automatically generate the pandas API support list
> -------------------------------------------------------------
>
> Key: SPARK-38961
> URL: https://issues.apache.org/jira/browse/SPARK-38961
> Project: Spark
> Issue Type: Test
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Priority: Major
>
> Currently, the supported pandas API list is manually maintained, so it would
> be better to make the list automatically generated to reduce the maintenance
> cost.
[jira] [Commented] (SPARK-38961) Enhance to automatically generate the pandas API support list
[ https://issues.apache.org/jira/browse/SPARK-38961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533705#comment-17533705 ]

Yikun Jiang commented on SPARK-38961:
-------------------------------------

See also here: https://github.com/apache/spark/pull/36083#issuecomment-1101966359

> Enhance to automatically generate the pandas API support list
> -------------------------------------------------------------
>
> Key: SPARK-38961
> URL: https://issues.apache.org/jira/browse/SPARK-38961
> Project: Spark
> Issue Type: Test
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Priority: Major
>
> Currently, the supported pandas API list is manually maintained, so it would
> be better to make the list automatically generated to reduce the maintenance
> cost.
[jira] [Commented] (SPARK-38961) Enhance to automatically generate the pandas API support list
[ https://issues.apache.org/jira/browse/SPARK-38961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533699#comment-17533699 ]

Hyunwoo Park commented on SPARK-38961:
--------------------------------------

How about this way?

{code:python}
from inspect import getmembers, isclass, isfunction

import pandas as pd
from pyspark import pandas as ps

# automatically generated pyspark.pandas APIs
ps_classes = tuple(map(lambda x: x[0], getmembers(ps, isclass)))
for ps_class in ps_classes:
    for method, _ in getmembers(getattr(ps, ps_class), isfunction):
        print(f"{ps_class}.{method}")

# also it is possible to automatically create a missing list
common_classes = set(map(lambda x: x[0], getmembers(pd, isclass))) & \
    set(map(lambda x: x[0], getmembers(ps, isclass)))
print(common_classes)
# {'Series', 'DataFrame', 'MultiIndex', 'DatetimeIndex', 'NamedAgg', 'Index', 'Int64Index', 'TimedeltaIndex', 'CategoricalIndex', 'Float64Index'}

for _class in common_classes:
    not_implemented = set(
        map(lambda x: x[0], getmembers(getattr(pd, _class), isfunction))
    ) - set(
        map(lambda x: x[0], getmembers(getattr(ps, _class), isfunction))
    )
    print(f"class: {_class}")
    print(f"not_implemented: {not_implemented}")
{code}

> Enhance to automatically generate the pandas API support list
> -------------------------------------------------------------
>
> Key: SPARK-38961
> URL: https://issues.apache.org/jira/browse/SPARK-38961
> Project: Spark
> Issue Type: Test
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Priority: Major
>
> Currently, the supported pandas API list is manually maintained, so it would
> be better to make the list automatically generated to reduce the maintenance
> cost.
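
The comparison proposed in the comment above could be wrapped into a reusable helper along the following lines. This is only a sketch of the idea, not the code that was merged in any of the linked pull requests. The names public_methods and support_table are hypothetical, and the RefFrame/CandFrame classes and the two module objects are stand-ins invented here so the example runs without pandas or PySpark installed. Filtering out names that start with an underscore is also an assumption added for this sketch.

```python
from inspect import getmembers, isclass, isfunction
from types import ModuleType


def public_methods(cls):
    """Names of public plain-Python functions reachable on cls (inherited ones included)."""
    return {name for name, _ in getmembers(cls, isfunction) if not name.startswith("_")}


def support_table(reference, candidate):
    """Map each public class name present in both namespaces to (supported, missing)."""
    ref_classes = {n: c for n, c in getmembers(reference, isclass) if not n.startswith("_")}
    cand_classes = {n: c for n, c in getmembers(candidate, isclass) if not n.startswith("_")}
    table = {}
    for name in sorted(ref_classes.keys() & cand_classes.keys()):
        ref_methods = public_methods(ref_classes[name])
        cand_methods = public_methods(cand_classes[name])
        table[name] = (
            sorted(ref_methods & cand_methods),  # implemented in both
            sorted(ref_methods - cand_methods),  # only in the reference
        )
    return table


# Hypothetical stand-ins so the sketch runs without pandas or PySpark installed.
class RefFrame:
    def head(self): ...
    def tail(self): ...
    def to_numpy(self): ...


class CandFrame:
    def head(self): ...
    def tail(self): ...


reference = ModuleType("reference")
reference.DataFrame = RefFrame
candidate = ModuleType("candidate")
candidate.DataFrame = CandFrame

table = support_table(reference, candidate)
for name, (supported, missing) in table.items():
    print(f"{name}: supported={supported} missing={missing}")
```

With both libraries installed, the closest equivalent to the original snippet would be support_table(pd, ps), using the imports from that comment. One limitation carries over from the original approach: isfunction does not match properties (DataFrame.shape, for example), so attributes would need separate handling.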