itholic opened a new pull request, #43996:
URL: https://github.com/apache/spark/pull/43996

   ### What changes were proposed in this pull request?
   
   This PR proposes to fix a critical issue in the [Supported pandas API 
documentation](https://spark.apache.org/docs/latest/api/python/user_guide/pandas_on_spark/supported_pandas_api.html)
 where many essential APIs such as `DataFrame.max`, `DataFrame.min`, 
`DataFrame.mean`, and `DataFrame.median` were incorrectly marked as not 
implemented ("N"), as shown below:
   
   <img width="291" alt="Screenshot 2023-11-24 at 12 37 49 PM" 
src="https://github.com/apache/spark/assets/44108233/95c5785c-711c-400c-b2ec-0db034e90fd8">
   
   
   
   The root cause was that the script that generates the support list 
ignored functions inherited from parent classes. For instance, 
`CategoricalIndex.max` is supported through inheritance from the `Index` 
class, but because it is not implemented directly in `CategoricalIndex`, it 
was marked as unsupported:
   
   <img width="397" alt="Screenshot 2023-11-24 at 12 30 08 PM" 
src="https://github.com/apache/spark/assets/44108233/90e92996-a88a-4a20-bb0c-4909097e2688">
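
To illustrate the root cause described above, here is a minimal sketch of how a member lookup can miss inherited methods. It is not the actual generation script: the `Index` and `CategoricalIndex` classes below are simplified stand-ins, and `own_methods` and `all_methods` are hypothetical helper names chosen for this example.

```python
import inspect


# Simplified stand-ins for the real pandas-on-Spark classes (assumption).
class Index:
    def max(self):
        ...

    def min(self):
        ...


class CategoricalIndex(Index):
    def remove_categories(self):
        ...


def own_methods(cls):
    """Public functions defined directly on `cls`; inherited ones are skipped."""
    return {
        name
        for name, member in vars(cls).items()
        if inspect.isfunction(member) and not name.startswith("_")
    }


def all_methods(cls):
    """Public functions reachable from `cls`, including inherited ones."""
    return {
        name
        for name, _ in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_")
    }


print(sorted(own_methods(CategoricalIndex)))
print(sorted(all_methods(CategoricalIndex)))
```

A lookup limited to the class's own namespace reports only `remove_categories`, so `max` and `min` would be listed as missing, while a lookup that walks the inheritance chain finds all three.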
   
   
   
   
   ### Why are the changes needed?
   
   The current documentation misrepresents which pandas APIs are supported, 
which can mislead users and hinder adoption. By correcting these 
inaccuracies, we ensure that the documentation reflects the true 
capabilities of Pandas API on Spark, providing users with reliable and 
accurate information.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. This PR only updates the documentation to accurately reflect the current 
state of supported pandas API.
   
   
   ### How was this patch tested?
   
   Manually built the documentation and checked that the supported pandas API 
list is generated correctly, as shown below:
   
   <img width="299" alt="Screenshot 2023-11-24 at 12 36 31 PM" 
src="https://github.com/apache/spark/assets/44108233/a2da0f0b-0973-45cb-b22d-9582bbeb51b5">
   
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

