Manoj Chandrashekar created SPARK-39605:
-------------------------------------------

             Summary: PySpark df.count() operation works fine on DBR 7.3 LTS 
but fails in DBR 10.4 LTS
                 Key: SPARK-39605
                 URL: https://issues.apache.org/jira/browse/SPARK-39605
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 3.2.1
            Reporter: Manoj Chandrashekar
             Fix For: 3.0.1


I have a job that infers its schema from MongoDB and then flattens and 
unwinds the data, because the documents contain nested fields. After the 
various transformations, the final count() works fine in Databricks 
Runtime 7.3 LTS but fails in 10.4 LTS.
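
For readers unfamiliar with the terms, here is a minimal plain-Python sketch of what "flattening" and "unwinding" do to a single nested document. It is not the reporter's job and does not use Spark. The `flatten` and `unwind` helpers and the `order` document are hypothetical names invented for illustration. In PySpark the same shape of transformation is usually written with `pyspark.sql.functions.explode` and dotted struct-field selects.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into the sub-document, extending the dotted prefix.
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def unwind(doc, field):
    """Return one copy of doc per element of the list stored under field.

    A document whose list is missing or empty yields no rows.
    """
    return [{**doc, field: item} for item in doc.get(field, [])]


# Hypothetical document with a nested sub-document and an array field.
order = {
    "_id": 1,
    "customer": {"name": "a", "address": {"city": "x"}},
    "items": [{"sku": "p1"}, {"sku": "p2"}],
}

# Unwind the array first, then flatten each resulting document.
rows = [flatten(row) for row in unwind(order, "items")]
print(len(rows))  # the count taken after the transformations
```

Unwinding multiplies the row count by the array length, so the final count depends on every upstream transformation being evaluated. That may be why a failure in an earlier step only surfaces when count() runs.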

The image below shows the successful run in 7.3 LTS:

!https://docs.microsoft.com/answers/storage/attachments/215035-image.png|width=672,height=80!

The image below shows the failure in 10.4 LTS:

!https://docs.microsoft.com/answers/storage/attachments/215026-image.png|width=668,height=69!

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
