[jira] [Updated] (SPARK-39605) PySpark df.count() operation works fine on DBR 7.3 LTS but fails in DBR 10.4 LTS

Manoj Chandrashekar (Jira) Sun, 26 Jun 2022 01:51:04 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-39605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Manoj Chandrashekar updated SPARK-39605:
----------------------------------------
    Description: 
I have a job that infers the schema from MongoDB and then flattens and 
unwinds the data because it contains nested fields. After the various 
transformations, the final count() works perfectly fine on Databricks 
Runtime 7.3 LTS but fails on 10.4 LTS.
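
The job's code is not included in this report. As a rough, plain-Python sketch of what "flattening and unwinding" followed by a count means (not the reporter's PySpark code; the sample documents, field names, and the helper names flatten and unwind are all hypothetical):

```python
from typing import Any, Dict, List


def flatten(doc: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dotted keys; lists are left untouched."""
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


def unwind(row: Dict[str, Any], field: str) -> List[Dict[str, Any]]:
    """Emit one row per element of the list stored at `field`.

    Rows whose `field` is not a list pass through unchanged;
    an empty list yields no rows.
    """
    values = row.get(field)
    if not isinstance(values, list):
        return [row]
    out = []
    for item in values:
        new_row = {k: v for k, v in row.items() if k != field}
        if isinstance(item, dict):
            # Nested element: flatten it under the parent field name.
            new_row.update(flatten(item, prefix=f"{field}."))
        else:
            new_row[field] = item
        out.append(new_row)
    return out


# Hypothetical documents standing in for the MongoDB collection.
docs = [
    {"_id": 1, "user": {"name": "a", "address": {"city": "X"}},
     "orders": [{"sku": "p1"}, {"sku": "p2"}]},
    {"_id": 2, "user": {"name": "b", "address": {"city": "Y"}},
     "orders": [{"sku": "p3"}]},
]

rows = [r for d in docs for r in unwind(flatten(d), "orders")]
row_count = len(rows)  # the analogue of the final count()
print(row_count)
```

In PySpark the analogous steps would typically be dotted column selection and pyspark.sql.functions.explode, followed by df.count(). Because count() is the action that triggers evaluation of the whole lazy transformation chain, a failure reported at count() may originate in any of the earlier steps.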

*The image below shows the successful run on 7.3 LTS:*

!https://docs.microsoft.com/answers/storage/attachments/215035-image.png|width=672,height=80!

*The image below shows the failure on 10.4 LTS:*

!https://docs.microsoft.com/answers/storage/attachments/215026-image.png|width=668,height=69!

 

  was:
I have a job that infers schema from mongodb and does operations such as 
flattening and unwinding because there are nested fields. After performing 
various transformations, finally when the count() is performed, it works 
perfectly fine in databricks runtime version 7.3 LTS but fails to perform the 
same in 10.4 LTS.

Below is the image that shows successful run in 7.3 LTS:

!https://docs.microsoft.com/answers/storage/attachments/215035-image.png|width=672,height=80!

Below is the image that shows failure in 10.4 LTS:

!https://docs.microsoft.com/answers/storage/attachments/215026-image.png|width=668,height=69!

 


> PySpark df.count() operation works fine on DBR 7.3 LTS but fails in DBR 10.4 
> LTS
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-39605
>                 URL: https://issues.apache.org/jira/browse/SPARK-39605
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.2.1
>            Reporter: Manoj Chandrashekar
>            Priority: Major
>             Fix For: 3.0.1
>
>
> I have a job that infers the schema from MongoDB and then flattens and 
> unwinds the data because it contains nested fields. After the various 
> transformations, the final count() works perfectly fine on Databricks 
> Runtime 7.3 LTS but fails on 10.4 LTS.
> *The image below shows the successful run on 7.3 LTS:*
> !https://docs.microsoft.com/answers/storage/attachments/215035-image.png|width=672,height=80!
> *The image below shows the failure on 10.4 LTS:*
> !https://docs.microsoft.com/answers/storage/attachments/215026-image.png|width=668,height=69!
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
