[ 
https://issues.apache.org/jira/browse/SPARK-47503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

alexey updated SPARK-47503:
---------------------------
    Description: 
The Spark History Server fails to display the query for a cached JDBC relation 
(or a calculation derived from it) whose table name contains quotes.

!image-2024-03-21-14-46-48-939.png|width=1017,height=741!

(A screenshot and the generated history are in the attachments.)

How to reproduce:
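{code:scala}
import java.util.Properties
import org.apache.spark.sql.SaveMode

// JDBC connection properties (user, password, driver) go here
val properties = new Properties()

// The schema name is quoted, so the table identifier itself contains double quotes
val ticketsDf = spark.read.jdbc(
  "jdbc:postgresql://localhost:5432/demo", "\"test-schema\".tickets", properties)
val bookingDf = spark.read.parquet("path/bookings")

// Materialize the cache so the cached relation shows up in later query plans
ticketsDf.cache().count()

val resultDf = bookingDf.join(ticketsDf, Seq("book_ref"))

resultDf.write.mode(SaveMode.Overwrite).parquet("path/result")
{code}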
For some reason I'm unable to add the history attachment; I'll try to link it 
in a comment.

The problem is in the SparkPlanGraphNode class, which creates the dot node. When 
there are no metrics to display, it simply returns the tagged name. In this case 
the name contains quotes, which corrupts the dot file.
The suggested solution is to escape the name string.
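
A minimal sketch of the suggested escaping, not Spark's actual code. The object 
name DotLabelSketch and both helper signatures are hypothetical, and the sketch 
assumes that whatever parses the dot file accepts backslash-escaped quotes inside 
a quoted label:
{code:scala}
// Hypothetical helpers for illustration only; not taken from SparkPlanGraphNode.
object DotLabelSketch {

  /** Escape backslashes first, then double quotes, so the text can sit inside a quoted dot string. */
  def escapeDot(s: String): String =
    s.replace("\\", "\\\\").replace("\"", "\\\"")

  /** Build a dot node statement whose label is the escaped node name. */
  def makeDotNode(id: Long, name: String): String =
    s"""  $id [label="${escapeDot(name)}"];"""
}
{code}
With the table identifier from the reproduction above, 
DotLabelSketch.makeDotNode(1L, "\"test-schema\".tickets") keeps the inner quotes 
escaped, so the label stays one quoted string instead of ending at the first 
quote of the name.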

 



> Spark history server fails to display query for cached JDBC relation named in 
> quotes
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-47503
>                 URL: https://issues.apache.org/jira/browse/SPARK-47503
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.1, 4.0.0
>            Reporter: alexey
>            Priority: Major
>         Attachments: Screenshot_11.png, eventlog_v2_local-1711020585149.rar


