[ 
https://issues.apache.org/jira/browse/SPARK-40802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17619124#comment-17619124
 ] 

Mingli Rui commented on SPARK-40802:
------------------------------------

Hi [~hyukjin.kwon], could you please explain more about +We could probably 
introduce a dialect to optimize this further.+? Do you mean moving 
{{JDBCRDD.getQueryOutputSchema}} into a method on JdbcDialect, so that every 
concrete JDBC dialect class has a chance to resolve the schema in its own 
way? For example:
{code:scala}
abstract class JdbcDialect extends Serializable with Logging {
  // Proposed hook: each dialect resolves a query's output schema in its own way.
  def getQueryOutputSchema(query: String, options: JDBCOptions): StructType
}

private object MsSqlServerDialect extends JdbcDialect {
  override def getQueryOutputSchema(query: String, options: JDBCOptions): StructType = {
    // The provider-specific solution goes here.
    ???
  }
}{code}

> Enhance JDBC Connector to use PreparedStatement.getMetaData() to resolve 
> schema instead of PreparedStatement.executeQuery()
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-40802
>                 URL: https://issues.apache.org/jira/browse/SPARK-40802
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Mingli Rui
>            Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently, the Spark JDBC connector uses *PreparedStatement.executeQuery()* to 
> resolve the JDBCRelation's schema. The schema query looks like *s"SELECT * FROM 
> $table_or_query WHERE 1=0"*.
> Executing the query is not necessary, though; it is enough to *prepare* it. 
> Preparing the statement parses and compiles the query without executing it, 
> which is more efficient.
> So it is better to use PreparedStatement.getMetaData() to resolve the schema.
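
The idea in the description above could look roughly like the following. This is only a minimal sketch using plain JDBC, not Spark's actual code: the names {{SchemaProbe}} and {{describeColumns}} are hypothetical, and it assumes the caller already holds an open {{java.sql.Connection}} and passes a trusted table name or subquery. Since {{PreparedStatement.getMetaData()}} is allowed to return null when a driver cannot describe a statement before execution, the sketch falls back to executing the query in that case.

```scala
import java.sql.{Connection, ResultSetMetaData}

// Hypothetical helper for illustration only; not part of Spark.
object SchemaProbe {

  /** Returns (column label, JDBC type name) pairs for the given table or subquery. */
  def describeColumns(conn: Connection, tableOrQuery: String): Seq[(String, String)] = {
    // Same probe query shape as in the issue description; tableOrQuery is
    // interpolated as-is, so it must come from a trusted source.
    val stmt = conn.prepareStatement(s"SELECT * FROM $tableOrQuery WHERE 1=0")
    try {
      // Prefer metadata from the prepared (not executed) statement.
      // getMetaData() may return null if the driver cannot provide it,
      // in which case we fall back to executing the zero-row query.
      val md: ResultSetMetaData =
        Option(stmt.getMetaData).getOrElse(stmt.executeQuery().getMetaData)
      (1 to md.getColumnCount).map { i =>
        md.getColumnLabel(i) -> md.getColumnTypeName(i)
      }
    } finally {
      // Closing the statement also closes any ResultSet opened by the fallback.
      stmt.close()
    }
  }
}
```

Whether preparing is actually cheaper than executing the zero-row query depends on the driver and database, which is one reason a per-dialect hook might be useful.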



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
