[ https://issues.apache.org/jira/browse/SPARK-40802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17619124#comment-17619124 ]
Mingli Rui commented on SPARK-40802:
------------------------------------

Hi, [~hyukjin.kwon],

Could you please explain a bit more about +We could probably introduce a dialect to optimize this further.+ Do you mean moving {{JDBCRDD.getQueryOutputSchema}} into a method on {{JdbcDialect}}, so that each concrete JDBC dialect class gets a chance to resolve the schema in its own way? For example:

{code:java}
abstract class JdbcDialect extends Serializable with Logging {
  // Resolve the output schema of the given query for this dialect.
  def getQueryOutputSchema(query: String, options: JDBCOptions): StructType
}

private object MsSqlServerDialect extends JdbcDialect {
  override def getQueryOutputSchema(query: String, options: JDBCOptions): StructType = {
    // The provider-specific solution
  }
}
{code}

> Enhance JDBC Connector to use PreparedStatement.getMetaData() to resolve schema instead of PreparedStatement.executeQuery()
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-40802
>                 URL: https://issues.apache.org/jira/browse/SPARK-40802
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Mingli Rui
>            Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently, the Spark JDBC Connector uses *PreparedStatement.executeQuery()* to resolve the JDBCRelation's schema, with a schema query like *s"SELECT * FROM $table_or_query WHERE 1=0"*.
> But it is not necessary to execute the query; it is enough to *prepare* it. Preparing the statement parses and compiles the query without executing it, which is more efficient.
> So it is better to use PreparedStatement.getMetaData() to resolve the schema.
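For reference, resolving the schema via {{PreparedStatement.getMetaData()}} could look roughly like the sketch below. This is only an illustration of the proposal, not the connector's actual code: the object and method names are made up, and the fallback to {{executeQuery()}} is an assumption for drivers whose {{getMetaData()}} returns null before execution.

{code:java}
import java.sql.{Connection, DriverManager, ResultSetMetaData}

// Minimal sketch (hypothetical, not Spark's actual implementation): resolve the
// column names of a table or query by preparing the schema query and reading
// metadata from the PreparedStatement, without executing it.
object PreparedMetadataSketch {
  def resolveColumnNames(url: String, tableOrQuery: String): Seq[String] = {
    val conn: Connection = DriverManager.getConnection(url)
    try {
      val stmt = conn.prepareStatement(s"SELECT * FROM $tableOrQuery WHERE 1=0")
      try {
        // JDBC allows getMetaData() on a PreparedStatement to return null when the
        // driver cannot describe the result set before execution.
        val md: ResultSetMetaData = stmt.getMetaData
        if (md != null) {
          (1 to md.getColumnCount).map(i => md.getColumnLabel(i))
        } else {
          // Assumed fallback: execute the no-row query and read the ResultSet's
          // metadata, which is what the connector does today.
          val rs = stmt.executeQuery()
          try {
            val rsmd = rs.getMetaData
            (1 to rsmd.getColumnCount).map(i => rsmd.getColumnLabel(i))
          } finally {
            rs.close()
          }
        }
      } finally {
        stmt.close()
      }
    } finally {
      conn.close()
    }
  }
}
{code}

Whether getMetaData() can be relied on before execution (or needs such a fallback) likely varies by driver, which is one reason pushing this into JdbcDialect, as asked above, looks attractive.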