[ https://issues.apache.org/jira/browse/SPARK-41070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ramakrishna updated SPARK-41070:
--------------------------------
    Component/s: SQL

> Performance issue when Spark SQL connects with TeraData
> --------------------------------------------------------
>
>                 Key: SPARK-41070
>                 URL: https://issues.apache.org/jira/browse/SPARK-41070
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 2.4.4
>            Reporter: Ramakrishna
>            Priority: Major
>
> We connect to Teradata from Spark SQL with the following API:
> {color:#ff8b00}Dataset<Row> jdbcDF = spark.read().jdbc(connectionUrl, tableQuery, connectionProperties);{color}
> Whenever we run this logic against a large table with a million rows, we see
> the extra query below being executed, and it causes a performance hit on the
> database.
> This information comes from our DBA. We do not have any logs on the Spark SQL
> side.
> SELECT 1 FROM ONE_MILLION_ROWS_TABLE;
> |1|
> |1|
> |1|
> |1|
> |1|
> |1|
> |1|
> |1|
> |1|
> |1|
>
> Can you please clarify why this query is executed? Is there any chance that
> it is triggered by our own code, for example while checking the row count of
> the DataFrame?
>
> Please provide your inputs on this.
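The reporter's own guess, a row count taken from the DataFrame, can be explored with a small sketch. It assumes the application calls something like jdbcDF.count(). In that case Spark's JDBC source needs no column values and may select a constant for every row, which would match the statement the DBA observed. Whether that is what happens here is an assumption, not something the report confirms. One commonly used workaround is to pass a parenthesized COUNT(*) subquery as the `table` argument so that the database does the counting. The helper class `CountPushdown`, the column alias `row_count`, and the derived-table alias `t` below are hypothetical names introduced only for illustration.

```java
// Hypothetical helper, not part of Spark: builds the string that would be
// passed as the `table` argument of spark.read().jdbc(url, table, properties).
final class CountPushdown {

    // Wraps a COUNT(*) over the given table in a derived table with an alias,
    // so the database returns a single row instead of one row per record.
    static String countSubquery(String tableName) {
        // Accept only simple (optionally schema-qualified) identifiers so that
        // arbitrary SQL cannot be spliced into the generated statement.
        if (tableName == null || !tableName.matches("[A-Za-z_][A-Za-z0-9_.]*")) {
            throw new IllegalArgumentException("Unexpected table name: " + tableName);
        }
        return "(SELECT COUNT(*) AS row_count FROM " + tableName + ") t";
    }

    public static void main(String[] args) {
        String table = countSubquery("ONE_MILLION_ROWS_TABLE");
        System.out.println(table);

        // With the SparkSession and connection settings from the report
        // (assumed to be in scope), the call would look roughly like this:
        //
        //   Dataset<Row> counted =
        //       spark.read().jdbc(connectionUrl, table, connectionProperties);
        //   long rows = ((Number) counted.first().get(0)).longValue();
    }
}
```

With this approach only one row travels over the JDBC connection, and the result is read with first() rather than count(), so Spark has no reason to scan the table row by row.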