[ https://issues.apache.org/jira/browse/SPARK-10977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967695#comment-14967695 ]
Rick Hillegas commented on SPARK-10977: --------------------------------------- I have created a pull request for SPARK-10857 which also addresses the vulnerabilities described by this issue: https://github.com/apache/spark/pull/9202 > SQL injection bugs in JdbcUtils and DataFrameWriter > --------------------------------------------------- > > Key: SPARK-10977 > URL: https://issues.apache.org/jira/browse/SPARK-10977 > Project: Spark > Issue Type: Bug > Components: SQL > Affects Versions: 1.5.1 > Reporter: Rick Hillegas > Priority: Minor > > SPARK-10857 identifies a SQL injection bug in the JDBC dialect code. A > similar SQL injection bug can be found in 2 places in JdbcUtils and another > place in DataFrameWriter: > {noformat} > The DROP TABLE logic in JdbcUtils concatenates boilerplate with a > user-supplied string: > def dropTable(conn: Connection, table: String): Unit = { > conn.prepareStatement(s"DROP TABLE $table").executeUpdate() > } > Same for the INSERT logic in JdbcUtils: > def insertStatement(conn: Connection, table: String, rddSchema: StructType): > PreparedStatement = { > val sql = new StringBuilder(s"INSERT INTO $table VALUES (") > var fieldsLeft = rddSchema.fields.length > while (fieldsLeft > 0) { > sql.append("?") > if (fieldsLeft > 1) sql.append(", ") else sql.append(")") > fieldsLeft = fieldsLeft - 1 > } > conn.prepareStatement(sql.toString()) > } > Same for the CREATE TABLE logic in DataFrameWriter: > def jdbc(url: String, table: String, connectionProperties: Properties): > Unit = { > ... > > if (!tableExists) { > val schema = JdbcUtils.schemaString(df, url) > val sql = s"CREATE TABLE $table ($schema)" > conn.prepareStatement(sql).executeUpdate() > } > ... > } > {noformat} > Maybe we can find a common solution to all of these SQL injection bugs. > Something like this: > 1) Parse the user-supplied table name into a table identifier and an optional > schema identifier. We can borrow logic from org.apache.derby.iapi.util.IdUtil > in order to do this. > 2) Double-quote (and escape as necessary) the schema and table identifiers so > that the database interprets them as delimited ids. > That should prevent the SQL injection attacks. > With this solution, if the user specifies table names like cityTable and > trafficSchema.congestionTable, then the generated DROP TABLE statements would > be > {noformat} > DROP TABLE "CITYTABLE" > DROP TABLE "TRAFFICSCHEMA"."CONGESTIONTABLE" > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org