[ https://issues.apache.org/jira/browse/SPARK-27716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843649#comment-16843649 ]
Liang-Chi Hsieh commented on SPARK-27716:
-----------------------------------------

If the added support doesn't cover all cases, won't it make users more confused?

> Complete the transactions support for part of jdbc datasource operations.
> -------------------------------------------------------------------------
>
>                 Key: SPARK-27716
>                 URL: https://issues.apache.org/jira/browse/SPARK-27716
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.3
>            Reporter: feiwang
>            Priority: Major
>              Labels: pull-request-available
>
> With the jdbc datasource, we can save an RDD to the database.
> The comment on the function saveTable reads:
> {code:java}
>   /**
>    * Saves the RDD to the database in a single transaction.
>    */
>   def saveTable(
>       df: DataFrame,
>       tableSchema: Option[StructType],
>       isCaseSensitive: Boolean,
>       options: JdbcOptionsInWrite)
> {code}
> In fact, this is not true.
> Each savePartition operation runs in a single transaction, but the saveTable
> operation as a whole does not.
> There are several cases of data transmission:
> case1: Append data to an existing gptable.
> case2: Overwrite an existing gptable that is a cascadingTruncateTable,
> so we cannot drop the gptable; we have to truncate it and append data.
> case3: Overwrite an existing table that is not a
> cascadingTruncateTable, so we can drop it first.
> case4: For a table that does not exist yet, create it and transmit data.
> In this PR, I add transaction support for case3 and case4.
> For case3 and case4, we can first transmit the RDD to a temp table.
> We use an accumulator to record the successful savePartition operations.
> Finally, we compare the value of the accumulator with the DataFrame's partitionNum.
> If all the savePartition operations were successful, we drop the original table
> if it exists, then rename the temp table to the original table name.
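
To make the proposed flow for case3 and case4 concrete, here is a minimal sketch of the "write to a temp table, count successful partitions, then swap" idea. It is not the code from the PR: `FakeDb`, `StagedWrite`, and the `_tmp` suffix are hypothetical names, the database is an in-memory map rather than a JDBC connection, and an `AtomicInteger` stands in for the Spark accumulator.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.mutable
import scala.util.Try

// Hypothetical in-memory stand-in for a database: table name -> rows.
final class FakeDb {
  private val tables = mutable.Map.empty[String, Vector[Int]]

  def exists(name: String): Boolean = synchronized(tables.contains(name))
  def create(name: String): Unit = synchronized { tables(name) = Vector.empty }
  def drop(name: String): Unit = synchronized { tables.remove(name); () }
  def rename(from: String, to: String): Unit = synchronized {
    tables(to) = tables(from)
    tables.remove(from)
    ()
  }
  def append(name: String, rows: Seq[Int]): Unit =
    synchronized { tables(name) = tables(name) ++ rows }
  def read(name: String): Option[Vector[Int]] = synchronized(tables.get(name))
}

object StagedWrite {
  // Writes every partition to a temp table and swaps it in only if all
  // partitions succeeded. Returns true when the target table was replaced.
  def saveTable(
      db: FakeDb,
      table: String,
      partitions: Seq[Seq[Int]],
      savePartition: (FakeDb, String, Seq[Int]) => Unit): Boolean = {
    val temp = s"${table}_tmp" // assumed naming scheme, for illustration only
    if (db.exists(temp)) db.drop(temp)
    db.create(temp)

    // Plays the role of the accumulator that records successful partitions.
    val succeeded = new AtomicInteger(0)
    partitions.foreach { p =>
      if (Try(savePartition(db, temp, p)).isSuccess) succeeded.incrementAndGet()
    }

    if (succeeded.get == partitions.size) {
      // All partitions landed: replace the original table with the temp one.
      if (db.exists(table)) db.drop(table)
      db.rename(temp, table)
      true
    } else {
      // At least one partition failed: discard the temp table and leave
      // the original table untouched.
      db.drop(temp)
      false
    }
  }
}
```

Note that even in this sketch the final drop and rename are two separate steps, so the swap is only as atomic as the underlying database makes those two statements.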