Gengliang Wang created SPARK-23202:
--------------------------------------

             Summary: Break down DataSourceV2Writer.commit into two phase
                 Key: SPARK-23202
                 URL: https://issues.apache.org/jira/browse/SPARK-23202
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.2.1
            Reporter: Gengliang Wang

Currently, the API DataSourceV2Writer#commit(WriterCommitMessage[]) commits a writing job with a list of commit messages. This makes sense in some scenarios, e.g. MicroBatchExecution.

However, the driver could start processing each commit message as soon as it is received (e.g. persist the messages into files), before all the messages have been collected.

The proposal is to break down DataSourceV2Writer.commit into two phases:
# add(WriterCommitMessage message): Handles a commit message produced by {@link DataWriter#commit()}.
# commit(): Commits the writing job.

This should make the API more flexible and a more natural fit for implementing some data sources.
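
The two phases proposed above could look roughly like the following. This is only a minimal sketch to illustrate the idea, not Spark code: the interface name TwoPhaseWriter, the FileWritten message, and the ManifestWriter implementation are all hypothetical, and WriterCommitMessage is redeclared locally as a stand-in so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseCommitSketch {

    /** Local stand-in for Spark's WriterCommitMessage marker interface. */
    interface WriterCommitMessage {}

    /** Hypothetical shape of the proposed two-phase writer (assumed, not an actual Spark interface). */
    interface TwoPhaseWriter {
        /** Phase 1: handle one commit message as soon as the driver receives it. */
        void add(WriterCommitMessage message);

        /** Phase 2: commit the whole writing job once every message has been added. */
        void commit();
    }

    /** Hypothetical commit message: the output path written by one task. */
    static final class FileWritten implements WriterCommitMessage {
        final String path;

        FileWritten(String path) {
            this.path = path;
        }
    }

    /** Hypothetical writer that records paths as they arrive and publishes them on commit. */
    static final class ManifestWriter implements TwoPhaseWriter {
        private final List<String> pending = new ArrayList<>();
        private final List<String> committed = new ArrayList<>();
        private boolean done = false;

        @Override
        public void add(WriterCommitMessage message) {
            if (done) {
                throw new IllegalStateException("job already committed");
            }
            // Work can happen here per message, before the other tasks finish.
            pending.add(((FileWritten) message).path);
        }

        @Override
        public void commit() {
            if (done) {
                throw new IllegalStateException("job already committed");
            }
            // Make everything collected so far visible in one step.
            committed.addAll(pending);
            pending.clear();
            done = true;
        }

        int pendingCount() {
            return pending.size();
        }

        List<String> committedPaths() {
            return new ArrayList<>(committed);
        }
    }

    public static void main(String[] args) {
        ManifestWriter writer = new ManifestWriter();
        writer.add(new FileWritten("part-00000")); // first task reports in
        writer.add(new FileWritten("part-00001")); // second task reports in
        writer.commit();                            // job-level commit
        System.out.println(writer.committedPaths()); // prints [part-00000, part-00001]
    }
}
```

In this sketch the per-message work happens in add(), so by the time commit() is called the driver only has to publish what it has already processed. A real implementation would also need an abort path, which is left out here.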