[jira] [Comment Edited] (SPARK-33638) Full support of V2 table creation in Structured Streaming writer path

Jungtaek Lim (Jira) Thu, 03 Dec 2020 23:57:06 -0800


    [ https://issues.apache.org/jira/browse/SPARK-33638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243803#comment-17243803 ]


Jungtaek Lim edited comment on SPARK-33638 at 12/4/20, 7:56 AM:
----------------------------------------------------------------

I don't agree with handling this in DataStreamWriter, hence I changed the
title. My proposal is to design DataStreamWriterV2, nothing else.

I also don't agree that we need to deal with partition column verification in
such a way. DataFrameWriterV2 handles this nicely by branching the path between
appending to/overwriting/truncating a table and creating/replacing a table, and
by enforcing the latter whenever table-creation configuration is provided. I
think this is much clearer for end users than leaving them to worry about the
impact.
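
To make the branching concrete, here is a minimal toy model in plain Python. It
is not Spark code: Catalog, TableWriter, CreateTableWriter and all of their
methods are hypothetical names chosen for illustration. The assumption it
encodes is the one described above: once table-creation configuration (here,
partitioning) is supplied, only the create/replace operations remain reachable,
so append never has to verify partition columns.

```python
# Toy model only: every class and method here is hypothetical, NOT Spark's API.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Catalog:
    """In-memory stand-in for a table catalog: table name -> partition columns."""
    tables: Dict[str, Tuple[str, ...]] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


class CreateTableWriter:
    """Create/replace path. This type deliberately has no append()."""

    def __init__(self, catalog: Catalog, table: str,
                 partitioning: Tuple[str, ...] = ()) -> None:
        self._catalog = catalog
        self._table = table
        self._partitioning = partitioning

    def partitioned_by(self, *cols: str) -> "CreateTableWriter":
        # Supplying creation config always narrows the writer to the
        # create/replace path, even when called on a TableWriter.
        return CreateTableWriter(self._catalog, self._table, tuple(cols))

    def create(self) -> None:
        if self._table in self._catalog.tables:
            raise ValueError(f"table already exists: {self._table}")
        self._catalog.tables[self._table] = self._partitioning
        self._catalog.log.append(f"create {self._table}")

    def create_or_replace(self) -> None:
        self._catalog.tables[self._table] = self._partitioning
        self._catalog.log.append(f"create_or_replace {self._table}")


class TableWriter(CreateTableWriter):
    """Entry point. Adds the append path, which carries no creation config."""

    def append(self) -> None:
        # The table must already exist; its stored partitioning is used as-is,
        # so there is nothing to verify against user-supplied config.
        if self._table not in self._catalog.tables:
            raise LookupError(f"table not found: {self._table}")
        self._catalog.log.append(f"append {self._table}")


if __name__ == "__main__":
    catalog = Catalog()
    # Creation config was given, so only create()/create_or_replace() exist.
    TableWriter(catalog, "events").partitioned_by("date").create()
    # No creation config: appending to the existing table is allowed.
    TableWriter(catalog, "events").append()
    print(catalog.log)
```

In this sketch, calling append() after partitioned_by() fails because the
returned object has no such method, so the ambiguous combination is ruled out
by the shape of the API rather than by a runtime check of partition columns.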

To be sure, even if we address it with DataStreamWriterV2, we still need to deal
with the consistency of DataStreamWriter.toTable(). Once DataStreamWriterV2 is
in place and recommended for table writes, that would be less important.


> Full support of V2 table creation in Structured Streaming writer path
> ---------------------------------------------------------------------
>
>                 Key: SPARK-33638
>                 URL: https://issues.apache.org/jira/browse/SPARK-33638
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.1.0
>            Reporter: Yuanjian Li
>            Priority: Blocker
>
> Currently, we want to add support for "create if not exists" to the
> DataStreamWriter.toTable API. Since the file format in streaming doesn't
> support DSv2 for now, the current implementation mainly focuses on V1
> support. More work is needed for full support of V2 table creation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
