[jira] [Commented] (SPARK-30542) Two Spark structured streaming jobs cannot write to same base path

Jungtaek Lim (Jira) Tue, 06 Oct 2020 23:00:26 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-30542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209310#comment-17209310
 ]


Jungtaek Lim commented on SPARK-30542:
--------------------------------------

This is a limitation, not a bug. There are known third-party alternatives (in 
A-Z order: Apache Hudi, Apache Iceberg, Delta Lake) that support multiple jobs 
writing to the same path, so you may want to explore those.
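
As an illustration of that route (a minimal sketch, not something posted in this thread), two streaming queries could append to the same Delta Lake table path as long as each has its own checkpoint location. It assumes the Delta Lake package is available to Spark and that `stream_df` is a streaming DataFrame. The helper names `checkpoint_for` and `start_delta_sink` and the example paths are hypothetical.

```python
def checkpoint_for(checkpoint_root: str, job_name: str) -> str:
    """Build a per-job checkpoint location (hypothetical helper).

    Each streaming query must keep its own checkpoint directory,
    even when several queries write to the same table path.
    """
    return checkpoint_root.rstrip("/") + "/" + job_name


def start_delta_sink(stream_df, table_path: str, checkpoint_root: str, job_name: str):
    """Start an append-mode streaming write to a shared Delta table path.

    Assumes the Delta Lake package is on the Spark classpath; `stream_df`
    is expected to be a streaming DataFrame.
    """
    return (
        stream_df.writeStream
        .format("delta")
        .outputMode("append")
        .option("checkpointLocation", checkpoint_for(checkpoint_root, job_name))
        .start(table_path)
    )
```

With this shape, one job would call `start_delta_sink(df_a, "/data/base", "/chk", "job_a")` and the other `start_delta_sink(df_b, "/data/base", "/chk", "job_b")`. Both target the same table path, while their checkpoints stay separate.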

> Two Spark structured streaming jobs cannot write to same base path
> ------------------------------------------------------------------
>
>                 Key: SPARK-30542
>                 URL: https://issues.apache.org/jira/browse/SPARK-30542
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.3.0
>            Reporter: Sivakumar
>            Priority: Major
>
> Hi All,
> Spark Structured Streaming doesn't allow two structured streaming jobs to 
> write data to the same base directory, which is possible when using DStreams.
> Because the _spark_metadata directory is created by default for one job, 
> a second job cannot use the same directory as its base path: the 
> _spark_metadata directory already exists, so it throws an exception.
> Is there any workaround for this, other than creating separate base paths 
> for the two jobs?
> Is it possible to create the _spark_metadata directory elsewhere, or to 
> disable it without any data loss?
> If I had to change the base path for both jobs, my whole framework 
> would be impacted, so I don't want to do that.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
