[ https://issues.apache.org/jira/browse/SPARK-53890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18037825#comment-18037825 ]
Dongjoon Hyun commented on SPARK-53890:
---------------------------------------
Due to inactivity over the last month, I moved this test JIRA issue to
Apache Spark 4.2.0 for now. We can bring it back at any point before the
Apache Spark 4.1.0 release.
> [SDP] Test (and fix) read/readstream options are respected for pipelines
> ------------------------------------------------------------------------
>
> Key: SPARK-53890
> URL: https://issues.apache.org/jira/browse/SPARK-53890
> Project: Spark
> Issue Type: Sub-task
> Components: Declarative Pipelines
> Affects Versions: 4.2.0
> Reporter: Anish Mahto
> Priority: Major
>
> Add tests to verify read/readstream options are actually respected by the
> flow that executes the read/readstream dataframe.
> Trivial test example might be:
> {code:python}
> @materialized_view
> def mv_from_csv():
>     return spark.read.option("delimiter", "|").csv("/my/table.csv")
> {code}
> I suspect that today the read/readstream options are not respected
> ([1|https://github.com/apache/spark/blob/master/sql/pipelines/src/main/scala/org/apache/spark/sql/pipelines/graph/FlowAnalysis.scala#L120],
> [2|#L131]).
> If true, a solution might be to copy the options from the
> `UnresolvedRelation` into either the DataFrameReader that is constructed or
> the `streamingReadOptions`/`batchReadOptions` argument.
>
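The option-copying idea in the description could be sketched roughly as follows. This is a plain-Python illustration only, not the Spark implementation: `merge_read_options` is a hypothetical helper, the flow-level defaults are made up, and the precedence rule (an option set directly on the read overrides a flow-level option of the same name) is an assumption. Keys are lower-cased because Spark treats reader option names case-insensitively.

```python
from typing import Dict, Mapping


def merge_read_options(
    relation_options: Mapping[str, str],
    flow_read_options: Mapping[str, str],
) -> Dict[str, str]:
    """Combine options carried by the relation with flow-level read options.

    Hypothetical helper: keys are compared case-insensitively, and an option
    set directly on the read wins over a flow-level option with the same name
    (an assumed precedence, not confirmed by the issue).
    """
    # Start from the flow-level options, normalising key case.
    merged = {key.lower(): value for key, value in flow_read_options.items()}
    # Options from the relation are applied last, so they take precedence.
    merged.update({key.lower(): value for key, value in relation_options.items()})
    return merged


# Options as they would be set by spark.read.option("delimiter", "|").
relation_options = {"delimiter": "|"}
# Hypothetical flow-level defaults, used here only for illustration.
flow_read_options = {"header": "true", "Delimiter": ","}

merged = merge_read_options(relation_options, flow_read_options)
print(merged)
```

A test along the lines requested in the issue would then check that the value seen by the executing flow is the one given in the user's `option(...)` call, here `"|"` rather than the default delimiter.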