Sure. My point was that Delta Lake is also one of the 3rd party libraries,
and there's no way for Apache Spark to do that on its behalf. Delta Lake has
its own group, and the request is better raised there.
On Mon, Oct 5, 2020 at 9:54 PM Enrico Minack wrote:
Though spark.read. refers to "built-in" data sources, there is
nothing that prevents 3rd party libraries from "extending" spark.read in
Scala or Python. As users know the Spark way to read built-in data
sources, it feels natural to hook 3rd party data sources into the same
scheme, to give users a consistent experience.
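To make that concrete, here is a minimal sketch of how a 3rd party library could attach such a shorthand on the Python side by patching DataFrameReader/DataFrameWriter. This is only an illustration, not how Delta Lake actually packages anything; the method name "delta" and the paths are assumptions for the example:

    from pyspark.sql.readwriter import DataFrameReader, DataFrameWriter

    def _read_delta(self, path):
        # Delegate to the generic format API, which works for any data source
        # available on the classpath.
        return self.format("delta").load(path)

    def _write_delta(self, path, **options):
        self.format("delta").options(**options).save(path)

    # Attach the shorthands so they sit next to the built-in parquet()/csv() ones.
    DataFrameReader.delta = _read_delta
    DataFrameWriter.delta = _write_delta

    # Afterwards users can write the familiar form:
    #   df = spark.read.delta("/tmp/events")
    #   df.write.delta("/tmp/events_copy")

In Scala the equivalent trick is usually an implicit class wrapping DataFrameReader, which is how some connectors expose their own spark.read.xxx syntax.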
Hi,
"spark.read." is a "shorthand" for "built-in" data sources, not for
external data sources. spark.read.format() is still an official way to use
it. Delta Lake is not included in Apache Spark so that is indeed not
possible for Spark to refer to.
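As a concrete illustration of that supported route (the path and package coordinates below are placeholders, not from this thread):

    from pyspark.sql import SparkSession

    # Assumes the Delta Lake package is on the classpath, e.g. a session
    # launched with --packages io.delta:delta-core_2.12:<version>.
    spark = SparkSession.builder.appName("delta-format-example").getOrCreate()

    # The generic format API works for built-in and external sources alike.
    df = spark.read.format("delta").load("/tmp/events")
    df.write.format("delta").mode("overwrite").save("/tmp/events_copy")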
Starting from Spark 3.0, the concept of
Hi there,
I'm just wondering whether there is any plan to implement read/write methods in
DataFrameReader/DataFrameWriter for delta, similar to e.g. parquet?
For example, using PySpark, "spark.read.parquet" is available, but
"spark.read.delta" is not (same for write).
In my opinion,