[ 
https://issues.apache.org/jira/browse/SPARK-27751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843815#comment-16843815
 ] 

Marc-Olivier Andrez commented on SPARK-27751:
---------------------------------------------

Hi [~hyukjin.kwon],

Although `FileFormat.buildReader` belongs to a private package, 
is there a good reason not to leave the method public?

If I am correct, in previous versions extending `FileFormat` was the way to 
define readers for data formats that Apache Spark does not support natively. 
For example, 
[org.apache.spark.sql.hive.orc.OrcFileFormat|https://spark.apache.org/docs/2.1.1/api/scala/#org.apache.spark.sql.hive.orc.OrcFileFormat]
 and 
[com.databricks.spark.avro.DefaultSource|https://github.com/databricks/spark-avro/blob/branch-4.0/src/main/scala/com/databricks/spark/avro/DefaultSource.scala]
 both extend `FileFormat`.

Some implementations of `FileFormat` delegate the reading to an existing 
`FileFormat` and add extra information to each row (similar to the method 
[FileFormat.buildReaderWithPartitionValues|https://github.com/apache/spark/blob/68fa601d62c1e5e7b37a1b7d6b0236019239a00a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala#L121]).
 Such implementations override `buildReader` and need to call 
`delegate.buildReader`, which no longer compiles now that the method is 
protected. Leaving the method public would make the transition to Apache 
Spark 2.4.x smoother.
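
To make the delegation pattern concrete, here is a minimal sketch. It does not use the real Spark API: `FileFormat` below is a reduced stand-in with a simplified signature, and `Row`, `BaseFormat` and `WithFileNameFormat` are hypothetical names introduced only for illustration. It shows why `delegate.buildReader` is rejected once the method is protected, and how the public `buildReaderWithPartitionValues` might be called instead. Whether that workaround fits a real format depends on how partition values are handled there.

```scala
object DelegationSketch {
  // Simplified stand-in for a row; the real API uses InternalRow.
  type Row = Map[String, Any]

  // Hypothetical, reduced model of FileFormat (not the real Spark trait).
  trait FileFormat {
    // Protected, as in Spark 2.4.x: callable on `this`, not on a delegate.
    protected def buildReader(requiredColumns: Seq[String]): String => Iterator[Row]

    // Public entry point: reads rows and appends the partition values.
    def buildReaderWithPartitionValues(
        requiredColumns: Seq[String],
        partitionValues: Row): String => Iterator[Row] = {
      val reader = buildReader(requiredColumns)
      path => reader(path).map(row => row ++ partitionValues)
    }
  }

  // Hypothetical base format: one row per file, one value per requested column.
  final class BaseFormat extends FileFormat {
    override protected def buildReader(requiredColumns: Seq[String]): String => Iterator[Row] =
      path => {
        val row: Row = requiredColumns.map(c => c -> s"$c@$path").toMap
        Iterator(row)
      }
  }

  // Hypothetical wrapper that adds the file name to every row.
  final class WithFileNameFormat(delegate: FileFormat) extends FileFormat {
    override protected def buildReader(requiredColumns: Seq[String]): String => Iterator[Row] = {
      // `delegate.buildReader(requiredColumns)` does not compile here, because
      // `delegate` is typed as a plain FileFormat rather than as this class.
      // The public method is called with no partition values instead.
      val reader = delegate.buildReaderWithPartitionValues(requiredColumns, Map.empty)
      path => reader(path).map(row => row + ("file" -> path))
    }
  }

  def main(args: Array[String]): Unit = {
    val format = new WithFileNameFormat(new BaseFormat)
    val read = format.buildReaderWithPartitionValues(Seq("value"), Map("year" -> 2019))
    read("a.csv").foreach(println)
  }
}
```

In this sketch the wrapper keeps working without any access to the delegate's protected method, at the cost of going through the partition-value step with an empty map.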

Many thanks in advance for your answer.

> buildReader is now protected
> ----------------------------
>
>                 Key: SPARK-27751
>                 URL: https://issues.apache.org/jira/browse/SPARK-27751
>             Project: Spark
>          Issue Type: Question
>          Components: Spark Core
>    Affects Versions: 2.4.3
>            Reporter: Geet Kumar
>            Priority: Major
>
> I have recently upgraded to Spark 2.4.0 and was relying on the `buildReader` 
> method. It was originally public and is now protected. 
> What was the reason for this change?
> The only workaround I can see is to use `buildReaderWithPartitionValues` 
> which remains public. Any plans to revert `buildReader` to be public again?
> The change was made here: [https://github.com/apache/spark/pull/17253/files]


