Dropping Kafka list since this is about a slightly different topic.

Every time we have exposed the API of a 3rd party application as a public
Spark API, it has caused problems down the road. This has happened with
Hadoop, Tachyon, Kafka, and Guava. Most of these are used for input/output.
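To make the failure mode concrete, here is a minimal, hypothetical sketch
(not an actual Spark signature): once a third-party type appears in a
public method signature, every incompatible change in that dependency
becomes a breaking change in Spark's own API.

  // Hypothetical stand-in for a third-party class such as
  // kafka.message.MessageAndMetadata; not a real Spark or Kafka API.
  class ThirdPartyMessage(val payload: Array[Byte])

  trait LeakyApi {
    // Callers compile directly against ThirdPartyMessage, so upgrading
    // the dependency can break them at compile time or at runtime.
    def createStream(params: Map[String, String]): Iterator[ThirdPartyMessage]
  }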

The good thing is that in Spark 2.0 we are removing most of those
exposures, and in the new DataFrame/Dataset API we are providing a unified
input/output API for end users, so the internals of the 3rd party
dependencies are no longer exposed directly to users. Unfortunately, some
Spark APIs still depend on Hadoop.
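For example, with the Spark 2.0 DataFrameReader/DataFrameWriter the data
source is just a string, and no connector classes leak into the
user-facing signatures (the paths and formats below are illustrative):

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().appName("unified-io").getOrCreate()

  // Read and write through the unified API; the source/sink internals
  // stay behind the "format" string rather than appearing in signatures.
  val df = spark.read.format("json").load("/path/to/input")
  df.write.format("parquet").save("/path/to/output")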

It is important to keep this in mind as we develop Spark: for the
long-term stability of Spark's APIs, we should avoid exposing other
projects' APIs to the greatest degree possible.
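One hedged sketch of what that can look like in practice (all names here
are illustrative, not real Spark code): convert to a Spark-owned type at
the boundary, so the third-party class never escapes the implementation.

  // Spark-owned record type: the only thing callers ever see.
  final case class Record(key: Array[Byte], value: Array[Byte], offset: Long)

  // Stand-in for the third-party class we depend on internally.
  class ThirdPartyMessage(val key: Array[Byte], val value: Array[Byte],
      val offset: Long)

  object Boundary {
    // The conversion happens at the edge; ThirdPartyMessage stays private
    // to the implementation and never appears in a public signature.
    private def toRecord(m: ThirdPartyMessage): Record =
      Record(m.key, m.value, m.offset)

    def read(messages: Iterator[ThirdPartyMessage]): Iterator[Record] =
      messages.map(toRecord)
  }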


On Fri, Feb 26, 2016 at 9:46 AM, Mark Grover <m...@apache.org> wrote:

> Hi Kafka devs,
> I come to you with a dilemma and a request.
>
> Based on what I understand, users of Kafka need to upgrade their brokers
> to Kafka 0.9.x first, before they upgrade their clients to Kafka 0.9.x.
>
> However, that presents a problem to other projects that integrate with
> Kafka (Spark, Flume, Storm, etc.). From here on, I will speak for Spark +
> Kafka, since that's the one I am most familiar with.
>
> In light of the compatibility (or the lack thereof) between 0.8.x and
> 0.9.x, Spark is faced with a problem of what version(s) of Kafka to be
> compatible with, and has 2 options (discussed in this PR
> <https://github.com/apache/spark/pull/11143>):
> 1. We upgrade to Kafka 0.9, dropping support for 0.8. Storm and
> Flume are already on this path.
> 2. We introduce complexity in our code to support both 0.8 and 0.9 for the
> entire duration of our next major release (Apache Spark 2.x).
>
> I'd love to hear your thoughts on which option you recommend.
>
> Long term, I'd really appreciate it if Kafka could do something that
> doesn't make Spark have to support two, or even more, versions of Kafka.
> And if there is something that I personally, or the Spark project, can do
> in your next release candidate phase to make things easier, please do let
> us know.
>
> Thanks!
> Mark
>
