Hi Martin,

I agree.

1. Currently, the geometry serializers, spatial partitioning code, and some
format reader code in Sedona-core (all in Java) are independent of any Spark
dependency, so Sedona-Flink already re-uses them. However, the Sedona ST / RS
functions need a refactor, as some of them depend on Spark SQL unnecessarily.

2. So let's keep both Spark 2.4 and Spark 3.3 support in the next Sedona
release (1.2.1); the supported Scala versions will be 2.11 and 2.12, but not
2.13. In the release after that (1.3.0), we will drop Spark 2.4 and Scala
2.11 completely, making Sedona 1.2.1 the last release that supports them.

Thanks,
Jia


On Thu, Jun 23, 2022 at 4:03 AM Martin Andersson <
u.martin.anders...@gmail.com> wrote:

> Hi,
>
> I guess that the pending Spark 3.3 support is a big enough feature to
> warrant a new Sedona release. It makes no sense to remove Spark 2.4 and
> Scala 2.11 before the release.
>
> After Sedona-next is released I think that Spark 2.4 can safely be removed.
>
> Long term, though I think that's another discussion, there are a lot of
> benefits to moving code shared between Sedona-Spark and Sedona-Flink into
> a common Java-only module (sedona-common?). That would include the
> partitioning code and probably most ST_x/RS_x functions.
>
> That would give Sedona-Flink first-class, Scala-free support. It would
> also open up Sedona to other JVM data tools regardless of whether they are
> written in Java, Scala, Kotlin, Clojure, or any other JVM language:
> possibly Sedona-Kafka, Sedona-Hive, etc. That would make Scala-version
> support a Sedona-Spark-only issue rather than a general Sedona issue.
>
> Br,
> Martin
>
> On 2022/06/19 06:10:31 Jia Yu wrote:
> > Dear all,
> >
> > I am proposing to drop the support of Spark 2.4 and Scala 2.11 in the
> > next Sedona release. The version number will be 1.3.0 if we drop this
> > support; otherwise it will be 1.2.1.
> >
> > Here is the status of Spark 2.4 and of Sedona for Spark 2.4:
> > 1. The Spark community announced Spark 2.4 EOL on March 3, 2021:
> > https://www.mail-archive.com/dev@spark.apache.org/msg27476.html
> > 2. Spark 3.0 was released on June 16, 2020.
> > 3. Spark 3.3.0 was released a few days ago, and starting from Spark 3.2,
> > Spark releases binaries for both Scala 2.12 and 2.13.
> > 4. Only a few Sedona users are using Spark 2.4. According to the
> > statistics from Maven Central (Scala/Java API only), only around 1K out
> > of 100K downloads are of Sedona for Spark 2.4 (core-2.4_2.11,
> > core-2.4_2.12, python-adapter-2.4_2.11, python-adapter-2.4_2.12).
> >
> > Benefits of dropping the support:
> > 1. Reduce the complexity of maintaining the source code for different
> > Spark versions. Currently, several files have two versions, one for
> > Spark 2.4 and one for 3.x, controlled by "anchor" keywords, and a Python
> > script I wrote has to pre-process the source code every time:
> > https://github.com/apache/incubator-sedona/blob/master/spark-version-converter.py
> > 2. Reduce the overhead of releasing binary packages. Currently, the main
> > POM.xml is quite complex in order to compile against different Spark
> > versions. Therefore, we weren't able to release Sedona for Scala 2.13.
> >
> > Plan for Sedona on Spark 3.x:
> > 1. The Sedona source code already supports Scala 2.13, but there is no
> > Sedona binary release for it. We will release Sedona for both Scala 2.12
> > and 2.13, but not for Scala 2.11.
> > 2. Sedona already releases binaries for Spark 3.0, 3.1, 3.2
> > 3. The two latest Sedona PRs add support for Spark 3.3:
> > https://github.com/apache/incubator-sedona/pull/636
> > https://github.com/apache/incubator-sedona/pull/635
> >
> > What do you think of this proposal? If you don't like it, what would be
> > the best time to drop the support of Spark 2.4 and Scala 2.11?
> >
> > I will leave this discussion open for at least 3 days. If there is no
> > objection, I will remove Spark 2.4 from POM.xml and GitHub Actions, but
> > leave the Spark 2.4 support in the source code, so whoever wants to use
> > Sedona on Spark 2.4 can still compile the source code themselves.
> >
> > Thanks,
> > Jia
> >
>
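
[Editor's note] The "anchor" keyword pre-processing described in the original
proposal can be sketched roughly as follows. This is a hypothetical
illustration only: the marker syntax (`//<spark-2.4>` ... `//</spark>`) and the
`convert` function are invented for this sketch, not the actual syntax of
spark-version-converter.py (linked above). The idea is that code blocks tagged
for one Spark version are commented out or re-enabled depending on the build
target:

```python
def convert(lines, target):
    """Keep code inside blocks tagged for `target`; comment out the rest.

    Hypothetical marker syntax: a block opens with `//<spark-X.Y>` and
    closes with `//</spark>`. Lines outside any block pass through as-is.
    """
    out = []
    active = None  # version tag of the block we are currently inside
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("//<spark-") and stripped.endswith(">"):
            active = stripped[len("//<"):-1]  # e.g. "spark-2.4"
            out.append(line)
        elif stripped == "//</spark>":
            active = None
            out.append(line)
        elif active is not None and active != target:
            # Disable code that belongs to the other Spark version.
            out.append("// " + line if not stripped.startswith("//") else line)
        elif active is not None:
            # Re-enable code for the target version.
            out.append(line[3:] if line.startswith("// ") else line)
        else:
            out.append(line)
    return out


src = [
    "//<spark-2.4>",
    "// import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback",
    "//</spark>",
    "//<spark-3.0>",
    "import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback",
    "//</spark>",
]
converted = convert(src, "spark-2.4")
```

Running the same source through `convert` with the other target flips which
import is commented out, which is why the script has to be re-run before every
build against a different Spark version.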
