Re: Support Spark 2.4 in Sedona 1.0

Jia Yu Thu, 12 Nov 2020 15:23:51 -0800

In order to support Spark 2.4, Sedona needs to use different logic for SQL
aggregation functions. I am not sure if this could be achieved by using
different profiles.


Sedona for Spark 3.0:
https://github.com/apache/incubator-sedona/blob/master/sql/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/AggregateFunctions.scala
Sedona for Spark 2.4:
https://github.com/apache/incubator-sedona/blob/spark-2.3-2.4/sql/src/main/scala/org/apache/spark/sql/geosparksql/expressions/AggregateFunctions.scala
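
As an illustration of the profile idea raised in this thread, a minimal sketch
of a Maven setup that keeps the two AggregateFunctions.scala variants in
separate source directories and selects one per profile might look like the
following. The profile ids, property names, and directory names are
assumptions made up for this example and are not taken from Sedona's build.
The sketch also assumes the build-helper-maven-plugin is used to add the extra
source root, and that the Scala compiler plugin in use picks up added source
roots, which depends on its configuration.

```xml
<!-- Hypothetical sketch: profile ids, property names, and directory names are
     illustrative only and are not taken from the Sedona build. -->
<profiles>
  <profile>
    <id>spark-3.0</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <properties>
      <spark.version>3.0.1</spark.version>
      <scala.compat.version>2.12</scala.compat.version>
      <!-- Directory holding the Spark 3.0 variant of the aggregation functions -->
      <spark.specific.src>src/main/scala-spark-3.0</spark.specific.src>
    </properties>
  </profile>
  <profile>
    <id>spark-2.4</id>
    <properties>
      <spark.version>2.4.7</spark.version>
      <scala.compat.version>2.11</scala.compat.version>
      <!-- Directory holding the Spark 2.4 variant of the aggregation functions -->
      <spark.specific.src>src/main/scala-spark-2.4</spark.specific.src>
    </properties>
  </profile>
</profiles>

<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>build-helper-maven-plugin</artifactId>
      <executions>
        <execution>
          <id>add-spark-specific-sources</id>
          <phase>generate-sources</phase>
          <goals>
            <goal>add-source</goal>
          </goals>
          <configuration>
            <sources>
              <!-- Resolved from whichever profile is active -->
              <source>${spark.specific.src}</source>
            </sources>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

With a layout like this, the Spark 2.4 build would be selected with
"mvn clean package -Pspark-2.4", while the default build would target
Spark 3.0. Whether this is enough for the aggregation functions depends on how
much other code differs between the two Spark versions.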

On Thu, Nov 12, 2020 at 9:42 AM Mo Sarwat <themosar...@gmail.com> wrote:

> Yes, I believe we still need to support Spark 2.4, since many Sedona users
> are still using it.
>
> On Wed, Nov 11, 2020 at 11:21 PM Netanel Malka <netan...@sela.co.il>
> wrote:
>
> > What do you mean by compile target?
> >
> > I've found that Apache Zeppelin handles multiple Spark versions here, using
> > a profile for each Spark version:
> >
> > https://github.com/apache/zeppelin/blob/master/spark/pom.xml#L185
> >
> >
> > Do you think this method is better?
> >
> >
> >
> > Netanel Malka,
> > Big Data Consultant
> > ________________________________
> > From: Felix Cheung <felixche...@apache.org>
> > Sent: Thursday, November 12, 2020 04:05
> > To: dev@sedona.apache.org
> > Cc: Jinxuan Wu; Mohamed Sarwat; Netanel Malka; Paweł Kociński; Zongsi
> > Zhang; d...@sedona.incubator.apache.org
> > Subject: Re: Support Spark 2.4 in Sedona 1.0
> >
> > I am not sure it should be a branch. It is common to deal with this as a
> > compile target, not as a separate branch. A separate branch might be
> > difficult to release?
> >
> > There are a few examples of projects that handle multiple Spark target
> > versions like this.
> >
> >
> > On Wed, Nov 11, 2020 at 12:56 PM Jia Yu <ji...@apache.org> wrote:
> > OK. I agree. I am gonna create a branch for spark-2.3/2.4. Regarding the
> > compiler used in each branch,
> >
> > For Sedona on Spark 3.0, I will compile it using Scala 2.12.
> > For Sedona on Spark 2.4, I will compile it using Scala 2.11.
> >
> > For the Java code in both branches, I will compile it using Java 1.8.
> >
> > Am I missing anything here?
> >
> >
> > On Wed, Nov 11, 2020 at 7:31 AM Netanel Malka <netan...@sela.co.il>
> > wrote:
> >
> > > Hi,
> > >
> > > I also think that we need to support 2.4.
> > >
> > > I saw that even Apache Spark still releases 2.4.x artifacts. (2.4.7 Sep
> > > 12, 2020)
> > >
> > > I also asked about it on us...@spark.apache.org:
> > >
> > >
> > > *Sean Owen (answered the question): *
> > >
> > > "I don't think there's an official EOL for Spark 2.4.x but would expect
> > > another maintenance release in the first half of 2021 at least. I'd also
> > > guess it wouldn't be maintained by 2022."
> > >
> > >
> > > BR,
> > >
> > >
> > >
> > > Netanel Malka,
> > > Big Data Consultant
> > > ------------------------------
> > > *From:* Paweł Kociński <pawel93kocin...@gmail.com>
> > > *Sent:* Wednesday, November 11, 2020 00:29
> > > *To:* Jia Yu
> > > *Cc:* dev@sedona.apache.org; d...@sedona.incubator.apache.org;
> > > Jinxuan Wu; Mohamed Sarwat; Netanel Malka; Zongsi Zhang
> > > *Subject:* Re: Support Spark 2.4 in Sedona 1.0
> > >
> > > Hi Jia,
> > > I think we should support Spark 2.4; a lot of users still use it. More
> > > than that, I think more users still have jobs written for Spark 2.4 than
> > > for 3.0. Will we use an additional branch for that use case? I mean,
> > > Spark 2.4 with Scala 2.12 is an important one.
> > > Regards,
> > > Paweł
> > >
> > > On Mon, Nov 9, 2020 at 20:44, Jia Yu <ji...@apache.org> wrote:
> > >
> > >> Dear all,
> > >>
> > >> In Sedona 1.0, we definitely will support Spark 3.0. But I wonder
> > >> whether we should support Spark 2.4.
> > >>
> > >> In order to support Spark 2.4, we need to do the following:
> > >>
> > >> 1. Compile the source using Scala 2.11. The Sedona master branch is
> > >> currently compiled with Scala 2.12 and Java 1.8.
> > >> 2. For the Scala code of Sedona-SQL and Viz-SQL, I need to change (1) the
> > >> UDF registration hook and (2) the SQL aggregation function format.
> > >> 3. In future releases of Sedona, use git cherry-pick to pick
> > >> important features back to the Spark 2.4 branch. This is what I did in
> > >> GeoSpark to support Spark 2.1, 2.2, and 2.3.
> > >>
> > >> GeoSpark 1.2.0 - 1.3.1 support Spark 2.4 already. We can simply leave it
> > >> that way and just support Spark 3.0.
> > >>
> > >> Do you think we should support Spark 2.4 in future releases?
> > >>
> > >> Thanks,
> > >> Jia Yu
> > >>
> > >
> >
>
