[ https://issues.apache.org/jira/browse/DATAFU-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eyal Allweil updated DATAFU-168:
--------------------------------
    Summary: Support Spark 2.4.6 and up - fix collectLimitedList compilation  (was: Add support for Spark 2.4.6 and up)

> Support Spark 2.4.6 and up - fix collectLimitedList compilation
> ---------------------------------------------------------------
>
>                 Key: DATAFU-168
>                 URL: https://issues.apache.org/jira/browse/DATAFU-168
>             Project: DataFu
>          Issue Type: Improvement
>    Affects Versions: 1.6.1
>            Reporter: Eyal Allweil
>            Priority: Major
>             Fix For: 1.8.0
>
>
> Once DATAFU-167 is merged, datafu-spark will support Spark versions up to
> 2.4.5. However, because our implementation of _collectLimitedList_ extends
> Spark's _collect_, and because its interface was changed in 2.4.6,
> compilation is broken for us.
>
> Here is the relevant line from collectLimitedList:
> [https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/spark/utils/overwrites/SparkOverwriteUDAFs.scala#L104]
> Here is the compilation error:
> {code:java}
> /Users/eyal/git/datafu/datafu-spark/src/main/scala/spark/utils/overwrites/SparkOverwriteUDAFs.scala:104: class CollectLimitedList needs to be abstract, since:
> it has 3 unimplemented members.
> /** As seen from class CollectLimitedList, the missing signatures are as follows.
>  *  For convenience, these are usable as stub implementations.
>  */
>   // Members declared in org.apache.spark.sql.catalyst.expressions.aggregate.Collect
>   protected val bufferElementType: org.apache.spark.sql.types.DataType = ???
>   protected def convertToBufferElement(value: Any): Any = ???
>
>   // Members declared in org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate
>   def eval(buffer: scala.collection.mutable.ArrayBuffer[Any]): Any = ???
>
> case class CollectLimitedList(child: Expression,
>            ^
> one error found
> FAILURE: Build failed with an exception.
> {code}
>
>
> We need to either *1)* update our implementation, and drop support for older
> versions (and then release this in our version 1.8.0) or *2)* copy the code
> in a backwards compatible way.
> Please note that you can replicate this compilation error on the master
> branch even without merging DATAFU-167 by running:
> {code:java}
> ./gradlew :datafu-spark:test -PscalaVersion=2.11 -PsparkVersion=2.4.6 --tests "DataFrame*"{code}
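
For illustration only, option 1 might look roughly like the sketch below, which fills in the three members named in the compiler output. It assumes Spark 2.4.6 on the classpath and has not been compiled against the DataFu build. The class name CollectLimitedListSketch, the howMuchToTake parameter, and the limiting logic in update/merge are hypothetical stand-ins rather than the actual DataFu code, and the three member bodies are modeled on what Spark's own CollectList is believed to do in 2.4.6.

```scala
import scala.collection.mutable

import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.catalyst.expressions.aggregate.{Collect, ImperativeAggregate}
import org.apache.spark.sql.catalyst.util.GenericArrayData
import org.apache.spark.sql.types.DataType

// Hypothetical stand-in for CollectLimitedList; not the DataFu implementation.
case class CollectLimitedListSketch(child: Expression,
                                    mutableAggBufferOffset: Int = 0,
                                    inputAggBufferOffset: Int = 0,
                                    howMuchToTake: Int = 10) // assumed limit parameter
  extends Collect[mutable.ArrayBuffer[Any]] {

  override def withNewMutableAggBufferOffset(newOffset: Int): ImperativeAggregate =
    copy(mutableAggBufferOffset = newOffset)

  override def withNewInputAggBufferOffset(newOffset: Int): ImperativeAggregate =
    copy(inputAggBufferOffset = newOffset)

  override def createAggregationBuffer(): mutable.ArrayBuffer[Any] =
    mutable.ArrayBuffer.empty

  // --- The three members reported as unimplemented from 2.4.6 onward ---

  // Element type stored in the aggregation buffer.
  override protected lazy val bufferElementType: DataType = child.dataType

  // Copy the value so the buffer does not hold references to reused rows.
  override protected def convertToBufferElement(value: Any): Any =
    InternalRow.copyValue(value)

  // Turn the buffer into the array value returned by the aggregate.
  override def eval(buffer: mutable.ArrayBuffer[Any]): Any =
    new GenericArrayData(buffer.toArray)

  // --- Assumed limiting behaviour, for illustration only ---

  // Stop appending once the buffer holds howMuchToTake elements.
  override def update(buffer: mutable.ArrayBuffer[Any],
                      input: InternalRow): mutable.ArrayBuffer[Any] =
    if (buffer.size < howMuchToTake) super.update(buffer, input) else buffer

  // Merge only as many elements as still fit under the limit.
  override def merge(buffer: mutable.ArrayBuffer[Any],
                     other: mutable.ArrayBuffer[Any]): mutable.ArrayBuffer[Any] = {
    buffer ++= other.take(howMuchToTake - buffer.size)
    buffer
  }
}
```

Because bufferElementType and convertToBufferElement do not exist on Collect before 2.4.6, the override modifiers on them would fail to compile against 2.4.5 and earlier, which is why this shape corresponds to option 1 (dropping older versions) rather than option 2.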