[ 
https://issues.apache.org/jira/browse/SPARK-49961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-49961:
-----------------------------------
    Labels: pull-request-available  (was: )

> Dataset.transform no longer has the correct return type
> -------------------------------------------------------
>
>                 Key: SPARK-49961
>                 URL: https://issues.apache.org/jira/browse/SPARK-49961
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Chris Twiner
>            Priority: Major
>              Labels: pull-request-available
>
> In versions prior to 4.0.0-preview2, sql.Dataset.transform had the signature:
> {code:java}
> def transform[U](t: (sql.Dataset[T]) ⇒ sql.Dataset[U]): sql.Dataset[U] {code}
> 4.0.0-preview2 has moved this to the parent class sql.api.Dataset with the 
> signature:
> {code:java}
> def transform[U](t: (sql.api.Dataset[T]) ⇒ sql.api.Dataset[U]): sql.api.Dataset[U] {code}
> rendering all function objects and return values typed as sql.Dataset 
> incompatible.
> It seems F-bounded polymorphism or a similar self type is needed for the 
> types to remain correct (e.g. if you are dealing with sql.Dataset, all types 
> should be sql.Dataset).
> {code:java}
> import sparkSession.implicits._
> val ds = Seq(1, 2).toDS()
> val f: Dataset[Int] => Dataset[Int] = d => d.selectExpr("(value + 1) value").as[Int]
> val transformed = ds.transform(f)
> assert(transformed.collect().sorted === Array(2, 3)) {code}
> now fails to compile.
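
The F-bounded idea mentioned above can be illustrated without Spark. The following is only a minimal sketch under stated assumptions: ApiDataset, Dataset, values and map are hypothetical stand-ins for the two class layers, not Spark's actual definitions, and it does not claim to be the fix applied in the linked pull request. The parent class takes the concrete dataset type constructor as a parameter, so transform can be declared in the parent while still accepting and returning the subclass type.

```scala
// Hypothetical, Spark-free model of a parent/child dataset hierarchy.
// None of these definitions are Spark's real code.
object FBoundedSketch {
  // Parent layer. DS is the concrete dataset type constructor of the
  // subclass, bounded so that DS[X] is itself an ApiDataset[X, DS].
  abstract class ApiDataset[T, DS[X] <: ApiDataset[X, DS]] {
    def values: Seq[T]
    // The signature mentions DS instead of ApiDataset, so callers holding
    // the concrete type pass and receive the concrete type.
    def transform[U](t: DS[T] => DS[U]): DS[U]
  }

  // Child layer: binds DS to itself.
  final class Dataset[T](val values: Seq[T]) extends ApiDataset[T, Dataset] {
    def map[U](f: T => U): Dataset[U] = new Dataset(values.map(f))
    override def transform[U](t: Dataset[T] => Dataset[U]): Dataset[U] = t(this)
  }

  def main(args: Array[String]): Unit = {
    val ds = new Dataset(Seq(1, 2))
    // A function typed against the concrete class, as in the report.
    val f: Dataset[Int] => Dataset[Int] = d => d.map(_ + 1)
    val transformed: Dataset[Int] = ds.transform(f)
    assert(transformed.values.sorted == Seq(2, 3))
  }
}
```

With this shape, a function value of type Dataset[Int] => Dataset[Int] type-checks against transform and the result is a Dataset[Int], which is the behaviour the example in the report expects. The cost is an extra type parameter on the parent class that every subclass has to supply.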



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

