oh the map in DataFrame is actually using a RowEncoder. i left it out because it wasn't important:
so this doesn't compile:

  def f[T]: Dataset[T] => Dataset[T] = dataset => {
    val df = dataset.toDF
    df.map(row => row)(RowEncoder(df.schema)).as[T]
  }

now this does compile. but i don't like it, since the assumption of an
implicit encoder isn't always true, and i don't want to start passing
encoders around when they are already embedded in the datasets:

  def f[T: Encoder]: Dataset[T] => Dataset[T] = dataset => {
    val df = dataset.toDF
    df.map(row => row)(RowEncoder(df.schema)).as[T]
  }

On Thu, Jan 26, 2017 at 10:50 AM, Jacek Laskowski <ja...@japila.pl> wrote:

> Hi Koert,
>
> map will take the value that has an implicit Encoder to any value that
> may or may not have an encoder in scope. That's why I'm asking about
> the map function to see what it does.
>
> Regards,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Thu, Jan 26, 2017 at 4:18 PM, Koert Kuipers <ko...@tresata.com> wrote:
> > the map operation works on DataFrame so it doesn't need an encoder. It
> > could have been any operation on DataFrame. the issue is at the end going
> > back to Dataset[T] using as[T]. this requires an encoder for T which i
> > know i already have since i started with a Dataset[T].
> >
> > i could add an implicit encoder but that assumes T has an implicit
> > encoder which isn't always true. for example i could be using a kryo
> > encoder. but anyhow i shouldn't have to be guessing this Encoder[T] since
> > it's part of my dataset already
> >
> > On Jan 26, 2017 05:18, "Jacek Laskowski" <ja...@japila.pl> wrote:
> >
> > Hi,
> >
> > Can you show the code from map to reproduce the issue? You can create
> > encoders using Encoders object (I'm using it all over the place for
> > schema generation).
> >
> > Jacek
> >
> > On 25 Jan 2017 10:19 p.m., "Koert Kuipers" <ko...@tresata.com> wrote:
> >>
> >> i often run into problems like this:
> >>
> >> i need to write a Dataset[T] => Dataset[T], and inside i need to switch
> >> to DataFrame for a particular operation.
> >>
> >> but if i do:
> >> dataset.toDF.map(...).as[T] i get error:
> >> Unable to find encoder for type stored in a Dataset.
> >>
> >> i know it has an encoder, because i started with Dataset[T]
> >>
> >> so i would like to do:
> >> dataset.toDF.map(...).as[T](dataset.encoder)
> >>
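
for reference, here is a minimal self-contained sketch of the variant that compiles (the one with the `T: Encoder` context bound), called with an encoder passed explicitly instead of relying on an implicit in scope. it assumes a Spark 2.x build where `RowEncoder(schema)` is available. the `Person` case class, the `EncoderRoundTrip` object and the local `SparkSession` settings are made up for illustration and are not from this thread. this is not the `dataset.encoder` approach asked for above, just the workaround of handing the encoder in by hand.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}
import org.apache.spark.sql.catalyst.encoders.RowEncoder

object EncoderRoundTrip {
  // Hypothetical record type, used only for this illustration.
  final case class Person(name: String, age: Int)

  // Same shape as the compiling version in the thread: the context bound
  // demands an Encoder[T], which as[T] uses when converting back from DataFrame.
  def f[T: Encoder]: Dataset[T] => Dataset[T] = dataset => {
    val df = dataset.toDF
    // The DataFrame-side operation (identity here) uses a RowEncoder built from the schema.
    df.map(row => row)(RowEncoder(df.schema)).as[T]
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is sufficient for trying this out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("encoder-round-trip")
      .getOrCreate()

    // Build the encoder explicitly rather than importing spark.implicits._
    val enc: Encoder[Person] = Encoders.product[Person]
    val people: Dataset[Person] =
      spark.createDataset(Seq(Person("a", 1), Person("b", 2)))(enc)

    // Supply the encoder explicitly in place of the implicit, then apply the function.
    val result: Dataset[Person] = f[Person](enc)(people)
    result.show()

    spark.stop()
  }
}
```

an encoder from `Encoders.kryo[T]` could presumably be supplied in the same position, but this still means carrying the encoder alongside the dataset, which is exactly what i would like to avoid.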