Re: [SPARK-30319][SQL] Add a stricter version of as[T]

2020-01-08 Thread Enrico Minack
Yes, as[T] is lazy, as any transformation is, but in terms of data processing, not schema. You seem to imply that as[T] is lazy in terms of the schema, whereas I do not know of any other transformation that behaves like this. Your proposed solution works, because the map transformation returns the

Re: [SPARK-30319][SQL] Add a stricter version of as[T]

2020-01-07 Thread Wenchen Fan
I think it's simply because as[T] is lazy. You will see the right schema if you do `df.as[T].map(identity)`.

On Tue, Jan 7, 2020 at 4:42 PM Enrico Minack wrote:
> Hi Devs,
>
> I'd like to propose a stricter version of as[T]. Given the interface def
> as[T](): Dataset[T], it is
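
For readers following the thread, the behaviour under discussion can be reproduced with a minimal sketch like the one below. It is an illustration, not code from the thread: the case class `Person`, the extra `score` column, and the helper `columnsBeforeAndAfter` are hypothetical names, and a local Spark session with the Spark SQL dependency on the classpath is assumed.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical target type T; it deliberately lacks the "score" column.
case class Person(id: Long, name: String)

object AsSchemaDemo {
  // Returns the column names of df.as[Person] and of df.as[Person].map(identity).
  def columnsBeforeAndAfter(spark: SparkSession): (Seq[String], Seq[String]) = {
    import spark.implicits._

    // Originating DataFrame with one column more than Person declares.
    val df = Seq((1L, "a", 3.0)).toDF("id", "name", "score")

    // as[T] alone keeps the schema of the originating DataFrame.
    val typed = df.as[Person]

    // map(identity) passes rows through the Person encoder,
    // so the resulting schema is derived from Person only.
    val mapped = typed.map(identity)

    (typed.columns.toSeq, mapped.columns.toSeq)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("as-schema-demo")
      .getOrCreate()
    try {
      val (before, after) = columnsBeforeAndAfter(spark)
      println(s"df.as[Person] columns:               ${before.mkString(", ")}")
      println(s"df.as[Person].map(identity) columns: ${after.mkString(", ")}")
    } finally {
      spark.stop()
    }
  }
}
```

Under these assumptions the first column list should still contain `score`, while the second should contain only the fields of `Person`, which is the discrepancy the proposal is about.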

[SPARK-30319][SQL] Add a stricter version of as[T]

2020-01-07 Thread Enrico Minack
Hi Devs, I'd like to propose a stricter version of as[T]. Given the interface def as[T](): Dataset[T], it is counter-intuitive that the schema of the returned Dataset[T] is not agnostic to the schema of the originating Dataset. The schema should always be derived only from T. I am proposing