> From what I understand, an untyped transformation returns a DataFrame, whereas a typed one returns a Dataset. In the source code, the untyped methods have a return type of DataFrame instead of Dataset, and they should also be annotated with @group untypedrel. So you can check a method's signature to determine whether it is typed or untyped.
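To make the signature check concrete, here is a minimal Scala sketch. It assumes a local Spark session, and the Person case class, the column name isAdult and the object name are made up for illustration only. The explicit type annotations on each val are the point: the compiler accepts them only because those are the declared return types (in Scala, DataFrame is an alias for Dataset[Row]).

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Int)

object TypedVsUntyped {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session is acceptable for the demo.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("typed-vs-untyped")
      .getOrCreate()
    import spark.implicits._

    val people: Dataset[Person] = Seq(Person("Ann", 31), Person("Bob", 17)).toDS()

    // Typed: filter with a Scala function keeps the element type Person.
    val adults: Dataset[Person] = people.filter((p: Person) => p.age >= 18)

    // Typed: map returns a Dataset of the function's result type.
    val names: Dataset[String] = people.map((p: Person) => p.name)

    // Untyped: withColumn changes the schema, so the result is a DataFrame.
    val withFlag: DataFrame = people.withColumn("isAdult", col("age") >= 18)

    // Untyped: select with Column arguments also returns a DataFrame.
    val agesOnly: DataFrame = people.select(col("age"))

    // Going back to a typed view needs a schema that matches a case class.
    val typedAgain: Dataset[Person] = withFlag.drop("isAdult").as[Person]

    adults.show()
    names.show()
    withFlag.printSchema()
    agesOnly.printSchema()
    typedAgain.show()

    spark.stop()
  }
}
```

If one of the untyped results were annotated as Dataset[Person] instead, the code would fail to compile, which is a quick way to find out which group a method belongs to without opening the source.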
In general, anything that changes the type of a column or adds a new column to a Dataset will be untyped. The idea of a Dataset is that its schema stays constant. The moment you try to modify the schema, we need to fall back to a DataFrame. For example, withColumn is untyped because it transforms the typed Dataset into an untyped structure (a DataFrame).

From: Akhilanand <akhilanand...@gmail.com>
Sent: Thursday, February 21, 2019 7:35 PM
To: user <user@spark.apache.org>
Subject: Difference between Typed and untyped transformation in dataset API

What is the key difference between typed and untyped transformations in the Dataset API? How do I determine whether a transformation is typed or untyped? Are there any gotchas about when to use which, apart from whether it does the job for me?