Hi All, I am wondering what the easiest and most concise way is to express the computation below in Spark Structured Streaming, given that it supports both imperative and declarative styles. I am just trying to select the row that has the max timestamp for each train. Instead of using nested queries, as we normally would in a relational database, I am trying to see whether I can leverage the imperative and declarative styles at the same time. If it can be done without nested queries or joins, I would like to see how. I am using Spark 2.1.1.
Dataset:

Train  Dest  Time
1      HK    10:00
1      SH    12:00
1      SZ    14:00
2      HK    13:00
2      SH    09:00
2      SZ    07:00

The desired result should be:

Train  Dest  Time
1      SZ    14:00
2      HK    13:00
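For illustration, here is a minimal sketch of one way this might be done without a nested query or a join: group by train and take the max of a struct whose first field is the time, so the destination travels along with the winning timestamp. It is written against a static DataFrame built from the sample rows above. The column names (`train`, `dest`, `time`), the object name, and the local master are assumptions made for this sketch, and times are assumed to be zero-padded `HH:mm` strings so that string ordering matches chronological ordering. I have not checked it against a streaming source on 2.1.1; there the same aggregation would presumably need an output mode that supports aggregations.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, max, struct}

// Hypothetical object name; a sketch, not a tested streaming job.
object LatestPerTrain {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("latest-per-train")
      .master("local[*]") // assumption: local run for illustration
      .getOrCreate()
    import spark.implicits._

    // Sample rows from the question; times kept as zero-padded strings.
    val trains = Seq(
      (1, "HK", "10:00"),
      (1, "SH", "12:00"),
      (1, "SZ", "14:00"),
      (2, "HK", "13:00"),
      (2, "SH", "09:00"),
      (2, "SZ", "07:00")
    ).toDF("train", "dest", "time")

    // Structs are compared field by field, so putting time first makes
    // max(...) pick the row with the latest time within each train.
    val latest = trains
      .groupBy(col("train"))
      .agg(max(struct(col("time"), col("dest"))).as("latest"))
      .select(
        col("train"),
        col("latest.dest").as("dest"),
        col("latest.time").as("time"))

    latest.orderBy("train").show()

    spark.stop()
  }
}
```

On the sample data this should give SZ 14:00 for train 1 and HK 13:00 for train 2. If two rows of the same train share the same time, the struct comparison falls through to `dest`, so the tie is broken alphabetically rather than by arrival order.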