Re: Keeping only latest row by key?

2018-07-19 Thread Fabian Hueske
.groupBy("id").sortGroup("timestamp", Order.DESCENDING).first(1); Is there anything I’ve misunderstood with this? From: Porritt, James Sent: 19 July 2018 09:21 To: 'Timo Walther' …
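The DataSet-API suggestion quoted above (groupBy → sortGroup descending → first(1)) keeps, per key, the single row with the largest timestamp. A minimal plain-Scala sketch of the same semantics, using hypothetical (id, timestamp, value) records rather than an actual Flink DataSet:

```scala
// Hypothetical sample records: (id, timestamp, value).
case class Row(id: String, timestamp: Long, value: String)

val rows = Seq(
  Row("a", 1L, "old"), Row("a", 3L, "new"),
  Row("b", 2L, "only")
)

// Same effect as groupBy("id").sortGroup("timestamp", DESCENDING).first(1):
// within each id group, keep only the row with the largest timestamp.
val latest = rows
  .groupBy(_.id)
  .map { case (_, group) => group.maxBy(_.timestamp) }
  .toSeq
  .sortBy(_.id) // deterministic order for inspection
```

This is only an illustration of the batch semantics; in Flink the grouping and sorting happen distributed and lazily.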

RE: Keeping only latest row by key?

2018-07-19 Thread Porritt, James
… this? From: Porritt, James Sent: 19 July 2018 09:21 To: 'Timo Walther' Cc: user@flink.apache.org Subject: RE: Keeping only latest row by key? Hi Timo, Thanks for this. I’ve been looking into creating this in Java by looking at MaxAggFunction.scala as a basis. Is it c…

RE: Keeping only latest row by key?

2018-07-19 Thread Porritt, James
… the correct type of table field? Thanks, James. From: Timo Walther Sent: 18 July 2018 12:21 To: Porritt, James Cc: user@flink.apache.org Subject: Re: Keeping only latest row by key? Hi James, the easiest solution for this behavior is to use a user-defined LAST_VALUE aggregate function as …

Re: Keeping only latest row by key?

2018-07-18 Thread Timo Walther
Hi James, the easiest solution for this behavior is to use a user-defined LAST_VALUE aggregate function as discussed here [1]. I hope this helps. Regards, Timo [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Using-SQL-with-dynamic-tables-where-rows-are-updated-td20519.htm
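The LAST_VALUE idea boils down to an accumulator that remembers the value carried by the largest timestamp seen so far. A plain-Scala sketch of that state (illustrative names only, not the actual Flink AggregateFunction interface):

```scala
// Sketch of the state a LAST_VALUE-style aggregate would keep:
// the value observed with the largest timestamp so far.
class LastValueAcc[T] {
  private var ts: Long = Long.MinValue
  private var last: Option[T] = None

  // Called once per incoming row; later timestamps win ties.
  def accumulate(timestamp: Long, value: T): Unit =
    if (timestamp >= ts) { ts = timestamp; last = Some(value) }

  // Result after all rows of the group have been accumulated.
  def getValue: Option[T] = last
}
```

A real Flink UDAF would additionally need merge/reset methods and registration with the table environment; this only shows the core logic.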

Re: Keeping only latest row by key?

2018-07-18 Thread Andrey Zagrebin
Hi James, There are over windows in Flink Table API: https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/table/tableApi.html#over-windows It should be possible to implement this behavior …
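An over window emits one aggregate result per input row rather than one per group. As a rough plain-Scala illustration of those semantics (hypothetical records in arrival order, not the Table API itself), here is a running "latest row per key" computed row by row:

```scala
// Hypothetical stream of records in arrival order.
case class Rec(id: String, ts: Long, v: String)

val stream = Seq(Rec("a", 1L, "x"), Rec("b", 2L, "p"), Rec("a", 3L, "y"))

// Like an unbounded over window: after each record, the state holds the
// latest-by-timestamp record per id seen so far.
val running = stream.scanLeft(Map.empty[String, Rec]) { (state, r) =>
  val newer = state.get(r.id).forall(_.ts <= r.ts)
  if (newer) state + (r.id -> r) else state
}.drop(1) // drop the empty initial state

// running.last is the final latest-row-per-id view.
```

The key difference from a group-by is that intermediate results exist after every record, which is what makes the over-window formulation natural on a stream.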

Keeping only latest row by key?

2018-07-17 Thread Porritt, James
In Spark, if I want to get a set of unique rows by id, keeping for each id the row with the latest timestamp, I would do the following: .withColumn("rn", F.row_number().over(Window.partitionBy…
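The Spark pattern numbers the rows within each id partition by descending timestamp and then keeps only the rows where the row number is 1. The same logic sketched in plain Scala collections (hypothetical sample data, not the Spark API):

```scala
// Hypothetical sample rows: (id, ts, value).
case class R(id: String, ts: Long, v: String)

val data = Seq(R("a", 1L, "x"), R("a", 2L, "y"), R("b", 5L, "z"))

// Emulate row_number().over(Window.partitionBy("id").orderBy(desc("ts"))):
// within each id, sort by ts descending and number the rows from 1.
val withRn: Seq[(R, Int)] = data
  .groupBy(_.id)
  .values
  .flatMap(g => g.sortBy(-_.ts).zipWithIndex.map { case (r, i) => (r, i + 1) })
  .toSeq

// Keep only rn == 1, i.e. the latest row per id.
val latest = withRn.collect { case (r, 1) => r }.sortBy(_.id)
```

This mirrors what the thread is asking Flink to do; the replies above show the Flink-side equivalents (sortGroup + first, a LAST_VALUE aggregate, or an over window).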