Re: time-windowed joins and tumbling windows

2020-03-25 Thread Vinod Mehra
> and > > the regular DataStream API windows. I don't expect bugs in DataStream > API windows, so I would suggest to verify the join operator. > > I hope this helps. > > Regards, > Timo > > > > On 13.03.20 23:56, Vinod Mehra wrote: > > Thanks Timo for re

Re: time-windowed joins and tumbling windows

2020-03-13 Thread Vinod Mehra
at 3:56 PM Vinod Mehra wrote: > Thanks Timo for responding back! Answers below: > > > 1) Which planner are you using? > > We are using Flink 1.8 and using the default planner > (org.apache.flink.table.calcite.FlinkPlannerImpl) > from: org.apache.flink:flink-table-planner_2.11

Re: time-windowed joins and tumbling windows

2020-03-13 Thread Vinod Mehra
of 1 or higher? > 4) Can you share the output of TableEnvironment.explain() with us? > > Shouldn't c have a rowtime constraint around o instead of r? Such that > all time-based operations work on o.rowtime? > > Regards, > Timo > > > On 10.03.20 19:26, Vinod Mehra wrote:

time-windowed joins and tumbling windows

2020-03-10 Thread Vinod Mehra
Hi! We are testing the following 3-way time-windowed join to keep the retained state size small. This is our first time using joins. It works in unit tests, but we are not getting the expected results in production. We are still troubleshooting this issue. Can you please help us review this in

Re: OVER operator filtering out records

2019-08-25 Thread Vinod Mehra
(?). Is that what is going on? Can someone confirm? Is there a way to flush out periodically? Thanks, Vinod On Fri, Aug 23, 2019 at 10:37 PM Vinod Mehra wrote: > Although things improved during bootstrapping and when event volume was > larger. As soon as the traffic slowed down the events are getting

Re: OVER operator filtering out records

2019-08-23 Thread Vinod Mehra
Things improved during bootstrapping and when event volume was larger, but as soon as the traffic slowed down, events get stuck (buffered?) at the OVER operator for a very long time: several hours. On Fri, Aug 23, 2019 at 5:04 PM Vinod Mehra wrote: > (Forgot to mention that
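
One plausible reading of this symptom, which the snippets here do not confirm: an event-time OVER aggregation emits a row only once the watermark has passed that row's timestamp, and a watermark derived from observed event timestamps stops advancing while the stream is idle. The sketch below is a toy Python simulation of that behavior, not Flink code; `ToyOverOperator`, `max_out_of_orderness`, and the integer timestamps are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyOverOperator:
    """Toy model: rows are buffered until the watermark reaches their timestamp."""

    max_out_of_orderness: int = 5  # assumed watermark lag, illustration only
    max_seen_ts: int = -1
    buffered: list = field(default_factory=list)

    def watermark(self) -> int:
        # The watermark trails the largest timestamp seen so far.
        return self.max_seen_ts - self.max_out_of_orderness

    def on_event(self, ts: int) -> list:
        # Buffer the new row, advance the watermark, then emit what is ready.
        self.buffered.append(ts)
        self.max_seen_ts = max(self.max_seen_ts, ts)
        wm = self.watermark()
        ready = sorted(t for t in self.buffered if t <= wm)
        self.buffered = [t for t in self.buffered if t > wm]
        return ready


op = ToyOverOperator()
first = op.on_event(100)   # watermark is 95, so row 100 stays buffered
second = op.on_event(110)  # watermark is 105, so row 100 is released; 110 waits
```

In this model the last row of a quiet period stays buffered until some later event moves the watermark forward, which would match rows appearing hours late on a stream with one or two events every few hours.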

Re: OVER operator filtering out records

2019-08-23 Thread Vinod Mehra
, 2019 at 3:09 PM Vinod Mehra wrote: > We have a SQL-based Flink job which consumes a very low-volume stream (1 > or 2 events in a few hours): > > > > > > > SELECT user_id, COUNT(*) OVER (PARTITION BY user_id ORDER BY rowtime > RANGE INTERVAL '30' DAY PRECEDING

OVER operator filtering out records

2019-08-23 Thread Vinod Mehra
We have a SQL-based Flink job which consumes a very low-volume stream (1 or 2 events in a few hours): SELECT user_id, COUNT(*) OVER (PARTITION BY user_id ORDER BY rowtime RANGE INTERVAL '30' DAY PRECEDING) as count_30_days, COALESCE(occurred_at, logged_at) AS latency_marker,

Re: count(DISTINCT) in flink SQL

2019-06-04 Thread Vinod Mehra
On Tue, Jun 4, 2019 at 5:14 PM Vinod Mehra wrote: > Thanks a lot Fabian for the detailed response. I know all the duplicates > are going to arrive within an hour max of the actual event. So using a 1 > hour running session window should be fine for me. > > Is the following the

Re: count(DISTINCT) in flink SQL

2019-06-04 Thread Vinod Mehra
tention-time > [3] > https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/sql.html#group-windows > > Am Do., 30. Mai 2019 um 02:18 Uhr schrieb Vinod Mehra : > >> Another interesting thing is that if I add DISTINCT in the 2nd query it >> doesn't co

Re: count(DISTINCT) in flink SQL

2019-05-29 Thread Vinod Mehra
event_foo GROUP BY event_id, user_id ) GROUP BY user_id, MONTH(row_datetime), YEAR(row_datetime) On Wed, May 29, 2019 at 5:15 PM Vinod Mehra wrote: > More details on the error with query#1 that used COUNT(DISTINCT()): > > org.apache.flink.table.api.TableException: Cannot generat
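
The nested GROUP BY visible in this snippet appears to deduplicate on (event_id, user_id) before counting per user, as a stand-in for COUNT(DISTINCT). A minimal Python sketch of that two-step idea follows; the sample events are invented for illustration, and the month/year grouping from the query is left out to keep the example short.

```python
from collections import Counter

# Hypothetical (event_id, user_id) pairs; "e1" arrives twice as a duplicate.
events = [("e1", "u1"), ("e1", "u1"), ("e2", "u1"), ("e3", "u2")]

# Step 1 (inner GROUP BY event_id, user_id): keep one row per distinct pair.
unique_pairs = set(events)

# Step 2 (outer GROUP BY user_id): count the deduplicated rows per user.
count_after_dedup = Counter(user for _, user in unique_pairs)

# Reference result: distinct event ids per user, i.e. COUNT(DISTINCT event_id).
distinct_ids = {}
for event_id, user in events:
    distinct_ids.setdefault(user, set()).add(event_id)
count_distinct = {user: len(ids) for user, ids in distinct_ids.items()}
```

On this sample both approaches give the same per-user counts, which is the equivalence the rewritten query relies on.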

Re: count(DISTINCT) in flink SQL

2019-05-29 Thread Vinod Mehra
t org.apache.flink.table.api.StreamTableEnvironment.optimize(StreamTableEnvironment.scala:683) at org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:730) at org.apache.flink.table.api.java.StreamTableEnvironment.toRetractStream(StreamTableEnvironment.scala:414) On Wed, May 29, 2019 a

count(DISTINCT) in flink SQL

2019-05-29 Thread Vinod Mehra
Hi! We are using apache-flink-1.4.2. It seems this version doesn't support count(DISTINCT). I am trying to find a way to dedup the stream. So I tried: SELECT CONCAT_WS( '-', CAST(MONTH(longToDateTime(rowtime)) AS VARCHAR), CAST(YEAR(longToDateTime(rowtime)) AS VARCHAR),

Re: TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-04-03 Thread Vinod Mehra
every x months on 2nd day at 01:34:00.000 PST? Same thing with years. Thanks, Vinod On Thu, Mar 28, 2019 at 5:02 PM Vinod Mehra wrote: > btw the max DAY window that is allowed is 99 days. After that it blows up > here: > https://github.com/apache/calcite/blob/master/core/src/main/java/o

Re: TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-03-28 Thread Vinod Mehra
e 100 exceeds precision of DAY(2) field" Resetting things based on larger windows (month, quarter, year) can be quite useful. Is there a practical limitation with Flink (state size blows up?) for not supporting such large windows? - Vinod On Thu, Mar 28, 2019 at 3:24 PM Vinod Mehra wrote
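
The error text quoted here is consistent with an interval literal whose leading field has a default precision of two digits, so 99 is the largest DAY value that fits. The Python sketch below only illustrates that arithmetic; `fits_leading_precision` is a hypothetical helper, not a Calcite or Flink function. Standard SQL lets a literal declare a wider leading field, for example DAY(3), but whether the Flink version discussed in this thread accepts that in TUMBLE is not confirmed by these snippets.

```python
def fits_leading_precision(value: int, precision: int = 2) -> bool:
    """Return True if `value` has at most `precision` digits."""
    return abs(value) < 10 ** precision


ok_99 = fits_leading_precision(99)             # two digits fit DAY(2)
ok_100 = fits_leading_precision(100)           # three digits do not fit DAY(2)
ok_100_wide = fits_leading_precision(100, 3)   # they would fit a DAY(3) field
```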

Re: TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-03-28 Thread Vinod Mehra
> [1] > https://issues.apache.org/jira/browse/FLINK-11017?jql=project%20%3D%20FLINK%20AND%20text%20~%20Month > > > On Thu, 28 Mar 2019, 19:32 Vinod Mehra, wrote: > >> Hi All! >> >> We are using: org.apache.flink:flink-table_2.11:jar:1.4.2:compile >> >

Re: TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-03-28 Thread Vinod Mehra
t%20%3D%20FLINK%20AND%20text%20~%20Month > > > On Thu, 28 Mar 2019, 19:32 Vinod Mehra, wrote: > >> Hi All! >> >> We are using: org.apache.flink:flink-table_2.11:jar:1.4.2:compile >> >> SELECT >> COALESCE(user_id, -1) AS user_id, >>

Re: TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-03-28 Thread Vinod Mehra
Doh! Sorry about that! :) Thanks again! On Thu, Mar 28, 2019 at 12:49 PM Dawid Wysakowicz wrote: > I did ;) but here is the link one more time: > https://issues.apache.org/jira/browse/FLINK-11017?jql=project%20%3D%20FLINK%20AND%20text%20~%20Month > > On Thu, 28 Mar 2019, 20:48

TUMBLE function with MONTH/YEAR intervals produce milliseconds window unexpectedly

2019-03-28 Thread Vinod Mehra
Hi All! We are using: org.apache.flink:flink-table_2.11:jar:1.4.2:compile SELECT COALESCE(user_id, -1) AS user_id, count(id) AS count_per_window, sum(amount) AS charge_amount_per_window, TUMBLE_START(rowtime, INTERVAL '2' YEAR) AS twindow_start, TUMBLE_END(rowtime, INTERVAL '2' YEAR)