Recall: Questions about Stateful Operations in SS

2017-07-27 Thread Zhang, Lubo
Zhang, Lubo would like to recall the message, "Questions about Stateful Operations in SS".

RE: Questions about Stateful Operations in SS

2017-07-27 Thread Zhang, Lubo
Got you, thanks for your reply.
Best regards,
Lubo
From: Tathagata Das [mailto:tathagata.das1...@gmail.com]
Sent: Thursday, July 27, 2017 3:08 AM
To: Zhang, Lubo <lubo.zh...@intel.com>
Cc: dev@spark.apache.org
Subject: Re: Questions about Stateful Operations in SS
Hello Lubo, Th

RE: Questions about Stateful Operations in SS

2017-07-27 Thread Zhang, Lubo
From: Tathagata Das [mailto:tathagata.das1...@gmail.com]
Sent: Thursday, July 27, 2017 3:08 AM
To: Zhang, Lubo <lubo.zh...@intel.com>
Cc: dev@spark.apache.org
Subject: Re: Questions about Stateful Operations in SS
Hello Lubo, The idea of timeouts is to make a best-effort and last-resort

Re: Questions about Stateful Operations in SS

2017-07-26 Thread Tathagata Das
Hello Lubo, The idea of timeouts is to make a best-effort, last-resort attempt to process a key when it has not received data for a while. With a processing-time timeout of 1 minute, the system guarantees that it will not time out unless at least 1 minute has passed. Defining a precise timing on
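The "at least 1 minute" guarantee can be pictured with a small toy model. This is plain Python, not Spark code: the class `ToyTimeoutStore`, its method `run_trigger`, the clock values, and the extra word `hive` are all hypothetical, chosen only for illustration. It assumes, as the reply suggests, that timeouts are checked only when a micro-batch actually runs, so a key may expire well after its deadline but never before it.

```python
# Toy model of a processing-time timeout. NOT the Spark API; all names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class ToyTimeoutStore:
    timeout_s: float                                # e.g. 60 for a 1 minute timeout
    deadlines: dict = field(default_factory=dict)   # key -> earliest time it may expire

    def run_trigger(self, now, keys_with_data):
        """Process one micro-batch at clock time `now`; return the keys that timed out."""
        # A key that receives data has its deadline pushed forward.
        for key in keys_with_data:
            self.deadlines[key] = now + self.timeout_s
        # Timeouts are evaluated only here, when a trigger runs (assumption of this sketch).
        expired = sorted(k for k, d in self.deadlines.items() if d <= now)
        for key in expired:
            del self.deadlines[key]
        return expired


store = ToyTimeoutStore(timeout_s=60)
first = store.run_trigger(0, ["apache", "spark"])   # batch 0
second = store.run_trigger(30, ["hadoop"])          # batch 1, only 30 s later
third = store.run_trigger(200, ["hive"])            # next batch arrives much later
print(first, second, third)
```

In this model nothing expires at time 30 because no deadline has passed, and the three earlier words expire only at time 200, when the next trigger runs. That is a lower bound on the timeout, with no upper bound.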

Questions about Stateful Operations in SS

2017-07-26 Thread Zhang, Lubo
Hi all, I have a question about the stateful operations [map/flatMap]GroupsWithState in Structured Streaming. The issue is as follows: take the StructuredSessionization example. First I input two words, apache and spark, in batch 0, then input another word, Hadoop, in batch 1 until timeout