All, I have three questions:

(1) Is there a timeline for a stable Spark 2.0 release? I know the 'preview' build is available, but I was curious about the timeline for the full release. Jira seems to indicate a release on 7/27.

(2) There has been a lot of discussion around 'continuous' datasets. One item that came up in tickets was the idea that 'count()' and similar functions do not apply to continuous datasets: https://github.com/apache/spark/pull/12080. In that case, what is the intended procedure for calculating a streaming statistic over an interval (e.g. counting the number of records in a 2-minute window every 2 minutes)?

(3) In previous releases (1.6.1), calling DStream / RDD repartition with the number of partitions set to zero silently deletes data. I have looked in Jira for a similar issue but do not see one. I would like to address this (and would likely be willing to fix it myself). Should I just create a ticket?

Thank you,

Bryan Jeffrey
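P.S. re (2): to clarify the kind of computation I mean, here is a rough sketch of a windowed count using the event-time window aggregation in the 2.0 preview's Structured Streaming API (the `events` DataFrame and its `timestamp` column are placeholders for whatever streaming source is in use):

```scala
import org.apache.spark.sql.functions.window

// events: a streaming DataFrame with an event-time column named "timestamp".
// Count records falling into non-overlapping 2-minute windows.
val counts = events
  .groupBy(window($"timestamp", "2 minutes"))
  .count()
```

If `count()` is not supported directly on a continuous dataset, is this window-based aggregation the intended replacement, or is there another recommended pattern?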
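P.S. re (3): a minimal reproduction of the behavior I described, from a 1.6.1 spark-shell session:

```scala
// Repartitioning to zero partitions silently discards all records
// instead of failing fast with an error.
val rdd = sc.parallelize(1 to 100)
rdd.repartition(0).count()  // comes back 0 with no error or warning
```

I would expect `repartition(0)` to throw (the way an invalid argument normally would) rather than return an empty result.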