Hi all,

Does anyone have ideas to share on how we could provide better testing tools or better documentation?
@Till, @Aljoscha Do you have further suggestions for improving the testing tools, and what next step should we take?

Best,
Tony Wei

Tony Wei <tony19920...@gmail.com> wrote on Sat, Sep 29, 2018 at 12:20 AM:

> Hi Till,
>
> Thanks for your feedback. I hadn't thought about this before. It would be
> better to provide such tools than to let users deal with testing at the
> operator level directly. The test harness was introduced for contributors,
> who tend to know more about the internal design, so that they can test
> their patches easily. For users it is often not intuitive to use. That is
> why I want to provide some examples in the docs, to ease that pain in the
> meantime.
>
> If Aljoscha has already put effort into this and such a tool will be
> available in the near future, I look forward to seeing it happen. If there
> is anything I can help with, I'm glad to lend a hand.
>
> To follow up on the discussion about users' requirements, I can offer some
> from my own experience. What I want to achieve is unit testing of my
> operators together with Flink features such as state, event time, and
> checkpoints.
>
> From the testing doc [1] I provided, I derived the scenarios I would want
> to test in my Flink application:
>
> 1. Easily control the stream records to observe the changes in state and
>    output, whether for a one-input or a two-input stream operator.
> 2. Manually control watermark progress to test the "out of order"
>    problem, the "late data" problem, or the behavior of the "event time
>    service".
> 3. Preserve state from every version to test state evolution and
>    compatibility, or just to test a change of Flink version.
> 4. Test a customized stateful source function. In the current version, a
>    source function still has to be implemented as a long-running method.
>    An easy way to block it and verify its state would be helpful.
> 5.
>    A simple way to expose state so that the exact value stored in it can
>    be verified, instead of testing it indirectly through the outputs. I
>    achieved this by getting the state backend from the test harness and
>    following some examples in the Flink project to access the keyed state
>    I needed, but that reaches too deep into the internal implementation
>    to serve as an example in my testing doc [1].
>
> Best,
> Tony Wei
>
> [1]
> https://github.com/apache/flink/compare/master...tony810430:flink-testing-doc
>
> 2018-09-28 21:48 GMT+08:00 Till Rohrmann <trohrm...@apache.org>:
>
>> Hi Tony,
>>
>> I think better testing tools for our users are a long sought-after
>> feature. Thus, I'm strongly in favour of adding something like this.
>> If I remember correctly, Aljoscha has already spent some brain cycles on
>> this, and he also gave a training about the current state at FF SF 2018.
>> I've pulled him in to give more details.
>>
>> The first two things we can do are to collect requirements from our
>> users and to see what the current state is. Based on that, we can plan
>> which things to add in which order.
>>
>> Cheers,
>> Till
>>
>> On Fri, Sep 28, 2018 at 3:13 PM Tony Wei <tony19920...@gmail.com> wrote:
>>
>> > Hi all,
>> >
>> > @Ken, thanks for your positive feedback. I have had a similar
>> > experience with the test harness, and that's why I want to add more
>> > content to the testing doc to prevent this kind of problem.
>> >
>> > Does anyone have any feedback or advice? I would like to collect more
>> > opinions from developers and users. Please let me know what you think
>> > about this topic. All suggestions are welcome. Thank you.
>> >
>> > Best,
>> > Tony Wei
>> >
>> > 2018-09-26 2:20 GMT+08:00 Ken Krugler <kkrugler_li...@transpac.com>:
>> >
>> > > Hi Tony,
>> > >
>> > > I think this would be great - we’ve been building out tests using
>> > > AbstractStreamOperator, and the lack of documentation has made it
>> > > challenging.
>> > >
>> > > For example, there was this exchange I had with Piotr about a month
>> > > ago:
>> > >
>> > > > You made a small mistake when restoring from state using the test
>> > > > harness, one that I have also made myself in the past. The problem
>> > > > is the ordering of these calls:
>> > > >
>> > > > result.open();
>> > > > if (savedState != null) {
>> > > >     result.initializeState(savedState);
>> > > > }
>> > > >
>> > > > open() is supposed to be called after initializeState(), and if you
>> > > > look into the code of AbstractStreamOperatorTestHarness#open, if it
>> > > > is called before initializeState(), it will initialize the harness
>> > > > without any state.
>> > > >
>> > > > Unfortunately, this is implicit behaviour that doesn’t throw any
>> > > > error (the test harness is not part of Flink’s public API). I will
>> > > > try to fix this:
>> > > > https://issues.apache.org/jira/browse/FLINK-10159
>> > >
>> > > -- Ken
>> > >
>> > > > On Sep 25, 2018, at 3:30 AM, Tony Wei <tony19920...@gmail.com>
>> > > > wrote:
>> > > >
>> > > > Hi all,
>> > > >
>> > > > More and more users on the user mailing list seem to be asking how
>> > > > to unit test Flink features such as state or timers. The community
>> > > > usually suggests that they use `AbstractStreamOperator` and points
>> > > > to an example in the Flink GitHub repo. I have sorted out some
>> > > > examples and written them down in the testing documentation [1],
>> > > > and I would like to contribute them back to Flink.
>> > > >
>> > > > The reason I am asking on the dev mailing list first is that
>> > > > `AbstractStreamOperator` is an internal API and could change at any
>> > > > time. I'm not sure whether it is worth providing these examples in
>> > > > the testing documentation, so I want to collect some feedback
>> > > > before I open a JIRA ticket.
>> > > >
>> > > > If this is feasible and valuable, I will open the corresponding
>> > > > JIRA ticket, and we can discuss in more detail which examples are
>> > > > good to have in the document and how to structure the content.
>> > > >
>> > > > I would really appreciate any feedback from you. Thanks in advance.
>> > > >
>> > > > Best Regards,
>> > > > Tony Wei
>> > > >
>> > > > [1]
>> > > > https://github.com/apache/flink/compare/master...tony810430:flink-testing-doc
>> > >
>> > > --------------------------
>> > > Ken Krugler
>> > > +1 530-210-6378
>> > > http://www.scaleunlimited.com
>> > > Custom big data solutions & training
>> > > Flink, Solr, Hadoop, Cascading & Cassandra
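
The ordering pitfall from the exchange with Piotr quoted in this thread can be illustrated with a minimal plain-Java sketch. ToyHarness below is a hypothetical stand-in, not Flink's AbstractStreamOperatorTestHarness, and it models only what the thread states: calling open() before initializeState() starts the operator with empty state and raises no error. That a later initializeState() call is then silently ignored is an assumption made for this sketch, not a statement about Flink's actual behaviour.

```java
import java.util.HashMap;
import java.util.Map;

public class HarnessOrderingSketch {

    /** Hypothetical stand-in for a test harness. This is NOT Flink's API. */
    static final class ToyHarness {
        private Map<String, Long> state; // null until the harness is initialized
        private boolean opened;

        /** Restores a snapshot. Assumption: a restore after open() is silently ignored. */
        void initializeState(Map<String, Long> snapshot) {
            if (opened) {
                return;
            }
            state = new HashMap<>(snapshot);
        }

        /** Opens the operator; if nothing was restored, it starts empty without any error. */
        void open() {
            if (state == null) {
                state = new HashMap<>();
            }
            opened = true;
        }

        /** Returns the stored value for the key, or 0 if the key is absent. */
        long valueOf(String key) {
            return state.getOrDefault(key, 0L);
        }
    }

    /** Intended order: restore first, then open. */
    static long restoreThenOpen(Map<String, Long> saved, String key) {
        ToyHarness harness = new ToyHarness();
        if (saved != null) {
            harness.initializeState(saved);
        }
        harness.open();
        return harness.valueOf(key);
    }

    /** Mistaken order from the quoted snippet: open first, then restore. */
    static long openThenRestore(Map<String, Long> saved, String key) {
        ToyHarness harness = new ToyHarness();
        harness.open();
        if (saved != null) {
            harness.initializeState(saved);
        }
        return harness.valueOf(key);
    }

    public static void main(String[] args) {
        Map<String, Long> saved = new HashMap<>();
        saved.put("count", 5L);
        System.out.println("restore then open: " + restoreThenOpen(saved, "count")); // prints 5
        System.out.println("open then restore: " + openThenRestore(saved, "count")); // prints 0
    }
}
```

In this toy model the mistaken order loses the restored value without any exception, which is the kind of silent failure a documented example or a fail-fast check in the harness would help users avoid.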