Thanks for bringing this to the ML Brian +1 For full TestPipelineOptions deprecation. Even worth to remove it, bad part is that this class resides in 'sdks/core/main/java' and not in testing as I imagined so this could count as a 'breaking' change.
On Thu, Nov 7, 2019 at 8:27 PM Luke Cwik <[email protected]> wrote: > > There was issue with asynchrony of p.run(), some runners blocked till the > pipeline was complete with p.run() which was never meant to be the intent. > > The test timeout one makes sense to be able to configure it per runner (since > Dataflow takes a lot longer than other runners) but we may be able to > configure a Junit test timeout attribute instead. > > I would be for getting rid of them. > > > On Wed, Nov 6, 2019 at 3:36 PM Robert Bradshaw <[email protected]> wrote: >> >> +1 to all of these are probably obsolete at this point and would be >> nice to remove. >> >> >> On Wed, Nov 6, 2019 at 3:00 PM Kenneth Knowles <[email protected]> wrote: >> > >> > Good find. I think TestPipelineOptions is from very early days. It makes >> > sense to me that these are all obsolete. Some guesses, though I haven't >> > dug through commit history to confirm: >> > >> > - TempRoot: a while ago TempLocation was optional, so I think this would >> > provide a default for things like gcpTempLocation and stagingLocation >> > - OnSuccessMatcher: for runners where pipeline used to not terminate in >> > streaming mode. Now I think every runner can successfully waitUntilFinish. >> > Also the current API for waitUntilFinish went through some evolutions >> > around asynchrony so it wasn't always a good choice. >> > - OnCreateMatcher: just for symmetry? I don't know >> > - TestTimeoutSeconds: probably also for the asychrony/waitUntilfinish >> > issue >> > >> > Kenn >> > >> > On Wed, Nov 6, 2019 at 12:19 PM Brian Hulette <[email protected]> wrote: >> >> >> >> I recently came across TestPipelineOptions, and now I'm wondering if >> >> maybe it should be deprecated. It only seems to actually be supported for >> >> Spark and Dataflow (via TestSparkRunner and TestDataflowRunner), and I >> >> think it may make more sense to move the functionality it provides into >> >> the tests that need it. >> >> >> >> TestPipelineOptions currently has four attributes: >> >> >> >> # TempRoot >> >> It's purpose isn't documented, but many tests read TempRoot and use it to >> >> set a TempLocation (example). I think this attribute makes sense (e.g. we >> >> can set TempRoot once and each test has its own subdirectory), but I'm >> >> not sure. Can anyone confirm the motivation for it? I'd like to at least >> >> add a docstring for it. >> >> >> >> # OnCreateMatcher >> >> A way to register a matcher that will be checked right after a pipeline >> >> has started. It's never set except for in TestDataflowRunnerTest, so I >> >> think this is absolutely safe to remove. >> >> >> >> # OnSuccessMatcher >> >> A way to register a matcher that will be checked right after a pipeline >> >> has successfully completed. This is used in several tests >> >> (RequiresStableInputIT, WordCountIT, ... 8 total occurrences), but I >> >> don't see why they couldn't all be replaced with a >> >> `p.run().waitUntilFinish()`, followed by an assert. >> >> >> >> I think the current approach is actually dangerous, because running these >> >> tests with runners other than TestDataflowRunner or TestSparkRunner means >> >> the matchers are never actually checked. This is actually how I came >> >> across TestPipelineOptions - I tried running a test with the DirectRunner >> >> and couldn't make it fail. >> >> >> >> # TestTimeoutSeconds >> >> Seems to just be a wrapper for `waitUntilFinish(duration)`, and only used >> >> in one place. I think it would be cleaner for the test to be responsible >> >> for calling waitUntilFinish (which we do elsewhere), the only drawback is >> >> it requires a small refactor so the test has access to the PipelineResult >> >> object. >> >> >> >> >> >> So I have a couple of questions for the community >> >> 1) Are there thoughts on TempRoot? Can we get rid of it? >> >> 2) Are there any objections to removing the other three attributes? Am I >> >> missing something? Unless there are any objections I think I'll write a >> >> patch to remove them. >> >> >> >> Thanks, >> >> Brian
