I figured out the issue.  The join caused part of the job's execution to be
delayed.  I added my own hacky wait condition into the test to make sure
the join job finishes first and it's fine.

What common test utilities exist for Flink?  I found
flink/flink-test-utils-parent.  I implemented a simple sleep loop to wait
for jobs to finish.  I'm guessing this can be done with one of the other
utilities.

Are there any open source test examples?

How are watermarks usually sent with Table API in tests?

After I collect some answers, I'm fine updating the Flink testing page.
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/testing.html#testing-flink-jobs

On Thu, Oct 8, 2020 at 8:52 AM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

> Can't comment on the SQL issues, but here's our exact setup for Bazel and
> Junit5 w/ the resource files approach:
> https://github.com/fintechstudios/vp-operator-demo-ff-virtual-2020/tree/master/tools/junit
>
> Best,
> Austin
>
> On Thu, Oct 8, 2020 at 2:41 AM Dan Hill <quietgol...@gmail.com> wrote:
>
>> I was able to get finer grained logs showing.  I switched from
>> -Dlog4j.configuration to -Dlog4j.configurationFile and it worked.  With my
>> larger test case, I was hitting a silent log4j error.  When I created a
>> small test case to just test logging, I received a log4j error.
>>
>> Here is a tar
>> <https://drive.google.com/file/d/1b6vJR_hfaRZwA28jKNlUBxDso7YiTIbk/view?usp=sharing>
>> with the info logs for:
>> - (test-nojoin.log) this one works as expected
>> - (test-join.log) this does not work as expected
>>
>> I don't see an obvious issue just by scanning the logs.  I'll take a
>> deeper in 9 hours.
>>
>>
>>
>>
>> On Wed, Oct 7, 2020 at 8:28 PM Dan Hill <quietgol...@gmail.com> wrote:
>>
>>> Switching to junit4 did not help.
>>>
>>> If I make a request to the url returned from
>>> MiniClusterWithClientResource.flinkCluster.getClusterClient().getWebInterfaceURL(),
>>> I get
>>> {"errors":["Not found."]}.  I'm not sure if this is intentional.
>>>
>>>
>>>
>>>
>>> On Tue, Oct 6, 2020 at 4:16 PM Dan Hill <quietgol...@gmail.com> wrote:
>>>
>>>> @Aljoscha - Thanks!  That setup lets fixing the hacky absolute path
>>>> reference.  However, the actual log calls are not printing to the console.
>>>> Only errors appear in my terminal window and the test logs.  Maybe console
>>>> logger does not work for this junit setup.  I'll see if the file version
>>>> works.
>>>>
>>>> On Tue, Oct 6, 2020 at 4:08 PM Austin Cawley-Edwards <
>>>> austin.caw...@gmail.com> wrote:
>>>>
>>>>> What Aljoscha suggested is what works for us!
>>>>>
>>>>> On Tue, Oct 6, 2020 at 6:58 PM Aljoscha Krettek <aljos...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Hi Dan,
>>>>>>
>>>>>> to make the log properties file work this should do it: assuming the
>>>>>> log4j.properties is in //src/main/resources. You will need a
>>>>>> BUILD.bazel
>>>>>> in that directory that has only the line
>>>>>> "exports_files(["log4j.properties"]). Then you can reference it in
>>>>>> your
>>>>>> test via "resources = ["//src/main/resources:log4j.properties"],". Of
>>>>>> course you also need to have the right log4j deps (or slf4j if you're
>>>>>> using that)
>>>>>>
>>>>>> Hope that helps!
>>>>>>
>>>>>> Aljoscha
>>>>>>
>>>>>> On 07.10.20 00:41, Dan Hill wrote:
>>>>>> > I'm trying to use Table API for my job.  I'll soon try to get a test
>>>>>> > working for my stream job.
>>>>>> > - I'll parameterize so I can have different sources and sink for
>>>>>> tests.
>>>>>> > How should I mock out a Kafka source?  For my test, I was planning
>>>>>> on
>>>>>> > changing the input to be from a temp file (instead of Kafka).
>>>>>> > - What's a good way of forcing a watermark using the Table API?
>>>>>> >
>>>>>> >
>>>>>> > On Tue, Oct 6, 2020 at 3:35 PM Dan Hill <quietgol...@gmail.com>
>>>>>> wrote:
>>>>>> >
>>>>>> >> Thanks!
>>>>>> >>
>>>>>> >> Great to know.  I copied this junit5-jupiter-starter-bazel
>>>>>> >> <
>>>>>> https://github.com/junit-team/junit5-samples/tree/main/junit5-jupiter-starter-bazel>
>>>>>> rule
>>>>>> >> into my repository (I don't think junit5 is supported directly with
>>>>>> >> java_test yet).  I tried a few ways of bundling `log4j.properties`
>>>>>> into the
>>>>>> >> jar and didn't get them to work.  My current iteration hacks the
>>>>>> >> log4j.properties file as an absolute path.  My failed attempts
>>>>>> would spit
>>>>>> >> an error saying log4j.properties file was not found.  This route
>>>>>> finds it
>>>>>> >> but the log properties are not used for the java logger.
>>>>>> >>
>>>>>> >> Are there a better set of rules to use for junit5?
>>>>>> >>
>>>>>> >> # build rule
>>>>>> >> java_junit5_test(
>>>>>> >>      name = "tests",
>>>>>> >>      srcs = glob(["*.java"]),
>>>>>> >>      test_package = "ai.promoted.logprocessor.batch",
>>>>>> >>      deps = [...],
>>>>>> >>      jvm_flags =
>>>>>> >>
>>>>>> ["-Dlog4j.configuration=file:///Users/danhill/code/src/ai/promoted/logprocessor/batch/log4j.properties"],
>>>>>> >> )
>>>>>> >>
>>>>>> >> # log4j.properties
>>>>>> >> status = error
>>>>>> >> name = Log4j2PropertiesConfig
>>>>>> >> appenders = console
>>>>>> >> appender.console.type = Console
>>>>>> >> appender.console.name = LogToConsole
>>>>>> >> appender.console.layout.type = PatternLayout
>>>>>> >> appender.console.layout.pattern = %d [%t] %-5p %c - %m%n
>>>>>> >> rootLogger.level = info
>>>>>> >> rootLogger.appenderRefs = stdout
>>>>>> >> rootLogger.appenderRef.stdout.ref = LogToConsole
>>>>>> >>
>>>>>> >> On Tue, Oct 6, 2020 at 3:34 PM Austin Cawley-Edwards <
>>>>>> >> austin.caw...@gmail.com> wrote:
>>>>>> >>
>>>>>> >>> Oops, this is actually the JOIN issue thread [1]. Guess I should
>>>>>> revise
>>>>>> >>> my previous "haven't had issues" statement hah. Sorry for the
>>>>>> spam!
>>>>>> >>>
>>>>>> >>> [1]:
>>>>>> >>>
>>>>>> apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Streaming-SQL-Job-Switches-to-FINISHED-before-all-records-processed-td38382.html
>>>>>> >>>
>>>>>> >>> On Tue, Oct 6, 2020 at 6:32 PM Austin Cawley-Edwards <
>>>>>> >>> austin.caw...@gmail.com> wrote:
>>>>>> >>>
>>>>>> >>>> Unless it's related to this issue[1], which was w/ my JOIN and
>>>>>> time
>>>>>> >>>> characteristics, though not sure that applies for batch.
>>>>>> >>>>
>>>>>> >>>> Best,
>>>>>> >>>> Austin
>>>>>> >>>>
>>>>>> >>>> [1]:
>>>>>> >>>>
>>>>>> apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-SQL-Streaming-Join-Creates-Duplicates-td37764.html
>>>>>> >>>>
>>>>>> >>>>
>>>>>> >>>> On Tue, Oct 6, 2020 at 6:20 PM Austin Cawley-Edwards <
>>>>>> >>>> austin.caw...@gmail.com> wrote:
>>>>>> >>>>
>>>>>> >>>>> Hey Dan,
>>>>>> >>>>>
>>>>>> >>>>> We use Junit5 and Bazel to run Flink SQL tests on a mini
>>>>>> cluster and
>>>>>> >>>>> haven’t had issues, though we’re only testing on streaming jobs.
>>>>>> >>>>>
>>>>>> >>>>> Happy to help setting up logging with that if you’d like.
>>>>>> >>>>>
>>>>>> >>>>> Best,
>>>>>> >>>>> Austin
>>>>>> >>>>>
>>>>>> >>>>> On Tue, Oct 6, 2020 at 6:02 PM Dan Hill <quietgol...@gmail.com>
>>>>>> wrote:
>>>>>> >>>>>
>>>>>> >>>>>> I don't think any of the gotchas apply to me (at the bottom of
>>>>>> this
>>>>>> >>>>>> link).
>>>>>> >>>>>>
>>>>>> >>>>>>
>>>>>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/testing.html#junit-rule-miniclusterwithclientresource
>>>>>> >>>>>>
>>>>>> >>>>>> I'm assuming for a batch job that I don't have to do anything
>>>>>> for:
>>>>>> >>>>>> "You can implement a custom parallel source function for
>>>>>> emitting
>>>>>> >>>>>> watermarks if your job uses event time timers."
>>>>>> >>>>>>
>>>>>> >>>>>> On Tue, Oct 6, 2020 at 2:42 PM Dan Hill <quietgol...@gmail.com>
>>>>>> wrote:
>>>>>> >>>>>>
>>>>>> >>>>>>> I've tried to enable additional logging for a few hours
>>>>>> today.  I
>>>>>> >>>>>>> think something with junit5 is swallowing the logs.  I'm
>>>>>> using Bazel and
>>>>>> >>>>>>> junit5.  I setup MiniClusterResourceConfiguration using a
>>>>>> custom
>>>>>> >>>>>>> extension.  Are there any known issues with Flink and
>>>>>> junit5?  I can try
>>>>>> >>>>>>> switching to junit4.
>>>>>> >>>>>>>
>>>>>> >>>>>>> When I've binary searched this issue, this failure happens if
>>>>>> my
>>>>>> >>>>>>> query in step 3 has a join it.  If I remove the join, I can
>>>>>> remove step 4
>>>>>> >>>>>>> and the code still works.  I've renamed a bunch of my tables
>>>>>> too and the
>>>>>> >>>>>>> problem still exists.
>>>>>> >>>>>>>
>>>>>> >>>>>>>
>>>>>> >>>>>>>
>>>>>> >>>>>>>
>>>>>> >>>>>>>
>>>>>> >>>>>>> On Tue, Oct 6, 2020, 00:42 Aljoscha Krettek <
>>>>>> aljos...@apache.org>
>>>>>> >>>>>>> wrote:
>>>>>> >>>>>>>
>>>>>> >>>>>>>> Hi Dan,
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> there were some bugs and quirks in the MiniCluster that we
>>>>>> recently
>>>>>> >>>>>>>> fixed:
>>>>>> >>>>>>>>
>>>>>> >>>>>>>>    - https://issues.apache.org/jira/browse/FLINK-19123
>>>>>> >>>>>>>>    - https://issues.apache.org/jira/browse/FLINK-19264
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> But I think they are probably unrelated to your case. Could
>>>>>> you
>>>>>> >>>>>>>> enable
>>>>>> >>>>>>>> logging and see from the logs whether the 2) and 3) jobs
>>>>>> execute
>>>>>> >>>>>>>> correctly on the MiniCluster?
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Best,
>>>>>> >>>>>>>> Aljoscha
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> On 06.10.20 08:08, Dan Hill wrote:
>>>>>> >>>>>>>>> I'm writing a test for a batch job using
>>>>>> >>>>>>>> MiniClusterResourceConfiguration.
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>> Here's a simple description of my working test case:
>>>>>> >>>>>>>>> 1) I use TableEnvironment.executeSql(...) to create a
>>>>>> source and
>>>>>> >>>>>>>> sink table
>>>>>> >>>>>>>>> using tmp filesystem directory.
>>>>>> >>>>>>>>> 2) I use executeSql to insert some test data into the
>>>>>> source tabel.
>>>>>> >>>>>>>>> 3) I use executeSql to select from source and insert into
>>>>>> sink.
>>>>>> >>>>>>>>> 4) I use executeSql from the same source to a different
>>>>>> sink.
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>> When I do these steps, it works.  If I remove step 4, no
>>>>>> data gets
>>>>>> >>>>>>>> written
>>>>>> >>>>>>>>> to the sink.  My actual code is more complex than this (has
>>>>>> create
>>>>>> >>>>>>>> view,
>>>>>> >>>>>>>>> join and more tables).  This is a simplified description but
>>>>>> >>>>>>>> highlights the
>>>>>> >>>>>>>>> weird error.
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>> Has anyone hit issues like this?  I'm assuming I have a
>>>>>> small code
>>>>>> >>>>>>>> bug in
>>>>>> >>>>>>>>> my queries that's causing issues.  These queries appear to
>>>>>> work in
>>>>>> >>>>>>>>> production so I'm confused.  Are there ways of viewing
>>>>>> failed jobs
>>>>>> >>>>>>>> or
>>>>>> >>>>>>>>> queries with MiniClusterResourceConfiguration?
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>> Thanks!
>>>>>> >>>>>>>>> - Dan
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>
>>>>>> >>>>>>>>
>>>>>> >
>>>>>>
>>>>>>

Reply via email to