Re: Issue with Flink - unrelated executeSql causes other executeSqls to fail.

Aljoscha Krettek Tue, 06 Oct 2020 15:59:12 -0700

Hi Dan,

to make the log properties file work, this should do it, assuming log4j.properties is in //src/main/resources: you will need a BUILD.bazel in that directory that has only the line "exports_files(["log4j.properties"])". Then you can reference it in your test via "resources = ["//src/main/resources:log4j.properties"],". Of course you also need to have the right log4j deps (or slf4j if you're using that).
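A minimal sketch of that layout as two Bazel BUILD files. It assumes that the java_junit5_test macro from Dan's message forwards extra attributes such as `resources` to the underlying java_test rule, which this thread does not confirm. The `deps` list is a placeholder, and the `load(...)` statement for the macro is omitted because its path depends on where the macro was copied.

```starlark
# File: src/main/resources/BUILD.bazel
# Makes the properties file referenceable from other packages.
exports_files(["log4j.properties"])

# File: BUILD.bazel next to the test sources
# (load(...) for java_junit5_test omitted; path is repository-specific)
java_junit5_test(
    name = "tests",
    srcs = glob(["*.java"]),
    test_package = "ai.promoted.logprocessor.batch",
    # Puts log4j.properties on the test classpath instead of using an absolute path.
    resources = ["//src/main/resources:log4j.properties"],
    deps = [],  # placeholder: Flink, JUnit 5 and logging dependencies go here
)
```

With the file packaged as a resource, the absolute-path jvm_flags entry should no longer be needed, provided the logging framework in use looks for that file name on the classpath.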


Hope that helps!

Aljoscha

On 07.10.20 00:41, Dan Hill wrote:

I'm trying to use Table API for my job.  I'll soon try to get a test
working for my stream job.
- I'll parameterize so I can have different sources and sinks for tests.
How should I mock out a Kafka source?  For my test, I was planning on
changing the input to be from a temp file (instead of Kafka).
- What's a good way of forcing a watermark using the Table API?


On Tue, Oct 6, 2020 at 3:35 PM Dan Hill <quietgol...@gmail.com> wrote:

Thanks!

Great to know.  I copied this junit5-jupiter-starter-bazel rule
<https://github.com/junit-team/junit5-samples/tree/main/junit5-jupiter-starter-bazel>
into my repository (I don't think junit5 is supported directly with
java_test yet).  I tried a few ways of bundling `log4j.properties` into the
jar and didn't get them to work.  My current iteration hacks the
log4j.properties file as an absolute path.  My failed attempts would spit
an error saying log4j.properties file was not found.  This route finds it
but the log properties are not used for the java logger.

Is there a better set of rules to use for junit5?

# build rule
java_junit5_test(
     name = "tests",
     srcs = glob(["*.java"]),
     test_package = "ai.promoted.logprocessor.batch",
     deps = [...],
     jvm_flags = [
         "-Dlog4j.configuration=file:///Users/danhill/code/src/ai/promoted/logprocessor/batch/log4j.properties",
     ],
)

# log4j.properties
status = error
name = Log4j2PropertiesConfig
appenders = console
appender.console.type = Console
appender.console.name = LogToConsole
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d [%t] %-5p %c - %m%n
rootLogger.level = info
rootLogger.appenderRefs = stdout
rootLogger.appenderRef.stdout.ref = LogToConsole
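
A possible explanation, not confirmed in this thread: the keys above (`status`, `appenders`, `rootLogger.appenderRef.stdout.ref`) follow the Log4j 2 properties syntax, while `log4j.configuration` is the system property read by Log4j 1.x. Log4j 2 reads `log4j.configurationFile` instead, and by default searches the classpath for `log4j2.properties` (or `log4j2-test.properties`). Assuming the test classpath really does use Log4j 2, the flag could be sketched as:

```starlark
# Fragment of the java_junit5_test(...) call above; only the property name changes.
jvm_flags = [
    # Log4j 2 property name; "log4j.configuration" is the Log4j 1.x equivalent.
    "-Dlog4j.configurationFile=file:///Users/danhill/code/src/ai/promoted/logprocessor/batch/log4j.properties",
],
```

If the classpath uses Log4j 1.x instead, the original flag name is the right one and the properties file itself would need the 1.x syntax.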

On Tue, Oct 6, 2020 at 3:34 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

Oops, this is actually the JOIN issue thread [1]. Guess I should revise
my previous "haven't had issues" statement hah. Sorry for the spam!

[1]:
apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Streaming-SQL-Job-Switches-to-FINISHED-before-all-records-processed-td38382.html

On Tue, Oct 6, 2020 at 6:32 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

Unless it's related to this issue[1], which was w/ my JOIN and time
characteristics, though not sure that applies for batch.

Best,
Austin

[1]:
apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-SQL-Streaming-Join-Creates-Duplicates-td37764.html


On Tue, Oct 6, 2020 at 6:20 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

Hey Dan,

We use Junit5 and Bazel to run Flink SQL tests on a mini cluster and
haven’t had issues, though we’re only testing on streaming jobs.

Happy to help setting up logging with that if you’d like.

Best,
Austin

On Tue, Oct 6, 2020 at 6:02 PM Dan Hill <quietgol...@gmail.com> wrote:

I don't think any of the gotchas apply to me (at the bottom of this
link).

https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/testing.html#junit-rule-miniclusterwithclientresource

I'm assuming for a batch job that I don't have to do anything for:
"You can implement a custom parallel source function for emitting
watermarks if your job uses event time timers."

On Tue, Oct 6, 2020 at 2:42 PM Dan Hill <quietgol...@gmail.com> wrote:

I've tried to enable additional logging for a few hours today.  I
think something with junit5 is swallowing the logs.  I'm using Bazel and
junit5.  I set up MiniClusterResourceConfiguration using a custom
extension.  Are there any known issues with Flink and junit5?  I can try
switching to junit4.

When I've binary searched this issue, this failure happens if my
query in step 3 has a join in it.  If I remove the join, I can remove step 4
and the code still works.  I've renamed a bunch of my tables too and the
problem still exists.





On Tue, Oct 6, 2020, 00:42 Aljoscha Krettek <aljos...@apache.org>
wrote:

Hi Dan,

there were some bugs and quirks in the MiniCluster that we recently
fixed:

   - https://issues.apache.org/jira/browse/FLINK-19123
   - https://issues.apache.org/jira/browse/FLINK-19264

But I think they are probably unrelated to your case. Could you enable
logging and see from the logs whether the jobs from steps 2) and 3)
execute correctly on the MiniCluster?

Best,
Aljoscha

On 06.10.20 08:08, Dan Hill wrote:

I'm writing a test for a batch job using MiniClusterResourceConfiguration.


Here's a simple description of my working test case:
1) I use TableEnvironment.executeSql(...) to create a source and sink
   table using a tmp filesystem directory.
2) I use executeSql to insert some test data into the source table.
3) I use executeSql to select from source and insert into sink.
4) I use executeSql from the same source to a different sink.

When I do these steps, it works.  If I remove step 4, no data gets written
to the sink.  My actual code is more complex than this (has create view,
join and more tables).  This is a simplified description but highlights the
weird error.

Has anyone hit issues like this?  I'm assuming I have a small code bug in
my queries that's causing issues.  These queries appear to work in
production so I'm confused.  Are there ways of viewing failed jobs or
queries with MiniClusterResourceConfiguration?

Thanks!
- Dan
