Re: autoCommit for postgres jdbc streaming in Table/SQL API

2020-10-08 Thread Till Rohrmann
This sounds good. Maybe there are others in the community who can help with the review before the Jark and Leonard are back. Cheers, Till On Wed, Oct 7, 2020 at 7:33 PM Dylan Forciea wrote: > Actually…. It looks like what I did covers both cases. I’ll see about > getting some unit tests and doc

Re: The file STDOUT does not exist on the TaskExecutor

2020-10-08 Thread Till Rohrmann
The easiest way to suppress this error would be to disable the logging for TaskManagerStdoutFileHandler in your log4j.properties file. Cheers, Till On Wed, Oct 7, 2020 at 8:48 PM sidhant gupta wrote: > Hi Till, > > I understand the errors which appears in my logs are not stopping me from > runn

Re: Network issue leading to "No pooled slot available"

2020-10-08 Thread Khachatryan Roman
Hi Dan Diephouse, >From the logs you provided indeed it looks like 1 causes 2 => 3 => 4, where 2 is a bug. It's unclear though where the interruption is ignored (Flink/Hadoop FS/S3 client). What version of Flink are you using? Regards, Roman On Wed, Oct 7, 2020 at 11:16 PM Dan Diephouse wrote

Re: flink configuration: best practice for checkpoint storage secrets

2020-10-08 Thread XU Qinghui
Hello Till Thanks a lot for the reply. But it turns out the IAM is applicable only when the job is running inside AWS, which is not my case (basically we are just using the S3 API provided by other services). By reading again the flink doc, it seems it's suggesting to use the flink-conf.yaml file,

state access causing segmentation fault

2020-10-08 Thread Colletta, Edward
Using Flink 1.9.2, Java, FsStateBackend. Running Session cluster on EC2 instances. I have a KeyedProcessFunction that is causing a segmentation fault, crashing the flink task manager. The seems to be caused by using 3 State variables in the operator. The crash happens consistently after some

Native State in Python Stateful Functions?

2020-10-08 Thread Clements, Danial C
Hi, In doing some testing with Flink stateful functions in Python and I’ve gotten a small POC working. One of our key requirements for our stream processors is that they be written in python due to the skillset of our team. Given that the Python DataStreams api seems to be under development i

Re: state access causing segmentation fault

2020-10-08 Thread Dawid Wysakowicz
Hi, It should be absolutely fine to use multiple state objects. I am not aware of any limits to that. A minimal, reproducible example would definitely be helpful. For those kind of exceptions, I'd look into the serializers you use. Other than that I cannot think of an obvious reason for that kind

Re: 订阅

2020-10-08 Thread tison
Please send email with any content to -subscr...@flink.apache.org for subscription. For example, mailto:user-zh-subscr...@flink.apache.org to subscribe user...@flink.apache.org Best, tison. 葛春法-18667112979 于2020年10月8日周四 下午8:45写道: > I want to subscribe flink mail.

How can I increase the parallelism on the Table API for Streaming Aggregation?

2020-10-08 Thread Felipe Gutierrez
Hi community, I was implementing the stream aggregation using Table API [1] and trying out the local aggregation plan to optimize the query. Basically I had to configure it like this: Configuration configuration = tableEnv.getConfig().getConfiguration(); // set low-level key-value options configu

Re: Network issue leading to "No pooled slot available"

2020-10-08 Thread Dan Diephouse
Using the latest - 1.11.2. I would assume the interruption is being ignored in the Hadoop / S3 layer. I was looking at the defaults and (if I understood correctly) the client will retry 20 times. Which would explain why it never gets cancelled... On Thu, Oct 8, 2020 at 1:27 AM Khachatryan Roman <

Re: Streaming SQL Job Switches to FINISHED before all records processed

2020-10-08 Thread Austin Cawley-Edwards
Hey Timo, Sorry for the delayed reply. I'm using the Blink planner and using non-time-based joins. I've got an example repo here that shows my query/ setup [1]. It's got the manual timestamp assignment commented out for now, but that does indeed solve the issue. I'd really like to not worry about

Re: Issue with Flink - unrelated executeSql causes other executeSqls to fail.

2020-10-08 Thread Austin Cawley-Edwards
Can't comment on the SQL issues, but here's our exact setup for Bazel and Junit5 w/ the resource files approach: https://github.com/fintechstudios/vp-operator-demo-ff-virtual-2020/tree/master/tools/junit Best, Austin On Thu, Oct 8, 2020 at 2:41 AM Dan Hill wrote: > I was able to get finer grain

Re: How can I increase the parallelism on the Table API for Streaming Aggregation?

2020-10-08 Thread Khachatryan Roman
Hi Felipe, Your source is not parallel so it doesn't make sense to make local group operator parallel. If the source implemented ParallelSourceFunction, subsequent operators would be parallelized too. Regards, Roman On Thu, Oct 8, 2020 at 5:00 PM Felipe Gutierrez < felipe.o.gutier...@gmail.com>

Re: autoCommit for postgres jdbc streaming in Table/SQL API

2020-10-08 Thread Dylan Forciea
I’ve updated the unit tests and documentation, and I was running the azure test pipeline as described in the instructions. However, it appears that what seems to be an unrelated test for the JMX code failed. Is this a matter of me not setting things up correctly? I wanted to ensure everything lo

Re: Issue with Flink - unrelated executeSql causes other executeSqls to fail.

2020-10-08 Thread Dan Hill
I figured out the issue. The join caused part of the job's execution to be delayed. I added my own hacky wait condition into the test to make sure the join job finishes first and it's fine. What common test utilities exist for Flink? I found flink/flink-test-utils-parent. I implemented a simpl

Re: Network issue leading to "No pooled slot available"

2020-10-08 Thread Dan Diephouse
Did some digging... definitely appears that the Amazon SDK definitely is not picking up the interrupt. I will try playing with the connection timeout. Hadoop defaults it to 20 ms, which may be part of the problem. Anyone have any other ideas? In theory this should be fixed by SDK v2 which use

Re: Network issue leading to "No pooled slot available"

2020-10-08 Thread Dan Diephouse
Well, I just dropped in the latest Amazon 1.11.878 SDK and now it appears to respect interrupts in a test case I created. (the test fails with the SDK that is in use by Flink) I will try it in a full fledged Flink environment and report back. On Thu, Oct 8, 2020 at 3:41 PM Dan Diephouse wrote:

Re: The file STDOUT does not exist on the TaskExecutor

2020-10-08 Thread Yang Wang
I second till's suggestion. Currently in container environment(docker/K8s), we could not output the STDOUT/STDERR to console and log files(taskmanager.out/err) simultaneously. In consideration of the user experience, we are using "conf/log4j-console.properties" to only output the STDOUT/STDERR to c

Re: autoCommit for postgres jdbc streaming in Table/SQL API

2020-10-08 Thread Jark Wu
Hi Dylan, Sorry for the late reply. We've just come back from a long holiday. Thanks for reporting this problem. First, I think this is a bug that `autoCommit` is false by default (JdbcRowDataInputFormat.Builder). We can fix the default to true in 1.11 series, and I think this can solve your prob

Re: The file STDOUT does not exist on the TaskExecutor

2020-10-08 Thread sidhant gupta
Thanks Yang for providing another alternative solution. On Fri, Oct 9, 2020, 7:49 AM Yang Wang wrote: > I second till's suggestion. Currently in container > environment(docker/K8s), we could not output the > STDOUT/STDERR to console and log > files(taskmanager.out/err) simultaneously. In conside

Best way to test Table API and SQL

2020-10-08 Thread Rex Fenley
Hello I'd like to write a unit test for my Flink Job. It consists mostly of the Table API and SQL using a StreamExecutionEnvironment with the blink planner, from source to sink. What's the best approach for testing Table API/SQL? I read https://flink.apache.org/news/2020/02/07/a-guide-for-unit-t

Re: Native State in Python Stateful Functions?

2020-10-08 Thread Tzu-Li (Gordon) Tai
Hi, Nice to hear that you are trying out StateFun! It is by design that function state is attached to each HTTP invocation request, as defined by StateFun's remote invocation request-reply protocol. This decision was made with typical application cloud-native architectures in mind - having functi

Any testing issues when using StreamTableEnvironment.createTemporaryView?

2020-10-08 Thread Dan Hill
*Summary* I'm hitting an error when running a test that is related to using createTemporaryView to convert a Protobuf input stream to Flink Table API. I'm not sure how to debug "SourceConversion$5.processElement(Unknown Source)" line. Is this generated code? How can I debug this? Any help would

Re: Network issue leading to "No pooled slot available"

2020-10-08 Thread Khachatryan Roman
Thanks for checking this workaround! I've created a jira issue [1] to check if AWS SDK version can be upgraded in Flink distribution. Regards, Roman On Fri, Oct 9, 2020 at 12:54 AM Dan Diephouse wrote: > Well, I just dropped in the latest Amazon 1.11.878 SDK and now it > appears to respect in

[DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2020-10-08 Thread Yun Gao
Hi, devs & users As discussed in FLIP-131 [1], Flink will make DataStream the unified API for processing bounded and unbounded data in both streaming and blocking modes. However, one long-standing problem for the streaming mode is that currently Flink does not support checkpoints after some tas

Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2020-10-08 Thread Yun Gao
Hi, devs & users Very sorry for the spoiled formats, I resent the discussion as follows. As discussed in FLIP-131[1], Flink will make DataStream the unified API for processing bounded and unbounded data in both streaming and blocking modes. However, one long-standing problem for the streaming m

Re: NoResourceAvailableException

2020-10-08 Thread Khachatryan Roman
I assume that before submitting a job you started a cluster with default settings with ./bin/start-cluster.sh. Did you submit any other jobs? Can you share the logs from log folder? Regards, Roman On Wed, Oct 7, 2020 at 11:03 PM Alexander Semeshchenko wrote: > >

flink session job retention time

2020-10-08 Thread Richard Moorhead
Is there a configuration that controls how long jobs are retained in a flink session?