Flink 1.12.2 sql api use parquet format error

2021-04-09 Thread ??????
After change pom.xml, new error: org.apache.flink.runtime.rest.handler.RestHandlerException: Could not execute application. at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleRequest$1(JarRunHandler.java:108) at java.util.concurrent.CompletableFuture.uniHandle(Completa

Re: Flink Metrics emitted from a Kubernetes Application Cluster

2021-04-09 Thread Chesnay Schepler
This is currently not possible. See also FLINK-8358 On 4/9/2021 4:47 AM, Claude M wrote: Hello, I've setup Flink as an Application Cluster in Kubernetes. Now I'm looking into monitoring the Flink cluster in Datadog. This is what is configured in the flink-conf.yaml to emit metrics: metrics.
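For context, the kind of setup being discussed looks roughly like this in flink-conf.yaml (a minimal sketch: the reporter name `dghttp` and the tag are arbitrary, the API key is a placeholder, and the flink-metrics-datadog jar must be available to the cluster):

```yaml
# flink-conf.yaml — illustrative Datadog reporter configuration
metrics.reporter.dghttp.class: org.apache.flink.metrics.datadog.DatadogHttpReporter
metrics.reporter.dghttp.apikey: <YOUR_DATADOG_API_KEY>
metrics.reporter.dghttp.tags: env:staging
```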

Re: Flink 1.13 and CSV (batch) writing

2021-04-09 Thread Flavio Pompermaier
Thanks Kurt, now it works. However I can't find a way to skip the CSV header.. before there was "format.ignore-first-line" but now I can't find another way to skip it. I could set csv.ignore-parse-errors to true but then I can't detect other parsing errors, otherwise I need to manually transform th

Re: Why does flink-quickstart-scala suggest adding connector dependencies in the default scope, while Flink Hive integration docs suggest the opposite

2021-04-09 Thread Till Rohrmann
Hi Yik San, (1) You could do the same with Kafka. For Hive I believe that the dependency is simply quite large so that it hurts more if you bundle it with your user code. (2) If you change the content in the lib directory, then you have to restart the cluster. Cheers, Till On Fri, Apr 9, 2021 a

Re: Flink 1.13 and CSV (batch) writing

2021-04-09 Thread Flavio Pompermaier
And another thing: in my csv I added ',bye' (to test null Long values) but I get a parse error..if I add 'csv.null-literal' = '' it seems to work..is that the right way to solve this problem? On Fri, Apr 9, 2021 at 10:13 AM Flavio Pompermaier wrote: > Thanks Kurt, now it works. However I can't
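The options discussed in this thread can be sketched in a filesystem-connector DDL (table name, columns, and path are hypothetical; note that the new csv format has no header-skip option):

```sql
-- Illustrative DDL showing the CSV options from this thread
CREATE TABLE my_csv_source (
  name  STRING,
  score BIGINT
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/input.csv',
  'format' = 'csv',
  'csv.ignore-parse-errors' = 'true',  -- skips malformed rows, but masks real errors too
  'csv.null-literal' = ''              -- treat the empty string as SQL NULL
);
```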

Re: Why does flink-quickstart-scala suggest adding connector dependencies in the default scope, while Flink Hive integration docs suggest the opposite

2021-04-09 Thread Yik San Chan
Thank you Till! On Fri, Apr 9, 2021 at 4:25 PM Till Rohrmann wrote: > Hi Yik San, > > (1) You could do the same with Kafka. For Hive I believe that the > dependency is simply quite large so that it hurts more if you bundle it > with your user code. > > (2) If you change the content in the lib di

Re: Flink does not cleanup some disk memory after submitting jar over rest

2021-04-09 Thread Till Rohrmann
Hi, What you could also do is to create several heap dumps [1] whenever you submit a new job. This could allow us to analyze whether there is something increasing the heap memory consumption. Additionally, you could try to upgrade your cluster to Flink 1.12.2 since we fixed some problems Maciek me
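Taking heap dumps as suggested can be done with the standard JDK tooling, roughly like this (process name and output path are illustrative):

```shell
# Find the TaskManager JVM pid, then capture a heap dump of live objects
jps | grep TaskManager
jmap -dump:live,format=b,file=/tmp/tm-heap-$(date +%s).hprof <pid>
```

The resulting .hprof file can then be inspected with a tool such as Eclipse MAT or VisualVM.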

Re: Dynamic configuration via broadcast state

2021-04-09 Thread Arvid Heise
Hi Vishal, what you are trying to achieve is quite common and has its own documentation [1]. Currently, there is no way to hold back elements of the non-broadcast side (your question 2 in OP), so you have to save them until configuration arrives. If you have several configurable operators, you co
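A minimal sketch of the broadcast-state pattern being referenced (class and stream names such as `Event`, `events`, and `configStream` are made up; buffering of elements that arrive before any configuration must be done manually, e.g. in keyed state):

```java
// Broadcast a configuration stream and connect it to the data stream
MapStateDescriptor<String, String> configDescriptor =
    new MapStateDescriptor<>("config", Types.STRING, Types.STRING);

BroadcastStream<String> configBroadcast = configStream.broadcast(configDescriptor);

events
    .connect(configBroadcast)
    .process(new BroadcastProcessFunction<Event, String, Event>() {
        @Override
        public void processElement(Event value, ReadOnlyContext ctx,
                                   Collector<Event> out) throws Exception {
            String cfg = ctx.getBroadcastState(configDescriptor).get("key");
            if (cfg == null) {
                // No configuration seen yet: buffer the element in state
                // instead of processing it (Flink cannot hold it back for you).
            } else {
                out.collect(value);
            }
        }

        @Override
        public void processBroadcastElement(String cfg, Context ctx,
                                            Collector<Event> out) throws Exception {
            // New configuration arrives: store it in broadcast state
            ctx.getBroadcastState(configDescriptor).put("key", cfg);
        }
    });
```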

Re: Re: period batch job leads to OutOfMemoryError: Metaspace problem

2021-04-09 Thread Arvid Heise
Hi, What do you mean by light-weight way? Just to clarify: you copy the jar once in the lib folder and restart the cluster once (and put it into the lib/ for future clusters). Not sure how it would be more light-weight. You can still bundle it into your jar if you prefer it. It just tends to be b

Re: how to submit jobs remotely when a rest proxy like nginx is used and REST endpoint is bind to loopback interface?

2021-04-09 Thread Arvid Heise
Hi Ming, instead of using the command line interface to run Flink applications, you should use the REST API [1]. You should first upload your jar in one call and then execute the job in the second call. The rest endpoint would be http://10.20.39.43:8080/ [1] http
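The two REST calls described can be sketched with curl (host/port taken from the thread; jar name, jar id, and entry class are placeholders — the actual jar id is returned by the upload call):

```shell
# 1. Upload the jar through the proxy
curl -X POST -H "Expect:" -F "jarfile=@./my-job.jar" \
  http://10.20.39.43:8080/jars/upload

# 2. Run it, using the jar id from the upload response
curl -X POST http://10.20.39.43:8080/jars/<jar-id>/run \
  -H "Content-Type: application/json" \
  -d '{"entryClass": "com.example.Main", "parallelism": 2}'
```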

Re: Flink 1.13 and CSV (batch) writing

2021-04-09 Thread Kurt Young
`format.ignore-first-line` is unfortunately a regression compared to the old one. I've created a ticket [1] to track this but according to current design, it seems not easy to do. Regarding null values, I'm not sure if I understand the issue you had. What do you mean by using ',bye' to test null L

Re: Re: period batch job leads to OutOfMemoryError: Metaspace problem

2021-04-09 Thread Maciek Próchniak
Hi Arvid, "You can still bundle it into your jar if you prefer it." - is it really the case with JDBC drivers? I think that if the driver is not on Flink main classpath (that is, in the lib folder) there is no way the class would be loaded by main classloader - regardless of parent/child clas

Re: Flink: Exception from container-launch exitCode=2

2021-04-09 Thread Till Rohrmann
Hi Yik San, to me it looks as if there is a problem with the job and the deployment. Unfortunately, the logging seems to not have worked. Could you check that you have a valid log4j.properties file in your conf directory. Cheers, Till On Wed, Apr 7, 2021 at 4:57 AM Yik San Chan wrote: > *The q

Re: Flink 1.13 and CSV (batch) writing

2021-04-09 Thread Flavio Pompermaier
In my real CSV I have LONG columns that can contain null values. In that case I get a parse exception (and I would like to avoid reading it as a string). The ',bye' is just the way you can test that in my example (add that line to the input csv). If I use 'csv.null-literal' = '' it seems to work b

Re: Flink: Exception from container-launch exitCode=2

2021-04-09 Thread Till Rohrmann
I actually think that the logging problem is caused by Hadoop 2.7.3 which pulls in the slf4j-log4j12-1.7.10.jar. This binding is then used but there is no proper configuration file for log4j because Flink actually uses log4j2. Cheers, Till On Fri, Apr 9, 2021 at 12:05 PM Till Rohrmann wrote: >

Re: Flink 1.13 and CSV (batch) writing

2021-04-09 Thread Kurt Young
Converting from table to DataStream in batch mode is indeed a problem now. But I think this will be improved soon. Best, Kurt On Fri, Apr 9, 2021 at 6:14 PM Flavio Pompermaier wrote: > In my real CSV I have LONG columns that can contain null values. In that > case I get a parse exception (and

Query regarding flink metric types

2021-04-09 Thread V N, Suchithra (Nokia - IN/Bangalore)
Hi Community, Need some information regarding metrics type mentioned in flink documentation. https://ci.apache.org/projects/flink/flink-docs-stable/ops/metrics.html For the checkpoint metrics, below metrics are defined as of type gauge. As per my understanding gauge type is used to represent a v
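On the gauge question: a gauge simply reports a point-in-time value each time it is sampled, which is why metrics like checkpoint duration or size are gauges rather than counters. Registering a custom gauge looks roughly like this (class, metric name, and the exposed field are made up for illustration):

```java
// Sketch: exposing a point-in-time value as a gauge from a rich function
public class MyMapper extends RichMapFunction<String, String> {
    private transient long lastSeenTimestamp;

    @Override
    public void open(Configuration parameters) {
        getRuntimeContext()
            .getMetricGroup()
            .gauge("lastSeenTimestamp", (Gauge<Long>) () -> lastSeenTimestamp);
    }

    @Override
    public String map(String value) {
        lastSeenTimestamp = System.currentTimeMillis();
        return value;
    }
}
```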

Re: Re: period batch job leads to OutOfMemoryError: Metaspace problem

2021-04-09 Thread Arvid Heise
Afaik the main issue is that the JDBC drivers are leaking as they usually assume only one classloader. If you are aware of it, you can bundle it in your jar. However, you are right - it doesn't help with OP, so it was probably not a good idea. On Fri, Apr 9, 2021 at 11:45 AM Maciek Próchniak wrot
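For completeness, the lib-folder approach from earlier in the thread amounts to something like this (driver jar and install path are hypothetical):

```shell
# Put the JDBC driver on the Flink main classpath, then restart the
# cluster so the parent classloader picks it up
cp postgresql-42.2.19.jar /opt/flink/lib/
./bin/stop-cluster.sh && ./bin/start-cluster.sh
```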

Re: Flink 1.13 and CSV (batch) writing

2021-04-09 Thread Flavio Pompermaier
That's absolutely useful. IMHO also join should work without windows/triggers, and left/right outer joins should be easier, in order to really migrate legacy code. Also reduceGroup would help but is less urgent. I hope that my feedback as a Flink user could be useful. Best, Flavio On Fri, Apr 9, 2021 at

Re: Task manager local state data after crash / recovery

2021-04-09 Thread Till Rohrmann
Hi Dhanesh, The way local state works in Flink currently is the following: The user configures a `taskmanager.state.local.root-dirs` or the tmp directory is used where Flink creates a "localState" directory. This is the base directory for all local state. Within this directory a TaskManager create
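The configuration Till refers to can be sketched in flink-conf.yaml like this (the directory path is an example):

```yaml
# flink-conf.yaml — enable local recovery and pin where local state lives
state.backend.local-recovery: true
taskmanager.state.local.root-dirs: /data/flink/local-state
```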

Re: Proper way to get DataStream

2021-04-09 Thread Maminspapin
Arvid Heise-4, Ok, this is clear for me now. Good answer.

Re: Task manager local state data after crash / recovery

2021-04-09 Thread dhanesh arole
Thanks a lot for answering it in detail. This makes sense and cleared lots of doubt. On Fri, 9 Apr 2021 at 13:02 Till Rohrmann wrote: > Hi Dhanesh, > > The way local state works in Flink currently is the following: The user > configures a `taskmanager.state.local.root-dirs` or the tmp directory

Re: clear() in a ProcessWindowFunction

2021-04-09 Thread Roman Khachatryan
Hi Vishal, Sorry for the late reply, Please find my answers below. By state I assume the state obtained via getRuntimeContext (access to window state is not allowed).. > The state is scoped to the key (created per key in the ProcessWindowFunction > with a ttl ) Yes. > The state will remain aliv
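The per-key state with TTL being discussed is typically set up in the function's open() method, roughly like this (the state name and the one-hour TTL are illustrative):

```java
// Sketch: keyed state with a TTL inside a ProcessWindowFunction
StateTtlConfig ttlConfig = StateTtlConfig
    .newBuilder(Time.hours(1))
    .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
    .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
    .build();

ValueStateDescriptor<Long> desc =
    new ValueStateDescriptor<>("perKeyCounter", Long.class);
desc.enableTimeToLive(ttlConfig);

// Scoped to the current key; expired entries are no longer returned
ValueState<Long> state = getRuntimeContext().getState(desc);
```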

Re: FLINK Kinesis consumer Checkpointing data loss

2021-04-09 Thread Vijayendra Yadav
Thank you, it helped. > On Apr 8, 2021, at 10:53 PM, Arvid Heise wrote: > >  > Hi Vijay, > > if you don't specify a checkpoint, then Flink assumes you want to start from > scratch (e.g., you had a bug in your business logic and need to start > completely without state). > > If there is any
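The restore behavior Arvid describes maps to the -s flag on job submission (bucket and paths below are placeholders):

```shell
# Resume from a retained checkpoint or savepoint; without -s,
# Flink starts the job with completely empty state
./bin/flink run -s s3://my-bucket/checkpoints/<job-id>/chk-42 my-job.jar
```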

Flink Metric isBackPressured not available

2021-04-09 Thread Claude M
Hello, The documentation here https://ci.apache.org/projects/flink/flink-docs-stable/ops/metrics.html states there is a isBackPressured metric available yet I don't see it. Any ideas why? Thanks