Re: [QUERY] Multiple elastic search sinks for a single flink pipeline

2020-10-14 Thread Chesnay Schepler
Is the number of sinks fixed? If so, then you can just take the output of your map function and apply multiple filters, writing the output of each filter into a sink. You could also use a process function with side-outputs, and apply a sink to each output. On 10/14/2020 6:05 PM, Vignesh

Re: Flink Job cluster in HA mode - recovery vs upgrade

2020-08-23 Thread Chesnay Schepler
jar, we stop job with save point sp2, change manifest to specify “-s sp2” and newer image, and create K8s job again, on start will HAServices still read job graph from Zookeeper? *From:* Chesnay Schepler *Sent:* Sunday

Re: Flink Job cluster in HA mode - recovery vs upgrade

2020-08-23 Thread Chesnay Schepler
? *From:* Chesnay Schepler *Sent:* Sunday, August 23, 2020 1:46:45 AM *To:* Alexey Trenikhun ; Piotr Nowojski *Cc:* Flink User Mail List *Subject:* Re: Flink Job cluster in HA mode - recovery vs upgrade A job cluster is submitted as a job

Re: [DISCUSS] Remove Kafka 0.10.x connector (and possibly 0.11.x)

2020-08-25 Thread Chesnay Schepler
+1 to remove both the 0.10 and 0.11 connectors. The connectors have not been actively developed for some time. They are basically just sitting around causing noise by causing test instabilities and eating CI time. It would also allow us to really simplify the module structure of the Kafka

Re: Loading FlinkKafkaProducer fails with LinkError

2020-08-25 Thread Chesnay Schepler
and unpacks it into the image. I've run: find . -iname "*.jar" | xargs -n 1 jar tf | grep -i producerrecord find . -iname "*.jar" | xargs -n 1 jar tf | grep -i kafka Both from within /lib, they both produce no results. On Tue, Aug 25, 2020 at 10:07

Re: Loading FlinkKafkaProducer fails with LinkError

2020-08-25 Thread Chesnay Schepler
The NoSuchMethodException shows that the class is still on the classpath, but with a different version than your code is expecting. Otherwise you would've gotten a different error. This implies that there are 2 versions of the kafka dependencies on the classpath in your original run; it

Re: The frequency flink push metrics to pushgateway?

2020-08-28 Thread Chesnay Schepler
You can configure the reporter interval; please see this example. On 28/08/2020 08:49, wangl...@geekplus.com wrote: Using

Re: Default Flink Metrics Graphite

2020-08-26 Thread Chesnay Schepler
metrics.reporter.grph.class: org.apache.flink.metrics.graphite.GraphiteReporter https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html#graphite-orgapacheflinkmetricsgraphitegraphitereporter On 26/08/2020 16:40, Vijayendra Yadav wrote: Hi Dawid, I have 1.10.0

Re: Flink Job cluster in HA mode - recovery vs upgrade

2020-08-21 Thread Chesnay Schepler
Schepler *Cc:* Alexey Trenikhun ; Flink User Mail List *Subject:* Re: Flink Job cluster in HA mode - recovery vs upgrade Thank you for the clarification Chesnay and sorry for the incorrect previous answer. Piotrek czw., 20 sie 2020 o 15:59 Chesnay Schepler <mailto:ches...@apache.org>>

Re: Flink Job cluster in HA mode - recovery vs upgrade

2020-08-22 Thread Chesnay Schepler
failed job will be over written by new one which will have same job-id? ---- *From:* Chesnay Schepler *Sent:* Friday, August 21, 2020 12:16 PM *To:* Alexey Trenikhun ; Piotr Nowojski *Cc:* Flink User Mail List *Subject:* Re: Flink J

Re: AWS EMR deployment error : NoClassDefFoundError org/apache/flink/api/java/typeutils/ResultTypeQueryable

2020-08-21 Thread Chesnay Schepler
If this class cannot be found on the classpath then chances are Flink is completely missing from the classpath. I haven't worked with EMR, but my guess is that you did not submit things correctly. From the EMR documentation I could gather that the submission should work without the

Re: Debezium Flink EMR

2020-08-21 Thread Chesnay Schepler
@Jark Would it be possible to use the 1.11 debezium support in 1.10? On 20/08/2020 19:59, Rex Fenley wrote: Hi, I'm trying to set up Flink with Debezium CDC Connector on AWS EMR, however, EMR only supports Flink 1.10.0, whereas Debezium Connector arrived in Flink 1.11.0, from looking at the

Re: Flink Job cluster in HA mode - recovery vs upgrade

2020-08-20 Thread Chesnay Schepler
This is incorrect; we do store the JobGraph in ZooKeeper. If you just delete the deployment the cluster will recover the previous JobGraph (assuming you aren't changing the Zookeeper configuration). If you wish to update the job, then you should cancel it (along with creating a savepoint),

Re: No space left on device exception

2020-08-20 Thread Chesnay Schepler
, Aug 20, 2020 at 2:01 PM Chesnay Schepler <mailto:ches...@apache.org>> wrote: Could you try adding this to your flink-conf.yaml? s3.staging-directory: /usr/mware/flink/tmp On 20/08/2020 20:50, Vishwas Siravara wrote: Hi Piotr, I did some analysis and realised that the t

Re: No space left on device exception

2020-08-20 Thread Chesnay Schepler
Could you try adding this to your flink-conf.yaml? s3.staging-directory: /usr/mware/flink/tmp On 20/08/2020 20:50, Vishwas Siravara wrote: Hi Piotr, I did some analysis and realised that the temp files for s3 checkpoints are staged in /tmp although io.tmp.dirs is set to a different

Re: Flink Job cluster in HA mode - recovery vs upgrade

2020-08-23 Thread Chesnay Schepler
controller is used to manage JobManager. Am I right? *From:* Chesnay Schepler *Sent:* Saturday, August 22, 2020 12:58 AM *To:* Alexey Trenikhun ; Piotr Nowojski *Cc:* Flink User Mail List *Subject:* Re: Flink Job cluster

Re: Flink S3 Hadoop dependencies

2020-08-14 Thread Chesnay Schepler
Filesystems are supposed to be used as plugins (by putting the jars under plugins/ instead of lib/), in which case they are loaded separately from other classes, specifically user-code. https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/plugins.html On 14/08/2020 20:25, Satish

Re: Maximum query and refresh rate for metrics from REST API

2020-09-17 Thread Chesnay Schepler
By default metrics are only updated every 10 seconds; this can be controlled via https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/config.html#metrics-fetcher-update-interval. On 9/17/2020 12:22 AM, Piper Piper wrote: Hello, What is the recommended way to get metrics (such as

Re: Blobserver dying mid-application

2020-10-01 Thread Chesnay Schepler
It would also be good to know how many slots you have on each task executor. On 10/1/2020 11:21 AM, Till Rohrmann wrote: Hi Andreas, do the logs of the JM contain any information? Theoretically, each task submission to a `TaskExecutor` can trigger a new connection to the BlobServer. This

Re: Blobserver dying mid-application

2020-10-01 Thread Chesnay Schepler
** *From:*Chesnay Schepler *Sent:* Thursday, October 1, 2020 5:42 AM *To:* Till Rohrmann ; Hailu, Andreas [Engineering] *Cc:* user@flink.apache.org *Subject:* Re: Blobserver dying mid-application It would also be good to know how many slots you have on each task executor. On 10/1/2020 11:21

Re: sideOutputLateData doesn't work with map()

2020-09-17 Thread Chesnay Schepler
This is working as intended, but is admittedly inconvenient. The reason why the original version does not work is that the side-output is scoped to the DataStream that the process function creates; the Map function creates another DataStream though that does not retain the side-output of the

Re: java.lang.NoSuchMethodError while writing to Kafka from Flink

2020-05-25 Thread Chesnay Schepler
Please double-check that your distribution and application jar were built against the same Flink version. This looks related to a binary-compatibility issue reported in FLINK-13586.

Re: Query Rest API from IDE during runtime

2020-05-25 Thread Chesnay Schepler
StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(config) does not actually create any resources yet, this only happens when you run a job. Upon execute() the Flink cluster is started, the job is run, and once the job finishes (and execute() returns) the cluster shuts down. So, you can

Re: Query Rest API from IDE during runtime

2020-05-25 Thread Chesnay Schepler
If you set DeploymentOptions.ATTACHED to false then execute() does not block until the job finishes, and returns a DetachedJobExecutionResult from which you can retrieve the Job ID. If you need to know when the job finishes you will have to continuously query the REST API. This is the only
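The "continuously query the REST API" step could look roughly like the loop below. This is a plain-Java sketch, not Flink client code: `statusLookup` is a hypothetical callback standing in for whatever HTTP call fetches the job status for the Job ID, the set of terminal state names is an assumption to be checked against your Flink version, and the poll limit is made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Supplier;

final class JobPollingSketch {

    // Assumed names of terminal job states; verify against the Flink version in use.
    private static final Set<String> TERMINAL =
            new HashSet<>(Arrays.asList("FINISHED", "FAILED", "CANCELED"));

    /**
     * Calls statusLookup repeatedly until it returns a terminal state,
     * sleeping pollIntervalMillis between attempts, at most maxPolls times.
     */
    static String awaitTerminalState(
            Supplier<String> statusLookup, long pollIntervalMillis, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            String status = statusLookup.get();
            if (TERMINAL.contains(status)) {
                return status;
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while polling", e);
            }
        }
        throw new IllegalStateException(
                "job did not reach a terminal state after " + maxPolls + " polls");
    }

    public static void main(String[] args) {
        // Canned responses replace the real REST call so the sketch runs on its own.
        Iterator<String> canned =
                Arrays.asList("RUNNING", "RUNNING", "FINISHED").iterator();
        System.out.println(awaitTerminalState(canned::next, 10L, 5));
    }
}
```

In a real program the supplier would issue the HTTP request and extract the state from the response; keeping it behind a `Supplier` also makes the loop easy to test.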

Re: History Server Not Showing Any Jobs - File Not Found?

2020-05-28 Thread Chesnay Schepler
-data/hadoop/hdp-2.6.5.0/hadoop/bin::/gns/mw/dbclient/postgres/jdbc/pg-jdbc-9.3.v01/postgresql-9.3-1100-jdbc4.jar/ // You can see we have references to Hadoop mapred/yarn/hdfs libs in there. *// *ah** *From:*Chesnay Schepler *Sent:* Sunday, May 3, 2020 6:00 PM *To:* Hailu, Andreas [Engineering

Re: History Server Not Showing Any Jobs - File Not Found?

2020-05-28 Thread Chesnay Schepler
/bin::/gns/mw/dbclient/postgres/jdbc/pg-jdbc-9.3.v01/postgresql-9.3-1100-jdbc4.jar/ // You can see we have references to Hadoop mapred/yarn/hdfs libs in there. *// *ah *From:*Chesnay Schepler <mailto:ches...@apache.org> *Sent:* Sunday, May 3, 2020 6:00 PM

Re: History Server Not Showing Any Jobs - File Not Found?

2020-05-29 Thread Chesnay Schepler
*ah** *From:*Hailu, Andreas [Engineering] *Sent:* Thursday, May 28, 2020 12:18 PM *To:* 'Chesnay Schepler' ; user@flink.apache.org *Subject:* RE: History Server Not Showing Any Jobs - File Not Found? Okay, I will look further to see if we’re mistakenly using a version that’s pre-2.6.0. Howeve

Re: History Server Not Showing Any Jobs - File Not Found?

2020-06-02 Thread Chesnay Schepler
keep the latest X archives in the dir - or is this something we need to manage ourselves? Thanks. *// *ah** *From:*Hailu, Andreas [Engineering] *Sent:* Friday, May 29, 2020 8:46 AM *To:* 'Chesnay Schepler' ; user@flink.apache.org *Subject:* RE: History Server Not Showing Any Jobs - File Not Found

Re: On Flink job re-submission, observing error, "the rpc invocation size exceeds the maximum akka framesize"

2020-09-18 Thread Chesnay Schepler
how can we know the expected size for which it is failing? If you did not configure akka.framesize yourself then it is set to the documented default value. See the configuration documentation for the release you are using. > Does the operator state have any impact on the expected Akka frame

Re: On Flink job re-submission, observing error, "the rpc invocation size exceeds the maximum akka framesize"

2020-09-18 Thread Chesnay Schepler
configuration but we are unable to identify the size for which it fails. Could you help out on this? Awaiting a response. Regards, Shravan Chesnay Schepler wrote how can we know the expected size for which it is failing? If you did not configure akka.framesize yourself then it is set to t

Re: On Flink job re-submission, observing error, "the rpc invocation size exceeds the maximum akka framesize"

2020-09-18 Thread Chesnay Schepler
isn't very convincing. Regards, M S Shravan Chesnay Schepler wrote If you use 1.10.0 or above the framesize for which it failed is part of the exception message, see FLINK-14618. If you are using an older version, then I'm afraid there is no way to tell. On 9/18/2020 12:11 PM, shravan wrot

Re: mvn clean verify - testConfigurePythonExecution failing

2020-10-22 Thread Chesnay Schepler
try naming it PythonProgramOptionsITCase; it apparently needs a jar to be created first, which happens after unit tests (tests suffixed with Test) are executed. On 10/22/2020 1:48 PM, Juha Mynttinen wrote: Hello there, The PR https://github.com/apache/flink/pull/13322 lately added the test

Re: FLINK 1.11 Graphite Metrics

2020-10-27 Thread Chesnay Schepler
at 11:18 AM Chesnay Schepler <mailto:ches...@apache.org>> wrote: So the plugins directory is completely empty? In that case, please download the flink-metrics-graphite jar <https://mvnrepository.com/artifact/org.apache.flink/flink-metrics-graphite/1.11.0>

Re: FLINK 1.11 Graphite Metrics

2020-10-27 Thread Chesnay Schepler
mkdir -p ./plugins/s3-fs-hadoop cp ./opt/flink-s3-fs-hadoop-1.11.0.jar ./plugins/s3-fs-hadoop/ 3) Recompiled Application with Flink 1.11 dependency. 4) Updated Graphite plugin class in config That is all I did. Regards, Vijay On Tue, Oct 27, 2020 at 10:00 AM Ches

Re: FLINK 1.11 Graphite Metrics

2020-10-25 Thread Chesnay Schepler
Ah wait, in 1.11 it should no longer be necessary to explicitly copy the reporter jar. Please update your reporter configuration to this: |metrics.reporter.grph.factory.class: org.apache.flink.metrics.graphite.GraphiteReporterFactory| On 10/25/2020 4:00 PM, Chesnay Schepler wrote: Have

Re: FLINK 1.11 Graphite Metrics

2020-10-25 Thread Chesnay Schepler
Have you followed the documentation, specifically this bit? > In order to use this reporter you must copy |/opt/flink-metrics-influxdb-1.11.2.jar| into the |plugins/influxdb| folder of your Flink distribution. On 10/24/2020 12:17 AM, Vijayendra Yadav wrote: Hi Team, for Flink 1.11 Graphite

Re: RestClusterClient and classpath

2020-10-28 Thread Chesnay Schepler
. @Chesnay Schepler I think that we are setting the correct classloader during jobgraph creation [1]. Is that what you mean? Cheers, Kostas [1] https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/program/PackagedProgramUtils.java#L122 On Wed, Oct 28, 2020

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-28 Thread Chesnay Schepler
we can do and I would say that they seem to be working pretty well (e.g. the recent Mesos discussion). Of course they are not perfect but the alternative would be to never remove anything user facing until the next major release, which I find pretty strict. On Wed, Oct 28, 2020 at 10:04 AM Chesna

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-28 Thread Chesnay Schepler
ing it, we should remove it and not leave unmaintained code around. On Wed, Oct 28, 2020 at 12:11 PM Chesnay Schepler wrote: The alternative could also be to use a different argument than "no one uses it", e.g., we are fine with removing it at the cost of friction for some users beca

Re: [DISCUSS] Remove flink-connector-filesystem module.

2020-10-28 Thread Chesnay Schepler
cannot remove it before trying to provide a smooth migration path. Thanks, Kostas [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API On Fri, Oct 16, 2020 at 4:36 PM Chesnay Schepler wrote: @Seth: Earlier in this discussion it was said that the BucketingSink would

Re: RestClusterClient and classpath

2020-10-28 Thread Chesnay Schepler
he jar is the classpath (I'm trying to debug the program from the IDE)..isn'it? Classpath: [file:/tmp/job-bundle.jar] ... Best, Flavio On Tue, Oct 27, 2020 at 10:39 AM Chesnay Schepler mailto:ches...@apache.org>> wrote: * your JobExecutor is _

Re: RestClusterClient and classpath

2020-10-27 Thread Chesnay Schepler
* your JobExecutor is _not_ putting it on the classpath. On 10/27/2020 10:36 AM, Chesnay Schepler wrote: Well it happens on the client before you even hit the RestClusterClient, so I assume that either your jar is not packaged correctly or your JobExecutor is putting it on the classpath

Re: FLINK 1.11 Graphite Metrics

2020-10-27 Thread Chesnay Schepler
ion you have given. What else can I try? export FLINK_CONF_DIR=${app_install_path}/flinkconf/ regards, Vijay On Sun, Oct 25, 2020 at 8:03 AM Chesnay Schepler mailto:ches...@apache.org>> wrote: Ah wait, in 1.11 it should not l

Re: RestClusterClient and classpath

2020-10-27 Thread Chesnay Schepler
Well it happens on the client before you even hit the RestClusterClient, so I assume that either your jar is not packaged correctly or your JobExecutor is putting it on the classpath. On 10/27/2020 9:42 AM, Flavio Pompermaier wrote: Sure. Here it is (org.apache.flink.client.cli.JobExecutor

Re: Flink 1.11 throws Unrecognized field "error_code"

2020-07-17 Thread Chesnay Schepler
Please double-check that the client and server are using the same Flink version. On 17/07/2020 02:42, Lian Jiang wrote: Hi, I am using java 1.8 and Flink 1.11 by following https://ci.apache.org/projects/flink/flink-docs-release-1.11/try-flink/local_installation.html on my MAC Mojave

Re: Any change in behavior related to the "web.upload.dir" behavior between Flink 1.9 and 1.11

2020-08-03 Thread Chesnay Schepler
From what I can tell we have not changed anything. Are you making any modifications to the image? This exception should only be thrown if there is already a file with the same path, and I don't think Flink would do that. On 03/08/2020 21:43, Avijit Saha wrote: Hello, Has there been any

Re: The bytecode of the class does not match the source code

2020-08-05 Thread Chesnay Schepler
Please make sure you have loaded the correct source jar, and aren't by chance still using the 1.11.0 source jar. On 05/08/2020 09:57, 魏子涵 wrote: Hi, everyone: I found the 【org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl】 class in 【flink-runtime_2.11-1.11.1.jar】 does not match

Re: Flink Promethues Metricsreporter Question

2020-08-06 Thread Chesnay Schepler
The PrometheusReporter acts as a scraping target for a single process. If you already have setup something in the Flink cluster that allows Prometheus/ServiceMonitor to scrape (Flink) metrics, then it shouldn't be necessary. It doesn't coordinate with other services in any way; it just has

Re: Question about ParameterTool

2020-08-11 Thread Chesnay Schepler
The benefit of the ParameterTool is that you do not increase your dependency footprint by using it. When using another CLI library you will generally package it within your user-jar, which may or may not increase the risk of dependency conflicts. Whether, and how large this risk is, depends

Re: The bytecode of the class does not match the source code

2020-08-05 Thread Chesnay Schepler
-blog.csdnimg.cn/20200805180232929.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L0NTRE5yaG1t,size_16,color_FF,t_70 At 2020-08-05 16:46:11, "Chesnay Schepler" wrote: Please make sure you have loaded the correct

Re: Metrics for the overall pipeline

2020-08-06 Thread Chesnay Schepler
You could create an abstract class that extends AbstractRichFunction, and all your remaining functions extend that class and implement the respective (Map/etc.)Function interface. On 06/08/2020 13:20, Manish G wrote: Adding metrics to individual RichMapFunction implementation classes would

Re: Hadoop_Compatability

2020-08-06 Thread Chesnay Schepler
We still offer a flink-shaded-hadoop-2 artifact that you can find on the download page: https://flink.apache.org/downloads.html#additional-components In 1.9 we changed the artifact name. Note that we will not release newer versions of this dependency. As for providing Hadoop class, there is

Re: Dependency vulnerabilities with Apache Flink 1.10.1 version

2020-08-06 Thread Chesnay Schepler
log4j - If you don't use a Socket appender, you're good. Otherwise, you can replace the log4j jars in lib/ with a newer version. You could also upgrade to 1.11.1 which uses log4j2. guava - We do not use Guava for serialization AFAIK. We also do not use Java serialization for records.

Re: Missing metrics when using metric reporter on high parallelism

2020-08-11 Thread Chesnay Schepler
IIRC this can be caused by the Carbon MAX_CREATES_PER_MINUTE setting. I would deem it unlikely that the reporter thread is busy for 30 seconds. On 11/08/2020 16:57, Nikola Hrusov wrote: Hello, I am doing some tests with flink 1.11.1 and I have noticed something strange/wrong going on with

Re: Metrics for number of events in a timeframe

2020-08-04 Thread Chesnay Schepler
meter * timeframe (in seconds) is the simplest option, although it will not be that accurate due to the flattening of spikes. You'd get the best results by using a time-series database, and calculating the difference between the current count and one 5 minutes ago. An example for Prometheus:
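Leaving the Prometheus query aside, the arithmetic behind the two options can be sketched in plain Java. The sample numbers below are made up for illustration; the point is only that a rate multiplied by the window length is an estimate, while the difference of two counter readings is exact.

```java
/** Plain-Java sketch; all sample values are hypothetical. */
final class WindowCountSketch {

    /** Estimate from a meter: events per second multiplied by the window length. */
    static double fromRate(double eventsPerSecond, long windowSeconds) {
        return eventsPerSecond * windowSeconds;
    }

    /** Exact count: counter value now minus the counter value at the window start. */
    static long fromCounter(long countAtWindowStart, long countNow) {
        return countNow - countAtWindowStart;
    }

    public static void main(String[] args) {
        long windowSeconds = 300;          // a 5-minute window
        double currentRate = 2.0;          // hypothetical meter reading, events per second
        long countFiveMinutesAgo = 10_000; // hypothetical counter reading at window start
        long countNow = 11_450;            // hypothetical counter reading now

        // The rate-based estimate misses a spike that has already been averaged away.
        System.out.println(fromRate(currentRate, windowSeconds));        // 600.0
        System.out.println(fromCounter(countFiveMinutesAgo, countNow));  // 1450
    }
}
```

A time-series database performs the second calculation for you, since it stores the earlier counter reading.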

Re: Metrics for number of events in a timeframe

2020-08-04 Thread Chesnay Schepler
No, because Flink counters are mapped to Prometheus gauges. On 04/08/2020 15:52, Manish G wrote: That documentation states: |delta| should only be used with gauges. Would that cause an issue as we are using counter. With regards On Tue, Aug 4, 2020 at 7:12 PM Chesnay Schepler <mailto:c

Re: getting in an infinite loop while creating the dependency-reduced pom

2020-08-03 Thread Chesnay Schepler
https://issues.apache.org/jira/browse/MSHADE-148 On 03/08/2020 10:16, Dongwon Kim wrote: Hi, I create a new maven project (I'm using Maven 3.6.3) w/ the below command |curl https://flink.apache.org/q/quickstart-SNAPSHOT.sh | bash -s 1.11.1| and add the following dependencies to

Re: Chaining the creation of a WatermarkStrategy doesn't work?

2020-07-08 Thread Chesnay Schepler
WatermarkStrategy.forBoundedOutOfOrderness(Duration.of(1, ChronoUnit.MINUTES)) returns a WatermarkStrategy, but the exact type is entirely dependent on the variable declaration (i.e., it is not dependent on any argument). So, when you assign the strategy to a variable then the compiler can
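The inference behaviour described here is general Java and can be reproduced without Flink. In the sketch below, `Strategy`, `withTimestamps` and `Event` are hypothetical stand-ins, not the Flink classes; the factory method has a type parameter that appears only in its return type, which is the situation the message describes. The explicit type witness in the second option is a standard Java alternative to declaring a variable, added here as an illustration.

```java
import java.time.Duration;
import java.util.function.ToLongFunction;

final class InferenceSketch {

    /** Hypothetical stand-in for a generic strategy type; not the Flink class. */
    static final class Strategy<T> {
        private final Duration bound;
        private final ToLongFunction<T> timestamps;

        private Strategy(Duration bound, ToLongFunction<T> timestamps) {
            this.bound = bound;
            this.timestamps = timestamps;
        }

        /** T appears only in the return type, so no argument helps the compiler infer it. */
        static <T> Strategy<T> forBoundedOutOfOrderness(Duration bound) {
            return new Strategy<>(bound, element -> Long.MIN_VALUE);
        }

        Strategy<T> withTimestamps(ToLongFunction<T> timestamps) {
            return new Strategy<>(bound, timestamps);
        }

        long timestampOf(T element) {
            return timestamps.applyAsLong(element);
        }
    }

    /** Hypothetical event type with a timestamp field. */
    static final class Event {
        final long time;

        Event(long time) {
            this.time = time;
        }
    }

    public static void main(String[] args) {
        // Option 1: declare a variable first; the declared type fixes T.
        Strategy<Event> base = Strategy.forBoundedOutOfOrderness(Duration.ofMinutes(1));
        Strategy<Event> viaVariable = base.withTimestamps(e -> e.time);

        // Option 2: chain directly, but supply T with an explicit type witness.
        Strategy<Event> viaWitness = Strategy
                .<Event>forBoundedOutOfOrderness(Duration.ofMinutes(1))
                .withTimestamps(e -> e.time);

        // Chaining without the witness leaves T as Object, so e.time does not compile:
        // Strategy<Event> broken = Strategy.forBoundedOutOfOrderness(Duration.ofMinutes(1))
        //         .withTimestamps(e -> e.time);

        System.out.println(viaVariable.timestampOf(new Event(42L)));
        System.out.println(viaWitness.timestampOf(new Event(42L)));
    }
}
```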

Re: A query on Flink metrics in kubernetes

2020-07-09 Thread Chesnay Schepler
From Flink's perspective no metrics are aggregated, nor are metric requests forwarded to some other process. Each TaskExecutor has its own reporter, that each must be scraped to get the full set of metrics. On 09/07/2020 11:39, Manish G wrote: Hi, I have a query regarding prometheus

Re: Limiting metrics logs to custom metric

2020-07-08 Thread Chesnay Schepler
There's no built-in functionality for this. You could customize the reporter though. On 08/07/2020 17:19, Manish G wrote: Hi, I have added a Meter in my code and pushing it to app logs using slf4j reporter. I observe that apart from my custom metrics, lots of other metrics like gauge,

Re: Limiting metrics logs to custom metric

2020-07-08 Thread Chesnay Schepler
? On Wed, Jul 8, 2020, 9:38 PM Chesnay Schepler <mailto:ches...@apache.org>> wrote: There's no built-in functionality for this. You could customize the reporter though. On 08/07/2020 17:19, Manish G wrote: > Hi, > > I have added a Meter in my code and push

Re: Integrating prometheus

2020-07-03 Thread Chesnay Schepler
my custom metrics(a simple Counter) is still not shown on the prometheus board, though default metrics I can see. Anything I am missing? On Fri, Jul 3, 2020 at 8:57 PM Chesnay Schepler mailto:ches...@apache.org>> wrote: What metrics specifically are

Re: SSL for QueryableStateClient

2020-07-07 Thread Chesnay Schepler
Queryable state does not support SSL. On 06/07/2020 22:42, mail2so...@yahoo.co.in wrote: Hello, I am running flink on Kubernetes, and from outside the Ingress to a proxy on Kubernetes is via SSL 443 PORT only. Can you please provide guidance on how to setup the SSL for

Re: Logging Flink metrics

2020-07-06 Thread Chesnay Schepler
Have you looked at the SLF4J reporter? https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html#slf4j-orgapacheflinkmetricsslf4jslf4jreporter On 06/07/2020 13:49, Manish G wrote: Hi, Is it possible to log Flink metrics in application logs apart from publishing it

Re: Logging Flink metrics

2020-07-06 Thread Chesnay Schepler
is as: metrics.reporter.slf4j.class: org.apache.flink.metrics.slf4j.Slf4jReporter metrics.reporter.slf4j.interval: 30 SECONDS On Mon, Jul 6, 2020 at 7:57 PM Chesnay Schepler <mailto:ches...@apache.org>> wrote: How long did the job run for, and what is the configured interval?

Re: Logging Flink metrics

2020-07-06 Thread Chesnay Schepler
and invoked markEvent() method in the target code. But I don't see any related logs. I am doing this all on my local computer. Anything else I need to do? With regards Manish On Mon, Jul 6, 2020 at 5:24 PM Chesnay Schepler <mailto:ches...@apache.org>> wrote: Have you looked at the SLF4J

Re: Decompressing Tar Files for Batch Processing

2020-07-07 Thread Chesnay Schepler
I would probably go with a separate process. Downloading the file could work with Flink if it is already present in some supported filesystem. Decompressing the file is supported for selected formats (deflate, gzip, bz2, xz), but this seems to be an undocumented feature, so I'm not sure how

Re: Logging Flink metrics

2020-07-06 Thread Chesnay Schepler
, which will likely entail manually starting the processes and updating the configuration in-between, e.g.: ./bin/jobmanager.sh start ./bin/taskmanager.sh start On 06/07/2020 19:16, Manish G wrote: Yes. On Mon, Jul 6, 2020 at 10:43 PM Chesnay Schepler <mailto:ches...@apache.org>&

Re: Logging Flink metrics

2020-07-06 Thread Chesnay Schepler
['localhost:9250', 'localhost:9251']     metrics_path: /* * * This is the whole configuration I have done based on several tutorials and blogs available online. ** /**/ On Mon, Jul 6, 2020 at 10:20 PM Chesnay Schepler <mailto:ches...@apache.org>> wrote: These are all JobMana

Re: Logging Flink metrics

2020-07-06 Thread Chesnay Schepler
without having to remove the entire configuration. I'd suggest to just remove the option, try again, and report back. On 06/07/2020 16:35, Chesnay Schepler wrote: Please enable debug logging and search for warnings from the metric groups/registry/reporter. If you cannot find anything suspicious

Re: Logging Flink metrics

2020-07-06 Thread Chesnay Schepler
"58483036154d7f72ad1bbf10eb86bc2e",host="localhost",job_name="frauddetection",} 0.0 On Mon, Jul 6, 2020 at 9:51 PM Chesnay Schepler <mailto:ches...@apache.org>> wrote: You've said elsewhere that you do see some metrics in prometheus, which

Re: Logging Flink metrics

2020-07-06 Thread Chesnay Schepler
metrics.reporter.slf4j.interval: 30 SECONDS // And while I can see custom metrics in Taskmanager logs, but prometheus dashboard logs doesn't show custom metrics. With regards On Mon, Jul 6, 2020 at 8:55 PM Chesnay Schepler <mailto:ches...@apache.org>&

Re: RequiredParameters in Flink 1.11

2020-07-13 Thread Chesnay Schepler
/** * ... * * @deprecated These classes will be dropped in the next version. Use {@link ParameterTool} or a third-party * command line parsing library instead. */ On 13/07/2020 17:24, Flavio Pompermaier wrote: In Flink 1.11 RequiredParameters and Option have been deprecated. Is there any

Re: Missing jars

2020-07-14 Thread Chesnay Schepler
flink-formats is a pom artifact, meaning that there are no jars for it. You should add a dependency to the specific format(s) you are interested in, like flink-formats-csv. On 14/07/2020 17:41, Daves Open wrote: I added flink-formats to my pom.xml file, but the jar files are not found.  I

Re: RestartStrategy failure count when losing a Task Manager

2020-07-15 Thread Chesnay Schepler
1) A restart in one region only increments the count by 1, independent of how many tasks from that region fail at the same time. If tasks from different regions fail at the same time, then the bound is incremented by the number of affected regions. 2) I would consider what failure rate is
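The counting rule in point 1 can be written down as a small plain-Java sketch. `FailedTask` and the region ids are hypothetical; the sketch only restates the rule from the message (one increment per affected region, regardless of how many tasks in that region fail together) and does not reflect Flink's internal implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RegionFailureCountSketch {

    /** Hypothetical record of a failed task; only its failover region matters here. */
    static final class FailedTask {
        final String taskName;
        final int regionId;

        FailedTask(String taskName, int regionId) {
            this.taskName = taskName;
            this.regionId = regionId;
        }
    }

    /** Tasks failing at the same time add one count per distinct affected region. */
    static int restartIncrement(List<FailedTask> simultaneousFailures) {
        Set<Integer> affectedRegions = new HashSet<>();
        for (FailedTask task : simultaneousFailures) {
            affectedRegions.add(task.regionId);
        }
        return affectedRegions.size();
    }

    public static void main(String[] args) {
        // Three tasks of the same region fail together: the count goes up by 1.
        List<FailedTask> sameRegion = Arrays.asList(
                new FailedTask("map-1", 0),
                new FailedTask("map-2", 0),
                new FailedTask("sink-1", 0));
        System.out.println(restartIncrement(sameRegion)); // 1

        // Tasks from two different regions fail together: the count goes up by 2.
        List<FailedTask> twoRegions = Arrays.asList(
                new FailedTask("map-1", 0),
                new FailedTask("map-2", 0),
                new FailedTask("map-7", 1));
        System.out.println(restartIncrement(twoRegions)); // 2
    }
}
```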

Re: Communicating with my operators

2020-07-15 Thread Chesnay Schepler
Using an S3 bucket containing the configuration is the way to go. 1) web sockets, or more generally all approaches where you connect to the source The JobGraph won't help you; it doesn't contain the information on where tasks are deployed to at runtime. It is just an abstract representation

Re: Is there a way to access the number of 'keys' that have been used for partioning?

2020-07-15 Thread Chesnay Schepler
This information is not readily available; in fact Flink itself doesn't know how many keys there are at any point. You'd have to calculate it yourself. On 15/07/2020 17:11, orionemail wrote: Hi, I need to query the number of keys that a stream has been split by, is there a way to do this?

Re: Integrating prometheus

2020-07-03 Thread Chesnay Schepler
What metrics specifically are you interested in? On 03/07/2020 17:22, Robert Metzger wrote: Hi Manish, Currently, Flink's metric system does not support metrics via annotations. You need to go with the documented approach. But of course, you can try to build your own metrics abstraction based

Re: Customised RESTful trigger

2020-07-12 Thread Chesnay Schepler
You can specify arguments to your job via query parameters or a json body (recommended) as documented here. On 10/07/2020 18:48, Jacek Grzebyta wrote: Hello, I am a newbie in the Apache

Re: History Server Not Showing Any Jobs - File Not Found?

2020-07-12 Thread Chesnay Schepler
e future J *// *ah** *From:*Chesnay Schepler *Sent:* Tuesday, June 2, 2020 3:55 AM *To:* Hailu, Andreas [Engineering] ; user@flink.apache.org *Subject:* Re: History Server Not Showing Any Jobs - File Not Found? 1) It downloads all archives and stores them on disk; the only thing stored in memory

Re: Request: Documentation for External Communication with Flink Cluster

2020-06-15 Thread Chesnay Schepler
https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/rest_api.html#api On 15/06/2020 17:47, Morgan Geldenhuys wrote: Hi Community, I am interested in creating an external client for submitting and managing Flink jobs via a HTTP/REST endpoint. Taking a look at the

Re: Unable to run flink job in dataproc cluster with jobmanager provided

2020-06-17 Thread Chesnay Schepler
Are you by any chance creating a local environment via (Stream)ExecutionEnvironment#createLocalEnvironment? On 17/06/2020 17:05, Sourabh Mehta wrote: Hi Team, I'm exploring flink for one of my use cases, I'm facing some issues while running a flink job in cluster mode. Below are the steps I

Re: Faild to load dependency after migration to Flink 1.10

2020-06-24 Thread Chesnay Schepler
Did you make any modifications to the Flink scripts in order to get log4j2 to work with Flink 1.10? IIRC we had to modify the scripts when we migrated to log4j2 in 1.11; if this is done incorrectly it could break things. Do all Flink processes have this issue, or only TaskExecutors? Can you

Re: Renaming the metrics

2020-06-22 Thread Chesnay Schepler
There's currently no way to change this. A related enhancement was proposed on FLINK-17495 that would at least allow you to attach a custom label, but the initial implementation wasn't general enough. On 22/06/2020 15:08, Arvid Heise wrote: Hi Ori, I see that the

Re: Unable to run flink job in dataproc cluster with jobmanager provided

2020-06-22 Thread Chesnay Schepler
Cheers, Till On Wed, Jun 17, 2020 at 7:39 PM Sourabh Mehta mailto:sourabhmehta2...@gmail.com>> wrote: No, I am not. On Wed, 17 Jun 2020 at 10:48 PM, Chesnay Schepler mailto:ches...@apache.org>> wrote: Are you by any chance

Re: Completed Job List in Flink UI

2020-06-18 Thread Chesnay Schepler
https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/config.html#jobstore-expiration-time On 18/06/2020 19:57, Ivan Yang wrote: Hello, In Flink web UI Overview tab, "Completed Job List” displays recent completed or cancelled job only for short period of time. After a while, they

Re: Flink savepoints history

2020-06-07 Thread Chesnay Schepler
I can't quite find the answer right now, but the Web UI relies entirely on the REST API. So, whenever you see something in the UI, and wonder where that data comes from, open up the developer tools in your browser, go to the network tab, reload the page and see what requests are being made.

Re: Timer metric in Flink

2020-06-10 Thread Chesnay Schepler
You cannot add custom metric types, just implementations of the existing ones. Your timer(wrapper) will have to implement Gauge or Histogram. On 10/06/2020 14:17, Vinay Patil wrote: Hi, As timer metric is not provided out of the box, can I create a new MetricGroup by implementing this 
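One way to read "your timer(wrapper) will have to implement Gauge" is a small class that measures a duration and exposes the last measurement as a gauge value. The sketch below is plain Java: the nested `Gauge` interface is a local stand-in that mirrors the single-method shape of Flink's gauge so the example runs without Flink on the classpath, and `TimerGauge`, `time` and the injected clock are hypothetical names. In a real job the class would implement Flink's own interface and be registered on a metric group.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

final class TimerGaugeSketch {

    /** Local stand-in for a gauge: a metric that returns a value on demand. */
    interface Gauge<T> {
        T getValue();
    }

    /** Hypothetical timer wrapper that reports the last measured duration in nanoseconds. */
    static final class TimerGauge implements Gauge<Long> {
        private final LongSupplier nanoClock;
        private final AtomicLong lastDurationNanos = new AtomicLong();

        TimerGauge(LongSupplier nanoClock) {
            this.nanoClock = nanoClock;
        }

        /** Runs the action and records how long it took, even if it throws. */
        void time(Runnable action) {
            long start = nanoClock.getAsLong();
            try {
                action.run();
            } finally {
                lastDurationNanos.set(nanoClock.getAsLong() - start);
            }
        }

        @Override
        public Long getValue() {
            return lastDurationNanos.get();
        }
    }

    public static void main(String[] args) {
        TimerGauge timer = new TimerGauge(System::nanoTime);
        timer.time(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000; i++) {
                sum += i;
            }
        });
        System.out.println("last duration (ns): " + timer.getValue());
    }
}
```

A gauge only exposes the most recent value; if the distribution of durations matters, a Histogram-based wrapper would be the other option the message mentions.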

Re: Simple stateful polling source

2020-06-08 Thread Chesnay Schepler
, you'd always want to extend RichParallelSourceFunction for a parallel data source; not out of necessity, but simplicity. On 07/06/2020 17:43, Ken Krugler wrote: Hi Chesnay, On Jun 19, 2019, at 6:05 AM, Chesnay Schepler <mailto:ches...@apache.org>> wrote: A (Rich)SourceFunction

Re: Run command after Batch is finished

2020-06-08 Thread Chesnay Schepler
This goes in the right direction; have a look at org.apache.flink.api.common.io.FinalizeOnMaster, which an OutputFormat can implement to run something on the Master after all subtasks have been closed. On 08/06/2020 17:25, Andrey Zagrebin wrote: Hi Mark, I do not know how you output the

Re: Window Function use case;

2020-06-04 Thread Chesnay Schepler
If your input data already contains both the SensorID and FactoryID, why would the following not be sufficient? DataStream sensorEvents = ...; sensorEvents .filter(sensorEvent -> sensorEvent.Status.equals("alerte")) .map(sensorEvent -> sensorEvent.FactoryID) .addSink() If the problem is that

Re: Timer metric in Flink

2020-06-11 Thread Chesnay Schepler
. Is there a plan or JIRA ticket to add Timer metric in future release, I think it is good to have Regards, Vinay Patil On Wed, Jun 10, 2020 at 5:55 PM Chesnay Schepler <ches...@apache.org> wrote: You cannot add custom metric types, just implementations of the existing ones. Y

Re: REST API randomly returns Not Found for an existing job

2020-07-24 Thread Chesnay Schepler
How reproducible is this problem / how often does it occur? How is the cluster deployed? Is anything else happening to the cluster around that time (like a JobMaster failure)? On 24/07/2020 13:28, Tomasz Dudziak wrote: Hi, I have come across an issue related to GET /job/:jobId endpoint

Re: How to start flink standalone session on windows ?

2020-07-24 Thread Chesnay Schepler
Flink no longer runs natively on Windows; you will have to use some unix-like environment like WSL or cygwin. On 24/07/2020 04:27, wangl...@geekplus.com.cn wrote: There's no start-cluster.bat and flink.bat in bin directory. So how can I start Flink on Windows OS? Thanks, Lei

Re: Using md5 hash while sinking files to s3

2020-07-16 Thread Chesnay Schepler
Please try configuring : fs.s3a.etag.checksum.enabled: true On 16/07/2020 03:11, nikita Balakrishnan wrote: Hello team, I’m developing a system where we are trying to sink to an immutable s3 bucket. This bucket has server side encryption set as KMS. The DataStream sink works perfectly fine

Re: Using md5 hash while sinking files to s3

2020-07-16 Thread Chesnay Schepler
hash value manually while sinking? The fs.s3a.etag.checksum.enabled: true will do it for me? And Do I need to specify anywhere that we have to use md5 hashing? On Thu, Jul 16, 2020 at 12:04 AM Chesnay Schepler <ches...@apache.org> wrote: Please try co

Re: backup configuration in Flink doc

2020-07-16 Thread Chesnay Schepler
They should be public yes; I do not know what the "Backup" category is supposed to mean, and I suspect this was a WIP title. On 16/07/2020 18:01, Steven Wu wrote: The configuration page has this "backup" section. Can I assume that they are public interfaces? The name "backup" is a little

Re: Question on Pattern Matching

2020-07-16 Thread Chesnay Schepler
Have you read this part of the documentation? From what I understand, it provides you hooks for processing matched/timed out patterns. On 16/07/2020 20:23, Basanth Gowda

Re: Count of records coming into the Flink App from KDS and at each step through Execution Graph

2020-07-29 Thread Chesnay Schepler
I'd recommend doing the aggregation over 5 seconds in graphite/prometheus etc., and expose a counter in Flink for each attribute/event_name. User variables are a good choice for encoding the attribute/event_name values. As for your remaining questions: Flink does not support aggregating

Re: Making sense of Meter metrics graph on Grafana

2020-07-29 Thread Chesnay Schepler
Yes; a rate of 1 means that 1 event occurred per second, which in your case means one call to markEvent() per second. Note that the default Meter implementation calculates the rate per second over the last minute (basically, rate(T) = (count(T) - count(T-60)) / 60; so short spikes tend to be
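
The formula quoted above, rate(T) = (count(T) - count(T-60)) / 60, can be sketched in plain Java as follows. This is an illustrative model and not Flink's actual Meter implementation: the class name `MinuteRate` and the `tick()` method (assumed to be called once per second by some scheduler) are hypothetical.

```java
// Hypothetical model of a per-second rate averaged over the last minute.
final class MinuteRate {
    private static final int WINDOW_SECONDS = 60;

    // Ring buffer of cumulative counts, one sample per second.
    // 61 slots are needed so that both count(T) and count(T-60) are available.
    private final long[] samples = new long[WINDOW_SECONDS + 1];
    private int index = 0;  // slot holding the most recent sample, count(T)
    private long count = 0; // cumulative number of events seen so far

    // Record one event (the analogue of markEvent() in the reply above).
    void markEvent() {
        count++;
    }

    // Take a snapshot of the cumulative count; assumed to run once per second.
    void tick() {
        index = (index + 1) % samples.length;
        samples[index] = count;
    }

    // rate(T) = (count(T) - count(T-60)) / 60
    double getRate() {
        long newest = samples[index];
        long oldest = samples[(index + 1) % samples.length]; // sample taken 60 ticks ago
        return (double) (newest - oldest) / WINDOW_SECONDS;
    }
}
```

Under this model a steady stream of one event per second and a single burst of 60 events inside one second both report a rate of 1 for as long as the burst stays inside the 60-second window, which illustrates how the per-minute averaging smooths out short spikes.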
