Re: Exception when calculating throughputEMA in 1.14.3

2022-08-22 Thread Chesnay Schepler
Since 1.14.6 has not been released yet your best bet is to either disable the debloating feature, upgrade to 1.15, or build Flink yourself. On 23/08/2022 08:48, Chesnay Schepler wrote: https://issues.apache.org/jira/browse/FLINK-25454 On 23/08/2022 04:54, Liting Liu (litiliu) wrote: Hi, we

[NOTICE] Switch docker image base to Eclipse Temurin

2022-09-05 Thread Chesnay Schepler
On Wednesday, September 9th, the Flink 1.14.5/1.15.2 Docker images will switch bases FROM openjdk:8/11-jar (Debian-based) TO eclipse-temurin:8/11-jre-jammy (Ubuntu-based) due to the deprecation of the OpenJDK images. Users that customized the images are advised to check for breaking changes.

Re: [NOTICE] Switch docker image base to Eclipse Temurin

2022-09-05 Thread Chesnay Schepler
* September 7th On 05/09/2022 11:27, Chesnay Schepler wrote: On Wednesday, September 9th, the Flink 1.14.5/1.15.2 Docker images will switch bases FROM openjdk:8/11-jar (Debian-based) TO eclipse-temurin:8/11-jre-jammy (Ubuntu-based) due to the deprecation of the OpenJDK images. Users that

Re: New licensing for Akka

2022-09-07 Thread Chesnay Schepler
We'll have to look into it. The license would apply to usages of Flink. That said, I'm not sure if we'd even be allowed to use Akka under that license since it puts significant restrictions on the use of the software. If that is the case, then it's either use a fork created by another party or

Re: New licensing for Akka

2022-09-07 Thread Chesnay Schepler
Just to squash concerns, we will make sure this license change will not affect Flink users in any way. On 07/09/2022 11:14, Robin Cassan via user wrote: Hi all! It seems Akka have announced a licensing change https://www.lightbend.com/blog/why-we-are-changing-the-license-for-akka If I understa

Re: Slow Tests in Flink 1.15

2022-09-07 Thread Chesnay Schepler
The test that gotten slow; how many test cases does it actually contain / how many jobs does it actually run? Are these tests using the table/sql API? On 07/09/2022 14:15, Alexey Trenikhun wrote: We are also observing extreme slow down (5+ minutes vs 15 seconds) in 1 of 2 integration tests . Bo

Re: Cassandra sink with Flink 1.15

2022-09-07 Thread Chesnay Schepler
Are you running into this in the IDE, or when submitting the job to a Flink cluster? If it is the first, then you're probably affected by the Scala-free Flink efforts. Either add an explicit dependency on flink-streaming-scala or migrate to Flink tuples. On 07/09/2022 14:17, Lars Skjærven wr

Re: [NOTICE] Switch docker image base to Eclipse Temurin

2022-09-08 Thread Chesnay Schepler
2022 at 1:21 PM Chesnay Schepler wrote: * September 7th On 05/09/2022 11:27, Chesnay Schepler wrote: > On Wednesday, September 9th, the Flink 1.14.5/1.15.2 Docker images > will switch bases > > FROM openjdk:8/11-jar (Debian-based) > TO eclipse-temur

[NOTICE] Blog post regarding Akka's licensing change

2022-09-08 Thread Chesnay Schepler
Hello, You may have heard about a recent change to the licensing of Akka. We just published a blog-post regarding this change and what it means for Flink. https://flink.apache.org/news/2022/09/08/akka-license-change.html TL;DR: Flink is not in any immediate danger and we will ensure that use

Re: Fail to build Flink 1.15.1

2022-09-09 Thread Chesnay Schepler
hmm...we've only seen that error in older Flink version: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/flinkdev/ide_setup/#compilation-fails-with-cannot-find-symbol-symbol-method-defineclass-location-class-sunmiscunsafe Please double-check whether you actually checked out 1.15.

Re: New licensing for Akka

2022-09-09 Thread Chesnay Schepler
7, 2022 at 4:30 PM Robin Cassan via user wrote: Thanks a lot for your answers, this is reassuring! Cheers Le mer. 7 sept. 2022 à 13:12, Chesnay Schepler a écrit : Just to squash concerns, we will make sure this license change will not affect Flink

Re: Fail to build Flink 1.15.1

2022-09-12 Thread Chesnay Schepler
PM, Chesnay Schepler wrote: hmm...we've only seen that error in older Flink version: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/flinkdev/ide_setup/#compilation-fails-with-cannot-find-symbol-symbol-method-defineclass-location-class-sunmiscunsafe Please double-check wh

Re: How to read flink state data without setting uid?

2022-09-22 Thread Chesnay Schepler
Currently the state processor API does not support that. On 22/09/2022 11:02, BIGO wrote: I didn't set the uid for my flink operator, is there any way to read the flink state data? State Processor API requires uid. Thanks.

Re: How to read flink state data without setting uid?

2022-09-22 Thread Chesnay Schepler
You will need to reload the savepoint with the original job and add uids to all operators (while also setting the uid hashes on all operators to properly restore the state). On 22/09/2022 11:06, Chesnay Schepler wrote: Currently the state processor API does not support that. On 22/09/2022 11

Re: [DISCUSS] FLIP-265 Deprecate and remove Scala API support

2022-10-05 Thread Chesnay Schepler
> It's possible that for the sake of the Scala API, we would occasionally require some changes in the Java API. As long as those changes are not detrimental to Java users, they should be considered. That's exactly the model we're trying to get to. Don't fix scala-specific issues with scala cod

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-10 Thread Chesnay Schepler
On 10/10/2022 11:24, Martijn Visser wrote: Sidenote: metric names are not mentioned in the FLIP process as a public API. Might make sense to have a separate follow-up to add that to the list (I do think we should list them there). That's a general issue we have. There's a lot of things we _ us

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-10 Thread Chesnay Schepler
> I’m with Xintong’s idea to treat numXXXSend as an alias of numXXXOut But that's not possible. If it were that simple there would have never been a need to introduce another metric in the first place. It's a rather fundamental issue with how the new sinks work, in that they emit data to the

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-11 Thread Chesnay Schepler
ht be against consistency with metrics in other operators in Flink but maybe it’s acceptable to have the sink as a special case. Best, Qingsheng On Oct 10, 2022, 19:13 +0800, Chesnay Schepler , wrote: > I’m with Xintong’s idea to treat numXXXSend as an alias of numXXXOut But that's not

Re: Flink falls back on to kryo serializer for GenericTypes

2022-10-12 Thread Chesnay Schepler
There's no alternative to Kryo for generic types, apart from implementing your Flink serializer (but technically at that point the type is no longer treated as a generic type). enableForAvro only forces Avro to be used for POJO types. On 11/10/2022 09:29, Sucheth S wrote: Hello, How to avoid

Re: [DISCUSS] FLIP-265 Deprecate and remove Scala API support

2022-10-13 Thread Chesnay Schepler
Support for records has not been investigated yet. We're still at the stage of getting things to run at all on Java 17. It _may_ be possible, it _may_ not be. On 13/10/2022 07:39, Salva Alcántara wrote: Hi Martijn, Maybe a bit of an off-topic, but regarding Java 17 support, will it be possib

Re: flink-table-api-scala-bridge sources

2022-10-31 Thread Chesnay Schepler
Thanks for reporting the issue; I've filed a ticket (FLINK-29803). On 25/10/2022 09:40, Clemens Valiente wrote: Hi everyone I noticed when going through the scala datastream/table api bridge in my IDE I cannot see the source of the code. I believe it is because the Sources are missing on mave

Re: [Security] - Critical OpenSSL Vulnerability

2022-11-01 Thread Chesnay Schepler
We just push new images with the same tags. On 01/11/2022 14:35, Matthias Pohl wrote: The Docker image for Flink 1.12.7 uses an older base image which comes with openssl 1.1.1k. There was a previous post in the OpenSSL mailing list reporting a low vulnerability being fixed with 3.0.6 and 1.1.1r

Re: Kinesis Connector does not work

2022-11-08 Thread Chesnay Schepler
Said dependency (on commons-logging) is not meant to be provided by the docker image, but bundled in your user-jar (along with the connector). On 08/11/2022 02:14, Matt Fysh wrote: Hi, I'm following the kinesis connector instructions as documented here: https://nightlies.apache.org/flink/flink

Re: Kinesis Connector does not work

2022-11-08 Thread Chesnay Schepler
Or if this is something more general, it might be something to talk about in the Python section of the docs because most Python users are not going to understand the interplay between Java classes. On Tue, 8 Nov 2022 at 18:50, Chesnay Schepler wrote: Said dependency (on commons-logging) is

[ACCOUNCE] Apache Flink Elasticsearch Connector 3.0.0 released

2022-11-10 Thread Chesnay Schepler
The Apache Flink community is very happy to announce the release of Apache Flink Elasticsearch Connector 3.0.0. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications. The release is available f

Re: Query about flink job manager dashboard

2022-11-30 Thread Chesnay Schepler
There's no way to disable the jar submission in the UI but have it still work via the REST API. On 30/11/2022 06:16, naga sudhakar wrote: After disabling the cancel, submit flags facing issues with below api calls. 1) /jars giving 404 2) /jars/upload 3) /jars/{jarid}/run Is there any config

Re: Query about flink job manager dashboard

2022-11-30 Thread Chesnay Schepler
te: Hi Chesnay, I have a similar question on this topic. Is there an option to disable the frontend altogether but still use REST APIs? Thanks On Wed, Nov 30, 2022 at 1:37 AM Chesnay Schepler wrote: There's no way to disable the jar submission in the UI but have it still w

[ANNOUNCE] Apache flink-connector-cassandra 3.0.0 released

2022-12-06 Thread Chesnay Schepler
|The Apache Flink community is very happy to announce the release of Apache flink-connector-cassandra 3.0.0. | |Apache Flink® is an open-source stream processing framework ||for| |distributed, high-performing, always-available, and accurate data streaming applications.| |The release is availabl

Re: Flink reactive mode for application clusters on AWS EKS

2023-01-12 Thread Chesnay Schepler
The adaptive scheduler and reactive mode both already support application clusters since 1.13. https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/elastic_scaling/ On 19/12/2022 10:17, Tamir Sagi wrote: Hey, We are running stream jobs on application clusters (v1.15.2) o

Re: Using filesystem plugin with MiniCluster

2023-01-12 Thread Chesnay Schepler
There is no good way in 1.15 IIRC. Adding a dependency on flink-s3-fs-hadoop _can_ work, if you dont run into dependency conflicts. Otherwise you have to create a plugin manager yourself, point it to some local directory via a system property (I think?), and then eagerly call FileSystem#init

Re: Async IO & Retry: how to get error details

2023-01-12 Thread Chesnay Schepler
Retry logic and per-request timeouts should be setup within asyncInvoke() (with all error-handling being done via plain CompletableFuture logic), with timeout() sort of acting as a global timeout after which you want the job to fail (e.g., to guard against mistakes in the asyncInvoke() logic).

Re: Metrics exposed by flink containing long label values

2023-01-12 Thread Chesnay Schepler
See https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/metric_reporters/#scope-variables-excludes On 06/01/2023 12:09, Surendra Lalwani via user wrote: Hi Team, We are exposing metrics by Flink to prometheus but we are seeing it contains various labels even including o

Re: Problem with custom SerializationSchema in Flink 1.15

2023-01-24 Thread Chesnay Schepler
It's a known issue that various connectors/wrappers/etc did not respect the schema lifecycle. This was fixed in 1.16.0 in https://issues.apache.org/jira/browse/FLINK-28807. You will have to lazily initialize the mapper in the serialize() method for previous versions. On 24/01/2023 11:52, P

Re: Docker image Flink 1.15.4

2023-01-26 Thread Chesnay Schepler
1.15.4 is not released yet. On 26/01/2023 16:06, Peng Zhang wrote: Hi, We would like to use Flink 1.15.4 docker image. The latest seems 1.15.3. Could you make a docker release Flink 1.1.5.4? Thanks! There is a blocking bug https://issues.apache.org/jira/browse/FLINK-28695

Re: Kryo version 2.24.0 from 2015?

2023-02-09 Thread Chesnay Schepler
> you can't reasonably stay on the 2015 version forever, refuse to adopt any of the updates or fixes in the 8 years since then, and reasonably expect things to continue to work well. We are well aware that Kryo is a ticking time bomb. > Is there any possibility a future release of Flink can up

Re: Kryo version 2.24.0 from 2015?

2023-02-09 Thread Chesnay Schepler
m 2013, before Java 8. Versions older than 3.x aren't supposed to compile nor run correctly under Java 11+. On Thu, Feb 9, 2023 at 2:34 AM Chesnay Schepler wrote:  > you can't reasonably stay on the 2015 version forever, refuse to adopt any of the updates or fixes in t

Re: Disable the chain of the Sink operator

2023-02-16 Thread Chesnay Schepler
As far as I know that chain between committer and writer is also required for correctness. On 16/02/2023 14:53, weijie guo wrote: Hi wu, I don't think it is a good choice to directly change the strategy of chain. Operator chain usually has better performance and resource utilization. If we d

Re: [Discussion] - Release major Flink version to support JDK 17 (LTS)

2023-03-30 Thread Chesnay Schepler
https://github.com/EsotericSoftware/kryo/wiki/Migration-to-v5#migration-guide Kroy themselves state that v5 likely can't read v2 data. However, both versions can be on the classpath without classpath as v5 offers a versioned artifact that includes the version in the package. It probably would

Re: [Discussion] - Take findify/flink-scala-api under Flink umbrella

2023-04-17 Thread Chesnay Schepler
> they require additional hop to serialize Scala objects This doesn't necessarily mean that we need a Scala API, because a beefed up type extraction could also solve this. > This single committer is now with us and ready to maintain it in open source. The best situation to be :-) Have you c

Re: [Discussion] - Release major Flink version to support JDK 17 (LTS)

2023-04-24 Thread Chesnay Schepler
As it turns out Kryo isn't a blocker; we ran into a JDK bug. On 31/03/2023 08:57, Chesnay Schepler wrote: https://github.com/EsotericSoftware/kryo/wiki/Migration-to-v5#migration-guide Kroy themselves state that v5 likely can't read v2 data. However, both versions can be on the

Re: Can I setup standby taskmanagers while using reactive mode?

2023-04-26 Thread Chesnay Schepler
Reactive mode doesn't support standby taskmanagers. As you said it always uses all available resources in the cluster. I can see it being useful though to not always scale to MAX but (MAX - some_offset). I'd suggest to file a ticket. On 26/04/2023 00:17, Wei Hou via user wrote: Hi Flink com

Re: [Discussion] - Release major Flink version to support JDK 17 (LTS)

2023-04-28 Thread Chesnay Schepler
> -- >> *From:* Jing Ge via user >> *Sent:* Monday, April 24, 2023 11:15 PM >> *To:* Chesnay Schepler >> *Cc:* Piotr Nowojski ; Alexis Sarda-Espinosa < >> sarda.espin...@gmail.com>; Martijn Visser

Re: StatsdMetricsReporter is emitting all metric types as gauges

2023-05-12 Thread Chesnay Schepler
nit: Whether the Flink counter abstraction is cumulative or not is irrelevant w.r.t. what we send to metric backends. For example, for Datadog we do send count diffs instead of the current total count. On 10/05/2023 09:33, Hang Ruan wrote: Hi, Iris, The Flink counter is cumulative. There ar

Re: [DISCUSS] Status of Statefun Project

2023-06-06 Thread Chesnay Schepler
If you were to fork it /and want to redistribute it/ then the short version is that 1. you have to adhere to the Apache licensing requirements 2. you have to make it clear that your fork does not belong to the Apache Flink project. (Trademarks and all that) Neither should be significant hurd

Re: Custom keyBy(), look for similaties

2016-06-08 Thread Chesnay Schepler
the idea behind key-selectors is to extract a property on which you can to equality comparisons. let's get one question out of the way first: is your scoring algorithm transitive? as in if A==B and B==C, is it a given that A==C? because if not, there's just no way to group(=partition) the data,

Re: Does Flink allows for encapsulation of transformations?

2016-06-10 Thread Chesnay Schepler
MapFunction() { *Long N = NumIter.collect().get(0);* @Override public Double map(Long arg0) throws Exception { return arg0 *4.0/N; }}); }} Thanks a lot for your time. Ser On Tuesday, June 7, 2016 8:14 AM, Chesnay Schepler wrote: 1a. ah. yeah i see how it could work, but i wouldn't co

Re: Getting the NumberOfParallelSubtask

2016-06-20 Thread Chesnay Schepler
Within the mapper you cannot access the parallelism of the following nor preceding operation. On 20.06.2016 15:56, Paschek, Robert wrote: Hi Mailing list, using a RichMapPartitionFunction i can access the total number m of this mapper utilized in my job with int m = getRuntimeContext().getNum

Re: State key serializer has not been configured in the config.

2016-06-23 Thread Chesnay Schepler
We should adjust the error message to contain the keyed stream thingy. On 23.06.2016 10:11, Till Rohrmann wrote: Hi Jacob, the `ListState` abstraction is a state which we call partitioned/key-value state. As such, it is only possible to use it with a keyed stream. This means that you have to

Re: Cassandra Connector Problem (Possible Guava Conflict?)

2016-06-26 Thread Chesnay Schepler
The problem is that the cassandra jar currently contains 2 shaded guavas. I already have a fix ready that suppressed the root-poms shade plugin configuration inside the cassandra submit. I will submit that next week. On 26.06.2016 17:46, Eamon Kavanagh wrote: Hey everyone, I'm having an issu

Re: Cassandra Connector Problem (Possible Guava Conflict?)

2016-06-26 Thread Chesnay Schepler
, Jun 26, 2016 at 11:55 AM, Chesnay Schepler <mailto:ches...@apache.org>> wrote: The problem is that the cassandra jar currently contains 2 shaded guavas. I already have a fix ready that suppressed the root-poms shade plugin configuration inside the cassandra submit.

Re: JDBC sink in flink

2016-07-05 Thread Chesnay Schepler
Hello, an instance of the JDBCOutputFormat will use a single connection to send all values. Essentially - open(...) is called at the very start to create the connection - then all invoke/writeRecord calls are executed (using the same connection) - then close() is called to clean up. The total

Re: JDBC sink in flink

2016-07-05 Thread Chesnay Schepler
that makes send. Also what's the difference between a RichOutputFormat and a RichSinkFunction ? Can I use JDBCOutputFormat as a sink in a stream ? On Tue, Jul 5, 2016 at 3:53 PM, Chesnay Schepler <mailto:ches...@apache.org>> wrote: Hello, an instance of the JDBCOutputFor

Re: Accessing external DB inside RichFlatMap Function

2016-07-07 Thread Chesnay Schepler
Couldn't he do the same thing in his RichFlatMap? open the db connection in open(), close it in close(), do stuff within these calls. On 07.07.2016 10:58, Kostas Kloudas wrote: Hi Simon, If your job reads or writes to a DB, I would suggest to use one of the already existing Flink sources or

Re: Issue with running Flink Python jobs on cluster

2016-07-13 Thread Chesnay Schepler
Hello Geoffrey, How often does this occur? Flink distributes the user-code and the python library using the Distributed Cache. Either the file is deleted right after being created for some reason, or the DC returns a file name before the file was created (which shouldn't happen, it should b

Re: Writing in flink clusters

2016-07-13 Thread Chesnay Schepler
Hello, Is that the complete error message? I'm a bit surprised it does not explicitly name any file name. If it really doesn't we should change that. Regards, Chesnay Schepler On 13.07.2016 15:35, Alexis Gendronneau wrote: Hi Roy, Have you looked on the nodes in charge of sink t

Re: Issue with running Flink Python jobs on cluster

2016-07-15 Thread Chesnay Schepler
the problem. Thanks, Geoffrey On Wed, Jul 13, 2016 at 6:12 AM Chesnay Schepler mailto:ches...@apache.org>> wrote: Hello Geoffrey, How often does this occur? Flink distributes the user-code and the python library using the Distributed Cache.

Re: Issue with running Flink Python jobs on cluster

2016-07-17 Thread Chesnay Schepler
cache. Cheers, Geoffrey On Fri, Jul 15, 2016 at 4:15 AM Chesnay Schepler mailto:ches...@apache.org>> wrote: Could you write a java job that uses the Distributed cache to distribute files? If this fails then the DC is faulty, if it doesn't somethin

Re: Issue with running Flink Python jobs on cluster

2016-07-17 Thread Chesnay Schepler
Geoffrey On Fri, Jul 15, 2016 at 4:15 AM Chesnay Schepler mailto:ches...@apache.org>> wrote: Could you write a java job that uses the Distributed cache to distribute files? If this fails then the DC is faulty, if it doesn't something in th

Re: Issue with running Flink Python jobs on cluster

2016-07-17 Thread Chesnay Schepler
ration? Unfortunately, I am relying on using a Flink cluster to run a Python job for some scientific data that needs to be completed soon. Thank for your assistance, Geoffrey On Sun, Jul 17, 2016 at 4:04 AM Chesnay Schepler <mailto:ches...@apache.org>> wrote: Please also po

Re:

2016-07-17 Thread Chesnay Schepler
Hello Chen, you can access the set configuration in your rich function like this: |public static final class Tokenizer extends RichFlatMapFunctionTuple2> { @Override public void flatMap(String value, Collector> out) { ParameterTool parameters = (ParameterTool) getRuntimeContext().getExecution

Re: Issue with running Flink Python jobs on cluster

2016-07-19 Thread Chesnay Schepler
2016 at 11:58 AM Chesnay Schepler <mailto:ches...@apache.org>> wrote: well now i know what the problem could be. You are trying to execute a job on a cluster (== not local), but have set the local flag to true. env.execute(local=True) Due to this flag the files a

Re: Error joining with Python API

2016-08-16 Thread Chesnay Schepler
looks like a bug, will look into it. :) On 16.08.2016 10:29, Ufuk Celebi wrote: I think that this is actually a bug in Flink. I'm cc'ing Chesnay who originally contributed the Python API. He can probably tell whether this is a bug in the Python API or Flink ioperator side of things. ;) On Mon,

Re: Error joining with Python API

2016-08-17 Thread Chesnay Schepler
Found the issue, there was a missing tab in the chaining method... On 16.08.2016 12:12, Chesnay Schepler wrote: looks like a bug, will look into it. :) On 16.08.2016 10:29, Ufuk Celebi wrote: I think that this is actually a bug in Flink. I'm cc'ing Chesnay who originally contr

Re: Apache siddhi into Flink

2016-08-29 Thread Chesnay Schepler
Hello Aparup, could you provide more information about Siddhi? How mature is it; how is the community? How does it compare to the Flink's CEP library? How should this integration look like? Are you proposing to replace the current CEP library, or will they co-exist with different use-cases fo

Re: Flink JMX

2016-08-29 Thread Chesnay Schepler
Hello, can you post the jmx config entries and give us more details on how you want to access it? Regards, Chesnay On 29.08.2016 12:09, Sreejith S wrote: Hi All, I am using Flink-1.1.1 and i enabled JMX metrics in configuration file. In the task manger log i can see JMX is running. Is th

Re: Flink JMX

2016-08-29 Thread Chesnay Schepler
: 8080-8082 But no hope. Am i miss anything ? Thank You On Mon, Aug 29, 2016 at 4:36 PM, Chesnay Schepler mailto:ches...@apache.org>> wrote: Hello, can you post the jmx config entries and give us more details on how you want to access it? Reg

Re: RichMapFunction in DataStream, how do I set the parameters received in open?

2016-09-12 Thread Chesnay Schepler
Hello, you cannot pass a configuration in the Steaming API. This way of configuration is more of a relic from past times. The common way to pass configure a function is to pass the parameters through the constructor and store the values in a field. Regards, Chesnay On 12.09.2016 18:27, Lui

Re: Flink JDBC JDBCOutputFormat Open

2016-09-12 Thread Chesnay Schepler
Hello, the JDBC Sink completely ignores the taskNumber and parallelism. Regards, Chesnay On 12.09.2016 08:41, Swapnil Chougule wrote: Hi Team, I want to know how tasknumber & numtasks help in opening db connection in Flink JDBC JDBCOutputFormat Open. I checked with docs where it says:

Re: Simple batch job hangs if run twice

2016-09-19 Thread Chesnay Schepler
No, I can't recall that i had this happen to me. I would enable logging and try again, as well as checking whether the second job is actually running through the WebInterface. If you tell me your NetBeans version i can try to reproduce it. Also, which version of Flink are you using? On 19.09

Re: Custom(application) Metrics - Piggyback on Flink's metrics infra or not?

2016-09-20 Thread Chesnay Schepler
Hello Eswar, as far as I'm aware the general structure of the Flink's metric system is rather similar to DropWizard. You can use DropWizard metrics by creating a simple wrapper, we even ship one for Histograms. Furthermore, you can also use DropWizard reporters, you only have to extend the Dr

Re: Flink JDBCOutputFormat logs wrong WARN message

2016-09-20 Thread Chesnay Schepler
I would agree that the condition should be changed. On 20.09.2016 10:52, Swapnil Chougule wrote: I checked following code in Flink JDBCOutputFormat while I was using in my project work. I found following snippet: @Override public void writeRecord(Row row) throws IOException {

Re: Custom(application) Metrics - Piggyback on Flink's metrics infra or not?

2016-09-22 Thread Chesnay Schepler
mit <mailto:sumitkcha...@gmail.com>> wrote: In addition, It supports enabling multiple Reporters. You can have same data pushed to multiple systems. Plus its very easy to write new reporter for doing any customization. Regards Sumit Chawla On Tue, Sep 20, 201

Re: Using Flink and Cassandra with Scala

2016-09-29 Thread Chesnay Schepler
the cassandra sink only supports java tuples and POJO's. On 29.09.2016 16:33, Sanne de Roever wrote: Hi, Does the Cassandra sink support Scala and case classes? It looks like using Java is at the moment best practice. Cheers, Sanne

Re: Task and Operator Monitoring via JMX / naming

2016-10-15 Thread Chesnay Schepler
Hello Philipp, there is certainly something very wrong here. What you _should_ see is 6 entries, 1 for each operator; 2-3 more for the tasks the operators are executed in and the taskmanager stuff. Usually, operator metrics use the name that you configured, like "TokenMapStream", whereas tas

Re: Task and Operator Monitoring via JMX / naming

2016-10-15 Thread Chesnay Schepler
Hello Philipp, the relevant names are stored in the OperatorMetricGroup/TaskMetricGroup classes in flink-runtime. The name for a task is extracted directly from the TaskDeploymentDescriptor in TaskManagerJobMetricGroup#addTask(). The name for a streaming operator that the metric system uses i

Re: Flink Metrics

2016-10-17 Thread Chesnay Schepler
Hello, we could also offer a small utility method that creates 3 flink meters, each reporting one rate of a DW meter. Timers weren't added yet since, as Till said, no one requested them yet and we haven't found a proper internal use-case for them Regards, Chesnay On 17.10.2016 09:52, Till

Re: Flink and factories?

2016-10-19 Thread Chesnay Schepler
Hello, admittedly i didn't look to deeply into this, but I would assume that you are only modifying the factory on the client. When the operators are deserialized on a cluster, your singleton instance is back to the default, which is apples (i think), since the statement that changes the fact

Re: Flink and factories?

2016-10-19 Thread Chesnay Schepler
The functions are serialized when env.execute() is being executed. The thing is, as i understand it, that your singleton is simply not part of the serialized function, so it doesn't actually matter when the function is serialized. Storing the factory instance in the function shouldn't be too m

Re: Task and Operator Monitoring via JMX / naming

2016-10-20 Thread Chesnay Schepler
This is completely unintended behavior; you should never have to adjust your topology so the metric system get's the names right. I'll take a deep look into this tomorrow ;) Regards, Chesnay On 20.10.2016 08:50, Philipp Bussche wrote: Some further observations: I had a Job which was taking ev

Re: org.apache.flink.core.fs.Path error?

2016-10-20 Thread Chesnay Schepler
Hello Radu, Flink can handle windows paths, this alone can't be the problem. If you could post the error you are getting we may pinpoint the issue, but right now i would suggest the usual: check that the path is indeed correct, that you have sufficient permissions to access the file. And yes,

Re: Task and Operator Monitoring via JMX / naming

2016-10-20 Thread Chesnay Schepler
Well the issue is the following: the metric system assumes the following naming scheme for tasks based on the DataSet API and simple streaming jobs: [CHAIN] operatorName1 [=> operatorName2 [ ...]] To retrieve the operator name the above is split by "=>", giving us a String[] of all operator na

Re: org.apache.flink.core.fs.Path error?

2016-10-20 Thread Chesnay Schepler
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(_StreamTask.java:266_) at org.apache.flink.runtime.taskmanager.Task.run(_Task.java:584_) at java.lang.Thread.run(_Thread.java:745_) *From:*Chesnay Schepler [mailto:ches...@apache.org <mailto:ches...@ap

Re: org.apache.flink.core.fs.Path error?

2016-10-20 Thread Chesnay Schepler
k.runtime.taskmanager.Task.run(_Task.java:584_) at java.lang.Thread.run(_Thread.java:745_) *From:*Chesnay Schepler [mailto:ches...@apache.org] *Sent:* Thursday, October 20, 2016 2:22 PM *To:* user@flink.apache.org *Subject:* Re: org.apache.flink.core.fs.Path error? Hello Radu, Flink can ha

Re: org.apache.flink.core.fs.Path error?

2016-10-20 Thread Chesnay Schepler
x the issue as it is in general more practical to specify paths in the form of D:\\dir\\myfile.csv … mainly as it can be understood also by other file readers outside flink *From:*Chesnay Schepler [mailto:ches...@apache.org] *Sent:* Thursday, October 20, 2016 4:06 PM *To:

Re: Task and Operator Monitoring via JMX / naming

2016-10-20 Thread Chesnay Schepler
It will be in the master tomorrow. On 20.10.2016 18:50, Philipp Bussche wrote: Thanks Chesnay ! I am not too familiar with the release cycles here but was wondering when one could expect your fix to be in the master of Flink ? Should I create my own build for the moment maybe ? Thanks. -- V

Re: Can we do batch writes on cassandra using flink while leveraging the locality?

2016-11-01 Thread Chesnay Schepler
Hello, the main issue that prevented us from writing batches is that there is a server-side limit as to how big a batch may be, however there was no way to tell how big the batch that you are currently building up actually is. Regarding locality, I'm not sure if a partitioner alone solves thi

Re: Release Process

2016-11-04 Thread Chesnay Schepler
Hello, Every contribution to the master branch will be released as part of the next minor version, in your case this would be 1.2. We are currently aiming for a release in December. In between minor versions several bug-fix versions are released (1.1.1, 1.1.2 etc.). For these the community pi

Re: Tame Flink UI?

2016-11-16 Thread Chesnay Schepler
Hello, The WebInterfaces first pulls a list of all available metrics for a specific taskmanager/job/task (which is reasonable since how else would you select them), and then requests the values for all metrics by supplying the name of every single metric it just received, which is where things

Re: Cassandra Connector

2016-11-22 Thread Chesnay Schepler
Hello, the CassandraSink is not implemented as a sink but as a special operator, so you wouldn't be able to use the addSink() method. (I can't remember the actual method being used.) There are also several different implementations for various types (tuples, pojo's, scala case classes) but we

Re: Cassandra Connector

2016-11-22 Thread Chesnay Schepler
Actually this is a bit inaccurate. _Some_ implementations are not implemented as a sink. Also, you can in fact instantiate the sinks yourself as well, as in readings.addSink(new CassandraTupleSink(, ); On 22.11.2016 09:30, Chesnay Schepler wrote: Hello, the CassandraSink is not

Re: Many operations cause StackOverflowError with AWS EMR YARN cluster

2016-11-23 Thread Chesnay Schepler
Hello, implementing collect() in python is not that trivial and the gain is questionable. There is an inherent size limit (think 10mb), and it is a bit at odds with the deployment model of the Python API. Something easier would be to execute each iteration of the for-loop as a separate job an

Re: Tame Flink UI?

2016-11-23 Thread Chesnay Schepler
? For example, instead of enumerating all metrics, maybe ask for the range? On Wed, Nov 16, 2016 at 2:05 PM, Chesnay Schepler <mailto:ches...@apache.org>> wrote: Hello, The WebInterfaces first pulls a list of all available metrics for a specific taskmanager/job/task (which

Re: JVM Non Heap Memory

2016-12-05 Thread Chesnay Schepler
Hello Daniel, I'm afraid you stumbled upon a bug in Flink. Meters were not properly cleaned up, causing the underlying dropwizard meter update threads to not be shutdown either. I've opened a JIRA and will open a PR soon. Thank your for re

Re: JVM Non Heap Memory

2016-12-05 Thread Chesnay Schepler
issue is here: https://issues.apache.org/jira/browse/FLINK-5261 I added an issue to improve the documentation about cancellation (https://issues.apache.org/jira/browse/FLINK-5260). Which version of Flink are you using? Chesnay's fix will make it into the upcoming 1.1.4 release. On 5 December 2016 at 14:04:4

Re: JVM Non Heap Memory

2016-12-05 Thread Chesnay Schepler
ng version flink's 1.1.3. So it seems the fix of Meter's won't make it to 1.1.4 ? Best Regards, Daniel Santos On 12/05/2016 01:28 PM, Chesnay Schepler wrote: We don't have to include it in 1.1.4 since Meter's do not exist in 1.1; my bad for tagging it in

Re: Parallelism and stateful mapping with Flink

2016-12-08 Thread Chesnay Schepler
It would be neat if we could support arrays as keys directly; it should boil down to checking the key type and in case of an array injecting a KeySelector that calls Arrays.hashCode(array). This worked for me when i ran into the same issue while experimenting with some stuff. The batch API can

Re: conditional dataset output

2016-12-08 Thread Chesnay Schepler
Hello Lars, The only other way i can think of how this could be done is by wrapping the used outputformat in a custom format, which calls open on the wrapped outputformat when you receive the first record. This should work but is quite hacky though as it interferes with the format life-cycle

Re: Parallelism and stateful mapping with Flink

2016-12-08 Thread Chesnay Schepler
Done. https://issues.apache.org/jira/browse/FLINK-5299 On 08.12.2016 16:50, Ufuk Celebi wrote: Would you like to open an issue for this for starters Chesnay? Would be good to fix for the upcoming release even. On 8 December 2016 at 16:39:58, Chesnay Schepler (ches...@apache.org) wrote: It

Re: How to analyze space usage of Flink algorithms

2016-12-09 Thread Chesnay Schepler
We do not measure how much data we are spilling to disk. On 09.12.2016 14:43, Fabian Hueske wrote: Hi, the heap mem usage should be available via Flink's metrics system. Not sure if that also captures spilled data. Chesnay (in CC) should know that. If the spilled data is not available as a m

Re: How to retrieve values from yarn.taskmanager.env in a Job?

2016-12-12 Thread Chesnay Schepler
Hello, can you clarify one small thing for me: Do you want to access this parameter when you define the plan (aka when you call methods on the StreamExecutionEnvironment or DataStream instances) or from within your functions/operators? Regards, Chesnay Schepler On 12.12.2016 14:21, Till

<    5   6   7   8   9   10   11   12   13   >