Re: Slack Community Invite?

2024-06-12 Thread Ufuk Celebi
FYI This was recently fixed on the website, too. > On 28. May 2024, at 05:05, Alexandre Lemaire wrote: > > Thank you! > >> On May 27, 2024, at 9:35 PM, Junrui Lee wrote: >> >> Hi Alexandre, >> >> You can try this link: >> https://join.slack.com/t/apache-flink/shared_invite/zt-2jn2dlgoi-P4oQ

Re: Flink upgrade to Flink-1.12

2021-01-25 Thread Ufuk Celebi
Thanks for reaching out. Semi-asynchronous does *not* refer to incremental checkpoints and Savepoints are always triggered as full snapshots (not incremental). Earlier versions of the RocksDb state backend supported two snapshotting modes, fully and semi-asynchronous snapshots. Semi-asynchronou

Re: Flink Jobmanager HA deployment on k8s

2021-01-21 Thread Ufuk Celebi
@Yang: I think this would be valuable to document. I think it's a natural question to ask whether you can have standby JMs with Kubernetes. What do you think? If you agree, we could create a JIRA ticket and work on the "official" docs for this. On Thu, Jan 21, 2021, at 5:05 AM, Yang Wang wrote:

Re: Savepoint Location from Flink REST API

2020-04-02 Thread Ufuk Celebi
re > >> { >> "status": { >> "id": "completed" >> }, >> *"operation"*: { >> "failure-cause": { >> "class": "string", >> "stack-trace": "string"

Re: Deploying native Kubernetes via yaml files?

2020-03-24 Thread Ufuk Celebi
PS: See also https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/kubernetes.html#flink-job-cluster-on-kubernetes On Tue, Mar 24, 2020 at 2:49 PM Ufuk Celebi wrote: > Hey Niels, > > you can check out the README with example configuration files here: > https:

Re: Deploying native Kubernetes via yaml files?

2020-03-24 Thread Ufuk Celebi
Hey Niels, you can check out the README with example configuration files here: https://github.com/apache/flink/tree/master/flink-container/kubernetes Is that what you were looking for? Best, Ufuk On Tue, Mar 24, 2020 at 2:42 PM Niels Basjes wrote: > Hi, > > As clearly documented here > https

Re: Savepoint Location from Flink REST API

2020-03-20 Thread Ufuk Celebi
Hey Aaron, you can expect one of the two responses for COMPLETED savepoints [1, 2]. 1. Success { "status": { "id": "completed" }, "savepoint": { "location": "string" } } 2. Failure { "status": { "id": "completed" }, "savepoint": { "failure-cause": { "class":

Re: [DISCUSS] Drop Savepoint Compatibility with Flink 1.2

2020-02-22 Thread Ufuk Celebi
Hey Stephan, +1. Reading over the linked ticket and your description here, I think it makes a lot of sense to go ahead with this. Since it's possible to upgrade via intermediate Flink releases as a fail-safe I don't have any concerns. – Ufuk On Fri, Feb 21, 2020 at 4:34 PM Till Rohrmann wrote

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-19 Thread Ufuk Celebi
I'm late to the party... Welcome and congrats! :-) – Ufuk On Mon, Aug 19, 2019 at 9:26 AM Andrey Zagrebin wrote: > Hi Everybody! > > Thanks a lot for the warn welcome! > I am really happy about joining Flink committer team and hope to help the > project to grow more. > > Cheers, > Andrey > > O

Re: Flink 1.8.1: Seeing {"errors":["Not found."]} when trying to access the Jobmanagers web interface

2019-08-06 Thread Ufuk Celebi
Hey Tobias, out of curiosity: were you using the job/application cluster (as documented here: https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/deployment/docker.html#flink-job-cluster )? – Ufuk On Tue, Aug 6, 2019 at 1:50 PM Kaymak, Tobias wrote: > I was using Apache Beam and i

Re: No yarn option in self-built flink version

2019-06-12 Thread Ufuk Celebi
@Arnaud: Turns out those examples are on purpose. As Chesnay pointed out in the ticket, there are also cases where you don't necessarily want to bundle the Hadoop dependency, but still want to set a version like that. On Wed, Jun 12, 2019 at 9:32 AM Ufuk Celebi wrote: > I creat

Re: No yarn option in self-built flink version

2019-06-12 Thread Ufuk Celebi
e above paragraph… > > > > Arnaud > > > > *De :* Ufuk Celebi > *Envoyé :* vendredi 7 juin 2019 12:00 > *À :* LINZ, Arnaud > *Cc :* user ; ches...@apache.org > *Objet :* Re: No yarn option in self-built flink version > > > > Hey Arnaud, > > >

Re: No yarn option in self-built flink version

2019-06-07 Thread Ufuk Celebi
Hey Arnaud, I think you need to active the Hadoop profile via -Pinclude-hadoop (the default was changed to not include Hadoop as far as I know). For more details, check out: https://ci.apache.org/projects/flink/flink-docs-release-1.8/flinkDev/building.html#packaging-hadoop-into-the-flink-distribu

Re: QueryableState startup regression in 1.8.0 ( migration from 1.7.2 )

2019-04-29 Thread Ufuk Celebi
Actually, I couldn't even find a mention of this flag in the docs here: https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/state/queryable_state.html – Ufuk On Mon, Apr 29, 2019 at 8:45 AM Ufuk Celebi wrote: > I didn't find this as part of the > https://

Re: QueryableState startup regression in 1.8.0 ( migration from 1.7.2 )

2019-04-28 Thread Ufuk Celebi
I didn't find this as part of the https://flink.apache.org/news/2019/04/09/release-1.8.0.html notes. I think an update to the Important Changes section would be valuable for users upgrading to 1.8 from earlier releases. Also, logging that the library is on the classpath but the feature flag is set

Re: Flink: How to use zookeeper when deployed in k8s?

2019-04-01 Thread Ufuk Celebi
You can set jobmanager.rpc.address: jmServiceName high-availability.jobmanager.port: 6123 in flink-conf.yaml and expose the port in the JobManager service. – Ufuk On Mon, Apr 1, 2019 at 9:29 AM sora wrote: > > Hello, > I encountered a problem when deploying flink to k8s. > When high-availabili

Re: What are savepoint state manipulation support plans

2019-03-28 Thread Ufuk Celebi
-12047 > Let's move any further discussions there. > > Cheers, > Gordon > > On Thu, Mar 28, 2019 at 5:01 PM Ufuk Celebi wrote: >> >> I think such a tool would be really valuable to users. >> >> @Gordon: What do you think about creating an umbrella tick

Re: What are savepoint state manipulation support plans

2019-03-28 Thread Ufuk Celebi
I think such a tool would be really valuable to users. @Gordon: What do you think about creating an umbrella ticket for this and linking it in this thread? That way, it's easier to follow this effort. You could also link Bravo and Seth's tool in the ticket as starting points. – Ufuk

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-06 Thread Ufuk Celebi
I like Shaoxuan's idea to keep this a static site first. We could then iterate on this and make it a dynamic thing. Of course, if we have the resources in the community to quickly start with a dynamic site, I'm not apposed. – Ufuk On Wed, Mar 6, 2019 at 2:31 PM Robert Metzger wrote: > > Awesome!

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-29 Thread Ufuk Celebi
class WrappedHadoopInputStream(underlying: FlinkFSDataInputStream) > extends InputStream > with Seekable > with PositionedReadable { > > def read(): Int = underlying.read() > def seek(pos: Long): Unit = underlying.seek(pos) > def getPos: Long = underlying.getPo

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-28 Thread Ufuk Celebi
gt;>>> [2019-01-23 19:52:33.081882] at >>>>> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99) >>>>> [2019-01-23 19:52:33.081904] at >>>>> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-23 Thread Ufuk Celebi
On Wed, Jan 23, 2019 at 11:01 AM Timo Walther wrote: > I think what is more important than a big dist bundle is a helpful > "Downloads" page where users can easily find available filesystems, > connectors, metric repoters. Not everyone checks Maven central for > available JAR files. I just saw tha

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-23 Thread Ufuk Celebi
I like the idea of a leaner binary distribution. At the same time I agree with Jamie that the current binary is quite convenient and connection speeds should not be that big of a deal. Since the binary distribution is one of the first entry points for users, I'd like to keep it as user-friendly as

Re: `env.java.opts` not persisting after job canceled or failed and then restarted

2019-01-21 Thread Ufuk Celebi
Hey Aaron, sorry for the late reply. (1) I think I was able to reproduce this issue using snappy-java. I've filed a ticket here: https://issues.apache.org/jira/browse/FLINK-11402. Can you check the ticket description whether it's in line with what you are experiencing? Most importantly, do you se

Re: [DISCUSS] Dropping flink-storm?

2019-01-10 Thread Ufuk Celebi
+1 to drop. I totally agree with your reasoning. I like that we tried to keep it, but I don't think the maintenance overhead would be justified. – Ufuk On Wed, Jan 9, 2019 at 4:09 PM Till Rohrmann wrote: > > With https://issues.apache.org/jira/browse/FLINK-10571, we will remove the > Storm topo

Re: Per job cluster doesn't shut down after the job is canceled

2018-11-14 Thread Ufuk Celebi
Hey Paul, It might be related to this: https://github.com/apache/flink/pull/7004 (see linked issue for details). Best, Ufuk > On Nov 14, 2018, at 09:46, Paul Lam wrote: > > Hi Gary, > > Thanks for your reply and sorry for the delay. The attachment is the > jobmanager logs after invoking th

Re: Where is the "Latest Savepoint" information saved?

2018-11-11 Thread Ufuk Celebi
hould see your savepoint there. Best, Ufuk On Sun, Nov 11, 2018 at 7:45 PM Hao Sun wrote: > > This is great, I will try option 3 and let you know. > Can I log some message so I know job is recovered from the latest savepoint? > > On Sun, Nov 11, 2018 at 10:42 AM Ufuk Celebi wrot

Re: java.io.IOException: NSS is already initialized

2018-11-11 Thread Ufuk Celebi
is working with the hadoop flavour. > > On Fri, Nov 9, 2018 at 2:02 PM Ufuk Celebi wrote: >> >> Hey Hao Sun, >> >> - Is this an intermittent failure or permanent? The logs indicate that >> some checkpoints completed before the error occurs (e.g. checkpoint >>

Re: Where is the "Latest Savepoint" information saved?

2018-11-11 Thread Ufuk Celebi
Hey Hao and Paul, 1) Fetch checkpoint info manually from ZK (problematic, not recommended) - As Paul pointed out, this is problematic as the node is a serialized pointer (StateHandle) to a CompletedCheckpoint in the HA storage directory and not a path [1]. - I would not recommend this approach at

Re: java.io.IOException: NSS is already initialized

2018-11-09 Thread Ufuk Celebi
Hey Hao Sun, - Is this an intermittent failure or permanent? The logs indicate that some checkpoints completed before the error occurs (e.g. checkpoint numbers are greater than 1). - Which Java versions are you using? And which Java image? I've Googled similar issues that seem to be related to th

Re: Flink 1.6 job cluster mode, job_id is always 00000000000000000000000000000000

2018-11-05 Thread Ufuk Celebi
On Sun, Nov 4, 2018 at 10:34 PM Hao Sun wrote: > Thanks that also works. To avoid same issue with zookeeper, I assume I have > to do the same trick? Yes, exactly. The following configuration [1] entry takes care of this: high-availability.cluster-id: application-1 This will result in ZooKeeper

Re: Flink 1.6 job cluster mode, job_id is always 00000000000000000000000000000000

2018-11-04 Thread Ufuk Celebi
Hey Hao Sun, this has been changed recently [1] in order to properly support failover in job cluster mode. A workaround for you would be to add an application identifier to the checkpoint path of each application, resulting in S3 paths like application-/00...00/chk-64. Is that a feasible sol

Re: Default zookeeper

2018-05-14 Thread Ufuk Celebi
No, there is no difference if the version in your distro is part of the ZooKeeper 3.4.x series. The script is there for convenience during local testing/dev. – Ufuk On Sun, May 13, 2018 at 3:49 PM, miki haiat wrote: > When downloading the the flink source in order to run it local thire is a > z

Re: PartitionNotFoundException after deployment

2018-05-04 Thread Ufuk Celebi
Hey Gyula! I'm including Piotr and Nico (cc'd) who have worked on the network stack in the last releases. Registering the network structures including the intermediate results actually happens **before** any state is restored. I'm not sure why this reproducibly happens when you restore state. @Ni

Re: Beam quickstart

2018-04-25 Thread Ufuk Celebi
Hey Gyula, including Aljoscha (cc) here who is a committer at the Beam project. Did you also ask on the Beam mailing list? – Ufuk On Wed, Apr 25, 2018 at 3:32 PM, Gyula Fóra wrote: > Hey, > Is there somewhere an end to end guide how to run a simple beam-on-flink > application (preferrably usin

Re: Does web ui hang for 1.4 for job submission ?

2018-01-24 Thread Ufuk Celebi
It's not a stupid question at all! Try the following please: 1) Use something like Chrome's Developer Tools to check the responses you get from the web UI. If you see an error there, that should point you to what's going on. 2) Enable DEBUG logging for the JobManager and check the logs (if 1 doesn'

Re: Back-pressure Status shows OK but records are backed up in kafka

2018-01-07 Thread Ufuk Celebi
Hey Jins, our current back pressure tracking mechanism does not work with Kafka sources. To gather back pressure indicators we sample the main task thread of a subtask. For most tasks, this is the thread that emits records downstream (e.g. if you have a map function) and everything works as expect

Re: How to stop FlinkKafkaConsumer and make job finished?

2017-12-29 Thread Ufuk Celebi
i.apache.org/projects/flink/flink-docs-release-1.4/api/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.html#isEndOfStream-T- > > Eron > > On Wed, Dec 27, 2017 at 12:35 AM, Ufuk Celebi wrote: >> >> Hey Jaxon, >> >> I don't t

Re: RichAsyncFunction in scala

2017-12-28 Thread Ufuk Celebi
Hey Antoine, isn't it possible to use the Java RichAsyncFunction from Scala like this: class Test extends RichAsyncFunction[Int, Int] { override def open(parameters: Configuration): Unit = super.open(parameters) override def asyncInvoke(input: Int, resultFuture: functions.async.ResultFuture

Re: keyby() issue

2017-12-28 Thread Ufuk Celebi
Hey Jinhua, On Thu, Dec 28, 2017 at 9:57 AM, Jinhua Luo wrote: > The keyby() upon the field would generate unique key as the field > value, so if the number of the uniqueness is huge, flink would have > trouble both on cpu and memory. Is it considered in the design of > flink? Yes, keyBy hash pa

Re: org.apache.zookeeper.ClientCnxn, Client session timed out

2017-12-28 Thread Ufuk Celebi
On Thu, Dec 28, 2017 at 12:11 AM, Hao Sun wrote: > Thanks! Great to know I do not have to worry duplicates inside Flink. > > One more question, why this happens? Because TM and JM both check leadership > in different interval? Yes, it's not deterministic how this happens. There will also be cases

Re: org.apache.zookeeper.ClientCnxn, Client session timed out

2017-12-27 Thread Ufuk Celebi
On Wed, Dec 27, 2017 at 4:41 PM, Hao Sun wrote: > Somehow TM detected JM leadership loss from ZK and self disconnected? > And couple of seconds later, JM failed to connect to ZK? > Yes, exactly as you describe. The TM noticed the loss of leadership before the JM did. > After all the cluster re

Re: MergingWindow

2017-12-27 Thread Ufuk Celebi
Please check your email before sending it the next time as three emails for the same message is a little spammy ;-) This is internal code that is used to implement session windows as far as I can tell. The idea is to not merge the new window as it never had any state associated with it. The genera

Re: Apache Flink - broadcasting DataStream

2017-12-27 Thread Ufuk Celebi
Hey Mans! This refers to how sub tasks are connected to each other in your program. If you have a single sub task A1 and three sub tasks B1, B2, B3, broadcast will emit each incoming record at A1 to all B1, B2, B3: A1 --+-> B1 +-> B2 +-> B3 Does this help? On Mon, Dec 25, 2017 at 7:12

Re: flink yarn-cluster run job --files

2017-12-27 Thread Ufuk Celebi
The file URL needs to be accessible from all nodes, e.g. something like S3://... or hdfs://... >From the CLI: ``` Adds a URL to each user code classloader on all nodes in the cluster. The paths must specify a protocol (e.g. file://) and be accessible on all nodes (e.g. by means of a NFS share).

Re: Fetching TaskManager log failed

2017-12-27 Thread Ufuk Celebi
Thanks for reporting this issue. A few questions: - Which version of Flink are you using? - Does it work up to the point that the Exception is thrown? e.g. for smaller files it's OK? Let me pull in Chesnay (cc'd) who has worked on log fetching for the web runtime. – Ufuk On Tue, Dec 26, 2017 a

Re: How to stop FlinkKafkaConsumer and make job finished?

2017-12-27 Thread Ufuk Celebi
Hey Jaxon, I don't think it's possible to control this via the life-cycle methods of your functions. Note that Flink currently does not support graceful stop in a meaningful manner and you can only cancel running jobs. What comes to my mind to cancel on EOF: 1) Extend Kafka consumer to stop emit

Re: Flink network access control documentation

2017-12-27 Thread Ufuk Celebi
Hey Elias, thanks for opening a ticket (for reference: https://issues.apache.org/jira/browse/FLINK-8311). I fully agree with adding docs for this. I will try to write something down this week. Your point about JobManagers only coordinating via ZK is correct though. I had a look into the JobManage

Re: state.checkpoints.dir not configured

2017-12-21 Thread Ufuk Celebi
ckpoints.dir is > not set. > > Thanks > > > > On 19.12.2017 17:55, Ufuk Celebi wrote: >> >> When the JobManager/TaskManager are starting up they log what config >> they are loading. Look for lines like >> >> "Loading configuration property: {

Re: Add new slave to running cluster?

2017-12-20 Thread Ufuk Celebi
Hey Jinhua, - The `slaves` file is only relevant for the startup scripts. You can add as many task managers as you like by starting them manually. - You can check the logs of the JobManager or its web UI (jobmanager-host:8081) to see how many TMs have registered. - The registration time out looks

Re: state.checkpoints.dir not configured

2017-12-19 Thread Ufuk Celebi
When the JobManager/TaskManager are starting up they log what config they are loading. Look for lines like "Loading configuration property: {}, {}" Do you find the required configuration as part of these messages? – Ufuk On Tue, Dec 19, 2017 at 3:45 PM, Plamen Paskov wrote: > Hi, > I'm trying

Re: consecutive stream aggregations

2017-12-18 Thread Ufuk Celebi
then the average >>> will be calculated per userId but i want average across all users. What >>> would be the solution in this case? >>> >>> Thanks >>> >>> >>> On 15.12.2017 15:46, Ufuk Celebi wrote: >>>> >>>> He

Re: consecutive stream aggregations

2017-12-15 Thread Ufuk Celebi
ut the > window requires keyBy() if i want to parallelize the data. In my case it > will not work because if i create keyBy("userId") then the average > will be calculated per userId but i want average across all users. What > would be the solution in this case? > > Than

Re: consecutive stream aggregations

2017-12-15 Thread Ufuk Celebi
ng on KeyedStream it's not possible to call > aggregate with AggregateFunction implementation. Am i missing something? > > > > On 15.12.2017 15:46, Ufuk Celebi wrote: >> >> Hey Plamen, >> >> I think what you are looking for is the AggregateFunction. This

Re: consecutive stream aggregations

2017-12-15 Thread Ufuk Celebi
Hey Plamen, I think what you are looking for is the AggregateFunction. This you can use on keyed streams. The Javadoc [1] contains an example for your use case (averaging). – Ufuk [1] https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/functions/Aggr

Re: Flink 1.4.0 keytab is unreadable

2017-12-15 Thread Ufuk Celebi
Hey 杨光, thanks for looking into this in such a detail. Unfortunately, I'm not sure what the expected behaviour is (whether the change in behaviour was accidental or on purpose). Let me pull in Gordon who has worked quite a bit on the Kerberos related components in Flink. @Gordon: 1) Do you know

Re: Python API not working

2017-12-15 Thread Ufuk Celebi
Hey Yassine, let me include Chesnay (cc'd) who worked on the Python API. I'm not familiar with the API and what it expects, but try entering `streaming` or `batch` for the mode. Chesnay probably has the details. – Ufuk On Fri, Dec 15, 2017 at 11:05 AM, Yassine MARZOUGUI wrote: > Hi All, > > I

Re: docker-flink images and CI

2017-12-15 Thread Ufuk Celebi
I agree with Patrick's (cc'd) comment from the linked issue. What I understand from the linked issue is that Patrick will take care of the Docker image update for 1.4 manually. Is that ok with you Patrick? :-) Regarding the Flink release process question: I fully agree with the idea to integrate t

Re: Flink State monitoring

2017-12-15 Thread Ufuk Celebi
Hey Liron, unfortunately, there are no built-in metrics related to state. In general, exposing the actual values as metrics is problematic, but exposing summary statistics would be a good idea. I'm not aware of a good work around at the moment that would work in the general case (taking into accou

Re: S3 Access in eu-central-1

2017-11-29 Thread Ufuk Celebi
Hey Dominik, yes, we should definitely add this to the docs. @Nico: You recently updated the Flink S3 setup docs. Would you mind adding these hints for eu-central-1 from Steve? I think that would be super helpful! Best, Ufuk On Tue, Nov 28, 2017 at 10:00 PM, Dominik Bruhn wrote: > Hey Stephan

Re: Issue with back pressure and AsyncFunction

2017-11-10 Thread Ufuk Celebi
Hey Ken, thanks for your message. Both your comments are correct (see inline). On Fri, Nov 10, 2017 at 10:31 PM, Ken Krugler wrote: > 1. A downstream function in the iteration was (significantly) increasing the > number of tuples - it would get one in, and sometimes emit 100+. > > The output wou

Re: Flink memory leak

2017-11-07 Thread Ufuk Celebi
e only use map and filters > operator. We thought that gc handle clearing the flink’s internal states. > So how can we manage the memory if it is always increasing? > > - Ebru > > On 7 Nov 2017, at 16:23, Ufuk Celebi wrote: > > Hey Ebru, the memory usage might be increasing

Re: MODERATE for d...@flink.apache.org

2017-11-07 Thread Ufuk Celebi
t; > All the cluster Flink servers are able to access network drive, and it mapped > as drive Y in all nodes. > Do I need to provide more information? > > Thanks, > Jordan > > >> On 7 Nov 2017, at 6:36 PM, Ufuk Celebi wrote: >> >> As answered by David on S

Re: MODERATE for d...@flink.apache.org

2017-11-07 Thread Ufuk Celebi
As answered by David on SO, the files need to be accessible by all nodes. In your setup this seems not to be the case, therefore it won't work. You need a distributed file system (e.g. NFS or HDFS) or object store (e.g. S3) that is accessible from all nodes. – Ufuk On Tue, Nov 7, 2017 at 3:34 AM

Re: Flink memory leak

2017-11-07 Thread Ufuk Celebi
Hey Ebru, let me pull in Aljoscha (CC'd) who might have an idea what's causing this. Since multiple jobs are running, it will be hard to understand to which job the state descriptors from the heap snapshot belong to. - Is it possible to isolate the problem and reproduce the behaviour with only a

Re: FlinkCEP behaviour with time constraints not as expected

2017-11-07 Thread Ufuk Celebi
Hey Frederico, let me pull in Dawid (cc'd) who works on CEP. He can probably clarify the expected behaviour here. Best, Ufuk On Mon, Nov 6, 2017 at 12:06 PM, Federico D'Ambrosio wrote: > Hi everyone, > > I wanted to ask if FlinkCEP in the following scenario is working as it > should, or I hav

Re: PartitionNotFoundException when running in yarn-session.

2017-10-12 Thread Ufuk Celebi
MapReduce: The > missing task is simply resubmitted and ran again. > Why doesn't that happen? > > > Niels > > On Wed, Oct 11, 2017 at 8:49 AM, Ufuk Celebi wrote: >> >> Hey Niels, >> >> any update on this? >> >> – Ufuk >> >

Re: Flink 1.3.2 Netty Exception

2017-10-11 Thread Ufuk Celebi
@Chesnay: Recycling of network resources happens after the tasks go into state FINISHED. Since we are submitting new jobs in a local loop here it can easily happen that the new job is submitted before enough buffers are available again. At least, previously that was the case. I'm CC'ing Nico who r

Re: PartitionNotFoundException when running in yarn-session.

2017-10-10 Thread Ufuk Celebi
Hey Niels, any update on this? – Ufuk On Mon, Oct 9, 2017 at 10:16 PM, Ufuk Celebi wrote: > Hey Niels, > > thanks for the detailed report. I don't think that it is related to > the Hadoop or Scala version. I think the following happens: > > - Occasionally, one of

Re: PartitionNotFoundException when running in yarn-session.

2017-10-09 Thread Ufuk Celebi
Hey Niels, thanks for the detailed report. I don't think that it is related to the Hadoop or Scala version. I think the following happens: - Occasionally, one of your tasks seems to be extremely slow in registering its produced intermediate result (the data shuffled between TaskManagers) - Anothe

Re: question on sideoutput from ProcessWindow function

2017-09-23 Thread Ufuk Celebi
+1 Created an issue here: https://issues.apache.org/jira/browse/FLINK-7677 On Thu, Sep 14, 2017 at 11:51 AM, Aljoscha Krettek wrote: > Hi, > > Chen is correct! I think it would be nice, though, to also add that > functionality for ProcessWindowFunction and I think this should be easy to > do si

Re: Noisy org.apache.flink.configuration.GlobalConfiguration

2017-09-19 Thread Ufuk Celebi
PS: To answer the question. No, I think there is no reason for this and it shouldn't happen. :-( On Tue, Sep 19, 2017 at 2:44 AM, Elias Levy wrote: > Is there a particular reason that GlobalConfiguration is so noisy? > > The task manager log is full of "Loading configuration property" messages >

Re: Noisy org.apache.flink.configuration.GlobalConfiguration

2017-09-19 Thread Ufuk Celebi
I saw this too recently when using HadoopFileSystem for checkpoints (HDFS or S3). I thought I had opened an issue for this, but I didn't. Here it is: https://issues.apache.org/jira/browse/FLINK-7643 On Tue, Sep 19, 2017 at 1:28 PM, Till Rohrmann wrote: > Hi Elias, > > which version of Flink and

Re: Flink flick cancel vs stop

2017-09-14 Thread Ufuk Celebi
Hey Elias, sorry for the delay here. No, stop is not deprecated but not fully implemented yet. One missing part is migration of the existing source functions as you say. Let me pull in Till for more details on this. @Till: Is there more missing than migrating the sources? Here is the PR and disc

Re: Flink 1.2.1 JobManager Election Deadlock

2017-09-07 Thread Ufuk Celebi
Thanks for looking into this and finding out that it is (probably) related to Curator. Very valuable information! On Thu, Sep 7, 2017 at 3:09 PM, Timo Walther wrote: > Thanks for informing us. As far as I know, we were not aware of any deadlock > in the JobManager election. Let's hope that the up

Re: Flink HA with Kubernetes, without Zookeeper

2017-08-23 Thread Ufuk Celebi
Thanks James for sharing your experience. I find it very interesting :-) On Tue, Aug 22, 2017 at 9:50 PM, Hao Sun wrote: > Great suggestions, the etcd operator is very interesting, thanks James. > > > On Tue, Aug 22, 2017, 12:42 James Bucher wrote: >> >> Just wanted to throw in a couple more de

Re: Great number of jobs and numberOfBuffers

2017-08-17 Thread Ufuk Celebi
PS: Also pulling in Nico (CC'd) who is working on the network stack. On Thu, Aug 17, 2017 at 11:23 AM, Ufuk Celebi wrote: > Hey Gwenhael, > > the network buffers are recycled automatically after a job terminates. > If this does not happen, it would be quite a major bug. >

Re: Great number of jobs and numberOfBuffers

2017-08-17 Thread Ufuk Celebi
Hey Gwenhael, the network buffers are recycled automatically after a job terminates. If this does not happen, it would be quite a major bug. To help debug this: - Which version of Flink are you using? - Does the job fail immediately after submission or later during execution? - Is the following

Re: Queryable State max number of clients

2017-08-15 Thread Ufuk Celebi
d what will be the effect? A timeout exception? > > > Best > Ziyad > > On Mon, Aug 14, 2017 at 10:17 PM, Ufuk Celebi wrote: >> >> This is as Aljoscha describes. Each thread can handle many different >> clients at the same time. You shouldn't need to change the defaul

Re: Queryable State max number of clients

2017-08-14 Thread Ufuk Celebi
This is as Aljoscha describes. Each thread can handle many different clients at the same time. You shouldn't need to change the defaults in most cases. The network threads handle the TCP connections and dispatch query tasks to the query threads which do the actual querying of the state backend. In

Re: [POLL] Dropping savepoint format compatibility for 1.1.x in the Flink 1.4.0 release

2017-07-28 Thread Ufuk Celebi
On Fri, Jul 28, 2017 at 4:03 PM, Stephan Ewen wrote: > Seems like no one raised a concern so far about dropping the savepoint > format compatibility for 1.1 in 1.4. > > Leaving this thread open for some more days, but from the sentiment, it > seems like we should go ahead? +1

Re: mirror links don't work

2017-07-04 Thread Ufuk Celebi
Thanks for reporting this. Did you find these pages by Googling for the Flink docs? They are definitely very outdated versions of Flink. On Tue, Jul 4, 2017 at 4:46 PM, AndreaKinn wrote: > I found it clicking on "download flink for hadoop 1.2" button: > https://ci.apache.org/projects/flink/flink-

Re: External checkpoints not getting cleaned up/discarded - potentially causing high load

2017-07-03 Thread Ufuk Celebi
On Mon, Jul 3, 2017 at 12:02 PM, Stefan Richter wrote: > Another thing that could be really helpful, if possible, can you attach a > profiler/sampling to your job manager and figure out the hotspot methods > where most time is spend? This would be very helpful as a starting point > where the probl

Re: Flink 1.2.0 - Flink Job restart config not using Flink Cluster restart config.

2017-06-29 Thread Ufuk Celebi
On Thu, Jun 29, 2017 at 8:59 AM, Ufuk Celebi wrote: > @Vera: As a work around you could enable checkpointing and afterwards > explicitly disable restarts via > ExecutionConfig.setRestartStrategy(null). Then the cluster default > should be picked up. @Vera: Sorry, just checked G

Re: Flink 1.2.0 - Flink Job restart config not using Flink Cluster restart config.

2017-06-29 Thread Ufuk Celebi
Hey Vera and Gordon, I agree that this behaviour is confusing. If we want to split hairs here, we wouldn't call it a bug, because the restart strategy docs say that "Default restart strategy to use in case no restart strategy has been specified for the job". The confusing part is that enabling ch

Re: UnknownKvStateKeyGroupLocation

2017-05-16 Thread Ufuk Celebi
Hey Joe! This sounds odd... are there any failures (JobManager or TaskManager) or leader elections being reported? You should see such events in the JobManager/TaskManager logs. On Tue, May 16, 2017 at 2:28 PM, Joe Olson wrote: > When running Flink in high availability mode, I've been seeing a hi

Re: Queryable State

2017-05-04 Thread Ufuk Celebi
ask > managers are present, and that the job is running on all of them, but is > there some verbiage in the logs that indicates the job manager can talk to > all the task managers, and vice versa? > > Thanks! > > > 02.05.2017, 06:03, "Ufuk Celebi" : > > Hey

Re: Multiple consumers on a subpartition

2017-04-26 Thread Ufuk Celebi
Adding to what Zhijiang said: I think the way to go would be to create multiple "read views" over the pipelined subpartition. You would have to make sure that the initial reference count of the partition buffers is incremented accordingly. The producer will be back pressured by both consumers now.

Re: Grafana plug-in for Flink

2017-04-25 Thread Ufuk Celebi
Hey Andy! I agree with Dawid. Jamie has some code available here: https://github.com/dataArtisans/queryable-state-demo/blob/master/flink-state-server/src/main/scala/com/dataartisans/stateserver/server/FlinkStateServerController.scala This returns the JSON objects that are used by the simple-json-

Re: in-memory optimization

2017-04-24 Thread Ufuk Celebi
Loop invariant data should be kept in Flink's managed memory in serialized form (in a custom hash table). That means that they are not read back again from the CSV file, but they are kept in serialized form and need be deserialized again on access. CC'ing Fabian to double check... On Mon, Apr 24,

Re: Flink on Ignite - Collocation?

2017-04-24 Thread Ufuk Celebi
Hey Matt, in general, Flink doesn't put too much work in co-locating sources (doesn't happen for Kafka, etc. either). I think the only local assignments happen in the DataSet API for files in HDFS. Often this is of limited help anyways. Your approach sounds like it could work, but I would general

Re: Disk I/O in Flink

2017-04-24 Thread Ufuk Celebi
Hey Robert, for batch that should cover the relevant spilling code. If the records are >= 5 MB, the SpillingAdaptiveSpanningRecordDeserializer will spill incoming records as well. But that should be covered by the FileChannel instrumentation as well? – Ufuk On Tue, Apr 18, 2017 at 3:57 PM, Robe

Re: Queryable State

2017-04-24 Thread Ufuk Celebi
You should be able to use queryable state w/o any changes to the default config. The `query.server.port` option defines the port of the queryable state server that serves the state on the task managers and it is enabled by default. The important thing is to configure the client to discover the Job

Re: Checkpoints very slow with high backpressure

2017-04-24 Thread Ufuk Celebi
@Yessine: no, there is no way to disable the back pressure mechanism. Do you have more details about the two last operators? What do you mean with the process function is slow on purpose? @Rune: with 1.3 Flink will configure the internal buffers in a way that not too much data is buffered in the i

Re: Monitoring REST API and YARN session

2017-03-31 Thread Ufuk Celebi
In this case they are proxied through YARN, you can check the list auf running applications and click on the Flink app master UI link. Then you have the host and port for the REST calls. Does this work? On Fri, Mar 31, 2017 at 1:51 AM, Mohammad Kargar wrote: > How can I access the REST APIs for m

Re: Async Functions and Scala async-client for mySql/MariaDB database connection

2017-03-31 Thread Ufuk Celebi
I'm not too familiar with what's happening here, but maybe Klou (cc'd) can help? On Thu, Mar 30, 2017 at 6:50 PM, Andrea Spina wrote: > Dear Flink community, > > I started to use Async Functions in Scala, Flink 1.2.0, in order to retrieve > enriching information from MariaDB database. In order to

Re: flink one transformation end,the next transformation start

2017-03-31 Thread Ufuk Celebi
What is the error message/stack trace you get here? On Thu, Mar 30, 2017 at 9:33 AM, wrote: > hi,all, > i run a job,it is : > - > val data = env.readTextFile("hdfs:///")//DataSet[(String,Array[String])] > val dataVec = > computeDataVect

Re: ClassNotFoundException upon Savepoint Disposal

2017-03-29 Thread Ufuk Celebi
Thanks for providing the logs. It looks like the JARs are uploaded (modulo any bugs ;-))... could you double check that the class is actually part of the JAR and has not been moved around via jar -tf | grep EventDataRecord If everything looks good in the JAR, I could write a short tool that pri

Re: ClassNotFoundException upon Savepoint Disposal

2017-03-27 Thread Ufuk Celebi
What kind of state backend where you using for the checkpoints? If there is a bug that prevents us from deleting the savepoint files automatically, we can do a manual workaround and delete the checkpoints files manually. With Flink 1.3 this becomes very straight forward as savepoint data all go to

Re: Task manager number mismatch container number on mesos

2017-03-23 Thread Ufuk Celebi
When it happens, is it temporary or permanent? Looping in Till and Eron who worked on the Mesos runner. – Ufuk On Thu, Mar 23, 2017 at 11:09 AM, Renjie Liu wrote: > Hi, all: > We are using flink 1.2.0 on mesos. We found the number of task managers > mismatches with container number occasinally.

  1   2   3   4   5   6   >