Re: Is there an example of flink cluster "as a job" deployment on k8s ?

2018-12-05 Thread vino yang
Hi Vishal, I found an example on GitHub, for reference only. [1] https://github.com/sanderploegsma/flink-k8s Thanks, vino. On Thu, Dec 6, 2018 at 5:50 AM, Vishal Santoshi wrote: > >

NoClassDefFoundError javax.xml.bind.DatatypeConverterImpl

2018-12-05 Thread Mike Mintz
Hi Flink developers, We're running some new DataStream jobs on Flink 1.7.0 using the shaded Hadoop S3 file system, and running into frequent errors saving checkpoints and savepoints to S3. I'm not sure what the underlying reason for the error is, but we often fail with the following stack trace,

Re: Assigning a port range to rest.port

2018-12-05 Thread Gyula Fóra
Thank you Till :) Gy Till Rohrmann wrote (on Wed, Dec 5, 2018, 16:20): > Hi Gyula and Jeff, > > I think at the moment it is not possible to define a port range for the > REST client. Maybe we should add something similar to the > RestOptions#BIND_ADDRESS, namely introducing a

Is there an example of flink cluster "as a job" deployment on k8s ?

2018-12-05 Thread Vishal Santoshi

Re: Flink 1.7 job cluster (restore from checkpoint error)

2018-12-05 Thread Hao Sun
Till, Flink is automatically trying to recover from a checkpoint, not a savepoint. How can I get allowNonRestoredState applied in this case? Hao Sun Team Lead 1019 Market St. 7F San Francisco, CA 94103 On Wed, Dec 5, 2018 at 10:09 AM Till Rohrmann wrote: > Hi Hao, > > I think you need to provide

Re: How to distribute subtasks evenly across taskmanagers?

2018-12-05 Thread Till Rohrmann
Hi Sunny, this is a current limitation of Flink's scheduling. We are currently working on extending Flink's scheduling mechanism [1], which should also help with solving this problem. At the moment, I recommend using the per-job mode so that you have a single cluster per job. [1]

Re: Flink 1.7 job cluster (restore from checkpoint error)

2018-12-05 Thread Till Rohrmann
Hi Hao, I think you need to provide a savepoint file via --fromSavepoint to resume from in order to specify --allowNonRestoredState. Otherwise this option will be ignored because it only works if you resume from a savepoint. Cheers, Till On Wed, Dec 5, 2018 at 12:29 AM Hao Sun wrote: > I am

Re: long lived standalone job session cluster in kubernetes

2018-12-05 Thread Till Rohrmann
Hi Derek, there is this issue [1] which tracks the active Kubernetes integration. Jin Sun already started implementing some parts of it. There should also be some PRs open for it. Please check them out. [1] https://issues.apache.org/jira/browse/FLINK-9953 Cheers, Till On Wed, Dec 5, 2018 at

Re: long lived standalone job session cluster in kubernetes

2018-12-05 Thread Derek VerLee
Sounds good. Is someone working on this automation today? If not, although my time is tight, I may be able to work on a PR to get us started down the path to a Kubernetes-native cluster mode. On 12/4/18 5:35 AM, Till Rohrmann wrote:

Re: Query big mssql Data Source [Batch]

2018-12-05 Thread Flavio Pompermaier
What's your query? Have you used '?' where the query should be parameterized? Take a look at https://github.com/apache/flink/blob/master/flink-connectors/flink-jdbc/src/test/java/org/apache/flink/api/java/io/jdbc/JDBCFullTest.java
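As a rough illustration of the advice above: a parallel JDBC read typically pairs a query containing two '?' placeholders with a provider that supplies one pair of bounds per split. The following is a minimal sketch of that splitting step in plain Python, not Flink's implementation. The function name `between_parameters`, the query text, and the `books`/`id` names are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical query; the table and column names are illustrative only.
QUERY = "SELECT * FROM books WHERE id BETWEEN ? AND ?"


def between_parameters(min_val: int, max_val: int, fetch_size: int) -> List[Tuple[int, int]]:
    """Split the inclusive range [min_val, max_val] into consecutive
    (start, end) pairs that each cover at most fetch_size ids."""
    if fetch_size <= 0:
        raise ValueError("fetch_size must be positive")
    if min_val > max_val:
        raise ValueError("min_val must not exceed max_val")
    pairs = []
    start = min_val
    while start <= max_val:
        end = min(start + fetch_size - 1, max_val)
        pairs.append((start, end))
        start = end + 1
    return pairs


if __name__ == "__main__":
    # Each pair would fill the two '?' placeholders for one split.
    for bounds in between_parameters(1, 10, 4):
        print(QUERY, bounds)
```

In a setup like this, each pair backs one unit of work, so a provider that yields a single pair leaves nothing to distribute across parallel readers.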

Re: Query big mssql Data Source [Batch]

2018-12-05 Thread miki haiat
I'm using the jTDS driver to query MSSQL. I used the ParametersProvider as you suggested, but for some reason the job won't run in parallel. [image: flink_in.JPG] Also the sink, a simple print, won't run in parallel. [image: flink_out.JPG] On Tue, Dec 4, 2018 at 10:05 PM Flavio Pompermaier

Re: Assigning a port range to rest.port

2018-12-05 Thread Till Rohrmann
Hi Gyula and Jeff, I think at the moment it is not possible to define a port range for the REST client. Maybe we should add something similar to the RestOptions#BIND_ADDRESS, namely introducing a RestOptions#BIND_PORT which can define a port range for the binding port. RestOptions#PORT will only
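To make the port-range idea discussed in this thread concrete, here is a minimal sketch in plain Python of what "bind to the first free port in a range" could look like. It is not Flink code and says nothing about how a RestOptions#BIND_PORT option would be implemented. The function names `parse_port_range` and `bind_first_free` and the range syntax ("50100-50200", comma-separated lists) are assumptions for illustration.

```python
import socket
from typing import Iterator


def parse_port_range(spec: str) -> Iterator[int]:
    """Yield ports from a spec such as "8081", "50100-50200" or "8081,9000-9002"."""
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(p) for p in part.split("-", 1))
        else:
            low = high = int(part)
        if not (0 <= low <= high <= 65535):
            raise ValueError(f"invalid port range: {part!r}")
        yield from range(low, high + 1)


def bind_first_free(spec: str, host: str = "127.0.0.1") -> socket.socket:
    """Bind a TCP socket to the first free port in the range and return it."""
    for port in parse_port_range(spec):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            return sock
        except OSError:
            sock.close()  # port is taken, try the next candidate
    raise OSError(f"no free port in range {spec!r}")
```

With a range instead of a single value, two local processes can be given the same setting and still end up on different ports, which is the conflict described elsewhere in this thread.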

Re: TTL state migration

2018-12-05 Thread Ning Shi
Hi Andrey, Thank you. That makes sense. Looking at the code of how TTL is implemented, it matches exactly what you said. Ning On Wed, Dec 5, 2018 at 9:37 AM Andrey Zagrebin wrote: > > Hi Ning, > > State backends store the timestamp of last access/modification of state with > TTL. > This

Re: TTL state migration

2018-12-05 Thread Andrey Zagrebin
Hi Ning, State backends store the timestamp of last access/modification of state with TTL. This absolute timestamp does not depend on the configured time-to-live. If you restart a job from the savepoint and configure a longer time-to-live than before, it just means that map state entries, which
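The point about absolute timestamps can be illustrated with a toy model. The sketch below is plain Python, not Flink's state backend code: the class names `TtlEntry` and `TtlMapState` are hypothetical, and cleanup of expired entries is deliberately not modeled. It only shows that when the stored value is an absolute last-write time, the time-to-live is applied at read time and can therefore change between runs.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TtlEntry:
    value: str
    last_access_ms: int  # absolute timestamp, stored alongside the value


class TtlMapState:
    """Toy map state: keeps absolute timestamps and applies the TTL on read."""

    def __init__(self, ttl_ms: int) -> None:
        self.ttl_ms = ttl_ms
        self.entries: Dict[str, TtlEntry] = {}

    def put(self, key: str, value: str, now_ms: int) -> None:
        # Writing refreshes the timestamp; reads in this sketch do not.
        self.entries[key] = TtlEntry(value, now_ms)

    def get(self, key: str, now_ms: int) -> Optional[str]:
        entry = self.entries.get(key)
        if entry is None or entry.last_access_ms + self.ttl_ms <= now_ms:
            return None  # missing, or expired under the currently configured TTL
        return entry.value
```

Copying the same entries into a `TtlMapState` created with a larger `ttl_ms` changes which of them count as expired, without touching the stored timestamps.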

Re: number of files in checkpoint directory grows endlessly

2018-12-05 Thread Andrey Zagrebin
Hi Bernd, Thanks for sharing the code. Could you try the same job without TtlDb, with the original RocksDB state backend? Even if data does not expire every day, the overall size will grow but the number of files should not, because of compaction. Do you set some more specific db/column options in

Re: Kafka offset auto-commit stops after timeout

2018-12-05 Thread Anil
I had the same issue, and enabling checkpointing seems to solve the problem. Can you please explain how enabling checkpointing fixes the issue? Thanks!

Re: Assigning a port range to rest.port

2018-12-05 Thread Gyula Fóra
Maybe the problem is here? cc Till https://github.com/apache/flink/blob/44ed5ef0fc1c221f3916ab5126f1bc8ee5dfb45d/flink-yarn/src/main/java/org/apache/flink/yarn/entrypoint/YarnEntrypointUtils.java#L83

Re: Support for multiple slots per task manager in Flink 1.5

2018-12-05 Thread 罗齐
Hi Pawel, Flink 1.5 supports multiple slots per TM. “Not fully supported yet” refers to an issue [1] when requesting TMs from the SlotManager, which will allocate more TMs than actually needed (but idle TMs will be released later). So I suggest you check further in your timeout logs to identify other

Re: Assigning a port range to rest.port

2018-12-05 Thread Jeff Zhang
This requirement makes sense to me. Another issue I hit due to the single value of the REST port is that a user cannot start 2 local MiniClusters: I tried to start 2 Flink scala-shells in local mode, but it fails due to a port conflict. On Wed, Dec 5, 2018 at 8:04 PM, Gyula Fóra wrote: > Hi! > Is there any way currently to

Assigning a port range to rest.port

2018-12-05 Thread Gyula Fóra
Hi! Is there currently any way to set a port range for the REST client? rest.port only takes a single number, and it is overwritten to 0 anyway. This seems to be necessary when running the Flink client from behind a firewall where only a predefined port range is accessible from the outside. I

Support for multiple slots per task manager in Flink 1.5

2018-12-05 Thread Pawel Bartoszek
Hi, According to the Flink 1.5 release docs, multiple slots per task manager are "not fully supported yet". Can you provide more information about the risks of running more than one slot per TM? We are running Flink on EMR on YARN. Previously we ran 4 task managers with 8 slots each

Re: Using port ranges to connect with the Flink Client

2018-12-05 Thread Gyula Fóra
Ah, it seems to be something with the custom Flink client build that we run... I still don't know why, but if I use the normal client once the job is started, it works. Gyula Gyula Fóra wrote (on Wed, Dec 5, 2018, 9:50): > I get the following error when trying to savepoint a job for

S3A AWSS3IOException from Flink's BucketingSink to S3

2018-12-05 Thread Flink Developer
I have a Flink app with high parallelism (400) running in AWS EMR. It uses Flink v1.5.2. It reads from Kafka and sinks to S3 using BucketingSink (using the RocksDB backend for checkpointing). The destination is defined using the "s3a://" prefix. The Flink job is a streaming app which runs continuously. At