Re: Broadcast checkpoint serialization fail

2019-11-18 Thread Vasily Melnik
Hi all, We found the solution: the problem was the Comparator in the TreeSet we used as the value of the broadcast state. Kryo is unable to serialize a lambda in a Comparator, so we changed it to a regular class, and everything is fine now. Best regards, Vasily Melnik, GlowByte Consulting
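
For illustration, a minimal sketch of the fix described above (the Event type and its timestamp accessor are hypothetical stand-ins for whatever the TreeSet holds):

    import java.io.Serializable;
    import java.util.Comparator;
    import java.util.TreeSet;

    // A named, serializable Comparator. Unlike a lambda (which compiles to a
    // synthetic class), Kryo can reliably serialize and deserialize this
    // from a checkpoint.
    public class EventTimeComparator implements Comparator<Event>, Serializable {
        private static final long serialVersionUID = 1L;

        @Override
        public int compare(Event a, Event b) {
            return Long.compare(a.getTimestamp(), b.getTimestamp());
        }
    }

    // In the broadcast-state value:
    //   new TreeSet<>(new EventTimeComparator())   // checkpoints fine
    //   new TreeSet<>((a, b) -> ...)               // fails under Kryo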

Re: possible backwards compatibility issue between 1.8->1.9?

2019-11-18 Thread Tzu-Li (Gordon) Tai
Hi Bekir, Before diving deeper, just to rule out the obvious: Have you changed anything with the element type of the input stream to the async wait operator? This wasn't apparent from the information so far, so I want to quickly clear that out of the way first. Cheers, Gordon On Wed, Oct 30, 20

Re: Keyed raw state - example

2019-11-18 Thread bastien dine
Hello Congxian, Thanks for your response. Don't you have an example with an operator extending AbstractUdfStreamOperator, using context.getRawKeyedStateInputs() (& the output for snapshots)? The TimerService is reimplementing the whole stuff :/ -- Bastien DINE Data Architect / Soft
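
A minimal sketch of the raw keyed state hooks in question, assuming the Flink 1.9 APIs; AbstractUdfStreamOperator inherits both hooks from AbstractStreamOperator, which the sketch extends directly for brevity:

    import org.apache.flink.core.memory.DataInputViewStreamWrapper;
    import org.apache.flink.core.memory.DataOutputViewStreamWrapper;
    import org.apache.flink.runtime.state.KeyGroupStatePartitionStreamProvider;
    import org.apache.flink.runtime.state.KeyGroupsList;
    import org.apache.flink.runtime.state.KeyedStateCheckpointOutputStream;
    import org.apache.flink.runtime.state.StateInitializationContext;
    import org.apache.flink.runtime.state.StateSnapshotContext;
    import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
    import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
    import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

    public class RawKeyedStateOperator
            extends AbstractStreamOperator<String>
            implements OneInputStreamOperator<String, String> {

        // Toy value kept in raw keyed state; duplicated into every key
        // group on snapshot, so restoring takes the max of all copies.
        private long epoch;

        @Override
        public void initializeState(StateInitializationContext context) throws Exception {
            super.initializeState(context);
            // On restore, raw keyed state arrives as one stream per key group.
            for (KeyGroupStatePartitionStreamProvider provider : context.getRawKeyedStateInputs()) {
                DataInputViewStreamWrapper in = new DataInputViewStreamWrapper(provider.getStream());
                epoch = Math.max(epoch, in.readLong());
            }
        }

        @Override
        public void snapshotState(StateSnapshotContext context) throws Exception {
            super.snapshotState(context);
            KeyedStateCheckpointOutputStream out = context.getRawKeyedOperatorStateOutput();
            DataOutputViewStreamWrapper view = new DataOutputViewStreamWrapper(out);
            // Each key group must be delimited explicitly before its bytes are written.
            KeyGroupsList groups = out.getKeyGroupList();
            for (int i = 0; i < groups.getNumberOfKeyGroups(); i++) {
                out.startNewKeyGroup(groups.getKeyGroupId(i));
                view.writeLong(epoch);
            }
        }

        @Override
        public void processElement(StreamRecord<String> element) throws Exception {
            epoch++;
            output.collect(element);
        }
    }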

SQL for Avro GenericRecords on Parquet

2019-11-18 Thread Hanan Yehudai
I have tried to persist generic Avro records in a Parquet file and then read it via ParquetTableSource, using SQL. It seems that the SQL is not executed properly! The persisted records are: Id, type: 333,Type1 22,Type2 333,Type1 22,Type2 333,Type1 22,Type2 333,Type1 2
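
For reference, a minimal sketch of the read side, assuming Flink 1.9's ParquetTableSource and parquet-avro's AvroSchemaConverter; the path and field names are placeholders matching the records above:

    import org.apache.avro.Schema;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.formats.parquet.ParquetTableSource;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.java.BatchTableEnvironment;
    import org.apache.parquet.avro.AvroSchemaConverter;
    import org.apache.parquet.schema.MessageType;

    public class ParquetSqlExample {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            BatchTableEnvironment tEnv = BatchTableEnvironment.create(env);

            // Derive the Parquet schema from the Avro schema the records
            // were written with.
            Schema avroSchema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Rec\",\"fields\":["
                + "{\"name\":\"id\",\"type\":\"long\"},"
                + "{\"name\":\"type\",\"type\":\"string\"}]}");
            MessageType parquetSchema = new AvroSchemaConverter().convert(avroSchema);

            ParquetTableSource source = ParquetTableSource.builder()
                .path("file:///tmp/records.parquet")  // placeholder path
                .forParquetSchema(parquetSchema)
                .build();

            tEnv.registerTableSource("records", source);
            Table result = tEnv.sqlQuery("SELECT id, `type` FROM records");
        }
    }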

Re: Apache Airflow - Question about checkpointing and re-run a job

2019-11-18 Thread M Singh
Thanks Congxian for your answer and reference. Mans On Sunday, November 17, 2019, 08:59:16 PM EST, Congxian Qiu wrote: Hi, Yes, checkpoint data is located under the jobid dir. You can try to restore from the retained checkpoint [1]. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9
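
For reference, resuming a job from a retained checkpoint uses the same -s option as a savepoint restore; a sketch with placeholder paths and entry class:

    bin/flink run -s hdfs:///flink/checkpoints/<jobid>/chk-42 \
        -c com.example.MyJob my-job.jar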

[ANNOUNCE] Launch of flink-packages.org: A website to foster the Flink Ecosystem

2019-11-18 Thread Robert Metzger
Hi all, I would like to announce that Ververica, with the permission of the Flink PMC, is launching a website called flink-packages.org. This goes back to an effort proposed earlier in 2019 [1]. The idea of the site is to help developers building extensions / connectors / APIs etc. for Flink to get

Re: [ANNOUNCE] Launch of flink-packages.org: A website to foster the Flink Ecosystem

2019-11-18 Thread Oytun Tez
Congratulations! This is exciting. -- Oytun Tez M O T A W O R D | CTO & Co-Founder oy...@motaword.com On Mon, Nov 18, 2019 at 11:07 AM Robert Metzger wrote: > Hi all, > > I would like to announce that Ververica, with the permission of

Collections as Flink job parameters

2019-11-18 Thread Протченко Алексей
Hello all. I have a question about providing complex configuration to a Flink job. We are working on a kind of platform for running user-defined packages which actually contain the main business logic. All the parameters we provide via the command line and parse with ParameterTool. That’

Re: SQL for Avro GenericRecords on Parquet

2019-11-18 Thread Peter Huang
Hi Hanan, Thanks for reporting the issue. Would you please attach your test code here? I may be able to help investigate. Best Regards, Peter Huang On Mon, Nov 18, 2019 at 2:51 AM Hanan Yehudai wrote: > I have tried to persist generic Avro records in a parquet file and then > read it via ParquetTable

Re: RocksDB state on HDFS seems not being cleaned up

2019-11-18 Thread Yun Tang
Yes, the State Processor API cannot read window state for now; here is the tracking issue [1]. [1] https://issues.apache.org/jira/browse/FLINK-13095 Best Yun Tang From: shuwen zhou Date: Monday, November 18, 2019 at 12:31 PM To: user Subject: Fwd: RocksDB state on HDFS seems not being cleaned u

RE: SQL for Avro GenericRecords on Parquet

2019-11-18 Thread Hanan Yehudai
Hi Peter. Thanks. This is my code. I used one of the parquet / avro tests as a reference. The code fails on: Test testScan(ParquetTestCase) failed with: java.lang.UnsupportedOperationException at org.apache.parquet.filter2.recordlevel.IncrementallyUpdatedFilterPredicate$Value

Flink configuration at runtime

2019-11-18 Thread amran dean
Is it possible to configure certain settings at runtime, on a per-job basis, rather than globally within flink-conf.yaml? For example, I have a job where it's desirable to retain a large number of checkpoints via state.checkpoints.num-retained. The checkpoints are cheap, so retaining them is low cost. For oth

Re: Flink configuration at runtime

2019-11-18 Thread Zhu Zhu
Hi Amran, Some configs, including "state.checkpoints.num-retained", are cluster configs that always apply to the entire Flink cluster. An alternative is to use per-job mode if you are running Flink jobs on k8s/docker or yarn, thus creating a Flink cluster for a single job. [1] https://ci.apache.
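
As a complement, some checkpoint settings are job-scoped and can be set in the program itself, unlike state.checkpoints.num-retained; a sketch assuming the Flink 1.9 streaming API:

    import org.apache.flink.streaming.api.CheckpointingMode;
    import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class PerJobCheckpointConfig {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Job-scoped settings: these can differ per job without
            // touching flink-conf.yaml.
            env.enableCheckpointing(60_000L, CheckpointingMode.EXACTLY_ONCE);
            env.getCheckpointConfig().enableExternalizedCheckpoints(
                    ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
            env.getCheckpointConfig().setMinPauseBetweenCheckpoints(10_000L);
            // state.checkpoints.num-retained itself stays cluster-wide.
        }
    }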

Re: Flink configuration at runtime

2019-11-18 Thread vino yang
Hi Amran, Change the config option at runtime? No, Flink does not support this feature currently. However, for the Flink on Yarn job cluster mode, you can specify different config options for different jobs via the program or flink-conf.yaml (copy a new Flink binary package, then change the config file). Best

Re: Collections as Flink job parameters

2019-11-18 Thread Zhu Zhu
Hi Протченко, Yes, you cannot get a Map argument from ParameterTool directly. ParameterTool fetches and stores data in the form of strings, so it's not feasible to support arbitrary types of configuration values that users may set. A workaround is to convert the map to a string beforehand and parse it
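
A minimal sketch of that workaround; the --dbOverrides key and the key=value;... encoding are arbitrary choices for illustration:

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.flink.api.java.utils.ParameterTool;

    public class MapParameterExample {
        public static void main(String[] args) {
            // Invoked as e.g.: --dbOverrides "host=db1;port=5432;ssl=true"
            ParameterTool params = ParameterTool.fromArgs(args);
            String encoded = params.getRequired("dbOverrides");

            // Rebuild the map from the single string argument.
            Map<String, String> overrides = new HashMap<>();
            for (String pair : encoded.split(";")) {
                String[] kv = pair.split("=", 2);
                overrides.put(kv[0], kv[1]);
            }
        }
    }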

[DISCUSS] Support configure remote flink jar

2019-11-18 Thread tison
Hi folks, Recently, our customers have asked for a feature to configure a remote flink jar. I'd like to reach out to you to see whether or not it is a general need. ATM Flink only supports configuring a local file as the flink jar via the `-yj` option. If we pass an HDFS file path, due to implementation details it wil

Re: [DISCUSS] Support configure remote flink jar

2019-11-18 Thread Yang Wang
Hi tison, Thanks for starting this discussion. * For a user-customized flink-dist jar, it is a useful feature, since it could avoid uploading the flink-dist jar every time; especially in a production environment, it could accelerate the submission process. * For the standard flink-dist jar, FLIN

Re: how to setup a ha flink cluster on k8s?

2019-11-18 Thread Yang Wang
Hi Rock, If you want to start an HA Flink cluster on k8s, the simplest way is to use ZK+HDFS/S3, just like the HA configuration on Yarn. The zookeeper-operator could help start a zk cluster. [1] Please share more information on why it could not work. If you are using a kubernetes per-job cluster,
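
For reference, a minimal flink-conf.yaml sketch for the ZK+HDFS approach; quorum addresses and paths are placeholders:

    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk-0:2181,zk-1:2181,zk-2:2181
    high-availability.storageDir: hdfs:///flink/ha
    high-availability.cluster-id: /flink-on-k8s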

Re: [DISCUSS] Support configure remote flink jar

2019-11-18 Thread Thomas Weise
There is a related use case (not specific to HDFS) that I came across: it would be nice if the jar upload endpoint could accept the URL of a jar file as an alternative to the jar file itself. Such a URL could point to an artifactory or a distributed file system. Thomas On Mon, Nov 18, 2019 at 7:40 PM