Re: [ANNOUNCE] Apache Flink 1.10.1 released

2020-05-15 Thread Congxian Qiu
Thanks a lot for the release and your great job, Yu! Also thanks to everyone who made this release possible! Best, Congxian Yu Li wrote on Thu, May 14, 2020 at 1:59 AM: > The Apache Flink community is very happy to announce the release of Apache > Flink 1.10.1, which is the first bugfix release for the Apach

Re: Flink performance tuning on operators

2020-05-15 Thread Chesnay Schepler
Generally there should be no difference. Can you check whether the maps are running as a chain (as a single task)? If they are running in a chain, then I would suspect that /something/ else is skewing your results. If not, then the added network/serialization pressure would explain it. I will a
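
For illustration, a minimal sketch of one way to test the chaining hypothesis: force the second map out of the chain and compare runtimes against the default run. The pipeline and class name are made up; only the chaining calls matter here.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ChainingCheck {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("a", "b", "c")
                .map(String::toUpperCase)
                .map(s -> s + "!")
                // disableChaining() keeps this map out of any operator chain, so the two maps
                // run as separate tasks with a serialization handover in between. Comparing
                // this run against the default (chained) run shows how much chaining matters.
                .disableChaining()
                .print();

        env.execute("chaining-check");
    }
}
```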

Re: Flink suggestions;

2020-05-15 Thread Chesnay Schepler
Am I understanding you correctly in that, if one sensor of one factory raises an alert, then you want all sensors in that same factory to raise an alert? How big is this dataset that maps sensors to factories? Maybe you can just load them into a Map in say a FlatMap, enrich the sensor data str
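
A minimal sketch of the enrichment idea, assuming the sensor-to-factory mapping is small enough to load into each task; the tuple layout and the hard-coded mapping are placeholders only.

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Input: (sensorId, value). Output: (factoryId, sensorId, value).
public class FactoryEnricher
        extends RichFlatMapFunction<Tuple2<String, Double>, Tuple3<String, String, Double>> {

    private transient Map<String, String> sensorToFactory;

    @Override
    public void open(Configuration parameters) {
        // Hypothetical loader; in practice this could read a small file, a config, or a DB table.
        sensorToFactory = new HashMap<>();
        sensorToFactory.put("sensor-1", "factory-A");
    }

    @Override
    public void flatMap(Tuple2<String, Double> reading,
                        Collector<Tuple3<String, String, Double>> out) {
        String factory = sensorToFactory.get(reading.f0);
        if (factory != null) {
            out.collect(Tuple3.of(factory, reading.f0, reading.f1));
        }
    }
}
```

Keying the enriched stream by the factory id (field 0) then makes it straightforward to raise an alert for all sensors of the same factory.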

the savepoint problem of upgrading job from blink-1.5 to flink-1.10

2020-05-15 Thread Roc Marshal
Hi, all. When using a savepoint to upgrade a Flink job from blink-1.5 to flink-1.10, the system reports that the blink savepoint V3 is incompatible with the version in Flink. Is there any solution? Thank you so much. Sincerely, Roc Marshal

Re: Incremental state with purging

2020-05-15 Thread Congxian Qiu
Hi From your description, you want to do two things: 1. update state and remove the state older than x; 2. output the state every y seconds. From my side, the first can be done by using TTL state as Yun said, the second can be done by using KeyedProcessFunction[1]. If you want to have complex logic
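
A minimal sketch combining the two suggestions, assuming keyed state with a TTL (standing in for "older than x") and a processing-time timer (standing in for "every y seconds"); the value type, durations and names are illustrative.

```java
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Keeps one Long per key, lets entries expire after 1 hour via TTL,
// and emits the current value 10 seconds after each update via a processing-time timer.
public class PeriodicOutputFunction extends KeyedProcessFunction<String, Long, Long> {

    private transient ValueState<Long> latest;

    @Override
    public void open(Configuration parameters) {
        StateTtlConfig ttlConfig = StateTtlConfig
                .newBuilder(Time.hours(1))                 // "older than x" -> expire after 1h
                .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                .build();
        ValueStateDescriptor<Long> desc = new ValueStateDescriptor<>("latest", Long.class);
        desc.enableTimeToLive(ttlConfig);
        latest = getRuntimeContext().getState(desc);
    }

    @Override
    public void processElement(Long value, Context ctx, Collector<Long> out) throws Exception {
        latest.update(value);
        // Registers a timer per element for brevity; a real job would track whether a
        // timer is already pending to avoid piling up timers.
        ctx.timerService().registerProcessingTimeTimer(
                ctx.timerService().currentProcessingTime() + 10_000L);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Long> out) throws Exception {
        Long current = latest.value();
        if (current != null) {
            out.collect(current);
        }
    }
}
```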

Re: the savepoint problem of upgrading job from blink-1.5 to flink-1.10

2020-05-15 Thread Congxian Qiu
Hi, Could you please share the stack trace or the log message? If I understand correctly, savepoint V3 is not supported in 1.10. Best, Congxian Roc Marshal wrote on Fri, May 15, 2020 at 4:33 PM: > Hi, all. > > When using savepoint to upgrade a Flink job from blink-1.5 to flink-1.10, > the system prompts that blink

Re: How To subscribe a Kinesis Stream using enhance fanout?

2020-05-15 Thread orionemail
Hi, We also recently needed this functionality; unfortunately we were unable to implement it ourselves, so we changed our plan accordingly. However, we very much see the benefit of this feature and would be interested in following the JIRA ticket. Thanks ‐‐‐ Original Message ‐‐‐ On Thursd

Re: the savepoint problem of upgrading job from blink-1.5 to flink-1.10

2020-05-15 Thread Yun Tang
Hi Roc Blink-1.5 never made any promise that it would be compatible with Flink-1.10. Moreover, the SavepointV3Serializer in Blink is not at all the same thing as in Flink, and the reason we introduced SavepointV3Serializer is that we use different state handles when we open-sourced blin

Re: [ANNOUNCE] Apache Flink 1.10.1 released

2020-05-15 Thread Till Rohrmann
Thanks Yu for being our release manager and everyone else who made the release possible! Cheers, Till On Fri, May 15, 2020 at 9:15 AM Congxian Qiu wrote: > Thanks a lot for the release and your great job, Yu! > Also thanks to everyone who made this release possible! > > Best, > Congxian > > > Y

Testing process functions

2020-05-15 Thread Manas Kale
Hi, How do I test process functions? I tried implementing a sink function that stores myProcessFunction's output in a list. After env.execute(), I use assertions. If I set a breakpoint in myTestSink's invoke() method, I see that that method is being called correctly. However, after env.execu
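
A common pattern for this kind of test sink is to collect into a static field, because Flink serializes the sink instance before running it, so a plain instance list would stay empty when the test thread inspects it. A minimal sketch (class and field names are made up):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

// Collects results into a static list so the test thread can assert on them after
// env.execute() returns. The field must be static because Flink serializes and
// redeploys the sink instance; instance fields of the original object are not shared.
public class CollectingSink implements SinkFunction<String> {

    public static final List<String> VALUES = new CopyOnWriteArrayList<>();

    @Override
    public void invoke(String value, Context context) {
        VALUES.add(value);
    }
}
```

In the test: clear VALUES, run env.execute() (which blocks until the bounded job finishes), then assert on the collected values.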

Re: [ANNOUNCE] Apache Flink 1.10.1 released

2020-05-15 Thread Dian Fu
Thanks Yu for managing this release and everyone else who made this release possible. Good work! Regards, Dian > On May 15, 2020, at 6:26 PM, Till Rohrmann wrote: > > Thanks Yu for being our release manager and everyone else who made the > release possible! > > Cheers, > Till > > On Fri, May 15, 2020 a

Re: Flink Key based watermarks with event time

2020-05-15 Thread Congxian Qiu
Hi Maybe you can try KeyedProcessFunction[1] for this, but you need to handle the allowed-lateness logic[2] in your own business logic (event-time records may be out of order). [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/process_function.html#the-keyedprocessfu
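
A minimal sketch of what "handling the lateness logic yourself" could look like in a KeyedProcessFunction: track the highest event timestamp seen per key and treat anything further behind than a chosen bound as late. The 5-second bound and the (key, timestamp) tuple layout are assumptions.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Tracks a per-key "watermark" as the highest event timestamp seen so far and drops
// records that arrive more than 5 seconds behind it, emulating a key-scoped lateness rule.
public class PerKeyLatenessFunction
        extends KeyedProcessFunction<String, Tuple2<String, Long>, Tuple2<String, Long>> {

    private static final long ALLOWED_LATENESS_MS = 5_000L;

    private transient ValueState<Long> maxTimestamp;

    @Override
    public void open(Configuration parameters) {
        maxTimestamp = getRuntimeContext().getState(
                new ValueStateDescriptor<>("max-ts", Long.class));
    }

    @Override
    public void processElement(Tuple2<String, Long> record, Context ctx,
                               Collector<Tuple2<String, Long>> out) throws Exception {
        long eventTime = record.f1;
        Long maxSeen = maxTimestamp.value();
        if (maxSeen == null || eventTime > maxSeen) {
            maxTimestamp.update(eventTime);
            maxSeen = eventTime;
        }
        if (eventTime >= maxSeen - ALLOWED_LATENESS_MS) {
            out.collect(record);    // in time, relative to this key only
        }
        // else: too late for this key; could be routed to a side output instead of dropped.
    }
}
```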

Re: [ANNOUNCE] Apache Flink 1.10.1 released

2020-05-15 Thread Benchao Li
Thanks Yu for the great work, and everyone else who made this possible. Dian Fu wrote on Fri, May 15, 2020 at 6:55 PM: > Thanks Yu for managing this release and everyone else who made this > release possible. Good work! > > Regards, > Dian > > On May 15, 2020, at 6:26 PM, Till Rohrmann wrote: > > Thanks Yu for being our

Help with table-factory for SQL

2020-05-15 Thread Martin Frank Hansen
Hi, I am trying to connect to Kafka through Flink, but having some difficulty getting the right table factory source. I currently get the error: NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.TableSourceFactory' in the classpath. m

Developing Beam applications using Flink checkpoints

2020-05-15 Thread Ivan San Jose
Hi, we are starting to use Beam with Flink as the runner on our applications, and recently we would like to take advantage of what Flink checkpointing provides, but it seems we are not understanding it clearly. Simplifying, our application does the following: - Read messages from a couple of Kafka topic
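
For reference, with the Beam Flink runner checkpointing is typically switched on through the pipeline options rather than in the pipeline code itself; a rough sketch, assuming the Flink runner is on the classpath (class name and the 60-second interval are arbitrary):

```java
import org.apache.beam.runners.flink.FlinkPipelineOptions;
import org.apache.beam.runners.flink.FlinkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class CheckpointedBeamJob {
    public static void main(String[] args) {
        FlinkPipelineOptions options =
                PipelineOptionsFactory.fromArgs(args).as(FlinkPipelineOptions.class);
        options.setRunner(FlinkRunner.class);
        // Ask the Flink runner to checkpoint every 60s; without an interval the
        // underlying Flink job does not checkpoint at all.
        options.setCheckpointingInterval(60_000L);

        Pipeline pipeline = Pipeline.create(options);
        // ... build the pipeline (read from Kafka, transforms, sinks) ...
        pipeline.run();
    }
}
```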

Re: Protection against huge values in RocksDB List State

2020-05-15 Thread Robin Cassan
Hi Yun, thanks for your answer! And sorry I didn't see this limitation in the documentation, makes sense! In our case, we are merging too many elements (since each element is limited to 4 MiB in our Kafka topic). I agree we do not want our state to contain really big values, this is why we are try

Re: upgrade flink from 1.9.1 to 1.10.0 on EMR

2020-05-15 Thread Robert Metzger
Flink 1.11 will support Hadoop 3. EMR 6 requires Hadoop 3, which is why Flink was no longer included. Amazon will add Flink back to EMR 6.0 soon. On Thu, May 14, 2020 at 7:11 PM aj wrote: > Hi Yang, > > I am able to resolve the issue by removing the Hadoop dependency as you > mentioned. > > 1. Remov

Memory growth from TimeWindows

2020-05-15 Thread Slotterback, Chris
Hey Flink users, I wanted to see if I could get some insight into what the heap memory profile of my stream app should look like versus my expectation. My layout consists of a sequence of FlatMaps + Maps, feeding a pair of 5-minute TumblingEventTimeWindows, intervalJoined, into a 24 hour (per 5 minut

Re: Protection against huge values in RocksDB List State

2020-05-15 Thread Yun Tang
Hi Robin I think you could record the size of your list under the current key with another value state or operator state (store a Map of key to list size, and store the whole map in the list state when snapshotting). If you do not have many key-by keys, operator state is a good choice as that is on-heap and lightweight. Best Y
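
A minimal sketch of the first suggestion: a companion ValueState counts how many elements the per-key list already holds, so new elements can be rejected before the serialized list grows past a cap. The element type, cap and what is done with rejected elements are placeholders.

```java
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Keeps a ListState per key plus a ValueState counting its elements, so the list
// can be capped before the serialized value becomes too large for RocksDB.
public class BoundedListFunction extends KeyedProcessFunction<String, String, String> {

    private static final int MAX_ELEMENTS = 1_000;   // arbitrary cap

    private transient ListState<String> elements;
    private transient ValueState<Integer> count;

    @Override
    public void open(Configuration parameters) {
        elements = getRuntimeContext().getListState(
                new ListStateDescriptor<>("elements", String.class));
        count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Integer.class));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
        int current = count.value() == null ? 0 : count.value();
        if (current < MAX_ELEMENTS) {
            elements.add(value);
            count.update(current + 1);
        } else {
            // Over the cap: emit a marker instead of growing the state further.
            out.collect("dropped element for key " + ctx.getCurrentKey());
        }
    }
}
```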

Unit / Integration tests for Table API ?

2020-05-15 Thread Darlo Bobo
Hello, I have built a small Flink app which receives events (JSON events), deserializes them to an object, and then uses the Table API to create two tables, do some joins, and then write the results back to a Kafka stream. What is the suggested method to correctly test that the code written with the

Re: Watermarks and parallelism

2020-05-15 Thread Gnanasoundari Soundarajan
Thanks Alexander for your detailed response. I have a requirement that each asset will communicate a different event time due to connectivity issues. If I have 50 assets and each communicates with a different event time, I should not lose data because of lateness. To handle this, I have tried wi
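
One hedged way to keep data from slow assets without per-key watermarks (which Flink does not support) is a generous out-of-orderness bound plus allowed lateness and a side output for stragglers; a sketch assuming records are (assetId, eventTimestamp) tuples, with all durations illustrative:

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.OutputTag;

public class LatenessExample {

    // Late records are not dropped silently; they go to a side output instead.
    static final OutputTag<Tuple2<String, Long>> LATE =
            new OutputTag<Tuple2<String, Long>>("late-readings") {};

    static SingleOutputStreamOperator<Tuple2<String, Long>> windowed(
            DataStream<Tuple2<String, Long>> readings) {
        return readings
                .assignTimestampsAndWatermarks(
                        new BoundedOutOfOrdernessTimestampExtractor<Tuple2<String, Long>>(
                                Time.minutes(5)) {           // tolerate 5 min out-of-orderness
                            @Override
                            public long extractTimestamp(Tuple2<String, Long> r) {
                                return r.f1;                 // event time carried in the record
                            }
                        })
                .keyBy(r -> r.f0)                            // key by asset id
                .window(TumblingEventTimeWindows.of(Time.minutes(1)))
                .allowedLateness(Time.hours(1))              // keep updating results for 1h
                .sideOutputLateData(LATE)                    // anything later lands here
                .sum(1);
    }
}
```

The late stream can then be read back via getSideOutput(LATE) on the windowed result. The watermark is still global per subtask; allowed lateness only controls how long window state is kept around for stragglers.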

Re: Help with table-factory for SQL

2020-05-15 Thread Jark Wu
Hi, Could you share the SQL DDL and the full exception message? It might be that you are using the wrong `connector.version` or other options. Best, Jark On Fri, 15 May 2020 at 20:14, Martin Frank Hansen wrote: > Hi, > > I am trying to connect to kafka through flink, but having some difficulty >
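
For reference, a Kafka source DDL of the shape being asked about might look like the sketch below; the table name, fields, topic and broker address are placeholders, and it assumes the matching flink-sql-connector-kafka and flink-json jars for Flink 1.10 are on the classpath, since a missing connector jar is a frequent cause of NoMatchingTableFactoryException.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class KafkaSourceDdl {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        EnvironmentSettings settings =
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);

        // Flink 1.10-style Kafka source DDL; all values below are placeholders.
        tEnv.sqlUpdate(
                "CREATE TABLE my_kafka_source (" +
                "  user_id STRING," +
                "  amount DOUBLE" +
                ") WITH (" +
                "  'connector.type' = 'kafka'," +
                "  'connector.version' = 'universal'," +       // must match the connector jar
                "  'connector.topic' = 'my-topic'," +
                "  'connector.startup-mode' = 'earliest-offset'," +
                "  'connector.properties.bootstrap.servers' = 'localhost:9092'," +
                "  'format.type' = 'json'," +
                "  'format.derive-schema' = 'true'" +
                ")");
    }
}
```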

Re: Unit / Integration tests for Table API ?

2020-05-15 Thread Jark Wu
Hi, Flink Table & SQL has a testing suite for integration tests. You can have a look at `org.apache.flink.table.planner.runtime.stream.sql.AggregateITCase`. We are using a bounded source stream and a testing sink to collect and verify the result. You need to depend on the following dependenc
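
Outside of the internal ITCase infrastructure, a lightweight variant is to run the job on the local environment with a bounded source and collect the retract stream into a static list; a rough sketch for Flink 1.10 with the Blink planner (table name, fields and the final assertion are arbitrary):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.types.Row;
import org.junit.Assert;
import org.junit.Test;

public class AggregationTest {

    // Static list so results written by the (serialized) sink are visible to the test thread.
    private static final List<Row> RESULTS = new CopyOnWriteArrayList<>();

    // Keeps only insertions; retraction records arrive with f0 == false.
    private static final class CollectSink implements SinkFunction<Tuple2<Boolean, Row>> {
        @Override
        public void invoke(Tuple2<Boolean, Row> value, Context ctx) {
            if (value.f0) {
                RESULTS.add(value.f1);
            }
        }
    }

    @Test
    public void testSumPerKey() throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        EnvironmentSettings settings =
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, settings);

        DataStream<Tuple2<String, Integer>> input =
                env.fromElements(Tuple2.of("a", 1), Tuple2.of("a", 2), Tuple2.of("b", 3));
        tEnv.createTemporaryView("events", input, "name, amount");

        Table result = tEnv.sqlQuery("SELECT name, SUM(amount) FROM events GROUP BY name");

        RESULTS.clear();
        tEnv.toRetractStream(result, Row.class).addSink(new CollectSink());

        env.execute("table-api-test");   // input is bounded, so this returns when the job finishes
        // A real test would apply the retractions and compare against expected rows.
        Assert.assertFalse(RESULTS.isEmpty());
    }
}
```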