Re: [DISCUSS] Introduction of a Table API Java Expression DSL

2019-03-21 Thread Jark Wu
Hi Timo, Sounds good to me. Do you want to deprecate the string-based API in 1.9 or make the decision in 1.10 after some feedbacks ? On Thu, 21 Mar 2019 at 21:32, Timo Walther wrote: > Thanks for your feedback Rong and Jark. > > @Jark: Yes, you are right that the string-based API is used

Re: Map UDF : The Nothing type cannot have a serializer

2019-03-21 Thread Rong Rong
Based on what I saw in the implementation, I think you meant to implement a ScalarFunction right? since you are only trying to structure a VarArg string into a Map. If my understanding was correct. I think the Map constructor[1] is something you might be able to leverage. It doesn't resolve your

Re: Map UDF : The Nothing type cannot have a serializer

2019-03-21 Thread shkob1
Looking further into the RowType it seems like this field is translated as a CURSOR rather than a map.. not sure why -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Map UDF : The Nothing type cannot have a serializer

2019-03-21 Thread shkob1
Hey, As im building a SQL query, im trying to conditionally build a map such that there won't be any keys with null values in it. AFAIK from Calcite there's no native way to do it (other than using case to build the map in different ways, but then i have a lot of key/value pairs so thats not

Re: Streaming Protobuf into Parquet file not working with StreamingFileSink

2019-03-21 Thread Rafi Aroch
Hi Kostas, Yes I have. Rafi On Thu, Mar 21, 2019, 20:47 Kostas Kloudas wrote: > Hi Rafi, > > Have you enabled checkpointing for you job? > > Cheers, > Kostas > > On Thu, Mar 21, 2019 at 5:18 PM Rafi Aroch wrote: > >> Hi Piotr and Kostas, >> >> Thanks for your reply. >> >> The issue is that I

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-21 Thread Oytun Tez
Thank you, all! If there are operational tasks about the ecosystem page(s), let me know (organizing the content etc, whatever). --- Oytun Tez *M O T A W O R D* The World's Fastest Human Translation Platform. oy...@motaword.com — www.motaword.com On Thu, Mar 21, 2019 at 2:14 PM Becket Qin

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-21 Thread Becket Qin
Thanks for the update Robert! Looking forward to the prototype! On Thu, Mar 21, 2019 at 10:07 PM Robert Metzger wrote: > Quick summary of our call: > Daryl will soon start with a front end, build against a very simple > mock-backend. > Congxian will start implementing the Spring-based backend

[ANNOUNCE] Release 1.8.0, release candidate #4

2019-03-21 Thread Aljoscha Krettek
Hi All, Voting on RC 4 for Flink 1.8.0 has started: https://lists.apache.org/thread.html/9a2150090430e4c10c466775400c0196fe474055dab6b9ab9226960b@%3Cdev.flink.apache.org%3E. Please check this out if you want to verify your applications against this new Flink release. Best, Aljoscha

Flink and sketches

2019-03-21 Thread Flavio Pompermaier
Hi to all, I was looking for an approx_count and freq_item in Flink and I'm not sure which road to follow. At the moment I found 2 valuable options: 1. Wait for STREAMLINE to unveil their code of HLL_DISTINCT_COUNT[1] 2. Use the Yahoo Datasketches lib [2], following the example of Tobias

Re: Streaming Protobuf into Parquet file not working with StreamingFileSink

2019-03-21 Thread Rafi Aroch
Hi Piotr and Kostas, Thanks for your reply. The issue is that I don't see any committed files, only in-progress. I tried to debug the code for more details. I see that in *BulkPartWriter* I do reach the *write* methods and see events getting written, but I never reach the *closeForCommit*. I

Re: [REMINDER] Flink Forward San Francisco in a few days

2019-03-21 Thread Robert Metzger
I would like to add that the organizers of the conference have agreed to offer all Apache committers (of any Apache project) a free ticket. *To get your free ticket, use the "ASFCommitters19” promo code AND use your @apache.org email when registering.* Feel free to reach out

Re: Async Function Not Generating Backpressure

2019-03-21 Thread Ken Krugler
> On Mar 20, 2019, at 6:49 PM, Seed Zeng wrote: > > Hey Andrey and Ken, > Sorry about the late reply. I might not have been clear in my question > The performance of writing to Cassandra is the same in both cases, only that > the source rate was higher in the case of the async function is

Re: Flink 1.7.2 extremely unstable and losing jobs in prod

2019-03-21 Thread Bruno Aranda
Ok, here it goes: https://transfer.sh/12qMre/jobmanager-debug.log In an attempt to make it smaller, did remove the noisy "http wire" ones and masked a couple of things. Not sure this covers everything you would like to see. Thanks! Bruno On Thu, 21 Mar 2019 at 15:24, Till Rohrmann wrote: >

Re: Streaming Protobuf into Parquet file not working with StreamingFileSink

2019-03-21 Thread Kostas Kloudas
Hi Rafi, Piotr is correct. In-progress files are not necessarily readable. The valid files are the ones that are "committed" or finalized. Cheers, Kostas On Thu, Mar 21, 2019 at 2:53 PM Piotr Nowojski wrote: > Hi, > > I’m not sure, but shouldn’t you be just reading committed files and ignore

Avro state migration using Scala in Flink 1.7.2 (and 1.8)

2019-03-21 Thread Marc Rooding
Hi I’ve been trying to get state migration with Avro working on Flink 1.7.2 using Scala case classes but I’m not getting anywhere closer to solving it. We’re using the most basic streaming WordCount example as a reference to test the schema evolution: val wordCountStream:

Re: Flink 1.7.2 extremely unstable and losing jobs in prod

2019-03-21 Thread Till Rohrmann
Hi Bruno, could you upload the logs to https://transfer.sh/ or https://gist.github.com/ and then post a link. For further debugging this will be crucial. It would be really good if you could set the log level to DEBUG. Concerning the number of registered TMs, the new mode (not the legacy mode),

Re: Ambiguous behavior of Flink on Job cancellation with checkpoint configured

2019-03-21 Thread Aljoscha Krettek
Hi, That’s a good observation! And it is indeed the expected behaviour. There are two parts to understanding this: * "retain checkpoints” tells Flink to retain any checkpoints that it stores when a job is shut down * for recovery purposes (all checkpointing purposes, really) a savepoint

Re: Best practice to handle update messages in stream

2019-03-21 Thread Piotr Nowojski
Hi, There is an ongoing work [1] to support natively the streams like you described (we call them upsert streams/changelogs). But it boils down to the exactly the same thing you have done - aggregating the records per key and adding `latest` aggregation function. Until we support this

Re: Streaming Protobuf into Parquet file not working with StreamingFileSink

2019-03-21 Thread Piotr Nowojski
Hi, I’m not sure, but shouldn’t you be just reading committed files and ignore in-progress? Maybe Kostas could add more insight to this topic. Piotr Nowojski > On 20 Mar 2019, at 12:23, Rafi Aroch wrote: > > Hi, > > I'm trying to stream events in Prorobuf format into a parquet file. > I

Re: [VOTE] Release 1.8.0, release candidate #3

2019-03-21 Thread Yu Li
Thanks for the message Aljoscha, let's discuss in JIRA (just replied there). Best Regards, Yu On Thu, 21 Mar 2019 at 21:15, Aljoscha Krettek wrote: > Hi Yu, > > I commented on the issue. For me both Hadoop 2.8.3 and Hadoop 2.4.1 seem > to work. Could you have a look at my comment? > > I will

Re: [DISCUSS] Introduction of a Table API Java Expression DSL

2019-03-21 Thread Timo Walther
Thanks for your feedback Rong and Jark. @Jark: Yes, you are right that the string-based API is used quite a lot. On the other side, the potential user base in the future is still bigger than our current user base. Because the Table API will become equally important as the DataStream API, we

Re: [VOTE] Release 1.8.0, release candidate #3

2019-03-21 Thread Aljoscha Krettek
Hi Yu, I commented on the issue. For me both Hadoop 2.8.3 and Hadoop 2.4.1 seem to work. Could you have a look at my comment? I will also cancel this RC because of various issues. Best, Aljoscha > On 21. Mar 2019, at 12:23, Yu Li wrote: > > Thanks @jincheng > > @Aljoscha I've just opened

Re: StochasticOutlierSelection

2019-03-21 Thread Piotr Nowojski
(Adding back user mailing list) Hi Anissa, Thank you for coming back with the results. I hope this might be helpful for someone else in the future and maybe it will be one more argument for the Flink community to address this issue in some other way. Piotrek > On 20 Mar 2019, at 17:26,

Re: Set partition number of Flink DataSet

2019-03-21 Thread qi luo
Thank you Fabian! I will check these issues. > On Mar 20, 2019, at 4:25 PM, Fabian Hueske wrote: > > Hi, > > I'm sorry but I'm only familiar with the high-level design but not with the > implementation details and concrete roadmap for the involved components. > I think that FLINK-10288 [1]

Re: Flink 1.7.2 extremely unstable and losing jobs in prod

2019-03-21 Thread Bruno Aranda
Hi Andrey, Thanks for your response. I was trying to get the logs somewhere but they are biggish (~4Mb). Do you suggest somewhere I could put them? In any case, I can see exceptions like this: 2019/03/18 10:11:50,763 DEBUG org.apache.flink.runtime.jobmaster.slotpool.SlotPool -

Re: [VOTE] Release 1.8.0, release candidate #3

2019-03-21 Thread Yu Li
Thanks @jincheng @Aljoscha I've just opened FLINK-11990 for the HDFS BucketingSink issue with hadoop 2.8. IMHO it might be a blocker for 1.8.0 and need your confirmation. Thanks. Best Regards, Yu On Thu, 21 Mar 2019 at 15:57, jincheng sun

Re: Async Function Not Generating Backpressure

2019-03-21 Thread Andrey Zagrebin
Hi Seed, when you create `AsyncDataStream.(un)orderedWait` which capacity do you pass in or you use the default one (100)? Best, Andrey On Thu, Mar 21, 2019 at 2:49 AM Seed Zeng wrote: > Hey Andrey and Ken, > Sorry about the late reply. I might not have been clear in my question > The

local dev using kafka consumer by docker got wrong cluster node id

2019-03-21 Thread Andy Hoang
So I want to run flink in my local. Kafka docker and its zookeeper has been work great for local dev of other projects, I want to try this kafka with new flink project in local. I have problem of first, the connect from my kafka consumer source is created but then it try to connect with a

Job crashed very soon

2019-03-21 Thread yinhua.dai
Hi Community, I was trying to run a big batch job which use JDBCInputFormat to retrieve a large amount data from a mysql database and do some joins in flink, the environment is AWS EMR. But it always failed very fast. I'm using flink on yarn, flink 1.6.1 my cluster has 1000GB memory, my job

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-21 Thread Robert Metzger
Okay, great. Congxian Qiu, Daryl and I have a kick-off call later today at 2pm CET, 9pm China time about the design of the ecosystem page (see: https://github.com/rmetzger/flink-community-tools/issues/4) Please let me know if others want to join as well, I can add them to the invite. On Wed, Mar

Best practice to handle update messages in stream

2019-03-21 Thread 徐涛
Hi Experts, Assuming there is a stream which content is like this: Seq ID MONEY 1.100 100 2.100 200 3.101 300 The record of Seq#2 is updating record of Seq#1, changing the money

Facebook: Save Wilpattu One Srilanka's Most Loved Place

2019-03-21 Thread felipe . o . gutierrez
Olá, Eu acabei de assinar o abaixo-assinado "Facebook: Save Wilpattu One Srilanka's Most Loved Place" e queria saber se você pode ajudar assinando também. A nossa meta é conseguir 15.000 assinaturas e precisamos de mais apoio. Você pode ler mais sobre este assunto e assinar o abaixo-assinado

Ambiguous behavior of Flink on Job cancellation with checkpoint configured

2019-03-21 Thread Parth Sarathy
Hi All, We are using flink 1.7.2 and have enabled checkpoint with RocksDB configured as state backend with retain checkpoints on job cancel. In our scenario we are cancelling the job and while resubmitting the job, we try to restore the job with latest checkpoint / savepoint

Re: [VOTE] Release 1.8.0, release candidate #3

2019-03-21 Thread jincheng sun
Thanks for the quick fix, Yu. the PR of FLINK-11972 has been merged. Cheers, Jincheng Yu Li 于2019年3月21日周四 上午7:23写道: > -1, observed stably failure on streaming bucketing end-to-end test case in > two different environments (Linux/MacOS) when

Documentation of mesos-appmaster-job.sh

2019-03-21 Thread Jacky Yin 殷传旺
Hello All, I cannot find any documentation or help about how to use $flin_home/bin/mesos-appmaster-job.sh. Anybody help? Thanks! Jacky Yin

Facebook: Save Wilpattu One Srilanka's Most Loved Place

2019-03-21 Thread dhanuka . priyanath
Hello there, I just signed the petition "Facebook: Save Wilpattu One Srilanka's Most Loved Place" and wanted to see if you could help by adding your name. Our goal is to reach 7,500 signatures and we need more support. You can read more and sign the petition here: http://chng.it/ZxfxKQcqqk