When will FLink support kafka1.0

2018-08-27 Thread spoon_lz
Some other departments within the company have adopted the version of kafka1.0. I have seen that flink currently supports 0.9, 0.10 and 0.11. When will the version of kafka1.0 be supported ? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: JobGraphs not cleaned up in HA mode

2018-08-27 Thread vino yang
Hi Encho, A temporary solution can be used to determine if it has been cleaned up by monitoring the specific JobID under Zookeeper's "/jobgraph". Another solution, modify the source code, rudely modify the cleanup mode to the synchronous form, but the flink operation Zookeeper's path needs to obta

Re: JobGraphs not cleaned up in HA mode

2018-08-27 Thread Encho Mishinev
Thank you very much for the info! Will keep track of the progress. In the meantime is there any viable workaround? It seems like HA doesn't really work due to this bug. On Tue, Aug 28, 2018 at 4:52 AM vino yang wrote: > About some implementation mechanisms. > Flink uses Zookeeper to store JobGr

Re: withFormat(Csv) is undefined for the type BatchTableEnvironment

2018-08-27 Thread vino yang
Hi Francois, Yes, the withFormat API comes from an instance of BatchTableDescriptor, and the BatchTableDescriptor instance is returned by the connect API, so you should call BatchTableEnvironment#connect first. Thanks, vino. françois lacombe 于2018年8月27日周一 下午10:26写道: > Hi all, > > I'm currently

Re: Difficulty managing keyed streams

2018-08-27 Thread vino yang
Hi Gabriel, In your scenario, I guess you should be based on Event time. In this case, I think you can implement self-triggering by customizing the trigger of the window, and then combine ProcessWindowFunction[1] to define your calculation logic. Because most of your time is based on Watermark, an

Re: JobGraphs not cleaned up in HA mode

2018-08-27 Thread vino yang
About some implementation mechanisms. Flink uses Zookeeper to store JobGraph (Job's description information and metadata) as a basis for Job recovery. However, previous implementations may cause this information to not be properly cleaned up because it is asynchronously deleted by a background thre

Re: JobGraphs not cleaned up in HA mode

2018-08-27 Thread vino yang
Hi Encho, This is a problem already known to the Flink community, you can track its progress through FLINK-10011[1], and currently Till is fixing this issue. [1]: https://issues.apache.org/jira/browse/FLINK-10011 Thanks, vino. Encho Mishinev 于2018年8月27日周一 下午10:13写道: > I am running Flink 1.5.3

Re: would you join a Slack workspace for Flink?

2018-08-27 Thread Fabian Hueske
Hi Nicos, That looks like a good start! Would you like to open n issue and a pull request? Thanks, Fabian Nicos Maris schrieb am Mo., 27. Aug. 2018, 17:49: > I agree with you Fabian. > > The question then is how to instruct users to add code to their email. > What about the following? Where sh

Re: Kryo Serialization Issue

2018-08-27 Thread Darshan Singh
Thanks, We ran into differnet errors and then realized it was OOM issue which was causing different parts to be failed. Flink was buffering too much data as we were reading too fast from source. Reducing the speed fixed the issue. However, I am curious how to achieve the same with S3 apart from l

Re: would you join a Slack workspace for Flink?

2018-08-27 Thread Nicos Maris
I agree with you Fabian. The question then is how to instruct users to add code to their email. What about the following? Where should it be placed? If you send us an email with a code snippet, make sure that: 1. you do not link to files in external services as such files can change, get delete

Difficulty managing keyed streams

2018-08-27 Thread Gabriel Pelielo | Stone
Hello everyone. I'm currently developing a flink program to aggregate information about my company's clients' credit card transactions. Each transaction has a clientId and a transactionDate related to it. What I want to do is make a Sliding week time window with size 21 days sliding every 1 ho

Re: What's the advantage of using BroadcastState?

2018-08-27 Thread Xingcan Cui
Hi Radu, I cannot make a full understanding of your question but I guess the answer is NO. The broadcast state pattern just provides you with an automatic data broadcasting and a bunch of map states to cache the "low-throughput” patterns. Also, to keep consistency, it forbid the `processElemen

Re: [External] Re: How to do test in Flink?

2018-08-27 Thread Chang Liu
Thanks Joe! Best regards/祝好, Chang Liu 刘畅 On Fri, Aug 24, 2018 at 6:55 PM Joe Malt wrote: > Hi Chang, > > A time-saving tip for finding which library contains a class: go to > https://search.maven.org/ > and enter fc: followed by the fully-qualified name of the class. You > should get the li

Re: Raising a bug in Flink's unit test scripts

2018-08-27 Thread Fabian Hueske
Hi Averell, If this is a more general error, I'd prefer a separate issue & PR. Thanks, Fabian Am Fr., 24. Aug. 2018 um 13:15 Uhr schrieb Averell : > Good day everyone, > > I'm writing unit test for the bug fix FLINK-9940, and found that in some > existing tests in flink-fs-tests cannot detect t

withFormat(Csv) is undefined for the type BatchTableEnvironment

2018-08-27 Thread françois lacombe
Hi all, I'm currently trying to load a CSV file content with Flink 1.6.0 table API. This error is raised as a try to execute the code written in docs https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/table/connect.html#csv-format ExecutionEnvironment env = ExecutionEnvironment.getEx

RE: What's the advantage of using BroadcastState?

2018-08-27 Thread Radu Tudoran
Hi Fabian, Thanks for the blog post about broadcast state. I have a question with respect to the update capabilities of the broadcast state: Assume you do whatever processing logic in the main processElement function .. and at a given context marker you 1) would change a local field marker, to

Re: AvroSchemaConverter and Tuple classes

2018-08-27 Thread françois lacombe
Thank you all for you answers. It's ok with BatchTableSource All the best François 2018-08-26 17:40 GMT+02:00 Rong Rong : > Yes you should be able to use Row instead of Tuple in your > BatchTableSink. > There's sections in Flink documentation regarding mapping of data types to > table schemas

Re: Low Performance in High Cardinality Big Window Application

2018-08-27 Thread Ning Shi
> If you have a window larger than hours then you need to rethink your > architecture - this is not streaming anymore. Only because you receive events > in a streamed fashion you don’t need to do all the processing in a streamed > fashion. Thanks for the thoughts, I’ll keep that in mind. Howeve

JobGraphs not cleaned up in HA mode

2018-08-27 Thread Encho Mishinev
I am running Flink 1.5.3 with two job managers and two task managers in Kubernetes along with HDFS and Zookeeper in high-availability mode. My problem occurs after the following actions: - Upload a .jar file to jobmanager-1 - Run a streaming job from the jar on jobmanager-1 - Wait for 1 or 2 chec

Re: Question about QueryableState

2018-08-27 Thread Kostas Kloudas
Thanks a lot Pierre! Kostas > On Aug 27, 2018, at 2:16 PM, Pierre Zemb wrote: > > Hi! > Just created the JIRA (https://issues.apache.org/jira/browse/FLINK-10225 > ). > > Thanks for your reply, > Pierre > > Le jeu. 23 août 2018 à 14:31, Kosta

Re: Question about QueryableState

2018-08-27 Thread Pierre Zemb
Hi! Just created the JIRA (https://issues.apache.org/jira/browse/FLINK-10225). Thanks for your reply, Pierre Le jeu. 23 août 2018 à 14:31, Kostas Kloudas a écrit : > Hi Pierre, > > You are right that this should not happen. > It seems like a bug. > Could you open a JIRA and post it here? > > Th

Re: Data loss when restoring from savepoint

2018-08-27 Thread Andrey Zagrebin
Hi, true, StreamingFileSink does not support s3 in 1.6.0, it is planned for the next 1.7 release, sorry for confusion. The old BucketingSink has in general problem with s3. Internally BucketingSink queries s3 as a file system to list already written file parts (batches) and determine index of t

Re: Can I only use checkpoints instead of savepoints in production?

2018-08-27 Thread Averell
Thank you, Vino. I found it, http://:8088/ Regards, Averell -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Dealing with Not Serializable classes in Java

2018-08-27 Thread Dominik Wosiński
Hey ;) I have received one response that was sent directly to my email and not to user group : > Hi Dominik, > > I think you can put the unserializable fields into RichFunctions and > initiate them in the `open` method, so the the fields won’t need to be > serialized with the tasks. > > Best, > Pa

Re: Small-files source - partitioning based on prefix of file

2018-08-27 Thread Averell
Hello Fabian, and all, Please excuse me for digging this old thread up. I have a question regarding sending of the "barrier" messages in Flink's checkpointing mechanism: I want to know when those barrier messages are sent when I am using a file source. Where can I find it in the source code? I'm

Re: Dealing with Not Serializable classes in Java

2018-08-27 Thread Chesnay Schepler
You don't need RichFunctions for that, you should be able to just do: private static final ObjectMapper objectMapper = new ObjectMapper().registerModule(new JavaTimeModule()); On 27.08.2018 10:28, Dominik Wosiński wrote: Hey Paul, Yeah that is possible, but I was asking in terms of serializatio

Re: Dealing with Not Serializable classes in Java

2018-08-27 Thread Dominik Wosiński
Hey Paul, Yeah that is possible, but I was asking in terms of serialization schema. So I would really want to avoid RichFunction :) Best Regards, Dominik. pon., 27 sie 2018 o 10:23 Chesnay Schepler napisał(a): > The null check in the method is the general-purpose way of solving it. > If the Obj

Re: Dealing with Not Serializable classes in Java

2018-08-27 Thread Chesnay Schepler
The null check in the method is the general-purpose way of solving it. If the ObjectMapper is thread-safe you could also initialize it as a static field. On 26.08.2018 17:58, Dominik Wosiński wrote: Hey, I was wondering how do You normally deal with fields that contain references that are no

Re: Can I only use checkpoints instead of savepoints in production?

2018-08-27 Thread vino yang
Hi Averell, I have not used aws products, but if it is similar to YARN, or if you have visited YARN's web ui. Then you look at the YARN ApplicationMaster log to view the JM log, and the container log is the tm log. Thanks, vino. Averell 于2018年8月27日周一 下午4:09写道: > Hi Vino, > > Could you please t

Re: Can I only use checkpoints instead of savepoints in production?

2018-08-27 Thread Averell
Hi Vino, Could you please tell where I should find the JM and TM logs? I'm running on an AWS EMR using yarn. Thanks and best regards, Averell -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: would you join a Slack workspace for Flink?

2018-08-27 Thread Chesnay Schepler
I fully agree with Fabian, we'd just be replacing an empty IRC with an empty gitter. On 27.08.2018 09:50, Fabian Hueske wrote: Hi, I don't think that recommending Gists is a good idea. Sure, well formatted and highlighted code is nice and much better than posting screenshots but Gists can be

Re: [DISCUSS] Remove the slides under "Community & Project Info"

2018-08-27 Thread Fabian Hueske
I agree to remove the slides section. A lot of the content is out-dated and hence not only useless but might sometimes even cause confusion. Best, Fabian Am Mo., 27. Aug. 2018 um 08:29 Uhr schrieb Renjie Liu < liurenjie2...@gmail.com>: > Hi, Stephan: > Can we put project wiki in some place? I

Re: would you join a Slack workspace for Flink?

2018-08-27 Thread Fabian Hueske
Hi, I don't think that recommending Gists is a good idea. Sure, well formatted and highlighted code is nice and much better than posting screenshots but Gists can be deleted. Deleting a Gist would make an archived thread useless. I would definitely support instructions on how to add code to a mail

Re: Can I only use checkpoints instead of savepoints in production?

2018-08-27 Thread vino yang
Hi Averell, This problem is caused by a heartbeat timeout between JM and TM. You can locate it by: 1) Check the network status of the node at the time, such as whether the connection with other systems is equally problematic; 2) Check the tm log to see if there are more specific reasons; 3) View t

Re: Can I only use checkpoints instead of savepoints in production?

2018-08-27 Thread Averell
Thank you Vino. I put the message in a tag, and I don't know why it was not shown in the email thread. I paste the error message below in this email. Anyway, it seems that was an issue with enabling checkpointing. Now I am able to get it turned on properly, and my job is getting restored automa