Re: Job restart hook

2018-04-04 Thread Kostas Kloudas
> works. One small question on the same is can we restore from checkpoints with > different parallelism? > > On Tue, Apr 3, 2018 at 2:48 AM, Kostas Kloudas <mailto:k.klou...@data-artisans.com>> wrote: > Hi Navneeth, > > If I understand correctly, you have a job

Re: bad data output

2018-04-03 Thread Kostas Kloudas
Hi Darshan, You can use side outputs [1] and a process function to split the data in as many streams as you want, e.g. correct, fixable and wrong. Each side output will be a separate stream that your can process individually. You can always send the “bad data” directly from your process functio

Re: Job restart hook

2018-04-03 Thread Kostas Kloudas
Hi Navneeth, If I understand correctly, you have a job with parallelism p=20, a TM goes down (eg. with 4 slots), and you want until the TM comes up, to run the job with p=16 and then re-running it with 20 again, when the TM comes up. If this is the case, one important thing to keep in mind is t

Re: Query regarding to CountinousFileMonitoring operator

2018-03-26 Thread Kostas Kloudas
Hi Puneet, If you mean that after processing a file, you want to move it to another directory outside the one containing the data to be processed, then I am afraid that this is currently not possible. This is because the whole logic of how to treat files is included in your FileInputFormat.

Re: Queryable State

2018-03-21 Thread Kostas Kloudas
Hi Vishal, As Fabian said, queryable state is just a feature that exposes the state kept within Flink, and it is not made to replace functionality that would otherwise be made by a sink. In the future the functionality will definitely evolve but for there are no discussions currently, for keepi

Re: State serialization problem when we add a new field in the object

2018-03-14 Thread Kostas Kloudas
Hi Konstantin, What you could do, is that you write and intermediate job that has the old ValueState “oldState” and the new one “newState”, with the new format. When an element comes in this intermediate job, you check the oldState if it is empty for that key or not. If it is null (empty), y

Re: Dynamic CEP https://issues.apache.org/jira/browse/FLINK-7129?subTaskView=all

2018-03-08 Thread Kostas Kloudas
Hi Vishal, Dawid (cc’ed) who was working on that stopped because in the past Flink did not support broadcast state. This is now added (in the master) and the implementation of FLINK-7129 will continue hopefully soon. Cheers, Kostas > On Mar 8, 2018, at 4:09 PM, Vishal Santoshi wrote: > > He

Re: Simple CEP pattern

2018-03-07 Thread Kostas Kloudas
k source code, but could you explain little bit > more what to do with it in this case ? > > Best, Esa > > From: Kostas Kloudas [mailto:k.klou...@data-artisans.com] > Sent: Wednesday, March 7, 2018 3:51 PM > To: Esa Heikkinen > Cc: user@flink.apache.org > Subject:

Re: CEP issue

2018-03-07 Thread Kostas Kloudas
t; Again I would also advise ( though not a biggy ) that strategic debug > statements in the CEP core would help folks to see what actually happens. We > instrumented the code to follow the construction of NFA that was very > helpful. > > On Wed, Mar 7, 2018 at 9:23 AM, Kostas

Re: CEP issue

2018-03-07 Thread Kostas Kloudas
sive ) and stop > a pattern. One question I had is that an NFA can be in a FinalState or a > StopState. > > What would constitute a StopState ? > > On Wed, Mar 7, 2018 at 8:47 AM, Kostas Kloudas <mailto:k.klou...@data-artisans.com>> wrote: > Hi Vishal, > >

Re: Simple CEP pattern

2018-03-07 Thread Kostas Kloudas
> Often I don’t know is it problem with “pattern” or “select”, because no > results.. Is there any way to debug CEP’s operations ? > > Best, Esa > > From: Kostas Kloudas [mailto:k.klou...@data-artisans.com] > Sent: Wednesday, March 7, 2018 2:54 PM > To: Esa Heikki

Re: CEP issue

2018-03-07 Thread Kostas Kloudas
itions are seen, one can prune. Simply speaking if n-m false have been > seen there is no way that out of n there will be ever m trues and thus > SharedBuffer can be pruned to the last true seen ( very akin to > skipToFirstAfterLast ). > > We will keep instrumenting the code ( which

Re: Simple CEP pattern

2018-03-07 Thread Kostas Kloudas
Hi Esa, You could try the examples either from the documentation or from the training. http://training.data-artisans.com/exercises/CEP.html Kostas > On Mar 7, 2018, at 11:32 AM, Esa Heikkinen > wrote: > > What would be the simplest worki

Re: Questions about the FlinkCEP

2018-03-01 Thread Kostas Kloudas
e in the same pattern or > even other places in the application. Is this possible ? > > Best, Esa > > From: Kostas Kloudas [mailto:k.klou...@data-artisans.com] > Sent: Thursday, March 1, 2018 11:35 AM > To: Esa Heikkinen > Cc: user@flink.apache.org > Subject: Re:

Re: Questions about the FlinkCEP

2018-03-01 Thread Kostas Kloudas
Hi Esa, The answers to the questions are inlined. > On Feb 28, 2018, at 8:32 PM, Esa Heikkinen wrote: > > Hi > > I have tried to learn FlinkCEP [1], but i have yet not found the clear > answers for questions: > 1) Whether the pattern of CEP is meant only for one data stream at the same > tim

Re: Important (proposed) CEP changes for Flink 1.5.

2018-02-21 Thread Kostas Kloudas
will > remain back ward compatible ( 1.4 to 1.5 ). > > On Wed, Feb 21, 2018 at 5:06 AM, Kostas Kloudas <mailto:kklou...@gmail.com>> wrote: > Hi all, > > Currently due to backwards compatibility there are some issues that seem to > be affecting CEP users tha

Re: Optimizing multiple aggregate queries on a CEP using Flink

2018-02-15 Thread Kostas Kloudas
Hi Sahil, Currently CEP does not support multi-query optimizations out-of-the-box. In some cases you can do manual optimizations to your code, but there is no optimizer involved. Cheers, Kostas > On Feb 15, 2018, at 11:12 AM, Sahil Arora wrote: > > Hi Timo, > Thanks a lot for the help. I will

Re: CEP for time series in csv-file

2018-02-08 Thread Kostas Kloudas
Hi Esa, I think the best place to start is the documentation available at the flink website. Some pointers are the following: CEP documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/libs/cep.html

Re: Flink CEP exception during RocksDB update

2018-02-06 Thread Kostas Kloudas
Hi Varun, The branch I previously sent you has been now merged to the master. So could you try the master and tell us if you see any change in the behavior? Has the problem been fixed, or has the message of the exception changed? Thanks, Kostas > On Jan 29, 2018, at 10:09 AM, Kostas Klou

Re: CEP issue

2018-02-06 Thread Kostas Kloudas
Thanks a lot Vishal! We are looking forward to a test case that reproduces the failure. Kostas > On Feb 2, 2018, at 4:05 PM, Vishal Santoshi wrote: > > This is the pattern. Will create a test case. > /** > * > * @param condition a single condition is applied as a acceptance criteria > *

Re: Trigger Time vs. Latest Acknowledgement

2018-01-31 Thread Kostas Kloudas
Hi Juho, I think that neither a) nor b) hold. The reported times are wall-clock times (or processing time in Flink terminology) when the checkpoint started and when it finished. What you want, if I understand correctly, is these times to reflect the event time of your pipeline. In other wo

Re: Flink CEP exception during RocksDB update

2018-01-29 Thread Kostas Kloudas
let me know if the problem is still there? In addition, are you using Scala with case classes or Java? Thanks for helping fix the problem, Kostas > On Jan 24, 2018, at 5:54 PM, Kostas Kloudas > wrote: > > Hi Varun, > > Thanks for taking time to look into it. Could you give a

Re: Flink CEP exception during RocksDB update

2018-01-24 Thread Kostas Kloudas
bugging runtime checkpoints would be > very helpful. > Thanks in advance for your assistance. > > Regards, > Varun > > On Jan 18, 2018, at 8:11 AM, Kostas Kloudas <mailto:k.klou...@data-artisans.com>> wrote: > >> Thanks a lot Varun! >> >> Ko

Re: CEP issue in 1.3.2. Does 1.4 fix this ?

2018-01-23 Thread Kostas Kloudas
Hi Vishal, Thanks for checking and glad to hear that your job works after the fix! As for the equals/hashcode question, if your question is if you have to implement exact equals() method and the corresponding hashcode() then the answer is yes. These methods are used when retrieving and cleaning

Re: Unable to query MapState

2018-01-22 Thread Kostas Kloudas
Hi Velu, I would recommend to switch to Flink 1.4 as the queryable state has been refactored to be compatible with all types of state. You can read more here: https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/queryable_state.html

Re: Flink CEP exception during RocksDB update

2018-01-18 Thread Kostas Kloudas
from my iPhone > > On Jan 15, 2018, at 10:21 AM, Kostas Kloudas <mailto:k.klou...@data-artisans.com>> wrote: > >> Hi Varun, >> >> This can be related to this issue: >> https://issues.apache.org/jira/browse/FLINK-8226 >> <https://issue

Re: Flink CEP exception during RocksDB update

2018-01-15 Thread Kostas Kloudas
Hi Varun, This can be related to this issue: https://issues.apache.org/jira/browse/FLINK-8226 which is currently fixed on the master. Could you please try the current master to see if the error persists? Thanks, Kostas > On Jan 15, 2018, at 4

Re: How to apply patterns from a source onto another datastream?

2017-12-28 Thread Kostas Kloudas
Hi Jayant, As Dawid said, currently dynamically updating patterns is currently not supported. There is also this question raised in the dev mailing list with the subject CEP: Dynamic Patterns. I will repeat my answer here so that we are on the same page: "To support this, we need 2 features

Re: Triggers in Flink CEP

2017-12-19 Thread Kostas Kloudas
Hi Shailesh, The pattern operator does not use Flink’s windowing mechanism internally. Conceptually you may think that there are windows in both, and this is true, but there are significant differences that prevent using Flink windowing for CEP. The above implies also that using triggers for ea

Re: Hardware Reference Architecture

2017-12-11 Thread Kostas Kloudas
ly-and-efficiently-operate-apache-flink/3 <http://www.slideshare.net/FlinkForward/flink-forward-berlin-2017-robert-metzger-keep-it-going-how-to-reliably-and-efficiently-operate-apache-flink/3> Kostas > On Dec 7, 2017, at 6:36 PM, Kostas Kloudas > wrote: > > Hi Hayden, >

Re: Hardware Reference Architecture

2017-12-07 Thread Kostas Kloudas
Hi Hayden, It would be nice if you could share a bit more details about your use case and the load that you expect to have, as this could allow us to have a better view of your needs. As a general set of rules: 1) I would say that the bigger your cluster (in terms of resources, not necessarily

Re: Testing CoFlatMap correctness

2017-12-07 Thread Kostas Kloudas
Hi Tovi, What you need is the TwoInputStreamOperatorTestHarness. This will allow you to do something like: TwoInputStreamOperatorTestHarness testHarness = new TwoInputStreamOperatorTestHarness<>(myoperator); testHarness.setup(); testHarness.open(); testHarness.processWatermark1(new Water

Re: Maintain heavy hitters in Flink application

2017-12-07 Thread Kostas Kloudas
Hi Max, You are right that Queryable State is not designed to be used as a means for a job to query its own state. In fact, given that you do not know the jobId of your job from within the job itself, I do not think you can use queryable state in your scenario. What you can do is to have a fla

Re: Are there plans to support Hadoop 2.9.0 on near future?

2017-11-29 Thread Kostas Kloudas
play/FLINK/Flink+Release+and+Feature+Plan#FlinkReleaseandFeaturePlan-Flink1.4> > that they estimated releasing 1.4 at September. Do you know if it will be > released this year or we may have to wait longer? > > Thanks a lot. > > De: Kostas Kloudas <mailto:k.klou...@data-arti

Re: Are there plans to support Hadoop 2.9.0 on near future?

2017-11-29 Thread Kostas Kloudas
Hi Oriol, As you may have seen form the mailing list we are currently in the process of releasing Flink 1.4. This is going to be a hadoop-free distribution which means that it should work with any hadoop version, including Hadoop 2.9.0. Given this, I would recommend to try out the release cand

Re: Bad entry in block exception with RocksDB

2017-11-23 Thread Kostas Kloudas
Hi Kien, Could you share some more information about your job? What operators are you using, the format of your elements, etc? Thanks, Kostas > On Nov 23, 2017, at 2:23 AM, Kien Truong wrote: > > Hi, > > We are seeing this exception in one of our job, whenever a check point or > save point i

Re: readFile, DataStream

2017-11-13 Thread Kostas Kloudas
Hi Juan, The problem is that once a file for a certain timestamp is processed and the global modification timestamp is modified, then all files for that timestamp are considered processed. The solution is not to remove the = from the modificationTime <= globalModificationTime; in ContinuousFil

Re: Queryable State Python

2017-11-10 Thread Kostas Kloudas
Hi Martin, I will try to reply to your questions inline: > On Nov 10, 2017, at 1:59 PM, Martin Eden wrote: > > Hi, > > Our team is looking at replacing Redis with Flink's own queryable state > mechanism. However our clients are using python. > > 1. Is there a python integration with the Flin

Re: When using Flink for CEP, can the data in Cassandra database be used for state

2017-11-09 Thread Kostas Kloudas
Hi Shyla, Happy to hear that you are experimenting with CEP! For enriching your input stream with data from Cassandra (or whichever external storage system) you could use: * either the AsyncIO functionality offered by Flink (https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream

Re: FlinkCEP, circular references and checkpointing failures

2017-11-03 Thread Kostas Kloudas
you happen to know when it's expected the 1.4 stable release? > > Thank you very much, > Federico > > 2017-11-03 15:25 GMT+01:00 Kostas Kloudas <mailto:k.klou...@data-artisans.com>>: > Perfect! thanks a lot! > > Kostas > >> On Nov 3, 2017, a

Re: FlinkCEP, circular references and checkpointing failures

2017-11-03 Thread Kostas Kloudas
Perfect! thanks a lot! Kostas > On Nov 3, 2017, at 3:23 PM, Federico D'Ambrosio > wrote: > > Hi Kostas, > > yes, I'm using 1.3.2. I'll try the current master and I'll get back to you. > > 2017-11-03 15:21 GMT+01:00 Kostas Kloudas <mail

Re: FlinkCEP, circular references and checkpointing failures

2017-11-03 Thread Kostas Kloudas
Hi Federico, I assume that you are using Flink 1.3, right? In this case, in 1.4 we have fixed a bug that seems similar to your case: https://issues.apache.org/jira/browse/FLINK-7756 Could you try the current master to see if it fixes your probl

Re: FlinkCEP: pattern application on a KeyedStream

2017-10-19 Thread Kostas Kloudas
Hi Federico, If I understand your question correctly, then yes, the application of a Pattern on a keyed stream is similar to the application of a map function. It will search for the pattern on each per-key stream of data. So there will be state (buffer with partial matches, queued elements, et

Re: async io operator timeouts

2017-10-10 Thread Kostas Kloudas
t; > On Mon, Oct 9, 2017 at 7:12 PM, Kostas Kloudas <mailto:k.klou...@data-artisans.com>> wrote: > Hi Karthik, > > Currently there is no way to provide a handler for timed-out requests. > So the behavior is exactly what you described. A request fails, an exception > is th

Re: serialization error when using multiple metrics counters

2017-10-09 Thread Kostas Kloudas
Hi Colin, Are you initializing your counters from within the open() method of you rich function? In other words, are you calling counter = getRuntimeContext.getMetricGroup.counter(“my counter”) from within the open(). The counter interface is not serializable. So if you instantiate the count

Re: Bucketing/Rolling Sink: How to overwrite method "openNewPartFile" - to append a new timestamp to part file path every time a new part file is being created

2017-10-09 Thread Kostas Kloudas
Hi Raja, To know about the method, I suppose you have looked at the source code of the sink. I think that including the timestamp of the element in the path file is not as easy as overriding the openNewPartFile. The reason is that the filenames serve as identities for the associated state of t

Re: async io operator timeouts

2017-10-09 Thread Kostas Kloudas
Hi Karthik, Currently there is no way to provide a handler for timed-out requests. So the behavior is exactly what you described. A request fails, an exception is thrown and the job is restarted. A handler would be a nice addition. If you want, you can open a JIRA about it and if would like to

Re: Windowed Stream Queryable State Support

2017-10-08 Thread Kostas Kloudas
Hi Vijay, If by “Windowed Stream Queryable State Support” you mean when will Flink allow to query the state of an in-flight window, then a version will be available in 1.4 yes. Cheers, Kostas > On Oct 7, 2017, at 2:55 PM, vijayakumar palaniappan > wrote: > > What is the state of Windowed St

Re: Issue with CEP library

2017-09-30 Thread Kostas Kloudas
e 148 events for 4 ids and I saw all > of them being captured. I did not change anything as far as delays in the > producer. > > The behavior is quite arbitrary and I am suspecting the cause is because of > bugs FLINK-7549 <https://issues.apache.org/jira/browse/FLINK-7549> and > FL

Re: Issue with CEP library

2017-09-28 Thread Kostas Kloudas
cep-not-recognizing-pattern> > > > > I would really appreciate your guidance on this. > > Best regards, > Ajay > > > > > > On Thu, Sep 28, 2017 at 1:38 AM, Kostas Kloudas <mailto:k.klou...@data-artisans.com>> wrote: > Hi Ajay, > > I will

Re: Issue with CEP library

2017-09-28 Thread Kostas Kloudas
Hi Ajay, I will look a bit more on the issue. But in the meantime, could you run your job with parallelism of 1, to see if the results are the expected? Also could you change the pattern, for example check only for the start, to see if all keys pass through. As for the code, you apply keyBy(0

Re: CustomPartitioner that simulates ForwardPartitioner and watermarks

2017-09-28 Thread Kostas Kloudas
zation/deserialization. Since the stream is large I > want to avoid the network shuffle at the least. > > I thought operator instances within a taskmanager would get the same indexId, > but apparently this is not the case. > > Thanks, > >> On 27. Sep 2017, at 17:16, K

Re: CustomPartitioner that simulates ForwardPartitioner and watermarks

2017-09-27 Thread Kostas Kloudas
Hi Yunus, I am not sure if I understand correctly the question. Am I correct to assume that you want the following? ———> time ProcessAProcessB Task1: W(3) E(1) E(2) E(5)

Re: StreamCorruptedException

2017-09-27 Thread Kostas Kloudas
Hi Sridhar, From looking at your code: 1) The “KafkaDataSource” is a custom source that you implemented? Does this source buffer anything? 2) The getStreamSource2 seems to return again a "new KafkaDataSource”. Can this be a problem? 3) You are working on processing time and you are simply detec

Re: the design of spilling to disk

2017-09-19 Thread Kostas Kloudas
Hi Florin, Unfortunately, there is no design document. The UnilateralSortMerger.java is used in the batch processing mode (not is streaming) and, in fact, the code dates some years back. I cc also Fabian as he may have more things to say on this. Now for the streaming side, Flink uses 3 state

Re: Queryable State

2017-09-15 Thread Kostas Kloudas
Hi Navneeth, If you increase the timeout, everything works ok? I suppose from your config that you are running in standalone mode, right? Any other information about the job (e.g. code and/or size of state being fetched) and the cluster setup that can help us pin down the problem, would be appr

Re: QueryableState - No KvStateLocation found for KvState instance

2017-09-13 Thread Kostas Kloudas
Hi, As Biplob said this means that the JM cannot find the requested state. The reasons can be one of the above but given that you said you are using the FlinkMiniCluster, I assume you are testing. In this case, it can also be that you start querying your state to soon after the job is submitted,

Re: BucketingSink never closed

2017-09-08 Thread Kostas Kloudas
Hi Flavio, If I understand correctly, I think you bumped into this issue: https://issues.apache.org/jira/browse/FLINK-2646 There is also a similar discussion on the BucketingSink here: http://apache-flink-mailing-list-archive.1008284.n3.nabble

Re: Flink CEP questions

2017-08-17 Thread Kostas Kloudas
ink/flink-docs-release-1.3/dev/libs/cep.html#combining-patterns> in the docs above) I am not aware of any examples but you can check this slides: https://www.slideshare.net/dataArtisans/kostas-kloudas-complex-event-processing-with-flink-the-state-of-flinkcep <https://www.slideshare.net

Re: Access to datastream from BucketSink

2017-08-16 Thread Kostas Kloudas
tream"); > } > } > > my question now is how do I access the data stream from within the S3Bucketer > so that I can generate a filename based on the data with the data stream. > > Thanks, > >> On 16 Aug 2017, at 12:55, Kostas Kloudas wrote: >

Re: Access to datastream from BucketSink

2017-08-16 Thread Kostas Kloudas
-release-1.3/api/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.html > > <https://ci.apache.org/projects/flink/flink-docs-release-1.3/api/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.html> > > >> On 16 Aug 2017, at 12:24, Kos

Re: Access to datastream from BucketSink

2017-08-16 Thread Kostas Kloudas
Hi Ant, I think you can do it by implementing your own Bucketer. Cheers, Kostas . > On Aug 16, 2017, at 1:09 PM, ant burton wrote: > > Hello, > > Given > >// Set StreamExecutionEnvironment >final StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvi

Re: FsStateBackend with incremental backup enable does not work with Keyed CEP

2017-08-12 Thread Kostas Kloudas
Hi Daiqing, I think Stefan is right and this will be fixed in the upcoming release. Could you open a JIRA for it with the Exception that you posted here? Thanks, Kostas > On Aug 12, 2017, at 10:05 AM, Stefan Richter > wrote: > > Hi, > > from a quick look, I would say this is likely a problem

Re: Flink CEP issues

2017-08-08 Thread Kostas Kloudas
Hi Daiqing, Is it possible to share your job in order to reproduce the problem? Or at least a minimal example. If you see from the JIRA, there is another user in https://issues.apache.org/jira/browse/FLINK-6321 who had a similar problem but we

Re: [POLL] Dropping savepoint format compatibility for 1.1.x in the Flink 1.4.0 release

2017-08-02 Thread Kostas Kloudas
+1 > On Aug 2, 2017, at 3:16 PM, Till Rohrmann wrote: > > +1 > > On Wed, Aug 2, 2017 at 9:12 AM, Stefan Richter > wrote: > >> +1 >> >> Am 28.07.2017 um 16:03 schrieb Stephan Ewen : >> >> Seems like no one raised a concern so far about dropping the savepoint >> format compatibility for 1.1 i

Re: data loss after implementing checkpoint

2017-07-31 Thread Kostas Kloudas
Hi Sridhar, Stephan already covered the correct sequence of actions in order for your second program to know its correct starting point. As far as the active/inactive rules are concerned, as Nico pointed out you have to somehow store in the backend which rules are active and which are not upon

Re: Integrating Flink CEP with a Rules Engine

2017-06-23 Thread Kostas Kloudas
The rules, or patterns supported by FlinkCEP are presented in the documentation link I posted earlier. Dynamically updating these patterns, is not supported yet, but there are discussions to add this feature soon. If the rules you need are supported by the current version of FlinkCEP, then yo

Re: Integrating Flink CEP with a Rules Engine

2017-06-23 Thread Kostas Kloudas
see some other > issues/consequences please comment. I also have the impression that > distribution is less of an issue because the rete network is > calculated only once and updates are not 'dynamic' (but I might be > wrong). > > Ismaël > > ps. I add Thomas in copy

Re: Integrating Flink CEP with a Rules Engine

2017-06-23 Thread Kostas Kloudas
Hi Jorn and Sridhar, It would be worth describing a bit more what these tools are and what are your needs. In addition, and to see what the CEP library already offers here you can find the documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/cep.html

Re: Flink CEP not emitting timed out events properly

2017-06-20 Thread Kostas Kloudas
Are you sure that after incrementing the wm by 1sec, there is no element that will come with a timestamp smaller than this? Or, that after 10sec of inactivity, no element will come with such a timestamp? Kostas > On Jun 20, 2017, at 4:18 PM, Biplob Biswas wrote: > > currentMaxTimestamp = cur

Re: Flink CEP not emitting timed out events properly

2017-06-20 Thread Kostas Kloudas
You are correct that elements are waiting until a watermark with a higher timestamp than theirs (or the patterns timeout) arrives. Now for the Watermark emitter, 1) how do you measure the 10sec in processing time and ii) by how much do you advance the watermark. If you advance it by a lot, th

Re: Flink CEP not emitting timed out events properly

2017-06-20 Thread Kostas Kloudas
Hi Biplob, You are correct that only a higher watermark leads to discarded events. Are you sure that your custom watermark emitter does not emit a high watermark? E.g. your partition has elements that are far out-of-order. In addition, are you sure that your elements are not simply buffered and

Re: Flink CEP not emitting timed out events properly

2017-06-16 Thread Kostas Kloudas
Hi Biplob, If you know what you want, you can always write your custom AssignerWithPeriodicWatermarks that does your job. If you want to just increase the watermark, you could simply check if you have received any elements and if not, emit a watermark with the timestamp of the previous watermark

Re: Flink CEP not emitting timed out events properly

2017-06-16 Thread Kostas Kloudas
Hi Biplob, With processing time there are no watermarks in the stream. The problem that you are seeing is because in processing time, the CEP library expects the “next” element to come, in order to investigate if some of the patterns have timed-out. Kostas > On Jun 16, 2017, at 1:29 PM, Biplob

Re: Java 8 lambdas for CEP patterns won't compile

2017-06-12 Thread Kostas Kloudas
Done. > On Jun 12, 2017, at 12:24 PM, Ted Yu wrote: > > Can you add link to this thread in the JIRA ? > > Cheers > > On Mon, Jun 12, 2017 at 3:15 AM, Kostas Kloudas <mailto:k.klou...@data-artisans.com>> wrote: > Unfortunately, there was no discussion as this

Re: Java 8 lambdas for CEP patterns won't compile

2017-06-12 Thread Kostas Kloudas
On Jun 12, 2017, at 11:51 AM, Ted Yu wrote: > > Do know which JIRA / discussion thread had the context for this decision ? > > I did a quick search in JIRA but only found FLINK-3681. > > Cheers > > On Mon, Jun 12, 2017 at 1:48 AM, Kostas Kloudas <mailto:k.klou...@d

Re: Java 8 lambdas for CEP patterns won't compile

2017-06-12 Thread Kostas Kloudas
Hi David and Ted, The documentation is outdated. I will update it today. Java8 Lambdas are NOT supported by CEP in Flink 1.3. Hopefully this will change soon. I will open a JIRA for this. Cheers, Kostas > On Jun 11, 2017, at 11:55 PM, Ted Yu wrote: > >

Re: Fink: KafkaProducer Data Loss

2017-06-01 Thread Kostas Kloudas
Hi Ninad, I think that Gordon could shed some more light on this but I suggest you should update your Flink version to at least the 1.2. The reason is that we are already in the process of releasing Flink 1.3 (which will come probably today) and a lot of things have changed/fixed/improved sin

Re: No Alerts with FinkCEP

2017-05-31 Thread Kostas Kloudas
You could also remove the autoWatermarkInterval, if you are satisfied with ProcessingTime. Although keep in mind that processingTime assigns timestamps to elements based on the order that they arrive to the operator. This means that replaying the same stream can give different results. If you

Re: No Alerts with FinkCEP

2017-05-31 Thread Kostas Kloudas
Hi Biplob, Great to hear that everything worked out and that you are not blocked! For the timestamp assigning issue, you mean that you specified no timestamp extractor in your job and all your elements had Long.MIN_VALUE timestamp right? Kostas > On May 31, 2017, at 1:28 PM, Biplob Biswas wrot

Re: Does RichFilterFunction work on multiple thread?

2017-05-26 Thread Kostas Kloudas
Your objects will be processed by a single thread. Kostas > On May 26, 2017, at 4:50 PM, luutuan wrote: > > Hi, when I have a set of objects goes through a RichFilterFunction, by > default, will the filter handle all objects in 1 single thread or will > divide the work to multiple threads? > Th

Re: No Alerts with FinkCEP

2017-05-26 Thread Kostas Kloudas
Hi Biplob, For the 1.4 version, the input of the select function has changed to expect a list of matching events (Map> map instead of Map map), as we have added quantifiers. Also the FIlterFunction has changed to SimpleCondition. The documentation is lagging a bit behind, but it is coming s

Fwd: invalid type code: 00

2017-05-26 Thread Kostas Kloudas
I am forwarding Stefan’s reply: Hi, this problem can be caused by https://issues.apache.org/jira/browse/FLINK-6044. It is fixed in 1.2.1 and 1.3. Best, Stefan >> Am 26.05.2017 um 16:16 schrieb Kostas Kloudas : >> >> Hi, >> >> Could you provide some info o

Fwd: invalid type code: 00

2017-05-26 Thread Kostas Kloudas
I am forwarding Stefan’s reply here: > Hi, > > this problem can be caused by > https://issues.apache.org/jira/browse/FLINK-6044 > <https://issues.apache.org/jira/browse/FLINK-6044>. It is fixed in 1.2.1 and > 1.3. > > Best, > Stefan > >> Am

Re: invalid type code: 00

2017-05-26 Thread Kostas Kloudas
Hi, Could you provide some info on when is this error happening? From what I see you are using the heap or fs state backend and you are failing to read the state back when restoring from a failure. The failure can be unrelated to this, but it could be useful if you could check the task manager l

Re: No Alerts with FinkCEP

2017-05-26 Thread Kostas Kloudas
One additional comment, from your code it seems you are using Flink 1.2. It would be worth upgrading to 1.3. The updated CEP library includes a lot of new features and bugfixes. Cheers, Kostas > On May 26, 2017, at 3:33 PM, Kostas Kloudas > wrote: > > Hi Biplob, > > From a

Re: No Alerts with FinkCEP

2017-05-26 Thread Kostas Kloudas
Hi Biplob, From a first scan of the code I cannot find sth fishy. You are working on ProcessingTime, given that you do not provide any time characteristic specification, right? In this case, if you print your partitionedInput stream, do you see elements flowing as expected? If elements are fl

Re: Are timers in ProcessFunction fault tolerant?

2017-05-26 Thread Kostas Kloudas
on with the key of the stream? > > On Fri, May 26, 2017 at 1:51 PM, Kostas Kloudas <mailto:k.klou...@data-artisans.com>> wrote: > Hi Moiz, > > state.clear() refers to the state that you have registered in your job, using > the getState() > from the runtimeConte

Re: invalid type code: 00

2017-05-26 Thread Kostas Kloudas
Hi! Can you give us some information about your job? Which Flink version you are using? What is your job doing (e.g. operators that you are using)? Which operator throws this exception? Which state-backend are you using? This exception means that you cannot retrieve your state because of some

Re: Are timers in ProcessFunction fault tolerant?

2017-05-26 Thread Kostas Kloudas
Hi Moiz, state.clear() refers to the state that you have registered in your job, using the getState() from the runtimeContext. Timers are managed by Flink’s timer service and they are cleaned up by Flink itself when the job terminates. Kostas > On May 26, 2017, at 6:41 AM, Moiz S Jinia wro

Re: Question about start with checkpoint.

2017-05-20 Thread Kostas Kloudas
Hi, In order to change parallelism, you should take a savepoint, as described here: https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/savepoints.html Kostas > On May 21, 2017, at 5:43 AM, yunfa

Re: FlinkCEP latency/throughput

2017-05-17 Thread Kostas Kloudas
Hello Alfred, As a first general remark, Flink was not optimized for multicore deployments but rather for distributed environments. This implies overheads (serialization, communication etc), when compared to libs optimized for multicores. So there may be libraries that are better optimized for t

Re: Timer fault tolerance in Flink

2017-05-17 Thread Kostas Kloudas
Hi Rahul, The timers are fault tolerant and their timestamp is the absolute value of when to fire. This means that if you are at time t = 10 and you register a timer “10 ms from now”, the timer will have a firing timestamp of 20. This is checkpointed, so the “new machine” that takes over the fai

Re: Stateful streaming question

2017-05-17 Thread Kostas Kloudas
needed. At every incoming event, check the > previous state and update/output to kafka or whatever data store you are > using. > > > > > > Thanks > > Ankit > > > > From: Flavio Pompermaier mailto:pomperma...@okkam.it>> > Date: Tuesda

Re: Stateful streaming question

2017-05-16 Thread Kostas Kloudas
Hi Flavio, From what I understand, for the first part you are correct. You can use Flink’s internal state to keep your enriched data. In fact, if you are also querying an external system to enrich your data, it is worth looking at the AsyncIO feature: https://ci.apache.org/projects/flink/flink-

Re: Problem with Kafka Consumer

2017-05-16 Thread Kostas Kloudas
; suggested. The sink was blocking the reads making the Kafka pipeline stall, > due to a misconfiguration of an internal client that is calling an external > service. > > Thanks for your help, > Simone. > > On 16/05/2017 14:01, Kostas Kloudas wrote: >> Hi Simone, &

Re: Problem with Kafka Consumer

2017-05-16 Thread Kostas Kloudas
Hi Simone, I suppose that you use messageStream.keyBy(…).window(…) right? .windowAll() is not applicable to keyedStreams. Some follow up questions are: In your logs, do you see any error messages? What does your RowToQuery() sink do? Can it be that it blocks and the back pressure makes all th

Re: assignTimestampsAndWatermarks not working as expected

2017-05-04 Thread Kostas Kloudas
Hi Jayesh, Glad that it finally worked! From a first look, I cannot spot anything wrong with the code itself. The only thing I have to note is that the locations of the logs and the printouts you put in your code differ and normally they are not printed in the console. Thanks, Kostas > On Ma

Re: Long running time based Patterns

2017-05-04 Thread Kostas Kloudas
> Am on 1.3 - expect it'll be fixed by the time stable is out? > > Thanks! > > Moiz > > — > sent from phone > > On 04-May-2017, at 8:12 PM, Kostas Kloudas <mailto:k.klou...@data-artisans.com>> wrote: > > Hi Moiz, > > You are on Flink 1.

Re: OperatorState partioning when recovering from failure

2017-05-04 Thread Kostas Kloudas
Hi Seth, Upon restoring, splits will be re-shuffled among the new tasks, and I believe that state is repartitioned in a round robin way (although I am not 100% sure so I am also including Stefan and Aljoscha in this). The priority queues will be reconstructed based on the restored elements. So

<    1   2   3   4   5   >