Hi All,
I'm using the Flink Kafka 0.8 connector. I need to check/monitor the offsets of
my Flink application's Kafka consumer.
When running this:
bin/kafka-consumer-groups.sh --zookeeper <zookeeper-host:port> --describe
--group <group-id>
I get the message: No topic available for consumer group provided. Why is
the consumer no
Try this. Your watermarks (WMs) need to move forward. Also, don't use the
system timestamp; use the timestamp of the last element seen as the reference,
since the elements are most likely lagging behind the system timestamp.
DataStream withTimestampsAndWatermarks = tuples
    .assignTimestampsAndWatermarks(new AssignerWithPeriodicWatermarks<...>() { ... });
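Filled out, the assigner could look roughly like this (untested sketch; I am
assuming the elements are Tuple2<String, Long> with the event time in f1 and a
maximum lag of 5 seconds -- adapt to your types):

DataStream<Tuple2<String, Long>> withTimestampsAndWatermarks = tuples
    .assignTimestampsAndWatermarks(new AssignerWithPeriodicWatermarks<Tuple2<String, Long>>() {

        private long maxSeenTimestamp = Long.MIN_VALUE;

        @Override
        public long extractTimestamp(Tuple2<String, Long> element, long previousTimestamp) {
            // use the element's own timestamp, not System.currentTimeMillis()
            maxSeenTimestamp = Math.max(maxSeenTimestamp, element.f1);
            return element.f1;
        }

        @Override
        public Watermark getCurrentWatermark() {
            // trail the highest element timestamp seen so far by the assumed max lag
            return new Watermark(maxSeenTimestamp == Long.MIN_VALUE
                    ? Long.MIN_VALUE : maxSeenTimestamp - 5000L);
        }
    });

(imports: org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks,
org.apache.flink.streaming.api.watermark.Watermark)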
Hello,
I tried setting the watermark to System.currentTimeMillis() - 5000L, event
timestamps are System.currentTimeMillis(). I do not observe the expected
behaviour of the PatternTimeoutFunction firing once the watermark moves
past the timeout "anchored" by a pattern match.
Here is the complete t
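For reference, the shape of such a setup in the 1.1 CEP API (untested sketch;
the Event type and the pattern names are placeholders). The timeout side
should only fire once a watermark moves past the first matched element's
timestamp plus the within() interval:

Pattern<Event, ?> pattern = Pattern.<Event>begin("first")
        .followedBy("second")
        .within(Time.seconds(5));

PatternStream<Event> patternStream = CEP.pattern(input, pattern);

DataStream<Either<String, String>> result = patternStream.select(
        new PatternTimeoutFunction<Event, String>() {
            @Override
            public String timeout(Map<String, Event> match, long timeoutTimestamp) {
                // invoked once the watermark passes the timeout timestamp
                return "timed out at " + timeoutTimestamp;
            }
        },
        new PatternSelectFunction<Event, String>() {
            @Override
            public String select(Map<String, Event> match) {
                return "matched: " + match.get("second");
            }
        });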
This is one of my challenges too:
1. The JavaScript rules are only applicable inside one operator (next,
followedBy, notFollowedBy), and the JavaScript rules can apply only to the
event in that operator. I make it a little more dynamic by creating a Rules
HashMap and adding rules with the names "firs
Hi Sameer,
I just replied to the earlier post, but I will copy it here:
We also have the same requirement - we want to allow the user to change the
matching patterns and have them take effect immediately. I'm wondering whether
the proposed trigger DSL takes us one step closer (I don't think it solves
the problem) or we have to dynamically generate the Flink job JAR file.
We also have the same requirement - we want to allow the user to change the
matching patterns and have them take effect immediately. I'm wondering whether
the proposed trigger DSL takes us one step closer (I don't think it solves the
problem) or we have to dynamically generate the Flink job JAR file.
In Flink 1.1.1, I am seeing what looks like a serialization issue of
org.apache.hadoop.conf.Configuration when used with
org.apache.flink.api.scala.hadoop.mapreduce.HadoopOutputFormat. When I use the
mapred.HadoopOutputFormat version, it works just fine.
Specifically, the job fails because "
I have used a JavaScript engine in my CEP to evaluate my patterns. Each
event is a list of named attributes (HashMap-like), and each event is attached
to a list of rules expressed as JavaScript code (see the example below with
one rule, but I can match as many rules). The rules are distributed over a
connec
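To illustrate the idea, here is a stripped-down (untested) sketch of
evaluating one such rule with the JDK 8 Nashorn engine; the rule text and
attribute names are made up, and inside a Flink function the engine should be
created in open() because ScriptEngine is not serializable:

import javax.script.Bindings;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import java.util.HashMap;
import java.util.Map;

public class JsRuleExample {

    public static void main(String[] args) throws Exception {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");

        // an event as a map of named attributes
        Map<String, Object> event = new HashMap<>();
        event.put("severity", 5);
        event.put("source", "firewall");

        // one rule expressed as JavaScript over the event's attributes
        String rule = "event.get('severity') > 3 && event.get('source') == 'firewall'";

        Bindings bindings = engine.createBindings();
        bindings.put("event", event);

        boolean matches = Boolean.TRUE.equals(engine.eval(rule, bindings));
        System.out.println("rule matched: " + matches);
    }
}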
Hello,
I am new to Apache Flink and am trying to build a CEP application using
Flink's API. One of the requirements is the ability to add/change patterns at
runtime for anomaly detection (maintaining the system's availability). Any
ideas on how I could do that?
For instance, If I have a stream of security eve
Assuming an element arrives with a timestamp earlier than the last emitted
watermark, would it just be dropped because the PatternStream does not have a
max-allowed-lateness method? In that case it appears that CEP cannot handle
late events out of the box yet.
If we do want to support late ev
Ah ok great, thanks! I will try upgrading sometime this week then.
Cheers,
Josh
On Tue, Oct 11, 2016 at 5:37 PM, Stephan Ewen wrote:
> Hi Josh!
>
> I think the master has gotten more stable with respect to that. The issue
> you mentioned should be fixed.
>
> Another big set of changes (the last
Hi Josh!
I think the master has gotten more stable with respect to that. The issue
you mentioned should be fixed.
Another big set of changes (the last big batch) is going in over the next few
days - this time for re-sharding timers (window operator) and other state
that is not organized by key.
If you
Thanks Till - This is helpful to know.
Sameer
On Tue, Oct 11, 2016 at 12:20 PM, Till Rohrmann
wrote:
> Hi Sameer,
>
> the CEP operator will take care of ordering the elements.
>
> Internally what happens is that the elements are buffered before being
> applied to the state machine. The operator
I will give it a try; my current time/watermark assigner extends
AscendingTimestampExtractor, so I can't override setting the watermark to
the last seen event timestamp.
Thanks for your replies.
/David
On Tue, Oct 11, 2016 at 6:17 PM, Till Rohrmann
wrote:
> But then no element later than the la
Hi Sameer,
the CEP operator will take care of ordering the elements.
Internally what happens is that the elements are buffered before being
applied to the state machine. The operator only applies the elements after
it has seen a watermark which is greater than the timestamps of the
elements being
But then the sources must not emit any element with a timestamp behind the
last emitted watermark. If that is the case, then this solution should work.
Cheers,
Till
On Tue, Oct 11, 2016 at 4:50 PM, Sameer W wrote:
> Hi,
>
> If you know that the events are arriving in order and a consistent lag,
> why not
Hi!
I think to some extent this is expected. There is some cleanup code that
deletes files and then issues parent directory remove requests. It relies
on the fact that the parent directory is only removed if it is empty (after
the last file was deleted).
Is this a problem right now, or just a co
Thanks, Till. I will wait for your response.
- LF
From: Till Rohrmann
To: user@flink.apache.org; lg...@yahoo.com
Sent: Tuesday, October 11, 2016 2:49 AM
Subject: Re: more complex patterns for CEP (was: CEP two transitions to the
same state)
The timeline is hard to predict to be
Hi Stephan,
Thanks, that sounds good!
I'm planning to upgrade to Flink 1.2-SNAPSHOT as soon as possible - I was
delaying upgrading due to the issues with restoring operator state you
mentioned on my other thread here:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-job-f
Hi,
I get many (multiple times per minute) errors in my Namenode HDFS logfile:
2016-10-11 17:17:07,596 INFO ipc.Server (Server.java:logException(2401)) -
IPC Server handler 295 on 8020, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.delete from datanode1:34872
Call#2361 Retry#0
org.apache.h
Hi Josh!
There are two ways to improve the RocksDB / S3 behavior
(1) Use the FullyAsync mode. It stores the data in one file, not in a
directory. Since directories are the "eventually consistent" part of S3, this
prevents many issues.
(2) Flink 1.2-SNAPSHOT has some additional fixes that circumven
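For (1), enabling that mode looks roughly like this (untested; the checkpoint
URI is a placeholder):

// RocksDB backend writing checkpoints to S3 (constructor may throw IOException)
RocksDBStateBackend backend =
        new RocksDBStateBackend("s3://my-bucket/flink-checkpoints");

// fully async mode: the snapshot is one file instead of a directory
backend.enableFullyAsyncSnapshots();

env.setStateBackend(backend);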
Hi,
If you know that the events are arriving in order and with a consistent lag,
why not just increment the watermark every time the getCurrentWatermark()
method is invoked, based on the autoWatermarkInterval (or by less, to be
conservative).
You can check if the watermark has changed since the arriva
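In code, that idea could look roughly like this (untested sketch; the Event
type and the lag/step constants are assumptions):

public class SteadyWatermarkAssigner implements AssignerWithPeriodicWatermarks<Event> {

    private static final long KNOWN_LAG_MS = 5000L;     // assumed constant lag
    private static final long WATERMARK_STEP_MS = 150L; // <= autoWatermarkInterval, to be conservative

    private long currentWatermark = Long.MIN_VALUE;
    private boolean sawElement = false;

    @Override
    public long extractTimestamp(Event element, long previousTimestamp) {
        // anchor the watermark to the element time minus the known lag
        currentWatermark = element.getTimestamp() - KNOWN_LAG_MS;
        sawElement = true;
        return element.getTimestamp();
    }

    @Override
    public Watermark getCurrentWatermark() {
        if (!sawElement && currentWatermark != Long.MIN_VALUE) {
            // no new element since the last invocation: advance anyway
            currentWatermark += WATERMARK_STEP_MS;
        }
        sawElement = false;
        return new Watermark(currentWatermark);
    }
}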
Hi Aljoscha,
Yeah I'm using S3. Is this a known problem when using S3? Do you have any
ideas on how to restore my job from this state, or prevent it from
happening again?
Thanks,
Josh
On Tue, Oct 11, 2016 at 1:58 PM, Aljoscha Krettek
wrote:
> Hi,
> you are using S3 to store the checkpoints, r
Thank you Fabian and Flavio for your help.
Best,
Yassine
2016-10-11 14:02 GMT+02:00 Flavio Pompermaier :
> I posted a workaround for that at
> https://github.com/okkam-it/flink-examples/blob/master/src/main/java/it/okkam/datalinks/batch/flink/datasourcemanager/importers/Csv2RowExample.java
>
Hi Kostas,
Thank you for your rapid response!
My use case is this:
for every incoming event, we want to age out the out-of-date events, count
the events in the window and send the result.
For example, the events are coming as follows:
(inline image omitted)
We want the following result:
(inline image omitted)
Hi,
are you windowing based on event time?
Cheers,
Aljoscha
On Fri, 7 Oct 2016 at 09:28 Fabian Hueske wrote:
> If you are using time windows, you can access the TimeWindow parameter of
> the WindowFunction.apply() method.
> The TimeWindow contains the start and end timestamp of a window (as Long
Hi,
you are using S3 to store the checkpoints, right? It might be that you're
running into a problem with S3 "directory listings" not being consistent.
Cheers,
Aljoscha
On Tue, 11 Oct 2016 at 12:40 Josh wrote:
Hi all,
I just have a couple of questions about checkpointing and restoring
state f
I posted a workaround for that at
https://github.com/okkam-it/flink-examples/blob/master/src/main/java/it/okkam/datalinks/batch/flink/datasourcemanager/importers/Csv2RowExample.java
On 11 Oct 2016 1:57 p.m., "Fabian Hueske" wrote:
> Hi,
>
> Flink's String parser does not support escaped quotes.
Hi,
Flink's String parser does not support escaped quotes. Your data contains a
doubled double quote (""), and the parser identifies this as the end of the
string field.
As a workaround, you can read the file as a regular text file, line by line
and do the parsing in a MapFunction.
Best, Fabian
2016-1
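A rough (untested) sketch of that workaround; the path, the two-field layout
and the naive quote handling are assumptions to adapt:

DataSet<Tuple2<String, String>> parsed = env
        .readTextFile("file:///path/to/data.csv")
        .map(new MapFunction<String, Tuple2<String, String>>() {
            @Override
            public Tuple2<String, String> map(String line) {
                // split on commas outside quotes (works while quotes stay balanced)
                String[] fields = line.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)", -1);
                return new Tuple2<>(unquote(fields[0]), unquote(fields[1]));
            }

            private String unquote(String s) {
                s = s.trim();
                if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
                    s = s.substring(1, s.length() - 1);
                }
                // turn the escaped "" back into a single quote
                return s.replace("\"\"", "\"");
            }
        });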
Hi,
If using CEP with event time, I have events which can be slightly out of
order, and I want to sort them by timestamp within their time windows before
applying CEP.
For example, if using 5 second windows and I use the following
ds2 = ds.keyBy.window(TumblingWindow(10 seconds).apply(/*Sort by
Ti
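The pre-sorting could look roughly like this (untested sketch; an Event POJO
with a getTimestamp() getter and a "key" field are placeholders):

DataStream<Event> sorted = ds
        .keyBy("key")
        .window(TumblingEventTimeWindows.of(Time.seconds(5)))
        .apply(new WindowFunction<Event, Event, Tuple, TimeWindow>() {
            @Override
            public void apply(Tuple key, TimeWindow window,
                              Iterable<Event> input, Collector<Event> out) {
                List<Event> buffer = new ArrayList<>();
                for (Event e : input) {
                    buffer.add(e);
                }
                // sort by event timestamp before re-emitting
                Collections.sort(buffer, new Comparator<Event>() {
                    @Override
                    public int compare(Event a, Event b) {
                        return Long.compare(a.getTimestamp(), b.getTimestamp());
                    }
                });
                for (Event e : buffer) {
                    out.collect(e);
                }
            }
        });

(As noted elsewhere in this thread, the CEP operator itself buffers and orders
elements by watermark, so this pre-sorting may turn out to be unnecessary.)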
Hi Zhangrucong,
Sliding windows only support time-based slide.
So your use-case is not supported out-of-the-box.
But, if you describe a bit more what you want to do,
we may be able to find a way together to do your job using
the currently offered functionality.
Kostas
> On Oct 11, 2016, at 1
Forgot to add parseQuotedStrings('"'). After adding it I'm getting the same
exception with the second code too.
2016-10-11 13:29 GMT+02:00 Yassine MARZOUGUI :
> Hi Fabian,
>
> I tried to debug the code, and it turns out a line in my csv data is
> causing the ArrayIndexOutOfBoundsException, here i
Hi Fabian,
I tried to debug the code, and it turns out a line in my csv data is
causing the ArrayIndexOutOfBoundsException, here is the exception
stacktrace:
java.lang.ArrayIndexOutOfBoundsException: -1
at org.apache.flink.types.parser.StringParser.parseField(StringParser.java:49)
at org.apache.f
Hello everyone,
I want to use the DataStream sliding window API. Looking at the API, I have a
question: does the sliding time window support sliding by every incoming
event?
Thanks in advance!
Hi all,
I just have a couple of questions about checkpointing and restoring
state from RocksDB.
1) In some cases, I find that it is impossible to restore a job from a
checkpoint, due to an exception such as the one pasted below[*]. In
this case, it appears that the last checkpoint is somehow co
Hi David,
the problem is still that there is no corresponding watermark saying that 4
seconds have now passed. With your code, watermarks will be periodically
emitted but the same watermark will be emitted until a new element arrives
which will reset the watermark. Thus, the system can never know
I will check it tonight.
Thanks
2016-10-11 11:24 GMT+02:00 Timo Walther :
> I have opened a PR (https://github.com/apache/flink/pull/2619). It would be
> great if you could try it and comment on whether it solves your problem.
>
> Timo
>
> Am 10/10/16 um 17:48 schrieb Timo Walther:
>
> I could reproduce t
Hi,
how does "doesn't work" manifest?
Cheers,
Aljoscha
On Wed, 28 Sep 2016 at 22:54 Anchit Jatana
wrote:
> Hi All,
>
> I have a use case where I need to create multiple source streams from
> multiple files and monitor the files for any changes using the
> "FileProcessingMode.PROCESS_CONTINUOUSLY"
The timeline is hard to predict to be honest. It depends a little bit on
how fast the community can proceed with these things. At the moment I'm
personally involved in other issues and, thus, cannot work on the CEP
library. I hope to get back to it soon.
Cheers,
Till
On Sat, Oct 8, 2016 at 12:42
I have opened a PR (https://github.com/apache/flink/pull/2619). It would be
great if you could try it and comment on whether it solves your problem.
Timo
Am 10/10/16 um 17:48 schrieb Timo Walther:
I could reproduce the error locally. I will prepare a fix for it.
Timo
Am 10/10/16 um 11:54 schrieb Alberto
Hi Yassine,
I ran your code without problems and got the correct result.
Can you provide the stacktrace of the exception?
Thanks, Fabian
2016-10-10 10:57 GMT+02:00 Yassine MARZOUGUI :
> Thank you Fabian and Stephan for the suggestions.
> I couldn't override "readLine()" because it's final, so w
Hi,
I have a low-throughput job (approx. 1000 messages per minute) that
consumes from Kafka and writes directly to HDFS. After an hour or so, I get
the following warnings in the Task Manager log:
2016-10-10 01:59:44,635 WARN org.apache.hadoop.hdfs.DFSClient
- Slow ReadProcessor
Hi Ken,
I think your solution should work.
You need to make sure though, that you properly manage the state of your
function, i.e., memorize all records which have been received but haven't
been emitted yet.
Otherwise records might get lost in case of a failure.
Alternatively, you can implement thi
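For illustration, the buffering part could look roughly like this with Flink
1.1's Checkpointed interface (untested; Event must be serializable and the
emit condition is a placeholder):

public class BufferingEmitter
        implements FlatMapFunction<Event, Event>, Checkpointed<ArrayList<Event>> {

    // records received but not yet emitted; included in every checkpoint
    private ArrayList<Event> pending = new ArrayList<>();

    @Override
    public void flatMap(Event value, Collector<Event> out) {
        pending.add(value);
        if (pending.size() >= 100) {   // placeholder emit condition
            for (Event e : pending) {
                out.collect(e);
            }
            pending.clear();
        }
    }

    @Override
    public ArrayList<Event> snapshotState(long checkpointId, long checkpointTimestamp) {
        return pending;
    }

    @Override
    public void restoreState(ArrayList<Event> state) {
        pending = state;
    }
}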