Hi Paul,
This blog post [1] includes an example of an early trigger that should
pretty much do what you are looking for.
This one [2] explains the windowing mechanics of Flink (window assigner,
trigger, function, etc).
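As a quick illustration (my own hedged sketch, not code from the posts): with the Scala API you could attach an early-firing trigger to a long sliding window like this, assuming a keyed DataStream[(String, Long)] called events and placeholder intervals:

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.streaming.api.windowing.triggers.ContinuousEventTimeTrigger

// Sketch: a 24-hour window sliding by 1 hour that fires an early update
// every 10 minutes of event time instead of only at the window end.
val updates = events
  .keyBy(_._1)
  .window(SlidingEventTimeWindows.of(Time.hours(24), Time.hours(1)))
  .trigger(ContinuousEventTimeTrigger.of(Time.minutes(10)))
  .reduce((a, b) => (a._1, a._2 + b._2))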
Hope this helps,
Fabian
[1]
https://www.mapr.com/blog/essential-guide-streami
Hi all,
I'm attempting to use a long SlidingEventTime window (duration 24 hours), but
I would like updates more frequently than every 24 hours. I naively
attempted to use a simple CountTrigger(10) to give me the window every time 10
samples are collected; however, the window processing func
Yes, let me describe an example use-case that I'm trying to implement
efficiently within Flink.
We've been asked to aggregate per-user data on a daily level, and from there
produce aggregates on a variety of time frames. For example, 7 days, 30 days,
180 days, and 365 days.
We can talk about t
Hi Stephan,
Thank you for your reply.
When can I expect this feature to be integrated into master or a release
version?
We are going to get production data (financial data) at the end of October,
so we want to have this feature before that.
Regards,
Vinay Patil
On Mon, Aug 29, 2016 at 11:15 AM, Steph
Hi,
this is the JIRA for Mesos support:
https://issues.apache.org/jira/browse/FLINK-1984
The first part of it has been merged today:
https://github.com/apache/flink/pull/2315
So it's likely going to be fully functional for the 1.2 release.
You can run multiple TaskManager instances per machine, an
Hi!
Null is indeed not supported for some basic data types (tuples / case
classes).
Can you use Option for nullable fields?
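For example, a minimal sketch with a made-up event type:

// Sketch: wrap nullable fields in Option so the case-class serializer
// never sees a null value. `Event` is only for illustration.
case class Event(id: Long, user: Option[String], score: Option[Double])

// Construct with None/Some instead of null:
val e = Event(42L, user = None, score = Some(0.9))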
Stephan
On Mon, Aug 29, 2016 at 8:04 PM, Jack Huang wrote:
> Hi all,
>
> It seems like Flink does not allow passing case class objects with
> null-valued fields to the
Hi all,
It seems like Flink does not allow passing case class objects with
null-valued fields to the next operators. I am getting the following error
message:
Caused by: java.lang.RuntimeException: Could not forward element to
next operator
    at org.apache.flink.streaming.runtime.task
Hi!
The way that the JIRA issue you linked will achieve this is by hooking into
the network stream pipeline directly, and encrypt the raw network byte
stream. We built the network stack on Netty, and will use Netty's SSL/TLS
handlers for that.
That should be much more efficient than manual encryp
Hi Ufuk,
This is regarding this issue
https://issues.apache.org/jira/browse/FLINK-4404
How can we achieve this? I am able to decrypt the data from Kafka coming
in, but I want to make sure that the data is encrypted when flowing between
TMs.
One approach I can think of is to decrypt the data at
Hi Robert,
Thanks for your reply. A few follow-up questions:
Is there any timeline for Mesos support?
In standalone installations, how about having one Task slot per TaskManager
and multiple TaskManager instances per machine?
On Mon, Aug 29, 2016 at 7:13 PM, Robert Metzger wrote:
> Hi,
>
> for i
Thank you very much Chesnay, your pointer on "java.rmi.server.hostname"
solved the issue. Now I am able to get Flink metrics in 1.1.1.
The "host" setting in the JMX reporter was there because I thought of not
exposing the JMX metrics publicly, since my Flink cluster is in the AWS
cloud. So I tried to bind it to a private
I've noticed this too. The UI incorrectly tries to load its resources (CSS, JS)
from root. Try adding a trailing slash to the URL.
-Shannon
From: Trevor Grant <trevor.d.gr...@gmail.com>
Date: Friday, August 26, 2016 at 12:52 PM
To: <user@flink.apache.org>
Subject: Flink WebUI on YA
Hello,
The fact that you can't access JMX in 1.0.3, even though you set all the JVM
JMX options, is unrelated to Flink; it means your JMX setup in general is broken.
Note that in order to remotely access JMX you usually have to set
"java.rmi.server.hostname" system-property on the host as well.
Regarding
Hi everyone,
I have an integration test for which I use a LocalStreamEnvironment.
Currently, the Flink job is started in a separate thread, which I
interrupt after some time, and then I do some assertions.
In this situation, is there a better way to stop/cancel a running job in
LocalStreamEnvironmen
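For reference, a hedged sketch of the pattern I described (names and the
sleep budget are placeholders):

import org.apache.flink.streaming.api.scala._

// Sketch: run the job in a background thread and stop it from the test
// after a fixed budget. Crude, which is why I'm asking for a cleaner way.
val env = StreamExecutionEnvironment.createLocalEnvironment()
// ... build the topology under test on `env` ...
val runner = new Thread(new Runnable {
  override def run(): Unit =
    try env.execute("integration-test") catch { case _: Exception => }
})
runner.start()
Thread.sleep(5000)   // let the job process some test input
runner.interrupt()   // stop the job thread
runner.join()
// ... assertions on the collected output ...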
That's just awesome!
Thanks,
Steffen
On August 29, 2016 3:39:52 PM GMT+02:00, Stephan Ewen wrote:
>You are thinking too complicated here ;-) because Flink internally
>already
>does all the logic of monitoring the minimum watermark across stream
>partitions.
>As long as you match the Flink source
Hi,
for isolation, we recommend using YARN, or soon Mesos.
For standalone installations, you'll need to manually set up multiple
independent Flink clusters within one physical cluster if you want them to
be isolated.
Regards,
Robert
On Mon, Aug 29, 2016 at 1:41 PM, Abhishek Agarwal
wrote:
>
Any other opinion on this?
Thanks :)
Aris
From: aris kol
Sent: Sunday, August 28, 2016 12:04 AM
To: user@flink.apache.org
Subject: Re: Accessing state in connected streams
In the implementation I am passing just one CoFlatMapFunction, where flatMap1,
which
Hi,
I think the JMX port is logged, but not accessible through the REST
interface of the JobManager.
I think it's a useful feature. If you want, you can file a JIRA for it.
On Mon, Aug 29, 2016 at 12:14 PM, Sreejith S wrote:
> Thank you Stephen ! That helps,
>
> Is it possible to get the JMX_POR
Hi,
I think in Flink 1.1.1 JMX will be started on port 8080, 8081 or 8082 (on
the JM, 8081 is probably occupied by the web interface).
On Mon, Aug 29, 2016 at 1:25 PM, Sreejith S wrote:
> Hi Chesnay,
>
> I added the below configuration in flink-conf in each taskmanagers. (flink
> 1.0.3 version
You are thinking too complicated here ;-) because Flink internally already
does all the logic of monitoring the minimum watermark across stream
partitions.
As long as you match the Flink source parallelism to the number of Kinesis
shards, that part is taken care of for you.
You only need to publis
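For the source side, a hedged sketch of assigning timestamps and watermarks
(the Event type, its timestamp field, and the 10-second bound are assumptions):

import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

// Sketch: derive watermarks from the event timestamps, allowing up to
// 10 seconds of out-of-orderness.
case class Event(id: String, timestamp: Long)

val withTimestamps: DataStream[Event] = kinesisStream
  .assignTimestampsAndWatermarks(
    new BoundedOutOfOrdernessTimestampExtractor[Event](Time.seconds(10)) {
      override def extractTimestamp(e: Event): Long = e.timestamp
    })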
Hi there,
I'm feeding a Flink stream with events from a Kinesis stream and I'm
looking for some guidance on how to enable event time in the Flink stream.
I've read through the documentation and it seems like I want to add
events that carry watermark information to the Kinesis stream and
subs
Hi,
this JIRA is a good starting point:
https://issues.apache.org/jira/browse/FLINK-3755
If you don't care about processing guarantees and you are using a stateless
streaming job, you can implement a simple Kafka consumer that uses Kafka's
consumer group mechanism. I recently implemented such a Ka
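Not the consumer mentioned above, but a hedged sketch of what such a
consumer-group based client could look like (brokers, group and topic are
placeholders):

import java.util.{Arrays, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import scala.collection.JavaConversions._

// Sketch: rely on Kafka's consumer-group mechanism for partition
// assignment and rebalancing; no Flink checkpointing, so no processing
// guarantees beyond Kafka's committed offsets.
val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
props.put("group.id", "parallel-consumers")
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

val consumer = new KafkaConsumer[String, String](props)
consumer.subscribe(Arrays.asList("my-topic"))
while (true) {
  for (record <- consumer.poll(1000)) {
    println(s"${record.partition()}/${record.offset()}: ${record.value()}")
  }
}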
Hi Rss,
> why Flink implements different serialization schemes for keyed and non
keyed messages for Kafka?
The non-keyed serialization schema is a basic schema, which works for most
use cases.
For advanced users who need access to the key, offsets, the partition or
topic, there's the keyed ser
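A hedged sketch of the keyed variant (the output tuple layout is my own
choice for illustration):

import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.api.scala._
import org.apache.flink.streaming.util.serialization.KeyedDeserializationSchema

// Sketch: surface key, topic, partition and offset next to the payload
// instead of discarding them.
class KeyedRecordSchema extends KeyedDeserializationSchema[(String, String, String, Int, Long)] {

  override def deserialize(key: Array[Byte], message: Array[Byte],
                           topic: String, partition: Int,
                           offset: Long): (String, String, String, Int, Long) =
    (if (key == null) "" else new String(key, "UTF-8"),
     new String(message, "UTF-8"), topic, partition, offset)

  override def isEndOfStream(next: (String, String, String, Int, Long)): Boolean = false

  override def getProducedType: TypeInformation[(String, String, String, Int, Long)] =
    createTypeInformation[(String, String, String, Int, Long)]
}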
Hello, thanks for the answer.
1. There is currently no way to avoid the repartitioning. When you do a
> keyBy(), Flink will shuffle the data through the network. What you would
> need is a way to tell Flink that the data is already partitioned. If you
> would use keyed state, you would also need t
The JobManager UI starts when running Flink on YARN.
The address of the UI is registered at YARN, so you can also access it
through YARNs command line tools or its web interface.
On Fri, Aug 26, 2016 at 7:28 PM, Trevor Grant
wrote:
> Stephan,
>
> Will the jobmanager-UI exist? E.g. if I am runni
The "env.java.home" variable is only evaluated by the start scripts, not
the YARN code.
The solution you've mentioned earlier is a good work around in my opinion.
On Fri, Aug 26, 2016 at 3:48 AM, Renkai wrote:
> It seems that this config variant only affects local clusters and standalone
> clust
Hi rss,
Concerning your questions:
1. There is currently no way to avoid the repartitioning. When you do a
keyBy(), Flink will shuffle the data through the network. What you would
need is a way to tell Flink that the data is already partitioned. If you
would use keyed state, you would also need to
Hi,
Does anyone have an example of a working Parquet sink for the Flink
DataStreaming API?
In Flink 1.1 a new RollingSink with custom writers was released. According to
[FLINK-3637] it was done to allow working with ORC and Parquet formats. Maybe
someone has already used it for Parquet?
I’ve found an
Hi,
you could use ZooKeeper if you want to dynamically change the DB name /
credentials at runtime.
The GlobalJobParameters are immutable at runtime, so you cannot pass
updates through them to the cluster.
They are intended for parameters for all operators/the entire job and the
web interface.
Re
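For completeness, a hedged sketch of the GlobalJobParameters route (the keys
are placeholders); remember the values are fixed at submission time:

import org.apache.flink.api.java.utils.ParameterTool

// Sketch: set once when the job is submitted; visible to all operators
// and in the web interface, but not updatable while the job runs.
val params = ParameterTool.fromArgs(args) // e.g. --db.name mydb --db.user app
env.getConfig.setGlobalJobParameters(params)

// Inside a RichFunction you can read them back:
// val p = getRuntimeContext.getExecutionConfig
//   .getGlobalJobParameters.asInstanceOf[ParameterTool]
// val dbName = p.get("db.name")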
In a standalone Flink cluster, the TaskManager has multiple slots and can
run different applications in different slots, but still within the same JVM.
Given that two applications are running in the same process, how is CPU
and memory isolation achieved?
Are there any recommendations to achieve isol
Hi Chesnay,
I added the below configuration to flink-conf on each TaskManager (Flink
1.0.3):
# Enable JMX
env.java.opts: -Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-D
Hi,
this page explains how to relocate classes in a fat jar:
https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html
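In short, a hedged example of such a relocation, to be placed in the shade
plugin's <configuration> (the package names are placeholders):

<!-- Sketch: relocate the conflicting Guava packages inside the fat jar so
     they no longer clash with the other library's bundled version. -->
<relocations>
  <relocation>
    <pattern>com.google.common</pattern>
    <shadedPattern>shaded.com.google.common</shadedPattern>
  </relocation>
</relocations>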
Regards,
Robert
On Wed, Aug 10, 2016 at 10:31 PM, Janardhan Reddy <janardhan.re...@olacabs.com> wrote:
> We don't use guava directly, we use another l
Hello,
can you post the jmx config entries and give us more details on how you
want to access it?
Regards,
Chesnay
On 29.08.2016 12:09, Sreejith S wrote:
Hi All,
I am using Flink-1.1.1 and I enabled JMX metrics in the configuration
file. In the task manager log I can see that JMX is running.
Is th
Hi,
As I found, the problem was rebalance(), because the data arrives within 1
minute (it re-processes old events); it's a bit strange that it worked when
the watermark was configured as 10 minutes.
After removing rebalance(), it works as expected with a smaller watermark
setting.
DataStream streams = env.addS
Thank you Stephen! That helps.
Is it possible to get the JMX_PORT kind of details from the TaskManagers?
My program will connect to the JobManager, get all the TaskManager IPs and
JMX ports, and create a JMX URL automatically. This is what I am trying to
achieve.
Is it possible?
Thanks,
On Mon, Aug
Hi All,
I am using Flink-1.1.1 and I enabled JMX metrics in the configuration file.
In the task manager log I can see that JMX is running.
Are these metrics exposed only through Flink's Metrics API?
I tried to connect to the Flink JMX URL using plain javax, but connections
are getting refused.
Thanks,
Thanks Stephan. I understand that guaranteeing exactly-once semantics with
dynamic scaling is tough. If I were to let go of the exactly-once
requirement, is it not possible in the current version? It would be really
great if you could point me to the JIRA tracking this work.
On Mon, Aug 29, 2016 at 2:30
Hello,
The result contains (a,Map(3 -> rt)) because reduce emits all
intermediate results (sometimes called a "rolling reduce"). It's
designed this way because Flink streams are generally infinite, so
there is no last element after which you could emit the "final" result.
However, you can use windows to get one result per key and window, as
sketched below.
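A hedged sketch, reusing your pairs2 stream (the 10-second window is a
placeholder):

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

// Sketch: the same reduce, but scoped to a window, so one merged Map is
// emitted per key and window instead of every intermediate value.
val re = pairs2
  .keyBy(0)
  .timeWindow(Time.seconds(10))
  .reduce((x1, x2) => (x1._1, x2._2 ++ x1._2))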
+1 for supporting Siddhi as a Flink operator on DataStream. Siddhi supports
rich CEP features beyond pattern matching, and it also supports an extensible
snapshot/restore interface for fault tolerance, which should be easy to
integrate with Flink's state management (tested with some draft code). We
(the Apache Eagle comm
Hi Aparup,
I haven't looked in detail at Siddhi's internals especially the way it
handles distributed execution. I've only seen that it uses Hazelcast for a
distributed in-memory cache. If a distributed cache for the communication
between instances of the CEP operator is needed, then you would hav
You should be able to call the Monitor handler of the JobManager:
http://jobmanagerhost:8081/taskmanagers
That gives you a JSON response like this:
{ "taskmanagers [
{ "id" : "7c8835b89acf533cb8a5119dbcaf4b4f",
"path" : "akka.tcp://flink@127.0.1.1:56343/user/taskmanager",
"dataPort" : 5
Hi, in Flink the DataStream API has a reduce transformation, but the result
does not satisfy me. For example:
val pairs2 = env.fromCollection(Array(
  ("a", Map(3 -> "rt")), ("a", Map(4 -> "yt")), ("b", Map(5 -> "dfs"))))
val re = pairs2.keyBy(0).reduce((x1, x2) => (x1._1, x2._2 ++ x1._2))
re.map
Hi Flavio,
yes, Joda should not be excluded.
This will be fixed in Flink 1.1.2.
Cheers, Fabian
2016-08-29 11:00 GMT+02:00 Flavio Pompermaier :
> Hi to all,
> I've tried to upgrade from Flink 1.0.2 to 1.1.1 so I've copied the
> excludes of the maven shade plugin from the java quickstart pom bu
Hi!
There is a lot of work in progress on that feature, and it looks like you
can expect the next version to have some upscale/downscale feature that
maintains exactly-once semantics.
Stephan
On Mon, Aug 29, 2016 at 9:00 AM, Abhishek Agarwal
wrote:
> Is it possible to upscale or downscale a f
Hi to all,
I've tried to upgrade from Flink 1.0.2 to 1.1.1, so I've copied the
excludes of the maven-shade-plugin from the Java quickstart pom, but it
includes the exclude of Joda (which is not contained in the flink-dist
jar). That causes my job to fail.
Shouldn't it be removed from the exclude lis
Nice idea!
If you look at the current CEP library, it is simply a custom operator.
Often, you can even get away with a custom FlatMapFunction that uses state:
https://ci.apache.org/projects/flink/flink-docs-master/dev/state.html#using-the-keyvalue-state-interface
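To make that concrete, a minimal sketch of such a stateful function (types
and names are made up for illustration):

import org.apache.flink.api.common.functions.RichFlatMapFunction
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.util.Collector

// Sketch: count events per key and emit the running count; apply it on a
// keyed stream, e.g. stream.keyBy(_._1).flatMap(new CountingFlatMap).
class CountingFlatMap extends RichFlatMapFunction[(String, Long), (String, Long)] {
  private var count: ValueState[java.lang.Long] = _

  override def open(conf: Configuration): Unit = {
    count = getRuntimeContext.getState(
      new ValueStateDescriptor("count", classOf[java.lang.Long], java.lang.Long.valueOf(0L)))
  }

  override def flatMap(in: (String, Long), out: Collector[(String, Long)]): Unit = {
    val newCount = count.value() + 1
    count.update(newCount)
    out.collect((in._1, newCount))
  }
}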
Stephan
On Mon, Aug 29, 2016 at
Hi,
that would certainly be possible. What do you think could be gained by
having knowledge of the current watermark in the WindowFunction, in a
specific case?
Cheers,
Aljoscha
On Wed, 24 Aug 2016 at 23:21 Shannon Carey wrote:
> What do you think about adding the current watermark to
I think Siddhi is a fairly mature CEP library. I am thinking it should
co-exist with the existing CEP library. My thinking is that we should be able
to use Siddhi QL / Siddhi patterns on top of Flink data streams. This can
co-exist naturally with the existing Java/Scala based Flink CEP library. I am still
Hello Aparup,
could you provide more information about Siddhi? How mature is it; how
is the community? How does it compare to Flink's CEP library?
What should this integration look like? Are you proposing to replace the
current CEP library, or will they co-exist with different use-cases fo
Is it possible to upscale or downscale a Flink application without
re-deploying (similar to rebalancing in Storm)?
--
Regards,
Abhishek Agarwal