Re: Duplicated data when using Externalized Checkpoints in a Flink Highly Available cluster

2017-05-30 Thread F.Amara
Hi Gordon,

Thanks a lot for the reply.
The events are produced using a KafkaProducer, submitted to a topic and
thereby consumed by the Flink application using a FlinkKafkaConsumer. I
verified that during a failure recovery scenario (of the Flink application)
the KafkaProducer was not interrupted, so no duplicated values were sent
from the data source. I observed the output from the FlinkKafkaConsumer
and noticed duplicates starting from that point onwards. Is the
FlinkKafkaConsumer capable of introducing duplicates?

How can I implement exactly-once processing for my application? Could you
please guide me on what I might have missed?

Thanks,
Amara
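For reference, exactly-once in Flink refers to state consistency inside the job; a source that is replayed on recovery can still re-emit records to a sink that is not idempotent or transactional. A minimal sketch of enabling exactly-once checkpointing together with externalized checkpoints (interval and cleanup mode are illustrative) might look like this:

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Take a checkpoint every 5 seconds with exactly-once state guarantees.
env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE);

// Keep externalized checkpoints when the job is cancelled so a restart can resume from them.
env.getCheckpointConfig().enableExternalizedCheckpoints(
        CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);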






Kafka partitions -> task slots? (keyed stream)

2017-05-30 Thread Moiz S Jinia
For a keyed stream (where the key is also the message key in the source
kafka topic), is the parallelism of the job restricted to the number of
partitions in the topic?

Source topic has 5 partitions, but available task slots are 12. (3 task
managers each with 4 slots)

Moiz


Re: New "Powered by Flink" success case

2017-05-30 Thread Rosellini, Luca
Thank you Fabian!

KEEDIO

Luca Rosellini

+34 667 24 38 57

www.keedio.com

C/ Virgilio 25, Pozuelo de Alarcón

2017-05-27 0:08 GMT+02:00 Fabian Hueske :

> Hi Luca,
>
> thanks for sharing this exciting use case!
> I added KEEDIO to Flink's Powered By list:
> https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink
>
> Thank you and best regards,
> Fabian
>
> 2017-05-25 10:35 GMT+02:00 Rosellini, Luca :
>
>> Hello everybody,
>> I am posting this here following the guidelines I've found in the
>> "Powered by Flink" page.
>>
>> At KEEDIO (http://www.keedio.com) we use Apache Flink CEP API in a log
>> aggregation solution 
>> for Red Hat OpenStack that enables us to discover operational anomalies
>> that otherwise would be very hard to spot.
>>
>> We think it would be interesting to add this use case in the "Powered by
>> Flink" page.
>>
>> Let me know if you need further information.
>>
>> Thanks in advance,
>>
>> KEEDIO
>>
>> Luca Rosellini
>>
>> +34 667 24 38 57
>>
>> www.keedio.com
>>
>> C/ Virgilio 25, Pozuelo de Alarcón
>>
>
>


Re: No Alerts with FlinkCEP

2017-05-30 Thread Dawid Wysakowicz
Hi Biplob,

The message you mention should not be a problem here. It just says you
can't use your events as POJOs (e.g. you can't use keyBy("chargedAccount")).
Your code seems fine, and without some example data I think it will be hard
to help you.

As for PART 2 of your first email:
In 1.3 we introduced the NOT pattern, but right now it does not support time
ranges in which a pattern should not occur. What you can do, though, is
specify a positive pattern like ("a" -> "b" within 1s) and select the
timed-out patterns, which in fact are the ones you want to trigger
alerts for.
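A rough sketch of that approach with the 1.3 CEP API (the Event type, its getType() accessor, and the Alert/Match result types are hypothetical placeholders; "input" is assumed to be an existing DataStream<Event>):

import java.util.List;
import java.util.Map;

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.PatternTimeoutFunction;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.types.Either;

// "a" followed by "b" within 1 second; the timed-out partial matches are
// exactly the cases where "b" did not arrive in time.
Pattern<Event, ?> pattern = Pattern.<Event>begin("a")
        .where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event e) {
                return "a".equals(e.getType());
            }
        })
        .followedBy("b")
        .where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event e) {
                return "b".equals(e.getType());
            }
        })
        .within(Time.seconds(1));

PatternStream<Event> patternStream = CEP.pattern(input, pattern);

// The left side of the Either carries the timeouts (the alerts),
// the right side carries the complete matches.
DataStream<Either<Alert, Match>> result = patternStream.select(
        new PatternTimeoutFunction<Event, Alert>() {
            @Override
            public Alert timeout(Map<String, List<Event>> partialMatch, long timeoutTimestamp) {
                return new Alert(partialMatch.get("a").get(0), timeoutTimestamp);
            }
        },
        new PatternSelectFunction<Event, Match>() {
            @Override
            public Match select(Map<String, List<Event>> match) {
                return new Match(match.get("a").get(0), match.get("b").get(0));
            }
        });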


Re: Gelly and degree filtering

2017-05-30 Thread Nico Kruber
Does Martin's answer to a similar thread help?

https://lists.apache.org/thread.html/000af2fb17a883b60f4a2359ebbeca42e3160c2167a88995c2ee28c2@%3Cuser.flink.apache.org%3E

On Monday, 29 May 2017 19:38:20 CEST Martin Junghanns wrote:
> Hi Ali :)
> 
> You could compute the degrees beforehand (e.g. using the
> Graph.[in|out|get]Degrees() methods) and use the resulting dataset as a
> new vertex dataset. You can now run your vertex-centric computation and
> access the degrees as vertex value.
> 
> Cheers,
> 
> Martin

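A minimal sketch of the approach Martin describes, assuming a recent Gelly API (1.2+, where inDegrees() yields LongValue), a graph of type Graph<Long, Long, Double>, and a hypothetical threshold x:

import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.graph.Edge;
import org.apache.flink.graph.Graph;
import org.apache.flink.graph.Vertex;
import org.apache.flink.graph.VertexJoinFunction;
import org.apache.flink.types.LongValue;

final long x = 3;  // example in-degree threshold

// Compute in-degrees and write them into the vertex values.
DataSet<Tuple2<Long, LongValue>> inDegrees = graph.inDegrees();

Graph<Long, Long, Double> graphWithDegrees = graph.joinWithVertices(
        inDegrees,
        new VertexJoinFunction<Long, LongValue>() {
            @Override
            public Long vertexJoin(Long oldValue, LongValue degree) {
                return degree.getValue();
            }
        });

// Keep only vertices whose in-degree is at least x (and all remaining edges).
Graph<Long, Long, Double> prunedG = graphWithDegrees.subgraph(
        new FilterFunction<Vertex<Long, Long>>() {
            @Override
            public boolean filter(Vertex<Long, Long> vertex) {
                return vertex.getValue() >= x;
            }
        },
        new FilterFunction<Edge<Long, Double>>() {
            @Override
            public boolean filter(Edge<Long, Double> edge) {
                return true;
            }
        });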

On Sunday, 28 May 2017 12:02:52 CEST Daniel Dalek wrote:
> Hi all,
> 
> I have a question related to Gelly and graph filtering and hoping to get
> some pointers/input.
> 
> Basically I have a bipartite graph prepared for a signal/collect iteration,
> but want to prune it first to only include target nodes with x or more
> edges (indegree >= x). To filter on vertex value (or id) it seems
> straightforward to use subgraph and FilterFunction:
> 
> prunedG = graph.subgraph(
>     new FilterFunction<Vertex<K, VV>>() {
>         public boolean filter(Vertex<K, VV> vertex) {
>             return (vertex.getValue() > 0);
>         }
>     }, ...
> 
> 
> Modifying this to call something like "return (vertex.getInDegree() >=x)"
> seemed appropriate but the degree information is in the graph (or available
> as separate methods when running GatherFunction etc), and not accessible
> directly from the vertex object inside the filter function.
> 
> Any suggestions on how to accomplish this?




Re: Porting batch percentile computation to streaming window

2017-05-30 Thread Gyula Fóra
Hi William,

I think basically the feature you are looking for is side inputs, which are
not implemented yet, but let me try to give a workaround that might work.

If I understand correctly you have two windowed computations:
TDigestStream = allMetrics.windowAll(...).reduce()
windowMetricsByIP = allMetrics.keyBy(ip).reduce()

And now you want to join these two by window to compute the percentiles
Something like:

TDigestStream.broadcast().connect(windowMetricsByIP).flatMap(JoiningCoFlatMap)

In your JoiningCoFlatMap you could keep a state of Map<Window, TDigest>, and
every by-IP metric aggregate could pick up the TDigest for the current
window. All this assumes that you attach the window information to the
aggregate metrics and the TDigest (you can do this in the window reduce
step).

This logic now assumes that you get the TDigest result before getting any
groupBy metric, which will probably not be the case so you could do some
custom buffering in state. Depending on the rate of the stream this might
or might not be feasible :)

Does this sound reasonable? I hope I have understood the use-case correctly.
Gyula

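A minimal sketch of what such a JoiningCoFlatMap could look like. WindowedTDigest, WindowedMetric and Percentile are hypothetical carrier types holding the window start timestamp, the TDigest type is from the t-digest library, and state is kept in plain maps for brevity; a real job would keep it in Flink state so it survives failures.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

import com.tdunning.math.stats.TDigest;

public class JoiningCoFlatMap
        implements CoFlatMapFunction<WindowedTDigest, WindowedMetric, Percentile> {

    // window start -> global digest for that window
    private final Map<Long, TDigest> digestsByWindow = new HashMap<>();
    // metrics that arrived before their window's digest
    private final Map<Long, List<WindowedMetric>> buffered = new HashMap<>();

    @Override
    public void flatMap1(WindowedTDigest value, Collector<Percentile> out) {
        digestsByWindow.put(value.windowStart, value.digest);
        // Flush any metrics that were buffered while waiting for this digest.
        List<WindowedMetric> pending = buffered.remove(value.windowStart);
        if (pending != null) {
            for (WindowedMetric m : pending) {
                out.collect(toPercentile(m, value.digest));
            }
        }
    }

    @Override
    public void flatMap2(WindowedMetric metric, Collector<Percentile> out) {
        TDigest digest = digestsByWindow.get(metric.windowStart);
        if (digest != null) {
            out.collect(toPercentile(metric, digest));
        } else {
            // Digest for this window not seen yet -- buffer the metric.
            buffered.computeIfAbsent(metric.windowStart, k -> new ArrayList<>()).add(metric);
        }
    }

    private Percentile toPercentile(WindowedMetric m, TDigest d) {
        // cdf(x) gives the percentile rank of this metric within the global digest.
        return new Percentile(m.ip, m.windowStart, d.cdf(m.value));
    }
}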

William Saar  wrote (on Mon, 29 May 2017, 18:34):

> I am porting a calculation from Spark batches that uses broadcast
> variables to compute percentiles from metrics and curious for tips on doing
> this with Flink streaming.
>
> I have a windowed computation where I compute metrics for IP-addresses
> (a windowed stream of metrics objects grouped by IP-addresses). Now I would
> like to compute percentiles for each IP from the metrics.
>
> My idea is to send all the metrics to a node that computes a global
> TDigest and then rejoins the computed global TDigest with the IP-grouped
> metrics stream to compute the percentiles for each IP. Is there a neat way
> to implement this in Flink?
>
> I am curious about the best way to join a global value, like our TDigest,
> with every result of a grouped window stream.  Also how to know when the
> TDigest is complete and has seen every element in the window (say if I
> implement it in a stateful flatMap that emits the value after seeing all
> stream values).
>
> Thanks!
>
> William
>


[ANNOUNCE] Flink Forward Berlin (11-13 Sep 2017) Call for Submissions is open now

2017-05-30 Thread Robert Metzger
Dear Flink Community,

The Call for Submissions for Flink Forward Berlin 2017 is open now!

Since we believe in collaboration, participation and exchange of ideas, we
are inviting the Flink community to submit a session. Share your knowledge,
applications, use cases and best practices and shape the program of Flink
Forward Berlin 2017! Submit your idea here: http://berlin.flink-forward.
org/submit-your-talk/


We welcome submissions on everything Flink-related, including experiences
with using Flink, products based on Flink, technical talks on extending
Flink, as well as connecting Flink with other open source or proprietary
software.

Flink Forward is offering scholarships to members of underrepresented
groups in the technical community to help break down the barriers that
prevent underrepresented groups from participating in Flink Forward. Find
more information on the scholarship here: http://berlin.flink-forward.
org/conference-diversity-scholarship-program/

For more information visit our website: http://berlin.flink-forward.org or
contact us at he...@flink-forward.org.


Regards,
Robert


Re: How can I increase Flink managed memory?

2017-05-30 Thread Nico Kruber
By default, Flink allocates a fraction of 0.7 (taskmanager.memory.fraction) of 
the free memory (total memory configured via taskmanager.heap.mb minus memory 
used for network buffers) for its managed memory. An absolute value may be set 
using taskmanager.memory.size (overrides the fraction parameter). [1]
In general, [1] is a good read for how the different memory settings work 
together.

The 10000 network buffers of size taskmanager.memory.segment-size (default:
32KB) will thus remove 320MB from your taskmanager.heap.mb (which in your
example is set to 563GB?!) and therefore do not affect the managed memory size
much.

Whether the size used by the JVM is preallocated--and thus immediately 
visible--depends on taskmanager.memory.preallocate.


But actually, what are you trying to fix? Is your node flushing data to disk yet? Or
has it just not accumulated that much operator state yet?



Nico

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/config.html#managed-memory

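For reference, the relevant flink-conf.yaml entries might look like this (the values are purely illustrative, not recommendations):

# Total heap of each TaskManager (MB)
taskmanager.heap.mb: 4096

# Fraction of the free memory (heap minus network buffers) used as managed memory
taskmanager.memory.fraction: 0.7

# Or set an absolute managed memory size in MB (overrides the fraction)
# taskmanager.memory.size: 2048

# Allocate managed memory eagerly at startup so it is immediately visible
taskmanager.memory.preallocate: true

# Network buffers: count * segment size is subtracted from the heap first
taskmanager.network.numberOfBuffers: 10000
taskmanager.memory.segment-size: 32768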

On Monday, 29 May 2017 19:09:55 CEST Sathi Chowdhury wrote:
> Forgot to mention I am running 10 slots in each machine and
> taskmanager.network.numberOfBuffers: 10000. Is there a scope for improving
> memory usage?
> 
> From: Sathi Chowdhury <sathi.chowdh...@elliemae.com>
> Date: Monday, May 29, 2017 at 9:55 AM
> To: "user@flink.apache.org" <user@flink.apache.org>
> Subject: How can I increase Flink managed memory?
> 
> Hello Flink Dev and Community,
> I have 5 task managers, each with 64 GB of memory.
> I am running Flink on YARN with task manager heap taskmanager.heap.mb:
> 563200. Flink still shows that it is using about 21 GB of memory, leaving 35 GB
> available. How and what can I do to fix it? Please suggest.
> Thanks
> Sathi





Re: State in Custom Tumble Window Class

2017-05-30 Thread rhashmi
Thanks Aljoscha Krettek, 

So the results will not be deterministic for late events. For an idempotent
update, I would need to derive an additional key based on the current event
time if events are late and attach it to the aggregator, which is probably
possible by doing some function(maxEventTime, actualEventTime). For that I
need maxEventTime to be stored as part of the state and recovered in case of
a runtime failure.

Here is my corner case:
-- Assume the whole Flink runtime crashed (auto commit on) and, after recovery,
the first event to arrive is from the past (actually late). Without keeping the
max event-time state, it may potentially override the previous aggregate.

I was wondering if I can record my last max event time as part of a checkpoint,
or run a query against the sink source to find the last processed event time.

Any recommendation?
 





Re: Porting batch percentile computation to streaming window

2017-05-30 Thread William Saar
> This logic now assumes that you get the TDigest result before
> getting any groupBy metric, which will probably not be the case so you
> could do some custom buffering in state. Depending on the rate of the
> stream this might or might not be feasible :)

Unfortunately, I think this assumption is a deal-breaker. The value
stream is not grouped, but I need to distribute the values to compute
the metrics, and I am populating the TDigest with the metrics.

Your suggestion gave me some ideas. Assume I have
windowMetricsByIp =
values.keyBy(ip).window(TumblingTimeWindow).fold(computeMetrics)
 tDigestStream = windowMetricsByIp.global().flatMap(tDigestMapper) //
How do I know when the flat map has seen all values and should emit
its result?
percentilesStream =
tDigestStream.broadcast().connect(windowMetricsByIp).flatMap 

If I attach information about the current window to the metrics events
on line 1, can I perhaps use that information to make flatMap on line
2 decide when to emit its T-Digest? The crudest solution is to emit
the T-Digest for a window when the first event of the next window
arrives (will this cause problems with back-pressure?) 
Less crude, maybe I can store watermark information or something on
metrics objects in line 1 and emit T digests more often in line 2?

Finally, how do I access the watermark/window information in my fold
operation in line 1?

Thanks!

- Original Message -
From: "Gyula Fóra" 
To:"William Saar" , 
Cc:
Sent:Tue, 30 May 2017 08:56:28 +
Subject:Re: Porting batch percentile computation to streaming window

Hi William,

I think basically the feature you are looking for is side inputs,
which are not implemented yet, but let me try to give a workaround that
might work.

If I understand correctly you have two windowed computations:
TDigestStream = allMetrics.windowAll(...).reduce()
windowMetricsByIP = allMetrics.keyBy(ip).reduce()

And now you want to join these two by window to compute the
percentiles

Something like:

TDigestStream.broadcast().connect(windowMetricsByIP).flatMap(JoiningCoFlatMap)

In your JoiningCoFlatMap you could keep a state of Map<Window, TDigest>,
and every by-IP metric aggregate could pick up the TDigest for the current
window. All this assumes that you attach the window information to the
aggregate metrics and the TDigest (you can do this in the window reduce
step).

This logic now assumes that you get the TDigest result before getting
any groupBy metric, which will probably not be the case so you could
do some custom buffering in state. Depending on the rate of the stream
this might or might not be feasible :)

Does this sound reasonable? I hope I have understood the use-case
correctly.
Gyula

William Saar  wrote (on Mon, 29 May 2017, 18:34):
I am porting a calculation from Spark batches that uses broadcast
variables to compute percentiles from metrics and curious for tips on
doing this with Flink streaming.

I have a windowed computation where I compute metrics for
IP-addresses (a windowed stream of metrics objects grouped by
IP-addresses). Now I would like to compute percentiles for each IP
from the metrics.

My idea is to send all the metrics to a node that computes a global
TDigest and then rejoins the computed global TDigest with the
IP-grouped metrics stream to compute the percentiles for each IP. Is
there a neat way to implement this in Flink?

I am curious about the best way to join a global value, like our
TDigest, with every result of a grouped window stream.  Also how to
know when the TDigest is complete and has seen every element in the
window (say if I implement it in a stateful flatMap that emits the
value after seeing all stream values).

Thanks!
William
 




Re: Porting batch percentile computation to streaming window

2017-05-30 Thread Gyula Fóra
I think you could actually do a window operation to get the tDigestStream
from windowMetricsByIp:

windowMetricsByIp.windowAll(SameWindowAsTumblingTimeWindow).fold(...)

This way the watermark mechanism should ensure you get all partial results
before flushing the global window.

Gyula

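A sketch of that idea with the DataStream API (event time assumed; WindowedMetric and WindowedTDigest are the hypothetical carrier types from before, and the 1-minute window size is illustrative and must match the keyed window):

import org.apache.flink.api.common.functions.FoldFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

// Re-window the per-IP results with the same tumbling event-time window, so the
// watermark guarantees that all per-IP partial results of a window have arrived
// before the global fold emits the TDigest for that window.
DataStream<WindowedTDigest> tDigestStream = windowMetricsByIp
        .windowAll(TumblingEventTimeWindows.of(Time.minutes(1)))
        .fold(new WindowedTDigest(), new FoldFunction<WindowedMetric, WindowedTDigest>() {
            @Override
            public WindowedTDigest fold(WindowedTDigest acc, WindowedMetric m) {
                acc.windowStart = m.windowStart;  // carry the window id along
                acc.digest.add(m.value);          // populate the global TDigest
                return acc;
            }
        });

// Broadcast the per-window digest and join it back with the per-IP metrics,
// e.g. using the JoiningCoFlatMap sketched earlier in the thread.
tDigestStream.broadcast()
        .connect(windowMetricsByIp)
        .flatMap(new JoiningCoFlatMap());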
William Saar  wrote (on Tue, 30 May 2017, 15:03):

> > This logic now assumes that you get the TDigest result before getting
> any groupBy metric, which will probably not be the case so you could do
> some custom buffering in state. Depending on the rate of the stream this
> might or might not be feasible :)
>
> Unfortunately, I think this assumption is a deal-breaker. The value stream
> is not grouped, but I need to distribute the values to compute the metrics
> and I am populating the TDigest with the metrics
>
> Your suggestion gave me some ideas. Assume I have
> windowMetricsByIp =
> values.keyBy(ip).window(TumblingTimeWindow).fold(computeMetrics)
> tDigestStream = windowMetricsByIp.global().flatMap(tDigestMapper) // How
> do I know when the flat map has seen all values and should emit its result?
> percentilesStream =
> tDigestStream.broadcast().connect(windowMetricsByIp).flatMap
>
> If I attach information about the current window to the metrics events on
> line 1, can I perhaps use that information to make flatMap on line 2 decide
> when to emit its T-Digest? The crudest solution is to emit the T-Digest for
> a window when the first event of the next window arrives (will this cause
> problems with back-pressure?)
> Less crude, maybe I can store watermark information or something on
> metrics objects in line 1 and emit T digests more often in line 2?
>
> Finally, how do I access the watermark/window information in my fold
> operation in line 1?
>
> Thanks!
>
>
> - Original Message -
> From:
> "Gyula Fóra" 
>
> To:
> "William Saar" , 
> Cc:
>
> Sent:
> Tue, 30 May 2017 08:56:28 +
> Subject:
> Re: Porting batch percentile computation to streaming window
>
>
>
>
> Hi William,
>
> I think basically the feature you are looking for are side inputs which is
> not implemented yet but let me try to give a workaround that might work.
>
> If I understand correctly you have two windowed computations:
> TDigestStream = allMetrics.windowAll(...).reduce()
> windowMetricsByIP = allMetrics.keyBy(ip).reduce()
>
> And now you want to join these two by window to compute the percentiles
> Something like:
>
>
> TDigestStream.broadcast().connect(windowMetricsByIP).flatMap(JoiningCoFlatMap)
>
> In your JoiningCoFlatMap you could keep a state of Map<Window, TDigest>
> and every by-IP metric aggregate could pick up the TDigest for the current
> window. All this assumes that you attach the window information to the
> aggregate metrics and the TDigest (you can do this in the window reduce
> step).
>
> This logic now assumes that you get the TDigest result before getting any
> groupBy metric, which will probably not be the case so you could do some
> custom buffering in state. Depending on the rate of the stream this might
> or might not be feasible :)
>
> Does this sound reasonable? I hope I have understood the use-case
> correctly.
> Gyula
>
>
> William Saar  wrote (on Mon, 29 May 2017, 18:34):
>
>> I am porting a calculation from Spark batches that uses broadcast
>> variables to compute percentiles from metrics and curious for tips on doing
>> this with Flink streaming.
>>
>> I have a windowed computation where I compute metrics for IP-addresses
>> (a windowed stream of metrics objects grouped by IP-addresses). Now I would
>> like to compute percentiles for each IP from the metrics.
>>
>> My idea is to send all the metrics to a node that computes a global
>> TDigest and then rejoins the computed global TDigest with the IP-grouped
>> metrics stream to compute the percentiles for each IP. Is there a neat way
>> to implement this in Flink?
>>
>> I am curious about the best way to join a global value, like our TDigest,
>> with every result of a grouped window stream.  Also how to know when the
>> TDigest is complete and has seen every element in the window (say if I
>> implement it in a stateful flatMap that emits the value after seeing all
>> stream values).
>>
>> Thanks!
>>
>> William
>>
>


Problems submitting Flink to Yarn with Kerberos

2017-05-30 Thread Dominique Rondé
Hi folks,

I recently ran into the need to bring Flink into a YARN system that is
configured with Kerberos. According to the documentation, I changed
flink-conf.yaml like this:

security.kerberos.login.use-ticket-cache: true
security.kerberos.login.contexts: Client

I know that providing a keytab is the preferred approach, but I have to file a
special request to receive one. ;-)

After startup, provisioning stops with this error:

2017-05-30 16:16:48,684 INFO 
org.apache.flink.yarn.YarnClusterClient   - Waiting
until all TaskManagers have connected
Waiting until all TaskManagers have connected
2017-05-30 16:16:48,685 INFO 
org.apache.flink.yarn.YarnClusterClient   - Starting
client actor system.
2017-05-30 16:16:52,099 WARN 
org.apache.flink.runtime.net.ConnectionUtils  - Could
not connect to lfrar255.srv.allianz/10.17.24.162:56659. Selecting a
local address using heuristics.
2017-05-30 16:16:52,473 INFO 
akka.event.slf4j.Slf4jLogger  -
Slf4jLogger started
2017-05-30 16:16:52,512 INFO 
Remoting  - Starting
remoting
2017-05-30 16:16:52,670 INFO 
Remoting  - Remoting
started; listening on addresses
:[akka.tcp://fl...@sla09037.srv.allianz:34579]
Exception in thread "main" java.lang.RuntimeException: Unable to get
ClusterClient status from Application Client
at
org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248)
at
org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:520)
at
org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:660)
at
org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:476)
at
org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:473)
at
org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
at
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
at
org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:473)
Caused by:
org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could
not retrieve the leader gateway
at
org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:141)
at
org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:691)
at
org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:242)
... 10 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out
after [1 milliseconds]
at
scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at
scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at
scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at scala.concurrent.Await.result(package.scala)
at
org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:139)
... 12 more
2017-05-30 16:17:02,690 INFO 
org.apache.flink.yarn.YarnClusterClient   - Shutting
down YarnClusterClient from the client shutdown hook
2017-05-30 16:17:02,691 INFO 
org.apache.flink.yarn.YarnClusterClient   -
Disconnecting YarnClusterClient from ApplicationMaster
2017-05-30 16:17:03,693 INFO 
akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting
down remote daemon.
2017-05-30 16:17:03,696 INFO 
akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote
daemon shut down; proceeding with flushing remote transports.
2017-05-30 16:17:03,744 INFO 
akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting
shut down.
 
Has anyone an idea what is going wrong?

Best wishes

Dominique



Re: Problems submitting Flink to Yarn with Kerberos

2017-05-30 Thread Tzu-Li (Gordon) Tai
Hi Dominique,

Could you tell us the version / build commit of Flink that you’re using?

Cheers,
Gordon


On 30 May 2017 at 4:29:08 PM, Dominique Rondé (dominique.ro...@allsecur.de) 
wrote:

Hi folks,

I recently ran into the need to bring Flink into a YARN system that is
configured with Kerberos. According to the documentation, I changed
flink-conf.yaml like this:

security.kerberos.login.use-ticket-cache: true
security.kerberos.login.contexts: Client

I know that providing a keytab is the preferred approach, but I have to file a
special request to receive one. ;-)

After startup, provisioning stops with this error:

2017-05-30 16:16:48,684 INFO  org.apache.flink.yarn.YarnClusterClient   
    - Waiting until all TaskManagers have connected
Waiting until all TaskManagers have connected
2017-05-30 16:16:48,685 INFO  org.apache.flink.yarn.YarnClusterClient   
    - Starting client actor system.
2017-05-30 16:16:52,099 WARN  org.apache.flink.runtime.net.ConnectionUtils  
    - Could not connect to lfrar255.srv.allianz/10.17.24.162:56659. 
Selecting a local address using heuristics.
2017-05-30 16:16:52,473 INFO  akka.event.slf4j.Slf4jLogger  
    - Slf4jLogger started
2017-05-30 16:16:52,512 INFO  Remoting  
    - Starting remoting
2017-05-30 16:16:52,670 INFO  Remoting  
    - Remoting started; listening on addresses 
:[akka.tcp://fl...@sla09037.srv.allianz:34579]
Exception in thread "main" java.lang.RuntimeException: Unable to get 
ClusterClient status from Application Client
    at 
org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248)
    at 
org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:520)
    at 
org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:660)
    at 
org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:476)
    at 
org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:473)
    at 
org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
    at 
org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:473)
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: 
Could not retrieve the leader gateway
    at 
org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:141)
    at 
org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:691)
    at 
org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:242)
    ... 10 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after 
[1 milliseconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at 
scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
    at 
scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:190)
    at scala.concurrent.Await.result(package.scala)
    at 
org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:139)
    ... 12 more
2017-05-30 16:17:02,690 INFO  org.apache.flink.yarn.YarnClusterClient   
    - Shutting down YarnClusterClient from the client shutdown hook
2017-05-30 16:17:02,691 INFO  org.apache.flink.yarn.YarnClusterClient   
    - Disconnecting YarnClusterClient from ApplicationMaster
2017-05-30 16:17:03,693 INFO  
akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down 
remote daemon.
2017-05-30 16:17:03,696 INFO  
akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon 
shut down; proceeding with flushing remote transports.
2017-05-30 16:17:03,744 INFO  
akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut 
down.
 
Has anyone an idea what is going wrong?

Best wishes

Dominique


Re: Problems submitting Flink to Yarn with Kerberos

2017-05-30 Thread Dominique Rondé
Hi Gordon,

we use Flink 1.2.0 bundled with Hadoop 2.6 and Scala 2.11, built on
2017-02-02.

Cheers

Dominique


On 30.05.2017 at 16:31, Tzu-Li (Gordon) Tai wrote:
> Hi Dominique,
>
> Could you tell us the version / build commit of Flink that you’re using?
>
> Cheers,
> Gordon
>
>
> On 30 May 2017 at 4:29:08 PM, Dominique Rondé
> (dominique.ro...@allsecur.de ) wrote:
>
>> Hi folks,
>>
>> I just become into the need to bring Flink into a yarn system, that
>> is configured with kerberos. According to the documentation, I
>> changed the flink.conf.yaml like that:
>>
>> security.kerberos.login.use-ticket-cache: true
>> security.kerberos.login.contexts: Client
>>
>> I know that providing a keytab is the prefered, but I have to do a
>> special request to receive one. ;-)
>>
>> After startup, the provisionent is stopped by this error:
>>
>> 2017-05-30 16:16:48,684 INFO 
>> org.apache.flink.yarn.YarnClusterClient   -
>> Waiting until all TaskManagers have connected
>> Waiting until all TaskManagers have connected
>> 2017-05-30 16:16:48,685 INFO 
>> org.apache.flink.yarn.YarnClusterClient   -
>> Starting client actor system.
>> 2017-05-30 16:16:52,099 WARN 
>> org.apache.flink.runtime.net.ConnectionUtils  - Could
>> not connect to lfrar255.srv.allianz/10.17.24.162:56659. Selecting a
>> local address using heuristics.
>> 2017-05-30 16:16:52,473 INFO 
>> akka.event.slf4j.Slf4jLogger  -
>> Slf4jLogger started
>> 2017-05-30 16:16:52,512 INFO 
>> Remoting  -
>> Starting remoting
>> 2017-05-30 16:16:52,670 INFO 
>> Remoting  -
>> Remoting started; listening on addresses
>> :[akka.tcp://fl...@sla09037.srv.allianz:34579]
>> Exception in thread "main" java.lang.RuntimeException: Unable to get
>> ClusterClient status from Application Client
>> at
>> org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248)
>> at
>> org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:520)
>> at
>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:660)
>> at
>> org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:476)
>> at
>> org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:473)
>> at
>> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:422)
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>> at
>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>> at
>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:473)
>> Caused by:
>> org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException:
>> Could not retrieve the leader gateway
>> at
>> org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:141)
>> at
>> org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:691)
>> at
>> org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:242)
>> ... 10 more
>> Caused by: java.util.concurrent.TimeoutException: Futures timed out
>> after [1 milliseconds]
>> at
>> scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>> at
>> scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>> at
>> scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
>> at
>> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>> at scala.concurrent.Await$.result(package.scala:190)
>> at scala.concurrent.Await.result(package.scala)
>> at
>> org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:139)
>> ... 12 more
>> 2017-05-30 16:17:02,690 INFO 
>> org.apache.flink.yarn.YarnClusterClient   -
>> Shutting down YarnClusterClient from the client shutdown hook
>> 2017-05-30 16:17:02,691 INFO 
>> org.apache.flink.yarn.YarnClusterClient   -
>> Disconnecting YarnClusterClient from ApplicationMaster
>> 2017-05-30 16:17:03,693 INFO 
>> akka.remote.RemoteActorRefProvider$RemotingTerminator -
>> Shutting down remote daemon.
>> 2017-05-30 16:17:03,696 INFO 
>> akka.remote.RemoteActorRefProvider$RemotingTerminator -
>> Remote daemon shut down; proceeding with flushing remote transports.
>> 2017-05-30 16:17:03,744 INFO 
>> akka.remote.RemoteActorRefProvider$RemotingTerminator -
>> Remoting

Re: Kafka partitions -> task slots? (keyed stream)

2017-05-30 Thread Stefan Richter
Hi,

it is not restricting the parallelism of your job. It only means that increasing
the parallelism of your job's sources beyond 5 will not bring any improvement.
All other operators can still benefit from a higher parallelism.

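As an illustration of that point, the source parallelism can be capped at the partition count while the rest of the pipeline uses more slots. This sketch assumes the Kafka 0.10 connector; the topic, properties, and MyProcessFunction are placeholders.

import java.util.Properties;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

Properties props = new Properties();
props.setProperty("bootstrap.servers", "broker:9092");
props.setProperty("group.id", "my-group");

// One source subtask per Kafka partition; more would simply stay idle.
DataStream<String> source = env
        .addSource(new FlinkKafkaConsumer010<>("my-topic", new SimpleStringSchema(), props))
        .setParallelism(5);

// Downstream operators are free to use all 12 available slots.
source
        .keyBy(value -> value)                 // key extraction is application-specific
        .process(new MyProcessFunction())
        .setParallelism(12);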
> On 30.05.2017 at 09:49, Moiz S Jinia wrote:
> 
> For a keyed stream (where the key is also the message key in the source kafka 
> topic), is the parallelism of the job restricted to the number of partitions 
> in the topic?
> 
> Source topic has 5 partitions, but available task slots are 12. (3 task 
> managers each with 4 slots)
> 
> Moiz



Re: Kafka partitions -> task slots? (keyed stream)

2017-05-30 Thread Moiz S Jinia
I have just 1 job (that has a ProcessFunction with timers).

You're saying that giving more task slots to my job than the number of
partitions on the source topic is not going to help.

This implies that 1 partition cannot be assigned to more than 1 task slot.
That makes sense as otherwise ordering for a partition would not be
guaranteed.

Thanks.

On Tue, May 30, 2017 at 8:43 PM, Stefan Richter  wrote:

> Hi,
>
> it is not restricting the parallelism of your job. Only increasing the
> parallelism of your Job’s sources to more than 5 will not bring any
> improvements. All other operators could still benefit from a higher
> parallelism.
>
> > On 30.05.2017 at 09:49, Moiz S Jinia wrote:
> >
> > For a keyed stream (where the key is also the message key in the source
> kafka topic), is the parallelism of the job restricted to the number of
> partitions in the topic?
> >
> > Source topic has 5 partitions, but available task slots are 12. (3 task
> managers each with 4 slots)
> >
> > Moiz
>
>


Does job restart resume from last known internal checkpoint?

2017-05-30 Thread Moiz S Jinia
In a checkpointed Flink job will doing a graceful restart make it resume
from last known internal checkpoint? Or are all checkpoints discarded when
the job is stopped?

If discarded, what will be the resume point?

Moiz


Re: Porting batch percentile computation to streaming window

2017-05-30 Thread William Saar
Nice! The solution is actually starting to look quite clean with this
in place.

Finally, does Flink offer functionality to retrieve information about
the current window that a rich function is running on? I don't see
anything in the RuntimeContext classes about the current window... 

As you pointed out earlier, I need to attach a window ID (for
instance, the starting timestamp of the window) to each metric and
propagate it to the TDigest objects to be able to associate the
metrics with the right TDigest in the last stateful CoFlatMapFunction.
You mentioned I should compute the window information in the initial
fold function that computes the metrics, and while I can compute a
common window-start timestamp from the events in the metrics
computation, it would seem less ugly and error-prone if I could get
information about the current window the fold function is running on
directly from Flink.

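One possible way to get at the window metadata, assuming Flink 1.3's fold() overload that also takes a WindowFunction (the Value, MetricAccumulator, and WindowedMetric types are hypothetical): the WindowFunction receives the TimeWindow, so window.getStart() can be attached to each per-IP result as the window ID.

import org.apache.flink.api.common.functions.FoldFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

DataStream<WindowedMetric> windowMetricsByIp = values
        .keyBy(v -> v.ip)
        .window(TumblingEventTimeWindows.of(Time.minutes(1)))
        .fold(new MetricAccumulator(),
                new FoldFunction<Value, MetricAccumulator>() {
                    @Override
                    public MetricAccumulator fold(MetricAccumulator acc, Value v) {
                        return acc.add(v);  // the computeMetrics step
                    }
                },
                new WindowFunction<MetricAccumulator, WindowedMetric, String, TimeWindow>() {
                    @Override
                    public void apply(String ip, TimeWindow window,
                                      Iterable<MetricAccumulator> folded,
                                      Collector<WindowedMetric> out) {
                        // window.getStart() is the window ID to propagate downstream.
                        out.collect(new WindowedMetric(ip, window.getStart(),
                                folded.iterator().next()));
                    }
                });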
- Original Message -
From: "Gyula Fóra" 
To: "William Saar" , 
Cc:
Sent: Tue, 30 May 2017 13:56:08 +
Subject: Re: Porting batch percentile computation to streaming window

I think you could actually do a window operation to get the
tDigestStream from windowMetricsByIp:

windowMetricsByIp.windowAll(SameWindowAsTumblingTimeWindow).fold(...)

This way the watermark mechanism should ensure you get all partial
results before flushing the global window.

Gyula

William Saar  wrote (on Tue, 30 May 2017, 15:03):

> This logic now assumes that you get the TDigest result before
getting any groupBy metric, which will probably not be the case so you
could do some custom buffering in state. Depending on the rate of the
stream this might or might not be feasible :)

Unfortunately, I think this assumption is a deal-breaker. The value
stream is not grouped, but I need to distribute the values to compute
the metrics and I am populating the TDigest with the metrics

Your suggestion gave me some ideas. Assume I have
windowMetricsByIp =
values.keyBy(ip).window(TumblingTimeWindow).fold(computeMetrics)
 tDigestStream = windowMetricsByIp.global().flatMap(tDigestMapper) //
How do I know when the flat map has seen all values and should emit
its result?
percentilesStream =
tDigestStream.broadcast().connect(windowMetricsByIp).flatMap 

If I attach information about the current window to the metrics events
on line 1, can I perhaps use that information to make flatMap on line
2 decide when to emit its T-Digest? The crudest solution is to emit
the T-Digest for a window when the first event of the next window
arrives (will this cause problems with back-pressure?) 
Less crude, maybe I can store watermark information or something on
metrics objects in line 1 and emit T digests more often in line 2?

Finally, how do I access the watermark/window information in my fold
operation in line 1?

Thanks!

- Original Message -
From: "Gyula Fóra" 
To: "William Saar" , 
Cc:
Sent: Tue, 30 May 2017 08:56:28 +
Subject: Re: Porting batch percentile computation to streaming window

Hi William,

I think basically the feature you are looking for is side inputs,
which are not implemented yet, but let me try to give a workaround that
might work.

If I understand correctly you have two windowed computations:
TDigestStream = allMetrics.windowAll(...).reduce()
windowMetricsByIP = allMetrics.keyBy(ip).reduce()

And now you want to join these two by window to compute the
percentiles

Something like:

TDigestStream.broadcast().connect(windowMetricsByIP).flatMap(JoiningCoFlatMap)

In your JoiningCoFlatMap you could keep a state of Map<Window, TDigest>, and
every by-IP metric aggregate could pick up the TDigest for the current
window. All this assumes that you attach the window
information to the aggregate metrics and the TDigest (you can do this
in the window reduce step). 

This logic now assumes that you get the TDigest result before getting
any groupBy metric, which will probably not be the case so you could
do some custom buffering in state. Depending on the rate of the stream
this might or might not be feasible :)

Does this sound reasonable? I hope I have understood the use-case
correctly.
Gyula

William Saar  wrote (on Mon, 29 May 2017, 18:34):

I am porting a calculation from Spark batches that uses broadcast
variables to compute percentiles from metrics and curious for tips on
doing this with Flink streaming.

I have a windowed computation where I compute metrics for
IP-addresses (a windowed stream of metrics objects grouped by
IP-addresses). Now I would like to compute percentiles for each IP
from the metrics.

My idea is to send all the metrics to a node that computes a global
TDigest and then rejoins the computed global TDigest with the
IP-grouped metrics stream to compute the percentiles for each IP. Is
there a neat way to implement this in Flink?

I am curious about the best way to join a global value, like our
TDigest, with every result of a grouped window stream.  Also how to
know when the TDigest is complete and has seen every element in the
window (say if I implement it in a stateful flatMap that emits the
value after seeing all stream values).

HTTP listener source

2017-05-30 Thread Madhukar Thota
Hi

Has anyone implemented an HTTP listener source in Flink that acts as a
REST API to receive a JSON payload via POST and writes it to Kafka,
Kinesis, or any other sink?

Any guidance or sample snippet will be appreciated.


Amazon Athena

2017-05-30 Thread Madhukar Thota
Has anyone used Amazon Athena with Apache Flink?

I have a use case where I want to write streaming data (which is in Avro
format) from Kafka to S3, converting it into Parquet format, and update the S3
location with daily partitions on an Athena table.

Any guidance is appreciated.