Re: How to increase akka heartbeat?

2016-02-20 Thread Saiph Kappa
Thanks for your help. Apparently the problem was not in Akka. It seems that
a .socketTextStream source with maxRetry = -1 keeps retrying only until the
first connection to the socket succeeds; once it is connected, if no data
is sent, the job appears to be terminated.

On Fri, Feb 19, 2016 at 9:13 PM, Stephan Ewen wrote:

> Hi Saiph!
>
> What is the problem that is happening? The log actually looks like the Job
> is successfully sent to the JobManager.
>
> Stephan
>
>
>
> On Fri, Feb 19, 2016 at 8:49 PM, Robert Metzger wrote:
>
>> Hi,
>> could you send me the full logs of the JobManager (privately, if you
>> prefer)? The messages you've posted here are logged at DEBUG level. They
>> don't indicate erroneous behavior of the system.
>>
>> On Fri, Feb 19, 2016 at 8:44 PM, Saiph Kappa wrote:
>>
>>> These were the parameters that I set btw:
>>>
>>> akka.watch.heartbeat.interval: 100
>>> akka.transport.heartbeat.interval: 1000
>>>
>>> On Fri, Feb 19, 2016 at 7:43 PM, Saiph Kappa wrote:
>>>
>>>> I am not sure.
>>>>
>>>> For that particular machine I get messages like these:
>>>> «
>>>> myip:6123/user/jobmanager#291801197])) at akka://flink/user/$a from
>>>> Actor[akka://flink/deadLetters].
>>>> [INFO] o.a.f.r.c.JobClientActor - Connected to new
>>>> JobManager akka.tcp://flink@myip:6123/user/jobmanager.
>>>>
>>>> [INFO] o.a.f.r.c.JobClientActor - Sending message to
>>>> JobManager akka.tcp://flink@myip:6123/user/jobmanager to submit job
>>>> JOB1 (5f9cef0c2e4b69530bf1e2485e94d326) and wait for progress
>>>>
>>>> [DEBUG] o.a.f.r.c.JobClientActor - Handled message
>>>> LeaderSessionMessage(null,JobManagerActorRef(Actor[akka.tcp://flink@myip:6123/user/jobmanager#291801197]))
>>>> in 48 ms from Actor[akka://flink/deadLetters].
>>>>
>>>> [DEBUG] o.a.f.r.c.JobClientActor - Handled message
>>>> LeaderSessionMessage(null,JobManagerActorRef(Actor[akka.tcp://flink@myip:6123/user/jobmanager#291801197]))
>>>> in 48 ms from Actor[akka://flink/deadLetters].
>>>>
>>>> [DEBUG] o.a.f.r.c.JobClientActor - Received message
>>>> JobSubmitSuccess(2575d5ff5c10336beb7820a052a63623) at akka://flink/user/$a
>>>> from Actor[akka.tcp://flink@myip:6123/user/jobmanager#1144818256].
>>>> »
>>>>
>>>> I tried to set the heartbeat interval in the cluster but it didn't
>>>> solve the problem, should I try to set it in the client (how can I do it)?
>>>> I see no other errors or exceptions on the log files.
>>>>
>>>> On Fri, Feb 19, 2016 at 7:07 PM, Robert Metzger wrote:
>>>>
>>>>> Hi Saiph,
>>>>>
>>>>> are you sure that the jobs are cancelled because the client
>>>>> disconnects?
>>>>>
>>>>> For the different timeouts, check the configuration page:
>>>>> https://ci.apache.org/projects/flink/flink-docs-release-0.10/setup/config.html
>>>>> and search for "heartbeat".
>>>>>
>>>>> On Fri, Feb 19, 2016 at 8:04 PM, Saiph Kappa wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have a Flink client application that launches jobs to remote
>>>>>> clusters. However I'm getting my jobs cancelled:
>>>>>> "18:25:29,650 WARN akka.remote.ReliableDeliverySupervisor - Association
>>>>>> with remote system [akka.tcp://flink@127.0.0.1:52929] has failed,
>>>>>> address is now gated for [5000] ms. Reason is: [Disassociated]."
>>>>>>
>>>>>> How can I increase the akka heartbeat interval? Where should I set
>>>>>> that configuration parameter, in the client or in the Flink clusters, and
>>>>>> in which file.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>
>>>>
>>>
>>
>


Re: Looking for co founder /partner in Bangalore

2016-02-20 Thread jay vyas
This isn't the right forum, but I think this would really be an extension of
what is already in ASF Bigtop, which is working on packaging Flink and
already has Kafka and Zeppelin.

Camel might fit well into Bigtop, and the Cassandra, Drill, and Camel parts
could possibly be added on top...

Replying on the public list because, if this particular platform is going to
be open source, then I think it is relevant to the ASF.


On Sat, Feb 20, 2016 at 9:52 AM, Ashutosh Kumar wrote:

> I am planning to build an analytics platform based on Flink, Kafka, Camel,
> Zeppelin, Drill and Cassandra. I am looking for a co-founder/partner in
> Bangalore.
>
> I am sorry if this is not the right forum to express this.
>
> Thanks
> Ashutosh
>



-- 
jay vyas


Re: Trying to comprehend rolling windows + event time

2016-02-20 Thread lofifnc
Hi Nirmalya,

The aggregates will be printed 5 times because I have a rolling window with
the length of 5 minutes, which will shift 1 minute forward after each
evaluation. Because my input is within the time interval of 0 to 1 minute,
it is perfectly aligned with the windows and will fit completely into 5
different windows, thus will be evaluated 5 times. 
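
To make the count of 5 concrete, here is a minimal sketch in plain Python
(not Flink code) of how a single timestamp maps onto overlapping windows.
The function name and the 30-second timestamp are hypothetical, chosen only
for illustration; the arithmetic assumes windows aligned to multiples of the
slide, which matches the window boundaries seen in this thread.

```python
def assign_sliding_windows(timestamp_ms, size_ms, slide_ms):
    """Return the [start, end) windows that contain timestamp_ms."""
    # Start of the latest window that begins at or before the timestamp.
    start = timestamp_ms - (timestamp_ms % slide_ms)
    windows = []
    # Step back one slide at a time while the window still covers the timestamp.
    while start > timestamp_ms - size_ms:
        windows.append((start, start + size_ms))
        start -= slide_ms
    return windows

MINUTE = 60_000
# Hypothetical record 30 seconds into the first minute,
# window length 5 minutes, slide 1 minute.
for start, end in assign_sliding_windows(30_000, 5 * MINUTE, 1 * MINUTE):
    print(f"window: {start // MINUTE} -> {end // MINUTE}")
```

Any record with a timestamp between 0 and 1 minute lands in size / slide = 5
windows, so its aggregate is evaluated once per window.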

I have created another gist, where the start and end time of each window is
shown along with its content.
The single record I'm emitting into the data flow will fit into 5 different
overlapping windows.  
https://gist.github.com/lofifnc/36861d8f37b166cc4863

If you haven't already read them, I can really recommend the two
articles Tyler Akidau has published on O'Reilly Radar; they have some
really great explanations and visualisations in them:
https://www.oreilly.com/people/09f01-tyler-akidau

Best,
Alex







--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Trying-to-comprehend-rolling-windows-event-time-tp5050p5056.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.


Re: Trying to comprehend rolling windows + event time

2016-02-20 Thread lofifnc
Hi,

You're right, except that ("grace", "arctic", 25) is emitted with a timestamp
of 90 seconds, along with a watermark for 90 seconds.

I followed your advice and implemented a simple window function printing the
start + end of a window along with its content. You can see that a window
from minute 1 till 6 is emitted containing the ("grace", "arctic", 25)
triple.

window: -4 -> 1 key: (arctic)
(susi,arctic,20)
===
window: -4 -> 1 key: (elephant)
(hans,elephant,15)
(pete,elephant,40)
===
window: 1 -> 6 key: (arctic)
(grace,arctic,25)
===
window: 0 -> 5 key: (elephant)
(hans,elephant,15)
(pete,elephant,40)
===
window: 0 -> 5 key: (arctic)
(susi,arctic,20)
(grace,arctic,25)
===
window: -1 -> 4 key: (arctic)
(susi,arctic,20)
(grace,arctic,25)
===
window: -1 -> 4 key: (elephant)
(hans,elephant,15)
(pete,elephant,40)
===
window: -2 -> 3 key: (arctic)
(susi,arctic,20)
(grace,arctic,25)
===
window: -2 -> 3 key: (elephant)
(hans,elephant,15)
(pete,elephant,40)
===
window: -3 -> 2 key: (arctic)
(susi,arctic,20)
(grace,arctic,25)
===
window: -3 -> 2 key: (elephant)
(hans,elephant,15)
(pete,elephant,40)
===

So what basically happens is that the windowing mechanism emits all
unfinished windows when the stream is closed?
Based on the watermarks alone, Flink cannot decide that the window 1 -> 6
is finished.
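
A rough sketch of that reasoning in plain Python (not Flink code). It rests
on two assumptions that are not confirmed in this thread: an event-time
window is considered complete once the watermark reaches its last timestamp
(end - 1 ms), and a finite source sends one final, maximal watermark when it
closes. The names are hypothetical.

```python
MINUTE = 60_000
MAX_WATERMARK = 2**63 - 1  # assumed end-of-stream watermark

def windows_fired(window_ends_ms, watermark_ms):
    """Return the ends of windows whose last timestamp (end - 1) the watermark has reached."""
    return [end for end in window_ends_ms if end - 1 <= watermark_ms]

# Ends of the six windows printed above: -4 -> 1 up to 1 -> 6.
ends = [m * MINUTE for m in range(1, 7)]

fired_at_90s = windows_fired(ends, 90_000)
fired_at_close = [e for e in windows_fired(ends, MAX_WATERMARK)
                  if e not in fired_at_90s]

print([e // MINUTE for e in fired_at_90s])    # complete at watermark 90 s
print([e // MINUTE for e in fired_at_close])  # flushed only at end of stream
```

Under these assumptions only the window -4 -> 1 is complete at a watermark of
90 seconds; the other five, including 1 -> 6, are emitted only when the input
ends, which would fit the order of the output above.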

Best Alex.



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Trying-to-comprehend-rolling-windows-event-time-tp5034p5055.html


Looking for co founder /partner in Bangalore

2016-02-20 Thread Ashutosh Kumar
I am planning to build an analytics platform based on Flink, Kafka, Camel,
Zeppelin, Drill and Cassandra. I am looking for a co-founder/partner in
Bangalore.

I am sorry if this is not the right forum to express this.

Thanks
Ashutosh


Re: Using numberOfTaskSlots to control parallelism

2016-02-20 Thread Zach Cox
Thanks for the input Aljoscha and Ufuk! I will try out the #2 approach and
report back.

Thanks,
Zach


On Sat, Feb 20, 2016 at 7:26 AM Ufuk Celebi wrote:

> On Sat, Feb 20, 2016 at 10:12 AM, Aljoscha Krettek wrote:
> > IMHO the only change for 2) is that you possibly get better machine
> > utilization because it will use more parallel threads.  So I think it’s a
> > valid approach.
> >
> > @Ufuk, could there be problems with the number of network buffers? I
> > think not, because the connections are multiplexed in one channel, is this
> > correct?
>
> I would not expect it to become a problem. If it does, it's easy to
> resolve by throwing a little more memory at the problem. [1]
>
> – Ufuk
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers
>


Re: Using numberOfTaskSlots to control parallelism

2016-02-20 Thread Ufuk Celebi
On Sat, Feb 20, 2016 at 10:12 AM, Aljoscha Krettek wrote:
> IMHO the only change for 2) is that you possibly get better machine 
> utilization because it will use more parallel threads.  So I think it’s a 
> valid approach.
>
> @Ufuk, could there be problems with the number of network buffers? I think 
> not, because the connections are multiplexed in one channel, is this correct?

I would not expect it to become a problem. If it does, it's easy to
resolve by throwing a little more memory at the problem. [1]

– Ufuk

[1] 
https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers


Re: Using numberOfTaskSlots to control parallelism

2016-02-20 Thread Aljoscha Krettek
IMHO the only change for 2) is that you possibly get better machine utilization 
because it will use more parallel threads.  So I think it’s a valid approach.

@Ufuk, could there be problems with the number of network buffers? I think not, 
because the connections are multiplexed in one channel, is this correct?

I’ll also talk with the others to see if we can resolve the watermark/Kafka 
partition issues before the 1.0 release.

> On 20 Feb 2016, at 02:14, Zach Cox  wrote:
> 
> What would the differences be between these scenarios?
> 
> 1) one task manager with numberOfTaskSlots=1 and one job with parallelism=1
> 
> 2) one task manager with numberOfTaskSlots=10 and one job with parallelism=10
> 
> In both cases all of the job's tasks get executed within the one task 
> manager's JVM. Are there any downsides to doing #2 instead of #1?
> 
> I ask this question because of current issues related to watermarks with 
> Kafka sources [1] [2] and changing parallelism with savepoints [3]. I'm 
> writing a Flink job that processes events from Kafka topics that have 12 
> partitions. I'm wondering if I should just set the job parallelism=12 and 
> make numberOfTaskSlots sum to 12 across however many task managers I set up. 
> It seems like watermarks would work properly then, and I could effectively 
> change job parallelism using the number of task managers (e.g. 1 TM with 
> slots=12, or 2 TMs with slots=6, or 12 TMs with slots=1, etc).
> 
> Am I missing any important details that would make this a bad idea? It seems 
> like a bit of abuse of numberOfTaskSlots, but also seems like a fairly simple 
> solution to a few current issues.
> 
> Thanks,
> Zach
> 
> [1] 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kafka-partition-alignment-for-event-time-tt4782.html
> [2] https://issues.apache.org/jira/browse/FLINK-3375
> [3] 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Changing-parallelism-tt4967.html
>
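
For reference, the layouts Zach describes (parallelism fixed at 12, spread
over a varying number of task managers) can be enumerated with a small
plain-Python sketch. The helper name is hypothetical, and it assumes every
task manager is given the same numberOfTaskSlots.

```python
def slot_layouts(total_slots):
    """Return every (task_managers, slots_per_tm) pair whose product is total_slots."""
    return [(tms, total_slots // tms)
            for tms in range(1, total_slots + 1)
            if total_slots % tms == 0]

# 12 Kafka partitions -> job parallelism 12 -> 12 slots in total.
for tms, slots in slot_layouts(12):
    print(f"{tms} TM(s) x numberOfTaskSlots={slots}")
```

This only lists the even splits; which one to pick depends on the machines
available, and none of them changes the job parallelism itself.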