Re: working with flink Session Windows

2018-07-20 Thread antonio saldivar
Hello

Actually I am evaluating my WindowFunction with an alert trigger, using
something like the code below (testing with 2 different windows), and expecting
5K elements per second to arrive.


SingleOutputStreamOperator windowedElem = element
    .keyBy("id")
    .timeWindow(Time.seconds(120))
    // .window(EventTimeSessionWindows.withGap(Time.minutes(20)))
    .trigger(new AlertTrigger(env.getStreamTimeCharacteristic()))
    .aggregate(new EleAggregator(), new EleWindowFn());
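[The AlertTrigger referenced above is not included in the message. For readers of the archive, here is a minimal sketch of what a threshold-style alert trigger can look like on Flink's Trigger API; the class name, count threshold, and event-time-only handling are illustrative assumptions, not antonio's actual implementation:]

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.common.state.ReducingState;
import org.apache.flink.api.common.state.ReducingStateDescriptor;
import org.apache.flink.api.common.typeutils.base.LongSerializer;
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

// Hypothetical threshold trigger: fires early once `threshold` elements have
// been seen in the window, and otherwise fires at the end of the window.
public class ThresholdTrigger<T> extends Trigger<T, TimeWindow> {

    private final long threshold;
    private final ReducingStateDescriptor<Long> countDesc = new ReducingStateDescriptor<>(
            "count", (ReduceFunction<Long>) (a, b) -> a + b, LongSerializer.INSTANCE);

    public ThresholdTrigger(long threshold) {
        this.threshold = threshold;
    }

    @Override
    public TriggerResult onElement(T element, long timestamp, TimeWindow window,
                                   TriggerContext ctx) throws Exception {
        // fallback: fire when the window ends
        ctx.registerEventTimeTimer(window.maxTimestamp());
        ReducingState<Long> count = ctx.getPartitionedState(countDesc);
        count.add(1L);
        if (count.get() >= threshold) {
            count.clear();
            return TriggerResult.FIRE; // alert: threshold reached within the window
        }
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
        return time == window.maxTimestamp() ? TriggerResult.FIRE : TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }

    @Override
    public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
        ctx.getPartitionedState(countDesc).clear();
        ctx.deleteEventTimeTimer(window.maxTimestamp());
    }
}

[A trigger like this fires the window early whenever the running count crosses the threshold, which matches the "alert when an element reaches some thresholds within the window" requirement discussed in this thread.]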


On Fri, Jul 20, 2018 at 21:23, Hequn Cheng ()
wrote:

> Hi antonio,
>
> I think it is worth a try to test the performance in your scenario, since job
> performance can be affected by a number of factors (e.g., your WindowFunction).
>
> Best, Hequn
>
> On Sat, Jul 21, 2018 at 2:59 AM, antonio saldivar 
> wrote:
>
>> Hello
>>
>> I am building an app, and for this use case I want to test session windows.
>> I am not sure whether this will be expensive in compute resources, because
>> the gap will be 10, 20, or 60 minutes; I want to trigger an alert if an
>> element reaches certain thresholds within those periods of
>> time.
>>
>> flink version 1.4.2
>>
>> Thank you
>> Best Regards
>> Antonio Saldivar
>>
>
>


Re: working with flink Session Windows

2018-07-20 Thread Hequn Cheng
Hi antonio,

I think it is worth a try to test the performance in your scenario, since job
performance can be affected by a number of factors (e.g., your WindowFunction).

Best, Hequn

On Sat, Jul 21, 2018 at 2:59 AM, antonio saldivar 
wrote:

> Hello
>
> I am building an app, and for this use case I want to test session windows.
> I am not sure whether this will be expensive in compute resources, because
> the gap will be 10, 20, or 60 minutes; I want to trigger an alert if an
> element reaches certain thresholds within those periods of
> time.
>
> flink version 1.4.2
>
> Thank you
> Best Regards
> Antonio Saldivar
>


Re: Flink on Mesos: containers question

2018-07-20 Thread Renjie Liu
Hi, Alexei:

What you pasted is the expected behavior: the job manager and the two task
managers should each run in a Docker instance.

13276 should be the process of the job manager, and it's the same
process as 789. They have different process IDs because they are shown in
different namespaces (a Linux kernel isolation mechanism that Docker depends on).

On Thu, Jul 19, 2018 at 10:00 PM Till Rohrmann  wrote:

> Hi Alexei,
>
> I actually never used Mesos with container images. I always used it in a
> way where the Mesos task directly starts the Java process.
>
> Cheers,
> Till
>
> On Thu, Jul 19, 2018 at 2:44 PM NEKRASSOV, ALEXEI  wrote:
>
>> Till,
>>
>>
>>
>> Any insight into how Flink components are containerized in Mesos?
>>
>>
>>
>> Thanks!
>>
>> Alex
>>
>>
>>
>> *From:* Fabian Hueske [mailto:fhue...@gmail.com]
>> *Sent:* Monday, July 16, 2018 7:57 AM
>> *To:* NEKRASSOV, ALEXEI 
>> *Cc:* user@flink.apache.org; Till Rohrmann 
>> *Subject:* Re: Flink on Mesos: containers question
>>
>>
>>
>> Hi Alexei,
>>
>>
>>
>> Till (in CC) is familiar with Flink's Mesos support in 1.4.x.
>>
>>
>>
>> Best, Fabian
>>
>>
>>
>> 2018-07-13 15:07 GMT+02:00 NEKRASSOV, ALEXEI :
>>
>> Can someone please clarify how Flink on Mesos is containerized?
>>
>>
>>
>> On 5-node Mesos cluster I started Flink (1.4.2) with two Task Managers.
>> Mesos shows “flink” task and two “taskmanager” tasks, all on the same VM.
>>
>> On that VM I see one Docker container running a process that seems to be
>> Mesos App Master:
>>
>>
>>
>> $ docker ps -a
>>
>> CONTAINER ID        IMAGE                             COMMAND                  CREATED        STATUS        PORTS   NAMES
>> 97b6840466c0        mesosphere/dcos-flink:1.4.2-1.0   "/bin/sh -c /sbin/..."   41 hours ago   Up 41 hours           mesos-a0079d85-9ccb-4c43-8d31-e6b1ad750197
>>
>> $ docker exec 97b6840466c0 /bin/ps -efww
>>
>> UIDPID  PPID  C STIME TTY  TIME CMD
>>
>> root 1 0  0 Jul11 ?00:00:00 /bin/sh -c /sbin/init.sh
>>
>> root 7 1  0 Jul11 ?00:00:02 runsvdir -P /etc/service
>>
>> root 8 7  0 Jul11 ?00:00:00 runsv flink
>>
>> root   629 0  0 Jul12 pts/000:00:00 /bin/bash
>>
>> root   789 8  1 Jul12 ?00:09:16
>> /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -classpath
>> /flink-1.4.2/lib/flink-python_2.11-1.4.2.jar:/flink-1.4.2/lib/flink-shaded-hadoop2-uber-1.4.2.jar:/flink-1.4.2/lib/log4j-1.2.17.jar:/flink-1.4.2/lib/slf4j-log4j12-1.7.7.jar:/flink-1.4.2/lib/flink-dist_2.11-1.4.2.jar::/etc/hadoop/conf/:
>> -Dlog.file=/mnt/mesos/sandbox/flink--mesos-appmaster-alex-tfc87d-private-agents-3.novalocal.log
>> -Dlog4j.configuration=file:/flink-1.4.2/conf/log4j.properties
>> -Dlogback.configurationFile=file:/flink-1.4.2/conf/logback.xml
>> org.apache.flink.mesos.runtime.clusterframework.MesosApplicationMasterRunner
>> -Dblob.server.port=23170 -Djobmanager.heap.mb=256
>> -Djobmanager.rpc.port=23169 -Djobmanager.web.port=23168
>> -Dmesos.artifact-server.port=23171 -Dmesos.initial-tasks=2
>> -Dmesos.resourcemanager.tasks.cpus=2 -Dmesos.resourcemanager.tasks.mem=2048
>> -Dtaskmanager.heap.mb=512 -Dtaskmanager.memory.preallocate=true
>> -Dtaskmanager.numberOfTaskSlots=1 -Dparallelism.default=1
>> -Djobmanager.rpc.address=localhost -Dmesos.resourcemanager.framework.role=*
>> -Dsecurity.kerberos.login.use-ticket-cache=true
>>
>> root  1027 0  0 12:54 ?00:00:00 /bin/ps -efww
>>
>>
>>
>> Then on the VM itself I see another process with the same command line as
>> the one in the container:
>>
>>
>>
>> root 13276  9689  1 Jul12 ?00:09:18
>> /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -classpath /flink-1.4.2/lib/flink-python_2.11-1.4.2.jar:/flink-1.4.2/lib/flink-shaded-hadoop2-uber-1.4.2.jar:/flink-1.4.2/lib/log4j-1.2.17.jar:/flink-1.4.2/lib/slf4j-log4j12-1.7.7.jar:/flink-1.4.2/lib/flink-dist_2.11-1.4.2.jar::/etc/hadoop/conf/:
>> -Dlog.file=/mnt/mesos/sandbox/flink--mesos-appmaster-alex-tfc87d-private-agents-3.novalocal.log
>> -Dlog4j.configuration=file:/flink-1.4.2/conf/log4j.properties
>> -Dlogback.configurationFile=file:/flink-1.4.2/conf/logback.xml
>> org.apache.flink.mesos.runtime.clusterframework.MesosApplicationMasterRunner
>> -Dblob.server.port=23170 -Djobmanager.heap.mb=256
>> -Djobmanager.rpc.port=23169 -Djobmanager.web.port=23168
>> -Dmesos.artifact-server.port=23171 -Dmesos.initial-tasks=2
>> -Dmesos.resourcemanager.tasks.cpus=2 -Dmesos.resourcemanager.tasks.mem=2048
>> -Dtaskmanager.heap.mb=512 -Dtaskmanager.memory.preallocate=true
>> -Dtaskmanager.numberOfTaskSlots=1 -Dparallelism.default=1
>> -Djobmanager.rpc.address=localhost -Dmesos.resourcemanager.framework.role=*
>> -Dsecurity.kerberos.login.use-ticket-cache=true
>>
>>
>>
>> And I see two processes on the VM that seem to be related to Task
>> Managers:
>>
>>
>>
>> root 13688 13687  0 Jul12 ?00:04:25
>> /docker-java-home/jre/bin/java -Xms1448m -Xmx1448m -classpath
>> /mnt/mesos/sandbox/flink/lib/flink

Re: ProcessFunction example from the documentation giving me error

2018-07-20 Thread anna stax
It is not the code, but I don't know what the problem is.
A simple word count with socketTextStream used to work but now gives the
same error. Apps with a Kafka source that used to work are giving the same
error. When I have a source generator within the app itself, it works fine.

So, both socketTextStream and the Kafka source give me a
java.net.ConnectException: Operation timed out (Connection timed out) error.
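[For reference: that ConnectException means the socket source could not connect, i.e. nothing was listening on the host/port passed to socketTextStream when the job started. A minimal stand-in server to test against; the port 9999 is an assumption, since the actual port was lost in archiving:]

import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

// Tiny stand-in for `nc -lk 9999`: accepts one connection and emits lines so
// that env.socketTextStream("localhost", 9999) has something to connect to.
public class LineServer {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(9999)) {
            Socket client = server.accept(); // Flink's socket source connects here
            PrintWriter out = new PrintWriter(client.getOutputStream(), true);
            for (int i = 0; i < 100; i++) {
                out.println("word" + i); // one element per line
                Thread.sleep(100);
            }
        }
    }
}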

On Fri, Jul 20, 2018 at 10:29 AM, anna stax  wrote:

> My object name is CreateUserNotificationRequests, that's why you see
> CreateUserNotificationRequests in the error message.
> I edited the object name after pasting the code. Hope there is no
> confusion and I get some help.
> Thanks
>
>
>
> On Fri, Jul 20, 2018 at 10:10 AM, anna stax  wrote:
>
>> Hello all,
>>
>> This is my code, just trying to make the code example at
>> https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/operators/process_function.html
>> work
>>
>> object ProcessFunctionTest {
>>
>>   def main(args: Array[String]) {
>>
>> val env = StreamExecutionEnvironment.getExecutionEnvironment
>> env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime)
>> val text = env.socketTextStream("localhost", )
>>
>> val text1 = text.map(s => (s,s)).keyBy(0).process(new
>> CountWithTimeoutFunction())
>>
>> text1.print()
>>
>> env.execute("CountWithTimeoutFunction")
>>   }
>>
>>   case class CountWithTimestamp(key: String, count: Long, lastModified:
>> Long)
>>
>>   class CountWithTimeoutFunction extends ProcessFunction[(String,
>> String), (String, Long)] {
>>
>> lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext
>>   .getState(new ValueStateDescriptor[CountWithTimestamp]("myState",
>> classOf[CountWithTimestamp]))
>>
>> override def processElement(value: (String, String), ctx:
>> ProcessFunction[(String, String), (String, Long)]#Context, out:
>> Collector[(String, Long)]): Unit = {
>> ..
>> }
>>
>> override def onTimer(timestamp: Long, ctx: ProcessFunction[(String,
>> String), (String, Long)]#OnTimerContext, out: Collector[(String, Long)]):
>> Unit = {
>> ...
>> }
>>   }
>> }
>>
>>
>> Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: java.net.ConnectException: Operation timed out (Connection timed out)
>> at org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:625)
>> at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:121)
>> at org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:654)
>> at com.whil.flink.streaming.CreateUserNotificationRequests$.main(CreateUserNotificationRequests.scala:42)
>> at com.whil.flink.streaming.CreateUserNotificationRequests.main(CreateUserNotificationRequests.scala)
>> Caused by: java.net.ConnectException: Operation timed out (Connection timed out)
>> at java.net.PlainSocketImpl.socketConnect(Native Method)
>> at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>> at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>> at java.net.Socket.connect(Socket.java:589)
>> at org.apache.flink.streaming.api.functions.source.SocketTextStreamFunction.run(SocketTextStreamFunction.java:96)
>> at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
>> at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56)
>> at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
>> at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
>> at java.lang.Thread.run(Thread.java:745)
>>
>> On Thu, Jul 19, 2018 at 11:22 PM, vino yang 
>> wrote:
>>
>>> Hi anna,
>>>
>>> Can you share your program, the exception stack trace, and more
>>> details about your source and state backend?
>>>
>>> From the information you provided, it seems Flink started a network
>>> connection but timed out.
>>>
>>> Thanks, vino.
>>>
>>> 2018-07-20 14:14 GMT+08:00 anna stax :
>>>
 Hi all,

 I am new to Flink. I am using the classes CountWithTimestamp and
 CountWithTimeoutFunction from the examples found in
 https://ci.apache.org/projects/flink/flink-docs-release-1.5/
 dev/stream/operators/process_function.html

 I am getting the error Exception in thread "main"
 org.apache.flink.runtime.client.JobExecutionException:
 java.net.ConnectException: Operation timed out (Connection timed out)

 Looks like this error occurs when the timer's time is reached. Any
 idea why? Please help.

 Thanks


Re: Query regarding rest.port property

2018-07-20 Thread Chesnay Schepler
Effectively you can't disable them selectively; the reason being that they
are actually one and the same.
The ultimate solution is to build flink-dist yourself and exclude
"flink-runtime-web" from it, which removes the required files.

Note that being able to selectively disable them _for security reasons_ 
wouldn't do you much good as far as I can tell; if you have access to 
the REST API you can do anything the UI does. Similarly, if you can 
restrict access to the REST API, you also do that for the UI.
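[A quick way to verify the point above, i.e. that the REST API rather than only the UI is reachable, is to hit one of the monitoring endpoints directly. A sketch; the host and port stand in for your rest.address/rest.port settings:]

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Minimal connectivity check against Flink's REST API (the same endpoints the
// web UI uses). Host and port are placeholders.
public class RestCheck {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://myhost:8081/jobs/overview");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        System.out.println("HTTP " + conn.getResponseCode());
        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // JSON listing of jobs
            }
        }
    }
}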


On 20.07.2018 18:14, Vino yang wrote:

Hi Vinay,

Does the job manager run on node "myhost"? Did you check that the port
you specified is open for remote access?

Can you try starting the web UI, but just blocking its port?


Vino yang
Thanks.


On 2018-07-20 22:48 , Vinay Patil  Wrote:

Hi,

We have disabled Flink Web UI for security reasons however we want
to use REST Api for monitoring purpose. For that I have set
jobmanager.web.port = -1 , rest.port=,


  rest.address=myhost


But I am not able to access any REST api using
https://myhost:/

Is it mandatory to have Flink Web UI running or am I missing any
configuration ?

Regards,
Vinay Patil





Re: org.apache.flink.streaming.runtime.streamrecord.LatencyMarker cannot be cast to org.apache.flink.streaming.runtime.streamrecord.StreamRecord

2018-07-20 Thread Gregory Fee
This is on Flink 1.4.2. I filed it as FLINK-9905. Thanks!

On Fri, Jul 20, 2018 at 2:51 AM, Dawid Wysakowicz 
wrote:

> Hi Gregory,
> I think it is some flink bug. Could you file a JIRA for it? Also which
> version of flink are you using?
> Best,
> Dawid
>
> On Fri, 20 Jul 2018 at 04:34, vino yang  wrote:
>
>> Hi Gregory,
>>
>> This exception seems to be a bug; you can create an issue in JIRA.
>>
>> Thanks, vino.
>>
>> 2018-07-20 10:28 GMT+08:00 Philip Doctor :
>>
>>> Oh you were asking about the cast exception, I haven't seen that before,
>>> sorry to be off topic.
>>>
>>>
>>>
>>>
>>> --
>>> *From:* Philip Doctor 
>>> *Sent:* Thursday, July 19, 2018 9:27:15 PM
>>> *To:* Gregory Fee; user
>>> *Subject:* Re: org.apache.flink.streaming.runtime.streamrecord.LatencyMarker
>>> cannot be cast to org.apache.flink.streaming.runtime.streamrecord.
>>> StreamRecord
>>>
>>>
>>>
>>> I'm just a Flink user, not an expert. I've seen that exception before.
>>> I have never seen it be the actual error; I usually see it when some other
>>> operator is throwing an uncaught exception and busy dying. It seems to me
>>> that the prior operator throws this "Could not forward element to next
>>> operator" error because the next operator is already dead while the job is
>>> dying asynchronously, so you get a cloud of errors that surround the root
>>> cause. I'd read your logs a little further back.
>>> --
>>> *From:* Gregory Fee 
>>> *Sent:* Thursday, July 19, 2018 9:10:37 PM
>>> *To:* user
>>> *Subject:* org.apache.flink.streaming.runtime.streamrecord.LatencyMarker
>>> cannot be cast to org.apache.flink.streaming.runtime.streamrecord.
>>> StreamRecord
>>>
>>> Hello, I have a job running and I've gotten this error a few times. The
>>> job recovers from a checkpoint and seems to continue forward fine. Then the
>>> error will happen again sometime later, perhaps 1 hour. This looks like a
>>> Flink bug to me but I could use an expert opinion. Thanks!
>>>
>>> org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator
>>>
>>> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:566)
>>>
>>> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:524)
>>>
>>> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:504)
>>>
>>> at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:830)
>>>
>>> at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:808)
>>>
>>> at com.lyft.streamingplatform.BetterBuffer$OutputThread.run(BetterBuffer.java:316)
>>>
>>> Caused by: org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator
>>>
>>> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:566)
>>>
>>> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:524)
>>>
>>> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:504)
>>>
>>> at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:830)
>>>
>>> at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:808)
>>>
>>> at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
>>>
>>> at com.lyft.dsp.functions.Labeler$UnlabelerFunction.processElement(Labeler.java:67)
>>>
>>> at com.lyft.dsp.functions.Labeler$UnlabelerFunction.processElement(Labeler.java:48)
>>>
>>> at org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
>>>
>>> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:549)
>>>
>>> ... 5 more
>>>
>>> Caused by: org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator
>>>
>>> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:566)
>>>
>>> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:524)
>>>
>>> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:504)
>>>
>>> at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:830)
>>>
>>> at 

working with flink Session Windows

2018-07-20 Thread antonio saldivar
Hello

I am building an app, and for this use case I want to test session windows.
I am not sure whether this will be expensive in compute resources, because
the gap will be 10, 20, or 60 minutes; I want to trigger an alert if an
element reaches certain thresholds within those periods of
time.

flink version 1.4.2

Thank you
Best Regards
Antonio Saldivar


Re: Flink 1.4.2 -> 1.5.0 upgrade Queryable State debugging

2018-07-20 Thread Philip Doctor
Yeah, I just went to reproduce this on a fresh environment: I blew away all the
Zookeeper data and the error went away. I'm running an HA JobManager (1 active, 2
standby) and 3 TMs. I'm not sure how to fully account for this behavior yet;
it looks like I can make this run from a totally fresh environment, so until I
start practicing how to savepoint + restore a 1.4.2 -> 1.5.0 job, I look to be
good for the moment. Sorry to have bothered you all.


Thanks.


From: Dawid Wysakowicz 
Sent: Friday, July 20, 2018 3:09:46 AM
To: Philip Doctor
Cc: user; Kostas Kloudas
Subject: Re: Flink 1.4.2 -> 1.5.0 upgrade Queryable State debugging

Hi Philip,
Could you attach the full stack trace? Are you querying the same job/cluster in 
both tests?
I am also looping in Kostas, who might know more about changes in Queryable 
state between 1.4.2 and 1.5.0.
Best,
Dawid

On Thu, 19 Jul 2018 at 22:33, Philip Doctor 
mailto:philip.doc...@physiq.com>> wrote:
Dear Flink Users,
I'm trying to upgrade to Flink 1.5.0; so far everything works except for the
Queryable State client. Now here's where it gets weird. I have the client
sitting behind a web API so the rest of our non-Java ecosystem can consume it.
I've got 2 tests: one calls my route directly as a Java method call, the other
calls the deployed server via HTTP (the difference between the tests is intended
to exercise whether the server is properly started, etc.).

The local call succeeds and the remote call fails, but the failure isn't what
I'd expect (around the server layer); it's complaining about contacting the
oracle for state location:


 Caused by: 
org.apache.flink.queryablestate.exceptions.UnknownLocationException: Could not 
contact the state location oracle to retrieve the state location.\n\tat 
org.apache.flink.queryablestate.client.proxy.KvStateClientProxyHandler.getKvStateLookupInfo(KvStateClientProxyHandler.java:228)\n


The client call is pretty straightforward:

return client.getKvState(
flinkJobID,
stateDescriptor.queryableStateName,
key,
keyTypeInformation,
stateDescriptor
)
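[For reference, a self-contained sketch of how that call is typically wired up against the 1.5 client; the proxy host/port, job id, state name, key, and types below are placeholders, not the actual values from this thread:]

import org.apache.flink.api.common.JobID;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.queryablestate.client.QueryableStateClient;

import java.util.concurrent.CompletableFuture;

public class QueryExample {
    public static void main(String[] args) throws Exception {
        // connects to the queryable state proxy on a task manager (default port 9069)
        QueryableStateClient client = new QueryableStateClient("proxy-host", 9069);

        ValueStateDescriptor<Long> descriptor =
                new ValueStateDescriptor<>("myQueryableState", Long.class);

        CompletableFuture<ValueState<Long>> future = client.getKvState(
                JobID.fromHexString("0123456789abcdef0123456789abcdef"),
                "myQueryableState",
                "some-key",
                BasicTypeInfo.STRING_TYPE_INFO,
                descriptor);

        // an UnknownLocationException surfaces here when the proxy cannot reach
        // the job manager's state location oracle
        System.out.println(future.get().value());
    }
}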

I've confirmed via logs that I have the exact same key, flink job ID, and 
queryable state name.

So I'm going bonkers over what difference might exist. I'm wondering if I'm
packing my jar wrong and there's some resource I need to look out for (back
on Flink 1.3.x I had to handle the reference.conf file for Akka when you were
depending on that for Queryable State; is there something like that here?). Is
there more logging somewhere on the server side that might give me a hint,
like "Tried to query state for X but couldn't find the BananaLever"? I'm
pretty stuck right now and ready to try any random ideas to move forward.


The only change I've made aside from the jar version bump from 1.4.2 to 1.5.0
is handling queryableStateName (which is now nullable); no other code changes
were required, and to confirm, this all works just fine with 1.4.2.


Thank you.


akka.stream.materializer exception

2018-07-20 Thread Rad Rad
Hi,

I have a problem when running a streaming Flink job from a jar file using the
CLI; the program works fine when it runs from the IDE.
The main exception is [1].
When I searched for this exception, I tried to solve it by adding akka
dependencies to my pom.xml file as in [2] and the maven shade plugin for jar
execution as in [3].

I will be thankful if anyone helps me to solve this problem.

[1]
Could not resolve substitution to a value: ${akka.stream.materializer}

[2]
<dependency>
    <groupId>com.typesafe.akka</groupId>
    <artifactId>akka-actor_2.11</artifactId>
    <version>2.4.16</version>
</dependency>
<dependency>
    <groupId>com.typesafe.akka</groupId>
    <artifactId>akka-protobuf_2.11</artifactId>
    <version>2.4.16</version>
</dependency>
<dependency>
    <groupId>com.typesafe.akka</groupId>
    <artifactId>akka-stream_2.11</artifactId>
    <version>2.4.16</version>
</dependency>

[3]
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.0.0</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <mainClass>org.sonatype.haven.HavenCli</mainClass>
                    </transformer>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
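
[For context: the ${akka.stream.materializer} error in [1] is commonly reported when a shaded fat jar keeps only one of the Akka jars' reference.conf files, which leaves the substitution undefined at runtime. One commonly suggested remedy, shown here only as a sketch and not a confirmed fix for this particular job, is to merge those files with an AppendingTransformer alongside the manifest transformer in [3]:]

<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
    <resource>reference.conf</resource>
</transformer>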


Re: ProcessFunction example from the documentation giving me error

2018-07-20 Thread anna stax
My object name is CreateUserNotificationRequests, that's why you see
CreateUserNotificationRequests
in the error message.
I edited the object name after pasting the code. Hope there is no
confusion and I get some help.
Thanks



On Fri, Jul 20, 2018 at 10:10 AM, anna stax  wrote:

> Hello all,
>
> This is my code, just trying to make the code example at
> https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/operators/process_function.html
> work
>
> object ProcessFunctionTest {
>
>   def main(args: Array[String]) {
>
> val env = StreamExecutionEnvironment.getExecutionEnvironment
> env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime)
> val text = env.socketTextStream("localhost", )
>
> val text1 = text.map(s => (s,s)).keyBy(0).process(new
> CountWithTimeoutFunction())
>
> text1.print()
>
> env.execute("CountWithTimeoutFunction")
>   }
>
>   case class CountWithTimestamp(key: String, count: Long, lastModified:
> Long)
>
>   class CountWithTimeoutFunction extends ProcessFunction[(String, String),
> (String, Long)] {
>
> lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext
>   .getState(new ValueStateDescriptor[CountWithTimestamp]("myState",
> classOf[CountWithTimestamp]))
>
> override def processElement(value: (String, String), ctx:
> ProcessFunction[(String, String), (String, Long)]#Context, out:
> Collector[(String, Long)]): Unit = {
> ..
> }
>
> override def onTimer(timestamp: Long, ctx: ProcessFunction[(String,
> String), (String, Long)]#OnTimerContext, out: Collector[(String, Long)]):
> Unit = {
> ...
> }
>   }
> }
>
>
> Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: java.net.ConnectException: Operation timed out (Connection timed out)
> at org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:625)
> at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:121)
> at org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:654)
> at com.whil.flink.streaming.CreateUserNotificationRequests$.main(CreateUserNotificationRequests.scala:42)
> at com.whil.flink.streaming.CreateUserNotificationRequests.main(CreateUserNotificationRequests.scala)
> Caused by: java.net.ConnectException: Operation timed out (Connection timed out)
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
> at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:589)
> at org.apache.flink.streaming.api.functions.source.SocketTextStreamFunction.run(SocketTextStreamFunction.java:96)
> at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
> at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56)
> at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
> at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
> at java.lang.Thread.run(Thread.java:745)
>
> On Thu, Jul 19, 2018 at 11:22 PM, vino yang  wrote:
>
>> Hi anna,
>>
>> Can you share your program, the exception stack trace, and more details
>> about your source and state backend?
>>
>> From the information you provided, it seems Flink started a network
>> connection but timed out.
>>
>> Thanks, vino.
>>
>> 2018-07-20 14:14 GMT+08:00 anna stax :
>>
>>> Hi all,
>>>
>>> I am new to Flink. I am using the classes CountWithTimestamp and
>>> CountWithTimeoutFunction from the examples found in
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.5/
>>> dev/stream/operators/process_function.html
>>>
>>> I am getting the error Exception in thread "main"
>>> org.apache.flink.runtime.client.JobExecutionException:
>>> java.net.ConnectException: Operation timed out (Connection timed out)
>>>
>>> Looks like this error occurs when the timer's time is reached. Any
>>> idea why? Please help.
>>>
>>> Thanks
>>>
>>
>>
>


Re: ProcessFunction example from the documentation giving me error

2018-07-20 Thread anna stax
Hello all,

This is my code, just trying to make the code example at
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/operators/process_function.html
work

object ProcessFunctionTest {

  def main(args: Array[String]) {

val env = StreamExecutionEnvironment.getExecutionEnvironment
env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime)
val text = env.socketTextStream("localhost", )

val text1 = text.map(s => (s,s)).keyBy(0).process(new
CountWithTimeoutFunction())

text1.print()

env.execute("CountWithTimeoutFunction")
  }

  case class CountWithTimestamp(key: String, count: Long, lastModified:
Long)

  class CountWithTimeoutFunction extends ProcessFunction[(String, String),
(String, Long)] {

lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext
  .getState(new ValueStateDescriptor[CountWithTimestamp]("myState",
classOf[CountWithTimestamp]))

override def processElement(value: (String, String), ctx:
ProcessFunction[(String, String), (String, Long)]#Context, out:
Collector[(String, Long)]): Unit = {
..
}

override def onTimer(timestamp: Long, ctx: ProcessFunction[(String,
String), (String, Long)]#OnTimerContext, out: Collector[(String, Long)]):
Unit = {
...
}
  }
}


Exception in thread "main"
org.apache.flink.runtime.client.JobExecutionException:
java.net.ConnectException: Operation timed out (Connection timed out)
at
org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:625)
at
org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:121)
at
org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:654)
at
com.whil.flink.streaming.CreateUserNotificationRequests$.main(CreateUserNotificationRequests.scala:42)
at
com.whil.flink.streaming.CreateUserNotificationRequests.main(CreateUserNotificationRequests.scala)
Caused by: java.net.ConnectException: Operation timed out (Connection timed
out)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at
org.apache.flink.streaming.api.functions.source.SocketTextStreamFunction.run(SocketTextStreamFunction.java:96)
at
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
at
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56)
at
org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
at java.lang.Thread.run(Thread.java:745)

On Thu, Jul 19, 2018 at 11:22 PM, vino yang  wrote:

> Hi anna,
>
> Can you share your program, the exception stack trace, and more details
> about your source and state backend?
>
> From the information you provided, it seems Flink started a network
> connection but timed out.
>
> Thanks, vino.
>
> 2018-07-20 14:14 GMT+08:00 anna stax :
>
>> Hi all,
>>
>> I am new to Flink. I am using the classes CountWithTimestamp and
>> CountWithTimeoutFunction from the examples found in
>> https://ci.apache.org/projects/flink/flink-docs-release-1.5/
>> dev/stream/operators/process_function.html
>>
>> I am getting the error Exception in thread "main"
>> org.apache.flink.runtime.client.JobExecutionException:
>> java.net.ConnectException: Operation timed out (Connection timed out)
>>
>> Looks like this error occurs when the timer's time is reached. Any idea
>> why? Please help.
>>
>> Thanks
>>
>
>


Re: Flink job hangs using rocksDb as backend

2018-07-20 Thread shishal singh
Hi Richter,

Actually, for testing I have now reduced the number of timers to a few
thousand (5-6K), but my job still gets stuck randomly, and it is not
reproducible each time. When I restart the job it starts working again
for a few hours/days, then gets stuck again.
I took a thread dump when my job hung with almost 100% CPU. The most
CPU-consuming thread has the following stack:

It looks like sometimes it is not able to read data from RocksDB.

*"process (3/6)" #782 prio=5 os_prio=0 tid=0x7f68b81ddcf0 nid=0xee73
runnable [0x7f688d83a000]*
*   java.lang.Thread.State: RUNNABLE*
* at org.rocksdb.RocksDB.get(Native Method)*
* at org.rocksdb.RocksDB.get(RocksDB.java:810)*
* at
org.apache.flink.contrib.streaming.state.RocksDBMapState.contains(RocksDBMapState.java:137)*
* at
org.apache.flink.runtime.state.UserFacingMapState.contains(UserFacingMapState.java:72)*
* at
nl.ing.gmtt.observer.analyser.job.rtpe.process.RtpeProcessFunction.isEventExist(RtpeProcessFunction.java:150)*
* at
nl.ing.gmtt.observer.analyser.job.rtpe.process.RtpeProcessFunction.onTimer(RtpeProcessFunction.java:93)*
* at
org.apache.flink.streaming.api.operators.KeyedProcessOperator.onEventTime(KeyedProcessOperator.java:75)*
* at
org.apache.flink.streaming.api.operators.HeapInternalTimerService.advanceWatermark(HeapInternalTimerService.java:288)*
* at
org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:108)*
* at
org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:885)*
* at
org.apache.flink.streaming.runtime.io.StreamInputProcessor$ForwardingValveOutputHandler.handleWatermark(StreamInputProcessor.java:288)*
* - locked <0x000302b61458> (a java.lang.Object)*
* at
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:189)*
* at
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:111)*
* at
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:189)*
* at
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:69)*
* at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:264)*
* at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)*
* at java.lang.Thread.run(Thread.java:748)*

*   Locked ownable synchronizers:*
* - None*

*"process (2/6)" #781 prio=5 os_prio=0 tid=0x7f68b81dcef0 nid=0xee72
runnable [0x7f688fe54000]*
*   java.lang.Thread.State: RUNNABLE*
* at org.rocksdb.RocksDB.get(Native Method)*
* at org.rocksdb.RocksDB.get(RocksDB.java:810)*
* at
org.apache.flink.contrib.streaming.state.RocksDBMapState.get(RocksDBMapState.java:102)*
* at
org.apache.flink.runtime.state.UserFacingMapState.get(UserFacingMapState.java:47)*
* at
nl.ing.gmtt.observer.analyser.job.rtpe.process.RtpeProcessFunction.onTimer(RtpeProcessFunction.java:99)*
* at
org.apache.flink.streaming.api.operators.KeyedProcessOperator.onEventTime(KeyedProcessOperator.java:75)*
* at
org.apache.flink.streaming.api.operators.HeapInternalTimerService.advanceWatermark(HeapInternalTimerService.java:288)*
* at
org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:108)*
* at
org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:885)*
* at
org.apache.flink.streaming.runtime.io.StreamInputProcessor$ForwardingValveOutputHandler.handleWatermark(StreamInputProcessor.java:288)*
* - locked <0x000302b404a0> (a java.lang.Object)*
* at
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:189)*
* at
org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:111)*
* at
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:189)*
* at
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:69)*
* at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:264)*
* at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)*
* at java.lang.Thread.run(Thread.java:748)*

*   Locked ownable synchronizers:*


Regards,
Shishal


On Thu, Jul 12, 2018 at 4:11 PM Stefan Richter 
wrote:

> Hi,
>
> Did you check the metrics for the garbage collector? Being stuck with high CPU
> consumption and lots of timers sounds like there could be a possible
> problem, because timers are currently on-heap objects, but we are working on
> RocksDB-based timers right now.
>
> Best,
> Stefan
>
> Am 12.07.2018 um 14:54 schrieb shishal singh :
>
> Thanks Stefan/Stephan/Nico,
>
> Indeed there are 2 problems. For the 2nd problem, I am almost certain that
> the explanation given by Stephan is the

Re: Query regarding rest.port property

2018-07-20 Thread Vino yang
Hi Vinay,

Does the job manager run on node "myhost"? Did you check that the port you
specified is open for remote access? Can you try starting the web UI, but
just blocking its port?

Vino yang
Thanks.

On 2018-07-20 22:48, Vinay Patil wrote:

Hi,

We have disabled the Flink Web UI for security reasons; however, we want to
use the REST API for monitoring purposes. For that I have set
jobmanager.web.port = -1, rest.port=, rest.address=myhost

But I am not able to access any REST API using https://myhost:/

Is it mandatory to have the Flink Web UI running, or am I missing any
configuration?

Regards,
Vinay Patil

Re: Re: Bootstrapping the state

2018-07-20 Thread Vino yang
Hi Henkka,

If you want to customize the DataStream text source for your purpose, you can
use a read counter: if the value of the counter does not change within an
interval, you can assume all the data has been read. That is just one idea;
you can choose another solution.

Creating a savepoint automatically on job exit sounds like a good idea. I do
not know of any plan for this; I will try to submit the idea to the community.

And about "how to bootstrap a state", what does that mean? Can you explain?

Thanks, vino

On 2018-07-20 20:00, Henri Heiskanen wrote:

Hi,

Thanks. Just to clarify, where would you then invoke the savepoint creation?
I basically need to know when all data is read, create a savepoint and then
exit. I think I could just as well use PROCESS_CONTINUOUSLY, monitor the
stream outside and then use the REST API to cancel with a savepoint.

Any plans for a feature where I could have Flink make a savepoint on job
exit? I am also keen on hearing other ideas on how to bootstrap a state. I
was initially thinking of just reading the data from Cassandra if no state
is available.

Br,
Henkka

On Thu, Jul 19, 2018 at 3:15 PM vino yang wrote:

Hi Henkka,

The behavior of the text file source meets expectations. Flink will not keep
your source task thread alive after it exits from its invoke method. That
means you should keep your source task alive. So to implement this, you
should customize a text file source (implement the SourceFunction interface).
For your requirement, you can check for a no-more-data idle timeout; if it
expires, then exit, and finally the job will stop. You can also refer to the
implementation of other source connectors.

Thanks, vino.

2018-07-19 19:52 GMT+08:00 Henri Heiskanen:

Hi,

I've been looking into how to initialise large state and especially checked
this presentation by Lyft referenced in this group as well:
https://www.youtube.com/watch?v=WdMcyN5QZZQ

In our use case we would like to load roughly 4 billion entries into this
state, and I believe loading this data from S3, creating a savepoint and then
restarting in streaming mode from the savepoint would work very well. In the
presentation I get the impression that I could read from S3 and, when all
done (without any custom termination detector etc.), just make a savepoint by
calling the REST API from the app. However, I've noticed that if I read data
from files the job will auto-terminate when all data is read, and the job
appears not to be running even if I add a sleep in the main program (very
simple app attached below).

I could use FileProcessingMode.PROCESS_CONTINUOUSLY to prevent the job from
terminating and create the savepoint from outside the app, but that would
require termination detection etc. and would make the solution less clean.

Has anyone more details on how I could accomplish this?

Br,
Henkka

public class StreamingJob {

   public static void main(String[] args) throws Exception {
      if (args.length == 0) {
         args = "--initFile init.csv".split(" ");
      }

      // set up the streaming execution environment
      final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

      ParameterTool params = ParameterTool.fromArgs(args);

      String initFile = params.get("initFile");
      if (initFile != null) {
         env.readTextFile(initFile).map(new MapFunction<String, Tuple4<String, String, String, String>>() {
            @Override
            public Tuple4<String, String, String, String> map(String s) throws Exception {
               String[] data = s.split(",");
               return new Tuple4<>(data[0], data[1], data[2], data[3]);
            }
         }).keyBy(0, 1).map(new ProfileInitMapper());
      }

      // execute program
      env.execute("Flink Streaming Java API Skeleton");

      // when all data read, save the state
      Thread.sleep(1);
   }
}
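
[A minimal sketch of the custom source vino describes: read the file, then idle instead of returning, so the job stays RUNNING long enough to trigger a savepoint externally. The class name, path handling, and idle timeout are illustrative assumptions:]

import org.apache.flink.streaming.api.functions.source.SourceFunction;

import java.io.BufferedReader;
import java.io.FileReader;

public class KeepAliveFileSource implements SourceFunction<String> {

    private final String path;
    private final long idleMillis;
    private volatile boolean running = true;

    public KeepAliveFileSource(String path, long idleMillis) {
        this.path = path;
        this.idleMillis = idleMillis;
    }

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line;
            while (running && (line = reader.readLine()) != null) {
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(line);
                }
            }
        }
        // all data emitted; idle instead of returning, so the job keeps running
        // and a savepoint (or cancel-with-savepoint) can be triggered externally
        long deadline = System.currentTimeMillis() + idleMillis;
        while (running && System.currentTimeMillis() < deadline) {
            Thread.sleep(1000);
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}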

Query regarding rest.port property

2018-07-20 Thread Vinay Patil
Hi,

We have disabled Flink Web UI for security reasons however we want to use
REST Api for monitoring purpose. For that I have set jobmanager.web.port =
-1 , rest.port=, rest.address=myhost

But I am not able to access any REST api using https://
myhost:/

Is it mandatory to have Flink Web UI running or am I missing any
configuration ?

Regards,
Vinay Patil


Re: Cannot configure akka.ask.timeout

2018-07-20 Thread Lukas Kircher
Thanks for your answers.

In my use case I am reading from a large number of individual files. Jobs are 
issued directly from the Java API, the results are collected (in memory) and 
re-used partially in follow-up jobs.

I feared that using a MiniCluster or local environment I would not be able to 
overwrite the default for `akka.ask.timeout` without tinkering with Flink's 
source code - thanks for your confirmation. For now I'll stick to this solution 
though.

Cheers,
Lukas
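
[For concreteness, "passing a configuration to the Flink execution environment" looks roughly like the sketch below for a local batch environment. Per Gary's reply quoted underneath, the 1.5.0 MiniCluster ignores akka.ask.timeout and uses a hardcoded 10s, so the first setting has no effect locally; the values are illustrative:]

import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;

public class LocalEnvConfig {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setString("akka.ask.timeout", "100 s"); // ignored by the 1.5.0 MiniCluster, see [1] below
        conf.setString("web.timeout", "100000");     // milliseconds; honored for REST-originated requests

        // local environment started with the explicit configuration
        ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment(conf);
        System.out.println(env.fromElements(1, 2, 3).collect());
    }
}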

> On 19. Jul 2018, at 11:03, Gary Yao  wrote:
> 
> Hi Lukas,
> 
> It seems that when using MiniCluster, the config key akka.ask.timeout is not
> respected. Instead, a hardcoded timeout of 10s is used [1]. Since all
> communication is local, it would be interesting to see in detail what your
> job looks like that it exceeds the timeout.
> 
> The key akka.ask.timeout specifies the default RPC timeout. However, for
> requests originating from the REST API, web.timeout overrides this value. When
> submitting a job using the CLI to a (standalone) cluster, a request is issued
> against the REST API. Therefore, you can try setting web.timeout=10 in the
> flink-conf.yaml as already proposed by Vishal Santoshi.
> 
> Best,
> Gary
> 
> [1] 
> https://github.com/apache/flink/blob/749dd29935f319b062051141e150eed7a1a5f298/flink-runtime/src/main/java/org/apache/flink/runtime/minicluster/MiniCluster.java#L185
>  
> 
> 
> On Fri, Jul 13, 2018 at 12:24 PM, Lukas Kircher  > wrote:
> Hello,
> 
> I have problems setting configuration parameters for Akka in Flink 1.5.0. 
> When I run a job I get the exception listed below which states that Akka 
> timed out after 1ms. I tried to increase the timeout by following the 
> Flink configuration documentation. Specifically I did the following:
> 
> 1) Passed a configuration to the Flink execution environment with 
> `akka.ask.timeout` set to a higher value. I started this in Intellij.
> 2) Passed program arguments via the run configuration in Intellij, e.g. 
> `-Dakka.ask.timeout:100s`
> 3) Added `akka.ask.timeout: 100 s` to flink-conf.yaml and started a local 
> standalone cluster via start-cluster.sh. The setting is reflected in Flink's 
> web interface.
> 
> However - despite explicit configuration the default setting seems to be 
> used. The exception below states in each case that akka ask timed out after 
> 1ms.
> 
> As my problem seems very basic I do not include an SSCCE for now but I can 
> try to build one if this helps figuring out the issue.
> 
> --
> [...]
> Exception in thread "main" 
> org.apache.flink.runtime.client.JobExecutionException: Could not retrieve 
> JobResult.
> [...]
>   at 
> org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:619)
>   at 
> org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:234)
>   at 
> org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:91)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:816)
>   at org.apache.flink.api.java.DataSet.collect(DataSet.java:413)
> [...]
> Caused by: akka.pattern.AskTimeoutException: Ask timed out on 
> [Actor[akka://flink/user/dispatcher8df05371-effc-468b-8a22-e2f364f65d6a#582308583]] after [1 ms]. Sender[null] sent message of type
> "org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
>   at 
> akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)
>   at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)
>   at 
> scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
>   at 
> scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
>   at 
> scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
>   at 
> akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)
>   at 
> akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)
>   at 
> akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)
>   at 
> akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
>   at java.lang.Thread.run(Thread.java:745)
> [...]
> --
> 
> 
> Best regards and thanks for your help,
> Lukas
> 
> 
> 
> 



Re: Bootstrapping the state

2018-07-20 Thread Henri Heiskanen
Hi,

Thanks. Just to clarify, where would you then invoke the savepoint
creation? I basically need to know when all data is read, create a
savepoint and then exit. I think I could just as well use
PROCESS_CONTINUOUSLY, monitor the stream outside and then use the REST API
to cancel with a savepoint.

Any plans for a feature where I could have Flink make a savepoint on job
exit? I am also keen on hearing other ideas on how to bootstrap a state.
I was initially thinking of just reading the data from Cassandra if no
state is available.

Br,
Henkka
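
[The "use the REST API to cancel with a savepoint" route looks roughly like this against the 1.5 REST API; host, port, job id, and target directory are placeholders:]

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class CancelWithSavepoint {
    public static void main(String[] args) throws Exception {
        // POST /jobs/:jobid/savepoints triggers a savepoint; cancel-job stops the job afterwards
        URL url = new URL("http://myhost:8081/jobs/0123456789abcdef0123456789abcdef/savepoints");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        String body = "{\"target-directory\": \"s3://bucket/savepoints\", \"cancel-job\": true}";
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + conn.getResponseCode()); // 202 Accepted with a trigger id
    }
}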

On Thu, Jul 19, 2018 at 3:15 PM vino yang  wrote:

> Hi Henkka,
>
> The behavior of the text file source meets expectations. Flink will not
> keep your source task thread alive after it exits from its invoke method.
> That means you should keep your source task alive. So to implement this, you
> should customize a text file source (implement the SourceFunction interface).
>
> For your requirement, you can check for a no-more-data idle timeout; if it
> expires, then exit, and finally the job will stop.
>
> You can also refer to the implementation of other source connectors.
>
> Thanks, vino.
>
> 2018-07-19 19:52 GMT+08:00 Henri Heiskanen :
>
>> Hi,
>>
>> I've been looking into how to initialise large state and especially
>> checked this presentation by Lyft referenced in this group as well:
>> https://www.youtube.com/watch?v=WdMcyN5QZZQ
>>
>> In our use case we would like to load roughly 4 billion entries into this
>> state and I believe loading this data from s3, creating a savepoint and
>> then restarting in streaming mode from a savepoint would work very well. In
>> the presentation I get an impression that I could read from s3 and when all
>> done (without any custom termination detector etc) I could just make a
>> savepoint by calling the rest api from the app. However, I've noticed that
>> if I read data from files the job will auto-terminate when all data is read
>> and job appears not to be running even if I add the sleep in the main
>> program (very simple app attached below).
>>
>> I could use FileProcessingMode.PROCESS_CONTINUOUSLY to prevent the job
>> from terminating and create the savepoint from outside the app, but that
>> would require termination detection etc and would make the solution less
>> clean.
>>
>> Has anyone more details how I could accomplish this?
>>
>> Br,
>> Henkka
>>
>> public class StreamingJob {
>>
>>public static void main(String[] args) throws Exception {
>>   if (args.length == 0) {
>>  args = "--initFile init.csv".split(" ");
>>   }
>>
>>   // set up the streaming execution environment
>>   final StreamExecutionEnvironment env = 
>> StreamExecutionEnvironment.getExecutionEnvironment();
>>
>>   ParameterTool params = ParameterTool.fromArgs(args);
>>
>>   String initFile = params.get("initFile");
>>   if (initFile != null) {
>>  env.readTextFile(initFile).map(new MapFunction<String, Tuple4<String, String, String, String>>() {
>> @Override
>> public Tuple4<String, String, String, String> map(String s) throws Exception {
>>String[] data = s.split(",");
>>return new Tuple4<>(data[0], data[1], data[2], data[3]);
>> }
>>  }).keyBy(0, 1).map(new ProfileInitMapper());
>>   }
>>
>>   // execute program
>>   env.execute("Flink Streaming Java API Skeleton");
>>
>>   // when all data read, save the state
>>   Thread.sleep(1);
>>}
>> }
>>
>>
>>
>>
>


Re: Joining streamed data to reference data

2018-07-20 Thread Dawid Wysakowicz
Hi James,

1) Unfortunately, Flink does not support DataSet-with-DataStream joins as
of now. If the "batch" table is small enough, you might try the solution
suggested by Vino to load it in the UDTF. You can also try implementing the
stream version of this table yourself. You can use
org.apache.flink.table.sources.CsvTableSource and
org.apache.flink.orc.OrcRowInputFormat as examples.

2) Providing better out-of-the-box support for multiple sources and formats
is high on the roadmap for upcoming releases, so I would guess you can
expect support for ORC in streaming in the near future.

Best,
Dawid
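
[A minimal sketch of the UDTF workaround mentioned in 1): preload the small reference table in open() and emit matching rows from eval(). The class name, loading logic, and schema are illustrative:]

import org.apache.flink.table.functions.FunctionContext;
import org.apache.flink.table.functions.TableFunction;

import java.util.HashMap;
import java.util.Map;

public class RefLookup extends TableFunction<String> {

    private transient Map<String, String> ref;

    @Override
    public void open(FunctionContext context) throws Exception {
        ref = new HashMap<>();
        // in practice: load the reference data from CSV/ORC/Cassandra here
        ref.put("key1", "value1");
    }

    public void eval(String key) {
        String value = ref.get(key);
        if (value != null) {
            collect(value); // emit the matching reference row(s)
        }
    }
}

[Registered via tableEnv.registerFunction("refLookup", new RefLookup()), it can then be used in SQL along the lines of: SELECT s.id, T.refValue FROM myStream AS s, LATERAL TABLE(refLookup(s.id)) AS T(refValue); the table and column names are illustrative.]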

On Fri, 20 Jul 2018 at 11:59, vino yang  wrote:

> Hi Porritt,
>
> Flink does not support joining streaming and batch currently; streaming and
> batch jobs are both independent.
>
> I guess your use case is a stream-to-dimension-table join?
> Unfortunately, it is not possible for the Flink SQL API to join a stream
> with a common DataSet now.
>
> 1)
> As a workaround, if the table is just a tiny one, you can achieve an
> inner/left outer join with user-defined table functions:
>
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/sql.html#joins
>
> 2)
> I did not see any plan about this.
>
> Thanks, vino.
>
>
> 2018-07-20 17:29 GMT+08:00 Porritt, James :
>
>> I was hoping to join a StreamTableSource to a BatchTableSource, but I
>> find it’s not simple. A couple of questions:
>>
>>
>>
>> 1)  Other than just pushing the DataSet to a Kafka topic (either
>> internally or externally to the application) and reading it into a
>> DataStream are there any means of doing the conversion?
>>
>> 2)  Are there any plans to get OrcTableSource to be both
>> StreamTableSource and BatchTableSource instead of just a BatchTableSource?
>>
>>
>>
>> Thanks,
>>
>> James.
>>
>>
>


Re: Joining streamed data to reference data

2018-07-20 Thread vino yang
Hi Porritt,

Flink does not support joining streaming and batch currently; streaming and
batch jobs are both independent.

I guess your use case is a stream-to-dimension-table join? Unfortunately,
it is not possible for the Flink SQL API to join a stream with a common
DataSet now.

1)
As a workaround, if the table is just a tiny one, you can achieve an
inner/left outer join with user-defined table functions:

https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/sql.html#joins

2)
I did not see any plan about this.

Thanks, vino.


2018-07-20 17:29 GMT+08:00 Porritt, James :

> I was hoping to join a StreamTableSource to a BatchTableSource, but I find
> it’s not simple. A couple of questions:
>
>
>
> 1)  Other than just pushing the DataSet to a Kafka topic (either
> internally or externally to the application) and reading it into a
> DataStream are there any means of doing the conversion?
>
> 2)  Are there any plans to get OrcTableSource to be both
> StreamTableSource and BatchTableSource instead of just a BatchTableSource?
>
>
>
> Thanks,
>
> James.
>
>


Re: org.apache.flink.streaming.runtime.streamrecord.LatencyMarker cannot be cast to org.apache.flink.streaming.runtime.streamrecord.StreamRecord

2018-07-20 Thread Dawid Wysakowicz
Hi Gregory,
I think it is some flink bug. Could you file a JIRA for it? Also which
version of flink are you using?
Best,
Dawid

On Fri, 20 Jul 2018 at 04:34, vino yang  wrote:

> Hi Gregory,
>
> This exception seems a bug, you can create a issues in the JIRA.
>
> Thanks, vino.
>
> 2018-07-20 10:28 GMT+08:00 Philip Doctor :
>
>> Oh you were asking about the cast exception, I haven't seen that before,
>> sorry to be off topic.
>>
>>
>>
>>
>> --
>> *From:* Philip Doctor 
>> *Sent:* Thursday, July 19, 2018 9:27:15 PM
>> *To:* Gregory Fee; user
>> *Subject:* Re:
>> org.apache.flink.streaming.runtime.streamrecord.LatencyMarker cannot be
>> cast to org.apache.flink.streaming.runtime.streamrecord.StreamRecord
>>
>>
>>
>> I'm just a Flink user, not an expert. I've seen that exception before.
>> I have never seen it be the actual error; I usually see it when some other
>> operator is throwing an uncaught exception and busy dying. It seems to me
>> that the prior operator throws this "Could not forward element to next
>> operator" error because the next operator is already dead while the job is
>> dying asynchronously, so you get a cloud of errors that surround the root
>> cause. I'd read your logs a little further back.
>> --
>> *From:* Gregory Fee 
>> *Sent:* Thursday, July 19, 2018 9:10:37 PM
>> *To:* user
>> *Subject:* org.apache.flink.streaming.runtime.streamrecord.LatencyMarker
>> cannot be cast to
>> org.apache.flink.streaming.runtime.streamrecord.StreamRecord
>>
>> Hello, I have a job running and I've gotten this error a few times. The
>> job recovers from a checkpoint and seems to continue forward fine. Then the
>> error will happen again sometime later, perhaps 1 hour. This looks like a
>> Flink bug to me but I could use an expert opinion. Thanks!
>>
>> org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException:
>> Could not forward element to next operator
>>
>> at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:566)
>>
>> at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:524)
>>
>> at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:504)
>>
>> at
>> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:830)
>>
>> at
>> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:808)
>>
>> at
>> com.lyft.streamingplatform.BetterBuffer$OutputThread.run(BetterBuffer.java:316)
>>
>> Caused by:
>> org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException:
>> Could not forward element to next operator
>>
>> at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:566)
>>
>> at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:524)
>>
>> at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:504)
>>
>> at
>> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:830)
>>
>> at
>> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:808)
>>
>> at
>> org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
>>
>> at
>> com.lyft.dsp.functions.Labeler$UnlabelerFunction.processElement(Labeler.java:67)
>>
>> at
>> com.lyft.dsp.functions.Labeler$UnlabelerFunction.processElement(Labeler.java:48)
>>
>> at
>> org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
>>
>> at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:549)
>>
>> ... 5 more
>>
>> Caused by:
>> org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException:
>> Could not forward element to next operator
>>
>> at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:566)
>>
>> at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:524)
>>
>> at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:504)
>>
>> at
>> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:830)
>>
>> at
>> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:808)
>>
>> at
>> org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
>>
>> at
>> 

Joining streamed data to reference data

2018-07-20 Thread Porritt, James
I was hoping to join a StreamTableSource to a BatchTableSource, but I find it's 
not simple. A couple of questions:


1)  Other than pushing the DataSet to a Kafka topic (either internal or 
external to the application) and reading it back into a DataStream, is there 
any other means of doing the conversion?

2)  Are there any plans to make OrcTableSource both a StreamTableSource 
and a BatchTableSource instead of just a BatchTableSource?

Thanks,
James.
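For question 1, one workaround that avoids the Kafka round trip, assuming
the reference data fits in memory: load the batch side once per parallel
task in open() of a rich function and do the join as a per-record lookup.
A sketch follows; loadReferenceData() stands in for however the batch
output is read, e.g. files on a shared filesystem.

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;

public class ReferenceJoin
        extends RichMapFunction<Tuple2<String, Double>, Tuple2<String, String>> {

    private transient Map<String, String> reference;

    @Override
    public void open(Configuration parameters) {
        // loaded once per parallel task, before any element is processed
        reference = loadReferenceData();
    }

    @Override
    public Tuple2<String, String> map(Tuple2<String, Double> event) {
        // unmatched keys fall through as "unknown" instead of failing the job
        return Tuple2.of(event.f0, reference.getOrDefault(event.f0, "unknown"));
    }

    private static Map<String, String> loadReferenceData() {
        Map<String, String> ref = new HashMap<>();
        ref.put("AAPL", "Tech"); // stand-in for the real batch source
        return ref;
    }
}

The obvious caveats: the reference data must fit in memory, and updates
require a restart. For larger or changing reference sets, the Kafka route
or connected streams remain the fallback.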


Re: Flink 1.4.2 -> 1.5.0 upgrade Queryable State debugging

2018-07-20 Thread Dawid Wysakowicz
Hi Philip,
Could you attach the full stack trace? Are you querying the same
job/cluster in both tests?
I am also looping in Kostas, who might know more about the changes in
Queryable State between 1.4.2 and 1.5.0.
Best,
Dawid
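P.S. One hedged observation in the meantime: UnknownLocationException is
thrown by the client proxy on the TaskManager side, which suggests the
remote request does reach a proxy, but that proxy cannot reach the
JobManager to resolve where the state lives. Two 1.5-specific things worth
ruling out on the deployed cluster: that flink-queryable-state-runtime was
copied from opt/ to lib/ (the feature is opt-in since 1.5), and that the
client targets a TaskManager proxy port rather than the JobManager. A
minimal construction sketch ("tm-host" is a placeholder; 9069 is the
documented default of queryable-state.proxy.ports):

import org.apache.flink.queryablestate.client.QueryableStateClient;

public class QsClientCheck {
    public static void main(String[] args) throws Exception {
        // the client talks to a TaskManager's client proxy, not the
        // JobManager; "tm-host" stands in for a TaskManager hostname
        QueryableStateClient client = new QueryableStateClient("tm-host", 9069);
        System.out.println("Client created: " + client);
        // queries then go through getKvState exactly as in the snippet below
    }
}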

On Thu, 19 Jul 2018 at 22:33, Philip Doctor 
wrote:

> Dear Flink Users,
> I'm trying to upgrade to flink 1.5.0, so far everything works except for
> the Queryable state client.  Now here's where it gets weird.  I have the
> client sitting behind a web API so the rest of our non-java ecosystem can
> consume it.  I've got 2 tests, one calls my route directly as a java method
> call, the other calls the deployed server via HTTP (the difference in the
> test intending to flex if the server is properly started, etc).
>
> The local call succeeds and the remote call fails, but the failure isn't
> what I'd expect (something around the server layer); it's complaining
> about contacting the oracle for the state location:
>
> 
>  Caused by:
> org.apache.flink.queryablestate.exceptions.UnknownLocationException: Could
> not contact the state location oracle to retrieve the state location.\n\tat
> org.apache.flink.queryablestate.client.proxy.KvStateClientProxyHandler.getKvStateLookupInfo(KvStateClientProxyHandler.java:228)\n
> 
>
> The client call is pretty straightforward:
>
> return client.getKvState(
> flinkJobID,
> stateDescriptor.queryableStateName,
> key,
> keyTypeInformation,
> stateDescriptor
> )
>
> I've confirmed via logs that I have the exact same key, Flink job ID, and
> queryable state name.
>
> So I'm going bonkers over what difference might exist. I'm wondering if
> I'm packing my jar wrong and there are some resources I need to look out
> for (back on Flink 1.3.x I had to handle the reference.conf file for Akka
> when Queryable State depended on it; is there something like that?). Is
> there more logging somewhere on the server side that might give me a
> hint, like "Tried to query state for X but couldn't find the
> BananaLever"? I'm pretty stuck right now and ready to try any random
> ideas to move forward.
>
>
> The only change I've made aside from the version bump from 1.4.2 to 1.5.0
> is handling queryableStateName (which is now nullable); no other code
> changes were required. To confirm, this all works just fine with 1.4.2.
>
>
> Thank you.
>


Re: ProcessFunction example from the documentation giving me error

2018-07-20 Thread vino yang
Hi anna,

Can you share your program, the full exception stack trace, and more
details about your source and state backend?

From the information you provided, it seems Flink opened a network
connection that timed out.

Thanks, vino.
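One more thing worth ruling out, as a guess: a ConnectException at
execute() time usually means the client could not reach the cluster it
tried to submit the job to (or a socket source could not connect), rather
than anything in the timer logic itself. If you run the docs example from
the IDE, a bounded local job takes the network out of the picture. A
minimal sketch, assuming CountWithTimeoutFunction is the class copied from
that documentation page:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor;

public class ProcessFunctionSmokeTest {
    public static void main(String[] args) throws Exception {
        // getExecutionEnvironment() picks a local environment inside the
        // IDE, so nothing here opens a remote connection
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        env.fromElements(Tuple2.of("key-a", "1"), Tuple2.of("key-b", "2"))
                // the example reads ctx.timestamp(), so assign event times
                .assignTimestampsAndWatermarks(
                        new AscendingTimestampExtractor<Tuple2<String, String>>() {
                            @Override
                            public long extractAscendingTimestamp(
                                    Tuple2<String, String> element) {
                                return 1_000L; // any fixed event time works here
                            }
                        })
                .keyBy(0)
                .process(new CountWithTimeoutFunction()) // from the docs page
                .print();

        // the final watermark of the bounded source fires the pending
        // event-time timers, so the job finishes on its own
        env.execute("CountWithTimeoutFunction smoke test");
    }
}

If this local version runs cleanly, the timeout is about how the job is
submitted or sourced, not about the timers.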

2018-07-20 14:14 GMT+08:00 anna stax :

> Hi all,
>
> I am new to Flink. I am using the classes CountWithTimestamp and
> CountWithTimeoutFunction from the examples found in
> https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/operators/process_function.html
>
> I am getting the error Exception in thread "main" 
> org.apache.flink.runtime.client.JobExecutionException:
> java.net.ConnectException: Operation timed out (Connection timed out)
>
> It looks like I get this error when the timer's time is reached. Any idea
> why? Please help.
>
> Thanks
>


Re: Is KeyedProcessFunction available in Flink 1.4?

2018-07-20 Thread anna stax
Thanks, Bowen.

On Thu, Jul 19, 2018 at 4:45 PM, Bowen Li  wrote:

> Hi Anna,
>
> KeyedProcessFunction is only available starting from Flink 1.5; the doc
> is here.
> It extends ProcessFunction and shares the same functionality, except that
> it gives more access to the timer's key, so you can refer to the
> ProcessFunction examples in that document.
>
> Bowen
>
>
> On Thu, Jul 19, 2018 at 3:26 PM anna stax  wrote:
>
>> Hello all,
>> I am using Flink 1.4 because that's the version provided by the latest
>> AWS EMR.
>> Is KeyedProcessFunction available in Flink 1.4?
>>
>> Also, please share any links to good examples of using
>> KeyedProcessFunction.
>>
>> Thanks
>>
>
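For anyone landing on this thread later, a minimal 1.5 sketch of the extra
access Bowen describes: onTimer() can read the firing key directly via
OnTimerContext#getCurrentKey(), which plain ProcessFunction does not expose
(on 1.4 the usual workaround is stashing the key in keyed state from
processElement()). KeyTaggingFunction is an illustrative name:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class KeyTaggingFunction
        extends KeyedProcessFunction<String, Tuple2<String, Long>, String> {

    @Override
    public void processElement(Tuple2<String, Long> value, Context ctx,
                               Collector<String> out) {
        // fire one minute after each element, in processing time
        ctx.timerService().registerProcessingTimeTimer(
                ctx.timerService().currentProcessingTime() + 60_000L);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx,
                        Collector<String> out) {
        // the key is handed to the callback directly, no state needed
        out.collect("timer fired for key " + ctx.getCurrentKey());
    }
}

It is applied like any other process function, e.g.
stream.keyBy(t -> t.f0).process(new KeyTaggingFunction()).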


ProcessFunction example from the documentation giving me error

2018-07-20 Thread anna stax
Hi all,

I am new to Flink. I am using the classes CountWithTimestamp and
CountWithTimeoutFunction from the examples found in
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/operators/process_function.html

I am getting the error Exception in thread "main"
org.apache.flink.runtime.client.JobExecutionException:
java.net.ConnectException: Operation timed out (Connection timed out)

It looks like I get this error when the timer's time is reached. Any idea
why? Please help.

Thanks