Re: could I chain two timed windows?

2017-12-11 Thread Fabian Hueske
Hi,

sliding windows replicate their records for each window.
If you use an incrementally aggregating function (ReduceFunction,
AggregateFunction) with a sliding window, the space requirement should not be an
issue because each window stores a single value.
However, this also means that each window performs its aggregations
independently from the others. So, if you have many concurrent sliding windows,
pre-aggregating the records in a tumbling window can reduce the computational
effort.
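
As a rough sketch of that two-stage approach (the names, key field, and sample data here are just assumptions for illustration, not your job; for brevity it uses processing time):

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class ChainedWindowsSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime);

        // Hypothetical source: one (key, 1L) record per incoming event.
        DataStream<Tuple2<String, Long>> events =
                env.fromElements(Tuple2.of("a", 1L), Tuple2.of("b", 1L), Tuple2.of("a", 1L));

        // Stage 1: tumbling 1-minute pre-aggregation, one partial count per key and minute.
        DataStream<Tuple2<String, Long>> perMinute = events
                .keyBy(0)
                .timeWindow(Time.minutes(1))
                .sum(1);

        // Stage 2: the sliding 1-hour window only combines ~60 partial counts
        // per key instead of buffering every raw event.
        perMinute
                .keyBy(0)
                .timeWindow(Time.hours(1), Time.minutes(1))
                .sum(1)
                .print();

        env.execute("chained windows sketch");
    }
}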

Best, Fabian



2017-12-12 8:10 GMT+01:00 Jinhua Luo :

> Hi All,
>
> Given one stream source which generates 20k events/sec, and I need to
> aggregate the element count using sliding window of 1 hour size.
>
> The problem is, the window may buffer too many elements (which may
> cause a lot of block I/O because of checkpointing?), and in fact it
> does not necessary to store them for one hour, because the elements
> should get folded incrementally. But unlike Tumbling Window, the
> sliding window would save elements for next window, right?
>
> So I am considering kind of workaround, should I chain two window like
> below:
>
> .timeWindow(Time.minutes(1))
> ...
> .timeWindow(Time.hours(1), Time.minutes(1))
>
> Here the first window generate 1 minute aggregation units and the
> second window provides the sliding output.
>
> Any suggestions? Thanks.
>


Re: when does the timed window end?

2017-12-11 Thread Fabian Hueske
Hi,

this depends on the window type. Tumbling and Sliding Windows are (by
default) aligned with the epoch time (1970-01-01 00:00:00).
For example, a tumbling window of 2 hours starts and ends every two hours,
i.e., from 12:00:00 to 13:59:59.999, from 14:00:00 to 15:59:59.999, etc.

The documentation says a window is created when an element arrives. This
does not imply that the start time of the window is the time of the first
element.
So it might happen that the first element of a 2 hour tumbling window
arrives at 13:59:59.000 and the window is closed 1 second later.

However, there are also windows for which the first element defines the
start time such as the built-in session window.
You can also define custom windows like that.
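
If it helps, the alignment arithmetic can be written down directly (a plain sketch, not Flink API; timestamps are UTC epoch millis and the window offset is assumed to be 0):

public class WindowAlignmentSketch {
    // Start of the epoch-aligned window that contains the timestamp.
    static long windowStart(long timestampMillis, long windowSizeMillis) {
        return timestampMillis - (timestampMillis % windowSizeMillis);
    }

    public static void main(String[] args) {
        long twoHours = 2 * 60 * 60 * 1000L;
        long element = 50399000L;                     // 13:59:59.000 on 1970-01-01
        long start = windowStart(element, twoHours);  // 43200000 -> 12:00:00.000
        long end = start + twoHours;                  // 50400000 -> 14:00:00.000
        System.out.println(start + " .. " + end);     // the window closes 1 second after the element
    }
}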

Best, Fabian

2017-12-12 7:57 GMT+01:00 Jinhua Luo :

> Hi All,
>
> The document said "a window is created as soon as the first element
> that should belong to this window arrives, and the window is
> completely removed when the time (event or processing time) passes its
> end timestamp plus the user-specified allowed lateness (see Allowed
> Lateness).".
>
> I am still confused.
>
> If the window contains only one element (which triggers the window
> creation), and no more elements come in during the window size (e.g. 1
> minute), then when does the window function get invoked? after 1
> minute?
>
> I mean, the window would finish either when any element indicates the
> watermark is larger than the window size, or, when the processing time
> (no matter for event-timed window or process-timed window) pass over
> the window size since the first element?
>


Re: How to deal with dynamic types

2017-12-11 Thread Piotr Nowojski
Hi,

For a truly dynamic class you would need a custom TypeInformation or 
DeserializationSchema and store the fields in some kind of Map. Maybe something could be done with inheritance if records that always 
share the same fields could be deserialized to some specific class with 
fixed/predefined fields.

However, in your case it seems like you can ignore all of the dynamic fields 
and just implement a deserializer that skips/ignores all of the fields except 
Dept and Salary. It could produce a simple POJO with those two fields or even a 
Tuple2. If those fields are missing, set them to null and 
discard/filter out the record, since you will not be able to use it for 
calculating your average anyway.
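
A minimal sketch of that idea with the DataSet API (the column positions, sample rows, and class names are just assumptions to make it self-contained):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;

public class AvgSalaryByDeptSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Positions of Salary and Dept, derived from the metadata at runtime.
        final int salaryIdx = 2;
        final int deptIdx = 3;

        DataSet<String> lines = env.fromElements("1,Bob,1000.0,IT", "2,Ann,2000.0,IT", "3,Joe,900.0,HR");

        // Keep only (dept, salary, 1) and ignore every other field.
        DataSet<Tuple3<String, Double, Long>> deptSalary = lines
                .map(new MapFunction<String, Tuple3<String, Double, Long>>() {
                    @Override
                    public Tuple3<String, Double, Long> map(String line) {
                        String[] fields = line.split(",");
                        return Tuple3.of(fields[deptIdx], Double.parseDouble(fields[salaryIdx]), 1L);
                    }
                });

        // Sum salaries and counts per department, then divide for the average.
        deptSalary.groupBy(0).sum(1).andSum(2)
                .map(new MapFunction<Tuple3<String, Double, Long>, Tuple2<String, Double>>() {
                    @Override
                    public Tuple2<String, Double> map(Tuple3<String, Double, Long> t) {
                        return Tuple2.of(t.f0, t.f1 / t.f2);
                    }
                })
                .print();
    }
}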

Piotrek

> On 11 Dec 2017, at 16:13, madan  wrote:
> 
> Hi,
> 
> I am trying some initial samples with flink. I have one doubt regarding data 
> types. Flink support data types Tuple(max 25 fields), Java POJOs, Primitive 
> types, Regular classes etc., 
> In my case I do not have fixed type. I have meta data with filed names & its 
> types. For ex., (Id:int, Name:String, Salary:Double, Dept:String,... etc). I 
> do not know the number of fields, its names or types till I receive metadata. 
> In these what should be the source type I should go with? Please suggest. 
> Small example would be of great help.
> 
> 
> Scenario trying to solve :
> 
> Input :
> Metadata : {"id":"int", 
> "Name":"String","Salary":"Double","Dept":"String"}
> Data file :   csv data file with above fields data 
> 
> Output required is : Calculate average of salary by department wise.
> 
>
> -- 
> Thank you,
> Madan.



Re: Custom Metrics

2017-12-11 Thread Piotr Nowojski
Hi,

Reporting once per 10 seconds shouldn’t create problems. Best to try it out. 
Let us know if you run into any trouble :)

Piotrek

> On 11 Dec 2017, at 18:23, Navneeth Krishnan  wrote:
> 
> Thanks Piotr. 
> 
> Yes, passing the metric group should be sufficient. The subcomponents will 
> not be able to provide the list of metrics to register since the metrics are 
> created based on incoming data by tenant. Also I am planning to have the 
> metrics reported every 10 seconds and hope it shouldn't be a problem. We use 
> influx and grafana to plot the metrics.
> 
> The option 2 that I had in mind was to collect all metrics and use influx db 
> sink to report it directly inside the pipeline. But it seems reporting per 
> node might not be possible.
> 
> 
> On Mon, Dec 11, 2017 at 3:14 AM, Piotr Nowojski  > wrote:
> Hi,
> 
> I’m not sure if I completely understand your issue.
> 
> 1.
> - You don’t have to pass RuntimeContext, you can always pass just the 
> MetricGroup or ask your components/subclasses “what metrics do you want to 
> register” and register them at the top level.
> - Reporting tens/hundreds/thousands of metrics shouldn’t be an issue for 
> Flink, as long as you have a reasonable reporting interval. However keep in 
> mind that Flink only reports your metrics and you still need something to 
> read/handle/process/aggregate your metrics
> 2.
> I don’t think that reporting per node/jvm is possible with Flink’s metric 
> system. For that you would need some other solution, like report your metrics 
> using JMX (directly register MBeans from your code)
> 
> Piotrek
> 
> > On 10 Dec 2017, at 18:51, Navneeth Krishnan  > > wrote:
> >
> > Hi,
> >
> > I have a streaming pipeline running on flink and I need to collect metrics 
> > to identify how my algorithm is performing. The entire pipeline is 
> > multi-tenanted and I also need metrics per tenant. Lets say there would be 
> > around 20 metrics to be captured per tenant. I have the following ideas for 
> > implemention but any suggestions on which one might be better will help.
> >
> > 1. Use flink metric group and register a group per tenant at the operator 
> > level. The disadvantage of this approach for me is I need the 
> > runtimecontext parameter to register a metric and I have various subclasses 
> > to which I need to pass this object to limit the metric scope within the 
> > operator. Also there will be too many metrics reported if there are higher 
> > number of subtasks.
> > How is everyone accessing flink state/ metrics from other classes where you 
> > don't have access to runtimecontext?
> >
> > 2. Use a custom singleton metric registry to add and send these metrics 
> > using custom sink. Instead of using flink metric group to collect metrics 
> > per operatior - subtask, collect per jvm and use influx sink to send the 
> > metric data. What i'm not sure in this case is how to collect only once per 
> > node/jvm.
> >
> > Thanks a bunch in advance.
> 
> 



could I chain two timed windows?

2017-12-11 Thread Jinhua Luo
Hi All,

Given one stream source which generates 20k events/sec, and I need to
aggregate the element count using sliding window of 1 hour size.

The problem is, the window may buffer too many elements (which may
cause a lot of block I/O because of checkpointing?), and in fact it
is not necessary to store them for one hour, because the elements
should get folded incrementally. But unlike a tumbling window, the
sliding window would save elements for the next window, right?

So I am considering a kind of workaround: should I chain two windows like below?

.timeWindow(Time.minutes(1))
...
.timeWindow(Time.hours(1), Time.minutes(1))

Here the first window generates 1-minute aggregation units and the
second window provides the sliding output.

Any suggestions? Thanks.


RE: ProgramInvocationException: Could not upload the jar files to the job manager / No space left on device

2017-12-11 Thread Chan, Regina
And if it helps, I'm running on flink 1.2.1. I saw this ticket: 
https://issues.apache.org/jira/browse/FLINK-5828 It only started happening when 
I was running all 50 flows at the same time. However, it looks like it's not an 
issue with creating the cache directory but with running out of space there? 
But what's in there is also tiny.

bash-4.1$ hdfs dfs -du -h 
hdfs://d191291/user/delp/.flink/application_1510733430616_2098853
1.1 K
hdfs://d191291/user/delp/.flink/application_1510733430616_2098853/5c71e4b6-2567-4d34-98dc-73b29c502736-taskmanager-conf.yaml
1.4 K
hdfs://d191291/user/delp/.flink/application_1510733430616_2098853/flink-conf.yaml
93.5 M   
hdfs://d191291/user/delp/.flink/application_1510733430616_2098853/flink-dist_2.10-1.2.1.jar
264.8 M  hdfs://d191291/user/delp/.flink/application_1510733430616_2098853/lib
1.9 K
hdfs://d191291/user/delp/.flink/application_1510733430616_2098853/log4j.properties


From: Chan, Regina [Tech]
Sent: Tuesday, December 12, 2017 1:56 AM
To: 'user@flink.apache.org'
Subject: ProgramInvocationException: Could not upload the jar files to the job 
manager / No space left on device

Hi,

I'm currently submitting 50 separate jobs to a 50TM, 1 slot set up. Each job 
has 1 parallelism. There's plenty of space left in my cluster and on that node. 
It's not clear to me what's happening. Any pointers?

On the client side, when I try to execute, I see the following:
org.apache.flink.client.program.ProgramInvocationException: The program 
execution failed: Could not upload the jar files to the job manager.
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
at 
org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:101)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:400)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:387)
at 
org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
at 
org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
at 
com.gs.ep.da.lake.refinerlib.flink.FlowData.execute(FlowData.java:143)
at 
com.gs.ep.da.lake.refinerlib.flink.FlowData.flowPartialIngestionHalf(FlowData.java:107)
at com.gs.ep.da.lake.refinerlib.flink.FlowData.call(FlowData.java:72)
at com.gs.ep.da.lake.refinerlib.flink.FlowData.call(FlowData.java:39)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not 
upload the jar files to the job manager.
at 
org.apache.flink.runtime.client.JobSubmissionClientActor$1.call(JobSubmissionClientActor.java:150)
at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:95)
at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.io.IOException: Could not retrieve the JobManager's blob port.
at 
org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:745)
at 
org.apache.flink.runtime.jobgraph.JobGraph.uploadUserJars(JobGraph.java:565)
at 
org.apache.flink.runtime.client.JobSubmissionClientActor$1.call(JobSubmissionClientActor.java:148)
... 9 more
Caused by: java.io.IOException: PUT operation failed: Connection reset
at 
org.apache.flink.runtime.blob.BlobClient.putInputStream(BlobClient.java:512)
at org.apache.flink.runtime.blob.BlobClient.put(BlobClient.java:374)
at 
org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:771)
at 
org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:740)
... 11 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118)
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at 
org.ap

when does the timed window end?

2017-12-11 Thread Jinhua Luo
Hi All,

The document said "a window is created as soon as the first element
that should belong to this window arrives, and the window is
completely removed when the time (event or processing time) passes its
end timestamp plus the user-specified allowed lateness (see Allowed
Lateness).".

I am still confused.

If the window contains only one element (which triggers the window
creation), and no more elements come in during the window size (e.g. 1
minute), then when does the window function get invoked? after 1
minute?

I mean, would the window finish either when an element indicates that the
watermark has passed the window end, or when the processing time
(no matter whether it is an event-timed or processing-timed window) passes
the window size since the first element?


ProgramInvocationException: Could not upload the jar files to the job manager / No space left on device

2017-12-11 Thread Chan, Regina
Hi,

I'm currently submitting 50 separate jobs to a 50TM, 1 slot set up. Each job 
has 1 parallelism. There's plenty of space left in my cluster and on that node. 
It's not clear to me what's happening. Any pointers?

On the client side, when I try to execute, I see the following:
org.apache.flink.client.program.ProgramInvocationException: The program 
execution failed: Could not upload the jar files to the job manager.
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
at 
org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:101)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:400)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:387)
at 
org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
at 
org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
at 
com.gs.ep.da.lake.refinerlib.flink.FlowData.execute(FlowData.java:143)
at 
com.gs.ep.da.lake.refinerlib.flink.FlowData.flowPartialIngestionHalf(FlowData.java:107)
at com.gs.ep.da.lake.refinerlib.flink.FlowData.call(FlowData.java:72)
at com.gs.ep.da.lake.refinerlib.flink.FlowData.call(FlowData.java:39)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not 
upload the jar files to the job manager.
at 
org.apache.flink.runtime.client.JobSubmissionClientActor$1.call(JobSubmissionClientActor.java:150)
at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:95)
at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.io.IOException: Could not retrieve the JobManager's blob port.
at 
org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:745)
at 
org.apache.flink.runtime.jobgraph.JobGraph.uploadUserJars(JobGraph.java:565)
at 
org.apache.flink.runtime.client.JobSubmissionClientActor$1.call(JobSubmissionClientActor.java:148)
... 9 more
Caused by: java.io.IOException: PUT operation failed: Connection reset
at 
org.apache.flink.runtime.blob.BlobClient.putInputStream(BlobClient.java:512)
at org.apache.flink.runtime.blob.BlobClient.put(BlobClient.java:374)
at 
org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:771)
at 
org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:740)
... 11 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118)
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at 
org.apache.flink.runtime.blob.BlobClient.putInputStream(BlobClient.java:499)
... 14 more


On the job manager logs I see this:

2017-12-12 01:42:47,608 ERROR 
org.apache.flink.runtime.blob.BlobServerConnection- PUT operation 
failed
java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:345)
at 
org.apache.flink.runtime.blob.BlobServerConnection.put(BlobServerConnection.java:314)
at 
org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:113)
2017-12-12 01:42:47,608 ERROR 
org.apache.flink.runtime.blob.BlobServerConnection- PUT operation 
failed
java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:345)
at 
org.apache.flink.runtime.blob.BlobServerConnection.put(BlobServerConnection.java:314)
at 
org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:113)
2017-12-12 01:42:47,608 ERROR 
org.apa

Re: The timing operation is similar to storm’s tick

2017-12-11 Thread Marvin777
thanks.



2017-12-11 17:51 GMT+08:00 Fabian Hueske :

> Hi,
>
> I think you are looking for a ProcessFunction with timers [1].
>
> Best,
> Fabian
>
> [1] https://ci.apache.org/projects/flink/flink-docs-
> release-1.3/dev/stream/process_function.html
>
> 2017-12-11 9:03 GMT+01:00 Marvin777 :
>
>> hi,
>>
>> I'm new to apache Flink. I want to update the property value per minute
>> via an HTTP request.I did this in storm using tick tuple. Does Flink have
>> something similar which i can use in a flink operator?
>>
>> thanks.
>>
>
>


Re: save points through REST API not supported ?

2017-12-11 Thread Vishal Santoshi
One last question...

Can you confirm the "This will be available in 1.5 where we rework the
client-cluster communication to go entirely through the REST API." comment,
ches...@apache.org?

On Mon, Dec 11, 2017 at 4:54 AM, Chesnay Schepler 
wrote:

> This doesn't sound like proper behavior, could you provide instructions on
> how to reproduce this?
>
>
> On 07.12.2017 13:42, Lasse Nedergaard wrote:
>
> I hope it can be put in 1.4.1.
>
> I have one concern about the REST API. We are running 1.3.1 on DC/OS and if we
> apply parameters as arguments and our code validates these parameters and
> throws an exception during startup if something is wrong, we see that all
> uploaded jars disappear, all running jobs are gone, and sometimes we lose all
> the task managers. I don’t think this is the intended behaviour.
>
> Med venlig hilsen / Best regards
> Lasse Nedergaard
>
>
> On 7 Dec 2017, at 13.00, Chesnay Schepler wrote:
>
> In retrospect I'm quite frustrated we didn't get around to implementing
> this for 1.4.
> The least-effort implementation would have required copying one class, and
> modifying ~10 lines.
>
> Doesn't get any more low-hanging than that.
>
> On 07.12.2017 12:51, Chesnay Schepler wrote:
>
> I've finished the implementation. I created 2 new branches in my
> repository, for 1.3.2 and 1.4.
>
> 1.3.2: https://github.com/zentol/flink/tree/release-1.3.2-custom
> 1.4: https://github.com/zentol/flink/tree/release-1.4-custom
>
> Triggering savepoints works exactly like cancel-with-savepoint, just
> replace "cancel-with-savepoint" with "trigger-savepoint" in the URL.
>
>- "/jobs/:jobid/trigger-savepoint"
>- "/jobs/:jobid/trigger-savepoint/target-directory/:targetDirectory"
>- "/jobs/:jobid/trigger-savepoint/in-progress/:requestId"
>
>
> On 07.12.2017 03:43, Vishal Santoshi wrote:
>
> That would be awesome... we could do it too if that is acceptable...
>
> On Dec 6, 2017 3:56 PM, "Chesnay Schepler"  wrote:
>
>> No, this is also not possible in 1.4.
>>
>> This will be available in 1.5 where we rework the client-cluster
>> communication to go entirely through the REST API.
>> This will mean that everything you can do with the command-line client
>> can also be achieved directly through the REST API.
>>
>> However, from what i can tell this should be straight-forward to
>> implement in a custom build. I can take a deeper look if this would be an
>> acceptable workaround for you.
>>
>> On 07.12.2017 00:20, Vishal Santoshi wrote:
>>
>> Is that something 1.4 has?
>>
>> On Dec 6, 2017 1:01 PM, "Lasse Nedergaard" 
>> wrote:
>>
>>> Hi.
>>>
>>> It is not possible through REST in Flink 1.3.2 I’m looking for the
>>> feature. The only option is to use ./Flink savepoint for now
>>>
>>> Med venlig hilsen / Best regards
>>> Lasse Nedergaard
>>>
>>>
>>> On 6 Dec 2017, at 21.52, Vishal Santoshi <
>>> vishal.santo...@gmail.com>:
>>>
>>> I was more interested in savepoints WITHOUT cancellation...
>>>
>>> On Dec 6, 2017 11:07 AM, "vipul singh"  wrote:
>>>
 Hi Vishal,

 Job cancellations can be done via a REST API: https://ci.apache.org/pro
 jects/flink/flink-docs-release-1.3/monitoring/rest_api.html#
 cancel-job-with-savepoint

 Thanks,
 Vipul

 On Wed, Dec 6, 2017 at 10:56 AM, Vishal Santoshi <
 vishal.santo...@gmail.com> wrote:

> One can submit jobs, upload jars, kill jobs etc very strange that you
> can’t do a save point ?
>
> Or am I missing something obvious ?
>
> Vishal
>



 --
 Thanks,
 Vipul

>>>
>>
>
>
>


Re: Flink-Kafka connector - partition offsets for a given timestamp?

2017-12-11 Thread Tzu-Li (Gordon) Tai
Hi Connie,

We do have a pull request for the feature, that should almost be ready
after rebasing: https://github.com/apache/flink/pull/3915, JIRA:
https://issues.apache.org/jira/browse/FLINK-6352.
This means, of course, that the feature isn't part of any release yet. We
can try to make sure this happens for Flink 1.5, for which the proposed
release date is around February 2018.
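
Until then, one possible workaround (just a sketch, assuming the Kafka 0.10+ client and Flink 1.3+; the class, property, and topic names are made up) is to resolve the offsets yourself with KafkaConsumer#offsetsForTimes and hand them to the connector via setStartFromSpecificOffsets, i.e. the "set of offsets" startup mode you linked:

import org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartition;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class StartOffsetsForTimestamp {
    // props must contain bootstrap.servers and key/value deserializers.
    public static Map<KafkaTopicPartition, Long> offsetsAt(Properties props, String topic, long timestampMs) {
        Map<KafkaTopicPartition, Long> result = new HashMap<>();
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            Map<TopicPartition, Long> request = new HashMap<>();
            consumer.partitionsFor(topic)
                    .forEach(p -> request.put(new TopicPartition(topic, p.partition()), timestampMs));
            for (Map.Entry<TopicPartition, OffsetAndTimestamp> e : consumer.offsetsForTimes(request).entrySet()) {
                if (e.getValue() != null) {  // null if there is no message at/after the timestamp
                    result.put(new KafkaTopicPartition(topic, e.getKey().partition()), e.getValue().offset());
                }
            }
        }
        return result;
    }
}

// usage sketch: kafkaConsumer.setStartFromSpecificOffsets(StartOffsetsForTimestamp.offsetsAt(props, "events", startTs));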

Cheers,
Gordon

On Tue, Dec 12, 2017 at 3:53 AM, Yang, Connie  wrote:

> Hi,
>
>
>
> Does Flink-Kafka connector allow job graph to consume topoics/partitions
> from a specific timestamp?
>
>
>
> https://github.com/apache/flink/blob/master/flink-
> connectors/flink-connector-kafka-base/src/main/java/org/
> apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java#L469
> seems to suggest that a job graph can only start from an earliest, latest
> or a set of offsets.
>
>
>
> KafkaConsumer API, https://github.com/apache/kafka/blob/trunk/clients/src/
> main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L1598,
> gives us a way to find partition offsets based on a timestamp.
>
>
>
> Thanks
>
> Connie
>
>


Flink-Kafka connector - partition offsets for a given timestamp?

2017-12-11 Thread Yang, Connie
Hi,

Does the Flink-Kafka connector allow a job graph to consume topics/partitions from a 
specific timestamp?

https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java#L469
 seems to suggest that a job graph can only start from an earliest, latest or a 
set of offsets.

KafkaConsumer API, 
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L1598,
 gives us a way to find partition offsets based on a timestamp.

Thanks
Connie


Re: Custom Metrics

2017-12-11 Thread Navneeth Krishnan
Thanks Piotr.

Yes, passing the metric group should be sufficient. The subcomponents will
not be able to provide the list of metrics to register since the metrics
are created based on incoming data by tenant. Also I am planning to have
the metrics reported every 10 seconds and hope it shouldn't be a problem.
We use influx and grafana to plot the metrics.

The option 2 that I had in mind was to collect all metrics and use influx
db sink to report it directly inside the pipeline. But it seems reporting
per node might not be possible.


On Mon, Dec 11, 2017 at 3:14 AM, Piotr Nowojski 
wrote:

> Hi,
>
> I’m not sure if I completely understand your issue.
>
> 1.
> - You don’t have to pass RuntimeContext, you can always pass just the
> MetricGroup or ask your components/subclasses “what metrics do you want to
> register” and register them at the top level.
> - Reporting tens/hundreds/thousands of metrics shouldn’t be an issue for
> Flink, as long as you have a reasonable reporting interval. However keep in
> mind that Flink only reports your metrics and you still need something to
> read/handle/process/aggregate your metrics
> 2.
> I don’t think that reporting per node/jvm is possible with Flink’s metric
> system. For that you would need some other solution, like report your
> metrics using JMX (directly register MBeans from your code)
>
> Piotrek
>
> > On 10 Dec 2017, at 18:51, Navneeth Krishnan 
> wrote:
> >
> > Hi,
> >
> > I have a streaming pipeline running on flink and I need to collect
> metrics to identify how my algorithm is performing. The entire pipeline is
> multi-tenanted and I also need metrics per tenant. Lets say there would be
> around 20 metrics to be captured per tenant. I have the following ideas for
> implemention but any suggestions on which one might be better will help.
> >
> > 1. Use flink metric group and register a group per tenant at the
> operator level. The disadvantage of this approach for me is I need the
> runtimecontext parameter to register a metric and I have various subclasses
> to which I need to pass this object to limit the metric scope within the
> operator. Also there will be too many metrics reported if there are higher
> number of subtasks.
> > How is everyone accessing flink state/ metrics from other classes where
> you don't have access to runtimecontext?
> >
> > 2. Use a custom singleton metric registry to add and send these metrics
> using custom sink. Instead of using flink metric group to collect metrics
> per operatior - subtask, collect per jvm and use influx sink to send the
> metric data. What i'm not sure in this case is how to collect only once per
> node/jvm.
> >
> > Thanks a bunch in advance.
>
>


Re: Hardware Reference Architecture

2017-12-11 Thread Kostas Kloudas
Hi Hayden,

This is a talk from Flink Forward that may be of help to you:
https://www.youtube.com/watch?v=8l8dCKMMWkw 


and here are the slides:
www.slideshare.net/FlinkForward/flink-forward-berlin-2017-robert-metzger-keep-it-going-how-to-reliably-and-efficiently-operate-apache-flink/3
 


Kostas

> On Dec 7, 2017, at 6:36 PM, Kostas Kloudas  
> wrote:
> 
> Hi Hayden,
> 
> It would be nice if you could share a bit more details about your use case 
> and the load that you expect to have,
> as this could allow us to have a better view of your needs.
> 
> As a general set of rules:
> 1) I would say that the bigger your cluster (in terms of resources, not 
> necessarily machines) the better.
> 2) the more the RAM per machine the better, as this will allow to fit more 
> things in memory without spilling to disk
> 3) in the dilemma between few powerful machines vs a lot of small ones, I 
> would go more towards the first, as this 
> allows for smaller network delays.
> 
> Once again, the above rules are just general recommendations and more details 
> about your workload will give us 
> more information to work with.
> 
> In the documentation here: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/yarn_setup.html#background--internals
>  
> 
> you can find some details about deployment, monitoring, etc.
> 
> I hope this helps,
> Kostas
> 
>> On Dec 7, 2017, at 1:53 PM, Marchant, Hayden > > wrote:
>> 
>> Hi,
>> 
>> I'm looking for guidelines for Reference architecture for Hardware for a 
>> small/medium Flink cluster - we'll be installing on in-house bare-metal 
>> servers. I'm looking for guidance for:
>> 
>> 1. Number and spec of  CPUs
>> 2. RAM
>> 3. Disks
>> 4. Network
>> 5. Proximity of servers to each other
>> 
>> (Most likely, we will choose YARN as a cluster manager for Flink)
>> 
>> If someone can share a document or link with relevant information, I will be 
>> very grateful.
>> 
>> Thanks,
>> Hayden Marchant
>> 
> 



How to deal with dynamic types

2017-12-11 Thread madan
Hi,

I am trying some initial samples with Flink. I have one doubt regarding
data types. Flink supports data types such as Tuples (max 25 fields), Java POJOs,
primitive types, regular classes, etc.
In my case I do not have a fixed type. I have metadata with field names and
their types, e.g., (Id:int, Name:String, Salary:Double, Dept:String,...
etc). I do not know the number of fields, their names, or their types until I receive
the metadata. In this case, which source type should I go with? Please
suggest. A small example would be of great help.


Scenario trying to solve :

Input :
Metadata : {"id":"int",
"Name":"String","Salary":"Double","Dept":"String"}
Data file :   csv data file with above fields data

Output required: calculate the average salary per department.


-- 
Thank you,
Madan.


Re: aggregate does not allow RichAggregateFunction ?

2017-12-11 Thread Vishal Santoshi
Perfect. In our use case, the Kafka partition key and the keyBy use the
exact same field, and thus the order will be preserved.

On Mon, Dec 11, 2017 at 4:34 AM, Fabian Hueske  wrote:

> Hi,
>
> the order or records that are sent from one task to another task is
> preserved (task refers to the parallel instance of an operator).
> However, a task that receives records from multiple input tasks, consumes
> records from its inputs in arbitrary order.
>
> If a job reads from a partitioned Kafka topic and does a keyBy on the
> partitioning key of the Kafka topic, an operator task that follows the
> keyBy consumes all records with the same key from exactly one input task
> (the one reading the Kafka partition for the key).
> However, since Flink's and Kafka's partitioning functions are not the
> same, records from the same partition with different keys can be sent to
> different tasks.
>
> So:
> 1) Records from the same partition might not be processed by the same
> operator (and hence not in order).
> 2) Records with the same key are processed by the same operator in the
> same order in which they were read from the partition.
>
> Best,
> Fabian
>
> 2017-12-09 18:09 GMT+01:00 Vishal Santoshi :
>
>> An additional question is that if the source is key partitioned  ( kafka
>> ) does a keyBy retain the order of  a kafka partirion across a shuffle ?
>>
>> On Fri, Dec 8, 2017 at 1:12 PM, Vishal Santoshi <
>> vishal.santo...@gmail.com> wrote:
>>
>>> I understand that. Let me elaborate. The sequence of events is
>>>
>>> 1. Round robin dispatch to kafka cluster  ( it is not partitioned on the
>>> key which we may ultimately do  and than I will have more questions on how
>>> to key y and still keep order, pbly avoid shuffle :) ) .
>>> 2. key by a high cardinality key
>>> 3. Sessionize
>>> 4. B'coz of the RR on kafka ( and even if partitioned on the key and a
>>> subsequent key by ), the sort order is not retained and the ACC has to hold
>>> on to the elements in a List . When the Window is finalized  we sort the in
>>> ACC  List  and do pagination, We are looking for paths within a session
>>> from . a source to a sink event based. I was hoping to use ROCKS DB state
>>> as a final merged list and thus off heap and  use a Count based Trigger to
>>> evaluate the ACC and merge the inter Trigger collection to the  master copy
>>> rather than keeping all events in the ACC ( I would imagine a very general
>>> pattern to use ).
>>>
>>> Does that make sense ?
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Dec 8, 2017 at 9:11 AM, Aljoscha Krettek 
>>> wrote:
>>>
 Hi,

 If you use an AggregatingFunction in this way (i.e. for a window) the
 ACC should in fact be kept in the state backend. Did you configure the job
 to use RocksDB? How are the memory problems manifesting?

 Best,
 Aljoscha


 On 6. Dec 2017, at 14:57, Fabian Hueske  wrote:

 Hi Vishal,

 you are right, it is not possible to use state in an AggregateFunction
 because windows need to be mergeable.
 An AggregateFunction knows how to merge its accumulators but merging
 generic state is not possible.

 I am not aware of an efficient and easy work around for this.
 If you want to use the provided session window logic, you can use a
 WindowFunction that performs all computations when the window is triggered.
 This means that aggregations do not happen eagerly and all events for a
 window are collected and held in state.
 Another approach could be to implement the whole logic (incl. the
 session windowing) using a ProcessFunction. This would be a major effort
 though.

 Best,
 Fabian

 2017-12-06 3:52 GMT+01:00 Vishal Santoshi :

> It seems that this has to do with session windows tbat are mergeable ?
> I tried the RixhWindow function and that seems to suggest that one cannot
> use state ? Any ideas folks...
>
> On Dec 1, 2017 10:38 AM, "Vishal Santoshi" 
> wrote:
>
>> I have a simple Aggregation with one caveat. For some reason I have
>> to keep a large amount of state till the window is GCed. The state is
>> within the Accumulator ( ACC ). I am hitting a memory bottleneck and 
>> would
>> like to offload the state  to the states backend ( ROCKSDB), keeping the
>> between checkpoint state in memory ( seems to be an obvious fix). I am 
>> not
>> though allowed to have a RichAggregateFunction in the aggregate method 
>> of a
>> windowed stream . That begs 2 questions
>>
>> 1. Why
>> 2. Is there an alternative for stateful window aggregation where we
>> manage the state. ?
>>
>> Thanks Vishal
>>
>>
>> Here is the code ( generics but it works  )
>>
>> SingleOutputStreamOperator retVal = input
>> .keyBy(keySelector)
>> .window(EventTimeSessionWindows.withGap(gap))
>> .aggregate(
>>   

Re: REST api: how to upload jar?

2017-12-11 Thread Piotr Nowojski
Hi,

Have you tried this?

https://stackoverflow.com/questions/41724269/apache-flink-rest-client-jar-upload-not-working
?

Piotrek

> On 11 Dec 2017, at 14:22, Edward  wrote:
> 
> Let me try that again -- it didn't seem to render my commands correctly:
> 
> Thanks for the response, Shailesh. However, when I try with python, I get
> the 
> same error as when I attempted this with cURL: 
> 
> $ python uploadJar.py
> java.io.FileNotFoundException:
> /tmp/flink-web-4bed7801-fa5e-4e5e-abf1-3fa13ba1f528/438eaac1-7647-4716-8d8d-f95acd8129b2_/path/to/jar/file.jar
> (No such file or directory)
> 
> That is, if I tell python (or cURL) that my jar file is at 
> /path/to/jar/file.jar, the file path it uses on the server side includes 
> that entire path in the target file name. And if I try the script with no
> path (i.e. run the script 
> in the folder where file.jar exists), it uploads an empty file named 
> file.jar.  The endpoint at file/upload seems to be take the form-data 
> element "jarfile" and use the fully qualified path when trying to save the 
> jar file on the server side. 
> 
> Here is my equivalent attempt using cURL, which gives the same 
> FileNoFoundException as above: 
> 
> curl 'http://localhost:8081/jars/upload' -H 'Content-Type:
> multipart/form-data; boundary=Boundary' --data-binary
> $'--Boundary\r\nContent-Disposition: form-data; name="jarfile";
> filename="/path/to/jar/file.jar"\r\nContent-Type:
> application/java-archive\r\n\r\n\r\n--Boundary--\r\n' 
> 
> 
> 
> 
> --
> Sent from: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/



Re: REST api: how to upload jar?

2017-12-11 Thread Edward
Let me try that again -- it didn't seem to render my commands correctly:

Thanks for the response, Shailesh. However, when I try with python, I get
the 
same error as when I attempted this with cURL: 

$ python uploadJar.py
java.io.FileNotFoundException:
/tmp/flink-web-4bed7801-fa5e-4e5e-abf1-3fa13ba1f528/438eaac1-7647-4716-8d8d-f95acd8129b2_/path/to/jar/file.jar
(No such file or directory)

That is, if I tell python (or cURL) that my jar file is at 
/path/to/jar/file.jar, the file path it uses on the server side includes 
that entire path in the target file name. And if I try the script with no
path (i.e. run the script 
in the folder where file.jar exists), it uploads an empty file named 
file.jar.  The endpoint at file/upload seems to take the form-data 
element "jarfile" and use the fully qualified path when trying to save the 
jar file on the server side. 

Here is my equivalent attempt using cURL, which gives the same 
FileNotFoundException as above: 

curl 'http://localhost:8081/jars/upload' -H 'Content-Type:
multipart/form-data; boundary=Boundary' --data-binary
$'--Boundary\r\nContent-Disposition: form-data; name="jarfile";
filename="/path/to/jar/file.jar"\r\nContent-Type:
application/java-archive\r\n\r\n\r\n--Boundary--\r\n' 




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Re: REST api: how to upload jar?

2017-12-11 Thread Edward
Thanks for the response, Shailesh. However, when I try with python, I get the
same error as when I attempted this with cURL:


That is, if I tell python (or cURL) that my jar file is at
/path/to/jar/file.jar, the file path it uses on the server side includes
that entire path. And if I try the script with no path (i.e. run the script
in the folder where file.jar exists), it uploads an empty file named
file.jar.  The endpoint at file/upload seems to take the form-data
element "jarfile" and use the fully qualified path when trying to save the
jar file on the server side.

Here is my equivalent attempt using cURL, which gives the same
FileNotFoundException as above:




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Re: Exception when using the time attribute in table API

2017-12-11 Thread Piotr Nowojski
Hi,

The NoSuchMethodError probably comes from mismatched compile/runtime versions 
of Flink. Do you have to use the 1.4-SNAPSHOT version? It can change on a daily 
basis, so you have to be more careful about which Flink jars you are using at 
runtime and which at compile time. If you really need some 1.4 features, it 
would be better to rely on the latest RC version (currently that would be 
RC3). 

Regarding 1.3.2: sorry, could you be more specific about what problem you are 
observing and provide more details (stack trace/log)? Is it a compile error? 
A runtime error? 

Piotrek

> On 8 Dec 2017, at 17:36, Sendoh  wrote:
> 
> Hi Flink users,
> 
> I saw this error 
> 
> 12/08/2017 17:31:27   groupBy: (shipmentNumber), window:
> (TumblingGroupWindow('w$, 'rowtime, 360.millis)), select:
> (shipmentNumber, SUM(grandTotal) AS EXPR$1) -> to: Row(3/4) switched to
> FAILED 
> java.lang.NoSuchMethodError:
> org.apache.flink.api.common.functions.AggregateFunction.add(Ljava/lang/Object;Ljava/lang/Object;)V
> 
> The flink version is 1.4-SNAPSHOT. I already implemetend
> DefinedRowtimeAttribute in my table source and return the 
> event time column as row time
> 
> I thought of mvn issue and, and also tried 1.3.2 and it shows the data type
> is not supporting tumble window no matter using Types.LONG() or
> Types.SQL_TIMESTAMP().
> 
> Is there anything I should also notice?
> 
> Best,
> 
> Sendoh
> 
> 
> 
> --
> Sent from: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/



Re: Parallelizing a tumbling group window

2017-12-11 Thread Timo Walther

Hi Colin,

unfortunately, selecting the parallelism for parts of a SQL query is not 
supported yet. By default, tumbling window operators use the default 
parallelism of the environment. Simple project and select operations 
have the same parallelism as the inputs they are applied on.


I think the easiest solution so far is to explicitly set the parallelism 
of operators that are not part of the Table API and use the 
environment's parallelism to scale the SQL query.
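
A minimal sketch of that split (not your job; the table, fields, and query here are invented, and it uses the 1.3-style tEnv.sql() call):

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class SqlParallelismSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(8);  // the SQL (window) operators pick up this parallelism

        StreamTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);

        DataStream<Tuple2<String, Double>> lines =
                env.fromElements(Tuple2.of("app1", 1.0), Tuple2.of("app2", 2.0));
        tEnv.registerDataStream("LINES", lines, "measurement, amount");

        Table result = tEnv.sql("SELECT measurement, SUM(amount) FROM LINES GROUP BY measurement");

        // Operators that are not part of the Table API get their parallelism set explicitly.
        tEnv.toRetractStream(result, Row.class)
                .print()
                .setParallelism(1);

        env.execute("sql parallelism sketch");
    }
}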


I hope that helps.

Regards,
Timo


On 12/9/17 at 3:06 AM, Colin Williams wrote:

Hello,

I've inherited some flink application code.

We're currently creating a table using a Tumbling SQL query similar to 
the first example in


https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/table/sql.html#group-windows 



Where each generated SQL query looks something like

SELECT measurement, `tag_AppId`(tagset), P99(`field_Adds`(fieldset)), 
TUMBLE_START(rowtime, INTERVAL '10' MINUTE) FROM LINES GROUP BY 
measurement, `tag_AppId`(tagset), TUMBLE(rowtime, INTERVAL '10' MINUTE)



We are also using a UDFAGG function in some of the queries which I 
think might be cleaned up and optimized a bit (using scala types and 
possibly not well implemented)


We then turn the result table back into a datastream using 
toAppendStream, and eventually add a derivative stream to a sink. 
We've configured TimeCharacteristic to event-time processing.


In some streaming scenarios everything is working fine with a 
parallelism of 1, but in others it appears that we can't keep up with 
the event source.


Then we are investigating how to enable parallelism specifically on 
the SQL table query or aggregator.


Can anyone suggest a good way to go about this? It wasn't clear from 
the documentation.


Best,

Colin Williams







Re: asyncIO & TM akka response

2017-12-11 Thread Piotr Nowojski
Hi,

Please search the task manager logs for the potential reason of 
failure/disconnecting around the time when you got this error on the job 
manager. There should be some clearly visible exception. 

Thanks, Piotrek

> On 9 Dec 2017, at 20:35, Chen Qin  wrote:
> 
> Hi there,
> 
> Recently, our production Flink jobs observed some weird performance issues. 
> When a job tailing a Kafka source failed and tried to catch up, the asyncIO after the event 
> trigger put a much higher load on the task thread. Since each TM is allocated two 
> virtual CPUs in Docker, my assumption was that akka messages between JM and TM 
> shouldn't be impacted.
> 
> What I observed was TM get closed and keep restart with same error message 
> below. Any suggestion is appreciated!
> 
> 
> org.apache.flink.runtime.io 
> .network.netty.exception.RemoteTransportException:
>  Connection unexpectedly closed by remote task manager 
> '​xxx/​xxx:5841'. This might indicate that the remote task manager 
> was lost.
> at org.apache.flink.runtime.io 
> .network.netty.PartitionRequestClientHandler.channelInactive(PartitionRequestClientHandler.java:115)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:294)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:829)
> at 
> io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:610)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> at java.lang.Thread.run(Thread.java:748)
> 
> ​Chen​ 



Re: Custom Metrics

2017-12-11 Thread Piotr Nowojski
Hi,

I’m not sure if I completely understand your issue.

1.
- You don’t have to pass the RuntimeContext; you can always pass just the 
MetricGroup, or ask your components/subclasses “what metrics do you want to 
register” and register them at the top level (a small sketch follows below).
- Reporting tens/hundreds/thousands of metrics shouldn’t be an issue for Flink, 
as long as you have a reasonable reporting interval. However, keep in mind that 
Flink only reports your metrics and you still need something to 
read/handle/process/aggregate them.
2.
I don’t think that reporting per node/JVM is possible with Flink’s metric 
system. For that you would need some other solution, like reporting your metrics 
via JMX (directly register MBeans from your code).
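
For 1., a minimal sketch of what I mean by passing the MetricGroup down and registering per-tenant metrics lazily (class, group, and field names are made up):

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;
import org.apache.flink.metrics.MetricGroup;

import java.util.HashMap;
import java.util.Map;

public class TenantMetricsMapper extends RichMapFunction<String, String> {

    private transient MetricGroup tenantsGroup;
    private transient Map<String, Counter> countersByTenant;

    @Override
    public void open(Configuration parameters) {
        // This group (or the operator's MetricGroup itself) can be handed to subcomponents.
        tenantsGroup = getRuntimeContext().getMetricGroup().addGroup("tenants");
        countersByTenant = new HashMap<>();
    }

    @Override
    public String map(String record) {
        String tenant = record.split(",")[0];  // assumed: tenant id is the first CSV field
        countersByTenant
                .computeIfAbsent(tenant, t -> tenantsGroup.addGroup(t).counter("records"))
                .inc();
        return record;
    }
}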

Piotrek

> On 10 Dec 2017, at 18:51, Navneeth Krishnan  wrote:
> 
> Hi,
> 
> I have a streaming pipeline running on flink and I need to collect metrics to 
> identify how my algorithm is performing. The entire pipeline is 
> multi-tenanted and I also need metrics per tenant. Lets say there would be 
> around 20 metrics to be captured per tenant. I have the following ideas for 
> implemention but any suggestions on which one might be better will help.
> 
> 1. Use flink metric group and register a group per tenant at the operator 
> level. The disadvantage of this approach for me is I need the runtimecontext 
> parameter to register a metric and I have various subclasses to which I need 
> to pass this object to limit the metric scope within the operator. Also there 
> will be too many metrics reported if there are higher number of subtasks. 
> How is everyone accessing flink state/ metrics from other classes where you 
> don't have access to runtimecontext?
> 
> 2. Use a custom singleton metric registry to add and send these metrics using 
> custom sink. Instead of using flink metric group to collect metrics per 
> operatior - subtask, collect per jvm and use influx sink to send the metric 
> data. What i'm not sure in this case is how to collect only once per node/jvm.
> 
> Thanks a bunch in advance.



Re: Exception when using the time attribute in table API

2017-12-11 Thread Timo Walther

Hi Sendoh,

at first glance this looks like a Maven issue to me. Are you sure you 
are using a consistent version for both core Flink and flink-table (also 
a consistent Scala version, 2.11)?


Maybe you can share your pom.xml with us. It seems that flink-table is a 
newer version than your Flink core.


Regards,
Timo



On 12/8/17 at 5:37 PM, Sendoh wrote:

Hi Flink users,

I saw this error

12/08/2017 17:31:27 groupBy: (shipmentNumber), window:
(TumblingGroupWindow('w$, 'rowtime, 360.millis)), select:
(shipmentNumber, SUM(grandTotal) AS EXPR$1) -> to: Row(3/4) switched to
FAILED
java.lang.NoSuchMethodError:
org.apache.flink.api.common.functions.AggregateFunction.add(Ljava/lang/Object;Ljava/lang/Object;)V

The flink version is 1.4-SNAPSHOT. I already implemetend
DefinedRowtimeAttribute in my table source and setup the
event time column as rowtime

I thought of mvn issue and, and also tried 1.3.2 and it shows the data type
is not supporting tumble window no matter using Types.LONG() or
Types.SQL_TIMESTAMP().

Is there anything I should also notice?

Best,

Sendoh



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/





Re: save points through REST API not supported ?

2017-12-11 Thread Chesnay Schepler
This doesn't sound like proper behavior, could you provide instructions 
on how to reproduce this?


On 07.12.2017 13:42, Lasse Nedergaard wrote:

I hope it can be put in 1.4.1.

I have one concern about the REST API. We are running 1.3.1 on DC/OS and 
if we apply parameters as arguments and our code validates these 
parameters and throws an exception during startup if something is wrong, 
we see that all uploaded jars disappear, all running jobs are gone, and 
sometimes we lose all the task managers. I don’t think this is the 
intended behaviour.


Med venlig hilsen / Best regards
Lasse Nedergaard


On 7 Dec 2017, at 13.00, Chesnay Schepler wrote:


In retrospect I'm quite frustrated we didn't get around to 
implementing this for 1.4.
The least-effort implementation would have required copying one 
class, and modifying ~10 lines.


Doesn't get any more low-hanging than that.

On 07.12.2017 12:51, Chesnay Schepler wrote:
I've finished the implementation. I created 2 new branches in my 
repository, for 1.3.2 and 1.4.


1.3.2: https://github.com/zentol/flink/tree/release-1.3.2-custom
1.4: https://github.com/zentol/flink/tree/release-1.4-custom

Triggering savepoints works exactly like cancel-with-savepoint, just 
replace "cancel-with-savepoint" with "trigger-savepoint" in the URL.


  * "/jobs/:jobid/trigger-savepoint"
  * "/jobs/:jobid/trigger-savepoint/target-directory/:targetDirectory"
  * "/jobs/:jobid/trigger-savepoint/in-progress/:requestId"


On 07.12.2017 03:43, Vishal Santoshi wrote:

That would be awesome... we could do it too if that is acceptable...

On Dec 6, 2017 3:56 PM, "Chesnay Schepler" > wrote:


No, this is also not possible in 1.4.

This will be available in 1.5 where we rework the
client-cluster communication to go entirely through the REST API.
This will mean that everything you can do with the command-line
client can also be achieved directly through the REST API.

However, from what i can tell this should be straight-forward
to implement in a custom build. I can take a deeper look if
this would be an acceptable workaround for you.

On 07.12.2017 00:20, Vishal Santoshi wrote:

Is that something 1.4 has?

On Dec 6, 2017 1:01 PM, "Lasse Nedergaard"
mailto:lassenederga...@gmail.com>>
wrote:

Hi.

It is not possible through REST in Flink 1.3.2 I’m looking
for the feature. The only option is to use ./Flink
savepoint for now

Med venlig hilsen / Best regards
Lasse Nedergaard


On 6 Dec 2017, at 21.52, Vishal Santoshi
mailto:vishal.santo...@gmail.com>>:


I was more interested in savepoints WITHOUT cancellation...

On Dec 6, 2017 11:07 AM, "vipul singh"
mailto:neoea...@gmail.com>> wrote:

Hi Vishal,

Job cancellations can be done via a REST API:

https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/rest_api.html#cancel-job-with-savepoint



Thanks,
Vipul

On Wed, Dec 6, 2017 at 10:56 AM, Vishal Santoshi
mailto:vishal.santo...@gmail.com>> wrote:

One can submit jobs, upload jars, kill jobs etc
very strange that you can’t do a save point ?

Or am I missing something obvious ?

Vishal




-- 
Thanks,

Vipul











Re: The timing operation is similar to storm’s tick

2017-12-11 Thread Fabian Hueske
Hi,

I think you are looking for a ProcessFunction with timers [1].

Best,
Fabian

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/process_function.html
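
A rough sketch of that pattern (it assumes the function runs on a keyed stream, since timers require one, and fetchViaHttp() stands in for your actual HTTP call; the refresh then happens per key):

import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

public class PropertyRefresher extends ProcessFunction<String, String> {

    private static final long ONE_MINUTE = 60_000L;

    private transient String property;
    private transient boolean timerScheduled;

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
        if (!timerScheduled) {
            ctx.timerService().registerProcessingTimeTimer(
                    ctx.timerService().currentProcessingTime() + ONE_MINUTE);
            timerScheduled = true;
        }
        out.collect(value + "/" + property);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        property = fetchViaHttp();                                               // your HTTP request
        ctx.timerService().registerProcessingTimeTimer(timestamp + ONE_MINUTE);  // schedule the next "tick"
    }

    private String fetchViaHttp() {
        return "value";  // placeholder
    }
}

// usage sketch: stream.keyBy(...).process(new PropertyRefresher());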

2017-12-11 9:03 GMT+01:00 Marvin777 :

> hi,
>
> I'm new to apache Flink. I want to update the property value per minute
> via an HTTP request.I did this in storm using tick tuple. Does Flink have
> something similar which i can use in a flink operator?
>
> thanks.
>


Re: aggregate does not allow RichAggregateFunction ?

2017-12-11 Thread Fabian Hueske
Hi,

the order of records that are sent from one task to another task is
preserved (a task refers to the parallel instance of an operator).
However, a task that receives records from multiple input tasks, consumes
records from its inputs in arbitrary order.

If a job reads from a partitioned Kafka topic and does a keyBy on the
partitioning key of the Kafka topic, an operator task that follows the
keyBy consumes all records with the same key from exactly one input task
(the one reading the Kafka partition for the key).
However, since Flink's and Kafka's partitioning functions are not the same,
records from the same partition with different keys can be sent to
different tasks.

So:
1) Records from the same partition might not be processed by the same
operator (and hence not in order).
2) Records with the same key are processed by the same operator in the same
order in which they were read from the partition.

Best,
Fabian

2017-12-09 18:09 GMT+01:00 Vishal Santoshi :

> An additional question is that if the source is key partitioned  ( kafka )
> does a keyBy retain the order of  a kafka partirion across a shuffle ?
>
> On Fri, Dec 8, 2017 at 1:12 PM, Vishal Santoshi  > wrote:
>
>> I understand that. Let me elaborate. The sequence of events is
>>
>> 1. Round robin dispatch to kafka cluster  ( it is not partitioned on the
>> key which we may ultimately do  and than I will have more questions on how
>> to key y and still keep order, pbly avoid shuffle :) ) .
>> 2. key by a high cardinality key
>> 3. Sessionize
>> 4. B'coz of the RR on kafka ( and even if partitioned on the key and a
>> subsequent key by ), the sort order is not retained and the ACC has to hold
>> on to the elements in a List . When the Window is finalized  we sort the in
>> ACC  List  and do pagination, We are looking for paths within a session
>> from . a source to a sink event based. I was hoping to use ROCKS DB state
>> as a final merged list and thus off heap and  use a Count based Trigger to
>> evaluate the ACC and merge the inter Trigger collection to the  master copy
>> rather than keeping all events in the ACC ( I would imagine a very general
>> pattern to use ).
>>
>> Does that make sense ?
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Fri, Dec 8, 2017 at 9:11 AM, Aljoscha Krettek 
>> wrote:
>>
>>> Hi,
>>>
>>> If you use an AggregatingFunction in this way (i.e. for a window) the
>>> ACC should in fact be kept in the state backend. Did you configure the job
>>> to use RocksDB? How are the memory problems manifesting?
>>>
>>> Best,
>>> Aljoscha
>>>
>>>
>>> On 6. Dec 2017, at 14:57, Fabian Hueske  wrote:
>>>
>>> Hi Vishal,
>>>
>>> you are right, it is not possible to use state in an AggregateFunction
>>> because windows need to be mergeable.
>>> An AggregateFunction knows how to merge its accumulators but merging
>>> generic state is not possible.
>>>
>>> I am not aware of an efficient and easy work around for this.
>>> If you want to use the provided session window logic, you can use a
>>> WindowFunction that performs all computations when the window is triggered.
>>> This means that aggregations do not happen eagerly and all events for a
>>> window are collected and held in state.
>>> Another approach could be to implement the whole logic (incl. the
>>> session windowing) using a ProcessFunction. This would be a major effort
>>> though.
>>>
>>> Best,
>>> Fabian
>>>
>>> 2017-12-06 3:52 GMT+01:00 Vishal Santoshi :
>>>
 It seems that this has to do with session windows tbat are mergeable ?
 I tried the RixhWindow function and that seems to suggest that one cannot
 use state ? Any ideas folks...

 On Dec 1, 2017 10:38 AM, "Vishal Santoshi" 
 wrote:

> I have a simple Aggregation with one caveat. For some reason I have to
> keep a large amount of state till the window is GCed. The state is within
> the Accumulator ( ACC ). I am hitting a memory bottleneck and would like 
> to
> offload the state  to the states backend ( ROCKSDB), keeping the between
> checkpoint state in memory ( seems to be an obvious fix). I am not though
> allowed to have a RichAggregateFunction in the aggregate method of a
> windowed stream . That begs 2 questions
>
> 1. Why
> 2. Is there an alternative for stateful window aggregation where we
> manage the state. ?
>
> Thanks Vishal
>
>
> Here is the code ( generics but it works  )
>
> SingleOutputStreamOperator retVal = input
> .keyBy(keySelector)
> .window(EventTimeSessionWindows.withGap(gap))
> .aggregate(
> new AggregateFunction() {
>
> @Override
> public ACC createAccumulator() {
> ACC newInstance = (ACC) accumulator.clone();
> newInstance.resetLocal();
> return newInstance;
> }
>
> @Ov

Re: Problem with runGatherSumApplyIteration

2017-12-11 Thread rostami

Dear Stefan,

thanks for your answer.

Here is the flink version:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-java</artifactId>
    <version>1.3.2</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-gelly_2.11</artifactId>
    <version>1.3.2</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-clients_2.11</artifactId>
    <version>1.3.2</version>
</dependency>

This is the full stack trace:

org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
	at  
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:717)
	at  
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:663)
	at  
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:663)
	at  
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
	at  
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)

at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
	at  
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)

at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at  
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
	at  
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)

at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at  
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.Exception: The user defined 'open()' method  
caused an exception:  
org.apache.flink.api.common.functions.RuntimeContext.hasBroadcastVariable(Ljava/lang/String;)Z

at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:475)
	at  
org.apache.flink.runtime.iterative.task.AbstractIterativeTask.run(AbstractIterativeTask.java:145)
	at  
org.apache.flink.runtime.iterative.task.IterationIntermediateTask.run(IterationIntermediateTask.java:92)

at 
org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodError:  
org.apache.flink.api.common.functions.RuntimeContext.hasBroadcastVariable(Ljava/lang/String;)Z
	at  
org.apache.flink.graph.spargel.ScatterGatherIteration$ScatterUdfWithEdgeValues.open(ScatterGatherIteration.java:249)
	at  
org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:38)

at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:471)
... 5 more

Regards,
Ali


Quoting Stefan Richter :


Hi,

it would be helpful if you could tell us the Flink version you are  
using and the full stacktrace. However, this looks like there could  
be a version conflict, e.g. is your cluster running the same version  
of Flink that you build your job against?


Best,
Stefan


On 08.12.2017 at 10:23, rost...@informatik.uni-leipzig.de wrote:

Dear All,

I got the following error when I use the function  
"runGatherSumApplyIteration":
... java.lang.NoSuchMethodError:  
org.apache.flink.api.common.functions.RuntimeContext.hasBroadcastVariable(Ljava/lang/String;)Z


I got this problem even when I use the given example from Flink  
documentation.


Anyone saw such a problem?

Regards,
Ali







RE: slot group indication per operator

2017-12-11 Thread Sofer, Tovi
Hi.

Any update or suggestion on this?

Best regards,
Tovi
From: Timo Walther [mailto:twal...@apache.org]
Sent: Tuesday, December 5, 2017 18:55
To: user@flink.apache.org
Cc: ches...@apache.org
Subject: Re: slot group indication per operator

Hi Tovi,

you are right, it is difficult to check the correct behavior.

@Chesnay: Do you know if we can get this information? If not through the Web 
UI, maybe via REST? Do we have access to the full ExecutionGraph somewhere?

Otherwise it might make sense to open an issue for this.

Regards,
Timo


On 12/5/17 at 4:25 PM, Sofer, Tovi wrote:
Hi all,
I am trying to use the slot group feature, by having ‘default’ group and 
additional ‘market’ group.
The purpose is to divide the resources equally between two sources and their 
following operators.
I’ve set the slotGroup on the source of the market data.
Can I assume that all following operators created from this source will use 
same slot group of ‘market’?
(The operators created for market stream are pretty complex, with connect and 
split).
In Web UI I saw there are 16 slots, but didn’t see indication per operator to 
which group it was assigned. How can I know?

Relevant Code:

env.setParallelism(8);
conf.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, 16); // to allow parallelism of 8 per group

// Market source and operators:

KeyedStream windowedStreamA = sourceProvider.provide(env)
    .name(spotSourceProvider.getName())
    .slotSharingGroup(SourceMsgType.MARKET.slotGroup())
    .flatMap(new ParserMapper(new MarketMessageParser()))
    .name(ParserMapper.class.getSimpleName())
    .filter(new USDFilter())
    .name(USDFilter.class.getSimpleName())
    .keyBy(MarketEvent.CURRENCY_FIELD)
    .timeWindow(Time.of(windowSizeMs, TimeUnit.MILLISECONDS))
    .process(new LastInWindowPriceChangeFunction())
    .name(LastInWindowPriceChangeFunction.class.getSimpleName())
    .keyBy(SpotTickEvent.CURRENCY_FIELD);

marketConnectedStream = windowedStreamA.connect(windowedStreamB)
    .flatMap(new MarketCoMapper())
    .name(MarketCoMapper.class.getSimpleName());

SplitStream stocksWithSpotsStreams = marketConnectedStream
    .split(market -> ImmutableList.of("splitA", "splitB"));

DataStream<MarketAWithMarketB> splitA = stocksWithSpotsStreams.select("splitA");


Thanks and regards,
Tovi






The timing operation is similar to storm’s tick

2017-12-11 Thread Marvin777
hi,

I'm new to Apache Flink. I want to update a property value every minute via
an HTTP request. I did this in Storm using the tick tuple. Does Flink have
something similar which I can use in a Flink operator?

thanks.