I was able to resolve my issue by collecting all the 'column' Arrays via
countWindowAll and using flatMap to emit 'row' Arrays.
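The core of that fix is just transposing one window's worth of 'column' arrays into 'row' arrays. Flink's countWindowAll/flatMap plumbing aside, the reshaping step might look like this in Python (function name and sample data are illustrative, not from the original job):

```python
def rows_from_columns(column_arrays):
    """Given one window's worth of equal-length 'column' arrays,
    emit one 'row' array per index, as the flatMap would."""
    if not column_arrays:
        return []
    n_rows = len(column_arrays[0])
    # row i collects element i from every column in the window
    return [[col[i] for col in column_arrays] for i in range(n_rows)]

# two column arrays collected by a countWindowAll(2)
window = [[1, 2, 3], [4, 5, 6]]
print(rows_from_columns(window))  # [[1, 4], [2, 5], [3, 6]]
```

Each emitted row can then be processed in parallel downstream, which is the point of the rewrite.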
Till Rohrmann wrote
> I'm not sure whether I grasp the whole problem, but can't you split the
> vector up into the different rows, group by the row index and then apply
> some kind of continuous aggregation or window function?
So I could flatMap my incoming Arrays into (rowId, arrayElement) and gath
Hello,
Every contribution to the master branch will be released as part of the
next minor version, in your case this would be 1.2.
We are currently aiming for a release in December.
In between minor versions several bug-fix versions are released (1.1.1,
1.1.2 etc.). For these the community pi
Hi Aljoscha,
I know it's tricky...
A few weeks ago we decided to implement it without windows, using just a
stateful operator and some queues/maps per key as state - so yeah, we
tried to imagine how to do this in plain Java with one stream ;)
We also process watermarks to evict old events. Fortuna
Hi Daniel,
I'm not sure whether I grasp the whole problem, but can't you split the
vector up into the different rows, group by the row index and then apply
some kind of continuous aggregation or window function?
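Flink specifics aside, the suggested shape — flatMap each vector into (rowIndex, value) pairs, key by the index, and aggregate continuously — can be sketched in Python (the running sum is just an example aggregate; all names here are hypothetical):

```python
from collections import defaultdict

def keyed_running_sum(pairs):
    """Simulate flatMapping vectors into (row_id, value) pairs,
    keying by row_id, and keeping a running aggregate per key."""
    state = defaultdict(int)  # per-key state, as a keyed operator would hold
    out = []
    for row_id, value in pairs:
        state[row_id] += value  # continuous aggregation per row
        out.append((row_id, state[row_id]))
    return out

# two incoming vectors [1, 2] and [10, 20], flatMapped to (row_id, value)
events = [(0, 1), (1, 2), (0, 10), (1, 20)]
print(keyed_running_sum(events))  # [(0, 1), (1, 2), (0, 11), (1, 22)]
```

In actual Flink this would be a `flatMap` followed by `keyBy(0)` and a fold/reduce or window function.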
Maybe it helps if you can share some of your code with the community to
discuss the i
Hi,
the state of the window is kept by the WindowOperator (which uses the state
descriptor you mentioned to access the state). The FoldFunction does not
itself keep the state but is only used to update the state inside the
WindowOperator, if you will.
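One way to picture that split: the operator object owns the state, while the fold function is a pure update rule. The sketch below mimics this in Python and is not Flink's actual implementation:

```python
def fold_add(acc, element):
    """User FoldFunction: pure, keeps no state of its own."""
    return acc + element

class WindowOperatorSketch:
    """Owns the per-window state; calls the fold function to update it."""
    def __init__(self, fold_fn, initial):
        self.fold_fn = fold_fn
        self.state = initial  # state lives here, not in fold_fn

    def process(self, element):
        self.state = self.fold_fn(self.state, element)
        return self.state

op = WindowOperatorSketch(fold_add, 0)
for x in [1, 2, 3]:
    op.process(x)
print(op.state)  # 6
```

Because the operator holds the state, checkpointing and restoring it is the operator's job; the fold function never needs to be restored.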
When you say restart, are you talking about st
Hello!
I have a data source that emits Arrays that I collect into windows via
countWindow. Rather than parallelize my subsequent operations by groups of
these arrays, I'd like to parallelize my operations across the elements of
the array (rows rather than columns, if you will) within each window.
Thanks, I didn't know about the -z flag!
I haven't been able to get it to work though (using yarn-cluster, with a
ZooKeeper root configured to /flink in my flink-conf.yaml).
I can see my job directory in ZK under
/flink/application_1477475694024_0015 and I've tried a few ways to restore
the job:
Hi Max,
Thanks for your help. Flink-spector looks just like what I need.
Greetings,
Juan
On Thu, Nov 3, 2016 at 11:05 AM, Maximilian Michels wrote:
> Hi Juan,
>
> StreamingMultipleProgramsTestBase is in the testing scope. Thus, it is
> not bundled in the normal jars. You would have to add the
Hello Aljoscha,
Thank you for your reply.
But I believe, reading from the docs, that any user function has to be a
Rich Function if it wishes to have state.
However, a Rich Function cannot be used or accepted on a Window.
For instance, looking at the Flink source for version 1.1.3, which is the one I'm
If the configured ZooKeeper paths are still the same, the job should
be recovered automatically. On each submission a unique ZK namespace
is used based on the app ID.
So you have in ZK:
/flink/app_id/...
You would have to set that manually to resume an old application. You
can do this via the -z flag.
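For example, assuming the namespace seen in ZooKeeper above, resubmitting with the -z (--zookeeperNamespace) option might look like this (a command sketch; the jar and job arguments are placeholders):

```shell
flink run -m yarn-cluster -z application_1477475694024_0015 <your-job.jar>
```

This makes the new submission reuse the old application's ZK namespace so the persisted checkpoint pointer can be found.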
Hi Ufuk,
I see, but in my case the failure caused the YARN application to move into
a finished/failed state - so the application itself is no longer running. How
can I restart the application (or start a new YARN application) and ensure
that it uses the checkpoint pointer stored in Zookeeper?
Thanks,
J
No you don't need to manually trigger a savepoint. With HA checkpoints
are persisted externally and store a pointer in ZooKeeper to recover
them after a JobManager failure.
On Fri, Nov 4, 2016 at 2:27 PM, Josh wrote:
> I have a follow up question to this - if I'm running a job in 'yarn-cluster'
>
I have a follow up question to this - if I'm running a job in
'yarn-cluster' mode with HA and then at some point the YARN application
fails due to some hardware failure (i.e. the YARN application moves to
"FINISHED"/"FAILED" state), how can I restore the job from the most recent
checkpoint?
I can
The tickets are in Flink's Jira:
https://issues.apache.org/jira/browse/FLINK-4965
https://issues.apache.org/jira/browse/FLINK-4966
Are you looking to process temporal graphs with the DataStream API?
On Fri, Nov 4, 2016 at 5:52 AM, otherwise777 wrote:
> Cool, thanks for that,
>
> I tried searc
Hi, Fabian:
Will the checkpointed state be restored between streaming jobs?
On Fri, Nov 4, 2016 at 6:47 PM Fabian Hueske wrote:
> Yes, that is true for Flink 1.1.x.
>
> The upcoming Flink 1.2.0 release will be able to restore jobs from
> savepoints with different parallelism.
>
> 2016-11-04 11:2
Hi Scott & Stephan,
The problem has happened a couple more times since yesterday, it's very
strange as my job was running fine for over a week before this started
happening. I find that if I restart the job (and restore from the last
checkpoint) it runs fine for a while (couple of hours) before br
Hi Daniel,
Flink will checkpoint the state of all operations (in your case to HDFS).
Flink has several APIs for dealing with state in user functions:
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/state.html The
window operator also internally uses these APIs.
Let me know if you n
Yes, that is true for Flink 1.1.x.
The upcoming Flink 1.2.0 release will be able to restore jobs from
savepoints with different parallelism.
2016-11-04 11:24 GMT+01:00 Renjie Liu :
> Hi, all:
> It seems that Flink's checkpoint mechanism saves state per partition.
> However, if I want to change c
Hi, all:
It seems that Flink's checkpoint mechanism saves state per partition.
However, if I want to change configuration of the job and resubmit the job,
I can not recover from last checkpoint, right? Say I change the parallelism
of kafka consumer and partitions assigned to each subtask will be
di
Hi,
Does the Flink Kafka connector 0.8.2 handle a broker's leader change
gracefully, given that a simple Kafka consumer has to handle leader changes
for a partition itself?
How would the consumer behave when upgrading the brokers from 0.8 to 0.9?
Thanks
Hi Anchit,
The documentation mentions that you need Zookeeper in addition to
setting the application attempts. Zookeeper is needed to retrieve the
current leader for the client and to filter out old leaders in case
multiple exist (old processes could even stay alive in Yarn). Moreover, it
is neede
Cool, thanks for that,
I tried searching for it on GitHub but couldn't find it; do you have the
URL by any chance?
I'm going to try to implement such an algorithm for temporal graphs
Thank you for helping to investigate the issue. I've filed an issue in our
bugtracker: https://issues.apache.org/jira/browse/FLINK-5013
On Wed, Nov 2, 2016 at 10:09 PM, Justin Yan wrote:
> Sorry it took me a little while, but I'm happy to report back that it
> seems to be working properly with E
Hi Pedro,
yes, I was more or less suggesting a similar approach to the one taken by
King. In code, it would look somewhat like this:
DataStream<MyEvent> input = ...;
DataStream<Tuple2<MyEvent, MyWindow>> withMyWindows = input
    .map(new AssignMyWindows());
withMyWindows
    .keyBy(...)
    .window(new DynamicWindowAssigner())
    ...
where