Hi,
I was thinking on a problem and how to solve it with Flink Streaming.
Imagine you have a stream of data where you want to apply several
transformations, where some transformations depend on previous
transformations and there is a final set of actions. This is modeled in a
natural way as a DAG
Hi,
I am running a flink stream application on a cluster with 6 slaves/task
managers. I have set in flink-conf.yaml of every machine
"parallelization.degree.default: 6". However, when I run my application it
just uses one task slot and not all of them. Am I missing something?
Thanks.
Thanks! :-)
I hope we can fix it for the release.
On Tue, Feb 23, 2016 at 4:45 PM, Zach Cox wrote:
> Hi Ufuk - here is the jira issue with the requested information:
> https://issues.apache.org/jira/browse/FLINK-3483
>
> -Zach
>
>
> On Tue, Feb 23, 2016 at 8:59 AM Ufuk Celebi wrote:
>>
>> Hey Z
Hi Ufuk - here is the jira issue with the requested information:
https://issues.apache.org/jira/browse/FLINK-3483
-Zach
On Tue, Feb 23, 2016 at 8:59 AM Ufuk Celebi wrote:
> Hey Zach! I'm not aware of an open issue for this.
>
> You can go ahead and open an issue for it. It will be very helpful
Registering a data type is only relevant for the Kryo serializer or if you
want to serialize a subclass of a POJO. Registering has the advantage that
you assign an id to the class which is written instead of the full class
name. The latter is usually much longer than the id.
Cheers,
Till
On Tue,
Hey Zach! I'm not aware of an open issue for this.
You can go ahead and open an issue for it. It will be very helpful to
include the following:
- exact Chrome and OS X version
- the exectuion plan as JSON (via env.getExecutionPlan())
- screenshot
Thanks!
– Ufuk
On Tue, Feb 23, 2016 at 3:46 PM,
Hi - I typically use the Chrome browser on OS X, and notice that with
1.0.0-rc0 the job graph visualization displays the nodes in the graph, but
not any of the edges. Also the graph does not move around when dragging the
mouse.
The job graph visualization seems to work perfectly in Safari and Fire
Hi Till (and others).
Thank you very much for your helpful answer.
On 23.02.2016 14:20, Till Rohrmann wrote:
[...] In contrast, if you had a parallel data source which would
consist of multiple source task, then these tasks would be independent
and spread out across your cluster [...]
Can yo
Hi Flavio,
I think the point is that Flink can use its serialization tools if you
register the class in advance. If you don't do that, it will use Kryo
as a fall-back which is slightly less efficient.
Equals and hash code have to be implemented correctly if you compare
Pojos. For standard types l
Hi Tim,
depending on how you create the DataSource fileList, Flink will
schedule the downstream operators differently. If you used the
ExecutionEnvironment.fromCollection method, then it will create a DataSource
with a CollectionInputFormat. This kind of DataSource will only be executed
with a deg
I would go with one task manager with 48 slots per machine. This
reduces the communication overheads between task managers.
Regarding memory configuration: Given that the machines have plenty of
memory, I would configure a bigger heap than the 4 GB you had
previously. Furhermore, you can also cons
Dear FLINK community.
I was wondering what would be the recommended (best?) way to achieve
some kind of file conversion. That runs in parallel on all available
Flink Nodes, since it it "embarrassingly parallel" (no dependency
between files).
Say, I have a HDFS folder that contains multiple
I'm not very familiar with the inner workings of the InputFomat's. calling
.open() got rid of the Nullpointer but the stream still produces no output.
As a temporary solution I wrote a batch job that just unions all the
different datasets and puts them (sorted) into a single folder.
cheers Martin
Hi Ufuk and Fabian,
Is that better to start 48 task manager ( one slot each ) in one machine
than having single task manager with 48 slot ? Any trade-off that we should
know etc ?
Cheers
On Tue, Feb 23, 2016 at 3:03 PM, Welly Tambunan wrote:
> Hi Ufuk,
>
> Thanks for the explanation.
>
> Yes.
Great. That's good news. Let us know if you encounter more issues with the
Kafka connector.
By the way, Kafka released 0.9.0.1, maybe updating your brokers to that
version resolves the issues? (Maybe the problems of some of the topics were
caused by bugs in Kafka)
On Tue, Feb 23, 2016 at 10:23 AM
On Tue, Feb 23, 2016 at 10:17 AM, Ana M. Martinez wrote:
> I believe that setting taskmanager.numberOfTaskSlots is not necessary, but
> setParallelism is, as by default 1 was taken.
Yes, the number of slots in local execution defaults to the maximum
parallelism of the job.
Hi Robert,
After we restarted our Kafka / Zookeeper cluster the consumer worked. Some
of our topics had some problems. The flink's consumer for Kafka 0.9 works
as expected.
Thanks!
On 19 February 2016 at 12:03, Lopez, Javier wrote:
> Hi, these are the properties:
>
> Properties properties = ne
Hi all,
Thank you very much for your help. It worked perfectly like this:
Configuration conf = new Configuration();
conf.setInteger("taskmanager.network.numberOfBuffers", 16000);
conf.setInteger("taskmanager.numberOfTaskSlots”,32);
final ExecutionEnvironment env =
ExecutionEnvironment.createLoc
Hi Ufuk,
Thanks for the explanation.
Yes. Our jobs is all streaming job.
Cheers
On Tue, Feb 23, 2016 at 2:48 PM, Ufuk Celebi wrote:
> The new default is equivalent to the previous "streaming mode". The
> community decided to get rid of this distinction, because it was
> confusing to users.
>
19 matches
Mail list logo