l just serialize an integer id. So the
> amount of data being transferred goes down drastically.
>
> The disableAutoTypeRegistration flag is ignored in the DataStream API at
> the moment.
>
> On Thu, Feb 23, 2017 at 7:00 PM, Dmitry Golubets <dgolub...@gmai
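For reference, a minimal sketch of what explicit type registration looks like; the event types below are placeholders, and whether a given type actually goes through Kryo depends on how Flink extracts its TypeInformation:

```scala
import org.apache.flink.streaming.api.scala._

// Placeholder types, just to have something to register.
class MyEvent(val id: Long, val payload: String)
class MyOtherEvent(val id: Long)

object RegisterTypesSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Once a type is registered, the serializer can write a small integer tag
    // for it instead of the fully qualified class name on every record.
    env.registerType(classOf[MyEvent])
    env.getConfig.registerKryoType(classOf[MyOtherEvent])
  }
}
```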
case classes? If so, what are
> you using for doing that?
>
> Regards,
> Robert
>
> On Fri, Feb 17, 2017 at 9:17 PM, Dmitry Golubets <dgolub...@gmail.com>
> wrote:
>
>> Hi Daniel,
>>
>> I've implemented a macro that generates message pack serializ
DefaultKryoSerializer(..)
> .
>
> I'm interested in knowing what you have done there to get a boost of about
> 50%.
>
> Some small or simple example would be very nice.
>
> Thank you very much in advance.
>
> Kind Regards,
>
> Daniel Santos
>
> On 02/17/201
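Not the macro-generated MessagePack serializers discussed in the thread, but a hedged sketch of how a custom Kryo serializer is usually hooked in; MyEvent and MyEventSerializer are made-up placeholders:

```scala
import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import org.apache.flink.streaming.api.scala._

class MyEvent(val id: Long, val payload: String)   // placeholder type

// Hand-written placeholder; the thread's serializers are generated by a macro.
class MyEventSerializer extends Serializer[MyEvent] {
  override def write(kryo: Kryo, out: Output, e: MyEvent): Unit = {
    out.writeLong(e.id)
    out.writeString(e.payload)
  }
  override def read(kryo: Kryo, in: Input, cls: Class[MyEvent]): MyEvent =
    new MyEvent(in.readLong(), in.readString())
}

object CustomSerializerSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Use MyEventSerializer whenever Kryo has to handle MyEvent.
    env.getConfig.addDefaultKryoSerializer(classOf[MyEvent], classOf[MyEventSerializer])
  }
}
```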
th for you.
>
> Cheers,
> Till
>
>
> On Fri, Feb 17, 2017 at 12:38 PM, Dmitry Golubets <dgolub...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I was using ```cs.knownDirectSubclasses``` recursively to find and
>> register subclasses, which may have resulted
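For context, a rough runtime-reflection equivalent of that traversal (the thread's version is a compile-time macro; the sealed hierarchy and the `registerType` calls below are illustrative, and `knownDirectSubclasses` only sees subclasses recorded in the signature of a sealed type):

```scala
import org.apache.flink.streaming.api.scala._
import scala.reflect.runtime.{universe => ru}

sealed trait A                        // placeholder sealed hierarchy
case class B(x: Int) extends A
case class C(y: String) extends A

object RegisterHierarchySketch {
  // Recursively collect all known subclasses of a (sealed) class symbol.
  def allSubclasses(sym: ru.ClassSymbol): Set[ru.ClassSymbol] =
    sym.knownDirectSubclasses
      .collect { case cs: ru.ClassSymbol => cs }
      .flatMap(cs => allSubclasses(cs) + cs)

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val mirror = ru.runtimeMirror(getClass.getClassLoader)
    allSubclasses(ru.typeOf[A].typeSymbol.asClass).foreach { cs =>
      env.registerType(mirror.runtimeClass(cs))
    }
  }
}
```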
Hi,
Can I force Flink to use Akka 2.4 (recompile if needed)?
Is it going to misbehave in a subtle way?
Best regards,
Dmitry
w it is just hard coded to use a round-robin repartitioner
> implementation as default.
>
> However, I'm not sure of the plans for exposing this to the user and making
> it configurable.
> Looping in Stefan (in cc) who mostly worked on this part and see if he can
> provide more info.
Hi,
It looks impossible to implement keyed state with operator state now.
I know it sounds like "just use a keyed state", but the latter requires
updating it on every value change, as opposed to operator state, and thus can
be expensive (especially if you have to deal with mutable structures inside
The docs say that it may improve performance.
How true is that when custom serializers are provided?
There is also a 'disableAutoTypeRegistration' method in the config class,
implying that Flink registers types automatically.
So, given that I have a hierarchy:
trait A
class B extends A
class C extends
Hi,
I need to re-create a Kafka topic when a job is started in "clean" mode.
I can do it, but I'm not sure if I'm doing it in the right place.
Is it fine to put this kind of code in the "main"?
Then it's called on every job submission.
But.. how to detect if a job is being started from a savepoint?
Or
Update: I've now used the 1.1.3 versions, as in the example in the docs, and it
works!
Looks like there is an incompatibility with the latest logback.
Best regards,
Dmitry
On Wed, Feb 8, 2017 at 3:20 PM, Dmitry Golubets <dgolub...@gmail.com> wrote:
> Hi Robert,
>
> After reading that
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/monitoring/best_practices.html#use-logback-when-running-flink-on-a-cluster
>
> On Tue, Feb 7, 2017 at 1:07 PM, Dmitry Golubets <dgolub...@gmail.com>
> wrote:
>
>> Hi,
>>
>> documentation says: "Users willin
Hi,
The documentation says: "Users willing to use logback instead of log4j can just
exclude log4j (or delete it from the lib/ folder)."
But then Flink just doesn't start. I added logback-classic 1.10 to its lib
folder, but still get a NoClassDefFoundError:
ch/qos/logback/core/joran/spi/JoranException
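The missing class, ch.qos.logback.core.joran.spi.JoranException, lives in logback-core, so logback-classic alone is not enough: the lib/ folder needs logback-core as well (and typically log4j-over-slf4j if anything still logs through the log4j API). On the application side, a hedged build.sbt sketch with illustrative versions:

```scala
// build.sbt sketch: bring in logback (classic also drags in core) and keep
// Flink's log4j binding out of the classpath. Versions are illustrative.
libraryDependencies ++= Seq(
  "ch.qos.logback" % "logback-classic" % "1.1.3",
  "org.apache.flink" %% "flink-streaming-scala" % "1.2.0"
    exclude("log4j", "log4j")
    exclude("org.slf4j", "slf4j-log4j12")
)
```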
thing is missing, feel free to report it here.
>
> The PRs will be merged later today.
>
>
> On Mon, Feb 6, 2017 at 4:41 PM, Dmitry Golubets <dgolub...@gmail.com>
> wrote:
> > Hi guys,
> >
> > I would appreciate if someone could explain to me what's the dif
Hi guys,
I would appreciate it if someone could explain to me the difference
between those two.
The current description refers to "dynamic scaling", and yet I can't find
anything about it in Flink's docs.
Best regards,
Dmitry
and-line-arguments-and-passing-them-around-in-your-flink-application
>
> On Thu, Jan 26, 2017 at 5:38 PM, Dmitry Golubets <dgolub...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Is there a place for user-defined configuration settings?
>> How do I read them?
>>
>> Best regards,
>> Dmitry
>>
>
>
Hi,
Is there a place for user-defined configuration settings?
How do I read them?
Best regards,
Dmitry
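A common pattern for this is ParameterTool; the names and keys below are just examples:

```scala
import org.apache.flink.api.java.utils.ParameterTool
import org.apache.flink.streaming.api.scala._

object ConfigSketch {
  def main(args: Array[String]): Unit = {
    // e.g. --kafka.topic events --batch.size 100 (or ParameterTool.fromPropertiesFile)
    val params = ParameterTool.fromArgs(args)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Makes the parameters visible in the web UI and accessible in rich functions
    // via getRuntimeContext.getExecutionConfig.getGlobalJobParameters.
    env.getConfig.setGlobalJobParameters(params)

    val topic     = params.getRequired("kafka.topic")
    val batchSize = params.getInt("batch.size", 100)
    // ... build the job using topic / batchSize ...
  }
}
```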
dependency coming from? Maybe you can resolve the
> issue on your side for now.
> I've filed a JIRA for this issue: https://issues.apache.org/jira/browse/FLINK-5661
>
>
>
> On Wed, Jan 25, 2017 at 8:24 PM, Dmitry Golubets <dgolub...@gmail.com>
> wrote:
>
I've built the latest Flink from source and it seems that the httpclient
dependency from flink-mesos is not shaded. It causes trouble with the latest
AWS SDK.
Am I building it wrong, or is it a known problem?
Best regards,
Dmitry
Hi,
I've just added my custom MsgPack serializers hoping to see a performance
increase. I covered all data types passed between chains.
However, this Kryo method still takes a lot of CPU: IdentityObjectIntMap.get.
Is there something else that should be configured?
Or is there no way to get away from Kryo
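One hedged way to find out which types are still taking the Kryo path is to disable generic types; the job then fails fast on any type that would fall back to Kryo instead of silently using it:

```scala
import org.apache.flink.streaming.api.scala._

object StrictTypesSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Any type that would be handled as a GenericType (i.e. via Kryo) now raises
    // an exception when its serializer is created, pointing at the culprit.
    env.getConfig.disableGenericTypes()
  }
}
```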
Hi,
I'm looking for the right way to do the following scheme:
1. Read data
2. Split it into partitions for parallel processing
3. In every partition, group data into N-element batches
4. Process these batches
My first attempt was:
*dataStream.keyBy(_.key).countWindow(..)*
But countWindow groups
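The snippet above is cut off, but if the goal is N-element batches per parallel subtask rather than per key, a hedged alternative is an explicit repartition followed by a buffering flatMap; Event, the batch size, and the absence of checkpointing for the buffer are all simplifications:

```scala
import org.apache.flink.api.common.functions.RichFlatMapFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector
import scala.collection.mutable.ArrayBuffer

case class Event(key: String, value: Long)   // placeholder element type

// Buffers up to `n` elements per subtask and emits them as one batch.
// Note: the buffer is not checkpointed here; a real version would keep it in
// operator state so batches survive a restart.
class Batcher[T](n: Int) extends RichFlatMapFunction[T, Seq[T]] {
  private val buf = ArrayBuffer.empty[T]
  override def flatMap(value: T, out: Collector[Seq[T]]): Unit = {
    buf += value
    if (buf.size >= n) {
      out.collect(buf.toVector)
      buf.clear()
    }
  }
}

// usage sketch: spread elements round-robin, then batch within each subtask
// dataStream.rebalance.flatMap(new Batcher[Event](100))
```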
for that.
> Overall, that seemed the more scalable design to us.
> Can your use case follow a similar approach?
>
> Stephan
>
>
>
> On Tue, Jan 17, 2017 at 10:57 AM, Dmitry Golubets <dgolub...@gmail.com>
> wrote:
>
>> Hi Timo,
>>
>> I don't have an
you have to implement
> your own operator. That depends on your use case though.
>
> You can maintain backpressure by using Flink's operator state. But did you
> also think about a Window Join instead?
>
> I hope that helps.
>
> Timo
>
>
>
>
> Am 17/01/17 um 00
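For reference, a hedged sketch of the window join Timo mentions; the Click/Impression types, key fields, and window size are made up, and event time plus watermarks are assumed to be configured elsewhere:

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

object WindowJoinSketch {
  case class Click(userId: String, url: String)
  case class Impression(userId: String, adId: String)

  // Join the two streams on userId within 5-minute tumbling event-time windows.
  def join(clicks: DataStream[Click],
           impressions: DataStream[Impression]): DataStream[(Click, Impression)] =
    clicks.join(impressions)
      .where(_.userId)
      .equalTo(_.userId)
      .window(TumblingEventTimeWindows.of(Time.minutes(5)))
      .apply((c, i) => (c, i))
}
```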
Hi,
there are only *two* interfaces defined at the moment:
*OneInputStreamOperator*
and
*TwoInputStreamOperator*.
Is there any way to define an operator with an arbitrary number of inputs?
Another concern is how to maintain *backpressure* in the operator.
Let's say I read events from two Kafka
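There is no N-input operator in the public API, but a common workaround (a sketch with placeholder types) is to map every input to a shared envelope type and union the streams in front of a single one-input operator:

```scala
import org.apache.flink.streaming.api.scala._

object UnionSketch {
  sealed trait Event                              // shared envelope (placeholder)
  case class OrderEvent(id: Long) extends Event
  case class PaymentEvent(id: Long) extends Event

  // Both inputs are widened to Event and merged; the result can then be
  // processed by a single one-input operator or function.
  def merge(orders: DataStream[OrderEvent],
            payments: DataStream[PaymentEvent]): DataStream[Event] =
    orders.map(e => e: Event).union(payments.map(e => e: Event))
}
```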
e memory consumption if data is
> serialized into a fixed number of buffers instead of being put on the JVM
> heap.
>
> Best, Fabian
>
> 2017-01-16 14:21 GMT+01:00 Dmitry Golubets <dgolub...@gmail.com>:
>
>> Hi Ufuk,
>>
>> Do you know what's the reason f
Hi Ufuk,
Do you know the reason for serializing data between different threads?
Also, thanks for the link!
Best regards,
Dmitry
On Mon, Jan 16, 2017 at 1:07 PM, Ufuk Celebi wrote:
> Hey Dmitry,
>
> this is not possible if I'm understanding you correctly.
>
> A
Hi,
Let's say we have multiple subtask chains and all of them are executing in
the same task manager slot (i.e. in the same JVM).
What's the point of serializing data between them?
Can it be disabled?
The reason I want to keep different chains is that some subtasks should be
executed in parallel to
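For completeness, chain boundaries (and therefore the serialization step between chains) are controlled per operator; a sketch with placeholder types and logic:

```scala
import org.apache.flink.streaming.api.scala._

object ChainingSketch {
  case class Raw(line: String)
  case class Parsed(key: String, value: Long)

  def parse(r: Raw): Parsed = Parsed(r.line.take(8), r.line.length.toLong)

  def pipeline(input: DataStream[Raw]): DataStream[Parsed] =
    input
      .map(parse _)
      .startNewChain()     // force a new chain (and a serialization boundary) here
      .filter(_.value > 0)
      .disableChaining()   // keep this operator out of any chain
}
```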