We will definitely also try to get the chaining overhead down a bit.
BTW: To reach this kind of throughput, you need sources that can produce
very fast...
On Fri, Sep 4, 2015 at 12:20 AM, Welly Tambunan wrote:
> Hi Stephan,
>
> That's good information to know. We will hit
Hi Stephan,
Cheers
On Fri, Sep 4, 2015 at 2:31 PM, Stephan Ewen wrote:
> We will definitely also try to get the chaining overhead down a bit.
>
> BTW: To reach this kind of throughput, you need sources that can produce
> very fast...
>
> On Fri, Sep 4, 2015 at 12:20 AM, Welly
Hi Stephan,
Thanks for your clarification.
Basically we will have lots of sensors that will push this kind of data to
a queuing system (currently we are using RabbitMQ, but will soon move to
Kafka).
We will also use the same pipeline to process the historical data.
I also want to minimize the
In a set of benchmarks a while back, we found that the chaining mechanism
has some overhead right now, because of its abstraction. The abstraction
creates iterators for each element and makes it hard for the JIT to
specialize on the operators in the chain.
For purely local chains at full speed,
Hi Stephan,
That's good information to know. We will hit that throughput easily. Our
computation graph has a lot of chaining like this right now.
I think it's safe to minimize the chain right now.
Thanks a lot for this, Stephan.
Cheers
On Thu, Sep 3, 2015 at 7:20 PM, Stephan Ewen
Hi All,
I would like to filter some items from the event stream. I think there are
two ways of doing this.
One is the regular pipeline, filter(...).map(...). We can also use flatMap
to do both in the same operator.
Is there any performance improvement if we use flatMap? As that will be done
in one
Hey Welly,
If you call filter and map one after the other like you mentioned, these
operators will be chained and executed as if they were running in the same
operator.
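To make the two variants concrete, here is a minimal sketch of the semantics only. It uses plain java.util.stream as a stand-in, not the Flink DataStream API, and the class name, the sensor readings and the "keep non-negative values" predicate are all made up for illustration. It says nothing about performance; it only shows that filter followed by map and a single flatMap produce the same elements.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class; not part of Flink.
public class FilterMapVsFlatMap {

    // Variant 1: two logical steps, a filter followed by a map.
    static List<String> filterThenMap(List<Integer> readings) {
        return readings.stream()
                .filter(r -> r >= 0)            // drop negative readings
                .map(r -> "reading=" + r)       // transform what is left
                .collect(Collectors.toList());
    }

    // Variant 2: one flatMap that emits zero or one element per input.
    static List<String> flatMapOnly(List<Integer> readings) {
        return readings.stream()
                .flatMap(r -> r >= 0
                        ? Stream.of("reading=" + r)   // keep and transform
                        : Stream.<String>empty())     // drop
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> readings = Arrays.asList(5, -1, 12, -7, 0);
        System.out.println(filterThenMap(readings));
        System.out.println(flatMapOnly(readings));
    }
}
```

Both methods return the same list for the same input, so the choice between them is about readability and per-element overhead rather than about results.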
The only small performance overhead comes from the fact that the output of
the filter will be copied before passing it as input