Hello Again,
After our discussion I implemented a new serializer called
DeflateListDelegateSerializer, which is basically Kryo's DeflateSerializer
wrapped around the stock ListDelegateSerializer:
import java.util.Collection;
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.serializers.DeflateSerializer;
import backtype.storm.serialization.types.ListDelegateSerializer;
import backtype.storm.utils.ListDelegate;

public class DeflateListDelegateSerializer extends DeflateSerializer {
    public DeflateListDelegateSerializer() {
        // Deflate-compress the output of the stock ListDelegateSerializer.
        super(new ListDelegateSerializer());
    }

    public Collection create(Kryo kryo, Input input, Class<Collection> type) {
        return new ListDelegate();
    }
}
After putting it into production I'm seeing a dramatic drop in network
traffic and, as expected, a noticeable increase in supervisor load.
I've added some graphs showing the effects.
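For reference, wiring it in is just a matter of pointing
topology.tuple.serializer at the new class when submitting; a rough sketch
(the topology name and builder contents are placeholders):

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;

public class SubmitCompressedTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        // ... spouts and bolts go here ...

        Config conf = new Config();
        // Swap the default ListDelegateSerializer for the deflate-wrapping one.
        conf.put(Config.TOPOLOGY_TUPLE_SERIALIZER,
                 DeflateListDelegateSerializer.class.getName());

        StormSubmitter.submitTopology("compressed-topology", conf,
                                      builder.createTopology());
    }
}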
On 09/18/2015 09:00 PM, Bobby Evans wrote:
https://github.com/apache/storm/blob/master/storm-core/src/jvm/backtype/storm/Config.java#L212-L217
topology.tuple.serializer
- Bobby
On Friday, September 18, 2015 12:54 PM, Onur Yalazı
<[email protected]> wrote:
Hello Bobby,
If it's possible to set a custom TupleSerializer when submitting
topologies, without touching Storm's internals, that would be a sensible
and fast way to implement compression. I actually have no idea whether
it's possible this way.
Other than that, if I understood Netty's gist correctly, enabling zlib
compression takes only two lines of code per pipeline; it's a first-class
citizen in Netty. Looking into
https://github.com/netty/netty/tree/master/codec/src/main/java/io/netty/handler/codec/compression
a good number of other compression codecs are also available.
From
https://github.com/netty/netty/blob/ed4a89082bb29b9e7d869c5d25d6b9ea8fc9d25b/example/src/main/java/io/netty/example/factorial/FactorialClientInitializer.java:
// Enable stream compression (you can remove these two if unnecessary)
pipeline.addLast(ZlibCodecFactory.newZlibEncoder(ZlibWrapper.GZIP));
pipeline.addLast(ZlibCodecFactory.newZlibDecoder(ZlibWrapper.GZIP));
On 09/18/2015 08:32 PM, Bobby Evans wrote:
Compression was just not something that we really thought about all that much.
The fastest route is probably to replace the tuple serializer with one that can
handle compression. We did something similar for encryption.
https://github.com/apache/storm/blob/master/storm-core/src/jvm/backtype/storm/security/serialization/BlowfishTupleSerializer.java
But compression is generic enough that it might be nice to make it a part of
the real TupleSerializer.
https://github.com/EsotericSoftware/kryo#compression-and-encryption
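The Kryo pattern there is basically just wrapping one serializer in
another, something along these lines (SomeClass is only a placeholder):

Kryo kryo = new Kryo();
// DeflateSerializer and FieldSerializer live in com.esotericsoftware.kryo.serializers.
kryo.register(SomeClass.class,
    new DeflateSerializer(new FieldSerializer(kryo, SomeClass.class)));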
I would also suggest that you look at Snappy, LZO, or LZ4 for your
compression, as they tend to be much faster and still get good compression
ratios.
 - Bobby
On Friday, September 18, 2015 11:51 AM, Onur Yalazı
<[email protected]> wrote:
Hello,
I'm very new to the Storm world and to this list, so hello from Turkey.
Because of a recent incident we had to increase our OpenStack network
bandwidth soft limits from 1 Gb/s to 2 Gb/s.
Even though the real problem lies in our tuple sizes and topology size, it
made me wonder whether Storm's Netty layer was using zlib encoding.
Looking into backtype.storm.messaging.netty.StormServerPipelineFactory
and its client counterpart, the pipeline does not seem to have any
compression handlers.
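For illustration, adding compression there might look roughly like the
following (untested; I'm assuming the factories build a Netty 3
ChannelPipeline and that the zlib handlers in
org.jboss.netty.handler.codec.compression are available):

import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.ChannelPipelineFactory;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.handler.codec.compression.ZlibDecoder;
import org.jboss.netty.handler.codec.compression.ZlibEncoder;
import org.jboss.netty.handler.codec.compression.ZlibWrapper;

public class CompressedPipelineFactory implements ChannelPipelineFactory {
    public ChannelPipeline getPipeline() throws Exception {
        ChannelPipeline pipeline = Channels.pipeline();
        // Compress outgoing messages, decompress incoming ones.
        pipeline.addLast("deflater", new ZlibEncoder(ZlibWrapper.GZIP));
        pipeline.addLast("inflater", new ZlibDecoder(ZlibWrapper.GZIP));
        // ... Storm's existing encoder/decoder/handler would follow here ...
        return pipeline;
    }
}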
Is it an intentional decision not to include compression in the
pipeline? I know it would need more processing power and could reduce
topology performance, but I would like to know whether it was considered
before, and if not, to raise the issue.
Thank you.