+1. Java serialization is much slower than Kryo serialization and using Java serialization must be an explicit application designer choice.

Thank you,
Vlad

On 5/17/16 11:50, Thomas Weise wrote:
IMO automatically picking a serialializer conflicts with predictable system
behavior. If the serialization does not work I would want to know that
instead of the system doing some trick and arrive at suboptimal or faulty
behavior.

That does not mean we cannot have optimizations though, as long as there is
explicit user control.

Thomas


On Tue, May 17, 2016 at 11:34 AM, Bhupesh Chawda <[email protected]>
wrote:

As Ram ans Sandesh pointed out, we do have @Bind and @DefaultSerializer
annotations. However, these are tightly coupled with the field in question
and do require modifying external code. Additionally it may also break
other systems, if we are binding it to a JavaSerializer and perhaps there
are systems which have other means of serializing the field.

My point was more to do with user having to worry about what serializer to
use and how to serialize objects.
For example, I liked the approach that Storm takes by falling back to Java
serialization automatically in case the target class does not have a
default constructor.

Of course, we can explore type based serialization. But this email was more
about the usability aspect; to handle classes not having default
constructors in general, not just POJO tuples.

~Bhupesh



On Tue, May 17, 2016 at 9:53 AM, Pramod Immaneni <[email protected]>
wrote:

Can we do a test where we hard code a codec for a POJO and compare
performance against kryo. Thereafter we can dynamically compose a
codec via pojoutils and inject it.

Thanks

On May 17, 2016, at 8:16 AM, Vlad Rozov <[email protected]>
wrote:
+1 for type based serialization. Tuples in most cases are flat
records/pojo and it should be possible programmatically construct a codec
that will significantly outperform Kryo. It should also reduce amount of
data passed over the wire. I started to look in that direction as well as
Kryo serialization is one of bottlenecks that limits Apex throughput when
operators are deployed into different containers including NODE_LOCAL
case.
Thank you,
Vlad

On 5/17/16 07:13, Sandesh Hegde wrote:
If it is possible to serialize, platform should do it automatically,
it
reduces the tribal knowledge requirement to use the platform. Couples
of
month back, I also sent out the similar email.

Type based serialization may improve the performance.

On Tue, May 17, 2016, 6:06 AM Munagala Ramanath <[email protected]
wrote:
Traditionally, we've recommended using
"@DefaultSerializer(JavaSerializer.class)" or
"@FieldSerializer.Bind(CustomSerializer.class)" as outlined at


http://docs.datatorrent.com/troubleshooting/#application-throwing-following-kryo-exception
Can you describe why those approaches are not adequate ?

Ram

On Mon, May 16, 2016 at 11:46 PM, Bhupesh Chawda <
[email protected]>
wrote:

Hi All,

While working on the integration of Apex with Apache Samoa, I am
coming
across some scenarios where I have to add default constructors in
some
external classes to make them Kryo serializable. Although this
should
be
okay, we would like to avoid modifying external classes as far as
possible.
Some other streaming engines have taken different approaches towards
serialization.

I looked at Flink and Storm serialization mechanisms.

Storm has a fall back mechanism on Java serialization. It does use
Kryo
for
serialization due to performance. But, if the class is not
serializable
using Kryo, then it will try to serialize it using Java
serialization. If
even then it cannot serialize, then it throws an error. [1]

Flink has its own serialization stack where it uses a serializer
based on
the type information known about the data. [2]

What does the community think about the current state of
serialization in
Apex. Is there a need to explore some approaches which could avoid
serialization issues such as the one described above? Are there any
other
approaches one could use?

1.
http://storm.apache.org/releases/current/Serialization.html#java-serialization
2.
https://cwiki.apache.org/confluence/display/FLINK/Type+System,+Type+Extraction,+Serialization
~Bhupesh

Reply via email to