Another important question is whether to have a separate serializer or
keep using Kryo.

Here is one write-up comparing performance; it looks like a very promising
area to work on:
https://flink.apache.org/news/2015/05/11/Juggling-with-Bits-and-Bytes.html
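
To make the comparison a bit more concrete, here is a rough sketch (not
Apex, Kryo, or Flink code) of what a hand-written codec for a flat tuple
could look like next to plain Java serialization. The Reading class, its
fields, and the FlatCodecSketch name are made up for illustration. It only
compares encoded size, not throughput, and a real comparison would also
need Kryo in the mix.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical example: none of these names come from Apex, Kryo, or Flink.
final class FlatCodecSketch {

    // A flat tuple with a fixed set of primitive fields (assumed shape).
    static final class Reading implements Serializable {
        private static final long serialVersionUID = 1L;
        final long timestamp;
        final double value;
        final int sensorId;

        Reading(long timestamp, double value, int sensorId) {
            this.timestamp = timestamp;
            this.value = value;
            this.sensorId = sensorId;
        }
    }

    // Hand-written codec: fields are written in a fixed order with no
    // class metadata, so the payload is exactly 8 + 8 + 4 bytes.
    static byte[] encode(Reading r) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(20);
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(r.timestamp);
            out.writeDouble(r.value);
            out.writeInt(r.sensorId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the fields back in the same order they were written.
    static Reading decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            long timestamp = in.readLong();
            double value = in.readDouble();
            int sensorId = in.readInt();
            return new Reading(timestamp, value, sensorId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Baseline: JDK serialization, which also writes a stream header and
    // a class descriptor alongside the field values.
    static byte[] javaSerialize(Reading r) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(r);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Reading r = new Reading(1463594820000L, 21.5, 7);
        System.out.println("hand-written codec: " + encode(r).length + " bytes");
        System.out.println("Java serialization: " + javaSerialize(r).length + " bytes");
    }
}
```

The point of the sketch is only that a codec which knows the tuple layout
up front can skip per-object type information; whether a generated codec
would beat Kryo in practice is something we would have to measure.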

On Wed, May 18, 2016 at 11:07 AM David Yan <[email protected]> wrote:

> I think if we require users to set an attribute explicitly to fall back to
> Java serialization, it kinda defeats the purpose of unblocking new users
> with Kryo serialization.
>
> I like Tim's suggestion of printing out a warning when the fallback is in
> effect.
>
> On Wed, May 18, 2016 at 10:58 AM, Sandesh Hegde <[email protected]>
> wrote:
>
> > > Users should explicitly set this attribute, as it greatly affects
> > > performance. If Kryo fails to deserialize, then instead of showing a
> > > stack trace we can show a message telling them to set the attribute.
> >
> >
> > On Wed, May 18, 2016 at 10:52 AM Timothy Farkas <
> > [email protected]> wrote:
> >
> > > > It will help a new user get something up and running quickly, but it
> > > > may leave people scratching their heads as to why performance is so
> > > > bad. If we move in the direction of an automatic fallback, I think we
> > > > should also devise a way to explicitly warn the users with something
> > > > more than just an obscure warning message. Perhaps there can be a REST
> > > > endpoint in the app master that a UI can tap into, which keeps a log
> > > > of all the tuning decisions made by the platform, and a corresponding
> > > > dtcli command? That way we can tell newcomers to check that log if
> > > > they are having any performance issues.
> > >
> > > Thanks,
> > > Tim
> > >
> > > On Wed, May 18, 2016 at 10:36 AM, David Yan <[email protected]>
> > wrote:
> > >
> > > > > I think having a fallback to Java serialization is a good thing.
> > > > > I can imagine a user having trouble with Kryo serialization of their
> > > > > operator, being unable to figure it out, and then giving up entirely
> > > > > without us even knowing.
> > > >
> > > > David
> > > >
> > > > On Tue, May 17, 2016 at 11:50 AM, Thomas Weise <
> [email protected]
> > >
> > > > wrote:
> > > >
> > > > > > IMO automatically picking a serializer conflicts with predictable
> > > > > > system behavior. If the serialization does not work, I would want
> > > > > > to know that instead of the system doing some trick and arriving
> > > > > > at suboptimal or faulty behavior.
> > > > > >
> > > > > > That does not mean we cannot have optimizations though, as long as
> > > > > > there is explicit user control.
> > > > >
> > > > > Thomas
> > > > >
> > > > >
> > > > > On Tue, May 17, 2016 at 11:34 AM, Bhupesh Chawda <
> > > > [email protected]>
> > > > > wrote:
> > > > >
> > > > > > > As Ram and Sandesh pointed out, we do have the @Bind and
> > > > > > > @DefaultSerializer annotations. However, these are tightly
> > > > > > > coupled with the field in question and do require modifying
> > > > > > > external code. Additionally, it may also break other systems if
> > > > > > > we bind it to a JavaSerializer and there are systems which have
> > > > > > > other means of serializing the field.
> > > > > >
> > > > > > > My point was more to do with the user having to worry about
> > > > > > > which serializer to use and how to serialize objects.
> > > > > > > For example, I liked the approach that Storm takes by falling
> > > > > > > back to Java serialization automatically in case the target
> > > > > > > class does not have a default constructor.
> > > > > >
> > > > > > > Of course, we can explore type-based serialization. But this
> > > > > > > email was more about the usability aspect: handling classes that
> > > > > > > do not have default constructors in general, not just POJO
> > > > > > > tuples.
> > > > > >
> > > > > > ~Bhupesh
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, May 17, 2016 at 9:53 AM, Pramod Immaneni <
> > > > [email protected]
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > > Can we do a test where we hard-code a codec for a POJO and
> > > > > > > > compare its performance against Kryo? Thereafter we can
> > > > > > > > dynamically compose a codec via PojoUtils and inject it.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > > On May 17, 2016, at 8:16 AM, Vlad Rozov <
> > [email protected]
> > > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > +1 for type-based serialization. Tuples in most cases are
> > > > > > > > > flat records/POJOs, and it should be possible to
> > > > > > > > > programmatically construct a codec that will significantly
> > > > > > > > > outperform Kryo. It should also reduce the amount of data
> > > > > > > > > passed over the wire. I started to look in that direction as
> > > > > > > > > well, as Kryo serialization is one of the bottlenecks that
> > > > > > > > > limit Apex throughput when operators are deployed into
> > > > > > > > > different containers, including the NODE_LOCAL case.
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > > Vlad
> > > > > > > >
> > > > > > > >> On 5/17/16 07:13, Sandesh Hegde wrote:
> > > > > > > > >> If it is possible to serialize, the platform should do it
> > > > > > > > >> automatically; it reduces the tribal knowledge required to
> > > > > > > > >> use the platform. A couple of months back, I also sent out
> > > > > > > > >> a similar email.
> > > > > > > >>
> > > > > > > >> Type based serialization may improve the performance.
> > > > > > > >>
> > > > > > > >>> On Tue, May 17, 2016, 6:06 AM Munagala Ramanath <
> > > > > [email protected]
> > > > > > >
> > > > > > > wrote:
> > > > > > > >>>
> > > > > > > > >>> Traditionally, we've recommended using
> > > > > > > > >>> "@DefaultSerializer(JavaSerializer.class)" or
> > > > > > > > >>> "@FieldSerializer.Bind(CustomSerializer.class)" as outlined at
> > > > > > > > >>> http://docs.datatorrent.com/troubleshooting/#application-throwing-following-kryo-exception
> > > > > > > >>>
> > > > > > > >>> Can you describe why those approaches are not adequate ?
> > > > > > > >>>
> > > > > > > >>> Ram
> > > > > > > >>>
> > > > > > > >>> On Mon, May 16, 2016 at 11:46 PM, Bhupesh Chawda <
> > > > > > > [email protected]>
> > > > > > > >>> wrote:
> > > > > > > >>>
> > > > > > > >>>> Hi All,
> > > > > > > >>>>
> > > > > > > > >>>> While working on the integration of Apex with Apache Samoa,
> > > > > > > > >>>> I am coming across some scenarios where I have to add
> > > > > > > > >>>> default constructors in some external classes to make them
> > > > > > > > >>>> Kryo serializable. Although this should be okay, we would
> > > > > > > > >>>> like to avoid modifying external classes as far as possible.
> > > > > > > > >>>> Some other streaming engines have taken different approaches
> > > > > > > > >>>> towards serialization.
> > > > > > > >>>>
> > > > > > > >>>> I looked at Flink and Storm serialization mechanisms.
> > > > > > > >>>>
> > > > > > > > >>>> Storm has a fallback mechanism to Java serialization. It
> > > > > > > > >>>> uses Kryo for serialization for performance reasons, but if
> > > > > > > > >>>> the class is not serializable using Kryo, it will try to
> > > > > > > > >>>> serialize it using Java serialization. If even that fails,
> > > > > > > > >>>> it throws an error. [1]
> > > > > > > >>>>
> > > > > > > > >>>> Flink has its own serialization stack where it uses a
> > > > > > > > >>>> serializer based on the type information known about the
> > > > > > > > >>>> data. [2]
> > > > > > > >>>>
> > > > > > > > >>>> What does the community think about the current state of
> > > > > > > > >>>> serialization in Apex? Is there a need to explore some
> > > > > > > > >>>> approaches which could avoid serialization issues such as
> > > > > > > > >>>> the one described above? Are there any other approaches one
> > > > > > > > >>>> could use?
> > > > > > > >>>>
> > > > > > > >>>> 1.
> > > > > > > >>>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://storm.apache.org/releases/current/Serialization.html#java-serialization
> > > > > > > >>>> 2.
> > > > > > > >>>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Type+System,+Type+Extraction,+Serialization
> > > > > > > >>>>
> > > > > > > >>>> ~Bhupesh
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
