I think if we require users to explicitly set an attribute to fall back to Java serialization, it kinda defeats the purpose of unblocking new users who get stuck on Kryo serialization.
I like Tim's suggestion of printing out a warning when the fallback is in effect.

On Wed, May 18, 2016 at 10:58 AM, Sandesh Hegde <[email protected]> wrote:

> Users should explicitly set this attribute, as it affects the performance
> greatly. If Kryo fails to deserialize, then instead of showing a stack
> trace, we can show a message telling them to set the attribute.
>
> On Wed, May 18, 2016 at 10:52 AM Timothy Farkas <[email protected]> wrote:
>
> > It will help a new user get something up and running quickly, but it may
> > leave people scratching their heads as to why performance is so bad. If we
> > move in the direction of an automatic fallback I think we should also
> > devise a way to explicitly warn the users with something more than just an
> > obscure warning message. Perhaps there can be a REST endpoint in the app
> > master that a UI can tap into which keeps a log of all the tuning decisions
> > made by the platform, and a corresponding dtcli command? That way we can
> > tell newcomers to check that log if they are having any performance issues.
> >
> > Thanks,
> > Tim
> >
> > On Wed, May 18, 2016 at 10:36 AM, David Yan <[email protected]> wrote:
> >
> > > I think having a fallback to Java serialization is a good thing.
> > > I can imagine a user having trouble with Kryo serialization of their
> > > operator, being unable to figure it out, and then giving up entirely
> > > without us even knowing.
> > >
> > > David
> > >
> > > On Tue, May 17, 2016 at 11:50 AM, Thomas Weise <[email protected]> wrote:
> > >
> > > > IMO automatically picking a serializer conflicts with predictable
> > > > system behavior. If the serialization does not work I would want to
> > > > know that, instead of the system doing some trick and arriving at
> > > > suboptimal or faulty behavior.
> > > >
> > > > That does not mean we cannot have optimizations though, as long as
> > > > there is explicit user control.
> > > >
> > > > Thomas
> > > >
> > > > On Tue, May 17, 2016 at 11:34 AM, Bhupesh Chawda <[email protected]> wrote:
> > > >
> > > > > As Ram and Sandesh pointed out, we do have @Bind and
> > > > > @DefaultSerializer annotations. However, these are tightly coupled
> > > > > with the field in question and do require modifying external code.
> > > > > Additionally it may also break other systems, if we are binding it
> > > > > to a JavaSerializer and perhaps there are systems which have other
> > > > > means of serializing the field.
> > > > >
> > > > > My point was more to do with the user having to worry about what
> > > > > serializer to use and how to serialize objects.
> > > > > For example, I liked the approach that Storm takes by falling back
> > > > > to Java serialization automatically in case the target class does
> > > > > not have a default constructor.
> > > > >
> > > > > Of course, we can explore type based serialization. But this email
> > > > > was more about the usability aspect; to handle classes not having
> > > > > default constructors in general, not just POJO tuples.
> > > > >
> > > > > ~Bhupesh
> > > > >
> > > > > On Tue, May 17, 2016 at 9:53 AM, Pramod Immaneni <[email protected]> wrote:
> > > > >
> > > > > > Can we do a test where we hard code a codec for a POJO and
> > > > > > compare performance against Kryo? Thereafter we can dynamically
> > > > > > compose a codec via pojoutils and inject it.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > > On May 17, 2016, at 8:16 AM, Vlad Rozov <[email protected]> wrote:
> > > > > > >
> > > > > > > +1 for type based serialization. Tuples in most cases are flat
> > > > > > > records/pojo and it should be possible to programmatically
> > > > > > > construct a codec that will significantly outperform Kryo. It
> > > > > > > should also reduce the amount of data passed over the wire. I
> > > > > > > started to look in that direction as well, as Kryo
> > > > > > > serialization is one of the bottlenecks that limits Apex
> > > > > > > throughput when operators are deployed into different
> > > > > > > containers, including the NODE_LOCAL case.
> > > > > > >
> > > > > > > Thank you,
> > > > > > > Vlad
> > > > > > >
> > > > > > > > On 5/17/16 07:13, Sandesh Hegde wrote:
> > > > > > > > If it is possible to serialize, the platform should do it
> > > > > > > > automatically; it reduces the tribal knowledge required to
> > > > > > > > use the platform. A couple of months back, I also sent out a
> > > > > > > > similar email.
> > > > > > > >
> > > > > > > > Type based serialization may improve the performance.
> > > > > > > >
> > > > > > > > > On Tue, May 17, 2016, 6:06 AM Munagala Ramanath <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > Traditionally, we've recommended using
> > > > > > > > > "@DefaultSerializer(JavaSerializer.class)" or
> > > > > > > > > "@FieldSerializer.Bind(CustomSerializer.class)" as outlined at
> > > > > > > > >
> > > > > > > > > http://docs.datatorrent.com/troubleshooting/#application-throwing-following-kryo-exception
> > > > > > > > >
> > > > > > > > > Can you describe why those approaches are not adequate?
> > > > > > > > >
> > > > > > > > > Ram
> > > > > > > > >
> > > > > > > > > On Mon, May 16, 2016 at 11:46 PM, Bhupesh Chawda <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi All,
> > > > > > > > > >
> > > > > > > > > > While working on the integration of Apex with Apache Samoa, I am
> > > > > > > > > > coming across some scenarios where I have to add default
> > > > > > > > > > constructors in some external classes to make them Kryo
> > > > > > > > > > serializable. Although this should be okay, we would like to
> > > > > > > > > > avoid modifying external classes as far as possible.
> > > > > > > > > > Some other streaming engines have taken different approaches
> > > > > > > > > > towards serialization.
> > > > > > > > > >
> > > > > > > > > > I looked at Flink and Storm serialization mechanisms.
> > > > > > > > > >
> > > > > > > > > > Storm has a fallback mechanism on Java serialization. It does
> > > > > > > > > > use Kryo for serialization due to performance. But, if the class
> > > > > > > > > > is not serializable using Kryo, then it will try to serialize it
> > > > > > > > > > using Java serialization. If even then it cannot serialize, then
> > > > > > > > > > it throws an error. [1]
> > > > > > > > > >
> > > > > > > > > > Flink has its own serialization stack where it uses a serializer
> > > > > > > > > > based on the type information known about the data. [2]
> > > > > > > > > >
> > > > > > > > > > What does the community think about the current state of
> > > > > > > > > > serialization in Apex? Is there a need to explore some
> > > > > > > > > > approaches which could avoid serialization issues such as the
> > > > > > > > > > one described above? Are there any other approaches one could
> > > > > > > > > > use?
> > > > > > > > > >
> > > > > > > > > > 1. http://storm.apache.org/releases/current/Serialization.html#java-serialization
> > > > > > > > > > 2. https://cwiki.apache.org/confluence/display/FLINK/Type+System,+Type+Extraction,+Serialization
> > > > > > > > > >
> > > > > > > > > > ~Bhupesh
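
To make the options in this thread concrete, here is a minimal sketch of an opt-in fallback that warns once per class and keeps a record of which classes fell back. It is not Apex, Storm, or Kryo code: FallbackCodecSketch, PrimaryCodec, stringsOnly() and fallbackLog() are hypothetical names, and the strings-only codec is only a stand-in for Kryo failing on some class. The one-byte tag in front of the payload is an assumption of this sketch so the reader knows which serializer wrote the bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Logger;

class FallbackCodecSketch {

    /** Hypothetical stand-in for the fast serializer (Kryo in this thread). */
    interface PrimaryCodec {
        byte[] write(Object value) throws Exception;
        Object read(byte[] bytes) throws Exception;
    }

    private static final Logger LOG = Logger.getLogger("FallbackCodecSketch");
    private static final byte TAG_PRIMARY = 0;
    private static final byte TAG_JAVA = 1;

    private final PrimaryCodec primary;
    private final boolean allowFallback; // explicit user control over the fallback
    // Classes that fell back, so a UI or CLI could report the decision later.
    private final Set<String> fallbackClasses = ConcurrentHashMap.newKeySet();

    FallbackCodecSketch(PrimaryCodec primary, boolean allowFallback) {
        this.primary = primary;
        this.allowFallback = allowFallback;
    }

    byte[] serialize(Object value) throws IOException {
        byte tag = TAG_PRIMARY;
        byte[] payload;
        try {
            payload = primary.write(value);
        } catch (Exception primaryFailure) {
            if (!allowFallback || !(value instanceof Serializable)) {
                throw new IOException("primary serializer failed, no fallback applies", primaryFailure);
            }
            String name = value.getClass().getName();
            if (fallbackClasses.add(name)) { // warn only the first time a class falls back
                LOG.warning("Falling back to Java serialization for " + name);
            }
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            payload = buffer.toByteArray();
            tag = TAG_JAVA;
        }
        // Prefix one tag byte so the reader knows which serializer produced the payload.
        byte[] framed = new byte[payload.length + 1];
        framed[0] = tag;
        System.arraycopy(payload, 0, framed, 1, payload.length);
        return framed;
    }

    Object deserialize(byte[] framed) throws IOException {
        byte[] payload = Arrays.copyOfRange(framed, 1, framed.length);
        try {
            if (framed[0] == TAG_JAVA) {
                try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
                    return in.readObject();
                }
            }
            return primary.read(payload);
        } catch (IOException e) {
            throw e;
        } catch (Exception e) {
            throw new IOException("deserialization failed", e);
        }
    }

    /** Read-only view of the classes that have fallen back so far. */
    Set<String> fallbackLog() {
        return Collections.unmodifiableSet(fallbackClasses);
    }

    /** Demo-only primary codec: handles Strings and rejects everything else. */
    static PrimaryCodec stringsOnly() {
        return new PrimaryCodec() {
            public byte[] write(Object value) {
                if (!(value instanceof String)) {
                    throw new IllegalArgumentException("unsupported type");
                }
                return ((String) value).getBytes(StandardCharsets.UTF_8);
            }
            public Object read(byte[] bytes) {
                return new String(bytes, StandardCharsets.UTF_8);
            }
        };
    }

    public static void main(String[] args) throws IOException {
        FallbackCodecSketch codec = new FallbackCodecSketch(stringsOnly(), true);
        System.out.println(codec.deserialize(codec.serialize("tuple"))); // primary path
        System.out.println(codec.deserialize(codec.serialize(42)));      // fallback path
        System.out.println("fell back for: " + codec.fallbackLog());
    }
}
```

With allowFallback set to false the same call fails loudly instead, which is the predictable behavior Thomas asked for; with it set to true, fallbackLog() is the kind of record Tim's REST endpoint or dtcli command could expose.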
