Hey Garry,

Yeah, that's using 0.6.0 configuration. Dang. I'll open a JIRA.

Cheers,
Chris

On 2/11/14 10:28 AM, "Garry Turkington" <[email protected]> wrote:

>Chris,
>
>Thanks for the clarification.
>
>BTW what I think led me astray re serializers.default was this page so
>perhaps that could be removed from there too:
>
>http://samza.incubator.apache.org/learn/documentation/0.7.0/jobs/configuration.html
>
>Thanks!
>Garry
>
>-----Original Message-----
>From: Chris Riccomini [mailto:[email protected]]
>Sent: 11 February 2014 18:06
>To: [email protected]
>Subject: Re: Serde defaults and config hierarchy
>
>Hey Garry,
>
>Only that default key is deprecated. The other ones are still used (i.e.
>the string serde snippet you've pasted is valid).
>
>Cheers,
>Chris
>
>On 2/11/14 8:38 AM, "Garry Turkington" <[email protected]>
>wrote:
>
>>Hi Chris,
>>
>>Thanks for the explanation, all makes sense.
>>
>>Is it only that one key in the serializers.* config namespace that is
>>deprecated or are entries re serializer factory no longer needed, e.g.:
>>
>>serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory
>>
>>Thanks
>>Garry
>>
>>-----Original Message-----
>>From: Chris Riccomini [mailto:[email protected]]
>>Sent: 11 February 2014 01:10
>>To: [email protected]
>>Subject: Re: Serde defaults and config hierarchy
>>
>>Hey Garry,
>>
>>The first one you have is now defunct:
>>
>>  serializers.default=string
>>
>>This was a relic of Samza 0.6.0. It shouldn't do anything.
>>
>>You are correct regarding the other two. The system-level serde defines
>>a serde that is used by default for all streams in the system. The
>>stream-level serde overrides any system-level serde, and defines the
>>serde only for the specific stream.
>>
>>Actually, no levels are required. Samza is not strongly typed
>>internally--it just uses Object everywhere. Internally, consumers and
>>producers just get objects and need to know what to do with them. If
>>you DO define a serde, Samza will pass the object through the serde,
>>and give the consumer/producer the results. This sounds a little goofy
>>until you start thinking about things like JDBC consumers, which
>>receive result sets, and never have bytes. It's easy for such consumers
>>to hand objects directly to Samza, rather than try to hack around a
>>byte-specific serde interface.
>>
>>In general, it's best practice to define key and message serdes at the
>>system level for Kafka systems, which sounds like what you've done.
>>
>>Regarding the serde package split (core vs. serializers), the main
>>reason for a separate serializers package is to isolate dependencies
>>from core.
>>We want as few dependencies as possible in samza-core, since it's a
>>low-level framework, and we want to avoid version conflicts where
>>developers might want (or need) to use version X of a library, and we
>>depend on an API-incompatible version Y of the same library.
>>
>>That said, the split is completely arbitrary right now, because Samza
>>already depends on Jackson for a number of things, and all serdes that
>>exist so far either use Jackson, or Java primitives (e.g. Integer serde).
>>We didn't put a ton of thought into that--it just evolved organically
>>to what it is now. We'll probably need to refactor it at some point.
>>
>>Cheers,
>>Chris
>>
>>On 2/10/14 3:33 PM, "Garry Turkington"
>><[email protected]>
>>wrote:
>>
>>>Hi,
>>>
>>>Damn, Chris asked for my task config which is only going to show how
>>>confused I am on serde config options. So I want to avoid any
>>>embarrassment. :)
>>>
>>>Looking at config files there seem to be 3 places to define the serde,
>>>for example:
>>>
>>>serializers.default=string
>>>systems.kafka.samza.msg.serde=string
>>>systems.kafka.streams.msgs-parsed.samza.msg.serde=string
>>>
>>>I've been reading this as the first is the default for all defined
>>>systems, the 2nd for a given system and the 3rd is specifying for a
>>>given stream. Is this correct? If so are all levels required or could
>>>I for example get away with only the 2nd if I only used Kafka and only
>>>had streams requiring the string serde? I got myself into some knots
>>>with a task with multiple streams each with different serdes so
>>>clarity would be good.
>>>
>>>And as an aside any reason why two serdes are in samza-serializers and
>>>the rest are in samza-core? At first blush it looked like a
>>>system/user-facing split but they both seem to have a mix
>>>(JSON/metrics in one, Integer/Checkpoint etc in the other).
>>>
>>>Thanks
>>>Garry
>>>
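
Pulling the thread together, the serde setup discussed above might look roughly like the fragment below. This is only a sketch, not a tested job config: the system name (kafka) and stream name (msgs-parsed) are the ones from Garry's example, while the json registry entry and its JsonSerdeFactory class are an assumption added here purely to show a stream-level override of the system-level default.

```properties
# Register serdes by name. The string entry is the one quoted in the thread;
# the json entry is an assumption, added only to illustrate an override.
serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory
serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory

# System level: default message serde for every stream in the "kafka" system.
systems.kafka.samza.msg.serde=string

# Stream level: overrides the system-level serde for this one stream only.
systems.kafka.streams.msgs-parsed.samza.msg.serde=json

# Relic of Samza 0.6.0; it shouldn't do anything, so it is left out.
# serializers.default=string
```

Neither level is required. With no serde configured, Samza hands the object through untouched, so the stream-level line is only needed where a stream differs from the system default.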
