Hey Garry,

Only that default key is deprecated. The other ones are still used (i.e. the
string serde snippet you've pasted is valid).

Cheers,
Chris

On 2/11/14 8:38 AM, "Garry Turkington" <g.turking...@improvedigital.com> wrote:

>Hi Chris,
>
>Thanks for the explanation, all makes sense.
>
>Is it only that one key in the serializers.* config namespace that is
>deprecated, or are the serializer factory entries no longer needed
>either, e.g.:
>
>serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory
>
>Thanks
>Garry
>
>-----Original Message-----
>From: Chris Riccomini [mailto:criccom...@linkedin.com]
>Sent: 11 February 2014 01:10
>To: dev@samza.incubator.apache.org
>Subject: Re: Serde defaults and config hierarchy
>
>Hey Garry,
>
>The first one you have is now defunct:
>
>  serializers.default=string
>
>This was a relic of Samza 0.6.0. It shouldn't do anything.
>
>You are correct regarding the other two. The system-level serde defines a
>serde that is used by default for all streams in the system. The
>stream-level serde overrides any system-level serde, and defines the
>serde only for the specific stream.
>
>Actually, no levels are required. Samza is not strongly typed
>internally--it just uses Object everywhere. Internally, consumers and
>producers just get objects and need to know what to do with them. If you
>DO define a serde, Samza will pass the object through the serde, and give
>the consumer/producer the results. This sounds a little goofy until you
>start thinking about things like JDBC consumers, which receive result
>sets, and never have bytes. It's easy for such consumers to hand objects
>directly to Samza, rather than try to hack around a byte-specific serde
>interface.
>
>In general, it's best practice to define key and message serdes at the
>system level for Kafka systems, which sounds like what you've done.
>
>Regarding the serde package split (core vs. serializers), the main
>reason for a separate serializers package is to isolate dependencies from
>core.
>We want as few dependencies as possible in samza-core, since it's a
>low-level framework, and we want to avoid version conflicts where
>developers might want (or need) to use version X of a library, and we
>depend on an API incompatible version Y of the same library.
>
>That said, the split is completely arbitrary right now, because Samza
>already depends on Jackson for a number of things, and all serdes that
>exist so far either use Jackson, or Java primitives (e.g. Integer serde).
>We didn't put a ton of thought into that--it just evolved organically to
>what it is now. We'll probably need to refactor it at some point.
>
>Cheers,
>Chris
>
>On 2/10/14 3:33 PM, "Garry Turkington" <g.turking...@improvedigital.com>
>wrote:
>
>>Hi,
>>
>>Damn, Chris asked for my task config which is only going to show how
>>confused I am on serde config options. So I want to avoid any
>>embarrassment. :)
>>
>>Looking at config files there seem to be 3 places to define the serde,
>>for example:
>>
>>serializers.default=string
>>systems.kafka.samza.msg.serde=string
>>systems.kafka.streams.msgs-parsed.samza.msg.serde=string
>>
>>I've been reading this as the first is the default for all defined
>>systems, the 2nd for a given system and the 3rd is specifying for a
>>given stream. Is this correct? If so are all levels required or could I
>>for example get away with only the 2nd if I only used Kafka and only
>>had streams requiring the string serde? I got myself into some knots
>>with a task with multiple streams each with different serdes so clarity
>>would be good.
>>
>>And as an aside any reason why two serdes are in samza-serializers and
>>the rest are in samza-core? At first blush it looked like a
>>system/user-facing split but they both seem to have a mix (JSON/metrics
>>in one, Integer/Checkpoint etc in the other).
>>
>>Thanks
>>Garry
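
For reference, the hierarchy discussed in this thread can be sketched as a config fragment. This is only a minimal sketch, not a complete job config. The registry and samza.msg.serde keys are the ones quoted in the thread; the samza.key.serde line is an assumption, added by analogy with samza.msg.serde to illustrate the advice about defining key serdes at the system level, so check it against the configuration docs for your Samza version.

```properties
# Register the string serde under the name "string" (key quoted in the thread).
serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory

# System level: used by default for all streams in the "kafka" system.
systems.kafka.samza.msg.serde=string
# Assumption: key-serde counterpart of the line above, not quoted in the thread.
systems.kafka.samza.key.serde=string

# Stream level: overrides the system-level serde for this one stream only.
systems.kafka.streams.msgs-parsed.samza.msg.serde=string

# Defunct relic of Samza 0.6.0; it should not do anything, so leave it out:
# serializers.default=string
```

Per the explanation in the thread, none of the levels is strictly required: with no serde defined, Samza hands objects to consumers and producers as-is. For a Kafka-only job whose streams all use the string serde, the system-level line should therefore be enough, and the stream-level line is only needed for a stream that differs.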