Typing is often hard ;) This sounds like a DSL-specific design decision. Perhaps we could start by specifying what the objectives and capabilities of this particular DSL would be. I think we would then be able to comment on the advantages and disadvantages of various choices. Otherwise, it is hard to assess how a particular choice would impact the end goal.
On Thu, Apr 28, 2016 at 5:39 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> Hi all,
>
> I started to sketch a couple of declarative DSLs (XML and JSON) on top of
> the SDK (I created a new dsl module on my local git).
>
> When using the SDK, the user "controls" and knows the type of the data.
> For instance, if the pipeline starts with a Kafka source, the user knows
> that he will have a PCollection of KafkaRecords (it can eventually use a
> coder).
>
> Imagine we have a DSL like this (just an example):
>
>   <pipeline>
>     <from uri="kafka?bootstrapServers=...&topic=..."/>
>     <to uri="hdfs://path/to/out"/>
>   </pipeline>
>
> The KafkaRecord collection from the Kafka source has to be "converted"
> into a collection of String, for instance.
>
> In the DSL, I think it makes sense to do it "implicitly". If I compare
> with what we are doing in Apache Camel, the DSL could have a DataExchange
> context where we can store a set of TypeConverters. It's basically a Map
> to convert from one type (KafkaRecord) to another type (String). It means
> that the IOs have to define the expected types (provided for a source,
> consumed for a sink).
>
> Generally speaking, we can imagine using Avro to convert any type (mostly).
>
> Thoughts?
>
> Thanks,
> Regards
> JB
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
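To make the discussion concrete, the "DataExchange context with a Map of TypeConverters" JB describes could be sketched roughly like this. This is only an illustrative sketch: the `DataExchange`, `register`, and `convert` names are hypothetical and are not actual Beam or Camel APIs; a real implementation would also have to handle converter chains and coder integration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a type-converter registry, keyed by (source, target)
// type pair, in the spirit of Camel's type-converter registry.
public class DataExchange {

    private final Map<String, Function<Object, Object>> converters = new HashMap<>();

    private static String key(Class<?> from, Class<?> to) {
        return from.getName() + "->" + to.getName();
    }

    // Register a converter from type A to type B.
    public <A, B> void register(Class<A> from, Class<B> to,
                                Function<? super A, ? extends B> fn) {
        converters.put(key(from, to), x -> fn.apply(from.cast(x)));
    }

    // Convert a value to the requested target type, or fail loudly if no
    // converter was registered for that (source, target) pair.
    public <B> B convert(Object value, Class<B> to) {
        Function<Object, Object> fn = converters.get(key(value.getClass(), to));
        if (fn == null) {
            throw new IllegalArgumentException(
                "No converter registered for " + key(value.getClass(), to));
        }
        return to.cast(fn.apply(value));
    }

    public static void main(String[] args) {
        DataExchange exchange = new DataExchange();
        // byte[] stands in here for the payload of a KafkaRecord; the real
        // record type would be whatever the Kafka source produces.
        exchange.register(byte[].class, String.class, String::new);
        System.out.println(exchange.convert("hello".getBytes(), String.class));
    }
}
```

Under this sketch, the DSL runtime would look up a converter whenever the type produced by one step (e.g. the Kafka source) does not match the type consumed by the next (e.g. an HDFS sink expecting String), which is why each IO would need to declare its provided and consumed types.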