+1 to have this feature. -Priyanka
On Tue, Jan 17, 2017 at 9:18 PM, Pramod Immaneni <pra...@datatorrent.com> wrote:

> +1
>
> On Mon, Jan 16, 2017 at 1:23 AM, Chinmay Kolhatkar <chin...@apache.org> wrote:
>
> > Hi All,
> >
> > Currently, if a user-generated DAG contains any POJOfied operators, the
> > TUPLE_CLASS attribute needs to be set on each and every port which
> > receives or sends a POJO.
> >
> > For example, if a DAG is like File -> Parser -> Transform -> Dedup ->
> > Formatter -> Kafka, then the TUPLE_CLASS attribute needs to be set by the
> > user on both the input and output ports of the transform and dedup
> > operators, and also on the parser output and the formatter input.
> >
> > The proposal here is to reduce the work required of the user to configure
> > the DAG. Technically speaking, if an operator knows its input schema and
> > processing properties, it can determine its output schema and convey it to
> > downstream operators. This way the complete pipeline can be configured
> > without the user setting TUPLE_CLASS or even creating POJOs and adding
> > them to the classpath.
> >
> > On the same idea, I want to propose an approach where the pipeline can be
> > configured without the user setting TUPLE_CLASS or even creating POJOs and
> > adding them to the classpath.
> > Here is the document which explains the idea and a high-level design:
> > https://docs.google.com/document/d/1ibLQ1KYCLTeufG7dLoHyN_tRQXEM3LR-7o_S0z_porQ/edit?usp=sharing
> >
> > I would like to get the community's opinion on the feasibility and
> > applications of this proposal.
> > Once we get some consensus, we can discuss the design in detail.
> >
> > Thanks,
> > Chinmay.
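
For anyone skimming the thread, the schema propagation idea in the proposal can be sketched roughly as below. This is only an illustration in Python, not the Apex API and not the design from the linked document: the `Schema` type, the operator functions, the field names, and `propagate` are all hypothetical. It assumes each operator can compute its output schema from its input schema plus its own configuration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical schema representation: field name -> type name.
Schema = Dict[str, str]
SchemaFn = Callable[[Schema], Schema]


def parser(_: Schema) -> Schema:
    # Assumption: the parser knows its output fields from its own
    # configuration, so it ignores the (empty) upstream schema.
    return {"id": "long", "name": "string", "amount": "double"}


def transform(schema: Schema) -> Schema:
    # Assumption: a processing property adds one derived field.
    out = dict(schema)
    out["amount_with_tax"] = "double"
    return out


def dedup(schema: Schema) -> Schema:
    # Dedup drops tuples, not fields, so the schema passes through unchanged.
    return dict(schema)


def propagate(pipeline: List[Tuple[str, SchemaFn]],
              source_schema: Schema) -> List[Tuple[str, Schema]]:
    """Walk a linear pipeline and compute each operator's output schema."""
    result = []
    current = source_schema
    for name, op in pipeline:
        current = op(current)  # output of one operator feeds the next
        result.append((name, current))
    return result


pipeline = [("Parser", parser), ("Transform", transform), ("Dedup", dedup)]
schemas = propagate(pipeline, {})

for name, schema in schemas:
    print(name, schema)
```

In this sketch the user only configures the parser's fields; every downstream schema is derived, which is the effect the proposal is after when it talks about not setting TUPLE_CLASS on each port.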