Hi Ravindra,

I have the following questions:

1. How does the DataLoadProcessorStep interface work? For each step, does it
call its child step to execute and then apply its own logic to the iterator
returned by the child? And how does it map to OutputFormat in the Hadoop
interface?
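
Just to check my understanding of the chaining, here is a rough sketch in
Java of what I imagine; the class and method names below are my own guesses,
not your actual proposal:

import java.util.Iterator;

// A rough sketch of the chaining as I understand it; the class and
// method names are assumptions, not the actual proposal.
public abstract class AbstractDataLoadProcessorStep {

  // The upstream (child) step whose output this step consumes.
  protected final AbstractDataLoadProcessorStep child;

  protected AbstractDataLoadProcessorStep(AbstractDataLoadProcessorStep child) {
    this.child = child;
  }

  // Each step asks its child to execute, then wraps the child's
  // iterator with its own row-by-row transformation.
  public abstract Iterator<Object[]> execute();
}

// Example: a step that wraps the child's iterator and transforms
// each row (e.g. parsing, dictionary encoding, sorting).
class EncodeStep extends AbstractDataLoadProcessorStep {

  EncodeStep(AbstractDataLoadProcessorStep child) {
    super(child);
  }

  @Override
  public Iterator<Object[]> execute() {
    final Iterator<Object[]> childIterator = child.execute();
    return new Iterator<Object[]>() {
      public boolean hasNext() {
        return childIterator.hasNext();
      }
      public Object[] next() {
        // Apply this step's logic to each row pulled from the child.
        return encode(childIterator.next());
      }
    };
  }

  private Object[] encode(Object[] row) {
    // dictionary encoding / conversion would happen here
    return row;
  }
}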

2. This step interface relies on an iterator to do the encoding row by row.
Would it be convenient to add batch encoder support, either now or later?
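
If batch support is added, I imagine the same chaining could carry row
batches instead of single rows, roughly like this (RowBatch and the method
name are only illustrative, not an existing API):

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical batch-oriented variant, just to illustrate the idea.
class RowBatch {
  final List<Object[]> rows = new ArrayList<Object[]>();
}

interface BatchProcessorStep {
  // Instead of one row per call, each step hands a whole batch to its
  // parent, so an encoder can amortize dictionary lookups and reduce
  // per-row virtual-call overhead.
  Iterator<RowBatch> execute(Iterator<RowBatch> childOutput);
}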

3. For the dictionary part, besides the generator, I think it is better to
also consider the interface for reading the dictionary at query time. Are
you planning to use the same interface? If so, it is not just a Generator.
If the dictionary interface is well designed, other developers can also add
new dictionary types (see the sketch after this list). For example:
- assign dictionary values based on usage frequency, for better
compression, similar to Huffman encoding
- an order-preserving dictionary which can evaluate range filters on
dictionary values directly
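
To make the idea concrete, here is a rough sketch of the kind of pluggable
interface I mean; all the names here are my own invention, not an existing
CarbonData API:

import java.util.Iterator;

// Rough sketch of a pluggable dictionary interface covering both the
// load side (generation) and the query side (lookup).
public interface CarbonDictionaryInterface {

  // Load side: return the surrogate key for a raw value, generating a
  // new one if the value has not been seen before. A frequency-based
  // implementation could assign smaller keys to more frequent values.
  int getOrGenerateSurrogateKey(byte[] value);

  // Query side: resolve a surrogate key back to the raw value.
  byte[] getValue(int surrogateKey);

  // Query side: true if key order follows value order, in which case
  // range filters can be evaluated directly on the encoded keys.
  boolean isOrderPreserving();

  // Iterate over all values, e.g. for persisting the dictionary.
  Iterator<byte[]> values();
}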

Regards,
Jacky


