Yogi Devendra created APEXMALHAR-2009:
-----------------------------------------

             Summary: concrete operator for writing to HDFS file
                 Key: APEXMALHAR-2009
                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2009
             Project: Apache Apex Malhar
          Issue Type: Task
            Reporter: Yogi Devendra
            Assignee: Yogi Devendra


Currently, for writing to an HDFS file we have AbstractFileOutputOperator in the 
Malhar library.

It has the following abstract methods:
1. protected abstract String getFileName(INPUT tuple)
2. protected abstract byte[] getBytesForTuple(INPUT tuple)
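
For illustration, this is roughly the boilerplate an app developer has to write 
today just to dump String tuples into one file. It is only a sketch: 
FileOutputBase is a hypothetical stand-in that mirrors the two abstract methods 
listed above and nothing else from the real Malhar class, and LineFileOutput 
and the file name "output.txt" are made up for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in that mirrors only the two abstract methods listed
// above; the real AbstractFileOutputOperator does far more than this.
abstract class FileOutputBase<INPUT> {
  protected abstract String getFileName(INPUT tuple);

  protected abstract byte[] getBytesForTuple(INPUT tuple);
}

// Hypothetical subclass a newcomer would have to write for String tuples.
class LineFileOutput extends FileOutputBase<String> {
  @Override
  protected String getFileName(String tuple) {
    // Assumed file name; every tuple is routed to the same file.
    return "output.txt";
  }

  @Override
  protected byte[] getBytesForTuple(String tuple) {
    // Append a newline so that each tuple lands on its own line.
    return (tuple + "\n").getBytes(StandardCharsets.UTF_8);
  }
}
```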

These methods are kept generic to give flexibility to app developers. However, 
someone who is new to Apex would look for a ready-made implementation instead 
of extending the abstract implementation.

Thus, I propose adding a concrete operator, HDFSOutputOperator, to Malhar. The 
aim of this operator is to serve as a ready-to-use operator for the most 
frequent use cases.

Here are my key observations on the most frequent use cases:
------------------------------------------------------------------------------

1. Writing tuples of type byte[] or String. 
2. All tuples on a particular stream end up in the same output file.
3. The app developer may want to add a custom tuple separator (e.g. a newline 
character) between tuples.
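
The three observations above could translate into tuple handling along the 
following lines. This is a sketch under assumptions, not the proposed 
implementation: the class name HdfsOutputSketch, the setters, the default file 
name, the newline default, and the choice to append the separator after each 
tuple are all hypothetical. The class deliberately does not extend the real 
Malhar base class so that it stays self-contained.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the tuple handling; all names here are assumptions.
class HdfsOutputSketch {
  private String fileName = "output.txt"; // observation 2: one file per stream
  private String tupleSeparator = "\n";   // observation 3: assumed default

  void setFileName(String fileName) {
    this.fileName = fileName;
  }

  void setTupleSeparator(String tupleSeparator) {
    this.tupleSeparator = tupleSeparator;
  }

  // Same file for every tuple, regardless of its content.
  String getFileName(Object tuple) {
    return fileName;
  }

  // Observation 1: accept byte[] or String, then append the separator.
  byte[] getBytesForTuple(Object tuple) {
    byte[] body;
    if (tuple instanceof byte[]) {
      body = (byte[]) tuple;
    } else if (tuple instanceof String) {
      body = ((String) tuple).getBytes(StandardCharsets.UTF_8);
    } else {
      throw new IllegalArgumentException("Only byte[] and String are handled");
    }
    byte[] sep = tupleSeparator.getBytes(StandardCharsets.UTF_8);
    byte[] out = new byte[body.length + sep.length];
    System.arraycopy(body, 0, out, 0, body.length);
    System.arraycopy(sep, 0, out, body.length, sep.length);
    return out;
  }
}
```

With a design like this, an application would only set the file name and, 
optionally, the separator, instead of subclassing anything.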



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
