Yogi Devendra created APEXMALHAR-2009:
-----------------------------------------
Summary: concrete operator for writing to HDFS file
Key: APEXMALHAR-2009
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2009
Project: Apache Apex Malhar
Issue Type: Task
Reporter: Yogi Devendra
Assignee: Yogi Devendra
Currently, for writing to an HDFS file, we have AbstractFileOutputOperator in the
Malhar library.
It has the following abstract methods:
1. protected abstract String getFileName(INPUT tuple)
2. protected abstract byte[] getBytesForTuple(INPUT tuple)
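
To illustrate what an app developer has to write today, here is a minimal sketch. FileOutputSketch is a hypothetical stand-in that mirrors only the two abstract method signatures quoted above. It is not the real AbstractFileOutputOperator and does not model its lifecycle or HDFS handling. The fixed file name "output.txt" and the newline terminator are assumptions chosen for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in: reproduces only the two abstract methods listed above.
abstract class FileOutputSketch<INPUT> {
    protected abstract String getFileName(INPUT tuple);
    protected abstract byte[] getBytesForTuple(INPUT tuple);
}

// The kind of subclass a developer must write even for the simplest case.
class StringLineOutput extends FileOutputSketch<String> {
    @Override
    protected String getFileName(String tuple) {
        // Assumption: every tuple goes to one fixed file.
        return "output.txt";
    }

    @Override
    protected byte[] getBytesForTuple(String tuple) {
        // Assumption: each tuple is written as one UTF-8 line.
        return (tuple + "\n").getBytes(StandardCharsets.UTF_8);
    }
}
```

Even this trivial case needs a dedicated subclass, which is the overhead the proposal aims to remove.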
These methods are kept generic to give app developers flexibility. However,
someone who is new to Apex would look for a ready-made implementation instead of
extending the abstract implementation.
Thus, I am proposing to add a concrete operator, HDFSOutputOperator, to Malhar.
The aim of this operator would be to serve as a ready-to-use operator for the
most frequent use cases.
Here are my key observations on the most frequent use cases:
------------------------------------------------------------------------------
1. Writing tuples of type byte[] or String.
2. All tuples on a particular stream end up in the same output file.
3. The app developer may want to add a custom tuple separator (e.g. a newline
character) between tuples.
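
The three observations above could be sketched roughly as follows. This is only an illustration of the intended behavior, not the proposed implementation: the class name HdfsOutputSketch, its constructor parameters, and the process/contents methods are hypothetical, and an in-memory buffer stands in for the HDFS file so the example runs without Hadoop. For simplicity the sketch appends the separator after every tuple, including the last one.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the proposed behavior; not the actual operator.
class HdfsOutputSketch {
    private final String fileName;   // observation 2: a single output file
    private final byte[] separator;  // observation 3: configurable separator
    // Assumption: an in-memory buffer stands in for the HDFS output stream.
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();

    HdfsOutputSketch(String fileName, String tupleSeparator) {
        this.fileName = fileName;
        this.separator = tupleSeparator.getBytes(StandardCharsets.UTF_8);
    }

    // The file name ignores the tuple, so all tuples share one file.
    String getFileName(Object tuple) {
        return fileName;
    }

    // Observation 1: accept byte[] or String tuples, then append the separator.
    byte[] getBytesForTuple(Object tuple) {
        byte[] body;
        if (tuple instanceof byte[]) {
            body = (byte[]) tuple;
        } else if (tuple instanceof String) {
            body = ((String) tuple).getBytes(StandardCharsets.UTF_8);
        } else {
            throw new IllegalArgumentException("Only byte[] and String tuples are supported");
        }
        byte[] out = new byte[body.length + separator.length];
        System.arraycopy(body, 0, out, 0, body.length);
        System.arraycopy(separator, 0, out, body.length, separator.length);
        return out;
    }

    // Writes one tuple to the stand-in sink.
    void process(Object tuple) {
        byte[] bytes = getBytesForTuple(tuple);
        sink.write(bytes, 0, bytes.length);
    }

    // Returns everything written so far, decoded as UTF-8.
    String contents() {
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With such an operator, an app developer would only set a file name and a separator instead of writing a subclass.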