[ https://issues.apache.org/jira/browse/STREAMS-342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596075#comment-14596075 ]
ASF GitHub Bot commented on STREAMS-342:
----------------------------------------
Github user eponvert commented on a diff in the pull request:
https://github.com/apache/incubator-streams/pull/234#discussion_r32946082
--- Diff: streams-contrib/streams-persist-hdfs/src/main/java/org/apache/streams/hdfs/WebHdfsPersistReader.java ---
@@ -204,6 +212,83 @@ public StreamsResultSet readCurrent() {
return current;
}
+ public StreamsDatum processLine(String line) {
+
+ String[] fields = line.split(hdfsConfiguration.getFieldDelimiter());
+
+ if( fields.length == 0)
+ return null;
+
+ String id = null;
+ DateTime ts = null;
+ Map<String, Object> metadata = null;
+ String json = null;
+
+ if( hdfsConfiguration.getFields().contains( HdfsConstants.DOC )
--- End diff --
This would be a lot more readable if you pulled
`hdfsConfiguration.getFields()` out into its own local variable.
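A rough sketch of that suggestion, not the actual patch: the diff is cut off after the first `contains` check, so everything past that point is assumed. `Config` is a hypothetical stand-in for `HdfsConfiguration`, the `ID` and `DOC` constants stand in for `HdfsConstants` values (only `DOC` appears in the diff), and mapping a field's position in the list to its column position is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FieldsRefactorSketch {

    // Hypothetical stand-ins for HdfsConstants values; only DOC appears in the diff.
    static final String ID = "ID";
    static final String DOC = "DOC";

    // Hypothetical, minimal stand-in for HdfsConfiguration.
    static final class Config {
        private final List<String> fields;
        private final String fieldDelimiter;

        Config(List<String> fields, String fieldDelimiter) {
            this.fields = fields;
            this.fieldDelimiter = fieldDelimiter;
        }

        List<String> getFields() { return fields; }
        String getFieldDelimiter() { return fieldDelimiter; }
    }

    // Returns the columns found for ID and DOC, or null when the line has no columns.
    static Map<String, String> processLine(String line, Config hdfsConfiguration) {
        String[] columns = line.split(hdfsConfiguration.getFieldDelimiter());
        if (columns.length == 0) {
            return null;
        }

        // Pulled out once, so the checks below do not repeat the getter call.
        List<String> configuredFields = hdfsConfiguration.getFields();

        // Assumption: a field's position in the list is its column position.
        Map<String, String> parsed = new LinkedHashMap<>();
        if (configuredFields.contains(ID)
                && configuredFields.indexOf(ID) < columns.length) {
            parsed.put(ID, columns[configuredFields.indexOf(ID)]);
        }
        if (configuredFields.contains(DOC)
                && configuredFields.indexOf(DOC) < columns.length) {
            parsed.put(DOC, columns[configuredFields.indexOf(DOC)]);
        }
        return parsed;
    }

    public static void main(String[] args) {
        Config config = new Config(Arrays.asList(ID, DOC), "\t");
        System.out.println(processLine("42\t{\"a\":1}", config));
    }
}
```

With the local variable in place, each condition reads as a check against `configuredFields` rather than a repeated getter chain, which is the readability gain being asked for.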
> Expose convertResultToString and processLine as public methods
> --------------------------------------------------------------
>
> Key: STREAMS-342
> URL: https://issues.apache.org/jira/browse/STREAMS-342
> Project: Streams
> Issue Type: Improvement
> Reporter: Steve Blackmon
> Assignee: Steve Blackmon
>
> Expose convertResultToString and processLine as public methods.
> This will allow Datum <-> String based on a specific HdfsConfiguration to be
> performed in frameworks such as Spark.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)