[jira] [Commented] (STREAMS-293) allow for missing metadata fields in streams-persist-hdfs

ASF GitHub Bot (JIRA) Thu, 19 Mar 2015 15:24:30 -0700

    [ https://issues.apache.org/jira/browse/STREAMS-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370242#comment-14370242 ]


ASF GitHub Bot commented on STREAMS-293:
----------------------------------------

Github user jfrazee commented on the pull request:

    https://github.com/apache/incubator-streams/pull/195#issuecomment-83785169
  
    :+1: This would be (and already is) really helpful for processing 
document-only streams on disk. 
    
    I would add, though, that it would be nice if this were a bit more 
flexible, for example by letting the user supply a function or lambda that 
defines how to process the files. That is a problem for another day, though.
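
The user-supplied function idea in the comment could look roughly like the following. This is only an illustrative sketch in Python, not code from the pull request or from streams-persist-hdfs; `read_documents` and its `parse_line` parameter are hypothetical names.

```python
import json
from typing import Any, Callable, Iterable, Iterator

# Hypothetical type for a caller-supplied function that turns one line into a record.
LineParser = Callable[[str], Any]


def read_documents(lines: Iterable[str],
                   parse_line: LineParser = json.loads) -> Iterator[Any]:
    """Yield one parsed record per non-blank line.

    By default each line is treated as a single JSON document; callers can
    pass any function or lambda to handle a different layout.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        yield parse_line(line)


# Default behaviour: one JSON document per line.
docs = list(read_documents(['{"a": 1}', "", '{"a": 2}']))

# Caller-defined behaviour: ignore a leading id column (tab delimiter assumed).
docs_without_ids = list(
    read_documents(['x1\t{"a": 1}'],
                   parse_line=lambda l: json.loads(l.split("\t", 1)[1]))
)
```

Passing the parser as an argument keeps the file-reading loop unchanged while the line layout varies per caller.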


> allow for missing metadata fields in streams-persist-hdfs
> ---------------------------------------------------------
>
>                 Key: STREAMS-293
>                 URL: https://issues.apache.org/jira/browse/STREAMS-293
>             Project: Streams
>          Issue Type: Improvement
>            Reporter: Steve Blackmon
>            Assignee: Steve Blackmon
>
> Currently the streams-persist-hdfs writer creates (and the reader expects) 
> exactly four columns. This could be made much more flexible without too much 
> effort.
>  
> Update the reader to support additional use cases:
> a) file paths containing one json document per line
> b) file paths containing just an id and json on each line
> c) file paths containing an id, timestamp, and json document on each line
> Update the writer to support:
> a) ids only
> b) ids and timestamps only
> c) ids, timestamps, and json only
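
The reader and writer cases listed above can be sketched as follows. This is a minimal Python illustration, not the streams-persist-hdfs implementation: the tab delimiter, the `Record` type, and the `parse_line` and `format_line` helpers are all assumptions. The existing four-column layout is left out because the issue does not state its column order.

```python
import json
from typing import NamedTuple, Optional

# Assumption: columns are tab-delimited; the issue does not name the delimiter.
FIELD_DELIMITER = "\t"


class Record(NamedTuple):
    id: Optional[str]
    timestamp: Optional[str]
    document: dict


def parse_line(line: str) -> Record:
    """Reader cases (a)-(c): the JSON document is always the last column."""
    fields = line.rstrip("\r\n").split(FIELD_DELIMITER)
    if len(fields) == 1:  # (a) json only
        return Record(None, None, json.loads(fields[0]))
    if len(fields) == 2:  # (b) id, json
        return Record(fields[0], None, json.loads(fields[1]))
    if len(fields) == 3:  # (c) id, timestamp, json
        return Record(fields[0], fields[1], json.loads(fields[2]))
    raise ValueError(f"expected 1 to 3 columns, got {len(fields)}")


def format_line(record_id: str,
                timestamp: Optional[str] = None,
                document: Optional[dict] = None) -> str:
    """Writer cases (a)-(c): id, then optional timestamp, then optional JSON."""
    if document is not None and timestamp is None:
        raise ValueError("this layout only writes a document after a timestamp")
    columns = [record_id]
    if timestamp is not None:
        columns.append(timestamp)
    if document is not None:
        # json.dumps without indent emits no raw tabs, so the delimiter stays unambiguous.
        columns.append(json.dumps(document))
    return FIELD_DELIMITER.join(columns)
```

One thing the sketch makes visible: a one-column line could be either a bare id (writer case a) or a bare json document (reader case a), so a real implementation would probably need configuration to say which columns a given path contains.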



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
