[ 
https://issues.apache.org/jira/browse/BEAM-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906648#comment-15906648
 ] 

Eugene Kirpichov commented on BEAM-1581:
----------------------------------------

[~aviemzur] - In retrospect, I think the XML source/sink are a bad design. As 
far as I remember, both were created mostly to demonstrate the idea of 
file-based sources/sinks, before the best practices for transform development 
took shape, and they did not grow out of experience with a range of real XML 
use cases either. We should not use them as API guidance.

The suggestion to have an abstract JsonIO seems to contradict the 
recommendation from the PTransform Style Guide (see 
https://beam.apache.org/contribute/ptransform-style-guide/#injecting-user-specified-behavior)
to use PTransform composition, rather than inheritance, as the extensibility 
mechanism whenever possible. That recommendation is aimed at exactly this kind 
of case; the better alternative is to return Strings and let the user compose 
the read with a ParDo that parses them.
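
To make the composition idea concrete, here is a minimal plain-Java sketch. It 
deliberately does not use the Beam API: readRecords is a hypothetical stand-in 
for a source that only emits Strings, parseAll is a stand-in for a ParDo, and 
extractId is a hypothetical user-supplied parser (a real pipeline would call a 
JSON library there). All names are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Plain-Java illustration only; none of these names come from Beam.
class CompositionSketch {

    // Stand-in for a text source: emits raw records as Strings and knows
    // nothing about JSON.
    static List<String> readRecords() {
        return Arrays.asList("{\"id\": 1}", "{\"id\": 2}");
    }

    // Stand-in for a ParDo: applies a user-supplied function to every record.
    // The parsing behavior is injected by composition, not by subclassing.
    static <T> List<T> parseAll(List<String> records, Function<String, T> parser) {
        List<T> out = new ArrayList<>();
        for (String record : records) {
            out.add(parser.apply(record));
        }
        return out;
    }

    // Hypothetical user-side parser: keeps only the digits of the record.
    // A real pipeline would use a proper JSON library instead.
    static int extractId(String json) {
        return Integer.parseInt(json.replaceAll("[^0-9]", ""));
    }

    public static void main(String[] args) {
        List<Integer> ids = parseAll(readRecords(), CompositionSketch::extractId);
        System.out.println(ids);
    }
}
```

The reading step stays format-agnostic, so a user who stores JSON differently 
only swaps the function passed to parseAll instead of extending a base class.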

[~eljefe6a] - "File as a self-contained JSON" means there's no JSON-specific 
logic; it's simply "File as a self-contained String". We should definitely 
have that, but under a separate JIRA issue.

Aviem / Jesse - could you perhaps come up with a list of common ways in which 
you have seen people store a collection of stuff in JSON file(s)? I think 
without that, or while keeping it implicit, we're kind of acting blindly. Let's 
list all the known use cases and abstract upward from that.

> JsonIO
> ------
>
>                 Key: BEAM-1581
>                 URL: https://issues.apache.org/jira/browse/BEAM-1581
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-extensions
>            Reporter: Aviem Zur
>            Assignee: Aviem Zur
>
> A new IO (with source and sink) which will read/write Json files.
> Similarly to {{XmlSource}}/{{XmlSink}}, this IO should have a 
> {{JsonSource}}/{{JsonSink}} which are a {{FileBasedSource}}/{{FileBasedSink}}.
> Consider using methods/code (or refactor these) found in {{AsJsons}} and 
> {{ParseJsons}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
