[ 
https://issues.apache.org/jira/browse/BEAM-13150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17445449#comment-17445449
 ] 

Brian Hulette commented on BEAM-13150:
--------------------------------------

Absolutely agree it would be great to have a good path for this.

Thinking about it more, I'm not sure it would be a good idea to make this 
change in Apache Beam, as it would introduce a circular dependency: TFX already 
depends on Beam, and Beam would then have to depend on TFX. I think we should 
either:
- Try to implement this completely in TFX (e.g. can there be a PTransform that 
produces a schema'd PCollection by reading the TF schema and generating an 
appropriate type?), or
- Add some generalizable support for defining schemas from arbitrary types in 
Beam (BEAM-8732), and then leverage that from TFX.

That being said, at this early stage it makes a lot of sense to hack on this in 
Beam, so we understand the problem and what general infrastructure we need.
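
To make the first option above more concrete, here is a minimal sketch of 
"generating an appropriate type" from a schema description. It assumes a 
simplified, hypothetical feature spec (a plain dict, not a real TF schema 
proto) and uses only the standard library; the names FEATURE_SPEC, 
named_tuple_from_spec and ExampleRow are illustrative and are not part of TFX 
or Beam.

```python
import typing

# Hypothetical stand-in for a TF schema: feature name -> (kind, is_repeated).
# A real implementation would read this from the TF schema instead.
FEATURE_SPEC = {
    "user_id": ("int", False),
    "score": ("float", False),
    "tags": ("bytes", True),
}

# Assumed mapping from the spec's kind strings to Python types.
_KIND_TO_PY = {"int": int, "float": float, "bytes": bytes}


def named_tuple_from_spec(name, spec):
    """Build a typing.NamedTuple class whose fields mirror the feature spec."""
    fields = []
    for feature, (kind, repeated) in spec.items():
        py_type = _KIND_TO_PY[kind]
        # Repeated features become sequences of the element type.
        field_type = typing.Sequence[py_type] if repeated else py_type
        fields.append((feature, field_type))
    return typing.NamedTuple(name, fields)


# Generate the row type at pipeline-construction time and build one row.
ExampleRow = named_tuple_from_spec("ExampleRow", FEATURE_SPEC)
row = ExampleRow(user_id=1, score=0.5, tags=[b"a", b"b"])
```

In the Beam Python SDK a typing.NamedTuple used as a type hint is one way to 
give a PCollection a schema, so a TFX-side PTransform could, for instance, 
attach a generated type like this with with_output_types(ExampleRow). Whether 
dynamically generated types work well enough for this use case is exactly the 
kind of thing the early hacking would need to find out.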

> Integrate TFRecord/tf.train.Example with Beam Schemas and the DataFrame API
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-13150
>                 URL: https://issues.apache.org/jira/browse/BEAM-13150
>             Project: Beam
>          Issue Type: Improvement
>          Components: dsl-dataframe, sdk-py-core
>            Reporter: Brian Hulette
>            Assignee: Brian Hulette
>            Priority: P2
>
> See discussion in BEAM-12955



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
