[ 
https://issues.apache.org/jira/browse/BEAM-6479?focusedWorklogId=258216&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-258216
 ]

ASF GitHub Bot logged work on BEAM-6479:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Jun/19 23:36
            Start Date: 11/Jun/19 23:36
    Worklog Time Spent: 10m 
      Work Description: sini commented on issue #8418: [BEAM-6479] Deprecate 
AvroIO.RecordFormatter
URL: https://github.com/apache/beam/pull/8418#issuecomment-501063020
 
 
   A question, since I use this: in my FileIO dynamic write we have a mixed 
PCollection of hundreds of different types of binary Avro records that we 
re-serialize using this sink. Would materializing an intermediate PCollection 
of GenericRecords, as the proposed alternative requires, not increase the size 
and cost of shuffle operations if the collection needed to be flushed to disk, 
since the schema would be duplicated for every record? Or was this already a 
problem with the RecordFormatter transform step, and I was just oblivious to it?
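
To make the size concern concrete, here is a minimal sketch in plain Java. It
does not use Beam or Avro and makes no claim about how Beam's coders actually
encode GenericRecords. The class name, the method names, and both byte layouts
are hypothetical. They only compare a record that carries its full schema text
with one that carries a fixed-size schema identifier.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: neither layout is Beam's or Avro's wire format.
public class SchemaOverheadSketch {

    // Layout A: [schema length][schema JSON][payload length][payload].
    // The schema text travels with every single record.
    static byte[] encodeWithSchema(String schemaJson, byte[] payload) {
        byte[] schemaBytes = schemaJson.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer
            .allocate(Integer.BYTES + schemaBytes.length + Integer.BYTES + payload.length)
            .putInt(schemaBytes.length)
            .put(schemaBytes)
            .putInt(payload.length)
            .put(payload)
            .array();
    }

    // Layout B: [8-byte schema id][payload length][payload].
    // The schema itself is assumed to be resolvable elsewhere from the id.
    static byte[] encodeWithSchemaId(long schemaId, byte[] payload) {
        return ByteBuffer
            .allocate(Long.BYTES + Integer.BYTES + payload.length)
            .putLong(schemaId)
            .putInt(payload.length)
            .put(payload)
            .array();
    }

    // Extra bytes that layout A costs over layout B for recordCount identical records.
    static long overheadBytes(String schemaJson, byte[] payload, long recordCount) {
        long perRecord = encodeWithSchema(schemaJson, payload).length
            - encodeWithSchemaId(0L, payload).length;
        return perRecord * recordCount;
    }

    public static void main(String[] args) {
        // A small made-up schema and payload, used only to print the two sizes.
        String schemaJson =
            "{\"type\":\"record\",\"name\":\"Event\",\"fields\":[{\"name\":\"id\",\"type\":\"long\"}]}";
        byte[] payload = new byte[16];
        System.out.println("with schema:    " + encodeWithSchema(schemaJson, payload).length);
        System.out.println("with schema id: " + encodeWithSchemaId(1L, payload).length);
        System.out.println("overhead for 1M records: "
            + overheadBytes(schemaJson, payload, 1_000_000L));
    }
}
```

Under these assumptions, the per-record overhead grows with the length of the
schema text, so whether the concern applies in practice depends on how the
intermediate PCollection is actually encoded.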
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 258216)
    Time Spent: 0.5h  (was: 20m)

> Deprecate AvroIO.RecordFormatter
> --------------------------------
>
>                 Key: BEAM-6479
>                 URL: https://issues.apache.org/jira/browse/BEAM-6479
>             Project: Beam
>          Issue Type: Task
>          Components: io-java-avro
>    Affects Versions: 2.9.0
>            Reporter: Romain Manni-Bucau
>            Assignee: Ismaël Mejía
>            Priority: Major
>             Fix For: 2.13.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
>  AvroIO.RecordFormatter is a user-friendly way to transform user elements 
> into Avro GenericRecords before writing to a Sink. The same result can be 
> achieved easily with a ParDo that performs the conversion, followed by a Sink 
> that knows how to write a PCollection of IndexedRecords, like the one 
> proposed in BEAM-6480.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
