[ https://issues.apache.org/jira/browse/BEAM-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17549288#comment-17549288 ]
Danny McCormick commented on BEAM-12328:
----------------------------------------
This issue has been migrated to https://github.com/apache/beam/issues/20894
> Conversion from Avro GenericRecords to Beam Rows takes too much time
> --------------------------------------------------------------------
>
> Key: BEAM-12328
> URL: https://issues.apache.org/jira/browse/BEAM-12328
> Project: Beam
> Issue Type: Improvement
> Components: dsl-sql, sdk-java-core
> Reporter: Ismaël Mejía
> Priority: P3
> Attachments: Screenshot from 2021-05-10 09-55-58.png, Screenshot from
> 2021-05-10 10-04-33.png
>
>
> While running TPC-DS SQL query 3 on Dataflow with an input dataset of 1 TB, I
> noticed that performance was poor compared with a native version
> implemented with raw GenericRecords.
> I enabled profiling and noticed that the Convert transform, which converts
> Avro GenericRecords into Beam Rows, is taking too much time,
> especially for records that are relatively simple (no nested records,
> logical types, or anything complex).
> There is definitely something odd in this transformation that can be improved.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)