[ https://issues.apache.org/jira/browse/BEAM-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052242#comment-17052242 ]
Ismaël Mejía commented on BEAM-6874:
------------------------------------

Reassigning to [~kedin] since he is the author of the mentioned class `HCatalogTableProvider`. If you are not working on this anymore, Anton, can you please assign it to someone else?

> HCatalogTableProvider always reads all rows
> -------------------------------------------
>
>                 Key: BEAM-6874
>                 URL: https://issues.apache.org/jira/browse/BEAM-6874
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql, io-java-hcatalog
>    Affects Versions: 2.11.0
>            Reporter: Near
>            Assignee: Anton Kedin
>            Priority: Major
>         Attachments: limit.png
>
> Hi,
> I'm using HCatalogTableProvider with SqlTransform.query. The query is something like "select * from `hive`.`table_name` limit 10". Despite the limit clause, the data source still reads many more rows (the Hive table's data are files on S3), even more than the number of rows in a single file (or partition).
>
> Some more details:
> # It is running on Flink.
> # I actually implemented my own HiveTableProvider because HCatalogBeamSchema only supports primitive types. However, that table provider works when I query a small table with ~1k rows.
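For reference, a minimal sketch of the kind of pipeline described above: a SqlTransform query with a LIMIT, with the HCatalog tables registered under the `hive` schema. The metastore URI, table name, and the exact factory method (`HCatalogTableProvider.create(...)`) are illustrative assumptions, not taken from the reporter's code; this is not the actual reproduction, just an approximation of the setup being discussed.

{code:java}
import java.util.Collections;
import java.util.Map;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.extensions.sql.SqlTransform;
import org.apache.beam.sdk.extensions.sql.meta.provider.hcatalog.HCatalogTableProvider;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.Row;

public class HCatalogLimitQuery {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Hypothetical metastore location; replace with the real Hive metastore URI.
    Map<String, String> hiveConfig =
        Collections.singletonMap("hive.metastore.uris", "thrift://metastore-host:9083");

    // Register the HCatalog-backed tables under the `hive` schema referenced in the query.
    PCollection<Row> rows =
        p.apply(
            SqlTransform.query("select * from `hive`.`table_name` limit 10")
                .withTableProvider("hive", HCatalogTableProvider.create(hiveConfig)));

    // As reported in this issue, the LIMIT in the SQL does not stop the HCatalog
    // source from scanning far more rows than requested (e.g. when run on Flink).
    p.run().waitUntilFinish();
  }
}
{code}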