[ https://issues.apache.org/jira/browse/BEAM-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052430#comment-17052430 ]
Anton Kedin commented on BEAM-6874:
-----------------------------------

I won't be able to work on this. [~altay], do you know who can take a look at this?

> HCatalogTableProvider always reads all rows
> -------------------------------------------
>
>                 Key: BEAM-6874
>                 URL: https://issues.apache.org/jira/browse/BEAM-6874
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql, io-java-hcatalog
>    Affects Versions: 2.11.0
>            Reporter: Near
>            Assignee: Ahmet Altay
>            Priority: Major
>         Attachments: limit.png
>
> Hi,
> I'm using HCatalogTableProvider with SqlTransform.query. The query is
> something like "select * from `hive`.`table_name` limit 10". Despite the
> limit clause, the data source still reads many more rows (the Hive table's
> data are files on S3), even more than the number of rows in one file (or
> partition).
>
> Some more details:
> # It is running on Flink.
> # I actually implemented my own HiveTableProvider because HCatalogBeamSchema
> only supports primitive types. However, the table provider works when I query
> a small table with ~1k rows.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
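For context, the setup described in the report can be sketched roughly as below. This is an illustrative reconstruction, not code from the reporter: the metastore URI, table name, and pipeline options are placeholders, and the exact table-provider wiring in the reporter's custom HiveTableProvider may differ.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.extensions.sql.SqlTransform;
import org.apache.beam.sdk.extensions.sql.meta.provider.hcatalog.HCatalogTableProvider;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.Row;

public class HCatalogLimitQuery {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Placeholder metastore configuration; not taken from the report.
    Map<String, String> configProperties = new HashMap<>();
    configProperties.put("hive.metastore.uris", "thrift://metastore-host:9083");

    // The query shape from the report: a LIMIT that is reportedly not
    // pushed down, so the source scans far more rows than requested.
    PCollection<Row> rows =
        p.apply(
            SqlTransform.query("select * from `hive`.`table_name` limit 10")
                .withTableProvider("hive", HCatalogTableProvider.create(configProperties)));

    p.run().waitUntilFinish();
  }
}
```

Under this shape, the LIMIT would be applied after the HCatalog source has already read its input, which matches the observed behavior of the scan not stopping at 10 rows.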