[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979632#comment-14979632 ]
ASF GitHub Bot commented on DRILL-3623:
---------------------------------------

Github user jacques-n commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-152047328

    Interesting. Can you explain where the time is coming from? It isn't clear to me why this will have a big impact over what we had before. While you're pushing the limit down to just above the scan nodes, we already had an optimization which avoided parallelization. Since we're pipelined this really shouldn't matter much.

    Is limit zero not working right in the limit operator? It should terminate upon receiving schema, not wait until a batch of actual records (I'm wondering if it is doing the latter). Is sending zero records through causing operators to skip compilation? In what cases was this change taking something from hundreds of seconds to a few seconds?

    I'm asking these questions so I can better understand as I want to make sure there isn't a bug somewhere else. Thanks!

> Hive query hangs with limit 0 clause
> ------------------------------------
>
>                 Key: DRILL-3623
>                 URL: https://issues.apache.org/jira/browse/DRILL-3623
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Hive
>    Affects Versions: 1.1.0
>         Environment: MapR cluster
>            Reporter: Andries Engelbrecht
>            Assignee: Jinfeng Ni
>             Fix For: Future
>
>
> Running a select * from hive.table limit 0 does not return (hangs).
> Select * from hive.table limit 1 works fine
> Hive table is about 6GB with 330 files with parquet using snappy compression.
> Data types are int, bigint, string and double.
> Querying directory with parquet files through the DFS plugin works fine
> select * from dfs.root.`/user/hive/warehouse/database/table` limit 0;

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)