[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979500#comment-14979500 ]
ASF GitHub Bot commented on DRILL-3623:
---------------------------------------

Github user jinfengni commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-152034091

    Sudheesh and I feel this new approach is more of a big optimization step towards solving the performance issue for "limit 0" queries than a hack solution:

    1) It shows a quite significant reduction in query time, from hundreds of seconds to a couple of seconds in some cases. That's a big improvement.
    2) It would benefit not only schema-based queries but also schema-less queries, while the original approach would only apply to schema-based queries.

    I agree we should continue to optimize "limit 0" queries. But for now, I think this new approach has its own merits.

    Aggregation and implicit casting are the two things that I can think of, if we go with the schema-based approach.

> Hive query hangs with limit 0 clause
> ------------------------------------
>
>                 Key: DRILL-3623
>                 URL: https://issues.apache.org/jira/browse/DRILL-3623
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Hive
>    Affects Versions: 1.1.0
>         Environment: MapR cluster
>            Reporter: Andries Engelbrecht
>            Assignee: Jinfeng Ni
>             Fix For: Future
>
>
> Running a select * from hive.table limit 0 does not return (hangs).
> Select * from hive.table limit 1 works fine.
> The Hive table is about 6 GB, stored as 330 Parquet files with Snappy compression.
> Data types are int, bigint, string and double.
> Querying the directory with the Parquet files through the DFS plugin works fine:
> select * from dfs.root.`/user/hive/warehouse/database/table` limit 0;

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)