[ 
https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979500#comment-14979500
 ] 

ASF GitHub Bot commented on DRILL-3623:
---------------------------------------

Github user jinfengni commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-152034091
  
    Sudheesh and I feel this new approach is a significant optimization step 
towards solving the performance issue for "limit 0" queries, rather than a 
hack: 1) It shows quite a significant reduction in query time, from 
hundreds of seconds to a couple of seconds in some cases. That's a big 
improvement. 2) It would benefit not only schema-based queries, but also 
schema-less queries, while the original approach would only apply to 
schema-based queries. 
    
    I agree we should continue to optimize "limit 0" queries. But for now, I 
think this new approach has its own merits.
    
    Aggregation and implicit casting are the two things that I can think of, 
if we go with the schema-based approach. 
     



> Hive query hangs with limit 0 clause
> ------------------------------------
>
>                 Key: DRILL-3623
>                 URL: https://issues.apache.org/jira/browse/DRILL-3623
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Hive
>    Affects Versions: 1.1.0
>         Environment: MapR cluster
>            Reporter: Andries Engelbrecht
>            Assignee: Jinfeng Ni
>             Fix For: Future
>
>
> Running a select * from hive.table limit 0 does not return (hangs).
> Select * from hive.table limit 1 works fine
> Hive table is about 6GB with 330 files with parquet using snappy compression.
> Data types are int, bigint, string and double.
> Querying directory with parquet files through the DFS plugin works fine
> select * from dfs.root.`/user/hive/warehouse/database/table` limit 0;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
