Hi, Drillers,

I am fairly new to Drill and am trying to understand the workflow of Drill
query execution.
While reading the fragment section of
http://drill.apache.org/docs/drill-query-execution/, I came up with some questions:

1. It looks to me like a major fragment is conceptually similar to a Spark
stage. Is there a shuffle between major fragments?
2. Each major fragment contains one or more minor fragments. Is this similar
to the Spark pipeline within one stage (a stage can contain many operators as
long as no shuffle is involved)?
3. When I run Drill locally (started with drill-embedded) and run the
following query:

select name, max(age) from hdfs.`user`.`/people.json` group by name order by
name desc limit 2

The above query should involve a shuffle, so there should be at least 2
major fragments, but the Drill web UI shows only one major fragment.
I am not sure whether my understanding above is right.
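
For reference, this is how I would inspect the plan from the shell; a minimal
sketch, assuming the same hdfs storage plugin and file as in the query above.
As far as I understand, the operator IDs in the plan output are prefixed with
the major fragment number (e.g. 00-01), so more than one prefix would indicate
more than one major fragment:

```sql
-- Print the physical plan instead of running the query.
explain plan for
select name, max(age) from hdfs.`user`.`/people.json`
group by name order by name desc limit 2;
```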

Thanks!

