I would guess that on the first run the data had to be read off disk and
the runtime code had to be compiled. Subsequent runs did not need to do
this, since both the data and the compiled classes should then be in
cache, so they are noticeably faster. The four runs after the first span
about 1.5 seconds (11.833 to 13.298), which seems like an unremarkable
amount of noise.
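
As a quick check on that arithmetic, here is a minimal Python sketch using
the five timings quoted below. The variable names are my own and purely
illustrative, and treating the first run as the only "cold" one is an
assumption, not something measured.

```python
# Elapsed times (seconds) reported by sqlline for the five runs quoted below.
timings = [15.214, 12.717, 11.833, 13.298, 12.749]

# Assumption: only the first run pays the cold-start cost
# (disk reads plus code compilation); the rest are "warm".
first, warm = timings[0], timings[1:]

# Spread of the warm runs, i.e. the run-to-run noise.
warm_range = max(warm) - min(warm)

# Average warm run, and how much slower the first run was than that.
warm_mean = sum(warm) / len(warm)
first_penalty = first - warm_mean

print(f"warm range:    {warm_range:.3f} s")
print(f"warm mean:     {warm_mean:.3f} s")
print(f"first penalty: {first_penalty:.3f} s")
```

The first run's extra cost comes out larger than the spread among the warm
runs, which is consistent with a one-time startup cost rather than noise,
though five samples are too few to say much more than that.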

On Fri, Jan 16, 2015 at 3:07 AM, mufy <[email protected]> wrote:

> Hello,
>
> I was curious to know the possible reason(s) behind the difference in
> timings observed as shown below:
>
> 0: jdbc:drill:zk=> select count(*) from
> dfs.tmp.`yelp_academic_dataset_review.json`;
> +------------+
> |   EXPR$0   |
> +------------+
> | 1125458    |
> +------------+
> 1 row selected (15.214 seconds)
>
> 0: jdbc:drill:zk=> select count(*) from
> dfs.tmp.`yelp_academic_dataset_review.json`;
> +------------+
> |   EXPR$0   |
> +------------+
> | 1125458    |
> +------------+
> 1 row selected (12.717 seconds)
>
> 0: jdbc:drill:zk=> select count(*) from
> dfs.tmp.`yelp_academic_dataset_review.json`;
> +------------+
> |   EXPR$0   |
> +------------+
> | 1125458    |
> +------------+
> 1 row selected (11.833 seconds)
>
> 0: jdbc:drill:zk=> select count(*) from
> dfs.tmp.`yelp_academic_dataset_review.json`;
> +------------+
> |   EXPR$0   |
> +------------+
> | 1125458    |
> +------------+
> 1 row selected (13.298 seconds)
>
> 0: jdbc:drill:zk=> select count(*) from
> dfs.tmp.`yelp_academic_dataset_review.json`;
> +------------+
> |   EXPR$0   |
> +------------+
> | 1125458    |
> +------------+
> 1 row selected (12.749 seconds)
>
> This was run using MapR Drill 0.7.0 on a 5 node MapR cluster.
>
>
> ---
> Mufeed Usman
> My LinkedIn <http://www.linkedin.com/pub/mufeed-usman/28/254/400> | My
> Social Cause <http://www.vision2016.org.in/> | My Blogs : LiveJournal
> <http://mufeed.livejournal.com>
>



-- 
 Steven Phillips
 Software Engineer

 mapr.com
