Yes, I went through the benchmarks and started testing this.
I tested it using Hadoop MapReduce, and there BZ appeared to run
faster than GZ. As far as I know, GZ is non-splittable and BZ is splittable.
Hadoop MR takes advantage of this splittable property and launches
multiple mappers and
Shankar,
This is expected behavior: bzip2 decompression is four to twelve times
slower than gzip decompression.
You can look at the comparison benchmark here for numbers:
http://tukaani.org/lzma/benchmarks.html
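If you want to check the decompression gap on your own machine, here is a minimal sketch in Python using only the standard library. The sample records (channelid, serverTime, a filler payload) and the helper names are assumptions made up for illustration, not the data from this thread. It only times raw single-threaded decompression, so the ratio you see will depend on your data and hardware and says nothing about Drill's own reader.

```python
import bz2
import gzip
import json
import time


def make_sample(n_lines=20000):
    """Build newline-delimited JSON bytes (hypothetical records, not the thread's data)."""
    lines = [
        json.dumps({"channelid": i % 50, "serverTime": 1470000000 + i, "payload": "x" * 200})
        for i in range(n_lines)
    ]
    return ("\n".join(lines) + "\n").encode("utf-8")


def time_decompress(decompress, blob, repeats=3):
    """Return the best wall-clock time in seconds over several decompression runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        decompress(blob)
        best = min(best, time.perf_counter() - start)
    return best


def compare(raw):
    """Compress the same bytes with gzip and bzip2, then time decompressing each."""
    gz_blob = gzip.compress(raw)
    bz_blob = bz2.compress(raw)
    return {
        "gzip": time_decompress(gzip.decompress, gz_blob),
        "bzip2": time_decompress(bz2.decompress, bz_blob),
    }


if __name__ == "__main__":
    timings = compare(make_sample())
    print(timings)  # seconds per codec; values vary by machine and data
```

Running it against a slice of the real JSON file instead of make_sample() would give a more representative comparison.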
On Thu, Aug 4, 2016 at 5:13 PM, Shankar Mane
OK, so query planning took less than one second for both aggregate
queries.
It looks like most of the time is being spent in query execution.
On Thu, Aug 4, 2016 at 5:13 PM, Shankar Mane
wrote:
> Please find the query plan for both queries. FYI: I am not seeing
>
Please find the query plan for both queries. FYI: I am not seeing
any planning difference between these 2 queries except Cost.
/Query on GZ/
0: jdbc:drill:> explain plan for select channelid, count(serverTime) from
Can you please run an explain plan on the two aggregate queries? That way
we can tell where most of the time is being spent: in the query
planning phase, or in query execution. Please share
the query plans and the time taken by those explain plan statements.
On
It is plain JSON (one JSON message per line).
Size of each JSON message = ~4 KB
Number of JSON messages = ~5 million
store.parquet.compression = snappy (I don't think this parameter is
used, as I am only running SELECT queries.)
On Mon, Aug 1, 2016 at 3:27 PM, Khurram Faraaz wrote:
> What