I'm running CDH 5.3.3 (Hadoop 2.5.0 + CDH patches)... so those two JIRAs
hopefully don't apply. I'll try the two suggested configs and report back.

Thanks!

On Wed, Sep 30, 2015 at 3:14 PM, Ryan Harris <ryan.har...@zionsbancorp.com>
wrote:

> I would suggest trying:
>
> set hive.hadoop.supports.splittable.combineinputformat = true;
>
>
>
> you might also need to increase mapreduce.input.fileinputformat.split.minsize
> to something larger, like 32MB
>
> set mapreduce.input.fileinputformat.split.minsize = 33554432;
>
>
>
> Depending on your hadoop distro and version, be potentially aware of
>
> https://issues.apache.org/jira/browse/MAPREDUCE-1597
>
> and
>
> https://issues.apache.org/jira/browse/MAPREDUCE-5537
>
>
>
> test it and see...
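>
> Putting the two suggestions together, a minimal session sketch (the 32MB
> minsize is illustrative, not a tuned recommendation, and the partition
> value below is a hypothetical example against the raw_events table from
> the original mail):
>
> ```
> -- enable combining of splittable (compressed) inputs into fewer splits
> SET hive.hadoop.supports.splittable.combineinputformat=true;
> -- raise the minimum split size so many small files get grouped together
> SET mapreduce.input.fileinputformat.split.minsize=33554432;
> -- re-run a single day's query and compare the mapper count
> SELECT COUNT(*) FROM raw_events WHERE dt='2015-09-29';
> ```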
>
>
>
> *From:* Pradeep Gollakota [mailto:pradeep...@gmail.com]
> *Sent:* Wednesday, September 30, 2015 3:33 PM
> *To:* user@hive.apache.org
> *Subject:* Re: CombineHiveInputFormat not working
>
>
>
> mapred.min.split.size = mapreduce.input.fileinputformat.split.minsize = 1
> mapred.max.split.size = mapreduce.input.fileinputformat.split.maxsize = 134217728
> hive.hadoop.supports.splittable.combineinputformat = false
>
>
>
> My average file size is pretty small... it's usually between 500K and 20MB.
>
>
>
> So it looks like splittable combine support is turned off? I've seen some
> posts on the mailing list saying there are correctness problems when using
> this with LZO.
>
>
>
> Is this still the case? Can I turn this on with LZ4?
>
>
>
> Thanks!
>
>
>
> On Wed, Sep 30, 2015 at 1:38 PM, Ryan Harris <ryan.har...@zionsbancorp.com>
> wrote:
>
> Also...
>
> mapreduce.input.fileinputformat.split.maxsize
>
>
>
> and, what is the size of your input files?
>
>
>
> *From:* Ryan Harris
> *Sent:* Wednesday, September 30, 2015 2:37 PM
> *To:* 'user@hive.apache.org'
> *Subject:* RE: CombineHiveInputFormat not working
>
>
>
> what are your values for:
>
> mapred.min.split.size
>
> mapred.max.split.size
>
> hive.hadoop.supports.splittable.combineinputformat
>
>
>
>
>
> *From:* Pradeep Gollakota [mailto:pradeep...@gmail.com]
> *Sent:* Wednesday, September 30, 2015 2:20 PM
> *To:* user@hive.apache.org
> *Subject:* CombineHiveInputFormat not working
>
>
>
> Hi all,
>
>
>
> I have an external table with the following DDL.
>
>
>
> ```
>
> DROP TABLE IF EXISTS raw_events;
>
> CREATE EXTERNAL TABLE IF NOT EXISTS raw_events (
>
>     raw_event_string string)
>
> PARTITIONED BY (dc string, community string, dt string)
>
> STORED AS TEXTFILE
>
> LOCATION '/lithium/events/{dc}/{community}/events/{year}/{month}/{day}'
>
> ```
>
>
>
> The files are loaded externally and are LZ4 compressed. When I run a query
> on this table for a single day, I'm getting 1 mapper per file even though
> the input format is set to CombineHiveInputFormat.
>
>
>
> Does anyone know if CombineHiveInputFormat does not work with LZ4
> compressed files or have any idea why split combination is not working?
>
>
>
> Thanks!
>
> Pradeep
> ------------------------------
>
> THIS ELECTRONIC MESSAGE, INCLUDING ANY ACCOMPANYING DOCUMENTS, IS
> CONFIDENTIAL and may contain information that is privileged and exempt from
> disclosure under applicable law. If you are neither the intended recipient
> nor responsible for delivering the message to the intended recipient,
> please note that any dissemination, distribution, copying or the taking of
> any action in reliance upon the message is strictly prohibited. If you have
> received this communication in error, please notify the sender immediately.
> Thank you.
>
>
>
