Re: Merging small files in partitions

2015-06-17 Thread Mohammad Islam
Hi Edward,

Can we do the same or a similar thing for Parquet files? Any pointers?

Regards,
Mohammad


On Tuesday, June 16, 2015 2:35 PM, Edward Capriolo wrote:

 https://github.com/edwardcapriolo/filecrush

On Tue, Jun 16, 2015 at 5:05 PM, Chagarlamudi, Prasanth wrote:

Hello,

I am looking for an optimized way to merge small files in Hive partitions
into one big file.

I came across Alter Table/Partition Concatenate:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionConcatenate
The doc says this only works for RCFiles. I wish there were something similar
for the TEXTFILE format.

Any suggestions?

Thanks in advance,
Prasanth

This e-mail and files transmitted with it are confidential, and are intended 
solely for the use of the individual or entity to whom this e-mail is 
addressed. If you are not the intended recipient, or the employee or agent 
responsible to deliver it to the intended recipient, you are hereby notified 
that any dissemination, distribution or copying of this communication is 
strictly prohibited. If you are not one of the named recipient(s) or otherwise 
have reason to believe that you received this message in error, please 
immediately notify sender by e-mail, and destroy the original message. Thank 
You.




  

Re: Merging small files in partitions

2015-06-16 Thread Edward Capriolo
https://github.com/edwardcapriolo/filecrush
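
Separately from filecrush, one commonly used approach for plain-text
partitions is to rewrite a partition onto itself with Hive's merge settings
switched on. The sketch below only builds the HiveQL text; it does not run
anything. The table name, partition key, column names, and the 256 MB target
are all hypothetical, and partition values are assumed to be plain strings.
Whether the result is exactly one file depends on the data size and the
configured thresholds, so treat this as a starting point, not a definitive
recipe.

```python
def build_compaction_sql(table, partition, columns,
                         target_bytes=256 * 1024 * 1024):
    """Return HiveQL that rewrites one partition onto itself with merging on.

    table     -- hypothetical table name
    partition -- dict of partition column -> string value
    columns   -- non-partition columns to carry over
    """
    # Partition values are assumed to be strings, so they are single-quoted.
    pairs = [f"{key}='{value}'" for key, value in partition.items()]
    spec = ", ".join(pairs)        # used in PARTITION (...)
    where = " AND ".join(pairs)    # used in WHERE ...
    cols = ", ".join(columns)
    return "\n".join([
        # Ask Hive to merge small output files after map-only and
        # map-reduce jobs.
        "SET hive.merge.mapfiles=true;",
        "SET hive.merge.mapredfiles=true;",
        # Trigger the merge when the average output file is smaller than
        # this, and aim for merged files of roughly this size.
        f"SET hive.merge.smallfiles.avgsize={target_bytes};",
        f"SET hive.merge.size.per.task={target_bytes};",
        # Static partition insert: select only the non-partition columns.
        f"INSERT OVERWRITE TABLE {table} PARTITION ({spec})",
        f"SELECT {cols} FROM {table} WHERE {where};",
    ])


if __name__ == "__main__":
    print(build_compaction_sql("web_logs", {"dt": "2015-06-16"},
                               ["ip", "url"]))
```

The generated statements could be saved to a file and run with the Hive CLI.
Testing on a copy of the partition first seems prudent, since INSERT
OVERWRITE replaces the existing files.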

On Tue, Jun 16, 2015 at 5:05 PM, Chagarlamudi, Prasanth <
prasanth.chagarlam...@epsilon.com> wrote:

> Hello,
>
> I am looking for an optimized way to merge small files in Hive partitions
> into one big file.
>
> I came across Alter Table/Partition Concatenate:
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionConcatenate
> The doc says this only works for RCFiles. I wish there were something
> similar for the TEXTFILE format.
>
> Any suggestions?
>
> Thanks in advance,
>
> Prasanth


Merging small files in partitions

2015-06-16 Thread Chagarlamudi, Prasanth
Hello,
I am looking for an optimized way to merge small files in Hive partitions into
one big file.
I came across Alter Table/Partition Concatenate:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionConcatenate
The doc says this only works for RCFiles. I wish there were something similar
for the TEXTFILE format.
Any suggestions?

Thanks in advance
Prasanth




