Re: Merging files

2013-07-31 Thread Something Something
Thanks, John. But I don't see an option to specify the # of output files. How does Crush decide how many files to create? Is it only based on file sizes?

On Wed, Jul 31, 2013 at 6:28 AM, John Meagher wrote:
> Here's a great tool for handling exactly that case:
> https://github.com/edwardcaprio

Re: Merging files

2013-07-31 Thread Something Something
So you are saying, we will first do a 'hadoop count' to get the total # of bytes for all files. Let's say that comes to: 1538684305

Default Block Size is: 128M

So, total # of blocks needed: 1538684305 / 131072 = 11740

Max file blocks = 11740 / 50 (# of output files) = 234

Does this calculat
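
A quick check of the arithmetic in this message, as a minimal sketch. One thing to note: 131072 bytes is 128 KiB, while a 128M block size read as 128 MiB is 134217728 bytes, so the block count depends heavily on which divisor is intended. The function names below are hypothetical, and the sketch only reproduces the division; it does not assume anything about how a particular tool interprets these numbers.

```python
# Figures quoted in the message above.
TOTAL_BYTES = 1538684305
OUTPUT_FILES = 50

KIB_128 = 128 * 1024          # 131072, the divisor used in the message
MIB_128 = 128 * 1024 * 1024   # 134217728, 128M read as 128 MiB


def blocks_needed(total_bytes, block_size_bytes):
    """Number of blocks needed to hold total_bytes, rounded up."""
    return -(-total_bytes // block_size_bytes)  # integer ceiling division


def blocks_per_file(total_blocks, output_files):
    """Blocks per output file, rounded down as in the message."""
    return total_blocks // output_files


for label, block_size in (("128 KiB", KIB_128), ("128 MiB", MIB_128)):
    blocks = blocks_needed(TOTAL_BYTES, block_size)
    per_file = blocks_per_file(blocks, OUTPUT_FILES)
    print(f"{label}: {blocks} blocks, {per_file} blocks per file")
```

With the 128 KiB divisor this reproduces the 11740 and 234 figures above. With a 128 MiB block the same data fits in 12 blocks, fewer than the 50 requested output files.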