On Fri, Jun 9, 2017 at 5:47 PM, Ling, Stephen * <[email protected]> wrote:
> I am currently using the program to split a database that is around
> 134,625,557,455 bytes in size. I've been trying to split the database
> into roughly 0.5g, 0.25g, and 0.125g pieces. The program, however, has
> been unable to split the database completely evenly, and I am just
> wondering if there is a certain limitation causing this problem.

rand | head -c 134625557455 > database
parallel -j3 -a database --pipepart --block 0.125g wc

 488108 2767530 125000180
 487866 2770031 125000013
 487224 2762455 125000532
   1117    6383    296473   <--- this is the last incomplete block
 489535 2766571 125000417
 488926 2768247 125000247

This is what we expect, as GNU Parallel only finds a split point at a \n.

If you want exactly 125000000 bytes, use --recend '':

parallel -j3 -a database --pipepart --recend '' --block 0.125g wc

 488671 2770097 125000000
 487496 2763312 125000000
 489431 2767600 125000000
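As a quick sanity check on a smaller file (a sketch, assuming GNU Parallel
and coreutils are installed; the file name 'testfile' and the 1m block size
are just illustrative, not from the original report):

# Make a 10 MB file of random bytes (example data only).
head -c 10000000 /dev/urandom > testfile

# Default behaviour: block boundaries snap to the nearest \n,
# so the byte counts are only approximately 1000000.
parallel -j3 -a testfile --pipepart --block 1m wc -c

# With --recend '' the boundaries are exact: each block reports
# 1000000 bytes (lowercase m is a decimal megabyte in GNU Parallel).
parallel -j3 -a testfile --pipepart --recend '' --block 1m wc -c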
/Ole