On Sun, Apr 26, 2015 at 06:24:18PM -0700, Michael Forney wrote:
> tar
> ---
> Since fb1595a69c091a6f6a9303b1fab19360b876d114, tar calls remove(3) on
> directories before extracting them. I'm not sure that it is reasonable
> for tar to do this because users may want to re-extract archives, or
> extract archives on top of a directory structure that already exists.
> Additionally, it is fairly common to find tar archives containing the
> "." directory (possibly with a trailing '/'), which were constructed
> using "tar -cf foo.tar .".
Yeah that makes sense I suppose.

Some things that need to be done for tar:

- Investigate the aforementioned remove vs unlink issue.
- When we tar a file, we need to make sure both the name and prefix
  fields are used if the filename is longer than 100 chars.
- Strip leading / from filenames and dangerous things like ../../ etc.

> cat, tee
> --------
> These utilities read from stdin using fread(3) into a buffer of size
> BUFSIZ. However, fread will read until it fills up the entire buffer (or
> hits EOF) before returning, causing noticeable delay when the input
> comes from other programs or scripts.
>
> To demonstrate this problem, compare the output of these commands:
>
>     for i in $(seq 500) ; do printf 0123456789abcdef ; sleep 0.005 ; done
>     for i in $(seq 500) ; do printf 0123456789abcdef ; sleep 0.005 ; done | cat
>     for i in $(seq 500) ; do printf 0123456789abcdef ; sleep 0.005 ; done | tee
>
> I considered fixing this by making the concat function take an fd
> instead and make a single call to read(2), but this causes problems for
> sponge, which uses a FILE * obtained from tmpfile(3) as both output and
> input for concat. We could also use mkstemp(3) to return a file
> descriptor, and use a FILE * from fdopen for writing, and the file
> descriptor for reading, but this seems unclean to me.

We should avoid mixing file stream I/O and raw I/O. Check out
'2.5.1 Interaction of File Descriptors and Standard I/O Streams'
in POSIX.

> Another option would be to use fgetc and fputc for the concat
> implementation, and let libc take care of the buffering. I'm not sure if
> this has any performance implications.

Sounds about right.
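
For reference, a rough sketch of what the fgetc/fputc variant could look
like. This is not the existing concat; the name concat_bytewise and its
return convention are made up for illustration. It assumes libc hands
data to fgetc as soon as the underlying read returns some, which is
typical but not something I have checked everywhere. Output to a pipe may
still be held back by the output stream's own buffering unless the
caller changes it with setvbuf(3) or flushes more often.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not the existing sbase concat(): copy "in" to
 * "out" one byte at a time and let stdio buffer both sides.
 * Returns 0 on success, -1 on a read or write error.
 */
int
concat_bytewise(FILE *in, FILE *out)
{
	int c;

	while ((c = fgetc(in)) != EOF) {
		if (fputc(c, out) == EOF)
			return -1;
	}
	/* fgetc returns EOF for both end of file and errors */
	if (ferror(in))
		return -1;
	/* push out whatever is still sitting in the output buffer */
	if (fflush(out) == EOF)
		return -1;
	return 0;
}
```

Since it only takes FILE * arguments, sponge could keep passing its
tmpfile(3) stream as before, and nothing mixes stream and raw I/O.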