On Wed, 5 Sep 2012 18:48:00 +0200 Lionel Cons wrote:
> On 5 September 2012 17:00, Glenn Fowler <g...@research.att.com> wrote:
> >
> > (1) are we talking about libast mktemp(3) or ksh mktemp(1)
> AST mktemp as plain command. We replaced the machine's native
> /usr/bin/mktemp with AST mktemp since GNU coreutils and (especially!)
> the Solaris /usr/bin/mktemp are prone to even more collisions (Solaris
> mktemp in Solaris 2.6-10 and 11 (Opensolaris didn't have the problem
> since it used the ksh93 mktemp) suffers from printing random garbage
> in rare occasions, too).
> > (2) did the original temp file exist when the dup name was generated
> I don't know. I have to ask. I'm just the messenger.

roland is correct that the low level ast routine is pathtemp()

the reason for question (2) is that pathtemp() has a collision detection loop
and mktemp(1) calls pathtemp() with an fd pointer that instructs it to
create the temp file with
    open(path, O_CREAT|O_RDWR|O_EXCL, mode)
if this fails then another pseudo-random tmp path is generated until
there is no collision

even if the range of the generated paths were limited pathtemp(3) as called
by mktemp(1) should never return success on a path that already exists
unless mktemp were called with --unsafe or with --directory

pathtemp() in this mode could fail (by not returning) if the entire range
were covered by existing files -- it would loop in a bad fashion,
trying random paths in an attempt to hit unused names

since the collision in your case was after 96 hours I don't think it's any
kind of weird filesystem timing problem

can you send the mktemp command line used?

I looked at the pathtemp() code and the range can be improved by switching
from base 32 (should have been 36!: [0-9]+[a-z]) to base 62
([0-9]+[a-z]+[A-Z]) numeric representation of the pseudorandom hash and
fixing the mktemp(1) user supplied prefix logic to use more of the hash
when the prefix length is less than the max 5 chars

I ran this test with the old and new pathtemp()

    ksh -c '
        integer n
        typeset f
        typeset -A seen
        builtin mktemp
        for p in _____ ____ ___ __ _ ""
        do
            while :
            do
                ((n++))
                f=$(mktemp -u "$p")
                if [[ ${seen[$f]} ]]
                then
                    printf "%11s %9d %9d\n" $f ${seen[$f]} $n
                    break
                fi
                seen[$f]=$n
            done
        done
    '

the test uses -u (--unsafe) so it doesn't generate any temp files
it does check for collisions using access(2), but that's not a factor for
the test

the results show that the old alg collides around ~10^4 calls regardless of
prefix size and the new alg collides around ~10^6 calls with prefix length 0

for avoiding predictability the new results are better
and they also make the collision detection loop more efficient

with mktemp prefix "":
    the old alg is limited to 32^5 = ~10^7 different names
    the new alg is limited to 32^10 = ~10^15 different names
using the max #X's (14) in the template notation
    mktemp ${prefix}XXXXXXXXXXXXXX
the old and new alg are limited to 32^14 = ~10^21 different names

again, the test calls mktemp with -u (collision detection disabled), so the
results show collisions in name generation only
the columns are: generated name, call number that first produced it,
call number that produced it again

old
    _____3e.3vj      2657      3178
     ____2c.vks      3828      4723
      ___0c.7td      4814      5268
       __1i.2q7      8004     10667
        _26.6kb     13720     14311
         3n.dc5     16113     16578

new
    _____03.44w        76       359
    ____041.2Nu      2460      2890
    ___02Vu.1De      7965     23963
    __02eD1.3p1    144094    246462
    _04p47i.3E5    316344    855852
    04yrNdG.0NY   1375224   2357065

I looked further into the code and also modified pathtemp() to call
mkdir() atomically rather than open(O_CREAT) if the prefix arg ends with "/"
this will move the mktemp --directory collision detection from mktemp(1)
to pathtemp(3)

_______________________________________________
ast-developers mailing list
ast-developers@research.att.com
https://mailman.research.att.com/mailman/listinfo/ast-developers
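
for readers who want to see the create-or-retry idea from the message in runnable form, here is a minimal shell sketch. it is not the pathtemp() code: create_unique and gen_suffix are hypothetical names, the 8 hex digit suffix is an arbitrary choice, and the retry limit is an addition meant to avoid the endless loop described above when every candidate name is taken. it assumes the shell's noclobber option (set -C) refuses to open an existing regular file, which typical shells implement with an O_EXCL open; verify that for your shell before relying on it.

```shell
# gen_suffix: hypothetical helper; sets SUFFIX to 8 random hex digits.
gen_suffix() {
    SUFFIX=$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')
}

# create_unique KIND DIR PREFIX [MAX_TRIES]   (hypothetical interface)
# KIND is "file" or "dir". Prints the created path and returns 0,
# or returns 1 after MAX_TRIES collisions (default 100).
create_unique() {
    cu_kind=$1 cu_dir=$2 cu_prefix=$3 cu_tries=${4:-100}
    while [ "$cu_tries" -gt 0 ]
    do
        gen_suffix
        cu_path=$cu_dir/$cu_prefix$SUFFIX
        if [ "$cu_kind" = dir ]
        then
            # mkdir fails if the name exists: check and create in one step
            mkdir -m 700 "$cu_path" 2>/dev/null && break
        else
            # noclobber makes ">" fail on an existing file; the subshell
            # keeps umask and set -C from leaking into the caller
            ( umask 077; set -C; : > "$cu_path" ) 2>/dev/null && break
        fi
        # collision (or other failure): try another name, up to the limit
        cu_tries=$((cu_tries - 1))
    done
    [ "$cu_tries" -gt 0 ] || return 1
    printf '%s\n' "$cu_path"
}
```

a call such as `f=$(create_unique file /tmp demo.)` either yields a path that this call created or fails after the retry limit. note that any creation failure (for example an unwritable directory) is treated like a collision here, so a real implementation would want to distinguish "already exists" from other errors.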