On Wed, 5 Sep 2012 18:48:00 +0200 Lionel Cons wrote:
> On 5 September 2012 17:00, Glenn Fowler <g...@research.att.com> wrote:
> >
> > (1) are we talking about libast mktemp(3) or ksh mktemp(1)

> AST mktemp as plain command. We replaced the machine's native
> /usr/bin/mktemp with AST mktemp since GNU coreutils and (especially!)
> the Solaris /usr/bin/mktemp are prone to even more collisions (Solaris
> mktemp in Solaris 2.6-10 and 11 (Opensolaris didn't have the problem
> since it used the ksh93 mktemp) suffers from printing random garbage
> in rare occasions, too).

> > (2) did the original temp file exist when the dup name was generated

> I don't know. I have to ask. I'm just the messenger.

roland is correct that the low level ast routine is pathtemp()

the reason for question (2) is that pathtemp() has a collision detection loop
and mktemp(1) calls pathtemp() with an fd pointer that instructs it to
create the temp file with
        open(path, O_CREAT|O_RDWR|O_EXCL, mode)
if this fails then another pseudo-random tmp path is generated
until there is no collision
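
The create-or-retry loop described above can be sketched in shell. This is only an illustration, not the pathtemp() source: create_excl and make_temp are hypothetical names, the candidate name is the process id plus a counter rather than a pseudorandom hash, and the sketch assumes the shell implements noclobber (set -C) redirection with O_CREAT|O_EXCL, which POSIX does not strictly promise.

```shell
# create_excl PATH: create PATH only if it does not already exist.
# With noclobber (set -C) the redirection fails on an existing file.
# Assumption: the shell opens the file with O_CREAT|O_EXCL.
create_excl() {
    ( set -C; : > "$1" ) 2>/dev/null
}

# make_temp DIR PREFIX: try candidate names until one is created.
# The candidate is process id plus counter (hypothetical, chosen so the
# sketch is deterministic); the retry limit of 100 is also arbitrary.
make_temp() {
    dir=$1 prefix=$2 try=0
    while [ "$try" -lt 100 ]
    do
        path=$dir/$prefix$$.$try
        if create_excl "$path"
        then
            printf '%s\n' "$path"
            return 0
        fi
        try=$((try + 1))        # name was taken: move to the next candidate
    done
    return 1                    # gave up instead of looping forever
}
```

Calling make_temp /tmp foo prints the path it created. Because the existence test and the creation are a single step, two concurrent callers cannot both succeed on the same name.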

even if the range of the generated paths were limited, pathtemp(3) as called
by mktemp(1) should never return success on a path that already exists,
unless mktemp was called with --unsafe or with --directory

pathtemp() in this mode could fail (by not returning) if the entire range
were covered by existing files -- it would loop badly, trying random
paths in search of an unused name

since the collision in your case was after 96 hours I don't think it's
any kind of weird filesystem timing problem

can you send the mktemp command line used?

I looked at the pathtemp() code and the range can be improved by switching
from base 32 (should have been 36! : [0-9]+[a-z]) to base 62 ([0-9]+[a-z]+[A-Z])
numeric representation of the pseudorandom hash
and fixing the mktemp(1) user supplied prefix logic to use more of the hash
when the prefix length is less than the max 5 chars
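
To see what the base change buys, here is a minimal sketch that encodes a number with the first 32, 36 or 62 characters of the [0-9][a-z][A-Z] alphabet. to_base and ALPHABET are hypothetical names; the digit order, and how pathtemp() actually derives and encodes its hash, are assumptions.

```shell
ALPHABET=0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

# to_base N BASE: print the non-negative integer N using the first BASE
# characters of ALPHABET (BASE must be between 2 and 62).
to_base() {
    n=$1 base=$2 out=
    while :
    do
        rest=$ALPHABET
        i=$((n % base))
        while [ "$i" -gt 0 ]    # drop i leading characters of the alphabet
        do
            rest=${rest#?}
            i=$((i - 1))
        done
        out=$(printf '%.1s' "$rest")$out    # prepend the selected digit
        n=$((n / base))
        [ "$n" -gt 0 ] || break
    done
    printf '%s\n' "$out"
}
```

For example, to_base 1000000 32 prints ugi0 and to_base 1000000 62 prints 4c92. A larger base packs more hash bits into each character, so the same number of characters covers a wider range of names.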

I ran this test with the old and new pathtemp()

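ksh -c '
        integer n               # total mktemp calls, never reset between prefixes
        typeset f
        typeset -A seen         # generated name -> number of the call that produced it
        builtin mktemp
        for p in _____ ____ ___ __ _ ""
        do      while   :
                do      ((n++))
                        f=$(mktemp -u "$p")
                        if      [[ ${seen[$f]} ]]
                        then    # columns: name, first call, colliding call
                                printf "%11s %9d %9d\n" $f ${seen[$f]} $n
                                break
                        fi
                        seen[$f]=$n
                done
        done
'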

the test uses -u (--unsafe) so it doesn't generate any temp files
it does check for collisions using access(2), but that's not a factor
for the test
the results show that the old alg collides after ~10^4 calls regardless
of prefix size
and the new alg collides after ~10^6 calls with prefix length 0

for avoiding predictability the new results are better
and they also make the collision detection loop more efficient

with mktemp prefix "":
the old alg is limited to 32^5 = ~10^7 different names
the new alg is limited to 32^10 = ~10^15 different names

using the max #X's (14) in the template notation
        mktemp ${prefix}XXXXXXXXXXXXXX
the old and new alg are limited to 32^14 = ~10^21 different names
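
As a rough cross-check of these limits, the sketch below computes the name-space size and its integer square root. By the birthday bound, a first repeat among N equally likely names is expected after roughly sqrt(N) draws. ipow and isqrt are hypothetical helpers, and uniform, independent draws are an assumption that a hash-based generator may not satisfy, so treat the result as an order-of-magnitude estimate only.

```shell
# ipow BASE EXP: integer power by repeated multiplication.
# Results must fit in the shell's signed integers, so 32^14 (2^70)
# is out of reach on a 64-bit shell and is not attempted here.
ipow() {
    result=1 i=0
    while [ "$i" -lt "$2" ]
    do
        result=$((result * $1))
        i=$((i + 1))
    done
    printf '%s\n' "$result"
}

# isqrt N: floor of the square root of N, by Newton iteration on integers.
isqrt() {
    n=$1
    [ "$n" -lt 2 ] && { printf '%s\n' "$n"; return; }
    x=$n
    y=$(( (x + 1) / 2 ))
    while [ "$y" -lt "$x" ]
    do
        x=$y
        y=$(( (x + n / x) / 2 ))
    done
    printf '%s\n' "$x"
}
```

isqrt "$(ipow 32 5)" prints 5792, the same order of magnitude as the old collision counts reported in this message.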

again, the test calls mktemp with -u (collision detection disabled), 
so the results show collisions in name generation only

old

_____3e.3vj      2657      3178
 ____2c.vks      3828      4723
  ___0c.7td      4814      5268
   __1i.2q7      8004     10667
    _26.6kb     13720     14311
     3n.dc5     16113     16578

new

_____03.44w        76       359
____041.2Nu      2460      2890
___02Vu.1De      7965     23963
__02eD1.3p1    144094    246462
_04p47i.3E5    316344    855852
04yrNdG.0NY   1375224   2357065

I looked further into the code and also modified pathtemp() to
call mkdir() atomically rather than open(O_CREAT) if the prefix
arg ends with "/"

this will move the mktemp --directory collision detection from
mktemp(1) to pathtemp(3)
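
The same retry idea applied to directories can be sketched in shell. mkdir(2) fails with EEXIST when the name is taken, so a successful mkdir is itself the collision check. make_temp_dir is a hypothetical name, and the process-id-plus-counter candidate and the retry limit are assumptions made to keep the sketch deterministic; this is not the pathtemp() change itself.

```shell
# make_temp_dir PARENT PREFIX: create a new directory and print its path.
make_temp_dir() {
    parent=$1 prefix=$2 try=0
    while [ "$try" -lt 100 ]
    do
        path=$parent/$prefix$$.$try
        # No -p here: plain mkdir fails if the name already exists, so a
        # zero exit status means this call created the directory.
        if mkdir -m 700 "$path" 2>/dev/null
        then
            printf '%s\n' "$path"
            return 0
        fi
        try=$((try + 1))
    done
    return 1
}
```

Calling make_temp_dir /tmp work prints the directory it created, or returns non-zero after 100 failed attempts.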

_______________________________________________
ast-developers mailing list
ast-developers@research.att.com
https://mailman.research.att.com/mailman/listinfo/ast-developers
