Duncan Murdoch wrote:
1) can you tell me what my original set.seed() value was?  (I wouldn't
be able to figure it out, but maybe someone can)

The only way I know is to test all 2^32 possible values of the seed. I think cryptographers would know faster ways.

Well, I'm not a cryptographer, but I know a faster way: rainbow tables.

        http://en.wikipedia.org/wiki/Rainbow_table

Given that the algorithm to generate these images is known and that each seed always gives the same image as output, you can simply precompute all possible images, hash them using your favorite algorithm -- say, SHA-256 -- and record the seed-to-hash correspondence on disk. Then given an output image, you can hash it and use that to look up the seed.

It can take a long time to generate all the images, but then you have database like lookup speeds for image-to-seed correspondence.

This is not just a theoretical idea. There are underground sites where you can put in, say, an MD5 password hash and get out the likely password that was actually used. This allows a black hat to break into one site, grab their password hash database, reverse engineer the passwords, and then go use them to bang on the front door of other sites users of that site he first compromised also use. There are defenses against this: salting the passwords and using passwords too big to appear in rainbow tables are easiest.

Now, if the seed was removed just before the values were generated, the seed would be generated from the system clock. If you knew the time that this occurred approximately, the search could be a lot faster.

This also helps with the rainbow table approach.

Given that the seed for the generation algorithm is always the current wall time, you can restrict the needed rainbow table size greatly. You simply have to know when the algorithm was first put into use, then start your rainbow table with that time's value as the first seed, and only compute up to "now" plus whatever you need for future operations.

For instance, you can cover about 3 years worth of image production in about 1/45 the time as it takes to cover all 2^32 possible images. Say it takes a month to generate a rainbow table covering those 3 years. Full coverage would then take nearly 4 years.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to