> For you or anyone wishing to put together the scripts for the current
> set.mm using your formula(s), I uploaded 3 files from the 2018 run for
> reference:
> ...
> The emails describe how we did this as well as some problems we ran into.
> Some jobs took much longer than others, and we ended up with 72 logs.
With that information it is possible to compare the schedule you used in 2018 with the schedules built by other approaches. Your schedule apparently had the form:

[41040, 57642, 44711, 45151, 50352, 45021, 49597, 46507, 42826, 43233, 44058, 43347, 49606, 48702, 45067, 39088, 45075, 44259, 44467, 50376, 48313, 45099, 47391, 225349]

with a maximum job time of 225349. Meanwhile, if one were to randomly shuffle the theorems and distribute them evenly among 24 CPUs, the maximum job time would have been ~80521, i.e. ~36% of the 2018 run. The maximum job time resulting from the LPT algorithm with the 'size' heuristic is ~67790, which is ~30% of the 2018 run. If we could predict minimization run times with absolute precision, the LPT algorithm would have managed a makespan of ~52994, which is ~24% of the 2018 run; that seems to be out of reach for now :-)

> If I understand correctly, we should first add theorems with largest
> expected run time, right? Otherwise if there are 999 fast tiny ones added
> first, they would become about evenly distributed, and the 1000th giant one
> added at the end would skew one job significantly.

The Longest Processing Time (LPT) scheduling algorithm works like this: you sort the jobs (theorems) in non-increasing order of the time they take (sizes in our case), and then assign each job, one by one, to the CPU with the smallest load (a CPU's load being the sum of the run times of the jobs assigned to it so far). It is essentially the same as if each CPU pulled a new job out of the sorted queue as soon as it completed the previous one. The best way would of course be to implement the algorithm "online", so that different copies of the metamath program pull theorems to minimize from a common sorted queue, but I don't know how to do that without some sort of threads library.

> It will be interesting to see how this works out for the current set.mm,
> which has had massive modifications in the past 2 years.
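For concreteness, here is a minimal sketch of the LPT heuristic described above, written in Python. The job sizes and CPU count are made-up illustration values, not data from the 2018 run; a real run would feed in the per-theorem 'size' estimates instead.

```python
import heapq

def lpt_schedule(sizes, num_cpus):
    """Assign jobs to CPUs with the LPT heuristic; return per-CPU loads."""
    # Sort jobs in non-increasing order of size.
    jobs = sorted(sizes, reverse=True)
    # Min-heap of (current load, cpu index): pop always yields the
    # least-loaded CPU, which is where the next job goes.
    heap = [(0, cpu) for cpu in range(num_cpus)]
    heapq.heapify(heap)
    loads = [0] * num_cpus
    for job in jobs:
        load, cpu = heapq.heappop(heap)
        loads[cpu] = load + job
        heapq.heappush(heap, (load + job, cpu))
    return loads

# Toy example: 8 jobs on 3 CPUs.
loads = lpt_schedule([7, 7, 6, 6, 5, 4, 4, 3], 3)
print(loads, max(loads))  # makespan is max(loads) → 15
```

The makespan is the largest per-CPU load; the ~67790 figure above is exactly this quantity when the heap is fed the 'size' estimates for all theorems with num_cpus = 24.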
I made a bash script to generate job1xx.cmd files automatically:

https://gist.github.com/savask/e44ee6086a1321b2bbddbce027975bae

To run it, modify the metamath_exe variable inside the script to point at your local metamath installation and run it as "./preparejobs.sh N filename", where N is the number of CPUs and filename is the database name. For example, "./preparejobs.sh 24 set.mm". It will generate N files job1xx.cmd in the same directory, in the same format as in the min2018-jobs.zip archive. My bash skills are far from perfect, so feel free to report any bugs in the script. I'm very intrigued to see whether it improves the full minimization run, so I hope it works well :-)

--
You received this message because you are subscribed to the Google Groups "Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/metamath/6742bd11-6cfb-4245-84fd-508a5ac3b5f9%40googlegroups.com.
