The hamdist assembler is quite short but, sadly, it is a Windows frame function: I am one register short and have to save a register. So it differs from the same assembler on Linux :-(

Looking at the interleaving of the temporary registers r8, r9, r10 and r11, it is not hard to do a minor reordering that reduces the temporaries from four to three, thereby gaining the extra register I need to make this a Windows leaf function. Of course, reordering is exactly what we don't want to do once this ordering has been optimised. But it does raise the question of whether optimising a version that uses three registers rather than four would give the same performance. It would hence be interesting to see what your optimiser does with a 'three register version'.

A better solution would be to get your optimiser to run on Windows assembler code, which I think ought to be possible. I am not sure that it is well enough documented to do this, but I would be willing to give it a try.

Brian