Hi all, I'm Chris, British grad student at MIT. Circa 2 years experience in Python but that is mainly in small scripts written for research purposes - am looking to get some experience writing production level code. Have an advanced PhD level understanding of physics/maths so comfortable creating algorithms but a newcomer to sympy and open source contributions - therefore, the guidance I'm looking for will mainly be in moving my code into the "real world" (doctests/docstrings/workflow/improving code readability etc). Happy to work on almost any problems and big fan of diversity so any collaboration requests would be welcome.
I've dived right into the poisson sampling problem - https://github.com/sympy/sympy/pull/13943 <https://www.google.com/url?q=https%3A%2F%2Fgithub.com%2Fsympy%2Fsympy%2Fpull%2F13943&sa=D&sntz=1&usg=AFQjCNF1yNR7f_8VexVZfgzvhsI137Iszw>. I have a working algorithm, similar to what was proposed by Leonid Kovalev. The algorithm finds an upper bound on the cdf then uses a bisection to refine the root. It works well for high values of lambda - the upper bound calculation is very low cost and bisection is O(log2(lambda)). It's only limitation comes from the maximum recursion depth for high lambda cdf calculations which I think might be an issue considering in itself. I am getting extremely variable behaviour in computing the cdf for high values of lambda. I'm not sure if anyone else who is more knowledgable in this area can suggest why this might be? The behaviour I'm observing is that calculating the cdf for values of lambda greater than ~200 gives a maximum recursion depth error. However, if the cdf is first calculated for all the lower values of lambda (say 100,120,140,160 etc) during debugging then it has no problem evaluating the cdf up to 200 and even higher (which is what is done in the previous algorithm). I'm assuming these values are temporarily stored in memory somewhere and so it requires less recursive calls for these higher values but I'm slightly at a loss as to how and why it is doing this. Any pointers appreciated. Possible solutions/workarounds could be using normal approximations (should be a good approximation above say lambda = 100, particularly considering the discrete nature of the sampling we're doing) or perhaps writing a new function specifically for the poisson cdf as it doesn't have a (particularly useful) closed form. As a side note - another very useful algorithm for sampling of random distributions is the rejection sampling method - https://en.wikipedia.org/wiki/Rejection_sampling. It does require bounds being placed on the values so is not suitable for every situation but it is fast and easy to implement. Perhaps, worth considering for method overloading in situations where there is a difficult cdf to invert and the user can provide bounds? Open to suggestions but just putting the idea out there. Looking forward to working with you all. Cheers, Chris -- You received this message because you are subscribed to the Google Groups "sympy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sympy+unsubscr...@googlegroups.com. To post to this group, send email to sympy@googlegroups.com. Visit this group at https://groups.google.com/group/sympy. To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/f7ce2cd8-87f2-49e7-a0fa-46a664657ed8%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.