Re: Issue 3129 in sympy: Drastic change to sympy.stats: Adding concept of Probability Distributions on surface level

sympy Tue, 13 Mar 2012 19:07:04 -0700

Comment #11 on issue 3129 by mrock...@gmail.com: Drastic change to sympy.stats: Adding concept of Probability Distributions on surface level

http://code.google.com/p/sympy/issues/detail?id=3129

In general, PSpaces can have many symbols bound to them. They act as indices into the distribution. The PSpace is making a value statement about these symbols/concepts.

Consider not using internal symbols at all. How do we represent probability densities? The current plan is to use a Lambda. Lambdas use internal symbols, i.e.


dist = Lambda( (x,y) , exp((x**2+y**2)/(2*pi)) / ... )

The x and y here are internal symbols. Well, I suppose we could use dummies instead, but then our lambdas look bad with names like x1, x2, .... This is a minor detail really though, so why the big deal? Well, suppose we want a random variable that corresponds to the 'y' in the probability density. How do we specify that we want the variable at index 1 and not the one at index 0? Well, we could use an index. Something like


Y = RandomSymbol('Y', dist, index=1)

The idea of using an index here seems separated from what we want. In this sense an internal symbol acts like a more conceptual index.
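
A minimal sketch of the two lookups, assuming a standard bivariate normal density in place of the elided expression above. The helpers slot_by_index and slot_by_symbol are hypothetical names used only for illustration; they are not part of sympy.stats.

```python
from sympy import Lambda, exp, pi, symbols

x, y = symbols('x y', real=True)

# Assumed density: two independent standard normals (the original is elided).
dist = Lambda((x, y), exp(-(x**2 + y**2) / 2) / (2 * pi))


def slot_by_index(lam, index):
    """Positional lookup: the caller has to know the argument order."""
    return lam.variables[index]


def slot_by_symbol(lam, symbol):
    """Symbolic lookup: the internal symbol itself names the slot."""
    if symbol not in lam.variables:
        raise ValueError(f"{symbol} is not bound by this Lambda")
    return symbol


by_index = slot_by_index(dist, 1)
by_symbol = slot_by_symbol(dist, y)
```

Both calls pick out the same slot, but only the second one survives a reordering of the Lambda's arguments.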


Regarding your example I have two comments.

(1) I suggest NormalDistribution as a name rather than Normal. I would leave Normal for syntactic sugar later on. This is a relatively minor disagreement though.

(2) The result you're getting is easy in the case of the normal distribution, but I think it's very challenging in even trivially more complex situations. How does this work in the case of a beta distribution? The current design specifically avoids any sort of special rule for well-known distributions. Everything is represented as a SymPy Expr. We fail to get some nice results but, in this sense at least, the system is much simpler.

X = BetaDistribution(2, 3).new('X')
Z = (X-2)/3
pspace(Z)

???

I suspect that a solution that attempts to make decisions like this will necessarily become very complex.
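
To make the difficulty concrete, here is a rough sketch of the expression-only route for the beta example. It does the change of variables by hand on plain SymPy expressions; the names f_X, f_Z, lo and hi are assumptions for illustration and nothing here uses the proposed BetaDistribution API.

```python
from sympy import Rational, Symbol, integrate

x = Symbol('x', real=True)
z = Symbol('z', real=True)

# Density of Beta(2, 3) on [0, 1]: x*(1 - x)**2 / B(2, 3), with B(2, 3) = 1/12.
f_X = 12 * x * (1 - x)**2

# Z = (X - 2)/3  =>  X = 3*Z + 2 and |dX/dZ| = 3.
f_Z = 3 * f_X.subs(x, 3*z + 2)

# The support of Z is the image of [0, 1] under the map: [-2/3, -1/3].
lo, hi = Rational(-2, 3), Rational(-1, 3)

# Sanity check: a density must integrate to 1 over its support.
total = integrate(f_Z, (z, lo, hi))
```

The density of Z falls out as an ordinary expression, but it is no longer a beta distribution on [0, 1], so there is no well-known distribution object for pspace(Z) to hand back.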

I think that we're trying to push too much into the concept of a distribution. I suspect that there are two separable tasks here: managing random symbol interaction and computing on distributions. I now think that the concept of a probability space is probably necessary. I think that much of the complexity of the PSpace object should be factored out into a Distribution object and that PSpace should become very simple. Hopefully much of the complexity can be simplified in this factoring process.


Some thoughts:
There should be a single PSpace class (no subclasses).
It should contain a Distribution and a set of symbols.

There should be a Distribution interface that handles things like compute_density, integrate, P, etc.
Distribution should be subclassed to Continuous and Finite and should be something like what is proposed above.

This separates two concepts that should have been separated before. I think this solution is clean.
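
A rough sketch of that split, in plain Python rather than SymPy so it stays self-contained. Only the names PSpace, Distribution, compute_density and P come from the proposal; FiniteDistribution, the dict-based storage and the string symbol are assumptions, and the Continuous subclass and integrate are left out.

```python
from abc import ABC, abstractmethod
from fractions import Fraction


class Distribution(ABC):
    """Computes on a distribution; knows nothing about random symbols."""

    @abstractmethod
    def compute_density(self, value):
        """Return the density (or mass) at a single value."""

    @abstractmethod
    def P(self, condition):
        """Return the probability that condition(value) holds."""


class FiniteDistribution(Distribution):
    """Hypothetical finite case: a mapping from outcome to probability."""

    def __init__(self, density):
        self.density = dict(density)
        if sum(self.density.values()) != 1:
            raise ValueError("probabilities must sum to 1")

    def compute_density(self, value):
        return self.density.get(value, Fraction(0))

    def P(self, condition):
        return sum(
            (p for v, p in self.density.items() if condition(v)),
            Fraction(0),
        )


class PSpace:
    """Single concrete class: a Distribution plus the symbols bound to it."""

    def __init__(self, distribution, symbols):
        self.distribution = distribution
        self.symbols = frozenset(symbols)

    def P(self, condition):
        # All computation is delegated; PSpace only tracks the symbols.
        return self.distribution.P(condition)


# Usage: a fair die bound to one symbol, here just the string "X".
die = FiniteDistribution({face: Fraction(1, 6) for face in range(1, 7)})
space = PSpace(die, {"X"})
p_even = space.P(lambda v: v % 2 == 0)
```

PSpace ends up with no distribution-specific logic at all, which is the simplification being argued for.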


