[sage-devel] Re: Expected value of probability space

2008-12-02 Thread Robert Dodier

On Dec 1, 1:48 am, "William Stein" <[EMAIL PROTECTED]> wrote:

> [Another response in this thread from David Kohel (who maybe should be
> posting on list)]

> Actually, I correct myself -- the average should be over the values
> of the function, weighted by the probabilities.  The domain of the
> function (the keys) can be in any set (e.g. "A","B","C"), so the
> current behavior is correct.

David, I have to say the current behavior seems anomalous at best.

From what I can tell, since a probability space is just a normalized
measure space, the underlying set on which the measure is
defined need not be something for which linear combinations
like A P(A) + B P(B) + C P(C) + ... are defined.

If such linear combinations are not defined, then
DiscreteProbabilitySpace.expectation should throw
an exception.

Maybe the expectation function should be moved out
of the probability space base class (I don't know what
that is in Sage) and into some subclass which can
guarantee the operation succeeds. Just a thought.
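
A rough sketch of that split, in plain Python rather than Sage. The
class names FiniteProbabilitySpace and NumericProbabilitySpace are
hypothetical and are not Sage's actual hierarchy. The base class has no
expectation at all; only a subclass that has checked its outcomes are
numbers offers one.

```python
from fractions import Fraction
from numbers import Number


class FiniteProbabilitySpace:
    """Hypothetical base class: outcomes of any kind, with probabilities."""

    def __init__(self, probabilities):
        total = sum(probabilities.values())
        if total != 1:
            raise ValueError("probabilities must sum to 1, got %s" % total)
        self._p = dict(probabilities)  # private copy of the caller's dict

    def probability(self, outcome):
        return self._p[outcome]


class NumericProbabilitySpace(FiniteProbabilitySpace):
    """Hypothetical subclass: outcomes are numbers, so sum(x * P(x)) is defined."""

    def __init__(self, probabilities):
        for outcome in probabilities:
            if not isinstance(outcome, Number):
                raise TypeError("outcome %r is not a number" % (outcome,))
        super().__init__(probabilities)

    def expectation(self):
        # Weighted sum of the outcomes themselves.
        return sum(x * p for x, p in self._p.items())


third = Fraction(1, 3)
numeric = NumericProbabilitySpace({1: third, 2: third, 3: third})
letters = FiniteProbabilitySpace({"A": third, "B": third, "C": third})
```

Here numeric.expectation() is available, while letters simply has no
expectation method, and building a NumericProbabilitySpace on "A", "B",
"C" fails early with a TypeError.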

FWIW

Robert Dodier

--~--~-~--~~~---~--~~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://www.sagemath.org
-~--~~~~--~~--~--~---



[sage-devel] Re: Expected value of probability space

2008-12-01 Thread Paul Butler

Hi David,

When I was referring to the "probability space ps" and the "random variable
ps", I was referring to the fact that ps is by inheritance a probability
space and a random variable (is_DiscreteProbabilitySpace(ps) ==
is_DiscreteRandomVariable(ps) == True).

I do see that the values are necessarily probabilities; I'm not suggesting
that that be changed. But right now if you ask for the expected value of ps,
it uses the probability function (from the constructor) as both the
probability function and the random variable.

In other words, currently ps.expectation() = sum(P(x)^2 over all x).
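
To make that concrete, here is a small plain-Python sketch. It does not
use Sage's classes, and the helper name expectation is made up for
illustration. It evaluates the same weighted sum twice on the uniform
space {1, 2, 3}: once with P as its own random variable, once with the
identity.

```python
from fractions import Fraction

# Uniform probability function on {1, 2, 3}, as in the example from this thread.
P = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}


def expectation(prob, rv):
    """Return sum(prob[x] * rv[x]) over all outcomes x (illustrative helper)."""
    return sum(prob[x] * rv[x] for x in prob)


# X(x) = P(x): the probability function doubles as the random variable,
# so the sum is P(x)^2 over all x.
as_own_variable = expectation(P, P)

# X(x) = x: the identity random variable.
identity = expectation(P, {x: x for x in P})
```

With exact fractions, as_own_variable is 1/3 (three terms of 1/9) and
identity is 2, matching the 0.333 and 2.00 discussed in this thread.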

I only brought it up because after looking at the code I was unsure if it
was intentional or just a side-effect of the class organization. I'm not
aware of any formal definition of the expectation of a probability space, so
I wanted to ask to clarify.

You mentioned that continuous random variables are missing, which is
something I wouldn't mind working on. I may follow up later with an email to
get some input on what needs to be done, if you don't mind.

-- Paul

On Mon, Dec 1, 2008 at 4:50 PM, David Kohel <[EMAIL PROTECTED]> wrote:

>
> Hi Paul,
>
> > Thanks for explaining that, I see how that causes problems when S is
> > not a set of numbers. Even so, would it make sense for the random
> > variable ps to be the identity function X(x) = x on the probability
> > space ps? Currently the random variable ps is the function
> > X(x) = P(x). Is this a useful random variable that I'm just not
> > aware of?
>
> Well, ps is a probability space (you did create it as
> DiscreteProbabilitySpace, rather than DiscreteRandomVariable, which I
> had missed in my first reply), hence its values are necessarily
> probabilities.  There is no other valid choice.
>
> To create a DiscreteRandomVariable, you currently must first create a
> probability space, then the function on that space.  One could have
> shorter constructors which assume a finite uniform probability space,
> if not given.
>
> The order of the arguments is probability space, then function or
> values, but if a probability space is no longer required, this order
> should change.
>
> --David



[sage-devel] Re: Expected value of probability space

2008-12-01 Thread David Kohel

Hi Paul,

> Thanks for explaining that, I see how that causes problems when S is not a
> set of numbers. Even so, would it make sense for the random variable ps to
> be the identity function X(x) = x on the probability space ps? Currently the
> random variable ps is the function X(x) = P(x). Is this a useful random
> variable that I'm just not aware of?

Well, ps is a probability space (you did create it as
DiscreteProbabilitySpace, rather than DiscreteRandomVariable, which I
had missed in my first reply), hence its values are necessarily
probabilities.  There is no other valid choice.

To create a DiscreteRandomVariable, you currently must first create a
probability space, then the function on that space.  One could have
shorter constructors which assume a finite uniform probability space,
if not given.

The order of the arguments is probability space, then function or
values, but if a probability space is no longer required, this order
should change.
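
A minimal sketch of such a shorter constructor, in plain Python with
dicts standing in for the Sage classes. The names uniform_probabilities
and make_random_variable are hypothetical. The function comes first and
the probability space is optional, defaulting to the uniform space on
the function's domain.

```python
from fractions import Fraction


def uniform_probabilities(outcomes):
    """Return the uniform probability function P(x) = 1/n as a dict."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one outcome")
    n = len(outcomes)
    return {x: Fraction(1, n) for x in outcomes}


def make_random_variable(function, space=None):
    """Hypothetical constructor: function first, probability space optional.

    If no space is given, assume the uniform one on the function's domain.
    Returns the pair (space, function), both as plain dicts.
    """
    if space is None:
        space = uniform_probabilities(function.keys())
    missing = set(function) - set(space)
    if missing:
        raise ValueError("function defined outside the space: %r" % (missing,))
    return space, dict(function)


# Uniform space on "A", "B", "C" is created behind the scenes.
space, f = make_random_variable({"A": 1, "B": 2, "C": 3})
```

An explicitly constructed space can still be passed as the second
argument when the distribution is not uniform.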

--David





[sage-devel] Re: Expected value of probability space

2008-12-01 Thread Paul Butler

Hi David,

Thanks for explaining that, I see how that causes problems when S is not a
set of numbers. Even so, would it make sense for the random variable ps to
be the identity function X(x) = x on the probability space ps? Currently the
random variable ps is the function X(x) = P(x). Is this a useful random
variable that I'm just not aware of?

One way to accomplish this is to override expectation in the
DiscreteProbabilitySpace class with the one you sent earlier. We would also
need to do this for variance, covariance, etc., so I admit this is probably
not the ideal solution.
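
For reference, variance and covariance are themselves expectations,
which is why they would be affected as well. The sketch below is plain
Python with made-up helper names, not the Sage methods, and spells the
three quantities out on a finite space.

```python
from fractions import Fraction


def expectation(prob, rv):
    """E[X] = sum of prob[x] * rv[x] over all outcomes x."""
    return sum(prob[x] * rv[x] for x in prob)


def variance(prob, rv):
    """Var(X) = E[(X - E[X])^2]."""
    mean = expectation(prob, rv)
    return expectation(prob, {x: (rv[x] - mean) ** 2 for x in prob})


def covariance(prob, rv1, rv2):
    """Cov(X, Y) = E[(X - E[X]) * (Y - E[Y])]."""
    m1 = expectation(prob, rv1)
    m2 = expectation(prob, rv2)
    return expectation(prob, {x: (rv1[x] - m1) * (rv2[x] - m2) for x in prob})


P = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
X = {1: 1, 2: 2, 3: 3}  # identity random variable
Y = {1: 3, 2: 2, 3: 1}  # reversed values, for the covariance example
```

On this space variance(P, X) is 2/3 and covariance(P, X, Y) is -2/3.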

-- Paul

On Mon, Dec 1, 2008 at 3:48 AM, William Stein <[EMAIL PROTECTED]> wrote:

>
> [Another response in this thread from David Kohel (who maybe should be
> posting on list)]
>
> Hi William and Paul,
>
> Actually, I correct myself -- the average should be over the values
> of the function, weighted by the probabilities.  The domain of the
> function (the keys) can be in any set (e.g. "A","B","C"), so the
> current behavior is correct.
>
> In the definition of the probability space itself:
>
> sage: ps = DiscreteProbabilitySpace([1,2,3],{1:1/3,2:1/3,3:1/3})
>
> Consider replacing the domain of ps with such a set.  Then it
> should be clear that you can't average over "A", "B", and "C".
>
> S = ["A","B","C"]
> P = {}
> for i in range(3):
>   P[S[i]] = 1/3
>
> ps = DiscreteProbabilitySpace(S,P)
> ps.expectation() # 0.33
>
> This is the random variable with
>
>  f("A") = 1,
>  f("B") = 2,
>  f("C") = 3,
>
> for which the expectation is 2:
>
> f = {}
> for i in range(3):
>   f[S[i]] = i+1
>
> rv = DiscreteRandomVariable(ps,f)
> rv.expectation() # 2.00
>
> On the other hand, I'd be happy with some syntax for creating a
> random variable with the uniform distribution P(x) = 1/n on some
> set or tuple. Currently this gives an error:
>
> sage: rv = DiscreteRandomVariable(S,f)
>
> but it could easily create the uniform distribution on S behind
> the scenes.  The reason I didn't do this is the following:
>
> sage: ps1 = DiscreteProbabilitySpace(S,P)
> sage: ps2 = DiscreteProbabilitySpace(S,P)
> sage: ps1 == ps2
> False
>
> Probably this should be considered a bug.  But to avoid creating
> many copies of the "same" space, one should consider caching the
> probability spaces.  However, since dictionaries are mutable, it
> can't be cached.
>
> The correct solution might be to define additional types for the
> uniform distribution and other standard distributions (which are
> the most likely candidates to be created over and over) and have
> only one instance of each.
>
> My type DiscreteProbabilitySpace is also too naive -- it really
> only implements a FiniteProbabilitySpace.
>
> It is suitable for the examples I've taught in classical cryptography,
> but for a statistics class one certainly needs infinite discrete
> probability spaces and more classes for distributions on continuous
> intervals.
>
> Some thought also needs to be given to their ease of use.  The
> usual approach to this is to first write some documentation which
> describes the desired behavior and then implement the classes and
> functions.
>
> --David
>
> - Forwarded message from "David R. Kohel" <[EMAIL PROTECTED]> -
>
> Date: Mon, 1 Dec 2008 09:04:12 +0100
> From: "David R. Kohel" <[EMAIL PROTECTED]>
> To: William Stein <[EMAIL PROTECTED]>
> Cc: Paul Butler <[EMAIL PROTECTED]>
> Subject: Re: Fwd: [sage-devel] Expected value of probability space
> User-Agent: Mutt/1.5.6i
>
> Dear William, Paul,
>
> Indeed, the function definition should be:
>
>   def expectation(self):
>       r"""
>       The expectation of the discrete random variable, namely
>       $\sum_{x \in S} p(x) X[x]$,
>       where $X$ = self and $S$ is the probability space of $X$.
>       """
>       E = 0
>       Omega = self.probability_space()
>       for x in self._function.keys():
>           E += Omega(x) * x
>       return E
>
> rather than:
>
>   def expectation(self):
>       r"""
>       The expectation of the discrete random variable, namely
>       $\sum_{x \in S} p(x) X[x]$,
>       where $X$ = self and $S$ is the probability space of $X$.
>       """
>       E = 0
>       Omega = self.probability_space()
>       for x in self._function.keys():
>           E += Omega(x) * self(x)
>       return E
>
> Cheers,
>
> David
>



[sage-devel] Re: Expected value of probability space

2008-12-01 Thread William Stein

[Another response in this thread from David Kohel (who maybe should be
posting on list)]

Hi William and Paul,

Actually, I correct myself -- the average should be over the values
of the function, weighted by the probabilities.  The domain of the
function (the keys) can be in any set (e.g. "A","B","C"), so the
current behavior is correct.

In the definition of the probability space itself:

sage: ps = DiscreteProbabilitySpace([1,2,3],{1:1/3,2:1/3,3:1/3})

Consider replacing the domain of ps with such a set.  Then it
should be clear that you can't average over "A", "B", and "C".

S = ["A","B","C"]
P = {}
for i in range(3):
   P[S[i]] = 1/3

ps = DiscreteProbabilitySpace(S,P)
ps.expectation() # 0.33

This is the random variable with

 f("A") = 1,
 f("B") = 2,
 f("C") = 3,

for which the expectation is 2:

f = {}
for i in range(3):
   f[S[i]] = i+1

rv = DiscreteRandomVariable(ps,f)
rv.expectation() # 2.00

On the other hand, I'd be happy with some syntax for creating a
random variable with the uniform distribution P(x) = 1/n on some
set or tuple. Currently this gives an error:

sage: rv = DiscreteRandomVariable(S,f)

but it could easily create the uniform distribution on S behind
the scenes.  The reason I didn't do this is the following:

sage: ps1 = DiscreteProbabilitySpace(S,P)
sage: ps2 = DiscreteProbabilitySpace(S,P)
sage: ps1 == ps2
False

Probably this should be considered a bug.  But to avoid creating
many copies of the "same" space, one should consider caching the
probability spaces.  However, since dictionaries are mutable, it
can't be cached.

The correct solution might be to define additional types for the
uniform distribution and other standard distributions (which are
the most likely candidates to be created over and over) and have
only one instance of each.

My type DiscreteProbabilitySpace is also too naive -- it really
only implements a FiniteProbabilitySpace.

It is suitable for the examples I've taught in classical cryptography,
but for a statistics class one certainly needs infinite discrete
probability spaces and more classes for distributions on continuous
intervals.

Some thought also needs to be given to their ease of use.  The
usual approach to this is to first write some documentation which
describes the desired behavior and then implement the classes and
functions.

--David

- Forwarded message from "David R. Kohel" <[EMAIL PROTECTED]> -

Date: Mon, 1 Dec 2008 09:04:12 +0100
From: "David R. Kohel" <[EMAIL PROTECTED]>
To: William Stein <[EMAIL PROTECTED]>
Cc: Paul Butler <[EMAIL PROTECTED]>
Subject: Re: Fwd: [sage-devel] Expected value of probability space
User-Agent: Mutt/1.5.6i

Dear William, Paul,

Indeed, the function definition should be:

   def expectation(self):
       r"""
       The expectation of the discrete random variable, namely
       $\sum_{x \in S} p(x) X[x]$,
       where $X$ = self and $S$ is the probability space of $X$.
       """
       E = 0
       Omega = self.probability_space()
       for x in self._function.keys():
           E += Omega(x) * x
       return E

rather than:

   def expectation(self):
       r"""
       The expectation of the discrete random variable, namely
       $\sum_{x \in S} p(x) X[x]$,
       where $X$ = self and $S$ is the probability space of $X$.
       """
       E = 0
       Omega = self.probability_space()
       for x in self._function.keys():
           E += Omega(x) * self(x)
       return E

Cheers,

David




[sage-devel] Re: Expected value of probability space

2008-12-01 Thread William Stein

On Sun, Nov 30, 2008 at 9:39 PM, Paul Butler <[EMAIL PROTECTED]> wrote:
> I've been experimenting with probability and found that in Sage, a
> probability space is also a random variable by inheritance. This may be
> useful. Without it, creating a random variable requires two classes: a
> probability space and a random variable on that probability space.
>
> Unfortunately, the random variable doesn't work like I expected. For
> example:
>
> sage: ps = DiscreteProbabilitySpace([1,2,3],{1:1/3,2:1/3,3:1/3})
> sage: ps.expectation()
> 0.333
>
> (I expected 2.00)
>
> I've prepared a patch that gives me the value I'd expect, but I'd like to
> make sure this is the proper behavior.
>
> -- Paul
>

Response from David Kohel:

Dear William, Paul,

Indeed, the function definition should be:

   def expectation(self):
       r"""
       The expectation of the discrete random variable, namely
       $\sum_{x \in S} p(x) X[x]$,
       where $X$ = self and $S$ is the probability space of $X$.
       """
       E = 0
       Omega = self.probability_space()
       for x in self._function.keys():
           E += Omega(x) * x
       return E

rather than:

   def expectation(self):
       r"""
       The expectation of the discrete random variable, namely
       $\sum_{x \in S} p(x) X[x]$,
       where $X$ = self and $S$ is the probability space of $X$.
       """
       E = 0
       Omega = self.probability_space()
       for x in self._function.keys():
           E += Omega(x) * self(x)
       return E

Cheers,

David
