Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)

2014-03-06 Thread josef . pktd
On Thu, Mar 6, 2014 at 8:38 PM, Sturla Molden wrote:
> Sebastian Berg wrote:
>
>> I am right now a bit unsure whether the "weights" would be
>> "aweights" or something different... R does not seem to care about the
>> scale of the weights, which seems a bit odd to me for an unbiased
>> estimator. I always assumed that we can handle the statistics behind
>> this using the ddof... But even if we can figure out the right way, what
>> I doubt a bit is whether, if we add weights, their names would be clear
>> enough not to clash with possibly different kinds of (interesting)
>> weights in other functions.
>
> http://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_covariance


Just as additional motivation (I'm not getting into the definition of weights right now :)

I was just reading a chapter on robust covariance estimation, and one
of the steps in many of the procedures requires weighted covariances
and weighted variances.

There the weights just serve to reduce the influence of outlying observations.
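
A minimal sketch of that idea, in plain Python so the arithmetic stays visible. The data, the weights, and the helper names `weighted_mean` and `weighted_var` are made up for illustration; a real robust procedure would derive the weights from the data (typically iteratively) rather than fix them by hand.

```python
# Plain-Python sketch; the data and weights below are made up for illustration.

def weighted_mean(x, w):
    """Weighted arithmetic mean of x with non-negative weights w."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)


def weighted_var(x, w):
    """Weighted variance normalized by sum(w), i.e. the ddof=0 case."""
    m = weighted_mean(x, w)
    return sum(wi * (xi - m) ** 2 for wi, xi in zip(w, x)) / sum(w)


x = [1.0, 2.0, 3.0, 4.0, 100.0]        # the last value is an obvious outlier
equal = [1.0] * len(x)                 # ordinary, unweighted case
robust = [1.0, 1.0, 1.0, 1.0, 0.01]    # hypothetical downweighting of the outlier

var_equal = weighted_var(x, equal)
var_robust = weighted_var(x, robust)
print(var_equal, var_robust)
```

With the outlier downweighted, both the weighted mean and the weighted variance move back toward the bulk of the data. The sketch normalizes by the sum of the weights, which sidesteps the ddof question discussed elsewhere in this thread.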

Josef


>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)

2014-03-06 Thread Sturla Molden
Sturla Molden wrote:
>  wrote:
> 
>> The only question IMO is which ddof for weighted std, ...
> 
> Something like this?
> 
> sum_weights - (ddof/float(n))*sum_weights

Please ignore.



Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)

2014-03-06 Thread Sturla Molden
Sebastian Berg wrote:

> I am right now a bit unsure whether the "weights" would be
> "aweights" or something different... R does not seem to care about the
> scale of the weights, which seems a bit odd to me for an unbiased
> estimator. I always assumed that we can handle the statistics behind
> this using the ddof... But even if we can figure out the right way, what
> I doubt a bit is whether, if we add weights, their names would be clear
> enough not to clash with possibly different kinds of (interesting)
> weights in other functions.

http://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_covariance
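
For readers who do not want to follow the link: the estimator usually quoted for non-integer ("reliability") weights divides the weighted sum of cross products by V1 - V2/V1, where V1 is the sum of the weights and V2 the sum of their squares. The sketch below is one reading of that formula in plain Python, not the code from the pull request; the function name `weighted_cov` and the sample data are assumptions made for illustration.

```python
def weighted_cov(x, y, w, unbiased=True):
    """Weighted sample covariance of x and y for reliability-type weights w.

    unbiased=True divides by V1 - V2/V1; unbiased=False divides by V1.
    """
    v1 = sum(w)                          # V1: sum of weights
    v2 = sum(wi * wi for wi in w)        # V2: sum of squared weights
    mx = sum(wi * xi for wi, xi in zip(w, x)) / v1
    my = sum(wi * yi for wi, yi in zip(w, y)) / v1
    cross = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    norm = v1 - v2 / v1 if unbiased else v1
    return cross / norm


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 9.0]

# With unit weights V1 - V2/V1 equals n - 1, so this is the usual sample covariance.
print(weighted_cov(x, y, [1.0] * 4))
# Rescaling every weight by the same factor leaves the estimate unchanged.
print(weighted_cov(x, y, [2.0] * 4))
```

With unit weights the normalizer reduces to n - 1, so the ordinary sample covariance is recovered as a special case.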

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)

2014-03-06 Thread Sebastian Berg
On Do, 2014-03-06 at 16:30 -0500, josef.p...@gmail.com wrote:
> On Thu, Mar 6, 2014 at 3:49 PM, Ralf Gommers  wrote:
> >
> >
> >
> > On Thu, Mar 6, 2014 at 1:40 PM, Sebastian Berg 
> > wrote:
> >>
> >> On Mi, 2014-03-05 at 10:21 -0800, David Goldsmith wrote:
> >> >
> >> >
> >> >
> >> > Date: Wed, 05 Mar 2014 17:45:47 +0100
> >> > From: Sebastian Berg 
> >> > Subject: [Numpy-discussion] Adding weights to cov and corrcoef
> >> > To: numpy-discussion@scipy.org
> >> > Message-ID: <1394037947.21356.20.camel@sebastian-t440>
> >> > Content-Type: text/plain; charset="UTF-8"
> >> >
> >> > Hi all,
> >> >
> >> > in Pull Request https://github.com/numpy/numpy/pull/3864 Noel
> >> > Dawe
> >> > suggested adding new parameters to our `cov` and `corrcoef`
> >> > functions to
> >> > implement weights, which already exists for `average` (the PR
> >> > still
> >> > needs to be adapted).
> >> >
> >> >
> >> > Do you mean adopted?
> >> >
> >>
> >> What I meant was that the suggestion isn't actually implemented in the
> >> PR at this time. So you can't pull it in to try things out.
> >>
> >> >
> >> > However, we may have missed something obvious, or maybe it is
> >> > already
> >> > getting too statistical for NumPy, or the keyword argument
> >> > might be
> >> > better `uncertainties` and `frequencies`. So comments and
> >> > insights are
> >> > very welcome :).
> >> >
> >> >
> >> > +1 for it being "too baroque" for NumPy--should go in SciPy (if it
> >> > isn't already there): IMHO, NumPy should be kept as "lean and mean" as
> >> > possible, embellishments are what SciPy is for.  (Again, IMO.)
> >> >
> >>
> >> Well, on the other hand, scipy does not actually have a `std` function
> >> of its own, I think. So if it is quite useful I think this may be an
> >> option (I don't think I ever used weights with std, so I can't argue
> >> strongly for inclusion myself). That is, unless the longer-term plan is
> >> to add new functions to `scipy.stats` (or just statsmodels) that
> >> implement different types of weights; then things might bite...
> >
> >
> > AFAIK there's currently no such plan.
> 
> Since NumPy has taken over all the basic statistics (var, std, cov,
> corrcoef) and scipy.stats dropped those, I don't see any reason to
> resurrect them.
> 
> The only question IMO is which ddof for weighted std, ...
> 

I am right now a bit unsure whether the "weights" would be
"aweights" or something different... R does not seem to care about the
scale of the weights, which seems a bit odd to me for an unbiased
estimator. I always assumed that we can handle the statistics behind
this using the ddof... But even if we can figure out the right way, what
I doubt a bit is whether, if we add weights, their names would be clear
enough not to clash with possibly different kinds of (interesting)
weights in other functions.
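
To make the scale question concrete, here is a small plain-Python sketch comparing two common normalizers for a weighted variance. The labels "frequency" and "reliability", the function name `weighted_var`, and the data are assumptions chosen for illustration; they are not names settled on in this thread.

```python
def weighted_var(x, w, kind):
    """Weighted variance with ddof=1 under two different readings of w.

    kind="frequency":   w are repeat counts, normalizer is sum(w) - 1.
    kind="reliability": w are relative importances, normalizer is
                        sum(w) - sum(w**2) / sum(w).
    """
    v1 = sum(w)
    m = sum(wi * xi for wi, xi in zip(w, x)) / v1
    ss = sum(wi * (xi - m) ** 2 for wi, xi in zip(w, x))
    if kind == "frequency":
        norm = v1 - 1.0
    elif kind == "reliability":
        norm = v1 - sum(wi * wi for wi in w) / v1
    else:
        raise ValueError("kind must be 'frequency' or 'reliability'")
    return ss / norm


x = [1.0, 2.0, 4.0, 7.0]
w = [1.0, 2.0, 1.0, 3.0]
w_scaled = [10.0 * wi for wi in w]       # same relative weights, different scale

for kind in ("frequency", "reliability"):
    print(kind, weighted_var(x, w, kind), weighted_var(x, w_scaled, kind))
```

Under the "reliability" normalizer the result does not depend on the overall scale of the weights, while under the "frequency" normalizer it does, because multiplying counts by ten amounts to claiming ten times as many observations. Which behaviour is wanted is exactly what the naming of the keyword would have to communicate.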


> statsmodels has the basic statistics with frequency weights, but they
> are largely in support of t-test and similar hypothesis tests.
> 
> Josef
> 
> 
> >
> > Ralf
> >
> >
> >




Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)

2014-03-06 Thread Sturla Molden
 wrote:

> The only question IMO is which ddof for weighted std, ...

Something like this?

sum_weights - (ddof/float(n))*sum_weights
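
Written out as a function, the expression above reads as follows. This is only a transcription of the one-liner as posted (its author asked for it to be ignored in a follow-up message), not a recommended estimator; the function name `proposed_normalizer` is made up for illustration.

```python
def proposed_normalizer(weights, ddof):
    """Normalizer sum_weights - (ddof / n) * sum_weights, as posted above."""
    n = len(weights)
    sum_weights = sum(weights)
    return sum_weights - (ddof / float(n)) * sum_weights


# With unit weights the expression reduces to the familiar n - ddof.
print(proposed_normalizer([1.0] * 5, ddof=1))

# It is linear in the weights: scaling all weights by 3 scales it by 3.
print(proposed_normalizer([3.0] * 5, ddof=1))
```

Algebraically the expression equals sum_weights * (1 - ddof/n), so it depends on the weights only through their sum and the number of observations.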



Sturla



Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)

2014-03-06 Thread josef . pktd
On Thu, Mar 6, 2014 at 3:49 PM, Ralf Gommers  wrote:
>
>
>
> On Thu, Mar 6, 2014 at 1:40 PM, Sebastian Berg 
> wrote:
>>
>> On Mi, 2014-03-05 at 10:21 -0800, David Goldsmith wrote:
>> >
>> >
>> >
>> > Date: Wed, 05 Mar 2014 17:45:47 +0100
>> > From: Sebastian Berg 
>> > Subject: [Numpy-discussion] Adding weights to cov and corrcoef
>> > To: numpy-discussion@scipy.org
>> > Message-ID: <1394037947.21356.20.camel@sebastian-t440>
>> > Content-Type: text/plain; charset="UTF-8"
>> >
>> > Hi all,
>> >
>> > in Pull Request https://github.com/numpy/numpy/pull/3864 Noel
>> > Dawe
>> > suggested adding new parameters to our `cov` and `corrcoef`
>> > functions to
>> > implement weights, which already exists for `average` (the PR
>> > still
>> > needs to be adapted).
>> >
>> >
>> > Do you mean adopted?
>> >
>>
>> What I meant was that the suggestion isn't actually implemented in the
>> PR at this time. So you can't pull it in to try things out.
>>
>> >
>> > However, we may have missed something obvious, or maybe it is
>> > already
>> > getting too statistical for NumPy, or the keyword argument
>> > might be
>> > better `uncertainties` and `frequencies`. So comments and
>> > insights are
>> > very welcome :).
>> >
>> >
>> > +1 for it being "too baroque" for NumPy--should go in SciPy (if it
>> > isn't already there): IMHO, NumPy should be kept as "lean and mean" as
>> > possible, embellishments are what SciPy is for.  (Again, IMO.)
>> >
>>
>> Well, on the other hand, scipy does not actually have a `std` function
>> of its own, I think. So if it is quite useful I think this may be an
>> option (I don't think I ever used weights with std, so I can't argue
>> strongly for inclusion myself). That is, unless the longer-term plan is
>> to add new functions to `scipy.stats` (or just statsmodels) that
>> implement different types of weights; then things might bite...
>
>
> AFAIK there's currently no such plan.

Since NumPy has taken over all the basic statistics (var, std, cov,
corrcoef) and scipy.stats dropped those, I don't see any reason to
resurrect them.

The only question IMO is which ddof for weighted std, ...

statsmodels has the basic statistics with frequency weights, but they
are largely in support of t-test and similar hypothesis tests.
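
As a reminder of what frequency weights mean: each weight is an integer count, so the weighted statistic should match the unweighted one on the expanded data. The sketch below checks this with the standard library only; the helper name `var_freq` and the data are made up for illustration and do not reflect the statsmodels API.

```python
import statistics


def var_freq(x, f, ddof=1):
    """Variance of x where f[i] is the integer number of times x[i] occurs."""
    n = sum(f)                                   # total number of observations
    m = sum(fi * xi for fi, xi in zip(f, x)) / n
    ss = sum(fi * (xi - m) ** 2 for fi, xi in zip(f, x))
    return ss / (n - ddof)


x = [1.0, 2.0, 5.0]
f = [2, 1, 3]

# Expand the counts into explicit repeated observations.
expanded = [xi for xi, fi in zip(x, f) for _ in range(fi)]

print(var_freq(x, f), statistics.variance(expanded))
```

Here ddof has its usual meaning because sum(f) really is the sample size, which is why frequency weights fit hypothesis tests naturally.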

Josef


>
> Ralf
>
>
>


Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)

2014-03-06 Thread Ralf Gommers
On Thu, Mar 6, 2014 at 1:40 PM, Sebastian Berg
wrote:

> On Mi, 2014-03-05 at 10:21 -0800, David Goldsmith wrote:
> >
> >
> >
> > Date: Wed, 05 Mar 2014 17:45:47 +0100
> > From: Sebastian Berg 
> > Subject: [Numpy-discussion] Adding weights to cov and corrcoef
> > To: numpy-discussion@scipy.org
> > Message-ID: <1394037947.21356.20.camel@sebastian-t440>
> > Content-Type: text/plain; charset="UTF-8"
> >
> > Hi all,
> >
> > in Pull Request https://github.com/numpy/numpy/pull/3864 Noel
> > Dawe
> > suggested adding new parameters to our `cov` and `corrcoef`
> > functions to
> > implement weights, which already exists for `average` (the PR
> > still
> > needs to be adapted).
> >
> >
> > Do you mean adopted?
> >
>
> What I meant was that the suggestion isn't actually implemented in the
> PR at this time. So you can't pull it in to try things out.
>
> >
> > However, we may have missed something obvious, or maybe it is
> > already
> > getting too statistical for NumPy, or the keyword argument
> > might be
> > better `uncertainties` and `frequencies`. So comments and
> > insights are
> > very welcome :).
> >
> >
> > +1 for it being "too baroque" for NumPy--should go in SciPy (if it
> > isn't already there): IMHO, NumPy should be kept as "lean and mean" as
> > possible, embellishments are what SciPy is for.  (Again, IMO.)
> >
>
> Well, on the other hand, scipy does not actually have a `std` function
> of its own, I think. So if it is quite useful I think this may be an
> option (I don't think I ever used weights with std, so I can't argue
> strongly for inclusion myself). That is, unless the longer-term plan is
> to add new functions to `scipy.stats` (or just statsmodels) that
> implement different types of weights; then things might bite...
>

AFAIK there's currently no such plan.

Ralf


Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)

2014-03-06 Thread Sebastian Berg
On Mi, 2014-03-05 at 10:21 -0800, David Goldsmith wrote:
> 
> 
> 
> Date: Wed, 05 Mar 2014 17:45:47 +0100
> From: Sebastian Berg 
> Subject: [Numpy-discussion] Adding weights to cov and corrcoef
> To: numpy-discussion@scipy.org
> Message-ID: <1394037947.21356.20.camel@sebastian-t440>
> Content-Type: text/plain; charset="UTF-8"
> 
> Hi all,
> 
> in Pull Request https://github.com/numpy/numpy/pull/3864 Noel
> Dawe
> suggested adding new parameters to our `cov` and `corrcoef`
> functions to
> implement weights, which already exists for `average` (the PR
> still
> needs to be adapted).
> 
> 
> Do you mean adopted?
> 

What I meant was that the suggestion isn't actually implemented in the
PR at this time. So you can't pull it in to try things out.

>  
> However, we may have missed something obvious, or maybe it is
> already
> getting too statistical for NumPy, or the keyword argument
> might be
> better `uncertainties` and `frequencies`. So comments and
> insights are
> very welcome :).
> 
> 
> +1 for it being "too baroque" for NumPy--should go in SciPy (if it
> isn't already there): IMHO, NumPy should be kept as "lean and mean" as
> possible, embellishments are what SciPy is for.  (Again, IMO.)
> 

Well, on the other hand, scipy does not actually have a `std` function
of its own, I think. So if it is quite useful I think this may be an
option (I don't think I ever used weights with std, so I can't argue
strongly for inclusion myself). That is, unless the longer-term plan is
to add new functions to `scipy.stats` (or just statsmodels) that
implement different types of weights; then things might bite...

> DG


Re: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg)

2014-03-05 Thread David Goldsmith
Date: Wed, 05 Mar 2014 17:45:47 +0100

> From: Sebastian Berg 
> Subject: [Numpy-discussion] Adding weights to cov and corrcoef
> To: numpy-discussion@scipy.org
> Message-ID: <1394037947.21356.20.camel@sebastian-t440>
> Content-Type: text/plain; charset="UTF-8"
>
> Hi all,
>
> in Pull Request https://github.com/numpy/numpy/pull/3864 Noel Dawe
> suggested adding new parameters to our `cov` and `corrcoef` functions to
> implement weights, which already exists for `average` (the PR still
> needs to be adapted).
>

Do you mean adopted?


> However, we may have missed something obvious, or maybe it is already
> getting too statistical for NumPy, or the keyword argument might be
> better `uncertainties` and `frequencies`. So comments and insights are
> very welcome :).
>

+1 for it being "too baroque" for NumPy--should go in SciPy (if it isn't
already there): IMHO, NumPy should be kept as "lean and mean" as possible,
embellishments are what SciPy is for.  (Again, IMO.)

DG