Re: [R] compare histograms

2010-10-15 Thread Rainer M Krug
On Fri, Oct 15, 2010 at 2:47 AM, Michael Bedward
michael.bedw...@gmail.com wrote:

 Hi Rainer,

 Great - many thanks for that.  Yes, I'm using OSX

 I initially tried to use install.packages to get a pre-built
 binary of earthmovdist from R-Forge, but it failed with...

 In getDependencies(pkgs, dependencies, available, lib) :
  package ‘earthmovdist’ is not available


Yes - we had some problems getting the package built for OSX, but we
(more specifically Dirk) are working on that.



 When I tried installing with type="source" this also failed.

 However, after reading your post I looked at the error messages
 properly and it turned out to be a trivial problem. The .First
 function defined in my .Rprofile was printing some text to the console
 with cat() which was being incorrectly picked up by the package build
 as if it was a makefile argument. When I commented out the call to cat
 the package installed successfully. I haven't had this problem
 installing other packages from source so I think there must be a
 little problem with your setup (?)


Thanks for letting us know - could you send us the offending entry in your
.Rprofile (or the whole .Rprofile?), so that we can see if it is an OSX or
general problem?



 Now that it's installed I look forward to trying it out shortly.


Great - please give us some feedback on what you think about it.

Cheers,

Rainer



 Thanks again.

 Michael





Re: [R] compare histograms

2010-10-14 Thread Rainer M Krug
On Thu, Oct 14, 2010 at 3:15 AM, Michael Bedward
michael.bedw...@gmail.com wrote:

 Hi Juan,

 Yes, you can use EMD to quantify the difference between any pair of
 histograms regardless of their shape. The only constraint, at least
 the way that I've done it previously, is to have compatible bins. The
 original application of EMD was to compare images based on colour
 histograms which could have all sorts of shapes.

 I looked at the package that Dennis alerted me to on RForge but
 unfortunately it seems to be inactive


No - well, it depends how you define inactive: the functionality we wanted
to include is included, therefore no further development was necessary.


 and the nightly builds are broken. I've downloaded the source code and
 will have a look at it sometime in the next few days.


Thanks for alerting us - we will look into that. But just don't use the
nightly builds, as they are no different from the last release. Just download
the package for your system (I assume Windows or Mac, as I just installed
from source without problems under Linux).

Let me know if it doesn't work,

Cheers,

Rainer




 Meanwhile, let me know if you want a copy of my own code. It uses the
 lpSolve package.

 Michael


Re: [R] compare histograms

2010-10-14 Thread Michael Bedward
Hi Rainer,

Great - many thanks for that.  Yes, I'm using OSX

I initially tried to use install.packages to get a pre-built
binary of earthmovdist from R-Forge, but it failed with...

In getDependencies(pkgs, dependencies, available, lib) :
  package ‘earthmovdist’ is not available

When I tried installing with type="source" this also failed.

However, after reading your post I looked at the error messages
properly and it turned out to be a trivial problem. The .First
function defined in my .Rprofile was printing some text to the console
with cat() which was being incorrectly picked up by the package build
as if it was a makefile argument. When I commented out the call to cat
the package installed successfully. I haven't had this problem
installing other packages from source so I think there must be a
little problem with your setup (?)
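
A minimal sketch of one way to keep a .First greeting from leaking into non-interactive R processes such as those started during a source install. The greeting text is made up, and it is only an assumption that stray console output was the sole cause of the failure described above.

```r
# Hypothetical .Rprofile fragment: the greeting text is invented for illustration.
.First <- function() {
  if (interactive()) {
    # Only greet a human at the console. R processes that are not
    # interactive (batch runs, package builds) stay silent.
    message("Welcome back")   # message() writes to stderr, not stdout
  }
}
```

Using message() rather than cat() also keeps the text off standard output, which is where a build tool would be most likely to misread it.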

Now that it's installed I look forward to trying it out shortly.

Thanks again.

Michael





Re: [R] compare histograms

2010-10-13 Thread Dennis Murphy
Hi:

This recent thread revealed that a package on R-forge for calculating earth
movers distance is available:

http://r.789695.n4.nabble.com/Measure-Difference-Between-Two-Distributions-td2712281.html#a2713505

HTH,
Dennis


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] compare histograms

2010-10-13 Thread Michael Bedward
Ah, that's interesting. I'll have a look because it's bound to be
better than my effort.

Many thanks Dennis.

Michael



Re: [R] compare histograms

2010-10-13 Thread Michael Bedward
Hi Juan,

Yes, you can use EMD to quantify the difference between any pair of
histograms regardless of their shape. The only constraint, at least
the way that I've done it previously, is to have compatible bins. The
original application of EMD was to compare images based on colour
histograms which could have all sorts of shapes.
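
To illustrate the "compatible bins" constraint, here is a rough base R sketch that bins a unimodal and a bimodal sample on the same breaks and computes a one-dimensional EMD. The function name emd_1d and the simulated data are hypothetical. It uses the cumulative-sum shortcut that applies to 1-D histograms with equal-width bins, which is not necessarily how my own code or the earthmovdist package does it.

```r
# 1-D EMD between two samples binned on shared, equal-width breaks.
emd_1d <- function(x, y, breaks) {
  p <- hist(x, breaks = breaks, plot = FALSE)$counts
  q <- hist(y, breaks = breaks, plot = FALSE)$counts
  p <- p / sum(p)                      # normalise both histograms to unit mass
  q <- q / sum(q)
  width <- diff(breaks)[1]             # assumes all bins have the same width
  # Mass that has to cross each bin boundary, times the distance it travels.
  sum(abs(cumsum(p - q))) * width
}

set.seed(1)
unimodal <- rnorm(500)
bimodal  <- c(rnorm(250, -2, 0.5), rnorm(250, 2, 0.5))

# Shared breaks covering both samples: this is what makes the bins compatible.
breaks <- seq(floor(min(unimodal, bimodal)),
              ceiling(max(unimodal, bimodal)), by = 0.5)

emd_1d(unimodal, bimodal, breaks)      # positive: the shapes differ
emd_1d(unimodal, unimodal, breaks)     # identical histograms give 0
```

The shape of either histogram does not matter to the calculation; only the shared breaks do.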

I looked at the package that Dennis alerted me to on RForge but
unfortunately it seems to be inactive and the nightly builds are
broken. I've downloaded the source code and will have a look at it
sometime in the next few days.

Meanwhile, let me know if you want a copy of my own code. It uses the
lpSolve package.

Michael

On 14 October 2010 08:46, Juan Pablo Fededa jpfed...@gmail.com wrote:
 Hi Michael,


 I have the same challenge. Can you use this earth mover's distance to
 compare bimodal distributions?
 Thanks & cheers,


 Juan




Re: [R] compare histograms

2010-10-12 Thread Greg Snow
That depends a lot on what you mean by the histograms being equivalent.

You could just plot them and compare visually.  It may be easier to compare 
them if you plot density estimates rather than histograms.  Even better would 
be to do a qqplot comparing the 2 sets of data rather than the histograms.
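
As a rough illustration of the visual comparisons above, the following base R sketch overlays two density estimates and then draws a Q-Q plot. The data are simulated stand-ins, not anything from the original question.

```r
# Simulated stand-ins for the two data sets (hypothetical data).
set.seed(1)
x <- rnorm(200, mean = 0, sd = 1)
y <- c(rnorm(100, mean = -1.5, sd = 0.6), rnorm(100, mean = 1.5, sd = 0.6))

# Overlaid kernel density estimates on common axes.
dx <- density(x)
dy <- density(y)
plot(dx, xlim = range(dx$x, dy$x), ylim = range(dx$y, dy$y),
     main = "Density estimates", xlab = "value")
lines(dy, lty = 2)
legend("topright", legend = c("x", "y"), lty = c(1, 2))

# Q-Q plot: points close to the line y = x suggest similar distributions.
qq <- qqplot(x, y, main = "Q-Q plot of x against y")
abline(0, 1, lty = 3)
```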

If you want a formal test then the ks.test function can compare 2 datasets.
Note that the null hypothesis is that they come from the same distribution. A
significant result means that they are likely different (but the difference may
not be of practical importance), while a non-significant test could mean they
are the same, or that you just do not have enough power to find the difference
(or the difference is hard for the KS test to see).  You could also use a
chi-squared test to compare this way.
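
A small sketch of both tests on simulated data (the samples, the number of bins and the use of a simulated p-value are all illustrative choices, not recommendations):

```r
set.seed(1)
x <- rnorm(200)
y <- rnorm(200, mean = 0.5)

# Two-sample Kolmogorov-Smirnov test on the raw data.
ks <- ks.test(x, y)
ks$p.value

# Chi-squared test of homogeneity on counts in shared bins.
breaks <- seq(min(x, y), max(x, y), length.out = 9)
counts <- rbind(x = table(cut(x, breaks, include.lowest = TRUE)),
                y = table(cut(y, breaks, include.lowest = TRUE)))
# Drop bins that are empty in both samples to avoid zero expected counts.
counts <- counts[, colSums(counts) > 0, drop = FALSE]
# Simulated p-value, since some bins in the tails hold very few observations.
chi <- chisq.test(counts, simulate.p.value = TRUE, B = 2000)
chi$p.value
```

The KS test works on the raw data, while the chi-squared test depends on the choice of bins, so the two need not agree.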

Another approach would be to use the vis.test function from the TeachingDemos
package.  Write a small function that will either plot your 2 histograms
(density plots), or permute the data between the 2 groups and plot the
equivalent histograms.  The vis.test function then presents you with an array
of plots, one of which is the original data and the rest based on permutations.
If there is a clear meaningful difference in the groups you will be able to
spot the plot that does not match the rest; otherwise you will just be guessing
(it might be best to have a fresh set of eyes that have not seen the data
before see if they can pick out the real plot).
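
The idea can be sketched in base R without vis.test itself. The code below is only an illustration of the permutation lineup, with simulated data and a hypothetical 3 x 3 layout. It does not reproduce what vis.test does internally, such as repeating the lineup and computing a p-value.

```r
# Hypothetical data: two groups to compare.
set.seed(42)
x <- rnorm(60)
y <- rnorm(60, mean = 1)

pooled <- c(x, y)
group  <- rep(c("x", "y"), times = c(length(x), length(y)))

n_panels <- 9
real_pos <- sample(n_panels, 1)   # panel in which the real data is hidden

op <- par(mfrow = c(3, 3), mar = c(2, 2, 2, 1))
for (i in seq_len(n_panels)) {
  # Real group labels in one panel, permuted labels in all the others.
  g  <- if (i == real_pos) group else sample(group)
  dx <- density(pooled[g == "x"])
  dy <- density(pooled[g == "y"])
  plot(dx, xlim = range(pooled), ylim = range(dx$y, dy$y),
       main = paste("Panel", i))
  lines(dy, lty = 2)
}
par(op)

# Reveal only after the viewer has made a guess.
real_pos
```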

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -----Original Message-----
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of solafah bh
 Sent: Monday, October 11, 2010 4:02 PM
 To: R help mailing list
 Subject: [R] compare histograms
 
 Hello
 How do I compare two statistical histograms? How can I know whether
 these histograms are equivalent or not?
 
 Regards
 
 
 


Re: [R] compare histograms

2010-10-12 Thread Michael Bedward
Just to add to Greg's comments: I've previously used 'Earth Mover's
Distance' to compare histograms. Note, this is a distance metric
rather than a parametric statistic (i.e. not a test) but it at least
provides a consistent way of quantifying similarity.

It's relatively easy to implement the metric in R (formulating it as a
linear programming problem). Happy to dig out the code if needed.
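
As a rough sketch of the linear programming formulation (not the code offered above), the example below treats EMD as a transportation problem and solves it with lp.transport from the lpSolve package. The bin centres and counts are made up, both histograms are assumed to hold the same total count, and the cost of moving mass is taken to be the distance between bin centres.

```r
# Counts in four shared bins; both histograms hold the same total (10).
centers <- 1:4
p <- c(4, 3, 2, 1)
q <- c(1, 2, 3, 4)

# Cost of moving one unit of mass from bin i to bin j: distance between centres.
cost <- abs(outer(centers, centers, "-"))

emd_lp <- NA_real_
if (requireNamespace("lpSolve", quietly = TRUE)) {
  # Ship all mass out of p's bins (rows) and into q's bins (columns) at least cost.
  sol <- lpSolve::lp.transport(cost.mat  = cost, direction = "min",
                               row.signs = rep("=", length(p)), row.rhs = p,
                               col.signs = rep("=", length(q)), col.rhs = q)
  emd_lp <- sol$objval / sum(p)   # total work divided by total mass
}
emd_lp

# Cross-check for the 1-D case: cumulative differences across bin boundaries.
emd_check <- sum(abs(cumsum(p - q))) / sum(p)
emd_check                          # 1 for these counts
```

The LP form also extends to cost matrices that are not simple 1-D distances, for example the colour histograms used in image comparison, whereas the cumulative-sum check only applies in one dimension.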

Michael
