Re: [R] How to compare stacked histograms/datasets

2012-07-11 Thread Joshua Wiley
Hi,

Sure, you could do a qqplot for each variable between two datasets.
In a 2d graph, it will be hard to reasonably compare more than 2
datasets (you can put many such graphs on a single page, but it would
be pairwise sets of comparisons, I think).  Perhaps you could plot
multiple qqplots on top of each other, varying the points by colour for
the different data sets?
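
The overlay idea could be sketched roughly as follows. This is only an illustration under stated assumptions: the three datasets are simulated, and the names datasets, ref, and qq are hypothetical, not from the thread. It relies on base R's qqplot() with plot.it = FALSE, which returns the sorted quantile pairs without drawing them.

```r
## Hypothetical data: three datasets, each with 21 positions and a "Grey" column.
set.seed(1)
datasets <- list(
  A = data.frame(Grey = rnorm(21, mean = 78, sd = 8)),
  B = data.frame(Grey = rnorm(21, mean = 80, sd = 8)),
  C = data.frame(Grey = rnorm(21, mean = 70, sd = 12))
)

## Use the first dataset as the common reference on the x axis.
ref <- datasets[["A"]]$Grey
others <- datasets[-1]
cols <- c("red", "blue")

## qqplot() with plot.it = FALSE returns the quantile pairs without drawing,
## so several comparisons can be overlaid on one set of axes.
qq <- lapply(others, function(d) qqplot(ref, d$Grey, plot.it = FALSE))

rng <- range(ref, unlist(lapply(qq, `[[`, "y")))
plot(NA, xlim = rng, ylim = rng,
     xlab = "Quantiles of dataset A (Grey)",
     ylab = "Quantiles of comparison dataset (Grey)")
abline(0, 1, lty = 2)  # points near this line suggest similar distributions
for (i in seq_along(qq)) {
  points(qq[[i]]$x, qq[[i]]$y, col = cols[i], pch = 19)
}
legend("topleft", legend = names(others), col = cols, pch = 19)
```

Each comparison is still pairwise against the reference dataset; the colours only make it possible to read several pairs from one panel.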

I have not seen anything like this before, so I suppose it depends
on what helps you understand your data.

Cheers,

Josh

On Sat, Jul 7, 2012 at 3:25 PM, Atulkakrana atulkakr...@gmail.com wrote:
 Hello Joshua,

 Thanks for taking time out to help me with my problem. Actually, the comparison
 is to be done among two (if possible, more than two) datasets and not within
 a dataset. Each dataset holds 5 variables (i.e. Red, Purple, Blue, Grey and
 Yellow) for 21 different positions, i.e. 1-21n. So, we have 5 values for each
 position (21 in total) that make up a single dataset or stacked histogram (plot in
 original post).

 Initially I was comparing datasets by plotting stacked histograms for each
 and analyzing them visually. But that doesn't give a statistical idea of how
 similar or different the datasets are. Therefore, I want to evaluate the
 datasets in order to quantify their difference/similarity. So, end result
 would be a plot showing similarity/difference among two or more datasets.

 Example datasets: http://pastebin.com/iYj1RNvt

 Can the method you explained be applied to multiple datasets? Can a
 qqplot be obtained in such a case?

 Awaiting your reply

 Thanks

 Atul





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to compare stacked histograms/datasets

2012-07-07 Thread Joshua Wiley
Hi,

Probably easier to work with the raw data, but whatever.  If your data
is in a data frame, dat,

## create row index
dat$x <- 1:21

## load packages
library(ggplot2)
library(reshape2)

## melt the data frame to be long, long dat, ldat for short
ldat <- melt(dat, id.vars = "x")

## plot the distributions
ggplot(ldat, aes(x, value, colour = variable)) + geom_line()

## they don't really look on the same scale
## we could scale the data first to have equal mean and variance
dat2 <- as.data.frame(scale(dat))
## remake index so it is not scaled
dat2$x <- 1:21

ldat2 <- melt(dat2, id.vars = "x")
ggplot(ldat2, aes(x, value, colour = variable)) + geom_line()

which yields the attached PDF (maybe scrubbed on the official list as
most file extensions are, but should go through to you personally via
gmail).  I'm not sure it's the greatest approach ever, but it gives
you a sense if they go up and down together or at different points.
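
To make the scaling step concrete, here is a minimal sketch of what scale() does to each column. The small data frame is a hypothetical stand-in for dat (rounded, made-up values), not the poster's data.

```r
## Hypothetical stand-in for 'dat': two columns on very different scales.
dat <- data.frame(RED  = c(22, 8, 7, 19, 29),
                  GREY = c(61, 83, 85, 70, 59))

## scale() centres each column to mean 0 and rescales it to sd 1.
## It returns a matrix, hence the as.data.frame() wrapper.
dat2 <- as.data.frame(scale(dat))

round(colMeans(dat2), 10)   # column means after scaling
sapply(dat2, sd)            # column standard deviations after scaling

## An index column would be standardised too, so it is added afterwards.
dat2$x <- seq_len(nrow(dat2))
```

After scaling, the lines show only the shape of each variable across positions, not its level, which is why they become comparable on one axis.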

Cheers,

Josh

On Fri, Jul 6, 2012 at 1:55 PM, Atulkakrana atulkakr...@gmail.com wrote:
 Hello All,

 I have a couple of stacked histograms which I need to compare/evaluate for
 similarity or difference.
 http://r.789695.n4.nabble.com/file/n4635668/Selection_011.png

 I believe that rather than evaluating the histograms, it will be easier to work
 with the dataset used to plot these stacked histograms, which is in this format:

 RED            PURPLE         BLUE          GREY           YELLOW
 22.0640569395  16.9483985765  0             60.987544484   0
 8.1850533808   8.8523131673   0             82.962633452   0
 6.8505338078   6.8950177936   0.756227758   85.4982206406  0.5338078292
 6.7615658363   5.2491103203   1.6459074733  86.3434163701  0.6672597865
 5.8274021352   7.384341637    2.1352313167  84.653024911   1.1565836299
 7.8736654804   6.628113879    1.5569395018  83.9412811388  1.2010676157
 7.1619217082   8.1850533808   1.2455516014  83.4074733096  1.3790035587
 5.5604982206   10.2758007117  1.0676156584  83.0960854093  1.0231316726
 7.1174377224   7.6067615658   0.7117437722  84.5640569395  0.756227758
 7.8736654804   3.9590747331   0.6672597865  87.5           0.3113879004
 7.6512455516   7.8736654804   0.5338078292  83.9412811388  0.5338078292
 7.6067615658   8.9857651246   1.4679715302  81.9395017794  0.3558718861
 8.9412811388   8.0071174377   1.3790035587  81.6725978648  0.5782918149
 19.0836298932  9.2081850534   2.1352313167  69.5729537367  1.3790035587
 14.9911032028  11.0765124555  3.2028469751  70.7295373665  1.0676156584
 15.3914590747  10.8985765125  3.024911032   70.6850533808  1.2900355872
 17.4822064057  12.5444839858  2.4911032028  67.4822064057  1.334519573
 15.8362989324  13.0338078292  2.0017793594  69.128113879   1.334519573
 17.037366548   10.4537366548  2.4021352313  70.1067615658  1.2010676157
 20.2846975089  10.0088967972  0             69.706405694   1.0676156584
 28.7366548043  12.6334519573  0             58.6298932384  0

 Is there any way I can compare such datasets from multiple
 experiments (n=8) and visually show (plot) whether these datasets are in
 consensus or differ from each other?

 Awaiting reply,

 Atul


 --
 View this message in context: 
 http://r.789695.n4.nabble.com/How-to-compare-stacked-histograms-datasets-tp4635668.html
 Sent from the R help mailing list archive at Nabble.com.






plots.pdf
Description: Adobe PDF document


Re: [R] How to compare stacked histograms/datasets

2012-07-07 Thread Atulkakrana
Hello Joshua,

Thanks for taking time out to help me with my problem. Actually, the comparison
is to be done among two (if possible, more than two) datasets and not within
a dataset. Each dataset holds 5 variables (i.e. Red, Purple, Blue, Grey and
Yellow) for 21 different positions, i.e. 1-21n. So, we have 5 values for each
position (21 in total) that make up a single dataset or stacked histogram (plot in
original post).

Initially I was comparing datasets by plotting stacked histograms for each
and analyzing them visually. But that doesn't give a statistical idea of how
similar or different the datasets are. Therefore, I want to evaluate the
datasets in order to quantify their difference/similarity. So, end result
would be a plot showing similarity/difference among two or more datasets. 

Example datasets: http://pastebin.com/iYj1RNvt

Can the method you explained be applied to multiple datasets? Can a
qqplot be obtained in such a case?
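
One possible way to put a number on "similar or different", offered only as a sketch and not something proposed in the thread: treat the five values at each position as a composition, take the total variation distance between two datasets at each position, average over the 21 positions, and cluster the resulting distance matrix. The data below are simulated, and make_dataset() and tv_distance() are hypothetical helper names.

```r
set.seed(42)
vars <- c("RED", "PURPLE", "BLUE", "GREY", "YELLOW")

## Hypothetical generator: 21 positions x 5 variables, rows as percentages.
make_dataset <- function(grey_weight) {
  m <- matrix(runif(21 * 5), nrow = 21, dimnames = list(NULL, vars))
  m[, "GREY"] <- m[, "GREY"] * grey_weight
  100 * m / rowSums(m)
}
datasets <- list(exp1 = make_dataset(10), exp2 = make_dataset(10),
                 exp3 = make_dataset(2))

## Mean over positions of the total variation distance between the two
## compositions at the same position (0 = identical, 1 = no overlap).
tv_distance <- function(a, b) {
  pa <- a / rowSums(a)   # renormalise each row to proportions
  pb <- b / rowSums(b)
  mean(0.5 * rowSums(abs(pa - pb)))
}

## Pairwise distance matrix between all datasets.
n <- length(datasets)
d <- matrix(0, n, n, dimnames = list(names(datasets), names(datasets)))
for (i in seq_len(n)) for (j in seq_len(n)) {
  d[i, j] <- tv_distance(datasets[[i]], datasets[[j]])
}

round(d, 3)
## Datasets that merge low in the tree have similar profiles.
plot(hclust(as.dist(d), method = "average"))
```

Whether total variation is the right distance depends on the data; the same loop would work with any other pairwise measure, and the dendrogram extends naturally to n = 8 experiments.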

Awaiting your reply

Thanks

Atul


--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-compare-stacked-histograms-datasets-tp4635668p4635744.html
Sent from the R help mailing list archive at Nabble.com.
