Re: [R] How to test if a slope is different than 1?

2012-04-25 Thread Greg Snow
Doesn't the p-value from using the offset work for you, if you really
need a p-value?  The confint method is a quick and easy way to see if
the slope is significantly different from 1 (see Rolf's response), but
it does not provide an exact p-value.  I suppose you could compute
confidence intervals at different confidence levels until you find the
level at which one of the limits is close enough to 1, but that seems
like far too much work.  You could also compute the p-value by taking
the slope minus 1, dividing by the standard error, and plugging that
into the pt function with the correct degrees of freedom.  You could
even write a function to do that for you, but it still seems like more
work than adding the offset to the formula.
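If it helps, that slope-minus-1-over-SE calculation is easy to wrap up; here is a sketch against a generic lm fit (the function and argument names are mine, not anything standard):

```r
## Two-sided test of H0: coefficient == null.value (default 1) for an lm fit.
## Sketch only; 'coef.name' must match a row name of coef(summary(fit)).
slope.test <- function(fit, coef.name, null.value = 1) {
  est <- coef(summary(fit))[coef.name, "Estimate"]
  se  <- coef(summary(fit))[coef.name, "Std. Error"]
  tstat <- (est - null.value) / se
  p <- 2 * pt(-abs(tstat), df.residual(fit))   # two-sided p-value
  c(t = tstat, df = df.residual(fit), p.value = p)
}
```

With the model in this thread that would be something like slope.test(fit, "log(data$SIZE, 10)"), and the p-value it returns matches the one printed for the slope term after adding the offset to the formula.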

On Tue, Apr 24, 2012 at 8:17 AM, Mark Na mtb...@gmail.com wrote:
 Hi Greg. Thanks for your reply. Do you know if there is a way to use the
 confint function to get a p-value on this test?

 Thanks, Mark



 On Mon, Apr 23, 2012 at 3:10 PM, Greg Snow 538...@gmail.com wrote:

 One option is to subtract the continuous variable from y before doing
 the regression (this works with any regression package/function).  The
 probably better way in R is to use the 'offset' function:

 formula = I(log(data$AB.obs + 1, 10) - log(data$SIZE, 10)) ~
 log(data$SIZE, 10) + data$Y
 formula = log(data$AB.obs + 1, 10) ~ offset( log(data$SIZE, 10) ) +
 log(data$SIZE, 10) + data$Y

 Or you can use a function like 'confint' to find the confidence
 interval for the slope and see if 1 is in the interval.

 On Mon, Apr 23, 2012 at 12:11 PM, Mark Na mtb...@gmail.com wrote:
  Dear R-helpers,
 
  I would like to test if the slope corresponding to a continuous variable
  in
  my model (summary below) is different than one.
 
  I would appreciate any ideas for how I could do this in R, after having
  specified and run this model.
 
  Many thanks,
 
  Mark Na
 
 
 
  Call:
  lm(formula = log(data$AB.obs + 1, 10) ~ log(data$SIZE, 10) +
    data$Y)
 
  Residuals:
     Min       1Q   Median       3Q      Max
  -0.94368 -0.13870  0.04398  0.17825  0.63365
 
  Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
  (Intercept)        -1.18282    0.09120 -12.970   <2e-16 ***
  log(data$SIZE, 10)  0.56009    0.02564  21.846   <2e-16 ***
  data$Y2008          0.16825    0.04366   3.854 0.000151 ***
  data$Y2009          0.20310    0.04707   4.315 2.38e-05 ***
  ---
  Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 
  Residual standard error: 0.2793 on 228 degrees of freedom
  Multiple R-squared: 0.6768,     Adjusted R-squared: 0.6726
  F-statistic: 159.2 on 3 and 228 DF,  p-value: < 2.2e-16
 
 
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



 --
 Gregory (Greg) L. Snow Ph.D.
 538...@gmail.com





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Accessing a list

2012-04-25 Thread Greg Snow
I believe that fortune(312) applies here.  As my current version of the
fortunes package does not show it, I am guessing that it is in the
development version, so here is what fortune(312) will eventually
print (unless something changes or I got something wrong):

The problem here is that the $ notation is a magical shortcut and like
any other magic if used
incorrectly is likely to do the programmatic equivalent of turning
yourself into a toad.
—Greg Snow (in response to a user that wanted to access a column whose name is
stored in y via x$y rather than x[[y]])
R-help (February 2012)
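A minimal illustration of the toad-avoidance the fortune is about (toy data, my own names):

```r
x <- data.frame(a = 1:3, b = 4:6)
y <- "a"
x$y      # NULL -- $ looks for a column literally named "y"
x[[y]]   # 1 2 3 -- [[ evaluates y and uses its value as the name
```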

On Tue, Apr 24, 2012 at 9:42 PM, Jim Silverton jim.silver...@gmail.com wrote:
 Hi,
 I have the following problem: I want to access a list whose elements are
 imp1, imp2, imp3, etc.  I tried using the paste command in a for loop (see
 the last for loop below), but I keep calling it df, while df = imp1 (for the
 first run). Any ideas on how I can access the elements of the list?

 Isaac



 require(Amelia)
 library(Amelia)
 data.use <- read.csv("multiplecarol.CSV", header = TRUE)
 names(data.use) <- c("year", "dischargex1", "y", "pressurex2", "windx3")

 ts <- c( c(1:12), c(1:12), c(1:12), c(1:12), c(1:12), c(1:12), c(1:12),
 c(1:6) )
 length(ts)
 data.use = cbind(ts, data.use)

 #a.out2 <- amelia(data.use, m = 1000, idvars = "year")


 n.times = 100
 a.out.time <- amelia(data.use, m = n.times, ts = ts, idvars = "year",
 polytime = 2)

 constant.col = dischargex1.col = pressurex2.col = windx3.col =
 rep(0,n.times)

 for (i in 1: n.times)
 {
 x = c("imp", i)
 df = paste(x, collapse = "")
 data1 = a.out.time[[1]]$df
 attach(data1)
 y = as.numeric(y)
 dischargex1 = as.numeric(dischargex1)
 pressurex2 = as.numeric(pressurex2)
 windx3 = as.numeric(windx3)
 multi.regress = lm(y~ dischargex1 + pressurex2 + windx3)
 constant.col[i] = as.numeric(multi.regress[[1]][1])
 dischargex1.col[i] = as.numeric(multi.regress[[1]][2])
 pressurex2.col[i] = as.numeric(multi.regress[[1]][3])
 windx3.col[i] = as.numeric(multi.regress[[1]][4])
 }


 --
 Thanks,
 Jim.





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Scatter plot / LOESS, or LOWESS for more than one parameter

2012-04-24 Thread Greg Snow
Assuming that you want event as the x-axis (horizontal) you can do
something like (untested without reproducible data):

par(mfrow=c(2,1))
scatter.smooth( event, pH1 )
scatter.smooth( event, pH2 )

or

plot( event, pH1, ylim=range(pH1,pH2) , col='blue')
points( event, pH2, col='green' )
lines( loess.smooth(event,pH1), col='blue')
lines( loess.smooth(event,pH2), col='green')

Only use the second approach if pH1 and pH2 are measured on the same
scale, so that the comparison and any crossings are meaningful, or if
there is enough separation between the two series that they do not
overlap while each still shows enough detail.
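Since the code above is untested, here is the same idea run against invented pH data (the series below are pure stand-ins for the poster's 400 events):

```r
# Hypothetical data standing in for the poster's 400 events
set.seed(42)
event <- 1:400
pH1 <- 4.1 + 0.2 * sin(event / 60) + rnorm(400, sd = 0.1)
pH2 <- 6.0 + 0.1 * cos(event / 60) + rnorm(400, sd = 0.1)

par(mfrow = c(2, 1))                  # two stacked panels
scatter.smooth(event, pH1, col = "blue")   # points plus a loess line
scatter.smooth(event, pH2, col = "green")
```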



On Mon, Apr 23, 2012 at 10:40 PM, R. Michael Weylandt
michael.weyla...@gmail.com wrote:
 The scatter plot is easy:

 plot(pH1 ~ pH2, data = OBJ)

 When you say a loess for each -- how do you break them up? Are there
 repeat values for pH1? If so, this might be hard to do in base
 graphics, but ggplot2 would make it easy:

 library(ggplot2)
 ggplot(OBJ, aes(x = pH1, y = pH2)) + geom_point() + stat_smooth() +
 facet_wrap(~factor(pH1))

 or something similar.

 Michael

 On Mon, Apr 23, 2012 at 11:26 PM, David Doyle kydaviddo...@gmail.com wrote:
 Hi folks.

 If I have the following in my data

 event  pH1  pH2
 1      4.0  6.0
 2      4.3  5.9
 3      4.1  6.1
 4      4.0  5.9
 and on and on, for about 400 events

 Is there a way I can get R to plot event vs. pH1  and event vs. pH2 and
 then do a loess or lowess line for each??

 Thanks in advance
 David





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] need advice on using excel to check data for import into R

2012-04-23 Thread Greg Snow
This is really a job for a database, and Excel is not a database (even
though many think it is).  I have some clients that I have convinced
to create an Access database rather than use Excel (it is still an MS
product, so it can't be that scary, right?).  They were often a little
reluctant at first because they would be using a new tool and they
actually had to think about the design of the database up front, but
once they got to serious data entry they were very grateful that I
directed them to Access over Excel.  Databases have tools to validate
data on entry, so there will be fewer cases where you need to ask
for corrections (and it will be easier for them to fix any problems
that do sneak through).
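Until the data lives in a real database, the duplicate check can at least be mirrored in R when files arrive; a sketch, with purely hypothetical column names:

```r
## Flag duplicated incident-number / incident-date / unit-ID combinations,
## i.e. the same engine apparently responding twice to one call.
check.dups <- function(d, keys = c("incident", "date", "unit")) {
  stopifnot(all(keys %in% names(d)))   # required fields must be present
  which(duplicated(d[keys]))           # row numbers of offending records
}

calls <- data.frame(incident = c(101, 101, 102),
                    date = c("2012-04-01", "2012-04-01", "2012-04-02"),
                    unit = c("E1", "E1", "E1"))
check.dups(calls)   # row 2 duplicates row 1
```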

On Sun, Apr 22, 2012 at 12:34 PM, Markus Weisner r...@themarkus.com wrote:
 I have created an S4 object type for conducting fire department data
 analysis.  The object includes a validity check that ensures certain fields
 are present and that duplicate records don't exist for certain combinations
 of columns (e.g. no duplicate incident number / incident date / unit ID
 ensures that the data does not show the same fire engine responding twice
 on the same call).

 I am finding that I spend a lot of time taking client data, converting it
 to my S4 object, and then sending it back to the client to correct data
 validity issues.

 I am trying to figure out a clever way to have excel (typically the program
 used by my clients) check client data prior to them submitting it to me.  I
 have been working with somebody on trying to develop an excel toolbar
 add-in with limited success.

 My question is whether anybody can think of clever alternatives for clients
 to validate their data … for example, is there an R Excel plugin (that would
 be easily installed by a client) where I might be able to write some lines of
 R to check the data and output messages … or maybe some sort of server
 where they could upload their data and I could have some lines of R code
 that would check the data and send back potential error messages?

 I realize this is a fairly open ended question … just looking for some
 general ideas and directions to go. Getting a little frustrated with
 spending most of my work time dealing with data cleaning issues … guessing
 this is a problem shared by many of us that use R!

 Thanks,
 Markus







-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Selecting columns whose names contain "mutated", except when they also contain "non" or "un"

2012-04-23 Thread Greg Snow
Here is a method that uses a negative lookbehind:

 tmp <- c('mutation', 'nonmutated', 'unmutated', 'verymutated', 'other')
 grep("(?<!un)(?<!non)muta", tmp, perl = TRUE)
[1] 1 4

It looks for "muta" that is not immediately preceded by "un" or "non"
(but it would match "unusually mutated", since the "un" is not
immediately before the "muta").
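For instance, with a couple of extra test strings of my own added to the vector:

```r
tmp <- c("mutation", "nonmutated", "unmutated",
         "verymutated", "unusually mutated")
# perl-style negative lookbehind: "muta" not directly preceded by un/non
grepl("(?<!un)(?<!non)muta", tmp, perl = TRUE)
# TRUE FALSE FALSE TRUE TRUE -- "unusually mutated" matches as described
```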

Hope this helps,

On Mon, Apr 23, 2012 at 10:10 AM, Paul Miller pjmiller...@yahoo.com wrote:
 Hello All,

 Started out a while ago trying to select columns in a dataframe whose names
 contain some variation of the word mutant, using code like:

 names(KRASyn)[grep("muta", names(KRASyn))]

 The idea then would be to add together the various columns using code like:

 KRASyn$Mutant_comb <- rowSums(KRASyn[grep("muta", names(KRASyn))])

 What I discovered, though, is that this selects columns like nonmutated and
 unmutated as well as columns like mutated, mutation, and mutational.

 So I'd like to know how to select columns that have some variation of the
 word mutant without the non or the un. I've been looking around for an
 example of how to do that but haven't found anything yet.

 Can anyone show me how to select the columns I need?

 Thanks,

 Paul




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] How to test if a slope is different than 1?

2012-04-23 Thread Greg Snow
One option is to subtract the continuous variable from y before doing
the regression (this works with any regression package/function).  The
probably better way in R is to use the 'offset' function:

formula = I(log(data$AB.obs + 1, 10) - log(data$SIZE, 10)) ~
log(data$SIZE, 10) + data$Y
formula = log(data$AB.obs + 1, 10) ~ offset( log(data$SIZE, 10) ) +
log(data$SIZE, 10) + data$Y

Or you can use a function like 'confint' to find the confidence
interval for the slope and see if 1 is in the interval.
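A quick simulated check of the offset approach (the data and the true slope of 1.2 are invented, and I drop the +1 from the original model since these simulated abundances are never zero): in the offset model, the printed coefficient for log10(SIZE) is the slope minus 1, so its p-value tests slope = 1 directly.

```r
set.seed(1)
SIZE <- exp(rnorm(100))
AB.obs <- SIZE^1.2 * 10^rnorm(100, sd = 0.2)   # true slope 1.2 on log10 scale

fit <- lm(log10(AB.obs) ~ offset(log10(SIZE)) + log10(SIZE))
coef(summary(fit))["log10(SIZE)", ]  # Estimate is slope - 1; p tests slope == 1
confint(fit)                         # CI for slope - 1; check whether 0 is inside
```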

On Mon, Apr 23, 2012 at 12:11 PM, Mark Na mtb...@gmail.com wrote:
 Dear R-helpers,

 I would like to test if the slope corresponding to a continuous variable in
 my model (summary below) is different than one.

 I would appreciate any ideas for how I could do this in R, after having
 specified and run this model.

 Many thanks,

 Mark Na



 Call:
 lm(formula = log(data$AB.obs + 1, 10) ~ log(data$SIZE, 10) +
   data$Y)

 Residuals:
    Min       1Q   Median       3Q      Max
 -0.94368 -0.13870  0.04398  0.17825  0.63365

 Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
 (Intercept)        -1.18282    0.09120 -12.970   <2e-16 ***
 log(data$SIZE, 10)  0.56009    0.02564  21.846   <2e-16 ***
 data$Y2008          0.16825    0.04366   3.854 0.000151 ***
 data$Y2009          0.20310    0.04707   4.315 2.38e-05 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Residual standard error: 0.2793 on 228 degrees of freedom
 Multiple R-squared: 0.6768,     Adjusted R-squared: 0.6726
 F-statistic: 159.2 on 3 and 228 DF,  p-value: < 2.2e-16







-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] xyplot - ordering factors in graph

2012-04-21 Thread Greg Snow
R works on the idea that factor level ordering is a property of the
data rather than a property of the graph.  So if you have the factor
levels ordered properly in the data, then the graph will take care of
itself.  To order the levels see functions like: factor, relevel, and
reorder.
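For numeric-looking levels like those in the question, one way (the values below are copied from the numbers shown):

```r
f <- factor(c("10", "32", "21", "2", "22", "4"))
levels(f)    # alphabetical by default, so "10" sorts before "2"

# re-specify the levels in numeric order; the data values are unchanged
f2 <- factor(f, levels = as.character(sort(as.numeric(levels(f)))))
levels(f2)   # "2" "4" "10" "21" "22" "32"
```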

On Sat, Apr 21, 2012 at 1:23 AM, pip philsiv...@hotmail.com wrote:
 Hello - newbie
 Have created a lattice graph and want to know how to sort one of the
 elements which is a factor.
 The factor numbers in graph are - eg - 10   32  21
                                                                      2
 22  4 etc

 Regards





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Print warning messages and save them automatically in a file

2012-04-20 Thread Greg Snow
Would using the 'sink' function with type='message' and split=TRUE do
what you want?
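A sketch of the message sink (one caveat I should flag: as far as I can tell, split=TRUE is only honoured for type="output", so a message sink sends the text to the file only):

```r
zz <- file("warning.log", open = "wt")
sink(zz, type = "message")     # divert the message stream (warnings, messages)
message("something happened")  # would normally print to the console
sink(type = "message")         # restore the console
close(zz)
readLines("warning.log")       # "something happened"
```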

On Thu, Apr 19, 2012 at 2:00 AM, Alexander juschitz_alexan...@yahoo.de wrote:
 Hello,
 I am working under R 2.11.0 on Windows, and I would like to ask if you know
 a way to save all warning messages produced by the R function warning in a
 file while keeping the functionality of the base function warning.  For
 example, if I use external code, I don't want to replace every line containing
 warning(...) with a self-written function. I want to execute it normally, and
 every time the external code makes a call to warning, I want the warning
 message printed to the console AND written to a file.

 My first solution is to redefine the function warning in the global
 environment such as:

 warning <- function(...){
   write(..., "Warning.log", append = TRUE)
   base::warning(...)
 # unfortunately the warning now always appears to come from the redefined
 # warning function in the .GlobalEnv and no longer indicates where the
 # original warning happened :-(
 }

 This solution isn't very clean. I would like instead to try redefining
 warning.expression in options. In that case, I don't understand how the
 passing of arguments works. I would like to do something like:

 options(warning.expression = quote({
   write(..., "Warning.log", append = TRUE)
   ?
 }))

 I put the "?" because I don't know how I should call the function warning
 without being recursive, nor how I can pass arguments.

 Thank you

 Alexander





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Matrix multiplication by multple constants

2012-04-20 Thread Greg Snow
And another way is to remember properties of matrix multiplication:

y %*% diag(x)
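All three suggestions in the thread give the same answer; a quick check:

```r
x <- 1:3
y <- matrix(1:12, ncol = 3, nrow = 4)

a  <- y %*% diag(x)                # matrix algebra: scale column j by x[j]
b  <- sweep(y, 2, x, "*")          # sweep x across the columns
c2 <- y * rep(x, each = nrow(y))   # recycle x down each column
a[, 3]   # 27 30 33 36, as in the desired z below
```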



On Fri, Apr 20, 2012 at 8:35 AM, David Winsemius dwinsem...@comcast.net wrote:

 On Apr 20, 2012, at 4:57 AM, Dimitris Rizopoulos wrote:

 try this:

 x <- 1:3
 y <- matrix(1:12, ncol = 3, nrow = 4)

 y * rep(x, each = nrow(y))


 Another way with a function specifically designed for that purpose:

 sweep(y, 2, x, "*")

 -- David.




 I hope it helps.

 Best,
 Dimitris


 On 4/20/2012 10:51 AM, Vincy Pyne wrote:

 Dear R helpers

 Suppose

 x <- c(1:3)

 y <- matrix(1:12, ncol = 3, nrow = 4)

 y

     [,1] [,2] [,3]
 [1,]    1    5    9
 [2,]    2    6   10
 [3,]    3    7   11
 [4,]    4    8   12

 I wish to multiply 1st column of y by first element of x i.e. 1, 2nd
 column of y by 2nd element of x i.e. 2 an so on. Thus the resultant matrix
 should be like

 z


     [,1]   [,2]    [,3]

 [1,]    1    10    27

 [2,]    2    12    30

 [3,]    3    14    33

 [4,]    4    16    36


 When I tried simple multiplication like x*y, y is getting multiplied
 column-wise

 x*y

      [,1] [,2] [,3]
 [1,]    1    5    9
 [2,]    4   12   20
 [3,]    9   21   33
 [4,]   16   32   48


 Kindly guide

 Regards

 Vincy







 --
 Dimitris Rizopoulos
 Assistant Professor
 Department of Biostatistics
 Erasmus University Medical Center

 Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
 Tel: +31/(0)10/7043478
 Fax: +31/(0)10/7043014
 Web: http://www.erasmusmc.nl/biostatistiek/



 David Winsemius, MD
 West Hartford, CT





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Ternaryplot as an inset graph

2012-04-20 Thread Greg Snow
The triplot function in the TeachingDemos package uses base graphics,
the subplot function (also in TeachingDemos) is another way to place a
plot within a plot (and triplot and subplot do work together).

If you want to stick to grid graphics then you can use viewports in
grid to insert one plot into another.

On Fri, Apr 20, 2012 at 10:05 AM, Ben Bolker bbol...@gmail.com wrote:
 Young, Jennifer A Jennifer.Young at dfo-mpo.gc.ca writes:

 I am trying to add a ternary plot as a corner inset graph to a larger
 main ternary plot. I have successfully used add.scatter in the past for
 different kinds of plots but It doesn't seem to work for this particular
 function. It overlays the old plot rather than plotting as an inset.

 Here is a simple version of what I'm trying. Note that if I change the
 inset plot to be an ordinary scatter, for instance, it works as
 expected.

 library(ade4)
 library(vcd)
 tdat <- data.frame(x = runif(20), y = rlnorm(20), z = rlnorm(20))
 insetPlot <- function(data){
   ternaryplot(data)
 }
 ternaryplot(tdat)
 add.scatter(insetPlot(tdat), posi = "topleft", ratio = .2)


  I think the problem is that add.scatter assumes you're using base
 graphics, while ternaryplot() uses grid graphics.  Mixing and
 matching grid+base graphs is a little bit tricky.  You might try it with
 triax.plot() from the plotrix package, which I believe does ternary
 plots in base graphics ...

  Ben Bolker




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Peirce's criterion

2012-04-19 Thread Greg Snow
Determining what is an outlier is complicated regardless of the tools
used (this is a philosophical issue rather than an R issue).  You need
to make some assumptions and definitions based on the science that
produces the data rather than the data itself before even approaching
the question of outliers.  What is an outlier for a normal
distribution may be reasonable from a gamma distribution and
completely expected from a cauchy distribution.

See the 'outliers' dataset in the TeachingDemos package, and more
importantly the examples in the help page for it, for a demonstration
of the perils of automatic outlier deletion.

On Wed, Apr 18, 2012 at 4:11 PM, Ryan Murphy rmurp...@u.rochester.edu wrote:
 Hello all,

 I would like to rigorously test whether observations in my dataset are
 outliers.  I guess all the main tests in R (Grubbs) impose the assumption
 of normality.  My data is surely not normal, so I would like to use
 something else.  As far as I can tell from Wikipedia, Peirce's criterion is
 just that.

 The data I am interested in testing is: 1) continuous on the unit interval,
 2) discrete, 3) ordinal on 0-6.  If you need more specifics, (1) refers to
 the Gini index of inequality, (2) refers to measures of the number of
 assassinations, strikes, etc. in a country, and (3) refers to ranking data
 on how politically free a country is.

 Does R do this test?

 Thanks a lot, and PS: unlike many economists, I prefer R over Stata.  R >
 Stata!

 Sincerely,
 Ryan Murphy

 --
 Ryan Murphy
 2012
 B.A. Economics and Mathematics
 339-223-4181





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] call object from character?

2012-04-19 Thread Greg Snow
Almost always when people ask this question (it and its answer are FAQ
7.21), it is because they want to do things the wrong way (they just
don't know there is a better way).

The better way is to put the variables that you want to access in this
way into a list, then you can easily access the objects in the list by
name (or position, or all of them, etc.):

mylist <- list(a = 12)
call_A <- 'a'
mylist[[call_A]]

Adding more objects to the list is easier than creating new data
objects in the global environment, and if you want to do something
with all those objects (copy, delete, rename, etc.) then you have 1
list to work with rather than a bunch of separate objects.  If you
want to do the same operation on all the objects (the common follow-up
question) then if they are in a list you can use lapply, sapply, or
vapply and it is much simpler than looping and getting.
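And for the common follow-up (the same operation on every element), sketched with stand-in data frames named like the imputations in the other thread:

```r
imps <- list(imp1 = data.frame(y = 1:5, x = 5:1),
             imp2 = data.frame(y = 2:6, x = 6:2))

imps[["imp2"]]                       # access by a constructed name
sapply(imps, function(d) mean(d$y))  # one result per element, named
```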

On Wed, Apr 18, 2012 at 8:25 PM, chuck.01 charliethebrow...@gmail.com wrote:
 Let's say I have an object (I hope my terminology is correct) a
 a <- 12
 a
 [1] 12

 and a has been assigned the number 12, or whatever.
 And let's say I have a character call_A
 call_A <- "a"
 call_A
 [1] "a"

 What is the function F that allows this to happen:
 F( call_A )
 [1] 12






-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Effeciently sum 3d table

2012-04-16 Thread Greg Snow
Look at the Reduce function.

On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra dava...@verizon.net wrote:
 I have a large number of 3d tables that I wish to sum.
 Is there an efficient way to do this? Or perhaps a function I can call?

 I tried using do.call(sum, listoftables), but that returns a single value.

 So far, it seems only a loop will do the job.


 TIA,
 DAV




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Effeciently sum 3d table

2012-04-16 Thread Greg Snow
Here is a simple example:

 mylist <- replicate(4, matrix(rnorm(12), ncol = 3), simplify = FALSE)
 A <- Reduce( `+`, mylist )
 B <- mylist[[1]] + mylist[[2]] + mylist[[3]] + mylist[[4]]
 all.equal(A, B)
[1] TRUE

Basically, what Reduce does is first apply the function (`+` in
this case) to the first two elements of mylist, then apply it to that
result and the 3rd element, then that result and the 4th element (and
it would continue on if mylist had more than 4 elements).  It is
basically a way to build functions like sum out of functions like `+`
that only work on 2 objects at a time.

Another way to see what it is doing is to run something like:

 Reduce( function(a, b){ cat("I am adding", a, "and", b, "\n"); a + b }, 1:10 )

The Reduce function will probably not be any faster than a really well
written loop, but will probably be faster (both to write the command
and to run) than a poorly designed naive loop application.


On Mon, Apr 16, 2012 at 12:52 PM, David A Vavra dava...@verizon.net wrote:
 Thanks Greg,

 I think this may be what I'm after, but the documentation for it isn't
 particularly clear. I hate it when someone documents a piece of code by
 saying it works kinda like some other code (running elsewhere, of course),
 making the tacit assumption that everybody will immediately know what that
 means and implies.

 I'm sure I'll understand it once I know what it is trying to say. :) There's
 an item in the examples which may be exactly what I'm after.

 DAV


 -Original Message-
 From: Greg Snow [mailto:538...@gmail.com]
 Sent: Monday, April 16, 2012 11:54 AM
 To: David A Vavra
 Cc: r-help@r-project.org
 Subject: Re: [R] Effeciently sum 3d table

 Look at the Reduce function.

 On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra dava...@verizon.net wrote:
 I have a large number of 3d tables that I wish to sum
 Is there an efficient way to do this? Or perhaps a function I can call?

 I tried using do.call(sum,listoftables) but that returns a single value.

 So far, it seems only a loop will do the job.


 TIA,
 DAV


 --
 Gregory (Greg) L. Snow Ph.D.
 538...@gmail.com




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Gradients in bar charts XXXX

2012-04-13 Thread Greg Snow
Here is one approach:

tmp <- rbinom(10, 100, 0.78)

mp <- barplot(tmp, space = 0, ylim = c(0, 100))

tmpfun <- colorRamp( c('green', 'yellow', rep('red', 8)) )

mat <- 1 - row(matrix( nrow = 100, ncol = 10 )) / 100
tmp2 <- tmpfun(mat)

mat2 <- as.raster( matrix( rgb(tmp2, maxColorValue = 255), ncol = 10 ) )

for(i in 1:10) mat2[ mat[, i] >= tmp[i]/100, i ] <- NA


rasterImage(mat2, mp[1] - (mp[2]-mp[1])/2, 0, mp[10] + (mp[2]-mp[1])/2, 100,
interpolate = FALSE)

barplot(tmp, col = NA, add = TRUE, space = 0)


You can tweak it to your desire.  It might look a little better if
each bar were drawn independently with interpolate=TRUE (this would
also be needed if you had space between the bars).


On Mon, Apr 9, 2012 at 12:40 PM, Jason Rodriguez
jason.rodrig...@dca.ga.gov wrote:
 Hello, I have a graphics-related question:

 I was wondering if anyone knows of a way to create a bar chart that is 
 colored with a three-part gradient that changes at fixed y-values. Each bar 
 needs to fade green-to-yellow at Y=.10 and from yellow-to-red at Y=.20. Is 
 there an option in a package somewhere that offers an easy way to do this?

 Attached is a chart I macgyvered together in Excel using a combination of a 
 simple bar chart, fit line, and some drawing tools. I want to avoid doing it 
 this way in the future by finding a way to replicate it in R.

 Any ideas?

 Thanks,

 Jason Michael Rodriguez
 Data Analyst
 State Housing Trust Fund for the Homeless
 Georgia Department of Community Affairs
 Email:  jason.rodrig...@dca.ga.gov


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Curve fitting, probably splines

2012-04-13 Thread Greg Snow
This sounds like possibly using logsplines may be what you want.  See
the 'oldlogspline' function in the 'logspline' package.

On Thu, Apr 12, 2012 at 7:45 AM, Michael Haenlein
haenl...@escpeurope.eu wrote:
 Dear all,

 This is probably more related to statistics than to [R] but I hope someone
 can give me an idea how to solve it nevertheless:

 Assume I have a variable y that is a function of x: y=f(x). I know the
 average value of y for different intervals of x. For example, I know that
 in the interval [0;x1] the average y is y1, in the interval [x1;x2] the
 average y is y2 and so forth.

 I would like to find a line of minimum curvature so that the average values
 of y in each interval correspond to y1, y2, ...

 My idea was to use (cubic) splines. But the problem I have seems somewhat
 different to what is usually done with splines. As far as I understand it,
 splines help to find a curve that passes a set of given points. But I don't
 have any points, I only have average values of y per interval.

 If you have any suggestions on how to solve this, I'd love to hear them.

 Thanks very much in advance,

 Michael

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to compute a vector of min values ?

2012-04-07 Thread Greg Snow
Peter showed how to get the minimums from a list or data frame using
sapply, here is a way to copy your 1440 vectors into a single list
(doing this and keeping your data in a list instead of separate
vectors will make your life easier in general):

my.list <- lapply( 1:1440, function(x) get( sprintf("v%i", x)) )

You can then name the elements of the list, if you want, with something like:

names(my.list) <- sprintf("v%i", 1:1440)

Then if all the vectors are of the same length you can convert this
into a data frame with:

df <- as.data.frame(my.list)

But this is not needed as most of the work can be done with it as a
list (and if they are different lengths then the list is how it should
stay).

Either way you can now use sapply on the list/data frame to get all
the minimums.

To anticipate a possible future question, if you next want the minimum
of each position across vectors then you can use the pmin function:

do.call( pmin, my.list )
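Using the small example from the question (with v3 changed so the two
results differ), the two operations look like this:

```r
v1 <- c(1, 2, 3); v2 <- c(2, 3, 4); v3 <- c(0, 4, 5)
my.list <- list(v1 = v1, v2 = v2, v3 = v3)

sapply(my.list, min)     # one minimum per vector:   v1=1 v2=2 v3=0
do.call(pmin, my.list)   # minimum at each position: 0 2 3
```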


On Fri, Apr 6, 2012 at 12:29 AM, peter dalgaard pda...@gmail.com wrote:

 On Apr 6, 2012, at 00:25 , ikuzar wrote:

 Hi,

 I'd like to know how to get a vector of min value from many vectors without
 making a loop. For example :

 v1 = c( 1, 2, 3)
 v2 =  c( 2, 3, 4)
 v3 = c(3, 4, 5)
 df = data.frame(v1, v2, v3)
 df
  v1 v2 v3
 1  1  2  3
 2  2  3  4
 3  3  4  5
 min_vect = min(df)
 min_vect
 [1] 1

 I'd like to get min_vect = (1, 2, 3), where 1 is the min of v1, 2 is the
 min of v2 and 3 is the min of v3.

 The example above is very easy but, in reality, I have v1, v2, ..., v1440

 sapply(df, min)

 (possibly sapply(df, min, na.rm=TRUE) )


 Thanks for your help,

 ikuzar

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/how-to-compute-a-vector-of-min-values-tp4536224p4536224.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 --
 Peter Dalgaard, Professor,
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bayesian 95% Credible interval

2012-04-07 Thread Greg Snow
The emp.hpd function in the TeachingDemos package will do this (it
assumes the result is a single interval: either a unimodal posterior, or
a multimodal one where the valleys between modes do not drop low enough
to split the interval).  I am sure there are similar functions in other
packages as well.

On Fri, Apr 6, 2012 at 12:39 PM, Gyanendra Pokharel
gyanendra.pokha...@gmail.com wrote:
 Hi all,
 I have the data from the posterior distribution for some parameter. I want
 to find the 95% credible interval. I think t.test(data) is only for the
 confidence interval. I did not find a function for the Bayesian credible
 interval. Could someone suggest one?

 Thanks

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Histogram classwise

2012-04-05 Thread Greg Snow
You might want to look at the lattice or ggplot2 packages, both of
which can create a graph for each of the classes.
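For instance, with lattice the per-class histograms are a single call
with no loop (a sketch using the data from the question):

```r
library(lattice)

d <- data.frame(
  x     = c(27, 93, 65, 1, 69, 2, 92, 49, 55, 46, 51, 100),
  class = c(1, 3, 5, 2, 5, 1, 4, 5, 4, 1, 3, 4)
)

# one histogram panel per class
histogram(~ x | factor(class), data = d)
```

The rough ggplot2 analogue would be
`ggplot(d, aes(x)) + geom_histogram() + facet_wrap(~ class)`.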

On Tue, Apr 3, 2012 at 6:20 AM, arunkumar akpbond...@gmail.com wrote:
 Hi
 I have data organized by class. I want to create a histogram for each class
 without using a for loop, as that takes a long time.
 my data looks like this

 x       class
 27      1
 93      3
 65      5
 1       2
 69      5
 2       1
 92      4
 49      5
 55      4
 46      1
 51      3
 100     4




 -
 Thanks in Advance
        Arun
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Histogram-classwise-tp4528624p4528624.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify with mfcol=c(1,2)

2012-04-05 Thread Greg Snow
I tried your code, first I removed the reference to the global
variable data$Line, then it works if I finish identifying by either
right clicking (I am in windows) and choosing stop, or using the stop
menu.  It does as you say if I press escape or use the stop sign
button (both stop the whole evaluation rather than just the
identifying).

On Tue, Apr 3, 2012 at 8:52 AM, John Sorkin jsor...@grecc.umaryland.edu wrote:
 I would like to have a figure with two graphs. This is easily accomplished 
 using mfcol:

 oldpar <- par(mfcol=c(1,2))
 plot(x,y)
 plot(z,x)
 par(oldpar)

 I run into trouble if I try to use identify with the two plots. If, after 
 identifying points on my first graph I hit the ESC key, or hitting stop menu 
 bar of my R session, the system stops the identification process, but fails 
 to give me my second graph. Is there a way to allow for the identification of 
 points when one is plotting to graphs in a single graph window? My code 
 follows.

 plotter <- function(first, second) {
  # Allow for two plots in one graph window.
  oldpar <- par(mfcol=c(1,2))

  # Bland-Altman plot.
  plot((second+first)/2, second-first)
  abline(0,0)
  # Allow for identification of extreme values.
  BAzap <- identify((second+first)/2, second-first, labels = seq_along(data$Line))
  print(BAzap)

  # Plot second as a function of first value.
  plot(first, second, main="Limin vs. Limin", xlab="First (cm^2)",
       ylab="Second (cm^3)")
  # Add identity line.
  abline(0, 1, lty=2, col="red")
  # Allow for identification of extreme values.
  zap <- identify(first, second, labels = seq_along(data$Line))
  print(zap)
  # Add regression line.
  fit1 <- lm(first~second)
  print(summary(fit1))
  abline(fit1)
  print(summary(fit1)$sigma)

  # reset par to default values.
  par(oldpar)

 }
 plotter(first,second)


 Thanks,
 John






 John David Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 University of Maryland School of Medicine Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

 Confidentiality Statement:
 This email message, including any attachments, is for ...{{dropped:15}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] meaning of sigma from LM, is it the same as RMSE

2012-04-05 Thread Greg Snow
If you look at the code for summary.lm the line for the value of sigma is:

ans$sigma <- sqrt(resvar)

and above that we can see that resvar is defined as:

 resvar <- rss/rdf

If that is not sufficient you can find how rss and rdf are computed in
the code as well.
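So sigma is sqrt(RSS / residual df), i.e. the residual standard error
(an RMSE computed with the residual degrees of freedom rather than n).
A quick check on made-up data:

```r
set.seed(42)
x <- 1:20
y <- 2 + 3 * x + rnorm(20)
fit <- lm(y ~ x)

rss <- sum(residuals(fit)^2)   # residual sum of squares
rdf <- df.residual(fit)        # n minus number of coefficients (18 here)
all.equal(summary(fit)$sigma, sqrt(rss / rdf))  # TRUE
```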

On Tue, Apr 3, 2012 at 8:56 AM, John Sorkin jsor...@grecc.umaryland.edu wrote:
 Is the sigma from a lm, i.e.

 fit1 - lm(y~x)
 summary(fit1)
 summary(fit1)$sigma

 the RMSE (root mean square error)

 Thanks,
 John

 John David Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 University of Maryland School of Medicine Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

 Confidentiality Statement:
 This email message, including any attachments, is for ...{{dropped:14}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How does predict.loess work?

2012-04-05 Thread Greg Snow
Run the examples for the loess.demo function in the TeachingDemos
package to get a better understanding of what goes into the loess
predictions.
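Loosely, predict.loess evaluates the fitted surface at the new x
values, interpolating between the points where the local fits were
computed; new values inside the range of the original x are fine, while
extrapolation outside the range gives NA unless the model was fit with
surface = "direct".  A small sketch on made-up data:

```r
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- sin(x) + rnorm(50, sd = 0.2)

lo <- loess(y ~ x)
predict(lo, newdata = data.frame(x = c(2.5, 7.5)))  # fits at the new x values
```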

On Tue, Apr 3, 2012 at 2:12 PM, Recher She rrrecher@gmail.com wrote:
 Dear R community,

 I am trying to understand how the predict function, specifically, the
 predict.loess function works.

 I understand that the loess function calculates regression parameters at
 each data point in 'data'.

  lo <- loess(y ~ x, data)

  p <- predict(lo, newdata)

 I understand that the predict function predicts values for 'newdata'
 according to the loess regression parameters. How does predict.loess do
 this in the case that 'newdata' is different from the original data x? How
 does the interpolation take place?

 Thank you.

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simulate correlated binary, categorical and continuous variable

2012-04-04 Thread Greg Snow
How are you calculating the correlations?  That may be part of the
problem, when you categorize a continuous variable you get a factor
whose internal representation is a set of integers.  If you try to get
a correlation with that variable it will not be the polychoric
correlation.

Also do you need your data to have the exact proportions and means
that you show below? or represent random samples from those
populations and therefore the actual proportions and means will vary a
bit from what is specified?

If you are interested in tetrachoric and polychoric correlations, then
generating the latent normals and categorizing seems the most
straightforward method.
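A minimal sketch of that latent-normal approach for one binary and one
continuous variable (the latent correlation 0.5 and the cutpoint are
made-up numbers):

```r
library(MASS)  # for mvrnorm
set.seed(1)

Sigma <- matrix(c(1, 0.5, 0.5, 1), 2, 2)   # latent correlation
z <- mvrnorm(10000, mu = c(0, 0), Sigma = Sigma)

bi <- as.integer(z[, 1] > qnorm(1 - 0.4))  # binary with marginal p = 0.4
co <- 100 + 15 * z[, 2]                    # continuous N(100, 15)

mean(bi)  # close to 0.4, but the Pearson cor(bi, co) is not 0.5
```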

Also, which function (from which package) are you using to generate
your normal variables?  That may have some effect.

On Sun, Apr 1, 2012 at 7:00 PM, Burak Aydin burak235...@hotmail.com wrote:
 Hello Greg,
 Sorry for the confusion.
 Let's say I have a population.  I have 6 variables. They are correlated to
 each other. I can get you pearson correlation, tetrachoric or polychoric
 correlation coefficients.
 2 of them continuous, 2 binary, 2 categorical.
 Let's assume the following conditions:
 Co1 and Co2 are normally distributed continuous random variables: Co1 ~
 N(0,1), Co2 ~ N(100,15).
 Ca1 and Ca2 are categorical variables. Ca1 probabilities
 =c(.02,.18,.28,.22,.30), Ca2 probs =c(.06,.18,.76)
 Bi1 and Bi2 are binaries, Marginal probabilities Bi1 p= 0.4,  Bi2 p=0.5.
 And , again, I have the correlations.

 When I try to simulate this population I fail. If I keep the means and
 probabilities the same, I lose the correct correlations. When I keep the
 correlations, I lose precision on means and frequencies/probabilities.
 See these links please
 http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/copulademo.html
 http://stats.stackexchange.com/questions/22856/how-to-generate-correlated-test-data-that-has-bernoulli-categorical-and-contin
 http://www.springerlink.com/content/011x633m554u843g/



 --
 View this message in context: 
 http://r.789695.n4.nabble.com/simulate-correlated-binary-categorical-and-continuous-variable-tp4516433p4524863.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simulate correlated binary, categorical and continuous variable

2012-03-31 Thread Greg Snow
Your explanation below has me more confused than before.  Now it is
possible that it is just me, but it seems that if others understood it
then someone else would have given a better answer by now.  Are you
restricting your categorical and binary variables to be binned
versions of underlying normals?  If that is the case, I doubt that
there would be a more efficient way than binning a normal variable.

If not then can you show us more of what you want to produce?  along
with what you mean by correlation or covariance with categorical
variables (which is meaningless without additional
restrictions/assumptions).

On Fri, Mar 30, 2012 at 3:41 PM, Burak Aydin burak235...@hotmail.com wrote:
 Hello Greg,
 Thanks for your time,
 Let's say I know the Pearson covariance matrix.
 When I use rmvnorm to simulate 9 variables and then dichotomize/categorize
 them, I can't retrieve the population covariance matrix.

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/simulate-correlated-binary-categorical-and-continuous-variable-tp4516433p4520464.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] discrepancy between paired t test and glht on lme models

2012-03-30 Thread Greg Snow
I nominate the following paragraph for the fortunes package:

The basic issue appears to be that glht is not smart enough to deal
with degrees of freedom so it uses an asymptotic z-test instead of a
t-test. Infinite df, basically, and since 4 is a pretty poor
approximation of infinity, you get your discrepancy.



On Thu, Mar 29, 2012 at 1:36 AM, peter dalgaard pda...@gmail.com wrote:

 On Mar 28, 2012, at 20:23 , Rajasimhan Rajagovindan wrote:

 Hi folks,



 I am working with repeated measures data and I ran into issues where the
 paired t-test results did not match those obtained by employing glht()
 contrasts on a lme model. While the lme model itself appears to be fine,
 there seems to be some discrepancy with using glht() on the lme model
 (unless I am missing something here).  I was wondering if someone could
 help identify the issue. On my actual dataset the  differences between
 glht() and paired t test is more severe than the example provided here.


 You might want to move to the R-sig-ME (mixed effects) mailing list for up to 
 date advice.

 The basic issue appears to be that glht is not smart enough to deal with 
 degrees of freedom so it uses an asymptotic z-test instead of a t-test. 
 Infinite df, basically, and since 4 is a pretty poor approximation of 
 infinity, you get your discrepancy.

 It's not that surprising, given that lme() itself is pretty poor at figuring 
 out df in some cases. Especially if you have to deal with cross-stratum 
 effects, the calculation of appropriate degrees of freedom is nontrivial. 
 Some recent developments allow the calculation of Kenward-Roger for the 
 lmer() models, but I wouldn't know to what extend this carries to glht-style 
 testing.




 I am using glht() for my data since I need to perform pairwise comparisons
 across multiple levels, any alternate approach to performing posthoc
 comparisons on lme object is also welcome.

 I have included the code and the results from a mocked up data (one that I
 found online) here.





 require(nlme)

 require(multcomp)



 dv <- c(1,3,2,2,2,5,3,4,3,5)

 subject <- factor(c("s1","s1","s2","s2","s3","s3","s4","s4","s5","s5"))

 myfactor <- factor(c("f1","f2","f1","f2","f1","f2","f1","f2","f1","f2"))

 mydata <- data.frame(dv, subject, myfactor)

 rm(subject,myfactor,dv)

 attach(mydata)



 # paired t test (H0: f2-f1 = 0)

 t.test(mydata[myfactor=='f2',1],mydata[myfactor=='f1',1],paired=TRUE)

 # yields :  t = 3.1379, df = 4, p-value = 0.03492, mean of the differences=
 1.6





 # lme (f1 as reference level)

 fit.lme <- lme(dv ~ myfactor, random =
 ~1|subject, method="REML", correlation=corCompSymm(), data=mydata)

 summary(fit.lme) # yields identical results as paired t test

 # f2-f1:   t = 3.1379, df = 4, p-value = 0.0349



 summary(glht(fit.lme, linfct=mcp(myfactor="Tukey")))

 # while test statistic is comparable, p value is different

 # have noticed cases where the differences between glht() and paired t test
 is more severe



 ### sample outputs from the script ###


 # things appear ok here and match paired t test results
 #

 summary(fit.lme)

 Linear mixed-effects model fit by REML

 Data: mydata

       AIC      BIC    logLik

  36.43722 36.83443 -13.21861



 Random effects:

 Formula: ~1 | subject

        (Intercept)  Residual

 StdDev:   0.7420274 0.8058504



 Correlation Structure: Compound symmetry

 Formula: ~1 | subject

 Parameter estimate(s):

          Rho

 -0.0009325763

 Fixed effects: dv ~ myfactor

            Value Std.Error DF  t-value p-value

 (Intercept)   2.2 0.4898979  4 4.490732  0.0109

 myfactorf2    1.6 0.5099022  4 3.137857  0.0349

 Correlation:

           (Intr)

 myfactorf2 -0.52



 Standardized Within-Group Residuals:

        Min          Q1         Med          Q3         Max

 -1.45279696 -0.53193228  0.03481143  0.58490026  1.09867599



 Number of Observations: 10

 Number of Groups: 5



 # result differs from paired t test  !

 summary(glht(fit.lme, linfct=mcp(myfactor="Tukey")), test=adjusted("none"))



         Simultaneous Tests for General Linear Hypotheses



 Multiple Comparisons of Means: Tukey Contrasts





 Fit: lme.formula(fixed = dv ~ myfactor, data = mydata, random = ~1 |

    subject, correlation = corCompSymm(), method = REML)



 Linear Hypotheses:

              Estimate Std. Error z value Pr(>|z|)

  f2 - f1 == 0   1.6000     0.5099   3.138   0.0017 **     <--

 ---

  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 (Adjusted p values reported -- none method)



 

 platform       i386-pc-mingw32

 arch           i386

 os             mingw32

 system         i386, mingw32

 status

 major          2

 minor          13.1

 year           2011

 month          07

 day            08

 svn rev        56322

 language       R

 version.string R version 2.13.1 (2011-07-08)

       [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 

Re: [R] simulate correlated binary, categorical and continuous variable

2012-03-30 Thread Greg Snow
Partly this depends on what you mean by a covariance between
categorical (and binary) variables, and on what a covariance between a
categorical and a continuous variable would even be.

On Thu, Mar 29, 2012 at 12:31 PM, Burak Aydin burak235...@hotmail.com wrote:
 Hi,
 I'd like to simulate 9 variables: 3 binary, 3 categorical, and 3 continuous,
 with a known covariance matrix.
 Using mvtnorm and later dichotomizing/categorizing the variables is not efficient.
 Do you know any package or how to simulate mixed data?

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/simulate-correlated-binary-categorical-and-continuous-variable-tp4516433p4516433.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot points using circles filled half in red and half in blue.

2012-03-28 Thread Greg Snow
I would use the my.symbols function from the TeachingDemos package
(but then I might be a little bit biased), here is a simple example:

library(TeachingDemos)

x <- runif(25)
y <- runif(25)
z <- sample(1:4, 25, TRUE)

ms.halfcirc2 <- function(col, adj=pi/2, ...) {
    theta <- seq(0, 2*pi, length.out=300) + adj
    x <- cos(theta)
    y <- sin(theta)
    if(col==1) {
        polygon(x, y)
    } else if(col==2) {
        polygon(x, y, col='red')
    } else if(col==3) {
        polygon(x, y, col='blue')
    } else {
        polygon(x[1:150], y[1:150], border=NA, col='red')
        polygon(x[151:300], y[151:300], border=NA, col='blue')
        polygon(x, y)
    }
}

my.symbols( x, y, ms.halfcirc2, inches=1/5, add=FALSE,
symb.plots=TRUE, col=z)


# spice it up a bit
my.symbols( x, y, ms.halfcirc2, inches=1/5, add=FALSE,
symb.plots=TRUE, col=z, adj=runif(25, 0, pi))


Adjust things to fit better what you want.

On Tue, Mar 27, 2012 at 8:49 PM, alan alan.wu2...@gmail.com wrote:
 I want to plot many points and want to use circles. The filling color
 depends on variable a: if a=1, no fill;
 if a=2, fill with red; if a=3, fill with blue; if a=4, fill
 half with red and half with blue. Can anyone tell me how to plot the
 case a=4? Thanks a lot

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to test for the difference of means in population, please help

2012-03-27 Thread Greg Snow
You should use mixed effects modeling to analyze data of this sort.
This is not a topic that has generally been covered by introductory
classes, so you should consult with a professional statistician on
your problem, or educate yourself well beyond the novice level (this
takes more than just reading 1 book, a few classes would be good to
get to this level, or intense study of several books).

Since everything is balanced nicely, you could average over the 4
repeats and use a 2 sample t test (assuming the assumptions hold, your
sample data would be fine) comparing the 2 sets of 400 means.  This
will test for a general difference in the overall means, but ignores
other information and hypotheses that may be important (which is why
the mixed effects model approach is much preferred).
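With the poster's simulated data, the average-over-repeats version is
short (the poster's `c` is renamed here, since `c` masks a base
function):

```r
set.seed(1)
cond1 <- matrix(sample(1:20, 1600, replace = TRUE), 400, 4)
cond2 <- matrix(sample(1:20, 1600, replace = TRUE), 400, 4)

# one mean per point (averaging over the 4 repeats),
# then a two-sample t test on the two sets of 400 means
t.test(rowMeans(cond1), rowMeans(cond2))
```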

On Tue, Mar 27, 2012 at 1:13 AM, ali_protocol
mohammadianalimohammad...@gmail.com wrote:
 Dear all,

 Novice in statistics.

 I have 2 experimental conditions. Each condition has ~400 points as its
 response. Each condition is done in 4 repeats (so I have 2 x 400 x 4
 points).

 I want to compare the means of the two conditions and test whether they are
 the same or not. Which test should I use?

 #populations
 c = matrix (sample (1:20,1600, replace= TRUE), 400 ,4)
 b = matrix (sample (1:20,1600, replace= TRUE), 400 ,4)

 #means of repeats
 c.mean= apply (c,2, mean)
 b.mean= apply (b,2,mean)

 #mean of experiment
 c.mean.all= mean (c)
 b.mean.all= mean (b)

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/How-to-test-for-the-difference-of-means-in-population-please-help-tp4508089p4508089.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Work -Shift Scheduling - Constraint Linear Programming

2012-03-26 Thread Greg Snow
Running findFn('linear programming') from the sos package brings up
several possibilities that look promising.

On Sun, Mar 25, 2012 at 5:48 AM, agent dunham crossp...@hotmail.com wrote:
 Dear Community,

 I have a work-shift scheduling problem I'd like to solve via constraint
 linear programming.

 Maybe something similar to
 http://support.sas.com/documentation/cdl/en/orcpug/63349/HTML/default/viewer.htm#orcpug_clp_sect037.htm

 Can anybody suggest me any package/R examples to solve this?

 If more details of my little problem are needed, I can provide them.

 Thanks in advance, u...@host.com as u...@host.com

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Work-Shift-Scheduling-Constraint-Linear-Programming-tp4503037p4503037.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Compare similarit of two vector of not same length

2012-03-24 Thread Greg Snow
If you are trying to see if both vectors could be random samples from
the same population, then I would look at a qqplot (see ?qqplot), which
will compare them visually (if they are not the same length, the qqplot
function will use interpolation to compare them).  For a more formal
test you can use the ks.test function (it can also take vectors of
different lengths); just note that a non-significant result does not
mean that they are the same, and with big sample sizes the test can be
significant even though the differences are not practically meaningful.
Another option is to do the qqplot along with the vis.test function in
the TeachingDemos package; this lets you do a test based on the qqplot,
but also gives you a feel for the practical difference.
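A sketch with two made-up vectors whose lengths differ by about 10%:

```r
set.seed(1)
a <- rnorm(1000)
b <- rnorm(1100, mean = 0.05)  # ~10% longer, slightly shifted

qqplot(a, b)                   # interpolates to handle unequal lengths
abline(0, 1, lty = 2)          # points near this line suggest similarity

ks.test(a, b)                  # formal two-sample comparison
```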

On Sat, Mar 24, 2012 at 5:44 AM, Alaios ala...@yahoo.com wrote:
 Dear all,
 this is not strictly an R question. I have two vectors of different lengths
 (the difference is on the order of 10%). I am trying to see if one can still
 compare these two for similarity.

 If the vectors were of the same length I would just take the difference of
 the two and plot a pdf of it.

 One way I am thinking of is to find the longer vector and shorten it in some
 way to the length of the shorter one.


 Which are the mathematical formulations for this type of problem, and which
 of those does R support?

 Regards
 Alex

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to compute within-group mean and sd?

2012-03-24 Thread Greg Snow
In addition to Michael's answers, there are packages that allow you to
use SQL syntax on R data objects, so you could probably just use what
you are familiar with.
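One such package is sqldf, which can run the SELECT above almost verbatim (though sd may not be available in plain SQLite). In base R the same GROUP BY summary can be built with tapply; `df` below is a made-up stand-in for your table:

```r
## Made-up stand-in for the table in the question
df <- data.frame(firm_id = c("A", "A", "B", "B", "B"),
                 value   = c(1, 3, 2, 4, 6))

## GROUP BY firm_id: one tapply() per summary, bound into a data frame
summ <- data.frame(
  firm_id = levels(factor(df$firm_id)),
  n       = as.vector(tapply(df$value, df$firm_id, length)),
  mean    = as.vector(tapply(df$value, df$firm_id, mean)),
  sd      = as.vector(tapply(df$value, df$firm_id, sd)))
summ   # A: n=2, mean=2, sd=1.414...; B: n=3, mean=4, sd=2
```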

On Sat, Mar 24, 2012 at 9:32 AM, reeyarn reey...@gmail.com wrote:
 Hi, I want to run something like
  SELECT firm_id, count(*), mean(value), sd(value)
  FROM table
  GROUP BY firm_id;

 But I have to write a for loop like
  for ( id in unique(table$firm_id) ) {
    print(paste( id, mean(table[table$firm_id == id, "value"]) ))
  }

 Is there any way to do it easier? Thanks :)


 Best,
 Reeyarn Lee





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] show and produce PDF file with pdf() and dev.off( ) in function

2012-03-24 Thread Greg Snow
As others have said, you pretty much need to do the plot 2 times, but
if it takes more than one command to create the plot you can use the
dev.copy function to copy what you have just plotted into another
graphics device rather than reissuing all the commands again.
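For example (the file name is illustrative):

```r
## Build up a plot with several commands on the current device ...
x <- 1:20
plot(x, pch = 16)
abline(h = 10, lty = 2)

## ... then clone the finished plot into a pdf device instead of
## re-running every plotting command, and close it to write the file
dev.copy(pdf, file = "xplot.pdf")
dev.off()
```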

On Sat, Mar 24, 2012 at 9:43 AM, Uwe Ligges
lig...@statistik.tu-dortmund.de wrote:
 On 24.03.2012 13:11, Igor Sosa Mayor wrote:

 apart from the other answers, be aware that you have to 'print' the
 graph with

  pl <- plot(x)
  print(pl)



 Which is true for lattice functions but not for a base graphics plot().

 Uwe Ligges




 in case you're using lattice or ggplot2 plots.

 On Fri, Mar 23, 2012 at 02:40:04PM -0700, casperyc wrote:

 Hi all,

 I know how to use pdf() and dev.off() to produce and save a graph.

 However, when I put them in a function say

  myplot <- function(x=1:20){
    pdf("xplot.pdf")
    plot(x)
    dev.off()
  }

  the function works. But is there a way to show the graph in R as well
  as saving it to a file?

 Thanks.

 casper

 -
 ###
 PhD candidate in Statistics
 School of Mathematics, Statistics and Actuarial Science, University of
 Kent
 ###





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] How to write and analyze data with 3 dimensions

2012-03-23 Thread Greg Snow
You could put this data into a 3 dimensional array and then use the
apply function to apply a function (such as mean) over which ever
variables you choose.

Or you could put the data into a data frame in long format where you
have your 3 variable indices in 3 columns, then the data in a 4th
column.  Then use the tapply function to apply the mean (or other
function) to groups based on the indices of choice.

If you want to do fancier things in either case then look into the
reshape2 and plyr packages for ways of reshaping the data: taking the
data apart into pieces, applying a function to each piece, then putting
it all back together again.
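A toy sketch of both routes (all names are invented):

```r
## Toy data: one value for each combination of three indices (P, M, S)
set.seed(1)
dat <- expand.grid(P = 1:2, M = 1:3, S = 1:4)
dat$value <- rnorm(nrow(dat))

## Long-format route: tapply() averages over whatever indices you leave out
pm.means <- tapply(dat$value, list(dat$P, dat$M), mean)  # average over S
p.means  <- tapply(dat$value, dat$P, mean)               # average over M and S

## Array route: fill a 2 x 3 x 4 array (expand.grid varies the first
## index fastest, matching R's column-major fill), then apply() over
## the margins you want to KEEP
arr <- array(dat$value, dim = c(2, 3, 4))
pm.means2 <- apply(arr, c(1, 2), mean)   # same numbers as pm.means
```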

On Tue, Mar 20, 2012 at 11:16 AM, jorge Rogrigues hjm...@gmail.com wrote:
 Suppose I have data organized in the following way:
 (P_i, M_j, S_k)

 where i, j and k are indexes for sets.
 I would like to analyze the data to get for example the following
 information:
 what is the average over k for
 (P_i, M_j)
 or what is the average over j and k for P_i.

 My question is what would be the way of doing this in R.
 Specifically how should I write the data in a csv file
 and how do I read the data from the csv file into R and perform these basic
 operations.

 Thank you.





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] how to hide code of any function

2012-03-22 Thread Greg Snow
See the 'petals' function in the TeachingDemos package for one example
of hiding source from casual inspection (intermediate level R users
will still easily be able to figure out what the key code is, but will
not be able to claim that they stumbled across it by accident).

This post gives another possibility:
https://stat.ethz.ch/pipermail/r-devel/2011-October/062236.html



On Thu, Mar 15, 2012 at 6:53 AM, mrzung mrzun...@gmail.com wrote:
 hi

 I'm making some program and it needs to be hidden.

 It's not for a commercial purpose but an educational one,

 so I do want to hide the code of the function.

 for example,

 if i made following function:

 a <- function(x){
   y <- x^2
   print(y)
 }

 i do not want someone to type a and take the code of the function.

 is there anyone who can help me?






-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] help please. 2 tables, which test?

2012-03-13 Thread Greg Snow
For this case I would use a permutation test.  Start by choosing some
statistic that represents your 4 students across the different grades,
some possibilities would be the sum of scores across grades and
students, or mean, or median, or ...

Compute the selected statistic for your 4 students and save that
value.  Now select 4 students at random and compute the same
statistic, repeat this a bunch of times (thousands) and compute the
statistic each time.  All those stats on the random selections
represent the distribution of the statistic under the null hypothesis
that your 4 students were randomly chosen (vs. chosen based on
something that is related to the grade).  Now you just compare the
stat on the original 4 students to the distribution (if you need a
specific p-value it is just the proportion of the random stats that
are at least as extreme as your original 4).
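A sketch of that procedure on simulated data (the grade matrix and the choice of statistic are placeholders for your own):

```r
## Simulated stand-in: 30 students x 50 grades
set.seed(42)
grades <- matrix(rnorm(30 * 50, mean = 60, sd = 10), nrow = 30)
chosen <- c(2, 4, 5, 30)                      # the 4 flagged students

stat <- function(rows) sum(grades[rows, ])    # chosen statistic: sum of grades
obs  <- stat(chosen)

## Null distribution: the same statistic for 4 randomly chosen students
perm <- replicate(5000, stat(sample(30, 4)))

## Two-sided p-value: fraction of random draws at least as extreme
p <- mean(abs(perm - mean(perm)) >= abs(obs - mean(perm)))
p
```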

On Sat, Mar 10, 2012 at 4:04 AM, aoife doherty aaral.si...@gmail.com wrote:
 Thank you for the replies.
 So what my test wants to do is this:

 I have a big matrix, 30 rows (students in a class) X 50 columns (students
 grades for the year).
 An example of the matrix is as such:


                     grade1       grade2        grade3     .  grade 50
 student 1
 student 2***
 student 3
 student 4***
 student 5***
 student 6
 .
 .
 .
 .
 .
 student 30***

 As you can see, four students (students 2,4,5 and 30) have stars beside
 their name. I have chosen these students based on a particular
 characteristic that they all share. I then pulled these students out to make
 a new table:

                     grade1          grade2         grade3 ... grade 50

 student 2
 student 4
 student 5
 student 30


 and what I want to see is basically whether there is any difference between
 the grades this particular set of students (i.e. students 2, 4, 5 and 30)
 got and the class as a whole?

 So my null hypothesis is that there is no difference between this set of
 students grades, and what you would expect from the class as a whole.

 Aaral






 On Sat, Mar 10, 2012 at 12:18 AM, Greg Snow 538...@gmail.com wrote:

 Just what null hypothesis are you trying to test or what question are
 you trying to answer by comparing 2 matrices of different size?

 I think you need to figure out what your real question is before
 worrying about which test might work on it.

 Trying to get your data to fit a given test rather than finding the
 appropriate test or other procedure to answer your question is like
 buying a new suit then having plastic surgery to make you fit the suit
 rather than having the tailor modify the suit to fit you.

 If you can give us more information about what your question is we
 have a better chance of actually helping you.

 On Fri, Mar 9, 2012 at 9:46 AM, aoife doherty aaral.si...@gmail.com
 wrote:
 
  Thank you. Can the chi-squared test compare two matrices that are not
  the
  same size, eg if matrix 1 is a 2 X 4 table, and matrix 2 is a 3 X 5
  matrix?
 
 
 
  On Fri, Mar 9, 2012 at 4:37 PM, Greg Snow 538...@gmail.com wrote:
 
  The chi-squared test is one option (and seems reasonable to me if it
  is the proportions/patterns that you want to test).  One way to do
  the test is to combine your 2 matrices into a 3 dimensional array (the
  abind package may help here) and test using the loglin function.
 
  On Thu, Mar 8, 2012 at 5:46 AM, aaral singh aaral.si...@gmail.com
  wrote:
   Hi.Please help if someone can.
  
   Problem:
   I have 2 matrices
  
   Eg
  
   matrix 1:
                  Freq  None  Some
    Heavy    3        2          5
    Never    8       13         8
    Occas    1        4          4
    Regul     9        5         7
  
   matrix 2:
                    Freq     None     Some
    Heavy        7          1             3
    Never      87         18          84
    Occas      12           3            4
    Regul        9            1            7
  
  
   I want to see if matrix 1 is significantly different from matrix 2. I
   consider using a chi-squared test. Is this appropriate?
   Could anyone advise?
   Many thank you.
   Aaral Singh
  
  
 
 
 
  --
  Gregory (Greg) L. Snow Ph.D.
  538...@gmail.com
 
 
 



 --
 Gregory (Greg) L. Snow Ph.D.
 538...@gmail.com

Re: [R] help please. 2 tables, which test?

2012-03-09 Thread Greg Snow
The chi-squared test is one option (and seems reasonable to me if it
is the proportions/patterns that you want to test).  One way to do
the test is to combine your 2 matrices into a 3 dimensional array (the
abind package may help here) and test using the loglin function.
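A sketch of that approach with the two tables from this thread (base R's array() stands in here for abind::abind, which does the same stacking):

```r
## The two 4 x 3 tables from the question
m1 <- matrix(c(3, 8, 1, 9,   2, 13, 4, 5,   5, 8, 4, 7), nrow = 4,
             dimnames = list(c("Heavy", "Never", "Occas", "Regul"),
                             c("Freq", "None", "Some")))
m2 <- matrix(c(7, 87, 12, 9,  1, 18, 3, 1,   3, 84, 4, 7), nrow = 4,
             dimnames = list(c("Heavy", "Never", "Occas", "Regul"),
                             c("Freq", "None", "Some")))

## Stack into a 4 x 3 x 2 array (abind::abind(m1, m2, along = 3) also works)
a <- array(c(m1, m2), dim = c(4, 3, 2))

## Fit the model where the row-by-column pattern is independent of which
## table a count came from; the likelihood-ratio statistic tests homogeneity
fit  <- loglin(a, margin = list(c(1, 2), 3), print = FALSE)
pval <- pchisq(fit$lrt, fit$df, lower.tail = FALSE)
pval
```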

On Thu, Mar 8, 2012 at 5:46 AM, aaral singh aaral.si...@gmail.com wrote:
 Hi. Please help if someone can.

 Problem:
 I have 2 matrices

 Eg

 matrix 1:
                Freq  None  Some
  Heavy    3        2          5
  Never    8       13         8
  Occas    1        4          4
  Regul     9        5         7

 matrix 2:
                  Freq     None     Some
  Heavy        7          1             3
  Never      87         18          84
  Occas      12           3            4
  Regul        9            1            7


 I want to see if matrix 1 is significantly different from matrix 2. I
 consider using a chi-squared test. Is this appropriate?
 Could anyone advise?
 Many thanks.
 Aaral Singh





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] xyplot without external box

2012-03-09 Thread Greg Snow
Why do you want to do this?  Lattice was not really designed to put
just part of the graph up, but rather to create the entire graph using
one command.

If you want to show a process, putting up part of a graph at a time,
it may be better to create the whole graph as a vector graphics file
(pdf, postscript, svg, pgf, emf, etc.) then use an external program to
remove those parts that you don't want for a given step.

On Thu, Mar 8, 2012 at 6:02 AM, Mauricio Zambrano-Bigiarini
hzambran.newsgro...@gmail.com wrote:
 Dear list members,

 Within a loop, I need to create an xyplot showing only a legend, without
 even the default external box drawn by lattice.

 I already managed to remove the axis labels and tick marks, but I
 couldn't find in the documentation of xyplot how to remove the
 external box.

 I would really appreciate any help with this


 - START ---
 library(lattice)

 x <- 1:100
 cuts <- unique( quantile( as.numeric(x),
                           probs=c(0, 0.25, 0.5, 0.75, 0.9, 0.95, 1),
 na.rm=TRUE) )

 gof.levels <- cut(x, cuts)
 nlevels <- length(levels(gof.levels))

 xyplot(1~1, groups=gof.levels,  type="n", xlab="", ylab="",
          scales=list(draw=FALSE),
          key = list(x = .5, y = .5, corner = c(0.5, 0.5),
                 title="legend",
                 points = list(pch=16, col=c(2,4,3), cex=1.5),
                 text = list(levels(gof.levels))
                         )
      )

 -  END  ---

 Thanks in advance,

 Mauricio Zambrano-Bigiarini

 --
 
 FLOODS Action
 Water Resources Unit (H01)
 Institute for Environment and Sustainability (IES)
 European Commission, Joint Research Centre (JRC)
 webinfo    : http://floods.jrc.ec.europa.eu/
 
 DISCLAIMER:
 The views expressed are purely those of the writer
 and may not in any circumstances be regarded as stating
 an official position of the European Commission.
 
 Linux user #454569 -- Ubuntu user #17469
 
 There is only one pretty child in the world,
 and every mother has it.
 (Chinese Proverb)
 
 http://c2.com/cgi/wiki?HowToAskQuestionsTheSmartWay




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] How to sort frequency distribution table?

2012-03-09 Thread Greg Snow
R tends to see the ordering of factor levels as a property of the data
rather than a property of the table/graph.  So it is generally best to
modify the data object (factor) to represent what you want rather than
look for an option in the table/plot function (this will also be more
efficient in the long run).  Here is a simple example using the
reorder function:

 > tmp <- factor(sample( letters[1:5], 100, TRUE ))
 > table(tmp)
tmp
 a  b  c  d  e
20 20 19 18 23
 > tmp2 <- reorder(tmp, rep(1,length(tmp)), sum)
 > table(tmp2)
tmp2
 d  c  a  b  e
18 19 20 20 23
 > tmp2 <- reorder(tmp, rep(-1,length(tmp)), sum)
 > table(tmp2)
tmp2
 e  a  b  c  d
23 20 20 19 18
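For a one-off table and plot, you can also sort the table object itself and pass it straight to barplot (the sample data below are invented stand-ins for the disease labels):

```r
## Invented stand-ins for the disease labels
set.seed(7)
disease <- sample(c("Adiposity", "AIDS", "Adiponectin levels",
                    "Age-related macular degeneration"),
                  40, replace = TRUE, prob = c(4, 3, 2, 1))

tab <- sort(table(disease), decreasing = TRUE)   # highest frequency first
tab

## Bar chart with the category names as bar labels (las = 2 turns long
## labels sideways; strictly a bar chart, not a histogram, since the
## data are categorical)
barplot(tab, las = 2, cex.names = 0.7)
```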


On Wed, Mar 7, 2012 at 9:46 PM, Manish Gupta mandecent.gu...@gmail.com wrote:
 Hi,

 I am working on categorical data with column as disease name(categaory).

 My input data is
  [1] Acute lymphoblastic leukemia (childhood)
  [2] Adiponectin levels
  [3] Adiponectin levels
  [4] Adiponectin levels
  [5] Adiponectin levels
  [6] Adiponectin levels
  [7] Adiposity
  [8] Adiposity
  [9] Adiposity
 [10] Adiposity
 [11] Age-related macular degeneration
 [12] Age-related macular degeneration
 [13] Aging (time to death)
 [14] Aging (time to event)
 [15] Aging (time to event)
 [16] Aging (time to event)
 [17] Aging (time to event)
 [18] AIDS
 [19] AIDS
 [20] AIDS
 .


 When I use the table command, I get

                                            [,1]
  Acute lymphoblastic leukemia (childhood)     1
  Adiponectin levels                           5
  Adiposity                                    4
  Age-related macular degeneration             2
  Aging (time to death)                        1
  ..

 But I need to sort this table by frequency and plot a histogram with the
 first-column labels (e.g. Adiposity, Age-related macular degeneration) as
 the bar names. How can I do it?

 Regards





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] help please. 2 tables, which test?

2012-03-09 Thread Greg Snow
Just what null hypothesis are you trying to test or what question are
you trying to answer by comparing 2 matrices of different size?

I think you need to figure out what your real question is before
worrying about which test might work on it.

Trying to get your data to fit a given test rather than finding the
appropriate test or other procedure to answer your question is like
buying a new suit then having plastic surgery to make you fit the suit
rather than having the tailor modify the suit to fit you.

If you can give us more information about what your question is we
have a better chance of actually helping you.

On Fri, Mar 9, 2012 at 9:46 AM, aoife doherty aaral.si...@gmail.com wrote:

 Thank you. Can the chi-squared test compare two matrices that are not the
 same size, eg if matrix 1 is a 2 X 4 table, and matrix 2 is a 3 X 5 matrix?



 On Fri, Mar 9, 2012 at 4:37 PM, Greg Snow 538...@gmail.com wrote:

 The chi-squared test is one option (and seems reasonable to me if it
 is the proportions/patterns that you want to test).  One way to do
 the test is to combine your 2 matrices into a 3 dimensional array (the
 abind package may help here) and test using the loglin function.

 On Thu, Mar 8, 2012 at 5:46 AM, aaral singh aaral.si...@gmail.com wrote:
  Hi.Please help if someone can.
 
  Problem:
  I have 2 matrices
 
  Eg
 
  matrix 1:
                 Freq  None  Some
   Heavy    3        2          5
   Never    8       13         8
   Occas    1        4          4
   Regul     9        5         7
 
  matrix 2:
                   Freq     None     Some
   Heavy        7          1             3
   Never      87         18          84
   Occas      12           3            4
   Regul        9            1            7
 
 
  I want to see if matrix 1 is significantly different from matrix 2. I
  consider using a chi-squared test. Is this appropriate?
  Could anyone advise?
  Many thank you.
  Aaral Singh
 
 



 --
 Gregory (Greg) L. Snow Ph.D.
 538...@gmail.com






-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] gsub: replacing double backslashes with single backslash

2012-03-07 Thread Greg Snow
The issue here is the difference between what is contained in a string
and what R displays to you.

The string produced with the code:

 tmp <- "C:\\"

only has 3 characters (as David pointed out), the third of which is a
single backslash: the 1st \ escapes the 2nd, and the R string-parsing
rules use the combination to put a single backslash in the string.
When you print the string (whether you call print directly or
indirectly) the print function escapes special characters, including
the backslash, so you see \\, which represents a single backslash in
the string.  If you use the cat function instead of the print
function, then you will only see a single backslash (and other escape
sequences such as \n will also display differently in print vs. cat
output).  There are other ways to see the exact string (write it to a
file, use it in certain commands, etc.) but cat is probably the simplest.
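A quick illustration:

```r
tmp <- "C:\\"     # the escaped \\ in the source puts ONE backslash in the string

nchar(tmp)        # [1] 3  ('C', ':', and one backslash)
print(tmp)        # [1] "C:\\"  - print() re-escapes the backslash
cat(tmp, "\n")    # C:\        - cat() shows the actual contents
```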

On Wed, Mar 7, 2012 at 7:57 AM, David Winsemius dwinsem...@comcast.net wrote:

 On Mar 7, 2012, at 6:54 AM, Markus Elze wrote:

 Hello everybody,
 this might be a trivial question, but I have been unable to find this
 using Google. I am trying to replace double backslashes with single
 backslashes using gsub.


 Actually you don't have double backslashes in the argument you are
 presenting to gsub. The string entered at the console as C:\\ only has a
 single backslash.

  nchar("C:\\")
  [1] 3


 There seems to be some unexpected behaviour with regards to the
 replacement string \\. The following example uses the string C:\\ which
 should be converted to C:\ .

  gsub(, \\, C:\\)
 [1] C:


 But I do not understand that returned value, either. I thought that the
 'repl' argument (which I think I have demonstrated is a single backslash)
 would get put back in the returned value.



  gsub(, Test, C:\\)
 [1] C:Test
  gsub(, , C:\\)
 [1] C:\\


 I thought the parsing rules for 'replacement' were different than the rules
 for 'patt'. So I'm puzzled, too. Maybe something changed in 2.14?

 sub(, \\, C:\\, fixed=TRUE)
 [1] C:\\

 sub(, \\, C:\\)
 [1] C:
 sub(([\\]), \\1, C:\\)
 [1] C:\\

 The NEWS file does say that there is a new regular expression implementation
 and that the help file for regex should be consulted.

 And presumably we should study this:

 http://laurikari.net/tre/documentation/regex-syntax/

  In the 'replacement' argument, the \\ is used to back-reference a
 numbered sub-pattern, so perhaps \\ is now getting handled as the null
 subpattern? I don't see that mentioned in the regex help page, but it is a
 big page. I also didn't see \\ referenced in the TRE documentation, but
 then again I don't think that \\ in console or source() input is a double
 backslash. The TRE document says that A \ cannot be the last character of
 an ERE. I cannot tell whether that rule gets applied to the 'replacement'.




 I have observed similar behaviour for fixed=TRUE and perl=TRUE. I use R
 2.14.1 64-bit on Windows 7.



 --
 David Winsemius, MD
 West Hartford, CT





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] How to read this data properly?

2012-03-03 Thread Greg Snow
Using the readlines function on your dat string gives the error
because it is looking for a file named 2 3 ... which it is not
finding.  more likely what you want is to create a text connection
(see ?textConnection) to your string, then use scan or read.table on
that connection.
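A sketch (the surrounding quotes on Dat are an assumption, since the archived post lost its quote characters, as is the 5-column layout; only the first two lines of the data are used):

```r
## The data as a quoted string (quoting is assumed)
Dat <- "2 3 28.3 3.05 8 3 3 22.5 1.55 0 1 1 26.0 2.30 9 3 3 24.8 2.10 0
3 3 26.0 2.60 4 2 3 23.8 2.10 0 3 2 24.7 1.90 0 2 1 23.7 1.95 0"

## scan() pulls the numbers off a text connection; matrix() reshapes them,
## assuming 5 values per observation (byrow fills one row at a time)
vals <- scan(textConnection(Dat), quiet = TRUE)
m <- matrix(vals, ncol = 5, byrow = TRUE)
dim(m)   # [1] 8 5
```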

On Sat, Mar 3, 2012 at 8:15 AM, Bogaso Christofer
bogaso.christo...@gmail.com wrote:
 Dear all, I have been given a data something like below:



 Dat = "2 3 28.3 3.05 8 3 3 22.5 1.55 0 1 1 26.0 2.30 9 3 3 24.8 2.10 0

 3 3 26.0 2.60 4 2 3 23.8 2.10 0 3 2 24.7 1.90 0 2 1 23.7 1.95 0

 3 3 25.6 2.15 0 3 3 24.3 2.15 0 2 3 25.8 2.65 0 2 3 28.2 3.05 11

 4 2 21.0 1.85 0 2 1 26.0 2.30 14 1 1 27.1 2.95 8 2 3 25.2 2.00 1

 2 3 29.0 3.00 1 4 3 24.7 2.20 0 2 3 27.4 2.70 5 2 2 23.2 1.95 4"





 I want to create a matrix out of those data for my further calculations. I
 have tried with readLines() but got an error:



 readLines(Dat)

 Error in file(con, "r") : cannot open the connection

 In addition: Warning message:

 In file(con, "r") :

  cannot open file '2 3 28.3 3.05 8 3 3 22.5 1.55 0 1 1 26.0 2.30 9 3 3 24.8
 2.10 0

 3 3 26.0 2.60 4 2 3 23.8 2.10 0 3 2 24.7 1.90 0 2 1 23.7 1.95 0

 3 3 25.6 2.15 0 3 3 24.3 2.15 0 2 3 25.8 2.65 0 2 3 28.2 3.05 11

 4 2 21.0 1.85 0 2 1 26.0 2.30 14 1 1 27.1 2.95 8 2 3 25.2 2.00 1

 2 3 29.0 3.00 1 4 3 24.7 2.20 0 2 3 27.4 2.70 5 2 2 23.2 1.95 4': No such
 file or directory





 Can somebody help to put that data in some workable format?



 Thanks and regards,






-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] contour for plotting confidence interval on scatter plot of bivariate normal distribution

2012-03-03 Thread Greg Snow
Look at the ellipse package (and the ellipse function in the package)
for a simple way of showing a confidence region for bivariate data on
a plot (a 68% confidence interval is about 1 SD if you just want to
show 1 SD).
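A sketch, assuming the ellipse package is installed (the data mirror the simulation later in this thread):

```r
## Reproduce the simulated data from this thread
set.seed(138813)
n <- 100
x <- rnorm(n); y <- rnorm(n)
plot(x, y)

library(ellipse)  # assumed installed: install.packages("ellipse")

## ellipse() returns boundary points of a normal-theory confidence region
## for the given covariance matrix; level = 0.68 approximates the 1-SD region
lines(ellipse(cov(cbind(x, y)),
              centre = c(mean(x), mean(y)),
              level  = 0.68),
      col = "red")
```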

On Sat, Mar 3, 2012 at 7:54 AM, drflxms drfl...@googlemail.com wrote:
 Dear all,

 I created a bivariate normal distribution:

 set.seed(138813)
 n <- 100
 x <- rnorm(n); y <- rnorm(n)

 and plotted a scatterplot of it:

 plot(x,y)

 Now I'd like to add the 2D-standard deviation.

 I found a thread regarding plotting arbitrary confidence boundaries from
 Pascal Hänggi
 http://www.mail-archive.com/r-help@r-project.org/msg24013.html
 which cites the even older thread
 http://tolstoy.newcastle.edu.au/R/help/03b/5384.html

 As I am unfortunately only a very poor R programmer, the code of Pascal
 Hänggi is a mystery to me and I am not sure whether I was able to translate
 the recommendation of Brian Ripley in the later thread (which provides
 no code) into the correct R code. Brian wrote:

 You need a 2D density estimate (e.g. kde2d in MASS) then compute the
 density values at the points and draw the contour of the density which
 includes 95% of the points (at a level computed from the sorted values
 via quantile()). [95% confidence interval was desired in thread instead
 of standard deviation...]

 So I tried this...

 den <- kde2d(x, y, n=n) #as I chose n to be the same as during creating
 the distributions x and y (see above), a z-value is assigned to every
 combination of x and y.

 # create a sorted vector of z-values (instead of the matrix stored
 # inside the den object)
 den.z <- sort(den$z)

 # set desired confidence border to draw and store it in a variable
 confidence.border <- quantile(den.z, probs=0.6827, na.rm = TRUE)

 # draw a line representing confidence.border on the existing scatterplot
 par(new=TRUE)
 contour(den, levels=confidence.border, col = "red", add = TRUE)

 Unfortunately I doubt very much this is correct :( In fact I am sure
 this is wrong, because the border for probs=0.05 is drawn outside the
 values. So please help and check.
 Pascal Hänggi's code seems to work, but I don't understand the magic he
 does with

 pp <- array()
 for (i in 1:1000){
        z.x <- max(which(den$x < x[i]))
        z.y <- max(which(den$y < y[i]))
        pp[i] <- den$z[z.x, z.y]
 }

 before doing the very same as I did above:

 confidencebound <- quantile(pp, 0.05, na.rm = TRUE)

 plot(x, y)
 contour(den, levels = confidencebound, col = "red", add = TRUE)


 My problems:

 1.) setting probs=0.6827 is somehow a dirty trick which I can only use
 by simply knowing that this is the percentage of values inside +-1sd
 when a distribution is normal. Is there a way of doing this with the
 native sd function?
 sd(den.z) is not correct, as den.z is in contrast to x and y not normal
 any more. So ecdf(den.z)(sd(den.z)) results in a percentile of 0.5644 in
 this example instead of the desired 0.6827.

 2.) I would like to have code that works with any desired confidence.
 Unfortunately setting probs to the desired confidence would probably be
 wrong (?!) as it relates to den.z instead of x and y, which are the
 underlying distributions I am interested in. To put it short I want the
 confidence of x/y and not of den.z.


 I am really completely stuck. Please help me out of this! Felix






-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Cleaning up messy Excel data

2012-03-03 Thread Greg Snow
Sometimes we adapt to our environment, sometimes we adapt our
environment to us. I like fortune(108).

I actually was suggesting that you add a tool to your toolbox, not limit it.

In my experience (and I don't expect everyone else's to match) data
manipulation that seems easier in Excel than R is only easier until
the client comes back and wants me to redo the whole analysis with one
typo fixed.  Then rerunning the script in R (or Perl or other tool) is
a lot easier than trying to remember where all I clicked, dragged,
selected, etc.

I do use Excel for some things (though I would be happy to find other
tools for that if it were possible to expunge Excel from the earth)
and Word (I actually like using R2wd to send tables and graphs to Word
that I can then give to clients who just want to be able to copy and
paste them into something else); I just think that many of the tasks
that many people use Excel for would be better served with a better
tool.

If someone reading this decides to put more thought into a project up
front and actually design a database rather than letting it evolve
into some monstrosity in Excel, and that decision saves them some
later grief, then the world will be a little better place.

On Fri, Mar 2, 2012 at 6:04 PM, jim holtman jholt...@gmail.com wrote:
 Unfortunately they only know how to use Excel and Word.  They are not
 folks who use a computer every day.  Many of them run factories or
 warehouses and asking them to use something like Access would not
 happen in my lifetime (I have retired twice already).

 I don't have any problems with them messing up the data that I send
 them; they are pretty good about making changes within the context of
 the spreadsheet.  The other issue is that I am working with people in
 twenty different locations spread across the US, so I might be able to
 get one of them to use Access (there is one I know that uses it), but
 that leaves 19 other people I would not be able to communicate with.

 The other thing is that I use Excel myself to slice/dice data,
 since there are things that are easier in Excel than in R (believe it
 or not).  There are a number of tools I keep in my toolkit, and R is
 probably the most important, but I have not thrown the rest of them
 away since they still serve a purpose.

 So if you can come up with a way to get 20 diverse groups, who are not
 computer literate, to change over in a couple of days from Excel to
 Access, let me know.  BTW, I tried to use Access once and gave it up
 because it was not as intuitive as some other tools and did not give
 me any more capability than the ones I was using.  So I know I would
 have a problem convincing others to make the change just so they
 could communicate with me, while they still had to use Excel for most
 of their other interfaces.

 This is the real world where you have to learn how to adapt to your
 environment and make the best of it.  So you just have to learn that
 Excel can be your friend (or at least not your enemy) and can serve a
 very useful purpose in getting your ideas across to other people.

 On Fri, Mar 2, 2012 at 6:41 PM, Greg Snow 538...@gmail.com wrote:
 Try sending your clients a data set (data frame, table, etc.) as an MS
 Access data table instead.  They can still view the data as a table,
 but will have to go to much more effort to mess up the data; more
 likely they will do proper edits without messing anything up (mixing
 characters in with numbers, having more sexes than your biology teacher
 told you about, adding extra lines at the top or bottom that make
 reading back into R more difficult, etc.).

 I have had a few clients that I talked into using MS Access from the
 start to enter their data.  There was often a bit of resistance at
 first, but once they tried it and went through the process of
 designing the database up front, they ended up thanking me and believed
 that the entire data entry process was easier and quicker than if they
 had used Excel as they originally planned.

 Access is still part of MS Office, so they don't need to learn R or in
 any way break their chains from being prisoners of Bill, but they will
 be more productive in more ways than just interfacing with you.

 Access (and databases in general) forces you to plan things out and do
 the correct thing from the start.  It is possible to do the right thing
 in Excel, but Excel does not encourage (let alone force) you to do the
 right thing; it makes it easy to do the wrong thing.

 On Thu, Mar 1, 2012 at 6:15 AM, jim holtman jholt...@gmail.com wrote:
 But there are some important reasons to use Excel.  In my work there
 are a lot of people that I have to send the equivalent of a data.frame
 to who want to look at the data and possibly slice/dice the data
 differently and then send back to me updates.  These folks do not know
 how to use R, but do have Microsoft Office installed on their
 computers and know how to use the different products.

 I have been very successful in conveying what

Re: [R] Shape manipulation

2012-03-03 Thread Greg Snow
A general solution if you always want 2 columns and the pattern is
always every other column (but the number of total columns could
change) would be:

cbind(  c(Dat[,c(TRUE,FALSE)]), c(Dat[,c(FALSE,TRUE)]) )
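A quick sanity check of that recycling trick, using the matrix from the question below (the logical index c(TRUE, FALSE) recycles across the columns, picking columns 1, 3, 5):

```r
Dat <- matrix(1:30, 5, 6)
colnames(Dat) <- rep(c("Name1", "Names2"), 3)

# odd-numbered columns stacked into column 1, even-numbered into column 2
res <- cbind(c(Dat[, c(TRUE, FALSE)]), c(Dat[, c(FALSE, TRUE)]))
dim(res)  # 15 rows, 2 columns
```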



On Sat, Mar 3, 2012 at 11:40 AM, David Winsemius dwinsem...@comcast.net wrote:

 On Mar 3, 2012, at 11:02 AM, Bogaso Christofer wrote:

 Hi all, let say I have following matrix:



 Dat <- matrix(1:30, 5, 6); colnames(Dat) <- rep(c("Name1", "Names2"), 3)


 Dat


       Name1 Names2 Name1 Names2 Name1 Names2
  [1,]     1      6    11     16    21     26
  [2,]     2      7    12     17    22     27
  [3,]     3      8    13     18    23     28
  [4,]     4      9    14     19    24     29
  [5,]     5     10    15     20    25     30



 From this matrix, I want to create another matrix with 2 columns, for
 Name1 and Name2. Therefore, my final matrix will have 2 columns and
 15 rows. Is there any direct R function to achieve this?


 rbind(Dat[,1:2], Dat[,3:4], Dat[,5:6])



 Bogaso;

 It is really long past due for you to learn how to send plain text messages
 from your mailer.

 --

 David Winsemius, MD
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] contour for plotting confidence interval on scatter plot of bivariate normal distribution

2012-03-03 Thread Greg Snow
The key part of the ellipse function is:

matrix(c(t * scale[1] * cos(a + d/2) + centre[1],
         t * scale[2] * cos(a - d/2) + centre[2]),
       npoints, 2, dimnames = list(NULL, names))

Where (if I did not miss anything) the variable 't' is derived from a
chi-squared distribution and the confidence level, scale[1] and scale[2]
are the standard deviations of the 2 variables, d is the eccentricity
based on the correlation, and a is just a sequence from 0 to 2*pi.  So
if you set 't' to 1 instead of deriving it from the confidence level,
you would get a 1 SD ellipse, in the sense that any 1-dimensional slice
through the mean point would cut the ellipse at 1 SD from the mean.
You could then change t to 2 for the 2 SD curve, etc.
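That parametric form is easy to reproduce in base R. A sketch (sd_ellipse is a made-up name; the acos(r) step mirrors how the ellipse package derives the eccentricity angle d from the correlation r):

```r
# t is the SD multiple: t = 1 traces the 1-SD ellipse, t = 2 the 2-SD one
sd_ellipse <- function(centre, scale, r = 0, t = 1, npoints = 100) {
  a <- seq(0, 2 * pi, length.out = npoints)  # angle sequence around the curve
  d <- acos(r)                               # eccentricity angle from correlation
  cbind(x = t * scale[1] * cos(a + d / 2) + centre[1],
        y = t * scale[2] * cos(a - d / 2) + centre[2])
}

# with r = 0 and unit scales, t = 1 is just the unit circle
e1 <- sd_ellipse(centre = c(0, 0), scale = c(1, 1))
```

The curve can be added to a scatter plot with lines(e1), and lines(sd_ellipse(..., t = 2)) gives the 2 SD curve.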




On Sat, Mar 3, 2012 at 12:25 PM, drflxms drfl...@googlemail.com wrote:
 Thank you very much for your thoughts!

 Exactly what you mention is what I have been thinking about for the last
 few hours: what is the relation between the den$z distribution and the z
 distribution?
 That's why I asked about ecdf(distribution)(value) percentiles earlier
 today (thank you again for your quick and insightful answer on
 that!). I used it to compare certain values in both distributions by
 their percentile.

 I really think you are completely right: I urgently need some lessons in
 bivariate/multivariate normal distributions. (I am a neurologist and
 unfortunately did not learn too much about statistics in university :-()
 I'll take your statement as a starter:

 Once you go into two dimensions, SD loses all meaning, and adding
 nonparametric density estimation into the mix doesn't help, so just stop
 thinking in those terms!

 This makes me really think a lot! Is plotting the 0.68 confidence
 interval in 2D as an equivalent of +-1 SD really nonsense!?

 By the way: it all started very harmlessly. I was asked to draw an
 example of the well-known target analogy for accuracy and precision
 based on real (= simulated) data (see e.g.
 http://en.wikipedia.org/wiki/Accuracy_and_precision for a simple
 hand-made 2D graphic).

 Well, I did by

 set.seed(138813)
 x <- rnorm(n); y <- rnorm(n)
 plot(x,y)

 I was asked whether it might be possible to add a histogram with
 superimposed normal curve to the drawing: no problem. And where is the
 standard deviation, well abline(v=sd(... OK.

 Then I realized that this is of course only true for one of the
 distributions (x), and only in one slice of the scatterplot of x and y.
 The real thing is a 3D density map above the scatterplot. A very nice
 example of this is demo(bivar) in the rgl package (for a picture see
 e.g. http://rgl.neoscientists.org/gallery.shtml, right upper corner).

 Great! But how to correctly draw the standard deviation boundaries for
 the shots on the target (the scatterplot of x and y)...

 I'd be grateful for hints on what to read on that matter (book, website
 etc.)

 Greetings from Munich, Felix.


 Am 03.03.12 19:22, schrieb peter dalgaard:

 On Mar 3, 2012, at 17:01 , drflxms wrote:

 # this is the critical block, which I still do not comprehend in detail
 z <- array()
 for (i in 1:n){
        z.x <- max(which(den$x < x[i]))
        z.y <- max(which(den$y < y[i]))
        z[i] <- den$z[z.x, z.y]
 }

 As far as I can tell, the point is to get at density values corresponding to 
 the values of (x,y) that you actually have in your sample, as opposed to 
 den$z which is for an extended grid of all possible (x_i, y_j) combinations.

 It's unclear to me what happens if you look at quantiles for the entire 
 den$z. I kind of suspect that it is some sort of approximate numerical 
 integration, but maybe not of the right thing

 Re SD: Once you go into two dimensions, SD loses all meaning, and adding 
 nonparametric density estimation into the mix doesn't help, so just stop 
 thinking in those terms!





-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111



Re: [R] contour for plotting confidence interval on scatter plot of bivariate normal distribution

2012-03-03 Thread Greg Snow
To further explain: if you want contours of a bivariate normal, then
you want ellipses.  The density for a bivariate normal (with 0
correlation to keep things simple, but the theory extends to
correlated cases) is proportional to exp(-1/2 (x1^2/v1 + x2^2/v2)),
so a contour of the distribution will be all points such that x1^2/v1
+ x2^2/v2 = c for some constant c (each c will give a different
contour); but that is the definition of an ellipse (well, divide both
sides by c so that the right side is 1 to get the canonical form).
The ellipse function in the ellipse package chooses c from the
chi-squared distribution, since if x1 and x2 are normally distributed
with mean 0 (or have the mean subtracted), then x1^2/v1 + x2^2/v2 is
chi-squared distributed with 2 degrees of freedom.

So if you really want to you can try to approximate the contours in
some other way, but any decent approach will just converge to the
ellipse.
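A worked sketch of the constant c for a given coverage (the numbers here are illustrative, not from the thread):

```r
coverage <- 0.6827                 # roughly the 1D +/- 1 SD coverage
cc <- qchisq(coverage, df = 2)     # contour constant: x1^2/v1 + x2^2/v2 = cc
# for a standard bivariate normal (v1 = v2 = 1, rho = 0) this contour is a
# circle of radius sqrt(cc) centred at the origin
radius <- sqrt(cc)
```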

On Sat, Mar 3, 2012 at 1:26 PM, drflxms drfl...@googlemail.com wrote:
 Wow, David,

 thank you for these sources, which I just screened. bagplot looks most
 promising to me. I found it in the package ‘aplpack’ as well as in the R
 Graph Gallery
 http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=112

 Ellipses are not exactly what I am heading for. I am looking for a 2D
 equivalent of plotting 1D standard deviation boundaries. In other words
 how to plot SD boundaries on a 2D scatter plot.
 So I started searching for contour/boundaries etc. instead of ellipse
 leading me to Pascal Hänggi:
 http://www.mail-archive.com/r-help@r-project.org/msg24013.html

 To describe it in an image: I want to cut the density mountain above the
 scatter plot (see demo(bivar) in the rgl package) in such a way that the
 part of the mountain that covers 68% of the data on the x-y-plane below
 it (+-1 SD) is removed. Then I'd like to project the edge that results
 from the cut onto the x-y-plane below the mountain. This should be the
 2D equivalent of 1D SD boundaries.

 I think this might be achieved as well by Hänggi's code as by
 Forester's function. Unfortunately they result in slightly different
 boundaries, which shouldn't be the case. And I have not figured out
 which one is correct, if either is correct at all (!?).

 Can anyone explain the difference?

 I compared them with this code:

 # parameters:
 n <- 100

 # generate samples:
 set.seed(138813)
 x <- rnorm(n); y <- rnorm(n)
 a <- list(x=x, y=y) # input for Forester's function, which is appended at
 the very end

 # estimate non-parameteric density surface via kernel smoothing
 library(MASS)
 den <- kde2d(x, y, n=n)

 z <- array()
 for (i in 1:n){
        z.x <- max(which(den$x < x[i]))
        z.y <- max(which(den$y < y[i]))
        z[i] <- den$z[z.x, z.y]
 }

 # store class/level borders of confidence interval in variables
 confidence.border <- quantile(z, probs=0.05, na.rm = TRUE) # 0.05
 corresponds to 0.95 in draw.contour

 plot(x,y)
 draw.contour(a, alpha=0.95)
 par(new=TRUE)
 contour(den, levels=confidence.border, col = "red", add = TRUE)


 ###
 ## drawcontour.R
 ## Written by J.D. Forester, 17 March 2008
 ###

 ## This function draws an approximate density contour based on
 empirical, bivariate data.

 ##change testit to FALSE if sourcing the file
 testit=TRUE

 draw.contour <- function(a, alpha=0.95, plot.dens=FALSE, line.width=2,
 line.type=1, limits=NULL, density.res=800, spline.smooth=-1, ...){
  ## a is a list or matrix of x and y coordinates
  ## (e.g., a=list(x=rnorm(100), y=rnorm(100)))
  ## if a is a list or dataframe, the components must be labeled x and y
  ## if a is a matrix, the first column is assumed to be x, the second y
  ## alpha is the contour level desired
  ## if plot.dens==TRUE, then the joint density of x and y are plotted,
  ## otherwise the contour is added to the current plot.
  ## density.res controls the resolution of the density plot

  ## A key assumption of this function is that very little probability
  ## mass lies outside the limits of the x and y values in a. This is
  ## likely reasonable if the number of observations in a is large.

  require(MASS)
  require(ks)
  if(length(line.width)!=length(alpha)){
     line.width <- rep(line.width[1], length(alpha))
  }

  if(length(line.type)!=length(alpha)){
     line.type <- rep(line.type[1], length(alpha))
  }

  if(is.matrix(a)){
    a=list(x=a[,1],y=a[,2])
  }
  ##generate approximate density values
  if(is.null(limits)){
    limits=c(range(a$x),range(a$y))
  }
   f1 <- kde2d(a$x, a$y, n=density.res, lims=limits)

  ##plot empirical density
  if(plot.dens) image(f1,...)

  if(is.null(dev.list())){
    ##ensure that there is a window in which to draw the contour
     plot(a, type="n", xlab="X", ylab="Y")
  }

  ##estimate critical contour value
  ## assume that density outside of plot is very small

   zdens <- rev(sort(f1$z))
   Czdens <- cumsum(zdens)
   Czdens <- (Czdens/Czdens[length(zdens)])
  for(cont.level in 1:length(alpha)){
    ##This 

Re: [R] Cleaning up messy Excel data

2012-03-02 Thread Greg Snow
Try sending your clients a data set (data frame, table, etc.) as an MS
Access data table instead.  They can still view the data as a table,
but will have to go to much more effort to mess up the data; more
likely they will do proper edits without messing anything up (mixing
characters in with numbers, having more sexes than your biology teacher
told you about, adding extra lines at the top or bottom that make
reading back into R more difficult, etc.).

I have had a few clients that I talked into using MS Access from the
start to enter their data.  There was often a bit of resistance at
first, but once they tried it and went through the process of
designing the database up front, they ended up thanking me and believed
that the entire data entry process was easier and quicker than if they
had used Excel as they originally planned.

Access is still part of MS Office, so they don't need to learn R or in
any way break their chains from being prisoners of Bill, but they will
be more productive in more ways than just interfacing with you.

Access (and databases in general) forces you to plan things out and do
the correct thing from the start.  It is possible to do the right thing
in Excel, but Excel does not encourage (let alone force) you to do the
right thing; it makes it easy to do the wrong thing.

On Thu, Mar 1, 2012 at 6:15 AM, jim holtman jholt...@gmail.com wrote:
 But there are some important reasons to use Excel.  In my work there
 are a lot of people that I have to send the equivalent of a data.frame
 to who want to look at the data and possibly slice/dice the data
 differently and then send back to me updates.  These folks do not know
 how to use R, but do have Microsoft Office installed on their
 computers and know how to use the different products.

 I have been very successful in conveying what I am doing for them by
 communicating via Excel spreadsheets.  It is also an important medium
 in dealing with some international companies who provide data via
 Excel and expect responses back via Excel.

 When dealing with data in a tabular form, Excel does provide a way for
 a majority of the people I work with to understand the data.  Yes,
 there are problems with some of the ways that people use Excel, and
 yes I have had to invest time in scrubbing some of the data that I get
 from them, but if I did not, then I would probably not have a job
 working for them.  I use R exclusively for the analysis that I do, but
 find it convenient to use Excel to provide a communication mechanism
 to the majority of the non-R users that I have to deal with.  It is a
 convenient work-around because I would never get them to invest the
 time to learn R.

 So in the real world there is a need for Excel, and we are not going to
 make it go away; we have to learn how to live with it, and from my
 standpoint, it has definitely benefited me in being able to
 communicate with my users and continuing to provide them with results
 that they are happy with.  They refer to letting me work my magic on
 the data; all they know is they see the result via Excel and in the
 background R is doing the heavy lifting that they do not have to know
 about.

 On Wed, Feb 29, 2012 at 4:41 PM, Rolf Turner rolf.tur...@xtra.co.nz wrote:
 On 01/03/12 04:43, John Kane wrote:

 (mydata <- as.factor(c(1, 2, 3, 2, 5, 2)))
 str(mydata)

 newdata <- as.character(mydata)

 newdata[newdata == 2] <- 0
 newdata <- as.numeric(newdata)
 str(newdata)

 We really need to keep Excel (and other spreadsheets) out of people's
 hands.


 Amen, bro'!!!

    cheers,

        Rolf Turner




 --
 Jim Holtman
 Data Munger Guru

 What is the problem that you are trying to solve?
 Tell me what you want to do, not how you want to do it.




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] convert list to text file

2012-03-02 Thread Greg Snow
Or

lapply(LIST, cat, file='outtext.txt', append=TRUE)
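One caveat with cat: it flattens a matrix element and drops its dimnames, so the FID/Var rows in the example below would lose their layout. A sketch that preserves it, using write.table for matrix elements (the list contents are reconstructed from the question; the temp file is just for illustration):

```r
LIST <- list(500, 1, rbind(FID = 1:5, Var = c(2, 0, 2, 1, 1)))

out <- tempfile(fileext = ".txt")
con <- file(out, open = "wt")
for (el in LIST) {
  if (is.matrix(el)) {
    # row names become leading labels; no quoting, no header row
    write.table(el, con, quote = FALSE, col.names = FALSE)
  } else {
    cat(el, "\n", file = con, sep = "")
  }
}
close(con)
readLines(out)
```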

On Thu, Mar 1, 2012 at 6:20 AM, R. Michael Weylandt
michael.weyla...@gmail.com wrote:
 Perhaps something like

 sink("outtext.txt")
 lapply(LIST, print)
 sink()

 You could replace print with cat and friends if you wanted more
 detailed control over the look of the output.

 Michael

 On Thu, Mar 1, 2012 at 5:28 AM,  t.galesl...@ebh.umcn.nl wrote:
 Dear R users,

 Is it possible to write the following list to a text-file?

 List:

 [[1]]
 [1] 500

 [[2]]
 [1] 1

 [[3]]
    [,1] [,2] [,3] [,4] [,5]
 FID    1    2    3    4    5
 Var    2    0    2    1    1

 I would like to have the textfile look like this:

 500
 1
 FID 1 2 3 4 5
 Var 2 0 2 1 1

 Thank you very much in advance for your help!

 Kind regards,

 Tessel Galesloot
 Department of Epidemiology, Biostatistics and HTA (133)
 Radboud University Nijmegen Medical Centre



 The Radboud University Nijmegen Medical Centre is listed in the Commercial 
 Register of the Chamber of Commerce under file number 41055629.





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] fridays date to date

2012-03-02 Thread Greg Snow
If you know that your first date is a Friday, then you can use seq with
by = "7 days"; then you don't need to post-filter the vector.
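A sketch of that (2012-03-02 is the first Friday in Marc's example range below):

```r
fridays <- seq(from = as.Date("2012-03-02"),  # a known Friday
               to   = as.Date("2012-07-31"),
               by   = "7 days")
head(fridays)
```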

On Thu, Mar 1, 2012 at 1:40 PM, Ben quant ccqu...@gmail.com wrote:
 Great thanks!

 ben

 On Thu, Mar 1, 2012 at 1:30 PM, Marc Schwartz marc_schwa...@me.com wrote:

 On Mar 1, 2012, at 2:02 PM, Ben quant wrote:

  Hello,
 
  How do I get the dates of all Fridays between two dates?
 
  thanks,
 
  Ben


 Days <- seq(from = as.Date("2012-03-01"),
             to = as.Date("2012-07-31"),
             by = "day")

  str(Days)
  Date[1:153], format: "2012-03-01" "2012-03-02" "2012-03-03" "2012-03-04"
 ...

 # See ?weekdays

  Days[weekdays(Days) == "Friday"]
  [1] "2012-03-02" "2012-03-09" "2012-03-16" "2012-03-23" "2012-03-30"
  [6] "2012-04-06" "2012-04-13" "2012-04-20" "2012-04-27" "2012-05-04"
 [11] "2012-05-11" "2012-05-18" "2012-05-25" "2012-06-01" "2012-06-08"
 [16] "2012-06-15" "2012-06-22" "2012-06-29" "2012-07-06" "2012-07-13"
 [21] "2012-07-20" "2012-07-27"

 HTH,

 Marc Schwartz







-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Connecting points on a line with arcs/curves

2012-03-02 Thread Greg Snow
?xspline

On Thu, Mar 1, 2012 at 8:15 AM, hendersi ir...@cam.ac.uk wrote:

 Hello,

 I have a spreadsheet of pairs of coordinates and I would like to plot a line
 along which curves/arcs connect each pair of coordinates. The aim is to
 visualise the pattern of point connections.

 Thanks! Ian

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Connecting-points-on-a-line-with-arcs-curves-tp4435247p4435247.html
 Sent from the R help mailing list archive at Nabble.com.




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] problem with sum function

2012-03-02 Thread Greg Snow
Others explained why it happens, but you might want to look at the
zapsmall function for one way to deal with it.
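For example, with the sum from Mark's question:

```r
s <- sum(c(-0.2, 0.8, 0.8, -3.2, 1.8))
s            # tiny but non-zero: 0.2, 0.8, ... have no exact binary form
zapsmall(s)  # rounds values negligibly different from zero to 0
```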

On Thu, Mar 1, 2012 at 2:49 PM, Mark A. Albins kamoko...@gmail.com wrote:
 Hi!

 I'm running R version 2.13.0 (2011-04-13)
 Platform: i386-pc-mingw32/i386 (32-bit)

 When i type in the command:

 sum(c(-0.2, 0.8, 0.8, -3.2, 1.8))

 R returns the value:

 -5.551115e-17

 Why doesn't R return zero in this case?  There shouldn't be any rounding
 error in a simple sum.

 Thanks,

 Mark




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Computing line= for mtext

2012-03-02 Thread Greg Snow
I would use the regular text function instead of mtext (remembering to
set par(xpd=...)), then use the grconvertX and grconvertY functions to
find the location to plot at (possibly adding in the results from
strwidth or strheight).
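A sketch of that approach, right-justifying a label against the right edge of the device (the numbers follow Frank's example below; xpd = NA lets text() draw in the margin):

```r
par(mar = c(4, 3, 1, 5), xpd = NA)
plot(1:20)
# user-coordinate position of the right edge of the device
xr <- grconvertX(1, from = "ndc", to = "user")
text(xr, 5, "abcde", adj = 1)  # right-justified at the device edge
```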

On Thu, Mar 1, 2012 at 4:52 PM, Frank Harrell f.harr...@vanderbilt.edu wrote:
 Rich's pointers deals with lattice/grid graphics.  Does anyone have a
 solution for base graphics?
 Thanks
 Frank

 Richard M. Heiberger wrote

 Frank,

 This can be done directly with a variant of the panel.axis function.
 See function panel.axis.right in the HH package.  This was provided for me
 by David Winsemius in response to my query on this list in October 2011
 https://stat.ethz.ch/pipermail/r-help/2011-October/292806.html

 The email thread also includes comments by Deepayan Sarkar and Paul
 Murrell.

 Rich

 On Wed, Feb 29, 2012 at 8:48 AM, Frank Harrell <f.harrell@> wrote:

 I want to right-justify a vector of numbers in the right margin of a
 low-level plot.  For this I need to compute the line parameter to give to
 mtext.  Is this the correct scalable calculation?

 par(mar=c(4,3,1,5)); plot(1:20)
 s <- 'abcde'; w <- strwidth(s, units='inches')/par('cin')[1]
 mtext(s, side=4, las=1, at=5, adj=1, line=w-.5, cex=1)
 mtext(s, side=4, las=1, at=7, adj=1, line=2*(w-.5), cex=2)

 Thanks
 Frank

 -
 Frank Harrell
 Department of Biostatistics, Vanderbilt University
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Computing-line-for-mtext-tp4431554p4431554.html
 Sent from the R help mailing list archive at Nabble.com.







 -
 Frank Harrell
 Department of Biostatistics, Vanderbilt University
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Computing-line-for-mtext-tp4431554p4436923.html
 Sent from the R help mailing list archive at Nabble.com.




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] testing two data sets

2012-02-24 Thread Greg Snow
?ks.test
?qqplot

also look at permutation tests and possibly the vis.test function in
the TeachingDemos package.

Note that with all of these, large samples may give you the power to
detect meaningless differences, and small samples may not have enough
power to detect potentially important differences.
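A quick illustration of the two-sample Kolmogorov-Smirnov test (data simulated here, not from the thread):

```r
set.seed(1)
a <- rnorm(200)              # N(0, 1)
b <- rnorm(200, mean = 0.5)  # shifted by half an SD
ks.test(a, b)                # small p-value: distributions differ
qqplot(a, b); abline(0, 1)   # points drift off the 45-degree line
```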

On Wed, Feb 22, 2012 at 12:37 AM, Mohammed Ouassou
mohammed.ouas...@statkart.no wrote:
 Hi everyone,

 I have 2 data sets and I like to carry out a test to find out if they
 come from the same distribution.


 Any suggestions ? thanks in advance.

 M.O









-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Repeated cross-validation for a lm object

2012-02-18 Thread Greg Snow
The validate function in the rms package can do cross-validation of
ols objects (ols is similar to lm, but stores additional information);
the default is bootstrap validation, but you can specify
cross-validation instead.
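For reference, the repeated 10-fold scheme can also be coded by hand in base R. A sketch on simulated data (the poster's real data set is not available here, and the variable names are invented):

```r
set.seed(42)
n <- 174
dat <- data.frame(x = runif(n))
dat$y <- 2 + 3 * dat$x + rnorm(n)

reps <- 100; k <- 10
pred <- matrix(NA_real_, nrow = n, ncol = reps)   # one column per repetition
for (r in seq_len(reps)) {
  fold <- sample(rep(seq_len(k), length.out = n)) # random fold labels
  for (f in seq_len(k)) {
    fit <- lm(y ~ x, data = dat[fold != f, ])     # train on the other 9 folds
    pred[fold == f, r] <- predict(fit, newdata = dat[fold == f, ])
  }
}
# pred[i, ] now holds the 100 out-of-fold predictions for observation i
```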

On Thu, Feb 16, 2012 at 10:44 AM, samuel-rosa
alessandrosam...@yahoo.com.br wrote:
 Dear R users

 I'd like to hear from someone whether there is a function to do a repeated
 k-fold cross-validation for an lm object and get the predicted values for
 every observation. The situation is as follows:
 I had a data set composed of 174 observations, from which I randomly
 sampled a subset of 150 observations. With the subset (n = 150) I fitted
 the model: y = a + bx. The model validation has to be done using a repeated
 k-fold cross-validation on the complete data set (n = 174). I need to use 10
 folds and repeat the cross-validation 100 times. In the end of the
 procedure, I need to have access to the predicted values for each
 observation, that is, to the 100 predicted values for each observation. The
 function lmCV() in the package chemometrics provides the predicted values.
 However, it works only with multiple linear regression models.
 I hope there is a way of doing it.
 Best regards,

 -
 Bc.Sc.Agri. Alessandro Samuel-Rosa
 Postgraduate Program in Soil Science
 Federal University of Santa Maria
 Av. Roraima, nº 1000, Bairro Camobi, CEP 97105-970
 Santa Maria, Rio Grande do Sul, Brazil
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Repeated-cross-validation-for-a-lm-object-tp4394833p4394833.html
 Sent from the R help mailing list archive at Nabble.com.




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] help with e+01 number abbreviations

2012-02-16 Thread Greg Snow
Also look at the zapsmall function.  A useful but often overlooked tool.
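For example (a small illustration, not from the original thread):

```r
## zapsmall() rounds values that are negligibly small to exactly zero,
## which often lets R print the vector in fixed rather than scientific
## notation.
x <- c(6.4836e+01, 9.5412e+01, 1e-15)
x            # the 1e-15 forces scientific notation for the whole vector
zapsmall(x)  # the tiny value becomes 0 and printing switches to fixed
```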

On Thu, Feb 16, 2012 at 2:54 AM, Petr Savicky savi...@cs.cas.cz wrote:
 On Thu, Feb 16, 2012 at 10:17:09AM +0100, Gian Maria Niccolò Benucci wrote:
 Dear List,

 I will appreciate any advice regarding how to convert the following numbers
 [I got in return by taxondive()] in numeric integers without the e.g.
 6.4836e+01
 abbreviations.
 Thank you very much in advance,

 Gian

  taxa_dive
              Species       Delta      Delta*     Lambda+      Delta+ SDelta+
 Nat1      5.0000e+00  6.4836e+01  9.5412e+01  6.7753e+02  8.7398e+01  436.99
 Nat2      2.0000e+00  4.0747e+01  1.0000e+02  0.0000e+00  1.0000e+02  200.00
 Nat3      3.0000e+00  4.5381e+01  7.7652e+01  2.8075e+02  8.8152e+01  264.46
 

 Hi.

 The exponential format was probably used due to some small
 numbers. For example

  tst <- rbind(
  c( 5.0000e+00, 6.4836e+01, 9.5412e+01, 6.7753e+02, 8.7398e+01, 436.99),
  c( 2.0000e+00, 4.0747e+01, 1.0000e+02, 0.0000e+00, 1.0000e+02, 200.00),
  c( 3.0000e+00, 4.5381e+01, 7.7652e+01, 2.8075e+02, 8.8152e+01, 264.46),
  c( 1e-8,       1e-8,       1e-8,       1e-8,       1e-8,       1 ))

  tst

        [,1]       [,2]       [,3]       [,4]       [,5]   [,6]
  [1,] 5e+00 6.4836e+01 9.5412e+01 6.7753e+02 8.7398e+01 436.99
  [2,] 2e+00 4.0747e+01 1.0000e+02 0.0000e+00 1.0000e+02 200.00
  [3,] 3e+00 4.5381e+01 7.7652e+01 2.8075e+02 8.8152e+01 264.46
  [4,] 1e-08 1.0000e-08 1.0000e-08 1.0000e-08 1.0000e-08   1.00

 Try rounding the numbers, for example

  round(tst, digits=4)

       [,1]   [,2]    [,3]   [,4]    [,5]   [,6]
  [1,]    5 64.836  95.412 677.53  87.398 436.99
  [2,]    2 40.747 100.000   0.00 100.000 200.00
  [3,]    3 45.381  77.652 280.75  88.152 264.46
  [4,]    0  0.000   0.000   0.00   0.000   1.00

 Alternatively, options(scipen=20) forces a fixed point printing
 with more digits.

  options(scipen=20)
  tst

              [,1]        [,2]         [,3]         [,4]         [,5]   [,6]
  [1,] 5.00000000 64.83600000  95.41200000 677.53000000  87.39800000 436.99
  [2,] 2.00000000 40.74700000 100.00000000   0.00000000 100.00000000 200.00
  [3,] 3.00000000 45.38100000  77.65200000 280.75000000  88.15200000 264.46
  [4,] 0.00000001  0.00000001   0.00000001   0.00000001   0.00000001   1.00

 Hope this helps.

 Petr Savicky.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to test the random factor effect in lme

2012-02-15 Thread Greg Snow
This post https://stat.ethz.ch/pipermail/r-sig-mixed-models/2009q1/001819.html
may help you understand why the standard p-values in some cases are
not the right thing to do and what one alternative is.
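One common alternative (a sketch in the spirit of the linked post, using the PCBdata from the question) is a REML likelihood-ratio comparison of fits with and without the random effect. Note the usual chi-square p-value is conservative here because the null hypothesis (variance = 0) sits on the boundary of the parameter space, so halving it is often advised:

```r
library(nlme)
m0 <- gls(PCB ~ Area, data = PCBdata, method = "REML")   # no Sites effect
m1 <- lme(PCB ~ Area, random = ~ 1 | Sites,
          data = PCBdata, method = "REML")               # with Sites effect
anova(m0, m1)   # LRT for the Sites variance component
```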

On Tue, Feb 14, 2012 at 3:36 PM, Xiang Gao xianggao2...@gmail.com wrote:
 Hi

 I am working on a Nested one-way ANOVA. I don't know how to implement
 R code to test the significance of the random factor

 My R code so far can only test the fixed factor :

 anova(lme(PCB~Area,random=~1|Sites, data = PCBdata))
             numDF denDF   F-value p-value
 (Intercept)     1    12 1841.7845  <.0001
 Area            1     4    4.9846  0.0894


 Here is my data and my hand calculation.

 PCBdata
   Area Sites PCB
 1     A     1  18
 2     A     1  16
 3     A     1  16
 4     A     2  19
 5     A     2  20
 6     A     2  19
 7     A     3  18
 8     A     3  18
 9     A     3  20
 10    B     4  21
 11    B     4  20
 12    B     4  18
 13    B     5  19
 14    B     5  20
 15    B     5  21
 16    B     6  19
 17    B     6  23
 18    B     6  21

 By hand calculation, the result should be:
 Source   SS     DF   MS
 Areas    18.00   1   18.00
 Sites    14.44   4    3.61
 Error    20.67  12    1.72
 Total    53.11  17    ---


 MSareas/MSsites = 4.99 --- matching the R output
 MSsites/MSE = 2.10
 Conclusion is that Neither of Areas nor Sites make differences.


 My R code so far can only test the fixed effect :

 anova(lme(PCB~Area,random=~1|Sites, data = PCBdata))
             numDF denDF   F-value p-value
 (Intercept)     1    12 1841.7845  <.0001
 Area            1     4    4.9846  0.0894



 --
 Xiang Gao, Ph.D.
 Department of Biology
 University of North Texas

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] hexplom question(s)

2012-02-14 Thread Greg Snow
Assuming this is the hexplom function from the hexbin package (it is
best to be specific in case there are multiple versions of the
function you ask about): you can specify lower.panel=function(...){}
for (a) and as.matrix=TRUE for (c). For (b) I am not sure exactly what
you want to do, but look at the diag.panel.splom function in the
lattice package as a possible solution.
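Putting those pieces together (a sketch with made-up data; the arguments are passed through to lattice's splom):

```r
library(hexbin)
dat <- as.data.frame(matrix(rnorm(400), ncol = 4))
hexplom(~ dat[, 1:4], xbins = 15, xlab = "",
        lower.panel = function(...) {},  # (a) blank out the lower triangle
        as.matrix   = TRUE)              # (c) variables run top-left downward
```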

On Mon, Feb 13, 2012 at 5:16 PM, Debs Majumdar debs_st...@yahoo.com wrote:
 Hi,

    I am trying to use the --hexplom-- function to draw a scatterplot matrix.

 The following works for me: hexplom(~file[,1:4], xbins=15,  xlab="")

 However, I want to make some changes to the graph:

 a) I only want to print/draw only one-half of the plot. Is there anyway to 
 get rid of the plots in the lower triangular matrix?

 b) Is there anyway, I can overwrite the xlabels?

 c) Not very important, but the variables start from the bottom and go up. 
 E.g. I am plotting 4 variables and I have a 4x4 matrix for the plot. Is there 
 anyway I can reverse the diagonals? i.e. I would like to list the variables 
 and axes on 1x1, 2x2, 3x3 and 4x4 rather than the default where it lists the 
 first variable on 4x1 followed by 3x2, 2x3 and 1x4?


 Thanks,

 -Joey


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] testing for a distribution of probability

2012-02-14 Thread Greg Snow
All the distribution tests are rule-out tests, i.e. they can tell you
that your data do not match a given distribution, but they can never
tell you that the data do come from a specific distribution.

Note also that the results of any of these tests may not be that
useful: for small sample sizes it is more important to rule out a
given distribution, but unless there is a huge difference you won't
have much power to do so.  For large sample sizes it is less
important, because using a close distribution will generally give you
robust results, but you will have power to detect small, meaningless
differences.  So often your choice is between a meaningless answer to
a meaningful question or a meaningful answer to a meaningless
question.

What is more important and a better approach is to understand the
science behind the process that generated the data and use that
knowledge to find a distribution that is reasonable (even if not
exact) or to use techniques that make fewer assumptions about the
distribution if you cannot find something close enough to be
reasonable (e.g. bootstrap, permutation, other non-parametric,
simulations to determine cut-off values).
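If you still want a formal check, one hedged sketch is to fit a candidate family with MASS::fitdistr() and then test the fit, keeping in mind that ks.test() p-values are biased upward when the parameters were estimated from the same data:

```r
library(MASS)
x   <- rgamma(200, shape = 2, rate = 1)   # example data, made up here
fit <- fitdistr(x, "gamma")               # maximum-likelihood fit
ks.test(x, "pgamma", shape = fit$estimate["shape"],
        rate = fit$estimate["rate"])      # approximate check only
```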



On Tue, Feb 14, 2012 at 4:21 AM, Bianca A Santini
b.sant...@sheffield.ac.uk wrote:
 Hello!
 I have several variables. Each of them has a different distribution. I was
 thinking to use a Generalized Linear Model, glm(), but I need to introduce
 the family. Do you know if R has any tests for matching data to any
 distribution ( I am aware of shapiro.test).

 All the best,


 --
 BAS

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Wildcard for indexing?

2012-02-14 Thread Greg Snow
Note that you can also do logical comparisons with the results of grepl like:

grepl('^as', a) | grepl('^df',a)

For the given example it is probably simplest to do it in the regular
expression as shown, but for some more complex cases (or including
other variables) the logic with the output may be simpler.

On Tue, Feb 14, 2012 at 8:23 AM, Johannes Radinger jradin...@gmx.at wrote:


  Original-Nachricht 
 Datum: Tue, 14 Feb 2012 10:18:33 -0500
 Von: Sarah Goslee sarah.gos...@gmail.com
 An: Johannes Radinger jradin...@gmx.at
 CC: R-help@r-project.org
 Betreff: Re: [R] Wildcard for indexing?

 Hi,

 You should probably do a bit of reading about regular expressions, but
 here's one way:

 On Tue, Feb 14, 2012 at 10:10 AM, Johannes Radinger jradin...@gmx.at
 wrote:
  Hi,
 
   Original-Nachricht 
  Datum: Tue, 14 Feb 2012 09:59:39 -0500
  Von: R. Michael Weylandt michael.weyla...@gmail.com
  An: Johannes Radinger jradin...@gmx.at
  CC: R-help@r-project.org
  Betreff: Re: [R] Wildcard for indexing?
 
  I think the grep()-family (regular expressions) will be the easiest
  way to do this, though it sounds like you might prefer grepl() which
  returns a logical vector:
 
  ^[AB] # Starts with either an A or a B
  ^A_ # Starting with A_
 
  a <- c("A_A", "A_B", "C_A", "BB", "A_Asd")
  grepl("^[AB]", a)
  grepl("^A_", a)
 
  Yes grepl() is what I am looking for.
  is there also something like an OR statement e.g. if I want to
  select for elements that start with as OR df?

  a <- c("as1", "bb", "as2", "cc", "df", "aa", "dd", "sdf")
  grepl("^as|^df", a)
 [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE


 The square brackets match any of those characters, so are good
 for single characters. For more complex patterns, | is the or symbol.
 ^ marks the beginning.

 Thank you so much Sarah! I tried that | symbol intuitively, there was just a 
 problem with the quotation marks :(

 Now everything is solved...

 /johannes


 Sarah

 --
 Sarah Goslee
 http://www.functionaldiversity.org

 --

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting bar graph over a geographical map

2012-02-02 Thread Greg Snow
If you are willing to use base graphics instead of ggplot2 graphs, then look at 
the subplot function in the TeachingDemos package.  One of the examples there 
shows adding multiple small bar graphs to a map.
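A minimal sketch of that idea (coordinates and bar heights are made up):

```r
library(maps)
library(TeachingDemos)
map("world")
## Draw a small bar chart centred at longitude -100, latitude 40;
## size is the width/height of the subplot in inches.
subplot(barplot(c(3, 5, 2), col = c("red", "blue", "green"), axes = FALSE),
        x = -100, y = 40, size = c(0.5, 0.5))
```
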

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of sjlabrie
 Sent: Tuesday, January 31, 2012 9:53 PM
 To: r-help@r-project.org
 Subject: [R] Plotting bar graph over a geographical map
 
 Hi,
 
 I am looking for a way to plot bar on a map instead of the standard
 points.
 I have been using ggplot2 and maps libraries.
 The points are added with the function geom_point. I know that there is
 a
 function
 geom_bar but I can't figure out how to use it.
 
 Thank you for your help,
 
 Simon
 
 ### R-code
 library(ggplot2)
 library(maps)
 
 measurements <- read.csv("all_podo.count.csv", header=T)
 allworld <- map_data("world")
 
 pdf("map.pdf")
 ggplot(measurements, aes(long, lat)) +
  geom_polygon(data = allworld, aes(x = long, y = lat, group = group),
  colour = "grey70", fill = "grey70") +
  geom_point(aes(size = ref)) +
  opts(axis.title.x = theme_blank(),
  axis.title.y = theme_blank()) +
  geom_bar(aes(y = normcount))
 dev.off()
 ###
 
 
 
 
 --
 View this message in context: http://r.789695.n4.nabble.com/Plotting-
 bar-graph-over-a-geographical-map-tp4346925p4346925.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] percentage from density()

2012-01-28 Thread Greg Snow
If you use logspline estimation (logspline package) instead of kernel density 
estimation then this is simple, as there are cumulative distribution functions 
(plogspline) for logspline fits.

If you need to do this with kernel density estimates then you can just find the 
area over your region for the kernel centered at each data point and average 
those values together to get the area under the entire density estimate.
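The averaging idea, sketched for the default Gaussian kernel and the poster's data: each kernel is a normal density centred at a data point with sd equal to the bandwidth, so the estimated probability of a region is the average of the normal probabilities of that region across data points.

```r
x <- c(-20, rep(0, 98), 20)
h <- density(x)$bw    # the bandwidth density() chose for these data
## Estimated probability mass of the interval (-20, 2):
mean(pnorm(2, mean = x, sd = h) - pnorm(-20, mean = x, sd = h))
```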

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Duke
Sent: Friday, January 27, 2012 3:45 PM
To: r-help@r-project.org
Subject: [R] percentage from density()

Hi folks,

I know that the density function will give an estimated density for a given 
dataset. Now from that I want to have a percentage estimation for a 
certain range. For example:

  y = density(c(-20,rep(0,98),20))
  plot(y, xlim=c(-4,4))

Now if I want to know the percentage of data lying in (-20,2). Basically 
it should be the area of the curve from -20 to 2. Anybody knows a simple 
function to do it?

Thanks,

D.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I compare 47 GLM models with 1 to 5 interactions and unique combinations?

2012-01-27 Thread Greg Snow
What variables to consider adding and when to stop adding them depends greatly 
upon what question(s) you are trying to answer and the science behind your data.

Are you trying to create a model to predict your outcome for future predictors? 
 How precise of predictions are needed?

Are you trying to understand how certain predictors relate to the response? How 
they relate after conditioning on other predictors?

Will humans be using your equation directly? Or will it be in a black box that 
the computer generates predictions from but people never need to look at the 
details?

What is the cost (money, time, difficulty, etc.) of collecting the different 
predictors?

Answers to the above questions will be much more valuable in choosing the 
best model than AIC or other values (though you should still look at the 
results from analyses for information to combine with the other information).  
R and its programmers (no matter how great and wonderful they are) cannot 
answer these for you.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Jhope
 Sent: Thursday, January 26, 2012 2:26 PM
 To: r-help@r-project.org
 Subject: Re: [R] How do I compare 47 GLM models with 1 to 5
 interactions and unique combinations?
 
 I ask the question about when to stop adding another variable even
 though it
 lowers the AIC because each time I add a variable the AIC is lower. How
 do I
 know when the model is a good fit? When to stop adding variables,
 keeping
 the model simple?
 
 Thanks, J
 
 --
 View this message in context: http://r.789695.n4.nabble.com/How-do-I-
 compare-47-GLM-models-with-1-to-5-interactions-and-unique-combinations-
 tp4326407p4331848.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Placing a Shaded Box on a Plot

2012-01-27 Thread Greg Snow
The locator() function can help you find coordinates of interest on an existing 
plot.
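For example (run interactively; the coordinates come from your two clicks):

```r
plot(1:10)
xy <- locator(2)   # click two opposite corners; returns list(x=..., y=...)
rect(min(xy$x), min(xy$y), max(xy$x), max(xy$y),
     col = "grey", border = NA)   # shade the clicked rectangle
```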

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Stephanie Cooke
 Sent: Friday, January 27, 2012 1:03 AM
 To: r-help@r-project.org
 Subject: [R] Placing a Shaded Box on a Plot
 
 Hello,
 
 I would like to place shaded boxes on different areas of a
 phylogenetic tree plot. Since I can not determine how to find axes on
 the phylogenetic tree plot I am not able to place the box over certain
 areas. Below is example code for the shaded box that I have tried to
 use, and the first four values specify the position.
 
  rect(110, 400, 135, 450, col="grey", border="transparent")
 
 Any suggestions on how to place a shaded box to highlight certain
 areas of a plot I would greatly appreciate.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to write the entire session to file?

2012-01-27 Thread Greg Snow
A different approach is to use the etxtStart function in the TeachingDemos 
package.  You need to run this before you start; it will then save everything 
(commands, output, and plots if you tell it to) to a file that can be 
post-processed to give a file showing basic coloring (or, with options in the 
post-processing, even more coloring).  Though it may be better to just run your 
R session through an editor like ESS/emacs or others.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Ajay Askoolum
 Sent: Friday, January 27, 2012 12:04 PM
 To: R General Forum
 Subject: [R] How to write the entire session to file?
 
 savehistory writes all the executed lines from the session.
 
 How can I write everything (executed lines and output) from the active
 session to a file?
 
 Using Edit | Select All then Edit Copy, I can copy everything to the
 clipboard and write the whole thing to a file manually.
 
  If I just used the clipboard, I can paste the whole content into
  another editor (for documentation). Is there a way to copy the content
  of the session with the syntax colouring?
 
 Thanks.
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] null distribution of binom.test p values

2012-01-26 Thread Greg Snow
I believe that what you are seeing is due to the discrete nature of the 
binomial test.  When I run your code below I see the bar between 0.9 and 1.0 is 
about twice as tall as the bar between 0.0 and 0.1, but the bar between 0.8 and 
0.9 is not there (height 0), if you average the top 2 bars (0.8-0.9 and 
0.9-1.0) then the average height is similar to that of the lowest bar.  The bar 
between 0.5 and 0.6 is also 0, if you average that one with the next 2 (0.6-0.7 
and 0.7-0.8) then they are also similar to the bars near 0.



-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Chris Wallace
 Sent: Thursday, January 26, 2012 5:44 AM
 To: r-help@r-project.org
 Subject: [R] null distribution of binom.test p values
 
 Dear R-help,
 
 I must be missing something very obvious, but I am confused as to why
 the null distribution for p values generated by binom.test() appears to
 be non-uniform.  The histogram generated below has a trend towards
 values closer to 1 than 0.  I expected it to be flat.
 
  hist(sapply(1:1000, function(i,n=100)
  binom.test(sum(rnorm(n)>0),n,p=0.5,alternative="two")$p.value))
 
 This trend is more pronounced for small n, and the distribution appears
 uniform for larger n, say n=1000.  I had expected the distribution to
 be
 discrete for small n, but not skewed.  Can anyone explain why?
 
 Many thanks,
 
 Chris.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] null distribution of binom.test p values

2012-01-26 Thread Greg Snow
Yes that is due to the discreteness of the distribution, consider the following:

> binom.test(39,100,.5)

Exact binomial test

data:  39 and 100 
number of successes = 39, number of trials = 100, p-value = 0.0352
alternative hypothesis: true probability of success is not equal to 0.5 
95 percent confidence interval:
 0.2940104 0.4926855 
sample estimates:
probability of success 
  0.39 

> binom.test(40,100,.5)

Exact binomial test

data:  40 and 100 
number of successes = 40, number of trials = 100, p-value = 0.05689
alternative hypothesis: true probability of success is not equal to 0.5 
95 percent confidence interval:
 0.3032948 0.5027908 
sample estimates:
probability of success 
   0.4

(you can do the same for 60 and 61)

So notice that the probability of getting 39 or more extreme is 0.0352, but 
anything less extreme will result in not rejecting the null hypothesis (because 
the probability of getting a 40 or a 60 (dbinom(40,100,.5)) is about 1% each, 
so we see a 2% jump there).  So the size/probability of a type I error will 
generally not be equal to alpha unless n is huge or alpha is chosen to 
correspond to a jump in the distribution rather than using common round values.

I have seen suggestions that instead of the standard test we use a test that 
rejects the null for values 39 and more extreme, don't reject the null for 41 
and less extreme, and if you see a 40 or 60 then you generate a uniform random 
number and reject if it is below a certain value (that value chosen to give an 
overall probability of type I error of 0.05).  This will correctly size the 
test, but becomes less reproducible (and makes clients nervous when they 
present their data and you pull out a coin, flip it, and tell them if they have 
significant results based on your coin flip (or more realistically a die 
roll)).  I think it is better in this case, if you know your final sample size 
is going to be 100, to explicitly state that alpha will be 0.0352 (but then you 
need to justify to reviewers why you are not using the common 0.05).
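The jumps come straight from the discreteness of the binomial distribution (for p = 0.5 the two-sided p-value is just twice the lower tail):

```r
2 * pbinom(39, 100, 0.5)   # 0.0352 -- the p-value for 39/100 shown above
2 * pbinom(40, 100, 0.5)   # 0.0569 -- the next attainable size
dbinom(40, 100, 0.5)       # about 0.011 -- the roughly 1% jump at 40 (and 60)
```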

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: Chris Wallace [mailto:chris.wall...@cimr.cam.ac.uk]
 Sent: Thursday, January 26, 2012 9:36 AM
 To: Greg Snow
 Cc: r-help@r-project.org
 Subject: Re: [R] null distribution of binom.test p values
 
 Greg, thanks for the reply.
 
 Unfortunately, I remain unconvinced!
 
 I ran a longer simulation, 100,000 reps.  The size of the test is
 consistently too small (see below) and the histogram shows increasing
 bars even within the parts of the histogram with even bar spacing.  See
 https://www-gene.cimr.cam.ac.uk/staff/wallace/hist.png
 
  y <- sapply(1:100000, function(i,n=100)
    binom.test(sum(rnorm(n)>0),n,p=0.5,alternative="two")$p.value)
  mean(y<0.01)
  # [1] 0.00584
  mean(y<0.05)
  # [1] 0.03431
  mean(y<0.1)
  # [1] 0.08646
 
 Can that really be due to the discreteness of the distribution?
 
 C.
 
 On 26/01/12 16:08, Greg Snow wrote:
  I believe that what you are seeing is due to the discrete nature of
 the binomial test.  When I run your code below I see the bar between
 0.9 and 1.0 is about twice as tall as the bar between 0.0 and 0.1, but
 the bar between 0.8 and 0.9 is not there (height 0), if you average the
 top 2 bars (0.8-0.9 and 0.9-1.0) then the average height is similar to
 that of the lowest bar.  The bar between 0.5 and 0.6 is also 0, if you
 average that one with the next 2 (0.6-0.7 and 0.7-0.8) then they are
 also similar to the bars near 0.
 
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] What does [[1]] mean?

2012-01-26 Thread Greg Snow
Have you read ?[[ ?

The short answer is that you can use both [] and [[]] on lists, the [] 
construct will return a subset of  the list (which will be a list) while [[]] 
will return a single element of the list (which could be a list or a vector or 
whatever that element may be):  compare:

> tmp <- list( a=1, b=letters )
> tmp[1]
$a
[1] 1

> tmp[1] + 1
Error in tmp[1] + 1 : non-numeric argument to binary operator
> tmp[[1]]
[1] 1
> tmp[[1]] + 1
[1] 2

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Ajay Askoolum
 Sent: Thursday, January 26, 2012 11:27 AM
 To: R General Forum
 Subject: [R] What does [[1]] mean?
 
 I know that [] is used for indexing.
 I know that [[]] is used for reference to a property of a COM object.
 
 I cannot find any explanation of what [[1]] does or, more pertinently,
 where it should be used.
 
 Thank you.
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] function for grouping

2012-01-25 Thread Greg Snow
I nominate this response for the fortunes package.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of David Winsemius
 Sent: Wednesday, January 25, 2012 10:23 AM
 To: yan
 Cc: r-help@r-project.org
 Subject: Re: [R] function for grouping
 
 
 On Jan 25, 2012, at 12:10 PM, yan wrote:
 
  thanks petr,
  what if I got 200 elements, so I have to write expand.grid(x1=1,
  x2=1:2,
  x3=1:3, x4=1:3, x5=1:3x200=1:3))?
 
 Perhaps same thing that will happen when those monks finish the Towers
 of Hanoi?
 
 2*3^198
 [1] 5.902533e+94
 
 --
 
 David Winsemius, MD
 West Hartford, CT
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to save the R script itself into a rData file?

2012-01-23 Thread Greg Snow
You could use the savehistory() command to save the history of commands that you 
have typed to a file, then read that into a variable using the scan function, 
then do save or save.image to save everything.

A different approach would be to save transcripts of your session that would 
show the commands run and the output created; one option for doing this is to 
run R inside of ESS/emacs, another option is the txtStart function in the 
TeachingDemos package.

You could also use the addTaskCallback function to add a task callback that 
adds each command (well, the successful ones -- errors don't trigger the 
callbacks) to a text vector; this text vector would then be saved in .RData 
when doing save.image()
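A hedged sketch of the callback idea (the vector name is arbitrary):

```r
## Record each successful top-level command in a character vector that
## save.image() will then pick up along with the rest of the workspace.
cmdlog <- character(0)
addTaskCallback(function(expr, value, ok, visible) {
  cmdlog <<- c(cmdlog, paste(deparse(expr), collapse = "\n"))
  TRUE   # returning TRUE keeps the callback registered
}, name = "command logger")
```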


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Michael
Sent: Saturday, January 21, 2012 5:25 PM
To: r-help
Subject: [R] how to save the R script itself into a rData file?

Hi all,

As a part of work flow, I do a lot of experiments and save all my results
into rData file...

i.e. at the end of all my experiments, I do
save.image(experiment_name_with_series_number.rData)...

However, some times even with the rData files, I cannot remember the
context where these data files were generated.

Of course, I can make the R data file names and the R script file names the
same, so that whenever I see a data file, I will be able to track down to
how the result file was generated.

This is fine. But sometimes a bunch of different results rData files were
generated simply from varying a parameter in the same R script file.

It's kind of messy to save different R script files with different names
when only parameters are different, and not to say if there are a bunch of
parameters that need to be put into file names...

Lets say I changed the parameters x to 0.123, y to -0.456, z to -999.99

Then I have to save the R script file as
Experiment_001_x=0.123_y=-0.456_z=-999.99.r

and the result file as Experiment_001_x=0.123_y=-0.456_z=-999.99.rData

...

This is kind of messy, isn't it?

Is there a way to save the whole script file (i.e. the context where the
data file is generated) into the rData file? It cannot be just the file
location and/or file name of the R script file; it needs to be the whole
content of the file, to guard against parameter changes, i.e. the same R
script file but with different combinations of parameters...

How to do that?

Any good tricks?

Thanks a lot!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] graph paper look

2012-01-23 Thread Greg Snow
In addition to the recommendations to use the grid function, you could just do:

par(tck=1)

before calling the plotting functions.
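For example, this is all it takes to get the graph-paper look with base graphics:

```r
# tck = 1 makes every tick mark span the full plot region,
# so the ticks themselves form the grid.
par(tck = 1)
plot(1:10, (1:10)^2, type = "b")   # drawn on a full background grid
```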

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Erin Hodgess
Sent: Wednesday, January 18, 2012 6:19 PM
To: r-help@r-project.org
Subject: [R] graph paper look

Dear R People:

Short of doing a series of ablines, is there a way to produce graph
paper in R please?

Thanks,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Read and show Bitmap Images

2012-01-23 Thread Greg Snow
See the rasterImage function to do the plotting.  If you need to read the image 
in then I would start with the EBImage package from bioconductor (though there 
are others as well).
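A minimal sketch of the rasterImage approach, assuming the image is already an integer matrix (the data here are made up):

```r
# Turn an integer image matrix into colours and draw it with rasterImage.
m    <- matrix(sample(0:255, 100, replace = TRUE), 10, 10)  # hypothetical image data
cols <- matrix(grey(m / 255), nrow(m), ncol(m))             # map integers to grey levels
plot(0:1, 0:1, type = "n", axes = FALSE, xlab = "", ylab = "")
rasterImage(as.raster(cols), 0, 0, 1, 1, interpolate = FALSE)
```

Any mapping from integers to colours works in place of grey(); e.g. a palette lookup like rainbow(256)[m + 1].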

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Alaios
Sent: Monday, January 16, 2012 12:05 AM
To: R-help@r-project.org
Subject: [R] Read and show Bitmap Images

Dear all,
I am looking for a function that can plot bitmap images; by plotting I mean a 
function that can read an image's matrix structure of integers and assign 
colors to the values.
Could you please suggest what I can do to plot these images?

I would like to thank you for your reply

B.R
Alex

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Display numbers on map

2012-01-23 Thread Greg Snow
You might consider using the state.vbm map that is now part of the maptools 
package.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Jeffrey Joh
Sent: Tuesday, January 17, 2012 3:37 AM
To: r-help@r-project.org
Subject: [R] Display numbers on map


I have a text file with states and numbers.  I would like to display each 
number that corresponds to a state on a map.

I am trying to use the maps package, but it doesn't show Alaska or Hawaii.  Do 
you have suggestions on how to do this?

Jeffrey
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how do I make a movie out of a timeseries of 2D data?

2012-01-12 Thread Greg Snow
If you need the animation in a file outside of R (or possibly in R) then look 
at the animation package.  This allows you quite a few options on how to save 
an animation, some of which depend on outside programs, but options mean that 
if you don't have one of those programs there are other ways to do this.

If you just want to explore this within R then the development version of the 
TeachingDemos package (on R-Forge) has added an animate control to the tkexamp 
function.  Just create a function that takes the time index as the argument and 
creates the plot you want for each time, then run tkexamp using the animate 
control for the time argument.  You will see a new window with the plot for the 
1st time index along with a slider that will let you move through time and a 
button that will step through the remaining times automatically (you can 
specify the speed when running tkexamp).
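For quick exploration inside R, a plain loop over the time index already gives a crude playback (dat here is a small made-up stand-in for the poster's time x row x column array; the animation package wraps a loop like this to write a movie file):

```r
# Minimal in-R playback: draw each 2D slice in turn.
dat <- array(rnorm(20 * 10 * 10), c(20, 10, 10))   # hypothetical 20-frame series
for (t in seq_len(dim(dat)[1])) {
  image(dat[t, , ], main = paste("frame", t))
  Sys.sleep(0.04)                                  # ~25 frames per second
}
```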

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Michael
Sent: Thursday, January 12, 2012 7:38 AM
To: r-help; r-sig-finance
Subject: [R] how do I make a movie out of a timeseries of 2D data?

Hi all,

I have an array of 1 x 200 x 200 numbers... which is a time-series of
200x200 2D data...

The 1st dimension is the time index.

Is there a way to make a movie out of these data - i.e. play back the
(200x200) frames at some playback rate per second?

Thanks a lot!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Options for generating editable figures?

2012-01-03 Thread Greg Snow
I have had clients who also wanted to make little changes to the graphs (mostly 
changing colors or line widths). After doing this a couple of times, most have 
been happy to give me better descriptions of what they want so I can just do it 
correctly the first time.

I mostly give them the graphs in .wmf or .emf format. However, I have found that 
if I create the file and send it to them, most have problems getting it into 
Word or PowerPoint; instead I usually copy and paste it into a Word document 
and send the Word document to them, and they can then copy and paste from there 
into their presentation or report. Of course this is only an option if you have 
MS Word on the same computer you are working on. With those files, double 
clicking takes the user into a basic editor where they can change colors, line 
widths, etc. However, sometimes opening that editor will redo all the text, so 
what started as changing one line color also requires them to reorient all the 
axis and tick labels.

Inkscape is a much more capable program for doing these kinds of edits, and 
basic editing in it is fairly straightforward, so among the options you describe 
below, I would suggest that you have them learn Inkscape if they really want to 
edit the graphs themselves. Inkscape can also import PDF files (though it is an 
import rather than a simple open, and you often need to ungroup a bunch of 
objects before editing them), so that may be another option for you.
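A quick sketch of producing a vector file that Inkscape opens directly (the file name is arbitrary; svg() requires cairo support, which most R builds have):

```r
# Write an SVG figure for editing in Inkscape.
f <- tempfile(fileext = ".svg")
svg(f)
plot(1:10, col = "red", pch = 19)
dev.off()
```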

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Allen McBride
 Sent: Monday, January 02, 2012 7:51 PM
 To: r-help@r-project.org
 Subject: [R] Options for generating editable figures?
 
 Hello all,
 
 I'm using R to produce figures for people who want to be able to edit
 the figures directly, and who use PowerPoint a lot. I use a Mac, and
 I'd
 appreciate any advice about how to approach this. Here's what I've come
 up with so far:
 
 1) I can use xfig() and then ask them to install Inkscape to edit the
 files. Downsides are no transparency and a learning curve with
 Inkscape.
 2) I can do the same as above but with svg() instead of xfig(). But for
 reasons I don't understand, when I use svg() I can't seem to edit the
 resulting figures' text objects in Inkscape.
 3) I can try to install UniConvertor, which sounds like quite a task
 for
 someone of my modest skills. This would supposedly allow me to create
 .wmf files, which might (and I've read conflicting things about this)
 be
 importable into PowerPoint as editable graphics.
 4) I found an old suggestion in the archives that an EPS could be
 imported into PowerPoint and made editable. This almost worked for me
 (using Inkscape to convert a cairo_ps()-generated file to EPS) -- but
 only using PowerPoint under Windows, and lots of vectors and all text
 were lost along the way.
 
 Am I on the right track? Am I missing any better pathways? I know
 similar questions have come up before, but the discussions I found in
 the archives were old, and maybe things have changed in recent years.
 
 Thanks for any advice!
 --Allen McBride
 
 R version: 2.13.1
 Platform: Mac OS 10.7.2
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Array element is function of its position in the array

2011-12-29 Thread Greg Snow
Does vnew <- vold[,,ks] accomplish what you want?
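In case that simple form does not index the way intended, here is a fully vectorized sketch using matrix indexing, which handles a ks that depends on all three indices (the dimensions and ksfunction below are made-up stand-ins):

```r
# Vectorized Vnew[i,j,k] <- Vold[i,j,ksfunction(i,j,k)] without loops.
nis <- 4; njs <- 3; nks <- 5
Vold <- array(rnorm(nis * njs * nks), c(nis, njs, nks))
ksfunction <- function(i, j, k) (i + j + k) %% nks + 1   # hypothetical example

idx  <- expand.grid(i = 1:nis, j = 1:njs, k = 1:nks)     # all index triples,
                                                         # in column-major order
Vnew <- array(Vold[cbind(idx$i, idx$j, ksfunction(idx$i, idx$j, idx$k))],
              c(nis, njs, nks))
```

Indexing an array with a three-column matrix pulls out one element per row, and expand.grid enumerates the triples in the same column-major order that array() fills, so no loop is needed.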

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Asher Meir
 Sent: Thursday, December 29, 2011 1:58 PM
 To: r-help@r-project.org
 Subject: [R] Array element is function of its position in the array
 
 I want to create a new array which selects values from an original
 array
 based on a function of the indices. That is:
 
 I want to create a new matrix Vnew[i,j,k]=Vold[i,j,ks] where ks is a
 function of the index elements i,j,k. I want to do this WITHOUT a loop.
 
 Call the function ksfunction, and the array dimensions nis,njs,nks. I
 can
 do this using a loop as follows:
 
 # Loop version:
  Vnew <- array(NA, c(nis, njs, nks))
  for(i1 in 1:nis) for(j1 in 1:njs) for(k1 in 1:nks)
    Vnew[i1,j1,k1] <- Vold[i1,j1,ksfunction(i1,j1,k1)]
 
 I already know how to create an array of the ks's:
 ksarray[i,j,k]=ksfunction(i,j,k)  # I have the array ksarray ready
 
 I don't want a loop because nis,njs, and nks are pretty large and it
 takes
 forever.
 
 Would appreciate help with this issue.
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [newbie] read row from file into vector

2011-12-29 Thread Greg Snow
The scan function can be used to read a single row.  If your file has multiple 
rows you can use the skip and nlines arguments to determine which row to read.  
With the what argument set to a single item (a number or string, depending on 
which you want) it will read each element on that row into a vector.

If you want to do more of the hard work yourself, you can read in a whole line 
as a single string using the readLines function, then use strsplit (or, 
possibly better, tools from the gsubfn package) to split that string into a 
vector (the unlist function may also be of help).
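The scan approach for the poster's file looks like this (a temporary stand-in file replaces ~/data/foo.csv):

```r
# Read the second row of a CSV straight into a numeric vector.
f <- tempfile(fileext = ".csv")
writeLines(c("1,2,3", "5718,0.3,0.47"), f)        # stand-in for ~/data/foo.csv
v <- scan(f, sep = ",", skip = 1, nlines = 1, quiet = TRUE)
# v is now c(5718, 0.3, 0.47)
```

skip = 1 skips the first row and nlines = 1 stops after one row; scan's default what is numeric, so no conversion is needed.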

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Tom Roche
 Sent: Thursday, December 29, 2011 1:51 PM
 To: r-help@r-project.org
 Subject: [R] [newbie] read row from file into vector
 
 
 summary: how to read a row (not column) from a file into a vector (not
 a data frame)?
 
 details:
 
 I'm using
 
 $ lsb_release -ds
 Linux Mint Debian Edition
 $ uname -rv
 3.0.0-1-amd64 #1 SMP Sun Jul 24 02:24:44 UTC 2011
 $ R --version
 R version 2.14.1 (2011-12-22)
 
 I'm new to R (having previously used it only for graphics), but have
 worked in many other languages. I've got a CSV file from which I'd like
 to read the values from a single *row* into a vector. E.g., for a file
 such that
 
 $ head -n 2 ~/data/foo.csv | tail -n 1
 5718,0.3,0.47,0,0,0,0,0,0,0,0,0.08,0.37,0,0,0.83,1.55,0,0,0,0,0,0,0,0,0
 ,0.00,2.48,2.33,0.17,0,0,0,0,0,0,0.00,10.69,0.18,0,0,0,0
 
 I'd like to be able to populate a vector 'v' s.t. v[1]=5718, ...
 v[43]=0
 
 I can't seem to do that with, e.g., read.csv(...) or scan(...), both of
 which seem column-oriented. What am I missing?
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] p values in lmer

2011-12-24 Thread Greg Snow
This takes me back to listening to a professor lament about researchers who 
would spend years collecting their data, then negate all that effort because 
they insisted on using tools that were quick rather than correct.

So, before dismissing the use of pvals.fnc you might ask how long it takes to 
run relative to how long it took to collect the data and the importance of the 
answer.  If you feel the need to compute p-values multiple times, then you may 
need to rethink your approach (model selection based on repeated p-values 
results in p-values that are meaningless at best).

If you consider the above and still feel the need for a quick p-value rather 
than a correct one then you can use the 
SnowsCorrectlySizedButOtherwiseUselessTestOfAnything function from the 
TeachingDemos package. It is quick (but be sure to fully read the 
documentation).

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of arunkumar
Sent: Thursday, December 22, 2011 9:13 PM
To: r-help@r-project.org
Subject: [R] p values in lmer

hi

How can I get p-values for the lmer function other than with pvals.fnc(), since
it takes a long time to execute?

-
Thanks in Advance
Arun
--
View this message in context: 
http://r.789695.n4.nabble.com/p-values-in-lmer-tp4227434p4227434.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Renaming Within A Function

2011-12-22 Thread Greg Snow
The error occurs because you are trying to assign to the result of a get call, 
and nobody has programmed that (hence 'could not find function "get<-"'), 
because it is mostly (if not completely) meaningless to do so.

It is not completely clear what you want to accomplish, but there is probably a 
better way to accomplish it. It is preferable to create and modify the data 
object fully within the function and then return that object (letting the caller 
of the function worry about assigning it).

Some things to read that may be enlightening if you really feel the need to 
have your function modify existing objects:

 library(fortunes)
 fortune(236)

And 

http://cran.r-project.org/doc/Rnews/Rnews_2001-3.pdf
(the article in the Programmer's Niche)
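The preferred pattern from the paragraph above, applied to the poster's function (names set directly, no get()/assign() needed):

```r
# Build and name the data frame inside the function, return it,
# and let the caller choose the variable name.
myfunc <- function(d = c(1, 2, 3, 4, 5)) {
  dts <- seq(as.Date("2011-01-01"), by = "month", length.out = length(d))
  data.frame(Dates = dts, Data = d)
}
X.df <- myfunc()   # the caller does the assignment
```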

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Pete Brecknock
 Sent: Thursday, December 22, 2011 12:15 PM
 To: r-help@r-project.org
 Subject: [R] Renaming Within A Function
 
 I am trying to rename column names in a dataframe within a function. I
 am
 seeing an error (listed below) that I don't understand.
 
 Would be grateful of an explanation of what I am doing wrong and how I
 should rewrite the function to allow me to be able to rename my
 variables.
 
 Thanks.
 
 # Test Function
  myfunc <- function(var){
    d   <- c(1,2,3,4,5)
    dts <- seq(as.Date("2011-01-01"), by="month", length.out=length(d))
  
    assign(paste(var, ".df", sep=""), cbind(dts, d))
  
    names(get(paste(var, ".df", sep=""))) <- c("Dates","Data")
  }
 
 # Call Function
 myfunc("X")
 
 # Error Message
  Error in names(get(paste(var, ".df", sep = ""))) <- c("Dates", "Data") :
    could not find function "get<-"
 
 --
 View this message in context: http://r.789695.n4.nabble.com/Renaming-
 Within-A-Function-tp4226368p4226368.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to Fit a Set of Lines Parametrized by a Number

2011-12-21 Thread Greg Snow
This looks like a hierarchical Bayes type problem.  There are a few packages 
that do Bayes estimation or link to external tools (like OpenBUGS) to do this.  
You would just set up each of the relationships as you define below: y is a 
function of a(K), b(K), x, and e, where e comes from a normal distribution with 
mean 0 and variance sigma2, and a(K) follows the relationship that you show, 
with something similar for b(K). Then you just need prior distributions for 
alpha, beta (and the analogous parameters for b(K)), and sigma2, and let it run.
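As a non-Bayesian alternative in the same spirit, the whole model can also be fit in one step with nls, estimating alpha and beta directly from all the data instead of fitting each K separately. A simplified sketch, in which only the slope depends on K and the intercept B is common (the data, parameter values, and starting values are all illustrative):

```r
set.seed(1)
K <- rep(1:5, each = 20)
x <- runif(100)
y <- (2 + 0.5^K) * x + 1 + rnorm(100, sd = 0.05)   # true alpha=2, beta=0.5, B=1

# One-step nonlinear fit of y = (alpha + beta^K) x + B over all datasets at once.
fit <- nls(y ~ (alpha + beta^K) * x + B,
           start = list(alpha = 1, beta = 0.3, B = 0))
coef(fit)   # estimates should land near the true values
```

Fitting all the data jointly uses the information across datasets and gives standard errors for alpha and beta directly, rather than a two-stage fit of the per-K slopes.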

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Lorenzo Isella
 Sent: Wednesday, December 21, 2011 8:59 AM
 To: r-h...@stat.math.ethz.ch
 Subject: [R] How to Fit a Set of Lines Parametrized by a Number
 
 Dear All,
 It is not very difficult, in R, to perform a linear fit
 
 y=Ax+B on a single set of data.
 However, imagine that you have several datasets labelled by a number
 (real or integer does not matter) K. For each individual dataset, it
 would make sense to resort to a linear fit, but now A and B both
 depend on K.
 In other words you would like to fit all your data according to
 
 y=A(K)x+B(K).
 
 You already have an idea of the functional dependence of A and B on K
 (which involves other unknown parameters to estimate) e.g.
 
 A(K)=alpha+beta^K, with unknown parameters alpha and beta.
 
 How would you tackle this problem?
 On top of my head, if I have N datasets, I can only think about
 getting N estimates {A1,A2...AN} for the A parameter for all the N
 datasets by fitting them individually.
 I would then resort e.g. to a Levemberg-Marquardt algorithm to
 determine the values of alpha and beta that best fit alpha+beta^K to
 my set {A1,A2...AN} for the corresponding N values of K.
 For B(K), I would follow exactly the same procedure.
 Does anybody know any better method?
 Any suggestion is welcome.
 Cheers
 
 Lorenzo
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multiple plots in one subplot

2011-12-16 Thread Greg Snow
Look at the layout function, it may do what you want.
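A sketch of how layout handles this case: six panels in the 2x3 arrangement, with the last cell split into two stacked graphs (panels 6 and 7).

```r
# Rows 1-2 form the top half and rows 3-4 the bottom half; the bottom-right
# cell is split between panels 6 and 7.
m <- rbind(c(1, 3, 5),
           c(1, 3, 5),
           c(2, 4, 6),
           c(2, 4, 7))
layout(m)
for (p in 1:7) plot(rnorm(10), main = paste("panel", p))
```

Unlike a second call to par(fig=...), layout sets up all the regions at once, so nothing gets overwritten.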

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of annek
 Sent: Thursday, December 15, 2011 11:36 PM
 To: r-help@r-project.org
 Subject: [R] Multiple plots in one subplot
 
 Hi,
I am making a figure with six sub-plots using par(mfcol=c(2,3)). In the last
sub-plot I want to have two graphs instead of one. I have tried using
par(fig=c(x,y,z,v)) but this par seems to overwrite the first par. Is there a
simple solution?
Thanks!
 Thanks!
 
 Anna
 
 --
 View this message in context: http://r.789695.n4.nabble.com/Multiple-
 plots-in-one-subplot-tp4203525p4203525.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fundamental guide to use of numerical optimizers?

2011-12-15 Thread Greg Snow
This really depends on more than just the optimizer; a lot can depend on what 
the data look like and what question is being asked.  In bootstrapping it is 
possible to get bootstrap samples for which there is no unique correct answer 
to converge to.  For example, if the bootstrap leaves a category with no data 
but you still want to estimate a parameter for that category, then there are an 
infinite number of possible answers that are all equal in likelihood, so there 
will be a lack of convergence on that parameter.  A stratified bootstrap or 
semi-parametric bootstrap can be used to avoid this problem (but may change the 
assumptions being made as well), or you can just throw out all the samples that 
don't have a full answer (which could be what your presenter did).
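The stratified bootstrap idea is simple to sketch: resample within each level of a grouping factor, so no category can end up empty (the grouping data below are made up).

```r
# Resample observation indices within each stratum.
strat_boot_idx <- function(g) {
  unlist(lapply(split(seq_along(g), g),
                function(ix) sample(ix, replace = TRUE)),
         use.names = FALSE)
}
g   <- factor(rep(c("a", "b", "c"), c(5, 3, 2)))   # hypothetical strata
idx <- strat_boot_idx(g)                            # every level of g is represented
```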

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Paul Johnson
 Sent: Thursday, December 15, 2011 9:38 AM
 To: R-help
 Subject: [R] fundamental guide to use of numerical optimizers?
 
 I was in a presentation of optimizations fitted with both MPlus and
 SAS yesterday.  In a batch of 1000 bootstrap samples, between 300 and
 400 of the estimations did not converge.  The authors spoke as if this
 were the ordinary cost of doing business, and pointed to some
 publications in which the nonconvergence rate was as high or higher.
 
 I just don't believe that's right, and if some problem is posed so
 that the estimate is not obtained in such a large sample of
 applications, it either means the problem is badly asked or badly
 answered.  But I've got no traction unless I can actually do
 better
 
 Perhaps I can use this opportunity to learn about R functions like
 optim, or perhaps maxLik.
 
 From reading r-help, it seems to me there are some basic tips for
 optimization, such as:
 
 1. It is wise to scale the data so that all columns have the same
 range before running an optimizer.
 
 2. With estimates of variance parameters, don't try to estimate sigma
 directly, instead estimate log(sigma) because that puts the domain of
 the solution onto the real number line.
 
 3 With estimates of proportions, estimate instead the logit, for the
 same reason.
 
 Are these mistaken generalizations?  Are there other tips that
 everybody ought to know?
 
 I understand this is a vague question, perhaps the answers are just in
 the folklore. But if somebody has written them out, I would be glad to
 know.
 
 --
 Paul E. Johnson
 Professor, Political Science
 1541 Lilac Lane, Room 504
 University of Kansas
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nice report generator?

2011-12-14 Thread Greg Snow
Duncan, 

If you are taking suggestions for expanding the tables package (which looks 
great), then I would suggest some way to get the tables into MS products.  If I 
create a full output/report myself then I am happy to work in LaTeX, but much of 
what I do is producing tables and graphs for clients who don't know LaTeX and 
just want something they can copy and paste into PowerPoint or Word.  For this 
I have been using the R2wd package (and the wdTable function for the tables).  
I would love a toolset where I could use your tables package to create the main 
table and then transfer it fairly simply to Word or Excel.  I don't care much 
about the fluff of how the table looks (coloring rows or columns, line widths, 
etc.), just getting it into a table (not just the text version).

One possibility is just an as.matrix method that would produce something I could 
feed to wdTable.  Or just a textual representation of the table with columns 
separated by tabs, so that it could be copied to the clipboard and then pasted 
into Excel or Word (I would then let the client deal with all the tweaks to the 
appearance).
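The tab-separated route works today for anything matrix-like (the table data below are made up; on Windows, file = "clipboard" skips the intermediate file so the result can be pasted straight into Excel or Word):

```r
# Dump a matrix-like table as tab-separated text.
tab <- matrix(1:6, 2, dimnames = list(c("r1", "r2"), c("A", "B", "C")))
f <- tempfile(fileext = ".txt")
write.table(tab, file = f, sep = "\t", quote = FALSE, col.names = NA)
```

col.names = NA writes a blank top-left cell so the header row lines up with the row names when pasted into a spreadsheet.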

Thanks,

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Duncan Murdoch
Sent: Thursday, December 08, 2011 4:52 PM
To: Tal Galili
Cc: r-help
Subject: Re: [R] nice report generator?

On 11-12-08 1:37 PM, Tal Galili wrote:
 Helloe dear Duncan, Gabor, Michael and others,

 Do you think it could be (reasonably) possible to create a bridge between a
 cast_df object from the {reshape} package into a table in Duncan's new
 {tables} package?

I'm not that familiar with the reshape package (and neither it nor 
reshape2 appears to have a vignette to give me an overview), so I don't 
have any idea if that makes sense.  The table package is made to work on 
dataframes, and only dataframes.  It converts them into matrices with 
lots of attributes, so that the print methods can put nice labels on. 
But it's strictly rectangular to rectangular in the kinds of conversions 
it does, and from the little I know about reshape, it works on more 
general arrays, converting them to and from dataframes.



 That would allow one to do pivot-table like operations on an object using
 {reshape}, and then display it (as it would have been in excel - or better)
 using the {tables} package.

You'll have to give an example of what you want to do.

Duncan Murdoch








 Contact
 Details:---
 Contact me: tal.gal...@gmail.com |  972-52-7275845
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
 www.r-statistics.com (English)
 --




 On Thu, Dec 8, 2011 at 5:24 PM, Michaelcomtech@gmail.com  wrote:

 Hi folks,

 In addition to Excel style tables, it would be great to have Excel 2010
 Pivot Table in R...

 Any thoughts?

 Thanks a lot!

 On Thu, Dec 8, 2011 at 4:49 AM, Tal Galilital.gal...@gmail.com  wrote:

 I think it would be *great *if an extension of Duncan's new tables
 package could include themes and switches as are seen in the video Gabor
 just linked to.


 Tal


   On Thu, Dec 8, 2011 at 6:58 AM, Gabor Grothendieck
 ggrothendi...@gmail.com  wrote:

   On Wed, Dec 7, 2011 at 11:42 PM, Michaelcomtech@gmail.com  wrote:
 Do you have an example...? Thanks a lot!

 See this video:
 http://www.woopid.com/video/1388/Format-as-Table

 --
 Statistics  Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] map at fips level using multiple variables

2011-12-14 Thread Greg Snow
Colors probably are not the best for so many levels and combinations.  Look at 
the symbols function (or the my.symbols and subplot functions in the 
TeachingDemos package) for ways to add symbols to a map showing multiple 
variables.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of bby2...@columbia.edu
Sent: Wednesday, December 07, 2011 9:14 PM
To: David Winsemius
Cc: r-help@r-project.org
Subject: Re: [R] map at fips level using multiple variables

Hi David,

Sorry it sounds vague.

Here is my current code, which shows the distribution of family size  
at the US county level: family size is represented by different colors  
on a US map.

Now if I have another variable, income, which has 3 categories (<50k,  
50k-80k, >80k), how do I show the 5x3 categories on the map? I guess I  
could always create a third variable to do this. Just wondering, maybe  
there is a function to do this readily?

Thank you!
Bonnie Yuan

y=data1$size
x11()
hist(y,nclass=15) # histogram of y
fivenum(y)
# specify the cut-off points of y
y.colorBuckets=as.numeric(cut(y, c(1,2,3,6)))
# legend showing the cut-off points.
legend.txt=c("0-1","1-2","2-3","3-6",">6")
colorsmatched=y.colorBuckets[match(county.fips$fips,fips[,1])]
x11()
map("county", col = colors[colorsmatched], fill = TRUE, resolution = 0)
map("state", col = "white", fill = FALSE, add = TRUE, lty = 1, lwd = 0.2)
title("Family Size")
legend("bottom", legend.txt, horiz = TRUE, fill = colors, cex=0.7)
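Since the quoted reply below points at ?interaction, here is a hedged sketch of building the 5x3 combined factor; the category labels and data are assumptions for illustration, not taken from the real dataset.

```r
# Combine the five size buckets and three income bands into one factor.
size   <- factor(c("1-2", "3-6", "1-2"),
                 levels = c("0-1", "1-2", "2-3", "3-6", ">6"))
income <- factor(c("<50k", "50k-80k", ">80k"),
                 levels = c("<50k", "50k-80k", ">80k"))
combo  <- interaction(size, income)  # keeps all 5 x 3 = 15 levels
nlevels(combo)
```

The combined factor can then be cut into color buckets exactly as the single-variable code above does.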



Quoting David Winsemius dwinsem...@comcast.net:


 On Dec 7, 2011, at 6:12 PM, bby2...@columbia.edu wrote:

 Hi, I just started playing with county FIPS feature in maps package  
  which allows geospatial visualization of variables on US county   
 level. Pretty cool.

 Got code?


 I did some search but couldn't find answer to this question--how   
 can I map more than 2 variables on US map?

 2 variables is a bit on the vague side for programming purposes.

 For example, you can map by the breakdown of income or family size.  
  How do you further breakdown based on the values of both variables  
  and show them on the county FIPS level?

 breakdown suggests a factor construct. If so, then :

 ?interaction

 But the show part of the question remains very vague.

  Can't you be a bit more specific? What DO you want?

 -- 
 David Winsemius, MD
 West Hartford, CT



Re: [R] nice report generator?

2011-12-14 Thread Greg Snow
Richard,

I have looked at SWord before, but to my knowledge it does not deal directly 
with the tabular objects created by the tables package (please correct me if I 
am wrong).  These objects do have a matrix as the main data, but the attributes 
are different from the usual dimnames.   There are functions in tables that 
will then take this structure and print it out in a nice way with the headers, 
or work with the latex function to create a nice table for LaTeX, but there are 
not (yet) tools for doing this in MS products.  SWord and R2wd (and other 
tools) could transfer the data just fine, but then the column and row 
names/headers would still need to be put in by hand, which negates the whole 
convenience of using these tools.  One option would be to have a function in 
tables that converted to a regular matrix with some form of meaningful dimnames 
that could then be used with R2wd or SWord or odfWeave or other tools (that was 
one of my suggestions).

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

From: Richard M. Heiberger [mailto:r...@temple.edu]
Sent: Wednesday, December 14, 2011 11:20 AM
To: Greg Snow
Cc: Duncan Murdoch; Tal Galili; r-help
Subject: Re: [R] nice report generator?

Greg,

Please look at the SWord package.  This package integrates MS Word with R
in a manner similar to the Sweave integration of LaTeX with R.
Download SWord from rcom.univie.ac.at (http://rcom.univie.ac.at)
If you have a recent download of RExcel from the RAndFriends installer, then
you will already have SWord on your machine.

Rich

On Wed, Dec 14, 2011 at 12:39 PM, Greg Snow greg.s...@imail.org wrote:
Duncan,

If you are taking suggestions for expanding the tables package (looks great) 
then I would suggest some way to get the tables into MS products.  If I create 
a full output/report myself then I am happy to work in LaTeX, but much of what 
I do is to produce tables and graphs to clients that don't know LaTeX and just 
want something that they can copy and paste into powerpoint or word.  For this 
I have been using the R2wd package (and the wdTable function for the tables).  
I would love to have some toolset that I could use your tables package to 
create the main table, then transfer it fairly simply to word or excel.  I 
don't care much about the fluff of how the table looks (coloring rows or 
columns, line widths, etc.) just getting it into a table (not just the text 
version).

One possibility is just an as.matrix method that would produce something that I 
could feed to wdTable.  Or just a textual representation of the table with 
columns separated by tabs so that it could be copied to the clipboard then 
pasted into excel or word (I would then let the client deal with all the tweaks 
on the appearance).
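A hedged sketch of the tab-separated route described above, using only base R. The "clipboard" connection is Windows-only; on other platforms, write to a .txt file and copy from there. The data here are made up for illustration.

```r
# Build a small named matrix and push it to the clipboard as
# tab-separated text, ready to paste into Excel or a Word table.
m <- matrix(round(rnorm(6), 2), nrow = 2,
            dimnames = list(c("A", "B"), c("x", "y", "z")))
write.table(m, file = "clipboard", sep = "\t",
            quote = FALSE, col.names = NA)  # NA keeps a blank corner cell
```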

Thanks,

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Duncan Murdoch
Sent: Thursday, December 08, 2011 4:52 PM
To: Tal Galili
Cc: r-help
Subject: Re: [R] nice report generator?

On 11-12-08 1:37 PM, Tal Galili wrote:
 Hello dear Duncan, Gabor, Michael and others,

 Do you think it could be (reasonably) possible to create a bridge between a
 cast_df object from the {reshape} package into a table in Duncan's new
 {tables} package?

I'm not that familiar with the reshape package (and neither it nor
reshape2 appears to have a vignette to give me an overview), so I don't
have any idea if that makes sense.  The table package is made to work on
dataframes, and only dataframes.  It converts them into matrices with
lots of attributes, so that the print methods can put nice labels on.
But it's strictly rectangular to rectangular in the kinds of conversions
it does, and from the little I know about reshape, it works on more
general arrays, converting them to and from dataframes.



 That would allow one to do pivot-table like operations on an object using
 {reshape}, and then display it (as it would have been in excel - or better)
 using the {tables} package.

You'll have to give an example of what you want to do.

Duncan Murdoch








 Contact
 Details:---
 Contact me: tal.gal...@gmail.com | 972-52-7275845
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
 www.r-statistics.com (English)
 --




 On Thu, Dec 8, 2011 at 5:24 PM, Michael comtech@gmail.com wrote:

 Hi folks,

 In addition to Excel style tables, it would be great to have Excel 2010
 Pivot Table in R...

 Any thoughts?

 Thanks a lot!

 On Thu, Dec 8, 2011 at 4:49 AM, Tal 
 Galilital.gal

Re: [R] nice report generator?

2011-12-14 Thread Greg Snow
There is also the problem that SWord's license does not allow for commercial 
use.  R2wd, write.table, and odfWeave don't have this restriction.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

From: Richard M. Heiberger [mailto:r...@temple.edu]
Sent: Wednesday, December 14, 2011 11:20 AM
To: Greg Snow
Cc: Duncan Murdoch; Tal Galili; r-help
Subject: Re: [R] nice report generator?

Greg,

Please look at the SWord package.  This package integrates MS Word with R
in a manner similar to the Sweave integration of LaTeX with R.
Download SWord from rcom.univie.ac.at (http://rcom.univie.ac.at)
If you have a recent download of RExcel from the RAndFriends installer, then
you will already have SWord on your machine.

Rich



 On Thu, Dec 8, 2011 at 5:24 PM, Michael comtech@gmail.com wrote:

 Hi folks,

 In addition to Excel style tables, it would be great to have Excel 2010
 Pivot Table in R...

 Any thoughts?

 Thanks a lot!

 On Thu, Dec 8, 2011 at 4:49 AM, Tal Galili tal.gal...@gmail.com wrote:

 I think it would be *great *if an extension of Duncan's new tables
 package could include themes and switches as are seen in the video Gabor
 just linked to.


 Tal


   On Thu, Dec 8, 2011 at 6:58 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

   On Wed, Dec 7, 2011 at 11:42 PM, 
 Michaelcomtech@gmail.commailto:comtech@gmail.com  wrote:
 Do you have an example...? Thanks a lot!

 See this video:
 http://www.woopid.com/video/1388/Format-as-Table

 --
 Statistics & Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com


Re: [R] axis thickness in plot()

2011-12-07 Thread Greg Snow
Often when someone wants lines (axes) in R plots to be thicker or thinner it is 
because they are producing the plots at the wrong size, then changing the size 
of the plot in some other program (like MSword) and the lines do not look as 
nice.  If this is your case, then the better approach is to produce the 
original graph at the appropriate size, then you don't need to worry about the 
effects of resizing.
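For the direct question: in base R, the lwd argument of axis() sets the width of both the axis line and its tick marks (lwd.ticks overrides the ticks alone), and box() redraws the plot frame so the two axes connect. A minimal sketch with made-up data:

```r
# Draw the plot without axes, then add thick axes and a thick frame.
plot(1:10, (1:10)^2, axes = FALSE, xlab = "x", ylab = "y")
axis(1, lwd = 3)   # thick x-axis line and tick marks
axis(2, lwd = 3)   # thick y-axis
box(lwd = 3)       # thick frame joining the axes
```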

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of AlexC
 Sent: Tuesday, December 06, 2011 9:35 AM
 To: r-help@r-project.org
 Subject: [R] axis thickness in plot()
 
 Hello,
 
 I am trying to increase the thickness of the axis in plot() without
 reverting to the use of paint programs
 
 i see posts on that topic for the xyplot function but want to see if i
 can
 do it with plot() because i've already setup my graph script using that
 
 i thought i could use axis() function and specify lwd=thickness or
 lwd.axis= but that does not work like it does for lwd.ticks
 
 If anyone has an idea, sincerely
 
 heres the script
 
 windows(width=7,height=7)
  plot(data$Winter,data$NbFirstBroods,ylab="number of breeding
  pairs",xlab="winter
  harshness",cex=1.5,cex.lab=1.5,cex.axis=1.5,font.axis=2,axes=FALSE)
  points(data$Winter,data$NbFirstBroods,cex=1.5,col="black",pch=19)
  abline(lm(data$NbFirstBroods~data$Winter),col="red",lwd=4)
 
 i tried axis(1, lwd.axis = 3,lwd.ticks=3) for example
 
 also when adding the y axis axis(2...) x and y axes are disconnected
 
 Thank you for your kind help in advance,
 
 Alexandre
 
 --
 View this message in context: http://r.789695.n4.nabble.com/axis-
 thickness-in-plot-tp4165430p4165430.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] Change the limits of a plot a posteriori

2011-12-06 Thread Greg Snow
The zoomplot function in the TeachingDemos package can be used for this (it 
actually redoes the entire plot, but with new limits).  This will generally 
work for a quick exploration, but for quality plots it is suggested to create 
the 1st plot with the correct range to begin with.
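A hedged sketch of that approach on the example from the question below; the exact argument form of zoomplot is assumed (see ?zoomplot in TeachingDemos):

```r
# Reproduce the clipped plot, then redraw it with wider y-limits.
library(TeachingDemos)
curve(dbeta(x, 2, 4))                        # limits fixed by first curve
curve(dbeta(x, 8, 13), add = TRUE, col = 2)  # second curve is clipped
zoomplot(c(0, 1), ylim = c(0, 4))            # redraw with new limits
```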

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of jcano
 Sent: Thursday, December 01, 2011 11:12 AM
 To: r-help@r-project.org
 Subject: [R] Change the limits of a plot a posteriori
 
 Hi all
 
 How can I change the limits (xlim or ylim) in a plot that has been
 already
 created?
 
 For example, consider this naive example
 curve(dbeta(x,2,4))
 curve(dbeta(x,8,13),add=T,col=2)
 
 When adding the second curve, it goes off the original limits computed
 by R
 for the first graph, which are roughly, c(0,2.1)
 
 I know two obvious solutions for this, which are:
 1) passing a sufficiently large parameter e.g. ylim=c(0,4) to the first
 graphic
 curve(dbeta(x,2,4),ylim=c(0,4))
 curve(dbeta(x,8,13),add=T,col=2)
 
 or
 
 2) switch the order in which I plot the curves
 curve(dbeta(x,8,13),col=2)
 curve(dbeta(x,2,4),add=T)
 
 but I guess if there is any way of adjusting the limits of the graphic
 a
 posteriori, once you have a plot with the undesired limits, forcing R
 to
 redraw it with the new limits, but without having to execute again the
 curve commands
 
 Hope I made myself clear
 
 Best regards and thank you very much in advance
 
 
 --
 View this message in context: http://r.789695.n4.nabble.com/Change-the-
 limits-of-a-plot-a-posteriori-tp4129750p4129750.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] rearrange set of items randomly

2011-11-14 Thread Greg Snow
If you don't want to go with the simple method mentioned by David and Ted, or 
you just want some more theory, you can check out: 
http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle and implement that.
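A minimal Fisher-Yates sketch in R that avoids sample(), since the assignment forbids it; the random index comes from runif() instead (not from the original thread).

```r
# Walk from the end of the vector, swapping each element with a
# uniformly chosen element at or before it.
fy_shuffle <- function(x) {
  if (length(x) < 2) return(x)     # nothing to shuffle
  for (i in length(x):2) {
    j <- ceiling(runif(1) * i)     # uniform index in 1..i
    tmp <- x[i]; x[i] <- x[j]; x[j] <- tmp
  }
  x
}
fy_shuffle(1:10)
```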

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of flokke
 Sent: Monday, November 07, 2011 2:09 PM
 To: r-help@r-project.org
 Subject: [R] rearrange set of items randomly
 
 Dear all,
 I hope that this question is not too weird, I will try to explain it as
 good
 as I can.
 
 I have to write a function for a school project and one limitation is
 that I
 may not use the in built function sample()
 
 At one point in the function I would like to resample/rearrange the
 items of
 my sample (so I would want to use sample, but I am not allowed to do
 so), so
 I have to come up with sth else that does the same as the in built
 function
 sample()
 
 The only thing that sample() does is rearranging the items of a sample,
 so I
 searched the internet for a function that does that to be able to use
 it,
 but I cannot find anything that could help me.
 
 Can maybe someone help me with this?
 I would be very grateful,
 
 Cheers,
 Maria
 
 --
 View this message in context: http://r.789695.n4.nabble.com/rearrange-
 set-of-items-randomly-tp4013723p4013723.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] drawing ellipses in R

2011-11-11 Thread Greg Snow
Those formulas are the standard way to convert from polar coordinates to 
Euclidean coordinates.  The polar coordinates are 'r' which is the radius or 
distance from the center point and 'theta' which is the angle (0 is pointing in 
the positive x direction).

If r is constant and theta covers a full cycle then you will get a circle.
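A quick illustration of the parametric form (radii chosen arbitrarily): a constant r traces a circle, and scaling the two coordinates differently gives an ellipse.

```r
# Polar-to-Euclidean conversion: x = r*cos(theta), y = r*sin(theta).
theta <- seq(0, 2 * pi, length.out = 200)
r <- 1
plot(r * cos(theta), r * sin(theta), type = "l", asp = 1)  # circle
lines(3 * cos(theta), 1.5 * sin(theta), col = 2)           # ellipse
```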

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of mms...@comcast.net
Sent: Monday, October 31, 2011 10:50 PM
To: r-h...@stat.math.ethz.ch
Subject: [R] drawing ellipses in R

Hello, 

I have been following the thread dated Monday, October 9, 2006 when Kamila 
Naxerova asked a question about plotting elliptical shapes. Can you explain the 
equations for X and Y? I believe they used the parametric form of x and y (x = r 
cos(theta), y = r sin(theta)). I don't know what r is here. Can you explain 1) the 
origin of these equations and 2) what is r? 

Sincerely, 
Mary A. Marion 

[[alternative HTML version deleted]]



Re: [R] multivariate random variable

2011-11-11 Thread Greg Snow
Your question is a bit too general to give a useful answer.  One possible 
answer to your question is:

 mrv <- matrix(runif(1000), ncol = 10)

Which generates multivariate random observations, but is unlikely to be what 
you are really trying to accomplish.  There are many tools for generating 
multivariate random data including Metropolis-Hastings, Gibbs sampling, 
rejection sampling, conditional generation, copulas, and many others, which one 
will be best (or which combination will be best) depends on what you are 
actually trying to accomplish.
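To make one item on that list concrete, here is a small sketch of conditional generation (distributions chosen arbitrarily for illustration): draw X from its marginal, then Y from its conditional distribution given X.

```r
# X ~ Exp(1); Y | X ~ Poisson(2X). The pair (X, Y) is a dependent,
# clearly non-normal bivariate sample.
set.seed(1)
x <- rexp(1000, rate = 1)
y <- rpois(1000, lambda = 2 * x)
mrv <- cbind(x, y)
cor(x, y)  # positive dependence induced by the conditioning
```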

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Anera Salucci
Sent: Tuesday, November 01, 2011 5:22 AM
To: r-help@r-project.org
Subject: [R] multivariate random variable

Dear All,
 
How can I generate multivariate random variable (not multivariate normal )
 
I am in urgent
[[alternative HTML version deleted]]



Re: [R] Export to .txt

2011-11-05 Thread Greg Snow
Look at the txtStart function in the TeachingDemos package.  It works like sink 
but also includes commands as well as output.  Though I have never tried it 
with browser() (and it does not always include the results of errors).

Another option in to use some type of editor that links with R such as 
emacs/ESS or tinn-R (or other) and then save the entire transcript.
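A hedged sketch of the txtStart approach (function names as in TeachingDemos; as noted above, behavior with browser() is untested, and the file name here is arbitrary):

```r
# Record both commands and their printed output to a transcript file.
library(TeachingDemos)
txtStart("transcript.txt")  # start recording
x <- rnorm(5)
summary(x)
txtStop()                   # close the transcript file
```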

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of stat.kk
Sent: Tuesday, November 01, 2011 4:15 PM
To: r-help@r-project.org
Subject: [R] Export to .txt

Hi,

I would like to export all my workspace (even with the evaluation of
commands) to a text file. I know about the sink() function but it doesn't
work as I would like. My R-function looks like this: there are instructions
for the user displayed by cat() commands and browser() commands for fulfilling
them. While using the sink() command the instructions don't display :(
Can anyone help me with a command equivalent to the File -> Save to file...
menu option? 

Thank you very much.

--
View this message in context: 
http://r.789695.n4.nabble.com/Export-to-txt-tp3965699p3965699.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] subplot strange behavoir

2011-10-24 Thread Greg Snow
I see the problem, I fixed this bug for version 2.8 of TeachingDemos, but have 
not submitted the new version to CRAN yet (I thought that I had fixed this 
earlier, but apparently it is still only in the development version).  An easy 
fix is to install version 2.8 from R-forge with
install.packages("TeachingDemos", repos="http://R-Forge.R-project.org")
and then it should work for you.

Sorry about not seeing this earlier.


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of emorway
 Sent: Friday, October 21, 2011 4:21 PM
 To: r-help@r-project.org
 Subject: Re: [R] subplot strange behavoir
 
 Hello Dr. Snow,
 
 With regard to your response from earlier this month:
 
 
 When I copy and paste your code I get what is expected, the 2 subplots
 line
 up on the same y-value.  What version of R are you using, which version
 of
 subplot? What platform?
 
 I'm still troubled by the fact that layout and subplot (from
 TeachingDemos)
 are not playing nicely together on my machine.  sessionInfo():
 
 sessionInfo()
 #R version 2.13.2 (2011-09-30)
 #Platform: x86_64-pc-mingw32/x64 (64-bit)
 #other attached packages:
 #[1] TeachingDemos_2.7
 
 I'd really like to get this working on my machine as it seems to be
 working
 on yours.  While I previously tried a simply example for the initial
 forum
 post, I'm curious if the real plot I'm trying to make works on your
 machine.
 Should you happen to have a spare moment and I'm not pushing my luck,
 I've
 attached 4 small data files, 1 text file containing the R commands I'm
 trying to run (including 'layout' and 'subplot' called
 R_Commands_Plot_MT3D_Analytical_Comparison_For_Paper.txt), and the
 incorrect tiff output I'm getting on my machine.  I've directed all
 paths in
 the R code to c:/temp/  so everything should quickly work if files are
 dropped there.  Should it work on your machine as we would expect, does
 anything come to mind for how to fix it on my machine?
 
 Very Respectfully,
 Eric
 
 http://r.789695.n4.nabble.com/file/n3926941/AnalyticalDissolvedSorbedCo
 ncAt20PoreVols.txt
 AnalyticalDissolvedSorbedConcAt20PoreVols.txt
 http://r.789695.n4.nabble.com/file/n3926941/AnalyticalEffluentConcentra
 tion.txt
 AnalyticalEffluentConcentration.txt
 http://r.789695.n4.nabble.com/file/n3926941/Conc_Breakthru_at_100cm.txt
 Conc_Breakthru_at_100cm.txt
 http://r.789695.n4.nabble.com/file/n3926941/Conc_Profile_20T.txt
 Conc_Profile_20T.txt
 http://r.789695.n4.nabble.com/file/n3926941/R_Commands_Plot_MT3D_Analyt
 ical_Comparison_For_Paper.txt
 R_Commands_Plot_MT3D_Analytical_Comparison_For_Paper.txt
 http://r.789695.n4.nabble.com/file/n3926941/NonEquilibrium_ForPaper.tif
 NonEquilibrium_ForPaper.tif
 
 --
 View this message in context: http://r.789695.n4.nabble.com/subplot-
 strange-behavoir-tp3875917p3926941.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] Irregular 3d objects with rgl

2011-10-15 Thread Greg Snow
You could use the rgl package and plot a sprite at each of your points with the 
color based on the concentration:

plume$col <- cut(plume$conc, c(-1,0.01,0.02,0.3,0.7,1), 
labels=c('blue','green','yellow','orange','red'))


plume2 <- plume
theta <- atan2(plume2$y-mean(plume2$y), plume2$x-mean(plume2$x))
slice <- pi/4 < theta & theta < 3*pi/4   # wedge to cut away
plume2$y[slice] <- plume2$y[slice] + 3 

library(rgl)
open3d()
sprites3d( plume2$x, plume2$y, plume2$z, color=as.character(plume$col),
 lit=FALSE, radius=1)

It looks better with more points in each direction and a smaller radius on the 
sprites.


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of emorway
Sent: Friday, October 14, 2011 6:15 PM
To: r-help@r-project.org
Subject: [R] Irregular 3d objects with rgl

Hello, 

While exploring if rgl is along the lines of what I need, I checked out
demo(rgl) and didn't find quite what I'm looking for, and am therefore
seeking additional help/suggestions.

The application is geared towards displaying a 3D rendering of a contaminant
plume in the subsurface with the following highlights:  Once the plume was
rendered as a 3D object, a pie-like wedge could be removed (or cut away)
exposing the higher concentrations within the plume as 'hotter' colors. 
About the closest example I could find is here: 

http://mclaneenv.com/graphicdownloads/plume.jpg 

Whereas this particular rendering shows a bullet-like object where 3/4 of
the object is removed, I would like to try and show something where 3/4 of
the object remains, and where the object has been cut away the colors would
show concentrations within the plume, just as in the example referenced
above.  It would seem most software capable of this type of thing is
proprietary (and perhaps for good reason if it is a difficult problem to
solve). 

I've put together a very simple 6x6x6 cube with non-zero values internal to
it representing the plume.  I wondering if an isosurface where conc = 0.01
can be rendered in 3D and then if a bite or wedge can be removed from the
3d object exposing the higher concentrations inside as discussed above?

# grid coordinates for the 6x6x6 cube, written compactly with rep();
# equivalent to the original explicit vectors
x <- rep(1:6, times = 36)

y <- rep(rep(1:6, each = 6), times = 6)

z <- rep(1:6, each = 36)

conc <- c(0,0,0,0,0,0,0,0,0.1,0.1,0,0,0,0.1,1,1,0.1,0,0,0.1,0.5,1,0.1,0,0,0,0.2,0.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.05,0.1,0,0,0,0.05,0.8,0.8,0.05,0,0,0.05,0.4,0.8,0.05,0,0,0,0.1,0.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.05,0,0,0,0,0.6,0.6,0.02,0,0,0,0.2,0.5,0.02,0,0,0,0.05,0.05,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.05,0.2,0,0,0,0,0,0.05,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.02,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)

plume <- data.frame(x = x, y = y, z = z, conc = conc)

if it helps to view the concentrations in layer by layer tabular form:
Layer 1
0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.10 0.10 0.00 0.00
0.00 0.10 1.00 0.50 0.20 0.00
0.00 0.10 1.00 1.00 0.20 0.00
0.00 0.00 0.10 0.10 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00
Layer 2 
0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.05 0.05 0.00 0.00
0.00 0.05 0.80 0.40 0.10 0.00
0.00 0.10 0.80 0.80 0.10 0.00
0.00 0.00 0.05 0.05 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00
Layer 3 
0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.60 0.20 0.05 0.00
0.00 0.05 0.60 0.50 0.05 0.00
0.00 0.00 0.02 0.02 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00
Layer 4 
0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.05 0.00 0.00 0.00
0.00 0.00 0.20 0.05 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00
Layer 5 
0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00 0.00 0.00
0.00 

Re: [R] US States percentage change plot

2011-10-13 Thread Greg Snow
Unless your audience is mainly interested in Texas and California and is 
completely content to ignore Rhode Island, then I would suggest that you look 
at the state.vbm map in the TeachingDemos package that works with the maptools 
package.  The example there shows coloring based on a variable.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Michael Charles Bailey I
 Sent: Wednesday, October 12, 2011 6:46 PM
 To: r-help@r-project.org
 Subject: [R] US States percentage change plot
 
 Hi, I would like to make a plot of the US states (or lower 48) that are
 colored based upon a percentage change column. Ideally, it would
 gradually
 be more blue the larger the positive change, and more red the more
 negative
 is the change.
 
 The data I have looks like:
 
         State Percent.Change
  1    Alabama    0.004040547
  2     Alaska   -0.000202211
  3    Arizona   -0.002524567
  4   Arkansas   -0.008525333
  5 California    0.001828754
  6   Colorado    0.06150
 
 I have read help for the maps library and similar plots online but
 can't
 grasp how to map the percentage.change column to the map. thank in
 advance,
 
 Michael Bailey
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] monotonic factors

2011-10-12 Thread Greg Snow
One approach would be to code dummy variables for your factor levels, have d1 
equal to 0 for 'low' and 1 for 'med' and 'high', then have d2 equal to 1 for 
'high' and 0 otherwise.  For linear regression there are functions that will 
fit a model with all non-negative coefficients, but I don't know of anything 
like that for glms, so one option is to fit with all the dummy variables, then 
if any of the estimated coefficients are negative remove that variable (force 
it to 0) and refit.
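As a sketch of that dummy coding (on simulated data mirroring Jeffrey's example quoted below; variable names d1, d2 are as described above):

```r
## Monotone coding of an ordered 3-level factor for a binomial glm:
## d1 = step up from 'low', d2 = additional step up to 'high'.
set.seed(42)
n <- 100
strings <- sample(c("low", "med", "high"), n, TRUE)
y <- rbinom(n, 1, ifelse(strings == "low", 0.4,
                  ifelse(strings == "med", 0.3, 0.2)))

d1 <- as.numeric(strings %in% c("med", "high"))  # 1 for med and high
d2 <- as.numeric(strings == "high")              # 1 for high only

fit <- glm(y ~ d1 + d2, family = binomial)
coef(fit)
## A monotone decreasing effect means both d1 and d2 coefficients should
## be <= 0; if one comes out positive, drop that dummy (force it to 0)
## and refit, e.g.:
## fit2 <- glm(y ~ d1, family = binomial)
```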


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Jeffrey Pollock
 Sent: Wednesday, October 12, 2011 6:29 AM
 To: r-help@r-project.org
 Subject: [R] monotonic factors
 
 Hello all,
 
 
 
 I have an ordered factor that I would like to include in the linear
 predictor of a binomial glm, where the estimated coefficients are
 constrained to be monotonic. Does anyone know how to do this? I've
 tried
 using an ordered factor but this does not have the desired effect, an
 (artificial) example of this follows;
 
 
 
 n <- 100

 strings <- sample(c("low", "med", "high"), n, TRUE)

 x.ordered <- ordered(strings, c("low", "med", "high"))

 x.unordered <- factor(strings)

 pr <- ifelse(strings == "low", 0.4, ifelse(strings == "med", 0.3, 0.2))

 y <- rbinom(n, 1, pr)

 mod.ordered <- glm(y ~ x.ordered, binomial)

 mod.unordered <- glm(y ~ x.unordered, binomial)
 
 
 
 summary(mod.ordered)
 
 summary(mod.unordered)
 
 


Re: [R] Chi-Square test and survey results

2011-10-12 Thread Greg Snow
The chisq.test function is expecting a contingency table, basically one column 
should have the count of respondents and the other column should have the count 
of non-respondents (yours looks like it is the total instead of the 
non-respondents), so your data is wrong to begin with.  A significant 
chi-square here just means that the proportion responding differs in some of 
the regions, that does not mean that the sample is representative (or not 
representative).  What is more important (and not in the data or standard 
tests) is if there is a relationship between why someone chose to respond and 
the outcomes of interest.

If you are concerned with different proportions responding then you could do 
post-stratification to correct for the inequality when computing other 
summaries or tests (though region 6 will still give you problems, you will need 
to make some assumptions, possibly combine it with another region that is 
similar).

Throwing away data is rarely, if ever, beneficial.
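The recasting Greg describes takes only a couple of lines: the second column of the table passed to chisq.test should be the non-respondents, i.e. the regional totals minus the respondents (values transcribed from the table in George's post).

```r
## Build the contingency table chisq.test actually expects:
## respondents versus NON-respondents, per region.
ByRegion <- data.frame(
  All.Employees      = c(735, 500, 897, 717, 167, 309, 806, 627,
                         858, 851, 336, 1823, 80, 774, 561, 834),
  Survey.Respondents = c(142, 83, 78, 133, 48, 0, 125, 122,
                         177, 160, 52, 312, 9, 121, 24, 134))

tab <- cbind(Respondents    = ByRegion$Survey.Respondents,
             NonRespondents = ByRegion$All.Employees -
                              ByRegion$Survey.Respondents)

## Tests whether the response *rate* differs across regions, which is
## not the same as the sample being representative.
chisq.test(tab)
```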



 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of ghe...@mathnmaps.com
 Sent: Tuesday, October 11, 2011 1:32 PM
 To: r-help@r-project.org
 Subject: [R] Chi-Square test and survey results
 
 An organization has asked me to comment on the validity of their
 recent all-employee survey.  Survey responses, by geographic region,
 compared
 with the total number of employees in each region, were as follows:
 
  ByRegion
            All.Employees Survey.Respondents
  Region_1            735                142
  Region_2            500                 83
  Region_3            897                 78
  Region_4            717                133
  Region_5            167                 48
  Region_6            309                  0
  Region_7            806                125
  Region_8            627                122
  Region_9            858                177
  Region_10           851                160
  Region_11           336                 52
  Region_12          1823                312
  Region_13            80                  9
  Region_14           774                121
  Region_15           561                 24
  Region_16           834                134
 
 How well does the survey represent the employee population?
 Chi-square test says, not very well:
 
  chisq.test(ByRegion)
 
  Pearson's Chi-squared test
 
 data:  ByRegion
 X-squared = 163.6869, df = 15, p-value < 2.2e-16
 
 By striking three under-represented regions (3,6, and 15), we get
 a more reasonable, although still not convincing, result:
 
  chisq.test(ByRegion[setdiff(1:16,c(3,6,15)),])
 
  Pearson's Chi-squared test
 
 data:  ByRegion[setdiff(1:16, c(3, 6, 15)), ]
 X-squared = 22.5643, df = 12, p-value = 0.03166
 
 This poses several questions:
 
 1)  Looking at a side-by-side barchart (proportion of responses vs.
 proportion of employees, per region), the pattern of survey responses
 appears, visually, to match fairly well the pattern of employees.  Is
 this a case where we trust the numbers and not the picture?
 
 2) Part of the problem, ironically, is that there were too many
 responses
 to the survey.  If we had only one-tenth the responses, but in the same
 proportions by region, the chi-square statistic would look much better,
 (though with a warning about possible inaccuracy):
 
 data:  data.frame(ByRegion$All.Employees, 0.1 *
 (ByRegion$Survey.Respondents))
 X-squared = 17.5912, df = 15, p-value = 0.2848
 
 Is there a way of reconciling a large response rate with an
 unrepresentative
 response profile?  Or is the bad news that the survey will give very
 precise
 results about a very ill-specified sub-population?
 
 (Of course, I would put it in softer terms, like "you need to assess
 the degree of homogeneity across different regions".)
 
 3) Is Chi-squared really the right measure of how representative is the
 survey?
 
  
 
 Thanks for any help you can give - hope these questions make sense -
 
 George H.
 


Re: [R] stop()

2011-10-11 Thread Greg Snow
Replace stop() with break to see if that does what you want.  (You may also 
want to include a cat() or warning() call to indicate the early stopping.)
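Concretely, Harold's toy function with break substituted for stop() (a sketch; warning() flags the early exit, and the partial result is still returned):

```r
## Toy function: break leaves the loop early but keeps 'result',
## unlike stop(), which would throw an error and return nothing.
myFun <- function(x, max.iter = 5) {
    result <- NULL
    for (i in 1:10) {
        result <- x + i
        if (i == max.iter) {
            warning("Max iterations reached")
            break   # exit the loop, keep the partial result
        }
    }
    result
}

myFun(10, max.iter = 4)   # returns 14 (10 + 4), with a warning
```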



 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Doran, Harold
 Sent: Tuesday, October 11, 2011 11:32 AM
 To: r-help@r-project.org
 Subject: [R] stop()
 
 Suppose I have a function, such as the toy example below:
 
 myFun <- function(x, max.iter = 5) {
     for (i in 1:10) {
         result <- x + i
         iter <- i
         if (iter == max.iter) stop('Max reached')
     }
     result
 }
 
 I can of course do this:
 myFun(10, max.iter = 11)
 
 However, if I reach the maximum number of iterations before my
 algorithm has finished (in my real application there are EM steps for
 a mixed model), I actually want the function to return the value of
 result up to that point. Currently using stop(), I would get
 
  myFun(10, max.iter = 4)
 Error in myFun(10, max.iter = 4) : Max reached
 
 But, in this toy case the function should return the value of result
 up to iteration 4.
 
 Not sure how I can adjust this.
 
 Thanks,
 Harold
 
 
 


Re: [R] How to draw 4 random weights that sum up to 1?

2011-10-10 Thread Greg Snow
You probably want to generate data from a Dirichlet distribution.  There are 
some functions in packages that will do this and give you more background, or 
you can just generate 4 numbers from an exponential (or gamma) distribution and 
divide them by their sum.
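A minimal sketch of that second recipe: normalised exponentials are a flat Dirichlet(1,...,1), uniform over the simplex (the function name `rweights` is made up here).

```r
## Draw k random weights that sum to 1 by normalising exponentials;
## this is a draw from the flat Dirichlet(1, ..., 1) distribution.
rweights <- function(k = 4) {
    g <- rexp(k, 1)   # equivalently rgamma(k, shape = 1)
    g / sum(g)        # divide by the sum so the weights add to 1
}

set.seed(1)
w <- rweights()
w
sum(w)   # 1, up to floating point
```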



 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Alexander Engelhardt
 Sent: Monday, October 10, 2011 10:11 AM
 To: r-help
 Subject: [R] How to draw 4 random weights that sum up to 1?
 
 Hey list,
 
 This might be a more general question and not that R-specific. Sorry
 for
 that.
 
 I'm trying to draw a random vector of 4 numbers that sum up to 1.
 My first approach was something like
 
a - runif(1)
b - runif(1, max=1-a)
c - runif(1, max=1-a-b)
d - 1-a-b-c
 
 but this kind of distorts the results, right?
 Would the following be a good approach?
 
w - sample(1:100, 4, replace=TRUE)
w - w/sum(w)
 
 I'd prefer a general algorithm-kind of answer to a specific R function
 (if there is any). Although a function name would help too, if I can
 sourcedive.
 
 Thanks in advance,
   Alex
 


Re: [R] How to draw 4 random weights that sum up to 1?

2011-10-10 Thread Greg Snow
As an interesting extension to David's post, try:

M4.e <- matrix(rexp(4,1), ncol=4)

Instead of the uniform and rerun the rest of the code (note the limits on the 
x-axis).

With 3 dimensions and the restriction we can plot in 2 dimensions to compare:

library(TeachingDemos)

m3.unif <- matrix(runif(3000),  ncol=3)
m3.unif <- m3.unif/rowSums(m3.unif)

m3.exp  <- matrix(rexp(3000,1), ncol=3)
m3.exp  <- m3.exp/rowSums(m3.exp)


dev.new()
triplot(m3.unif)

dev.new()
triplot(m3.exp)

now compare the 2 plots on the density of the points near the corners.




 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of David Winsemius
 Sent: Monday, October 10, 2011 12:05 PM
 To: Uwe Ligges
 Cc: r-help; Alexander Engelhardt
 Subject: Re: [R] How to draw 4 random weights that sum up to 1?
 
 
 On Oct 10, 2011, at 12:44 PM, Uwe Ligges wrote:
 
 
 
  On 10.10.2011 18:10, Alexander Engelhardt wrote:
  Hey list,
 
  This might be a more general question and not that R-specific.
  Sorry for
  that.
 
  I'm trying to draw a random vector of 4 numbers that sum up to 1.
  My first approach was something like
 
   a <- runif(1)
   b <- runif(1, max=1-a)
   c <- runif(1, max=1-a-b)
   d <- 1-a-b-c
 
  but this kind of distorts the results, right?
  Would the following be a good approach?
 
   w <- sample(1:100, 4, replace=TRUE)
   w <- w/sum(w)
 
  Yes, although better combine both ways to
 
   w <- runif(4)
   w <- w / sum(w)
 
 For the non-statisticians in the audience like myself who didn't know
 what that distribution might look like (it being difficult to
 visualize densities on your 3-dimensional manifold in 4-space),  here
 is my effort to get an appreciation:
 
   M4 <- matrix(runif(4), ncol=4)
   M4 <- M4/rowSums(M4)
 # just a larger realization of Ligges' advice
   colMeans(M4)
 [1] 0.2503946 0.2499594 0.2492118 0.2504342
   plot(density(M4[,1]))
   lines(density(M4[,2]), col="red")
   lines(density(M4[,3]), col="blue")
   lines(density(M4[,4]), col="green")
 
 plot(density(rowSums(M4[,1:2])))
 
   plot(density(rowSums(M4[,1:3])))
 plot(density(rowSums(M4[,2:4])))
 
 # rather kewl results, noting that these are a reflection around 0.5 of
 the single vector densities.
 
 
  Uwe Ligges
 
 
 
  I'd prefer a general algorithm-kind of answer to a specific R
  function
  (if there is any). Although a function name would help too, if I can
  sourcedive.
 
 --
 
 David Winsemius, MD
 West Hartford, CT
 


Re: [R] subplot strange behavoir

2011-10-05 Thread Greg Snow
When I copy and paste your code I get what is expected, the 2 subplots line up 
on the same y-value.  What version of R are you using, which version of 
subplot? What platform?



 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of emorway
 Sent: Wednesday, October 05, 2011 1:40 PM
 To: r-help@r-project.org
 Subject: [R] subplot strange behavoir
 
 Hello,
 
 Below is some example code that should reproduce an error I'm
 encountering
 while trying to create a tiff plot with two subplots.  If I run just
 the
 following bit of code through the R GUI the result is what I'd like to
 have
 appear in the saved tiff image:
 
 x <- seq(0:20)
 y <- c(1,1,2,2,3,4,5,4,3,6,7,1,1,2,2,3,4,5,4,3,6)
 plot(x, y, type="l", las=1, ylim=c(0,12))
 subplot(edm.sub(x[seq(1:5)], y[seq(1:5)]), x=4, y=9, size=c(1,1.5))
 subplot(edm.sub(x[seq(15,20,by=1)], y[seq(15,20,by=1)]), x=17, y=9, size=c(1,1.5))
 
 However, if expanding on this code with:
 
 edm.sub <- function(x, y){plot(x, y, col="red", frame.plot=F,
     las=1, xaxs="i", yaxs="i", type="b",
     ylim=c(0,6), xlab="", ylab="")}
 
 png("c:/temp/lookat.tif", res=120, height=600, width=1200)
 layout(matrix(c(1,2),2,2,byrow=TRUE), c(1.5,2.5), respect=TRUE)
 plot(seq(1:10), seq(1:10), type="l", las=1, col="blue")
 plot(x, y, type="l", las=1, ylim=c(0,12))
 subplot(edm.sub(x[seq(1:5)], y[seq(1:5)]), x=4, y=9, size=c(1,1.5))
 subplot(edm.sub(x[seq(15,20,by=1)], y[seq(15,20,by=1)]), x=17, y=9, size=c(1,1.5))
 dev.off()
 
 One will notice the second subplot is out of position (notice the
 y-coordinate is the same for both subplots...y=9):
 http://r.789695.n4.nabble.com/file/n3875917/lookat.png
 
 If I try to 'guess' a new y-coordinate for the second subplot, say
 y=10:
 
 png("c:/temp/lookat.tif", res=120, height=600, width=1200)
 layout(matrix(c(1,2),2,2,byrow=TRUE), c(1.5,2.5), respect=TRUE)
 plot(seq(1:10), seq(1:10), type="l", las=1, col="blue")
 plot(x, y, type="l", las=1, ylim=c(0,12))
 subplot(edm.sub(x[seq(1:5)], y[seq(1:5)]), x=4, y=9, size=c(1,1.5))
 subplot(edm.sub(x[seq(15,20,by=1)], y[seq(15,20,by=1)]), x=17, y=10, size=c(1,1.5))
 dev.off()
 
 R kicks back the following message
 Error in plot.new() : plot region too large
 
 Am I mis-using subplot?
 
 Thanks, Eric
 
 --
 View this message in context: http://r.789695.n4.nabble.com/subplot-
 strange-behavoir-tp3875917p3875917.html
 Sent from the R help mailing list archive at Nabble.com.
 

