date:20090708

Re: [R] Uncorrelated random vectors - Thank you!

2009-07-08 Thread Stein, Luba (AIM SE)

Thanks to all for the answers! I solved my problem now by sufficient iteration!

Have a nice day!
Luba 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Thanks: Mathematical annotation axis in lattice

2009-07-08 Thread Coster, Albart


Hello,

thanks for the two replies. The following code worked as expected:

pos <- 1:10
lab <- letters[pos]

ll <- parse(text = paste(pos, "*phi[", lab, "]", sep = ""))

xyplot(1:10 ~ 1:10, scales = list(x = list(labels = ll, at = 1:10)))
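
For readers who would rather not build strings for parse(), the same label vector can probably also be assembled from language objects with bquote(). This is only a sketch and not part of the original replies: the helper name make_label is made up for illustration, and the final plot assumes the lattice package is installed.

```r
pos <- 1:10
lab <- letters[pos]

# make_label is a hypothetical helper: it returns the unevaluated call  p * phi[l]
make_label <- function(p, l) bquote(.(p) * phi[.(as.name(l))])

# One call per tick mark, collected into an expression vector
ll <- as.expression(mapply(make_label, pos, lab, SIMPLIFY = FALSE))

# Draw the plot only if lattice is available
if (requireNamespace("lattice", quietly = TRUE)) {
  print(lattice::xyplot(1:10 ~ 1:10,
                        scales = list(x = list(labels = ll, at = 1:10))))
}
```

The as.name() call turns each letter into a symbol, so the subscript is typeset the same way as in the parse() version.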

Best regards,

Albart





-Original Message-
From: Coster, Albart
Sent: Tue 7/7/2009 1:27 PM
To: r-help-requ...@r-project.org
Subject: Mathematical annotation axis in lattice
 
Dear list,

making mathematical expressions in plots is not difficult: expression(phi[1]) 
for example. At this moment I am stuck in creating a vector of expressions:

pos <- 1:10
lab <- letters[pos]

Now, I would like to create a vector of expressions which I could use for 
labeling the x-axis of a lattice plot. 

ll <- as.expression(paste(pos, "phi[", lab, "]", sep = ""))

xyplot(1:10 ~ 1:10, scales = list(x = list(labels = ll, at = 1:10)))

does not work. I read about the function substitute, but that did not solve it.

Could you recommend me how I should do this? 

Thanks in advance,

Albart Coster

Wageningen Universiteit
Netherlands


[R] Rdsm, a DSM package for parallel R programming

2009-07-08 Thread Norm Matloff

As I mentioned last week, I've been developing a package that I call
Rdsm (R distributed shared memory), modeled after a similar package,
PerlDSM, I wrote for Perl some years ago.  It is now in alpha form, so
I'm not uploading to CRAN yet, but it is definitely usable, and I am
releasing it at 

http://heather.cs.ucdavis.edu/~matloff/R/Rdsm

I hope many try it out, and give me some feedback.

Note that the word "distributed" here means that the memory is not
really shared, but instead is an abstraction, to give the programmer a
shared-memory view even though the program may be running on several
separate machines.  For C/C++ this is generally accomplished by
manipulation of the virtual memory hardware.  For R, I do this by
redefining functions such as "[" and "[<-" for a new class.
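
To illustrate the general mechanism only (this is not Rdsm's code), here is a minimal S3 sketch of redefining "[" and "[<-" for a new class. The class name dsmvec and the constructor new_dsmvec are hypothetical, and an ordinary environment stands in for whatever remote store a real DSM layer would talk to.

```r
# Hypothetical constructor: the data lives in an environment, which plays
# the role of the shared store that every read and write must go through.
new_dsmvec <- function(n) {
  store <- new.env()
  store$data <- numeric(n)
  store$reads <- 0L
  store$writes <- 0L
  structure(list(store = store), class = "dsmvec")
}

# Read access: v[i] is routed here instead of to the default "["
"[.dsmvec" <- function(x, i) {
  store <- unclass(x)$store
  store$reads <- store$reads + 1L   # a real DSM layer would fetch remotely here
  store$data[i]
}

# Write access: v[i] <- value is routed here instead of to the default "[<-"
"[<-.dsmvec" <- function(x, i, value) {
  store <- unclass(x)$store
  store$writes <- store$writes + 1L # a real DSM layer would send the update here
  store$data[i] <- value
  x
}

v <- new_dsmvec(5)
v[2] <- 10   # dispatches to "[<-.dsmvec"
v[2]         # dispatches to "[.dsmvec"
```

Because the store is an environment, every copy of v sees the same data, which is the shared-memory view in miniature.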

Rdsm is intended as an alternative for those who favor the shared-memory
view of things.  In the parallel processing community, there has always
been a debate between advocates of the two main programming paradigms,
shared memory and message passing.  Shared memory advocates claim
greater clarity of code, while the message passing people point to that
paradigm's greater flexibility.  I happen to be of the shared-memory
school.  Given the popularity of OpenMP for C/C++/FORTRAN, I believe
Rdsm will be of interest to many for R.  Indeed, in the next few months,
I will be extending Rdsm with functions that give it the look and feel
of OpenMP.

Norm Matloff
UC Davis



- End forwarded message -


Re: [R] can't get rJava to install on Linux

2009-07-08 Thread Dirk Eddelbuettel


On 7 July 2009 at 21:28, Mark Kimpel wrote:
| Having difficulties getting rJava to install on my Debian Squeeze box.

Did you try the binary package? A simple

 sudo apt-get install r-cran-rjava

should do; if not, you can at least install its Build-Depends via

 sudo apt-get build-dep r-cran-rjava
 
Hth, Dirk 

-- 
Three out of two people have difficulties with fractions.


[R] clogit comparison between Stata and R

2009-07-08 Thread David Hugh-Jones

Hello all

I'm moving back and forth between stata and R at the moment - of course,
using R whenever possible :-)

I'm running conditional logits on some panel data and I get slightly
different results and different N in the two programs.

In R I run
clogit(trans.dem ~ I(avg.gle_rgdp.500/gle_rgdp) + log(gle_rgdp) +
timesince.dem + I(timesince.dem^2) + timesince.dict + I(timesince.dict^2) +
p_polity2 + I(p_polity2^2)  + strata(ccodecow) + cluster(ccodecow),
method="approximate", data=univ)

and I get an n of 3747.

In Stata, I run
clogit trans_dem avg_gle_rgdp_ratio loggle_rgdp timesince_dem
timesince_demsq timesince_dict timesince_dictsq p_polity2 pol2sq,
group(ccodecow) vce(cluster ccodecow)

which I hope is the same model. I get a message "29 groups (935 obs) dropped
because of all positive or all negative outcomes", and an n of 2812. Also,
the coefficients are slightly different.

I understand why Stata is dropping the groups with all outcomes the same...
this is inevitable in a conditional logit, right? Is R doing the same? And
what might be the cause of the difference in coefficients?
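
Groups whose outcome never varies contribute nothing to the conditional likelihood, so counting them is a quick way to check whether the difference in N is only a difference in what gets reported. A minimal sketch on made-up data follows; grp and y are hypothetical stand-ins for ccodecow and trans.dem, and nothing here calls clogit itself.

```r
# Toy panel: four groups of four observations each (illustrative data only)
univ_toy <- data.frame(
  grp = rep(c("A", "B", "C", "D"), each = 4),
  y   = c(0, 1, 0, 0,   0, 0, 0, 0,   1, 1, 1, 1,   1, 0, 1, 0)
)

# TRUE for groups in which the outcome is all 0 or all 1
constant <- tapply(univ_toy$y, univ_toy$grp,
                   function(v) length(unique(v)) == 1)

n_groups_constant <- sum(constant)
n_obs_constant    <- sum(univ_toy$grp %in% names(constant)[constant])
n_informative     <- nrow(univ_toy) - n_obs_constant

c(groups = n_groups_constant, obs = n_obs_constant, informative = n_informative)
```

Applied to the real data, the first two numbers can be compared with Stata's "29 groups (935 obs)" message, and the third with its N.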

Cheers
David Hugh-Jones
Post-doctoral Researcher
Max Planck Institute of Economics, Jena
http://davidhughjones.googlepages.com


Re: [R] how to read point shp file to R?

2009-07-08 Thread David Hugh-Jones

You might also find the R wiki useful:
http://wiki.r-project.org/rwiki/doku.php?id=tips:spatial-data
http://wiki.r-project.org/rwiki/doku.php?id=tips:stats-spatial

David Hugh-Jones
Post-doctoral Researcher
Max Planck Institute of Economics, Jena
http://davidhughjones.googlepages.com


2009/7/7 Sunny sunshineab...@gmail.com

 I am new to R and want to do some analysis with a point vector data file.
 Any help is appreciated.  Sunny


[R] R 2.9.0 plot still forcing current time zone

2009-07-08 Thread Britton Stephens


The help page for plot.POSIXct says

As from R 2.9.0 the date-times for a 'POSIXct' input are
interpreted in the timezone given by the 'tzone' attribute if
there is one, otherwise the current timezone.  (Earlier versions
always used the current timezone.)

however I am using 2.9.0 on linux and the following still happily 
produces an x-axis in local (MDT) time


> x=strptime(paste('09-01-01 00:00:00',sep=''),format='%y-%m-%d %H:%M:%S',tz="GMT")+60*60*24*(seq(0.5,1.5,.1))

> x
 [1] "2009-01-01 12:00:00 GMT" "2009-01-01 14:24:00 GMT"
 [3] "2009-01-01 16:48:00 GMT" "2009-01-01 19:12:00 GMT"
 [5] "2009-01-01 21:36:00 GMT" "2009-01-02 00:00:00 GMT"
 [7] "2009-01-02 02:24:00 GMT" "2009-01-02 04:48:00 GMT"
 [9] "2009-01-02 07:12:00 GMT" "2009-01-02 09:36:00 GMT"
[11] "2009-01-02 12:00:00 GMT"
> attributes(x)
$class
[1] "POSIXt"  "POSIXct"

$tzone
[1] "GMT"

> plot(x,rep(1,11))

Is this a bug, or am I missing something?  Thanks a lot!
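
Whatever the cause turns out to be, one workaround that does not depend on plot() choosing the time zone is to suppress the default x-axis and label it by hand. This is a sketch under that assumption, using only base graphics; the label format "%d %H:%M" is an arbitrary choice.

```r
# Same eleven time points as above, built directly as POSIXct in GMT
x <- as.POSIXct("2009-01-01 00:00:00", tz = "GMT") +
  60 * 60 * 24 * seq(0.5, 1.5, 0.1)

# Format the tick labels explicitly in GMT, independent of the local zone
gmt_labels <- format(x, "%d %H:%M", tz = "GMT")

# Draw the points without an x-axis, then add the axis with our own labels
plot(x, rep(1, 11), xaxt = "n", xlab = "Time (GMT)")
axis(1, at = as.numeric(x), labels = gmt_labels)
```

Some labels may be dropped by axis() if they would overlap; passing a subset of x to at and labels thins them out.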
Britt

--
Britton B. Stephens
National Center for Atmospheric Research
P.O. Box 3000, 1850 Table Mesa Drive
Boulder, CO  80307-3000
Phone: (303) 497-1018
Fax: (303) 497-1092


Re: [R] Dump plots to powerpoint?

2009-07-08 Thread Barry Rowlingson

On Tue, Jul 7, 2009 at 11:32 PM, Ben Bolker bol...@ufl.edu wrote:


  Why not directly generate a large PNG file (which will be much better
 for line art than JPG anyway)?  Or EMF?

  See http://wiki.r-project.org/rwiki/doku.php?id=tips:graphics-misc:export

  [Of course, this doesn't answer the original question ... to which I
 suspect the answer is no.]

 So image generation is done, now we want to put them all into a
presentation (One image per slide? Titles?)

 Suggestions:

 1. Dump Powerpoint, learn LaTeX and beamer, your audience will be
happy. Including a bunch of image files? Trivial.

 2. Dump Powerpoint, use OpenOffice - the OO Impress file is a zip
file, one file of which is an XML description of the presentation, so
then you just have to create an XML file a bit like that that
specifies all your images. You could do this in R. It just needs a bit
of simple reverse engineering. Create a simple presentation like the
one you want to do with a few images in, then save, then unzip it,
figure it out, write a little template (using R's brew package
perhaps), then write a new XML file with all your images specified,
zip up, job done. Save it from OpenOffice as a Powerpoint file if you
really need to use Powerpoint.

 3. Okay, so you really want to use Powerpoint, in which case the
latest file format (the one with the 'x' at the end) should be some
kind of XML file which you might be able to reverse engineer in a
similar way to (2). Good luck with that.
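
Suggestion 1 can be sketched in a few lines of base R: write one figure per file, then generate a beamer source file with one frame per figure. The file names, the three toy plots and the use of pdf() rather than png() are assumptions made for illustration; compiling slides.tex with pdflatex is left to the reader.

```r
out_dir <- tempdir()
fig_files <- file.path(out_dir, sprintf("plot%02d.pdf", 1:3))

# One figure per file (toy data; replace with the real plots)
set.seed(1)
for (i in seq_along(fig_files)) {
  pdf(fig_files[i], width = 6, height = 4.5)
  plot(rnorm(50), main = paste("Plot", i))
  dev.off()
}

# One beamer frame per figure
frames <- sprintf(
  "\\begin{frame}{Plot %d}\n\\includegraphics[width=\\textwidth]{%s}\n\\end{frame}",
  seq_along(fig_files), basename(fig_files)
)

tex <- c("\\documentclass{beamer}", "\\begin{document}", frames, "\\end{document}")
tex_file <- file.path(out_dir, "slides.tex")
writeLines(tex, tex_file)
```

The same loop-and-template idea carries over to suggestion 2, with the XML of an unzipped Impress file in place of the LaTeX frames.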

Barry


[R] Help resolving error in quantcut

2009-07-08 Thread Chris Anderson

I am trying to use the quantcut function to create deciles, but I am getting 
the error below. I am new to this function and do not know whether I am 
misusing its options or whether some other conversion is necessary.
#initial summary using describe function in Hmisc library
DegreeBurn4th 
      n missing  unique    Mean     .05     .10     .25     .50     .75     .90     .95 
     76     133      16  0.0325  0.0000  0.0000  0.0000  0.0000  0.0225  0.0900  0.1725 

             0 0.01 0.02 0.03 0.04 0.05 0.06 0.08 0.09 0.12 0.16 0.17 0.18 0.24 0.36  0.5
Frequency   48    6    3    4    2    2    1    1    2    1    1    1    1    1    1    1
%           63    8    4    5    3    3    1    1    3    1    1    1    1    1    1    1
 degree.quant = quantcut(DegreeBurn4th, q=seq(0, 1, 0.1), labels=F,na.rm=TRUE)
Error in if (sum(flag) == 0) return(cut) else return(min(x[flag], na.rm = 
na.rm)) : 
  missing value where TRUE/FALSE needed

#orignal data
print(DegreeBurn4th)
  [1] 0.09 0.00 0.00   NA   NA 0.03   NA 0.02   NA 0.00 0.01 0.00   NA   NA   
NA   NA   NA 0.00   NA 0.05 0.03 0.00   NA 0.02 0.00   NA 0.00   NA   NA 0.16   
NA   NA 0.24
 [34]   NA   NA 0.00   NA 0.00 0.08   NA   NA   NA 0.00 0.00   NA   NA 0.01   
NA 0.09   NA 0.00 0.00 0.00 0.06   NA 0.00   NA   NA 0.00   NA   NA 0.00 0.01   
NA   NA 0.00
 [67]   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA 
0.00 0.00   NA   NA 0.00   NA 0.05 0.00   NA   NA   NA 0.00 0.02 0.18   NA   NA 
  NA 0.03   NA
[100]   NA 0.00   NA   NA   NA 0.36   NA   NA   NA   NA 0.00 0.00 0.00   NA 
0.00   NA 0.17   NA 0.00   NA   NA   NA 0.00   NA 0.00 0.00 0.00   NA   NA 0.12 
0.00   NA 0.01
[133] 0.00   NA   NA   NA   NA 0.00 0.00   NA 0.01 0.00 0.00   NA   NA 0.00 
0.04   NA   NA   NA 0.00 0.00   NA 0.03   NA 0.00   NA 0.00   NA   NA 0.01 0.00 
0.00   NA   NA
[166]   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA 0.00   NA   
NA   NA   NA   NA   NA   NA   NA   NA   NA 0.50   NA   NA   NA   NA   NA   NA   
NA   NA 0.04
[199]   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA
#convert missing to zero
 DegreeBurn4th[is.na(DegreeBurn4th)] <- 0.00
 print(DegreeBurn4th)
  [1] 0.09 0.00 0.00 0.00 0.00 0.03 0.00 0.02 0.00 0.00 0.01 0.00 0.00 0.00 
0.00 0.00 0.00 0.00 0.00 0.05 0.03 0.00 0.00 0.02 0.00 0.00 0.00 0.00 0.00 0.16 
0.00 0.00 0.24
 [34] 0.00 0.00 0.00 0.00 0.00 0.08 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 
0.00 0.09 0.00 0.00 0.00 0.00 0.06 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 
0.00 0.00 0.00
 [67] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 
0.00 0.00 0.00 0.00 0.00 0.00 0.05 0.00 0.00 0.00 0.00 0.00 0.02 0.18 0.00 0.00 
0.00 0.03 0.00
[100] 0.00 0.00 0.00 0.00 0.00 0.36 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 
0.00 0.00 0.17 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.12 
0.00 0.00 0.01
[133] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 
0.04 0.00 0.00 0.00 0.00 0.00 0.00 0.03 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 
0.00 0.00 0.00
[166] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.50 0.00 0.00 0.00 0.00 0.00 0.00 
0.00 0.00 0.04
[199] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
 degree.quant = quantcut(DegreeBurn4th, q=seq(0, 1, 0.1), labels=F,na.rm=TRUE)
Error in if (pairs[1, i] == pairs[1, i - 1] && pairs[1, i] == pairs[2,  : 
  missing value where TRUE/FALSE needed
 degree.quant = quantcut(DegreeBurn4th, q=seq(0, 1, 0.1), 
 labels=F,include.lowest=TRUE)
Error in cut.default(x[!flag], breaks = newquant, include.lowest = TRUE,  : 
  formal argument include.lowest matched by multiple actual arguments
 degree.quant = quantcut(DegreeBurn4th, q=seq(0, 1, 0.1), 
 labels=F,include.lowest=F)
Error in cut.default(x[!flag], breaks = newquant, include.lowest = TRUE,  : 
  formal argument include.lowest matched by multiple actual arguments
 degree.quant = quantcut(DegreeBurn4th, q=seq(0, 1, 0.1), 
 labels=F,include.lowest=T)
Error in cut.default(x[!flag], breaks = newquant, include.lowest = TRUE,  : 
  formal argument include.lowest matched by multiple actual arguments
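
One likely contributing factor is that most of the non-missing values are exactly 0, so several decile boundaries coincide. The sketch below rebuilds the 76 non-missing values from the frequency table above and shows the tied breaks using base R only; collapsing the breaks with unique() is an assumed workaround, not quantcut's own behaviour, and it yields fewer than ten groups. The last three errors have a different cause that is visible in the message itself: quantcut already passes include.lowest = TRUE to cut(), so supplying it again duplicates the argument.

```r
# The 76 non-missing values, rebuilt from the frequency table
x <- c(rep(0, 48), rep(0.01, 6), rep(0.02, 3), rep(0.03, 4),
       0.04, 0.04, 0.05, 0.05, 0.06, 0.08, 0.09, 0.09,
       0.12, 0.16, 0.17, 0.18, 0.24, 0.36, 0.5)

# Decile boundaries: the lower ones are all 0 because of the ties
breaks <- quantile(x, probs = seq(0, 1, 0.1))
has_ties <- any(duplicated(breaks))

# cut() itself refuses duplicated breaks; capture the message instead of stopping
cut_msg <- tryCatch(cut(x, breaks = breaks),
                    error = function(e) conditionMessage(e))

# Assumed workaround: keep only the distinct breaks
ubreaks <- unique(breaks)
grp <- cut(x, breaks = ubreaks, include.lowest = TRUE, labels = FALSE)
table(grp)
```

If exactly ten groups are required regardless of ties, ranking with an explicit tie-breaking rule before cutting is another option, at the price of splitting identical values across groups.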
 



Chris Anderson


[R] Odp: error: no such index at level 2

2009-07-08 Thread Petr PIKAL

Hi

r-help-boun...@r-project.org wrote on 07.07.2009 19:06:17:

 Hi,
 
 I am confused about how to select elements from a list.
 
 I'm trying to select all rows of a table 'crossRsorted' such that the
 mean of a related vector is > 0.  The related vector is accessible as
 a list element l[[i]] where i is the row index.
 
 I thought this would work:
 
  crossRsorted[mean(q[[ crossRsorted[,1] ]], na.rm = TRUE) > 0, ]
 Error in q[[crossRsorted[, 1]]] : no such index at level 2

Strange, I got a completely different error. Could it be that only ***you*** 
have crossRsorted?

 crossRsorted[mean(q[[ crossRsorted[,1] ]], na.rm = TRUE) > 0, ]
Error: object 'crossRsorted' not found


What is crossRsorted? Data frame?, List? What is q? List? 

You need to provide at least output from

str(q) and str(crossRsorted) to get some reasonable answers.

and far better to provide artificial data to demonstrate the problem.

with 2 data frames

df1[rowMeans(df2)>0,]

selects rows of df1 which correspond to rows of df2 with row mean > 0

with data frame and list

df1[sapply(list1,mean)>0,]

selects rows of df1 which correspond to list elements with mean > 0

But without knowing structure of your data? Nobody knows.
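
The data-frame-and-list case can be made concrete with artificial data. In this sketch df1 and list1 are invented for illustration and are not the poster's objects. It also hints at why the original attempt failed: "[[" with a vector index walks into nested lists instead of picking several elements, whereas sapply() visits each element in turn.

```r
# Artificial data: one list element per row of df1
df1 <- data.frame(id = 1:4, label = c("a", "b", "c", "d"))
list1 <- list(c(1, 2, NA), c(-3, -1), c(0.5, -0.2), c(-2, NA))

# Mean of each list element, ignoring NAs, compared with 0
keep <- sapply(list1, mean, na.rm = TRUE) > 0

# Rows of df1 whose related vector has a positive mean
df1[keep, ]
```

If the first column of the table holds the list index rather than the row number, list1[df1$id] puts the list into row order before the sapply() call.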

Regards
Petr

 
 How can I express: select only those rows 'r_i' from crossRsorted
 where mean(q[[r_i[1]]]) > 0?
 
 Thanks,
 
  - Godmar
 

Re: [R] bigglm() results different from glm()+Another question

2009-07-08 Thread utkarshsinghal

Hi Greg,

Many thanks for your precious time. Here is a workable code:

set.seed(1)
xx = data.frame(x1=runif(10000,0,10), x2=runif(10000,0,10), x3=runif(10000,0,10))
xx$y = 3 + xx$x1 + 2*xx$x2 + 3*xx$x3 + rnorm(10000)

chunksize = 500
fit = biglm(y~x1+x2+x3, data=xx[1:chunksize,])
for(i in seq(chunksize,10000,chunksize)) fit=update(fit, moredata=xx[(i+1):(i+chunksize),])
AIC(fit)
[1] 28956.91

And the AIC for other chunksizes:
chunksize    AIC
500          28956.91
1000         27956.91
2000         25956.91
2500         24956.91
5000         19956.91
10000        9956.91

Also I noted that the estimated coefficients are not dependent on 
chunksize and AIC is exactly a linear function of chunksize. So I guess 
it is some problem with the calculation of AIC, may be in some degree of 
freedom or adding some constant somewhere.

And my comments below.


Regards
Utkarsh


Greg Snow wrote:

 How many rows does xx have?

 Let's look at your example for chunksize 10000: you initially fit the 
 first 10000 observations, then the seq results in just the value 10000, 
 which means that you do the update based on values 10001 through 20000. 
 If xx only has 10000 rows, then this should give at least one error.  
 If xx has 20000 or more rows, then only chunksize 10000 will ever see 
 the 20000th value; the other chunksizes will use less of the data.

Understood your point and apologize that you had to spend time going 
into the logic inside for loop. I definitely thought of that but my 
actual problem was the variation in AICs (which I was sure about), so to 
ignore this loop problem (temporarily), I deliberately chose the 
chunksizes such that the number of rows is a multiple of chunksize. I 
knew there is still one extra iteration happening and I checked that it 
was not causing any problem, the moredata in the last iteration will 
be all NA's and update does nothing in such a case.

For example:
Let's say chunksize=5000, even though xx has only 10000 rows, fit2 
and fit3 below are exactly the same.

fit1 = biglm(y~x1+x2+x3, data=xx[1:5000,])
fit2 = update(fit1, moredata=xx[5001:10000,])
fit3 = update(fit2, moredata=xx[10001:15000,])
AIC(fit1); AIC(fit2); AIC(fit3)
[1] 5018.282
[1] 19956.91
[1] 19956.91

(The AIC matches with the table above and no warnings at all)

I checked all these things before sending my first mail and dropped the 
idea of refining the for loop as this will save me a few lines of code 
and also the loop looks good and easy to understand. Moreover it is 
neither taking any extra run time nor producing any warnings or errors.
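
For comparison, the extra all-NA iteration can be avoided by computing the chunk indices up front, so that each row is used exactly once whatever the chunk size. This is a sketch, not a fix for the AIC discrepancy: the name chunks is made up, and the biglm calls run only if that package is installed.

```r
set.seed(1)
n <- 10000
xx <- data.frame(x1 = runif(n, 0, 10), x2 = runif(n, 0, 10), x3 = runif(n, 0, 10))
xx$y <- 3 + xx$x1 + 2 * xx$x2 + 3 * xx$x3 + rnorm(n)

chunksize <- 500

# List of row-index vectors: 1:500, 501:1000, ..., with a shorter last
# chunk if n is not a multiple of chunksize
chunks <- split(seq_len(n), ceiling(seq_len(n) / chunksize))

if (requireNamespace("biglm", quietly = TRUE)) {
  # First chunk initialises the fit, the remaining chunks update it
  fit <- biglm::biglm(y ~ x1 + x2 + x3, data = xx[chunks[[1]], ])
  for (idx in chunks[-1]) {
    fit <- update(fit, moredata = xx[idx, ])
  }
  print(summary(fit))
}
```

With this loop the sample size shown by summary(fit) should equal n for every chunk size, which separates indexing problems from the AIC question.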

  

 Also looking at the help for update.biglm, the 2nd argument is 
 moredata not data, so if the code below is the code that you 
 actually ran, then the new data chunks are going into the ... 
 argument (and being ignored as that is there for future expansion and 
 does nothing yet) and the moredata argument is left empty, which 
 should also be giving an error.  For the code below, the model is only 
 being fit to the initial chunk and never updated, so with different 
 chunk sizes, there is different amounts of data per model.  You can 
 check this by doing summary(fit) and looking at the sample size in the 
 2nd line.

My fault in writing the mail. In the actual code, I gave update(fit, 
xx[(i+1):(i+chunksize),]) ,i.e., I just passed the new chunk as the 2nd 
argument without mentioning the argument name, which is correct, but 
while writing the mail I added the argument name as data without 
checking what it is.

  

 It is easier for us to help you if you provide code that can be run by 
 copying and pasting (we don't have xx, so we can't just run the code 
 below, you could include a line to randomly generate an xx, or a link 
 to where a copy of xx can be downloaded from).  It also helps if you 
 mention any errors or warnings that you receive in the process of 
 running your code.

  

 Hope this helps,

  

 -- 

 Gregory (Greg) L. Snow Ph.D.

 Statistical Data Center

 Intermountain Healthcare

 greg.s...@imail.org

 801.408.8111

  

 *From:* utkarshsinghal [mailto:utkarsh.sing...@global-analytics.com]
 *Sent:* Tuesday, July 07, 2009 12:10 AM
 *To:* Greg Snow
 *Cc:* Thomas Lumley; r help
 *Subject:* Re: [R] bigglm() results different from glm()+Another question

  

 Trust me, it is the same total data I am using, even the chunksizes 
 are all equal. I also crosschecked by manually creating the chunks and 
 updating as in example given on biglm help page.
  ?biglm


 Regards
 Utkarsh



 Greg Snow wrote:

 Are you sure that you are fitting all the models on the same total 
 data?  A first glance looks like you may be including more data in 
 some of the chunk sizes, or be producing an error that update does not 
 know how to deal with.

  

 -- 

 Gregory (Greg) L. Snow Ph.D.

 Statistical Data Center

 Intermountain Healthcare

 greg.s...@imail.org mailto:greg.s...@imail.org

 801.408.8111

  

 *From:* utkarshsinghal

[R] Plotting the PDF and the Cumulative Probability

2009-07-08 Thread aledanda


Hello,

I have to fit my distribution with a Beta-prime. I have found the parameters; now I
need to plot the cumulative probability and the probability density of my
fitted data. 
With gamma, for example, it is easy:

PDF:

plot(x,dgamma(x, shape,rate))

Cumulative probability:

plot(x,pgamma(x, shape,rate))

How can I do this with Beta-prime?
Can I use the pbeta and dbeta defined for the Beta distribution even though my
function is a Beta-prime?
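
pbeta and dbeta can be reused through a change of variables: if X follows a Beta(a, b) distribution, then Y = X/(1 - X) follows a Beta-prime(a, b) distribution, so the CDF at y is pbeta(y/(1 + y), a, b) and the density picks up the Jacobian factor 1/(1 + y)^2. A minimal sketch follows; the function names dbetaprime and pbetaprime and the parameter values 2 and 3 are made up for illustration, and no scale parameter is handled.

```r
# Beta-prime density via the Beta density and the Jacobian 1 / (1 + x)^2
dbetaprime <- function(x, shape1, shape2) {
  x0 <- pmax(x, 0)
  ifelse(x >= 0, dbeta(x0 / (1 + x0), shape1, shape2) / (1 + x0)^2, 0)
}

# Beta-prime CDF via the Beta CDF evaluated at q / (1 + q)
pbetaprime <- function(q, shape1, shape2) {
  q0 <- pmax(q, 0)
  pbeta(q0 / (1 + q0), shape1, shape2)
}

shape1 <- 2   # illustrative values, replace with the fitted parameters
shape2 <- 3
x <- seq(0.01, 5, length.out = 200)

plot(x, dbetaprime(x, shape1, shape2), type = "l", ylab = "density")
plot(x, pbetaprime(x, shape1, shape2), type = "l", ylab = "cumulative probability")
```

Calling dbeta(x, a, b) directly on Beta-prime data would be wrong, because the Beta distribution lives on [0, 1] while the Beta-prime lives on the positive half-line.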

Thanks a lot!

Ale
-- 
View this message in context: 
http://www.nabble.com/Plotting-the-PDF-and-the-Cumulative-Probability-tp24387435p24387435.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Dump plots to powerpoint?

2009-07-08 Thread T . Zumbrunn


Another suggestion:

Create your presentation with an OpenDocument Presentation format  
compatible application (e.g. OpenOffice Impress), create the plots  
with Sweave chunks, and process the file with odfWeave() (package  
odfWeave). If necessary, you can export to other formats such as  
PowerPoint.


Best wishes
Thomas Zumbrunn


Re: [R] How to separate the string?

2009-07-08 Thread Hemavathi Ramulu

Hi everyone,
Thanks a lot. It works now, with help from all of you.

regards,
Hema

On Tue, Jul 7, 2009 at 5:00 PM, Petr PIKAL petr.pi...@precheza.cz wrote:

 Hi

 If you have data frame like this

 test=data.frame(x=c("abcd", "abc", "abcde"))
 then
 strsplit(as.matrix(test), "")

 makes a list of split character vectors. If you want them in a data
 frame you would need to combine vectors of unequal length.

 However I would try reading your text file with

 read.fwf(file, 1)

 Regards
 Petr
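
The step of combining vectors of unequal length can be sketched by padding each split word with NA up to the longest one. The example words are the ones from the original question; the names padded and result are made up for illustration.

```r
words <- c("bear", "cat", "tiger")

# One character vector per word, one letter per element
chars <- strsplit(words, "")

# Pad every vector with NA to the length of the longest word
width <- max(lengths(chars))
padded <- t(sapply(chars, function(ch) c(ch, rep(NA, width - length(ch)))))

# Original word first, then one letter per column
result <- data.frame(words, padded, stringsAsFactors = FALSE)
names(result) <- paste0("column", seq_len(ncol(result)))
result
```

For a text file with one word per line, readLines() can supply the words vector.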


 Hemavathi Ramulu hema.ram...@gmail.com wrote on 07.07.2009 10:36:40:

  Hi Petr,
 
  The data is in a text file, not in csv format.
  By "separate" I mean splitting the string into its individual letters,
  with each letter in a different column.
 
  thanks alot.
 
  regards,
  Hema.

  On Tue, Jul 7, 2009 at 4:12 PM, Petr PIKAL petr.pi...@precheza.cz
 wrote:
  Hi
 
  r-help-boun...@r-project.org wrote on 07.07.2009 09:54:30:
 
   Hi everyone,
   I want to separate the string (column1), for example

  Well, how did you get the data in R? Are they in separated columns of
  data.frame? What do you mean by separate?
 
  
   column1 column2 column3 column4 column5 column6
   bear    b       e       a       r
   cat     c       a       t
   tiger   t       i       g       e       r
  
   I know how to do this in excel where using MID function.

  As Microsoft is more user friendly and uses translated functions in
  language specific versions of Excel I do not have function MID. I
 suspect
  it takes values from middle of string set by some identifiers. If it is
  the case see
 
  ?substr
 
  However I would start with
 
  ?read.table
 
  and related read.* functions to get the data into R in appropriate
 shape.
 
  Regards
  Petr
 
 
   Now I want to solve it using R. The list of strings is in text file. I
   looked up the help but did not find it.
   Can someone help me here?
  
   Thank you very much.
  
   Regards,
   Hema
  

 
 
 




-- 
Hemavathi Ramulu


[R] Fitting a trend-line

2009-07-08 Thread anupam sinha

Hi all,
 I am new to R. How does one go about fitting a trend-line to a
scatter plot? Any help is appreciated.

Thanks and regards,

Anupam


[R] Odp: Fitting a trend-line

2009-07-08 Thread Petr PIKAL

Hi
see
?lm and ?abline

Regards
Petr
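
A minimal sketch of what ?lm and ?abline lead to, on simulated data (the variables x and y and the true slope of 0.5 are invented for illustration): fit a straight line by least squares and draw it over the scatter plot.

```r
set.seed(42)
x <- 1:50
y <- 2 + 0.5 * x + rnorm(50, sd = 3)

# Least-squares straight line: y = intercept + slope * x
fit <- lm(y ~ x)

plot(x, y)                 # scatter plot
abline(fit, col = "red")   # add the fitted trend line
coef(fit)                  # estimated intercept and slope
```

For a trend that is not a straight line, lines(lowess(x, y)) adds a smooth curve in the same way.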

r-help-boun...@r-project.org wrote on 08.07.2009 11:31:19:

 Hi all,
  I am new to R. How does one go about fitting a trend-line to a
 scatter plot? Any help is appreciated.
 
 Thanks and regards,
 
 Anupam
 

Re: [R] Fitting a trend-line

2009-07-08 Thread wapita wapita

Why don't you do a linear regression?

 Date: Wed, 8 Jul 2009 15:01:19 +0530
 From: anupam.cont...@gmail.com
 To: r-help@r-project.org
 Subject: [R] Fitting a trend-line

 Hi all,
  I am new to R. How does one go about fitting a trend-line to a
 scatter plot? Any help is appreciated.

 Thanks and regards,

 Anupam

   [[alternative HTML version deleted]]


[R] system() how to make a program run a specific file

2009-07-08 Thread Paulo E. Cardoso

I'd like to know how to call a program to run or open a specific  file.

 

something like this:

system('C:\\Program Files (x86)\\IrfanView\\i_view32.exe','-A:\\
teste.jpg') is not working.

 

any help will be appreciated



Paulo E. Cardoso

 



Re: [R] Dump plots to powerpoint?

2009-07-08 Thread Gabor Grothendieck

Check out the R2PPT package on CRAN.

On Tue, Jul 7, 2009 at 4:38 PM, Thomas aikto...@yahoo.com wrote:
 Hi,

 Is it possible to dump a series of plots directly into a powerpoint 
 presentation (as is possible in Splus)?

 Thank you,
 Thomas




        [[alternative HTML version deleted]]


[R] RODBC and sqlSave issue

2009-07-08 Thread wapita wapita








Hello,

I am contacting you after having unsuccessfully asked my question on the R mailing list. 

I use the package RODBC to connect to a MS-SQL server. 
I am able to run queries against the database.

I am now trying to use sqlSave to write some data into the database.
Unfortunately, I have run into issues with the format of the data that
arrives in the database. I have three columns. The first one should be in the
MS-SQL format datetime, the second one in the MS-SQL format
varchar(50), and the third one in the MS-SQL format numeric(20,8).




I use the following command line:
 sqlSave(channel, DF, tablename="essai_global", rownames=FALSE, oldstyle=FALSE)

The data is indeed sent to the database, but the types are wrong (varchar(255) 
for all three columns).

I have then tried to use the varTypes argument, but I do not manage to use it.

If I use the following command lines:
 varTypes=c("datetime","varchar(50)","numeric(20,8)")
 sqlSave(channel, DF, tablename="essai_global", rownames=FALSE, oldstyle=FALSE)



I have the following result:

Warning message:
In sqlSave(channel, DF, tablename = essai_global, rownames = FALSE,  :
  argument 'varTypes' has no names and will be ignored



and the types are still wrong..

How can I use the varTypes??? I have read the documentation, but I dd not 
manage to find out.

Thank you very much

Wapita
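
The warning quoted above suggests that sqlSave wants varTypes to be a named character vector, with the names matching the columns of the data frame. A minimal sketch of building such a vector follows. The data frame and its column names (obs_time, label, value) are made up for illustration, and the sqlSave call is left commented out because it needs a live RODBC connection.

```r
# Hypothetical stand-in for DF; the column names are assumptions.
DF <- data.frame(
  obs_time = as.POSIXct(c("2009-07-08 10:00:00", "2009-07-08 11:00:00"), tz = "GMT"),
  label    = c("a", "b"),
  value    = c(1.5, 2.25),
  stringsAsFactors = FALSE
)

# One SQL type per column, named after the column it applies to.
varTypes <- c("datetime", "varchar(50)", "numeric(20,8)")
names(varTypes) <- names(DF)
print(varTypes)

# With an open channel the call would then pass the named vector explicitly
# (not run here, since it needs a database):
# sqlSave(channel, DF, tablename = "essai_global", rownames = FALSE,
#         varTypes = varTypes)
```

Whether the server accepts the datetime values as sent may still depend on how the date column is formatted, so this addresses only the "has no names" warning.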

Re: [R] Dump plots to powerpoint?

2009-07-08 Thread SIES 73

Hi,

On Windows, you can use a COM client (with packages like rcom or RDCOMClient) 
to control PowerPoint from R and insert the generated image using PowerPoint's 
object model. You can either use the clipboard or an intermediate image file 
saved to disk.

This is not hard to do, but it seems to be already implemented in the package
R2PPT, recently released to CRAN, so have a look at it: 
http://stat.ethz.ch/CRAN/web/packages/R2PPT/index.html

About the image format, using Windows metafiles allows you to double-click the 
image in PowerPoint, ungroup, and then edit each of its components (text, 
lines, etc.)

Regards,

Enrique
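
The "intermediate image file saved to disk" route mentioned above can be sketched in base R as follows. This only covers writing a series of plots to files; the PowerPoint insertion step is not shown. The output directory, file names, and the random data are assumptions for illustration, and png() is used so the sketch runs on any platform (on Windows, win.metafile() could take its place to get editable metafiles).

```r
# Write a series of plots to image files that a later step could insert
# into a presentation. Paths and data are illustrative only.
out_dir <- tempfile("plots_")
dir.create(out_dir)

set.seed(1)
plot_files <- character(0)
for (i in 1:3) {
  f <- file.path(out_dir, sprintf("plot_%02d.png", i))
  png(f, width = 800, height = 600)      # open a file device
  plot(cumsum(rnorm(50)), type = "l", main = paste("Series", i))
  dev.off()                              # close it so the file is written
  plot_files <- c(plot_files, f)
}
print(plot_files)
```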
 

--

Date: Tue, 7 Jul 2009 13:38:48 -0700 (PDT)
From: Thomas aikto...@yahoo.com
Subject: [R] Dump plots to powerpoint?
To: r-help@r-project.org
Message-ID: 254923.51562...@web110511.mail.gq1.yahoo.com
Content-Type: text/plain

Hi,

Is it possible to dump a series of plots directly into a powerpoint 
presentation (as is possible in Splus)?

Thank you,
Thomas



  

Re: [R] system() how to make a program run a specific file

2009-07-08 Thread Paulo E. Cardoso

After all it's very easy:

system(paste('C:\\Program Files (x86)\\IrfanView\\i_view32.exe','A:\\test.jpg'))


Paulo E. Cardoso
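
Because the program path contains spaces, it may be safer to quote each piece before pasting the command together. A minimal sketch using base R's shQuote follows. The paths are the ones from the message; the system() call itself is left commented out, since it only makes sense on a Windows machine where both paths exist.

```r
exe <- "C:\\Program Files (x86)\\IrfanView\\i_view32.exe"
img <- "A:\\test.jpg"

# Wrap each part in double quotes, as the Windows command processor expects.
cmd <- paste(shQuote(exe, type = "cmd"), shQuote(img, type = "cmd"))
cat(cmd, "\n")

# system(cmd, wait = FALSE)   # uncomment on a machine where the paths exist
```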


 -----Original Message-----
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On behalf of Paulo E. Cardoso
 Sent: Wednesday, 8 July 2009 10:59
 To: r-help@r-project.org
 Cc: r-h...@stat.math.ethz.ch
 Subject: [R] system() how to make a program run a specific file
 
 I'd like to know how to call a program to run or open a specific file.
 
 something like this:
 
 system('C:\\Program Files (x86)\\IrfanView\\i_view32.exe','-A:\\teste.jpg')
 is not working.
 
 
 
 any help will be appreciated
 
 
 
 Paulo E. Cardoso
 
 
 
 

Re: [R] Fitting a trend-line

2009-07-08 Thread Jim Lemon


anupam sinha wrote:

Hi all,
 I am new to R. How does one go about fitting a trend-line to a
scatter plot? Any help is appreciated.

  

Hi Anupam,
Have a look at the help page for the abline function in the graphics 
package.


Jim
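
As a small illustration of the abline suggestion: one common way to get a straight trend line is to fit a linear model with lm() and pass the fit to abline(). The data below are simulated purely for the example.

```r
# Simulated scatter data; replace x and y with your own vectors.
set.seed(1)
x <- 1:30
y <- 2 + 0.5 * x + rnorm(30)

fit <- lm(y ~ x)          # least-squares straight line
plot(x, y)                # scatter plot
abline(fit, col = "red")  # add the fitted line to the existing plot
print(coef(fit))          # intercept and slope
```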


[R] stats::decompose - Problem finding seasonal component without trend

2009-07-08 Thread Mike HC



Hi R-help,

I'd like to extract the seasonal component of a short timeseries, and was
hoping to use stats::decompose.  I don't want to decompose the 'trend'
component so I thought I should call decompose(x,filter=0). I think I've
either misunderstood the filter argument or come upon a bug/feature in
decompose.

# EXAMPLE
x <- ts(c(2:12,rep(1,12),1:12),start=c(2009,2),frequency=12); x  # Starts in Feb

#      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
# 2009       2   3   4   5   6   7   8   9  10  11  12
# 2010   1   1   1   1   1   1   1   1   1   1   1   1
# 2011   1   2   3   4   5   6   7   8   9  10  11  12

decompose(x)  # ok, got some answer for the seasonal component, but I don't
              # want to split the residual into trend and random.

decompose(x,filter=0)  # this seems broken: it ignores some of the data in the
                       # seasonal calculation and loses some points in the
                       # random component
# END EXAMPLE

I've debug-stepped through decompose and, as far as I can understand the
manipulation, it appears to ignore the first and last period. And only the
middle 12 points (all 1 in my example) are used in the calculation of the
seasonal averages. Unrelated, but it also seems to duplicate one value
during the calculation, and throw a warning due to a seemingly unnecessary
'end' argument to window.

I can probably get away with using some function like sweep or scale
instead, but please let me know if I'm just misusing decompose.  If it's a
bug, I hope the above helps..
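
For reference, the "do it by hand" alternative mentioned above might look like the sketch below. It is not a replacement for decompose() and says nothing about whether the behaviour described is a bug; it simply averages each calendar month over all available years, with no trend removal, under the assumption that an additive seasonal figure is what is wanted.

```r
x <- ts(c(2:12, rep(1, 12), 1:12), start = c(2009, 2), frequency = 12)

month <- as.integer(cycle(x))                  # 1..12 position of each point
seasonal_means <- tapply(as.numeric(x), month, mean)  # uses every period

# Centre the monthly means so they sum to zero, then remove them from x.
seasonal_centred <- seasonal_means - mean(seasonal_means)
deseasonalised   <- x - as.numeric(seasonal_centred[month])

print(round(seasonal_centred, 3))
```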

Regards,
Mike

P.S.

I see this comment in the R 2.8.0 release notes:

 o   HoltWinters() and decompose() use a (statistically) more
efficient computation for seasonal fits (they used to waste
one period).

I'm on R 2.8.0:
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          8.0
year           2008
month          10
day            20
svn rev        46754
language       R
version.string R version 2.8.0 (2008-10-20)


[R] Import xlsx file in Ubuntu 9.04

2009-07-08 Thread Rodrigo Aluizio

Hi list,
For the last two weeks I have been looking for a way to import xlsx files
directly into R on a Linux OS (Ubuntu 9.04). I have already read the R
Import/Export guide, and I know how to use gdata to import xls files and
read.table to import .csv. My problem is that all the data I receive is in
the xlsx format, so I have to convert all the files to xls.
When I was using Windows Vista, RODBC did the trick with the
odbcConnectExcel2007 function (which I know is not present in the Linux
RODBC package, probably due to driver issues). Is there a way to import
these xlsx files directly into R without any prior conversion (to .csv or
.xls)?

Thank you for your attention. Someone has probably asked this already; I
even remember seeing it somewhere, but without a definitive answer.

Rodrigo.


Re: [R] R 2.9.0 plot still forcing current time zone

2009-07-08 Thread jim holtman

Try this: set the timezone to what you want before plotting:

 tzsave <- Sys.getenv("TZ")  # save current
 Sys.setenv(TZ="GMT")  # set to whatever
  plot(x,rep(1,11))  # plot
 Sys.setenv(TZ=tzsave)  # restore
  plot(x,rep(1,11))  # plot in original time zone
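
The save/set/restore pattern above can be wrapped in a small helper so that the time zone is restored even if the plotting call fails. This is only a sketch: with_tz is a made-up name, not a function from base R, and it relies on R's lazy evaluation so that the expression is evaluated after TZ has been changed.

```r
# Hypothetical helper: evaluate `expr` with the TZ environment variable
# temporarily set to `tz`, restoring the previous state afterwards.
with_tz <- function(tz, expr) {
  old <- Sys.getenv("TZ", unset = NA)
  on.exit(if (is.na(old)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old))
  Sys.setenv(TZ = tz)
  force(expr)   # the argument is evaluated here, under the new TZ
}

x <- as.POSIXct("2009-01-01 00:00:00", tz = "GMT") +
  60 * 60 * 24 * seq(0.5, 1.5, 0.1)

with_tz("GMT", plot(x, rep(1, 11)))   # axis drawn with TZ set to GMT
```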


On Wed, Jul 8, 2009 at 2:21 AM, Britton Stephens <steph...@ucar.edu> wrote:
 the help page for plot.POSIXct says

 As from R 2.9.0 the date-times for a 'POSIXct' input are
     interpreted in the timezone given by the 'tzone' attribute if
     there is one, otherwise the current timezone.  (Earlier versions
     always used the current timezone.)

 however I am using 2.9.0 on linux and the following still happily produces
 an x-axis in local (MDT) time

 x=strptime(paste('09-01-01 00:00:00',sep=''),format='%y-%m-%d %H:%M:%S',tz="GMT")+60*60*24*(seq(0.5,1.5,.1))
 x
 [1] "2009-01-01 12:00:00 GMT" "2009-01-01 14:24:00 GMT"
 [3] "2009-01-01 16:48:00 GMT" "2009-01-01 19:12:00 GMT"
 [5] "2009-01-01 21:36:00 GMT" "2009-01-02 00:00:00 GMT"
 [7] "2009-01-02 02:24:00 GMT" "2009-01-02 04:48:00 GMT"
 [9] "2009-01-02 07:12:00 GMT" "2009-01-02 09:36:00 GMT"
 [11] "2009-01-02 12:00:00 GMT"
 attributes(x)
 $class
 [1] "POSIXt"  "POSIXct"

 $tzone
 [1] "GMT"

 plot(x,rep(1,11))

 Is this a bug, or am I missing something?  Thanks a lot!
 Britt

 --
 Britton B. Stephens
 National Center for Atmospheric Research
 P.O. Box 3000, 1850 Table Mesa Drive
 Boulder, CO  80307-3000
 Phone: (303) 497-1018
 Fax: (303) 497-1092





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


[R] linear regression and testing the slope

2009-07-08 Thread evrim akar

Dear All,

First of all I would like to say I do not have much knowledge about this
subject, so most of you will probably find this really easy. I am doing a
linear regression and I want to test if the slope of the curve is 0. R gives
the summary statistics:

Call:
lm(formula = x ~ s)

Residuals:
      Min        1Q    Median        3Q       Max
-0.025096 -0.020316 -0.001203  0.011658  0.044970

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.005567   0.016950   0.328    0.750
s           -0.001599   0.002499  -0.640    0.538

Residual standard error: 0.02621 on 9 degrees of freedom
Multiple R-squared: 0.04352,  Adjusted R-squared: -0.06276
F-statistic: 0.4095 on 1 and 9 DF,  p-value: 0.5382

What is this t-value for? The explanation in the help file was unfortunately
not clear to me. How can I test my hypothesis that the slope is 0?

Thank you in advance,

regards,

Evrim


[R] transform multi skew-t to uniform distribution

2009-07-08 Thread Adelchi Azzalini


RHRPO Hi R-users,
RHRPO I have data from a multivariate skew-t distribution and would like to
RHRPO transform each of the data to uniform data. I tried using 'pmst' but
RHRPO only got one output:
RHRPO
RHRPO  rr1 <- as.vector(r1);rr1
RHRPO  [1]  0.7207582  5.2250906  1.7422237  0.5677233  0.7473555 -0.6020626
RHRPO  [7] -2.1947872 -1.1128313 -0.6587316 -1.1409261
RHRPO
RHRPO  pmst(rr1, xi=rep(0,10), Omega=diag(10), alpha=rep(1,10), df=5)
RHRPO [1] 3.676525e-09

you are computing a 10-dimensional distribution function at
a 10-dimensional point; so you get a single number out -- this is
as expected.

I presume that  actually you want to compute a 1-dimensional
distribution at 10 different points, which is achieved by 

   pst(rr1, dp=c(0,1,1,5))

[1] 0.564580 0.996707 0.867177 0.497123 0.575915 0.085922 0.004127 0.030807
[9] 0.076839 0.029117

Best regards,

Adelchi Azzalini
-- 
Adelchi Azzalini  azzal...@stat.unipd.it
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/
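
The general idea behind the reply, applying a one-dimensional distribution function elementwise to map data to the unit interval, can be illustrated with base R alone. The sketch below uses the ordinary Student t distribution (rt/pt) as a stand-in, so that it runs without the sn package; with skew-t data, pst would play the role of pt.

```r
set.seed(42)
z <- rt(1000, df = 5)   # stand-in sample from a Student t with 5 df
u <- pt(z, df = 5)      # CDF applied elementwise: one value per data point

range(u)                          # all values lie strictly between 0 and 1
ks.test(u, "punif")$p.value       # rough check of uniformity
```

The transformed values are only uniform if the distribution function used matches the distribution the data came from.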


Re: [R] ReShape to create Time from Observations?

2009-07-08 Thread Mark Knecht

On Tue, Jul 7, 2009 at 4:22 PM, jim holtman <jholt...@gmail.com> wrote:
 Does something like this work for you?  It uses the reshape package:

 X <- data.frame(A=1:10, B=0, C=1, Ob1=1:10, Ob2=2:11, Ob3=3:12,
 + Ob4=4:13, Ob5=3:12, Ob6=2:11)
 Y <- data.frame(A=1:20, B=0, C=1, D=5, Ob1=1:10, Ob2=2:11, Ob3=3:12,
 + Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=5:9)
 Z <- data.frame(A=1:30, B=0, C=1, D=6, E=1:2, Ob1=1:10, Ob2=2:11,
 + Ob3=3:12, Ob4=4:13, Ob5=3:12, Ob6=2:11, Ob7=1:10, Ob8=3:12)

 f.melt <-
 + function(df)
 + {
 +     # get the starting column number of Ob1, then extend for rest of columns
 +     require(reshape)
 +     melt(df, measure=seq(match("Ob1", names(df)), ncol(df)))
 + }
 x.m <- f.melt(X)
 y.m <- f.melt(Y)
 z.m <- f.melt(Z)

 # sample data
 head(x.m, 20)
    A B C variable value
 1   1 0 1      Ob1     1
 2   2 0 1      Ob1     2
 3   3 0 1      Ob1     3
 4   4 0 1      Ob1     4
 5   5 0 1      Ob1     5
 6   6 0 1      Ob1     6
 7   7 0 1      Ob1     7
 8   8 0 1      Ob1     8
 9   9 0 1      Ob1     9
 10 10 0 1      Ob1    10
 11  1 0 1      Ob2     2
 12  2 0 1      Ob2     3
 13  3 0 1      Ob2     4
 14  4 0 1      Ob2     5
 15  5 0 1      Ob2     6
 16  6 0 1      Ob2     7
 17  7 0 1      Ob2     8
 18  8 0 1      Ob2     9
 19  9 0 1      Ob2    10
 20 10 0 1      Ob2    11

SNIP

Jim,
   It wasn't exactly what I was looking for, but I think the ideas plus
a bit of off-list help from another member helped me get much closer.
The idea of using match is very helpful in my case because I'm able to
leverage the fact that in my data files everything to the right is
also an observation, so I can easily create a list to the end of the row.
Try the following:

X <- data.frame(A=1:10, B=0, C=1, Ob1=1:10, Ob2=2:11, Ob3=3:12, Ob4=4:13,
                Ob5=3:12, Ob6=2:11)

BrkPnt <- match("Ob1", names(X))
Ob_Group <- list(names(X)[BrkPnt:ncol(X)])

# Give to reshape to turn ObX into time
answerX1 <- reshape(X, varying=Ob_Group, direction='long')

and at this point I can subset based on id or some other variable:

subset(answerX1, A==1)
    A B C time Ob1 id
1.1 1 0 1    1   1  1
1.2 1 0 1    2   2  1
1.3 1 0 1    3   3  1
1.4 1 0 1    4   4  1
1.5 1 0 1    5   3  1
1.6 1 0 1    6   2  1

   I *think* this is data that I can send to matplot/qplot and get
charts that I'm interested in. I'll work on that today to verify but
it looks about right to me using this simple case:

PlotData <- subset(answerX1, A==1)
matplot(PlotData$time,PlotData$Ob1)

   I really like the match idea. The first observation should
generally be in about the first 20 columns of my files, which can
potentially be thousands of columns wide. There's no reason in my case
to match every other column to the right as I already know they will
match. I can get a list of all the observations with BrkPnt:ncol(X) or
all the independent variables using 1:(BrkPnt-1). I could also, if I
chose, extract a specific group of observations by matching "Ob20" and
"Ob40" to potentially find observations taken in a certain time period
every day, etc. Nice!

   I'll put it back in a function as you did for use in my actual code.
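
One way the steps above might be packaged as a function is sketched here, using base R's reshape() only. The function name to_long, the argument first_obs, and the output column name "value" are choices made for this example, not something from the thread.

```r
# Hypothetical wrapper: everything from `first_obs` to the last column is
# treated as an observation and stacked into long format.
to_long <- function(df, first_obs = "Ob1") {
  brk <- match(first_obs, names(df))
  stopifnot(!is.na(brk))
  obs_cols <- names(df)[brk:ncol(df)]
  reshape(df,
          varying   = list(obs_cols),
          v.names   = "value",
          timevar   = "time",
          times     = seq_along(obs_cols),
          idvar     = "id",
          direction = "long")
}

X <- data.frame(A = 1:10, B = 0, C = 1, Ob1 = 1:10, Ob2 = 2:11, Ob3 = 3:12,
                Ob4 = 4:13, Ob5 = 3:12, Ob6 = 2:11)
long <- to_long(X)
subset(long, A == 1)   # the six observations belonging to the first row
```

Naming the value column explicitly avoids ending up with a long-format column called Ob1 that actually holds all observations.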

Cheers,
Mark





[R] RDCOMClient: how to close Excel process?

2009-07-08 Thread Lauri Nikkinen

Hi,

I'm using the R package RDCOMClient (http://www.omegahat.org/RDCOMClient/)
to retrieve data from an MS Excel workbook. I'm using the code below to
count the number of sheets in the workbook and then loop the data from
the sheets into a list.

# R code ###
library(gdata)
library(RDCOMClient)

xl <- COMCreate("Excel.Application")
sh <- xl$Workbooks()$Open(normalizePath("sample_file.xls"))$Sheets()$Count()

DF.list <- list()
for (i in 1:sh) {
   DF.list[[i]] <- read.xls("sample_file.xls", sheet=i,
                            stringsAsFactors = FALSE)
   }
##

COMCreate opens an Excel process, which can be seen in the Windows Task
Manager. When I try to open sample_file.xls in Excel, it just flashes
on the screen and shuts down. When I kill (via Task Manager) the Excel
process COMCreate started, sample_file.xls opens normally.

The question is: how can I close the Excel process COMCreate started?
xl$Close() doesn't seem to work. The same problem has been presented
in this post to R-help:
http://tolstoy.newcastle.edu.au/R/help/06/04/25990.html

-L


[R] R regular expression to extract words with the query string.

2009-07-08 Thread Praveen Surendran

Hi,

Is there a way in R to get the string which matches the expression, where
the expression is a substring of the parent string?

Let's say I have $i <- "transcript:ENST112334 pid:ENSP12345"

What I need is the string "pid:ENSP12345" from $i using the query
"ENSP".

Appreciate your comments.

Praveen Surendran
School of Medicine and Medical Sciences
University College Dublin
Belfield, Dublin 4
Ireland.


[R] system() how to make a program run a specific file - RUN and Output directory issues

2009-07-08 Thread Paulo E. Cardoso

I have a particular case where the program I'm calling needs additional
instructions: clicking a RUN button and setting an output directory. Can
these options be controlled with the system() function?


Paulo E. Cardoso



Re: [R] RDCOMClient: how to close Excel process?

2009-07-08 Thread Henrique Dallazuanna

Try this:

xl$Quit()

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


Re: [R] linear regression and testing the slope

2009-07-08 Thread Ted Harding

On 08-Jul-09 12:29:40, evrim akar wrote:
 Dear All,
 First of all I would like to say I do not have much knowledge
 about this subject, so most of you can find it really easy.
 I am doing a linear regression and I want to test if the slope
 of the curve is 0. R gives the summary statistics:
 
 Call:
 lm(formula = x ~ s)
 
 Residuals:
       Min        1Q    Median        3Q       Max
 -0.025096 -0.020316 -0.001203  0.011658  0.044970
 
 Coefficients:
              Estimate Std. Error t value Pr(>|t|)
 (Intercept)  0.005567   0.016950   0.328    0.750
 s           -0.001599   0.002499  -0.640    0.538
 
 Residual standard error: 0.02621 on 9 degrees of freedom
 Multiple R-squared: 0.04352,  Adjusted R-squared: -0.06276
 F-statistic: 0.4095 on 1 and 9 DF,  p-value: 0.5382
 
 what is this t-value for? The explanation in the help file was
 unfortunately not clear to me. How can I test my hypotheses that
 if the slope is 0?
 
 Thank you in advance,
 regards,
 Evrim

The quantity 't' is the estimated value (-0.001599 for the slope 's')
divided by its estimated standard error (0.002499). Taking the values
as reported by the summary:

  t = -0.001599/0.002499 = -0.639856

which R has reported (to 3 significant figures) as -0.640

The Pr(>|t|) is the probability, assuming the null hypothesis that
the slope (coefficient of 's') is zero, that data could arise at random
giving rise to a t-value which, in absolute value, would exceed the
absolute value |t| = |-0.639856| = 0.639856 which you got from your
data.

The relevance of this for testing the hypothesis that the slope is 0
is that, if the slope really is 0, then large values (either way) of
the coefficient of 's' (reported by R as Estimate) are unlikely.
So if you got a value of Pr(>|t|) which was small (conventionally
less than 0.05, or 0.01, etc.) then you would have a value so large
that getting a value at least as large as this if the hypothesis
were true would be unlikely. Therefore it would be more plausible
that the null hypothesis was false.

In your case, the P-value Pr(>|t|) = 0.538, so you would be more
likely than not to get an estimate at least as deviant from 0 as the
one you did get, if the null hypothesis were true. Hence the data do
not provide grounds for rejecting the null hypothesis.

Note that not having grounds for rejection does not mean that you
must accept it: a non-significant t-value is not proof that the
null hypothesis is true.
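
The arithmetic above can be reproduced directly from the numbers in the summary. In this short sketch the estimate, standard error, and residual degrees of freedom are copied from the quoted output; the two-sided P-value is obtained from the t distribution with pt().

```r
est <- -0.001599   # slope estimate for 's'
se  <-  0.002499   # its standard error
df  <-  9          # residual degrees of freedom

t_val <- est / se                   # the 't value' column
p_val <- 2 * pt(-abs(t_val), df)    # two-sided P-value, the Pr(>|t|) column

print(c(t = t_val, p = p_val))
```

The result should agree with the summary up to the rounding of the inputs.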

There is a good basic outline of the t-test in the Wikipedia article
Student's t-test:

  http://en.wikipedia.org/wiki/Student%27s_t-test

Hoping this helps,
Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 08-Jul-09   Time: 14:17:52
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R regular expression to extract words with the query string.

2009-07-08 Thread Henrique Dallazuanna

Try this:

sapply(strsplit(i, ' '), grep, pattern='ENSP', value = T)
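
For completeness, here are two equivalent ways of doing this in base R, written out as a self-contained sketch: the split-and-grep approach from above, and a single regular expression that grabs the whole space-delimited word around the query. The variable i holds the example string from the question.

```r
i <- "transcript:ENST112334 pid:ENSP12345"

# 1. Split on spaces, then keep the words containing the query.
words <- strsplit(i, " ", fixed = TRUE)[[1]]
hit1  <- grep("ENSP", words, value = TRUE, fixed = TRUE)

# 2. Match the run of non-space characters surrounding the query.
hit2  <- regmatches(i, regexpr("[^ ]*ENSP[^ ]*", i))

print(hit1)   # "pid:ENSP12345"
print(hit2)   # "pid:ENSP12345"
```

The second form returns only the first match per string; gregexpr could be used in place of regexpr if several matches are expected.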

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


Re: [R] RDCOMClient: how to close Excel process?

2009-07-08 Thread Henrique Dallazuanna

Then, you can try this:

xl <- COMCreate("Excel.Application")
wk <- xl$Workbooks()
sh <- wk$Open(normalizePath("sample_file.xls"))$Sheets()$Count()

wk$Close()
xl$Quit()



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


Re: [R] RDCOMClient: how to close Excel process?

2009-07-08 Thread Lauri Nikkinen

Thanks but that did not work. xl$Quit() does not kill the Excel
process and sample_file.xls will not open.

I'm using Windows XP SP2 and R 2.8.1

-L


[R] R regular expression to extract words with the query string.

2009-07-08 Thread Praveen Surendran

Thanks Henrique.

This is indeed short and quite simple compared to what I was using, which
goes like...

unlist(strsplit(i,split=" "))[grep("ENSP",unlist(strsplit(i,split=" ")))]
:)

Cheers,

Praveen.

 


Re: [R] error: no such index at level 2

2009-07-08 Thread Godmar Back

On Wed, Jul 8, 2009 at 4:22 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

 Hi

 r-help-boun...@r-project.org napsal dne 07.07.2009 19:06:17:

  Hi,
 
  I am confused about how to select elements from a list.
 
  I'm trying to select all rows of a table 'crossRsorted' such that the
  mean of a related vector is  0.  The related vector is accessible as
  a list element l[[i]] where i is the row index.
 
  I thought this would work:
 
   crossRsorted[mean(q[[ crossRsorted[,1] ]], na.rm = TRUE)  0, ]
  Error in q[[crossRsorted[, 1]]] : no such index at level 2

 Strange, I got completely different error. Couldn't be that only ***you***
 have crossRsorted?


Ok, fair enough. I'm still thinking of a language in which the meaning of
operators is apparent from their syntactical structure - probably need to
read more of The R Inferno.

Here's an example that reproduces the problem, I think (though the error
message is slightly different):

> q <- list()
> q[[105]] <- as.numeric(c(0,0,1))
> q[[104]] <- as.numeric(c(1,1,1))
> q[[10]] <- as.integer(c(3,3,1))
> crossRsorted <- data.frame(i = c(105, 104,10))
> q[[ crossRsorted[,1] ]]
Error in q[[crossRsorted[, 1]]] : recursive indexing failed at level 2

Even though the list 'q' has component 105, 104, and 10, the expression q[[
crossRsorted[,1] ]] causes an error.
Why?

And why does this work:

> q[[c(105)]]
[1] 0 0 1

but not this:

> q[[c(105,104)]]
Error in q[[c(105, 104)]] : subscript out of bounds
> q[[c(105,104,10)]]
Error in q[[c(105, 104, 10)]] : recursive indexing failed at level 2

even though q[[105]], q[[104]], and q[[10]] are perfectly legitimate items?

Coming back to my question: how do I express "select all i in a vector for
which q[[i]] meets some predicate", where q is a list?

Thank you for the tip about 'str' - that's the typeof function I've been
craving. (I thought 'attributes' or 'summary' was all there was...)

In my original problem, the output of str is:


 str(crossRsorted)
'data.frame':   15750 obs. of  5 variables:
 $ i : num  105 104 9 8 10 9 98 97 10 8 ...
 $ j : num  104 105 8 9 9 10 97 98 8 10 ...
 $ r : num  -0.973 -0.973 0.764 0.764 0.744 ...
 $ n : num  135 135 138 138 138 138 136 136 138 138 ...
 $ pvalue: num  2.90e-86 2.90e-86 0.00 0.00 0.00 ...

and

 str(q)
List of 165
 $ : NULL
 $ : NULL
 $ : NULL
 $ : NULL
 $ :'data.frame':   138 obs. of  1 variable:
  ..$ howdidyouhear: chr [1:138] 0 3 3 3 3 ...
 $ :'data.frame':   138 obs. of  1 variable:
  ..$ approximatelywhendidyoustart: int [1:138] 0 0 5 1 5 5 1 2 6 0 ...
[ main body deleted ]
 $ :'data.frame':   138 obs. of  1 variable:
  ..$ revisiontestpage: num [1:138] 0 0 0 0 0 0 0 0 0 0 ...

basically - a heterogeneous sparse list of NULL and data.frames of types
character, num, and int.

However - by construction - the q[[i]] for i in crossRsorted[,1] are all
non-NULL, as in my small reproducible example above.

with data frame and list

 df1[sapply(list1,mean)0,]

 selects rows of df1 which correspond to list elements with mean 0


I can't run 'sapply' over my list because sapply will also iterate over the
NULLs. I want to access only those components in list1 that occur in
df1[1,].

 - Godmar


Re: [R] error: no such index at level 2

2009-07-08 Thread Henrique Dallazuanna

It's because '[[' accepts only a single element, so you need to use '[':

q[crossRsorted[,1]]







-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


[R] functions to calculate t-stats, etc. for lm.fit objects?

2009-07-08 Thread Whit Armstrong

I'm running a huge number of regressions in a loop, so I tried lm.fit
for a speedup.  However, I would like to be able to calculate the
t-stats for the coefficients.

Does anyone have some functions for calculating the regression summary
stats of an lm.fit object?

Thanks,
Whit


Re: [R] error: no such index at level 2

2009-07-08 Thread Godmar Back

On Wed, Jul 8, 2009 at 9:40 AM, Henrique Dallazuanna www...@gmail.com wrote:

 It's because '[[' accepts only a single element, so you need to use '[':

 q[crossRsorted[,1]]


This appears to be doing something different. For instance, my 'q' has 165
components, but what you suggest has 15750:
> length(q)
[1] 165
> length(q[ crossRsorted[,1] ])
[1] 15750

hardly what I want.

Meanwhile, it looks as though [[ ]] does not vectorize its arguments; it
curries them!

Note that:

> q[[c(105,104)]]
Error in q[[c(105, 104)]] : subscript out of bounds

gives the same error as:

> q[[105]][[104]]
Error in q[[105]][[104]] : subscript out of bounds

Very mysterious, though, in all fairness, explained in help("[[") where it
says:

 '[[' can be applied recursively to lists, so that if the single
 index 'i' is a vector of length 'p', 'alist[[i]]' is equivalent to
 'alist[[i1]]...[[ip]]' providing all but the final indexing
 results in a list.

which leads back to square one: how to express "select all r[i] where q[[i]]
fulfills some predicate"?
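
A minimal sketch of the behaviour the help page describes, using the small list from earlier in the thread plus a hypothetical nested list (the names nested, idx, sub and means are illustrations, not from the original posts). A vector inside [[ ]] is applied one level at a time, whereas [ ] takes several elements at once and returns a list:

```r
# The sparse list from the reproducible example
q <- list()
q[[105]] <- c(0, 0, 1)
q[[104]] <- c(1, 1, 1)
q[[10]]  <- c(3L, 3L, 1L)

# '[[' with a vector index walks down nested levels
nested <- list(a = list(b = 42))
deep <- nested[[c("a", "b")]]        # same as nested[["a"]][["b"]]

# So q[[c(105, 104)]] means q[[105]][[104]], which fails here
err <- tryCatch(q[[c(105, 104)]], error = function(e) conditionMessage(e))

# '[' selects several elements and always returns a list
idx <- c(105, 104, 10)
sub <- q[idx]

# Apply a function to each selected element
means <- sapply(sub, mean)
```

Since sub is an ordinary list, any predicate can then be applied element by element with sapply or vapply.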

 - Godmar


Re: [R] Import xlsx file in Ubuntu 9.04

2009-07-08 Thread Marc Schwartz


On Jul 8, 2009, at 6:56 AM, Rodrigo Aluizio wrote:


Hi list,
For the last two weeks I have been looking for a way to import xlsx files
directly into R on a Linux OS (Ubuntu 9.04). I have already read the R
Import/Export guide, and I know how to use gdata to import xls files and
read.table to import .csv. My problem is that all the data I receive is in
xlsx format, and I have to convert all the files to xls.
Well, when I was using Windows Vista, RODBC did the trick with the
odbcConnectExcel2007 function (which I know is not present in the Linux
RODBC package, probably due to driver issues). Isn't there a way to import
these xlsx files directly into R without any prior conversion (.csv or
.xls)?

Thank you for your attention. It's probable that someone has already asked
this; I even remember seeing it somewhere, but without a definitive answer.

Rodrigo.




Your best bet on Linux would be to open the Excel 2007 files using  
OpenOffice's Calc and save them to CSV files. The latest versions of  
OpenOffice will open Office 2007 files.
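
Once a sheet has been saved as CSV, reading it is a one-liner. A minimal sketch, assuming a comma-separated export with a header row; the file here is a temporary stand-in written by the example itself, and the column names are made up for illustration:

```r
# Stand-in for a CSV file exported from Calc (hypothetical contents)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("site,depth,count",
             "A,10.5,3",
             "B,22.0,7"), csv_path)

# Read it back; stringsAsFactors = FALSE keeps text columns as character
dat <- read.csv(csv_path, stringsAsFactors = FALSE)
str(dat)
```

If Calc was set to export with a different separator or decimal mark, the sep and dec arguments of read.csv (or read.csv2) would need adjusting.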


An alternative of course would be to see if it is reasonable for the  
providers of the files to save them in the older XLS format instead,  
or to see if they have other file formats that they can send you  
rather than using Excel at all.


There is a very preliminary Perl module in progress, that should  
eventually provide for a more efficient path:


  http://search.cpan.org/dist/Spreadsheet-XLSX/

But from what I have seen, there are enough problems with it  
(including data integrity issues), that I would not use it in  
production work.


Unfortunately, I don't believe that you have a lot of options on Linux  
at the moment.


HTH,

Marc Schwartz


Re: [R] Reading from Google Docs

2009-07-08 Thread Farrel Buchinsky

I have previously read "R Installation and Administration". I read it
again. It does not help me.
The relevant paragraph is below, but I need lower-level instructions. Where
can I find them?

R CMD INSTALL works in Windows to install source packages if you have the
source-code package files (option "Source Package Installation Files" in the
installer) and toolset (see "The Windows toolset",
file:///C:/Program%20Files/R/R-2.9.1/doc/manual/R-admin.html#The-Windows-toolset)
installed. Installation of binary packages must be done by install.packages.
R CMD INSTALL --help will tell you the current options under Windows
(which differ from those on a Unix-alike): in particular there is a choice
of the types of documentation to be installed.
Farrel Buchinsky
Google Voice Tel: (412) 567-7870



2009/6/19 Uwe Ligges lig...@statistik.tu-dortmund.de

 See the manual R Installation and Administration for information on how
 to install source packages on Windows.

 Uwe Ligges

 Farrel Buchinsky wrote:

 After issuing tar xvfz RgoogleDocs_0.2.2-src.tar.gz I am getting an error
 message:
 'tar' is not recognized as an internal or external command, operable
 program or batch file.

 Should I use my 7-zip to open up the archive?
 Where should I be doing this? For instance can I do it all in my
 download directory or should I do it in C:\Program
 Files\R\R-2.9.0\library or should I manually create C:\Program
 Files\R\R-2.9.0\library\RGoogleDocs and do it all there or will the Rcmd
 INSTALL RGoogleDocs_0.2-2.tar.gz command do that for me.

 Yes, you assumed correctly. I am using Windows XP.
 Farrel Buchinsky
 Google Voice Tel: (412) 567-7870



 On Thu, Jun 18, 2009 at 20:17, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:

  I haven't been following this thread but:

 1. if RGoogleDocs_0.2-2.tar.gz is a source distribution (as
 opposed to built source) then the first line renames it so
 that it's not the same name as the built file about to be created.
 The second line untars it into the RGoogleDocs directory.  The third
 builds the built source file, RGoogleDocs_0.2-2.tar.gz.  The fourth
 installs the built source file into R.  I've assumed Windows.
 If you are on Linux replace rename with mv.

 rename RGoogleDocs_0.2-2.tar.gz RgoogleDocs_0.2.2-src.tar.gz
 tar xvfz RgoogleDocs_0.2.2-src.tar.gz
 Rcmd build RGoogleDocs
 Rcmd INSTALL RGoogleDocs_0.2-2.tar.gz

 or

 2. if RGoogleDocs_0.2-2.tar.gz is already a built source file then you
 can just issue the last of the above lines and don't need
 the others.

 On Thu, Jun 18, 2009 at 7:52 PM, Farrel Buchinsky fjb...@gmail.com
 wrote:

 What do you mean by cd the.directory.containing.RGoogleDocs?
 Do you mean the directory where I downloaded
 RGoogleDocs_0.2-2.tar.gz to? Or do you mean that I must create a
 directory called RGoogleDocs under Library and then change to that
 directory?
 Farrel Buchinsky
 Google Voice Tel: (412) 567-7870



 On Mon, Mar 2, 2009 at 22:16, Gabor Grothendieck ggrothendi...@gmail.com
 wrote:

 Finally enter into the Windows console:

 cd the.directory.containing.RGoogleDocs
 Rcmd build RGoogleDocs
 Rcmd INSTALL RGoogleDocs_1.0.0.tar.gz

 except replace RGoogleDocs_1.0.0.tar.gz with the filename
 created by the build.



Re: [R] R regular expression to extract words with the query string.

2009-07-08 Thread Jorge Ivan Velez

Dear Praveen,
Try also:

strsplit(i, ' ')[[1]][2]
# [1] "pid:ENSP12345"

HTH,

Jorge



Re: [R] R regular expression to extract words with the query string.

2009-07-08 Thread Gabor Grothendieck

Try this:

library(gsubfn)
i <- "transcript:ENST112334 pid:ENSP12345"
strapply(i, paste("\\w*", "ENSP", "\\w*", sep = ""), c, simplify = unlist)

This says to match any number (possibly zero) of word
characters followed by ENSP followed by more word
characters.  c just returns the match without
further processing and unlist unlists the result giving
a character vector (which otherwise would be a list).

See http://gsubfn.googlecode.com for more info.
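
For comparison, a base R sketch of the same idea using gregexpr and regmatches (the latter is only available in more recent versions of R). The pattern is an assumption on my part: since ":" is not a word character, "\\w*" stops at the colon, so this version matches runs of non-space characters instead in order to keep the "pid:" prefix that the original question asked for.

```r
i <- "transcript:ENST112334 pid:ENSP12345"

# Any run of non-space characters that contains the query string
query   <- "ENSP"
pattern <- paste("[^ ]*", query, "[^ ]*", sep = "")

# gregexpr finds every match; regmatches extracts the matched text
hits <- regmatches(i, gregexpr(pattern, i))[[1]]
hits
```

Because i can be a character vector, dropping the [[1]] gives one vector of matches per input string.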


Re: [R] Reading from Google Docs

2009-07-08 Thread Duncan Murdoch


On 08/07/2009 10:02 AM, Farrel Buchinsky wrote:

I  have previously read R Installation and Administration. I read it
again. It does not help me
The relevant paragraph is below. But I need lower level instructions. Where
can I find them.


Follow the link.  If Windows can't find tar, your toolset is installed 
incorrectly.


Duncan Murdoch




Re: [R] Reading from Google Docs

2009-07-08 Thread Farrel Buchinsky

Forgive my naivety, but how do I make Windows find tar? In other words, from
where do I issue the command, and what is the command?
Farrel Buchinsky
Google Voice Tel: (412) 567-7870
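
One way to check from within R whether tar is visible is sketched below. It only inspects the search path; it does not install or configure anything, and the variable names are illustrative. If tar is not found, the directory containing the toolset's executables presumably needs to be added to the PATH environment variable, as the manual's toolset section describes.

```r
# Sys.which returns the full path of an executable, or "" if it is not found
tar_path <- Sys.which("tar")

if (nzchar(tar_path)) {
  message("tar found at: ", tar_path)
} else {
  message("tar is not on the PATH")
}

# List the directories that are searched for executables
path_entries <- strsplit(Sys.getenv("PATH"), .Platform$path.sep,
                         fixed = TRUE)[[1]]
print(path_entries)
```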




Re: [R] Fitting a trend-line

2009-07-08 Thread anupam sinha

Thanks a lot for all your suggestions.

Regards,

Anupam


[R] truncated regression out-of-sample predictions

2009-07-08 Thread Wouterse, Fleur (IFPRI-Senegal)

Dear all, 

 

I am trying to implement Simar & Wilson's (2007) second algorithm and
have the following question: if I use a truncated regression on the m < n
observations, how do I get fitted values for all n observations, instead
of only for the m observations, which is what the command fitted returns? I
would need these to construct the left-truncation point needed to draw n
random deviates.
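
A rough sketch of one way this could be done, under stated assumptions: the coefficient vector beta_hat and the scale sigma_hat below are made-up stand-ins for the estimates from the truncated regression, the covariate z and the data are simulated, and the truncation point 1 - z'beta is my reading of the algorithm, so please check it against the paper. The idea is simply to multiply the model matrix of all n observations by the estimated coefficients rather than relying on fitted.

```r
set.seed(42)
n   <- 50
dat <- data.frame(z = runif(n))          # hypothetical covariate, all n rows

# Hypothetical estimates standing in for the truncated-regression output
beta_hat  <- c("(Intercept)" = 1.0, z = 0.5)
sigma_hat <- 0.2

# Linear predictor for every observation, not only the m used in the fit
Z  <- model.matrix(~ z, data = dat)
xb <- drop(Z %*% beta_hat)

# Left-truncation point for each observation
lower <- 1 - xb

# Draw n deviates from N(0, sigma_hat^2) truncated on the left at 'lower',
# using inverse-CDF sampling
u   <- runif(n, min = pnorm(lower / sigma_hat), max = 1)
eps <- sigma_hat * qnorm(u)
```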

 

Thanks for your help, 

 

Fleur

 

Fleur Wouterse, Ph.D. 

Post-Doctoral Fellow 

IFPRI-Dakar 

Immeuble Ousseynou Thiam Gueye 

Rue de Thies

Point E, BP 15702 CP 12524

Dakar Fann

Senegal

Phone: +221 33 869 3986

Email: f.woute...@cgiar.org

 



[R] recoding strings containing colons

2009-07-08 Thread Donald Braman

Curious to know if recode can work with strings containing colons.  I
haven't gotten it to work yet, but perhaps there is a way?
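
A possible base R workaround, assuming the goal is a simple one-to-one mapping of values: a named character vector used as a lookup table treats colons like any other character. The values and names below are made up for illustration; this sidesteps recode rather than showing how to make it accept colons.

```r
x <- c("a:b", "c:d", "a:b", "e:f")

# Old value = new value; names may contain colons when quoted
lookup <- c("a:b" = "first", "c:d" = "second")

# Look up each element by name; unmatched values come back as NA
recoded <- unname(lookup[x])

# Keep the original value where no mapping was given
recoded[is.na(recoded)] <- x[is.na(recoded)]
recoded
```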

Donald Braman
http://www.culturalcognition.com/braman/


Re: [R] error: no such index at level 2

2009-07-08 Thread Godmar Back

On Wed, Jul 8, 2009 at 9:40 AM, Henrique Dallazuanna www...@gmail.com wrote:

 It's because '[[' accepts only a single element, so you need to use '[':

 q[crossRsorted[,1]]


Henrique,

I figured out what q[crossRsorted[,1]] does - it produces q[i] for all i in
crossRsorted[,1]. Ok. Since a given index 'k' of q[[k]] can occur in
multiple rows in crossRsorted[,1], this is not what I want.

Meanwhile, I was able to express what I do want like so:

crossRsorted[Filter(function (idx) mean(q[[idx]], na.rm = TRUE),
unique(crossRsorted[,1])), ]

but, I'm afraid, that's not really R style.  Or is it?  But perhaps the
only way?
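
One alternative that may read as more conventional R is sketched below with toy data. It is an illustration under assumptions: the comparison "> 0" is my guess at the intended predicate (the archive appears to have dropped the operator), and the values in q and crossRsorted are made up. The means are computed once per distinct index and then mapped back to the rows with match, so repeated indices are handled and a logical vector does the row selection.

```r
# Toy versions of the objects in the thread
q <- list()
q[[105]] <- c(0, 0, 1)
q[[104]] <- c(-1, -1, -1)
q[[10]]  <- c(3L, 3L, 1L)
crossRsorted <- data.frame(i = c(105, 104, 10, 105),
                           r = c(-0.9, -0.9, 0.7, 0.5))

# Evaluate the statistic once for each distinct index
ids <- unique(crossRsorted$i)
m   <- vapply(ids, function(k) mean(q[[k]], na.rm = TRUE), numeric(1))

# Map the per-index result back onto the rows and select with a logical
keep     <- m[match(crossRsorted$i, ids)] > 0
selected <- crossRsorted[keep, ]
selected
```

Selecting with a logical vector avoids a possible pitfall of passing the indices themselves to the row subscript, where they would be read as row numbers.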

I think I'm starting to see the allure of R: every indexing task ends up a
challenging puzzle.
Which prevents Alzheimer's [1].

 - Godmar

[1] http://www.timesonline.co.uk/tol/life_and_style/article508785.ece


[R] please remove me from this list

2009-07-08 Thread Curley, Jane

 



Re: [R] functions to calculate t-stats, etc. for lm.fit objects?

2009-07-08 Thread Marc Schwartz


On Jul 8, 2009, at 8:51 AM, Whit Armstrong wrote:


I'm running a huge number of regressions in a loop, so I tried lm.fit
for a speedup.  However, I would like to be able to calculate the
t-stats for the coefficients.

Does anyone have some functions for calculating the regression summary
stats of an lm.fit object?

Thanks,
Whit




Whit, depending upon just how much time savings you are realizing by  
using lm.fit() and not lm(), the approach to your question may vary.


Do you need all of the models, or only a subset?

If the latter, then I would narrow down your model set and re-run them  
with lm() so that you can use summary.lm() directly. That would entail  
less custom coding, which may otherwise offset any time savings from  
using lm.fit()


If the former, then there are two choices as I see them.

The first would be to restructure the object resulting from lm.fit()  
by adding the elements required to run summary.lm(). However, I would  
think that this overhead would bring you back to a point where just  
using lm() would be a better approach from a time standpoint.


The second would be to cook up a function that only provides the  
subset of results that you need from summary.lm() and then use that on  
the results of lm.fit(). Here again, there remains the question of  
just how much time are you saving using lm.fit() versus the additional  
overhead of calculating even a subset of the output.


Here is a very simple approach to a function that would get you a  
subset of the output that you would get using, for example,  
coef(summary(lm.object)). This is using a selective approach of  
copying and slightly editing code from summary.lm(). Note that there  
is other code in summary.lm() to handle weights and such, if your  
models are more complex. You would need to add that in if that is the  
case.


If you need much more summary output than this on each model, then I  
think you would be better off just using lm() and summary.lm().



# Use at your own risk...untested on more complex models  :-)

# 'x' is an lm.fit object

calc.lm.t <- function(x)
{
  Qr <- x$qr
  r <- x$residuals
  p <- x$rank
  p1 <- 1L:p
  rss <- sum(r^2)

  n <- NROW(Qr$qr)
  rdf <- n - p

  resvar <- rss/rdf
  R <- chol2inv(Qr$qr[p1, p1, drop = FALSE])
  se <- sqrt(diag(R) * resvar)

  est <- x$coefficients[Qr$pivot[p1]]
  tval <- est/se

  res <- cbind(est = est, se = se, tval = tval)
  res
}



Here is some simple example data:

set.seed(1)
y <- rnorm(100)
x <- rnorm(100)


# Get the default coefficient output using summary.lm()
> coef(summary(lm(y ~ x)))
                 Estimate Std. Error     t value  Pr(>|t|)
(Intercept)  0.1088521158 0.09034800  1.20480938 0.2311784
x           -0.0009323697 0.09472155 -0.00984327 0.9921663



# Now use calc.lm.t

lmf <- lm.fit(model.matrix(y ~ x), y)

> calc.lm.t(lmf)
                      est         se        tval
(Intercept)  0.1088521158 0.09034800  1.20480938
x           -0.0009323697 0.09472155 -0.00984327



I'll leave it to you to see whether this approach may or may not be  
helpful from a time perspective.
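
If p-values are needed as well, they can be derived from the t-values and the residual degrees of freedom. The following is only a sketch under the same assumptions as the function above (unweighted, full-rank model); the variable names are illustrative and not part of any package. It cross-checks the result against coef(summary(lm())).

```r
# Sketch: t-values and p-values from an lm.fit() result (unweighted, full-rank case).
set.seed(1)
y <- rnorm(100)
x <- rnorm(100)

fit <- lm.fit(model.matrix(y ~ x), y)

p      <- fit$rank
rdf    <- fit$df.residual                      # residual degrees of freedom, n - p
resvar <- sum(fit$residuals^2) / rdf           # residual variance
# (X'X)^{-1} recovered from the R factor of the QR decomposition
XtXinv <- chol2inv(fit$qr$qr[seq_len(p), seq_len(p), drop = FALSE])
se     <- sqrt(diag(XtXinv) * resvar)
tval   <- fit$coefficients / se
pval   <- 2 * pt(abs(tval), df = rdf, lower.tail = FALSE)

res <- cbind(est = fit$coefficients, se = se, tval = tval, pval = pval)

# Cross-check against the standard summary.lm() output
ref <- coef(summary(lm(y ~ x)))
all.equal(unname(res), unname(ref))
```

Whether this is still faster than lm() plus summary.lm() once the extra arithmetic is included would need to be timed on the real data.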


HTH,

Marc Schwartz


[R] Passing arguments to with()

2009-07-08 Thread Tymek Wołodźko

Hi,

I've been wondering how to write a function that will produce results
from multiple tests (eg. paired t-tests) for all or several variables
in some data frame. I'd like it to do t-test for each variable ('x')
in 'data' by 'y'. I'm stuck in here:

function(data,y) {
for (x in names(data)) {
with(data, t.test(x~y))
}}

How to tell 'with' that 'x' and 'y' are names of columns in 'data'? Or
pass similar arguments?

I probably understand the logic why this is not working, but still
don't know how to make it work.

Thanks in advance for any help!
Timo


[R] Formatting a Table

2009-07-08 Thread cvandy


I've created a short program to print a table of learning curve factors. 
However, I cannot figure out how to format the table to:
1) Get rid of the [1]s in the first column and replace it with the values of
N.
2) Line up the first row with the factors (decimal fractions).
Thanks for any help.
The complete program and output is as follows:

> Lc<-seq(0.70,0.95,0.05) #Specify learning curves
> T<-function(N,Lc)  #Create a function to calc.time for Nth unit
+ {
+ N^(log(Lc,10)/log(2,10))  #Function
+ }
> for (N in seq(2,10,2))
+ {if (N==2){print(T(N,Lc)*100)}else{print(T(N,Lc),digits=3)}}
[1] 70 75 80 85 90 95
[1] 0.490 0.562 0.640 0.722 0.810 0.902
[1] 0.398 0.475 0.562 0.657 0.762 0.876
[1] 0.343 0.422 0.512 0.614 0.729 0.857
[1] 0.306 0.385 0.477 0.583 0.705 0.843

-- 
View this message in context: 
http://www.nabble.com/Formatting-a-Table-tp24391433p24391433.html
Sent from the R help mailing list archive at Nabble.com.


[R] Two-way ANOVA gives different results using anova(lm()) than doing it by hand

2009-07-08 Thread Lars Bergemann


Hey!

 

Could you please take a quick look at what I have done? Somehow I get wrong 
results using the anova(lm()) combination compared to doing a two way ANOVA by 
hand.

 

Running:

 

Data<-read.table("Data.txt");
g<-lm(ExM~S1*S2,Data);
anova(g);

 

Gives:

 

Analysis of Variance Table

Response: ExM
           Df Sum Sq Mean Sq F value    Pr(>F)
S1          1 4.3679  4.3679 167.045 < 2.2e-16 ***
S2          1 0.9427  0.9427  36.053 8.236e-09 ***
S1:S2       1 0.3231  0.3231  12.357 0.0005371 ***
Residuals 212 5.5434  0.0261


I compared it to the work done by hand, ie calculated all the different square 
sums using sum() and tapply().

So I know that anova(lm()) gets the degrees of freedom equal to 1, 1, 1 and 
212 when it should be 5, 5, 25 and 180. Also, the square sums are quite 
different ... I get 4.xx, 4.xx, 1.xx, 0.xx ... as you see, what anova(lm()) 
gets is different.

 

The data: S1 has 6 levels, so has S2. On average, each cell has 6 values, most 
cells have actually 6 values, and there are two of each: 5, 7, 4, 8 - so 
average 6.

 

Could you please help me, why it does not work with anova(lm())? I tried quite 
a few things found with Google, but it all gave me the same result as 
anova(lm()) ... 
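
One possible explanation, offered here as an assumption rather than a confirmed diagnosis: degrees of freedom of 1, 1, 1 and 212 are what lm() produces when S1 and S2 are read in as numeric columns and therefore fitted as continuous covariates. The sketch below uses simulated data (not the data from this post) with the same 6 x 6 design and 6 replicates per cell to show how the degrees of freedom change once both columns are wrapped in factor().

```r
# Simulated stand-in for the posted data: 6 levels of S1, 6 levels of S2,
# 6 replicates per cell (216 rows). The response is random noise.
set.seed(1)
d <- expand.grid(S1  = c(0, 2.5, 5, 10, 20, 40),
                 S2  = c(0, 0.039, 0.078, 0.156, 0.313, 0.625),
                 rep = 1:6)
d$ExM <- rnorm(nrow(d))

# Numeric predictors: one slope per term, hence 1 df each
df_numeric <- anova(lm(ExM ~ S1 * S2, data = d))$Df

# Factors: (levels - 1) df per main effect, 5 * 5 for the interaction
df_factor <- anova(lm(ExM ~ factor(S1) * factor(S2), data = d))$Df

df_numeric
df_factor
```

With unbalanced cells, as described above, the sequential sums of squares from anova() also depend on the order of the terms, so a hand calculation may still differ slightly.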

 

Thanks a lot!

 

Lars

_



S1  S2  ExO ExM
1.000   0.000   0.000   0.819   0.830
2.000   0.000   0.000   0.835   0.846
3.000   0.000   0.000   0.891   0.902
4.000   0.000   0.000   0.905   0.916
5.000   0.000   0.000   0.839   0.850
6.000   2.500   0.000   0.863   0.874
7.000   2.500   0.000   0.898   0.909
8.000   2.500   0.000   0.887   0.898
9.000   2.500   0.000   0.909   0.920
10.000  2.500   0.000   0.892   0.903
11.000  2.500   0.000   0.886   0.897
12.000  5.000   0.000   0.841   0.852
13.000  5.000   0.000   0.881   0.892
14.000  5.000   0.000   0.874   0.885
15.000  5.000   0.000   0.873   0.884
16.000  5.000   0.000   0.886   0.897
17.000  5.000   0.000   0.858   0.869
18.000  10.000  0.000   0.709   0.720
19.000  10.000  0.000   0.702   0.713
20.000  10.000  0.000   0.727   0.738
21.000  10.000  0.000   0.737   0.748
22.000  10.000  0.000   0.762   0.773
23.000  10.000  0.000   0.716   0.727
24.000  20.000  0.000   0.381   0.392
25.000  20.000  0.000   0.437   0.448
26.000  20.000  0.000   0.443   0.454
27.000  20.000  0.000   0.412   0.423
28.000  20.000  0.000   0.414   0.425
29.000  20.000  0.000   0.362   0.373
30.000  40.000  0.000   0.034   0.045
31.000  40.000  0.000   0.030   0.041
32.000  40.000  0.000   0.036   0.047
33.000  40.000  0.000   0.062   0.073
34.000  40.000  0.000   0.063   0.074
35.000  40.000  0.000   0.085   0.096
36.000  0.000   0.039   0.573   0.584
37.000  0.000   0.039   0.337   0.348
38.000  0.000   0.039   0.557   0.568
39.000  0.000   0.039   0.422   0.433
40.000  0.000   0.039   0.542   0.553
41.000  0.000   0.039   0.428   0.439
42.000  0.000   0.078   0.293   0.304
43.000  0.000   0.078   0.346   0.357
44.000  0.000   0.078   0.241   0.252
45.000  0.000   0.078   0.261   0.272
46.000  0.000   0.078   0.298   0.309
47.000  0.000   0.156   0.223   0.234
48.000  0.000   0.156   0.215   0.226
49.000  0.000   0.156   0.196   0.207
50.000  0.000   0.156   0.238   0.249
51.000  0.000   0.156   0.276   0.287
52.000  0.000   0.156   0.294   0.305
53.000  0.000   0.156   0.291   0.302
54.000  0.000   0.313   0.194   0.205
55.000  0.000   0.313   0.186   0.197
56.000  0.000   0.313   0.204   0.215
57.000  0.000   0.313   0.336   0.347
58.000  0.000   0.313   0.315   0.326
59.000  0.000   0.313   0.251   0.262
60.000  0.000   0.625   0.211   0.222
61.000  0.000   0.625   0.203   0.214
62.000  0.000   0.625   0.182   0.193
63.000  0.000   0.625   0.336   0.347
64.000  0.000   0.625   0.383   0.394
65.000  0.000   0.625   0.364   0.375
66.000  0.000   0.625   0.255   0.266
67.000  2.500   0.039   0.519   0.530
68.000  2.500   0.039   0.503   0.514
69.000  2.500   0.039   0.491   0.502
70.000  2.500   0.039   0.490   0.501
71.000  2.500   0.039   0.509   0.520
72.000  2.500   0.039   0.546   0.557
73.000  5.000   0.039   0.483   0.494
74.000  5.000   0.039   0.462   0.473
75.000  5.000   0.039   0.449   0.460
76.000  5.000   0.039   0.422   0.433
77.000  5.000   0.039   0.418   0.429
78.000  5.000   0.039   0.428   0.439
79.000  10.000  0.039   0.321   0.332
80.000  10.000  0.039   0.296   0.307
81.000  10.000  0.039   0.273   0.284
82.000  10.000  0.039   0.275   0.286
83.000  10.000  0.039   0.308   0.319
84.000  10.000  0.039   0.325   0.336
85.000  20.000  0.039   0.146   0.157
86.000  20.000  0.039   0.129   0.140
87.000  20.000  0.039   0.122   0.133
88.000  20.000  0.039   0.096   0.107
89.000  20.000  0.039   0.113   0.124
90.000  20.000  0.039   0.119   0.130
91.000  40.000  0.039   0.031   0.042
92.000  40.000  0.039   0.035   0.046
93.000  40.000  0.039   0.034   0.045
94.000  40.000  0.039   0.035   0.046
95.000  40.000  0.039   0.072

Re: [R] functions to calculate t-stats, etc. for lm.fit objects?

2009-07-08 Thread Whit Armstrong

Marc,

Thanks very much for your detailed reply.  I'll give your code a try
and post back the time difference.

Cheers,
Whit


On Wed, Jul 8, 2009 at 10:50 AM, Marc Schwartz <marc_schwa...@me.com> wrote:
 On Jul 8, 2009, at 8:51 AM, Whit Armstrong wrote:

 I'm running a huge number of regressions in a loop, so I tried lm.fit
 for a speedup.  However, I would like to be able to calculate the
 t-stats for the coefficients.

 Does anyone have some functions for calculating the regression summary
 stats of an lm.fit object?

 Thanks,
 Whit



 Whit, depending upon just how much time savings you are realizing by using
 lm.fit() and not lm(), the approach to your question may vary.

 Do you need all of the models, or only a subset?

 If the latter, then I would narrow down your model set and re-run them with
 lm() so that you can use summary.lm() directly. That would entail less
 custom coding, which may otherwise offset any time savings from using
 lm.fit()

 If the former, then there are two choices as I see them.

 The first would be to restructure the object resulting from lm.fit() by
 adding the elements required to run summary.lm(). However, I would think
 that this overhead would bring you back to a point where just using lm()
 would be a better approach from a time standpoint.

 The second would be to cook up a function that only provides the subset of
 results that you need from summary.lm() and then use that on the results of
 lm.fit(). Here again, there remains the question of just how much time are
 you saving using lm.fit() versus the additional overhead of calculating even
 a subset of the output.

 Here is a very simple approach to a function that would get you a subset of
 the output that you would get using, for example, coef(summary(lm.object)).
 This is using a selective approach of copying and slightly editing code from
 summary.lm(). Note that there is other code in summary.lm() to handle
 weights and such, if your models are more complex. You would need to add
 that in if that is the case.

 If you need much more summary output than this on each model, then I think
 you would be better off just using lm() and summary.lm().


 # Use at your own risk...untested on more complex models  :-)

 # 'x' is an lm.fit object

 calc.lm.t <- function(x)
 {
  Qr <- x$qr
  r <- x$residuals
  p <- x$rank
  p1 <- 1L:p
  rss <- sum(r^2)

  n <- NROW(Qr$qr)
  rdf <- n - p

  resvar <- rss/rdf
  R <- chol2inv(Qr$qr[p1, p1, drop = FALSE])
  se <- sqrt(diag(R) * resvar)

  est <- x$coefficients[Qr$pivot[p1]]
  tval <- est/se

  res <- cbind(est = est, se = se, tval = tval)
  res
 }



 Here is some simple example data:

 set.seed(1)
 y <- rnorm(100)
 x <- rnorm(100)


 # Get the default coefficient output using summary.lm()
 > coef(summary(lm(y ~ x)))
                  Estimate Std. Error     t value  Pr(>|t|)
 (Intercept)  0.1088521158 0.09034800  1.20480938 0.2311784
 x           -0.0009323697 0.09472155 -0.00984327 0.9921663



 # Now use calc.lm.t

 lmf <- lm.fit(model.matrix(y ~ x), y)

 > calc.lm.t(lmf)
                       est         se        tval
 (Intercept)  0.1088521158 0.09034800  1.20480938
 x           -0.0009323697 0.09472155 -0.00984327



 I'll leave it to you to see whether this approach may or may not be helpful
 from a time perspective.

 HTH,

 Marc Schwartz




[R] #INCLUDE

2009-07-08 Thread Idgarad

What is R's equivalent to a C-like #include to incorporate external files. I
have a 2k line function that is generated and need to include it at runtime
but not manage it as a package (as it changes hourly.) Any ideas?


Re: [R] Reading from Google Docs

2009-07-08 Thread Duncan Murdoch


On 08/07/2009 10:13 AM, Farrel Buchinsky wrote:

Forgive my naïveté, but how do I make Windows find tar? In other words, from
where do I issue the command, and what is the command?


You need to install the toolset, and let the installer set your path.

Duncan Murdoch


Farrel Buchinsky
Google Voice Tel: (412) 567-7870



On Wed, Jul 8, 2009 at 10:09, Duncan Murdoch <murd...@stats.uwo.ca> wrote:


On 08/07/2009 10:02 AM, Farrel Buchinsky wrote:


I have previously read R Installation and Administration. I read it
again. It does not help me.
The relevant paragraph is below, but I need lower-level instructions.
Where can I find them?


Follow the link.  If Windows can't find tar, your toolset is installed
incorrectly.

Duncan Murdoch



R CMD INSTALL works in Windows to install source packages if you have the
source-code package files (option “Source Package Installation Files” in
the installer) and toolset (see The Windows toolset
<file:///C:/Program%20Files/R/R-2.9.1/doc/manual/R-admin.html#The-Windows-toolset>)
installed. Installation of binary packages must be done by
install.packages. R CMD INSTALL --help will tell you the current options
under Windows (which differ from those on a Unix-alike): in particular
there is a choice of the types of documentation to be installed.
Farrel Buchinsky
Google Voice Tel: (412) 567-7870



2009/6/19 Uwe Ligges <lig...@statistik.tu-dortmund.de>

See the manual R Installation and Administration for information on how
to install source packages on Windows.

Uwe Ligges

Farrel Buchinsky wrote:

After issuing tar xvfz RgoogleDocs_0.2.2-src.tar.gz I am getting an error
message:
'tar' is not recognized as an internal or external command, operable
program or batch file.

Should I use my 7-zip to open up the archive?
Where should I be doing this? For instance can I do it all in my
download directory or should I do it in C:\Program
Files\R\R-2.9.0\library or should I manually create C:\Program
Files\R\R-2.9.0\library\RGoogleDocs and do it all there or will the Rcmd
INSTALL RGoogleDocs_0.2-2.tar.gz command do that for me.

Yes, you assumed correctly. I am using Windows XP.
Farrel Buchinsky
Google Voice Tel: (412) 567-7870



On Thu, Jun 18, 2009 at 20:17, Gabor Grothendieck
<ggrothendi...@gmail.com> wrote:

I haven't been following this thread but:


1. if RGoogleDocs_0.2-2.tar.gz is a source distribution (as
opposed to built source) then the first line renames it so
that its not the same name as the built file about to be created.
The second line detars it into the RGoogleDocs directory.  The third
builds
the built source file, RGoogleDocs_0.2-2.tar.gz.  The fourth
installs the built source file into R.  I've assumed Windows.
If you are on Linux replace rename with mv.

rename RGoogleDocs_0.2-2.tar.gz RgoogleDocs_0.2.2-src.tar.gz
tar xvfz RgoogleDocs_0.2.2-src.tar.gz
Rcmd build RGoogleDocs
Rcmd INSTALL RGoogleDocs_0.2-2.tar.gz

or

2. if RGoogleDocs_0.2-2.tar.gz is already a built source file then you
can just issue the last of the above lines and don't need
the others.

On Thu, Jun 18, 2009 at 7:52 PM, Farrel Buchinsky <fjb...@gmail.com>
wrote:

What do you mean by cd the.directory.containing.RGoogleDocs?

Do you mean the directory where I downloaded the
RGoogleDocs_0.2-2.tar.gz to? Or do you mean that I must create a
directory called RGoogleDocs under Library and then change to that
directory?

Farrel Buchinsky
Google Voice Tel: (412) 567-7870



On Mon, Mar 2, 2009 at 22:16, Gabor Grothendieck
<ggrothendi...@gmail.com> wrote:

 Finally enter into the Windows console:

cd the.directory.containing.RGoogleDocs
Rcmd build RGoogleDocs
Rcmd INSTALL RGoogleDocs_1.0.0.tar.gz

except replace RGoogleDocs_1.0.0.tar.gz with the filename
created by the build.

   [[alternative HTML version deleted]]


Re: [R] #INCLUDE

2009-07-08 Thread Godmar Back

?source ?
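
A minimal sketch of what that could look like. The file content and the names generated_fn and gen_file are made up for illustration, and a temporary file stands in for the hourly generated one. source() parses and evaluates the file each time it is called, so re-running it picks up a regenerated version.

```r
# Stand-in for the generated file (hypothetical content)
gen_file <- tempfile(fileext = ".R")
writeLines("generated_fn <- function(x) x * 2", gen_file)

# Evaluate the file in its own environment so it does not clutter the workspace;
# plain source(gen_file) would define generated_fn in the global environment instead.
env <- new.env()
source(gen_file, local = env)

env$generated_fn(21)
```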

On Wed, Jul 8, 2009 at 11:16 AM, Idgarad <idga...@gmail.com> wrote:
 What is R's equivalent to a C-like #include to incorporate external files. I
 have a 2k line function that is generated and need to include it at runtime
 but not manage it as a package (as it changes hourly.) Any ideas?




[R] Comparing GAMMs

2009-07-08 Thread Paul Simonin


Greetings!
 I am looking for advice regarding the best way to compare GAMMs. I 
know other model outputs return enough information for R's AIC, ANOVA, 
etc. commands to function, but this is not the case with GAMM unless one 
specifies the gam or lme portion. I know these parts of the gamm contain 
items that will facilitate comparisons between gamms. Is it correct to 
simply use these values for this purpose? For example, the lme portion 
of the gamm returns a log likelihood value that could be used to 
calculate information criteria. However, I am wondering whether entire 
gamms can be compared using this, or only the lme part.
 Maybe my thinking about the lme and gam portions of gamms is 
incorrect? If this appears to be the case, let me know! In general, if 
someone could clarify my understanding in any way it would be much 
appreciated.

Thank you very much!
Sincerely,
Paul Simonin


Re: [R] Passing arguments to with()

2009-07-08 Thread Duncan Murdoch


On 08/07/2009 10:01 AM, Tymek Wołodźko wrote:

Hi,

I've been wondering how to write a function that will produce results
from multiple tests (eg. paired t-tests) for all or several variables
in some data frame. I'd like it to do t-test for each variable ('x')
in 'data' by 'y'. I'm stuck in here:

function(data,y) {
for (x in names(data)) {
with(data, t.test(x~y))
}}

How to tell 'with' that 'x' and 'y' are names of columns in 'data'? Or
pass similar arguments?


Don't use with.  Use t.test(data[[x]] ~ data[[y]]).
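
A sketch of how that suggestion might be wrapped in a function. The name t_tests_by, the example data frame and the restriction to numeric columns are assumptions made for illustration; 'y' is taken to be the name of a two-level grouping column.

```r
# Run one two-sample t-test per numeric column of 'data', grouped by column 'y'.
# 't_tests_by' is a hypothetical helper name.
t_tests_by <- function(data, y) {
  vars <- setdiff(names(data), y)
  vars <- vars[vapply(data[vars], is.numeric, logical(1))]
  out <- lapply(vars, function(x) t.test(data[[x]] ~ data[[y]]))
  names(out) <- vars
  out                                  # return the list of "htest" objects
}

# Made-up example data
set.seed(42)
d <- data.frame(g  = rep(c("a", "b"), each = 10),
                v1 = rnorm(20),
                v2 = rnorm(20))

res <- t_tests_by(d, "g")
sapply(res, function(h) h$p.value)
```

Collecting the results in a list matters here: a bare for loop, as in the original attempt, computes each test and then discards it.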

Duncan Murdoch



I probably understand the logic why this is not working, but still
don't know how to make it work.

Thanks in advance for any help!
Timo


Re: [R] Formatting a Table

2009-07-08 Thread Godmar Back

You could use 'cat(sprintf())', C-style:

> for (N in seq(2,10,2))
+ {if (N==2){cat(sprintf("%5d", T(N,Lc)*100),"\n")}else{cat(sprintf("%5.3f", T(N,Lc)), "\n")}}
   70    75    80    85    89    95
0.490 0.562 0.640 0.722 0.810 0.902
0.398 0.475 0.562 0.657 0.762 0.876
0.343 0.422 0.512 0.614 0.729 0.857
0.306 0.385 0.477 0.583 0.705 0.843
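
An alternative sketch that avoids the loop: build the whole table as a matrix and let the dimnames carry the values of N and the learning curves. This is not the code from the original post. It uses log2(Lc), which equals log(Lc,10)/log(2,10), and it labels the columns with the curve values instead of printing a separate percentage row.

```r
Lc <- seq(0.70, 0.95, 0.05)      # learning curves
N  <- seq(2, 10, 2)              # unit numbers

# Time factor for the Nth unit under each learning curve
tab <- outer(N, Lc, function(n, lc) n^log2(lc))
dimnames(tab) <- list(N = N, Lc = sprintf("%.2f", Lc))

print(round(tab, 3))
```

Because the row names are the values of N, the [1] prefixes disappear and all columns line up under their headers.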

On Wed, Jul 8, 2009 at 9:20 AM, cvandy <cvand...@gmail.com> wrote:

 I've created a short program to print a table of learning curve factors.
 However, I cannot figure out how to format the table to:
 1) Get rid of the [1]s in the first column and replace it with the values of
 N.
 2) Line up the first row with the factors (decimal fractions).
 Thanks for any help.
 The complete program and output is as follows:

 > Lc<-seq(0.70,0.95,0.05) #Specify learning curves
 > T<-function(N,Lc)  #Create a function to calc.time for Nth unit
 + {
 + N^(log(Lc,10)/log(2,10))  #Function
 + }
 > for (N in seq(2,10,2))
 + {if (N==2){print(T(N,Lc)*100)}else{print(T(N,Lc),digits=3)}}
 [1] 70 75 80 85 90 95
 [1] 0.490 0.562 0.640 0.722 0.810 0.902
 [1] 0.398 0.475 0.562 0.657 0.762 0.876
 [1] 0.343 0.422 0.512 0.614 0.729 0.857
 [1] 0.306 0.385 0.477 0.583 0.705 0.843

 --
 View this message in context: 
 http://www.nabble.com/Formatting-a-Table-tp24391433p24391433.html
 Sent from the R help mailing list archive at Nabble.com.




Re: [R] Formatting a Table

2009-07-08 Thread David Huffer

Cvandy, is this close to what you need:

  > printT <- function ( .seq = seq ( 2 , 10 , 2 ) ) {
  +   x <- t ( sapply ( .seq , T , Lc ) )
  +   x <- cbind (
  +     .seq
  +     , rbind (
  +       format ( x [ 1 , ] * 100 )
  +       , format ( x [ -1 , ] , digits = 3 )
  +     )
  +   )
  +   dimnames ( x ) [[2]] <- NULL
  +   print ( x , quote = FALSE )
  + }
  > printT ( )
       [,1] [,2]  [,3]  [,4]  [,5]  [,6]  [,7]
  [1,] 2    70    75    80    85    90    95
  [2,] 4    0.490 0.562 0.640 0.722 0.810 0.902
  [3,] 6    0.398 0.475 0.562 0.657 0.762 0.876
  [4,] 8    0.343 0.422 0.512 0.614 0.729 0.857
  [5,] 10   0.306 0.385 0.477 0.583 0.705 0.843

I'm not really sure what you mean by "Line up the first row with
the factors (decimal fractions)."

--
 David
 
 -
 David Huffer, Ph.D.   Senior Statistician
 CSOSA/Washington, DC   david.huf...@csosa.gov
 -


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of cvandy
Sent: Wednesday, July 08, 2009 9:21 AM
To: r-help@r-project.org
Subject: [R] Formatting a Table


I've created a short program to print a table of learning curve factors. 
However, I cannot figure out how to format the table to:
1) Get rid of the [1]s in the first column and replace it with the values of
N.
2) Line up the first row with the factors (decimal fractions).
Thanks for any help.
The complete program and output is as follows:

> Lc<-seq(0.70,0.95,0.05) #Specify learning curves
> T<-function(N,Lc)  #Create a function to calc.time for Nth unit
+ {
+ N^(log(Lc,10)/log(2,10))  #Function
+ }
> for (N in seq(2,10,2))
+ {if (N==2){print(T(N,Lc)*100)}else{print(T(N,Lc),digits=3)}}
[1] 70 75 80 85 90 95
[1] 0.490 0.562 0.640 0.722 0.810 0.902
[1] 0.398 0.475 0.562 0.657 0.762 0.876
[1] 0.343 0.422 0.512 0.614 0.729 0.857
[1] 0.306 0.385 0.477 0.583 0.705 0.843

-- 
View this message in context: 
http://www.nabble.com/Formatting-a-Table-tp24391433p24391433.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Comparing GAMMs

2009-07-08 Thread Gavin Simpson

On Wed, 2009-07-08 at 11:24 -0400, Paul Simonin wrote:
 Greetings!
   I am looking for advice regarding the best way to compare GAMMs. I 
 know other model outputs return enough information for R's AIC, ANOVA, 
 etc. commands to function, but this is not the case with GAMM unless one 
 specifies the gam or lme portion. I know these parts of the gamm contain 
 items that will facilitate comparisons between gamms. Is it correct to 
 simply use these values for this purpose? For example, the lme portion 
 of the gamm returns a log likelihood value that could be used to 
 calculate information criteria. However, I am wondering whether entire 
 gamms can be compared using this, or only the lme part.
   Maybe my thinking about the lme and gam portions of gamms is 
 incorrect? If this appears to be the case, let me know! In general, if 
 someone could clarify my understanding in any way it would be much 
 appreciated.
 Thank you very much!
 Sincerely,
 Paul Simonin

Hi Paul,

Are your GAMMs Gaussian (i.e. AMM) or non-Gaussian? If they are
Gaussian, then

anova(mod1$lme, mod2$lme)

gives an approximate LRT for the two models. That will also yield AIC
and BIC which might also be used for inference. Your AMM in this case is
just a linear mixed model and these usual forms of inference apply, with
the caveat that the hypothesis testing is approximate. You end up using
both the $lme and the $gam components for various aspects of model
inspection, interrogation etc, but for hypothesis testing, the lme bit
is sufficient. You can also use things like intervals(mod1$lme) to look
at confidence on the smoothing parameters. See Simon Wood's book [1]
section 6.7 for more details, and preceding sections on how the
smoothers can be formulated as a mixed model.

If your GAMMs are generalised then I'm not sure what the best approach
for comparison or hypothesis testing might be - especially as this is an
ongoing research topic for GLMMs, and also because of the method by
which GAMMs are fitted in mgcv. Simon Wood says as much in his 2006
monograph [1, page 318, section 6.6.2]. The non-Gaussian case uses
glmmPQL from package MASS, and this doesn't return a likelihood and
hence no AIC (in the same way that quasi families in glm() fits don't
return likelihoods).

So having said that, if you do have a likelihood, then you must be
fitting AMM via gamm() and the first half of my reply would seem most
appropriate.
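
For the Gaussian case, the comparison described above might be sketched as follows. The data are simulated and the model structure (one random intercept per group 'g', smooths of 'x' and 'z') is invented purely for illustration; the mgcv package is assumed to be installed.

```r
library(mgcv)   # gamm(); also loads nlme, which provides the anova() method for lme

# Simulated data: a smooth effect of x, no real effect of z, 10 groups
set.seed(1)
n   <- 200
dat <- data.frame(x = runif(n),
                  z = runif(n),
                  g = factor(rep(1:10, each = 20)))
dat$y <- sin(2 * pi * dat$x) + rnorm(10)[dat$g] + rnorm(n, sd = 0.3)

# Two Gaussian additive mixed models, fitted with gamm()'s default method
mod1 <- gamm(y ~ s(x),        random = list(g = ~1), data = dat)
mod2 <- gamm(y ~ s(x) + s(z), random = list(g = ~1), data = dat)

# Approximate likelihood ratio test, with AIC and BIC in the same table
cmp <- anova(mod1$lme, mod2$lme)
cmp
```

The $gam components remain the place to look for summaries and plots of the smooth terms themselves.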

[1] Wood, S.N. (2006) Generalized Additive Models: An Introduction with
R. Chapman & Hall/CRC.

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


[R] Randomizing a dataframe

2009-07-08 Thread Mark Na

Hi R-helpers,

I have a dataframe (called data) with trees in rows (n=100) and insect
species (n=10) in columns. My tree IDs are in a column called TREE and each
species has a column labeled SPEC1, SPEC2, SPEC3, etc...

I wish to randomize the values in my dataframe such that row and column
totals are held constant, i.e. in my randomized data each tree will have the
same number of individual insects as in the real data (constant row totals)
and each species will have the same number of individuals as in the real
data (constant column totals).

I will eventually want to do this many times, but I would appreciate help
getting started with the randomization.
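
One possible starting point, given only as a sketch: r2dtable() in the stats package draws random two-way tables with fixed row and column totals. The data frame below is simulated to match the layout described above (a TREE column plus SPEC1 to SPEC10); whether this particular null model is appropriate for the ecological question is a separate matter.

```r
# Simulated stand-in for the real data: 100 trees, 10 species, count data
set.seed(123)
counts <- matrix(rpois(100 * 10, lambda = 3), nrow = 100,
                 dimnames = list(NULL, paste0("SPEC", 1:10)))
data <- data.frame(TREE = seq_len(100), counts)

obs <- as.matrix(data[, -1])                     # counts only, TREE dropped

# One random table with the same row and column totals as 'obs'
rand <- r2dtable(1, rowSums(obs), colSums(obs))[[1]]

randomized <- data.frame(TREE = data$TREE, rand)
names(randomized) <- names(data)
```

Replacing the 1 in r2dtable() with a larger number returns a list of that many randomized tables in a single call.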

Thank you, Mark Na


Re: [R] error: no such index at level 2

2009-07-08 Thread David Huffer

Godmar, 

I don't follow...

  > q <- list ( )
  > q [[ 105 ]] <- as.numeric ( c ( 0 , 0 , 1 ) )
  > q [[ 104 ]] <- as.numeric ( c ( 1 , 1 , 1 ) )
  > q [[ 10 ]] <- as.integer ( c ( 3 , 3 , 1 ) )
  > crossRsorted <- data.frame ( i = c ( 105 , 104 , 10 ) )
  > q [ crossRsorted [ , 1 ] ]
  [[1]]
  [1] 0 0 1

  [[2]]
  [1] 1 1 1

  [[3]]
  [1] 3 3 1

  > length ( q [ crossRsorted [ , 1 ] ] )
  [1] 3


How'd you come up with

  > length(q)
  [1] 165
  > length(q[ crossRsorted[,1] ])
  [1] 15750

I must be missing something.  

--
 David
 
 -
 David Huffer, Ph.D.   Senior Statistician
 CSOSA/Washington, DC   david.huf...@csosa.gov
 -


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Godmar Back
Sent: Wednesday, July 08, 2009 9:58 AM
To: Henrique Dallazuanna
Cc: r-help@r-project.org; Petr PIKAL
Subject: Re: [R] error: no such index at level 2

On Wed, Jul 8, 2009 at 9:40 AM, Henrique Dallazuanna <www...@gmail.com> wrote:

 It's because '[[' accepts only a single element, so you need to use '[':

 q[crossRsorted[,1]]


This appears to be doing something different. For instance, my 'q' has 165
components, but what you suggest has 15750:
> length(q)
[1] 165
> length(q[ crossRsorted[,1] ])
[1] 15750

hardly what I want.

Meanwhile, it looks as though [[ ]] does not vectorize its arguments, it
curries them!

Note that:

> q[[c(105,104)]]
Error in q[[c(105, 104)]] : subscript out of bounds

gives the same error as:

> q[[105]][[104]]
Error in q[[105]][[104]] : subscript out of bounds

Very mysterious, though, in all fairness, explained in help("[[") where it
says:

 '[[' can be applied recursively to lists, so that if the single
 index 'i' is a vector of length 'p', 'alist[[i]]' is equivalent to
 'alist[[i1]]...[[ip]]' providing all but the final indexing
 results in a list.

which leads to square one: how to express "select all r[i] where q[[i]]
fulfills some predicate"?
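
To make the quoted help text concrete, and as one hedged way to express such a selection, consider the sketch below. The lists 'nested', 'q' and 'r' and the predicate are toy examples invented here, not the data from this thread.

```r
# Recursive indexing: a vector index inside [[ ]] walks down nested lists
nested <- list(a = list(10, 20), b = list(30, list(40, 50)))
identical(nested[[c(2, 2, 1)]], nested[[2]][[2]][[1]])

# Selecting elements of r according to a predicate on the matching q[[i]]
q <- list(c(0, 0, 1), c(1, 1, 1), c(3, 3, 1))
r <- c("x", "y", "z")

# vapply() applies the predicate to each q[[i]] and returns a logical vector
keep <- vapply(q, function(v) sum(v) >= 3, logical(1))
r[keep]
```

The logical vector produced by vapply() can then be used with single-bracket indexing, which is the vectorized form of subsetting.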

 - Godmar





Re: [R] error: no such index at level 2

2009-07-08 Thread Godmar Back

Sorry, I mixed my toy example to recreate the problem with the actual data set.

The 'crossRsorted' in the toy and in the actual are different. See my
latest posting in this thread.

 - Godmar

On Wed, Jul 8, 2009 at 11:55 AM, David Hufferdavid.huf...@csosa.gov wrote:
 Godmar,

 I don't follow...

   q <- list ( )
   q [[ 105 ]] <- as.numeric ( c ( 0 , 0 , 1 ) )
   q [[ 104 ]] <- as.numeric ( c ( 1 , 1 , 1 ) )
   q [[ 10 ]] <- as.integer ( c ( 3 , 3 , 1 ) )
   crossRsorted <- data.frame ( i = c ( 105 , 104 , 10 ) )
   q [ crossRsorted [ , 1 ] ]
  [[1]]
  [1] 0 0 1

  [[2]]
  [1] 1 1 1

  [[3]]
  [1] 3 3 1

   length ( q [ crossRsorted [ , 1 ] ] )
  [1] 3
  

 How'd you come up with

   length(q)
  [1] 165
   length(q[ crossRsorted[,1] ])
  [1] 15750

 I must be missing something.

 --
  David

  -
  David Huffer, Ph.D.               Senior Statistician
  CSOSA/Washington, DC           david.huf...@csosa.gov
  -


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf Of Godmar Back
 Sent: Wednesday, July 08, 2009 9:58 AM
 To: Henrique Dallazuanna
 Cc: r-help@r-project.org; Petr PIKAL
 Subject: Re: [R] error: no such index at level 2

 On Wed, Jul 8, 2009 at 9:40 AM, Henrique Dallazuanna www...@gmail.comwrote:

 Its because '[[' accept only element, so you need use '[':

 q[crossRsorted[,1]]


 This appears to be doing something different. For instance, my 'q' has 165
 components, but what you suggest has 15750:
 length(q)
 [1] 165
 length(q[ crossRsorted[,1] ])
 [1] 15750

 hardly what I want.

 Meanwhile, it looks as though [[ ]] does not vectorize its arguments, it
 curries them!

 Note that:

 q[[c(105,104)]]
 Error in q[[c(105, 104)]] : subscript out of bounds

 gives the same error as:

 q[[105]][[104]]
 Error in q[[105]][[104]] : subscript out of bounds

 Very mysterious, though, in all fairness, explained in help([[) where it
 says:

     '[[' can be applied recursively to lists, so that if the single
     index 'i' is a vector of length 'p', 'alist[[i]]' is equivalent to
     'alist[[i1]]...[[ip]]' providing all but the final indexing
     results in a list.

 which leads to square one: how to express select all r[i] where q[[i]]
 fulfills some predicate?

  - Godmar

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.






Re: [R] Reading from Google Docs

2009-07-08 Thread Gabor Grothendieck

It's safer just to add it to your path temporarily.

Unfortunately Rtools has a find command that conflicts with
the find command in Windows so if you add the Rtools
bin directory to your path permanently then you could
find other programs stop working.  That actually happened
to me once and it took the longest time until I discovered
that Rtools was the culprit.

If you follow the advice I gave, you normally won't have
that problem.
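
For what it's worth, the same "temporary" idea can also be sketched from inside an R session rather than the Windows console. This is only an illustration: the directory C:/Rtools/bin is an assumed install location, and the change lasts only for the running session and the processes it starts.

```r
# Hypothetical Rtools location; adjust to wherever Rtools is installed.
rtools_bin <- "C:/Rtools/bin"

old_path <- Sys.getenv("PATH")

# Prepend for this session only; the system-wide PATH is left untouched.
new_path <- paste(rtools_bin, old_path, sep = .Platform$path.sep)
Sys.setenv(PATH = new_path)

# Returns "" if tar still cannot be found on the modified PATH.
Sys.which("tar")

# Restore the original value when done.
Sys.setenv(PATH = old_path)
```

Because nothing is written to the permanent environment settings, the Windows find command is shadowed only while the modified PATH is in effect.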

On Wed, Jul 8, 2009 at 11:21 AM, Duncan Murdochmurd...@stats.uwo.ca wrote:
 On 08/07/2009 10:13 AM, Farrel Buchinsky wrote:

 Forgive my naivety, but how do I make Windows find tar? In other words, from
 where do I issue the command, and what is the command?

 You need to install the toolset, and let the installer set your path.

 Duncan Murdoch

 Farrel Buchinsky
 Google Voice Tel: (412) 567-7870



 On Wed, Jul 8, 2009 at 10:09, Duncan Murdoch murd...@stats.uwo.ca wrote:

 On 08/07/2009 10:02 AM, Farrel Buchinsky wrote:

 I  have previously read R Installation and Administration. I read it
 again. It does not help me
 The relevant paragraph is below. But I need lower level instructions.
 Where
 can I find them.

 Follow the link.  If Windows can't find tar, your toolset is installed
 incorrectly.

 Duncan Murdoch


 R CMD INSTALL works in Windows to install source packages if you have the
 source-code package files (option “Source Package Installation Files” in
 the installer) and toolset (see "The Windows toolset",
 file:///C:/Program%20Files/R/R-2.9.1/doc/manual/R-admin.html#The-Windows-toolset)
 installed. Installation of binary packages must be done by install.packages.
 R CMD INSTALL --help will tell you the current options under Windows
 (which differ from those on a Unix-alike): in particular there is a choice
 of the types of documentation to be installed.
 Farrel Buchinsky
 Google Voice Tel: (412) 567-7870



 2009/6/19 Uwe Ligges lig...@statistik.tu-dortmund.de

  See the manual R Installation and Administration for information on
 how

 to install source packages on Windows.

 Uwe Ligges

 Farrel Buchinsky wrote:

  After issuing tar xvfz RgoogleDocs_0.2.2-src.tar.gz I am getting an
 error message:
 'tar' is not recognized as an internal or external command, operable
 program or batch file.

 Should I use my 7-zip to open up the archive?
 Where should I be doing this? For instance can I do it all in my
 download directory or should I do it in C:\Program
 Files\R\R-2.9.0\library or should I manually create C:\Program
 Files\R\R-2.9.0\library\RGoogleDocs and do it all there or will the
 Rcmd
 INSTALL RGoogleDocs_0.2-2.tar.gz command do that for me.

 Yes, you assumed correctly. I am using Windows XP.
 Farrel Buchinsky
 Google Voice Tel: (412) 567-7870



 On Thu, Jun 18, 2009 at 20:17, Gabor Grothendieck
 ggrothendi...@gmail.comwrote:

  I haven't been following this thread, but:

 1. If RGoogleDocs_0.2-2.tar.gz is a source distribution (as
 opposed to built source) then the first line renames it so
 that it does not have the same name as the built file about to be created.
 The second line untars it into the RGoogleDocs directory.  The third
 builds the built source file, RGoogleDocs_0.2-2.tar.gz.  The fourth
 installs the built source file into R.  I've assumed Windows.
 If you are on Linux replace rename with mv.

 rename RGoogleDocs_0.2-2.tar.gz RgoogleDocs_0.2.2-src.tar.gz
 tar xvfz RgoogleDocs_0.2.2-src.tar.gz
 Rcmd build RGoogleDocs
 Rcmd INSTALL RGoogleDocs_0.2-2.tar.gz

 or

 2. if RGoogleDocs_0.2-2.tar.gz is already a built source file then
 you
 can just issue the last of the above lines and don't need
 the others.

 On Thu, Jun 18, 2009 at 7:52 PM, Farrel Buchinskyfjb...@gmail.com
 wrote:

  What do you mean by cd the.directory.containing.RGoogleDocs

 Do you mean the directory where I downloaded the
 RGoogleDocs_0.2-2.tar.gz
 to? Or do you mean that I must create a directory called RGoogleDocs

  under

  Library and then change to that directory?

 Farrel Buchinsky
 Google Voice Tel: (412) 567-7870



 On Mon, Mar 2, 2009 at 22:16, Gabor Grothendieck 

  ggrothendi...@gmail.com

  wrote:

  Finally enter into the Windows console:

 cd the.directory.containing.RGoogleDocs
 Rcmd build RGoogleDocs
 Rcmd INSTALL RGoogleDocs_1.0.0.tar.gz

 except replace RGoogleDocs_1.0.0.tar.gz with the filename
 created by the build.

       [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


       [[alternative HTML version deleted]]



 

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide

Re: [R] Uncorrelated random vectors

2009-07-08 Thread Greg Snow

The mvrnorm function in the MASS package has an argument (empirical = TRUE) that 
forces the generated data to have exactly the specified mean/variance structure; 
used with a diagonal variance matrix, it generates data whose sample correlation 
is 0 (within round-off error).  No post-processing by Gram-Schmidt or other 
methods is needed.  The author(s) of the function cleverly hid this feature 
by placing the information on the help page for the function.
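
A minimal sketch of that call, assuming two variables, zero means and unit variances (the sample size of 100 and the seed are arbitrary choices for illustration):

```r
library(MASS)

set.seed(1)
# empirical = TRUE makes mu and Sigma describe the sample itself,
# not just the population the sample is drawn from.
x <- mvrnorm(n = 100, mu = c(0, 0), Sigma = diag(2), empirical = TRUE)

# Sample correlation between the two columns: zero up to rounding error.
cor(x[, 1], x[, 2])
```

With the default empirical = FALSE the population correlation is still 0, but the sample correlation will generally differ from 0.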

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Moshe Olshansky
 Sent: Tuesday, July 07, 2009 9:10 PM
 To: r-help@r-project.org; Luba (AIM SE)Stein
 Subject: Re: [R] Uncorrelated random vectors
 
 
 As mentioned by somebody before, there is no problem for the normal
 case - use mvrnorm function from MASS package with any mu and make
 Sigma be any diagonal matrix (with strictly positive diagonal). Note
 that even though all the correlations are 0, the SAMPLE correlations
 won't be 0. If you want to create a set of vectors whose SAMPLE
 correlations are 0 you will have to use a variant of Gram-Schmidt.
 I do not know whether a variant of mvrnorm exists for logistic
 distribution (my guess is that it does not).
 
 --- On Tue, 7/7/09, Stein, Luba (AIM SE) luba.st...@allianz.com
 wrote:
 
  From: Stein, Luba (AIM SE) luba.st...@allianz.com
  Subject: [R] Uncorrelated random vectors
  To: r-help@r-project.org r-help@r-project.org
  Received: Tuesday, 7 July, 2009, 11:45 PM
  Hello,
 
  is it possible to create two uncorrelated random vectors
  for a given distribution.
 
  In fact, I would like to have something like the function
  rnorm or rlogis with the extra property that they are
  uncorrelated.
 
  Thanks for your help,
  Luba
 
 
 
 
      [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org
  mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
  and provide commented, minimal, self-contained,
  reproducible code.
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.


Re: [R] Import xlsx file in Ubuntu 9.04

2009-07-08 Thread Duncan Temple Lang



I did some preliminary work on xlsx (and docx and pptx) files
some time ago and will hopefully finish things off by the
end of summer.  We can read these with a combination
of the Rcompression and XML package.

I have put versions of two packages (ROOXML and RExcelXML)
at

  http://www.omegahat.org/Prerelease/

(ROOXML_0.1-0.tar.gz and RExcelXML_0.1-0.tar.gz)

There are no guarantees about how they work at this point, but
the basic structures are there. I'd be happy to hear about any problems
and to try to add functionality. Given the framework, it should
be relatively easy to add support for additional cell types, etc.


  D.



Marc Schwartz wrote:

On Jul 8, 2009, at 6:56 AM, Rodrigo Aluizio wrote:


Hi list,
For the last two weeks I have been looking for a way to import xlsx files
directly into R on Linux (Ubuntu 9.04). I have already read the R
Import/Export guide, and I know how to use gdata to import xls files and
read.table to import .csv files. My problem is that all the data I receive
is in xlsx format, and I have to convert every file to xls.
When I was using Windows Vista, RODBC did the trick with the
odbcConnectExcel2007 function (which I know is not present in the Linux
RODBC package, probably because of driver issues). Is there a way to import
these xlsx files directly into R without any prior conversion (.csv or
.xls)?

Thank you for your attention. Someone has probably asked this already; I
even remember seeing it somewhere, but without a definitive answer.

Rodrigo.




Your best bet on Linux would be to open the Excel 2007 files using 
OpenOffice's Calc and save them to CSV files. The latest versions of 
OpenOffice will open Office 2007 files.


An alternative of course would be to see if it is reasonable for the 
providers of the files to save them in the older XLS format instead, or 
to see if they have other file formats that they can send you rather 
than using Excel at all.


There is a very preliminary Perl module in progress, that should 
eventually provide for a more efficient path:


  http://search.cpan.org/dist/Spreadsheet-XLSX/

But from what I have seen, there are enough problems with it (including 
data integrity issues), that I would not use it in production work.


Unfortunately, I don't believe that you have a lot of options on Linux 
at the moment.


HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



Re: [R] Randomizing a dataframe

2009-07-08 Thread Mark Knecht

On Wed, Jul 8, 2009 at 8:54 AM, Mark Namtb...@gmail.com wrote:
 Hi R-helpers,

 I have a dataframe (called data) with trees in rows (n=100) and insect
 species (n=10) in columns. My tree IDs are in a column called TREE and each
 species has a column labeled SPEC1, SPEC2, SPEC3, etc...

 I wish to randomize the values in my dataframe such that row and column
 totals are held constant, i.e. in my randomized data each tree will have the
 same number of individual insects as in the real data (constant row totals)
 and each species will have the same number of individuals as in the real
 data (constant column totals).

 I will eventually want to do this many times, but I would appreciate help
 getting started with the randomization.

 Thank you, Mark Na

        [[alternative HTML version deleted]]


Sounds like maybe you're looking for some form of Monte Carlo
experiments in R which is on my list of to-do for the next month. I
need to do something like rearrange the dates in one database as in
Monte Carlo but then rearrange all my other databases so that dates
still match up. It's just not bubbled to the top of the list yet.

I took a quick look in Google and found MCMCpack pretty quickly.
There's some documentation out there which is easy to find if it's of
interest.

Good luck and I'll be following the thread.
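
In case it helps as a starting point: for count data, base R's r2dtable() draws random two-way tables with given row and column totals (it uses Patefield's algorithm). The sketch below is only an illustration on a made-up 5 x 3 matrix; with the real data frame one would presumably drop the TREE column first, e.g. as.matrix(data[, -1]), and that step is an assumption about the layout.

```r
set.seed(42)

# Hypothetical stand-in for the real data: 5 trees x 3 insect species.
counts <- matrix(c(2, 0, 1,
                   4, 3, 0,
                   1, 1, 1,
                   0, 2, 5,
                   3, 0, 2),
                 nrow = 5, byrow = TRUE,
                 dimnames = list(paste0("T", 1:5), paste0("SPEC", 1:3)))

# Draw 10 random tables with the same row and column totals as 'counts'.
sims <- r2dtable(10, rowSums(counts), colSums(counts))

# Each element of 'sims' is one randomized matrix; restore the labels.
rand1 <- sims[[1]]
dimnames(rand1) <- dimnames(counts)
rand1
```

Increasing the first argument gives as many randomized tables as needed for the repeated runs mentioned in the question.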

cheers,
Mark


Re: [R] Reading from Google Docs

2009-07-08 Thread William Dunlap

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Gabor Grothendieck
 Sent: Wednesday, July 08, 2009 9:04 AM
 To: Duncan Murdoch
 Cc: R; Uwe Ligges; Farrel Buchinsky
 Subject: Re: [R] Reading from Google Docs

 Its safer just to temporarily add it to your path.

I recommend that also.  Here is the SETPATH.BAT file
that I put into my Rtools directory that sets up PATH so
it can be used for building R and R packages.  I run it
from within the cmd window I will use for building
packages.  Note that it totally replaces the current value
of PATH with a new one; it does not append or prepend
entries to the existing one.  You will have to adjust the
entries for you own machine.  It is safe to add other entries
(like e:\cygwin\bin) to the end of this PATH, but you
might run into trouble putting entries at the front of PATH.

(I have a similar script to run before building packages
for S+, whose package building system uses the Microsoft
compilers and ActiveState perl but no cygwin tools.)

E:\>type e:\Rtools\SETPATH.BAT
set RTOOLS=E:\Rtools
REM RHOME is for use in this script, R_HOME will be set by R itself.
set RHOME=E:\R-svn\r-devel

set PATH=C:\WINDOWS\system32;C:\WINDOWS

set PATH=%RTOOLS%\bin;%RTOOLS%\perl\bin;%RTOOLS%\MinGW\bin;%PATH%
set PATH=%RHOME%\bin;%PATH%

set PATH=%PATH%;E:\Program Files\MiKTeX 2.7\miktex\bin
set PATH=%PATH%;E:\Program Files\Inno Setup 5
set PATH=%PATH%;C:\Program Files\HTML Help Workshop
set PATH=%PATH%;E:\Program Files\CollabNet Subversion Server

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com  

 Unfortunately Rtools has a find command that conflicts with
 the find command in Windows so if you add the Rtools
 bin directory to your path permanently then you could
 find other programs stop working.  That actually happened
 to me once and it took the longest time until I discovered
 that Rtools was the culprit.

 If you follow the advice I gave you normally won't have
 that problem.

 On Wed, Jul 8, 2009 at 11:21 AM, Duncan 
 Murdochmurd...@stats.uwo.ca wrote:
  On 08/07/2009 10:13 AM, Farrel Buchinsky wrote:

  Forgive my naivte, but how do I make windows find tar. In 
 other words from
  where do I issue the command and what is the command.

  You need to install the toolset, and let the installer set 
 your path.

  Duncan Murdoch

  Farrel Buchinsky
  Google Voice Tel: (412) 567-7870

  On Wed, Jul 8, 2009 at 10:09, Duncan Murdoch 
 murd...@stats.uwo.ca wrote:

  On 08/07/2009 10:02 AM, Farrel Buchinsky wrote:

  I  have previously read R Installation and 
 Administration. I read it
  again. It does not help me
  The relevant paragraph is below. But I need lower level 
 instructions.
  Where
  can I find them.

  Follow the link.  If Windows can't find tar, your toolset 
 is installed
  incorrectly.

  Duncan Murdoch

  R CMD INSTALL works in Windows to install source 
 packages if you have
  the
  source-code package files (option Source Package 
 Installation Files in
  the
  installer) and toolset (see The Windows

 toolsetfile:///C:/Program%20Files/R/R-2.9.1/doc/manual/R-admi
 n.html#The-Windows-toolset)

  installed. Installation of binary packages must be done by
  install.packages
  . R CMD INSTALL --help will tell you the current options 
 under Windows
  (which differ from those on a Unix-alike): in particular 
 there is a
  choice
  of the types of documentation to be installed.
  Farrel Buchinsky
  Google Voice Tel: (412) 567-7870

  2009/6/19 Uwe Ligges lig...@statistik.tu-dortmund.de

   See the manual R Installation and Administration for 
 information on
  how

  to install source packages on Windows.

  Uwe Ligges

  Farrel Buchinsky wrote:

   After issuing tar xvfz RgoogleDocs_0.2.2-src.tar.gzI 
 am getting an
  error

  message
  'tar' is not recongnized as an internal or external 
 command, operable
  program or batch file.

  Should I use my 7-zip to open up the archive?
  Where should I be doing this? For instance can I do it 
 all in my
  download directory or should I do it in C:\Program
  Files\R\R-2.9.0\library or should I manually create C:\Program
  Files\R\R-2.9.0\library\RGoogleDocs and do it all 
 there or will the
  Rcmd
  INSTALL RGoogleDocs_0.2-2.tar.gz command do that for me.

  Yes, you assumed correctly. I am using Windows XP.
  Farrel Buchinsky
  Google Voice Tel: (412) 567-7870

  On Thu, Jun 18, 2009 at 20:17, Gabor Grothendieck
  ggrothendi...@gmail.comwrote:

   I have haven't neen following this thread but:

  1. if RGoogleDocs_0.2-2.tar.gz is a source distribution (as
  opposed to built source) then the first line renames it so
  that its not the same name as the built file about to 
 be created.
  The second line detars it into the RGoogleDocs 
 directory.  The third
  builds
  the built source file, RGoogleDocs_0.2-2.tar.gz.  The fourth
  installs the built source file into R.  I've assumed Windows.

Re: [R] bigglm() results different from glm()+Another question

2009-07-08 Thread Greg Snow

OK, it appears that the problem is the df.resid component of the biglm object.  
Everything else is being updated by the update function except the df.resid 
piece, so it is based solely on the initial fit and the chunksize used there.  
The df.resid piece is then used in the computation of the AIC and hence the 
differences that you see.  There could also be a difference in the p-values and 
confidence intervals, but at those high of numbers, the differences are smaller 
than can be seen at the level of rounding done.

This appears to be a bug/overlooked piece to me, Thomas is cc'd on this so he 
should be able to fix this.

A work around in the meantime is to do something like:

 fit$df.resid <- 10000 - 4

Then compute the AIC.

Also, as an aside, if you change your seq to seq(chunksize, 10000-chunksize, 
chunksize), then you won't get the error messages.
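
Another way to avoid running past the end of the data, sketched here in base R only: build the row indices of every chunk up front with split(), so the last chunk is simply shorter when the row count is not a multiple of the chunk size. The values of n and chunksize are illustrative, and the biglm calls are left as comments because they assume the package and the data frame xx from the thread.

```r
n <- 10000        # assumed number of rows in xx
chunksize <- 3000 # deliberately not a divisor of n

# One integer vector of row numbers per chunk; none exceeds n.
chunks <- split(seq_len(n), ceiling(seq_len(n) / chunksize))
chunk_sizes <- sapply(chunks, length)
chunk_sizes

# Intended use with biglm (not run here):
# fit <- biglm(y ~ x1 + x2 + x3, data = xx[chunks[[1]], ])
# for (idx in chunks[-1]) fit <- update(fit, moredata = xx[idx, ])
```

This also covers the case where chunksize equals n, since chunks[-1] is then empty and the loop does nothing.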

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

From: utkarshsinghal [mailto:utkarsh.sing...@global-analytics.com]
Sent: Wednesday, July 08, 2009 2:24 AM
To: Greg Snow
Cc: Thomas Lumley; r help
Subject: Re: [R] bigglm() results different from glm()+Another question

Hi Greg,

Many thanks for your precious time. Here is a workable code:

library(biglm)
set.seed(1)
xx = data.frame(x1=runif(10000,0,10), x2=runif(10000,0,10), 
x3=runif(10000,0,10))
xx$y = 3 + xx$x1 + 2*xx$x2 + 3*xx$x3 + rnorm(10000)

chunksize = 500
fit = biglm(y~x1+x2+x3, data=xx[1:chunksize,])
for(i in seq(chunksize,10000,chunksize)) fit=update(fit, 
moredata=xx[(i+1):(i+chunksize),])
AIC(fit)
[1] 28956.91

And the AIC for other chunksizes:
chunksize    AIC
  500        28956.91
 1000        27956.91
 2000        25956.91
 2500        24956.91
 5000        19956.91
10000         9956.91

Also I noted that the estimated coefficients are not dependent on chunksize and 
AIC is exactly a linear function of chunksize. So I guess it is some problem 
with the calculation of AIC, may be in some degree of freedom or adding some 
constant somewhere.

And my comments below.


Regards
Utkarsh


Greg Snow wrote:
How many rows does xx have?
Let's look at your example for chunksize 10000: you initially fit the first 
10000 observations, then the seq results in just the value 10000, which means 
that you do the update based on values 10001 through 20000. If xx only has 10000 
rows, then this should give at least one error.  If xx has 20000 or more rows, 
then only chunksize 10000 will ever see the 20000th value; the other chunksizes 
will use less of the data.
Understood your point and apologize that you had to spend time going into the 
logic inside for loop. I definitely thought of that but my actual problem was 
the variation in AICs (which I was sure about), so to ignore this loop problem 
(temporarily), I deliberately chose the chunksizes such that the number of rows 
is a multiple of chunksize. I knew there is still one extra iteration happening 
and I checked that it was not causing any problem, the moredata in the last 
iteration will be all NA's and update does nothing in such a case.

For example:
Let's say chunksize=5000: even though xx has only 10000 rows, fit2 and 
fit3 below are exactly the same.

fit1 = biglm(y~x1+x2+x3, data=xx[1:5000,])
fit2 = update(fit1, moredata=xx[5001:10000,])
fit3 = update(fit2, moredata=xx[10001:15000,])
AIC(fit1); AIC(fit2); AIC(fit3)
[1] 5018.282
[1] 19956.91
[1] 19956.91

(The AIC matches with the table above and no warnings at all)

I checked all these things before sending my first mail and dropped the idea of 
refining the for loop as this will save me a few lines of code and also the 
loop looks good and easy to understand. Moreover it is neither taking any extra 
run time nor producing any warnings or errors.



Also looking at the help for update.biglm, the 2nd argument is moredata not 
data, so if the code below is the code that you actually ran, then the new 
data chunks are going into the ... argument (and being ignored as that is 
there for future expansion and does nothing yet) and the moredata argument is 
left empty, which should also be giving an error.  For the code below, the 
model is only being fit to the initial chunk and never updated, so with 
different chunk sizes, there is different amounts of data per model.  You can 
check this by doing summary(fit) and looking at the sample size in the 2nd line.
My fault in writing the mail. In the actual code, I gave update(fit, 
xx[(i+1):(i+chunksize),]) ,i.e., I just passed the new chunk as the 2nd 
argument without mentioning the argument name, which is correct, but while 
writing the mail I added the argument name as data without checking what it 
is.



It is easier for us to help you if you provide code that can be run by copying 
and pasting (we don't have xx, so we can't just run the code below, you could 
include a line to randomly generate an xx, or a link to where a copy of xx can 
be downloaded from).  It also helps if you mention

[R] matching each row

2009-07-08 Thread tathta


I have two dataframes, the first column of each dataframe is a unique id
number (the rest of the columns are data variables).  
I would like to figure out how many times each id number appears in each
dataframe.  

So far I can use: 
length( match (dataframeA$unique.id[1], dataframeB$unique.id) )

but this only works on each row of dataframe A one-at-a-time.  

I would like to do this for all of the rows in dataframe A, and then put the
results in a new variable: dataframeA$count
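
A sketch of one way this could be done for all rows at once, on two small made-up data frames (only the column name unique.id is taken from the question). Note that length(match(x, table)) always equals length(x), so it does not count occurrences; summing a comparison does.

```r
# Made-up example data; only the column name comes from the question.
dataframeA <- data.frame(unique.id = c(1, 2, 3))
dataframeB <- data.frame(unique.id = c(2, 2, 3, 5))

# For each id in A, count the rows of B that carry the same id.
dataframeA$count <- vapply(
  dataframeA$unique.id,
  function(id) sum(dataframeB$unique.id == id),
  integer(1)
)
dataframeA
```

Ids that never occur in dataframeB get a count of 0, and ids present only in dataframeB are ignored.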


I'm new to R, so please be patient with me!


Sorry if this question has already been answered, my search of the archives
only brought up one relevant post, and I didn't understand the answer to
it  http://www.nabble.com/match-to20799206.html#a20799206


thx
-- 
View this message in context: 
http://www.nabble.com/matching-each-row-tp24393051p24393051.html
Sent from the R help mailing list archive at Nabble.com.


[R] Extracting a column name in loop?

2009-07-08 Thread mister_bluesman


Hi,

I am writing a script that will address columns using syntax like: 

data_set[,1] 

to extract the data from the first column of my data set, for example. This
code will be placed in a loop (where the column reference will be placed by
a variable). 

What I also need to do is extract the column NAME for a given column being
processed in the loop. The dataframe has been set so that R knows that the
top line refers to column headers. 

Can anyone help me understand how to do this?

Thanks.
-- 
View this message in context: 
http://www.nabble.com/Extracting-a-column-name-in-loop--tp24393160p24393160.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Extracting a column name in loop?

2009-07-08 Thread Mark Knecht

On Wed, Jul 8, 2009 at 8:41 AM,
mister_bluesmanmister_blues...@hotmail.com wrote:

 Hi,

 I am writing a script that will address columns using syntax like:

 data_set[,1]

 to extract the data from the first column of my data set, for example. This
 code will be placed in a loop (where the column reference will be placed by
 a variable).

 What I also need to do is extract the column NAME for a given column being
 processed in the loop. The dataframe has been set so that R knows that the
 top line refers to column headers.

 Can anyone help me understand how to do this?

 Thanks.

Possibly something like

names(data_set)[i]

?
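
Spelled out as a small loop, assuming a made-up data frame (the column names height and weight and the use of mean() are only for illustration):

```r
# Hypothetical data; in practice data_set would come from read.table(..., header = TRUE).
data_set <- data.frame(height = c(1.5, 1.7), weight = c(60, 72))

col_means <- numeric(0)
for (i in seq_along(data_set)) {
  col_name <- names(data_set)[i]  # name of the column being processed
  col_data <- data_set[, i]       # its values
  col_means[col_name] <- mean(col_data)
}
col_means
```

Here seq_along(data_set) runs over the column positions, so the same index i gives both the data and the matching name.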

HTH,
Mark


Re: [R] Reading from Google Docs

2009-07-08 Thread Duncan Murdoch


On 08/07/2009 12:04 PM, Gabor Grothendieck wrote:

Its safer just to temporarily add it to your path.

Unfortunately Rtools has a find command that conflicts with
the find command in Windows so if you add the Rtools
bin directory to your path permanently then you could
find other programs stop working.  That actually happened
to me once and it took the longest time until I discovered
that Rtools was the culprit.


That's true, but there is a workaround: you can manually rename the 
find.exe in Rtools, and adjust the entry in one of the R makefiles 
(MkRules), and it will use the new name instead of find.  The reason 
you might not want to do this is you might expect find to act the way it 
does on Unix:  the Rtools basically try to make Windows look a little 
bit like Unix.


Duncan Murdoch



If you follow the advice I gave you normally won't have
that problem.

On Wed, Jul 8, 2009 at 11:21 AM, Duncan Murdochmurd...@stats.uwo.ca wrote:

On 08/07/2009 10:13 AM, Farrel Buchinsky wrote:

Forgive my naivete, but how do I make Windows find tar? In other words, from
where do I issue the command, and what is the command?

You need to install the toolset, and let the installer set your path.

Duncan Murdoch


Farrel Buchinsky
Google Voice Tel: (412) 567-7870



On Wed, Jul 8, 2009 at 10:09, Duncan Murdoch murd...@stats.uwo.ca wrote:


On 08/07/2009 10:02 AM, Farrel Buchinsky wrote:


I have previously read R Installation and Administration. I read it
again. It does not help me.
The relevant paragraph is below, but I need lower-level instructions.
Where can I find them?


Follow the link.  If Windows can't find tar, your toolset is installed
incorrectly.

Duncan Murdoch



R CMD INSTALL works in Windows to install source packages if you have the
source-code package files (option “Source Package Installation Files” in the
installer) and toolset (see The Windows toolset,
file:///C:/Program%20Files/R/R-2.9.1/doc/manual/R-admin.html#The-Windows-toolset)
installed. Installation of binary packages must be done by install.packages.
R CMD INSTALL --help will tell you the current options under Windows
(which differ from those on a Unix-alike): in particular there is a choice
of the types of documentation to be installed.
Farrel Buchinsky
Google Voice Tel: (412) 567-7870



2009/6/19 Uwe Ligges lig...@statistik.tu-dortmund.de

 See the manual R Installation and Administration for information on how
to install source packages on Windows.

Uwe Ligges

Farrel Buchinsky wrote:

 After issuing tar xvfz RgoogleDocs_0.2.2-src.tar.gz I am getting an error
message:
'tar' is not recognized as an internal or external command, operable
program or batch file.

Should I use my 7-zip to open up the archive?
Where should I be doing this? For instance, can I do it all in my
download directory, or should I do it in C:\Program
Files\R\R-2.9.0\library, or should I manually create C:\Program
Files\R\R-2.9.0\library\RGoogleDocs and do it all there, or will the
Rcmd INSTALL RGoogleDocs_0.2-2.tar.gz command do that for me?

Yes, you assumed correctly. I am using Windows XP.
Farrel Buchinsky
Google Voice Tel: (412) 567-7870



On Thu, Jun 18, 2009 at 20:17, Gabor Grothendieck ggrothendi...@gmail.com wrote:

 I haven't been following this thread but:

1. if RGoogleDocs_0.2-2.tar.gz is a source distribution (as
opposed to built source) then the first line renames it so
that it's not the same name as the built file about to be created.
The second line untars it into the RGoogleDocs directory.  The third
builds the built source file, RGoogleDocs_0.2-2.tar.gz.  The fourth
installs the built source file into R.  I've assumed Windows.
If you are on Linux replace rename with mv.

rename RGoogleDocs_0.2-2.tar.gz RgoogleDocs_0.2.2-src.tar.gz
tar xvfz RgoogleDocs_0.2.2-src.tar.gz
Rcmd build RGoogleDocs
Rcmd INSTALL RGoogleDocs_0.2-2.tar.gz

or

2. if RGoogleDocs_0.2-2.tar.gz is already a built source file then
you
can just issue the last of the above lines and don't need
the others.

On Thu, Jun 18, 2009 at 7:52 PM, Farrel Buchinskyfjb...@gmail.com
wrote:

 What do you mean by cd the.directory.containing.RGoogleDocs?

Do you mean the directory where I downloaded the RGoogleDocs_0.2-2.tar.gz
to? Or do you mean that I must create a directory called RGoogleDocs
under Library and then change to that directory?

Farrel Buchinsky
Google Voice Tel: (412) 567-7870



On Mon, Mar 2, 2009 at 22:16, Gabor Grothendieck ggrothendi...@gmail.com wrote:

 Finally enter into the Windows console:

cd the.directory.containing.RGoogleDocs
Rcmd build RGoogleDocs
Rcmd INSTALL RGoogleDocs_1.0.0.tar.gz

except replace RGoogleDocs_1.0.0.tar.gz with the filename
created by the build.

[R] Simple monovariate classification?

2009-07-08 Thread rgunton



I'm looking for an R function that simply recodes a quantitative  
variable into a number of classes according to specified break-points.  
 Obviously I can do this using nested ifelse() commands, but I want  
to write it into a function where I can't pre-specify the number of  
classes.  Is there an obvious way to do this?


An example to clarify: how to convert c(0,10,5,1,9,6) to  
c(1,3,2,1,3,2) by specifying breaks=c(2.5,7.5) - or something like  
that.


Thanks,

Richard Gunton.
INRA-Dijon, France


Re: [R] Reading from Google Docs

2009-07-08 Thread Farrel Buchinsky

Does changing the path in Windows work in real time, or does one need to
restart the computer for the changes to take effect?
Farrel Buchinsky
Google Voice Tel: (412) 567-7870



On Wed, Jul 8, 2009 at 12:04, Gabor Grothendieck ggrothendi...@gmail.com wrote:

 It's safer just to temporarily add it to your path.

 Unfortunately Rtools has a find command that conflicts with
 the find command in Windows so if you add the Rtools
 bin directory to your path permanently then you could
 find other programs stop working.  That actually happened
 to me once and it took the longest time until I discovered
 that Rtools was the culprit.

 If you follow the advice I gave you normally won't have
 that problem.


Re: [R] Two-way ANOVA gives different results using anova(lm()) than doing it by hand

2009-07-08 Thread Greg Snow

Well, since we don't have Data.txt it is kind of hard for us to replicate what 
you have done.

Here goes a guess as to what the problem may be.

Have you told R anywhere that S1 and S2 are factors with 6 levels rather than 
numeric vectors? Or are you just hoping that the computer can read your mind to 
find out this information?  

(reading minds is one of the things that R and computers in general are not 
very good at yet.  I have made a note to my future self to use the TimeTravel 
package to send a copy of the ESP package back to my past self, but I have not 
received it yet).
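
To illustrate the guess above, here is a small sketch with simulated data (not the poster's Data.txt, which we do not have). The layout, 6 levels of S1 by 6 levels of S2 with 6 values per cell, is an assumption chosen to mimic the description in the question.

```r
set.seed(123)
# Simulated stand-in for Data.txt: S1 and S2 coded as the numbers 1..6.
Data <- expand.grid(S1 = 1:6, S2 = 1:6, rep = 1:6)
Data$ExM <- rnorm(nrow(Data))

# S1 and S2 treated as numeric: one slope each, so Df = 1, 1, 1, 212.
df_numeric <- anova(lm(ExM ~ S1 * S2, data = Data))$Df

# S1 and S2 declared as factors: Df = 5, 5, 25, 180.
Data$S1 <- factor(Data$S1)
Data$S2 <- factor(Data$S2)
df_factor <- anova(lm(ExM ~ S1 * S2, data = Data))$Df

df_numeric
df_factor
```

str(Data) is a quick way to see whether read.table() stored the columns as numbers or as factors.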


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Lars Bergemann
 Sent: Wednesday, July 08, 2009 8:35 AM
 To: r-help@r-project.org
 Subject: [R] Two-way ANOVA gives different results using anova(lm())
 than doing it by hand
 
 
 Hey!
 
 
 
 Could you please take a quick look at what I have done? Somehow I get
 wrong results using the anova(lm()) combination compared to doing a two
 way ANOVA by hand.
 
 
 
 Running:
 
 
 
 Data <- read.table("Data.txt");
 g <- lm(ExM ~ S1*S2, Data);
 anova(g);
 
 
 
 Gives:
 
 
 
 Analysis of Variance Table
 
 Response: ExM
            Df Sum Sq Mean Sq F value    Pr(>F)
 S1          1 4.3679  4.3679 167.045 < 2.2e-16 ***
 S2          1 0.9427  0.9427  36.053 8.236e-09 ***
 S1:S2       1 0.3231  0.3231  12.357 0.0005371 ***
 Residuals 212 5.5434  0.0261
 
 
 I compared it to the work done by hand, ie calculated all the different
 square sums using sum() and tapply().
 
 So I know that anova(lm()) gets the degrees of freedom equal to 1, 1,
 1 and 212 when they should be 5, 5, 25 and 180. Also, the sums of squares are
 quite different ... I get 4.xx, 4.xx, 1.xx, 0.xx ... as you see, what
 anova(lm()) gets is different.
 
 
 
 The data: S1 has 6 levels, so has S2. On average, each cell has 6
 values, most cells have actually 6 values, and there are two of each:
 5, 7, 4, 8 - so average 6.
 
 
 
 Could you please help me understand why it does not work with anova(lm())? I
 tried quite a few things found with Google, but they all gave me the same
 result as anova(lm()) ...
 
 
 
 Thanks a lot!
 
 
 
 Lars
 
 _
 
 


Re: [R] #INCLUDE

2009-07-08 Thread John Kane

?source  perhaps?
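
A minimal sketch of how that might look. The file name and the function big_function are hypothetical stand-ins for the generated code; here the file is written to a temporary directory so the example is self-contained.

```r
# Stand-in for the externally generated file (hypothetical name and content).
gen_file <- file.path(tempdir(), "generated_function.R")
writeLines("big_function <- function(x) x * 2", gen_file)

# Re-read the file at runtime; calling source() again picks up a new version.
source(gen_file)
big_function(21)
```

Since the file changes hourly, calling source() again just before the function is used should pick up the latest version.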

--- On Wed, 7/8/09, Idgarad idga...@gmail.com wrote:

 From: Idgarad idga...@gmail.com
 Subject: [R] #INCLUDE
 To: r-help@r-project.org
 Received: Wednesday, July 8, 2009, 11:16 AM
 What is R's equivalent to a C-like
 #include to incorporate external files. I
 have a 2k line function that is generated and need to
 include it at runtime
 but not manage it as a package (as it changes hourly.) Any
 ideas?


Re: [R] Two-way ANOVA gives different results using anova(lm()) than doing it by hand

2009-07-08 Thread Marc Schwartz


On Jul 8, 2009, at 12:11 PM, Greg Snow wrote:

Well, since we don't have Data.txt it is kind of hard for us to  
replicate what you have done.


Here goes a guess as to what the problem may be.

Have you told R anywhere that S1 and S2 are factors with 6 levels  
rather than numeric vectors? Or are you just hoping that the  
computer can read your mind to find out this information?


(reading minds is one of the things that R and computers in general  
are not very good at yet.  I have made a note to my future self to  
use the TimeTravel package to send a copy of the ESP package back to  
my past self, but I have not received it yet).



A definite Fortunes candidate.

Marc Schwartz


Re: [R] matching each row

2009-07-08 Thread David Huffer

Something like this? 

  > dataframeA <- data.frame (
  +   unique.id= c(1,1,3,3,3,5,7,7, 9)
  +   , x1=rnorm(9)
  +   , x2=rnorm(9)
  +   , x3=rnorm(9)
  + )
  > dataframeB <- data.frame (
  +   unique.id= c(2,3,4,5,5,5,6,7,9,10,10)
  +   , x4=rnorm(11)
  +   , x5=rnorm(11)
  +   , x6=rnorm(11)
  + )
  > match.counts <- function ( x , y ) {
  +   out <- cbind (
  +     table ( x [ which ( x %in% y ) ] )
  +     , table ( y [ which ( y %in% x ) ] )
  +   )
  +   dimnames ( out ) [[2]] <- c ( "N in x" , "N in y" )
  +   out
  + }
  > match.counts ( dataframeA$unique.id , dataframeB$unique.id )
    N in x N in y
  3      3      1
  5      1      3
  7      2      1
  9      1      1
  

--
 David
 
 -
 David Huffer, Ph.D.   Senior Statistician
 CSOSA/Washington, DC   david.huf...@csosa.gov
 -

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of tathta
Sent: Wednesday, July 08, 2009 11:10 AM
To: r-help@r-project.org
Subject: [R] matching each row


I have two dataframes, the first column of each dataframe is a unique id
number (the rest of the columns are data variables).  
I would like to figure out how many times each id number appears in each
dataframe.  

So far I can use: 
length( match (dataframeA$unique.id[1], dataframeB$unique.id) )

but this only works on each row of dataframe A one-at-a-time.  

I would like to do this for all of the rows in dataframe A, and then put the
results in a new variable: dataframeA$count


I'm new to R, so please be patient with me!


Sorry if this question has already been answered, my search of the archives
only brought up one relevant post, and I didn't understand the answer to
it  http://www.nabble.com/match-to20799206.html#a20799206


thx
-- 
View this message in context: 
http://www.nabble.com/matching-each-row-tp24393051p24393051.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Randomizing a dataframe

2009-07-08 Thread Greg Snow

Here is one approach (there are others, some that are probably better, but this 
can get you started):

1. rearrange your data so that every insect is a single row with 2 columns: the 
tree id and the species (this new dataset will have as many rows as the sum of 
the values in the old dataset).  The reshape package may be able to help with 
this step (you may also need the rep function).

2. randomly permute one of the 2 columns (see ?sample).

3. restructure the permuted data back to the original (the table function may 
be enough here, the reshape package will give more options).
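
The three steps above can be sketched in base R as follows. The small count matrix is made up for illustration (3 trees by 3 species instead of 100 by 10), and as.data.frame(as.table()) is used here in place of the reshape package.

```r
set.seed(42)
# Hypothetical counts: rows are trees, columns are insect species.
counts <- matrix(c(2, 0, 1,
                   1, 3, 0,
                   0, 1, 2),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(TREE = c("T1", "T2", "T3"),
                                 SPECIES = c("SPEC1", "SPEC2", "SPEC3")))

# Step 1: one row per individual insect (columns TREE and SPECIES).
long <- as.data.frame(as.table(counts))
long <- long[rep(seq_len(nrow(long)), long$Freq), c("TREE", "SPECIES")]

# Step 2: randomly permute one of the two columns.
long$SPECIES <- sample(long$SPECIES)

# Step 3: tabulate back to a tree-by-species table.
randomized <- table(long$TREE, long$SPECIES)

randomized
rowSums(randomized)   # same as rowSums(counts)
colSums(randomized)   # same as colSums(counts)
```

For repeated randomizations, r2dtable() in the stats package generates random two-way tables with given row and column totals and may be worth a look.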

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Mark Na
 Sent: Wednesday, July 08, 2009 9:54 AM
 To: r-help@r-project.org
 Subject: [R] Randomizing a dataframe
 
 Hi R-helpers,
 
 I have a dataframe (called data) with trees in rows (n=100) and insect
 species (n=10) in columns. My tree IDs are in a column called TREE and
 each
 species has a column labeled SPEC1, SPEC2, SPEC3, etc...
 
 I wish to randomize the values in my dataframe such that row and column
 totals are held constant, i.e. in my randomized data each tree will
 have the
 same number of individual insects as in the real data (constant row
 totals)
 and each species will have the same number of individuals as in the
 real
 data (constant column totals).
 
 I will eventually want to do this many times, but I would appreciate
 help
 getting started with the randomization.
 
 Thank you, Mark Na
 


Re: [R] Simple monovariate classification?

2009-07-08 Thread Greg Hirson


Try ?cut

Greg

rgun...@dijon.inra.fr wrote:


I'm looking for an R function that simply recodes a quantitative 
variable into a number of classes according to specified break-points. 
 Obviously I can do this using nested ifelse() commands, but I want to 
write it into a function where I can't pre-specify the number of 
classes.  Is there an obvious way to do this?


An example to clarify: how to convert c(0,10,5,1,9,6) to 
c(1,3,2,1,3,2) by specifying breaks=c(2.5,7.5) - or something like 
that.


Thanks,

Richard Gunton.
INRA-Dijon, France



--
Greg Hirson
ghir...@ucdavis.edu

Graduate Student
Agricultural and Environmental Chemistry

1106 Robert Mondavi Institute North
One Shields Avenue
Davis, CA 95616


Re: [R] Simple monovariate classification?

2009-07-08 Thread Greg Hirson


Richard,

More specifically,

x = c(0,10,5,1,9,6)

cut(x, breaks = c(-Inf, 2.5,7.5, Inf), labels = c(1, 2, 3))
#[1] 1 3 2 1 3 2
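
Since the number of classes is not known in advance, the call could be wrapped in a small function along these lines. The name recode_by_breaks is made up for this sketch; labels = FALSE makes cut() return integer codes instead of a factor.

```r
# Recode x into classes 1, 2, ..., length(breaks) + 1.
# Values equal to a break fall in the lower class (cut() uses right-closed
# intervals by default).
recode_by_breaks <- function(x, breaks) {
  cut(x, breaks = c(-Inf, sort(breaks), Inf), labels = FALSE)
}

recode_by_breaks(c(0, 10, 5, 1, 9, 6), breaks = c(2.5, 7.5))
#[1] 1 3 2 1 3 2
```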

Hope that helps,

Greg

rgun...@dijon.inra.fr wrote:


I'm looking for an R function that simply recodes a quantitative 
variable into a number of classes according to specified break-points. 
 Obviously I can do this using nested ifelse() commands, but I want to 
write it into a function where I can't pre-specify the number of 
classes.  Is there an obvious way to do this?


An example to clarify: how to convert c(0,10,5,1,9,6) to 
c(1,3,2,1,3,2) by specifying breaks=c(2.5,7.5) - or something like 
that.


Thanks,

Richard Gunton.
INRA-Dijon, France



--
Greg Hirson
ghir...@ucdavis.edu

Graduate Student
Agricultural and Environmental Chemistry

1106 Robert Mondavi Institute North
One Shields Avenue
Davis, CA 95616


Re: [R] matching each row

2009-07-08 Thread tathta


Close...   

The output I'm looking for is more like this: 

output <-
data.frame(unique.id=c(1,3,5,7,9),N.in.x=c(2,3,1,2,1),N.in.y=c(0,1,3,1,1))

The first column can be gotten using a small change to the first table line: 
table ( x [ which ( x %in% x ) ] )   ##the 3rd x used to be a y

but I can't modify it to make the second ideal output column, I just end
up with warnings...  




-- 
View this message in context: 
http://www.nabble.com/matching-each-row-tp24393051p24396184.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] OK - I got the data - now what? :-)

2009-07-08 Thread Michael A. Miller

 Mark wrote:

 Currently my data is one experiment per row, but that's
 wasting space as most experiments only take 20% of the row
 and 80% of the row is filled with 0's. I might want to make
 the array more narrow and have a flag somewhere in the 1st
 10 columns that says that this row is a continuation row
 from the previous row. That way I could pack the array
 better, use less memory and when I do finally test for 0 I
 have a short line to traverse?

This may be a bit off track from the data manipulation you are
working on, but I thought I'd point out that another way to
handle this sort of data is to make a table with one measurement
per row, rather than one experiment per row.

experiment measurement value
 A   1  0.27
 A   2  0.66
 A   3  0.24
 A   4  0.55
 B   1  0.13
 B   2  0.65
 B   3  0.83
 B   4  0.41
 B   5  0.92
 B   6  0.67
 C   1  0.75
 C   2  0.97
 C   3  0.49
 C   4  0.58
 D   1  1.00
 D   2  0.71
 E   1  0.11
 E   2  0.50
 E   3  0.98
 E   4  0.07
 E   5  0.94
 E   6  0.57
 E   7  0.34
 E   8  0.21


If you wrote the output of your calculations in this way, one
value per line, it can easily be read into R as a data.frame and
handled with less need for munging.  No need to remove the
zero-padding because the zeros aren't needed in the first place.
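
If the data already exist in the zero-padded, one-experiment-per-row form, a conversion to this layout might look like the sketch below. The matrix wide is a made-up two-experiment example, and the code assumes that a zero always means padding (a genuine measurement of exactly 0 would be dropped).

```r
# Hypothetical zero-padded data: one experiment per row.
wide <- rbind(A = c(0.27, 0.66, 0.24, 0.55, 0, 0),
              B = c(0.13, 0.65, 0.83, 0.41, 0.92, 0.67))

# One measurement per row; t(wide) makes as.vector() run along each experiment.
long <- data.frame(
  experiment  = rep(rownames(wide), each = ncol(wide)),
  measurement = rep(seq_len(ncol(wide)), times = nrow(wide)),
  value       = as.vector(t(wide))
)

# Drop the padding (assumes zeros are never real measurements).
long <- long[long$value != 0, ]
long
```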

You can subset the data with subset, as in

  test <- read.table('test.dat',header=TRUE)
  expA <- subset(test, experiment=='A')
  expB <- subset(test, experiment=='B')

so there is no need to deal with ragged/zero-padded arrays. Your
plots can be grouped automatically with lattice: 

require(lattice)
xyplot(value ~ measurement, data=test, group=experiment, type='b')
xyplot(value ~ measurement | experiment, data=test, type='b')


It is simple to do calculations by experiment using tapply.  For
example


> with(test, tapply(value, experiment, mean))
        A         B         C         D         E 
0.4300000 0.6016667 0.6975000 0.8550000 0.4650000 

> with(test, tapply(measurement, experiment, max))
A B C D E 
4 6 4 2 8 



Mike


[R] typo in ts detrending implementation in spec.pgram?

2009-07-08 Thread Mikhail Titov

Hello!

I wonder if there is a typo in detrending code of spec.pgram in spectrum.R from 
stats package.

One can see in the code 
https://svn.r-project.org/R/trunk/src/library/stats/R/spectrum.R .

I am afraid there is a typo and the code should look like

if (detrend) {
  t <- 1L:N - (N + 1)/2
  sumt2 <- N * (N^2 - 1)/12
  for (i in 1L:ncol(x))
    x[, i] <- x[, i] - mean(x[, i]) - sum((x[, i] - mean(x[, i])) * t) * t/sumt2
}


Note x[, i] - mean(x[, i]) instead of just x[, i] as in the repository. Here is
a quick reference:
http://en.wikipedia.org/wiki/Simple_linear_regression#Estimating_the_regression_line
Note $\hat b$ there: the summation uses x - mean(x), not x.

Perhaps an even better solution would be
resid(lm(x[,i] ~ seq(along = x[,i]))) .
See http://tolstoy.newcastle.edu.au/R/help/05/01/10115.html
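
One way to check this is to compare the variants numerically, as in the sketch below on a simulated series (the series itself is an assumption, made up for the test). Because t is centered, sum(t) is zero, so sum((x - mean(x)) * t) and sum(x * t) should agree, and both should match the lm() residuals up to rounding.

```r
set.seed(1)
N <- 50
x <- cumsum(rnorm(N)) + 0.3 * seq_len(N)   # simulated series with a trend

t <- 1L:N - (N + 1)/2                      # centered time index
sumt2 <- N * (N^2 - 1)/12                  # equals sum(t^2)

detr_repo     <- x - mean(x) - sum(x * t) * t/sumt2
detr_proposed <- x - mean(x) - sum((x - mean(x)) * t) * t/sumt2
detr_lm       <- unname(resid(lm(x ~ seq_along(x))))

all.equal(detr_repo, detr_proposed)
all.equal(detr_repo, detr_lm)
```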

Mikhail


Re: [R] Reading from Google Docs

2009-07-08 Thread Farrel Buchinsky

Hooray! I got it to work. Here is what I think happened. My holdup was that
the tar command was not working. If you recall, when I issued the command:
tar xvfz RgoogleDocs_0.2.2-src.tar.gz
cmd.exe told me it could not be found.

I reran Rtools29.exe which is the Rtools setup program which offered to
change my path. However it still did not work. I went to lunch and took the
opportunity to reboot my computer.

When I retried after lunch the tar command worked and everything thereafter
worked. I think that the file C:\Program Files\R\Rtools\bin\tar.exe could
not be found earlier. I just looked back at my path and I see
that C:\Program Files\R\Rtools\bin is on the path.

RgoogleDocs 0.2-2 is amazing. I can now read data straight into a dataframe.
The fact that I am always reading from realtime data is astounding.

sheets.con = getGoogleDocsConnection(getGoogleAuth("fjb...@gmail.com",
    "password here", service = "wise"))
# put the name of the spreadsheet in the inverted commas
ts2 = getWorksheets("Consents Received", sheets.con)
names(ts2)
sheetAsMatrix(ts2$Sheet1, header=TRUE, as.data.frame=TRUE, trim=TRUE)

MAGIC

Boy oh boy that process of getting source to binary was super painful. Now
that I have the package as binary I can share the whole folder with my
coworker and she is able to use RGoogleDocs. I intend to use the same
process for the other two windows machines that I use. I really do not want
to go through the same installation and path hassles all over again.

Should I post my directory containing the binary files somewhere so that
others do not have to experience the pain? Does etiquette dictate that I should
post the directory to help others, or does etiquette dictate that it is Duncan
Temple Lang's code and thus it is his prerogative to distribute his work as he
wishes?

Farrel Buchinsky
Google Voice Tel: (412) 567-7870




Re: [R] matching each row

2009-07-08 Thread Marc Schwartz


On Jul 8, 2009, at 10:09 AM, tathta wrote:



I have two dataframes, the first column of each dataframe is a unique id
number (the rest of the columns are data variables).
I would like to figure out how many times each id number appears in each
dataframe.

So far I can use:
length( match (dataframeA$unique.id[1], dataframeB$unique.id) )

but this only works on each row of dataframe A one-at-a-time.

I would like to do this for all of the rows in dataframe A, and then put the
results in a new variable: dataframeA$count


I'm new to R, so please be patient with me!


Sorry if this question has already been answered, my search of the archives
only brought up one relevant post, and I didn't understand the answer to
it  http://www.nabble.com/match-to20799206.html#a20799206



If I am correctly understanding what you are looking for, you could do  
something like the following:


# Create some simple data. Note that only a subset of the ID's (3:5)  
will match across the two DF's:

set.seed(1)
DF.A <- data.frame(ID = sample(1:5, 10, replace = TRUE))
DF.B <- data.frame(ID = sample(3:7, 10, replace = TRUE))

> DF.A
   ID
1   2
2   2
3   3
4   5
5   2
6   5
7   5
8   4
9   4
10  1

> DF.B
   ID
1   4
2   3
3   6
4   4
5   6
6   5
7   6
8   7
9   4
10  6


Now, create counts of the IDs in each, coercing the results to data  
frames and setting the count column name for each:


TAB.A <- as.data.frame(table(DF.A$ID), responseName = "Count.A")
TAB.B <- as.data.frame(table(DF.B$ID), responseName = "Count.B")

> TAB.A
  Var1 Count.A
1    1       1
2    2       3
3    3       1
4    4       2
5    5       3

> TAB.B
  Var1 Count.B
1    3       1
2    4       3
3    5       1
4    6       4
5    7       1


Now, use merge() to join each of the two above. 'all = TRUE' will  
include non-matching keys:


> merge(TAB.A, TAB.B, by = "Var1", all = TRUE)
  Var1 Count.A Count.B
1    1       1      NA
2    2       3      NA
3    3       1       1
4    4       2       3
5    5       3       1
6    6      NA       4
7    7      NA       1


Note that you will get NAs for any non-matching ID's (Var1).

See ?table, ?as.data.frame and ?merge for more information.
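
If zero counts are more convenient than NA, the merged table can be post-processed. The following is only a minimal sketch: the ID vectors are typed in from the printouts above (sample() results for a given seed changed in R 3.6.0, so set.seed(1) may not reproduce them on a current R), and the names ids_a, ids_b, tab_a, tab_b and merged are illustrative, not part of the original code.

```r
# IDs copied from the DF.A and DF.B printouts above
ids_a <- c(2, 2, 3, 5, 2, 5, 5, 4, 4, 1)
ids_b <- c(4, 3, 6, 4, 6, 5, 6, 7, 4, 6)

# Count each ID; naming the table dimension "ID" names the key column
tab_a <- as.data.frame(table(ID = ids_a), responseName = "Count.A")
tab_b <- as.data.frame(table(ID = ids_b), responseName = "Count.B")

# Full outer join on ID, then turn "ID absent" (NA) into a count of 0
merged <- merge(tab_a, tab_b, by = "ID", all = TRUE)
merged$Count.A[is.na(merged$Count.A)] <- 0L
merged$Count.B[is.na(merged$Count.B)] <- 0L

# Sort numerically by ID (the key is a factor after table())
merged <- merged[order(as.integer(as.character(merged$ID))), ]
print(merged)
```

Whether 0 or NA is the better marker depends on what the counts are used for afterwards; NA keeps "not present" distinct from a genuine count.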

HTH,

Marc Schwartz


Re: [R] OK - I got the data - now what? :-)

2009-07-08 Thread Mark Knecht

On Wed, Jul 8, 2009 at 10:51 AM, Michael A. Miller <mmill...@iupui.edu> wrote:
 Mark wrote:

     Currently my data is one experiment per row, but that's
     wasting space as most experiments only take 20% of the row
     and 80% of the row is filled with 0's. I might want to make
     the array more narrow and have a flag somewhere in the 1st
     10 columns that says that this row is a continuation row
     from the previous row. That way I could pack the array
     better, use less memory and when I do finally test for 0 I
     have a short line to traverse?

 This may be a bit off track from the data manipulation you are
 working on, but I thought I'd point out that another way to
 handle this sort of data is to make a table with one measurement
 per row, rather than one experiment per row.

 experiment measurement value
         A           1  0.27
         A           2  0.66
         A           3  0.24
         A           4  0.55
         B           1  0.13
         B           2  0.65
         B           3  0.83
         B           4  0.41
         B           5  0.92
         B           6  0.67
         C           1  0.75
         C           2  0.97
         C           3  0.49
         C           4  0.58
         D           1  1.00
         D           2  0.71
         E           1  0.11
         E           2  0.50
         E           3  0.98
         E           4  0.07
         E           5  0.94
         E           6  0.57
         E           7  0.34
         E           8  0.21


 If you wrote the output of your calculations in this way, one
 value per line, it can easily be read into R as a data.frame and
 handled with less need for munging.  No need to remove the
 zero-padding because the zeros aren't needed in the first place.

 You can subset the data with subset, as in

  test <- read.table('test.dat',header=TRUE)
  expA <- subset(test, experiment=='A')
  expB <- subset(test, experiment=='B')

 so there is no need to deal with ragged/zero-padded arrays. Your
 plots can be grouped automatically with lattice:

 require(lattice)
 xyplot(value ~ measurement, data=test, group=experiment, type='b')
 xyplot(value ~ measurement | experiment, data=test, type='b')


 It is simple to do calculations by experiment using tapply.  For
 example


 with(test, tapply(value, experiment, mean))
         A         B         C         D         E
 0.4300000 0.6016667 0.6975000 0.8550000 0.4650000


 with(test, tapply(measurement, experiment, max))
 A B C D E
 4 6 4 2 8



 Mike


Mike,
   It's not really that far off track as I didn't have any background
when I started this in R. This is the first time I've used it. I
simply chose to use a format that I thought would work for me in both
Excel and R. I do like your examples.

   My impression of reshape coupled with cast is that it's pretty
capable of giving me more or less the same format you suggest although
it is a bit of work. Currently in my files I save only the start and
finish times of the experiments and planned on calculating all the
times in the middle if necessary. With this format I'd just write them
out on each line and save that work in R.

   I suppose the files using this alternative format would be a lot
larger on disk. I currently have 10 values + 500 observations per
experiment with an average experiment tracking file containing maybe
500-1000 experiments. With this format in the worst case I suppose I'd have
(10+1) * 1000 per experiment on disk, but on average it would be less
than that because as you say I wouldn't write out any zeros. Once in R
in memory they'd be equivalent. Disk space doesn't matter but reading
and writing the files might be slower. I suppose I don't really have
to write the zeros out anyway, but at this point it's just one
additional subset after going through reshape.

   It might be an advantage to get to the subset commands immediately
but still I've got 10 independent variables and I suspect I'm going to
be using reshape/cast more than once to get to my answers so I haven't
been against learning how to work with it.
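
For what it's worth, going from the zero-padded one-experiment-per-row layout to the one-measurement-per-row layout can also be done with base R's reshape(). This is a minimal sketch only: the column names (experiment, m1 to m4) and the tiny table are made up for illustration, and the real files have many more columns.

```r
# Hypothetical wide layout: one experiment per row, padded with zeros
wide <- data.frame(
  experiment = c("A", "D"),
  m1 = c(0.27, 1.00),
  m2 = c(0.66, 0.71),
  m3 = c(0.24, 0),
  m4 = c(0.55, 0)
)

# One row per (experiment, measurement) pair
long <- reshape(
  wide,
  direction = "long",
  varying   = c("m1", "m2", "m3", "m4"),
  v.names   = "value",
  timevar   = "measurement",
  times     = 1:4,
  idvar     = "experiment"
)

# Drop the padding (assumes a real measurement is never exactly 0)
long <- subset(long, value != 0)
long <- long[order(long$experiment, long$measurement), ]
rownames(long) <- NULL
print(long)
```

The result has the same three columns as the experiment/measurement/value table quoted above, so subset(), tapply() and lattice can be applied to it directly.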

   Overall they are good inputs and I appreciate them. Thanks!

Cheers,
Mark


[R] \dQuote in packages

2009-07-08 Thread Rebecca Sela

I am in the process of submitting a package to CRAN.  R CMD check ran 
successfully on the package on my local computer, using R version 2.1.1.  
However, on the computers for CRAN (with version 2.10.0), the following errors 
occurred:

Warning in parse_Rd("./man/predict.Rd", encoding = "unknown") :
  ./man/predict.Rd:28: unknown macro '\dquote'
*** error on file ./man/predict.Rd
Error : ./man/predict.Rd:28: Unrecognized macro \dquote
Warning in parse_Rd("./man/print.Rd", encoding = "unknown") :
  ./man/print.Rd:17: unexpected UNKNOWN '\sideeffects'
Warning in parse_Rd("./man/simpleREEMdata.Rd", encoding = "unknown") :
  ./man/simpleREEMdata.Rd:10: unknown macro '\item'

Are \dquote, \sideeffects, and \item not supported in newer versions of R?  Is 
there some underlying problem that I should fix that makes these show up?

Thank you very much.

Rebecca


Re: [R] truncated regression out-of-sample predictions

2009-07-08 Thread Wouterse, Fleur (IFPRI-Senegal)

Dear all, 

 

I am trying to implement Simar & Wilson's (2007) second algorithm and
have the following question: If I use a truncated regression on the m < n
observations, how do I get fitted values for all n observations, instead
of for m observations, which is what the command fitted returns? I would
need these to construct the left-truncation needed to draw n random
deviates. 

 

Thanks for your help, 

 

Fleur

 

 

Fleur Wouterse, Ph.D. 

Post-Doctoral Fellow 

IFPRI-Dakar 

Immeuble Ousseynou Thiam Gueye 

Rue de Thies

Point E, BP 15702 CP 12524

Dakar Fann

Senegal

Phone: +221 33 869 3986

Email: f.woute...@cgiar.org

 


[[alternative HTML version deleted]]


[R] heatmap.2: question regarding the raw z-score

2009-07-08 Thread Chrysanthi A.

Hi,

I am analysing gene expression data using the heatmap.2 function in R and I
was wondering what is the formula of the raw z-score bar which shows the
colors for each pixel.
According to that post:
https://mailman.stat.ethz.ch/pipermail/r-help/2006-September/113598.html, it
is the

(actual value - mean of the group) / standard deviation.

But, mean of which group? Mean of the gene vector? And actual value of that
gene on a sample?  I would be grateful if you could give me some more
details about it, or point me to a book/manual that I could refer to.
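
For reference, a row-wise z-score can be computed by hand as below. This is only a sketch, assuming genes are in rows, samples in columns, and that the heatmap was drawn with scale = "row" (the post does not say); under that assumption "the group" is the gene's own row. The matrix expr is made-up data.

```r
# Made-up expression matrix: 3 genes (rows) x 4 samples (columns)
expr <- matrix(
  c(2, 4, 6, 8,
    1, 1, 3, 3,
    10, 20, 30, 40),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("sample", 1:4))
)

# z = (value - mean of that gene's row) / sd of that gene's row
row_mean <- rowMeans(expr)
row_sd   <- apply(expr, 1, sd)
z <- sweep(sweep(expr, 1, row_mean, "-"), 1, row_sd, "/")

# Same result via scale(), which works column-wise, hence the transposes
z2 <- t(scale(t(expr)))
print(round(z, 3))
```

After this transformation every row has mean 0 and standard deviation 1, which is why genes with very different absolute expression levels end up on the same colour scale.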

Thanks a lot,

Chrysanthi.


[[alternative HTML version deleted]]


[R] print() to file?

2009-07-08 Thread Steve Jaffe


I'd like to write some objects (eg arrays) to a log file. cat() flattens them
out. I'd like them formatted as in 'print' but print only writes to stdout.
Is there a simple way to achieve this result? 

Thanks
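
Two base R options that may do what is asked are capture.output() and sink(). A minimal sketch; the array arr and the temporary log file are placeholders, not anything from the original post.

```r
arr <- array(1:12, dim = c(2, 3, 2))
log_file <- tempfile(fileext = ".log")

# Option 1: capture what print() would show and append it to the file
cat("arr at step 1:\n", file = log_file)
capture.output(print(arr), file = log_file, append = TRUE)

# Option 2: temporarily divert all console output to the file
sink(log_file, append = TRUE)
print(arr)
sink()  # restore output to the console

log_lines <- readLines(log_file)
```

capture.output() is the more local of the two, since only the expressions passed to it are redirected; sink() affects everything printed until it is switched off again.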



Re: [R] matching each row

2009-07-08 Thread tathta


From an email suggestion, here are two sample datasets, and my ideal output: 

dataA <- data.frame(unique.id=c("A","B","C","B"),x=11:14,y=5:2)
dataB <- data.frame(unique.id=c("A","B","A","B","A","C","D","A"),x=27:20,y=22:29)

## mystery operation(s) happen here

## ideal output would be: 
dataA <- data.frame(unique.id=c("A","B","C","B"),x=11:14,y=5:2,countA=c(1,2,1,2),countB=c(4,2,1,2))


so my mystery operation(s) would count the number of times the unique id
shows up in a given dataset.  
my ideal outputs are as follows: 
countA is the mystery operation applied to dataA (counting occurrences
within the same dataset)
countB is applied to dataB (counting occurrences within a second dataset).  



My best try so far is to do: 
tempA <- aggregate(dataA$unique.id,list(dataA$unique.id),length)

which gives me a matrix with ONE instance of each unique.id and the
counts...
(and which I thought was kinda cute)
but it only works for within a single dataset! 
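
One possible "mystery operation", offered as a sketch rather than the definitive answer: tabulate the ids of the second data frame and look the counts up by name for each row of the first. The helper name count_ids is hypothetical, not an existing R function.

```r
dataA <- data.frame(unique.id = c("A", "B", "C", "B"), x = 11:14, y = 5:2)
dataB <- data.frame(unique.id = c("A", "B", "A", "B", "A", "C", "D", "A"),
                    x = 27:20, y = 22:29)

# Hypothetical helper: for each element of ids, how often it occurs in pool
count_ids <- function(ids, pool) {
  ids <- as.character(ids)
  # Fixing the levels gives a count of 0 for ids that never occur in pool
  counts <- table(factor(as.character(pool), levels = unique(ids)))
  as.vector(counts[ids])
}

dataA$countA <- count_ids(dataA$unique.id, dataA$unique.id)
dataA$countB <- count_ids(dataA$unique.id, dataB$unique.id)
print(dataA)
```

Because the lookup is by name, repeated ids in dataA (here "B") simply receive the same count on each of their rows.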







[R] bootstrapping error message Error in t.star[r, ] <- statistic(data, i[r, ], ...) : number of items to replace is not a multiple of replacement length

2009-07-08 Thread Karina Boege




Hi,

I am trying to run some bootstraps with the boot package. When I run 
it with 400 replicates it works fine, but I need to run the same 
analysis with 89, 86, 102 and 106 samples (for four different 
environments), and that is when I get the error message:


> mybootstrap <- boot(Datos, mystat, 2000)
Error in t.star[r, ] <- statistic(data, i[r, ], ...) :  number of 
items to replace is not a multiple of replacement length


Is anyone familiar with this error message?
Does anyone know the minimum sample size for the boot package to run 
properly? Is there any way to tell R how many samples it should pick 
for the resampling?


If it helps, this is what my model looks like:

mymodel = lm(Datos[,4]~Datos[,1]+ 
Datos[,8]+Datos[,9]+Datos[,10]+Datos[,11]+Datos[,12])

summary(mymodel)

mystat <- function(a,b)
f <- lm(a[b,4]~a[b,1]+a[b,8]+ a[b,9]+a[b,10]+a[b,11]+a[b,12])$coef

mybootstrap <- boot(Datos, mystat, 2000)

INT1 <- boot.ci(mybootstrap, conf=0.95, type="all", index=1)
INT2 <- boot.ci(mybootstrap, conf=0.95, type="all", index=2)
INT3 <- boot.ci(mybootstrap, conf=0.95, type="all", index=3)
INT4 <- boot.ci(mybootstrap, conf=0.95, type="all", index=4)
INT5 <- boot.ci(mybootstrap, conf=0.95, type="all", index=5)
INT6 <- boot.ci(mybootstrap, conf=0.95, type="all", index=6)
INT7 <- boot.ci(mybootstrap, conf=0.95, type="all", index=7)


Thanks for your help! I am new to bootstraps and to R, and I feel 
pretty lonely with this
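
One thing worth checking: boot() stores each replicate in a row of a matrix whose width is the length of the statistic on the original data, so this error appears when a resample returns a vector of a different length. Whether that is what happens here cannot be told from the post. The sketch below uses the built-in mtcars data and an illustrative function name coef_stat (not Datos and mystat) to show a statistic function whose result has the same length for every resample, because all predictors are numeric.

```r
library(boot)  # recommended package shipped with standard R installations

# Statistic for boot(): first argument is the data, second the resampled row indices
coef_stat <- function(data, idx) {
  fit <- lm(mpg ~ wt + hp, data = data[idx, ])
  coef(fit)  # numeric vector of length 3 for every resample
}

set.seed(42)
bs <- boot(mtcars, coef_stat, R = 200)

# Percentile interval for the second coefficient (wt)
ci_wt <- boot.ci(bs, conf = 0.95, type = "perc", index = 2)
print(ci_wt)
```

A quick diagnostic along the same lines is to call the statistic by hand on a few resamples, for example length(coef_stat(mtcars, sample(nrow(mtcars), replace = TRUE))), and compare the result with length(bs$t0).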


Karina Boege

