[R] Missing data

2013-04-24 Thread Roslina Zakaria
Dear r-users,

I would like to investigate how to fill in missing data.  I started with a
complete data set and introduced missing values into the series.  Then I
would use some method to fill in the missing values and compare the result
with the original data to see how well the method works.  My question is:
how do I introduce missing values into my complete data systematically, for
example so that every 10th value is erased and treated as missing?  Here are
some rainfall data:

125
130.3
327.2
252.2
33.8
6.1
5.1
0.5
0.5
0
2.3
0
0
0
0
0
0
0
0
0
0.8
5.1
0
0.3
0
0
0
0
0
0
45.7
43.4
0
0
0
0
0

Thank you so much for any help given.  I hope my question is clear.
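
For what it is worth, a minimal sketch of the systematic-deletion step
(assuming the rainfall values above are stored in a numeric vector called
rain; the object names here are only for illustration):

rain.miss <- rain
rain.miss[seq(10, length(rain.miss), by = 10)] <- NA   # erase every 10th value

## after filling in rain.miss by some method into, say, rain.filled,
## compare against the original at the erased positions, e.g. by RMSE
idx  <- which(is.na(rain.miss))
rmse <- sqrt(mean((rain[idx] - rain.filled[idx])^2))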


Re: [R] Regarding Modeling - Please! QUICK HELP

2013-04-24 Thread Rolf Turner


Sorry, this list has a "No homework" policy.
Please ask your lecturer or tutor about this.

cheers,

Rolf Turner

On 25/04/13 14:18, Andrew Cochrane wrote:

I'm a student currently working with the *sleepstudy* dataset in
matrix.pkg. It deals with the reaction times of sleep deprived students
over a period of days.

I am trying to model reaction times in order to describe the variation
between students by the number of days they haven't slept.

This is what I'm running in R, but unfortunately I'm missing something:



logmod11 <- lmer(log(Reaction) ~ (Subject|Days),REML=FALSE)


This is obviously incorrect, so if someone could give me some quick help
I'd really appreciate it.

Thanks!





Re: [R] Regarding Modeling - Please! QUICK HELP

2013-04-24 Thread Patrick Coulombe
Hi Andrew,

I don't know the dataset at all (and you seem to assume that your
readers will), but anyway: it looks like you're trying to do an
intercept-only model. If that's the case, try:

> logmod11 <- lmer(log(Reaction) ~ 1 + (1|Subject),REML=FALSE)

1 is the intercept, and anything in parentheses is a random
effect--in this case, the intercept is random, and your level-2 grouping
variable is Subject (several rows per Subject).

If you want to add Days as a predictor, try:

> logmod11 <- lmer(log(Reaction) ~ 1 + Days + (Days|Subject),REML=FALSE)

Here, both the intercept and the coefficient for Days are random
(allowed to vary for each Subject). Don't forget to pass the dataset
via the data argument if it isn't attached to your environment.
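
For concreteness, a minimal sketch of both calls (assuming the sleepstudy
data that ships with the lme4 package, which also provides lmer()):

library(lme4)
data(sleepstudy)

## random intercept only, on the log scale
logmod10 <- lmer(log(Reaction) ~ 1 + (1 | Subject), data = sleepstudy, REML = FALSE)

## random intercept and random slope for Days
logmod11 <- lmer(log(Reaction) ~ 1 + Days + (Days | Subject), data = sleepstudy, REML = FALSE)

summary(logmod11)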

Hope this helps,
Patrick


2013/4/24 Andrew Cochrane :
> I'm a student currently working with the *sleepstudy* dataset in
> matrix.pkg. It deals with the reaction times of sleep deprived students
> over a period of days.
>
> I am trying to model reaction times in order to describe the variation
> between students by the number of days they haven't slept.
>
> This is what I'm running in R, but unfortunately I'm missing something:
>
>
>> logmod11 <- lmer(log(Reaction) ~ (Subject|Days),REML=FALSE)
>
>
> This is obviously incorrect, so if someone could give me some quick help
> I'd really appreciate it.
>
> Thanks!
>
> --
> J. Andrew Cochrane
> University of Illinois | 2013
> College of Liberal Arts and Sciences | Statistics
> (630) 991-7502
>


Re: [R] Assigning a variable value based on multiple columns

2013-04-24 Thread Patrick Coulombe
Hi Jason,

I think the easiest approach for you would be to keep your current if/else
statements as they are, but first change your NAs into something else (e.g.,
-999). To do this in one line, you can use the package "gdata".

In this code, I assume that your data are stored in the variable "dataset":


###
#install package gdata if not yet installed
install.packages("gdata")

#load package gdata
library(gdata)

#change NA into -999
dataset <- NAToUnknown(dataset, -999)


#do your ifs/ifelses here...
#...
#...


#change -999 back into NA
dataset <- unknownToNA(dataset, -999)



And that should do it.
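
Alternatively, a small vectorized sketch (untested against your real data; it
assumes the columns are named exactly as in your example and that an NA should
simply be ignored, i.e. treated as 0, when applying the rules):

d <- dataset
d[is.na(d)] <- 0   # NAs count as "condition absent" for the cutoff rules
dataset$TSTcutoff <- ifelse(d$HIV == 1 | d$Contact == 1, 5,
                     ifelse(d$Diabetes == 1 | d$ESRD == 1, 10, 15))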

Hope this helps,
Patrick


2013/4/24 Jason Stout, M.D. 
>
> Hi All,
>
> I'm hoping someone can help me with a relatively simple problem.  Take the 
> following dataset:
>
> ID  Diabetes  ESRD  HIV  Contact
> 1   0         0     NA   0
> 2   1         0     NA   0
> 3   NA        1     0    0
> 4   0         NA    0    1
> 5   1         1     1    0
>
> I want to generate a column called TSTcutoff based on the values in the row.  
> TSTcutoff would be the lower of 15 (if Diabetes=ESRD=HIV=Contact=0), 10 (if 
> Diabetes or ESRD=1 AND HIV=Contact=0), or 5 (if HIV OR Contact=1).  I was 
> thinking this could be done with a series of IFELSE statements, but the NA 
> values make this more challenging.  I want to ignore NA values when 
> calculating TSTcutoff.  So the final dataset should look like this:
>
> ID  Diabetes  ESRD  HIV  Contact  TSTcutoff
> 1   0         0     NA   0        15
> 2   1         0     NA   0        10
> 3   NA        1     0    0        10
> 4   0         NA    0    1         5
> 5   1         1     1    0         5
>
> Thanks for any suggestions.
>
> Jason Stout, MD, MHS
> Box 102359-DUMC
> Durham, NC 27710
> FAX 919-681-7494
>


[R] Regarding Modeling - Please! QUICK HELP

2013-04-24 Thread Andrew Cochrane
I'm a student currently working with the *sleepstudy* dataset in
matrix.pkg. It deals with the reaction times of sleep deprived students
over a period of days.

I am trying to model reaction times in order to describe the variation
between students by the number of days they haven't slept.

This is what I'm running in R, but unfortunately I'm missing something:


> logmod11 <- lmer(log(Reaction) ~ (Subject|Days),REML=FALSE)


This is obviously incorrect, so if someone could give me some quick help
I'd really appreciate it.

Thanks!

-- 
J. Andrew Cochrane
University of Illinois | 2013
College of Liberal Arts and Sciences | Statistics
(630) 991-7502



[R] Bootstrapping in R

2013-04-24 Thread Preetam Pal
Hi all,

1) I have 3 vectors a, b and c, each of length 25. I want to define a
new data frame z such that row 1 of z is (a[1], b[1], c[1]), row 2 is
(a[2], b[2], c[2]), and so on. How do I do this in R?


2) Then I want to draw bootstrap samples from z.

Kindly suggest how I can do this in R.
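
A minimal sketch of both steps (assuming a, b and c already exist as numeric
vectors of length 25):

## 1) combine the vectors so that row i of z is (a[i], b[i], c[i])
z <- data.frame(a = a, b = b, c = c)

## 2) one bootstrap sample: draw 25 rows with replacement
z.boot <- z[sample(nrow(z), replace = TRUE), ]

## repeat, e.g., 1000 times
boot.samples <- lapply(1:1000, function(i) z[sample(nrow(z), replace = TRUE), ])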

Thanks,
Preetam
-- 
Preetam Pal
(+91)-9432212774
M-Stat 2nd Year, Room No. N-114
Statistics Division,   C.V.Raman
Hall
Indian Statistical Institute, B.H.O.S.
Kolkata.



[R] Linear Interpolation : Missing rates

2013-04-24 Thread Katherine Gobin
Dear R forum

I have data.frame as

df = data.frame(rate_name = c("USD_1w", "USD_1w", "USD_1w", "USD_1w", "USD_1m", 
"USD_1m", "USD_1m", "USD_1m", "USD_2m", "USD_2m", "USD_2m", "USD_2m",  
"GBP_1w", "GBP_1w", "GBP_1w", "GBP_1w", "GBP_1m", "GBP_1m", "GBP_1m", "GBP_1m", 
"GBP_2m", "GBP_2m", "GBP_2m", "GBP_2m", "EURO_1w", "EURO_1w", "EURO_1w", 
"EURO_1w", "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2m", "EURO_2m", 
"EURO_2m", "EURO_2m"), rates = c(2.05, 2.07, 2.06, 2.06, 2.22, 2.24, 2.23, 
2.23, 2.31, 2.33, 2.33, 2.31, 1.06, 1.08, 1.08, 1.08, 1.21, 1.21, 1.23, 1.21, 
1.41, 1.39, 1.39, 1.37, 1.82, 1.82, 1.81, 1.80, 1.98, 1.98, 1.97, 1.97, 2.1, 
2.09, 2.09, 2.11)) 

currency = c("EURO", "GBP", "USD")
tenor = c("1w", "2w", "1m", "2m", "3m")

# _

> df
   rate_name rates
1 USD_1w  2.05
2 USD_1w  2.07
3 USD_1w  2.06
4 USD_1w  2.06
5 USD_1m  2.22
6 USD_1m  2.24
7 USD_1m  2.23
8 USD_1m  2.23
9 USD_2m  2.31
10    USD_2m  2.33
11    USD_2m  2.33
12    USD_2m  2.31
13    GBP_1w  1.06
14    GBP_1w  1.08
15    GBP_1w  1.08
16    GBP_1w  1.08
17    GBP_1m  1.21
18    GBP_1m  1.21
19    GBP_1m  1.23
20    GBP_1m  1.21
21    GBP_2m  1.41
22    GBP_2m  1.39
23    GBP_2m  1.39
24    GBP_2m  1.37
25   EURO_1w  1.82
26   EURO_1w  1.82
27   EURO_1w  1.81
28   EURO_1w  1.80
29   EURO_2w  1.98
30   EURO_2w  1.98
31   EURO_2w  1.97
32   EURO_2w  1.97
33   EURO_2m  2.10
34   EURO_2m  2.09
35   EURO_2m  2.09
36   EURO_2m  2.11

As can be seen, USD_2w, GBP_2w and EURO_1m are missing and I need to 
INTERPOLATE these rates, which can be done using approx or approxfun. In 
reality I can have many currencies with many tenors. The problem is that when 
the data.frame "df" is read or accessed in R, I do not know in advance which 
tenors are missing. For a given currency, more than one consecutive tenor may 
be missing, e.g. in the case of EURO I may have EURO_1w, EURO_2w and then 
EURO_4m, so EURO_1m, EURO_2m and EURO_3m are missing. 


I understand this is a somewhat vague question and I apologize for that. 
Any suggestions please?
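
For what it is worth, one possible direction, sketched only (the tenor-to-days
mapping and the per-tenor averaging below are assumptions made for illustration):

tenor_days <- c("1w" = 7, "2w" = 14, "1m" = 30, "2m" = 60, "3m" = 90)

## observed USD tenors and their (averaged) rates from df
usd_known <- data.frame(days = c(7, 30, 60),         # 1w, 1m, 2m are present
                        rate = c(2.06, 2.23, 2.32))  # mean rate per tenor

## linearly interpolate the missing 2w (14-day) rate
approx(x = usd_known$days, y = usd_known$rate, xout = tenor_days["2w"])$y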

Regards

Katherine







Re: [R] Loop for main title in a plot

2013-04-24 Thread David Winsemius

On Apr 24, 2013, at 9:22 PM, Eva Günther wrote:

> Hi all,
> 
> I have a problem in including my plot in a loop. Here is a simple example
> for one plot:
> 
> # Plot simple graph with super- and subscript
> a<-c(1,2,3,4)
> b<-c(1,2,3,4)
> 
> plot(x=a,y=b,
>ylab=expression(paste("Apple"["P"])),
>xlab=expression(paste("Banana"^"th")),
>main=expression(paste(italic("i-")~"4"^"th"~"choice")))
> 
> Now I would like to include the title (main) as a function of the number of
> trials
> for (trial in 1:nTrials) {
> plot(
> 
> main=expression(paste(italic("i-")~"trial"^"th"~"choice")))
> 
> }
> 

I have not quite figured out what the exact goal is, but this shows how to loop 
with expressions that have an evaluated component; bquote is your friend:

for (trial in 1:length(a)) {
plot(x=a,y=b,   # you were at the very least missing an x and y argument
main=bquote( italic(i)-.(trial)^th~choice))
}

... and you had too many quoted elements. Plotmath `paste` is usually not 
needed. Learn to use * and ~.

David.

> e.g. nTrials = 5
> The title should look like this:
> 
> 5th plot: i ^th choice
> 4th plot: i-1 ^th choice
> 3rd plot: i-2 ^th choice, and so on
> 
> I am having trouble creating that; could you please help me?
> 
> Thank you!!
> 

David Winsemius
Alameda, CA, USA



Re: [R] fonts not rendering during plot

2013-04-24 Thread David Winsemius

On Apr 24, 2013, at 8:42 PM, George Dietz wrote:

> I apologize for not going into too much detail; I didn't want to confuse 
> people with the gory details of what I'm trying to do


You've slighted us. Kept us from our due. We LOVE gory OS and code details. 
Chop that chicken up. Throw dem bones.  Without the complete set of entrails 
our divinations are incomplete and we are unable to perform our wizardly deeds 
of derring-do.

-- 
David.

> (trying to make a portable R installation bundled into a Mac app, among 
> other things).  I am on a Mac running Mountain Lion. Anyhow, I figured out the 
> problem was with the paths pango uses to search for font configuration 
> files. I solved it by writing a script that sets these paths at run-time.
> 
> George
> 
> On Apr 24, 2013, at 7:45 PM, David Winsemius  wrote:
> 
>> 
>> On Apr 24, 2013, at 7:29 AM, George Dietz wrote:
>> 
>>> Hi,
>>> 
>>> I am having a problem where sometimes fonts for text in plots don't get 
>>> rendered-the text just shows up as boxes. If I call R from a certain 
>>> directory the problem goes away, otherwise the fonts don't render (but 
>>> plots get made). In both cases, my PANGO_RC and FONTCONFIG_FILE do not 
>>> change…
>> 
>> Offering the OS details is more courteous than omitting them. (And there are 
>> several other courtesies expected of you listed in the Fine Posting Guide.) 
>> If this happens to be on a Mac, it may mean your screen fonts are corrupt. 
>> You can see their names with: names(quartzFonts()).  Using Fontbook.app will 
>> sometimes reveal that there are two copies of one of the screen graphics 
>> fonts and that one of them displays as blanks. These should be deleted with 
>> Fontbook.app.
>> 
>> -- 
>> 
>> David Winsemius
>> Alameda, CA, USA
>> 
> 

David Winsemius
Alameda, CA, USA



[R] Loop for main title in a plot

2013-04-24 Thread Eva Günther
Hi all,

I have a problem in including my plot in a loop. Here is a simple example
for one plot:

# Plot simple graph with super- and subscript
a<-c(1,2,3,4)
b<-c(1,2,3,4)

plot(x=a,y=b,
ylab=expression(paste("Apple"["P"])),
xlab=expression(paste("Banana"^"th")),
main=expression(paste(italic("i-")~"4"^"th"~"choice")))

Now I would like to include the title (main) as a function of the number of
trials
for (trial in 1:nTrials) {
plot(

main=expression(paste(italic("i-")~"trial"^"th"~"choice")))

}

e.g. nTrials = 5
The title should look like this:

5th plot: i ^th choice
4th plot: i-1 ^th choice
3rd plot: i-2 ^th choice, and so on

I am having trouble creating that; could you please help me?

Thank you!!



[R] Assigning a variable value based on multiple columns

2013-04-24 Thread Jason Stout, M.D.
Hi All,

I'm hoping someone can help me with a relatively simple problem.  Take the 
following dataset:

ID  Diabetes  ESRD  HIV  Contact
1   0         0     NA   0
2   1         0     NA   0
3   NA        1     0    0
4   0         NA    0    1
5   1         1     1    0

I want to generate a column called TSTcutoff based on the values in the row.  
TSTcutoff would be the lower of 15 (if Diabetes=ESRD=HIV=Contact=0), 10 (if 
Diabetes or ESRD=1 AND HIV=Contact=0), or 5 (if HIV OR Contact=1).  I was 
thinking this could be done with a series of IFELSE statements, but the NA 
values make this more challenging.  I want to ignore NA values when calculating 
TSTcutoff.  So the final dataset should look like this:

ID  Diabetes  ESRD  HIV  Contact  TSTcutoff
1   0         0     NA   0        15
2   1         0     NA   0        10
3   NA        1     0    0        10
4   0         NA    0    1         5
5   1         1     1    0         5

Thanks for any suggestions.

Jason Stout, MD, MHS
Box 102359-DUMC
Durham, NC 27710
FAX 919-681-7494



Re: [R] fonts not rendering during plot

2013-04-24 Thread George Dietz
I apologize for not going into too much detail; I didn't want to confuse people 
with the gory details of what I'm trying to do (trying to make a portable R 
installation bundled into a Mac app, among other things).  I am on a Mac running 
Mountain Lion. Anyhow, I figured out that the problem was with the paths pango 
uses to search for font configuration files. I solved it by writing a script 
that sets these paths at run-time.

George

On Apr 24, 2013, at 7:45 PM, David Winsemius  wrote:

> 
> On Apr 24, 2013, at 7:29 AM, George Dietz wrote:
> 
>> Hi,
>> 
>> I am having a problem where sometimes fonts for text in plots don't get 
>> rendered-the text just shows up as boxes. If I call R from a certain 
>> directory the problem goes away, otherwise the fonts don't render (but plots 
>> get made). In both cases, my PANGO_RC and FONTCONFIG_FILE do not change…
> 
> Offering the OS details is more courteous than omitting them. (And there are 
> several other courtesies expected of you listed in the Fine Posting Guide.) 
> If this happens to be on a Mac, it may mean your screen fonts are corrupt. 
> You can see their names with: names(quartzFonts()).  Using Fontbook.app will 
> sometimes reveal that there are two copies of one of the screen graphics 
> fonts and that one of them displays as blanks. These should be deleted with 
> Fontbook.app.
> 
> -- 
> 
> David Winsemius
> Alameda, CA, USA
> 



Re: [R] Distance matrices Combinations

2013-04-24 Thread arun
HI,

Your code is not very clear. 

mata<-m[,c(u)]       # assume that "m" is "mat1" or "w" (as "m" was not defined)

#also assume that "124" as nrow  of each matrix 
# For 10 distance matrices (#10C4)
set.seed(25)
lst1 <- lapply(1:10, function(i)
    dist(matrix(sample(1:50, 5 * 124, replace = TRUE), nrow = 124)))
names(lst1) <- seq_along(lst1)
lst2 <- lapply(seq_len(ncol(combn(names(lst1), 4))), function(i) {
    x <- combn(names(lst1), 4)[, i]
    as.matrix((lst1[[x[1]]])^2 + (lst1[[x[2]]])^2 + (lst1[[x[3]]])^2 + (lst1[[x[4]]])^2)
})
X <- do.call(cbind, lapply(lst2, function(x) {
    w <- sqrt(x)
    do.call(rbind, lapply(seq_len(nrow(w)), function(i) {
        r <- matrix(sort(x[i, ], index.return = TRUE)$ix, ncol = 1)
        u <- r[2:6, ]
        mata <- w[, u]
        b <- matrix(apply(mata, 1, mean), ncol = 1)
        e <- sum(abs(b - w[, i]))
    }))
}))
 o<-mean(apply(X,2,sd))
 o
#[1] 268.7831
A.K.

- Original Message -
From: eliza botto 
To: "r-help@r-project.org" 
Cc: 
Sent: Wednesday, April 24, 2013 2:07 PM
Subject: [R] Distance matrices Combinations

Dear UseRs,
My problem is a small piece of a really big and complicated problem. Put very 
simply, all I want is to plug every possible combination of 4 out of my 75 
distance matrices (i.e. 75C4 combinations) into the following equation.


t<-as.matrix((MAT1)^2+(MAT2)^2+(MAT3)^2+(MAT4)^2+,upper=T,diag=T))

Then "1215450" values of "t"(one for each combination) should one by one be 
inculcated in the following loop(to calculate the "o" value) and in the end 
want those 10 combinations of distance matrices which have lowest "o" values.

e <- vector("list", 124)

w<-sqrt(t)

mat1<-w

for (i in 1:124){

r<-matrix(sort(mat1[i,],index.return=TRUE)$ix,ncol=1)

u<-r[c(2,3,4,5,6),1]

mata<-m[,c(u)]                ##(shifted)

amata<-apply(mata,1,mean)

amata<-data.frame(amata)

aavg<-as.matrix(amata, ncol=1)

b<-aavg

e[[i]]<-print(sum(abs(b-m[,i])))

}

x<-do.call(rbind,e)

Y<-x

z <- apply(Y,2,sd)

o<-mean(Y)

Does my question make any sense?
Thanks in advance
Elisa                           


[R] creating raster image in R

2013-04-24 Thread Kristi Glover
Hi R-User



I was trying to make a raster map with the WGS84 projection in R, but I could 
not manage it. I found one data set via Google that is in almost the same 
format as mine. I want to make a raster map of temperature with 1 degree 
spatial resolution at the global scale. I could make it in GIS software, but I 
have many variables (so many raster images to create) and ultimately I am 
importing them into R for further analysis. Therefore I wanted to create them 
in R, if possible.

So I am wondering how I can create a raster map in R. (I have provided a link 
to the data for reference; this is an example data set.)

I really appreciate your help.

#--
I downloaded the following packages 

#create a raster map from scratch
install.packages("raster", dependencies=TRUE)
library(raster)  # raster data
install.packages("rgdal", dependencies=TRUE)
library(rgdal)  # input/output, projections
install.packages("rgeos", dependencies=TRUE)
library(rgeos)  # geometry ops
install.packages("spdep", dependencies=TRUE)
library(spdep)  # spatial dependence
install.packages("pastecs", dependencies=TRUE)
library(pastecs)

pts <- read.table("https://www.betydb.org//miscanthusyield.csv", header = TRUE,
                  sep = ",")
proj4string(pts) <- CRS("+proj=longlat +datum=WGS84")
after that, what am I supposed to do to create a raster map? Any suggestions?
#---
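
For what it is worth, one possible continuation is sketched below (a rough
sketch only; the lon/lat/temperature column names are assumptions made for
illustration, as is the use of rasterize()):

# convert the data frame to spatial points (assumed coordinate column names)
coordinates(pts) <- c("lon", "lat")
proj4string(pts) <- CRS("+proj=longlat +datum=WGS84")

# an empty 1-degree global grid in WGS84
r <- raster(ncol = 360, nrow = 180, xmn = -180, xmx = 180, ymn = -90, ymx = 90,
            crs = "+proj=longlat +datum=WGS84")

# rasterize the point values onto the grid (mean where several points share a cell)
temp_ras <- rasterize(pts, r, field = "temperature", fun = mean)
plot(temp_ras)
writeRaster(temp_ras, "temperature_1deg.tif", overwrite = TRUE)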

Cheers,
Kristi




Re: [R] the joy of spreadsheets (off-topic)

2013-04-24 Thread Ross Boylan

On 4/17/2013 5:18 AM, Kevin Wright wrote:

On Tue, Apr 16, 2013 at 4:33 PM, Jim Lemon  wrote:


On 04/17/2013 03:25 AM, Sarah Goslee wrote:

The final point does relate to Excel and any application that hides what is

going on to the casual observer. I will treasure this URL to give to anyone
who chastises my moaning when I have to perform some task in Excel. It is
not an error in the application (although these certainly exist) but a
salutory caution to those who think that if a reasonable looking number
appears in a cell, it must be the correct answer. I have found not one, but
two such errors in the simple calculation of a "birthday age" from the date
of birth and date of death.

Jim


So there (maybe) was a bug in Excel.  Maybe hidden from the "casual
observer".  And since Excel is not R, and we are R snobs, Excel is evil,
right?  But, wait.  Is it easier for a "casual observer" to detect a flaw
in the formula in Excel, or to find an incorrect array index in an R
script?
If the person knows R, or can fake it, I think it is easier.  You have 
to hunt around an Excel spreadsheet to see what the formulae are,
and the cell references usually have no inherent meaning.  Further, one 
of the errors they made, not including all the data in a range, is very 
easy to make in Excel but would be very hard to make in R.


As others have noted, the problem was not a bug in Excel the program 
(unless you consider the design a bug) but a bug induced by the use of 
Excel.


I doubt the exclusion of the range was deliberate, although the other 
errors seem to have been.  However, it is likely that if the result had 
not been to their liking the original authors would have rechecked their 
work and discovered the problem.  One of the "errors", equal weighting 
of countries regardless of how many years they spent in a given state, 
is arguably a judgement call. Selective exclusion and inclusion of data 
is also a judgement call, but that strikes me as less defensible.


Someone wrote that the overall finding of a negative relation between 
debt and growth is intact.  First of all, the headline summary was that 
if debt/GDP > 90% you fall off a cliff.  That is not intact; it is 
false.  The remaining relation is quite weak.  And the substantive 
conclusion that high debt *causes* weaker growth is a complete reading 
into a correlational finding.  It is pretty hard to sort out causal 
ordering, but some evidence suggests it is more the reverse: 
http://krugman.blogs.nytimes.com/2013/04/18/correlation-causality-and-casuistry/. 
See Krugman's and DeLong's blogs generally for gleeful commentary, or the 
original critique in 
http://www.peri.umass.edu/236/hash/31e2ff374b6377b2ddec04deaa6388b1/publication/566/.


At any rate, a policy-relevant conclusion would need to be based on a 
much more careful analysis than was done, careful not only in the 
mechanics but in using methods that at least attempted to sort out the 
causal relations.


The irony is that the substantively most trivial mistake is also the 
most clearly an error, while the more important issues are at least a 
little less clear-cut.


Ross



[R] Stochastic Frontier: Finding the optimal scale/scale efficiency by "frontier" package

2013-04-24 Thread jpm miao
Hi,

   I am trying to find the scale efficiency and optimal scale of banks by
stochastic frontier analysis, given panel data on the banks. I am free to
choose any stochastic frontier model.

   The only approach I know how to carry out in R is to estimate a translog
production function with sfa() or a related function in the frontier package,
and then use the Ray (1998) formula to find the scale efficiency. However, as
the textbook by Coelli et al. (2005) points out, concavity may not be
satisfied, so one needs to impose a non-positive definiteness condition so
that the scale efficiency is < 1. How can I do this with the frontier package?

   Is there any other SFA model/function in R recommended for finding the
scale efficiency and optimal scale?

   Thanks,

Miao



Re: [R] question

2013-04-24 Thread Roger Koenker
Good usually means good relative to something else; in this case the natural 
comparison is, as Michael has already said, f0 <- rq(y ~ 1, tau = ?), and then 
one can compute the R1 version that I originally suggested.  But since there is 
still no explicit way to evaluate this, it is all a bit pointless.
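
For concreteness, a minimal sketch of that comparison (assuming the quantreg
package is loaded and that y, x and tau are already defined; the names are
illustrative only):

f1 <- rq(y ~ x, tau = tau)   # the fitted model of interest
f0 <- rq(y ~ 1, tau = tau)   # intercept-only (empirical quantile) benchmark
R1 <- 1 - f1$rho / f0$rho    # the R1 goodness-of-fit measure mentioned above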

Roger Koenker
rkoen...@illinois.edu




On Apr 24, 2013, at 6:37 PM, R. Michael Weylandt wrote:

>> On Tue, Apr 23, 2013 at 2:54 PM, nafiseh hagiaghamohammadi
>>  wrote:
>>> Hi
>>> 
>>> I fit one linear quantile regression with package quantreg and I want to
>>> know whether this model is good or not. Is there a method for checking it?
>>> Thanks for your advice
> 
>>  I ask this question because there are 2 models, f0 and f1, in (R1 <- 1 -
>> f1$rho/f0$rho);
>> is that right?
>> 
>>  But I fit 1 model and I want to check goodness of fit for that 1 model.
>> 
>> 
> 
> Please keep your responses on list so you can get a quick reply even
> when I'm otherwise busy.
> 
> I think you could -- for a rough and ready comparison -- compare
> against a constant (empirical quantile) model (not unlike how basic
> OLS models compare against the constant mean predictor)  but someone
> else might know if there's any subtleties about quantile regression
> that should be noted here.
> 
> MW
> 


[R] Help me make faster R code for Kennard-Stone algorithm [My code is so slow from Matlab]

2013-04-24 Thread Kevin Hao
Hi all,

Can you help me change my Kennard-Stone algorithm into a faster one?

[The original code runs fast in Matlab, but when I translate the Matlab code to
R, it is very slow.]

Since my code is crude and has too many loops (translated from Matlab), it is
too slow. I hope you can help me improve the performance.

Thanks.

kevin


###############################################################
## Kennard-Stone algorithm for selection of samples
ksFun = function(x, N) {
    M = nrow(x)
    samples = 1:M

    # Initialize the vector of minimax distances
    dminmax = numeric(N)

    # Initialize the matrix of pairwise distances.
    # D is upper triangular: D[i, j] = Euclidean distance between
    # objects i and j (for j > i)
    D = matrix(0, nrow = M, ncol = M)
    for (i in 1:(M - 1)) {
        xa = x[i, ]
        for (j in (i + 1):M) {
            xb = x[j, ]
            D[i, j] = max(svd(xa - xb)$d)  # the largest singular value
        }
    }
    cat("for (i in 1:(M - 1)) done! \n")

    # start from the two objects that are farthest apart
    maxD = apply(D, 2, max)
    index_row = apply(D, 2, function(x) which(x == max(x))[1])
    index_column = which(maxD == max(maxD))
    m = vector()
    m[1] = index_row[index_column]
    m[2] = index_column
    dminmax[2] = D[m[1], m[2]]

    # then repeatedly add the candidate whose minimum distance to the
    # already selected objects is largest
    for (i in 3:N) {
        pool = setdiff(samples, m)
        dmin = numeric(M - i + 1)
        for (j in 1:(M - i + 1)) {
            indexa = pool[j]
            d = numeric(i - 1)           # distances to the i - 1 selected objects
            for (k in 1:(i - 1)) {
                indexb = m[k]
                if (indexa < indexb) {
                    d[k] = D[indexa, indexb]
                } else {
                    d[k] = D[indexb, indexa]
                }
            }
            dmin[j] = min(d)
        }
        # cat("for (j in 1:(M - i + 1)) done! \n")
        dminmax[i] = max(dmin)
        index = which(dmin == max(dmin))
        m[i] = pool[index]
    }
    cat("for (i in 3:N) done! \n")

    res = list(m = m, dminmax = dminmax)
    return(res)
}
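
As an aside, one change that should already help a lot (a sketch only, assuming
that plain Euclidean distances between rows are what the svd() call above
computes): the whole double loop that fills D can be replaced by a single
vectorized call, and the rest of the algorithm can then index the full
symmetric matrix directly.

D <- as.matrix(dist(x, method = "euclidean"))   # full symmetric distance matrix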



Re: [R] about the loglm and glm---Re:Re: Regression on stratified count data

2013-04-24 Thread meng
Understood.
Many thanks for your detailed explanation.





 



At 2013-04-24 22:22:55,"Achim Zeileis"  wrote:
>On Wed, 24 Apr 2013, meng wrote:
>
>> Hi,Achim:
>> Can all the analysis you mentioned via loglm be performed via
>> glm(...,family=poisson)?
>
>Yes.
>
>## transform table back to data.frame
>df <- as.data.frame(tab)
>
>## fit models: conditional independence, no-three way interaction,
>## and saturated
>g1 <- glm(Freq ~ age/(drug + case), data = df, family = poisson)
>g2 <- glm(Freq ~ (age + drug + case)^2, data = df, family = poisson)
>g3 <- glm(Freq ~ age * drug * case, data = df, family = poisson)
>
>## likelihood-ratio tests (against saturated)
>anova(g1, g3, test = "Chisq")
>anova(g2, g3, test = "Chisq")
>
>## compare fitted frequencies (which are essentially equal)
>all.equal(as.numeric(fitted(g1)),
>   as.data.frame(as.table(fitted(m1)))$Freq)
>all.equal(as.numeric(fitted(g2)),
>   as.data.frame(as.table(fitted(m2)))$Freq)
>
>The difference is mainly that loglm() has a specialized user interface and 
>that it uses a different optimizer (iterative proportional fitting rather 
>than iterative reweighted least squares).
>
>Best,
>Z
>
>> Many thanks.
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> At 2013-04-24 19:37:10,"Achim Zeileis"  wrote:
>> >On Wed, 24 Apr 2013, meng wrote:
>> >
>> >> Hi all:
>> >> For stratified count data,how to perform regression analysis?
>> >>
>> >> My data:
>> >> age case oc count
>> >>   1    1  1    21
>> >>   1    1  2    26
>> >>   1    2  1    17
>> >>   1    2  2    59
>> >>   2    1  1    18
>> >>   2    1  2    88
>> >>   2    2  1     7
>> >>   2    2  2    95
>> >>
>> >> age:
>> >> 1:<40y
>> >> 2:>40y
>> >>
>> >> case:
>> >> 1:patient
>> >> 2:health
>> >>
>> >> oc:
>> >> 1:use drug
>> >> 2:not use drug
>> >>
>> >> My purpose:
>> >> Analyse whether case and oc are correlated, with age as a stratification variable.
>> >>
>> >> My solution:
>> >> 1,Mantel-Haenszel test by using function "mantelhaen.test"
>> >> 2,loglinear regression by using function glm(count~case*oc,family=poisson).
>> >> But I don't know how to handle variable "age", which is the stratification variable.
>> >
>> >Instead of using glm(family = poisson) it is also convenient to use 
>> >loglm() from package MASS for the associated contingency table.
>> >
>> >The code below shows how to set up the contingency table, visualize it 
>> >using package vcd, and then fit two models using loglm. The models 
>> >considered are conditional independence of case and drug given age and the 
>> >"no three-way interaction" already suggested by Peter. Both models are 
>> >also accompanied by visualizations of the residuals.
>> >
>> >## contingency table with nice labels
>> >tab <- expand.grid(drug = 1:2, case = 1:2, age = 1:2)
>> >tab$count <- c(21, 26, 17, 59, 18, 88, 7, 95)
>> >tab$age <- factor(tab$age, levels = 1:2, labels = c("<40", ">40"))
>> >tab$case <- factor(tab$case, levels = 1:2, labels = c("patient", 
>> >"healthy"))
>> >tab$drug <- factor(tab$drug, levels = 1:2, labels = c("yes", "no"))
>> >tab <- xtabs(count ~ age + drug + case, data = tab)
>> >
>> >## visualize case explained by drug given age
>> >library("vcd")
>> >mosaic(case ~ drug | age, data = tab,
>> >   split_vertical = c(TRUE, TRUE, FALSE))
>> >
>> >## test wheter drug and case are independent given age
>> >m1 <- loglm(~ age/(drug + case), data = tab)
>> >m1
>> >
>> >## visualize corresponding residuals from independence model
>> >mosaic(case ~ drug | age, data = tab,
>> >   split_vertical = c(TRUE, TRUE, FALSE),
>> >   residuals_type = "deviance",
>> >   gp = shading_hcl, gp_args = list(interpolate = 1.2))
>> >mosaic(case ~ drug | age, data = tab,
>> >   split_vertical = c(TRUE, TRUE, FALSE),
>> >   residuals_type = "pearson",
>> >   gp = shading_hcl, gp_args = list(interpolate = 1.2))
>> >
>> >## test whether there is no three-way interaction
>> >## (i.e., dependence of case on drug is the same for both age groups)
>> >m2 <- loglm(~ (age + drug + case)^2, data = tab)
>> >m2
>> >
>> >## visualization (with default pearson residuals)
>> >mosaic(case ~ drug | age, data = tab,
>> >   expected = ~ (age + drug + case)^2,
>> >   split_vertical = c(TRUE, TRUE, FALSE),
>> >   gp = shading_hcl, gp_args = list(interpolate = 1.2))
>> >
>> >
>> >> Many thanks for your help.
>> >>
>> >> My best.
>> 
>> 
>> 
>>


Re: [R] fonts not rendering during plot

2013-04-24 Thread David Winsemius

On Apr 24, 2013, at 7:29 AM, George Dietz wrote:

> Hi,
> 
> I am having a problem where sometimes fonts for text in plots don't get 
> rendered-the text just shows up as boxes. If I call R from a certain 
> directory the problem goes away, otherwise the fonts don't render (but plots 
> get made). In both cases, my PANGO_RC and FONTCONFIG_FILE do not change…

Offering the OS details is more courteous than omitting them. (And there are 
several other courtesies expected of you listed in the Fine Posting Guide.) If 
this happens to be on a Mac, it may mean your screen fonts are corrupt. You can 
see their names with: names(quartzFonts()).  Using Fontbook.app will sometimes 
reveal that there are two copies of one of the screen graphics fonts and that 
one of them displays as blanks. These should be deleted with Fontbook.app.

-- 

David Winsemius
Alameda, CA, USA



Re: [R] question

2013-04-24 Thread R. Michael Weylandt
> On Tue, Apr 23, 2013 at 2:54 PM, nafiseh hagiaghamohammadi
>  wrote:
>> Hi
>>
>> I fit one linear quantile regression with package quantreg and I want to
>> know whether this model is good or not. Is there a method for checking it?
>> Thanks for your advice

>   I ask this question because there are 2 models, f0 and f1, in (R1 <- 1 -
> f1$rho/f0$rho);
> is that right?
>
>   But I fit 1 model and I want to check goodness of fit for that 1 model.
>
>

 Please keep your responses on list so you can get a quick reply even
when I'm otherwise busy.

I think you could -- for a rough and ready comparison -- compare
against a constant (empirical quantile) model (not unlike how basic
OLS models compare against the constant mean predictor)  but someone
else might know if there's any subtleties about quantile regression
that should be noted here.

MW



Re: [R] getting started in parallel computing on a windows OS

2013-04-24 Thread Martin Morgan

On 04/24/2013 02:50 PM, Benjamin Caldwell wrote:

Dear R help,

I've what I think is a fairly simple parallel problem, and am getting
bogged down in documentation and packages for much more complex situations.

I have a big matrix (30^5 rows by 5 columns). I have a function that will act on each row
of that matrix sequentially and output the 'best' result from the whole
matrix (it compares the result from each row to the last and keeps the
'better' result). I would like to divide that first large matrix into
chunks equal to the number of cores I have available to me, and work
through each chunk, then output the results from each chunk.

I'm really having trouble making head or tail of how to do this on a
windows machine - lots of different false starts on several different
packages now. Basically, I have the function, and I can of course easily
divide the matrix into chunks. I just need a way to process each chunk
in parallel (other than opening new R sessions for each core manually).

Any help much appreciated - after two days of trying to get this to work
I'm pretty burnt out.


Hi Ben -- in your code from this morning you had a function

fitting <- function(ndx.grd=two,dt.grd=one,ind.vr='ind',rsp.vr='res') {
## ... setup
for(i in 1:length(ndx.grd[,1])){
## ... do work
}
## ... collate results
}

that you're trying to run in parallel. Obviously the ## ... represent lines I've 
removed. When you say something like


y <- foreach(icount(length(two))) %dopar% fitting()

it's saying that you want to run fitting() length(two) times. So you're actually 
doing the same thing length(two) times, whereas you really want to divide the 
work that's inside fitting() into chunks, and do those on separate cores!


Conceptually what you'd like to do is

fit_one <- function(idx, ndx.grd, dt.grd, ind.vr, rsp.vr) {
## ... do work on row idx _ONLY_
}

and then evaluate with

## ... setup
y <-
  foreach(idx = icount(nrow(two))) %dopar% fit_one(idx, two, one, "ind", "res")
## ... collate

so that fit_one fits just one of your combinations. foreach will worry about 
distributing the work. Make sure that fit_one works first, before trying to run 
this in parallel; your use of try(), trying to fit different data types 
(character, integer, numeric) into a matrix rather than data.frame, and the type 
coercions all indicate that you're fighting with R rather than working with it.
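
One more practical detail: on Windows, %dopar% only runs in parallel after a
backend has been registered. A minimal sketch, assuming the doParallel package
is installed:

library(doParallel)
cl <- makeCluster(detectCores() - 1)   # socket cluster; this is what works on Windows
registerDoParallel(cl)
## ... foreach(idx = icount(nrow(two))) %dopar% fit_one(...) as above ...
stopCluster(cl)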


Hope that helps,

Martin



Thanks

*Ben Caldwell*





--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



Re: [R] Sum up column values according to row id

2013-04-24 Thread arun
HI
Not sure whether this helps:
ipso<- read.table(text="
   id dbh h  g
1  FPE0164  36 13.62 0.10178760
2  FPE0164  31 12.70 0.07547676
21 FPE1127  57 18.85 0.25517586
13 FPE1127  39 15.54 0.11945906
12 FPE1127  34 14.78 0.09079203
6  FPE1127  32 15.12 0.08042477
5  FPE1127  28 14.13 0.06157522
15 FPE1127  27 13.50 0.05725553
19 FPE1127  25 13.28 0.04908739
11 FPE1127  19 11.54 0.02835287 
",sep="",header=TRUE,stringsAsFactors=FALSE)

library(plyr)
ddply(ipso,.(id),summarize,g=mean(head(g,6)))
 #  id  g
#1 FPE0164 0.08863218
#2 FPE1127 0.11078041


aggregate(g~id,ipso,function(x) mean(head(x,6)))
#   id  g
#1 FPE0164 0.08863218
#2 FPE1127 0.11078041



>Dear All, 
>
>here is a problem I think many of you can solve in a few minutes. 
>
>I have a dataframe which contains values of plot id, diameter, height and 
>basal area of trees; the column names are: id | dbh | h | g 
>
>head(ipso, n=10)
>        id dbh     h          g 
>1  FPE0164  36 13.62 0.10178760 
>2  FPE0164  31 12.70 0.07547676 
>21 FPE1127  57 18.85 0.25517586 
>13 FPE1127  39 15.54 0.11945906 
>12 FPE1127  34 14.78 0.09079203 
>6  FPE1127  32 15.12 0.08042477 
>5  FPE1127  28 14.13 0.06157522 
>15 FPE1127  27 13.50 0.05725553 
>19 FPE1127  25 13.28 0.04908739 
>11 FPE1127  19 11.54 0.02835287 
>
>From here I need to calculate the mean of the six greatest g values for 
>each id. The rules are: 
>
>if length(id) >= 6 
>
>take the mean of the six greatest g values 
>
>
>else 
>take the mean of all the g values in that id (in the head() printout above, e.g. 
>for id == FPE0164, take the mean of just those two values of g). 
>
>The g are already ordered by id ascending and g descending using: 
>
>ipso <- ipso[with(ipso, order(ipso$id, -ipso$g)), ] # Order for id 
>ascending and g descending 
>
>I tried a lot of for loops and tapply() without results. 
>
>Can anyone help me to solve this? 
>
>Thanks for your attention 
>
>Best whishes 
>
>Matteo 




[R] Floating point precision causing undesireable behaviour when printing as.POSIXlt times with microseconds?

2013-04-24 Thread O'Hanlon, Simon J
Dear list,
When using as.POSIXlt with times measured down to microseconds the default 
format.POSIXlt seems to cause some possibly undesirable behaviour:

According to the code in format.POSIXlt the maximum accuracy of printing 
fractional seconds is 1 microsecond, but if I do;

options( digits.secs = 6 )
as.POSIXlt( 1.000002 , tz="", origin="1970-01-01")
as.POSIXlt( 1.999998 , tz="", origin="1970-01-01")
as.POSIXlt( 1.999999 , tz="", origin="1970-01-01")

I return respectively:
[1] "1970-01-01 01:00:01.02 BST"
[1] "1970-01-01 01:00:01.98 BST"
[1] "1970-01-01 01:00:01 BST"

With options( digits.secs = 6 ), should I not expect to be able to print 1.999999 
seconds? This seems to be caused by the following code fragment in 
format.POSIXlt:

np <- getOption("digits.secs")
if (is.null(np))
np <- 0L
else np <- min(6L, np)
if (np >= 1L)
for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs,
i)) < 1e-06)) {
np <- i
break
}

Specifically
for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs, i)) < 1e-06))
which in the case of 1.999999 seconds will give:

options( scipen = 10 )
np <- 6
sapply( seq_len(np) - 1L , function(x) abs(1.999999 - round(1.999999, x)) )

         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,] 0.000001 0.000001 0.000001 0.000001 0.000001 0.000001

The logical test all( ... < 1e-06)  should evaluate to FALSE but due to 
floating point precision it evaluates TRUE:

sprintf( "%.20f" , abs(1. 99  - round(1. 99,5)))
[1] "0.00991773"

If instead of:

for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs, i)) < 1e-06))

in format.POSIXlt we had a comparison value that was half the minimum increment:

for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs, i)) < 5e-07))

This behaviour disappears:

mod.format.POSIXlt( as.POSIXlt( 1.999999 , tz="", origin="1970-01-01") )
[1] "1970-01-01 01:00:01.999999"

But I am unsure whether the original behaviour is what I should expect given the 
documentation (I have read it and I can't see a reason to expect 1.999999 to 
round down to 1), and also whether changing the formatting function would have 
other undesirable consequences.

My sessionInfo():
R version 3.0.0 (2013-04-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base



Thank you,


Simon



Simon O'Hanlon
Postgraduate Researcher

Helminth Ecology Research Group
Department of Infectious Disease Epidemiology
Imperial College London
St. Mary's Hospital, Norfolk Place,
London, W2 1PG, UK

Office: +44 (0) 20 759 43229





Re: [R] Trouble Computing Type III SS in a Cox Regression

2013-04-24 Thread Terry Therneau
I should hope that there is trouble, since "type III" is an undefined concept for a Cox 
model.  Since SAS Inc fostered the cult of type III they have recently added it as an 
option for phreg, but I am not able to find any hints in the phreg documentation of what 
exactly they are doing when you invoke it.  If you can unearth this information, then I 
will be happy to tell you whether

   a. using the test (whatever it is) makes any sense at all for your data set
   b. if "a" is true, how to get it out of R

I use the word "cult" on purpose -- an entire generation of users who believe in the 
efficacy of this incantation without having any idea what it actually does.  In many 
particular instances the SAS type III corresponds to a survey sampling question, i.e., 
reweight the data so that it is balanced wrt factor A and then test factor B in the new 
sample.  The three biggest problems with type III are that
1: the particular test has been hyped as "better" when in fact it sometimes is sensible 
and sometimes not, 2: SAS implemented it as a computational algorithm which unfortunately 
often works even when the underlying rationale does not hold and
3: they explain it using a notation that completely obscures the actual question.  This 
last leads to the nonsense phrase "test for main effects in the presence of interactions".


There is a "survey reweighted" approach for Cox models, very closely related to the work 
on causal inference ("marginal structural models"), but I'd bet dollars to donuts that 
this is not what SAS is doing.


(Per 2 -- type III was a particular order of operations of the sweep algorithm for linear 
models, and for backwards compatibility that remains the core definition even as 
computational algorithms have left sweep behind.  But Cox models can't be computed using 
the sweep algorithm).


Terry Therneau


On 04/24/2013 12:41 PM, r-help-requ...@r-project.org wrote:

Hello All,

Am having some trouble computing Type III SS in a Cox Regression using either 
drop1 or Anova from the car package. Am hoping that people will take a look to 
see if they can tell what's going on.

Here is my R code:

cox3grp<- subset(survData,
Treatment %in% c("DC", "DA", "DO"),
c("PTNO", "Treatment", "PFS_CENSORED", "PFS_MONTHS", "AGE", "PS2"))
cox3grp<- droplevels(cox3grp)
str(cox3grp)

coxCV<- coxph(Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, data=cox3grp, method = 
"efron")
coxCV

drop1(coxCV, test="Chisq")

require(car)
Anova(coxCV, type="III")

And here are my results:

cox3grp<- subset(survData,
+   Treatment %in% c("DC", "DA", "DO"),
+   c("PTNO", "Treatment", "PFS_CENSORED", "PFS_MONTHS", "AGE", 
"PS2"))

>  cox3grp<- droplevels(cox3grp)
>  str(cox3grp)

'data.frame':   227 obs. of  6 variables:
  $ PTNO: int  1195997 104625 106646 1277507 220506 525343 789119 
817160 824224 82632 ...
  $ Treatment   : Factor w/ 3 levels "DC","DA","DO": 1 1 1 1 1 1 1 1 1 1 ...
  $ PFS_CENSORED: int  1 1 1 0 1 1 1 1 0 1 ...
  $ PFS_MONTHS  : num  1.12 8.16 6.08 1.35 9.54 ...
  $ AGE : num  72 71 80 65 72 60 63 61 71 70 ...
  $ PS2 : Ord.factor w/ 2 levels "Yes"<"No": 2 2 2 2 2 2 2 2 2 2 ...
>  
>  coxCV<- coxph(Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, data=cox3grp, method = "efron")

>  coxCV

Call:
coxph(formula = Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2,
 data = cox3grp, method = "efron")


          coef exp(coef) se(coef)      z     p
AGE    0.00492     1.005  0.00789  0.624 0.530
PS2.L -0.34523     0.708  0.14315 -2.412 0.016

Likelihood ratio test=5.66  on 2 df, p=0.0591  n= 227, number of events= 198
>  
>  drop1(coxCV, test="Chisq")

Single term deletions

Model:
Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2
       Df    AIC    LRT Pr(>Chi)  
<none>    1755.2                  
AGE     1 1753.6 0.3915  0.53151  
PS2     1 1758.4 5.2364  0.02212 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>  
>  require(car)

>  Anova(coxCV,  type="III")

Analysis of Deviance Table (Type III tests)
    LR Chisq Df Pr(>Chisq)  
AGE   0.3915  1    0.53151  
PS2   5.2364  1    0.02212 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>  

Both drop1 and Anova give me a different p-value than I get from coxph for both 
my two-level ps2 variable and for age. This is not what I would expect based on 
experience using SAS to conduct similar analyses. Indeed SAS consistently 
produces the same p-values. Namely the ones I get from coxph.

My sense is that I'm probably misusing R in some way but I'm not sure what I'm 
likely to be doing wrong. SAS prodcues Wald Chi-Square results for its type III 
tests. Maybe that has something to do with it. Ideally, I'd like to get type 
III values that match those from coxph. If anyone could help me understand 
better, that would be greatly appreciated.

Thanks,

Paul



Re: [R] Tables package - remove NAs and NaN

2013-04-24 Thread Santosh
yes, With Duncan's and Liviu's help,  I was able to remove those NAs and
NaNs from the tabular summary.

svn .. thing has not worked for me yet.. would try this later..

Thanks so much!
Regards,
Santosh



On Wed, Apr 24, 2013 at 1:46 PM, Santosh  wrote:

> Thanks so much.. I will try it out.
> Regards,
> Santosh
>
>
> On Wed, Apr 24, 2013 at 1:39 PM, Duncan Murdoch 
> wrote:
>
>> On 13-04-24 4:29 PM, Santosh wrote:
>>
>>> Dear Duncan,
>>>
>>> Thanks for your email. For some reason svn does not seem to work on my
>>> machine.. not sure if it is due to firewall or due to my company's IS
>>> environment settings. I could, however, install the package after
>>> pointing
>>> the repos to r-forge site as you had suggested. But, yes, the versions
>>> installed were 0.7.47 and 0.7.51 and therefore is not the latest one.
>>>
>>> Could you, please, be able to send it to me as a zipped attachment that I
>>> can use in both 32-bit and 64-bit Windows?
>>>
>>
>> You can download it from
>>
http://www.stats.uwo.ca/faculty/murdoch/temp/tables_0.7.57.tar.gz
>>
>> for the source, or
>>
http://www.stats.uwo.ca/faculty/murdoch/temp/tables_0.7.57.zip
>>
>> for the binary install.  I think the latter should work in both 32 and 64
>> bit Windows, but I haven't tested it.
>>
>> Duncan Murdoch
>>
>>
>>> Thanks so much,
>>> Santosh
>>>
>>>
>>> On Wed, Apr 24, 2013 at 1:02 PM, Duncan Murdoch <
>>> murdoch.dun...@gmail.com>**wrote:
>>>
>>>  On 13-04-24 3:23 PM, Santosh wrote:

  Dear Rxperts,
>
> Sorry if I am posting a really really dumb request.. I am new to
> subversion
> and am trying to use subversion to download the tables package as
> suggested
> by Duncan. I installed subversion client(from collabnet) and tried to
> access "tables" package using the command below.
>
> svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/
> >
>
>
> I get the following error message:
> C:\Users\santosh\temp>svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/
> 
> >
>
> svn: E730060: Unable to connect to a repository at URL
> 'svn://scm.r-forge.r-proj ect.org/svnroot/tables'
> svn: E730060: Can't connect to host 'scm.r-forge.r-project.org': A
> connection at tempt failed because the connected party did not properly
> respond after a period  of time, or established connection failed
> because
> connected host has failed to
> respond.
>
> Is there anything additional I need to do with Subversion or with the
> commands?
>
>
>
 The spacing looks funny there:  You should have no blanks in the path.

 Maybe you didn't, it's only the email.  In that case, R-forge is
 probably
 just responding very slowly.

 You could try this path instead:

 svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/pkg/tables


 This is the subdirectory that contains everything in the package.

 And you should be able to install the package directly from R-forge by
 setting your repository to http://R-forge.r-project.org, but it is very
 slow on updates, so it hasn't got to the current version yet.

 Duncan Murdoch



  Regards,
> Santosh
>
> On Tue, Apr 23, 2013 at 5:13 AM, Duncan Murdoch <
> murdoch.dun...@gmail.com
>
>> **wrote:
>>
>
>   On 13-04-23 6:31 AM, Duncan Murdoch wrote:
>
>>
>>   On 13-04-22 10:40 PM, David Winsemius wrote:
>>
>>>
>>>
>>>  On Apr 22, 2013, at 5:49 PM, Santosh wrote:

Dear Rxperts,

q <- data.frame(p=rep(c("A","B"),each=10,len=30),
> a=rep(c(1,2,3),each=10),id=seq(30),
>
>
> b=round(runif(30,10,20)),
> c=round(runif(30,40,70)))
> The operation below...
> tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b + c)*
>
>
> (mean+sd),data=q)
> yields some rows of NAs and NaN as shown below
>
>            b               c
> p a   N   mean   sd       mean     sd
> A 1   10  16.30  2.497    52.30  9.358
>   2    0    NaN     NA      NaN     NA
>   3   10  15.60  2.716    60.30  8.001
> B 1    0    NaN     NA      NaN     NA
>   2   10  15.40  2.366    57.70 10.414
>   3    0    NaN     NA   

[R] getting started in parallel computing on a windows OS

2013-04-24 Thread Benjamin Caldwell
Dear R help,

I've what I think is a fairly simple parallel problem, and am getting
bogged down in documentation and packages for much more complex situations.

I have a big matrix (30^5 rows by 5 columns). I have a function that will act on each row
of that matrix sequentially and output the 'best' result from the whole
matrix (it compares the result from each row to the last and keeps the
'better' result). I would like to divide that first large matrix into
chunks equal to the number of cores I have available to me, and work
through each chunk, then output the results from each chunk.

I'm really having trouble making head or tail of how to do this on a
windows machine - lots of different false starts on several different
packages now. Basically, I have the function, and I can of course easily
divide the matrix into chunks. I just need a way to process each chunk
in parallel (other than opening new R sessions for each core manually).

Any help much appreciated - after two days of trying to get this to work
I'm pretty burnt out.
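
One possible approach, sketched here only for illustration, is the base
"parallel" package with a socket cluster (the fork-based mclapply is not
available on Windows). The matrix m and the per-chunk scorer best_of_chunk()
below are made-up stand-ins for the real objects:

library(parallel)

m <- matrix(runif(3e5 * 5), ncol = 5)        # stand-in for the real matrix
best_of_chunk <- function(chunk) {
  scores <- rowSums(chunk)                   # stand-in rule: keep the row with
  chunk[which.max(scores), ]                 # the largest row sum
}

ncores <- detectCores()
cl <- makeCluster(ncores)                    # PSOCK workers, which run on Windows
chunks <- split.data.frame(m, cut(seq_len(nrow(m)), ncores, labels = FALSE))
results <- parLapply(cl, chunks, best_of_chunk)
stopCluster(cl)

best <- do.call(rbind, results)              # one candidate per chunk; compare these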

Thanks

*Ben Caldwell*

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pglm package: fitted values and residuals

2013-04-24 Thread Achim Zeileis

On Wed, 24 Apr 2013, Paul Johnson wrote:


On Wed, Apr 24, 2013 at 3:11 AM,   wrote:

I'm using the package pglm and I'have estimated a "random probit 
model". I need to save in a vector the fitted values and the residuals 
of the model but I can not do it.


I tried with the command fitted.values using the following procedure 
without results:


This is one of those "ask the pglm authors" questions. You should take 
it up with the authors of the package.  There is a specialized email 
list R-sig-mixed where you will find more people working on this exact 
same thing.


pglm looks like fun to me, but it is not quite done, so far as I can 
tell. Well, the authors have not gone the "extra step" to make their 
regression objects behave like other R regression objects.  In case you 
need alternative software, ask in R-sig-mixed. You'll learn that most of 
these can be estimated with other packages. But I really like the 
homogeneous user interface that is spelled out in pglm, and I expect my 
students will run into the same questions that you have..


I just downloaded their source code, you probably ought to do that so
you can understand what they are doing.   They provide the fitting
functions, but they do not do any of the other work necessary to make
these functions fit together with the R class framework.  There are no
methods for "predict", anova, and so forth.


This is only partially true. In fact, "pglm" employs the framework 
provided by the "maxLik" package (by Ott Toomet and Arne Henningsen) and hence 
it inherits some of the methods that "maxLik" provides for all of its fitted 
model objects. So summary(), coef(), vcov(), and AIC() work, and you can 
leverage tools like coeftest() from "lmtest", linearHypothesis() from 
"car", or sandwich() from "sandwich".


But it is certainly true that even more features would be desirable and I 
think that Yves always planned on enhancing "pglm" at some point. (After 
all it's still the initial version 0.1-0 on CRAN...)



I'm in their R folder looking for implementations:

pauljohn@pols110:/tmp/pglm/R$ grep summary *
pauljohn@pols110:/tmp/pglm/R$ grep predict *
pauljohn@pols110:/tmp/pglm/R$ grep class *
pauljohn@pols110:/tmp/pglm/R$ grep fitted *
pglm.R:  # glm models can be fitted

Run


example(pglm)


what can we do after that?


plot(anb)

Error in xy.coords(x, y, xlabel, ylabel, log) :
 'x' is a list, but does not have components 'x' and 'y'

## Nothing.
## We do get a regression summary object, that's better than some
packages provide:


anbsum <- summary(anb)


## And a coefficient table


coef(anbsum)

Estimate  Std. error   t value  Pr(> t)
(Intercept) -6.933764e-01 0.061391429 -11.294351205 1.399336e-29
wage 1.517009e-02 0.006375966   2.379261231 1.734738e-02
exper1.314229e-03 0.007400129   0.177595444 8.590407e-01
ruralyes-8.594328e-05 0.051334716  -0.001674175 9.986642e-01


model.matrix(anb)

Error in terms.default(object) : no terms component nor attribute

anova(anb)

Error in UseMethod("anova") :
 no applicable method for 'anova' applied to an object of class
"c('maxLik', 'maxim')"

predict(anb)

Error in UseMethod("predict") :
 no applicable method for 'predict' applied to an object of class
"c('maxLik', 'maxim')"

So, if you want those features with these models, you'll have to get
busy and do a lot of coding!

While working on regression support lately, I've reached the conclusion 
that if an R package claims to "do regression" but does not provide 
methods for summary, predict, anova, nobs, fitted, logLik, AIC, and so 
forth, then it is not done yet. Otherwise, users like you who expect to 
be able to run methods like fitted or such have a bad experience, as you 
are having now.


Maybe somebody reading this will remind us where the common list of R
regression methods is listed. I know for sure I've seen a document
about these things, but I'm baffled now trying to find it. But I'm
sure there is one.


I'm sure that there are many. One of my attempts to write up a list is in 
Table 1 of vignette("betareg", package = "betareg").


Personally, I don't write anova() methods for my model objects because I 
can leverage lrtest() and waldtest() from "lmtest" and linearHypothesis() 
and deltaMethod() from "car" as long as certain standard methods are 
available, including coef(), vcov(), logLik(), etc.


Similarly, an AIC() method is typically not needed as long as logLik() is 
available. And BIC() works if nobs() is available in addition.
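
As a purely illustrative sketch (the class "mymod" and every name below are
invented for this example; they are not part of pglm or maxLik), a handful of
extractor methods is all such generic tools need:

## toy fitted-model object: coefficients, covariance matrix, log-likelihood, n
fit <- structure(list(cf = c(x = 0.5),
                      vc = matrix(0.04, 1, 1, dimnames = list("x", "x")),
                      ll = -123.4, n = 100), class = "mymod")
coef.mymod   <- function(object, ...) object$cf
vcov.mymod   <- function(object, ...) object$vc
nobs.mymod   <- function(object, ...) object$n
logLik.mymod <- function(object, ...)
  structure(object$ll, df = length(object$cf), nobs = object$n, class = "logLik")

AIC(fit)               # works via logLik()
BIC(fit)               # works via logLik() plus the number of observations
lmtest::coeftest(fit)  # works via coef() and vcov(), if lmtest is installed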


Best,
Z



pj


library(pglm)

m1_S<-pglm(Feed ~ Cons_PC_1 + imp_gen_1 + LGDP_PC_1 + lnEI_1 +
SH_Ren_1,data,family=binomial(probit),model="random",method="bfgs",index=c("Year","IDCountry"))

m1_S$fitted.values
residuals(m1)


Can someone help me about it?

Thanks



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-gui

Re: [R] RDA permutest envfit

2013-04-24 Thread Gavin Simpson
On Wed, 2013-04-24 at 13:13 -0700, Rémi Lesmerises wrote:
> Thank you!
> 
> A last question: Is it still the same explanation if I remove the
> condition "first=TRUE" (and then testing for all axes) and permutest
> gives the same result?

No; again `permutest(..., first = FALSE)` and `envfit()` are testing
different models/fits, but how they are different is not the same as with
your previous question.

`permutest(..., first = FALSE)` is testing whether the variance
explained by the k environmental variables (k = 4 I believe in your case)
is sufficiently large as to be unusual when compared to the variance
explained under a null model (achieved by permuting residuals,
propagating those residuals plus fitted values to give new data and
refitting the model on those new data). `envfit()` is doing what it
always did; assessing whether the correlation between the vector and the
2-d configuration is unusually large.

In the `permutest(..., first = FALSE)` case, you have a model with 4 df.
In the `envfit()` case you have 4 models each with 1 degree of freedom.

The two methods also differ in terms of the role played by the
environmental data. In the `permutest` case, the environmental data are
the predictors. In the `envfit` case the env data are the responses and
the axis scores in the 2 dimensions chosen are the predictors.
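
To see the contrast in code (a sketch reusing the objects from the commands
quoted further down; rda.ind is the fitted RDA and z the environmental data):

permutest(rda.ind, perm = 999, first = FALSE)           # env data as predictors, all constrained axes at once
fit <- envfit(rda.ind, z, perm = 999, display = "lc")   # env data as responses, one test per variable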

HTH

G

> Rémi Lesmerises, biol. M.Sc.,
> Candidat Ph.D. en Biologie
> Université du Québec à Rimouski
> remilesmeri...@yahoo.ca
> 
> 
> 
> On Wed, 2013-04-24 at 10:38 -0700, Rémi Lesmerises wrote:
> > Dear all,
> > 
> > I did a RDA and when I looked to the signification of the test with
> > permutest, the output was non-significant. But when I used the envfit
> > function, some of the vectors are significant. All the test's
> > conditions are respected. What it means? Is it an error in the script?
> 
> The two functions are testing two fundamentally different things.
> `permutest` (as you invoked it --- `first = TRUE`) is testing the first
> axis of the RDA. That axis is a linear combination of the constraints.
> 
> The `envfit` procedure is testing the individual correlations between
> the 2-d configuration of samples in the RDA space and the direction of
> maximal variance of the environmental data. Each variable is considered
> separately.
> 
> I can easily imagine a case, where the variance explained on the first
> axis is not significant but variance over 2 axes is significant, as one
> where the vectors do not point solely along the first RDA axis but also
> point along the 2nd axis. By looking only at their contributions to the
> first axis and summing them you don't explain a whole lot, but when you
> look at the directions in 2D space each variable explains a significant
> amount of variance.
> 
> HTH
> 
> G
> 
> > Commands and output:
> > 
> > > permutest(rda.ind, perm=999, first=TRUE)
> > 
> > Permutation test for rda 
> > 
> > Call: rda(formula = x ~ ARH_frs + CON_frs + PRP_frs + RUI_frs +
> > VAM_frs, data = z)
> > Permutation test for first constrained eigenvalue
> > Pseudo-F:4.093568 (with 1, 10 Degrees of Freedom)
> > Significance:0.413 
> > Based on 999 permutations under reduced model.
> > 
> > > fit <- envfit(rda.ind, z, perm = 999, display = "lc")
> > > fit
> > 
> > ***VECTORS
> > 
> >  RDA1  RDA2 r2 Pr(>r)
> > VAM_frs  0.145281 -0.989390 0.2388  0.147
> > 
> > ARH_frs -0.876494 -0.481413 0.6127  0.002 ** 
> > CON_frs  0.904278  0.426944 0.4846  0.013 *  
> > PRP_frs -0.997634  0.068755 0.9433  0.001 ***
> > RUI_frs -0.648512 -0.761204 0.6243  0.004 ** 
> > ---
> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
> > P values based on 999 permutations.
> > 
> >  
> > Rémi Lesmerises, biol. M.Sc.,
> > Candidat Ph.D. en Biologie
> > Université du Québec à Rimouski
> > remilesmeri...@yahoo.ca
> > 
> > [[alternative HTML version deleted]]
> > 
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify object that causes "Error in loadNamespace(name) : there is no package called ‘R.utils’"

2013-04-24 Thread Duncan Murdoch

On 13-04-24 10:57 AM, Duncan Murdoch wrote:

On 13-04-24 10:12 AM, Martin Morgan wrote:

On 04/24/2013 06:03 AM, Duncan Murdoch wrote:

On 13-04-24 5:46 AM, Liviu Andronic wrote:

Dear all,
I've bumped into the: "Error in loadNamespace(name) : there is no
package called ‘R.utils’" error. I've already read a bit on this (
http://www.cybaea.net/Blogs/Data/A-warning-on-the-R-save-format.html )
but I have a follow-up question.

Given a workspace that automatically loads a package that I don't
really need/want (e.g. ‘R.utils’), how do I identify which object
requires this package to load? I would like to avoid loading ‘R.utils’
every time I open an R session.


That's not easy, because the code in R that triggers that error has no idea of
the name of the object it is loading.


Maybe traceback() can provide some hints? I did, more or less arbitrarily

library(rms)
a = list(fun=ie.setup)
save(a, file="/tmp/a.rda")
remove.packages("rms")

and then in a new session

   > load("/tmp/a.rda")
Error in loadNamespace(name) : there is no package called 'rms'
   > traceback()
7: stop(e)
6: value[[3L]](cond)
5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
4: tryCatchList(expr, classes, parentenv, handlers)
3: tryCatch(loadNamespace(name), error = function(e) stop(e))
2: getNamespace(c("rms", "3.6-3"))
1: load("/tmp/a.rda")

with the line numbered 2 giving me the necessary hint.


That tells you that some object needs the rms package, but I don't see
how you would know it is "a" that is the problem.  We already knew that
rms was needed from the error message.

What I've done sometimes in debugging is to change that error to a
warning in the getNamespace() function, and add some tracing code to the
serialization code to print the names of objects as they are loaded.
(This goes in ReadItem in src/main/serialize.c.)

I wouldn't expect Liviu to make those changes, but perhaps a "verbose"
option could be added to load(), so that it could be available to users.



I have added this in R-devel.  The format of the printed output may well 
change before this is ever released, but it should be enough to identify 
the bad item already.


You'll need a build of R-devel from r62658 or newer to see this.  Then

load("/tmp/a.rda", verbose=TRUE)

will print the names of objects as they are read (the names are read 
after the attributes and before the value).  If you want to see reams of 
mostly useless information, you can try verbose=n (for some number n=2 
or more); this prints names and component numbers to a greater depth.


Duncan Murdoch



Duncan Murdoch



Martin




You could try a binary search to find out, but it will be tedious:
1. Install R.utils.
2. Load the workspace successfully.
3. Delete half the objects, and save it.
4. Uninstall R.utils, and see if you can load the workspace.

At this point you'll know if there's an object needing R.utils still left or
not, and you can repeat the steps until you find a single object that causes the
problem.  (But it might not be the only one, so deleting it from the original
workspace might not solve your problem.)
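
A rough sketch of one halving step (file and object names here are invented;
repeat, halving again, until a single object is isolated):

## in a session where R.utils is still installed, so loading succeeds
load("myworkspace.RData")
objs <- ls()
keep <- objs[seq_len(length(objs) %/% 2)]   # keep the first half of the objects
save(list = keep, file = "half.RData")
## then uninstall R.utils, start a fresh session and try load("half.RData");
## whether or not it fails tells you which half holds the offending object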

A better approach is to *never* save and load workspaces unless you know exactly
what is in them.  Always reply "no" to the question about saving your workspace
(or set that as the default).  If you accidentally end up with a workspace being
loaded, delete it.

Duncan Murdoch



Regards,
Liviu



sessionInfo()

R version 2.15.3 (2013-03-01)
Platform: i686-pc-linux-gnu (32-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  utils stats
methods   base

other attached packages:
[1] R.utils_1.23.2R.oo_1.13.0   R.methodsS3_1.4.2
tables_0.7.57 reshape2_1.2.2
[6] car_2.0-15nnet_7.3-6MASS_7.3-23
Hmisc_3.10-1  survival_2.37-2
[11] foreign_0.8-53

loaded via a namespace (and not attached):
[1] cluster_1.14.3   grid_2.15.3  lattice_0.20-13  plyr_1.8
rstudio_0.97.312
[6] stringr_0.6.2tools_2.15.3




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.







__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RDA permutest envfit

2013-04-24 Thread Rémi Lesmerises
Thank you!

A last question: Is it still the same explanation if I remove the condition 
"first=TRUE" (and then testing for all axes) and permutest gives the same 
result?
 
Rémi Lesmerises, biol. M.Sc.,
Candidat Ph.D. en Biologie
Université du Québec à Rimouski
remilesmeri...@yahoo.ca



On Wed, 2013-04-24 at 10:38 -0700, Rémi Lesmerises wrote:
> Dear all,
> 
> I did a RDA and when I looked to the signification of the test with
> permutest, the output was non-significant. But when I used the envfit
> function, some of the vectors are significant. All the test's
> conditions are respected. What it means? Is it an error in the script?

The two functions are testing two fundamentally different things.
`permutest` (as you invoked it --- `first = TRUE`) is testing the first
axis of the RDA. That axis is a linear combination of the constraints.

The `envfit` procedure is testing the individual correlations between
the 2-d configuration of samples in the RDA space and the direction of
maximal variance of the environmental data. Each variable is considered
separately.

I can easily imagine a case, where the variance explained on the first
axis is not significant but variance over 2 axes is significant, as one
where the vectors do not point solely along the first RDA axis but also
point along the 2nd axis. By looking only at their contributions to the
first axis and summing them you don't explain a whole lot, but when you
look at the directions in 2D space each variable explains a significant
amount of variance.

HTH

G

> Commands and output:
> 
> > permutest(rda.ind, perm=999, first=TRUE)
> 
> Permutation test for rda 
> 
> Call: rda(formula = x ~ ARH_frs + CON_frs + PRP_frs + RUI_frs +
> VAM_frs, data = z)
> Permutation test for first constrained eigenvalue
> Pseudo-F:        4.093568 (with 1, 10 Degrees of Freedom)
> Significance:    0.413 
> Based on 999 permutations under reduced model.
> 
> > fit <- envfit(rda.ind, z, perm = 999, display = "lc")
> > fit
> 
> ***VECTORS
> 
>              RDA1      RDA2     r2 Pr(>r)    
> VAM_frs  0.145281 -0.989390 0.2388  0.147    
> 
> ARH_frs -0.876494 -0.481413 0.6127  0.002 ** 
> CON_frs  0.904278  0.426944 0.4846  0.013 *  
> PRP_frs -0.997634  0.068755 0.9433  0.001 ***
> RUI_frs -0.648512 -0.761204 0.6243  0.004 ** 
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ 
> ’ 1 
> P values based on 999 permutations.
> 
>  
> Rémi Lesmerises, biol. M.Sc.,
> Candidat Ph.D. en Biologie
> Université du Québec à Rimouski
> remilesmeri...@yahoo.ca
> 
>     [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Tables package - remove NAs and NaN

2013-04-24 Thread Duncan Murdoch

On 13-04-24 3:23 PM, Santosh wrote:

Dear Rxperts,
Sorry if I am posting a really really dumb request.. I am new to subversion
and am trying to use subversion to download the tables package as suggested
by Duncan. I installed subversion client(from collabnet) and tried to
access "tables" package using the command below.

svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/

I get the following error message:
C:\Users\santosh\temp>svn checkout svn://
scm.r-forge.r-project.org/svnroot/tables/
svn: E730060: Unable to connect to a repository at URL
'svn://scm.r-forge.r-proj ect.org/svnroot/tables'
svn: E730060: Can't connect to host 'scm.r-forge.r-project.org': A
connection at tempt failed because the connected party did not properly
respond after a period  of time, or established connection failed because
connected host has failed to
respond.

Is there anything additional I need to do with Subversion or with the
commands?




The spacing looks funny there:  You should have no blanks in the path.

Maybe you didn't, it's only the email.  In that case, R-forge is 
probably just responding very slowly.


You could try this path instead:

svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/pkg/tables

This is the subdirectory that contains everything in the package.

And you should be able to install the package directly from R-forge by 
setting your repository to http://R-forge.r-project.org, but it is very 
slow on updates, so it hasn't got to the current version yet.


Duncan Murdoch




Regards,
Santosh

On Tue, Apr 23, 2013 at 5:13 AM, Duncan Murdoch wrote:


On 13-04-23 6:31 AM, Duncan Murdoch wrote:


On 13-04-22 10:40 PM, David Winsemius wrote:



On Apr 22, 2013, at 5:49 PM, Santosh wrote:

  Dear Rxperts,

q <- data.frame(p=rep(c("A","B"),each=10,len=30),
a=rep(c(1,2,3),each=10),id=seq(30),
b=round(runif(30,10,20)),
c=round(runif(30,40,70)))
The operation below...
tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b + c)*
(mean+sd),data=q)
yields some rows of NAs and NaN as shown below

            b            c
p a   N  mean  sd    mean   sd
A 1   10 16.30 2.497 52.30  9.358
  2    0   NaN    NA   NaN     NA
  3   10 15.60 2.716 60.30  8.001
B 1    0   NaN    NA   NaN     NA
  2   10 15.40 2.366 57.70 10.414
  3    0   NaN    NA   NaN     NA
  All 30 15.77 2.473 56.77  9.601

How do I remove the rows having N=0 ?
I would like the resulting table look like..
            b            c
p a   N  mean  sd    mean   sd
A 1   10 16.30 2.497 52.30  9.358
   3   10 15.60 2.716 60.30  8.001
B  2   10 15.40 2.366 57.70 10.414
 All 30 15.77 2.473 56.77  9.601



Here's a bit of a hack:

tabular( (`p a`=interaction(p,a, drop=TRUE, sep=" ")) ~ (N = 1) + (b +
c)*
   (mean+sd),data=q)

   b   c
p a N  mean sd mean sd
A 1 10 12.8 0.7888 52.1 8.020
B 2 10 16.3 3.0569 54.9 8.711
A 3 10 14.6 3.7771 56.5 6.980

I have been rather hoping that Duncan Murdoch would have noticed the
earlier thread, but maybe he can comment on whether there is a more direct
route/



This isn't something that the package is designed to handle:  if you say
p*a, it wants all combinations of p and a.

If I wanted a table like that, I'd use a different hack.  One
possibility is to create that interaction column, but display it as just
the initial letter, labelled p, and then add another column to contain
the a values as data.  It would be tricky to get the formatting right.

Another possibility is to generate the whole table with the N=0 rows,
and then post-process it to remove those rows, and adjust the row labels
appropriately.  This approach probably gives the nicer result, but the
post-processing is quite messy:  you need to delete some rows from the
table, from its rowLabels attribute, and from the justification
attributes of both the table and its rowLabels.  (I should add a [
method to the package to hide this messiness.)



I've done this now, in version 0.7.54 on R-forge.  To leave out the rows
with N=0, you can select a subset of the table where N (the first column)
is non-zero:

tab <- tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b +
c)*(mean+sd),data=q)

tab[ tab[,1] > 0, ]

and it produces this:


              b           c
  p a   N  mean  sd    mean  sd
  A 1   10 16.20 3.458 56.3 10.155
    3   10 13.60 2.119 58.1  8.075
  B 2   10 14.40 2.547 51.2  9.438
    All 30 14.73 2.888 55.2  9.419

Indexing of tables isn't as general as indexing of matrices, but most of
the simple forms should work.  I haven't tested yet, but I expect this will
be fine in LaTeX or HTML (also new, not on CRAN yet) output as well.

Duncan Murdoch





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pglm package: fitted values and residuals

2013-04-24 Thread Paul Johnson
On Wed, Apr 24, 2013 at 3:11 AM,   wrote:
> I'm using the package pglm and I'have estimated a "random probit model".
> I need to save in a vector the fitted values and the residuals of the model
> but I can not do it.
>
> I tried with the command fitted.values using the following procedure without
> results:
>
This is one of those "ask the pglm authors" questions. You should take
it up with the authors of the package.  There is a specialized email
list R-sig-mixed where you will find more people working on this exact
same thing.

pglm looks like fun to me, but it is not quite done, so far as I can
tell. Well, the authors have not gone the "extra step" to make their
regression objects behave like other R regression objects.   In case
you need alternative software, ask in R-sig-mixed. You'll learn that
most of these can be estimated with other packages. But I really like
the homogeneous user interface that is spelled out in pglm, and I
expect my students will run into the same questions that you have..

I just downloaded their source code, you probably ought to do that so
you can understand what they are doing.   They provide the fitting
functions, but they do not do any of the other work necessary to make
these functions fit together with the R class framework.  There are no
methods for "predict", anova, and so forth.

I'm in their R folder looking for implementations:

pauljohn@pols110:/tmp/pglm/R$ grep summary *
pauljohn@pols110:/tmp/pglm/R$ grep predict *
pauljohn@pols110:/tmp/pglm/R$ grep class *
pauljohn@pols110:/tmp/pglm/R$ grep fitted *
pglm.R:  # glm models can be fitted

Run

> example(pglm)

what can we do after that?

> plot(anb)
Error in xy.coords(x, y, xlabel, ylabel, log) :
  'x' is a list, but does not have components 'x' and 'y'

## Nothing.
## We do get a regression summary object, that's better than some
packages provide:

> anbsum <- summary(anb)

## And a coefficient table

> coef(anbsum)
 Estimate  Std. error   t value  Pr(> t)
(Intercept) -6.933764e-01 0.061391429 -11.294351205 1.399336e-29
wage 1.517009e-02 0.006375966   2.379261231 1.734738e-02
exper1.314229e-03 0.007400129   0.177595444 8.590407e-01
ruralyes-8.594328e-05 0.051334716  -0.001674175 9.986642e-01

> model.matrix(anb)
Error in terms.default(object) : no terms component nor attribute
> anova(anb)
Error in UseMethod("anova") :
  no applicable method for 'anova' applied to an object of class
"c('maxLik', 'maxim')"
> predict(anb)
Error in UseMethod("predict") :
  no applicable method for 'predict' applied to an object of class
"c('maxLik', 'maxim')"

So, if you want those features with these models, you'll have to get
busy and do a lot of coding!

While working on regression support lately, I've reached the
conclusion that if an R package claims to "do regression" but
does not provide methods for summary, predict, anova, nobs, fitted,
logLik, AIC, and so forth, then it is not done yet. Otherwise, users
like you who expect to be able to run methods like fitted or such have
a bad experience, as you are having now.

Maybe somebody reading this will remind us where the common list of R
regression methods is listed. I know for sure I've seen a document
about these things, but I'm baffled now trying to find it. But I'm
sure there is one.


pj

> library(pglm)
>
> m1_S<-pglm(Feed ~ Cons_PC_1 + imp_gen_1 + LGDP_PC_1 + lnEI_1 +
> SH_Ren_1,data,family=binomial(probit),model="random",method="bfgs",index=c("Year","IDCountry"))
>
> m1_S$fitted.values
> residuals(m1)
>
>
> Can someone help me about it?
>
> Thanks
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Tables package - remove NAs and NaN

2013-04-24 Thread Santosh
Dear Rxperts,
Sorry if I am posting a really really dumb request.. I am new to subversion
and am trying to use subversion to download the tables package as suggested
by Duncan. I installed subversion client(from collabnet) and tried to
access "tables" package using the command below.

svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/

I get the following error message:
C:\Users\santosh\temp>svn checkout svn://
scm.r-forge.r-project.org/svnroot/tables/
svn: E730060: Unable to connect to a repository at URL
'svn://scm.r-forge.r-proj ect.org/svnroot/tables'
svn: E730060: Can't connect to host 'scm.r-forge.r-project.org': A
connection at tempt failed because the connected party did not properly
respond after a period  of time, or established connection failed because
connected host has failed to
respond.

Is there anything additional I need to do with Subversion or with the
commands?


Regards,
Santosh

On Tue, Apr 23, 2013 at 5:13 AM, Duncan Murdoch wrote:

> On 13-04-23 6:31 AM, Duncan Murdoch wrote:
>
>> On 13-04-22 10:40 PM, David Winsemius wrote:
>>
>>>
>>> On Apr 22, 2013, at 5:49 PM, Santosh wrote:
>>>
>>>  Dear Rxperts,
 q <- data.frame(p=rep(c("A","B"),each=10,len=30),
 a=rep(c(1,2,3),each=10),id=seq(30),
 b=round(runif(30,10,20)),
 c=round(runif(30,40,70)))
 The operation below...
 tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b + c)*
 (mean+sd),data=q)
 yields some rows of NAs and NaN as shown below

             b            c
 p a   N  mean  sd    mean   sd
 A 1   10 16.30 2.497 52.30  9.358
   2    0   NaN    NA   NaN     NA
   3   10 15.60 2.716 60.30  8.001
 B 1    0   NaN    NA   NaN     NA
   2   10 15.40 2.366 57.70 10.414
   3    0   NaN    NA   NaN     NA
   All 30 15.77 2.473 56.77  9.601

 How do I remove the rows having N=0 ?
 I would like the resulting table look like..
             b            c
 p a   N  mean  sd    mean   sd
 A 1   10 16.30 2.497 52.30  9.358
   3   10 15.60 2.716 60.30  8.001
 B 2   10 15.40 2.366 57.70 10.414
   All 30 15.77 2.473 56.77  9.601

>>>
>>> Here's a bit of a hack:
>>>
>>> tabular( (`p a`=interaction(p,a, drop=TRUE, sep=" ")) ~ (N = 1) + (b +
>>> c)*
>>>   (mean+sd),data=q)
>>>
>>>   b   c
>>>p a N  mean sd mean sd
>>>A 1 10 12.8 0.7888 52.1 8.020
>>>B 2 10 16.3 3.0569 54.9 8.711
>>>A 3 10 14.6 3.7771 56.5 6.980
>>>
>>> I have been rather hoping that Duncan Murdoch would have noticed the
>>> earlier thread, but maybe he can comment on whether there is a more direct
>>> route/
>>>
>>>
>> This isn't something that the package is designed to handle:  if you say
>> p*a, it wants all combinations of p and a.
>>
>> If I wanted a table like that, I'd use a different hack.  One
>> possibility is to create that interaction column, but display it as just
>> the initial letter, labelled p, and then add another column to contain
>> the a values as data.  It would be tricky to get the formatting right.
>>
>> Another possibility is to generate the whole table with the N=0 rows,
>> and then post-process it to remove those rows, and adjust the row labels
>> appropriately.  This approach probably gives the nicer result, but the
>> post-processing is quite messy:  you need to delete some rows from the
>> table, from its rowLabels attribute, and from the justification
>> attributes of both the table and its rowLabels.  (I should add a [
>> method to the package to hide this messiness.)
>>
>
> I've done this now, in version 0.7.54 on R-forge.  To leave out the rows
> with N=0, you can select a subset of the table where N (the first column)
> is non-zero:
>
> tab <- tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b +
> c)*(mean+sd),data=q)
>
> tab[ tab[,1] > 0, ]
>
> and it produces this:
>
>
>                b           c
>  p a   N  mean  sd    mean  sd
>  A 1   10 16.20 3.458 56.3 10.155
>    3   10 13.60 2.119 58.1  8.075
>  B 2   10 14.40 2.547 51.2  9.438
>    All 30 14.73 2.888 55.2  9.419
>
> Indexing of tables isn't as general as indexing of matrices, but most of
> the simple forms should work.  I haven't tested yet, but I expect this will
> be fine in LaTeX or HTML (also new, not on CRAN yet) output as well.
>
> Duncan Murdoch
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3

2013-04-24 Thread Duncan Murdoch

On 13-04-24 2:46 PM, Jens Olofsson wrote:

Ok. I apologise for not understanding. So, I have installed R-tools. It
changed my PATH-variable. I didn't installed Cygwin dlls as stated by
http://cran.r-project.org/doc/manuals/R-admin.html#The-Windows-toolset.
Instead my PATH-variable contains the path to the Cygwin dlls AFTER the
path to R... So, I started RTerm (32-bit) and tried > R CMD SHLIB mango.f95
and got the same error as earlier "Error: unexpected symbol in "R CMD"".
The same goes for RTerm (64-bit). Can you pls advice me on how to proceed?


See my first paragraph below.

Duncan Murdoch


Sincerely Jens


On 24 April 2013 20:08, Duncan Murdoch  wrote:


On 13-04-24 1:51 PM, Jens Olofsson wrote:


Dear Duncan,
I know this isn't a forum for Cygwin, but for R. Pls treat me as a noob
and
also remember I am on Windows. How should I use R CMD SHLIB as if I write
that at the prompt I get the error: unexpected symbol in "R CMD". I have
mango.f95 in the working directory.




That's a command-line command, not something done with R.  You can use it
from your bash shell if you have R and the Rtools directories on your path,
or from the Windows CMD shell.

BTW, my comment wasn't trying to tell you to go to a Cygwin forum, it was
telling you that Cygwin's gfortran is unsupported.  You need to use the
MinGW-64 one that we distribute if you want us to be able to help.

Duncan Murdoch

  //Jens




On 24 April 2013 19:46, Duncan Murdoch  wrote:

  On 13-04-24 1:36 PM, Jens Olofsson wrote:


  Dear users of R

I have a subroutine in Fortran95, compiled to a DLL with gfortran in
Cygwin
4.5.3.



We don't support Cygwin.  You should use the gfortran in Rtools, and get
R
to set the command line options for you, either by putting the code in a
package, or by using R CMD SHLIB Mango.f95.

Duncan Murdoch

   The subroutine is:


subroutine MyPBP( S, p, N )
   ! Expose subroutine rtest to users of this DLL
   !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: "mypbp_" ::mypbp
   ! This function computes the Poisson-Binomial distribution
   ! of size N using p
   double precision, intent(inout) :: S(N+1)
   double precision, intent(in) :: p(N)
   integer, intent(in) :: N
   double precision :: X(N+1)
   integer i, j
   !X=0
   !S=0
   X(1) = 1 - p(1)
   X(2) = p(1)
   do i = 2, N
   S(1) = X(1)*(1-p(i))
   do j = 2,i
   S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
   end do
   S(i+1) = X(i)*p(i)
   X = S
   if (i == N) then
   S = X
   end if
   end do
end subroutine MyPBP
and it is saved into Mango.f95
>> I compile it from the bash shell using: gfortran-4 -c Mango.f95 and
gfortran-4 -shared -o Mango.dll Mango.o
I am on a Windows machine running Windows 7 with Intel i7.
I load the dll in a 32-bit R by dyn.load("Mango.dll"). Using
getLoadedDLLs
I can see the DLL. However, is.loaded("Mango.dll") = FALSE. In
addition, R
stop responding when I try .Fortran("MyPBP", as.numeric(S),
as.numeric(p),
as.integer(N)),
where N<-5, S<-array(0,N+1) and p<- c(0.1, 0.2, 0.5, 0.8, 0.9).

What am I doing wrong?
Any ideas, thoughts and/or comments are highly appreciated.

Jens

  [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help




PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RDA permutest envfit

2013-04-24 Thread Gavin Simpson
On Wed, 2013-04-24 at 10:38 -0700, Rémi Lesmerises wrote:
> Dear all,
> 
> I did a RDA and when I looked to the signification of the test with
> permutest, the output was non-significant. But when I used the envfit
> function, some of the vectors are significant. All the test's
> conditions are respected. What it means? Is it an error in the script?

The two functions are testing two fundamentally different things.
`permutest` (as you invoked it --- `first = TRUE`) is testing the first
axis of the RDA. That axis is a linear combination of the constraints.

The `envfit` procedure is testing the individual correlations between
the 2-d configuration of samples in the RDA space and the direction of
maximal variance of the environmental data. Each variable is considered
separately.

I can easily imagine a case, where the variance explained on the first
axis is not significant but variance over 2 axes is significant, as one
where the vectors do not point solely along the first RDA axis but also
point along the 2nd axis. By looking only at their contributions to the
first axis and summing them you don't explain a whole lot, but when you
look at the directions in 2D space each variable explains a significant
amount of variance.

HTH

G

> Commands and output:
> 
> > permutest(rda.ind, perm=999, first=TRUE)
> 
> Permutation test for rda 
> 
> Call: rda(formula = x ~ ARH_frs + CON_frs + PRP_frs + RUI_frs +
> VAM_frs, data = z)
> Permutation test for first constrained eigenvalue
> Pseudo-F:4.093568 (with 1, 10 Degrees of Freedom)
> Significance:0.413 
> Based on 999 permutations under reduced model.
> 
> > fit <- envfit(rda.ind, z, perm = 999, display = "lc")
> > fit
> 
> ***VECTORS
> 
>  RDA1  RDA2 r2 Pr(>r)
> VAM_frs  0.145281 -0.989390 0.2388  0.147
> 
> ARH_frs -0.876494 -0.481413 0.6127  0.002 ** 
> CON_frs  0.904278  0.426944 0.4846  0.013 *  
> PRP_frs -0.997634  0.068755 0.9433  0.001 ***
> RUI_frs -0.648512 -0.761204 0.6243  0.004 ** 
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
> P values based on 999 permutations.
> 
>  
> Rémi Lesmerises, biol. M.Sc.,
> Candidat Ph.D. en Biologie
> Université du Québec à Rimouski
> remilesmeri...@yahoo.ca
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3

2013-04-24 Thread Jens Olofsson
Hello again. 

I may be a bit slow, but I'm learning. :-)
Installed Rtools, added R to the path. Used CMD, moved to the folder of the 
source and wrote R CMD SHLIB Mango.f95 and voila I got a dll I loaded in R and 
I could call my current subroutines as intended. 
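
For anyone following along, a minimal sketch of that workflow (untested here;
the lower-case symbol name reflects how R maps Fortran names, and the argument
modes must match the subroutine's declarations). At the Windows command prompt,
with R and Rtools on the PATH:

  R CMD SHLIB Mango.f95

Then in R, from the same directory:

  dyn.load("Mango.dll")
  is.loaded("mypbp")    # check the symbol name, not the DLL file name
  N <- 5
  p <- c(0.1, 0.2, 0.5, 0.8, 0.9)
  out <- .Fortran("mypbp", S = double(N + 1), p = as.double(p), N = as.integer(N))
  out$S                 # the Poisson-binomial probabilities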

I am very grateful for ur help and ur patience. 

Sincerely, Jens

On 24 Apr 2013, at 21:33, "R. Michael Weylandt" wrote:

> 
> 
> On Apr 24, 2013, at 7:46 PM, Jens Olofsson  wrote:
> 
>> Ok. I apologise for not understanding. So, I have installed R-tools. It
>> changed my PATH-variable. I didn't installed Cygwin dlls as stated by
>> http://cran.r-project.org/doc/manuals/R-admin.html#The-Windows-toolset.
>> Instead my PATH-variable contains the path to the Cygwin dlls AFTER the
>> path to R... So, I started RTerm (32-bit) and tried > R CMD SHLIB mango.f95
> 
> Repeating what Duncan said -- R CMD SHLIB is to be done outside of R, not 
> within R. 
> 
> Some recent discussion also suggests its perhaps easier to do this with 
> RStudio + devtools as part of a package than as a standalone shared object. 
> 
> MW
> 
> 
>> and got the same error as earlier "Error: unexpected symbol in "R CMD"".
>> The same goes for RTerm (64-bit). Can you pls advice me on how to proceed?
>> Sincerely Jens
>> 
>> 
>> On 24 April 2013 20:08, Duncan Murdoch  wrote:
>> 
>>> On 13-04-24 1:51 PM, Jens Olofsson wrote:
>>> 
 Dear Duncan,
 I know this isn't a forum for Cygwin, but for R. Pls treat me as a noob
 and
 also remember I am on Windows. How should I use R CMD SHLIB as if I write
 that at the prompt I get the error: unexpected symbol in "R CMD". I have
 mango.f95 in the working directory.
>>> 
>>> 
>>> That's a command-line command, not something done with R.  You can use it
>>> from your bash shell if you have R and the Rtools directories on your path,
>>> or from the Windows CMD shell.
>>> 
>>> BTW, my comment wasn't trying to tell you to go to a Cygwin forum, it was
>>> telling you that Cygwin's gfortran is unsupported.  You need to use the
>>> MinGW-64 one that we distribute if you want us to be able to help.
>>> 
>>> Duncan Murdoch
>>> 
>>> //Jens
 
 
 
 On 24 April 2013 19:46, Duncan Murdoch  wrote:
 
 On 13-04-24 1:36 PM, Jens Olofsson wrote:
> 
> Dear users of R
>> I have a subroutine in Fortran95, compiled to a DLL with gfortran in
>> Cygwin
>> 4.5.3.
> We don't support Cygwin.  You should use the gfortran in Rtools, and get
> R
> to set the command line options for you, either by putting the code in a
> package, or by using R CMD SHLIB Mango.f95.
> 
> Duncan Murdoch
> 
> The subroutine is:
> 
>> subroutine MyPBP( S, p, N )
>> ! Expose subroutine rtest to users of this DLL
>> !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: "mypbp_" ::mypbp
>> ! This function computes the Poisson-Binomial distribution
>> ! of size N using p
>> double precision, intent(inout) :: S(N+1)
>> double precision, intent(in) :: p(N)
>> integer, intent(in) :: N
>> double precision :: X(N+1)
>> integer i, j
>> !X=0
>> !S=0
>> X(1) = 1 - p(1)
>> X(2) = p(1)
>> do i = 2, N
>> S(1) = X(1)*(1-p(i))
>> do j = 2,i
>> S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
>> end do
>> S(i+1) = X(i)*p(i)
>> X = S
>> if (i == N) then
>> S = X
>> end if
>> end do
>> end subroutine MyPBP
>> and it is saved into Mango.f95
>> I compile it from the bash shell using: gfortran-4 -c Mango.f95 and
>> gfortran-4 -shared -o Mango.dll Mango.o
>> I am on a Windows machine running Windows 7 with Intel i7.
>> I load the dll in a 32-bit R by dyn.load("Mango.dll"). Using
>> getLoadedDLLs
>> I can see the DLL. However, is.loaded("Mango.dll") = FALSE. In
>> addition, R
>> stop responding when I try .Fortran("MyPBP", as.numeric(S),
>> as.numeric(p),
>> as.integer(N)),
>> where N<-5, S<-array(0,N+1) and p<- c(0.1, 0.2, 0.5, 0.8, 0.9).
>> 
>> What am I doing wrong?
>> Any ideas, thoughts and/or comments are highly appreciated.
>> 
>> Jens
>> 
>>[[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> 
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
>>   [[alternative HTML version deleted]]
>> 
>> __

Re: [R] mgcv: how select significant predictor vars when using gam(...select=TRUE) using automatic optimization

2013-04-24 Thread Juliet Hannah
Hi Jan and Simon,

If possible, could you attach the diagnostic plots. I would be curious to
see them.

Thanks,

Juliet


On Fri, Apr 19, 2013 at 4:39 AM, jholstei  wrote:

> Simon,
>
> that was very instructive—very special thanks to you.
> I already noticed that the model was bad, but it was not clear to me that
> transformation of predictors to, say a more centered distribution is
> helpful here.
> And thanks for pointing out Tweedie, I noticed that the error structure is
> far from normal and more like gamma or poisson, but Gamma made things worse.
>
> Best regards,
> Jan
>
>
>
>
>
> Am 18 Apr 2013 um 17:25 schrieb Simon Wood:
>
> > Jan,
> >
> > Thanks for the data (off list). The p-value computations are based on
> the approximation that things are approximately normal on the linear
> predictor scale, but actually they are no where close to normal in this
> case, which is why the p-values look inconsistent. The reason that the
> approximate normality assumption doesn't hold is that the model is quite a
> poor fit. If you take a look at gam.check(fit) you'll see that the constant
> variance assumption of quasi(link=log) is violated quite badly, and the
> residual distribution is really quite odd (plot residuals against fitted as
> well). Also see plot(fit,pages=1,scale=0) - it shows ballooning confidence
> intervals and smooth estimates that are so low in places that they might as
> well be minus infinity (given log link) - clearly something is wrong with
> this model!
> >
> > I would be inclined to reset all the 0's to 0 (rather than 0.01), and
> then to try Tweedie(p=1.5,link=log) as the family. Also the predictor
> variables are very skewed which is giving leverage problems, so I would
> transform them to give less skew. e.g. Something like
> >
> > fit<-gam(target~s(log(mgs))+s(I(gsd^.5))+s(I(mud^.25))+s(log(ssCmax)),
> > family=Tweedie(p=1.6,link=log),data=df,method="REML")
> >
> > gives a model that is closer to being reasonable (p-values are then
> consistent between select=TRUE and FALSE).
> >
> > best,
> > Simon
> >
> > On 18/04/13 14:24, Simon Wood wrote:
> >> Jan,
> >>
> >> Thanks for this. Is there any chance that you could send me the data off
> >> list and I'll try to figure out what is happening? (Under the
> >> understanding that I'll only use the data for investigating this issue,
> >> of course).
> >>
> >> best,
> >> Simon
> >>
> >> on 18/04/13 11:11, Jan Holstein wrote:
> >>> Simon,
> >>>
> >>> thanks for the reply,  I guess I'm pretty much up to date using
> >>>  mgcv 1.7-22.
> >>> Upgrading to R 3.0.0 also didn't do any change.
> >>>
> >>> Unfortunately using method="REML" does not make any difference:
> >>>
> >>> ### first with "select=FALSE"
> >>> fit<-gam(target ~s(mgs)+s(gsd)+s(mud)+s(ssCmax),family=quasi(link=log),data=wspe1,method="REML",select=F)
> >>> summary(fit)
> >>>
> >>> Family: quasi
> >>> Link function: log
> >>> Formula:
> >>> target ~ s(mgs) + s(gsd) + s(mud) + s(ssCmax)
> >>> Parametric coefficients:
> >>> Estimate Std. Error t value Pr(>|t|)
> >>> (Intercept)   -4.724  7.462  -0.6330.527
> >>> Approximate significance of smooth terms:
> >>> edf Ref.df  F p-value
> >>> s(mgs)3.118  3.492  0.099   0.974
> >>> s(gsd)6.377  7.044 15.596  <2e-16 ***
> >>> s(mud)8.837  8.971 18.832  <2e-16 ***
> >>> s(ssCmax) 3.886  4.051  2.342   0.052 .
> >>> ---
> >>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >>> R-sq.(adj) =  0.403   Deviance explained = 40.6%
> >>> REML score =  33186  Scale est. = 8.7812e+05  n = 4511
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>  Then using "select=T"
> >>>
> >>> fit2<-gam(target ~s(mgs)+s(gsd)+s(mud)+s(ssCmax),family=quasi(link=log),data=wspe1,method="REML",select=TRUE)
> >>> summary(fit2)
> >>> Family: quasi
> >>> Link function: log
> >>> Formula:
> >>> target ~ s(mgs) + s(gsd) + s(mud) + s(ssCmax)
> >>> Parametric coefficients:
> >>> Estimate Std. Error t value Pr(>|t|)
> >>> (Intercept)   -6.406  5.239  -1.2230.222
> >>> Approximate significance of smooth terms:
> >>> edf Ref.df F p-value
> >>> s(mgs)2.844  8 25.43  <2e-16 ***
> >>> s(gsd)6.071  9 14.50  <2e-16 ***
> >>> s(mud)6.875  8 21.79  <2e-16 ***
> >>> s(ssCmax) 3.787  8 18.42  <2e-16 ***
> >>> ---
> >>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >>> R-sq.(adj) =0.4   Deviance explained = 40.1%
> >>> REML score =  33203  Scale est. = 8.8359e+05  n = 4511
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> I played around with other families/link functions with no success
> >>> regarding
> >>> the "select" behaviour.
> >>>
> >>> Well, look at the structure of my data:
> >>> 
> >>>
> >>> All possible predictor variables in principle look like this, and taken
> >>> alone, each and every is significant according to p-value (but not all
>

Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3

2013-04-24 Thread R. Michael Weylandt


On Apr 24, 2013, at 7:46 PM, Jens Olofsson  wrote:

> Ok. I apologise for not understanding. So, I have installed R-tools. It
> changed my PATH-variable. I didn't installed Cygwin dlls as stated by
> http://cran.r-project.org/doc/manuals/R-admin.html#The-Windows-toolset.
> Instead my PATH-variable contains the path to the Cygwin dlls AFTER the
> path to R... So, I started RTerm (32-bit) and tried > R CMD SHLIB mango.f95

Repeating what Duncan said -- R CMD SHLIB is to be done outside of R, not 
within R. 

Some recent discussion also suggests its perhaps easier to do this with RStudio 
+ devtools as part of a package than as a standalone shared object. 

MW


> and got the same error as earlier "Error: unexpected symbol in "R CMD"".
> The same goes for RTerm (64-bit). Can you pls advice me on how to proceed?
> Sincerely Jens
> 
> 
> On 24 April 2013 20:08, Duncan Murdoch  wrote:
> 
>> On 13-04-24 1:51 PM, Jens Olofsson wrote:
>> 
>>> Dear Duncan,
>>> I know this isn't a forum for Cygwin, but for R. Pls treat me as a noob
>>> and
>>> also remember I am on Windows. How should I use R CMD SHLIB as if I write
>>> that at the prompt I get the error: unexpected symbol in "R CMD". I have
>>> mango.f95 in the working directory.
>>> 
>> 
>> 
>> That's a command-line command, not something done with R.  You can use it
>> from your bash shell if you have R and the Rtools directories on your path,
>> or from the Windows CMD shell.
>> 
>> BTW, my comment wasn't trying to tell you to go to a Cygwin forum, it was
>> telling you that Cygwin's gfortran is unsupported.  You need to use the
>> MinGW-64 one that we distribute if you want us to be able to help.
>> 
>> Duncan Murdoch
>> 
>> //Jens
>>> 
>>> 
>>> 
>>> On 24 April 2013 19:46, Duncan Murdoch  wrote:
>>> 
>>> On 13-04-24 1:36 PM, Jens Olofsson wrote:
 
 Dear users of R
> I have a subroutine in Fortran95, compiled to a DLL with gfortran in
> Cygwin
> 4.5.3.
> 
> 
 We don't support Cygwin.  You should use the gfortran in Rtools, and get
 R
 to set the command line options for you, either by putting the code in a
 package, or by using R CMD SHLIB Mango.f95.
 
 Duncan Murdoch
 
  The subroutine is:
 
> subroutine MyPBP( S, p, N )
>  ! Expose subroutine rtest to users of this DLL
>  !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: "mypbp_" ::mypbp
>  ! This function computes the Poisson-Binomial distribution
>  ! of size N using p
>  double precision, intent(inout) :: S(N+1)
>  double precision, intent(in) :: p(N)
>  integer, intent(in) :: N
>  double precision :: X(N+1)
>  integer i, j
>  !X=0
>  !S=0
>  X(1) = 1 - p(1)
>  X(2) = p(1)
>  do i = 2, N
>  S(1) = X(1)*(1-p(i))
>  do j = 2,i
>  S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
>  end do
>  S(i+1) = X(i)*p(i)
>  X = S
>  if (i == N) then
>  S = X
>  end if
>  end do
> end subroutine MyPBP
> and it is saved into Mango.f95
> I compile it from the bash shell using: gfortran-4 -c Mango.f95 and
> gfortran-4 -shared -o Mango.dll Mango.o
> I am on a Windows machine running Windows 7 with Intel i7.
> I load the dll in a 32-bit R by dyn.load("Mango.dll"). Using
> getLoadedDLLs
> I can see the DLL. However, is.loaded("Mango.dll") = FALSE. In
> addition, R
> stop responding when I try .Fortran("MyPBP", as.numeric(S),
> as.numeric(p),
> as.integer(N)),
> where N<-5, S<-array(0,N+1) and p<- c(0.1, 0.2, 0.5, 0.8, 0.9).
> 
> What am I doing wrong?
> Any ideas, thoughts and/or comments are highly appreciated.
> 
> Jens
> 
> [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> 
>> 
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
 
>>> 
>> 
> 
>[[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting 

Re: [R] R cannot find the path on my mac

2013-04-24 Thread MacQueen, Don
The "why" is that you haven't got the path correct yet.

Assuming you are running R.app or R64.app (running the R Console),
one way to find the correct path is to type

  file.choose()

Navigate to your file and choose it. It will then tell you the correct
path.
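
For example, a small sketch (the object name p is only for illustration):

  p <- file.choose()   # navigate to pipelineIlluminaMethylation.main.R and pick it
  p                    # prints the full path R expects, spaces and all
  source(p)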

You can also do things like
  source(  file.path(  'path.to.folder', 'filename' ) )
if you want to keep the folder name and the filename separate.

You could also open Terminal (in the Utilities folder) and type
  find . -name filename
and it should tell you the path relative to your home folder. This assumes
that the file is somewhere below
  /Users/gban

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 4/24/13 8:46 AM, "Gitte Brinch Andersen"  wrote:

>Hi
>
>I am really sorry for this probably quite simple question.
>I am new to R, and I am running a pipeline that has already been made.
>All I have to do is give the paths for different folders, where the
>pipeline can find the files with my data.
>
>But every time I try to run the pipeline it returns with the message,
>that it cannot find the file. And I really don't know why. I have found
>the path by going to the folder and finding the info about the folder,
>where the location of this is stated. I copy-paste this into the source
>command.
>
>The path for the folder is:
>
>When found with the "info about the folder":
>/Users/gban/Desktop/Methylation analysis/MethylationDataAnalysis
>
>But I think it should be:
>/Users/gban/Desktop/Methylation
>analysis/MethylationDataAnalysis/1.Normalization_raw_data/
>as it is the folder 1.Normalization_raw_data where my data are in.
>
>I have tried both paths but it returns with this message every time:
>
>> source('/Users/gban/Desktop/Methylation
>>analysis/MethylationDataAnalysis/1.Normalization_raw_data/pipelineIllumin
>>aMethylation.main.R')
>Error in file(filename, "r", encoding = encoding) :
>  cannot open the connection
>In addition: Warning message:
>In file(filename, "r", encoding = encoding) :
>  cannot open file '/Users/gban/Desktop/Methylation
>analysis/MethylationDataAnalysis/1.
>450K_pipeline_Nov2012release./SRC/loadMethylumi2.R': No such file or
>directory
>
>
>Do you have any idea why?
>I know this is probably more a question of how to type the path on a mac,
>than it is a question of how R works.
>
>But I am really frustrated about this, and thought it might be possible
>for you to help?
>
>Thank you in advance, and again sorry for the quite stupid question.
>
>Best
>
>Gitte Brinch Andersen
>
>E-mail: gitt...@hum-gen.au.dk
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] the joy of spreadsheets (off-topic)

2013-04-24 Thread peter dalgaard

On Apr 24, 2013, at 20:01 , Thomas Adams wrote:

> One might wonder if the "Excel error" was indeed THAT or perhaps a way to get 
> the desired results, given the other issues in their analysis?

I think I'd reserve that suspicion for what they did with the NZ data:

Growth for 1946-49:  7.7, 11.9, −9.9, and 10.8 
--1951: -7.6

Those were the 5 years with Debt/GDP > 90%. Obviously, the economy was going up 
and down like a yoyo. So they retain only the last value, miscode it as -7.9, 
and give that one year the same weight as decades of positive growth in other 
countries...


> 
> 
> On Wed, Apr 24, 2013 at 11:58 AM, peter dalgaard  wrote:
> In case you haven't noticed, this is making the rounds in the media, 
> including a handful of references to R. See e.g.
> 
> http://news.slashdot.org/story/13/04/17/0215211/excel-error-contributes-to-problems-with-austerity-study
> 
> I suppose we can't fortune()'ify anonymous quotes, but I kind of like this 
> exchange:
> 
> "Bacon Bits": "SPSS and R are very good at statistical analysis. Quantrix, 
> MapleSoft, IBM Algorithmics, and other software is for financial data 
> modeling. None of those is particularly appropriate for sharing data in a 
> useful format with peers. Excel is."
> 
> "Hatta": "R is extremely appropriate for sharing data in a useful format with 
> peers. It's completely free for one. But more importantly, it saves every 
> single step of your analysis. Send someone an Excel file, and who knows what 
> they've done to the data. Send someone your R project directory and they can 
> see exactly what you did.
> 
> The problem with sending R files to your peers isn't that the R files aren't 
> useful. It's that your peers aren't."
> 
> 
> 
> 
> On Apr 16, 2013, at 19:25 , Sarah Goslee wrote:
> 
> > Given that we occasionally run into problems with comparing Excel
> > results to R results, and other spreadsheet-induced errors, I thought
> > this might be of interest.
> >
> > http://www.nextnewdeal.net/rortybomb/researchers-finally-replicated-reinhart-rogoff-and-there-are-serious-problems
> >
> > The punchline:
> >
> > "If this error turns out to be an actual mistake Reinhart-Rogoff made,
> > well, all I can hope is that future historians note that one of the
> > core empirical points providing the intellectual foundation for the
> > global move to austerity in the early 2010s was based on someone
> > accidentally not updating a row formula in Excel."
> >
> > Ouch.
> >
> > (Note: I know nothing about the site, the author of the article, or
> > the study in question. I was pointed to it by someone else. But if
> > true: highly problematic.)
> >
> > Sarah
> >
> > --
> > Sarah Goslee
> > http://www.functionaldiversity.org
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> --
> Peter Dalgaard, Professor
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3

2013-04-24 Thread Jens Olofsson
Ok. I apologise for not understanding. So, I have installed R-tools. It
changed my PATH-variable. I didn't install the Cygwin dlls as stated by
http://cran.r-project.org/doc/manuals/R-admin.html#The-Windows-toolset.
Instead my PATH-variable contains the path to the Cygwin dlls AFTER the
path to R... So, I started RTerm (32-bit) and tried > R CMD SHLIB mango.f95
and got the same error as earlier "Error: unexpected symbol in "R CMD"".
The same goes for RTerm (64-bit). Can you please advise me on how to proceed?
Sincerely Jens


On 24 April 2013 20:08, Duncan Murdoch  wrote:

> On 13-04-24 1:51 PM, Jens Olofsson wrote:
>
>> Dear Duncan,
>> I know this isn't a forum for Cygwin, but for R. Pls treat me as a noob
>> and
>> also remember I am on Windows. How should I use R CMD SHLIB as if I write
>> that at the prompt I get the error: unexpected symbol in "R CMD". I have
>> mango.f95 in the working directory.
>>
>
>
> That's a command-line command, not something done with R.  You can use it
> from your bash shell if you have R and the Rtools directories on your path,
> or from the Windows CMD shell.
>
> BTW, my comment wasn't trying to tell you to go to a Cygwin forum, it was
> telling you that Cygwin's gfortran is unsupported.  You need to use the
> MinGW-64 one that we distribute if you want us to be able to help.
>
> Duncan Murdoch
>
>  //Jens
>>
>>
>>
>> On 24 April 2013 19:46, Duncan Murdoch  wrote:
>>
>>  On 13-04-24 1:36 PM, Jens Olofsson wrote:
>>>
>>>  Dear users of R
 I have a subroutine in Fortran95, compiled to a DLL with gfortran in
 Cygwin
 4.5.3.


>>> We don't support Cygwin.  You should use the gfortran in Rtools, and get
>>> R
>>> to set the command line options for you, either by putting the code in a
>>> package, or by using R CMD SHLIB Mango.f95.
>>>
>>> Duncan Murdoch
>>>
>>>   The subroutine is:
>>>
 subroutine MyPBP( S, p, N )
   ! Expose subroutine rtest to users of this DLL
   !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: "mypbp_" ::mypbp
   ! This function computes the Poisson-Binomial distribution
   ! of size N using p
   double precision, intent(inout) :: S(N+1)
   double precision, intent(in) :: p(N)
   integer, intent(in) :: N
   double precision :: X(N+1)
   integer i, j
   !X=0
   !S=0
   X(1) = 1 - p(1)
   X(2) = p(1)
   do i = 2, N
   S(1) = X(1)*(1-p(i))
   do j = 2,i
   S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
   end do
   S(i+1) = X(i)*p(i)
   X = S
   if (i == N) then
   S = X
   end if
   end do
 end subroutine MyPBP
 and it is saved into Mango.f95
 I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
 gfortran-4 -shared -o Mango.dll Mango.o
 I am on a Windows machine running Windows 7 with Intel i7.
 I load the dll in a 32-bit R by dyn.load("Mango.dll"). Using
 getLoadedDLLs
 I can see the DLL. However, is.loaded("Mango.dll") = FALSE. In
 addition, R
 stop responding when I try .Fortran("MyPBP", as.numeric(S),
 as.numeric(p),
 as.integer(N)),
 where N<-5, S<-array(0,N+1) and p<- c(0.1, 0.2, 0.5, 0.8, 0.9).

 What am I doing wrong?
 Any ideas, thoughts and/or comments are highly appreciated.

 Jens

  [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 
 >
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html 
 
 >

 and provide commented, minimal, self-contained, reproducible code.



>>>
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sum up column values according to row id

2013-04-24 Thread Rui Barradas

Hello,

Note also that in the with() instruction there's some redundancy; the OP 
could simply do


with(ipso, order(id, -g))
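
Both forms give the same ordering; a quick check (using the ipso data frame 
from the original post):

stopifnot(identical(with(ipso, order(ipso$id, -ipso$g)),
                    with(ipso, order(id, -g))))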


Hope this helps,

Rui Barradas

Em 24-04-2013 19:10, David Carlson escreveu:

Something like this?

mean6 <- function(x) {
  if (length(x) < 6) {
    mn <- mean(x)
  } else {
    mn <- mean(x[1:6])
  }
  return(mn)
}

aggregate(g~id, ipso, mean6)

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Matteo Mura
Sent: Wednesday, April 24, 2013 7:57 AM
To: r-help@r-project.org
Subject: [R] Sum up column values according to row id

Dear All,

here is a problem I think many of you can solve in a few minutes.

I have a dataframe which contains values of plot id, diameters, heights and
basal area of trees, thus the column names are: id | dbh | h | g

head(ipso, n=10)
        id dbh     h          g
1  FPE0164  36 13.62 0.10178760
2  FPE0164  31 12.70 0.07547676
21 FPE1127  57 18.85 0.25517586
13 FPE1127  39 15.54 0.11945906
12 FPE1127  34 14.78 0.09079203
6  FPE1127  32 15.12 0.08042477
5  FPE1127  28 14.13 0.06157522
15 FPE1127  27 13.50 0.05725553
19 FPE1127  25 13.28 0.04908739
11 FPE1127  19 11.54 0.02835287

from here I need to calculate the mean of the six greatest g_ith for each
id_ith. The clauses are that:

if length(id) >=6

do the mean of the first six greatest g


else
do the mean of all the g_ith in the id_ith (in head print above e.g.
for the id==FPE0164 do the mean of just these two values of g).

The g are already ordered by id ascending and g descending using:

ipso <- ipso[with(ipso, order(ipso$id, -ipso$g)), ] # Order for id ascending
and g descending

I tried a lot of for loops and tapply() without results.

Can anyone help me to solve this?

Thanks for your attention

Best wishes

Matteo

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Trouble Computing Type III SS in a Cox Regression using drop1 and Anova

2013-04-24 Thread John Fox
Dear Paul,

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Paul Miller
> Sent: Wednesday, April 24, 2013 1:18 PM
> To: r-help@r-project.org
> Subject: [R] Trouble Computing Type III SS in a Cox Regression using
> drop1 and Anova
> 
> Hello All,
> 
> Am having some trouble computing Type III SS in a Cox Regression using
> either drop1 or Anova from the car package. Am hoping that people will
> take a look to see if they can tell what's going on.
> 
> Here is my R code:
> 

. . .

> 
> Both drop1 and Anova give me a different p-value than I get from coxph
> for both my two-level ps2 variable and for age. This is not what I
> would expect based on experience using SAS to conduct similar analyses.
> Indeed SAS consistently produces the same p-values. Namely the ones I
> get from coxph.
> 
> My sense is that I'm probably misusing R in some way but I'm not sure
> what I'm likely to be doing wrong. SAS produces Wald Chi-Square results
> for its type III tests. Maybe that has something to do with it.
> Ideally, I'd like to get type III values that match those from coxph.
> If anyone could help me understand better, that would be greatly
> appreciated.

You've answered your own question: The summary() output gives you Wald
tests, and drop1() and Anova() give you LR tests. From ?Anova:

"test.statistic: ... for a Cox model, whether to calculate "LR"
(partial-likelihood ratio) or "Wald" tests; ..."

Thus, if you want the Wald test, you can ask for it (though it escapes me
why you prefer it to the LR test).
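
For example, with the coxCV model from the original post, something like

Anova(coxCV, type = "III", test.statistic = "Wald")

should give Wald chi-square tests; since a one-df Wald chi-square is just
(coef/se)^2, those p-values should agree with the ones in the coxph summary.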

I hope this helps,
 John

---
John Fox
Senator McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] help with logkda.pari (in package untb)

2013-04-24 Thread Chase, Jennifer
Hello-
I'm looking for help using the function logkda.pari. logkda.pari needs to
access the program pari/gp (which of course I've installed), but I don't
understand what commands I should use so that it recognizes that it is
installed. Any help with this would be much appreciated.

Cheers
Jen Chase

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3

2013-04-24 Thread Duncan Murdoch

On 13-04-24 1:51 PM, Jens Olofsson wrote:

Dear Duncan,
I know this isn't a forum for Cygwin, but for R. Pls treat me as a noob and
also remember I am on Windows. How should I use R CMD SHLIB as if I write
that at the prompt I get the error: unexpected symbol in "R CMD". I have
mango.f95 in the working directory.



That's a command-line command, not something done with R.  You can use 
it from your bash shell if you have R and the Rtools directories on your 
path, or from the Windows CMD shell.
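
For what it's worth, a rough sketch of driving it from within an R session 
instead (the directory is made up; it assumes R's bin directory and the Rtools 
directories are on your PATH):

setwd("C:/path/to/code")          # made-up folder containing Mango.f95
system("R CMD SHLIB Mango.f95")   # shells out; don't type "R CMD ..." at the R prompt
dyn.load("Mango.dll")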


BTW, my comment wasn't trying to tell you to go to a Cygwin forum, it 
was telling you that Cygwin's gfortran is unsupported.  You need to use 
the MinGW-64 one that we distribute if you want us to be able to help.


Duncan Murdoch


//Jens



On 24 April 2013 19:46, Duncan Murdoch  wrote:


On 13-04-24 1:36 PM, Jens Olofsson wrote:


Dear users of R
I have a subroutine in Fortran95, compiled to a DLL with gfortran in
Cygwin
4.5.3.



We don't support Cygwin.  You should use the gfortran in Rtools, and get R
to set the command line options for you, either by putting the code in a
package, or by using R CMD SHLIB Mango.f95.

Duncan Murdoch

  The subroutine is:

subroutine MyPBP( S, p, N )
  ! Expose subroutine rtest to users of this DLL
  !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: "mypbp_" ::mypbp
  ! This function computes the Poisson-Binomial distribution
  ! of size N using p
  double precision, intent(inout) :: S(N+1)
  double precision, intent(in) :: p(N)
  integer, intent(in) :: N
  double precision :: X(N+1)
  integer i, j
  !X=0
  !S=0
  X(1) = 1 - p(1)
  X(2) = p(1)
  do i = 2, N
  S(1) = X(1)*(1-p(i))
  do j = 2,i
  S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
  end do
  S(i+1) = X(i)*p(i)
  X = S
  if (i == N) then
  S = X
  end if
  end do
end subroutine MyPBP
and it is saved into Mango.f95
I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
gfortran-4 -shared -o Mango.dll Mango.o
I am on a Windows machine running Windows 7 with Intel i7.
I load the dll in a 32-bit R by dyn.load("Mango.dll"). Using getLoadedDLLs
I can see the DLL. However, is.loaded("Mango.dll") = FALSE. In addition, R
stop responding when I try .Fortran("MyPBP", as.numeric(S), as.numeric(p),
as.integer(N)),
where N<-5, S<-array(0,N+1) and p<- c(0.1, 0.2, 0.5, 0.8, 0.9).

What am I doing wrong?
Any ideas, thoughts and/or comments are highly appreciated.

Jens

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.








__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sum up column values according to row id

2013-04-24 Thread David Carlson
Something like this?

mean6 <- function(x) {
  if (length(x) < 6) {
    mn <- mean(x)
  } else {
    mn <- mean(x[1:6])
  }
  return(mn)
}

aggregate(g~id, ipso, mean6)
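
An equivalent, slightly more compact form (same logic; head(x, 6) simply 
returns all values when there are fewer than six, and it relies on the rows 
already being sorted by id ascending and g descending, as in the original post):

mean6b <- function(x) mean(head(x, 6))
aggregate(g ~ id, ipso, mean6b)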

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Matteo Mura
Sent: Wednesday, April 24, 2013 7:57 AM
To: r-help@r-project.org
Subject: [R] Sum up column values according to row id

Dear All,

here is a problem I think many of you can solve in a few minutes.

I have a dataframe which contains values of plot id, diameters, heights and
basal area of trees, thus the column names are: id | dbh | h | g

head(ipso, n=10)
        id dbh     h          g
1  FPE0164  36 13.62 0.10178760
2  FPE0164  31 12.70 0.07547676
21 FPE1127  57 18.85 0.25517586
13 FPE1127  39 15.54 0.11945906
12 FPE1127  34 14.78 0.09079203
6  FPE1127  32 15.12 0.08042477
5  FPE1127  28 14.13 0.06157522
15 FPE1127  27 13.50 0.05725553
19 FPE1127  25 13.28 0.04908739
11 FPE1127  19 11.54 0.02835287

from here I need to calculate the mean of the six greatest g_ith for each
id_ith. The clauses are that:

if length(id) >=6

do the mean of the first six greatest g


else
do the mean of all the g_ith in the id_ith (in head print above e.g.
for the id==FPE0164 do the mean of just these two values of g).

The g are already ordered by id ascending and g descending using:

ipso <- ipso[with(ipso, order(ipso$id, -ipso$g)), ] # Order for id ascending
and g descending

I tried a lot of for loops and tapply() without results.

Can anyone help me to solve this?

Thanks for your attention

Best wishes

Matteo

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Distance matrices Combinations

2013-04-24 Thread eliza botto
Dear UseRs,
My problem is a small piece of a much bigger and more complicated problem. Put 
very simply, all I want is to plug every possible combination of 4 out of my 75 
distance matrices (75C4 combinations, i.e. 1215450 of them) into the following equation.


t<-as.matrix((MAT1)^2+(MAT2)^2+(MAT3)^2+(MAT4)^2, upper=T, diag=T)

Then "1215450" values of "t"(one for each combination) should one by one be 
inculcated in the following loop(to calculate the "o" value) and in the end 
want those 10 combinations of distance matrices which have lowest "o" values.

e <- vector("list", 124)

w<-sqrt(t)

mat1<-w

for (i in 1:124){

r<-matrix(sort(mat1[i,],index.return=TRUE)$ix,ncol=1)

u<-r[c(2,3,4,5,6),1]

mata<-m[,c(u)]  ##(shifted)

amata<-apply(mata,1,mean)

amata<-data.frame(amata)

aavg<-as.matrix(amata, ncol=1)

b<-aavg

e[[i]]<-print(sum(abs(b-m[,i])))

}

x<-do.call(rbind,e)

Y<-x

z <- apply(Y,2,sd)

o<-mean(Y)

Does my question make any sense?
Thanks in advance
Elisa 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] the joy of spreadsheets (off-topic)

2013-04-24 Thread Thomas Adams
One might wonder if the "Excel error" was indeed THAT or perhaps a way to
get the desired results, given the other issues in their analysis?


On Wed, Apr 24, 2013 at 11:58 AM, peter dalgaard  wrote:

> In case you haven't noticed, this is making the rounds in the media,
> including a handful of references to R. See e.g.
>
>
> http://news.slashdot.org/story/13/04/17/0215211/excel-error-contributes-to-problems-with-austerity-study
>
> I suppose we can't fortune()'ify anonymous quotes, but I kind of like this
> exchange:
>
> "Bacon Bits": "SPSS and R are very good at statistical analysis. Quantrix,
> MapleSoft, IBM Algorithmics, and other software is for financial data
> modeling. None of those is particularly appropriate for sharing data in a
> useful format with peers. Excel is."
>
> "Hatta": "R is extremely appropriate for sharing data in a useful format
> with peers. It's completely free for one. But more importantly, it saves
> every single step of your analysis. Send someone an Excel file, and who
> knows what they've done to the data. Send someone your R project directory
> and they can see exactly what you did.
>
> The problem with sending R files to your peers isn't that the R files
> aren't useful. It's that your peers aren't."
>
>
>
>
> On Apr 16, 2013, at 19:25 , Sarah Goslee wrote:
>
> > Given that we occasionally run into problems with comparing Excel
> > results to R results, and other spreadsheet-induced errors, I thought
> > this might be of interest.
> >
> >
> http://www.nextnewdeal.net/rortybomb/researchers-finally-replicated-reinhart-rogoff-and-there-are-serious-problems
> >
> > The punchline:
> >
> > "If this error turns out to be an actual mistake Reinhart-Rogoff made,
> > well, all I can hope is that future historians note that one of the
> > core empirical points providing the intellectual foundation for the
> > global move to austerity in the early 2010s was based on someone
> > accidentally not updating a row formula in Excel."
> >
> > Ouch.
> >
> > (Note: I know nothing about the site, the author of the article, or
> > the study in question. I was pointed to it by someone else. But if
> > true: highly problematic.)
> >
> > Sarah
> >
> > --
> > Sarah Goslee
> > http://www.functionaldiversity.org
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3

2013-04-24 Thread Jens Olofsson
Dear Duncan,
I know this isn't a forum for Cygwin, but for R. Pls treat me as a noob and
also remember I am on Windows. How should I use R CMD SHLIB as if I write
that at the prompt I get the error: unexpected symbol in "R CMD". I have
mango.f95 in the working directory.
//Jens



On 24 April 2013 19:46, Duncan Murdoch  wrote:

> On 13-04-24 1:36 PM, Jens Olofsson wrote:
>
>> Dear users of R
>> I have a subroutine in Fortran95, compiled to a DLL with gfortran in
>> Cygwin
>> 4.5.3.
>>
>
> We don't support Cygwin.  You should use the gfortran in Rtools, and get R
> to set the command line options for you, either by putting the code in a
> package, or by using R CMD SHLIB Mango.f95.
>
> Duncan Murdoch
>
>  The subroutine is:
>> subroutine MyPBP( S, p, N )
>>  ! Expose subroutine rtest to users of this DLL
>>  !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: "mypbp_" ::mypbp
>>  ! This function computes the Poisson-Binomial distribution
>>  ! of size N using p
>>  double precision, intent(inout) :: S(N+1)
>>  double precision, intent(in) :: p(N)
>>  integer, intent(in) :: N
>>  double precision :: X(N+1)
>>  integer i, j
>>  !X=0
>>  !S=0
>>  X(1) = 1 - p(1)
>>  X(2) = p(1)
>>  do i = 2, N
>>  S(1) = X(1)*(1-p(i))
>>  do j = 2,i
>>  S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
>>  end do
>>  S(i+1) = X(i)*p(i)
>>  X = S
>>  if (i == N) then
>>  S = X
>>  end if
>>  end do
>> end subroutine MyPBP
>> and it is saved into Mango.f95
>> I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
>> gfortran-4 -shared -o Mango.dll Mango.o
>> I am on a Windows machine running Windows 7 with Intel i7.
>> I load the dll in a 32-bit R by dyn.load("Mango.dll"). Using getLoadedDLLs
>> I can see the DLL. However, is.loaded("Mango.dll") = FALSE. In addition, R
>> stop responding when I try .Fortran("MyPBP", as.numeric(S), as.numeric(p),
>> as.integer(N)),
>> where N<-5, S<-array(0,N+1) and p<- c(0.1, 0.2, 0.5, 0.8, 0.9).
>>
>> What am I doing wrong?
>> Any ideas, thoughts and/or comments are highly appreciated.
>>
>> Jens
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] substitute rename.vars {gdata}

2013-04-24 Thread Knut Krueger

is there a equivalent to rename.vars {gdata}?

As I do not use read.xls i am not using library gdata for a package 
except this function


Knut

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3

2013-04-24 Thread Duncan Murdoch

On 13-04-24 1:36 PM, Jens Olofsson wrote:

Dear users of R
I have a subroutine in Fortran95, compiled to a DLL with gfortran in Cygwin
4.5.3.


We don't support Cygwin.  You should use the gfortran in Rtools, and get 
R to set the command line options for you, either by putting the code in 
a package, or by using R CMD SHLIB Mango.f95.


Duncan Murdoch


The subroutine is:
subroutine MyPBP( S, p, N )
 ! Expose subroutine rtest to users of this DLL
 !DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: "mypbp_" ::mypbp
 ! This function computes the Poisson-Binomial distribution
 ! of size N using p
 double precision, intent(inout) :: S(N+1)
 double precision, intent(in) :: p(N)
 integer, intent(in) :: N
 double precision :: X(N+1)
 integer i, j
 !X=0
 !S=0
 X(1) = 1 - p(1)
 X(2) = p(1)
 do i = 2, N
 S(1) = X(1)*(1-p(i))
 do j = 2,i
 S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
 end do
 S(i+1) = X(i)*p(i)
 X = S
 if (i == N) then
 S = X
 end if
 end do
end subroutine MyPBP
and it is saved into Mango.f95
I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
gfortran-4 -shared -o Mango.dll Mango.o
I am on a Windows machine running Windows 7 with Intel i7.
I load the dll in a 32-bit R by dyn.load("Mango.dll"). Using getLoadedDLLs
I can see the DLL. However, is.loaded("Mango.dll") = FALSE. In addition, R
stop responding when I try .Fortran("MyPBP", as.numeric(S), as.numeric(p),
as.integer(N)),
where N<-5, S<-array(0,N+1) and p<- c(0.1, 0.2, 0.5, 0.8, 0.9).

What am I doing wrong?
Any ideas, thoughts and/or comments are highly appreciated.

Jens

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] string size limits in RCurl

2013-04-24 Thread Elmore, Ryan
Hi All,

I am running into what appears to be a character size limit in a JSON string when 
trying to retrieve data from either `curlPerform()` or `getURL()`.  Here is 
non-reproducible code [1], but it should shed some light on the problem.

# Note that .base.url is the basic url for the API, q is a query, user
#  is specified, etc.
session = getCurlHandle()
curl.opts <- list(userpwd = paste(user, ":", key, sep = ""),
  httpheader = "Content-Type: application/json")
request <- paste(.base.url, q, sep = "")
txt <- getURL(url = request, curl = session, .opts = curl.opts,
  write = basicTextGatherer())

or

r = dynCurlReader()
curlPerform(url = request, writefunction = r$update, curl = session,
.opts = curl.opts)

My guess is that the `update` or `value` functions in the `basicTextGatherer` or 
`dynCurlReader` text handler objects are having trouble with the large strings. 
 In this example, `r$value()` will return a truncated string that is 
approximately 2 MB.  The code given above will work fine for queries < 2 MB.

Note that I can easily do the following from the command line (or using 
`system()` in R), but writing to disc seems like a waste if I am doing the 
subsequent analysis in R.

curl -v --header "Content-Type: application/json" --user 
username:register:passwd 
https://base.url.for.api/getdata/select+*+from+sometable > stream.json

where `stream.json` is a roughly 14MB json string. I can read the string into R 
using either

con <- file(paste(.project.path, "data/stream.json", sep = ""), "r")
string <- readLines(con)

or directly to list as

tmp <- fromJSON(file = paste(.project.path, "data/stream.json", sep = ""))

Any thoughts are very much appreciated.  Note that I posted this same 
question/comment to StackOverflow and will happily provide any helpful 
suggestions to that list as well.

Ryan

[1] - Sorry for not providing reproducible code, but I'm dealing with a govt 
firewall.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Residuals for fracdiff

2013-04-24 Thread fmesserGmail
Hi,

 

I am using the fracdiff package to estimate the parameters of an
ARFIMA(1,d,1) model. I would also like to get the residuals of the series. I
have seen another post about this (below). However, being still quite at the
beginner level in terms of R, I did not quite understand how this worked. I
also read through the fracdiff package manual with no success to find any
help with the residuals. I would be very grateful for any help. Thank you!

 

Best regards,

François Messer

HEC, Faculty of Business and Economics

University of Lausanne

 

Message-id: <3c3d6095.5eb01...@lmttrading.com>
 
 

> Date: Wed, 9 Jan 2002 07:57:31 +
> From: Susana Barbosa <susanabarb...@novalis.fc.up.pt>
> Subject: [R] How to obtain the series of residuals from fracdiff
>
> Hi
>
> I'm using fracdiff package to estimate the parameters of a
> fractionally-differenced ARIMA (p,d,q) model, and it works fine, but I
wanted
> to have also the filtered series and the series of residuals.
> I understand these are calculated in the subroutine fdfilt, in the program
> fdcore.f, but I can't manage to get them out.
>
> Any suggestion would be much appreciated
>
> Thanks
>
> Susana Barbosa

Hi Susana

For fractional differencing you can use, e.g., something like

fracdiff <- function (x, d, N = 100)
{
    n <- 0:N
    ## binomial-expansion weights of (1 - B)^d, truncated at lag N
    w <- gamma(-d + n) / (gamma(-d) * gamma(n + 1))
    ## one-sided convolution filter applied to the series
    y <- filter(x, w, sides = 1)
    return(y)
}

Adrian
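
(For illustration, a small made-up usage example of the helper quoted above; 
note that with sides = 1 the first N entries of the result come back as NA:)

set.seed(1)
x <- arima.sim(list(ar = 0.5), n = 500)   # simulated series
y <- fracdiff(x, d = 0.3, N = 100)        # fractional differencing with d = 0.3
head(y, 102)                              # entries 1..100 are NA, then the filtered values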

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] fonts not rendering during plot

2013-04-24 Thread George Dietz
Hi,

I am having a problem where sometimes fonts for text in plots don't get 
rendered; the text just shows up as boxes. If I call R from a certain directory 
the problem goes away; otherwise the fonts don't render (but the plots still get made). 
In both cases, my PANGO_RC and FONTCONFIG_FILE do not change…

Thank you,
-George
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] puzzles of Finance data programming with R

2013-04-24 Thread chenzi14
Hi
This is Candice from Newcastle University.
I am now coming across some problems with the programming of R.
It is like this.
I am asked to run three models based on a sample of CPI:
Random walk
Recursive AR(4)
Rolling AR(4)
1. Based on the papers I have found, I think the random walk may be similar to an 
AR(1) model (possibly with drift), but I am not sure whether this is right.
2.For the recursive model, I found a command named filter.
 m4=filter(Z,filter=c(c1,c2,c3,c4),method="recursive",side=1)
However, since it is not a simulating process, I cannot define the coefficients.
But this type of command needs specific coefs.
And I also try another code like 
m=filter(Z,rep(1,5),method="recursive",side=1). 
However, this command is applied to MA models. By doing rep(1,5), I think this 
means I define all the coefs to be 1.
Additionally, I tried ?filter.
But the examples there are not that relevant.
3.For the rolling AR(4) model.
In fact, I have done a rollapply command for 12 years rolling windows for 
previous questions.
I think the methodology of the merge command should be used, since rolling is 
a bit like a process of merging and moving on. 
But I do not know which command I should pick in order to run a rolling AR(4) 
model.


PS. If any additional packages, like TSA, need to be installed, please tell 
me. 


Thank you very much!
I am really looking forward to hearing from you.


Best Regards
Candice




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] matching observations and ranking

2013-04-24 Thread arun
Just to add:
If your original dataset has only a few columns, then you can try this too:
res1 <- within(mutate(dat1,
            AB001A_1 = 1*(AB001A == AB001A[2]),
            AB0002A_1 = 1*(AB0002A == AB0002A[2]),
            AB362_1 = 1*(AB362 == AB362[2]),
            SUM = rowSums(cbind(AB001A_1, AB0002A_1, AB362_1)),
            MATCH = (SUM/3)*100),
        {MATCH[1:2] <- NA; RANK = rank(MATCH, ties.method = "min")
         SUM[1:2] <- NA; AB001A_1[1:2] <- NA; AB0002A_1[1:2] <- NA; AB362_1[1:2] <- NA})
identical(res,res1)
#[1] TRUE
A.K.





- Original Message -
From: arun 
To: R help 
Cc: 
Sent: Wednesday, April 24, 2013 10:09 AM
Subject: Re: matching observations and ranking 

Hi,
Maybe this helps:
As you wanted to match only from row3 onwards to row2, the corresponding values 
on row1 and row2 were set to NA.
dat1<- read.table(text="
  S.No AB001A AB0002A AB362
   P1   -/-    C/C   A/A   
    P2   C/C    C/C   A/A   
    3   C/C    C/C   A/A   
    4   C/C    C/C   A/A   
    5   C/C    C/C   A/A   
    6   C/C    C/C   A/A   
    7   C/C    C/C   A/A   
    8   -/-    -/-   -/-   
    9   C/C    C/C   A/A   
    10  C/C    C/C   A/A   
    11  -/-    C/C   A/A   
    12  C/C    C/C   A/A   
    13  C/C    C/C   A/A   
    14  C/C    C/C   A/A   
    15  C/C    -/-   A/A   
    16   -/-    C/C   A/A   
    17   A/A    A/C   A/A   
    18  C/A    A/A   A/A
",sep="",header=TRUE,stringsAsFactors=FALSE)
dat2<-cbind(dat1,(1*mapply("==",dat1[,-1],dat1[2,-1])))
names(dat2)[duplicated(names(dat2))]<- 
paste0(names(dat2)[duplicated(names(dat2))],"_1")
library(plyr)
 dat3<-mutate(dat2,SUM=rowSums(cbind(AB001A_1,AB0002A_1,AB362_1)), 
MATCH=(SUM/3)*100)
 dat3[1:2,5:9]<-NA
res<-mutate(dat3,RANK=rank(MATCH,ties.method="min"))
 head(res)
#  S.No AB001A AB0002A AB362 AB001A_1 AB0002A_1 AB362_1 SUM MATCH RANK
#1   P1    -/- C/C   A/A   NA    NA  NA  NA    NA   17
#2   P2    C/C C/C   A/A   NA    NA  NA  NA    NA   18
#3    3    C/C C/C   A/A    1 1   1   3   100    7
#4    4    C/C C/C   A/A    1 1   1   3   100    7
#5    5    C/C C/C   A/A    1 1   1   3   100    7
#6    6    C/C C/C   A/A    1 1   1   3   100    7
A.K.


>Hi Arun, 
>Thank you very much for your help in solving my problem, 
>S. No   AB001A  AB0002A AB362   AB001A    AB0002A     AB362   SUM %Match  Rank 
 >   P1   -/-        C/C   A/A                         
  > P 2   C/C        C/C   A/A                         
  >  3   C/C        C/C   A/A                         
  >  4   C/C        C/C   A/A                         
  >  5   C/C        C/C   A/A                         
   > 6   C/C        C/C   A/A                         
   > 7   C/C        C/C   A/A                         
   > 8   -/-        -/-   -/-                         
   > 9   C/C        C/C   A/A                         
   >10  C/C        C/C   A/A                         
   > 11  -/-        C/C   A/A                         
   > 12  C/C        C/C   A/A                         
   > 13  C/C        C/C   A/A                         
   > 14  C/C        C/C   A/A                         
    >16  C/C        -/-   A/A                         
>Actually i want to match observation from 3 to 16 with the value in 
>p2 (i.e 3 with p2, 4 with p2, 5 with p2 etc), if they match i would like 
>to give value 1 and store it in the corresponding dummy variable, i.e. 
>AB001A, and i would like to do the same thing for the remaining vars too, 
>storing in their dummy vars. Finally i want to make a sum of all the matched 
>(i.e. 1 score) in each row and calculate the percentage of match and then 
>rank. This is what i want, sorry for not expressing my problem exactly in an 
>understandable way. 




>Hi to all bloggers, 
 >my data looks like this, 
>
>S. No   AB001A  AB0002A AB362   VAR1    VAR2    VAR3    SUM %Match  Rank 
 >  1   -/-        C/C   A/A                         
 >   2   C/C        C/C   A/A                         
  >  3   C/C        C/C   A/A                         
  >  4   C/C        C/C   A/A                         
   > 5   C/C        C/C   A/A                         
  >  6   C/C        C/C   A/A                         
   > 7   C/C        C/C   A/A                         
   > 8   -/-        -/-   -/-                         
   > 9   C/C        C/C   A/A                         
   > 10  C/C        C/C   A/A                         
   > 11  -/-        C/C   A/A                         
   > 12  C/C        C/C   A/A                         
   > 13  C/C        C/C   A/A                         
   > 14  C/C        C/C   A/A                         
   > 16  C/C        -/-   A/A          

[R] Regression and FMMs with flexmix

2013-04-24 Thread Robin Tviet




I am repeating this because it seems that some people think it is important to 
reveal your identity; I don't understand why this is so important. Hopefully 
now this list will be helpful.

Could someone please assist with this?


I am trying to understand how to use the flexmix package. I have read the 
Leisch paper but am very unclear about what is needed for the M-step driver. I am 
just fitting a simple linear regression model. The documentation is far from 
clear about what the FLXmclust function does, but it could in principle do all I 
need; however, I do not get sensible results, as the following attempt shows 
(the result is poor):

x<-c()
for(i in 
0:99){x$y[2*i]=(0+i);x$x[2*i]=i;x$x[2*i+1]=i;x$y[2*i+1]=i+1000;x$g[2*i]=1;x$g[2*i+1]=2}
m1<-flexmix(y~x  ,data=x,k=2)
table(x$g,m1@cluster)

1  2
1 25 74
2 67 33

It all depends on the randomised starting values.  So I think I need a better 
"driver", but I cannot find a spec for what I have to do in the driver.

Where is FLXmclust documented?  Can anyone assist?
 


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R cannot find the path on my mac

2013-04-24 Thread Gitte Brinch Andersen
Hi

I am really sorry for this probably quite simple question.
I am new to R, and I am running a pipeline that has already been made. All I 
have to do is give the paths for different folders, where the pipeline can find 
the files with my data.

But every time I try to run the pipeline it returns with the message, that it 
cannot find the file. And I really don't know why. I have found the path by 
going to the folder and finding the info about the folder, where the location 
of this is stated. I copy-paste this into the source command.

The path for the folder is:

When found with the "info about the folder":
/Users/gban/Desktop/Methylation analysis/MethylationDataAnalysis

But I think it should be:
/Users/gban/Desktop/Methylation 
analysis/MethylationDataAnalysis/1.Normalization_raw_data/
as it is the folder 1.Normalization_raw_data where my data are in.

I have tried both paths but it returns with this message every time:

> source('/Users/gban/Desktop/Methylation 
> analysis/MethylationDataAnalysis/1.Normalization_raw_data/pipelineIlluminaMethylation.main.R')
Fejl i file(filename, "r", encoding = encoding) : 
  cannot open the connection
In addition: Advarselsbesked:
In file(filename, "r", encoding = encoding) :
  cannot open file '/Users/gban/Desktop/Methylation 
analysis/MethylationDataAnalysis/1. 
450K_pipeline_Nov2012release./SRC/loadMethylumi2.R': No such file or directory


Do you have any idea why? 
I know this is probably more a question of how to type the path on a mac, than 
it is a question of how R works.

But I am really frustrated about this, and thought it might be possible for you 
to help?

Thank you in advance, and again sorry for the quite stupid question.

Best

Gitte Brinch Andersen

E-mail: gitt...@hum-gen.au.dk

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] matching observations and ranking

2013-04-24 Thread arun
Hi,
Maybe this helps:
As you wanted to match only from row3 onwards to row2, the corresponding values 
on row1 and row2 were set to NA.
dat1<- read.table(text="
  S.No AB001A AB0002A AB362
   P1   -/-    C/C   A/A   
    P2   C/C    C/C   A/A   
    3   C/C    C/C   A/A   
    4   C/C    C/C   A/A   
    5   C/C    C/C   A/A   
    6   C/C    C/C   A/A   
    7   C/C    C/C   A/A   
    8   -/-    -/-   -/-   
    9   C/C    C/C   A/A   
    10  C/C    C/C   A/A   
    11  -/-    C/C   A/A   
    12  C/C    C/C   A/A   
    13  C/C    C/C   A/A   
    14  C/C    C/C   A/A   
    15  C/C    -/-   A/A   
    16   -/-    C/C   A/A   
    17   A/A    A/C   A/A   
    18  C/A    A/A   A/A
",sep="",header=TRUE,stringsAsFactors=FALSE)
dat2<-cbind(dat1,(1*mapply("==",dat1[,-1],dat1[2,-1])))
names(dat2)[duplicated(names(dat2))]<- 
paste0(names(dat2)[duplicated(names(dat2))],"_1")
library(plyr)
 dat3<-mutate(dat2,SUM=rowSums(cbind(AB001A_1,AB0002A_1,AB362_1)), 
MATCH=(SUM/3)*100)
 dat3[1:2,5:9]<-NA
res<-mutate(dat3,RANK=rank(MATCH,ties.method="min"))
 head(res)
#  S.No AB001A AB0002A AB362 AB001A_1 AB0002A_1 AB362_1 SUM MATCH RANK
#1   P1    -/- C/C   A/A   NA    NA  NA  NA    NA   17
#2   P2    C/C C/C   A/A   NA    NA  NA  NA    NA   18
#3    3    C/C C/C   A/A    1 1   1   3   100    7
#4    4    C/C C/C   A/A    1 1   1   3   100    7
#5    5    C/C C/C   A/A    1 1   1   3   100    7
#6    6    C/C C/C   A/A    1 1   1   3   100    7
A.K.


>Hi Arun, 
>Thank you very much for your help in solving my problem, 
>S. No   AB001A  AB0002A AB362   AB001A    AB0002A     AB362   SUM %Match  Rank 
 >   P1   -/-        C/C   A/A                         
  > P 2   C/C        C/C   A/A                         
  >  3   C/C        C/C   A/A                         
  >  4   C/C        C/C   A/A                         
  >  5   C/C        C/C   A/A                         
   > 6   C/C        C/C   A/A                         
   > 7   C/C        C/C   A/A                         
   > 8   -/-        -/-   -/-                         
   > 9   C/C        C/C   A/A                         
   >10  C/C        C/C   A/A                         
   > 11  -/-        C/C   A/A                         
   > 12  C/C        C/C   A/A                         
   > 13  C/C        C/C   A/A                         
   > 14  C/C        C/C   A/A                         
    >16  C/C        -/-   A/A                         
>Actually i want to match observation from 3 to 16 with the value in 
>p2 (i.e 3 with p2, 4 with p2, 5 with p2 etc), if they match i would like 
>to give value 1 and store it in the corresponding dummy variable, i.e. 
>AB001A, and i would like to do the same thing for the remaining vars too, 
>storing in their dummy vars. Finally i want to make a sum of all the matched 
>(i.e. 1 score) in each row and calculate the percentage of match and then 
>rank. This is what i want, sorry for not expressing my problem exactly in an 
>understandable way. 




>Hi to all bloggers, 
 >my data looks like this, 
>
>S. No   AB001A  AB0002A AB362   VAR1    VAR2    VAR3    SUM %Match  Rank 
 >  1   -/-        C/C   A/A                         
 >   2   C/C        C/C   A/A                         
  >  3   C/C        C/C   A/A                         
  >  4   C/C        C/C   A/A                         
   > 5   C/C        C/C   A/A                         
  >  6   C/C        C/C   A/A                         
   > 7   C/C        C/C   A/A                         
   > 8   -/-        -/-   -/-                         
   > 9   C/C        C/C   A/A                         
   > 10  C/C        C/C   A/A                         
   > 11  -/-        C/C   A/A                         
   > 12  C/C        C/C   A/A                         
   > 13  C/C        C/C   A/A                         
   > 14  C/C        C/C   A/A                         
   > 16  C/C        -/-   A/A                         
   > 17   -/-        C/C   A/A                         
   > 18   C/C        C/C   A/A                         
   > 19  C/C        C/C   A/A                         
>I want to match obs 3 with obs 2; if it exactly matched then the score 
>will be 1, else 0, and that will be stored in var1 for AB001a, in var2 for 
>ab0002a and in var3 for ab362, and i want to calculate the sum of all the 1's 
>and the observation match percent and their rank (top ten matchers). I did 
>this successfully in excel but it took me a lot of time; i used an if 
>condition in excel like (=if(A3=A$2,1,0) and

[R] about the loglm and glm---Re:Re: Regression on stratified count data

2013-04-24 Thread meng
Hi,Achim:


Can all the analysis you mentioned via loglm be performed via 
glm(...,family=poisson)?
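
(A minimal sketch of the correspondence, for reference, reusing the tab object 
built in the quoted reply below; the residual deviance of each glm fit should 
match the LR statistic of the matching loglm model:)

df <- as.data.frame(tab)   # flatten the table back to counts: age, drug, case, Freq
## conditional independence of drug and case given age, cf. loglm(~ age/(drug + case))
g1 <- glm(Freq ~ age*drug + age*case, family = poisson, data = df)
## no three-way interaction, cf. loglm(~ (age + drug + case)^2)
g2 <- glm(Freq ~ (age + drug + case)^2, family = poisson, data = df)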


Many thanks.










At 2013-04-24 19:37:10,"Achim Zeileis"  wrote:
>On Wed, 24 Apr 2013, meng wrote:
>
>> Hi all:
>> For stratified count data,how to perform regression analysis?
>>
>> My data:
>> age case oc count
>> 1   1 121
>> 1   1 226
>> 1   2 117
>> 1   2 259
>> 2   1 118
>> 2   1 288
>> 2   2 1 7
>> 2   2 2   95
>>
>> age:
>> 1:<40y
>> 2:>40y
>>
>> case:
>> 1:patient
>> 2:health
>>
>> oc:
>> 1:use drug
>> 2:not use drug
>>
>> My purpose:
>> Anaysis whether case and oc are correlated, and age is a stratified variable.
>>
>> My solution:
>> 1,Mantel-Haenszel test by using function "mantelhaen.test"
>> 2,loglinear regression by using function 
>> glm(count~case*oc,family=poisson).But I don't know how to handle variable 
>> "age",which is the stratified variable.
>
>Instead of using glm(family = poisson) it is also convenient to use 
>loglm() from package MASS for the associated contingency table.
>
>The code below shows how to set up the contingency table, visualize it 
>using package vcd, and then fit two models using loglm. The models 
>considered are conditional independence of case and drug given age and the 
>"no three-way interaction" already suggested by Peter. Both models are 
>also accompanied by visualizations of the residuals.
>
>## contingency table with nice labels
>tab <- expand.grid(drug = 1:2, case = 1:2, age = 1:2)
>tab$count <- c(21, 26, 17, 59, 18, 88, 7, 95)
>tab$age <- factor(tab$age, levels = 1:2, labels = c("<40", ">40"))
>tab$case <- factor(tab$case, levels = 1:2, labels = c("patient", 
>"healthy"))
>tab$drug <- factor(tab$drug, levels = 1:2, labels = c("yes", "no"))
>tab <- xtabs(count ~ age + drug + case, data = tab)
>
>## visualize case explained by drug given age
>library("vcd")
>mosaic(case ~ drug | age, data = tab,
>   split_vertical = c(TRUE, TRUE, FALSE))
>
>## test wheter drug and case are independent given age
>m1 <- loglm(~ age/(drug + case), data = tab)
>m1
>
>## visualize corresponding residuals from independence model
>mosaic(case ~ drug | age, data = tab,
>   split_vertical = c(TRUE, TRUE, FALSE),
>   residuals_type = "deviance",
>   gp = shading_hcl, gp_args = list(interpolate = 1.2))
>mosaic(case ~ drug | age, data = tab,
>   split_vertical = c(TRUE, TRUE, FALSE),
>   residuals_type = "pearson",
>   gp = shading_hcl, gp_args = list(interpolate = 1.2))
>
>## test whether there is no three-way interaction
>## (i.e., dependence of case on drug is the same for both age groups)
>m2 <- loglm(~ (age + drug + case)^2, data = tab)
>m2
>
>## visualization (with default pearson residuals)
>mosaic(case ~ drug | age, data = tab,
>   expected = ~ (age + drug + case)^2,
>   split_vertical = c(TRUE, TRUE, FALSE),
>   gp = shading_hcl, gp_args = list(interpolate = 1.2))
>
>
>> Many thanks for your help.
>>
>> My best.
>>  [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R Interactive Mode

2013-04-24 Thread Hrachya Astsatryan
Dear all,

We are doing some research on the time series analysis of NDVI, and 
we found the ndvits package, which is a great tool.
Unfortunately, when we run it, TimeSeriesAnalysis asks us to enter 
"Village or Country".

library("ndvits", lib.loc="/home/vahe/R/i686-pc-linux-gnu-library/2.15")
ndvidirectory=paste(system.file("extdata/VITO_Mzimba",
 package="ndvits"), "/", sep="")
region="Mzimba"
Ystart=2004
Yend=2006
shape="SLP_Mzimba"
shapedir=paste(system.file("extdata/shape", package="ndvits"),
"/", sep="")
outfile = "mzimbaTS2.txt"
outfile2 = "MzimbaTS2.pdf"
outfiel3 = "my.pdf"
signal = TimeSeriesAnalysis(shape, shapedir, ndvidirectory, region, 
Ystart, Yend, outfile, outfile2)


How is it possible to call the function with the Village option set by 
default? We don't want to enter the parameter interactively, and we don't want 
to change anything in the code, which would be quite difficult.

Thank you in advance,
Hrach
-- 
Hrachya Astsatryan
Head of HPC Laboratory,
Institute for Informatics and Automation Problems,
National Academy of Sciences of the Republic of Armenia
1, P. Sevak str., Yerevan 0014, Armenia
t: 374 10 284780
f: 374 10 285812
e: hr...@sci.am
skype: tighra

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sum up column values according to row id

2013-04-24 Thread Matteo Mura
Dear All,

here is a problem I think many of you can solve in a few minutes.

I have a dataframe which contains values of plot id, diameters, heights and
basal area of trees, thus the column names are: id | dbh | h | g

head(ipso, n=10)
        id dbh     h          g
1  FPE0164  36 13.62 0.10178760
2  FPE0164  31 12.70 0.07547676
21 FPE1127  57 18.85 0.25517586
13 FPE1127  39 15.54 0.11945906
12 FPE1127  34 14.78 0.09079203
6  FPE1127  32 15.12 0.08042477
5  FPE1127  28 14.13 0.06157522
15 FPE1127  27 13.50 0.05725553
19 FPE1127  25 13.28 0.04908739
11 FPE1127  19 11.54 0.02835287

from here I need to calculate the mean of the six greatest g_ith for
each id_ith. The clauses are that:

if length(id) >=6

do the mean of the first six greatest g


else
do the mean of all the g_ith in the id_ith (in head print above e.g.
for the id==FPE0164 do the mean of just these two values of g).

The g are already ordered by id ascending and g descending using:

ipso <- ipso[with(ipso, order(ipso$id, -ipso$g)), ] # Order for id
ascending and g descending

I tried a lot of for loops and tapply() without results.

Can anyone help me to solve this?

Thanks for your attention

Best wishes

Matteo

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RDA permutest envfit

2013-04-24 Thread Rémi Lesmerises
Dear all,

I did an RDA, and when I looked at the significance of the test with permutest, 
the output was non-significant. But when I used the envfit function, some of 
the vectors were significant. All of the test's conditions are respected. What 
does this mean? Is it an error in the script?


Commands and output:

> permutest(rda.ind, perm=999, first=TRUE)

Permutation test for rda 

Call: rda(formula = x ~ ARH_frs + CON_frs + PRP_frs + RUI_frs + VAM_frs, data = 
z)
Permutation test for first constrained eigenvalue
Pseudo-F:        4.093568 (with 1, 10 Degrees of Freedom)
Significance:    0.413 
Based on 999 permutations under reduced model.

> fit <- envfit(rda.ind, z, perm = 999, display = "lc")
> fit

***VECTORS

             RDA1      RDA2     r2 Pr(>r)    
VAM_frs  0.145281 -0.989390 0.2388  0.147    

ARH_frs -0.876494 -0.481413 0.6127  0.002 ** 
CON_frs  0.904278  0.426944 0.4846  0.013 *  
PRP_frs -0.997634  0.068755 0.9433  0.001 ***
RUI_frs -0.648512 -0.761204 0.6243  0.004 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
P values based on 999 permutations.

 
Rémi Lesmerises, biol. M.Sc.,
Candidat Ph.D. en Biologie
Université du Québec à Rimouski
remilesmeri...@yahoo.ca

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [r] Problems with Fortran calls when loaded a dll compiled with gfortran-4 Cygwin 4.5.3

2013-04-24 Thread Jens Olofsson
Dear users of R
I have a subroutine in Fortran95, compiled to a DLL with gfortran in Cygwin
4.5.3.
The subroutine is:
subroutine MyPBP( S, p, N )
! Expose subroutine rtest to users of this DLL
!DEC$ ATTRIBUTES DLLEXPORT, C, REFERENCE, ALIAS: "mypbp_" ::mypbp
! This function computes the Poisson-Binomial distribution
! of size N using p
double precision, intent(inout) :: S(N+1)
double precision, intent(in) :: p(N)
integer, intent(in) :: N
double precision :: X(N+1)
integer i, j
!X=0
!S=0
X(1) = 1 - p(1)
X(2) = p(1)
do i = 2, N
S(1) = X(1)*(1-p(i))
do j = 2,i
S(j) = X(j-1)*p(i) + X(j)*(1-p(i))
end do
S(i+1) = X(i)*p(i)
X = S
if (i == N) then
S = X
end if
end do
end subroutine MyPBP
and it is saved into Mango.f95
I compile it from the bash shell using: gfortran-4 c- Mango.f95 and
gfortran-4 -shared -o Mango.dll Mango.o
I am on a Windows machine running Windows 7 with Intel i7.
I load the dll in a 32-bit R by dyn.load("Mango.dll"). Using getLoadedDLLs
I can see the DLL. However, is.loaded("Mango.dll") = FALSE. In addition, R
stop responding when I try .Fortran("MyPBP", as.numeric(S), as.numeric(p),
as.integer(N)),
where N<-5, S<-array(0,N+1) and p<- c(0.1, 0.2, 0.5, 0.8, 0.9).

What am I doing wrong?
Any ideas, thoughts and/or comments are highly appreciated.

Jens

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Trouble Computing Type III SS in a Cox Regression using drop1 and Anova

2013-04-24 Thread Paul Miller
Hello All,

Am having some trouble computing Type III SS in a Cox Regression using either 
drop1 or Anova from the car package. Am hoping that people will take a look to 
see if they can tell what's going on.

Here is my R code:

cox3grp <- subset(survData,
Treatment %in% c("DC", "DA", "DO"),
c("PTNO", "Treatment", "PFS_CENSORED", "PFS_MONTHS", "AGE", "PS2"))
cox3grp <- droplevels(cox3grp)
str(cox3grp)

coxCV <- coxph(Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, data=cox3grp, 
method = "efron")
coxCV

drop1(coxCV, test="Chisq")

require(car)
Anova(coxCV, type="III") 

And here are my results:

cox3grp <- subset(survData,
+   Treatment %in% c("DC", "DA", "DO"),
+   c("PTNO", "Treatment", "PFS_CENSORED", "PFS_MONTHS", "AGE", 
"PS2"))
> cox3grp <- droplevels(cox3grp)
> str(cox3grp)
'data.frame':   227 obs. of  6 variables:
 $ PTNO: int  1195997 104625 106646 1277507 220506 525343 789119 817160 
824224 82632 ...
 $ Treatment   : Factor w/ 3 levels "DC","DA","DO": 1 1 1 1 1 1 1 1 1 1 ...
 $ PFS_CENSORED: int  1 1 1 0 1 1 1 1 0 1 ...
 $ PFS_MONTHS  : num  1.12 8.16 6.08 1.35 9.54 ...
 $ AGE : num  72 71 80 65 72 60 63 61 71 70 ...
 $ PS2 : Ord.factor w/ 2 levels "Yes"<"No": 2 2 2 2 2 2 2 2 2 2 ...
> 
> coxCV <- coxph(Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, data=cox3grp, 
> method = "efron")
> coxCV
Call:
coxph(formula = Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, 
data = cox3grp, method = "efron")


  coef exp(coef) se(coef)  z p
AGE0.00492 1.005  0.00789  0.624 0.530
PS2.L -0.34523 0.708  0.14315 -2.412 0.016

Likelihood ratio test=5.66  on 2 df, p=0.0591  n= 227, number of events= 198 
> 
> drop1(coxCV, test="Chisq")
Single term deletions

Model:
Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2
       Df    AIC    LRT Pr(>Chi)  
<none>    1755.2                  
AGE     1 1753.6 0.3915  0.53151  
PS2     1 1758.4 5.2364  0.02212 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
> 
> require(car)
> Anova(coxCV,  type="III")
Analysis of Deviance Table (Type III tests)
    LR Chisq Df Pr(>Chisq)  
AGE   0.3915  1    0.53151  
PS2   5.2364  1    0.02212 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
> 

Both drop1 and Anova give me a different p-value than I get from coxph, for both 
my two-level PS2 variable and for age. This is not what I would expect based on 
experience using SAS to conduct similar analyses. Indeed, SAS consistently 
produces the same p-values, namely the ones I get from coxph. 

My sense is that I'm probably misusing R in some way, but I'm not sure what I'm 
likely to be doing wrong. SAS produces Wald chi-square results for its Type III 
tests; maybe that has something to do with it. Ideally, I'd like to get Type 
III values that match those from coxph. If anyone could help me understand 
better, that would be greatly appreciated.
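
For what it's worth, the p-values printed by coxph() are Wald z-tests, while drop1()
and car::Anova() report likelihood-ratio tests by default, so the two sets of numbers
need not agree. A hedged sketch of how Wald-type Type III tests (closer to the SAS
default) could be requested, assuming the installed car version supports the
test.statistic argument for coxph fits:

## sketch, using the coxCV fit from above
library(car)
Anova(coxCV, type = "III", test.statistic = "Wald")  # Wald chi-square, as in SAS
Anova(coxCV, type = "III", test.statistic = "LR")    # likelihood ratio (the default)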

Thanks,

Paul

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify object that causes "Error in loadNamespace(name) : there is no package called ‘R.utils’"

2013-04-24 Thread Duncan Murdoch

On 13-04-24 11:08 AM, Liviu Andronic wrote:

Dear Duncan,


On Wed, Apr 24, 2013 at 3:03 PM, Duncan Murdoch
 wrote:

A better approach is to *never* save and load workspaces unless you know
exactly what is in them.  Always reply "no" to the question about saving
your workspace (or set that as the default).  If you accidentally end up
with a workspace being loaded, delete it.


I must admit that I'm a bit surprised by this. I was always under the
impression that saving/restoring workspaces was the proper workflow in
R. If you use R interactively (e.g., not by running scripts), how else
would you store your data, intermediary results, etc., while working
on a project? Am I missing something?


I think Michael and Hadley have already given good advice.  I'll just 
add one thing:  If you use R interactively, you should still be running 
scripts, just running them within an interactive session.  That way you 
have a record of what you did, and can back up (by starting over and 
running everything up to a known good spot) to make corrections.
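
A one-line sketch of that workflow (the script name is a placeholder):

## keep the analysis in a script and replay it inside the interactive session
source("analysis.R", echo = TRUE)   # echo = TRUE prints each command as it runs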


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need to replicate Boltzman Signmodial Curve fit from Graph Pad

2013-04-24 Thread David Carlson
Sorry. I left out a line:

> pHvals <- seq(min(dta$pH), max(dta$pH), length.out=100)

Should be inserted just before the lines() function.

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

From: J J [mailto:rnoob...@gmail.com] 
Sent: Tuesday, April 23, 2013 6:49 PM
To: David Carlson
Subject: Re: [R] Need to replicate Boltzman Signmodial Curve fit from Graph
Pad

Thank you David! That looks really close, I believe Graphpad gave a V50 of
 7.244.

One thing I missed on your example was where you obtained 'pHvals' from?


On Tue, Apr 23, 2013 at 5:17 PM, David Carlson  wrote:
You may want to compare R's four point logistic with the results you get
using Graphpad. For these data the Bottom parameter is not significantly
different from zero (so the three point sigmoid might be adequate), but as
Bert points out, especially with only 8 data points, it is probably
overfitting the model.

> dta <- read.table(text="pH counts
+ 3.8 968
+ 5.0 1347
+ 5.8 2867
+ 6.6 9203
+ 7.0 15817
+ 7.4 20297
+ 8.2 31916
+ 9.2 35756",
+ header=TRUE)
> Sig.nls <- nls(counts~SSfpl(pH, Bottom, Top, V50, Slope), dta)
> summary(Sig.nls)

Formula: counts ~ SSfpl(pH, Bottom, Top, V50, Slope)

Parameters:
          Estimate  Std. Error t value     Pr(>|t|)
Bottom   718.76871   579.25327   1.241     0.282463
Top    36983.98871  1008.31727  36.679 0.032986 ***
V50        7.24930     0.05054 143.436 0.000142 ***
Slope      0.55582     0.05023  11.065     0.000379 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 767.1 on 4 degrees of freedom

Number of iterations to convergence: 0
Achieved convergence tolerance: 0.03808
> options(scipen=5)
> confint(Sig.nls)
Waiting for profiling to be done...
                2.5%         97.5%
Bottom  -963.4106068  2243.7154750
Top    34499.9962176 40150.4114887
V50        7.1173103     7.4033391
Slope      0.4382893     0.7114438
> coef(Sig.nls)
       Bottom           Top           V50         Slope
  718.7687104 36983.9887074     7.2493032     0.5558236
> plot(dta, pch=16)
> lines(pHvals, predict(Sig.nls, data.frame(pH=pHvals)), col="red")

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Bert Gunter
Sent: Tuesday, April 23, 2013 12:04 PM
To: John Kane
Cc: r-help@r-project.org; J J
Subject: Re: [R] Need to replicate Boltzman Signmodial Curve fit from Graph
Pad

I think you'll find that this is just an alternate parameterization of a 4 P
logistic.
... and in any case there is no statistically detectable difference between
the two.

That is, this is just basically a Graphpad marketing ploy. Fit such data
with anyone's logistic function ... and that is probably overfitting in many
cases, anyway.

Cheers,
Bert

On Tue, Apr 23, 2013 at 9:54 AM, John Kane  wrote:
> No idea of the area but does this link help?
> http://bioinformatics.oxfordjournals.org/content/24/13/1549.full
>
> John Kane
> Kingston ON Canada
>
>
>> -Original Message-
>> From: rnoob...@gmail.com
>> Sent: Tue, 23 Apr 2013 10:55:59 -0400
>> To: r-help@r-project.org
>> Subject: [R] Need to replicate Boltzman Signmodial Curve fit from
>> Graph Pad
>>
>> Hello useRs (please don't kill me),
>>
>> I'm fairly new to R, having only a few months of playing around with it.
>> What little I've learned has been extremely useful.
>>
>> If someone could point me to how to replicate the Boltzmann
>> sigmoidal curve fit as provided by the Graphpad software, I'd be eternally
>> grateful.
>>
>> Where we currently use Graphpad for only this one function, it seems
>> highly inefficient for a $100 piece of software to be used for only
>> one function (which isn't nearly as bad as the company's pandemic
>> reliance on Excel for nearly everything else).
>>
>> so anyway, what I'd normally do is take a set of data like this:
>>
>>  pH counts
>> 3.8     968
>> 5.0    1347
>> 5.8    2867
>> 6.6    9203
>> 7.0   15817
>> 7.4   20297
>> 8.2   31916
>> 9.2   35756
>> then fit this to a Boltzman sigmodial in Graphpad. Graphpad spits out
>> a much longer set of vectors and also information about v50 and
>> confidence intervals etc (if anyone is familiar with that software).
>> This is what i'd like to replicate.
>>
>> It might be that I just don't know what i'm doing,so feel free to
>> call me an idiot but any help is greatly appreciated!
>>
>> -JJ
>>
>>       [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> 

Re: [R] identify object that causes "Error in loadNamespace(name) : there is no package called ‘R.utils’"

2013-04-24 Thread John Kane

> -Original Message-
> From: h.wick...@gmail.com
> Sent: Wed, 24 Apr 2013 10:48:01 -0500
> To: landronim...@gmail.com
> Subject: Re: [R] identify object that causes "Error in
> loadNamespace(name) : there is no package called ‘R.utils’"
> 
>> I must admit that I'm a bit surprised by this. I was always under the
>> impression that saving/restoring workspaces was the proper workflow in
>> R. If you use R interactively (e.g., not by running scripts), how else
>> would you store your data, intermediary results, etc., while working
>> on a project? Am I missing something?
> 
> You don't _store_ intermediate results - you recreate them from your
> script.  If they are time consuming, then you can use readRDS/saveRDS
> to cache the results. If you don't start with a clean workspace
> frequently, it's difficult to tell whether or not your code is
> reproducible.
> 
> Hadley
> 
> --
> Chief Scientist, RStudio
> http://had.co.nz/

Not to mention http://www.mail-archive.com/r-help@r-project.org/msg196997.html

John Kane
Kingston ON Canada



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] the joy of spreadsheets (off-topic)

2013-04-24 Thread peter dalgaard
In case you haven't noticed, this is making the rounds in the media, including 
a handful of references to R. See e.g.

http://news.slashdot.org/story/13/04/17/0215211/excel-error-contributes-to-problems-with-austerity-study

I suppose we can't fortune()'ify anonymous quotes, but I kind of like this 
exchange:

"Bacon Bits": "SPSS and R are very good at statistical analysis. Quantrix, 
MapleSoft, IBM Algorithmics, and other software is for financial data modeling. 
None of those is particularly appropriate for sharing data in a useful format 
with peers. Excel is."

"Hatta": "R is extremely appropriate for sharing data in a useful format with 
peers. It's completely free for one. But more importantly, it saves every 
single step of your analysis. Send someone an Excel file, and who knows what 
they've done to the data. Send someone your R project directory and they can 
see exactly what you did.

The problem with sending R files to your peers isn't that the R files aren't 
useful. It's that your peers aren't."



 
On Apr 16, 2013, at 19:25 , Sarah Goslee wrote:

> Given that we occasionally run into problems with comparing Excel
> results to R results, and other spreadsheet-induced errors, I thought
> this might be of interest.
> 
> http://www.nextnewdeal.net/rortybomb/researchers-finally-replicated-reinhart-rogoff-and-there-are-serious-problems
> 
> The punchline:
> 
> "If this error turns out to be an actual mistake Reinhart-Rogoff made,
> well, all I can hope is that future historians note that one of the
> core empirical points providing the intellectual foundation for the
> global move to austerity in the early 2010s was based on someone
> accidentally not updating a row formula in Excel."
> 
> Ouch.
> 
> (Note: I know nothing about the site, the author of the article, or
> the study in question. I was pointed to it by someone else. But if
> true: highly problematic.)
> 
> Sarah
> 
> -- 
> Sarah Goslee
> http://www.functionaldiversity.org
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify object that causes "Error in loadNamespace(name) : there is no package called ‘R.utils’"

2013-04-24 Thread Hadley Wickham
> I must admit that I'm a bit surprised by this. I was always under the
> impression that saving/restoring workspaces was the proper workflow in
> R. If you use R interactively (e.g., not by running scripts), how else
> would you store your data, intermediary results, etc., while working
> on a project? Am I missing something?

You don't _store_ intermediate results - you recreate them from your
script.  If they are time consuming, then you can use readRDS/saveRDS
to cache the results. If you don't start with a clean workspace
frequently, it's difficult to tell whether or not your code is
reproducible.
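
A small sketch of that caching pattern (the file name and model are placeholders):

cache_file <- "fit_cache.rds"
if (file.exists(cache_file)) {
  fit <- readRDS(cache_file)                 # reuse the cached result
} else {
  fit <- lm(mpg ~ wt + hp, data = mtcars)    # stand-in for a slow computation
  saveRDS(fit, cache_file)                   # cache it for the next run
}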

Hadley

--
Chief Scientist, RStudio
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with raster function with values lower than -9999

2013-04-24 Thread Fabio Berzaghi
As I have reported in a previous thread, I have been having problems 
importing an ASCII raster whose values go from Min.: -69826220 
to Max.: 167780500. The problem I am encountering is that when I use 
the raster function to import the ASCII file, every value smaller 
than -9999 is reported as NA and the minimum value is -9458.


Is this a bug of the function and is there a workaround? When I import 
the same ASCII file as a data frame everything is fine and I get the 
whole range of values.


Any help is appreciated
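
One possible workaround, if I read the raster documentation correctly: for
floating-point grids the package treats every value smaller than or equal to
NAvalue() (usually the NODATA_value in the ASCII header, here presumably -9999)
as NA, which would explain exactly this behaviour. Resetting the NA flag to a
very small number before reading the cell values may recover the full range
(the file name below is hypothetical):

library(raster)
r <- raster("mygrid.asc")     # hypothetical file name
NAvalue(r)                    # likely -9999 here
NAvalue(r) <- -3.4e38         # only treat this extreme value as NA on read
summary(values(r))            # should now include values below -9999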

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify object that causes "Error in loadNamespace(name) : there is no package called ‘R.utils’"

2013-04-24 Thread R. Michael Weylandt
On Wed, Apr 24, 2013 at 4:08 PM, Liviu Andronic  wrote:
> Dear Duncan,
>
>
> On Wed, Apr 24, 2013 at 3:03 PM, Duncan Murdoch
>  wrote:
>> A better approach is to *never* save and load workspaces unless you know
>> exactly what is in them.  Always reply "no" to the question about saving
>> your workspace (or set that as the default).  If you accidentally end up
>> with a workspace being loaded, delete it.
>>
> I must admit that I'm a bit surprised by this. I was always under the
> impression that saving/restoring workspaces was the proper workflow in
> R. If you use R interactively (e.g., not by running scripts), how else
> would you store your data, intermediary results, etc., while working
> on a project? Am I missing something?

I save the objects themselves (rather than the whole workspace) using
saveRDS() and readRDS() -- which I *think* are considered better
practice than save() and load() because they don't force names on you.
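
A small illustration of the difference (object and file names are arbitrary):

results <- data.frame(x = 1:3, y = rnorm(3))

save(results, file = "results.RData")
load("results.RData")             # silently recreates an object named `results`

saveRDS(results, "results.rds")
res2 <- readRDS("results.rds")    # the caller chooses the name on read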

Cheers,
MW

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify object that causes "Error in loadNamespace(name) : there is no package called ‘R.utils’"

2013-04-24 Thread Liviu Andronic
Dear Duncan,


On Wed, Apr 24, 2013 at 3:03 PM, Duncan Murdoch
 wrote:
> A better approach is to *never* save and load workspaces unless you know
> exactly what is in them.  Always reply "no" to the question about saving
> your workspace (or set that as the default).  If you accidentally end up
> with a workspace being loaded, delete it.
>
I must admit that I'm a bit surprised by this. I was always under the
impression that saving/restoring workspaces was the proper workflow in
R. If you use R interactively (e.g., not by running scripts), how else
would you store your data, intermediary results, etc., while working
on a project? Am I missing something?

Regards,
Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify object that causes "Error in loadNamespace(name) : there is no package called ‘R.utils’"

2013-04-24 Thread Liviu Andronic
Dear Duncan,


On Wed, Apr 24, 2013 at 4:57 PM, Duncan Murdoch
 wrote:
> What I've done sometimes in debugging is to change that error to a warning
> in the getNamespace() function, and add some tracing code to the
> serialization code to print the names of objects as they are loaded. (This
> goes in ReadItem in src/main/serialize.c.)
>
> I wouldn't expect Liviu to make those changes, but perhaps a "verbose"
> option could be added to load(), so that it could be available to users.
>
That would be useful, indeed. As it stands save()/load() workspaces in
R seems very fragile and could easily trip users when working on
multiple machines or sharing their workspace with a colleague. In such
cases it is important to be able to quickly pinpoint the offending
object.

Regards,
Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify object that causes "Error in loadNamespace(name) : there is no package called ‘R.utils’"

2013-04-24 Thread Duncan Murdoch

On 13-04-24 10:12 AM, Martin Morgan wrote:

On 04/24/2013 06:03 AM, Duncan Murdoch wrote:

On 13-04-24 5:46 AM, Liviu Andronic wrote:

Dear all,
I've bumped into the: "Error in loadNamespace(name) : there is no
package called ‘R.utils’" error. I've already read a bit on this (
http://www.cybaea.net/Blogs/Data/A-warning-on-the-R-save-format.html )
but I have a follow-up question.

Given a workspace that automatically loads a package that I don't
really need/want (e.g. ‘R.utils’), how do I identify which object
requires this package to load? I would like to avoid loading ‘R.utils’
every time I open an R session.


That's not easy, because the code in R that triggers that error has no idea of
the name of the object it is loading.


Maybe traceback() can provide some hints? I did, more or less arbitrarily

library(rms)
a = list(fun=ie.setup)
save(a, file="/tmp/a.rda")
remove.packages("rms")

and then in a new session

  > load("/tmp/a.rda")
Error in loadNamespace(name) : there is no package called 'rms'
  > traceback()
7: stop(e)
6: value[[3L]](cond)
5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
4: tryCatchList(expr, classes, parentenv, handlers)
3: tryCatch(loadNamespace(name), error = function(e) stop(e))
2: getNamespace(c("rms", "3.6-3"))
1: load("/tmp/a.rda")

with the line numbered 2 giving me the necessary hint.


That tells you that some object needs the rms package, but I don't see 
how you would know it is "a" that is the problem.  We already knew that 
rms was needed from the error message.


What I've done sometimes in debugging is to change that error to a 
warning in the getNamespace() function, and add some tracing code to the 
serialization code to print the names of objects as they are loaded. 
(This goes in ReadItem in src/main/serialize.c.)


I wouldn't expect Liviu to make those changes, but perhaps a "verbose" 
option could be added to load(), so that it could be available to users.


Duncan Murdoch



Martin




You could try a binary search to find out, but it will be tedious:
1. Install R.utils.
2. Load the workspace successfully.
3. Delete half the objects, and save it.
4. Uninstall R.utils, and see if you can load the workspace.

At this point you'll know if there's an object needing R.utils still left or
not, and you can repeat the steps until you find a single object that causes the
problem.  (But it might not be the only one, so deleting it from the original
workspace might not solve your problem.)

A better approach is to *never* save and load workspaces unless you know exactly
what is in them.  Always reply "no" to the question about saving your workspace
(or set that as the default).  If you accidentally end up with a workspace being
loaded, delete it.

Duncan Murdoch



Regards,
Liviu



sessionInfo()

R version 2.15.3 (2013-03-01)
Platform: i686-pc-linux-gnu (32-bit)

locale:
   [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
LC_TIME=en_US.UTF-8
   [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
   [7] LC_PAPER=C LC_NAME=C
LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  utils stats
methods   base

other attached packages:
   [1] R.utils_1.23.2R.oo_1.13.0   R.methodsS3_1.4.2
tables_0.7.57 reshape2_1.2.2
   [6] car_2.0-15nnet_7.3-6MASS_7.3-23
Hmisc_3.10-1  survival_2.37-2
[11] foreign_0.8-53

loaded via a namespace (and not attached):
[1] cluster_1.14.3   grid_2.15.3  lattice_0.20-13  plyr_1.8
   rstudio_0.97.312
[6] stringr_0.6.2tools_2.15.3




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to make a raster image in R from my own data set

2013-04-24 Thread Kristi Glover
Hi R-user,
I was trying to make a raster map with the WGS84 projection in R, but I could not 
make it work. I found one data set on Google whose format is almost the same as 
mine. I want to make a raster map of temperature with 1 degree spatial 
resolution at the global scale. 
I could make it in GIS software, but I have many variables (each to become a raster 
image) and ultimately I am importing them into R for further analysis. 
Therefore, I wanted to make them in R, if possible. 

It would be great if you could give some hints on what the script would look like for creating a 
raster map from my own data set (I have provided a link for your reference; this 
is an example data set).

I really appreciate your help.

#--
#create a raster map from scratch

install.packages("raster", dependencies=TRUE)
library(raster)  # raster data
install.packages("rgdal", dependencies=TRUE)
library(rgdal)  # input/output, projections
install.packages("rgeos", dependencies=TRUE)
library(rgeos)  # geometry ops
install.packages("spdep", dependencies=TRUE)
library(spdep)  # spatial dependence
install.packages("pastecs", dependencies=TRUE)
library(pastecs)
pts <- read.table("https://www.betydb.org//miscanthusyield.csv", header=TRUE, 
sep=",")
proj4string(pts) <- CRS("+proj=longlat +datum=WGS84")
#---
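
Note that proj4string() can only be set once the data frame has been turned into a
spatial object (e.g. with coordinates(pts) <- c("lon", "lat"), where the column names
are assumptions). As a rough sketch of one way to get from a table of points to a
1-degree global raster (the lon/lat/temperature columns and the toy data below are
assumptions, not part of the original post), the points can be rasterized onto an
empty WGS84 grid:

library(raster)

## toy stand-in for the real data: points with a temperature value
set.seed(1)
pts <- data.frame(lon = runif(500, -180, 180),
                  lat = runif(500, -90, 90),
                  temperature = rnorm(500, 15, 10))

## empty 1-degree global raster in WGS84
r <- raster(ncols = 360, nrows = 180,
            xmn = -180, xmx = 180, ymn = -90, ymx = 90,
            crs = "+proj=longlat +datum=WGS84")

## average the point values falling into each cell
temp_r <- rasterize(pts[, c("lon", "lat")], r,
                    field = pts$temperature, fun = mean)
plot(temp_r)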

Cheers,
Kristi
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] extract function extracting only NA values

2013-04-24 Thread Fabio Berzaghi

ok so i imported the raster as a dataframe and everything is fine
   Min.   :-69826220
   Max.   :167780500

so it's something with values lower than -9999 that are interpreted as 
NA by the raster function



On 4/24/2013 13:52, Fabio Berzaghi wrote:
So I think I might have found what is causing this problem. The values 
I have in this raster


> summary(ps0011yme)
             layer
Min.        -9458.911
1st Qu.   1955256.000
Median   10618870.000
3rd Qu.  79577730.000
Max.    167780500.000
NA's            0.000

From ArcMap though I get different values
min -69826224 and max 167780496

So maybe when I am importing the raster in R something goes wrong 
and all the values below a certain threshold are considered NA. Is 
this a bug, or do I need to use a specific parameter for the raster 
function?



On 4/24/2013 13:10, Fabio Berzaghi wrote:

Hello,

I have five raster files in ASCII format. With four of them I have no 
problem extracting values based on a set of X and Y coordinates. 
Unfortunately with one of the files all I managed to extract is NA 
values. To verify the problem I have opened the raster with ArcMap 
and there are no NA values where I am extracting. I have also plotted 
the spatial point class on top of the raster in R and it does 
correspond to the correct locations.


These are some of the commands I am using; as I already pointed 
out, they work perfectly with the other raster files.


sp<-SpatialPoints(xysp)
xy$rasterimg<-extract(rasterimg,sp)

Can anyone help? At this point I am rather clueless about this.

Thanks

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] about the loglm and glm---Re:Re: Regression on stratified count data

2013-04-24 Thread Achim Zeileis

On Wed, 24 Apr 2013, meng wrote:


Hi,Achim:
Can all the analysis you mentioned via loglm be performed via
glm(...,family=poisson)?


Yes.

## transform table back to data.frame
df <- as.data.frame(tab)

## fit models: conditional independence, no-three way interaction,
## and saturated
g1 <- glm(Freq ~ age/(drug + case), data = df, family = poisson)
g2 <- glm(Freq ~ (age + drug + case)^2, data = df, family = poisson)
g3 <- glm(Freq ~ age * drug * case, data = df, family = poisson)

## likelihood-ratio tests (against saturated)
anova(g1, g3, test = "Chisq")
anova(g2, g3, test = "Chisq")

## compare fitted frequencies (which are essentially equal)
all.equal(as.numeric(fitted(g1)),
  as.data.frame(as.table(fitted(m1)))$Freq)
all.equal(as.numeric(fitted(g2)),
  as.data.frame(as.table(fitted(m2)))$Freq)

The difference is mainly that loglm() has a specialized user interface and 
that it uses a different optimizer (iterative proportional fitting rather 
than iterative reweighted least squares).


Best,
Z


Many thanks.







At 2013-04-24 19:37:10,"Achim Zeileis"  wrote:
>On Wed, 24 Apr 2013, meng wrote:
>
>> Hi all:
>> For stratified count data,how to perform regression analysis?
>>
>> My data:
>> age case oc count
>> 1   1    1    21
>> 1   1    2    26
>> 1   2    1    17
>> 1   2    2    59
>> 2   1    1    18
>> 2   1    2    88
>> 2   2    1     7
>> 2   2    2    95
>>
>> age:
>> 1:<40y
>> 2:>40y
>>
>> case:
>> 1:patient
>> 2:health
>>
>> oc:
>> 1:use drug
>> 2:not use drug
>>
>> My purpose:
>> Analyse whether case and oc are correlated, with age as a stratification variable.
>>
>> My solution:
>> 1,Mantel-Haenszel test by using function "mantelhaen.test"
>> 2, loglinear regression by using function glm(count~case*oc, family=poisson).
>> But I don't know how to handle variable "age", which is the stratification variable.
>
>Instead of using glm(family = poisson) it is also convenient to use 
>loglm() from package MASS for the associated contingency table.
>
>The code below shows how to set up the contingency table, visualize it 
>using package vcd, and then fit two models using loglm. The models 
>considered are conditional independence of case and drug given age and the 
>"no three-way interaction" already suggested by Peter. Both models are 
>also accompanied by visualizations of the residuals.
>
>## contingency table with nice labels
>tab <- expand.grid(drug = 1:2, case = 1:2, age = 1:2)
>tab$count <- c(21, 26, 17, 59, 18, 88, 7, 95)
>tab$age <- factor(tab$age, levels = 1:2, labels = c("<40", ">40"))
>tab$case <- factor(tab$case, levels = 1:2, labels = c("patient", 
>"healthy"))
>tab$drug <- factor(tab$drug, levels = 1:2, labels = c("yes", "no"))
>tab <- xtabs(count ~ age + drug + case, data = tab)
>
>## visualize case explained by drug given age
>library("vcd")
>mosaic(case ~ drug | age, data = tab,
>   split_vertical = c(TRUE, TRUE, FALSE))
>
>## test whether drug and case are independent given age
>m1 <- loglm(~ age/(drug + case), data = tab)
>m1
>
>## visualize corresponding residuals from independence model
>mosaic(case ~ drug | age, data = tab,
>   split_vertical = c(TRUE, TRUE, FALSE),
>   residuals_type = "deviance",
>   gp = shading_hcl, gp_args = list(interpolate = 1.2))
>mosaic(case ~ drug | age, data = tab,
>   split_vertical = c(TRUE, TRUE, FALSE),
>   residuals_type = "pearson",
>   gp = shading_hcl, gp_args = list(interpolate = 1.2))
>
>## test whether there is no three-way interaction
>## (i.e., dependence of case on drug is the same for both age groups)
>m2 <- loglm(~ (age + drug + case)^2, data = tab)
>m2
>
>## visualization (with default pearson residuals)
>mosaic(case ~ drug | age, data = tab,
>   expected = ~ (age + drug + case)^2,
>   split_vertical = c(TRUE, TRUE, FALSE),
>   gp = shading_hcl, gp_args = list(interpolate = 1.2))
>
>
>> Many thanks for your help.
>>
>> My best.
>>    [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify object that causes "Error in loadNamespace(name) : there is no package called ‘R.utils’"

2013-04-24 Thread Martin Morgan

On 04/24/2013 06:03 AM, Duncan Murdoch wrote:

On 13-04-24 5:46 AM, Liviu Andronic wrote:

Dear all,
I've bumped into the: "Error in loadNamespace(name) : there is no
package called ‘R.utils’" error. I've already read a bit on this (
http://www.cybaea.net/Blogs/Data/A-warning-on-the-R-save-format.html )
but I have a follow-up question.

Given a workspace that automatically loads a package that I don't
really need/want (e.g. ‘R.utils’), how do I identify which object
requires this package to load? I would like to avoid loading ‘R.utils’
every time I open an R session.


That's not easy, because the code in R that triggers that error has no idea of
the name of the object it is loading.


Maybe traceback() can provide some hints? I did, more or less arbitrarily

library(rms)
a = list(fun=ie.setup)
save(a, file="/tmp/a.rda")
remove.packages("rms")

and then in a new session

> load("/tmp/a.rda")
Error in loadNamespace(name) : there is no package called 'rms'
> traceback()
7: stop(e)
6: value[[3L]](cond)
5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
4: tryCatchList(expr, classes, parentenv, handlers)
3: tryCatch(loadNamespace(name), error = function(e) stop(e))
2: getNamespace(c("rms", "3.6-3"))
1: load("/tmp/a.rda")

with the line numbered 2 giving me the necessary hint.

Martin




You could try a binary search to find out, but it will be tedious:
1. Install R.utils.
2. Load the workspace successfully.
3. Delete half the objects, and save it.
4. Uninstall R.utils, and see if you can load the workspace.

At this point you'll know if there's an object needing R.utils still left or
not, and you can repeat the steps until you find a single object that causes the
problem.  (But it might not be the only one, so deleting it from the original
workspace might not solve your problem.)

A better approach is to *never* save and load workspaces unless you know exactly
what is in them.  Always reply "no" to the question about saving your workspace
(or set that as the default).  If you accidentally end up with a workspace being
loaded, delete it.

Duncan Murdoch



Regards,
Liviu



sessionInfo()

R version 2.15.3 (2013-03-01)
Platform: i686-pc-linux-gnu (32-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
LC_TIME=en_US.UTF-8
  [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C LC_NAME=C
LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  utils stats
methods   base

other attached packages:
  [1] R.utils_1.23.2R.oo_1.13.0   R.methodsS3_1.4.2
tables_0.7.57 reshape2_1.2.2
  [6] car_2.0-15nnet_7.3-6MASS_7.3-23
Hmisc_3.10-1  survival_2.37-2
[11] foreign_0.8-53

loaded via a namespace (and not attached):
[1] cluster_1.14.3   grid_2.15.3  lattice_0.20-13  plyr_1.8
  rstudio_0.97.312
[6] stringr_0.6.2tools_2.15.3




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify object that causes "Error in loadNamespace(name) : there is no package called ‘R.utils’"

2013-04-24 Thread Duncan Murdoch

On 13-04-24 5:46 AM, Liviu Andronic wrote:

Dear all,
I've bumped into the: "Error in loadNamespace(name) : there is no
package called ‘R.utils’" error. I've already read a bit on this (
http://www.cybaea.net/Blogs/Data/A-warning-on-the-R-save-format.html )
but I have a follow-up question.

Given a workspace that automatically loads a package that I don't
really need/want (e.g. ‘R.utils’), how do I identify which object
requires this package to load? I would like to avoid loading ‘R.utils’
every time I open an R session.


That's not easy, because the code in R that triggers that error has no 
idea of the name of the object it is loading.


You could try a binary search to find out, but it will be tedious:
1. Install R.utils.
2. Load the workspace successfully.
3. Delete half the objects, and save it.
4. Uninstall R.utils, and see if you can load the workspace.

At this point you'll know if there's an object needing R.utils still 
left or not, and you can repeat the steps until you find a single object 
that causes the problem.  (But it might not be the only one, so deleting 
it from the original workspace might not solve your problem.)
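
A rough sketch of step 3, assuming the full workspace has been loaded (the split
below is arbitrary; narrow it on each pass):

objs <- ls()
half <- objs[seq_len(length(objs) %/% 2)]   # keep the first half of the objects
save(list = half, file = "half_workspace.RData")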


A better approach is to *never* save and load workspaces unless you know 
exactly what is in them.  Always reply "no" to the question about saving 
your workspace (or set that as the default).  If you accidentally end up 
with a workspace being loaded, delete it.


Duncan Murdoch



Regards,
Liviu



sessionInfo()

R version 2.15.3 (2013-03-01)
Platform: i686-pc-linux-gnu (32-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
LC_TIME=en_US.UTF-8
  [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C LC_NAME=C
LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  utils stats
methods   base

other attached packages:
  [1] R.utils_1.23.2R.oo_1.13.0   R.methodsS3_1.4.2
tables_0.7.57 reshape2_1.2.2
  [6] car_2.0-15nnet_7.3-6MASS_7.3-23
Hmisc_3.10-1  survival_2.37-2
[11] foreign_0.8-53

loaded via a namespace (and not attached):
[1] cluster_1.14.3   grid_2.15.3  lattice_0.20-13  plyr_1.8
  rstudio_0.97.312
[6] stringr_0.6.2tools_2.15.3




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] News package

2013-04-24 Thread R. Michael Weylandt
On Wed, Apr 24, 2013 at 9:02 AM, Aswathy Nair  wrote:
> Hi,
>
> Is there any package available in R to download news content?

What news source are you looking for?

You could, e.g., use the twitteR package, but to my knowledge, for
things like Google News or the BBC you'll likely need to roll your own
with the XML and RCurl packages.
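
A rough illustration of the roll-your-own route (the feed URL is only an example,
and the XPath depends on the feed's structure):

library(RCurl)
library(XML)

feed <- getURL("http://feeds.bbci.co.uk/news/rss.xml")   # example RSS feed
doc  <- xmlParse(feed, asText = TRUE)
titles <- xpathSApply(doc, "//item/title", xmlValue)     # headline text
head(titles)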

MW

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bootstrapping a 11x10 matrix

2013-04-24 Thread Adams, Jean
Herbejie Rose,

You could use the boot() function in the R package boot.  For example:

# example data matrix
m <- matrix(sample(11*10), ncol=10)

# function to calculate column means for indexed rows of matrix
myfun <- function(data, i) {
apply(data[i, ], 2, mean)
}

# 1000 bootstrap samples
library(boot)
myboot <- boot(m, myfun, R=1000)
myboot
dim(myboot$t)
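
As a follow-up, myboot$t holds the 1000 x 10 matrix of bootstrap column means, so
per-variable summaries can be taken directly, e.g.

colMeans(myboot$t)                              # bootstrap means of the 10 column means
apply(myboot$t, 2, quantile, c(0.025, 0.975))   # simple percentile intervals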

Jean



On Tue, Apr 23, 2013 at 10:41 AM, Herbejie Rose Cuizon <
herbejier...@gmail.com> wrote:

> Good evening! This is Herbejie Rose
>
> I need your help for me to compute the following:
>
> I just want to ask how to bootstrap an 11x10 matrix to obtain 1000
> bootstrap samples and compute the 10-dimensional mean per bootstrap sample.
> The 11x10 matrix has 11 subjects and 10 variables. I just want an 11x10
> resample with replacement per bootstrap sample, and to compute the
> 10-dimensional mean.
>
>
> Hope to have a positive response on this matter..
>
>
> Thank you so much.
>
>
>
> Sincerely yours,
>
>
> Herbejie Rose
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] extract function extracting only NA values

2013-04-24 Thread Fabio Berzaghi
So I think I might have found what is causing this problem. The values I 
have in this raster


> summary(ps0011yme)
             layer
Min.        -9458.911
1st Qu.   1955256.000
Median   10618870.000
3rd Qu.  79577730.000
Max.    167780500.000
NA's            0.000

From ArcMap though I get different values
min -69826224 and max 167780496

So maybe when I am importing the raster in R something goes wrong and 
all the values below a certain threshold are considered NA. Is this a 
bug, or do I need to use a specific parameter for the raster function?



On 4/24/2013 13:10, Fabio Berzaghi wrote:

Hello,

I have five raster files in ASCII format. With four of them I have no 
problem extracting values based on a set of X and Y coordinates. 
Unfortunately with one of the files all I managed to extract is NA 
values. To verify the problem I have opened the raster with ArcMap and 
there are no NA values where I am extracting. I have also plotted the 
spatial point class on top of the raster in R and it does correspond 
to the correct locations.


These are some of the commands I am using; as I already pointed 
out, they work perfectly with the other raster files.


sp<-SpatialPoints(xysp)
xy$rasterimg<-extract(rasterimg,sp)

Can anyone help? At this point I am rather clueless about this.

Thanks

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Looking for a better code for my problem.

2013-04-24 Thread Jorge I Velez
Try

subset(Dat, AA == "A" | (AA == "B" & BB == "b"))

HTH,
Jorge.-


On Wed, Apr 24, 2013 at 8:21 PM, Christofer Bogaso <
bogaso.christo...@gmail.com> wrote:

> Hello again,
>
> Let say I have following data:
>
> Dat <- structure(list(AA = structure(c(3L, 1L, 2L, 1L, 2L, 3L, 3L, 2L,
> 3L, 1L, 1L, 3L, 3L, 2L, 2L, 3L, 2L, 1L, 1L, 1L), .Label = c("A",
> "B", "C"), class = "factor"), BB = structure(c(2L, 3L, 2L, 2L,
> 2L, 3L, 2L, 2L, 2L, 1L, 1L, 2L, 3L, 1L, 3L, 2L, 1L, 2L, 2L, 3L
> ), .Label = c("a", "b", "c"), class = "factor"), CC = 1:20), .Names =
> c("AA",
> "BB", "CC"), row.names = c(NA, -20L), class = "data.frame")
>
> Now I want to select a subset of that 'Dat', for which:
> 1. First column will contain ALL "A"
> 2. First column will contain those "B" for which "BB = b" in second column.
>
> Therefore I tries following:
>
> > Only_A <- Dat[Dat[, 'AA'] == "A", ]
> > Only_B <- Dat[Dat[, 'AA'] == "B", ]
> > rbind(Only_A, Only_B[Only_B[, 'BB'] == "b", ])
>AA BB CC
> 2   A  c  2
> 4   A  b  4
> 10  A  a 10
> 11  A  a 11
> 18  A  b 18
> 19  A  b 19
> 20  A  c 20
> 3   B  b  3
> 5   B  b  5
> 8   B  b  8
>
>
> However I believe there must be some better code to achieve that which
> is tidier, i.e. there must be some 1-liner code.
>
> Can somebody suggest any better approach if possible?
>
> Thanks and regards,
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regression on stratified count data

2013-04-24 Thread Achim Zeileis

On Wed, 24 Apr 2013, meng wrote:


Hi all:
For stratified count data,how to perform regression analysis?

My data:
age case oc count
1   1    1    21
1   1    2    26
1   2    1    17
1   2    2    59
2   1    1    18
2   1    2    88
2   2    1     7
2   2    2    95

age:
1:<40y
2:>40y

case:
1:patient
2:health

oc:
1:use drug
2:not use drug

My purpose:
Analyse whether case and oc are correlated, with age as a stratification variable.

My solution:
1,Mantel-Haenszel test by using function "mantelhaen.test"
2, loglinear regression by using function glm(count~case*oc, family=poisson). But I don't 
know how to handle variable "age", which is the stratification variable.


Instead of using glm(family = poisson) it is also convenient to use 
loglm() from package MASS for the associated contingency table.


The code below shows how to set up the contingency table, visualize it 
using package vcd, and then fit two models using loglm. The models 
considered are conditional independence of case and drug given age and the 
"no three-way interaction" already suggested by Peter. Both models are 
also accompanied by visualizations of the residuals.


## contingency table with nice labels
tab <- expand.grid(drug = 1:2, case = 1:2, age = 1:2)
tab$count <- c(21, 26, 17, 59, 18, 88, 7, 95)
tab$age <- factor(tab$age, levels = 1:2, labels = c("<40", ">40"))
tab$case <- factor(tab$case, levels = 1:2, labels = c("patient", 
"healthy"))

tab$drug <- factor(tab$drug, levels = 1:2, labels = c("yes", "no"))
tab <- xtabs(count ~ age + drug + case, data = tab)

## visualize case explained by drug given age
library("vcd")
mosaic(case ~ drug | age, data = tab,
  split_vertical = c(TRUE, TRUE, FALSE))

## test whether drug and case are independent given age
m1 <- loglm(~ age/(drug + case), data = tab)
m1

## visualize corresponding residuals from independence model
mosaic(case ~ drug | age, data = tab,
  split_vertical = c(TRUE, TRUE, FALSE),
  residuals_type = "deviance",
  gp = shading_hcl, gp_args = list(interpolate = 1.2))
mosaic(case ~ drug | age, data = tab,
  split_vertical = c(TRUE, TRUE, FALSE),
  residuals_type = "pearson",
  gp = shading_hcl, gp_args = list(interpolate = 1.2))

## test whether there is no three-way interaction
## (i.e., dependence of case on drug is the same for both age groups)
m2 <- loglm(~ (age + drug + case)^2, data = tab)
m2

## visualization (with default pearson residuals)
mosaic(case ~ drug | age, data = tab,
  expected = ~ (age + drug + case)^2,
  split_vertical = c(TRUE, TRUE, FALSE),
  gp = shading_hcl, gp_args = list(interpolate = 1.2))



Many thanks for your help.

My best.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] extract function extracting only NA values

2013-04-24 Thread Fabio Berzaghi

Hello,

I have five raster files in ASCII format. With four of them I have no 
problem extracting values based on a set of X and Y coordinates. 
Unfortunately with one of the files all I managed to extract is NA 
values. To verify the problem I have opened the raster with ArcMap and 
there are no NA values where I am extracting. I have also plotted the 
spatial point class on top of the raster in R and it does correspond to 
the correct locations.


These are some of the commands I am using; as I already pointed out, 
they work perfectly with the other raster files.


sp<-SpatialPoints(xysp)
xy$rasterimg<-extract(rasterimg,sp)

Can anyone help? At this point I am rather clueless about this.

Thanks

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Looking for a better code for my problem.

2013-04-24 Thread Rui Barradas

Hello,

You can easily make of the following a one-liner. Note that the order of 
rows is not the same as in your code, so identical() will return FALSE.



idx <- Dat[, 'AA'] == "A" | (Dat[, 'AA'] == "B" & Dat[, 'BB'] == "b")
res2 <- Dat[idx, ]


Hope this helps,

Rui Barradas

Em 24-04-2013 11:21, Christofer Bogaso escreveu:

Hello again,

Let say I have following data:

Dat <- structure(list(AA = structure(c(3L, 1L, 2L, 1L, 2L, 3L, 3L, 2L,
3L, 1L, 1L, 3L, 3L, 2L, 2L, 3L, 2L, 1L, 1L, 1L), .Label = c("A",
"B", "C"), class = "factor"), BB = structure(c(2L, 3L, 2L, 2L,
2L, 3L, 2L, 2L, 2L, 1L, 1L, 2L, 3L, 1L, 3L, 2L, 1L, 2L, 2L, 3L
), .Label = c("a", "b", "c"), class = "factor"), CC = 1:20), .Names = c("AA",
"BB", "CC"), row.names = c(NA, -20L), class = "data.frame")

Now I want to select a subset of that 'Dat', for which:
1. First column will contain ALL "A"
2. First column will contain those "B" for which "BB = b" in second column.

Therefore I tries following:


Only_A <- Dat[Dat[, 'AA'] == "A", ]
Only_B <- Dat[Dat[, 'AA'] == "B", ]
rbind(Only_A, Only_B[Only_B[, 'BB'] == "b", ])

AA BB CC
2   A  c  2
4   A  b  4
10  A  a 10
11  A  a 11
18  A  b 18
19  A  b 19
20  A  c 20
3   B  b  3
5   B  b  5
8   B  b  8


However I believe there must be some better code to achieve that which
is tidier, i.e. there must be some 1-liner code.

Can somebody suggest any better approach if possible?

Thanks and regards,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Looking for a better code for my problem.

2013-04-24 Thread Christofer Bogaso
Hello again,

Let say I have following data:

Dat <- structure(list(AA = structure(c(3L, 1L, 2L, 1L, 2L, 3L, 3L, 2L,
3L, 1L, 1L, 3L, 3L, 2L, 2L, 3L, 2L, 1L, 1L, 1L), .Label = c("A",
"B", "C"), class = "factor"), BB = structure(c(2L, 3L, 2L, 2L,
2L, 3L, 2L, 2L, 2L, 1L, 1L, 2L, 3L, 1L, 3L, 2L, 1L, 2L, 2L, 3L
), .Label = c("a", "b", "c"), class = "factor"), CC = 1:20), .Names = c("AA",
"BB", "CC"), row.names = c(NA, -20L), class = "data.frame")

Now I want to select a subset of that 'Dat', for which:
1. First column will contain ALL "A"
2. First column will contain those "B" for which "BB = b" in second column.

Therefore I tried the following:

> Only_A <- Dat[Dat[, 'AA'] == "A", ]
> Only_B <- Dat[Dat[, 'AA'] == "B", ]
> rbind(Only_A, Only_B[Only_B[, 'BB'] == "b", ])
   AA BB CC
2   A  c  2
4   A  b  4
10  A  a 10
11  A  a 11
18  A  b 18
19  A  b 19
20  A  c 20
3   B  b  3
5   B  b  5
8   B  b  8


However, I believe there must be better code to achieve this, something
tidier, i.e. a one-liner.

Can somebody suggest any better approach if possible?

Thanks and regards,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] identify object that causes "Error in loadNamespace(name) : there is no package called ‘R.utils’"

2013-04-24 Thread Liviu Andronic
Dear all,
I've bumped into the: "Error in loadNamespace(name) : there is no
package called ‘R.utils’" error. I've already read a bit on this (
http://www.cybaea.net/Blogs/Data/A-warning-on-the-R-save-format.html )
but I have a follow-up question.

Given a workspace that automatically loads a package that I don't
really need/want (e.g. ‘R.utils’), how do I identify which object
requires this package to load? I would like to avoid loading ‘R.utils’
every time I open an R session.

Regards,
Liviu


> sessionInfo()
R version 2.15.3 (2013-03-01)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C LC_NAME=C
LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  utils stats
methods   base

other attached packages:
 [1] R.utils_1.23.2R.oo_1.13.0   R.methodsS3_1.4.2
tables_0.7.57 reshape2_1.2.2
 [6] car_2.0-15nnet_7.3-6MASS_7.3-23
Hmisc_3.10-1  survival_2.37-2
[11] foreign_0.8-53

loaded via a namespace (and not attached):
[1] cluster_1.14.3   grid_2.15.3  lattice_0.20-13  plyr_1.8
 rstudio_0.97.312
[6] stringr_0.6.2tools_2.15.3


-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] pglm package: fitted values and residuals

2013-04-24 Thread alfonso . carfora

I'm using the package pglm and I have estimated a "random probit model".
I need to save the fitted values and the residuals of the model in a 
vector, but I cannot do it.


I tried with the command fitted.values using the following procedure  
without results:


library(pglm)

m1_S<-pglm(Feed ~ Cons_PC_1 + imp_gen_1 + LGDP_PC_1 + lnEI_1 +  
SH_Ren_1,data,family=binomial(probit),model="random",method="bfgs",index=c("Year","IDCountry"))


m1_S$fitted.values
residuals(m1)


Can someone help me about it?

Thanks



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] News package

2013-04-24 Thread Aswathy Nair
Hi,

Is there any package available in R to download news content?


Thanks in advance.
Ashy

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simulate user input in R

2013-04-24 Thread Vahe nr
Thanks Jeff for your reply.
The code is not written by me; it is from the NDVITS package for R, namely
TimeSeriesAnalysis {ndvits},
and at that point it requires my input.

Regards,
 Vahe


On Wed, Apr 24, 2013 at 11:04 AM, Jeff Newmiller
wrote:

> Your example is not reproducible, so it is open to misinterpretation.
>
> Normally the approach taken in R is not to wait for user input, but have
> the user call the function with the desired values. One interpretation of
> your question is that the function you are using is broken by design, and
> you should re-write it.
>
> ---
> Jeff NewmillerThe .   .  Go Live...
> DCN:Basics: ##.#.   ##.#.  Live
> Go...
>   Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
> ---
> Sent from my phone. Please excuse my brevity.
>
> Vahe nr  wrote:
>
> >Dear all,
> >
> >Can we simulate user input in R ?
> >for example if we have a function which needs an input from the user to
> >continue its work, can we automate this step (simulate the input...)
> >
> >Here is the sample:
> >choose between one of the grouping factor available :
> >c("Village", "Country")
>we need to enter Village or Country to continue. Can I automatically
>pass one of the values without user input.
> >
> >Regards,
> > Vahe
> >
> >   [[alternative HTML version deleted]]
> >
> >__
> >R-help@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simulate user input in R

2013-04-24 Thread Jeff Newmiller
Your example is not reproducible, so it is open to misinterpretation.

Normally the approach taken in R is not to wait for user input, but have the 
user call the function with the desired values. One interpretation of your 
question is that the function you are using is broken by design, and you should 
re-write it.

---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Vahe nr  wrote:

>Dear all,
>
>Can we simulate user input in R ?
>for example if we have a function which needs an input from the user to
>continue its work, can we automate this step (simulate the input...)
>
>Here is the sample:
>choose between one of the grouping factor available :
>c("Village", "Country")
>we need to enter Village or Country to continue. Can I automatically
>pass one of the values without user input.
>
>Regards,
> Vahe
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.