[R] Extract data from Array to Table

2015-02-11 Thread Karim Mezhoud
Dear All,
I am facing the task to extract data from array to table. here an example
of array.

##Get A list of Matrices with unequal rows

Disease - NULL
Diseases - NULL
ListMatByGene - NULL
for(i in 1:3){

Disease[[i]] -matrix(sample(-30:30,25+(5*
i)),5+i)
rownames(Disease[[i]]) - paste0(Sample,1:(5+i))
colnames(Disease[[i]]) - paste0(Gene,1:5)

D - paste0(Disease,i)
Diseases[[D]] - Disease[[i]]
}


## get the same Column from all matrices
getColumn - function(x, colNum, len = nrow(x)){
y - x[,colNum]
length(y) - len
y
}

## get Matrices by the same columns of the list of matrices
getMatrices - function(colNums, dataList = x){
# the number of rows required
n - max(sapply(dataList, nrow))
lapply(colNums, function(x, dat, n) { # iterate along requested columns
do.call(cbind, lapply(dat, getColumn,x, len=n)) # iterate along
input data list
}, dataList, n)
}

## Rotate the list of matrices by 90°
G - paste0(Gene,1:5)
ListMatByGene[G] - getMatrices(c(1:ncol(Diseases[[1]])),dataList=Diseases)

## get Disease correlation by gene
DiseaseCorrelation - lapply(ListMatByGene,function(x) cor(x,use=na,
method=spearman))

##convert the list of Matrices to array
ArrayDiseaseCor - array(unlist(DiseaseCorrelation), dim =
c(nrow(DiseaseCorrelation[[1]]), ncol(DiseaseCorrelation[[1]]),
length(DiseaseCorrelation)))
dimnames(ArrayDiseaseCor) - list(names(Diseases), names(Diseases),
colnames(Diseases[[1]]))

## Select only correlation bigger than 0.5 from the array
FilterDiseaseCor - apply(ArrayDiseaseCor,MARGIN=c(1,2) ,function(x)
x[abs(x)0.5])

## Final result
FilterDiseaseCor

 Disease1   Disease2  Disease3
Disease1 Numeric,5  Numeric,2 -0.9428571
Disease2 Numeric,2  Numeric,5 Numeric,2
Disease3 -0.9428571 Numeric,2 Numeric,5


Question is:
How can get a table as:

D1  D2   Cor   Gene
Disease1Disease2  -0.94Gene2
Disease1Disease2   0.78Gene4
Disease3Disease2   0.5  Gene5
...

and
 Disease1   Disease2  Disease3
Disease1510
Disease21 53
Disease30  3   5

Or in general, How can I extract data from Array to Table?

Thanks
Karim

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to solve this complex equation

2015-02-11 Thread Chel Hee Lee
A~ha~!!  Thank you, Prof. Peter Dalgaard, so much for your wonderful 
lesson!!!  Learning new things everyday from this R-help mailing list!


Chel Hee Lee

On 2/11/2015 10:37 AM, peter dalgaard wrote:


On 11 Feb 2015, at 17:11 , Chel Hee Lee chl...@mail.usask.ca wrote:


The functional form given in the post written by Ssuhanchen captures my eyes.   
It is the cumulative distribution function of Poisson when the number of counts 
is less than or equal to 2 with unknown parameter mu=x/2.   Since it is a 
nonlinear function, there may be multiple solutions but the solution should be 
greater than 0 (if I am in the right track).   I am assuming this functional 
form is originated from the Poisson.  Under this assumption, one solution is 
found as below:


rt - uniroot(function(x) ppois(2, lambda=x)-0.05, interval=c(0.5,1), 
extendInt=yes)

Warning messages:
1: In ppois(2, lambda = x) : NaNs produced
2: In ppois(2, lambda = x) : NaNs produced
3: In ppois(2, lambda = x) : NaNs produced

ppois(2, lambda=rt$root)

[1] 0.051

rt$root

[1] 6.295791

Thus, the solution x would be rt$root*2 (Note that I did not try to find other 
solutions).  I hope this helps.



Given the Poisson connection, I would pretty strongly expect the solution to be 
unique.

Notice also that your rt$root comes out as the upper end of the confidence 
interval in


poisson.test(2, alt=l)


Exact Poisson test

data:  2 time base: 1
number of events = 2, time base = 1, p-value = 0.9197
alternative hypothesis: true event rate is less than 1
95 percent confidence interval:
  0.00 6.295794
sample estimates:
event rate
  2






Chel Hee Lee

On 2/10/2015 2:29 AM, Rolf Turner wrote:

On 10/02/15 14:04, Ssuhanchen wrote:

Hi!

I want to use R to calculate the variable x which is in a complex equation
in below:

  2
  Σ[exp(-x/2)*(x^k)/(2^k*k!)]=0.05
k=0

how to solve this equation to get the exact x in R?


Is this homework?  Sure looks like it.  Talk to your prof.  Or do a bit of work 
on learning how to use R --- which is presumably the point of the exercise.

cheers,

Rolf Turner



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] A portfolio return function?

2015-02-11 Thread Ernest Stokely
For finance applications, I'm surprised that I am unable to find a function to 
compute the portfolio return (sqrt(t(w) %*% V %*% w)) where w are portfolio 
weights and V is the cov(returns). The Performance Analytics portfolio return 
function seems to compute something else.
Ernie

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting this data

2015-02-11 Thread JS Huang
Hi,

  Here is an implementation.  The image is uploaded as Rplot02.png.
 gen - read.table(geno.txt,header=TRUE)
 gen
  Genotype   E1   E2E3   E4   E5   E6   E7   E8   E9  E10  E11  E12  E13 
E14  E15   E16  E17  E18
1   G1 0.79 2.11  6.21 0.56 4.06 2.13 5.61 0.20 3.32 3.01 5.12 0.77 0.78
0.62 2.00  5.87 2.50 0.49
2   G2 0.59 0.92 10.11 0.74 4.01 1.12 4.58 0.70 2.61 1.49 4.59 0.20 1.33
0.62 2.67  5.58 2.22 0.93
3   G3 1.18 3.44  4.08 0.50 6.91 4.27 6.87 0.30 4.57 2.88 6.61 0.59 0.97
1.02 1.45  5.32 2.65 2.73
4   G4 2.17 4.46  6.51 0.35 4.80 5.99 5.44 1.20 4.21 3.59 4.89 0.39 2.26
2.05 3.33  8.55 2.10 2.39
5   G5 0.99 4.83  6.71 1.43 5.83 2.95 3.66 0.50 1.54 2.55 5.70 1.83 0.39
1.44 1.66  5.36 1.32 1.19
6   G6 1.57 3.94  7.49 0.61 5.88 2.74 7.22 1.19 4.81 3.72 5.59 0.38 1.36
1.43 3.94  9.15 2.50 2.56
7   G7 2.38 4.74  9.94 1.31 5.74 5.59 9.01 0.70 4.47 3.59 4.85 3.48 2.35
1.23 3.22 11.21 2.89 3.01
8   G8 0.98 2.56 12.02 3.31 5.41 2.03 5.92 0.30 4.39 4.19 6.10 0.39 1.97
0.62 4.93  8.89 3.45 2.95
   E19  E20  E21  E22  E23  E24
1 0.42 0.73 5.66 4.61 3.98 0.32
2 0.40 0.68 4.28 3.89 2.98 1.78
3 1.31 2.63 6.51 5.42 5.43 0.60
4 0.03 1.81 6.05 6.31 6.53 1.94
5 0.08 1.03 5.38 2.45 5.18 0.54
6 0.08 0.68 6.46 6.02 5.28 2.52
7 0.33 0.91 9.15 7.03 6.39 2.13
8 0.01 0.62 8.35 5.71 5.86 0.38
 x - expand.grid(gen[,1],names(gen)[2:dim(gen)[2]])
 x
Var1 Var2
1 G1   E1
2 G2   E1
3 G3   E1
4 G4   E1
5 G5   E1
6 G6   E1
7 G7   E1
8 G8   E1
9 G1   E2
10G2   E2
11G3   E2
12G4   E2
13G5   E2
14G6   E2
15G7   E2
16G8   E2
17G1   E3
18G2   E3
19G3   E3
20G4   E3
21G5   E3
22G6   E3
23G7   E3
24G8   E3
25G1   E4
26G2   E4
27G3   E4
28G4   E4
29G5   E4
30G6   E4
31G7   E4
32G8   E4
33G1   E5
34G2   E5
35G3   E5
36G4   E5
37G5   E5
38G6   E5
39G7   E5
40G8   E5
41G1   E6
42G2   E6
43G3   E6
44G4   E6
45G5   E6
46G6   E6
47G7   E6
48G8   E6
49G1   E7
50G2   E7
51G3   E7
52G4   E7
53G5   E7
54G6   E7
55G7   E7
56G8   E7
57G1   E8
58G2   E8
59G3   E8
60G4   E8
61G5   E8
62G6   E8
63G7   E8
64G8   E8
65G1   E9
66G2   E9
67G3   E9
68G4   E9
69G5   E9
70G6   E9
71G7   E9
72G8   E9
73G1  E10
74G2  E10
75G3  E10
76G4  E10
77G5  E10
78G6  E10
79G7  E10
80G8  E10
81G1  E11
82G2  E11
83G3  E11
84G4  E11
85G5  E11
86G6  E11
87G7  E11
88G8  E11
89G1  E12
90G2  E12
91G3  E12
92G4  E12
93G5  E12
94G6  E12
95G7  E12
96G8  E12
97G1  E13
98G2  E13
99G3  E13
100   G4  E13
101   G5  E13
102   G6  E13
103   G7  E13
104   G8  E13
105   G1  E14
106   G2  E14
107   G3  E14
108   G4  E14
109   G5  E14
110   G6  E14
111   G7  E14
112   G8  E14
113   G1  E15
114   G2  E15
115   G3  E15
116   G4  E15
117   G5  E15
118   G6  E15
119   G7  E15
120   G8  E15
121   G1  E16
122   G2  E16
123   G3  E16
124   G4  E16
125   G5  E16
126   G6  E16
127   G7  E16
128   G8  E16
129   G1  E17
130   G2  E17
131   G3  E17
132   G4  E17
133   G5  E17
134   G6  E17
135   G7  E17
136   G8  E17
137   G1  E18
138   G2  E18
139   G3  E18
140   G4  E18
141   G5  E18
142   G6  E18
143   G7  E18
144   G8  E18
145   G1  E19
146   G2  E19
147   G3  E19
148   G4  E19
149   G5  E19
150   G6  E19
151   G7  E19
152   G8  E19
153   G1  E20
154   G2  E20
155   G3  E20
156   G4  E20
157   G5  E20
158   G6  E20
159   G7  E20
160   G8  E20
161   G1  E21
162   G2  E21
163   G3  E21
164   G4  E21
165   G5  E21
166   G6  E21
167   G7  E21
168   G8  E21
169   G1  E22
170   G2  E22
171   G3  E22
172   G4  E22
173   G5  E22
174   G6  E22
175   G7  E22
176   G8  E22
177   G1  E23
178   G2  E23
179   G3  E23
180   G4  E23
181   G5  E23
182   G6  E23
183   G7  E23
184   G8  E23
185   G1  E24
186   G2  E24
187   G3  E24
188   G4  E24
189   G5  E24
190   G6  E24
191   G7  E24
192   G8  E24
 names(gen) - NULL
 genfinal - data.frame(x, unlist(gen[,2:dim(gen)[2]]))
 names(genfinal) - c(Genotype,category,value)
 genfinal$category - as.numeric(genfinal$category)
 head(genfinal)
  Genotype category value
1   G11  0.79
2   G21  0.59
3   G31  1.18
4   G41  2.17
5   G51  0.99
6   G61  1.57
 library(ggplot2)
 ggplot(genfinal,aes(x=category,y=value,colour=Genotype)) + geom_line()
Rplot02.png http://r.789695.n4.nabble.com/file/n4703089/Rplot02.png  



--
View this message in context: 
http://r.789695.n4.nabble.com/plotting-this-data-tp4702926p4703089.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 

[R] Download internet videos

2015-02-11 Thread Raoni Rodrigues
Hello R-helpers,

It is possible donwload youtube videos with R? I made a google search and
find no options to do that.

Thanks in advanced,

Raoni

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to solve this complex equation

2015-02-11 Thread peter dalgaard

On 11 Feb 2015, at 17:11 , Chel Hee Lee chl...@mail.usask.ca wrote:

 The functional form given in the post written by Ssuhanchen captures my eyes. 
   It is the cumulative distribution function of Poisson when the number of 
 counts is less than or equal to 2 with unknown parameter mu=x/2.   Since it 
 is a nonlinear function, there may be multiple solutions but the solution 
 should be greater than 0 (if I am in the right track).   I am assuming this 
 functional form is originated from the Poisson.  Under this assumption, one 
 solution is found as below:
 
  rt - uniroot(function(x) ppois(2, lambda=x)-0.05, interval=c(0.5,1), 
  extendInt=yes)
 Warning messages:
 1: In ppois(2, lambda = x) : NaNs produced
 2: In ppois(2, lambda = x) : NaNs produced
 3: In ppois(2, lambda = x) : NaNs produced
  ppois(2, lambda=rt$root)
 [1] 0.051
  rt$root
 [1] 6.295791
 
 Thus, the solution x would be rt$root*2 (Note that I did not try to find 
 other solutions).  I hope this helps.
 

Given the Poisson connection, I would pretty strongly expect the solution to be 
unique. 

Notice also that your rt$root comes out as the upper end of the confidence 
interval in

 poisson.test(2, alt=l)

Exact Poisson test

data:  2 time base: 1
number of events = 2, time base = 1, p-value = 0.9197
alternative hypothesis: true event rate is less than 1
95 percent confidence interval:
 0.00 6.295794
sample estimates:
event rate 
 2 





 Chel Hee Lee
 
 On 2/10/2015 2:29 AM, Rolf Turner wrote:
 On 10/02/15 14:04, Ssuhanchen wrote:
 Hi!
 
 I want to use R to calculate the variable x which is in a complex equation
 in below:
 
  2
  Σ[exp(-x/2)*(x^k)/(2^k*k!)]=0.05
 k=0
 
 how to solve this equation to get the exact x in R?
 
 Is this homework?  Sure looks like it.  Talk to your prof.  Or do a bit of 
 work on learning how to use R --- which is presumably the point of the 
 exercise.
 
 cheers,
 
 Rolf Turner
 
 
 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to solve this complex equation

2015-02-11 Thread Chel Hee Lee
The functional form given in the post written by Ssuhanchen captures my 
eyes.   It is the cumulative distribution function of Poisson when the 
number of counts is less than or equal to 2 with unknown parameter 
mu=x/2.   Since it is a nonlinear function, there may be multiple 
solutions but the solution should be greater than 0 (if I am in the 
right track).   I am assuming this functional form is originated from 
the Poisson.  Under this assumption, one solution is found as below:


 rt - uniroot(function(x) ppois(2, lambda=x)-0.05, interval=c(0.5,1), 
extendInt=yes)

Warning messages:
1: In ppois(2, lambda = x) : NaNs produced
2: In ppois(2, lambda = x) : NaNs produced
3: In ppois(2, lambda = x) : NaNs produced
 ppois(2, lambda=rt$root)
[1] 0.051
 rt$root
[1] 6.295791

Thus, the solution x would be rt$root*2 (Note that I did not try to find 
other solutions).  I hope this helps.


Chel Hee Lee

On 2/10/2015 2:29 AM, Rolf Turner wrote:

On 10/02/15 14:04, Ssuhanchen wrote:

Hi!

I want to use R to calculate the variable x which is in a complex 
equation

in below:

  2
  Σ[exp(-x/2)*(x^k)/(2^k*k!)]=0.05
k=0

how to solve this equation to get the exact x in R?


Is this homework?  Sure looks like it.  Talk to your prof.  Or do a 
bit of work on learning how to use R --- which is presumably the point 
of the exercise.


cheers,

Rolf Turner



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Download internet videos

2015-02-11 Thread Jeff Newmiller
The set of things that R can do is not a subset of the set of things packages 
may have been written for.

Have at it.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

On February 11, 2015 11:17:47 AM EST, Raoni Rodrigues 
caciquesamu...@gmail.com wrote:
Hello R-helpers,

It is possible donwload youtube videos with R? I made a google search
and
find no options to do that.

Thanks in advanced,

Raoni

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] prediction intervals for robust regression

2015-02-11 Thread Burns, Jonathan (NONUS)
I have created robust regression models using least trimmed squares and 
MM-regression (using the R package robustbase).

I am now looking to create prediction intervals for the predicted results.  
While I have seen some discussion in the literature about confidence intervals 
on the estimates for robust regression, I haven't had much success in finding 
out how to create prediction intervals for the results.  I was wondering if 
anyone would be able to provide some direction on how to create these 
prediction intervals in the robust regression setting.

Thanks,

Jonathan Burns
Sr. Statistician
General Dynamics Information Technology
Medicare  Medicaid Solutions
One West Pennsylvania Avenue
Baltimore, MD 21204
(410)-842-1594
jonathan.bur...@gdit.commailto:jonathan.bur...@gdit.com
www.gdit.comhttp://www.gdit.com/


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] prediction intervals for robust regression

2015-02-11 Thread Bert Gunter
Presumably you've checked out:

http://cran.r-project.org/web/views/Robust.html

If you can estimate the variance of parameter estimates, betahat, then
you can estimate the variance of a predicted value, X betahat; add the
estimated variance of individuals to this if that's what you're
looking for (and it's not already available).

Further questions should go to a statistics site like
stats.stackexchange.com, as statistical questions are off topic here.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
Clifford Stoll




On Wed, Feb 11, 2015 at 11:03 AM, Burns, Jonathan (NONUS)
jonathan.bur...@gdit.com wrote:
 I have created robust regression models using least trimmed squares and 
 MM-regression (using the R package robustbase).

 I am now looking to create prediction intervals for the predicted results.  
 While I have seen some discussion in the literature about confidence 
 intervals on the estimates for robust regression, I haven't had much success 
 in finding out how to create prediction intervals for the results.  I was 
 wondering if anyone would be able to provide some direction on how to create 
 these prediction intervals in the robust regression setting.

 Thanks,

 Jonathan Burns
 Sr. Statistician
 General Dynamics Information Technology
 Medicare  Medicaid Solutions
 One West Pennsylvania Avenue
 Baltimore, MD 21204
 (410)-842-1594
 jonathan.bur...@gdit.commailto:jonathan.bur...@gdit.com
 www.gdit.comhttp://www.gdit.com/


 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Package build help

2015-02-11 Thread Hadley Wickham
On Sun, Feb 8, 2015 at 5:15 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
 On 08/02/2015 4:06 PM, Glenn Schultz wrote:
 Hello All,

 I am in the final stages of building my first package BondLab and the 
 check throughs the following warning.  I think this is namespace thing.  I 
 have not done anything with the namespace at this point.  I am turning my 
 attention to the namespace now.  Am I correct this can be a handled by the 
 namespace?


 I would guess you have imported the lubridate and plyr packages, and
 also defined your own duration() and here() functions, hiding theirs.

You can also see this problem if you have

import(plyr)
import(plyr, here)

etc.

Or with

import(plyr)
import(lubridate)

since I think both provide a here() function.

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sqldf() difference between R 3.1.2 and 3.0.1

2015-02-11 Thread Doran, Harold
That seems to have worked, both in the new and old version of R. I'll do more 
unit testing on other files.

Thank you, Gabor.

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Wednesday, February 11, 2015 10:22 AM
To: Doran, Harold
Cc: r-help@r-project.org
Subject: Re: [R] sqldf() difference between R 3.1.2 and 3.0.1

On Wed, Feb 11, 2015 at 9:45 AM, Doran, Harold hdo...@air.org wrote:
 I have a function written and tested using R 3.0.1 and sqldf_0.4-7.1 that 
 works perfectly. However, using this same code with R 3.1.2 and sqldf_0.4-10 
 yields the error below that I am having a difficult time deciphering. Hence, 
 same code behaves differently on different versions of R and sqldf().

 Error in sqliteSendQuery(con, statement, bind.data) :
   error in statement: no such column: V1


 Reproducible example below as well as complete sessionInfo all provided below.


 My function and code using the function are below.

 dorReader - function(dorFile, layout, sepChar = '\n'){
 sepChar - as.character(sepChar)
 dorFile - as.character(dorFile)
 layout$type2 - ifelse(layout$type == 'C', 'character',
 
 ifelse(layout$type == 'N', 'numeric', 'Date'))
 dor - file(dorFile)
 attr(dor, file.format) - list(sep = sepChar)
 getVars - paste(select,
paste(substr(V1, , layout$Start, , ,
  layout$Length, ) ', layout$Variable.Name, ', 
 collapse = , ), from dor)
 dat - sqldf(getVars)

 classConverter - function(obj, types){
 out - lapply(1:length(obj),FUN = 
 function(i){FUN1 - switch(types[i],character = as.character,numeric = 
 as.numeric,factor = as.factor, Date = as.character); FUN1(obj[,i])})
 names(out) - colnames(obj)
 as.data.frame(out)
 }
 dat - classConverter(dat, layout$type2)
 names(dat) - layout$Variable.Name
 dat
 }

 ### contents of fwf file 'sample.txt'
 1234567
 1234567
 1234567
 1234567
 1234567
 1234567
 1234567
 1234567

 layout - data.frame(Variable.Name =c('test1', 'test2'), Length = 
 c(3,4), Start =c(1,4), End = c(3,7), type = c('N', 'N'))

 tmp - dorReader('sample.txt', layout)

sqldf is documented to use the sqliteImportFile defaults for file.format 
components.  It may be that RSQLite 1.0 has changed the default for header in 
sqliteImportFile.

Try replacing your statement that sets file.format with this:

attr(dor, file.format) - list(sep = sepChar, header = FALSE)
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sqldf() difference between R 3.1.2 and 3.0.1

2015-02-11 Thread Gabor Grothendieck
On Wed, Feb 11, 2015 at 9:45 AM, Doran, Harold hdo...@air.org wrote:
 I have a function written and tested using R 3.0.1 and sqldf_0.4-7.1 that 
 works perfectly. However, using this same code with R 3.1.2 and sqldf_0.4-10 
 yields the error below that I am having a difficult time deciphering. Hence, 
 same code behaves differently on different versions of R and sqldf().

 Error in sqliteSendQuery(con, statement, bind.data) :
   error in statement: no such column: V1


 Reproducible example below as well as complete sessionInfo all provided below.


 My function and code using the function are below.

 dorReader - function(dorFile, layout, sepChar = '\n'){
 sepChar - as.character(sepChar)
 dorFile - as.character(dorFile)
 layout$type2 - ifelse(layout$type == 'C', 'character',
 
 ifelse(layout$type == 'N', 'numeric', 'Date'))
 dor - file(dorFile)
 attr(dor, file.format) - list(sep = sepChar)
 getVars - paste(select,
paste(substr(V1, , layout$Start, , ,
  layout$Length, ) ', layout$Variable.Name, ', 
 collapse = , ), from dor)
 dat - sqldf(getVars)

 classConverter - function(obj, types){
 out - lapply(1:length(obj),FUN = 
 function(i){FUN1 - switch(types[i],character = as.character,numeric = 
 as.numeric,factor = as.factor, Date = as.character); FUN1(obj[,i])})
 names(out) - colnames(obj)
 as.data.frame(out)
 }
 dat - classConverter(dat, layout$type2)
 names(dat) - layout$Variable.Name
 dat
 }

 ### contents of fwf file 'sample.txt'
 1234567
 1234567
 1234567
 1234567
 1234567
 1234567
 1234567
 1234567

 layout - data.frame(Variable.Name =c('test1', 'test2'), Length = c(3,4), 
 Start =c(1,4), End = c(3,7), type = c('N', 'N'))

 tmp - dorReader('sample.txt', layout)

sqldf is documented to use the sqliteImportFile defaults for
file.format components.  It may be that RSQLite 1.0 has changed the
default for header in sqliteImportFile.

Try replacing your statement that sets file.format with this:

attr(dor, file.format) - list(sep = sepChar, header = FALSE)

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rotate array by 90°

2015-02-11 Thread Karim Mezhoud
Thanks Lee and Huang.
That is useful.
Best




On Fri, Feb 6, 2015 at 1:53 AM, Chel Hee Lee chl...@mail.usask.ca wrote:

  lapply(1:2, function(x) t(A[rev(1:3),,x]))
 [[1]]
  [,1] [,2] [,3]
 [1,] g  d  a
 [2,] h  e  b
 [3,] i  f  c

 [[2]]
  [,1] [,2] [,3]
 [1,] p  m  j
 [2,] q  n  k
 [3,] r  o  l

 

 Is this what you are looking for?  I hope this helps.

 Chel Hee Lee


 On 02/05/2015 03:17 PM, Karim Mezhoud wrote:

 Dear aal,

 Is there a way to rotate array or a cube of matrices by Y axis?


 MatLab example:

 A = cat(3,{'a' 'b' 'c';'d' 'e' 'f';'g' 'h' 'i'},{'j' 'k' 'l';'m' 'n'
 'o';'p' 'q' 'r'})

 A(:,:,1) =

  'a''b''c'
  'd''e''f'
  'g''h''i'


 A(:,:,2) =

  'j''k''l'
  'm''n''o'
  'p''q''r'

 Rotate the cell array by 270 degrees.

 B = rot90(A,3)

 B(:,:,1) =

  'g''d''a'
  'h''e''b'
  'i''f''c'


 B(:,:,2) =

  'p''m''j'
  'q''n''k'
  'r''o''l'


 karim

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/
 posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] factor levels numeric values

2015-02-11 Thread JS Huang
Hi,

  Suppose your data frame is called data and the name of the factor column
is named tobeConverted.  I have tried this and it worked.  Hope this helps.

 as.numeric(as.character(data$tobeConverted))



--
View this message in context: 
http://r.789695.n4.nabble.com/factor-levels-numeric-values-tp4699515p4703090.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] library(Rcmdr) sh: otool: command not found

2015-02-11 Thread varin sacha
Dear John,
Dear Dan,

Many thanks for your response. It works perfectly.

Best,


- Mail original -
De : John Fox j...@mcmaster.ca
À : 'varin sacha' varinsa...@yahoo.fr
Cc : 'r-help@R-project.org' r-help@r-project.org
Envoyé le : Mercredi 11 février 2015 3h12
Objet : RE: [R] library(Rcmdr) sh: otool: command not found

Dear varin sacha,

This problem presumably will also arise if you try to load the tcltk package 
directly via library(tcltk) and has been discussed before on the R-SIG-Mac 
email list, for example at 
https://stat.ethz.ch/pipermail/r-sig-mac/2014-December/011260.html. The 
problem is fixed in the patched version of R 3.1.2 for Mac OS X, available at 
http://r.research.att.com/.

I hope this helps,
John

---
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/




 -Original Message-
 From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of varin sacha
 Sent: February-10-15 6:17 PM
 To: r-help@R-project.org
 Subject: [R] library(Rcmdr) sh: otool: command not found

 Hi R experts,

 I have just updated R and RStudio. I am running OS X 10.6.8

 R version 3.1.2 (2014-10-31) -- Pumpkin Helmet
 Copyright (C) 2014 The R Foundation for Statistical Computing
 Platform: x86_64-apple-darwin10.8.0 (64-bit)


 RStudio Version 0.98.1102 – © 2009-2014 RStudio, Inc.
 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.59.10
 (KHTML, like Gecko)


 Everything is ok for installing the Rcmdr package.

 install.packages(Rcmdr)
 Installing package into ‘/Users/Caro/Library/R/3.1/library’
 (as ‘lib’ is unspecified)
 essai de l'URL 'http://cran.rstudio.com/bin/macosx/contrib/3.1/Rcmdr_2.1-
 6.tgz'
 Content type 'application/x-gzip' length 5340999 bytes (5.1 Mb) URL ouverte
 ==
 downloaded 5.1 Mb


 The downloaded binary packages are in
 /var/folders/V2/V26Pbtc-GsSUTVPWEj6Bdk+++TI/-Tmp-
 //RtmpgNlyXf/downloaded_packages
 


 But once I write library(Rcmdr), it doesn't work.

 library(Rcmdr)
 Le chargement a nécessité le package : splines Le chargement a nécessité le
 package : RcmdrMisc Le chargement a nécessité le package : car Le
 chargement a nécessité le package : sandwich Error : .onLoad a échoué dans
 loadNamespace() pour 'tcltk', détails :
 appel : system2(otool, c(-L, shQuote(DLL)), stdout = TRUE) erreur : erreur
 lors de l'exécution d'une commande Erreur : le chargement du package ou de
 l'espace de noms a échoué pour ‘Rcmdr’
 sh: otool: command not found

 Has somebody any idea of how I can solve this problem ?

 Thanks,

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.


---
This email has been checked for viruses by Avast antivirus software.
http://www.avast.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] sqldf() difference between R 3.1.2 and 3.0.1

2015-02-11 Thread Doran, Harold
I have a function written and tested using R 3.0.1 and sqldf_0.4-7.1 that works 
perfectly. However, using this same code with R 3.1.2 and sqldf_0.4-10 yields 
the error below that I am having a difficult time deciphering. Hence, same code 
behaves differently on different versions of R and sqldf().

Error in sqliteSendQuery(con, statement, bind.data) :
  error in statement: no such column: V1


Reproducible example below as well as complete sessionInfo all provided below.


My function and code using the function are below.

dorReader - function(dorFile, layout, sepChar = '\n'){
sepChar - as.character(sepChar)
dorFile - as.character(dorFile)
layout$type2 - ifelse(layout$type == 'C', 'character',

ifelse(layout$type == 'N', 'numeric', 'Date'))
dor - file(dorFile)
attr(dor, file.format) - list(sep = sepChar)
getVars - paste(select,
   paste(substr(V1, , layout$Start, , ,
 layout$Length, ) ', layout$Variable.Name, ', collapse 
= , ), from dor)
dat - sqldf(getVars)

classConverter - function(obj, types){
out - lapply(1:length(obj),FUN = 
function(i){FUN1 - switch(types[i],character = as.character,numeric = 
as.numeric,factor = as.factor, Date = as.character); FUN1(obj[,i])})
names(out) - colnames(obj)
as.data.frame(out)
}
dat - classConverter(dat, layout$type2)
names(dat) - layout$Variable.Name
dat
}

### contents of fwf file 'sample.txt'
1234567
1234567
1234567
1234567
1234567
1234567
1234567
1234567

layout - data.frame(Variable.Name =c('test1', 'test2'), Length = c(3,4), 
Start =c(1,4), End = c(3,7), type = c('N', 'N'))

tmp - dorReader('sample.txt', layout)

### SessionInfo where functions behaves as expected
 sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252  
  LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C   LC_TIME=English_United States.1252

attached base packages:
[1] tcltk stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] sqldf_0.4-7.1 RSQLite.extfuns_0.0.1 RSQLite_0.11.4DBI_0.2-7 
gsubfn_0.6-5
[6] proto_0.3-10  MiscPsycho_1.6statmod_1.4.18

loaded via a namespace (and not attached):
[1] chron_2.3-45 tools_3.0.1




### SessionInfo for version not working
 sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] tcltk stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] sqldf_0.4-10  RSQLite_1.0.0 DBI_0.3.1 gsubfn_0.6-6  proto_0.3-10

loaded via a namespace (and not attached):
[1] chron_2.3-45 tools_3.1.2

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] MApply and SubStr

2015-02-11 Thread Brian Trautman
That's exactly what I wanted, thank you very much!

My intent was to perform the SubStr operation first, and then apply the
types to the columns. I wasn't expecting the two types in the same column.

I appreciate your response!

On Tue, Feb 10, 2015 at 5:03 PM, David Winsemius dwinsem...@comcast.net
wrote:


 On Feb 10, 2015, at 3:58 PM, Brian Trautman wrote:

  Hi!
 
  I'm trying to write a custom function that applies SubStr to a string,
 and
  then depending on the arguments, converts the output to a number.
 
  The substring part of my code works fine, but it's not converting the
 way I
  want to --
 
  options('stringsAsFactors'=FALSE)
  require(data.table)
 
  substr_typeswitch - function(x, start, stop, typeto='chr')
  {
   tmpvar - substr(x=x, start=start, stop=stop)
   tmpvar - switch(typeto, num=as.numeric(tmpvar), tmpvar)
   return(tmpvar)
  }
   startpos - c(01, 03)
   endpos -   c(02, 04)
   typelist - c('chr', 'num')
 
   startdata - as.data.table(c('aa01', 'bb02'))
 
   enddata_want - as.data.table(mapply(substr_typeswitch, startdata,
  startpos, endpos, typelist))
 
  If I examine enddata_want --
 
  str(enddata_want)
  Classes ‘data.table’ and 'data.frame': 2 obs. of  2 variables:
  $ V1: chr  aa bb
  $ NA: chr  1 2
  - attr(*, .internal.selfref)=externalptr
 
  1 and 2 are being stored as character, and not as number.

 It appears from you code that you might be expecting a vector in a
 dataframe object to have a character mode in the first postition and a
 numeric mode in the second position. That wouldn't seem to be a reasonable
 expectation. But maybe you were hoping the chr and num types were to be
 applied to columns. I was surprised to get something different from
 as.data.table:

  str(enddata_want)
 Classes ‘data.table’ and 'data.frame':  2 obs. of  2 variables:
  $ V1: Factor w/ 2 levels aa,bb: 1 2
  $ NA: Factor w/ 2 levels 1,2: 1 2
  - attr(*, .internal.selfref)=externalptr

 The mapply operation made a matrix which forces all values to be the same
 mode:

  str( mapply(substr_typeswitch, startdata,
 +  startpos, endpos, typelist) )
  chr [1:2, 1:2] aa bb 1 2
  - attr(*, dimnames)=List of 2
   ..$ : NULL
   ..$ : chr [1:2] V1 NA

 You might have gotten something less homogeneous if you added the SIMPLIFY
 argument:

  str( mapply(substr_typeswitch, startdata,
 +  startpos, endpos, typelist, SIMPLIFY=FALSE) )
 List of 2
  $ V1: chr [1:2] aa bb
  $ NA: num [1:2] 1 2




 
  Can anyone help me understand what I'm doing wrong?
 
  Thank you!
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

 David Winsemius
 Alameda, CA, USA



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to read this Rpart decision tree?

2015-02-11 Thread Sarah Goslee
Hi Kim

fancyRpartPlot is a front-end to prp, and you can pass it all of the
prp options - it says this in the help for fancyRpartPlot, and that's
about all it says.

So you need to spend some time reading about prp options, and how to
customize your plot to get what you want. There are lots of detailed
resources; here's one to get you started.

http://www.milbo.org/rpart-plot/prp.pdf

Sarah

On Wed, Feb 11, 2015 at 3:02 AM, Kim C. minorthre...@hotmail.com wrote:
 Hi all,

 In the attachment or this link (http://oi58.tinypic.com/35ic9qc.jpg) you'll 
 find the decision tree I made. I used the Rpart package to make the tree and 
 the rattle package using the fancyRpartPlot to plot it. The data in the tree 
 looks different than about every example I have seen before. I don't 
 understand how I should read it. I want to predict Product (which are 
 productkeys). The variables to predict it contain age, incomegroup, gender, 
 totalchildren, education, occupation, houseownerflag, numberCars.It looks 
 like the upper number is a ProductKey. The n is number of observations? And 
 the percentage of the yes/no question below.

 This is the code I used.

 ss.rpart1 - rpart(Product ~ ., data=sstrain, 
 control=rpart.control(minbucket=2,minsplit=1, cp=-1))
 spt - which.min(ss.rpart1$cptable[, xerror])
 scp - ss.rpart1$cptable[opt, CP]
 ss.rpart2 - prune(ss.rpart1, cp=cp)
 fancyRpartPlot(ss.rpart2)

 So why does the tree looks so different from the most (for example: 
 http://media.tumblr.com/a9f482ff88b0b9cfaffca7ffd46c6a8e/tumblr_inline_mz7pyuaYJQ1s5wtly.png).
  This is from Trevor Stephen's TItanic tutorial. The first node show that 62% 
 of 100% doesn't survive. If they were male, only 19% of them were survivors. 
 I find that a lot examples look like that. Why does mine predict per 
 ProductKey and every node it has something else. it doesn't make sense to me. 
 And it doesn't have the two numbers like .62 and .38 but it has n=197e+3. So 
 should I read the first node like For 100% of the observations of ProductKey 
 1074, the incomegroup was moderate)?

 Thank you!

 Kim

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help in anova

2015-02-11 Thread Marco Barbàra
I suspect that your treatment levels are perfectly nested inside the
soilgroup levels. If this were the case you can either use one of the
factor in your analysis or the other, depending on what you want to
analyze.


Il giorno Wed, 11 Feb 2015 14:47:18 +
Mir Salam mir.sa...@uef.fi ha scritto:

 Dear all,
 


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] help in anova

2015-02-11 Thread Mir Salam
Dear all,



I have follwing factors in my data set (G)

names(G)
relative ht, biomass, treatment, variety, soil group

relative height
biomass
treatment (4 levels)- control+lime, control+ash, contaminated soil+lime, 
contaminated soil+ash
variety (4 levels)- Salix swherinii, Salix mursinifolia, Klara, Karin
soilgroup (2 level)- control and contaminated
if I fit aov function this is ok

anova1-aov(ht~treatment*variety,data=G)
summary(anova1)
 R- output
 Df Sum SqMean Sq   F value 
Pr(F)
treatment  314015 4672 65.021  2e-16 
***
variety3 5883   196127.295 
2.95e-12 ***
treatment:variety9   5474  608 8.465 8.93e-09 ***
Residuals 80 5748 72
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
above I can see my treatment and variety interaction effect
anova3-aov(ht~treatment*variety*soilgroup,data=G)
 summary(anova3)
R-ouput

  Df  Sum SqMean SqF value 
Pr(F)
treatment314015  4672  65.021  2e-16 
***
variety  3 5883   1961   27.295 
2.95e-12 ***
treatment:variety  9  5474   608 8.465 8.93e-09 ***
soilgroup -missing
treatment:variety:soilgroup - missing
Here, I am missing value of soil group and interation of treatment: variety: 
soilgroup- Can any one please tell me why?
Thanks for help


Best regards
Salam

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Maxent does not work

2015-02-11 Thread Kenneth Z
Is there any package that can replace maxent for a large number of independent 
variables?


Thanks,
Ken

Sent from my iPhone



 On Feb 2, 2015, at 6:07 PM, Kenneth Z kenneth...@gmail.com wrote:
 
 Yesterday I installed the most recent R and maxent package, but it stopped 
 working. Even a simple command like
 Model - maxent(matrix(c(1,2,3,4,5,6,7,8), nrow=2, ncol=4),c(1,-1))
 
 will cause a fatal error in R. I am attaching a screenshot to this email.
 Any help will be appreciated.
 
 Best regards,
 Ken
 
 
 
 
 On Dec 12, 2012, at 4:13 PM, Kenneth Z kenneth...@gmail.com wrote:
 
 I found that the optim() function does not always reach the real minimum. Is 
 it because the solution is trapped at a local minimum?
 
 Thanks!
 Ken
 
 
 
 On Dec 12, 2012, at 2:17 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us 
 wrote:
 
 ... origin pro?
 
 Then why are you here?
 
 It is not clear from your message that this has anything to do with R.
 ---
 Jeff NewmillerThe .   .  Go Live...
 DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
Live:   OO#.. Dead: OO#..  Playing
 Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
 /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
 --- 
 Sent from my phone. Please excuse my brevity.
 
 vishal katoch vkatoch...@yahoo.co.in wrote:
 
 Hello,
 i am working in origin pro,
 i want to plot a graph as like a pdf attached but with black and white
 lines.
 here radial axis varies from 0 to 1. and angular axis from 0 degree to
 60 degree.and third axis which is depend on both radial and axial gives
 non intersecting lines.
 how can i read the data from plot for replot.
 vikas 
 
 
 
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mixed-effects model for pre-post randomization design

2015-02-11 Thread Ben Bolker
Marco Barbàra jabbba at gmail.com writes:

 
 DeaR userRs,
 
 I recently read this Liang-Zeger article:
 
 http://sankhya.isical.ac.in/search/62b1/fpaper7.html
 
 in which (among other things) they adopt a random intercept model for
 pre-post designed trials, using a conditional likelihood approach
 (I didn't think it possible with only two measurements per subject)
 
 I'm trying to figure out (if and) how it is possible to reproduce
 straightforwardly their model using R standard mixed model tools, but
 I cannot even try to reproduce their work, since they used a
 non-available dataset (I found an extract on prof. Diggle's web site
 where it is explicitly reported to be confidential), so I have to
 review a bit of likelihood theory along with some implementation
 details.
 
 In the meantime, I wonder if anyone here could point out any related
 documentation to me.
 


  This might get more attention on r-sig-mixed-mod...@r-project.org.
I took a quick look at the paper, but it's not a case where the
answer is immediately obvious.  The paper of reference for lme4
(see http://cran.r-project.org/web/packages/lme4/citation.html )
gives technical details of lme4's implementation, in case that's
useful.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Nonlinear integer programming question

2015-02-11 Thread Zwick, Rebecca J
I am seeking an optimization routine that can deal with the following problem:
Maximize g(x), where x is a vector and g is nonlinear, subject to linear 
constraints of the form h(x)0 and m(x)=0 and subject to the constraint that 
all values of x are 0 or 1.
I can't find a nonlinear optimization program in R that states that it can 
accommodate 0-1 constraints.
Oddly, Excel's Solver will produce a solution to such problems but (1) I don't 
trust it and (2) it cannot handle a large number of constraints.
Thanks, all!

Rebecca Zwick  (Santa Barbara, California)
Statistical Analysis, Data Analysis, and Psychometric Research
Educational Testing Service




This e-mail and any files transmitted with it may contain privileged or 
confidential information. It is solely for use by the individual for whom it is 
intended, even if addressed incorrectly. If you received this e-mail in error, 
please notify the sender; do not disclose, copy, distribute, or take any action 
in reliance on the contents of this information; and delete it from your 
system. Any other use of this e-mail is prohibited.


Thank you for your compliance.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Impute time-series data, perhaps with a Kalman filter - do you know of any R code?

2015-02-11 Thread Olivier Crouzet
Hi,

searching for kalman filter R gives this paper that was published in
JSS. It may help.

@article{Tusell:2010:JSSOBK:v39i02,
  author =  Fernando Tusell,
  title =   Kalman Filtering in R,
  journal = Journal of Statistical Software,
  volume =  39,
  number =  2,
  pages =   1--27,
  day = 1,
  month =   3,
  year =2011,
  CODEN =   JSSOBK,
  ISSN =1548-7660,
  bibdate = 2010-08-17,
  URL = http://www.jstatsoft.org/v39/i02;,
  accepted =2010-08-17,
  acknowledgement = ,
  keywords =,
  submitted =   2010-01-12,
}

15:23:00 -0500 John Sorkin jsor...@grecc.umaryland.edu wrote:

 Does anyone have code that uses a Kalman filter to impute time-series
 data? If not, do you know of any software that can be used to impute
 time-series data? Thank you, John
 John David Sorkin M.D., Ph.D.
 Professor of Medicine
 Chief, Biostatistics and Informatics
 University of Maryland School of Medicine Division of Gerontology and
 Geriatric Medicine Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing) 
 
 Confidentiality Statement:
 This email message, including any attachments, is for ...{{dropped:28}}

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] AR1 covariance structure for lme object and R/SAS differences in model output

2015-02-11 Thread peter dalgaard

 On 11 Feb 2015, at 16:57 , anord andreas.n...@biol.lu.se wrote:
 
 Dear R users, 
 We are working on a data set in which we have measured repeatedly a
 physiological response variable (y) 
 every 20 min for 12 h (time variable; 'x') in subjects ('id') beloning to
 one of five groups ('group'; 'A' to 'E'). Data are located at:
 https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0
 
 We are interested to model if the response in y differences with time (i.e.
 'x') for the two groups. Thus:
 require(nlme)
 m1-lme(y~group*x+group*I(x^2),random=~x|id,data=data.df,na.action=na.omit)
 
 But because data are collected repeatedly over short time intervals for each
 subject, it seemed prudent to consider an autoregressive covariance
 structure. Thus:
 m2-update(m1,~.,corr=corCAR1(form=~x|id))
 
 AIC values indicate the latter (i.e. m2) as more appropriate:
  anova(m1,m2)
 #   Model df  AIC  BIC   logLikTest  L.Ratio 
 p-value
 #m1 1 19 2155.996 2260.767 -1058.9981
 #m2 2 20 2021.944 2132.229  -990.9718 1 vs 2 136.0525  .0001
 
 Fixed effects and test statistics differ between models. A look at marginal
 ANOVA tables suggest inference might differ somewhat between models:
 
 anova.lme(m1,type=m)
 #  numDF denDF  F-value p-value
 #(Intercept)  1  1789 63384.80  .0001
 #group 445  1.29  0.2893
 #x   1  1789 0.05  0.8226
 #I(x^2)1  1789 4.02  0.0451
 #group:x  4  1789 2.61  0.0341
 #group:I(x^2)   4  1789 4.37  0.0016
 
 anova.lme(m2,type=m)
 # numDF denDF  F-value p-value
 #(Intercept)  1  1789 59395.79  .0001
 #group 445  1.33  0.2725
 #x   1  1789 0.04  0.8379
 #I(x^2)1  1789 2.28  0.1312
 #group:x  4  1789 2.09  0.0802
 #group:I(x^2)   4  1789 2.81  0.0244
 
 Now, this is all well. But: my colleagues have been running the same data
 set using PROC MIXED in SAS and come up with substantially different results
 when comparing SAS default covariance structure (variance components) and
 AR1. Specifically, there is virtually no change in either test statistics or
 fitted values when using AR1 instead of Variance Components in SAS, which
 fits the observation that AIC values (in SAS) indicate both covariance
 structures fit data equally well. 
 
 This is not very satisfactory to me, and I would be interesting to know what
 is happening here. Realizing
 this might not be the correct forum for this question, I would like to ask
 you all if anyone would have any
 input as to what is going on here, e.g. am I setting up my model
 erroneously, etc.? 
 
 N.b. I have no desire to replicate SAS results, but I would most certainly
 be interested to know what could possibly explain  such a large discrepancy
 between the two platforms. Any suggestions greatly welcomed.
 
 (Data are located at:
 https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0)
 

Hmm, does SAS include a nugget effect perchance? At any rate, showing the SAS 
output (or some of it) might make it easier for someone to help.

Also, R-sig-ME is a better choice for discussions of mixed effects models.


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem Solved: optim fails when using arima

2015-02-11 Thread Albert Shuxiang Li
I am using arima(x, order=c(p,0,q)) function for my project, which deals
with a set of large differenced time series data, data size varies from
8000 to 7. I checked their stationarity before applying arima.
Occasionally, arima(x, order=c(p,0,q)) gives me error like following (which
stops script running):

Error in optim(init[mask], armafn, method = optim.method, Hessian = TRUE, :
non-finite finite-difference value [16]

The last [16] would change anyting from 1 to 16. Using argument
method=CSS, or ML, or default did not help. I am using the newest R
version 3.1.2 for windows 7.

I have done a lot of research on internet for this Error Message, and tried
a lot of suggested solutions too. But the results are negative. Then,
finally, I used following line which solved my problem.

arima(x, order=c(p,0,q), optim.method=Nelder-Mead)

Hope this helps others with similar situations.

Shuxiang Albert Li

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Impute time-series data, perhaps with a Kalman filter - do you know of any R code?

2015-02-11 Thread John Sorkin
Does anyone have code that uses a Kalman filter to impute time-series data? If 
not, do you know of any software that can be used to impute time-series data?
Thank you,
John
John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 

Confidentiality Statement:
This email message, including any attachments, is for the sole use of the 
intended recipient(s) and may contain confidential and privileged information. 
Any unauthorized use, disclosure or distribution is prohibited. If you are not 
the intended recipient, please contact the sender by reply email and destroy 
all copies of the original message. 
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read.table with missing data and consecutive delimiters

2015-02-11 Thread Rui Barradas

Hello,

You're missing a dollar sign: 2$$$5, not 2$$5.

Hope this helps,

Rui Barradas

Em 11-02-2015 14:53, Tim Victor escreveu:

All,

Assume we have data in an ASCII file that looks like

Var1$Var2$Var3$Var4
1$2$3$4
2$$5
$$$6

When I execute

read.table( 'test.dat', header=TRUE, sep='$' )

I, of course, receive the following error:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
  :
   line 2 did not have 4 elements

When I set fill=TRUE, e.g., read.table( 'test.dat', header=TRUE, sep='$',
fill=TRUE )

I get:

   Var1 Var2 Var3 Var4
11234
22   NA5   NA
3   NA   NA   NA6


What I need is

   Var1 Var2 Var3 Var4
11234
22   NA   NA5
3   NA   NA   NA6

What am I missing?

Thanks,

Tim

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] prediction intervals for robust regression

2015-02-11 Thread Prof Brian Ripley

On 11/02/2015 19:38, Bert Gunter wrote:

Presumably you've checked out:

http://cran.r-project.org/web/views/Robust.html

If you can estimate the variance of parameter estimates, betahat, then
you can estimate the variance of a predicted value, X betahat; add the
estimated variance of individuals to this if that's what you're
looking for (and it's not already available).


But that's not too much use without some idea of the error distribution, 
and using robust statistics assumes it is non-normal, long-tailed.  And 
it is unusual to have enough data to estimate the tail behaviour of such 
a distribution.


It might be better to do this with a parametric model with a long-tailed 
error distribution, especially if there is evidence (e.g. from other 
samples) about the latter.



Further questions should go to a statistics site like
stats.stackexchange.com, as statistical questions are off topic here.


Agreed.




Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
Clifford Stoll




On Wed, Feb 11, 2015 at 11:03 AM, Burns, Jonathan (NONUS)
jonathan.bur...@gdit.com wrote:

I have created robust regression models using least trimmed squares and 
MM-regression (using the R package robustbase).

I am now looking to create prediction intervals for the predicted results.  
While I have seen some discussion in the literature about confidence intervals 
on the estimates for robust regression, I haven't had much success in finding 
out how to create prediction intervals for the results.  I was wondering if 
anyone would be able to provide some direction on how to create these 
prediction intervals in the robust regression setting.

Thanks,

Jonathan Burns
Sr. Statistician
General Dynamics Information Technology
Medicare  Medicaid Solutions
One West Pennsylvania Avenue
Baltimore, MD 21204
(410)-842-1594
jonathan.bur...@gdit.commailto:jonathan.bur...@gdit.com
www.gdit.comhttp://www.gdit.com/



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] read.table with missing data and consecutive delimiters

2015-02-11 Thread Tim Victor
All,

Assume we have data in an ASCII file that looks like

Var1$Var2$Var3$Var4
1$2$3$4
2$$5
$$$6

When I execute

read.table( 'test.dat', header=TRUE, sep='$' )

I, of course, receive the following error:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
 :
  line 2 did not have 4 elements

When I set fill=TRUE, e.g., read.table( 'test.dat', header=TRUE, sep='$',
fill=TRUE )

I get:

  Var1 Var2 Var3 Var4
11234
22   NA5   NA
3   NA   NA   NA6


What I need is

  Var1 Var2 Var3 Var4
11234
22   NA   NA5
3   NA   NA   NA6

What am I missing?

Thanks,

Tim

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] $ operator is invalid for atomic vectors

2015-02-11 Thread JS Huang
Hi, 

  If x is a data frame, then x$getmean will try to get the vector named
getmean in x.  You put () after x$getmean.  I think r is confused about
it.  It appears that you want to call a function named getmean().



--
View this message in context: 
http://r.789695.n4.nabble.com/operator-is-invalid-for-atomic-vectors-tp4703112p4703114.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] AR1 covariance structure for lme object and R/SAS differences in model output

2015-02-11 Thread anord
Dear R users, 
We are working on a data set in which we have measured repeatedly a
physiological response variable (y) 
every 20 min for 12 h (time variable; 'x') in subjects ('id') beloning to
one of five groups ('group'; 'A' to 'E'). Data are located at:
https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0

We are interested to model if the response in y differences with time (i.e.
'x') for the two groups. Thus:
require(nlme)
m1-lme(y~group*x+group*I(x^2),random=~x|id,data=data.df,na.action=na.omit)

But because data are collected repeatedly over short time intervals for each
subject, it seemed prudent to consider an autoregressive covariance
structure. Thus:
m2-update(m1,~.,corr=corCAR1(form=~x|id))

AIC values indicate the latter (i.e. m2) as more appropriate:
  anova(m1,m2)
#   Model df  AIC  BIC   logLikTest  L.Ratio 
p-value
#m1 1 19 2155.996 2260.767 -1058.9981
#m2 2 20 2021.944 2132.229  -990.9718 1 vs 2 136.0525  .0001

Fixed effects and test statistics differ between models. A look at marginal
ANOVA tables suggest inference might differ somewhat between models:
  
anova.lme(m1,type=m)
#  numDF denDF  F-value p-value
#(Intercept)  1  1789 63384.80  .0001
#group 445  1.29  0.2893
#x   1  1789 0.05  0.8226
#I(x^2)1  1789 4.02  0.0451
#group:x  4  1789 2.61  0.0341
#group:I(x^2)   4  1789 4.37  0.0016

anova.lme(m2,type=m)
# numDF denDF  F-value p-value
#(Intercept)  1  1789 59395.79  .0001
#group 445  1.33  0.2725
#x   1  1789 0.04  0.8379
#I(x^2)1  1789 2.28  0.1312
#group:x  4  1789 2.09  0.0802
#group:I(x^2)   4  1789 2.81  0.0244

Now, this is all well. But: my colleagues have been running the same data
set using PROC MIXED in SAS and come up with substantially different results
when comparing SAS default covariance structure (variance components) and
AR1. Specifically, there is virtually no change in either test statistics or
fitted values when using AR1 instead of Variance Components in SAS, which
fits the observation that AIC values (in SAS) indicate both covariance
structures fit data equally well. 

This is not very satisfactory to me, and I would be interesting to know what
is happening here. Realizing
this might not be the correct forum for this question, I would like to ask
you all if anyone would have any
input as to what is going on here, e.g. am I setting up my model
erroneously, etc.? 

N.b. I have no desire to replicate SAS results, but I would most certainly
be interested to know what could possibly explain  such a large discrepancy
between the two platforms. Any suggestions greatly welcomed.

(Data are located at:
https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0)

With all best wishes,
Andreas






--
View this message in context: 
http://r.789695.n4.nabble.com/AR1-covariance-structure-for-lme-object-and-R-SAS-differences-in-model-output-tp4703103.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Download internet videos

2015-02-11 Thread MacQueen, Don
Hmmm.

Well, a youtube video is a file. Therefore search for R download file
and you will find the download.file() function.


-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 2/11/15, 8:17 AM, Raoni Rodrigues caciquesamu...@gmail.com wrote:

Hello R-helpers,

It is possible donwload youtube videos with R? I made a google search and
find no options to do that.

Thanks in advanced,

Raoni

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] suggestion for optimal plotting to show significant differences

2015-02-11 Thread Richard M. Heiberger
Petr,

My first attempt is to use the simple=TRUE argument to interaction2wt.

Then the bwplots in the item|item panel show the behavior of value over day
for each item.  You get a plot similar to this panel with the growth curve plots
from nlme, for example,
bwplot(value ~ day | item, data=test, horizontal=FALSE)
I am treating set as a replication and each box is cumulated over the
three sets.

My analysis question is about day.  You have it as numeric.  My
inclination would
be to make day factor.  Then you could model the interaction of day and item.

Rich

On Mon, Feb 9, 2015 at 6:01 AM, PIKAL Petr petr.pi...@precheza.cz wrote:
 Hallo Richard.

 I tried your suggestion but it seems to be no better than simple ggplot. Let 
 me extend the example a bit to 8 items which is more realistic.

 item-rep(letters[1:8], each=18)
 day-rep((0:5)*100, 24)
 set-rep(rep(1:3, each=6), 8)
 test-data.frame(item, day, set)
 set.seed(111)
 test$value-(test$day/100+1)+rnorm(144)
 test$value-test$value+(as.numeric(test$item)*1.3)

 Value is increasing during time (day) for each tested subject (item), each 
 item is measured 3 times (set) each day.

 Here is some graph
 p-ggplot(test, aes(x=day, y=value, colour=item))
 p+geom_point()+stat_smooth(method=lm, formula= y~poly(x,2))

 I can do lm or aov, however I am not sure about proper formula.

 fit-lm(value~day, data=test)
 summary(fit)
 # this shows that value is increasing with day

 fit-lm(value~day/item, data=test)
 summary(fit)
 # this suggests that value is decreasing with day (which is wrong)

 fit-lm(value~day*item, data=test)
 summary(fit)
 # and this tells me that value is increasing with day and items have 
 different intercepts but the same rate of growth (I hope I got it right).

 I do not have your book available but I went through help pages.

 Your interaction graph is not much better than ggplot.
 I can do

 interaction2wt(value ~ item * day, data=test)

 which probably is closer to actual problem.

 The basic problem is that increase of value with days is in fact not linear 
 and actually it can increase in the beginning and then stagnate or it can 
 stagnate in beginning and then increase. I am not aware of any way how to 
 compare time behaviour of different items in such situations if I cannot 
 state some common formula in which case I would use probably nlme.

 Thank for your insight, I try to go through it more deeply.

 Best regards
 Petr


 -Original Message-
 From: Richard M. Heiberger [mailto:r...@temple.edu]
 Sent: Friday, February 06, 2015 6:14 PM
 To: PIKAL Petr
 Cc: r-help@r-project.org
 Subject: Re: [R] suggestion for optimal plotting to show significant
 differences

 I would try one of these illustrations for starts.
 interaction2wt (two-way tables) is designed to be used with aov() for
 testing.
 interaction2wt shows all main effects and all two-way interactions for
 many factors.



 test -
 structure(list(item = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c(A, B), class =
 factor), day = c(0L, 100L, 200L, 300L, 400L, 500L, 0L, 100L, 200L,
 300L, 400L, 500L, 0L, 100L, 200L, 300L, 400L, 500L, 0L, 100L, 200L,
 300L, 400L, 500L, 0L, 100L, 200L, 300L, 400L, 500L, 0L, 100L, 200L,
 300L, 400L, 500L), set = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
 2L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
 2L, 3L, 3L, 3L, 3L, 3L, 3L), value = c(1.08163365169503,
 2.61998412608805, 3.07820466606394, 4.44993419381934, 5.29163171545805,
 6.29155990999293, -0.123163011367676, 2.07767236834003,
 2.32537052874901, 3.09372794501084, 6.65273721166635, 5.92304962329131,
 1.50504697705548, 2.66253728086866, 2.63420157418685, 2.78195098580416,
 6.47578642973288, 5.89587443775143, 0.848864231485078,
 1.27549677119713, 2.19573089053609, 2.45659926134292, 5.15424403414103,
 5.4813151140983, 1.25731482647214, 2.09662105167973, 1.75954023316977,
 4.81624002288939, 4.65029189325307, 6.39946904227214,
 0.944996929887344, 1.74667265331284, 2.42956264345558,
 5.17852980415141, 3.5453435965834, 6.9011238437191)), .Names =
 c(item, day, set, value), row.names = c(NA, -36L), class =
 data.frame)



 library(HH)

 test$set - factor(test$set)
 test$day - factor(test$day)
 test$item - factor(test$item)

 interaction2wt(value ~ item * day * set, data=test)

 test$item.day - interaction(test$item, test$day)
 position(test$item.day) - outer(c(-10,10),
 as.numeric(levels(test$day)), `+`)

 xyplot(value ~ as.position(item.day) | set, groups=item,
 data=test, horizontal=FALSE, pch=c(17,16),
 xlab=day,
 scales=list(
   x=list(
 alternating=1,
 at=levels(test$day), ## placement of tick labels and marks
 tck=1)),
 key=list(
   text=list(c(A,B), col=c(blue,red)),
   points=list(pch=c(17, 16), col=c(blue,red)),
space=top, 

Re: [R] Nonlinear integer programming question

2015-02-11 Thread JS Huang
Hi Rebecca,

  It will be very helpful if you can provide a set of specific functions
g(x), h(x) and m(x).



--
View this message in context: 
http://r.789695.n4.nabble.com/Nonlinear-integer-programming-question-tp4703122p4703128.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] piecewise regression and lm() question

2015-02-11 Thread Goldschneider, Jill
I was playing with some examples of piecewise regression using lm() and have 
come across a behavior I'm uncertain about.
Below is simple 3-segment dataset.  I compare predicted output of a model 
created by one call to lm() to that of 3 models created by 3 calls to lm().
In case A and B, the results are the same.  However, in case C the results 
differ for the middle segment.
Is the output of lm() for case C to be expected or not and if so, why?
Thank you,
Jill

set.seed(133)
y - c(21:30, rep(15,10), 10:1) + runif(30, -2, 2)
x - 1:30

# Case A: 3 segments, each fit with a constant value
plot(x, y)
# 3 different lm fits
p.c3 - c(predict(lm(y[1:10]~1)), predict(lm(y[11:20]~1)), 
predict(lm(y[21:30]~1)))
lines(x, p.c3, col=red)
# piecewise via lm
p.lm1 - predict(lm(y ~ I(x=10) + I(x10  x=20) + I(x20)))
lines(x, p.lm1, col=blue)
max(abs(p.c3-p.lm1))

# Case B: 3 segments, each fit with a line
plot(x, y)
# 3 different lm fits
p.c3 - c(predict(lm(y[1:10]~x[1:10])), predict(lm(y[11:20]~x[11:20])), 
predict(lm(y[21:30]~x[21:30])))
lines(x, p.c3, col=red)
# piecewise via lm
p.lm1 - predict(lm(y ~ (x=10)*x + (x10  x=20)*x + (x20)*x))
lines(x, p.lm1, col=blue)
max(abs(p.c3-p.lm1))

# Cast C: 3 segments - the middle fit with a constant value, the outer by a line
plot(x, y)
# 3 different lm fits
p.c3 - c(predict(lm(y[1:10]~x[1:10])), predict(lm(y[11:20]~1)), 
predict(lm(y[21:30]~x[21:30])))
lines(x, p.c3, col=red)
# piecewise via lm
p.lm1 - predict(lm(y ~ (x=10)*x + I(x10  x=20) + (x20)*x))
lines(x, p.lm1, col=blue)
max(abs(p.c3-p.lm1))
## the single call to lm does not have a constant value fit to the middle 
section
plot(x, p.c3-p.lm1 )



The information contained in this transmission may contain confidential 
information.  If the reader of this message is not the intended recipient, you 
are hereby notified that any review, dissemination, distribution or duplication 
of this communication is strictly prohibited.  If you are not the intended 
recipient, please contact the sender by reply email and destroy all copies of 
the original message.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] AR1 covariance structure for lme object and R/SAS differences in model output

2015-02-11 Thread Ben Bolker
anord andreas.nord at biol.lu.se writes:

 


  [snip snip]

 We are working on a data set in which we have measured repeatedly a
 physiological response variable (y) 
 every 20 min for 12 h (time variable; 'x') in subjects ('id') beloning to
 one of five groups ('group'; 'A' to 'E'). Data are located at:
 https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0
 
 We are interested to model if the response in y differences 
 with time (i.e.
 'x') for the two groups. Thus:
 require(nlme)
 m1-lme(y~group*x+group*I(x^2),random=~x|id,
   data=data.df,na.action=na.omit)
 
 But because data are collected repeatedly over 
 short time intervals for each
 subject, it seemed prudent to consider an autoregressive covariance
 structure. Thus:
 m2-update(m1,~.,corr=corCAR1(form=~x|id))
 
 AIC values indicate the latter (i.e. m2) as more appropriate:
   anova(m1,m2)
 #   Model df  AIC  BIC   logLikTest  L.Ratio 
 p-value
 #m1 1 19 2155.996 2260.767 -1058.9981
 #m2 2 20 2021.944 2132.229  -990.9718 1 vs 2 136.0525  .0001
 
 Fixed effects and test statistics differ between models.
  A look at marginal
 ANOVA tables suggest inference might differ somewhat between models:
 
 anova.lme(m1,type=m)
 #  numDF denDF  F-value p-value
 #(Intercept)  1  1789 63384.80  .0001
 #group 445  1.29  0.2893
 #x   1  1789 0.05  0.8226
 #I(x^2)1  1789 4.02  0.0451
 #group:x  4  1789 2.61  0.0341
 #group:I(x^2)   4  1789 4.37  0.0016
 
 anova.lme(m2,type=m)
 # numDF denDF  F-value p-value
 #(Intercept)  1  1789 59395.79  .0001
 #group 445  1.33  0.2725
 #x   1  1789 0.04  0.8379
 #I(x^2)1  1789 2.28  0.1312
 #group:x  4  1789 2.09  0.0802
 #group:I(x^2)   4  1789 2.81  0.0244
 
 Now, this is all well. But: my colleagues have been running 
 the same data
 set using PROC MIXED in SAS and come up with 
 substantially different results
 when comparing SAS default covariance structure (variance components) and
 AR1. Specifically, there is virtually no change 
 in either test statistics or
 fitted values when using AR1 instead of Variance Components in SAS, which
 fits the observation that AIC values (in SAS) indicate both covariance
 structures fit data equally well. 
 
 This is not very satisfactory to me, and I would be 
 interesting to know what
 is happening here. Realizing
 this might not be the correct forum for this question, I would like to ask
 you all if anyone would have any
 input as to what is going on here, e.g. am I setting up my model
 erroneously, etc.? 
 
 N.b. I have no desire to replicate SAS results, but I would most certainly
 be interested to know what could possibly explain  
 such a large discrepancy
 between the two platforms. Any suggestions greatly welcomed.
 
 (Data are located at:
 https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0)

  Could you repost this on r-sig-mixed-mod...@r-project.org ?

It would be useful to see some of the comparisons (fixed effects,
RE variance-covariance, correlation parameter) between SAS and
R; are they actually fitting the same model?  (e.g., does SAS
allow for covariance between the slope and intercept random effects?)
Maybe they're getting the same estimates but computing df/p-values
in different ways?

  I thought I would try this with orthogonal polynomials in case
some of the fits were unstable ...

data.df - read.csv2(ar1_data.csv)
library(nlme)
m1 - lme(y~group*x+group*I(x^2),random=~x|id,
  data=data.df,na.action=na.omit,method=ML)
## use method=ML so we can compare orthog. and non-orthog. polynomials
m1B - update(m1,.~group*poly(x,2))
m2 - update(m1,corr=corCAR1(form=~x|id))
m2B - update(m1B,corr=corCAR1(form=~x|id))
AIC(m1,m1B,m2,m2B)

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] piecewise regression and lm() question

2015-02-11 Thread Duncan Murdoch
On 11/02/2015 4:30 PM, Goldschneider, Jill wrote:
 I was playing with some examples of piecewise regression using lm() and have 
 come across a behavior I'm uncertain about.
 Below is simple 3-segment dataset.  I compare predicted output of a model 
 created by one call to lm() to that of 3 models created by 3 calls to lm().
 In case A and B, the results are the same.  However, in case C the results 
 differ for the middle segment.
 Is the output of lm() for case C to be expected or not and if so, why?

Take a look at the fit value, and you'll see what happened:

 lm(y ~ (x=10)*x + I(x10  x=20) + (x20)*x)

Call:
lm(formula = y ~ (x = 10) * x + I(x  10  x = 20) + (x  20) *
x)

Coefficients:
(Intercept)  x = 10TRUE
x  I(x  10  x = 20)TRUE
35.0741 -15.0733
-0.1644 -17.7344
 x  20TRUEx = 10TRUE:x x:x  20TRUE
 NA   1.2397  -0.9658

The * in a formula means main effect plus interaction, not
multiplication.  W  hat you want is


lm(y ~ I((x=10)*x) + I(x10  x=20) + I((x20)*x))

This doesn't give exactly the same results as the segmented regression,
because it uses the same intercept in all three segments; you might want
to add I(x = 10) as well if you don't want that.

Duncan Murdoch

 Thank you,
 Jill
 
 set.seed(133)
 y - c(21:30, rep(15,10), 10:1) + runif(30, -2, 2)
 x - 1:30
 
 # Case A: 3 segments, each fit with a constant value
 plot(x, y)
 # 3 different lm fits
 p.c3 - c(predict(lm(y[1:10]~1)), predict(lm(y[11:20]~1)), 
 predict(lm(y[21:30]~1)))
 lines(x, p.c3, col=red)
 # piecewise via lm
 p.lm1 - predict(lm(y ~ I(x=10) + I(x10  x=20) + I(x20)))
 lines(x, p.lm1, col=blue)
 max(abs(p.c3-p.lm1))
 
 # Case B: 3 segments, each fit with a line
 plot(x, y)
 # 3 different lm fits
 p.c3 - c(predict(lm(y[1:10]~x[1:10])), predict(lm(y[11:20]~x[11:20])), 
 predict(lm(y[21:30]~x[21:30])))
 lines(x, p.c3, col=red)
 # piecewise via lm
 p.lm1 - predict(lm(y ~ (x=10)*x + (x10  x=20)*x + (x20)*x))
 lines(x, p.lm1, col=blue)
 max(abs(p.c3-p.lm1))
 
 # Cast C: 3 segments - the middle fit with a constant value, the outer by a 
 line
 plot(x, y)
 # 3 different lm fits
 p.c3 - c(predict(lm(y[1:10]~x[1:10])), predict(lm(y[11:20]~1)), 
 predict(lm(y[21:30]~x[21:30])))
 lines(x, p.c3, col=red)
 # piecewise via lm
 p.lm1 - predict(lm(y ~ (x=10)*x + I(x10  x=20) + (x20)*x))
 lines(x, p.lm1, col=blue)
 max(abs(p.c3-p.lm1))
 ## the single call to lm does not have a constant value fit to the middle 
 section
 plot(x, p.c3-p.lm1 )
 
 
 
 The information contained in this transmission may contain confidential 
 information.  If the reader of this message is not the intended recipient, 
 you are hereby notified that any review, dissemination, distribution or 
 duplication of this communication is strictly prohibited.  If you are not the 
 intended recipient, please contact the sender by reply email and destroy all 
 copies of the original message.
 
 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] simple question - mean of a row of a data.frame

2015-02-11 Thread Matthew Keller
Hi all,

Simple question I should know: I'm unclear on the logic of why the sum of a
row of a data.frame returns a valid sum but the mean of a row of a
data.frame returns NA:

sum(rock[2,])
[1] 10901.05

mean(rock[2,],trim=0)
[1] NA
Warning message:
In mean.default(rock[2, ], trim = 0) :
  argument is not numeric or logical: returning NA

I get that rock[2,] is itself a data.frame of mode list, but why the
inconsistency between functions? How can you figure this out from, e.g.,
?mean
?sum

Thanks in advance,

Matt


-- 
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simple question - mean of a row of a data.frame

2015-02-11 Thread Kehl Dániel
Hi,

rock[2,] is a data frame and you should not use sum() on a data frame, first 
google hit for the error message gives

http://stackoverflow.com/questions/19697498/r-beginner-argument-is-not-numeric-or-logical-returning-na

Otherwise I think you should use

?rowSums and ?rowMeans if you have numeric data frames.

HTH,
daniel

Feladó: R-help [r-help-boun...@r-project.org] ; meghatalmaz#243;: Matthew 
Keller [mckellerc...@gmail.com]
Küldve: 2015. február 11. 23:49
To: r help
Tárgy: [R] simple question - mean of a row of a data.frame

Hi all,

Simple question I should know: I'm unclear on the logic of why the sum of a
row of a data.frame returns a valid sum but the mean of a row of a
data.frame returns NA:

sum(rock[2,])
[1] 10901.05

mean(rock[2,],trim=0)
[1] NA
Warning message:
In mean.default(rock[2, ], trim = 0) :
  argument is not numeric or logical: returning NA

I get that rock[2,] is itself a data.frame of mode list, but why the
inconsistency between functions? How can you figure this out from, e.g.,
?mean
?sum

Thanks in advance,

Matt


--
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Subsetting data with svyglm

2015-02-11 Thread Brennan O'Banion
I am aware that it is possible to specify a subset with a single
logical operator when constructing a model, such as:
svyglm(formula, design=data, subset=variable==value).

What I can't figure out is how to specify a subset with two or more
logical operators:
svyglm(formula, design=data, subset=variable==value a|value b).

Is it possible to specify a subset in this way using *glm without
having to, in my case, subset the original data, create a survey
design, and then fit a model?

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] grofit issues with replicates - probit or logit or glmm

2015-02-11 Thread Marcelo Laia
Hello

I tried use grofit package in our data set. We provide a subset of our
data with X iso, and 4 doses, and insect died was count each day for
long 5 days. We started with Y insects per dishes. When one is dead, it
was counted and removed. Died insect is cumulative in the next days.
i.e. day 1 died 1. day 2 no died, so, day 2 is assigned 1 died (from day
1).

Here is the script:

library(lattice)
library(grofit)
library(repmis)

url.csv - 
https://dl.dropboxusercontent.com/u/34009642/cepajabo07_wide_acumulado.csv

data02 - read.table(url.csv, header=TRUE, sep=\t, dec=,)

head(data02)

timepoints - 1:5 # 5 days
time - t(matrix(rep(timepoints, 120), c(5, 120))) # 5 days and 120 experiments
   # (6 iso * 4 doses
   # * 5 rep)
time

TestRun1$drFit
TestRun2$drFit

colData - c(black, cyan, magenta, blue)

plot(TestRun1$gcFit, opt = s, colData = colData, colSpline = 1, 
 pch = 1:4, cex = 1)

plot(TestRun2$gcFit, opt = s, colData = colData, colSpline = 1, 
 pch = 1:4, cex = 1)

plot(TestRun1$drFit$drFittedSplines[[1]], colData = colData, 
 pch = 1:4, cex = 1)

plot(TestRun2$drFit$drFittedSplines[[1]], colData = colData, 
 pch = 1:4, cex = 1)

The problem: grofit didn't deal with replicates and do a curve for each
ones.

Is it a way to get response curve with the replicates?

We are interested in LD50, and dose response curve, and graphs.

Any suggestion is very welcome!

Thank you!

-- 
Marcelo

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to read this Rpart decision tree?

2015-02-11 Thread Kim C.
Hi all, 

In the attachment or this link (http://oi58.tinypic.com/35ic9qc.jpg) you'll 
find the decision tree I made. I used the Rpart package to make the tree and 
the rattle package using the fancyRpartPlot to plot it. The data in the tree 
looks different than about every example I have seen before. I don't understand 
how I should read it. I want to predict Product (which are productkeys). The 
variables to predict it contain age, incomegroup, gender, totalchildren, 
education, occupation, houseownerflag, numberCars.It looks like the upper 
number is a ProductKey. The n is number of observations? And the percentage 
of the yes/no question below. 

This is the code I used.  

 ss.rpart1 - rpart(Product ~ ., data=sstrain, 
 control=rpart.control(minbucket=2,minsplit=1, cp=-1))
 spt - which.min(ss.rpart1$cptable[, xerror])
 scp - ss.rpart1$cptable[opt, CP]
 ss.rpart2 - prune(ss.rpart1, cp=cp)
 fancyRpartPlot(ss.rpart2)

So why does the tree looks so different from the most (for example: 
http://media.tumblr.com/a9f482ff88b0b9cfaffca7ffd46c6a8e/tumblr_inline_mz7pyuaYJQ1s5wtly.png).
 This is from Trevor Stephen's TItanic tutorial. The first node show that 62% 
of 100% doesn't survive. If they were male, only 19% of them were survivors. I 
find that a lot examples look like that. Why does mine predict per ProductKey 
and every node it has something else. it doesn't make sense to me. And it 
doesn't have the two numbers like .62 and .38 but it has n=197e+3. So should I 
read the first node like For 100% of the observations of ProductKey 1074, the 
incomegroup was moderate)?

Thank you!

Kim


  __
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Mixed-effects model for pre-post randomization design

2015-02-11 Thread Marco Barbàra
DeaR userRs,

I recently read this Liang-Zeger article:

http://sankhya.isical.ac.in/search/62b1/fpaper7.html

in which (among other things) they adopt a random intercept model for
pre-post designed trials, using a conditional likelihood approach
(I didn't think it possible with only two measurements per subject)

I'm trying to figure out (if and) how it is possible to reproduce
straightforwardly their model using R standard mixed model tools, but
I cannot even try to reproduce their work, since they used a
non-available dataset (I found an extract on prof. Diggle's web site
where it is explicitly reported to be confidential), so I have to
review a bit of likelihood theory along with some implementation
details.

In the meantime, I wonder if anyone here could point out any related
documentation to me.

Thank you.
Marco.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to read this Rpart decision tree

2015-02-11 Thread Therneau, Terry M., Ph.D.

First:
   summary(ss.rpart1)
or summary(ss.rpart, file=whatever)

The printout will be quite long since your tree is so large, so the second form may be 
best followed by a perusal of the file with your favorite text editor.  The file name of 
whatever above should be something you choose, of course.   This will give you a full 
description of the tree.  Read the first node or two very carefully so that you understand 
what the fit did.
  Plotting routines for trees have to make display choices, since there simply is not 
enough space available to list all the details.  You have a complicated endpoint with at 
least 14 different products.  The predicted value for the each node of the tree is a 
vector of percentages (one per product, adds to one); plots often show only the name of 
the most frequent.  The alive/dead endpoint for the Titanic data is a lot easier to fit 
into a little plotting oval so of course the plotted tree is easier to grasp.


Terry T.



On 02/11/2015 05:00 AM, r-help-requ...@r-project.org wrote:

Hi all,

In the attachment or this link (http://oi58.tinypic.com/35ic9qc.jpg) you'll find the 
decision tree I made. I used the Rpart package to make the tree and the rattle package 
using the fancyRpartPlot to plot it. The data in the tree looks different than about 
every example I have seen before. I don't understand how I should read it. I want to 
predict Product (which are productkeys). The variables to predict it contain age, 
incomegroup, gender, totalchildren, education, occupation, houseownerflag, numberCars.It 
looks like the upper number is a ProductKey. The n is number of observations? 
And the percentage of the yes/no question below.

This is the code I used.


ss.rpart1 - rpart(Product ~ ., data=sstrain, 
control=rpart.control(minbucket=2,minsplit=1, cp=-1))
spt - which.min(ss.rpart1$cptable[, xerror])
scp - ss.rpart1$cptable[opt, CP]
ss.rpart2 - prune(ss.rpart1, cp=cp)
fancyRpartPlot(ss.rpart2)

So why does the tree looks so different from the most (for 
example:http://media.tumblr.com/a9f482ff88b0b9cfaffca7ffd46c6a8e/tumblr_inline_mz7pyuaYJQ1s5wtly.png).
 This is from Trevor Stephen's TItanic tutorial. The first node show that 62% of 100% 
doesn't survive. If they were male, only 19% of them were survivors. I find that a lot 
examples look like that. Why does mine predict per ProductKey and every node it has 
something else. it doesn't make sense to me. And it doesn't have the two numbers like .62 
and .38 but it has n=197e+3. So should I read the first node like For 100% of the 
observations of ProductKey 1074, the incomegroup was moderate)?

Thank you!

Kim


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] upgrading issues with Rcpp

2015-02-11 Thread arnaud gaboury
gabx@hortensia [R] sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
LC_MONETARY=en_US.UTF-8
 [6] LC_MESSAGES=en_US.UTF-8LC_PAPER=en_US.UTF-8   LC_NAME=C
   LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] utils base

loaded via a namespace (and not attached):
[1] tools_3.1.2


gabx@hortensia [R] search()
[1] .GlobalEnvtools:rstudio package:utils package:base
--

Now when trying to update Rcpp:

gabx@hortensia [R] install.packages(Rcpp)
.
Error in unloadNamespace(pkg_name) :
  namespace ‘Rcpp’ is imported by ‘reshape2’, ‘plyr’, ‘dplyr’ so
cannot be unloaded
..

I can't upgrade my packages because of Rcpp =0.11.3 needed (running
actually 0.11.2)

What is this namespace imported by  'reshape2’, ‘plyr’, ‘dplyr’ ? How
can I get rid of it and upgrade my packages ?

Thank you for hints
-- 

google.com/+arnaudgabourygabx

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] upgrading issues with Rcpp

2015-02-11 Thread Dirk Eddelbuettel
arnaud gaboury arnaud.gaboury at gmail.com writes:
 Now when trying to update Rcpp:
 
 gabx at hortensia [R] install.packages(Rcpp)
 .
 Error in unloadNamespace(pkg_name) :
   namespace ‘Rcpp’ is imported by ‘reshape2’, ‘plyr’, ‘dplyr’ so
 cannot be unloaded
 ..
 
 I can't upgrade my packages because of Rcpp =0.11.3 needed (running
 actually 0.11.2)
 
 What is this namespace imported by  'reshape2’, ‘plyr’, ‘dplyr’ ? How
 can I get rid of it and upgrade my packages ?

I'd do 

  $ R --vanilla  # to ensure nothing is loaded
   install.packages(Rcpp) # possibly set a repo

but what I really do is to use scripts update.r, install.r, install2.r, ...
from package littler which makes as this so much easier -- and do it on the 
command-line with empty R sessions.

Dirk



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Subsetting data with svyglm

2015-02-11 Thread Anthony Damico
hi brennan, survey design objects can be subsetted with the same subset()
syntax as data.frame objects, so following jeff's advice maybe you want

svyglm( formula , design = subset( surveydesign , variable %in% c( 'value
a' , 'value b' ) ) )

for some examples of how to construct a survey design with public use data,
see http://github.com/ajdamico/usgsd


On Wed, Feb 11, 2015 at 11:49 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us
wrote:

 This seems like a fundamental  misunderstanding on your part of how
 operators, and in particular logical expressions, work in computer
 languages. Consider some examples:

 1+2 has a numeric answer because 1 and 2 are both numeric.
 1+a has at the very least not a numeric answer because the values on
 either side of the + sign are not both numeric.
 TRUE | FALSE  has a logical type of answer because both sides of the
 logical or operator are logical.
 However, you are expressing something like
 TRUE | a string which might mean something but that something generally
 is not a logical type of answer.

 Try
 variable==value a | variable==value b
 or
 variable %in% c( value a, value b )

 You would probably find that the Introduction to R document that comes
 with R has some enlightening examples in it. You might also find Pat Burns'
 The R Inferno entertaining as well (search for it in your favorite search
 engine).
 ---
 Jeff NewmillerThe .   .  Go Live...
 DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live
 Go...
   Live:   OO#.. Dead: OO#..  Playing
 Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
 /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
 ---
 Sent from my phone. Please excuse my brevity.

 On February 11, 2015 8:42:58 PM EST, Brennan O'Banion 
 brennan.oban...@gmail.com wrote:
 I am aware that it is possible to specify a subset with a single
 logical operator when constructing a model, such as:
 svyglm(formula, design=data, subset=variable==value).
 
 What I can't figure out is how to specify a subset with two or more
 logical operators:
 svyglm(formula, design=data, subset=variable==value a|value b).
 
 Is it possible to specify a subset in this way using *glm without
 having to, in my case, subset the original data, create a survey
 design, and then fit a model?
 
 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subsetting data with svyglm

2015-02-11 Thread Jeff Newmiller
This seems like a fundamental  misunderstanding on your part of how operators, 
and in particular logical expressions, work in computer languages. Consider 
some examples:

1+2 has a numeric answer because 1 and 2 are both numeric.
1+a has at the very least not a numeric answer because the values on either 
side of the + sign are not both numeric.
TRUE | FALSE  has a logical type of answer because both sides of the logical 
or operator are logical.
However, you are expressing something like
TRUE | a string which might mean something but that something generally is 
not a logical type of answer.

Try
variable==value a | variable==value b
or
variable %in% c( value a, value b )

You would probably find that the Introduction to R document that comes with R 
has some enlightening examples in it. You might also find Pat Burns' The R 
Inferno entertaining as well (search for it in your favorite search engine).
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

On February 11, 2015 8:42:58 PM EST, Brennan O'Banion 
brennan.oban...@gmail.com wrote:
I am aware that it is possible to specify a subset with a single
logical operator when constructing a model, such as:
svyglm(formula, design=data, subset=variable==value).

What I can't figure out is how to specify a subset with two or more
logical operators:
svyglm(formula, design=data, subset=variable==value a|value b).

Is it possible to specify a subset in this way using *glm without
having to, in my case, subset the original data, create a survey
design, and then fit a model?

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.