Re: [R] dependent column(s) in data frame

2014-02-21 Thread PQuery
Hi,
Thanks for the reply, I will wait a couple of days and eventually post
elsewhere unless I find the solution myself.
Best.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to estimate the CI in glm()?

2014-02-21 Thread Cao Zongfu
Hi

I have fitted a model and estimated the risk difference (RD) for the
interaction term, but I don't know how to estimate the standard error and
confidence interval for the RD. The code is as follows:
 table(dat$A3)
# 0  1
#167762   
table(dat$A17)
# 0  1  2  3
#129982  23429  15880   2915
table(dat$A10)
# 0  1
# 39871 132335
table(dat$A10,dat$A17)
#        0      1      2      3
#  0  29833   5626   3685    727
#  1 100149  17803  12195   2188
fm.add = glm(A3~ A10 + A17 +A10*A17,data=dat, family=binomial(link =
"identity"))
coefs.fm.add = summary(fm.add)$ coefficients
coefs.fm.add
                 Estimate  Std. Error     z value      Pr(>|z|)
(Intercept)  0.0343244059 0.001054068  32.5637454 1.337835e-232
A101        -0.0125967800 0.001150347 -10.9504155  6.614351e-28
A171         0.0070904537 0.002857888   2.4810117  1.310101e-02
A172        -0.0039309187 0.003017991  -1.3024951  1.927472e-01
A173        -0.0123161528 0.005542338  -2.2221944  2.627017e-02
A101:A171    0.0006713323 0.003160276   0.2124284  8.317728e-01
A101:A172    0.0066395373 0.003357877   1.9773022  4.800748e-02
A101:A173    0.0180108304 0.006566514   2.7428297  6.091227e-03

#estimate 'Risk difference of A10 when A10 = 1,A17=1'
rd.1.1= coefs.fm.add[2,1] + coefs.fm.add[3,1] + coefs.fm.add[6,1];
#estimate 'Risk difference of A10 when A10 = 1,A17=2'
rd.1.2= coefs.fm.add[2,1] + coefs.fm.add[4,1] + coefs.fm.add[7,1];
#estimate 'Risk difference of A10 when A10 = 1,A17=3'
rd.1.3= coefs.fm.add[2,1] + coefs.fm.add[5,1] + coefs.fm.add[8,1];
> rd.1.1
[1] -0.004834994
> rd.1.2
[1] -0.009888161
> rd.1.3
[1] -0.006902102

Who can help me? Thanks.
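
One possible approach, sketched here under stated assumptions: a sum of
coefficients is a linear combination w'b, and its variance is w' V w with
V = vcov(fit), which gives a Wald-type standard error and interval. The
example uses simulated two-level factors, not the poster's `dat`, and the
names `w`, `est`, `se` and `ci` are made up for illustration. For the
eight-coefficient model above, rd.1.1 would correspond to
w = c(0, 1, 1, 0, 0, 1, 0, 0).

```r
# Simulated stand-in for 'dat' (illustrative only, not the real data)
set.seed(1)
n   <- 4000
A10 <- factor(rbinom(n, 1, 0.5))
A17 <- factor(rbinom(n, 1, 0.5))
p   <- 0.30 + 0.10 * (A10 == "1") + 0.05 * (A17 == "1")
dat <- data.frame(A3 = rbinom(n, 1, p), A10 = A10, A17 = A17)

fm <- glm(A3 ~ A10 * A17, data = dat, family = binomial(link = "identity"))

# One weight per coefficient, in the order of coef(fm):
# (Intercept), A101, A171, A101:A171
w <- c(0, 1, 0, 1)   # effect of A10 (1 vs 0) within A17 == 1

est <- sum(w * coef(fm))                       # point estimate of w'b
se  <- sqrt(drop(t(w) %*% vcov(fm) %*% w))     # sqrt(w' V w)
ci  <- est + c(-1, 1) * qnorm(0.975) * se      # 95% Wald interval

c(estimate = est, se = se, lower = ci[1], upper = ci[2])
```

The same three lines work for any other weight vector, so each risk
difference only needs its own `w`.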


-- 
Zongfu Cao
Peking Union Medical College



Re: [R] Data manipulation in a data.frame

2014-02-21 Thread ioanna ioannou
Thank you very much. One further question. 

Assuming that for some points there is no classification for example:

A<-data.frame(A=c(10,100,1000,30,50,60,300,3),
              B=c(0,1,1,1,0,0,0,0),
              C=c(0,0,0,0,1,1,0,0),
              D=c(1,0,0,0,0,0,1,0))

Is there an easy way to introduce an extra "none" option in the variable?

A<-data.frame(A=c(10,100,1000,30,50,60,300,3),
              B=c(0,1,1,1,0,0,0,0),
              C=c(0,0,0,0,1,1,0,0),
              D=c(1,0,0,0,0,0,1,0),
              Variable=c("D","B","B","B","C","C","D","none"))

Thanks in advance, 
IOanna

-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: 21 February 2014 00:19
To: r-help@r-project.org
Cc: ioanna ioannou
Subject: Re: [R] Data manipulation in a data.frame

Also,
rownames(which(t(!!A[,-1]),arr.ind=TRUE))
A.K.




On Thursday, February 20, 2014 6:48 PM, arun  wrote:
Hi,
May be this helps:

A$Variable <- rep(colnames(A[,-1]),nrow(A))[t(!!A[,-1])]
A.K.



On Thursday, February 20, 2014 5:55 PM, ioanna ioannou 
wrote:
Hello,





Assuming that I have a data frame 

A<-data.frame(A=c(10,100,1000,30,50,60,300),

              B=c(0,1,1,1,0,0,0),                        

              C=c(0,0,0,0,1,1,0),

              D=c(1,0,0,0,0,0,1))



What I would like is to introduce a new column Variable such that:



A<-data.frame(A=c(10,100,1000,30,50,60,300),

              B=c(0,1,1,1,0,0,0),                        

              C=c(0,0,0,0,1,1,0),

              D=c(1,0,0,0,0,0,1),

       Variable=c(D,B,B,B,C,C,D)) 



How can I do it?



Best 

IOanna




[R] Assigning & function to variable

2014-02-21 Thread Rainer M Krug
Hi

I want to assign the function & and | to a variable, because I want to
specify as a function argument if inside the function & or | should be
used.

  link <- &

does not work, and

  link <- "&" 

results in the string "&" being assigned to link.

So how can I assign the logical function to the variable link, so that I
can do

TRUE link FALSE

Thanks,

Rainer


-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, 
UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug




[R] help with gettext() for translating text

2014-02-21 Thread Daniel Kelley
I’m wondering whether anyone can help me with a translation exercise.  I have a 
package named “oce”, which does oceanographic processing, and I’d like to make 
it produce graphs with labels that work in different languages.  For example, 
in English I write “Depth” and in Spanish I’d like to write “Profundidad”.

In my “po” directory I have R-oce.pot and es.po.  As a first step, I’ve 
translated just the phrases “Depth (m)” and “Depth [m]”.  Then I build and 
installed my package.  Within R,

library(oce)
bindtextdomain("R-oce”)

yields

 [1] "/Library/Frameworks/R.framework/Versions/3.0/Resources/library/oce/po”

and then, in the OSX shell,

msgunfmt 
/Library/Frameworks/R.framework/Versions/3.0/Resources/library/oce/po/es/LC_MESSAGES/R-oce.mo
 | grep -1 Depth

yields
msgid "Depth (m)"
msgstr "Profundidad (m)"
--
--
msgid "Depth [m]"
msgstr "Profundidad [m]”

so it seems that I have successfully installed the translations.  Then, I run R 
from the shell with

LC_MESSAGES=es_ES.UTF-8 R --no-save < spanish.R

where spanish.R consists of

library(oce)
cat(gettext("Depth (m)"), "\n")
and I get

...
R es un software libre y viene sin GARANTIA ALGUNA.
...

(i.e. a Spanish introduction paragraph from R) which suggests that my 
env-variable is OK, followed by

> library(oce)
Loading required package: mapproj
Loading required package: maps
> cat(gettext("Depth (m)"), "\n")
Depth (m)

which, obviously, has not translated the text.  A similar test with

LANG=es_ES.UTF-8 R --no-save < spanish.R

yields the same results.

All of this is with R 3.0.2 on an Apple OSX platform (Mavericks); session info 
is below.

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] es_ES.UTF-8/es_ES.UTF-8/es_ES.UTF-8/C/es_ES.UTF-8/es_ES.UTF-8
attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base
other attached packages:
[1] oce_0.9-14mapproj_1.2-1 maps_2.3-2


QUESTION: any hints on how I can get the translations to be passed through 
gettext()?

Thanks!


Dan E. Kelley, Professor  and Graduate Coordinator
Oceanography Department, Dalhousie University
PO BOX 15000
Halifax, NS B3H 4R2
phone:(902)494-1694 fax:(…)-3877 dan.kel...@dal.ca
http://oceanography.dal.ca/person/Kelley_Dan.html
http://graduatecoordinator.oceanography.dal.ca/




Re: [R] Assigning & function to variable

2014-02-21 Thread Prof Brian Ripley

On 21/02/2014 10:07, Rainer M Krug wrote:

> Hi
>
> I want to assign the function & and | to a variable, because I want to
> specify as a function argument if inside the function & or | should be
> used.
>
>    link <- &
>
> does not work, and
>
>    link <- "&"
>
> results in the string "&" being assigned to link.


Use backticks: see ?Quotes.

link <- `&`

You can also make use of .Primitive.


> So how can I assign the logical function to the variable link, so that I
> can do
>
> TRUE link FALSE


Are you trying to ask how to make 'link' into a binary operator?  You 
cannot: user-defined binary operators are of the form %link% .
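
A short sketch of both points, for illustration: the operator is passed
around as an ordinary function, and an infix version has to follow the
%name% pattern. The function name `combine` and the vectors `a` and `b`
are made up here; match.fun() is used so that the argument may also be
given as a string.

```r
# Pass the operator as a function; backticks make `&` an ordinary name
combine <- function(x, y, link = `&`) {
  link <- match.fun(link)   # also accepts the string "&" or "|"
  link(x, y)
}

a <- c(TRUE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE)

combine(a, b)               # element-wise AND
combine(a, b, link = `|`)   # element-wise OR
combine(a, b, link = "|")   # same operator, looked up by name
combine(a, b, link = xor)   # any function returning a logical vector

# An infix form has to use the %name% pattern
`%link%` <- `&`
TRUE %link% FALSE
```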









--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Assigning & function to variable

2014-02-21 Thread Rainer M Krug
Prof Brian Ripley  writes:

> On 21/02/2014 10:07, Rainer M Krug wrote:
>> Hi
>>
>> I want to assign the function & and | to a variable, because I want to
>> specify as a function argument if inside the function & or | should be
>> used.
>>
>>link <- &
>>
>> does not work, and
>>
>>link <- "&"
>>
>> results in the string "&" being assigned to link.
>
> Use backticks: see ?Quotes.
>
> link <- `&`
>

Thanks - that works perfectly.

> You can also make use of .Primitive.
>
>> So how can I assign the logical function to the variable link, so that I
>> can do
>>
>> TRUE link FALSE
>
> Are you trying to ask how to make 'link' into a binary operator?  You
> cannot: user-defined binary operators are of the form %link% .

Initially I wanted,  but using the form link(x, y) makes the argument
more powerful as user defined functions can be supplied which return
logical vectors.

Thanks a lot,

Rainer


-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, 
UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug




Re: [R] help with gettext() for translating text

2014-02-21 Thread Prof Brian Ripley

On 21/02/2014 10:40, Daniel Kelley wrote:

> I’m wondering whether anyone can help me with a translation exercise.  I have a 
> package named “oce”, which does oceanographic processing, and I’d like to make 
> it produce graphs with labels that work in different languages.  For example, 
> in English I write “Depth” and in Spanish I’d like to write “Profundidad”.
>
> In my “po” directory I have R-oce.pot and es.po.  As a first step, I’ve 
> translated just the phrases “Depth (m)” and “Depth [m]”.  Then I build and 
> installed my package.  Within R,
>
> library(oce)
> bindtextdomain("R-oce”)


I don't think you know what that does.  And do not post HTML (see the 
posting guide): you have ended up with an invalid directional quote in 
there.



> yields
>
>   [1] "/Library/Frameworks/R.framework/Versions/3.0/Resources/library/oce/po”
>
> and then, in the OSX shell,
>
> msgunfmt 
> /Library/Frameworks/R.framework/Versions/3.0/Resources/library/oce/po/es/LC_MESSAGES/R-oce.mo
>  | grep -1 Depth
>
> yields
> msgid "Depth (m)"
> msgstr "Profundidad (m)"
> --
> --
> msgid "Depth [m]"
> msgstr "Profundidad [m]”
>
> so it seems that I have successfully installed the translations.  Then, I run R 
> from the shell with
>
> LC_MESSAGES=es_ES.UTF-8 R --no-save < spanish.R
>
> where spanish.R consists of
>
> library(oce)
> cat(gettext("Depth (m)"), "\n")


Read the help:

 If ‘domain’ is ‘NULL’ or ‘""’, a domain is searched for based on
 the namespace which contains the function calling ‘gettext’ or
 ‘ngettext’.

You are not calling this from a namespace and so need to specify the 
domain rather than the default of NULL.


Example

gettext("empty model supplied")

> gettext("empty model supplied")
[1] "empty model supplied"
> Sys.setenv(LANGUAGE="fr")
> gettext("empty model supplied")
[1] "empty model supplied"
> gettext("empty model supplied", domain = "R-stats")
[1] "modèle fourni vide"
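
Applied to the script in question, this would mean naming the domain
explicitly. A minimal sketch, assuming the package's R-level domain is
"R-oce" (as the bindtextdomain() call suggests); the helper name `tr` is
made up for illustration. If no catalogue is found for the domain,
gettext() should simply hand back the English string unchanged.

```r
# Hypothetical helper: always look translations up in the package's domain
tr <- function(msg, domain = "R-oce") gettext(msg, domain = domain)

label <- tr("Depth (m)")
cat(label, "\n")
```

Code inside the package's own functions should not need this, because
there the domain is found from the calling namespace.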






--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Data manipulation in a data.frame

2014-02-21 Thread Bert Gunter
This is easy to do in the approach I showed.

Instead of:

> names(A)[-1][as.matrix(A[,-1])%*%(seq_len(ncol(A)-1))]

modify it to:

> c("none",names(A)[-1])[as.matrix(A[,-1])%*%seq_len(ncol(A)-1)+1]

[1] "D"    "B"    "B"    "B"    "C"    "C"    "D"    "none"
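
The one-liner can be unpacked step by step. The sketch below rebuilds the
example data, computes for each row the position of its single 1 among
the indicator columns (0 when there is none), and uses that as an index
into the labels. The intermediate names `ind` and `idx` are introduced
here for illustration, and the rowSums() check is an added safeguard,
since the index arithmetic assumes at most one 1 per row.

```r
A <- data.frame(A = c(10, 100, 1000, 30, 50, 60, 300, 3),
                B = c(0, 1, 1, 1, 0, 0, 0, 0),
                C = c(0, 0, 0, 0, 1, 1, 0, 0),
                D = c(1, 0, 0, 0, 0, 0, 1, 0))

ind <- as.matrix(A[, -1])            # indicator columns B, C, D
stopifnot(all(rowSums(ind) <= 1))    # the trick needs at most one 1 per row

# Row-wise position of the 1: 0 = none, 1 = B, 2 = C, 3 = D
idx <- drop(ind %*% seq_len(ncol(ind)))

A$Variable <- c("none", colnames(ind))[idx + 1]
A$Variable
```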

Cheers,
Bert



Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch






Re: [R] Data manipulation in a data.frame

2014-02-21 Thread arun
Hi IOanna,

Do you have rows with multiple '1's?  If not, you could also try:
A$Variable <- c("none",names(A)[-1])[1+with(A,B+2*C+3*D)]
A.K.






Re: [R] Detecting Vehicle locations using R

2014-02-21 Thread umair durrani
The problem has already been resolved. Please do not include this question in
future mailings of the list.

Umair Durrani

email: umairdurr...@outlook.com


> Subject: Re: [R] Detecting Vehicle locations using R
> From: jdnew...@dcn.davis.ca.us
> Date: Thu, 20 Feb 2014 20:06:28 -0800
> To: umairdurr...@outlook.com; r-help@r-project.org
> 
> Please read the Posting Guide, which offers several applicable tips, such as:
> Don't post in HTML format... it tends to corrupt your code samples.
> Please provide a hand-generated example result that should be what the 
> solution should transform your sample data into.
> Please show the code that did not work... you may be closer to the solution 
> than you think, or we may see from it that you could benefit from learning a 
> concept you don't know exists yet. This is not supposed to be a forum that 
> does your work for you.
> ---
> Jeff NewmillerThe .   .  Go Live...
> DCN:Basics: ##.#.   ##.#.  Live Go...
>   Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
> --- 
> Sent from my phone. Please excuse my brevity.
> 
> On February 20, 2014 6:30:37 PM PST, umair durrani  
> wrote:
> >I have a data frame of vehicle trajectories. Here's a snapshot:
> >> dput(head(df))
> >structure(list(vehicle = c(2L, 2L, 2L, 2L, 2L, 2L),
> >frame = 43:48, globalx = c(6451214.156, 6451216.824, 6451219.616,
> >6451222.548, 6451225.462, 6451228.376), class = c(2L, 2L, 2L, 2L,
> >2L, 2L), velocity = c(37.76, 37.9, 38.05, 38.18, 38.32, 38.44),
> >lane = c(2L, 2L, 2L, 2L, 2L, 2L)), .Names = c("vehicle", "frame",
> >"globalx", "class", "velocity", "lane"), row.names = c(NA, 6L), class =
> >"data.frame")
> >where, vehicle= vehicle id, frame= frame id of time frames in which it
> >was observed, globalx = x coordinate of the front center of the
> >vehicle, class=type of vehicle (1=motorcycle, 2=car, 3=truck),
> >velocity=speed of vehicles in feet per second, lane= lane number (there
> >are 6 lanes).The 'frame' represents one tenth of a second i.e. one
> >frame is 0.1 seconds long. At frame 't' the vehicle has globalx
> >coordinate x(t) and at frame 't-1' (0.1 seconds before) it was x(t-1).
> >If the reference location has globalx coordinate=6451179.1116 then I
> >simply want a new column in df called 'u' which has 'yes' in the row
> >where globalx of the vehicle was greater than reference coordinate at
> >'U' AND the previous consecutive globalx coordinate of this vehicle was
> >less than reference coordinate at 'U'(i.e. reference coordinate is
> >between the 2 locations of vehicle in two consecutive frames). This
> >means that if df has 100 vehicles then there will be 100 'yes' in 'u'
> >column because every vehicle will
> >meet the above criteria only once. I have tried to do this by running
> >the function with ifelse and also tried to do the same using a for loop
> >but it doesn't work for me.


Re: [R] Data manipulation in a data.frame

2014-02-21 Thread Bert Gunter
This merely translates the matrix multiplication I used into explicit
arithmetic!

Nor does it generalize without extra manipulation to get the correct
arithmetic expression.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch






Re: [R] Errorbar

2014-02-21 Thread Marc Girondot

Hi,
I entirely agree that the ggplot2 package is the perfect solution, but if 
you prefer old-fashioned base graphics, you can use the plot_errbar function 
from the package phenology:

plot_errbar(1:100, rnorm(100, 1, 2),
xlab="axe x", ylab="axe y", bty="n", xlim=c(1,100),
errbar.x=2, errbar.y=rnorm(100, 1, 0.1))

Marc Girondot

On 18/02/2014 16:57, Alzahrani, Ahmad K A wrote:

> Hi All,
>
> Can anyone show me how to add an error bar to my graphs? I am currently using 
> this code
>
> g<-ggplot(means,aes(x=variable,y=value))
>
> g<-g+geom_bar(stat="identity")+facet_wrap(~Site+Season)
> g
>
> Thanks,
>
> Akaalz



[R] Venn diagram with numbers within each class

2014-02-21 Thread Luigi Marongiu
Dear all,

I would like to draw a Venn plot for data represented by 6 variables. I
know how to do this using the package venneuler (which requires rJava).
However this package does not report the numbers of elements within each
class.

Do you know how to display the number of elements using either this or
other packages?
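
Independently of the plotting package, the numbers that belong in each
region can be tabulated directly from the 0/1 indicator vectors. A
minimal sketch with a small made-up data frame (`sets` and its three
columns are illustrative only, not the six vectors in this post):

```r
# Toy membership indicators: one row per element, one column per set
sets <- data.frame(a = c(1, 1, 1, 0, 0, 0, 0, 0),
                   b = c(0, 1, 1, 1, 1, 0, 0, 0),
                   c = c(0, 0, 1, 0, 1, 1, 0, 0))

# Total size of each set
set_sizes <- colSums(sets)

# Label each element by the exact combination of sets it belongs to
region <- apply(sets == 1, 1,
                function(r) paste(names(sets)[r], collapse = "&"))
region[region == ""] <- "none"

# Number of elements in each (exclusive) region of the diagram
region_counts <- table(region)
region_counts
```

These counts could then be placed on the plot by hand, for example with
text(), or passed to a package that accepts region sizes.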

Best wishes,

Luigi



library(venneuler)

library(rJava)



a<-c(   1, 1, 1, 1, 1, 1,
   1,
1, 1, 1, 1, 1, 1, 1,
0,0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0,0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0,0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0,0, 0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0,0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0,0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0)

b<-c(   0, 0, 0, 0, 0, 0,
   0,
0, 0, 0, 0, 0, 0, 0,
1,1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1,
1, 1,1, 1, 1, 1, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0,0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0,0, 0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0,0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0,0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0)

c<-c(   0, 0, 0, 0, 0, 0,
   0,
0, 0, 0, 0, 0, 0, 0,
0,1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0,0, 0, 0, 0, 0,
0, 0, 1, 1, 1, 1, 1,
1, 1, 1,1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1,1, 1, 1,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0,0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0,0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0)

d<-c(   0, 0, 0, 0, 0, 0,
   0,
0, 0, 0, 0, 0, 0, 0,
0,0, 0, 0, 1, 1, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0,0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 1,0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0,0, 0, 0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0,0, 1,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1,0,
0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0)

e<-c(   1, 1, 1, 0, 0, 0,
   0,
0, 0, 0, 0, 0, 0, 0,
1,1, 0, 1, 0, 0, 1,
1, 1, 1, 1, 1, 1, 1,
1, 0,0, 1, 1, 1, 1,
1,  


[R] [e1071] Features that are factors when exporting a model with write.svm

2014-02-21 Thread Matthew Wood
I have a trained SVM that I want to export with write.svm and
eventually use in libSVM. Some of my features are factors. Standard
libSVM only works with features that are doubles, so I need to figure
out how my features should be represented and used.

How does e1071 treat factors in an SVM? For feature "foo" with values
"a" and "b" I'm assuming it's something like foo_a (0 or 1) and foo_b
(0 or 1). Is that right?

Do factors get treated differently in an SVM? If I convert the factors
to integers for libSVM, I'll lose the information that a feature
doesn't take on a range of values. Is that going to cause problems? I
don't know if the model takes that into account.

When using write.svm a scale file is also output. My scale file is
missing the same number of rows as I have features that are factors.
That's another indication to me that the factors are causing issues.

Thanks.



[R] shapiro.test

2014-02-21 Thread Gonzalo Villarino Pizarro
Dear R users,
Please help with this possibly basic question. I am trying to check whether
my data are normally distributed, but the file is large and the test does
not work. I keep getting the message: "Error in shapiro.test(x =
HP_TrinityK25$V2) : sample size must be between 3 and 5000"
Thanks!

 shapiro.test(x=HP_TrinityK25$V2)
Error in shapiro.test(x = HP_TrinityK25$V2) : sample size must be between 3
and 5000

##Note:
HP_TrinityK25= my file
HP_TrinityK25$V2= data in my file



Re: [R] [e1071] Features that are factors when exporting a model with write.svm

2014-02-21 Thread Matthew Wood
I may have been able to answer my own questions by reading the e1071
source. It looks like the features are just converted to doubles with
as.double(x). And, I haven't found where in the code yet, but it looks
like it's not scaling the factors which explains why I'm missing rows
in the scale file.
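
For reference, the two encodings discussed in this thread can be compared in base R. This is only a sketch with a made-up factor `foo`; it does not claim to reproduce what e1071 or write.svm do internally, it just shows how integer level codes differ from 0/1 indicator columns.

```r
# Hypothetical three-level feature, for illustration only
foo <- factor(c("a", "b", "a", "c"))

# Integer level codes: what as.double() returns for a factor
codes <- as.double(foo)

# Indicator (0/1) columns, one per level, built with model.matrix()
dummies <- model.matrix(~ foo - 1)

print(codes)
print(dummies)
```

With level codes a learner sees "c" as numerically larger than "a"; with indicator columns no ordering is implied.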

On Fri, Feb 21, 2014 at 1:50 PM, Matthew Wood  wrote:
> I have a trained SVM that I want to export with write.svm and
> eventually use in libSVM. Some of my features are factors. Standard
> libSVM only works with features that are doubles, so I need to figure
> out how my features should be represented and used.
>
> How does e1071 treat factors in an SVM? For feature "foo" with values
> "a" and "b" I'm assuming it's something like foo_a (0 or 1) and foo_b
> (0 or 1). Is that right?
>
> Do factors get treated differently in an SVM? If I convert the factors
> to integers for libSVM, I'll lose the information that a feature
> doesn't take on a range of values. Is that going to cause problems? I
> don't know if the model takes that into account.
>
> When using write.svm a scale file is also output. My scale file is
> missing the same number of rows as I have features that are factors.
> That's another indication to me that the factors are causing issues.
>
> Thanks.



Re: [R] dependent column(s) in data frame

2014-02-21 Thread David Winsemius

On Feb 19, 2014, at 11:19 AM, PQuery wrote:

> Dear all,
> 
> I have a data frame with a status column and some condition columns. (a dput
> of part of it is listed below).
> I would like to know if:
> 
> 1) There are more chances to have a "status" of "1" when more than one
> conditions have the value of "1" ?   
> 
> 2) The "status" column is depending on any one or a combination of the
> condition columns
> Say, do I have a status of "1" whenever condition 2 & 3 (or only condition
> 2) are met ?
> 
> Do you know what type of analysis one can use to do that ?
> 
> Thanks in advance,
> P
> 
> 
> dput(df)
> structure(list(status = c(0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 0L,
> 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L,
> 0L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 0L), cond.1 = c(0L, 0L, 0L, 1L,
> 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L,
> 1L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L,
> 0L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L), cond.2 = c(1L,
> 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 1L,
> 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L,
> 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L,
> 1L), cond.3 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 1L,
> 0L, 0L, 1L, 0L, 0L, 0L), cond.4 = c(0L, 0L, 0L, 1L, 0L, 1L, 0L,
> 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L,
> 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L), cond.5 = c(0L, 0L,
> 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L,
> 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L,
> 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L
> )), .Names = c("status", "cond.1", "cond.2", "cond.3", "cond.4",
> "cond.5"), row.names = c(NA, -50L), class = "data.frame")
> 

with(df, table(status=status, comb23 = cond.2&cond.3)  )
      comb23
status FALSE TRUE
     0    33    1
     1    11    5

The more general approach to analyzing binary responses is logistic regression.
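
A minimal sketch of that suggestion follows. It uses a small simulated data frame, `toy`, rather than the poster's `df`, and the effect sizes in the simulation are arbitrary assumptions chosen only so that the model has something to find.

```r
set.seed(123)
n <- 200

# Simulated binary conditions (hypothetical stand-ins for cond.2 and cond.3)
toy <- data.frame(cond.2 = rbinom(n, 1, 0.5),
                  cond.3 = rbinom(n, 1, 0.3))

# Simulated status whose log-odds rise with both conditions (assumed effects)
toy$status <- rbinom(n, 1, plogis(-1.5 + 1.0 * toy$cond.2 + 1.5 * toy$cond.3))

# Logistic regression with main effects and their interaction
fit <- glm(status ~ cond.2 * cond.3, data = toy, family = binomial)

print(summary(fit)$coefficients)
print(exp(coef(fit)))   # coefficients on the odds-ratio scale
```

The interaction term addresses the "condition 2 and 3 together" part of the question; the main effects address each condition on its own.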

-- David.
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/dependent-column-s-in-data-frame-tp4685561.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA



Re: [R] Venn diagram with identification numbers

2014-02-21 Thread David Winsemius

On Feb 21, 2014, at 11:00 AM, Luigi Marongiu wrote:

> Dear all,
> 
> I would like to draw a Venn plot for data represented by 6 variables. I
> know how to do this using the package venneuler (which requires rJava).
> However this package does not report the numbers of elements within each
> class.
> 

Labeling is already in the VennDiagram package, and I recently answered a 
question on SO asking to include percentages as well:

http://stackoverflow.com/questions/21715009/adding-percents-to-venn-diagrams-in-r/21717547#21717547

Now admittedly this was only for the set intersections of three sets. The 
output of `venneuler` is not compatible with that approach, but it does appear 
that there is a `draw.quintuple.venn` function in VennDiagram.
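
Independently of any plotting package, the number of elements in each region can be tabulated in base R. The sketch below uses three short made-up indicator vectors (not the poster's data) and labels each element by the sets it belongs to; the same idea extends to six variables.

```r
# Hypothetical short indicator vectors standing in for the poster's data
a  <- c(1, 1, 0, 0, 1, 0, 1)
b  <- c(0, 1, 1, 0, 1, 0, 0)
cc <- c(0, 0, 1, 1, 1, 0, 0)

m <- cbind(a = a, b = b, c = cc)

# Label each element (row) by the sets it belongs to, e.g. "a&b"
region <- apply(m, 1, function(r) paste(colnames(m)[r == 1], collapse = "&"))
region[region == ""] <- "none"

# Number of elements per region
counts <- table(region)
print(counts)
```

These counts could then be added to a plot by hand, for instance with text().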

-- 
David.

> Do you know how to display the number of elements using either this or
> other packages?
> 
> Best wishes,
> 
> Luigi
> 
> 
> 
>library(venneuler)
> 
>library(rJava)
> 
> 
> 
> a<-c(   1, 1, 1, 1, 1, 1,
>   1,
> 1, 1, 1, 1, 1, 1, 1,
> 0,0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0,0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0,0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0,0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0,0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0,0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0)
> 
> b<-c(   0, 0, 0, 0, 0, 0,
>   0,
> 0, 0, 0, 0, 0, 0, 0,
> 1,1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1,
> 1, 1,1, 1, 1, 1, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0,0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0,0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0,0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0,0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0)
> 
> c<-c(   0, 0, 0, 0, 0, 0,
>   0,
> 0, 0, 0, 0, 0, 0, 0,
> 0,1, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0,0, 0, 0, 0, 0,
> 0, 0, 1, 1, 1, 1, 1,
> 1, 1, 1,1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1,1, 1, 1,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0,0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0,0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0)
> 
> d<-c(   0, 0, 0, 0, 0, 0,
>   0,
> 0, 0, 0, 0, 0, 0, 0,
> 0,0, 0, 0, 1, 1, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0,0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 1,0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0,0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 

Re: [R] R plot type

2014-02-21 Thread Rui Barradas

Hello,

The answer is yes, it is possible. I don't know how to plot a curved 
arrow but the rest should be possible to do using


?plot.default
?lines
?text

And please post to R-Help; the odds of getting more and better answers 
are greater there.


Hope this helps,

Rui Barradas

On 21-02-2014 20:58, catalin roibu wrote:

> Dear Rui,
> Is there a possibility to create a plot like this in R?
>
> Thank you very much!
>
> [Inline image 1]
>
> --
> ---
> Catalin-Constantin ROIBU
> Lecturer PhD, Forestry engineer
> Forestry Faculty of Suceava
> Str. Universitatii no. 13, Suceava, 720229, Romania
> office phone +4 0230 52 29 78, ext. 531
> mobile phone   +4 0745 53 18 01
> +4 0766 71 76 58
> FAX:+4 0230 52 16 64
> silvic.usv.ro




Re: [R] shapiro.test

2014-02-21 Thread Rui Barradas

Hello,

Not answering directly to your question, if the sample size is a 
documented problem with shapiro.test and you want a normality test, why 
don't you use ?ks.test?


m <- mean(HP_TrinityK25$V2)
s <- sd(HP_TrinityK25$V2)

ks.test(HP_TrinityK25$V2, "pnorm", m, s)


Hope this helps,

Rui Barradas

On 21-02-2014 15:59, Gonzalo Villarino Pizarro wrote:

> Dear R users,
> Please help with with this maybe basic question. I am trying to see if my
> data is normal but is a large file and the test does not work.
> I keep getting the message : "Error in shapiro.test(x = HP_TrinityK25$V2)
> :  sample size must be between 3 and 5000"
> thanks!
>
>   shapiro.test(x=HP_TrinityK25$V2)
> Error in shapiro.test(x = HP_TrinityK25$V2) : sample size must be between 3
> and 5000
>
> ##Note:
> HP_TrinityK25= my file
> HP_TrinityK25$V2= data in my file



Re: [R] shapiro.test

2014-02-21 Thread Greg Snow
Rui,

Note this quote from the last paragraph of the Details section of ?ks.test:

"If a single-sample test is used, the parameters specified in '...'
 must be pre-specified and not estimated from the data."

Which is the exact opposite of your example.



Gonzalo,

Why are you testing your data for normality?  For large sample sizes
the normality tests often give a meaningful answer to a meaningless
question (for small samples they give a meaningless answer to a
meaningful question).

If you really feel the need for a p-value then
SnowsPenultimateNormalityTest in the TeachingDemos package will work
for large sample sizes.  But note that the documentation for that
function is considered more useful than the function itself.



On Fri, Feb 21, 2014 at 3:04 PM, Rui Barradas  wrote:
> Hello,
>
> Not answering directly to your question, if the sample size is a documented
> problem with shapiro.test and you want a normality test, why don't you use
> ?ks.test?
>
> m <- mean(HP_TrinityK25$V2)
> s <- sd(HP_TrinityK25$V2)
>
> ks.test(HP_TrinityK25$V2, "pnorm", m, s)
>
>
> Hope this helps,
>
> Rui Barradas
>
> On 21-02-2014 15:59, Gonzalo Villarino Pizarro wrote:
>
>> Dear R users,
>> Please help with with this maybe basic question. I am trying to see if my
>> data is normal but is a large file and the test does not work.
>> I keep getting the message : "Error in shapiro.test(x = HP_TrinityK25$V2)
>> :  sample size must be between 3 and 5000"
>> thanks!
>>
>>   shapiro.test(x=HP_TrinityK25$V2)
>> Error in shapiro.test(x = HP_TrinityK25$V2) : sample size must be between
>> 3
>> and 5000
>>
>> ##Note:
>> HP_TrinityK25= my file
>> HP_TrinityK25$V2= data in my file
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] shapiro.test

2014-02-21 Thread Rolf Turner

On 22/02/14 11:04, Rui Barradas wrote:

> Hello,
>
> Not answering directly to your question, if the sample size is a
> documented problem with shapiro.test and you want a normality test, why
> don't you use ?ks.test?
>
> m <- mean(HP_TrinityK25$V2)
> s <- sd(HP_TrinityK25$V2)
>
> ks.test(HP_TrinityK25$V2, "pnorm", m, s)


Strictly speaking this is not a valid test.  The KS test is used for 
testing against a *completely specified* distribution.  If there are 
parameters to be estimated, the null distribution is no longer 
applicable.  This may not be a "real" problem if the parameters are 
*well* estimated, as they would be in this instance (given that the 
sample size is over-large).  I'm not sure about this.


The "Lilliefors" test is theoretically available in this context when
mu and sigma are estimated, but according to the Wikipedia article, the 
Lilliefors distribution is not known analytically and the critical 
values must be determined by Monte Carlo methods.  There is a 
"LillieTest" function in the "DescTools" package which makes use of some 
approximations to get p-values.
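
The Monte Carlo idea can be sketched in base R as follows. This is an illustration under stated assumptions, not the DescTools implementation: the function names `ks_stat_est` and `lillie_mc` are made up, the data are simulated, and the number of replicates `B` is arbitrary. It relies on the fact that, once mean and sd are estimated from each sample, the null distribution of the KS distance does not depend on the true mean and sd, so standard normal samples suffice.

```r
# KS distance from a normal whose mean and sd are estimated from the sample
ks_stat_est <- function(y) {
  unname(ks.test(y, "pnorm", mean(y), sd(y))$statistic)
}

# Monte Carlo p-value: compare the observed distance with distances from
# simulated normal samples of the same size, re-estimating parameters each time
lillie_mc <- function(x, B = 500) {
  d_obs  <- ks_stat_est(x)
  d_null <- replicate(B, ks_stat_est(rnorm(length(x))))
  (1 + sum(d_null >= d_obs)) / (B + 1)
}

set.seed(1)
x_norm <- rnorm(300, mean = 10, sd = 3)   # simulated normal data
x_skew <- rexp(300)                       # simulated, clearly non-normal data

p_norm <- lillie_mc(x_norm)
p_skew <- lillie_mc(x_skew)
print(c(normal = p_norm, skewed = p_skew))
```
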


However I think that a better approach would be to use a chi-squared 
goodness of fit test whereby you can adjust for estimated parameters 
simply by reducing the degrees of freedom.  I believe that the 
chi-squared test is somewhat low in power, but with a very large sample 
this should not be a problem.


The difficulty with the chi-squared test is that the choice of "bins" is 
somewhat arbitrary.  I believe the best approach is to take the bin 
boundaries to be the quantiles of the normal distribution (with 
parameters "m" and "s") corresponding to equispaced probabilities on 
[0,1], with the number of such probabilities being k+1 where
k = floor(n/5), n being the sample size.  This makes the expected counts 
all equal to n/k >= 5 so that the chi-squared test is "valid".  The 
degrees of freedom are then k-3 (k - 1 - #estimated parameters).
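
The recipe in the paragraph above can be written out in a few lines of base R. The sketch uses simulated data in place of HP_TrinityK25$V2; the variable names are illustrative only.

```r
set.seed(2)
x <- rnorm(10000, mean = 5, sd = 2)   # simulated stand-in for the real data

n <- length(x)
k <- floor(n / 5)                     # number of bins
m <- mean(x)
s <- sd(x)

# k + 1 equispaced probabilities on [0, 1] give the bin boundaries;
# the first and last boundaries are -Inf and Inf
breaks <- qnorm(seq(0, 1, length.out = k + 1), mean = m, sd = s)

observed <- as.vector(table(cut(x, breaks = breaks)))
expected <- n / k                     # same expected count in every bin

stat    <- sum((observed - expected)^2 / expected)
df      <- k - 3                      # k - 1 - 2 estimated parameters
p_value <- pchisq(stat, df = df, lower.tail = FALSE)

print(c(statistic = stat, df = df, p.value = p_value))
```
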


One last comment:  I believe that it is generally considered that 
testing for normality is a waste of time and a pseudo-intellectual 
exercise of academic interest at best.


cheers,

Rolf Turner




> Hope this helps,
>
> Rui Barradas
>
> On 21-02-2014 15:59, Gonzalo Villarino Pizarro wrote:
>
>> Dear R users,
>> Please help with with this maybe basic question. I am trying to see if my
>> data is normal but is a large file and the test does not work.
>> I keep getting the message : "Error in shapiro.test(x = HP_TrinityK25$V2)
>> :  sample size must be between 3 and 5000"
>> thanks!
>>
>>   shapiro.test(x=HP_TrinityK25$V2)
>> Error in shapiro.test(x = HP_TrinityK25$V2) : sample size must be
>> between 3
>> and 5000
>>
>> ##Note:
>> HP_TrinityK25= my file
>> HP_TrinityK25$V2= data in my file



[R] How to add box layer using levelplot in R?

2014-02-21 Thread Jonsson
I would like to add some boxes with special extent to my plot.

Example

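 library(sp); library(raster)              # Spatial(), spsample(); raster(), extent()
 library(rasterVis); library(latticeExtra) # levelplot() for rasters; layer()
 gh <- raster()
 gh[] <- 1:ncell(gh)
 SP <- spsample(Spatial(bbox=bbox(gh)), 10, type="random")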
Then plot them

  levelplot(gh, col.regions = rev(terrain.colors(255)), cuts=254,
margin=FALSE) +
  layer(sp.points(SP, col = "red"))
This plots a map with several crosses in it, but I need to plot a box with
a given spatial extent:

 extent(gh) = extent(c(xmn=-180,xmx=180,ymn=-90,ymx=90))
  e6 <- extent( 2  , 8 , 45   , 51  )

I want to add e6 to the plot and put the number 2 inside the box. Any hints,
please?



--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-add-box-layer-using-levelplot-in-R-tp4685658.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] shapiro.test

2014-02-21 Thread Rolf Turner

On 22/02/14 11:53, Greg Snow wrote:





> Why are you testing your data for normality?  For large sample sizes
> the normality tests often give a meaningful answer to a meaningless
> question (for small samples they give a meaningless answer to a
> meaningful question).




Fortune!!!

cheers,

Rolf



Re: [R] help with gettext() for translating text

2014-02-21 Thread Daniel Kelley
Thanks.  This helped greatly.  I’m sorry about sending the html message, and 
hope this one is plain-text.  Dan.
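
For the archive, the fix described in the reply quoted below amounts to naming the translation domain explicitly when gettext() is called from outside a package namespace. The sketch uses "R-stats", which ships with R; the "R-oce" call is left as a comment because it assumes the oce package and its Spanish catalogue are installed. What is returned depends on the installed catalogues and the current language settings.

```r
# Called at top level, the default domain = NULL is not resolved to a
# particular package's catalogue, so the domain is named explicitly.
msg_default <- gettext("empty model supplied")
msg_stats   <- gettext("empty model supplied", domain = "R-stats")

print(msg_default)
print(msg_stats)

# Assumed equivalent for the package discussed in this thread:
# gettext("Depth (m)", domain = "R-oce")
```
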

On Feb 21, 2014, at 7:00 AM, Prof Brian Ripley  wrote:

> On 21/02/2014 10:40, Daniel Kelley wrote:
>> I’m wondering whether anyone can help me with a translation exercise.  I 
>> have a package named “oce”, which does oceanographic processing, and I’d 
>> like to make it produce graphs with labels that work in different languages. 
>>  For example, in English I write “Depth” and in Spanish I’d like to write 
>> “Profundidad”.
>> 
>> In my “po” directory I have R-oce.pot and es.po.  As a first step, I’ve 
>> translated just the phrases “Depth (m)” and “Depth [m]”.  Then I build and 
>> installed my package.  Within R,
>> 
>> library(oce)
>> bindtextdomain("R-oce”)
> 
> I don't think you know what that does.  And do not post HTML (see the posting 
> guide): you have ended up with an invalid directional quote in there.
> 
>> yields
>> 
>>  [1] "/Library/Frameworks/R.framework/Versions/3.0/Resources/library/oce/po”
>> 
>> and then, in the OSX shell,
>> 
>> msgunfmt 
>> /Library/Frameworks/R.framework/Versions/3.0/Resources/library/oce/po/es/LC_MESSAGES/R-oce.mo
>>  | grep -1 Depth
>> 
>> yields
>> msgid "Depth (m)"
>> msgstr "Profundidad (m)"
>> --
>> --
>> msgid "Depth [m]"
>> msgstr "Profundidad [m]”
>> 
>> so it seems that I have successfully installed the translations.  Then, I 
>> run R from the shell with
>> 
>> LC_MESSAGES=es_ES.UTF-8 R --no-save < spanish.R
>> 
>> where spanish.R consists of
>> 
>> library(oce)
>> cat(gettext("Depth (m)"), "\n")
> 
> Read the help:
> 
> If ‘domain’ is ‘NULL’ or ‘""’, a domain is searched for based on
> the namespace which contains the function calling ‘gettext’ or
> ‘ngettext’.
> 
> You are not calling this from a namespace and so need to specify the domain 
> rather than the default of NULL.
> 
> Example
> 
> gettext("empty model supplied")
> 
> > gettext("empty model supplied")
> [1] "empty model supplied"
> > Sys.setenv(LANGUAGE="fr")
> > gettext("empty model supplied")
> [1] "empty model supplied"
> > gettext("empty model supplied", domain = "R-stats")
> [1] "modèle fourni vide"
> 
>> and I get
>> 
>> ...
>> R es un software libre y viene sin GARANTIA ALGUNA.
>> ...
>> 
>> (i.e. a Spanish introduction paragraph from R) which suggests that my 
>> env-variable is OK, followed by
>> 
>>> library(oce)
>> Loading required package: mapproj
>> Loading required package: maps
>>> cat(gettext("Depth (m)"), "\n")
>> Depth (m)
>> 
>> which, obviously, has not translated the text.  A similar test with
>> 
>> LANG=es_ES.UTF-8 R --no-save < spanish.R
>> 
>> yields the same results.
>> 
>> All of this is with R 3.0.2 on an Apple OSX platform (Mavericks); session 
>> info is below.
>> 
>>> sessionInfo()
>> R version 3.0.2 (2013-09-25)
>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>> locale:
>> [1] es_ES.UTF-8/es_ES.UTF-8/es_ES.UTF-8/C/es_ES.UTF-8/es_ES.UTF-8
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>> other attached packages:
>> [1] oce_0.9-14mapproj_1.2-1 maps_2.3-2
>> 
>> 
>> QUESTION: any hints on how I can get the translations to be passed through 
>> gettext()?
>> 
>> Thanks!
>> 
>> 
>> Dan E. Kelley, Professor  and Graduate Coordinator
>> Oceanography Department, Dalhousie University
>> PO BOX 15000
>> Halifax, NS B3H 4R2
>> phone:(902)494-1694 fax:(…)-3877 dan.kel...@dal.ca
>> http://oceanography.dal.ca/person/Kelley_Dan.html
>> http://graduatecoordinator.oceanography.dal.ca/
>> 
>> 
>>  [[alternative HTML version deleted]]
>> 
>> 
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
> 
> -- 
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] PLM falling into dummy variable trap -- how to fix?

2014-02-21 Thread Andrew Crane-Droesch

**Apologies for cross-posting with Stack Overflow**

An example of the problem:

load(url('http://andrewcd.berkeley.edu/sdat'))
head(sdat)
library(plm)
fem = 
plm(y~T+G:t,data=sdat,effect="twoways",model="within",index=c("ID","t"))

summary(fem)
lsdvm = lm(y~ID+T+G:t,data=sdat)
summary(lsdvm)
fem$coef

`fem` is the fixed-effects model (fit with plm), and `lsdvm` is the 
equivalent least-squares dummy variable model (fit with lm).


It is clear that plm is estimating the coefficients, and indeed that the 
coefficients are identical in the two models, as they should be.  But 
when I go to summarize the results, plm is having a hard time, and I'm 
pretty sure that the reason is the timeXgroup fixed effects, some of 
which need to be auto-omitted because of the dummy variable trap.  (lm, 
for example, seems to know how to automatically remove variables that 
are exact linear combinations of each other).
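
The lm behaviour mentioned in parentheses can be seen on a toy example. The data below are made up (not `sdat`), and the redundant column `is_a` is constructed on purpose so that it is an exact linear combination of the intercept and the dummies for `g`.

```r
set.seed(42)
d <- data.frame(g = factor(rep(c("a", "b", "c"), each = 4)),
                y = rnorm(12))

# is_a equals 1 - gb - gc, so it is perfectly collinear with the other terms
d$is_a <- as.numeric(d$g == "a")

fit <- lm(y ~ g + is_a, data = d)

print(coef(fit))    # the aliased term is reported as NA
print(alias(fit))   # shows the linear dependency that caused it
```
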


How do I get around this?  I'd prefer to stay with plm, as it gives much 
more parsimonious output than lm with dummy variables for each 
cross-sectional unit.  plm is also convenient for lags.





Thanks,
Andrew



Re: [R] Data manipulation in a data.frame

2014-02-21 Thread arun
Hi Ioanna,
If you need to paste the colnames if there are multiple 1's per row:
You could try:
A<-data.frame(A=c(10,100,1000,30,50,60,300,3,4,2,20,35,45),B=c(0,1,1,1,0,0,0,0,0,1,0,0,1),C=c(0,0,0,0,1,1,0,0,0,0,1,1,1),D=c(1,0,0,0,0,0,1,0,0,1,NA,1,1))
apply(A[,-1],1,function(x) {x1 <-paste(colnames(A[,-1])[x & 
!is.na(x)],collapse=","); x1[x1=='']<- "none";x1})
#[1] "D" "B" "B" "B" "C" "C" "D" "none"  "none" 
#[10] "B,D"   "C" "C,D"   "B,C,D"



#or Bert's method with some modification:
 
c("none",names(A)[-1],"B,D","C,D","B,C,D")[c(as.matrix(!!A[,-1]&!is.na(A[,-1]))%*%seq_len(ncol(A)-1)+1)]
# [1] "D" "B" "B" "B" "C" "C" "D" "none"  "none" 
#[10] "B,D"   "C" "C,D"   "B,C,D"
  

But in this case you need to check which combinations are actually present
in the dataset; otherwise the lookup returns wrong labels (compare elements
10, 12 and 13 below with the output above).

For example:
 
c("none",names(A)[-1],apply(combn(LETTERS[2:4],2),2,paste,collapse=","),"B,C,D")[c(as.matrix(!!A[,-1]&!is.na(A[,-1]))%*%seq_len(ncol(A)-1)+1)]
# [1] "D"    "B"    "B"    "B"    "C"    "C"    "D"    "none" "none" "B,C" 
#[11] "C"    "B,D"  "C,D" 


A.K.




On Friday, February 21, 2014 4:20 PM, ioanna ioannou  wrote:
Hello Arun, 

Actually I do have rows with multiple 1s. Could you advise how to modify the
code then?

Thanks in advance, 

Best
IOanna



Re: [R] shapiro.test

2014-02-21 Thread Bert Gunter
Second!!

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch




On Fri, Feb 21, 2014 at 3:44 PM, Rolf Turner  wrote:
> On 22/02/14 11:53, Greg Snow wrote:
>
> 
>
>>
>> Why are you testing your data for normality?  For large sample sizes
>> the normality tests often give a meaningful answer to a meaningless
>> question (for small samples they give a meaningless answer to a
>> meaningful question).
>
>
> 
>
> Fortune!!!
>
> cheers,
>
> Rolf
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



[R] predict model time series

2014-02-21 Thread Marlin Keith Cox
I am using a gam for a model predictor and need to forecast into the future
based off data collected earlier.  I would like to predict fish weight on
day 10 from measures taken on a fish from day 2.  This seems simple, but
using predict, I have only been able to predict weight for a given day.  A
simple linear model may work as an example.  I would like to know what the
forecasted weight is of a fish on day 10, if the fish measured on day 2 was
30.

Days<-1:10
Weight<-c(12,24,30,45,51,62,73,80,98,103)
plot(Weight~Days)
model<-lm(Weight~Days)
pred<-predict(model,Days=2)

I will then hopefully use the logic and apply it to my gam model
predictions.
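
A sketch of how the linear example could be made to predict for a chosen day: predict() takes new predictor values through its `newdata` argument. The last step, which shifts the day-10 prediction by the fish's day-2 deviation, is only one possible assumption about how an individual measurement carries forward; it is not something established in this thread.

```r
Days   <- 1:10
Weight <- c(12, 24, 30, 45, 51, 62, 73, 80, 98, 103)
model  <- lm(Weight ~ Days)

# New predictor values go in `newdata`; an unmatched argument such as
# Days = 2 is silently ignored and all fitted values are returned instead
pred_day2  <- predict(model, newdata = data.frame(Days = 2))
pred_day10 <- predict(model, newdata = data.frame(Days = 10))

# Assumed adjustment: carry the day-2 deviation of a fish weighing 30
# forward unchanged to day 10
forecast_day10 <- unname(pred_day10 + (30 - pred_day2))

print(pred_day10)
print(forecast_day10)
```

The same `newdata` mechanism applies to predict() on a fitted gam object.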

Thanks, Keith


M. Keith Cox, Ph.D.
Principal
MKConsulting
17105 Glacier Hwy
Juneau, AK 99801
U.S. 907.957.4606



[R] Performance issue with attributes

2014-02-21 Thread Smart Guy
Hi All

I am having a problem running the 'attributes' command to set an attribute on
each column of a large dataset. The dataset has 80 columns and 312407 rows. It
is taking more than 60 seconds to set simple attributes like split=TRUE,
usermissing=FALSE.

Here is the source code, assuming Dataset1 is the one that is large :-

myfunction <- function()
{
cat("Before for loop:")
print(Sys.time())
for( colIndex in 1 : 80)
{
cat("Before Attr", colIndex)
print(Sys.time())

attributes(Dataset1[1]) <- c(attributes(Dataset1[, colIndex]), list(coldesc
= c(), usermissing = c(FALSE), missingvalues  = NULL, split = c(FALSE),
levelLabels = c("")))

cat("After Attr:")
print(Sys.time())
}
cat("After for loop:")
print(Sys.time())
}

My feeling is that R is passing all 312407 rows to set 'attributes' on a
column.

Is there a more efficient way to do this?


Thanks,
SG



Re: [R] Performance issue with attributes

2014-02-21 Thread Philippe Grosjean
You can use setattr() in the data.table package. It can be used too on 
data.frames or other objects.
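
As a base-R alternative (a sketch, not a benchmark), the attributes can be attached to every column in a single pass and assigned back once, instead of assigning into the data frame 80 times. The small `Dataset1` below is a made-up stand-in, and only the non-NULL attributes from the original post are set. With data.table installed, setattr(x, name, value) sets an attribute by reference.

```r
# Small stand-in for the large data frame (hypothetical columns)
Dataset1 <- data.frame(x = 1:5, y = c(2.5, 3.1, 4.7, 1.2, 9.9))

# Attach the attributes to each column, then assign all columns back at once
Dataset1[] <- lapply(Dataset1, function(col) {
  attr(col, "usermissing") <- FALSE
  attr(col, "split")       <- FALSE
  attr(col, "levelLabels") <- ""
  col
})

print(attributes(Dataset1$x))
```
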
Best,

Philippe Grosjean


On 22 Feb 2014, at 03:13, Smart Guy  wrote:

> Hi All
> 
> I am having problem running the 'attributes' command to set a attribute on
> each column of a large dataset. Dataset has 80 columns and 312407 rows. Its
> taking more than 60 seconds to set simple attributes like split=TRUE,
> usermissing=FALSE.
> 
> Here is the source code, assuming Dataset1 is the one that is large :-
> 
> myfunction <- function()
> {
> cat("Before for loop:")
> print(Sys.time())
> for( colIndex in 1 : 80)
> {
> cat("Before Attr", colIndex)
> print(Sys.time())
> 
> attributes(Dataset1[1]) <- c(attributes(Dataset1[, colIndex]), list(coldesc
> = c(), usermissing = c(FALSE), missingvalues  = NULL, split = c(FALSE),
> levelLabels = c("")))
> 
> cat("After Attr:")
> print(Sys.time())
> }
> cat("After for loop:")
> print(Sys.time())
> }
> 
> Its my feeling that R is passing all 312407 rows to set 'attributes' on a
> cloumn.
> 
> Is there a more efficent way to do this?
> 
> 
> Thanks,
> SG
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 



Re: [R] shapiro.test

2014-02-21 Thread Philippe Grosjean
Greg,

I really like that TeachingDemos::SnowsPenultimateNormalityTest()… even the 
tortuous way to always return a p-value == 0:

# the following function works for current implementations of R
# to my knowledge, eventually it may need to be expanded
is.rational <- function(x){
rep( TRUE, length(x) )
}

tmp.p <- if( any(is.rational(x))) {
 0
} else {
 # current implementation will not get here if length
 # of x is positive.  This part is reserved for the
 # ultimate test
 1
}

(p.value is then returned as tmp.p). Also, the nice and sexy printing of that 
p-value in R as:

p-value < 2.2e-16

which looks much more serious than 'p-value = 0'… Here you have nothing to do. 
The base::format.pval() function called from stats:::print.htest() already 
does the job for you!
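
A minimal sketch of that formatting step, assuming a p-value of exactly zero and a digits value chosen here only for illustration:

```r
# A p-value of exactly zero, as produced by the tmp.p code above
p <- 0

# format.pval() reports any value below eps (default .Machine$double.eps)
# as "< eps" instead of printing the number itself
shown <- format.pval(p, digits = 4)

# Roughly what ends up on the "p-value" line of the printed test
cat("p-value", shown, "\n")
```

So a zero never shows up as 'p-value = 0' in the printed result; it is replaced by a bound based on the machine epsilon.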

I am just curious… Are there teachers out there pointing to that test? If yes, 
what fraction of the students realise what happens? I guess, it is closer to 
zero than to one, unfortunately. Wait… I need another 
SnowsPenultimateXxxxTest() here to check the null hypothesis that all my 
students are doing what they are supposed to do when discovering a new 
statistical tool!

Best,

Philippe Grosjean



On 21 Feb 2014, at 23:53, Greg Snow <538...@gmail.com> wrote:

> Rui,
> 
> Note this quote from the last paragraph of the Details section of ?ks.test:
> 
> "If a single-sample test is used, the parameters specified in '...'
> must be pre-specified and not estimated from the data."
> 
> Which is the exact opposite of your example.
> 
> 
> 
> Gonzalo,
> 
> Why are you testing your data for normality?  For large sample sizes
> the normality tests often give a meaningful answer to a meaningless
> question (for small samples they give a meaningless answer to a
> meaningful question).
> 
> If you really feel the need for a p-value then
> SnowsPenultimateNormalityTest in the TeachingDemos package will work
> for large sample sizes.  But note that the documentation for that
> function is considered more useful than the function itself.
> 
> 
> 
> On Fri, Feb 21, 2014 at 3:04 PM, Rui Barradas  wrote:
>> Hello,
>> 
>> This does not answer your question directly, but if the sample size is a
>> documented problem with shapiro.test and you want a normality test, why
>> don't you use ?ks.test?
>> 
>> m <- mean(HP_TrinityK25$V2)
>> s <- sd(HP_TrinityK25$V2)
>> 
>> ks.test(HP_TrinityK25$V2, "pnorm", m, s)
>> 
>> 
>> Hope this helps,
>> 
>> Rui Barradas
>> 
>> Em 21-02-2014 15:59, Gonzalo Villarino Pizarro escreveu:
>> 
>>> Dear R users,
>>> Please help with this maybe basic question. I am trying to see if my
>>> data is normal, but it is a large file and the test does not work.
>>> I keep getting the message : "Error in shapiro.test(x = HP_TrinityK25$V2)
>>> :  sample size must be between 3 and 5000"
>>> thanks!
>>> 
>>>  shapiro.test(x=HP_TrinityK25$V2)
>>> Error in shapiro.test(x = HP_TrinityK25$V2) : sample size must be between 3 and 5000
>>> 
>>> ##Note:
>>> HP_TrinityK25= my file
>>> HP_TrinityK25$V2= data in my file
>>> 
>> 
> 
> 
> 
> -- 
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com
> 
> 



Re: [R] How to make a scatter plot

2014-02-21 Thread JohnDee
Birth order is a factor.  You can't have a fractional birth order - e.g.
ORDER = 1.5 makes no sense.  You are either first, second, third, etc. 
"Order" in your data can only take two unique values, and thus anything
plotted against birth order is only going to vary along a vertical line
above permitted values of birth order.  
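
As a rough illustration of treating birth order as a factor, here is a sketch on made-up data (the data frame d, the score variable and the level labels are all hypothetical, not taken from the original question):

```r
# Made-up data: birth order takes only two distinct values
set.seed(1)
d <- data.frame(ORDER = rep(c(1, 2), each = 20),
                score = rnorm(40, mean = 50, sd = 10))

# Treat birth order as a factor rather than a continuous variable
d$ORDER <- factor(d$ORDER, labels = c("first", "second"))

# Jittered strip chart: one column of points per birth order
stripchart(score ~ ORDER, data = d, vertical = TRUE, method = "jitter",
           pch = 1, xlab = "Birth order", ylab = "score")

# A boxplot summarises the same two groups
boxplot(score ~ ORDER, data = d, xlab = "Birth order", ylab = "score")
```

Either plot compares the groups directly instead of drawing two vertical lines of points on a numeric axis.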



--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-make-a-scatter-plot-tp4685675p4685678.html
Sent from the R help mailing list archive at Nabble.com.
