[R] mac question

2009-11-02 Thread Antje

Hi there,

currently, I've updated R on my Mac (OS X) to version 2.10. I was 
wondering if I have to install all additional packages again???
In Windows, I just needed to copy the library folder of the old 
installation but how does it work with Mac?


Can anybody give me a hint?

Antje

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] about the cox result

2009-11-02 Thread 孟欣
Hi all:
I finished cox analysis like this:
fit_cox<-coxph(Surv(dat$Time, dat$death) ~ dat$CD4 + 
strata(dat$gender),data=dat);
> fit_cox
Call:
coxph(formula = Surv(data_ori$Time, data_ori$death) ~ data_ori$drug + 
strata(data_ori$gender), data = data_ori)

  coef exp(coef) se(coef)zp
data_ori$drugddI 0.216  1.240.146 1.47 0.14
Likelihood ratio test=2.17  on 1 df, p=0.140  n= 467 
 
I wanna extract the result:
0.216  1.240.146 1.47 0.14 and the corresponding name "coef exp(coef) 
se(coef)zp"
from fit_cox.
 
I use the command:
str(fit_cox),but I can only find 0.216 of the result,but the other four 
results( 1.240.146 1.47 0.14) can't be found.
 
 
Anyone can help me?
Thanks a lot!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 1 dimensional optimization with local minima

2009-11-02 Thread Jeroen Ooms

I am using numerical optimization to fit a 1 parameter model, in which the
input parameter is bounded. I am currently using optimize(), however, the
problem turns out to have local minima, and optimize does not always seem to
find the global minimum. I could to write a wrapping function that tries
multiple intervals or starting values, but I would prefer a package that has
built-in methods to make it more robust against local minima.

I checked the CRAN Task View for Optimization, however there seem to be a
lot of alternatives, and not being an optimization expert, I could use some
advice. What could be an appropriate package or function for one-dimensional
bounded optimization, that includes some protection against local minima? 



-
Jeroen Ooms * Dept. of Methodology and Statistics * Utrecht University 

Visit  http://www.jeroenooms.com www.jeroenooms.com  to explore some of my
current projects.





 
-- 
View this message in context: 
http://old.nabble.com/1-dimensional-optimization-with-local-minima-tp26160001p26160001.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] hierarchical clustering with Jaccard index

2009-11-02 Thread karuna m
hi,
I want to do hierarchical clustering with Jaccord index. I tried to do with 
vegan package for finding index and hierarchical clustering with hclust 
function. While doing clustering it is showing an error message as "invalid 
distance method". I would be grateful if anyone tells how to rectify the error.
Thanks in advance,
 
kind regards,

Ms.Karunambigai M
PhD Scholar
Dept. of Biostatistics
NIMHANS
Bangalore
India


  Keep up with people you care about with Yahoo! India Mail. Learn how. 
http://in.overview.mail.yahoo.com/connectmore
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] design matrix construction question

2009-11-02 Thread Ken Knoblauch
> On Nov 2, 2009, at 10:40 PM, Ben Bolker wrote:
> >  with the following simple data frame
> > dd = structure(list(z = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
> > ), .Label = c("a", "b"), class = "factor"), x = c(0.3, 0.2, 0.1,
> > 0, 0, 0, 0.2, 0.3)), .Names = c("z", "x"), row.names = c(NA,
> > -8L), class = "data.frame")
> >
> >  I would like know if it's possible to use model.matrix()
> > to construct the following design matrix in some sensible way:
> >
> >  za zb
> > 1  1  0
> > 2  1  0
> > 3  1  0
> > 4  0  0
> > 5  0  0
> > 6  0  0
> > 7  0  1
> > 8  0  1

How about something like this?

dd$xf <- (dd$x > 0) + 0
model.matrix(~ z:xf - 1, dd)
  za:xf zb:xf
1 1 0
2 1 0
3 1 0
4 0 0
5 0 0
6 0 0
7 0 1
8 0 1
attr(,"assign")
[1] 1 1
attr(,"contrasts")
attr(,"contrasts")$z
[1] "contr.treatment"

> > thanks
> >  Ben Bolker
> >
> >

-- 
Ken Knoblauch
Inserm U846
Stem-cell and Brain Research Institute
Department of Integrative Neurosciences
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr/members/kenneth-knoblauch.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] help with SSOAP (can't find working examples)

2009-11-02 Thread Hari Krishna Dara
First of all, let me confess that I am a newbie to R and don't know
much about the language or the environment. We have a need for
plugging in R in our production runtime and need the ability to pull
data out of our existing services. I am trying to see if I can take
advantage of SSOAP such that we can expose the data via webservices
and use SSOAP to call into them. Our runtime is mostly composed of
python, so the idea is to use rpy2 to interface from python to R, and
use SSOAP to call back into python modules where needed.

Now the problem is that I have hard time trying to find a single
example that works. I am using windows and here are the version
numbers:

R: 2.9.2
SSOAP: 0.5-3
XMLSchema: 0.1-1
XML: 2.6-0

The packages are up to date as per "Update packages". I tried all the
examples that came under library/SSOAP/examples directory and most
failed during genSOAPClientInterface(), while one failed right in
processWSDL(). This is the most common error during
genSOAPClientInterface():

Error in paste(" .elementFormQualified", .elementFormQualified, sep = " = ") :
  no slot of name "elementFormQualified" for this object of class "SchemaTypes"

There was only one reference to this error in the archives, and it
seems like the OP got the problem solved after updating XMLSchema
(which happened to be the same version as what I have). I tried a
couple of WSDL's that I have for creating python webservices, as well
as a "hello world" example, and they all have the same error as well.
While KEGG.wsdl, and others gave the above error, the eutils.wsdl gave
me the below error:

Error: Cannot resolve SOAP type in empty context

Interop.wsdl got me the below error:

Error: Cannot resolve string in SchemaCollection

nwis.wsdl failes in processWSDL() itself, with the below error:

Error in parse(text = paste(txt, collapse = "\n")) :
  unexpected input in "function(x, ..., obj = new( ‘"

I am most interested in solving the first error that I reported (as
the rest might be real issues with the wsdl syntax). Here is the full
output for KEGG.wsdl:

[1] "ArrayOfstring"
defining class ArrayOfstring
defining setAs() for ArrayOfstring
[1] "SSDBRelation"
defining class SSDBRelation
defining setAs() for SSDBRelation
[1] "ArrayOfSSDBRelation"
defining class ArrayOfSSDBRelation
defining setAs() for ArrayOfSSDBRelation
[1] "MotifResult"
defining class MotifResult
defining setAs() for MotifResult
[1] "ArrayOfMotifResult"
defining class ArrayOfMotifResult
defining setAs() for ArrayOfMotifResult
[1] "Definition"
defining class Definition
defining setAs() for Definition
[1] "ArrayOfDefinition"
defining class ArrayOfDefinition
defining setAs() for ArrayOfDefinition
[1] "LinkDBRelation"
defining class LinkDBRelation
defining setAs() for LinkDBRelation
[1] "ArrayOfLinkDBRelation"
defining class ArrayOfLinkDBRelation
defining setAs() for ArrayOfLinkDBRelation
Operation list_databases
resolve(, SchemaCollection) ArrayOfDefinition
resolve(, list) ArrayOfDefinition
Note: Method with signature "ArrayType#list" chosen for function "resolve",
 target signature "ArrayType#SchemaCollection".
 "SimpleSequenceType#SchemaCollection" would also be valid
Error in paste(" .elementFormQualified", .elementFormQualified, sep = " = ") :
  no slot of name "elementFormQualified" for this object of class "SchemaTypes"
In addition: Warning message:
In .findOrCopyClass(class2, classDef2, where, "subclass") :
  Class "VirtualXMLSchemaClass" is defined (with package slot
"XMLSchema") but no metadata object found to revise subclass
information---not exported?  Making a copy in package ".GlobalEnv"

I am not an expert in web services and WSDL, but I did compare them
with a few public WSDL samples and couldn't really make out an issue.
I would really appreciate any help or information in getting SSOAP to
work in my environment. Hopefully it is something trivial, like a
configuration issue.

And one more thing, I found version 0.5.4 of SSOAP at omegahat.org
website and tried that out as well, but it didn't make any difference.
I found a newer version of XMLSchema as well, but it didn't install
(with compilation errors). I have activeperl and mingw in the path.

Thank you,
Hari

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] why is adnonis function called adonis {package vegan}

2009-11-02 Thread stvienna wiener
Hello all,


I used google and looked at the documentation to find out why the
ADONIS function
is called adonis (in the vegan package).

I am writing a document and would like to include a
abbreviation list (similar to "ANOSIM = Analysis of Similarities").


regards,
Steve

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] design matrix construction question

2009-11-02 Thread David Winsemius


On Nov 2, 2009, at 10:40 PM, Ben Bolker wrote:



 with the following simple data frame
dd = structure(list(z = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
), .Label = c("a", "b"), class = "factor"), x = c(0.3, 0.2, 0.1,
0, 0, 0, 0.2, 0.3)), .Names = c("z", "x"), row.names = c(NA,
-8L), class = "data.frame")

 I would like know if it's possible to use model.matrix()
to construct the following design matrix in some sensible way:

 za zb
1  1  0
2  1  0
3  1  0
4  0  0
5  0  0
6  0  0
7  0  1
8  0  1

 In other words, I want column 1 to be (z=="a" & x>0) and
column 2 to be (z=="b" & x>0).  I can construct this matrix
using

sweep(model.matrix(~z-1,dd),1,dd$x>0,"*")

and then stick it into lm.fit -- but is there a more
elegant way to do this in general?


Elegance? You decide. But at least more economical from a keystroke  
perspective:


model.matrix(~z-1,dd)*(dd$x>0)
#--
  za zb
1  1  0
2  1  0
3  1  0
4  0  0
5  0  0
6  0  0
7  0  1
8  0  1
attr(,"assign")
[1] 1 1
attr(,"contrasts")
attr(,"contrasts")$z
[1] "contr.treatment



I haven't found a formula combining
(z-1) and I(x>0) that works, although I can imagine there is one.

thanks
 Ben Bolker




David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] save an object by dynamicly created name

2009-11-02 Thread Hao Cen
Hi Henrik and David,

Thank you very much for your suggestions. I found both meet my needs.

In the following code,  I found save(obj,file=pathname) would save the
content into an object called obj. 

path <- "~/";
dir.create(path);

for (i in 1:10) {
  assign( paste("m", i, sep=""),  i:5)
  filename <- sprintf("m%02d.Rdta", i)
  pathname <- file.path(path, filename)
  obj =get(paste("m", i, sep=""))
  save(obj, file=pathname)
}

A tweak to this would be

path <- "~/";
dir.create(path);

for (i in 1:10) {
  assign( paste("m", i, sep=""),  i:5)
  filename <- sprintf("m%02d.Rdta", i)
  pathname <- file.path(path, filename)
  save(list = paste("m", i, sep=""), file=pathname)
}


Thanks a lot

Hao

-Original Message-
From: henrik.bengts...@gmail.com [mailto:henrik.bengts...@gmail.com] On
Behalf Of Henrik Bengtsson
Sent: Monday, November 02, 2009 12:34 AM
To: David Winsemius
Cc: r-help@r-project.org; jeffc
Subject: Re: [R] save an object by dynamicly created name

On Sun, Nov 1, 2009 at 9:18 PM, David Winsemius 
wrote:
>
> On Nov 1, 2009, at 11:28 PM, Henrik Bengtsson wrote:
>
>> On Sun, Nov 1, 2009 at 7:48 PM, David Winsemius 
>> wrote:
>>>
>>> On Nov 1, 2009, at 10:16 PM, Henrik Bengtsson wrote:
>>>
 path <- "data";
 dir.create(path);

 for (i in 1:10) {
  m <- i:5;
  filename <- sprintf("m%02d.Rbin", i);
  pathname <- file.path(path, filename);
  save(m, file=pathname);
 }

>>>
>>> That would result in each of the ten files containing an object with the
>>> same  name == "m". (Also on my system R data files have type Rdta.) So I
>>> thought what was requested might have been a slight mod:
>>>
>>> path <- "~/";
>>> dir.create(path);
>>>
>>> for (i in 1:10) {
>>>  assign( paste("m", i, sep=""),  i:5)
>>>  filename <- sprintf("m%02d.Rdta", i)
>>>  pathname <- file.path(path, filename)
>>>  obj =get(paste("m", i, sep=""))
>>>  save(obj, file=pathname)
>>> }
>>
>> Then a more convenient solution is to use saveObject() and
>> loadObject() of R.utils.  saveObject() does not save the name of the
>> object save.
>
> The OP asked for this outcome :
>
> " I would like to save m as m1, m2, m3 ...,
> to file /home/data/m1, /home/data/m2, home/data/m3, ..."
>
>
>>  If you want to save multiple objects, the wrap them up
>> in a list.
>
> I agree that a list would makes sense if it were to be stored in one file
,
> although it was not what requested.

That comment was not for the OP, but for saveObject()/loadObject() in
general.

> But wouldn't that require assign()-ing a name before list()-wrapping?

Nope, the whole point of using saveObject()/loadObject() is to save
the objects/values without their names that you happens to choose in
the current session, and to avoid overwriting existing ones in your
next session. My example could also have been:

library("R.utils");
saveObject(list(a=1,b=LETTERS,c=Sys.time()), file="foo.Rbin");
y <- loadObject("foo.Rbin");
z <- loadObject("foo.Rbin");
stopifnot(identical(y,z));

If you really want to attach the elements of the saved list, do:

attachLocally(loadObject("foo.Rbin"));
> str(a)
 num 1
> str(b)
 chr [1:26] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" ...
> str(c)
 POSIXct[1:1], format: "2009-11-01 21:30:41"

>
> I suppose we ought to mention that the use of assign to create a variable
is
> a FAQ ... 7.21? Yep, I have now referred to it a sufficient number of
times
> to refer to it by number.
>
>
http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-
variable_003f

My personal take on assign() and get() is that if you find yourself
using them (at this level), there is a good chance there exists a
better solution that you should use instead.

My $.02

/H

>
> --
> David
>
>>  loadObject() does not assign variable, but instead return
>> them. Example:
>>
>> library("R.utils");
>> x <- list(a=1,b=LETTERS,c=Sys.time());
>> saveObject(x, file="foo.Rbin");
>> y <- loadObject("foo.Rbin");
>> stopifnot(identical(x,y));
>
>>
>> So, for the original example, I'd recommend:
>>
>> library("R.utils");
>> path <- "data";
>> mkdirs(path);
>>
>> for (i in 1:10) {
>>  m <- i:5;
>>  filename <- sprintf("m%02d.Rbin", i);
>>  saveObject(m, file=filename, path=path);
>> }
>>
>> and loading the objects back as:
>>
>> for (i in 1:10) {
>>  filename <- sprintf("m%02d.Rbin", i);
>>  m <- loadObject(filename, path=path);
>>  print(m);
>> }
>> /Henrik
>>
>>>
>>> --
>>> David.
>>>
 /H

 On Sun, Nov 1, 2009 at 6:53 PM, jeffc  wrote:
>
> Hi,
>
> I would like to save a few dynamically created objects to disk. The
> following is the basic flow of the code segment
>
> for(i = 1:10) {
>  m = i:5
>  save(m, file = ...) ## ???
> }
> To distinguish different objects to be saved, I would like to save m
as
> m1,
> m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ...
>
> I tried a couple of methods on translating between object names and
> strings
> (below) b

[R] OpenOffice Calc ODBC equivalent

2009-11-02 Thread Kenneth Roy Cabrera Torres
Hi R users:

I am using RODBC to create some new ".xls" files each with
several sheets (about 100) with sqlSave() from a data.frame
inside R  without any problem, but on windows XP platform.

I would like to know if it is posible to make a similar 
solution on linux with openoffice?

RODBC work only (for the moment) with ".xls" on windows,
via the ODBC driver.

Is it posible to make the same thing on openoffice?
I mean, an ".odt" file with several sheets each, generated from 
a data.frame in R?

Thank you for your help.

Kenneth

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] design matrix construction question

2009-11-02 Thread Ben Bolker

  with the following simple data frame
dd = structure(list(z = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
), .Label = c("a", "b"), class = "factor"), x = c(0.3, 0.2, 0.1,
0, 0, 0, 0.2, 0.3)), .Names = c("z", "x"), row.names = c(NA,
-8L), class = "data.frame")

  I would like know if it's possible to use model.matrix()
to construct the following design matrix in some sensible way:

  za zb
1  1  0
2  1  0
3  1  0
4  0  0
5  0  0
6  0  0
7  0  1
8  0  1

  In other words, I want column 1 to be (z=="a" & x>0) and
column 2 to be (z=="b" & x>0).  I can construct this matrix
using

sweep(model.matrix(~z-1,dd),1,dd$x>0,"*")

and then stick it into lm.fit -- but is there a more
elegant way to do this in general?  I haven't found a formula combining
(z-1) and I(x>0) that works, although I can imagine there is one.

 thanks
  Ben Bolker


-- 
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bol...@ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc



signature.asc
Description: OpenPGP digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] re ading tokens

2009-11-02 Thread David Winsemius


On Nov 2, 2009, at 8:00 PM, j daniel wrote:



Greetings,

I am not familiar with processing text in R.  Can someone tell me  
how to

read each line of words as separate elements in a list?

FE, I would like to turn:

word1 word2 word3
word2 word4

into a list of length two with three character elements in the first  
list
and two elements in the second.  I know that this should be easy,  
but I am a

little confused by the text functions.


> txt <- textConnection("word1 word2 word3
+ word2 word4")
> strsplit(readLines(txt), " ")
[[1]]
[1] "word1" "word2" "word3"

[[2]]
[1] "word2" "word4"



--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] re ading tokens

2009-11-02 Thread jim holtman
Is this what you want:

> x <- readLines(textConnection("word1 word2 word3
+ word2 word4"))
> closeAllConnections()
> yourList <- strsplit(x, '[[:space:]]+')
>
>
> yourList
[[1]]
[1] "word1" "word2" "word3"

[[2]]
[1] "word2" "word4"

>


On Mon, Nov 2, 2009 at 8:00 PM, j daniel  wrote:
>
> Greetings,
>
> I am not familiar with processing text in R.  Can someone tell me how to
> read each line of words as separate elements in a list?
>
> FE, I would like to turn:
>
> word1 word2 word3
> word2 word4
>
> into a list of length two with three character elements in the first list
> and two elements in the second.  I know that this should be easy, but I am a
> little confused by the text functions.
>
> Thanks in advance!
> --
> View this message in context: 
> http://old.nabble.com/reading-tokens-tp26159915p26159915.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] re ading tokens

2009-11-02 Thread j daniel

Greetings,

I am not familiar with processing text in R.  Can someone tell me how to
read each line of words as separate elements in a list?

FE, I would like to turn:

word1 word2 word3
word2 word4

into a list of length two with three character elements in the first list
and two elements in the second.  I know that this should be easy, but I am a
little confused by the text functions.

Thanks in advance!
-- 
View this message in context: 
http://old.nabble.com/reading-tokens-tp26159915p26159915.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Can't pass file name as parameter to Corpus function

2009-11-02 Thread renjl0810

I'm working on a small project to extract high-frequency terms from a
document and then display those terms in web page. To this end, I've to pass
the file name as parameter to the Corpus function to build a corpus of only
one document. I can build the corpus using the code below interactively in
R. But calling the function with a file name as the parameter I got the
error message saying "Error in eval(expr, envir, enclos) : object
'strFileName' not found"

test<-function(strFileName) {
src <- URISource(strFileName)
cor <- Corpus(src, readerControl = list(reader = readPDF, language =
"en_US", load = TRUE))
}

After running the following code in R I checked the docURISource$URI and the
value is "strFileName" rather than "C:\\Temp\\readme.txt". I also checked
the URI when I was debugging the function and the URI is also  "strFileName"
rather than "C:\\Temp\\readme.txt". 

strFileName <- "C:\\Temp\\readme.txt"
docURISource = URISource(strFileName, encoding = "UTF_8")
docCorpus = Corpus(docURISource)

So it seems when running interactively the URI contains strFileName as a
variable but when called as a function, the URI contains strFileName as the
actual value. Can anybody point out what's the problem?

thanks in advance!!

Jianli
-- 
View this message in context: 
http://old.nabble.com/Can%27t-pass-file-name-as-parameter-to-Corpus-function-tp26159498p26159498.html
Sent from the R help mailing list archive at Nabble.com.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] residuals(lme) error message

2009-11-02 Thread Rupa Parvataneni
Dear R-help,

I am trying to perform residual diagnostics on a repeated measures mixed
effect model. With 3 time points as the fixed effect and subjects as the
random effects.


> time.linearc<- summary(lmer(cnim~(as.factor(time)), random= ~1|subj,
data=ex.rmc, na.action=na.omit))

> time.linearc

 When I perform :

>res <- residuals(time.linearc)

I receive the following error message:

Error in val[, level] : incorrect number of dimensions

If possible could someone explain the problem?


Thanks,

Rupa

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Frequency

2009-11-02 Thread Nikhil Kaza

try
 sort (table(MAT), decreasing=T)

if MAT is your matrix

I think this is what you want. though if you want to sort by the first  
occurrence then it is a different story.


Nikhil

On 2 Nov 2009, at 1:35PM, Val wrote:


V1  v2  v3   v4

  569   10

347   10

46   10   18


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] kernel density estimation and classification

2009-11-02 Thread wk y


Hi,

I am interested to use Kernel Density Estimation on bivariate data for 
multi-class classification.

So far, I have managed to use the 'ks' package to plot the contours of the 
kernel density estimates based on 8-class training dataset with only 2 
variables.

However, I do not know how to make use of the Kernel Density Analysis results 
as input into a classifier (e.g. Bayesian classifiers) to accomplish the task 
of classifying testing dataset into 8 different classes.

I will really appreciate any suggestions or advice.

Thanks.

regards,
WK

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Time Series methods

2009-11-02 Thread Tim Bean
Sure, but it seems like the point of the time series class is so you don't
have to create a factor based on the period of sampling. Is there anyway of
imposing calculations like median on the time series class, or is Kjetil's
suggestion the only approach?

Thanks!

On Sun, Nov 1, 2009 at 4:09 PM, Kjetil Halvorsen <
kjetilbrinchmannhalvor...@gmail.com> wrote:

> introduce a factor variable with the months and then use tapply?
>
> Kjetil
>
> On Sun, Nov 1, 2009 at 9:07 PM, Tim Bean  wrote:
> > Hello, I have a quick question about time series methodology. If I want
> to
> > display a boxplot of time series data, sorted by period, I can type:
> >
> > boxplot(data ~ cycle(data));
> >
> > where data is of class "ts"
> >
> > Is there a similar method for calculating, say, the median value of each
> > time step within the series? (So for a monthly data set, calculate median
> > for all Januarys, all Februarys, all Marchs, etc.)
> >
> > Thanks,
> > Tim
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to exclude certain columns by column names?

2009-11-02 Thread Linlin Yan
Try this:
> x[, colnames(x) != 'a']
[1] 3 4

On Tue, Nov 3, 2009 at 9:31 AM, Peng Yu  wrote:
> I can exclude columns by column number using '-'. But I wondering if
> there is an easy way to exclude some columns by column names.
>
>> x=cbind(c(1,2),c(3,4))
>> x
>     [,1] [,2]
> [1,]    1    3
> [2,]    2    4
>> colnames(x)=c('a','b')
>> x
>     a b
> [1,] 1 3
> [2,] 2 4
>> x[,-'a']
> Error in -"a" : invalid argument to unary operator
>> x[,-1]
> [1] 3 4
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to exclude certain columns by column names?

2009-11-02 Thread Jorge Ivan Velez
Hi Peng,

Here is a suggestion:

subset(x, select = -a)

See ?subset for more details.

HTH,
Jorge


On Mon, Nov 2, 2009 at 8:31 PM, Peng Yu <> wrote:

> I can exclude columns by column number using '-'. But I wondering if
> there is an easy way to exclude some columns by column names.
>
> > x=cbind(c(1,2),c(3,4))
> > x
> [,1] [,2]
> [1,]13
> [2,]24
> > colnames(x)=c('a','b')
> > x
> a b
> [1,] 1 3
> [2,] 2 4
> > x[,-'a']
> Error in -"a" : invalid argument to unary operator
> > x[,-1]
> [1] 3 4
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to exclude certain columns by column names?

2009-11-02 Thread Peng Yu
I can exclude columns by column number using '-'. But I wondering if
there is an easy way to exclude some columns by column names.

> x=cbind(c(1,2),c(3,4))
> x
 [,1] [,2]
[1,]13
[2,]24
> colnames(x)=c('a','b')
> x
 a b
[1,] 1 3
[2,] 2 4
> x[,-'a']
Error in -"a" : invalid argument to unary operator
> x[,-1]
[1] 3 4

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] modifying predict.nnet() to function with errorest()

2009-11-02 Thread Max Kuhn
Dave,

You should look at the train() function in teh caret package.

Max

On Mon, Nov 2, 2009 at 6:01 PM, Armitage, Dave  wrote:
> Greetings,
>
> I am having trouble calculating artificial neural network misclassification
> errors using errorest() from the ipred package. I have had no problems
> estimating the values with randomForest() or svm(), but can't seem to get it
> to work with nnet(). I believe this is due to the output of the
> predict.nnet() function within cv.factor(). Below is a quick example of the
> problem I'm experiencing. Any ideas on how to get around it or will it
> simply not work with nnet()?
>
>> library(MASS)
>> library(nnet)
>> library(ipred)
>> data(iris3)
>> set.seed(191)
>>
>> samp <- c(sample(1:50,25), sample(51:100,25), sample(101:150,25))
>> ird <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]),
>
> +         species = factor(c(rep("s",50), rep("c", 50), rep("v", 50
>>
>> errorest(species ~., data = ird, subset = samp, model = nnet, size = 2,
>> rang =0.1, decay = 5e-4, maxit = 200)
>
> # weights:  19
> initial  value 73.864441
> .
> .
> .
> final  value 0.339114
> converged
> Error in cv.factor(y, formula, data, model = model, predict = predict,  :
>  predict does not return factor values
>
>
>
> Thanks,
> Dave
> __
>
> Dave Armitage
> Wildlife Ecology and Conservation
> University of Florida
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 

Max

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread John Fox
Dear Carol,

> -Original Message-
> From: carol white [mailto:wht_...@yahoo.com]
> Sent: November-02-09 4:04 PM
> To: 'Peter Flom'; John Fox
> Cc: r-h...@stat.math.ethz.ch; 'Yihui Xie'
> Subject: RE: [R] qqplot
> 
> Thanks for all your replies.
> 
> I just wanted to compare the distributions of two populations and see how
> different they are. I thought that QQplot would be a good idea. It's true
> that the mean and sd of the two distributions are not the same. I don't
know
> if any other method or graphical presentation like five number, mean and
sd
> as suggested, or boxplot would have been more suited to be used.

It depends on what you're interested in comparing. If you're interested in
differences in shape adjusting for differences in centre and spread, then a
QQ plot with a line fit through the first and third quartiles is a
reasonable choice.  If you centre the two data sets to medians of 0, the
intercept of the line will give you information about the difference in
centres and the slope about differences in spread. If you're primarily
interested in comparing centres and spreads then parallel boxplots would be
a good choice. 

Regards,
 John

> 
> thanks for your advices,
> 
> --- On Mon, 11/2/09, John Fox  wrote:
> 
> > From: John Fox 
> > Subject: RE: [R] qqplot
> > To: "'Peter Flom'" 
> > Cc: r-h...@stat.math.ethz.ch, "'carol white'" ,
"'Yihui
> Xie'" 
> > Date: Monday, November 2, 2009, 9:24 AM
> > Dear Peter,
> >
> > I assumed that Carol wanted to compare the shapes of the
> > distributions and
> > to adjust for differences in centre and spread. To put a
> > line through the
> > quartiles or to base a line on the medians and IQRs is more
> > robust than
> > using the means and sds.
> >
> > Best,
> >  John
> >
> >
> > > -Original Message-
> > > From: r-help-boun...@r-project.org
> > [mailto:r-help-boun...@r-project.org]
> > On
> > > Behalf Of Peter Flom
> > > Sent: November-02-09 11:57 AM
> > > To: carol white; Yihui Xie
> > > Cc: r-h...@stat.math.ethz.ch
> > > Subject: Re: [R] qqplot
> > >
> > > carol white 
> > wrote
> > >
> > > >So the conclusion is that abline(0,1) should
> > always be used and if it
> > > doesn't go through the qqplot, the two distributions
> > are not similar?
> > >
> > > I think it depends what you mean by "similar".
> > E.g., if you mean "are
> > both
> > > of these distributions (e.g.) normal?" then
> > abline(0,1) is not always
> > useful.
> > > But if you mean "Do these have the same mean, sd, and
> > distribution?" then
> > > abline(0,1) is the way to go.
> > >
> > > Peter
> > >
> > > Peter L. Flom, PhD
> > > Statistical Consultant
> > > Website: www DOT peterflomconsulting DOT com
> > > Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
> > > Twitter: �...@peterflom
> > >
> > > __
> > > R-help@r-project.org
> > mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained,
> > reproducible code.
> >
> >
> >
> 
> 
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] avoiding loop

2009-11-02 Thread jim holtman
The first thing I would suggest is convert your dataframes to matrices
so that you are not having to continually convert them in the calls to
the functions.  Also I am not sure what the code:

realized_prob = with(DF, {
ind <- (CHOSEN == 1)
n <- tapply(theta_multiple[ind], 
CS[ind], sum)
d <- tapply(theta_multiple, CS, sum)
n / d   
})

is doing.  It looks like 'n' and 'd' might have different lengths
since they are being created by two different (CS & CS[ind])
sequences.  I have no idea why you are converting to the "DF"
dataframe.  THere is no need for that.  You could just leave the
vectors (e.g., theta_multiple, CS and ind) as they are and work with
them.  This is probably where most of your time is being spent.  So if
you start with matrices and leave the dataframes out of the main loop
you will probably see an increase in performance.

2009/11/2 parkbomee :
> This is the Rprof() report by self time.
> Is it also possible that these routines, which take long self.time, are
> causing the optim() to be slow?
>
>
> $by.self
> self.time self.pct total.time total.pct
> "FUN"   94.16 16.5  94.16  16.5
> "unlist"80.46 14.1 120.54  21.1
> "lapply"76.94 13.5 255.48  44.7
> "match" 60.76 10.6  60.88  10.7
> "as.matrix.data.frame"  31.00  5.4  51.12   8.9
> "as.character"  29.28  5.1  29.28   5.1
> "unique.default"24.36  4.3  24.40   4.3
> "data.frame"21.06  3.7  55.78   9.8
> "split.default" 20.42  3.6  84.38  14.8
> "tapply"13.84  2.4 414.28  72.5
> "structure" 11.32  2.0  22.36   3.9
> "factor"11.08  1.9 127.68  22.3
> "attributes<-"  11.00  1.9  11.00   1.9
> "=="10.56  1.8  10.56   1.8
> "%*%"   10.30  1.8  10.30   1.8
> "as.vector" 10.22  1.8  10.22   1.8
> "as.integer" 9.86  1.7   9.86   1.7
> "list"   9.64  1.7   9.64   1.7
> "exp"7.12  1.2   7.12   1.2
> "as.data.frame.integer"  5.98  1.0   8.10   1.4
>
>> To: bbom...@hotmail.com
>> CC: jholt...@gmail.com; r-help@r-project.org
>> Subject: Re: [R] avoiding loop
>> From: mtmor...@fhcrc.org
>> Date: Sun, 1 Nov 2009 22:14:09 -0800
>>
>> parkbomee  writes:
>>
>> > Thank you all.
>> >
>> > What Chuck has suggested might not be applicable since the number of
>> > different times is around 40,000.
>> >
>> > The object of optimization in my function is the varying "value",
>> > which is basically data * parameter, of which "parameter" is the
>> > object of optimization..
>> >
>> > And from the r profiling with a subset of data,
>> > I got this report..any idea what "" is?
>> >
>> >
>> > $by.total
>> > total.time total.pct self.time self.pct
>> > "" 571.56 100.0 0.02 0.0
>> > "optim" 571.56 100.0 0.00 0.0
>> > "fn" 571.54 100.0 0.98 0.2
>>
>> You're giving us 'by.total', so these are saying that all the time was
>> spent in these functions or the functions they called. Probably all
>> are in 'optim' and its arguments; since little self.time is spent
>> here, there isn't much to work with
>>
>> > "eval" 423.74 74.1 0.00 0.0
>> > "with.default" 423.74 74.1 0.00 0.0
>> > "with" 423.74 74.1 0.00 0.0
>>
>> These are probably in the internals of optim, where the function
>> you're trying to optimize is being set up for evaluation. Again
>> there's little self.time, and all these say is that a big piece of the
>> time is being spent in code called by this code.
>>
>> > "tapply" 414.28 72.5 13.84 2.4
>> > "lapply" 255.48 44.7 76.94 13.5
>> > "factor" 127.68 22.3 11.08 1.9
>> > "unlist" 120.54 21.1 80.46 14.1
>> > "FUN" 94.16 16.5 94.16 16.5
>>
>> these look like they are tapply-related calls (looking at the code for
>> tapply, it calls lapply, factor, and unlist, and FUN is the function
>> argument to tapply), perhaps from the function you're optimizing (did
>> you implement this as suggested below? it would really help to have a
>> possibly simplified version of the code you're calling).
>>
>> There is material to work with here, as apparently a fairly large
>> amount of self.time is being spent in each of these functions. So
>> here's a sample data set
>>
>> n <- 10
>> set.seed(123)
>> df <- data.frame(time=sort(as.integer(ceiling(runif(n)*n/5))),
>> value=ceiling(runif(n)*5))
>>
>> It would have been helpful for you to provide reproducible code like
>> that above, so that

Re: [R] a prolem with constrOptim

2009-11-02 Thread Ravi Varadhan
Hi,

I noticed that you are setting very small tolerance as convergence
criterion. In most optimization problems, it does not make much practical
sense to set such a low tolerance.  If you increase this, then `constrOptim'
also works. For example,

constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros,
data = data2, method="BFGS",
control=list(fnscale=-1,reltol=1e-10))

However, it should be noted that my function `contrOptim.nl' provides a
better answer for both data1 and data2 in terms of having a larger
log-likelihood at convergence.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Davidov, Ori (NIH/NIEHS) [V]
Sent: Monday, November 02, 2009 11:36 AM
To: r-help@r-project.org
Subject: [R] a prolem with constrOptim

Hi,

I apologize for the long message but the problem I encountered can't be
stated in a few lines.

I am having some problems with the function constrOptim. My goal is to
maximize the likelihood of product of K multinomials, each with four
catagories under linear constraints on the parameter values. I have found
that the function does not work for many data configurations.

#The likelihood and score functions are:

likelihood = function(theta,data)
{
p = matrix(theta,nrow=3,byrow=F)
s = 1-apply(p,2,sum)
p = rbind(p,s)
Z = matrix(data,nrow=4,byrow=F)
sum(log(p)*Z)
}

score = function(theta,data)
{
k = length(data)/4
S = numeric(3*k)
for(i in 1:k)
{
n = data[(1+4*(i-1)):(4*i)]
p = theta[(1+3*(i-1)):(3*i)]
P = 1-sum(p)
S[(1+3*(i-1)):(3*i)] = n[1:3]/p-n[4]/P
}
S
}

#where theta=(p11,p12,p13,p21,p22,p23,...,pK1,pK2,pK3).

#The function Rmat calculates the restriction matrix needed for constrained
estimation

Rmat = function(k)
{
R = matrix(1,4,3)
R[1,2] = R[1,3] = R[2,3] = R[3,2] = 0
RR = cbind(-R,R)
RRR = matrix(0,4*(k-1),3*k)
for ( i in 1:(k-1) ) RRR[4*(i-1)+1:4,3*(i-1)+1:6] = RR
RRR
}

#The function gen.data generates data

gen.data = function(p,N)
{
k = ncol(p)
Data = matrix(0,4,k)
for(j in 1:k)
{
Data[,j] = as.vector(rmultinom(1, size =
N[j], prob=p[,j]))
}
as.vector(Data)
}

#I have found that the function returns an error under some configurations
of the data ( 0's seem to harm it). Why is this happening and how can it #be
corrected? Why do some configurations with 0's crash the program and others
don't?

# For example for the data:

data1 = c(0,6,8,6,3,4,4,9,2,2,5,11)
data2 = c(0,6,8,6,0,4,4,9,2,2,5,11)

# inputs for constrOptim

K = 3
RR = Rmat(K)
zeros = rep(0,nrow(RR))
theta.0 = numeric(3*K)
for(i in 1:K) {theta.0[(1+3*(i-1)):(3*i)] = rep(1/10,3)+i/100}

# data1 gives an error but data2 does not!

constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros,
data = data1, method="BFGS",
control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par

constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros,
data = data2, method="BFGS",
control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par


# If you want to further check run the simulation below.

K = 3
N = rep(20,K)
p = c(1/10,2/10,2/10,5/10)
p = matrix(rep(p,K),nrow=4,byrow=F)


RR = Rmat(K)
zeros = rep(0,nrow(RR))
theta.0 = numeric(3*K)
for(i in 1:K) {theta.0[(1+3*(i-1)):(3*i)] = rep(1/10,3)+i/100}

for(i in 1:1000)
{
data = gen.data(p,N)
print(i)
print(data)
theta.til = constrOptim(theta=theta.0, f=likelihood,
grad=score, ui=RR,ci=zeros,
data = data, method="BFGS",
control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par

print(theta.til)
}



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, repr

[R] modifying predict.nnet() to function with errorest()

2009-11-02 Thread Armitage, Dave

Greetings,

I am having trouble calculating artificial neural network 
misclassification errors using errorest() from the ipred package. 
I have had no problems estimating the values with randomForest() 
or svm(), but can't seem to get it to work with nnet(). I believe 
this is due to the output of the predict.nnet() function within 
cv.factor(). Below is a quick example of the problem I'm 
experiencing. Any ideas on how to get around it or will it simply 
not work with nnet()?



library(MASS)
library(nnet)
library(ipred)
data(iris3)
set.seed(191)

samp <- c(sample(1:50,25), sample(51:100,25), sample(101:150,25))
ird <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]),
+ species = factor(c(rep("s",50), rep("c", 50), rep("v", 
50
errorest(species ~., data = ird, subset = samp, model = nnet, 
size = 2, rang =0.1, decay = 5e-4, maxit = 200)

# weights:  19
initial  value 73.864441
.
.
.
final  value 0.339114
converged
Error in cv.factor(y, formula, data, model = model, predict = 
predict,  :

 predict does not return factor values



Thanks,
Dave
__

Dave Armitage
Wildlife Ecology and Conservation
University of Florida

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] a prolem with constrOptim

2009-11-02 Thread Ravi Varadhan
Hi Davidov,

You can use the `constrOptim.nl' function that I have written for solving
problems like yours.  It is better and more general than the `constrOptim'
function.  It is able to solve your problem, but I am not sure how well
posed your problem is.  One of the parameters seems to go to 0.  May be this
ok.

Here is the code for solving your problem:

source("constrOptim_nl.txt")  

hin <- function(par, ...) {
# function that defines the inequality constraints
   k = 3
R = matrix(1,4,3)
R[1,2] = R[1,3] = R[2,3] = R[3,2] = 0
RR = cbind(-R,R)
RRR = matrix(0, 4*(k-1),3*k)
for ( i in 1:(k-1) ) RRR[4*(i-1)+1:4,3*(i-1)+1:6] = RR
   c(RRR %*% par)
}

ans1 <- constrOptim.nl(par=theta.0, fn=likelihood, hin=hin,
data=data1, control=list(fnscale=-1))

ans2 <- constrOptim.nl(par=theta.0, fn=likelihood, hin=hin,
data=data2, control=list(fnscale=-1))

This works for both data1 and data2.

> ans1
$par
[1] 7.158690e-10 2.500022e-01 2.682024e-01 1.314431e-01 1.750843e-01
2.118290e-01 2.089612e-01 9.771505e-02 2.647332e-01

$value
[1] -72.81515


> ans2
$par
[1] 8.068277e-12 1.659718e-01 2.332808e-01 1.363536e-04 1.91e-01
2.331728e-01 7.855021e-02 1.240325e-01 2.235095e-01

$value
[1] -65.75015

Let me know if you are interested in obtaining the `constrOptim.nl'
function.   

Hope this helps,
Ravi.



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Davidov, Ori (NIH/NIEHS) [V]
Sent: Monday, November 02, 2009 11:36 AM
To: r-help@r-project.org
Subject: [R] a prolem with constrOptim

Hi,

I apologize for the long message but the problem I encountered can't be
stated in a few lines.

I am having some problems with the function constrOptim. My goal is to
maximize the likelihood of product of K multinomials, each with four
catagories under linear constraints on the parameter values. I have found
that the function does not work for many data configurations.

#The likelihood and score functions are:

likelihood = function(theta,data)
{
p = matrix(theta,nrow=3,byrow=F)
s = 1-apply(p,2,sum)
p = rbind(p,s)
Z = matrix(data,nrow=4,byrow=F)
sum(log(p)*Z)
}

score = function(theta,data)
{
k = length(data)/4
S = numeric(3*k)
for(i in 1:k)
{
n = data[(1+4*(i-1)):(4*i)]
p = theta[(1+3*(i-1)):(3*i)]
P = 1-sum(p)
S[(1+3*(i-1)):(3*i)] = n[1:3]/p-n[4]/P
}
S
}

#where theta=(p11,p12,p13,p21,p22,p23,...,pK1,pK2,pK3).

#The function Rmat calculates the restriction matrix needed for constrained
estimation

Rmat = function(k)
{
R = matrix(1,4,3)
R[1,2] = R[1,3] = R[2,3] = R[3,2] = 0
RR = cbind(-R,R)
RRR = matrix(0,4*(k-1),3*k)
for ( i in 1:(k-1) ) RRR[4*(i-1)+1:4,3*(i-1)+1:6] = RR
RRR
}

#The function gen.data generates data

gen.data = function(p,N)
{
k = ncol(p)
Data = matrix(0,4,k)
for(j in 1:k)
{
Data[,j] = as.vector(rmultinom(1, size =
N[j], prob=p[,j]))
}
as.vector(Data)
}

#I have found that the function returns an error under some configurations
of the data ( 0's seem to harm it). Why is this happening and how can it #be
corrected? Why do some configurations with 0's crash the program and others
don't?

# For example for the data:

data1 = c(0,6,8,6,3,4,4,9,2,2,5,11)
data2 = c(0,6,8,6,0,4,4,9,2,2,5,11)

# inputs for constrOptim

K = 3
RR = Rmat(K)
zeros = rep(0,nrow(RR))
theta.0 = numeric(3*K)
for(i in 1:K) {theta.0[(1+3*(i-1)):(3*i)] = rep(1/10,3)+i/100}

# data1 gives an error but data2 does not!

constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros,
data = data1, method="BFGS",
control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par

constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros,
data = data2, method="BFGS",
control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par


# If you want to further check run the simul

[R] AR Simulation with non-normal innovations - Correct

2009-11-02 Thread Ricardo Gonçalves Silva
Dear Users,

I would like to simulate an AR(1) (y_t=ct1+y_t-1+e_t) model in R where the 
innovations are supposed to follow a t-GARCH(1,1) proccess. 

By t-GARCH I want to mean that:

e_t=n_t*sqrt(h_t) and
h_t=ct2+a*(e_t)^2+b*h_t-1.

where n_t is a random variable with t-Student distribution.

If someone could give some guidelines, I can going developing the model.
I did it in matlab, but the loops are very slowly, so I would like to try R.

Thanks in advance,

Rick
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] AR Simulation with non-normal innovations

2009-11-02 Thread Ricardo Gonçalves Silva
Dear Users,

I would like to simulate AR(1) (y_t=ct1+y_t-1+e_t) model in R where the 
innovations are supposed to follow a t-GARCH(1,1) proccess. 

By t-GARCH I want to mean that:

e_t=n_t*sqrt(h_t) and
h_t=ct2+a*(e_t)^2+b*h_t-1.

If someone could give some guidelines, I can going developing the model.
I did it in matlab, but the loops are very slowly, so I would like to try R.

Thanks in advance,

Rick
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problems with read.csv

2009-11-02 Thread Joe King
I use 

indata = read.csv(file.choose(),header=TRUE)

of course you can specify your file.

Joe King
206-913-2912
j...@joepking.com
"Never throughout history has a man who lived a life of ease left a name
worth remembering." --Theodore Roosevelt


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Fang (Betty) Yang
Sent: Monday, November 02, 2009 1:10 PM
To: r-help@r-project.org
Subject: [R] problems with read.csv

Dear all,

 

I'd like to ask help on R code to get the same results as the following
Splus code:

 

 

>indata<-importData("/home/data_new.csv")

 

>indata[1:5,4]

[1] 0930 1601 1006 1032 1020

 

I tried the following R code:

 

> indata<-read.csv("/home/data_new.csv")

> indata[1:5,4]

[1]  930 1601 1006 1032 1020

 

I'd like the first one to be 0930, too.

 

Thanks in advance,

 

 

Betty

 

 

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regex matching that gives byte offset?

2009-11-02 Thread Johannes Graumann
On Monday 02 November 2009 13:41:45 Prof Brian Ripley wrote:
> On Mon, 2 Nov 2009, Johannes Graumann wrote:
> > Hmmm ... that should do it, thanks. But how would one use this on a file
> > without reading it into memory completely?
> 
> ?file, ?readLines, ?readBin
> 
> will tell you about connections.
... all of which I only get to read by the line and a regexpr on that will not 
give me the absolute offset.
"grep -buo" on the unix command line is really fast for this. If I can't find 
the native R equivalent, I'm of a mind to do this via a sys call - ugly and 
not portable, but SOOO fast ... is it possible in R?

Joh

> 
> > Joh
> >
> > On Wednesday 28 October 2009 16:29:00 Prof Brian Ripley wrote:
> >> Do you mean like regexpr() (on the same help page)?
> >>
> >> Depending on your locale, you might actually prefer the character
> >> offset: if you want to match in a MBCS and have byte offsets you will
> >> need to work a bit harder if useBytes=TRUE is not sufficient for you.
> >>
> >> On Wed, 28 Oct 2009, Johannes Graumann wrote:
> >>> Hi,
> >>>
> >>> Is there any way of doing 'grep' ore something like it on the content
> >>> of a text file and extract the byte positioning of the match in the
> >>> file? I'm facing the need to access rather largish (>600MB) XML files
> >>> and would like to be able to index them ...
> >>>
> >>> Thanks for any help or flogging,
> >>>
> >>> Joh
> >>>
> >>> __
> >>> R-help@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html and provide commented,
> >>> minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plots with k-means

2009-11-02 Thread Iuri Gavronski
David, Eduardo,

Thanks for the code. I have run it and I'm not sure what to do with the
graph when it comes up. Can I interact with it, such as an RGL graph? I've
tried clicking or dragging with the mouse and nothing happens. My system is
a Windows Vista and R2.9.

Best,

Iuri.

On Mon, Nov 2, 2009 at 7:35 PM, David Winsemius wrote:

> The attached file did not come through to the list. I think you have some
> non-standard characters (or at least non-standard in my locale). I was able
> to get the code to run after using the Zap Gremlins function in
> TextWrangler. Prior to that "treatment" pretty much every line threw an
> error of this sort:
>
>
> > setClass(Class = 'POI',
> +representation(matrizSim = 'matrix',cos.query.docs = 'vector',
> Error: unexpected input in:
> "setClass(Class = 'POI',
> ¬"
>
> >  wordsInQuery = 'ANY',docs = 'matrix', objeto = 'matrix', objetoC
> Error: unexpected input in "¬"
> > = 'matrix',
> Error: unexpected '=' in "="
>
> >  Pcoords = 'matrix', PcoordsFI = 'matrix', newPcoords = 'matrix',
> Error: unexpected input in "¬"
> > newcoords = 'numeric' ,
> Error: unexpected ',' in "newcoords = 'numeric' ,"
>
> >  newcoords_1 = 'numeric',  M = 'numeric', poisTextCol =
> Error: unexpected input in "¬"
>
> I also needed to remove a couple of spaces between function names and
> parentheses when these occurred at ends-of-lines. Attached is a working
> version as a .txt file (which should make it through the list-serv:
>
>
>
>
>
> -- David.
> > sessionInfo()
> R version 2.10.0 Patched (2009-10-29 r50258)
> x86_64-apple-darwin9.8.0
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] splines   stats graphics  grDevices utils datasets  methods
> base
>
> other attached packages:
> [1] rms_2.1-0   Hmisc_3.7-0 survival_2.35-7
>
> loaded via a namespace (and not attached):
> [1] cluster_1.12.1  grid_2.10.0 lattice_0.17-26
>
>
>
>
> On Nov 2, 2009, at 3:43 PM, eduardo san miguel wrote:
>
>  I send r-code in an attached file.
>>
>> 2009/11/2 Iuri Gavronski :
>>
>>> Eduardo,
>>>
>>> Would you mind sending me the R code in an attached file. Your code
>>> didn't
>>> work here and I am not sure it is because of line breaks from the email
>>> program.
>>>
>>> Iuri.
>>>
>>> On Mon, Nov 2, 2009 at 10:53 AM, eduardo san miguel <
>>>
>> eduardosa...@gmail.com>
>>
>>> wrote:
>>>

 Hello all,

 I have almost finished the development of a new package where ideas
 from Tamara Munzner, George Furnas and Costa and Venturini are
 implemented.

 1.- Da Costa, David & Venturini, Gilles (2006). An Interactive
 Visualization Environment for Data Exploration Using Points of
 Interest. adma 2006: 416-423

 2.- Furnas, George (1986). Generalized Fisheye Views. Human Factors in
 computing systems, CHI '86 conference proceedings, ACM, New York, pp.
 16-23.

 3.- Heidi Lam, Ronald A. Rensink, and Tamara Munzner (2006). Effects
 of 2D Geometric Transformations on Visual Memory. Proc. Applied
 Perception in Graphics and Visualization (APGV 2006), 119-126, 2006.

 4.- Keith Lau, Ron Rensink, and Tamara Munzner (2004). Perceptual
 Invariance of Nonlinear Focus+Context Transformations. Proc. First
 Symposium on Applied Perception in Graphics and Visualization (APGV
 04) 2004, pp 65-72.

 This is a sample with some basic functionality and a VERY BASIC
 example with kmeans plotting.

 Comments will be greatly appreciated.

 Regards

 -- R CODE
 require(methods)

 setClass(Class = 'POI',
  representation(matrizSim = 'matrix',cos.query.docs = 'vector',
wordsInQuery = 'ANY',docs = 'matrix', objeto = 'matrix', objetoC
 = 'matrix',
Pcoords = 'matrix', PcoordsFI = 'matrix', newPcoords = 'matrix',
 newcoords = 'numeric' ,
newcoords_1 = 'numeric',  M = 'numeric', poisTextCol =
 'character' , colores = 'vector' ,
poisCircleCol = 'character' , linesCol = 'character', itemsCol =
 'character',
LABELS =  'logical',  vscale = 'numeric',  hscale = 'numeric',
 circleCol = 'character',
plotCol = 'character',  itemsFamily = 'character',  lenteDefault
 = 'numeric',
zoomDefault = 'numeric' ,  rateDefault = 'numeric' ,
 topKDefault = 'numeric'  ,
pal = 'character',  selected = 'numeric' ,  circRadio =
 'numeric' , IncVscale = 'numeric',
cgnsphrFont = 'numeric', xClick_old = 'numeric',  yClick_old =
 'numeric',
wordsInQueryFull = 'character' ),
prototype(cos.query.docs = 0, colores = 0, newcoords = 0,
 newcoords_1 = 0, M = 3,
 vscale = 0.5 , hscale = 1.5 , circleCol = 'black' ,
 itemsCol = 'white',
 poisTextCol =  '#fff5ee',  poisCircleCol = '#fff5ee',
 linesCol = 'white',
 plotCol = 'black', itemsFamily = 'sans', lenteDefault 

Re: [R] problems with read.csv

2009-11-02 Thread David Winsemius


On Nov 2, 2009, at 4:09 PM, Fang (Betty) Yang wrote:


Dear all,

I'd like to ask help on R code to get the same results as the  
following

Splus code:


indata<-importData("/home/data_new.csv")



indata[1:5,4]


[1] 0930 1601 1006 1032 1020

I tried the following R code:


indata<-read.csv("/home/data_new.csv")


Do you get what you desire with setting as.is=TRUE?

indata<-read.csv("/home/data_new.csv", as.is=TRUE )

Also see the colClasses arguments in read.xxx functions.



indata[1:5,4]


[1]  930 1601 1006 1032 1020

I'd like the first one to be 0930, too.



So you want the strings (aka "character" vectors) to stay strings.
--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problems with read.csv

2009-11-02 Thread Ista Zahn
Superficially, one answer is

indata <- read.csv("/home/data_new.csv", colClasses="character")

but I'm not sure that's what you want...

-Ista

On Mon, Nov 2, 2009 at 4:09 PM, Fang (Betty) Yang  wrote:
> Dear all,
>
>
>
> I'd like to ask help on R code to get the same results as the following
> Splus code:
>
>
>
>
>
>>indata<-importData("/home/data_new.csv")
>
>
>
>>indata[1:5,4]
>
> [1] 0930 1601 1006 1032 1020
>
>
>
> I tried the following R code:
>
>
>
>> indata<-read.csv("/home/data_new.csv")
>
>> indata[1:5,4]
>
> [1]  930 1601 1006 1032 1020
>
>
>
> I'd like the first one to be 0930, too.
>
>
>
> Thanks in advance,
>
>
>
>
>
> Betty
>
>
>
>
>
>
>
>
>
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plots with k-means

2009-11-02 Thread David Winsemius
The attached file did not come through to the list. I think you have  
some non-standard characters (or at least non-standard in my locale).  
I was able to get the code to run after using the Zap Gremlins  
function in TextWrangler. Prior to that "treatment" pretty much every  
line threw an error of this sort:


> setClass(Class = 'POI',
+representation(matrizSim = 'matrix',cos.query.docs = 'vector',
Error: unexpected input in:
"setClass(Class = 'POI',
¬"
>  wordsInQuery = 'ANY',docs = 'matrix', objeto = 'matrix', objetoC
Error: unexpected input in "¬"
> = 'matrix',
Error: unexpected '=' in "="
>  Pcoords = 'matrix', PcoordsFI = 'matrix', newPcoords = 'matrix',
Error: unexpected input in "¬"
> newcoords = 'numeric' ,
Error: unexpected ',' in "newcoords = 'numeric' ,"
>  newcoords_1 = 'numeric',  M = 'numeric', poisTextCol =
Error: unexpected input in "¬"

I also needed to remove a couple of spaces between function names and  
parentheses when these occurred at ends-of-lines. Attached is a  
working version as a .txt file (which should make it through the list- 
serv:


require(methods)

setClass(Class = 'POI', representation(matrizSim = 'matrix',cos.query.docs = 
'vector', wordsInQuery = 'ANY',docs = 'matrix', objeto = 'matrix', objetoC = 
'matrix', Pcoords = 'matrix', PcoordsFI = 'matrix', newPcoords = 'matrix', 
newcoords = 'numeric' , newcoords_1 = 'numeric', M = 'numeric', poisTextCol = 
'character' , colores = 'vector' , poisCircleCol = 'character' , linesCol = 
'character', itemsCol = 'character',  LABELS = 'logical', vscale = 'numeric', 
hscale = 'numeric', circleCol = 'character', plotCol = 'character', itemsFamily 
= 'character', lenteDefault = 'numeric', zoomDefault = 'numeric' , rateDefault 
= 'numeric' , topKDefault = 'numeric' , pal = 'character', selected = 'numeric' 
, circRadio = 'numeric' , IncVscale = 'numeric', cgnsphrFont = 'numeric', 
xClick_old = 'numeric', yClick_old = 'numeric', wordsInQueryFull = 'character' 
), prototype(cos.query.docs = 0, colores = 0, newcoords = 0, newcoords_1 = 0, M 
= 3, vscale = 0.5 , hscale = 1.5 , circleCol = 'black' , itemsCol = 'white', 
poisTextCol = '#fff5ee', poisCircleCol = '#fff5ee', linesCol = 'white', plotCol 
= 'black', itemsFamily = 'sans', lenteDefault = 1, zoomDefault = 15 , 
rateDefault = 0.1 , topKDefault = 25, pal = 'topo' , selected = 1 , circRadio = 
0.25 , IncVscale = 0.05 , cgnsphrFont = 1.01, LABELS = T) ) 

setGeneric("puntosMedios" , function(Pcoords, detalle = 
5){standardGeneric("puntosMedios")})
setMethod("puntosMedios" , signature = "matrix", function(Pcoords, detalle = 
5){ 
  for (i in 1:detalle){ 
new_pcoords = matrix(rep(0,4*nrow(Pcoords)), nrow = 2* nrow(Pcoords), byrow = T 
)
cont = 0
for (i in 1:nrow(Pcoords)){
if (i == nrow(Pcoords)) { 
cont = cont + 1
new_pcoords[cont,] = Pcoords[i,]
cont = cont + 1
new_pcoords[cont,] = Pcoords[i,] - ((Pcoords[i,]-Pcoords[1,])/2)
}else{
cont = cont + 1
new_pcoords[cont,] = Pcoords[i,]
cont = cont + 1
new_pcoords[cont,] = Pcoords[i,] - 
((Pcoords[i,]-Pcoords[i+1,])/2)}}
Pcoords = new_pcoords}
return(Pcoords)

}
)

setGeneric("fishIout" , function(x, value){standardGeneric ("fishIout")})

setMethod("fishIout" ,
signature = "numeric",
function(x, value){

d = value
if (x > 0){
signo = 1
}else{
signo = -1
}
x = abs(x)
return(signo*(-(x/((d*x)-d-1
}
)

setGeneric("fishIin" ,
function(x, value){standardGeneric ("fishIin")})

setMethod("fishIin" ,
signature = "numeric",
function(x, value){

d = value
if (x > 0){
signo = 1
}else{
signo = -1
}
x = abs(x)

return(signo*(((d+1)*x)/(d*x+1)))
}
)

setGeneric("toPolar" ,
function(x, y){standardGeneric ("toPolar")})

setMethod("toPolar" ,
signature = "numeric",
function(x, y){

t1 = atan2(y,x)
rP = sqrt(x^2+y^2)
return(c(t1 = t1,rP = rP))

}
)

setGeneric("toCartesian" ,
function(t1, rP){standardGeneric ("toCartesian")})

setMethod("toCartesian" ,
signature = "numeric",
function(t1, rP){

x1 = rP*cos(t1)
y1 = rP*sin(t1)
return(c(x = x1,y = y1))

}
)

setGeneric("circulo" ,
function(cx, cy, r, circleCol, PLOT =
TRUE){standardGeneric ("circulo")})

setMethod("circulo" ,
signature = "numeric",
function(cx, cy, r, circleCol, PLOT = TRUE){
t = seq(0,2*pi,length=100)
circle = t(rbind(cx+sin(t)*r,cy+cos(t)*r))
if (PLOT == TRUE) 
plot(circle,type='l',,ylim=c(-1.15,1.15),xlim=c(-1.15,1.15),
ann=FALSE, axes=F, col = circleCol)
return(circle)

}
)

setGeneric("circulin" ,
function(cx, cy, r = 0.045,
objeto, col = 'blue', PLOT = TRUE, label = 0){
standardGeneric ("circulin")})

setMethod("circulin" ,
signature = "ANY",
function(cx, cy, r = 0.045, objeto, col = 'blue', P

Re: [R] Robust ANOVA or alternative test?

2009-11-02 Thread Kingsford Jones
See the nlme library and accompanying book by Pinheiro and Bates.  The
lme function is appropriate for linear mixed models with normal
errors, and the 'weights' argument provides flexible methods for
modeling error heterogeneity.

hth,
Kingsford

On Mon, Nov 2, 2009 at 12:10 PM, jinyan fan  wrote:
> Dear Colleagues,
>
> I am running a three-way mixed-design ANOVA, with one manipulated IV, a group 
> membership IV, and pre-/post- within subject factor. I am interested in the 
> three-way interaction effect. The regular ANVOA is problematic, because (a) 
> the sample sizes are very unbalanced, due to the group IV, (b) the 
> homogeneity of variance is violated, and (c) the homogeneity of covariance is 
> also violated. I wonder if R has a procedure to address this issue, either by 
> some sort of Robust ANOVA procedure, or by some alternative tests? Thanks a 
> lot.
>
> Jinyan Fan
>
> Assistant Professor
> Department of Psychology
> Hofstra University
> Hempstead, NY 11550
> Work: 516-463-6349
>
>
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] need help in using Hessian matrix

2009-11-02 Thread Laila Alkhalfan
Hi
I need to find the Hessian matrix for a complicated function from a certain
kind of data but i keep getting this error
Error in f1 - f2 : non-numeric argument to binary operator

the data is given by

  U<-runif(n)
  Us<-sort(U)
  tau1<- 2
  F1tau<- pgamma((tau1/theta1),shape,1)
  N1<-sum(Ushttps://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problems with read.csv

2009-11-02 Thread Fang (Betty) Yang
Dear all,

 

I'd like to ask help on R code to get the same results as the following
Splus code:

 

 

>indata<-importData("/home/data_new.csv")

 

>indata[1:5,4]

[1] 0930 1601 1006 1032 1020

 

I tried the following R code:

 

> indata<-read.csv("/home/data_new.csv")

> indata[1:5,4]

[1]  930 1601 1006 1032 1020

 

I'd like the first one to be 0930, too.

 

Thanks in advance,

 

 

Betty

 

 

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread carol white
Thanks for all your replies.

I just wanted to compare the distributions of two populations and see how 
different they are. I thought that QQplot would be a good idea. It's true that 
the mean and sd of the two distributions are not the same. I don't know if any 
other method or graphical presentation like five number, mean and sd as 
suggested, or boxplot would have been more suited to be used.

thanks for your advices,

--- On Mon, 11/2/09, John Fox  wrote:

> From: John Fox 
> Subject: RE: [R] qqplot
> To: "'Peter Flom'" 
> Cc: r-h...@stat.math.ethz.ch, "'carol white'" , "'Yihui 
> Xie'" 
> Date: Monday, November 2, 2009, 9:24 AM
> Dear Peter,
> 
> I assumed that Carol wanted to compare the shapes of the
> distributions and
> to adjust for differences in centre and spread. To put a
> line through the
> quartiles or to base a line on the medians and IQRs is more
> robust than
> using the means and sds.
> 
> Best,
>  John
> 
> 
> > -Original Message-
> > From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org]
> On
> > Behalf Of Peter Flom
> > Sent: November-02-09 11:57 AM
> > To: carol white; Yihui Xie
> > Cc: r-h...@stat.math.ethz.ch
> > Subject: Re: [R] qqplot
> > 
> > carol white 
> wrote
> > 
> > >So the conclusion is that abline(0,1) should
> always be used and if it
> > doesn't go through the qqplot, the two distributions
> are not similar?
> > 
> > I think it depends what you mean by "similar". 
> E.g., if you mean "are
> both
> > of these distributions (e.g.) normal?" then
> abline(0,1) is not always
> useful.
> > But if you mean "Do these have the same mean, sd, and
> distribution?" then
> > abline(0,1) is the way to go.
> > 
> > Peter
> > 
> > Peter L. Flom, PhD
> > Statistical Consultant
> > Website: www DOT peterflomconsulting DOT com
> > Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
> > Twitter: �...@peterflom
> > 
> > __
> > R-help@r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> reproducible code.
> 
> 
> 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plots with k-means

2009-11-02 Thread eduardo san miguel
I send r-code in an attached file.

2009/11/2 Iuri Gavronski :
> Eduardo,
>
> Would you mind sending me the R code in an attached file. Your code didn't
> work here and I am not sure it is because of line breaks from the email
> program.
>
> Iuri.
>
> On Mon, Nov 2, 2009 at 10:53 AM, eduardo san miguel <
eduardosa...@gmail.com>
> wrote:
>>
>> Hello all,
>>
>> I have almost finished the development of a new package where ideas
>> from Tamara Munzner, George Furnas and Costa and Venturini are
>> implemented.
>>
>> 1.- Da Costa, David & Venturini, Gilles (2006). An Interactive
>> Visualization Environment for Data Exploration Using Points of
>> Interest. adma 2006: 416-423
>>
>> 2.- Furnas, George (1986). Generalized Fisheye Views. Human Factors in
>> computing systems, CHI '86 conference proceedings, ACM, New York, pp.
>> 16-23.
>>
>> 3.- Heidi Lam, Ronald A. Rensink, and Tamara Munzner (2006). Effects
>> of 2D Geometric Transformations on Visual Memory. Proc. Applied
>> Perception in Graphics and Visualization (APGV 2006), 119-126, 2006.
>>
>> 4.- Keith Lau, Ron Rensink, and Tamara Munzner (2004). Perceptual
>> Invariance of Nonlinear Focus+Context Transformations. Proc. First
>> Symposium on Applied Perception in Graphics and Visualization (APGV
>> 04) 2004, pp 65-72.
>>
>> This is a sample with some basic functionality and a VERY BASIC
>> example with kmeans plotting.
>>
>> Comments will be greatly appreciated.
>>
>> Regards
>>
>> -- R CODE
>> require(methods)
>>
>>  setClass(Class = 'POI',
>>representation(matrizSim = 'matrix',cos.query.docs = 'vector',
>>  wordsInQuery = 'ANY',docs = 'matrix', objeto = 'matrix', objetoC
>> = 'matrix',
>>  Pcoords = 'matrix', PcoordsFI = 'matrix', newPcoords = 'matrix',
>> newcoords = 'numeric' ,
>>  newcoords_1 = 'numeric',  M = 'numeric', poisTextCol =
>> 'character' , colores = 'vector' ,
>>  poisCircleCol = 'character' , linesCol = 'character', itemsCol =
>> 'character',
>>  LABELS =  'logical',  vscale = 'numeric',  hscale = 'numeric',
>> circleCol = 'character',
>>  plotCol = 'character',  itemsFamily = 'character',  lenteDefault
>> = 'numeric',
>>  zoomDefault = 'numeric' ,  rateDefault = 'numeric' ,
>> topKDefault = 'numeric'  ,
>>  pal = 'character',  selected = 'numeric' ,  circRadio =
>> 'numeric' , IncVscale = 'numeric',
>>  cgnsphrFont = 'numeric', xClick_old = 'numeric',  yClick_old =
>> 'numeric',
>>  wordsInQueryFull = 'character' ),
>>  prototype(cos.query.docs = 0, colores = 0, newcoords = 0,
>> newcoords_1 = 0, M = 3,
>>   vscale = 0.5 , hscale = 1.5 , circleCol = 'black' ,
>> itemsCol = 'white',
>>   poisTextCol =  '#fff5ee',  poisCircleCol = '#fff5ee',
>> linesCol = 'white',
>>   plotCol = 'black', itemsFamily = 'sans', lenteDefault =
>> 1, zoomDefault = 15 ,
>>   rateDefault = 0.1 , topKDefault = 25,  pal = 'topo' ,
>> selected = 1 ,
>>   circRadio = 0.25  , IncVscale = 0.05  ,  cgnsphrFont =
>> 1.01, LABELS = T)
>>  )
>>
>>  setGeneric("puntosMedios" ,
>> function(Pcoords, detalle = 5){standardGeneric
>> ("puntosMedios")})
>>
>>  setMethod("puntosMedios" ,
>>signature = "matrix",
>>function(Pcoords, detalle = 5){
>>
>>  for (i in 1:detalle){
>>new_pcoords = matrix(rep(0,4*nrow(Pcoords)), nrow = 2*
>> nrow(Pcoords), byrow = T )
>>cont = 0
>>for (i in 1:nrow(Pcoords)){
>>   if (i == nrow(Pcoords)) {
>>cont = cont + 1
>>new_pcoords[cont,] = Pcoords[i,]
>>cont = cont + 1
>>new_pcoords[cont,] = Pcoords[i,] -
>> ((Pcoords[i,]-Pcoords[1,])/2)
>>}else{
>>cont = cont + 1
>>new_pcoords[cont,] = Pcoords[i,]
>>cont = cont + 1
>>new_pcoords[cont,] = Pcoords[i,] -
>> ((Pcoords[i,]-Pcoords[i+1,])/2)}}
>>Pcoords = new_pcoords}
>>return(Pcoords)
>>
>>   }
>>  )
>>
>>  setGeneric("fishIout" ,
>> function(x, value){standardGeneric ("fishIout")})
>>
>>  setMethod("fishIout" ,
>>signature = "numeric",
>>function(x, value){
>>
>>  d = value
>>if (x > 0){
>>signo = 1
>>}else{
>>signo = -1
>>}
>>x = abs(x)
>>return(signo*(-(x/((d*x)-d-1
>>   }
>>  )
>>
>>  setGeneric("fishIin" ,
>> function(x, value){standardGeneric ("fishIin")})
>>
>>  setMethod("fishIin" ,
>>signature = "numeric",
>>function(x, value){
>>
>>  d = value
>>if (x > 0){
>>signo = 1
>>}else{
>>signo = -1
>>}
>>x = abs(x)
>>
>>return(signo*(((d+1)*x)/(d*x+1)))
>>   }
>>  )
>>
>>  setGeneric("toPolar" ,
>> function(x, y){standardGeneric ("toPolar")})
>>
>>  setMethod("toPolar" ,
>>signature = "numeric",
>>function(x, y){
>>
>>t1 = atan2(y,x)
>> 

[R] reference on p.adjust()

2009-11-02 Thread Peng Yu
I'm wondering if there is a textbook that summarize the methods on
adjusting p-values for multiple comparisons. Most of the references on
p.adjust() are over 10 years old. I feel it would be better if
somebody could recommend me a textbook on multiple hypothesis
correction, so that I can quickly get a general idea of different
methods.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Frequency

2009-11-02 Thread David Winsemius


On Nov 2, 2009, at 2:49 PM, Jorge Ivan Velez wrote:


Hi Ashta,

Yes, it is possible. Here is a suggestion:

# Data set
x <- read.table(textConnection("v1  v2  v3   v4
569   10
347   10
46   10   18"), header = TRUE)
closeAllConnections()

# Table
res <- table(data.matrix(x))
f <-  sort(res, decreasing = TRUE)
data.frame(value = names(f), counts = f)

#  :-)
require(fortunes)
fortune('Only how')





HTH,
Jorge


On Mon, Nov 2, 2009 at 2:26 PM, Ashta <> wrote:


Thank you Jorge and






# 10  4  6  3  5  7  9 18
#  3  2  2  1  1  1  1  1


This one works fine for me.  Is it possible to transpose it?
I tried  t(res[order(res, decreasing = TRUE)]), but it did not work!


And in the spirit of that fortune above, also look at:

t(t(res[order(res, decreasing = TRUE)]))


The extra t() coerces to a matrix and the outer t() does the transpose  
(I think).


--
David.



I want the result like this
10  2
4   2
6   2
3   1
.  .
.  .




On Mon, Nov 2, 2009 at 1:45 PM, Jorge Ivan Velez
<> wrote:

Hi Val,

Here is a suggestion:

res <- table(unlist(x))
res[order(res, decreasing = TRUE)]
# 10  4  6  3  5  7  9 18
#  3  2  2  1  1  1  1  1

HTH,
Jorge


On Mon, Nov 2, 2009 at 1:35 PM, Val <> wrote:


BAYESIAN INFERENCES FOR MILKING TEMPERAMENT IN CANADIAN HOLSTEINS

Hi All,

I have a data  set "x"  with several variables. Sample of the  
data is

shown

below

V1  v2  v3   v4

 569   10

347   10

46   10   18



I want the frequency  of each  data point sorted by their  
occurrence.




Below is the output that I want

10=3

6=2

4=2

9=1

5=1

7=1

3=1

How do I do it in R?



Thanks in advance



Val

  [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



  [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] "Safe" way to automatically install required packages...

2009-11-02 Thread Charlie Sharpsteen
On Mon, Nov 2, 2009 at 10:56 AM, Jonathan Greenberg
 wrote:
> R-helpers:
>
>   I'm working on an r-package that I want to make as easy-to-use as possible
> for a novice R-user, which includes automatically installing required
> packages.   I, myself, am a novice R-packager, so the solution I came up
> with was to embed:
>
> print("Loading required packages...")
> if (!require("reshape")) { install.packages("reshape") }
> if (!require("reshape")) {
>   print("Could not install package 'reshape', please contact your
> sysadmin.")
>   return()
> }
>
>   in the code proper, and put together the package using package.skeleton()
> and R CMD build.
>
>   I'm guessing there's a better way to do this -- any suggestions?
> --j

Place the dependencies of your package in a comma-seperated list in the

  depends:

field of the DESCRIPTION file. When a user runs install.packages(
'yourPackage', dependencies = T ), R will take care of downloading and
installing the dependencies

-Charlie

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Incremental ReadLines

2009-11-02 Thread Gene Leynes
James,

I think those are Unix commands?  I'm on Windows, so that's not an option
(for now)

Also the suggestions posed by Duncan and Phil seem to be working.  Thank you
so much, such a simple thing to add the "r" or "rt" to the file connection.


I read about blocking, but I didn't imagine that it meant "chunks".  I was
thinking something more like "blocking out", or guarding (perhaps for
security).

On Mon, Nov 2, 2009 at 1:47 PM, James W. MacDonald wrote:

> Hi Gene,
>
> Rather than using R to parse this file, have you considered using either
> grep or sed to pre-process the file and then read it in?
>
> It looks like you just want lines starting with numbers, so something like
>
> grep '^[0-9]\+' thefile.csv > otherfile.csv
>
> should be much faster, and then you can just read in otherfile.csv using
> read.csv().
>
> Best,
>
> Jim
>
>
>
> Gene Leynes wrote:
>
>> I've been trying to figure out how to read in a large file for a few days
>> now, and after extensive research I'm still not sure what to do.
>>
>> I have a large comma delimited text file that contains 59 fields in each
>> record.
>> There is also a header every 121 records
>>
>> This function works well for smallish records
>> getcsv=function(fname){
>>ff=file(description = fname)
>>x <- readLines(ff)
>>closeAllConnections()
>>x <- x[x != ""]  # REMOVE BLANKS
>>x=x[grep("^[-0-9]", x)]  # REMOVE ALL TEXT
>>
>>spl=strsplit(x,',')  # THIS PART IS SLOW, BUT MANAGABLE
>>
>>
>> xx=t(sapply(1:length(spl),function(temp)as.vector(na.omit(as.numeric(spl[[temp]])
>>return(xx)
>> }
>> It's not elegant, but it works.
>> For 121,000 records it completes in 2.3 seconds
>> For 121,000*5 records it completes in 63 seconds
>> For 121,000*10 records it doesn't complete
>>
>> When I try other methods to read the file in chunks (using scan), the
>> process breaks down because I have to start at the beginning of the file
>> on
>> every iteration.
>> For example:
>> fnn=function(n,col){
>>a=122*(n-1)+2
>>xx=scan(fname,skip=a-1,nlines=121,sep=',',quiet=TRUE,what=character(0))
>>xx=xx[xx!='']
>>xx=matrix(xx,ncol=49,byrow=TRUE)
>>xx[,col]
>> }
>> system.time(sapply(1:10,fnn,c=26)) # 0.31 Seconds
>> system.time(sapply(91:90,fnn,c=26))# 1.09 Seconds
>> system.time(sapply(901:910,fnn,c=26))  # 5.78 Seconds
>>
>> Even though I'm only getting the 26th column for 10 sets of records, it
>> takes a lot longer the further into the file I go.
>>
>> How can I tell scan to pick up where it left off, without it starting at
>> the
>> beginning??  There must be a good example somewhere.
>>
>> I have done a lot of research (in fact, thank you to Michael J. Crawley
>> and
>> others for your help thus far)
>>
>> Thanks,
>>
>> Gene
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sorting NA's and Graphing a histogram

2009-11-02 Thread Jorge Ivan Velez
Hi Koraelus,

Here is a suggestion for the first part:

index <- apply(Dataset, 1, function(x) sum( is.na( x ) ) < 2 )
d2 <- Dataset[index ,]
d2
#  Alpha Beta Gamma Delta
# A 12 3 4
# BNA2 4 5
# D 891011
# E 5   NA 713

Could you please explain us in more detail what would you like to do in the
second part?

HTH,
Jorge


On Mon, Nov 2, 2009 at 1:57 PM, Koraelus <> wrote:

>
> Let's say I have a dataset
>
> Dataset<-read.csv("Dataset.txt", header=T, row.names=1)
>
> Dataset
>   Alpha Beta Gamma Delta
> A   1  23   4
> B   NA24   5
> C   NA3NA NA
> D   8  9   10  11
> E5 NA 713
> F   NANA NA   4
>
> I want to remove all rows with more than one NA
>
> So I want it to look like
>
>   Alpha Beta Gamma Delta
> A 12  34
> B  NA 2   45
> D  8   9  10   11
> E  5   NA  7   13
>
> Then I want to turn the rownames into variables A, B, D, E that I can work
> with as if I had used the attach() cmd, but on the rownames.
>
> plot(A)
> --
> View this message in context:
> http://old.nabble.com/Sorting-NA%27s-and-Graphing-a-histogram-tp26157779p26157779.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Robust ANOVA or alternative test?

2009-11-02 Thread Liviu Andronic
Hello

On 11/2/09, jinyan fan  wrote:
> to address this issue, either by some sort of Robust ANOVA procedure, or by 
> some alternative tests? Thanks a lot.
>
I cannot address this specific question, but you may want to address
robustness-related queries to r-sig-robust, and perhaps check the
Robust [1] Task View.
Best
Liviu

[1] http://cran.r-project.org/web/views/Robust.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Frequency

2009-11-02 Thread Jorge Ivan Velez
Hi Ashta,

Yes, it is possible. Here is a suggestion:

# Data set
x <- read.table(textConnection("v1  v2  v3   v4
569   10
347   10
46   10   18"), header = TRUE)
closeAllConnections()

# Table
res <- table(data.matrix(x))
f <-  sort(res, decreasing = TRUE)
data.frame(value = names(f), counts = f)

#  :-)
require(fortunes)
fortune('Only how')

HTH,
Jorge


On Mon, Nov 2, 2009 at 2:26 PM, Ashta <> wrote:

> Thank you Jorge and
>
> > res <- table(unlist(x))
> > res[order(res, decreasing = TRUE)]
> > # 10  4  6  3  5  7  9 18
> > #  3  2  2  1  1  1  1  1
>
> This one works fine for me.  Is it possible to transpose it?
> I tried  t(res[order(res, decreasing = TRUE)]), but it did not work!
>
> I want the result like this
> 10  2
>  4   2
>  6   2
>  3   1
>  .  .
>  .  .
>
>
>
>
> On Mon, Nov 2, 2009 at 1:45 PM, Jorge Ivan Velez
> <> wrote:
> > Hi Val,
> >
> > Here is a suggestion:
> >
> > res <- table(unlist(x))
> > res[order(res, decreasing = TRUE)]
> > # 10  4  6  3  5  7  9 18
> > #  3  2  2  1  1  1  1  1
> >
> > HTH,
> > Jorge
> >
> >
> > On Mon, Nov 2, 2009 at 1:35 PM, Val <> wrote:
> >
> >> BAYESIAN INFERENCES FOR MILKING TEMPERAMENT IN CANADIAN HOLSTEINS
> >>
> >> Hi All,
> >>
> >> I have a data  set "x"  with several variables. Sample of the data is
> shown
> >> below
> >>
> >>  V1  v2  v3   v4
> >>
> >>   569   10
> >>
> >>  347   10
> >>
> >>  46   10   18
> >>
> >>
> >>
> >> I want the frequency  of each  data point sorted by their occurrence.
> >>
> >>
> >>
> >> Below is the output that I want
> >>
> >> 10=3
> >>
> >> 6=2
> >>
> >> 4=2
> >>
> >> 9=1
> >>
> >> 5=1
> >>
> >> 7=1
> >>
> >> 3=1
> >>
> >> How do I do it in R?
> >>
> >>
> >>
> >> Thanks in advance
> >>
> >>
> >>
> >> Val
> >>
> >>[[alternative HTML version deleted]]
> >>
> >> __
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Incremental ReadLines

2009-11-02 Thread James W. MacDonald

Hi Gene,

Rather than using R to parse this file, have you considered using either 
grep or sed to pre-process the file and then read it in?


It looks like you just want lines starting with numbers, so something like

grep '^[0-9]\+' thefile.csv > otherfile.csv

should be much faster, and then you can just read in otherfile.csv using 
read.csv().


Best,

Jim



Gene Leynes wrote:

I've been trying to figure out how to read in a large file for a few days
now, and after extensive research I'm still not sure what to do.

I have a large comma delimited text file that contains 59 fields in each
record.
There is also a header every 121 records

This function works well for smallish records
getcsv=function(fname){
ff=file(description = fname)
x <- readLines(ff)
closeAllConnections()
x <- x[x != ""]  # REMOVE BLANKS
x=x[grep("^[-0-9]", x)]  # REMOVE ALL TEXT

spl=strsplit(x,',')  # THIS PART IS SLOW, BUT MANAGABLE

xx=t(sapply(1:length(spl),function(temp)as.vector(na.omit(as.numeric(spl[[temp]])
return(xx)
}
It's not elegant, but it works.
For 121,000 records it completes in 2.3 seconds
For 121,000*5 records it completes in 63 seconds
For 121,000*10 records it doesn't complete

When I try other methods to read the file in chunks (using scan), the
process breaks down because I have to start at the beginning of the file on
every iteration.
For example:
fnn=function(n,col){
a=122*(n-1)+2
xx=scan(fname,skip=a-1,nlines=121,sep=',',quiet=TRUE,what=character(0))
xx=xx[xx!='']
xx=matrix(xx,ncol=49,byrow=TRUE)
xx[,col]
}
system.time(sapply(1:10,fnn,c=26)) # 0.31 Seconds
system.time(sapply(91:90,fnn,c=26))# 1.09 Seconds
system.time(sapply(901:910,fnn,c=26))  # 5.78 Seconds

Even though I'm only getting the 26th column for 10 sets of records, it
takes a lot longer the further into the file I go.

How can I tell scan to pick up where it left off, without it starting at the
beginning??  There must be a good example somewhere.

I have done a lot of research (in fact, thank you to Michael J. Crawley and
others for your help thus far)

Thanks,

Gene

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Incremental ReadLines

2009-11-02 Thread Duncan Murdoch

On 11/2/2009 2:03 PM, Gene Leynes wrote:

I've been trying to figure out how to read in a large file for a few days
now, and after extensive research I'm still not sure what to do.

I have a large comma delimited text file that contains 59 fields in each
record.
There is also a header every 121 records


You can open the connection before reading, then read in blocks of lines 
and process those.  You don't need to reopen it every time.  For example,


ff <- file(fname, open="rt")  # rt is read text
for (block in 1:nblocks) {
  x <- readLines(ff, n=121)
  # process this block
}
close(ff)

Duncan Murdoch



This function works well for smallish records
getcsv=function(fname){
ff=file(description = fname)
x <- readLines(ff)
closeAllConnections()
x <- x[x != ""]  # REMOVE BLANKS
x=x[grep("^[-0-9]", x)]  # REMOVE ALL TEXT

spl=strsplit(x,',')  # THIS PART IS SLOW, BUT MANAGABLE

xx=t(sapply(1:length(spl),function(temp)as.vector(na.omit(as.numeric(spl[[temp]])
return(xx)
}
It's not elegant, but it works.
For 121,000 records it completes in 2.3 seconds
For 121,000*5 records it completes in 63 seconds
For 121,000*10 records it doesn't complete

When I try other methods to read the file in chunks (using scan), the
process breaks down because I have to start at the beginning of the file on
every iteration.
For example:
fnn=function(n,col){
a=122*(n-1)+2
xx=scan(fname,skip=a-1,nlines=121,sep=',',quiet=TRUE,what=character(0))
xx=xx[xx!='']
xx=matrix(xx,ncol=49,byrow=TRUE)
xx[,col]
}
system.time(sapply(1:10,fnn,c=26)) # 0.31 Seconds
system.time(sapply(91:90,fnn,c=26))# 1.09 Seconds
system.time(sapply(901:910,fnn,c=26))  # 5.78 Seconds

Even though I'm only getting the 26th column for 10 sets of records, it
takes a lot longer the further into the file I go.

How can I tell scan to pick up where it left off, without it starting at the
beginning??  There must be a good example somewhere.

I have done a lot of research (in fact, thank you to Michael J. Crawley and
others for your help thus far)

Thanks,

Gene

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Frequency

2009-11-02 Thread Ashta
Thank you Jorge and

> res <- table(unlist(x))
> res[order(res, decreasing = TRUE)]
> # 10  4  6  3  5  7  9 18
> #  3  2  2  1  1  1  1  1

This one works fine for me.  Is it possible to transpose it?
I tried  t(res[order(res, decreasing = TRUE)]), but it did not work!

I want the result like this
10  2
 4   2
 6   2
 3   1
  .  .
  .  .




On Mon, Nov 2, 2009 at 1:45 PM, Jorge Ivan Velez
 wrote:
> Hi Val,
>
> Here is a suggestion:
>
> res <- table(unlist(x))
> res[order(res, decreasing = TRUE)]
> # 10  4  6  3  5  7  9 18
> #  3  2  2  1  1  1  1  1
>
> HTH,
> Jorge
>
>
> On Mon, Nov 2, 2009 at 1:35 PM, Val <> wrote:
>
>> BAYESIAN INFERENCES FOR MILKING TEMPERAMENT IN CANADIAN HOLSTEINS
>>
>> Hi All,
>>
>> I have a data  set "x"  with several variables. Sample of the data is shown
>> below
>>
>>  V1  v2  v3   v4
>>
>>   5    6    9   10
>>
>>  3    4    7   10
>>
>>  4    6   10   18
>>
>>
>>
>> I want the frequency  of each  data point sorted by their occurrence.
>>
>>
>>
>> Below is the output that I want
>>
>> 10    =3
>>
>> 6=2
>>
>> 4=2
>>
>> 9=1
>>
>> 5=1
>>
>> 7=1
>>
>> 3=1
>>
>> How do I do it in R?
>>
>>
>>
>> Thanks in advance
>>
>>
>>
>> Val
>>
>>        [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sorting NA's and Graphing a histogram

2009-11-02 Thread Koraelus

Let's say I have a dataset

Dataset<-read.csv("Dataset.txt", header=T, row.names=1)

Dataset
   Alpha Beta Gamma Delta
A   1  23   4
B   NA24   5
C   NA3NA NA
D   8  9   10  11
E5 NA 713
F   NANA NA   4

I want to remove all rows with more than one NA

So I want it to look like

   Alpha Beta Gamma Delta
A 12  34
B  NA 2   45
D  8   9  10   11
E  5   NA  7   13

Then I want to turn the rownames into variables A, B, D, E that I can work
with as if I had used the attach() cmd, but on the rownames.

plot(A)
-- 
View this message in context: 
http://old.nabble.com/Sorting-NA%27s-and-Graphing-a-histogram-tp26157779p26157779.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Robust ANOVA or alternative test?

2009-11-02 Thread jinyan fan
Dear Colleagues,
 
I am running a three-way mixed-design ANOVA, with one manipulated IV, a group 
membership IV, and pre-/post- within subject factor. I am interested in the 
three-way interaction effect. The regular ANVOA is problematic, because (a) the 
sample sizes are very unbalanced, due to the group IV, (b) the homogeneity 
of variance is violated, and (c) the homogeneity of covariance is also 
violated. I wonder if R has a procedure to address this issue, either by some 
sort of Robust ANOVA procedure, or by some alternative tests? Thanks a lot. 

Jinyan Fan

Assistant Professor 
Department of Psychology 
Hofstra University 
Hempstead, NY 11550 
Work: 516-463-6349


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Incremental ReadLines

2009-11-02 Thread Gene Leynes
I've been trying to figure out how to read in a large file for a few days
now, and after extensive research I'm still not sure what to do.

I have a large comma delimited text file that contains 59 fields in each
record.
There is also a header every 121 records

This function works well for smallish records
getcsv=function(fname){
ff=file(description = fname)
x <- readLines(ff)
closeAllConnections()
x <- x[x != ""]  # REMOVE BLANKS
x=x[grep("^[-0-9]", x)]  # REMOVE ALL TEXT

spl=strsplit(x,',')  # THIS PART IS SLOW, BUT MANAGABLE

xx=t(sapply(1:length(spl),function(temp)as.vector(na.omit(as.numeric(spl[[temp]])
return(xx)
}
It's not elegant, but it works.
For 121,000 records it completes in 2.3 seconds
For 121,000*5 records it completes in 63 seconds
For 121,000*10 records it doesn't complete

When I try other methods to read the file in chunks (using scan), the
process breaks down because I have to start at the beginning of the file on
every iteration.
For example:
fnn=function(n,col){
a=122*(n-1)+2
xx=scan(fname,skip=a-1,nlines=121,sep=',',quiet=TRUE,what=character(0))
xx=xx[xx!='']
xx=matrix(xx,ncol=49,byrow=TRUE)
xx[,col]
}
system.time(sapply(1:10,fnn,c=26)) # 0.31 Seconds
system.time(sapply(91:90,fnn,c=26))# 1.09 Seconds
system.time(sapply(901:910,fnn,c=26))  # 5.78 Seconds

Even though I'm only getting the 26th column for 10 sets of records, it
takes a lot longer the further into the file I go.

How can I tell scan to pick up where it left off, without it starting at the
beginning??  There must be a good example somewhere.

I have done a lot of research (in fact, thank you to Michael J. Crawley and
others for your help thus far)

Thanks,

Gene

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] "Safe" way to automatically install required packages...

2009-11-02 Thread hadley wickham
If you package "depends" on another package, it will be automatically installed.

Hadley

On Mon, Nov 2, 2009 at 12:56 PM, Jonathan Greenberg
 wrote:
> R-helpers:
>
>   I'm working on an r-package that I want to make as easy-to-use as possible
> for a novice R-user, which includes automatically installing required
> packages.   I, myself, am a novice R-packager, so the solution I came up
> with was to embed:
>
> print("Loading required packages...")
> if (!require("reshape")) { install.packages("reshape") }
> if (!require("reshape")) {
>   print("Could not install package 'reshape', please contact your
> sysadmin.")
>   return()
> }
>
>   in the code proper, and put together the package using package.skeleton()
> and R CMD build.
>
>   I'm guessing there's a better way to do this -- any suggestions?
> --j
>
> --
>
> Jonathan A. Greenberg, PhD
> Postdoctoral Scholar
> Center for Spatial Technologies and Remote Sensing (CSTARS)
> University of California, Davis
> One Shields Avenue
> The Barn, Room 250N
> Davis, CA 95616
> Phone: 415-763-5476
> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] using exists with coef from an arima fit

2009-11-02 Thread Rolf Turner


On 2/11/2009, at 5:27 PM, Erin Hodgess wrote:


Dear R People:

I have the output from an arima model fit in an object xxx.

I want to verify that the ma1 coefficient is there, so I did the  
following:

xxx$coef

   ar1ar2ma1  intercept
 1.3841297 -0.4985667 -0.996 -0.1091657

str(xxx$coef)

 Named num [1:4] 1.384 -0.499 -1 -0.109
 - attr(*, "names")= chr [1:4] "ar1" "ar2" "ma1" "intercept"

exists('xxx$coef["ma1"]')

[1] FALSE




Other than using xxx$coef[3], is there another way to check this,  
please?


(1) as.logical(length(grep("ma1",names(xxx$coef)))

(2) length(grep("ma1",names(xxx$coef))) != 0

(3) with(as.list(xxx$coef),exists("ma1"))

spring to mind.

cheers,

Rolf

##
Attention:\ This e-mail message is privileged and confid...{{dropped:9}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] "Safe" way to automatically install required packages...

2009-11-02 Thread Jonathan Greenberg

R-helpers:

   I'm working on an r-package that I want to make as easy-to-use as 
possible for a novice R-user, which includes automatically installing 
required packages.   I, myself, am a novice R-packager, so the solution 
I came up with was to embed:


print("Loading required packages...")
if (!require("reshape")) { install.packages("reshape") }
if (!require("reshape")) {
   print("Could not install package 'reshape', please contact your 
sysadmin.")

   return()
}

   in the code proper, and put together the package using 
package.skeleton() and R CMD build.


   I'm guessing there's a better way to do this -- any suggestions? 


--j

--

Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room 250N
Davis, CA 95616
Phone: 415-763-5476
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Frequency

2009-11-02 Thread Joe King
sorry, I forgot to send my reply to the list, I got to remember to hit reply
all:

So I set up a dummy matrix, v1,v2,v3,v4, an datamatrix

v1 = c(5,3,4)
v2 = c(6,4,6)
v3 = c(9,7,10)
v4 = c(10,10,18)
datamatrix=c(v1,v2,v3,v4)

then do 

sort(table(datamatrix))

Joe King
206-913-2912
j...@joepking.com
"Never throughout history has a man who lived a life of ease left a name
worth remembering." --Theodore Roosevelt


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Val
Sent: Monday, November 02, 2009 10:35 AM
To: r-help@r-project.org
Subject: [R] Frequency

BAYESIAN INFERENCES FOR MILKING TEMPERAMENT IN CANADIAN HOLSTEINS

Hi All,

I have a data  set "x"  with several variables. Sample of the data is shown
below

 V1  v2  v3   v4

   569   10

 347   10

 46   10   18



I want the frequency  of each  data point sorted by their occurrence.



Below is the output that I want

10=3

6=2

4=2

9=1

5=1

7=1

3=1

How do I do it in R?



Thanks in advance



Val

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Frequency

2009-11-02 Thread Jorge Ivan Velez
Hi Val,

Here is a suggestion:

res <- table(unlist(x))
res[order(res, decreasing = TRUE)]
# 10  4  6  3  5  7  9 18
#  3  2  2  1  1  1  1  1

HTH,
Jorge


On Mon, Nov 2, 2009 at 1:35 PM, Val <> wrote:

> BAYESIAN INFERENCES FOR MILKING TEMPERAMENT IN CANADIAN HOLSTEINS
>
> Hi All,
>
> I have a data  set "x"  with several variables. Sample of the data is shown
> below
>
>  V1  v2  v3   v4
>
>   569   10
>
>  347   10
>
>  46   10   18
>
>
>
> I want the frequency  of each  data point sorted by their occurrence.
>
>
>
> Below is the output that I want
>
> 10=3
>
> 6=2
>
> 4=2
>
> 9=1
>
> 5=1
>
> 7=1
>
> 3=1
>
> How do I do it in R?
>
>
>
> Thanks in advance
>
>
>
> Val
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to print the full name of the factors in summary?

2009-11-02 Thread Bert Gunter
Jack:

You are confused!

You have specified that the factor is ordered (ordered = TRUE), so you do
_not get_ the (6, not 7 btw)single degree of freedom contrasts corresponding
to weekday names. Instead you get the (default) contrasts for an ordered
factor (.L = linear, .Q = quadratic, etc.). Also you apparently do not have
all weekdays present or maybe there is partial confounding with some of your
other factors, thus producing the singularities given in the message.

While you can read up on such matters ("contrasts") in the docs and texts
about R (V&R's MASS has a nice explanation), these issues are inherently
technical. So if you do not have the requisite statistical background, I
suggest you consult a statistician.

To paraphrase an R fortune -- if you do not know how to properly use the
statistical tools, you shouldn't be using them.

Cheers,

Bert Gunter
Genentech Nonclinical Biostatistics
 
 -Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Jen-Chien Chang
Sent: Monday, November 02, 2009 7:02 AM
To: r-help@r-project.org
Subject: [R] how to print the full name of the factors in summary?

Hi,

I am wondering if there is a simple way to fix the problem I am having. 
For unknown reason, I could not get the full name of the factors to be 
printed in the summary. I have tried to used summary.lm as well but the 
problem still persists.

SJ$Weekday <- 
factor(SJ$Weekday,1:7,c("Mon","Tue","Wed","Thu","Fri","Sat","Sun"),ordered=T
)

attach(SJ)
lm.SJ <- lm(Demand ~ Weekday+Month+Holiday+Season)
summary(lm.SJ)
Call:
lm(formula = Demand ~ Weekday + Month + Holiday + Season)

Residuals:
 Min  1Q  Median  3Q Max
-69.767 -12.224  -1.378  10.857  91.376

Coefficients: (3 not defined because of singularities)
 Estimate Std. Error t value Pr(>|t|)
(Intercept)  88.7091 3.3442  26.527  < 2e-16 ***
Weekday.L20.8132 2.8140   7.396 1.08e-12 ***
Weekday.Q   -12.7667 2.8156  -4.534 7.99e-06 ***
Weekday.C   -10.6375 2.8113  -3.784 0.000182 ***
Weekday^4-8.3325 2.8103  -2.965 0.003238 **
-

Is there a way for summary to print the full name of the factors and 
levels? Say Weekday.Tue instead Weekday.L?

Thanks!

Jack Chang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Frequency

2009-11-02 Thread Val
BAYESIAN INFERENCES FOR MILKING TEMPERAMENT IN CANADIAN HOLSTEINS

Hi All,

I have a data  set "x"  with several variables. Sample of the data is shown
below

 V1  v2  v3   v4

   569   10

 347   10

 46   10   18



I want the frequency  of each  data point sorted by their occurrence.



Below is the output that I want

10=3

6=2

4=2

9=1

5=1

7=1

3=1

How do I do it in R?



Thanks in advance



Val

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] "object not found" within function

2009-11-02 Thread Thushyanthan Baskaran

Hi,

I am trying to write a function to compute many cross-tabulations with the 
-svytable- command. Here is a simplified example of the structure of my code 
(adapted from the -svytable- help file):



data(api)
func.example<-function(variable){ 
dclus1<-svydesign(id=~1, weights=~pw,data=apiclus1, fpc=~fpc)


svytable(~ variable, dclus1)

}
  
When I call this function with:


func.example(api99)

I get the following error:


Error in eval(expr, envir, enclos) : object 'variable' not found. 


(Everything works fine when I type svytable(~ api99, dclus1).)

I guess that the problem has something to do with function environments, but since I am new to R, I can't seem to figure out how to rectify the code. 

I am running R version 2.10 on Windows Vista. 


Thanks for the help!

Thushyanthan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R and Python

2009-11-02 Thread Ryan Sheftel
I am a long time user of R for financial time series analysis (xts, zoo,
etc) and for my next project I am thinking of adding the Python language to
the mix. The reason for adding Python is primarily its non-statistical
capabilities.

So my questions are what have people's experiences been with using interop
between R and Python. I see there are two items, rPy and RSPython. It looks
like rPy makes it possible to call R code from Python, and RSPython goes
both ways. My needs would be to use Python to drive R to get it's extensive
time series and stats, and also to get to Python objects from inside R.

I searched the list archives and it looks like many people use rPy, but what
about RSPython for calling Python from R? It looks like rPy only goes from R
into Python, and RSPython has not been updated since 2005?

Other messages in the archives state that RSPython only works on Unix?

Would I be foolish to build anything mission critical on RSPython? Is there
a better way to get at Python from R?

Thanks as always.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] a prolem with constrOptim

2009-11-02 Thread Davidov, Ori (NIH/NIEHS) [V]
Hi,

I apologize for the long message but the problem I encountered can't be stated 
in a few lines.

I am having some problems with the function constrOptim. My goal is to maximize 
the likelihood of product of K multinomials, each with four catagories under 
linear constraints on the parameter values. I have found that the function does 
not work for many data configurations.

#The likelihood and score functions are:

likelihood = function(theta,data)
{
p = matrix(theta,nrow=3,byrow=F)
s = 1-apply(p,2,sum)
p = rbind(p,s)
Z = matrix(data,nrow=4,byrow=F)
sum(log(p)*Z)
}

score = function(theta,data)
{
k = length(data)/4
S = numeric(3*k)
for(i in 1:k)
{
n = data[(1+4*(i-1)):(4*i)]
p = theta[(1+3*(i-1)):(3*i)]
P = 1-sum(p)
S[(1+3*(i-1)):(3*i)] = n[1:3]/p-n[4]/P
}
S
}

#where theta=(p11,p12,p13,p21,p22,p23,...,pK1,pK2,pK3).

#The function Rmat calculates the restriction matrix needed for constrained 
estimation

Rmat = function(k)
{
R = matrix(1,4,3)
R[1,2] = R[1,3] = R[2,3] = R[3,2] = 0
RR = cbind(-R,R)
RRR = matrix(0,4*(k-1),3*k)
for ( i in 1:(k-1) ) RRR[4*(i-1)+1:4,3*(i-1)+1:6] = RR
RRR
}

#The function gen.data generates data

gen.data = function(p,N)
{
k = ncol(p)
Data = matrix(0,4,k)
for(j in 1:k)
{
Data[,j] = as.vector(rmultinom(1, size = N[j], 
prob=p[,j]))
}
as.vector(Data)
}

#I have found that the function returns an error under some configurations of 
the data ( 0's seem to harm it). Why is this happening and how can it #be 
corrected? Why do some configurations with 0's crash the program and others 
don't?

# For example for the data:

data1 = c(0,6,8,6,3,4,4,9,2,2,5,11)
data2 = c(0,6,8,6,0,4,4,9,2,2,5,11)

# inputs for constrOptim

K = 3
RR = Rmat(K)
zeros = rep(0,nrow(RR))
theta.0 = numeric(3*K)
for(i in 1:K) {theta.0[(1+3*(i-1)):(3*i)] = rep(1/10,3)+i/100}

# data1 gives an error but data2 does not!

constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros,
data = data1, method="BFGS",
control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par

constrOptim(theta=theta.0, f=likelihood, grad=score, ui=RR,ci=zeros,
data = data2, method="BFGS",
control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par


# If you want to further check run the simulation below.

K = 3
N = rep(20,K)
p = c(1/10,2/10,2/10,5/10)
p = matrix(rep(p,K),nrow=4,byrow=F)


RR = Rmat(K)
zeros = rep(0,nrow(RR))
theta.0 = numeric(3*K)
for(i in 1:K) {theta.0[(1+3*(i-1)):(3*i)] = rep(1/10,3)+i/100}

for(i in 1:1000)
{
data = gen.data(p,N)
print(i)
print(data)
theta.til = constrOptim(theta=theta.0, f=likelihood, 
grad=score, ui=RR,ci=zeros,
data = data, method="BFGS",
control=list(fnscale=-1,abstol=1e-16,reltol=1e-16))$par

print(theta.til)
}



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Summing rows based on criteria

2009-11-02 Thread PDXRugger

I am attempting to clean up some land use building data and need to join some
buildings together making sure not to double count GIS slivers.  The first
data.frame is the original, the 2nd adds all the acres for each identical
bldgid.  I now want to 
a) throw out all but one of the the cases where the Years and ImpValue are
Identical, 
b) Sum the impvalues based on: 
 1) The Years are identical and the ImpValue are not
 2)The ImpValues are identical and the Years are not

Resulting in the 3rd data frame.

Please consider the following

DF=cbind(Acres,Bldgid,Year,ImpValue)
 DF<-as.data.frame(DF)
 DF
Acres Bldgid Year ImpValue
1  100.00  1 1946 1000
2  101.00  2 1952 1400
3  100.00  3 1922 1300
4  130.00  4 1910  900
5  156.00  5 1955 5000
60.50  5 1955 1200
7  293.00  6 1999  500
8  300.00  7 1990 9000
90.09  7 1991 9000
10 100.00  8 2000 1000
11  12.50  8 2000 1000

#Aggregate acres where identical ids
dupbuild<-aggregate(DF$Acres,DF["Bldgid"],sum)
 colnames(dupbuild)[2]<-"Acres"
 
 #Add aggregated Acres to DF
 DF$Acres<-dupbuild$Acres[match(DF$Bldgid,dupbuild$Bldgid)]
 DF
   Acres dgid Year ImpValue
1  100.00  1 1946 1000
2  101.00  2 1952 1400
3  100.00  3 1922 1300
4  130.00  4 1910  900
5  156.50  5 1955 5000
6  156.50  5 1955 1200
7  293.00  6 1999  500
8  300.09  7 1990 9000
9  300.09  7 1991 9000
10 112.50  8 2000 1000
11 112.50  8 2000 1000

#desired outcome data frame 
   Acres dgid Year ImpValue
1  100.00  1 1946 1000
2  101.00  2 1952 1400
3  100.00  3 1922 1300
4  130.00  4 1910  900
5  156.50  5 1955 7200 #combined 5 & 6 
7  293.00  6 1999  500
8  300.09  7 1990 18000 #combined 8 & 9
10 112.50  8 2000 1000 #one case thrown out

So in this case the Impvalue are added together for rows 5 & 6 (from
dataframe example 2) b/c the years are identical and the Impvalue is not,
rows 8 and 9 have their Impvalue summed because there Years are identical
but the improvement value is not, and one of the cases is thrown out of rows
10 & 11 because they have identical years and ImpValue.  

When rows are joined the Year value is no longer important but to remain
consistent i would like to keep the earliest (lowest) year.  There will be
instances in the actual data where there are more than 2 cases to consider
if that makes any coding difference as i didnt include any in my example
data.  It would also be useful to include a new column keeping track of how
many joined bldgids .   I think i can figure that one out though.  

Hope this is all clear.  Thanks for the guidance and insights

JR

-- 
View this message in context: 
http://old.nabble.com/Summing-rows-based-on-criteria-tp26157755p26157755.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Avoiding for loops

2009-11-02 Thread Noah Silverman
Wow,

That's nice.

Should work well, but I just realized that I missed something in 
explaining my code.

I need to calculate the exp function on X

so it should be

exp(x) / sum(exp(x)) for each group

I tried this with:

foo <- ave(rawdata$foo,rawdata$code,FUN=function(x) exp(x) / sum(exp(x)))

That didn't work.  I got a lot of NaN

-N

On 11/2/09 3:30 AM, Peter Dalgaard wrote:
> Dimitris Rizopoulos wrote:
>
>> you could try something along these lines:
>>
>> data<- data.frame(y = rnorm(100), group = rep(1:10, each = 10))
>>
>> data$sum<- ave(data$y, data$group, FUN = sum)
>> data$norm.y<- data$y / data$sum
>> data
>>  
> .. or even
>
> transform(data, norm=ave(y, group, FUN = function(x) x/sum(x)))
>
>
>> I hope it helps.
>>
>> Best,
>> Dimitris
>>
>>
>> Noah Silverman wrote:
>>  
>>> Hi,
>>>
>>> I'm trying to normalize some data.
>>> My data is organized by groups.  I want to normalize PER GROUP as
>>> opposed to over the entire data set.
>>>
>>> The current double loop that I'm using takes almost an hour to run on
>>> about 30,000 rows of data in 2,500 groups.
>>>
>>> I'm currently doing this:
>>>
>>> -
>>> for(group in unique(data$group)){
>>>  sum_V1<- sum(data$V1[data$group == group])
>>>
>>>  for(subject in data$subject[data$group == group]){
>>>  data$V1_norm[(data$group == group&  data$subject == subject)]
>>> <- data$V1[(data$group == group&  data$subject == subject)] / sum_V1
>>>  }
>>> }
>>> -
>>>
>>> Can anyone point me to a faster way to do this in R.
>>>
>>> Thanks!
>>>
>>> -N
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>  
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread Peter Flom
Peter Ehlers  wrote
>
>That's not what qqline() does and for good reason - it treats
>x and y asymmetrically.
>
>But qqline() is a very simple function, using the quartiles
>as also suggested by John. Here's a modified version that
>should work for Carol:
>
>qqline2 <- function (x, y, ...)
>{
> y <- quantile(y[!is.na(y)], c(0.25, 0.75))
> x <- quantile(x[!is.na(x)], c(0.25, 0.75))
> slope <- diff(y)/diff(x)
> int <- y[1L] - slope * x[1L]
> abline(int, slope, ...)
>}
>
>I've just replaced the line using Normal quantiles with
>the quantiles of x (and omitted the datax option).
>

Peter

Good point.

Peter




Peter L. Flom, PhD
Statistical Consultant
Website: www DOT peterflomconsulting DOT com
Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter:   @peterflom

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread Peter Flom
John Fox  wrote
>
>I assumed that Carol wanted to compare the shapes of the distributions and
>to adjust for differences in centre and spread. To put a line through the
>quartiles or to base a line on the medians and IQRs is more robust than
>using the means and sds.
>
Hi John

Indeed it is.

It all depends on what she wants to use the QQplot for, and what she wants to 
adjust for.

But I would guess (and I haven't done any simuls) that the method with mean and 
sd will usually be pretty
close to the one with medians and IQRs.

Maybe I will look at it a bit.

Peter

Peter L. Flom, PhD
Statistical Consultant
Website: www DOT peterflomconsulting DOT com
Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter:   @peterflom

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread John Fox
Dear Peter,

I assumed that Carol wanted to compare the shapes of the distributions and
to adjust for differences in centre and spread. To put a line through the
quartiles or to base a line on the medians and IQRs is more robust than
using the means and sds.

Best,
 John


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On
> Behalf Of Peter Flom
> Sent: November-02-09 11:57 AM
> To: carol white; Yihui Xie
> Cc: r-h...@stat.math.ethz.ch
> Subject: Re: [R] qqplot
> 
> carol white  wrote
> 
> >So the conclusion is that abline(0,1) should always be used and if it
> doesn't go through the qqplot, the two distributions are not similar?
> 
> I think it depends what you mean by "similar".  E.g., if you mean "are
both
> of these distributions (e.g.) normal?" then abline(0,1) is not always
useful.
> But if you mean "Do these have the same mean, sd, and distribution?" then
> abline(0,1) is the way to go.
> 
> Peter
> 
> Peter L. Flom, PhD
> Statistical Consultant
> Website: www DOT peterflomconsulting DOT com
> Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
> Twitter:   @peterflom
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread Peter Ehlers

Peter Flom wrote:

David Winsemius  wrote

I always assumed that the intercept was zero and the slope = unity.

 y <- rt(200, df = 5)
 qqnorm(y); qqline(y, col = 2)
 qqplot(y, rt(300, df = 5))
 abline(0, 1, col="red")



Suppose you have the following

x <- rnorm(500)
y <- 500*(x + runif(500, 0,1))
qqplot(x,y)


Then an abline(0,1) will not be useful; it will be an almost horizontal line. 
But

m1 <- lm(y~x)
abline(coef(m1))


does the trick.

Peter


That's not what qqline() does and for good reason - it treats
x and y asymmetrically.

But qqline() is a very simple function, using the quartiles
as also suggested by John. Here's a modified version that
should work for Carol:

qqline2 <- function (x, y, ...)
{
y <- quantile(y[!is.na(y)], c(0.25, 0.75))
x <- quantile(x[!is.na(x)], c(0.25, 0.75))
slope <- diff(y)/diff(x)
int <- y[1L] - slope * x[1L]
abline(int, slope, ...)
}

I've just replaced the line using Normal quantiles with
the quantiles of x (and omitted the datax option).

 -Peter Ehlers





Peter L. Flom, PhD
Statistical Consultant
Website: www DOT peterflomconsulting DOT com
Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter:   @peterflom

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] save an object by dynamicly created name

2009-11-02 Thread Dieter Menne



jeffc wrote:
> 
> Hi,
> 
> I would like to save a few dynamically created objects to disk. The
> following is the basic flow of the code segment
> 
> for(i = 1:10) {
>m = i:5
>save(m, file = ...) ## ???
> }
> To distinguish different objects to be saved, I would like to save m as
> m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ...
> 
> 

save(m, file = paste("/home/data/m", i, ".rdata", sep="")

Dieter

-- 
View this message in context: 
http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26157748.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread Peter Flom
carol white  wrote

>So the conclusion is that abline(0,1) should always be used and if it doesn't 
>go through the qqplot, the two distributions are not similar?

I think it depends what you mean by "similar".  E.g., if you mean "are both of 
these distributions (e.g.) normal?" then abline(0,1) is not always useful.  But 
if you mean "Do these have the same mean, sd, and distribution?" then 
abline(0,1) is the way to go.

Peter

Peter L. Flom, PhD
Statistical Consultant
Website: www DOT peterflomconsulting DOT com
Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter:   @peterflom

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread Peter Flom
David Winsemius  wrote
>I always assumed that the intercept was zero and the slope = unity.
>
>  y <- rt(200, df = 5)
>  qqnorm(y); qqline(y, col = 2)
>  qqplot(y, rt(300, df = 5))
>  abline(0, 1, col="red")
>

Suppose you have the following

x <- rnorm(500)
y <- 500*(x + runif(500, 0,1))
qqplot(x,y)


Then an abline(0,1) will not be useful; it will be an almost horizontal line. 
But

m1 <- lm(y~x)
abline(coef(m1))


does the trick.

Peter



Peter L. Flom, PhD
Statistical Consultant
Website: www DOT peterflomconsulting DOT com
Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter:   @peterflom

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread carol white
So the conclusion is that abline(0,1) should always be used and if it doesn't 
go through the qqplot, the two distributions are not similar?

Thanks

--- On Mon, 11/2/09, Yihui Xie  wrote:

> From: Yihui Xie 
> Subject: Re: [R] qqplot
> To: "carol white" 
> Cc: "David Winsemius" , r-h...@stat.math.ethz.ch
> Date: Monday, November 2, 2009, 8:42 AM
> abline(0,1) is somewhere in the
> upper-left corner which you are unable
> to see. At least the first distribution seems to have a
> larger mean
> than the second one (i.e. they are not the same
> distribution).
> 
> Regards,
> Yihui
> --
> Yihui Xie 
> Phone: 515-294-6609 Web: http://yihui.name
> Department of Statistics, Iowa State University
> 3211 Snedecor Hall, Ames, IA
> 
> 
> 
> On Mon, Nov 2, 2009 at 10:33 AM, carol white 
> wrote:
> > if I have the two following matrices, abline(0,1)
> doesn't go through. QQplot is attached.
> >
> >            [,1]     [,2]     [,3]     [,4]
>     [,5]
> >  2.149644 1.992864 3.346375 2.793511 3.428230
> >  1.100762 2.152981 2.735401 2.175185 3.323058
> > 1.212406 2.131813 2.672598 2.389996 3.242490
> > 1.183770 1.908633 2.661237 2.590545 2.906059
> >  1.665190 1.778923 2.636062 2.475619 4.013407
> >
> >
> >    0.601   0.083   0.520    0.920  -0.007
> >   -0.778   0.427  -0.605   -0.066  -0.283
> >  -0.599   0.348  -0.693    0.284  -0.436
> >   -0.519   0.081  -0.590    0.678  -1.095
> >    0.009  -0.253  -0.940    0.526   1.623
> >
> >
> > --- On Mon, 11/2/09, David Winsemius 
> wrote:
> >
> >> From: David Winsemius 
> >> Subject: Re: [R] qqplot
> >> To: "carol white" 
> >> Cc: r-h...@stat.math.ethz.ch
> >> Date: Monday, November 2, 2009, 8:17 AM
> >>
> >> On Nov 2, 2009, at 10:40 AM, carol white wrote:
> >>
> >> > Hi,
> >> > We could use qqplot to see how two
> distributions are
> >> different from each other. To show better how they
> are
> >> different (departs from the straight line), how is
> it
> >> possible to plot the straight line that goes
> through them? I
> >> am looking for some thing like qqline for qqnorm.
> I thought
> >> of abline but how to determine the slope and
> intercept?
> >>
> >> I always assumed that the intercept was zero and
> the slope
> >> = unity.
> >>
> >>  y <- rt(200, df = 5)
> >>  qqnorm(y); qqline(y, col = 2)
> >>  qqplot(y, rt(300, df = 5))
> >>  abline(0, 1, col="red")
> >>
> >> I am open to education if that assumption is too
> >> simplistic, but you have not offered anything in
> the way of
> >> a counter-example.
> >>
> >> >
> >> ==
> >>
> >> David Winsemius, MD
> >> Heritage Laboratories
> >> West Hartford, CT
> >>
> >>
> >
> >
> >
> > __
> > R-help@r-project.org
> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> reproducible code.
> >
> >
> 


   
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread Yihui Xie
abline(0,1) is somewhere in the upper-left corner which you are unable
to see. At least the first distribution seems to have a larger mean
than the second one (i.e. they are not the same distribution).

Regards,
Yihui
--
Yihui Xie 
Phone: 515-294-6609 Web: http://yihui.name
Department of Statistics, Iowa State University
3211 Snedecor Hall, Ames, IA



On Mon, Nov 2, 2009 at 10:33 AM, carol white  wrote:
> if I have the two following matrices, abline(0,1) doesn't go through. QQplot 
> is attached.
>
>            [,1]     [,2]     [,3]     [,4]     [,5]
>  2.149644 1.992864 3.346375 2.793511 3.428230
>  1.100762 2.152981 2.735401 2.175185 3.323058
> 1.212406 2.131813 2.672598 2.389996 3.242490
> 1.183770 1.908633 2.661237 2.590545 2.906059
>  1.665190 1.778923 2.636062 2.475619 4.013407
>
>
>    0.601   0.083   0.520    0.920  -0.007
>   -0.778   0.427  -0.605   -0.066  -0.283
>  -0.599   0.348  -0.693    0.284  -0.436
>   -0.519   0.081  -0.590    0.678  -1.095
>    0.009  -0.253  -0.940    0.526   1.623
>
>
> --- On Mon, 11/2/09, David Winsemius  wrote:
>
>> From: David Winsemius 
>> Subject: Re: [R] qqplot
>> To: "carol white" 
>> Cc: r-h...@stat.math.ethz.ch
>> Date: Monday, November 2, 2009, 8:17 AM
>>
>> On Nov 2, 2009, at 10:40 AM, carol white wrote:
>>
>> > Hi,
>> > We could use qqplot to see how two distributions are
>> different from each other. To show better how they are
>> different (departs from the straight line), how is it
>> possible to plot the straight line that goes through them? I
>> am looking for some thing like qqline for qqnorm. I thought
>> of abline but how to determine the slope and intercept?
>>
>> I always assumed that the intercept was zero and the slope
>> = unity.
>>
>>  y <- rt(200, df = 5)
>>  qqnorm(y); qqline(y, col = 2)
>>  qqplot(y, rt(300, df = 5))
>>  abline(0, 1, col="red")
>>
>> I am open to education if that assumption is too
>> simplistic, but you have not offered anything in the way of
>> a counter-example.
>>
>> >
>> ==
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>>
>
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how can I convert .csv format to matrix???

2009-11-02 Thread Dieter Menne



bbslover wrote:
> 
> In my disk C:/ have a  a.csv file, I want to read it  to R, importantly,
> when I use x=read.csv("C:/a.csv") ,the x format  is  data.frame,  I want
> to it to become matrix format,  how can I do it ?
> 

as.matrix should do the job; if it does not (second example), you probably
have rownames or some other string in your data set.

Dieter

# use read.csv instead
x = data.frame(a=1:10,b=11:20)
xm = as.matrix(x)
str(xm)
xm # looks good

x1 = data.frame(a=1:10,b=11:20,c=letters[1:10])
xm1 = as.matrix(x1)
str(xm1)
xm1 # looks strange, because it's all strings


-- 
View this message in context: 
http://old.nabble.com/how-can-I-convert-.csv-format-to-matrixtp26156643p26157732.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R and Python

2009-11-02 Thread Ryan Sheftel
I am a long time user of R for financial time series analysis (xts, zoo,
etc) and for my next project I am thinking of adding the Python language to
the mix. The reason for adding Python is primarily its non-statistical
capabilities.

So my questions are what have people's experiences been with using interop
between R and Python. I see there are two items, rPy and RSPython. It looks
like rPy makes it possible to call R code from Python, and RSPython goes
both ways. My needs would be to use Python to drive R to get it's extensive
time series and stats, and also to get to Python objects from inside R.

I searched the list archives and it looks like many people use rPy, but what
about RSPython for calling Python from R? It looks like rPy only goes from R
into Python, and RSPython has not been updated since 2005?

Other messages in the archives state that RSPython only works on Unix?

Would I be foolish to build anything mission critical on RSPython? Is there
a better way to get at Python from R?

Thanks as always.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread carol white
if I have the two following matrices, abline(0,1) doesn't go through. QQplot is 
attached.

[,1] [,2] [,3] [,4] [,5]
 2.149644 1.992864 3.346375 2.793511 3.428230
 1.100762 2.152981 2.735401 2.175185 3.323058
1.212406 2.131813 2.672598 2.389996 3.242490
1.183770 1.908633 2.661237 2.590545 2.906059
 1.665190 1.778923 2.636062 2.475619 4.013407

   
0.601   0.083   0.5200.920  -0.007
   -0.778   0.427  -0.605   -0.066  -0.283
  -0.599   0.348  -0.6930.284  -0.436
   -0.519   0.081  -0.5900.678  -1.095
0.009  -0.253  -0.9400.526   1.623


--- On Mon, 11/2/09, David Winsemius  wrote:

> From: David Winsemius 
> Subject: Re: [R] qqplot
> To: "carol white" 
> Cc: r-h...@stat.math.ethz.ch
> Date: Monday, November 2, 2009, 8:17 AM
> 
> On Nov 2, 2009, at 10:40 AM, carol white wrote:
> 
> > Hi,
> > We could use qqplot to see how two distributions are
> different from each other. To show better how they are
> different (departs from the straight line), how is it
> possible to plot the straight line that goes through them? I
> am looking for some thing like qqline for qqnorm. I thought
> of abline but how to determine the slope and intercept?
> 
> I always assumed that the intercept was zero and the slope
> = unity.
> 
>  y <- rt(200, df = 5)
>  qqnorm(y); qqline(y, col = 2)
>  qqplot(y, rt(300, df = 5))
>  abline(0, 1, col="red")
> 
> I am open to education if that assumption is too
> simplistic, but you have not offered anything in the way of
> a counter-example.
> 
> > 
> ==
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
> 
>


  <>__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread John Fox
Carol,

You could run a line through the pairs of first and third quartiles of the
two distributions, i.e., c(quantile(x, .25), quantile(y, .25)) and
c(quantile(x, .75), quantile(y, .75)). (Of course, you'd want the line to
extend across the whole graph.)

I hope this helps,
 John


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On
> Behalf Of carol white
> Sent: November-02-09 10:40 AM
> To: r-h...@stat.math.ethz.ch
> Subject: [R] qqplot
> 
> Hi,
> We could use qqplot to see how two distributions are different from each
> other. To show better how they are different (departs from the straight
> line), how is it possible to plot the straight line that goes through
them? I
> am looking for some thing like qqline for qqnorm. I thought of abline but
how
> to determine the slope and intercept?
> 
> Best wishes,
> 
> Carol
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to print the full name of the factors in summary?

2009-11-02 Thread John Fox
Jen-Chien and Adaikalavan,

Weekday is specified as an ordered factor, for which the default contrast
type is contr.poly, i.e., orthogonal-polynomial contrasts. I would have
expected a 6th-degree polynomial for the 7 days of the week, and there's a
message that suggests that 3 coefficients are aliased, so there must be
redundancies among the factors. If you want contr.treatment, treating "Mon"
as the baseline level, then simply omit ordered=T from the call to factor(),
but this won't make the collinearities disappear.

I hope this helps,
 John

--
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On
> Behalf Of Adaikalavan Ramasamy
> Sent: November-02-09 10:43 AM
> To: Jen-Chien Chang
> Cc: r-help@r-project.org
> Subject: Re: [R] how to print the full name of the factors in summary?
> 
> It would be useful to say which package the object SJ comes from or
> provide a more reproducible example.
> 
> Assuming that Demand variable is continuous and you are fitting a
> standard lm() model, then your results looks suspicious. Where are the
> coefficients for Month, Holiday, Season?
> 
> 
> 
> 
> Jen-Chien Chang wrote:
> > Hi,
> >
> > I am wondering if there is a simple way to fix the problem I am having.
> > For unknown reason, I could not get the full name of the factors to be
> > printed in the summary. I have tried to used summary.lm as well but the
> > problem still persists.
> >
> > SJ$Weekday <-
> >
>
factor(SJ$Weekday,1:7,c("Mon","Tue","Wed","Thu","Fri","Sat","Sun"),ordered=T
)
> >
> > 
> > attach(SJ)
> > lm.SJ <- lm(Demand ~ Weekday+Month+Holiday+Season)
> > summary(lm.SJ)
> > Call:
> > lm(formula = Demand ~ Weekday + Month + Holiday + Season)
> >
> > Residuals:
> > Min  1Q  Median  3Q Max
> > -69.767 -12.224  -1.378  10.857  91.376
> >
> > Coefficients: (3 not defined because of singularities)
> > Estimate Std. Error t value Pr(>|t|)
> > (Intercept)  88.7091 3.3442  26.527  < 2e-16 ***
> > Weekday.L20.8132 2.8140   7.396 1.08e-12 ***
> > Weekday.Q   -12.7667 2.8156  -4.534 7.99e-06 ***
> > Weekday.C   -10.6375 2.8113  -3.784 0.000182 ***
> > Weekday^4-8.3325 2.8103  -2.965 0.003238 **
> > -
> >
> > Is there a way for summary to print the full name of the factors and
> > levels? Say Weekday.Tue instead Weekday.L?
> >
> > Thanks!
> >
> > Jack Chang
> >
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qqplot

2009-11-02 Thread David Winsemius


On Nov 2, 2009, at 10:40 AM, carol white wrote:


Hi,
We could use qqplot to see how two distributions are different from  
each other. To show better how they are different (departs from the  
straight line), how is it possible to plot the straight line that  
goes through them? I am looking for some thing like qqline for  
qqnorm. I thought of abline but how to determine the slope and  
intercept?


I always assumed that the intercept was zero and the slope = unity.

 y <- rt(200, df = 5)
 qqnorm(y); qqline(y, col = 2)
 qqplot(y, rt(300, df = 5))
 abline(0, 1, col="red")

I am open to education if that assumption is too simplistic, but you  
have not offered anything in the way of a counter-example.





==

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] partial matching with grep()

2009-11-02 Thread jim holtman
Is this what you want:  depends on it occurring at the endo of the string:

> grep("\\.x[^x]*$",c("a.x" ,"b.x","a.xx"),value=TRUE)
[1] "a.x" "b.x"


On Mon, Nov 2, 2009 at 9:57 AM, Vito Muggeo (UniPa)
 wrote:
> dear all,
> This is a probably a silly question.
> If I type
>> grep("x",c("a.x" ,"b.x","a.xx"),value=TRUE)
> [1] "a.x"  "b.x"  "a.xx"
>
> Instead, I would like to obtain only
> "a.x"  "b.x"
> How is it possible to get this result with grep()?
>
> many thanks for your attention,
> best,
> vito
>
> --
> 
> Vito M.R. Muggeo
> Dip.to Sc Statist e Matem `Vianelli'
> Università di Palermo
> viale delle Scienze, edificio 13
> 90128 Palermo - ITALY
> tel: 091 6626240
> fax: 091 485726/485612
> http://dssm.unipa.it/vmuggeo
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] qqplot

2009-11-02 Thread carol white
Hi,
We could use qqplot to see how two distributions are different from each other. 
To show better how they are different (departs from the straight line), how is 
it possible to plot the straight line that goes through them? I am looking for 
some thing like qqline for qqnorm. I thought of abline but how to determine the 
slope and intercept?

Best wishes,

Carol

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem building R 2.10.0 - Matrix package

2009-11-02 Thread Thomas Adams

All:

Attached is the output file from building R 2.10.0 on RedHat Linux. I 
have never previously experienced any problems when building R from 
source with new releases. But, now I get a compile error with the Matrix 
package:


CHOLMOD/Include/cholmod.h:87:22: error: UFconfig.h: No such file or 
directory

make[3]: *** [CHMfactor.o] Error 1
make[3]: Leaving directory `/tmp/RtmppKsKKl/R.INSTALL327b23c6/Matrix/src'
ERROR: compilation failed for package 'Matrix'

Any help or suggestions would be appreciated.

I am simply doing this, as I always have:

./configure --prefix=/Local/install/location (I can not use the default)
make

I should add that even with this error with the Matrix package, I can do 
a 'make install' OK. Subsequently, R seems to come up fine from the 
command line…


Regards,
Tom

--
Thomas E Adams
National Weather Service
Ohio River Forecast Center
1901 South State Route 134
Wilmington, OH 45177

EMAIL:  thomas.ad...@noaa.gov

VOICE:  937-383-0528
FAX:937-383-0033

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] partial matching with grep()

2009-11-02 Thread David Winsemius


On Nov 2, 2009, at 9:57 AM, Vito Muggeo (UniPa) wrote:


dear all,
This is a probably a silly question.
If I type
> grep("x",c("a.x" ,"b.x","a.xx"),value=TRUE)
[1] "a.x"  "b.x"  "a.xx"


> grep("\\.x\\b",c("a.x" ,"b.x","a.xx"),value=TRUE)
[1] "a.x" "b.x"

Or:
> grep("\\.x$",c("a.x" ,"b.x","a.xx"),value=TRUE)
[1] "a.x" "b.x"

Why you might ask. You need the double "\\" in order to get the  
interpreter to interpret them as single escapes prior to the period  
and the "\" in "\b", which is the end-of-word match pattern, while "$"  
is an end of sentence pattern.





Instead, I would like to obtain only
"a.x"  "b.x"
How is it possible to get this result with grep()?

many thanks for your attention,
best,
vito

--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] convert list to numeric

2009-11-02 Thread Don MacQueen

An example will help show the difference between single vs. double brackets.

## xl is  a list with three elements

 xl <- list( a=1:3, b=2:7, c=c('X','Y'))
 xl

$a
[1] 1 2 3

$b
[1] 2 3 4 5 6 7

$c
[1] "X" "Y"





## with single brackets, we get a subset. A subset of a list is still a list

 xl[1]

$a
[1] 1 2 3

## extract the first element of xl, as itself, whatever it is

 xl[[1]]

[1] 1 2 3

## with single brackets, we get a subset. In the next example the subset
## consists of the first and third elements of xl, i.e., a list 
having two elements

 xl[c(1,3)]

$a
[1] 1 2 3

$c
[1] "X" "Y"


## Similarly for data frames, and note how the formatting is 
different when the object

## is printed to the screen


 xd <- data.frame( a=1:3, b=2:4, c=c('X','Y','Z'))
 xd

  a b c
1 1 2 X
2 2 3 Y
3 3 4 Z

 class(xd)

[1] "data.frame"


 xd[2]

  b
1 2
2 3
3 4

 class(xd[2])

[1] "data.frame"


 xd[[2]]

[1] 2 3 4

 class(xd[[2]])

[1] "integer"



At 6:22 AM -0800 11/2/09, dadrivr wrote:

Great, that works very well.  What is the purpose of double brackets vs
single ones?  I will remember next time to include a subset of the data, so
that readers can run the script.  Thanks again for your help!


Benilton Carvalho wrote:


 it appears that what you really want is to use:

 task[[i]]

 instead of task[i]

 b

 On Nov 1, 2009, at 11:04 PM, dadrivr wrote:



 I would like to preface this by saying that I am new to R, so I 
 would ask
 that you be patient and thorough, so that I'm not completely 
 clueless.  I am
 trying to convert a list to numeric so that I can perform 
 computations on it
 (specifically mean-center the variable), but I am running into 
 problems.  I
 have imported the data set into "task" (data frame).  The data frame 
 is made
 of factors with variable names in the first row.  I am running a 
 loop to set
 a variable equal to a column in the data frame.  Here is an example 
 of my

 problem:

 for (i in 1:dim(task)[2]){
 predictor.loop <- c(task[i])
 predictor.loop.mc <- predictor.loop - mean(predictor.loop, na.rm=T)
 }

 I get the following error:
 Error in predictor.loop - mean(predictor.loop, na.rm = T) :
  non-numeric argument to binary operator
 In addition: Warning message:
 In mean.default(predictor.loop, na.rm = T) :
  argument is not numeric or logical: returning NA

 The column is entirely made up of numerical data, except for the 
 header,

 which is a string.  My problem is that I receive an error because the
 predictor.loop variable is not numerical, so I need to find a way to 
 convert

 it.  I tried using:
 predictor.loop <- c(as.numeric(task[i]))
 But I get the following error: "Error: (list) object cannot be 
 coerced to

 type 'double'"

 If I call the variable, I can assign it to a numerical list (e.g., 
 predictor
 loop <- task$variablename), but since I am assigning the variable in 
 a loop,

 >> I have to find another way as the variable name would have to change
 >> in each
 >> loop iteration.  Any help would be greatly appreciated.  Thanks!
 >> --
 >> View this message in context:
 >> http://*old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html

 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://*stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://*www.*R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 __
 R-help@r-project.org mailing list
 https://*stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://*www.*R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




--
View this message in context: 
http://*old.nabble.com/convert-list-to-numeric-tp26155039p26157105.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://*stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://*www.*R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to print the full name of the factors in summary?

2009-11-02 Thread Adaikalavan Ramasamy
It would be useful to say which package the object SJ comes from or 
provide a more reproducible example.


Assuming that Demand variable is continuous and you are fitting a 
standard lm() model, then your results looks suspicious. Where are the 
coefficients for Month, Holiday, Season?





Jen-Chien Chang wrote:

Hi,

I am wondering if there is a simple way to fix the problem I am having. 
For unknown reason, I could not get the full name of the factors to be 
printed in the summary. I have tried to used summary.lm as well but the 
problem still persists.


SJ$Weekday <- 
factor(SJ$Weekday,1:7,c("Mon","Tue","Wed","Thu","Fri","Sat","Sun"),ordered=T) 



attach(SJ)
lm.SJ <- lm(Demand ~ Weekday+Month+Holiday+Season)
summary(lm.SJ)
Call:
lm(formula = Demand ~ Weekday + Month + Holiday + Season)

Residuals:
Min  1Q  Median  3Q Max
-69.767 -12.224  -1.378  10.857  91.376

Coefficients: (3 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept)  88.7091 3.3442  26.527  < 2e-16 ***
Weekday.L20.8132 2.8140   7.396 1.08e-12 ***
Weekday.Q   -12.7667 2.8156  -4.534 7.99e-06 ***
Weekday.C   -10.6375 2.8113  -3.784 0.000182 ***
Weekday^4-8.3325 2.8103  -2.965 0.003238 **
-

Is there a way for summary to print the full name of the factors and 
levels? Say Weekday.Tue instead Weekday.L?


Thanks!

Jack Chang



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] package lme4

2009-11-02 Thread Douglas Bates
On Sun, Nov 1, 2009 at 9:01 AM, wenjun zheng  wrote:
> Hi R Users,
>     When I use package lme4 for mixed model analysis, I can't distinguish
> the significant and insignificant variables from all random independent
> variables.
>     Here is my data and result:
> Data:
>
>  Rice<-data.frame(Yield=c(8,7,4,9,7,6,9,8,8,8,7,5,9,9,5,7,7,8,8,8,4,8,6,4,8,8,9),
>                 Variety=rep(rep(c("A1","A2","A3"),each=3),3),
>                 Stand=rep(c("B1","B2","B3"),9),
>                 Block=rep(1:3,each=9))
>    Rice.lmer<-lmer(Yield ~ (1|Variety) + (1|Stand) + (1|Block) +
> (1|Variety:Stand), data = Rice)
>
> Result:
>
> Linear mixed model fit by REML
> Formula: Yield ~ (1 | Variety) + (1 | Stand) + (1 | Block) + (1 |
> Variety:Stand)
>   Data: Rice
>   AIC   BIC logLik deviance REMLdev
>  96.25 104.0 -42.12    85.33   84.25
> Random effects:
>  Groups        Name        Variance Std.Dev.
>  Variety:Stand (Intercept) 1.345679 1.16003
>  Block         (Intercept) 0.00 0.0
>  Stand         (Intercept) 0.89 0.94281
>  Variety       (Intercept) 0.024691 0.15714
>  Residual                  0.67 0.81650
> Number of obs: 27, groups: Variety:Stand, 9; Block, 3; Stand, 3; Variety, 3

> Fixed effects:
>            Estimate Std. Error t value
> (Intercept)   7.1852     0.6919   10.38

> Can you give me some advice for recognizing the significant variables among
> random effects above without other  calculating.

Well, since the estimate of the variance due to Block is zero, that's
probably not one of the significant random effects.

Why do you want to do this without other calculations?  In olden days
when each model fit involved substantial calculations by hand one did
try to avoid fitting multiple models but now that is not a problem.
You can get a hint of which random effects will be significant by
looking at their precision in a "caterpillar plot" and then fit the
reduced model and use anova to compare models.  See the enclosed

>    Any suggestions will be appreciated.
> Wenjun
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

R version 2.10.0 Patched (2009-11-01 r50288)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(lme4)
Loading required package: Matrix
Loading required package: lattice
> Rice <- data.frame(Yield = c(8,7,4,9,7,6,9,8,8,8,7,5,9,9,5,7,
+7,8,8,8,4,8,6,4,8,8,9),
+Variety = gl(3, 3, 27, labels = c("A1","A2","A3")),
+Stand = gl(3, 1, 27, labels = c("B1","B2","B3")),
+Block = gl(3, 9))
> print(fm1 <- lmer(Yield ~ 1 + (1|Block) + (1|Stand) +
+   (1|Variety) + (1|Variety:Stand), Rice))
Linear mixed model fit by REML 
Formula: Yield ~ 1 + (1 | Block) + (1 | Stand) + (1 | Variety) + (1 |  
Variety:Stand) 
   Data: Rice 
   AIC   BIC logLik deviance REMLdev
 96.25 104.0 -42.1285.33   84.25
Random effects:
 GroupsNameVariance Std.Dev.
 Variety:Stand (Intercept) 1.34568  1.16003 
 Variety   (Intercept) 0.02469  0.15713 
 Stand (Intercept) 0.9  0.94281 
 Block (Intercept) 0.0  0.0 
 Residual  0.7  0.81650 
Number of obs: 27, groups: Variety:Stand, 9; Variety, 3; Stand, 3; Block, 3

Fixed effects:
Estimate Std. Error t value
(Intercept)   7.1852 0.6919   10.38
> ## Estimate of Block variance is zero, remove that term
> print(fm2 <- lmer(Yield ~ 1 + (1|Stand) + (1|Variety) + (1|Variety:Stand),
+   Rice))
Linear mixed model fit by REML 
Formula: Yield ~ 1 + (1 | Stand) + (1 | Variety) + (1 | Variety:Stand) 
   Data: Rice 
   AIC   BIC logLik deviance REMLdev
 94.25 100.7 -42.1285.33   84.25
Random effects:
 GroupsNameVariance Std.Dev.
 Variety:Stand (Intercept) 1.345679 1.16003 
 Variety   (Intercept) 0.024692 0.15714 
 Stand (Intercept) 0.88 0.94281 
 Residual  0.67 0.81650 
Number of obs: 27, groups: Variety:Stand, 9; Variety, 3; Stand, 3

Fixed effects:
Estimate Std. Error t value
(Intercept)   7.1852 0.6919   10.38
> ## Very small estimate of variance for Variety
> ## Ch

Re: [R] superscript troubles

2009-11-02 Thread baptiste auguie
Hi,

Try this,

x = rnorm(1)
y = rnorm(1)
leg = bquote(r^2*"="*.(round(x,digits=3))*", P="*.(round(y, digits=3)))
plot.new()
legend (bty ="n","topright",legend=leg)

HTH,

baptiste

2009/11/2 Jacob Kasper :
> I know that this has been revisited over and over, yet I cannot figure out
> how to solve this case of superscript troubles...
> I would like the 2 in r2 to be superscript, yet I am pasting text before and
> after it. I have tried several variations but have not solved this yet, any
> suggestions?
>
> legend (bty =
> "n","topright",paste("r2=",round(summary(lat_x)$r.squared,digits=3),",
> P=",round(coefficients(summary(lat_x))[2,4], digits=3)))
> Thank you
> Jacob
>
> --
> Jacob Kasper
> http://twitter.com/Protect_Oceans
> 66°04' N
> 23°07' W
> Coastal & Marine Management Master's Student
> University Centre of the Westfjords
>
> Sundstræti 14                            37 Devens Rd
> Ísafjörður, 400                           Swampscott, MA 01907
> Iceland                                       USA
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] superscript troubles

2009-11-02 Thread Peter Dalgaard
Jacob Kasper wrote:
> I know that this has been revisited over and over, yet I cannot figure out
> how to solve this case of superscript troubles...
> I would like the 2 in r2 to be superscript, yet I am pasting text before and
> after it. I have tried several variations but have not solved this yet, any
> suggestions?
> 
> legend (bty =
> "n","topright",paste("r2=",round(summary(lat_x)$r.squared,digits=3),",
> P=",round(coefficients(summary(lat_x))[2,4], digits=3)))
> Thank you
> Jacob

You want the whole thing to be an expression. To insert values, bquote()
is your friend. Something like this

legend(bty="n", "topright",
   bquote(r^2 == .(round(summary(lat_x)$r.squared,digits=3)
  * "," *
  P == .(round(coefficients(summary(lat_x))[2,4], digits=3)
  )
)

(untested, if you want us to test, give us a reproducible example...)

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interaction contrasts or posthoc test for glm (MASS) with ANOVA design

2009-11-02 Thread David Winsemius


On Nov 2, 2009, at 8:40 AM, Maya Pfaff wrote:


Dear R experts

I am running a negative-binomial GLM (glm.nb) to test the null  
hypotheses
that species 1 and 2 are equally abundant between site 1 and site2,  
and
between each other. So, I have a 2x2 factorial design with factors  
Site

(1,2) and Taxon (1,2).
Since the Site:Taxon interaction is significant, I need to do the  
equivalent
to a "post-hoc test" for ANOVA, however, the same tests (e.g. Tukey  
HSD) do

not seem to be applicable for the GLM. I tried specifying orthogonal
contrasts, but could not figure out what the interaction contrast (see
Site1:Taxon1 in below example) means.


I'm a bit puzzled by this request, since it appears that you already  
have the test result you seek in the form of the line beginning  
Site1:Taxon1. (It might be a different story if you had more species.)  
In the default glm setup, the contrasts are of type "treatment". Your  
Site1:Taxon1 coefficients would then be the mean difference (and se)  
from the estimates based on Intercept+Taxon+Site( on the scale being  
regressed on). You will notice that the z value does not change when  
you use treatment contrasts instead of those you specified.


The only thing further to do would be to construct a reduced model  
without the interaction, take the difference in deviances of the  
model, and compare to chi-sq(significance level, df=1) and that would  
give you the ML test rather than the score test which results from the  
by coefficient tests.


When I do that I get
s1: 2 x log-likelihood:  -1155.4010
s2: 2 x log-likelihood:  -1181.8600

so
> 1-pchisq(26.459,df=1)
[1] 2.691913e-07


(Which is immaterially different than the Pr(>|z|) that you get from  
the default output.


Of course that was the way we did it in school 20 years ago and it  
would be much faster to do:


> anova(s1,s2)
Likelihood ratio tests of Negative Binomial Models

Response: counts
 Model theta Resid. df2 x log-lik.   Testdf LR  
stat.  Pr(Chi)

1 Site + Taxon 0.5263897   151   -1181.860
2 Site * Taxon 0.6126726   150   -1155.401 1 vs 2 1  
26.45960 2.691083e-07


--
David

Could you please advise me how to specify a meaningful interaction  
contrast
(i.e. contrast species within sites)? Alternatively, could you  
recommend a

way to do posthoc comparisons?

Thanks for your time and kind regards
Maya


library(MASS)
counts <- c(1, 4, 9, 2, 1, 4, 2, 4, 1, 3, 2, 2, 1, 3, 1, 1, 2, 1,  
113, 83,
49, 46, 13, 52, 4, 10, 14, 10, 3, 19, 8, 21, 151, 186, 99, 11, 29,  
24, 24,
62, 15, 98, 30, 21, 63, 29, 48, 11, 16, 35, 21, 17, 6, 2, 2, 3, 3,  
4, 4, 2,
1, 2, 2, 3, 4, 8, 10, 3, 14, 3, 11, 23, 3, 51, 8, 8, 7, 1, 13, 8, 1,  
0, 1,
0, 1, 1, 1, 0, 1, 1, 1, 5, 8, 1, 1, 20, 2, 0, 0, 0, 1, 0, 0, 0, 0,  
0, 1, 0,
0, 3, 0, 1, 0, 5, 1, 1, 9, 0, 34, 4, 1, 17, 0, 7, 33, 86, 73, 67,  
79, 109,
27, 37, 23, 12, 17, 41, 8, 38, 4, 23, 14, 49, 64, 39, 31, 156, 110,  
97, 33,

170, 137, 72, 28, 54)
Site <- factor(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  
1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,  
2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,  
2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  
1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,  
2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,  
2, 2,

2, 2, 2, 2, 2, 2, 2, 2, 2, 2))
Taxon <- factor(c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,  
2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,  
2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,  
2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  
1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  
1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  
1, 1,

1, 1, 1, 1, 1, 1, 1, 1, 1, 1))


contrasts(Site) <- cbind(c(1,-1))
contrasts(Taxon) <- cbind(c(1,-1))
s1 <- glm.nb(counts ~ Site*Taxon)
summary(s1)


Call:
glm.nb(formula = counts ~ Site * Taxon, init.theta =  
0.612672617555492,

   link = log)

Deviance Residuals:
 Min 1Q Median 3QMax
-2.269021  -1.215727  -0.454719   0.003515   2.517288

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)2.6139 0.1097  23.831  < 2e-16 ***
Site1 -0.8664 0.1097  -7.899 2.81e-15 ***
Taxon1-0.3814 0.1097  -3.477 0.000506 ***
Site1:Taxon1  -0.5977 0.1097  -5.449 5.06e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Negative Binomial(0.6127) family taken to  
be 1)


   Null deviance: 242.94  on 153  degrees of freedom
Residual deviance: 176.20  on 150  degrees of freedom
AIC: 1165.4

Number of Fisher Scoring iterations: 1


 Theta:  0.6127
 Std. Err.:  0.0690

2 x log-likelihood:  -1155.4010


TukeyHSD(

[R] superscript troubles

2009-11-02 Thread Jacob Kasper
I know that this has been revisited over and over, yet I cannot figure out
how to solve this case of superscript troubles...
I would like the 2 in r2 to be superscript, yet I am pasting text before and
after it. I have tried several variations but have not solved this yet, any
suggestions?

legend (bty =
"n","topright",paste("r2=",round(summary(lat_x)$r.squared,digits=3),",
P=",round(coefficients(summary(lat_x))[2,4], digits=3)))
Thank you
Jacob

-- 
Jacob Kasper
http://twitter.com/Protect_Oceans
66°04' N
23°07' W
Coastal & Marine Management Master's Student
University Centre of the Westfjords

Sundstræti 1437 Devens Rd
Ísafjörður, 400   Swampscott, MA 01907
Iceland   USA

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to print the full name of the factors in summary?

2009-11-02 Thread Jen-Chien Chang

Hi,

I am wondering if there is a simple way to fix the problem I am having. 
For unknown reason, I could not get the full name of the factors to be 
printed in the summary. I have tried to used summary.lm as well but the 
problem still persists.


SJ$Weekday <- 
factor(SJ$Weekday,1:7,c("Mon","Tue","Wed","Thu","Fri","Sat","Sun"),ordered=T)


attach(SJ)
lm.SJ <- lm(Demand ~ Weekday+Month+Holiday+Season)
summary(lm.SJ)
Call:
lm(formula = Demand ~ Weekday + Month + Holiday + Season)

Residuals:
Min  1Q  Median  3Q Max
-69.767 -12.224  -1.378  10.857  91.376

Coefficients: (3 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept)  88.7091 3.3442  26.527  < 2e-16 ***
Weekday.L20.8132 2.8140   7.396 1.08e-12 ***
Weekday.Q   -12.7667 2.8156  -4.534 7.99e-06 ***
Weekday.C   -10.6375 2.8113  -3.784 0.000182 ***
Weekday^4-8.3325 2.8103  -2.965 0.003238 **
-

Is there a way for summary to print the full name of the factors and 
levels? Say Weekday.Tue instead Weekday.L?


Thanks!

Jack Chang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using processed objects as arguments of a function

2009-11-02 Thread David Winsemius
This would return a vector the names of objects whose names begin with  
"fund":


ls()[grep("^fund", ls())]

I'm such a grep-noob that I don't know off the top of my head what the  
pattern should be to restrict it to only those objects with digits in  
the next position.


Perhaps:
do.call( "rbind", sapply(ls()[grep("^m", ls())], get) )

--
David.

On Nov 2, 2009, at 5:24 AM, Jim Lemon wrote:


On 11/02/2009 05:04 PM, Steven Kang wrote:

Dear R users,

I wish to utilise processed and saved objects as arguments of a  
function.


Specifically, I have created objects using *"assign"*&  *"paste"*  
functions

with an incremental index i, the names of the objects are:

fund1, fund2, fund3,., fund80,. (where the numerical  
value
increments according to the index i&  class of these objects are  
dataframes)


I wish to collapse these objects row wisely using *"rbind"* function.

paste("fund", 1:i, sep = "") results in list of objects as  
characters...&

get(paste("fund", 1:i, sep = "")) outputs fund1...

Are there any methods to use these objects as an argument of  
"rbind" to

collapse the dataframes?

Your expertise in resolving this issue would be highly appreciated.

Hi Steven,
There is probably a neater way to construct the list of dataframes,  
but this will probably do what you want:


dnames<-paste("fund",1:nfunds,sep="")
makelist<-function(x) {
nitems<-length(x)
newlist<-vector("list",nitems)
for(item in 1:nitems) newlist[[item]]<-get(x[item])
return(newlist)
}
dflist<-makelist(dfnames)
do.call("rbind",dflist)

Of course all of the dataframes must have the same number of columns  
or the result will be messy or not there at all.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] partial matching with grep()

2009-11-02 Thread Vito Muggeo (UniPa)

dear all,
This is a probably a silly question.
If I type
> grep("x",c("a.x" ,"b.x","a.xx"),value=TRUE)
[1] "a.x"  "b.x"  "a.xx"

Instead, I would like to obtain only
"a.x"  "b.x"
How is it possible to get this result with grep()?

many thanks for your attention,
best,
vito

--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] convert list to numeric

2009-11-02 Thread Adaikalavan Ramasamy

It's a way of extracting from a list. See help("[") or help("Extract").



dadrivr wrote:

Great, that works very well.  What is the purpose of double brackets vs
single ones?  I will remember next time to include a subset of the data, so
that readers can run the script.  Thanks again for your help!
 


Benilton Carvalho wrote:

it appears that what you really want is to use:

task[[i]]

instead of task[i]

b

On Nov 1, 2009, at 11:04 PM, dadrivr wrote:

I would like to preface this by saying that I am new to R, so I  
would ask
that you be patient and thorough, so that I'm not completely  
clueless.  I am
trying to convert a list to numeric so that I can perform  
computations on it
(specifically mean-center the variable), but I am running into  
problems.  I
have imported the data set into "task" (data frame).  The data frame  
is made
of factors with variable names in the first row.  I am running a  
loop to set
a variable equal to a column in the data frame.  Here is an example  
of my

problem:

for (i in 1:dim(task)[2]){
predictor.loop <- c(task[i])
predictor.loop.mc <- predictor.loop - mean(predictor.loop, na.rm=T)
}

I get the following error:
Error in predictor.loop - mean(predictor.loop, na.rm = T) :
 non-numeric argument to binary operator
In addition: Warning message:
In mean.default(predictor.loop, na.rm = T) :
 argument is not numeric or logical: returning NA

The column is entirely made up of numerical data, except for the  
header,

which is a string.  My problem is that I receive an error because the
predictor.loop variable is not numerical, so I need to find a way to  
convert

it.  I tried using:
predictor.loop <- c(as.numeric(task[i]))
But I get the following error: "Error: (list) object cannot be  
coerced to

type 'double'"

If I call the variable, I can assign it to a numerical list (e.g.,  
predictor
loop <- task$variablename), but since I am assigning the variable in  
a loop,
I have to find another way as the variable name would have to change  
in each

loop iteration.  Any help would be greatly appreciated.  Thanks!
--
View this message in context:
http://old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] convert list to numeric

2009-11-02 Thread Adaikalavan Ramasamy

It's a way of extracting from a list. See help("[") or help("Extract")

Regards, Adai

dadrivr wrote:

Great, that works very well.  What is the purpose of double brackets vs
single ones?  I will remember next time to include a subset of the data, so
that readers can run the script.  Thanks again for your help!
 


Benilton Carvalho wrote:

it appears that what you really want is to use:

task[[i]]

instead of task[i]

b

On Nov 1, 2009, at 11:04 PM, dadrivr wrote:

I would like to preface this by saying that I am new to R, so I  
would ask
that you be patient and thorough, so that I'm not completely  
clueless.  I am
trying to convert a list to numeric so that I can perform  
computations on it
(specifically mean-center the variable), but I am running into  
problems.  I
have imported the data set into "task" (data frame).  The data frame  
is made
of factors with variable names in the first row.  I am running a  
loop to set
a variable equal to a column in the data frame.  Here is an example  
of my

problem:

for (i in 1:dim(task)[2]){
predictor.loop <- c(task[i])
predictor.loop.mc <- predictor.loop - mean(predictor.loop, na.rm=T)
}

I get the following error:
Error in predictor.loop - mean(predictor.loop, na.rm = T) :
 non-numeric argument to binary operator
In addition: Warning message:
In mean.default(predictor.loop, na.rm = T) :
 argument is not numeric or logical: returning NA

The column is entirely made up of numerical data, except for the  
header,

which is a string.  My problem is that I receive an error because the
predictor.loop variable is not numerical, so I need to find a way to  
convert

it.  I tried using:
predictor.loop <- c(as.numeric(task[i]))
But I get the following error: "Error: (list) object cannot be  
coerced to

type 'double'"

If I call the variable, I can assign it to a numerical list (e.g.,  
predictor
loop <- task$variablename), but since I am assigning the variable in  
a loop,
I have to find another way as the variable name would have to change  
in each

loop iteration.  Any help would be greatly appreciated.  Thanks!
--
View this message in context:
http://old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] convert list to numeric

2009-11-02 Thread dadrivr

Great, that works very well.  What is the purpose of double brackets vs
single ones?  I will remember next time to include a subset of the data, so
that readers can run the script.  Thanks again for your help!
 

Benilton Carvalho wrote:
> 
> it appears that what you really want is to use:
> 
> task[[i]]
> 
> instead of task[i]
> 
> b
> 
> On Nov 1, 2009, at 11:04 PM, dadrivr wrote:
> 
>>
>> I would like to preface this by saying that I am new to R, so I  
>> would ask
>> that you be patient and thorough, so that I'm not completely  
>> clueless.  I am
>> trying to convert a list to numeric so that I can perform  
>> computations on it
>> (specifically mean-center the variable), but I am running into  
>> problems.  I
>> have imported the data set into "task" (data frame).  The data frame  
>> is made
>> of factors with variable names in the first row.  I am running a  
>> loop to set
>> a variable equal to a column in the data frame.  Here is an example  
>> of my
>> problem:
>>
>> for (i in 1:dim(task)[2]){
>> predictor.loop <- c(task[i])
>> predictor.loop.mc <- predictor.loop - mean(predictor.loop, na.rm=T)
>> }
>>
>> I get the following error:
>> Error in predictor.loop - mean(predictor.loop, na.rm = T) :
>>  non-numeric argument to binary operator
>> In addition: Warning message:
>> In mean.default(predictor.loop, na.rm = T) :
>>  argument is not numeric or logical: returning NA
>>
>> The column is entirely made up of numerical data, except for the  
>> header,
>> which is a string.  My problem is that I receive an error because the
>> predictor.loop variable is not numerical, so I need to find a way to  
>> convert
>> it.  I tried using:
>> predictor.loop <- c(as.numeric(task[i]))
>> But I get the following error: "Error: (list) object cannot be  
>> coerced to
>> type 'double'"
>>
>> If I call the variable, I can assign it to a numerical list (e.g.,  
>> predictor
>> loop <- task$variablename), but since I am assigning the variable in  
>> a loop,
>> I have to find another way as the variable name would have to change  
>> in each
>> loop iteration.  Any help would be greatly appreciated.  Thanks!
>> --
>> View this message in context:
>> http://old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://old.nabble.com/convert-list-to-numeric-tp26155039p26157105.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Titles on lattice colorkey

2009-11-02 Thread Walmes Zeviani



r-help.20.trevva wrote:
> 
> Dear R-ers,
> 
> I'm not sure if this is a missing feature, a support request, or stupidity
> on my part, but nevertheless, its a question. Is it possible to add titles
> to colorkey legends? As far as I can tell, there is a command to do it for
> normal "key" legends, but not for "colorkeys". 
> 
> eg it works for a normal key, created through auto.key
> 
> xyplot(decrease ~ treatment, OrchardSprays, groups = rowpos,
>type = "a",
>auto.key = list(space = "right", points = FALSE, lines =
> TRUE,title="Key title"))
> 
> but there is no comparable command for a colorkey
> 
> x <- seq(pi/4, 5 * pi, length = 100)
> y <- seq(pi/4, 5 * pi, length = 100)
> r <- as.vector(sqrt(outer(x^2, y^2, "+")))
> grid <- expand.grid(x=x, y=y)
> grid$z <- cos(r^2) * exp(-r/(pi^3))
> levelplot(z~x*y, grid, cuts = 50, scales=list(log="e"), xlab="",
>   ylab="", main="Weird Function", sub="with log scales",
>   region = TRUE,
>   colorkey = list(space="right",title="Doesn't work"))
> 
> 
> Cheers,
> 
> Mark
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

Maybe using library(grid) we can solve. Try

levelplot(z~x*y, grid, cuts = 50, scales=list(log="e"), xlab="",
  ylab="", main="Weird Function", sub="with log scales",
  region = TRUE,
  colorkey=list(space="right", title="Doesn't work"),
  legend=list(top=list(fun=grid::textGrob("Size\n(m)", x=1.09

Walmes Zeviani.
-- 
View this message in context: 
http://old.nabble.com/Titles-on-lattice-colorkey-tp22868692p26156677.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Interaction contrasts or posthoc test for glm (MASS) with ANOVA design

2009-11-02 Thread Maya Pfaff
Dear R experts

I am running a negative-binomial GLM (glm.nb) to test the null hypotheses
that species 1 and 2 are equally abundant between site 1 and site2, and
between each other. So, I have a 2x2 factorial design with factors Site
(1,2) and Taxon (1,2).
Since the Site:Taxon interaction is significant, I need to do the equivalent
to a "post-hoc test" for ANOVA, however, the same tests (e.g. Tukey HSD) do
not seem to be applicable for the GLM. I tried specifying orthogonal
contrasts, but could not figure out what the interaction contrast (see
Site1:Taxon1 in below example) means.

Could you please advise me how to specify a meaningful interaction contrast
(i.e. contrast species within sites)? Alternatively, could you recommend a
way to do posthoc comparisons?

Thanks for your time and kind regards
Maya

> library(MASS)
> counts <- c(1, 4, 9, 2, 1, 4, 2, 4, 1, 3, 2, 2, 1, 3, 1, 1, 2, 1, 113, 83,
49, 46, 13, 52, 4, 10, 14, 10, 3, 19, 8, 21, 151, 186, 99, 11, 29, 24, 24,
62, 15, 98, 30, 21, 63, 29, 48, 11, 16, 35, 21, 17, 6, 2, 2, 3, 3, 4, 4, 2,
1, 2, 2, 3, 4, 8, 10, 3, 14, 3, 11, 23, 3, 51, 8, 8, 7, 1, 13, 8, 1, 0, 1,
0, 1, 1, 1, 0, 1, 1, 1, 5, 8, 1, 1, 20, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0,
0, 3, 0, 1, 0, 5, 1, 1, 9, 0, 34, 4, 1, 17, 0, 7, 33, 86, 73, 67, 79, 109,
27, 37, 23, 12, 17, 41, 8, 38, 4, 23, 14, 49, 64, 39, 31, 156, 110, 97, 33,
170, 137, 72, 28, 54)
> Site <- factor(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2))
> Taxon <- factor(c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1))

> contrasts(Site) <- cbind(c(1,-1))
> contrasts(Taxon) <- cbind(c(1,-1))
> s1 <- glm.nb(counts ~ Site*Taxon)
> summary(s1)

Call:
glm.nb(formula = counts ~ Site * Taxon, init.theta = 0.612672617555492,
link = log)

Deviance Residuals:
  Min 1Q Median 3QMax
-2.269021  -1.215727  -0.454719   0.003515   2.517288

Coefficients:
 Estimate Std. Error z value Pr(>|z|)
(Intercept)2.6139 0.1097  23.831  < 2e-16 ***
Site1 -0.8664 0.1097  -7.899 2.81e-15 ***
Taxon1-0.3814 0.1097  -3.477 0.000506 ***
Site1:Taxon1  -0.5977 0.1097  -5.449 5.06e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ 
’ 1

(Dispersion parameter for Negative Binomial(0.6127) family taken to be 1)

Null deviance: 242.94  on 153  degrees of freedom
Residual deviance: 176.20  on 150  degrees of freedom
AIC: 1165.4

Number of Fisher Scoring iterations: 1


  Theta:  0.6127
  Std. Err.:  0.0690

 2 x log-likelihood:  -1155.4010
>
> TukeyHSD(s1)
Error in UseMethod("TukeyHSD") : no applicable method for "TukeyHSD"

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >