Re: [R] [EXT] Theta from negative binomial regression and power_NegativeBinomiial from PASSED

2023-09-14 Thread Andrew Robinson via R-help
Hi John,

the negative binomial is a tricky one - there are several different 
parameterisations and therefore different interpretations of the parameters.  
Joseph Hilbe wrote a whole book on it that might be wroth checking.

Cheers,

Andrew


--
Andrew Robinson
Chief Executive Officer, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: a...@unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/

I acknowledge the Traditional Owners of the land I inhabit, and pay my respects 
to their Elders.
On 15 Sep 2023 at 11:52 AM +1000, Sorkin, John , 
wrote:
External email: Please exercise caution

Colleagues,

I want to use the power_NetativeBinomial function from the PASSED library. The 
function requires a value for a parameter theta. The meaning of theta is not 
given in the documentation (at least I can�t find it) of the function. Further 
the descriptions of the negative binomial distribution that I am familiar with 
do not mention theta as being a parameter of the distribution. I noticed that 
when one runs the glm.nb function to perform a negative binomial regression one 
obtains a value for theta. This leads to two questions

1. Is the theta required by the power_NetativeBinomial function the theta that 
is produced by the glm.nb function
2. What is theta, and how does it relate to the parameters of the negative 
binomial distribution?

Thank you,
John

[[alternative HTML version deleted]]


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [EXT] Downloading a directory of text files into R

2023-07-25 Thread Andrew Robinson via R-help
Hi Bob,

there may be more efficient ways to go about it but I would use R to scrape the 
contents of

http://home.brisnet.org.au/~bgreen/Data/Hanson1/
http://home.brisnet.org.au/~bgreen/Data/Hanson2/

in order to form the URLs of the files, and then loop over the URLs.

Cheers,

Andrew

--
Andrew Robinson
Chief Executive Officer, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: a...@unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/

I acknowledge the Traditional Owners of the land I inhabit, and pay my respects 
to their Elders.
On 26 Jul 2023 at 8:07 AM +1000, Bob Green , wrote:
External email: Please exercise caution

Hello,

I am seeking advice as to how I can download the 833 files from this
site:"http://home.brisnet.org.au/~bgreen/Data";

I want to be able to download them to perform a textual analysis.

If the 833 files, which are in a Directory with two subfolders were
on my computer I could read them through readtext. Using readtext I
get the error:

> x = readtext("http://home.brisnet.org.au/~bgreen/Data/*";)
Error in download_remote(file, ignore_missing, cache, verbosity) :
Remote URL does not end in known extension. Please download the
file manually.

> x = readtext("http://home.brisnet.org.au/~bgreen/Data/Dir/()")
Error in download_remote(file, ignore_missing, cache, verbosity) :
Remote URL does not end in known extension. Please download the
file manually.

Any suggestions are appreciated.

Bob

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [EXT] How to calculate the derivatives at each data point?

2023-01-31 Thread Andrew Robinson via R-help
Try something like

with(df, predict(smooth.spline(x = altitude, y = atm_values), deriv = 1))

Cheers,

Andrew

--
Andrew Robinson
Chief Executive Officer, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: a...@unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/

I acknowledge the Traditional Owners of the land I inhabit, and pay my respects 
to their Elders.
On 31 Jan 2023 at 8:17 PM +1100, konstantinos christodoulou 
, wrote:
External email: Please exercise caution

Hi everyone,

I have a vector with atmospheric measurements (x-axis) that is
obtained/calculated at different altitudes (y-axis). The altitude is
uniformly distributed every 7 meters.
For example my dataframe is:
df <- dataframe(
*altitude* = c(1005, 1012, 1019, 1026, 1033, 1040, 1047, 1054, 1061, 1068),
*atm_values* = c(1.41, 1.40, 1.39, 1.38, 1.37, 1.37, 1.38, 1.36, 1.33, 1.31)
)

How can I find the derivatives of the atmospheric measurements at each
altitude?

I look forward to hearing from you!

Thanks,
Kostas

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [EXT] Re: Assigning categorical values to dates

2021-07-21 Thread Andrew Robinson
I wonder if you mean that you want the levels of the factor to reset within 
each month?  That is not obvious from your example, but implied by your 
question.

Andrew

--
Andrew Robinson
Director, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: a...@unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/

I acknowledge the Traditional Owners of the land I inhabit, and pay my respects 
to their Elders.
On 22 Jul 2021, 1:47 PM +1000, N. F. Parsons , 
wrote:
External email: Please exercise caution

I am not averse to a factor-based solution, but I would still have to manually 
enter that factor each month, correct? If possible, I’d just like to point R at 
that column and have it do the work.

—
Nathan Parsons, B.SC, M.Sc, G.C.

Ph.D. Candidate, Dept. of Sociology, Portland State University
Adjunct Professor, Dept. of Sociology, Washington State University
Graduate Advocate, American Association of University Professors (OR)

Recent work (https://www.researchgate.net/profile/Nathan_Parsons3/publications)
Schedule an appointment (https://calendly.com/nate-parsons)

On Wednesday, Jul 21, 2021 at 8:30 PM, Tom Woolman mailto:twool...@ontargettek.com)> wrote:

Couldn't you convert the date columns to character type data in a data
frame, and then convert those strings to factors in a 2nd step?

The only downside I think to treating dates as factor levels is that
you might have an awful lot of factors if you have a large enough
dataset.



Quoting "N. F. Parsons" :

Hi all,

If I have a tibble as follows:

tibble(dates = c(rep("2021-07-04", 2), rep("2021-07-25", 3),
rep("2021-07-18", 4)))

how in the world do I add a column that evaluates each of those dates and
assigns it a categorical value such that

dates cycle
 
2021-07-04 1
2021-07-04 1
2021-07-25 3
2021-07-25 3
2021-07-25 3
2021-07-18 2
2021-07-18 2
2021-07-18 2
2021-07-18 2

Not to further complicate matters, but some months I may only have one
date, and some months I will have 4 dates - so thats not a fixed quantity.
We've literally been doing this by hand at my job and I'd like to automate
it.

Thanks in advance!

Nate Parsons

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [EXT] Re: nls() syntax

2020-12-11 Thread Andrew Robinson
Hi Rolf,

It might also be worth experimenting to see if

y ~ 1 / ( 1 - a[z]/x )

yields the same result. It might be cleaner if x appears only once in the 
expression.

Cheers,

Andrew

--
Andrew Robinson
Director, CEBRA; Professor of Biosecurity
School of BioSciences
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: a...@unimelb.edu.au
Website: http://cebra.unimelb.edu.au/

I acknowledge the traditional owners of the land I inhabit and pay my respects 
to their elders.

From: R-help  on behalf of Rolf Turner 

Sent: Saturday, December 12, 2020 1:39:11 PM
To: Gabor Grothendieck 
Cc: r-help@r-project.org 
Subject: [EXT] Re: [R] nls() syntax

UoM notice: External email. Be cautious of links, attachments, or impersonation 
attempts


On Fri, 11 Dec 2020 19:20:26 -0500
Gabor Grothendieck  wrote:

> The start= argument should be as follows:
>
>  nls(y ~ x/(x - a[z]),start=list(a = strt),data=xxx)

Nya-ha!  I, uh, clearly wasn't thinking clearly!  (Feel a bit *duh*
now!)

Thanks Gabor.

cheers,

Rolf

>
> On Fri, Dec 11, 2020 at 6:51 PM Rolf Turner 
> wrote:
> >
> >
> >
> > I want to fit a model y = x/(x-a) where the value of a depends
> > on the level of a factor z.  I cannot figure out an appropriate
> > syntax for nls().  The "parameter" a (to be estimated) should be a
> > vector of length equal to the number of levels of z.
> >
> > I tried:
> >
> > strt <- rep(3,length(levels(z))
> > names(strt=levels(z)
> > fit <- nls(y ~ x/(x - a[z]),start=strt,data=xxx)
> >
> > but of course got an error:
> >
> > > Error in nls(y ~ x/(x - a[z]), start = strt, data = xxx) :
> > >   parameters without starting value in 'data': a
> >
> > I keep thinking that there is something obvious that I should
> > be doing, but I can't work out what it is.
> >
> > Does there *exist* an appropriate syntax for doing what I want
> > to do?  Can anyone enlighten me?  The data set "xxx" is given
> > in dput() form at the end of this message.
> >
> > cheers,
> >
> > Rolf Turner
> >
> > --
> > Honorary Research Fellow
> > Department of Statistics
> > University of Auckland
> > Phone: +64-9-373-7599 ext. 88276
> >
> > Data set "xxx":
> >
> > structure(list(x = c(30, 40, 50, 60, 70, 80, 90, 100, 110, 120,
> > 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250,
> > 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160,
> > 170, 180, 190, 200, 210, 220, 230, 240, 250, 30, 40, 50, 60,
> > 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190,
> > 200, 210, 220, 230, 240, 250, 30, 40, 50, 60, 70, 80, 90, 100,
> > 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230,
> > 240, 250, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140,
> > 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250), y = c(1.27,
> > 1.16, 1.19, 1.15, 1.09, 1.07, 1.07, 1.05, 1.07, 1.03, 1.05, 1.07,
> > 1.06, 1.03, 1.05, 1.04, 1.03, 1.03, 1.03, 1.02, 1.02, 1.01, 1.01,
> > 1.21, 1.15, 1.1, 1.1, 1.06, 1.06, 1.05, 1.03, 1.07, 1.04, 1.04,
> > 1.02, 1.04, 1.02, 1.04, 1.03, 1.01, 1.03, 1.01, 1, 1.02, 1.03,
> > 1.02, 1.42, 1.27, 1.23, 1.14, 1.17, 1.08, 1.11, 1.06, 1.07, 1.08,
> > 1.06, 1.07, 1.04, 1.03, 1.07, 1.04, 1.03, 1.03, 1.03, 1.04, 1.03,
> > 1.03, 1.04, 1.85, 1.41, 1.35, 1.21, 1.22, 1.15, 1.14, 1.07, 1.1,
> > 1.09, 1.1, 1.09, 1.08, 1.08, 1.09, 1.09, 1.07, 1.06, 1.03, 1.08,
> > 1.05, 1.02, 1.05, 1.99, 1.6, 1.44, 1.4, 1.24, 1.3, 1.21, 1.23,
> > 1.18, 1.18, 1.12, 1.15, 1.09, 1.07, 1.13, 1.1, 1.05, 1.13, 1.09,
> > 1.03, 1.11, 1.07, 1.05), z = structure(c(1L, 1L, 1L, 1L, 1L,
> > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> > 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> > 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> > 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
> > 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
> > 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label =
> > c("p1", "p2", "p3", "p4", "p5"), class = "factor")), class =
> > "data.frame", row.names = c(NA, -115L))

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [EXT] Re: Inappropriate color name

2020-11-19 Thread Andrew Robinson
I see a lot of reasoning in this thread that I consider specious at best.

What seems clear to me, writing as a cis-gendered grey white male, is that we 
need to make more room.  How do we do this?  We do it by listening, reflecting, 
and responding.  If words that we use are hurtful, then we must change them.  
This process will not be perfect. It will be messy.  Our feelings may be hurt, 
our principles outraged. Treasured words may disappear. That is how we make 
room.

I find appeals to etymology to be irrelevant. History is rife with examples of 
innocent symbols and words being co-opted.  We abandon those symbols and words, 
rightly, because the taint clings to them, rightly or wrongly.

I find appeals to broad usage to also be irrelevant.  Change starts where and 
when we all decide that it starts.  The R community is its own thing.

Finally, I find appeals to stakeholder group size to be irrelevant. The point 
is not to count the people who won't be offended.

Here's a personal example: more than 10 years ago, a co-author and I submitted 
a package to CRAN that was designed to make R a little easier to use.  We 
called it R-assist.  It sailed through all the checks, and was published on 
CRAN.  We were then besieged with complaints from native German speakers.  So 
we changed the name.

Best wishes,

Andrew

--
Andrew Robinson
Director, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: a...@unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/

I acknowledge the Traditional Owners of the land I inhabit, and pay my respects 
to their Elders.
On Nov 20, 2020, 9:28 AM +1100, Heinz Tuechler , wrote:
UoM notice: External email. Be cautious of links, attachments, or impersonation 
attempts

inline - David Wright wrote on 19.11.2020 12:39:
Appropriation of Indian Red as 'Chestnut' (or other alternative) will
be viewed by some as 'making appropriate' the label for a colour, and
no doubt by other groups as cultural theft by excising reference to
its origin.

Seems the best option is to recognise the actual etymology carries no
semblance of offense whatsoever, and leave well alone.


One may remember that people who might feel offended by "Indian Red"
(Native Americans) make up less than 0.5 percent of all "Indians".
It is hardly the fault of the people of India that Native Americans were
called Indians by an Italian navigator who thought he had landed in India.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] - Trying to replicate VLOOKUP in R - help needed

2020-11-16 Thread Andrew Robinson
Hi Gregg,

it's not clear from your context if all of ASSIGNED _COMPANY is NA or what the 
classes of the objects are.  Try the following ideas, none of which are tested. 
 I assume that the data set is called location.

location$ASSIGNED_COMPANY <- as.character(location$NAME)

is.a.FORT <- substr(location$ASSIGNED_COMPANY, 1, 4) == "FORT"

location$ASSIGNED_COMPANY[!is.a.FORT] <-
  sapply(location$ASSIGNED_COMPANY[!is.a.FORT],
  function(x) strsplit(x)[[1]][[1]]) # retains first name if not a fort

location$ASSIGNED_COMPANY[is.a.FORT] <-
  substr(location$ASSIGNED_COMPANY[is.a.FORT], 6,
  nchar(location$ASSIGNED _COMPANY[is.a.FORT])) # Strips FORT from Forts

substr(location$ASSIGNED_COMPANY, 2, nchar(location$ASSIGNED_COMPANY)) <-
  tolower(substr(location$ASSIGNED _COMPANY, 2,
  nchar(location$ASSIGNED _COMPANY))) # lower case word

location$ASSIGNED_COMPANY <- paste("NEC", location$ASSIGNED_COMPANY)

or you can just do

location$ASSIGNED_COMPANY[location$NAME == "ABERDEEN PROVING GROUND"] <- "NEC 
Aberdeen"

for each option 

Cheers,

Andrew

--
Andrew Robinson
Director, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: a...@unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/

I acknowledge the Traditional Owners of the land I inhabit, and pay my respects 
to their Elders.
On Nov 17, 2020, 8:05 AM +1100, Gregg via R-help , wrote:
PROBLEM: I am trying to replicate something like a VLOOKUP in R but am having 
no success - need a bit of help.

GIVEN DATA SET (data.table): (looks something like this, but much bigger)

NAME TOTALAUTH ASSIGNED_COMPANY
ABERDEEN PROVING GROUND 1 NA
ADELPHI LABORATORY CENTER 1 NA
CARLISLE BARRACKS 1 NA
DETROIT ARSENAL 1 NA
DUGWAY PROVING GROUND 1 NA
FORT A P HILL 1 NA
FORT BELVOIR 1 NA
FORT BENNING 1 NA
FORT BLISS 1 NA
FORT BRAGG 1 NA
FORT BUCHANAN 1 NA


I am trying to update the values in the ASSIGNED_COMPANY column from NAs to a 
value that matches based on the "key" word like below.

NAME TOTALAUTH ASSIGNED_COMPANY
ABERDEEN PROVING GROUND 1 NEC Aberdeen
ADELPHI LABORATORY CENTER 1 NEC Adelphi
CARLISLE BARRACKS 1 NEC Carlise
DETROIT ARSENAL 1 NEC Detroit
DUGWAY PROVING GROUND 1 NEC Dugway
FORT A P HILL 1 NEC AP Hill
FORT BELVOIR 1 NEC Belvoir
FORT BENNING 1 NEC Benning
FORT BLISS 1 NEC Bliss
FORT BRAGG 1 NEC Bragg
FORT BUCHANAN 1 NEC Buchanon


In a nutshell, for instance...

I want to search for the keyword "ABERDEEN" in the NAME column, and for every 
row where it exists, I want to update the NA in the ASSIGNED_COMPANY column to 
"NEC Aberdeen"

I want to search for the keyword "ADELPHI" in the NAME column, and for every 
row where it exists, I want to update the NA in the ASSIGNED_COMPANY column to 
"NEC ADELPHI"

... and so on for every value in the NAME column - so in the end a I have 
matching names in the ASSIGNED_COMPANY column.

I can use an if statement because it is not vectorized.

If I use an ifelse statement, the "else" rewrites any changes with ""

Something so simple should not be difficult.

Some of the methods I attempted to use are below along with the errors I get...






###CODE###

library(data.table)
library(dplyr)
library(stringr)


VLOOKUP_inR <- data.table::fread("DATASET_TESTINGONLY.csv")

#METHOD 1 FAILS
VLOOKUP_inR %>% dplyr::rename_if(grepl("ADELPHI", VLOOKUP_inR$NAME, useBytes = 
TRUE), "NEC Adelphi")

Error in get(.x, .env, mode = "function") :

object 'NEC Adelphi' of mode 'function' was not found

#METHOD 2 FAILS
if(stringr::str_detect(VLOOKUP_inR$NAME, "ADELPHI")) {
VLOOKUP_inR$ASSIGNED_COMPANY == "NEC Adelphi"
}

Warning message:
In if (stringr::str_detect(VLOOKUP_inR$NAME, "ADELPHI")) { :
the condition has length > 1 and only the first element will be used


#METHOD 3 FAILS
ifelse(stringr::str_detect(ASIP_combined_location_tally$NAME, "ADELPHI"), 
ASIP_combined_location_tally$ASSIGNED_COMPANY == 
ASIP_combined_location_tally$ASSIGNED_COMPANY)

Error in ifelse(stringr::str_detect(ASIP_combined_location_tally$NAME, :

argument "no" is missing, with no default

#METHOD4 FAILS
VLOOKUP_inR_matching <- VLOOKUP_inR %>% mutate(ASSIGNED_COMPANY = 
ifelse(grepl(pattern = 'ABERDEEN', x = NAME), 'NEC Aberdeen', ''))
VLOOKUP_inR_matching <- VLOOKUP_inR_matching %>% mutate(ASSIGNED_COMPANY = 
ifelse(grepl(pattern = 'ADELPHI', x = NAME), 'NEC Adelphi', ''))

VLOOKUP_inR_matching <- VLOOKUP_inR_matching %>% mutate(ASSIGNED_COMPANY = 
ifelse(grepl(pattern = 'CARLISLE', x = NAME), 'NEC Carlisle Barracks',

Re: [R] [EXT] string concatenation

2020-11-04 Thread Andrew Robinson
Try

paste(x, collapse = ", ")

Cheers,

Andrew

--
Andrew Robinson
Director, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: a...@unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/

I acknowledge the Traditional Owners of the land I inhabit, and pay my respects 
to their Elders.
On Nov 5, 2020, 4:17 PM +1100, John , wrote:

x <- c("Alice", "Bob", "Charles")

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A Question about MLE in R

2020-07-24 Thread Andrew Robinson
Hi Ravi,

that's an interesting claim and N-M.  Can you provide any reading matter to 
support it?

Cheers,

Andrew

--
Andrew Robinson
Director, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: a...@unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/

I acknowledge the Traditional Owners of the land I inhabit, and pay my respects 
to their Elders.
On Jul 24, 2020, 10:54 PM +1000, Ravi Varadhan , wrote:
I agree with John that SANN should be removed from optim.


More importantly, the default choice of optimizer in optim should be changed 
from "Nelder-Mead" to "BFGS." Nelder-Mead is a bad choice for the most commonly 
encountered optimization problems in statistics. I really do not see a good 
reason for not changing the default in optim.


Best regards,

Ravi

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A Question about MLE in R

2020-07-24 Thread Andrew Robinson
Hi John,

I wonder if you can suggest some reading material on that topic?  A cursory 
search of the net doesn't uncover anything obvious.

Andrew

--
Andrew Robinson
Director, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: a...@unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/

I acknowledge the Traditional Owners of the land I inhabit, and pay my respects 
to their Elders.
On Jul 22, 2020, 10:49 PM +1000, J C Nash , wrote:
SANN is almost NEVER the tool to use.

I've given up trying to get it removed from optim(), and will soon give up
on telling folk not to use it.

JN

On 2020-07-22 3:06 a.m., Zixuan Qi wrote:
Hi,

I encounter a problem. I use optim() function in R to estimate likelihood
function and the method is SANN in the optim function.
out <-
optim(parm,logLik,method='SANN',hessian=T,control=list(maxit=500))

However, I find that each time I run the program, I will get different
values of parameters. My initial values are same, but the number of
iterations has reached the maximum limit. I expanded the number of
iterations to 5 million, but it��s still wrong.

I want to know what I should do. Is anyone willing to help me? Thanks so
much!

Best,
Cisy

[[alternative HTML version deleted]]


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with if else statement

2019-08-07 Thread Andrew Robinson
pmax() should work in this instance, as in any case you want the larger value.

Andrew

--
Andrew Robinson
Director, CEBRA, School of BioSciences
Reader & Associate Professor in Applied Statistics Tel: (+61) 0403 138 955
School of Mathematics and Statistics Fax: (+61) 03 8344 4599
University of Melbourne, VIC 3010 Australia
Email: a...@unimelb.edu.au
Website: http://cebra.unimelb.edu.au/

From: R-help  on behalf of Ana Marija 

Sent: Thursday, August 8, 2019 4:37 am
To: r-help
Subject: [R] Help with if else statement

Hello,

I have a data frame which looks like this:

> head(pt)
 eidQ phenoQ phenoH
1 117 -9 -9
2 125 -9 -9
3 138 -9  1
4 142 -9 -9
5 156 -9 -9
6 174 -9 -9
7   138 -9  1
8  1000127  2  1
9  1000690  2 -9
10  1000711  2 -9
11 1001431  2  1
12 1001710 -9  1

I would like to create the 3rd column called "pheno" which would have
these values:
-9,-9,1,-9,-9,-9,1,2,2,2,2,1

so in other words:
-if -9 appears in both phenoQ and phenoH I will have -9 in pheno
-if 2 appears in any of phenoQ or phenoH I will have 2 in pheno
-if I have -9 and 1 or 1 and -9 in those columns I will have 1 in pheno
-if I have -9 and 2 or 2 and -9 in those columns I will have 2 in pheno

Can you please tell me how my if else statement would look like or any
other way how to do that in R if my data frame name is "pt"

Thanks
Ana
Ana

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [FORGED] Re: Regarding R licensing usage guidance

2019-07-25 Thread Andrew Robinson
I most firmly do not agree with you, Rolf.

Over the years that I've been a member of this community I've noted the actions 
of a small group who have seemed to feel that it is ok to be an asshole to 
other people, and a group of hangers-on who have applauded and egged on this 
shameful behaviour.  Some members of this group have moved on, some remain. For 
a long while, R-help was quite toxic. 

Jeff's response was uncharacteristically rude and should be called out.  I 
ordinarily find Jeff to be terse, which is totally fine, and not colourful, 
which is also fine.  If you want colour, go watch WWE. 

Best wishes,

Andrew
-- 
Andrew Robinson 
Director, CEBRA, School of BioSciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955 

School of Mathematics and StatisticsFax: (+61) 03 8344 
4599 
University of Melbourne, VIC 3010 Australia
Email: a...@unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr
 

On 7/25/19, 7:50 AM, "R-help on behalf of Rolf Turner" 
 wrote:


On 25/07/19 4:36 AM, Weiwen Ng, MPH wrote:

> Here's one way to phrase your reply:
> 
> "I'd recommend you search Google. For example, the search string
> "proprietary use GPL" produces one hit that's clearly relevant to you:



> This method is more neutrally worded. It doesn't insult the original
> poster. It doesn't assume the poster had bad intent.
> 
> Instead, you chose to phrase it thus:
> 
> "Your internet skills are pathetic. Search Google for "proprietary use 
gpl"
> and the first hit is ...  Note that there are (at least) three obvious
> alternatives if there is any question in your case ...   I think your
> desperation to steal the hard work of the various R contributors seems
> quite odious."
> 
> Think about the overall tone of your post. Consider also that someone who
> agrees with you substantive argument said that your comments were "often
> (almost always?) a bit rough about the edges."

Yeah, but Jeff's rough-about-the-edges phrasing is much more colourful, 
and colourful is *GOOD*.  There is far too much bland "S. We 
*mustn't* offend anybody" content in current discourse.  Tell it like it 
is!  Ripley into people!  If the recipient can't take the heat, he or 
she should get out of the kitchen!

See also fortunes::fortune(87).

cheers,

Rolf Turner

P.S.  Jeff makes a huge and extremely useful contribution to R-help.  He 
gives generously of time and effort to solve beginners' problems.  They 
should appreciate the time and effort and not whinge about being offended.

R. T.

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nlme

2016-10-24 Thread Andrew Robinson
Without access to the data, or commented, minimal, self-contained,
reproducible code, it's pretty hard to speculate.  I suggest that you
reframe your question so that we can see what you can see.

Andrew

On 25 October 2016 at 03:03, Santiago Bueno  wrote:
> Dear people:
>
>
> I am getting the following error when trying to run the piece of code
> below. Any insights??
>
>
>
> Error in nlme.formula(Btronc ~ a + b * dbh^2 * haut, data = cbind(dat,  :
>
>   object 'a' not found
>
>
>
> library(nlme)
>
> start <- coef(lm(Btronc~I(dbh**2*haut),data=dat))
>
> names(start) <- c("a","b")
>
> summary(nlme(Btronc~a+b*dbh**2*haut, data=cbind(dat,g="a"), fixed=a+b_1,
> start=start,
>
> groups=~g, weights=varPower(form=~dbh)))
>
>
> Best,
>
>
> Santiago
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding starting values for the parameters using nls() or nls2()

2016-10-09 Thread Andrew Robinson
Here are some things to try.  Maybe divide Area by 1000 and retention
by 100.  Try plotting the data and superimposing the line that
corresponds to the 'fit' from nls2.  See if you can correct it with
some careful guesses.

Getting suitable starting parameters for non-linear modeling is one of
the black arts of statistical fitting. Good luck!  And don't forget to
check for sensitivity.

Andrew

On 9 October 2016 at 22:21, Pinglei Gao  wrote:
> Hi,
>
> I have some data that i'm trying to fit a double exponential model: data.
> Frame (Area=c (521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91, 989.05,
> 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2),
>
> Retention=c (95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46,
> 9.72, 97.92, 71.44, 44.52, 24.44, 15.26) ) and the formula of the double
> exponential is: exp (b0*exp (b1*x^th)).
>
>
>
> I failed to guess the initial parameter values and then I learned a measure
> to find starting values from Nonlinear Regression with R (pp. 25-27):
>
>
>
>> cl<-data.frame(Area =c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91,
> 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2),
>
> + Retention =c(95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46,
> 9.72, 97.92, 71.44, 44.52, 24.44, 15.26) )
>
>> expFct <- function(Area, b0, b1,th) {exp(b0*exp(b1*Area^th))}
>
>> grid.Disperse <- expand.grid(list(b0 = seq(0.01,4, by = 0.01), th =
> c(0.02),b1 = seq(0.01, 4, by = 0.01)))
>
>> Disperse.m2a <- nls2(Retention ~expFct(Area, b0, b1,th), data = cl, start
> = grid.Disperse, algorithm = "brute-force")
>
>> Disperse.m2a
>
> Nonlinear regression model
>
>   model: Retention ~ expFct(Area, b0, th, b1)
>
>data: cl
>
> b0   th   b1
>
> 3.82 0.02 0.01
>
> residual sum-of-squares: 13596
>
> Number of iterations to convergence: 16
>
> Achieved convergence tolerance: NA
>
>
>
> I got no error then I use the output as starting values to nls2 ():
>
>> nls.m2<- nls2(Retention ~ expFct(Area, b0, b1, th), data = cl, start =
> list(b0 = 3.82, b1 = 0.02, th = 0.01))
>
> Error in (function (formula, data = parent.frame(), start, control =
> nls.control(),  :
>
> Singular gradient
>
>
>
> Why? Did I do something wrong or misunderstand something?
>
>
>
> Later, I found another measure from Modern Applied Statistics with S (pp.
> 216-217):
>
>
>
>> negexp <- selfStart(model = ~ exp(b0*exp(b1*x^th)),initial =
> negexp.SSival, parameters = c("b0", "b1", "th"),
>
> + template = function(x, b0, b1, th) {})
>
>> Disperse.ss <- nls(Retention ~ negexp(Area, b0, b1, th),data = cl, trace =
> T)
>
>  b0  b1  th
>
>4.208763  144.205455 1035.324595
>
> Error in qr.default(.swts * attr(rhs, "gradient")) :
>
>  NA/NaN/Inf (arg1) can not be called when the external function is called.
>
>
>
> Error happened again. How can I fix it? I am desperate.
>
>
>
> Best regards,
>
>
>
> Pinglei Gao
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm() silently drops NAs

2016-07-26 Thread Andrew Robinson
Agh.  I've argued elsewhere that the default behaviour should be to
fail, and the user should take the responsibility to explicitly handle
the missing values, even if that simply be by changing the argument.
Probably Peter and I have different experiences with the completeness
of datasets, but anything that encourages me not to think about what
I'm doing in that realm seems like a bad idea.

Best wishes,

Andrew

On 27 July 2016 at 07:30, peter dalgaard  wrote:
>
>> On 26 Jul 2016, at 22:26 , Hadley Wickham  wrote:
>>
>> On Tue, Jul 26, 2016 at 3:24 AM, Martin Maechler
>>  wrote:
>>>
> ...
>> To me, this would be the most sensible default behaviour, but I
>> realise it's too late to change without breaking many existing
>> expectations.
>
> Probably.
>
> Re. the default choice, my recollection is that at the time the only choices 
> available were na.omit and na.fail. S-PLUS was using na.fail for all the 
> usual good reasons, but from a practical perspective, the consequence was 
> that, since almost every data set has NA values,  you got an error unless you 
> added na.action=na.omit to every single lm() call. And habitually typing 
> na.action=na.omit doesn't really solve any of the issues with different 
> models being fit to different subsets and all that. So the rationale for 
> doing it differently in R was that it was better to get some probably 
> meaningful output rather than to be certain of getting nothing. And, that was 
> what the mainstream packages of the time were doing.
>
>> On a related note, I've never really understood why it's called
>> na.exclude - from my perspective it causes the _inclusion_ of missing
>> values in the predictions/residuals.
>
> I think the notion is that you exclude them from the analysis, but keep them 
> around for the other purposes.
>
> -pd
>
>>
>> Thanks for the (as always!) informative response, Martin.
>>
>> Hadley
>>
>> --
>> http://hadley.nz
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about lme syntax

2015-11-23 Thread Andrew Robinson
Hi Angelo,

it's dangerous to fit a model that includes interaction effects but omits
main effects.  Among other things, what can happen is that the statistical
tests become scale dependent, which is most unattractive.

I think that you should include the main effects in your model, even as
nuisance variables, and test the interaction using the model that includes
them.

BTW, your question might better be located with the mixed-effects models
special interest group.

https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Best wishes

Andrew

On Mon, Nov 23, 2015 at 9:19 PM, angelo.arc...@virgilio.it <
angelo.arc...@virgilio.it> wrote:

> Dear list,
> I need an help to understand the syntax of lme to fit my model according
> to the analysis I want to perform.
>
> My dependent variable resulted from a perceptual experiment in which
> responses of participants were measured twice for each provided stimulus.
> My goal is to verify whether the responses depend on two properties of the
> participants that I know to be related to each other (height and weight, so
> they need to be considered together as an interaction). More importantly, I
> need to understand how this relationship modulates according to the type of
> stimulus participants were presented to.
>
> Based on my understanding of lme syntax, the formula I have to use should
> be the following (because I am only interested in the interaction factor of
> Weight and Height)
>
> lme_dv <- lme(dv ~ Weight:Height:Stimulus_Type, data = scrd, random = ~ 1
> | Subject)
>
> Am I correct?
>
>
> Thank you in advance
>
> Best regards
>
> Angelo
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and StatisticsFax: +61-3-8344
4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R wont accept my zero count values in the GLM with quasi_poisson dsitribution

2015-07-28 Thread Andrew Robinson
You have selected the binomial family in the call to glm.  You should
instead try something like

 family=quasipoisson(link = "log")

I hope this helps

Andrew

On Tue, Jul 28, 2015 at 4:33 PM, Charlotte <
charlotte.hu...@griffithuni.edu.au> wrote:

> Hello
>
> I have count values for abundance which follow a pattern of over-dispersal
> with many zero values.  I have read a number of documents which suggest
> that
> I don't use data transforming methods but rather than I run the GLM with
> the
> quasi poisson distribution.  So I have written my script and R is telling
> me
> that Y should be more than 0.
>
> Everything I read tells me to do it this way but I can't get R to agree.
> Did I need to add something else to my script to get it to work and keep my
> data untransformed? The script I wrote is as follows:
>
> > fit <- glm(abundance~Gender,data=teminfest,family=binomial())
>
> then I get this error
> Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1
>
> I don't use R a lot so I am having trouble figuring out what to do next.
>
> I would appreciate some help
>
> Many Thanks
> Charlotte
>
>
>
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/R-wont-accept-my-zero-count-values-in-the-GLM-with-quasi-poisson-dsitribution-tp4710462.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and StatisticsFax: +61-3-8344
4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with nonlinear least squares regression curve fitting

2015-02-25 Thread Andrew Robinson
Finding starting values is a bit of a dark art.  That said, there are steps
you can take, but it may take time.

First, I would scale Year so that it's not in the thousands! Experiment
with subtracting 1980 or so.  For specific advice, see inline.

On Thu, Feb 26, 2015 at 3:03 AM, Corey Callaghan 
wrote:

> The curves' functions that I want to test are in the code here (hopefully
> correctly):
>
> Inverse Quadratic Curve:
> fitmodel <- nls(Area ~ (-a*Year)*(Year + b), data = df, start=list(a=??,
> b=??, c=??))
>

I would plot the data and a smooth spline, differentiate the curve
function, identify some parameter values somewhere stable, and estimate
some values by eye, or even predict them from the first derivative of the
spline - spline.smooth will do this.

Sigmodial Curve:
> fitmodel <- nls(Area~a/(1+exp(-(b+c*Year))), data=df, start=list(a=???,
> b=???, c=??))
>

I'd use the highest value as a, fit spline as above then invert area at two
times to get b and c.

Double sigmoidal Curve:
> fitmodel <- nls(Area~a+2b(1/(1+exp(-abs(-c*Year+d)))-1/2)*sign(-c*Year+d),
> data=df, start=list(a=???, b=???, c=???)
>

 I'd use min(Area) as a, figure out b from the maximum (I guess 2b+a is the
asymptote), and experiment with two values for year to retrieve c and d
 uniroot might help?

Cheers

Andrew

-- 
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics  Tel: (+61) 0403 138 955
School of Mathematics and StatisticsFax: +61-3-8344
4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error in "ca.jo"

2013-12-25 Thread Andrew Robinson
I respectfully disagree.  Taken out of context, this would be simply snark.
Indeed, even in context, ...

Cheers,

Andrew


On Thu, Dec 26, 2013 at 8:38 AM, Rolf Turner wrote:

> On 25/12/13 06:23, Jeff Newmiller wrote:
>
> 
>
>> In short, if you ask us why one divided by zero doesn't give 3, we have
>> to wonder if you don't belong in some other educational forum.
>>
> 
>
> Fortune!
>
> cheers,
>
> Rolf Turner
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Andrew Robinson
Deputy Director, CEBRA
Senior Lecturer in Applied Statistics  Tel:
+61-3-8344-6410
Department of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.auWebsite: http://www.ms.unimelb.edu.au

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] data distribution for lme

2013-12-10 Thread Andrew Robinson
No - it is assumed to be conditionally normal, that is, normal conditional
on the model.  So you should be looking at the distributions of the
residuals rather than of the response variable, as an indicator for whether
or not the model assumptions are satisfied.   Skewness in the residuals may
or may not affect the outcome; it depends on what the purpose of the model
is.  Skewness in residuals may arise from a number of different problems
with the model.

I hope that this helps,

Andrew




On Wed, Dec 11, 2013 at 1:55 AM, peyman  wrote:

> Hi folks,
>
> I am using the lme package of R, and am wondering if it is assumed that
> the dependent factor (what we fit for; y in many relevant texts) has to
> have a normal Gaussian distribution? Is there any margins where some
> skewness in the data is accepted and how within R itself one could check
> distribution of the data?
>
> Thanks,
> Peyman
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Andrew Robinson
Deputy Director, CEBRA
Senior Lecturer in Applied Statistics  Tel:
+61-3-8344-6410
Department of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.auWebsite: http://www.ms.unimelb.edu.au

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bootstrapping respecting subject level information

2013-07-04 Thread Andrew Robinson
Josh's comment prompted me to check mty go-to reference. Davison and
Hinckley (1997 Section 3.8) recommend sampling the Subjects, but not
within the Subjects.

Cheers

Andrew

On Thu, Jul 04, 2013 at 05:53:58PM -0700, Joshua Wiley wrote:
> Hi,
> 
> It is not the easiest to follow code, but when I was working at UCLA,
> I wrote a page demonstrating a multilevel bootstrap, where I use a two
> stage sampler, (re)sampling at each level.  In your case, could be
> first draw subjects, then draw observations within subjects.  A strata
> only option does not resample all sources of variability, which are:
> 
> 1) which subjects you get and
> 2) which observations within those
> 
> The page is here: http://www.ats.ucla.edu/stat/r/dae/melogit.htm
> 
> As a side note, it demonstrates a mixed effects model in R, although
> as I mentioned it is not geared for beginners.
> 
> Cheers,
> 
> Josh
> 
> 
> 
> On Wed, Jul 3, 2013 at 7:19 PM, Sol Lago  wrote:
> > Hi there,
> >
> > This is the first time I use this forum, and I want to say from the start I 
> > am not a skilled programmer. So please let me know if the question or code 
> > were unclear!
> >
> > I am trying to bootstrap an interaction (that is my test statistic) using 
> > the package "boot". My problem is that for every resample, I would like the 
> > randomization to be done within subjects, so that observations from 
> > different subjects are not mixed. Here is the code to generate a dataframe 
> > similar to mine:
> >
> > Subject = rep(c("S1","S2","S3","S4"),4)
> > Num = rep(c("singular","plural"),8)
> > Gram= rep(c("gram","gram","ungram","ungram"),4)
> > RT  = 
> > c(657,775,678,895,887,235,645,916,930,768,890,1016,590,978,450,920)
> > data= data.frame(Subject,Num,Gram,RT)
> >
> > This is the code I used to get the empirical interaction value:
> >
> > summary(lm(RT ~ Num*Gram, data=data))
> >
> > As you can see, the interaction between my two factors is -348. I want to 
> > get a bootstrap confidence interval for this statistic, which I can 
> > generate using the "boot" package:
> >
> > #Function to create the statistic to be boostrapped
> > boot.huber <- function(data, indices) {
> > data <- data[indices, ] #select obs. in bootstrap sample
> > mod <- lm(RT ~ Num*Gram, data=data)
> > coefficients(mod)   #return coefficient vector
> > }
> >
> > #Generate bootstrap estimate
> > data.boot <- boot(data, boot.huber, 1999)
> >
> > #Get confidence interval
> > boot.ci(data.boot, index=4, type=c("norm", "perc", "bca"),conf=0.95) #4 
> > gets the CI for the interaction
> >
> > My problem is that I think the resamples should be generated without mixing 
> > the individual subjects observations: that is, to generate the new 
> > resamples, the observations from subject 1 (S1) should be shuffled within 
> > subject 1, not mixing them with the observations from subjects 2, etc... I 
> > don't know how "boot" is doing the resampling (I read the documentation but 
> > don't understand how the function is doing it)
> >
> > Does anyone know how I could make sure that the resampling procedure used 
> > by "boot" respects the subject level information?
> >
> > Thanks a lot for your help/advice!
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
> -- 
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://joshuawiley.com/
> Senior Analyst - Elkhart Group Ltd.
> http://elkhartgroup.com
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Deputy Director, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Methods of Statistical Model Estimation (CRC, 2013)
http://www.crcpress.com/product/isbn/9781439858028
Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bootstrapping respecting subject level information

2013-07-04 Thread Andrew Robinson
I think that in the case of a 2*2 balanced, replicated design such as
this one, interpreting the interaction should be safe.

Cheers

Andrew

On Fri, Jul 5, 2013 at 9:38 AM, David Winsemius  wrote:
>
> On Jul 3, 2013, at 7:19 PM, Sol Lago wrote:
>
>> Hi there,
>>
>> This is the first time I use this forum, and I want to say from the start I 
>> am not a skilled programmer. So please let me know if the question or code 
>> were unclear!
>>
>> I am trying to bootstrap an interaction (that is my test statistic) using 
>> the package "boot". My problem is that for every resample, I would like the 
>> randomization to be done within subjects, so that observations from 
>> different subjects are not mixed. Here is the code to generate a dataframe 
>> similar to mine:
>>
>> Subject = rep(c("S1","S2","S3","S4"),4)
>> Num = rep(c("singular","plural"),8)
>> Gram= rep(c("gram","gram","ungram","ungram"),4)
>> RT  = c(657,775,678,895,887,235,645,916,930,768,890,1016,590,978,450,920)
>> data= data.frame(Subject,Num,Gram,RT)
>>
>> This is the code I used to get the empirical interaction value:
>>
>> summary(lm(RT ~ Num*Gram, data=data))
>>
>> As you can see, the interaction between my two factors is -348.
>
> That depends on what you mean by "the interaction between my two factors". It 
> is almost never a good idea to attempt interpretation of interaction 
> coefficients, and is always preferable to check the predictions of hte model.
>
>> I want to get a bootstrap confidence interval for this statistic, which I 
>> can generate using the "boot" package:
>>
>> #Function to create the statistic to be boostrapped
>> boot.huber <- function(data, indices) {
>> data <- data[indices, ] #select obs. in bootstrap sample
>> mod <- lm(RT ~ Num*Gram, data=data)
>> coefficients(mod)   #return coefficient vector
>> }
>>
>> #Generate bootstrap estimate
>> data.boot <- boot(data, boot.huber, 1999)
>>
>> #Get confidence interval
>> boot.ci(data.boot, index=4, type=c("norm", "perc", "bca"),conf=0.95) #4 gets 
>> the CI for the interaction
>>
>> My problem is that I think the resamples should be generated without mixing 
>> the individual subjects observations: that is, to generate the new 
>> resamples, the observations from subject 1 (S1) should be shuffled within 
>> subject 1,
>
> What does that mean?
>
>> not mixing them with the observations from subjects 2, etc... I don't know 
>> how "boot" is doing the resampling (I read the documentation but don't 
>> understand how the function is doing it)
>
> It's doing it by selecting randomly entire rows. It is not "shuffling within 
> rows" for selected subjects.
>>
>> Does anyone know how I could make sure that the resampling procedure used by 
>> "boot" respects the subject level information?
>
> It would be doing so because that is the way you set up the indexing. The 
> column ordering tof the data within subjects is not permuted.
>
> I do think you are beyond your understanding of the statstical principles 
> that you are attempting to use and would be safer to consult with a 
> statsitician.
>
>
> --
> David Winsemius
> Alameda, CA, USA
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Andrew Robinson
Deputy Director, CEBRA
Senior Lecturer in Applied Statistics  Tel: +61-3-8344-6410
Department of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.auWebsite: http://www.ms.unimelb.edu.au

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bootstrapping respecting subject level information

2013-07-04 Thread Andrew Robinson
I'd like to preface this answer by suggesting that if you have multiple
measurements within subjects then you should possibly be thinking about
using mixed-effects models.  Here you have a balanced design and seem
to be thinking about a constrained bootstrap, but I don't know whether
the resulting distribution will have the right properties - others on
the list may.

That said, I think that this needs the 'strata' argument of the boot function.
See the help.  Something like

data.boot <- boot(data, boot.huber, 1999, strata = Subject)

(not tested)

Cheers

Andrew

On Thu, Jul 4, 2013 at 12:19 PM, Sol Lago  wrote:
> Hi there,
>
> This is the first time I use this forum, and I want to say from the start I 
> am not a skilled programmer. So please let me know if the question or code 
> were unclear!
>
> I am trying to bootstrap an interaction (that is my test statistic) using the 
> package "boot". My problem is that for every resample, I would like the 
> randomization to be done within subjects, so that observations from different 
> subjects are not mixed. Here is the code to generate a dataframe similar to 
> mine:
>
> Subject = rep(c("S1","S2","S3","S4"),4)
> Num = rep(c("singular","plural"),8)
> Gram= rep(c("gram","gram","ungram","ungram"),4)
> RT  = c(657,775,678,895,887,235,645,916,930,768,890,1016,590,978,450,920)
> data= data.frame(Subject,Num,Gram,RT)
>
> This is the code I used to get the empirical interaction value:
>
> summary(lm(RT ~ Num*Gram, data=data))
>
> As you can see, the interaction between my two factors is -348. I want to get 
> a bootstrap confidence interval for this statistic, which I can generate 
> using the "boot" package:
>
> #Function to create the statistic to be boostrapped
> boot.huber <- function(data, indices) {
> data <- data[indices, ] #select obs. in bootstrap sample
> mod <- lm(RT ~ Num*Gram, data=data)
> coefficients(mod)   #return coefficient vector
> }
>
> #Generate bootstrap estimate
> data.boot <- boot(data, boot.huber, 1999)
>
> #Get confidence interval
> boot.ci(data.boot, index=4, type=c("norm", "perc", "bca"),conf=0.95) #4 gets 
> the CI for the interaction
>
> My problem is that I think the resamples should be generated without mixing 
> the individual subjects observations: that is, to generate the new resamples, 
> the observations from subject 1 (S1) should be shuffled within subject 1, not 
> mixing them with the observations from subjects 2, etc... I don't know how 
> "boot" is doing the resampling (I read the documentation but don't understand 
> how the function is doing it)
>
> Does anyone know how I could make sure that the resampling procedure used by 
> "boot" respects the subject level information?
>
> Thanks a lot for your help/advice!
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Andrew Robinson
Deputy Director, CEBRA
Senior Lecturer in Applied Statistics  Tel: +61-3-8344-6410
Department of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.auWebsite: http://www.ms.unimelb.edu.au

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why "object 'x' not found"?

2013-02-07 Thread Andrew Robinson
Thanks, Bill.

Andrew

On Thu, Feb 07, 2013 at 09:21:53PM +, William Dunlap wrote:
> >> Notice that the '[[' function is superior in every way to the '$' 
> >> function. 
> > I'm curious to know whether you can point to some consolidated comparison?
> 
> Two problems are that List$name allows abbreviations of name and does not 
> allow
> variable names.
> 
> With List$name you get:
>   > List <- list(Eleven=11, Twelve="xii")
>   > str(List)
>   List of 2
>$ Eleven: num 11
>$ Twelve: chr "xii"
>   > List$E # $ accepts abbreviations, which might seem nice
>   [1] 11
>   > List$E <- List$E + 100 # but can bite you in replacements
>   > str(List) # Eleven not incremented, new E created
>   List of 3
>$ Eleven: num 11
>$ Twelve: chr "xii"
>$ E : num 111
>   > nm <- "Twelve" # want to use component named variable
>   > List$nm  # no component named "nm"
>   NULL
>   > eval(parse(text=paste0("List", "$", nm))) # ugly dangerous hack
>   [1] "xii"
>   > eval(call("$", quote(List), nm)) # ugly safe hack
>   [1] "xii"
> 
> With List[["name"]] syntax you get
>   > List <- list(Eleven=11, Twelve="xii")
>   > List[["E"]]
>   NULL
>   > List[["E"]] <- List[["E"]] + 100
>   > str(List)
>   List of 3
>$ Eleven: num 11
>$ Twelve: chr "xii"
>$ E : num(0) 
>   > List[["Eleven"]] <- List[["Eleven"]] + 100
>   > str(List)
>   List of 3
>$ Eleven: num 111
>$ Twelve: chr "xii"
>$ E : num(0) 
>   > nm <- "Twelve"
>   > List[[nm]]
>   [1] "xii"
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> > Behalf
> > Of Andrew Robinson
> > Sent: Thursday, February 07, 2013 12:52 PM
> > To: David Winsemius
> > Cc: r-help
> > Subject: Re: [R] why "object 'x' not found"?
> > 
> > Hi David,
> > 
> > The following is an interesting observation ...
> > 
> > On Friday, February 8, 2013, David Winsemius wrote:
> > 
> > >
> > > On Feb 7, 2013, at 10:20 AM, Winfried Moser wrote:
> > > 
> > >
> > Notice that the '[[' function is superior in every way to the '$' function.
> > >
> > 
> > I'm curious to know whether you can point to some consolidated comparison?
> > 
> > Best wishes,
> > 
> > Andrew
> > 
> > 
> > --
> > Andrew Robinson
> > Director (A/g), ACERA
> > Department of Mathematics and StatisticsTel: +61-3-8344-6410
> > University of Melbourne, VIC 3010 Australia   (prefer email)
> > http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
> > http://www.acera.unimelb.edu.au/
> > 
> > FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
> > SPuR: http://www.ms.unimelb.edu.au/spuRs/
> > 
> > [[alternative HTML version deleted]]
> > 
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Director (A/g), ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why "object 'x' not found"?

2013-02-07 Thread Andrew Robinson
Hi David,

The following is an interesting observation ...

On Friday, February 8, 2013, David Winsemius wrote:

>
> On Feb 7, 2013, at 10:20 AM, Winfried Moser wrote:
> 
>
Notice that the '[[' function is superior in every way to the '$' function.
>

I'm curious to know whether you can point to some consolidated comparison?

Best wishes,

Andrew


-- 
Andrew Robinson
Director (A/g), ACERA
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain the model/equation at each level automatically in a regression model with a few factors

2013-02-06 Thread Andrew Robinson
Hi Chunlin,

You have to write your own myCmatrix.

Cheers

Andrew


On Wednesday, February 6, 2013, chunlin liu wrote:

> Thanks, Andrew.
>
>
>
> I studied the function "estimable" in package "gmodels"
>
> and realized that it could combine the model coefficients together
>
> if a correct cm matrix is provided. For example, if
>
>
>
> mylm <- lm(y ~ x1 + factor(x2) + x3*factor(x4), mydata)
>
>
>
> Assume that factors x2 and x4 have 6 and 10 levels respectively.
>
> If a correct cm matrix myCmatrix is provided, then
>
>
>
> estimable(mylm, myCmatrix)
>
>
>
> produces all coefficients for the equations/models at all levels.
>
> As I do not know how lm is computed,
>
> does anyone know how to obtain myCmatrix from mylm object?
> Thanks,
>
> -chunlin
>
> On Tue, Feb 5, 2013 at 1:56 AM, Andrew Robinson 
>  'mensuration...@gmail.com');>
> > wrote:
>
>> I think that the excellent estimable function from gmodels should help
>> you.
>>
>> Cheers
>>
>> Andrew
>>
>>
>> On Tuesday, February 5, 2013, chunlin liu wrote:
>>
>>>  I am wondering how to obtain the model/equation at each level
>>> automatically
>>> in a regression model with a few factors
>>> without looking at summary of the lm model. For example, consider
>>>
>>> lm.factors <- lm(y ~ x1 + factor(x2)*factor(x3)+x4*factor(x5))
>>>
>>> The coefficients of lm.factors in summary(lm.factors) might be
>>> complicated.
>>> I would like to have the equation at each level from lm.factor.
>>> Could you please let me know how to obtain the equation of lm.factors at
>>> each level automatically (not looking at summary(lm.factors))?
>>> Thanks,
>>>
>>> -Chunlin
>>>
>>>     [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>> --
>> Andrew Robinson
>> Director (A/g), ACERA
>> Department of Mathematics and StatisticsTel: +61-3-8344-6410
>> University of Melbourne, VIC 3010 Australia   (prefer email)
>> http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
>> http://www.acera.unimelb.edu.au/
>>
>> FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
>> SPuR: http://www.ms.unimelb.edu.au/spuRs/
>>
>
>

-- 
Andrew Robinson
Director (A/g), ACERA
Senior Lecturer in Applied Statistics  Tel:
+61-3-8344-6410
Department of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.auWebsite: http://www.ms.unimelb.edu.au

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to obtain the model/equation at each level automatically in a regression model with a few factors

2013-02-05 Thread Andrew Robinson
I think that the excellent estimable function from gmodels should help you.

Cheers

Andrew

On Tuesday, February 5, 2013, chunlin liu wrote:

> I am wondering how to obtain the model/equation at each level automatically
> in a regression model with a few factors
> without looking at summary of the lm model. For example, consider
>
> lm.factors <- lm(y ~ x1 + factor(x2)*factor(x3)+x4*factor(x5))
>
> The coefficients of lm.factors in summary(lm.factors) might be complicated.
> I would like to have the equation at each level from lm.factor.
> Could you please let me know how to obtain the equation of lm.factors at
> each level automatically (not looking at summary(lm.factors))?
> Thanks,
>
> -Chunlin
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org  mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Andrew Robinson
Director (A/g), ACERA
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Script for conditional sums of vectors

2013-02-04 Thread Andrew Robinson
Hi Ben,

let me suggest some background reading - Peter Dalgaard's  or Phil
Spector's book will set you up with what you need.  You can also read one
of the many free, contributed sets of notes kept on CRAN.

I hope that this helps

Andrew


On Mon, Feb 4, 2013 at 8:29 PM, Benjamin Gillespie wrote:

> Hi guys,
>
> I hope you can help me with this (probably) simple query:
>
> I have a data frame:
>
> --
>
> a=c(1,1,1,1,1,1,2,2,2,2,2,2)
> b=c(1,1,1,2,3,4,1,1,2,2,3,4)
> c=c(400,200,300,100,500,300,200,100,500,400,200,100)
>
>
> data=data.frame(a=a,b=b,c=c)
>
> --
>
> And I would like to get the following output:
>
> --
>
> b
> a   1   2   3   4
> 1   900 100 500 300
> 2   300 900 200 100
>
> --
>
> The values in the output represent the sum of values "c" in data frame
> "data", for each "a" and "b" combination.
>
> For example, where "a" = 1 and "b" = 1, the output is 400+200+300 = 900.
>
> Please would anyone be able to provide a script to create my desired
> output?
>
> Many thanks in advance,
>
> Ben Gillespie
> Research Postgraduate
>
> School of Geography
> University of Leeds
> Leeds
> LS2 9JT
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Andrew Robinson
Director (A/g), ACERA
Senior Lecturer in Applied Statistics  Tel:
+61-3-8344-6410
Department of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.auWebsite: http://www.ms.unimelb.edu.au

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting from dataframes

2013-01-17 Thread Andrew Robinson
It's not really clear to me what you mean when you say that you want to
plot the hours, so it's hard to help.  Regardless, take a look at looping
and plotting in any of the free documentation on CRAN.

http://cran.r-project.org/other-docs.html

I hope that this helps,

Andrew


On Fri, Jan 18, 2013 at 2:21 AM, condor  wrote:

> thanks to your guys help I am closer to solving my problem but I have some
> small problem. So let's say I start with
>
> >data
> number day  hour
> 1   17  10
> 2   17  11
> 3   17  6
> 4   18  4
> 5   18  10
> 6   19  8
> 7   19  8
>
> I want to split to odd days, which I am able to do, I call this object
> frames, which looks like:
>
> > frames
> $`1`
>   c1 day1 hour1
> 1  1   1710
> 2  2   1711
> 3  3   17 6
>
> $`2`
>   c1 day1 hour1
> 4  6   19 8
> 5  7   19 8
>
> Now I want to make plot of the hours of both days, but not by hand. I need
> some sort of loop for this. How is this done?
>
> So
> par(mfrow=c(1,2))
> for(…) plot(…hours…)
>
> thanks for the help
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/plotting-from-dataframes-tp4655851.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Andrew Robinson
Director (A/g), ACERA
Senior Lecturer in Applied Statistics  Tel:
+61-3-8344-6410
Department of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.auWebsite: http://www.ms.unimelb.edu.au

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Explore patterns with GAM

2013-01-17 Thread Andrew Robinson
Hi Spyros,

I suggest that you borrow / buy the book that was written by the author of
that package, and study it. It's "Generalized Additive Models: An
Introduction with R". There's a lot of stuff going on in GAM fitting that
it would be worth paying close attention to.

I hope that this helps,

Andrew


On Fri, Jan 18, 2013 at 12:45 AM, spyros  wrote:

> Dear all,
> I new to r and I would like your help.
> I want to explore the patterns (unimodal, monotonically
> increased/decreased)
> of species richness~altitude using GAM in R. Although I run the gam
> function
> in mgcv package I do not know how to manually define knots and degrees of
> freedom.
> Any help would be greatly appreciated.
> Spyros
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Explore-patterns-with-GAM-tp4655838.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Andrew Robinson
Director (A/g), ACERA
Senior Lecturer in Applied Statistics  Tel:
+61-3-8344-6410
Department of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.auWebsite: http://www.ms.unimelb.edu.au

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] create block diagonal with each rows

2013-01-16 Thread Andrew Robinson
Hi Kathryn,

take a look at the kronecker function.

Cheers

Andrew


On Thu, Jan 17, 2013 at 1:11 PM, Kathryn Lord wrote:

> Dear R users,
>
> I'd like to create a block diagonal matrix with each rows in a matrix.
>
> Here is a simple example. (In fact, the matrix is big)
>
>
> x <- matrix(1:20, 4,5)
>
> > x
>  [,1] [,2] [,3] [,4] [,5]
> [1,]159   13   17
> [2,]26   10   14   18
> [3,]37   11   15   19
> [4,]48   12   16   20
>
>
> With each rows in matrix x, I'd like to make the matrix below.
>
>
>  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9][,10]
> [,11][,12][,13][,14][,15][,16][,17][,18][,19][,20]
> [1,]159   13   17   00000 000
> 00 00000
> [2,]0000026   10   14   18   000
> 00 00000
> [3,]0000000000 37   11
> 15   19   00000
> [4,]0000000000 000
> 00 48   12   16   20
>
>
> Any suggestion will be greatly appreciated.
>
> Best,
>
> Kathryn Lord
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Andrew Robinson
Director (A/g), ACERA
Senior Lecturer in Applied Statistics  Tel:
+61-3-8344-6410
Department of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.auWebsite: http://www.ms.unimelb.edu.au

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Forest Error for Factor to Character column

2013-01-14 Thread Andrew Robinson
After you subset the data, did you redeclare the factor? If not then R
still thinks it has the potential for all those levels.

TRAINSET$JOBTITLE <- factor(TRAINSET$JOBTITLE)

I hope this helps

Andrew

On Tuesday, January 15, 2013, Lopez, Dan wrote:

> Hi,
>
> Can someone please offer me some guidance?
>
> I imported some data. One of the columns called "JOBTITLE" when imported
> was imported as a factor column with 416 levels.
>
> I subset the data in such a way that only 4 levels have data in "JOBTITLE"
> and tried running randomForest but it complained about "JOBTITLE" having
> more than 32 categories. I know that is the limit in randomForest but I
> guess I don't understand enough about factors because I thought by
> subsetting the data this no longer would be an issue. BTW I can run
> randomForest on this dataset if I exclude "JOBTITLE".
>
> So  I then converted that column to a character vector:
> > TRAINSET$JOBTITLE<-as.character(TRAINSET$JOBTITLE)
>
> I ran Random Forest and got the below error. Why isn't this working? What
> do I need to do to get this working?
>
> > library(randomForest)
> > FOREST_model <- randomForest(as.factor(TARGET)~., data=trainset, mtry=4,
> ntree=1000,
> +importance=TRUE, do.trace=100)
>
> Error in randomForest.default(m, y, ...) :
>   NA/NaN/Inf in foreign function call (arg 1)
> In addition: Warning message:
> In data.matrix(x) : NAs introduced by coercion
>
> Your help will be greatly appreciated.
>
> Dan
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org  mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Andrew Robinson
Director (A/g), ACERA
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Off-topic: (Simple?) Random Sampling when n is a random variable

2011-06-14 Thread Andrew Robinson
Thanks Greg!

Andrew

On Tue, Jun 14, 2011 at 04:13:52PM -0600, Greg Snow wrote:
> This sounds like what is called "domains" in survey sampling (possibly other 
> names, but that is what I learned it as).  The idea is that you take a random 
> sample (or the population) then ask a question to determine which domain the 
> subject is in, then ask the question of interest in the domain of interest.  
> For example you want to know how long tourists plan to stay in the area so 
> you go to the airport and ask N people if they are tourists, if they answer 
> 'yes' then you ask how long they will be staying.  The sample size of 
> tourists n (which is <=N) is random and not know ahead.  
> 
> This is the same idea as you flipping a coin instead of asking the 1st 
> question.  And yes, the randomness of n does change the formulas needed.  
> Consult a survey sampling text for details (I am looking at the one by Lohr 
> which has a section on this).
> 
> -- 
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
> 
> 
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> > project.org] On Behalf Of Andrew Robinson
> > Sent: Monday, June 13, 2011 7:03 PM
> > To: R-Help Discussion
> > Subject: [R] Off-topic: (Simple?) Random Sampling when n is a random
> > variable
> > 
> > Hi everyone,
> > 
> > I'm involved in a discussion with a colleague.  He suggested a sample
> > design for a finite-sized process that (to all intents and purposes)
> > involves tossing a coin and examining the unit if the coin shows
> > Heads.
> > 
> > I should emphasize that we're both approaching the problem from a
> > design-based sampling theory point of view.  So I have no argument
> > about the appropriateness of the design as such.
> > 
> > Can this design be called 'Simple Random Sampling'?  My intuition
> > suggests that it can not, because the sample size is a random
> > variable, so the usual standard error equations for SRS will be
> > inaccurate.  But I can't find any citations to back me up.  So maybe
> > I'm wrong.  My questions are:
> > 
> > 1) does this design have a name, and
> > 
> > 2) are the usual SRS formula for e.g. the standard error of the mean
> > exactly accurate?  Or are they defensibly accurate approximations?
> > 
> > 3) can anyone suggest some citations that provide guidance either way?
> > 
> > Thanks for any assistance!
> > 
> > Andrew
> > 
> > --
> > Andrew Robinson
> > Program Manager, ACERA
> > Department of Mathematics and StatisticsTel: +61-3-8344-
> > 6410
> > University of Melbourne, VIC 3010 Australia   (prefer
> > email)
> > http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-
> > 4599
> > http://www.acera.unimelb.edu.au/
> > 
> > Forest Analytics with R (Springer, 2011)
> > http://www.ms.unimelb.edu.au/FAwR/
> > Introduction to Scientific Programming and Simulation using R (CRC,
> > 2009):
> > http://www.ms.unimelb.edu.au/spuRs/
> > 
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> > guide.html
> > and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Off-topic: (Simple?) Random Sampling when n is a random variable

2011-06-14 Thread Andrew Robinson
On Tue, Jun 14, 2011 at 11:02:52AM +1000, Andrew Robinson wrote:
> Hi everyone,
> 
> I'm involved in a discussion with a colleague.  He suggested a sample
> design for a finite-sized process that (to all intents and purposes)
> involves tossing a coin and examining the unit if the coin shows
> Heads.
> 
> I should emphasize that we're both approaching the problem from a
> design-based sampling theory point of view.  So I have no argument
> about the appropriateness of the design as such.
> 
> Can this design be called 'Simple Random Sampling'?  My intuition
> suggests that it can not, because the sample size is a random
> variable, so the usual standard error equations for SRS will be
> inaccurate.  But I can't find any citations to back me up.  So maybe
> I'm wrong.  My questions are:
> 
> 1) does this design have a name, and

Bernoulli sampling.

> 2) are the usual SRS formula for e.g. the standard error of the mean
> exactly accurate?  Or are they defensibly accurate approximations?

Not exact.  Can be approximately ok.  See 'Estimation of a Population
Total Under a "Bernoulli Sampling" Procedure' Strand 1979 American
Statistician 33 (2) 81-84.

See also Sarndal et al 'Model Assisted Survey Sampling'.

> 3) can anyone suggest some citations that provide guidance either way?

As above!

Best wishes to all

Andrew

> Thanks for any assistance!
> 
> Andrew
> 
> -- 
> Andrew Robinson  
> Program Manager, ACERA 
> Department of Mathematics and StatisticsTel: +61-3-8344-6410
> University of Melbourne, VIC 3010 Australia   (prefer email)
> http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
> http://www.acera.unimelb.edu.au/
> 
> Forest Analytics with R (Springer, 2011) 
> http://www.ms.unimelb.edu.au/FAwR/
> Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
> http://www.ms.unimelb.edu.au/spuRs/
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Off-topic: (Simple?) Random Sampling when n is a random variable

2011-06-13 Thread Andrew Robinson
Hi everyone,

I'm involved in a discussion with a colleague.  He suggested a sample
design for a finite-sized process that (to all intents and purposes)
involves tossing a coin and examining the unit if the coin shows
Heads.

I should emphasize that we're both approaching the problem from a
design-based sampling theory point of view.  So I have no argument
about the appropriateness of the design as such.

Can this design be called 'Simple Random Sampling'?  My intuition
suggests that it can not, because the sample size is a random
variable, so the usual standard error equations for SRS will be
inaccurate.  But I can't find any citations to back me up.  So maybe
I'm wrong.  My questions are:

1) does this design have a name, and

2) are the usual SRS formula for e.g. the standard error of the mean
exactly accurate?  Or are they defensibly accurate approximations?

3) can anyone suggest some citations that provide guidance either way?

Thanks for any assistance!

Andrew

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gsub() issue...

2011-05-17 Thread Andrew Robinson
Hi Charles,

It's not clear to me what you mean by "doesn't work".

> test <- "Interesting 1\nPoint\n"
> cat(test)
Interesting 1
Point
> test1 <- gsub("ing 1\nP","ing 3\nP", test)
> cat(test1)
Interesting 3
Point
> 

Cheers

Andrew

On Tue, May 17, 2011 at 10:45:31AM +0200, Thibault Charles wrote:
> Hello R helpers,
> 
>  
> 
> I get a problem using gsub() function.
> 
>  
> 
> I have the following text :
> 
>  
> 
> text <- ?? INFILTRATION INF_BASE 
> 
>  AIRCHANGE=1 ??
> 
>  
> 
> Then my code is :
> 
>  
> 
> original <- "INFILTRATION INF_BASE \n AIRCHANGE=1"
> 
>  
> 
> replace <- "INFILTRATION INF_BASE \n AIRCHANGE=3"
> 
>  
> 
> new_texte <- gsub(original,replace,text)
> 
>  
> 
> but it doesn?t work.
> 
>  
> 
> Nevertheless, cat(original) works but print(original) doesn?t?
> 
>  
> 
> Would you have an idea ?
> 
>  
> 
> Thanks
> 
>  
> 
> Thibault Charles
> 
> Solamen
> 
> Audencia - 8 route de la Joneli?re
> 
> 44300 Nantes
> 
> +33 2 40 37 46 76
> 
>  
> 
> 
>   [[alternative HTML version deleted]]
> 

> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] functions pandit and treebase in the package apTreeshape

2011-05-07 Thread Andrew Robinson
Thanks!  As noted by an earlier poster, these functions seem to have
been removed from the package.  It might be worth contacting the
maintainer to ask if any equivalents are available ...

Cheers

Andrew

On Sat, May 07, 2011 at 09:56:49AM +0200, Arnau Mir wrote:
> El 07/05/11 09:49, Arnau Mir escribió:
> >El 06/05/11 01:33, Andrew Robinson escribió:
> >>Hi Arnau,
> >>
> >>please send the output of sessionInfo() and the exact commands and
> >>response that you used to install and load apTreeshape.
> >>
> >
> 
> Sorry, I forgot the commands.
> Here they are:
> 
> > library(apTreeshape)
> Loading required package: ape
> Loading required package: quantreg
> Loading required package: SparseM
> Package SparseM (0.89) loaded.
>To cite, see citation("SparseM")
> 
> 
> Attaching package: 'SparseM'
> 
> The following object(s) are masked from 'package:base':
> 
> backsolve
> 
> Package quantreg (4.67) loaded.
> To cite, see citation("quantreg")
> 
> > pandit(2)
> Error: no se pudo encontrar la función "pandit"
> (the function "pandit" cannot be found)
> > treebase(2)
> Error: no se pudo encontrar la función "treebase"
> (the function treebase cannot be found)
> > sessionInfo()
> 
> 
> Arnau.
> 
> >Here It is:
> >
> >> sessionInfo()
> >R version 2.12.1 (2010-12-16)
> >Platform: x86_64-pc-linux-gnu (64-bit)
> >
> >locale:
> > [1] LC_CTYPE=es_ES.UTF-8   LC_NUMERIC=C
> > [3] LC_TIME=es_ES.UTF-8LC_COLLATE=es_ES.UTF-8
> > [5] LC_MONETARY=C  LC_MESSAGES=es_ES.UTF-8
> > [7] LC_PAPER=es_ES.UTF-8   LC_NAME=C
> > [9] LC_ADDRESS=C   LC_TELEPHONE=C
> >[11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
> >
> >attached base packages:
> >[1] stats graphics  grDevices utils datasets  methods   base
> >
> >other attached packages:
> >[1] quantreg_4.67 SparseM_0.89  ape_2.5-3 
> >apTreeshape_1.4-3
> >
> >loaded via a namespace (and not attached):
> >[1] gee_4.13-14 grid_2.12.1 lattice_0.19-17 nlme_3.1-97
> >
> >
> >Arnau.
> >
> >>Cheers
> >>
> >>Andrew
> >>
> >>On Thu, May 05, 2011 at 04:42:58PM +0200, Arnau Mir wrote:
> >>>Hello.
> >>>
> >>>I'm trying to use the functions pandit and treebase. They are in the 
> >>>package apTreeshape. Once I've loaded the package, R responses:
> >>>
> >>>- no function pandit/treebase.
> >>>
> >>>Somebody knows why or what is the reason?
> >>>
> >>>
> >>>Thanks,
> >>>
> >>>Arnau.
> >>>
> >>>Arnau Mir Torres
> >>>Edifici A. Turmeda
> >>>Campus UIB
> >>>Ctra. Valldemossa, km. 7,5
> >>>07122 Palma de Mca.
> >>>tel: (+34) 971172987
> >>>fax: (+34) 971173003
> >>>email: arnau@uib.es
> >>>URL: http://dmi.uib.es/~arnau
> >>>--------
> >>>
> >>>
> >>>[[alternative HTML version deleted]]
> >>>
> >>>__
> >>>R-help@r-project.org mailing list
> >>>https://stat.ethz.ch/mailman/listinfo/r-help
> >>>PLEASE do read the posting guide 
> >>>http://www.R-project.org/posting-guide.html
> >>>and provide commented, minimal, self-contained, reproducible code.
> >
> >
> 
> 
> -- 
> 
> Arnau Mir Torres
> Edifici A. Turmeda
> Campus UIB
> Ctra. Valldemossa, km. 7,5
> 07122 Palma de Mca.
> tel: (+34) 971172987
> fax: (+34) 971173003
> email: arnau@uib.es
> URL: http://dmi.uib.es/~arnau
> 

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Installing rgdal in R: correct -configure flags for GDAL install on Linux Redhat

2011-05-05 Thread Andrew Robinson
Hi Katrina,

the error message below is actually pretty explicit.  Have you
installed the PROJ.4 library?  If not, then you need to install it.
When I had to do this I think that I used macports:

sudo port install proj

Then you need to tell configure where to find it, using the protocol
suggested below.  

I hope that this helps.

Andrew

On Thu, May 05, 2011 at 05:23:56PM -0800, Katrina Bennett wrote:
> Hi, I'm installing rgdal but I keep having failures because I have not been
> able to find a good source of information for the correct configuration
> settings when installing GDAL.
> 
> My error from the R install.packages("rgdal") is below.
> 
> Can someone point me to a good source to tell me how to set after
> ./configure when installing GDAL?
> 
> I'd like to be able to work with raster images, geotiffs, netCDF files, and
> other raster-based image processing in R.
> 
> Right now I'm running ./configure for GDAL as follows:
> 
> ./configure --prefix=$HOME/local/gdal/gdal-1.8.0 --with
> jasper=$HOME/local/jasper/jasper-1.900.1.uuid/src/libjasper
> 
> Any insight would be greatly appreciated! Thank you.
> 
> Error output from R:
> * installing *source* package 'rgdal' ...
> gdal-config: gdal-config
> checking for gcc... gcc -std=gnu99
> checking for C compiler default output file name... a.out
> checking whether the C compiler works... yes
> checking whether we are cross compiling... no
> checking for suffix of executables...
> checking for suffix of object files... o
> checking whether we are using the GNU C compiler... yes
> checking whether gcc -std=gnu99 accepts -g... yes
> checking for gcc -std=gnu99 option to accept ANSI C... none needed
> checking how to run the C preprocessor... gcc -std=gnu99 -E
> checking for egrep... grep -E
> checking for ANSI C header files... yes
> checking for sys/types.h... yes
> checking for sys/stat.h... yes
> checking for stdlib.h... yes
> checking for string.h... yes
> checking for memory.h... yes
> checking for strings.h... yes
> checking for inttypes.h... yes
> checking for stdint.h... yes
> checking for unistd.h... yes
> checking proj_api.h usability... no
> checking proj_api.h presence... no
> checking for proj_api.h... no
> Error: proj_api.h not found.
> If the PROJ.4 library is installed in a non-standard location,
> use --configure-args='--with-proj-
> include=/opt/local/include'
> for example, replacing /opt/local/* with appropriate values
> for your installation. If PROJ.4 is not installed, install it.
> ERROR: configuration failed for package 'rgdal'
> * removing
> '/import/home/u1/uaf/kbennett/R/x86_64-unknown-linux-gnu-library/2.11/rgdal'
> 
> 
> 
> 
> 
> 
> 
> 
> -- 
> Katrina E. Bennett
> PhD Student
> University of Alaska Fairbanks
> International Arctic Research Center
> 930 Koyukuk Drive, PO Box 757340
> Fairbanks, Alaska 99775-7340
> 907-474-1939 office
> 907-385-7657 cell
> kebenn...@alaska.edu
> 
> 
> Personal Address:
> UAF, PO Box 752525
> Fairbanks, Alaska 99775-2525
> bennett.katr...@gmail.com
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] two-way group mean prediction in survreg with three factors

2011-05-05 Thread Andrew Robinson
Even then, I think that there's a problem.  If C is in the model, then
the response varies by C.  The simplest way is to pick a value for C,
and then evaluate the group mean estimates of A and B (and C).

Something in my brain keeps asking whether another way to marginalize
C for the purposes of predicting A and B is just to remove it from the
model, or alternatively to make it a random effect.  Neither idea
seems rock solid at this point.  

Cheers

Andrew

On Thu, May 05, 2011 at 09:37:15AM -0400, Pang Du wrote:
> Oops, I hope not too.  Don't know why I had the brackets around B+C.  My
> model is actually A*B+C.  And I'm not sure how to obtain the two-way
> prediction of AB with C marginalized.  Thanks.
> 
> Pang
> 
> -Original Message-
> From: Andrew Robinson [mailto:a.robin...@ms.unimelb.edu.au] 
> Sent: Wednesday, May 04, 2011 10:13 PM
> To: Pang Du
> Cc: r-help@r-project.org
> Subject: Re: [R] two-way group mean prediction in survreg with three factors
> 
> I hope not!
> 
> Facetiousness aside, the model that you have fit contains C, and,
> indeed, an interaction between A and C.  So, the effect of A upon the
> response variable depends on the level of C.  The summary you want
> must marginalize C somehow, probably by a weighted or unweighted
> average across its levels.  What does that summary really mean?  Can
> you meaningfully average across the levels of a predictor that is
> included in the model as a main and an interaction term?
> 
> Best wishes
> 
> Andrew
> 
> On Wed, May 04, 2011 at 12:24:50PM -0400, Pang Du wrote:
> > I'm fitting a regression model for censored data with three categorical
> > predictors, say A, B, C.  My final model based on the survreg function is 
> > 
> > Surv(..) ~ A*(B+C).
> > 
> > I know the three-way group mean estimates can be computed using the
> predict
> > function. But is there any way to obtain two-way group mean estimates, say
> > estimated group mean for (A1, B1)-group?  The sample group means don't
> > incorporate censoring and thus may not be appropriate here.
> > 
> >  
> > 
> > Pang Du
> > 
> > Virginia Tech
> > 
> > 
> > [[alternative HTML version deleted]]
> > 
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Andrew Robinson  
> Program Manager, ACERA 
> Department of Mathematics and StatisticsTel: +61-3-8344-6410
> University of Melbourne, VIC 3010 Australia   (prefer email)
> http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
> http://www.acera.unimelb.edu.au/
> 
> Forest Analytics with R (Springer, 2011) 
> http://www.ms.unimelb.edu.au/FAwR/
> Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
> http://www.ms.unimelb.edu.au/spuRs/

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Vermunt's LEM in R

2011-05-05 Thread Andrew Robinson
Hi David,

you might have more luck with your request if you tell us what
Vermunt's LEM *does*, and provided some links to introductory reading
material ...

Cheers

Andrew

On Thu, May 05, 2011 at 03:34:02PM +, David Joubert wrote:
> 
> Hello-
> 
> Does anyone know of packages that could emulate what J. Vermunt's LEM does ? 
> What is the closest relative in R ?
> I use both R and LEM but have trouble transforming my multiway tables in R 
> into a .dat file compatible with LEM. 
> 
> Thanks,
> 
> David Joubert
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] functions pandit and treebase in the package apTreeshape

2011-05-05 Thread Andrew Robinson
Hi Arnau,

please send the output of sessionInfo() and the exact commands and
response that you used to install and load apTreeshape.

Cheers

Andrew

On Thu, May 05, 2011 at 04:42:58PM +0200, Arnau Mir wrote:
> Hello.
> 
> I'm trying to use the functions pandit and treebase. They are in the package 
> apTreeshape. Once I've loaded the package, R responses:
> 
> - no function pandit/treebase.
> 
> Somebody knows why or what is the reason?
> 
> 
> Thanks,
> 
> Arnau.
> 
> Arnau Mir Torres
> Edifici A. Turmeda
> Campus UIB
> Ctra. Valldemossa, km. 7,5
> 07122 Palma de Mca.
> tel: (+34) 971172987
> fax: (+34) 971173003
> email: arnau@uib.es
> URL: http://dmi.uib.es/~arnau
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nls problem with R

2011-05-05 Thread Andrew Robinson
Apologies, but I don't see a question here ... am I missing something
obvious?

Andrew

On Thu, May 05, 2011 at 01:20:33AM -0700, sterlesser wrote:
> ID1  ID2 t   V(t)
> 1 1   0   6.053078443
> 2 1   0.3403  5.56937391
> 3 1   0.4181  5.45484486
> 4 1   0.4986  5.193124598
> 5 1   0.7451  4.31386722
> 6 1   1.0069  3.645422269
> 7 1   1.5535  3.587710965
> 8 1   1.8049  3.740362689
> 9 1   2.4979  3.699837726
> 101   6.4903  2.908485019
> 111   13.5049 1.888179494
> 121   27.5049 1.176091259
> 131   41.5049 1.176091259
> 
> The model
> (1)  V(t)=V0[1-epi+ epi*exp(-c(t-t0))]
> (2)  V(t)=V0{A*exp[-lambda1(t-t0)]+(1-A)*exp[-lambda2(t-t0)]}
> 
> in formula (2) lambda1=0.5*{(c+delta)+[(c-delta)^2+4*(1-epi)*c*delta]^0.5}
>   
> lambda2=0.5*{(c+delta)-[(c-delta)^2+4*(1-epi)*c*delta]^0.5}
>A=(epi*c-lambda2)/(lambda1-lambda2)
> 
> The regression rule :
> for formula (1):(t<=2,that is) first 8 rows are used for non-linear
> regression
> epi,c,t0,V0 parameters are obtained 
> for formula (2):all 13 rows of results are used for non-linear regression 
> lambda1,lambda2,A (with these parameters, delta can be calculated from them)
> 
> Thanks for help
> Ster Lesser
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/nls-problem-with-R-tp3494454p3497825.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bivariate linear interpolation

2011-05-04 Thread Andrew Robinson
The one that gives results that you trust and uses algorithms that you
understand!

Cheers

Andrew

On Wed, May 04, 2011 at 12:52:03PM +, Halldór Björnsson wrote:
> Hi,
> 
> I have three matrices (X,Y,P) with the same dimension. The X,Y grid is
> regular and I want to
> perform linear interpolation to pick out certain points. In matlab
> appropriate call is
> something like
> 
> Pout=interp2(X,Y,P,Xout,Yout, method="linear")
> 
> where Xout and Yout are the locations where I want the Pout data
> (typically a different grid).
> (Scipy has this routine in interpolate.interp2d, with similar arguments)
> 
> 
> In R there is (as often) the choice between many different
> interpolation routines. Akima has one for irregularly spaced
> data (and does not like co-linearity in the data). Fields has another
> one, with a more complicated arguments.
> 
> What is the best R function that accomplishes this?
> 
> Sincerely
> Halldór
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Uniform Gaussian Kernel

2011-05-04 Thread Andrew Robinson
I can't honestly say that I grasp what you're trying to do, but that
said, I wonder if the curve() function will help you?

Cheers

Andrew


On Wed, May 04, 2011 at 07:30:20AM -0700, blutack wrote:
> I have a vector with lots of different numbers. I need to make a graph
> showing the Uniform Distribution of the figures. I have created a graph
> showing all the different values, but now want individual Gaussian Kernel
> round each point. This is what I have but each time it comes up with an
> error as I have just based it on the Normal Distribution, but I'm not sure
> what I need to change to make it work. Where z is my vector.
> 
> plot(0, 0, xlim=range(0, 300), ylim=range(0, 1), pch=NA,)
> for(i in 1:length(z)) {
>   points(z[i], 0, pch="|")
> }
> 
> x = seq(-10, 10, 0.01)
> for(i in 1:length(z)){
>   std_dev = 1
>   lines(x, dunif(x, z[i], sd = std_dev))
> }
> 
> Any ideas? Thanks.
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Uniform-Gaussian-Kernel-tp3495742p3495742.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] split character vector by multiple keywords simultaneously

2011-05-04 Thread Andrew Robinson
A hack would be to use gsub() to prepend e.g. XXX to the keywords that
you want, perform a strsplit() to break the lines into component
strings, and then substr() to extract the pieces that you want from
those strings.

Cheers

Andrew

On Wed, May 04, 2011 at 04:08:40PM -0700, sunny wrote:
> Hi. I have a character vector that looks like this:
> 
> > temp <- c("Company name: The first company General Manager: John Doe I
> > Managers: John Doe II, John Doe III","Company name: The second company
> > General Manager: Jane Doe I","Company name: The third company Managers:
> > Jane Doe II, Jane Doe III")
> > temp
> [1] "Company name: The first company General Manager: John Doe I Managers:
> John Doe II, John Doe III"
> [2] "Company name: The second company General Manager: Jane Doe I"
>
> [3] "Company name: The third company Managers: Jane Doe II, Jane Doe III" 
> 
> I know all the keywords, i.e. "Company name:", "General Manager:",
> "Managers:" etc. I'm looking for a way to split this character vector into
> multiple character vectors, with one column for each keyword and the
> corresponding values for each, i.e.
> 
> Company name  General Manager  Managers
> 1  The first companyJohn Doe IJohn Doe II, John
> Doe III
> 2 The second companyJane Doe I  
> 3  The third company  Jane Doe II,
> Jane Doe III
> 
> I have tried a lot to find something suitable but haven't so far. Any help
> will be greatly appreciated. I am running R-2.12.1 on x86_64 linux.
> 
> Thanks.
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/split-character-vector-by-multiple-keywords-simultaneously-tp3497033p3497033.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] two-way group mean prediction in survreg with three factors

2011-05-04 Thread Andrew Robinson
I hope not!

Facetiousness aside, the model that you have fit contains C, and,
indeed, an interaction between A and C.  So, the effect of A upon the
response variable depends on the level of C.  The summary you want
must marginalize C somehow, probably by a weighted or unweighted
average across its levels.  What does that summary really mean?  Can
you meaningfully average across the levels of a predictor that is
included in the model as a main and an interaction term?

Best wishes

Andrew

On Wed, May 04, 2011 at 12:24:50PM -0400, Pang Du wrote:
> I'm fitting a regression model for censored data with three categorical
> predictors, say A, B, C.  My final model based on the survreg function is 
> 
> Surv(..) ~ A*(B+C).
> 
> I know the three-way group mean estimates can be computed using the predict
> function. But is there any way to obtain two-way group mean estimates, say
> estimated group mean for (A1, B1)-group?  The sample group means don't
> incorporate censoring and thus may not be appropriate here.
> 
>  
> 
> Pang Du
> 
> Virginia Tech
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fGarch

2011-05-04 Thread Andrew Robinson
Hi Paul,

I suggest that you should send us commented, minimal, self-contained,
reproducible code.

That means, in essence, developing the simplest possible
representation of your problem.  In the process of developing the
simplest possible representation, you may learn more about the
problem.  Maybe even solve it.  

Even if you don't, then you enable us to make a much better
contribution, because we can actually try out our suggestions before
sending them.

With what you sent here, all we can do is speculate.

Cheers,

Andrew

On Wed, May 04, 2011 at 04:33:12PM -0400, Paul Ossenbruggen wrote:
> Hi,
> 
>   I am attempting to fit a ARMA/GARCH regression model without success.
> 
>   ### ARIMA-GARCH model with regressor ###
> 
> ### Time series data: A multivariate data set.
> cov.ts.dq = cov.ts[1:4,"dq1"][!is.na(cov.ts[,"dq1"])]
> cov.ts.day = ts.intersect(dq = diff(q.ts), day = lag(q.ts, -1))
> 
> ### The following R scripts work:
> (summary(no.day.fitr <- garchFit(dq ~ arma(0,3) + garch(1,1), data = 
> cov.ts.day)))
> (summary(no.day.fitr2 <- garchFit(dq ~ arma(0,3) + garch(1,1), data = 
> cov.ts.day,
>   include.mean=FALSE)))
> 
> ### ERROR: I add in the regressor "day".
> (summary(no.day.fitr3 <- garchFit(dq ~ day + arma(0,3) + garch(1,1), data = 
> cov.ts.day,
>   include.mean=FALSE)))
> ### Error in .garchArgsParser(formula = formula, data = data, trace = FALSE) 
> : 
> ###  object 'formula.mean' not found
> 
> ### ERROR: 
> day.fitr4 <- garchFit(formula.mean = dq ~ day + arma(0,3),formula.var = 
> ~garch(1,0), data = cov.ts.day,include.mean = FALSE)
> ### Error in garchFit(formula.mean = dq ~ day + arma(0, 3), formula.var = 
> ~garch(1,  : 
> ### Multivariate data inputs require lhs for the formula.
> ### Note: If I remove "day" I obtain the same error message.
>   
>   I would greatly appreciate knowing how to overcome this problem.
> 
>   Paul
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Panels order in lattice graphs

2011-05-04 Thread Andrew Robinson
Hi Cristina,

you can probably hack your own solution using the index.cond argument.

Cheers

Andrew

On Wed, May 04, 2011 at 04:50:53PM +0100, Cristina Silva wrote:
> Hi all,
> 
> In lattice graphs, panels are drawn from left to right and bottom to 
> top. The flag "as.table=TRUE" changes to left to right and top to 
> bottom. Is there any way to change to first top to bottom and then left 
> to right? didn´t find anything neither in Help pages nor Lattice book.
> 
> Cristina
> 
> -- 
> --
> Cristina Silva
> INRB/L-IPIMAR
> Unidade de Recursos Marinhos e Sustentabilidade
> Av. de Brasília, 1449-006 Lisboa
> Portugal
> Tel.: 351 21 3027096
> Fax: 351 21 3015948
> csi...@ipimar.pt
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problems saving ff objects

2011-05-04 Thread Andrew Robinson
I wonder if this question should be directed to the package
maintainer?

Best wishes,

Andrew

On Wed, May 04, 2011 at 02:31:51PM +0100, Jannis wrote:
> Just did some more testing.May the problem be due to the fact that I am 
> using a windows machine? I just ran the same code on a Linux machine and 
> everything worked fine.
> 
> If windows (or the file system of the disk) caused the problem, is there any 
> way to resolve it? I know that using Linux would be a better choice ;-) but 
> unfortunatley this in no option at the moment
> 
> 
> Best
> Jannis
> 
> --- Jannis  schrieb am Mi, 4.5.2011:
> 
> > Von: Jannis 
> > Betreff: [R] Problems saving ff objects
> > An: r-help@r-project.org
> > Datum: Mittwoch, 4. Mai, 2011 13:17 Uhr
> > Dear list,
> > 
> > 
> > I am trying to understand and use the ff package. As I had
> > some problems saving some ff objects, and as I did not fully
> > manage to understand the whole concept of *.ff, *.ffData and
> > *.RData with the help of the documentation, I tried to
> > reproduce the examples from the help of ffsave.
> > 
> > When I ran, however : (copied from the help)
> > 
> >  message("let's create some ff objects")
> >   n <- 8e3
> >   a <- ff(sample(n, n, TRUE), vmode="integer",
> > length=n, filename="d:/tmp/a.ff")
> >   b <- ff(sample(255, n, TRUE), vmode="ubyte",
> > length=n, filename="d:/tmp/b.ff")
> >   x <- ff(sample(255, n, TRUE), vmode="ubyte",
> > length=n, filename="d:/tmp/x.ff")
> >   y <- ff(sample(255, n, TRUE), vmode="ubyte",
> > length=n, filename="d:/tmp/y.ff")
> >   z <- ff(sample(255, n, TRUE), vmode="ubyte",
> > length=n, filename="d:/tmp/z.ff")
> >   df <- ffdf(x=x, y=y, z=z)
> >   rm(x,y,z)
> > 
> >   message("save all of them")
> >   ffsave.image("d:/tmp/x")
> > 
> > I get:
> > 
> > Error in ffsave(list = ls(envir = .GlobalEnv, all.names =
> > TRUE), file = outfile,  : 
> >   the previous files do not match the rootpath (case
> > sensitive)
> > 
> > 
> > Whats wrong here? Should this not be working as I did not
> > change anything in the code?
> > 
> > 
> > 
> > Cheers
> > Jannis
> > 
> > 
> > > sessionInfo()
> > R version 2.12.0 (2010-10-15)
> > Platform: i386-pc-mingw32/i386 (32-bit)
> > 
> > locale:
> > [1] LC_COLLATE=English_United States.1252 
> > [2] LC_CTYPE=English_United States.1252   
> > [3] LC_MONETARY=English_United States.1252
> > [4] LC_NUMERIC=C           
> >               
> > [5] LC_TIME=English_United States.1252    
> > 
> > attached base packages:
> > [1] tools     stats 
> >    graphics  grDevices utils 
> >    datasets  methods  
> > [8] base     
> > 
> > other attached packages:
> > [1] ff_2.2-2   bit_1.1-7  rj_0.5.2-1
> > 
> > loaded via a namespace (and not attached):
> > [1] rJava_0.8-8
> > 
> > 
> > 
> > 
> > __
> > R-help@r-project.org
> > mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained,
> > reproducible code.
> >
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help with the maxBHHH routine

2011-05-04 Thread Andrew Robinson
Hi Rohit,

actually, the request for simple reproducible code means that you have
to find the simplest possible representation of the problem.  

What happens if you simplify the observation level gradient and the
likelihood function?  Eg to trivial examples?  If you still get the
error, then simplify it futher.  If you get the error with the
simplest possible problem, then share it.  If you don't , then try to
figure out what the changes were that resolved the problem, and scale
those back up to your original problem.

Does that make sense?

Cheers

Andrew


On Thu, May 05, 2011 at 03:22:55AM +0530, Rohit Pandey wrote:
>Hi Andrew, Ravi and Arne,
> 
>Thank you so much for your prompt replies. I see that all of you mention
>the need for simple, reproducible code. I had thought of doing this, but
>the functions I was using for the observation level gradient and
>likelihood function were very long. I will paste them below here.
> 
>Also, sorry for the ambiguity with the "1000's of observations and 821
>parameters" on the one hand and the 10 * 2 matrix on the other. The latter
>is a toy data set and the former is the real data set I ultimately hope to
>apply this routine to once it works. Also, sorry for not mentioning the
>fact that the maxBHHH function I am using is from the maxLik package
>(thanks, Ravi for pointing out).
>So, the code that is giving me the errors is:
>
> maxBHHH(logLikALS4,grad=nuGradientC4,finalHessian="BHHH",start=prm,iterlim=2)
>and
>
> maxBHHH(logLikALS4,grad=nuGradientC4,finalHessian="BHHH",start=prm,iterlim=2)
>Where nuGradientC4 returns a 2*10 matrix and nuGradientC5 a 10*2 matrix
>(there are 10 parameters and 2 observations).
>I have attached the required functions in the .R file.
>These make for some pretty long code, but all you have to do is either
>load the file or paste the contents into your R console (and maybe see
>that they're returning what they're supposed to). I'm sorry I couldn't
>think of a way to come up with a shorter version of this code (I tried my
>best).
> 
>Once you load the file, you should see the following:
> 
>#The observation level likelihood function
>> logLikALS4(prm)
>1 2
>-0.6931472 -0.6931472
> 
>#The observation level gradients
>> nuGradientC4(prm)
>1 2 3 4 5 6 7 8 9 10
>2 -0.3518519 0.3518519 0.000 0 -0.1481481 -0.167 0.1481481
>0.167 0.000 0.000
>4 0.000 -0.3518519 0.3518519 0 0.000 0.000 -0.167
>-0.1481481 0.167 0.1481481
>Warning messages:
>1: In [1]is.na(x) : [2]is.na() applied to non-(list or vector) of type
>'NULL'
>2: In [3]is.na(x) : [4]is.na() applied to non-(list or vector) of type
>'NULL'
> 
>> nuGradientC5(prm)
>2 4
>1 -0.3518519 0.000
>2 0.3518519 -0.3518519
>3 0.000 0.3518519
>4 0.000 0.000
>5 -0.1481481 0.000
>6 -0.167 0.000
>7 0.1481481 -0.167
>8 0.167 -0.1481481
>9 0.000 0.167
>10 0.000 0.1481481
>Warning messages:
>1: In [5]is.na(x) : [6]is.na() applied to non-(list or vector) of type
>'NULL'
>2: In [7]is.na(x) : [8]is.na() applied to non-(list or vector) of type
>'NULL'
> 
>Ignore the warning messages.
> 
>The errors are:
> 
>>
>
> maxBHHH(logLikALS4,grad=nuGradientC4,finalHessian="BHHH",start=prm,iterlim=2)
>Error in checkBhhhGrad(g = gr, theta = theta, analytic = (!is.null(attr(f,
>:
>the matrix returned by the gradient function (argument 'grad') must have
>at least as many rows as the number of parameters (10), where each row
>must correspond to the gradients of the log-likelihood function of an
>individual (independent) observation:
>currently, there are (is) 10 parameter(s) but the gradient matrix has only
>2 row(s)
>In addition: Warning messages:
>1: In [9]is.na(x) : [10]is.na() applied to non-(list or vector) of type
>'NULL'
>2: In [11]is.na(x) : [12]is.na() applied to non-(list or vector) of type
>'NULL'
> 
>and:
> 
>>
>
> maxBHHH(logLikALS4,grad=nuGradientC5,finalHessian="BHHH",start=prm,iterlim=2)
>Error in gr[, fixed] <- NA : (subscript) logical subscript too long
>In addition: Warning messages:
>1: In [13]is.na(x) : [14]is.na() applied to non-(list or vector) of type
>'NULL'
>2: In [15]is.na(x) : [16]is.na() applied to non-(list or vector) of ty

Re: [R] what happens when I store linear models in an array?

2011-05-04 Thread Andrew Robinson
Hi Andrew,

try

fitted(lms.ASP[1,1][[1]])

Cheers

Andrew

On Wed, May 04, 2011 at 01:49:45PM +0200, Andrew D. Steen wrote:
> I've got a bunch of similar datasets, all of which I've fit to linear
> models.  I'd like to easily create arrays of a specific parameter from each
> linear model (e.g., all of the intercepts in one array).  I figured I'd put
> the model objects into an array, and then (somehow) I could easily create
> corresponding arrays of intercepts or residuals or whatever, but I can't the
> parameters back out.
> 
>  
> 
> Right now I've stored the model objects in a 2-D array:
> > lms.ASP <- array(list(), c(3,4))
> 
>  
> 
> Then I fill the array element-by-element:
> > surf105.lm. ASP <- lm(ASP ~ time)
> > lms.ASP[1,1] <- list(surf105.lm.ASP)
> 
>  
> 
> Something is successfully being stored in the array:
> > test <- lms.tx.ASP[1,1]
> > test
> [[1]]
> Call:
> lm(formula = ASP ~ time)
> Coefficients:
>  (Intercept)  elapsed.time..hr  
>  0.430732  0.004073  
> 
>  
> 
> But I can't seem to call extraction functions on the linear models:
> > fitted(lms.ASP[1,1])
> NULL
> 
>  
> 
> It seems like something less than the actual linear model object is being
> stored in the array, but I don't understand what's happening, or how to
> easily batch-extract parameters of linear models.  Any advice?
> 
>  
> 
>  
> 
> 
> 
> Andrew D. Steen, Ph.D.
> 
> Center for Geomicrobiology, Aarhus University
> 
> Ny Munkegade 114
> DK-8000 Aarhus C
> Denmark
> Tel: +45 8942 3241
> Fax: +45 8942 2722
> 
> andrew.st...@biology.au.dk
> 
>  
> 
> 
>   [[alternative HTML version deleted]]
> 
> ______
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] select value from a column depending on a value in another column

2011-05-04 Thread Andrew Robinson
Try subset().

Andrew

On Wed, May 04, 2011 at 02:01:44AM -0700, tornanddesperate wrote:
> Hi everybody
> 
> I couldn't find the solution to what must be quite a simple problem. Maybe
> you can help?
> 
>treatment session period stage wage_accepted market
> 1  11 11   25  public
> 2  11 11   19  privat
> 3  11 11   15  public
> 4  11 12   32  public
> 5  11 12   13  privat
> 
> >From this table, I'd like to choose only those values in the column
> "wage_accepted" that have the value "public" in the column "market". How can
> I do this? 
> 
> Is there a good general help site for R that would explain basic table
> manipulations such as this?
> 
> Thank you very much for your help
> 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/select-value-from-a-column-depending-on-a-value-in-another-column-tp3494926p3494926.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem with package "adapt" for R in Mac

2011-05-04 Thread Andrew Robinson
Hi,

Is there such a package?  I can't find it on CRAN.  Can you let us
know exactly how you tried to install it, and what the error message
was (if any)?

Cheers

Andrew 


On Wed, May 04, 2011 at 01:29:37AM -0300, Mat?as Ram?rez Salgado wrote:
> Hi,
> 
> How i can install the package "adapt" in some version of R for mac?
> 
> i try in 2.13, 2.9,2.7 and other previous versions... and nothing happens.
> 
> and another question: There are some packages that do the same but that it
> is implemented for mac? (calculate integrals in 2 or more dimmensions).
> 
> help me please, it's for an important work.
> 
> greetings.
> 
> 
> -- 
> Mat?as Hern?n Ram?rez Salgado.
> Estudiante de Estad?stica.
> Pontificia Universidad Cat?lica de Chile.
> 
>   [[alternative HTML version deleted]]
> 

> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nls problem with R

2011-05-04 Thread Andrew Robinson
The fact that T2 and V2 are of different lengths seems like a likely
culprit.  Other than that, you need to find start points that do not
lead to a singular gradient.  There are several books that provide
advice on obtaining initial parameter estimates for non-linear
models.  Google Books might help you.

Cheers

Andrew




On Tue, May 03, 2011 at 09:08:03PM -0700, sterlesser wrote:
> the original data are
> V2 =c(371000,285000 ,156000, 20600, 4420, 3870, 5500 )
> T2=c( 0.3403 ,0.4181 ,0.4986 ,0.7451 ,1.0069 ,1.553)
> nls2=nls(V2~v0*(1-epi+epi*exp(-cl*(T2-t0))),start=list(v0=10^7,epi=0.9,cl=6.2,t0=8.7))
> after execution error occurs as below
> 
> Error in nlsModel(formula, mf, start, wts) : 
>   singular gradient matrix at initial parameter estimates
> Error in nlsModel(formula, mf, start, wts) : 
>   singular gradient matrix at initial parameter estimates
> In addition: Warning messages:
> 1: In lhs - rhs :
>   longer object length is not a multiple of shorter object length
> 2: In .swts * attr(rhs, "gradient") :
>   longer object length is not a multiple of shorter object length
> 
> could anyone help me ?thansks
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/nls-problem-with-R-tp3494454p3494454.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Watts Strogatz game

2011-05-03 Thread Andrew Robinson
Hi,

I have no familiarity with these functions --- I see that they are not
in base R --- so I suggest that at very least you identify the package
that you are using.  Better would be to contact the package maintainer
directly.  Sometimes maintainers do not read R-help.

Cheers

Andrew

On Tue, May 03, 2011 at 12:42:55AM -0700, kparamas wrote:
> Hi,
> 
> I have a erdos-renyi game with 6000 nodes and probability 0.003.
> 
> g1 = erdos.renyi.game(6000, 0.003)
> 
> How to create a Watts Strogatz game with the same probability.
> 
> g1 = watts.strogatz.game(1, 6000, ?, ?)
> What should be the third and fourth parameter to this argument.
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Watts-Strogatz-game-tp3491922p3491922.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] na.omit - Is it working properly?

2011-05-03 Thread Andrew Robinson
Hi Sarah,

I'm not sure that I understand your problem.  You have shown us three
ways to try to omit missing values, and one of them seems to work.
But you're concerned because some aspect of it doesn't match the ones
that don't work?  But they don't work!  

I wonder if you could send an example in commented, minimal,
self-contained, reproducible code ...

Cheers

Andrew

On Tue, May 03, 2011 at 12:18:03PM -0700, Kalicin, Sarah wrote:
> 
> I have a work around for this, but can someone explain why the first example 
> does not work properly? I believed it worked in the previous version of R, by 
> selecting just the rows=200525 and omitting the na's. I just upgraded to 
> 2.13. I am also concern with the row numbers being different in the 
> selections, should I be worried? FYI, I just selected the first few rows for 
> demonstration, please do not worry that the number of rows shown are not 
> equal. - Sarah
> 
> With na.omit around the column, but it is showing other values in the F.WW 
> column other than 200525, along with NA.  I was hoping that this would omit 
> all the NA's, and show all the rows that P$F.WW=200525. I believe it did with 
> the previous version of R.
> P[na.omit(P$F.WW)==200525, c(51, 52)]
>   F.WWR.WW
> 45  200525  NA
> 53  NA  NA
> 61  200534  200534
> 63  200608  200608
> 66  200522  200541
> 80  NA  NA
> 150 200521  200516
> 231 200530  200530
> 
> No na.omit, the F.WW=200525 seems to work, but lots of NA included. This is 
> what is expected!! The row numbers are not the same as the above example, 
> except the first row.
> > P[P$F.WW==200525, c(51, 52)]
> F.WW R.WW
> 45200525  NA
> NANA  NA
> NA.1  NA  NA
> NA.2  NA  NA
> NA.3  NA  NA
> 57200525  200526
> 65200525  NA
> 67200525  NA
> 70200525  200525
> NA.4  NA  NA
> NA.5  NA  NA
> 86200525  NA
> 
> Na.omit excludes the na's. This is what I want. The concern I have is why the 
> row numbers do not match any of those shown in the examples above.
> > na.omit(P[P$F.WW==200525, c(51, 52)])
> F.WWR.WW
> 57200525  200526
> 70200525  200525
> 161   200525  200525
> 245   200525  200525
> 246   200525  200525
> 247   200525  200526
> 256   200525  200525
> 266   200525  200525
> 269   200525  200525
> 271   200525  200526
> 276   200525  200526
> 278   200525  200526
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] delete excel id automatically generated

2011-05-03 Thread Andrew Robinson
Try the function

rownames()

Andrew

On Tue, May 03, 2011 at 03:29:37AM -0700, agent dunham wrote:
> Dear community, 
> 
> I uploaded an excel with read.xls. My xls file actually have a column which
> is an id, ("plot" is the id) : 
> 
> plot height area
> 347.6 5.4
> 853.2 4.1
> 895.4 8.4
> 121   6.76.2
> ...
> 1325  2.11.5
> 
> However R uses another id, this way: 
> 
> r id   plot height  area
>  1  347.6 5.4
>  2  853.2 4.1
>  3  895.4 8.4
>  4 1216.76.2
> ...
> 314  1325   2.11.5
> 
> I'd like that R used "plot" id because I delete some rows while studying
> regression, and R seems to be using the first id 1,2,3,4,...,314. Sometimes
> it's a mess to understand what R means in the plots when, for instance,
> states that data 200 is influential
> 
> Thanks in advance, u...@host.com
> 
> 
> 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/delete-excel-id-automatically-generated-tp3492147p3492147.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help with the maxBHHH routine

2011-05-03 Thread Andrew Robinson
I suggest that you provide some commented, minimal, self-contained,
reproducible code.

Cheers

Andrew

On Wed, May 04, 2011 at 02:23:29AM +0530, Rohit Pandey wrote:
> Hello R community,
> 
> I have been using R's inbuilt maximum likelihood functions, for the
> different methods (NR, BFGS, etc).
> 
> I have figured out how to use all of them except the maxBHHH function. This
> one is different from the others as it requires an observation level
> gradient.
> 
> I am using the following syntax:
> 
> maxBHHH(logLik,grad=nuGradient,finalHessian="BHHH",start=prm,iterlim=2)
> 
> where logLik is the likelihood function and returns a vector of observation
> level likelihoods and nuGradient is a function that returns a matrix with
> each row corresponding to a single observation and the columns corresponding
> to the gradient values for each parameter (as is mentioned in the online
> help).
> 
> however, this gives me the following error:
> 
> *Error in checkBhhhGrad(g = gr, theta = theta, analytic = (!is.null(attr(f,
> :
>   the matrix returned by the gradient function (argument 'grad') must have
> at least as many rows as the number of parameters (10), where each row must
> correspond to the gradients of the log-likelihood function of an individual
> (independent) observation:
>  currently, there are (is) 10 parameter(s) but the gradient matrix has only
> 2 row(s)
> *
> It seems it is expecting as many rows as there are parameters. So, I changed
> my likelihood function so that it would return the transpose of the earlier
> matrix (hence returning a matrix with rows equaling parameters and columns,
> observations).
> 
> However, when I run the function again, I still get an error:
> *Error in gr[, fixed] <- NA : (subscript) logical subscript too long*
> 
> I have verified that my gradient function, when summed across observations
> gives the same results as the in built numerical gradient (to the 11th
> decimal place - after that, they differ since R's function is numerical).
> 
> I am trying to run a very large estimation (1000's of observations and 821
> parameters) and all of the other methods are taking way too much time
> (days). This method is our last hope and so, any help will be greatly
> appreciated.
> 
> -- 
> Thanks in advance,
> Rohit
> Mob: 91 9819926213
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unexp. behavior from boot with multiple statistics

2011-05-03 Thread Andrew Robinson
Your interpretation of what the output is supposed to look like is
actually correct.  Take a look at the estimates of the bias in the
BootStrap Statistics.  You will see that they are the same as the
difference between the location of colMeans of t and t0.

I hope that this helps,

Andrew

On Tue, May 03, 2011 at 12:15:05PM -0700, algorimancer wrote:
> I am attempting to use package boot to summarize and compare the performance
> of three models.  I'm using R 2.13.0 in a Win32 environment.
> 
> My statistic function returns a vector of 6 values, 3 of which are error
> rates for different models, and 3 are pairwise differences between those
> error rates.  It looks like:
> 
> multiEst<-function(dat,i)
> {
>
>   c(E1,E2,E3,E2-E1,E3-E1,E3-E2);
> }
> 
> then I call boot (using R=4 for simplicity of description) with:
> 
> multiBoot=boot(data,multiEst,R=4)
> 
> which gives reasonable results:
> 
> Bootstrap Statistics :
> original  biasstd. error
> t1* 0.07  0.3775  0.04193249
> t2* 0.08  0.3750  0.04654747
> t3* 0.04  0.4200  0.05354126
> t4* 0.01 -0.0025  0.0050
> t5*-0.03  0.0425  0.0150
> t6*-0.04  0.0450  0.01290994
> 
> and the resulting "t0" contains the expected estimates of the statistics,
> > multiBoot$t0
> [1]  0.07  0.08  0.04  0.01 -0.03 -0.04
> 
> however "t", which is supposed to contain bootstrap replicates of the
> statistic, doesn't.  It looks like this:
> > multiBoot$t
>  [,1] [,2] [,3] [,4] [,5]  [,6]
> [1,] 0.46 0.47 0.46 0.01 0.00 -0.01
> [2,] 0.39 0.39 0.39 0.00 0.00  0.00
> [3,] 0.45 0.46 0.47 0.01 0.02  0.01
> [4,] 0.49 0.50 0.52 0.01 0.03  0.02
> 
> It is not clear where these columns come from --- they clearly do not
> resemble the estimates in "t0".
> 
> If I define a separate statistic function for each desired estimate, the
> resulting "t" and "t0" are as expected, however it is important in this case
> that the separate estimates derive from the same bootstrap replicates.
> 
> Any helpful suggestions? Or have I come upon a bug in the implementation?
> 
> Note: the documentation provides the following definitions for these
> returned variables:
> 
> t0The observed value of statistic applied to data.
> t A matrix with R rows each of which is a bootstrap replicate of 
> statistic. 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Unexp-behavior-from-boot-with-multiple-statistics-tp3493300p3493300.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> ______
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lasso with Categorical Variables

2011-05-02 Thread Andrew Robinson
On Mon, May 02, 2011 at 05:22:57PM -0400, Clemontina Alexander wrote:
> Thanks for your response, but I guess I didn't make my question clear.
> I am already familiar with the concept of dummy variables and
> regression in R. My question is, can the "lars" package (or some other
> lasso algorithm) handle factors? I did use dummy variables in my
> original data, but lars (lasso) only shrank the coefficients of some
> of the levels of one factor to 0. Is this the correct thing to do?

It's because, so far as the linear model is concerned, factors are a
convenience to help us handle the dummy variables. So, yes, it's the
correct thing to do.  It sounds to me as though you are after a
shrinkage device that will treat the factor as a whole. 

> Because intuitively it seems like I would want to shrink the whole
> factor coefficient to 0. If this is correct, what is the
> interpretation? For example, for X1, if lasso drops the coefficient
> for levels A and B, but not C and D, does this mean that X1 should be
> included in the model?

It means that X1 should be recoded to be C, D, and the rest. 

Cheers

Andrew

> Thanks.
> 
> 
> 
> On Mon, May 2, 2011 at 2:47 PM, David Winsemius  
> wrote:
> >
> > On May 2, 2011, at 10:51 AM, Steve Lianoglou wrote:
> >
> >> Hi,
> >>
> >> On Mon, May 2, 2011 at 12:45 PM, Clemontina Alexander 
> >> wrote:
> >>>
> >>> Hi! This is my first time posting. I've read the general rules and
> >>> guidelines, but please bear with me if I make some fatal error in
> >>> posting. Anyway, I have a continuous response and 29 predictors made
> >>> up of continuous variables and nominal and ordinal categorical
> >>> variables. I'd like to do lasso on these, but I get an error. The way
> >>> I am using "lars" doesn't allow for the factors. Is there a special
> >>> option or some other method in order to do lasso with cat. variables?
> >>>
> >>> Here is and example (considering ordinal variables as just nominal):
> >>>
> >>> set.seed(1)
> >>> Y <- rnorm(10,0,1)
> >>> X1 <- factor(sample(x=LETTERS[1:4], size=10, replace = TRUE))
> >>> X2 <- factor(sample(x=LETTERS[5:10], size=10, replace = TRUE))
> >>> X3 <- sample(x=30:55, size=10, replace=TRUE)  # think age
> >>> X4 <- rchisq(10, df=4, ncp=0)
> >>> X <- data.frame(X1,X2,X3,X4)
> >>>
> >>>> str(X)
> >>>
> >>> 'data.frame':   10 obs. of  4 variables:
> >>>  $ X1: Factor w/ 4 levels "A","B","C","D": 4 1 3 1 2 2 1 2 4 2
> >>>  $ X2: Factor w/ 5 levels "E","F","G","H",..: 3 4 3 2 5 5 5 1 5 3
> >>>  $ X3: int  51 46 50 44 43 50 30 42 49 48
> >>>  $ X4: num  2.86 1.55 1.94 2.45 2.75 ...
> >>>
> >>>
> >>> I'd like to do:
> >>> obj <- lars(x=X, y=Y, type = "lasso")
> >>>
> >>> Instead, what I have been doing is converting all data to continuous
> >>> but I think this is really bad!
> >>
> >> Yeah, it is.
> >>
> >> Check out the "Categorical Predictor Variables" section here for a way
> >> to handle such predictor vars:
> >> http://www.psychstat.missouristate.edu/multibook/mlt08m.html
> >
> > Steve's citation is somewhat helpful, but not sufficient to take the next
> > steps. You can find details regarding the mechanics of typical linear
> > regression in R on the ?lm page where you find that the factor variables are
> > typically handled by model.matrix. See below:
> >
> >> model.matrix(~X1 + X2 + X3 + X4, X)
> >   (Intercept) X1B X1C X1D X2F X2G X2H X2I X3        X4
> > 1            1   0   0   1   0   1   0   0 51 2.8640884
> > 2            1   0   0   0   0   0   1   0 46 1.5462243
> > 3            1   0   1   0   0   1   0   0 50 1.9430901
> > 4            1   0   0   0   1   0   0   0 44 2.4504180
> > 5            1   1   0   0   0   0   0   1 43 2.7535052
> > 6            1   1   0   0   0   0   0   1 50 1.6200326
> > 7            1   0   0   0   0   0   0   1 30 0.5750533
> > 8            1   1   0   0   0   0   0   0 42 5.9224777
> > 9            1   0   0   1   0   0   0   1 49 2.0401528
> > 10           1   1   0   0   0   1   0   0 48 6.2995288
> > attr(,"assign")
> >  [1] 0 1 1 1 2 2 2 2 3 4
> > attr(,"contrasts")
> > attr(,"cont

Re: [R] Simulation Questions

2011-05-02 Thread Andrew Robinson
Hi Shane,

it sounds to me as though you have a fairly well-defined problem.  You
want to generate random numbers with a specific mean, variance, and
correlation with another random varaible.  I would reverse-enginerr
the fuinctions for simple linear regression to get a result like

y = beta_0 + beta_1 * x + rnorm(n, 0, sigma^2)

and use that as the basis of generating random numbers.

Not sure how to interpret the second question ...

Cheers

Andrew

On Sun, May 01, 2011 at 12:33:41AM -0400, Shane Phillips wrote:
> I have the following script for generating a dataset.  It works like a champ 
> except for a couple of things.
> 
> 1.  I need the variables "itbs" and  "map" to be negatively correlated with 
> the binomial variable "lunch"  (around -0.21 and -0.24, respectively). The 
> binomial variable  "lunch" needs to remain unchanged.
> 2.  While my generated variables do come out with the desired means and 
> correlations, the distribution is very narrow and only represents a small 
> portion of the possible scores.  Can I force it to encompass a wider range of 
> scores, while maintaining my desired parameters and correlations?
> 
> Please help...
> 
> Shane
> 
> Script follows...
> 
> 
> 
> #Number the subjects
> subject=1:1000
> #Assign a treatment condition from a binomial distribution with a probability 
> of 0.13
> treat=rbinom(1*1000,1,.13)
> #Assign a lunch status condition froma binomial distribution with a 
> probability of 0.35
> lunch=rbinom(1*1000,1,.35)
> #Generate age in months from a random normal distribution with mean of 87 and 
> sd of 2
> age=rnorm(1000,87,2)
> #invoke the MASS package
> require(MASS)
> #Establish the covariance matrix for MAP, ITBS and CogAT scores
> sigma <- matrix(c(1, 0.84, 0.59, 0.84, 1, 0.56, 0.59, 0.56, 1), ncol = 3)
> #Establish MAP as a random normal variable with mean of 200 and sd of 9
> map   <- rnorm(1000, 200, 9)
> #Establish ITBS as a random normal variable with mean of 175 and sd of 15
> itbs <- rnorm(1000, 175, 15)
> #Establish CogAT as a random normal variable with mean of 100 and sd of 16
> cogat<-rnorm(1000,100,16)
> #Create a dataframe of MAP, ITBS, and CogAT
> data <- data.frame(map, itbs, cogat)
> #Draw from the multivariate distribution defined by MAP, ITBS, and CogAT 
> means and the covariance matrix
> sim <- mvrnorm(1000, mu=mean(data), sigma, empirical=FALSE)
> #Set growth at 0
> growth=0
> #Combine elements into a single dataset
> simtest=data.frame (subject=subject, treat=treat,lunch, 
> age=round(age,0),round(sim,0),growth)
> #Set mean growth by treatment condition with treatd subjects having a mean 
> growth of 1.5 and non-treated having a mean growth of 0.1
> simtest<-transform(simtest, growth=rnorm(1000,m=ifelse(treat==0,0.1,1.5),s=1))
> simtest
> cor (simtest)
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with coloring segments on a plot

2011-05-02 Thread Andrew Robinson
Hi Paul,

not to seem naive, but have you actually tried the code below?  It
doesn't seem that you have, from your text.  I think that if you try
it and hack then ask concrete questions (e.g. can anyone explain why
the following simple, reproducible, commented code does not work) then
you'll have more luck.

Best wishes

Andrew

On Mon, May 02, 2011 at 02:26:16PM -0400, Paul Davison wrote:
> Hi. I need a very short piece of help regarding colouring segments plotted
> on a graph.
> 
> When I am plotting segments for the graph, I am using "red" and "darkgreen
> for the values "1" and "2" respectively. Heres the relevant line of code in
> R:
> 
> + col = c("red", "darkgreen")[line.colour.value])
> 
> I just need to extend this to refer to a larger range of numbers from 1 to
> 10, to plot the segments in ten different colours. The values are just the
> first ten integers: 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10
> Each of the ten values will refer to a different colour just as "1" would
> plot a segment in red and "2" would plot a segment in darkgreen.
> 
> The only other condition I need is that the colours be in hex format. Would
> this be along the right lines? :
> 
> + col = c("#FF", "#FF", "#FF", "#FF", "#FF", "#FF",
> "#FF", "#FF", "#FF", "#FF",)[line.colour.value])
> 
> Or would I need to adjust the code in other places too?
> 
> I have copied the code I am using below. I have also copied below a small
> excerpt of the simple data I am plotting - with the headers at the top.
> 
> Thank you so much for your help.
> 
> Paul Davison
> University of Cambridge, UK
> 
> 
> 
> 
> > data = read.csv("r.test.data.csv", header = TRUE)
> > with(data, {
> + par(bg="#0B5FA5")
> + par(lwd=0.01)
> + plot(NA, NA,
> + xlim = range(start.x.co.ordinate, end.x.co.ordinate, 5),
> + ylim = range(start.y.co.ordinate, end.y.co.ordinate, 5),
> + type = "n", ann = FALSE, axes = FALSE)
> + segments(start.x.co.ordinate, start.y.co.ordinate,
> + end.x.co.ordinate, end.y.co.ordinate,
> + col = c("red", "darkgreen")[line.colour.value])
> + title(main = "10th April 1991",
> + xlab = "Pandora",
> + ylab = "Luna")
> + })
> >> quartz.save("sample4.png","png")
> 
> 
> The values in the following data table for the column "line.colour.value"
> are just 1s and 2s. Ideally I would have numbers of 1 through to 10 and each
> one would plot a different coloured (using a hex value) segment.
> 
> 
> start.x.co.ordinatestart.y.co.ordinate  end.x.co.ordinate
> end.y.co.ordinate   line.colour.value
> 300 300 2289 20289 2 300 300 2692 20467 1 300 300 3010 20608 2 300 300
> 2727 19828 1 300 300 2606 20056 2 300 300 16244 21416 1 300 300 16154 21899
> 2 300 300 16941 21434 1 300 300 17356 20205 2 300 300 16928 21245 1 300 300
> 16011 21024 2 300 300 17323 20053 1 300 300 17312 20435 2 300 300 17175
> 21259 1 300 300 16851 21268 2
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Optimization - n dimension matrix

2011-05-02 Thread Andrew Robinson
Hello,

optim() works for more than one dimension.  You might also find this
page helpful:

http://cran.r-project.org/web/views/Optimization.html

Cheers

Andrew

On Mon, May 02, 2011 at 12:41:19PM -0700, petrolmaniac wrote:
> Dear all,
> 
> I am facing the following problem in optimization:
> 
> w = (d, o1, ..., op, m1, ..., mq) is a 1 + p + q vector
> 
> I want to determine: 
> 
> w = argmin (a - d(w))' A (a - d(w))
> 
> where a is a 1xK marix, A is the covariance matrix of vector a, d(w) is a
> 1xK vector which parameters are functions of parameters d, o1 .. op, m1 ..
> mq.
> 
> Is there some function to solve this problem easily? I know optim() and
> ucminf() for one-dimensional optimization (I believe). Are there some tools
> for such n-dimensional problem?
> 
> Kind regards,
> 
> C.
> -- 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Optimization-n-dimension-matrix-tp3490772p3490772.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] UNIX-like "cut" command in R

2011-05-02 Thread Andrew Robinson
Hi Mike,

try substr()

Cheers

Andrew

On Mon, May 02, 2011 at 03:53:58PM -0500, Mike Miller wrote:
> The R "cut" command is entirely different from the UNIX "cut" command. 
> The latter retains selected fields in a line of text.  I can do that kind 
> of manipulation using sub() or gsub(), but it is tedious.  I assume there 
> is an R function that will do this, but I don't know its name.  Can you 
> tell me?
> 
> I'm also guessing that there is a web page somewhere that will tell me how 
> to do a lot of common GNU/UNIX/Linux "text util" commmand-line kinds of 
> things in R.  By that I mean by using R functions, not by making system 
> calls.  Does anyone know of such a web page?
> 
> Thanks in advance.
> 
> Mike
> 
> --
> Michael B. Miller, Ph.D.
> Minnesota Center for Twin and Family Research
> Department of Psychology
> University of Minnesota
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subseting data

2011-05-02 Thread Andrew Robinson
I wonder if grep()  will help you?

Cheers

Andrew

On Mon, May 02, 2011 at 11:03:52AM +0200, Matev? Pavli? wrote:
> Hi, 
> 
>  
> 
> Is it possible (i am sure it is)  to subset data from a data.frame on the 
> basis of SQL >LIKE< operator. I.e., i would like to subset a data where only 
> values which contains a string >GP< would be used? 
> 
>  
> 
> Example:
> 
>  
> 
> Gp<-subset(DF, DF$USCS like >GP<)
> 
>  
> 
> This like of course is not working, 
> 
>  
> 
> Thanks, m
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Different results of coefficients by packages penalized and glmnet

2011-05-01 Thread Andrew Robinson
Hi Yao,

I can't answer that question, but I offer the following thoughts for
your consideration. 

Generally it's best to approach the package maintainers directly with
questions like these.  You can find their contact details in the
package documentation.

Also, you will want to make sure that you provide commented, minimal,
self-contained, reproducible code.  I can't run the code below because
I don't have the data.  Try to create an example that shows your
problem using data that will be readily available to the maintainers.
Perhaps one of the packages provides a n example dataset --- that
would be best.  If not, you should write code to generate an example
dataset, or be prepared to share your own data. 

I hope that this helps,

Andrew

On Sun, May 01, 2011 at 05:01:54PM +0800, zhu yao wrote:
> Dear R users:
> 
> Recently, I learn to use penalized logistic regression. Two packages
> (penalized and glmnet) have the function of lasso.
> So I write these code. However, I got different results of coef. Can someone
> kindly explain.
> 
> # lasso using penalized
> library(penalized)
> pena.fit2<-penalized(HRLNM,penalized=~CN+NoSus,lambda1=1,model="logistic",standardize=TRUE)
> pena.fit2
> coef(pena.fit2)
> opt<-optL1(HRLNM,penalized=~CN+NoSus,fold=5)
> opt$lambda
> coef(opt$fullfit)
> prof<-profL1(HRLNM,penalized=~CN+NoSus,fold=opt$fold,steps=20)
> plot(prof$lambda, prof$cvl, type="l")
> plotpath(prof$fullfit)
> pena.fit2<-penalized(HRLNM,penalized=~CN+NoSus,lambda1=opt$lambda,model="logistic",standardize=TRUE,steps=20)
> plotpath(pena.fit2)
> pena.fit2<-penalized(HRLNM,penalized=~CN+NoSus,lambda1=opt$lambda,model="logistic",standardize=TRUE)
> coef(pena.fit2)
> 
> 
> #lasso using gamnet
> library(glmnet)
> factors<-matrix(c(CN,NoSus),ncol=2)
> colnames(factors)<-c("CN","NoSus")
> glmn.fit2<-glmnet(x=factors,y=HRLNM,family="binomial")
> cvglmnet<-cv.glmnet(x=factors,y=HRLNM,family="binomial",nfolds=5)
> plot(cvglmnet)
> cvglmnet$lambda.min
> which(cvglmnet$lambda==cvglmnet$lambda.min)
> glmn.fit2<-glmnet(x=factors,y=HRLNM,family="binomial",lambda=cvglmnet$lambda.min)
> coef(glmn.fit2)
> 
> 
> 
> Thanks a lot
> 
> btw: how to calculate the C.I. of coefs?
> 
> 
> *Yao Zhu*
> *Department of Urology
> Fudan University Shanghai Cancer Center
> Shanghai, China*
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] using tapply with multiple variables

2011-05-01 Thread Andrew Robinson
This is a nice demonstration of the formula interface to aggregate.  A
less elegant alternative is to pass lists as arguments.

with(dd, 
 aggregate(Correct, 
   by = list(Subject = Subject,
 Group = Group), 
   FUN = function(x) sum(x == 'C')))

Using a list is advantageous if you want to make the summary of more
than one variable (which does not seem to be the case, here) --- I
believe that the formula interface doesn't allow for that.  That would
be set up like this

with(dd, 
 aggregate(x = list(Correct = Correct, 
other target variables listed here, 
...), 
   by = list(Subject = Subject,
 Group = Group), 
   FUN = function(x) sum(x == 'C')))

Cheers

Andrew

On Sat, Apr 30, 2011 at 10:03:24PM -0700, Dennis Murphy wrote:
> Hi:
> 
> If you have R 2.11.x or later, one can use the formula version of aggregate():
> 
> aggregate(Correct ~ Subject + Group, data = ALLDATA, FUN = function(x)
> sum(x == 'C'))
> 
> A variety of contributed packages (plyr, data.table, doBy, sqldf and
> remix, among others) have similar capabilities.
> 
> If you want some additional summaries (e.g., percent correct), here is
> an example function for a single subject/group that aggregate() can
> use to propagate to all subgroups and subjects (I encourage you to
> play with it):
> 
> f <- function(x) {
> Correct <- sum(x == 'C')
> Percent <- round(100 * Correct/length(x), 3)
> c(Number = Correct, Percent = Percent)
>   }
> aggregate(Correct ~ Subject + Group, data = ALLDATA, FUN = f)
> 
> The particular function isn't as important as knowing you can do this
> sort of thing. Several of the contributed packages indicated above
> have similar, if not superior, capabilities, depending on the
> situation.
> 
> Toy example to test the above:
> 
> dd <- data.frame(Subject = rep(1:5, each = 100),
>   Group = rep(rep(c('C', 'T'), each = 50), 5),
>   Correct = factor(rbinom(500, 1, 0.8), labels = c('I', 'C')))
> aggregate(Correct ~ Subject + Group, data = dd, FUN = function(x) sum(x == 
> 'C'))
>Subject Group Correct
> 11 C  40
> 22 C  36
> 33 C  39
> 44 C  37
> 55 C  41
> 61 T  43
> 72 T  45
> 83 T  37
> 94 T  45
> 10   5 T  36
> aggregate(Correct ~ Subject + Group, data = dd, FUN = f)
>Subject Group Correct.Number Correct.Percent
> 11 C 40  80
> 22 C 36  72
> 33 C 39  78
> 44 C 37  74
> 55 C 41  82
> 61 T 43  86
> 72 T 45  90
> 83 T 37  74
> 94 T 45  90
> 10   5 T 36  72
> 
> HTH,
> Dennis
> 
> On Sat, Apr 30, 2011 at 12:28 PM, Kevin Burnham  wrote:
> > HI All,
> >
> > I have a long data file generated from a minimal pair test that I gave to
> > learners of Arabic before and after a phonetic training regime.  For each of
> > thirty some subjects there are 800 rows of data, from each of 400 items at
> > pre and posttest.  For each item the subject got correct, there is a 'C' in
> > the column 'Correct'.  The line:
> >
> > tapply(ALLDATA$Correct, ALLDATA$Subject, function(x)sum(x=="C"))
> >
> > gives me the sum of correct answers for each subject.
> >
> > However, I would like to have that sum separated by Time (pre or post).  Is
> > there a simple way to do that?
> >
> >
> > What if I further wish to separate by Group (T or C)?
> >
> > Thanks,
> > Kevin
> >
> >        [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and 

Re: [R] Analysis and graphics by groups

2011-04-29 Thread Andrew Robinson
hi Christiano,

the error is that FUN is not a function.  That is true, the argument
that you are passing to FUN is a different class. Instead of fx, for
example, where fx is your model code below, try to write it as a
function of the arguments that you want to split by Cerca.

You might try to construct a minimal, reproducible, commented example
to help us explain what you need to do.

I don't have the gapply function and I don't know which package it is
in (perhaps you could provide that  information next time) so I can't
help more than that.

Andrew
 
On Fri, Apr 29, 2011 at 04:31:51PM -0300, Cristiano Yuji Sasada Sato wrote:
> Hello,
> 
> This is my first post in this e-mail list and I hope it's enough to justify
> calling for help. In case it's not, sorry.
> 
> I'm trying to do analysis and graphics using a factor as a criteria to split
> data and do the analysis/graphics for each subset of data.
> 
> Right now what I'm trying to do is to fit and plot the following logistic
> model, according to a third variable named "Cerca":
> dm_fit_T<-nls(nDMTRBgm2~(K/(1+((K-nDMTRBgm2.T.1)/nDMTRBgm2.T.1)*exp(-r))),perieph,start=list(K=3,r=0.2),trace=T)
> 
> I've found a function called gapply which seems to be what I need, but it
> doesn't seem to work. This is the argument I've used:
> gapply(perieph,FUN=nls(nDMTRBgm2~(K/(1+((K-nDMTRBgm2.T.1)/nDMTRBgm2.T.1)*exp(-r))),perieph,start=list(K=3,r=0.2),trace=T),groups="Cerca")
> 
> But I get this error message returned:
> > Error in get(as.character(FUN), mode = "function", envir = envir) :
> object 'FUN' of mode 'function' was not found
> 
> Can you help me doing this non-linear regression by groups work?
> 
> Also, after I manage making the regression, I'd also need fitting a line to
> the nDMTRBgm2~nDMTRBgm2.T.1 data using the same model above. I've used
> plotfit to do that with one nlm data set. Is it possible to fit each group
> trend line and data with different colours/symbols  in one same graphic?
> 
> Thank you,
> Cristiano
> 
> -- 
> Cristiano Yuji Sasada Sato
> Doutorando
> Programa de P?s-Gradua??o em Ecologia e Evolu??o - IBRAG / UERJ
> Laborat?rio de Ecologia de Rios e C?rregos
> Departamento de Ecologia - Universidade do Estado do Rio de Janeiro
> 
>   [[alternative HTML version deleted]]
> 

> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why doesn't ifelse work ?

2011-04-28 Thread Andrew Robinson
Hi Eric,

tough to say.  Please try to provide commented, minimal,
self-contained, reproducible code.

Cheers

Andrew

On Thu, Apr 28, 2011 at 06:46:16PM -0700, eric wrote:
> I have the following lines of code:
> 
> ind <- rollapply(GSPC, 200, mean)
> signal <- ifelse(diff(ind, 5) > 0 , 1 , -1)
> signal[is.na(signal)] <- 0
> 
> I never get a value of -1 for signal even though I know diff(ind , 5) is
> less than zero frequently. It looks like when diff(ind , 5) is less than
> zero, signal gets set to 0 instead of - 1. Any ideas why ?  Here's some
> information on ind and diff(ind, 5) :
> 
> > mode(diff(ind, 5) >0)
> [1] "logical"
> > class(diff(ind, 5) >0 )
> [1] "zoo"
> > str(diff(ind, 5) > 0 )
> ???zoo??? series from 1990-05-31 to 2010-12-02
>   Data: logi [1:5171, 1] FALSE FALSE FALSE FALSE FALSE FALSE ...
>  - attr(*, "dimnames")=List of 2
>   ..$ : NULL
>   ..$ : chr "GSPC.Adjusted"
>   Index:  Date[1:5171], format: "1990-05-31" "1990-06-01" "1990-06-04"
> "1990-06-05" "1990-06-06" "1990-06-07" "1990-06-08" "1990-06-11" ...
> > class(ind)
> [1] "zoo"
> > mode(ind)
> [1] "numeric"
> > str(ind)
> ???zoo??? series from 1990-05-23 to 2010-12-02
>   Data: num [1:5176, 1] 339 339 338 338 338 ...
>  - attr(*, "dimnames")=List of 2
>   ..$ : NULL
>   ..$ : chr "GSPC.Adjusted"
>   Index:  Date[1:5176], format: "1990-05-23" "1990-05-24" "1990-05-25"
> "1990-05-29" "1990-05-30" "1990-05-31" "1990-06-01" "1990-06-04" 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/why-doesn-t-ifelse-work-tp3482680p3482680.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Undefined columns selected

2011-04-28 Thread Andrew Robinson
Jennifer,

it looks like some of the columns that you are selecting don't exist.

What is the output of str(arc) before you run the code below?

Andrew

On Thu, Apr 28, 2011 at 04:49:41PM -0700, Jennifer Wessel wrote:
> This is part of my program.  I am getting an error, that I cannot figure  
> out, any help would very much appreciated, thanks.
> 
>   # subset variables 
>   arc <- arc[,c("SNAP", "code",  "ncode", "var", "n_total")]
> Error in `[.data.frame`(arc, , c("SNAP", "code", "ncode",  : 
>   undefined columns selected
>   arc$N_eff <- with(arc, ifelse(var > 1, n_total, var * n_total))
> Error in storage.mode(test) <- "logical" : object 'var' not found
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem loading quantreg

2011-04-28 Thread Andrew Robinson
Hi Derek,

I infer from the output that you're using a Mac.  Generally including
that kind of information is useful.

Your computer lacks the requisite software to install the package.
Make is saying that it can't find 'gfortran'.   

See this page for some more insight:

http://r.research.att.com/tools/

I hope that this helps,

Andrew

On Thu, Apr 28, 2011 at 03:15:41PM -0700, derekverley wrote:
> Hi all,
> 
> I'm trying to load the quantreg package but keep running into problems no
> matter which method I have tried.  Does anybody know what this error (below)
> means in plain language and what I might do to get this installed.  I have
> not had problems downloading/installing/running packages in the past.
> 
> thanks in advance,
> 
> derek
> 
> Begin R output:
> 
> trying URL 'http://cran.stat.ucla.edu/src/contrib/quantreg_4.65.tar.gz'
> Content type 'application/x-tar' length 2366176 bytes (2.3 Mb)
> opened URL
> ==
> downloaded 2.3 Mb
> 
> * installing *source* package ???quantreg??? ...
> ** libs
> *** arch - i386
> gfortran -arch i386   -fPIC  -g -O2 -c akj.f -o akj.o
> make: gfortran: No such file or directory
> make: *** [akj.o] Error 1
> ERROR: compilation failed for package ???quantreg???
> * removing
> ???/Library/Frameworks/R.framework/Versions/2.12/Resources/library/quantreg???
> 
> The downloaded packages are in
> 
> ???/private/var/folders/4m/4mHSx7JtHL88Cl-gi19+Zk+++TI/-Tmp-/RtmpApfBjX/downloaded_packages???
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Problem-loading-quantreg-tp3482397p3482397.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Generating a best fit line for non linear data

2011-04-28 Thread Andrew Robinson
I think that you probably need to provide the x values to nls. Try,
for example,


fit <- nls(species ~ a *(1 - exp(-b*samples)),start = list(a = 27, b = .15))


I hope that this helps,

Andrew

On Thu, Apr 28, 2011 at 01:56:43PM -0700, BornSurvivor wrote:
> I have the following data set, and I have to find the line of best fit using
> this equation, 
> y = a*(1 - exp(-b*x)).
> 
> samples = seq(1,20,by=1)
> species = c(5,8,9,9,11,11,12,15,17,19,20,20,21,23,23,25,25,27,27,27)
> plot(samples,species, main = "Accumulation Curve for Tree Species Richness",
> xlab = "Samples", ylab = "Number of Species")
> curve((y = 27*(1 - exp(-.15*x))),from=0,to=20,add = T)
> 
> I tried to find the best fit curve using:
> 
> fit = nls(species ~ a *(1 - exp(-b*x)),start = list(a = 27, b = .15)
> 
> but I get a "parameters without starting value in 'data': x" and I don't
> have any idea what this means, or how to fix the above code.  
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Generating-a-best-fit-line-for-non-linear-data-tp3482193p3482193.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem installing package "sp" in R 2.13.0

2011-04-28 Thread Andrew Robinson
Hi Arnaud,

the error is telling you that you don't have the "make" command.  This
might be because you haven't installed the necessary software to
compile R packages.  I suggest that you check the FAQ for Macintosh to
see how to do that.

Best wishes

Andrew

On Thu, Apr 28, 2011 at 01:30:58PM +0200, Arnaud Catherine wrote:
> Hi,
> 
> I am having troubles trying to install package "sp" in R (2.13.0) on mac OSX.
> I have tried installing the package using GUi or function install.packages 
> but it didn't work.
> 
> Here is the error message I get:
> 
> 
> also installing the dependency ?rgdal?
> 
> trying URL 'http://cran.univ-lyon1.fr/src/contrib/rgdal_0.6-33.tar.gz'
> Content type 'application/x-gzip' length 1422992 bytes (1.4 Mb)
> opened URL
> ==
> downloaded 1.4 Mb
> 
> trying URL 'http://cran.univ-lyon1.fr/src/contrib/sp_0.9-80.tar.gz'
> Content type 'application/x-gzip' length 738569 bytes (721 Kb)
> opened URL
> ==
> downloaded 721 Kb
> 
> * installing *source* package ?sp? ...
> ** libs
> *** arch - i386
> sh: make: command not found
> ERROR: compilation failed for package ?sp?
> * removing 
> ?/Library/Frameworks/R.framework/Versions/2.13/Resources/library/sp?
> ERROR: dependency ?sp? is not available for package ?rgdal?
> 
> The downloaded packages are in
>   
> ?/private/var/folders/8P/8P9oV0FHFI83GKIm2cPUOk+++TM/-Tmp-/RtmppsxaRa/downloaded_packages?
> * removing 
> ?/Library/Frameworks/R.framework/Versions/2.13/Resources/library/rgdal?
> 
> 
> 
> Any help would be much appreciated!
> 
> 
> Best regards.
> 
> 
> 
> 
> 
> Dr. Arnaud CATHERINE
> Post-Doctorant
> 
> UMR 7245 CNRS/MNHN "Mol?cules de Communication et Adaptation des 
> Micro-organismes"
> Equipe "Cyanobact?ries, Cyanotoxines et Environnement"
> Mus?um National d'Histoire Naturelle
> 12, rue Buffon , Case 39
> 75231 Paris Cedex 05
> 
> Tel : + 33 (0)1 40 79 31 79
> Fax : +33 (0)1 40 79 35 94
> Email : arno...@mnhn.fr
> Site du Mus?um National d'Histoire Naturelle : http://www.mnhn.fr
> 
> 
> 
> 
> 
> 
> 
> 
>   [[alternative HTML version deleted]]
> 

> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Variance

2011-04-28 Thread Andrew Robinson
A couple of points here 

First, note that q doesn't increment in the code below.  So, you're
getting the same variance each time.

Second, note that (t$Rec1==input3 & t$Rec2==input4) evaluates to F?T
or 0/1, and it's not clear from your code if that is what you intend.

Finally, it's much easier to work with commented, minimal,
self-contained, reproducible code.  Please consider submitting that
with future questions.

I hope that this helps,

Andrew. 


On Thu, Apr 28, 2011 at 05:58:24PM -0400, Dat Mai wrote:
> I'm trying to find the variance of various outputs in a matrix:
> 
> for(l in 2:vl){
>   for(o in 1:(l-1)){
> 
> # Make sure the inputs are for the matrix "m"
> input3=rownames(v)[o]
> input4=colnames(v)[l]
> 
> r=t[(t$Rec1==input3 & t$Rec2==input4),output]
> 
> if(length(r)==0){
>   r=t[(t$Rec1==input4 & t$Rec2==input3),output]
> }
> 
> v[l,o]=var(q,na.rm=TRUE)
> v[o,l]=var(q,na.rm=TRUE)
> v[l,l]=var(q,na.rm=TRUE)
> 
>   }
> }
> 
> Each output will yield multiple results, since each input length varies.
> I'm not sure if this is the right way to go about finding the variance of
> each pair, but this is what I've done.
> The main issue I have with this now is that the results in every box in the
> matrix yield the same exact number, even though that most likely shouldn't
> happen.
> 
> The question is: "How would I find the variance of each pair of inputs?"
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Off-topic: looking for a categorical, NMAR dataset

2009-09-30 Thread Andrew Robinson
Dear Colleagues,

apologies for this off-topic posting.

A Ph.D. student here at U of Melb. is trying to find a dataset to use
to demonstrate a technique that he is developing.  He needs a binary
response and ideally a categorical predictor, although the latter can
of course be induced from a continuous predictor.  The data should
also have missing values (ideally, NMAR, not missing at random) in the
response and in the predictor.

Of course we could generate such a dataset but it would be preferable
to use a dataset in which handling the missingness is an integral part
of the analysis.

The data set could be in any discipline, ideally already published.

If you have any suggestions, please respond directly to Ken at

Kheang Ken Lim 

Thanks!

Cheers

Andrew

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help: what are the basis functions in {mgcv}: gam?

2009-03-23 Thread Andrew Robinson
Try

 Wood S.N. (2006) Generalized Additive Models: An Introduction
 with R. Chapman and Hall/CRC Press.

listed in the references in the help file of the function.

It's a great read.

Andrew

On Mon, Mar 23, 2009 at 07:36:44PM -0700, oliviax wrote:
> 
> I am writing my thesis with the function gam(), with the package {mgcv}. 
> 
> My command is: gam(y~s(x1,bs="cr")+s(x2, bs="cr")). 
> 
> I need help to know what are the default basis funcitons for gam. I have not
> found any detailed reference for this. 
> 
> Can anyone help me with this?? 
> -- 
> View this message in context: 
> http://www.nabble.com/help%3A-what-are-the-basis-functions-in-%7Bmgcv%7D%3A-gam--tp22673295p22673295.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia   (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr  Fax: +61-3-8344-4599
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with subset() function?

2009-01-20 Thread Andrew Robinson
Steven,

check the class of the objects that you are creating.

Cheers,

Andrew

On Wed, January 21, 2009 10:02 am, Steven McKinney wrote:
> Hi all,
>
> Can anyone explain why the following use of
> the subset() function produces a different
> outcome than the use of the "[" extractor?
>
> The subset() function as used in
>
>  density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age)))
>
> appears to me from documentation to be equivalent to
>
>  density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"])
>
> (modulo exclusion of NAs) but use of the former yields an
> error from density.default() (shown below).
>
>
> Is this a bug in the subset() machinery?  Or is it
> a documentation issue for the subset() function
> documentation or density() documentation?
>
> I'm seeing issues such as this with newcomers to R
> who initially seem to prefer using subset() instead
> of the bracket extractor.  At this point these functions
> are clearly not exchangeable.  Should code be patched
> so that they are, or documentation amended to show
> when use of subset() is not appropriate?
>
>> ### Bug in subset()?
>
>> set.seed(123)
>> mydf <- data.frame(ht = 150 + 10 * rnorm(100),
> +wt = 150 + 10 * rnorm(100),
> +age = sample(20:60, size = 100, replace = TRUE)
> +)
>
>
>> density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age)))
> Error in density.default(subset(mydf, ht >= 150 & wt <= 150, select =
> c(age))) :
>   argument 'x' must be numeric
>
>
>> density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"])
>
> Call:
>   density.default(x = mydf[mydf$ht >= 150 & mydf$wt <= 150, "age"])
>
> Data: mydf[mydf$ht >= 150 & mydf$wt <= 150, "age"] (29 obs.); Bandwidth
> 'bw' = 5.816
>
>xy
>  Min.   : 4.553   Min.   :3.781e-05
>  1st Qu.:22.776   1st Qu.:3.108e-03
>  Median :41.000   Median :1.775e-02
>  Mean   :41.000   Mean   :1.370e-02
>  3rd Qu.:59.224   3rd Qu.:2.128e-02
>  Max.   :77.447   Max.   :2.665e-02
>
>
>> sessionInfo()
> R version 2.8.0 Patched (2008-11-06 r46845)
> powerpc-apple-darwin9.5.0
>
> locale:
> C
>
> attached base packages:
> [1] stats graphics  grDevices datasets  utils methods   base
>
> loaded via a namespace (and not attached):
> [1] Matrix_0.999375-16 grid_2.8.0 lattice_0.17-15
> lme4_0.99875-9
> [5] nlme_3.1-89
>>
>
>
>
>
>
>
> Steven McKinney
>
> Statistician
> Molecular Oncology and Breast Cancer Program
> British Columbia Cancer Research Centre
>
> email: smckinney +at+ bccrc +dot+ ca
>
> tel: 604-675-8000 x7561
>
> BCCRC
> Molecular Oncology
> 675 West 10th Ave, Floor 4
> Vancouver B.C.
> V5Z 1L3
> Canada
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Andrew Robinson
Senior Lecturer in Statistics   Tel: +61-3-8344-6410
Department of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.auWebsite: http://www.ms.unimelb.edu.au

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] p(H0|data) for lm/lmer-objects R

2008-12-25 Thread Andrew Robinson
Dear Leo,

> Dear R-List,
> 
> I am interested in the Bayesian view on parameter estimation for multilevel
> models and ordinary regression models. 

You might find Gelman & Hill's recent book to be good reading, and
there is a book in the Use-R series that focuses on using R to perform
Bayesian analyses.  

> AFAIU traditional frequentist p-values they give information about
> p(data_or_extreme|H0).  AFAIU it further, p-values in the Fisherian
> sense are also no alpha/type I errors and therefor give no
> information about future replications.

I don't think that the last comment is necessarily relevant nor is it
necessarily true.

> However, p(data_or_extreme|H0) is not really interesting for social science
> research questions (psychology). Much more interesting is
> p(H0|data). 

That's fine, but first you have to believe that the statement has
meaning.

> Is there a way or formula to calculate these probabilities of the H0
> (or another hypothesis) from lm-/lmer objects in R?

See the books above.  Note that in order to do so, you will need to
nominate a prior distribution of some kind.

> Yes I know that multi-level modeling as well as regression can be done in a
> purely Bayesian way. However, I am not capable of Bayesian statistics,
> therefor I ask that question. I am starting to learn it a little bit.

No offense, but it sounds to me like you want to have the Bayesian
omelette without breaking the Bayesian eggs ;).  Certain kinds of
multi-level models are mathematically identical to certain kinds of
Empirical Bayes models, but that does not make them Bayesian (despite
what some people say).  I caution against your implied goal of
obtaining Bayesian statistics without performing a Bayesian analysis.
 
Good luck,

Andrew

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Functions in R like lincom and nlcom of Stata

2008-12-13 Thread Andrew Robinson

estimable in the gmodels package provides point estimates, standard
errors and confidence intervals for arbitrary linear combinations of
model parameters.  I don't know for non-linear combinations, though.

Cheers

Andrew

On Sat, Dec 13, 2008 at 11:33:12PM -0600, Stas Kolenikov wrote:
> Those commands provide point estimates, standard errors and confidence
> intervals based on linear combination of parameters or
> linearization/delta-method, respectively. R's contrasts appear to be
> limited to a single factor and combinations that sum up to zero.
> 
> I am too so used to this Stata's concept, I now think it's odd R does
> not seem to have it readily identifiable in two-three search commands.
> And I would not believe R does not have this functionality, it must be
> hiding somewhere! :))
> 
> On 12/13/08, David Winsemius  wrote:
> >
> >  On Dec 12, 2008, at 11:14 AM, Marc Mar? Dell'Olmo wrote:
> >
> >
> > > Hello all,
> > >
> > > Does anyone know if there exists any function in R that resembles the
> > > "lincom" and "nlcom" of STATA?. These functions computes point
> > > estimates, standard errors, significance levels, confidence intervals,
> > > etc. for linear and non linear combinations of previous estimated
> > > parameters. Down here you've got links to descriptions of the
> > > functions of STATA
> > >
> > > nlcom:
> > > http://www.stata.com/help.cgi?nlcom
> > > lincom:
> > > http://www.stata.com/help.cgi?lincom
> > >
> >
> >  I did not find a description of the mathematical operations that let me
> > understand exactly what lincom is doing, but suspect that you should be
> > looking at how R handles contrasts. The help pages reference ch 2 of
> > "Statistical Models in S". The search at the console prompt would be:
> >
> >  ?C
> >  ?contrasts
> >  ?se.contrast
> >  ?model.tables
> >
> 
> -- 
> Stas Kolenikov, also found at http://stas.kolenikov.name
> Small print: I use this email account for mailing lists only.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mixed model nested ANOVA

2008-12-12 Thread Andrew Robinson
Hi Sebpe,

the analysis of the data that you describe could be a complex and
lengthy process, in which decisions that you are confronted by are
affected by previous decisions that you have made.  I recommend
obtaining the assistance of a statistician, preferably a local one
whose door you can knock on.  If you are unable to do so then I
suggest that you borrow/buy a copy of the Pinheiro and Bates book,
which documents lme() and its friends, and study it carefully,
especially the worked examples.

Good luck!

Andrew

On Fri, Dec 12, 2008 at 03:13:06PM +, Sebpe De Smedt wrote:
> Hi,
> 
> 
> I'm working on leaf characteristics of trees. Each tree is characterised by
> about 10 leaf traits.
> The trees were sampled at 9 different locations (about 20 to 30
> trees/location, NOT balanced), grouped in 3 different climatic zones
> (Sahelian, Soudanian and Guinean) (NOT balanced).
> Further, each tree is characterised by some degree of human pressure
> (mutilation degree), in total 4 different degrees were defined (NOT
> balanced).
> 
> In the dryer zones, the trees are under a much higher human pressure than in
> the more humid climatic zones, "zone" and "mutilation degree" are thus
> strongly correlated.
> 
> I want to know how "zones" (fixed effects, climate interests me) and
> "locations" (nested in "zones", random effects, location doesn't interests
> me) are influencing the leaf traits (say for example "SLA"). Further, also
> human pressure is affecting leaf traits so I want to characterise the
> influence of "mutilation degree" (fixed effects) on "SLA".
> 
> I found some interesting information, but still, I am not be able to analyse
> the data properly. I think I have to use the function lme() or lme().
> 
> Can anyone tell me which function and command I have to use? And how I can
> produce an ANOVA table?
> 
> 
> Thanks in advance,
> Sebpe De Smedt
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Growth rate determination using ANCOVA

2008-11-23 Thread Andrew Robinson
Hi David,

On Fri, Nov 21, 2008 at 12:01:52PM -0800, dschruth wrote:
> I'm a programmer in a biology lab who is starting to use R to automate
> some of our statistical analysis of growth rate determination. But I'm
> running into some problems as I re-code.
> 
> 1) Hypotheses concerning Slope similarity/difference:
>I'm using R's anova(lm()) methods to analyse a model which looks
> like this:
>   growth.metric ~ time * test.tube
>I understand that testing the the interaction between time and tube
> (time:test.tube) will tell us if the growth rates (for the last three
> test tubes) are significantly different from one another (Ho=slopes
> are the same).  The purpose of doing this test is so that we can be
> certain our cultures have fully acclimated to the treatment and aren't
> going to change much if we stop measuring. This is an important cost
> saving practice too as measurements can go on for years.   Yet I'm
> worried that our null and alternative hypotheses should be swapped so
> that our test is more conservative (Ho=slopes are different ... ie
> still acclimating.)

Good thinking.

> Is there a way to specify my model that flips these hypotheses?
> Should I be using a different method?  Is this even appropriate?

You could think about equivalence tests.  See e.g. references in the
equivalence package. 
 
> 2) Growth Rate is confounded with Variance of Growth Rate
>I'm also worried about the fact that rates for cultures with faster
> growth are calculated using fewer data points (assuming similar
> sampling times between treatments) .  The result is that growth ~ var
> (growth).   Not only does this put a wrinkle in my analysis between
> treatments, but it also biases the growth acclimation determining
> ANCOVA test above.  Faster growing cultures will usually pass the "no
> significant difference between slopes test" more easily because there
> are fewer points from which to be certain about rejecting Ho.
> 
> Is there a way to control for this?
> Perhaps I could include the number of points in my model?

Depending on the model that you apply, you might be able to explicitly
model the variance to allow for this possibility.  I would guess that
it's not necessarily only the fewer data points contributing to the
greater variation.  Faster-growing cultures might also be inherently
more variable.
 
> 3) Statistical validity of using subsets of growth.metric measurements
> within a test tube
>There are some lab members who insist that we can throw out the
> beginning and end of our log transformed growth.metric measurements
> because they are outliers in determining maximum growth.I've
> proposed looping through all possible combinations of 3 or more points
> within the growth curve and using the highest or best fitting (best R-
> squared) slope.  But this idea has been rejected by our PI as not be a
> valid thing to do.
> 
> Ideas here?

I'm feeling very cautious about giving advice on this question as I
don't know enough about the area.  Sorry.

I hope that this helps, otherwise.

Andrew
-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Spatial ANCOVA in R

2008-11-15 Thread Andrew Robinson
Hi Camilo,

try gls() in the nlme package.

Andrew

On Sat, Nov 15, 2008 at 11:34:57PM -0400, Camilo Mora wrote:
> Hi:
> 
> Does anyone know if it is possible to run an ANCOVA in R while accounting or
> controlling for spatial autocorrelation? I have found usefull information into
> how to account for spatial autocorrelaion in regression models but not much
> into how to deal with the problem in an ANCOVA.
> 
> Thanks,
> 
> Camilo
> 
> Camilo Mora, Ph.D.
> SCRIPPS Institute of Oceanography
> University of California San Diego
> San Diego, USA
> Phone: (858) 822 1642
> http://cmbc.ucsd.edu/People/Faculty_and_Researchers/mora/
> And
> Department of Biology
> Dalhouisie University
> Halifax, Canada
> Phone: (902) 494 3910
> http://as01.ucis.dal.ca/fmap/people.php?pid=53
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [Stat related] Understanding Portmanteau test

2008-11-08 Thread Andrew Robinson
On Sat, Nov 08, 2008 at 09:58:14PM -0800, RON70 wrote:
> 
> Still waiting for some input. Did my question void forum rule in any manner?

No, no specific rule.  It's just not a particularly easy question to
answer, because it's not clear in what context you have seen the
phrase "portmanteau test".  Any given answer might be completely
irrelevant and useless to you.  
 
> > Sorry to be off-topic. Can somebody please explain me what is Portmanteau
> > test? Why it's name is like that? When I would say, a particular test is
> > portmanteau test? I did some googling but got no satisfactory answer at
> > all. Please anybody help for understanding that?

For what it's worth, the Cambridge Dictionary of Statistics defines a
"portmanteau test" as a test for assessing the fit of models for time
series in the presence of outliers.  It provides a citation:
Computational Statistics, 1994, 9, 301-310.

I hope that helps.

Andrew

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] perl expression question

2008-09-22 Thread Andrew Robinson
Hi Mark,

do you mean the regex to get the portion of the address after the
final slash?  Something like

gsub(".*/([^/]*$)", "\\1", stock, fixed=FALSE)

Cheers

Andrew

On Mon, Sep 22, 2008 at 07:29:25PM -0500, [EMAIL PROTECTED] wrote:
> If I have the string below. does someone know a regular expression to 
> just get the "BLC.NYSE". I bought the O'Reilley
> book and read it when I can  and I study the solutions on the list but 
> I'm still not self sufficient with these things. Thanks.
> 
> 
> stock<-"/opt/limsrv/mark/research/equity/projects/testDL/stock_data/fhdb/US/BLC.NYSE"
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Deleting multiple variables

2008-09-22 Thread Andrew Robinson
Mike,

how about

M_UC <- M_UC[,-(myvars[1]:myvars[2])]

?

Andrew

On Mon, Sep 22, 2008 at 11:04:34PM +0100, Michael Pearmain wrote:
> Hi All,
> i have searched the web for a simple solution but have been unable to find
> one.  Can anyone recommend a neat way of deleting multiple variable?
> I see, i need to use dataframe$VAR<-NULL to get rid of one variable,
> In my situation i need to delete all vars between two points.
> 
> I've used the 'which' function to find these out and have assigned to myvar
> >myvars
> [1]  2 17
> 
> but i can't figure out how i should apply this?
> 
> Should i loop through the values? (Psydo code below?)
> 
> for (x in c(myvars[1]:myvars[2]))
> (M_UC$x<-NULL))
> 
> Any help gratful
> 
> Mike
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] removing a word, the following space and the next word

2008-09-19 Thread Andrew Robinson
Hi Bob,

I recommend doing some background reading on regular expressions[1]
and using gsub().

Cheers

Andrew

[1] http://en.wikipedia.org/wiki/Regular_expressions

On Sat, Sep 20, 2008 at 04:24:36PM +1000, Bob Green wrote:
> 
> >Hello,
> 
> I am hoping for advice as to how I could remove all words immediately 
> following the words 'Mr' or 'Mr.' in  a csv file. For example, the 
> following phrases are included in lines of text (along with other Mr) 
> that could be anywhere in the file: "Mr Jones ate lunch" and "Mr 
> Smith was tied".
> 
> 
> I want to remove the words Jones and Smith (etc) leaving the other 
> text intact.
> 
> Any suggestions are appreciated,
> 
> regards
> 
> 
> Bob
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave and/or beamer issue

2008-09-06 Thread Andrew Robinson
Hi Michael,

I think that beamer needs that frames that contain any verbatim text
or R output to be declared as fragile.

Try

\begin{frame}[fragile]

...

\end{frame}

On Sat, Sep 06, 2008 at 06:22:41PM -0400, Michael Kubovy wrote:
> Dear Friends,
> 
> I not sure whether this is an Sweave or a beamer problem.
> 
> The Rnw file:
> 
> \documentclass[compress,smaller]{beamer}
> %\documentclass{article}
> %\usepackage{beamerarticle}
> 
> \usepackage{Sweave}
> 
> \title{Psychophysics II}
> \date{September 9, 2008}
> 
> \begin{document}
> 
> \frame{
> \begin{Schunk}
> \begin{Sinput}
>  > ro <- 0.2
>  > c <- seq(from = -3, to = 4, by = 0.1)
>  > fn <- 1 - pnorm(c)
>  > fo <- 1 - pnorm(c, mean = 1)
>  > h <- fo + ro - ro * fo
>  > f <- fn
>  > plot(h ~ f, type = "l", asp = 1, xlim = c(0.01, 1.01), ylim =  
> c(0.01, 1.01), las = 1, xlab = "p(yes | old)",
> + ylab = "p(yes | new)", main = "Dual-process model,  
> p(recollection) = 0.2")
> \end{Sinput}
> \end{Schunk}
> \includegraphics{20080909test-model}
> 
> }
> 
> \end{document}
> 
> The resulting LaTeX
> ***FAILS: *
> 
> Runaway argument?
>  > ro <- 0.2 > c <- seq(from = -3, to = 4, by = 0.1) > fn <- 1 - pnorm 
> \ETC.
> ! Paragraph ended before [EMAIL PROTECTED] was complete.
> 
> \par
> l.27 }
> 
> ?
> 
> **
> But when the Rnw file starts:
> **
> %\documentclass[compress,smaller]{beamer}
> \documentclass{article}
> \usepackage{beamerarticle}
> 
> ***IT DOES NOT FAIL: *
> 
> _
> Professor Michael Kubovy
> University of Virginia
> Department of Psychology
> USPS: P.O.Box 400400Charlottesville, VA 22904-4400
> Parcels:Room 102Gilmer Hall
>  McCormick RoadCharlottesville, VA 22903
> Office:B011+1-434-982-4729
> Lab:B019+1-434-982-4751
> Fax:+1-434-982-4766
> WWW:http://www.people.virginia.edu/~mk9y/
> 
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] random error with lme for repeated measures anova

2008-08-27 Thread Andrew Robinson
Hi JP,

I suggest that you read the literature cited in the help file, most
particularly 

 Pinheiro, J.C., and Bates, D.M. (2000) "Mixed-Effects Models in S
 and S-PLUS", Springer.  

MASS (Modern Applied Statistics in S) also has some useful things in
it.

Cheers

Andrew


On Wed, Aug 27, 2008 at 03:23:58AM -0700, Jean-Pierre Bresciani wrote:
> 
> Hi,
> 
> what is the appropriate syntax to get the random error correct when
> performing repeated measures anova with 'lme'.
> 
> let's say i have 3 independent variables, with 'aov', i would write
> something like: aov(dep_var~(indep_var1*indep_var2*indep_var3) +
> Error(subject/(indep_var1*indep_var2*indep_var3)).
> 
> With 'lme' however, i can't find the right formula. i tried things like:
> lme(dep_var~(indep_var1*indep_var2*indep_var3), random = ~1|subject) or
> nesting my independent variables in 'subject', but those are obviously wrong
> with my design.
> 
> i'm quite clueless (and i haven't found any convincing piece of information
> about how to correctly use 'lme' or 'lmer'). So, any advice along that line
> is more than welcome.
> 
> JP 
> -- 
> View this message in context: 
> http://www.nabble.com/random-error-with-lme-for-repeated-measures-anova-tp19178239p19178239.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Comparison of demographics between 2 study samples

2008-08-14 Thread Andrew Robinson
On Thu, Aug 14, 2008 at 07:46:41PM +1000, Jim Lemon wrote:
> On Wed, 2008-08-13 at 19:14 -0700, Mark Home wrote:
> > Dear All:
> > 
> > I have a clinical study where I would like to compare the demographic 
> > information for 2 samples in a study.  The demographics include both 
> > categorical and continuous variables.  I would like to be able to say 
> > whether the demographics are significantly different or not.
> > 
> > The majority of papers that I have read use multiple techniques to achieve 
> > this (e.g., t-test for the continuous variables and either Fischer exact or 
> > Chi-square for categorical).  I wonder whether this might lead to spurious 
> > differences due to multiple significance tests.  Is there a better way to 
> > do this?
> > 
> Hi Mark,
> Most of these comparisons are uncorrected, as the aim is to demonstrate
> that the samples come from the same population. Therefore, you aren't
> worried about making a Type I error, but ignoring a sampling difference
> that might bias your results.
> 
> Jim

Hi Mark,

just following up on Jim's point, if your goal is to demonstrate that
the samples come from the same population then you probably should
take a look at equivalence testing.

Andrew
-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Decomposing tests of interaction terms in mixed-effects models

2008-08-04 Thread Andrew Robinson
On Mon, Aug 04, 2008 at 02:51:48PM +0200, Peter Dalgaard wrote:
> Andrew Robinson wrote:
> >On Mon, Aug 04, 2008 at 10:17:38AM +0200, Peter Dalgaard wrote:
> >  
> >>Andrew Robinson wrote:
> >>
> >
> >That is a neat idea, thanks, Peter, but it doesn't quite fit the bill.
> >The summary provides t-tests but I am hoping to find F-tests,
> >otherwise I'm not sure how to efficiently test A (3 levels) at the two
> >levels of C.  
> >
> >The anova.lme function doesn't help, sadly:
> >
> >  
> >>anova(lme(Y ~ M2 - 1, random = ~ 1 | Block, data = example))
> >>
> >   numDF denDF F-value p-value
> >M2 625 23.0198  <.0001
> >
> >so I'm still flummoxed!
> >
> >Andrew
> >  
> You do have to peek into M2 to know that the test is for whether the 
> last two coefs are zero, but how about
> 
> > M3 <- M2[,2:4]
> > M4 <- M2[,5:6]
> > anova(lme(Y ~ M3+M4, random = ~ 1 | Block, data = example))
>numDF denDF  F-value p-value
> (Intercept) 125 10.66186  0.0032
> M3  325 55.31464  <.0001
> M4  225  1.27591  0.2967

Marvelous, many thanks, Peter.

> Also, check out estimable() in the gmodels package.

Will do.  Actually had done, but will do again.

Cheers

Andrew

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Decomposing tests of interaction terms in mixed-effects models

2008-08-04 Thread Andrew Robinson
On Mon, Aug 04, 2008 at 10:17:38AM +0200, Peter Dalgaard wrote:
> Andrew Robinson wrote:
> >Dear R colleagues,
> >
> >a friend and I are trying to develop a modest workflow for the problem
> >of decomposing tests of higher-order terms into interpretable sets of
> >tests of lower order terms with conditioning.
> >
> >For example, if the interaction between A (3 levels) and C (2 levels)
> >is significant, it may be of interest to ask whether or not A is
> >significant at level 1 of C and level 2 of C.
> >
> >The textbook response seems to be to subset the data and perform the
> >tests on the two subsets, but using the RSS and DF from the original
> >fit.  
> >
> >We're trying to answer the question using new explanatory variables.
> >This approach (seems to) work just fine in aov, but not lme.  
> >
> >For example,
> >
> >##
> >
> ># Build the example dataset
> >
> >set.seed(101)
> >
> >Block <- gl(6, 6, 36, lab = paste("B", 1:6, sep = ""))
> >A <- gl(3, 2, 36, lab = paste("A", 1:3, sep = ""))
> >C <- gl(2, 1, 36, lab = paste("C", 1:2, sep = ""))
> >example <- data.frame(Block, A, C) 
> >example$Y <- rnorm(36)*2 + as.numeric(example$A)*as.numeric(example$C)^2 + 
> >3 * rep(rnorm(6), each=6)
> >
> >with(example, interaction.plot(A, C, Y)) 
> >
> ># lme 
> >
> >require(nlme) 
> >anova(lme(Y~A*C, random = ~1|Block, data = example)) 
> >
> >summary(aov(Y ~ Error(Block) + A*C, data = example))
> >
> ># Significant 2-way interaction.  Now we would like to test the effect
> ># of A at C1 and the effect of A at C2.  Construct two new variables
> ># that decompose the interaction, so an LRT is possible.
> >
> >example$AC <- example$AC1 <- example$AC2 <- interaction(example$A, 
> >example$C) 
> >example$AC1[example$C == "C1"] <- "A1.C1"  # A is constant at C1
> >example$AC2[example$C == "C2"] <- "A1.C2"  # A is constant at C2
> >
> ># lme fails (as does lmer)
> >
> >lme(Y ~ AC1 + AC, random = ~ 1 | Block, data = example)
> >
> ># but aov seems just fine.
> >
> >summary(aov(Y ~ Error(Block) + AC1 + AC, data = example)) 
> >
> >## AC was not significant, so A is not significant at C1
> >
> >summary(aov(Y ~ Error(Block) + AC2 + AC, data = example)) 
> >
> >## AC was significant, so A is significant at C2
> >
> >## Some experimentation suggests that lme doesn't like the 'partial
> >## confounding' approach that we are using, rather than the variables
> >## that we have constructed.
> >
> >lme(Y ~ AC, random = ~ 1 | Block, data = example)
> >lme(Y ~ A + AC, random = ~ 1 | Block, data = example)
> >
> >##
> >
> >Are we doing anything obviously wrong?   Is there another approach to
> >the goal that we are trying to achieve?
> >  
> This looks like a generic issue with lme/lmer, in that they are not 
> happy with rank deficiencies in the design matrix.
> 
> Here's a "dirty" trick which you are not really supposed to know about 
> because it is hidden inside the "stats" namespace:
> 
> M <- model.matrix(~AC1+AC, data=example)
> M2 <- stats:::Thin.col(M)
> summary(lme(Y ~ M2 - 1, random = ~ 1 | Block, data = example)
> 
> (Thin.col(), Thin.row(), and Rank() are support functions for 
> anova.mlm() et al., but maybe we should document them and put them out 
> in the open.)

That is a neat idea, thanks, Peter, but it doesn't quite fit the bill.
The summary provides t-tests but I am hoping to find F-tests,
otherwise I'm not sure how to efficiently test A (3 levels) at the two
levels of C.  

The anova.lme function doesn't help, sadly:

> anova(lme(Y ~ M2 - 1, random = ~ 1 | Block, data = example))
   numDF denDF F-value p-value
M2 625 23.0198  <.0001

so I'm still flummoxed!

Andrew
-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Decomposing tests of interaction terms in mixed-effects models

2008-08-03 Thread Andrew Robinson
Dear R colleagues,

a friend and I are trying to develop a modest workflow for the problem
of decomposing tests of higher-order terms into interpretable sets of
tests of lower order terms with conditioning.

For example, if the interaction between A (3 levels) and C (2 levels)
is significant, it may be of interest to ask whether or not A is
significant at level 1 of C and level 2 of C.

The textbook response seems to be to subset the data and perform the
tests on the two subsets, but using the RSS and DF from the original
fit.  

We're trying to answer the question using new explanatory variables.
This approach (seems to) work just fine in aov, but not lme.  

For example,

##

# Build the example dataset

set.seed(101)

Block <- gl(6, 6, 36, lab = paste("B", 1:6, sep = ""))
A <- gl(3, 2, 36, lab = paste("A", 1:3, sep = ""))
C <- gl(2, 1, 36, lab = paste("C", 1:2, sep = ""))
example <- data.frame(Block, A, C) 
example$Y <- rnorm(36)*2 + as.numeric(example$A)*as.numeric(example$C)^2 + 
3 * rep(rnorm(6), each=6)

with(example, interaction.plot(A, C, Y)) 

# lme 

require(nlme) 
anova(lme(Y~A*C, random = ~1|Block, data = example)) 

summary(aov(Y ~ Error(Block) + A*C, data = example))

# Significant 2-way interaction.  Now we would like to test the effect
# of A at C1 and the effect of A at C2.  Construct two new variables
# that decompose the interaction, so an LRT is possible.

example$AC <- example$AC1 <- example$AC2 <- interaction(example$A, example$C) 

example$AC1[example$C == "C1"] <- "A1.C1"  # A is constant at C1
example$AC2[example$C == "C2"] <- "A1.C2"  # A is constant at C2

# lme fails (as does lmer)

lme(Y ~ AC1 + AC, random = ~ 1 | Block, data = example)

# but aov seems just fine.

summary(aov(Y ~ Error(Block) + AC1 + AC, data = example)) 

## AC was not significant, so A is not significant at C1

summary(aov(Y ~ Error(Block) + AC2 + AC, data = example)) 

## AC was significant, so A is significant at C2

## Some experimentation suggests that lme doesn't like the 'partial
## confounding' approach that we are using, rather than the variables
## that we have constructed.

lme(Y ~ AC, random = ~ 1 | Block, data = example)
lme(Y ~ A + AC, random = ~ 1 | Block, data = example)

######

Are we doing anything obviously wrong?   Is there another approach to
the goal that we are trying to achieve?

Many thanks,

Andrew

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem with NA and if

2008-07-03 Thread Andrew Robinson
Hi Keld

you should read ?sum.

sum(c(1,2,NA), na.rm=TRUE)

Cheers

Andrew

On Fri, Jul 04, 2008 at 08:29:34AM +0200, Keld J?rn Simonsen wrote:
> Hi 
> 
> I would like to sum a number of time series, some of them having NA's
> 
> Standard action is here that if I sum a value with a NA, then the result
> is NA. I would like it to just keep the value.
> 
> I then try to:
> 
>  a = NA; if (a == NA) { a = 0}
> 
> just to try it out, but it says
> 
> Error in if (a == NA) { : missing value where TRUE/FALSE needed
> 
> What is wrong, and can I do it smarter? I looked at na.action but I
> don't see how it affects addition of vectors, nor time series.
> 
> Best regards
> keld
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problems with lm()

2008-06-19 Thread Andrew Robinson
In your data, subject is nested within sequence.  Was that your
intention?

> a<-c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14)
> b<-c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2)
> c<-c(2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2)
> d<-c(2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1)
> e<-c(1739,1633,1481,1837,1780,2073,1374,1629,1555,1385,1756,1522,1566,1643,
+ 1939,1615,1475,1759,1388,1483,1127,1682,1542,1247,1235,1605,1598,1718
+ )
> Data<-data.frame(subject=as.factor(a),
+ drug=as.factor(b), period=as.factor(c),
+ sequence=as.factor(d), Max=e)
> Data
   subject drug period sequence  Max
111  22 1739
212  12 1633
321  11 1481
422  21 1837
531  22 1780
632  12 2073
741  11 1374
842  21 1629
951  22 1555
10   52  12 1385
11   61  11 1756
12   62  21 1522
13   71  22 1566
14   72  12 1643
15   81  11 1939
16   82  21 1615
17   91  22 1475
18   92  12 1759
19  101  11 1388
20  102  21 1483
21  111  22 1127
22  112  12 1682
23  121  11 1542
24  122  21 1247
25  131  22 1235
26  132  12 1605
27  141  11 1598
28  142  21 1718



Andrew


On Thu, Jun 19, 2008 at 04:29:16PM +0800, leeznar wrote:
> Dear R-users:
> 
> I am a new R-user and I have a question about lm
> function.  Here is my data.
> a<-c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14)
> b<-c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2)
> c<-c(2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2)
> d<-c(2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1)
> e<-c(1739,1633,1481,1837,1780,2073,1374,1629,1555,1385,1756,1522,1566,1643,1939,1615,1475,1759,1388,1483,1127,1682,1542,1247,1235,1605,1598,1718
> )
> Data<-data.frame(subject=as.factor(a),
> drug=as.factor(b), period=as.factor(c),
> sequence=as.factor(d), Max=e)
> 
> lm3<- lm(Max ~subject*sequence + sequence + period +
> drug, data=Data)
> print(lm3)
> anova(lm3)
> 
> When I use lm to fit the data, there are some problems
> in ??subject*sequence??.   I have use GLM in SPSS to
> fit the same data, and it seems there is no problem. 
> 
> I don??t know where my problem is.  How can I get the
> same result with SPSS? How can I do?
> 
> Best regards,
> Hsin-Ya Lee
> 
> 
> 
> 
>   
> __
> [[elided Yahoo spam]]
> Content-Type: application/msword; name="Result_SPSS.doc"
> Content-Transfer-Encoding: base64
> Content-Description: 3367377201-Result_SPSS.doc
> Content-Disposition: attachment; filename="Result_SPSS.doc"
> 
> 
> 

> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] "False convergence" in LME

2008-06-14 Thread Andrew Robinson
These are great tips from Spencer.

The other thing that I have found useful is to use a different
optimizing algorithm.  You can do this by:

control=lmeControl(opt = "optim")

Good luck!

Andrew


On Sat, Jun 14, 2008 at 09:45:22AM -0700, Spencer Graves wrote:
>  This is a common problem, for which solutions are poorly documented. 
> 
>   1.  Have you tried fitting simpler models, in the hopes of 
> getting something that converges without complaint, then use 'update' to 
> try more complicated models?  It sometimes works, though often not. 
> 
>   2.  Also, have you tried something like 'lme(..., 
> control=lmeControl(returnObject=TRUE))'?  If no, I suggest you try this 
> first.  With the error message you report, I would expect this to work, 
> though it may not. 
> 
>   3.  Finally, have you tried something like 'lme(..., 
> control=lmeControl(singular.ok=TRUE))' OR 'lme(..., 
> control=lmeControl(singular.ok=TRUE, returnObject=TRUE))'?  I'm not 
> sure, but I believe this may work when "returnObject=TRUE" does not. 
> 
>   4.  Have you tried the following: 
> 
>library(FinTS)
>package.dir('nlme')
> 
>I tried this just now and learned that the 'nlme' 
> packages contains two non-standard subdirectories named "mlbook" and 
> "scripts".  The second contains files names 'ch01.R', 'ch02.R', etc., 
> which contain the R commands required to reproduce virtually all the 
> figures, tables and examples in Pinheiro and Bates (2000) Mixed-Effects 
> Models in S and S-PLUS (Springer).  If you have not already done so, I 
> recommend you get this book and use these script files to facilitate 
> your study of it.  Doug Bates is one of the world's leading experts in 
> mixed-effects modeling, and I have learned a lot from my study of it. 
> 
>  Hope this helps. 
>  Spencer Graves
> 
> Rebecca Sela wrote:
> >I tried to use LME (on a fairly large dataset, so I am not including it), 
> >and I got this error message:
> >
> >Error in lme.formula(formula(paste(c(toString(TargetName), 
> >"as.factor(nodeInd)"),  : nlminb problem, convergence error code = 1
> >  message = false convergence (8)
> >
> >Is there any way to get more information or to get the potentially wrong 
> >estimates from LME?
> >
> >(Also, the page in the NLMINB documentation,  
> >http://netlib.bell-labs.com/cm/cs/cstr/153.pdf, has errors in it, which 
> >makes it harder to check on what is happening.)
> >
> >Thank you in advance!
> >
> >Rebecca
> >
> >__
> >R-help@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide 
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
> >
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bimodal Distribution

2008-05-29 Thread Andrew Robinson
Hi Mike,

if you can decompose the bimodal distribution into (eg two) known
forms, then you could try a stepwise approach, eg:

If uniform < 0.5 then double it and use it to draw from the inverse
cdf of  A,

else, double (uniform - 0.5) and use it to draw from the inverse cdf of B.

You can change the cutoff and the weights to suit your need.  It's
best to double-check by plotting an empirical density of the numbers
generated. 

I hope that this helps,

Andrew

On Thu, May 29, 2008 at 04:05:29PM -0600, Mike Williams wrote:
> Hello R Users,
> 
> I am doing a Latin Hypercube type simulation. I have found the 
> improvedLHS function and have used it to generate a bunch of properly 
> distributed uniform probabilities. Now I am using functions like qlnorm 
> to transform that into the appropriately lognormal or triangularly 
> distributed parameters for my modes. However I have a parameter which I 
> believe is bimodally distributed, could someone please point me at an 
> appropriate function equivalent to qlnorm which I can use, because for 
> some reason I have been unable to find one. It occurs to me that maybe 
> one doesn't exist, in which case could someone give me some other idea 
> of how to accomplish this goal?
> 
> Thanks,
> 
> Mike
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] can I do this with R?

2008-05-28 Thread Andrew Robinson
On Wed, May 28, 2008 at 03:47:49PM -0700, Xiaohui Chen wrote:
> Frank E Harrell Jr ??:
> >Xiaohui Chen wrote:
> >>step or stepAIC functions do the job. You can opt to use BIC by 
> >>changing the mulplication of penalty.
> >>
> >>I think AIC and BIC are not only limited to compare two pre-defined 
> >>models, they can be used as model search criteria. You could 
> >>enumerate the information criteria for all possible models if the 
> >>size of full model is relatively small. But this is not generally 
> >>scaled to practical high-dimensional applications. Hence, it is often 
> >>only possible to find a 'best' model of a local optimum, e.g. 
> >>measured by AIC/BIC.
> >
> >Sure you can use them that way, and they may perform better than other 
> >measures, but the resulting model will be highly biased (regression 
> >coefficients biased away from zero).  AIC and BIC were not designed to 
> >be used in this fashion originally.  Optimizing AIC or BIC will not 
> >produce well-calibrated models as does penalizing a large model.
> >
> Sure, I agree with this point. AIC is used to correct the bias from the 
> estimations which minimize the KL distance of true model, provided the 
> assumed model family contains the true model. BIC is designed for 
> approximating the model marginal likelihood. Those are all 
> post-selection estimating methods. For simutaneous variable selection 
> and estimation, there are better penalizations like L1 penalty, which is 
> much better than AIC/BIC in terms of consistency.

Xiaohui, 

Tibshirani (1996) suggests that the quality of the L1 penalty depends
on the structure of the dataset.  As I recall, subset selection was
preferred for finding a small number of large effects, lasso (L1) for
finding a small to moderate number of moderate-sized effects, and
ridge (L2) for many small effects.

Can you provide any references to more up-to-date simulations that you
would recommend?

Cheers,

Andrew
-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to remove NAs and lme function

2008-05-28 Thread Andrew Robinson
Jen,

try

na.action = na.exclude

Andrew


On Wed, May 28, 2008 9:26 pm, Jen_mp3 wrote:
>
> I am working on a project to find a model for the concentration of
> dissolved
> oxygen in the river clyde. Ive fitted a linear mixed model as
> lme(DOW~Temperature+Salinity+Year+factor(Station)*factor(Depth),
> random~1|id), where id is an identifier of the day over 20 years defined
> as
> Day*1 + Month*100 + (1900 - Year).
> Anyway, there are some NAs for the concentration of dissolved oxygen in
> the
> water so I know you add in na.action = na.omit and that omits the NAs so
> there are 9008 observations in the model, but it doesnt do it for the
> whole
> data set where there are 10965 including observations with NAs. I would
> like
> to plot the residuals from the model against the Salinity, Temperature and
> Year, but when I try, it seems to want to take the observations of these
> variables from the full data set and the residuals from the model which of
> course doesnt work. I have tried using
> data1 <- data[data$DOW != "NA",] on the whole data set but it doesnt work.
> How can I remove the NAs from a data set?
>
> --
> View this message in context:
> http://www.nabble.com/How-to-remove-NAs-and-lme-function-tp17510564p17510564.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Andrew Robinson
Senior Lecturer in Statistics   Tel: +61-3-8344-6410
Department of Mathematics and StatisticsFax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: [EMAIL PROTECTED]Website: http://www.ms.unimelb.edu.au

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need help for nlme

2008-05-25 Thread Andrew Robinson
My advice is to try to simplify this as much as possible.  When it is
as simple as possible, it will either work or not work.  If it works,
then you have learned something useful.  If it does not work, then
send your question again.  Right now there is a great deal of detail
that may or may not be extraneous.  

Andrew.


On Sun, May 25, 2008 at 01:36:49PM -0500, [EMAIL PROTECTED] wrote:
> Hi everyone,
> I try to write a module based on nlme however R always shows me the  
> error message
> Error in eval(expr, envir, enclos) : object "y" not found. Does anyone  
> know how to solve this? There is no problem in nls at all.
> 
> require(nlme)
> AMPmixed<-function(y, x, S1=c("asymptotic","logistic"),  
> S2=c("asymptotic","logistic"), data, start,random)
>  {
>   logist.logist<-function(x,alpha,delta,psi.l,tau.l,gamma,h){
>   
> delta+(alpha-delta+gamma*(x-(h-1))/exp(x))/(1+exp(-(x-tau.l)/psi.l))}
>   logist.asymp<-function(x,alpha,delta,psi.l,tau.l,lpsi.a,gamma,h){
>   
> delta+(alpha-delta)/(1+exp(-(x-tau.l)/psi.l))+(gamma*(x-(h-1))/exp(x))*exp(-exp(1/lpsi.a)*x)}
>   asymp.asymp<-function(x,alpha,delta,lpsi.a,gamma,h){
>   
> delta+(alpha-delta)*exp(-exp(1/lpsi.a)*x)+(gamma*(x-(h-1))/exp(x))*exp(-exp(1/lpsi.a)*x)}
>   asymp.logist<-function(x,alpha,delta,psi.l,tau.l,lpsi.a,gamma,h){
>   
> delta+(alpha-delta)*exp(-exp(1/lpsi.a)*x)+(gamma*(x-(h-1))/exp(x))/(1+exp(-(x-tau.l)/psi.l))}
> 
>(logistic.logistic<-function(y, x, data, start, random){
> 
> nlme.out<-nlme(y~logist.logist(x,alpha,delta,psi.l,tau.l,gamma,h),  
> data=data, start=start,
>   fixed=alpha+delta+psi.l+tau.l+gamma+h~1,  
> random=random)
>list(nlme.out=summary(nlme.out))
>})
>(logistic.asymptotic<-function(y, x, data, start, random){
>  
> nlme.out<-nlme(y~logist.asymp(x,alpha,delta,psi.l,tau.l,lpsi.a,gamma,h),  
> data=data, start=start,
>  
> fixed=alpha+delta+psi.l+tau.l+lpsi.a+gamma+h~1, random=random)
>  list(nlme.out=summary(nlme.out))
>})
>(asymptotic.logistic<-function(y, x, data, start,random){
> 
> nlme.out<-nlme(y~asymp.logist(x,alpha,delta,psi.l,tau.l,lpsi.a,gamma,h),  
> data=data, start=start,
> 
> fixed=alpha+delta+psi.l+tau.l+lpsi.a+gamma+h~1, random=random)
> list(nlme.out=summary(nlme.out))
>})
>(asymptotic.asymptotic<-function(y, x, data, start, random){
> 
> nlme.out<-nlme(y~asymp.asymp(x,alpha,delta,lpsi.a,gamma,h), data=data,  
> start=start,
>  fixed=alpha+delta+lpsi.a+gamma+h~1,random=random)
>list(nlme.out=summary(nlme.out))
>})
> 
>if(S1=="logistic" && S2=="logistic")  
> {(AMPmixed=logistic.logistic(y, x, data, start, random))}
>else if(S1=="logistic" &&  
> S2=="asymptotic"){(AMPmixed=logistic.asymptotic(y, x, data, start,  
> random))}
>else if(S1=="asymptotic" &&  
> S2=="logistic"){(AMPmixed=asymptotic.logistic(y, x, data, start,  
> random))}
>else if(S1=="asymptotic" &&  
> S2=="asymptotic"){(AMPmixed=asymptotic.asymptotic(y, x, data, start,  
> random))}
>}
> 
> #
>   con rep biomass
> 10.00   1   1.126
> 20.32   1   1.096
> 31.00   1   1.163
> 43.20   1   0.985
> 5   10.00   1   0.716
> 6   32.00   1   0.560
> 7  100.00   1   0.375
> 80.00   2   0.833
> 90.32   2   1.106
> 10   1.00   2   1.336
> 11   3.20   2   0.754
> 12  10.00   2   0.683
> 13  32.00   2   0.488
> 14 100.00   2   0.344
> 
> iso<-read.table(file="E:\\Hormesis\\data\\isobutanol.txt", header=T)
> aa<-groupedData(biomass~con|rep, data=iso)
> van2<-AMPmixed(y=biomass, x=con, S1="asymptotic", S2="asymptotic", data=aa,
>    random=pdDiag(alpha+delta+lpsi.a+gamma+h~1),
>start=c(alpha= 0.7954, delta= 0.3231,  lpsi.a=-0.2738,  
> gamma= 1.0366, h=0.8429))
> van2
> 
> Thank you very much in advance.
> Chunhao
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, 

Re: [R] Some problems with Sweave

2008-05-23 Thread Andrew Robinson
Martin,

try omitting the results=tex argument.

Andrew

On Fri, May 23, 2008 at 10:16:33AM +0200, [EMAIL PROTECTED] wrote:
> Dear R users,
> I'm working in a brief R-tutorial to a group of students. To make that I'm
> using Sweave but I've got two problems:
> 
> First, I want show how R operates with the matrix type but, I write in the
> .rnw document the code
> 
> <>=
>  matriz <- matrix(vector,nrow=3,ncol=6)
>  matriz
> @
> 
> and after compilating the LaTex document I obtain in the pdf the next text
> 
> > matriz <- matrix(vector, nrow = 3, ncol = 6)
> > matriz
> [,1] [,2] [,3] [,4] [,5] [,6] [1,] 1 4 1 4 1 4 [2,] 2 5 2 5 2 5 [3,] 3 6 3
> 6 3 6
> 
> My question is, How must I do To obtain in the pdf somethin near to
> 
> > matriz <- matrix(vector, nrow = 3, ncol = 6)
> > matriz
>  [,1] [,2] [,3] [,4] [,5] [,6]
> [1,]141414
> [2,]252525
> [3,]363636
> 
> I`ve tought in xtable but the aspect is not as the R console.
> 
> On the other hand I want show the list type R-treatment, the problem here
> is in the LaTex compilation
> I write in the .rnwd document
> 
> <>=
>  lista <- list(cadena='String',vector=c(1,1,1),logica=TRUE)
>  lista$cadena
> @
> 
> the Sweave call is ok, but when I compile the .tex document, It produces
> errors caused by the $ simbols. Anybody knows how save this problem?
> 
> Thanks in advance,
> Mart?n Gast?n
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >