Re: [R] Capturing warning within user-defined function

2018-03-06 Thread Jen
I did explore tryCatch but wasn't successful.

However, I did just try your solution, William, and it worked!

I just had to modify this line in my function:

p <- ((svyciprop(~grp, grp1, family=quasibinomial))[1])

to

p <- withWarnings((svyciprop(~grp, grp1, family=quasibinomial))[1])

Then I could use p[1] to get my estimated proportion.

Then I used:

ind <- length(attr(p, "warnings"))
print(ind)

if (ind > 0) {msg <- names(warnings()) }
else { msg <- "No warnings" }

overall[1,5] <- msg

to complete my table.

Thanks, again, William!

Jen





On Tue, Mar 6, 2018, 5:57 PM William Dunlap  wrote:

> tryCatch() is good for catching errors but not so good for warnings, as
> it does not let you resume evaluating the expression that emitted
> the warning.  withCallingHandlers(), with its companion invokeRestart(),
> lets you collect the warnings while letting the evaluation run to
> completion.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Tue, Mar 6, 2018 at 2:45 PM, Bert Gunter 
> wrote:
>
>> 1. I did not attempt to sort through your voluminous code. But I suspect
>> you are trying to reinvent wheels.
>>
>> 2. I don't understand this:
>>
>> "I've failed to find a solution after much searching of various R related
>> forums."
>>
>> A web search on "error handling in R" **immediately** brought up
>> ?tryCatch,
>> which I think is what you want.
>> If not, you should probably explain why it isn't, so that someone with
>> more
>> patience than I can muster will sort through your code to help.
>>
>> Cheers,
>> Bert
>>
>>
>>
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along and
>> sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>> On Tue, Mar 6, 2018 at 2:26 PM, Jen 
>> wrote:
>>
>> > Hi, I am trying to automate the creation of tables for some simply
>> > analyses. There are lots and lots of tables, thus the creation of a
>> > user-defined function to make and output them to excel.
>> >
>> > My problem is that some of the analyses have convergence issues, which I
>> > want captured and included in the output so the folks looking at them
>> know
>> > how to view those estimates.
>> >
>> > I am successfully able to do this in a straightforward set of steps.
>> > However, once I place those steps inside a function it fails.
>> >
>> > Here's the code (sorry this is a long post):
>> >
>> > # create data
>> > wt <- rgamma(6065, 0.7057511981,  0.0005502062)
>> > grp <- sample(c(replicate(315, "Group1"), replicate(3672, "Group2"),
>> > replicate(1080, "Group3"), replicate(998, "Group4")))
>> > dta <- data.frame(grp, wt)
>> > head(dta)
>> > str(dta)
>> >
>> > # declare design
>> > my.svy <- svydesign(ids=~1, weights=~wt, data=dta)
>> >
>> > # subset
>> > grp1 <- subset(my.svy, grp == "Group1")
>> >
>> > # set options and clear old warnings
>> > options(warn=0)
>> > assign("last.warning", NULL, envir = baseenv())
>> >
>> > ## proportions and CIs
>> > p <- ((svyciprop(~grp, grp1, family=quasibinomial))[1])
>> >
>> > # save warnings
>> > wrn1 <- warnings(p)
>> >
>> > ci_l <- (confint(svyciprop(~grp, grp1, family=quasibinomial), 'ci')[1])
>> > ci_u <- (confint(svyciprop(~grp, grp1, family=quasibinomial), 'ci')[2])
>> >
>> > ## sample counts
>> > n <- unwtd.count(~grp, grp1)[1]
>> >
>> > ## combine into table
>> > overall <- data.frame(n, p, ci_l, ci_u)
>> > colnames(overall) <- c("counts", "Group1", "LL", "UL")
>> >
>> > ## add any warnings
>> > ind <- length(wrn1)
>> > ind
>> >
>> > if (ind == 0) { msg <- "No warnings" }
>> > if (ind > 0) {msg <- names(warnings()) }
>> > overall[1,5] <- msg
>> >
>> > print(overall)
>> >
>> > Here's the output from the above:
>> >
>> > > # set options and clear old warnings
>> > > options(warn=0)
>> > > assign("last.warning", NULL, envir = baseenv())
>> > >
>> > > ## proporti

Re: [R] Capturing warning within user-defined function

2018-03-06 Thread Jen
Hi William,

Thanks, I'll give that a shot. I tried using withCallingHandlers without
success but II admit I'm not familiar with it and may have used it wrong.

I'll report back.

Jen


On Tue, Mar 6, 2018, 5:42 PM William Dunlap  wrote:

> You can capture warnings by using withCallingHandlers.   Here is an
> example,
> its help file has more information.
>
> dataList <- list(
>A = data.frame(y=c(TRUE,TRUE,TRUE,FALSE,FALSE), x=1:5),
>B = data.frame(y=c(TRUE,TRUE,FALSE,TRUE,FALSE), x=1:5),
>C = data.frame(y=c(FALSE,FALSE,TRUE,TRUE,TRUE), x=1:5))
>
> withWarnings <- function(expr) {
>.warnings <- NULL # warning handler will append to this using '<<-'
>value <- withCallingHandlers(expr,
> warning=function(e) {
> .warnings <<- c(.warnings,
> conditionMessage(e))
> invokeRestart("muffleWarning")
> })
>structure(value, warnings=.warnings)
> }
> z <- lapply(dataList, function(data) withWarnings(coef(glm(data=data, y ~
> x, family=binomial
> z
>
> The last line produces
>
> > z
> $A
> (Intercept)   x
>   160.80782   -45.97184
> attr(,"warnings")
> [1] "glm.fit: fitted probabilities numerically 0 or 1 occurred"
>
> $B
> (Intercept)   x
>3.893967   -1.090426
>
> $C
> (Intercept)   x
>  -115.0232145.97184
> attr(,"warnings")
> [1] "glm.fit: fitted probabilities numerically 0 or 1 occurred"
>
> and lapply(z, attr, "warnings") will give you the warnings themselves.
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Tue, Mar 6, 2018 at 2:26 PM, Jen 
> wrote:
>
>> Hi, I am trying to automate the creation of tables for some simply
>> analyses. There are lots and lots of tables, thus the creation of a
>> user-defined function to make and output them to excel.
>>
>> My problem is that some of the analyses have convergence issues, which I
>> want captured and included in the output so the folks looking at them know
>> how to view those estimates.
>>
>> I am successfully able to do this in a straightforward set of steps.
>> However, once I place those steps inside a function it fails.
>>
>> Here's the code (sorry this is a long post):
>>
>> # create data
>> wt <- rgamma(6065, 0.7057511981,  0.0005502062)
>> grp <- sample(c(replicate(315, "Group1"), replicate(3672, "Group2"),
>> replicate(1080, "Group3"), replicate(998, "Group4")))
>> dta <- data.frame(grp, wt)
>> head(dta)
>> str(dta)
>>
>> # declare design
>> my.svy <- svydesign(ids=~1, weights=~wt, data=dta)
>>
>> # subset
>> grp1 <- subset(my.svy, grp == "Group1")
>>
>> # set options and clear old warnings
>> options(warn=0)
>> assign("last.warning", NULL, envir = baseenv())
>>
>> ## proportions and CIs
>> p <- ((svyciprop(~grp, grp1, family=quasibinomial))[1])
>>
>> # save warnings
>> wrn1 <- warnings(p)
>>
>> ci_l <- (confint(svyciprop(~grp, grp1, family=quasibinomial), 'ci')[1])
>> ci_u <- (confint(svyciprop(~grp, grp1, family=quasibinomial), 'ci')[2])
>>
>> ## sample counts
>> n <- unwtd.count(~grp, grp1)[1]
>>
>> ## combine into table
>> overall <- data.frame(n, p, ci_l, ci_u)
>> colnames(overall) <- c("counts", "Group1", "LL", "UL")
>>
>> ## add any warnings
>> ind <- length(wrn1)
>> ind
>>
>> if (ind == 0) { msg <- "No warnings" }
>> if (ind > 0) {msg <- names(warnings()) }
>> overall[1,5] <- msg
>>
>> print(overall)
>>
>> Here's the output from the above:
>>
>> > # set options and clear old warnings
>> > options(warn=0)
>> > assign("last.warning", NULL, envir = baseenv())
>> >
>> > ## proportions and CIs
>> > p <- ((svyciprop(~grp, grp1, family=quasibinomial))[1])
>> Warning message:
>> glm.fit: algorithm did not converge
>> >
>> > # save warnings
>> > wrn1 <- warnings(p)
>> >
>> > ci_l <- (confint(svyciprop(~grp, grp1, family=quasibinomial), 'ci')[1])
>> Warning message:
>> glm.fit: algorithm did not converge
>> > ci_u <- (confint(svyciprop(~grp, grp1, family=quasibinomial), 'ci')

[R] Capturing warning within user-defined function

2018-03-06 Thread Jen
Hi, I am trying to automate the creation of tables for some simply
analyses. There are lots and lots of tables, thus the creation of a
user-defined function to make and output them to excel.

My problem is that some of the analyses have convergence issues, which I
want captured and included in the output so the folks looking at them know
how to view those estimates.

I am successfully able to do this in a straightforward set of steps.
However, once I place those steps inside a function it fails.

Here's the code (sorry this is a long post):

# create data
wt <- rgamma(6065, 0.7057511981,  0.0005502062)
grp <- sample(c(replicate(315, "Group1"), replicate(3672, "Group2"),
replicate(1080, "Group3"), replicate(998, "Group4")))
dta <- data.frame(grp, wt)
head(dta)
str(dta)

# declare design
my.svy <- svydesign(ids=~1, weights=~wt, data=dta)

# subset
grp1 <- subset(my.svy, grp == "Group1")

# set options and clear old warnings
options(warn=0)
assign("last.warning", NULL, envir = baseenv())

## proportions and CIs
p <- ((svyciprop(~grp, grp1, family=quasibinomial))[1])

# save warnings
wrn1 <- warnings(p)

ci_l <- (confint(svyciprop(~grp, grp1, family=quasibinomial), 'ci')[1])
ci_u <- (confint(svyciprop(~grp, grp1, family=quasibinomial), 'ci')[2])

## sample counts
n <- unwtd.count(~grp, grp1)[1]

## combine into table
overall <- data.frame(n, p, ci_l, ci_u)
colnames(overall) <- c("counts", "Group1", "LL", "UL")

## add any warnings
ind <- length(wrn1)
ind

if (ind == 0) { msg <- "No warnings" }
if (ind > 0) {msg <- names(warnings()) }
overall[1,5] <- msg

print(overall)

Here's the output from the above:

> # set options and clear old warnings
> options(warn=0)
> assign("last.warning", NULL, envir = baseenv())
>
> ## proportions and CIs
> p <- ((svyciprop(~grp, grp1, family=quasibinomial))[1])
Warning message:
glm.fit: algorithm did not converge
>
> # save warnings
> wrn1 <- warnings(p)
>
> ci_l <- (confint(svyciprop(~grp, grp1, family=quasibinomial), 'ci')[1])
Warning message:
glm.fit: algorithm did not converge
> ci_u <- (confint(svyciprop(~grp, grp1, family=quasibinomial), 'ci')[2])
Warning message:
glm.fit: algorithm did not converge
>
> ## sample counts
> n <- unwtd.count(~grp, grp1)[1]
>
> ## combine into table
> overall <- data.frame(n, p, ci_l, ci_u)
> colnames(overall) <- c("counts", "Group1", "LL", "UL")
>
> ## add any warnings
> ind <- length(wrn1)
> ind
[1] 1
>
> if (ind == 0) { msg <- "No warnings" }
> if (ind > 0) {msg <- names(warnings()) }
> overall[1,5] <- msg
>
> print(overall)
   counts   Group1   LL   UL
  V5
counts315 2.364636e-12 2.002372e-12 2.792441e-12 glm.fit: algorithm did
not converge

Here's the function:

est <- function(var) {

## set up formula
formula <- paste ("~", var)

## set options and clear old warning
options(warn=0)
assign("last.warning", NULL, envir = baseenv())

## proportions and CIs
p <- ((svyciprop(as.formula(formula), grp1, family=quasibinomial))[1])

## save warnings
wrn1 <- warnings(p)

ci_l <- (confint(svyciprop(as.formula(formula) , grp1,
family=quasibinomial), 'ci')[1])
ci_u <- (confint(svyciprop(as.formula(formula) , grp1,
family=quasibinomial), 'ci')[2])

## sample counts
n <- unwtd.count(as.formula(formula), grp1)[1]

## combine into table
overall <- data.frame(n, p, ci_l, ci_u)
colnames(overall) <- c("counts", "Group1", "LL", "UL")


## add any warnings
ind <- length(warnings(p))
print(ind)

if (ind == 0) { msg <- "No warnings" }
if (ind > 0) {msg <- names(warnings()) }
overall[1,5] <- msg

print(overall)

}

Here's the output from running the function:

> est("grp")
[1] 0
   counts   Group1   LL   UL  V5
counts315 2.364636e-12 2.002372e-12 2.792441e-12 No warnings
Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: algorithm did not converge
3: glm.fit: algorithm did not converge

So, the warnings are showing up in the output at the end of the function
but they're not being captured like they are when run outside of the
function. Note the 0 output from print(ind) and V7 has "No warnings".
I know a lot of things "behave" differently inside functions. Case in
point, the use of "as.formula(var)" rather than just "~grp" being passed to
the function.

I've failed to find a solution after much searching of various R related
forums. I even posted this to stackoverflow but with 

Re: [R] force axis to extend

2017-03-16 Thread Jen
Yay!  That worked!  Thanks so much!

Jen

On Thu, Mar 16, 2017, 9:28 AM David L Carlson  wrote:

> Use
>
> plot(NA, xlim=c(0, 5), ylim=c(-35, 35), type="n", axes=FALSE, ann=FALSE)
>
> to set up the length of the y axis instead of your first plot command.
>
> -
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Lemon
> Sent: Wednesday, March 15, 2017 9:09 PM
> To: Jen 
> Cc: r-help mailing list 
> Subject: Re: [R] force axis to extend
>
> Hi Jen,
> It seems way too simple, but does this work?
>
> axis(side=2,at=seq(-35,35,by=5),cex.axis=0.7)
>
> You may want to consider using a pyramid plot for this.
>
> Jim
>
>
> On Thu, Mar 16, 2017 at 11:45 AM, Jen 
> wrote:
> > Hi,  I'm creating a couple of mirrored bar plots.  Below is data and code
> > for one.
> >
> > My problem is that I need the axis to go from -35 to 35 by 5.  I can't
> get
> > that to happen with the code below.  I need it so all my plots are on the
> > same scale.
> >
> > How can I do that using barplot? For reasons, I can't use ggplot or
> > lattice.
> >
> > Thanks,
> >
> > Jen
> >
> >
> >
> > df <- data.frame(matrix(c(
> > '18-29','Females',  23.221039,
> > '30-44','Females',  16.665565,
> > '45-59','Females',  7.173238,
> > '60+',  'Females',  4.275979,
> > '18-29','Males',-22.008875,
> > '30-44','Males',-15.592936,
> > '45-59','Males',-7.312195,
> > '60+',  'Males',-3.750173),
> > nrow=8, ncol=3, byrow=T,
> > dimnames=list(NULL, c("Age", "Sex", "Percent"
> >
> > df$Percent <- as.numeric(as.character(df$Percent))
> >
> > midf <- barplot(height = df$Percent[df$Sex == "Females"])
> >
> > # distribution of men and women with solid fill
> >
> > plot(c(0,5),range(df$Percent),type = "n", axes=FALSE, ann=F)
> >
> > barplot(height = df$Percent[df$Sex == "Females"], add = TRUE,axes =
> FALSE,
> > col="#b498ec", ylab="")
> >
> > barplot(height = df$Percent[df$Sex == "Males"], add = TRUE, axes = F,
> > col="#f8bb85", ylab="",
> > names.arg=c("18-29", "30-44", "45-59", "60+"))
> >
> > axis(side=2, at = seq(-35,35,by=5),
> >  labels=format(abs(seq(-35,35,by=5)), scientific=F),
> >  cex.axis=0.7)
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] force axis to extend

2017-03-16 Thread Jen
Hi Jim,

Thanks for replying.

Unfortunately, that doesn't work.   The axis automatically scales to (-30,
25, by 5).

Jen



On Wed, Mar 15, 2017, 10:09 PM Jim Lemon  wrote:

> Hi Jen,
> It seems way too simple, but does this work?
>
> axis(side=2,at=seq(-35,35,by=5),cex.axis=0.7)
>
> You may want to consider using a pyramid plot for this.
>
> Jim
>
>
> On Thu, Mar 16, 2017 at 11:45 AM, Jen 
> wrote:
> > Hi,  I'm creating a couple of mirrored bar plots.  Below is data and code
> > for one.
> >
> > My problem is that I need the axis to go from -35 to 35 by 5.  I can't
> get
> > that to happen with the code below.  I need it so all my plots are on the
> > same scale.
> >
> > How can I do that using barplot? For reasons, I can't use ggplot or
> > lattice.
> >
> > Thanks,
> >
> > Jen
> >
> >
> >
> > df <- data.frame(matrix(c(
> > '18-29','Females',  23.221039,
> > '30-44','Females',  16.665565,
> > '45-59','Females',  7.173238,
> > '60+',  'Females',  4.275979,
> > '18-29','Males',-22.008875,
> > '30-44','Males',-15.592936,
> > '45-59','Males',-7.312195,
> > '60+',  'Males',-3.750173),
> > nrow=8, ncol=3, byrow=T,
> > dimnames=list(NULL, c("Age", "Sex", "Percent"
> >
> > df$Percent <- as.numeric(as.character(df$Percent))
> >
> > midf <- barplot(height = df$Percent[df$Sex == "Females"])
> >
> > # distribution of men and women with solid fill
> >
> > plot(c(0,5),range(df$Percent),type = "n", axes=FALSE, ann=F)
> >
> > barplot(height = df$Percent[df$Sex == "Females"], add = TRUE,axes =
> FALSE,
> > col="#b498ec", ylab="")
> >
> > barplot(height = df$Percent[df$Sex == "Males"], add = TRUE, axes = F,
> > col="#f8bb85", ylab="",
> > names.arg=c("18-29", "30-44", "45-59", "60+"))
> >
> > axis(side=2, at = seq(-35,35,by=5),
> >  labels=format(abs(seq(-35,35,by=5)), scientific=F),
> >  cex.axis=0.7)
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] force axis to extend

2017-03-15 Thread Jen
Hi,  I'm creating a couple of mirrored bar plots.  Below is data and code
for one.

My problem is that I need the axis to go from -35 to 35 by 5.  I can't get
that to happen with the code below.  I need it so all my plots are on the
same scale.

How can I do that using barplot? For reasons, I can't use ggplot or
lattice.

Thanks,

Jen



df <- data.frame(matrix(c(
'18-29','Females',  23.221039,
'30-44','Females',  16.665565,
'45-59','Females',  7.173238,
'60+',  'Females',  4.275979,
'18-29','Males',-22.008875,
'30-44','Males',-15.592936,
'45-59','Males',-7.312195,
'60+',  'Males',-3.750173),
nrow=8, ncol=3, byrow=T,
dimnames=list(NULL, c("Age", "Sex", "Percent"

df$Percent <- as.numeric(as.character(df$Percent))

midf <- barplot(height = df$Percent[df$Sex == "Females"])

# distribution of men and women with solid fill

plot(c(0,5),range(df$Percent),type = "n", axes=FALSE, ann=F)

barplot(height = df$Percent[df$Sex == "Females"], add = TRUE,axes = FALSE,
col="#b498ec", ylab="")

barplot(height = df$Percent[df$Sex == "Males"], add = TRUE, axes = F,
col="#f8bb85", ylab="",
names.arg=c("18-29", "30-44", "45-59", "60+"))

axis(side=2, at = seq(-35,35,by=5),
 labels=format(abs(seq(-35,35,by=5)), scientific=F),
 cex.axis=0.7)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to speed this up?

2017-02-28 Thread Jen
Hi, I'm trying to generate 2.5 million phone numbers.  The code below
generates a random sample of 250K MPNS for Morocco.  It takes about 10
minutes.

I need to generate 2.5 million.  I've run it through once and it took about
45 hours.

Is there a way to speed this up?

Thanks,

Jen

# generate random sample of mobile phone numbers (MPNs) - Morocco

# Mobile phone number format:  +212-6xx-xx

library(data.table)

# country code

cc <- "+212"

# prefixes

IAM  <- data.table(matrix(c(610,611,613,615,616,
618,641,642,648,650,651,652,653,
  654,  655,658,659,661,662,666,667,
668,670,671,672,673,
  676,  677,678), dimnames=list(NULL, "IAM")))



Medi <- data.table(matrix(c(612,614,617,619,644,
645,649,656,657,660,663,664,665,
  669,  674,675,679), dimnames=list(NULL, "Medi")))

MOROC <- data.table(matrix(c(0636, 0637), dimnames=list(NULL, "MOROC")))

# combine

mno <- c(IAM, Medi, MOROC)

# generate MPNs
MPN <- NULL

system.time(for (i in 1:25){
# randomly select number from list

prefix <- sapply(mno[floor(runif(1, 1, length(mno)+1))], function(x)
sample(x, 1))

MNO <- names(prefix)

# randomly generate 6 numbers between 0 and 9, inclusive

nums <- floor(runif(6, 0, 9))

# concatenate

tmp <- c(paste(c(cc,prefix,t(nums)), sep="", collapse=""), MNO)

MPN[[i]] <- tmp

i <- i+1


})

# unlist

df <- data.table(matrix(unlist(MPN), nrow=length(MPN), ncol=2, byrow=T,
dimnames = list(seq(1, length(MPN),1), c("MPN", "MNO"))   ))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Restructuring data - unstack, reshape?

2011-09-25 Thread Jen M
Hi all,

I'm having a problem restructuring my data the way I'd like it.  I have data
that look like this:

  Candidate.IDSpecialty   Office Score
   110002 C  London   47
   110002 C  East  48
   110003 RM   West  45
   110003 RM   Southwest  39
   110003 C  Southwest  38
   110004 H  South 42
   110006 G  East   47
   110006 G  London   45

Candidates can apply for the same job specialty in up to 2 offices (never
more).  They can apply for different specialties in further centres.  I
would like to look at score differences when candidates apply for the same
specialty in two different offices.  With the help of the archives I have
tried various stack/unstack and reshape/melt/cast combinations, and I've
managed to get a huge matrix where the columns are all possible combinations
of Specialties & Offices - and there are many.  This leaves a very sparse
matrix  with mainly null values, and this is not what I want.  I'd like the
scores from the two attempts in two columns so I can do scatterplots,
calculate differences by specialty etc. In SPSS I'd use 'restructure' to get
what I want.  I'm working to order with specific requests here so I have to
do it this way (as opposed to a modelling approach).

I would like it restructured to look something like this:

  Candidate.IDSpecialty   Office.1   Score.1  Office.2
Score.2
   110002 C  London   47
 East  48
   110003 RM   West  45
 Southwest  39
   110003 C  Southwest  38
   110004 H  South 42
   110006 G  East   47
London  45
   110006 G  London   45

So one row per candidate/specialty combination, with 2 sets of
offices/scores and null values in the second set if they've only applied
once for that specialty.  Can anyone help me out with this? Is it possible
using stack or reshape?

Many thanks for reading,
Jan

PS Closest I've come to what I need is the sparse matrix produced by this:

recast(spec.scores, Candidate.ID ~ Specialty + Office, measure.var="Score")

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] studentized and standarized residuals

2011-08-10 Thread Jen
Thanks Patrick - at least I know I wasn't being too silly :-)
Jen

--
View this message in context: 
http://r.789695.n4.nabble.com/studentized-and-standarized-residuals-tp3732997p3733173.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] studentized and standarized residuals

2011-08-10 Thread Jen
Hi,
I must be doing something silly here, because I can't get the studentised
and standardised residuals from r output of a linear model to agree with
what I think they should be from equation form. 
Thanks in advance,
Jennifer

x = seq(1,10)
y = x + rnorm(10)
mod = lm(y~x)
rstandard(mod)
residuals(mod)/(summary(mod)$sigma)
rstudent(mod)
residuals(mod)/(summary(mod)$sigma*sqrt(1-lm.influence(mod)$hat))

--
View this message in context: 
http://r.789695.n4.nabble.com/studentized-and-standarized-residuals-tp3732997p3732997.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] different regression coeffs with different starting point

2011-03-14 Thread Jen
Hi Bill,
Thanks for your response and I'm sorry -- that was a misleading example of
what I was trying to show. This one should illustrate the point:

require(AER)
data_in = c(0,6,12,18,24,30,36,42,48,54,60,66,72,78)
data_in2 = data_in^2
data_in3 = data_in^3
data_out =
c(139487.00,13.00,62500.00,58823.00,56338.00,27549.00,4244.00,600.00,112.00,95.00,48.00,31.00,15.00,14.99)
ldata_out = log(data_out)

m1 <- lm(ldata_out ~ data_in + data_in2+data_in3)
cubic <- tobit(ldata_out ~ data_in + data_in2 + data_in3, left=
log(15-0.01),init = coef(m1))
print(cubic)
fitted(cubic)

n= length(data_in)
m2 <- lm(ldata_out[1:(n-1)] ~ data_in[1:(n-1)] +
data_in2[1:(n-1)]+data_in3[1:(n-1)])
cubic2 <- tobit(ldata_out ~ data_in + data_in2 + data_in3, left=
log(15-0.01),init = coef(m2))
print(cubic2)
fitted(cubic2)

The models cubic and cubic2 seem to show that the intercept value in the
model doesn't change from its starting value:

> m1

Call:
lm(formula = ldata_out ~ data_in + data_in2 + data_in3)

Coefficients:
(Intercept)  data_in data_in2 data_in3  
  1.156e+011.004e-01   -7.755e-036.464e-05  

> cubic

Call:
tobit(formula = ldata_out ~ data_in + data_in2 + data_in3, left = log(15 - 
0.01), init = coef(m1))

Coefficients:
(Intercept)  data_in data_in2 data_in3  
 11.55595400.0289083   -0.00406150.229  

Scale: 3.759 

> m2

Call:
lm(formula = ldata_out[1:(n - 1)] ~ data_in[1:(n - 1)] + data_in2[1:(n - 
1)] + data_in3[1:(n - 1)])

Coefficients:
(Intercept)   data_in[1:(n - 1)]  data_in2[1:(n - 1)]  data_in3[1:(n
- 1)]  
  1.150e+011.151e-01   -8.381e-03   
7.122e-05  

> cubic2

Call:
tobit(formula = ldata_out ~ data_in + data_in2 + data_in3, left = log(15 - 
0.01), init = coef(m2))

Coefficients:
(Intercept)  data_in data_in2 data_in3  
  1.150e+013.391e-02   -4.187e-032.383e-05  

Scale: 3.759 

Am I missing something here??

--
View this message in context: 
http://r.789695.n4.nabble.com/different-regression-coeffs-with-different-starting-point-tp3353536p3354276.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] different regression coeffs with different starting point

2011-03-14 Thread Jen
Hi all,
I have a question about the optimisation methods used in nonlinear
regression. I have some data that I would like to fit a tobit regression
model to (see code below). It seems that the solution is very sensitive to
the initial condition that I give it - is there any option to use a
different optimisation process where the solution will be less sensitive to
the initial condition?
Many thanks in advance,
J

require(AER)
data_in = c(0,6,12,18,24,30,36,42,48,54,60,66,72,78)
data_in2 = data_in^2
data_in3 = data_in^3
data_out =
c(139487.00,13.00,62500.00,58823.00,56338.00,27549.00,4244.00,600.00,112.00,95.00,48.00,31.00,15.00,14.99)
ldata_out = log(data_out)


m2 <- lm(ldata_out ~ data_in + data_in2)
print(m2)
quad_mod <- tobit(ldata_out ~ data_in + data_in2, left= log(15-0.01),init =
coef(m2))
quad_mod2 <- tobit(ldata_out ~ data_in + data_in2, left= log(15-0.01),init =
coef(m2)-0.01)
print(quad_mod)
print(quad_mod2)


--
View this message in context: 
http://r.789695.n4.nabble.com/different-regression-coeffs-with-different-starting-point-tp3353536p3353536.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] tobit regression model

2011-03-11 Thread Jen
H Zeileis, 
This helped out a lot - thanks!!
Jen

--
View this message in context: 
http://r.789695.n4.nabble.com/tobit-regression-model-tp3345789p3347936.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] tobit regression model

2011-03-10 Thread Jen
Hi,
I'm trying to fit a tobit regression model to some data. When fitting the
exact same data in Stata, I have no problems at all, however R won't
converge. Its not a maxiters thing, since I've tried increasing this
already. I need to be able to fit the model in R since there are users of
the code that don't have a Stata license.

The code is:

require(AER)
left = 3.218476
x = c(0,6,12,18)
y = c(10.065819,7.803843,5.164786,3.218476)
mod<-tobit(y~x, left = left, right = Inf)

This gives back the following warning:
Warning in survreg.fit(X, Y, weights, offset, init = init, controlvals =
control,  :
  Ran out of iterations and did not converge

Has anyone come across this problem before or know a way to fix the problem?
Thanks in advance,
Jennifer

--
View this message in context: 
http://r.789695.n4.nabble.com/tobit-regression-model-tp3345789p3345789.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to display full name for the coefficients/factors in summary()?

2009-11-03 Thread Jen-Chien Chang

Hi,

I am wondering if there is a way to display the full anme of the 
regression coeffients/factors in the summary?


Suppose I have a bogus data set using weekday as factor which has 7 levels 
such as:


mydata <- sample(364)
wk <- rep(1:7, 52)
weekday <- 
factor(wk,1:7,c("Mon","Tue","Wed","Thu","Fri","Sat","Sun"),ordered=T)

test <- data.frame(mydata,weekday)
lm.test <- lm(mydata ~ weekday, test)
summary(lm.test)
Call:
lm(formula = mydata ~ weekday)

Residuals:
 Min   1Q   Median   3Q  Max
-205.615  -93.486   -2.212   88.851  199.423

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 182.5000 5.5137  33.100   <2e-16 ***
weekday.L   -27.042614.5878  -1.854   0.0646 .
weekday.Q-0.921114.5878  -0.063   0.9497
weekday.C-1.656514.5878  -0.114   0.9097
weekday^4   -16.644914.5878  -1.141   0.2546
weekday^5 9.943614.5878   0.682   0.4959
weekday^6   -14.397114.5878  -0.987   0.3243
---
Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1   1

Residual standard error: 105.2 on 357 degrees of freedom
Multiple R-squared: 0.01705,Adjusted R-squared: 0.0005341
F-statistic: 1.032 on 6 and 357 DF,  p-value: 0.4039 
---


Why is the name for the regression coeffients/factors display as 
weekday.L, weekday.Q, weekday^6 instead of Tue, Wed, and such? Is 
there a simple way of changing options to fix it?


Please advise and many thanks for your help!

Jack Chang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to print the full name of the factors in summary?

2009-11-02 Thread Jen-Chien Chang

Hi,

I am wondering if there is a simple way to fix the problem I am having. 
For unknown reason, I could not get the full name of the factors to be 
printed in the summary. I have tried to used summary.lm as well but the 
problem still persists.


SJ$Weekday <- 
factor(SJ$Weekday,1:7,c("Mon","Tue","Wed","Thu","Fri","Sat","Sun"),ordered=T)


attach(SJ)
lm.SJ <- lm(Demand ~ Weekday+Month+Holiday+Season)
summary(lm.SJ)
Call:
lm(formula = Demand ~ Weekday + Month + Holiday + Season)

Residuals:
Min  1Q  Median  3Q Max
-69.767 -12.224  -1.378  10.857  91.376

Coefficients: (3 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept)  88.7091 3.3442  26.527  < 2e-16 ***
Weekday.L20.8132 2.8140   7.396 1.08e-12 ***
Weekday.Q   -12.7667 2.8156  -4.534 7.99e-06 ***
Weekday.C   -10.6375 2.8113  -3.784 0.000182 ***
Weekday^4-8.3325 2.8103  -2.965 0.003238 **
-

Is there a way for summary to print the full name of the factors and 
levels? Say Weekday.Tue instead Weekday.L?


Thanks!

Jack Chang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R package for visualizing/analyzing accelerometry data?

2009-09-29 Thread Jen Maresh
Hello All -

Any recommendations or suggestions for neat ways to visualize data
taken from a 3-axis accelerometer? My study species is aquatic, so I
would be interested in movement in the 3 dimensions in addition to
being able to incorporate the time series as well. Is there a package
in R that might be useful for this?

Thank you in advance,
Jen


~
Jennifer Maresh, PhD Student
Center for Ocean Health, Long Marine Lab
100 Shaffer Rd.
University of California
Santa Cruz, CA 95060
mar...@biology.ucsc.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] invalid type (list) for variable

2009-09-14 Thread Jen Maresh
Hello -

I am attempting to add interaction terms to a model using the gamm
function in the mgcv package. I stripped it down to only two terms to
simplify it. I have no trouble with individual terms, continuous or
factors, nor with interacting factor terms; I cannot, however, figure
out how to add interactions between continuous*factor terms. I applied
a smoother to my continuous term because it has a non-linear trend
over time. Below is an example of my attempt, followed by the error
message:

MMF.Test <- gamm(MaxDepth ~
s(TripTime) + Trtmt +
s(TripTime):Trtmt,
random = list(Seal =~1), method = "REML",
data = EsealsAll)


"Error in model.frame.default(formula = MaxDepth ~ Trtmt +
s(TripTime):Trtmt +  :
  invalid type (list) for variable 's(TripTime)'
In addition: Warning message:
In sp[i] <- ind :
  number of items to replace is not a multiple of replacement length"

I thought this was a problem with trying to use different vector
lengths, but the length of both is the same (1001). Help please? What
am I doing wrong? Not sure if my mistake is one of faulty statistical
reasoning, or incompetence with R.

Thank you in advance for any suggestions!


~
Jennifer Maresh, PhD Student
Center for Ocean Health, Long Marine Lab
100 Shaffer Rd.
University of California
Santa Cruz, CA 95060
mar...@biology.ucsc.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.