[R] Some coefficients are doubled when I use the step() function

2012-12-09 Thread Chris Beeley
Hello-

Such a strange problem, can't figure it out at all. Using binomial glm
models, and the step() function, so the call looks like this:

sectionmodel = glm(formula = Target3 ~ S1Q12_NUM.1 + S1Q9_NUM.1 + S1Q5_NUM.1 +
S1Q7_NUM.1 + S1Q8_NUM.1 + S1Q6_NUM.1 + S1Q10_NUM.1 + S1Q12_BURG.1 +
S1Q12_CD.1 + S1Q4.1 + S1Q12_OTHVIOL.1 + S1Q8.1 + S1Q12_GBH.1 +
S1Q11.1 + S1Q7.1 + S1Q12_THEFT.1 + S1Q12_DRIV.1 + S1Q5.1 +
S1Q9.1 + S1Q12_DRUG.1, family = binomial, data = moddata)

But when I run step() on the resulting model, some of the coefficients
are doubled when it comes back, with a 2 at the end, e.g. like this:

mymodel = step(sectionmodel, direction="backward", test=F)

summary(mymodel) returns this:

Coefficients:
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)       -4.58519    0.55675  -8.236   <2e-16 ***
S1Q12_NUM.1        0.18446    0.08576   2.151   0.0315 *
S1Q4.12            0.56893    0.40281   1.412   0.1578
S1Q12_OTHVIOL.11   0.56435    0.38262   1.475   0.1402
S1Q12_GBH.11       0.49199    0.33175   1.483   0.1381
S1Q7.11           -1.27330    1.12897  -1.128   0.2594
S1Q7.12           -1.83927    1.16909  -1.573   0.1157
S1Q5.11            0.91742    1.19489   0.768   0.4426
S1Q5.12            2.16861    1.19864   1.809   0.0704 .
S1Q12_DRUG.11     -0.48400    0.29898  -1.619   0.1055

As you can see S1Q7.1 and S1Q5.1 are duplicated as S1Q7.11 and S1Q7.12 etc.

I've googled and read and re-read the step() and stepAIC()
documentation and I just can't figure out what it could mean. Removing
the test=F bit also generates the same behaviour.
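
One possible explanation, offered as a sketch rather than a diagnosis of this dataset: if S1Q7.1 and S1Q5.1 are stored as factors with levels such as 0, 1 and 2, R names each dummy coefficient by pasting the level onto the variable name, so the two non-reference levels of S1Q7.1 would appear as S1Q7.11 and S1Q7.12. The toy data frame and the column names Q7.1 and NUM.1 below are invented for illustration.

```r
set.seed(42)
n <- 200

# Invented data: a binary outcome, a three-level factor and a numeric predictor.
toy <- data.frame(
  y     = rbinom(n, 1, 0.4),
  Q7.1  = factor(rep(0:2, length.out = n)),
  NUM.1 = rnorm(n)
)

fit <- glm(y ~ Q7.1 + NUM.1, family = binomial, data = toy)

# Factor coefficients are named <variable><level>; the first level is the
# reference and gets no coefficient of its own.
names(coef(fit))
# "(Intercept)" "Q7.11" "Q7.12" "NUM.1"

# Checking the storage type of each column shows which ones are factors.
sapply(toy, class)
```

Under this reading nothing is duplicated by step(); the same names would already be present in summary(sectionmodel), and str(moddata) would show which predictors are factors.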

Any help greatly appreciated.

Chris Beeley
Institute of Mental Health, UK

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova of lme objects (model1, model2) gives different results depending on order of models

2012-06-01 Thread Chris Beeley

Well that's that cleared up then. Thanks to all.

Chris B.

On 31/05/2012 17:51, Albyn Jones wrote:

No, both yield the same result: reject the null hypothesis,
which always corresponds to the restricted (smaller) model.
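
A small sketch of the point above, using the Orthodont data that ships with nlme rather than the poster's data (the model names small and big are invented here): the likelihood-ratio statistic and p-value do not depend on the order of the arguments, only the row labels change.

```r
library(nlme)  # recommended package, installed with R

# Two nested models: random intercept only, versus random intercept and slope.
small <- lme(distance ~ age, data = Orthodont, random = ~ 1 | Subject)
big   <- lme(distance ~ age, data = Orthodont, random = ~ age | Subject)

a12 <- anova(small, big)
a21 <- anova(big, small)

# Same test statistic and p-value whichever model is listed first.
c(a12$L.Ratio[2], a21$L.Ratio[2])
c(a12[["p-value"]][2], a21[["p-value"]][2])

# Which model to prefer is read from the fit (df, logLik, AIC),
# not from its position in the call.
AIC(small, big)
```

A small p-value therefore always counts against the model with fewer parameters, regardless of whether it is labelled model 1 or model 2.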

albyn

On Thu, May 31, 2012 at 12:47:30PM +0100, Chris Beeley wrote:

Hello-

I understand that it's convention, when comparing two models using
the anova function anova(model1, model2), to put the more
complicated (for want of a better word) model as the second model.
However, I'm using lme in the nlme package and I've found that the
order of the models actually gives opposite results. I'm not sure if
this is supposed to be the case or if I have missed something
important, and I can't find anything in the Pinheiro and Bates book
or in ?anova, or in Google for that matter which unfortunately only
returns results about ANOVA which isn't much help. I'm using the
latest version of R and nlme, just checked both.

Here is the code and output:


PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
              random=~1|Case, na.action=na.omit)

PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
              random=~1|Case, na.action=na.omit,
              correlation=corAR1(form=~Date|Case))

anova(PHQmodel1, PHQmodel2) # accept model 2

          Model df      AIC      BIC    logLik   Test  L.Ratio p-value
PHQmodel1     1  8 48784.57 48840.43 -24384.28
PHQmodel2     2  9 48284.68 48347.51 -24133.34 1 vs 2 501.8926  <.0001


PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
              random=~1|Case, na.action=na.omit,
              correlation=corAR1(form=~Date|Case))

PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
              random=~1|Case, na.action=na.omit)

anova(PHQmodel1, PHQmodel2) # accept model 2

          Model df      AIC      BIC    logLik   Test  L.Ratio p-value
PHQmodel1     1  9 48284.68 48347.51 -24133.34
PHQmodel2     2  8 48784.57 48840.43 -24384.28 1 vs 2 501.8926  <.0001

In both cases I am led to accept model 2 even though they are
opposite models. Is it really just that you have to put them in the
right order? It just seems like if there were say four models you
wouldn't necessarily be able to determine the correct order.

Many thanks,
Chris Beeley, Institute of Mental Health, UK

...session info follows


sessionInfo()

R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] grid  stats graphics  grDevices utils datasets
methods   base

other attached packages:
  [1] gridExtra_0.9  RColorBrewer_1.0-5 car_2.0-12
nnet_7.3-1 MASS_7.3-17
  [6] xtable_1.7-0   psych_1.2.4languageR_1.4
nlme_3.1-104   ggplot2_0.9.1

loaded via a namespace (and not attached):
  [1] colorspace_1.1-1 dichromat_1.2-4  digest_0.5.2 labeling_0.1
lattice_0.20-6   memoise_0.1
  [7] munsell_0.3  plyr_1.7.1   proto_0.3-9.2
reshape2_1.2.1   scales_0.2.1 stringr_0.6
[13] tools_2.15.0


packageDescription("nlme")

Package: nlme
Version: 3.1-104
Date: 2012-05-21
Priority: recommended
Title: Linear and Nonlinear Mixed Effects Models
Authors@R: c(person("Jose", "Pinheiro", comment = "S version"),
person("Douglas", "Bates", comment = "up to 2007"),
person("Saikat", "DebRoy", comment = "up to 2002"),
person("Deepayan", "Sarkar", comment = "up to 2005"),
person("R-core", email = "r-c...@r-project.org", role = c("aut", "cre")))
Author: Jose Pinheiro (S version), Douglas Bates (up to 2007),
Saikat DebRoy (up to 2002), Deepayan Sarkar (up to 2005), the R Core team.
Maintainer: R-core <r-c...@r-project.org>
Description: Fit and compare Gaussian linear and nonlinear
mixed-effects models.
Depends: graphics, stats, R (>= 2.13)
Imports: lattice
Suggests: Hmisc, MASS
LazyLoad: yes
LazyData: yes
License: GPL (>= 2)
BugReports: http://bugs.r-project.org
Packaged: 2012-05-23 07:28:59 UTC; ripley
Repository: CRAN
Date/Publication: 2012-05-23 07:37:45
Built: R 2.15.0; x86_64-pc-mingw32; 2012-05-29 12:36:01 UTC; windows



[R] anova of lme objects (model1, model2) gives different results depending on order of models

2012-05-31 Thread Chris Beeley

Hello-

I understand that it's convention, when comparing two models using the 
anova function anova(model1, model2), to put the more complicated (for 
want of a better word) model as the second model. However, I'm using lme 
in the nlme package and I've found that the order of the models actually 
gives opposite results. I'm not sure if this is supposed to be the case 
or if I have missed something important, and I can't find anything in 
the Pinheiro and Bates book or in ?anova, or in Google for that matter 
which unfortunately only returns results about ANOVA which isn't much 
help. I'm using the latest version of R and nlme, just checked both.


Here is the code and output:

PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
              random=~1|Case, na.action=na.omit)

PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
              random=~1|Case, na.action=na.omit,
              correlation=corAR1(form=~Date|Case))

anova(PHQmodel1, PHQmodel2) # accept model 2

          Model df      AIC      BIC    logLik   Test  L.Ratio p-value
PHQmodel1     1  8 48784.57 48840.43 -24384.28
PHQmodel2     2  9 48284.68 48347.51 -24133.34 1 vs 2 501.8926  <.0001

PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
              random=~1|Case, na.action=na.omit,
              correlation=corAR1(form=~Date|Case))

PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
              random=~1|Case, na.action=na.omit)

anova(PHQmodel1, PHQmodel2) # accept model 2

          Model df      AIC      BIC    logLik   Test  L.Ratio p-value
PHQmodel1     1  9 48284.68 48347.51 -24133.34
PHQmodel2     2  8 48784.57 48840.43 -24384.28 1 vs 2 501.8926  <.0001

In both cases I am led to accept model 2 even though they are opposite 
models. Is it really just that you have to put them in the right order? 
It just seems like if there were say four models you wouldn't 
necessarily be able to determine the correct order.


Many thanks,
Chris Beeley, Institute of Mental Health, UK

...session info follows

 sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United 
Kingdom.1252

[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] grid  stats graphics  grDevices utils datasets  
methods   base


other attached packages:
 [1] gridExtra_0.9  RColorBrewer_1.0-5 car_2.0-12 
nnet_7.3-1 MASS_7.3-17
 [6] xtable_1.7-0   psych_1.2.4languageR_1.4  
nlme_3.1-104   ggplot2_0.9.1


loaded via a namespace (and not attached):
 [1] colorspace_1.1-1 dichromat_1.2-4  digest_0.5.2 
labeling_0.1 lattice_0.20-6   memoise_0.1
 [7] munsell_0.3  plyr_1.7.1   proto_0.3-9.2
reshape2_1.2.1   scales_0.2.1 stringr_0.6

[13] tools_2.15.0

 packageDescription("nlme")
Package: nlme
Version: 3.1-104
Date: 2012-05-21
Priority: recommended
Title: Linear and Nonlinear Mixed Effects Models
Authors@R: c(person("Jose", "Pinheiro", comment = "S version"),
   person("Douglas", "Bates", comment = "up to 2007"),
   person("Saikat", "DebRoy", comment = "up to 2002"),
   person("Deepayan", "Sarkar", comment = "up to 2005"),
   person("R-core", email = "r-c...@r-project.org", role = c("aut", "cre")))
Author: Jose Pinheiro (S version), Douglas Bates (up to 2007), Saikat
   DebRoy (up to 2002), Deepayan Sarkar (up to 2005), the R Core team.
Maintainer: R-core <r-c...@r-project.org>
Description: Fit and compare Gaussian linear and nonlinear mixed-effects 
models.

Depends: graphics, stats, R (>= 2.13)
Imports: lattice
Suggests: Hmisc, MASS
LazyLoad: yes
LazyData: yes
License: GPL (>= 2)
BugReports: http://bugs.r-project.org
Packaged: 2012-05-23 07:28:59 UTC; ripley
Repository: CRAN
Date/Publication: 2012-05-23 07:37:45
Built: R 2.15.0; x86_64-pc-mingw32; 2012-05-29 12:36:01 UTC; windows



[R] Extract fitted values with and without offset from glm

2012-04-05 Thread Chris Beeley

Hello-

In the notes for the lm function it states "Offsets specified by offset 
will not be included in predictions by predict.lm, whereas those 
specified by an offset term in the formula will be". I would like to 
extract fitted values in just this way from a glm model, those with the 
offset and those without. I have tried doing things like this:


predict.glm(glm(Incident~Numbers, offset=logit(Numbers), 
family=binomial, data=violdata))


predict.glm(glm(Incident~Numbers+offset(logit(Numbers)), 
family=binomial, data=violdata))


As well as like this:

glm(Incident~Numbers, offset=logit(Numbers), family=binomial, 
data=violdata)$fitted.values


glm(Incident~Numbers+offset(logit(Numbers)), family=binomial, 
data=violdata)$fitted.values


But they return the same result. The first 50 lines of my data look like 
this:


structure(list(Incident = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1), Numbers = c(13L,
13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 14L,
14L, 14L, 14L, 14L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 14L,
14L, 14L, 13L, 13L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 14L,
13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L)), .Names = c("Incident",
"Numbers"), row.names = c(NA, 50L), class = "data.frame")
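
A minimal sketch of one way to get both versions, on simulated data (x, off and y are invented; off stands in for whatever known quantity is used as the offset): as far as I know, predict() and fitted() on the fitted glm itself return values that include the offset however it was specified, so the offset-free linear predictor can be obtained by subtracting the stored offset component.

```r
set.seed(1)
n   <- 200
x   <- rnorm(n)
off <- rnorm(n, sd = 0.5)   # hypothetical known offset on the logit scale
y   <- rbinom(n, 1, plogis(-1 + 0.5 * x + off))

fit_arg  <- glm(y ~ x, offset = off, family = binomial)
fit_term <- glm(y ~ x + offset(off), family = binomial)

# Both ways of specifying the offset give the same fitted values.
all.equal(fitted(fit_arg), fitted(fit_term))

# Linear predictor including the offset, then with the offset taken back out.
eta_with    <- predict(fit_arg, type = "link")
eta_without <- eta_with - fit_arg$offset

# Back-transform each to the probability scale.
p_with    <- plogis(eta_with)      # matches fitted(fit_arg)
p_without <- plogis(eta_without)
```

Here eta_without is just the design matrix times the coefficients, which is usually what "fitted values without the offset" means.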


Any assistance gratefully received.

Many thanks,
Chris Beeley, Institute of Mental Health, UK



Re: [R] Nested brew call yields Error in .brew.cat(26, 28) : unused argument(s) (26, 28)

2012-04-02 Thread Chris Beeley
Many thanks for this. I have a follow-up question. The output that I 
have from the nested brew call includes output like this:


NANANANANANANANAN

... then a graph or a table

... then more 
NANANANANANANANANNANANANANANANANANNANANANANANANANANNANANANANANANANAN


... etc.

It only occurs in the nested brew calls, not in the top level document, 
which is absolutely fine. There are functions defined in the top level 
file which the lower level files make use of. I assumed the problem was 
caused by my not understanding the documentation to do with nested brew 
calls; evidently this is not the case.


I have several functions within the top file, one for drawing graphs, 
one for tables, another for wordclouds, etc. They all generate this 
NANANANANANANA behaviour, I have tested by putting them in and out of 
the code.


I tried to produce a minimal self contained example containing a 
function defined in the top level file used by a file called in a nested 
brew: however this file worked fine.


I realise this isn't a lot to go on, but the functions are fairly long 
and it clearly isn't a specific issue with a particular function because 
they all do it. Has anyone else ever had this happen to them? If so did 
you find a solution (other than manually removing the NAs using a final 
piece of code, which admittedly is not too arduous).


Many thanks,
Chris Beeley,
Institute of Mental Health, UK


On 30/03/2012 02:27, Matt Shotwell wrote:

On Wed, 2012-03-28 at 11:40 +0100, Chris Beeley wrote:

I am writing several webpages using the brew package and R2HTML. I would
like to work off one script so I am using nested brew calls. The
documentation for brew states that:

NOTE: brew calls can be nested and rely on placing a function named
’.brew.cat’ in the environment in which it is passed. Each time brew is
called, a check for the existence of this function is made. If it
exists, then it is replaced with a new copy that is lexically scoped to
the current brew frame. Once the brew call is done, the function is
replaced with the previous function. The function is finally removed from
the environment once all brew calls return.

I'm afraid I can't quite figure out what it is I'm supposed to do here.
I've tried loading the brew library within the script which I pass to
brew, and I've tried defining brew cat like this:

The paragraph above describes what brew is doing behind the scenes. It's
not necessary to modify or set the .brew.cat function.

A nested (or recursive) brew call occurs when brew() is called from a
document currently being processed by brew().

To illustrate further, suppose there are two brew documents,
example-1.brew and example-2.brew, where example-1.brew contains the
following text (delimited by '''):

'''
This text is in example-1.brew.
<%= brew::brew("example-2.brew") %>
'''

and the example-2.brew contains

'''
This text is in example-2.brew.
<%= date() -%>
'''

Then from the R prompt we have:

R> brew::brew("example-1.brew")
This text is in example-1.brew.
This text is in example-2.brew.
Thu Mar 29 20:24:52 2012


.brew.cat=function(){}

This generates the following error message:

Error in .brew.cat(26, 28) : unused argument(s) (26, 28)

I think perhaps it is more likely that I need to insert into the script
the actual content of .brew.cat, but I can't seem to get R to tell me
what it is and Googling throws up a lot of stuff about beer and not much
else (drew a blank also from RSiteSearch("Nested brew"))

Any help gratefully received.

Chris Beeley
Institute of Mental Health, UK



[R] Nested brew call yields Error in .brew.cat(26, 28) : unused argument(s) (26, 28)

2012-03-28 Thread Chris Beeley
I am writing several webpages using the brew package and R2HTML. I would 
like to work off one script so I am using nested brew calls. The 
documentation for brew states that:


NOTE: brew calls can be nested and rely on placing a function named 
’.brew.cat’ in the environment in which it is passed. Each time brew is 
called, a check for the existence of this function is made. If it 
exists, then it is replaced with a new copy that is lexically scoped to 
the current brew frame. Once the brew call is done, the function is 
replaced with the previous function. The function is finally removed from 
the environment once all brew calls return.


I'm afraid I can't quite figure out what it is I'm supposed to do here. 
I've tried loading the brew library within the script which I pass to 
brew, and I've tried defining brew cat like this:


.brew.cat=function(){}

This generates the following error message:

Error in .brew.cat(26, 28) : unused argument(s) (26, 28)

I think perhaps it is more likely that I need to insert into the script 
the actual content of .brew.cat, but I can't seem to get R to tell me 
what it is and Googling throws up a lot of stuff about beer and not much 
else (drew a blank also from RSiteSearch("Nested brew"))


Any help gratefully received.

Chris Beeley
Institute of Mental Health, UK



[R] as.numeric() generates NAs inside an apply call, but fine outside of it

2012-01-09 Thread Chris Beeley

Hello-

I have rather a messy SPSS file which I have imported to R, I've dput'd 
some of the columns at the end of this message. I wish to get rid of all 
the labels and have numeric values using as.numeric. The funny thing is 
it works like this:


as.numeric(mydata[,2]) # generates correct numbers

however, if I pass the whole dataframe at once like this:

apply(mydata, 1:2, function(x) as.numeric(x))

This same column, column 2, generates NAs with an "In FUN(newX[, i], ...) 
: NAs introduced by coercion" warning message.


Meanwhile column 3 works fine like this:

as.numeric(mydata[,3]) # generates correct numbers

And generates numeric results out of the apply function.

I think I basically know why, the str() command tells me that the 
variables which work okay are "labelled" whereas the ones that don't are 
"Factor". However, I can't figure out what's special about the apply 
call that generates the NAs when as.numeric(mydata[,2]) doesn't and I'm 
not sure what to do about it in future.


I realise I can just loop over the columns, but I would rather get to 
the bottom of this if I can so I know for future.


Thanks in advance for any advice

Chris Beeley
Institute of Mental Health, UK

dput() gives-

structure(list(id = structure(1:79, label = structure("Participant", .Names = "id"), 
class = "labelled"),
item2.jan11 = structure(c(4L, 3L, 6L, 4L, 6L, 6L, 2L, 6L,
2L, 2L, 3L, 3L, 1L, 6L, 2L, 6L, 4L, 2L, 6L, 2L, 6L, 6L, 6L,
4L, 4L, 6L, 2L, 6L, 2L, 6L, 2L, 3L, 6L, 6L, 3L, 6L, 5L, 6L,
3L, 6L, 1L, 3L, 3L, 3L, 6L, 4L, 1L, 3L, 6L, 2L, 6L, 2L, 6L,
6L, 6L, 4L, 3L, 6L, 6L, 6L, 6L, 6L, 3L, 6L, 2L, 6L, 6L, 2L,
4L, 6L, 2L, 5L, 6L, 6L, 6L, 6L, 1L, 6L, 4L), .Label = c("Not at all",
"a little", "somewhat", "quite a lot", "very much", "missing data"
), class = c("labelled", "factor"), label = structure("The patients care for each other", .Names = "item2_jan11")),
item12.jan11 = structure(c(5L, 5L, 999L, 5L, 999L, 999L,
2L, 999L, 5L, 2L, 5L, 3L, 3L, 999L, 2L, 999L, 5L, 5L, 999L,
5L, 999L, 999L, 999L, 5L, 5L, 999L, 3L, 999L, 5L, 999L, 3L,
4L, 999L, 999L, 4L, 999L, 5L, 999L, 5L, 999L, 3L, 5L, 4L,
4L, 999L, 3L, 2L, 4L, 999L, 5L, 999L, 5L, 999L, 999L, 999L,
4L, 5L, 999L, 999L, 999L, 999L, 999L, 4L, 999L, 3L, 999L,
999L, 1L, 5L, 999L, 3L, 5L, 999L, 999L, 999L, 999L, 4L, 999L,
0L), value.labels = structure(c(999, 5, 4, 3, 2, 1), .Names = c("missing data",
"very much", "quite a lot", "somewhat", "a little", "Not at all"
)), label = structure("At times, members of staff are afraid of some of the patients", .Names = "item12_jan11"), class = "labelled")), .Names = c("id",
"item2.jan11", "item12.jan11"), class = "data.frame", row.names = c(NA,
-79L))



Re: [R] as.numeric() generates NAs inside an apply call, but fine outside of it

2012-01-09 Thread Chris Beeley

Perfect, many thanks for explanation and correct line of code.

On 09/01/2012 14:29, peter dalgaard wrote:

as.data.frame(lapply(mydata, as.numeric))
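
A short sketch of why the two approaches differ, on an invented toy data frame (toy, id and item are made-up names): apply() converts a data frame to a matrix first, and a data frame that mixes numbers and factors becomes a character matrix holding the factor labels, whereas lapply() visits each column as stored.

```r
# Toy data frame (invented): one integer column, one factor column.
toy <- data.frame(
  id   = 1:3,
  item = factor(c("a little", "somewhat", "a little"),
                levels = c("Not at all", "a little", "somewhat"))
)

# apply() works on as.matrix(toy), which is a character matrix here,
# so the factor column holds its labels rather than its integer codes.
as.matrix(toy)

# as.numeric("a little") is NA, hence "NAs introduced by coercion".
via_apply <- suppressWarnings(apply(toy, 1:2, function(x) as.numeric(x)))

# lapply() passes each column unchanged, so a factor yields its codes.
via_lapply <- as.data.frame(lapply(toy, as.numeric))

via_apply[, "item"]   # NA NA NA
via_lapply$item       # 2 3 2
```

Note that the factor codes are positions in the level order, which may or may not match the numeric codes used in the original SPSS file.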




[R] Basic question about re-writing for loop as a function

2011-08-29 Thread Chris Beeley
Hello-

Sorry to ask a basic question, but I've spent many hours on this now
and seem to be missing something.

I have a loop that looks like this:

mainmat=data.frame(matrix(data=0, ncol=92, nrow=length(predata$Words_MH)))

for(i in 1:length(predata$Words_MH)){
for(j in 1:92){

mainmat[i,j]=ifelse(j %in%
as.numeric(unlist(strsplit(predata$Words_MH[i], split=","))), 1, 0)

}
}

What it's doing is creating a matrix with 92 columns, that's the
number of different codes, and then for every row of my data it looks
to see if the code (code 1, code 2, etc.) is in the string and if it
is, returns a 1 in the relevant column (column 1 for code 1, column 2
for code 2, etc.)

There are 1000 rows in the database, and I have to run several
versions of this code, so it just takes way too long, I have been
trying to rewrite using lapply. I tried this:

myfunction=function(x, y) ifelse(x %in%
as.numeric(unlist(strsplit(predata$Words_MH[y], split=","))), 1, 0)

for(j in 1:92){
mainmat[,j]= lapply(predata$Words, myfunction)
}

but I don't think I can use something that takes two inputs, and I
can't seem to remove either.

Here's a dput of the first 10 rows of the variable in case that's helpful:

predata$Words=c("1", "1", "1", "1", "2,3,4", "5", "1", "1", "6", "7,8,9,10")

Given these data, I want the function to return, for the first column,
1, 1, 1, 1, 0, 0, 1, 1, 0, 0 (because those are the values of Words
which contain a 1) and for the second column return 0, 0, 0, 0, 1, 0,
0, 0, 0, 0 (because the fifth value is the only one that contains a
2).
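
One way this could be written without the double loop, as a sketch on the ten example values (the object names words and codes are invented): split every string once, then test all 92 codes against each row's set of codes in a single vectorised %in% call.

```r
# The ten example values, stored as comma-separated strings.
words <- c("1", "1", "1", "1", "2,3,4", "5", "1", "1", "6", "7,8,9,10")

# Split each string once and convert the pieces to numbers.
codes <- lapply(strsplit(words, split = ","), as.numeric)

# For each row, flag which of the codes 1..92 are present.
# vapply() returns one column per row of data, so transpose at the end.
mainmat <- t(vapply(codes, function(v) as.numeric(1:92 %in% v), numeric(92)))

dim(mainmat)   # 10 92
mainmat[, 1]   # 1 1 1 1 0 0 1 1 0 0
mainmat[, 2]   # 0 0 0 0 1 0 0 0 0 0
```

Most of the time in the original loop probably goes on calling strsplit() 92 times per row and assigning into a data frame one cell at a time; here each string is split once and the result is a plain matrix.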

Any suggestions gratefully received!

Chris Beeley
Institute of Mental Health, UK



[R] odfWeave repeats output

2011-08-12 Thread Chris Beeley
Hello all-

I'm having a problem with odfWeave. I'm still testing it out, and have
used both of these code chunks, which I copied off a blog:

Number 1:

A sample document last processed
\Sexpr{Sys.time()}.
This simply illustrates the output from an
R command inserted into our document.
This is using \Sexpr{version$version.string}.

Number 2:

<<Sample1>>=

summary(iris)

@

Both do the same thing, which is generate the document using this code:

odfWeave("/media/Windows7/temp/GCAMT_in.odt",
"/media/Windows7/temp/GCAMT_out2.odt")

But the output repeats over and over in the document in a bizarre way,
stretching out over about 9 pages, like this (abbreviated):

A sample document last processed
2011-08-12 09:55:51.
This simply illustrates the output from an
R command inserted into our document.
This is using R version 2.12.1 (2010-12-16).

...

A sample document last processedA sample document last processed
2011-08-12 09:55:51.2011-08-12 09:55:51.
This simply illustrates the output from anThis simply illustrates the
output from an
R command inserted into our document.R command inserted into our document.
This is using R version 2.12.1 (2010-12-16).This is using R version
2.12.1 (2010-12-16).

...

etc.

The really weird thing is that I have replicated the problem across
two operating systems (dual boot on the same computer), windows 7
64bit and Linux Mint 11 (which is Ubuntu, not sure which version I'm
afraid). I've been unable to find anyone on any forums or anything
with the same problem.

Using R v2.13 on Windows, v 2.12 on Linux, was using RStudio but just
tested it without (just in case) and it does the same thing.

Any suggestions gratefully received.

Chris Beeley
Institute of Mental Health, UK.

Output of the operation is below.

With this output:

 odfWeave("/media/Windows7/temp/GCAMT_in.odt", 
 "/media/Windows7/temp/GCAMT_out2.odt")
  Copying  /media/Windows7/temp/GCAMT_in.odt
  Setting wd to  /tmp/RtmpAwd1Bm/odfWeave12095551677
  Unzipping ODF file using unzip -o GCAMT_in.odt
Archive:  GCAMT_in.odt
 extracting: mimetype
   creating: Configurations2/statusbar/
  inflating: Configurations2/accelerator/current.xml
   creating: Configurations2/floater/
   creating: Configurations2/popupmenu/
   creating: Configurations2/progressbar/
   creating: Configurations2/toolpanel/
   creating: Configurations2/menubar/
   creating: Configurations2/toolbar/
   creating: Configurations2/images/Bitmaps/
  inflating: content.xml
  inflating: manifest.rdf
  inflating: styles.xml
 extracting: meta.xml
  inflating: Thumbnails/thumbnail.png
  inflating: settings.xml
  inflating: META-INF/manifest.xml

  Removing  GCAMT_in.odt
  Creating a Pictures directory

  Pre-processing the contents
  Sweaving  content.Rnw

  Writing to file content_1.xml
  Processing code chunks ...

  'content_1.xml' has been Sweaved

  Removing content.xml

  Post-processing the contents
  Removing content.Rnw
  Removing styles.xml
  Renaming styles_2.xml to styles.xml
  Removing manifest.xml
  Renaming manifest_2.xml to manifest.xml
  Removing extra files

  Packaging file using zip -r GCAMT_in.odt .
  adding: mimetype (stored 0%)
  adding: content.xml (deflated 98%)
  adding: settings.xml (deflated 84%)
  adding: meta.xml (deflated 57%)
  adding: META-INF/ (stored 0%)
  adding: META-INF/manifest.xml (deflated 83%)
  adding: styles.xml (deflated 93%)
  adding: manifest.rdf (deflated 54%)
  adding: Pictures/ (stored 0%)
  adding: Thumbnails/ (stored 0%)
  adding: Thumbnails/thumbnail.png (deflated 23%)
  adding: Configurations2/ (stored 0%)
  adding: Configurations2/progressbar/ (stored 0%)
  adding: Configurations2/images/ (stored 0%)
  adding: Configurations2/images/Bitmaps/ (stored 0%)
  adding: Configurations2/toolbar/ (stored 0%)
  adding: Configurations2/menubar/ (stored 0%)
  adding: Configurations2/statusbar/ (stored 0%)
  adding: Configurations2/popupmenu/ (stored 0%)
  adding: Configurations2/accelerator/ (stored 0%)
  adding: Configurations2/accelerator/current.xml (stored 0%)
  adding: Configurations2/floater/ (stored 0%)
  adding: Configurations2/toolpanel/ (stored 0%)
  Copying  GCAMT_in.odt
  Resetting wd
  Removing  /tmp/RtmpAwd1Bm/odfWeave12095551677

  Done



[R] Match strings across two differently sized dataframes and copy corresponding row to dataframe

2011-06-30 Thread Chris Beeley
Hello-

Sorry, this is a bit of a noob question, but I can't seem to progress
it any further.

I have two dataframes which contain a series of strings which exactly
match. The problem is one has more rows than the other (more cases
have been added) and they have been sorted so that they are not in the
same order. The smaller dataframe, though, contains in another column
which has codes classifying the strings.

So, for every row of the larger dataframe, I want to look up the
string in the smaller dataframe, and then use that row number to copy
across the code for the string into the larger dataframe. Here's my
idea so far:

# comments is the smaller dataframe with the codes, mydata is the
# larger dataframe to which I would like to copy it.

# this is the match between the strings one way
commvec=charmatch(comments$ImproveOne, mydata$Improve)
# this is the match the other way
datavec=charmatch(mydata$Improve, comments$ImproveOne)

mydata$ImproveCat1=NA # produce a variable to hold the copied codes

# for all the non missing row numbers identified in the larger dataframe-
# copy the corresponding code from the smaller dataframe (which lives
# in comments$ImproveCat)
mydata$ImproveCat1[datavec[!is.na(datavec)]]=
comments$ImproveCat[commvec[!is.na(commvec)]]

However, the last command doesn't work because the variables are not
the same length. They nearly are though, not sure if that's
coincidence or shows I'm close

length(mydata$ImproveCat1[datavec[!is.na(datavec)]]) # yields 1567

length(comments$ImproveCat[commvec[!is.na(commvec)]]) # yields 1512
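
A sketch of a simpler lookup, assuming the strings match exactly as described (the example strings and codes are invented; the column names follow the post): match() returns, for each row of the larger data frame, the row number of the same string in the smaller one, or NA when there is none, so a single indexing step copies the codes across in the right order.

```r
# Smaller data frame: strings with their codes (invented values).
comments <- data.frame(
  ImproveOne = c("more staff", "better food", "quieter ward"),
  ImproveCat = c(3, 1, 2),
  stringsAsFactors = FALSE
)

# Larger data frame: same strings in a different order, plus a new one.
mydata <- data.frame(
  Improve = c("better food", "new comment", "quieter ward",
              "more staff", "better food"),
  stringsAsFactors = FALSE
)

# Row of comments that each row of mydata corresponds to (NA if absent).
idx <- match(mydata$Improve, comments$ImproveOne)

# Indexing by idx yields one code per row of mydata.
mydata$ImproveCat1 <- comments$ImproveCat[idx]

mydata$ImproveCat1   # 1 NA 2 3 1
```

Because the result always has one entry per row of mydata, the length mismatch cannot arise. merge(mydata, comments, by.x = "Improve", by.y = "ImproveOne", all.x = TRUE) is an alternative, though it may reorder the rows.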

I'm sorry, I did try to construct an example dataframe, but ironically
I can't make that work either! Sorry!

Any help gratefully received.

Many thanks!

Chris Beeley
Institute of Mental Health, UK



[R] Replace selected columns of a dataframe with NA

2011-06-20 Thread Chris Beeley
I am using the following command to replace all the missing values and
assorted typos in a dataframe with NA:

mydata[mydata>80]=NA

The problem is that the first column contains values which should be
more than 80, so really I want to do it just for
mydata[,2:length(mydata)]

I can't seem to re-write the code to fit:

mydata[,2:length(mydata)>80]=NA # no error message, but doesn't work-
# doesn't do anything, it would seem

I realise I can just keep the first column somewhere safe and copy it
back again when I'm done, but I wondered if there was a more elegant
solution, which would be much more important, if say I just wanted to
replace the odd columns, or something like that.

I found this code on the internet too:

idx <- which(foo>80, arr.ind=TRUE)
foo[idx[1], idx[2]] <- NA

But I can't seem to rewrite that either, for the same reason
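
A sketch of one way to restrict the replacement to chosen columns, on an invented toy data frame (toy, id, a, b and cols are made-up names). The attempt above most likely does nothing because `:` binds more tightly than `>`, so 2:length(mydata)>80 is evaluated as (2:length(mydata)) > 80, a logical vector that is all FALSE and therefore selects no columns. Selecting the columns first and then applying the condition to that subset avoids this.

```r
# Toy data: the first column legitimately holds values above 80.
toy <- data.frame(id = c(101, 102, 103),
                  a  = c(1, 99, 3),
                  b  = c(85, 2, 3))

# Columns to clean: everything except the first.
cols <- 2:ncol(toy)

# Build the logical mask from the selected columns only, then assign NA.
toy[cols][toy[cols] > 80] <- NA

toy$id   # 101 102 103 (untouched)
toy$a    # 1 NA 3
toy$b    # NA 2 3
```

The same pattern works for any set of columns, for example cols <- seq(1, ncol(toy), by = 2) for the odd-numbered ones.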

Many thanks!

Chris Beeley
Institute of Mental Health



[R] Subset command and the : operator

2011-05-27 Thread Chris Beeley
Hello-

I have some code that looks like this:

with(mydatalocal, sum(table(Service[Time==5:8])))

This is designed to add up the numbers of responses between the Time
codes 5 to 8 (which are integers and refer to quarters). Service is
just one of the variables, I'm just trying to count the number of
responses so I picked any of the variables. However, there is
something wrong, it returns far too low a number for the number of
responses. Indeed, if I run this:

with(mydatalocal, sum(table(Service[Time==5|Time==6|Time==7|Time==8])))

I get 4 times as many responses.

I've tried to recreate the problem with the following code:

mydata=data.frame(matrix(c(rep(1, 10), rep(2, 10), rep(3, 10), seq(1,
10, 1), seq(11, 20, 1), seq(21, 30, 1)), ncol=2))

with(mydata, sum(table(X1[X2==9:12])))

with(mydata, sum(table(X1[X2==9|X2==10|X2==11|X2==12])))

but to my immense frustration it actually seems to work fine there,
giving the same number, 4, both times. However, it does generate the
following warning message:

In X2 == 9:12 :
  longer object length is not a multiple of shorter object length

I know I can use X1[Time < 9 & Time > 3] but I would like to know
what is wrong with the 5:8 usage in case I put it somewhere else and
don't notice the problem.

Many thanks!

Chris Beeley
Institute of Mental Health
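
A short illustration of what is probably going on, written as a sketch with made-up data: `==` compares element by element and recycles the shorter vector, so `Time==5:8` tests the first value against 5, the second against 6, and so on, rather than asking whether each value is one of 5 to 8. `%in%` asks the membership question. In the recreated example the rows with X2 equal to 9 to 12 happen to line up with the recycled 9:12, which would explain why both calls returned 4 there. The vector `Time` and the names `n_eq` and `n_in` below are hypothetical.

```r
# Made-up Time codes: two responses in each of quarters 5 to 8
Time <- c(5, 5, 6, 6, 7, 7, 8, 8)

# Elementwise comparison against the recycled vector 5,6,7,8,5,6,7,8
n_eq <- sum(Time == 5:8)    # counts only positions that happen to line up: 2

# Membership test: is each value one of 5, 6, 7, 8?
n_in <- sum(Time %in% 5:8)  # counts every response in quarters 5 to 8: 8
```

With the original data this would read `with(mydatalocal, sum(Time %in% 5:8))`.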



Re: [R] How to remove rows based on frequency of factor and then difference date scores

2010-08-25 Thread Chris Beeley
Many thanks to you both. I have now filed away for future reference the 2 
factor tapply as well as the extremely useful looking plyr library. And the 
code worked beautifully :-)



On 24 Aug 2010, at 19:47, Abhijit Dasgupta, PhD <aikidasgu...@gmail.com> wrote:

> The paste-y argument is my usual trick in these situations. I forget that
> tapply can take multiple ordering arguments :)
>
> Abhijit
>
> On 08/24/2010 02:17 PM, David Winsemius wrote:
>>
>> On Aug 24, 2010, at 1:59 PM, Abhijit Dasgupta, PhD wrote:
>>
>>> The only problem with this is that Chris's unique individuals are a
>>> combination of Type and ID, as I understand it. So Type=A, ID=1 is a
>>> different individual from Type=B,ID=1. So we need to create a unique
>>> identifier per person, simplistically by uniqueID=paste(Type, ID, sep='').
>>> Then, using this new identifier, everything follows.
>>
>> I see your point. I agree that a tapply method should present both factors
>> in the indices argument.
>>
>>  new.df <- txt.df[ -which( txt.df$nn <=1), ]
>>  new.df <- new.df[ with(new.df, order(Type, ID) ), ]  # and possibly needs to be ordered?
>>  new.df$diffdays <- unlist( tapply(new.df$dt2, list(new.df$ID,
>>      new.df$Type), function(x) x[1] -x) )
>>  new.df
>>   Type ID       Date Value        dt2 nn diffdays
>> 1    A  1 16/09/2020     8 2020-09-16  3        0
>> 2    A  1 23/09/2010     9 2010-09-23  3     3646
>> 4    B  1  13/5/2010     6 2010-05-13  3        0
>>
>> But do not agree that you need, in this case at least, to create a paste()-y
>> index. Agreed, however, such a construction can be useful in other
>> situations.
>
> --
>
> Abhijit Dasgupta, PhD
> Director and Principal Statistician
> ARAASTAT
> Ph: 301.385.3067
> E: adasgu...@araastat.com
> W: http://www.araastat.com



[R] How to remove rows based on frequency of factor and then difference date scores

2010-08-24 Thread Chris Beeley
Hello-

A basic question which has nonetheless floored me entirely. I have a
dataset which looks like this:

Type  ID  Date        Value
A     1   16/09/2020  8
A     1   23/09/2010  9
B     3   18/8/2010   7
B     1   13/5/2010   6

There are two Types, which correspond to different individuals in
different conditions, and loads of ID labels (1:50) corresponding to
the different individuals in each condition, and measurements at
different times (from 1 to 10 measurements) for each individual.

I want to perform the following operations:

1) Delete all individuals for whom only one measurement is available.
In the dataset above, you can see that I want to delete the row Type B
ID 3, and Type B ID 1, but without deleting the Type A ID 1 data
because there is more than one measurement for Type A ID 1 (but not
for Type B ID1)

2) Produce difference scores for each of the Dates, so each individual
(Type A ID1 and all the others for whom more than one measurement
exists) starts at Date 1 and goes up in integers according to how
many days have elapsed.

I just know there's some incredibly cunning R-ish way of doing this
but after many hours of fiddling I have had to admit defeat.

I would be very grateful for any words of advice.

Many thanks,
Chris Beeley,
Institute of Mental Health, UK
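
The two steps above could be sketched in base R roughly as follows. This is an illustration using `ave()`, not the tapply-based code from the replies; the column names `dt2` and `nn` are borrowed from that reply, while `kept` and `elapsed` are made-up names. It counts elapsed days from 0 at each individual's first measurement; add 1 if the count should start at 1.

```r
# The four example rows from the post (dates kept exactly as posted)
dat <- data.frame(
  Type  = c("A", "A", "B", "B"),
  ID    = c(1, 1, 3, 1),
  Date  = c("16/09/2020", "23/09/2010", "18/8/2010", "13/5/2010"),
  Value = c(8, 9, 7, 6),
  stringsAsFactors = FALSE
)
dat$dt2 <- as.Date(dat$Date, format = "%d/%m/%Y")

# Step 1: count measurements per individual (Type + ID), keep those with > 1
dat$nn <- ave(dat$Value, dat$Type, dat$ID, FUN = length)
kept <- dat[dat$nn > 1, ]

# Step 2: order by individual and date, then days elapsed since first date
kept <- kept[order(kept$Type, kept$ID, kept$dt2), ]
kept$elapsed <- ave(as.numeric(kept$dt2), kept$Type, kept$ID,
                    FUN = function(d) d - min(d))
```

Grouping on both `Type` and `ID` treats Type A ID 1 and Type B ID 1 as different individuals, which is the point raised in the replies.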



Re: [R] Odp: Problem with aggregating data across time points

2010-07-03 Thread Chris Beeley

[R] Problem with aggregating data across time points

2010-07-02 Thread Chris Beeley
Hello-

I have a dataset which basically looks like this:

Location  Sex  Date      Time  Verbal  Self harm  Violence_objects  Violence
A         1    1-4-2007  1800  3       0          1                 3
A         1    1-4-2007  1230  2       1          2                 4
D         2    2-4-2007  1100  0       4          0                 0
...

I've put a dput of the first section of the data at the end of this
email. Basically I have several records for each of the dates, so 2 or
more on 1-4-2007, 2 or more on 2-4-2007, and so on until 31-12-2009.
The last four variables, which you can see at the end of the email, are
my dependent variables. They are different types of violent and
self-harming behaviour shown by patients in a psychiatric hospital.

What I want to do is:

A) sum each of the dependent variables for each of the dates (so e.g.
in the example above for 1-4-2007 it would be 3+2=5, 0+1=1, 1+2=3, and
3+4=7 for each of the variables)

B) do this sum, but only in each location this time (location is the
first variable)- so the sum for 1-4-2007 in location A, sum for
1-4-2007 in location B, and so on and so on. Because this is divided
across locations, some dates will have no data going into them and
will return 0 sums. Crucially I still want these dates to appear, so
e.g. 21-5-2008 would appear as 0 0 0 0, then 22-5-2008 might have 1 2
0 0, then 23-5-2008 0 0 0 0 again, and so on.

I've had several abortive attempts and done some Googling but have got
nowhere. I'd greatly appreciate any advice.

Many thanks,
Chris Beeley
(Institute of Mental Health, UK)
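
One way the two summaries might be approached in base R, as a sketch under assumptions rather than the answer from the list. It uses only the three example rows from the table above, with the lower-case column names from the dput; `outcomes`, `by_date`, `grid`, `by_loc` and `full` are made-up names. Part B builds every Location and day combination first, so that days without any records still appear with zero sums.

```r
# The three example rows from the table in the post
dat <- data.frame(
  Location = c("A", "A", "D"),
  Date = as.Date(c("1-4-2007", "1-4-2007", "2-4-2007"), format = "%d-%m-%Y"),
  verbal = c(3, 2, 0),
  self.harm = c(0, 1, 4),
  violence_objects = c(1, 2, 0),
  violence = c(3, 4, 0),
  stringsAsFactors = FALSE
)
outcomes <- c("verbal", "self.harm", "violence_objects", "violence")

# A) sum each dependent variable for each date
by_date <- aggregate(dat[outcomes], by = list(Date = dat$Date),
                     FUN = sum, na.rm = TRUE)

# B) sums per location and date, keeping days that have no records
all_days <- seq(min(dat$Date), max(dat$Date), by = "day")
grid <- expand.grid(Location = unique(dat$Location), Date = all_days,
                    stringsAsFactors = FALSE)
by_loc <- aggregate(dat[outcomes],
                    by = list(Location = dat$Location, Date = dat$Date),
                    FUN = sum, na.rm = TRUE)
full <- merge(grid, by_loc, all.x = TRUE)       # unmatched days become NA
full[outcomes][is.na(full[outcomes])] <- 0      # ...and are then set to 0
```

On the full data, `all_days` would run from 1-4-2007 to 31-12-2009, and the Date factor from the dput would need converting with `as.Date()` and a matching format first.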


structure(list(Location = structure(c(1L, 2L, 2L, 1L, 3L, 5L,
5L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 1L, 5L, 5L, 5L, 5L, 6L,
1L, 2L, 3L, 5L, 6L, 6L, 6L, 7L, 7L, 5L, 5L, 4L, 4L, 4L, 3L, 3L,
3L, 2L, 2L, 2L, 2L, 7L, 7L, 7L, 6L, 5L, 4L, 4L, 6L, 5L, 2L, 2L,
3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 5L, 5L, 3L, 3L, 4L,
4L, 4L, 4L), .Label = c("", "A", "B", "C", "D", "E", "F"), class = "factor"),
Sex = c(NA, 1L, NA, NA, NA, 1L, 2L, NA, NA, 2L, 2L, NA, 2L,
2L, 1L, 1L, NA, 2L, 2L, 2L, 1L, NA, NA, 1L, 1L, 1L, 1L, 2L,
1L, 2L, NA, 1L, 1L, NA, 1L, NA, NA, 2L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 2L, 2L, 2L, 2L, NA, 1L, 2L, NA, 1L, 1L, NA, 1L, NA,
1L, 2L, NA, 1L, 1L, NA, 1L, 1L, 1L, NA, 2L, 2L, 1L, 2L, 1L
), Date = structure(c(1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L,
2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 3L, 3L, 1L, 3L, 1L, 1L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 3L, 4L, 1L, 4L,
4L, 1L, 4L, 1L, 4L, 4L, 1L, 4L, 4L, 1L, 4L, 4L, 4L, 1L, 4L,
4L, 4L, 4L, 4L), .Label = c("", "01/04/07", "02/04/07", "03/04/07"
), class = "factor"), Time = structure(c(1L, 28L, 1L, 1L,
1L, 1L, 20L, 1L, 1L, 37L, 37L, 2L, 13L, 31L, 1L, 17L, 1L,
34L, 38L, 39L, 23L, 1L, 1L, 24L, 14L, 16L, 1L, 33L, 30L,
10L, 1L, 6L, 8L, 1L, 26L, 1L, 1L, 13L, 3L, 4L, 1L, 1L, 35L,
36L, 25L, 9L, 11L, 5L, 22L, 1L, 10L, 30L, 1L, 19L, 15L, 1L,
29L, 1L, 27L, 10L, 2L, 21L, 18L, 1L, 23L, 32L, 36L, 1L, 30L,
7L, 12L, 1L, 15L), .Label = c("", " ", "02:24:00", "03:44:00",
"04:30:00", "07:00:00", "08:35:00", "09:20:00", "09:30:00",
"10:00:00", "10:15:00", "10:45:00", "11:00:00", "11:20:00",
"11:30:00", "11:35:00", "11:50:00", "12:00:00", "12:25:00",
"12:30:00", "12:45:00", "15:00:00", "15:15:00", "15:30:00",
"15:35:00", "17:15:00", "17:50:00", "18:00:00", "19:00:00",
"19:30:00", "19:50:00", "20:00:00", "20:30:00", "20:55:00",
"22:15:00", "22:30:00", "22:35:00", "22:40:00", "23:10:00"
), class = "factor"), verbal = c(NA, 3L, NA, NA, NA, 3L,
0L, NA, NA, 0L, 0L, NA, 0L, 0L, 0L, 4L, NA, 0L, 0L, 0L, 4L,
NA, NA, 4L, 3L, 0L, 4L, 0L, 0L, 0L, NA, 0L, 0L, NA, 0L, NA,
NA, 4L, 0L, 4L, 0L, 0L, 4L, 1L, 4L, 3L, 0L, 0L, 0L, NA, 4L,
0L, NA, 0L, 3L, NA, 1L, NA, 0L, 3L, NA, 1L, 4L, NA, 4L, 0L,
0L, NA, 0L, 0L, 0L, 0L, 1L), self.harm = c(NA, 0L, NA, NA,
NA, 0L, 0L, NA, NA, 0L, 1L, NA, 2L, 0L, 0L, 2L, NA, 2L, 0L,
2L, 0L, NA, NA, 0L, 0L, 2L, 0L, 1L, 2L, 1L, NA, 0L, 0L, NA,
0L, NA, NA, 0L, 2L, 0L, 1L, 1L, 0L, 2L, 0L, 0L, 0L, 0L, 0L,
NA, 0L, 2L, NA, 0L, 0L, NA, 0L, NA, 4L, 0L, NA, 1L, 0L, NA,
1L, 3L, 1L, NA, 0L, 0L, 0L, 1L, 0L), violence_objects = c(NA,
0L, NA, NA, NA, 0L, 0L, NA, NA, 0L, 0L, NA, 0L, 0L, 0L, 3L,
NA, 0L, 0L, 0L, 0L, NA, NA, 0L, 0L, 0L, 0L, 0L, 0L, 0L, NA,
0L, 0L, NA, 0L, NA, NA, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
4L, 0L, 4L, NA, 0L, 0L, NA, 0L, 0L, NA, 0L, NA, 0L, 0L, NA,
0L, 0L, NA, 0L, 0L, 0L, NA, 0L, 0L, 0L, 0L, 0L), violence = c(NA,
0L, NA, NA, NA, 0L, 1L, NA, NA, 3L, 0L, NA, 0L, 1L, 1L, 1L,
NA, 1L, 1L, 0L, 0L, NA, NA, 0L, 0L, 0L, 0L, 0L, 0L, 0L, NA,
3L, 3L, NA, 2L, NA, NA, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L,
0L, 3L, 0L, NA, 0L, 0L, NA, 2L, 0L, NA, 0L, NA, 0L, 0L, NA,
0L, 0L, NA, 0L, 0L, 0L, NA, 3L, 3L, 2L, 0L, 0L)), .Names = c("Location",
"Sex", "Date", "Time", "verbal", "self.harm"