Re: [R] test logistic regression model

2022-11-20 Thread Mitchell Maltenfort
Agreed on the ranking of (1) vs (2)



On Sun, Nov 20, 2022 at 1:30 PM Ebert,Timothy Aaron  wrote:

> I like option 1. Option 2 may cause problems if you are pooling groups
> that do not go together. This is especially a problem if you know that the
> data is missing some groups. I would consider dropping rare groups - or
> compare results between pooling and dropping options. If the answer is the
> same in both cases then use the approach that makes your life easier with
> reviewers/clients. If the answer is different then I would go with dropping
> rare categories, or present both and highlight the difference in outcome. A
> third option is to gather more data.
>
> Tim
>
> -Original Message-
> From: R-help  On Behalf Of Bert Gunter
> Sent: Sunday, November 20, 2022 1:06 PM
> To: Mitchell Maltenfort 
> Cc: R-help 
> Subject: Re: [R] test logistic regression model
>
> [External Email]
>
> I think (2) might be a bad idea if one of the "sparse" categories has high
> predictive power. You'll lose it when you pool, will you not?
> Also, there is the problem of subjectively defining "sparse."
>
> However, 1) seems quite sensible to me. But IANAE.
>
> -- Bert
>
> On Sun, Nov 20, 2022 at 9:49 AM Mitchell Maltenfort 
> wrote:
> >
> > Two possible fixes occur to me
> >
> > 1) Redo the test/training split but within levels of factor - so you
> > have the same split within each level and each level accounted for in
> > training and testing
> >
> > 2) if you have a lot of levels, and perhaps sparse representation in a
> > few, consider recoding levels to pool the rare ones into an "other"
> > category
> >
> > On Sun, Nov 20, 2022 at 11:41 AM Bert Gunter 
> wrote:
> >>
> >> small reprex:
> >>
> >> set.seed(5)
> >> dat <- data.frame(f = rep(c('r','g'),4), y = runif(8))
> >> newdat <- data.frame(f =rep(c('r','g','b'),2))
> >> ## convert values in newdat not seen in dat to NA
> >> is.na(newdat$f) <-!( newdat$f %in% dat$f)
> >> lmfit <- lm(y~f, data = dat)
> >>
> >> ##Result:
> >> > predict(lmfit,newdat)
> >>         1         2         3         4         5         6
> >> 0.4374251 0.6196527        NA 0.4374251 0.6196527        NA
> >>
> >> If this does not suffice, as Rui said, we need details of what you did.
> >> (predict.glm works like predict.lm)
> >>
> >>
> >> -- Bert
> >>
> >>
> >> On Sun, Nov 20, 2022 at 7:46 AM Rui Barradas 
> wrote:
> >> >
> >> > At 15:29 on 20/11/2022, Gábor Malomsoki wrote:
> >> > > Dear Bert,
> >> > >
> >> > > Yes, was trying to fill the not existing categories with NAs, but
> >> > > the suggested solutions in stackoverflow.com unfortunately did not
> work.
> >> > >
> >> > > Best regards
> >> > > Gabor
> >> > >
> >> > >
> >> > > Bert Gunter wrote on Sun, 20 Nov 2022, 16:20:
> >> > >
> >> > >> You can't predict results for categories that you've not seen
> >> > >> before (think about it). You will need to remove those cases
> >> > >> from your test set (or convert them to NA and predict them as NA).
> >> > >>
> >> > >> -- Bert
> >> > >>
> >> > >> On Sun, Nov 20, 2022 at 7:02 AM Gábor Malomsoki
> >> > >> 
> >> > >> wrote:
> >> > >>
> >> > >>> Dear all,
> >> > >>>
> >> > >>> i have created a logistic regression model,
> >> > >>>   on the train df:
> >> > >>> mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family =
> >> > >>> "binomial")
> >> > >>>
> >> > >>> then i try to predict with the test df
> >> > >>> Predict<- predict(mymodel1, newdata = test, type = "response")
> >> > >>> then iget this error message:
> >> > >>> Error in model.frame.default(Terms, newdata, na.action =
> >> > >>> na.action, xlev =
> >> > >>> object$xlevels)
> >> > >>> Factor  "TG_KraftF5" has new levels
> >> > >>>
> >> > >>> i have tried different proposals from stackoverflow, but
> >> > >>> unfortu

Re: [R] test logistic regression model

2022-11-20 Thread Ebert,Timothy Aaron
I like option 1. Option 2 may cause problems if you are pooling groups that do 
not go together. This is especially a problem if you know that the data is 
missing some groups. I would consider dropping rare groups - or compare results 
between pooling and dropping options. If the answer is the same in both cases 
then use the approach that makes your life easier with reviewers/clients. If 
the answer is different then I would go with dropping rare categories, or 
present both and highlight the difference in outcome. A third option is to 
gather more data.
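
A minimal sketch of the pooling-versus-dropping comparison described above. The data, the cutoff `min_n`, and all object names are made up for illustration; what counts as "rare" remains a judgment call.

```r
set.seed(1)
# Hypothetical data: two common levels and one rare level of a factor
d <- data.frame(
  f = factor(c(rep("a", 40), rep("b", 40), rep("c", 4))),
  x = rnorm(84)
)
d$y <- rbinom(84, 1, plogis(0.5 * d$x + ifelse(d$f == "b", 1, 0)))

min_n  <- 10                       # assumed cutoff for "rare"
counts <- table(d$f)
rare   <- names(counts)[counts < min_n]

# Option A: pool the rare levels into an "other" category
d_pool <- d
lev <- as.character(d_pool$f)
lev[lev %in% rare] <- "other"
d_pool$f <- factor(lev)

# Option B: drop the rows that belong to rare levels
d_drop <- droplevels(d[!(d$f %in% rare), ])

fit_pool <- glm(y ~ x + f, data = d_pool, family = binomial)
fit_drop <- glm(y ~ x + f, data = d_drop, family = binomial)

# Compare the coefficients that both fits share
shared <- c("x", "fb")
cbind(pool = coef(fit_pool)[shared], drop = coef(fit_drop)[shared])
```

If the shared coefficients (and the predictions that matter to you) are close, either approach is defensible; if not, the difference itself is worth reporting.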

Tim

-Original Message-
From: R-help  On Behalf Of Bert Gunter
Sent: Sunday, November 20, 2022 1:06 PM
To: Mitchell Maltenfort 
Cc: R-help 
Subject: Re: [R] test logistic regression model

[External Email]

I think (2) might be a bad idea if one of the "sparse" categories has high 
predictive power. You'll lose it when you pool, will you not?
Also, there is the problem of subjectively defining "sparse."

However, 1) seems quite sensible to me. But IANAE.

-- Bert

On Sun, Nov 20, 2022 at 9:49 AM Mitchell Maltenfort  wrote:
>
> Two possible fixes occur to me
>
> 1) Redo the test/training split but within levels of factor - so you 
> have the same split within each level and each level accounted for in 
> training and testing
>
> 2) if you have a lot of levels, and perhaps sparse representation in a 
> few, consider recoding levels to pool the rare ones into an "other" 
> category
>
> On Sun, Nov 20, 2022 at 11:41 AM Bert Gunter  wrote:
>>
>> small reprex:
>>
>> set.seed(5)
>> dat <- data.frame(f = rep(c('r','g'),4), y = runif(8))
>> newdat <- data.frame(f =rep(c('r','g','b'),2))
>> ## convert values in newdat not seen in dat to NA
>> is.na(newdat$f) <-!( newdat$f %in% dat$f)
>> lmfit <- lm(y~f, data = dat)
>>
>> ##Result:
>> > predict(lmfit,newdat)
>>         1         2         3         4         5         6
>> 0.4374251 0.6196527        NA 0.4374251 0.6196527        NA
>>
>> If this does not suffice, as Rui said, we need details of what you did.
>> (predict.glm works like predict.lm)
>>
>>
>> -- Bert
>>
>>
>> On Sun, Nov 20, 2022 at 7:46 AM Rui Barradas  wrote:
>> >
>> > At 15:29 on 20/11/2022, Gábor Malomsoki wrote:
>> > > Dear Bert,
>> > >
>> > > Yes, was trying to fill the not existing categories with NAs, but 
>> > > the suggested solutions in stackoverflow.com unfortunately did not work.
>> > >
>> > > Best regards
>> > > Gabor
>> > >
>> > >
>> > > Bert Gunter wrote on Sun, 20 Nov 2022, 16:20:
>> > >
>> > >> You can't predict results for categories that you've not seen 
>> > >> before (think about it). You will need to remove those cases 
>> > >> from your test set (or convert them to NA and predict them as NA).
>> > >>
>> > >> -- Bert
>> > >>
>> > >> On Sun, Nov 20, 2022 at 7:02 AM Gábor Malomsoki 
>> > >> 
>> > >> wrote:
>> > >>
>> > >>> Dear all,
>> > >>>
>> > >>> i have created a logistic regression model,
>> > >>>   on the train df:
>> > >>> mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family =
>> > >>> "binomial")
>> > >>>
>> > >>> then i try to predict with the test df
>> > >>> Predict<- predict(mymodel1, newdata = test, type = "response") 
>> > >>> then iget this error message:
>> > >>> Error in model.frame.default(Terms, newdata, na.action = 
>> > >>> na.action, xlev =
>> > >>> object$xlevels)
>> > >>> Factor  "TG_KraftF5" has new levels
>> > >>>
>> > >>> i have tried different proposals from stackoverflow, but 
>> > >>> unfortunately they did not solved the problem.
>> > >>> Do you have any idea how to test a logistic regression model 
>> > >>> when you have different levels in train and in test df?
>> > >>>
>> > >>> thank you in advance
>> > >>> Regards,
>> > >>> Gabor
>> > >>>
>> > >>>  [[alternative HTML version deleted]]
>> > >>>
>> > >>> __
>> > >>> R-help@r-project.org mailing list -- To UNS

Re: [R] test logistic regression model

2022-11-20 Thread Bert Gunter
I think (2) might be a bad idea if one of the "sparse" categories has
high predictive power. You'll lose it when you pool, will you not?
Also, there is the problem of subjectively defining "sparse."

However, 1) seems quite sensible to me. But IANAE.

-- Bert

On Sun, Nov 20, 2022 at 9:49 AM Mitchell Maltenfort  wrote:
>
> Two possible fixes occur to me
>
> 1) Redo the test/training split but within levels of factor - so you have the 
> same split within each level and each level accounted for in training and 
> testing
>
> 2) if you have a lot of levels, and perhaps sparse representation in a few, 
> consider recoding levels to pool the rare ones into an “other” category
>
> On Sun, Nov 20, 2022 at 11:41 AM Bert Gunter  wrote:
>>
>> small reprex:
>>
>> set.seed(5)
>> dat <- data.frame(f = rep(c('r','g'),4), y = runif(8))
>> newdat <- data.frame(f =rep(c('r','g','b'),2))
>> ## convert values in newdat not seen in dat to NA
>> is.na(newdat$f) <-!( newdat$f %in% dat$f)
>> lmfit <- lm(y~f, data = dat)
>>
>> ##Result:
>> > predict(lmfit,newdat)
>>         1         2         3         4         5         6
>> 0.4374251 0.6196527        NA 0.4374251 0.6196527        NA
>>
>> If this does not suffice, as Rui said, we need details of what you did.
>> (predict.glm works like predict.lm)
>>
>>
>> -- Bert
>>
>>
>> On Sun, Nov 20, 2022 at 7:46 AM Rui Barradas  wrote:
>> >
>> > At 15:29 on 20/11/2022, Gábor Malomsoki wrote:
>> > > Dear Bert,
>> > >
>> > > Yes, was trying to fill the not existing categories with NAs, but the
>> > > suggested solutions in stackoverflow.com unfortunately did not work.
>> > >
>> > > Best regards
>> > > Gabor
>> > >
>> > >
>> > > Bert Gunter wrote on Sun, 20 Nov 2022, 16:20:
>> > >
>> > >> You can't predict results for categories that you've not seen before
>> > >> (think about it). You will need to remove those cases from your test set
>> > >> (or convert them to NA and predict them as NA).
>> > >>
>> > >> -- Bert
>> > >>
>> > >> On Sun, Nov 20, 2022 at 7:02 AM Gábor Malomsoki 
>> > >> 
>> > >> wrote:
>> > >>
>> > >>> Dear all,
>> > >>>
>> > >>> i have created a logistic regression model,
>> > >>>   on the train df:
>> > >>> mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family =
>> > >>> "binomial")
>> > >>>
>> > >>> then i try to predict with the test df
>> > >>> Predict<- predict(mymodel1, newdata = test, type = "response")
>> > >>> then iget this error message:
>> > >>> Error in model.frame.default(Terms, newdata, na.action = na.action, 
>> > >>> xlev =
>> > >>> object$xlevels)
>> > >>> Factor  "TG_KraftF5" has new levels
>> > >>>
>> > >>> i have tried different proposals from stackoverflow, but unfortunately
>> > >>> they
>> > >>> did not solved the problem.
>> > >>> Do you have any idea how to test a logistic regression model when you 
>> > >>> have
>> > >>> different levels in train and in test df?
>> > >>>
>> > >>> thank you in advance
>> > >>> Regards,
>> > >>> Gabor
>> > >>>
>> > >>>  [[alternative HTML version deleted]]
>> > >>>
>> > >>> __
>> > >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> > >>> PLEASE do read the posting guide
>> > >>> http://www.R-project.org/posting-guide.html
>> > >>> and provide commented, minimal, self-contained, reproducible code.
>> > >>>
>> > >>
>> > >
>> >
>> > hello,
>> >
>> > What exactly didn't work? You say you have tried the solutions found in
>> > stackoverflow but without a link, we don't know which answers to which
>> > questions you are talking about.
>> > Like Bert said, if you assign NA to the new levels, present only in
>> > test, it should work.
>> >
>> > Can you post links to what you have tried?
>> >
>> > Hope this helps,
>> >
>> > Rui Barradas
>>
>
> --
> Sent from Gmail Mobile

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] test logistic regression model

2022-11-20 Thread Mitchell Maltenfort
Two possible fixes occur to me

1) Redo the test/training split but within levels of factor - so you have
the same split within each level and each level accounted for in training
and testing

2) if you have a lot of levels, and perhaps sparse representation in a few,
consider recoding levels to pool the rare ones into an “other” category
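
A minimal sketch of option 1, splitting within each level of the factor so that every level that reaches the test set is also present in training. The data frame `d`, the column names, and the 70/30 proportion are assumptions for illustration only.

```r
set.seed(42)
# Hypothetical data: factor predictor f with one sparse level ("b")
d <- data.frame(
  f = factor(rep(c("r", "g", "b"), times = c(30, 24, 6))),
  y = rbinom(60, 1, 0.5)
)

train_frac <- 0.7  # assumed training proportion

# Sample training rows separately inside each level of f
rows_by_level <- split(seq_len(nrow(d)), d$f)
train_idx <- unlist(lapply(rows_by_level, function(idx) {
  n_train <- max(1, floor(train_frac * length(idx)))  # at least one row per level
  idx[sample.int(length(idx), n_train)]
}))

train <- d[train_idx, ]
test  <- d[-train_idx, ]

fit <- glm(y ~ f, data = train, family = binomial)
p   <- predict(fit, newdata = test, type = "response")  # no "new levels" error
```

Note that a level with a single observation ends up only in the training set under this scheme, which is harmless for `predict()` but means that level is never tested.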

On Sun, Nov 20, 2022 at 11:41 AM Bert Gunter  wrote:

> small reprex:
>
> set.seed(5)
> dat <- data.frame(f = rep(c('r','g'),4), y = runif(8))
> newdat <- data.frame(f =rep(c('r','g','b'),2))
> ## convert values in newdat not seen in dat to NA
> is.na(newdat$f) <-!( newdat$f %in% dat$f)
> lmfit <- lm(y~f, data = dat)
>
> ##Result:
> > predict(lmfit,newdat)
>         1         2         3         4         5         6
> 0.4374251 0.6196527        NA 0.4374251 0.6196527        NA
>
> If this does not suffice, as Rui said, we need details of what you did.
> (predict.glm works like predict.lm)
>
>
> -- Bert
>
>
> On Sun, Nov 20, 2022 at 7:46 AM Rui Barradas  wrote:
> >
> > At 15:29 on 20/11/2022, Gábor Malomsoki wrote:
> > > Dear Bert,
> > >
> > > Yes, was trying to fill the not existing categories with NAs, but the
> > > suggested solutions in stackoverflow.com unfortunately did not work.
> > >
> > > Best regards
> > > Gabor
> > >
> > >
> > > Bert Gunter wrote on Sun, 20 Nov 2022, 16:20:
> > >
> > >> You can't predict results for categories that you've not seen before
> > >> (think about it). You will need to remove those cases from your test
> set
> > >> (or convert them to NA and predict them as NA).
> > >>
> > >> -- Bert
> > >>
> > >> On Sun, Nov 20, 2022 at 7:02 AM Gábor Malomsoki <
> gmalomsoki1...@gmail.com>
> > >> wrote:
> > >>
> > >>> Dear all,
> > >>>
> > >>> i have created a logistic regression model,
> > >>>   on the train df:
> > >>> mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family =
> > >>> "binomial")
> > >>>
> > >>> then i try to predict with the test df
> > >>> Predict<- predict(mymodel1, newdata = test, type = "response")
> > >>> then iget this error message:
> > >>> Error in model.frame.default(Terms, newdata, na.action = na.action,
> xlev =
> > >>> object$xlevels)
> > >>> Factor  "TG_KraftF5" has new levels
> > >>>
> > >>> i have tried different proposals from stackoverflow, but
> unfortunately
> > >>> they
> > >>> did not solved the problem.
> > >>> Do you have any idea how to test a logistic regression model when
> you have
> > >>> different levels in train and in test df?
> > >>>
> > >>> thank you in advance
> > >>> Regards,
> > >>> Gabor
> > >>>
> > >>>  [[alternative HTML version deleted]]
> > >>>
> > >>> __
> > >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > >>> PLEASE do read the posting guide
> > >>> http://www.R-project.org/posting-guide.html
> > >>> and provide commented, minimal, self-contained, reproducible code.
> > >>>
> > >>
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> > hello,
> >
> > What exactly didn't work? You say you have tried the solutions found in
> > stackoverflow but without a link, we don't know which answers to which
> > questions you are talking about.
> > Like Bert said, if you assign NA to the new levels, present only in
> > test, it should work.
> >
> > Can you post links to what you have tried?
> >
> > Hope this helps,
> >
> > Rui Barradas
>
-- 
Sent from Gmail Mobile



Re: [R] test logistic regression model

2022-11-20 Thread Bert Gunter
small reprex:

set.seed(5)
dat <- data.frame(f = rep(c('r','g'),4), y = runif(8))
newdat <- data.frame(f =rep(c('r','g','b'),2))
## convert values in newdat not seen in dat to NA
is.na(newdat$f) <-!( newdat$f %in% dat$f)
lmfit <- lm(y~f, data = dat)

##Result:
> predict(lmfit,newdat)
        1         2         3         4         5         6
0.4374251 0.6196527        NA 0.4374251 0.6196527        NA

If this does not suffice, as Rui said, we need details of what you did.
(predict.glm works like predict.lm)
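
For the logistic case, the same pattern can be sketched with `glm()`. This is an illustration under assumed data (the binary response `y` is simulated and has no connection to the original poster's data):

```r
set.seed(5)
# Hypothetical binary response; f has levels 'r' and 'g' only
dat    <- data.frame(f = rep(c("r", "g"), 20), y = rbinom(40, 1, 0.5))
newdat <- data.frame(f = rep(c("r", "g", "b"), 2))

# Convert values in newdat not seen in dat to NA
is.na(newdat$f) <- !(newdat$f %in% dat$f)

glmfit <- glm(y ~ f, data = dat, family = binomial)
p <- predict(glmfit, newdat, type = "response")
p  # rows 3 and 6 (the unseen level 'b') come back as NA
```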


-- Bert


On Sun, Nov 20, 2022 at 7:46 AM Rui Barradas  wrote:
>
> At 15:29 on 20/11/2022, Gábor Malomsoki wrote:
> > Dear Bert,
> >
> > Yes, was trying to fill the not existing categories with NAs, but the
> > suggested solutions in stackoverflow.com unfortunately did not work.
> >
> > Best regards
> > Gabor
> >
> >
> > Bert Gunter wrote on Sun, 20 Nov 2022, 16:20:
> >
> >> You can't predict results for categories that you've not seen before
> >> (think about it). You will need to remove those cases from your test set
> >> (or convert them to NA and predict them as NA).
> >>
> >> -- Bert
> >>
> >> On Sun, Nov 20, 2022 at 7:02 AM Gábor Malomsoki 
> >> wrote:
> >>
> >>> Dear all,
> >>>
> >>> i have created a logistic regression model,
> >>>   on the train df:
> >>> mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family =
> >>> "binomial")
> >>>
> >>> then i try to predict with the test df
> >>> Predict<- predict(mymodel1, newdata = test, type = "response")
> >>> then iget this error message:
> >>> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
> >>> object$xlevels)
> >>> Factor  "TG_KraftF5" has new levels
> >>>
> >>> i have tried different proposals from stackoverflow, but unfortunately
> >>> they
> >>> did not solved the problem.
> >>> Do you have any idea how to test a logistic regression model when you have
> >>> different levels in train and in test df?
> >>>
> >>> thank you in advance
> >>> Regards,
> >>> Gabor
> >>>
>
> hello,
>
> What exactly didn't work? You say you have tried the solutions found in
> stackoverflow but without a link, we don't know which answers to which
> questions you are talking about.
> Like Bert said, if you assign NA to the new levels, present only in
> test, it should work.
>
> Can you post links to what you have tried?
>
> Hope this helps,
>
> Rui Barradas



Re: [R] test logistic regression model

2022-11-20 Thread Rui Barradas

At 15:29 on 20/11/2022, Gábor Malomsoki wrote:
> Dear Bert,
>
> Yes, was trying to fill the not existing categories with NAs, but the
> suggested solutions in stackoverflow.com unfortunately did not work.
>
> Best regards
> Gabor
>
>
> Bert Gunter wrote on Sun, 20 Nov 2022, 16:20:
>
>> You can't predict results for categories that you've not seen before
>> (think about it). You will need to remove those cases from your test set
>> (or convert them to NA and predict them as NA).
>>
>> -- Bert
>>
>> On Sun, Nov 20, 2022 at 7:02 AM Gábor Malomsoki
>> wrote:
>>
>>> Dear all,
>>>
>>> I have created a logistic regression model
>>> on the train df:
>>> mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family =
>>> "binomial")
>>>
>>> Then I try to predict with the test df:
>>> Predict<- predict(mymodel1, newdata = test, type = "response")
>>> and I get this error message:
>>> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
>>> object$xlevels)
>>> Factor  "TG_KraftF5" has new levels
>>>
>>> I have tried different proposals from stackoverflow, but unfortunately
>>> they did not solve the problem.
>>> Do you have any idea how to test a logistic regression model when you have
>>> different levels in train and in test df?
>>>
>>> Thank you in advance
>>> Regards,
>>> Gabor

hello,

What exactly didn't work? You say you have tried the solutions found in 
stackoverflow but without a link, we don't know which answers to which 
questions you are talking about.
Like Bert said, if you assign NA to the new levels, present only in 
test, it should work.


Can you post links to what you have tried?

Hope this helps,

Rui Barradas



Re: [R] test logistic regression model

2022-11-20 Thread Gábor Malomsoki
Dear Bert,

Yes, was trying to fill the not existing categories with NAs, but the
suggested solutions in stackoverflow.com unfortunately did not work.

Best regards
Gabor


Bert Gunter wrote on Sun, 20 Nov 2022, 16:20:

> You can't predict results for categories that you've not seen before
> (think about it). You will need to remove those cases from your test set
> (or convert them to NA and predict them as NA).
>
> -- Bert
>
> On Sun, Nov 20, 2022 at 7:02 AM Gábor Malomsoki 
> wrote:
>
>> Dear all,
>>
>> i have created a logistic regression model,
>>  on the train df:
>> mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family =
>> "binomial")
>>
>> then i try to predict with the test df
>> Predict<- predict(mymodel1, newdata = test, type = "response")
>> then iget this error message:
>> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
>> object$xlevels)
>> Factor  "TG_KraftF5" has new levels
>>
>> i have tried different proposals from stackoverflow, but unfortunately
>> they
>> did not solved the problem.
>> Do you have any idea how to test a logistic regression model when you have
>> different levels in train and in test df?
>>
>> thank you in advance
>> Regards,
>> Gabor
>>


Re: [R] test logistic regression model

2022-11-20 Thread Bert Gunter
You can't predict results for categories that you've not seen before (think
about it). You will need to remove those cases from your test set (or
convert them to NA and predict them as NA).
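
Both options can be sketched as follows. The variable names `book_state` and `TG_KraftF5` come from the original question; the data values and the level names "A", "B", "C" are invented here, and `xlevels` is the component of the fitted model that stores the levels seen during fitting.

```r
set.seed(1)
# Hypothetical train/test data; level "C" appears only in test
train <- data.frame(TG_KraftF5 = rep(c("A", "B"), 25),
                    book_state = rbinom(50, 1, 0.5))
test  <- data.frame(TG_KraftF5 = c("A", "B", "C", "A"))

mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family = "binomial")

# Levels the model was fitted on, and which test rows use them
seen  <- mymodel1$xlevels$TG_KraftF5
known <- test$TG_KraftF5 %in% seen

# Option 1: remove the unseen cases from the test set
pred_drop <- predict(mymodel1, newdata = test[known, , drop = FALSE],
                     type = "response")

# Option 2: convert unseen values to NA and let them be predicted as NA
test_na <- test
test_na$TG_KraftF5[!known] <- NA
pred_na <- predict(mymodel1, newdata = test_na, type = "response")
```

Option 2 keeps the predictions aligned row by row with the original test set, which is often more convenient when computing accuracy afterwards.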

-- Bert

On Sun, Nov 20, 2022 at 7:02 AM Gábor Malomsoki 
wrote:

> Dear all,
>
> i have created a logistic regression model,
>  on the train df:
> mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family = "binomial")
>
> then i try to predict with the test df
> Predict<- predict(mymodel1, newdata = test, type = "response")
> then iget this error message:
> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
> object$xlevels)
> Factor  "TG_KraftF5" has new levels
>
> i have tried different proposals from stackoverflow, but unfortunately they
> did not solved the problem.
> Do you have any idea how to test a logistic regression model when you have
> different levels in train and in test df?
>
> thank you in advance
> Regards,
> Gabor
>


Re: [R] test if something was plotted on pdf device

2019-09-13 Thread PIKAL Petr
Dear Duncan

Thank you for the code; I will test it, or at least check what it does. In the 
end I found what is probably an easier solution.

I stay with my original code

if (dev.cur()==1) plot(ecdf(velik[,"ecd"]), main = ufil[j], col=i) else
plot(ecdf(velik[,"ecd"]), add=T, col=i)

After plot is finished and cycle ends, I copy result to pdf device

dev.copy(pdf,paste(gsub(".xls", "", ufil)[j], ".pdf", sep=""))
dev.off()

Using this approach I can keep my original code (almost): I check with 
dev.cur() whether a plot has been initialised, and copy it to pdf once it is 
finished.

The only drawback is that the plot flashes on the screen device while it is 
being drawn, but I can live with that.
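
A different technique that avoids the screen device altogether is to track the "first plot" state in a variable instead of asking the device. This is only a sketch: the flag name `first`, the loop over three simulated samples, and the temporary file name are assumptions, not part of the code discussed in this thread.

```r
f <- tempfile(fileext = ".pdf")  # hypothetical output file
pdf(f)

first <- TRUE                    # nothing has been drawn on the device yet
for (i in 1:3) {
  x <- rnorm(100)
  if (first) {
    plot(ecdf(x), main = "ECDFs", col = i)  # starts the plot
    first <- FALSE
  } else {
    plot(ecdf(x), add = TRUE, col = i)      # adds to the existing plot
  }
}

invisible(dev.off())
```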

Thank you again and best regards

Petr

> -Original Message-
> From: Duncan Murdoch 
> Sent: Thursday, September 12, 2019 2:29 PM
> To: PIKAL Petr; r-help mailing list
> Subject: Re: [R] test if something was plotted on pdf device
>
> On 12/09/2019 7:10 a.m., PIKAL Petr wrote:
> > Dear all
> >
> > Is there any simple way checking whether after calling pdf device
> something was plotted into it?
> >
> > In interactive session I used
> >
> > if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)),
> > add=T, col=i) which enabled me to test if plot is open
> >
> > But when I want to call eg. pdf("test.pdf") before cycle
> > dev.cur()==1 is FALSE even when no plot is drawn and plot.new error
> comes.
> >
> >> pdf("test.pdf")
> >
> > if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)),
> > add=T, col=i)
> >
> > Error in segments(ti.l, y, ti.r, y, col = col.hor, lty = lty, lwd = lwd,  :
> >plot.new has not been called yet
> >
>
> I don't know if this is reliable or not, but you could use code like this:
>
>   f <- tempfile()
>   pdf(f)
>   blankPlot <- recordPlot()
>   dev.off()
>   unlink(f)
>
>   pdf("test.pdf")
>
>   ## ... unknown operations ...
>
>   if (dev.cur() == 1 || identical(recordPlot(), blankPlot)) {
>     plot(ecdf(rnorm(100)))
>   } else {
>     plot(ecdf(rnorm(100)), add=TRUE, col=i)
>   }
>
>
>
> Duncan Murdoch
Personal data: Information about processing and protection of business 
partners' personal data is available on the website: 
https://www.precheza.cz/en/personal-data-protection-principles/
Confidentiality: This email and any documents attached to it may be 
confidential and are subject to the legally binding disclaimer: 
https://www.precheza.cz/en/01-disclaimer/



Re: [R] test if something was plotted on pdf device

2019-09-12 Thread Duncan Murdoch

On 12/09/2019 7:10 a.m., PIKAL Petr wrote:
> Dear all
>
> Is there any simple way checking whether after calling pdf device something was
> plotted into it?
>
> In interactive session I used
>
> if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)), add=T,
> col=i)
> which enabled me to test if plot is open
>
> But when I want to call eg. pdf("test.pdf") before cycle
> dev.cur()==1 is FALSE even when no plot is drawn and plot.new error comes.
>
> > pdf("test.pdf")
>
> if (dev.cur()==1) plot(ecdf(rnorm(100))) else plot(ecdf(rnorm(100)), add=T,
> col=i)
>
> Error in segments(ti.l, y, ti.r, y, col = col.hor, lty = lty, lwd = lwd,  :
>    plot.new has not been called yet
>


I don't know if this is reliable or not, but you could use code like this:

  f <- tempfile()
  pdf(f)
  blankPlot <- recordPlot()
  dev.off()
  unlink(f)

  pdf("test.pdf")

  ## ... unknown operations ...

  if (dev.cur() == 1 || identical(recordPlot(), blankPlot)) {
    plot(ecdf(rnorm(100)))
  } else {
    plot(ecdf(rnorm(100)), add=TRUE, col=i)
  }



Duncan Murdoch



Re: [R] test of independence

2018-12-20 Thread Greg Snow
The basic test of independence for a table based on the Chi-squared
distribution can be done using the `chisq.test` function.  This is in
the stats package which is installed and loaded by default, so you
don't need to do anything additional.  There is also the `fisher.test`
function for Fisher's exact test (similar hypotheses, different
methodology and assumptions, may be really slow on your table).

If you need more than the basics provided in those functions, then a
search of CRAN may be helpful, or give us more detail to be able to
help.
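
A minimal sketch of both tests on a 16x16 table. The counts are simulated here purely for illustration (`tab` is a made-up name); with sparse cells or a slow exact computation, the Monte Carlo versions via `simulate.p.value = TRUE` are one possible workaround.

```r
set.seed(123)
# Hypothetical 16x16 table of counts
tab <- matrix(rpois(256, lambda = 20), nrow = 16)

# Pearson chi-squared test of independence
res <- chisq.test(tab)
res$parameter        # degrees of freedom: (16 - 1) * (16 - 1)
min(res$expected)    # check that expected counts are not too small

# Monte Carlo p-values, useful when the asymptotic approximation is doubtful
# or the exact computation is too slow
res_sim    <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)
fisher_sim <- fisher.test(tab, simulate.p.value = TRUE, B = 2000)

c(chisq = res$p.value, chisq_sim = res_sim$p.value, fisher_sim = fisher_sim$p.value)
```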

On Thu, Dec 20, 2018 at 12:08 AM km  wrote:
>
> Dear All,
>
> How do I do a test of independence with 16x16 table of counts.
> Please suggest.
>
> Regards,
> KM
>



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] test of independence

2018-12-20 Thread PIKAL Petr
Hi

Did you search CRAN? I got **many** results for

test of independence

which may or may not provide you with suitable procedures.

Cheers
Petr




Re: [R] Test if data uniformly distributed (newbie)

2018-04-10 Thread Huber, Florian
Dear Mr. Savicky,

I am currently working on a project where I want to test a random number 
generator, which is supposed to create 10,000 continuous, uniformly 
distributed random numbers between 0 and 1. I am now wondering if I can use the 
Chi-Squared-Test to solve this problem or if the Kolmogorov-Smirnov-test would 
be a better fit.
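
For readers of the archive, both tests mentioned in the question can be run in base R. The following is only a sketch: `runif()` stands in for the generator under test, and the choice of 20 equal-width bins for the chi-squared test is an arbitrary assumption. Neither test says anything about dependence between successive draws.

```r
set.seed(42)
x <- runif(10000)            # stand-in for the generator's output

# Kolmogorov-Smirnov test against the continuous U(0, 1) distribution.
ks <- ks.test(x, "punif", 0, 1)

# Chi-squared goodness-of-fit test after binning into 20 equal-width cells
# (the number of cells is an arbitrary choice for this sketch).
counts <- as.vector(table(cut(x, breaks = seq(0, 1, length.out = 21),
                              include.lowest = TRUE)))
chi <- chisq.test(counts)    # default null: equal cell probabilities

ks$p.value
chi$p.value
```

The Kolmogorov-Smirnov test uses the raw values, while the chi-squared test depends on how the values are binned.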

I came across one of your threads on the internet where you answer a similar 
question and thought I'd reach out to you.


Thanks in advance
Florian Huber






Re: [R] test for proportion or concordance

2017-08-03 Thread Bert Gunter
This list is about R programming, not statistics, although admittedly
there is a nonempty intersection. However, I think you would do better
posting this on a statistics list like stats.stackexchange.com.

-- Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Aug 3, 2017 at 7:19 AM, Adrian Johnson
 wrote:
> Hello group,
>
> My question is which test would be appropriate for the following problem.
>
> Experiment 'A' yielded 3200 observations, of which 431 are
> significant. Using the same method, another experiment 'B' on a
> different population yielded 2541 observations, of which 260 are
> significant.
>
> There are 180 observations in common between the significant
> observations of A and B
> (180 are common to the 431 and the 260).
>
> 251 observations are specific to A.
> 80 observations are specific to B.
>
> The question is whether these 180 common observations
> could be occurring by chance.
>
> What test would be appropriate for this scenario? If the total number of
> observations were the same in experiments A and B, I could use
> Cohen's kappa for concordance, or a chi-square test, etc.
> Since the totals differ between experiments A and B, I
> don't know what test would be appropriate. I appreciate your help.
>
> thanks
>


Re: [R] Test individual slope for each factor level in ANCOVA

2017-03-16 Thread li li
Hi John. Thanks much for your help. It is great to know this.
  Hanna


Re: [R] Test individual slope for each factor level in ANCOVA

2017-03-16 Thread Fox, John
Dear Hanna,

You can test the slope in each non-reference group as a linear hypothesis.
You didn’t make the data available for your example, so here’s an example
using the linearHypothesis() function in the car package with the Moore
data set in the same package:

- - - snip - - -

> library(car)
> mod <- lm(conformity ~ fscore*partner.status, data=Moore)
> summary(mod)

Call:
lm(formula = conformity ~ fscore * partner.status, data = Moore)

Residuals:
    Min      1Q  Median      3Q     Max
-7.5296 -2.5984 -0.4473  2.0994 12.4704

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)               20.79348    3.26273   6.373 1.27e-07 ***
fscore                    -0.15110    0.07171  -2.107  0.04127 *
partner.statuslow        -15.53408    4.40045  -3.530  0.00104 **
fscore:partner.statuslow   0.26110    0.09700   2.692  0.01024 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.562 on 41 degrees of freedom
Multiple R-squared:  0.2942,    Adjusted R-squared:  0.2426
F-statistic: 5.698 on 3 and 41 DF,  p-value: 0.002347

> linearHypothesis(mod, "fscore + fscore:partner.statuslow")
Linear hypothesis test

Hypothesis:
fscore  + fscore:partner.statuslow = 0

Model 1: restricted model
Model 2: conformity ~ fscore * partner.status

  Res.Df    RSS Df Sum of Sq      F  Pr(>F)
1     42 912.45
2     41 853.42  1    59.037 2.8363 0.09976 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

- - - snip - - -

In this case, there are just two levels for partner.status, but for a
multi-level factor you can simply perform more than one test.
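
For the multi-level case, the same idea can be sketched in base R. The example below is an illustration only: it uses the built-in `iris` data as a stand-in (three levels, `setosa` as reference), and the helper `slope_test()` is a hypothetical name introduced here. It tests, for each non-reference level, whether the baseline slope plus the interaction coefficient is zero.

```r
# Built-in iris data as a stand-in: Species has three levels,
# with setosa as the reference level.
mod <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)
b <- coef(mod)
V <- vcov(mod)

# Test H0: slope within a non-reference level = 0, i.e.
# (baseline slope) + (interaction coefficient) = 0.
slope_test <- function(level) {
  terms <- c("Petal.Length", paste0("Petal.Length:Species", level))
  est  <- sum(b[terms])
  se   <- sqrt(sum(V[terms, terms]))  # Var(b1 + b2) = sum of the 2 x 2 block
  tval <- est / se
  p    <- 2 * pt(abs(tval), df = df.residual(mod), lower.tail = FALSE)
  c(estimate = est, se = se, t = tval, p = p)
}

slopes <- t(sapply(c("versicolor", "virginica"), slope_test))
slopes

# With car installed, the same hypotheses should be testable via, e.g.,
# car::linearHypothesis(mod, "Petal.Length + Petal.Length:Speciesversicolor = 0")
```

The squared t statistic from this sketch corresponds to the F statistic of the single-restriction linear hypothesis; the slope for the reference level is simply the `Petal.Length` row of `summary(mod)`.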


I hope this helps,

 John

-
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox/




On 2017-03-15, 9:43 PM, "R-help on behalf of li li"
 wrote:

>Hi all,
>   Consider the data set where there are a continuous response variable, a
>continuous predictor "weeks" and a categorical variable "region" with five
>levels "a", "b", "c",
>"d", "e".
>  I fit the ANCOVA model as follows. Here the reference level is region
>"a"
>and there are 4 dummy variables. The interaction terms (in red below)
>represent the slope
>difference between each region and  the baseline region "a" and the
>corresponding p-value is for testing whether this slope difference is
>zero.
>Is there a way to directly test whether the slope corresponding to each
>individual factor level is 0 or not, instead of testing the slope
>difference from the baseline level?
>  Thanks very much.
>  Hanna
>
>
>
>
>
>
>> mod <- lm(response ~ weeks*region,data)
>> summary(mod)
>Call:
>lm(formula = response ~ weeks * region, data = data)
>
>Residuals:
>     Min       1Q   Median       3Q      Max
>-0.19228 -0.07433 -0.01283  0.04439  0.24544
>
>Coefficients:
>                Estimate Std. Error t value Pr(>|t|)
>(Intercept)    1.2105556  0.0954567  12.682  1.2e-14 ***
>weeks         -0.021      0.0147293  -1.448    0.156
>regionb       -0.0257778  0.1349962  -0.191    0.850
>regionc       -0.034      0.1349962  -0.255    0.800
>regiond       -0.075      0.1349962  -0.559    0.580
>regione       -0.148      0.1349962  -1.098    0.280
>weeks:regionb -0.0007222  0.0208304  -0.035    0.973
>weeks:regionc -0.0017778  0.0208304  -0.085    0.932
>weeks:regiond  0.003      0.0208304   0.144    0.886
>weeks:regione  0.0301667  0.0208304   1.448    0.156
>---
>Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>Residual standard error: 0.1082 on 35 degrees of freedom
>Multiple R-squared:  0.2678,    Adjusted R-squared:  0.07946
>F-statistic: 1.422 on 9 and 35 DF,  p-value: 0.2165
>

Re: [R] Test for Homoscedesticity in R Without BP Test

2016-04-04 Thread Deepak Singh
I have tried it and got the result.
Thank you, everyone.



Re: [R] Test for Homoscedesticity in R Without BP Test

2016-04-04 Thread Achim Zeileis

On Mon, 4 Apr 2016, varin sacha via R-help wrote:


> Hi Deepak,
>
> In econometrics there is another test that is used very often: the White test.
> The White test is based on comparing the estimated variances of the
> residuals when the model is estimated by OLS under the assumption of
> homoscedasticity with those obtained under the assumption of
> heteroscedasticity.


The White test is a special case of the Breusch-Pagan test using a 
particular specification of the auxiliary regressors: namely all 
regressors, their squares and their cross-products. As this specification 
makes only sense if all regressors are continuous, many implementations 
have problems if there are already dummy variables, interactions, etc. in 
the regressor matrix. This is also the reason why bptest() from "lmtest" 
uses a different specification by default. However, you can utilize the 
function to carry out the White test as illustrated in:


example("CigarettesB", package = "AER")

(Of course, the AER package needs to be installed first.)
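
To make the "all regressors, their squares and their cross-products" specification concrete, here is a base-R sketch of the studentized (Koenker) form of the statistic, n times the R-squared of the auxiliary regression. The model on the built-in `mtcars` data and the names `fit`, `aux` and `k` are assumptions for illustration and do not come from the thread. The commented `bptest()` call at the end is the form this is meant to correspond to; it has not been run here.

```r
# Stand-in model on the built-in mtcars data (not from the thread).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Auxiliary regression: squared residuals on all regressors, their
# squares and their cross-product (the White specification).
u2  <- residuals(fit)^2
aux <- lm(u2 ~ wt * hp + I(wt^2) + I(hp^2), data = mtcars)

# Studentized (Koenker) statistic: n * R^2 of the auxiliary regression,
# referred to a chi-squared distribution whose degrees of freedom equal
# the number of auxiliary regressors.
n    <- nobs(fit)
stat <- n * summary(aux)$r.squared
k    <- length(coef(aux)) - 1
pval <- pchisq(stat, df = k, lower.tail = FALSE)
c(statistic = stat, df = k, p.value = pval)

# With lmtest installed, this is meant to correspond to:
# lmtest::bptest(fit, ~ wt * hp + I(wt^2) + I(hp^2), data = mtcars)
```

As noted above, this specification only makes sense when all regressors are continuous.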


> The White test with R
>
> install.packages("bstats")
> library(bstats)
> white.test(LinearModel)


That package is no longer on CRAN as it took the code from bptest() 
without crediting its original authors and released it in a package that 
conflicted with the original license. Also, the implementation did not 
check for potential problems with dummy variables or interactions 
mentioned above.


So the bptest() implementation from "lmtest" is really recommended. Or 
alternatively ncvTest() from package "car".



> Hope this helps.
>
> Sacha







Re: [R] Test for Homoscedesticity in R Without BP Test

2016-04-04 Thread Achim Zeileis

On Mon, 4 Apr 2016, Deepak Singh wrote:


> Respected Sir,
> I am doing a project on multiple linear model fitting, and in that project I
> have to test homoscedasticity of the errors. I googled for this and
> found bptest, but in R version 3.2.4 bp test is not available.


The function is called bptest() and is implemented in package "lmtest" 
which is available for current versions of R, see

https://CRAN.R-project.org/package=lmtest

To install it, run:
install.packages("lmtest")

And then to load the package and try the function:
library("lmtest")
example("bptest")
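
Applied to one's own fitted model, the call might look like the sketch below. The model on the built-in `cars` data is only a stand-in, and the `requireNamespace()` guard is an addition so that the snippet also runs where lmtest is not installed.

```r
fit <- lm(dist ~ speed, data = cars)   # built-in data as a stand-in

bp <- if (requireNamespace("lmtest", quietly = TRUE)) {
  lmtest::bptest(fit)                  # Breusch-Pagan test with default settings
} else {
  NULL                                 # lmtest not installed
}
bp
```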


> So please suggest a test for homoscedasticity ASAP, as we have to submit
> our report on 7-04-2016.
>
> P.S.: I have plotted the residuals against the fitted values and the plot
> is more or less random.
>
> Thank you!



Re: [R] Test for Homoscedesticity in R Without BP Test

2016-04-04 Thread varin sacha via R-help
Hi Deepak,

In econometrics there is another test that is used very often: the White test.
The White test is based on comparing the estimated variances of the
residuals when the model is estimated by OLS under the assumption of
homoscedasticity with those obtained under the assumption of
heteroscedasticity.


The White test with R

install.packages("bstats")
library(bstats)
white.test(LinearModel)



Hope this helps.

Sacha







Re: [R] Test for Homoscedesticity in R Without BP Test

2016-04-04 Thread John C Frain
You might google "Breusch Pagan test r" and find that the test is
implemented in the lmtest package.


Re: [R] test hypothesis in R

2016-03-23 Thread David Winsemius

> On Mar 23, 2016, at 1:44 PM, ruipbarra...@sapo.pt wrote:
> 
> Hello,
> 
> Try
> 
> ?t.test
> t.test(mA, mB, alternative = "greater")
> 
> Hope this helps,
> 
> Rui Barradas
>  
> 
> Citando Eliza Botto :
> 
>> Dear All,
>> I want to test a hypothesis in R using Student's t-test (p-values).
>> The hypothesis is that model A produces less error than model B at
>> ten stations. Obviously, the null hypothesis (H0) is that the error
>> produced by model A is not lower than that of model B.

NOT "obviously". You only get to do one-sided tests when the scientific 
question would not allow the possibility of a departure to "the other side".

Two-sided tests are the norm in scientific literature, often to the 
experimenter's distress when they haven't done a thoughtful (non-optimistic) 
power analysis and their results are inconclusive as a result. Your hypothesis 
_should_ have been constructed _before_ you saw the data. That is if you want 
to be an ethical scientist.


>> The error magnitudes are
>> 
>> #model A
>>> dput(mA)
>> 
>> c(36.1956086452583, 34.9996207622861, 36.435733025221,  
>> 37.2003157636202, 36.1318687775115, 37.164132533536,  
>> 35.2028759357069, 36.7719835944373, 38.3861425339751,  
>> 37.4174132119744)
>> #model B
>>> dput(mB)
>> 
>> c(39.7655211768704, 40.1730916643841, 39.3699055738618,  
>> 39.401619831763, 41.1218634441457, 39.1968630742826,  
>> 40.5265825061639, 40.4674956975404, 40.5954427072364,  
>> 41.4875529130543)

Those are not models. They are just vectors of numbers. And they seem unlikely 
to be residual errors of a linear model since they are not centered on zero. I 
doubt there is enough in your presentation for a sensible comment on the proper 
analysis.

-- 

David.

>> 
>> Now can I test my hypothesis in R?
>> Thankyou very much in Advance,
>> Eliza



David Winsemius
Alameda, CA, USA



Re: [R] test hypothesis in R

2016-03-23 Thread ruipbarradas
Sorry, but in your original post you said that the "Null Hypothesis (H0)
is that the error produces by model A is not lower than model B".
If the alternative is now that model A produces less error, change to
alternative="less". The relevant part of the help page ?t.test is

alternative = "greater" is the alternative that x has a larger mean than y.
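
A small sketch with the two vectors posted in this thread shows how the direction of `alternative` plays out. The paired variant assumes that the ten values are matched station by station, which the thread does not confirm.

```r
mA <- c(36.1956086452583, 34.9996207622861, 36.435733025221,
        37.2003157636202, 36.1318687775115, 37.164132533536,
        35.2028759357069, 36.7719835944373, 38.3861425339751,
        37.4174132119744)
mB <- c(39.7655211768704, 40.1730916643841, 39.3699055738618,
        39.401619831763, 41.1218634441457, 39.1968630742826,
        40.5265825061639, 40.4674956975404, 40.5954427072364,
        41.4875529130543)

# H1: mean(mA) < mean(mB), i.e. model A has the smaller error.
less    <- t.test(mA, mB, alternative = "less")
# H1: mean(mA) > mean(mB), the opposite direction.
greater <- t.test(mA, mB, alternative = "greater")
# Two-sided test, the default.
two     <- t.test(mA, mB)
# Paired version, ASSUMING element i of each vector is the same station.
paired  <- t.test(mA, mB, alternative = "less", paired = TRUE)

c(less = less$p.value, greater = greater$p.value,
  two.sided = two$p.value, paired = paired$p.value)
```

The two one-sided p-values add up to one, so choosing the wrong direction turns a very small p-value into one close to 1.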

Rui Barradas
 


Re: [R] test hypothesis in R

2016-03-23 Thread Eliza Botto
Thanks, Rui.
Just one point, though:
should it be alternative="greater" or "less"? The alternative hypothesis is
that model A produced less error.
Regards,
Eliza


Re: [R] test hypothesis in R

2016-03-23 Thread ruipbarradas
Hello,

Try

?t.test
t.test(mA, mB, alternative = "greater")

Hope this helps,

Rui Barradas
 

Citando Eliza Botto :

> Dear All,
> I want to test a hypothesis in R using Student's t-test (p-values).
> The hypothesis is that model A produces less error than model B at  
> ten stations. The null hypothesis (H0) is that the error  
> produced by model A is not lower than that of model B.
> The error magnitudes are
>
> #model A
>> dput(mA)
>
> c(36.1956086452583, 34.9996207622861, 36.435733025221,  
> 37.2003157636202, 36.1318687775115, 37.164132533536,  
> 35.2028759357069, 36.7719835944373, 38.3861425339751,  
> 37.4174132119744)
> #model B
>> dput(mB)
>
> c(39.7655211768704, 40.1730916643841, 39.3699055738618,  
> 39.401619831763, 41.1218634441457, 39.1968630742826,  
> 40.5265825061639, 40.4674956975404, 40.5954427072364,  
> 41.4875529130543)
>
> Now can I test my hypothesis in R?
> Thankyou very much in Advance,
> Eliza

Re: [R] test if a url exists

2014-06-29 Thread Duncan Murdoch
On 29/06/2014, 7:12 AM, Hui Du wrote:
> Hi all,
>
> I need to test if a url exists. I used url.exists() in RCurl package
>
> library(RCurl)
>
> however the test result is kind of weird. For example,
>
> url.exists("http://www.amazon.com")
> [1] FALSE
>
> although www.amazon.com is a valid url. Does anybody
> know how to use that function correctly or the other way to test url
> existence?

You can use the .header = TRUE option to that call to see the error 405
that it gives.

Duncan Murdoch



Re: [R] test the return from grep or agrep

2014-03-02 Thread Prof Brian Ripley

On 01/03/2014 23:32, Hui Du wrote:

Hi All,

My sample code looks like

options(stringsAsFactors = FALSE);
clean = function(x)
{
 loc = agrep("ABC", x$name);
 x[loc,]$new_name <- "NEW";
 x;
}

name = c("12", "dad", "dfd");
y = data.frame(name = as.character(name), idx = 1:3);
y$new_name = y$name;

z <- clean(y)

The snippet does not work because I forgot to test the return value of agrep. 
If no pattern is found, it returns 0 and the following x[loc, ]$new_name does 
not like. I know how to fix that part. However, my code has many places like 
that, say over 100 calls for agrep or grep for different patterns and 
substitution. Is there any smart way to fix them all rather than line by line?


That is not true: it returns integer(0).  (If it returned 0 it would work.)

For grep() I would recommend using grepl() instead. Otherwise

if(length(loc)) x[loc,]$new_name <- "NEW"

or

x[loc,]$new_name <- rep_len("NEW", length(loc))


Your code is full of pointless empty statements (between ; and NL): R is 
not C and ; is a separator, not a terminator.
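
A small sketch of both suggestions, with made-up data matching the example in the question. Indexing the column directly (y$new_name[loc]) rather than the row subset is a choice made here for illustration, not part of the reply above; with a plain vector on the left, a zero-length index is simply a no-op.

```r
y <- data.frame(name = c("12", "dad", "dfd"), idx = 1:3,
                stringsAsFactors = FALSE)
y$new_name <- y$name

# No approximate match for "ABC": agrep() returns integer(0), not 0
loc <- agrep("ABC", y$name)
length(loc)

# Guarded assignment: only runs when something matched
if (length(loc)) y$new_name[loc] <- "NEW"

# grepl() returns a logical vector as long as its input, so an
# all-FALSE result selects nothing and the assignment does nothing
hit <- grepl("dad", y$name, fixed = TRUE)
y$new_name[hit] <- "NEW"
y
```

With many such calls, switching to the logical form avoids a separate length check at each site.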




Many thanks.

HXD




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Test to determine if there is a difference between two means

2013-12-24 Thread Bert Gunter
Inline below.

 Cheers,

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch




On Tue, Dec 24, 2013 at 7:38 AM, wesley bell wesleybel...@yahoo.com wrote:
> Hi,
> I have a data set where there are 20 experiments which each ran for 10
> minutes. In each experiment an insect had a choice to spend time in one of
> two chambers. Each experiment therefore has number of seconds spent in each
> chamber. I want to know whether there is a difference in the mean time spent
> in each chamber.

Yes, there is. Always.

> I was going to do a t-test but was advised that there was a better way,
> something about introducing random numbers? I was hoping someone could help?

This list is about R, not statistics, although they certainly overlap.
I suggest you post on stats.stackexchange.com instead for statistics
help. Better yet, you might do well to talk with a local expert about
statistical issues, as you are obviously weak here.

> Thanks
> Wes


Re: [R] Test ADF differences in R and Eviews

2013-12-05 Thread David Winsemius

On Dec 5, 2013, at 3:18 PM, nooldor wrote:

> Hi,
>
> In attachment you can find source data on which I run adf.test() and
> print-screen with results in R and Eviews.
>
> Results are very different. Did I missed something?

Yes. You missed the list of acceptable file types for r-help.

-- 
David Winsemius
Alameda, CA, USA



Re: [R] test wilcoxon sur R help!

2013-10-24 Thread arun
Hi,
Try:
fun1 <- function(dat){
mat1 <- combn(colnames(dat1),2)
 res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]]; 
wilcox.test(x1[,1],x1[,2])$p.value})
names(res) <- apply(mat1,2,paste,collapse="_")
res
}

set.seed(432)
dat1 <- as.data.frame(matrix(sample(18*10,18*10,replace=FALSE),ncol=18))

  fun1(dat1) #gives the p-value for each pair of columns




Hi, 

I want to make a wilcoxon test, i have 18 columns each column 
corresponds to a different sample and i want to compare one to each 
other with a wilcoxon test in one step this is possible ? or do i 
compare two by tow? 

Does it exist a code for automation this test? like this i dont have to type 
the code for each couple. 

thanks! 
denisse



Re: [R] test wilcoxon sur R help!

2013-10-24 Thread Rui Barradas

Hello,

There's a bug in your function, it should be 'dat', not 'dat1'. In the 
line marked, below.


fun1 <- function(dat){
	mat1 <- combn(colnames(dat),2)  # Here, 'dat' not 'dat1'
	res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]]; 
wilcox.test(x1[,1],x1[,2])$p.value})
	names(res) <- apply(mat1,2,paste,collapse="_")
res
}


Hope this helps,

Rui Barradas

Em 24-10-2013 20:16, arun escreveu:

Hi,
Try:
fun1 <- function(dat){
mat1 <- combn(colnames(dat1),2)
  res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]]; 
wilcox.test(x1[,1],x1[,2])$p.value})
names(res) <- apply(mat1,2,paste,collapse="_")
res
}

set.seed(432)
dat1 <- as.data.frame(matrix(sample(18*10,18*10,replace=FALSE),ncol=18))

   fun1(dat1) #gives the p-value for each pair of columns




Hi,

I want to make a wilcoxon test, i have 18 columns each column
corresponds to a different sample and i want to compare one to each
other with a wilcoxon test in one step this is possible ? or do i
compare two by tow?

Does it exist a code for automation this test? like this i dont have to type 
the code for each couple.

thanks!
denisse



Re: [R] test wilcoxon sur R help!

2013-10-24 Thread vikram ranga
Hi,
Check out this function:-
pairwise.wilcox.test {package=stats}.

example(pairwise.wilcox.test)
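
A short sketch of how that function could be applied to the 18-column layout from this thread. It reuses the simulated data from the earlier reply; reshaping with base R's stack() instead of an add-on package is a choice made here for illustration.

```r
set.seed(432)
dat1 <- as.data.frame(matrix(sample(18 * 10, 18 * 10, replace = FALSE), ncol = 18))

# stack() turns the 18 columns into one value column plus a group factor
long <- stack(dat1)

# All pairwise Wilcoxon rank-sum tests, with Holm adjustment
res <- pairwise.wilcox.test(long$values, long$ind, p.adjust.method = "holm")

# Lower-triangular matrix of adjusted p-values, one cell per pair of columns
dim(res$p.value)
```
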


On Fri, Oct 25, 2013 at 2:15 AM, Rui Barradas ruipbarra...@sapo.pt wrote:
 Hello,

 There's a bug in your function, it should be 'dat', not 'dat1'. In the line
 marked, below.

 fun1 <- function(dat){
 mat1 <- combn(colnames(dat),2)  # Here, 'dat' not 'dat1'

 res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]];
 wilcox.test(x1[,1],x1[,2])$p.value})
 names(res) <- apply(mat1,2,paste,collapse="_")
 res
 }


 Hope this helps,

 Rui Barradas

 Em 24-10-2013 20:16, arun escreveu:

 Hi,
 Try:
 fun1 <- function(dat){
 mat1 <- combn(colnames(dat1),2)
   res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]];
 wilcox.test(x1[,1],x1[,2])$p.value})
 names(res) <- apply(mat1,2,paste,collapse="_")
 res
 }

 set.seed(432)
 dat1 <- as.data.frame(matrix(sample(18*10,18*10,replace=FALSE),ncol=18))

fun1(dat1) #gives the p-value for each pair of columns




 Hi,

 I want to make a wilcoxon test, i have 18 columns each column
 corresponds to a different sample and i want to compare one to each
 other with a wilcoxon test in one step this is possible ? or do i
 compare two by tow?

 Does it exist a code for automation this test? like this i dont have to
 type the code for each couple.

 thanks!
 denisse



Re: [R] test wilcoxon sur R help!

2013-10-24 Thread arun
It looks much better than mine.


with p value adjustment:
p.adjust(fun1(dat1), method = "holm", n = 153)
#

dat1$id <- 1:10
library(reshape2)
dat2 <- melt(dat1,id.var="id")
with(dat2,pairwise.wilcox.test(value,variable))
 with(dat2,pairwise.wilcox.test(value,variable,p.adj="none")) 
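
For reference, a self-contained base-R sketch of the manual route: one Wilcoxon test per pair of columns, followed by a Holm adjustment. The helper name pairwise_p is hypothetical; 153 is simply choose(18, 2), which p.adjust() would also infer from the length of its input.

```r
set.seed(432)
dat1 <- as.data.frame(matrix(sample(18 * 10, 18 * 10, replace = FALSE), ncol = 18))

# Hypothetical helper: raw p-value for every pair of columns
pairwise_p <- function(dat) {
  pairs <- combn(colnames(dat), 2)
  res <- apply(pairs, 2, function(p) wilcox.test(dat[[p[1]]], dat[[p[2]]])$p.value)
  names(res) <- apply(pairs, 2, paste, collapse = "_")
  res
}

p_raw  <- pairwise_p(dat1)
p_holm <- p.adjust(p_raw, method = "holm")

length(p_raw)     # one entry per pair, i.e. choose(18, 2)
head(p_holm)
```
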


A.K.




On Friday, October 25, 2013 12:05 AM, vikram ranga babuaw...@gmail.com wrote:
Hi,
Check out this function:-
pairwise.wilcox.test {package=stats}.

example(pairwise.wilcox.test)


On Fri, Oct 25, 2013 at 2:15 AM, Rui Barradas ruipbarra...@sapo.pt wrote:
 Hello,

 There's a bug in your function, it should be 'dat', not 'dat1'. In the line
 marked, below.

 fun1 <- function(dat){
         mat1 <- combn(colnames(dat),2)  # Here, 'dat' not 'dat1'

         res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]];
 wilcox.test(x1[,1],x1[,2])$p.value})
         names(res) <- apply(mat1,2,paste,collapse="_")
         res
 }


 Hope this helps,

 Rui Barradas

 Em 24-10-2013 20:16, arun escreveu:

 Hi,
 Try:
 fun1 <- function(dat){
 mat1 <- combn(colnames(dat1),2)
   res <- sapply(seq_len(ncol(mat1)),function(i) {x1 <- dat[,mat1[,i]];
 wilcox.test(x1[,1],x1[,2])$p.value})
 names(res) <- apply(mat1,2,paste,collapse="_")
 res
 }

 set.seed(432)
 dat1 <- as.data.frame(matrix(sample(18*10,18*10,replace=FALSE),ncol=18))

    fun1(dat1) #gives the p-value for each pair of columns




 Hi,

 I want to make a wilcoxon test, i have 18 columns each column
 corresponds to a different sample and i want to compare one to each
 other with a wilcoxon test in one step this is possible ? or do i
 compare two by tow?

 Does it exist a code for automation this test? like this i dont have to
 type the code for each couple.

 thanks!
 denisse



Re: [R] Test if 2 samples differ if they have autocorrelation

2013-07-18 Thread Rolf Turner



I imagine that most readers of this list will put your question in the
"too hard" basket.

That being so, here is my inexpert take on the question.

The issue is to estimate the uncertainty in the estimated difference of
the means. This uncertainty depends on the nature of the serial dependence
of the series.

Therefore in order to get anywhere you need to *model* this dependence.

Different models could yield very different values for the variance of
the estimated difference of the means.

If the series are observed at the same times I would suggest taking the
pointwise difference of the two series: D_t = X_t - Y_t, say.

Fit the best arima model that you can to D_t. Then the standard error of
what is incorrectly labelled "intercept" (it is actually the estimate of
the series *mean*) is the appropriate estimate of the uncertainty. The
ratio of the "intercept" value to its standard error is the test statistic
you are looking for.
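
A rough sketch of that recipe on simulated data. The AR(1) order and the simulated series are assumptions made purely for illustration; in practice the model for D_t would be chosen from the data, and the normal reference distribution used for the p-value is likewise an approximation.

```r
set.seed(1)
n <- 500
x <- arima.sim(model = list(ar = 0.5), n = n) + 0.3   # simulated series X_t
y <- arima.sim(model = list(ar = 0.5), n = n)         # simulated series Y_t

d <- x - y                                            # D_t = X_t - Y_t

# AR(1) with a mean term; arima() reports the mean under the name "intercept"
fit <- arima(d, order = c(1, 0, 0))

est <- unname(coef(fit)["intercept"])
se  <- unname(sqrt(diag(vcov(fit)))["intercept"])

z <- est / se                      # test statistic: mean estimate / standard error
p_two_sided <- 2 * pnorm(-abs(z))  # approximate two-sided p-value
c(estimate = est, se = se, z = z, p = p_two_sided)
```
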

If the series are *not* observed at the same times but can be assumed to be
independent then model *each* series as well as you can (different models
for each series) and obtain the standard error of the "intercept" for each
series. Your test statistic is then the difference of the "intercept"
estimates divided by sqrt(se_X^2 + se_Y^2) in what I hope is an obvious
notation.
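
The independent-series case can be sketched the same way. The two model orders, the series lengths, and the helper name mean_and_se are all hypothetical choices for the example.

```r
set.seed(2)
x <- arima.sim(model = list(ar = 0.6), n = 400) + 0.2   # simulated series X
y <- arima.sim(model = list(ma = 0.4), n = 350)         # simulated series Y

fit_x <- arima(x, order = c(1, 0, 0))   # a different model for each series
fit_y <- arima(y, order = c(0, 0, 1))

# Hypothetical helper: estimated mean ("intercept") and its standard error
mean_and_se <- function(fit) {
  c(est = unname(coef(fit)["intercept"]),
    se  = unname(sqrt(diag(vcov(fit)))["intercept"]))
}

sx <- mean_and_se(fit_x)
sy <- mean_and_se(fit_y)

# Difference of the mean estimates over sqrt(se_X^2 + se_Y^2)
z <- unname((sx["est"] - sy["est"]) / sqrt(sx["se"]^2 + sy["se"]^2))
z
```
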

If the series are not observed at the same times and cannot be assumed to be
independent then you probably haven't got sufficient information to answer
the question that you wish to answer.

I hope that there is some value in the foregoing.

cheers,

Rolf Turner

On 18/07/13 21:50, Eric Jaeger wrote:

Dear all

I have one question that I struggle to find an answer:

Let`s assume I have 2 timeseries of daily PnL data over 2 years coming from 2 
different trading strategies. I want to find out if strategy A is better than 
strategy B. The problem is that the two series have serial correlations, hence 
I cannot just do a simple t-test.

I tried something like this:

1. create cumulative time series of PnL_A = C_A and of PnL_B = C_B

2. take the difference of both: C_A - C_B = DiffPnL (to see how the difference 
evolves over time)

3. do a regression: DiffPnL = beta * time + error (I thought if beta is 
significantly different from 0 then the two time series are different)

4. estimate beta not with OLS, but with the Newey-West method (HAC estimator); 
this corrects the statistical tests and standard errors for beta for 
heteroskedasticity and autocorrelation

BUT: I read something that the tests are biased when the time series are unit 
root non-stationary (which is due to the fact that I take cumulative time 
series)

  


I am lost! This should be fairly simple: test if two samples differ if they 
have autocorrelation? Probably my approach above is completely wrong…




Re: [R] Test for column equality across matrices

2013-07-14 Thread Thiem Alrik
Dear William,

thanks a lot. I've found another nice alternative:

A <- matrix(t(expand.grid(c(1,2,3,4,5), 15, 16)), nrow = 3)
B <- combn(16, 3)

B.n <- B[, -(which(duplicated(t(cbind(A, B)))) - ncol(A))]

Best wishes,
Alrik


-Ursprüngliche Nachricht-
Von: arun [mailto:smartpink...@yahoo.com] 
Gesendet: Samstag, 13. Juli 2013 19:57
An: William Dunlap
Cc: mailman, r-help; Thiem Alrik
Betreff: Re: [R] Test for column equality across matrices

I tried it on a slightly bigger dataset:
A1 <- matrix(t(expand.grid(1:90, 15, 16)), nrow = 3)
B1 <- combn(90, 3)
which(is.element(columnsOf(B1), columnsOf(A1)))
# [1]  1067  4895  8636 12291 15861 19347 22750 26071 29311 32471 35552 38555
#[13] 41481


which(apply(t(B1),1,paste,collapse="")%in%apply(t(A1),1,paste,collapse=""))
# [1]  1067  4895  8636 12291 15861 19347 22750 26071 29311 32471 35552 38555
#[13] 41481 44331


B1[,44331]
#[1] 14 15 16


which(apply(t(A1),1,paste,collapse="")=="141516")
#[1] 14

B1New <- B1[,!apply(t(B1),1,paste,collapse="")%in%apply(t(A1),1,paste,collapse="")]
newB <- B1[ , !is.element(columnsOf(B1), columnsOf(A1))]
 identical(B1New,newB)
#[1] FALSE

 is.element(B1[,44331],A1[,14])
#[1] TRUE TRUE TRUE


 B1Sp <- columnsOf(B1)
B1Sp[[44331]]
#[1] 14 15 16
 A1Sp <- columnsOf(A1)
 A1Sp[[14]]
#[1] 14 15 16
 is.element(B1Sp[[44331]],A1Sp[[14]])
#[1] TRUE TRUE TRUE


A.K.



- Original Message -
From: William Dunlap wdun...@tibco.com
To: Thiem Alrik th...@sipo.gess.ethz.ch; mailman, r-help 
r-help@r-project.org
Cc: 
Sent: Saturday, July 13, 2013 1:30 PM
Subject: Re: [R] Test for column equality across matrices

Try
   columnsOf <- function(mat) split(mat, col(mat))
   newB <- B[ , !is.element(columnsOf(B), columnsOf(A))]

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of Thiem Alrik
 Sent: Saturday, July 13, 2013 6:45 AM
 To: mailman, r-help
 Subject: [R] Test for column equality across matrices
 
 Dear list,
 
 I have two matrices
 
 A <- matrix(t(expand.grid(c(1,2,3,4,5), 15, 16)), nrow = 3)
 B <- combn(16, 3)
 
 Now I would like to exclude all columns from the 560 columns in B which are 
 identical to
 any 1 of the 6 columns in A. How could I do this?
 
 Many thanks and best wishes,
 
 Alrik
 


Re: [R] Test for column equality across matrices

2013-07-14 Thread William Dunlap
It looks like match() (and relatives like %in% and is.element) act a bit
unpredictably on lists when the list elements are vectors of numbers of
different types.  If you match integers to integers or doubles to doubles
it works as expected, but when the types don't match the results vary.
I would expect the following to give either c(1,2) or c(NA,NA) but not
c(1,NA):

> match( list( c(13L,15L,16L), c(14L,15L,16L)), list( c(13.,15.,16.), c(14.,15.,16.) ))
[1]  1 NA

It works when the list elements have the same type

> match( list( c(13L,15L,16L), c(14L,15L,16L)), list( c(13L,15L,16L), c(14L,15L,16L) ))
[1] 1 2
> match( list( c(13.,15.,16.), c(14.,15.,16.)), list( c(13.,15.,16.), c(14.,15.,16.) ))
[1] 1 2
> match( list( c(13.,15.,16.), c(14L,15L,16L)), list( c(13.,15.,16.), c(14L,15L,16L) ))
[1] 1 2

So A and B should be coerced to have a common type ('storage.mode') before
comparing them.
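
A minimal sketch of that coercion applied to the matrices from the original question. Copying B's storage mode onto A is one way to get a common type; converting both to double would serve equally well.

```r
A <- matrix(t(expand.grid(c(1, 2, 3, 4, 5), 15, 16)), nrow = 3)
B <- combn(16, 3)

# The two matrices may differ in how their numbers are stored
c(A = storage.mode(A), B = storage.mode(B))

# Give A the same storage mode as B before comparing columns
storage.mode(A) <- storage.mode(B)

columnsOf <- function(mat) split(mat, col(mat))
newB <- B[, !is.element(columnsOf(B), columnsOf(A))]
dim(newB)
```
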

By the way, the discrepancy might happen because match() applied to lists might
be implemented by calling deparse on each element of each list and then using
the character method of match.  For sequential integers deparse uses colon
notation; e.g., c(14L,15L,16L) becomes the string "14:16".  But usually deparse
puts an 'L' after integers so they would never match with a double of the same
value.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: arun [mailto:smartpink...@yahoo.com]
 Sent: Saturday, July 13, 2013 10:57 AM
 To: William Dunlap
 Cc: R help; Thiem Alrik
 Subject: Re: [R] Test for column equality across matrices
 
 I tried it on a slightly bigger dataset:
 A1 <- matrix(t(expand.grid(1:90, 15, 16)), nrow = 3)
 B1 <- combn(90, 3)
 which(is.element(columnsOf(B1), columnsOf(A1)))
 # [1]  1067  4895  8636 12291 15861 19347 22750 26071 29311 32471 35552 38555
 #[13] 41481
 
 
 which(apply(t(B1),1,paste,collapse="")%in%apply(t(A1),1,paste,collapse=""))
 # [1]  1067  4895  8636 12291 15861 19347 22750 26071 29311 32471 35552 38555
 #[13] 41481 44331
 
 
 B1[,44331]
 #[1] 14 15 16
 
 
 which(apply(t(A1),1,paste,collapse="")=="141516")
 #[1] 14
 
 B1New <- B1[,!apply(t(B1),1,paste,collapse="")%in%apply(t(A1),1,paste,collapse="")]
 newB <- B1[ , !is.element(columnsOf(B1), columnsOf(A1))]
  identical(B1New,newB)
 #[1] FALSE
 
  is.element(B1[,44331],A1[,14])
 #[1] TRUE TRUE TRUE
 
 
  B1Sp <- columnsOf(B1)
 B1Sp[[44331]]
 #[1] 14 15 16
  A1Sp <- columnsOf(A1)
  A1Sp[[14]]
 #[1] 14 15 16
  is.element(B1Sp[[44331]],A1Sp[[14]])
 #[1] TRUE TRUE TRUE
 
 
 A.K.
 
 
 
 - Original Message -
 From: William Dunlap wdun...@tibco.com
 To: Thiem Alrik th...@sipo.gess.ethz.ch; mailman, r-help 
 r-help@r-project.org
 Cc:
 Sent: Saturday, July 13, 2013 1:30 PM
 Subject: Re: [R] Test for column equality across matrices
 
 Try
    columnsOf <- function(mat) split(mat, col(mat))
    newB <- B[ , !is.element(columnsOf(B), columnsOf(A))]
 
 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com
 
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
  Behalf
  Of Thiem Alrik
  Sent: Saturday, July 13, 2013 6:45 AM
  To: mailman, r-help
  Subject: [R] Test for column equality across matrices
 
  Dear list,
 
  I have two matrices
 
  A <- matrix(t(expand.grid(c(1,2,3,4,5), 15, 16)), nrow = 3)
  B <- combn(16, 3)
 
  Now I would like to exclude all columns from the 560 columns in B which are 
  identical
 to
  any 1 of the 6 columns in A. How could I do this?
 
  Many thanks and best wishes,
 
  Alrik
 


Re: [R] Test for column equality across matrices

2013-07-13 Thread William Dunlap
Try
   columnsOf <- function(mat) split(mat, col(mat))
   newB <- B[ , !is.element(columnsOf(B), columnsOf(A))]

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of Thiem Alrik
 Sent: Saturday, July 13, 2013 6:45 AM
 To: mailman, r-help
 Subject: [R] Test for column equality across matrices
 
 Dear list,
 
 I have two matrices
 
 A <- matrix(t(expand.grid(c(1,2,3,4,5), 15, 16)), nrow = 3)
 B <- combn(16, 3)
 
 Now I would like to exclude all columns from the 560 columns in B which are 
 identical to
 any 1 of the 6 columns in A. How could I do this?
 
 Many thanks and best wishes,
 
 Alrik
 


Re: [R] Test for column equality across matrices

2013-07-13 Thread arun
I tried it on a slightly bigger dataset:
A1 <- matrix(t(expand.grid(1:90, 15, 16)), nrow = 3)
B1 <- combn(90, 3)
which(is.element(columnsOf(B1), columnsOf(A1)))
# [1]  1067  4895  8636 12291 15861 19347 22750 26071 29311 32471 35552 38555
#[13] 41481


which(apply(t(B1),1,paste,collapse="")%in%apply(t(A1),1,paste,collapse=""))
# [1]  1067  4895  8636 12291 15861 19347 22750 26071 29311 32471 35552 38555
#[13] 41481 44331


B1[,44331]
#[1] 14 15 16


which(apply(t(A1),1,paste,collapse="")=="141516")
#[1] 14

B1New <- B1[,!apply(t(B1),1,paste,collapse="")%in%apply(t(A1),1,paste,collapse="")]
newB <- B1[ , !is.element(columnsOf(B1), columnsOf(A1))]
 identical(B1New,newB)
#[1] FALSE

 is.element(B1[,44331],A1[,14])
#[1] TRUE TRUE TRUE


 B1Sp <- columnsOf(B1)
B1Sp[[44331]]
#[1] 14 15 16
 A1Sp <- columnsOf(A1)
 A1Sp[[14]]
#[1] 14 15 16
 is.element(B1Sp[[44331]],A1Sp[[14]])
#[1] TRUE TRUE TRUE


A.K.



- Original Message -
From: William Dunlap wdun...@tibco.com
To: Thiem Alrik th...@sipo.gess.ethz.ch; mailman, r-help 
r-help@r-project.org
Cc: 
Sent: Saturday, July 13, 2013 1:30 PM
Subject: Re: [R] Test for column equality across matrices

Try
   columnsOf <- function(mat) split(mat, col(mat))
   newB <- B[ , !is.element(columnsOf(B), columnsOf(A))]

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of Thiem Alrik
 Sent: Saturday, July 13, 2013 6:45 AM
 To: mailman, r-help
 Subject: [R] Test for column equality across matrices
 
 Dear list,
 
 I have two matrices
 
 A <- matrix(t(expand.grid(c(1,2,3,4,5), 15, 16)), nrow = 3)
 B <- combn(16, 3)
 
 Now I would like to exclude all columns from the 560 columns in B which are 
 identical to
 any 1 of the 6 columns in A. How could I do this?
 
 Many thanks and best wishes,
 
 Alrik
 


Re: [R] Test for column equality across matrices

2013-07-13 Thread arun
Hi,
One way would be:
 which(apply(t(B),1,paste,collapse="")%in%apply(t(A),1,paste,collapse=""))
#[1] 105 196 274 340 395
B[,105]
#[1]  1 15 16
 B[,196]
#[1]  2 15 16
 B1 <- B[,!apply(t(B),1,paste,collapse="")%in%apply(t(A),1,paste,collapse="")]
 dim(B1)
#[1]   3 555
 dim(B)
#[1]   3 560

#or
B2 <- B[,is.na(match(interaction(as.data.frame(t(B))),interaction(as.data.frame(t(A)))))]
 identical(B1,B2)
#[1] TRUE
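
One caveat with pasting columns into keys, sketched below: with an empty separator, two different columns can collapse to the same string. Using a separator that cannot appear inside a number avoids this. The helper name colKey and the choice of "_" are assumptions for the example.

```r
A <- matrix(t(expand.grid(c(1, 2, 3, 4, 5), 15, 16)), nrow = 3)
B <- combn(16, 3)

# With an empty separator, different columns can produce the same key
paste(c(1, 15, 16), collapse = "") == paste(c(11, 5, 16), collapse = "")

# Hypothetical helper: one unambiguous key per column
colKey <- function(m) apply(m, 2, paste, collapse = "_")

B1 <- B[, !(colKey(B) %in% colKey(A)), drop = FALSE]
dim(B1)
```
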


A.K.





- Original Message -
From: Thiem Alrik th...@sipo.gess.ethz.ch
To: mailman, r-help r-help@r-project.org
Cc: 
Sent: Saturday, July 13, 2013 9:45 AM
Subject: [R] Test for column equality across matrices

Dear list,

I have two matrices

A - matrix(t(expand.grid(c(1,2,3,4,5), 15, 16)), nrow = 3)
B - combn(16, 3)

Now I would like to exclude all columns from the 560 columns in B which are 
identical to any 1 of the 6 columns in A. How could I do this?

Many thanks and best wishes,

Alrik



Re: [R] Test of Parallel Regression Assumption in R

2013-03-12 Thread Rune Haubo
Dear Heather,

You can make this test using the ordinal package. Here the function
clm fits cumulative link models where the ordinal logistic regression
model is a special case (using the logit link).

Let me illustrate how to test the parallel regression assumption for a
particular variable using clm in the ordinal package. I am using the
wine dataset from the same package, I fit a model with two explanatory
variables; temp and contact, and I test the parallel regression
assumption for the contact variable in a likelihood ratio test:

> library(ordinal)
Loading required package: MASS
Loading required package: ucminf
Loading required package: Matrix
Loading required package: lattice
> head(wine)
  response rating temp contact bottle judge
1       36      2 cold      no      1     1
2       48      3 cold      no      2     1
3       47      3 cold     yes      3     1
4       67      4 cold     yes      4     1
5       77      4 warm      no      5     1
6       60      4 warm      no      6     1
> fm1 <- clm(rating ~ temp + contact, data=wine)
> fm2 <- clm(rating ~ temp, nominal=~ contact, data=wine)
> anova(fm1, fm2)
Likelihood ratio tests of cumulative link models:

    formula:                nominal: link: threshold:
fm1 rating ~ temp + contact ~1       logit flexible
fm2 rating ~ temp           ~contact logit flexible

    no.par    AIC  logLik LR.stat df Pr(>Chisq)
fm1      6 184.98 -86.492
fm2      9 190.42 -86.209  0.5667  3      0.904

The idea is to fit the model under the null hypothesis (parallel
effects - fm1) and under the alternative hypothesis (non-parallel
effects for contact - fm2) and compare these models with anova() which
performs the LR test. From the high p-value we see that the null
cannot be rejected and there is no evidence of non-parallel slopes in
this case. For additional information, I suggest that you take a look
at the following package vignette
(http://cran.r-project.org/web/packages/ordinal/vignettes/clm_tutorial.pdf)
where these kind of tests are more thoroughly described starting page
6.
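
As a check on the arithmetic behind that table, the likelihood ratio statistic and its p-value can be recomputed by hand from the printed values. This sketch uses only base R and the rounded log-likelihoods shown above, so the last digits may differ slightly from what anova() reports.

```r
loglik_null <- -86.492   # fm1: parallel effects, 6 parameters
loglik_alt  <- -86.209   # fm2: non-parallel effects for contact, 9 parameters

lr_stat <- 2 * (loglik_alt - loglik_null)   # twice the gain in log-likelihood
df      <- 9 - 6                            # extra parameters in fm2
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)

c(LR.stat = lr_stat, df = df, p = p_value)
```
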

I think you can also make similar tests with the VGAM package, but I
am not as well versed in that package.

Hope this helps,
Rune

Rune Haubo Bojesen Christensen
Postdoc
DTU Compute - Section for Statistics
---
Technical University of Denmark
Department of Applied Mathematics and Computer Science
Richard Petersens Plads
Building 324, Room 220
2800 Lyngby
Direct +45 45253363
Mobile +45 30264554
http://www.imm.dtu.dk


On 11 March 2013 22:52, Nicole Ford nicole.f...@me.com wrote:
 here's some code as an example  hope it helps!

 mod <- polr(vote~age+demsat+eusup+lrself+male+retnat+union+urban, data=dat)
 summary(mod)


 mod <- polr(vote~age+demsat+eusup+lrself+male+retnat+union+urban, data=dat)
 levs <- levels(dat$vote)
 tmpdat <- list()
 for(i in 1:(nlevels(dat$vote)-1)){
 tmpdat[[i]] <- dat
 tmpdat[[i]]$z - as.numeric(as.numeric(tmpdat[[1]]$vote) = levs[i])
 }
 form <- as.formula(z~age+demsat+eusup+lrself+male+retnat+union+urban)
 mods <- lapply(tmpdat, function(x)glm(form, data=x, family=binomial))
 probs <- sapply(mods, predict, type="response")
 p.logits <- cbind(probs[,2], t(apply(probs, 1, diff)), 1-probs[,ncol(probs)])
 p.ologit <- predict(mod, type='probs')
 n <- nrow(p.logits)
 bin.ll <- p.logits[cbind(1:n, dat$vote)]
 ologit.ll <- p.ologit[cbind(1:n, dat$vote)]
 binom.test(sum(bin.ll  ologit.ll), n)


 dat$vote.fac-factor(dat$vote, levels=1:6)
 mod-polr(dat$vote.fac~age+demsat+eusup+lrself+male+retnat+union+urban, 
 data=dat)

 source(http://www.quantoid.net/cat_pre.R )
 catpre(mod)

 install.packages(rms)
 library(rms)
 olprobs-predict(mod, type='probs')
 pred.cat-apply(olprobs, 1, which.max)
 table(pred.cat, dat$vote)

 round(prop.table(table(pred.cat, dat$vote), 2), 3)
 On Mar 11, 2013, at 5:02 PM, Heather Kettrey wrote:

 Hi,

 I am running an analysis with an ordinal outcome and I need to run a test
 of the parallel regression assumption to determine if ordinal logistic
 regression is appropriate. I cannot find a function to conduct such a test.
 From searching various message boards I have seen a few useRs ask this same
 question without a definitive answer - and I came across a thread that
 indicated there is no such function available in any R packages. I hope
 this is incorrect.

 Does anyone know how to test the parallel regression assumption in R?

 Thanks for your help!


 --
 Heather Hensman Kettrey
 PhD Candidate
 Department of Sociology
 Vanderbilt University

   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 

Re: [R] Test of Parallel Regression Assumption in R

2013-03-11 Thread Bert Gunter
Heather:

You are at Vanderbilt, whose statistics department under Frank Harrell
is a veritable bastion of R and statistical wisdom. I strongly
recommend that you take a stroll over there in the lovely spring
weather and seek their help. I can't imagine how you could do better
than that!

Cheers,
Bert

On Mon, Mar 11, 2013 at 2:02 PM, Heather Kettrey
heather.h.kett...@vanderbilt.edu wrote:
 Hi,

 I am running an analysis with an ordinal outcome and I need to run a test
 of the parallel regression assumption to determine if ordinal logistic
 regression is appropriate. I cannot find a function to conduct such a test.
 From searching various message boards I have seen a few useRs ask this same
 question without a definitive answer - and I came across a thread that
 indicated there is no such function available in any R packages. I hope
 this is incorrect.

 Does anyone know how to test the parallel regression assumption in R?

 Thanks for your help!


 --
 Heather Hensman Kettrey
 PhD Candidate
 Department of Sociology
 Vanderbilt University




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Test of Parallel Regression Assumption in R

2013-03-11 Thread Jeff Newmiller
Perhaps you should be asking whether such an algorithm exists, regardless of 
whether it is already implemented in R. However, this is the wrong place to ask 
such theory questions... your local statistics expert might know, or you could 
ask on a statistics theory forum such as stats.stackexchange.com. With the 
answer to that question you could use the RSiteSearch function to search for 
references to that algorithm, or even implement it yourself.
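---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:jdnew...@dcn.davis.ca.us          Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------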
Sent from my phone. Please excuse my brevity.

Heather Kettrey heather.h.kett...@vanderbilt.edu wrote:

Hi,

I am running an analysis with an ordinal outcome and I need to run a
test
of the parallel regression assumption to determine if ordinal logistic
regression is appropriate. I cannot find a function to conduct such a
test.
From searching various message boards I have seen a few useRs ask this
same
question without a definitive answer - and I came across a thread that
indicated there is no such function available in any R packages. I hope
this is incorrect.

Does anyone know how to test the parallel regression assumption in R?

Thanks for your help!



Re: [R] Test of Parallel Regression Assumption in R

2013-03-11 Thread Nicole Ford
here's some code as an example  hope it helps!

library(MASS)  # provides polr()

mod <- polr(vote ~ age + demsat + eusup + lrself + male + retnat + union + urban, data = dat)
summary(mod)

## Compare the ordered logit with separate binary logits,
## one for each cumulative split of the outcome.
levs <- levels(dat$vote)
tmpdat <- list()
for (i in 1:(nlevels(dat$vote) - 1)) {
  tmpdat[[i]] <- dat
  ## z = 1 if the response is at or below level i
  tmpdat[[i]]$z <- as.numeric(as.numeric(tmpdat[[i]]$vote) <= i)
}
form <- as.formula(z ~ age + demsat + eusup + lrself + male + retnat + union + urban)
mods <- lapply(tmpdat, function(x) glm(form, data = x, family = binomial))
probs <- sapply(mods, predict, type = "response")
## category probabilities implied by the cumulative binary logits
p.logits <- cbind(probs[, 1], t(apply(probs, 1, diff)), 1 - probs[, ncol(probs)])
p.ologit <- predict(mod, type = "probs")
n <- nrow(p.logits)
bin.ll <- p.logits[cbind(1:n, dat$vote)]
ologit.ll <- p.ologit[cbind(1:n, dat$vote)]
binom.test(sum(bin.ll > ologit.ll), n)


dat$vote.fac <- factor(dat$vote, levels = 1:6)
mod <- polr(dat$vote.fac ~ age + demsat + eusup + lrself + male + retnat + union + urban,
            data = dat)

source("http://www.quantoid.net/cat_pre.R")
catpre(mod)

install.packages("rms")
library(rms)
olprobs <- predict(mod, type = "probs")
pred.cat <- apply(olprobs, 1, which.max)
table(pred.cat, dat$vote)

round(prop.table(table(pred.cat, dat$vote), 2), 3)

On Mar 11, 2013, at 5:02 PM, Heather Kettrey wrote:

 Hi,
 
 I am running an analysis with an ordinal outcome and I need to run a test
 of the parallel regression assumption to determine if ordinal logistic
 regression is appropriate. I cannot find a function to conduct such a test.
 From searching various message boards I have seen a few useRs ask this same
 question without a definitive answer - and I came across a thread that
 indicated there is no such function available in any R packages. I hope
 this is incorrect.
 
 Does anyone know how to test the parallel regression assumption in R?
 
 Thanks for your help!
 
 
 -- 
 Heather Hensman Kettrey
 PhD Candidate
 Department of Sociology
 Vanderbilt University
 
   [[alternative HTML version deleted]]
 




Re: [R] test for a condition in a vector for loop not working

2012-11-10 Thread scoyoc
Once again, thanks!
MVS



-
MVS
=
Matthew Van Scoyoc
Graduate Research Assistant, Ecology
Wildland Resources Department & Ecology Center
Quinney College of Natural Resources
Utah State University
Logan, UT
=
Think SNOW!


--
View this message in context: 
http://r.789695.n4.nabble.com/test-for-a-condition-in-a-vector-for-loop-not-working-tp4649212p4649216.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Test for Random Points on a Sphere

2012-10-07 Thread 周果
Hi Lorenzo,

Just a quick thought: the uniform probability density on a unit sphere is
1 / (4 pi) per unit area. What about binning those random points according
to their directions and doing a chi-square test?
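
A rough sketch of that binning idea in base R. The points here are a
stand-in generated from normalised Gaussian vectors (an assumption; in
practice xyz would hold the output of the rotation algorithm), and the
grid sizes n_z and n_phi are arbitrary illustrative choices. It relies on
the fact that, for a uniform distribution on the sphere, z is uniform on
[-1, 1] and the longitude is uniform on [-pi, pi], so equal-width bins in
(z, longitude) are cells of equal area.

```r
# Sketch: chi-square check that points are uniform on the unit sphere.
set.seed(1)
n <- 5000

# Stand-in for the points to be tested (assumption): normalised
# Gaussian vectors, which are uniform on the sphere.
xyz <- matrix(rnorm(3 * n), ncol = 3)
xyz <- xyz / sqrt(rowSums(xyz^2))

# Equal-width bins in z and in longitude give equal-area cells
n_z <- 6
n_phi <- 8
z_bin <- cut(xyz[, 3], breaks = seq(-1, 1, length.out = n_z + 1),
             include.lowest = TRUE)
phi_bin <- cut(atan2(xyz[, 2], xyz[, 1]),
               breaks = seq(-pi, pi, length.out = n_phi + 1),
               include.lowest = TRUE)

# Observed counts per cell; under uniformity every cell has the
# same expected count, which is the default null of chisq.test()
counts <- as.vector(table(z_bin, phi_bin))
res <- chisq.test(counts)
print(res)
```

A small p-value would point to non-uniformity at the resolution of the
grid; a large one only says that this particular binning found nothing.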

Regards,
Guo

On Sun, Oct 7, 2012 at 2:16 AM, cbe...@tajo.ucsd.edu wrote:

 Lorenzo Isella lorenzo.ise...@gmail.com writes:

  Dear All,
  I implemented an algorithm for (uniform) random rotations.
  In order to test it, I can apply it to a unit vector (0,0,1) in
  Cartesian coordinates.
  The result is supposed to be a set of random, uniformly distributed,
  points on a sphere (not the point of the algorithm, but a way to test
  it).
  This is what the points look like when I plot them, but other than
  eyeballing them, can anyone suggest a test to ensure that I am really
  generating uniform random points on a sphere?

 There is a substantial literature on this topic and more than one
 (metaphorical?) direction you could follow.

 I suggest you Google 'directional statistics' and start reading.

 Visit http://www.rseek.org and enter 'directional statistics' in
 the search box and click on the search button to see if there is
 something in R to meet your needs.

 A post to r-sig-geo might get more helpful responses once you can focus
 the question a bit more.


 HTH,

 Chuck

  Many thanks
 
  Lorenzo
 

 --
 Charles C. Berry                            Dept of Family/Preventive Medicine
 cberry at ucsd edu                          UC San Diego
 http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901





Re: [R] Test for Random Points on a Sphere

2012-10-06 Thread cberry
Lorenzo Isella lorenzo.ise...@gmail.com writes:

 Dear All,
 I implemented an algorithm for (uniform) random rotations.
 In order to test it, I can apply it to a unit vector (0,0,1) in
 Cartesian coordinates.
 The result is supposed to be a set of random, uniformly distributed,
 points on a sphere (not the point of the algorithm, but a way to test
 it).
 This is what the points look like when I plot them, but other than
 eyeballing them, can anyone suggest a test to ensure that I am really
 generating uniform random points on a sphere?

There is a substantial literature on this topic and more than one
(metaphorical?) direction you could follow.

I suggest you Google 'directional statistics' and start reading.

Visit http://www.rseek.org and enter 'directional statistics' in
the search box and click on the search button to see if there is
something in R to meet your needs.

A post to r-sig-geo might get more helpful responses once you can focus
the question a bit more.


HTH,

Chuck

 Many thanks

 Lorenzo


-- 
Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at ucsd edu                          UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Re: [R] Test for Random Points on a Sphere

2012-10-05 Thread R. Michael Weylandt
On Fri, Oct 5, 2012 at 5:39 PM, Lorenzo Isella lorenzo.ise...@gmail.com wrote:
 Dear All,
 I implemented an algorithm for (uniform) random rotations.
 In order to test it, I can apply it to a unit vector (0,0,1) in Cartesian
 coordinates.
 The result is supposed to be a set of random, uniformly distributed, points
 on a sphere (not the point of the algorithm, but a way to test it).
 This is what the points look like when I plot them, but other than
 eyeballing them, can anyone suggest a test to ensure that I am really
 generating uniform random points on a sphere?
 Many thanks


Gut says to divide the surface into n bits of equal area and see if
the points appear uniformly in those using something chi-squared-ish,
but I'm not aware of a canonical way to do so.

Cheers,
Michael

 Lorenzo




Re: [R] Test for Random Points on a Sphere

2012-10-05 Thread Nordlund, Dan (DSHS/RDA)
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of R. Michael Weylandt
 Sent: Friday, October 05, 2012 11:17 AM
 To: Lorenzo Isella
 Cc: r-help@r-project.org
 Subject: Re: [R] Test for Random Points on a Sphere
 
 On Fri, Oct 5, 2012 at 5:39 PM, Lorenzo Isella
 lorenzo.ise...@gmail.com wrote:
  Dear All,
  I implemented an algorithm for (uniform) random rotations.
  In order to test it, I can apply it to a unit vector (0,0,1) in
 Cartesian
  coordinates.
  The result is supposed to be a set of random, uniformly distributed,
 points
  on a sphere (not the point of the algorithm, but a way to test it).
  This is what the points look like when I plot them, but other than
  eyeballing them, can anyone suggest a test to ensure that I am really
  generating uniform random points on a sphere?
  Many thanks
 
 
 Gut says to divide the surface into n bits of equal area and see if
 the points appear uniformly in those using something chi-squared-ish,
 but I'm not aware of a canonical way to do so.
 
 Cheers,
 Michael
 
  Lorenzo
 

I would be more inclined to use a method which is known to produce points 
uniformly distributed on the surface of a sphere and not worry about testing 
your results.  You might find the discussion at the following link useful.

http://mathworld.wolfram.com/SpherePointPicking.html
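
As an illustration, one standard construction of this kind draws z
uniformly on [-1, 1] and the longitude uniformly on [0, 2*pi), then maps
back to Cartesian coordinates. The sketch below is base R only; the
sample size and variable names are illustrative.

```r
# Sketch: draw n points uniformly on the unit sphere by sampling
# z uniformly on [-1, 1] and the longitude theta uniformly on [0, 2*pi).
set.seed(42)
n <- 1000
z <- runif(n, -1, 1)
theta <- runif(n, 0, 2 * pi)
r <- sqrt(1 - z^2)   # radius of the circle of latitude at height z

pts <- cbind(x = r * cos(theta), y = r * sin(theta), z = z)

# Every point should have unit length up to rounding error
range(rowSums(pts^2))
```

Sampling the polar angle itself uniformly would instead crowd the points
towards the poles, which is why z (or the cosine of the polar angle) is
the quantity drawn uniformly here.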


Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204




Re: [R] test Breslow-Day for svytable??

2012-08-31 Thread John Sorkin
Suggestion:
You need to send us more information, i.e. the code that generated daty, or a 
listing of the daty structure, and a copy of the listing
produced by epi.2by2.
John

 
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Diana Marcela Martinez Ruiz dianamm...@hotmail.com 8/31/2012 10:20 AM

Hi all,

I want to know how to perform the test Breslow-Day test for homogeneity of 
odds ratios (OR) stratified for svytable. This test is obtained with the 
following code:

epi.2by2 (dat = daty, method = case.control conf.level = 0.95,
units = 100, homogeneity = breslow.day, verbose = TRUE)

where daty is the object type table  svytable consider it, but when I run the 
code
does not throw the homogeneity test.

Thanks.  




Re: [R] test Breslow-Day for svytable??

2012-08-31 Thread David Winsemius

On Aug 31, 2012, at 7:20 AM, Diana Marcela Martinez Ruiz wrote:

 Hi all,
 
 I want to know how to perform the test Breslow-Day test for homogeneity of 
 odds ratios (OR) stratified for svytable. This test is obtained with the 
 following code:
 
 epi.2by2 (dat = daty, method = case.control conf.level = 0.95,

missing comma here ...^

units = 100, homogeneity = breslow.day, verbose = TRUE)
 
 where daty is the object type table  svytable consider it, but when I run 
 the code
 does not throw the homogeneity test.

You are asked in the Posting Guide to copy all errors and warnings when asking 
about unexpected behavior. When I run epi.2by2 on the output of a svytable 
object I get no errors, but I do get warnings, which I think are due to 
non-integer entries in the weighted table. Using the first example on the 
svytable() help page, I also get an object that is NOT a set of 2 x 2 
tables in an array of the structure expected by epi.2by2(). The fact that 
epi.2by2() will report numbers with labels for a 2 x 3 table means that its 
error checking is weak.

This is the output of str(dat) from one of the examples on epi.2by2's help page:

> str(dat)
 table [1:2, 1:2, 1:3] 41 13 6 53 66 37 25 83 23 37 ...
 - attr(*, "dimnames")=List of 3
  ..$ Exposure: chr [1:2] "+" "-"
  ..$ Disease : chr [1:2] "+" "-"
  ..$ Strata  : chr [1:3] "20-29 yrs" "30-39 yrs" "40+ yrs"

Notice that it is a 2 x 2 x n array. (Caveat: from here on out I am simply 
reading the help pages and using str() to look at the objects created to get an 
idea regarding success or failure. I am not an experienced user of either 
package.)  I doubt that what you got from svytable is a 2 x 2 table. As 
another example, you can build a 2 x 2 x n table from the built-in dataset 
UCBAdmissions:

DF <- as.data.frame(UCBAdmissions)
## Now 'DF' is a data frame with a grid of the factors and the counts
## in variable 'Freq'.
dat2 <- xtabs(Freq ~ Gender + Admit + Dept, DF)
epiR::epi.2by2(dat = dat2, method = "case.control", conf.level = 0.95, 
 units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
#-
  test.statistic df    p.value
1       18.82551  5 0.00207139

Using svydesign and svytable I _think_ this is how one would go about 
constructing a 2 x 2 x n table:

library(survey)  # provides svydesign() and svytable()
tbl2 <- svydesign(~ Gender + Admit + Dept, weights = ~Freq, data = DF)
summary(tbl2)
(tbl2by2 <- svytable(~ Gender + Admit + Dept, tbl2))
epiR::epi.2by2(dat = tbl2by2, method = "case.control", conf.level = 0.95, 
 units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
#---
  test.statistic df    p.value
1       18.82551  5 0.00207139

(At least I got internal consistency. I see you copied Thomas Lumley, which is 
a good idea. I'll be happy to get corrected on any point. I'm adding the 
maintainer of epiR to the recipients.)

-- 
David Winsemius, MD
Alameda, CA, USA



Re: [R] test Breslow-Day for svytable??

2012-08-31 Thread Thomas Lumley
On Sat, Sep 1, 2012 at 4:27 AM, David Winsemius dwinsem...@comcast.net wrote:

 On Aug 31, 2012, at 7:20 AM, Diana Marcela Martinez Ruiz wrote:

 Hi all,

 I want to know how to perform the test Breslow-Day test for homogeneity of
 odds ratios (OR) stratified for svytable. This test is obtained with the 
 following code:

 epi.2by2 (dat = daty, method = case.control conf.level = 0.95,

 missing comma here ...^

units = 100, homogeneity = breslow.day, verbose = TRUE)

 where daty is the object type table  svytable consider it, but when I run 
 the code
 does not throw the homogeneity test.

 You are asked in the Posting Guide to copy all errors and warnings when 
 asking about unexpected behavior. When I run epi.2by2 on the output of a 
 svytable object I get no errors, but I do get warnings, which I think are due 
 to non-integer entries in the weighted table. Using the first example on the 
 svytable() help page, I also get an object that is NOT a set of 2 x 2 
 tables in an array of the structure expected by epi.2by2(). The fact that 
 epi.2by2() will report numbers with labels for a 2 x 3 table means that its 
 error checking is weak.

 This is the output of str(dat) from one of the examples on epi.2by2's help 
 page:

 > str(dat)
  table [1:2, 1:2, 1:3] 41 13 6 53 66 37 25 83 23 37 ...
  - attr(*, "dimnames")=List of 3
   ..$ Exposure: chr [1:2] "+" "-"
   ..$ Disease : chr [1:2] "+" "-"
   ..$ Strata  : chr [1:3] "20-29 yrs" "30-39 yrs" "40+ yrs"

 Notice that it is a 2 x 2 x n array. (Caveat: from here on out I am simply 
 reading the help pages and using str() to look at the objects created to get 
 an idea regarding success or failure. I am not an experienced user of either 
 package.)  I doubt that what you got from svytable is a 2 x 2 table. As 
 another example, you can build a 2 x 2 x n table from the built-in dataset 
 UCBAdmissions:

 DF <- as.data.frame(UCBAdmissions)
 ## Now 'DF' is a data frame with a grid of the factors and the counts
 ## in variable 'Freq'.
 dat2 <- xtabs(Freq ~ Gender + Admit + Dept, DF)
 epiR::epi.2by2(dat = dat2, method = "case.control", conf.level = 0.95,
  units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
 #-
   test.statistic df    p.value
 1       18.82551  5 0.00207139

 Using svydesign and svytable I _think_ this is how one would go about 
 constructing a 2 x 2 x n table:

 library(survey)  # provides svydesign() and svytable()
 tbl2 <- svydesign(~ Gender + Admit + Dept, weights = ~Freq, data = DF)
 summary(tbl2)
 (tbl2by2 <- svytable(~ Gender + Admit + Dept, tbl2))
 epiR::epi.2by2(dat = tbl2by2, method = "case.control", conf.level = 0.95,
  units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
 #---
   test.statistic df    p.value
 1       18.82551  5 0.00207139

 (At least I got internal consistency. I see you copied Thomas Lumley, which 
 is a good idea. I'll be happy to get corrected on any point. I'm adding the 
 maintainer of epiR to the recipients.)


Yes, that will give internal consistency from a data structure point
of view.  It won't give a valid test in real examples, though --
epi.2by2 doesn't know about complex sampling, and what you're passing
it is just an estimate of the population 2x2xK table.

What would work, though it's not quite the same as the Breslow-Day
test, is to use svyloglin() and do a Rao-Scott test comparing the
model with all two-way interactions ~(Gender+Dept+Admit)^2 to the
saturated model ~Gender*Dept*Admit.

-thomas


-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland



Re: [R] test if elements of a character vector contain letters

2012-08-08 Thread Liviu Andronic
On Tue, Aug 7, 2012 at 10:26 PM, Marc Schwartz marc_schwa...@me.com wrote:
 since there are alpha-numerics present, whereas the first option will:

 > grepl("[^[:alnum:]]", "ab%")
 [1] TRUE


 So, use the first option.

And I should start reading more carefully. The above works fine for me.

I ended up defining the following wrappers:
is_alpha <- function(x) {grepl("[[:alpha:]]", x)}  ##Alphabetic characters
is_digit <- function(x) {grepl("[[:digit:]]", x)}  ##Digits
is_alnum <- function(x) {grepl("[[:alnum:]]", x)}  ##Alphanumeric characters
is_punct <- function(x) {grepl("[[:punct:]]", x)}  ##Punctuation characters
is_notalnum <- function(x) {grepl("[^[:alnum:]]", x)}  ##Non-Alphanumeric characters
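
One thing worth keeping in mind with wrappers like these is that they
test whether an element contains at least one character of the class, not
whether it consists only of such characters. The sketch below contrasts
the two; is_all_alpha is a hypothetical helper added here for
illustration and is not part of the wrappers above.

```r
# "Contains at least one letter" versus "consists only of letters".
is_alpha <- function(x) grepl("[[:alpha:]]", x)

# Hypothetical helper: anchoring the pattern with ^ and $ and adding +
# requires every character of a non-empty string to be a letter.
is_all_alpha <- function(x) grepl("^[[:alpha:]]+$", x)

x <- c("abc", "123", "a1", "")
data.frame(x = x, any_letter = is_alpha(x), only_letters = is_all_alpha(x))
```

For "a1" the first test is TRUE and the second FALSE, which is the case
where the two notions differ.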


Thanks again
Liviu



Re: [R] test if elements of a character vector contain letters

2012-08-07 Thread Liviu Andronic
On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
 is.letter <- function(x) grepl("[[:alpha:]]", x)
 is.number <- function(x) grepl("[[:digit:]]", x)

Quick follow-up question.

I'm always reluctant to create functions that would resemble the
method of a function (here, is() ), but would in fact not be a genuine
method. So would there be any incompatibility between is() and
is.letter(), given that the latter is not a method of the former?
Is it good (or acceptable) practice to define is.letter() as above?
Would is_letter() be better?

Regards
Liviu



Re: [R] test if elements of a character vector contain letters

2012-08-07 Thread Liviu Andronic
On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
 is.letter <- function(x) grepl("[[:alpha:]]", x)
 is.number <- function(x) grepl("[[:digit:]]", x)


Another follow-up. To test for (non-)alphanumeric one would do the following:
> x <- c(letters, 1:26, '+', '-', '%^')
> x[1:10] <- paste(x[1:10], 1:10, sep='')
> x
 [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"  "+"   "-"   "%^"
> xb <- grepl("[[:alnum:]]", x)  ##test for alphanumeric chars
> x[xb]
 [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"
> xb <- grepl("[[:punct:]]", x)  ##test for non-alphanumeric chars
> x[xb]
[1] "+"  "-"  "%^"


More regex rules are available on the Wiki [1]. Regards
Liviu

[1] http://en.wikipedia.org/wiki/Regular_expression



Re: [R] test if elements of a character vector contain letters

2012-08-07 Thread R. Michael Weylandt
On Tue, Aug 7, 2012 at 4:28 AM, Liviu Andronic landronim...@gmail.com wrote:
 On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
 is.letter <- function(x) grepl("[[:alpha:]]", x)
 is.number <- function(x) grepl("[[:digit:]]", x)

 Quick follow-up question.

 I'm always reluctant to create functions that would resemble the
 method of a function (here, is() ), but would in fact not be a genuine
 method. So would there be any incompatibility between is() and
 is.letter(), given that the latter is not a method of the former?
 Is it good (or acceptable) practice to define is.letter() as above?
 Would is_letter() be better?

It certainly won't cause problems if you never define anything of
class letter or number.
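
A small sketch of the kind of name clash being asked about. As far as I
know is() is not an S3 generic, so it never dispatches to a function
called is.letter; the example uses format(), which is an S3 generic, to
show what happens when a dotted name does match generic.class. The class
name "letter" and the object obj are made up for illustration.

```r
# An ordinary helper that merely looks like a method
is.letter <- function(x) grepl("[[:alpha:]]", x)

# A made-up S3 class called "letter"
obj <- structure("a", class = "letter")

# format() is an S3 generic, so a function named format.letter
# is picked up as the method for objects of class "letter"
format.letter <- function(x, ...) "dispatched to format.letter"
out <- format(obj)
print(out)

# is.letter() is only ever called by its own name
res <- is.letter(c("a", "1"))
print(res)
```

Using an underscore (is_letter) avoids the question altogether, since
such a name can never be mistaken for an S3 method.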


 Regards
 Liviu




Re: [R] test if elements of a character vector contain letters

2012-08-07 Thread Marc Schwartz

On Aug 7, 2012, at 3:02 PM, Liviu Andronic landronim...@gmail.com wrote:

 On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
 is.letter <- function(x) grepl("[[:alpha:]]", x)
 is.number <- function(x) grepl("[[:digit:]]", x)
 
 
 Another follow-up. To test for (non-)alphanumeric one would do the following:
 > x <- c(letters, 1:26, '+', '-', '%^')
 > x[1:10] <- paste(x[1:10], 1:10, sep='')
 > x
  [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
 [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
 [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
 [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"  "+"   "-"   "%^"
 > xb <- grepl("[[:alnum:]]", x)  ##test for alphanumeric chars
 > x[xb]
  [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
 [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
 [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
 [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"
 > xb <- grepl("[[:punct:]]", x)  ##test for non-alphanumeric chars
 > x[xb]
 [1] "+"  "-"  "%^"


That will get you values where punctuation characters are used, but there may 
be other non-alphanumeric characters in the vector. There may be ASCII control 
codes, tabs, newlines, CR, LF, spaces, etc. which would not be found by using 
[:punct:].

For example:

> grepl("[[:punct:]]", " ")
[1] FALSE


If you want to explicitly look for non-alphanumeric characters, you would be 
better off using a negation of [:alnum:] such as:

grepl("[^[:alnum:]]", x)

or

!grepl("[[:alnum:]]", x)


Regards,

Marc



 
 More regex rules are available on the Wiki [1]. Regards
 Liviu
 
 [1] http://en.wikipedia.org/wiki/Regular_expression
 



Re: [R] test if elements of a character vector contain letters

2012-08-07 Thread Marc Schwartz

On Aug 7, 2012, at 3:18 PM, Marc Schwartz marc_schwa...@me.com wrote:

 
 On Aug 7, 2012, at 3:02 PM, Liviu Andronic landronim...@gmail.com wrote:
 
 On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
 is.letter <- function(x) grepl("[[:alpha:]]", x)
 is.number <- function(x) grepl("[[:digit:]]", x)
 
 
 Another follow-up. To test for (non-)alphanumeric one would do the following:
 > x <- c(letters, 1:26, '+', '-', '%^')
 > x[1:10] <- paste(x[1:10], 1:10, sep='')
 > x
  [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
 [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
 [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
 [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"  "+"   "-"   "%^"
 > xb <- grepl("[[:alnum:]]", x)  ##test for alphanumeric chars
 > x[xb]
  [1] "a1"  "b2"  "c3"  "d4"  "e5"  "f6"  "g7"  "h8"  "i9"  "j10" "k"   "l"   "m"   "n"
 [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
 [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
 [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"
 > xb <- grepl("[[:punct:]]", x)  ##test for non-alphanumeric chars
 > x[xb]
 [1] "+"  "-"  "%^"
 
 
 That will get you values where punctuation characters are used, but there may 
 be other non-alphanumeric characters in the vector. There may be ASCII 
 control codes, tabs, newlines, CR, LF, spaces, etc. which would not be found 
 by using [:punct:].
 
 For example:
 
 grepl("[[:punct:]]", " ")
 [1] FALSE
 
 
 If you want to explicitly look for non-alphanumeric characters, you would be 
 better off using a negation of [:alnum:] such as:
 
 grepl("[^[:alnum:]]", x)

 or

 !grepl("[[:alnum:]]", x)
 



Actually (for the second time in two days) I need to correct myself. The second 
option would not work correctly in cases where there is a mix of alpha-numerics 
and non:

> !grepl("[[:alnum:]]", "ab%")
[1] FALSE

since there are alpha-numerics present, whereas the first option will:

> grepl("[^[:alnum:]]", "ab%")
[1] TRUE


So, use the first option.
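
A minimal sketch of the difference between the two options, on a made-up vector (the values in x are illustrative and not from the thread):

```r
# Illustrative input: plain letters, a mix, embedded space, punctuation only,
# a lone space, and digits only.
x <- c("abc", "ab%", "a b", "%", " ", "123")

# Option 1: TRUE if the element contains at least one non-alphanumeric character.
has_non_alnum <- grepl("[^[:alnum:]]", x)

# Option 2: TRUE only if the element contains no alphanumeric character at all.
no_alnum <- !grepl("[[:alnum:]]", x)

# [:punct:] on its own does not see whitespace.
has_punct <- grepl("[[:punct:]]", x)

print(data.frame(x, has_non_alnum, no_alnum, has_punct))
```

Only option 1 flags "ab%" and "a b"; option 2 answers a different question (elements with no letters or digits at all).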

Regards,

Marc who is heading to the coffee machine...



Re: [R] test if elements of a character vector contain letters

2012-08-07 Thread Liviu Andronic
On Tue, Aug 7, 2012 at 10:18 PM, Marc Schwartz marc_schwa...@me.com wrote:
 That will get you values where punctuation characters are used, but there may 
 be other non-alphanumeric characters in the vector. There may be ASCII 
 control codes, tabs, newlines, CR, LF, spaces, etc. which would not be found 
 by using [:punct:].

 For example:

 grepl("[[:punct:]]", " ")
 [1] FALSE


 If you want to explicitly look for non-alphanumeric characters, you would be 
 better off using a negation of [:alnum:] such as:

[..]


 !grepl("[[:alnum:]]", x)

Good point! Thanks.
Liviu



Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Bert Gunter
nzchar(x) & !is.na(x)

No?

-- Bert

On Mon, Aug 6, 2012 at 9:25 AM, Liviu Andronic landronim...@gmail.com wrote:
 Dear all
 I'm pretty sure that I'm approaching the problem in a wrong way.
 Suppose the following character vector:
 (x[1:10] <- paste(x[1:10], sample(1:10, 10), sep=''))
  [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"
 x
  [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"  "k"   "l"   "m"   "n"
 [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
 [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
 [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"


 How do you test whether the elements of the vector contain at least
 one letter (or at least one digit) and obtain a logical vector of the
 same dimension? I came up with the following awkward function:
 is_letter <- function(x, pattern=c(letters, LETTERS)){
     sapply(x, function(y){
         any(sapply(pattern, function(z) grepl(z, y, fixed=T)))
     })
 }

 is_letter(x)
   a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
     p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    20    21    22    23    24    25    26
 FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 is_letter(x, 0:9)  ##function slightly misnamed
   a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
     p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    20    21    22    23    24    25    26
  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE


 Is there a nicer way to do this? Regards
 Liviu


 --
 Do you know how to read?
 http://www.alienetworks.com/srtest.cfm
 http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
 Do you know how to write?
 http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Rui Barradas

Hello,

Fun as an exercise in vectorization. 30 times faster. Don't look, guess.

Gave it up? Ok, here it is.


is_letter <- function(x, pattern=c(letters, LETTERS)){
    sapply(x, function(y){
        any(sapply(pattern, function(z) grepl(z, y, fixed=T)))
    })
}
# test ascii codes, just one loop.
has_letter <- function(x){
    sapply(x, function(y){
        y <- as.integer(charToRaw(y))
        any((65 <= y & y <= 90) | (97 <= y & y <= 122))
    })
}

x <- c(letters, 1:26)
x[1:10] <- paste(x[1:10], sample(1:10, 10), sep='')
x <- rep(x, 1e3)

t1 <- system.time(is_letter(x))
t2 <- system.time(has_letter(x))
rbind(t1, t2, t1/t2)
   user.self sys.self elapsed user.child sys.child
t1     15.69        0   15.74         NA        NA
t2      0.50        0    0.50         NA        NA
       31.38      NaN   31.48         NA        NA
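
To make the ASCII-code test concrete, here is a small sketch of what happens to a single element (the helper name ascii_has_letter is made up for illustration; it assumes plain ASCII input, as in the thread's data):

```r
# Byte codes of one string: "a" is 97, "1" is 49, "0" is 48.
codes <- as.integer(charToRaw("a10"))

# Hypothetical helper: TRUE if any byte falls in A-Z (65-90) or a-z (97-122).
ascii_has_letter <- function(s) {
    y <- as.integer(charToRaw(s))
    any((65 <= y & y <= 90) | (97 <= y & y <= 122))
}

res_a10 <- ascii_has_letter("a10")
res_26  <- ascii_has_letter("26")
print(codes)
print(c(a10 = res_a10, "26" = res_26))
```

The comparison operators are vectorized over the bytes of one string, but the function still has to be applied element by element with sapply().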




Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Martin Morgan

On 08/06/2012 09:51 AM, Rui Barradas wrote:

Hello,

Fun as an exercise in vectorization. 30 times faster. Don't look, guess.


> system.time(res0 <- grepl("[[:alpha:]]", x))
   user  system elapsed
  0.060   0.000   0.061
> system.time(res1 <- has_letter(x))
   user  system elapsed
  3.728   0.008   3.747
> all.equal(res0, res1, check.attributes=FALSE)
[1] TRUE






--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Marc Schwartz
Perhaps I am missing something, but why use sapply() when grepl() is already 
vectorized?

is.letter <- function(x) grepl("[:alpha:]", x)
is.number <- function(x) grepl("[:digit:]", x)

x <- c(letters, 1:26)

x[1:10] <- paste(x[1:10], sample(1:10, 10), sep='')

x <- rep(x, 1e3)

> str(x)
 chr [1:52000] "a2" "b10" "c8" "d3" "e6" "f1" "g5" ...

> system.time(is.letter(x))
   user  system elapsed 
  0.011   0.000   0.010 

> system.time(is.number(x))
   user  system elapsed 
  0.010   0.000   0.011 


Regards,

Marc Schwartz



Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread arun
Hi,

Not sure whether this is what you wanted.
x <- letters
(x[1:10] <- paste(x[1:10], sample(1:10, 10), sep=''))
x1 <- c(x, 1:26)


> x1
 [1] "a4"  "b3"  "c5"  "d2"  "e9"  "f6"  "g1"  "h8"  "i10" "j7"  "k"   "l"  
[13] "m"   "n"   "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"  
[25] "y"   "z"   "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10" 
[37] "11"  "12"  "13"  "14"  "15"  "16"  "17"  "18"  "19"  "20"  "21"  "22" 
[49] "23"  "24"  "25"  "26" 


> grepl("^[[:alpha:]][[:digit:]]", x1)
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE FALSE FALSE FALSE

A.K.





Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Marc Schwartz

On Aug 6, 2012, at 12:06 PM, Marc Schwartz marc_schwa...@me.com wrote:

 Perhaps I am missing something, but why use sapply() when grepl() is already 
 vectorized?
 
 is.letter <- function(x) grepl("[:alpha:]", x)
 is.number <- function(x) grepl("[:digit:]", x)

Sorry, typos in the above from my CP. Should be:

is.letter <- function(x) grepl("[[:alpha:]]", x)
is.number <- function(x) grepl("[[:digit:]]", x)

Marc

 


Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread David L Carlson
Only an extra set of brackets:

is.letter <- function(x) grepl("[[:alpha:]]", x)
is.number <- function(x) grepl("[[:digit:]]", x)

Without them, the functions are fast, but wrong.

> x
 [1] "a8"  "b5"  "c10" "d1"  "e6"  "f2"  "g4"  "h3"  "i7"  "j9"  "k"   "l"  
[13] "m"   "n"   "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"  
[25] "y"   "z"   "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10" 
[37] "11"  "12"  "13"  "14"  "15"  "16"  "17"  "18"  "19"  "20"  "21"  "22" 
[49] "23"  "24"  "25"  "26" 
> is.letter <- function(x) grepl("[:alpha:]", x)
> is.letter(x)
 [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE
[13] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE FALSE FALSE FALSE
> is.letter <- function(x) grepl("[[:alpha:]]", x)
> is.letter(x)
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[25]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE FALSE FALSE FALSE 
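
The four TRUEs in the single-bracket run (a8, h3, l, p) are what one would expect if "[:alpha:]" on its own is read as an ordinary bracket expression listing the characters :, a, l, p and h. A minimal sketch, using an explicitly written set as a stand-in for that reading (the short vector is illustrative, not the thread's data):

```r
# Illustrative subset of the kind of values shown above.
x <- c("a8", "b5", "h3", "k", "l", "p", "26")

# Explicit one-level bracket expression: matches any of a, h, l, p or ":".
single <- grepl("[ahlp:]", x)

# POSIX class inside a bracket expression: matches any letter.
double <- grepl("[[:alpha:]]", x)

print(rbind(x, single, double))
```

The inner [:alpha:] only names a character class when it appears inside an enclosing bracket expression, hence the doubled brackets.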

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Liviu Andronic
On Mon, Aug 6, 2012 at 6:42 PM, Bert Gunter gunter.ber...@gene.com wrote:
 nzchar(x) & !is.na(x)

 No?


It doesn't work for what I need:
> x
 [1] "a10" "b8"  "c9"  "d2"  "e3"  "f4"  "g1"  "h7"  "i6"  "j5"  "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"
> nzchar(x) & !is.na(x)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[18] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[35] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[52] TRUE


I need to have TRUE when an element contains a letter, and FALSE when
an element contains only numbers. The above returns TRUE for the
entire vector.
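
A short sketch of the distinction, on a made-up vector (the name only_digits and the anchored pattern are illustrative additions, not from the thread):

```r
x <- c("a10", "k", "26", "")

# nzchar() only asks whether the string is non-empty, so "26" passes too.
non_empty <- nzchar(x) & !is.na(x)

# TRUE when the element contains at least one letter.
has_letter <- grepl("[[:alpha:]]", x)

# TRUE when the element consists of digits only (anchored at both ends).
only_digits <- grepl("^[[:digit:]]+$", x)

print(data.frame(x, non_empty, has_letter, only_digits))
```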

Regards
Liviu





-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail



Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Yihui Xie
You probably mean grepl('[a-zA-Z]', x)
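
For instance, on a small made-up vector (a sketch only; note that ?regex warns that ranges such as a-z are locale-dependent, which is one reason the [[:alpha:]] class is often preferred):

```r
x <- c("a10", "K", "26", "+")

# TRUE when the element contains at least one ASCII letter.
ascii_letter <- grepl("[a-zA-Z]", x)

# For plain ASCII input the POSIX class gives the same answer.
posix_letter <- grepl("[[:alpha:]]", x)

print(data.frame(x, ascii_letter, posix_letter))
```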

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA





Re: [R] test if elements of a character vector contain letters

2012-08-06 Thread Liviu Andronic
On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
 is.letter <- function(x) grepl("[[:alpha:]]", x)
 is.number <- function(x) grepl("[[:digit:]]", x)



This does exactly what I wanted:
> x
 [1] "a10" "b8"  "c9"  "d2"  "e3"  "f4"  "g1"  "h7"  "i6"  "j5"  "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"
> xb <- grepl("[[:alpha:]]", x)
> x[xb]  ## extract all vector elements that contain a letter
 [1] "a10" "b8"  "c9"  "d2"  "e3"  "f4"  "g1"  "h7"  "i6"  "j5"  "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"
> xb <- grepl("[[:digit:]]", x)
> x[xb]  ## extract all vector elements that contain a digit
 [1] "a10" "b8"  "c9"  "d2"  "e3"  "f4"  "g1"  "h7"  "i6"  "j5"  "1"   "2"   "3"   "4"
[15] "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"  "17"  "18"
[29] "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"

Thanks all for the suggestions! Regards
Liviu
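
For readers who want to reproduce the result above, here is a minimal sketch. The vector x is rebuilt from the printed output, and the names has_letter, has_digit and only_digits are illustrative, not from the thread.

```r
# Rebuild the 52-element character vector shown in the thread
x <- c(paste0(letters[1:10], c(10, 8, 9, 2, 3, 4, 1, 7, 6, 5)),
       letters[11:26],
       as.character(1:26))

has_letter <- grepl("[[:alpha:]]", x)     # TRUE if the element contains any letter
has_digit  <- grepl("[[:digit:]]", x)     # TRUE if the element contains any digit
only_digits <- grepl("^[[:digit:]]+$", x) # TRUE if the element is digits only

sum(has_letter)   # 26 elements contain a letter
sum(has_digit)    # 36 elements contain a digit
sum(only_digits)  # 26 elements are purely numeric
```

Note that !has_letter gives FALSE for letter-containing elements and TRUE for the rest, which matches the original request only because every element here has either letters or digits.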



Re: [R] test parallel slopes with svyolr

2012-07-08 Thread Thomas Lumley
On Sun, Jul 8, 2012 at 2:32 AM, Diana Marcela Martinez Ruiz
dianamm...@hotmail.com wrote:
 Hello,

 I would like to know how to test the assumption of proportional odds or
 parallel lines or slopes for an ordinal logistic regression with svyolr


I wouldn't, but if someone finds a clear reference I'd be prepared to
implement it anyway.

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland



Re: [R] Test Binary File

2012-06-12 Thread Nortisiv
As an alternative to the hexview package, an external Hex-Editor may help you
investigate how the data is organised.



Re: [R] Test if a sample mean of integers with range -inf; inf is different from zero

2012-05-04 Thread R. Michael Weylandt
mean(c) != 0

But if you mean in a statistical sense... t.test() is one possibility.

Michael
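
A minimal sketch of both readings of the question, using the sample from the original post. The vector is renamed x here (an assumption for clarity, so that it does not share a name with the function c), and the one-sample t-test assumes roughly normal data, which may or may not hold for these integers.

```r
# Sample of integers from the original post
x <- c(-3, -1, 0, 1, 0, 3, 4, 10, 12)

# Literal check: is the sample mean exactly zero?
mean(x) != 0

# Statistical check: one-sample t-test of H0: true mean = 0
res <- t.test(x, mu = 0)
res$statistic   # t statistic
res$parameter   # degrees of freedom (n - 1)
res$p.value     # two-sided p-value
```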

On Fri, May 4, 2012 at 5:29 AM, Kay Cichini kay.cich...@gmail.com wrote:
 Hi all,

 how would you test  if a sample mean of integers with range -inf;inf  is
 different from zero:

 # my sample of integers:
 c <- c(-3, -1, 0, 1, 0, 3, 4, 10, 12)

 # is the mean of c different from 0?:
 mean(c)

 Thanks,
 Kay



Re: [R] Test if a sample mean of integers with range -inf; inf is different from zero

2012-05-04 Thread Petr Savicky
On Fri, May 04, 2012 at 11:29:51AM +0200, Kay Cichini wrote:
 Hi all,
 
 how would you test  if a sample mean of integers with range -inf;inf  is
 different from zero:
 
 # my sample of integers:
 c <- c(-3, -1, 0, 1, 0, 3, 4, 10, 12)
 
 # is the mean of c different from 0?:
 mean(c)

Hi.

It is better to give the vector a name other than c, since c is also
the function you use to build it.

Testing whether the sample mean is zero is simple, since one can use

  mean(c) == 0

or 

  sum(c) == 0

which are equivalent for integer data, even in inexact floating-point
arithmetic.

So, I think, you are asking for a statistical test of whether the
true distribution mean is zero on the basis of a sample. Testing
this requires some additional information on the distribution.
If we do not know anything about the distribution except that the
values are integers, then the sample mean can be arbitrarily large
even if the distribution mean is zero. Consider, for example,
a uniform distribution on {-M, M} for some very large integer M.
Observing a large sample mean does not allow us to reject the null
hypothesis at any level, since a large mean may have large probability
even if the null hypothesis is true.

If there is no bound on the values, then testing anything concerning
the mean may not be possible, since the expected value may not exist. Do
you have a reason to think that the true distribution has an expected value?

An example of an integer random variable without an expected value is

  s*X

where s is uniform on {-1, 1} and X has value 2^i with probability 2^-i
for i a positive integer.
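
The example above can be checked numerically. This is only an illustrative sketch: the cutoff of 30 terms, the seed and the sample size are arbitrary choices, and the draw via rgeom() is one assumed way to generate i with P(i = k) = 2^-k.

```r
# X takes the value 2^i with probability 2^-i, i = 1, 2, ...
i <- 1:30
p <- 2^-i          # probabilities
v <- 2^i           # values of X

sum(p)             # approaches 1 as more terms are included
partial <- cumsum(v * p)
partial            # partial sums of E|s*X|: each term adds exactly 1

# Simulation: sample means of s*X do not settle down as n grows
set.seed(1)
n  <- 10000
ii <- rgeom(n, 0.5) + 1                 # P(ii = k) = 2^-k
s  <- sample(c(-1, 1), n, replace = TRUE)
y  <- s * 2^ii
running_mean <- cumsum(y) / seq_len(n)  # inspect or plot to see the jumps
```

Since every term of the sum contributes 1, the partial sums grow without bound, which is why the expected value does not exist.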

Hope this helps.

Petr Savicky.



Re: [R] Test-Predict R survival analysis

2012-04-18 Thread Terry Therneau

On 04/18/2012 05:00 AM, r-help-requ...@r-project.org wrote:

Hi,

I'm trying to use the R Survival analysis on a windows 7 system.
The input data format is described at the end of this mail.

1/ I tried to perform a survival analysis including stratified variables
using the following formula.
cox.xtab_miR=coxph(Surv(time, status) ~ miR + strata(sex,nbligne, age),
data=matrix)
and obtain the following error message
Warning message:
In fitter(X, Y, strats, offset, init, control, weights = weights,  :
Ran out of iterations and did not converge

Is this due to the model (error in formula) or is the number of
stratified variables fixed?

The Cox model compares the deaths to the non-deaths, separately within 
each stratum, then adds up the result.


Your data set and model combination puts each subject into their own 
stratum, so there is no one to compare them to.  The fit has no data to 
use and so must fail.  (I admit the error message is misleading, but I 
hadn't ever seen someone make this particular mistake before.)


The following model works much better

 coxph(Surv(time, status) ~ miR + age + nbligne + strata(sex))
            coef exp(coef) se(coef)     z      p
miR     2.75e-05      1.00 9.35e-06 2.941 0.0033
age     3.39e-03      1.00 1.01e-02 0.334 0.7400
nbligne 7.14e-02      1.07 1.32e-01 0.542 0.5900

Likelihood ratio test=5.87  on 3 df, p=0.118  n= 70, number of events= 59
   (1 observation deleted due to missingness)

Terry Therneau
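
A quick way to spot this problem before fitting is to count subjects per stratum. The sketch below uses base R only and a small made-up data frame (dat and its values are hypothetical; only the column names follow the thread). It assumes that stratifying on several variables forms one stratum per distinct combination of their values.

```r
# Hypothetical data: 8 subjects with the stratification variables from the thread
dat <- data.frame(
  sex     = c("F", "M", "F", "M", "F", "M", "F", "M"),
  nbligne = c(1, 2, 3, 1, 2, 3, 1, 2),
  age     = c(51, 62, 47, 70, 58, 66, 49, 73)
)

# One stratum per combination of sex, nbligne and age
fine <- interaction(dat$sex, dat$nbligne, dat$age, drop = TRUE)
nlevels(fine)        # number of strata
table(table(fine))   # how many strata hold 1, 2, ... subjects

# Stratifying on sex alone leaves several subjects in each stratum
coarse <- factor(dat$sex)
table(coarse)
```

With a continuous variable such as age among the stratifiers, nearly every subject ends up alone in a stratum, which is the situation described above.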



Re: [R] Test Normality

2012-03-28 Thread Eik Vettorazzi
Hi Sindy,
you might try Snow's penultimate normality test from the TeachingDemos
package. But read the help file carefully.

http://www.inside-r.org/packages/cran/TeachingDemos/docs/SnowsPenultimateNormalityTest

cheers.

On 28.03.2012 02:32, Sindy Carolina Lizarazo wrote:
 Good night,
 
 I ran different tests to check normality and multinormality in my dataset,
 but I don't know which test is better.
 
 To verify univariate normality I checked: shapiro.test, cvm.test, ad.test,
 lillie.test, sf.test or jarque.bera.test, and
 to verify multivariate normality I used mardia, mvShapiro.Test,
 mvsf, mshapiro.test, mvnorm.e.
 
 I have a dataset with almost 1000 observations and 9 variables; in both cases
 the result is non-normality. For this reason, I transformed the data with the
 bcPower function and I want to check normality again.
 
 I really appreciate your help.
 Thanks.
 


-- 
Eik Vettorazzi

Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790



Re: [R] Test Normality

2012-03-28 Thread Michael Friendly

On 3/27/2012 8:32 PM, Sindy Carolina Lizarazo wrote:

Good night,

I ran different tests to check normality and multinormality in my dataset,
but I don't know which test is better.

To verify univariate normality I checked: shapiro.test, cvm.test, ad.test,
lillie.test, sf.test or jarque.bera.test, and
to verify multivariate normality I used mardia, mvShapiro.Test,
mvsf, mshapiro.test, mvnorm.e.

I have a dataset with almost 1000 observations and 9 variables; in both cases
the result is non-normality. For this reason, I transformed the data with the
bcPower function and I want to check normality again.


Univariate tests of normality are subsumed within the multivariate 
tests, so there is no real need for the former.


That being said, many of the tests are quite sensitive to mild or small
departures from multivariate normality, departures that would have little
real impact on the validity of an analysis.

You may find it more useful to carry out a graphical analysis, such as 
with normal QQ plots, or the multivariate generalization, which is a plot
of squared Mahalanobis distances of all observations from their centroid
vs. corresponding quantiles of the Chi-square distribution with p=9 df.

[As a courtesy to readers, you might cite the packages from which you've
used these functions.]

-Michael
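
A minimal sketch of the chi-square QQ plot described above, using base R only. The matrix X is simulated normal data standing in for the real 1000 x 9 dataset (an assumption), and the variable names are illustrative.

```r
set.seed(1)
n <- 1000
p <- 9
X <- matrix(rnorm(n * p), ncol = p)   # stand-in for the real data

# Squared Mahalanobis distances of each observation from the centroid
d2 <- mahalanobis(X, center = colMeans(X), cov = cov(X))

# Corresponding quantiles of the chi-square distribution with p df
q <- qchisq(ppoints(n), df = p)

plot(q, sort(d2),
     xlab = "Chi-square quantiles (df = 9)",
     ylab = "Ordered squared Mahalanobis distances")
abline(0, 1)   # points near this line are consistent with multivariate normality
```

Points that bend away from the line in the upper tail suggest heavy tails or outliers; how much departure matters depends on the analysis that follows.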


--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA



Re: [R] test if text is part of vector

2012-01-20 Thread Petr PIKAL
Hi

 Hello,
 
 this is a very simple question:
 How can I find out if a word is part of a list of words
 
 like:
 a <- "word1"
 b <- "word4"
 
 vector <- c("word1","word2","word3")
 
 I tried it with match(a,vector)
 but this gives the position of the word.
 

Perhaps

 a %in% vector

Regards
Petr


 I am not sure if and how that can be
 done with a logical operator like if:
 IF text is part of vector THEN print is part
 
 Probably a very easy thing to do, but I am missing
 the logical operator... and help(if) is not working
 
 best regards,
 johannes


Re: [R] test if text is part of vector

2012-01-20 Thread Rainer M Krug

On 20/01/12 12:50, Johannes Radinger wrote:
 Hello,
 
 this is a very simple question: How can I find out if a word is
 part of a list of words
 
 like: a <- "word1" b <- "word4"
 
 vector <- c("word1","word2","word3")
 
 I tried it with match(a,vector) but this gives the position of the
 word.
 
 I am not sure if and how that can be done with a logical operator
 like if: IF text is part of vector THEN print is part
 
 Probably a very easy thing to do, but I am missing the logical
 operator... and help(if) is not working

check out %in%

help:

?"%in%"

Cheers,

Rainer

 
 best regards, johannes


-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug


Re: [R] test if text is part of vector

2012-01-20 Thread Johannes Radinger
Hi,

thank you very much... %in% is the operator I was looking for.

cheers,
johannes
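
For the archive, a short sketch of how %in% answers the original question. The vector is named words here rather than vector (an illustrative choice); the values are those from the question.

```r
a <- "word1"
b <- "word4"
words <- c("word1", "word2", "word3")

a %in% words    # TRUE:  "word1" is one of the elements
b %in% words    # FALSE: "word4" is not

# Use the result directly in a condition
if (a %in% words) print("is part")

# match() gives the position instead, or NA when there is no match
match(a, words)
match(b, words)
```

The help pages for operators and reserved words need quotes or backticks, for example ?"%in%" and ?"if".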



Re: [R] test if text is part of vector

2012-01-20 Thread R. Michael Weylandt michael.weyla...@gmail.com
You also might look at grepl() if you have time: it allows regular expressions 
and will be a little (a lot?) more flexible in how you define a match if you 
want to ignore things like capitalization. 

(mnemonic: the L in grepl indicates it's like grep but returns logicals instead 
of positions)

Michael
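
A small sketch of the difference, assuming a word list with mixed capitalization (the data and the names words and hits are made up for illustration):

```r
words <- c("word1", "Word2", "WORD3")

# Exact matching is case-sensitive
"word2" %in% words

# grepl() with ignore.case = TRUE matches regardless of capitalization;
# the ^ and $ anchors require the whole element to match
hits <- grepl("^word2$", words, ignore.case = TRUE)
hits
any(hits)

# grep() returns the positions instead of logicals
grep("^word2$", words, ignore.case = TRUE)
```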

On Jan 20, 2012, at 7:42 AM, Johannes Radinger jradin...@gmx.at wrote:

 Hi,
 
 thank you very much... %in% is the operator I was looking for.
 
 cheers,
 johannes
 


Re: [R] Test Case for Package

2011-10-19 Thread Uwe Ligges



On 19.10.2011 10:13, Vikram Bahure wrote:

Hi,

I had a query for writing a test case for a package.

If we are testing a function, do we need to load the package in the test
with library(mypackage)? Is that circular logic in any way?

For example, if I create a package mypackage, can I have a file mypackagetest
in the tests directory that contains the line library(mypackage)?



Yes, actually it won't work without such a call to library() or 
require() ...


Uwe Ligges



It would be helpful if I could get some input on this.

Regards
Vikram



Re: [R] Test for Random Walk and Makov Process

2011-09-03 Thread Ken
For a random walk, there are entropy-based tests (Robinson 1991), or you could 
empirically test the hypothesis by generating random normal data with the same 
mean and standard deviation and looking at the distribution of your quantiles. 
You could also make general statements about whether the data show 
autocorrelation function values which are not significant and do not appear 
to have a trend. Further, in a random walk, a binary variable for whether 
values are above or below the mean should follow a binomial distribution of 
size 1 with a probability of .5; there are tests which do this but also take 
magnitude into account. There are a lot of ways to approach this problem; 
which one to use depends on the application and how strong you want 
your conclusions to be. What kind of Markov process?
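
As one possible starting point, here is a minimal sketch on simulated data. It applies the autocorrelation and sign ideas to the increments of the series rather than to its levels, which is an assumption of this sketch and not something stated above; the Ljung-Box test and the lag of 10 are likewise illustrative choices.

```r
set.seed(42)
n <- 500
walk  <- cumsum(rnorm(n))   # simulated random walk, stand-in for real data
steps <- diff(walk)         # increments of the series

# Increments of a random walk should show no significant autocorrelation
bt <- Box.test(steps, lag = 10, type = "Ljung-Box")
bt$p.value

# With symmetric steps, about half of the increments should be positive
sg <- binom.test(sum(steps > 0), length(steps), p = 0.5)
sg$p.value

# Visual check of the autocorrelation function of the increments
acf(steps)
```

Large p-values here only mean the data do not contradict a random walk; they do not confirm one.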

On Sep 3, 2554 BE, at 9:59 PM, Jumlong Vongprasert jumlong.u...@gmail.com 
wrote:

 Dear All
   I want to test my data for Random Walk or Markov Process.
   How I can do this.
 Many Thanks
 
 -- 
 Jumlong Vongprasert Assist, Prof.
 Institute of Research and Development
 Ubon Ratchathani Rajabhat University
 Ubon Ratchathani
 THAILAND
 34000
 


Re: [R] test if vector contains elements of another vector (disregarding the position)

2011-08-22 Thread R. Michael Weylandt
%in%

Here,

i %in% j

Hope this helps,

Michael

On Mon, Aug 22, 2011 at 11:51 AM, Martin Batholdy
batho...@googlemail.com wrote:

 Hi,


 I have the following problem:


 I have two vectors:

 i <- c('a','c','g','h','b','d','f','k','l','e','i')

 j <- c('a', 'b', 'c')



 now I would like to generate a vector with the length of i that
 has zeros where i[x] != any element of j
 and 1 where i[x] == any element of j.

 So for the example above the vector would look like this:

 c(1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0)



 can someone help me on this?



Re: [R] test if vector contains elements of another vector (disregarding the position)

2011-08-22 Thread Henrique Dallazuanna
Try this:

i %in% j * 1
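
Spelled out on the vectors from the question, as a quick sketch. The operator %in% binds more tightly than *, so the expression above is read as (i %in% j) * 1; as.integer() is an equivalent alternative that returns integers rather than doubles.

```r
i <- c('a','c','g','h','b','d','f','k','l','e','i')
j <- c('a', 'b', 'c')

i %in% j               # logical vector, TRUE where i[x] occurs anywhere in j
(i %in% j) * 1         # numeric 0/1 version
as.integer(i %in% j)   # 1 1 0 0 1 0 0 0 0 0 0
```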

On Mon, Aug 22, 2011 at 12:51 PM, Martin Batholdy
batho...@googlemail.com wrote:
 Hi,


 I have the following problem:


 I have two vectors:

 i - c('a','c','g','h','b','d','f','k','l','e','i')

 j - c('a', 'b', 'c')



 now I would like to generate a vector with the length of i that
 has zeros where i[x] != any element of j
 and 1 where i[x] == any element of j.

 So for the example above the vector would look like this:

 c(1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0)



 can someone help me on this?





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] Test if data uniformly distributed (newbie)

2011-06-12 Thread Petr Savicky
On Fri, Jun 10, 2011 at 10:15:36PM +0200, Kairavi Bhakta wrote:
 Thanks for your answer. The reason I want the data to be uniform: It's the
 first step in a machine learning project I am working on. If I know the data
 isn't uniformly distributed, then this means there is probably something
 wrong and the following steps will be biased by the non-uniform input data.
 I'm not checking an assumption for another statistical test.
 
 Actually, the data has been normalized because it is supposed to represent a
 probability distribution. That's why it sums to 1. My assumption is that,
 for a vector of 5, the data at that point should look like 0.20 0.20 0.20
 0.20 0.20, but of course there is variation, and I would like to test
 whether the data comes close enough or not.

As others told you, this is not the right format for the KS test. The words
"testing uniformity" can mean different things, and the meaning depends
on which statistical model you assume. If we have a random variable
with values in [0, 1], then testing uniformity means testing to what
extent its distribution is close to the uniform distribution on [0, 1].
Numbers that concentrate around 0.2 will not satisfy this.

If we have a discrete variable with k values, for which we have m
independent observations, and the number of observations of value i
is m_i, then it is possible to test whether the variable has the uniform
distribution on {1, ..., k} using the Chi-squared test. Note that for
this test, the original counts are needed, not their normalized values,
which sum up to 1. For example, if we have 20 observations and
the counts (m_1, ..., m_5) are (4, 3, 5, 2, 6), then this is quite
consistent with the assumption of a uniform distribution. On the
other hand, if we have 200 observations and the counts are
(40, 30, 50, 20, 60), then the null hypothesis of a uniform distribution
may be rejected (the uniform distribution is the default, see argument
p in ?chisq.test):

  x <- c(40, 30, 50, 20, 60)
  chisq.test(x)

  Chi-squared test for given probabilities

  data:  x 
  X-squared = 25, df = 4, p-value = 5.031e-05

It is not clear, whether this is suitable for your application.
If you generate the values in a different way, then another
test may be needed. Can you specify more detail on how the 
numbers are generated?

Petr Savicky.
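
The two sets of counts mentioned above can be compared side by side. This is a sketch using only the counts from the message; the names small, large, res_small and res_large are illustrative. The warning is suppressed for the small sample because its expected counts (4 per cell) fall below 5, so the chi-squared approximation is rough there.

```r
small <- c(4, 3, 5, 2, 6)        # 20 observations
large <- c(40, 30, 50, 20, 60)   # 200 observations, same proportions

# Default null hypothesis: all categories equally likely
res_small <- suppressWarnings(chisq.test(small))
res_large <- chisq.test(large)

res_small$statistic   # X-squared = 2.5
res_small$p.value     # large: no evidence against uniformity
res_large$statistic   # X-squared = 25
res_large$p.value     # very small: uniformity is rejected
```

The proportions are identical in both cases; only the sample size changes the conclusion, which is why the raw counts are needed.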



Re: [R] Test if data uniformly distributed (newbie)

2011-06-10 Thread Greg Snow
Yes, punif is the function to use, however the KS test (and the others) are 
based on an assumption of independence, and if you know that your data points 
sum to 1, then they are not independent (and not uniform if there are more than 
2).  Also note that these tests only rule out distributions (with a given type 
I error rate), but cannot confirm that the data comes from a given distribution 
(just that either they do, or there is not enough power to distinguish between 
the actual and the test distributions).

What is your ultimate question/goal?  Why do you care if the data is uniform or 
not?
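
To illustrate the point about data that sum to 1, here is a small sketch on simulated values (u and w are made-up stand-ins, not the poster's files). min = 0 and max = 1 are the defaults of punif, so they need not be given.

```r
set.seed(123)
u <- runif(300)      # 300 independent draws from Uniform(0, 1)
w <- u / sum(u)      # the same values rescaled so that they sum to 1

ks_u <- ks.test(u, "punif")   # compares u with the Uniform(0, 1) CDF
ks_w <- ks.test(w, "punif")   # same test on the rescaled values

ks_u$p.value   # the test has no reason to reject for genuinely uniform draws
ks_w$p.value   # essentially zero
range(w)       # all values are tiny, nowhere near spread over [0, 1]
```

With 300 values summing to 1, each value is around 1/300, so the test rejects uniformity on [0, 1] regardless of how the values were produced.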

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Kairavi Bhakta
 Sent: Friday, June 10, 2011 11:24 AM
 To: r-help@r-project.org
 Subject: [R] Test if data uniformly distributed (newbie)
 
 Hello,
 
 I have a bunch of files containing 300 data points each with values from 0
 to 1 which also sum to 1 (I don't think the last element is relevant
 though). In addition, each data point is annotated as an "a" or a "b".
 
 I would like to know in which files (if any) the data is uniformly
 distributed.
 
 I used Google and found out that a Kolmogorov-Smirnov or a Chi-square
 goodness-of-fit test could be used. Then I looked up ?kolmogorov and found
 ks.test, but the example there is for the normal distribution and I am not
 sure how to adapt it for the uniform distribution. I did ?runif and read
 about the uniform distribution but it doesn't say what the cumulative
 distribution is. Is it "punif", like "pnorm"? I thought of that because I
 found a message on this list where someone was told to use "pnorm" instead
 of "dnorm". But the help page on the uniform distribution says punif is the
 "distribution function". Are the "cumulative distribution" and the
 "distribution function" the same thing? Having several names for the same
 thing has always confused me very much in statistics.
 
 Also, I am not sure whether I need to specify any parameters for the
 distribution and which. I thought maybe I should specify min=0 and max=1
 but those appear to be the defaults. Do I need to specify q, the vector of
 quantiles?
 
 So is
  ks.test(x, "punif")
 correct or not for what I am attempting to do?
 
 After this I will also need to find out whether the a's and b's are
 distributed randomly in each file. I would be grateful for any pointers
 although I have not researched this issue yet.
 
 Kairavi.
 


Re: [R] Test if data uniformly distributed (newbie)

2011-06-10 Thread Greg Snow
OK, that is not the correct format for the KS test (which is expecting data 
ranging from 0 to 1 with a fairly flat histogram).  You could possibly test 
this with a Chi-squared test.  Can you tell us more about how the numbers you 
are looking at are generated?  The Chi-squared test could be used on counts of 
1-5 and compared to the assumption that each is equally likely, but there still 
is the question of power and how close to uniform is uniform enough.  You would 
need huge samples to find a difference if the true distribution is only 
slightly non uniform.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

From: kairavibha...@googlemail.com [mailto:kairavibha...@googlemail.com] On 
Behalf Of Kairavi Bhakta
Sent: Friday, June 10, 2011 2:16 PM
To: Greg Snow; r-help@r-project.org
Subject: RE: [R] Test if data uniformly distributed (newbie)

Thanks for your answer. The reason I want the data to be uniform: It's the 
first step in a machine learning project I am working on. If I know the data 
isn't uniformly distributed, then this means there is probably something wrong 
and the following steps will be biased by the non-uniform input data. I'm not 
checking an assumption for another statistical test.

Actually, the data has been normalized because it is supposed to represent a 
probability distribution. That's why it sums to 1. My assumption is that, for a 
vector of 5, the data at that point should look like 0.20 0.20 0.20 0.20 0.20, 
but of course there is variation, and I would like to test whether the data 
comes close enough or not.

At the moment I am only testing whether there are more a's than b's in the top 
and bottom portion of each file (with a Wilcoxon test; I have 8 reps of the 
model I am trying to build). But that felt like a very ad hoc solution 
and I figured maybe testing for uniformity would be better, or at least an 
important addition. I've also been looking into testing for the randomness of 
the sequence of a's and b's instead of the Wilcoxon test, although that may or 
may not involve R.

Kairavi.


 Yes, punif is the function to use, however the KS test (and the others) are 
 based on an assumption of independence, and if you know that your data points 
 sum to 1, then they are not independent (and not uniform if there are more 
 than 2).  Also note that these tests only rule out distributions (with a 
 given type I error rate), but cannot confirm that the data comes from a given 
 distribution (just that either they do, or there is not enough power to 
 distinguish between the actual and the test distributions).
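
To make the two points above concrete, here is a minimal sketch with simulated data (the vectors `x` and `p` are hypothetical stand-ins for one file, not the poster's data). It shows the `ks.test` call against `punif`, and what happens once the values are normalized to sum to 1.

```r
set.seed(1)

# Hypothetical stand-in for one file: 300 independent draws on [0, 1]
x <- runif(300)

# min = 0 and max = 1 are the defaults of punif, so no extra arguments needed
res <- ks.test(x, "punif")
res$p.value

# After normalizing, the values sum to 1, are no longer independent, and all
# lie near 1/300, so they are very far from Uniform(0, 1)
p <- x / sum(x)
res_norm <- ks.test(p, "punif")
res_norm$p.value
```

The second p-value is essentially zero, which reflects the normalization rather than anything wrong with the underlying data.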

 What is your ultimate question/goal?  Why do you care if the data is uniform 
 or not?

 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
project.org] On Behalf Of Kairavi Bhakta
Sent: Friday, June 10, 2011 11:24 AM
To: r-help@r-project.org
Subject: [R] Test if data uniformly distributed (newbie)

Hello,

I have a bunch of files containing 300 data points each with values from 0 to 1 
which also sum to 1 (I don't think  the last element is relevant though). In 
addition, each data point is annotated as an a or a b.

I would like to know in which files (if any) the data is uniformly distributed.

I used Google and found out that a Kolmogorov-Smirnov or a Chi-square 
goodness-of-fit test could be used. Then I looked up ?kolmogorov and found 
ks.test, but the example there is for the normal distribution and I am not 
sure how to adapt it for the uniform distribution. I did ?runif and read about 
the uniform distribution but it doesn't say what the cumulative distribution 
is. Is it punif, like pnorm? I thought of that because I found a message on 
this list where someone was told to use pnorm instead of dnorm. But the 
help page on the uniform distribution says punif is the distribution 
function. Are the cumulative distribution and the distribution function 
the same thing? Having several names for the same thing has always confused me 
very much in statistics.

Also, I am not sure whether I need to specify any parameters for the 
distribution and which. I thought maybe I should specify min=0 and max=1 
but those appear to be the defaults. Do I need to specify q, the vector
of quantiles?

So is
ks.test(x, punif)
correct or not for what I am attempting to do?
After this I will also need to find out whether the a's and b's are distributed
randomly in each file. I would be grateful for any pointers, although I have
not researched this issue yet.

Re: [R] Test if data uniformly distributed (newbie)

2011-06-10 Thread Kairavi Bhakta
Thanks for your answer. The reason I want the data to be uniform: It's the
first step in a machine learning project I am working on. If I know the data
isn't uniformly distributed, then this means there is probably something
wrong and the following steps will be biased by the non-uniform input data.
I'm not checking an assumption for another statistical test.

Actually, the data has been normalized because it is supposed to represent a
probability distribution. That's why it sums to 1. My assumption is that,
for a vector of 5, the data at that point should look like 0.20 0.20 0.20
0.20 0.20, but of course there is variation, and I would like to test
whether the data comes close enough or not.

At the moment I am only testing whether there are more a's than b's in the
top and bottom portions of each file (with a Wilcoxon test; I have 8 reps
of the model I am trying to build). But that felt like a very ad hoc
solution, and I figured that testing for uniformity might be better, or at
least an important addition. I've also been looking into testing for the
randomness of the sequence of a's and b's instead of the Wilcoxon test,
although that may or may not involve R.

Kairavi.


 Yes, punif is the function to use, however the KS test (and the others)
are based on an assumption of independence, and if you know that your data
points sum to 1, then they are not independent (and not uniform if there are
more than 2).  Also note that these tests only rule out distributions (with
a given type I error rate), but cannot confirm that the data comes from a
given distribution (just that either they do, or there is not enough power
to distinguish between the actual and the test distributions).

 What is your ultimate question/goal?  Why do you care if the data is
uniform or not?

 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
project.org] On Behalf Of Kairavi Bhakta
Sent: Friday, June 10, 2011 11:24 AM
To: r-help@r-project.org
Subject: [R] Test if data uniformly distributed (newbie)

Hello,

I have a bunch of files containing 300 data points each with values from 0
to 1 which also sum to 1 (I don't think  the last element is relevant
though). In addition, each data point is annotated as an a or a b.

I would like to know in which files (if any) the data is uniformly
distributed.

I used Google and found out that a Kolmogorov-Smirnov or a Chi-square
goodness-of-fit test could be used. Then I looked up ?kolmogorov and found
ks.test, but the example there is for the normal distribution and I am not
sure how to adapt it for the uniform distribution. I did ?runif and read
about the uniform distribution but it doesn't say what the cumulative
distribution is. Is it punif, like pnorm? I thought of that because I
found a message on this list where someone was told to use pnorm instead
of dnorm. But the help page on the uniform distribution says punif is the
distribution function. Are the cumulative distribution and the
distribution function the same thing? Having several names for the same
thing has always confused me very much in statistics.

Also, I am not sure whether I need to specify any parameters for the
distribution and which. I thought maybe I should specify min=0 and max=1
but those appear to be the defaults. Do I need to specify q, the vector
of quantiles?

So is
ks.test(x, punif)
correct or not for what I am attempting to do?
After this I will also need to find out whether the a's and b's are
distributed randomly in each file. I would be grateful for any pointers,
although I have not researched this issue yet.

Kairavi.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Test for list membership

2011-05-30 Thread Henrique Dallazuanna
Try this:

list(c(1,2,3), c(4,5,6)) %in% list(c(1,2,3))

On Mon, May 30, 2011 at 10:36 AM, Marcin Wlodarczak
mwlodarc...@uni-bielefeld.de wrote:

 Hi,

 I need some help with this one: how do I check whether a vector is
 already present in a list of vectors.

 I have seen %in% recommended in a similar case but that obviously does
 not work here.

 c(1,2,3) %in% list(c(1,2,3), c(4,5,6))

 returns

 [1] FALSE FALSE FALSE

 which makes sense since 1, 2 or 3 are not elements of that list. I don't
 really know how to move from there though.

 Best wishes,
 Marcin





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] Test for list membership

2011-05-30 Thread Uwe Ligges



On 30.05.2011 15:36, Marcin Wlodarczak wrote:


Hi,

I need some help with this one: how do I check whether a vector is
already present in a list of vectors.

I have seen %in% recommended in a similar case but that obviously does
not work here.

c(1,2,3) %in% list(c(1,2,3), c(4,5,6))



You said it yourself, almost:

list(c(1,2,3)) %in% list(c(1,2,3), c(4,5,6))

Uwe Ligges



returns

[1] FALSE FALSE FALSE

which makes sense since 1, 2 or 3 are not elements of that list. I don't
really know how to move from there though.

Best wishes,
Marcin



Re: [R] Test for list membership

2011-05-30 Thread Sarah Goslee
You almost solved your own problem with that last statement. Instead
of comparing apples and oranges, you need to compare oranges and
oranges:

> list(c(1,2,3)) %in% list(c(1,2,3), c(4,5,6))
[1] TRUE
> list(c(1,2,3)) %in% list(c(1,2,9), c(4,5,6))
[1] FALSE
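
As a hedged aside that is not part of the replies: `%in%` on lists works by matching the elements after conversion to character, so an explicit `identical()` check is an alternative when exact type and value equality matters. The helper name `in_list` below is hypothetical.

```r
vectors <- list(c(1, 2, 3), c(4, 5, 6))

# Hypothetical helper: TRUE if `v` is identical to some element of `lst`
in_list <- function(v, lst) {
  any(vapply(lst, function(el) identical(el, v), logical(1)))
}

in_list(c(1, 2, 3), vectors)   # TRUE
in_list(c(1, 2, 9), vectors)   # FALSE
in_list(1:3, vectors)          # FALSE: integer vs. double storage
```

The last call shows the design trade-off: `identical()` is strict about storage type, so `isTRUE(all.equal(el, v))` may be a better fit if 1:3 and c(1, 2, 3) should count as the same vector.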

Sarah


On Mon, May 30, 2011 at 9:36 AM, Marcin Wlodarczak
mwlodarc...@uni-bielefeld.de wrote:

 Hi,

 I need some help with this one: how do I check whether a vector is
 already present in a list of vectors.

 I have seen %in% recommended in a similar case but that obviously does
 not work here.

 c(1,2,3) %in% list(c(1,2,3), c(4,5,6))

 returns

 [1] FALSE FALSE FALSE

 which makes sense since 1, 2 or 3 are not elements of that list. I don't
 really know how to move from there though.

 Best wishes,
 Marcin

-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] Test for list membership

2011-05-30 Thread Marcin Włodarczak

On 05/30/2011 04:14 PM, Uwe Ligges wrote:

On 30.05.2011 15:36, Marcin Wlodarczak wrote:


Hi,

I need some help with this one: how do I check whether a vector is
already present in a list of vectors.

I have seen %in% recommended in a similar case but that obviously does
not work here.

c(1,2,3) %in% list(c(1,2,3), c(4,5,6))



You said it yourself, almost:

list(c(1,2,3)) %in% list(c(1,2,3), c(4,5,6))


Brilliant! Thanks to everyone.

Marcin



Re: [R] Test for equivalence

2011-02-14 Thread Greg Snow
Reading the original post, it is fairly clear that the original poster's
question does not match the traditional test of equivalence, but rather is
trying to determine "distinguishable" or "indistinguishable". If the test in my
suggestion is statistically significant (and note I did not suggest only
testing the interaction), then that meets one possible interpretation of
"distinguishable". A non-significant result could mean either equivalence or
low power, the combination of which could be an interpretation of
"indistinguishable".

I phrased my response as a question in hopes that the original poster would 
think through what they really wanted to test and get back to us with further 
details.  It could very well be that my response is very different from what 
they were thinking, but explaining how it does not fit will better help us 
understand the real problem.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: Albyn Jones [mailto:jo...@reed.edu]
 Sent: Sunday, February 13, 2011 9:53 PM
 To: Greg Snow
 Cc: syrvn; r-help@r-project.org
 Subject: Re: [R] Test for equivalence
 
 testing the null hypothesis of no interaction is not the same as a
 test of equivalence for the two differences.  There is a literature on
 tests of equivalence.  First you must develop a definition of
 equivalence, for example the difference is in the interval (-a,a).
 Then, for example,  you test the null hypothesis that the difference
 is in [a,inf) or (-inf,-a] (a TOST, or two one sided tests).  One
 simple procedure: check to see if the 90% CI for the difference
 (difference of the differences or the interaction effect) is contained
 in the interval (-a,a).
 
 albyn
 
 Quoting Greg Snow greg.s...@imail.org:
 
  Does it make sense for you to combine the 2 data sets and do a 2-way
  anova with treatment vs. control as one factor and experiment number
  as the other factor?  Then you could test the interaction and
  treatment number factor to see if they make a difference.
 
  --
  Gregory (Greg) L. Snow Ph.D.
  Statistical Data Center
  Intermountain Healthcare
  greg.s...@imail.org
  801.408.8111
 
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
  project.org] On Behalf Of syrvn
  Sent: Saturday, February 12, 2011 7:30 AM
  To: r-help@r-project.org
  Subject: [R] Test for equivalence
 
 
  Hi!
 
  Is there a way in R to check whether the outcome of two different
  experiments is statistically distinguishable or indistinguishable? More
  precisely, I used the Wilcoxon test to determine the differences between
  controls and treated subjects for two different experiments. Now I would
  like to check whether the two lists of analytes obtained are statistically
  distinguishable or indistinguishable.
 
  I tried to use an equivalence test from the 'equivalence' package in R, but
  it seems that this test is not applicable to my problem. The test in the
  'equivalence' package just determines similarity between two conditions,
  but I need to compare the outcome of two different experiments.
 
  My experiments are constructed as follows:
 
  Exp1:
  8 control samples
  8 treated samples
  - determine significant changes (List A)
 
  Exp2:
  8 control samples
  8 treated samples
  - determine significant changes (List B)
 
 
  Now I would like to check whether List A and List B are distinguishable
  or indistinguishable.
 
  Any advice is very much appreciated!
 
  Best,
  beginner
  --
  View this message in context:
  http://r.789695.n4.nabble.com/Test-for-equivalence-tp3302739p3302739.html
  Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] Test for equivalence

2011-02-14 Thread Albyn Jones
Reading the original post it was clear to me that the poster was looking for
a test of equivalence, but obviously there was room for interpretation!

albyn

On Mon, Feb 14, 2011 at 09:46:13AM -0700, Greg Snow wrote:
 Reading the original post it is fairly clear that the original poster's 
 question does not match with the traditional test of equivalence, but rather 
 is trying to determine distinguishable or indistinguishable.  If the test 
 in my suggestion is statistically significant (and note I did not suggest 
 only testing the interaction) then that meets one possible interpretation of 
 distinguishable, a non-significant result could mean either equivalence or 
 low power, the combination of which could be an interpretation of 
 indistinguishable.
 
 I phrased my response as a question in hopes that the original poster would 
 think through what they really wanted to test and get back to us with further 
 details.  It could very well be that my response is very different from what 
 they were thinking, but explaining how it does not fit will better help us 
 understand the real problem.
 
 -- 
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111
 
 
  -Original Message-
  From: Albyn Jones [mailto:jo...@reed.edu]
  Sent: Sunday, February 13, 2011 9:53 PM
  To: Greg Snow
  Cc: syrvn; r-help@r-project.org
  Subject: Re: [R] Test for equivalence
  
  testing the null hypothesis of no interaction is not the same as a
  test of equivalence for the two differences.  There is a literature on
  tests of equivalence.  First you must develop a definition of
  equivalence, for example the difference is in the interval (-a,a).
  Then, for example,  you test the null hypothesis that the difference
  is in [a,inf) or (-inf,-a] (a TOST, or two one sided tests).  One
  simple procedure: check to see if the 90% CI for the difference
  (difference of the differences or the interaction effect) is contained
  in the interval (-a,a).
  
  albyn
  
  Quoting Greg Snow greg.s...@imail.org:
  
   Does it make sense for you to combine the 2 data sets and do a 2-way
   anova with treatment vs. control as one factor and experiment number
   as the other factor?  Then you could test the interaction and
   treatment number factor to see if they make a difference.
  
   --
   Gregory (Greg) L. Snow Ph.D.
   Statistical Data Center
   Intermountain Healthcare
   greg.s...@imail.org
   801.408.8111
  
  
   -Original Message-
   From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
   project.org] On Behalf Of syrvn
   Sent: Saturday, February 12, 2011 7:30 AM
   To: r-help@r-project.org
   Subject: [R] Test for equivalence
  
  
   Hi!
 
   Is there a way in R to check whether the outcome of two different
   experiments is statistically distinguishable or indistinguishable? More
   precisely, I used the Wilcoxon test to determine the differences between
   controls and treated subjects for two different experiments. Now I would
   like to check whether the two lists of analytes obtained are statistically
   distinguishable or indistinguishable.
 
   I tried to use an equivalence test from the 'equivalence' package in R, but
   it seems that this test is not applicable to my problem. The test in the
   'equivalence' package just determines similarity between two conditions,
   but I need to compare the outcome of two different experiments.
 
   My experiments are constructed as follows:
 
   Exp1:
   8 control samples
   8 treated samples
   - determine significant changes (List A)
 
   Exp2:
   8 control samples
   8 treated samples
   - determine significant changes (List B)
 
 
   Now I would like to check whether List A and List B are distinguishable
   or indistinguishable.
 
   Any advice is very much appreciated!
 
   Best,
   beginner
   --
   View this message in context:
   http://r.789695.n4.nabble.com/Test-for-equivalence-tp3302739p3302739.html
   Sent from the R help mailing list archive at Nabble.com.
  
  
  
 
 

-- 
Albyn Jones
Reed College
jo...@reed.edu


Re: [R] Test for equivalence

2011-02-14 Thread syrvn

Hi!

First of all, thank you all very much for your input. I am sorry, but I
haven't yet had the time to reply to all of your messages. I will give you
a more detailed description of my problem within the next 2 days!

Many thanks again.

Best,
syrvn
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Test-for-equivalence-tp3302739p3305890.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Test for equivalence

2011-02-13 Thread Mike Marchywka

 From: greg.s...@imail.org
 To: ment...@gmx.net; r-help@r-project.org
 Date: Sat, 12 Feb 2011 18:04:34 -0700
 Subject: Re: [R] Test for equivalence

 Does it make sense for you to combine the 2 data sets and do a 2-way anova 
 with treatment vs. control as one factor and experiment number as the other 
 factor? Then you could test the interaction and treatment number factor to 
 see if they make a difference.


I'm not a statistician and don't play one on TV, but I'm not sure the OP has
a specific approach or hypothesis in mind. I guess it could be a question
about an equivalence or non-inferiority trial, or about some notion of
stationary statistics between the two control and treatment groups (do lists
A and B have the same E(x^n), for example). More likely, it sounds like a
question of whether A and B appear to be drawn from the same populations in
terms of the statistics I care about. So, I guess first I'd just re-run
whatever analyses you did with lists A and B, but run control vs. control and
also treatment vs. treatment, pool the results (A+B combined), etc. See what
that returns and do sensitivity tests, deleting points, moving them a bit,
etc. Any anova, cox, aft, etc. probably wouldn't hurt, but it's hard to know
without knowing the real issue.

FWIW, this issue was raised at a recent review of a drug, where part of the
FDA discussion concerned differences in placebo survival between two studies.
Someone also mentioned earlier that the FDA doesn't accept such and such. In
this review of Provenge, which was later linked to threats against some
oncologists (presumably disgruntled DNDN stock speculators, LOL), many
types of information were considered, including post hoc analysis. Now,
"doesn't accept" doesn't mean they will refuse to file a BLA that uses some
analysis, but this panel, anyway, was quite open to considering all the
information they had, probably more so than the public (stockholders, LOL),
who were often just quoting some isolated statistics.

http://www.fda.gov/ohrms/dockets/ac/07/transcripts/2007-4291T1.pdf

(You can find the info presented by DNDN in their briefing and the
responses; this is just a transcript of the meeting.)

This was probably so bizarre to a lot of people because the panel voted
solidly that they thought the drug was effective, but the FDA rejected the
thing largely due to efficacy concerns. Their vote was on a question that
forced a bit of an unfortunate choice, and it was easy to see how the
discrepancy occurred. And in the final analysis the FDA question is: should
the sponsor be allowed to collect money for claiming they have a drug to
treat this condition? I'm not citing this as a case of how statistics should
be done by any means, just that it is an interesting recent case of how it is
done in real life.





 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111


  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
  project.org] On Behalf Of syrvn
  Sent: Saturday, February 12, 2011 7:30 AM
  To: r-help@r-project.org
  Subject: [R] Test for equivalence
 
 
  Hi!
 
  Is there a way in R to check whether the outcome of two different
  experiments is statistically distinguishable or indistinguishable? More
  precisely, I used the Wilcoxon test to determine the differences between
  controls and treated subjects for two different experiments. Now I would
  like to check whether the two lists of analytes obtained are statistically
  distinguishable or indistinguishable.
 
  I tried to use an equivalence test from the 'equivalence' package in R, but
  it seems that this test is not applicable to my problem. The test in the
  'equivalence' package just determines similarity between two conditions,
  but I need to compare the outcome of two different experiments.
 
  My experiments are constructed as follows:
 
  Exp1:
  8 control samples
  8 treated samples
  - determine significant changes (List A)
 
  Exp2:
  8 control samples
  8 treated samples
  - determine significant changes (List B)
 
 
  Now I would like to check whether List A and List B are distinguishable
  or indistinguishable.
 
  Any advice is very much appreciated!
 
  Best,
  beginner
  --
  View this message in context:
  http://r.789695.n4.nabble.com/Test-for-equivalence-tp3302739p3302739.html
  Sent from the R help mailing list archive at Nabble.com.
 

Re: [R] Test for equivalence

2011-02-13 Thread Albyn Jones
Testing the null hypothesis of no interaction is not the same as a
test of equivalence for the two differences.  There is a literature on
tests of equivalence.  First you must develop a definition of
equivalence, for example that the difference is in the interval (-a,a).
Then, for example, you test the null hypothesis that the difference
is in [a,inf) or (-inf,-a] (a TOST, or two one-sided tests).  One
simple procedure: check to see if the 90% CI for the difference
(difference of the differences, or the interaction effect) is contained
in the interval (-a,a).
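
The simple CI procedure above can be sketched in R as follows. This is only an illustration under stated assumptions: the data are simulated for a single analyte, the margin `a` is an arbitrary hypothetical value that would have to come from subject-matter knowledge, and a normal-theory linear model is assumed.

```r
set.seed(42)

# Simulated stand-in: 2 experiments x 2 groups x 8 samples for one analyte
d <- expand.grid(rep = 1:8,
                 treatment = c("control", "treated"),
                 experiment = c("exp1", "exp2"))
d$y <- rnorm(nrow(d), mean = ifelse(d$treatment == "treated", 1, 0))

# The interaction coefficient is the difference of the two treatment effects
fit <- lm(y ~ treatment * experiment, data = d)
int_name <- grep(":", names(coef(fit)), value = TRUE)

# 90% CI for the interaction, matching two one-sided tests at the 5% level
ci <- confint(fit, parm = int_name, level = 0.90)

a <- 1   # hypothetical equivalence margin
equivalent <- ci[1, 1] > -a && ci[1, 2] < a
ci
equivalent
```

If the whole interval lies inside (-a, a), the two treatment effects are declared equivalent at that margin; otherwise equivalence is not shown, which is not the same as showing a difference.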


albyn

Quoting Greg Snow greg.s...@imail.org:

Does it make sense for you to combine the 2 data sets and do a 2-way  
anova with treatment vs. control as one factor and experiment number  
as the other factor?  Then you could test the interaction and  
treatment number factor to see if they make a difference.


--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
project.org] On Behalf Of syrvn
Sent: Saturday, February 12, 2011 7:30 AM
To: r-help@r-project.org
Subject: [R] Test for equivalence


Hi!

Is there a way in R to check whether the outcome of two different
experiments is statistically distinguishable or indistinguishable? More
precisely, I used the Wilcoxon test to determine the differences between
controls and treated subjects for two different experiments. Now I would
like to check whether the two lists of analytes obtained are statistically
distinguishable or indistinguishable.

I tried to use an equivalence test from the 'equivalence' package in R, but
it seems that this test is not applicable to my problem. The test in the
'equivalence' package just determines similarity between two conditions,
but I need to compare the outcome of two different experiments.

My experiments are constructed as follows:

Exp1:
8 control samples
8 treated samples
- determine significant changes (List A)

Exp2:
8 control samples
8 treated samples
- determine significant changes (List B)


Now I would like to check whether List A and List B are distinguishable
or indistinguishable.

Any advice is very much appreciated!

Best,
beginner
--
View this message in context:
http://r.789695.n4.nabble.com/Test-for-equivalence-tp3302739p3302739.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Test for equivalence

2011-02-12 Thread Greg Snow
Does it make sense for you to combine the 2 data sets and do a 2-way anova with 
treatment vs. control as one factor and experiment number as the other factor?  
Then you could test the interaction and treatment number factor to see if they 
make a difference.
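
A minimal sketch of what such a combined analysis might look like, assuming a single analyte and simulated data (the data frame `d` and its column names are hypothetical, not the poster's):

```r
set.seed(7)

# Simulated stand-in: both experiments stacked into one data frame
d <- expand.grid(rep = 1:8,
                 treatment = c("control", "treated"),
                 experiment = c("exp1", "exp2"))
d$y <- rnorm(nrow(d), mean = ifelse(d$treatment == "treated", 1, 0))

# Two-way ANOVA with interaction: the treatment:experiment row asks whether
# the treatment effect differs between the two experiments
fit <- lm(y ~ treatment * experiment, data = d)
tab <- anova(fit)
tab
```

With real data this would be repeated per analyte, and the normality assumption behind the F tests may not hold where a Wilcoxon test was originally preferred.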

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of syrvn
 Sent: Saturday, February 12, 2011 7:30 AM
 To: r-help@r-project.org
 Subject: [R] Test for equivalence
 
 
 Hi!
 
 Is there a way in R to check whether the outcome of two different
 experiments is statistically distinguishable or indistinguishable? More
 precisely, I used the Wilcoxon test to determine the differences between
 controls and treated subjects for two different experiments. Now I would
 like to check whether the two lists of analytes obtained are statistically
 distinguishable or indistinguishable.
 
 I tried to use an equivalence test from the 'equivalence' package in R, but
 it seems that this test is not applicable to my problem. The test in the
 'equivalence' package just determines similarity between two conditions,
 but I need to compare the outcome of two different experiments.
 
 My experiments are constructed as follows:
 
 Exp1:
 8 control samples
 8 treated samples
 - determine significant changes (List A)
 
 Exp2:
 8 control samples
 8 treated samples
 - determine significant changes (List B)
 
 
 Now I would like to check whether List A and List B are distinguishable
 or indistinguishable.
 
 Any advice is very much appreciated!
 
 Best,
 beginner
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Test-for-equivalence-tp3302739p3302739.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] test statistic in anova.glm when quasi family is used

2011-01-31 Thread Ben Bolker
Eiiti Kasuya ekasuscb at kyushu-u.org writes:

 
 When the quasi family (not quasipoisson or quasibinomial) is used in glm,
 what is the appropriate test statistic in anova.glm?
 The help page for anova.glm says “For models with known dispersion (e.g.,
 binomial and Poisson fits) the chi-squared test is most appropriate, and
 for those with dispersion estimated by moments (e.g., gaussian,
 quasibinomial and quasipoisson fits) the F test is most appropriate”. I
 assume that F is appropriate in the case of quasi (not quasipoisson or
 quasibinomial). Is this correct?
 
 Ei Kasuya
 
  Yes.

  References are Venables and Ripley (i.e. MASS) and Crawley's
Statistical Data Analysis book.
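
  For illustration only, here is a minimal sketch of requesting the F test
for a quasi fit. The simulated data frame d, the log link and the variance
function "mu" are assumptions made up for this example; they are not taken
from the original question.

```r
# Simulated data; everything here is hypothetical and only for illustration.
set.seed(1)
d <- data.frame(x = rep(1:10, times = 2))
d$y <- rpois(nrow(d), lambda = exp(0.2 * d$x))

# quasi family: the dispersion is estimated rather than fixed at 1
fit <- glm(y ~ x, family = quasi(link = "log", variance = "mu"), data = d)

# Request the F test explicitly in the analysis of deviance table
tab <- anova(fit, test = "F")
print(tab)
```

  The resulting table should carry an F column and a Pr(>F) column in place
of the chi-squared p-values.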

  Ben Bolker



Re: [R] test

2011-01-15 Thread romzero

Thank you all for the valuable help.

I could finally start writing the first part of my script, but now I
have a new question for you.

Do I need to repeat the write.table portion for all the 15 lines, or can I
use a shortcut?

Example:

file.open <- "C:\\test.txt"
file.save <- "C:\\results.txt"

my.data <- read.table(file.open, header = TRUE)

library(plyr)
write.table(ddply(my.data, .(Thesis, Day), function(x){
Baseline <- unlist(x[1, c("A", "B", "C")])
data.frame(t(apply(x[-1, c("A", "B", "C")], 1, function(z){z - Baseline})))
}), file = file.save, row.names = FALSE)
write.table(ddply(my.data, .(Thesis, Day), function(x){
Baseline <- unlist(x[2, c("A", "B", "C")])
data.frame(t(apply(x[-1:-2, c("A", "B", "C")], 1, function(z){z - Baseline})))
}), file = file.save, append = TRUE, row.names = FALSE, col.names = FALSE)
etc etc

Thanks again for the help.

Best regards,
Roberto.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Comparison-of-numbers-in-a-table-tp3217329p3218524.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] test

2011-01-14 Thread Dennis Murphy
Hi:

If I understood you correctly, the following may work:

# Utility function to compute all pairwise differences of a vector:
# The upper triangle subtracts x_i - x_j for j > i; if you want it the
# other way around, use lower.tri() instead of upper.tri().
# Coercion to vector reads the upper triangle column by column, so the
# differences come out as x_1 - x_2, x_1 - x_3, x_2 - x_3, x_1 - x_4, etc.
subtfun <- function(x) {
  u <- outer(x, x, '-')
  as.vector(u[upper.tri(u)])
  }

# To apply it to each of A, B, C in each Thesis:Day subgroup,
# here's one way with function ddply() in the plyr package:

library(plyr)
v <- ddply(df, .(Thesis, Day), numcolwise(subtfun))
head(v)
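
As a quick sanity check of what the helper returns, here is a small sketch
in base R only. The toy vector x is made up for illustration and is not
taken from the posted table; the function is repeated so the snippet runs on
its own.

```r
# Pairwise differences x_i - x_j for i < j, as in the helper above
subtfun <- function(x) {
  u <- outer(x, x, "-")       # u[i, j] = x[i] - x[j]
  as.vector(u[upper.tri(u)])  # upper triangle, read column by column
}

x <- c(10, 7, 3, 1)           # hypothetical values
d <- subtfun(x)
print(d)                      # 3 7 4 9 6 2

# A vector of length n yields choose(n, 2) differences
print(length(d) == choose(length(x), 2))
```

With 15 rows per Thesis:Day group, each of A, B and C therefore produces
choose(15, 2) = 105 differences per group.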

HTH,
Dennis

On Fri, Jan 14, 2011 at 1:17 AM, romzero romz...@yahoo.it wrote:


 Hi, I have this table:



 Thesis  Day A   B   C
 1   0   83.43   90.15   22.97
 1   0   85.50   94.97   16.62
 1   0   83.36   95.38   20.70
 1   0   84.47   92.16   23.58
 1   0   83.98   95.33   19.39
 1   0   82.86   93.78   24.55
 1   0   83.39   92.67   19.56
 1   0   85.17   95.24   17.95
 1   0   81.62   93.32   28.49
 1   0   82.99   92.85   19.73
 1   0   81.11   95.67   27.20
 1   0   83.39   94.69   16.51
 1   0   79.56   89.87   30.39
 1   0   80.54   93.32   21.76
 1   0   82.11   92.58   22.17
 1   14  85.65   94.00   19.19
 1   14  85.06   92.44   20.44
 1   14  83.97   91.39   24.38
 1   14  84.61   91.97   19.44
 1   14  85.13   90.59   25.30
 1   14  84.81   91.01   19.80
 1   14  84.52   94.06   18.77
 1   14  84.30   94.49   24.90
 1   14  84.74   91.32   20.35
 1   14  84.08   94.12   22.96
 1   14  84.50   94.25   19.95
 1   14  84.02   94.74   20.35
 1   14  85.30   92.82   21.12
 1   14  85.08   91.14   24.16
 1   14  85.21   95.69   18.17
 etc etc etc etc etc
 2   0   83.43   90.15   22.97
 2   0   85.50   94.97   16.62
 2   0   83.36   95.38   20.70
 2   0   84.47   92.16   23.58
 2   0   83.98   95.33   19.39
 2   0   82.86   93.78   24.55
 2   0   83.39   92.67   19.56
 2   0   85.17   95.24   17.95
 2   0   81.62   93.32   28.49
 2   0   82.99   92.85   19.73
 2   0   81.11   95.67   27.20
 2   0   83.39   94.69   16.51
 2   0   79.56   89.87   30.39
 2   0   80.54   93.32   21.76
 2   0   82.11   92.58   22.17
 2   14  84.48   91.23   20.44
 2   14  85.22   93.08   22.54
 2   14  83.89   92.74   25.11
 etc etc etc etc etc

 I need to subtract from every number the other numbers of the same thesis
 and same day.

 Example:
 A(row1) - A (row2) (same for B and C)
 A(row1) - A (row3)
 etc until the last Thesis 1 and Day 0
 A(row2) - A (row3)
 etc etc until the last Thesis 1 and Day 0

 Same for the other theses and days.

 How can I do that?

 Sorry for my English.
 --
 View this message in context:
 http://r.789695.n4.nabble.com/test-tp3217329p3217329.html
 Sent from the R help mailing list archive at Nabble.com.




