Re: [R] Creating Dummy Var in R for regression?

2016-08-08 Thread Fredrik Karlsson
Hi,

Please also have a look at the 'cut' function. It is very handy for these
types of situations.

Best,

Fredrik
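
As a minimal sketch of how 'cut' could handle the grouping discussed in this thread: the vector name score and the break points 3 and 7 are assumptions, chosen to match the low/medium/high thresholds used elsewhere in the thread.

```r
# Hypothetical 10-point scores; in practice this would be a data frame column.
score <- c(1, 3, 4, 7, 8, 10)

# Intervals are right-closed by default: (0,3], (3,7], (7,10].
score_t <- cut(score,
               breaks = c(0, 3, 7, 10),
               labels = c("low", "medium", "high"))

table(score_t)
```

Passing ordered_result = TRUE to cut should return an ordered factor instead, if the ordering of the groups matters for the model.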


Re: [R] Creating Dummy Var in R for regression?

2016-08-07 Thread Shivi Bhatia
Thank you Jeremiah and all others for the assistance. This really helped.


Re: [R] Creating Dummy Var in R for regression?

2016-08-05 Thread jeremiah rounds
Something like:

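# Simulate 100 scores on a 10-point scale
d <- data.frame(score = sample(1:10, 100, replace = TRUE))
# Collapse the 10 points into three groups: 1-3 low, 4-7 medium, 8-10 high
d$score_t <- "low"
d$score_t[d$score > 3] <- "medium"
d$score_t[d$score > 7] <- "high"
# ordered = TRUE gives polynomial contrasts; set ordered = FALSE for dummy variables
d$score_t <- factor(d$score_t, levels = c("low", "medium", "high"),
                    ordered = TRUE)
# Build the design matrix that a regression would use for this factor
X <- model.matrix(~ score_t, data = d)
X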
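
To illustrate the comment about ordered = FALSE, here is a small sketch comparing the columns that model.matrix builds in each case. It assumes R's default contrast settings (contr.treatment for unordered factors, contr.poly for ordered ones); the object names are made up for the example.

```r
levs <- c("low", "medium", "high")
x <- c("low", "medium", "high", "medium", "low", "high")

f_unordered <- factor(x, levels = levs)
f_ordered   <- factor(x, levels = levs, ordered = TRUE)

# Unordered factor: 0/1 dummy columns, with "low" as the reference level.
colnames(model.matrix(~ f_unordered))

# Ordered factor: linear (.L) and quadratic (.Q) polynomial contrast columns.
colnames(model.matrix(~ f_ordered))
```

If the default contrasts have been changed through options(contrasts = ...), the column names and coding will differ.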



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating Dummy Var in R for regression?

2016-08-05 Thread Shivi Bhatia
Thank you all for the assistance. This really helps.

Hi Bert: While searching Nabble I learned that with factor variables in R
there is no need to create dummy variables. However, please consider this
situation:
I am in the process of building a logistic regression model on NPS data.
The outcome variable is CE, i.e. customer experience, which has 3 rating
levels, so ordinal logistic regression will be used. However, most of my
variables are categorical. For instance, one of the variables is agent
knowledge, which is measured on a 10-point scale.

Agent knowledge should itself be a 3-level rating (high, medium, low), so I
need to group these 10 values into 3 groups; then, as you suggested, I can
enter the grouped variable directly in the model without creating n-1
dummy variables.

I have worked extensively with SAS, hence I found this a bit confusing.

Thanks for the help.
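
A rough sketch of the workflow described above, under several assumptions: the data are simulated, the column names CE and knowledge and the level labels are invented for illustration, and polr from the MASS package is used as one possible way to fit an ordinal logistic regression (the thread does not name a fitting function).

```r
library(MASS)  # provides polr() for proportional-odds logistic regression

set.seed(1)
n <- 300

# Hypothetical 10-point agent knowledge score, grouped into three levels.
knowledge_score <- sample(1:10, n, replace = TRUE)
knowledge <- cut(knowledge_score, breaks = c(0, 3, 7, 10),
                 labels = c("low", "medium", "high"))

# Simulated 3-level ordered outcome that depends on the score.
latent <- 0.3 * knowledge_score + rlogis(n)
CE <- cut(latent, breaks = c(-Inf, 1, 2.5, Inf),
          labels = c("poor", "average", "good"), ordered_result = TRUE)

dat <- data.frame(CE = CE, knowledge = knowledge)

# The factor is entered directly; R builds the n-1 dummy columns itself.
fit <- polr(CE ~ knowledge, data = dat, Hess = TRUE)
coef(fit)
```

The fitted model reports one coefficient each for the medium and high groups, with low acting as the reference level.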



Re: [R] Creating Dummy Var in R for regression?

2016-08-05 Thread Bert Gunter
Just commenting on the email subject, not the content (which you have
already been helped with): there is no need to *ever* create a dummy
variable for regression in R if what you mean by this is what is
conventionally meant. R will create the model matrix with appropriate
"dummy variables" for factors as needed. See ?contrasts and ?C for
relevant details and/or consult an appropriate R tutorial.

Of course, if this is not what you meant, then ignore this.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
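
A small sketch of the point made above, using made-up data (the names d, y and g are hypothetical): the factor goes into the formula as is, and lm builds the dummy columns through the factor's contrasts.

```r
# Toy data: a numeric response and a three-level factor.
d <- data.frame(
  y = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8),
  g = factor(c("a", "a", "b", "b", "c", "c"))
)

# No hand-made dummy variables: the factor is used directly in the formula.
fit <- lm(y ~ g, data = d)
names(coef(fit))

# The coding R applies to the factor (treatment contrasts by default).
contrasts(d$g)
```

With the default treatment contrasts, the first level ("a") is the reference and the other two levels each get a 0/1 column.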




Re: [R] Creating Dummy Var in R for regression?

2016-08-05 Thread ruipbarradas
Hello,

Your ifelse will never work because
reasons$salutation== "Mr" & reasons$salutation=="Father" is never TRUE
(no value can equal both strings at once),
and neither is reasons$salutation=="Mrs" & reasons$salutation=="Miss".
Use | (or) instead of & (and).

Hope this helps,

Rui Barradas
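
A sketch of the suggested fix on a made-up data frame. The sample salutations and the "Other" fallback label are assumptions for illustration (the original code used 1 as the fallback); the %in% variant is an alternative spelling of the same test, not something proposed in the thread.

```r
# Hypothetical data standing in for the poster's 'reasons' data frame.
reasons <- data.frame(
  salutation = c("Mr", "Father", "Mrs", "Miss", "Dr", "MS"),
  stringsAsFactors = FALSE
)

# Same structure as the original attempt, with | (or) in place of & (and).
reasons$gender <- ifelse(
  reasons$salutation == "Mr" | reasons$salutation == "Father", "Male",
  ifelse(reasons$salutation == "Mrs" | reasons$salutation == "Miss",
         "Female", "Other")
)

# Equivalent test written with %in%, which scales better to longer lists.
reasons$gender2 <- ifelse(
  reasons$salutation %in% c("Mr", "Father"), "Male",
  ifelse(reasons$salutation %in% c("Mrs", "Miss"), "Female", "Other")
)

reasons
```

Because the result is assigned to a new column, the salutation column is left unchanged.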

 

Quoting Shivi Bhatia:

> Dear Team,
>
> I need help with the below code in R:
>
> gender_rec<- c('Dr','Father','Mr'=1, 'Miss','MS','Mrs'=2, 3)
>
> reasons$salutation<- gender_rec[reasons$salutation]
>
> This code gives me the correct output, but it overwrites the
> reasons$salutation variable. I need to create a new variable, gender, to
> capture gender details and leave salutation as it is.
>
> I tried the syntax below, but it converts all values to 1.
>
> reasons$gender<- ifelse(reasons$salutation== "Mr" & reasons$salutation==
> "Father","Male", ifelse(reasons$salutation=="Mrs" & reasons$salutation==
> "Miss","Female",1))
>
> Please suggest.