Re: [R] Fwd: Rpart help

2017-05-23 Thread Bert Gunter
1. Forget Excel. Erase it from your memory. Banish its paradigms from
your practices. Failing to do so will only bring misery as you explore
R. R is a rational programming language primarily for data analysis,
statistics, and graphics. Excel is, ummm, not.

2. Have you read the rpart documents and vignettes? That should be
your first port of call for questions about how it works.


Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, May 23, 2017 at 6:45 PM, kristen wissmar
 wrote:
> Hi R users!
>
> I'm new to R, so I'm starting with a basic exercise in rpart.
>
> I'm predicting if a user will churn based on past order history.  I've
> calculated the probabilities in Excel, and if a user is a single order
> customer (1), then their probability of churn is 90%; if there are multiple
> orders (0), then the probability of churning is 70%. In the R model, the
> probabilities look like 100% and 53%. In Excel I used the count of
> shopper_key to calculate probabilities. So I'm wondering if R needs a
> shopper_key to count?
>
> It would be helpful if someone could suggest where I'm going wrong.
>
> Thank you!
>
>
> Code -
> m1 <- rpart( churn ~ single_order , data = data2, method="anova" )
>
> Output-
> n= 22041
>
> node), split, n, deviance, yval
>   * denotes terminal node
>
> 1) root 22041 3229.265 0.8216959
>   2) single_order< 0.5 8407 2092.852 0.5325324 *
>   3) single_order>=0.5 13634 0.000 1.000 *
>
>
> shopper_key  churn  single_order
>           1      1             0
>           2      1             1
>           3      0             0
>           4      1             0
>           5      1             1
>           6      1             1
>           7      1             0
>           8      1             1
>           9      0             1
>          10      1             1
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



[R] Fwd: Rpart help

2017-05-23 Thread kristen wissmar
Hi R users!

I'm new to R, so I'm starting with a basic exercise in rpart.

I'm predicting if a user will churn based on past order history.  I've
calculated the probabilities in Excel, and if a user is a single order
customer (1), then their probability of churn is 90%; if there are multiple
orders (0), then the probability of churning is 70%. In the R model, the
probabilities look like 100% and 53%. In Excel I used the count of
shopper_key to calculate probabilities. So I'm wondering if R needs a
shopper_key to count?

It would be helpful if someone could suggest where I'm going wrong.

Thank you!


Code -
m1 <- rpart( churn ~ single_order , data = data2, method="anova" )

Output-
n= 22041

node), split, n, deviance, yval
  * denotes terminal node

1) root 22041 3229.265 0.8216959
  2) single_order< 0.5 8407 2092.852 0.5325324 *
  3) single_order>=0.5 13634 0.000 1.000 *
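
A note on reading this output, offered as a sketch rather than a definitive diagnosis: with method="anova", yval is the mean of churn among the rows in a node, and deviance is the sum of squared differences from that mean. For a 0/1 response with mean p over n rows, that sum of squares works out to n * p * (1 - p). The check below uses only the numbers printed above. The vector names (root, multi, single) are made up for illustration, and node 3 is taken as n = 13634, which is 22041 - 8407.

```r
# Numbers copied from the printed tree. Node 3 is taken as
# n = 13634 (22041 - 8407), deviance 0, yval 1.
n    <- c(root = 22041, multi = 8407, single = 13634)
yval <- c(root = 0.8216959, multi = 0.5325324, single = 1)

# For a 0/1 response with mean p over n rows,
# the sum of squares about the mean is n * p * (1 - p).
deviance <- n * yval * (1 - yval)
print(round(deviance, 1))

# The root mean should equal the size-weighted average of the child means.
children <- c("multi", "single")
weighted_root <- sum(n[children] * yval[children]) / n[["root"]]
print(weighted_root)
```

If these agree with the printed deviances, the 53% and 100% are simply the observed churn rates in data2 for each group. No shopper_key is needed, because rpart treats each row as one observation. A gap between these and the Excel figures would then point to the data or the Excel calculation rather than to rpart.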


shopper_key  churn  single_order
          1      1             0
          2      1             1
          3      0             0
          4      1             0
          5      1             1
          6      1             1
          7      1             0
          8      1             1
          9      0             1
         10      1             1
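
One way to compare R with the Excel calculation without involving rpart is to compute the churn rate per group directly. The sketch below uses only the ten sample rows shown above. The names sample_rows, churn_rate, and group_size are made up for illustration; the real data frame is data2.

```r
# The ten sample rows from the post, typed in by hand.
sample_rows <- data.frame(
  shopper_key  = 1:10,
  churn        = c(1, 1, 0, 1, 1, 1, 1, 1, 0, 1),
  single_order = c(0, 1, 0, 0, 1, 1, 0, 1, 1, 1)
)

# Split churn by single_order, then take the mean and the row count per group.
by_group   <- split(sample_rows$churn, sample_rows$single_order)
churn_rate <- sapply(by_group, mean)    # share of churners in each group
group_size <- sapply(by_group, length)  # number of rows in each group

print(churn_rate)
print(group_size)
```

Running the same two lines on data2 should reproduce the yval column of the tree. It may also be worth checking that data2 matches the sample: shopper 9 has single_order = 1 and churn = 0, which would not fit a single_order>=0.5 node whose deviance is 0.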

[[alternative HTML version deleted]]
