Re: [R] LM with summation function

2012-05-23 Thread Robbie Edwards

Thank you Peter, works perfectly.

Funny how simple things are once someone tells you the answer =)

robbie



Re: [R] LM with summation function

2012-05-22 Thread Peter Ehlers


Robbie,

Here's what I *think* you are trying to do:

1.
y is a cubic function of x:

  y = b1*x + b2*x^2 + b3*x^3

2.
s is the cumsum of y:

  s_i = y_1 + ... + y_i

3.
Given a subset of x = 1:n and the corresponding
values of s, estimate the coefficients of the cubic.

If that is the correct understanding, then you should
be able to estimate the coefficients as follows:

Since

  s_i = b1 * (x_1 + ... + x_i)
      + b2 * (x_1^2 + ... + x_i^2)
      + b3 * (x_1^3 + ... + x_i^3),

s is linear in the cumulative sums of x, x^2 and x^3, so we can
regress s on those cumsums. Using your sample data:

  d <- data.frame(x = c(1, 4, 9, 12),
                  s = c(109, 1200, 5325, 8216))

  # expand to the full grid x = 1:12; s is NA where it was not observed
  e <- data.frame(x = 1:12)
  e <- merge(e, d, all.x = TRUE)
  e <- within(e, {
    z3 <- cumsum(x^3)
    z2 <- cumsum(x^2)
    z1 <- cumsum(x)
  })

  # lm() drops the rows with missing s by default
  coef(lm(s ~ 0 + z1 + z2 + z3, data = e))

  #  z1  z2  z3
  # 100  10  -1
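
A quick way to check the result, as a minimal sketch rather than part of the original reply: refit on the four observed totals, then rebuild the per-level values y and the running totals s for every level 1:12. The names grid, z, obs, y_hat and s_hat are made up for this illustration; the expected y values are the ones in the df posted earlier in the thread.

```r
# Sample data from the thread: running totals s, known at only a few levels x.
d <- data.frame(x = c(1, 4, 9, 12),
                s = c(109, 1200, 5325, 8216))

# Cumulative-sum regressors on the full grid, then keep the observed levels.
grid <- 1:12
z <- data.frame(x  = grid,
                z1 = cumsum(grid),
                z2 = cumsum(grid^2),
                z3 = cumsum(grid^3))
obs <- merge(d, z, by = "x")

# No-intercept fit of s on the cumulative sums.
fit <- lm(s ~ 0 + z1 + z2 + z3, data = obs)
b <- unname(coef(fit))

# Rebuild the per-level values and the running totals for every level.
y_hat <- b[1] * grid + b[2] * grid^2 + b[3] * grid^3
s_hat <- cumsum(y_hat)

print(round(b, 6))
print(round(y_hat))
```

With four observations and three coefficients there is only one residual degree of freedom, so an exact fit here says little about how the approach behaves on noisy data.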


Peter Ehlers


Re: [R] LM with summation function

2012-05-22 Thread R. Michael Weylandt

Ahh sorry -- I didn't understand that x was supposed to be an
index, so I was using the row number as an index for the summation. Yes,
my proposal probably won't work without further assumptions [i.e.,
you could assume linear growth between observations, but that will
bias the estimates in some direction (not sure which)].

I'll ponder it some more and get back to you if I come up with anything

Michael


Re: [R] LM with summation function

2012-05-22 Thread Robbie Edwards

I don't think I can.

For the sample data

 d <- data.frame(x=c(1, 4, 9, 12), s=c(109, 1200, 5325, 8216))

When x = 4, s = 1200.  However, that s4 is the sum y1 + y2 + y3 + y4.
Wouldn't I have to know y for x = 2 and x = 3 to get the value of y
for x = 4?

In the previous message, I created two sample data frames.  d is what I'm
trying to use to create df.  I only know what's in d; df is just used to
illustrate what I'm trying to get from d.

robbie






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] LM with summation function

2012-05-22 Thread R. Michael Weylandt

But if I understand your problem correctly, you can get the y values
from the s values. I'm relying on your statement that "s is sum of the
current y and all previous y (s3 = y1 + y2 + y3)." E.g.,

y <- c(1, 4, 6, 9, 3, 7)

s1 = 1
s2 = 4 + s1 = 5
s3 = 6 + s2 = 11

more generally

s <- cumsum(y)

Then if we only see s, we can get back the y vector by doing

c(s[1], diff(s))

which is identical to y.

So for your data, the underlying y must have been c(109, 1091, 4125,
2891) right?
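
A small sketch of the differencing idea, and of what it returns when s is known only at some levels, which is the situation in d. The vector y_full is the y column of the df posted earlier in the thread; the names y_back, x_obs, s_obs and block are made up for this illustration.

```r
# Full series: differencing the running total recovers y exactly.
y <- c(1, 4, 6, 9, 3, 7)
s <- cumsum(y)
y_back <- c(s[1], diff(s))

# Sparse series: s is only known at levels 1, 4, 9 and 12.
x_obs <- c(1, 4, 9, 12)
s_obs <- c(109, 1200, 5325, 8216)
block <- c(s_obs[1], diff(s_obs))

# Each difference is the sum of y over the levels skipped since the previous
# observation, not the y at the observed level itself.
y_full <- c(109, 232, 363, 496, 625, 744, 847, 928, 981, 1000, 979, 912)
print(block)               # differences of the observed totals
print(sum(y_full[2:4]))    # y2 + y3 + y4, to compare with block[2]
print(y_full[4])           # the y at level 4 alone
```

So the differences are only a usable y vector when no levels are missing between observations.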

Or have I completely misunderstood your problem?

Michael

On Tue, May 22, 2012 at 12:25 PM, Robbie Edwards wrote:
> Actually, I can't.  I don't know the y values.  Only the s and only for a
> subset of the data.
>
> Like this.
>
> d <- data.frame(x=c(1, 4, 9, 12), s=c(109, 1200, 5325, 8216))
>
> On Tue, May 22, 2012 at 11:57 AM, R. Michael Weylandt wrote:
>>
>> You can reconstruct the y values by taking first-differences of the s
>> vector, no? Then it sounds like you're good to go
>>
>> Best, Michael

Re: [R] LM with summation function

2012-05-22 Thread Robbie Edwards

Hi all,

Thanks for the replies, but I realize I've done a bad job explaining my
problem.  To help, I've created some sample data to explain the problem.

df <- data.frame(
  x = 1:12,
  y = c(109, 232, 363, 496, 625, 744, 847, 928, 981, 1000, 979, 912),
  s = c(109, 341, 704, 1200, 1825, 2569, 3416, 4344, 5325, 6325, 7304, 8216))

In this data frame, y results from y = x * b1 + x^2 * b2 + x^3 * b3, and s
is the sum of the current y and all previous y (e.g. s3 = y1 + y2 + y3).

I know I can find b1, b2 and b3 using:
lm(y ~ 0 + x + I(x^2) + I(x^3), data=df)

yielding...
Coefficients:
     x  I(x^2)  I(x^3)
   100      10      -1
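
As a sanity check on the sample data, here is a minimal sketch using only the numbers given above: s should equal cumsum(y), and the direct no-intercept fit on y should return the coefficients shown. The names fit_y and b_y are made up for this illustration.

```r
# Sample data as posted in this message.
df <- data.frame(
  x = 1:12,
  y = c(109, 232, 363, 496, 625, 744, 847, 928, 981, 1000, 979, 912),
  s = c(109, 341, 704, 1200, 1825, 2569, 3416, 4344, 5325, 6325, 7304, 8216))

# s is the running total of y.
stopifnot(all(cumsum(df$y) == df$s))

# Direct fit when y is known at every level (no intercept, raw powers of x).
fit_y <- lm(y ~ 0 + x + I(x^2) + I(x^3), data = df)
b_y <- unname(coef(fit_y))
print(round(b_y, 6))
```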

However, I need to find b1, b2 and b3 using the s column, because I
don't actually know the values of y in the actual data set, and I only
have s for a few of the x values.  Imagine this data is being used as a
reward schedule for something like a loyalty points program: y is the
number of points needed for each level, while s is the total number of
points needed to reach that level.  In the real problem, my data looks
more like this:

d <- data.frame(x=c(1, 4, 9, 12), s=c(109, 1200, 5325, 8216))

where I need to use a few sample points to help define the parameters of
the curve.

thanks again and hopefully this makes the problem a bit clearer.

robbie


Re: [R] LM with summation function

2012-05-18 Thread David Winsemius



On May 18, 2012, at 1:44 PM, Robbie Edwards wrote:

> Hi all,
>
> I'm trying to model some data where the y is defined by
>
> y = summation[1 to 50] B1 * x + B2 * x^2 + B3 * x^3
>
> Hopefully that reads clearly for email.

cumsum( rowSums( cbind(B1 * x,  B2 * x^2, B3 * x^3)))

> Anyway, if it wasn't for the summation, I know I would do it like this
>
> lm(y ~ x + x2 + x3)
>
> Where x2 and x3 are x^2 and x^3.
>
> However, since each value of x is related to the previous values of x, I
> don't know how to do this.  Any help is greatly appreciated.

David Winsemius, MD
West Hartford, CT
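
A sketch of the expression above in use. The coefficient values and the range of x are assumptions for illustration only; they match the sample data Robbie posted later in the thread, and nothing in this message fixes them.

```r
# Assumed coefficients and levels (taken from the sample data posted later).
B1 <- 100; B2 <- 10; B3 <- -1
x <- 1:12

# Each row of the cbind() holds the three terms for one x; rowSums() gives y,
# and cumsum() turns y into the running total.
s <- cumsum(rowSums(cbind(B1 * x, B2 * x^2, B3 * x^3)))
print(s)
```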


Re: [R] LM with summation function

2012-05-18 Thread Bert Gunter

Following up on Rolf's post:

1) cumulative summation (cumsum) maybe?
2) In fact, you should probably **not** fit the non-summation version
as you have stated. See ?poly.
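
To illustrate point 2, here is a sketch that is not necessarily what was meant above: poly(x, 3) fits the same cubic using orthogonal polynomial terms, which avoids the strong correlation between x, x^2 and x^3, while poly(x, 3, raw = TRUE) keeps the raw powers. The x and y values are the sample data posted later in the thread, and both fits include an intercept, unlike the no-intercept model used elsewhere in the thread.

```r
x <- 1:12
y <- c(109, 232, 363, 496, 625, 744, 847, 928, 981, 1000, 979, 912)

# Same cubic, two parameterisations.
fit_raw  <- lm(y ~ poly(x, 3, raw = TRUE))
fit_orth <- lm(y ~ poly(x, 3))

# The fitted curves agree even though the coefficients differ.
max_gap <- max(abs(fitted(fit_raw) - fitted(fit_orth)))

# Raw powers are highly correlated; the orthogonal columns are not.
raw_cor  <- cor(x, x^2)
P <- poly(x, 3)
orth_cor <- cor(P[, 1], P[, 2])

print(c(max_gap = max_gap, raw_cor = raw_cor, orth_cor = orth_cor))
```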

I would guess that context is important here. Based on my
interpretation of the rather strange nature of your request, I
suspect that you shouldn't be trying to do what you're doing **at
all**; but that's just a guess, of course.

-- Bert




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


Re: [R] LM with summation function

2012-05-18 Thread Rolf Turner


On 19/05/12 05:44, Robbie Edwards wrote:
> Hi all,
>
> I'm trying to model some data where the y is defined by
>
> y = summation[1 to 50] B1 * x + B2 * x^2 + B3 * x^3
>
> Hopefully that reads clearly for email.
>
> Anyway, if it wasn't for the summation, I know I would do it like this
>
> lm(y ~ x + x2 + x3)
>
> Where x2 and x3 are x^2 and x^3.
>
> However, since each value of x is related to the previous values of x, I
> don't know how to do this.  Any help is greatly appreciated.


If your mail says what it seems to say, then your question makes
no sense.  You are in effect trying to fit a linear model to a single
point:

    y = B1*s1 + B2*s2 + B3*s3

where s1 = sum(x), s2 = sum(x^2) and s3 = sum(x^3),

and you have only a single value of each of s1, s2, s3.

If you have replicate values of s1, s2, and s3 (i.e. replicate
vectors (x1, ..., x50)) --- and of course a corresponding y value
for each replicate --- then just form s1, s2, and s3 as vectors
whose entries correspond to the replicates and then fit

    lm(y ~ s1 + s2 + s3)
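
A sketch of the replicate setup described above, on simulated data. Everything here is hypothetical: the number of replicates, the uniform x values and the coefficients B are invented purely to show the mechanics, and y is generated without noise.

```r
set.seed(42)

# 20 hypothetical replicates, each a vector (x1, ..., x50) stored as one row.
n_rep <- 20
X <- matrix(runif(n_rep * 50), nrow = n_rep)

# One value of s1, s2, s3 per replicate.
s1 <- rowSums(X)
s2 <- rowSums(X^2)
s3 <- rowSums(X^3)

# Hypothetical coefficients and the corresponding noise-free y per replicate.
B <- c(2, -1, 0.5)
y <- B[1] * s1 + B[2] * s2 + B[3] * s3

# Fit as in the formula above (this includes an intercept).
fit <- lm(y ~ s1 + s2 + s3)
b_hat <- unname(coef(fit))
print(round(b_hat, 6))
```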

If I have misunderstood what you are asking then please provide
a self-contained reproducible example as the posting guide requests.

cheers,

Rolf Turner


[R] LM with summation function

2012-05-18 Thread Robbie Edwards

Hi all,

I'm trying to model some data where the y is defined by

y = summation[1 to 50] B1 * x + B2 * x^2 + B3 * x^3

Hopefully that reads clearly for email.

Anyway, if it wasn't for the summation, I know I would do it like this

lm(y ~ x + x2 + x3)

Where x2 and x3 are x^2 and x^3.

However, since each value of x is related to the previous values of x, I
don't know how to do this.  Any help is greatly appreciated.


robbie

