Hi
Can someone help me with this self-education exercise?
(This post is formatted for a fixed-pitch font.)
Assume some 'ordinary' data: a set of (x,y) data points, ordered on x,
with x and y real. Assume we draw a boundary x = k somewhere in the
middle of the x domain to divide it into 2 adjacent subdomains. I want
to fit 2 regression lines of the form y = ax + b, one to each of the
subdomains, with a continuity constraint.
In other words I want to satisfy these criteria:
(C1) the total sum of squared deviates over both intervals is minimised
(C2) the two fitted lines intersect the boundary x = k at the same
point.
The sums of squared deviates on the two intervals independently would
be:
   s1 = S1(y - a1*x - b1)^2    [S1 = sum over n1 points in subdomain #1]
   s2 = S2(y - a2*x - b2)^2    [S2 = sum over n2 points in subdomain #2]
requiring the 4 parameters a1, b1, a2, b2. But these fits are not
independent because of criterion (C2), which says that
a1*k + b1 = a2*k + b2. This constraint means that we can eliminate one
of the parameters, e.g. b2 = a1*k + b1 - a2*k.
The total sum to minimise is:
s = s1 + s2
= S1[ (y - a1*x - b1)^2 ] + S2[ (y - a2*x - b2)^2 ]
= S1[ (y - a1*x - b1)^2 ] + S2[ (y - a2*x - (a1*k + b1 - a2*k))^2 ]
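The objective with b2 eliminated can be evaluated directly, which is handy for checking any candidate solution: nudging a1, b1 or a2 away from a true minimiser should never reduce s. A minimal Python sketch, assuming points with x <= k belong to subdomain #1 (the post does not say which side boundary points fall on); the name `total_sse` is hypothetical.

```python
def total_sse(points, k, a1, b1, a2):
    """Total sum of squared deviates s = s1 + s2, with b2 eliminated via (C2)."""
    b2 = a1 * k + b1 - a2 * k  # continuity at x = k
    s = 0.0
    for x, y in points:
        if x <= k:  # assumption: boundary points go to subdomain #1
            s += (y - a1 * x - b1) ** 2
        else:
            s += (y - a2 * x - b2) ** 2
    return s


# Noise-free data on two lines that meet at x = 5: y = 2x + 1, then y = -x + 16.
pts = [(x, 2 * x + 1) if x <= 5 else (x, -x + 16) for x in range(11)]
print(total_sse(pts, 5, 2, 1, -1))  # the generating parameters
print(total_sse(pts, 5, 2, 1, 0))   # a deliberately wrong right-hand slope
```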
Then the solution of these 3 simultaneous partial-derivative equations:
ds/d(a1) = ds/d(b1) = ds/d(a2) = 0 ..... eq.1
is the solution to the problem, for a given k.
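Written out before any sums are collected, eq.1 takes the following form. This is a sketch of the intermediate step only; the shorthand r for the residual on subdomain #2 is introduced here and is not part of the original notation.

```latex
% r is the residual on subdomain #2 after substituting b2 = a1*k + b1 - a2*k
\[
  r = y - a_2 (x - k) - a_1 k - b_1
\]
\begin{align*}
  \frac{\partial s}{\partial a_1}
    &= -2 \sum_{1} x \,(y - a_1 x - b_1) \;-\; 2k \sum_{2} r = 0 \\
  \frac{\partial s}{\partial b_1}
    &= -2 \sum_{1} (y - a_1 x - b_1) \;-\; 2 \sum_{2} r = 0 \\
  \frac{\partial s}{\partial a_2}
    &= -2 \sum_{2} (x - k)\, r = 0
\end{align*}
```

Here the sums with subscript 1 and 2 correspond to S1 and S2 above.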
Expanding eq.1, I get:
        | S1(x^2) + k^2.n2    S1(x) + k.n2   k.S2(x) - k^2.n2            |
    A = | S1(x) + k.n2        n1 + n2        S2(x) - k.n2                |
        | k.S2(x) - k^2.n2    S2(x) - k.n2   S2(x^2) - 2k.S2(x) + k^2.n2 |

        | S1(xy) + k.S2(y) |
    C = | S1(y) + S2(y)    |
        | S2(xy) - k.S2(y) |

and the solution is B = A \ C, where B = column (a1, b1, a2).
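A minimal Python sketch that assembles A and C entry by entry as written above and solves the 3x3 system. Run on noise-free data lying on two lines that meet at x = k, it should recover those lines up to rounding, which gives a quick way to tell an algebra error from a coding error. Assumptions not in the post: points with x <= k are assigned to subdomain #1, and the helper `solve3` (a hypothetical name, plain Gaussian elimination) stands in for the `A \ C` solve.

```python
def solve3(A, C):
    """Solve A*B = C by Gaussian elimination with partial pivoting."""
    n = len(C)
    # Augmented matrix [A | C], copied so the inputs are not modified.
    M = [[float(v) for v in A[i]] + [float(C[i])] for i in range(n)]
    for col in range(n):
        # Bring the row with the largest pivot candidate into place.
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    B = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * B[j] for j in range(i + 1, n))
        B[i] = (M[i][n] - tail) / M[i][i]
    return B


def fit_two_lines(points, k):
    """Continuous two-line least-squares fit with the break fixed at x = k.

    Returns (a1, b1, a2, b2). Assumption: x <= k goes to subdomain #1.
    """
    left = [(x, y) for x, y in points if x <= k]
    right = [(x, y) for x, y in points if x > k]
    n1, n2 = len(left), len(right)
    S1x = sum(x for x, _ in left)
    S1y = sum(y for _, y in left)
    S1xx = sum(x * x for x, _ in left)
    S1xy = sum(x * y for x, y in left)
    S2x = sum(x for x, _ in right)
    S2y = sum(y for _, y in right)
    S2xx = sum(x * x for x, _ in right)
    S2xy = sum(x * y for x, y in right)
    A = [
        [S1xx + k * k * n2, S1x + k * n2, k * S2x - k * k * n2],
        [S1x + k * n2, n1 + n2, S2x - k * n2],
        [k * S2x - k * k * n2, S2x - k * n2, S2xx - 2 * k * S2x + k * k * n2],
    ]
    C = [S1xy + k * S2y, S1y + S2y, S2xy - k * S2y]
    a1, b1, a2 = solve3(A, C)
    b2 = a1 * k + b1 - a2 * k  # recover b2 from the continuity constraint
    return a1, b1, a2, b2


# Noise-free check: y = 2x + 1 for x <= 5 and y = -x + 16 for x > 5.
pts = [(x, 2 * x + 1) if x <= 5 else (x, -x + 16) for x in range(11)]
print(fit_two_lines(pts, 5))
```

If a check like this passes while the plotted fit still looks wrong, the place to look is probably the code around the solve (how points are split at k, how b2 is recovered, how the lines are plotted) rather than the matrix itself.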
When I program it and plot the results, the fit is clearly not correct.
Is my expansion of eq.1 wrong? Am I introducing the constraint in
the wrong way?
Thanks for any suggestions.
Ross