Hi all,
Time ago I did the exercise of repeating the EM algorithm in Schaffer's
book. I have compared my results with Norm program and I normally get
complete agreement. However, this discussion reminded me on particular
difficult example. On page 179, there is data bout change in heart rate
after marijuana use. The EM algorithm converges slowly using Schaffer's
program. Using my program I get exactly (apart from rounding) the same
results until iteration 161. Then, the loglikelihood decreases and starts
to behave strangely.
I have always thought that it was a problem of numerical precision and that
It would only happen with with somewhat perverse datasets. In this case,
subjects 4 and 5 (with missing values) are very influential.
(OUTPUT FROM MY PROGRAM)
Iteration Difference Loglikelihood
0 6.455187 -15.148608
1 1.820353 -13.261489
2 1.522566 -11.477738
3 1.513494 -9.418641
4 1.524991 -6.939941
5 1.525797 -4.058120
6 1.480551 -0.972671
7 1.301270 1.921887
8 1.024217 4.146841
9 0.752646 5.430844
10 0.546826 6.006434
11 0.408199 6.294803
12 0.310510 6.507559
13 0.242678 6.702320
14 0.195909 6.892047
15 0.163753 7.080013
16 0.141524 7.267238
17 0.125995 7.454095
18 0.114878 7.640735
19 0.106782 7.827225
20 0.100775 8.013599
21 0.095984 8.199874
22 0.091992 8.386062
23 0.088545 8.572171
24 0.085474 8.758205
25 0.082670 8.944170
26 0.080063 9.130068
27 0.077605 9.315904
28 0.075266 9.501680
29 0.073023 9.687398
30 0.070864 9.873061
31 0.068778 10.058670
32 0.066760 10.244229
33 0.064804 10.429739
34 0.062906 10.615202
35 0.061064 10.800620
36 0.059276 10.985993
37 0.057539 11.171325
38 0.055851 11.356616
39 0.054212 11.541869
40 0.052620 11.727083
41 0.051074 11.912262
42 0.049571 12.097406
43 0.048112 12.282516
44 0.046694 12.467594
45 0.045317 12.652640
46 0.043980 12.837656
47 0.042681 13.022644
48 0.041420 13.207603
49 0.040195 13.392536
50 0.039005 13.577442
51 0.037850 13.762323
52 0.036728 13.947180
53 0.035639 14.132014
54 0.034582 14.316825
55 0.033555 14.501614
56 0.032559 14.686382
57 0.031591 14.871130
58 0.030652 15.055858
59 0.029740 15.240568
60 0.028855 15.425259
61 0.027995 15.609932
62 0.027161 15.794588
63 0.026352 15.979228
64 0.025566 16.163852
65 0.024803 16.348461
66 0.024063 16.533054
67 0.023345 16.717634
68 0.022647 16.902199
69 0.021971 17.086751
70 0.021314 17.271290
71 0.020677 17.455817
72 0.020058 17.640331
73 0.019458 17.824834
74 0.018876 18.009325
75 0.018311 18.193805
76 0.017763 18.378275
77 0.017230 18.562735
78 0.016714 18.747184
79 0.016213 18.931624
80 0.015727 19.116055
81 0.015256 19.300476
82 0.014798 19.484889
83 0.014354 19.669294
84 0.013923 19.853691
85 0.013506 20.038079
86 0.013100 20.222460
87 0.012707 20.406834
88 0.012325 20.591200
89 0.011955 20.775560
90 0.011596 20.959913
91 0.011247 21.144259
92 0.010909 21.328599
93 0.010581 21.512933
94 0.010263 21.697261
95 0.009955 21.881584
96 0.009655 22.065901
97 0.009365 22.250213
98 0.009083 22.434519
99 0.008810 22.618821
100 0.008545 22.803117
101 0.008288 22.987409
102 0.008038 23.171697
103 0.007796 23.355980
104 0.007561 23.540259
105 0.007334 23.724533
106 0.007113 23.908804
107 0.006899 24.093071
108 0.006691 24.277334
109 0.006489 24.461593
110 0.006294 24.645849
111 0.006104 24.830101
112 0.005920 25.014351
113 0.005742 25.198597
114 0.005569 25.382839
115 0.005401 25.567079
116 0.005238 25.751316
117 0.005080 25.935550
118 0.004927 26.119782
119 0.004778 26.304011
120 0.004634 26.488237
121 0.004495 26.672460
122 0.004359 26.856682
123 0.004228 27.040901
124 0.004100 27.225117
125 0.003976 27.409332
126 0.003857 27.593544
127 0.003740 27.777754
128 0.003627 27.961963
129 0.003518 28.146169
130 0.003412 28.330374
131 0.003309 28.514576
132 0.003209 28.698777
133 0.003112 28.882976
134 0.003018 29.067174
135 0.002927 29.251370
136 0.002839 29.435564
137 0.002753 29.619757
138 0.002670 29.803949
139 0.002589 29.988139
140 0.002511 30.172327
141 0.002435 30.356514
142 0.002362 30.540700
143 0.002291 30.724885
144 0.002221 30.909069
145 0.002154 31.093251
146 0.002089 31.277432
147 0.002026 31.461612
148 0.001965 31.645791
149 0.001906 31.829969
150 0.001848 32.014146
151 0.001792 32.198322
152 0.001738 32.382498
153 0.001686 32.566672
154 0.001635 32.750845
155 0.001585 32.935017
156 0.001538 33.119189
157 0.001491 33.303360
158 0.001446 33.487530
159 0.001402 33.671699
160 0.001360 33.855867
161 0.001319 34.040035
162 0.299338 1.520169
163 0.304584 4.803212
164 0.098451 7.989062
165 0.038130 11.199458
166 0.026276 14.434238
167 0.020517 17.683455
168 0.015647 20.932177
169 0.011520 24.151492
170 0.008348 27.273035
171 0.006043 30.135875
172 0.004426 32.436470
173 0.003315 33.874626
174 0.002567 34.547117
175 0.002061 34.860672
176 0.299957 1.522204
177 0.301924 4.805204
178 0.095946 7.989818
179 0.038769 11.195829
180 0.028073 14.417732
181 0.022609 17.630809
182 0.017660 20.780448
183 0.013464 23.739589
184 0.010227 26.247264
185 0.007862 27.964062
186 0.006188 28.833393
187 0.005022 29.218901
188 0.004215 29.449655
189 0.003653 29.644805
190 0.003254 29.831961
191 0.002968 30.017026
192 0.002755 30.201454
193 0.002592 30.385670
194 0.002462 30.569814
195 0.002354 30.753933
196 0.002261 30.938045
197 0.002178 31.122154
198 0.002103 31.306264
199 0.002033 31.490374
200 0.001968 31.674485
201 0.001906 31.858597
202 0.001847 32.042710
203 0.001790 32.226825
204 0.001735 32.410940
205 0.001682 32.595056
206 0.001631 32.779172
207 0.001581 32.963290
208 0.001533 33.147408
209 0.001487 33.331528
210 0.001442 33.515648
211 0.001398 33.699769
212 0.001356 33.883890
213 0.001315 34.068012
214 0.299493 1.522362
215 0.302104 4.805371
216 0.096039 7.989870
217 0.038775 11.195449
218 0.028109 14.416085
219 0.022678 17.625629
220 0.017751 20.765768
221 0.013566 23.701221
222 0.010335 26.159821
223 0.007974 27.807779
224 0.006301 28.624952
225 0.005134 28.990242
226 0.004325 29.216029
227 0.003760 29.410097
228 0.003359 29.596985
229 0.003070 29.781971
230 0.002855 29.966371
231 0.002689 30.150577
232 0.002556 30.334716
233 0.002445 30.518833
234 0.002349 30.702943
235 0.002264 30.887051
236 0.002186 31.071159
237 0.002114 31.255268
238 0.002046 31.439377
239 0.001982 31.623488
240 0.001920 31.807600
241 0.001861 31.991713
242 0.001804 32.175827
243 0.001749 32.359942
244 0.001696 32.544058
245 0.001645 32.728174
246 0.001595 32.912292
247 0.001546 33.096410
248 0.001500 33.280529
249 0.001454 33.464649
(OUTPUT FROM SCHAFFER)
ITERATION # OBSERVED-DATA LOGLIKELIHOOD
1 -21.50000000
2 -15.14860756
3 -13.26148943
4 -11.47773796
5 -9.418641300
6 -6.939941174
7 -4.058120113
8 -0.9726707366
9 1.921886740
10 4.146841243
11 5.430843610
12 6.006433583
13 6.294803019
14 6.507558888
15 6.702319717
16 6.892046598
17 7.080012740
18 7.267237989
19 7.454094923
20 7.640734838
21 7.827225074
22 8.013598752
23 8.199874057
24 8.386062149
25 8.572170590
26 8.758204955
27 8.944169636
28 9.130068275
29 9.315904017
30 9.501679658
31 9.687397741
32 9.873060612
33 10.05867047
34 10.24422937
35 10.42973929
36 10.61520208
37 10.80061952
38 10.98599330
39 11.17132506
40 11.35661634
41 11.54186864
42 11.72708339
43 11.91226196
44 12.09740567
45 12.28251578
46 12.46759350
47 12.65264001
48 12.83765641
49 13.02264379
50 13.20760318
51 13.39253558
52 13.57744193
53 13.76232315
54 13.94718014
55 14.13201373
56 14.31682474
57 14.50161396
58 14.68638215
59 14.87113003
60 15.05585830
61 15.24056764
62 15.42525869
63 15.60993208
64 15.79458841
65 15.97922826
66 16.16385219
67 16.34846073
68 16.53305441
69 16.71763373
70 16.90219915
71 17.08675116
72 17.27129019
73 17.45581669
74 17.64033105
75 17.82483369
76 18.00932500
77 18.19380534
78 18.37827508
79 18.56273456
80 18.74718412
81 18.93162409
82 19.11605478
83 19.30047648
84 19.48488950
85 19.66929411
86 19.85369058
87 20.03807918
88 20.22246016
89 20.40683377
90 20.59120024
91 20.77555981
92 20.95991268
93 21.14425909
94 21.32859923
95 21.51293330
96 21.69726150
97 21.88158401
98 22.06590101
99 22.25021269
100 22.43451920
101 22.61882071
102 22.80311738
103 22.98740936
104 23.17169680
105 23.35597984
106 23.54025862
107 23.72453327
108 23.90880393
109 24.09307071
110 24.27733375
111 24.46159315
112 24.64584903
113 24.83010150
114 25.01435066
115 25.19859663
116 25.38283949
117 25.56707935
118 25.75131630
119 25.93555042
120 26.11978182
121 26.30401056
122 26.48823673
123 26.67246042
124 26.85668170
125 27.04090064
126 27.22511732
127 27.40933180
128 27.59354416
129 27.77775446
130 27.96196276
131 28.14616912
132 28.33037361
133 28.51457629
134 28.69877719
135 28.88297639
136 29.06717394
137 29.25136988
138 29.43556425
139 29.61975714
140 29.80394855
141 29.98813854
142 30.17232715
143 30.35651445
144 30.54070045
145 30.72488519
146 30.90906873
147 31.09325109
148 31.27743233
149 31.46161243
150 31.64579149
151 31.82996947
152 32.01414649
153 32.19832250
154 32.38249758
155 32.56667172
156 32.75084500
157 32.93501740
158 33.11918896
159 33.30335972
160 33.48752972
161 33.67169891
162 33.85586741
163 34.04003511
164 34.22420223
165 34.40836858
166 34.59253436
167 34.77669944
168 34.96086394
169 35.14502783
170 35.32919112
171 35.51335390
172 35.69751606
173 35.88167789
174 36.06583901
175 36.24999968
176 36.43415984
177 36.61831956
178 36.80247886
179 36.98663768
180 37.17079621
181 37.35495406
182 37.53911186
183 37.72326913
184 37.90742584
185 38.09158245
186 38.27573867
187 38.45989423
188 38.64404981
189 38.82820496
190 39.01235982
191 39.19651422
192 39.38066854
193 39.56482263
194 39.74897634
195 39.93312951
196 40.11728274
197 40.30143575
198 40.48558805
199 40.66974064
200 40.85389274
201 41.03804488
202 41.22219636
203 41.40634852
204 41.59049931
205 41.77465075
206 41.95880133
207 42.14295188
208 42.32710298
209 42.51125366
210 42.69540341
211 42.87955395
212 43.06370416
213 43.24785337
214 43.43200363
215 43.61615225
EM CONVERGED AT ITERATION 215
**************************************************
At 17:42 23/07/2003, Donald Rubin wrote:
>I read with puzzlement this discussion, because EM theoretically will
>never lead to increases and decreases because it can never decrease the
>likelihood. The only practical case where it might appear to do so is
>with round-off error due to finite accuracy of computers, but in over a
>quarter century of experience, I've never seen a real example of this.
>When it was claimed, it always turned out to be a programming bug of some
>kind.
>
>
>Re 1. SAS now says that its EM is not EM but is some other algorithm,
>which they have called EM? I didn't know there was another algorithm for
>maximizing likelihoods called EM -- do they have a reference?
>
>Re 3. I do not follow what "overly strict convergence criterea" could
>mean, except perhaps being more strict than the machine could tolerate,
>and then the algorithm would "randomly" converge someplace within this
>tolerance, I think. By memory, your examples do not appear challenging
>enough to expose such possible problems.
>
>DBR
>
>
>On Wed, 23 Jul 2003, Fay Hughes wrote:
>
> > Many thanks for the advice and suggestions, after 2 weeks of email
> > conversations with SAS, here are a few points you may find interesting:
> >
> > 1. It is possible for SAS -2logL to increase and decrease, in an attempt
> > to find the minimum (unlike what I expected) - I suspect that for some
> > computer reason (storage space, speed etc), SAS is therefore not
> > implementing the EM algorithm as originally formulated - although I have
> > been unable to get clarity on this point.
> >
> > 3. In my particular case the EM algorithm converges to the global
> > minimum and then, apparently due to over strict convergence criteria,
> > proceeds to move away from that point to another - in fact a local
> > minimum, this convergence is also probably not helped by, most likely,
> > an oddly shaped loglikelihood.
> >
> > In conclusion, I suspect that careful inspection of the iteration
> > history would be a worthwhile undertaking, as in my case the result SAS
> > yields is not the global minimum, but rather a local one. Another way of
> > checking convergence, through different starting values for the EM is
> > not implemented in Version 8.2, but SAS informs me that it is an option
> > in version 9.
> >
> > Many thanks for any help,
> > Fay Hosking (nee Hughes)
> > Statistics MSc Student
> > School of Mathematical and Statistical Sciences
> > University of Natal, South Africa
> >
>
>--
>Donald B. Rubin
>John L. Loeb Professor of Statistics
>Chairman Department of Statistics
>Harvard University
>Cambridge MA 02138
>Tel: 617-495-5498 Fax: 617-496-8057