Note that you sent this in response to my response on an issue which was only tangentially related to what I suspect you are asking about. So I am confused (which seems to be fairly typical for me, recently.) But you did respond specifically to me, so I feel I should respond to you.
Additionally, however, when asking for help on a problem (in email forum contexts, perhaps in other contexts), it's best to state what you expected, what you did, and what you got which was different from what you expected. I see some of what you did, here, but not enough to know what you expected nor what you got that was different from what you expected. I see from your followup message that you have found the answer that you are looking for, so I will not attempt a deeper investigation at this time. But I thought I would attempt to answer the general gist of the questions you posed, within my understanding at this time. It's undoubtedly not the sort of answer you were looking for, but I can hope this will still be helpful. Thanks, -- Raul On Fri, Jan 27, 2012 at 8:23 PM, Linda Alvord <lindaalv...@verizon.net> wrote: > Please point out what I have done that does not represent the problem > correctly. I am trying for a simple solution and would like to know what is > incorrect in this solution. > > bmd=:([:I.[: -. [: ~: [: i.~ [: ? #~)"1 > all=:bmd ,.4$366 > all > |."1 all > ALL=:bmd ,.500$366 > NB. The first element of each row is the first match > NB. Note fill as list are of different length > NB. ALL is 500 samples of 366 people > 1{."1 all > mean=:+/ % # > data=:(mean ,1{."1 bmd ,.500$366)"1 > ]DATA=:data ,.10#10 > ]MOM=:mean DATA > > Thanks. > > Linda > > -----Original Message----- > From: programming-boun...@jsoftware.com > [mailto:programming-boun...@jsoftware.com] On Behalf Of Raul Miller > Sent: Friday, January 27, 2012 3:53 PM > To: Programming forum > Subject: Re: [Jprogramming] Challenge 4 Bountiful Birthdays > > You must have missed Jose Mario Quintana's post: > > http://jsoftware.com/pipermail/programming/2012-January/026885.html > > -- > Raul > > On Fri, Jan 27, 2012 at 2:53 PM, Devon McCormick <devon...@gmail.com> wrote: >> One thing not considered so far is that birthdays are not evenly >> distributed - see http://www.panix.com/~murphy/bday.html - and this affects >> the probability. >> >> On Fri, Jan 27, 2012 at 9:50 AM, Linda Alvord <lindaalv...@verizon.net>wrote: >> >>> I'm glad that we have finally covered most of the conditions that could >>> occur. I'll enjoy pondering your work. I'm working on a "narrower" >>> challenge for next time that will not take us so far afield. >>> >>> Linda >>> >>> -----Original Message----- >>> From: programming-boun...@jsoftware.com >>> [mailto:programming-boun...@jsoftware.com] On Behalf Of Jose Mario >>> Quintana >>> Sent: Thursday, January 26, 2012 11:08 PM >>> To: Programming forum >>> Subject: Re: [Jprogramming] Challenge 4 Bountiful Birthdays >>> >>> Regarding the original task, one can proceed (knowing the actual densities) >>> via a shifted multinomial simulation, >>> >>> multinomial=. +/\ o [ <: o ((0: , [) I. ]) ? o $&0 o ] NB. dyadic verb >>> >>> 7 (samples=. ((densities o [) (2+multinomial) ]) ("0)) 10 10 10 10 10 NB. >>> day of the week samples >>> 2 3 4 6 5 6 2 6 2 6 >>> 3 4 2 8 2 5 4 5 3 5 >>> 3 2 2 6 2 3 4 5 2 3 >>> 4 5 5 6 2 6 5 7 2 3 >>> 5 2 3 5 5 5 4 5 4 7 >>> >>> mean=. +/ % # >>> >>> 365 (samplesmeans=. (mean"1 o samples)) 10000000 NB. day of the year 10 >>> million sample mean >>> 24.6141 >>> >>> 10 (] , mean) o (365 &samplesmeans) o # 500 NB. the original task >>> 24.264 23.782 24.334 24.516 25.016 24.704 25.05 24.514 23.93 25.25 24.536 >>> >>> (] , mean) o (365 &samplesmeans) o # f. NB. according to the "simple" >>> rules? >>> (] , +/ % #)@:(365&((+/ % #)"1@:((+/\^:_1@:(1 - */\@:(1 - ] %~ 1 + i.))@:[ >>> (2 + +/\@:[ <:@:((0: , [) I. ]) ?@:$&0@:]) ])"0)))@:# >>> >>> Regarding accuracy, among other things, it can be argued that the >>> distribution could even depend whether the experiment is conducted in the >>> northern or the southern hemisphere (see >>> http://www.panix.com/~murphy/bday.html and >>> http://answers.google.com/answers/threadview/id/280242.html). Models, >>> maps, >>> and other representations are ultimately doomed to be inaccurate; the >>> subject matter is not only too complex but also evolving; above all, of >>> course, my representation of "the world" that is my mind is affected as >>> well :) >>> >>> ________________________________________ >>> From: programming-boun...@jsoftware.com [programming-boun...@jsoftware.com >>> ] >>> On Behalf Of Jose Mario Quintana [josemarioquint...@2bestsystems.com] >>> Sent: Tuesday, January 24, 2012 2:23 PM >>> To: Programming forum >>> Subject: Re: [Jprogramming] Challenge 4 Bountiful Birthdays >>> >>> > +/(2+i.1000) * p * q NB. expected value >>> > 24.6166 >>> >>> I found the same solution in a slightly different way, >>> >>> ((2 + i.) +/ .* +/\^:_1@:((1 - */\)@:(1 - ] %~ 1 + i.))) 365 >>> 24.6166 >>> >>> The outline follows: >>> >>> It is easier to start dealing with the day of the week birthday process >>> first, >>> >>> (outcomes=. 2 + i.) 7 NB. all other outcomes have zero densities; thus, >>> they are irrelevant >>> 2 3 4 5 6 7 8 >>> >>> o=. @: >>> >>> (cp=. 1 - ] %~ 1 + i.) 7 NB. conditional probabilities the process will >>> not stop at each outcome given that it did not stop at its predecessor >>> 0.857143 0.714286 0.571429 0.428571 0.285714 0.142857 0 >>> >>> cdf=. 1 - */\ o cp NB. cumulative distribution function >>> >>> load'plot' >>> plot (0 0 , cdf) 7 NB. ploting the (smoothed) cdf >>> >>> densities=. +/\^:_1 o cdf NB. since cdf -: +/\ densities >>> >>> (mean=. outcomes +/ .* densities) 7 NB. formula for discrete densities >>> 4.01814 >>> >>> This generalizes to the day of the year birthday process, >>> >>> plot (0 0 , cdf) 365 >>> mean 365 >>> 24.6166 >>> >>> mean f. >>> (2 + i.) +/ .* +/\^:_1@:(1 - */\@:(1 - ] %~ 1 + i.)) >>> >>> >>> >>> ________________________________________ >>> From: programming-boun...@jsoftware.com [programming-boun...@jsoftware.com >>> ] >>> On Behalf Of Mike Day [mike_liz....@tiscali.co.uk] >>> Sent: Friday, January 20, 2012 7:54 PM >>> To: Programming forum >>> Subject: Re: [Jprogramming] Challenge 4 Bountiful Birthdays >>> >>> My "trial" function, listed earlier (and below) was >>> not quite correct, as it failed to count the >>> successful person. >>> >>> So it should be: >>> >>> trialb =: ([: # (] (,`]@.e.~ ([: ? 365"_)))^:_)"0 >>> >>> So we get, for example (but it's very slow! My variant >>> triala discussed with Linda is somewhat better): >>> >>> (mean, stdev) mean trialb 5000 100 $ _1 >>> 24.6133 0.180788 >>> >>> Linda thinks the mean should be somewhat lower, and >>> Brian thinks it's a lot lower. However, the standard >>> deviation suggests it's close to the true value. >>> >>> I think this is the way to find the true expected number >>> of people. We don't need Markov after all: >>> >>> Probability that (n-1) arrivals all have different >>> b/days: >>> >>> q =: Prod (1 - i%Y), 0<: i <: n-2, Y =~ 365 >>> >>> Probability that the nth arrival's b/day is one of >>> those present, ie is one of n-1 distinct bdays: >>> >>> p =: (n-1) % Y >>> >>> Expected value of number of arrivals for "success": >>> >>> Sum (2+i) pi * qi, 0 <: i <: n-2 >>> >>> In J: >>> 5{. q =: */\(1 - (365 %~(i.))) 1000 >>> 1 0.99726 0.991796 0.983644 0.972864 >>> >>> 5 {. p =: (365 %~>:@i. )1000 >>> 0.00273973 0.00547945 0.00821918 0.0109589 0.0136986 >>> >>> +/(2+i.1000) * p * q NB. expected value >>> 24.6166 >>> >>> This is not the same as the median, where the >>> probability q moves below 0.5, >>> >>> 21 22 { q >>> 0.524305 0.492703 >>> >>> As Roger observes, the index origin comes into play; >>> we should add one as the first person is 1, not zero (!) >>> and the median group size is therefore just below 23. >>> >>> This last is dealing with a slightly different problem: >>> what is the probability that a certain sized group of >>> people do (not) share a birthday? So we shouldn't be >>> surprised at the difference. >>> >>> Mike >>> >>> On 18/01/2012 3:17 PM, Mike Day wrote: >>> > People seem to be tackling two different problems. >>> > >>> > Variations on the Birthday Problem as I remember them: >>> > (a) what is the probability that two (or more) people >>> > share a birthday in a group of N people? >>> > (b) what should N be for the probability to be (say) 0.5 ? >>> > The somewhat counter-intuitive answers are dealt with in >>> > Roger's Wiki Essay, among many treatments, and also >>> > Pablo's message, below. The essential point is to >>> > consider the probability that there are no matches. >>> > >>> > However, Linda's single trial as stated is a random >>> > process with a stopping condition: >>> > take one person at a time until the new person shares a >>> > birthday with those already present. The result is the >>> > number of people including the new arrival. >>> > >>> > I expect you need a Markov Process approach to get the >>> > exact expected value for the stopping number. Not proved! >>> > >>> > Here's a stab at the required simulation, avoiding @ and @: >>> > though using [: >>> > >>> > NB. I use _1 as seed, so need to decrement the count >>> > >>> > trial =: (_1 + [: # (] (,`]@.e.~ ([: ? 365"_)))^:_)"0 >>> > >>> > trial 10#_1 NB. eg conduct 10 trials >>> > 27 19 29 2 24 42 30 9 34 33 >>> > >>> > mean =: +/%# NB. ok for vectors or columns of matrix >>> > >>> > ([:(;~mean) mean) TRIALS =: trial 500 10 $ _1 >>> > >>> >>> +-------+------------------------------------------------------------------- >>> + >>> > >>> > |23.5882|22.696 23.676 23.894 24.044 23.874 23.56 24.258 23.416 22.81 >>> > 23.654| >>> > >>> >>> +-------+------------------------------------------------------------------- >>> + >>> > >>> > >>> > These means are indeed close to N in problems >>> > (a) & (b) where the probability is ~0.50, namely >>> > 21 for 0.475695 and 22 for 0.507297, but not the >>> > same. >>> > >>> > I used 365 rather than Pablo's 365.25 . The simulation >>> > could be done for 365.25, using the integer 1461 (say). >>> > The stopping condition would be a bit more complicated. >>> > >>> > The deviation of trials is quite large: >>> > SS =: [: *: (-"1 mean) NB. squared deviations from mean >>> > stdev=: [: %: [: mean SS NB. Observed Standard deviation >>> > NB. not necessarily recommended for real, large sets of data >>> > >>> > (mean,:stdev) TRIALS >>> > 22.696 23.676 23.894 24.044 23.874 23.56 24.258 23.416 22.81 >>> > 23.654 >>> > 11.9378 12.6587 12.6917 12.5288 12.281 11.9741 12.1957 11.442 12.0969 >>> > 12.8718 >>> > >>> > NB. standard deviation of the means: >>> > >>> > (mean, stdev) mean TRIALS >>> > 23.5882 0.477041 >>> > >>> > Mike >>> > >>> > >>> > >>> > >>> >>> ---------------------------------------------------------------------- >>> For information about J forums see http://www.jsoftware.com/forums.htm >>> >>> ---------------------------------------------------------------------- >>> For information about J forums see http://www.jsoftware.com/forums.htm >>> >>> ---------------------------------------------------------------------- >>> For information about J forums see http://www.jsoftware.com/forums.htm >>> >>> ---------------------------------------------------------------------- >>> For information about J forums see http://www.jsoftware.com/forums.htm >>> >> >> >> >> -- >> Devon McCormick, CFA >> ^me^ at acm. >> org is my >> preferred e-mail >> ---------------------------------------------------------------------- >> For information about J forums see http://www.jsoftware.com/forums.htm > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm > > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm