Interesting task! Unless there's some whitepaper out there explaining the "real" right answer, Cameron's mention of how to test the effectiveness is great.
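For what it's worth, here is a rough Python sketch of Cameron's test and of the greedy assignment he describes in the quoted message below. It is only one reading of his description, not a worked-out solution. The function names (ratios, imbalance, assign), the dict-per-student layout, and the sum-of-squared-gaps score are all my own assumptions; he deliberately left the "one big number" open.

```python
import random
from collections import Counter

def ratios(members, attrs):
    """Share of each (attribute, value) pair among the given members."""
    n = len(members)
    counts = Counter((a, m[a]) for m in members for a in attrs)
    return {key: c / n for key, c in counts.items()}

def imbalance(group, big_ratio, attrs):
    """Assumed score: sum of squared gaps between Group Ratio and Big Ratio."""
    group_ratio = ratios(group, attrs)
    return sum((group_ratio.get(key, 0.0) - share) ** 2
               for key, share in big_ratio.items())

def assign(students, attrs, n_groups, seed=0):
    """Greedy assignment: seed each group, then place each person where it helps most."""
    rng = random.Random(seed)
    pool = list(students)
    rng.shuffle(pool)
    big_ratio = ratios(pool, attrs)
    cap = -(-len(pool) // n_groups)  # ceiling division, e.g. 17 for 420 students in 25 groups
    groups = [[pool.pop()] for _ in range(n_groups)]  # one random individual per group
    for person in pool:
        open_groups = [g for g in groups if len(g) < cap]  # full groups are closed
        # Pick the group whose score improves the most (or worsens the least).
        best = min(open_groups,
                   key=lambda g: imbalance(g + [person], big_ratio, attrs)
                   - imbalance(g, big_ratio, attrs))
        best.append(person)
    return groups
```

Calling assign(students, ["sex", "race", "origin"], 25) on a list of dicts would return 25 lists of students. Note that nothing here forces the groups to be the same size beyond the cap, so balancing by number would need an extra rule.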
One way that occurred to me is to have a meta table with a column for the attribute name and a column for its importance (sequential, with 1 being least important). I'll call it tA for this email...

Step 1

  ColumnList = [all columns from tA]
  Select #ColumnList#, count(*) from student group by #ColumnList#
  - Take the returned row with the smallest count
  memberPool = any members that match all the columns in #ColumnList#
  into each group, insert (memberPool / number of groups) members into group-member

Step 2 - loop over this

  ColumnList = [all columns from tA] with importance > loopcount
  Select #ColumnList#, count(*) from student join group-member group by #ColumnList#
  - Take the returned row with the smallest count
  memberPool = any members that match all the columns in #ColumnList# and are not in group-member
  into each group, insert members as in Step 1
  break loop when all members are assigned

The importance might already be determined arbitrarily (by which I mean for some non-data/math reason). If it isn't, maybe do this:

  For each column in ColumnList = [all columns from tA]
    Select '#column#', #column#, count(*) from member group by #column#
    UNION ALL (repeat loop)

After looping, look at the counts. Out of this set, whichever column/value pair has the largest count, use that column as your most important; for your second most important, use the pair with the next-largest count that doesn't match the column you just picked... and so forth.

Not sure how any of that would pan out... but my two cents!

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Cameron Childress
Sent: Friday, July 25, 2008 18:44
To: discussion@acfug.org
Subject: Re: [ACFUG Discuss] sorting question

Dean's method is one possibility. This is actually a very interesting question and I'm not sure how I'd solve it. I thought about it a bit during my drive home, and here's the approach I would take...

This is a lot easier if there are only two choices for each statistic (male|female - american|foreign - white|nonwhite), but it could also work with multiple choices for each.

First, how to measure the success of the program? Measure the percentage of each stat in the group as a whole (I'll call this Big Ratio), and then measure the percentages in each of the 25 groups (I'll call this Group Ratio) and see how closely they each match.

Okay, next, how to divide them up into groups? I'd start by seeding each group with a random individual. Then I would take each person from the pool of potential students and loop over each group, testing to see if adding that person to that group would move the Group Ratio for that group closer to or farther away from the Big Ratio. Whichever Group Ratio moves the farthest toward the Big Ratio would be the group you add that individual to. Once a group reaches 17 people, close it and stop adding people to it.

You'll have to find a way of combining the ratios into one big number that represents the combination. I am sure if I had paid more attention in my statistics class I'd know it had something to do with standard deviations, but I didn't pay any attention - so that's up to you to figure out.

Once you are done, look at all the Group Ratios and see how closely their balance measures up to the Big Ratio.

Two suggestions to make this easier on yourself:

1) Start by attempting to balance a smaller number of groups than 25. 2 or 3 maybe.
2) Start with binary choices, then move on to multiple choices after you have a method that is capable of balancing two choices.

At least that's where I would start. If you are willing, post your solution (in English or in code) once you're done. I would be interested in seeing how you did it.

-Cameron

On Fri, Jul 25, 2008 at 5:25 PM, Tepfer, Seth <[EMAIL PROTECTED]> wrote:
> I have a challenge laid out before me.
> I need to divide the incoming Oxford student class into 25 groups of about 16 or 17 students each. However, they want the groups to be as balanced as possible, across number, sex, race, and geographic origin. Now, I can easily see how to balance based on sex or any single characteristic. But how to balance across all three at the same time? My head starts spinning when I think about the fact that we won't necessarily have an equal distribution across any of the characteristics.
>
> I don't need the code, just the concept. I am having a hard time conceiving how to do this if the people were standing in front of me, much less by code. Any ideas?

-------------------------------------------------------------
To unsubscribe from this list, manage your profile @
http://www.acfug.org?fa=login.edituserform

For more info, see http://www.acfug.org/mailinglists
Archive @ http://www.mail-archive.com/discussion%40acfug.org/
List hosted by http://www.fusionlink.com
-------------------------------------------------------------