A (key?) question: is it tolerable that the merged group of bets generates some bets that were not contained in the original group?
For example, for the bet group

1 2 3 7 8 9
5 2 3 7 8 9
1 6 3 7 8 9
5 6 3 7 8 9
1 2 4 7 8 9
5 2 4 7 8 9
1 6 4 7 8 9

the merge 1+5 / 2+6 / 3+4 / 7 / 8 / 9 gives good compression, but it generates one unwanted bet (5 / 6 / 4 / 7 / 8 / 9). It would help if you sent a sample group of bets.

On 2/7/07, bullockbefriending bard <[EMAIL PROTECTED]> wrote:
>
> Please bear with the somewhat unusual subject matter. I am not sure whether or not this is an (a) interesting or (b) tractable graph-theoretical problem, and beg your collective indulgence!
>
> The Six-Up (I believe known as Six-Pick in the USA), henceforth abbreviated to 6UP, is a type of horse racing wager where one tries to pick the first or second horse in each of six races.
>
> For a given race meeting, we might have a large number (80,000 would not be uncommon) of computer betting model-generated single 6UP bets, each of which looks like this: R1 / R2 / R3 / R4 / R5 / R6 (amt), where Rn is the runner one has picked for race leg n and amt is the amount wagered for this bet. As each leg is a separate race, the legs are independent: i.e. it is quite possible to have a bet like 1 / 1 / 1 / 1 / 1 / 1 (10). Generally, all legs of the 6UP would have 14 horses running, giving 14^6 = 7,529,536 possible combinations to wager on. So it's not at all uncommon to have a 'mere' 80,000 combinations showing a positive expectation and therefore being desirable wagers.
>
> It *is* possible to send bets electronically to the totalizator system. However, it is not feasible to place 80,000+ bets in the available time. As is the case with all totalizator systems, there is a way to group single bets so that (traditionally) one paper betting ticket could represent multiple bets for the same wager amount. This can also be done for electronic bet submission.
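To make the objection above concrete, here is a small Python sketch that expands a proposed merge and reports which of the original single bets it misses and which unwanted extras it generates. The representation (single bets as 6-tuples of runner numbers, a merged bet as one set of runners per leg) is my own assumption, not something from the thread.

```python
# Sketch: verify a proposed merge against the original group of bets.
from itertools import product

def expand(merged):
    """All single bets represented by a merged bet (one runner set per leg)."""
    return set(product(*merged))

def check_merge(merged, originals):
    """Return (missing, unwanted) single bets for a proposed merge."""
    expanded = expand(merged)
    originals = set(originals)
    return originals - expanded, expanded - originals

# The seven-bet group from the example above:
group = [
    (1, 2, 3, 7, 8, 9), (5, 2, 3, 7, 8, 9), (1, 6, 3, 7, 8, 9),
    (5, 6, 3, 7, 8, 9), (1, 2, 4, 7, 8, 9), (5, 2, 4, 7, 8, 9),
    (1, 6, 4, 7, 8, 9),
]
merge = [{1, 5}, {2, 6}, {3, 4}, {7}, {8}, {9}]
missing, unwanted = check_merge(merge, group)
print(missing)   # set(): every original bet is covered
print(unwanted)  # {(5, 6, 4, 7, 8, 9)}: the one extra bet
```

The merge expands to 2 x 2 x 2 = 8 single bets against the 7 originals, which is where the single unwanted bet comes from.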
> For example, the single bets:
>
> 4 / 7 / 12 / 1 / 1 / 9 (10)
> 4 / 8 / 12 / 1 / 1 / 9 (10)
> 4 / 7 / 12 / 1 / 1 / 14 (10)
> 4 / 8 / 12 / 1 / 1 / 14 (10)
>
> could be grouped together as one bet: 4 / 7 + 8 / 12 / 1 / 1 / 9 + 14 (10)
>
> (I have listed these bets in order such that only one leg value changes between each line. Of this, more later.)
>
> This grouped bet would then be submitted to the totalizator system, where it would be expanded back into its constituent bets and the 4 single wagers recorded against our account. Ideally one would hope to merge many more single bets into one grouped bet and see multiple terms in each of the 6 legs of the grouped bet - thereby achieving a much greater 'compression ratio'.
>
> GOAL: We wish to group or merge (depending on one's terminological preference) large numbers of these single bets quickly and efficiently.
>
> (For the purposes of most of this outline, I am pretending that all 6UP single bets are for the same wager amount. In practice, this would not be so. In reality wager amount does matter, because I would not wish to group these bets together:
>
> 4 / 7 / 12 / 1 / 1 / 9 (100)
> 4 / 8 / 12 / 1 / 1 / 9 (10)
> 4 / 7 / 12 / 1 / 1 / 14 (10)
> 4 / 8 / 12 / 1 / 1 / 14 (10)
>
> 4 / 7 + 8 / 12 / 1 / 1 / 9 + 14 (10) would miss $90 of desired investment on 4 / 7 / 12 / 1 / 1 / 9, and more seriously 4 / 7 + 8 / 12 / 1 / 1 / 9 + 14 (100) would over-invest $90 on each of the other three bets in the group, where we only want to bet $10.)
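The expansion step the totalizator performs can be sketched directly from the notation used in the post. The parser below is my own illustration (the real tote wire format is not described in the thread); it just splits the grouped-bet string on `/` for legs and `+` for runners.

```python
# Sketch: expand the grouped-bet notation '4 / 7 + 8 / 12 / 1 / 1 / 9 + 14'
# back into its constituent single bets.
from itertools import product

def expand_group(group_str):
    """Expand a grouped bet written as 'r / r + r / ...' into single bets."""
    legs = [[int(r) for r in leg.split('+')] for leg in group_str.split('/')]
    return [bet for bet in product(*legs)]

for bet in expand_group('4 / 7 + 8 / 12 / 1 / 1 / 9 + 14'):
    print(' / '.join(map(str, bet)))
```

Here 2 runners in leg 2 and 2 in leg 6 expand to 2 x 2 = 4 single bets, matching the four listed above; the 'compression ratio' is the expanded count divided by the number of grouped bets submitted.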
> After floundering around with various naive approaches and flipping through Skiena's Algorithm Design Manual and some discussion group postings, it was suggested to me that I compute modified Hamming distances between each pair of bets in the set of merge-candidate single bets, where the modified Hamming distance is the number of legs in which the runners differ between two bets, e.g.:
>
> 1 / 1 / 1 / 1 / 1 / 1
> 1 / 1 / 14 / 1 / 1 / 1 --> modified Hamming distance between these two bets is 1 because they differ only in leg 3.
>
> 1 / 1 / 1 / 1 / 1 / 1
> 2 / 1 / 1 / 2 / 2 / 7 --> modified Hamming distance between these two bets is 4 because they differ in legs 1, 4, 5, and 6.
>
> If I construct an undirected graph where vertices are bets and edges join only those pairs of vertices having a modified Hamming distance of 1, then I have some kind of representation of potential grouping candidates.
>
> Furthermore, I have been able to discern some simple properties of such a graph:
>
> Cliques in this graph describe the simplest form of grouping, having multiple terms in only one leg, such as:
>
> 1 / 14 / 9 / 4 / 11 / 1 + 8 + 9 + 10
>
> i.e. if n vertices are all connected to each other by edges, then every pair has modified Hamming distance 1, so the bets must all differ from each other by one runner in one leg position.
>
> Further reflection suggests that simple cycles in this graph describe groupings having multiple terms in two legs, such as 4 / 7 + 8 / 12 / 1 / 1 / 9 + 14:
>
> A 4 / 7 / 12 / 1 / 1 / 9 (10)
> B 4 / 8 / 12 / 1 / 1 / 9 (10)
> C 4 / 8 / 12 / 1 / 1 / 14 (10)
> D 4 / 7 / 12 / 1 / 1 / 14 (10)
>
> (ABCD is a simple cycle, as there is only 1 change between each line, and between D and A.)
>
> These are a good start, but it's necessary to merge further. If I could find (e.g.)
> 2 / 14 / 9 / 4 / 11 / 1 + 8 + 9 + 10 elsewhere in my graph, I would then be able to combine this with the aforementioned 1 / 14 / 9 / 4 / 11 / 1 + 8 + 9 + 10 and get:
>
> 1 + 2 / 14 / 9 / 4 / 11 / 1 + 8 + 9 + 10, and so on.
>
> My unschooled (I have an EE, not CS, background) flounderings in the literature suggest that in effect I should be searching my graph for matchings of cycles and cliques. I leave out any notion of optimality for the time being, given the NP-hard odor about the whole undertaking.
>
> I would very greatly appreciate:
>
> (1) Any brief comments anybody might be able to make about the soundness of my approach. I would be grateful to know that I am not barking up the wrong tree and missing something obvious.
>
> (2) In addition to (1), any pointers to papers which might refer to a similar problem. I have searched high and low but have not been able to find a direct analogue to this bet-grouping problem. Despite initially looking at string matching and various clustering and bioinformatics topics, I wasn't able to find one. Did I miss something?
>
> (3) I am very new to graph theory and do not (this is probably obvious to the assembled company by now) have a strong background in mathematics. If I am missing something due to lacking the basic conceptual grammar, incorrect use of terminology, etc., I would appreciate any tips.
>
> (4) If nobody can formulate the problem in a better way, how should I go about attempting to match cliques and cycles? I get the impression that this is not such a simple thing to do.
>
> Finally, to give some notion of the size of the networks I am looking at, one fairly maximal (I hope) edge case had 143,000 vertices and 2.5 million edges when I ignored wager size. In practice, I would probably only create edges between two (modified Hamming distance = 1) vertices when their wager sizes were within an acceptable (say +/- 10%) range of each other.
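As a sketch of the distance-1 graph construction described above (the 6-tuple bet representation is my assumption): a naive pairwise comparison is O(n^2), roughly 3.2 billion distance checks for 80,000 bets, but bucketing each bet under six "one leg wildcarded" keys finds all distance-1 pairs without comparing every pair.

```python
# Sketch: build the edge set of the modified-Hamming-distance-1 graph.
from collections import defaultdict

def hamming(a, b):
    """Modified Hamming distance: number of legs picking different runners."""
    return sum(x != y for x, y in zip(a, b))

def distance_one_edges(bets):
    """All index pairs (i, j), i < j, at modified Hamming distance 1.

    Assumes the bets are distinct (duplicates would also share a bucket).
    """
    buckets = defaultdict(list)
    for i, bet in enumerate(bets):
        for leg in range(6):
            # Two distinct bets share this key iff they agree in every
            # leg other than `leg`, i.e. their distance is exactly 1.
            buckets[(leg,) + bet[:leg] + bet[leg + 1:]].append(i)
    edges = set()
    for members in buckets.values():
        edges.update((i, j) for i in members for j in members if i < j)
    return edges

bets = [(1, 1, 1, 1, 1, 1), (1, 1, 14, 1, 1, 1), (2, 1, 1, 2, 2, 7)]
print(hamming(bets[0], bets[1]))  # 1 (differ only in leg 3)
print(hamming(bets[0], bets[2]))  # 4 (differ in legs 1, 4, 5, 6)
print(distance_one_edges(bets))   # {(0, 1)}
```

The bucketing trick is linear in the number of bets (times the six legs), which matters at the 143,000-vertex scale mentioned above.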
> Adding this constraint results in a graph containing multiple networks, each unconnected with the others. This would tend to reduce the aggregate number of edges quite significantly. Presumably I would then attack each of these self-contained networks and attempt to extract fairly optimal bet groupings from them - issues of wager size having been dealt with at one fell swoop, by virtue of the fact that each network only contains wagers of acceptably similar size.

--
You received this message because you are subscribed to the Google Groups "Algorithm Geeks" group. To post to this group, send email to algogeeks@googlegroups.com. To unsubscribe from this group, send email to [EMAIL PROTECTED]. For more options, visit this group at http://groups.google.com/group/algogeeks
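The final step proposed in the post - split the wager-size-constrained graph into its self-contained networks and attack each separately - is a standard connected-components computation. A minimal union-find sketch (vertex indexing and edge list are my assumptions; the edges are taken to be already filtered by the +/-10% wager rule):

```python
# Sketch: split the graph into connected components with union-find.
from collections import defaultdict

def connected_components(n, edges):
    """Components of an undirected graph on vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)  # union the two components

    comps = defaultdict(list)
    for v in range(n):
        comps[find(v)].append(v)
    return list(comps.values())

# Toy example: vertices 0-1-2 form one network, 3-4 another.
print(connected_components(5, [(0, 1), (1, 2), (3, 4)]))  # [[0, 1, 2], [3, 4]]
```

Each returned component can then be searched for cliques and cycles independently, with wager-size compatibility already guaranteed within it.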