Please bear with the somewhat unusual subject matter. I am not sure
whether this is (a) an interesting or (b) a tractable graph-theoretic
problem, and beg your collective indulgence!

The Six-Up (I believe known as Six-Pick in the USA), and henceforth
abbreviated to 6UP is a type of horse racing wager where one tries to
pick the first or second horse in each of six races.

For a given race meeting, we might have a large number (80,000 would
not be uncommon) of computer betting-model-generated single 6UP bets,
each of which looks like this: R1 / R2 / R3 / R4 / R5 / R6 (amt), where
Rn is the runner picked for race leg n and amt is the amount wagered
on this bet. As each leg is a separate race, the legs are independent:
i.e. it is quite possible to have a bet like 1 / 1 / 1 / 1 / 1 / 1
(10). Generally, all six legs of the 6UP would have 14 horses running,
giving 14^6 = 7,529,536 possible combinations to wager on. So it's
not at all uncommon to have a 'mere' 80,000 combinations showing a
positive expectation and therefore being desirable wagers.

It *is* possible to send bets electronically to the totalizator
system. However, it is not feasible to place 80,000+ bets in the
available time. As is the case with all totalizator systems, there is
a way to group single bets so that (traditionally) one paper betting
ticket could represent multiple bets for the same wager amount. This
can also be done for electronic bet submission. For example, the
single bets:

        4 / 7 / 12 / 1 / 1 / 9 (10)
        4 / 8 / 12 / 1 / 1 / 9 (10)
        4 / 7 / 12 / 1 / 1 / 14 (10)
        4 / 8 / 12 / 1 / 1 / 14 (10)

could be grouped together as one bet: 4 / 7 + 8 / 12 / 1 / 1 / 9 + 14
(10)

(I have listed these bets in an order such that only one leg value
changes between each line. Of this, more later.)

This grouped bet would then be submitted to the totalizator system
where it would be expanded back into its constituent bets and the 4
single wagers recorded against our account. Ideally one would hope to
merge many more single bets into one grouped bet and see multiple
terms in each of the 6 legs of the grouped bet - thereby achieving a
much greater 'compression ratio'.
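To make the expansion concrete, here is a minimal sketch of what the totalizator does with a grouped bet. Representing a grouped bet as a list of per-leg runner lists is my own choice for illustration, not the actual wire format:

```python
from itertools import product

def expand(grouped):
    """Expand a grouped bet (a list of per-leg runner lists) back
    into its constituent single bets, each a 6-tuple of runners,
    as the totalizator system would."""
    return list(product(*grouped))

# The example grouped bet 4 / 7 + 8 / 12 / 1 / 1 / 9 + 14:
grouped = [[4], [7, 8], [12], [1], [1], [9, 14]]
singles = expand(grouped)
# 1 * 2 * 1 * 1 * 1 * 2 = 4 single bets, e.g. (4, 7, 12, 1, 1, 9)
```

The 'compression ratio' of a grouped bet is then just the product of its per-leg term counts versus the one line submitted.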

GOAL: We wish to group or merge (depending on one's terminological
preference) large numbers of these single bets quickly and
efficiently.

(For the purposes of most of this outline, I am pretending that all
6UP single bets are for the same wager amount. In practice this would
not be so, and the wager amount does matter, because I would not wish
to group these bets together:

        4 / 7 / 12 / 1 / 1 / 9 (100)
        4 / 8 / 12 / 1 / 1 / 9 (10)
        4 / 7 / 12 / 1 / 1 / 14 (10)
        4 / 8 / 12 / 1 / 1 / 14 (10)

Grouping them as 4 / 7 + 8 / 12 / 1 / 1 / 9 + 14 (10) would miss $90
of the desired investment on 4 / 7 / 12 / 1 / 1 / 9, and, more
seriously, grouping them as 4 / 7 + 8 / 12 / 1 / 1 / 9 + 14 (100)
would over-invest $90 on each of the other three bets in the group,
where we only want to bet $10.)


After floundering around with various naive approaches and flipping
through Skiena's Algorithm Design Manual and some discussion group
postings, it was suggested to me that I compute modified Hamming
distances between each bet in the set of merge candidate single bets,
where the modified Hamming distance is the number of legs in which the
runners differ between two bets, e.g.:

1 / 1 / 1 / 1 / 1 / 1
1 / 1 / 14 / 1 / 1 / 1  --> modified Hamming distance between these
two bets is 1 because they differ only in leg 3.

1 / 1 / 1 / 1 / 1 / 1
2 / 1 / 1 / 2 / 2 / 7   --> modified Hamming distance between these
two bets is 4 because they differ in legs 1, 4, 5, and 6.
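For what it's worth, this modified Hamming distance is just the ordinary Hamming distance taken over tuples of runner numbers, and it is a one-liner to compute. A minimal sketch:

```python
def modified_hamming(a, b):
    """Number of legs in which two 6UP bets pick different runners.
    a and b are 6-tuples of runner numbers."""
    return sum(1 for x, y in zip(a, b) if x != y)

modified_hamming((1, 1, 1, 1, 1, 1), (1, 1, 14, 1, 1, 1))  # -> 1
modified_hamming((1, 1, 1, 1, 1, 1), (2, 1, 1, 2, 2, 7))   # -> 4
```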

If I construct an undirected graph where vertices are bets and edges
join only those pairs of vertices having a modified Hamming distance
of 1, then I have some kind of representation of potential grouping
candidates.
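At 80,000+ bets, comparing all O(n^2) pairs to find the distance-1 edges is already painful, so here is one bucketing trick (my own sketch, under the assumption that bets are stored as 6-tuples): for each of the six legs, key each bet on the values of the other five legs; two bets land in the same bucket for some leg exactly when they agree everywhere outside that leg.

```python
from collections import defaultdict
from itertools import combinations

def distance_one_edges(bets):
    """Return all pairs (i, j), i < j, of bets at modified Hamming
    distance exactly 1. For each leg, bucket bets by the values of
    the OTHER five legs; only bets sharing a bucket can be at
    distance 1 via that leg."""
    edges = set()
    for leg in range(6):
        buckets = defaultdict(list)
        for i, bet in enumerate(bets):
            buckets[bet[:leg] + bet[leg + 1:]].append(i)
        for group in buckets.values():
            for i, j in combinations(group, 2):
                # exclude identical bets (distance 0, not 1)
                if bets[i][leg] != bets[j][leg]:
                    edges.add((i, j))
    return edges

# The four bets from the grouping example, labelled A=0 .. D=3:
bets = [(4, 7, 12, 1, 1, 9), (4, 8, 12, 1, 1, 9),
        (4, 8, 12, 1, 1, 14), (4, 7, 12, 1, 1, 14)]
edges = distance_one_edges(bets)
# -> the 4-cycle: {(0, 1), (1, 2), (2, 3), (0, 3)}
```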

Furthermore, I have been able to discern some simple properties of
such a graph:

Cliques in this graph describe the simplest form of grouping having
multiple terms in only one leg, such as:

1 / 14 / 9 / 4 / 11 / 1 + 8 + 9 + 10

i.e. if n vertices are pairwise joined by edges, then every pair has
modified Hamming distance 1, and this forces all n bets to differ from
one another by one runner in the *same* leg position: if A and B
differed in leg i while A and C differed in some other leg j, then B
and C would differ in both legs i and j and so could not be adjacent.

Further reflection suggests that simple cycles of length 4 in this
graph describe groupings having multiple terms in two legs, such as
4 / 7 + 8 / 12 / 1 / 1 / 9 + 14 (10):

A       4 / 7 / 12 / 1 / 1 / 9 (10)
B       4 / 8 / 12 / 1 / 1 / 9 (10)
C       4 / 8 / 12 / 1 / 1 / 14 (10)
D       4 / 7 / 12 / 1 / 1 / 14 (10)

(ABCDA is a simple cycle, as only one leg value changes between each
consecutive pair of lines, including from D back to A.)
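One way to make the grouping property precise (this formulation is mine, so treat it as a conjectured invariant rather than established fact): a set of equal-amount single bets can be written as one grouped bet exactly when the set is the full Cartesian product of its per-leg runner sets. A sketch of that check:

```python
from itertools import product

def is_groupable(bets):
    """True iff the given single bets (6-tuples, equal wager amounts
    assumed) can be written as one grouped bet, i.e. they form the
    full Cartesian product of their per-leg runner sets."""
    legs = [sorted({bet[i] for bet in bets}) for i in range(6)]
    return set(bets) == set(product(*legs))

cycle = [(4, 7, 12, 1, 1, 9), (4, 8, 12, 1, 1, 9),
         (4, 8, 12, 1, 1, 14), (4, 7, 12, 1, 1, 14)]
is_groupable(cycle)      # -> True: 4 / 7 + 8 / 12 / 1 / 1 / 9 + 14
is_groupable(cycle[:3])  # -> False: the product has 4 bets, we have 3
```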


These are a good start, but it's necessary to merge further. If I
could find (e.g.) 2 / 14 / 9 / 4 / 11 / 1 + 8 + 9 + 10 elsewhere in my
graph, I would then be able to combine it with the aforementioned
1 / 14 / 9 / 4 / 11 / 1 + 8 + 9 + 10 and get:

1 + 2 / 14 / 9 / 4 / 11 / 1 + 8 + 9 + 10 and so on.
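The merge step just described (combining two grouped bets that agree in every leg but one) can be sketched as follows; representing a grouped bet as a 6-tuple of frozensets of runners is my own choice:

```python
def try_merge(g1, g2):
    """If two grouped bets (6-tuples of frozensets of runners) agree
    in every leg but one, return their merge; otherwise None."""
    diff = [i for i in range(6) if g1[i] != g2[i]]
    if len(diff) != 1:
        return None
    i = diff[0]
    merged = list(g1)
    merged[i] = g1[i] | g2[i]  # union the terms in the one leg
    return tuple(merged)

a = tuple(map(frozenset, ([1], [14], [9], [4], [11], [1, 8, 9, 10])))
b = tuple(map(frozenset, ([2], [14], [9], [4], [11], [1, 8, 9, 10])))
try_merge(a, b)  # -> 1 + 2 / 14 / 9 / 4 / 11 / 1 + 8 + 9 + 10
```

Note the merged result is again a grouped bet, so the same step can be applied repeatedly ("and so on") as long as a partner differing in only one leg's term set can be found.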

My unschooled (I have an EE, not CS background) flounderings in the
literature suggest that in effect I should be searching my graph for
matchings of cycles and cliques. I leave out any notion of optimality
for the time being, given the NP-hard odor about the whole
undertaking.

I would very greatly appreciate:

(1) Any brief comments anybody might be able to make about the
soundness of my approach.  I would be grateful to know that I am not
barking up the wrong tree and missing something obvious.

(2) In addition to (1) any pointers to papers which might refer to a
similar problem. I have searched high and low but have not been able
to find a direct analogue of this bet-grouping problem. I looked
initially at string matching and at various clustering and
bioinformatics topics, but found nothing that fit. Did I miss
something?

(3) I am very new to graph theory and do not (this is probably obvious
to the assembled company by now) have a strong background in
mathematics. If I am missing something due to lacking the basic
conceptual grammar, incorrect use of terminologies, etc., I would
appreciate any tips.

(4) If nobody can formulate the problem in a better way, how should I
go about attempting to match cliques and cycles? I get the impression
that this is not such a simple thing to do.

Finally, to give some notion of the size of networks I am looking at,
one fairly maximal (I hope) edge case had 143,000 vertices and 2.5
million edges when I ignored wager size. In practice, I would probably
only create edges between two (modified Hamming distance = 1) vertices
when their wager sizes were within an acceptable (say +/- 10%) range
of each other. Adding this constraint splits the graph into multiple
connected components, each disconnected from the others, which tends
to reduce the aggregate number of edges quite significantly.
Presumably I would then attack each of these self-contained components
in turn and attempt to extract fairly optimal bet groupings from it -
issues of wager size having been dealt with at one fell swoop by
virtue of the fact that each component only contains wagers of
acceptably similar size.
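Splitting the graph into these self-contained networks is the cheap part of the plan: connected components can be found in time linear in the number of vertices and edges with a breadth-first search. A minimal sketch, assuming vertices are numbered 0..n-1 and edges come as index pairs:

```python
from collections import deque

def connected_components(n, edges):
    """Partition vertices 0..n-1 into connected components, given an
    edge list; each component can then be attacked independently."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = [False] * n
    components = []
    for start in range(n):
        if seen[start]:
            continue
        comp, queue = [], deque([start])
        seen[start] = True
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
        components.append(comp)
    return components

connected_components(5, [(0, 1), (1, 2), (3, 4)])
# -> [[0, 1, 2], [3, 4]]
```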


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Algorithm Geeks" group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups-beta.google.com/group/algogeeks
-~----------~----~----~----~------~----~------~--~---
