I answered on SO: The only thing I can think of that sounds like this problem is PageRank. It's computed by a sort of iterative simluation. Each page has some influence (color) which flows via its links (socks its washed with) and at some point the page influence reaches a steady state (final color). You can look up PageRank algorithms but it is essentially a matter of calculating eigenvectors of a big, erm, sock color matrix.
On Sat, Mar 3, 2012 at 5:25 PM, Zoltán Tóth-Czifra <t...@tcz.hu> wrote: > Hi, > > I posted this on stack overflow, but I've been informed that the mailing > list is the appropriate forum to ask questions like this. > > What I need is actually just a hint where I can start, as I'm not sure > which direction to go. Although this is not a classic machine learning > problem, AFAIK Mahout is not only for those. I was thinking it's an > appropriate tool, because I work with a large set of data and the output > has similar characteristics than a classic Mahout job. I think a problem > below can be implemented with MapReduce, but I don't know if there is any > existing, similar algorithm. I need this to be scalable, that's why I think > the Mahout on top of Hadoop's MapReduce power could be an appropriate > solution. > > Quickly what I want to do is: > > The final goal is to define one scalar (a "score") of each one of a set of > entities based on some "known" entities. The entities interact with each > other, known scores influence and define the unknown ones. You can imagine > with the following example. > > I have a lot if white clothes and a few pieces of colorful ones; red, blue, > green... I put them into the washing machine. I want to know what colors > the white ones will get after the wash. > > Things to take into account: > > - we make a series of washing with different "actors"... some clothes > are washed in the 1st and 3rd washing, some of them only in the 2nd, some > of them are washed in all > - in consecutive washes the clothes that were white before but now > colored also influence the rest, but not as strong (as they are not as > colored) > - some colors don't "color" as much as others. for example red has a > strong effect on most of the clothes, but green not so much > - coloring effect also depends on how many clothes are in one washing. > If you wash a red shirt with a white t-shirt, it gets much more colored, > than if there were 100 other white t-shirt > - clothes don't "lose" their color when influencing others > > You can see that while calculating, entities actually have 2 assigned > scalars: > > - the color hue (this also defines "coloring power" as mentioned above). > The hue can be represented as a number, from 0 to 1, let's say. The > coherence between the coloring power and the color number is not linear. It > is more like the ends of the scale have more coloring power (0 and 1) while > the middle (0.5) has less > - the color "lightness" (how much an entity is colored, for originally > colored clothes it's 1, for white ones it's 0), which in the same time also > defines coloring power regardless of the hue > > So, again, what I know: > > - which clothes where washed in which consecutive washing > - I know the original color of some of them, the rest is white in the > beginning > > What I want to know: > > > - the hue of all clothes in the end of the washing > > The problem is that I don't know what (type) of algorithm should I start > with. If you were so kind to read so far, please suggest me something (or > further reading). > > Obviously I don't ask for any detailed thing, again, only hints on the > algorithm. > > Thank you!