Potential use case for Ignite

Andy Zelinski Mon, 05 Oct 2015 16:39:13 -0700

I am looking to evaluate Ignite/GridGain to turn an iterative batch
computation job into a user-facing, hot request-response app. as a general
question, has this type of thing been attempted before? more specifically,
and this may still be too vague, which tunable parameters (cluster config,
partitioning, data loading, eviction policy settings, etc.) do you expect
to be most important to get right to enable this?


here is a detailed toy example to clearly illustrate.

imagine we currently have a two-phase recommender system. the first phase
(typical ML recommender algorithms) pares down an entire repository (10e6 to
10e7 objects) of entities (movies, songs, readables, etc.) to a smaller
list of likely candidates (10e4-10e5 objects) for each user/group of users.

the second phase, currently, produces a list of 10 recommendations by
iteratively assigning a score to every object in the candidate list and
selecting the top score. to assign a score, some info about the user's
behavior over the last week is gathered as variables to feed into the
iterative algorithm.
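
to make the second phase concrete, here is a minimal Java sketch of one possible reading of it: a greedy loop that rescans every remaining candidate each round and keeps the single best one. the `Scorer` interface and `GreedySelector` class are hypothetical names invented for illustration; the real scoring function is not described here, and the assumption that a score may depend on the items already picked is mine.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scoring hook (an assumption): the score of a candidate may
// depend on what has already been picked in earlier rounds.
interface Scorer {
    double score(int candidate, List<Integer> alreadyPicked);
}

class GreedySelector {
    // Picks up to k candidates, one per round. Each round scores every
    // candidate that has not been picked yet and keeps the highest scorer.
    static List<Integer> select(int[] candidates, int k, Scorer scorer) {
        boolean[] used = new boolean[candidates.length];
        List<Integer> picked = new ArrayList<>();
        for (int round = 0; round < k && round < candidates.length; round++) {
            int bestIdx = -1;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (int i = 0; i < candidates.length; i++) {
                if (used[i]) continue;
                double s = scorer.score(candidates[i], picked);
                if (bestIdx < 0 || s > bestScore) {
                    bestIdx = i;
                    bestScore = s;
                }
            }
            used[bestIdx] = true;
            picked.add(candidates[bestIdx]);
        }
        return picked;
    }
}
```

under this reading, 10 rounds over 100,000 candidates means roughly 1,000,000 score calls per request, so the cost of a single score call largely decides whether a latency target is reachable.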

both phases are Spark jobs. the algorithms for the score assigning and
iterating are elegantly expressed with Spark's abstractions.

Now, however, we want the second phase to be an on-demand service that
backs a user app. instead of gleaning info about the user behind the scenes
with no real time limit to complete tasks, the user can interact with the
algorithm directly. "here's my mood score, here's my last read book, i want
a list of 5 books, go". we would need sub-second latency for the algorithm
to score and select from the list of 100,000 or so items.

thoughts so far: translate the Spark map transformations / map-reduce
algorithm into a fork-join computation (use cores!). partition the cluster
so that users are distributed evenly.

thanks!
