Lukas,

The strongest alternative for this kind of application (and the normal
choice for large-scale applications) is on-line gradient descent learning
with L_1 or L_1 + L_2 regularization.  The typical goal is to predict
some outcome (click, purchase, or signup) from a variety of
large-vocabulary features.  As such, association mining is usually just a
pre-processing step before actual learning is applied.  There is some
indication that an efficient sparse on-line gradient descent algorithm
applied to features and their combinations could do just as well,
especially if the learning on combinations is applied in several passes.
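To make the idea concrete, here is a minimal sketch (not any particular
library's implementation) of sparse on-line logistic regression with L_1
regularization applied by soft-thresholding after each gradient step; the
feature names, learning rate, and regularization strength are illustrative
assumptions:

```python
import math

class SparseOnlineLR:
    """Toy sparse on-line logistic regression with L_1 shrinkage."""

    def __init__(self, lr=0.1, l1=0.01):
        self.w = {}    # sparse weight vector: feature -> weight
        self.lr = lr   # learning rate
        self.l1 = l1   # L_1 regularization strength

    def predict(self, features):
        # features: sparse dict of feature -> value
        z = sum(self.w.get(f, 0.0) * v for f, v in features.items())
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, label):
        p = self.predict(features)
        g = p - label  # gradient of log loss w.r.t. the raw score
        shrink = self.lr * self.l1
        for f, v in features.items():
            w = self.w.get(f, 0.0) - self.lr * g * v
            # soft-threshold toward zero; this is what keeps the model sparse
            if w > shrink:
                self.w[f] = w - shrink
            elif w < -shrink:
                self.w[f] = w + shrink
            else:
                self.w.pop(f, None)  # weight hits exactly zero: drop feature

model = SparseOnlineLR()
for _ in range(20):
    model.update({"visited_neg_comment": 1.0}, 0)   # never enters order form
    model.update({"visited_product_page": 1.0}, 1)  # does order
```

After a few passes the model separates the two behaviors, and any feature
whose weight shrinks to zero simply disappears from the weight map, which
is why this stays cheap even with a large vocabulary.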

These on-line algorithms have the virtue of being extremely fast and,
with feature sharding, have substantial potential for parallel
implementation.
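By feature sharding I mean something like the following sketch (the shard
count and feature names are hypothetical): hash each feature name to one
of K shards so that K workers can each own a disjoint slice of the weight
vector and apply their gradient updates independently:

```python
import zlib

NUM_SHARDS = 4  # illustrative; in practice one shard per worker

def shard_of(feature):
    # stable hash -> shard id; zlib.crc32 is deterministic across runs
    return zlib.crc32(feature.encode()) % NUM_SHARDS

def partition(features):
    # group a sparse example's features by the shard that owns them
    shards = {}
    for f, v in features.items():
        shards.setdefault(shard_of(f), {})[f] = v
    return shards

parts = partition({"url:blog/neg-comment": 1.0,
                   "ua:firefox": 1.0,
                   "geo:cz": 1.0})
```

Since each feature lives on exactly one shard, no locking is needed on the
weights; only the per-example dot product has to be combined across shards.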

What do you think about these two methods?  Can you compare them?

On Fri, Apr 9, 2010 at 4:26 AM, Lukáš Vlček <lukas.vl...@gmail.com> wrote:

>  One
> example would be analysis of a click stream, where you can learn that
> people who visit a negative comment on a product blog never enter the
> order form. I'm not saying this is the best example, but in general this
> is the essence of it. You simply need to take all possible values from
> the transaction into account, even if a value is missing from the market
> basket....
>
> The biggest challenge in implementing this would be the fact that the
> analysis has to deal with all the data (not just the most frequent
> patterns) and all combinations. It is very resource expensive.
>