That seems to assume that the "urge to click" is somehow related to the pattern of word occurrences on a page. This could be true, and it would be interesting to look for a correlation.
You could maybe come up with a general theory for "click attraction" and patterns associated with word occurrence and web browsing in general.

Sent from my iPhone

> On Aug 8, 2014, at 7:44 AM, Ryan Belcher <[email protected]> wrote:
>
> I'm looking at the Criteo Kaggle competition. Each row is data related to
> a single display of an advertisement. You're trying to predict whether
> the ad will be clicked or not.
>
> Am I trying to categorize? Yes and no. I'm trying to predict whether the ad
> will be clicked, but the way I'm trying to do that is by categorizing the
> rows into buckets and calculating a probability based on the category.
>
> I'm not sure how else you'd go about it.
>
>> On Thu, Aug 7, 2014 at 5:44 PM, Jim Bridgewater <[email protected]> wrote:
>> Hi Ryan,
>>
>> For classification problems it sounds like you are headed in the right
>> direction, but I'm unclear about what your objective is. Are you just
>> trying to categorize each row in the data set?
>>
>> > On Thu, Aug 7, 2014 at 1:33 PM, Ryan Belcher <[email protected]> wrote:
>> > I've been playing around with NuPIC for a while and am still trying to
>> > wrap my head around how to use it. Right now I'm playing with some
>> > prediction scenarios where you have a number of input fields and you're
>> > trying to predict one output.
>> >
>> > My understanding is that if the inputs aren't related temporally, then
>> > it's a Spatial Pooling problem. If there are common patterns in the data,
>> > then it may be helpful to create hierarchies of SPs.
>> >
>> > The data I'm looking at right now probably doesn't have common patterns.
>> > It's basically a bunch of categorical data from which you're trying to
>> > predict a boolean outcome. There are about 15M rows in the training set.
>> >
>> > So my thinking is to create one SP where the inputDimensions is wide
>> > enough to accommodate all of the fields and the columnDimensions is sized
>> > so that rows get grouped together.
>> > (If there were 100k columns, then on average 150 rows would be pooled
>> > together.)
>> >
>> > In theory I could run all of the training data through the SP, then run
>> > it through again (without learning) and calculate an outcome probability
>> > for each column. Then I could run the test data through, and its
>> > probability would be the probability of the column it matches.
>> >
>> > Is that a reasonable approach or am I way out in left field?
>> >
>> > Thanks,
>> > Ryan
>> >
>> > _______________________________________________
>> > nupic mailing list
>> > [email protected]
>> > http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>
>> --
>> James Bridgewater, PhD
>> Arizona State University
>> 480-227-9592
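Ryan's "bucket rows, then score each test row with its bucket's empirical click rate" idea can be sketched without NuPIC itself. The sketch below uses a hash of the row's categorical fields as a crude stand-in for the SP's column assignment (a real SP would group *similar* rows, not just identical ones); all function names, bucket counts, and data here are illustrative, not from the thread.

```python
# Illustrative sketch: hash categorical rows into buckets, estimate a
# per-bucket click probability on a first pass, then score test rows by
# the probability of the bucket they fall into.
from collections import defaultdict

def bucket_of(row, n_buckets=100_000):
    """Map a tuple of categorical values to one of n_buckets groups.

    Stand-in for the SP column a row would activate; identical rows
    always land in the same bucket within a single process."""
    return hash(row) % n_buckets

def train(rows, labels, n_buckets=100_000):
    """Count clicks and impressions per bucket (the 'second pass'
    Ryan describes, with learning off)."""
    clicks = defaultdict(int)
    shown = defaultdict(int)
    for row, clicked in zip(rows, labels):
        b = bucket_of(row, n_buckets)
        shown[b] += 1
        clicks[b] += int(clicked)
    return clicks, shown

def predict(row, clicks, shown, n_buckets=100_000, prior=0.5):
    """Score a test row with its bucket's empirical click probability;
    fall back to a prior for buckets never seen in training."""
    b = bucket_of(row, n_buckets)
    if shown[b] == 0:
        return prior
    return clicks[b] / shown[b]

# Toy usage with made-up rows of (site, demographic) categoricals:
rows = [("siteA", "male"), ("siteA", "male"), ("siteB", "female")]
labels = [1, 0, 1]
clicks, shown = train(rows, labels)
p = predict(("siteA", "male"), clicks, shown)
```

One practical caveat with 15M rows and ~100k buckets: exact-match hashing only generalizes to test rows whose field combination appeared in training, whereas the SP's overlap-based pooling would also place near-matches in the same column.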
