Hi Scott, thank you for the response!
On Fri, Nov 22, 2013 at 11:00 PM, Scott Purdy <sc...@numenta.org> wrote:
>
> On Fri, Nov 22, 2013 at 6:55 AM, Marek Otahal <markota...@gmail.com> wrote:
>
>> Guys,
>>
>> ...
>> The top cap is (100 choose 20) which is some crazy number of 5*10^20. All
>> these SDRs will be sparse, but not distributed (right??) because a change
>> in one bit will already be another pattern.
>
> The number of possible unique SP outputs is (1000 choose 20), or ~10^41.

Yes, I missed a zero there.

> These are all 2% sparsity. Changing one input bit doesn't necessarily
> result in a different SP output, though. There could be many more input
> bit patterns than combinations of 20 SP columns. For instance, 1000 input
> bits have 10^300 possible patterns. And regardless of that, the semantic
> information learned by the SP is distributed across the 1000 columns, so
> it would still be distributed.

I wasn't clear there - I was thinking top-down, so a 1-bit change to the
SP's output would be the representation of a different learned input
pattern.

This raises a question: is the "robustness" property of SDRs related to the
input bits (I perturb some of the input bits and still expect to get the
same SP output), or to the output ON bits? That is, when the representations
are drawn from (1000 choose 20), even if I flip 3-5 of the output bits, is
there still a good chance the result is closest to my original input and not
to some other input? And is "robust" == "distributed"? Or does distributed
mean that 2^1000 states are represented by (only) (1000 choose 20) states?

>> So my question is, what is the "usable" capacity where all outputs are
>> still sparse (they all are) and distributed (= robust to noise). Is there
>> a percentage of bits (say, 20% of bits noisy while the pattern is still
>> recognized) that is still considered distributed/robust?
>
> This is still a valid question for real world datasets but is completely
> dependent on the particular dataset.
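As an aside, a quick sanity check on these numbers and on the
flip-3-5-output-bits scenario - just a sketch using only the Python standard
library (math.comb needs Python 3.8+), nothing NuPIC-specific:

```python
import math

n, w = 1000, 20                      # 1000 columns, 20 active (2% sparsity)

# Number of distinct SDRs with w of n bits active:
print(math.comb(100, 20))            # ~5.4e20 (the n=100 figure)
print(math.comb(n, w))               # ~3.4e41 (the n=1000 figure)

# Number of possible binary patterns on 1000 input bits: 2^1000 ~ 10^301,
# vastly more than the number of distinct 20-of-1000 outputs.
assert 2 ** 1000 > 10 ** 300

# Robustness of the output code: two *random* SDRs share w^2/n bits on
# average (hypergeometric overlap), far below the 15 bits that survive
# when 5 of an SDR's 20 ON bits are flipped off.
print(w * w / n)                     # expected overlap: 0.4 bits

# P(a random SDR overlaps a fixed one in >= 15 bits), i.e. the chance of
# confusing a 5-bit-damaged code with an unrelated one:
p = sum(math.comb(w, k) * math.comb(n - w, w - k)
        for k in range(15, w + 1)) / math.comb(n, w)
print(p)                             # astronomically small
```

So at this sparsity, flipping a few output bits leaves the damaged code
overwhelmingly closer to its original than to any other random code.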
> For instance, regardless of the SP parameters, the dataset may have 10000
> input bits but only ~50 of them change regularly. The tolerance to noise
> at this point is limited by the dataset.

Nice point, I hadn't considered that. And what if all of the bits carry
information? Which is what I believe happens at the higher levels of the
regions (?) - the useless data is cropped out.

Do we have some data (from biology?) showing that there has to be at least,
say, 5% (= R) of robustness in the bits of the output SDR (e.g. because of
errors at the synapses, etc.)? So, for example, input_A causes SDR_A, and if
I turn 5 of the 20 ON bits off, input_A should still be the most likely
match. This would lower the maximum number of patterns, because for
(1000 choose 20) I'd effectively require (1000 choose 25).

>> Or is it the other way around, and the SP tries to maximize this
>> robustness for the given number of patterns it is presented with? If I
>> feed it a huge number of patterns, will I pay the obvious price of
>> reducing the border between two patterns?
>
> I think the answer to the first question is yes but to the second no. The
> SP attempts to maximize the distance between the column input bits
> relative to the actual data (rather than the entire input space). But
> feeding many patterns in doesn't necessarily have an impact on this. If
> the input data are not random, then the more data fed into the SP, the
> more I would expect the columns to converge to the optimal
> representations.

This is true, I was (falsely) assuming random input. But in real use cases
the SP will find "patterns" in the input patterns, so even for larger
amounts of input data we may actually see a drop in entropy, as the SP will
find some rule that separates the inputs.

>> Either way, is there a reasonable way to measure what I defined as
>> capacity?
>> I was thinking something like:
>>
>> for 10 repetitions:
>>     for p in patterns_to_present:
>>         sp.input(p)
>>
>> sp.disableLearning()
>> for p in patterns_to_present:
>>     # what should the percentage be? see above
>>     p_mod = randomize_some_percentage_of_pattern(p, percentage)
>>     if sp.input(p) == sp.input(p_mod):
>>         # ok, it's the same, pattern learned
>
> This seems like a good methodology for determining how tolerant the model
> is to noise for this particular dataset. The amount of data fed in before
> disabling learning will have a large impact on the noise tolerance (but
> with diminishing returns).

I think your answers have helped me clear this up, so a short summary: does
robustness to noise correlate reciprocally with the total number of input
patterns I'm able to distinguish? (1/(robustness tolerance) ~ #patterns)
From what has been said, I think it does not have to for real-world
datasets.

PS: Is there a (lower-bound) limit on the number of columns in the SP?
Would a 20-column SP work? That way, I could achieve (20 choose 3) and
reach the state of an info-full SP.

Regards,
Mark

>> Thanks for your replies,
>> Mark
>>
>> --
>> Marek Otahal :o)
>>
>> _______________________________________________
>> nupic mailing list
>> nupic@lists.numenta.org
>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org

--
Marek Otahal :o)
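PPS: for the curious, here is roughly how the measurement loop above might
look as runnable code. ToySP, randomize_fraction, and noise_tolerance are
hypothetical names of mine; ToySP is only a stand-in (fixed random
connections, top-k winners, no learning), and the real NuPIC SpatialPooler
exposes compute() rather than input()/disableLearning() - so this sketches
the methodology, not the actual API:

```python
import random


class ToySP:
    """Hypothetical stand-in for a spatial pooler: each column has a fixed
    random set of input connections, and the k columns with the largest
    overlap with the input win. No learning - this is NOT the NuPIC SP."""

    def __init__(self, n_in=1000, n_cols=1000, k=20, seed=42):
        rng = random.Random(seed)
        self.k = k
        self.conn = [frozenset(rng.sample(range(n_in), 50))
                     for _ in range(n_cols)]

    def output(self, pattern):
        """Map a set of active input bits to the frozenset of k winners."""
        scores = sorted(((len(c & pattern), i)
                         for i, c in enumerate(self.conn)), reverse=True)
        return frozenset(i for _, i in scores[:self.k])


def randomize_fraction(pattern, n_in, frac, rng):
    """Turn off frac of the active bits and turn on random bits elsewhere
    (collisions possible - fine for a sketch)."""
    pattern = set(pattern)
    for b in rng.sample(sorted(pattern), int(len(pattern) * frac)):
        pattern.discard(b)
        pattern.add(rng.randrange(n_in))
    return pattern


def noise_tolerance(sp, patterns, frac, n_in, trials=50, seed=1):
    """Fraction of noisy presentations whose SP output is still closest
    (by column overlap, ties allowed) to its own clean output."""
    rng = random.Random(seed)
    outs = [sp.output(p) for p in patterns]
    hits = 0
    for _ in range(trials):
        i = rng.randrange(len(patterns))
        noisy = sp.output(randomize_fraction(patterns[i], n_in, frac, rng))
        best = max(len(o & noisy) for o in outs)
        hits += (len(outs[i] & noisy) == best)
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(0)
    n_in = 1000
    patterns = [frozenset(rng.sample(range(n_in), 20)) for _ in range(10)]
    sp = ToySP(n_in=n_in)
    for frac in (0.0, 0.1, 0.2, 0.5):
        print(frac, noise_tolerance(sp, patterns, frac, n_in))
```

With no noise the tolerance is 1.0 by construction; sweeping frac upward
shows where recognition starts to degrade for a given set of patterns.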