Hi Miloš,

Thanks for the reply. Redis HLL is definitely worth a try. Has anyone done that before? My understanding is that the Redis data structure is key-value, where the offset represents the user ID, for example. My concern is whether Redis would add another bottleneck to the whole system.
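For reference, the "offset represents the user ID" model describes a Redis bitmap (SETBIT / BITCOUNT), which is exact but grows with the largest user ID. Redis HLL (PFADD / PFCOUNT / PFMERGE) instead hashes each element into a fixed set of registers, so a key stays at roughly 12 KB regardless of cardinality, at the cost of a small estimation error. The following is a minimal sketch of that idea in plain Python, not Redis's or java-hll's implementation; the class name TinyHLL, the SHA-1 based hash, and the register count are all assumptions chosen for illustration.

```python
import hashlib
import math


class TinyHLL:
    """Illustrative HyperLogLog; hypothetical, not the Redis implementation."""

    def __init__(self, p=12):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers (4096 for p=12)
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash taken from SHA-1 (an assumption; any well-mixed hash works).
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64-p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # The union of two sketches is the register-wise maximum.
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)      # bias constant, valid for m >= 128
        est = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            est = self.m * math.log(self.m / zeros)
        return round(est)
```

Adding the same user twice does not change the registers, which is what makes the structure suitable for distinct counts. Because merge is a register-wise maximum, sketches kept per time bucket can be combined later to cover a longer window.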

In addition, I think TridentReach does unique counting as well. My question is how to use an external timestamp to define time windows, since I have not seen any sample code for timestamps.

Thanks,

Alec

On Aug 21, 2014, at 4:36 PM, Miloš Solujić <milos.solu...@gmail.com> wrote:

> Alec,
>
> For this one, I'd recommend Redis HLL like Gna explained earlier.
>
> On 21 Aug 2014 23:31, "Sa Li" <sa.in.v...@gmail.com> wrote:
> Thanks for all the replies.
>
> I have considered integrating the java-hll package
> (https://github.com/aggregateknowledge/java-hll), which uses the hash function
> murmur3_128 from Google's Guava. I am getting a lot of exceptions when
> including it, and I wonder whether this hash is compatible with the
> distributed mechanism of Storm (I might be naive).
>
> Another thing I am considering is TridentReach, which counts the unique
> people exposed to a URL page. I am thinking of combining TridentReach with
> kafkaSpout. My question: should I create a fixed-size HashMap to hold each
> URL and an array of its visitors, so that the fixed size of the hash map
> represents the window size of the sliding window? I wonder if this is
> correct.
>
> Thanks,
>
> Alec
>
> On Aug 21, 2014, at 11:18 AM, Nima Movafaghrad <nima.movafagh...@oracle.com>
> wrote:
>
>> Alec,
>>
>> You can use something like HyperLogLog or Bloom filters to do unique and/or
>> distinct counting. Just create a bolt that does that.
>>
>> Nima
>>
>> From: Sa Li [mailto:sa.in.v...@gmail.com]
>> Sent: Wednesday, August 20, 2014 2:45 PM
>> To: user@storm.incubator.apache.org
>> Subject: distinct counting
>>
>> Hi, all
>>
>> I know Storm does a good job on counting and other aggregation jobs. I
>> wonder if anyone has done distinct counting in Storm, and how would you set
>> the sliding time window?
>>
>> Thanks,
>>
>> Alec
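On the external-timestamp question in this thread: one common approach is to derive a bucket number from the timestamp carried in each event, rather than from the arrival time, and keep one distinct-set per (URL, bucket). A sliding window is then the union of the most recent buckets. The following is a minimal sketch in plain Python, not Storm or Trident code; the class name WindowedDistinct, its parameters, and the field names are hypothetical. In a topology the same idea might carry over by emitting the bucket number as an extra field and grouping on it, and the per-bucket sets could be swapped for HLL sketches if memory is a concern.

```python
from collections import defaultdict


class WindowedDistinct:
    """Distinct users per URL over a sliding window of event-time buckets."""

    def __init__(self, bucket_seconds=60, buckets_per_window=5):
        self.bucket_seconds = bucket_seconds
        self.buckets_per_window = buckets_per_window
        self.buckets = defaultdict(set)   # (url, bucket number) -> set of user IDs

    def add(self, event_ts, url, user_id):
        # The bucket comes from the event's own timestamp (seconds since epoch),
        # so late or out-of-order events still land in the right bucket.
        bucket = int(event_ts // self.bucket_seconds)
        self.buckets[(url, bucket)].add(user_id)

    def distinct(self, url, now_ts):
        # Union the buckets covering the window that ends at now_ts.
        newest = int(now_ts // self.bucket_seconds)
        users = set()
        for b in range(newest - self.buckets_per_window + 1, newest + 1):
            users |= self.buckets.get((url, b), set())
        return len(users)

    def expire(self, now_ts):
        # Drop buckets that can no longer fall inside any window ending at or after now_ts.
        oldest = int(now_ts // self.bucket_seconds) - self.buckets_per_window + 1
        for key in [k for k in self.buckets if k[1] < oldest]:
            del self.buckets[key]
```

With this layout the window size is fixed by the number of buckets kept, not by the size of the hash map, so the map can hold any number of URLs while expire() bounds how much history is retained.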