Re: Web Analytics Use case?

John Martyniak Tue, 03 Nov 2009 06:10:08 -0800

Benjamin,

That is kind of the exact case for Hadoop.

Hadoop is a system built for handling very large datasets and delivering processed results. HBase is built for ad hoc data, so instead of complicated table joins etc., you have very large rows (multiple columns) holding aggregate data, and then use HBase to return results from those rows.
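
To make the wide-row idea a bit more concrete, here is a minimal sketch in plain Python. It does not use the HBase client API; a dict of dicts stands in for a table, and the row key layout ("keyword|city") and the column names are assumptions made up for illustration.

```python
from collections import defaultdict

# A dict of dicts stands in for an HBase table: row key -> {column: value}.
# The row key layout "keyword|city" and the column names are hypothetical.
table = defaultdict(lambda: defaultdict(int))

def record_session(keyword, city, bought, pages):
    """Fold one finished visitor session into its aggregate row."""
    row = table[f"{keyword}|{city}"]
    row["stats:sessions"] += 1
    if bought:
        row["stats:purchases"] += 1
        for page in pages:
            # One column per page viewed by buyers: the row grows wider,
            # rather than adding rows to a separate join table.
            row[f"pages:{page}"] += 1

def get_row(keyword, city):
    """Single-row lookup by key; no joins are needed at read time."""
    return dict(table.get(f"{keyword}|{city}", {}))

record_session("shoes", "Berlin", True, ["/home", "/cart"])
record_session("shoes", "Berlin", False, ["/home"])
record_session("shoes", "Berlin", True, ["/home"])
```

The point of the layout is that a question like "how many visitors from that keyword and city bought, and what else did they look at" becomes one read by row key, at the cost of deciding the aggregation dimensions up front.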

We currently use Hadoop/HBase to collect and process lots of data, then take the results of that processing to populate a Solr index and a MySQL database, which are then used to feed the front ends. It seems to work pretty well in that it greatly reduces the number of rows and the size of the queries in the DB/index.
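
As a rough illustration of how the processing step cuts down the row count, here is a small map/reduce-style sketch in plain Python. It is not our actual job and does not use the Hadoop API; the record fields (keyword, city, bought) and the function names are assumptions for the example.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical raw records, one per finished visitor session.
sessions = [
    {"keyword": "shoes", "city": "Berlin", "bought": True},
    {"keyword": "shoes", "city": "Berlin", "bought": False},
    {"keyword": "shoes", "city": "Hamburg", "bought": True},
    {"keyword": "boots", "city": "Berlin", "bought": False},
]

def map_session(session):
    """Map step: emit ((keyword, city), (session_count, purchase_count))."""
    key = (session["keyword"], session["city"])
    yield key, (1, 1 if session["bought"] else 0)

def reduce_counts(key, values):
    """Reduce step: sum the counters for one key into a summary row."""
    n_sessions = n_purchases = 0
    for s, p in values:
        n_sessions += s
        n_purchases += p
    return {"keyword": key[0], "city": key[1],
            "sessions": n_sessions, "purchases": n_purchases}

def run_job(records):
    # Stand-in for the shuffle phase: sort so equal keys are adjacent.
    mapped = sorted((kv for r in records for kv in map_session(r)),
                    key=itemgetter(0))
    return [reduce_counts(key, (v for _, v in group))
            for key, group in groupby(mapped, key=itemgetter(0))]

# These summary rows are what would be loaded into the index/database.
summary_rows = run_job(sessions)
```

Four raw sessions collapse into three summary rows here; on real traffic the ratio is far larger, which is where the smaller tables and simpler front-end queries come from.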

We are exploring using HBase to feed the front ends in place of the MySQL DBs. So far the jury is out on the performance, but it does look promising.


-John



On Nov 3, 2009, at 8:28 AM, Benjamin Dageroth wrote:

Hi,
I am currently evaluating whether Hadoop might be an alternative to our current system. We are providing a web analytics solution for very large websites and run every analysis on all collected data - we do not aggregate the data. This results in very large amounts of data being processed for each query. Currently we are using an in-memory database by Exasol with really a lot of RAM, so that it does not take longer than a few seconds, and for more complicated queries not longer than a minute, to deliver the results.
The solution however is quite expensive, and given the growth of data I'd like to explore alternatives. I have read about NoSQL datastores and about Hadoop, but I am not sure whether it is actually a choice for our web analytics solution. We are collecting data via a tracking pixel which gives data to a tracking server, which writes it to disk once the session of a visitor is done. Our current solution has a large number of tables, and the queries running on the data can be quite complex:
How many users who came via that keyword and were from that city actually bought the advertised product? Of these users, what other pages did they look at? Etc.
Would this be a good case for HBase, Hadoop, Map/Reduce and perhaps Mahout?
Thanks for any thoughts,
Benjamin

_______________________________________
Benjamin Dageroth, Business Development Manager
Webtrekk GmbH
Boxhagener Str. 76-78, 10245 Berlin
fon 030 - 755 415 - 360
fax 030 - 755 415 - 100
[email protected]
http://www.webtrekk.com<http://www.webtrekk.de/>
Amtsgericht Berlin, HRB 93435 B
Geschäftsführer Christian Sauer


_______________________________________
