Wouldn't it be useful to store your data somewhere structured (Cassandra is obviously an option) and then use MapReduce to store statistics?
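
For the smaller slices, the "simple aggregation functions" Dan mentions in the quoted message below could be as small as the following. This is only a rough sketch in plain Python: the documents are assumed to have already been fetched by a Solr/Solandra query and turned into dicts, and the field names and helper names (unique_values, average) are made up for illustration, not part of any Solr or Cassandra API.

```python
# Hypothetical documents, shaped like the results a Solr/Solandra query
# might return once parsed; the field names are assumptions.
docs = [
    {"playername": "Smith", "teamname": "Reds", "inning": 1, "outcome": "single"},
    {"playername": "Jones", "teamname": "Reds", "inning": 3, "outcome": "strikeout"},
    {"playername": "Smith", "teamname": "Reds", "inning": 5, "outcome": "home run"},
]


def unique_values(docs, field):
    """Return the distinct values of `field`, in first-seen order."""
    seen = {}
    for doc in docs:
        if field in doc:
            seen.setdefault(doc[field], None)
    return list(seen)


def average(docs, field):
    """Return the mean of a numeric `field`, or None if no document has it."""
    values = [doc[field] for doc in docs if field in doc]
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    print(unique_values(docs, "playername"))
    print(average(docs, "inning"))
```

Everything here runs in the client's memory, so it only makes sense while the result set stays small; past that point MapReduce is the better fit.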

2011/6/22 Jake Luciani <jak...@gmail.com>:
> Well solandra is running Cassandra so you can use Cassandra as you do today,
> but index some of the data in solr.
>
> On Jun 22, 2011, at 3:41 AM, Sasha Dolgy <sdo...@gmail.com> wrote:
>
>> First, thanks everyone for the input. Appreciate it. The number
>> crunching would already have been completed, and all statistics per
>> game defined, and inserted into the appropriate CF/row/cols ...
>>
>> So, that being said, Solandra appears to be the right way to go ...
>> except, this would require that my current application(s) be rewritten
>> to consume Solandra and no longer Cassandra ... "Your application
>> isn't aware of Cassandra only Solr." or can I have the best of both
>> worlds? Search is only one aspect of the consumer experience. If a
>> consumer wanted to view a 'card' for a baseball player, all the
>> information would be retrieved directly from Cassandra to build that
>> card and search wouldn't be required...
>>
>> -sd
>>
>> On Tue, Jun 21, 2011 at 9:50 PM, Jake Luciani <jak...@gmail.com> wrote:
>>> Right, Solr will not do anything other than basic aggregations (facets) and
>>> range queries.
>>>
>>> On Tue, Jun 21, 2011 at 3:16 PM, Dan Kuebrich <dan.kuebr...@gmail.com> wrote:
>>>>
>>>> Solandra is indeed distributed search, not distributed number-crunching.
>>>> As a previous poster said, you could imagine structuring the data in a
>>>> series of documents with fields containing playername, teamname, position,
>>>> location, day, time, inning, at bat, outcome, etc. Then you could query to
>>>> get a slice of the data that matches your predicate and run statistics on
>>>> that subset.
>>>> The statistics would have to come from other code (eg. R), but solr will
>>>> filter it for you. So, this approach only works if the slices are
>>>> reasonably small, but gives you great granularity on search as long as
>>>> you put all the info in. The users of this datastore (or you) must be
>>>> willing to write their own simple aggregation functions ("show me only
>>>> the unique player names returned by this solr query", "show me the
>>>> average of field X returned by this solr query", ...)
>>>> If the numbers of results are too great, MR may be the way to go.
>

--
Santiago Basulto.-