I like the Tez engine for Hive (aka the Stinger initiative):
- faster than the MR engine, especially for complex queries with lots of nested sub-queries
- stable
- min latency is 5-7 sec (0 sec for select count(*) ...)
- able to process huge datasets (not limited by RAM, as Spark is)

On Mon, Feb 2, 2015 at 6:00 PM, Samuel Marks <[email protected]> wrote:

> Maybe you're right, and what I should be doing is throwing in connectors
> so that data from regular databases is pushed into HDFS at regular
> intervals, wherein my "fancier" analytics can be run across larger
> data-sets.
>
> However, I don't want to decide straightaway, for example, Phoenix + Spark
> may be just the combination I am looking for.
>
> Best,
>
> Samuel Marks
> http://linkedin.com/in/samuelmarks
>
> On Mon, Feb 2, 2015 at 5:14 PM, Jörn Franke <[email protected]> wrote:
>
>> Hello,
>>
>> I think you have to think first about your functional and non-functional
>> requirements. You can scale "normal" SQL databases as well (cf CERN or
>> Facebook). There are different types of databases for different purposes -
>> there is no one fits it all. At the moment, we are a few years away from a
>> one-fits-it-all database that leverages AI etc to automatically scale,
>> optimize etc processing, storage and network. Until then you will have to
>> do the math depending on your requirements.
>> Once you make them more precise, we will be able to help you more.
>>
>> Cheers
>> On Feb 2, 2015 at 06:08, "Samuel Marks" <[email protected]> wrote:
>>
>> Well what I am seeking is a Big Data database that can work with Small
>> Data also. I.e.: scalable from one node to vast clusters; whilst
>> maintaining relatively low latency throughout.
>>
>> Which fit into this category?
>>
>> Samuel Marks
>> http://linkedin.com/in/samuelmarks
>>
>
