I was looking for some options and came across JethroData. http://www.jethrodata.com/
It stores the data with indexes maintained over all the columns, which seems good, and it claims to have better performance than Impala.

Earlier I had tried Apache Phoenix because of its secondary indexing feature. The major challenge I faced there was that secondary indexing was not supported by the bulk loading process. Only the sequential loading process supported secondary indexes, and it took longer.

Any comments on this?

On Thu, Mar 26, 2015 at 5:59 PM, kundan kumar <iitr.kun...@gmail.com> wrote:

> I was looking for some options and came across
> http://www.jethrodata.com/
>
> On Thu, Mar 26, 2015 at 5:47 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>
>> You can also preaggregate results for the queries by the user. Depending
>> on what queries they use, this might be necessary for any underlying
>> technology.
>>
>> On 26 March 2015 at 11:27, "kundan kumar" <iitr.kun...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I need to store terabytes of data which will be used for BI tools like
>>> QlikView.
>>>
>>> The queries can filter on any column.
>>>
>>> Currently, we are using Redshift for this purpose.
>>>
>>> I am trying to explore options other than Redshift.
>>>
>>> Is it possible to gain better performance in Spark as compared to
>>> Redshift?
>>>
>>> If yes, please suggest the best way to achieve this.
>>>
>>> Thanks!!
>>> Kundan