I was looking for some options and came across JethroData.

http://www.jethrodata.com/

It stores the data while maintaining indexes over all the columns, which
seems good, and it claims to have better performance than Impala.

Earlier I had tried Apache Phoenix because of its secondary indexing
feature. The major challenge I faced there was that secondary indexing was
not supported for the bulk loading process.
Only the sequential loading process supported secondary indexes, and it
took longer.


Any comments on this?




On Thu, Mar 26, 2015 at 5:59 PM, kundan kumar <iitr.kun...@gmail.com> wrote:

> I was looking for some options and came across
>
> http://www.jethrodata.com/
>
> On Thu, Mar 26, 2015 at 5:47 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>
>> You can also preaggregate results for the users' queries. Depending
>> on what queries they use, this might be necessary for any underlying
>> technology.
>> On 26 March 2015 at 11:27, "kundan kumar" <iitr.kun...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I need to store terabytes of data which will be used by BI tools like
>>> QlikView.
>>>
>>> The queries can filter on any column.
>>>
>>> Currently, we are using Redshift for this purpose.
>>>
>>> I am trying to explore alternatives to Redshift.
>>>
>>> Is it possible to get better performance with Spark than with
>>> Redshift?
>>>
>>> If yes, please suggest the best way to achieve this.
>>>
>>>
>>> Thanks!!
>>> Kundan
>>>
>>
>
