Another thing:

Let's say we already have structured data. Is the way to load it into
HDFS to turn it into files?
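
As a rough sketch of what I mean by "turning it into files", assuming the structured data is already available as rows in memory: the function name export_rows, the sample rows, the file name users.psv and the HDFS path in the comment are all made up for illustration, and the pipe delimiter just follows the file layout Joaquin described.

```python
import csv
import os
import tempfile


def export_rows(rows, header, path):
    """Write structured rows to a pipe-delimited text file and return its path."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="|")
        writer.writerow(header)   # first line: column names
        writer.writerows(rows)    # one line per record
    return path


# Hypothetical sample data standing in for rows read from a database.
header = ["id", "name", "country"]
rows = [(1, "alice", "ID"), (2, "bob", "UK")]

out_path = export_rows(rows, header, os.path.join(tempfile.mkdtemp(), "users.psv"))

# The resulting file could then be copied into HDFS, for example with:
#   hdfs dfs -put users.psv /data/users/
```

Spark (or any other engine) would then read the files from that HDFS directory, if I understand the approach correctly.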

Cheers

On Sun, Oct 23, 2016 at 6:18 PM, Welly Tambunan <if05...@gmail.com> wrote:

> So basically you will store those files in HDFS and use Spark to process
> them?
>
> On Sun, Oct 23, 2016 at 6:03 PM, Joaquin Alzola
> <joaquin.alz...@lebara.com> wrote:
>
>>
>>
>> I think what Ali mentions is correct:
>>
>> If you need a lot of queries that require joins, or complex analytics of
>> the kind that Cassandra isn't suited for, then HDFS / HBase may be better.
>>
>>
>>
>> We have files in which one line contains 500 fields (separated by pipes),
>> and each of these fields is particularly important.
>>
>> Cassandra will not manage that, since you would need 500 indexes. HDFS is
>> the proper way.
>>
>>
>>
>>
>>
>> *From:* Welly Tambunan [mailto:if05...@gmail.com]
>> *Sent:* 23 October 2016 10:19
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Hadoop vs Cassandra
>>
>>
>>
>> I like the multi-data-centre resilience in Cassandra.
>>
>> I think that's a plus one for Cassandra.
>>
>> Ali, complex analytics can be done in Spark, right?
>>
>> On 23 Oct 2016 4:08 p.m., "Ali Akhtar" <ali.rac...@gmail.com> wrote:
>>
>> >
>>
>> > I would say it depends on your use case.
>> >
>> > If you need a lot of queries that require joins, or complex analytics
>> > of the kind that Cassandra isn't suited for, then HDFS / HBase may be
>> > better.
>> >
>> > If you can work with the Cassandra way of doing things (creating new
>> > tables for each query you'll need to do, duplicating data, doing extra
>> > writes for faster reads), then Cassandra should work for you. It is easier
>> > to set up and do devops with, in my experience.
>> >
>> > On Sun, Oct 23, 2016 at 2:05 PM, Welly Tambunan <if05...@gmail.com>
>> > wrote:
>>
>> >>
>>
>> >> I mean HDFS and HBase.
>> >>
>> >> On Sun, Oct 23, 2016 at 4:00 PM, Ali Akhtar <ali.rac...@gmail.com>
>> >> wrote:
>>
>> >>>
>>
>> >>> By Hadoop do you mean HDFS?
>> >>>
>> >>>
>> >>>
>> >>> On Sun, Oct 23, 2016 at 1:56 PM, Welly Tambunan <if05...@gmail.com>
>> >>> wrote:
>>
>> >>>>
>>
>> >>>> Hi All,
>> >>>>
>> >>>> I read the following comparison between Hadoop and Cassandra. The
>> >>>> conclusion seems to be that we use Hadoop for the data lake (cold data)
>> >>>> and Cassandra for hot data (real-time data).
>> >>>>
>> >>>> http://www.datastax.com/nosql-databases/nosql-cassandra-and-hadoop
>> >>>>
>> >>>> My question is, can we just use Cassandra to rule them all?
>> >>>>
>> >>>> What we are trying to achieve is to minimize the moving parts in our
>> >>>> system.
>> >>>>
>> >>>> Any response would be really appreciated.
>> >>>>
>> >>>>
>> >>>> Cheers
>> >>>>
>> >>>> --
>> >>>> Welly Tambunan
>> >>>> Triplelands
>> >>>>
>> >>>> http://weltam.wordpress.com
>> >>>> http://www.triplelands.com <http://www.triplelands.com/blog/>
>> >>>
>> >>>
>> >>
>> >>
>> >
>> >
>>
>
>
>



-- 
Welly Tambunan
Triplelands

http://weltam.wordpress.com
http://www.triplelands.com <http://www.triplelands.com/blog/>
