Another thing: let's say we already have structured data. Is the way to load it into HDFS to turn it into files?
Cheers

On Sun, Oct 23, 2016 at 6:18 PM, Welly Tambunan <if05...@gmail.com> wrote:

> So basically you will store those files in HDFS and use Spark to process
> them?
>
> On Sun, Oct 23, 2016 at 6:03 PM, Joaquin Alzola <joaquin.alz...@lebara.com>
> wrote:
>
>> I think what Ali mentions is correct:
>>
>> If you need a lot of queries that require joins, or complex analytics of
>> the kind that Cassandra isn't suited for, then HDFS / HBase may be better.
>>
>> We have files in which one line contains 500 fields (separated by pipe)
>> and each of these fields is particularly important.
>>
>> Cassandra will not manage that since you will need 500 indexes. HDFS is
>> the proper way.
>>
>> *From:* Welly Tambunan [mailto:if05...@gmail.com]
>> *Sent:* 23 October 2016 10:19
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Hadoop vs Cassandra
>>
>> I like the multi data centre resilience in Cassandra.
>>
>> I think that's plus one for Cassandra.
>>
>> Ali, complex analytics can be done in Spark, right?
>>
>> On 23 Oct 2016 4:08 p.m., "Ali Akhtar" <ali.rac...@gmail.com> wrote:
>>
>> > I would say it depends on your use case.
>> >
>> > If you need a lot of queries that require joins, or complex analytics
>> > of the kind that Cassandra isn't suited for, then HDFS / HBase may be
>> > better.
>> >
>> > If you can work with the Cassandra way of doing things (creating new
>> > tables for each query you'll need to do, duplicating data, doing extra
>> > writes for faster reads), then Cassandra should work for you. It is
>> > easier to set up and do dev ops with, in my experience.
>> >
>> > On Sun, Oct 23, 2016 at 2:05 PM, Welly Tambunan <if05...@gmail.com>
>> > wrote:
>> >>
>> >> I mean HDFS and HBase.
>> >>
>> >> On Sun, Oct 23, 2016 at 4:00 PM, Ali Akhtar <ali.rac...@gmail.com>
>> >> wrote:
>> >>>
>> >>> By Hadoop do you mean HDFS?
>> >>>
>> >>> On Sun, Oct 23, 2016 at 1:56 PM, Welly Tambunan <if05...@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> Hi All,
>> >>>>
>> >>>> I read the following comparison between Hadoop and Cassandra. The
>> >>>> conclusion seems to be that we use Hadoop for the data lake (cold
>> >>>> data) and Cassandra for hot data (real-time data).
>> >>>>
>> >>>> http://www.datastax.com/nosql-databases/nosql-cassandra-and-hadoop
>> >>>>
>> >>>> My question is, can we just use Cassandra to rule them all?
>> >>>>
>> >>>> What we are trying to achieve is to minimize the moving parts in
>> >>>> our system.
>> >>>>
>> >>>> Any response would be really appreciated.
>> >>>>
>> >>>> Cheers
>> >>>>
>> >>>> --
>> >>>> Welly Tambunan
>> >>>> Triplelands
>> >>>>
>> >>>> http://weltam.wordpress.com
>> >>>> http://www.triplelands.com <http://www.triplelands.com/blog/>

--
Welly Tambunan
Triplelands

http://weltam.wordpress.com
http://www.triplelands.com <http://www.triplelands.com/blog/>
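
On the question of turning already-structured data into files for HDFS: one possible approach, sketched below, is to serialize each record as one pipe-delimited line, matching the file layout described earlier in the thread. This is only an illustration under stated assumptions: the `records` data, the field names, and the `records.psv` file name are all hypothetical, and the thread does not say which format or tooling anyone actually uses.

```python
import csv
import io

# Hypothetical structured records, e.g. rows already pulled from a database.
records = [
    {"id": 1, "name": "sensor-a", "value": 20.5},
    {"id": 2, "name": "sensor-b", "value": 21.0},
]


def to_pipe_delimited(rows, fieldnames):
    """Serialize dict rows as pipe-delimited text, one record per line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, delimiter="|", lineterminator="\n"
    )
    writer.writerows(rows)
    return buf.getvalue()


def write_pipe_file(path, rows, fieldnames):
    """Write the serialized rows to a local file (path is an assumption)."""
    with open(path, "w", encoding="utf-8", newline="") as fh:
        fh.write(to_pipe_delimited(rows, fieldnames))


if __name__ == "__main__":
    write_pipe_file("records.psv", records, ["id", "name", "value"])
```

A local file produced this way could then be copied into HDFS with `hdfs dfs -put` and read from Spark as a delimited text file; the target directory and any schema handling are left open here.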