277 TB/day seems like the type of workload I wouldn't trust to random
mailing list advice.
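
For reference, a quick sketch of where a figure like that can come from, using only the file count and size range in the quoted message below. The decimal units (1 KB = 1000 bytes) and the assumption that every file sits at the minimum or maximum size are mine, not the original poster's:

```python
# Back-of-envelope volume estimate from the numbers in the quoted message.
# Assumption: decimal units (1 KB = 1000 bytes, 1 TB = 1000**4 bytes).
FILES_PER_DAY = 13_889_660_134      # "Number of files to be processed ... per day"
MIN_FILE_BYTES = 20 * 1000          # 20 KB
MAX_FILE_BYTES = 600 * 1000**2      # 600 MB
SECONDS_PER_DAY = 24 * 60 * 60

# Lower and upper bounds if every file were the minimum or maximum size.
min_tb_per_day = FILES_PER_DAY * MIN_FILE_BYTES / 1000**4
max_tb_per_day = FILES_PER_DAY * MAX_FILE_BYTES / 1000**4

# Average arrival rate the ingest layer would have to sustain.
files_per_second = FILES_PER_DAY / SECONDS_PER_DAY

print(f"lower bound: {min_tb_per_day:,.1f} TB/day")
print(f"upper bound: {max_tb_per_day:,.1f} TB/day")
print(f"file rate:   {files_per_second:,.0f} files/s")
```

With every file at the 20 KB minimum this comes to roughly 277.8 TB/day, at an average of about 160,760 files per second. The upper bound is several orders of magnitude larger, so the real volume depends heavily on the actual size distribution.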

Cassandra can do that, but it's nontrivial. MongoDB may be able to do it,
too (not sure). A lot of it will depend on how you're trying to query the
data.
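
Whichever database is chosen, the parse-and-validate step described in the quoted message below is independent of it. A minimal sketch, assuming hypothetical column names (`machine_id`, `timestamp`, `value`) and two made-up validation rules, since the thread does not describe the real CSV layout or checks:

```python
import csv
import io
from typing import Callable, Iterable, Iterator

# Hypothetical column names; the thread does not describe the real CSV layout.
REQUIRED_COLUMNS = ("machine_id", "timestamp", "value")

Row = dict[str, str]
Validator = Callable[[Row], bool]


def has_required_fields(row: Row) -> bool:
    """Reject rows with a missing or empty required column."""
    return all(row.get(col) not in (None, "") for col in REQUIRED_COLUMNS)


def value_is_numeric(row: Row) -> bool:
    """Reject rows whose 'value' column does not parse as a float."""
    try:
        float(row["value"])
    except (KeyError, TypeError, ValueError):
        return False
    return True


def valid_rows(lines: Iterable[str], validators: Iterable[Validator]) -> Iterator[Row]:
    """Yield rows one at a time so a large file is never held fully in memory."""
    checks = tuple(validators)
    for row in csv.DictReader(lines):
        if all(check(row) for check in checks):
            yield row


# Illustrative input: one good row, one non-numeric value, one missing machine_id.
sample = io.StringIO(
    "machine_id,timestamp,value\n"
    "m1,2018-05-31T09:00:00,1.5\n"
    "m1,2018-05-31T09:00:01,not-a-number\n"
    ",2018-05-31T09:00:02,2.0\n"
)
accepted = list(valid_rows(sample, [has_required_fields, value_is_numeric]))
print(len(accepted), "row(s) accepted")
```

Streaming through a generator keeps memory use flat regardless of file size; the accepted rows can then be batched into whatever write path the chosen database offers.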



On Thu, May 31, 2018 at 9:00 AM, Sudhakar Ganesan <
sudhakar.gane...@flex.com.invalid> wrote:

> At a high level, machines on the production line emit data as CSV files
> at intervals ranging from 1 second to 1 minute to 1 day (depending on the
> machine type used in the line operations). I need to parse those files,
> load them into a DB, and build an API layer to expose the data to
> downstream systems.
>
>
>
> *Number of files to be processed: 13,889,660,134 per day*
>
> *Each file could range from 20 KB to 600 MB, which translates into a few
> hundred rows to millions of rows.*
>
> *High availability with a heavy write load. Reads are fewer than writes.*
>
> *While extracting the rows, a few validations are to be performed.*
>
> *Build an API layer on top of the data to be persisted in the DB.*
>
>
>
> Now, tell me what would be the best choice…
>
>
>
> *From:* Russell Bateman [mailto:r...@windofkeltia.com]
> *Sent:* Thursday, May 31, 2018 7:36 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Mongo DB vs Cassandra
>
>
>
> Sudhakar,
>
> MongoDB will accommodate loading CSV without regard to schema while still
> creating identifiable "columns" in the database, but you'll have to predict
> or back-impose some schema later if you're going to create indices for fast
> searching of the data. You can perform searching of data without indexing
> in MongoDB, but it's slower.
>
> Cassandra will require you to understand the schema up front, i.e., what
> the columns are, unless you're just going to store the data without a
> schema and, therefore, without the ability to search effectively.
>
> As suggested already, you should share more detail if you want good
> advice. Both DBs are excellent. Both do different things in different ways.
>
> Hope this helps,
> Russ
>
> On 05/31/2018 05:49 AM, Sudhakar Ganesan wrote:
>
> Team,
>
>
>
> I need to decide between MongoDB and Cassandra for loading CSV file data
> and storing the CSV files as well. If any of you have done such a study in
> the last couple of months, please share your analysis or observations.
>
>
>
> Regards,
>
> Sudhakar
>
>
