If you have the time, I would suggest building a prototype with both databases and trying them out. You should also have some idea of how this system might evolve in the future, because that could very well drive the decision: either Mongo or Cassandra may work today, but if your requirements evolve in a way that suits Cassandra better, you might be better off going with Cassandra. As others have pointed out, each database has its own strengths. Given that you may store a 20 KB to 600 MB row, you may be able to model it with Mongo as well as Cassandra. If you plan on having a separate index outside the database, such as Elasticsearch or Solr, I would suggest going with Cassandra. Other factors to consider are licensing, operational cost, etc.

Dinesh
On Thursday, May 31, 2018, 9:01:09 AM PDT, Sudhakar Ganesan <sudhakar.gane...@flex.com.INVALID> wrote:

At a high level, machines on the production line will provide data as CSV files at intervals ranging from every second to every minute to once a day, depending on the machine type used in the line operations. I need to parse those files, load them into the DB, and build an API layer to expose the data to downstream systems.

Number of files to be processed: 13,889,660,134 per day.
Each file could range from 20 KB to 600 MB, which translates to anywhere from a few hundred rows to millions of rows.
High availability with high write throughput. Reads are few compared to writes.
While extracting the rows, a few validations need to be performed.
Build an API layer on top of the data to be persisted in the DB.

Now, tell me what would be the best choice…

From: Russell Bateman [mailto:r...@windofkeltia.com]
Sent: Thursday, May 31, 2018 7:36 PM
To: user@cassandra.apache.org
Subject: Re: Mongo DB vs Cassandra

Sudhakar,

MongoDB will accommodate loading CSV without regard to schema while still creating identifiable "columns" in the database, but you'll have to predict or back-impose some schema later if you're going to create indices for fast searching of the data. You can search data without indexing in MongoDB, but it's slower.

Cassandra will require you to understand the schema, i.e., what the columns are, up front, unless you're just going to store the data without a schema and, therefore, without the ability to search effectively.

As suggested already, you should share more detail if you want good advice. Both DBs are excellent. Both do different things in different ways.

Hope this helps,
Russ

On 05/31/2018 05:49 AM, Sudhakar Ganesan wrote:

Team,

I need to make a decision on Mongo DB vs Cassandra for loading CSV file data and storing the CSV files as well. If any of you have done such a study in the last couple of months, please share your analysis or observations.

Regards,
Sudhakar
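
For anyone prototyping the parse-and-validate step described in this thread, here is a minimal Python sketch using only the standard library. It streams rows rather than loading a whole file, since files may reach 600 MB. The column names, the "required columns must be non-empty" rule, and the batch size are all assumptions made for illustration; the thread does not say what the real validations are. The write to MongoDB or Cassandra is deliberately left out, since that depends on the driver you choose.

```python
import csv
from typing import Iterable, Iterator, List, TextIO


def valid_rows(source: TextIO, required: Iterable[str]) -> Iterator[dict]:
    """Stream rows from an open CSV file, yielding only rows that pass validation.

    Hypothetical rule: every column named in `required` must be present
    and non-empty. Real validations would replace this check.
    """
    required = list(required)
    for row in csv.DictReader(source):
        # DictReader fills missing trailing fields with None, hence the `or ""`.
        if all((row.get(col) or "").strip() for col in required):
            yield row


def batches(rows: Iterable, size: int) -> Iterator[List]:
    """Group an iterable into lists of at most `size` items for bulk writes."""
    batch: List = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch
```

A loader would open each file with `open(path, newline="")`, pass it to `valid_rows`, and hand each list from `batches(...)` to the database driver's bulk-insert call. Because both functions are generators, memory use stays bounded by the batch size rather than the file size.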