Hi,

I just found that reducing the batch size below 20 also increases write speed and reduces memory usage (especially with the Python driver).
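For anyone who wants to try the same thing, here is a minimal sketch of how writes could be split into small batches with the Python driver. The helper names (`chunked`, `write_in_small_batches`), the default of 16 statements per batch, and the choice of unlogged batches are my own assumptions for illustration, not something measured in this thread; `session` and `insert_stmt` stand for an already connected session and a prepared INSERT.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(rows: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items from `rows`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def write_in_small_batches(session, insert_stmt, rows, batch_size=16):
    """Send `rows` as batches of at most `batch_size` statements each.

    Assumptions: `session` is a connected cassandra.cluster.Session,
    `insert_stmt` is a prepared INSERT, and each row is a tuple of bind
    parameters. UNLOGGED is an illustrative choice, not a recommendation.
    """
    # Imported lazily so chunked() works without the driver installed.
    from cassandra.query import BatchStatement, BatchType

    for chunk in chunked(rows, batch_size):
        batch = BatchStatement(batch_type=BatchType.UNLOGGED)
        for params in chunk:
            batch.add(insert_stmt, params)
        session.execute(batch)
```

Keeping `batch_size` as a parameter makes it easy to compare a few values (for example 10, 20, 50) against your own workload rather than relying on one number.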
Kind regards,
Rajesh R

________________________________
From: Ben Bromhead [b...@instaclustr.com]
Sent: 07 November 2016 05:44
To: user@cassandra.apache.org
Subject: Re: Are Cassandra writes are faster than reads?

They can be and it depends on your compaction strategy :)

On Sun, 6 Nov 2016 at 21:24 Ali Akhtar <ali.rac...@gmail.com> wrote:

tl;dr? I just want to know if updates are bad for performance, and if so, for how long.

On Mon, Nov 7, 2016 at 10:23 AM, Ben Bromhead <b...@instaclustr.com> wrote:

Check out https://wiki.apache.org/cassandra/WritePathForUsers for the full gory details.

On Sun, 6 Nov 2016 at 21:09 Ali Akhtar <ali.rac...@gmail.com> wrote:

How long does it take for updates to get merged / compacted into the main data file?

On Mon, Nov 7, 2016 at 5:31 AM, Ben Bromhead <b...@instaclustr.com> wrote:

To add some flavor as to how the commitlog implementation is so quick. It only flushes to disk every 10s by default. So writes are effectively done to memory and then to disk asynchronously later on. This is generally accepted to be OK, as the write is also going to other nodes.

You can of course change this behavior to flush on each write or to skip the commitlog altogether (danger!). This however will change how "safe" things are from a durability perspective.
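To make the durability trade-off in the message above concrete, here is a toy model of a periodically synced commit log. It is not Cassandra code: the class `ToyCommitLog`, its method names, and the explicit `now` timestamps are assumptions made for illustration (the real sync runs on a background thread, and in Cassandra releases of that era the behavior is, as far as I know, controlled by `commitlog_sync` and `commitlog_sync_period_in_ms` in cassandra.yaml).

```python
class ToyCommitLog:
    """Toy model of a periodically synced commit log (not Cassandra code)."""

    def __init__(self, sync_period_s: float = 10.0):
        self.sync_period_s = sync_period_s
        self.buffered = []   # acknowledged, but not yet synced to disk
        self.durable = []    # would survive a crash of this node
        self.last_sync = 0.0

    def append(self, now: float, mutation: str) -> None:
        # Sync first if a full period has elapsed since the last sync.
        if now - self.last_sync >= self.sync_period_s:
            self.sync(now)
        # The write is acknowledged as soon as it is buffered in memory.
        self.buffered.append(mutation)

    def sync(self, now: float) -> None:
        # Move everything buffered so far into the durable part of the log.
        self.durable.extend(self.buffered)
        self.buffered.clear()
        self.last_sync = now

    def crash(self) -> list:
        """Simulate a crash of this node and return the writes it lost."""
        lost, self.buffered = self.buffered, []
        return lost


log = ToyCommitLog(sync_period_s=10.0)
log.append(1.0, "a")
log.append(5.0, "b")
log.append(12.0, "c")   # more than 10s since the last sync: "a", "b" are synced
lost = log.crash()      # only "c" was still unsynced on this node
```

In this model only the writes from the last unsynced window are lost on the crashed node, which is why the default is usually considered acceptable when the same write has also gone to other replicas.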
On Sun, Nov 6, 2016, 12:51 Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:

Cassandra writes are particularly fast, for a few reasons:

1) Most writes go to a commitlog (append-only file, written linearly, so particularly fast in terms of disk operations) and are then pushed to the memtable. The memtable is flushed in batches to the permanent data files, so it buffers many mutations and then does a sequential write to persist that data to disk.

2) Reads may have to merge data from many data tables on disk. Because the writes (described very briefly in step 1) write to immutable files, updates/deletes have to be merged on read – this is extra effort for the read path.

If you don’t do much in terms of overwrites/deletes, and your partitions are particularly small, and your data fits in RAM (probably mmap/page cache of data files, unless you’re using the row cache), reads may be very fast for you. Certainly individual reads on low-merge workloads can be < 0.1ms.

- Jeff

From: Vikas Jaiman <er.vikasjai...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Sunday, November 6, 2016 at 12:42 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Are Cassandra writes are faster than reads?

Hi all,

Are Cassandra writes faster than reads?
If yes, why is this so? I am using consistency 1 and data is in memory.

Vikas

--
Ben Bromhead
CTO | Instaclustr
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer
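As a footnote to the thread, Jeff's two points (writes append to a commitlog and a memtable that is flushed to immutable files; reads may have to consult several of those files) can be sketched with a toy store. Everything here is a simplification assumed for illustration: `ToyLSMStore` and its methods are hypothetical names, flushed tables are plain dicts, and "newest table wins" stands in for Cassandra's real timestamp-based merge. Compaction is left out entirely.

```python
class ToyLSMStore:
    """Toy write/read path: commitlog + memtable + immutable flushed tables."""

    def __init__(self, memtable_limit: int = 2):
        self.memtable_limit = memtable_limit
        self.commitlog = []   # append-only record of every mutation
        self.memtable = {}    # in-memory buffer of recent writes
        self.sstables = []    # flushed, never-modified tables, oldest first

    def write(self, key, value):
        # A write is one append plus one in-memory update.
        self.commitlog.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Persist the whole memtable in one sequential, sorted write.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}

    def read(self, key):
        """Return (value, sources_checked), searching newest data first."""
        checked = 1
        if key in self.memtable:
            return self.memtable[key], checked
        for table in reversed(self.sstables):
            checked += 1
            if key in table:
                return table[key], checked
        return None, checked


store = ToyLSMStore(memtable_limit=2)
store.write("k1", "v1")
store.write("k2", "v2")    # memtable full: flushed as the first table
store.write("k1", "v1b")   # overwrite goes to a new table, not the old one
store.write("k3", "v3")    # flushed as the second table
store.write("k4", "v4")    # still in the memtable
```

The overwrite of `k1` never touches the first flushed table, so the old value stays on disk until compaction; a read for `k2` has to look past the memtable and the newer table before finding it, which is the extra read-path effort Jeff describes.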