Hi Jeff,
Thanks for your response.
To answer your question: yes, we created the dev environment by restoring
it from snapshot/CSV files.

Just one follow-up question: I have a 5-node, single-DC production cluster
on version 3.0.9, running on physical servers.
We are planning to migrate to GCP and upgrade at the same time, using the
steps below.
1. Set up a GCP data center on the same version (3.0.9) and rebuild the
complete data set
2. Install and configure version 4.0 in the new GCP data center on all 5
nodes
3. Stop version 3.0.9 and start 4.0 on all 5 GCP nodes, one by one
4. Run upgradesstables on all 5 GCP nodes, one by one
5. Later, move read/write traffic to GCP and remove the old data center,
which is still on version 3.0.9
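
To make the ordering in steps 3 and 4 concrete, here is a minimal Python
sketch that only prints the plan; it does not run anything. The host names
are hypothetical, and the "nodetool drain" and status-check actions are my
assumptions, not part of the steps above.

```python
# Hypothetical host names; the message does not name the GCP nodes.
GCP_NODES = [f"gcp-node{i}" for i in range(1, 6)]


def rolling_upgrade_plan(nodes):
    """Return (node, action) pairs in the order described in steps 3 and 4."""
    plan = []
    # Step 3: swap binaries one node at a time.
    for node in nodes:
        plan.append((node, "nodetool drain"))  # assumption: not in the original steps
        plan.append((node, "stop Cassandra 3.0.9"))
        plan.append((node, "start Cassandra 4.0"))
        plan.append((node, "check nodetool status before moving on"))  # assumption
    # Step 4: rewrite sstables one node at a time, after every node runs 4.0.
    for node in nodes:
        plan.append((node, "nodetool upgradesstables"))
    return plan


if __name__ == "__main__":
    for node, action in rolling_upgrade_plan(GCP_NODES):
        print(f"{node}: {action}")
```

Option 2 below would instead put the upgradesstables action inside the
first loop, so each node is fully converted before the next one is touched.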

Please advise on a few things:
1. Is the approach above right?
2. Or should we upgrade to 4.0 on only one GCP node at a time and run
upgradesstables on just that node first?
3. Or should we migrate to GCP first and consider the 4.0 upgrade later?
4. Or is there any reason I should upgrade to 3.11.x first?

Regards,
Ashish

On Fri, Sep 3, 2021, 11:11 PM Jeff Jirsa <jji...@gmail.com> wrote:

>
>
> On Fri, Sep 3, 2021 at 10:33 AM MyWorld <timeplus.1...@gmail.com> wrote:
>
>> Hi all,
>> We are doing a POC on dev environment to upgrade apache cassandra 3.0.9
>> to 4.0.0. We have the below setup currently on cassandra 3.0.9
>> DC1 - GCP(india) - 1 node
>> DC2 - GCP(US) - 1 node
>>
>
> 3.0.9 is very old. It uses an older version of the data files and has some
> known correctness bugs.
>
>
>>
>> For the upgrade, we carried out the steps below on the DC2 - GCP(US) node:
>> Step1. Install apache cassandra 4.0.0
>> Step2. Did all Configuration settings
>> Step3. Stop apache cassandra 3.0.9
>> Step4. Start apache cassandra 4.0.0 and monitor logs
>> Step5. Run nodetool upgradesstables and monitor logs
>>
>> After monitoring logs, I had below observations:
>> *1. Initially during bootstrap at Step4, received below exceptions:*
>> a) Exception (java.lang.IllegalArgumentException) encountered during
>> startup: Invalid sstable file manifest.json: the name doesn't look like a
>> supported sstable file name
>> java.lang.IllegalArgumentException: Invalid sstable file manifest.json:
>> the name doesn't look like a supported sstable file name
>> b) ERROR [main] 2021-08-29 06:25:52,120 CassandraDaemon.java:909 -
>> Exception encountered during startup
>> java.lang.IllegalArgumentException: Invalid sstable file schema.cql: the
>> name doesn't look like a supported sstable file name
>>
>>
>> *In order to resolve, we removed manifest.json and schema.cql files from
>> each table directory and the issue was resolved. *
>>
>
> Did you restore these from backup/snapshot?
>
>
>>
>> *2. After resolving the above issue, we received below WARN messages
>> during bootstrap(step 4).*
>> *WARN * [main] 2021-08-29 06:33:25,737 CommitLogReplayer.java:305 -
>> Origin of 1 sstables is unknown or doesn't match the local node;
>> commitLogIntervals for them were ignored
>> *DEBUG *[main] 2021-08-29 06:33:25,737 CommitLogReplayer.java:306 -
>> Ignored commitLogIntervals from the following sstables:
>> [/opt1/cassandra_poc/data/clickstream/glcat_mcat_by_flname-af4e3ac0ace511ebaf9ec13e37d013c2/mc-1-big-Data.db]
>> *WARN  *[main] 2021-08-29 06:33:25,737 CommitLogReplayer.java:305 -
>> Origin of 2 sstables is unknown or doesn't match the local node;
>> commitLogIntervals for them were ignored
>> *DEBUG *[main] 2021-08-29 06:33:25,738 CommitLogReplayer.java:306 -
>> Ignored commitLogIntervals from the following sstables:
>> [/opt1/cassandra_poc/data/clickstream/gl_city_map
>>
>>
> Your data files don't match the commitlog files it expects to see. Either
> you restored these from backup, or it's because 3.0.9 is much older than
> the 3.0.x versions that are more commonly used.
>
>
>> *3. While upgrading sstables (step 5), we received below messages:*
>> *WARN*  [CompactionExecutor:3] 2021-08-29 07:47:32,828
>> DuplicateRowChecker.java:96 - Detected 2 duplicate rows for 29621439 during
>> Upgrade sstables.
>> *WARN*  [CompactionExecutor:3] 2021-08-29 07:47:32,831
>> DuplicateRowChecker.java:96 - Detected 4 duplicate rows for 45016570 during
>> Upgrade sstables.
>> *WARN*  [CompactionExecutor:3] 2021-08-29 07:47:32,833
>> DuplicateRowChecker.java:96 - Detected 3 duplicate rows for 61260692 during
>> Upgrade sstables.
>>
>>
> This says you have corrupt data from an old bug. Probably related to 2.1
> -> 3.0 upgrades, if this was originally on 2.1. If you read those keys, you
> would find that the data returns 2-4 rows where it should be exactly 1.
>
>
>> 4.* Also, received below messages during upgrade*
>> *DEBUG* [epollEventLoopGroup-5-8] 2021-09-03 12:27:31,347
>> InitialConnectionHandler.java:77 - OPTIONS received 5/v5
>> *DEBUG* [epollEventLoopGroup-5-8] 2021-09-03 12:27:31,349
>> InitialConnectionHandler.java:121 - Response to STARTUP sent, configuring
>> pipeline for 5/v5
>> *DEBUG* [epollEventLoopGroup-5-8] 2021-09-03 12:27:31,350
>> InitialConnectionHandler.java:153 - Configured pipeline:
>> DefaultChannelPipeline{(frameDecoder =
>> org.apache.cassandra.net.FrameDecoderCrc), (frameEncoder =
>> org.apache.cassandra.net.FrameEncoderCrc), (cqlProcessor =
>> org.apache.cassandra.transport.CQLMessageHandler), (exceptionHandler =
>> org.apache.cassandra.transport.ExceptionHandlers$PostV5ExceptionHandler)}
>>
>>
> Lots of debug output, normal. It's the netty connection pipelines being
> set up.
>
>
>> *5. After upgrade, we are regularly getting below messages:*
>> *DEBUG* [ScheduledTasks:1] 2021-09-02 00:03:20,910 SSLFactory.java:354 -
>> Checking whether certificates have been updated []
>> *DEBUG* [ScheduledTasks:1] 2021-09-02 00:13:20,910 SSLFactory.java:354 -
>> Checking whether certificates have been updated []
>> *DEBUG* [ScheduledTasks:1] 2021-09-02 00:23:20,911 SSLFactory.java:354 -
>> Checking whether certificates have been updated []
>>
> Normal. It's checking to see if the ssl cert changed, and if it did, it
> would reload it.
>
>
>> *Can someone please explain what these above ERROR / WARN / DEBUG
>> messages refer to? Is there anything to be concerned about?*
>>
>> *Also, received 2 READ_REQ dropped messages (maybe due to network latency) *
>> *INFO*  [ScheduledTasks:1] 2021-09-03 11:40:10,009
>> MessagingMetrics.java:206 - READ_REQ messages were dropped in last 5000 ms:
>> 0 internal and 1 cross node. Mean internal dropped latency: 0 ms and Mean
>> cross-node dropped latency: 12359 ms
>> *INFO*  [ScheduledTasks:1] 2021-09-03 13:27:15,291
>> MessagingMetrics.java:206 - READ_REQ messages were dropped in last 5000 ms:
>> 0 internal and 1 cross node. Mean internal dropped latency: 0 ms and Mean
>> cross-node dropped latency: 5960 ms
>>
>>
> 12s and 6s cross-node latency isn't hugely surprising from US to India,
> given the geographical distance and likelihood of packet loss across that
> distance. Losing 1 read request every few hours seems like it's within
> normal expectations.
>
>
>
>> Rest of the stats are pretty much normal (tpstats, status, info,
>> tablestats, etc)
>>
>> Regards,
>> Ashish
>>
>>
