On Fri, Sep 3, 2021 at 10:33 AM MyWorld <timeplus.1...@gmail.com> wrote:

> Hi all,
> We are doing a POC on dev environment to upgrade apache cassandra 3.0.9 to
> 4.0.0. We have the below setup currently on cassandra 3.0.9
> DC1 - GCP(india) - 1 node
> DC2 - GCP(US) - 1 node
>

3.0.9 is very old. It uses an older version of the data file format and
has some known correctness bugs.


>
> For upgradation, we carried out below steps on DC2 - GCP(US) node:
> Step1. Install apache cassandra 4.0.0
> Step2. Did all Configuration settings
> Step3. Stop apache cassandra 3.0.9
> Step4. Start apache cassandra 4.0.0 and monitor logs
> Step5. Run nodetool upgradesstables and monitor logs
>
> After monitoring logs, I had below observations:
> *1. Initially during bootstrap at Step4, received below exceptions:*
> a) Exception (java.lang.IllegalArgumentException) encountered during
> startup: Invalid sstable file manifest.json: the name doesn't look like a
> supported sstable file name
> java.lang.IllegalArgumentException: Invalid sstable file manifest.json:
> the name doesn't look like a supported sstable file name
> b) ERROR [main] 2021-08-29 06:25:52,120 CassandraDaemon.java:909 -
> Exception encountered during startup
> java.lang.IllegalArgumentException: Invalid sstable file schema.cql: the
> name doesn't look like a supported sstable file name
>
>
> *In order to resolve, we removed manifest.json and schema.cql files from
> each table directory and the issue was resolved. *
>

Did you restore these from backup/snapshot?
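
For context, manifest.json and schema.cql are files that Cassandra writes
into snapshot directories, so finding them directly in a table directory
suggests a snapshot was copied in. A minimal Python sketch for listing such
files before removing them, assuming the usual
<data_dir>/<keyspace>/<table>/ layout; the function name is hypothetical and
the path in the comment is taken from the log paths quoted below:

```python
from pathlib import Path

# File names that belong in a snapshot directory, not among live sstables.
SNAPSHOT_ONLY_FILES = {"manifest.json", "schema.cql"}


def find_stray_snapshot_files(data_dir):
    """Return snapshot-only files sitting directly in <data_dir>/<keyspace>/<table>/."""
    root = Path(data_dir)
    # Exactly three path components below the data directory:
    # keyspace directory, table directory, file. Files under
    # <table>/snapshots/<name>/ are deeper and are therefore not matched.
    return sorted(
        p for p in root.glob("*/*/*")
        if p.is_file() and p.name in SNAPSHOT_ONLY_FILES
    )


# Example call (assumed path):
#   for path in find_stray_snapshot_files("/opt1/cassandra_poc/data"):
#       print(path)
```

This only reports the files; whether to delete them is left to the operator.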


>
> *2. After resolving the above issue, we received below WARN messages
> during bootstrap(step 4).*
> *WARN * [main] 2021-08-29 06:33:25,737 CommitLogReplayer.java:305 -
> Origin of 1 sstables is unknown or doesn't match the local node;
> commitLogIntervals for them were ignored
> *DEBUG *[main] 2021-08-29 06:33:25,737 CommitLogReplayer.java:306 -
> Ignored commitLogIntervals from the following sstables:
> [/opt1/cassandra_poc/data/clickstream/glcat_mcat_by_flname-af4e3ac0ace511ebaf9ec13e37d013c2/mc-1-big-Data.db]
> *WARN  *[main] 2021-08-29 06:33:25,737 CommitLogReplayer.java:305 -
> Origin of 2 sstables is unknown or doesn't match the local node;
> commitLogIntervals for them were ignored
> *DEBUG *[main] 2021-08-29 06:33:25,738 CommitLogReplayer.java:306 -
> Ignored commitLogIntervals from the following sstables:
> [/opt1/cassandra_poc/data/clickstream/gl_city_map
>
>
Your data files don't match the commitlog files the node expects to see.
Either you restored these from a backup, or it's because 3.0.9 is much
older than the 3.0.x releases that are more commonly used.


> *3. While upgrading sstables (step 5), we received below messages:*
> *WARN*  [CompactionExecutor:3] 2021-08-29 07:47:32,828
> DuplicateRowChecker.java:96 - Detected 2 duplicate rows for 29621439 during
> Upgrade sstables.
> *WARN*  [CompactionExecutor:3] 2021-08-29 07:47:32,831
> DuplicateRowChecker.java:96 - Detected 4 duplicate rows for 45016570 during
> Upgrade sstables.
> *WARN*  [CompactionExecutor:3] 2021-08-29 07:47:32,833
> DuplicateRowChecker.java:96 - Detected 3 duplicate rows for 61260692 during
> Upgrade sstables.
>
>
This says you have corrupt data from an old bug, probably related to the
2.1 -> 3.0 upgrade, if this cluster was originally on 2.1. If you read
those keys, you would find that the query returns 2-4 rows where it should
return exactly 1.
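
To collect the affected keys for that check, here is a minimal Python
sketch. It assumes each DuplicateRowChecker warning sits on a single line
in the log, in the format quoted above; the function name is hypothetical:

```python
import re

# Matches e.g. "Detected 2 duplicate rows for 29621439 during Upgrade sstables."
DUPLICATE_ROW_RE = re.compile(r"Detected (\d+) duplicate rows for (\S+) during")


def duplicate_row_keys(log_lines):
    """Map each reported key to the highest duplicate count seen for it."""
    keys = {}
    for line in log_lines:
        match = DUPLICATE_ROW_RE.search(line)
        if match:
            count, key = int(match.group(1)), match.group(2)
            keys[key] = max(count, keys.get(key, 0))
    return keys


# Example call (assumed log location):
#   with open("system.log") as log:
#       for key, count in duplicate_row_keys(log).items():
#           print(key, count)
```

Each key it returns can then be read back with a normal query to confirm
how many rows come back.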


> 4.* Also, received below messages during upgrade*
> *DEBUG* [epollEventLoopGroup-5-8] 2021-09-03 12:27:31,347
> InitialConnectionHandler.java:77 - OPTIONS received 5/v5
> *DEBUG* [epollEventLoopGroup-5-8] 2021-09-03 12:27:31,349
> InitialConnectionHandler.java:121 - Response to STARTUP sent, configuring
> pipeline for 5/v5
> *DEBUG* [epollEventLoopGroup-5-8] 2021-09-03 12:27:31,350
> InitialConnectionHandler.java:153 - Configured pipeline:
> DefaultChannelPipeline{(frameDecoder =
> org.apache.cassandra.net.FrameDecoderCrc), (frameEncoder =
> org.apache.cassandra.net.FrameEncoderCrc), (cqlProcessor =
> org.apache.cassandra.transport.CQLMessageHandler), (exceptionHandler =
> org.apache.cassandra.transport.ExceptionHandlers$PostV5ExceptionHandler)}
>
>
Lots of debug output, all normal. It's the netty connection pipelines
being set up.


> *5. After upgrade, we are regularly getting below messages:*
> *DEBUG* [ScheduledTasks:1] 2021-09-02 00:03:20,910 SSLFactory.java:354 -
> Checking whether certificates have been updated []
> *DEBUG* [ScheduledTasks:1] 2021-09-02 00:13:20,910 SSLFactory.java:354 -
> Checking whether certificates have been updated []
> *DEBUG* [ScheduledTasks:1] 2021-09-02 00:23:20,911 SSLFactory.java:354 -
> Checking whether certificates have been updated []
>
Normal. It's checking whether the SSL certificate has changed; if it has,
it reloads it.


> *Can someone please explain what these above ERROR / WARN / DEBUG messages
> refer to? Is there anything to be concerned about?*
>
> *Also, received 2 READ_REQ dropped messages (may be due to nw latency) *
> *INFO*  [ScheduledTasks:1] 2021-09-03 11:40:10,009
> MessagingMetrics.java:206 - READ_REQ messages were dropped in last 5000 ms:
> 0 internal and 1 cross node. Mean internal dropped latency: 0 ms and Mean
> cross-node dropped latency: 12359 ms
> *INFO*  [ScheduledTasks:1] 2021-09-03 13:27:15,291
> MessagingMetrics.java:206 - READ_REQ messages were dropped in last 5000 ms:
> 0 internal and 1 cross node. Mean internal dropped latency: 0 ms and Mean
> cross-node dropped latency: 5960 ms
>
>
12s and 6s cross-node latency isn't hugely surprising from US to India,
given the geographical distance and likelihood of packet loss across that
distance. Losing 1 read request every few hours seems like it's within
normal expectations.
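
To keep an eye on how often this happens, a minimal Python sketch that
pulls the counts and latencies out of these MessagingMetrics lines. It
assumes each message sits on a single line in the log, in the format quoted
above; the class, field, and function names are hypothetical:

```python
import re
from typing import NamedTuple


class DroppedStats(NamedTuple):
    verb: str                # e.g. "READ_REQ"
    window_ms: int           # reporting window, e.g. 5000
    internal: int            # messages dropped locally
    cross_node: int          # messages dropped that came from another node
    mean_internal_ms: int
    mean_cross_node_ms: int


DROPPED_RE = re.compile(
    r"(\w+) messages were dropped in last (\d+) ms: "
    r"(\d+) internal and (\d+) cross node\. "
    r"Mean internal dropped latency: (\d+) ms and "
    r"Mean cross-node dropped latency: (\d+) ms"
)


def parse_dropped(log_lines):
    """Return one DroppedStats per dropped-message report found in log_lines."""
    stats = []
    for line in log_lines:
        match = DROPPED_RE.search(line)
        if match:
            verb, *numbers = match.groups()
            stats.append(DroppedStats(verb, *map(int, numbers)))
    return stats
```

Summing the cross_node field over a day of logs gives a rough drop rate to
compare against the "one every few hours" seen here.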



> Rest of the stats are pretty much normal (tpstats, status, info,
> tablestats, etc)
>
> Regards,
> Ashish
>
>
