Hi, We have a micro service (multiple instances and each instance has multiple consumers) on AWS that uses Camel aws-sqs to listen to a AWS sqs queue. Our camel version is 2.24.0 and the aws-sqs config is visibilityTimeout=300&extendMessageVisibility=true&maxMessagesPerPoll=5&con currentConsumers=5. Sometimes (less than 1%) we found our micro service duplicated to process a message. And sometimes we can find a warning like this: 019-06-18 12:55:23.821 [Camel (camel-1) thread #7 - SqsTimeoutExtender] WARN o.a.c.c.a.sqs.SqsConsumer - Extending visibility window failed for exchange Exchange[ID-i-00edba07f823c5496-1560852804975-0-254]. Will not attempt to extend visibility further. This exception will be ignored. com.amazonaws.services.sqs.model.AmazonSQSException: Value AQEB/3m4117WvFRRTVpLJVyLa9D0pbln3f2Y8x/0lgTfWEDd for parameter ReceiptHandle is invalid. Reason: Message does not exist or is not available for visibility timeout change. (Service: AmazonSQS; Status Code: 400; Error Code: InvalidParameterValue; Request ID: 5531d38b-1253-59a1-a944-8609f3281568at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(Ama zonHttpClient.java:1638 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(Amazo nHttpClient.java:1303 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHtt pClient.java:1055 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpCli ent.java:743 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(Amazon HttpClient.java:717 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClien t.java:699 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpCl ient.java:667 at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(Ama zonHttpClient.java:649 at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513 at com.amazonaws.services.sqs.AmazonSQSClient.doInvoke(AmazonSQSClient.java:17 40 at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:1716 at com.amazonaws.services.sqs.AmazonSQSClient.executeChangeMessageVisibility(A mazonSQSClient.java:518 at com.amazonaws.services.sqs.AmazonSQSClient.changeMessageVisibility(AmazonSQ SClient.java:494 at org.apache.camel.component.aws.sqs.SqsConsumer$TimeoutExtender.run (SqsConsumer.java:327 (Executors.java:511 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access $301(ScheduledThreadPoolExecutor.java:180 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run (ScheduledThreadPoolExecutor.java:294 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1 149 at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:624 at java.lang.Thread.run (Thread.java:748
I thought it is because Camel failed to extend the visibilityTimeout so the message appeared in the queue and was picked up by another consumer. I raised a AWS support ticket and got the logs from AWS like this: 1. Operation=SendMessage | RemoteIpAddress=52.213.250.143 | EndTime=Tue, 18 Jun 2019 12:51:18 UTC 2. Operation=ReceiveMessage | RemoteIpAddress=52.208.226.232 | EndTime=Tue, 18 Jun 2019 12:51:18 UTC | VisibilityTimeout=60 3. Operation=ReceiveMessage | RemoteIpAddress=52.209.137.230 | EndTime=Tue, 18 Jun 2019 12:52:19 UTC | VisibilityTimeout=60 4. Operation=ChangeMessageVisibility | RemoteIpAddress=52.209.137.230 | EndTime=Tue, 18 Jun 2019 12:52:49 UTC | VisibilityTimeout=90 5. Operation=ChangeMessageVisibility | RemoteIpAddress=52.209.137.230 | EndTime=Tue, 18 Jun 2019 12:53:49 UTC | VisibilityTimeout=90 6. Operation=ChangeMessageVisibility | RemoteIpAddress=52.209.137.230 | EndTime=Tue, 18 Jun 2019 12:54:49 UTC | VisibilityTimeout=90 7. Operation=ChangeMessageVisibility | RemoteIpAddress=52.208.226.232 | EndTime=Tue, 18 Jun 2019 12:55:23 UTC | VisibilityTimeout=90 8. Operation=DeleteMessage | RemoteIpAddress=52.209.137.230 | EndTime=Tue, 18 Jun 2019 12:55:23 UTC 9. Operation=ChangeMessageVisibility | RemoteIpAddress=52.208.226.232 | EndTime=Tue, 18 Jun 2019 12:56:23 UTC | VisibilityTimeout=90 10. Operation=DeleteMessage | RemoteIpAddress=52.208.226.232 | EndTime=Tue, 18 Jun 2019 12:56:57 UTC We can see the message was picked up by the first consumer(RemoteIpAddress=52.208.226.232) but it didn't extend the visibilityTimeout, so after 60 seconds the message was picked up by another consumer(RemoteIpAddress=52.209.137.230) which extended the VisibilityTimeout as usual. This happened without any warning and exception. We saw the warning because somehow after about 4 minutes the first consumer tried to extend the VisibilityTimeout after the message was deleted. Sometimes this happened completely silently. This happened a couple of times every week. Is it because of something wrong with the threads (concurrentConsumers=5) which has problem to schedule the VisibilityTimeout extension? Has anyone seen this before? Many thanks, Qingyang Xu