Hi,
We have a micro service (multiple instances and each instance has multiple
consumers) on AWS that uses Camel aws-sqs to listen to a AWS sqs queue.
Our camel version is 2.24.0 and the aws-sqs config is
visibilityTimeout=300&extendMessageVisibility=true&maxMessagesPerPoll=5&con
currentConsumers=5.
Sometimes (less than 1%) we found our micro service duplicated to process
a message. And sometimes we can find a warning like this:
019-06-18 12:55:23.821 [Camel (camel-1) thread #7 - SqsTimeoutExtender]
WARN  o.a.c.c.a.sqs.SqsConsumer - Extending visibility window failed for
exchange Exchange[ID-i-00edba07f823c5496-1560852804975-0-254]. Will not
attempt to extend visibility further.
This exception will be ignored.
com.amazonaws.services.sqs.model.AmazonSQSException: Value
AQEB/3m4117WvFRRTVpLJVyLa9D0pbln3f2Y8x/0lgTfWEDd for parameter
ReceiptHandle is invalid. Reason: Message does not exist or is not
available for visibility timeout change. (Service: AmazonSQS; Status
Code: 400; Error Code: InvalidParameterValue; Request ID:
5531d38b-1253-59a1-a944-8609f3281568at
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(Ama
zonHttpClient.java:1638
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(Amazo
nHttpClient.java:1303
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHtt
pClient.java:1055 
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpCli
ent.java:743 
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(Amazon
HttpClient.java:717
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClien
t.java:699 
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpCl
ient.java:667 
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(Ama
zonHttpClient.java:649
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513
at 
com.amazonaws.services.sqs.AmazonSQSClient.doInvoke(AmazonSQSClient.java:17
40 
at 
com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:1716
at 
com.amazonaws.services.sqs.AmazonSQSClient.executeChangeMessageVisibility(A
mazonSQSClient.java:518
at 
com.amazonaws.services.sqs.AmazonSQSClient.changeMessageVisibility(AmazonSQ
SClient.java:494 
at org.apache.camel.component.aws.sqs.SqsConsumer$TimeoutExtender.run
(SqsConsumer.java:327
(Executors.java:511
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access
$301(ScheduledThreadPoolExecutor.java:180
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run
(ScheduledThreadPoolExecutor.java:294
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1
149 
at java.util.concurrent.ThreadPoolExecutor$Worker.run
(ThreadPoolExecutor.java:624
at java.lang.Thread.run
(Thread.java:748 


I thought it is because Camel failed to extend the visibilityTimeout so
the message appeared in the queue and was picked up by another consumer. I
raised a AWS support ticket and got the logs from AWS like this:
1. Operation=SendMessage                  | RemoteIpAddress=52.213.250.143
| EndTime=Tue, 18 Jun 2019 12:51:18 UTC
2. Operation=ReceiveMessage              | RemoteIpAddress=52.208.226.232
| EndTime=Tue, 18 Jun 2019 12:51:18 UTC | VisibilityTimeout=60
3. Operation=ReceiveMessage              | RemoteIpAddress=52.209.137.230
| EndTime=Tue, 18 Jun 2019 12:52:19 UTC | VisibilityTimeout=60
4. Operation=ChangeMessageVisibility | RemoteIpAddress=52.209.137.230 |
EndTime=Tue, 18 Jun 2019 12:52:49 UTC | VisibilityTimeout=90
5. Operation=ChangeMessageVisibility | RemoteIpAddress=52.209.137.230 |
EndTime=Tue, 18 Jun 2019 12:53:49 UTC | VisibilityTimeout=90
6. Operation=ChangeMessageVisibility | RemoteIpAddress=52.209.137.230 |
EndTime=Tue, 18 Jun 2019 12:54:49 UTC | VisibilityTimeout=90
7. Operation=ChangeMessageVisibility | RemoteIpAddress=52.208.226.232 |
EndTime=Tue, 18 Jun 2019 12:55:23 UTC | VisibilityTimeout=90
8. Operation=DeleteMessage                | RemoteIpAddress=52.209.137.230
| EndTime=Tue, 18 Jun 2019 12:55:23 UTC
9. Operation=ChangeMessageVisibility | RemoteIpAddress=52.208.226.232 |
EndTime=Tue, 18 Jun 2019 12:56:23 UTC | VisibilityTimeout=90
10. Operation=DeleteMessage                |
RemoteIpAddress=52.208.226.232 | EndTime=Tue, 18 Jun 2019 12:56:57 UTC

We can see the message was picked up by the first
consumer(RemoteIpAddress=52.208.226.232) but it didn't extend the
visibilityTimeout, so after 60 seconds the message was picked up by
another consumer(RemoteIpAddress=52.209.137.230) which extended the
VisibilityTimeout as usual. This happened without any warning and
exception. We saw the warning because somehow after about 4 minutes the
first consumer tried to extend the VisibilityTimeout after the message was
deleted. Sometimes this happened completely
silently.

This happened a couple of times every week. Is it because of something
wrong with the threads (concurrentConsumers=5) which has problem to
schedule the VisibilityTimeout extension? Has anyone seen this before?

Many thanks,
Qingyang Xu

Reply via email to