This shouldn't normally cause a hang. The code that handles receipt of TCP/IP messages reads the message's "reply processor" identifier before trying to deserialize the rest of the message. If there is a problem deserializing the message, we send an error response carrying that identifier so that the sender knows something went wrong.
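
To make that ordering concrete, here is a minimal sketch in plain Java. It is not Geode's code: the class ReplyIdFirst, the Reply type, and the wire layout (a 4-byte id followed by a Java-serialized payload) are all made up for illustration. It only shows why reading the id first lets the receiver answer even when the payload is unreadable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class ReplyIdFirst {
    /** Hypothetical reply: carries the sender's processor id and an error text (null on success). */
    static final class Reply {
        final int processorId;
        final String error;

        Reply(int processorId, String error) {
            this.processorId = processorId;
            this.error = error;
        }
    }

    /** Assumed wire layout: the processor id first, then the Java-serialized payload. */
    static byte[] encode(int processorId, Serializable payload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(processorId);
            ObjectOutputStream obj = new ObjectOutputStream(out);
            obj.writeObject(payload);
            obj.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the id before touching the payload, so a bad payload can still be answered. */
    static Reply receive(byte[] wire) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
            int processorId = in.readInt();
            try {
                new ObjectInputStream(in).readObject();
                return new Reply(processorId, null);
            } catch (IOException | ClassNotFoundException e) {
                // Deserialization failed, but the id is already known.
                return new Reply(processorId, e.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = encode(100547, "function arguments");
        wire[8] = (byte) 0xB1; // overwrite the payload's first type code to corrupt it
        Reply reply = receive(wire);
        System.out.println(reply.processorId + " -> " + reply.error);
    }
}
```

Whether the sender's reply processor then wakes up on such an error reply is a separate question.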

Having said that, I am not as familiar with the function execution streaming-reply processors and how they handle this kind of response.  It's possible that a hang could occur in your situation if these reply processors aren't prepared to deal with an error response.

It seems to me that you should be more concerned that a deserialization problem occurred at all. For instance, was the TreeMap (a TreeSet, judging by the stack trace) being actively modified during serialization? If so, take steps to prevent that from happening.
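
One way to do that, assuming the collection really is shared with other threads, is to copy it under the same lock the writers use and serialize only the copy. The sketch below is plain Java with made-up names (SnapshotBeforeSend, hosts); it does not use any Geode API, and the serialize/deserialize helpers just stand in for what happens when the function arguments go over the wire.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.TreeSet;

public class SnapshotBeforeSend {
    // Hypothetical shared state that other threads keep updating.
    private final TreeSet<String> hosts = new TreeSet<>();

    public synchronized void add(String host) {
        hosts.add(host);
    }

    /** Copies the set under the lock that guards writers; the copy is private to the caller. */
    public synchronized TreeSet<String> snapshot() {
        return new TreeSet<>(hosts);
    }

    /** Stand-in for sending: plain Java serialization into a byte array. */
    static byte[] serialize(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stand-in for receiving: plain Java deserialization from a byte array. */
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SnapshotBeforeSend shared = new SnapshotBeforeSend();
        shared.add("gbv00455");
        shared.add("gbv00457");

        TreeSet<String> copy = shared.snapshot();
        shared.add("gbv00458"); // a later update no longer affects what gets serialized

        Object received = deserialize(serialize(copy));
        System.out.println(received);
    }
}
```

You would then pass the copy, not the live collection, as the function argument.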


On 1/10/18 5:02 AM, Vahram Aharonyan wrote:

Hi All,

We are experiencing an issue where the thread that performs the onRegion call and expects a result in response is stuck forever in the TIMED_WAITING state with the trace below:

"ComputedAndSystemMetricsRetriever" Id=490 in TIMED_WAITING on lock=java.util.concurrent.CountDownLatch$Sync@5630fcc2

Total blocked: 33   Total waited: 261425

sun.misc.Unsafe.park(Native Method)

java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)

java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)

java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)

java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)

org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:64)

org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:716)

org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:793)

org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:769)

org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:856)

org.apache.geode.internal.cache.execute.FunctionStreamingResultCollector.waitForCacheOrFunctionException(FunctionStreamingResultCollector.java:438)

org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.getResult(PRFunctionStreamingResultCollector.java:91)

platform.gemfire.GemfireFunctionExecutor.onRegion(GemfireFunctionExecutor.java:494)

In the logs of that member we see the following:

[warning 2017/12/20 10:49:14.570 UTC 29acc6f1-5384-489d-b2bd-5187b898e482 <ComputedAndSystemMetricsRetriever> tid=0x1ea] 60 seconds have elapsed while waiting for replies: <PRFunctionStreamingResultCollector 100547 waiting for 1 replies from [gbv00457(abb6648c-39d6-4c4c-9c6d-ab8589e034a5:9583)<ec><v4>:10002]> on gbv00455(29acc6f1-5384-489d-b2bd-5187b898e482:22303)<ec><v3>:10002 whose current membership list is: [[gbv00458(8d2960b9-a6be-4519-9547-311e2717231e:15532)<ec><v5>:10002, gbv00457(abb6648c-39d6-4c4c-9c6d-ab8589e034a5:9583)<ec><v4>:10002, gbv00460(21fd5612-5fe2-451d-aa9d-b8542fa43fa7:20144)<ec><v9>:10002, gbv00459(3a14f29a-8bdb-46d5-bb67-0f79cb5c7faa:17197)<ec><v7>:10002, gbv00454(18618:locator)<ec><v1>:20002, gbv00454(64aed382-0882-44f5-b71f-08a429af46dd:18983)<ec><v8>:10002, gbv00453(13656:locator)<ec><v0>:20002, gbv00453(881591a8-ae04-4af1-866a-5074c2ffb133:14490)<ec><v2>:10002, gbv00456(63cebdf8-dd1e-414e-af5f-f8c4ebecf726:18001)<ec><v6>:10002, gbv00455(29acc6f1-5384-489d-b2bd-5187b898e482:22303)<ec><v3>:10002]]

Around that time, on the nodes where this call lands, these exceptions occur:

[severe 2017/12/20 10:48:14.728 UTC abb6648c-39d6-4c4c-9c6d-ab8589e034a5 <P2P message reader for gbv00455(29acc6f1-5384-489d-b2bd-5187b898e482:22303)<ec><v3>:10002 shared unordered uid=8 port=41631> tid=0x44] IOException deserializing message

java.io.IOException: failure during message deserialization

at org.apache.geode.internal.tcp.MsgDestreamer.getMessage(MsgDestreamer.java:190)

at org.apache.geode.internal.tcp.Connection.runOioReader(Connection.java:2218)

at org.apache.geode.internal.tcp.Connection.run(Connection.java:1728)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:748)

Caused by: org.apache.geode.SerializationException: Could not create an instance of org.apache.geode.internal.cache.partitioned.PartitionedRegionFunctionStreamingMessage .

at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2492)

at org.apache.geode.internal.DSFIDFactory.create(DSFIDFactory.java:979)

at org.apache.geode.internal.InternalDataSerializer.readDSFID(InternalDataSerializer.java:2720)

at org.apache.geode.internal.tcp.MsgDestreamer$DestreamerThread.run(MsgDestreamer.java:261)

Caused by: org.apache.geode.SerializationException: Could not create an instance of org.apache.geode.internal.cache.execute.FunctionRemoteContext .

at org.apache.geode.internal.InternalDataSerializer.readDataSerializable(InternalDataSerializer.java:2521)

at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2958)

at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2897)

at org.apache.geode.internal.cache.partitioned.PartitionedRegionFunctionStreamingMessage.fromData(PartitionedRegionFunctionStreamingMessage.java:180)

at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2477)

... 3 more

Caused by: org.apache.geode.SerializationException: Could not create an instance of org.apache.geode.internal.cache.execute.FunctionRemoteContext .

at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2492)

at org.apache.geode.internal.InternalDataSerializer.readDataSerializable(InternalDataSerializer.java:2507)

... 7 more

Caused by: java.io.StreamCorruptedException: invalid type code: B1

at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1563)

at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2567)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2551)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2551)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2551)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2551)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2583)

at java.util.TreeMap.buildFromSorted(TreeMap.java:2508)

at java.util.TreeMap.readTreeSet(TreeMap.java:2460)

at java.util.TreeSet.readObject(TreeSet.java:533)

at sun.reflect.GeneratedMethodAccessor743.invoke(Unknown Source)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:498)

at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)

at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2136)

at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)

at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)

at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)

at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)

at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)

at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)

at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)

at java.util.ArrayList.readObject(ArrayList.java:791)

at sun.reflect.GeneratedMethodAccessor232.invoke(Unknown Source)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:498)

at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)

at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2136)

at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)

at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)

at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)

at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)

at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)

at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)

at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)

at java.util.ArrayList.readObject(ArrayList.java:791)

at sun.reflect.GeneratedMethodAccessor232.invoke(Unknown Source)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:498)

at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)

at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2136)

at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)

at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)

at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1933)

at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1529)

at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)

at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)

at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)

at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)

at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)

at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2992)

at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2897)

at org.apache.geode.internal.cache.execute.FunctionRemoteContext.fromData(FunctionRemoteContext.java:73)

at org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2479)

... 8 more

So could it be that these exceptions are not being sent back to the caller node, causing the caller thread to wait for a reply forever?

Thanks,

Vahram.

