[ 
https://issues.apache.org/jira/browse/GEODE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8920:
--------------------------------------
    Description: 
Debug logging in DirectChannel lets us know the IDs of receivers of a message 
and the toString of the message but it's very difficult to figure out what 
thread on the receiving end is supposed to process that message.

Here's an example of what we currently have:

[debug 2021/02/01 16:15:17.492 PST persistgemfire8_host1_8586 
<vm_9_thr_25_persist8_host1_8586> tid=0x4f0] Sending 
(DLockRequestProcessor.DLockResponseMessage responding GRANT; 
serviceName=__PDX(version 4); objectName=PDX_LOCK; responseCode=0; 
keyIfFailed=null; leaseExpireTime=9223372036854775807; processorId=509; 
lockId=509) to 1 peers 
([rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire9_host1_8517:8517)<ec><v51>:41005])
 via tcp/ip

This does not tell you anything about the receiver except its ID.  On the 
receiving side the thread that, in this run, would handle that message is this:

persistgemfire9_host1_8517 <P2P message reader for 
rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire8_host1_8586:8586)<ec><v51>:41006
 unshared ordered *uid=1036* dom #1 local port=47207 remote port=42068> tid=0x51

I've highlighted the *uid* here because that is the _uniqueId_ of the sending 
Connection.  If you looked through the logs or stack traces of the receiver and 
knew the uniqueId of the sending Connection you could easily locate the thread 
that should receive this DLockResponseMessage.  Currently this is much harder 
than it needs to be because the DirectChannel _Sending_ log message doesn't 
include the _uniqueId_ of the Connections it is using to send the message.

Let's change that log message to include the _uniqueId_ of each outgoing 
Connection.  Maybe something like this:

Sending (message.toString()) to 1 peers (peer ID)*, uid=1036* via tcp/ip

and on the receiving side we could be clearer about what the *uid* in the 
thread's name means:

persistgemfire9_host1_8517 <P2P message reader for 
rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire8_host1_8586:8586)<ec><v51>:41006
 unshared ordered *sender uid=1036* dom #1 local port=47207 remote port=42068> 
tid=0x51

or something like that.

Now we can look at the _Sending_ message and know that the receiving thread 
will have _uid=1036_ in its name.  Knowing this it ought to be possible to 
write a program/script to trace a message and its consequences from one node to 
another.


  was:
Debug logging in DirectChannel lets us know the IDs of receivers of a message 
and the toString of the message but it's very difficult to figure out what 
thread on the receiving end is supposed to process that message.

Here's an example of what we currently have:

[debug 2021/02/01 16:15:17.492 PST persistgemfire8_host1_8586 
<vm_9_thr_25_persist8_host1_8586> tid=0x4f0] Sending 
(DLockRequestProcessor.DLockResponseMessage responding GRANT; 
serviceName=__PDX(version 4); objectName=PDX_LOCK; responseCode=0; 
keyIfFailed=null; leaseExpireTime=9223372036854775807; processorId=509; 
lockId=509) to 1 peers 
([rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire9_host1_8517:8517)<ec><v51>:41005])
 via tcp/ip

This does not tell you anything about the receiver except its ID.  On the 
receiving side the thread that, in this run, would handle that message is this:

persistgemfire9_host1_8517 <P2P message reader for 
rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire8_host1_8586:8586)<ec><v51>:41006
 unshared ordered *uid=1036* dom #1 local port=47207 remote port=42068> tid=0x51

I've highlighted the *uid* here because that is the _uniqueId_ of the sending 
Connection.  If you looked through the logs or stack traces of the receiver and 
knew the uniqueId of the sending Connection you could easily locate the thread 
that should receive this DLockResponseMessage.  Currently this is much harder 
than it needs to be because the DirectChannel _Sending_ log message doesn't 
include the _uniqueId_ of the Connections it is using to send the message.

Let's change that log message to include the _uniqueId_ of each outgoing 
Connection.  Maybe something like this:

Sending (message.toString()) to 1 peers (peer ID)to 1 peers (peer ID)*, 
uid=1036* via tcp/ip

and on the receiving side we could be clearer about what the *uid* in the 
thread's name means:

persistgemfire9_host1_8517 <P2P message reader for 
rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire8_host1_8586:8586)<ec><v51>:41006
 unshared ordered *sender uid=1036* dom #1 local port=47207 remote port=42068> 
tid=0x51

or something like that.

Now we can look at the _Sending_ message and know that the receiving thread 
will have _uid=1036_ in its name.  Knowing this it ought to be possible to 
write a program/script to trace a message and its consequences from one node to 
another.



> Modify debug logging to make it easier to trace a message
> ---------------------------------------------------------
>
>                 Key: GEODE-8920
>                 URL: https://issues.apache.org/jira/browse/GEODE-8920
>             Project: Geode
>          Issue Type: Improvement
>          Components: membership
>            Reporter: Bruce J Schuchardt
>            Priority: Major
>
> Debug logging in DirectChannel lets us know the IDs of receivers of a message 
> and the toString of the message but it's very difficult to figure out what 
> thread on the receiving end is supposed to process that message.
> Here's an example of what we currently have:
> [debug 2021/02/01 16:15:17.492 PST persistgemfire8_host1_8586 
> <vm_9_thr_25_persist8_host1_8586> tid=0x4f0] Sending 
> (DLockRequestProcessor.DLockResponseMessage responding GRANT; 
> serviceName=__PDX(version 4); objectName=PDX_LOCK; responseCode=0; 
> keyIfFailed=null; leaseExpireTime=9223372036854775807; processorId=509; 
> lockId=509) to 1 peers 
> ([rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire9_host1_8517:8517)<ec><v51>:41005])
>  via tcp/ip
> This does not tell you anything about the receiver except its ID.  On the 
> receiving side the thread that, in this run, would handle that message is 
> this:
> persistgemfire9_host1_8517 <P2P message reader for 
> rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire8_host1_8586:8586)<ec><v51>:41006
>  unshared ordered *uid=1036* dom #1 local port=47207 remote port=42068> 
> tid=0x51
> I've highlighted the *uid* here because that is the _uniqueId_ of the sending 
> Connection.  If you looked through the logs or stack traces of the receiver 
> and knew the uniqueId of the sending Connection you could easily locate the 
> thread that should receive this DLockResponseMessage.  Currently this is much 
> harder than it needs to be because the DirectChannel _Sending_ log message 
> doesn't include the _uniqueId_ of the Connections it is using to send the 
> message.
> Let's change that log message to include the _uniqueId_ of each outgoing 
> Connection.  Maybe something like this:
> Sending (message.toString()) to 1 peers (peer ID)*, uid=1036* via tcp/ip
> and on the receiving side we could be clearer about what the *uid* in the 
> thread's name means:
> persistgemfire9_host1_8517 <P2P message reader for 
> rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire8_host1_8586:8586)<ec><v51>:41006
>  unshared ordered *sender uid=1036* dom #1 local port=47207 remote 
> port=42068> tid=0x51
> or something like that.
> Now we can look at the _Sending_ message and know that the receiving thread 
> will have _uid=1036_ in its name.  Knowing this it ought to be possible to 
> write a program/script to trace a message and its consequences from one node 
> to another.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to