[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706133#comment-13706133 ] Jonathan Ellis commented on CASSANDRA-5520: --- Committed! > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Affects Versions: 1.2.0 > Environment: Linux >Reporter: Ilya Kirnos >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 1.2.7 > > Attachments: 5520-v1.txt, 5520-v2.txt, 5520-v3.txt, > 5520-v4-trunk.txt, 5520-v4.txt > > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > {noformat} > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 362 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 | > Acquiring sstable references | xxx.xxx.213.136 | 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 | > Merging data from memtables and 0 sstables | xxx.xxx.4.16 | 510 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 376 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 | > Merging memtable contents | xxx.xxx.213.136 | 195 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9480-e9d811e0fc18 | > Read 0 live cells and 0 tombstoned | xxx.xxx.4.16 | 530 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 401 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 202 | > ReadStage:41 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39146-
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705212#comment-13705212 ] Jonathan Ellis commented on CASSANDRA-5520: --- Can you also post a version against trunk? Unfortunately the merge is not trivial. > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Affects Versions: 1.2.0 > Environment: Linux >Reporter: Ilya Kirnos >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 1.2.7 > > Attachments: 5520-v1.txt, 5520-v2.txt, 5520-v3.txt, 5520-v4.txt > > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > {noformat} > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 362 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 | > Acquiring sstable references | xxx.xxx.213.136 | 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 | > Merging data from memtables and 0 sstables | xxx.xxx.4.16 | 510 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 376 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 | > Merging memtable contents | xxx.xxx.213.136 | 195 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9480-e9d811e0fc18 | > Read 0 live cells and 0 tombstoned | xxx.xxx.4.16 | 530 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 401 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 202 | > ReadStage:4
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13704744#comment-13704744 ] Jonathan Ellis commented on CASSANDRA-5520: --- bq. I rarely use debug logging (especially while looking into timeouts), so I can't say how useful it is, but I feel like some mention of the timeout happening at debug level would be good, especially considering it's a somewhat rare event. In that case let's just clean it up so we're not logging redundant timeouts. > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Improvement >Affects Versions: 1.2.4 > Environment: Linux >Reporter: Ilya Kirnos >Assignee: Tyler Hobbs >Priority: Minor > Attachments: 5520-v1.txt, 5520-v2.txt, 5520-v3.txt > > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > {noformat} > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 362 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 | > Acquiring sstable references | xxx.xxx.213.136 | 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 | > Merging data from memtables and 0 sstables | xxx.xxx.4.16 | 510 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 376 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 | > Merging memtable contents | xxx.xxx.213.136 | 195 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9480-e9d811e0fc18 | > Read 0 live cells and 0 tombstoned | xxx.xxx.4.16 | 530 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 401 | > WRITE-/xxx.xxx.201.218 > c7
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13704673#comment-13704673 ] Tyler Hobbs commented on CASSANDRA-5520: bq. I was going to commit this but then I realized we still have a bunch of logger.debug for the timeouts in CassandraServer. Inclined to say we should just rip out logger.debug on this path after all, it's too noisy to use in a live system and tracing is just better all around. WDYT? I rarely use debug logging (especially while looking into timeouts), so I can't say how useful it is, but I feel like some mention of the timeout happening at debug level would be good, especially considering it's a somewhat rare event. bq. Meant to comment on this earlier – it does seem useful to denormalize this into sessions, maybe even index it so you can easily query it for failures when you have the "trace N% of my requests" turned on. (Should we enable that by default in 2.0?) ... However if we are going to standardize tracing events soon (CASSANDRA-5672) then we could index the event type directly which would be almost as user-friendly, so it might not warrant its own sessions column. Adding a session column seems a little more human-user friendly to me, so perhaps having both would be good. Regarding enabling tracing by default, I'm -1 on that. Given that it incurs extra overhead, I say we follow the principle of least surprise and only enable it when requested. bq. Either way I think it's worth pulling into a separate ticket. Thanks, I agree. > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Improvement >Affects Versions: 1.2.4 > Environment: Linux >Reporter: Ilya Kirnos >Assignee: Tyler Hobbs >Priority: Minor > Attachments: 5520-v1.txt, 5520-v2.txt, 5520-v3.txt > > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > {noformat} > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > S
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13704100#comment-13704100 ] Jonathan Ellis commented on CASSANDRA-5520: --- bq. re-reading Sylvain's comment, I realized that I didn't add a "success" column for sessions Meant to comment on this earlier -- it does seem useful to denormalize this into sessions, maybe even index it so you can easily query it for failures when you have the "trace N% of my requests" turned on. (Should we enable that by default in 2.0?) However if we are going to standardize tracing events soon (CASSANDRA-5672) then we could index the event type directly which would be almost as user-friendly, so it might not warrant its own sessions column. Either way I think it's worth pulling into a separate ticket. > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Improvement >Affects Versions: 1.2.4 > Environment: Linux >Reporter: Ilya Kirnos >Assignee: Tyler Hobbs >Priority: Minor > Attachments: 5520-v1.txt, 5520-v2.txt, 5520-v3.txt > > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > {noformat} > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 362 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 | > Acquiring sstable references | xxx.xxx.213.136 | 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 | > Merging data from memtables and 0 sstables | xxx.xxx.4.16 | 510 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 376 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 | > Merging memtable contents | xxx.xxx.213.136 | 195 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703790#comment-13703790 ] Jonathan Ellis commented on CASSANDRA-5520: --- Thanks! Comments: - We already trace the source when we get a reply to a request session, so including the addresses again in the timeout is probably overkill. I'd just go with the reply count. - I don't think ex.toString gives reply count, so probably want to reuse the reply-counting loging for the logger.debug branch - The "timed out after receiving all replies" message could probably use a bit more explanation for people unfamiliar w/ the digest logic I also note that the patch is reversed, which confused me a bit but looking at the branch cleared it up. :) > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Improvement >Affects Versions: 1.2.4 > Environment: Linux >Reporter: Ilya Kirnos >Assignee: Tyler Hobbs >Priority: Minor > Attachments: 5520-v1.txt > > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > {noformat} > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 362 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 | > Acquiring sstable references | xxx.xxx.213.136 | 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 | > Merging data from memtables and 0 sstables | xxx.xxx.4.16 | 510 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 376 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 | > Merging memtable contents | xxx.xxx.213.136 | 195 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9480-e9d811e0fc18 | > Read 0 live cells
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702499#comment-13702499 ] Jonathan Ellis commented on CASSANDRA-5520: --- I think I meant the timeout details, because logging the request result in the timeout block doesn't make sense. > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Improvement >Affects Versions: 1.2.4 > Environment: Linux >Reporter: Ilya Kirnos >Assignee: Tyler Hobbs >Priority: Minor > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > {noformat} > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 362 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 | > Acquiring sstable references | xxx.xxx.213.136 | 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 | > Merging data from memtables and 0 sstables | xxx.xxx.4.16 | 510 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 376 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 | > Merging memtable contents | xxx.xxx.213.136 | 195 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9480-e9d811e0fc18 | > Read 0 live cells and 0 tombstoned | xxx.xxx.4.16 | 530 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 401 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 202 | > ReadStage:41 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39146-af3a-11e2-9480-e9d811e0fc18 | > Enqueuing response to
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702467#comment-13702467 ] Tyler Hobbs commented on CASSANDRA-5520: bq. If we're going to gold-plate it, including the result in the trace (but not the debug call, we only want to construct it if requested, because creating Strings is expensive) is a good idea too. Can you elaborate on what you mean by "including the result in the trace"? Are you just referring to the exception details that Sylvain mentioned? > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Improvement >Affects Versions: 1.2.4 > Environment: Linux >Reporter: Ilya Kirnos >Assignee: Tyler Hobbs >Priority: Minor > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > {noformat} > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 362 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 | > Acquiring sstable references | xxx.xxx.213.136 | 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 | > Merging data from memtables and 0 sstables | xxx.xxx.4.16 | 510 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 376 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 | > Merging memtable contents | xxx.xxx.213.136 | 195 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9480-e9d811e0fc18 | > Read 0 live cells and 0 tombstoned | xxx.xxx.4.16 | 530 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 401 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e3
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646587#comment-13646587 ] Jonathan Ellis commented on CASSANDRA-5520: --- As I said on IRC, I think the fix is as simple as changing {{logger.debug("... timed out")}} to trace calls. (More precisely, we should trace it if tracing is enabled, otherwise logger.debug it, for symmetry with our open-trace-session code.) > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Bug >Affects Versions: 1.2.4 > Environment: Linux >Reporter: Ilya Kirnos >Assignee: Aleksey Yeschenko > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 362 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 | > Acquiring sstable references | xxx.xxx.213.136 | 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 | > Merging data from memtables and 0 sstables | xxx.xxx.4.16 | 510 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 376 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 | > Merging memtable contents | xxx.xxx.213.136 | 195 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9480-e9d811e0fc18 | > Read 0 live cells and 0 tombstoned | xxx.xxx.4.16 | 530 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 401 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 202 | > ReadStage:41 > c7e36a30-af3a-
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646506#comment-13646506 ] Brandon Williams commented on CASSANDRA-5520: - +1 on all of that. > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Bug >Affects Versions: 1.2.4 > Environment: Linux >Reporter: Ilya Kirnos > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 362 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 | > Acquiring sstable references | xxx.xxx.213.136 | 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 | > Merging data from memtables and 0 sstables | xxx.xxx.4.16 | 510 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 376 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 | > Merging memtable contents | xxx.xxx.213.136 | 195 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9480-e9d811e0fc18 | > Read 0 live cells and 0 tombstoned | xxx.xxx.4.16 | 530 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 401 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 202 | > ReadStage:41 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39146-af3a-11e2-9480-e9d811e0fc18 | > Enqueuing response to /xxx.xxx.90.147 | xxx.xxx.4.16 | 572 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39146-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xx
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646456#comment-13646456 ] Sylvain Lebresne commented on CASSANDRA-5520: - To be fair to Ilya, maybe we could add a line in the trace log saying that we've timeouted, how many response we were waiting for, and which node answered. Could also be useful to record with the session info the result of the operation (i.e. success or exception), so it's easy to filter out the one that have thrown (without having to compare the total duration to the rpc timeout, which can be painful if you've change the rpc timeout a few time, and besides it's not like our timeout are very precise so I think it's possible (though very unlikely) for a query to have a total duration slightly greater than the rpc timeout without having timeouted). I'ts improvement, not bugs, but still likely nice to have. > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Bug >Affects Versions: 1.2.4 > Environment: Linux >Reporter: Ilya Kirnos > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 362 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 | > Acquiring sstable references | xxx.xxx.213.136 | 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 | > Merging data from memtables and 0 sstables | xxx.xxx.4.16 | 510 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 376 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 | > Merging memtable contents | xxx.xxx.213.136 | 195 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9480-e9d811e
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646135#comment-13646135 ] Ilya Kirnos commented on CASSANDRA-5520: Sorry to be dense, but: what does it mean for a session to end late with no events seemingly in flight? isn't a session just a container for events? > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Bug >Affects Versions: 1.2.4 > Environment: Linux >Reporter: Ilya Kirnos > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 362 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 | > Acquiring sstable references | xxx.xxx.213.136 | 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 | > Merging data from memtables and 0 sstables | xxx.xxx.4.16 | 510 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 376 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 | > Merging memtable contents | xxx.xxx.213.136 | 195 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9480-e9d811e0fc18 | > Read 0 live cells and 0 tombstoned | xxx.xxx.4.16 | 530 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 401 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 202 | > ReadStage:41 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39146-af3a-11e2-9480-e9d811e0fc18 | > Enqueuing response to /xxx.xxx.90.147 | xxx.xxx.4.16 | 572 | ReadStage:5329
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646116#comment-13646116 ] Brandon Williams commented on CASSANDRA-5520: - The events may not always add up to the total. There are at least two ways sessions and events can be inconsistent with each other: first, there are events happening after the session has ended, and second, there are too few events to explain why the session ended so late. You see to have the latter case. > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Bug >Affects Versions: 1.2.4 > Environment: Linux >Reporter: Ilya Kirnos > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id | activity | source | source_elapsed | thread > -+--- > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.153.16 | xxx.xxx.90.147 | 246 | > WRITE-/xxx.xxx.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 | > Message received from /xxx.xxx.90.147 | xxx.xxx.4.16 | 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 253 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 | > Executing single-partition query on CardHash | xxx.xxx.4.16 | 421 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 310 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 | > Message received from /xxx.xxx.90.147 | xxx.xxx.213.136 | 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 | > Acquiring sstable references | xxx.xxx.4.16 | 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.106.37 | xxx.xxx.90.147 | 352 | > WRITE-/xxx.xxx.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 144 | > ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 | > Merging memtable contents | xxx.xxx.4.16 | 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 362 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 | > Acquiring sstable references | xxx.xxx.213.136 | 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 | > Merging data from memtables and 0 sstables | xxx.xxx.4.16 | 510 | > ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe | > Sending message to /xxx.xxx.151.214 | xxx.xxx.90.147 | 376 | > WRITE-/xxx.xxx.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 | > Merging memtable contents | xxx.xxx.213.136 | 195 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9480-e9d811e0fc18 | > Read 0 live cells and 0 tombstoned | xxx.xxx.4.16 | 530 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-9ec9-772ec39805fe | > Sending message to /10.248.95.237 | xxx.xxx.90.147 | 401 | > WRITE-/xxx.xxx.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39145-af3a-11e2-b8dc-a7032a583115 | > Executing single-partition query on CardHash | xxx.xxx.213.136 | 202 | >
[jira] [Commented] (CASSANDRA-5520) Query tracing session info inconsistent with events info
[ https://issues.apache.org/jira/browse/CASSANDRA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645945#comment-13645945 ] Ilya Kirnos commented on CASSANDRA-5520: just to clarify: how can i tell which part of the request caused the timeout? or are you saying that all the parts, when added up, exceed 10 seconds? looking at the trace above, that's not the case. > Query tracing session info inconsistent with events info > > > Key: CASSANDRA-5520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5520 > Project: Cassandra > Issue Type: Bug >Affects Versions: 1.2.4 > Environment: Linux >Reporter: Ilya Kirnos > > Session info for a trace is showing that a query took > 10 seconds (it timed > out). > cqlsh:system_traces> select session_id, duration, request from sessions where > session_id = c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | duration | request > --+--+ > c7e36a30-af3a-11e2-9ec9-772ec39805fe | 1230 | multiget_slice > However, the event-level breakdown shows no such large duration: > cqlsh:system_traces> select * from events where session_id = > c7e36a30-af3a-11e2-9ec9-772ec39805fe; > session_id | event_id > | activity | source | > source_elapsed | thread > --+--+--+++ > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a30-af3a-11e2-9480-e9d811e0fc18 > | Message received from /50.112.90.147 |50.112.4.16 | > 19 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e36a31-af3a-11e2-9ec9-772ec39805fe > |Sending message to /10.252.153.16 | 50.112.90.147 | > 246 | WRITE-/50.112.4.16 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9480-e9d811e0fc18 > | Message received from /50.112.90.147 |50.112.4.16 | > 259 | Thread-57 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-9ec9-772ec39805fe > |Sending message to /10.248.106.37 | 50.112.90.147 | > 253 |WRITE-/50.112.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39140-af3a-11e2-b8dc-a7032a583115 > | Message received from /50.112.90.147 | 50.112.213.136 | > 25 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9480-e9d811e0fc18 > | Executing single-partition query on CardHash |50.112.4.16 | > 421 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-9ec9-772ec39805fe > | Sending message to /10.252.151.214 | 50.112.90.147 | > 310 | WRITE-/50.112.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39141-af3a-11e2-b8dc-a7032a583115 > | Message received from /50.112.90.147 | 50.112.213.136 | > 106 | Thread-94 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9480-e9d811e0fc18 > | Acquiring sstable references |50.112.4.16 | > 444 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-9ec9-772ec39805fe > |Sending message to /10.248.106.37 | 50.112.90.147 | > 352 |WRITE-/50.112.79.52 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39142-af3a-11e2-b8dc-a7032a583115 > | Executing single-partition query on CardHash | 50.112.213.136 | > 144 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9480-e9d811e0fc18 > |Merging memtable contents |50.112.4.16 | > 472 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-9ec9-772ec39805fe > |Sending message to /10.248.95.237 | 50.112.90.147 | > 362 | WRITE-/50.112.201.218 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39143-af3a-11e2-b8dc-a7032a583115 > | Acquiring sstable references | 50.112.213.136 | > 164 | ReadStage:11 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9480-e9d811e0fc18 > | Merging data from memtables and 0 sstables |50.112.4.16 | > 510 | ReadStage:5329 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-9ec9-772ec39805fe > | Sending message to /10.252.151.214 | 50.112.90.147 | > 376 | WRITE-/50.112.213.136 > c7e36a30-af3a-11e2-9ec9-772ec39805fe | c7e39144-af3a-11e2-b8dc-a7032a583115 > |Merging memtable