[ 
https://issues.apache.org/jira/browse/CASSANDRA-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442538#comment-13442538
 ] 

Tyler Hobbs commented on CASSANDRA-4573:
----------------------------------------

Vijay, I'm actually not seeing very long garbage collections, if I'm reading 
the logs correctly.  These are the relevant logs, running with a heap of 2GB 
and young gen size of 400MB:

{noformat}
{Heap before GC invocations=0 (full 0):
 par new generation   total 368640K, used 327680K [0x2f200000, 0x48200000, 0x48200000)
  eden space 327680K, 100% used [0x2f200000, 0x43200000, 0x43200000)
  from space 40960K,   0% used [0x43200000, 0x43200000, 0x45a00000)
  to   space 40960K,   0% used [0x45a00000, 0x45a00000, 0x48200000)
 concurrent mark-sweep generation total 1687552K, used 0K [0x48200000, 0xaf200000, 0xaf200000)
 concurrent-mark-sweep perm gen total 16384K, used 14333K [0xaf200000, 0xb0200000, 0xb3200000)
2012-08-27T12:03:56.096-0500: [GC Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 432013312
Max   Chunk Size: 432013312
Number of Blocks: 1
Av.  Block  Size: 432013312
Tree      Height: 1
Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 0
Max   Chunk Size: 0
Number of Blocks: 0
Tree      Height: 0
[ParNew
Desired survivor size 20971520 bytes, new threshold 1 (max 1)
- age   1:    2692712 bytes,    2692712 total
: 327680K->2642K(368640K), 0.0564410 secs] 327680K->2642K(2056192K)After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 431996928
Max   Chunk Size: 431996928
Number of Blocks: 1
Av.  Block  Size: 431996928
Tree      Height: 1
After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 0
Max   Chunk Size: 0
Number of Blocks: 0
Tree      Height: 0
, 0.0567720 secs] [Times: user=0.03 sys=0.00, real=0.06 secs] 
Heap after GC invocations=1 (full 0):
 par new generation   total 368640K, used 2642K [0x2f200000, 0x48200000, 0x48200000)
  eden space 327680K,   0% used [0x2f200000, 0x2f200000, 0x43200000)
  from space 40960K,   6% used [0x45a00000, 0x45c94998, 0x48200000)
  to   space 40960K,   0% used [0x43200000, 0x43200000, 0x45a00000)
 concurrent mark-sweep generation total 1687552K, used 0K [0x48200000, 0xaf200000, 0xaf200000)
 concurrent-mark-sweep perm gen total 16384K, used 14333K [0xaf200000, 0xb0200000, 0xb3200000)
}
Total time for which application threads were stopped: 0.0576140 seconds
Total time for which application threads were stopped: 0.0080490 seconds
Total time for which application threads were stopped: 0.0000810 seconds
Total time for which application threads were stopped: 0.0000410 seconds
Total time for which application threads were stopped: 0.0000360 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000360 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000320 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000370 seconds
Total time for which application threads were stopped: 0.0000360 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000360 seconds
Total time for which application threads were stopped: 0.0000320 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000320 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000760 seconds
Total time for which application threads were stopped: 0.0000490 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000370 seconds
Total time for which application threads were stopped: 0.0000460 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0004150 seconds
Total time for which application threads were stopped: 0.0001230 seconds
Total time for which application threads were stopped: 0.0035150 seconds
{noformat}
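
As a sanity check on the configuration, the sizes in the heap dump add up to the stated 2GB heap and 400MB young gen. A minimal sketch of that arithmetic in Python; the variable names are illustrative and the numbers are copied from the log above:

```python
KB_PER_MB = 1024

# Sizes in KB, copied from the "Heap before GC" section of the log.
eden_kb = 327680
survivor_kb = 40960      # each of the "from" and "to" spaces
old_gen_kb = 1687552     # concurrent mark-sweep generation

# Young gen is eden plus both survivor spaces.
young_kb = eden_kb + 2 * survivor_kb
heap_kb = young_kb + old_gen_kb

young_mb = young_kb // KB_PER_MB
heap_mb = heap_kb // KB_PER_MB

# The log's "total" figures count only one survivor space, which is why
# they come out 40960K lower than the configured sizes.
reported_young_kb = eden_kb + survivor_kb
reported_heap_kb = reported_young_kb + old_gen_kb

print(f"young gen: {young_mb} MB, heap: {heap_mb} MB")
print(f"reported young: {reported_young_kb}K, reported heap: {reported_heap_kb}K")
```

The reported figures match the `368640K` and `2056192K` totals in the ParNew line, and the perm gen is sized separately from the heap.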

The client-side socket timeout is set to 3 seconds, and the longest pause in 
the logs above is about 58 ms, so garbage collection is not what is causing 
the failure.  I should also note that the client-side error is different when 
the client socket actually times out (something like 
{{TTransportException: timed out reading 4 bytes}}).
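
For anyone who wants to repeat the comparison on a longer GC log, here is a rough Python sketch that pulls out the "application threads were stopped" lines and compares the longest pause against the client timeout. The helper name `longest_pause` and the constant `CLIENT_TIMEOUT_SECS` are hypothetical, not part of Cassandra or the JVM, and the sample text is a small excerpt of the log above:

```python
import re

# Matches the stop-the-world pause lines that appear in the GC log above.
PAUSE_RE = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds"
)


def longest_pause(log_text):
    """Return the longest pause in seconds, or 0.0 if no pause lines are found."""
    pauses = [float(m.group(1)) for m in PAUSE_RE.finditer(log_text)]
    return max(pauses, default=0.0)


# Excerpt of the log above; in practice, read the whole GC log file instead.
sample = """\
Total time for which application threads were stopped: 0.0576140 seconds
Total time for which application threads were stopped: 0.0080490 seconds
Total time for which application threads were stopped: 0.0000810 seconds
Total time for which application threads were stopped: 0.0035150 seconds
"""

CLIENT_TIMEOUT_SECS = 3.0  # assumed to match the client-side socket timeout

worst = longest_pause(sample)
print(f"longest pause: {worst:.4f}s")
print(f"under client timeout: {worst < CLIENT_TIMEOUT_SECS}")
```

This only looks at individual pauses; it does not account for several pauses landing back to back within a single request.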
                
> HSHA doesn't handle large messages gracefully
> ---------------------------------------------
>
>                 Key: CASSANDRA-4573
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4573
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Vijay
>         Attachments: repro.py
>
>
> HSHA doesn't seem to enforce any kind of max message length, and when 
> messages are too large, it doesn't fail gracefully.
> With debug logs enabled, you'll see this:
> {{DEBUG 13:13:31,805 Unexpected state 16}}
> Which seems to mean that there's a SelectionKey that's valid, but isn't ready 
> for reading, writing, or accepting.
> Client-side, you'll get this thrift error (while trying to read a frame as 
> part of {{recv_batch_mutate}}):
> {{TTransportException: TSocket read 0 bytes}}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
