Re: Cassandra Read Timeout

2015-03-19 Thread Jan
Yulian; 
Quote :Raw size is aroung 190MB.There are bigger raws with similar structure ( 
its index raws , which actually stores keys ) and everything is working fine on 
them, everything is working also fine on this cf but on other raw.
Tables data from CFStats ( First table has bigger raws but works fine , where 
second has timeout ) :
---
You asked: There are bigger raws with similar structure Question:  Do you mean 
bigger rows  ?      What is the structure of the statuspindexes Keyspace  
which table are you querying within it ?
you asked:  its index raws , which actually stores keysQuestion:  do you mean 
Index rows ?     how are you creating Indexes  , what type of Indexes ?  
you asked: Tables data from CFStats,  where second has timeout Question:  What 
is the time out value set at   whats different about both these tables ?     
What are you querying from the second table ? 

Unfortunately,  I have more questions that answers;  however  despite the 
sacrilege of using super-columns (lol), there has got to be a logical answer to 
the Performance problem you are having.       Hopefully we could dig in and 
find an answer . 

Jan/  



 



 On Tuesday, February 24, 2015 12:00 PM, Robert Coli rc...@eventbrite.com 
wrote:
   

 On Tue, Feb 24, 2015 at 8:50 AM, Yulian Oifa oifa.yul...@gmail.com wrote:

The structure is the same , the CFs are super column CFs , where key is long  ( 
timestamp to partition the index , so each 11 days new row is created ) , super 
Column is int32 and columns / values are timeuuids.I am running same queries , 
getting reversed slice by raw key and super column.

Obligatory notice that Super Columns are not really recommended for use. I have 
no idea if the performance problem you are seeing is related to the use of 
Super Columns.
=Rob 

  

Cassandra Read Timeout

2015-02-24 Thread Yulian Oifa
Hello to all
I have single node cassandra on amazon ec2.
Currently i am having a read timeout problem on single CF , single raw.

Raw size is aroung 190MB.There are bigger raws with similar structure ( its
index raws , which actually stores keys ) and everything is working fine on
them, everything is working also fine on this cf but on other raw.

Tables data from CFStats ( First table has bigger raws but works fine ,
where second has timeout ) :
Column Family: pendindexes
SSTable count: 5
Space used (live): 462298352
Space used (total): 462306752
SSTable Compression Ratio: 0.3511107495795905
Number of Keys (estimate): 640
Memtable Columns Count: 63339
Memtable Data Size: 12328802
Memtable Switch Count: 78
Read Count: 10
Read Latency: NaN ms.
Write Count: 1530113
Write Latency: 0.022 ms.
Pending Tasks: 0
Bloom Filter False Positives: 0
Bloom Filter False Ratio: 0.0
Bloom Filter Space Used: 3920
Compacted row minimum size: 73
Compacted row maximum size: 223875792
Compacted row mean size: 42694982
Average live cells per slice (last five minutes): 21.0
Average tombstones per slice (last five minutes): 0.0

Column Family: statuspindexes
SSTable count: 1
Space used (live): 99602136
Space used (total): 99609360
SSTable Compression Ratio: 0.34278775390997873
Number of Keys (estimate): 128
Memtable Columns Count: 6250
Memtable Data Size: 6061097
Memtable Switch Count: 65
Read Count: 1000
Read Latency: NaN ms.
Write Count: 1193142
Write Latency: 3.616 ms.
Pending Tasks: 0
Bloom Filter False Positives: 0
Bloom Filter False Ratio: 0.0
Bloom Filter Space Used: 656
Compacted row minimum size: 180
Compacted row maximum size: 186563160
Compacted row mean size: 63225562
Average live cells per slice (last five minutes): 0.0
Average tombstones per slice (last five minutes): 0.0

I have tried to debug it with cql , thats what i get:

 activity
| timestamp| source   | source_elapsed
-+--+--+

execute_cql3_query | 15:39:53,120 | 172.31.6.173 |  0
   Parsing Select * from
statuspindexes LIMIT 1; | 15:39:53,120 | 172.31.6.173 |875

Preparing statement | 15:39:53,121 | 172.31.6.173 |   1643

Determining replicas to query | 15:39:53,121 | 172.31.6.173 |   1740
 Executing seq scan across 1 sstables for [min(-9223372036854775808),
min(-9223372036854775808)] | 15:39:53,122 | 172.31.6.173 |   2581
 Seeking to partition
beginning in data file | 15:39:53,123 | 172.31.6.173 |   3118
   Timed out; received 0 of 1
responses for range 2 of 2 | 15:40:03,121 | 172.31.6.173 |   10001370

Request complete | 15:40:03,121 | 172.31.6.173 |   10001513

I have executed compaction on that cf.
What could lead to that problem?
Best regards
Yulian Oifa


Re: Cassandra Read Timeout

2015-02-24 Thread Yulian Oifa
Hello
The structure is the same , the CFs are super column CFs , where key is
long  ( timestamp to partition the index , so each 11 days new row is
created ) , super Column is int32 and columns / values are timeuuids.I am
running same queries , getting reversed slice by raw key and super column.
The number of reads is relatively high on second CF since i am testing it
several hours already , most of time there are no requests for read on both
of them , only writes.There is at most 1 read request in 20-30 seconds so
it should not create a load.There is also no reads ( 0 before and 1 after )
pending on tpstats.
Please also note that queries on different row same super column , and same
row , different super column are working , and if i am not mistaken
cassandra is loading complete raw including all super columns to memory (
either any request to this row should fail if this would be a memory
problem , or none...).

Best regards
Yulian Oifa


Re: Cassandra Read Timeout

2015-02-24 Thread Yulian Oifa
Hello
TP STATS Before Request:

Pool NameActive   Pending  Completed   Blocked  All
time blocked
ReadStage 0 0*7592835*
0 0
RequestResponseStage  0 0  0
0 0
MutationStage 0 0  *215980736*
0 0
ReadRepairStage   0 0  0
0 0
ReplicateOnWriteStage 0 0  0
0 0
GossipStage   0 0  0
0 0
AntiEntropyStage  0 0  0
0 0
MigrationStage0 0 28
0 0
MemoryMeter   0 0474
0 0
MemtablePostFlusher   0 0  32845
0 0
FlushWriter   0 0   4013
0  2239
MiscStage 0 0  0
0 0
PendingRangeCalculator0 0  1
0 0
commitlog_archiver0 0  0
0 0
InternalResponseStage 0 0  0
0 0
HintedHandoff 0 0  0
0 0

Message type   Dropped
RANGE_SLICE  0
READ_REPAIR  0
BINARY   0
READ 0
MUTATION 0
_TRACE   0
REQUEST_RESPONSE 0
COUNTER_MUTATION 0

TP STATS After Request:

Pool NameActive   Pending  Completed   Blocked  All
time blocked
ReadStage *1 17592942*
0 0
RequestResponseStage  0 0  0
0 0
MutationStage 0 0  *215983339*
0 0
ReadRepairStage   0 0  0
0 0
ReplicateOnWriteStage 0 0  0
0 0
GossipStage   0 0  0
0 0
AntiEntropyStage  0 0  0
0 0
MigrationStage0 0 28
0 0
MemoryMeter   0 0474
0 0
MemtablePostFlusher   0 0  32845
0 0
FlushWriter   0 0   4013
0  2239
MiscStage 0 0  0
0 0
PendingRangeCalculator0 0  1
0 0
commitlog_archiver0 0  0
0 0
InternalResponseStage 0 0  0
0 0
HintedHandoff 0 0  0
0 0

Message type   Dropped
RANGE_SLICE  0
READ_REPAIR  0
BINARY   0
READ 0
MUTATION 0
_TRACE   0
REQUEST_RESPONSE 0
COUNTER_MUTATION 0

The only items changed are : ReadStage increased by 107 + 1 Active/Pending
and MutationStage which changed by 2603.
Please note that system is writing all the time in batches ( each second 2
servers write one batch each ) so i dont see anything special in this
numbers.

Best regards
Yulian Oifa


Re: Cassandra Read Timeout

2015-02-24 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Indeed, I thought something odd could be happening to your cluster, but it 
seems it's working fine but the request is taking too long to complete. 

I noticed from your cfstats the read count was about 10 in the first CF and in 
the second one it was about 1000... Would you be doing much more reads in the 
second one? If the load is higher, it could justify the timeout.

Do both CFs have the same data model? Are you running exactly the same queries? 

Best regards,
Marcelo.

From: user@cassandra.apache.org 
Subject: Re: Cassandra Read Timeout

Hello
TP STATS Before Request:

Pool NameActive   Pending  Completed   Blocked  All 
time blocked
ReadStage 0 07592835 0  
   0
RequestResponseStage  0 0  0 0  
   0
MutationStage 0 0  215980736 0  
   0
ReadRepairStage   0 0  0 0  
   0
ReplicateOnWriteStage 0 0  0 0  
   0
GossipStage   0 0  0 0  
   0
AntiEntropyStage  0 0  0 0  
   0
MigrationStage0 0 28 0  
   0
MemoryMeter   0 0474 0  
   0
MemtablePostFlusher   0 0  32845 0  
   0
FlushWriter   0 0   4013 0  
2239
MiscStage 0 0  0 0  
   0
PendingRangeCalculator0 0  1 0  
   0
commitlog_archiver0 0  0 0  
   0
InternalResponseStage 0 0  0 0  
   0
HintedHandoff 0 0  0 0  
   0

Message type   Dropped
RANGE_SLICE  0
READ_REPAIR  0
BINARY   0
READ 0
MUTATION 0
_TRACE   0
REQUEST_RESPONSE 0
COUNTER_MUTATION 0

TP STATS After Request:

Pool NameActive   Pending  Completed   Blocked  All 
time blocked
ReadStage 1 17592942 0  
   0
RequestResponseStage  0 0  0 0  
   0
MutationStage 0 0  215983339 0  
   0
ReadRepairStage   0 0  0 0  
   0
ReplicateOnWriteStage 0 0  0 0  
   0
GossipStage   0 0  0 0  
   0
AntiEntropyStage  0 0  0 0  
   0
MigrationStage0 0 28 0  
   0
MemoryMeter   0 0474 0  
   0
MemtablePostFlusher   0 0  32845 0  
   0
FlushWriter   0 0   4013 0  
2239
MiscStage 0 0  0 0  
   0
PendingRangeCalculator0 0  1 0  
   0
commitlog_archiver0 0  0 0  
   0
InternalResponseStage 0 0  0 0  
   0
HintedHandoff 0 0  0 0  
   0

Message type   Dropped
RANGE_SLICE  0
READ_REPAIR  0
BINARY   0
READ 0
MUTATION 0
_TRACE   0
REQUEST_RESPONSE 0
COUNTER_MUTATION 0

The only items changed are : ReadStage increased by 107 + 1 Active/Pending and 
MutationStage which changed by 2603.
Please note that system is writing all the time in batches ( each second 2 
servers write one batch each ) so i dont see anything special in this numbers.

Best regards
Yulian Oifa




Re: Cassandra Read Timeout

2015-02-24 Thread Robert Coli
On Tue, Feb 24, 2015 at 8:50 AM, Yulian Oifa oifa.yul...@gmail.com wrote:

 The structure is the same , the CFs are super column CFs , where key is
 long  ( timestamp to partition the index , so each 11 days new row is
 created ) , super Column is int32 and columns / values are timeuuids.I am
 running same queries , getting reversed slice by raw key and super column.


Obligatory notice that Super Columns are not really recommended for use. I
have no idea if the performance problem you are seeing is related to the
use of Super Columns.

=Rob


Re: Cassandra Read Timeout

2015-02-24 Thread Marcelo Valle (BLOOMBERG/ LONDON)
I am sorry, not sure if I will be able to help you. I am not familiar with 
super columns, I would tell you to try to get rid of them as soon as possible. 
Maybe someone else in the list can help you.
Anyway, it really seems you have a cell with a very large amount of data and 
the request might be taking longer to complete because of the amount, but I 
also don't understand why the requests on same row different SC and same SC 
different row would work. 

[]s

From: oifa.yul...@gmail.com 
Subject: Re: Cassandra Read Timeout

Hello
I am running 1.2.19
Best regards
Yulian Oifa

On Tue, Feb 24, 2015 at 6:57 PM, Marcelo Valle (BLOOMBERG/ LONDON) 
mvallemil...@bloomberg.net wrote:

Super column? Out of curiosity, which Cassandra version are you running?


From: user@cassandra.apache.org 
Subject: Re: Cassandra Read Timeout

Hello
The structure is the same , the CFs are super column CFs , where key is long  ( 
timestamp to partition the index , so each 11 days new row is created ) , super 
Column is int32 and columns / values are timeuuids.I am running same queries , 
getting reversed slice by raw key and super column.
The number of reads is relatively high on second CF since i am testing it 
several hours already , most of time there are no requests for read on both of 
them , only writes.There is at most 1 read request in 20-30 seconds so it 
should not create a load.There is also no reads ( 0 before and 1 after ) 
pending on tpstats.
Please also note that queries on different row same super column , and same row 
, different super column are working , and if i am not mistaken cassandra is 
loading complete raw including all super columns to memory ( either any request 
to this row should fail if this would be a memory problem , or none...).

Best regards 
Yulian Oifa




Re: Cassandra Read Timeout

2015-02-24 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Super column? Out of curiosity, which Cassandra version are you running?


From: user@cassandra.apache.org 
Subject: Re: Cassandra Read Timeout

Hello
The structure is the same , the CFs are super column CFs , where key is long  ( 
timestamp to partition the index , so each 11 days new row is created ) , super 
Column is int32 and columns / values are timeuuids.I am running same queries , 
getting reversed slice by raw key and super column.
The number of reads is relatively high on second CF since i am testing it 
several hours already , most of time there are no requests for read on both of 
them , only writes.There is at most 1 read request in 20-30 seconds so it 
should not create a load.There is also no reads ( 0 before and 1 after ) 
pending on tpstats.
Please also note that queries on different row same super column , and same row 
, different super column are working , and if i am not mistaken cassandra is 
loading complete raw including all super columns to memory ( either any request 
to this row should fail if this would be a memory problem , or none...).

Best regards 
Yulian Oifa




Re: Cassandra Read Timeout

2015-02-24 Thread Yulian Oifa
Hello
I am running 1.2.19
Best regards
Yulian Oifa

On Tue, Feb 24, 2015 at 6:57 PM, Marcelo Valle (BLOOMBERG/ LONDON) 
mvallemil...@bloomberg.net wrote:

 Super column? Out of curiosity, which Cassandra version are you running?


 From: user@cassandra.apache.org
 Subject: Re: Cassandra Read Timeout

 Hello
 The structure is the same , the CFs are super column CFs , where key is
 long  ( timestamp to partition the index , so each 11 days new row is
 created ) , super Column is int32 and columns / values are timeuuids.I am
 running same queries , getting reversed slice by raw key and super column.
 The number of reads is relatively high on second CF since i am testing it
 several hours already , most of time there are no requests for read on both
 of them , only writes.There is at most 1 read request in 20-30 seconds so
 it should not create a load.There is also no reads ( 0 before and 1 after )
 pending on tpstats.
 Please also note that queries on different row same super column , and
 same row , different super column are working , and if i am not mistaken
 cassandra is loading complete raw including all super columns to memory (
 either any request to this row should fail if this would be a memory
 problem , or none...).

 Best regards
 Yulian Oifa