[ 
https://issues.apache.org/jira/browse/CASSANDRA-4314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290621#comment-13290621
 ] 

Wade Poziombka edited comment on CASSANDRA-4314 at 6/7/12 12:39 AM:
--------------------------------------------------------------------

I'm sorry, but I don't understand the statement.  I have done no deletes, and 
the rows are very small (the max is about 285 bytes according to cfstats, which 
is in line with what I know about this data).  I did drop a column family 
earlier, and I have updated many column values.  I don't know whether that 
creates tombstones too.  

The model is this:

token is the primary column family.  It has a column called "pan" which 
contains nearly unique binary values.  We need to be able to look up rows by 
pan, so I have a pan_XXX column family with pan as the row key, the token as a 
column name, and a timestamp as the value.  pan_XXX is basically an index into 
the token column family.  
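To make the layout described above concrete, here is a minimal in-memory sketch in plain Java. It makes no Cassandra or Pelops calls. The class and method names (`PanIndexModel`, `putToken`, `findTokensByPan`, `panOf`) are hypothetical, and pan values are shown as strings for readability even though the real values are binary.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for the token and pan_XXX column families. */
public class PanIndexModel {
    // token column family: row key = token, reduced here to its "pan" column.
    private final Map<String, String> tokenToPan = new HashMap<>();
    // pan_XXX column family: row key = pan, column name = token, value = timestamp.
    private final Map<String, Map<String, Long>> panIndex = new HashMap<>();

    /** Writes the token row and the matching index column together. */
    public void putToken(String token, String pan, long timestamp) {
        tokenToPan.put(token, pan);
        panIndex.computeIfAbsent(pan, k -> new TreeMap<>()).put(token, timestamp);
    }

    /** Finds tokens by pan through the index instead of scanning every token row. */
    public Set<String> findTokensByPan(String pan) {
        Map<String, Long> row = panIndex.get(pan);
        return row == null ? Collections.<String>emptySet() : row.keySet();
    }

    /** Reads the pan column of one token row, or null if the token is unknown. */
    public String panOf(String token) {
        return tokenToPan.get(token);
    }
}
```

Because pan values are nearly unique, each index row holds only a handful of token columns, which matches the small row sizes reported by cfstats.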

In the current scenario, each row in the pan column family holds very few 
token columns (indeed the largest row is 124 bytes by cfstats' measure).  At 
some point I need to essentially re-index (pan values change).  So I create a 
new dynamic column family (pan_YYY); for each token I modify its pan column and 
add a new column to pan_YYY; then, when fully done, I drop the pan_XXX column 
family.

So at the end of it a new column family (an index) is populated and the old one 
is dropped.  All values in one column of the token column family are modified.

What is shown in these logs involves none of the above, though.  I restarted 
Cassandra and did nothing but run the one query.

AND ONE MORE THING

I neglected to mention that the re-index, while updating the "token" column 
family, updates the indexed column too.  The indexed column essentially holds 
either XXX or YYY so we can resolve pan_XXX etc.  This may be important.  As 
the re-index goes through the rows, each one is eventually changed from XXX to 
YYY.  This is the same index that is used in the query in the issue description.
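
The re-index pass described in this comment can be sketched in memory as follows. This is an illustration only, with no Cassandra or Pelops calls. `ReindexSketch`, `TokenRow`, `addToken`, `reindex` and the `newPan` function are hypothetical names, and pan values are strings here rather than binary. The field `encryptionSettingsID` stands for the indexed column that holds XXX or YYY.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical in-memory sketch of the re-index pass; it makes no Cassandra calls. */
public class ReindexSketch {
    /** One row of the token column family, reduced to the two columns that matter here. */
    public static final class TokenRow {
        public String pan;                  // changes during a re-index
        public String encryptionSettingsID; // indexed column, holds XXX or YYY

        TokenRow(String pan, String encryptionSettingsID) {
            this.pan = pan;
            this.encryptionSettingsID = encryptionSettingsID;
        }
    }

    /** token column family: token -> row. */
    public final Map<String, TokenRow> tokens = new HashMap<>();
    /** Index column families by name (e.g. "pan_XXX"): pan -> (token -> timestamp). */
    public final Map<String, Map<String, Map<String, Long>>> indexFamilies = new HashMap<>();

    /** Inserts a token row and its entry in the matching pan_<id> index family. */
    public void addToken(String token, String pan, String settingsId, long now) {
        tokens.put(token, new TokenRow(pan, settingsId));
        indexFamilies.computeIfAbsent("pan_" + settingsId, k -> new HashMap<>())
                     .computeIfAbsent(pan, k -> new HashMap<>())
                     .put(token, now);
    }

    /** Moves every row from oldId to newId, then drops the old index family. */
    public void reindex(String oldId, String newId, UnaryOperator<String> newPan, long now) {
        // Step 1: create the new dynamic column family (pan_YYY).
        Map<String, Map<String, Long>> fresh =
                indexFamilies.computeIfAbsent("pan_" + newId, k -> new HashMap<>());
        for (Map.Entry<String, TokenRow> entry : tokens.entrySet()) {
            TokenRow row = entry.getValue();
            if (!oldId.equals(row.encryptionSettingsID)) {
                continue; // already migrated, or belongs to another settings ID
            }
            // Step 2: rewrite the pan column and flip the indexed column XXX -> YYY.
            row.pan = newPan.apply(row.pan);
            row.encryptionSettingsID = newId;
            // Step 3: add the token as a column under the new pan in pan_YYY.
            fresh.computeIfAbsent(row.pan, k -> new HashMap<>()).put(entry.getKey(), now);
        }
        // Step 4: only when fully done, drop the old index family (pan_XXX).
        indexFamilies.remove("pan_" + oldId);
    }
}
```

Note that the pass rewrites the indexed column in every matching token row, which is the detail flagged above as possibly important.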
                
> OOM errors on key slice
> -----------------------
>
>                 Key: CASSANDRA-4314
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4314
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.0
>         Environment: AS5 64, 64 GB ram, 12 core, Intel SSD 
>            Reporter: Wade Poziombka
>         Attachments: oom.zip
>
>
> My database (now at 1.0.10) is in a state in which it runs out of memory with 
> hardly any activity at all: a single key slice, nothing more.
> The attached logs show this, including verbose GC output in stdout.  I started 
> Cassandra and waited a bit to ensure that it was idle.
> Then (about 15:46) I ran this slice (using Pelops), which in this case should 
> return NO data.  My client times out and the database goes OOM.
>     ConsistencyLevel cl = ConsistencyLevel.TWO; // TWO nodes in my cluster
>     Selector s = new Selector(this.pool);
>     List<IndexExpression> indexExpressions = new ArrayList<IndexExpression>();
>     IndexExpression e = new IndexExpression(
>             ByteBuffer.wrap("encryptionSettingsID".getBytes(ASCII)), IndexOperator.EQ,
>             ByteBuffer.wrap(encryptionSettingsID.getBytes(Utils.ASCII)));
>     indexExpressions.add(e);
>     IndexClause indexClause = new IndexClause(indexExpressions,
>             ByteBuffer.wrap(EMPTY_BYTE_ARRAY), 1);
>     SlicePredicate predicate = new SlicePredicate();
>     predicate.setColumn_names(Arrays.asList(new ByteBuffer[]
>             { ByteBuffer.wrap(COL_PAN_ENC_BYTES) }));
>     List<KeySlice> slices = s.getKeySlices(CF_TOKEN, indexClause, predicate, cl);
> Note that “encryptionSettingsID” is an indexed column.  When this is executed 
> there should be no columns with the supplied value.
> I suppose I may have some kind of blatant error in this query but it is not 
> obvious to me.  I’m relatively new to cassandra.
> My key space is defined as follows:
> KsDef(name:TB_UNIT, 
> strategy_class:org.apache.cassandra.locator.SimpleStrategy, 
> strategy_options:{replication_factor=3}, 
> cf_defs:[
> CfDef(keyspace:TB_UNIT, name:token, column_type:Standard, 
> comparator_type:BytesType, column_metadata:[ColumnDef(name:70 61 6E 45 6E 63, 
> validation_class:BytesType), ColumnDef(name:63 72 65 61 74 65 54 73, 
> validation_class:DateType), ColumnDef(name:63 72 65 61 74 65 44 61 74 65, 
> validation_class:DateType, index_type:KEYS, index_name:TokenCreateDate), 
> ColumnDef(name:65 6E 63 72 79 70 74 69 6F 6E 53 65 74 74 69 6E 67 73 49 44, 
> validation_class:UTF8Type, index_type:KEYS, 
> index_name:EncryptionSettingsID)], caching:keys_only), 
> CfDef(keyspace:TB_UNIT, name:pan_d721fd40fd9443aa81cc6f59c8e047c6, 
> column_type:Standard, comparator_type:BytesType, caching:keys_only), 
> CfDef(keyspace:TB_UNIT, name:counters, column_type:Standard, 
> comparator_type:BytesType, column_metadata:[ColumnDef(name:75 73 65 43 6F 75 
> 6E 74, validation_class:CounterColumnType)], 
> default_validation_class:CounterColumnType, caching:keys_only)
> ])
> tpstats show pending tasks many minutes after time out:
> [root@r610-lb6 bin]# ../cassandra/bin/nodetool -h 127.0.0.1 tpstats
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> ReadStage                         3         3            107         0                 0
> RequestResponseStage              0         0             56         0                 0
> MutationStage                     0         0              6         0                 0
> ReadRepairStage                   0         0              0         0                 0
> ReplicateOnWriteStage             0         0              0         0                 0
> GossipStage                       0         0           2231         0                 0
> AntiEntropyStage                  0         0              0         0                 0
> MigrationStage                    0         0              0         0                 0
> MemtablePostFlusher               0         0              3         0                 0
> StreamStage                       0         0              0         0                 0
> FlushWriter                       0         0              3         0                 0
> MiscStage                         0         0              0         0                 0
> InternalResponseStage             0         0              0         0                 0
> HintedHandoff                     0         0              9         0                 0
> Message type           Dropped
> RANGE_SLICE                  0
> READ_REPAIR                  0
> BINARY                       0
> READ                         0
> MUTATION                     0
> REQUEST_RESPONSE             0
> cfstats:
> Keyspace: keyspace
>         Read Count: 118
>         Read Latency: 0.14722033898305084 ms.
>         Write Count: 0
>         Write Latency: NaN ms.
>         Pending Tasks: 0
>                 Column Family: token
>                 SSTable count: 7
>                 Space used (live): 4745885584
>                 Space used (total): 4745885584
>                 Number of Keys (estimate): 18626048
>                 Memtable Columns Count: 0
>                 Memtable Data Size: 0
>                 Memtable Switch Count: 0
>                 Read Count: 118
>                 Read Latency: 0.147 ms.
>                 Write Count: 0
>                 Write Latency: NaN ms.
>                 Pending Tasks: 0
>                 Bloom Filter False Postives: 0
>                 Bloom Filter False Ratio: 0.00000
>                 Bloom Filter Space Used: 55058352
>                 Key cache: disabled
>                 Row cache: disabled
>                 Compacted row minimum size: 150
>                 Compacted row maximum size: 258
>                 Compacted row mean size: 201
>                 Column Family: pan_2fef6478b62242dd94aecaa049b9d7bb
>                 SSTable count: 7
>                 Space used (live): 1987147156
>                 Space used (total): 1987147156
>                 Number of Keys (estimate): 14955264
>                 Memtable Columns Count: 0
>                 Memtable Data Size: 0
>                 Memtable Switch Count: 0
>                 Read Count: 0
>                 Read Latency: NaN ms.
>                 Write Count: 0
>                 Write Latency: NaN ms.
>                 Pending Tasks: 0
>                 Bloom Filter False Postives: 0
>                 Bloom Filter False Ratio: 0.00000
>                 Bloom Filter Space Used: 28056224
>                 Key cache: disabled
>                 Row cache: disabled
>                 Compacted row minimum size: 104
>                 Compacted row maximum size: 124
>                 Compacted row mean size: 124
>                 Column Family: counters
>                 SSTable count: 11
>                 Space used (live): 3433469364
>                 Space used (total): 3433469364
>                 Number of Keys (estimate): 21475328
>                 Memtable Columns Count: 0
>                 Memtable Data Size: 0
>                 Memtable Switch Count: 0
>                 Read Count: 0
>                 Read Latency: NaN ms.
>                 Write Count: 0
>                 Write Latency: NaN ms.
>                 Pending Tasks: 0
>                 Bloom Filter False Postives: 0
>                 Bloom Filter False Ratio: 0.00000
>                 Bloom Filter Space Used: 40271696
>                 Key cache capacity: 4652
>                 Key cache size: 4652
>                 Key cache hit rate: NaN
>                 Row cache: disabled
>                 Compacted row minimum size: 125
>                 Compacted row maximum size: 179
>                 Compacted row mean size: 150
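
A note for reading the quoted keyspace definition: the ColumnDef names are printed as space-separated hex bytes because the comparator is BytesType. The small helper below is a sketch for decoding them, not part of the original report. `HexColumnName` and `decode` are hypothetical names, and it assumes the bytes are plain ASCII, as the client code in the issue suggests.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: turns a hex dump like "70 61 6E" back into readable text. */
public class HexColumnName {
    /** Decodes space-separated hex bytes, assuming they encode ASCII characters. */
    public static String decode(String hex) {
        String[] parts = hex.trim().split("\\s+");
        byte[] bytes = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            bytes[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(decode("70 61 6E 45 6E 63"));       // panEnc
        System.out.println(decode("63 72 65 61 74 65 54 73")); // createTs
        System.out.println(decode("75 73 65 43 6F 75 6E 74")); // useCount
    }
}
```

Decoded this way, the indexed column `65 6E 63 72 79 70 74 69 6F 6E 53 65 74 74 69 6E 67 73 49 44` is `encryptionSettingsID`, the column used in the query.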

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

