[jira] [Comment Edited] (CASSANDRA-12877) SASI index throwing AssertionError on index creation
[ https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637691#comment-15637691 ]

Voytek Jarnot edited comment on CASSANDRA-12877 at 11/4/16 9:12 PM:

Attached full log output. Fresh build of cassandra-3.X; fresh install, fresh keyspace (SimpleStrategy, RF 1).
1) built/installed 3.10-SNAPSHOT from git branch cassandra-3.X
2) created keyspace (SimpleStrategy, RF 1)
3) created table: (simplified version below, many more valX columns present)
{quote}
CREATE TABLE mytable (
    id1 text,
    id2 text,
    id3 date,
    id4 timestamp,
    id5 text,
    val1 text,
    val2 text,
    val3 text,
    task_id text,
    document_nbr text,
    val5 text,
    PRIMARY KEY ((id1, id2), id3, id4, id5)
) WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id5 ASC)
{quote}
4) created materialized view:
{quote}
CREATE MATERIALIZED VIEW mytable_by_task_id AS
    SELECT * FROM mytable
    WHERE id1 IS NOT NULL AND id2 IS NOT NULL AND id3 IS NOT NULL AND id4 IS NOT NULL AND id5 IS NOT NULL AND task_id IS NOT NULL
    PRIMARY KEY (task_id, id3, id4, id1, id2, id5)
    WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id1 ASC, id2 ASC, id5 ASC)
{quote}
5) inserted 27 million "rows" (i.e., unique values for id5)
6) create index attempt
{quote}
create custom index idx_ar_document_nbr on test_table(document_nbr) using 'org.apache.cassandra.index.sasi.SASIIndex';
{quote}
7) no error in cqlsh, logged errors attached.

Beginning to suspect CASSANDRA-11990 ... but don't have enough internals-knowledge to do much more than guess.

was (Author: voytek.jarnot):
Attached full log output. Fresh build of cassandra-3.X; fresh install, fresh keyspace (SimpleStrategy, RF 1).
1) built/installed 3.10-SNAPSHOT from git branch cassandra-3.X
2) created keyspace (SimpleStrategy, RF 1)
3) created table: (simplified version below, many more valX columns present)
CREATE TABLE mytable (
    id1 text,
    id2 text,
    id3 date,
    id4 timestamp,
    id5 text,
    val1 text,
    val2 text,
    val3 text,
    task_id text,
    document_nbr text,
    val5 text,
    PRIMARY KEY ((id1, id2), id3, id4, id5)
) WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id5 ASC)
4) created materialized view:
CREATE MATERIALIZED VIEW mytable_by_task_id AS
    SELECT * FROM mytable
    WHERE id1 IS NOT NULL AND id2 IS NOT NULL AND id3 IS NOT NULL AND id4 IS NOT NULL AND id5 IS NOT NULL AND task_id IS NOT NULL
    PRIMARY KEY (task_id, id3, id4, id1, id2, id5)
    WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id1 ASC, id2 ASC, id5 ASC)
5) inserted 27 million "rows" (i.e., unique values for id5)
6) create index attempt
create custom index idx_ar_document_nbr on test_table(document_nbr) using 'org.apache.cassandra.index.sasi.SASIIndex';
7) no error in cqlsh, logged errors attached.

Beginning to suspect CASSANDRA-11990 ... but don't have enough internals-knowledge to do much more than guess.

> SASI index throwing AssertionError on index creation
>
>                 Key: CASSANDRA-12877
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12877
>             Project: Cassandra
>          Issue Type: Bug
>          Components: sasi
>         Environment: 3.9 and 3.10 tested on both linux and osx
>            Reporter: Voytek Jarnot
>         Attachments: idx-stacktrace-03-nov-2016.txt, idx-stacktrace-04-nov-2016.txt
>
>
> Possibly a 3.10 regression?
> I built and installed a 3.10 snapshot (built 03-Nov-2016) to get around CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me back when using 3.9. Edit to add: 3 node cluster, replication factor of 2.
> Would like to state up front that I can't duplicate this with a lightweight throwaway test, which is frustrating, but it keeps hitting me on our dev cluster.
> It may require a certain amount of data present (or perhaps a high number of nulls in the indexed column) - never had any luck duplicating with the table shown below.
> Table roughly resembles the following, with many more 'valx' columns:
> CREATE TABLE idx_test_table (
>     id1 text,
>     id2 text,
>     id3 text,
>     id4 text,
>     val1 text,
>     val2 text,
>     PRIMARY KEY ((id1, id2), id3, id4)
> ) WITH CLUSTERING ORDER BY (id3 DESC, id4 ASC);
> CREATE CUSTOM INDEX idx_test_index ON idx_test_table (val2) USING 'org.apache.cassandra.index.sasi.SASIIndex';
> The error below occurs in 3.10, but not in 3.9; it occurs whether I insert a bunch of dev data and then create the index, or whether I create the index and then insert a bunch of test data.
> {quote}
> INFO [MemtableFlushWriter:5] 2016-11-03 21:00:19,416 PerSSTableIndexWriter.java:284 - Scheduling index flush to /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
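
For anyone trying to reproduce the data-loading step from the comment above, here is a minimal sketch of a synthetic data generator. It is not the reporter's tooling. Everything beyond the column names is an assumption made for illustration: the helper names (`generate_rows`, `write_csv`), the value formats, the number of partitions, and the fraction of rows left without a `document_nbr` (chosen to mimic the "high number of nulls in the indexed column" hypothesis). The many `valX` columns are omitted.

```python
import csv
import random
from datetime import datetime, timedelta

# Subset of the columns from the simplified table in the report.
COLUMNS = ["id1", "id2", "id3", "id4", "id5", "task_id", "document_nbr"]


def generate_rows(n, null_fraction=0.9, partitions=10, seed=0):
    """Yield n synthetic rows with a unique id5 per row.

    null_fraction is the approximate share of rows whose document_nbr
    is left empty (hypothetical knob, not from the report).
    """
    rng = random.Random(seed)
    start = datetime(2016, 11, 1)
    for i in range(n):
        ts = start + timedelta(seconds=i)
        has_doc = rng.random() >= null_fraction
        yield {
            "id1": f"a{i % partitions}",       # spread rows over a few partitions
            "id2": "b",
            "id3": ts.strftime("%Y-%m-%d"),    # date column
            "id4": ts.strftime("%Y-%m-%d %H:%M:%S"),  # timestamp column
            "id5": f"row-{i}",                 # unique per row
            "task_id": f"task-{i % 1000}",
            "document_nbr": f"doc-{rng.randrange(100)}" if has_doc else "",
        }


def write_csv(rows, out):
    """Write rows as CSV with a header line to the file-like object out."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
```

Calling `write_csv(generate_rows(1000), sys.stdout)` (after `import sys`) prints a small sample. Empty `document_nbr` fields are intended to load as nulls, which is my understanding of how cqlsh `COPY ... FROM` treats empty strings by default; verify that against your cqlsh version. Whether this shape of data actually triggers the assertion is untested.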
[jira] [Comment Edited] (CASSANDRA-12877) SASI index throwing AssertionError on index creation
[ https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635127#comment-15635127 ]

Voytek Jarnot edited comment on CASSANDRA-12877 at 11/4/16 4:04 AM:

Attached (slightly sanitized) result of a failed attempt to create a SASI index as described but on my localhost 1-machine cluster. Full series of stacktraces as well as the "Update table ..." output, giving the details of my setup.

Perhaps worth mentioning: the table has ~27 million values for the final primary key column.

was (Author: voytek.jarnot):
Attached (slightly sanitized) result of a failed attempt to create a SASI index as described but on my localhost 1-machine cluster. Full series of stacktraces as well as the "Update table ..." output, giving the details of my setup.

> SASI index throwing AssertionError on index creation
>
>                 Key: CASSANDRA-12877
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12877
>             Project: Cassandra
>          Issue Type: Bug
>          Components: sasi
>         Environment: 3.9 and 3.10 tested on both linux and osx
>            Reporter: Voytek Jarnot
>         Attachments: idx-stacktrace-03-nov-2016.txt
>
>
> Possibly a 3.10 regression?
> I built and installed a 3.10 snapshot (built 03-Nov-2016) to get around CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me back when using 3.9. Edit to add: 3 node cluster, replication factor of 2.
> Would like to state up front that I can't duplicate this with a lightweight throwaway test, which is frustrating, but it keeps hitting me on our dev cluster. It may require a certain amount of data present (or perhaps a high number of nulls in the indexed column) - never had any luck duplicating with the table shown below.
> Table roughly resembles the following, with many more 'valx' columns:
> CREATE TABLE idx_test_table (
>     id1 text,
>     id2 text,
>     id3 text,
>     id4 text,
>     val1 text,
>     val2 text,
>     PRIMARY KEY ((id1, id2), id3, id4)
> ) WITH CLUSTERING ORDER BY (id3 DESC, id4 ASC);
> CREATE CUSTOM INDEX idx_test_index ON idx_test_table (val2) USING 'org.apache.cassandra.index.sasi.SASIIndex';
> The error below occurs in 3.10, but not in 3.9; it occurs whether I insert a bunch of dev data and then create the index, or whether I create the index and then insert a bunch of test data.
> {quote}
> INFO [MemtableFlushWriter:5] 2016-11-03 21:00:19,416 PerSSTableIndexWriter.java:284 - Scheduling index flush to /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
> INFO [SASI-Memtable:1] 2016-11-03 21:00:19,450 PerSSTableIndexWriter.java:335 - Index flush to /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db took 33 ms.
> ERROR [SASI-Memtable:1] 2016-11-03 21:00:19,454 CassandraDaemon.java:229 - Exception in thread Thread[SASI-Memtable:1,5,main]
> java.lang.AssertionError: cannot have more than 8 overflow collisions per leaf, but had: 25
>     at org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>     at org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>     at org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>     at org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>     at org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>     at org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>     at org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.flush(OnDiskIndexBuilder.java:446) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>     at org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.add(OnDiskIndexBuilder.java:433) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>     at org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.addTerm(OnDiskIndexBuilder.java:207) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
> at >