[jira] [Issue Comment Deleted] (CASSANDRA-12877) SASI index throwing AssertionError on index creation

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Comment: was deleted

(was: Attached full log output.  Fresh build of cassandra-3.X; fresh install, 
fresh keyspace (SimpleStrategy, RF 1).

1) built/installed 3.10-SNAPSHOT from git branch cassandra-3.X
2) created keyspace (SimpleStrategy, RF 1)
3) created table: (simplified version below, many more valX columns present)
{quote}
CREATE TABLE mytable (
id1 text,
id2 text,
id3 date,
id4 timestamp,
id5 text,
val1 text,
val2 text,
val3 text,
task_id text,
document_nbr text,
val5 text,
PRIMARY KEY ((id1, id2), id3, id4, id5)
) WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id5 ASC)
{quote}

4) created materialized view:
{quote}
CREATE MATERIALIZED VIEW mytable_by_task_id AS
SELECT *
FROM mytable
WHERE id1 IS NOT NULL AND id2 IS NOT NULL AND id3 IS NOT NULL
  AND id4 IS NOT NULL AND id5 IS NOT NULL AND task_id IS NOT NULL
PRIMARY KEY (task_id, id3, id4, id1, id2, id5)
WITH CLUSTERING ORDER BY (id3 DESC, id4 DESC, id1 ASC, id2 ASC, id5 ASC)
{quote}
5) inserted 27 million "rows" (i.e., unique values for id5)
6) create index attempt
{quote}
create custom index idx_ar_document_nbr on mytable(document_nbr) using 
'org.apache.cassandra.index.sasi.SASIIndex';
{quote}
7) no error in cqlsh, logged errors attached.

Beginning to suspect CASSANDRA-11990 ... but don't have enough 
internals-knowledge to do much more than guess.)
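
For anyone trying to script step 5, the following is a minimal sketch of a generator for the INSERT statements, written against the simplified `mytable` schema above. The partition values, the `null_every` parameter, and the value patterns are all hypothetical and chosen only for illustration; nothing here is known to trigger the AssertionError. Leaving `document_nbr` out of a statement keeps that column null for the row, which is one of the conditions suspected in the report.

```python
def insert_statements(n_rows, null_every=3, table="mytable"):
    """Yield CQL INSERT statements for the simplified schema in this comment.

    Assumptions (not taken from the report): every row goes into a single
    partition, id5 is the only value that is unique per row, and every
    `null_every`-th row leaves document_nbr unset (i.e. null).
    """
    for i in range(n_rows):
        values = {
            "id1": "'part-a'",                    # hypothetical partition key value
            "id2": "'part-b'",                    # hypothetical partition key value
            "id3": "'2016-11-03'",                # CQL date literal
            "id4": "'2016-11-03 21:00:00+0000'",  # CQL timestamp literal
            "id5": f"'row-{i:08d}'",              # unique per row
            "task_id": f"'task-{i % 100}'",
        }
        # Omitting the column leaves it null without writing an explicit null.
        if null_every <= 0 or i % null_every != 0:
            values["document_nbr"] = f"'doc-{i % 10}'"
        cols = ", ".join(values)
        vals = ", ".join(values.values())
        yield f"INSERT INTO {table} ({cols}) VALUES ({vals});"
```

The output can be written to a file and loaded with `cqlsh -f`, after a `USE` statement for the target keyspace.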

> SASI index throwing AssertionError on index creation
> 
>
> Key: CASSANDRA-12877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12877
> Project: Cassandra
> Issue Type: Bug
> Components: sasi
> Environment: 3.9 and 3.10 tested on both linux and osx
> Reporter: Voytek Jarnot
>
> Possibly a 3.10 regression?
> I built and installed a 3.10 snapshot (built 03-Nov-2016) to get around 
> CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me 
> back when using 3.9.
> Would like to state up front that I can't duplicate this with a lightweight 
> throwaway test, which is frustrating, but it keeps hitting me on our dev 
> cluster.  It may require a certain amount of data present (or perhaps a high 
> number of nulls in the indexed column) - never had any luck duplicating with 
> the table shown below.
> Table roughly resembles the following, with many more 'valx' columns:
> CREATE TABLE idx_test_table (
> id1 text,
> id2 text,
> id3 text,
> id4 text,
> val1 text,
> val2 text,
> PRIMARY KEY ((id1, id2), id3, id4)
> ) WITH CLUSTERING ORDER BY (id3 DESC, id4 ASC);
> CREATE CUSTOM INDEX idx_test_index ON idx_test_table (val2) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> The error below occurs in 3.10, but not in 3.9; it occurs whether I insert a 
> bunch of dev data and then create the index, or whether I create the index 
> and then insert a bunch of test data.
> {quote}
> INFO  [MemtableFlushWriter:5] 2016-11-03 21:00:19,416 PerSSTableIndexWriter.java:284 - Scheduling index flush to /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
> INFO  [SASI-Memtable:1] 2016-11-03 21:00:19,450 PerSSTableIndexWriter.java:335 - Index flush to /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db took 33 ms.
> ERROR [SASI-Memtable:1] 2016-11-03 21:00:19,454 CassandraDaemon.java:229 - Exception in thread Thread[SASI-Memtable:1,5,main]
> java.lang.AssertionError: cannot have more than 8 overflow collisions per leaf, but had: 25
>   at org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSH

[jira] [Issue Comment Deleted] (CASSANDRA-12877) SASI index throwing AssertionError on index creation

2016-11-04 Thread Voytek Jarnot (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Voytek Jarnot updated CASSANDRA-12877:
--
Comment: was deleted

(was: Attached the (slightly sanitized) result of a failed attempt to create a 
SASI index as described, but on my localhost single-node cluster.  It contains 
the full series of stack traces as well as the "Update table ..." output, which 
gives the details of my setup.

Perhaps worth mentioning: the table has ~27 million values for the final 
primary key column.)
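
When going through a long series of stack traces like the attached one, it can help to pull out just the collision counts. The following is a rough sketch that scans log text for the assertion message quoted in this issue and returns the limit and the observed count for each occurrence. The function name and the idea of reading the log as one string are assumptions made for illustration; the message wording is taken from the trace in the issue description.

```python
import re

# Matches the assertion text quoted in this issue. \s+ is used between words
# that may be separated by a wrapped line in a pasted or re-flowed log.
OVERFLOW_RE = re.compile(
    r"cannot have more than (\d+) overflow collisions per\s+leaf, but had: (\d+)"
)


def overflow_collisions(log_text):
    """Return a list of (limit, observed) pairs found in the given log text."""
    return [(int(m.group(1)), int(m.group(2))) for m in OVERFLOW_RE.finditer(log_text)]
```

Calling it on the contents of a log file, for example `overflow_collisions(open(path).read())` with `path` pointing at the node's log, gives one pair per logged assertion, which makes it easy to see whether the observed count varies between flushes.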

> SASI index throwing AssertionError on index creation
> 
>
> Key: CASSANDRA-12877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12877
> Project: Cassandra
> Issue Type: Bug
> Components: sasi
> Environment: 3.9 and 3.10 tested on both linux and osx
> Reporter: Voytek Jarnot
>
> Possibly a 3.10 regression?
> I built and installed a 3.10 snapshot (built 03-Nov-2016) to get around 
> CASSANDRA-11670, CASSANDRA-12689, and CASSANDRA-12223 which are holding me 
> back when using 3.9.
> Would like to state up front that I can't duplicate this with a lightweight 
> throwaway test, which is frustrating, but it keeps hitting me on our dev 
> cluster.  It may require a certain amount of data present (or perhaps a high 
> number of nulls in the indexed column) - never had any luck duplicating with 
> the table shown below.
> Table roughly resembles the following, with many more 'valx' columns:
> CREATE TABLE idx_test_table (
> id1 text,
> id2 text,
> id3 text,
> id4 text,
> val1 text,
> val2 text,
> PRIMARY KEY ((id1, id2), id3, id4)
> ) WITH CLUSTERING ORDER BY (id3 DESC, id4 ASC);
> CREATE CUSTOM INDEX idx_test_index ON idx_test_table (val2) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex';
> The error below occurs in 3.10, but not in 3.9; it occurs whether I insert a 
> bunch of dev data and then create the index, or whether I create the index 
> and then insert a bunch of test data.
> {quote}
> INFO  [MemtableFlushWriter:5] 2016-11-03 21:00:19,416 PerSSTableIndexWriter.java:284 - Scheduling index flush to /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db
> INFO  [SASI-Memtable:1] 2016-11-03 21:00:19,450 PerSSTableIndexWriter.java:335 - Index flush to /u01/cassandra-data/data/essatc1/audit_record-520c1dc0a1e411e691db0bd4b103bd15/mc-266-big-SI_idx_ar_document_nbr.db took 33 ms.
> ERROR [SASI-Memtable:1] 2016-11-03 21:00:19,454 CassandraDaemon.java:229 - Exception in thread Thread[SASI-Memtable:1,5,main]
> java.lang.AssertionError: cannot have more than 8 overflow collisions per leaf, but had: 25
>   at org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createOverflowEntry(AbstractTokenTreeBuilder.java:357) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.createEntry(AbstractTokenTreeBuilder.java:346) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.DynamicTokenTreeBuilder$DynamicLeaf.serializeData(DynamicTokenTreeBuilder.java:180) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder$Leaf.serialize(AbstractTokenTreeBuilder.java:306) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.AbstractTokenTreeBuilder.write(AbstractTokenTreeBuilder.java:90) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableDataBlock.flushAndClear(OnDiskIndexBuilder.java:629) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.flush(OnDiskIndexBuilder.java:446) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder$MutableLevel.add(OnDiskIndexBuilder.java:433) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.addTerm(OnDiskIndexBuilder.java:207) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:293) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:258) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.disk.OnDiskIndexBuilder.finish(OnDiskIndexBuilder.java:241) ~[apache-cassandra-3.10-SNAPSHOT.jar:3.10-SNAPSHOT]
>   at org.apache.cassandra.index.sasi.di