[ https://issues.apache.org/jira/browse/CASSANDRA-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Stupp resolved CASSANDRA-2915.
-------------------------------------
    Resolution: Won't Fix

CASSANDRA-10661 adds similar functionality. Resolving this as "Won't Fix"; it is superseded by SASI.

> Lucene based Secondary Indexes
> ------------------------------
>
>                 Key: CASSANDRA-2915
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2915
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: T Jake Luciani
>              Labels: secondary_index
>
> Secondary indexes (of type KEYS) suffer from a number of limitations in their
> current form:
>    - Multiple IndexClauses only work when there is a subset of rows under the
>      highest clause
>    - One new column family is created per index; this means 10 new CFs for 10
>      secondary indexes
>
> This ticket will use the Lucene library to implement secondary indexes as one
> index per CF, and utilize the Lucene query engine to handle multiple index
> clauses. Also, by using Lucene we get a highly optimized file format.
>
> There are a few parallels we can draw between Cassandra and Lucene.
> Lucene indexes segments in memory and then flushes them to disk, so we can sync
> our memtable flushes to Lucene flushes. Lucene also has optimize(), which
> correlates to our compaction process, so these can be synced as well.
>
> We will also need to correlate column validators to Lucene tokenizers, so the
> data can be stored properly. The big win is that once this is done we can
> perform complex queries within a column, such as wildcard searches.
>
> The downside of this approach is that we will need to read before write, since
> documents in Lucene are written as complete documents. For random workloads
> with lots of indexed columns, this means we need to read the document from
> the index, update it, and write it back.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
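
For illustration, the read-before-write cost described in the ticket can be sketched with a toy whole-document index. This is plain Python, not Lucene or Cassandra code: `WholeDocumentIndex`, `update_column`, and `search` are hypothetical names, and the model assumes only what the ticket states, namely one index per CF, documents written whole, and multiple clauses answered by that single index.

```python
# Toy model only: a plain-Python stand-in for a whole-document index.
# Nothing here is Lucene's or Cassandra's API; all names are hypothetical.


class WholeDocumentIndex:
    """Keeps one complete document per row key, as a document-oriented index would."""

    def __init__(self):
        self._docs = {}  # row key -> {indexed column: value}
        self.reads = 0   # number of times an update had to read the old document

    def update_column(self, row_key, column, value):
        # Read: fetch the existing document, since a partial write is not possible.
        self.reads += 1
        old_doc = self._docs.get(row_key, {})
        # Modify: merge the changed column into a copy of the old document.
        new_doc = dict(old_doc)
        new_doc[column] = value
        # Write: replace the whole document.
        self._docs[row_key] = new_doc

    def search(self, **clauses):
        # Return the sorted row keys whose document matches every column=value clause.
        return sorted(
            key
            for key, doc in self._docs.items()
            if all(doc.get(col) == val for col, val in clauses.items())
        )


index = WholeDocumentIndex()
index.update_column("user1", "state", "TX")
index.update_column("user1", "age", 30)
index.update_column("user2", "state", "TX")

# Two clauses are answered by the single per-CF index.
matches = index.search(state="TX", age=30)
```

Each single-column update pays for one read of the full document, which is the cost the ticket expects to hurt random workloads with many indexed columns.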