[ https://issues.apache.org/jira/browse/CASSANDRA-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080008#comment-13080008 ]
Ryan King commented on CASSANDRA-2915:
--------------------------------------

Regarding realtime search, hasn't our (Twitter's) realtime search branch been merged into Lucene trunk? Whenever that's available, we should get real realtime results.

> Lucene based Secondary Indexes
> ------------------------------
>
>                 Key: CASSANDRA-2915
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2915
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: T Jake Luciani
>              Labels: secondary_index
>             Fix For: 1.0
>
>
> Secondary indexes (of type KEYS) suffer from a number of limitations in their
> current form:
>    - Multiple IndexClauses only work when there is a subset of rows under the
>      highest clause
>    - One new column family is created per index; this means 10 new CFs for 10
>      secondary indexes
> This ticket will use the Lucene library to implement secondary indexes as one
> index per CF, and utilize the Lucene query engine to handle multiple index
> clauses. Also, by using Lucene we get a highly optimized file format.
> There are a few parallels we can draw between Cassandra and Lucene.
> Lucene indexes segments in memory, then flushes them to disk, so we can sync
> our memtable flushes to Lucene flushes. Lucene also has optimize(), which
> correlates to our compaction process, so these can be synced as well.
> We will also need to correlate column validators to Lucene tokenizers, so the
> data can be stored properly. The big win is that once this is done, we can
> perform complex queries within a column, like wildcard searches.
> The downside of this approach is that we will need to read before write, since
> documents in Lucene are written as complete documents. For random workloads
> with lots of indexed columns, this means we need to read the document from
> the index, update it, and write it back.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira