[ https://issues.apache.org/jira/browse/CASSANDRA-10681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15011664#comment-15011664 ]
Pavel Yaskevich edited comment on CASSANDRA-10681 at 11/18/15 8:17 PM:
-----------------------------------------------------------------------

Let me clarify a bit: it serializes a compilation of all of the indexes, not building them. As I have already mentioned, indexes are built separately, but that is just the nature of the Index API: since PerRowIndex is no more, we have to build all of the indexes independently. This requires passing through the data multiple times, but I don't think it's necessarily a problem if we lift the assumption that indexes are built in one and only one way (by merging sstables together and feeding the index collated rows) and let API implementers decide how to build indexes based on the set of sstables.

When an SSTable is added via streaming, for example, CASSANDRA-10678 would take care of creating indexes for it in the case of SASI, and the Indexer API would do so in the case of standard indexes, so I don't really see a problem there. In the case of side loading, a new compaction task per index is going to be triggered to build such indexes if necessary, but we can't really get around that.

> make index building pluggable via IndexBuildTask
> ------------------------------------------------
>
>                 Key: CASSANDRA-10681
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10681
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local Write-Read Paths
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>            Priority: Minor
>              Labels: sasi
>             Fix For: 3.x
>
>
> Currently index building assumes one and only one way to build all of the
> indexes: through SecondaryIndexBuilder, which merges all of the sstables
> together, collates columns, etc. This works fine for built-in indexes but not
> for SASI, since it attaches to every SSTable individually. We need an
> "IndexBuildTask" interface (based on CompactionInfo.Holder) to be returned
> from Index on demand, to give SI interface implementers the power to decide
> how the build should work. This might be less efficient for CassandraIndex,
> since it effectively means that collation will have to be done multiple times
> on the same data, but it is nevertheless a good compromise for a clean
> interface to the outside world.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
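
The proposal in the issue description (an index returns its own build task instead of relying on one shared builder) can be illustrated with a rough sketch. This is not Cassandra code: only the names IndexBuildTask and Index come from the ticket, while SSTableStub, CollatingIndex, PerSSTableIndex, and the getBuildTask signature are hypothetical, and the CompactionInfo.Holder base mentioned in the ticket is left out to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical stand-in for an sstable: a name plus its row keys. */
final class SSTableStub {
    final String name;
    final List<String> rowKeys;

    SSTableStub(String name, List<String> rowKeys) {
        this.name = name;
        this.rowKeys = rowKeys;
    }
}

/** Hypothetical task type; the ticket bases it on CompactionInfo.Holder, omitted here. */
interface IndexBuildTask {
    /** Runs the build and returns a log of the units of work performed. */
    List<String> build();
}

/** Hypothetical slice of the Index interface: each index supplies its own build task. */
interface Index {
    IndexBuildTask getBuildTask(Collection<SSTableStub> sstables);
}

/** Built-in style: merge every sstable and feed the index collated rows in one unit of work. */
final class CollatingIndex implements Index {
    public IndexBuildTask getBuildTask(Collection<SSTableStub> sstables) {
        return () -> {
            // A sorted set stands in for collation: keys are merged and de-duplicated.
            TreeSet<String> merged = new TreeSet<>();
            for (SSTableStub sstable : sstables)
                merged.addAll(sstable.rowKeys);
            List<String> work = new ArrayList<>();
            work.add("merged:" + String.join(",", merged));
            return work;
        };
    }
}

/** SASI style: one independent index build per sstable, with no merging. */
final class PerSSTableIndex implements Index {
    public IndexBuildTask getBuildTask(Collection<SSTableStub> sstables) {
        return () -> {
            List<String> work = new ArrayList<>();
            for (SSTableStub sstable : sstables)
                work.add(sstable.name + ":" + String.join(",", sstable.rowKeys));
            return work;
        };
    }
}
```

In this sketch the caller only asks each index for a task and runs it. With several collating indexes, the merge would be repeated once per index, which is the cost the description accepts in exchange for a cleaner interface.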