[ 
https://issues.apache.org/jira/browse/CASSANDRA-10681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15011664#comment-15011664
 ] 

Pavel Yaskevich edited comment on CASSANDRA-10681 at 11/18/15 8:17 PM:
-----------------------------------------------------------------------

Let me clarify a bit: it serializes a compilation of all of the indexes, not
the building of them. As I have already mentioned, indexes are built
separately, but that is just the nature of the Index API: since PerRowIndex is
no more, we have to build all of the indexes independently. This requires
passing through the data multiple times, but I don't think that is necessarily
a problem if we lift the assumption that indexes are built in one and only one
way (by merging sstables together and feeding the index collated rows) and let
API implementers decide how to build indexes based on the set of sstables.
When an SSTable is added via streaming, for example, CASSANDRA-10678 would take
care of creating indexes for it in the case of SASI, and the Indexer API in the
case of standard indexes, so I don't really see a problem there. In the case of
side loading, a new compaction task per index is going to be triggered to build
such indexes if necessary, but we can't really get around that.


> make index building pluggable via IndexBuildTask
> ------------------------------------------------
>
>                 Key: CASSANDRA-10681
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10681
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local Write-Read Paths
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>            Priority: Minor
>              Labels: sasi
>             Fix For: 3.x
>
>
> Currently index building assumes one and only one way to build all of the
> indexes (through SecondaryIndexBuilder), which merges all of the sstables
> together, collates columns, etc. This works fine for built-in indexes but not
> for SASI, since it attaches to every SSTable individually. We need an
> "IndexBuildTask" interface (based on CompactionInfo.Holder) to be returned
> from Index on demand, to give SI interface implementers the power to decide
> how the build should work. This might be less efficient for CassandraIndex,
> since it effectively means that collation will have to be done multiple times
> on the same data, but it is nevertheless a good compromise for a clean
> interface to the outside world.
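
To make the trade-off in the description concrete, here is a minimal sketch of the idea, not the actual patch. Every type in it (SSTable, IndexBuildTask, Index, MergingIndex, PerSSTableIndex, buildAll) is a hypothetical stand-in written for illustration; none of them is Cassandra's real class, and the real task would presumably extend CompactionInfo.Holder, which is omitted here. It only shows each index returning its own build task, so that one index can merge and collate while another works per sstable, at the cost of one pass over the data per index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch: none of these types are Cassandra's real classes.
final class IndexBuildSketch {

    /** Stand-in for an sstable: a name plus the row keys it holds. */
    static final class SSTable {
        final String name;
        final List<String> rows;

        SSTable(String name, String... rows) {
            this.name = name;
            this.rows = Arrays.asList(rows);
        }
    }

    /** Unit of build work handed back by an index (assumed shape). */
    interface IndexBuildTask {
        void build();
        long rowsRead();
    }

    /** An index returns its own build task for a given set of sstables. */
    interface Index {
        IndexBuildTask buildTaskFor(Collection<SSTable> sstables);
    }

    /** Built-in style: merge all sstables, then index the collated rows. */
    static final class MergingIndex implements Index {
        final TreeSet<String> entries = new TreeSet<>();

        public IndexBuildTask buildTaskFor(final Collection<SSTable> sstables) {
            return new IndexBuildTask() {
                private long read;

                public void build() {
                    TreeSet<String> collated = new TreeSet<>();
                    for (SSTable table : sstables) {
                        collated.addAll(table.rows); // duplicates collapse here
                        read += table.rows.size();
                    }
                    entries.addAll(collated);
                }

                public long rowsRead() { return read; }
            };
        }
    }

    /** SASI-like style: one index segment per sstable, no merging. */
    static final class PerSSTableIndex implements Index {
        final Map<String, List<String>> segments = new LinkedHashMap<>();

        public IndexBuildTask buildTaskFor(final Collection<SSTable> sstables) {
            return new IndexBuildTask() {
                private long read;

                public void build() {
                    for (SSTable table : sstables) {
                        segments.put(table.name, new ArrayList<>(table.rows));
                        read += table.rows.size();
                    }
                }

                public long rowsRead() { return read; }
            };
        }
    }

    /** Runs every index's task; each index makes its own pass over the data. */
    static long buildAll(List<Index> indexes, Collection<SSTable> sstables) {
        long totalRead = 0;
        for (Index index : indexes) {
            IndexBuildTask task = index.buildTaskFor(sstables);
            task.build();
            totalRead += task.rowsRead();
        }
        return totalRead;
    }

    public static void main(String[] args) {
        List<SSTable> sstables = Arrays.asList(
            new SSTable("a", "k1", "k2"),
            new SSTable("b", "k2", "k3"));
        MergingIndex merging = new MergingIndex();
        PerSSTableIndex perTable = new PerSSTableIndex();
        long read = buildAll(Arrays.<Index>asList(merging, perTable), sstables);

        System.out.println(merging.entries);            // [k1, k2, k3]
        System.out.println(perTable.segments.keySet()); // [a, b]
        System.out.println(read);                       // 8
    }
}
```

The final count illustrates the cost mentioned above: the two sstables hold four rows in total, and two indexes building independently read eight.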



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
