[jira] [Updated] (CASSANDRA-5741) Provide a way to disable automatic index rebuilds during bulk loading

2018-11-18 Thread C. Scott Andreas (JIRA)


 [ https://issues.apache.org/jira/browse/CASSANDRA-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

C. Scott Andreas updated CASSANDRA-5741:

Component/s: Secondary Indexes

> Provide a way to disable automatic index rebuilds during bulk loading
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-5741
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5741
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Secondary Indexes
>    Affects Versions: 1.2.6
>            Reporter: Jim Zamata
>            Priority: Major
>
> When using BulkLoadOutputFormat, the actual streaming of the SSTables into
> Cassandra is fast, but the index rebuilds can take several minutes. Cassandra
> does not send the response until all of the rebuilds for a streaming
> session have completed. Because the record writer streams the files in its
> close method, the tasks appear to hang at 100%. If the rebuilding
> process takes too long, the tasks can actually time out.
>
> Many SQL databases provide bulk insert utilities that disable index updates
> so that large amounts of data can be added quickly. The functionality
> requested here would serve a similar purpose.
>
> An alternative might be an option that allows the session to return once
> the SSTables have been successfully imported, without waiting for the index
> builds to complete. However, I have noticed heavy CPU loads during the index
> rebuilds, so bulk load performance might be better if this step could be
> deferred until after all of the data is loaded.
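
The trade-off described above can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyNode, stream_session, defer_index and rebuild_deferred are hypothetical names invented for illustration and are not part of Cassandra or BulkLoadOutputFormat. It counts how much indexing work a streaming session must wait for before it is acknowledged, with and without deferral.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Toy stand-in for a node receiving streamed rows (NOT Cassandra's API)."""
    rows: list = field(default_factory=list)
    index: dict = field(default_factory=dict)    # indexed value -> list of row keys
    pending: list = field(default_factory=list)  # rows imported but not yet indexed

    def _index_rows(self, rows):
        # Add one index entry per row and report how many entries were written.
        for key, value in rows:
            self.index.setdefault(value, []).append(key)
        return len(rows)

    def stream_session(self, rows, defer_index=False):
        """Import rows; return the indexing work done before the session is acknowledged."""
        self.rows.extend(rows)
        if defer_index:
            # Hypothetical option: acknowledge immediately, index later.
            self.pending.extend(rows)
            return 0
        return self._index_rows(rows)

    def rebuild_deferred(self):
        """Index every row that was imported with defer_index=True."""
        done = self._index_rows(self.pending)
        self.pending.clear()
        return done


# Three streaming sessions of four rows each, as (key, indexed value) pairs.
sessions = [[(f"k{s}-{i}", i % 3) for i in range(4)] for s in range(3)]

eager = ToyNode()
eager_work = [eager.stream_session(rows) for rows in sessions]

deferred = ToyNode()
deferred_work = [deferred.stream_session(rows, defer_index=True) for rows in sessions]
deferred_total = deferred.rebuild_deferred()
```

In this model both nodes end up with the same index. The difference is only when the indexing work happens: the eager node does it inside every session, while the deferred node does it once after all sessions have returned.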







[jira] [Updated] (CASSANDRA-5741) Provide a way to disable automatic index rebuilds during bulk loading

2013-07-26 Thread Jim Zamata (JIRA)

 [ https://issues.apache.org/jira/browse/CASSANDRA-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Zamata updated CASSANDRA-5741:

Component/s: (was: Hadoop)

 Provide a way to disable automatic index rebuilds during bulk loading
 ---------------------------------------------------------------------

                 Key: CASSANDRA-5741
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5741
             Project: Cassandra
          Issue Type: Improvement
    Affects Versions: 1.2.6
            Reporter: Jim Zamata

