alamb commented on issue #3142:
URL: https://github.com/apache/arrow-rs/issues/3142#issuecomment-1322692312

   > Fortunately however, append-only and immutable are not necessarily at 
odds, as long as one takes immutable RecordBatch snapshots at a point-in-time 
of the append-only version. 
   
   I think the general pattern of
   * append data
   * snapshot (by copying into a read-only `RecordBatch`)
   * keep appending data
   
   makes sense with Arrow. In fact, this is what we do in IOx on the write path.
   
   This pattern can be implemented with the various `*Builder`s, such as 
https://docs.rs/arrow/27.0.0/arrow/array/type.Int64Builder.html
   
   So something like:
   
   ```rust
   use arrow::array::Int64Builder;
   
   let mut builder = Int64Builder::new();
   builder.append_value(1);
   
   // get an array to use in datafusion, etc
   let array = builder.finish();
   
   // make a new builder and start building the second batch
   builder = Int64Builder::new();
   builder.append_value(2);
   builder.append_value(3);
   ```
   
   
   
   
   I think what @tustvold is suggesting with a "non-consuming finish method to 
the builders" is that you could do something like:
   
   ```rust
   builder.append_value(1);
   
   // get an array to use in datafusion, etc (by copying the underlying values)
   // note: `snapshot` is the proposed (hypothetical) non-consuming method
   let array = builder.snapshot();
   
   // existing builder can still be used to append
   builder.append_value(2);
   builder.append_value(3);
   ```
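   
   To make the copy-on-snapshot idea concrete without depending on arrow 
   itself, here is a minimal std-only sketch. `SnapshotBuilder`, its 
   `snapshot` method, and `snapshot_demo` are hypothetical names used only for 
   illustration, with a plain `Vec<i64>` standing in for the builder's buffer; 
   this is not the arrow-rs API.
   
   ```rust
   use std::sync::Arc;
   
   /// Hypothetical stand-in for an arrow builder; not part of arrow-rs.
   #[derive(Default)]
   struct SnapshotBuilder {
       values: Vec<i64>,
   }
   
   impl SnapshotBuilder {
       fn append_value(&mut self, v: i64) {
           self.values.push(v);
       }
   
       /// Non-consuming: copies the current values into an immutable,
       /// reference-counted slice. Takes `&self`, so the builder stays usable.
       fn snapshot(&self) -> Arc<[i64]> {
           Arc::from(self.values.as_slice())
       }
   }
   
   /// Append, snapshot, keep appending, snapshot again.
   fn snapshot_demo() -> (Arc<[i64]>, Arc<[i64]>) {
       let mut builder = SnapshotBuilder::default();
       builder.append_value(1);
       let first = builder.snapshot();
   
       // The earlier snapshot is an independent copy, so these appends
       // do not affect it.
       builder.append_value(2);
       builder.append_value(3);
       let second = builder.snapshot();
   
       (first, second)
   }
   ```
   
   Because each snapshot owns its own copy, readers never alias the builder's 
   buffer; the cost is one copy of all values per snapshot.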
   
   
   I don't fully follow the proposal in 
https://gist.github.com/avantgardnerio/48d977ea6bd28c790cfb6df09250336d 
   
   As soon as you have this function:
   
   ```rust
       pub fn as_slice(&self) -> RecordBatch {
           self.record_batch.slice(0, self.len())
       }
   ```
   
   This means that multiple locations may now be able to read that data, so 
trying to update it as well will likely be an exercise in frustration: you 
would have to fight the Rust compiler to convince it that you have somehow 
guaranteed that the memory remains valid and that concurrent reads and 
modification are OK.
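   
   A rough std-only illustration of that friction, using `Arc<Vec<i64>>` as an 
   assumed stand-in for a shared buffer (the function name 
   `shared_buffer_demo` is made up for this sketch): as long as a second 
   handle to the data exists, safe Rust refuses to hand out mutable access.
   
   ```rust
   use std::sync::Arc;
   
   /// Returns (writable while shared, writable once unique again).
   fn shared_buffer_demo() -> (bool, bool) {
       let mut buffer: Arc<Vec<i64>> = Arc::new(vec![1, 2, 3]);
   
       // A reader takes a second handle to the same allocation.
       let reader = Arc::clone(&buffer);
   
       // While the reader is alive, `Arc::get_mut` returns `None`.
       let writable_while_shared = Arc::get_mut(&mut buffer).is_some();
   
       drop(reader);
   
       // Once the buffer is uniquely owned again, mutation is allowed.
       let writable_when_unique = match Arc::get_mut(&mut buffer) {
           Some(values) => {
               values.push(4);
               true
           }
           None => false,
       };
   
       (writable_while_shared, writable_when_unique)
   }
   ```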
   
   Maybe we could start with the "make a snapshot-based copy" approach and, if 
the copying turns out to be too expensive, figure out a different approach.
   

