Hi again,

We noticed in the contribution guidelines that every PR needs a corresponding 
JIRA issue. Should we open one now for the eventual PR covering our work on 
implementing the dataset on top of Ceph's RADOS?

Also, on a related note, we would like to mock the RADOS client so that we can 
exercise it in CI tests. Would it be OK to include gmock as a dependency?
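
As a rough sketch of what mocking the client could look like: the code under 
test would depend on a small interface rather than on librados directly. All 
names below (RadosInterface, FakeRados, ObjectSize) are hypothetical and exist 
in neither Arrow nor librados. The fake is hand-written so the snippet needs 
only the standard library; with gmock it would be replaced by a generated mock.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical interface wrapping the RADOS operations the dataset needs.
// These names are illustrative only; they are not Arrow or librados API.
class RadosInterface {
 public:
  virtual ~RadosInterface() = default;
  // Reads object `oid` into `out`; returns 0 on success, negative on error.
  virtual int Read(const std::string& oid, std::string* out) = 0;
};

// Hand-written in-memory fake. With gmock this class would instead declare
//   MOCK_METHOD(int, Read, (const std::string&, std::string*), (override));
class FakeRados : public RadosInterface {
 public:
  void Put(const std::string& oid, std::string data) {
    objects_[oid] = std::move(data);
  }

  int Read(const std::string& oid, std::string* out) override {
    auto it = objects_.find(oid);
    if (it == objects_.end()) return -2;  // mimics -ENOENT
    *out = it->second;
    return 0;
  }

 private:
  std::map<std::string, std::string> objects_;
};

// Code under test depends only on the interface, so CI needs no Ceph cluster.
inline int64_t ObjectSize(RadosInterface& rados, const std::string& oid) {
  std::string buf;
  const int rc = rados.Read(oid, &buf);
  return rc < 0 ? rc : static_cast<int64_t>(buf.size());
}
```

A test would construct a FakeRados, call Put() to seed an object, and pass the 
fake to ObjectSize() in place of a real client.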

Thanks!

On 2020/09/02 22:05:51, Ivo Jimenez <ivo.jime...@gmail.com> wrote: 
> Hi Ben,
> 
> 
> > > Our main concern is that this new arrow::dataset::RadosFormat class will
> > > be deriving from the arrow::dataset::FileFormat class, which seems to
> > > raise a conceptual mismatch as there isn’t really a RADOS format
> >
> > IIUC RADOS doesn't interact with a filesystem directly, so RadosFileFormat
> > would indeed be a conceptually problematic point of extension. If a RADOS
> > file system is not viable then I think the ideal approach would be to
> > directly implement the Fragment [1] and Dataset [2] interfaces, forgoing a
> > FileFormat implementation altogether.
> > Unfortunately the only example we have of this approach is
> > InMemoryFragment, which simply wraps a vector of record batches.
> >
> 
> This is what we will go with, as this seems to be the quickest way for us
> to have a PoC and start experimenting with this.
> 
> Thanks a lot for the invaluable feedback! 🙏
> 
