They are saying that HBase uses Apache Parquet, which, as I gather, is compatible 
with Arrow. I am just now spinning up on all this, so bear with me. As I 
understand it, Arrow is an in-memory format and Parquet is an on-disk file format.

I have a code base that is built around Accumulo. My code does a lot in memory 
already. I like what Arrow has to offer from a polyglot standpoint, but my 
data sets are, well, what people call "big data", hence Accumulo. If 
HBase can handle the Arrow/Parquet structure, why not Accumulo?

Good to be talking
________________________________
From: Emilio Lahr-Vivaz <[email protected]>
Sent: Wednesday, February 17, 2021 4:09 PM
To: [email protected] <[email protected]>
Subject: Re: [External] Re: Accumulo and Arrow

I believe that was theoretical - I don't think there has been any actual 
integration at this point. But I'd be happy to be proven wrong :)

Thanks,

Emilio

On 2/17/21 12:17 PM, Roberts, Geoffry [USA] wrote:
This is where I saw a reference to HBase: 
<https://blog.cloudera.com/introducing-apache-arrow-a-fast-interoperable-in-memory-columnar-data-structure-standard/>.


________________________________
From: Emilio Lahr-Vivaz <[email protected]>
Sent: Wednesday, February 17, 2021 11:04 AM
To: [email protected] <[email protected]>
Subject: [External] Re: Accumulo and Arrow

Hello,

Do you have a link that describes the integration between HBase and Arrow? I 
didn't find anything except some theoretical discussions. My understanding is 
that Arrow is meant for in-memory representations, and there is no plan to, e.g., 
replace HFiles or RFiles with Arrow files in HBase/Accumulo.

I'm interested in the intersection of the two, though. I'm a committer on 
GeoMesa, and we provide a way to export Arrow files out of both Accumulo and 
HBase using custom iterators/coprocessors. GeoMesa is focused on spatial data 
though, so it may not fit with your use case.

Thanks,

Emilio

On 2/17/21 8:13 AM, Roberts, Geoffry [USA] wrote:
All,

I have been looking into Apache Arrow. I see that it supports a connection to 
HBase. I Googled but found nothing wrt Accumulo. Is there, or is there planned, 
support for Arrow/Accumulo?

Thanks

