Re: Drill + Accumulo?

Ron Cecchini Thu, 20 Feb 2020 21:38:14 -0800

Thank you, Paul, for your thoughtfulness and taking the time to research and 
explain things in such detail.


And thank you, Charles, for your suggestion as well.

You've given me a lot to think about.  In the very near term, I'll probably 
take a look at Presto.  But I will circle back to Drill for other things.

Thanks again.

Ron

> On February 19, 2020 at 6:41 PM Paul Rogers <[email protected]> wrote:
> 
> 
> Hi Ron,
> 
> Given that I know next to nothing about Accumulo other than what I just 
> learned from a Google search, the answer appears to be no. The best approach 
> would be to write a connector that exploits Accumulo's new 2.0 AccumuloClient 
> API, perhaps along with the new Scan Executors. See [1].
> 
> I thought I had read in the past that Accumulo grew out of some other project, 
> but the project history in chapter 1 of the O'Reilly Accumulo book [2] 
> suggests that Accumulo was built independently. That said, to the degree that 
> Accumulo has similarities with HBase, another path is to fork/extend Drill's 
> HBase connector.
> 
> Yet another solution would be to use a REST API, leveraging the REST 
> connector that is being built. However, a quick review of the "Accumulo 
> Clients" page of the docs [3] does not suggest that Accumulo ships with a 
> REST API. Perhaps some other project has added one? Of course, this just 
> shifts the work from creating an Accumulo connector to forcing a generic 
> REST connector to issue the requests, and read the responses, that the 
> REST proxy might use. Not sure that is a huge win.
> 
> Accumulo appears to have Hive integration [4], as does Drill. I wonder if 
> that is a possible path? I'm not very familiar with how Drill reads data from 
> Hive, but if we use Hive's record format, and Accumulo can produce that 
> format, there might be a path. Not sure how things like filter push-down 
> would be handled.
> 
> All this said, Drill is designed to allow connectors. The API is not as 
> simple as we'd like (we're working on it), but if you need SQL access to 
> Accumulo, writing a connector is a possible path.
> 
> Finally, if you need a SQL solution today, there is Presto, which already has 
> an Accumulo connector [5].
> 
> 
> Thanks,
> - Paul
> 
> [1] https://accumulo.apache.org/release/accumulo-2.0.0/
> 
> [2] https://learning.oreilly.com/library/view/accumulo/9781491947098/
> 
> [3] https://accumulo.apache.org/docs/2.x/getting-started/clients
> 
> [4] https://cwiki.apache.org/confluence/display/Hive/AccumuloIntegration
> 
> [5] https://prestosql.io/docs/current/connector/accumulo.html
> 
> 
>     On Wednesday, February 19, 2020, 2:40:17 PM PST, Ron Cecchini 
> <[email protected]> wrote:  
>  
>  (Keeping in mind that I know next to nothing about Accumulo or Hive, etc...)
> 
> Is there currently (i.e. without writing a connector) any way whatsoever to 
> get Drill to query an Accumulo db? 
> 
> Thanks.
>
