[ 
https://issues.apache.org/jira/browse/ACCUMULO-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808623#comment-13808623
 ] 

Philip A Grim II commented on ACCUMULO-1804:
--------------------------------------------

David,

I'll get a README posted to the GitHub repo this week.

Basically, there are three parts, assuming the things you've already done.  
First, you have to have an Accumulo proxy running on the Accumulo instance you 
want to talk to, which you can do by following the instructions in 
$ACCUMULO_HOME/proxy.
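
As a rough sketch of the first step, the proxy can be launched with something like the following. The launcher path and the proxy.properties file name are assumptions based on the Accumulo 1.5 layout, and start_proxy is just an illustrative helper name; the instructions in $ACCUMULO_HOME/proxy take precedence if they differ.

```shell
# Start the Accumulo proxy using the properties file under $ACCUMULO_HOME/proxy.
# Assumption: the launcher is $ACCUMULO_HOME/bin/accumulo and the properties
# file is named proxy.properties; adjust both if your layout differs.
start_proxy() {
    if [ -z "${ACCUMULO_HOME:-}" ]; then
        echo "ACCUMULO_HOME is not set" >&2
        return 1
    fi
    "${ACCUMULO_HOME}/bin/accumulo" proxy -p "${ACCUMULO_HOME}/proxy/proxy.properties"
}
```

Run start_proxy on the Accumulo host; the proxy stays in the foreground until it is stopped.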

Second, you have to have Thrift installed on the box R is running on.  I don't 
recall if there's an apt package for it - I just build it from source.  
Building it from source has the advantage of ensuring that you have all of the 
prerequisites for performing the third step...
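
Before building Thrift, it may help to confirm the basic build tools are on the box. This is only a sketch: missing_tools is a hypothetical helper, and the tool list is my assumption for a typical C++ source build, not Thrift's official prerequisite list.

```shell
# Print the name of each given command that is not on PATH.
# The tool list is supplied by the caller; nothing here is Thrift-specific.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed tool set for a typical C++ source build; Thrift's own
# documentation has the authoritative prerequisite list.
missing=$(missing_tools g++ make pkg-config)
if [ -n "$missing" ]; then
    echo "install these before building Thrift:" $missing >&2
fi
```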

Third, you have to install raccumulo into R.  In a command shell, cd to the 
directory where you unpacked the tarball (not the raccumulo directory itself, 
but the parent) and type the following command:

R CMD INSTALL raccumulo

This will cause R to configure, build, and install the package into your R 
library.  The most common reason for the build to fail is that your 
PKG_CONFIG_PATH doesn't include the Thrift installation.
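
pkg-config reads its search path from the PKG_CONFIG_PATH environment variable, so one way to check for this problem before running the install is sketched below. It assumes Thrift went into /usr/local (the usual default for a source build) and that the raccumulo build locates Thrift through pkg-config; THRIFT_PREFIX is a variable made up for this example.

```shell
# Assumption: Thrift was installed under /usr/local, the default prefix
# for a source build. Override THRIFT_PREFIX if you installed elsewhere.
THRIFT_PREFIX="${THRIFT_PREFIX:-/usr/local}"

# Put the prefix's pkgconfig directory first, keeping any existing entries.
export PKG_CONFIG_PATH="${THRIFT_PREFIX}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"

# Ask pkg-config whether it can see Thrift with this search path.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists thrift; then
    echo "thrift found: $(pkg-config --modversion thrift)"
else
    echo "thrift not visible to pkg-config; check THRIFT_PREFIX" >&2
fi
```

If Thrift is reported as found, run the R CMD INSTALL command from the same shell so it inherits the exported path.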

Assuming it builds and installs, and you have the proxy running on your 
Accumulo instance, you should be able to talk to Accumulo from R.  There is a 
file in the noinst directory under raccumulo called test.R that shows examples 
of how to connect and use the functionality.  There are also R man pages you 
can get at from the RStudio help tab, or from the R prompt by typing ?raccumulo.
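
To confirm from the command line that the install actually landed in your R library, something like this should work. It is a sketch: r_package_installed is a hypothetical helper, and it assumes the package is registered under the name raccumulo.

```shell
# Return 0 if the named R package can be loaded, non-zero otherwise
# (including when Rscript itself is not on PATH).
r_package_installed() {
    command -v Rscript >/dev/null 2>&1 || return 1
    Rscript -e 'args <- commandArgs(trailingOnly = TRUE); quit(status = if (requireNamespace(args[1], quietly = TRUE)) 0 else 1)' "$1" >/dev/null 2>&1
}

# Assumption: the package is installed under the name "raccumulo".
if r_package_installed raccumulo; then
    echo "raccumulo is installed"
else
    echo "raccumulo not found in the R library" >&2
fi
```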

As I said, sometime this week, I intend to have this all written up formally 
and in great detail and posted in the raccumulo GitHub repo.

Phil

> Integrate RStudio to work with data residing in Accumulo
> --------------------------------------------------------
>
>                 Key: ACCUMULO-1804
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1804
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Aaron Glahe
>            Priority: Minor
>         Attachments: raccumulo-release.tar.gz
>
>
> Need to be able to support users who utilize RStudio to conduct analysis of 
> data residing in the Accumulo data space instead of moving data from one 
> repository to a stand alone system to have the analytic run in memory.  
> RStudio should be able to make calls directly to the data space and provide 
> the output within the RStudio interface.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
