It is possible that visibilities are the culprit.  Does your Accumulo user
have the same authorizations you specified on the command line for the
RdfFileInputTool?   By CL scan, you're referring to an Accumulo Shell scan,
right?  You can use the Accumulo Shell command "getauths" to verify your
user's authorizations.
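
In case it helps, here is a rough, untested sketch of the same check done
programmatically with the Accumulo client API.  The instance name, zookeepers,
credentials, and the rya_spo table name are only the example values from the
command quoted below; substitute your own.

import java.util.Map;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class AuthCheck {
    public static void main(String[] args) throws Exception {
        // Connect as the same user that ran the load and the scan.
        Connector conn = new ZooKeeperInstance("accumulo", "zoo1,zoo2,zoo3")
            .getConnector("root", new PasswordToken("password"));

        // What authorizations does the user actually have?
        Authorizations auths = conn.securityOperations().getUserAuthorizations("root");
        System.out.println("auths for root: " + auths);

        // Scan the SPO table with those same authorizations.
        Scanner scanner = conn.createScanner("rya_spo", auths);
        for (Map.Entry<Key, Value> entry : scanner) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}

If the authorizations come back empty but the data was loaded with -Dac.cv
set, a scan will only return rows that carry no visibility label, which could
explain why you see just the one version line.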

On Thu, Mar 15, 2018 at 4:53 PM, Geoffry Roberts <threadedb...@gmail.com>
wrote:

> RdfFileInputTool worked, thanks for the help.
>
> Do I have a visibility problem?
>
> I ran the tool and it showed 47 records were inserted--good.  I see the
> tables were created as expected with the right prefix.
>
> But when I attempt a CL scan, I get one line out that appears to be telling
> me which version of Rya I am using.   I did both the load and the scan as
> the same user.
>
> Not sure what to make of this.
>
> On Thu, Mar 15, 2018 at 9:16 AM, Jeff Dasch <hcs...@gmail.com> wrote:
>
> > Geoffry,
> >
> > Take a look at the RdfFileInputTool [1] in the rya.mapreduce module.  It
> > doesn't look like the shaded jar was uploaded to maven, so you will likely
> > need to build that artifact yourself by including the "-P mr" profile when
> > building Rya.
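> >
> > For what it's worth, a Maven invocation along these lines should produce the
> > shaded jar (untested; the module path and flags are assumptions based on the
> > source tree layout):
> >
> > mvn clean package -pl mapreduce -am -P mr -DskipTests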
> >
> > There are instructions for loading data with the RdfFileInputTool here [2],
> > but they appear to be out of date.  I haven't tried it recently, but this
> > command, based on the unit test [3], should work:
> >
> > hadoop jar target/rya.mapreduce-3.2.12-shaded.jar
> > org.apache.rya.accumulo.mr.tools.RdfFileInputTool
> > -Dac.zk=zoo1,zoo2,zoo3 -Dac.instance=accumulo -Dac.username=root
> > -Dac.pwd=password -Dac.auth=auths -Dac.cv=auths -Drdf.tablePrefix=rya_
> > -Drdf.format=N-Triples /hdfs/path/to/triplefiles
> >
> >
> > [1] https://github.com/apache/incubator-rya/blob/master/mapreduce/src/main/java/org/apache/rya/accumulo/mr/tools/RdfFileInputTool.java
> > [2] https://github.com/apache/incubator-rya/blob/master/extras/rya.manual/src/site/markdown/loaddata.md
> > [3] https://github.com/apache/incubator-rya/blob/master/mapreduce/src/test/java/org/apache/rya/accumulo/mr/tools/RdfFileInputToolTest.java
> >
> >
> >
> > On Wed, Mar 14, 2018 at 5:28 PM, Geoffry Roberts <threadedb...@gmail.com> wrote:
> >
> > > All,
> > >
> > > Am I doing things the best way?
> > >
> > > I have a pile of data that I need to load into Rya.  I must first convert
> > > it into RDF, then do the load.  I am using map/reduce because I have a lot
> > > of data.
> > >
> > > I have an HDFS directory full of RDF in N-Triples format.
> > >
> > > I have a mapper like this:
> > >
> > > protected void map(LongWritable key, RyaStatementWritable value, Context ctx)
> > >     throws IOException, InterruptedException {
> > >
> > >   // RyaStatementWritable gives me a RyaStatement like this:
> > >   RyaStatement ryaStatement = value.getRyaStatement();
> > >
> > >   // At this point I find myself having to convert the
> > >   // RyaStatement into an OpenRDF Statement like this
> > >   // (sS, sP, sO are the subject/predicate/object strings
> > >   // pulled out of ryaStatement):
> > >   Sail ryaSail = RyaSailFactory.getInstance(conf);
> > >   ValueFactory vf = ryaSail.getValueFactory();
> > >   Statement stmt = vf.createStatement(vf.createURI(sS), vf.createURI(sP),
> > >       vf.createURI(sO));
> > >
> > >   ctx.write(NullWritable.get(), stmt);
> > > }
> > >
> > > In my reducer, I use AccumuloLoadStatements to load Rya like this:
> > >
> > > protected void reduce(NullWritable key, Iterable<Statement> stmts,
> > >     Reducer<NullWritable, Statement, NullWritable, NullWritable>.Context ctx)
> > >     throws IOException, InterruptedException {
> > >
> > >   AccumuloLoadStatements load = ... // omitted for brevity
> > >
> > >   try {
> > >     // instance: the Rya instance name
> > >     load.loadStatements(instance, stmts);
> > >   } catch (RyaClientException e) {
> > >     log.error("", e);
> > >   }
> > > }
> > >
> > >
> > > Thanks
> > >
> > > --
> > > There are ways and there are ways,
> > >
> > > Geoffry Roberts
> > >
> >
>
>
>
> --
> There are ways and there are ways,
>
> Geoffry Roberts
>
