Extending ARQ to retrieve FILTER operations from Sparql Queries

2018-04-24 Thread anuj kumar
I have another query, this time relating to ARQ and how I can extend it.
*Background*

I am using HBase as the underlying Data Store for Triples (by implementing
various Jena Extension points).
So far so good.
But now I have certain SPARQL queries that have FILTER operations defined.
With the huge data set that I have, it's taking a lot of time to filter the
data.

So, I want to extract the FILTER portion of the query and pass it as an
HBase Filter to the HBase client, so that filtering can happen at the
RegionServer level itself.

This question is very similar to the Stack Overflow question
<...jena-arq-filter-optimization?rq=1>.

I have been reading about ARQ and OpExecutor and believe that it may be
what I need, but I can't seem to find a simple example to build on.

Can someone help me understand how I can parse the incoming query to remove
the FILTER clause from it and instead take the filter statement and
evaluate it?

Thanks,
-- 
*Anuj Kumar*


Re: Extending ARQ to retrieve FILTER operations from Sparql Queries

2018-04-24 Thread Rob Vesse
The point is to modify not the parsing but the execution.

By substituting a different OpExecutor implementation, you can provide a custom
executeOp() implementation that recognizes when an OpFilter operator is seen
and acts accordingly.

Rob
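
As a rough illustration of the approach described above, the sketch below
models an executor that intercepts a filter operator and hands a simple
condition to the store instead of evaluating it in memory. It is plain Java
with hypothetical stand-in types: Op, Scan, Filter, Condition, Store and
Executor are all invented for this example and are not Jena or HBase classes.
In ARQ the corresponding hook would be a custom OpExecutor that reacts to
OpFilter, and the store-side condition would become an HBase Filter; how an
expression maps onto a particular HBase Filter is not covered here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FilterPushdownSketch {

    /** Hypothetical stand-in for an algebra operator (not Jena's Op). */
    interface Op {}

    /** Stand-in for a pattern that reads every row from the store. */
    static final class Scan implements Op {}

    /** Stand-in for a filter operator: a sub-operator plus one condition. */
    static final class Filter implements Op {
        final Op sub;
        final Condition cond;
        Filter(Op sub, Condition cond) { this.sub = sub; this.cond = cond; }
    }

    /** A condition "column operator value"; only "=" counts as pushable here. */
    static final class Condition {
        final String column, operator, value;
        Condition(String column, String operator, String value) {
            this.column = column; this.operator = operator; this.value = value;
        }
        boolean pushable() { return operator.equals("="); }
        boolean test(Map<String, String> row) {
            String actual = row.get(column);
            if (actual == null) return false;
            if (operator.equals("=")) return actual.equals(value);
            if (operator.equals("contains")) return actual.contains(value);
            throw new IllegalArgumentException("Unknown operator: " + operator);
        }
    }

    /** Stand-in for the storage client; counts the rows it sends back. */
    static final class Store {
        final List<Map<String, String>> rows = new ArrayList<>();
        int rowsShipped = 0;

        void add(String name, String city) {
            Map<String, String> row = new HashMap<>();
            row.put("name", name);
            row.put("city", city);
            rows.add(row);
        }

        /** Scans all rows; a non-null condition is applied "server side". */
        List<Map<String, String>> scan(Condition serverSide) {
            List<Map<String, String>> out = new ArrayList<>();
            for (Map<String, String> row : rows) {
                if (serverSide == null || serverSide.test(row)) out.add(row);
            }
            rowsShipped += out.size();
            return out;
        }
    }

    /** Dispatches on operator type and intercepts filters over scans. */
    static final class Executor {
        final Store store;
        Executor(Store store) { this.store = store; }

        List<Map<String, String>> execute(Op op) {
            if (op instanceof Filter) return executeFilter((Filter) op);
            if (op instanceof Scan) return store.scan(null);
            throw new IllegalArgumentException("Unsupported operator: " + op);
        }

        private List<Map<String, String>> executeFilter(Filter f) {
            // Push the condition into the store when it can be translated.
            if (f.sub instanceof Scan && f.cond.pushable()) {
                return store.scan(f.cond);
            }
            // Otherwise run the sub-operator and filter the rows in memory.
            List<Map<String, String>> out = new ArrayList<>();
            for (Map<String, String> row : execute(f.sub)) {
                if (f.cond.test(row)) out.add(row);
            }
            return out;
        }
    }

    /** Builds a small in-memory store with three made-up rows. */
    static Store sampleStore() {
        Store store = new Store();
        store.add("alice", "Oslo");
        store.add("bob", "Paris");
        store.add("carol", "Oslo");
        return store;
    }

    public static void main(String[] args) {
        Store store = sampleStore();
        Op query = new Filter(new Scan(), new Condition("city", "=", "Oslo"));
        int results = new Executor(store).execute(query).size();
        System.out.println(results + " results, " + store.rowsShipped + " rows shipped");
    }
}
```

With the equality condition, only the matching rows leave the store. A
condition the executor cannot translate (here, "contains") takes the fallback
path: every row is fetched and filtered in memory. A real executor would
likewise want to fall back to the default behaviour for any expression it
cannot push down.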

Re: Extending ARQ to retrieve FILTER operations from Sparql Queries

2018-04-24 Thread Rob Vesse
Looking at the code for how TDB does this might be useful:

https://github.com/apache/jena/blob/master/jena-tdb/src/main/java/org/apache/jena/tdb/solver/OpExecutorTDB1.java

Rob

Re: Extending ARQ to retrieve FILTER operations from Sparql Queries

2018-04-24 Thread anuj kumar
Thanks Rob for the pointers. I will start digging now.

On Tue, Apr 24, 2018 at 5:57 PM, Rob Vesse wrote:

> Looking at the code for how TDB does this might be useful:
>
> https://github.com/apache/jena/blob/master/jena-tdb/src/main/java/org/apache/jena/tdb/solver/OpExecutorTDB1.java
>
> Rob


-- 
*Anuj Kumar*