Re: Wiki data

2017-03-02 Thread Lorenz B.
The question is strange.

Wikidata provides RDF data, also accessible via SPARQL.
DBpedia provides RDF data, also accessible via SPARQL.

In addition, both provide schema information via OWL/RDFS axioms, e.g.
domain, range, subclass hierarchy, etc.

Protege can load any such data.

The rest should be clear: It is the same as you did for DBpedia.
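
Since both are reachable over the standard SPARQL protocol, here is a minimal sketch of what a request against Wikidata could look like. It uses only the JDK HTTP classes rather than Jena, so it is an illustration of the protocol and not Jena's own API. The endpoint URL, the example entity (Q42), the User-Agent string, and the class and method names are assumptions made for this example.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class WikidataQuerySketch {
    // Assumed public SPARQL endpoint of the Wikidata Query Service.
    static final String ENDPOINT = "https://query.wikidata.org/sparql";

    /** Builds (but does not send) a SPARQL protocol GET request for the given query. */
    public static HttpRequest buildRequest(String sparql) {
        // The query text travels URL-encoded in the "query" parameter.
        String url = ENDPOINT + "?query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept", "application/sparql-results+json")
                // Hypothetical client identifier, purely illustrative.
                .header("User-Agent", "wikidata-sketch/0.1")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Illustrative query: a few statements about one Wikidata entity.
        String sparql = "SELECT ?p ?o WHERE { <http://www.wikidata.org/entity/Q42> ?p ?o } LIMIT 5";
        HttpRequest request = buildRequest(sparql);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Pointing ENDPOINT at a DBpedia SPARQL endpoint instead should be the only change needed for DBpedia, which is the sense in which the two behave the same.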



> Can we add wikidata in Protege like we do in DBpedia. Not sure if Protege
> and Jena allow us to use both wikidata and DBpedia in one application.?
>
> On Thu, Mar 2, 2017 at 3:42 PM, Marco Neumann 
> wrote:
>
>> since wikidata.org provides canonical RDF dumps the data should behave
>> like any other data set. not particularly relevant to this list
>> though.
>>
>> https://www.wikidata.org/wiki/Wikidata:Database_download#RDF_dumps
>>
>>
>>
>> On Thu, Mar 2, 2017 at 7:35 AM, javed khan  wrote:
>>> Is Jena support wikidata the same way as it support DBpedia? For example,
>>> we store DBpedia resources in our owl file and then access it from our
>> Jena
>>> code. Any example, if some one provide how to access a wikidata using
>> Jena
>>> code?
>>>
>>> Thank you.
>>
>>
>> --
>>
>>
>> ---
>> Marco Neumann
>> KONA
>>
-- 
Lorenz Bühmann
AKSW group, University of Leipzig
Group: http://aksw.org - semantic web research center



Re: Extending Jena Text to Support ElasticSearch as Indexing/Querying Engine

2017-03-02 Thread anuj kumar
I second that. I am now finalising the integration of ES and should have a
good production-quality implementation ready in a week's time. At that
point I would like you guys to have a look at the implementation and provide
feedback. Once you have upgraded Lucene to 6.4.1, I can merge the
code into the jena-text module and do a round of testing.

Thanks,
Anuj Kumar


Re: Extending Jena Text to Support ElasticSearch as Indexing/Querying Engine

2017-03-02 Thread A. Soroka
I do agree that trying to juggle different versions of Lucene libraries is 
probably not a realistic option right now. Luckily (if I understand the 
conversation thus far correctly) we have a solid alternative; getting our 
current Lucene dependency upgraded should allow us to (eventually) merge Anuj's 
work into the mainstream of development. Someone please tell me if I have that 
wrong! :grin:

Let me reiterate that this seems like very good work and speaking for myself, I 
certainly want to get it included into Jena. It's just a question of fitting it 
in correctly, which might take a bit of time. 

---
A. Soroka
The University of Virginia Library

> On Mar 1, 2017, at 1:27 PM, Osma Suominen  wrote:
> 
> Hi Anuj!
> 
> I have nothing against modularity in general. However, I cannot see how your 
> proposal could work in practice for the Fuseki build, due to the reasons I 
> mentioned in my previous message (and Adam seemed to concur).
> 
> In any case, I'll see what I can do to get the Lucene upgrade moving again. 
> If all current Jena modules (ie jena-text and jena-spatial) were upgraded to 
> Lucene 6.4.1, then you could just add your ES classes to jena-text, right? I 
> think that would be better for everyone than having to maintain your own 
> separate module.
> 
> -Osma
> 
> 01.03.2017, 16:59, anuj kumar kirjoitti:
>> I personally have no preference as to how the code in Jena should be
>> structured, as long as I am able to use it :).
>> I have personal preference of doing it in a specific way because IMO, it is
>> modular which makes it much easier to maintain in the long run. But again
>> it may not be the quickest one.
>> 
>> I already have been given a deadline, by the company to have ES extension
>> implemented in the next 15 days :). What this means is that I will be
>> maintaining the ES code extension to Jena Text at-least locally for a
>> coming period of time. I would be more than happy to contribute to Jena
>> community whatever is required to have a proper ElasticSearch
>> Implementation in place, whether within jena-text module or as a separate
>> module. Till the time Lucene and Solr is not upgraded to the latest
>> version, I will have to maintain a separate module for jena-text-es.
>> 
>> Cheers!
>> Anuj Kumar
>> 
>> 
>> On Wed, Mar 1, 2017 at 3:36 PM, A. Soroka  wrote:
>> 
>>> Osma--
>>> 
>>> The short answer is that yes, given the right tools you _can_ have
>>> different versions of code accessible in different ways. The longer answer
>>> is that it's probably not a viable alternative for Jena for this problem,
>>> at least not without a lot of other change.
>>> 
>>> You are right to point to the classloader mechanism as being at the heart
>>> of this question, but I must alter your remark just slightly. From "the
>>> Java classloader only sees a single, flat package/class namespace and a set
>>> of compiled classes" to "ANY GIVEN Java classloader only sees a single,
>>> flat package/class namespace and a set of compiled classes".
>>> 
>>> This is the fact that OSGi uses to make it possible to maintain strict
>>> module boundaries (and even dynamic module relationships at run-time). Each
>>> OSGi bundle sees its own classloader, and the framework is responsible for
>>> connecting bundles up to ensure that every bundle has what it needs in the
>>> way of types to function, based on metadata that the bundles provide to the
>>> framework. It's an incredibly powerful system (I use it every day and enjoy
>>> it enormously) but it's also very "heavy" and requires a good deal of
>>> investment to use. In particular, it's probably too large to put _inside_
>>> Jena. (I frequently put Jena inside an OSGi instance, on the other hand.)
>>> 
>>> Java 9 Jigsaw [1] offers some possibility for strong modularization of
>>> this kind, but it's really meant for the JDK itself, not application
>>> libraries. In theory, we could "roll our own" classloader management for
>>> this problem. That sounds like more than a bit of a rabbit hole to me.
>>> There might be another, more lightweight, toolkit out there to this
>>> purpose, but I'm not aware of any myself.
>>> 
>>> Otherwise, yes, you get into shading and the like. We have to do that for
>>> Guava for now because of HADOOP-10101 (grumble grumble) but it's hardly a
>>> thing we want to do any more of than needed, I don't think.
>>> 
>>> ---
>>> A. Soroka
>>> The University of Virginia Library
>>> 
>>> [1] http://openjdk.java.net/projects/jigsaw/
>>> 
>>>> On Mar 1, 2017, at 9:03 AM, Osma Suominen wrote:
>>>>
>>>> Hi Anuj!
>>>>
>>>> Thanks for the clarification.
>>>>
>>>> However, I'm still not sure I understand the situation completely. I
>>>> know Maven can perform a lot of tricks, but Maven modules are just
>>>> convenient ways to structure a Java project. Maven cannot change the fact
>>>> that at runtime, module divisions don't really matter (except that they

Re: Wiki data

2017-03-02 Thread A. Soroka
That's a good question to ask the Protege support lists.

---
A. Soroka
The University of Virginia Library

> On Mar 2, 2017, at 3:31 PM, javed khan  wrote:
> 
> Can we add wikidata in Protege like we do in DBpedia. Not sure if Protege
> and Jena allow us to use both wikidata and DBpedia in one application.?
> 
> On Thu, Mar 2, 2017 at 3:42 PM, Marco Neumann 
> wrote:
> 
>> since wikidata.org provides canonical RDF dumps the data should behave
>> like any other data set. not particularly relevant to this list
>> though.
>> 
>> https://www.wikidata.org/wiki/Wikidata:Database_download#RDF_dumps
>> 
>> 
>> 
>> On Thu, Mar 2, 2017 at 7:35 AM, javed khan  wrote:
>>> Is Jena support wikidata the same way as it support DBpedia? For example,
>>> we store DBpedia resources in our owl file and then access it from our
>> Jena
>>> code. Any example, if some one provide how to access a wikidata using
>> Jena
>>> code?
>>> 
>>> Thank you.
>> 
>> 
>> 
>> --
>> 
>> 
>> ---
>> Marco Neumann
>> KONA
>> 



Re: Wiki data

2017-03-02 Thread javed khan
Can we add Wikidata in Protege like we do with DBpedia? I am not sure whether
Protege and Jena allow us to use both Wikidata and DBpedia in one application.

On Thu, Mar 2, 2017 at 3:42 PM, Marco Neumann 
wrote:

> since wikidata.org provides canonical RDF dumps the data should behave
> like any other data set. not particularly relevant to this list
> though.
>
> https://www.wikidata.org/wiki/Wikidata:Database_download#RDF_dumps
>
>
>
> On Thu, Mar 2, 2017 at 7:35 AM, javed khan  wrote:
> > Is Jena support wikidata the same way as it support DBpedia? For example,
> > we store DBpedia resources in our owl file and then access it from our
> Jena
> > code. Any example, if some one provide how to access a wikidata using
> Jena
> > code?
> >
> > Thank you.
>
>
>
> --
>
>
> ---
> Marco Neumann
> KONA
>


Re: Wiki data

2017-03-02 Thread Marco Neumann
Since wikidata.org provides canonical RDF dumps, the data should behave
like any other data set. This is not particularly relevant to this list,
though.

https://www.wikidata.org/wiki/Wikidata:Database_download#RDF_dumps



On Thu, Mar 2, 2017 at 7:35 AM, javed khan  wrote:
> Is Jena support wikidata the same way as it support DBpedia? For example,
> we store DBpedia resources in our owl file and then access it from our Jena
> code. Any example, if some one provide how to access a wikidata using Jena
> code?
>
> Thank you.



-- 


---
Marco Neumann
KONA


Wiki data

2017-03-02 Thread javed khan
Does Jena support Wikidata the same way it supports DBpedia? For example,
we store DBpedia resources in our OWL file and then access them from our Jena
code. Could someone provide an example of how to access Wikidata from Jena
code?

Thank you.


Re: Extending Jena Text to Support ElasticSearch as Indexing/Querying Engine

2017-03-02 Thread anuj kumar
Just FYI, I was able to index multiple fields in ElasticSearch using the Jena
Text capability.
The issue was in my ElasticSearch code, where I was doing an insert every time
instead of an update :/
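
For anyone hitting the same symptom, here is a minimal sketch of the difference. It assumes that each indexed field for an entity arrives as a separate write against the same document id, which is a guess about the setup and not a description of the real code. It is a plain-Java stand-in using a map, not the ElasticSearch client API or jena-text classes, and every name in it is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class InsertVsUpdateSketch {
    // Stand-in for a search index: document id -> (field name -> value).
    private final Map<String, Map<String, String>> docs = new HashMap<>();

    /** "Insert" semantics: the stored document is replaced wholesale. */
    public void insert(String id, String field, String value) {
        Map<String, String> doc = new HashMap<>();
        doc.put(field, value);
        docs.put(id, doc); // any fields written earlier are lost
    }

    /** "Update" semantics: the new field is merged into the existing document. */
    public void update(String id, String field, String value) {
        docs.computeIfAbsent(id, k -> new HashMap<>()).put(field, value);
    }

    /** Returns the stored fields for an id, or an empty map if none exist. */
    public Map<String, String> get(String id) {
        return docs.getOrDefault(id, Map.of());
    }

    public static void main(String[] args) {
        InsertVsUpdateSketch replacing = new InsertVsUpdateSketch();
        replacing.insert("ex:thing", "label", "A thing");
        replacing.insert("ex:thing", "comment", "Some comment");
        // Only the field from the last insert is still present.
        System.out.println("insert: " + replacing.get("ex:thing").size() + " field(s)");

        InsertVsUpdateSketch merging = new InsertVsUpdateSketch();
        merging.update("ex:thing", "label", "A thing");
        merging.update("ex:thing", "comment", "Some comment");
        // Both fields are present after two updates.
        System.out.println("update: " + merging.get("ex:thing").size() + " field(s)");
    }
}
```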

Cheers!
Anuj Kumar

On Wed, Mar 1, 2017 at 7:40 PM, anuj kumar  wrote:

> Thanks Osma. I sent my previous email just a minute early. I will try your
> suggestion and if it doesn't work will send you the entire example.
>
> Thanks again.
> Anuj
>
> On 1 Mar 2017 19:36, "Osma Suominen"  wrote:
>
>> Hi Anuj!
>>
>> Generally I use assembler descriptions to configure the jena-text index.
>> An example with multiple properties (SKOS label properties) is here:
>> https://github.com/NatLibFi/Skosmos/wiki/InstallTutorial#cre
>> ating-a-text-index
>>
>> For examples on how to use assembler descriptions from Java code, take a
>> look at the jena-text unit tests. They generally contain a snippet of
>> assembler definition that configures the text index in a particular way,
>> then test that it does what it should when using that configuration.
>>
>> You didn't provide a full example. What is your data and what query did
>> you use? What results did you expect? What happened instead?
>>
>> One possible problem in your configuration is that you have set the
>> primary predicate to rdfs:label, but not set a field for it. Try adding
>> this:
>>
>> entDef.set("label", RDFS.label.asNode());
>>
>> For querying everything else but the default field, you need to specify
>> the predicate at query time. With your configuration, it should be possible
>> to query rdfs:comment values like this:
>>
>> ?s text:query (rdfs:comment "word") .
>>
>> Hope this helps!
>>
>> -Osma
>>
>> 01.03.2017, 17:33, anuj kumar kirjoitti:
>>
>>> BTW, I have one more question:
>>>
>>> How do I add more than one field to be indexed in my Index?
>>> Basically, if I want to index rdfs:label , rdfs:comment in the same index
>>> document, how do I do it?
>>>
>>> I tried :
>>>
>>> EntityDefinition entDef = new EntityDefinition(DOC_TYPE,
>>> FIELD_TO_SEARCH);
>>> entDef.setPrimaryPredicate(RDFS.label);
>>> entDef.setGraphField(GRAPH_FIELD_NAME);
>>> entDef.set("comment", RDFS.comment.asNode());
>>>
>>> But it doesnt work. Can you please point me on a way to do it please.
>>> This
>>> is an important piece of functionality I need.
>>>
>>> Thanks,
>>> Anuj Kumar
>>>
>>>

Re: SPARQL query

2017-03-02 Thread Osma Suominen

Hi Claude,

Okay, let me try. I think I've done something similar actually - and ran 
into performance problems with those queries using the latest Jena 
releases. But that's a different story...


SELECT ?r
WHERE {
  # we have a specific resource for which we are looking for matches
  BIND(:inputresource AS ?a)

  # the input resource has one or more properties
  ?a :p ?val .
  # the matching resource should have at least one of those properties
  ?r :p ?val .

  # but the targeted resource shouldn't have any property that the
  # input resource doesn't have
  FILTER NOT EXISTS {
    ?r :p ?val2 .
    FILTER NOT EXISTS {
      ?a :p ?val2 .
    }
  }
}

I think one or both of the FILTER NOT EXISTS may be changed to a MINUS 
as well.


Performance is likely to be poor with large data.
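
To make the intended semantics concrete, here is a small sketch of the same condition in plain Java, outside of SPARQL and Jena: a resource matches if it shares at least one value with the input and has no value the input lacks. The class and method names are made up for illustration, and the data is the w/x/y/z example from this thread. Unlike the query, the sketch keeps the input resource separate from the candidates, so it never returns the input itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SubsetMatchSketch {
    /** Returns the names of resources whose values overlap the input and add nothing to it. */
    public static List<String> matches(Set<String> input, Map<String, Set<String>> resources) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : resources.entrySet()) {
            Set<String> values = e.getValue();
            // Corresponds to "?a :p ?val . ?r :p ?val ." (at least one shared value).
            boolean sharesOne = !Collections.disjoint(values, input);
            // Corresponds to the nested FILTER NOT EXISTS (no value outside the input).
            boolean nothingExtra = input.containsAll(values);
            if (sharesOne && nothingExtra) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> resources = new LinkedHashMap<>();
        resources.put("w", Set.of("A", "D"));
        resources.put("x", Set.of("A", "B", "C"));
        resources.put("y", Set.of("A", "C"));
        resources.put("z", Set.of("A", "B", "C", "D"));

        // The input resource has A, B and C.
        System.out.println(matches(Set.of("A", "B", "C"), resources)); // prints [x, y]
    }
}
```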

-Osma

02.03.2017, 09:52, Claude Warren kirjoitti:

Osma,  I think I am asking for the latter.  Table wise it looks like this

{noformat}
     A  B  C  D   Match
w    1        1   No
x    1  1  1      Yes
y    1     1      Yes
z    1  1  1  1   No

Looking for matches for
P    1  1  1
{noformat}


Items in the table may not specify values for properties that P does not
have

Claude


On Thu, Mar 2, 2017 at 7:30 AM, Osma Suominen 
wrote:


Hi Claude,

Do you mean something like this?

SELECT ?r
WHERE {
  { # must have at least one of A, B, C
    { ?r :p :A }
    UNION
    { ?r :p :B }
    UNION
    { ?r :p :C }
  }
  # must not have D, E or F
  FILTER NOT EXISTS { ?r :p :D }
  FILTER NOT EXISTS { ?r :p :E }
  FILTER NOT EXISTS { ?r :p :F }
}

I'm sure there are other possible variations, for example the FILTER NOT
EXISTS patterns could be combined to a single pattern using UNION. UNION is
generally more efficient than filter rules such as IN or NOT IN, that's why
I used it above. Using a VALUES block would have been an option as well.

Or did you mean that this has to be generic, so that any resource ?a can
be used as input and the query has to figure out the restrictions based on
properties that ?a has or doesn't have? That's going to be trickier...

-Osma


01.03.2017, 22:12, Claude Warren kirjoitti:


I have a graph where resources have a number of controlled properties.
Call them A through F.  Some resources may have one property; others may
have all 6.

I have a resource with properties A, B, and C.

I want to find all resources in the graph where the resource has matching
values but does not have any non-specified properties.

So
x with A, B, and C will match
y with A and C will match
z with A, B, C and D will not match
w with A and D will not match.

Any idea how to construct a query that will do this?

It seems backwards to the way I normally think about queries.

Any help would be appreciated,
Claude







--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suomi...@helsinki.fi
http://www.nationallibrary.fi







