RE: How to index different types of RDF file in one data set

2021-01-08 Thread 李惠玲
Hi Lorenz,

Thank you, not only for the reply but also for solving a problem we were about 
to run into: we soon found that using FILTER alone hurts query efficiency, so 
now we can try your suggestion and use a VALUES clause to avoid that.

And no, as you can see, we don't know much about SPARQL. We simply tried to 
achieve what we wanted and did not omit that intentionally (I hope I understood 
your question correctly).

Regards,
Huiling Lee
-----Original Message-----
From: Lorenz Buehmann  
Sent: Thursday, January 7, 2021 6:20 PM
To: users@jena.apache.org
Subject: Re: How to index different types of RDF file in one data set

Yep, your query works. Even better - I know you're not yet familiar with all 
the features of SPARQL - there is the concept of inline data, e.g. via the 
VALUES clause. This can avoid a scan over the data:

SELECT ?g ?label ?type (COUNT(*) as ?count) {
 VALUES ?type {madsrdf:Topic madsrdf:PersonalName}
 ?g ?p ?o .
 ?g rdf:type madsrdf:Authority ; madsrdf:authoritativeLabel ?label .
 FILTER (!isBlank(?g)) .
 ?g rdf:type ?type .
 
}

But now you're missing the full-text search - did you just omit it from your 
query for brevity?

On 07.01.21 11:00, 李惠玲 wrote:
> Hi Lorenz,
>
> Thank you for the reply. 
> At first we thought we needed to adjust the config file to achieve what we 
> wanted, so we made several adjustments; using "madsrdf:elementList" was one 
> of them (it seemed this could index all the elements underneath). Of course, 
> this didn't work.
> In your replies, Andy mentioned "In a single jena-text Lucene index, all the 
> values of some predicate are indexed in the same Lucene field. Predicates in 
> RDF are globally defined relationships.", and you mentioned "it's possible 
> via SPARQL", so we thought maybe we had been thinking in the wrong direction. 
> One of the reasons is probably that we're not that familiar with SPARQL 
> query syntax.
>
> So we looked further into it, found that there is a "FILTER" syntax, and 
> tried the following query:
>
> SELECT ?g ?label ?type (COUNT(*) as ?count) {
>  ?g ?p ?o .
>  ?g rdf:type madsrdf:Authority ; madsrdf:authoritativeLabel ?label .
>  FILTER (!isBlank(?g)) .
>  ?g rdf:type ?type .
>  FILTER(?type = madsrdf:Topic || ?type = madsrdf:PersonalName) .
> }
>
> and changed the config file back to:
>
> text:map (
> [   
> text:field "authoritativeLabel" ; 
> text:predicate madsrdf:authoritativeLabel ;
> ]
> [   
> text:field "variantLabel" ; 
> text:predicate madsrdf:variantLabel ;
> ]
> [   
> text:field "citation-note" ; 
> text:predicate madsrdf:citation-note ;
> ]
> [   
> text:field "citation-source" ; 
> text:predicate madsrdf:citation-source ;
> ]
> ) .
>
> After these changes, we now get query results like this 
> (it seems to work for now):
>
> In search bar: Man
>
> #   Label                     Concept
> ------------------------------------------
> 1   Mann, Klaus, 1906-1949    PersonalName
> 2   Man                       Topic
>
>
> Perhaps I still don't get the point of what you and Andy tried to explain 
> (sorry about this), but what you said did give us some inspiration, and for 
> that we are very grateful.
>
> Regards,
> Huiling Lee
> -----Original Message-----
> From: Lorenz Buehmann 
> Sent: Wednesday, January 6, 2021 7:51 PM
> To: users@jena.apache.org
> Subject: Re: How to index different types of RDF file in one data set
>
>
> On 06.01.21 12:17, 李惠玲 wrote:
>> What we trying to do is after querying a string, the results could 
>> show both content type triples in the list, if it fits the literals;
>>
>> Thank you for your replies (and hint), we probably thinking in a wrong way 
>> about querying RDF type, yes, we should try via SPARQL, not config file.
> what does this mean? How do you access your data right now if not via SPARQL? 
> I mean you put it into a triple store or not?
>
> Something like
>
> select * where {
> ?s a madsrdf:PersonalName ;
>     text:query "some_search_string_here"
> }
>
>
> Also, as Andy pointed out, your index creation seems odd. You add an 
> index on madsrdf:elementList predicate, but according to your sample 
> data this doesn't link to string literals at all. It should be 
> madsrdf:authoritativeLabel in your config file
>
>> So, we'll keep on f

Re: How to index different types of RDF file in one data set

2021-01-07 Thread Lorenz Buehmann
Yep, your query works. Even better - I know you're not yet familiar with
all the features of SPARQL - there is the concept of inline data, e.g. via
the VALUES clause. This can avoid a scan over the data:

SELECT ?g ?label ?type (COUNT(*) as ?count) {
 VALUES ?type {madsrdf:Topic madsrdf:PersonalName}
 ?g ?p ?o .
 ?g rdf:type madsrdf:Authority ; madsrdf:authoritativeLabel ?label .
 FILTER (!isBlank(?g)) .
 ?g rdf:type ?type .
 
}

But now you're missing the full-text search - did you just omit it from your 
query for brevity?
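
A minimal sketch of how the full-text search could be added back to the query 
above. It assumes the standard jena-text prefix and the index on 
madsrdf:authoritativeLabel from the config quoted below; the search string 
"Man" is taken from the example in the quoted message, and the COUNT is left 
out to keep the sketch short:

```sparql
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX madsrdf: <http://www.loc.gov/mads/rdf/v1#>
PREFIX text:    <http://jena.apache.org/text#>

SELECT ?g ?label ?type
WHERE {
  # Inline data: only these two types are considered.
  VALUES ?type { madsrdf:Topic madsrdf:PersonalName }
  # Full-text lookup in the Lucene field mapped to madsrdf:authoritativeLabel.
  ?g text:query (madsrdf:authoritativeLabel "Man") .
  ?g rdf:type ?type ;
     madsrdf:authoritativeLabel ?label .
  FILTER (!isBlank(?g))
}
```

This only returns matches when the dataset is served through the text-indexed 
dataset defined in the Fuseki configuration.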

On 07.01.21 11:00, 李惠玲 wrote:
> Hi Lorenz,
>
> Thank you for the reply. 
> At first we thought we needed to adjust the config file to achieve what we 
> wanted, so we made several adjustments; using "madsrdf:elementList" was one 
> of them (it seemed this could index all the elements underneath). Of course, 
> this didn't work.
> In your replies, Andy mentioned "In a single jena-text Lucene index, all the 
> values of some predicate are indexed in the same Lucene field. Predicates in 
> RDF are globally defined relationships.", and you mentioned "it's possible 
> via SPARQL", so we thought maybe we had been thinking in the wrong direction. 
> One of the reasons is probably that we're not that familiar with SPARQL 
> query syntax.
>
> So we looked further into it, found that there is a "FILTER" syntax, and 
> tried the following query:
>
> SELECT ?g ?label ?type (COUNT(*) as ?count) {
>  ?g ?p ?o .
>  ?g rdf:type madsrdf:Authority ; madsrdf:authoritativeLabel ?label .
>  FILTER (!isBlank(?g)) .
>  ?g rdf:type ?type .
>  FILTER(?type = madsrdf:Topic || ?type = madsrdf:PersonalName) .
> }
>
> and changed the config file back to:
>
> text:map (
> [   
> text:field "authoritativeLabel" ; 
> text:predicate madsrdf:authoritativeLabel ;
> ]
> [   
> text:field "variantLabel" ; 
> text:predicate madsrdf:variantLabel ;
> ]
> [   
> text:field "citation-note" ; 
> text:predicate madsrdf:citation-note ;
> ]
> [   
> text:field "citation-source" ; 
> text:predicate madsrdf:citation-source ;
> ]
> ) .
>
> After these changes, we now get query results like this 
> (it seems to work for now):
>
> In search bar: Man
>
> #   Label                     Concept
> ------------------------------------------
> 1   Mann, Klaus, 1906-1949    PersonalName
> 2   Man                       Topic
>
>
> Perhaps I still don't get the point of what you and Andy tried to explain 
> (sorry about this), but what you said did give us some inspiration, and for 
> that we are very grateful.
>
> Regards,
> Huiling Lee
> -----Original Message-----
> From: Lorenz Buehmann  
> Sent: Wednesday, January 6, 2021 7:51 PM
> To: users@jena.apache.org
> Subject: Re: How to index different types of RDF file in one data set
>
>
> On 06.01.21 12:17, 李惠玲 wrote:
>> What we trying to do is after querying a string, the results could 
>> show both content type triples in the list, if it fits the literals;
>>
>> Thank you for your replies (and hint), we probably thinking in a wrong way 
>> about querying RDF type, yes, we should try via SPARQL, not config file.
> what does this mean? How do you access your data right now if not via SPARQL? 
> I mean you put it into a triple store or not?
>
> Something like
>
> select * where {
> ?s a madsrdf:PersonalName ;
>     text:query "some_search_string_here"
> }
>
>
> Also, as Andy pointed out, your index creation seems odd. You add an index on 
> madsrdf:elementList predicate, but according to your sample data this doesn't 
> link to string literals at all. It should be madsrdf:authoritativeLabel in 
> your config file
>
>> So, we'll keep on fighting!
>>
>> Thanks again,
>> Huiling Lee
>> -----Original Message-----
>> From: Lorenz Buehmann 
>> Sent: Wednesday, January 6, 2021 4:23 PM
>> To: users@jena.apache.org
>> Subject: Re: How to index different types of RDF file in one data set
>>
>> In addition to what Andy said:
>>
>> Even if you don't introduce separate subproperties for each type, why 
>> shouldn't you be able to distinguish both in a query? I mean, there are RDF 
>> types for both, so just append another triple pattern. I doubt it matters if 

RE: How to index different types of RDF file in one data set

2021-01-07 Thread 李惠玲
Hi Lorenz,

Thank you for the reply. 
At first we thought we needed to adjust the config file to achieve what we 
wanted, so we made several adjustments; using "madsrdf:elementList" was one of 
them (it seemed this could index all the elements underneath). Of course, this 
didn't work.
In your replies, Andy mentioned "In a single jena-text Lucene index, all the 
values of some predicate are indexed in the same Lucene field. Predicates in 
RDF are globally defined relationships.", and you mentioned "it's possible via 
SPARQL", so we thought maybe we had been thinking in the wrong direction. One 
of the reasons is probably that we're not that familiar with SPARQL query 
syntax.

So we looked further into it, found that there is a "FILTER" syntax, and tried 
the following query:

SELECT ?g ?label ?type (COUNT(*) as ?count) {
 ?g ?p ?o .
 ?g rdf:type madsrdf:Authority ; madsrdf:authoritativeLabel ?label .
 FILTER (!isBlank(?g)) .
 ?g rdf:type ?type .
 FILTER(?type = madsrdf:Topic || ?type = madsrdf:PersonalName) .
}

and changed the config file back to:

text:map (
[   
text:field "authoritativeLabel" ; 
text:predicate madsrdf:authoritativeLabel ;
]
[   
text:field "variantLabel" ; 
text:predicate madsrdf:variantLabel ;
]
[   
text:field "citation-note" ; 
text:predicate madsrdf:citation-note ;
]
[   
text:field "citation-source" ; 
text:predicate madsrdf:citation-source ;
]
) .

After these changes, we now get query results like this (it seems to work for 
now):

In search bar: Man

#   Label                     Concept
------------------------------------------
1   Mann, Klaus, 1906-1949    PersonalName
2   Man                       Topic


Perhaps I still don't get the point of what you and Andy tried to explain 
(sorry about this), but what you said did give us some inspiration, and for 
that we are very grateful.

Regards,
Huiling Lee
-----Original Message-----
From: Lorenz Buehmann  
Sent: Wednesday, January 6, 2021 7:51 PM
To: users@jena.apache.org
Subject: Re: How to index different types of RDF file in one data set


On 06.01.21 12:17, 李惠玲 wrote:
> What we trying to do is after querying a string, the results could 
> show both content type triples in the list, if it fits the literals;
>
> Thank you for your replies (and hint), we probably thinking in a wrong way 
> about querying RDF type, yes, we should try via SPARQL, not config file.

what does this mean? How do you access your data right now if not via SPARQL? I 
mean you put it into a triple store or not?

Something like

select * where {
?s a madsrdf:PersonalName ;
    text:query "some_search_string_here"
}


Also, as Andy pointed out, your index creation seems odd. You add an index on 
madsrdf:elementList predicate, but according to your sample data this doesn't 
link to string literals at all. It should be madsrdf:authoritativeLabel in your 
config file

>
> So, we'll keep on fighting!
>
> Thanks again,
> Huiling Lee
> -----Original Message-----
> From: Lorenz Buehmann 
> Sent: Wednesday, January 6, 2021 4:23 PM
> To: users@jena.apache.org
> Subject: Re: How to index different types of RDF file in one data set
>
> In addition to what Andy said:
>
> Even if you don't introduce separate subproperties for each type, why 
> shouldn't you be able to distinguish both in a query? I mean, there are RDF 
> types for both, so just append another triple pattern. I doubt it matters if 
> the literals of both types are in the same index.
>
> I mean, the well-known property rdfs:label is also used for any type and 
> still people are able to distinguish by type.
>
> So, yes it's possible via SPARQL - if this wasn't clear.
>
> On 05.01.21 21:57, Andy Seaborne wrote:
>> Hi there,
>>
>> I'm not sure what you wish to do - could you sketch a query you want 
>> to ask of the data?
>>
>> In a single jena-text Lucene index, all the values of some predicate 
>> are indexed in the same Lucene field. Predicates in RDF are globally 
>> defined relationships.
>>
>> If you want to treat madsrdf:authoritativeLabel in one RDF graph as 
>> "PersonalName" and the same predicate madsrdf:authoritativeLabel as 
>> "Topic", then it looks like you really have a subproperty hierarchy.
>> Maybe that would help.
>>
>>     Andy
>>
>>>   [
>>>   text:field "topic" 

Re: How to index different types of RDF file in one data set

2021-01-06 Thread Lorenz Buehmann

On 06.01.21 12:17, 李惠玲 wrote:
> What we trying to do is after querying a string, the results could show both 
> content type triples in the list, if it fits the literals;
>
> Thank you for your replies (and hint), we probably thinking in a wrong way 
> about querying RDF type, yes, we should try via SPARQL, not config file.

What does this mean? How do you access your data right now if not via
SPARQL? I mean, did you put it into a triple store or not?

Something like

select * where {
?s a madsrdf:PersonalName ;
    text:query "some_search_string_here"
}


Also, as Andy pointed out, your index creation seems odd. You add an
index on the madsrdf:elementList predicate, but according to your sample
data it doesn't link to string literals at all. It should be
madsrdf:authoritativeLabel in your config file.
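
The text:query suggestion above, written out as a sketch with prefix 
declarations and the indexed property named explicitly. The text: prefix is 
the standard jena-text namespace; the limit of 10 is an arbitrary assumption 
and not something from this thread:

```sparql
PREFIX madsrdf: <http://www.loc.gov/mads/rdf/v1#>
PREFIX text:    <http://jena.apache.org/text#>

SELECT ?s ?label
WHERE {
  # Full-text lookup in the Lucene field mapped to madsrdf:authoritativeLabel;
  # 10 is an assumed cap on the number of index hits.
  ?s text:query (madsrdf:authoritativeLabel "some_search_string_here" 10) .
  # Keep only personal names and return their labels.
  ?s a madsrdf:PersonalName ;
     madsrdf:authoritativeLabel ?label .
}
```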

>
> So, we'll keep on fighting!
>
> Thanks again,
> Huiling Lee
> -----Original Message-----
> From: Lorenz Buehmann  
> Sent: Wednesday, January 6, 2021 4:23 PM
> To: users@jena.apache.org
> Subject: Re: How to index different types of RDF file in one data set
>
> In addition to what Andy said:
>
> Even if you don't introduce separate subproperties for each type, why 
> shouldn't you be able to distinguish both in a query? I mean, there are RDF 
> types for both, so just append another triple pattern. I doubt it matters if 
> the literals of both types are in the same index.
>
> I mean, the well-known property rdfs:label is also used for any type and 
> still people are able to distinguish by type.
>
> So, yes it's possible via SPARQL - if this wasn't clear.
>
> On 05.01.21 21:57, Andy Seaborne wrote:
>> Hi there,
>>
>> I'm not sure what you wish to do - could you sketch a query you want 
>> to ask of the data?
>>
>> In a single jena-text Lucene index, all the values of some predicate 
>> are indexed in the same Lucene field. Predicates in RDF are globally 
>> defined relationships.
>>
>> If you want to treat madsrdf:authoritativeLabel in one RDF graph as 
>> "PersonalName" and the same predicate madsrdf:authoritativeLabel as 
>> "Topic", then it looks like you really have a subproperty hierarchy.
>> Maybe that would help.
>>
>>     Andy
>>
>>>   [
>>>   text:field "topic" ;
>>>   text:predicate madsrdf:elementList ;
>>>   ]
>> madsrdf:elementList is a list so presumably isn't indexed
>>
>>
>> On 05/01/2021 10:48, 李惠玲 wrote:
>>> Dear Sirs,
>>>
>>> Our project implemented Jena Fuseki server (3.18.0, SNAPSHOT version) 
>>> and using Lucene (7.7.x) as fulltext search engine.
>>>
>>> Right now, there are two types of RDF files in our triple store, one 
>>> is “PersonalName”, the other is “Topic”, when we separate them to 
>>> different data set, two config files, they could be indexed 
>>> successfully, but “separately”;
>>>
>>> But when tried to index them together, since they have same tag 
>>> “madsrdf:authoritativeLabel”, we couldn’t find the instruction of how 
>>> to distinguish which is “Topic”, which is “PersonalName”,
>>>
>>> Hope you could share some experiences or suggestion, how to set the 
>>> config file to distinguish different types of RDF file correctly?
>>>
>>> Here are two RDF examples:
>>>
>>> Topic:
>>> -
>>> -
>>>
>>> http://www.w3.org/1999/02/22-rdf-syntax-ns#";>
>>>     http://www.loc.gov/mads/rdf/v1#";
>>>   
>>> rdf:about="http://ld.ncl.edu.tw/subject/981038693688004786";>
>>>    >> rdf:resource="http://www.loc.gov/mads/rdf/v1#Authority"/>
>>>    >> xml:lang="en">公設辯護
>>   
>>>    
>>>   
>>>  >> xml:lang="en">公設辯護
>>>   
>>>    
>>>    
>>>   
>>>  >> rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>>  辯護人
>>>  
>>>     
>>>    >> xml:lang="en">辯護人
>>>     
>>>  
>>> 

RE: How to index different types of RDF file in one data set

2021-01-06 Thread 李惠玲
What we are trying to do is this: after querying a string, the results should 
list triples of both content types, if the string matches their literals.

Thank you for your replies (and the hint). We were probably thinking about 
querying by RDF type in the wrong way; yes, we should try it via SPARQL, not 
the config file.

So, we'll keep on fighting!

Thanks again,
Huiling Lee
-----Original Message-----
From: Lorenz Buehmann  
Sent: Wednesday, January 6, 2021 4:23 PM
To: users@jena.apache.org
Subject: Re: How to index different types of RDF file in one data set

In addition to what Andy said:

Even if you don't introduce separate subproperties for each type, why shouldn't 
you be able to distinguish both in a query? I mean, there are RDF types for 
both, so just append another triple pattern. I doubt it matters if the literals 
of both types are in the same index.

I mean, the well-known property rdfs:label is also used for any type and still 
people are able to distinguish by type.

So, yes it's possible via SPARQL - if this wasn't clear.

On 05.01.21 21:57, Andy Seaborne wrote:
> Hi there,
>
> I'm not sure what you wish to do - could you sketch a query you want 
> to ask of the data?
>
> In a single jena-text Lucene index, all the values of some predicate 
> are indexed in the same Lucene field. Predicates in RDF are globally 
> defined relationships.
>
> If you want to treat madsrdf:authoritativeLabel in one RDF graph as 
> "PersonalName" and the same predicate madsrdf:authoritativeLabel as 
> "Topic", then it looks like you really have a subproperty hierarchy.
> Maybe that would help.
>
>     Andy
>
> >  [
> >  text:field "topic" ;
> >  text:predicate madsrdf:elementList ;
> >  ]
>
> madsrdf:elementList is a list so presumably isn't indexed
>
>
> On 05/01/2021 10:48, 李惠玲 wrote:
>> Dear Sirs,
>>
>> Our project implemented Jena Fuseki server (3.18.0, SNAPSHOT version) 
>> and using Lucene (7.7.x) as fulltext search engine.
>>
>> Right now, there are two types of RDF files in our triple store, one 
>> is “PersonalName”, the other is “Topic”, when we separate them to 
>> different data set, two config files, they could be indexed 
>> successfully, but “separately”;
>>
>> But when tried to index them together, since they have same tag 
>> “madsrdf:authoritativeLabel”, we couldn’t find the instruction of how 
>> to distinguish which is “Topic”, which is “PersonalName”,
>>
>> Hope you could share some experiences or suggestion, how to set the 
>> config file to distinguish different types of RDF file correctly?
>>
>> Here are two RDF examples:
>>
>> Topic:
>> -
>> -
>>
>> http://www.w3.org/1999/02/22-rdf-syntax-ns#";>
>>     http://www.loc.gov/mads/rdf/v1#";
>>   
>> rdf:about="http://ld.ncl.edu.tw/subject/981038693688004786";>
>>    > rdf:resource="http://www.loc.gov/mads/rdf/v1#Authority"/>
>>    > xml:lang="en">公設辯護
>   
>>    
>>   
>>  > xml:lang="en">公設辯護
>>   
>>    
>>    
>>   
>>  > rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>  辯護人
>>  
>>     
>>    > xml:lang="en">辯護人
>>     
>>  
>>   
>>    
>>    > xmlns:identifiers="http://id.loc.gov/vocabulary/identifiers/"/>
>>    > xmlns:identifiers="http://id.loc.gov/vocabulary/identifiers/";>(ChTaNC
>> )sh0001412
>>    
>>   > xmlns:ri="http://id.loc.gov/ontologies/RecordInfo#";>
>>  > rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime";>2020-12-30T0
>> 0:00:00
>>  > rdf:datatype="http://www.w3.org/2001/XMLSchema#string";>new> Status>
>>  > rdf:resource="http://id.loc.gov/vocabulary/organizations/ntu"/>
>>  > rdf:resource="http://id.loc.gov/vocabulary/iso639-2/chi"/>
>>   
>>    
>>     
>> 
>> -
>> --
>>
>>
>> PersonalName:
>> 

Re: How to index different types of RDF file in one data set

2021-01-06 Thread Lorenz Buehmann
In addition to what Andy said:

Even if you don't introduce separate subproperties for each type, why
shouldn't you be able to distinguish both in a query? I mean, there are
RDF types for both, so just append another triple pattern. I doubt it
matters if the literals of both types are in the same index.

I mean, the well-known property rdfs:label is also used for any type and
still people are able to distinguish by type.

So, yes it's possible via SPARQL - if this wasn't clear.
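
As a rough sketch of what "append another triple pattern" could look like, 
assuming the usual rdf: and madsrdf: namespaces from the sample data:

```sparql
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX madsrdf: <http://www.loc.gov/mads/rdf/v1#>

SELECT ?s ?label ?type
WHERE {
  ?s madsrdf:authoritativeLabel ?label .
  # The extra triple pattern: bind the type of each labelled resource.
  ?s rdf:type ?type .
}
```

Since the sample resources are also typed madsrdf:Authority, rows for that 
type would show up as well unless ?type is restricted further.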

On 05.01.21 21:57, Andy Seaborne wrote:
> Hi there,
>
> I'm not sure what you wish to do - could you sketch a query you want
> to ask of the data?
>
> In a single jena-text Lucene index, all the values of some predicate
> are indexed in the same Lucene field. Predicates in RDF are globally
> defined relationships.
>
> If you want to treat madsrdf:authoritativeLabel in one RDF graph as
> "PersonalName" and the same predicate madsrdf:authoritativeLabel as
> "Topic", then it looks like you really have a subproperty hierarchy.
> Maybe that would help.
>
>     Andy
>
> >  [
> >  text:field "topic" ;
> >  text:predicate madsrdf:elementList ;
> >  ]
>
> madsrdf:elementList is a list so presumably isn't indexed
>
>
> On 05/01/2021 10:48, 李惠玲 wrote:
>> Dear Sirs,
>>
>> Our project implemented Jena Fuseki server (3.18.0, SNAPSHOT version)
>> and using Lucene (7.7.x) as fulltext search engine.
>>
>> Right now, there are two types of RDF files in our triple store, one
>> is “PersonalName”, the other is “Topic”, when we separate them to
>> different data set, two config files, they could be indexed
>> successfully, but “separately”;
>>
>> But when tried to index them together, since they have same tag
>> “madsrdf:authoritativeLabel”, we couldn’t find the instruction of how
>> to distinguish which is “Topic”, which is “PersonalName”,
>>
>> Hope you could share some experiences or suggestion, how to set the
>> config file to distinguish different types of RDF file correctly?
>>
>> Here are two RDF examples:
>>
>> Topic:
>> --
>>
>> http://www.w3.org/1999/02/22-rdf-syntax-ns#";>
>>     http://www.loc.gov/mads/rdf/v1#";
>>   
>> rdf:about="http://ld.ncl.edu.tw/subject/981038693688004786";>
>>    > rdf:resource="http://www.loc.gov/mads/rdf/v1#Authority"/>
>>    > xml:lang="en">公設辯護
>   
>>    
>>   
>>  > xml:lang="en">公設辯護
>>   
>>    
>>    
>>   
>>  > rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>  辯護人
>>  
>>     
>>    > xml:lang="en">辯護人
>>     
>>  
>>   
>>    
>>    > xmlns:identifiers="http://id.loc.gov/vocabulary/identifiers/"/>
>>    > xmlns:identifiers="http://id.loc.gov/vocabulary/identifiers/";>(ChTaNC)sh0001412
>>    
>>   > xmlns:ri="http://id.loc.gov/ontologies/RecordInfo#";>
>>  > rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime";>2020-12-30T00:00:00
>>  > rdf:datatype="http://www.w3.org/2001/XMLSchema#string";>new
>>  > rdf:resource="http://id.loc.gov/vocabulary/organizations/ntu"/>
>>  > rdf:resource="http://id.loc.gov/vocabulary/iso639-2/chi"/>
>>   
>>    
>>     
>> 
>> ---
>>
>>
>> PersonalName:
>> 
>>
>> 
>> > rdf:about="http://ld.ncl.edu.tw/authority/981038686683804786981038686683804786";>
>> http://www.loc.gov/mads/rdf/v1#Authority"/>
>> 蘇,
>> 慧婕
>> 
>> 
>> 蘇, 慧婕
>> 
>> 
>> 
>> 
>> http://www.loc.gov/mads/rdf/v1#Variant"/>
>> Su, Huijie
>> 
>> 
>> Su, Huijie
>> 
>> 
>> 
>> 
>> 
>> 
>> http://www.loc.gov/mads/rdf/v1#Variant"/>
>> Su, Hui-Chieh
>> 
>> 
>> Su, Hui-Chieh
>> 
>> 
>> 
>> 
>> 
>> 
>> 論國會議員產生方式之規範及其憲法界限,
>> 2003:
>> > xml:lang="en">書名頁(國立臺灣大學法律學硏究所碩士)
>>
>> found
>> 
>> 
>> 
>> 
>> 國立臺灣大學法律學系網頁, 檢索日期:
>> 2020/11/25
>> (女; Hui-chieh
>> Su)
>> found
>> 
>> 
>> 
>> 
>> NTU Scholar(臺大學術典藏)網頁, 檢索日期:
>> 2020/11/25
>> (HUI-CHIEH
>> SU)
>> found
>> 
>> 
>> 
>> 臺大教師權威紀錄, 英文權威名稱係以NTU Scholar(臺大學術典藏)網頁著錄(Su,
>> Hui-Chieh)
>> 
>> 女; 研究領域: 國家學, 憲法理論, 基本權理論, 言論自由,
>> 轉型正義
>> 
>> (TW-TaNTU)981038686683804786
>> 
>> 
>> > rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime";>2020-11-25T00:00:00
>> > rdf:datatype="http://www.w3.org/2001/XMLSchema#string";>new
>> > rdf:resource="http://id.loc.gov/vocabulary/organizations/ntu"/>
>> > rdf:resource="http://id.loc.gov/vocabulary/iso639-2/chi"/>
>> 
>> 
>> 
>> 
>> -

Re: How to index different types of RDF file in one data set

2021-01-05 Thread Andy Seaborne

Hi there,

I'm not sure what you wish to do - could you sketch a query you want to 
ask of the data?


In a single jena-text Lucene index, all the values of some predicate are 
indexed in the same Lucene field. Predicates in RDF are globally defined 
relationships.


If you want to treat madsrdf:authoritativeLabel in one RDF graph as 
"PersonalName" and the same predicate madsrdf:authoritativeLabel as 
"Topic", then it looks like you really have a subproperty hierarchy. 
Maybe that would help.


Andy

>  [
>  text:field "topic" ;
>  text:predicate madsrdf:elementList ;
>  ]

madsrdf:elementList is a list so presumably isn't indexed


On 05/01/2021 10:48, 李惠玲 wrote:

Dear Sirs,

Our project uses a Jena Fuseki server (3.18.0, SNAPSHOT version) with Lucene 
(7.7.x) as the full-text search engine.

Right now there are two types of RDF files in our triple store: one is 
“PersonalName”, the other is “Topic”. When we put them into separate datasets 
with two config files, they are indexed successfully, but “separately”.

But when we tried to index them together, since they share the same tag 
“madsrdf:authoritativeLabel”, we couldn’t find instructions on how to 
distinguish which is “Topic” and which is “PersonalName”.

We hope you can share some experience or suggestions: how should we set up the 
config file to distinguish the different types of RDF file correctly?

Here are two RDF examples:

Topic:
--
http://www.w3.org/1999/02/22-rdf-syntax-ns#";>
http://www.loc.gov/mads/rdf/v1#";
   rdf:about="http://ld.ncl.edu.tw/subject/981038693688004786";>
   http://www.loc.gov/mads/rdf/v1#Authority"/>
   公設辯護

  

   
  
 公設辯護
  
   
   
  
 http://www.loc.gov/mads/rdf/v1#Variant"/>
 辯護人
 

   辯護人

 
  
   
   http://id.loc.gov/vocabulary/identifiers/"/>
   http://id.loc.gov/vocabulary/identifiers/";>(ChTaNC)sh0001412
   
  http://id.loc.gov/ontologies/RecordInfo#";>
 http://www.w3.org/2001/XMLSchema#dateTime";>2020-12-30T00:00:00
 http://www.w3.org/2001/XMLSchema#string";>new
 http://id.loc.gov/vocabulary/organizations/ntu"/>
 http://id.loc.gov/vocabulary/iso639-2/chi"/>
  
   


---

PersonalName:


http://ld.ncl.edu.tw/authority/981038686683804786981038686683804786";>
http://www.loc.gov/mads/rdf/v1#Authority"/>
蘇, 慧婕


蘇, 慧婕




http://www.loc.gov/mads/rdf/v1#Variant"/>
Su, Huijie


Su, Huijie






http://www.loc.gov/mads/rdf/v1#Variant"/>
Su, Hui-Chieh


Su, Hui-Chieh






論國會議員產生方式之規範及其憲法界限, 2003:
書名頁(國立臺灣大學法律學硏究所碩士)
found




國立臺灣大學法律學系網頁, 檢索日期: 
2020/11/25
(女; Hui-chieh Su)
found




NTU Scholar(臺大學術典藏)網頁, 檢索日期: 
2020/11/25
(HUI-CHIEH SU)
found



臺大教師權威紀錄, 英文權威名稱係以NTU Scholar(臺大學術典藏)網頁著錄(Su, Hui-Chieh)

女; 研究領域: 國家學, 憲法理論, 基本權理論, 言論自由, 轉型正義

(TW-TaNTU)981038686683804786


http://www.w3.org/2001/XMLSchema#dateTime";>2020-11-25T00:00:00
http://www.w3.org/2001/XMLSchema#string";>new
http://id.loc.gov/vocabulary/organizations/ntu"/>
http://id.loc.gov/vocabulary/iso639-2/chi"/>




--

One of the config files looks like:
-
<#entMap> a text:EntityMap ;
 text:defaultField "authoritativeLabel" ;
 text:entityField  "uri" ;
 text:uidField     "uid" ;
 text:langField    "lang" ;
 text:graphField   "graph" ;
 text:map (
 [
 text:field "authoritativeLabel" ;
 text:predicate madsrdf:authoritativeLabel ;
 ]
 [
 text:field "variantLabel" ;
 text:predicate madsrdf:variantLabel ;
 ]
 [
 text:field "citation-note" ;
 text:predicate madsrdf:citation-note ;
 ]
 [
 text:field "citation-source" ;
 text:predicate madsrdf:citation-source ;
 ]
 [
 text:field "topic" ;
 text:predicate madsrdf:elementList ;
 ]
 ) .

---

Thank you for reading this post.

Best Regards,
Huiling Lee