On 06.01.21 12:17, 李惠玲 wrote:
> What we trying to do is after querying a string, the results could show both
> content type triples in the list, if it fits the literals;
>
> Thank you for your replies (and hint), we probably thinking in a wrong way
> about querying RDF type, yes, we should try via SPARQL, not config file.

What does this mean? How do you access your data right now, if not via
SPARQL? I mean, did you put it into a triple store or not?

Something like

    select * where {
      ?s a madsrdf:PersonalName ;
         text:query "some_search_string_here"
    }

Also, as Andy pointed out, your index creation seems odd. You add an index
on the madsrdf:elementList predicate, but according to your sample data this
doesn't link to string literals at all. It should be
madsrdf:authoritativeLabel in your config file.

>
> So, we'll keep on fighting!
>
> Thanks again,
> Huiling Lee
> -----Original Message-----
> From: Lorenz Buehmann <buehm...@informatik.uni-leipzig.de>
> Sent: Wednesday, January 6, 2021 4:23 PM
> To: users@jena.apache.org
> Subject: Re: How to index different types of RDF file in one data set
>
> In addition to what Andy said:
>
> Even if you don't introduce separate subproperties for each type, why
> shouldn't you be able to distinguish both in a query? I mean, there are RDF
> types for both, so just append another triple pattern. I doubt it matters if
> the literals of both types are in the same index.
>
> I mean, the well-known property rdfs:label is also used for any type and
> still people are able to distinguish by type.
>
> So, yes it's possible via SPARQL - if this wasn't clear.
>
> On 05.01.21 21:57, Andy Seaborne wrote:
>> Hi there,
>>
>> I'm not sure what you wish to do - could you sketch a query you want
>> to ask of the data?
>>
>> In a single jena-text Lucene index, all the values of some predicate
>> are indexed in the same Lucene field. Predicates in RDF are globally
>> defined relationships.
>>
>> If you want to treat madsrdf:authoritativeLabel in one RDF graph as
>> "PersonalName" and the same predicate madsrdf:authoritativeLabel as
>> "Topic", then it looks like you really have a subproperty hierarchy.
>> Maybe that would help.
>>
>> Andy
>>
>>> [
>>> text:field "topic" ;
>>> text:predicate madsrdf:elementList ;
>>> ]
>> madsrdf:elementList is a list so presumably isn't indexed
>>
>>
>> On 05/01/2021 10:48, 李惠玲 wrote:
>>> Dear Sirs,
>>>
>>> Our project implemented Jena Fuseki server (3.18.0, SNAPSHOT version)
>>> and using Lucene (7.7.x) as fulltext search engine.
>>>
>>> Right now, there are two types of RDF files in our triple store, one
>>> is “PersonalName”, the other is “Topic”, when we separate them to
>>> different data set, two config files, they could be indexed
>>> successfully, but “separately”;
>>>
>>> But when tried to index them together, since they have same tag
>>> “madsrdf:authoritativeLabel”, we couldn’t find the instruction of how
>>> to distinguish which is “Topic”, which is “PersonalName”,
>>>
>>> Hope you could share some experiences or suggestion, how to set the
>>> config file to distinguish different types of RDF file correctly?
>>>
>>> Here are two RDF examples:
>>>
>>> Topic:
>>> ---------------------------------------------------------------------
>>> ---------------------------------------------------------
>>>
>>> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
>>> <madsrdf:Topic xmlns:madsrdf="http://www.loc.gov/mads/rdf/v1#"
>>>
>>> rdf:about="http://ld.ncl.edu.tw/subject/981038693688004786">
>>> <rdf:type
>>> rdf:resource="http://www.loc.gov/mads/rdf/v1#Authority"/>
>>> <madsrdf:authoritativeLabel
>>> xml:lang="en">公設辯護</madsrdf:authoritativeLabel>
>> ^^^^
>>> <madsrdf:elementList rdf:parseType="Collection">
>>> <madsrdf:TopicElement>
>>> <madsrdf:elementValue
>>> xml:lang="en">公設辯護</madsrdf:elementValue>
>>> </madsrdf:TopicElement>
>>> </madsrdf:elementList>
>>> <madsrdf:hasVariant>
>>> <madsrdf:Topic>
>>> <rdf:type
>>> rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>> <madsrdf:variantLabel>辯護人</madsrdf:variantLabel>
>>> <madsrdf:elementList rdf:parseType="Collection">
>>> <madsrdf:TopicElement>
>>> <madsrdf:elementValue
>>> xml:lang="en">辯護人</madsrdf:elementValue>
>>> </madsrdf:TopicElement>
>>> </madsrdf:elementList>
>>> </madsrdf:Topic>
>>> </madsrdf:hasVariant>
>>> <identifiers:lccn
>>> xmlns:identifiers="http://id.loc.gov/vocabulary/identifiers/"/>
>>> <identifiers:id
>>> xmlns:identifiers="http://id.loc.gov/vocabulary/identifiers/">(ChTaNC
>>> )sh0001412</identifiers:id>
>>> <madsrdf:adminMetadata>
>>> <ri:RecordInfo
>>> xmlns:ri="http://id.loc.gov/ontologies/RecordInfo#">
>>> <ri:recordChangeDate
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-12-30T0
>>> 0:00:00</ri:recordChangeDate>
>>> <ri:recordStatus
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">new</ri:record
>>> Status>
>>> <ri:recordContentSource
>>> rdf:resource="http://id.loc.gov/vocabulary/organizations/ntu"/>
>>> <ri:languageOfCataloging
>>> rdf:resource="http://id.loc.gov/vocabulary/iso639-2/chi"/>
>>> </ri:RecordInfo>
>>> </madsrdf:adminMetadata>
>>> </madsrdf:Topic>
>>> </rdf:RDF>
>>> ---------------------------------------------------------------------
>>> ------------------------------------------------------------------
>>>
>>>
>>> PersonalName:
>>> ---------------------------------------------------------------------
>>> -------------------------------------------------------------------
>>>
>>> <rdf:RDF>
>>> <madsrdf:PersonalName
>>> rdf:about="http://ld.ncl.edu.tw/authority/981038686683804786981038686
>>> 683804786"> <rdf:type
>>> rdf:resource="http://www.loc.gov/mads/rdf/v1#Authority"/>
>>> <madsrdf:authoritativeLabel xml:lang="en">蘇,
>>> 慧婕</madsrdf:authoritativeLabel>
>>> <madsrdf:elementList rdf:parseType="Collection">
>>> <madsrdf:FullNameElement> <madsrdf:elementValue xml:lang="en">蘇,
>>> 慧婕</madsrdf:elementValue>
>>> </madsrdf:FullNameElement>
>>> </madsrdf:elementList>
>>> <madsrdf:hasVariant>
>>> <madsrdf:PersonalName>
>>> <rdf:type rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>> <madsrdf:variantLabel>Su, Huijie</madsrdf:variantLabel>
>>> <madsrdf:elementList rdf:parseType="Collection">
>>> <madsrdf:FullNameElement> <madsrdf:elementValue xml:lang="en">Su,
>>> Huijie</madsrdf:elementValue> </madsrdf:FullNameElement>
>>> </madsrdf:elementList> </madsrdf:PersonalName> </madsrdf:hasVariant>
>>> <madsrdf:hasVariant> <madsrdf:PersonalName> <rdf:type
>>> rdf:resource="http://www.loc.gov/mads/rdf/v1#Variant"/>
>>> <madsrdf:variantLabel>Su, Hui-Chieh</madsrdf:variantLabel>
>>> <madsrdf:elementList rdf:parseType="Collection">
>>> <madsrdf:FullNameElement> <madsrdf:elementValue xml:lang="en">Su,
>>> Hui-Chieh</madsrdf:elementValue> </madsrdf:FullNameElement>
>>> </madsrdf:elementList> </madsrdf:PersonalName> </madsrdf:hasVariant>
>>> <madsrdf:hasSource> <madsrdf:Source>
>>> <madsrdf:citation-source>論國會議員產生方式之規範及其憲法界限,
>>> 2003:</madsrdf:citation-source>
>>> <madsrdf:citation-note
>>> xml:lang="en">書名頁(國立臺灣大學法律學硏究所碩士)</madsrdf:citation-note>
>>>
>>> <madsrdf:citation-status>found</madsrdf:citation-status>
>>> </madsrdf:Source>
>>> </madsrdf:hasSource>
>>> <madsrdf:hasSource>
>>> <madsrdf:Source>
>>> <madsrdf:citation-source>國立臺灣大學法律學系網頁, 檢索日期:
>>> 2020/11/25</madsrdf:citation-source>
>>> <madsrdf:citation-note xml:lang="en">(女; Hui-chieh
>>> Su)</madsrdf:citation-note>
>>> <madsrdf:citation-status>found</madsrdf:citation-status>
>>> </madsrdf:Source>
>>> </madsrdf:hasSource>
>>> <madsrdf:hasSource>
>>> <madsrdf:Source>
>>> <madsrdf:citation-source>NTU Scholar(臺大學術典藏)網頁, 檢索日期:
>>> 2020/11/25</madsrdf:citation-source>
>>> <madsrdf:citation-note xml:lang="en">(HUI-CHIEH
>>> SU)</madsrdf:citation-note>
>>> <madsrdf:citation-status>found</madsrdf:citation-status>
>>> </madsrdf:Source>
>>> </madsrdf:hasSource>
>>> <madsrdf:editorialNote>
>>> 臺大教師權威紀錄, 英文權威名稱係以NTU Scholar(臺大學術典藏)網頁著錄(Su,
>>> Hui-Chieh)
>>> </madsrdf:editorialNote>
>>> <madsrdf:note>女; 研究領域: 國家學, 憲法理論, 基本權理論, 言論自由,
>>> 轉型正義</madsrdf:note>
>>> <identifiers:lccn/>
>>> <identifiers:id>(TW-TaNTU)981038686683804786</identifiers:id>
>>> <madsrdf:adminMetadata>
>>> <ri:RecordInfo>
>>> <ri:recordChangeDate
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-11-25T0
>>> 0:00:00</ri:recordChangeDate>
>>> <ri:recordStatus
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">new</ri:record
>>> Status>
>>> <ri:recordContentSource
>>> rdf:resource="http://id.loc.gov/vocabulary/organizations/ntu"/>
>>> <ri:languageOfCataloging
>>> rdf:resource="http://id.loc.gov/vocabulary/iso639-2/chi"/>
>>> </ri:RecordInfo>
>>> </madsrdf:adminMetadata>
>>> </madsrdf:PersonalName>
>>> </rdf:RDF>
>>> ---------------------------------------------------------------------
>>> ---------------------------------------------------------------------
>>> ---------------------------------------------------------------------
>>> -------
>>>
>>>
>>> One of the config files looks like:
>>> ---------------------------------------------------------------------
>>> ------------
>>>
>>> <#entMap> a text:EntityMap ;
>>> text:defaultField "authoritativeLabel" ;
>>> text:entityField "uri" ;
>>> text:uidField "uid" ;
>>> text:langField "lang" ;
>>> text:graphField "graph" ;
>>> text:map (
>>> [
>>> text:field "authoritativeLabel" ;
>>> text:predicate madsrdf:authoritativeLabel ;
>>> ]
>>> [
>>> text:field "variantLabel" ;
>>> text:predicate madsrdf:variantLabel ;
>>> ]
>>> [
>>> text:field "citation-note" ;
>>> text:predicate madsrdf:citation-note ;
>>> ]
>>> [
>>> text:field "citation-source" ;
>>> text:predicate madsrdf:citation-source ;
>>> ]
>>> [
>>> text:field "topic" ;
>>> text:predicate madsrdf:elementList ;
>>> ]
>>> ) .
>>>
>>> ---------------------------------------------------------------------
>>> ----------
>>>
>>>
>>> Thank you for reading this post.
>>>
>>> Best Regards,
>>> Huiling Lee
>>>
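
A fuller sketch of the query suggested at the top of this message, in case it
helps. It rests on a few assumptions that are not stated in the thread: the
text: prefix is bound to the jena-text namespace, madsrdf:authoritativeLabel
is mapped to an index field as in the quoted config, and the search string is
only a placeholder. I have not run it against this dataset. It searches the
shared index once and then uses rdf:type to tell the two kinds of record
apart, so both show up in one result list:

```sparql
PREFIX text:    <http://jena.apache.org/text#>
PREFIX madsrdf: <http://www.loc.gov/mads/rdf/v1#>

SELECT ?s ?type ?label
WHERE {
  # Full-text lookup on the field indexed for madsrdf:authoritativeLabel.
  # "some_search_string_here" is a placeholder, not a value from the data.
  ?s text:query (madsrdf:authoritativeLabel "some_search_string_here") .

  # Keep only the two record types under discussion and report which is which.
  VALUES ?type { madsrdf:PersonalName madsrdf:Topic }
  ?s a ?type ;
     madsrdf:authoritativeLabel ?label .
}
```

To restrict the results to one kind of record, reduce the VALUES list to a
single class. As far as I know, putting the text:query pattern first is the
usual recommendation, so that the index lookup narrows the candidates before
the type pattern is matched.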