Hi Florent
First of all: as I expected, there was a bug in the FieldQuery
implementation of the SolrYard. The bug has been fixed in the meantime [1].
However, the bug itself was not the reason why you were getting
unexpected results. The fact that there were any results at all for the
second query ("South America") was a result of the bug.
The reason why you do not get any results is that you search for
"xsd:string" values, but "skos:prefLabel" values are mapped to
"entityhub:text". The Entityhub distinguishes between natural language
text and "string" values. Strings are typically used for IDs
(e.g. ISBN numbers); "skos:notation" would be another good example. The
SolrYard does not use any tokenizer for string values.
Your query used a ValueConstraint to search for the skos:prefLabel:
> {
> "type": "value",
> "field": "http:\/\/www.w3.org\/2004\/02\/skos\/core#prefLabel",
> "value": "Africa",
> }
If no data type is defined for such a constraint, then the data type is
detected based on the Java type of the value. That would be
* String for "value": "Africa"
* Integer for "value": 123
* Float for "value": 1.23
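To make the detection rule concrete, here is a small Python sketch that mirrors the three cases listed above. It is an illustration only: the function name detect_java_type is made up, and the real detection happens on the Java side of the Entityhub, which may handle more cases than shown here.

```python
import json


def detect_java_type(value):
    """Return the Java type name that the list above gives for a JSON value.

    Hypothetical helper for illustration; not part of the Entityhub.
    """
    # bool is a subclass of int in Python, so rule it out first
    if isinstance(value, bool):
        raise TypeError("booleans are not covered by the list above")
    if isinstance(value, str):
        return "String"
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Float"
    raise TypeError("unsupported value type: " + type(value).__name__)


# A constraint without "dataTypes": the type follows from the parsed value
constraint = json.loads('{"type": "value", "value": "Africa"}')
print(detect_java_type(constraint["value"]))  # prints: String
```

So a plain "value": "Africa" ends up as a string search, which is why it does not match labels stored as natural language text.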
You can also explicitly pass a data type by using "dataTypes":
{
"type": "value",
"field": "http:\/\/www.w3.org\/2004\/02\/skos\/core#prefLabel",
"dataTypes": ["http:\/\/www.iks-project.eu\/ontology\/rick\/model\/text"],
"value": "South America"
}
"http://www.iks-project.eu/ontology/rick/model/text" is the data type
used for natural text values.
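If you generate such constraints from a script, a helper along the following lines may be useful. This is only a sketch: the function value_constraint and the constant names are my own invention, and only the JSON keys and the two URIs are taken from this mail.

```python
import json

# Data type URI for natural language text, as given in this mail
TEXT_DATA_TYPE = "http://www.iks-project.eu/ontology/rick/model/text"
PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"


def value_constraint(field, value, data_types=None):
    """Build a ValueConstraint dict; "dataTypes" is added only when given."""
    constraint = {"type": "value", "field": field, "value": value}
    if data_types:
        constraint["dataTypes"] = list(data_types)
    return constraint


# Explicitly ask for natural language text instead of relying on detection
constraint = value_constraint(PREF_LABEL, "South America", [TEXT_DATA_TYPE])
print(json.dumps(constraint, indent=2))
```

Note that json.dumps writes "/" rather than "\/". Both spellings are valid JSON and denote the same string.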
However, the preferred way to query for natural language text values is
to use a TextConstraint instead of a ValueConstraint.
The TextConstraint equivalent to the above ValueConstraint is:
{
"type": "text",
"text": "South America",
"field": "http:\/\/www.w3.org\/2004\/02\/skos\/core#prefLabel"
}
In addition, text constraints allow you to define the languages to
search, as well as the use of wildcards,
e.g.:
{
"selected": [
"http:\/\/www.w3.org\/2004\/02\/skos\/core#prefLabel",
"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"],
"offset": "0",
"limit": "30",
"constraints": [
{
"type": "text",
"languages": ["en-GB"],
"patternType": "wildcard",
"text": "Photo*",
"field": "http:\/\/www.w3.org\/2004\/02\/skos\/core#prefLabel"
}
]
}
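The same query can also be assembled programmatically. The following Python sketch is just one way to do it: the helper names text_constraint and field_query are hypothetical, and only the JSON keys and values come from the example above.

```python
import json

PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def text_constraint(field, text, languages=None, pattern_type=None):
    """Build a TextConstraint dict; optional keys are added only when given."""
    constraint = {"type": "text", "field": field, "text": text}
    if languages:
        constraint["languages"] = list(languages)
    if pattern_type:
        constraint["patternType"] = pattern_type
    return constraint


def field_query(constraints, selected=None, offset=0, limit=30):
    """Wrap constraints into a FieldQuery dict."""
    # offset and limit are written as strings, matching the example above
    query = {
        "offset": str(offset),
        "limit": str(limit),
        "constraints": list(constraints),
    }
    if selected:
        query["selected"] = list(selected)
    return query


# English (GB) labels starting with "Photo", returning prefLabel and rdf:type
query = field_query(
    [text_constraint(PREF_LABEL, "Photo*", ["en-GB"], "wildcard")],
    selected=[PREF_LABEL, RDF_TYPE],
)
print(json.dumps(query, indent=2))
```

The printed JSON can be saved as fieldQuery.json and posted in the same way as in your original request.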
Documentation of the FieldQuery syntax is provided at the end of the
Entityhub README.TXT [2].
best
Rupert Westenthaler
[1] http://svn.apache.org/viewvc?rev=1131027&view=rev
[2] http://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/README.TXT
> 1) When indexing a SKOS file, only multi-word terms are indexed, and
> not single-word terms. I observed this first on my own thesaurus, then
> also in the IPTC one. I tried this request
> $ curl -X POST -F "query=@fieldQuery.json"
> http://localhost:8080/entityhub/site/iptc/query
> with queries :
> 1.A) @fieldQuery.json =
> {
> "offset": "0",
> "limit": "30",
> "constraints": [
> {
> "type": "value",
> "field": "http:\/\/www.w3.org\/2004\/02\/skos\/core#prefLabel",
> "value": "Africa",
> }
> ]
> }
>
> ==> output no results
>
> 1.B) @fieldQuery.json =
> {
> "offset": "0",
> "limit": "30",
> "constraints": [
> {
> "type": "value",
> "field": "http:\/\/www.w3.org\/2004\/02\/skos\/core#prefLabel",
> "value": "South America",
> }
> ]
> }