uhm thank you, i have a clearer idea now. i have to re-check what i'm doing and then i'll try to follow your suggestion, as i had misunderstood some points, sorry :-)
thanks very much for the explanation!

Alfredo Serafini

2012/5/10 Rupert Westenthaler <[email protected]>

>
> On 10.05.2012, at 14:18, seralf wrote:
>
> > thanks for 1)
> >
> > for the 2) point i was not very clear sorry.
> > I have on my test a particular weird use case where i am trying to
> > provide results for almost two different cases on the same rdfs:label
> > field, (where i have to use different tokenization approach, if that
> > work)
> > So my idea is to try to create a parallel field with a different
> > tokenization approach and then copy it on the _text field. This is most
> > common on solr, but i am at the beginning with stanbol, so i have some
> > doubt: for example i'm not sure if the _text field is the field always
> > used for the matches or not.
> > I hope i was more clear this time, but i'm probably trying to do
> > something which is strange, i know :-)
>
>
> Let me try to restate this to make sure that we do not misunderstand
> each other.
>
> You have two types of Entities in your vocabulary that both use
> rdfs:label. But you would like to use two different fields so that you
> can use different Solr Field configurations (e.g. Tokenizers).
>
> Copying values of rdfs:label to another field is easily possible with
> the Entityhub indexing tool.
>
> If those two different Entities do have some distinct feature (e.g. a
> different rdf:type) you could use the
>
>     org.apache.stanbol.entityhub.indexing.core.processor.LdpathProcessor
>
> with an LDpath program like
>
>     @prefix my : <http://www.example.com/my#>;
>     my:label1 = .[rdf:type is my:type1]/rdfs:label;
>     my:label2 = .[rdf:type is my:type2]/rdfs:label;
>
> this would ensure that
>
> * labels of Entities of type my:type1 are indexed in my:label1 and
> * labels of Entities of type my:type2 are indexed in my:label2
>
> The default "indexing.properties" file of the Entityhub Indexing tool
> also contains an example for how to configure the LdpathProcessor.
>
> Note also that if you keep using the FieldMapperProcessor, then the
> rdfs:label would still contain the labels of all Entities.
>
> For extraction you would need to configure two KeywordLinkingEngines
> (for my:label1 and my:label2).
> The dereferenced Entities included by those two engine configurations
> would however miss the rdfs:label field. So if you would like to have
> the rdfs:label values in the Enhancement metadata I would need to
> implement the possibility to configure the list of included properties.
>
>
> Regarding the *_text* field:
>
> This is configured (by default) in a way that any text value of a
> property is copied to it. So it would not only contain the rdfs:labels,
> but also all other textual values of any outgoing relation of an entity.
> Also note that this field can NOT be used with the KeywordLinkingEngine,
> because it is only indexed, but does not store the values.
>
> I hope this helps.
> best
> Rupert
>
> [1] https://issues.apache.org/jira/browse/STANBOL-596
>
>
> > 2012/5/10 Rupert Westenthaler <[email protected]>
> >
> >> Hi
> >>
> >> On Thu, May 10, 2012 at 12:00 PM, seralf <[email protected]> wrote:
> >>> Hi i'm trying to use the keyword linking engine with a customized solr
> >>> configuration. Basically i need to understand two different things:
> >>>
> >>> 1. what are the default fields indexed and then used in the retrieval
> >>> process? i look at the DEREFERENCE_FIELDS in the source, and i'm not
> >>> sure if this is or not the place to look at.
> >>
> >> Currently it is hard coded in the "DEREFERENCE_FIELDS" constant
> >> defining fields required by the Web UI of the enhancer.
> >> Currently it includes:
> >>
> >> * rdfs:comment
> >> * geo:lat/geo:long
> >> * foaf:depiction
> >> * dbp-ont:thumbnail
> >>
> >> However note that in addition to this also the
> >>
> >> * nameField (the field configured to be used as label for extraction -
> >>   default: rdfs:label)
> >> * redirectField (the field used to follow redirections - default:
> >>   rdfs:seeAlso)
> >> * typeField (the field used to determine the type of Entities -
> >>   default: rdf:type)
> >>
> >> are included.
> >>
> >> If you want this to be configurable I can easily add this feature. Not
> >> sure why I have not enabled that in the beginning.
> >>
> >>> 2. starting from the fact that if i am sure about the field that is
> >>> used as a base to have a textual enhancement i could simple copy in
> >>> that the results from other fields in the config, i wonder if i could
> >>> define new fields and then consuming them into the process
> >>>
> >>
> >> Sorry, I do not understand what you mean with that.
> >>
> >>> thanks in advance if someone could give me some suggestion
> >>>
> >>> Alfredo Serafini
> >>
> >> best
> >> Rupert
> >>
> >>
> >>
> >> --
> >> | Rupert Westenthaler [email protected]
> >> | Bodenlehenstraße 11 ++43-699-11108907
> >> | A-5500 Bischofshofen
> >>
> >
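
For reference, the type-based label routing that the LDPath program quoted in this thread describes can be sketched in plain Python. This is only an illustration of the intended result, not Stanbol or LDPath code: the `route_labels` function and the dict-based entity layout are assumptions made up for this example, while my:type1/my:type2 and my:label1/my:label2 are the example names from the thread. The sketch keeps rdfs:label on every entity, which corresponds to the case where the FieldMapperProcessor still maps it.

```python
# Hypothetical stand-in for the LDPath program quoted in the thread.
# Plain Python over dicts; nothing here is a Stanbol or LDPath API.

RDF_TYPE = "rdf:type"
RDFS_LABEL = "rdfs:label"

# Maps an rdf:type value to the field its labels should be copied to.
LABEL_FIELD_BY_TYPE = {
    "my:type1": "my:label1",
    "my:type2": "my:label2",
}


def route_labels(entity):
    """Return a copy of entity with its rdfs:label values also stored
    in the field configured for each of its rdf:type values."""
    indexed = {key: list(values) for key, values in entity.items()}
    labels = entity.get(RDFS_LABEL, [])
    for rdf_type in entity.get(RDF_TYPE, []):
        target = LABEL_FIELD_BY_TYPE.get(rdf_type)
        if target is not None:
            indexed.setdefault(target, []).extend(labels)
    return indexed


if __name__ == "__main__":
    entities = [
        {RDF_TYPE: ["my:type1"], RDFS_LABEL: ["Paris"]},
        {RDF_TYPE: ["my:type2"], RDFS_LABEL: ["paris-cdg"]},
    ]
    for entity in entities:
        print(route_labels(entity))
```

Each of the two resulting fields could then get its own Solr field configuration, with one KeywordLinkingEngine configured per field as described in the thread.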
