First of all, thanks for the suggestions on the Felix interaction; I'll study them as soon as possible.

As for my 'not active' index issue: the problem seems to be related to the 'sling' directory I was using. I started the 'stanbol-full' launcher with the option -c ../sling (so putting the index in <root>/sling/datafiles), and for some reason it doesn't work. If I start again with the same configuration but without the -c option, using the directory <root>/launcher/sling/datasets, it all works.

At the moment that's fine for my test, but does anybody have an idea of what I'm missing to start the system with a custom data directory? :-)

Just to be clear:

> ~/sw/stanbol/launchers$ java -Xmx1024m -jar full/target/org.apache.stanbol.launchers.full-0.9.0-incubating-SNAPSHOT.jar start -c ../sling
> *doesn't work for me*
>
> ~/sw/stanbol/launchers$ java -Xmx1024m -jar full/target/org.apache.stanbol.launchers.full-0.9.0-incubating-SNAPSHOT.jar start
> *works*

thanks,
Alfredo

2012/3/28 [email protected] <[email protected]>

> I've seen this happen when loading custom vocabularies built by the Generic
> RDF Indexer, and I'm honestly still not sure why. In my case, restarting the
> custom bundle and the Solr Yard bundle seemed to make it work. I imagine
> that restarting Stanbol would do the same. Perhaps there is some subtle
> error in the building of the custom bundle that makes it possible for a
> Solr index service to be created but not started?
>
> As to managing configuration, you may want to follow:
>
> https://issues.apache.org/jira/browse/STANBOL-529
>
> which offers a future way to provide configuration at startup.
> I'm not familiar enough with the Sling Launcher system to know how difficult
> it would be to directly expose deployment via REST, but it might be more
> feasible using the Apache Felix Web Console, which is normally included in
> Stanbol builds:
>
> http://felix.apache.org/site/web-console-restful-api.html
>
> http://felix.apache.org/site/apache-felix-web-console.html#ApacheFelixWebConsole-RESTfulAPI
>
> - ---
> A. Soroka
> Software & Systems Engineering :: Online Library Environment
> the University of Virginia Library
>
> On Mar 28, 2012, at 11:24 AM, seralf wrote:
>
> > Yes, I have already started the bundle, but if I search from the web
> > interface or via a command line like:
> >
> > curl -X POST -d "name=roma*&limit=10&offset=0" http://localhost:8080/entityhub/site/<SITE-NAME>/find
> >
> > I get the error I pasted.
> >
> > Any suggestions? Maybe I missed some configuration step?
> >
> > 2012/3/28 Michel Benevento <[email protected]>
> >
> >> Have you started your installed bundle in the admin console? Click the
> >> little triangle next to it so it becomes a square and the status message
> >> updates.
> >>
> >> Michel
> >>
> >> On 28 mrt. 2012, at 17:02, seralf wrote:
> >>
> >>> Hi, I'm trying to use the KeywordLinking as Rupert suggested to me
> >>> earlier. I've built the Solr indexes as in the tutorial and they seem
> >>> to be OK (I looked inside them with Luke), I've copied them into
> >>> ROOT/sling/dataset, and then installed the generated bundle via the
> >>> console.
> >>>
> >>> Now I have a strange error: it seems that Stanbol is not actually
> >>> loading my indexes, or for some reason it has not activated the yard:
> >>>
> >>>> java.lang.IllegalStateException: Unable to initialize the Cache with Yard
> >>>> <SITE-NAME> Index! This is usually caused by Errors while reading the Cache
> >>>> Configuration from the Yard.
> >>>>     at org.apache.stanbol.entityhub.core.site.CacheImpl.getCacheYard(CacheImpl.java:214)
> >>>>     at org.apache.stanbol.entityhub.core.site.CacheImpl.findRepresentation(CacheImpl.java:331)
> >>>>     ...
> >>>> Caused by: org.apache.stanbol.entityhub.servicesapi.yard.YardException:
> >>>> The SolrIndex '<SITE-NAME>' for SolrYard '<SITE-NAME> Index' is currently
> >>>> not active!
> >>>>     ...
> >>>
> >>> Does anyone have a suggestion on this?
> >>>
> >>> I have two other related questions:
> >>> 1) How can I start Stanbol with a specific config activated?
> >>> 2) Is there any way to manage the deployment/activation via some kind of
> >>>    REST interface (for example with curl)? It could be helpful for some
> >>>    automation...
> >>>
> >>> thanks in advance,
> >>> Alfredo
> >>>
> >>> 2012/3/22 seralf <[email protected]>
> >>>
> >>>> Thanks very much Rupert, you helped me a lot in clarifying my ideas :-)
> >>>>
> >>>> I think I'll follow your suggestion and try to use my thesaurus with
> >>>> workflow option 2). I already use Solr as well, so it's probably the
> >>>> best choice for my needs.
> >>>>
> >>>> On the other hand, I'm still interested in trying to build an OpenNLP
> >>>> model for Italian, but I can do my experiments externally, if I
> >>>> understand correctly.
> >>>>
> >>>> thanks very much, I'll try to make some progress
> >>>> Alfredo
> >>>>
> >>>> 2012/3/22 Rupert Westenthaler <[email protected]>
> >>>>
> >>>>> Hi Alfredo
> >>>>>
> >>>>> On 22.03.2012, at 12:24, seralf wrote:
> >>>>>
> >>>>>> Hi, I'm new to Stanbol. I'm reading the documentation and examples,
> >>>>>> and I'd like to start some testing with it on the Italian language,
> >>>>>> if that's possible.
> >>>>>>
> >>>>>> Could someone give me some hints regarding the steps to construct my
> >>>>>> model (Italian) and configure it inside the platform?
> >>>>>> I suppose it's possible, and it should not be very far from the
> >>>>>> steps taken to construct, let's say, the Spanish integration.
> >>>>>> What do I need to do? I know it could sound like a very generic
> >>>>>> question, but it's not so clear from the documentation, so I need
> >>>>>> help.
> >>>>>> For my test I would like to be able to use a text corpus from the
> >>>>>> database of a client, and a SKOS thesaurus from the same domain.
> >>>>>>
> >>>>>> thanks in advance for any help (suggestions, code examples, ideas,
> >>>>>> etc)
> >>>>>>
> >>>>>
> >>>>> In principle there are two different workflows for extracting
> >>>>> entities from text:
> >>>>>
> >>>>> (1) NamedEntityExtraction (NER) [3] => NamedEntityLinking [4]
> >>>>> (2) KeywordLinking [5]
> >>>>>
> >>>>> (1) requires an OpenNLP [1] NER model for the language of your
> >>>>> documents. However, currently there are no models for the Italian
> >>>>> language distributed by OpenNLP. This would require you to build your
> >>>>> own models. For more information on how to do that please see the
> >>>>> documentation of OpenNLP [1]. As soon as you have such models you only
> >>>>> need to copy them into the {stanbol-workingdir}/sling/datafiles
> >>>>> folder. If they follow the naming scheme used by OpenNLP
> >>>>> ("{lang}-ner-{type}.bin", e.g. "it-ner-location.bin" for the model
> >>>>> that detects locations for Italian) Stanbol will pick them up
> >>>>> automatically.
> >>>>>
> >>>>> (2) directly matches words of the text with labels of entities within
> >>>>> the controlled vocabulary. This process can be improved by Natural
> >>>>> Language Processing (e.g. Part-of-Speech tagging) but this is not a
> >>>>> requirement.
> >>>>> Typically this works fine for datasets that contain named entities
> >>>>> such as concepts of a thesaurus, contacts of a company, projects,
> >>>>> products … It does not work well with datasets that contain entities
> >>>>> with labels that are also used as common words in the given language,
> >>>>> as this will result in a lot of false positives.
> >>>>>
> >>>>> Based on the information you provided on your use case I suggest that
> >>>>> (2) should work just fine for you. This user scenario [2] should
> >>>>> provide you with all the needed information on how to configure
> >>>>> Stanbol for your use case.
> >>>>>
> >>>>> I hope this helps. If you have any further questions feel free to ask
> >>>>>
> >>>>> best
> >>>>> Rupert Westenthaler
> >>>>>
> >>>>> [1] http://opennlp.apache.org/
> >>>>> [2] http://incubator.apache.org/stanbol/docs/trunk/customvocabulary.html
> >>>>> [3] http://incubator.apache.org/stanbol/docs/trunk/enhancer/engines/namedentityextractionengine.html
> >>>>> [4] http://incubator.apache.org/stanbol/docs/trunk/enhancer/engines/namedentitytaggingengine.html
> >>>>> [5] http://incubator.apache.org/stanbol/docs/trunk/enhancer/engines/keywordlinkingengine.html
> >>>>>
> >>>>>> cheers,
> >>>>>> Alfredo Serafini
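
A minimal shell sketch for checking where a relative -c value such as ../sling ends up before starting the launcher. It assumes the launcher resolves the path against the current working directory, which is not confirmed anywhere in this thread. The function name check_sling_home is hypothetical and not part of Stanbol; the datafiles folder name is the one Rupert mentions above.

```shell
#!/bin/sh
# Hypothetical helper, not part of Stanbol or Sling.
# Prints the absolute directory that a (possibly relative) sling home
# resolves to from the current working directory, and reports whether it
# contains a "datafiles" folder.
check_sling_home() {
    home_arg="$1"
    if [ ! -d "$home_arg" ]; then
        echo "missing: $home_arg"
        return 1
    fi
    # Resolve the path relative to the directory we are in right now.
    resolved=$(cd "$home_arg" && pwd)
    if [ -d "$resolved/datafiles" ]; then
        echo "ok: $resolved/datafiles"
    else
        echo "no datafiles folder in: $resolved"
        return 2
    fi
}
```

Run from ~/sw/stanbol/launchers as `check_sling_home ../sling`, this only shows which directory the relative path points to and whether the folder is there; it does not tell whether Stanbol actually reads the index from it.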
