Hi

On Thu, Jul 14, 2011 at 4:27 AM, Steve Reiner
<[email protected]> wrote:
> Know I should add to jira, just want to make sure I didn't need to do some
> additional step to get the index to work
>
> Was actually getting a different error, not the cache thing, but index not
> yet installed when use /engines
>
> On Windows with latest code get the index not yet installed error (and
> weirdly also with what I built 7/10 that used to work with the
> sling/datafiles workaround on Windows)  (Linux with 7/10 code is still
> fine):

Did you delete the {stanbol}/sling folder after upgrading to the newest
version? If not, you might still be running the old version: the
/sling folder contains a cache that is not overwritten just because
the launcher jar file is replaced.
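
A minimal sketch of that cleanup step, in Python for illustration only. The function name and the stanbol_root argument are hypothetical; the only thing it relies on is that the cache lives in a folder named "sling" under the Stanbol directory. Note that it also deletes anything stored under sling (for example a sling/datafiles workaround), so back that up first.

```python
import shutil
from pathlib import Path


def remove_sling_folder(stanbol_root):
    """Delete <stanbol_root>/sling so the next launch starts from a clean state.

    Returns True if a folder was removed, False if there was nothing to remove.
    stanbol_root is a hypothetical path to the directory holding the launcher.
    """
    sling_dir = Path(stanbol_root) / "sling"
    if not sling_dir.is_dir():
        return False
    shutil.rmtree(sling_dir)  # removes the cached bundles and everything else in sling
    return True
```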

>
> (org.apache.stanbol.enhancer.servicesapi.EngineException:
> 'NamedEntityTaggingEngine' failed to process content item
> 'urn:content-item-sha1-88a2b5f6520df87e4567c06b48e742b7d1c71e9c' with type
> 'text/plain': org.apache.stanbol.entityhub.servicesapi.yard.YardException:
> SolrIndex entityhub is not available. The necessary Index is not yet
> installed.) org.apache.stanbol.enhancer.servicesapi.EngineException:
> 'NamedEntityTaggingEngine' failed to process content item
> 'urn:content-item-sha1-88a2b5f6520df87e4567c06b48e742b7d1c71e9c' with type
> 'text/plain': org.apache.stanbol.entityhub.servicesapi.yard.YardException:
> SolrIndex entityhub is not available. The necessary Index is not yet
> installed.
>        at
> org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine.computeEnhancements(NamedEntityTaggingEngine.java:323)
>
Since revision 1144364 the BundleDataFileProvider (the one that seems
not to work on Windows) is also used to load the entityhub index.
Therefore the initialization of the SolrYard used by the Entityhub
will not work either. Writing this, I realize that it would also
prevent the initialization of any other SolrYard (such as the
dbpediaCache), because the default initialization relies on the same
BundleDataFileProvider to load the required Solr configuration. This
would also explain the problems you had with the workaround I
suggested.

The two required files are in this directory [1]. If you copy them to
the {stanbol-root}/datafiles directory it should solve the problem.
After copying the files there you will need to deactivate and then
reactivate the SolrYards so that they pick up these files.

[1] 
http://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/yard/solr/src/main/resources/solr/core/
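
A sketch of the copy step, assuming the directory [1] has already been checked out or downloaded to a local folder. Both paths and the function name are hypothetical. It copies every regular file from the local copy into the datafiles directory, so no file names are assumed. The SolrYards still have to be deactivated and activated by hand afterwards.

```python
import shutil
from pathlib import Path


def copy_to_datafiles(source_dir, datafiles_dir):
    """Copy every regular file in source_dir into datafiles_dir.

    Both arguments are hypothetical local paths: source_dir is a local copy
    of the directory [1], datafiles_dir is {stanbol-root}/datafiles.
    Returns the sorted names of the copied files.
    """
    source = Path(source_dir)
    target = Path(datafiles_dir)
    target.mkdir(parents=True, exist_ok=True)  # create datafiles if it is missing
    copied = []
    for entry in sorted(source.iterdir()):
        if entry.is_file():  # skip sub-directories such as .svn
            shutil.copy2(entry, target / entry.name)
            copied.append(entry.name)
    return copied
```

For example, copy_to_datafiles("solr-core-checkout", "stanbol/datafiles") would return the list of file names it copied, which makes it easy to check that both required files arrived.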

> Have workarounds in sling/datafiles (have en-*.bin,
> dbpedia_43k.solrindex.zip )
>  (change for STANBOL-259, as Fabian commented, didn't fix the en-*.bin load
> issue, still needed the workaround)
>

I will look into that next week when I am back in the office. I cannot
do much without access to a Windows box.

> From the felix web console "Stanbol Data File Provider"
> Seems to be looking for entityhub.solrindex.zip and not finding it
> (tried having dbpedia_43k.solrindex.zip copied to entityhub.solrindex.zip in
> datafiles but got same error after restart and engine use)
>

You need to restart the SolrYard because it looks up the required files
during activation. Restarting the Engine will not cause the SolrYard
to be restarted.

> Tried also after being clean:  blow away sling dir, mvn clean, run shell
> script to get defaultdata files,  mvn install -DskipTests
> MAVEN_OPTS=-Xmx1024M -XX:MaxPermSize=128M in env
>

I am really sorry for all this writing without coming up with a real
solution, but it is really hard to solve Windows related problems
without access to a Windows box. So if you are not in a hurry, it
might be more effective to delay working on this until next week.

best
Rupert Westenthaler

> Steve
> -----Original Message-----
> From: Steve Reiner [mailto:[email protected]]
> Sent: Wednesday, July 13, 2011 12:09 AM
> To: '[email protected]'
> Subject: RE: EntityHub and DBpedia
>
> I am getting something like this too after updating with the code checked in
> yesterday. Problem wasn't there in the code the day before.
>
> (using /engines page)
>
> -----Original Message-----
> From: David Riccitelli [mailto:[email protected]]
> Sent: Tuesday, July 12, 2011 11:58 PM
> To: [email protected]
> Subject: Re: EntityHub and DBpedia
>
> Thanks Rupert,
>
> I'm trying to follow your instructions but I encounter a couple of issues
> (probably due to inexperience):
>  [1] when dropping the config files, they enter some loop of
> REGISTERING/UNREGISTERING (which I solve by stopping the FileInstall
> bundle), is that normal?
>  [2] after I restart Stanbol, and try to query an entity from the entityhub
> I receive the following error:
>
> 13.07.2011 09:54:17.939 *WARN* [509017110@qtp-1586831707-0]
> org.apache.felix.http.jetty /entityhub/sites/entity/
> (java.lang.IllegalStateException: Unable to initialize the Cache with Yard
> dbpediaCache! This is usually caused by Errors while reading the Cache
> Configuration from the Yard.) java.lang.IllegalStateException: Unable to
> initialize the Cache with Yard dbpediaCache! This is usually caused by
> Errors while reading the Cache Configuration from the Yard.
> at
> org.apache.stanbol.entityhub.core.site.CacheImpl.getCacheYard(CacheImpl.java:214)
>
>
> Do I need to initialize the Cache in some way?
>
> Thanks for your help,
>
> David
>
>
> On Mon, Jul 11, 2011 at 11:42 PM, Rupert Westenthaler <
> [email protected]> wrote:
>
>> Hi
>>
>> On Mon, Jul 11, 2011 at 8:17 PM, Andrea Giovanni Nuzzolese
>> <[email protected]> wrote:
>> > I solved it in the same way, but lost the caching capabilities.
>> > Is there any possibility to keep both all the data and the cache?
>> >
>> > Andrea
>> >
>> > On Jul 11, 2011, at 4:08 PM, David Riccitelli wrote:
>> >
>> >> Ok, stopping the solrYard dbpedia_43k component solved for me.
>> >>
>> >> Thanks,
>> >> David
>> >>
>> >> On Mon, Jul 11, 2011 at 4:13 PM, David Riccitelli <
>> >> [email protected]> wrote:
>> >>
>> >>> Hi Rupert,
>> >>>
>> >>> I recently updated the Stanbol install, and I found that the RDF returned
>> >>> by the EntityHub is missing some props (specifically the dbprop as far as I
>> >>> can see).
>> >>>
>> >>> This is the command that I use for testing:
>> >>> curl -H "accept: application/rdf+xml" "http://localhost:8080/entityhub/site/dbpedia/entity?id=http://dbpedia.org/resource/Valentino_Rossi"
>> >>>
>> >>> which outputs the attached RDF file.
>> >>>
>> >>> I cleared all of the sling folder (rm -fr sling) and checked with the
>> >>> SPARQL end-point at DBpedia, but I wasn't able to fix it.
>> >>>
>> >>> Does this depend on the mapping.txt file?
>> >>>
>>
>> If you plan to create your own dbpedia index, then the mapping.txt
>> file is the way to configure which properties are
>> included/excluded.
>> Typically dbprop values are low quality. They are just naive 1:1
>> mappings of the key-value pairs found in the info boxes. Because of
>> this they are excluded from the indexes.
>>
>> At runtime the returned data depend on the cache strategy in use.
>>
>> Currently there are three possibilities (configured with the
>> referenced Site):
>> 1) no cache: both queries and retrieval use the remote service.
>> 2) used: Queries are executed by the remote service. Retrieved
>> Entities are stored locally. The cached data depend on the mappings
>> defined for the cache.
>> 3) all: Both queries and retrieval are based on the cache. The remote
>> service is only used as a fallback in case the cache is not
>> available (e.g. if you deactivate the solrYard).
>>
>> So if you are fine with (2) then you could use the configuration
>> as previously used by the stable launcher [1].
>> I think the easiest way to install this is to add the
>> Felix File Installer [2] to the Stanbol Environment. You will need to
>> delete the current referencedSite for dbpedia first and then add the
>> three configuration files as described by [1].
>>
>> If your requirements are not covered by the currently available option
>> it would be nice if you could write a short user story, because I am
>> thinking about how to improve this feature and input like that would
>> be really valuable.
>>
>> best
>> Rupert Westenthaler
>>
>> [1] The dbpedia config consists of three files. the referenced site,
>> cache and solryard components with the "-dbpedia" endings.
>>
>> http://svn.apache.org/viewvc/incubator/stanbol/trunk/launchers/stable/src/main/resources/resources/config/?pathrev=1140181
>>
>> [2] http://felix.apache.org/site/apache-felix-file-install.html
>>
>> p.s. I keep this part because it describes very well how the cache
>> strategy "used" works:
>> >>>>> Hi David
>> >>>>>
>> >>>>> Assuming that you are using the default distribution of Apache Stanbol.
>> >>>>>
>> >>>>> Requests for  http://dbpedia.org/resource/Valentino_Rossi will be
>> >>>>> - only the first time answered by retrieving the Entity from DBpedia.org
>> >>>>> - the information is cached in a local cache. In doing so, the values of
>> >>>>> the documents are filtered (see (a) for details)
>> >>>>> - the cached version is returned
>> >>>>>
>> >>>>> (a) The default configuration for dbpedia stores all fields,
>> >>>>> but filters literal values so that only values with the
>> >>>>> language "en, de, fr, it, es" or no language are stored.
>> >>>>>
>> >>>>>
>> >>>>> Assuming that you have started from zero when updating to a new version,
>> >>>>> this also means that you have downloaded a new version of this
>> >>>>> Entity from DBpedia.
>> >>>>>
>>
>> --
>> | Rupert Westenthaler             [email protected]
>> | Bodenlehenstraße 11                             ++43-699-11108907
>> | A-5500 Bischofshofen
>>
>
>
>
> --
> David Riccitelli
>
> Interact SpA
> Via A. Bargoni 78 (scala F)
> 00153 Roma
>
> T +39 06 58318 301
> F +39 06 58318 303
>
>



-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
