Hi Steve

I think I have found the problem and provided a fix with revision 1147784 [1]

If you still have issues please feel free to reopen STANBOL-259 [2]

best
Rupert

[1] http://svn.apache.org/viewvc?view=revision&revision=1147784
[2] https://issues.apache.org/jira/browse/STANBOL-259

On Thu, Jul 14, 2011 at 8:37 AM, Steve Reiner
<[email protected]> wrote:
> Rupert ,
>
> Thanks for the help
>
> I can use Stanbol running in a Linux vmware vm until next week.
>
> Thanks,
> Steve
> -----Original Message-----
> From: Rupert Westenthaler [mailto:[email protected]]
> Sent: Wednesday, July 13, 2011 11:22 PM
> To: [email protected]
> Subject: Re: index file issue on windows
>
> Hi
>
>
> On Thu, Jul 14, 2011 at 4:27 AM, Steve Reiner 
> <[email protected]> wrote:
>> I know I should add this to JIRA; I just want to make sure I didn't need
>> to do some additional step to get the index to work.
>>
>> I was actually getting a different error: not the cache one, but "index
>> not yet installed" when using /engines.
>>
>> On Windows with the latest code I get the "index not yet installed" error
>> (and, weirdly, also with what I built on 7/10, which used to work with the
>> sling/datafiles workaround on Windows). Linux with the 7/10 code is still
>> fine:
>
> Did you delete the {stanbol}/sling folder after upgrading to the newest
> version? If not, you might still be running the old version: the /sling
> folder contains a cache that is not overwritten just because the launcher
> jar file is replaced.
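The cleanup suggested above could be scripted roughly as follows. This is only a sketch: the function name reset_sling_cache and passing the Stanbol directory as an argument are assumptions for illustration, not part of Stanbol. Note that it removes everything below sling, including any files placed in sling/datafiles as a workaround.

```shell
# Hypothetical helper: remove the launcher's working folder so that a
# restart cannot reuse cached content from the previous version.
# $1 = directory from which Stanbol is started (the one containing "sling").
reset_sling_cache() {
    stanbol_home=$1
    if [ -z "$stanbol_home" ]; then
        echo "usage: reset_sling_cache <stanbol-dir>" >&2
        return 2
    fi
    if [ -d "$stanbol_home/sling" ]; then
        # Stop Stanbol before calling this; files in use may not be removable.
        rm -rf -- "$stanbol_home/sling"
        echo "removed $stanbol_home/sling"
    else
        echo "nothing to remove in $stanbol_home"
    fi
}
```

Calling reset_sling_cache with the Stanbol directory before starting the new launcher jar gives a clean start.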
>
>>
>> (org.apache.stanbol.enhancer.servicesapi.EngineException:
>> 'NamedEntityTaggingEngine' failed to process content item
>> 'urn:content-item-sha1-88a2b5f6520df87e4567c06b48e742b7d1c71e9c' with
>> type
>> 'text/plain': org.apache.stanbol.entityhub.servicesapi.yard.YardException:
>> SolrIndex entityhub is not available. The necessary Index is not yet
>> installed.) org.apache.stanbol.enhancer.servicesapi.EngineException:
>> 'NamedEntityTaggingEngine' failed to process content item
>> 'urn:content-item-sha1-88a2b5f6520df87e4567c06b48e742b7d1c71e9c' with
>> type
>> 'text/plain': org.apache.stanbol.entityhub.servicesapi.yard.YardException:
>> SolrIndex entityhub is not available. The necessary Index is not yet
>> installed.
>>        at org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine.computeEnhancements(NamedEntityTaggingEngine.java:323)
>>
> Since revision 1144364 the BundleDataFileProvider (the one that seems not to
> work on Windows) is also used to load the entityhub index.
> Therefore the initialization of the SolrYard used by the Entityhub will not
> work either. As I am writing this I realize that this would also prevent the
> initialization of any other SolrYard (such as the dbpediaCache), because the
> default initialization relies on the same BundleDataFileProvider to load the
> required Solr configuration. So this would also explain the problems you had
> with the workaround I suggested.
>
> The two required files are in this directory [1]. If you copy them to the 
> {stanbol-root}/datafiles directory it should solve the problem.
> After copying the files there you will need to deactivate/activate the
> SolrYards so they pick up these files.
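A minimal sketch of the copy step, assuming the two files have already been downloaded from [1] to the local disk. The function name install_datafiles is hypothetical, and the file names are passed as arguments because they are not spelled out in this message.

```shell
# Hypothetical helper: copy downloaded files into {stanbol-root}/datafiles.
# $1 = Stanbol root directory, remaining arguments = files to copy.
install_datafiles() {
    if [ "$#" -lt 2 ]; then
        echo "usage: install_datafiles <stanbol-root> <file>..." >&2
        return 2
    fi
    stanbol_root=$1
    shift
    # Create the datafiles directory if it does not exist yet.
    mkdir -p "$stanbol_root/datafiles" || return 1
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f" >&2
            return 1
        fi
        cp -- "$f" "$stanbol_root/datafiles/" || return 1
    done
}
```

The SolrYards still need to be deactivated and activated afterwards, as described above; the script does not do that.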
>
> [1] 
> http://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/yard/solr/src/main/resources/solr/core/
>
>> Have workarounds in sling/datafiles (have en-*.bin,
>> dbpedia_43k.solrindex.zip )
>>  (change for STANBOL-259, as Fabian commented, didn't fix the en-*.bin
>> load issue, still needed the workaround)
>>
>
> I will look into that next week when I am back in the office. I cannot do
> much without access to a Windows box.
>
>> From the Felix web console, "Stanbol Data File Provider"
>> seems to be looking for entityhub.solrindex.zip and not finding it
>> (I tried copying dbpedia_43k.solrindex.zip to
>> entityhub.solrindex.zip in datafiles but got the same error after restart
>> and engine use)
>>
>
> You need to restart the SolrYard because it looks up the required files
> during activation. Restarting the Engine will not cause the SolrYard to be
> restarted.
>
>> I also tried after a clean start: blow away the sling dir, mvn clean, run
>> the shell script to get the defaultdata files, mvn install -DskipTests,
>> with MAVEN_OPTS=-Xmx1024M -XX:MaxPermSize=128M in env
>>
>
> I am really sorry for all this writing without coming up with a real
> solution, but it is really hard to solve Windows-related problems without
> access to a Windows box. So if you are not in a hurry, it might be more
> effective to delay working on this until next week.
>
> best
> Rupert Westenthaler
>
>> Steve
>> -----Original Message-----
>> From: Steve Reiner [mailto:[email protected]]
>> Sent: Wednesday, July 13, 2011 12:09 AM
>> To: '[email protected]'
>> Subject: RE: EntityHub and DBpedia
>>
>> I am getting something like this too after updating with the code
>> checked in yesterday. Problem wasn't there in the code the day before.
>>
>> (using /engines page)
>>
>> -----Original Message-----
>> From: David Riccitelli [mailto:[email protected]]
>> Sent: Tuesday, July 12, 2011 11:58 PM
>> To: [email protected]
>> Subject: Re: EntityHub and DBpedia
>>
>> Thanks Rupert,
>>
>> I'm trying to follow your instructions but I encounter a couple of
>> issues (probably due to inexperience):
>>  [1] when dropping the config files, they enter some loop of
>> REGISTERING/UNREGISTERING (which I solve by stopping the FileInstall
>> bundle), is that normal?
>>  [2] after I restart Stanbol, and try to query an entity from the
>> entityhub I receive the following error:
>>
>> 13.07.2011 09:54:17.939 *WARN* [509017110@qtp-1586831707-0]
>> org.apache.felix.http.jetty /entityhub/sites/entity/
>> (java.lang.IllegalStateException: Unable to initialize the Cache with
>> Yard dbpediaCache! This is usually caused by Errors while reading the
>> Cache Configuration from the Yard.) java.lang.IllegalStateException:
>> Unable to initialize the Cache with Yard dbpediaCache! This is usually
>> caused by Errors while reading the Cache Configuration from the Yard.
>> at org.apache.stanbol.entityhub.core.site.CacheImpl.getCacheYard(CacheImpl.java:214)
>>
>>
>> Do I need to initialize the Cache in some way?
>>
>> Thanks for your help,
>>
>> David
>>
>>
>> On Mon, Jul 11, 2011 at 11:42 PM, Rupert Westenthaler <
>> [email protected]> wrote:
>>
>>> Hi
>>>
>>> On Mon, Jul 11, 2011 at 8:17 PM, Andrea Giovanni Nuzzolese
>>> <[email protected]> wrote:
>>> > I solved it the same way, but lost the caching capabilities.
>>> > Is there any possibility to keep both all the data and the cache?
>>> >
>>> > Andrea
>>> >
>>> > On Jul 11, 2011, at 4:08 PM, David Riccitelli wrote:
>>> >
>>> >> Ok, stopping the solrYard dbpedia_43k component solved it for me.
>>> >>
>>> >> Thanks,
>>> >> David
>>> >>
>>> >> On Mon, Jul 11, 2011 at 4:13 PM, David Riccitelli <
>>> >> [email protected]> wrote:
>>> >>
>>> >>> Hi Rupert,
>>> >>>
>>> >>> I recently updated the Stanbol install, and I found that the RDF
>>> >>> returned by the EntityHub is missing some props (specifically the
>>> >>> dbprop as far as I can see).
>>> >>>
>>> >>> This is the command that I use for testing:
>>> >>> curl -H "accept: application/rdf+xml" "http://localhost:8080/entityhub/site/dbpedia/entity?id=http://dbpedia.org/resource/Valentino_Rossi"
>>> >>>
>>> >>> which outputs the attached RDF file.
>>> >>>
>>> >>> I cleared all of the sling folder (rm -fr sling) and checked with
>>> >>> the SPARQL endpoint at DBpedia, but I wasn't able to fix it.
>>> >>>
>>> >>> Does this depend on the mapping.txt file?
>>> >>>
>>>
>>> If you plan to create your own dbpedia index, then the mapping.txt
>>> file is the way to configure which properties are
>>> included/excluded.
>>> Typically dbprop values are low quality. They are just naive 1:1
>>> mappings of key value pairs as found in the info boxes. Because of
>>> this they are excluded from the indexes.
>>>
>>> At runtime the returned data depend on the used Cache strategy:
>>>
>>> Currently there are three possibilities (configured with the
>>> referenced Site):
>>> 1) no cache: both queries and retrieval use the remote service.
>>> 2) used: Queries are executed by the remote service. Retrieved
>>> Entities are stored locally. The cached data depend on the mappings
>>> defined for the cache.
>>> 3) all: Both queries and retrieval are based on the cache. The remote
>>> service is only used as a fallback in case the cache is not
>>> available (e.g. if you deactivate the solrYard).
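The three cache strategies listed above can be summarized as a small decision function. This is a sketch for illustration only, not Stanbol code: the function name entity_source, the label "none" for the no-cache case, and the output tokens are all hypothetical. The behaviour of "used" when the cache is unavailable is not stated in this thread; falling back to the remote service is an assumption.

```shell
# Hypothetical sketch: which source answers a request under each strategy.
# $1 = strategy: none | used | all   ("none" stands for "no cache")
# $2 = operation: query | retrieve
# $3 = cache available: yes | no     (defaults to yes)
entity_source() {
    strategy=$1
    operation=$2
    cache_available=${3:-yes}
    case "$strategy:$operation" in
        none:query|none:retrieve)
            # 1) no cache: everything goes to the remote service.
            echo remote ;;
        used:query)
            # 2) used: queries are executed by the remote service.
            echo remote ;;
        used:retrieve)
            # 2) used: retrieved entities are stored locally, so later
            # retrievals can be answered from the cache.
            if [ "$cache_available" = yes ]; then
                echo cache-then-remote
            else
                echo remote   # assumption, not stated in the thread
            fi ;;
        all:query|all:retrieve)
            # 3) all: cache first, remote only as fallback.
            if [ "$cache_available" = yes ]; then
                echo cache
            else
                echo remote
            fi ;;
        *)
            echo "unknown strategy or operation: $strategy $operation" >&2
            return 2 ;;
    esac
}
```

For example, entity_source all retrieve no prints "remote", which corresponds to the fallback case when the solrYard is deactivated.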
>>>
>>> So if you are fine with (2) then you could use the configuration
>>> as previously used by the stable launcher [1].
>>> I think the easiest way to install this is to add the
>>> Felix File Installer [2] to the Stanbol Environment. You will need to
>>> delete the current referencedSite for dbpedia first and then add the
>>> three configuration files as described by [1].
>>>
>>> If your requirements are not covered by the currently available
>>> options it would be nice if you could write a short user story,
>>> because I am thinking about how to improve this feature and input
>>> like that would be really valuable.
>>>
>>> best
>>> Rupert Westenthaler
>>>
>>> [1] The dbpedia config consists of three files. the referenced site,
>>> cache and solryard components with the "-dbpedia" endings.
>>>
>>> http://svn.apache.org/viewvc/incubator/stanbol/trunk/launchers/stable/src/main/resources/resources/config/?pathrev=1140181
>>>
>>> [2] http://felix.apache.org/site/apache-felix-file-install.html
>>>
>>> p.s. I keep this part because it describes very well how the cache
>>> strategy "used" works:
>>> >>>>> Hi David
>>> >>>>>
>>> >>>>> Assuming that you are using the default distribution of Apache
>>> >>>>> Stanbol.
>>> >>>>>
>>> >>>>> Requests for http://dbpedia.org/resource/Valentino_Rossi will
>>> >>>>> be
>>> >>>>> - only the first time answered by retrieving the Entity from
>>> >>>>> DBpedia.org
>>> >>>>> - the information is cached in a local cache. In doing so, values
>>> >>>>> of the documents are filtered (see (a) for details)
>>> >>>>> - the cached version is returned
>>> >>>>>
>>> >>>>> (a) The default configuration for dbpedia stores all fields
>>> >>>>> but filters values for literals so that only values with
>>> >>>>> the language "en, de, fr, it, es" or no language are stored.
>>> >>>>>
>>> >>>>>
>>> >>>>> Assuming that you have started from zero when updating to a new
>>> >>>>> version, this also means that you have downloaded a new version of
>>> >>>>> this Entity from DBpedia.
>>> >>>>>
>>>
>>> --
>>> | Rupert Westenthaler             [email protected]
>>> | Bodenlehenstraße 11                             ++43-699-11108907
>>> | A-5500 Bischofshofen
>>>
>>
>>
>>
>> --
>> David Riccitelli
>>
>> Interact SpA
>> Via A. Bargoni 78 (scala F)
>> 00153 Roma
>>
>> T +39 06 58318 301
>> F +39 06 58318 303
>>
>>
>
>
>
> --
> | Rupert Westenthaler             [email protected]
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen
>
>



-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
