On Thu, Jul 14, 2011 at 8:30 AM, David Riccitelli
<[email protected]> wrote:
> Thanks Rupert,
>
> A description on how to do this is available in [1].
>
>
> I can't see the [1] :-)

does this count as missing attachment? ^^

[1] http://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/yard/solr/src/main/resources/solr/core/

>
> David
>
> On Thu, Jul 14, 2011 at 8:56 AM, Rupert Westenthaler <
> [email protected]> wrote:
>
>> Hi
>>
>> Yes, this is possible, but depending on the hardware it will take quite
>> some time.
>> A description on how to do this is available in [1].
>>
>> Instead of installing the dbpedia.solrindex.zip file as described in
>> the readme, you could directly
>>
>> * shut down Stanbol
>> * delete the "dbpedia_43k" index in
>> "{stanbol-root}/sling/entityhub/solrYard/indexes"
>> * copy the index located in the
>> "{indexing-root}/indexing/destination/indexes" to
>> "{stanbol-root}/sling/entityhub/solrYard/indexes" and rename it to
>> "dbpedia_43k"
>> * restart stanbol.
>>
>> After that Stanbol should use the new index.
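
A minimal sketch of the delete, copy and rename steps above, in Python. The function name swap_index and its parameters are made up for illustration and are not part of Stanbol. The caller passes the real paths, and the function should only be run while Stanbol is shut down.

```python
import shutil
from pathlib import Path


def swap_index(new_index: Path, indexes_dir: Path, name: str = "dbpedia_43k") -> Path:
    """Replace indexes_dir/name with a copy of new_index.

    Hypothetical helper: run it only while Stanbol is stopped.
    """
    if not new_index.is_dir():
        raise FileNotFoundError(f"new index not found: {new_index}")
    target = indexes_dir / name
    # Delete the old index directory, if there is one.
    if target.exists():
        shutil.rmtree(target)
    # Copy the freshly built index under the name Stanbol is configured for.
    shutil.copytree(new_index, target)
    return target
```

Here new_index would be the index directory below "{indexing-root}/indexing/destination/indexes" and indexes_dir would be "{stanbol-root}/sling/entityhub/solrYard/indexes". Restart Stanbol afterwards.
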
>>
>> Copying the "dbpedia.solrindex.zip" to the datafiles directory and
>> then changing the value of "Solr Index/Core" in the configuration of
>> the SolrYard for DBpedia from "dbpedia_43k" to "dbpedia" should also
>> work.
>>
>> best
>> Rupert
>>
>> On Wed, Jul 13, 2011 at 11:58 AM, David Riccitelli
>> <[email protected]> wrote:
>> > Hi,
>> >
>> > As another workaround, I was thinking that I could actually generate locally
>> > the DBpedia index with all the data using the dumps
>> > (http://wiki.dbpedia.org/Downloads36), in a way similar to the dbpedia_43k.
>> >
>> > What do you think?
>> >
>> > Thanks,
>> > David
>> >
>> > On Wed, Jul 13, 2011 at 12:11 PM, Rupert Westenthaler <
>> > [email protected]> wrote:
>> >
>> >> Hi
>> >>
>> >> I will try to find some time in the evening to reproduce this.
>> >>
>> >> On Wed, Jul 13, 2011 at 8:57 AM, David Riccitelli
>> >> <[email protected]> wrote:
>> >> > Thanks Rupert,
>> >> >
>> >> > I'm trying to follow your instructions but I encounter a couple of issues
>> >> > (probably due to inexperience):
>> >> >  [1] when dropping the config files, they enter some loop of
>> >> > REGISTERING/UNREGISTERING (which I solve by stopping the FileInstall
>> >> > bundle), is that normal?
>> >>
>> >> This is very strange and should not be caused by the FileInstaller.
>> >> Maybe there is some loop between the Sling Installer (trying to
>> >> install the default configuration) and the FileInstaller that causes
>> >> this under some circumstances.
>> >>
>> >> >  [2] after I restart Stanbol, and try to query an entity from the entityhub
>> >> > I receive the following error:
>> >> >
>> >> > 13.07.2011 09:54:17.939 *WARN* [509017110@qtp-1586831707-0]
>> >> > org.apache.felix.http.jetty /entityhub/sites/entity/
>> >> > (java.lang.IllegalStateException: Unable to initialize the Cache with Yard
>> >> > dbpediaCache! This is usually caused by Errors while reading the Cache
>> >> > Configuration from the Yard.) java.lang.IllegalStateException: Unable to
>> >> > initialize the Cache with Yard dbpediaCache! This is usually caused by
>> >> > Errors while reading the Cache Configuration from the Yard.
>> >> > at org.apache.stanbol.entityhub.core.site.CacheImpl.getCacheYard(CacheImpl.java:214)
>> >> >
>> >> >
>> >> > Do I need to initialize the Cache in some way?
>> >> >
>> >> No, you do not. Prepared indexes include a document that
>> >> provides a list of the indexed fields. In the future this may be used to
>> >> determine whether a query can be successfully executed on the local index
>> >> or not. In addition, it is used when an Entity within the index is
>> >> updated with a newer version.
>> >> However, this configuration is optional. This
>> >> Exception should only appear if the document is present but
>> >> malformed. However, the SolrYard initialized for the dbpediaCache
>> >> should be empty.
>> >>
>> >> Therefore I think it is somehow related to the above problem of
>> >> overriding configurations.
>> >>
>> >> In general, the way the default configuration is loaded is
>> >> sub-optimal at the moment. In particular, using a single defaultdata
>> >> bundle for both the OpenNLP models and the dbpedia configuration +
>> >> default index was not a good idea, because one cannot exclude/change
>> >> the dbpedia stuff without affecting other components that depend on
>> >> OpenNLP.
>> >> Therefore I think we need to discuss how to better structure the
>> >> configurations and data needed to run Stanbol.
>> >>
>> >> There is also another issue: the SolrYard copies provided indexes
>> >> only once and does not check for updates. This would make it
>> >> hard to upgrade from the small index provided with the default data
>> >> to a bigger version.
>> >>
>> >> Both of these things are related to the problems and need to be addressed
>> >> before the first Stanbol release. Independent of those, I will try to
>> >> find a simple solution for what you intend to do.
>> >>
>> >> In the meantime I suggest you go for the initially proposed workaround.
>> >>
>> >> best
>> >> Rupert Westenthaler
>> >>
>> >> > Thanks for your help,
>> >> >
>> >> > David
>> >> >
>> >> >
>> >> > On Mon, Jul 11, 2011 at 11:42 PM, Rupert Westenthaler <
>> >> > [email protected]> wrote:
>> >> >
>> >> >> Hi
>> >> >>
>> >> >> On Mon, Jul 11, 2011 at 8:17 PM, Andrea Giovanni Nuzzolese
>> >> >> <[email protected]> wrote:
>> >> >> > I solved it in the same way, but losing the caching capabilities.
>> >> >> > Is there any possibility to keep both all the data and the cache?
>> >> >> >
>> >> >> > Andrea
>> >> >> >
>> >> >> > On Jul 11, 2011, at 4:08 PM, David Riccitelli wrote:
>> >> >> >
>> >> >> >> Ok, stopping the solrYard dbpedia_43k component solved it for me.
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >> David
>> >> >> >>
>> >> >> >> On Mon, Jul 11, 2011 at 4:13 PM, David Riccitelli <
>> >> >> >> [email protected]> wrote:
>> >> >> >>
>> >> >> >>> Hi Rupert,
>> >> >> >>>
>> >> >> >>> I recently updated the Stanbol install, and I found that the RDF returned
>> >> >> >>> by the EntityHub is missing some props (specifically the dbprop as far as I
>> >> >> >>> can see).
>> >> >> >>>
>> >> >> >>> This is the command that I use for testing:
>> >> >> >>> curl -H "accept: application/rdf+xml" "http://localhost:8080/entityhub/site/dbpedia/entity?id=http://dbpedia.org/resource/Valentino_Rossi"
>> >> >> >>>
>> >> >> >>> which outputs the attached RDF file.
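
For scripted testing, the same request can be built in Python. This is only a sketch: the function name entity_request is made up, and the "id" value is percent-encoded here, whereas the curl call passes it raw. That the Entityhub accepts the encoded form is an assumption based on ordinary query-string decoding.

```python
from urllib.parse import urlencode
from urllib.request import Request


def entity_request(
    entity_uri: str,
    base: str = "http://localhost:8080/entityhub/site/dbpedia/entity",
) -> Request:
    """Build (but do not send) a GET request for one entity as RDF/XML."""
    # urlencode percent-encodes the entity URI so it is a valid query value.
    query = urlencode({"id": entity_uri})
    return Request(f"{base}?{query}", headers={"Accept": "application/rdf+xml"})
```

Passing the returned object to urllib.request.urlopen() sends the request against a running Stanbol instance.
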
>> >> >> >>>
>> >> >> >>> I cleared the whole sling folder (rm -fr sling) and checked with the
>> >> >> >>> SPARQL endpoint at DBpedia, but I wasn't able to fix it.
>> >> >> >>>
>> >> >> >>> Does this depend on the mapping.txt file?
>> >> >> >>>
>> >> >>
>> >> >> If you plan to create your own dbpedia index, then the mapping.txt
>> >> >> file is the way to configure which properties are
>> >> >> included/excluded.
>> >> >> Typically dbprop values are low quality. They are just naive 1:1
>> >> >> mappings of key-value pairs as found in the info boxes. Because of
>> >> >> this they are excluded from the indexes.
>> >> >>
>> >> >> At runtime the returned data depend on the used Cache strategy:
>> >> >>
>> >> >> Currently there are three possibilities (configured with the referenced
>> >> >> Site):
>> >> >> 1) no cache: both queries and retrieval use a remote service.
>> >> >> 2) used: Queries are executed by the remote service. Retrieved
>> >> >> Entities are stored locally. The cached data depend on the mappings
>> >> >> defined for the cache.
>> >> >> 3) all: Both queries and retrieval are based on the cache. The remote
>> >> >> service is only used as a fallback in case the cache is not
>> >> >> available (e.g. if you deactivate the SolrYard).
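
The retrieval side of these three strategies can be sketched roughly as follows. This is an illustration only, not Stanbol code: the class ReferencedSiteSketch, the strategy strings "none", "used" and "all", and the in-memory dict standing in for the SolrYard are all assumptions. Queries and the cache mappings are left out.

```python
from typing import Callable, Dict, Optional

Entity = Dict[str, list]


class ReferencedSiteSketch:
    """Toy model of entity retrieval under the three cache strategies."""

    def __init__(
        self,
        strategy: str,
        remote_get: Callable[[str], Optional[Entity]],
        cache_available: bool = True,
    ) -> None:
        if strategy not in ("none", "used", "all"):
            raise ValueError(f"unknown cache strategy: {strategy}")
        self.strategy = strategy
        self.remote_get = remote_get
        self.cache_available = cache_available
        self.cache: Dict[str, Entity] = {}  # stands in for the local yard

    def get_entity(self, entity_id: str) -> Optional[Entity]:
        if self.strategy == "none":
            # 1) no cache: always ask the remote service.
            return self.remote_get(entity_id)
        if self.strategy == "used":
            # 2) used: fetch remotely once, then serve the stored copy.
            if entity_id in self.cache:
                return self.cache[entity_id]
            entity = self.remote_get(entity_id)
            if entity is not None:
                self.cache[entity_id] = entity
            return entity
        # 3) all: the cache is authoritative; remote is only a fallback
        # when the cache itself is unavailable.
        if self.cache_available:
            return self.cache.get(entity_id)
        return self.remote_get(entity_id)
```

With "used", a second get_entity() call for the same id never reaches the remote service. With "all", an id missing from an available cache simply yields None.
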
>> >> >>
>> >> >> So if you are fine with (2), then you could use the configuration
>> >> >> previously used by the stable launcher [1].
>> >> >> I think the easiest way to install it is to add the
>> >> >> Felix File Installer [2] to the Stanbol environment. You will need to
>> >> >> delete the current referencedSite for dbpedia first and then add the
>> >> >> three configuration files as described in [1].
>> >> >>
>> >> >> If your requirements are not covered by the currently available options,
>> >> >> it would be nice if you could write a short user story, because I am
>> >> >> thinking about how to improve this feature and input like that would
>> >> >> be really valuable.
>> >> >>
>> >> >> best
>> >> >> Rupert Westenthaler
>> >> >>
>> >> >> [1] The dbpedia config consists of three files: the referenced site,
>> >> >> cache and solryard components with the "-dbpedia" endings.
>> >> >> http://svn.apache.org/viewvc/incubator/stanbol/trunk/launchers/stable/src/main/resources/resources/config/?pathrev=1140181
>> >> >>
>> >> >> [2] http://felix.apache.org/site/apache-felix-file-install.html
>> >> >>
>> >> >> p.s. I keep this part because it describes very well how the cache
>> >> >> strategy "used" works:
>> >> >> >>>>> Hi David
>> >> >> >>>>>
>> >> >> >>>>> Assuming that you are using the default distribution of Apache Stanbol.
>> >> >> >>>>>
>> >> >> >>>>> Requests for http://dbpedia.org/resource/Valentino_Rossi will be
>> >> >> >>>>> - only the first time answered by retrieving the Entity from DBpedia.org
>> >> >> >>>>> - the information is cached in a local cache. In doing so, values of the
>> >> >> >>>>> documents are filtered (see (a) for details)
>> >> >> >>>>> - the cached version is returned
>> >> >> >>>>>
>> >> >> >>>>> (a) The default configuration for dbpedia stores all fields, however it
>> >> >> >>>>> filters values for literals so that only values with the language "en,
>> >> >> >>>>> de, fr, it, es" or no language are stored.
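
The literal filter described in (a) amounts to something like the sketch below. The names filter_literals and ALLOWED_LANGUAGES, and the (text, language) tuple representation, are assumptions for illustration. The real filtering is done by the cache mappings in the Stanbol configuration.

```python
from typing import FrozenSet, Iterable, List, Optional, Tuple

# A literal is modelled as (text, language tag), with None for "no language".
Literal = Tuple[str, Optional[str]]

ALLOWED_LANGUAGES: FrozenSet[str] = frozenset({"en", "de", "fr", "it", "es"})


def filter_literals(
    values: Iterable[Literal],
    allowed: FrozenSet[str] = ALLOWED_LANGUAGES,
) -> List[Literal]:
    """Keep literals that have no language tag or an allowed one."""
    return [
        (text, lang)
        for text, lang in values
        if lang is None or lang.lower() in allowed
    ]
```

A literal tagged "pt" or "ja" would be dropped, while an untagged value such as a number or a date would be kept.
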
>> >> >> >>>>>
>> >> >> >>>>>
>> >> >> >>>>> Assuming that you have started from zero when updating to a new version,
>> >> >> >>>>> this also means that you have downloaded a new version of this Entity
>> >> >> >>>>> from DBpedia.
>> >> >> >>>>>
>> >> >>
>> >> >> --
>> >> >> | Rupert Westenthaler             [email protected]
>> >> >> | Bodenlehenstraße 11                             ++43-699-11108907
>> >> >> | A-5500 Bischofshofen
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > David Riccitelli
>> >> >
>> >> > Interact SpA
>> >> > Via A. Bargoni 78 (scala F)
>> >> > 00153 Roma
>> >> >
>> >> > T +39 06 58318 301
>> >> > F +39 06 58318 303
>> >> >



-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
