The query for Paris is not working, as I indexed only the Chinese files. I also tried Chinese strings instead of "Paris"; that does not work either. I know that those entries are present in dbpedia.
How do you start in "-l DEBUG" mode? I use the command

java -Xmx1024m -XX:MaxPermSize=256m -Xdebug \
  -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n \
  -jar launchers/full/target/org.apache.stanbol.launchers.full-0.10.0-incubating-SNAPSHOT.jar

to start the Stanbol server.

-harish

On Wed, Aug 22, 2012 at 9:41 PM, Rupert Westenthaler <[email protected]> wrote:
> Hi,
>
> Usually when one does not get the expected results it is related to
> the data contained by the dbpedia referenced site. So I will try to
> provide some information on how best to debug what is happening.
>
> Can you maybe provide data for some entities by posting the results
> of an Entityhub query such as
>
>     curl -H "Accept: application/rdf+xml" \
>       "http://localhost:8080/entityhub/site/dbpedia/entity?id=http://dbpedia.org/resource/Paris"
>
> You can also use another entity than Paris if it is more representative
> for your data.
>
> Another interesting thing to do is:
>
> 1) start Stanbol in DEBUG mode (by adding the "-l DEBUG" option
>    when starting)
> 2) send a document to the Enhancer
> 3) you should now see the Solr queries used in the log (you might need
>    to filter the extensive logging for the component
>    "org.apache.stanbol.entityhub.yard.solr.impl.SolrQueryFactory")
> 4) check those queries manually by sending them to
>
>     http://localhost:8080/solr/default/dbpedia/select?q=
>
> BTW: You can also look at the data stored in the Solr index by
> requesting a document via its URI, e.g.
>
>     http://localhost:8080/solr/default/dbpedia/select?q=uri:http\://dbpedia.org/resource/Paris
>
> This should help in looking into your issue.
>
> best
> Rupert
>
> On Thu, Aug 23, 2012 at 12:57 AM, harish suvarna <[email protected]> wrote:
> > I am finally successful after converting some Chinese dbpedia dump files
> > to UTF-8. But I can't hit any dbpedia links in Stanbol using this Solr
> > dump. I am just wondering whether I should pre-process the Chinese
> > dbpedia dump files.
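For step 4 above, note that the ':' characters inside the entity URI must be escaped in the Solr query (as in Rupert's `uri:http\://...` example), and that the URI in the Entityhub request should be URL-encoded. A minimal sketch that builds both request URLs with Python's standard library, assuming the default host/port and index name from the thread:

```python
from urllib.parse import quote, urlencode

BASE = "http://localhost:8080"  # Stanbol launcher default used in this thread
entity = "http://dbpedia.org/resource/Paris"

# Entityhub lookup: the entity URI goes into the "id" query parameter,
# URL-encoded so its own '://' does not break the request URL.
entityhub_url = (
    BASE
    + "/entityhub/site/dbpedia/entity?"
    + urlencode({"id": entity})
)

# Direct Solr lookup: in the Lucene/Solr query syntax the ':' characters
# inside the URI value must be backslash-escaped, as in Rupert's example.
escaped = entity.replace(":", r"\:")
solr_url = (
    BASE
    + "/solr/default/dbpedia/select?"
    + urlencode({"q": "uri:" + escaped})
)

print(entityhub_url)
print(solr_url)
```

You can then fetch either URL with curl (or `urllib.request`) against a running launcher; if the Solr lookup returns the document but the Entityhub lookup does not, the problem is more likely in the site configuration than in the index.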
> > I uploaded the new jar file successfully as a new bundle
> > <http://localhost:8080/system/console/bundles/179>. Then I defined a
> > new engine using this referenced site 'dbpedia'. I do not have any
> > other dbpedia Solr dump. The chain says it is active and all 3 engines
> > are available.
> > If I put in the dbpedia Solr index from Ogrisel (1.19 GB), it works
> > fine; I get some dbpedia links. Am I missing anything else?
> > I did add the instance_types and person_data from the English dump.
> >
> > -harish
> >
> > On Tue, Aug 21, 2012 at 6:22 PM, harish suvarna <[email protected]> wrote:
> >>
> >> On Mon, Aug 20, 2012 at 9:30 PM, Rupert Westenthaler <
> >> [email protected]> wrote:
> >>
> >>> On Tue, Aug 21, 2012 at 2:30 AM, harish suvarna <[email protected]>
> >>> wrote:
> >>> >>
> >>> >> I had not yet had time to look at dbpedia 3.8. They might have
> >>> >> changed the names of some dump files. Generally "instance_types"
> >>> >> is very important (it provides the information about the type of
> >>> >> an Entity). "person_data" includes additional information for
> >>> >> persons; AFAIK that information is not included in the default
> >>> >> configuration of the dbpedia indexing tool.
> >>> >>
> >>> > Not all language dumps have these files. Japanese and Italian also
> >>> > do not have these files. These files are listed in the readme
> >>> > file. Hence I was looking for them.
> >>> >
> >>> Types are the same for all languages. Therefore they are only
> >>> available in English.
> >>> I am not sure about "person_data", but it might be the same.
> >>>
> >>> In other words: if you build an index for a specific language you
> >>> need to include the English dumps of those files that are not
> >>> language specific.
> >>>
> >> I will try this. Thanks a lot.
> >>
> >>> >> > I get a java exception.
> >>> >>
> >>> >> The included exceptions look like the RDF file containing the
> >>> >> Chinese labels is not well formatted. Experience says that this
> >>> >> is most likely related to char-encoding issues. This was also the
> >>> >> case with some dbpedia 3.7 files (see the special treatment of
> >>> >> some files in the shell script of the dbpedia indexing tool).
> >>> >>
> >>> > OK. I will try to debug this.
> >>> >
> >>
> >> I converted labels_zh.nt to UTF-8 using MS Word. MS Word adds the
> >> BOM bytes though, so I needed to remove them. Then labels_zh.nt went
> >> through, but long_abstracts has the same problem, so I am still
> >> working on these other files.
> >> Thanks a lot for all your patience and all the Stanbol teachings.
> >>
> >> --
> >> Thanks
> >> Harish
> >
> > --
> > Thanks
> > Harish
>
> --
> | Rupert Westenthaler             [email protected]
> | Bodenlehenstraße 11             ++43-699-11108907
> | A-5500 Bischofshofen

-- 
Thanks
Harish
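The MS Word round-trip can be avoided with a short script. This is only a sketch of the conversion described above, assuming the problem files are in a legacy encoding (GB18030 is my guess for the Chinese dumps; the actual source encoding may differ) and that a UTF-8 BOM may have been prepended by an earlier conversion:

```python
import codecs

def to_clean_utf8(data: bytes, source_encoding: str = "gb18030") -> bytes:
    """Return `data` as UTF-8 bytes without a BOM.

    If the input already carries a UTF-8 BOM (as files saved by MS Word
    often do), just strip it; otherwise decode from `source_encoding`.
    The gb18030 default is only a guess for the Chinese dbpedia dumps.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    # errors="strict" makes bad byte sequences fail loudly instead of
    # silently writing a corrupted .nt file into the index.
    return data.decode(source_encoding, errors="strict").encode("utf-8")

# Hypothetical usage on one of the dump files mentioned in the thread:
# with open("labels_zh.nt", "rb") as f:
#     cleaned = to_clean_utf8(f.read())
# with open("labels_zh.utf8.nt", "wb") as f:
#     f.write(cleaned)
```

Running this over each dump file before indexing would replace both the MS Word conversion and the manual BOM removal in one step.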
