Hello Max,
I'm working with Portuguese. I'm not sure I understand what's going on
or how to solve it.
Cheers,
Gabriel Oliveira
2013/5/29 Max Jakob <[email protected]>
> Hi Gabriel, CCing dbpedia-developers list,
>
> this looks like a problem with the DBpedia parser, so it is not
> directly a Spotlight problem. It's related to the (fairly new?) Module
> namespace in Wikipedia [1] that is not handled by the parser for all
> languages yet [2]. I assume you are not working with English, French
> or Hungarian dumps. For all other languages, the Module namespace is
> not configured yet. Which language are you working with?
> If you understand what's going on, you can add the appropriate
> configuration yourself and send a pull request on GitHub to the
> extraction-framework repo [3]. Otherwise, the developer community
> might be able to help you.
> After this is corrected, install the extraction-framework in your
> local Maven repo by running mvn clean install. Afterwards, do the same
> for Spotlight again. Finally, re-attempt to run the indexing.
>
> Cheers,
> Max
>
> [1] http://en.wikipedia.org/wiki/Wikipedia:Namespace
> [2] https://github.com/dbpedia/extraction-framework/blob/master/core/src/main/scala/org/dbpedia/extraction/wikiparser/impl/wikipedia/Namespaces.scala
> [3] https://github.com/dbpedia/extraction-framework
>
>
> On Wed, May 29, 2013 at 10:27 PM, Gabriel Oliveira <[email protected]>
> wrote:
> > Hello Max,
> >
> > I did as you told me and I have managed to fix some problems. I am still
> > learning how to use Maven and IntelliJ, therefore I have missed a few
> > details and it took me a while to realize that, but now I have made some
> > progress.
> >
> > Unlike the previous attempts, though, it has now run for almost an hour
> > and has saved about 10800000 occurrences. However, after these
> > occurrences are saved an exception is thrown, and I now believe it is
> > not my fault anymore.
> > The output is as follows:
> >
> > INFO 2013-05-29 16:55:58,441 main [AllOccurrenceSource$] - Processed 1300000 Wikipedia definition pages (average 9.73 links per page)
> > INFO 2013-05-29 16:56:23,092 main [FileOccurrenceSource$] - saved 10600000 occurrences
> > INFO 2013-05-29 16:57:10,162 main [FileOccurrenceSource$] - saved 10700000 occurrences
> > INFO 2013-05-29 16:58:00,311 main [FileOccurrenceSource$] - saved 10800000 occurrences
> > java.lang.reflect.InvocationTargetException
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >     at java.lang.reflect.Method.invoke(Method.java:601)
> >     at scala_maven_executions.MainHelper.runMain(MainHelper.java:164)
> >     at scala_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
> > Caused by: java.util.NoSuchElementException: key not found: 828
> >     at scala.collection.MapLike$class.default(MapLike.scala:225)
> >     at scala.collection.immutable.HashMap.default(HashMap.scala:38)
> >     at scala.collection.MapLike$class.apply(MapLike.scala:135)
> >     at scala.collection.immutable.HashMap.apply(HashMap.scala:38)
> >     at org.dbpedia.extraction.sources.WikipediaDumpParser.readPage(WikipediaDumpParser.java:218)
> >     at org.dbpedia.extraction.sources.WikipediaDumpParser.readPages(WikipediaDumpParser.java:179)
> >     at org.dbpedia.extraction.sources.WikipediaDumpParser.readDump(WikipediaDumpParser.java:137)
> >     at org.dbpedia.extraction.sources.WikipediaDumpParser.run(WikipediaDumpParser.java:108)
> >     at org.dbpedia.extraction.sources.XMLReaderSource.foreach(XMLSource.scala:57)
> >     at org.dbpedia.spotlight.io.AllOccurrenceSource$AllOccurrenceSource.foreach(AllOccurrenceSource.scala:80)
> >     at org.dbpedia.spotlight.filter.Filter$FilteredOccs.foreach(Filter.scala:58)
> >     at org.dbpedia.spotlight.filter.Filter$FilteredOccs.foreach(Filter.scala:58)
> >     at org.dbpedia.spotlight.filter.Filter$FilteredOccs.foreach(Filter.scala:58)
> >     at org.dbpedia.spotlight.io.FileOccurrenceSource$.writeToFile(FileOccurrenceSource.scala:57)
> >     at org.dbpedia.spotlight.lucene.index.ExtractOccsFromWikipedia$.main(ExtractOccsFromWikipedia.scala:82)
> >     at org.dbpedia.spotlight.lucene.index.ExtractOccsFromWikipedia.main(ExtractOccsFromWikipedia.scala)
> >     ... 6 more
> > [INFO] ------------------------------------------------------------------------
> > [INFO] BUILD FAILURE
> > [INFO] ------------------------------------------------------------------------
> > [INFO] Total time: 56:48.220s
> > [INFO] Finished at: Wed May 29 16:58:17 BRT 2013
> > [INFO] Final Memory: 11M/216M
> > [INFO] ------------------------------------------------------------------------
> >
> > [ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.1.0:run (default-cli) on project index: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 240(Exit value: 240) -> [Help 1]
> > [ERROR]
> > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
> > [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> > [ERROR]
> > [ERROR] For more information about the errors and possible solutions, please read the following articles:
> > [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> >
> > I will run it again with the -X switch and will send you the full output
> > as soon as it finishes. Probably about an hour from now.
> >
> > I really appreciate your support.
> >
> > Cheers,
> > Gabriel Oliveira
>
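
For readers following the thread: the "key not found: 828" in the quoted
trace is consistent with Max's diagnosis. 828 is the MediaWiki code for the
Module namespace, and a strict map lookup fails for any code that is missing
from the table. The Python sketch below only illustrates that failure mode.
The table contents and all function names are hypothetical and are not taken
from Namespaces.scala; the namespace names are English placeholders, and the
localized names for a given language would have to be checked against that
wiki. The actual fix belongs in the extraction-framework configuration.

```python
# Hypothetical table of namespace codes; NOT the real contents of
# Namespaces.scala. It deliberately lacks the Module namespace.
CONFIGURED_NAMESPACES = {
    0: "Main",
    10: "Template",
    14: "Category",
}

# MediaWiki codes for the Module namespace and its talk namespace.
MODULE_NAMESPACE = 828
MODULE_TALK_NAMESPACE = 829


def strict_lookup(code):
    """Raise KeyError for an unknown code, much like Scala's Map.apply
    throws NoSuchElementException ("key not found: 828")."""
    return CONFIGURED_NAMESPACES[code]


def lenient_lookup(code):
    """Return None for an unknown code instead of raising."""
    return CONFIGURED_NAMESPACES.get(code)


def with_module_namespace(namespaces):
    """Return a copy of the table that also knows the Module namespaces.
    The names used here are placeholders (an assumption)."""
    updated = dict(namespaces)
    updated[MODULE_NAMESPACE] = "Module"
    updated[MODULE_TALK_NAMESPACE] = "Module talk"
    return updated


if __name__ == "__main__":
    try:
        strict_lookup(MODULE_NAMESPACE)
    except KeyError as err:
        print("strict lookup failed for code", err)

    fixed = with_module_namespace(CONFIGURED_NAMESPACES)
    print("after adding the namespace:", fixed[MODULE_NAMESPACE])
```

The sketch suggests why the run only fails after many pages: the lookup
raises the first time a page in the unconfigured namespace is read from the
dump, not at startup.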
_______________________________________________
Dbpedia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-developers