Hi Anupam,

Something's wrong with your file.
enwiki-20150113-pages-articles.xml.bz2 does not exist on
dumps.wikimedia.org, but enwiki-20150112-pages-articles.xml.bz2 and
wikidatawiki-20150113-pages-articles.xml.bz2 do.

Please download the enwiki dump and try again. The best way is to
adapt download.minimal.properties and extraction.default.properties to
your needs and then execute

../run download config=download.minimal.properties

and later

../run extraction extraction.default.properties


The warnings you sent imply that the parser is reading the wikidata
dump file, not the enwiki file.
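
One way to confirm which wiki a dump file really contains is to look at the
<siteinfo> header at the top of the XML, which names the source database in a
<dbname> element. The sketch below is only an illustration and is not part of
the extraction framework: the function name dump_dbname and the 1 MiB read
limit are assumptions made for this example.

```python
import bz2
import re


def dump_dbname(path, max_bytes=1 << 20):
    """Return the <dbname> value from the dump's <siteinfo> header, or None.

    Only the first max_bytes of decompressed data are inspected, because the
    header sits at the very start of the dump.
    """
    with bz2.open(path, "rb") as f:
        head = f.read(max_bytes)
    match = re.search(rb"<dbname>([^<]+)</dbname>", head)
    return match.group(1).decode("utf-8") if match else None
```

For an English Wikipedia dump the result would be expected to be "enwiki"; a
result of "wikidatawiki" would match the warnings quoted below.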

The "unexpected end of stream" error probably means that the file is
corrupted or truncated, for example by an incomplete download.
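
A quick way to test this before starting a long extraction is to decompress
the whole file once and see whether the bz2 stream ends cleanly. The sketch
below is a hedged illustration using only the Python standard library; the
function name bz2_is_complete and the chunk size are assumptions for this
example, not anything the framework provides.

```python
import bz2


def bz2_is_complete(path, chunk_size=1 << 20):
    """Return True if the bz2 file at path decompresses to the end.

    The decompressed data is read in chunks and discarded. A stream that
    stops early raises EOFError and invalid data raises OSError; both are
    reported as False.
    """
    try:
        with bz2.open(path, "rb") as f:
            while f.read(chunk_size):
                pass
    except (EOFError, OSError):
        return False
    return True
```

On a full pages-articles dump this reads the entire file and can take a
while. The command-line equivalent is "bzip2 -t <file>".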

Regards,
JC

On Tue, Feb 10, 2015 at 3:05 PM, Anupam Mishra
<anupam.nihil...@gmail.com> wrote:
> Hi All,
>
> I have downloaded the DBpedia extraction framework and am trying to extract
> enwiki-20150113-pages-articles.xml.bz2 using the command mvn scala:run
> "-Dlauncher=extraction" "-DaddArgs=extraction.default.properties", but I am
> getting the following exception.
>
> WARNING: Error parsing title: found namespace 0/Main, expected 4/Project in title Wikidata:Notability/eo
> Feb 10, 2015 7:21:29 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Notability/3/eo
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Introduction/Page display title/uk
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 4/Project in title Wikidata:Introduction/uk
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Introduction/1/uk
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Introduction/2/uk
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Introduction/3/uk
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Introduction/4/uk
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Introduction/5/uk
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Glossary/23/he
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Introduction/6/uk
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Introduction/7/uk
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Introduction/8/uk
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Introduction/9/uk
> Feb 10, 2015 7:22:01 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Introduction/22/uk
> Feb 10, 2015 7:22:04 PM org.dbpedia.extraction.sources.WikipediaDumpParser readPage
> WARNING: Error parsing title: found namespace 0/Main, expected 1198/Namespace 1198 in title Translations:Wikidata:Introduction/10/uk
> java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:483)
>         at scala_maven_executions.MainHelper.runMain(MainHelper.java:164)
>         at scala_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
> Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[4068918,3920]
> Message: unexpected end of stream
>         at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:596)
>         at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.getElementText(XMLStreamReaderImpl.java:862)
>         at org.dbpedia.extraction.sources.WikipediaDumpParser.readString(WikipediaDumpParser.java:395)
>         at org.dbpedia.extraction.sources.WikipediaDumpParser.readRevision(WikipediaDumpParser.java:290)
>         at org.dbpedia.extraction.sources.WikipediaDumpParser.readPage(WikipediaDumpParser.java:248)
>         at org.dbpedia.extraction.sources.WikipediaDumpParser.readPages(WikipediaDumpParser.java:187)
>         at org.dbpedia.extraction.sources.WikipediaDumpParser.readDump(WikipediaDumpParser.java:145)
>         at org.dbpedia.extraction.sources.WikipediaDumpParser.run(WikipediaDumpParser.java:116)
>         at org.dbpedia.extraction.sources.XMLReaderSource.foreach(XMLSource.scala:112)
>         at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:252)
>         at org.dbpedia.extraction.sources.XMLReaderSource.flatMap(XMLSource.scala:108)
>         at org.dbpedia.extraction.mappings.Redirects$.loadFromSource(Redirects.scala:171)
>         at org.dbpedia.extraction.mappings.Redirects$.load(Redirects.scala:122)
>         at org.dbpedia.extraction.dump.extract.ConfigLoader$$anon$1.<init>(ConfigLoader.scala:101)
>         at org.dbpedia.extraction.dump.extract.ConfigLoader.org$dbpedia$extraction$dump$extract$ConfigLoader$$createExtractionJob(ConfigLoader.scala:53)
>         at org.dbpedia.extraction.dump.extract.ConfigLoader$$anonfun$getExtractionJobs$1.apply(ConfigLoader.scala:40)
>         at org.dbpedia.extraction.dump.extract.ConfigLoader$$anonfun$getExtractionJobs$1.apply(ConfigLoader.scala:40)
>         at scala.collection.TraversableViewLike$Mapped$$anonfun$foreach$2.apply(TraversableViewLike.scala:169)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:743)
>         at scala.collection.immutable.RedBlackTree$TreeIterator.foreach(RedBlackTree.scala:468)
>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>         at scala.collection.IterableLike$$anon$1.foreach(IterableLike.scala:310)
>         at scala.collection.TraversableViewLike$Mapped$class.foreach(TraversableViewLike.scala:168)
>         at scala.collection.IterableViewLike$$anon$3.foreach(IterableViewLike.scala:113)
>         at org.dbpedia.extraction.dump.extract.Extraction$.main(Extraction.scala:30)
>         at org.dbpedia.extraction.dump.extract.Extraction.main(Extraction.scala)
>
>
> Thanks & Regards,
> Anupam
>
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming. The Go Parallel Website,
> sponsored by Intel and developed in partnership with Slashdot Media, is your
> hub for all things parallel software development, from weekly thought
> leadership blogs to news, videos, case studies, tutorials and more. Take a
> look and join the conversation now. http://goparallel.sourceforge.net/
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>
