Hi Jona,

I only replaced 0.8 with 0.6 (marked in bold). Also, this is not the entire
English dump but a fraction of it, available from the Wikimedia dumps. It is
only 144 MB. The file is available at
http://dumps.wikimedia.org/enwiki/20130204/ under the name
"enwiki-20130204-pages-articles1.xml-p000000010p000010000.bz2".
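
Since this file is only a fraction of the full dump, it may help to check how many pages it contains per namespace before importing. The following is a minimal sketch and not part of the DBpedia framework: the function name count_pages_by_namespace is made up for illustration, and it assumes that every <page> element has an <ns> child, which I have not verified for every export schema version.

```python
import bz2
import collections
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def count_pages_by_namespace(path):
    """Stream through a (possibly .bz2-compressed) dump and count pages per <ns> value."""
    opener = bz2.open if path.endswith(".bz2") else open
    counts = collections.Counter()
    with opener(path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if local_name(elem.tag) == "page":
                # Assumption: the namespace number is stored in an <ns> child.
                ns = next((c.text for c in elem if local_name(c.tag) == "ns"), None)
                counts[ns] += 1
                elem.clear()  # drop the page's children to keep memory use low
    return counts
```

On Wikipedia, namespace 10 is Template. If the count for "10" turns out to be zero, an import restricted to Template pages would have nothing to load from this file.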


Below are the first 10 lines of the file. The changes are marked with
asterisks. There is no other change in the file.

"
<mediawiki xmlns="http://www.mediawiki.org/xml/export-*0.6*/"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.mediawiki.org/xml/export-*0.6*/ http://www.mediawiki.org/xml/export-*0.6*.xsd"
  version="*0.6*" xml:lang="en">
  <siteinfo>
    <sitename>Wikipedia</sitename>
    <base>http://en.wikipedia.org/wiki/Main_Page</base>
    <generator>MediaWiki 1.21wmf8</generator>
    <case>first-letter</case>
    <namespaces>
      <namespace key="-2" case="first-letter">Media</namespace>
      <namespace key="-1" case="first-letter">Special</namespace>
      <namespace key="0" case="first-letter" />
"
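
To double-check which export namespace a dump file actually declares, without opening it in an editor, something like the following could be used. This is only a sketch under my own assumptions: the helper name root_namespace is hypothetical, and it relies on the dump being either plain XML or a .bz2 file.

```python
import bz2
import xml.etree.ElementTree as ET


def root_namespace(path):
    """Return (namespace URI, local tag name) of the root element of an XML file."""
    opener = bz2.open if path.endswith(".bz2") else open
    with opener(path, "rb") as stream:
        # The first "start" event is always the root element, so the loop
        # returns after reading only the beginning of the file.
        for _event, elem in ET.iterparse(stream, events=("start",)):
            tag = elem.tag
            if tag.startswith("{"):
                uri, name = tag[1:].split("}", 1)
                return uri, name
            return None, tag
    return None, None
```

For the file above, the first element of the result should be the export-0.6 URI if the edit was applied cleanly. Any other value would point to a problem with the edit.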

On Thu, Mar 28, 2013 at 8:20 PM, Jona Christopher Sahnwaldt <j...@sahnwaldt.de> wrote:

>
> On Mar 28, 2013 2:02 PM, "gaurav pant" <golup...@gmail.com> wrote:
> >
> > Hi All,
> >
> > Importing the pages-articles dump into the MySQL database is unsuccessful.
> > First I received this exception:
> > "
> > Caused by: javax.xml.stream.XMLStreamException: ParseError at
> [row,col]:[1,249]
> > Message: expected <mediawiki> with namespace [
> http://www.mediawiki.org/xml/export-0.6/], found [
> http://www.mediawiki.org/xml/export-0.8/]
> > "
> >
> > To resolve this, I changed the MediaWiki export version from 0.8 to 0.6 in the
> "20130325/enwiki-20130325-pages-articles.xm" file.
>
> How did you do that? Maybe the file was damaged?
>
> How large is the file? Please send us the results of the following
> commands:
>
> ls -l enwiki-20130325-pages-articles.xml
> head -100 enwiki-20130325-pages-articles.xml
> tail -100 enwiki-20130325-pages-articles.xml
>
> >
> > After doing this I no longer get the above exception, but no data is
> being imported into the database.
> >
> > "
> > [INFO] No sources to compile
> > [INFO]
> > [INFO] --- maven-scala-plugin:2.15.2:testCompile (test-compile) @ dump
> ---
> >
> > [INFO] Checking for multiple versions of scala
> > [INFO] includes = [**/*.scala,**/*.java,]
> > [INFO] excludes = []
> > [WARNING] No source files found.
> > [INFO]
> > [INFO] <<< maven-scala-plugin:2.15.2:run (default-cli) @ dump <<<
> > [INFO]
> > [INFO] --- maven-scala-plugin:2.15.2:run (default-cli) @ dump ---
> > [INFO] Checking for multiple versions of scala
> > [INFO] launcher 'import' selected =>
> org.dbpedia.extraction.dump.sql.Import
> > importing pages in namespaces [Template] from
> /mnt/ebs/framework/test_dump/enwiki/20130325/enwiki-20130325-pages-articles.xml
> to database enwiki on server
> localhost:3306/?characterEncoding=UTF-8&user=testuser&password=testpass
> > imported 0 pages in 4941 millis (Infinity millis per page)
>
> 5 seconds is much too fast. The file should be several dozen GB in size. Not
> even RAM would be fast enough to read 50 GB in 5 seconds, let alone a hard
> drive.
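
To put a rough number on this, here is a quick sketch. The 50 GB figure is only the estimate mentioned above, the 4941 ms comes from the log line, and the function name is made up for illustration.

```python
def required_throughput_gb_per_s(size_gb, elapsed_ms):
    """Average read rate needed to get through size_gb in elapsed_ms milliseconds."""
    return size_gb / (elapsed_ms / 1000.0)


# 50 GB is the rough size estimate; 4941 ms is the elapsed time from the log.
rate = required_throughput_gb_per_s(50, 4941)
print(f"{rate:.1f} GB/s")  # prints "10.1 GB/s"
```

An average of roughly 10 GB/s would be needed, so the importer most likely stopped after reading only a small part of the input.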
>
> > imported  pages in namespaces [Template] from
> /mnt/ebs/framework/test_dump/enwiki/20130325/enwiki-20130325-pages-articles.xml
> to database enwiki on server
> localhost:3306/?characterEncoding=UTF-8&user=testuser&password=testpass
> > [INFO]
> ------------------------------------------------------------------------
> > [INFO] BUILD SUCCESS
> > [INFO]
> ------------------------------------------------------------------------
> > [INFO] Total time: 13.247s
> > [INFO] Finished at: Thu Mar 28 12:51:19 UTC 2013
> >
> > [INFO] Final Memory: 8M/56M
> > [INFO]
> ------------------------------------------------------------------------
> > "
> >
> >
> > Please let me know what the problem is. Was my previous approach to
> resolving the exception wrong?
> >
> > Thanks
> >
> >
> >
> > On Tue, Mar 26, 2013 at 12:44 AM, Jona Christopher Sahnwaldt <
> j...@sahnwaldt.de> wrote:
> >>
> >> On 25 March 2013 19:35, gaurav pant <golup...@gmail.com> wrote:
> >> >
> >> > Hi All/Jona,
> >> >
> >> > With the updated dump I am able to import data into MySQL. Now the issue
> >> > is with abstract extraction. Thanks, Jona, for all the help.
> >> >
> >> > I am getting the errors below for many files during ../clean-install-run
> >> > extraction extraction.abstracts.properties.
> >> >
> >> > "
> >> > [Mon Mar 25 16:13:31 2013] [error] [client 10.169.15.110] PHP Warning:
> >> >
> require_once(/mnt/ebs/framework/media_wiki/wikimedia/extensions/Babel/Babel.php):
> >> > failed to open stream: No such file or directory in
> >> > /mnt/ebs/framework/media_wiki/wikimedia/LocalSettings.php on line 144
> >> >
> >> > [Mon Mar 25 17:26:15 2013] [error] [client 127.0.0.1] PHP Warning:
> >> >
> require_once(/mnt/ebs/framework/media_wiki/wikimedia/extensions/CategoryTree/CategoryTree.php):
> >> > failed to open stream: No such file or directory in
> >> > /mnt/ebs/framework/media_wiki/wikimedia/LocalSettings.php on line 145
> >> > "
> >>
> >> Please send us line 145 of LocalSettings.php.
> >>
> >> >
> >> > I found all the required files specified in CategoryTree.php in
> >> > "mw-modified.tar.gz", which is not used by the new abstract extraction (as
> >> > per the previous discussion). So it seems to me that I am using the old
> >> > MediaWiki.
> >> >
> >> >
> >> > Can you please let me know the exact path of the new MediaWiki, or any
> >> > other suggestion?
> >>
> >> First, use the appropriate MediaWiki version. I would recommend using
> >> the exact version for your Wikipedia dump file. You can find the
> >> version in one of the first few lines of the dump file, or at
> >> http://en.wikipedia.org/wiki/Special:Version, for example 1.21wmf12.
> >> MediaWiki 1.21 has no stable release for download yet, and
> >> sometimes the "wmf" branches (adapted for the Wikimedia Foundation) contain
> >> significant changes, so you should check out the exact version from
> >> Wikimedia's git repo:
> >>
> >>
> https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=tree;hb=refs/heads/wmf/1.21wmf12
> >>
> >> That's how I installed MediaWiki when I last extracted the abstracts a
> >> few months ago.
> >>
> >> You also need to install the extensions listed on
> >> http://en.wikipedia.org/wiki/Special:Version . You don't need all of
> >> them, but most. When you check out the code from git, the appropriate
> >> extensions may already be included. Maybe you have to use a special
> >> git flag to get them, something like --sub-repositories or
> >> --follow-references, I don't remember. I hope the git experts on this
> >> list can help.
> >>
> >> Finally, to modify MediaWiki for DBpedia, just copy the three files from
> >>
> >>
> https://github.com/dbpedia/extraction-framework/tree/master/dump/src/main/mediawiki
> >>
> >> into your MediaWiki installation in the appropriate places. Note:
> >> these files may be OUTDATED. In other words, the MediaWiki versions of
> >> these files have probably changed since we last modified them. Maybe
> >> it's better if you apply our patches to the current files. For
> >> example, here's what we changed in ApiParse.php:
> >>
> >>
> https://github.com/dbpedia/extraction-framework/commit/e36913dabe0715672cbf0f2e6c5d86ec424b08b3
> >>
> >> Hope that helps.
> >>
> >> JC
> >>
> >> >
> >> > On Mon, Mar 25, 2013 at 3:01 PM, Jona Christopher Sahnwaldt
> >> > <j...@sahnwaldt.de> wrote:
> >> >>
> >> >> Did you update the code to the latest version (git pull, update,
> etc.)?
> >> >>
> >> >> On Mar 25, 2013 8:33 AM, "gaurav pant" <golup...@gmail.com> wrote:
> >> >>>
> >> >>> Hi Jona/All,
> >> >>>
> >> >>> I have changed pom.xml as shown below, but I am getting an error.
> >> >>> "
> >> >>> <launcher>
> >> >>>                             <id>import</id>
> >> >>>
> >> >>> <mainClass>org.dbpedia.extraction.dump.sql.Import</mainClass>
> >> >>>                             <jvmArgs>
> >> >>>                                 <jvmArg>-server</jvmArg>
> >> >>>                             </jvmArgs>
> >> >>>                             <args>
> >> >>>
> <arg>/mnt/ebs/framework/test_dump</arg>
> >> >>>
> >> >>>
> <arg>/mnt/ebs/framework/media_wiki/wikimedia/maintenance/tables.sql</arg>
> >> >>>
> >> >>>
> <arg>jdbc:mysql://localhost:3306/?characterEncoding=UTF-8&amp;user=user-name&amp;password=my_password</arg><!--
> >> >>> MySQL host:port -->
> >> >>>                                 <arg>false</arg><!--
> >> >>> require-download-complete -->
> >> >>>                                 <arg>en</arg><!-- languages and
> article
> >> >>> count ranges, comma-separated -->
> >> >>>                             </args>
> >> >>>                         </launcher>
> >> >>> "
> >> >>>
> >> >>> I am getting below error--
> >> >>>
> >> >>> java.lang.reflect.InvocationTargetException
> >> >>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> >>>     at
> >> >>>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >> >>>     at
> >> >>>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> >>>     at java.lang.reflect.Method.invoke(Method.java:601)
> >> >>>     at
> >> >>>
> org_scala_tools_maven_executions.MainHelper.runMain(MainHelper.java:161)
> >> >>>     at
> >> >>>
> org_scala_tools_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
> >> >>> Caused by:
> >> >>>
> com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Cannot
> >> >>> load connection class because of underlying exception:
> >> >>> 'java.lang.NumberFormatException: For input string: "mysql:"'.
> >> >>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> >> >>> Method)
> >> >>>     at
> >> >>>
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> >> >>>     at
> >> >>>
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> >> >>>     at
> java.lang.reflect.Constructor.newInstance(Constructor.java:525)
> >> >>>     at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
> >> >>>     at com.mysql.jdbc.Util.getInstance(Util.java:386)
> >> >>>     at
> com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1013)
> >> >>>     at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987)
> >> >>>     at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982)
> >> >>>     at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
> >> >>>     at
> >> >>>
> com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:314)
> >> >>>     at org.dbpedia.extraction.dump.sql.Import$.main(Import.scala:39)
> >> >>>     at org.dbpedia.extraction.dump.sql.Import.main(Import.scala)
> >> >>>     ... 6 more
> >> >>> Caused by: java.lang.NumberFormatException: For input string:
> "mysql:"
> >> >>>     at
> >> >>>
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> >> >>>     at java.lang.Integer.parseInt(Integer.java:492)
> >> >>>     at java.lang.Integer.parseInt(Integer.java:527)
> >> >>>     at
> >> >>>
> com.mysql.jdbc.NonRegisteringDriver.port(NonRegisteringDriver.java:831)
> >> >>>     at
> >> >>>
> com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:305)
> >> >>>     ... 8 more
> >> >>> [INFO]
> >> >>>
> ------------------------------------------------------------------------
> >> >>> [INFO] BUILD FAILURE
> >> >>> [INFO]
> >> >>>
> ------------------------------------------------------------------------
> >> >>> [INFO] Total time: 9.228s
> >> >>> [INFO] Finished at: Mon Mar 25 06:11:54 UTC 2013
> >> >>> [INFO] Final Memory: 8M/56M
> >> >>> [INFO]
> >> >>>
> ------------------------------------------------------------------------
> >> >>> [ERROR] Failed to execute goal
> >> >>> org.scala-tools:maven-scala-plugin:2.15.2:run (default-cli) on
> project dump:
> >> >>> wrap: org.apache.commons.exec.ExecuteException: Process exited with
> an
> >> >>> error: 240(Exit value: 240) -> [Help 1]
> >> >>>
> >> >>> On Mon, Mar 25, 2013 at 6:54 AM, Jona Christopher Sahnwaldt
> >> >>> <j...@sahnwaldt.de> wrote:
> >> >>>>
> >> >>>> On 20 March 2013 15:38, Mohamed Morsey
> >> >>>> <mor...@informatik.uni-leipzig.de> wrote:
> >> >>>> > Hi Jona and all,
> >> >>>> >
> >> >>>> >
> >> >>>> > On 03/20/2013 03:25 PM, Jona Christopher Sahnwaldt wrote:
> >> >>>> >
> >> >>>> > On Wed, Mar 20, 2013 at 3:01 PM, gaurav pant <golup...@gmail.com
> >
> >> >>>> > wrote:
> >> >>>> >
> >> >>>> >> Hi Morsy/All,
> >> >>>> >>
> >> >>>> >> While running Import.sh I am getting the error below.
> >> >>>> >
> >> >>>> > Don't use import.sh.
> >> >>>> >
> >> >>>> >
> >> >>>> >>
> >> >>>> >> missing
> >> >>>> >>
> >> >>>> >>
> >> >>>> >>
> /home/gaurav/other_lang_extraction/extraction-framework-master/dump/wiki_dump/dewiki/tables-no-indexes.sql
> >> >>>> >> missing
> >> >>>> >>
> >> >>>> >>
> >> >>>> >>
> /home/gaurav/other_lang_extraction/extraction-framework-master/dump/wiki_dump/dewiki/tables-only-indexes.sql
> >> >>>> >>
> >> >>>> >>
> >> >>>> >>
> >> >>>> >>
> "/home/gaurav/other_lang_extraction/extraction-framework-master/dump/wiki_dump/dewiki/"
> >> >>>> >> is the directory where my dump exists.
> >> >>>> >>
> >> >>>> >> I think some indexes need to be created before the actual import
> starts, and
> >> >>>> >> the required index files are missing. Where can I get these files?
> >> >>>> >>
> >> >>>> >> I am following
> >> >>>> >>
> >> >>>> >>
> >> >>>> >> "
> https://github.com/dbpedia/dbpedia/blob/master/abstractExtraction/README.txt
> "
> >> >>>> >
> >> >>>> > See the text at the top of this file:
> >> >>>> >
> >> >>>> > OUTDATED! EVERYTHING IN THIS DIRECTORY,
> >> >>>> > INCLUDING THE INSTRUCTIONS BELOW,
> >> >>>> > IS OUTDATED. THE CURRENT CODE IS IN
> >> >>>> > dbpedia/extraction-framework/dump
> >> >>>> >
> >> >>>> >
> >> >>>> >
> >> >>>> > Sorry for that, I overlooked that note.
> >> >>>>
> >> >>>> All right, I just made some changes that hopefully make it really
> >> >>>> painfully obvious that that stuff is outdated. :-)
> >> >>>>
> >> >>>> >
> >> >>>> >
> >> >>>> > --
> >> >>>> > Kind Regards
> >> >>>> > Mohamed Morsey
> >> >>>> > Department of Computer Science
> >> >>>> > University of Leipzig
> >> >>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Regards
> >> >>> Gaurav Pant
> >> >>> +91-7709196607,+91-9405757794
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Regards
> >> > Gaurav Pant
> >> > +91-7709196607,+91-9405757794
> >
> >
> >
> >
> > --
> > Regards
> > Gaurav Pant
> > +91-7709196607,+91-9405757794
>



-- 
Regards
Gaurav Pant
+91-7709196607,+91-9405757794
_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
