Hi Jona,

You said that "The importer only imports templates". Then how does the page
data get imported into the database?
If the page data is not imported into the database, how do the abstracts get
extracted?
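
For reference, here is a minimal Python sketch (not part of the DBpedia
framework) of how one might inspect a dump file before importing it. It
reports the export namespace of the root element (0.6 vs. 0.8, as in the
exception quoted below) and counts the pages whose title starts with
"Template:". The names summarize_dump and summarize_bz2 are made up for this
example, and the "Template:" title prefix is an assumption that only holds
for English dumps.

```python
import bz2
import xml.etree.ElementTree as ET


def summarize_dump(stream):
    """Return (export_namespace, page_count, template_count) for a dump stream."""
    export_ns = None
    root = None
    pages = 0
    templates = 0
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        # Namespaced tags look like "{http://www.mediawiki.org/xml/export-0.8/}page".
        if elem.tag.startswith("{"):
            ns, local = elem.tag[1:].split("}", 1)
            prefix = "{" + ns + "}"
        else:
            ns, local, prefix = "", elem.tag, ""
        if event == "start":
            if root is None:
                # The first element seen is the <mediawiki> root.
                root, export_ns = elem, ns
            continue
        if local == "page":
            pages += 1
            title = elem.findtext(prefix + "title") or ""
            # Assumption: English dump, where template titles start with "Template:".
            if title.startswith("Template:"):
                templates += 1
            # Drop finished pages so memory does not grow with the dump size.
            root.clear()
    return export_ns, pages, templates


def summarize_bz2(path):
    """Hypothetical helper: run summarize_dump on a .bz2 dump without unpacking it."""
    with bz2.open(path, "rb") as dump:
        return summarize_dump(dump)
```

Calling summarize_bz2 on the 144 MB partial file would show whether it
contains any template pages at all. If the template count is 0, an import
restricted to the Template namespace has nothing to import from that file.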


On Thu, Mar 28, 2013 at 8:55 PM, Jona Christopher Sahnwaldt <j...@sahnwaldt.de> wrote:

>
> On Mar 28, 2013 4:14 PM, "gaurav pant" <golup...@gmail.com> wrote:
> >
> > Hi Jona,
> >
> > I have only replaced 0.8 with 0.6 (marked bold). This is not the entire
> > English file but a fraction of it, available at dbpedia dumps. It is only
> > 144 MB. The file is available at
> > http://dumps.wikimedia.org/enwiki/20130204/ , named
> > "enwiki-20130204-pages-articles1.xml-p000000010p000010000.bz2".
>
> Well, then there probably are no templates in this file. The importer only
> imports templates. Look for pages with title Template:... If there aren't
> any, try another file.
>
> >
> >
> > Below are the top 10 lines of the file. Changes are marked red. There is
> > no other change in the file.
> >
> > "
> > <mediawiki xmlns="http://www.mediawiki.org/xml/export-0.6/"
> >   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >   xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.6/ http://www.mediawiki.org/xml/export-0.6.xsd"
> >   version="0.6" xml:lang="en">
> >   <siteinfo>
> >     <sitename>Wikipedia</sitename>
> >     <base>http://en.wikipedia.org/wiki/Main_Page</base>
> >     <generator>MediaWiki 1.21wmf8</generator>
> >     <case>first-letter</case>
> >     <namespaces>
> >       <namespace key="-2" case="first-letter">Media</namespace>
> >       <namespace key="-1" case="first-letter">Special</namespace>
> >       <namespace key="0" case="first-letter" />
> >
> > "
> >
> > On Thu, Mar 28, 2013 at 8:20 PM, Jona Christopher Sahnwaldt <
> j...@sahnwaldt.de> wrote:
> >>
> >>
> >> On Mar 28, 2013 2:02 PM, "gaurav pant" <golup...@gmail.com> wrote:
> >> >
> >> > Hi All,
> >> >
> >> > Importing the pages-articles dump into the MySQL database is unsuccessful.
> >> > First I received this exception:
> >> > "
> >> > Caused by: javax.xml.stream.XMLStreamException: ParseError at
> [row,col]:[1,249]
> >> > Message: expected <mediawiki> with namespace [
> http://www.mediawiki.org/xml/export-0.6/], found [
> http://www.mediawiki.org/xml/export-0.8/]
> >> > "
> >> >
> >> > To resolve this, I changed the mediawiki version from 0.8 to 0.6 in the
> >> > "20130325/enwiki-20130325-pages-articles.xm" file.
> >>
> >> How did you do that? Maybe the file was damaged?
> >>
> >> How large is the file? Please send us the output of the following
> >> commands:
> >>
> >> ls -l enwiki-20130325-pages-articles.xml
> >> head -100 enwiki-20130325-pages-articles.xml
> >> tail -100 enwiki-20130325-pages-articles.xml
> >>
> >> >
> >> > After doing this I no longer get the above exception, but no data is
> >> > being imported into the database.
> >> >
> >> > "
> >> > INFO] No sources to compile
> >> > [INFO]
> >> > [INFO] --- maven-scala-plugin:2.15.2:testCompile (test-compile) @
> dump ---
> >> >
> >> > [INFO] Checking for multiple versions of scala
> >> > [INFO] includes = [**/*.scala,**/*.java,]
> >> > [INFO] excludes = []
> >> > [WARNING] No source files found.
> >> > [INFO]
> >> > [INFO] <<< maven-scala-plugin:2.15.2:run (default-cli) @ dump <<<
> >> > [INFO]
> >> > [INFO] --- maven-scala-plugin:2.15.2:run (default-cli) @ dump ---
> >> > [INFO] Checking for multiple versions of scala
> >> > [INFO] launcher 'import' selected =>
> org.dbpedia.extraction.dump.sql.Import
> >> > importing pages in namespaces [Template] from
> /mnt/ebs/framework/test_dump/enwiki/20130325/enwiki-20130325-pages-articles.xml
> to database enwiki on server
> localhost:3306/?characterEncoding=UTF-8&user=testuser&password=testpass
> >> > imported 0 pages in 4941 millis (Infinity millis per page)
> >>
> >> 5 seconds is much too fast. The file should be several dozen GB. Not
> >> even RAM would be fast enough to read 50 GB in 5 seconds, let alone a hard
> >> drive.
> >>
> >> > imported  pages in namespaces [Template] from
> /mnt/ebs/framework/test_dump/enwiki/20130325/enwiki-20130325-pages-articles.xml
> to database enwiki on server
> localhost:3306/?characterEncoding=UTF-8&user=testuser&password=testpass
> >> > [INFO]
> ------------------------------------------------------------------------
> >> > [INFO] BUILD SUCCESS
> >> > [INFO]
> ------------------------------------------------------------------------
> >> > [INFO] Total time: 13.247s
> >> > [INFO] Finished at: Thu Mar 28 12:51:19 UTC 2013
> >> >
> >> > [INFO] Final Memory: 8M/56M
> >> > [INFO]
> ------------------------------------------------------------------------
> >> > "
> >> >
> >> >
> >> > Please let me know what the problem is. Is my approach to resolving
> >> > the exception wrong?
> >> >
> >> > Thanks
> >> >
> >> >
> >> >
> >> > On Tue, Mar 26, 2013 at 12:44 AM, Jona Christopher Sahnwaldt <
> j...@sahnwaldt.de> wrote:
> >> >>
> >> >> On 25 March 2013 19:35, gaurav pant <golup...@gmail.com> wrote:
> >> >> >
> >> >> > Hi All/Jona,
> >> >> >
> >> >> > With the updated dump I am able to import data into MySQL. Now the
> >> >> > issue is with abstract extraction. Thanks, Jona, for all the help.
> >> >> >
> >> >> > I am getting the errors below for many files while running
> >> >> > "../clean-install-run extraction extraction.abstracts.properties".
> >> >> >
> >> >> > "
> >> >> > Mon Mar 25 16:13:31 2013] [error] [client 10.169.15.110] PHP
> Warning:
> >> >> >
> require_once(/mnt/ebs/framework/media_wiki/wikimedia/extensions/Babel/Babel.php):
> >> >> > failed to open stream: No such file or directory in
> >> >> > /mnt/ebs/framework/media_wiki/wikimedia/LocalSettings.php on line
> 144
> >> >> >
> >> >> > [Mon Mar 25 17:26:15 2013] [error] [client 127.0.0.1] PHP Warning:
> >> >> >
> require_once(/mnt/ebs/framework/media_wiki/wikimedia/extensions/CategoryTree/CategoryTree.php):
> >> >> > failed to open stream: No such file or directory in
> >> >> > /mnt/ebs/framework/media_wiki/wikimedia/LocalSettings.php on line
> 145
> >> >> > "
> >> >>
> >> >> Please send us line 145 of LocalSettings.php.
> >> >>
> >> >> >
> >> >> > I can find all the required files specified in CategoryTree.php in
> >> >> > "mw-modified.tar.gz", which is not used by the new abstractor (as per
> >> >> > the previous discussion). So it seems to me that I am using an old
> >> >> > "mediawiki".
> >> >> >
> >> >> >
> >> >> > Can you please let me know the exact path of the new mediawiki, or give
> >> >> > any other suggestion?
> >> >>
> >> >> First, use the appropriate MediaWiki version. I recommend using
> >> >> the exact version for your Wikipedia dump file. You can find the
> >> >> version in one of the first few lines of the dump file, or at
> >> >> http://en.wikipedia.org/wiki/Special:Version, for example 1.21wmf12.
> >> >> MediaWiki 1.21 has no stable release for download yet, and
> >> >> sometimes the "wmf" branches (adapted for the Wikimedia Foundation)
> >> >> contain significant changes, so you should check out the exact version
> >> >> from Wikimedia's git repo:
> >> >>
> >> >>
> https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=tree;hb=refs/heads/wmf/1.21wmf12
> >> >>
> >> >> That's how I installed MediaWiki when I last extracted the abstracts
> >> >> a few months ago.
> >> >>
> >> >> You also need to install the extensions listed on
> >> >> http://en.wikipedia.org/wiki/Special:Version . You don't need all of
> >> >> them, but most. When you check out the code from git, the appropriate
> >> >> extensions may already be included. Maybe you have to use a special
> >> >> git flag to get them, something like --sub-repositories or
> >> >> --follow-references, I don't remember. I hope the git experts on this
> >> >> list can help.
> >> >>
> >> >> Finally, to modify MediaWiki for DBpedia, just copy the three files from
> >> >>
> >> >>
> https://github.com/dbpedia/extraction-framework/tree/master/dump/src/main/mediawiki
> >> >>
> >> >> into your MediaWiki installation in the appropriate places. Note:
> >> >> these files may be OUTDATED. In other words, the MediaWiki versions
> >> >> of these files have probably changed since we last modified them. Maybe
> >> >> it's better if you apply our patches to the current files. For
> >> >> example, here's what we changed in ApiParse.php:
> >> >>
> >> >>
> https://github.com/dbpedia/extraction-framework/commit/e36913dabe0715672cbf0f2e6c5d86ec424b08b3
> >> >>
> >> >> Hope that helps.
> >> >>
> >> >> JC
> >> >>
> >> >> >
> >> >> > On Mon, Mar 25, 2013 at 3:01 PM, Jona Christopher Sahnwaldt
> >> >> > <j...@sahnwaldt.de> wrote:
> >> >> >>
> >> >> >> Did you update the code to the latest version (git pull, update, etc.)?
> >> >> >>
> >> >> >> On Mar 25, 2013 8:33 AM, "gaurav pant" <golup...@gmail.com>
> wrote:
> >> >> >>>
> >> >> >>> Hi Jona/All,
> >> >> >>>
> >> >> >>> I have changed pom.xml as shown below, but I am getting an error.
> >> >> >>> "
> >> >> >>> <launcher>
> >> >> >>>     <id>import</id>
> >> >> >>>     <mainClass>org.dbpedia.extraction.dump.sql.Import</mainClass>
> >> >> >>>     <jvmArgs>
> >> >> >>>         <jvmArg>-server</jvmArg>
> >> >> >>>     </jvmArgs>
> >> >> >>>     <args>
> >> >> >>>         <arg>/mnt/ebs/framework/test_dump</arg>
> >> >> >>>         <arg>/mnt/ebs/framework/media_wiki/wikimedia/maintenance/tables.sql</arg>
> >> >> >>>         <arg>jdbc:mysql://localhost:3306/?characterEncoding=UTF-8&amp;user=user-name&amp;password=my_password</arg><!-- MySQL host:port -->
> >> >> >>>         <arg>false</arg><!-- require-download-complete -->
> >> >> >>>         <arg>en</arg><!-- languages and article count ranges, comma-separated -->
> >> >> >>>     </args>
> >> >> >>> </launcher>
> >> >> >>> "
> >> >> >>>
> >> >> >>> I am getting the error below:
> >> >> >>>
> >> >> >>> java.lang.reflect.InvocationTargetException
> >> >> >>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> >> >> >>>     at
> >> >> >>>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >> >> >>>     at
> >> >> >>>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> >> >>>     at java.lang.reflect.Method.invoke(Method.java:601)
> >> >> >>>     at
> >> >> >>>
> org_scala_tools_maven_executions.MainHelper.runMain(MainHelper.java:161)
> >> >> >>>     at
> >> >> >>>
> org_scala_tools_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
> >> >> >>> Caused by:
> >> >> >>>
> com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Cannot
> >> >> >>> load connection class because of underlying exception:
> >> >> >>> 'java.lang.NumberFormatException: For input string: "mysql:"'.
> >> >> >>>     at
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> >> >> >>> Method)
> >> >> >>>     at
> >> >> >>>
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> >> >> >>>     at
> >> >> >>>
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> >> >> >>>     at
> java.lang.reflect.Constructor.newInstance(Constructor.java:525)
> >> >> >>>     at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
> >> >> >>>     at com.mysql.jdbc.Util.getInstance(Util.java:386)
> >> >> >>>     at
> com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1013)
> >> >> >>>     at
> com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987)
> >> >> >>>     at
> com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982)
> >> >> >>>     at
> com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
> >> >> >>>     at
> >> >> >>>
> com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:314)
> >> >> >>>     at
> org.dbpedia.extraction.dump.sql.Import$.main(Import.scala:39)
> >> >> >>>     at org.dbpedia.extraction.dump.sql.Import.main(Import.scala)
> >> >> >>>     ... 6 more
> >> >> >>> Caused by: java.lang.NumberFormatException: For input string:
> "mysql:"
> >> >> >>>     at
> >> >> >>>
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> >> >> >>>     at java.lang.Integer.parseInt(Integer.java:492)
> >> >> >>>     at java.lang.Integer.parseInt(Integer.java:527)
> >> >> >>>     at
> >> >> >>>
> com.mysql.jdbc.NonRegisteringDriver.port(NonRegisteringDriver.java:831)
> >> >> >>>     at
> >> >> >>>
> com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:305)
> >> >> >>>     ... 8 more
> >> >> >>> [INFO]
> >> >> >>>
> ------------------------------------------------------------------------
> >> >> >>> [INFO] BUILD FAILURE
> >> >> >>> [INFO]
> >> >> >>>
> ------------------------------------------------------------------------
> >> >> >>> [INFO] Total time: 9.228s
> >> >> >>> [INFO] Finished at: Mon Mar 25 06:11:54 UTC 2013
> >> >> >>> [INFO] Final Memory: 8M/56M
> >> >> >>> [INFO]
> >> >> >>>
> ------------------------------------------------------------------------
> >> >> >>> [ERROR] Failed to execute goal
> >> >> >>> org.scala-tools:maven-scala-plugin:2.15.2:run (default-cli) on
> project dump:
> >> >> >>> wrap: org.apache.commons.exec.ExecuteException: Process exited
> with an
> >> >> >>> error: 240(Exit value: 240) -> [Help 1]
> >> >> >>>
> >> >> >>> On Mon, Mar 25, 2013 at 6:54 AM, Jona Christopher Sahnwaldt
> >> >> >>> <j...@sahnwaldt.de> wrote:
> >> >> >>>>
> >> >> >>>> On 20 March 2013 15:38, Mohamed Morsey
> >> >> >>>> <mor...@informatik.uni-leipzig.de> wrote:
> >> >> >>>> > Hi Jona and all,
> >> >> >>>> >
> >> >> >>>> >
> >> >> >>>> > On 03/20/2013 03:25 PM, Jona Christopher Sahnwaldt wrote:
> >> >> >>>> >
> >> >> >>>> > On Wed, Mar 20, 2013 at 3:01 PM, gaurav pant <
> golup...@gmail.com>
> >> >> >>>> > wrote:
> >> >> >>>> >
> >> >> >>>> >> Hi Morsy/All,
> >> >> >>>> >>
> >> >> >>>> >> While running Import.sh I am getting the error below.
> >> >> >>>> >
> >> >> >>>> > Don't use import.sh.
> >> >> >>>> >
> >> >> >>>> >
> >> >> >>>> >>
> >> >> >>>> >> missing
> >> >> >>>> >>
> >> >> >>>> >>
> >> >> >>>> >>
> /home/gaurav/other_lang_extraction/extraction-framework-master/dump/wiki_dump/dewiki/tables-no-indexes.sql
> >> >> >>>> >> missing
> >> >> >>>> >>
> >> >> >>>> >>
> >> >> >>>> >>
> /home/gaurav/other_lang_extraction/extraction-framework-master/dump/wiki_dump/dewiki/tables-only-indexes.sql
> >> >> >>>> >>
> >> >> >>>> >>
> >> >> >>>> >>
> >> >> >>>> >>
> "/home/gaurav/other_lang_extraction/extraction-framework-master/dump/wiki_dump/dewiki/"
> >> >> >>>> >> is the directory where my dump exists.
> >> >> >>>> >>
> >> >> >>>> >> I think some indexes need to be created before the actual import
> >> >> >>>> >> starts, and the required index files are missing. Where can I get
> >> >> >>>> >> these files?
> >> >> >>>> >>
> >> >> >>>> >> I am following
> >> >> >>>> >>
> >> >> >>>> >>
> >> >> >>>> >> "
> https://github.com/dbpedia/dbpedia/blob/master/abstractExtraction/README.txt
> "
> >> >> >>>> >
> >> >> >>>> > See the text at the top of this file:
> >> >> >>>> >
> >> >> >>>> > OUTDATED! EVERYTHING IN THIS DIRECTORY,
> >> >> >>>> > INCLUDING THE INSTRUCTIONS BELOW,
> >> >> >>>> > IS OUTDATED. THE CURRENT CODE IS IN
> >> >> >>>> > dbpedia/extraction-framework/dump
> >> >> >>>> >
> >> >> >>>> >
> >> >> >>>> >
> >> >> >>>> > Sorry for that, I overlooked that note.
> >> >> >>>>
> >> >> >>>> All right, I just made some changes that hopefully make it really
> >> >> >>>> painfully obvious that that stuff is outdated. :-)
> >> >> >>>>
> >> >> >>>> >
> >> >> >>>> >
> >> >> >>>> > --
> >> >> >>>> > Kind Regards
> >> >> >>>> > Mohamed Morsey
> >> >> >>>> > Department of Computer Science
> >> >> >>>> > University of Leipzig
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>> --
> >> >> >>> Regards
> >> >> >>> Gaurav Pant
> >> >> >>> +91-7709196607,+91-9405757794
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > --
> >> >> > Regards
> >> >> > Gaurav Pant
> >> >> > +91-7709196607,+91-9405757794
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Regards
> >> > Gaurav Pant
> >> > +91-7709196607,+91-9405757794
> >
> >
> >
> >
> > --
> > Regards
> > Gaurav Pant
> > +91-7709196607,+91-9405757794
>



-- 
Regards
Gaurav Pant
+91-7709196607,+91-9405757794
_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
