Hi Marco,
Have you had a look at Parsoid [1]?

I am really not sure what the best way is to parse data out of
wiki syntax. The XML configs produced for Wiktionary in Jonas' Master's
thesis seem to be quite alright for "normal" users. So I would hope that
we can take it from there and just build an infrastructure around this.
It might make sense to combine both sources and ways of parsing Wiktionary
to get better results in the end.

The most important thing, however, seems to be a user-friendly process
for involving Wiktionary users.

All the best,
Sebastian



[1] http://www.mediawiki.org/wiki/Parsoid



On 18.04.2013 16:50, Marco Fossati wrote:
> Definitely, that's why Sebastian's idea can become a very interesting 
> GSoC project.
>
> On 4/18/13 4:41 PM, Pablo N. Mendes wrote:
>>
>> The difference between JSON and HTML is 15 min, Scala and IntelliJ. :)
>>
>> I'd think the important part is how the markup is parsed, templates
>> resolved, etc.
>>
>>
>> On Thu, Apr 18, 2013 at 4:39 PM, Marco Fossati <hell.j....@gmail.com> wrote:
>>
>>     I can't say if it's a competitor. The main difference lies in the
>>     output, which is structured data (JSON) instead of semi-structured
>>     data (HTML).
>>     For more details, see the slides [1].
>>     Cheers!
>>
>>     [1] http://www.slideshare.net/spaziodati/introducing-jsonpedia
>>
>>
>>     On 4/18/13 4:23 PM, Pablo N. Mendes wrote:
>>
>>
>>         Is JSONPedia a competitor of gwtwiki and Sweble?
>>
>>         https://code.google.com/p/gwtwiki/
>>         http://en.wikipedia.org/wiki/Sweble#The_current_state_of_parsing
>>
>>
>>         On Thu, Apr 18, 2013 at 4:18 PM, Marco Fossati
>>         <hell.j....@gmail.com> wrote:
>>
>>              Hi Pablo,
>>
>>              It's a low-level generic parser for MediaWiki content.
>>              It converts all the content of any MediaWiki resource into
>>              structured data. The output could be JSON (as it is now),
>>              JSON-LD or RDF, i.e., it can be modeled for our needs.
>>              Compared to the DBpedia extraction framework, it does not
>>              do any processing on the semantics of the data, e.g. on
>>              infoboxes, but handles every content item, e.g. article
>>              body, tables, etc.
>>              I see some similarities with the Wiktionary extraction
>>              project [1] that Sebastian mentioned in the GSoC idea.
>>              Since Sebastian proposed to configure the Wiktionary
>>              extractor in order to parse other wikis, I was just
>>              wondering if these 2 projects were complementary, could be
>>              merged or whatever could help.
>>              Of course, JSONpedia will be released with an open source
>>              licence.
>>
>>              @Sebastian, can you give us some more thoughts about that?
>>              Cheers!
>>
>>              [1] http://dbpedia.org/Wiktionary
>>
>>
>>              On 4/18/13 11:32 AM, Pablo N. Mendes wrote:
>>
>>
>>                  What does it offer that the DEF does not have?
>>
>>                  Cheers,
>>                  Pablo
>>
>>
>>                  On Wed, Apr 17, 2013 at 10:33 PM, Marco Fossati
>>                  <hell.j....@gmail.com> wrote:
>>
>>                       Hi Sebastian,
>>
>>                       I was wondering if the JSONpedia project [1] could be
>>                       helpful for the idea you are mentoring for GSoC 2013.
>>                       Have a look at the slides [2].
>>                       What do you think about it?
>>                       Let me know.
>>                       Cheers,
>>
>>                       [1] http://json.it.dbpedia.org/frontend/form.html
>>                       [2] http://www.slideshare.net/spaziodati/introducing-jsonpedia
>>                       --
>>                       Marco Fossati
>>         http://about.me/marco.fossati
>>                       Twitter: @hjfocs
>>                       Skype: hell_j
>>
>>
>>
>>                  --
>>
>>                  Pablo N. Mendes
>>         http://pablomendes.com
>>
>>
>>              --
>>              Marco Fossati
>>         http://about.me/marco.fossati
>>              Twitter: @hjfocs
>>              Skype: hell_j
>>
>>
>>
>>
>>         --
>>
>>         Pablo N. Mendes
>>         http://pablomendes.com
>>
>>
>>     --
>>     Marco Fossati
>>     http://about.me/marco.fossati
>>     Twitter: @hjfocs
>>     Skype: hell_j
>>
>>
>>
>>
>> -- 
>>
>> Pablo N. Mendes
>> http://pablomendes.com
>


-- 
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Projects: http://nlp2rdf.org , http://linguistics.okfn.org , 
http://dbpedia.org/Wiktionary , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org

_______________________________________________
Dbpedia-gsoc mailing list
Dbpedia-gsoc@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
