Hi Matthias,

On Oct 1, 3:43 pm, Matthias Samwald <[email protected]> wrote:
> At the moment, I am thinking about possible ways of turning existing
> bulletin boards (often based on the popular vBulletin software) into
> SIOC, by crawling them and extracting the content.
>
> Does any of you have experience with crawling bulletin boards? Is
> there any existing software that could be built upon?

Are you able to install plugins on these bulletin boards? If so, just install the vBulletin SIOC exporter [1] and use any RDF crawler. We used this approach to collect data for the boards.ie SIOC data competition [2][3]. Most of the work was done by Thomas Schandl and Tuukka Hastrup, and they can probably tell you more about the process and the tools used.

[1] http://wiki.sioc-project.org/index.php/VBSIOC
[2] http://wiki.sioc-project.org/index.php/Data/Boards.ie/Structure
[3] http://www.johnbreslin.com/blog/2008/07/30/deri-nui-galway-launches-the-boardsie-sioc-data-competition/

If it is not possible to install plugins, then some kind of wrapper for converting HTML into RDF will be needed. Does anyone know whether such "SIOC wrappers" are available for vBulletin and other systems?

Uldis

[ http://captsolo.net | http://twitter.com/CaptSolo ]
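
P.S. To make the wrapper idea a bit more concrete, here is a rough Python sketch of what an HTML-to-SIOC wrapper could look like. It is only an illustration under stated assumptions, not an existing tool: the markup it expects (one `<div class="post" id="...">` per post) is hypothetical and is not the real vBulletin template, and the names `PostExtractor`, `html_to_sioc` and the `forum_uri#post_id` URI scheme are made up for this example. It uses only the Python standard library and emits N-Triples with `sioc:Forum`, `sioc:Post`, `sioc:has_container` and `sioc:content`.

```python
from html.parser import HTMLParser

SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class PostExtractor(HTMLParser):
    """Collects (post_id, text) pairs from <div class="post" id="...">.

    ASSUMPTION: this markup is hypothetical; a real board's HTML will
    differ and the matching rule below would have to be adapted.
    """

    def __init__(self):
        super().__init__()
        self.posts = []
        self._id = None      # id of the post currently being read
        self._depth = 0      # nesting depth of <div> inside that post
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._id is None:
            if tag == "div" and attrs.get("class") == "post" and "id" in attrs:
                self._id = attrs["id"]
                self._depth = 1
                self._chunks = []
        elif tag == "div":
            self._depth += 1

    def handle_endtag(self, tag):
        if self._id is not None and tag == "div":
            self._depth -= 1
            if self._depth == 0:
                # Join the text fragments and normalise whitespace.
                text = " ".join(" ".join(self._chunks).split())
                self.posts.append((self._id, text))
                self._id = None

    def handle_data(self, data):
        if self._id is not None:
            self._chunks.append(data)


def nt_literal(value):
    """Escape a string as an N-Triples plain literal."""
    value = (value.replace("\\", "\\\\").replace('"', '\\"')
                  .replace("\n", "\\n").replace("\r", "\\r"))
    return '"' + value + '"'


def html_to_sioc(html, forum_uri):
    """Return N-Triples describing the posts found in one forum page."""
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    lines = [f"<{forum_uri}> <{RDF_TYPE}> <{SIOC}Forum> ."]
    for post_id, text in parser.posts:
        post_uri = f"{forum_uri}#{post_id}"  # hypothetical URI scheme
        lines.append(f"<{post_uri}> <{RDF_TYPE}> <{SIOC}Post> .")
        lines.append(f"<{post_uri}> <{SIOC}has_container> <{forum_uri}> .")
        lines.append(f"<{post_uri}> <{SIOC}content> {nt_literal(text)} .")
    return "\n".join(lines)
```

Calling `html_to_sioc(page_html, "http://example.org/forum/1")` on a fetched page would give triples that an RDF crawler or store could load directly. A real wrapper would also need authors, dates and thread structure, plus polite crawling of the paginated thread views, all of which are left out here.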
