@Sam, Tpt: my experience, too, is that HTML is the way to pull out the important Wikisource metadata, but also that every Wikisource has a somewhat different way of showing it, meaning that you need to tweak your scraper for each Wikisource. Is that still true? The last time I did this was more than a year ago, but I need to try it again soon.
Aubrey

On Wed, Nov 1, 2017 at 1:00 AM, Sam Wilson <s...@samwilson.id.au> wrote:

> Yes I think you're definitely right! The easier way to send Wikisource
> data to Wikidata is going to be a clever gadget that reads the
> microformat or schema'd info in each page. My hack was just a quick and
> easy test at getting some things added. :)
>
> Ultimately, I'm actually not that excited about working on the tools
> that we need to transfer the data. No no I don't mean that! Well, just
> that the end point we're aiming at is that a bunch of info *won't be* at
> all in Wikisource, but will be pulled from Wikidata, and so I am much
> more interested in making better tools for working with the data in
> Wikidata. :-) If you see what I mean.
>
> My idea with ws-search is that it will progressively pull more and more
> data from Wikidata, and only resort to HTML scraping where the data is
> missing from Wikidata. I'm attempting to encapsulate this logic in the
> `wikisource/api` PHP library.
>
>
> On Tue, 31 Oct 2017, at 11:14 PM, Thomas Pellissier Tanon wrote:
> > Hello Sam,
> >
> > Thank you for this nice feature!
> >
> > I have created a few months ago a prototype of Wikisource to Wikidata
> > importation tool for the French Wikisource based on the schema.org
> > annotation I have added to the main header template (I definitely think
> > we should move from our custom microformat to this schema.org markup that
> > could be much more structured). It's not yet ready but I plan to move it
> > forward in the coming weeks. A beginning of frontend to add to your
> > Wikidata common.js is here:
> > https://www.wikidata.org/wiki/User:Tpt/ws2wd.js
> > We should probably find a way to merge the two projects.
> >
> > Cheers,
> >
> > Thomas
> >
> > > On 31 Oct 2017, at 15:10, Nicolas VIGNERON <vigneron.nico...@gmail.com> wrote:
> > >
> > > 2017-10-31 13:16 GMT+01:00 Jane Darnell <jane...@gmail.com>:
> > > Sorry, I am much more of a Wikidatan than a Wikisourcerer! I was
> > > referring to items like this one
> > > https://www.wikidata.org/wiki/Q21125368
> > >
> > > No need to be sorry, that is actually a good question and this example
> > > is even better (I totally forgot this kind of case).
> > >
> > > For now, this is probably better to deal with it by hands (and I'm not
> > > sure what this tools can even do for this).
> > >
> > > Regards, ~nicolas
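To make the two ideas in this thread concrete (per-Wikisource scraping rules, and scraping only where Wikidata has nothing), here is a rough Python sketch. Everything in it is an assumption for illustration: the wiki keys, the `RULES` table, and the attribute values (`itemprop="name"`, `id="ws-title"`, and so on) are placeholders, not the markup any given Wikisource is guaranteed to emit, and this is not how ws-search or the `wikisource/api` library is actually written.

```python
from html.parser import HTMLParser

# Hypothetical per-wiki rules: field -> (attribute name, exact attribute value).
# Each Wikisource would need its own entry, which is the "tweaking" in question.
RULES = {
    "frwikisource": {"title": ("itemprop", "name"), "author": ("itemprop", "author")},
    "enwikisource": {"title": ("id", "ws-title"), "author": ("id", "ws-author")},
}

# Elements that never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input", "wbr"}


class MetadataParser(HTMLParser):
    """Collects the text content of the first element matching each rule."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.found = {}
        self._field = None  # field currently being captured, if any
        self._depth = 0     # nesting depth inside the captured element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._field is not None:
            self._depth += 1
            return
        attr_map = dict(attrs)
        for field, (name, value) in self.rules.items():
            if field not in self.found and attr_map.get(name) == value:
                self._field, self._depth = field, 1
                self.found[field] = ""
                return

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._field is None:
            return
        self._depth -= 1
        if self._depth == 0:
            self.found[self._field] = self.found[self._field].strip()
            self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self.found[self._field] += data


def scrape_metadata(html, wiki):
    """Scrape whatever fields the rules for this wiki can find in the page HTML."""
    parser = MetadataParser(RULES[wiki])
    parser.feed(html)
    parser.close()
    return parser.found


def resolve_metadata(wikidata_values, html, wiki):
    """Prefer values already fetched from Wikidata; scrape only for the gaps."""
    fields = list(RULES[wiki])
    result = {field: wikidata_values.get(field) for field in fields}
    if all(result.values()):
        return result  # Wikidata had everything, so no scraping is needed
    scraped = scrape_metadata(html, wiki)
    for field in fields:
        if not result[field]:
            result[field] = scraped.get(field)
    return result
```

Here `wikidata_values` stands in for a plain dict of values fetched from Wikidata by some other code. With rules like these, supporting another Wikisource would mean adding one more entry to `RULES` rather than writing a new scraper, and the scraped values would drop out of use field by field as Wikidata fills in.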
_______________________________________________ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l