Re: [Wikisource-l] Takedown of BUB book on Internet Archive
I discussed this at the Dutch wiki conference yesterday. If we really want to move in the direction of hosting PD works that are, or could be, quoted in WP articles, then we will need items per page, which in turn can imply paintings, engravings, or photos per page. This could drastically improve the Wikisource reading experience while also enabling full support for "citation needed" where PD citations are available for the topic. We need a ".djvu on the fly" function for special books uploaded and curated on Commons and Wikidata, e.g. books of hours, books of maps, emblem books, Bibles. > On Mar 9, 2019, at 10:04 AM, Sam Wilson wrote: > >> On 3/7/19 7:09 PM, Jane Darnell wrote: >> Next, how do you use ABBYY to convert a .PDF to .djvu? > > I must admit I'm guilty of suggesting people just use PDFs, as it's so much > easier to explain! Does anyone have any suggestions about how to convince > people to prefer DjVu over PDF? It seems from the outside of the proofreading > process that there's no problem with PDF. > > — Sam. > > ___ > Wikisource-l mailing list > Wikisource-l@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikisource-l
Re: [Wikisource-l] Takedown of BUB book on Internet Archive
Thanks for all the links! Yes, I have been using IA-upload quite happily, though it took me a while to figure out the time lag. I assumed there would be other copies, but 17 is a lot. I wish there were a way to find out which is the best copy to upload to Commons - we should have at least one copy of everything that exists on archive.org in, say, over 4 copies. Another general question I had about Commons .djvu files: why can't these show up by default in Wikisource on a simple index page if the .djvu files are linked from Wikidata items that are an instance of a book, with a statement linking to the Commons file and a statement linking to the author's Wikidata item, which in turn is linked to a Wikisource author ID? On Thu, Mar 7, 2019 at 1:27 PM Federico Leva (Nemo) wrote: > Jane Darnell, 07/03/19 13:09: > > Thanks for this thread, guys, super interesting! Where can I subscribe > > to these notices? > > The notices go to the email address of the account which uploaded to > archive.org, in this case BUB's email. > > > I have never received an email from archive.org. > > I've not had one in many years either. Publishers are getting more > aggressive these days: > <https://www.theguardian.com/books/2019/jan/22/internet-archives-ebook-loans-face-uk-copyright-challenge> > > > Also, how do you download from Google books (I > > have downloaded Google .djvu files from archive.org > > before, but the OCR tends to be pretty lousy). > > This specific book was uploaded with <https://tools.wmflabs.org/bub/> > when it still worked. > > > Next, how do you use ABBYY > > to convert a .PDF to .djvu? > > I'm not sure I'd recommend ABBYY for the DjVu creation. I've recently > updated the instructions > <http://en.wikisource.org/wiki/Help:DjVu_files>, switching them to a > focus on image quality rather than compression. Some simple tweaks in the > command line have a huge impact. 
> > > Lastly, if the counter notice doesn't work, > > The counter-notice probably works but I don't want to flood the IA with > extra work in case these takedowns become more frequent. I've sent this > example to the list to see if it makes sense or we should just drop it > in this case. The takedowns are handled by IA staffers in a rather > manual way, I think, so I'm not sure how many they can sustain. > > (There's also the possibility that the upload was wrong and the book at > that URL is not actually the PD book found on Google Books but something > else entirely. It has happened a time or two in the past that I know of.) > > > can't you just re-upload another version? > > Sure, in fact there are already 17 more. :-D > < > https://archive.org/search.php?query=%22a%20manual%20of%20ancient%20history%22%20rawlinson > > > > But better play nice, no need to dodge the counter-ticket process. > > > I agree though that we should > > be uploading these files to Commons, and I do this for the ones I care > > about, just to be on the safe side. > > Definitely, and this should be easy enough with IA-upload after the > recent fixes: <https://tools.wmflabs.org/ia-upload/>. > > There is no need to panic and transfer millions of books now! > > Federico > ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
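Federico's remark above, that simple command-line tweaks have a huge impact on DjVu quality, can be made concrete. Below is a minimal sketch of the encode-and-bundle pipeline, assuming djvulibre's `c44` and `djvm` tools are available; the `-slice` value shown is illustrative, not the setting from the Help:DjVu_files instructions. The function only builds the commands, so you can inspect them before running anything.

```python
from pathlib import Path

def djvu_commands(page_images, out="book.djvu", slices="74+13+10"):
    """Build the djvulibre commands that encode scanned page images and
    bundle them into one multipage DjVu file. Returns a list of argv
    lists suitable for subprocess.run; nothing is executed here."""
    cmds = []
    page_files = []
    for img in page_images:
        page = Path(img).with_suffix(".djvu").name
        # c44 is djvulibre's wavelet encoder for photographic scans;
        # the -slice argument trades file size against image quality.
        cmds.append(["c44", "-slice", slices, img, page])
        page_files.append(page)
    # djvm -c bundles the single-page files into one multipage document.
    cmds.append(["djvm", "-c", out] + page_files)
    return cmds

cmds = djvu_commands(["p001.jpg", "p002.jpg"])
```

Each argv list can then be passed to `subprocess.run(cmd, check=True)` once djvulibre is installed.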
Re: [Wikisource-l] Takedown of BUB book on Internet Archive
Thanks for this thread, guys, super interesting! Where can I subscribe to these notices? I have never received an email from archive.org. Also, how do you download from Google books (I have downloaded Google .djvu files from archive.org before, but the OCR tends to be pretty lousy)? Next, how do you use ABBYY to convert a .PDF to .djvu? Lastly, if the counter notice doesn't work, can't you just re-upload another version? I agree though that we should be uploading these files to Commons, and I do this for the ones I care about, just to be on the safe side. On Thu, Mar 7, 2019 at 8:58 AM Nicolas VIGNERON wrote: > Hi, > > Just write a counter-notice and the book should be back soon (provided > that the notice is just cyber-law-bullying and that there is no other > underlying issue). > > Cheers, ~nicolas > > On Thu, Mar 7, 2019 at 00:20, Luiz Augusto wrote: > >> The funniest thing is that the very same digitization is still available >> on Google Book Search (just downloaded the PDF version) and the same title >> is available on dozens of records at the HathiTrust consortium [1]. >> >> Are some Internet Archive volunteers as lazy about researching basic >> copyright questions as some Wikimedia Commons volunteers? >> >> If anyone is interested, I'll be glad to convert this file to djvu and >> upload it to Wikimedia Commons using my local copy of ABBYY. >> >> [1] >> https://catalog.hathitrust.org/Search/Home?type%5B%5D=title%5B%5D=a%20manual%20of%20ancient%20history%5B%5D=AND%5B%5D=author%5B%5D=Rawlinson=1=20=ft >> >> On Wed, Mar 6, 2019 at 7:39 PM Federico Leva (Nemo) >> wrote: >> >>> Seriously, a takedown for an 1870 book by an author who died in 1902? 
>>> https://en.wikipedia.org/wiki/George_Rawlinson >>> https://books.google.com/books?id=5mwTYAAJ >>> >>> Federico >>> >>> Messaggio inoltrato >>> Oggetto: archive.org item disabled >>> Data: Wed, 6 Mar 2019 17:30:31 -0500 >>> Mittente: Internet Archive >>> >>> Hello, >>> >>> Access to the following item has been disabled following receipt by >>> Internet Archive of a copyright claim issued by The Publishers >>> Association: >>> >>> https://archive.org/details/bub_gb_5mwTYAAJ >>> >>> Some general information about take down notices and processes may be >>> found at https://lumendatabase.org, including information about >>> submitting a counter-notice, if applicable: >>> >>> https://lumendatabase.org/topics/29 >>> https://lumendatabase.org/topics/14 >>> >>> The Internet Archive provides these links as a potential resource and >>> cannot guarantee that any specific information posted at >>> lumendatabase.org is accurate or complete. >>> >>> The Internet Archive Terms of Use, including our Copyright Policy, are >>> posted at https://archive.org/about/terms.php. >>> >>> As a general note: repeated posting of infringing material may result in >>> the disabling of a user’s account. >>> >>> --- >>> The Internet Archive Team >>> >>> >>> ___ >>> Wikisource-l mailing list >>> Wikisource-l@lists.wikimedia.org >>> https://lists.wikimedia.org/mailman/listinfo/wikisource-l >>> >> ___ >> Wikisource-l mailing list >> Wikisource-l@lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/wikisource-l >> > ___ > Wikisource-l mailing list > Wikisource-l@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikisource-l > ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
Re: [Wikisource-l] quickstatements for missing editions
Yes, you definitely need this flow of useful interproject links both ways: as a trigger for Wikidatans to do more with Wikisource pages, and as a trigger for Wikisourcerers to do more with Wikidata items. On Wed, Nov 1, 2017 at 10:01 AM, Sam Wilson <s...@samwilson.id.au> wrote: > Yup, still true. We do at least have a common goal of structured HTML, as > defined by http://schema.org/CreativeWork > > It sounds like Tpt's scraper will do wonders, if a Wikisource just > complies with that. I think that's one of the next steps we need to take. > > I sort of figure from the English Wikisource point of view that we should > do more on bringing data *in* from Wikidata, in our {{header}}, rather than > working on making it easier to extract data *out* with > microformats/structured-HTML. Well, we should do both, of course! :-) But > my feeling from the process of getting Author data in from Wikidata is that > the whole Wikidata integration becomes so much more worthwhile and clearer > (and we sort out the various edge cases) when we're actively using it for > real. > > But of course, each Wikisource is in a similar position. :-( And are we to > all be developing the Lua scripts and templates in isolation? Indeed no! > :-) We shall put them all together in our brave new Wikisource extension! :) > > —sam > > > > On Wed, 1 Nov 2017, at 04:03 PM, Andrea Zanni wrote: > > @Sam, Tpt, > my personal experience too is that HTML is the way to pull out the > important Wikisource metadata, > but it's also that every Wikisource has sort of a different way to show > it, > meaning that you need to tweak your scraper for each Wikisource. > Is that still true? Last time I did it was more than one year ago, but I > need to try it again soon. > Aubrey > > On Wed, Nov 1, 2017 at 1:00 AM, Sam Wilson <s...@samwilson.id.au> wrote: > > Yes I think you're definitely right! 
The easier way to send Wikisource > data to Wikidata is going to be a clever gadget that reads the > microformat or schema'd info in each page. My hack was just a quick and > easy test at getting some things added. :) > > Ultimately, I'm actually not that excited about working on the tools > that we need to transfer the data. No no I don't mean that! Well, just > that the end point we're aiming at is that a bunch of info *won't be* at > all in Wikisource, but will be pulled from Wikidata, and so I am much > more interested in making better tools for working with the data in > Wikidata. :-) If you see what I mean. > > My idea with ws-search is that it will progressively pull more and more > data from Wikidata, and only resort to HTML scraping where the data is > missing from Wikidata. I'm attempting to encapsulate this logic in the > `wikisource/api` PHP library. > > > > On Tue, 31 Oct 2017, at 11:14 PM, Thomas Pellissier Tanon wrote: > > Hello Sam, > > > > Thank you for this nice feature! > > > > I have created a few months ago a prototype of Wikisource to Wikidata > > importation tool for the French Wikisource based on the schema.org > > annotation I have added to the main header template (I definitely think > > we should move from our custom microformat to this schema.org markup > that > > could be much more structured). It's not yet ready but I plan to move it > > forward in the coming weeks. A beginning of frontend to add to your > > Wikidata common.js is here: > > https://www.wikidata.org/wiki/User:Tpt/ws2wd.js > > We should probably find a way to merge the two projects. > > > > Cheers, > > > > Thomas > > > > > Le 31 oct. 2017 à 15:10, Nicolas VIGNERON <vigneron.nico...@gmail.com> > a écrit : > > > > > > 2017-10-31 13:16 GMT+01:00 Jane Darnell <jane...@gmail.com>: > > > Sorry, I am much more of a Wikidatan than a Wikisourcerer! 
I was referring to items like this one > > > https://www.wikidata.org/wiki/Q21125368 > > > > > > No need to be sorry, that is actually a good question and this example > is even better (I totally forgot this kind of case). > > > > > > For now, it is probably better to deal with these by hand (and I'm not > sure what this tool can even do for them). > > > > > > Cdlt, ~nicolas > > > ___ > > > Wikisource-l mailing list > > > Wikisource-l@lists.wikimedia.org > > > https://lists.wikimedia.org/mailman/listinfo/wikisource-l
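A gadget of the kind Tpt and Sam describe, which reads the schema.org-annotated header of a Wikisource page, could start from something as small as the sketch below. The sample markup is invented for illustration (it is not the actual fr.wikisource header HTML); only the microdata convention itself (`itemprop` attributes) is assumed.

```python
from html.parser import HTMLParser

class ItempropScraper(HTMLParser):
    """Collect itemprop -> text for simple microdata-annotated markup."""
    def __init__(self):
        super().__init__()
        self.props = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemprop" in attrs:
            # Remember which property the next text run belongs to.
            self._current = attrs["itemprop"]

    def handle_data(self, data):
        if self._current:
            self.props[self._current] = data.strip()
            self._current = None

# Hypothetical header markup, following the schema.org microdata style.
sample = ('<div itemscope itemtype="http://schema.org/Book">'
          '<span itemprop="name">Les Fleurs du mal</span>'
          '<span itemprop="author">Charles Baudelaire</span></div>')
scraper = ItempropScraper()
scraper.feed(sample)
```

A real importer would fetch the rendered page HTML first and map the collected properties onto Wikidata statements; this sketch only shows the scraping half.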
Re: [Wikisource-l] quickstatements for missing editions
Sorry, I am much more of a Wikidatan than a Wikisourcerer! I was referring to items like this one https://www.wikidata.org/wiki/Q21125368 On Tue, Oct 31, 2017 at 10:48 AM, Nicolas VIGNERON < vigneron.nico...@gmail.com> wrote: > 2017-10-31 10:00 GMT+01:00 Jane Darnell <jane...@gmail.com>: > >> We want the disambiguation pages on Wikisource - I checked a few of these >> and there are a lot of women and "younger sons" in them that we want. Also, >> many can be connected to existing "family of ..." pages or name >> disambiguation pages - they definitely help enrich our understanding of the >> problems of disambiguation over time. >> > > I'm guessing you're talking about pages in https://en.wikisource.org/wiki/Category:Author_disambiguation_pages (which only exist on en.ws) but > they are in the Author: namespace and (if I'm not mistaken) the WS search > tool here only looks in the main namespace (as it's focused on editions). > > So this is a bit beside the point of this thread, but still, you are very right: > for people in particular and for everything in general, disambig pages are > indeed important, and ideally the tool should not just discard them as 'not > an edition' (if it is technically possible to spot them, obviously). > > Cdlt, ~nicolas > > ___ > Wikisource-l mailing list > Wikisource-l@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikisource-l
Re: [Wikisource-l] quickstatements for missing editions
We want the disambiguation pages on Wikisource - I checked a few of these and there are a lot of women and "younger sons" in them that we want. Also, many can be connected to existing "family of ..." pages or name disambiguation pages - they definitely help enrich our understanding of the problems of disambiguation over time. On Tue, Oct 31, 2017 at 9:46 AM, Nicolas VIGNERON < vigneron.nico...@gmail.com> wrote: > Yes it's certainly a first draft!! :-) Thanks for trying it out. >> >> With the disambig pages, can you suggest how to detect them? >> > > Not sure. > Could you detect the presence of the Q6148868 template? (And the same thing > for Q15701815.) > Or else maybe with the categories. > > >> Ah, there's a couple of other bugs here: >> >> The page https://fr.wikisource.org/wiki/Accroupissements actually >> already has a Wikidata ID, but the ws-search database didn't know about >> it :-( probably because it was failing for a while on some weird >> problems. I've re-run the scraper, and now that work is showing up with >> its proper Q-number: >> https://tools.wmflabs.org/ws-search/?title=Accroupissements; >> author==fr >> >> The idea with the quickstatements is that it'll only show it for works >> that are *not yet* linked to Wikidata. This is where the disambig >> problem comes in, because there doesn't seem to be a simple way to >> determine what's an edition and what's a work without resorting to >> Wikidata. We could look at categories? Is it a truth universally >> acknowledged that pages in the categories defined as >> https://www.wikidata.org/wiki/Q15939659 are all disambiguation pages? >> That could work... >> > > The truth (and I guess it is universal, but could someone confirm?) is that > pages with 'multiple editions' are 'works' (Q571, this is what I do for > fr.wikisource at least). > > Thank you for all the work! 
> > Cdlt, ~nicolas > > ___ > Wikisource-l mailing list > Wikisource-l@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikisource-l
Re: [Wikisource-l] Wikimedia Strategy
Yes I agree - totally wonderful. And there are more ways to make a more meaningful query out of this (in Dutch #1 is Barbapapa and in English the Simpsons take first place), by either specifying it can't be a film, or just filtering for an inception date before 1970. On Tue, Apr 11, 2017 at 12:51 PM, Gerard Meijssen <gerard.meijs...@gmail.com > wrote: > Hoi, > Classification as we have it is a wonder. It is there and it cannot be > explained. It does serve a purpose though. > Thanks, > GerardM > > On 11 April 2017 at 12:44, Jane Darnell <jane...@gmail.com> wrote: > >> Interesting query, thanks! How odd that "sitcom" is a subclass of >> "literary work"! I never thought of it that way :) >> >> On Tue, Apr 11, 2017 at 12:23 PM, Magnus Manske < >> magnusman...@googlemail.com> wrote: >> >>> The 500 most important (as in, number of Wiki sitelinks) literary works >>> that are (at least partially) in "original language" German, according to >>> Wikidata: >>> http://tinyurl.com/mzhd8na >>> "The Big Bang Theory" item might need some review, but the rest look >>> good... >>> Just change the Q188 and the language code for your favourite language! >>> >>> On Tue, Apr 11, 2017 at 10:58 AM Andrea Zanni <zanni.andre...@gmail.com> >>> wrote: >>> >>>> In it.source we made a similar Canon: >>>> https://it.wikisource.org/wiki/Wikisource:Canone_delle_opere_della_letteratura_italiana >>>> >>>> Ideally, we should have an item (a "work" item, so basically the one >>>> with a Wikipedia article) on Wikidata for each one. >>>> Then we can count how many Wikipedias have an article on it. Basically >>>> it's Tpt's idea using Wikidata and sitelinks. >>>> >>>> Aubrey >>>> >>>> >>>> On Tue, Apr 11, 2017 at 11:50 AM, Jane Darnell <jane...@gmail.com> >>>> wrote: >>>> >>>> You can always start with the lists per country (if they exist). 
So for >>>> example I made an article about the first 500 of such a "1000 most >>>> important works of literature" list compiled for the Netherlands here: >>>> https://en.wikipedia.org/wiki/Canon_of_Dutch_Literature >>>> >>>> On Tue, Apr 11, 2017 at 10:44 AM, Thomas PT <thoma...@hotmail.fr> >>>> wrote: >>>> >>>> A maybe simpler metric: the top 1000 Wikipedia articles about works by >>>> page views. >>>> >>>> Thomas >>>> >>>> > On Apr 11, 2017, at 09:42, mathieu stumpf guntz < >>>> psychosl...@culture-libre.org> wrote: >>>> > >>>> > Hi Nemo, >>>> > >>>> > We may establish a list of the "1000 works that every Wikisource >>>> should have" (with translation possibly needed). >>>> > >>>> > What metric could we use to define such a list? Maybe reference >>>> frequency, but it requires statistics whose availability is unknown to me. >>>> > >>>> > Statistically, >>>> > psychoslave >>>> > >>>> > On 29/03/2017 at 08:30, Federico Leva (Nemo) wrote: >>>> >> One issue sometimes raised about Wikisource is how we know that >>>> we're working on the "right" books. Internet Archive is planning to >>>> digitize textbooks starting from those which are most frequently assigned in USA >>>> schools: >>>> >> http://blog.archive.org/2017/03/29/books-donated-for-macarthur-foundation-100change-challenge-from-bookmooch-users/ >>>> >> >>>> >> I was surprised to learn a project like OpenSyllabus exists and >>>> works; I emailed them to ask what it would take to do the same for other >>>> languages/geographies. >>>> >> >>>> >> Nemo >>>> >> >>>> >> ___ >>>> >> Wikisource-l mailing list >>>> >> Wikisource-l@lists.wikimedia.org >>>> >> https://lists.wikimedia.org/mailman/listinfo/wikisource-l >>>> > >>>> > >>>> > ___ >>>> > Wikisource-l mailing list >>>> > Wikisource-l@lists.wikimedia.org >>>> > https://lists.wikimedia.org/mailman/listinfo/wikisource-l >>>> >
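Jane's filters (exclude films, require an early inception date) translate directly into SPARQL against the Wikidata Query Service. The sketch below assembles such a query as a string; the property and item IDs are real Wikidata identifiers (P31 instance of, P279 subclass of, P407 language of work, P571 inception, Q7725634 literary work, Q11424 film, Q188 German), but the query as a whole is an illustration, not Magnus's actual tinyurl query.

```python
def literary_works_query(language_item="Q188", before_year=1970):
    """Build a SPARQL query for literary works in a given language,
    excluding films and keeping only works whose inception predates
    the given year."""
    return f"""
SELECT ?work ?workLabel WHERE {{
  ?work wdt:P31/wdt:P279* wd:Q7725634 ;   # instance of literary work
        wdt:P407 wd:{language_item} ;     # language of work
        wdt:P571 ?inception .
  MINUS {{ ?work wdt:P31 wd:Q11424 . }}   # not also a film
  FILTER (YEAR(?inception) < {before_year})
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
"""

q = literary_works_query()
```

The resulting string can be pasted into https://query.wikidata.org or sent to its SPARQL endpoint.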
Re: [Wikisource-l] OCR as a service?
Nice! I will wait for the client though, thx. Where will the source images be stored? Labs or Commons? It would be nice if you could somehow make a client that builds a djvu file locally with the page image and the OCR text that you can clean up before putting it into the djvu file. Now it just seems there are so many hurdles to ws that it's quicker to post pages to Commons and add the text in the template there. On Wed, Jul 29, 2015 at 8:23 AM, Asaf Bartov abar...@wikimedia.org wrote: Hello again. So, I've set up an OpenOCR instance on Labs that's available for use as a service. Just call it and point to an image. Example: curl -X POST -H "Content-Type: application/json" -d '{"img_url":"http://bit.ly/ocrimage","engine":"tesseract"}' http://openocr.wmflabs.org/ocr should yield: You can create local variables for the pipelines within the template by prefixing the variable name with a "$" sign. Variable names have to be composed of alphanumeric characters and the underscore. In the example below I have used a few variations that work for variable names. If we see evidence of abuse, we might have to protect it with API keys, but for now, let's AGF. :) I'm working on something that would be a client of this service, but don't have a demo yet. Stay tuned! :) A. On Sun, Jul 12, 2015 at 3:27 PM, Alex Brollo alex.bro...@gmail.com wrote: I explored abbyy.gz files, the full XML output from the ABBYY OCR engine running at the Internet Archive, and I've been astonished by the amount of data they contain - they are stored at XCA_Extended detail (as documented at http://www.abbyy-developers.com/en:tech:features:xml ). Something that Wikisource's best developers should explore; comparing that data with the little bit of data in the mapped text layer of djvu files is impressive and should be inspiring. But they are static data coming from a standard setting... 
nothing similar to a service with simple, shared, deep learning features for difficult and ancient texts. I tried the ancient-Italian Tesseract dictionary with very poor results. So Asaf, I can't wait for good news from you. :-) Alex 2015-07-12 12:50 GMT+02:00 Andrea Zanni zanni.andre...@gmail.com: On Sun, Jul 12, 2015 at 11:25 AM, Asaf Bartov abar...@wikimedia.org wrote: On Sat, Jul 11, 2015 at 9:59 AM, Andrea Zanni zanni.andre...@gmail.com wrote: uh, that sounds very interesting. Right now, we mainly use OCR from djvu from Internet Archive (that means ABBYY FineReader, which is very nice). Yes, the output is generally good. But as far as I can tell, the archive's Open Library API does not offer a way to retrieve the OCR output programmatically, and certainly not for an arbitrary page rather than the whole item. What I'm working on requires the ability to OCR a single page on demand. True. I've recently met Giovanni, a new (Italian) guy who's now working with Internet Archive and Open Library. We discussed a number of possible partnerships/projects; this is definitely one to bring up. But if we manage to do it directly in the Wikimedia world it's even better. Aubrey ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l -- Asaf Bartov Wikimedia Foundation http://www.wikimediafoundation.org Imagine a world in which every single human being can freely share in the sum of all knowledge. Help us make it a reality! 
https://donate.wikimedia.org ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
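Asaf's curl example maps one-to-one onto any HTTP client. Below is a standard-library sketch that builds the same POST request; it only constructs the request rather than sending it (the openocr.wmflabs.org endpoint is taken from the email and has long since been retired).

```python
import json
import urllib.request

def build_ocr_request(img_url, engine="tesseract",
                      endpoint="http://openocr.wmflabs.org/ocr"):
    """Build the POST request the OpenOCR service expects:
    a JSON body naming the image URL and the OCR engine."""
    body = json.dumps({"img_url": img_url, "engine": engine}).encode()
    return urllib.request.Request(
        endpoint, data=body,
        headers={"Content-Type": "application/json"})

req = build_ocr_request("http://bit.ly/ocrimage")
# urllib.request.urlopen(req).read() would return the recognized text.
```

A client like the one Asaf mentions would wrap this call per page and feed the text into the proofreading workflow.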
Re: [Wikisource-l] Playing with Lua, javascript and pagelist tag
Hi Alex, No, the book I want to do is still on my computer. That link you looked at is a book that was printed about 20 years ago as a facsimile of one printed in the 17th century. It does happen to be a book that has been massively reused, and it's not even the original! The book is just plates, and the only text on them is in the description section of the files. I don't see the point of having it on Wikisource because you can use it more easily on Commons (and each page can be linked to any language-pedia). I will try to upload the part of the book I mean so you can take a look at my specific problem. It's 3 volumes but there is a section that is particularly problematic. Jane 2013/6/7, Alex Brollo alex.bro...@gmail.com: @Jane again: I'd better look before and talk after. I see a collection of jpg's from scans, not a djvu file or an Index: page in a wikisource project. :-) So I presume that I can't find any pagelist tag. :-) Did you personally scan those pages? Did you scan all the pages of the book (if it's a book...)? Do you know if any complete scan of the book has been published previously (in Internet Archive, Google books, or other digital libraries)? Next time, if you scan or take pictures of all the pages of a book and load all the images to Commons, some willing wikisourcian could mount them into a multipage djvu file and open an Index: page to proofread it in a wikisource project. 2013/6/7 Alex Brollo alex.bro...@gmail.com Thanks for the suggestions; I can only promise I'll think about them. The question by Micru is particularly hard. :-( @ Jane: I've to read your mail again and again; nevertheless a well compiled pagelist tag can really identify any page of the book in a unique way, even if the pages have no page number, and tl|Pg manages the djvu page/book page relationship easily even if the book page is identified by something like Figure 1, Figure 2. I'll take a look at your book. 
Alex 2013/6/7 Jane Darnell jane...@gmail.com I have been wondering the same thing for years. When I upload prints to Wikimedia Commons, I am generally in a hurry and just use the default uploader to get it out there. Weeks or months or sometimes years later I will add in the detailed metadata like the book it was first published in, alternate sources for the print from the one I used, the publisher if that is a different person than the artist, etc. What I don't bother with is page numbers, because this is often unknown and changes from edition to edition. You can get around this problem by naming specific editions held in specific libraries with specific page numbers, which I have done occasionally. Some prints are so well known they go by their own titles, and the Wikimedia Commons artwork template even has a field Original title to deal with this issue. When you go through an index of plates in any older book, generally there are some mistakes, such as blank pages that are indexed because the plate didn't make it to the printer, some plates the printer added that didn't make it into the index, and of course the really confusing one, the prints that a reprinter added that neither the original author nor the original publisher ever saw. One reason I have not spent much time on Wikisource is because I feel I have to decide up front what the structure of the book will be with page numbering (which sometimes does not count the plates), so I need to base this on the original index or original list of chapters. Sometimes a book becomes famous just for one passage, and that passage may not even be indexed in the original version. How do you add these links? On Wikimedia Commons you can keep on adding values to fields, and change the Information template to Artwork to get more fields. 
You can even add annotations to files and then put links to other files in the annotations, so that through the Global usage property you can see where such prints have been quoted or re-used. How do you do this with books? I would like to see a flexible way to set this up that makes it easy to come back and make corrections or additions to the published information in both indexes and ToC's based on later discovery. This book of prints for example shows a page order based on one edition that was reproduced in facsimile version, but other versions exist with different plates: http://commons.wikimedia.org/wiki/Category:32_afbeeldinge_der_Graven_van_HOLLANDT How do you set up page numbers for this, because there weren't any to start with? Jane 2013/6/7, Andrea Zanni zanni.andre...@gmail.com: On Fri, Jun 7, 2013 at 1:36 AM, David Cuenca dacu...@gmail.com wrote: Automatic creation of page transclusion is nice but also dangerous... too many structures to have an easy solution. What Alex is thinking, if I understand his work correctly, is that when you work on a new book in nsPage, you define what the structure is (his work right
Re: [Wikisource-l] Wikisource user group proposal page started
For WLM we have project pages on Commons, even though most participating countries have their WLM lists on Wikipedia. Maybe Wikisource should do this too: have all projects and associated files residing on Commons, with only actual text interfaces on Wikisource. Many more people can be found on Commons than on Meta. I signed up anyway. 2013/6/2, David Cuenca dacu...@gmail.com: Hi there, In order to guarantee that there are more general Wikisource projects in the future, like those outlined in the Wikisource vision[1], which benefit the whole community and not specific language communities, and that there is a legitimate way of approaching institutions for collaborations or funding, it would be great if everyone who is interested in actively improving Wikisource would join the proposed user group! http://meta.wikimedia.org/wiki/Wikisource_User_Group Perhaps it is also a good way to launch more offline activities like the Wikisource workshop during the DC GlamWiki Boot Camp that Chris and Doug started [2]. What are your thoughts about this? Cheers, David --Micru [1] http://wikisource.org/wiki/Wikisource_vision_development/Applying_the_WS_values [2] http://en.wikipedia.org/wiki/Wikipedia:GLAM/Boot_Camp ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
Re: [Wikisource-l] Reunification of Wikisources
The most important reason, I think, for moving Wikisource content to Commons is that many original manuscripts have a one-to-many relationship with other texts in other languages. There is no definitive translation of the Bible, Anna Karenina, Don Quixote, and so forth. However, when reading these texts, the reader should be able to see where related content is available in sister projects. On Wikimedia Commons, you see this with the Global usage feature. This would be perfect to use for text pages of books as well. Many engravings are used on Commons in multiple projects, without the original text being available on Wikisource. It would be a good project just to line up what we already have - for example, uniting a title page with one of the other engravings in a book on a Wikisource book stub page. Look at the global usage of this file for example: http://commons.wikimedia.org/wiki/File:Don_Quixote_5.jpg Up until now only the illustrations are shared this way, but I think the whole book should be possible to read in DjVu on Commons, no matter what language the text is in, and no matter what the language interface is of the user on Commons. As it stands now, it is only possible to see this Global usage feature on Commons files, not on text files (because they can only link to one version of a text in another project per page). In the example above you can see that the same engraving is used on two different pages on the French Wikisource. You can't see that anywhere on Wikisource, only here in the Global usage feature on Commons. By the way, I am not for getting rid of the separate Wikisource language projects altogether, because I think they still fill an important purpose for government documents and other things that will never or rarely be translated. I am just saying that it would be better to have full texts of original works easily available on Commons page by page (and perhaps we should involve Wikiquote in this too, to split pages when necessary). 
2013/6/2, Federico Leva (Nemo) nemow...@gmail.com: David Cuenca, 02/06/2013 02:22: [...] specially now that projects like Wikidata have shown that it is possible to have both localization and centralization living in harmony. We're VERY far from such a harmony, or maybe I'm misunderstanding what you mean here. We don't have a true solution for the problem of a multilingual wiki, Commons' pains show it well. https://wikimania2013.wikimedia.org/wiki/Submissions/Multilingual_Wikimedia_Commons_-_What_can_we_do_about_it From what I recall, localisation was definitely not the reason for splitting. It's also wrong to assume that bringing people on the same wiki will give you a single community: you may well just lose the (senses of) communities and end up with a dispersed array of editors. Nemo ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
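The Global usage view Jane points to is also exposed through the Commons web API: `prop=globalusage` is a real MediaWiki API module, while the rest of this sketch is illustrative and only builds the query URL rather than fetching it.

```python
from urllib.parse import urlencode

def global_usage_url(filename, limit=50):
    """URL for the Commons API query that lists every wiki page using
    a given file - the API behind the "Global usage" view."""
    params = {
        "action": "query",
        "prop": "globalusage",
        "titles": f"File:{filename}",
        "gulimit": limit,
        "format": "json",
    }
    return "https://commons.wikimedia.org/w/api.php?" + urlencode(params)

url = global_usage_url("Don_Quixote_5.jpg")
```

Fetching that URL returns JSON listing the wiki and page title of each use, which is exactly the cross-project reuse information Jane wants surfaced next to texts.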
Re: [Wikisource-l] Reunification of Wikisources
Alex, thanks for that perspective. I myself was wondering if anyone counted how many entries are in the books category here: http://commons.wikimedia.org/wiki/Category:Books and then the entries for books per sister project on Wikisource. My gut feeling is that the number per Wikisource entity will be smaller, but I may be wrong. 2013/6/2, Alex Brollo alex.bro...@gmail.com: Just to have a try: imagine installing the proofread extension and adding Index: and Page: namespaces to Commons, allowing users to use Commons as an optional proofreading lab. No more painful alignment of authors' metadata: there is the Creator namespace/template. No more painful aligning of book metadata: there is the Book template. No more painful localization: there is a powerful set of localization templates/scripts. No more need for tricks for multilingual books: they would be as simple as monolingual ones. Obviously such Commons books would need to be fully shared - as an HTML rendering - into any wikisource project, just adding a message "Work can be edited on Commons", just as happens for shared media. Alex 2013/6/2 Federico Leva (Nemo) nemow...@gmail.com David Cuenca, 02/06/2013 02:22: [...] specially now that projects like Wikidata have shown that it is possible to have both localization and centralization living in harmony. We're VERY far from such a harmony, or maybe I'm misunderstanding what you mean here. We don't have a true solution for the problem of a multilingual wiki, Commons' pains show it well. https://wikimania2013.wikimedia.org/wiki/Submissions/Multilingual_Wikimedia_Commons_-_What_can_we_do_about_it From what I recall, localisation was definitely not the reason for splitting. 
It's also wrong to assume that bringing people on the same wiki will give you a single community: you may well just lose the (senses of) communities and end up with a dispersed array of editors. Nemo ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l