Re: [Wikisource-l] Takedown of BUB book on Internet Archive
I discussed this at the Dutch wiki conference yesterday. If we really want to move in the direction of hosting PD works that are, or could be, quoted in WP articles, then we will need items per page, which in turn can imply paintings, engravings, or photos per page. This could drastically improve the Wikisource reading experience while also enabling full support for "citation needed" where PD citations are available for the topic. We need a ".djvu on the fly" function for special books uploaded and curated on Commons and Wikidata, e.g. books of hours, books of maps, emblem books, Bibles. > On Mar 9, 2019, at 10:04 AM, Sam Wilson wrote: > >> On 3/7/19 7:09 PM, Jane Darnell wrote: >> Next, how do you use ABBYY to convert a .PDF to .djvu? > > I must admit I'm guilty of suggesting people just use PDFs, as it's so much > easier to explain! Does anyone have any suggestions about how to convince > people to prefer DjVu over PDF? It seems from the outside of the proofreading > process that there's no problem with PDF. > > — Sam. > > ___ > Wikisource-l mailing list > Wikisource-l@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikisource-l
Re: [Wikisource-l] Takedown of BUB book on Internet Archive
Thanks for all the links! Yes, I have been using IA-upload quite happily, though it took me a while to figure out the time lag. I assumed there would be other copies, but 17 is a lot. I wish there were a way to find out which is the best copy to upload to Commons - we should have at least one copy of everything that exists on archive.org in, say, over 4 copies. Another general question I had about Commons .djvu files: why can't these show up by default in Wikisource on a simple index page if the .djvu files are linked from Wikidata items that are an instance of a book, with a statement linking to the Commons file and a statement linking to the author's Wikidata item, which in turn is linked to a Wikisource author ID? On Thu, Mar 7, 2019 at 1:27 PM Federico Leva (Nemo) wrote: > Jane Darnell, 07/03/19 13:09: > > Thanks for this thread, guys, super interesting! Where can I subscribe > > to these notices? > > The notices go to the email address of the account which uploaded to > archive.org, in this case BUB's email. > > > I have never received an email from archive.org. > > I've not had one in many years either. Publishers are getting more > aggressive these days: > <https://www.theguardian.com/books/2019/jan/22/internet-archives-ebook-loans-face-uk-copyright-challenge> > > > Also, how do you download from Google books (I > > have downloaded Google .djvu files from archive.org > > before, but the OCR tends to be pretty lousy). > > This specific book was uploaded with <https://tools.wmflabs.org/bub/> > when it still worked. > > > Next, how do you use ABBYY > > to convert a .PDF to .djvu? > > I'm not sure I'd recommend ABBYY for the DjVu creation. I've recently > updated the instructions > <http://en.wikisource.org/wiki/Help:DjVu_files>, switching them to a > focus on image quality rather than compression. Some simple tweaks in the > command line have a huge impact. 
> > > Lastly, if the counter notice doesn't work, > > The counter-notice probably works but I don't want to flood the IA with > extra work in case these takedowns become more frequent. I've sent this > example to the list to see if it makes sense or we should just drop it > in this case. The takedowns are handled by IA staffers in a rather > manual way, I think, so I'm not sure how many they can sustain. > > (There's also the possibility that the upload was wrong and the book at > that URL is not actually the PD book found on Google Books but something > else entirely. It has happened a time or two in the past that I know of.) > > > can't you just re-upload another version? > > Sure, in fact there are already 17 more. :-D > < > https://archive.org/search.php?query=%22a%20manual%20of%20ancient%20history%22%20rawlinson > > > > But better play nice, no need to dodge the counter-ticket process. > > > I agree though that we should > > be uploading these files to Commons, and I do this for the ones I care > > about, just to be on the safe side. > > Definitely, and this should be easy enough with IA-upload after the > recent fixes: <https://tools.wmflabs.org/ia-upload/>. > > There is no need to panic and transfer millions of books now! > > Federico > ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
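Federico's remark above, that simple command-line tweaks have a huge impact on DjVu quality, can be made concrete. Below is a minimal sketch of the encode-and-bundle pipeline, assuming djvulibre's `c44` and `djvm` tools are available; the `-slice` value shown is illustrative, not the setting from the Help:DjVu_files instructions. The function only builds the commands, so you can inspect them before running anything.

```python
from pathlib import Path

def djvu_commands(page_images, out="book.djvu", slices="74+13+10"):
    """Build the djvulibre commands that encode scanned page images and
    bundle them into one multipage DjVu file. Returns a list of argv
    lists suitable for subprocess.run; nothing is executed here."""
    cmds = []
    page_files = []
    for img in page_images:
        page = Path(img).with_suffix(".djvu").name
        # c44 is djvulibre's wavelet encoder for photographic scans;
        # the -slice argument trades file size against image quality.
        cmds.append(["c44", "-slice", slices, img, page])
        page_files.append(page)
    # djvm -c bundles the single-page files into one multipage document.
    cmds.append(["djvm", "-c", out] + page_files)
    return cmds

cmds = djvu_commands(["p001.jpg", "p002.jpg"])
```

Each argv list can then be passed to `subprocess.run(cmd, check=True)` once djvulibre is installed.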
Re: [Wikisource-l] Takedown of BUB book on Internet Archive
Thanks for this thread, guys, super interesting! Where can I subscribe to these notices? I have never received an email from archive.org. Also, how do you download from Google books (I have downloaded Google .djvu files from archive.org before, but the OCR tends to be pretty lousy)? Next, how do you use ABBYY to convert a .PDF to .djvu? Lastly, if the counter notice doesn't work, can't you just re-upload another version? I agree though that we should be uploading these files to Commons, and I do this for the ones I care about, just to be on the safe side. On Thu, Mar 7, 2019 at 8:58 AM Nicolas VIGNERON wrote: > Hi, > > Just write a counter-notice and the book should be back soon (provided > that the notice is just cyber-law-bullying and that there is no other > underlying issue). > > Cheers, ~nicolas > > On Thu, Mar 7, 2019 at 00:20, Luiz Augusto wrote: > >> The funniest thing is that the very same digitization is still available >> on Google Book Search (just downloaded the PDF version) and the same title >> is available on dozens of records at the HathiTrust consortium [1]. >> >> Are some Internet Archive volunteers as lazy about researching basic >> copyright questions as some Wikimedia Commons volunteers? >> >> If anyone is interested, I'll be glad to convert this file to djvu and >> upload it to Wikimedia Commons using my local copy of ABBYY. >> >> [1] >> https://catalog.hathitrust.org/Search/Home?type%5B%5D=title%5B%5D=a%20manual%20of%20ancient%20history%5B%5D=AND%5B%5D=author%5B%5D=Rawlinson=1=20=ft >> >> On Wed, Mar 6, 2019 at 7:39 PM Federico Leva (Nemo) >> wrote: >> >>> Seriously, a takedown for an 1870 book by an author who died in 1902? 
>>> https://en.wikipedia.org/wiki/George_Rawlinson >>> https://books.google.com/books?id=5mwTYAAJ >>> >>> Federico >>> >>> Messaggio inoltrato >>> Oggetto: archive.org item disabled >>> Data: Wed, 6 Mar 2019 17:30:31 -0500 >>> Mittente: Internet Archive >>> >>> Hello, >>> >>> Access to the following item has been disabled following receipt by >>> Internet Archive of a copyright claim issued by The Publishers >>> Association: >>> >>> https://archive.org/details/bub_gb_5mwTYAAJ >>> >>> Some general information about take down notices and processes may be >>> found at https://lumendatabase.org, including information about >>> submitting a counter-notice, if applicable: >>> >>> https://lumendatabase.org/topics/29 >>> https://lumendatabase.org/topics/14 >>> >>> The Internet Archive provides these links as a potential resource and >>> cannot guarantee that any specific information posted at >>> lumendatabase.org is accurate or complete. >>> >>> The Internet Archive Terms of Use, including our Copyright Policy, are >>> posted at https://archive.org/about/terms.php. >>> >>> As a general note: repeated posting of infringing material may result in >>> the disabling of a user’s account. >>> >>> --- >>> The Internet Archive Team >>> >>> >>> ___ >>> Wikisource-l mailing list >>> Wikisource-l@lists.wikimedia.org >>> https://lists.wikimedia.org/mailman/listinfo/wikisource-l >>> >> ___ >> Wikisource-l mailing list >> Wikisource-l@lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/wikisource-l >> > ___ > Wikisource-l mailing list > Wikisource-l@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikisource-l > ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
Re: [Wikisource-l] quickstatements for missing editions
Yes, you definitely need this flow of useful interproject links both ways: as a trigger for Wikidatans to do more with Wikisource pages, and as a trigger for Wikisourcerers to do more with Wikidata items. On Wed, Nov 1, 2017 at 10:01 AM, Sam Wilson <s...@samwilson.id.au> wrote: > Yup, still true. We do at least have a common goal of structured HTML, as > defined by http://schema.org/CreativeWork > > It sounds like Tpt's scraper will do wonders, if a Wikisource just > complies with that. I think that's one of the next steps we need to take. > > I sort of figure from the English Wikisource point of view that we should > do more on bringing data *in* from Wikidata, in our {{header}}, rather than > working on making it easier to extract data *out* with > microformats/structured-HTML. Well, we should do both, of course! :-) But > my feeling from the process of getting Author data in from Wikidata is that > the whole Wikidata integration becomes so much more worthwhile and clearer > (and we sort out the various edge cases) when we're actively using it for > real. > > But of course, each Wikisource is in a similar position. :-( And are we to > all be developing the Lua scripts and templates in isolation? Indeed no! > :-) We shall put them all together in our brave new Wikisource extension! :) > > —sam > > > > On Wed, 1 Nov 2017, at 04:03 PM, Andrea Zanni wrote: > > @Sam, Tpt, > my personal experience too is that HTML is the way to pull out the > important Wikisource metadata, > but it's also that every Wikisource has sort of a different way to show > it, > meaning that you need to tweak your scraper for each Wikisource. > Is that still true? Last time I did it was more than one year ago, but I > need to try it again soon. > Aubrey > > On Wed, Nov 1, 2017 at 1:00 AM, Sam Wilson <s...@samwilson.id.au> wrote: > > Yes I think you're definitely right! 
The easier way to send Wikisource > data to Wikidata is going to be a clever gadget that reads the > microformat or schema'd info in each page. My hack was just a quick and > easy test at getting some things added. :) > > Ultimately, I'm actually not that excited about working on the tools > that we need to transfer the data. No no I don't mean that! Well, just > that the end point we're aiming at is that a bunch of info *won't be* at > all in Wikisource, but will be pulled from Wikidata, and so I am much > more interested in making better tools for working with the data in > Wikidata. :-) If you see what I mean. > > My idea with ws-search is that it will progressively pull more and more > data from Wikidata, and only resort to HTML scraping where the data is > missing from Wikidata. I'm attempting to encapsulate this logic in the > `wikisource/api` PHP library. > > > > On Tue, 31 Oct 2017, at 11:14 PM, Thomas Pellissier Tanon wrote: > > Hello Sam, > > > > Thank you for this nice feature! > > > > I have created a few months ago a prototype of Wikisource to Wikidata > > importation tool for the French Wikisource based on the schema.org > > annotation I have added to the main header template (I definitely think > > we should move from our custom microformat to this schema.org markup > that > > could be much more structured). It's not yet ready but I plan to move it > > forward in the coming weeks. A beginning of frontend to add to your > > Wikidata common.js is here: > > https://www.wikidata.org/wiki/User:Tpt/ws2wd.js > > We should probably find a way to merge the two projects. > > > > Cheers, > > > > Thomas > > > > > Le 31 oct. 2017 à 15:10, Nicolas VIGNERON <vigneron.nico...@gmail.com> > a écrit : > > > > > > 2017-10-31 13:16 GMT+01:00 Jane Darnell <jane...@gmail.com>: > > > Sorry, I am much more of a Wikidatan than a Wikisourcerer! 
I was referring to items like this one > > > https://www.wikidata.org/wiki/Q21125368 > > > > > > No need to be sorry, that is actually a good question and this example > is even better (I totally forgot this kind of case). > > > > > > For now, it is probably better to deal with these by hand (and I'm not > sure what this tool can even do for them). > > > > > > Cdlt, ~nicolas > > > ___ > > > Wikisource-l mailing list > > > Wikisource-l@lists.wikimedia.org > > > https://lists.wikimedia.org/mailman/listinfo/wikisource-l
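A gadget of the kind Tpt and Sam describe, which reads the schema.org-annotated header of a Wikisource page, could start from something as small as the sketch below. The sample markup is invented for illustration (it is not the actual fr.wikisource header HTML); only the microdata convention itself (`itemprop` attributes) is assumed.

```python
from html.parser import HTMLParser

class ItempropScraper(HTMLParser):
    """Collect itemprop -> text for simple microdata-annotated markup."""
    def __init__(self):
        super().__init__()
        self.props = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemprop" in attrs:
            # Remember which property the next text run belongs to.
            self._current = attrs["itemprop"]

    def handle_data(self, data):
        if self._current:
            self.props[self._current] = data.strip()
            self._current = None

# Hypothetical header markup, following the schema.org microdata style.
sample = ('<div itemscope itemtype="http://schema.org/Book">'
          '<span itemprop="name">Les Fleurs du mal</span>'
          '<span itemprop="author">Charles Baudelaire</span></div>')
scraper = ItempropScraper()
scraper.feed(sample)
```

A real importer would fetch the rendered page HTML first and map the collected properties onto Wikidata statements; this sketch only shows the scraping half.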
Re: [Wikisource-l] quickstatements for missing editions
Sorry, I am much more of a Wikidatan than a Wikisourcerer! I was referring to items like this one https://www.wikidata.org/wiki/Q21125368 On Tue, Oct 31, 2017 at 10:48 AM, Nicolas VIGNERON < vigneron.nico...@gmail.com> wrote: > 2017-10-31 10:00 GMT+01:00 Jane Darnell <jane...@gmail.com>: > >> We want the disambiguation pages on Wikisource - I checked a few of these >> and there are a lot of women and "younger sons" in them that we want. Also, >> many can be connected to existing "family of ..." pages or name >> disambiguation pages - they definitely help enrich our understanding of the >> problems of disambiguation over time. >> > > I'm guessing you're talking about pages in https://en.wikisource.org/wiki/Category:Author_disambiguation_pages (which only exist on en.ws) but > they are in the Author: namespace and (if I'm not mistaken) the WS search > tool here only looks in the main namespace (as it's focused on editions). > > So this is a bit beside the point of this thread, but still, you are very right: > for people in particular and for everything in general, disambig pages are > indeed important, and ideally the tool should not just discard them as 'not > an edition' (if it is technically possible to spot them, obviously). > > Cdlt, ~nicolas > > ___ > Wikisource-l mailing list > Wikisource-l@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikisource-l
Re: [Wikisource-l] quickstatements for missing editions
We want the disambiguation pages on Wikisource - I checked a few of these and there are a lot of women and "younger sons" in them that we want. Also, many can be connected to existing "family of ..." pages or name disambiguation pages - they definitely help enrich our understanding of the problems of disambiguation over time. On Tue, Oct 31, 2017 at 9:46 AM, Nicolas VIGNERON < vigneron.nico...@gmail.com> wrote: > Yes it's certainly a first draft!! :-) Thanks for trying it out. >> >> With the disambig pages, can you suggest how to detect them? >> > > Not sure. > Could you detect the presence of the Q6148868 template? (And the same thing > for Q15701815.) > Or else maybe with the categories. > > >> Ah, there's a couple of other bugs here: >> >> The page https://fr.wikisource.org/wiki/Accroupissements actually >> already has a Wikidata ID, but the ws-search database didn't know about >> it :-( probably because it was failing for a while on some weird >> problems. I've re-run the scraper, and now that work is showing up with >> its proper Q-number: >> https://tools.wmflabs.org/ws-search/?title=Accroupissements; >> author==fr >> >> The idea with the quickstatements is that it'll only show it for works >> that are *not yet* linked to Wikidata. This is where the disambig >> problem comes in, because there doesn't seem to be a simple way to >> determine what's an edition and what's a work without resorting to >> Wikidata. We could look at categories? Is it a truth universally >> acknowledged that pages in the categories defined as >> https://www.wikidata.org/wiki/Q15939659 are all disambiguation pages? >> That could work... >> > > The truth (and I guess it is universal, but could someone confirm?) is that > pages with 'multiple editions' are 'works' (Q571, this is what I do for > fr.wikisource at least). > > Thank you for all the work! 
> > Cdlt, ~nicolas > > ___ > Wikisource-l mailing list > Wikisource-l@lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikisource-l
Re: [Wikisource-l] Wikimedia Strategy
Yes I agree - totally wonderful. And there are more ways to make a more meaningful query out of this (in Dutch #1 is Barbapapa and in English the Simpsons take first place), by either specifying it can't be a film, or just filtering for an inception date before 1970. On Tue, Apr 11, 2017 at 12:51 PM, Gerard Meijssen <gerard.meijs...@gmail.com > wrote: > Hoi, > Classification as we have it is a wonder. It is there and it cannot be > explained. It does serve a purpose though. > Thanks, > GerardM > > On 11 April 2017 at 12:44, Jane Darnell <jane...@gmail.com> wrote: > >> Interesting query, thanks! How odd that "sitcom" is a subclass of >> "literary work"! I never thought of it that way :) >> >> On Tue, Apr 11, 2017 at 12:23 PM, Magnus Manske < >> magnusman...@googlemail.com> wrote: >> >>> The 500 most important (as in, number of Wiki sitelinks) literary works >>> that are (at least partially) in "original language" German, according to >>> Wikidata: >>> http://tinyurl.com/mzhd8na >>> "The Big Bang Theory" item might need some review, but the rest look >>> good... >>> Just change the Q188 and the language code for your favourite language! >>> >>> On Tue, Apr 11, 2017 at 10:58 AM Andrea Zanni <zanni.andre...@gmail.com> >>> wrote: >>> >>>> In it.source we made a similar Canon: >>>> https://it.wikisource.org/wiki/Wikisource:Canone_delle_opere_della_letteratura_italiana >>>> >>>> Ideally, we should have an item (a "work" item, so basically the one >>>> with a Wikipedia article) on Wikidata for each one. >>>> Then we can count how many Wikipedias have an article on it. Basically >>>> it's Tpt's idea using Wikidata and sitelinks. >>>> >>>> Aubrey >>>> >>>> >>>> On Tue, Apr 11, 2017 at 11:50 AM, Jane Darnell <jane...@gmail.com> >>>> wrote: >>>> >>>> You can always start with the lists per country (if they exist). 
So for >>>> example I made an article about the first 500 of such a "1000 most >>>> important works of literature" list compiled for the Netherlands here: >>>> https://en.wikipedia.org/wiki/Canon_of_Dutch_Literature >>>> >>>> On Tue, Apr 11, 2017 at 10:44 AM, Thomas PT <thoma...@hotmail.fr> >>>> wrote: >>>> >>>> A maybe simpler metric: the top 1000 Wikipedia articles about works by >>>> page views. >>>> >>>> Thomas >>>> >>>> > On Apr 11, 2017, at 09:42, mathieu stumpf guntz < >>>> psychosl...@culture-libre.org> wrote: >>>> > >>>> > Hi Nemo, >>>> > >>>> > We may establish a list of the "1000 works that every Wikisource >>>> should have" (with translation possibly needed). >>>> > >>>> > What metric could we use to define such a list? Maybe reference >>>> frequency, but it requires statistics whose availability is unknown to me. >>>> > >>>> > Statistically, >>>> > psychoslave >>>> > >>>> > On 29/03/2017 at 08:30, Federico Leva (Nemo) wrote: >>>> >> One issue sometimes raised about Wikisource is how we know that >>>> we're working on the "right" books. Internet Archive is planning to >>>> digitize textbooks starting from those which are most frequently assigned in USA >>>> schools: >>>> >> http://blog.archive.org/2017/03/29/books-donated-for-macarthur-foundation-100change-challenge-from-bookmooch-users/ >>>> >> >>>> >> I was surprised to learn a project like OpenSyllabus exists and >>>> works; I emailed them to ask what it would take to do the same for other >>>> languages/geographies. >>>> >> >>>> >> Nemo >>>> >> >>>> >> ___ >>>> >> Wikisource-l mailing list >>>> >> Wikisource-l@lists.wikimedia.org >>>> >> https://lists.wikimedia.org/mailman/listinfo/wikisource-l >>>> > >>>> > >>>> > ___ >>>> > Wikisource-l mailing list >>>> > Wikisource-l@lists.wikimedia.org >>>> > https://lists.wikimedia.org/mailman/listinfo/wikisource-l >>>> >
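Jane's filters (exclude films, require an early inception date) translate directly into SPARQL against the Wikidata Query Service. The sketch below assembles such a query as a string; the property and item IDs are real Wikidata identifiers (P31 instance of, P279 subclass of, P407 language of work, P571 inception, Q7725634 literary work, Q11424 film, Q188 German), but the query as a whole is an illustration, not Magnus's actual tinyurl query.

```python
def literary_works_query(language_item="Q188", before_year=1970):
    """Build a SPARQL query for literary works in a given language,
    excluding films and keeping only works whose inception predates
    the given year."""
    return f"""
SELECT ?work ?workLabel WHERE {{
  ?work wdt:P31/wdt:P279* wd:Q7725634 ;   # instance of literary work
        wdt:P407 wd:{language_item} ;     # language of work
        wdt:P571 ?inception .
  MINUS {{ ?work wdt:P31 wd:Q11424 . }}   # not also a film
  FILTER (YEAR(?inception) < {before_year})
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
"""

q = literary_works_query()
```

The resulting string can be pasted into https://query.wikidata.org or sent to its SPARQL endpoint.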
Re: [Wikisource-l] OCR as a service?
Nice! I will wait for the client though, thx. Where will the source images be stored? Labs or Commons? It would be nice if you could somehow make a client that builds a djvu file locally with the page image and the OCR text that you can clean up before putting it into the djvu file. Now it just seems there are so many hurdles to ws that it's quicker to post pages to Commons and add the text in the template there. On Wed, Jul 29, 2015 at 8:23 AM, Asaf Bartov abar...@wikimedia.org wrote: Hello again. So, I've set up an OpenOCR instance on Labs that's available for use as a service. Just call it and point to an image. Example: curl -X POST -H "Content-Type: application/json" -d '{"img_url":"http://bit.ly/ocrimage","engine":"tesseract"}' http://openocr.wmflabs.org/ocr should yield: You can create local variables for the pipelines within the template by prefixing the variable name with a "$" sign. Variable names have to be composed of alphanumeric characters and the underscore. In the example below I have used a few variations that work for variable names. If we see evidence of abuse, we might have to protect it with API keys, but for now, let's AGF. :) I'm working on something that would be a client of this service, but don't have a demo yet. Stay tuned! :) A. On Sun, Jul 12, 2015 at 3:27 PM, Alex Brollo alex.bro...@gmail.com wrote: I explored abbyy.gz files, the full XML output from the ABBYY OCR engine running at the Internet Archive, and I've been astonished by the amount of data they contain - they are stored at XCA_Extended detail (as documented at http://www.abbyy-developers.com/en:tech:features:xml ). Something that Wikisource's best developers should explore; comparing that data with the little bit of data in the mapped text layer of djvu files is impressive and should be inspiring. But they are static data coming from a standard setting... 
nothing similar to a service with simple, shared, deep learning features for difficult and ancient texts. I tried the ancient-Italian Tesseract dictionary with very poor results. So Asaf, I can't wait for good news from you. :-) Alex 2015-07-12 12:50 GMT+02:00 Andrea Zanni zanni.andre...@gmail.com: On Sun, Jul 12, 2015 at 11:25 AM, Asaf Bartov abar...@wikimedia.org wrote: On Sat, Jul 11, 2015 at 9:59 AM, Andrea Zanni zanni.andre...@gmail.com wrote: uh, that sounds very interesting. Right now, we mainly use OCR from djvu from Internet Archive (that means ABBYY FineReader, which is very nice). Yes, the output is generally good. But as far as I can tell, the archive's Open Library API does not offer a way to retrieve the OCR output programmatically, and certainly not for an arbitrary page rather than the whole item. What I'm working on requires the ability to OCR a single page on demand. True. I've recently met Giovanni, a new (Italian) guy who's now working with Internet Archive and Open Library. We discussed a number of possible partnerships/projects; this is definitely one to bring up. But if we manage to do it directly in the Wikimedia world it's even better. Aubrey ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l -- Asaf Bartov Wikimedia Foundation http://www.wikimediafoundation.org Imagine a world in which every single human being can freely share in the sum of all knowledge. Help us make it a reality! 
https://donate.wikimedia.org ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
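Asaf's curl example maps one-to-one onto any HTTP client. Below is a standard-library sketch that builds the same POST request; it only constructs the request rather than sending it (the openocr.wmflabs.org endpoint is taken from the email and has long since been retired).

```python
import json
import urllib.request

def build_ocr_request(img_url, engine="tesseract",
                      endpoint="http://openocr.wmflabs.org/ocr"):
    """Build the POST request the OpenOCR service expects:
    a JSON body naming the image URL and the OCR engine."""
    body = json.dumps({"img_url": img_url, "engine": engine}).encode()
    return urllib.request.Request(
        endpoint, data=body,
        headers={"Content-Type": "application/json"})

req = build_ocr_request("http://bit.ly/ocrimage")
# urllib.request.urlopen(req).read() would return the recognized text.
```

A client like the one Asaf mentions would wrap this call per page and feed the text into the proofreading workflow.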
Re: [Wikisource-l] Playing with Lua, javascript and pagelist tag
Hi Alex, No, the book I want to do is still on my computer. That link you looked at is a book that was printed about 20 years ago as a facsimile of one printed in the 17th century. It does happen to be a book that has been massively reused, and it's not even the original! The book is just plates, and the only text on them is in the description section of the files. I don't see the point of having it on Wikisource because you can use it more easily on Commons (and each page can be linked to any language-pedia). I will try to upload the part of the book I mean so you can take a look at my specific problem. It's 3 volumes but there is a section that is particularly problematic. Jane 2013/6/7, Alex Brollo alex.bro...@gmail.com: @Jane again: I'd better look before and talk after. I see a collection of jpg's from scans, not a djvu file or an Index: page in a wikisource project. :-) So I presume that I can't find any pagelist tag. :-) Did you personally scan those pages? Did you scan all the pages of the book (if it's a book...)? Do you know if any complete scan of the book has been published previously (in Internet Archive, Google books, or other digital libraries)? Next time, if you scan or take pictures of all the pages of a book and load all the images to Commons, some willing wikisourcian could mount them into a multipage djvu file and open an Index: page to proofread it in a wikisource project. 2013/6/7 Alex Brollo alex.bro...@gmail.com Thanks for the suggestions; I can only promise I'll think about them. The question by Micru is particularly hard. :-( @ Jane: I've to read your mail again and again; nevertheless a well compiled pagelist tag can really identify any page of the book in a unique way, even if the pages have no page number, and tl|Pg manages the djvu page/book page relationship easily even if the book page is identified by something like Figure 1, Figure 2. I'll take a look at your book. 
Alex 2013/6/7 Jane Darnell jane...@gmail.com I have been wondering the same thing for years. When I upload prints to Wikimedia Commons, I am generally in a hurry and just use the default uploader to get it out there. Weeks or months or sometimes years later I will add in the detailed metadata like the book it was first published in, alternate sources for the print from the one I used, the publisher if that is a different person than the artist, etc. What I don't bother with is page numbers, because this is often unknown and changes from edition to edition. You can get around this problem by naming specific editions held in specific libraries with specific page numbers, which I have done occasionally. Some prints are so well known they go by their own titles, and the Wikimedia Commons artwork template even has a field Original title to deal with this issue. When you go through an index of plates in any older book, generally there are some mistakes, such as blank pages that are indexed because the plate didn't make it to the printer, some plates the printer added that didn't make it into the index, and of course the really confusing one, the prints that a reprinter added that neither the original author nor the original publisher ever saw. One reason I have not spent much time on Wikisource is because I feel I have to decide up front what the structure of the book will be with page numbering (which sometimes does not count the plates), so I need to base this on the original index or original list of chapters. Sometimes a book becomes famous just for one passage, and that passage may not even be indexed in the original version. How do you add these links? On Wikimedia Commons you can keep on adding values to fields, and change the Information template to Artwork to get more fields. 
You can even add annotations to files and then put links to other files in the annotations, so that through the Global usage property you can see where such prints have been quoted or re-used. How do you do this with books? I would like to see a flexible way to set this up that makes it easy to come back and make corrections or additions to the published information in both indexes and ToC's based on later discovery. This book of prints for example shows a page order based on one edition that was reproduced in facsimile version, but other versions exist with different plates: http://commons.wikimedia.org/wiki/Category:32_afbeeldinge_der_Graven_van_HOLLANDT How do you set up page numbers for this, because there weren't any to start with? Jane 2013/6/7, Andrea Zanni zanni.andre...@gmail.com: On Fri, Jun 7, 2013 at 1:36 AM, David Cuenca dacu...@gmail.com wrote: Automatic creation of page transclusion is nice but also dangerous... too many structures to have an easy solution. What Alex is thinking, if I understand his work correctly, is that when you work on a new book in nsPage, you define what the structure is (his work right
Re: [Wikisource-l] Wikisource user group proposal page started
For WLM we have project pages on Commons, even though most participating countries have their WLM lists on Wikipedia. Maybe Wikisource should do this too: have all projects and associated files residing on Commons, with only actual text interfaces on Wikisource. Many more people can be found on Commons than on Meta. I signed up anyway. 2013/6/2, David Cuenca dacu...@gmail.com: Hi there, In order to guarantee that there are more general Wikisource projects in the future, like those outlined in the Wikisource vision[1], which benefit the whole community and not specific language communities, and that there is a legitimate way of approaching institutions for collaborations or funding, it would be great if everyone who is interested in actively improving Wikisource would join the proposed user group! http://meta.wikimedia.org/wiki/Wikisource_User_Group Perhaps it is also a good way to launch more offline activities like the Wikisource workshop during the DC GlamWiki Boot Camp that Chris and Doug started [2]. What are your thoughts about this? Cheers, David --Micru [1] http://wikisource.org/wiki/Wikisource_vision_development/Applying_the_WS_values [2] http://en.wikipedia.org/wiki/Wikipedia:GLAM/Boot_Camp ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
Re: [Wikisource-l] Reunification of Wikisources
The most important reason, I think, for moving Wikisource content to Commons is that many original manuscripts have a one-to-many relationship with other texts in other languages. There is no definitive translation of the Bible, Anna Karenina, Don Quixote, and so forth. However, when reading these texts, the reader should be able to see where related content is available in sister projects. On Wikimedia Commons, you see this with the Global usage feature. This would be perfect to use for text pages of books as well. Many engravings are used on Commons in multiple projects, without the original text being available on Wikisource. It would be a good project just to line up what we already have - for example, uniting a title page with one of the other engravings in a book on a Wikisource book stub page. Look at the global usage of this file for example: http://commons.wikimedia.org/wiki/File:Don_Quixote_5.jpg Up until now only the illustrations are shared this way, but I think the whole book should be possible to read in DjVu on Commons, no matter what language the text is in, and no matter what the language interface is of the user on Commons. As it stands now, it is only possible to see this Global usage feature on Commons files, not on text files (because they can only link to one version of a text in another project per page). In the example above you can see that the same engraving is used on two different pages on the French Wikisource. You can't see that anywhere on Wikisource, only here in the Global usage feature on Commons. By the way, I am not for getting rid of the separate Wikisource language projects altogether, because I think they still fill an important purpose for government documents and other things that will never or rarely be translated. I am just saying that it would be better to have full texts of original works easily available on Commons page by page (and perhaps we should involve Wikiquote in this too, to split pages when necessary). 
2013/6/2, Federico Leva (Nemo) nemow...@gmail.com: David Cuenca, 02/06/2013 02:22: [...] specially now that projects like Wikidata have shown that it is possible to have both localization and centralization living in harmony. We're VERY far from such a harmony, or maybe I'm misunderstanding what you mean here. We don't have a true solution for the problem of a multilingual wiki, Commons' pains show it well. https://wikimania2013.wikimedia.org/wiki/Submissions/Multilingual_Wikimedia_Commons_-_What_can_we_do_about_it From what I recall, localisation was definitely not the reason for splitting. It's also wrong to assume that bringing people on the same wiki will give you a single community: you may well just lose the (senses of) communities and end up with a dispersed array of editors. Nemo ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
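The Global usage view Jane points to is also exposed through the Commons web API: `prop=globalusage` is a real MediaWiki API module, while the rest of this sketch is illustrative and only builds the query URL rather than fetching it.

```python
from urllib.parse import urlencode

def global_usage_url(filename, limit=50):
    """URL for the Commons API query that lists every wiki page using
    a given file - the API behind the "Global usage" view."""
    params = {
        "action": "query",
        "prop": "globalusage",
        "titles": f"File:{filename}",
        "gulimit": limit,
        "format": "json",
    }
    return "https://commons.wikimedia.org/w/api.php?" + urlencode(params)

url = global_usage_url("Don_Quixote_5.jpg")
```

Fetching that URL returns JSON listing the wiki and page title of each use, which is exactly the cross-project reuse information Jane wants surfaced next to texts.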
Re: [Wikisource-l] Reunification of Wikisources
Alex, thanks for that perspective. I myself was wondering if anyone counted how many entries are in the books category here: http://commons.wikimedia.org/wiki/Category:Books and then the entries for books per sister project on Wikisource. My gut feeling is that the number per Wikisource entity will be smaller, but I may be wrong. 2013/6/2, Alex Brollo alex.bro...@gmail.com: Just to have a try: imagine installing the proofread extension and adding Index: and Page: namespaces to Commons, allowing users to use Commons as an optional proofreading lab. No more painful alignment of authors' metadata: there is the Creator namespace/template. No more painful aligning of book metadata: there is the Book template. No more painful localization: there is a powerful set of localization templates/scripts. No more need for tricks for multilingual books: they would be as simple as monolingual ones. Obviously such Commons books would need to be fully shared - as an HTML rendering - into any wikisource project, just adding a message "Work can be edited on Commons", just as happens for shared media. Alex 2013/6/2 Federico Leva (Nemo) nemow...@gmail.com David Cuenca, 02/06/2013 02:22: [...] specially now that projects like Wikidata have shown that it is possible to have both localization and centralization living in harmony. We're VERY far from such a harmony, or maybe I'm misunderstanding what you mean here. We don't have a true solution for the problem of a multilingual wiki, Commons' pains show it well. https://wikimania2013.wikimedia.org/wiki/Submissions/Multilingual_Wikimedia_Commons_-_What_can_we_do_about_it From what I recall, localisation was definitely not the reason for splitting. 
It's also wrong to assume that bringing people on the same wiki will give you a single community: you may well just lose the (senses of) communities and end up with a dispersed array of editors. Nemo ___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l