You can see a great advantage of djvu files over pdf files in the current file list of any IA item: IA has removed the djvu files, but it still builds and publishes a _djvu.xml file. Why? I presume that IA uses that file to "map words" in its book viewer, since it has a good text structure while being *pretty simple*. It can be translated into hOCR, and, once its text nodes have been edited, the corrected text can be loaded back into the djvu file. Itsource is testing, on some texts, tricks to mass-fix the djvu text layer (removing scannos etc.) *before* uploading it to Commons.
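To make the "editing its text nodes" idea concrete, here is a minimal sketch in Python. It assumes the usual DjVu XML layout, in which each recognized word sits in a `WORD` element with a `coords` attribute; the sample document, the `SCANNOS` table and the `fix_scannos` name are made up for illustration, and a real _djvu.xml file carries more structure than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical scanno table: wrong OCR reading -> corrected word.
SCANNOS = {"tbe": "the", "aud": "and"}

# Tiny hand-written sample imitating the WORD/LINE nesting of a _djvu.xml file.
SAMPLE = (
    "<DjVuXML><BODY><OBJECT><HIDDENTEXT><PAGECOLUMN><REGION><PARAGRAPH>"
    '<LINE><WORD coords="10,20,30,5">tbe</WORD> '
    '<WORD coords="35,20,60,5">book</WORD></LINE>'
    "</PARAGRAPH></REGION></PAGECOLUMN></HIDDENTEXT></OBJECT></BODY></DjVuXML>"
)


def fix_scannos(xml_text, table):
    """Replace known scannos in WORD text nodes; coordinates stay untouched."""
    root = ET.fromstring(xml_text)
    changed = 0
    for word in root.iter("WORD"):
        fixed = table.get(word.text)
        if fixed is not None:
            word.text = fixed
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


fixed_xml, n_changed = fix_scannos(SAMPLE, SCANNOS)
print(n_changed, "word(s) fixed")
```

Only the text of each word changes, so the word-to-position mapping that the viewer relies on is preserved. Loading the result back into the djvu file is a separate step that is not shown here.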
It's a pity, IMHO, that this magic book format has been disregarded. Its structure is *open*, just as the pdf structure is *closed*.

Alex

2017-01-03 0:19 GMT+01:00 Sam Wilson <s...@samwilson.id.au>:

> I wonder if, rather than creating a new IA item, we should just link the
> original IA item to the DjVu on Commons (via a review)? Or is there a
> discoverability benefit to be had by having the DjVu also on IA?
>
> On Tue, 3 Jan 2017, at 07:07 AM, Sam Wilson wrote:
>
> Good idea. I guess it's not ideal to end up with two items, but at least
> the 2nd will be updateable from our end.
>
> It looks like we can add HTML links to IA reviews too, which is nice:
> https://archive.org/details/spinoza_etica_paravia
>
> On Mon, 2 Jan 2017, at 11:52 PM, Alex Brollo wrote:
>
> Done :-)
>
> Alex
>
> 2017-01-02 16:49 GMT+01:00 Alex Brollo <alex.bro...@gmail.com>:
>
> Please take a look at https://archive.org/details/spinoza_etica_paravia_djvu;
> this is precisely a djvu-only item that I uploaded some days ago. I asked
> in the IA forum for permission to create "djvu-only items" and I got it;
> this is the first item I created. As you can see, there is an "implicit
> convention" too: the item name is the original one plus a _djvu suffix
> (it has been derived from https://archive.org/details/spinoza_etica_paravia),
> and the metadata are the same, except for a standard warning, "Derived from
> files into L'Etica <https://archive.org/details/spinoza_etica_paravia>",
> in the description field.
>
> So far I have not done the last step, i.e. adding a "backlink" from the
> original item to the derived one.
>
> internetarchive.py makes it possible to automate the whole job (downloading
> the metadata of the source item, building the new item name, adding the
> warning to the description field and uploading the new item).
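The naming and metadata convention described in the quoted message can be sketched as a small pure function. This is only an illustration under stated assumptions: `derive_djvu_item` is a hypothetical helper, the exact wording and HTML of the warning are guesses, and the network steps are left out. In practice the source metadata would presumably come from the internetarchive library (for example `get_item(source_id).metadata`), and the new item would then be uploaded with the same library.

```python
def derive_djvu_item(source_id, source_metadata):
    """Build the identifier and metadata for a djvu-only derived item.

    Hypothetical helper: it follows the "_djvu suffix + warning in the
    description" convention, but the warning format is an assumption.
    """
    new_id = f"{source_id}_djvu"
    title = source_metadata.get("title", source_id)
    warning = (
        "Derived from files in "
        f'<a href="https://archive.org/details/{source_id}">{title}</a>'
    )

    # Copy the source metadata, dropping the field that names the old item.
    new_metadata = dict(source_metadata)
    new_metadata.pop("identifier", None)

    old_description = new_metadata.get("description", "")
    if old_description:
        new_metadata["description"] = f"{warning}<br />{old_description}"
    else:
        new_metadata["description"] = warning
    return new_id, new_metadata


# Example with made-up metadata for the item mentioned in the thread.
new_id, new_metadata = derive_djvu_item(
    "spinoza_etica_paravia",
    {"identifier": "spinoza_etica_paravia", "title": "L'Etica"},
)
print(new_id)
```

Keeping this step free of network calls makes it easy to check the derived name and description before anything is actually uploaded.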
_______________________________________________ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l