Re: [Wikisource-l] IA Upload tool — higher-quality DjVus

2017-02-12 Thread Andrea Zanni
On Mon, Feb 13, 2017 at 1:59 AM, Sam Wilson wrote:

> That's a great idea!
> I think we can use Wikidata to build the list:
>
> http://tinyurl.com/zwdbzyq
>
>
en.wikisource is probably the only one that has filled in all of its
Wikisource data in Wikidata... or have other Wikisources done that too?
Do you have a workflow to share?


> I had been erroneously thinking along the lines that we'd have to be
> uploading something to the items before making it part of a Wikisource
> collection, but of course that's not necessary. I think your hierarchy of
> wikisource collections sounds perfect.
>

perfect.



> It'd be cool if items with a page on a Wikisource could have a little
> footnote like they do for Open Library ones ("[Open Library icon] This
> book has an editable web page on Open Library.").
>

We can try to convince them to do that. It would only be for a fraction of
the books: a few thousand out of the millions they have.
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l


Re: [Wikisource-l] IA Upload tool — higher-quality DjVus

2017-02-12 Thread Sam Wilson
That's a great idea!

I think we can use Wikidata to build the list:
http://tinyurl.com/zwdbzyq


I had been erroneously thinking along the lines that we'd have to be
uploading something to the items before making it part of a Wikisource
collection, but of course that's not necessary. I think your hierarchy
of wikisource collections sounds perfect.


It'd be cool if items with a page on a Wikisource could have a little
footnote like they do for Open Library ones ("[Open Library icon] This
book has an editable web page[1] on Open Library[2].").


—sam



On Sun, 12 Feb 2017, at 08:17 PM, Andrea Zanni wrote:
> Hi everyone,
> I made this; hopefully it is helpful:
> https://docs.google.com/spreadsheets/d/158GvBrPBW0KfREHRmLFK7EhuB-FQBkLbm9qxJBaJTUY/edit?usp=sharing
>
> It's the list of the files on Commons uploaded from Internet Archive.
> The idea, right now, is that every language Wikisource would take care
> of its own uploads, and when there are more than 50 they create an
> "Italian/German/Bengali Wikisource" collection on Internet Archive.
> The whole set of collections will be inside one global "Wikisource"
> collection.
> Make sense? Do you agree?
>
> On Thu, Feb 9, 2017 at 8:38 AM, Sam Wilson wrote:
>> On Thu, 9 Feb 2017, at 03:13 PM, Alex Brollo wrote:
>>> Thanks Sam!
>>> Now we should focus on help about requisites of a good,
>>> wikisource-oriented IA upload: proper scan quality, good file names
>>> and useful metadata. IMHO it would be great to build a "wikisource
>>> collection" into IA, since collection admins can edit any item
>>> detail but its ID, and fix most mistakes.
>>
>> That sounds like a great idea! So it sounds like[3] we need to have
>> 50 items already uploaded before they'll create a collection for us.
>> Then, maybe we build it into ia-upload: a way of uploading and
>> setting metadata for a set of scan files? It would upload files to IA
>> and then do the DjVu-creating thing and upload just the DjVu to
>> Commons?
>>
>> Or do people upload to Commons first? And then our tool takes a file
>> (or category of files), uploads it to IA, and then pulls the DjVu
>> back from there and adds it to the same category?
>>
>> (I'm sort of thinking aloud...)

Links:

  1. http://openlibrary.org/ia/thatremystre00gaut
  2. https://openlibrary.org/
  3. https://archive.org/about/faqs.php#Collections


Re: [Wikisource-l] IA Upload tool — higher-quality DjVus

2017-02-12 Thread Andrea Zanni
Hi everyone,
I made this; hopefully it is helpful:
https://docs.google.com/spreadsheets/d/158GvBrPBW0KfREHRmLFK7EhuB-FQBkLbm9qxJBaJTUY/edit?usp=sharing

It's the list of the files on Commons uploaded from Internet Archive.
The idea, right now, is that every language Wikisource would take care of
its own uploads, and when there are more than 50 they create an
"Italian/German/Bengali Wikisource" collection on Internet Archive.
The whole set of collections will be inside one global "Wikisource"
collection.

Make sense? Do you agree?
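
A rough sketch of the grouping rule described above, assuming the
spreadsheet rows can be reduced to (language code, IA identifier) pairs.
The function name and the "wikisource-<lang>" collection names are made up
for illustration; they are not real Internet Archive identifiers.

```python
from collections import Counter

MIN_ITEMS = 50  # threshold mentioned in this thread


def plan_collections(uploads, min_items=MIN_ITEMS):
    """Group uploads by language and list the collections that qualify.

    `uploads` is an iterable of (language_code, ia_identifier) pairs.
    Returns a dict mapping the global parent collection to the
    per-language sub-collections that have more than `min_items` uploads.
    """
    counts = Counter(lang for lang, _ in uploads)
    ready = sorted(lang for lang, n in counts.items() if n > min_items)
    # Hypothetical naming scheme: one sub-collection per language,
    # all nested under a single global "wikisource" collection.
    return {"wikisource": [f"wikisource-{lang}" for lang in ready]}
```

With 51 Italian uploads and 50 Bengali ones, only the Italian collection
would qualify, since the rule is "more than 50".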

On Thu, Feb 9, 2017 at 8:38 AM, Sam Wilson wrote:

> On Thu, 9 Feb 2017, at 03:13 PM, Alex Brollo wrote:
>
> > Thanks Sam!
> > Now we should focus on help about requisites of a good,
> > wikisource-oriented IA upload: proper scan quality, good file names and
> > useful metadata. IMHO it would be great to build a "wikisource
> > collection" into IA, since collection admins can edit any item detail
> > but its ID, and fix most mistakes.
>
> That sounds like a great idea! So it sounds like we need to have 50 items
> already uploaded before they'll create a collection for us. Then, maybe we
> build it into ia-upload: a way of uploading and setting metadata for a set
> of scan files? It would upload files to IA and then do the DjVu-creating
> thing and upload just the DjVu to Commons?
>
> Or do people upload to Commons first? And then our tool takes a file (or
> category of files), uploads it to IA, and then pulls the DjVu back from
> there and adds it to the same category?
>
> (I'm sort of thinking aloud...)


Re: [Wikisource-l] IA Upload tool — higher-quality DjVus

2017-02-08 Thread Sam Wilson


On Thu, 9 Feb 2017, at 03:13 PM, Alex Brollo wrote:

> Thanks Sam!
> Now we should focus on help about requisites of a good,
> wikisource-oriented IA upload: proper scan quality, good file names and
> useful metadata. IMHO it would be great to build a "wikisource
> collection" into IA, since collection admins can edit any item detail
> but its ID, and fix most mistakes.
> 



That sounds like a great idea! So it sounds like[1] we need to have 50
items already uploaded before they'll create a collection for us. Then,
maybe we build it into ia-upload: a way of uploading and setting
metadata for a set of scan files? It would upload files to IA and then
do the DjVu-creating thing and upload just the DjVu to Commons?


Or do people upload to Commons first? And then our tool takes a file (or
category of files), uploads it to IA, and then pulls the DjVu back from
there and adds it to the same category?


(I'm sort of thinking aloud...)






Links:

  1. https://archive.org/about/faqs.php#Collections


Re: [Wikisource-l] IA Upload tool — higher-quality DjVus

2017-02-08 Thread Alex Brollo
Thanks Sam!
Now we should focus on help about the requisites of a good,
wikisource-oriented IA upload: proper scan quality, good file names and
useful metadata. IMHO it would be great to build a "wikisource collection"
into IA, since collection admins can edit any item detail but its ID, and
fix most mistakes.

Alex

2017-02-09 4:10 GMT+01:00 Sam Wilson:

> This new feature is now live on the ia-upload tool:
> http://tools.wmflabs.org/ia-upload/
> Please raise any issues on Github:
> https://github.com/wikisource/ia-upload/issues
>
> The conversion process seems to take about 15 minutes for most books.
> (Books that already have DjVus at IA are uploaded immediately, though.)
>
> Thanks,
> Sam.
>
>
> On Thu, 2 Feb 2017, at 09:33 AM, Sam Wilson wrote:
> > I've been tinkering with the ia-upload tool and incorporating Alex
> > Brollo's better system of DjVu generation (better than converting from
> > PDF, that is; instead it works from the original Jpeg2000 files and
> > merges the OCR data in).
> >
> > I've set up a test installation of the tool at
> > http://tools.wmflabs.org/ia-upload/test/ and would love anyone to have a
> > go at it, and to report any bugs at
> > https://github.com/wikisource/ia-upload/issues
> >
> > Because DjVu generation can take a while (quite a while if you've got a
> > crappy slow laptop like me), the tool runs each job on the grid engine,
> > starting every 5 minutes. The queue is shown on the homepage of the
> > tool, with a status of each job. (Unless you're just re-using an
> > existing DjVu file from the IA, in which case it's just uploaded
> > directly to Commons while you wait, like the tool's always done.)
> >
> > Thanks!


Re: [Wikisource-l] IA Upload tool — higher-quality DjVus

2017-02-08 Thread Sam Wilson
This new feature is now live on the ia-upload tool:
http://tools.wmflabs.org/ia-upload/
Please raise any issues on Github:
https://github.com/wikisource/ia-upload/issues

The conversion process seems to take about 15 minutes for most books.
(Books that already have DjVus at IA are uploaded immediately, though.)

Thanks,
Sam.


On Thu, 2 Feb 2017, at 09:33 AM, Sam Wilson wrote:
> I've been tinkering with the ia-upload tool and incorporating Alex
> Brollo's better system of DjVu generation (better than converting from
> PDF, that is; instead it works from the original Jpeg2000 files and
> merges the OCR data in).
> 
> I've set up a test installation of the tool at
> http://tools.wmflabs.org/ia-upload/test/ and would love anyone to have a
> go at it, and to report any bugs at
> https://github.com/wikisource/ia-upload/issues
> 
> Because DjVu generation can take a while (quite a while if you've got a
> crappy slow laptop like me), the tool runs each job on the grid engine,
> starting every 5 minutes. The queue is shown on the homepage of the
> tool, with a status of each job. (Unless you're just re-using an
> existing DjVu file from the IA, in which case it's just uploaded
> directly to Commons while you wait, like the tool's always done.)
> 
> Thanks!



[Wikisource-l] IA Upload tool — higher-quality DjVus

2017-02-01 Thread Sam Wilson
I've been tinkering with the ia-upload tool and incorporating Alex
Brollo's better system of DjVu generation (better than converting from
PDF, that is; instead it works from the original JPEG 2000 files and
merges the OCR data in).
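
To make the idea concrete, here is a minimal sketch of what such a
pipeline could look like. It is not the tool's actual code. It assumes
OpenJPEG's opj_decompress and the DjVuLibre tools c44, djvm and djvused
are installed, and that the OCR text has already been converted into a
djvused script; the function name and file names are hypothetical. It
only builds the command lines, it does not run them.

```python
from pathlib import Path


def build_djvu_commands(jp2_pages, ocr_script, output):
    """Return the commands (as argument lists) to build one DjVu book.

    jp2_pages:  JPEG 2000 page images, in reading order.
    ocr_script: a djvused script holding the OCR text (assumed to exist).
    output:     path of the bundled DjVu file to create.
    """
    commands = []
    page_djvus = []
    for jp2 in jp2_pages:
        ppm = str(Path(jp2).with_suffix(".ppm"))
        djvu = str(Path(jp2).with_suffix(".djvu"))
        # Decode the JPEG 2000 page to PPM, which c44 can read.
        commands.append(["opj_decompress", "-i", jp2, "-o", ppm])
        # Encode the page as a single-page DjVu file.
        commands.append(["c44", ppm, djvu])
        page_djvus.append(djvu)
    # Bundle all pages into one multi-page document.
    commands.append(["djvm", "-c", output, *page_djvus])
    # Merge the OCR text layer by running the script and saving.
    commands.append(["djvused", output, "-f", ocr_script, "-s"])
    return commands
```

Each returned list could be passed to subprocess.run(cmd, check=True) in
order.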

I've set up a test installation of the tool at
http://tools.wmflabs.org/ia-upload/test/ and would love anyone to have a
go at it, and to report any bugs at
https://github.com/wikisource/ia-upload/issues

Because DjVu generation can take a while (quite a while if you've got a
crappy slow laptop like mine), the tool runs each job on the grid engine,
starting every 5 minutes. The queue is shown on the homepage of the
tool, with a status of each job. (Unless you're just re-using an
existing DjVu file from the IA, in which case it's just uploaded
directly to Commons while you wait, like the tool's always done.)
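
The queue behaviour described above could be modelled roughly like this.
It is only a sketch under assumptions: the class names, the status
strings and the convert callback are invented for illustration and are
not taken from the ia-upload source. The periodic run (every 5 minutes on
the grid engine) is represented by calling run_pending().

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    ia_id: str
    has_djvu: bool
    status: str = "queued"


@dataclass
class UploadQueue:
    jobs: list = field(default_factory=list)

    def submit(self, ia_id, has_djvu):
        """Add a job; items with an existing DjVu skip the queue."""
        job = Job(ia_id, has_djvu)
        if has_djvu:
            # Existing DjVu at IA: upload directly while the user waits.
            job.status = "uploaded"
        self.jobs.append(job)
        return job

    def run_pending(self, convert):
        """Process queued jobs; meant to be called by a periodic task."""
        for job in self.jobs:
            if job.status != "queued":
                continue
            job.status = "running"
            try:
                convert(job.ia_id)  # slow DjVu generation step
                job.status = "uploaded"
            except Exception:
                job.status = "failed"
```

The status field is what a homepage listing of the queue would display
for each job.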

Thanks!
