from:"Alex Brollo"

[Wikisource-l] Re: asking grants to improve WS: WMI is funding some of our projects

2022-07-17 Thread Alex Brollo

😊

Alex brollo

On Thu, 30 Jun 2022 at 10:29, Asaf Bartov wrote:

> Wonderful news!
>
>A.
>
> Asaf Bartov (he/him/his)
>
> Senior Program Officer, Emerging Wikimedia Communities
>
> Wikimedia Foundation <https://wikimediafoundation.org/>
>
> Imagine a world in which every single human being can freely share in the
> sum of all knowledge. Help us make it a reality!
> https://donate.wikimedia.org
>
>
> On Wed, Jun 29, 2022 at 10:55 PM Ruthven  wrote:
>
>> Hi guys,
>> During the Wikisource triage meetings we discussed the opportunity of
>> using Foundation and Chapter grants to improve the project
>> (focusing on the ProofreadPage extension), instead of waiting for the
>> Annual Wishlist or hoping for some computer scientist user with a lot of
>> good will. Other User Groups regularly ask for grants with success, even if
>> the activities do not directly improve the projects.
>>
>> We were lucky because WMI (Wikimedia Italy) funded two grants we
>> requested. We asked for grants for developers on very specific tasks.
>>
>> Sohom is adapting the Edit-in-sequence gadget, used in some projects, to
>> have it natively in the ProofreadPage extension. Jay is upgrading the
>> BookReader (currently, this tool is limited to the Indic Wikisource
>> community) so that other global Wikisource communities can also use it.
>> Sam is supervising the two projects during his work hours.
>>
>> Here's the link to the WMI announcement:
>> https://www.wikimedia.it/news/sostenere-le-idee-dei-volontari-per-liberare-conoscenza/
>>
>> Let's hope that we can pursue this course of action in the future, in
>> order to have efficient and up-to-date projects. Check with your local
>> chapter if there are grants available for developers, and join us in the
>> Triage meetings!
>>
>>  Cheers,
>>  Alex
>> *Ruthven* on Wikipedia
>> ___
>> Wikisource-l mailing list -- wikisource-l@lists.wikimedia.org
>> To unsubscribe send an email to wikisource-l-le...@lists.wikimedia.org
>>

Re: [Wikisource-l] Does really wikisource need djvu/pdf files?

2019-07-12 Thread Alex Brollo

Thank you for mentioning the Google OCR gadget; I didn't know it. I'll
test it for sure, even if I'm far from happy to become dependent on
Google services.

Alex

On Fri, 12 Jul 2019 at 09:26, David Starner wrote:

> On Thu, Jul 11, 2019 at 11:22 PM Alex Brollo 
> wrote:
> >
> > I don't fully understand your statement "Right now, I'm going to convert
> > them to DjVu and upload them, without any text information.". Don't you
> > feel any need for an excellent OCR layer when proofreading it on
> > Wikisource?
>
> I reuploaded the first issue of Weird Tales in DjVu because the PDF
> was significantly fuzzier than the DjVu, and looking at the PDF OCR,
> it's slightly better than what I can get from the interface. Given the
> choice between better images and better OCR, I go with the first one.
>
> > Do you feel fully satisfied by mediawiki OCR of images?
>
> I can't even get the MediaWiki OCR to work. I use the Google OCR gadget.
>
> > I don't know how to get xml data about mapping of words into page image.
>
> It's a pretty distant concern for me, somewhat tangential to producing
> transcriptions of the works.
>
> --
> Kie ekzistas vivo, ekzistas espero.
>
> ___
> Wikisource-l mailing list
> Wikisource-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikisource-l
>

Re: [Wikisource-l] Does really wikisource need djvu/pdf files?

2019-07-11 Thread Alex Brollo

I don't fully understand your statement "Right now, I'm going to convert
them to DjVu and upload them, *without any text information*.". Don't you
feel any need for an excellent OCR layer when proofreading it on
Wikisource? Are you fully satisfied with MediaWiki OCR of images?
Unfortunately, I find MediaWiki OCR very uncomfortable when dealing with
non-English books, and I don't know how to get XML data mapping words
onto the page image. For sure, if MediaWiki 1. could serve the best
possible OCR of images with no text layer, after automatic recognition of
the text's languages, 2. would encourage uploading images at the best
possible quality, 3. could optionally serve hOCR or XML of the mapped text
layer, there would be no need for a good third-party OCR layer.

On Fri, 12 Jul 2019 at 06:01, David Starner wrote:

> On Sat, Jul 6, 2019 at 3:24 PM Alex Brollo  wrote:
> >
> > Nevertheless, consider the file structure inside archive.org, which
> > collects images into zip files and text into _djvu.xml files, and this is
> > what makes its brilliant viewer possible.
> > The DjVu format really can be used as a compact images+xml container, but
> > it seems an obsolete file format, as the recent discontinuation of output
> > by archive.org suggests. PDF is IMHO too complex and can't be considered
> > an open format.
>
> Let's look at one of the files I'm going to upload.
> https://archive.org/details/Weird_Tales_v02n02_1923-09 was originally
> uploaded as a zip file of JPEG files. If I could upload it as that, or
> as the zip of JP2 files, I would. Right now, I'm going to convert them
> to DjVu and upload them, without any text information. However,
> there's a lot of cases where we just have PDF files, and I don't want
> to force some of our more technically unskilled users to have to
> figure out file conversion, especially where, in the case of PDF
> files, there's no point; Wikimedia can convert it losslessly to any
> number of page image formats without much problem.
>
> >  Pdf is IMHO too complex and can't be considered an open format.
>
> It's got an ISO standard and royalty-free patent licensing. An open
> format doesn't have to be a simple or good one; it just has to have an
> agreed-upon standard without licensing problems.
>
> --
> Kie ekzistas vivo, ekzistas espero.
>

Re: [Wikisource-l] Does really wikisource need djvu/pdf files?

2019-07-06 Thread Alex Brollo

Nevertheless, consider the file structure inside archive.org, which collects
images into zip files and text into _djvu.xml files, and this is what makes
its brilliant viewer possible.
The DjVu format really can be used as a compact images+xml container, but it
seems an obsolete file format, as the recent discontinuation of output by
archive.org suggests. PDF is IMHO too complex and can't be considered an
open format.

Alex brollo

On Sat, 6 Jul 2019 at 10:51, David Starner wrote:

> From my perspective, a DjVu or PDF file is just an archive format for
> images. Any text that comes along with them is ancillary; if it's
> missing, we can always generate it from OCR. I could just as well use
> CBR/CBZ files, though they're not as reliable for having a sensible
> format. I want to avoid, as much as possible, dealing with a bunch of
> disconnected page images, because that maximizes the possibility for
> human error.
>
> --
> Kie ekzistas vivo, ekzistas espero.
>

[Wikisource-l] Does really wikisource need djvu/pdf files?

2019-07-06 Thread Alex Brollo

I like the DjVu file structure and the DjVuLibre routines, and I have studied
them as deeply as I can; for Wikisource needs I appreciate PDF files too, but
I like them less because of their complexity. The proofreading procedure is
presently based on DjVu or PDF files, but I see that another approach could
be used, relying only on simpler routines.

Proofreading procedure needs two inputs:
1. a set of good images of page scans;
2. a good mapped file of text content matched with images.

For "mapped text" there are two alternatives, hOCR and XML; both can be
used to extract "unmapped raw text" when needed, at server level but also
at client level with jQuery. If the hOCR/XML of the page text could be
accessed quickly and simply from nsPage, I see interesting opportunities
(e.g. generalized highlighting of selected text on the nsPage image, both
in view and in edit mode; formatting suggestions from heuristic analysis of
word coordinates; reorganization of high-level text structures, for
instance to fix a wrong column layout).
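
As a rough illustration of extracting "unmapped raw text" from a mapped layer, here is a minimal Python sketch (Python rather than jQuery, and not code from any Wikisource gadget). It assumes tesseract-style hOCR, where each word is a `span` of class `ocrx_word` whose `title` carries a `bbox x0 y0 x1 y1` entry; the names `HocrWords`, `hocr_words` and `raw_text` are hypothetical.

```python
import re
from html.parser import HTMLParser

# Assumption: word boxes appear in the title attribute as "bbox x0 y0 x1 y1".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWords(HTMLParser):
    """Collect (text, bbox) pairs for every span of class ocrx_word."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None
        self._chunks = None  # None means "not inside a word span"
        self._depth = 0      # nesting depth of tags opened inside the word

    def handle_starttag(self, tag, attrs):
        if self._chunks is not None:
            self._depth += 1  # e.g. <em> or <strong> nested in a word
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            match = BBOX_RE.search(attrs.get("title") or "")
            self._bbox = tuple(int(v) for v in match.groups()) if match else None
            self._chunks = []
            self._depth = 0

    def handle_endtag(self, tag):
        if self._chunks is None:
            return
        if self._depth > 0:
            self._depth -= 1
            return
        # Closing tag of the word span itself: store the word.
        self.words.append(("".join(self._chunks).strip(), self._bbox))
        self._chunks = None

    def handle_data(self, data):
        if self._chunks is not None:
            self._chunks.append(data)


def hocr_words(hocr_html):
    """Return a list of (text, (x0, y0, x1, y1)) tuples in document order."""
    parser = HocrWords()
    parser.feed(hocr_html)
    parser.close()
    return parser.words


def raw_text(words):
    """Join the word texts into unmapped plain text."""
    return " ".join(text for text, _ in words if text)
```

The boxes are kept next to the text so that the same list could, in principle, drive highlighting on the page image or layout heuristics.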

Alex brollo (it.wikisource)

Re: [Wikisource-l] Recto/verso numbering through

2018-06-01 Thread Alex Brollo

An excellent step in the continuous enhancement of the ProofreadPage
extension, which is IMHO the core of Wikisource.

Alex

2018-05-31 22:12 GMT+02:00 Thomas Pellissier Tanon <
tho...@pellissier-tanon.fr>:

> Dear all,
>
> Candalua has implemented a new feature for the Wikitext  tag. It
> is now possible to display automated recto/verso numbering.
>
> For example, if an index page uses "" the displayed
> page numbers are going to be "1", "2r", "2v", "3r", "3v".
>
> 3 new display options are added: folio, folioroman, foliohighroman. These
> will assign a number to each leaf instead of each page, with the two sides
> labeled r for recto ("front") and v for verso ("back"). The first page in a
> range is assumed to be a recto.
>
> Examples:
> folio          -> 1r 1v 2r 2v
> folioroman     -> ir iv iir iiv
> foliohighroman -> Ir Iv IIr IIv
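
The numbering rule described above (one number per leaf, "r" then "v", first page assumed to be a recto) can be sketched in a few lines of Python. This only illustrates the leaf/side logic and is not the ProofreadPage code; the function names `to_roman` and `folio_labels` and the `start` parameter are made up for the example.

```python
ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n):
    """Uppercase Roman numeral for a positive integer."""
    out = []
    for value, symbol in ROMAN:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def folio_labels(page_count, style="folio", start=1):
    """Label consecutive pages leaf by leaf: recto first, then verso."""
    labels = []
    for i in range(page_count):
        leaf = start + i // 2               # two pages share one leaf number
        side = "r" if i % 2 == 0 else "v"   # first page in the range is a recto
        if style == "folio":
            number = str(leaf)
        elif style == "folioroman":
            number = to_roman(leaf).lower()
        elif style == "foliohighroman":
            number = to_roman(leaf)
        else:
            raise ValueError(f"unknown style: {style}")
        labels.append(number + side)
    return labels


print(" ".join(folio_labels(4, "folioroman")))  # ir iv iir iiv
```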
>
>
> The MediaWiki:Proofreadpage_pagenum_template special template now takes a
> new parameter "formatted" with a better formatted page number (e.g. with
> the "r" of recto and the "v" of verso in superscript).
>
>
> Many thanks to Candalua for implementing this feature and getting it
> merged into ProofreadPage.
>
>
> Cheers,
>
> Thomas
>

[Wikisource-l] Bold try running into it.source

2018-04-11 Thread Alex Brollo

We are testing a trick that is useful for IA items where there's no DjVu
file but there is a _djvu.xml file.

The _djvu.xml file is split into pages and uploaded "as is" as page text.
A jQuery script can parse the XML and convert it into excellent plain text.
The same trick works in both DjVu-based and PDF-based Index pages. Another
advantage is that the mapped text is saved as the first version of the page
content, so it can be recovered and used with no external tool.

While parsing the XML, the same script can also fix some severe FineReader
mistakes caused by wrong analysis of the text layout (wrong splitting of
text into columns/regions), using the word coordinates.
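
To make the idea concrete, here is a small Python sketch of the XML-to-plain-text step. The script described above is jQuery; this is not that script, only an illustration. It assumes the usual DjVu XML nesting, with one `OBJECT` element per page containing `PARAGRAPH`, `LINE` and `WORD` elements; the function name `page_texts` is hypothetical, and no layout repair is attempted.

```python
import xml.etree.ElementTree as ET


def page_texts(djvu_xml):
    """Return one plain-text string per page (OBJECT element).

    Words are joined with spaces, lines with newlines, and paragraphs
    with a blank line.
    """
    root = ET.fromstring(djvu_xml)
    pages = []
    for obj in root.iter("OBJECT"):
        paragraphs = []
        for para in obj.iter("PARAGRAPH"):
            lines = []
            for line in para.iter("LINE"):
                words = [(w.text or "").strip() for w in line.iter("WORD")]
                lines.append(" ".join(w for w in words if w))
            paragraphs.append("\n".join(lines))
        pages.append("\n\n".join(paragraphs))
    return pages
```

Each returned string could then be saved as the text of the matching page, while the untouched XML keeps the word coordinates for later use.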

Alex brollo

Re: [Wikisource-l] Djvu format fate

2018-04-06 Thread Alex Brollo

There's an interesting suggestion on the web about the difference between
DjVu and PDF. The question was: how can I get hOCR from the hidden layer of a
PDF file? The reply: convert the PDF to DjVu, then all will be simple (more or
less). This comes from the fact that anything inside a DjVu file is open and
"simply" accessible, whereas anything inside a PDF is difficult and obscure.
DjVu is wiki, PDF isn't. I don't know of any other open format that
implements searchable hidden text underlying the page image.

But as a first step, the incredible DjVu opportunities should be *actively
explored and used*! If you use a car simply as a hen-house, never driving
it, then in your opinion any standard, effective hen-house is just as good,
or better.

Alex





2018-04-06 15:45 GMT+02:00 Federico Leva (Nemo) :

> Peter Meyer, 06/04/2018 14:59:
>
>> Could we distill these issues online on a wiki page somewhere?   Or is it
>> already done?
>> (1) what are the significant differences between pdf and djvu (or some
>> new version of djvu that we could imagine coming up with)
>>
>
> I agree this is important to outline. For instance, is there some
> Wikisource where PDF files are actively discouraged in favour of DjVu, and
> for what reasons?
>
> Which DjVu features we dream of using within 5 years, which PDF doesn't
> provide? Do we want a system where libraries can feed us with DjVu files,
> the proofread text gets ingested back to the DjVu file and libraries can
> reuse it? Do we want to use some of the low level features of the text
> layer to widely deploy some dark magic, such as the captcha-based
> proofreading we talked about many times or some other interaction between
> MediaWiki and the scans? What "market" is there for such features?
>
> DjVu became our favourite format back at the time when the upload size
> limit was around 10 MiB, if I remember correctly, and compression was the
> most important factor. I often find myself explaining why it's such a
> useful format, but in the end if someone asks me "so, is it fine to just
> upload a PDF at Wikisource?" I have a hard time giving an answer other than
> "sure, don't worry, it will be the same".
>
> Federico
>
>

[Wikisource-l] Djvu format fate

2018-04-06 Thread Alex Brollo

As you know, DjVu is an excellent and open format, but its fate is
uncertain, since it is overwhelmed by PDF, which is surely excellent but
closed and very difficult to manage.

Even if only a small part of DjVu's features is used by MediaWiki, DjVu is a
necessary tool for Wikisource work.

Unfortunately, not enough work is being done on DjVu, and I see that ABBYY
recently discontinued support for DjVu as an output format in its OCR
engines. This is probably why Internet Archive stopped producing DjVu files.

Is it possible to encourage MediaWiki to devote sufficient energy to saving
the DjVu format from its fate?

Alex brollo

[Wikisource-l] February 15 bug

2018-02-16 Thread Alex Brollo

A deep bug almost completely blocked nsPage editing for many hours.

Please ask the MediaWiki staff (I don't know which channel is most effective)
to:

1. carefully test any new deployment of MediaWiki software on Wikisource
projects, with special attention to the ProofreadPage extension;

2. communicate news about such severe and frustrating bugs quickly, as soon
as they pop up.

Alex

[Wikisource-l] Djvu OCR layer

2018-01-23 Thread Alex Brollo

Only a small part of the DjVu OCR/text content is currently used; I think
that we can do more:
1. the XML and dsed (Lisp-like) representations have pros and cons that
should be carefully considered;
2. the DjVu text layer can host an unlimited amount of metadata and free text
content, independent of the mapped OCR;
3. hOCR (from tesseract) can be translated into dsed; a conversion script
would be very useful to inject tesseract output into the DjVu OCR layer;
4. IA shares a terrible gzipped XML file, _abbyy.gz, where every possible
detail about OCR recognition can be found; a tool converting it to dsed
(perhaps also recovering many formatting details!) would be very useful.


I'm playing with all of these issues, and I'd like to know whether any other
Wikisource contributor is interested.

Re: [Wikisource-l] wavy line

2018-01-05 Thread Alex Brollo

Let's also try with backgrounds. I'll test using my Tamil Wikisource user
common.css; then you can export it into MediaWiki:Common.css if I get
a decent result.

Alex

2018-01-05 17:42 GMT+01:00 balaji :

> Hello everyone,
> Thanks to everyone for taking the effort to reply. I
> tried various solutions.
>
> Repeating a single character wouldn't be desirable, because the effect
> wouldn't be the same for different screen sizes.
>
> I tried the custom rule template. But with the parameters I could set, I
> could only produce a wavy line of short length. Is there a way to create
> the wave effect for the full length, like a line?
>
> I tried a solution by Alex. It seems to work okay in Firefox, but doesn't
> look good in Chrome.
>
> As people pointed out, the shape of the line doesn't convey any meaning.
> It's true. I just wanted to see if there is an easy way to format it to
> look like the original. If that's not possible, I'm going to use a simple
> straight line.
>
>
> Thanks
>
> J. Balaji
>
> (User:Balajijagadesh <https://meta.wikimedia.org/wiki/User:Balajijagadesh>
> )
>
>
>
> On Fri, Jan 5, 2018 at 5:21 PM, Andy Mabbett 
> wrote:
>
>> On 4 January 2018 at 23:09, Alex Brollo  wrote:
>> > 2018-01-04 23:16 GMT+01:00 Andy Mabbett :
>>
>> >> Why not just use a straight horizontal rule? It's styling, not content,
>> >> and conveys no significant meaning.
>>
>> > I respectfully disagree - works are content, editions are content plus
>> > styling IMHO.
>>
>> In that case, what does the wavy line /mean/? How would you convey
>> that meaning to a non-visual user?
>>
>> --
>> Andy Mabbett
>> @pigsonthewing
>> http://pigsonthewing.org.uk
>>

Re: [Wikisource-l] wavy line

2018-01-04 Thread Alex Brollo

2018-01-04 23:16 GMT+01:00 Andy Mabbett :

>
>
> Why not just use a straight horizontal rule? It's styling, not content,
> and conveys no significant meaning.


I respectfully disagree - *works* are content, *editions* are content plus
styling IMHO.

Re: [Wikisource-l] wavy line

2018-01-04 Thread Alex Brollo

I think that repeating a character isn't a clean solution - it would be
difficult to get an "elastic" covering of 100% of the page width. I'd use a
background instead.

Alex

2018-01-04 22:47 GMT+01:00 mathieu stumpf guntz <
psychosl...@culture-libre.org>:

> You might use a simple wavy dash: 〰, with as many as you want.
> Creating a template like {{wavy-dash|repeat=80}} shouldn't be too difficult.
> Do you need help with that, if this solution is fine for you?
>
> On 04/01/2018 at 17:56, balaji wrote:
>
> Hello all,
>For a book I am proof reading there is a requirement to
> create wavy line. The link for the page is https://ta.wikisource.org/s/6du
>
>
> The wavy line is appearing in many pages.
>
> Is there any template to create this effect?
>
> Regards,
> J. Balaji.
> (User:Balajijagadesh)
>
>

[Wikisource-l] Unexpected result from a try to fix IA Upload failures

2017-12-22 Thread Alex Brollo

While trying to fix some IA Upload failures, an unexpected result
emerged: an easy opportunity to fix some common OCR errors in the DjVu text
layer.

In brief, the script xml2dsed.py
<https://it.wikisource.org/wiki/Progetto:Bot/Programmi_in_Python_per_i_bot/xml2dsed.py>
converts IA _djvu.xml files into "dsed" (Lisp-like) code, so that the text
layer can be loaded into the DjVu file in a much faster and more controllable
way using djvused.exe. While parsing the XML tree, at WORD level every word
of the text layer is exposed to the script environment as plain text; this
offers a unique opportunity to fix many scannos without any risk of messing
up the XML or the dsed code.
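
The following Python sketch illustrates the word-level idea; it is not the actual xml2dsed.py. It rests on several assumptions: that each `WORD` has a `coords` attribute ordered left,bottom,right,top with a top-left origin, that each `OBJECT` carries `width` and `height` attributes, and that djvused expects `xmin ymin xmax ymax` with a bottom-left origin. The scanno table and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical scanno table: whole-word misreadings and their corrections.
SCANNOS = {"tbe": "the", "wbich": "which"}


def fix_word(text):
    """Correct a single word; the markup around it is never touched."""
    return SCANNOS.get(text, text)


def dsed_string(text):
    """Quote a word for dsed, escaping backslashes and double quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def page_to_dsed(obj):
    """Convert one OBJECT (page) element into a dsed page expression."""
    width, height = int(obj.get("width")), int(obj.get("height"))
    out = [f"(page 0 0 {width} {height}"]
    for line in obj.iter("LINE"):
        words = []
        for w in line.iter("WORD"):
            coords = w.get("coords")
            if not coords:
                continue
            # Assumed order: left, bottom, right, top (y grows downwards).
            left, bottom, right, top = (int(v) for v in coords.split(",")[:4])
            # Flip the y axis: dsed measures y from the bottom of the page.
            words.append((left, height - bottom, right, height - top,
                          fix_word((w.text or "").strip())))
        if not words:
            continue
        x0 = min(w[0] for w in words)
        y0 = min(w[1] for w in words)
        x1 = max(w[2] for w in words)
        y1 = max(w[3] for w in words)
        out.append(f" (line {x0} {y0} {x1} {y1}")
        for a, b, c, d, text in words:
            out.append(f"  (word {a} {b} {c} {d} {dsed_string(text)})")
        out[-1] += ")"  # close the line expression
    out[-1] += ")"      # close the page expression
    return "\n".join(out)


def xml_to_dsed_pages(djvu_xml):
    """Return one dsed string per page of a _djvu.xml document."""
    root = ET.fromstring(djvu_xml)
    return [page_to_dsed(obj) for obj in root.iter("OBJECT")]
```

Because the correction is applied to the word text only, before it is quoted, a bad replacement can at worst produce a wrong word, never unbalanced parentheses or broken XML. Under the assumptions above, each page's output could be saved to a file and loaded with the djvused `set-txt` command.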

Here is the first DjVu file
<https://commons.wikimedia.org/wiki/File:Trattati_del_Cinquecento_sulla_donna,_1913_%E2%80%93_BEIC_1949816.djvu>
where this has been successfully tested.

Alex brollo


[Wikisource-l] Notice for IA Upload users

2017-12-14 Thread Alex Brollo

In the IA Upload queue I see some "probably failed" non-Italian items, but I
can't find the email or user name of the uploader. An it.source script
could probably merge the OCR into the DjVu for items where IA Upload built an
image-only DjVu but failed to merge the text into the file. I'd like to try,
but I need a personal contact from interested users, just to coordinate the
upload of the fixed DjVu files to Commons.

Feel free to send me an email or leave a message on my it.source talk page.

Alex

Re: [Wikisource-l] IA Upload large files okay now

2017-11-20 Thread Alex Brollo

Great! :-)

I'm thinking about DjVu text layer manipulation; I feel that it could be
useful to have a good DjVu XML -> DjVu dsed transformation. I'll let you
know if there is any good news.

Alex

2017-11-20 7:33 GMT+01:00 Sam Wilson :

> The IA Upload tool now will work with files over 100 MB.
>
> Example of large file uploaded:
> https://commons.wikimedia.org/wiki/File:Once_a_Week,_Series_1,_Volume_X.djvu
> (278.92 MB, 736 pages).
>
> Sorry this took very long. There were a few steps to getting it working.
>
> Let me know if it fails for you!
>
> :)
>
> Thanks,
> Sam.
>

Re: [Wikisource-l] wikisource "work" pages or "multiple editions" pages

2017-11-02 Thread Alex Brollo

Consider also the possibility of using a "pseudo-namespace": while testing
this, it.source used the prefix "Opera:" for ns0 pages devoted to
"works", with few drawbacks (and the big advantage of making things clear).

Alex



2017-11-02 10:30 GMT+01:00 Sam Wilson :

> I must admit, I'm not a huge fan of multiple namespaces in wikis. They're
> mostly not necessary! :-) (Don't worry, I'm not suggesting getting rid of
> any either.)
>
> And certainly, from the point of view of integrating Wikidata and moving
> towards better metadata and searchability, I don't think we need all
> Wikisources to unify on any particular set of namespaces. I think any
> future metadata system must just work with all the different current
> set-ups (and I think it can, quite well).
>
> —Sam.
>
> On Thu, 2 Nov 2017, at 05:21 PM, Anika Born wrote:
>
> Billinghurst,
>
> That might work for me, with a Login.
>
> But does this also work for random readers, who don't have a login? Who
> don't know, that there are preferences (and especially what can be done
> with them?)
>
> But more important: please don't (just) focus in namespaces for every
> Wikisource-Project. You might lose at least de.WS. I can't see changing
> something, that works fine for this project...  Especially not to change a
> system, that is quite different, from what they have now. That is all I am
> asking for. de.ws is working with templates to differ, not with
> namespaces.
>
> for instance Johann Wolfgang von Goethe:
> 
>
> Goethe was an author, but there are also works about Goethe. In de.ws
> portal-page and author-page about Goethe are merged in one. There is no
> difference. Don't expect something else.
>
> Best, Anika
>
> 2017-11-02 9:07 GMT+01:00 billinghurst :
>
>
> Anika,
>
> That is matter long resolved in my opinion with the change in the default
> search namespaces that the communities made, and similarly with our
> redefining content namespaces. While main namespace will always take
> preference to the other nss in results, they show up pretty quickly where
> you have an intitle: match.
>
> At enWS I would say that we lost more searches to subpages, so with the
> ability to change your search preferences with subphrase matches, much of
> that is addressed (though it is not the default search configuration at
> this point).
>
> The completion suggester
> 
> is an algorithm for search suggestions with better typo correction and
> search relevance.
>
> Default (recommended)
> Corrects up to two typos. Resolves close redirects.
>
> Subphrase matching (recommended for longer page titles)
> Corrects up to two typos. Resolves close redirects. Matches subphrase in
> titles.
>
> Strict mode (advanced)
> No typo correction. No accent folding. Strict matching.
>
> Redirect mode (advanced)
> No typo correction. Resolves close redirects.
>
> Redirect mode with subphrase matching (advanced)
> No typo correction. Resolves close redirects. Matches subphrase in titles.
>
> Regards, Billinghurst
>
>
>
> -- Original Message --
> From: "Anika Born" 
> To: "discussion list for Wikisource, the free library" <
> wikisource-l@lists.wikimedia.org>
>
> Sent: 2/11/2017 6:37:29 PM
> Subject: Re: [Wikisource-l] wikisource "work" pages or "multiple editions"
> pages
>
> 2017-11-01 16:40 GMT+01:00 Nicolas VIGNERON :
>
>
>
>
> From afar, the Opera: pages on it.ws are very close to the pages with the
> template {{Éditions}} on fr.ws or the template {{Versions}} on en.ws (and
> similar system elsewhere).
>
> The main difference is having a separate namespace. A second major
> difference is that the templates on fr.ws and en.ws are very light while
> the {{Opera}} template took data from Wikidata (but that's an independent
> problem, it's possible to change the {{Éditions}} or {{Versions}} templates
> to do exactly the same thing without having a specific namespace).
>
> I'm almost convinced too, but in order to create a new namespace on a
> project you have to convince the local community. That's why I'm still
> playing the Devil's advocate role and want to learn about the drawbacks
> of this system.
>
>
> A reason why there are no different namespaces for work-, edition-,
> author-, list- and other portal pages in de.ws is the ws-search. When you
> are looking for "Goethe" in the (simple) search (as readers may do) on WS,
> you might get to
> * https://de.wikisource.org/wiki/Tafellied,_zu_Goethe%E2%80%99s_Geburtstage but not to
> * https://de.wikisource.org/wiki/Johann_Wolfgang_von_Goethe
> 
> with all the interesting stuff, if that page was in another namespace...
>
> So there was the decision to use templates (and categories) for these
> different kinds of pages:
> https://de.wikisource.org/wiki/Wikisource:Seiten_zu_Autoren,_Texten,

Re: [Wikisource-l] wikisource "work" pages or "multiple editions" pages

2017-11-02 Thread Alex Brollo

"Short works" (e.g. a sonnet) are a really hard issue.
Just as a reminder, two other issues:

1. IMHO translations are not works, the work being a unique "abstract" item
(Iliad).
2. Sometimes the author revises a work and produces a substantially
different "derived" work (e.g. Fermo e Lucia vs I Promessi sposi; the
different revisions of Orlando Furioso). IMHO each of these should be
considered a different work.

Alex



2017-11-02 9:07 GMT+01:00 billinghurst :

> Anika,
>
> That is matter long resolved in my opinion with the change in the default
> search namespaces that the communities made, and similarly with our
> redefining content namespaces. While main namespace will always take
> preference to the other nss in results, they show up pretty quickly where
> you have an intitle: match.
>
> At enWS I would say that we lost more searches to subpages, so with the
> ability to change your search preferences with subphrase matches, much of
> that is addressed (though it is not the default search configuration at
> this point).
>
> The completion suggester
> 
> is an algorithm for search suggestions with better typo correction and
> search relevance.
> Default (recommended)
> Corrects up to two typos. Resolves close redirects.
> Subphrase matching (recommended for longer page titles)
> Corrects up to two typos. Resolves close redirects. Matches subphrase in
> titles.
> Strict mode (advanced)
> No typo correction. No accent folding. Strict matching.
> Redirect mode (advanced)
> No typo correction. Resolves close redirects.
> Redirect mode with subphrase matching (advanced)
> No typo correction. Resolves close redirects. Matches subphrase in titles.
>
> Regards, Billinghurst
>
>
> -- Original Message --
> From: "Anika Born" 
> To: "discussion list for Wikisource, the free library" <
> wikisource-l@lists.wikimedia.org>
> Sent: 2/11/2017 6:37:29 PM
> Subject: Re: [Wikisource-l] wikisource "work" pages or "multiple editions"
> pages
>
> 2017-11-01 16:40 GMT+01:00 Nicolas VIGNERON :
>
>>
>> From afar, the Opera: pages on it.ws are very close to the pages with
>> the template {{Éditions}} on fr.ws or the template {{Versions}} on en.ws
>> (and similar system elsewhere).
>>
>> The main difference is having a separate namespace. A second major
>> difference is that the templates on fr.ws and en.ws are very light while
>> the {{Opera}} template took data from Wikidata (but that's an independent
>> problem, it's possible to change the {{Éditions}} or {{Versions}} templates
>> to do exactly the same thing without having a specific namespace).
>>
>> I'm almost convinced too, but in order to create a new namespace on a
>> project you have to convince the local community. That's why I'm still
>> playing the Devil's advocate role and want to learn about the drawbacks
>> of this system.
>>
>
> A reason why there are no different namespaces for work-, edition-,
> author-, list- and other portal pages in de.ws is the ws-search. When you
> are looking for "Goethe" in the (simple) search (as readers may do) on WS,
> you might get to
> * https://de.wikisource.org/wiki/Tafellied,_zu_Goethe%E2%80%99s_Geburtstage but not to
> * https://de.wikisource.org/wiki/Johann_Wolfgang_von_Goethe
> 
> with all the interesting stuff, if that page was in another namespace...
>
> So there was the decision to use templates (and categories) for these
> different kinds of pages:
> https://de.wikisource.org/wiki/Wikisource:Seiten_zu_Autoren,_Texten,_Themen,_Listen
>
>
> 
> I think German Wikisource Community won't give this up and switch to using
> multiple namespaces (besides Wikisource: and Page:namespace).
>
> Best
> Anika
>
>

Re: [Wikisource-l] WikidataCon 2017

2017-10-23 Thread Alex Brollo

The work/edition issue is simpler for projects like
fr.source, bn.source and other projects with a very high proofread/total
ratio, since there's usually a unique Index page linked to a unique
Commons DjVu/PDF file, and very often a unique source of that image. Things
are much more confused with "naked" works. I suggest, for simplicity,
that for now all effort should be focused on proofread works/editions,
ignoring the case of naked works.

I don't know the de.source workflow in detail, the German language being
a hard obstacle for me. What a pity!

Alex

2017-10-23 14:04 GMT+02:00 Thomas Pellissier Tanon <
tho...@pellissier-tanon.fr>:

> Hi,
>
> I am also going to be at Wikidata Con.
>
> In the French Wikisource we started a Wikidata project.
> https://fr.wikisource.org/wiki/Aide:Wikidata  We plan to do a small
> hackathon soon to start uploading our book data to Wikidata (I already have
> a prototype of tools that extract data from Wikisource to upload them to
> Wikidata but it still needs some work to do the job well).
>
>
> > Still my biggest issues/hurdles for good data are
> >   • capture of information from WS to WD - it just is hard work, WEF
> >     tool is still not sufficiently aligned
>
> Yes, we need some specific tools.
>
> >   • the ever problematic inability to link WP book to WS edition
> >     through Wikidata
>
> What we could do is use the new Wikisource MediaWiki extension to add a
> piece of code to add links to Wikipedia from Wikisource and to the other
> Wikisources. We could start prototyping it using Lua modules.
>
> >   • that cannot capture information for Wikidata at archive.org,
> >     and relate that through to the file at Commons, and then the edition
> >     at Wikisource (or pick another starting point and interrelate)
>
> In Wikidata you can point to the Commons file and to IA [1]
>
> >   • the inability to create an edition from a book/work, the
> >     inability to create a work from an edition
>
> Yes, we should create a UI on top of Wikidata to do such tasks.
>
> Thomas
>
> [1] https://www.wikidata.org/wiki/Property:P724
>
> > On 23 Oct 2017 at 13:48, Gerard Meijssen wrote:
> >
> > Hoi,
> > A Wikipedia mantra is "be bold", and another is that things are a work
> > in progress. In my opinion, what we need is the name of a book, its
> > author and the fact that people can read it. All the other stuff, like
> > which "version" a particular book is, pales in comparison. We should not
> > let the quest for perfection be the enemy of the good.
> >
> > Also Archive.org and Open Library are two different entities. Both the
> > Open Library and the Internet Archive have their own identifiers for
> > authors and they are not necessarily linked. We are talking about books
> > from the Open Library and they are available as an e-book or a PDF.
> >
> > My problem is not with Open Library; my problem is that we do not know
> > what is available from Wikisource as a finished good ready for reading.
> > In the end what we advertise is the author and the book; versions are
> > secondary.
> > Thanks,
> > GerardM
> >
> > On 23 October 2017 at 12:36, billinghurst wrote:
> > Hi Nicolas,
> >
> > Still my biggest issues/hurdles for good data are
> >   • capture of information from WS to WD - it just is hard work, WEF
> >     tool is still not sufficiently aligned
> >   • the ever problematic inability to link WP book to WS edition
> >     through Wikidata
> >   • that cannot capture information for Wikidata at archive.org,
> >     and relate that through to the file at Commons, and then the edition
> >     at Wikisource (or pick another starting point and interrelate)
> >   • the inability to create an edition from a book/work, the
> >     inability to create a work from an edition
> > Maybe you can even ask what we need to improve to get bots to run
> > through and autocapture: is our metadata in headers not suitable? What
> > is it that is problematic?
> >
> > Thanks for asking.
> >
> > -- billinghurst (being so remote for the action )
> >
> >
> > -- Original Message --
> > From: "Nicolas VIGNERON" 
> > To: "discussion list for Wikisource, the free library" <wikisource-l@lists.wikimedia.org>
> > Sent: 23/10/2017 7:30:44 PM
> > Subject: [Wikisource-l] WikidataCon 2017
> >
> >> Hi all,
> >>
> >> For information, the WikidataCon is this weekend in Berlin. While
> >> there is no talk specifically about Wikisource, there are some
> >> presentations on related subjects (inventaire.io, WikiCite, German
> >> National Library, FRBR, and so on).
> >>
> >> The event is sold out, but you can follow some of the presentations
> >> remotely (the link will be added here:
> >> https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017/Program/Remote ).
> >>
> >> I'll be there and I'll be happy to talk about Wikisource; who else will
> >> be there?
> >>
> >> Cdlt, ~nicolas
> >

Re: [Wikisource-l] The very first result of IA _abbyy.gz parsing & bot uploading into nsPage

2017-10-16 Thread Alex Brollo

@Anika: happy to know that you like "Visualizzatore" and that you
discovered the search function, which is perhaps the most useful trick,
together with the preview of OCR for "red" pages; the latter makes it
possible to refine a book-specific shared regex set.
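
To make the idea of a book-specific regex set concrete, here is a minimal sketch in Python. The rules and the function name are illustrative assumptions, not the actual set used on it.source: an ordered list of (pattern, replacement) pairs is applied to the raw OCR text of a page, and the list is refined while previewing the result.

```python
import re

# Hypothetical rule set for one book; order matters, each rule sees the
# output of the previous one.
RULES = [
    (r"(\w)-\n(\w)", r"\1\2"),  # rejoin words hyphenated at a line end
    (r"[ \t]+\n", "\n"),        # drop trailing blanks before a newline
    (r" {2,}", " "),            # collapse runs of spaces
]


def apply_rules(text, rules=RULES):
    """Apply every (pattern, replacement) pair in order to the OCR text."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text
```

Keeping the rules as plain data means the same list can be shared by everyone working on the same book and extended whenever the OCR preview shows a new recurring error.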

Alex

2017-10-16 20:09 GMT+02:00 Anika Born :

> Like Aubrey: thank you very much!
>
> I shared this news at the Scriptorium of de.ws.
>
> I also used the opportunity to inform them about your "Visualizzatore".
> This is so cool (especially the search function).
>
> And because I had some time (and the best things come in threes) I invited
> them to your it.WikiCon in Trento
> (https://meta.wikimedia.org/wiki/ItWikiCon/2017/Proposte#Wikisource).
> Have fun there! My best wishes to the organizers. I co-organized it three
> times in a row for the all-German community.
>
> https://de.wikisource.org/wiki/Wikisource:Skriptorium#Italien:_17._bis_19._November_WikiCon_in_Trient
>
>
> Anika
>
> 2017-10-16 19:35 GMT+02:00 Andrea Zanni :
>
>> Thanks Alex!
>> I really hope this is a direction that other developers will follow:
>> being able to harness the full potential of structured data from OCR
>> software is absolutely crucial for Wikisource:
>> we could actually automate *a lot* of the formatting work now done by
>> volunteers, and their time could still be spent formatting, proofreading
>> and validating, but with much more power than before.
>> IMO, it changes a lot if a book is formatted ~50% by a machine: we could
>> do many more books in less time.
>> Go Alex!
>>
>> Aubrey
>>
>> On Mon, Oct 16, 2017 at 5:42 PM, Asaf Bartov 
>> wrote:
>>
>>> That's really promising!
>>>
>>> Thank you for sharing this.
>>>
>>>A.
>>>
>>> On Oct 17, 2017 00:11, "Alex Brollo"  wrote:
>>>
>>>> Here:
>>>> Pagina:D'Ayala_-_Dizionario_militare_francese_italiano.djvu/46
>>>> <https://it.wikisource.org/wiki/Pagina:D%27Ayala_-_Dizionario_militare_francese_italiano.djvu/46>
>>>> and on the pages immediately before and after it, both the text and
>>>> some formatting come from the Internet Archive file
>>>> bub_gb_lvzoCyRdzsoC_abbyy.gz
>>>> <https://archive.org/download/bub_gb_lvzoCyRdzsoC/bub_gb_lvzoCyRdzsoC_abbyy.gz>
>>>> (on the preceding pages only some templates have been added and a
>>>> little regex manipulation has been done).
>>>>
>>>> Internet Archive _abbyy.gz files are gzipped, enormous XML files where
>>>> every detail of the FineReader OCR output is exported. Even if they are
>>>> enormous and terribly complex, they can be parsed and any detail can be
>>>> used (a little bit painfully...). So far only bold, italic, small caps
>>>> and paragraphs have been explored and translated into wiki code by
>>>> some fairly simple Python code.
>>>>
>>>> Alex
>>>>
>>>>
>>>>

Re: [Wikisource-l] The very first result of IA _abbyy.gz parsing & bot uploading into nsPage

2017-10-16 Thread Alex Brollo

Thanks for the appreciation - please consider my attempts only as proof
that "it can be done". I'll share the test Python code I'm using here:

https://it.wikisource.org/wiki/Progetto:Bot/Programmi_in_Python_per_i_bot/abbyyXml.py

Alex

[Wikisource-l] The very first result of IA _abbyy.gz parsing & bot uploading into nsPage

2017-10-16 Thread Alex Brollo

Here:
Pagina:D'Ayala_-_Dizionario_militare_francese_italiano.djvu/46
<https://it.wikisource.org/wiki/Pagina:D%27Ayala_-_Dizionario_militare_francese_italiano.djvu/46>
and on the pages immediately before and after it, both the text and some
formatting come from the Internet Archive file bub_gb_lvzoCyRdzsoC_abbyy.gz
<https://archive.org/download/bub_gb_lvzoCyRdzsoC/bub_gb_lvzoCyRdzsoC_abbyy.gz>
(on the preceding pages only some templates have been added and a little
regex manipulation has been done).

Internet Archive _abbyy.gz files are gzipped, enormous XML files where
every detail of the FineReader OCR output is exported. Even if they are
enormous and terribly complex, they can be parsed and any detail can be
used (a little bit painfully...). So far only bold, italic, small caps and
paragraphs have been explored and translated into wiki code by some fairly
simple Python code.
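
As an illustration of the approach described above (not the actual bot code), here is a minimal Python sketch that streams a gzipped ABBYY XML file and turns formatting runs into wiki markup. It assumes the usual FineReader layout of page > block > text > par > line > formatting > charParams, with bold, italic and smallcaps attributes on the formatting element; the function names and the {{Sc}} template are placeholders chosen for the example.

```python
import gzip
import io
import xml.etree.ElementTree as ET

TRUE_VALUES = ("1", "true")  # assumption: flags are written as "1" or "true"


def local(tag):
    """Strip the XML namespace: '{ns}par' -> 'par'."""
    return tag.rsplit("}", 1)[-1]


def wrap(text, attrib):
    """Wrap one run of text in wiki markup according to its flags."""
    if not text.strip():
        return text
    if attrib.get("smallcaps") in TRUE_VALUES:
        text = "{{Sc|" + text + "}}"  # placeholder template name
    if attrib.get("bold") in TRUE_VALUES:
        text = "'''" + text + "'''"
    if attrib.get("italic") in TRUE_VALUES:
        text = "''" + text + "''"
    return text


def pages_to_wikitext(stream):
    """Yield one wikitext string per <page>, reading the XML incrementally."""
    for _, page in ET.iterparse(stream, events=("end",)):
        if local(page.tag) != "page":
            continue
        paragraphs = []
        for par in page.iter():
            if local(par.tag) != "par":
                continue
            lines = []
            for line in par:
                if local(line.tag) != "line":
                    continue
                runs = []
                for fmt in line:
                    if local(fmt.tag) != "formatting":
                        continue
                    chars = "".join(
                        c.text or "" for c in fmt if local(c.tag) == "charParams"
                    )
                    runs.append(wrap(chars, fmt.attrib))
                lines.append("".join(runs))
            paragraphs.append(" ".join(lines))
        yield "\n\n".join(p for p in paragraphs if p)
        page.clear()  # free the memory used by this page
```

On a real file the stream would come from gzip.open("bub_gb_lvzoCyRdzsoC_abbyy.gz", "rb"). Reading with iterparse and clearing each page matters because the uncompressed XML is far too large to load comfortably as one tree.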

Alex

[Wikisource-l] An it.source gadget to manage diacritics

2017-08-01 Thread Alex Brollo

Just to let you know, some it.source contributors are using a handy
gadget to manage diacritics - it can delete, replace or add any of a fairly
large list of diacritical marks on any character with a single click.

It uses the .normalize() string method, decomposing and recomposing (when
possible) Unicode characters, which makes it possible to manage diacritics
independently of the base ASCII character.
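
The gadget itself is JavaScript; the same decompose/edit/recompose idea can be sketched in Python with unicodedata.normalize, which plays the role of the .normalize() string method. The function names below are illustrative assumptions, not taken from the gadget.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent
GRAVE = "\u0300"  # combining grave accent


def strip_diacritics(text):
    """Decompose (NFD), drop all combining marks, recompose (NFC)."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


def add_diacritic(char, mark):
    """Append a combining mark and recompose when a precomposed form exists."""
    return unicodedata.normalize("NFC", char + mark)


def replace_diacritic(char, old, new):
    """Swap one combining mark for another on the decomposed character."""
    decomposed = unicodedata.normalize("NFD", char)
    return unicodedata.normalize("NFC", decomposed.replace(old, new))
```

For example, replace_diacritic applied to "è" with GRAVE and ACUTE gives "é". When no precomposed character exists, NFC simply leaves the base letter followed by the combining mark.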

Perhaps this gadget is "reinventing the wheel"? Anyway, the code is
here: https://it.wikisource.org/wiki/MediaWiki:Gadget-pulsanti-diacritici.js

Alex brollo

Re: [Wikisource-l] content modef of [[MediaWiki:Proofreadpage index data config]]

2017-07-14 Thread Alex Brollo

Done in it.ws too, thanks

Alex

2017-07-15 0:40 GMT+02:00 Bodhisattwa Mandal :

> Done in bn.ws
> On Jul 15, 2017 2:48 AM, "ankry.wiki"  wrote:
>
>> This page contains definition of index page fields.
>>
>> Ankry
>>
>>
>> On 2017-07-14 22:58:15, user Nicolas VIGNERON <
>> vigneron.nico...@gmail.com> wrote:
>>
>> Hi,
>>
>> This mail is quite technical, could we have some more explanations?
>>
>> What is [[MediaWiki:Proofreadpage index data config]] and what does it do?
>>
>> More specifically, I see that br.ws is not listed, but maybe it could
>> benefit from it too (and as I'm the only active admin there, and 1/5th of
>> the community, no one other than me would do it; so may I do it?).
>>
>> Cdlt, ~nicolas
>>
>> 2017-07-14 22:51 GMT+02:00 ankry.wiki :
>>
>> Dear wikisource administrators!
>>
>> I noticed that default content model (wikitext) of index configuration
>> page:
>>   [[MediaWiki:Proofreadpage index data config]]
>> is wrong in almost all wikisources. As it contains index definition in
>> JSON format, the content model should be JSON to avoid potential
>> presentation problems (eg. in case when somebody wishes to add wikicode
>> examples or HTML tags here).
>>
>> Yesterday, I discussed this with Tpt on IRC, and he agreed with me that
>> the content model "wikitext" is wrong here and said that it is difficult to
>> change it from the ProofreadPage extension. It should be done manually,
>> when ProofreadPage is configured.
>>
>> The change only affects HTML presentation of this particular page and is
>> written in database. Page code accessible by API or visible for the
>> ProofreadPage extension remains unchanged. So the change is safe and also
>> fully reversible (if somebody wishes so).
>>
>> The content model has already been changed in en (by Yann), fr (by Tpt),
>> pl & mul (by me), hu (by Tacsipacsi). Nothing needs to be changed on
>> wikisources that do not use this page (eg. ar, de, ko, sv).
>>
>> INSTRUCTION (English interface assumed)
>> - open [[MediaWiki:Proofreadpage index data config]] page
>> - choose "Page information" from the left menu
>> - find the "Page content model" row in the "Basic information" table
>> - if "wikitext" is displayed as current model, click "change"
>> - choose "JSON" as "New content model"
>> - click the "Change" button to save the change.
>>
>> You need to be admin to change content model.
>>
>> The change may also be performed by a global admin or steward. But they
>> need to be assured that no bot/tool uses the HTML code of this page directly.
>> (A bot should use the API if it really needs this page, but if you know of
>> any bot accessing the HTML code of this page, let us know.)
>>
>> The alphabetic list of wikisources that still need this change:
>> be, bn, ca, da, es, gu, hr, hy, it, is, mk, nl, no, or, pa, pt, ru, ta,
>> uk, vec, zh
>>
>> Ankry
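
For admins who prefer to check before changing anything, here is a minimal Python sketch of how the current content model could be read through the MediaWiki API (action=query with prop=info, which reports a contentmodel field). The function names and the {lang}.wikisource.org host pattern are assumptions made for the example; the multilingual wiki in particular would need a different host.

```python
import json
from urllib.parse import urlencode

PAGE = "MediaWiki:Proofreadpage index data config"


def info_url(lang):
    """Build the API URL that returns basic page information as JSON."""
    params = {
        "action": "query",
        "prop": "info",
        "titles": PAGE,
        "format": "json",
        "formatversion": "2",
    }
    return f"https://{lang}.wikisource.org/w/api.php?" + urlencode(params)


def needs_change(api_response):
    """True if the page exists and still uses the wikitext content model."""
    page = api_response["query"]["pages"][0]
    if page.get("missing"):
        return False  # wiki does not use this page: nothing to change
    return page.get("contentmodel") == "wikitext"
```

The JSON returned by fetching info_url("it") can be decoded with json.loads and passed to needs_change. The check only reads page information, so it does not require admin rights.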
>>

Re: [Wikisource-l] A draft nsPage viewer

2017-07-05 Thread Alex Brollo

Just a quick update: the viewer now also shows the OCR of "red pages" and
implements the usual Wikimedia search, highlighting target words both in
existing pages and in those showing OCR text. It has also been exported to
vec.source, just to test how painful it is to run it in a different project
(not very painful... but vec.source isn't very different from it.source).

Alex

2017-06-14 11:58 GMT+02:00 Alex Brollo :

> Great! Obviously vis.js is inspired by the IA viewer, even if the page
> content is Wikisource nsPage HTML. I didn't know anything about
> https://phabricator.wikimedia.org/T154100 .
>
> Alex
>
> 2017-06-14 11:09 GMT+02:00 Pierre-Yves Beaudouin <
> pierre.beaudo...@gmail.com>:
>
>> FYI, the designer of the book reader on Internet Archive offered her help
>>
>> https://phabricator.wikimedia.org/T154100
>>
>> Pyb
>>
>> 2017-06-14 10:55 GMT+02:00 Alex Brollo :
>>
>>> Thanks for your interest! :-)
>>> Also try this link to see what happens when you browse a "red index",
>>> with no pages created yet:
>>> https://it.wikisource.org/wiki/Indice:Vico_-_La_scienza_nuova,_2,_1913.djvu?vis=true
>>>
>>> Alex
>>>
>>>
>>>
>>> 2017-06-14 10:38 GMT+02:00 Thomas Pellissier Tanon <
>>> tho...@pellissier-tanon.fr>:
>>>
>>>> What would be great is to implement a "production-ready" version of it
>>>> into ProofreadPage with translation, MediaWiki standard UI components,
>>>> responsive layout... If someone is interested in working on it I could help
>>>> (but I won't commit in doing it myself).
>>>>
>>>> Cheers,
>>>>
>>>> Thomas
>>>>
>>>> > On 14 June 2017 at 10:22, Gerard Meijssen wrote:
>>>> >
>>>> > Hoi,
>>>> > Has it been considered to internationalise and localise the code at
>>>> > translatewiki.net ?
>>>> > Thanks,
>>>> >  GerardM
>>>> >
>>>> > On 14 June 2017 at 10:03, Alex Brollo  wrote:
>>>> > The code is at
>>>> > https://it.wikisource.org/wiki/MediaWiki:Gadget-vis.js (running) and at
>>>> > https://it.wikisource.org/wiki/MediaWiki:Gadget-visTest.js
>>>> > (development version), with some dependencies on other it.wikisource
>>>> > gadgets.
>>>> >
>>>> > I'm far from a good programmer, so I guess that you'll have to "catch
>>>> > the idea" and then review it deeply. Importing it into Bengali
>>>> > Wikisource will need special care for non-Arabic page numbers.
>>>> > Consider the gadget simply a running proof that "it can be done". :-(
>>>> >
>>>> > Alex
>>>> >
>>>> >
>>>> >
>>>> > 2017-06-14 9:26 GMT+02:00 Nicolas VIGNERON <vigneron.nico...@gmail.com>:
>>>> > Great!
>>>> >
>>>> > Same question as Bodhi, where can I translate it into French and
>>>> > Breton? ;)
>>>> >
>>>> > Cdlt ~nicolas
>>>> >
>>>> >
>>>> > 2017-06-14 8:25 GMT+02:00 Bodhisattwa Mandal <bodhisattwa.rg...@gmail.com>:
>>>> > Wow! This is awesome!!
>>>> >
>>>> > Can you please help localise it for Bengali Wikisource?
>>>> >
>>>> > Thanks
>>>> >
>>>> > On 14 June 2017 at 11:32, Alex Brollo  wrote:
>>>> > Try this link:
>>>> >
>>>> > https://it.wikisource.org/wiki/Indice:Collodi_-_Le_avventure_di_Pinocchio,_Bemporad,_1892.djvu?vis=true
>>>> >
>>>> > I hope, it will be inspiring for something better.
>>>> >
>>>> > Alex brollo,  itwikisource
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Bodhisattwa
>>>> >
>>>> >

Re: [Wikisource-l] Budget for Wikisource

2017-06-30 Thread Alex Brollo

Oops... I only *presume* that _djvu.xml is bugged; actually I only examined
the full-text file (derived, I think, from the _djvu.xml file). I'll take a
deeper look, examining the searchable PDF too.
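
For the misalignment reported further down in this thread (the right OCR shown on the following page), here is a minimal Python sketch of one way to measure the shift. It assumes that a few pages of trusted text are available for comparison, for instance already proofread pages; the function names and the use of difflib are choices made for the example, not part of the IA Upload tool.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Ratio in [0, 1] of how alike two page texts are."""
    return SequenceMatcher(None, a, b).ratio()


def best_offset(ocr_pages, reference_pages, max_shift=2):
    """Return the shift s that best matches ocr_pages[i + s] to reference_pages[i]."""
    scores = {}
    for shift in range(-max_shift, max_shift + 1):
        pairs = [
            (ocr_pages[i + shift], ref)
            for i, ref in enumerate(reference_pages)
            if 0 <= i + shift < len(ocr_pages)
        ]
        if pairs:
            scores[shift] = sum(similarity(a, b) for a, b in pairs) / len(pairs)
    return max(scores, key=scores.get)
```

A result of 0 means the text layer is aligned; 1 means each page's OCR sits one page later than it should, which is the symptom described in the report.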

Alex

2017-06-30 12:20 GMT+02:00 Alex Brollo :

> Take a look at this case:
> https://archive.org/details/GiacomoRacioppiLAgiografiaDiSanLaverioDel1162Images
>
> Here the OCR (as you can see from the _djvu.xml file) seems severely bugged,
> and obviously a djvu file built by the IA Upload tool can't be better than
> its source.
>
> Aubrey, please keep notifying me of any case of faulty djvu coming from IA
> or built from IA files by the IA Upload tool.
>
> Alex
>
> 2017-06-30 10:10 GMT+02:00 Andrea Zanni :
>
>> Unfortunately, only sometimes, and apparently it's not related to the
>> Google cover page (at least, I removed a page in one book and it doesn't
>> have the problem; another book is indeed misaligned, without removing the
>> cover).
>>
>> Look at this:
>> https://it.wikisource.org/wiki/Indice:Decio_Albini_-_La_spedizione_di_Sapri,_Tip._delle_Terme_diocleziane_di_G._Balbi,_Roma_1891.djvu
>>
>> On Fri, Jun 30, 2017 at 10:00 AM, Sam Wilson  wrote:
>>
>>> This is indeed a bug! I can't replicate it though. Does it happen for
>>> every book for you? Or only sometimes? Do you know what is different about
>>> the ones that fail? Is it related to removing (or not) the Google cover
>>> page?
>>>
>>> I can find time this weekend I think, to work on this.
>>>
>>>
>>> On Fri, 30 Jun 2017, at 03:23 PM, Andrea Zanni wrote:
>>>
>>> Hello everyone, before talking again about this let me say that I think
>>> we have a "major" bug in the IA-upload:
>>> sometimes, the OCR is not aligned between the pages, meaning you have
>>> the right OCR but it's shown for the following page...
>>> Aubrey
>>>
>>> On Thu, May 11, 2017 at 1:30 AM, Sam Wilson  wrote:
>>>
>>>
>>> This is very cool news. :)
>>>
>>> One possibly not-too-onerous feature would be to permit upload of
>>> file types other than DjVu (e.g. PDF). Or there's the whole topic of
>>> creating/finding Wikidata items for the books uploaded, and updating them
>>> with the IA identifier. That'd probably require the uploading user to
>>> specify a Wikidata ID though — which is what the {{book}} template on
>>> Commons should work from anyway, in my opinion (because it can't be done
>>> via a sitelink).
>>>
>>> I'm very happy to help with whatever I can!
>>>
>>> —sam
>>>
>>> On Wed, 10 May 2017, at 09:38 PM, Andrea Zanni wrote:
>>>
>>> Dear all,
>>> Wikimedia Italia put in its budget 3000€ for Wikisource-related work.
>>> When we discussed this, months ago, we thought about paying a developer
>>> for the DJVU issue of the IA-Upload tool, which has since been resolved
>>> by our beloved Sam Wilson.
>>>
>>> The tool is still not perfect (I often get errors), so maybe some
>>> development is still needed, but I'd ask you (especially technically
>>> skilled people like Tpt, Sam, Philippe, etc.) if you think there is some
>>> low-hanging fruit that could be reached with that kind of budget.
>>> Of course, we will be looking for developers, so if you want to propose
>>> yourself for something, please do! ;-)
>>>
>>> Aubrey
>>>

Re: [Wikisource-l] Budget for Wikisource

2017-06-30 Thread Alex Brollo

Take a look at this case:
https://archive.org/details/GiacomoRacioppiLAgiografiaDiSanLaverioDel1162Images

Here the OCR (as you can see from the _djvu.xml file) seems severely bugged,
and obviously a djvu file built by the IA Upload tool can't be better than
its source.

Aubrey, please keep notifying me of any case of faulty djvu coming from IA
or built from IA files by the IA Upload tool.

Alex


Re: [Wikisource-l] A draft nsPage viewer

2017-06-14 Thread Alex Brollo

Great! Obviously vis.js is inspired by the IA viewer, even if the page
content is Wikisource nsPage HTML. I didn't know anything about
https://phabricator.wikimedia.org/T154100 .

Alex


Re: [Wikisource-l] A draft nsPage viewer

2017-06-14 Thread Alex Brollo

Thanks for your interest! :-)
Try this link too, to see what happens when you browse a "red index",
with no pages created yet:
https://it.wikisource.org/wiki/Indice:Vico_-_La_scienza_nuova,_2,_1913.djvu?vis=true

Alex



2017-06-14 10:38 GMT+02:00 Thomas Pellissier Tanon <
tho...@pellissier-tanon.fr>:

> What would be great is to implement a "production-ready" version of it
> into ProofreadPage with translation, MediaWiki standard UI components,
> responsive layout... If someone is interested in working on it I could help
> (but I won't commit in doing it myself).
>
> Cheers,
>
> Thomas
>
> > Le 14 juin 2017 à 10:22, Gerard Meijssen  a
> écrit :
> >
> > Hoi,
> > Has it been considered to Internationalise and Localise the code at
> translatewiki.net ?
> > Thanks,
> >  GerardM
> >
> > On 14 June 2017 at 10:03, Alex Brollo  wrote:
> > The code is into https://it.wikisource.org/wiki/MediaWiki:Gadget-vis.js
> (running) and into
> https://it.wikisource.org/wiki/MediaWiki:Gadget-visTest.js (development
> version), with some dependencies to other it.wikisource gadgets.
> >
> > I'm far from  a good programmer, so I guess that you've to "catch the
> idea" then  deeply reviewing it. Bengali importation will need  special
> care for non-arabic page numbers. Consider the gadget simply a running
> proof that "it can be done". :-(
> >
> > Alex
> >
> >
> >
> > 2017-06-14 9:26 GMT+02:00 Nicolas VIGNERON :
> > Great !
> >
> > Same question as Bodhi, where can I translate it in French and Breton? ;)
> >
> > Cdlt ~nicolas
> >
> >
> > 2017-06-14 8:25 GMT+02:00 Bodhisattwa Mandal <
> bodhisattwa.rg...@gmail.com>:
> > Wow! This is awesome!!
> >
> > Can you please help it localise for Bengali Wikisource?
> >
> > Thanks
> >
> > On 14 June 2017 at 11:32, Alex Brollo  wrote:
> > Try this link:
> >
> > https://it.wikisource.org/wiki/Indice:Collodi_-_Le_avventure_di_Pinocchio,_Bemporad,_1892.djvu?vis=true
> >
> > I hope, it will be inspiring for something better.
> >
> > Alex brollo,  itwikisource
> >
> > --
> > Bodhisattwa
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] A draft nsPage viewer

2017-06-14 Thread Alex Brollo

The code is at https://it.wikisource.org/wiki/MediaWiki:Gadget-vis.js
(running version) and at
https://it.wikisource.org/wiki/MediaWiki:Gadget-visTest.js (development
version), with some dependencies on other it.wikisource gadgets.

I'm far from a good programmer, so I guess you'll have to "catch the idea"
and then review the code deeply. A Bengali port will need special care for
non-Arabic page numbers. Consider the gadget simply a running proof that
"it can be done". :-(

Alex
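
On the point about non-Arabic page numbers: the sketch below is only an illustration in Python, not code from the gadget (which is JavaScript). It assumes page labels are written either with ASCII digits or with Bengali digits; the function names are hypothetical.

```python
# Bengali digits are Unicode decimal digits, so Python's int() parses them directly.
BENGALI_DIGITS = "০১২৩৪৫৬৭৮৯"
TO_BENGALI = str.maketrans("0123456789", BENGALI_DIGITS)


def parse_page_number(label: str) -> int:
    """Parse a page label written with ASCII or Bengali digits."""
    return int(label.strip())


def format_page_number(number: int) -> str:
    """Render an integer page number with Bengali digits."""
    return str(number).translate(TO_BENGALI)


page = parse_page_number("১২৩")
next_label = format_page_number(page + 1)
```

Labels that are not plain decimal numbers (Roman numerals, for instance) would raise ValueError here and need separate handling.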



2017-06-14 9:26 GMT+02:00 Nicolas VIGNERON :

> Great !
>
> Same question as Bodhi, where can I translate it in French and Breton? ;)
>
> Cdlt ~nicolas
>
>
> 2017-06-14 8:25 GMT+02:00 Bodhisattwa Mandal 
> :
>
>> Wow! This is awesome!!
>>
>> Can you please help it localise for Bengali Wikisource?
>>
>> Thanks
>>
>> On 14 June 2017 at 11:32, Alex Brollo  wrote:
>>
>>> Try this link:
>>>
>>> https://it.wikisource.org/wiki/Indice:Collodi_-_Le_avventure_di_Pinocchio,_Bemporad,_1892.djvu?vis=true
>>>
>>> I hope, it will be inspiring for something better.
>>>
>>> Alex brollo,  itwikisource
>>>
>> --
>> Bodhisattwa
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

[Wikisource-l] A draft nsPage viewer

2017-06-13 Thread Alex Brollo

Try this link:

https://it.wikisource.org/wiki/Indice:Collodi_-_Le_avventure_di_Pinocchio,_Bemporad,_1892.djvu?vis=true

I hope it will inspire something better.

Alex brollo,  itwikisource
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] new right for Wikisource admins in Mediawiki/ProofreadPage

2017-06-10 Thread Alex Brollo

Thanks Ankry. No, I hadn't noticed this (odd!) change in proofreading
policy. I have passed the alert on to the it.source community. Luckily, even
though I'm an admin on it.source and therefore "at risk", I use our editing
interface eis (Edit In Sequence), which effectively prevents mistakes
caused by this change.

Alex brollo

2017-06-09 19:00 GMT+02:00 ankry.wiki :

> Unsure if you noticed that admins are now able to override Page status
> change limitations in ProofreadPage.
>
> There is a discussion in phabricator:
>   https://phabricator.wikimedia.org/T167491
> about this feature if you care.
>
> Ankry
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] Wikimedia Strategy, cycle 2

2017-06-03 Thread Alex Brollo

Personally, my feeling about the statements is that they are so bold as to
be discouraging and a little frustrating, somehow reminiscent of
neuro-linguistic programming or business coaching. Am I alone in feeling
this?

Alex

2017-06-02 16:13 GMT+02:00 Andrea Zanni :

> Dear all,
> cycle 2 for strategy is started.
>
> The Wikimedia movement strategy core team and working groups have
> completed reviewing the more than 1800 thematic statements we received from
> the first discussion. They have identified 5 themes that were consistent
> across all the conversations - each with their own set of sub-themes. These
> are not the final themes, just an initial working draft of the core
> concepts.
>
> This round of discussions will take place between now and June 12th. You
> can discuss as many as you like; we ask you to participate in the ones that
> are most (or least) important to you.
>
> I think that, as an international community, it's important for us to give
> our opinion on the themes proposed. They are:
>
> * Healthy, Inclusive Communities
> * The Augmented Age
> * A Truly Global Movement
> * The Most Respected Source of Knowledge
> * Engaging in the Knowledge Ecosystem
>
> I'm not sure if you want to discuss it here, on the mailing list, or you
> prefer to set up pages on your local wikisource, or both. You can also go
> and discuss directly on Meta¹.
>
> I just set up the pages on it.source, and will (desperately) try to revamp
> the discussion there.
> If you need a hand in understanding what to do, please ask: the process is
> complicated for everyone ;-)
>
> Aubrey
>
> ¹
> http://meta.wikimedia.org/wiki/Special:MyLanguage/Strategy/Wikimedia_movement/2017/Cycle_2/Healthy,_Inclusive_Communities
>
> http://meta.wikimedia.org/wiki/Special:MyLanguage/Strategy/Wikimedia_movement/2017/Cycle_2/The_Augmented_Age
>
> http://meta.wikimedia.org/wiki/Special:MyLanguage/Strategy/Wikimedia_movement/2017/Cycle_2/A_Truly_Global_Movement
>
> http://meta.wikimedia.org/wiki/Special:MyLanguage/Strategy/Wikimedia_movement/2017/Cycle_2/The_Most_Respected_Source_of_Knowledge
>
> http://meta.wikimedia.org/wiki/Special:MyLanguage/Strategy/Wikimedia_movement/2017/Cycle_2/Engaging_in_the_Knowledge_Ecosystem
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] Creating family tree chart while proof reading

2017-04-11 Thread Alex Brollo

Look at it as a table, with borders on some cells. It.wikisource uses
Template:Cs (which calls Module:Cs), which makes it easy to add specific
borders to individual cells. A little bit of colspan and rowspan, and
you'll get an "elastic" and exportable family tree chart.
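
To make the idea concrete, here is a rough Python sketch that builds such a table as plain HTML. It does not reproduce Template:Cs or Module:Cs, whose syntax is not shown in this thread; the helper names and the border codes ("t", "b", "l", "r") are assumptions made for illustration only.

```python
from html import escape

SIDES = {"t": "top", "b": "bottom", "l": "left", "r": "right"}


def cell(text="", borders="", colspan=1):
    """Return one <td>; `borders` is any combination of the letters t, b, l, r."""
    style = "".join(f"border-{SIDES[s]}:1px solid;" for s in borders)
    style += "text-align:center;"
    span = f' colspan="{colspan}"' if colspan > 1 else ""
    return f'<td{span} style="{style}">{escape(text)}</td>'


def table(rows):
    """Join rows of cells into a table that stretches with the page width."""
    body = "\n".join("<tr>" + "".join(row) + "</tr>" for row in rows)
    return f'<table style="border-collapse:collapse; width:100%;">\n{body}\n</table>'


family_tree = table([
    [cell("Parent", colspan=4)],
    # Connector row: the bracket between the children is drawn with cell borders only.
    [cell(), cell(borders="tl"), cell(borders="tr"), cell()],
    [cell("Child A", colspan=2), cell("Child B", colspan=2)],
])
```

Because the chart is an ordinary table with relative widths, it reflows when the page size changes, which is the property the exported epub needs.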

Alex

2017-04-11 15:03 GMT+02:00 balaji :

> Hi all,
>There is one particular book I am proof reading in Tamil
> language. There is a page which has a family tree chart. How to proof read
> that. The page I am talking about can be found here
> https://ta.wikisource.org/s/938 . In the current format if downloaded as
> epub of rtf etc., the structure is not maintained if page size is changed.
> How this can be proof read?
>
> Regards,
> J.Balaji.
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] Book-based pageviews statistics

2017-04-06 Thread Alex Brollo

I guess that if we could build an excellent AJAX-based page viewer
(something like the IA viewer, but displaying the wiki view of two nsPage
pages instead of images), as fast and comfortable as possible, book
views would increase a lot.

Alex

2017-04-05 7:23 GMT+02:00 Jayanta Nath :

> Hi Aubrey,
>
> Thanks, Updated to our Bengali Wikisource.
>
> Regards,
> Jayanta
>
> On Wed, Apr 5, 2017 at 10:44 AM, balaji  wrote:
>
>> Hi Aubrey,
>>Thanks for sharing the info. I have made the changes to
>> Tamil Wikisource. Keep sharing more info.
>> Thanks&Regards,
>> J.Balaji
>>
>> On Wed, Apr 5, 2017 at 2:45 AM, Andrea Zanni 
>> wrote:
>>
>>>
>>> {{#ifeq:{{NAMESPACENUMBER}}|0|'''·''' [https://tools.wmflabs.org/massviews?project={{SERVERNAME}}&source=subpages&target=http:{{urlencode:{{fullurl:{{ROOTPAGENAME}} book pageviews counter]}}
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

[Wikisource-l] Any news about Bub upload tool?

2017-03-03 Thread Alex Brollo

What about the Bub upload tool (http://tools.wmflabs.org/bub/index)? It seems
unmaintained: it accepts new upload requests but doesn't run the
uploads, and there's no message warning users that it has stopped.

Alex brollo
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] IA Upload tool — higher-quality DjVus

2017-02-08 Thread Alex Brollo

Thanks Sam!
Now we should focus on help pages about the requirements of a good,
wikisource-oriented IA upload: proper scan quality, good file names and
useful metadata. IMHO it would be great to build a "wikisource collection"
on IA, since collection admins can edit any detail of an item except its
ID, and so fix most mistakes.

Alex

2017-02-09 4:10 GMT+01:00 Sam Wilson :

> This new feature is now live on the ia-upload tool:
> http://tools.wmflabs.org/ia-upload/
> Please raise any issues on Github:
> https://github.com/wikisource/ia-upload/issues
>
> The conversion process takes about 15 minutes for most books, it seems
> like. (For books that already have DjVus at IA, it uploads them
> immediately though.)
>
> Thanks,
> Sam.
>
>
> On Thu, 2 Feb 2017, at 09:33 AM, Sam Wilson wrote:
> > I've been tinkering with the ia-upload tool and incorporating Alex
> > Brollo's better system of DjVu generation (better than converting from
> > PDF, that is; instead it works from the original Jpeg2000 files and
> > merges the OCR data in).
> >
> > I've set up a test installation of the tool at
> > http://tools.wmflabs.org/ia-upload/test/ and would love anyone to have a
> > go at it, and to report any bugs at
> > https://github.com/wikisource/ia-upload/issues
> >
> > Because DjVu generation can take a while (quite a while if you've got a
> > crappy slow laptop like me), the tool runs each job on the grid engine,
> > starting every 5 minutes. The queue is shown on the homepage of the
> > tool, with a status of each job. (Unless you're just re-using an
> > existing DjVu file from the IA, in which case it's just uploaded
> > directly to Commons while you wait, like the tool's always done.)
> >
> > Thanks!
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] The conversion from PDF to DJVU loses too much quality

2017-01-26 Thread Alex Brollo

Yes, presently the IA jp2.zip files are the source for all derived files and
for the OCR. All the derived files are homologous, i.e. the *relative*
coordinates of any element inside the images are identical, even if the
image size varies. This means that a mapping of elements (images or text)
can be carried over to any derived file.

Just an example: when a user crops an image from a djvu file with the
excellent CropTool by Danmichaelo, the coordinates of the crop could be
used to crop the high-resolution jp2 or jpg image, or to get the coordinates
of any piece of text mapped by the OCR.
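
A minimal sketch of that coordinate transfer, assuming boxes are given in pixels as (left, top, right, bottom) and that both renderings show the same page; the function name and the sample sizes are invented for illustration.

```python
def scale_box(box, src_size, dst_size):
    """Map a pixel box (left, top, right, bottom) between two renderings of one page.

    Only the relative position matters, so each coordinate is scaled by the
    ratio of the two image sizes (width for x, height for y).
    """
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    left, top, right, bottom = box
    return (round(left * sx), round(top * sy), round(right * sx), round(bottom * sy))


djvu_size = (1000, 1500)            # hypothetical page size in the djvu
jp2_size = (3000, 4500)             # hypothetical size of the matching jp2
crop_in_djvu = (100, 200, 400, 500)
crop_in_jp2 = scale_box(crop_in_djvu, djvu_size, jp2_size)
```

The same function works in the other direction, for example to place OCR word boxes measured on the jp2 onto a smaller derived image.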

Alex







2017-01-27 0:53 GMT+01:00 Sam Wilson :

> Good to know, thanks!
>
> So, we just stick with jp2.zip
>
> And I love the IA magic :)
>
>
> On Fri, 27 Jan 2017, at 07:40 AM, Andrea Zanni wrote:
>
> AFAIK, IA always produce the jp2 files by himself.
> I suggest GLAMs to upload zipped folders of jpegs,
> so IA can do his magic and produce a book viewer and a PDF as well as the
> jp2.
>
> On Fri, Jan 27, 2017 at 12:10 AM, Sam Wilson  wrote:
>
>
>
>
>
> On Thu, 26 Jan 2017, at 06:35 PM, Andrea Zanni wrote:
>
> The problem for me is that librarians and other people who are genuinely
> interested in Wikisource and IA
> don't understand why
> * they upload a good scan on IA
> * see a good book on IA, via the viewer
> * get an horrible djvu on Wikisource.
>
> This is the issue we should try to solve, otherwise we will lose a
> potential important ally, content and new userbase.
> Aubrey
>
>
>
> Definitely!
>
> On a related note: most (all?) IA-scanned books have e.g. *_jp2.zip files
> containing all the original scan images, but is there any standard for
> user-uploaded books? Like your librarians above, I assume they're uploading
> individual jpg/png files? Do these get combined into a single zip? I'm
> thinking that they don't, and that ia-upload needs to provide the option of
> using any of the following sources:
>
>- .djvu
>- _jp2.zip (there's also _jpg.zip and _raw_jp2.zip, but I guess we
>don't need to use them?)
>- *.jpg + *.jp2 + *.png (i.e. use all images in the item, apart from
>_cover_image.jpg)
>- .pdf
>
>
> Sound complete? Or are there other ways?
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] The conversion from PDF to DJVU loses too much quality

2017-01-26 Thread Alex Brollo

By now the IA pdf files too are very compressed, sometimes too much, and the
result is unpredictable. The problem is that the viewer, IMHO, uses neither
the djvu nor the pdf, so the quality of the pdf (and of the djvu produced
from it by pdf2djvu) doesn't reflect at all the quality of the viewer images.

The IA pdf needs a good review before being uploaded to Commons.

There are subtle advantages in using djvu instead of pdf, i.e. fixing errors
in the source file (adding/deleting/moving pages, manipulating the text
layer); djvu is a great "wiki" format since it is *open*.

Alex



2017-01-25 11:35 GMT+01:00 Yann Forget :

>
>
> 2017-01-25 8:40 GMT+01:00 Sam Wilson :
>
>>
>> On Wed, 25 Jan 2017, at 03:27 PM, Andrea Zanni wrote:
>>
>> On Wed, Jan 25, 2017 at 1:45 AM, Sam Wilson  wrote:
>>
>>
>> Yann, do you mean you're getting good quality DjVu generated from the
>> PDF? Or from the original scan Jpegs?
>>
>> AFAIU, Yann is using ABBYY finereader to generate a djvu and then uploads
>> it directly to Commons. So outside of our ia-upload tool.
>>
>> Ah, okay. So if it could be done in the tool, that'd be nicer.
>>
>> Yes, it is a question of settings.
>
>> Aubrey: when you say directly use the PDF, you mean for the tool to copy
>> that across to Commons and not create a DjVu?
>>
>>
>> Yes.
>> If the Djvu quality is much lower than the PDF there's no reason to use
>> the djvu over the pdf :-(
>>
>> DjVu has two advantages over PDF: better compression, so smaller files for
> the same content, and better management of the text layer.
> But if the compression is too high, the quality is not good. It is a
> question of a compromise between quality and size.
>
> Yann
>
>
>> Are we saying that we *never* want to use the IA PDF? That if there's a
>> DjVu we use it, and if there isn't we generate our own DjVu from the JP2
>> and djvu.xml files? Or should the tool user make this call and we give them
>> a drop-down list of "PDF only", "Generate DjVu from PDF", and "Generate
>> DjVu from original scans" with a note about the last of these being higher
>> quality but slower?
>>
>> I think I'm in favour of just generating a high-quality DjVu and making
>> it simpler for the end user. But we want to be flexible too. jayantanth
>> mentioned  that he'd like to
>> be able to just upload the PDF for example.
>>
>>
>>
>>
>> I can have a look at adding that feature perhaps? (Anyone else working on
>> this?)
>>
>>
>> Please ;-)
>>
>>
>> I can try!  :-)
>>
>> Aubrey
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] The conversion from PDF to DJVU loses too much quality

2017-01-24 Thread Alex Brollo

My tool is very rough, and some recent tests show that it is not sufficiently
general: it fails on some IA items. So I think it can be considered
simply a proof that the IA jp2 images (which, IMHO, are the ones shown in
the IA viewer) and the _djvu.xml file can be merged into a high-quality
djvu file.

Alex brollo

2017-01-24 12:03 GMT+01:00 Andrea Zanni :

> I added this issue to IA-upload tool on github:
> https://github.com/Tpt/ia-upload/issues/14
>
> Unfortunately, the new PDF > DJVU conversion is useless, as it loses too
> much quality.
> Can we find a solution?
> The IA-Upload tool is a great asset for the whole international community,
> and it's very simple to teach librarians to upload stuff on IA and then
> use it to port it on Commons and Wikisource.
> But when they upload new stuff on IA, we don't have the IA djvu anymore.
> So the tool converts the original PDF to a new DJVU, and this is the part
> of the process that is failing.
>
> I can think of 2 solutions:
> * integrate this script from Alex brollo into the tool:
> https://it.wikisource.org/wiki/Progetto:Bot/Programmi_in_Python_per_i_bot/jp2todjvu.py
> the script creates a good quality djvu
> * have a toggle/top-down menu which allow the user to use directly the
> PDF.
>
> Andrea
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] Upload/import wizard

2017-01-02 Thread Alex Brollo

You can see a great advantage of djvu files over pdf files in the present
file list of any IA item. IA removed the djvu files, but it still builds
and publishes a _djvu.xml file. Why? I presume that IA uses that file
to "map words" in its book viewer, since it has a good text structure
while being *pretty simple*. It can be translated into hOCR and, by editing
its text nodes, the edited text can be loaded back into the djvu file.
It.source is testing, on some texts, tricks to mass-fix the djvu text layer
(removing scannos etc.) *before* uploading the file to Commons.
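
As a hedged illustration of editing the text nodes, the Python sketch below fixes scannos in WORD elements while leaving their coordinates untouched. The sample XML is a hand-written, heavily simplified fragment in the style of a _djvu.xml file, and the correction table is invented; this is not the it.source script.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written sample; real files carry many more attributes.
SAMPLE = """<DjVuXML><BODY><OBJECT><HIDDENTEXT><PAGECOLUMN><REGION><PARAGRAPH>
<LINE><WORD coords="10,40,60,20">cbe</WORD> <WORD coords="70,40,150,20">avventure</WORD></LINE>
</PARAGRAPH></REGION></PAGECOLUMN></HIDDENTEXT></OBJECT></BODY></DjVuXML>"""

# Hypothetical correction table: OCR misreading -> intended word.
SCANNOS = {"cbe": "che"}


def fix_scannos(xml_text, corrections):
    """Replace known OCR misreadings in WORD nodes, keeping their coords."""
    root = ET.fromstring(xml_text)
    for word in root.iter("WORD"):
        if word.text in corrections:
            word.text = corrections[word.text]
    return root


fixed = fix_scannos(SAMPLE, SCANNOS)
words = [(w.text, w.get("coords")) for w in fixed.iter("WORD")]
```

Since only the text of each node changes, the word-to-image mapping stays valid after the fix.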

It's a pity, IMHO, that this magic book format has been disregarded. Its
structure is as *open* as the pdf structure is *closed*.

Alex



2017-01-03 0:19 GMT+01:00 Sam Wilson :

> I wonder if, rather than creating a new IA item, we should just link the
> original IA item to the DjVu on Commons (via a review)? Or is there a
> discoverability benefit to be had by having the DjVu also on IA?
>
>
> On Tue, 3 Jan 2017, at 07:07 AM, Sam Wilson wrote:
>
> Good idea. I guess it's not ideal to end up with two items, but at least
> the 2nd will be updateable from our end.
>
> It looks like we can add HTML links to IA reviews too, which is nice:
> https://archive.org/details/spinoza_etica_paravia
>
>
> On Mon, 2 Jan 2017, at 11:52 PM, Alex Brollo wrote:
>
> Done :-)
>
> Alex
>
> 2017-01-02 16:49 GMT+01:00 Alex Brollo :
>
> Please take a look to https://archive.org/details/spinoza_etica_paravia_djvu,
> this is precisely a djvu-only item that I
> uploaded some days ago. I asked for permission to create "djvu-only items"
> into IA forum and I got it; this is the fiirst item I created; as you see
> there's some "implicit convention" too (the name of item is the original
> one + a _djvu suffix: it has been derived from
> https://archive.org/details/spinoza_etica_paravia) and metadata are the
> same, but a standard warning "Derived from files into L'Etica
> <https://archive.org/details/spinoza_etica_paravia>" into the description
> field.
>
> So far I did not do the last step, t.i. adding a "backlink" from original
> item to the derived one.
>
> internetarchive.py allows to automatize the whole work (to download
> metadata of source item, to build the new item name and to add the warning
> do description field and to upload the new item).
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] Upload/import wizard

2017-01-02 Thread Alex Brollo

Done :-)

Alex

2017-01-02 16:49 GMT+01:00 Alex Brollo :

> Please take a look to https://archive.org/details/spinoza_etica_paravia_djvu,
> this is precisely a djvu-only item that I uploaded some days ago. I
> asked for permission to create "djvu-only items" into IA forum and I got
> it; this is the fiirst item I created; as you see there's some "implicit
> convention" too (the name of item is the original one + a _djvu suffix: it
> has been derived from https://archive.org/details/spinoza_etica_paravia)
> and metadata are the same, but a standard warning "Derived from files
> into L'Etica <https://archive.org/details/spinoza_etica_paravia>" into
> the description field.
>
> So far I did not do the last step, t.i. adding a "backlink" from original
> item to the derived one.
>
> internetarchive.py allows to automatize the whole work (to download
> metadata of source item, to build the new item name and to add the warning
> do description field and to upload the new item).
>
> Alex
>
>
> 2017-01-02 14:37 GMT+01:00 Sam Wilson :
>
>>
>>
>> On Mon, 2 Jan 2017, at 05:29 PM, Andrea Zanni wrote:
>>
>>
>> Ideally, we should talk to IA about this.
>> Adding a comment on the IA item is a very low-cost solution and I think
>> is important, adding the djvu would be much better. We should check if a
>> script can edit every kind of item and add files (I think not).
>> Aubrey
>>
>>
>> Yes, good idea about talking to them.
>>
>> I wonder about the workflow too, because what about the situation of
>> someone uploading a new work with our tool: the script creates a new IA
>> item then (I assume as the 'wikisource-import-tool' or whatever user) and
>> then it will have full permissions over that item. So the update-DjVu
>> scenario will only apply for IA items that already exist but which don't
>> have DjVu files (i.e. only the last few months' worth). Which is good...
>>
>> —sam
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] Upload/import wizard

2017-01-02 Thread Alex Brollo

Please take a look at https://archive.org/details/spinoza_etica_paravia_djvu:
this is precisely a djvu-only item that I uploaded some days ago. I asked
on the IA forum for permission to create "djvu-only items" and I got it; this
is the first item I created. As you see, there's an "implicit convention"
too: the name of the item is the original one plus a _djvu suffix (it has
been derived from https://archive.org/details/spinoza_etica_paravia), and
the metadata are the same, except for a standard warning, "Derived from
files into L'Etica <https://archive.org/details/spinoza_etica_paravia>",
in the description field.

So far I have not done the last step, i.e. adding a "backlink" from the
original item to the derived one.

internetarchive.py makes it possible to automate the whole job (downloading
the metadata of the source item, building the new item name, adding the
warning to the description field and uploading the new item).
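
The naming and metadata convention described above could be sketched as follows. This is plain Python and deliberately makes no calls to the internetarchive library; the function name, the HTML form of the link and the choice to append the warning to any existing description are assumptions for illustration.

```python
def derived_item(source_id, source_metadata):
    """Build the identifier and metadata of a djvu-only item derived from `source_id`."""
    new_id = source_id + "_djvu"
    source_url = "https://archive.org/details/" + source_id
    title = source_metadata.get("title", source_id)
    warning = f'Derived from files into <a href="{source_url}">{title}</a>'

    metadata = dict(source_metadata)  # same metadata as the source item
    old_description = metadata.get("description", "")
    # Assumption: the warning is appended to an existing description, if any.
    metadata["description"] = f"{old_description}\n{warning}".strip()
    return new_id, metadata


new_id, meta = derived_item(
    "spinoza_etica_paravia",
    {"title": "L'Etica", "language": "ita"},
)
```

The download and upload steps would then wrap this function, using whatever the internetarchive package provides for them.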

Alex

2017-01-02 14:37 GMT+01:00 Sam Wilson :

>
>
> On Mon, 2 Jan 2017, at 05:29 PM, Andrea Zanni wrote:
>
>
> Ideally, we should talk to IA about this.
> Adding a comment on the IA item is a very low-cost solution and I think is
> important, adding the djvu would be much better. We should check if a
> script can edit every kind of item and add files (I think not).
> Aubrey
>
>
> Yes, good idea about talking to them.
>
> I wonder about the workflow too, because what about the situation of
> someone uploading a new work with our tool: the script creates a new IA
> item then (I assume as the 'wikisource-import-tool' or whatever user) and
> then it will have full permissions over that item. So the update-DjVu
> scenario will only apply for IA items that already exist but which don't
> have DjVu files (i.e. only the last few months' worth). Which is good...
>
> —sam
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] Upload/import wizard

2017-01-02 Thread Alex Brollo

The problem is that many new IA pdf files have a poor resolution or too high
a compression from the beginning, so their quality can't be improved.

The IA viewer doesn't use the pdf or djvu file; it uses jpg images derived
from the jp2 images. This explains why the images seen in the viewer are so
beautiful, while the pdf or djvu files are poor.

@Sam: about uploading a djvu into an IA item that lacks one: no, nobody but
the original contributor or a sysop can upload files into an item. But it
can be uploaded as a new item linked to the original one; its link could be
shown in the source item by adding a comment (a "review").



2017-01-02 12:52 GMT+01:00 Ankry :

> > Very interesting.
> >
> > About djvu files on IA, they can be built simply by pdf2djvu from pdf
> > files
> > of IA, but quality is very poor;
> [...]
>
> Did you try to set the -d parameter to something higher than the default
> 300?
> While converting PDF files from Polish digital libraries, I often use -d
> 450 or -d 600 with good results.
>
> Ankry
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] Upload/import wizard

2017-01-02 Thread Alex Brollo

Very interesting.

About djvu files on IA: they can be built simply with pdf2djvu from the IA
pdf files, but the quality is very poor; or they can be built, with some
more pain, from the _jp2.zip images merged with the _djvu.xml file, in which
case the quality is high but the resulting djvu is heavy.

As Aubrey said some time ago, it.source uses a Python script to do the
latter job, but it is a DIY (do it yourself) script, just to prove that *it
can be done*.
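
For the first route only (pdf2djvu), a small Python sketch of building the command line with an explicit resolution. It uses pdf2djvu's -d (dpi) and -o (output) options; the file names are placeholders, and the 300 dpi default is taken from the reply in this thread rather than checked against a particular pdf2djvu version.

```python
import shlex


def pdf2djvu_command(pdf_path, djvu_path, dpi=300):
    """Return the argument list for converting one PDF to DjVu at `dpi`."""
    return ["pdf2djvu", "-d", str(dpi), "-o", djvu_path, pdf_path]


cmd = pdf2djvu_command("book.pdf", "book.djvu", dpi=600)
printable = shlex.join(cmd)  # shell-quoted form, handy for logging
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would run the conversion. A higher dpi cannot restore detail that the source PDF has already lost to compression.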

Alex

2017-01-02 5:29 GMT+01:00 Sam Wilson :

>
> Hi all,
>
> I've attempted to start a phab ticket about what the import wizard
> should look like:
> https://phabricator.wikimedia.org/T154413
>
> There are plenty of unanswered questions I'm sure, and lots missing
> still. Please edit the task or add comments about anything.
>
> This is 2016 Wishlist #73, so I'm not sure it'll get much 'official'
> comm-tech time (yet; there *is* a plan to address further-down wishes,
> but they may take some time), but I'm keen to work on it in my own time
> anyway.
>
> One thing I'd love to have in a Wikisource upload wizard is a thing that
> I can show to Glam people that makes it easier for them to see the value
> (and ease) in getting their stuff online and ready for crowd-sourced
> transcription. :-)
>
> Thanks,
> Sam.
>
>

Re: [Wikisource-l] Make Wikisource "book-based"

2016-11-16 Thread Alex Brollo

Please don't underestimate the critical point of the editing interface and
editing tools (even if *small*...): proofreading is the core of Wikisource
work and it is presently the bottleneck. Any minor gain in proofreading
comfort, speed and precision is highly valuable and really effective.


Alex


2016-11-16 13:29 GMT+01:00 David Cuenca Tudela :

> I like the idea of finishing the development of BookManager2, it is a pity
> that the code was not reviewed
> https://www.mediawiki.org/wiki/Extension:BookManagerv2
>
> Cheers,
> Micru
>
> On Wed, Nov 16, 2016 at 12:23 PM, Andrea Zanni 
> wrote:
>
>> I made a draft on a proposal here:
>> https://meta.wikimedia.org/wiki/2016_Community_Wishlist_Survey/Categories/Wikisource#Make_Wikisource_.22book-based.22
>>
>> Please edit, it's a draft and I think it's a problem many of us know from
>> a long time.
>>
>> The idea is that, if we had a system for MediaWiki to know what a book
>> is,
>> we could
>> * better metadata management system, thus a better integration with
>> Wikidata
>> * we could develop better tools, for example create automatic Indexes and
>> navigation templates, which would maybe enable us to "import" EPUBs
>> directly on Wikisource
>> * have a better workflow, making it easier to new users to use us
>> * have even better statistics and analytics: which are the *books* (not
>> the pages!) which are read on Wikisource?
>>
>> etc.
>>
>> IMHO, the Community Wishlist is a great opportunity to push for a
>> systemic development of Wikisource: I'm convinced we don't need little
>> tools here and there, but a systematic overview of what Wikisource
>> framework, workflows and software.
>> I'm open to discuss the merging of different proposals in the Wishlist.
>>
>> Aubrey
>>
>>
>>
>>
>
>
> --
> Etiamsi omnes, ego non
>
>
>

Re: [Wikisource-l] Fwd: [Wikitech-ambassadors] Your help needed: Community Wishlist Survey 2016

2016-11-13 Thread Alex Brollo

Well, I'll try. It's something that can be done locally just to test it.
IMHO it only takes hiding the level radio buttons, replacing them with a
brief list of checkboxes, then using their values to derive the canonical
level 0-4, and saving them somewhere in the page code with some clever
trick.

I presume that a template could do the work (the resulting code could be
simply {{level|0|0|1|0|}} or more verbose, with named parameters) and,
at the same time, the template could generate categories.
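
The idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the step names, the named parameters of {{level}} and the rule that maps the checkboxes to a canonical level are all hypothetical, chosen so that "formatted but not proofread" and "proofread but not formatted" both stay at level 1.

```python
# Hypothetical step names, one per checkbox.
STEPS = ("proofread", "formatted", "linked", "validated")


def canonical_level(done):
    """Derive a canonical level from the checkbox states (assumed mapping).

    Level 3 needs both proofreading and formatting, level 4 also needs
    validation; anything else stays at level 1. Levels 0 and 2 are not
    derived here.
    """
    core = bool(done.get("proofread")) and bool(done.get("formatted"))
    if core and done.get("validated"):
        return 4
    if core:
        return 3
    return 1


def level_template(done):
    """Render the checkbox states as a verbose {{level}} call with named parameters."""
    params = "|".join(f"{name}={int(bool(done.get(name)))}" for name in STEPS)
    return "{{level|" + params + "}}"


if __name__ == "__main__":
    page = {"formatted": True}  # complex formatting added to rough text
    print(canonical_level(page), level_template(page))
```

The template itself could then emit one category per missing step, as suggested above.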

Alex

2016-11-11 22:02 GMT+01:00 mathieu stumpf guntz <
psychosl...@culture-libre.org>:

>
>
> Le 11/11/2016 à 09:17, Alex Brollo a écrit :
>
> I'd like to define a "binary page quality", splitting the workflow into its
> basic steps (proofreading of text; formatting; adding links;
> validating), i.e. into a set of true/false states, clearly showing the
> list of missing steps. E.g. sometimes I quickly add complex formatting to
> rough text, and this results in an exotic "level": proofreading=false,
> formatting=true. It's a level 1, but it is deeply different from a level 1
> coming from proofreading=true, formatting=false.
>
> That's closer to the idea I had in mind. :)
>
>
> Obviously the whole "binary level" could be simply stored as a number,
> with useful information into it.
>
> Alex
>
> 2016-11-11 8:32 GMT+01:00 Sam Wilson :
>
>> That sounds really interesting! Do you mean as a way for people
>> unfamiliar with Wikisource to easily contribute notes and corrections? On
>> the face of things, it could perhaps work by storing the notes in the
>> Page_talk namespace and doing some clever thing to display them on the Page
>> (and perhaps in main) namespaces.
>>
>> It seems like it'd be cool to be able to get "typo reports" or something,
>> from people who mightn't have any idea of Wikisource other than that's
>> where they got an epub.
>>
>> To rate a page, we currently have the various levels of proofreading
>> quality. Is this not sufficient? And does the current Index page overview
>> of all of a book's statuses work for you? I sometimes wonder if we need
>> another rating, above 'validated', that indicates that a whole book has
>> been read through and (hopefully) any remaining typos have been found.
>>
>> —sam
>>
>> On Fri, 11 Nov 2016, at 12:27 AM, mathieu stumpf guntz wrote:
>>
>> Hmm, at the conference I think someone was interested in a feature to
>> make comments on texts, like you can make on some word processors for
>> example. That may be interesting, but how you render the result might be a
>> huge user interface problem. One should be able to choose whom comments
>> should be visible…
>>
>> Otherwise, I would still be happy to have a more flexible way to "rate" a
>> page. That is, a page might have its text proofread but be lacking some CSS,
>> or a picture should be extracted, etc. Having a way to see that for all pages
>> in the book: namespace would be fine.
>>
>> ĝis baldaŭ
>>
>> Le 10/11/2016 à 06:09, Sam Wilson a écrit :
>>
>> Thanks Alex :) It's a minor project so far, but I reckon the work you've
>> been doing on making a better, bigger, more proofreading-focused
>> interface is really good. Do stick a proposal up!
>>
>> So far, we've got:
>>
>> * Add a 'clean' method for side-titles, and side notes to parser
>> * A spelling- and typo-checking system for proofreading
>> * Visual Editor menu refresh
>> * upload text wizard
>> * Language links in Wikisource for edition items in Wikidata
>> * Display subpage name in category
>> * Make Special:IndexPage transcludeable
>> * Fix Extension:Cite to get rid of foibles
>>
>> If anyone's got half-formed ideas, I'd encourage you to post something,
>> or just post to this mailing list, and we can all have a chat about it.
>> :)
>>
>> —sam
>>
>>
>> On Wed, 9 Nov 2016, at 04:50 PM, Alex Brollo wrote:
>>
>>
>> I too could add *some* proposals but the first one could be a deep 
>> revision of nsPage edit interface to got the goal "fixed tools, almost full 
>> screen scrolling text & image". In the meantime, I'm go on testing 
>> FullScreenEditing.js by Sam, that presently is an excellent, running  step 
>> approximating such a goal.
>>
>> Alex
>>
>> 2016-11-09 1:03 GMT+01:00 Sam Wilson :
>>
>> Huzza for Wikisource; we've currently got more proposals

Re: [Wikisource-l] Fwd: [Wikitech-ambassadors] Your help needed: Community Wishlist Survey 2016

2016-11-11 Thread Alex Brollo

Perhaps the logic could be reversed, i.e. with a list of specific to-do
steps *needed* for a specific page: "This page needs proofreading? yes/no;
needs formatting? yes/no; needs image managing? yes/no"; and so on. With
this approach, a new page could have all steps *flagged*, but some could be
immediately unflagged, since the page doesn't need the step (if a page has
no picture inside, there isn't any need for image managing). So, a level 4
page will be by definition *a page with no pending flag*, and it will be
very simple to categorize pages by their pending flags.
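
A rough sketch of this "pending flags" logic, storing the whole state as a single number with one bit per step. The step names and bit values are assumptions made only for the example.

```python
from enum import IntFlag


class Pending(IntFlag):
    """One bit per step that a page may still need (illustrative names)."""
    PROOFREADING = 1
    FORMATTING = 2
    LINKS = 4
    IMAGES = 8


ALL_STEPS = Pending.PROOFREADING | Pending.FORMATTING | Pending.LINKS | Pending.IMAGES


def new_page(not_needed=Pending(0)):
    """A new page starts with every step flagged, minus the steps it does not need."""
    return Pending(int(ALL_STEPS) & ~int(not_needed))


def mark_done(state, step):
    """Clear the flag of a completed step."""
    return Pending(int(state) & ~int(step))


def is_finished(state):
    """A page with no pending flag is, by definition, a finished page."""
    return int(state) == 0


def pending_names(state):
    """List the steps still pending, e.g. to generate one category per flag."""
    return [step.name for step in Pending if step & state]


if __name__ == "__main__":
    state = new_page(not_needed=Pending.IMAGES)  # a page with no pictures
    print(int(state), pending_names(state))
```

Since the state is just an integer, it could be saved in the page code as a single value and decoded again by a template or a script.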

Alex


2016-11-11 10:00 GMT+01:00 Andrea Zanni :

> I remember when we tried to make a partnership with a scholar who works
> with ancient texts.
> He needed some Italian translation of Greek texts in Wikisource, but he
> was much more interested in validated/proofread text *without* formatting,
> than the contrary.
> 75% for us is formatted, always.
> But, arguably, for people it's easier to correct typos and proofread than
> format with strange templates and codes. We always assume that people know
> how Wikisource works, how wikicode works, etc.
>
> A brand new quality workflow could be beneficial.
>
> Aubrey
>
>
> On Fri, Nov 11, 2016 at 9:54 AM, Alex Brollo 
> wrote:
>
>>  coupled with a KISSing approach  it could run perhaps :-)
>>
>> Alex
>>
>> 2016-11-11 9:37 GMT+01:00 Sam Wilson :
>>
>>> Yes, makes sense! Or a series of attributes like:
>>>
>>> proofread once?
>>> proofread twice?
>>> formatted?
>>> all images added?
>>> hyperlinked?
>>> transcluded?
>>> read in context with other pages?
>>> etc.
>>>
>>> Only some of which need be linear.
>>>
>>> And only when all are done is the thing considered bonzer. :-)
>>>
>>> —sam
>>>
>>> On Fri, 11 Nov 2016, at 04:17 PM, Alex Brollo wrote:
>>>
>>> I'd like to define a "binary page quality", splitting the workflow into
>>> its basic steps (proofreading of text; formatting; adding links;
>>> validating), i.e. into a set of true/false states, clearly showing the
>>> list of missing steps. E.g. sometimes I quickly add complex formatting to
>>> rough text, and this results in an exotic "level": proofreading=false,
>>> formatting=true. It's a level 1, but it is deeply different from a level 1
>>> coming from proofreading=true, formatting=false.
>>> Obviously the whole "binary level" could be simply stored as a number,
>>> with useful information into it.
>>> Alex
>>>
>>> 2016-11-11 8:32 GMT+01:00 Sam Wilson :
>>>
>>>
>>> That sounds really interesting! Do you mean as a way for people
>>> unfamiliar with Wikisource to easily contribute notes and corrections? On
>>> the face of things, it could perhaps work by storing the notes in the
>>> Page_talk namespace and doing some clever thing to display them on the Page
>>> (and perhaps in main) namespaces.
>>>
>>> It seems like it'd be cool to be able to get "typo reports" or
>>> something, from people who mightn't have any idea of Wikisource other than
>>> that's where they got an epub.
>>>
>>> To rate a page, we currently have the various levels of proofreading
>>> quality. Is this not sufficient? And does the current Index page overview
>>> of all of a book's statuses work for you? I sometimes wonder if we need
>>> another rating, above 'validated', that indicates that a whole book has
>>> been read through and (hopefully) any remaining typos have been found.
>>>
>>>
>>> —sam
>>>
>>>
>>> On Fri, 11 Nov 2016, at 12:27 AM, mathieu stumpf guntz wrote:
>>>
>>> Hmm, at the conference I think someone was interested in a feature to
>>> make comments on texts, like you can make on some word processors for
>>> example. That may be interesting, but how you render the result might be a
>>> huge user interface problem. One should be able to choose whom comments
>>> should be visible…
>>>
>>> Otherwise, I would still be happy to have a more flexible way to "rate"
>>> a page. That is, a page might have its text proofread but be lacking some
>>> CSS, or a picture should be extracted, etc. Having a way to see that for all
>>> pages in the book: namespace would be fine.
>>>
>>> ĝis baldaŭ
>>>
>>> Le 10/11/2016 à 06:09, Sam Wilson a écrit :
>>>
>>

Re: [Wikisource-l] Fwd: [Wikitech-ambassadors] Your help needed: Community Wishlist Survey 2016

2016-11-09 Thread Alex Brollo

I too could add *some* proposals, but the first one could be a deep
revision of the nsPage edit interface to reach the goal "fixed tools, almost
full screen scrolling text & image". In the meantime, I'm going on testing
FullScreenEditing.js by Sam, which is presently an excellent, running step
approximating such a goal.

Alex

2016-11-09 1:03 GMT+01:00 Sam Wilson :

> Huzza for Wikisource; we've currently got more proposals than any of the
> other categories (not that it's a competition, but still...).
>
> @Micru: this whole topic of how to represent bibliographic data in WD and
> properly link it in Wikisource is great! I'm looking forward to helping. :-)
>
> —sam
>
>
> On Tue, 8 Nov 2016, at 10:08 PM, David Cuenca Tudela wrote:
>
> Hi Thomas,
> thanks for bringing that up! I wrote a proposal to finish the work
> retrieving the language links from several editions and represent them in
> wikisource as language links.
>
> To write or vote exiting Wikisource proposals, the link is:
> https://meta.wikimedia.org/wiki/2016_Community_Wishlist_Survey/Categories/Wikisource
> Cheers,
> Micru
>
> On Tue, Nov 8, 2016 at 10:06 AM, Thomas PT  wrote:
>
> Hello everyone,
>
> The Wikimedia Foundation Community Tech team has launched a new "Community
> Wishlist Survey".
> Last year survey allowed us to get WMF staff time to work on using Google
> OCR in Wikisource that allowed some Indian languages Wikisources to raise
> and on VisualEditor support.
>
> Please, take time to submit new wishes and comment them. It could be
> simple things (e.g. a new gadget for a specific workflow) or very
> complicated ones (e.g. native TEI support).
>
> Cheers,
>
> Thomas
>
>
> Début du message réexpédié :
>
> *De: *Johan Jönsson 
> *Objet: **[Wikitech-ambassadors] Your help needed: Community Wishlist
> Survey 2016*
> *Date: *7 novembre 2016 à 20:26:21 UTC+1
> *À: *Wikitech Ambassadors 
> *Répondre à: *Coordination of technology deployments across
> languages/projects 
>
> Hi everyone,
>
> Last year, the Community Tech team did a survey for a community wishlist
> to decide what we should be working on throughout the year. Since it's
> useful to have a list of tasks from the Wikimedia communities, it's also
> been used by other developers, been the focus of Wikimedia hackathons and
> so on. In short, I think it matters.
>
> Now we're doing the process again.
>
> https://meta.wikimedia.org/wiki/2016_Community_Wishlist_Survey
>
> If you'd feel like spreading this in your communities, it would be much
> appreciated.
>
> *) This is when you can suggest things. This phase will last from 7
> November to 20 November.
> *) Editors who are not comfortable writing in English can write proposals
> in their language.
> *) Voting will take place 28 November to 12 December.
>
> Thanks,
>
> //Johan Jönsson
> --
>
>
>
>
> ___
> Wikitech-ambassadors mailing list
> wikitech-ambassad...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-ambassadors
>
>
>
>
>
>
>
> --
> Etiamsi omnes, ego non
>
>
>
>
>

Re: [Wikisource-l] Indic Wikisource Update November 2016

2016-11-03 Thread Alex Brollo

I sometimes go into fr.source as a contributor, even if my French is very
poor; I appreciate fr.source's editing tools for proofreading a lot: they
show a deep interest in any trick to make editing faster, safer, and
more comfortable. This "evidence of care" is very rewarding for any
contributor.

Alex

2016-11-03 12:46 GMT+01:00 Ankry :

> > 2016-11-03 10:12 GMT+01:00 Andrea Zanni :
> >
> >> Thanks Mathieu.
> >> What really strikes me is that challenge is doable in fr.wikisource: in
> >> many others would be complete madness ;-)
> >> Also, Polish Wikisource is doing great.
> >>
> >> What interest me is understanding how they are building their community
> >> of
> >> active and super-active proofreaders: are they doing something that
> >> other
> >> wikisource aren't?
> >>
> >
> > Not sure if there is a link but when you mention fr.ws and pl.ws I can
> > immediately think of a correlation since these two are among the rare
> > wikisources which are prooferead system only (or nearly only : 92 % and
> 96
> > % of mainspace pages back with scan, see
> > http://tools.wmflabs.org/phetools/statistics.php).
> >
> > Cdlt, ~nicolas
>
> In pl.ws we have a policy that if a text *can* be processed using
> ProofreadPage (legal aspects, scan availability) then it *has to* be
> processed using this extention.
>
> Ankry
>
>
>

[Wikisource-l] LivePreview for nsPage: try to make it more useful

2016-10-21 Thread Alex Brollo

Whoever tries LivePreview for nsPage finds that it is not very comfortable,
since the size/position of its output doesn't allow a useful comparison with
the page image or with the wikicode. VisualEditor could be a good solution,
but presently it has limitations when dealing with Wikisource nsPage, mostly
when quickly adding complex formatting without the help of our beloved tools.

Here: https://it.wikisource.org/wiki/Utente:Alex_brollo/livePreview.js is
some tentative JS code to customize the output of LivePreview for Wikisource
needs: the output goes into a box superimposed on the edit area, draggable if
the user prefers a comparison with the wikicode instead of a comparison with
the page image.

Far from being a professional script, it is just to prove that *it can be
done*.
If there's any better solution for that issue, please tell me!

Alex brollo (from it.source)

Re: [Wikisource-l] Importing books from Project Gutenberg

2016-10-16 Thread Alex Brollo

Ok, I'll use https://www.wikidata.org/wiki/Q27245478 as an example and I'll
submit it to it.source WD specialists to see if we can retrieve, or add
data for a test work.

Alex



2016-10-16 1:28 GMT+02:00 Sam Wilson :

> Hm, it should work fine for it.ws too. Can you give me a WD item for a
> book with a PG ID and a it.ws Index page? I'll investigate further... :-)
>
> One cool thing that I've only recently found is this list of PG's sources:
> http://www.pgdp.net/c/tools/project_manager/show_image_sources.php (you
> need to log in)
>
> It's not very structured, but it's the only place I've found that links a
> PG ID to a scan on the Internet Archive or elsewhere. I'm thinking of
> writing a scraper to get the data so that it can at least link more PG IDs
> and IA identifiers on Wikidata.
>
> —Sam
>
> On 13/10/16 23:27, Andrea Zanni wrote:
>
> I think the idea is good,
> but I would like to try that in my wikisource:
> could you manage to take also the few italian books that PG has?
> Thanks!
>
> On Fri, Oct 14, 2016 at 8:23 AM, Anika Born 
> wrote:
>
>> corr1: [...] does not ha*ve*/show the scans, [...]
>>
>> Anika
>>
>> 2016-10-14 8:18 GMT+02:00 Anika Born :
>>
>>> Hy Sam,
>>>
>>> would be good, cause PG does not hat/show the scans,
>>>
>>> But
>>>
>>> as I remember there was/is a policy at de.ws to not use texts from
>>> other projects (say: if there is text A in PG, there won't be a similar
>>> text A in de.WS),
>>>
>>> cause at the time de.WS did use PG-texts... Google said WS is a mirror
>>> of PG and all other (not PG)-texts were left out in Google-Search-Results
>>> as well  The (small) visibility of WS got lost completely... That is
>>> the reason, why there are no new projects on de-WS about texts that are
>>> available in a (nearly) similar project
>>>
>>> (besides the effort: why spending so much time on a text that already is
>>> avilable? - you'd have to proofread ist at least two times)
>>>
>>>
>>> But that is this special German-thing.
>>>
>>>
>>> What do the others think about it?
>>> Anika
>>>
>>> 2016-10-14 3:20 GMT+02:00 Sam Wilson :
>>>
>>>> Hi all,
>>>>
>>>> I've been tinkering with an idea I've had for importing Project
>>>> Gutenberg books into Wikisource: http://tools.wmflabs.org/pg2ws/
>>>>
>>>> The idea is that, if Wikidata makes a link between a PG ID number and a
>>>> Wikisource Index page, then we can go through that Index page one page at a
>>>> time, and copy the page's text from the PG book to the WS page.
>>>>
>>>> The interface so far isn't very brilliant, but I'm just trying to
>>>> figure out if this is worthwhile or not. Basically, it's a matter of
>>>> selecting the right chunk of text in the right-most text box (the full PG
>>>> text) and hitting the button to move it left into the centre box. Then
>>>> cleaning it up (manually and with the magic cleaning button) to make it
>>>> match the image, and then uploading it to Wikisource.
>>>>
>>>> It's a bad tool though, because it doesn't handle the running header,
>>>> and the copy-across button doesn't do nice things with {{hws}} etc. — not
>>>> to mention all the other things it doesn't do.
>>>>
>>>> Anyway, just thought I'd mention it. :-) Anyone think this is an avenue
>>>> worth exploring? Certainly I'd love to be able to say we've got everything
>>>> PG has *and more*!
>>>>
>>>> —Sam
>>>>
>>>> PS changes made by this tool are all tagged as "OAuth CID: 638" —
>>>>
>>>> https://en.wikisource.org/w/index.php?title=Special:RecentChanges&tagfilter=OAuth+CID%3A+638



>>>
>>
>>
>>
>
>
>
>
>
>
>

Re: [Wikisource-l] Importing books from Project Gutenberg

2016-10-14 Thread Alex Brollo

Back to the tool, is there some more doc to understand, step by step, how
to run it? I imagine that it needs a Gutenberg text and a Wikisource Index
page coming from the same edition used by the Gutenberg text; then the tool
allows something like a "manual match and split". But perhaps I didn't
understand anything. I need to see the tool at work to understand it! :-(

At its beginning, it.source uploaded many books from an Italian project,
LiberLiber, somewhat similar to Project Gutenberg, and we often convert
those ns0-only texts into proofread ones by various tricks; so I'd like to
learn anything I can from Sam's tool.

Alex

2016-10-14 12:55 GMT+02:00 Anika Born :

> Hy Alex,
>
> My comment was not about spending some time on a PG-Projekt or not
> spending any time at all.
>
> The point/question (when it comes to de-WS) is a different one:
>
> (A) to spend some of our valuable contributions into a project that
> already is freely available (in another format) or spend this time in a
> (related) project that is NOT already freely available? (and we do have a
> lot of them)
>
> // note, it is not about not spending any time in proofreading or the
> Wikisourceproject... it is about finding valuable projects/texts to invest
> our time...
>
>
> + (B) to spend this time in a project, that may cost us the findability of
> the whole wikisource-project (and all other texts on wikisource) because
> Google/Bing/others do tag us as fork/reuser/copy of ... (as happened in the
> past, at least with de, when we had some texts of the commercial
> http://gutenberg.spiegel.de/ that is also supported by ABBY with a free
> softwarelizense)
>
>
> Anika
>
> 2016-10-14 10:13 GMT+02:00 Alex Brollo :
>
>> I'm too very interested both into the idea and into its technical
>> implementation, but I need some more doc for dummies to understand it fully
>> :-(
>>
>> About importing into wikisource texts alreary proofread: a text into
>> wikisource is different from a similar text into another web site, since it
>> is "a node into wiki network", and this goal deserves IMHO some pain to
>> proofread (and re-format)  it again, adding lots of wiki cross links.
>>
>> Alex
>>
>>
>> 2016-10-14 8:27 GMT+02:00 Andrea Zanni :
>>
>>> I think the idea is good,
>>> but I would like to try that in my wikisource:
>>> could you manage to take also the few italian books that PG has?
>>> Thanks!
>>>
>>> On Fri, Oct 14, 2016 at 8:23 AM, Anika Born 
>>> wrote:
>>>
>>>> corr1: [...] does not ha*ve*/show the scans, [...]
>>>>
>>>> Anika
>>>>
>>>> 2016-10-14 8:18 GMT+02:00 Anika Born :
>>>>
>>>>> Hy Sam,
>>>>>
>>>>> would be good, cause PG does not hat/show the scans,
>>>>>
>>>>> But
>>>>>
>>>>> as I remember there was/is a policy at de.ws to not use texts from
>>>>> other projects (say: if there is text A in PG, there won't be a similar
>>>>> text A in de.WS),
>>>>>
>>>>> cause at the time de.WS did use PG-texts... Google said WS is a mirror
>>>>> of PG and all other (not PG)-texts were left out in Google-Search-Results
>>>>> as well  The (small) visibility of WS got lost completely... That is
>>>>> the reason, why there are no new projects on de-WS about texts that are
>>>>> available in a (nearly) similar project
>>>>>
>>>>> (besides the effort: why spending so much time on a text that already
>>>>> is avilable? - you'd have to proofread ist at least two times)
>>>>>
>>>>>
>>>>> But that is this special German-thing.
>>>>>
>>>>>
>>>>> What do the others think about it?
>>>>> Anika
>>>>>
>>>>> 2016-10-14 3:20 GMT+02:00 Sam Wilson :
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I've been tinkering with an idea I've had for importing Project
>>>>>> Gutenberg books into Wikisource: http://tools.wmflabs.org/pg2ws/
>>>>>>
>>>>>> The idea is that, if Wikidata makes a link between a PG ID number and
>>>>>> a Wikisource Index page, then we can go through that Index page one page 
>>>>>> at
>>>>>> a time, and copy the page's text from the PG book to the WS page.
>>>>>>
>>>>

Re: [Wikisource-l] Importing books from Project Gutenberg

2016-10-14 Thread Alex Brollo

I too am very interested both in the idea and in its technical
implementation, but I need some more doc for dummies to understand it fully
:-(

About importing already proofread texts into Wikisource: a text in
Wikisource is different from a similar text on another web site, since it
is "a node in the wiki network", and this goal IMHO deserves some pain to
proofread (and re-format) it again, adding lots of wiki cross links.

Alex

2016-10-14 8:27 GMT+02:00 Andrea Zanni :

> I think the idea is good,
> but I would like to try that in my wikisource:
> could you manage to take also the few italian books that PG has?
> Thanks!
>
> On Fri, Oct 14, 2016 at 8:23 AM, Anika Born 
> wrote:
>
>> corr1: [...] does not ha*ve*/show the scans, [...]
>>
>> Anika
>>
>> 2016-10-14 8:18 GMT+02:00 Anika Born :
>>
>>> Hy Sam,
>>>
>>> would be good, cause PG does not hat/show the scans,
>>>
>>> But
>>>
>>> as I remember there was/is a policy at de.ws to not use texts from
>>> other projects (say: if there is text A in PG, there won't be a similar
>>> text A in de.WS),
>>>
>>> cause at the time de.WS did use PG-texts... Google said WS is a mirror
>>> of PG and all other (not PG)-texts were left out in Google-Search-Results
>>> as well  The (small) visibility of WS got lost completely... That is
>>> the reason, why there are no new projects on de-WS about texts that are
>>> available in a (nearly) similar project
>>>
>>> (besides the effort: why spending so much time on a text that already is
>>> avilable? - you'd have to proofread ist at least two times)
>>>
>>>
>>> But that is this special German-thing.
>>>
>>>
>>> What do the others think about it?
>>> Anika
>>>
>>> 2016-10-14 3:20 GMT+02:00 Sam Wilson :
>>>
>>>> Hi all,
>>>>
>>>> I've been tinkering with an idea I've had for importing Project
>>>> Gutenberg books into Wikisource: http://tools.wmflabs.org/pg2ws/
>>>>
>>>> The idea is that, if Wikidata makes a link between a PG ID number and a
>>>> Wikisource Index page, then we can go through that Index page one page at a
>>>> time, and copy the page's text from the PG book to the WS page.
>>>>
>>>> The interface so far isn't very brilliant, but I'm just trying to
>>>> figure out if this is worthwhile or not. Basically, it's a matter of
>>>> selecting the right chunk of text in the right-most text box (the full PG
>>>> text) and hitting the button to move it left into the centre box. Then
>>>> cleaning it up (manually and with the magic cleaning button) to make it
>>>> match the image, and then uploading it to Wikisource.
>>>>
>>>> It's a bad tool though, because it doesn't handle the running header,
>>>> and the copy-across button doesn't do nice things with {{hws}} etc. — not
>>>> to mention all the other things it doesn't do.
>>>>
>>>> Anyway, just thought I'd mention it. :-) Anyone think this is an avenue
>>>> worth exploring? Certainly I'd love to be able to say we've got everything
>>>> PG has *and more*!
>>>>
>>>> —Sam
>>>>
>>>> PS changes made by this tool are all tagged as "OAuth CID: 638" —
>>>>
>>>> https://en.wikisource.org/w/index.php?title=Special:RecentChanges&tagfilter=OAuth+CID%3A+638


>>>
>>
>>
>>
>
>
>

Re: [Wikisource-l] Please import your best templates and scripts into mul.source and la.source

2016-09-26 Thread Alex Brollo

About mul.source, there was a hard limitation: when I worked on it, it
was not possible to link its pages to Wikidata. Has that issue perhaps been
fixed?

2016-09-27 7:19 GMT+02:00 Sam Wilson :

> Good point. That's a useful page. (Is it okay that we treat mulws as our
> "Meta"? It seems okay to me, but then there's some discussion about that
> moving to mul.wikisource.org I think...)
>
> Maybe some of these scripts are candidates to be made into MediaWiki
> extensions, or are otherwise getting too big for their boots — if so, we
> can raise them in the Community Wishlist survey, and perhaps give them the
> attention they deserve! :-) (Mainly I say that because things are much
> easier to test etc. when they're not just text-in-wikipages.)
>
> —sam
>
> On Tue, 27 Sep 2016, at 01:02 PM, Bodhisattwa Mandal wrote:
>
> Hi,
>
> Excellent idea!
>
> Also please add a short description of the function of those scripts and
> templates and tabulate them in a single page in mul.wikisource like
> https://wikisource.org/wiki/Wikisource:Shared_Scripts , so that they
> become handy to other Wikisource projects.
>
> Regards,
> Bodhisattwa
>
> On Sep 27, 2016 10:16 AM, "Sam Wilson"  wrote:
>
>
> Good idea! And keep in mind that it can be a good idea to keep scripts on
> one wiki and load them from the others using
> mw.loader.load() — that way there's only one place to update things, and
> everyone can use the same version.
>
> —Sam
>
> On Tue, 27 Sep 2016, at 12:02 PM, Alex Brollo wrote:
>
> I guess that both into larger and into smaller wikisource projects there
> are plentiful of clever templates, modules and javascript tools, developed
> by users for practical goals.
>
> Please take a little bit of time and  import them into mul.source and
> la.source - the latter being something like a second "multi-language
> wikisource". IMHO we need to share our best ideas and tricks, and both
> mul,source and la.source could be good showcases for them.
>
> Alex brollo
>
>
>
>
>
>
>
> ___
> Wikisource-l mailing list
> Wikisource-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikisource-l
>
>
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] Please import your best templates and scripts into mul.source and la.source

2016-09-26 Thread Alex Brollo

Thanks Sam. In fact my heaviest script lives on mul.source, in my user
common.js page (I have no special privileges there), and I can use it from
anywhere via mw.loader.load(); with "a little bit of pain" I run it from
bn.wikisource too.
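
A minimal sketch, not taken from the thread, of the kind of address that is usually handed to mw.loader.load() when a script is kept on one wiki and loaded from the others. It is written in Python only to show how the URL is put together; the user page name "User:Example/common.js" is hypothetical, and the action=raw / ctype=text/javascript query is the common convention for serving a wiki page as JavaScript rather than something stated in this discussion.

```python
from urllib.parse import urlencode


def raw_script_url(wiki_host: str, page_title: str) -> str:
    """Build the URL that serves a wiki page as raw JavaScript."""
    query = urlencode({
        "title": page_title,          # page holding the shared script
        "action": "raw",              # return the page source, not HTML
        "ctype": "text/javascript",   # content type expected for scripts
    })
    return f"https://{wiki_host}/w/index.php?{query}"


# Hypothetical example: a user script hosted on the multilingual Wikisource.
url = raw_script_url("wikisource.org", "User:Example/common.js")
print(url)
```

Keeping the URL construction in one helper means every wiki points at the same single copy of the script, which is the benefit Sam describes.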

Alex


2016-09-27 6:46 GMT+02:00 Sam Wilson :

> Good idea! And keep in mind that it can be a good idea to keep scripts on
> one wiki and load them from the others using
> mw.loader.load() — that way there's only one place to update things, and
> everyone can use the same version.
>
> —Sam
>
> On Tue, 27 Sep 2016, at 12:02 PM, Alex Brollo wrote:
>
> I guess that both larger and smaller Wikisource projects have plenty of
> clever templates, modules and JavaScript tools, developed by users for
> practical goals.
>
> Please take a little time to import them into mul.source and la.source,
> the latter being something like a second "multi-language wikisource".
> IMHO we need to share our best ideas and tricks, and both mul.source and
> la.source could be good showcases for them.
>
> Alex brollo

[Wikisource-l] Please import your best templates and scripts into mul.source and la.source

2016-09-26 Thread Alex Brollo

I guess that both larger and smaller Wikisource projects have plenty of
clever templates, modules and JavaScript tools, developed by users for
practical goals.

Please take a little time to import them into mul.source and la.source,
the latter being something like a second "multi-language wikisource".
IMHO we need to share our best ideas and tricks, and both mul.source and
la.source could be good showcases for them.

Alex brollo

Re: [Wikisource-l] IA Upload tool

2016-09-26 Thread Alex Brollo

Time to develop an excellent, multilingual wiki OCR (better: hOCR) service,
isn't it?

Alex



2016-09-26 15:04 GMT+02:00 Andrea Zanni :

> (also, BUB is not currently working)
>
> On Mon, Sep 26, 2016 at 2:55 PM, Andrea Zanni 
> wrote:
>
>> Hello everyone,
>> can I ask you if you are currently using IA Upload tool
>> with IA books that *do not* have a Djvu file?
>>
>> I've been trying for a few weeks to upload this book
>> https://archive.org/details/ComeRuinareLAutoritaImage
>>
>> with the tool, and in theory IA-upload can now make the djvu by
>> itself,
>> but in this case it's not working.
>>
>> But maybe it's just this book.
>> Did you have any issues?
>>
>> Andrea
>>
>
>

[Wikisource-l] Good news about CropTool

2016-08-16 Thread Alex Brollo

CropTool can now crop illustrations from djvu and pdf files and upload them
to Commons. We are testing the tool at it.wikisource and we all agree that
it is now a surprisingly useful tool for daily wikisource work.
Please test it and give some feedback and more suggestions to its
author, Danmichaelo (*thanks Dan!*)
The local gadget that calls the tool from Commons:
https://commons.wikimedia.org/wiki/MediaWiki:Gadget-CropTool.js
should be implemented in wikisource projects with some changes; see the
it.source version:
https://it.wikisource.org/wiki/MediaWiki:Gadget-CropTool.js as an example.

Alex brollo

Re: [Wikisource-l] Splitting Books for Wikisource

2016-05-22 Thread Alex Brollo

Also try Briss: http://briss.sourceforge.net/ , it's free and very fast, and
it preserves the mapped text if the pdf has an OCR layer.

Alex

2016-05-22 10:22 GMT+02:00 Satdeep Gill :

> Thanks a lot everyone and special thanks to Antonis.
>
> Regards
> Satdeep Gill
>
> > On 22-May-2016, at 1:39 PM, wikisource-l-requ...@lists.wikimedia.org
> wrote:
> >
> > Re: [Wikisource-l] Splitting Books for Wikisource
>

Re: [Wikisource-l] [pywikibot] pdf library

2016-05-13 Thread Alex Brollo

You may be right: so far my tests have been done on one book only. As
soon as a python tool to get djvu from _jp2 runs with no human effort,
I'll try it on lots of books to get some "general rule".

But can you confirm that the IA viewer shows jpg images coming from the
jp2-jpg folder?

Another problem when using the original IA pdf (again, I tested it on one
book only: see https://it.wikisource.org/wiki/Indice:Tarchetti_-_Paolina.pdf )
is that the OCR text retrieved by the mediawiki software is horrible in
structure; please try to create any page of that Index. With pdftotext
(xpdf) too, results are far from good.

Alex

2016-05-13 11:20 GMT+02:00 Federico Leva (Nemo) :

> Alex Brollo, 13/05/2016 11:06:
>
>> Simply, from a practical point of view, my suggestion is: don't try to
>> get a good djvu from the IA pdf; use the _jp2.zip images instead (after
>> conversion to jpg the images are very good), and the result will be much
>> better, almost as good as the images in the IA viewer, which uses the
>> same images.
>>
>
> In my experience, when there are problems, usually the JP2 images are
> either too little compressed or too compressed. This has precise reasons
> and no trivial solution:
> http://www.digitizationguidelines.gov/still-image/documents/JP2LossyCompression.pdf
>
>
> Nemo
>

Re: [Wikisource-l] [pywikibot] pdf library

2016-05-13 Thread Alex Brollo

Simply, from a practical point of view, my suggestion is: don't try to get
a good djvu from the IA pdf; use the _jp2.zip images instead (after
conversion to jpg the images are very good), and the result will be much
better, almost as good as the images in the IA viewer, which uses the same
images.
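
The route suggested above can be sketched as a list of shell commands, built here in Python without running anything. This is only an illustration under stated assumptions: it uses OpenJPEG's opj_decompress to decode each JP2 to PPM (instead of the jpg step mentioned above) and DjVuLibre's c44 and djvm to encode and bundle the pages; the file names are hypothetical, and the resulting djvu carries images only, with no OCR text layer.

```python
from pathlib import PurePosixPath


def djvu_build_commands(jp2_pages, book_name):
    """Return the commands that turn JP2 page scans into one bundled DjVu."""
    commands = []
    page_djvus = []
    for jp2 in jp2_pages:
        stem = PurePosixPath(jp2).stem
        ppm = f"{stem}.ppm"
        djvu = f"{stem}.djvu"
        # Decode the JPEG 2000 page to an uncompressed PPM image.
        commands.append(["opj_decompress", "-i", jp2, "-o", ppm])
        # Encode the PPM as a single-page (IW44) DjVu file.
        commands.append(["c44", ppm, djvu])
        page_djvus.append(djvu)
    # Bundle all single-page files into one multi-page document.
    commands.append(["djvm", "-c", f"{book_name}.djvu", *page_djvus])
    return commands


# Hypothetical page names, as they might come out of a _jp2.zip archive.
plan = djvu_build_commands(["book_0001.jp2", "book_0002.jp2"], "book")
for cmd in plan:
    print(" ".join(cmd))
```

Each command list can be passed to subprocess.run once the tools are installed; building the plan first makes it easy to inspect before processing a whole book.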

Alex



2016-05-13 10:06 GMT+02:00 Federico Leva (Nemo) :

> Alex Brollo, 13/05/2016 09:02:
>
>> I presume that this complex structure is somewhat similar to the
>> background/foreground segmentation in djvu files, and the artifacts are
>> similar.
>>
>
> Sure.
>
>
>> So, pdf images are not only "compressed", but deeply processed and
>> segmented images.
>>
>
> ...which is what I call "compression". I still recommend to try and
> increase the fixed-ppi parameter in such a case of excessive compression.
>
> I also still need an answer to https://it.wikisource.org/?diff=1733473
>
>> Is something of this complex IA image processing path documented
>> anywhere?
>>
>
> What do you mean? Are you asking about details of their derivation plan
> for books? What we know has been summarised over time at
> https://en.wikisource.org/wiki/Help:DjVu_files#The_Internet_Archive , as
> always. As the help page IIRC states, the best way to understand what's
> going on is to check the item history and read the derive.php log, like
> https://catalogd.archive.org/log/487271468 which I linked.
>
> The main difference compared to the past is, I think, that they're no
> longer creating the luratech b/w PDF, probably because the "normal" PDF now
> manages to compress enough. They may have not realised that the single PDF
> they now produce is too compressed for illustrations and for cases where
> the original JP2 is too small.
>
>
> Nemo
>

Re: [Wikisource-l] [pywikibot] pdf library

2016-05-13 Thread Alex Brollo

Nemo, try an "autopsy" of the cited IA pdf with pdfimages (xpdf), which
extracts the raw images embedded in the pdf pages. You'll find that pages
are exotically segmented into a full color background, a strange image, and
an inverted, thresholded image (used as a mask, I presume). Just by
negating the last one, you can get a decent, light BW image of the page. I
could build a decent BW djvu from it:
https://it.wikisource.org/wiki/File:Paolina.djvu , but it.source users
didn't like the idea:
https://it.wikisource.org/wiki/Wikisource:Bar#Pensiero_in_libert.C3.A0_sulle_immagini_delle_pagine
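
The "negating" step mentioned above is a plain pixel inversion. A minimal sketch, assuming the mask has already been extracted and loaded as rows of 8-bit gray values (0 = black, 255 = white); the tiny sample mask is made up for illustration, and a real page would be loaded and saved with an imaging library.

```python
def invert_mask(rows):
    """Negate an 8-bit grayscale image given as a list of pixel rows."""
    return [[255 - value for value in row] for row in rows]


# Made-up 2x3 mask: white text strokes (255) on a black page (0).
mask = [
    [255, 255, 0],
    [0, 255, 255],
]
page = invert_mask(mask)
print(page)
```

After inversion the strokes are black on white, which is what a bitonal page image for a BW djvu is expected to look like.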

I presume that this complex structure is somewhat similar to the
background/foreground segmentation in djvu files, and the artifacts are
similar.

So, pdf images are not only "compressed", but deeply processed and
segmented images.

Anyway: the IA image viewer doesn't use the pdf (nor the djvu) at all, but
uses jpg from jp2 files; so, if you need a djvu similar in detail to what
you see in the IA viewer, you have to download and process the jp2 images to
build a decent djvu file.

Is any of this complex IA image processing path documented anywhere?
I reached my conclusions simply by "try and learn" from IA file "necropsy".

Alex

2016-05-12 20:10 GMT+02:00 Federico Leva (Nemo) :

> Andrea Zanni, 12/05/2016 19:38:
>
>> [1] https://it.wikisource.org/wiki/File:Tarchetti_pdf.png
>> [2]
>>
>> https://commons.wikimedia.org/w/index.php?title=File%3ATarchetti_-_Paolina.pdf&page=4
>> [3] https://it.wikisource.org/wiki/File:Tarchetti_pdf.png
>>
>
> That was meant to be
> https://it.wikisource.org/wiki/File:Tarchetti_alex_djvu.png
>
> I don't think this has anything to do with DjVu or PDF, the problem is
> very clear just by looking at
> https://archive.org/download/digitami_LO10534041 : the JP2 conversion
> compressed the images 30 times, the PDF compression 5 more times.
>
> The first step in such cases, as documented in
> https://en.wikisource.org/wiki/Help:DjVu_files#The_Internet_Archive , is
> to add/increase the fixed-ppi field. I don't understand what was used in
> https://catalogd.archive.org/log/487271468
>
>
> Nemo
>

Re: [Wikisource-l] Wikimedia Conference report

2016-05-03 Thread Alex Brollo

It's impressive to realize that it.source is a tiny community inside a
larger, but nevertheless "tiny community" (the whole wikisource
community)...

Alex brollo

2016-05-02 17:53 GMT+02:00 Andrea Zanni :

> Dear all,
> last week I represented the Wikisource Community user Group at the
> Wikimedia Conference, in Berlin.
>
> As usual, it has been a great opportunity to talk and meet fellow
> wikimedians, especially after the turmoil of the recent months in our
> movement (Lila and the other multiple resignations from WMF; Denny and
> James leaving the Board, etc.).
>
> The atmosphere was quite calm and friendly, and this was a good sign.
> I've spoken with many WMF employees and they all said "we're in a better
> place now". There is hope for the future and they want to do a lot of
> things. All this is positive, IMHO, especially after the things I'll
> explain now.
>
> I've spoken with many people, especially:
> * Danny Horn and Ryan Kaldari of Community Tech team
> * Alex Stinson, "The Wikipedia Library"
> * Asaf Bartov, whom all of you know well :-D
> * Katy Love, Director of Resources
> * Katherine Maher, interim ED (!!)
>
> I'll start from this last one.
> Unfortunately I had my flight scheduled right after the meeting, so I was
> in a rush. But I met with Katherine and other 2 members of other user
> groups: it was the first time (that I know of) that the ED of the WMF took
> the time to speak with such tiny groups, and that says something.
> She was *genuinely* interested in knowing about Wikisource, and I hope our
> conversation will continue in the future.
>
> But a general "interest" in WS was all that I hoped for.
>
> Discussions (separate and collective) with Alex, Danny and Asaf were much
> more fruitful.
>
> * Alex Stinson is really interested in promoting Wikisource in the context
> of GLAM-WIKI, which makes perfect sense and has been done already.
> Wikimedia Italia uses Wikisource a lot in its talks with libraries and
> archives, and we are seeing good results. GLAM-WIKI has a "political"
> priority in the Wikimedia world, so we should harness that.
>
> * Asaf is the "Head of Emerging Communities", and I don't see why we
> should *not* be seen as an emerging community! I don't really know what
> Asaf's daily job is, but if he's the person to speak to for the Wikisource
> community, we are in great hands.
>
> * Probably the best discussion I had was with Danny Horn, from Community
> Tech.
> Danny is one of the guys behind the Community Wishlist Survey.
> He was really astonished (he said it many times) by the response from the
> Wikisource community in that survey, and was the first to understand that,
> if they are bound to count absolute votes for proposals, our tiny
> community's needs will never be addressed. They have that in mind and will
> change things for the following surveys.
>
> Danny spent *a whole hour* with me looking at Wikisource.
> I showed him the Proofread Page extension, the phe stats, the EPUB
> generator. He was really shocked by what we do, how "details-oriented" we
> are (I used less politically-correct adjectives :-), and how much work we
> put into things.
>
> He also saw a bunch of things that could really get some quick help (for
> example, increasing the size of the radio buttons, or changing some icons,
> or stuff like that) which I think would be great.
>
> In the end, he was really excited and wanted to do stuff.  Of course, he
> didn't promise anything because his schedule is full, but there is maybe
> the chance to finally push Wikisource into the agenda of the WMF.
>
> All in all,
> I was pretty happy with how people were interested in the project and I
> think there is concrete room for improvement.
>
> Let me know what you think :-)
>
> Andrea
>
> [1] https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey

Re: [Wikisource-l] Non transcluded Page: from ns:0

2016-04-28 Thread Alex Brollo

Very interesting.

Do you have any suggestion for finding the list of non-transcluded pages? I
can imagine using a bot to fetch the html of the ns0 main page and all its
subpages related to an Index page, then parsing it to get the list of
existing page links; is there any simpler strategy?
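
Whatever way the two lists are collected, the core of the question is a set difference. A minimal sketch with made-up titles: index_pages would be the Page: titles belonging to one Index, and transcluded_by_ns0 a mapping from each ns0 page to the Page: titles it transcludes (the MediaWiki API can report transclusions, for instance through prop=templates, which would avoid parsing html). The function name and the sample data are hypothetical.

```python
def not_transcluded(index_pages, transcluded_by_ns0):
    """Return the Page: titles of an Index that no ns0 page transcludes."""
    transcluded = set()
    for pages in transcluded_by_ns0.values():
        transcluded.update(pages)
    # Keep the original page order of the Index.
    return [title for title in index_pages if title not in transcluded]


# Made-up data for one small Index with four pages.
index_pages = [
    "Page:Example.djvu/1",
    "Page:Example.djvu/2",
    "Page:Example.djvu/3",
    "Page:Example.djvu/4",
]
transcluded_by_ns0 = {
    "Example": ["Page:Example.djvu/1"],
    "Example/Chapter 1": ["Page:Example.djvu/2", "Page:Example.djvu/4"],
}
missing = not_transcluded(index_pages, transcluded_by_ns0)
print(missing)
```

Filtering index_pages further by proofreading status (corrected or validated) would give the kind of list the tool announced in this thread produces.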

Alex


2016-04-28 14:50 GMT+02:00 Andrea Zanni :

> Wow, this is fantastic Phe.
> It's really useful for running the "Match & split" when it's needed.
>
> Andrea
>
> On Thu, Apr 28, 2016 at 2:42 PM, Philippe Elie  wrote:
>
>> Hi,
>>
>> I added a new tool: https://tools.wmflabs.org/phetools/not_transcluded/
>> to provide a list of Indexes containing corrected or validated pages
>> which are not transcluded from main:, see the README.txt.
>>
>> --
>> phe
>>

Re: [Wikisource-l] [pywikibot] pdf library

2016-04-18 Thread Alex Brollo

Can someone "ping" Phe & Tpt into this talk?

Alex

2016-04-18 10:51 GMT+02:00 Andrea Zanni :

> I think that the crucial issue here is: will the ia-upload tool run?
> https://tools.wmflabs.org/ia-upload/commons/init
>
> Aubrey
>
>
> On Fri, Apr 15, 2016 at 8:29 PM, Alex Brollo 
> wrote:
>
>> Again, just to explain: the pdf2djvu output of an IA pdf is a perfect
>> djvu, with its regular OCR mapped layer, so nothing changes but the need
>> to run a very simple command:
>>
>> pdf2djvu namefile.pdf -o namefile.djvu
>>
>> Alex
>>
>>
>>
>>
>>
>> 2016-04-15 10:01 GMT+02:00 Andrea Zanni :
>>
>>> Yes, this is why I cited it: if we can manage to use it for Wikisource
>>> importing, we could be safe :-)
>>>
>>> Aubrey
>>>
>>> On Fri, Apr 15, 2016 at 9:41 AM, Federico Leva (Nemo) <
>>> nemow...@gmail.com> wrote:
>>>
>>>> Andrea Zanni, 15/04/2016 09:03:
>>>>
>>>>> I remember Alex Brollo was working with the djvu_xml layer
>>>>>
>>>>
>>>> The XML output from ABBYY is still being published, AFAIK.
>>>>
>>>>
>>>> Nemo
>>>>

Re: [Wikisource-l] [pywikibot] pdf library

2016-04-15 Thread Alex Brollo

Again, just to explain: the pdf2djvu output of an IA pdf is a perfect djvu,
with its regular OCR mapped layer, so nothing changes but the need to
run a very simple command:

pdf2djvu namefile.pdf -o namefile.djvu
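
For more than one file, the same command can be driven from a short script. A minimal sketch, assuming pdf2djvu is installed and on the PATH; the helper names are hypothetical, and the output name is simply the input name with a .djvu suffix, as in the command above.

```python
import shutil
import subprocess
from pathlib import Path


def pdf2djvu_command(pdf_path):
    """Build the pdf2djvu call for one pdf, writing NAME.djvu next to it."""
    pdf = Path(pdf_path)
    return ["pdf2djvu", str(pdf), "-o", str(pdf.with_suffix(".djvu"))]


def convert(pdf_path):
    """Run the conversion, failing clearly if the tool is missing."""
    if shutil.which("pdf2djvu") is None:
        raise RuntimeError("pdf2djvu is not installed or not on PATH")
    subprocess.run(pdf2djvu_command(pdf_path), check=True)


# Only prints the command; call convert("namefile.pdf") to actually run it.
print(pdf2djvu_command("namefile.pdf"))
```

Looping convert() over the pdf files of a folder is then enough to process a batch of IA downloads.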

Alex





2016-04-15 10:01 GMT+02:00 Andrea Zanni :

> Yes, this is why I cited it: if we can manage to use it for Wikisource
> importing, we could be safe :-)
>
> Aubrey
>
> On Fri, Apr 15, 2016 at 9:41 AM, Federico Leva (Nemo) 
> wrote:
>
>> Andrea Zanni, 15/04/2016 09:03:
>>
>>> I remember Alex Brollo was working with the djvu_xml layer
>>>
>>
>> The XML output from ABBYY is still being published, AFAIK.
>>
>>
>> Nemo
>>

Re: [Wikisource-l] Visual Editor in Beta on Wikisource ns0

2016-04-13 Thread Alex Brollo

I activated it on it.source but nothing happens. Andrea, does it run for
you?

Another question. Any news about user-written edit gadgets? Some time ago I
took a look at a terribly complex "help" page. Wikisource needs lots of
specific scripts to make things faster, mostly in nsPage, but some in
ns0 too.

Alex

2016-04-13 11:04 GMT+02:00 David Cuenca Tudela :

> Fantastic! It took some years, but it seems that it finally arrived :-)
>
> Cheers,
> Micru
>
> On Wed, Apr 13, 2016 at 10:44 AM, Thomas Tanon 
> wrote:
>
>> Strong warning: the widget that allows editing the  tag is very
>> beta-ish and should definitely be improved. So, don't stay on a bad first
>> impression, the user experience could (and should) be better in the future.
>> Thomas
>>
>> > Le 13 avr. 2016 à 10:39, Andrea Zanni  a
>> écrit :
>> >
>> > You probably just saw it, but we have the possibility of trying the VE
>> in the main namespace on Wikisources now :-)
>> >
>> > Here is a guide:
>> >
>> https://www.mediawiki.org/wiki/Help:VisualEditor/The_visual_editor_at_Wikisources_and_Wiktionaries
>> .
>> >
>> > It'd be awesome if every one of you could
>> > * spread the word
>> > * spread the guide
>> > * translate it :-)
>> >
>> > Let me know if there are some issues.
>> >
>> > Aubrey
>
>
> --
> Etiamsi omnes, ego non
>

Re: [Wikisource-l] How to activate header=1 param into pages tag

2016-03-23 Thread Alex Brollo

Problem solved, thanks Zdzislaw for your help!

Alex

2016-03-22 17:23 GMT+01:00 Alex Brollo :

> Yes, you are right: I imported the code long before having even a vague
> idea of what it means. Now I'm beginning to understand that I didn't
> understand anything... a good step forward :-)
>
> I'll follow your links, and I'll ask for help as soon as needed. For now,
> thanks!
>
> Alex
>
>
> 2016-03-22 16:15 GMT+01:00 Zdzislaw :
>
>> hello Alex,
>>
>> On 22 March 2016 at 13:15, Alex Brollo  wrote:
>> > I'm trying to understand how the magic of the header=1 param in the
>> > Proofreadpage pages tag works. I'd like to test it on it.source as an
>> > alternative to our ns0 header templates, but I failed.
>> >
>> > Where can I find a step-by-step, stupid-proof doc?
>>
>> all available documentation is on:
>> https://www.mediawiki.org/wiki/Proofreadhelp#Headers_and_Navigation
>>
>> https://www.mediawiki.org/wiki/Extension:Proofread_Page#Configuration_of_index_namespace
>> unfortunately, it does not include all possibilities and do not take
>> into account the new and the old configuration of index namespace.
>>
>> I glanced at your configurations:
>> https://it.wikisource.org/wiki/MediaWiki:Proofreadpage_header_template
>> https://it.wikisource.org/wiki/Modulo:Header_template
>> and I noticed that you have copied and "try" to use "the French names"
>> of parameters,  whereas to the
>> https://it.wikisource.org/wiki/MediaWiki:Proofreadpage_header_template
>> are passed parameters defined in
>> https://it.wikisource.org/wiki/MediaWiki:Proofreadpage_index_data_config
>> (for it ws there is only "Fonte" - all others are set "header":false
>> parameter...)...
>>
>> it is difficult to describe everything without "talking" - if you need
>> more information, I suggest meeting at #wikisource (my irc nickname is
>> Zdzislaw).
>>
>> Z.
>>

Re: [Wikisource-l] How to activate header=1 param into pages tag

2016-03-22 Thread Alex Brollo

Yes, you are right: I imported the code long before having even a vague idea
of what it means. Now I'm beginning to understand that I didn't understand
anything... a good step forward :-)

I'll follow your links, and I'll ask for help as soon as needed. For now,
thanks!

Alex


2016-03-22 16:15 GMT+01:00 Zdzislaw :

> hello Alex,
>
> On 22 March 2016 at 13:15, Alex Brollo  wrote:
> > I'm trying to understand how the magic of the header=1 param in the
> > Proofreadpage pages tag works. I'd like to test it on it.source as an
> > alternative to our ns0 header templates, but I failed.
> >
> > Where can I find a step-by-step, stupid-proof doc?
>
> all available documentation is on:
> https://www.mediawiki.org/wiki/Proofreadhelp#Headers_and_Navigation
>
> https://www.mediawiki.org/wiki/Extension:Proofread_Page#Configuration_of_index_namespace
> unfortunately, it does not include all possibilities and do not take
> into account the new and the old configuration of index namespace.
>
> I glanced at your configurations:
> https://it.wikisource.org/wiki/MediaWiki:Proofreadpage_header_template
> https://it.wikisource.org/wiki/Modulo:Header_template
> and I noticed that you have copied and "try" to use "the French names"
> of parameters,  whereas to the
> https://it.wikisource.org/wiki/MediaWiki:Proofreadpage_header_template
> are passed parameters defined in
> https://it.wikisource.org/wiki/MediaWiki:Proofreadpage_index_data_config
> (for it ws there is only "Fonte" - all others are set "header":false
> parameter...)...
>
> it is difficult to describe everything without "talking" - if you need
> more information, I suggest meeting at #wikisource (my irc nickname is
> Zdzislaw).
>
> Z.
>

[Wikisource-l] How to activate header=1 param into pages tag

2016-03-22 Thread Alex Brollo

I'm trying to understand how the magic of the header=1 param in the
Proofreadpage pages tag works. I'd like to test it on it.source as an
alternative to our ns0 header templates, but I failed.

Where can I find a step-by-step, stupid-proof doc?

Alex brollo

[Wikisource-l] Edit in view

2016-01-05 Thread Alex Brollo

I'm using a personal jQuery tool on mul.source, named Edit in view, that
allows very fast editing of a nsPage page while staying in view mode, using
mw.Api routines to read and save wikicode. It also has a preview option
using a parse API call.

My question is: can such a procedure be considered safe?
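
One aspect that bears on the question is edit conflicts: when a page is saved through the API, the request can carry the timestamp of the revision that was read, so that the server can detect an intervening edit. A minimal sketch of the parameters of such a save request, written in Python only to show the fields; it is not the actual Edit in view code, the function name and sample values are hypothetical, and the request would still have to be sent as a POST with a valid CSRF token.

```python
def build_edit_request(title, new_text, base_timestamp, start_timestamp,
                       csrf_token, summary=""):
    """Assemble the parameters of a MediaWiki action=edit request."""
    return {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": new_text,
        "summary": summary,
        # Timestamp of the revision the text was read from.
        "basetimestamp": base_timestamp,
        # Time at which the editing session started.
        "starttimestamp": start_timestamp,
        "token": csrf_token,
    }


# Made-up values for illustration only.
request = build_edit_request(
    title="Page:Example.djvu/1",
    new_text="Corrected text of the page.",
    base_timestamp="2016-01-05T10:00:00Z",
    start_timestamp="2016-01-05T10:05:00Z",
    csrf_token="hypothetical-token+\\",
    summary="Edit in view",
)
print(sorted(request))
```

With both timestamps supplied, a save made on top of someone else's newer revision is reported as a conflict instead of silently overwriting it.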

Alex brollo

[Wikisource-l] Layman works about djvu: a self-made editor and more

2015-12-01 Thread Alex Brollo

I'm playing with djvu files; luckily I found a "simple" way to build a GUI,
so I'm using a self-built djvu text editor with some features that
allow many developments (selecting and saving cropped images from page
images; saving text to ws nsPage by pywikibot; aligning the djvu text layer
with ws nsPage text).

Please don't ask me to put the scripts into Git or Github, since I simply
can't do this; it's a shame, but I can't.

Here is a draft doc about what I'm doing:
https://it.wikisource.org/wiki/Utente:Alex_brollo/Djvu_Editor_-_Documentazione
; it's in Italian, but I found that the Chrome translator does a
surprisingly good translation from Italian into English.

Obviously I'll be happy to share the code with any of you; consider that
I'm a DIY "programmer", and that reading my code would therefore be a
terrible pain for a good programmer.

Alex

Re: [Wikisource-l] Vote for Google OCR-Wikisource integration in 2015 community wishlist

2015-12-01 Thread Alex Brollo

... nevertheless I found this piece about "SaaSS" very interesting:
https://www.gnu.org/philosophy/who-does-that-server-really-serve.html

So, building a true, excellent and independent "wikisource multilingual OCR
service" would be a better solution.

Alex

2015-12-01 16:06 GMT+01:00 Bodhisattwa Mandal :

> Hi Nemo,
>
> Thanks for your interest. You can find the list of Google OCR supported
> languages in the following link -
>
> https://support.google.com/drive/answer/176692?hl=en
>
> Regards,
> Bodhisattwa
> Thanks for posting about the topic. Which indic languages are we talking
> about exactly? Are they included in the recent FineReader versions now used
> by Internet Archive?
>
> Nemo
>

Re: [Wikisource-l] Errare humanum est, perseverare diabolicum

2015-12-01 Thread Alex Brollo

@Nahum: What is "Table interface"? Can you please link a page using it,
just to take a look?

Alex

2015-12-01 1:44 GMT+01:00 Nahum Wengrov :

>> Multilingual texts are not a priority (35 people in the conference didn't
>> even mention them, I think).
>
>
> Multilingual texts can be treated on the ws of their main language.
> In case of a translation, that would most likely be the destination
> language. We have examples of English and Russian texts translated into
> Hebrew and placed side-by-side using the Table interface in he.ws. In other
> cases the language site could be decided arbitrarily by the original
> contributor (perhaps according to where he feels more comfortable, either
> for his personal native language or for the specific community happening to
> be there), etc. The current form enables flexibility which would be
> unavailable on a single multilingual site, which is most likely to drive
> possible contributors away. Just my 10 cents, based on my own personal
> experience as a veteran he.ws (and former en.wp and he.wp) active editor.
>
> On Mon, Nov 30, 2015 at 11:56 AM, Andrea Zanni 
> wrote:
>
>> Maybe it's me,
>> but I think that we are missing the real, huge point: *we are not ready
>> for this*.
>>
>> I mean, we, as a community: in the few days in Vienna, we discovered how
>> many problems each Wikisource and community has, and that was the first
>> time we had the chance to meet and talk (at that scale).
>> Yes, being all in the same place would maybe shorten the distance within
>> the international community, but it would be an enormous challenge for the
>> amount of software tweakings (gadgets, css, proofread page, layouts,
>> everything), and it would be a real, literal "babel" of languages.
>> And, remember, without the support of any engineer at the WMF! :-)
>>
>> So, please, keep our feet on the ground. Xanadu was the perfect model for
>> a digital library, and after 50 years is still not real. Our problem, in
>> Wikisource, is that each community has created little, complicated gadgets
>> and templates to do amazing things, but the result is that we are overly
>> complicated.
>> We need to simplify things, be better for our readers and beginners, new
>> editors.
>> Our strength is the community, above everything else.
>> That we have to nurture and care about.
>>
>> As much as I love the idea of a unique, Babelian (Borges style) digital
>> library, it won't happen if before we don't fix much more urgent things.
>> Multilingual texts are not a priority (35 people in the conference didn't
>> even mention them, I think).
>>
>> Aubrey
>>
>>
>>
>> On Mon, Nov 30, 2015 at 10:44 AM, Federico Leva (Nemo) <
>> nemow...@gmail.com> wrote:
>>
>>> Ankry, 29/11/2015 23:22:
>>>
>>>>> What about two multilanguage Wikisources? One for RTL languages,
>>>>> another for LTR languages.
>>>>
>>>> ... and the third for some Asian scripts:
>>>> https://phabricator.wikimedia.org/T60729 ?
>>>>
>>>> And maybe a separate one for French:
>>>> https://phabricator.wikimedia.org/T14752 ?
>>>>
>>>> If you dig deeper then more such issues.
>>>>
>>>
>>> Again, this problem is already solved: content language can be decided
>>> per page. As usual, this is blocked on silly bottlenecks on WMF servers:
>>> https://phabricator.wikimedia.org/T69223
>>>
>>> Nemo
>>>
>>>
>>> ___
>>> Wikisource-l mailing list
>>> Wikisource-l@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikisource-l
>>>

Re: [Wikisource-l] Errare humanum est, perseverare diabolicum

2015-11-30 Thread Alex Brollo

I simply added a link to the ePub generator in the Book template
(Description field) of a djvu file on Commons (I used the first book that
caught my eye on the it.source main page):
https://commons.wikimedia.org/wiki/File:Gioventu_italiana_del_littorio.djvu.
The link works.

A very simple way to turn Commons into a central ePub storage place
(i.e. "a library"), isn't it?

Alex



2015-11-30 12:22 GMT+01:00 Federico Leva (Nemo) :

> billinghurst, 28/11/2015 11:40:
>
>> Please go and write an essay about the matter at
>> https://wikisource.org/  referencing the original argument for the
>> split, and how the reintroduction of a single site would be of value,
>> and how it might be done. In fact how it will be better than now.
>> Otherwise all I see is a doom and gloom worrywart.
>>
>
> +1
> Following some recent discussions on NPOV, content forking AKA separatism
> and so on, I expanded an existing essay on Meta with a summary of the main
> discussions in ~2003 and 2015.
>
> There is a section where I briefly mentioned Wikisource, it would be great
> to link a Wikisource.org essay where to examine all pros and cons of the
> Wikisource split:
> https://meta.wikimedia.org/wiki/Why_creating_new_wikis_is_a_bad_idea#Language_subdomains
>
> Nemo
>
>

Re: [Wikisource-l] Errare humanum est, perseverare diabolicum

2015-11-30 Thread Alex Brollo

I'm testing a feature of djvu files, i.e. the possibility of uploading,
into a shared internal file or into individual pages, *arbitrary text of any
type and length*. Html can be uploaded (with trivial encoding) and
downloaded. It's only play so far; but I think it could be interesting to
explore, since there's the opportunity to invisibly wrap into a djvu page
*the html of wikisource nsPage*, so that it can be extracted, displayed, and
used in a "reader" with basic djvulibre routines (djvused.exe) and some code.

Obviously, there are serious safety issues and a need for sanitization:
"any text" is an alarming statement.

Alex
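
A minimal sketch of the idea above, under stated assumptions: the page html
is base64-encoded (one possible "trivial encoding") and stored as a per-page
metadata entry through the djvused commands set-meta and print-meta. The key
name "wshtml" and every function name here are hypothetical, chosen only for
illustration; this is not the code used in the experiment, and it does
nothing about the sanitization problem mentioned above.

```python
import base64
import subprocess
import tempfile
from typing import Optional

META_KEY = "wshtml"  # hypothetical key name, not part of any djvu convention


def encode_html(html: str) -> str:
    """Base64-encode html so it fits safely inside a double-quoted value."""
    return base64.b64encode(html.encode("utf-8")).decode("ascii")


def decode_html(payload: str) -> str:
    """Inverse of encode_html."""
    return base64.b64decode(payload).decode("utf-8")


def meta_entry(html: str) -> str:
    """Build one metadata line in the `key "value"` form read by set-meta."""
    return f'{META_KEY}\t"{encode_html(html)}"'


def parse_meta(printed: str) -> Optional[str]:
    """Find our key in print-meta style output and decode its value."""
    for line in printed.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == META_KEY:
            return decode_html(parts[1].strip().strip('"'))
    return None


def store_html(djvu_path: str, page: int, html: str) -> None:
    """Write the entry to a temp file and hand it to djvused (needs djvulibre).

    Assumptions: paths contain no spaces, and set-meta replaces whatever
    metadata the selected page already had.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write(meta_entry(html) + "\n")
    script = f"select {page}; set-meta {f.name}"
    subprocess.run(["djvused", djvu_path, "-e", script, "-s"], check=True)


def load_html(djvu_path: str, page: int) -> Optional[str]:
    """Read the page metadata back and decode the stored html, if any."""
    script = f"select {page}; print-meta"
    result = subprocess.run(["djvused", djvu_path, "-e", script],
                            check=True, capture_output=True, text=True)
    return parse_meta(result.stdout)
```

Only store_html and load_html touch a real file; the encoding helpers can be
tried without djvulibre installed. Anything read back this way is untrusted
input and would still need sanitizing before being shown in a reader.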

2015-11-30 2:23 GMT+01:00 Luiz Augusto :

>
>
> On Sun, Nov 29, 2015 at 8:22 PM, Ankry  wrote:
>
>> > What about two multilanguage Wikisources? One for RTL languages, another
>> > for LTR languages.
>>
>> ... and the third for some Asian scripts:
>> https://phabricator.wikimedia.org/T60729 ?
>>
>> And maybe a separate one for French:
>> https://phabricator.wikimedia.org/T14752 ?
>>
>> If you dig deeper then more such issues.
>>
>>
> Those are related to how MediaWiki renders text, so they can be easily
> circumvented in a multilingual wiki by adding a new feature that instructs
> MediaWiki to render based on a given language. If in Page namespace,
> adding to the current
> 
> tag a lang parameter
> 
>
> or, for pages with texts in more than one language (such as quotations), on
> LabeledSectionTransclusion tags, making
> 
> 
> as
> 
> 
>
> (T14752 is not related to ProofreadPage extension, but the language trick
> can be added on this way as a shortcut for some possible new MediaWiki
> parser tags)
>

Re: [Wikisource-l] Errare humanum est, perseverare diabolicum

2015-11-29 Thread Alex Brollo

There's a unique feature of wikisource: anyone can contribute, even someone
who *doesn't know at all the language of the text being edited* (it is
sufficient to recognize the characters of that language). It would be a
little bit painful, but I could proofread a Hungarian text, finding and
fixing some scannos. A small contribution, but a valuable one. By contrast,
I can't contribute at all to any other Hungarian project.
I could also apply some basic formatting to the same, incomprehensible
Hungarian text, but only using standard wiki markup, or css/html, which are
*universal languages*. I could do most of the needed work using shared
templates and scripts, without any knowledge of the Hungarian language.

This uniqueness of wikisource (shared only by images and other media in
Commons) has been underestimated IMHO.

Alex

2015-11-29 14:48 GMT+01:00 Federico Leva (Nemo) :

> Asaf Bartov, 29/11/2015 14:40:
>
>> One significant advantage of per-language Wikisources is that the
>> interface language is appropriate
>>
>
> That's a bug, as well: https://phabricator.wikimedia.org/T58464
>
> I agree it's shameful that WMF doesn't fix the most fundamental bugs which
> make collaboration harder, even when they've been known for a decade AND
> software is available to fix them.
>
> Nemo
>
>

Re: [Wikisource-l] Errare humanum est, perseverare diabolicum

2015-11-28 Thread Alex Brollo

Perhaps some problems come from the double nature of wikisource, which is
both a *typography* and a *library*. I see some advantage in having
language-specific typographies, but I can't see any advantage in having
language-specific libraries; my dream would be a Commons-like architecture,
to share *source texts* just as any project can share *media*.

A bold solution could be to share texts using Commons; I'm just playing
with the idea of uploading the wiki text, or html, of nsPage into djvu page
metadata.

Alex

2015-11-29 2:19 GMT+01:00 billinghurst :

> There is no need for global gadgets, javascripts are able to be pulled
> x-wiki now and are essentially global, and if any community wishes to
> use another's gadgets they can now. If they are not usable then
> request to make them usable. If they want assistance, then ask for it.
>
> I would think that we are looking to argue that we would be looking
> for the x-Wikisource application of Module: ns to allow a one to many
> pull of Module: from that space. Traditionally that has been
> oldwikisource, though one would say that other wikisources have been
> where more development has taken place more recently, so there is
> possibly argument about where, otherwise HOW if they are to be at
> (mul|old)wikisource
>
> I still believe that if this is a rational complaint then someone will
> sit down and write down out the issues on a wiki and we can step
> through them. Plaintive cries to a mailing list just creates noise,
> and little action.  Wistful commentary about how olden times were
> better has never had a success in my simple look at history.
>
> Regards, Billinghurst
>
> On Sun, Nov 29, 2015 at 9:24 AM, Bodhisattwa Mandal
>  wrote:
> > Hi,
> >
> > During the recent Wikisource Conference in Vienna, the need for global
> > gadgets, templates and modules was discussed, and it has already been
> > reported in Phabricator ( https://phabricator.wikimedia.org/T1238 ). So
> > someday, the problem will be solved.
> >
> > To me, it is not at all a good idea to return to a multilingual WS for
> > this reason. The diversity of the language projects makes the Wikimedia
> > movement unique, and that includes Wikisource as well. Every language
> > and script has its own unique problems, which cannot be generalised at
> > all. Besides, if some WS community chooses to return to the multilingual
> > site, I think that's possible, but not every WS community would want or
> > like to do that.
> >
> > Regards,
> > Bodhisattwa
> >
> > Maybe it is "fine" but I am afraid it is only "fine" for the majority
> > (that speaks English or at least one major European language). As an
> > example, note that there is very little discussion in Chinese in the
> > Village pump, although there are a lot of Chinese users there and many
> > of them do not speak English.
> >
> > It is very difficult to operate on Commons for users that speak only
> > Thai, Urdu, Bashkir, Hindi or another not highly populated language.
> >
> > Also there are attempts to discriminate against users who do not speak
> > or do not understand English.
> >
> > IMO, there is a high risk that merging all wikisources would marginalize
> > minorities or people who are not multilingual.
> >
> > The other issue is (I noticed it in plwikisource) that a few users come
> > to wikisource because they feel bad in large wiki communities (plwiki in
> > our case). (I don't know if there are similar cases in other
> > wikisources, but likely.) If we decide to merge projects, they will
> > leave. So the disadvantage here is the risk of losing users, of whom we
> > do not have too many.
> >
> > However, there are also advantages of unification and closer cooperation.
> > The question is: will they predominate?
> >
> > Ankry
> >
> >> As to the communication problems well WD and Commons are doing just
> >> fine, it's no problem really. I am actually not an active contributor to
> >> WS but I always had a feeling that I'd perhaps be one if it was not
> >> split. It's easier to work in big project with all infrastructure ready
> >> and big community to help you, in small on the other hand you have to
> >> face the same 1 or 2 people or the time and personal issues may come in
> >> the way of participation.
> >>
> >> I am not a person to have enough energy to run a major RfC in order to
> >> have the WSs joined (as you can see I even failed to show my points in a
> >> structured way) but if such a person shows up I'd gladly support such an
> >> initiativ

Re: [Wikisource-l] Errare humanum est, perseverare diabolicum

2015-11-28 Thread Alex Brollo

Thanks for your interest.
I work mainly on it.source, but I often try to work a little on other
projects too, for a number of reasons. I find unfamiliar templates, tools and
policies, all of them very interesting; and I also make some effort to
import them into it.source, but it's difficult, since diversity grows daily
and some good ideas are very hard to implement in different contexts.

I'm only an active wikisource user, nothing more; I don't have sufficient
technical or organizing skills to build a project to revert what I see as a
big mistake, and I'm far from sure that my opinion is right; but I feel the
need to share this personal opinion.

About mul.source: my suggestion would be to activate there the best tools
and gadgets, the best templates, the best policies and the best docs; to
remove as soon as possible any trouble for its users; and to encourage users
to upload any multi-language book there.

Alex

2015-11-28 12:54 GMT+01:00 Bodhisattwa Mandal :

> Hi,
>
> I failed to understand how splitting Wikisource projects in different
> languages had been a mistake and how that affected communities badly.
>
> As part of Bengali Wikisource community, I can only say, we are doing well
> and we don't want to return back to old multilingual Wikisource.
>
> Regards,
> Bodhisattwa
> On 28 Nov 2015 16:10, "billinghurst"  wrote:
>
>> I see an argument unsupported by evidence, and without evidence it
>> approaches baseless and without value.
>>
>> Please go and write an essay about the matter at
>> https://wikisource.org/ referencing the original argument for the
>> split, and how the reintroduction of a single site would be of value,
>> and how it might be done. In fact how it will be better than now.
>> Otherwise all I see is a doom and gloom worrywart.
>>
>> Regards, Billinghurst
>>
>> On Sat, Nov 28, 2015 at 2:03 AM, Alex Brollo 
>> wrote:
>> > I'm deeply convinced that splitting wikisource projects into various
>> > languages has been a mistake.
>> >
>> > Is anyone so bold to imagine that it is possible to revert that mistake?
>> >
>> > Or, are we forced to travel along the diabolicum trail?
>> >
>> > Alex

[Wikisource-l] Errare humanum est, perseverare diabolicum

2015-11-27 Thread Alex Brollo

I'm deeply convinced that splitting wikisource projects into various
languages has been a mistake.

Is anyone so bold as to imagine that it is possible to revert that mistake?

Or are we forced to travel along the *diabolicum* trail?

Alex

Re: [Wikisource-l] Technical wishlist

2015-11-10 Thread Alex Brollo

Second wikisource-related idea posted, a *little bit* bold: to emulate the
IA digitizing platform, but improving it with a wiki style. :-)

Alex brollo

2015-11-10 19:18 GMT+01:00 Bodhisattwa Mandal :

> I requested to get a tool so that Google OCR can be integrated with Indic
> language Wikisource projects.
>
> Bodhisattwa
> On 10 Nov 2015 19:38, "Alex Brollo"  wrote:
>
>> I posted something about djvu files.
>>
>> Alex
>>
>> 2015-11-09 17:55 GMT+01:00 Andrea Zanni :
>>
>>> We should participate :-)
>>>
>>> https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-11-04/Op-ed
>>>
>>> Aubrey
>>>

Re: [Wikisource-l] Technical wishlist

2015-11-10 Thread Alex Brollo

I posted something about djvu files.

Alex

2015-11-09 17:55 GMT+01:00 Andrea Zanni :

> We should participate :-)
> https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-11-04/Op-ed
>
> Aubrey
>

Re: [Wikisource-l] Watermark on Google scan.

2015-10-20 Thread Alex Brollo

Cropping scans (briss is great for such work) is very useful for
comfortable proofreading. What should I do if, unluckily, the Google
watermark is cut away by a useful crop?

More: Google can use the hard work of wikisource to enhance the quality of
the text layer of its books, so wikisource is working (for free) for Google.

Last: I never use Google pdf files as they are; I usually use Internet
Archive djvu files, as they are.

Alex



2015-10-20 17:18 GMT+02:00 Trần Nguyễn Minh Huy :

> Hello everybody. I have a question about watermark removal on Wikisource.
> If I remove the copyright claim page and the Google watermark from
> Google-scanned books following this guidance <
> https://en.wikisource.org/wiki/Help:DjVu_files#Removing_a_copyright_page>,
> is that legal or not? According to Google's statement, such pages and
> watermarks must be preserved in Google-scanned books.
>
> This guidance on watermark removal is mainly based on the case Bridgeman
> Art Library v. Corel Corp. (see also <
> https://commons.wikimedia.org/wiki/Commons:When_to_use_the_PD-scan_tag#USA>),
> which (according to Wikipedia <
> https://en.wikipedia.org/wiki/Bridgeman_Art_Library_v._Corel_Corp.#Subsequent_jurisprudence>)
> is controversial: *"We are not convinced that the single case to
> which we are pointed where copyright was awarded for a “slavish copy”
> remains good law." The appeals court ruling cited and followed the United
> States Supreme Court decision in Feist Publications v. Rural Telephone
> Service (1991), explicitly rejecting difficulty of labor or expense as a
> consideration in copyrightability..."*
>
> Suppose that in the future Google sues Wikipedia for removing the page in
> which Google claims copyright from Google-scanned books, and that the case
> is heard by an appeals court or the US Supreme Court (higher than the
> Southern New York court) and the court reverses the ruling; what should we
> do then?
>
> Trần Nguyễn Minh Huy
> Supporter, Wikimedia Projects
>

Re: [Wikisource-l] Large uploads to Wikisource

2015-10-15 Thread Alex Brollo

Perhaps all of you already know this, but I only recently discovered that
pdf2djvu converts a *searchable pdf* into a *searchable djvu* (i.e. it
carries over everything from pdf to djvu, active links and metadata too) and
I'd like to share my "discovery". Conversion is extremely simple. Unluckily,
we use only a little bit of the djvu text data: usually only the whole,
unmapped text, the only exception being the hOCR tool by Phe, which outputs
mapped text.

Alex
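
A minimal sketch of the conversion described above, assuming pdf2djvu and
djvulibre's djvused are installed and on the PATH. The function names and the
file names in the usage note are hypothetical. The sketch also shows the
difference between the unmapped text (print-pure-txt) and the mapped text
layer with coordinates (print-txt) that is mentioned as being mostly unused.

```python
import shutil
import subprocess
from typing import List


def pdf2djvu_args(pdf_path: str, djvu_path: str) -> List[str]:
    """Command line that converts a searchable pdf into a djvu file."""
    return ["pdf2djvu", "-o", djvu_path, pdf_path]


def plain_text_args(djvu_path: str, page: int) -> List[str]:
    """Command line that prints the unmapped text of one page."""
    return ["djvused", djvu_path, "-e", f"select {page}; print-pure-txt"]


def mapped_text_args(djvu_path: str, page: int) -> List[str]:
    """Command line that prints the text layer with zone coordinates."""
    return ["djvused", djvu_path, "-e", f"select {page}; print-txt"]


def run(args: List[str]) -> str:
    """Run one of the commands above and return its standard output."""
    if shutil.which(args[0]) is None:
        raise FileNotFoundError(f"{args[0]} is not installed or not on PATH")
    done = subprocess.run(args, check=True, capture_output=True, text=True)
    return done.stdout
```

For example, run(pdf2djvu_args("book.pdf", "book.djvu")) followed by
run(mapped_text_args("book.djvu", 1)) would return the first page's text as a
nested structure of page, line and word zones, each with its bounding box.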



2015-10-15 13:45 GMT+02:00 billinghurst :

> Also note that User:Dominic was a wikimedian in residence with NARA in
> the States and had a large number of files uploaded, and components of a
> transcription project for those uploads. They have their own template at
> Commons, so you should be able to dig them up.
>
> Regards, Billinghurst
>
> On Thu, Oct 15, 2015 at 9:18 PM Arne Wossink  wrote:
>
>> Hi all,
>>
>> Wikimedia Nederland has recently been approached by several institutions
>> that would like to do uploads of source material. Wikisource would be the
>> preferred platform for this as the material would be searchable (which it
>> wouldn't be if it was only uploaded as pdf to Commons).
>>
>> I would like to know if there have been previous projects involving large
>> uploads by institutions, and if there's any documentation on how to proceed
>> with these.
>>
>> Thanks!
>>
>> Arne Wossink
>>
>> Projectleider / Project Lead Wikimedia Nederland
>>
>> Tel. +31 (0)6 11000505
>>
>> *Postadres*: Postbus 167, 3500 AD Utrecht
>> *Bezoekadres*: Mariaplaats 3, Utrecht

Re: [Wikisource-l] Analysed Layout and Text Object (ALTO)

2015-10-05 Thread Alex Brollo

I apologize for using Italian; my aim was to send a personal reply to Nemo.

Being a personal comment, it doesn't deserve an English translation, so
please ignore it.

Alex

2015-10-05 15:05 GMT+02:00 Alex Brollo :

> Interesting; a confirmation of my old idea that the "heart of
> wikisource" is nsIndice, and that the unit of transcription is the page in
> nsPagina; but it is an isolated opinion, and I have been contradicted by
> those (including wikisourcians of the highest international standing) who
> are convinced that nsIndice and nsPagina are merely "proofreading tools".
>
> Obviously the xml structuring of the content, from the little I have seen,
> recalls (is it the evolution of?) the TEI structure, but living inside
> wikisource I see that the "original sin" of not valuing nsPagina risks
> making things complicated, or impossible, besides having wasted incredible
> energy on "transclusion".
>
> My energy and my enthusiasm are fading
>
> Alex
>
>
> 2015-10-05 13:04 GMT+02:00 Federico Leva (Nemo) :
>
>> I'm finding this document quite useful:
>> http://www.succeed-project.eu/sites/default/files/deliverables/Succeed_600555_WP4_D4.1_RecommendationsOnFormatsAndStandards_v1.1.pdf
>>
>> See description of ALTO pasted below, which is a followup to
>> https://lists.wikimedia.org/pipermail/wikisource-l/2014-September/002081.html
>> . We should find a way to convert the transcribed books' HTML to ALTO
>> format. :)
>>
>> Some libraries are apparently using
>> http://www.primaresearch.org/tools/Aletheia which seems an augmented
>> (but unfree?!) version of ScanTailor with some different purpose.
>>
>> Nemo
>>
>> Principles
>> ALTO stores layout information and OCR recognized text of pages of any
>> kind of printed documents like books, journals and newspapers. ALTO can
>> detail technical metadata for describing the layout and content of
>> physical resources (text, illustrations, graphics).
>> ALTO describes a content page with different views:
>> * The Description section helps to describe some general settings and
>>   information of the ALTO file (measurement units, file name, etc.), and
>>   the production process itself (processing steps, software used, dates
>>   and actors, etc.)
>> * The Layout section contains what's on the page. A page is divided into
>>   several regions (print space; left, right, top and bottom margins). For
>>   each region, all objects are listed which have been detected inside:
>>   text blocks, illustrations, graphical elements, composed blocks. Each
>>   object previously identified is defined by generic attributes: width,
>>   height, text content (for the String element). Besides, the reading
>>   order of all the elements can be managed.
>> * Each ALTO file may also contain a style section where different styles
>>   (for paragraphs and fonts) are listed.
>> Use cases
>> ALTO is one of the most common formats used by libraries for converting
>> text from images. It's used both to deliver digitized contents and to
>> preserve these contents.
>> In a delivery perspective, the ability of ALTO to store the text content
>> coordinates in a page allows the overlay of image and text (multilayer
>> PDF) and highlight search words in a query.
>>

Re: [Wikisource-l] Analysed Layout and Text Object (ALTO)

2015-10-05 Thread Alex Brollo

Interesting; a confirmation of my old idea that the "heart of wikisource"
is nsIndice, and that the unit of transcription is the page in nsPagina; but
it is an isolated opinion, and I have been contradicted by those (including
wikisourcians of the highest international standing) who are convinced that
nsIndice and nsPagina are merely "proofreading tools".

Obviously the xml structuring of the content, from the little I have seen,
recalls (is it the evolution of?) the TEI structure, but living inside
wikisource I see that the "original sin" of not valuing nsPagina risks
making things complicated, or impossible, besides having wasted incredible
energy on "transclusion".

My energy and my enthusiasm are fading

Alex


2015-10-05 13:04 GMT+02:00 Federico Leva (Nemo) :

> I'm finding this document quite useful:
> http://www.succeed-project.eu/sites/default/files/deliverables/Succeed_600555_WP4_D4.1_RecommendationsOnFormatsAndStandards_v1.1.pdf
>
> See description of ALTO pasted below, which is a followup to
> https://lists.wikimedia.org/pipermail/wikisource-l/2014-September/002081.html
> . We should find a way to convert the transcribed books' HTML to ALTO
> format. :)
>
> Some libraries are apparently using
> http://www.primaresearch.org/tools/Aletheia which seems an augmented (but
> unfree?!) version of ScanTailor with some different purpose.
>
> Nemo
>
> Principles
> ALTO stores layout information and OCR recognized text of pages of any
> kind of printed documents like books, journals and newspapers. ALTO can
> detail technical metadata for describing the layout and content of
> physical resources (text, illustrations, graphics).
> ALTO describes a content page with different views:
> * The Description section helps to describe some general settings and
>   information of the ALTO file (measurement units, file name, etc.), and
>   the production process itself (processing steps, software used, dates
>   and actors, etc.)
> * The Layout section contains what's on the page. A page is divided into
>   several regions (print space; left, right, top and bottom margins). For
>   each region, all objects are listed which have been detected inside:
>   text blocks, illustrations, graphical elements, composed blocks. Each
>   object previously identified is defined by generic attributes: width,
>   height, text content (for the String element). Besides, the reading
>   order of all the elements can be managed.
> * Each ALTO file may also contain a style section where different styles
>   (for paragraphs and fonts) are listed.
> Use cases
> ALTO is one of the most common formats used by libraries for converting
> text from images. It's used both to deliver digitized contents and to
> preserve these contents.
> In a delivery perspective, the ability of ALTO to store the text content
> coordinates in a page allows the overlay of image and text (multilayer
> PDF) and highlight search words in a query.
>
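
To make the Layout description above concrete, here is a small sketch that
reads the String elements (text content plus position and size) out of an
ALTO page. The sample document is hand-written and heavily simplified for
illustration, not taken from a real scan, and the function names are
hypothetical; real ALTO files carry more sections and attributes.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hand-written, simplified sample; coordinates are in the declared unit.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Description><MeasurementUnit>pixel</MeasurementUnit></Description>
  <Layout>
    <Page ID="p1" WIDTH="1000" HEIGHT="1500">
      <PrintSpace>
        <TextBlock ID="b1">
          <TextLine>
            <String CONTENT="Errare" HPOS="100" VPOS="200" WIDTH="90" HEIGHT="30"/>
            <String CONTENT="humanum" HPOS="200" VPOS="200" WIDTH="120" HEIGHT="30"/>
            <String CONTENT="est" HPOS="330" VPOS="200" WIDTH="40" HEIGHT="30"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local(tag: str) -> str:
    """Drop the XML namespace so any ALTO schema version is accepted."""
    return tag.rsplit("}", 1)[-1]


def words(alto_xml: str) -> List[Tuple[str, float, float, float, float]]:
    """Return (content, hpos, vpos, width, height) for every String."""
    root = ET.fromstring(alto_xml)
    found = []
    for el in root.iter():
        if local(el.tag) == "String":
            found.append((el.get("CONTENT"),
                          float(el.get("HPOS")), float(el.get("VPOS")),
                          float(el.get("WIDTH")), float(el.get("HEIGHT"))))
    return found


def lines(alto_xml: str) -> List[str]:
    """Rebuild each TextLine as plain text, in document order."""
    root = ET.fromstring(alto_xml)
    return [" ".join(s.get("CONTENT") for s in line if local(s.tag) == "String")
            for line in root.iter() if local(line.tag) == "TextLine"]
```

The per-word coordinates are what makes the image and text overlay and the
search-word highlighting described in the use cases possible.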

Re: [Wikisource-l] Tech issues

2015-09-11 Thread Alex Brollo

I didn't find anything similar. Do you see the same issue with the same
PC on different connections, or with a different PC on the same
connection? What happens if you log out?

Alex

2015-09-11 15:19 GMT+02:00 Federico Leva (Nemo) :

> What error do you see? Something like "connection refused"? How does your
> computer connect to the internet? Do things improve if you delete your
> cookies for the domains which don't work?
>
> Do you manage to perform some basic networking diagnostic e.g. with
> http://winmtr.net/ ?
>
> Nemo
>

Re: [Wikisource-l] GLAM/Wikisource Project bears fruit in Québec!

2015-09-01 Thread Alex Brollo

Great news! French wikisource is really a marvellous project, and such news
is a deserved reward for developers and contributors.

Alex

2015-08-31 17:56 GMT+02:00 Ernest Boucher :

> Projet Québec/Canada, a french GLAM project presented on the French
> Wikisource in collaboration with WMCA and BAnQ (Bibliothèque et Archives
> nationales du Québec) bears fruit.
>
> With the enthusiastic collaboration of the Collection nationale (National
> Collection), in the last year, over 35 historic documents and books were
> rendered in digital format on Wikisource for the public and many more are
> on their way to be completed.
>
> Seeing the potential of this great collaboration between contributors and
> BAnQ's mission to preserve and disseminate our published heritage, BAnQ
> decided to link Wikisource directly from their official website's main page!
>
> This new window on the national archives of Québec website will bring, I'm
> sure, great visibility for Wikisource and many interested people who will
> want to contribute!
>
> Ernest Boucher, Wikimédia Canada (ca.wikimedia.org)
> Wikisource: Projet Québec Canada (jecontribue.ca)
>
> ___
> Wikisource-l mailing list
> Wikisource-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikisource-l
>
>
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l

Re: [Wikisource-l] Badges for Wikisource [[phabricator:T97014]]

2015-08-21 Thread Alex Brollo

The link on the Index page should show that the Index page has been used as
a property of the Edition item in wikidata; just to avoid having to follow
the link to the ns0 page, and then its wikidata link, to see whether the
Index page property has been added.

Alex

2015-08-21 0:16 GMT+02:00 billinghurst :

> ???
>
> Are you asking whether we can put a WD link on that index page to the
> item, yes, though it wouldn't be automated. Not certain that I see much
> value to such a link.
>
> On Thu, 20 Aug 2015 21:29 Alex Brollo  wrote:
>
>> Very interesting; so Index page/Index pages is/are properties of an
>> edition item. One question: is it possible to somehow show the link to the
>> related edition item on the wikisource Index page, e.g. on
>> https://en.wikisource.org/wiki/Index:The_Garden_of_Romance_-_1897.djvu ?
>>
>> Alex
>>
>> 2015-08-20 12:26 GMT+02:00 billinghurst :
>>
>>> I went and added a badge to   https://www.wikidata.org/wiki/Q20850489
>>>
>>> Regards, Billinghurst
>>>
>>> On Thu, Aug 20, 2015 at 8:18 PM billinghurst 
>>> wrote:
>>>
>>>> Following work by WMDE at Wikidata, there will now be the ability to
>>>> tag/badge works at Wikidata as to their proofread status.  The badges have
>>>> been kept simple, and should reflect the colour that is used for a work's
>>>> page status.
>>>>
>>>> I recommended an implementation of three
>>>> * incomplete (be it not proofread, or problematic)
>>>> * proofread once
>>>> * proofread twice = validated
>>>>
>>>> This will mean that we should be looking to an agreed approach to its
>>>> use. I suggest that this have some onwiki work, with a discussion to follow
>>>> at Wikisource Wien Workabout (WWW)
>>>>
>>>> Regards, Billinghurst
>>>>
>>>> The detail ...
>>>>
>>>> Add default badges for Wikisource
>>>>
>>>> With site css, each Wikisource community can override
>>>> these defaults.
>>>>
>>>> The defaults are coloured circles. To start with, these match
>>>> what we have for Wikibase Repo (e.g. Wikidata), except not (yet?)
>>>> have one for 'not proofread'.
>>>>
>>>> * problematic / incomplete
>>>> * proofread
>>>> * validated
>>>>
>>>> Bug: T97014
>>>> Change-Id: I5ad5270866a0a5b8fbc7a19b194f800c2b0e9a8a
>>>> ---
>>>> A resources/images/badge-notproofread.png
>>>> A resources/images/badge-notproofread.svg
>>>> A resources/images/badge-problematic.png
>>>> A resources/images/badge-problematic.svg
>>>> A resources/images/badge-proofread.png
>>>> A resources/images/badge-proofread.svg
>>>> A resources/images/badge-validated.png
>>>> A resources/images/badge-validated.svg
>>>> M resources/skins/cologneblue/wikimedia-badges.css
>>>> M resources/skins/modern/wikimedia-badges.css
>>>> M resources/skins/monobook/wikimedia-badges.css
>>>> M resources/skins/vector/wikimedia-badges.css
>>>> 12 files changed, 86 insertions(+), 1 deletion(-)
>>>>
>>>>
>>>> To view the specific changes, visit https://gerrit.wikimedia.org/r/229199
>>>>
>>>>
>>>>

Re: [Wikisource-l] Badges for Wikisource [[phabricator:T97014]]

2015-08-20 Thread Alex Brollo

Very interesting; so the Index page (or Index pages) is a property of an
edition item. One question: is it possible to show the link to the related
edition item on the Wikisource Index page, e.g. on
https://en.wikisource.org/wiki/Index:The_Garden_of_Romance_-_1897.djvu ?

Alex

2015-08-20 12:26 GMT+02:00 billinghurst :

> I went and added a badge to   https://www.wikidata.org/wiki/Q20850489
>
> Regards, Billinghurst
>
> On Thu, Aug 20, 2015 at 8:18 PM billinghurst 
> wrote:
>
>> Following work by WMDE at Wikidata, there will now be the ability to
>> tag/badge works at Wikidata as to their proofread status.  The badges have
>> been kept simple, and should reflect the colour that is used for a work's
>> page status.
>>
>> I recommended an implementation of three
>> * incomplete (be it not proofread, or problematic)
>> * proofread once
>> * proofread twice = validated
>>
>> This will mean that we should be looking to an agreed approach to its
>> use. I suggest that this have some onwiki work, with a discussion to follow
>> at Wikisource Wien Workabout (WWW)
>>
>> Regards, Billinghurst
>>
>> The detail ...
>>
>> Add default badges for Wikisource
>>
>> With site css, each Wikisource community can override
>> these defaults.
>>
>> The defaults are coloured circles. To start with, these match
>> what we have for Wikibase Repo (e.g. Wikidata), except not (yet?)
>> have one for 'not proofread'.
>>
>> * problematic / incomplete
>> * proofread
>> * validated
>>
>> Bug: T97014
>> Change-Id: I5ad5270866a0a5b8fbc7a19b194f800c2b0e9a8a
>> ---
>> A resources/images/badge-notproofread.png
>> A resources/images/badge-notproofread.svg
>> A resources/images/badge-problematic.png
>> A resources/images/badge-problematic.svg
>> A resources/images/badge-proofread.png
>> A resources/images/badge-proofread.svg
>> A resources/images/badge-validated.png
>> A resources/images/badge-validated.svg
>> M resources/skins/cologneblue/wikimedia-badges.css
>> M resources/skins/modern/wikimedia-badges.css
>> M resources/skins/monobook/wikimedia-badges.css
>> M resources/skins/vector/wikimedia-badges.css
>> 12 files changed, 86 insertions(+), 1 deletion(-)
>>
>>
>> To view the specific changes, visit https://gerrit.wikimedia.org/r/229199
>>
>>
>>

[Wikisource-l] Css and resourceLoader

2015-08-20 Thread Alex Brollo

While editing my common.css to test the style of the p tag, I need to activate:
.tiInherit p {
   text-indent:inherit;
}

I see that my style is loaded, but it is overridden by another, more general
CSS rule. I suspect that this comes from ResourceLoader and the order in which
CSS files are loaded. Am I right? If so, how can the issue be avoided?

Alex

Re: [Wikisource-l] Better way to validate pages

2015-08-18 Thread Alex Brollo

2015-08-17 19:12 GMT+02:00 Andrea Zanni :


> *Wikisource is still too complicated*, and this is one of the reasons we
> don't have big communities.
>
>
IMHO what is really complicated is the last step of digitization (OCR
review + formatting); it's almost impossible to simplify what is
intrinsically complex. Do you know the Distributed Proofreaders approach of
splitting such intrinsic complexity into many steps?

Nevertheless, there's a wide range of complexity - some texts being very
simple (e.g. novels), others being extremely difficult (ancient books,
theatre, scientific textbooks); perhaps the degree of complexity could be
evaluated and explicitly stated, both by automatic scripts (page length +
no. of templates + no. of non-ASCII Unicode characters) and by expert
users.
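
As a rough illustration of the automatic part, here is a minimal sketch in
JavaScript. The three measures are the ones listed above; the function name
estimateComplexity and the weights are only assumptions for the example, not
an existing script, and any real threshold would need tuning on actual pages.

```javascript
// Hypothetical helper: estimates how hard a page's wikitext is to proofread.
// The measures come from the message above; the weights are arbitrary
// assumptions chosen only to make the example concrete.
function estimateComplexity(wikitext) {
  // Page length in UTF-16 code units.
  const length = wikitext.length;
  // Count "{{" openings as a rough proxy for the number of templates.
  const templates = (wikitext.match(/\{\{/g) || []).length;
  // Count code units outside the 7-bit ASCII range.
  const nonAscii = (wikitext.match(/[^\x00-\x7F]/g) || []).length;
  // Illustrative weighting: long pages, templates and non-ASCII text
  // all push the score up.
  const score = length / 1000 + templates * 2 + nonAscii * 0.5;
  return { length, templates, nonAscii, score };
}

// Example: one template and one accented character.
const sample = estimateComplexity("{{c|T\u00edtulo}} text");
console.log(sample.templates, sample.nonAscii);
```

An expert user could still override the computed value by hand, as suggested
above.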

The BGB (Big Green Button) should be used on very simple texts only.

Alex

Re: [Wikisource-l] Better way to validate pages

2015-08-17 Thread Alex Brollo

IMHO, even if I'm testing the BGB as a personal script, I'm not satisfied
by it, since - ironically - I don't fully agree with Andrea: I think that a
good look at the wiki code is mandatory; I want to see if the transclusion
codes are OK, I want to see templates and their use, and so on. Inexperienced
but interested users need to look at code to learn by example. Often
experienced users need to as well (but they are aware of such a need).

It would be great IMHO if the raw code of the page were loaded by
default into some system variable in view mode too, so that it could be
reviewed immediately with a click. It is a really simple job to do in
JavaScript, but I think that the wiki code should be loaded by default or by
an extension. I think that the server and browser load would be very low.
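
A minimal sketch of the JavaScript side, assuming the raw code is fetched
through MediaWiki's action=raw. The names rawCodeUrl and loadRawCode and the
scriptUrl parameter are made up for the example; this is not the personal
script mentioned above.

```javascript
// Builds the URL that returns a page's raw wikitext via action=raw.
// `scriptUrl` is an assumed parameter, e.g. "https://en.wikisource.org/w/index.php".
function rawCodeUrl(scriptUrl, pageName) {
  const params = new URLSearchParams({ title: pageName, action: 'raw' });
  return scriptUrl + '?' + params.toString();
}

// Hypothetical loader: fetches the wikitext so that it can be kept in a
// variable and shown on request. `fetchFn` is injectable for testing and
// defaults to the global fetch.
async function loadRawCode(scriptUrl, pageName, fetchFn = fetch) {
  const response = await fetchFn(rawCodeUrl(scriptUrl, pageName));
  if (!response.ok) {
    throw new Error('Could not load raw code: HTTP ' + response.status);
  }
  return response.text();
}
```

In view mode a gadget could call loadRawCode once per page and keep the
result until the user clicks to review it.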

Alex

2015-08-17 15:07 GMT+02:00 Erasmo Barresi :

> Hum... why should these "button validations" count less, so that four or
> five of them are needed to change the page status? Certainly not because
> "the code is not being checked", since the code stays unchecked no matter
> how many "button validations" are done.
> Possibly it would be better if the button(s) opened a flyout telling users
> what to do: create an account if they do not have one yet, then click edit,
> [correct what's wrong,] change the page status and save. I think it is
> better that new users begin to take part in the main editing workflow
> rather than operating on a separate one that is designed for them.
>
> Whether to make the _next_ page appear after saving is entirely another
> question, and one to which I would answer "yes". This cannot be done for
> the very last page of an index, of course.
>
> Erasmo
>
> > Date: Fri, 14 Aug 2015 15:46:31 +0200
> > From: Andrea Zanni 
> > To: "discussion list for Wikisource, the free library"
> > 
> > Subject: Re: [Wikisource-l] Better way to validate pages
> > Message-ID:
> > 
> > Content-Type: text/plain; charset="utf-8"
> >
> > On Fri, Aug 14, 2015 at 2:06 PM, zdzislaw wrote:
> >
> > > In the view mode of the yellow Pages (sic! :-)), we can add the "Thin (but
> > > long) Green Button" (TGB) described: "I read and carefully compared the
> > > contents with the scan - there's no mistakes." :) Users who "DO read our
> > > books" (and they do not want / do not have time / skills... to edit) click
> > > on this button and simply go to the view mode of the next page. Such a
> > > click would be counted (extra field in the mw database), but did not cause
> > > an immediate change of the Page status. If for a given page will be counted
> > > three??, four?? such clicks (this amount would have to have the ability to
> > > configure for each WS - community could determine their "quality threshold"
> > > - for "one click" it will became into BGB), then the Page status would
> > > change automatically from "yellow" to "green". Of course, it would be also
> > > configurable, to whom show TGB (ip, registered, autopotrolled ...).
> > > Such a solution would have be implemented directly in the proofread
> > > extension.
> > > "TGB" would allow adjustment of the level of "quality" and would be
> > > acceptable by most the community. If it is true that " a lot of users DO
> > > read our books," even for 5-4 "clicks" the status would change quickly.
> > >
> > >
> > I do like this approach, and I'd love to see some tests.
> > I really believe that is good to do tests and experiments, as we are
> > sometimes convinced by things that are not really proven.
> >
> > A 3 step validation passage as you suggest could maybe be easy enough for
> > new users and casual readers, and we could gain some validations we could
> > not have had otherwise.
> >
> >
> > I also would like to repeat my question about the Visual Editor: are we
> > close tho that or nobody is working on it?
> >
> > Aubrey
>

Re: [Wikisource-l] Better way to validate pages

2015-08-11 Thread Alex Brollo

Please don't presume that such a controversial tool has been implemented
anywhere. "Running" only means that the code can run; at present only
*one* user (Aubrey) can click it, just to test it.

Alex

2015-08-12 2:24 GMT+02:00 Wiera Lee :

> Luiz Augusto:  "Rough but runing code of BGB is ready".
>
> This is not a discussion. They had decided.
>
>
> We can change nothing. Well... Why go to Vienna?
> Wieralee
>
> 2015-08-12 1:26 GMT+02:00 Luiz Augusto :
>
>> ("Didn't read the entire thread; too long" warning)
>>
>> I must agree with PL folks: the BGB isn't an improvement. Probably the
>> OCR quality is great on English, Italian and French for doing such thing,
>> but it certainly isn't also for Portuguese (PT).
>>
>> A good improvement will be if a Yellow Big Button would be implemented.
>> Maybe you don't find it useful, as many pages are reviewed on creation, but
>> it is because we, experienced users, do it in this way.
>>
>> Simply putting an Index page or an external link to get the digitization
>> is the worst thing we currently do.
>>
>> Why not our bots starts extracting all and every pages, to make Page
>> namespace working in similar way that Google Book Search works (you can
>> choose if you need to browse on image view or OCR view on that platform).
>> If a random Internet user goes to Wikisource after doing a Web search due
>> to the correct recognized portion of text (as he go to GBS), he can start
>> immediately to fix the OCR and, voila! A new user just discovered an
>> ancient text and a promising website that collects ancient texts!
>>
>> This approach makes sense on attracting new user and presenting how to
>> work on Wikisource, and not downgrading our compromise to flag pages fully
>> reviewed.
>>
>> Side note: Portuguese language still is "unstable" on orthography and how
>> to spell words. From time to time we change our conventions (Brazil and
>> Portugal are yet implementing the Acordo Ortográfico de 1990 and some are
>> arguing on a new one change). PD-old digitizations came in A VERY OLD
>> ORTHOGRAPHY CONVENTION. Creating the Big Green Button will make us unable
>> to do a last check if the wikitext follows the way that words are on
>> digitization or in the current way of writing. So, it isn't an improvement,
>> only a trouble finding.
>>
>> [[User:555]]
>> Em 11/08/2015 7:09 PM, "Alex Brollo"  escreveu:
>>
>>> Rough but running code of BGB is ready, and Andrea can test it to find
>>> bugs and/or drawbacks by now, if he likes.
>>>
>>> To lower the risk of a nonsense-click, BGB should pop out after some
>>> reasonable delay - something less than the time needed to carefully compare
>>> the page  text and its image. To make simpler to monitor its use, a
>>> standard message could be added to edit, so that BGB edits could be fastly
>>> selected in RecentChanges.
>>>
>>> Alex
>>>
>>>
>>>
>>>
>>> 2015-08-11 21:21 GMT+02:00 Nicolas VIGNERON 
>>> :
>>>
>>>> 2015-08-11 20:39 GMT+02:00 Wiera Lee :
>>>> >
>>>> > On pl.wikisource each correction level means that another person did
>>>> the correction again. The green status means the page was corrected three
>>>> times by three another persons.
>>>>
>>>> The colours are just for marking the status page, it's not per se a
>>>> correction and only two people are actually needed ; but yes, it's the more
>>>> or less the same on each wikisource with the proofred system.
>>>>
>>>> > Corrected, not read.
>>>>
>>>> Uh? Correcting without reading?
>>>>
>>>> > In my opinion Big Green Button Correction is useless. New users can
>>>> click only for stats, not for proofreading. And nobody would check it
>>>> again, because the book would be finished.
>>>>
>>>> Please dont bite the new users or imagine that they're all evil. Maybe
>>>> you had a bad experience on plwiki but that's not always true.
>>>>
>>>> Think about it: When you were new users, did you edit only for stats?
>>>>
>>>> I check *a lot* the green pages since *sometimes* there is still little
>>>> correction to do (a new and better templates, some strange typo like «
>>>> word » - with invisible hyphen - or « wоrd » - with a cyrillic о - instead
>>>> of « wo

Re: [Wikisource-l] Better way to validate pages

2015-08-11 Thread Alex Brollo

Rough but running code for the BGB is ready, and Andrea can now test it to
find bugs and/or drawbacks, if he likes.

To lower the risk of a thoughtless click, the BGB should appear after some
reasonable delay - something less than the time needed to carefully compare
the page text with its image. To make its use simpler to monitor, a
standard message could be added to the edit, so that BGB edits can be quickly
selected in RecentChanges.

Alex




2015-08-11 21:21 GMT+02:00 Nicolas VIGNERON :

> 2015-08-11 20:39 GMT+02:00 Wiera Lee :
> >
> > On pl.wikisource each correction level means that another person did the
> correction again. The green status means the page was corrected three times
> by three another persons.
>
> The colours are just for marking the status page, it's not per se a
> correction and only two people are actually needed ; but yes, it's the more
> or less the same on each wikisource with the proofred system.
>
> > Corrected, not read.
>
> Uh? Correcting without reading?
>
> > In my opinion Big Green Button Correction is useless. New users can
> click only for stats, not for proofreading. And nobody would check it
> again, because the book would be finished.
>
> Please dont bite the new users or imagine that they're all evil. Maybe you
> had a bad experience on plwiki but that's not always true.
>
> Think about it: When you were new users, did you edit only for stats?
>
> I check *a lot* the green pages since *sometimes* there is still little
> correction to do (a new and better templates, some strange typo like «
> word » - with invisible hyphen - or « wоrd » - with a cyrillic о - instead
> of « word », ).
>
> > We are asking new users to validate the pages for the second time (from
> red to yellow level): new users can learn how the templates and raw codes
> are working, but when they do something wrong, an experienced user would
> check it one more time -- to make it green. If they would not edit the
> page, they would never know how the templates works. So they would not
> become a better editors...
>
> Can't they do both?
>
> And should we really make the life of users (new and old) hard when it's
> not needed ?
>
> > We all can do only red pages, why not. We'll get a "perfectly readable
> and functional book" with some errors. But should we give its the same
> status as a proof-read three times book? Green status means "almost
> perfect". We shouldn't make green pages automatically, only to make our
> stats better.
>
> No, only red pages is not "perfectly readable and functional book.
>
> How many is « almost » perfect? 80%? 90% 95%? 99%? that's a tricky
> question.
> And if a book made of 500 yellow pages already at 99% perfect, isn't the
> BGB usefull?
>
> > Correction without correction is not a good idea. It's a lie.
>
> Very true but the BGB is not about correction, it's about marking as
> correct something that already is.
>
> Cdlt, ~nicolas
>

Re: [Wikisource-l] Better way to validate pages

2015-08-11 Thread Alex Brollo

While suggesting how Andrea's ideas could be implemented (in the
meantime, I wrote a few lines of JS to quietly load localStorage.rawCode,
localStorage.pageUser, localStorage.pageLevel, and localStorage.validable
too when reading any page in view mode), I was perfectly aware of what
such a tool could cause.

But... is there really so deep a difference between the validation of a page
by a newbie in Edit mode, and the validation by the same user clicking the Big
Green Button? For sure, it's much simpler and more comfortable to review a
text in view mode: isn't that the idea of VisualEditor?

Alex



2015-08-11 12:28 GMT+02:00 Nicolas VIGNERON :

> I'm not sure we're all talking about the same thing.
>
> First, this tool is just a tool. If someone is misusing a tool, don't
> blame the tool, blame (and block) the user of the tool !
>
> Then it seems that the quality level does not have the same meaning on every
> Wikisource. Typos such as « rn » instead of « m » are usually removed at the
> red or yellow step on fr.ws (and such obvious errors can be seen before
> editing; reviewing the final rendered page seems enough to me).
> When I think of raw code review at the yellow-to-green step, I think
> of formatting and things like HTML code replaced by WS templates, Unicode
> encoding mistakes, and little things like that; for me all typos should be
> gone at the previous stage (and personally, I don't go from red to yellow
> if there are still such typos).
>
> The GGB is a tool (and just an idea of a tool right now) and one of many
> solutions to one of the many problems Andrea pointed out; but there are many
> other problems. In particular, the navigation arrows could use some
> improvement. « validate this and go to next page » is definitely something
> we need. Since the VisualEditor is coming, we would be dumb not to seize this
> opportunity to do some clean-up and renovation.
>
> We should also think about another category of tools: global detection of
> possible mistakes. On frws, there are some little things like
> https://fr.wikisource.org/wiki/MediaWiki:Gadget-Erreurs-communes.js
> (internal gadget) and https://tools.wmflabs.org/dicompte/index.php (external)
> but here too there is huge room for improvement. Proofreading page by page
> is great and necessary but we should multiply the approaches to reach the
> best quality.
>
> We're speaking of new users but such tools (the GGB and many others)
> can be useful for old users too. Maybe we can test them with some old users
> first, see how it goes and then offer them (or not) to new users.
> Finally, new users are not all the same. The director of Rennes Library is
> a new user on frws but she's definitely better at proofreading than most
> wikisorcerers ;)
>
> Cdlt, ~nicolas
>

Re: [Wikisource-l] Better way to validate pages

2015-08-10 Thread Alex Brollo

OK; imagine that while a level 3 page is opening, an AJAX query quietly loads
the raw code of the page; as soon as you click the "Big Green Button", the
script could edit the code and send it to the server - in milliseconds - and
could immediately click the next-page button.

If a review of the page in view mode is all that is needed to validate it,
there's no reason to enter edit mode when there's nothing to fix.
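
The "edit the code" step could be as small as the sketch below. It assumes
that the page header stores the status as a tag of the form
<pagequality level="3" user="..." />, and that a page should not be validated
by the same user who proofread it; markValidated is a hypothetical name, and
saving the result to the server is left out.

```javascript
// Hypothetical helper: turns a proofread (level 3) page into a validated
// (level 4) one by rewriting only the <pagequality> tag in the raw code.
// Returns null when the page is not at level 3, or when the would-be
// validator is the user who proofread it (assumed rule).
function markValidated(rawCode, validator) {
  const tag = /<pagequality level="3" user="([^"]*)" \/>/;
  const match = tag.exec(rawCode);
  if (!match) return null;              // not a level 3 page
  if (match[1] === validator) return null; // same user cannot validate
  // A replacer function avoids special "$" handling in the user name.
  return rawCode.replace(
    tag,
    () => `<pagequality level="4" user="${validator}" />`
  );
}

const before =
  '<noinclude><pagequality level="3" user="Alice" /></noinclude>Text';
console.log(markValidated(before, 'Bob'));
```

The body of the page is left untouched, which matches the case where there is
nothing to fix.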

Alex

2015-08-10 18:14 GMT+02:00 Andrea Zanni :

> The Big Validate Button is a good idea,
> but I also would like a better navigation experience, as it is pretty slow
> and cumbersome to got on the top of the page to click a tiny arrow, wait
> for the new page, click edit, etc.
>
> Aubrey
>
>
> On Mon, Aug 10, 2015 at 4:29 PM, Alex Brollo 
> wrote:
>
>> If this is true, then to add a big button "Validate" to edit by ajax the
>> code of the page (the header section only needs to be changed if there's no
>> error to fix into the txt) should be a banal task for a good programmer.
>>
>> Perhaps Andrea is asking for much more, but this could be a first step.
>>
>> Alex
>>
>>
>>
>> 2015-08-10 14:47 GMT+01:00 Nicolas VIGNERON :
>>
>>> 2015-08-10 15:37 GMT+02:00 Alex Brollo :
>>> >
>>> > First point is:
>>> > is it a safe practice to validate a page without reviewing its raw
>>> code?
>>>
>>> Probably yes.
>>> Obviously, it's safer to check the raw code but it's unrealistic to
>>> expect the raw code to be review for all page. Anyway, the pages doesn't
>>> contain a lot of code (and most pages does'nt contain code at all), so it
>>> doesn't seems to be crucial to me.
>>> Plus : when VisualEditor will be on WS, less and less people will
>>> actually see the raw wikicode.
>>>
>>> > A second point: is it a safe practice to validate a page without
>>> carefully reviewing its transclusion into ns0?
>>>
>>> Definitively yes.
>>> When can a transclusion can go wrong? In all cases I can think of, the
>>> problem come from templates, css classes or general stuff like that. It
>>> should be fixed generally and it shouldn't block the page validation since
>>> it have nothing to do the the page itself (but maybe I'm missing an obvious
>>> example here).
>>>
>>> > Alex
>>>
>>> Cdlt, ~nicolas
>>>

Re: [Wikisource-l] Better way to validate pages

2015-08-10 Thread Alex Brollo

If this is true, then adding a big "Validate" button that edits the page code
via AJAX (only the header section needs to be changed if there's no
error to fix in the text) should be a trivial task for a good programmer.

Perhaps Andrea is asking for much more, but this could be a first step.

Alex



2015-08-10 14:47 GMT+01:00 Nicolas VIGNERON :

> 2015-08-10 15:37 GMT+02:00 Alex Brollo :
> >
> > First point is:
> > is it a safe practice to validate a page without reviewing its raw code?
>
> Probably yes.
> Obviously, it's safer to check the raw code but it's unrealistic to expect
> the raw code to be reviewed for every page. Anyway, the pages don't contain a
> lot of code (and most pages don't contain code at all), so it doesn't
> seem crucial to me.
> Plus: when VisualEditor is on WS, fewer and fewer people will actually
> see the raw wikicode.
>
> > A second point: is it a safe practice to validate a page without
> > carefully reviewing its transclusion into ns0?
>
> Definitely yes.
> When can a transclusion go wrong? In all cases I can think of, the
> problem comes from templates, CSS classes or general stuff like that. It
> should be fixed generally and it shouldn't block the page validation since
> it has nothing to do with the page itself (but maybe I'm missing an obvious
> example here).
>
> > Alex
>
> Cdlt, ~nicolas
>
