You need to be cautious when talking about "PDF" documents, as what matters
is not the document presentation format but the source of the text. So I
prefer to describe the source as either digitally prepared (not requiring
validation, though it may require formatting) or OCR'd (requiring
validation, and probably formatting).

If you are talking about how we represent digitally prepared text within
the validation process, I would have no issue with the text being ripped
and a bot taking it straight to level 4 (green), and then redefining green
to mean either validated, or digitally prepared text not requiring
validation.

At the same time, if someone proposes and generates a fifth colour to
represent digitally prepared text not requiring proofreading, then I would
be happy with that. It may make someone happier as a truer representation,
but to me it is a moot point. In the end, each of those is a local
community decision, though one that should be made in consideration of how
the other wikis interpret their processes.

Regards, Billinghurst


On Tue, 11 Jun 2013 15:12:41 -0400, David Cuenca <dacu...@gmail.com>
wrote:
> @Billinghurst, I think Aubrey was referring mainly to pdf files, which
> sometimes have text and format but they are not that easy to represent in
> Wikisource. The main problem is that our current workflow always assumes
> that we are going to proofread a text and have it stored as a web page.
> 
> @others: for me it doesn't matter much if the representation of the
> metadata is done by a template, an index page, or something different
> (maybe related to the new Extension:BookManager?)
> However, I think that from the user's point of view it is better to have
> a consistent system that can handle:
> 1) representation of book/source metadata
> 2) access to export/visualization options
> 
> I'm preparing a document with some ideas that we can discuss here.
> 
> Micru
> 
> On Tue, Jun 11, 2013 at 7:48 AM, billinghurst
> <billinghu...@gmail.com> wrote:
> 
>> On Tue, 11 Jun 2013 12:16:54 +0530, "Aarti K. Dwivedi"
>> <ellydwivedi2...@gmail.com> wrote:
>> > A slightly off-topic question: Even if we modify the extension to
>> > proofread books which do not have scans (I am assuming books that were
>> > born digital), against what will these books be proofread?
>> >
>>
>> I am not sure why we are looking to proofread a digital only file, unless
>> of course it never had a text layer and it had to be OCR'd. Proofreading
>> surely only relates to scanned images where there has been the need to
>> proofread.
>>
>> Regards, Billinghurst
>>
>> _______________________________________________
>> Wikisource-l mailing list
>> Wikisource-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikisource-l
>>
