On 29 October 2012 18:20, Karen Coyle <[email protected]> wrote:
>
>
> On 10/27/12 12:06 PM, Ben Companjen wrote:
>> Hi all,
>>
>> Since I received my e-book reader a couple of weeks ago, I have been
>> looking at out-of-copyright books to load. The few books that I
>> downloaded as EPUB from the OL / Internet Archive contain many OCR
>> errors. Rather than correcting these by hand just for myself (as OL/IA
>> doesn't provide an obvious way to let me upload a more correct
>> version), I remembered that there is a web place where people gather
>> to improve texts for e-book readers and re-discovered Project
>> Gutenberg [1].
>>
>> Community members involved with Project Gutenberg produce e-book
>> versions of out-of-copyright books, which can then be downloaded from
>> the website. But whereas OL EPUBs can be linked to a specific edition,
>> the PG EPUBs are mostly "reconstructed" from the text and harder to
>> link to a paper edition.
>>
>> Hence my following questions:
>> Do people agree that Project Gutenberg editions should be seen as separate editions?
>
> Yes, definitely. I also think that a corrected OL edition should be
> stored separately from its original un-corrected OCR. The reason is that
> at some point it may be desirable to go back and see what was there
> before the correction. Ideally, there could be versioning and forking,
> much like software.

Project Gutenberg has methods for distributed transcription of books,
although in general (and for OL books with suboptimal OCR results) I
support your vision of versioning and forking.
>
>> Do people agree the release date given by the project is the publish date?
>
> The release date of the digital edition is a publish date, but I think
> that it isn't sufficient. If the text is derived from a physical book,
> then the date of the book is also needed. I also would like to see
> "original" dates where known -- that is the original publication date of
> the text. Otherwise, Moby Dick and Origin of Species end up being
> presented as 21st century texts, which really messes up the cultural and
> scientific context.

Sure, if available, the original publish date should be added. But I
trust that somewhere in OL there already is an Edition describing the
original publication. According to OL, Moby Dick was first published
in 1851, even though one of the e-book versions is a scan of an
edition published in 1922.

>
>> Do people agree that there is some sense in PG editions' formats being
>> something like "E-book" or "Electronic resource"
>
> They are electronic resources, but if they are plain text I have a hard
> time seeing them as "ebooks" -- to me, ebook implies something more
> structured than plain text. (Title pages, navigable chapters, etc.) I
> know not everyone sees it that way.

The editions that I have seen come in several flavours: EPUB (usually
including TOC), plain text, HTML. So I think "E-book" qualifies.
>
>
>> Why are there only (19 | less than 19 | 281) of the 40000+ editions
>> [2] in OL? These 19 seem to be linked to IA items, coming from
>> "European libraries", although not all seem to be really published by
>> PG (e.g. [3]). In the latest data dump, there are 281 editions with at
>> least one PG identifier, but they are not listed under publisher PG.
>> Are there people around who know about connecting or importing the PG
>> catalogue?
>
> I believe that the PG books are not in the OL/IA workflow for a reason,
> although I don't recall the reason. It may have to do with the
> availability of bibliographic data?

There is some metadata available for every book. I don't know about
the licence (or terms of use) though.
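
For anyone wanting to reproduce a count like the 281 editions mentioned
in the quoted question, here is a rough sketch. It rests on assumptions
not confirmed in this thread: that the dump is tab-separated with the
record type in the first column and the JSON record in the fifth, and
that PG identifiers sit in an "identifiers" object under the key
"project_gutenberg". The function name, the helper and the sample rows
are made up for illustration.

```python
import json
from typing import Iterable


def count_pg_editions(lines: Iterable[str], id_key: str = "project_gutenberg") -> int:
    """Count edition rows that carry at least one identifier under id_key.

    Assumed row layout (not confirmed in the thread):
    type <TAB> key <TAB> revision <TAB> last_modified <TAB> JSON
    """
    count = 0
    for line in lines:
        parts = line.rstrip("\n").split("\t", 4)
        if len(parts) < 5 or parts[0] != "/type/edition":
            continue  # skip malformed rows and non-edition records
        try:
            record = json.loads(parts[4])
        except json.JSONDecodeError:
            continue  # skip rows whose JSON column cannot be parsed
        identifiers = record.get("identifiers")
        if isinstance(identifiers, dict) and identifiers.get(id_key):
            count += 1
    return count


def _row(rec_type: str, key: str, record: dict) -> str:
    """Build one made-up dump row in the assumed layout."""
    return "\t".join([rec_type, key, "1", "2012-10-01T00:00:00", json.dumps(record)])


# Made-up sample rows; the keys and identifier values are illustrative only.
sample_lines = [
    _row("/type/edition", "/books/OL1M",
         {"title": "A", "identifiers": {"project_gutenberg": ["2701"]}}),
    _row("/type/edition", "/books/OL2M", {"title": "B"}),
    _row("/type/work", "/works/OL3W", {"title": "C"}),
]

print(count_pg_editions(sample_lines))
```

On a real dump the same function could be fed an open file object line
by line, so the whole file never has to fit in memory.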
>
> Note, though, that from what I understand there is no new development
> happening on OL at the moment and I don't know if it will be taken up
> again. There seems to be no staff dedicated to the project. So it's
> unlikely that any new data types will be added.

I was thinking more of some best (or just good) practice for PG books,
not new development. From looking at GitHub, it seems there is still
some development going on (last commit 6 days ago, not bad). Surely
they wouldn't abandon "us" before every book has a webpage? :)

Ben

>
> kc
>
>> Are there other known publishers named Project Gutenberg?
>>
>> (Feel free to answer a subset of these questions :) )
>>
>> Ben
>>
>> [1] http://www.gutenberg.org
>> [2] http://openlibrary.org/publishers/Project_Gutenberg
>> [3] http://openlibrary.org/books/OL20478553M/The_Lady_of_the_Lake
>> _______________________________________________
>> Ol-discuss mailing list
>> [email protected]
>> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
>> To unsubscribe from this mailing list, send email to 
>> [email protected]
>>
>
> --
> Karen Coyle
> [email protected] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet