Martin,

Sorry that I'm responding to this so late.

The 475-million-word Corpus of Historical American English
<https://www.english-corpora.org/coha/> (COHA) has about 220 million words
of fiction from the 1820s-2010s (see the number of texts and words by decade
below). Nearly all of these texts from the 1820s-1980s are novels, whereas
the 1990s-2010s include more short stories. Full information (for all texts)
can be found here
<https://www.english-corpora.org/coha/files/sources-coha-2020.zip>.

Full-text (downloadable) data can be found at CorpusData.org
<https://www.corpusdata.org/> as well as from the University of Stuttgart
<https://www.ims.uni-stuttgart.de/en/research/resources/corpora/ccoha/>.

Best,

Mark Davies
English-Corpora.org

------------------------------------------

decade   #texts        #words
1820s        90     3,778,554
1830s       179     7,492,464
1840s       243     8,615,569
1850s       151     9,175,764
1860s       249     9,279,356
1870s       217    10,454,445
1880s       264    11,204,077
1890s       257    11,261,720
1900s       266    12,096,794
1910s       296    12,266,683
1920s       281    12,668,146
1930s       533    11,959,731
1940s       420    12,030,426
1950s       470    12,014,411
1960s       403    11,652,761
1970s       335    11,652,921
1980s       334    11,664,130
1990s      1711    13,337,688
2000s      4224    14,624,639
2010s      3672    15,150,555
TOTAL     14595   222,380,834
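
For anyone who wants to work with these figures directly, here is a minimal Python sketch that re-checks the totals and computes the mean number of words per text in each decade. The numbers are copied from the table above; the names COHA_FICTION and mean_words_per_text are just illustrative, not part of any COHA tooling. Mean text length is only a rough proxy for text type, so treat the result as a sanity check rather than a classification.

```python
# Per-decade fiction counts copied from the table above: (decade, texts, words).
# COHA_FICTION and mean_words_per_text are illustrative names, not COHA tooling.
COHA_FICTION = [
    ("1820s", 90, 3_778_554),
    ("1830s", 179, 7_492_464),
    ("1840s", 243, 8_615_569),
    ("1850s", 151, 9_175_764),
    ("1860s", 249, 9_279_356),
    ("1870s", 217, 10_454_445),
    ("1880s", 264, 11_204_077),
    ("1890s", 257, 11_261_720),
    ("1900s", 266, 12_096_794),
    ("1910s", 296, 12_266_683),
    ("1920s", 281, 12_668_146),
    ("1930s", 533, 11_959_731),
    ("1940s", 420, 12_030_426),
    ("1950s", 470, 12_014_411),
    ("1960s", 403, 11_652_761),
    ("1970s", 335, 11_652_921),
    ("1980s", 334, 11_664_130),
    ("1990s", 1711, 13_337_688),
    ("2000s", 4224, 14_624_639),
    ("2010s", 3672, 15_150_555),
]


def mean_words_per_text(rows):
    """Return {decade: mean words per text}, rounded to the nearest word."""
    return {decade: round(words / texts) for decade, texts, words in rows}


# Column totals, to compare against the TOTAL row of the table.
total_texts = sum(texts for _, texts, _ in COHA_FICTION)
total_words = sum(words for _, _, words in COHA_FICTION)
averages = mean_words_per_text(COHA_FICTION)

if __name__ == "__main__":
    print(f"texts={total_texts:,}  words={total_words:,}")
    for decade, avg in averages.items():
        print(f"{decade}  {avg:>7,} words/text")
```

The mean falls from roughly 42,000 words per text in the 1820s to roughly 3,500 in the 2000s, which is consistent with the larger share of short stories in the 1990s-2010s.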

> From: Martin Wynne via Corpora <corpora@list.elra.info>
> Sent: Sunday, October 27, 2024 8:11 AM
> To: corpora@list.elra.info
> Subject: [Corpora-List] Corpora of English novels
>
> I have a student who is interested in tracing the development of the
> English novel from its origins to the present day (or at least to the start
> of the twentieth century), and I'm trying to gather information about
> relevant corpora covering this text type and period.
>
> We know about the European Literary Text Collection (ELTeC,
> https://www.distant-reading.net/eltec/) which will be very useful for the
> later end of the timescale. We also know it is possible to assemble a corpus
> from Project Gutenberg, archive.org, Oxford Text Archive, etc., but would be
> interested in re-using any corpora that people might already have made,
> which aim to be representative of particular periods within this genre.
>
> The student has some flexibility with her research question, so while
> the original idea of 'English novels' was probably 'novels in English from
> Great Britain and Ireland', other related areas such as US novels might be
> interesting as well.
>
> Any tips and suggestions gratefully received. If we get a number of
> interesting direct emails, I'll be happy to summarize the results to the
> list.
>
> Best wishes,
> Martin
>
> --
> Senior Researcher in Corpus Linguistics
> Faculty of Linguistics, Philology and Phonetics, University of Oxford
> National Co-ordinator, CLARIN-UK
> martin.wy...@ling-phil.ox.ac.uk
> https://orcid.org/0000-0002-4155-0530
>
> _______________________________________________
> Corpora mailing list -- corpora@list.elra.info
> https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
> To unsubscribe send an email to corpora-le...@list.elra.info


-- 
============================================
Mark Davies
english-corpora.org
mark-davies.org
============================================
