The following is an exchange between Andrew Odlyzko (U. Minnesota) and Hal Varian (UC Berkeley) on the question of the number of journals and annual articles. There is also a note from Donald W. King (U. Pittsburgh) at the end. 7 numbered contributions in all.
---------------------------------------------------------------------------
1.
List-Post: goal@eprints.org
Date: Tue, 27 Jan 2004 20:42:39 -0800
From: Hal Varian <h...@sims.berkeley.edu>

We cite 37,609 journals (using Ulrich's 2001 data), with an average of
208 pages per issue. See:
http://www.sims.berkeley.edu/research/projects/how-much-info-2003/print.htm#genres

---------------------------------------------------------------------------
2.
List-Post: goal@eprints.org
Date: Wed, 28 Jan 2004 04:54:27 -0600 (CST)
From: Andrew Odlyzko <odly...@dtc.umn.edu>

Something seems wrong here. In your entry for scholarly periodicals
(which cites Ulrich's 2001 figure of 37,609 journals), what is the
assumed size of a scanned image (600 dpi)? (Looking just in this
table, for books you seem to be using 130 KB/page, for newspapers
you say it is 500 KB/page, for newsletters probably 1600/12 or
133 KB/page.)

Also, how many issues of a scholarly periodical do you assume there
are in a year, on average? (I.e., does "208 page average" refer to
an issue or to annual input?)

If we take your "total TB per year" figure of 6.0 TB and divide by
the 27 MB/issue figure in the table, we get about 222,000 issues.

There are some pretty extensive studies by Don King which show
that the average scholarly paper is something like 10 pages in
length (with variations from field to field, math being about
twice as long, for example). If we use that, and combine it
with your estimates of 222,000 annual issues and 208 pages per
issue, we get something like 4.5 million articles per year.

---------------------------------------------------------------------------
3.
List-Post: goal@eprints.org
Date: Wed, 28 Jan 2004 07:00:44 -0600 (CST)
From: Andrew Odlyzko <odly...@dtc.umn.edu>

Stevan,

Sure, there is nothing confidential about it.
However, it might be better to wait and give Hal a chance to respond,
and then send out both messages at once, to minimize the mental load
on the readers.

Best regards,
Andrew

P.S. My impression of the size of the literature (based on the rough
estimates I had made years ago) is about the same as yours, and appears
to differ from Hal's.

---------------------------------------------------------------------------
4.
List-Post: goal@eprints.org
Date: Wed, 28 Jan 2004 06:48:02 -0800
From: Hal Varian <h...@sims.berkeley.edu>

Andrew Odlyzko wrote:

> Something seems wrong here. In your entry for scholarly periodicals
> (which cites Ulrich's 2001 figure of 37,609 journals), what is the
> assumed size of a scanned image (600 dpi)? (Looking just in this
> table, for books you seem to be using 130 KB/page, for newspapers
> you say it is 500 KB/page, for newsletters probably 1600/12 or
> 133 KB/page.)

Take a look at:
http://www.sims.berkeley.edu/research/projects/how-much-info/print.html#orig
which lays out different scanning/compression assumptions.

As I recall, for books and journals we used the JSTOR numbers. They
scan at 600 dpi, but then compress using a technology that stores a
dictionary of font shapes, along with pointers to those shapes. This
is quite efficient for material that is primarily text.

For newspapers, we used the Newspaper Preservation Project standards
(http://www.neh.gov/projects/usnp.html), which use different technology.
Also, newspaper pages are much bigger than book/journal pages.

For journals, we used the Tenopir and King numbers.

> Also, how many issues of a scholarly periodical do you assume there
> are in a year, on average? (I.e., does "208 page average" refer to
> an issue or to annual input?)

"1,700 pages per periodical per year" from King's data.

> If we take your "total TB per year" figure of 6.0 TB and divide by
> the 27 MB/issue figure in the table, we get about 222,000 issues.
>
> There are some pretty extensive studies by Don King which show
> that the average scholarly paper is something like 10 pages in
> length (with variations from field to field, math being about
> twice as long, for example). If we use that, and combine it
> with your estimates of 222,000 annual issues and 208 pages per
> issue, we get something like 4.5 million articles per year.

Sounds reasonable.

---------------------------------------------------------------------------
5.
List-Post: goal@eprints.org
Date: Wed, 28 Jan 2004 09:05:21 -0600 (CST)
From: Andrew Odlyzko <odly...@dtc.umn.edu>

Hal,

Thank you very much for your response. There are still some
seeming inconsistencies, but they may not be too big.

If there are "1,700 pages per periodical per year", and
"208 page average" per issue, we get 8.2 issues
per journal per year, or a total of 308,000 issues per year.
At the 27 MB/issue figure from your table, we obtain "total TB per
year" of 8.3 TB, instead of the 6.0 TB in the table. This does
not matter much for the main purposes of your table, but does
affect the calculation of the number of scholarly articles.

If we scale down the number of scholarly journals to the 24,000
figure used by Stevan, and use the King estimate of 1,700 pages
per periodical per year, we get 41 million pages per year, and
if we assume an average of 12 pages per article (I don't have
the King estimate at hand, so don't recall if it was 10 or 12),
we get about 3.4 million articles.

These are all very rough estimates.

Best regards,
Andrew

---------------------------------------------------------------------------
6.
List-Post: goal@eprints.org
Date: Wed, 28 Jan 2004 08:09:10 -0800
From: Hal Varian <h...@sims.berkeley.edu>

Andrew Odlyzko wrote:

> Thank you very much for your response. There are still some
> seeming inconsistencies, but they may not be too big.
>
> If there are "1,700 pages per periodical per year", and
> "208 page average" per issue, we get 8.2 issues
> per journal per year, or a total of 308,000 issues per year.
> At the 27 MB/issue figure from your table, we obtain "total TB per
> year" of 8.3 TB, instead of the 6.0 TB in the table. This does
> not matter much for the main purposes of your table, but does
> affect the calculation of the number of scholarly articles.

These inconsistencies are due to the fact that we are intermingling
two reports--the 2000 report (which is where the table is from) and
the 2003 report. The raw numbers, and the assumptions, differ a bit
between the two reports. I originally sent the numbers from the 2003
report, then later sent the table from the 2000 report to illustrate
the compression issue. (We didn't have a similar table in the 2003
report.)

> If we scale down the number of scholarly journals to the 24,000
> figure used by Stevan, and use the King estimate of 1,700 pages
> per periodical per year, we get 41 million pages per year, and
> if we assume an average of 12 pages per article (I don't have
> the King estimate at hand, so don't recall if it was 10 or 12),
> we get about 3.4 million articles.
>
> These are all very rough estimates.

For sure.

Hal Varian                  voice: 510-642-9980
SIMS, 102 South Hall        fax:   510-642-5814
University of California    h...@sims.berkeley.edu
Berkeley, CA 94720-4600     http://www.sims.berkeley.edu/~hal

---------------------------------------------------------------------------
7.
List-Post: goal@eprints.org
Date: Wed, 28 Jan 2004 14:55:38 +0000
From: Donald W. King

In reviewing the communication yesterday with Hal Varian, I also
updated our journal tracking from Ulrich's U.S. science titles in 2002
(using Ulrich's broad definition of U.S. titles). This involves
tracking 800 titles for information in Ulrich's (started in 1960) and
going to the library for in-depth observation of about 200 titles.

Results are:

* about 7,800 U.S. science scholarly titles
* 10.8 issues/title
* 154 articles/title
* 1,910 article pages/title
* 2,215 total pages/title
* 397 special graphics/title

Applying these parameters to our publishing cost model, I estimate the
article processing costs to be $1,660 per article (not too different
from PLoS's $1,500).

Don King
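
---------------------------------------------------------------------------
The back-of-envelope arithmetic in messages 2, 5, and 7 can be retraced
with a short script. What follows is only a sketch in Python, not
something any of the correspondents wrote. The variable names are
illustrative, the inputs are the figures quoted in the thread, and the
conversion 1 TB = 1,000,000 MB (decimal units) is an assumption, since
the thread does not say which convention the table uses.

```python
# Sketch: retrace the rough estimates quoted in the thread.
# All names are illustrative; inputs are the figures the correspondents cite.

MB_PER_TB = 1_000_000  # assumption: decimal units (1 TB = 10**6 MB)

# Message 2 (Odlyzko): issues implied by the 2000-report table.
table_tb_per_year = 6.0
mb_per_issue = 27
pages_per_issue = 208
issues_from_table = table_tb_per_year * MB_PER_TB / mb_per_issue
articles_at_10_pages = issues_from_table * pages_per_issue / 10

# Message 5 (Odlyzko): recompute storage from 1,700 pages per journal per year.
journals_ulrich = 37_609
pages_per_journal_year = 1_700
issues_per_journal = pages_per_journal_year / pages_per_issue
total_issues = journals_ulrich * issues_per_journal
tb_recomputed = total_issues * mb_per_issue / MB_PER_TB

# Message 5: scaled down to 24,000 journals, at 12 pages per article.
journals_scaled = 24_000
pages_scaled = journals_scaled * pages_per_journal_year
articles_scaled = pages_scaled / 12

# Message 7 (King): quantities implied by the per-title averages.
us_titles = 7_800
articles_per_title = 154
article_pages_per_title = 1_910
king_pages_per_article = article_pages_per_title / articles_per_title
king_us_articles = us_titles * articles_per_title

print(f"issues implied by table:       {issues_from_table:,.0f}")
print(f"articles at 10 pages each:     {articles_at_10_pages:,.0f}")
print(f"issues per journal per year:   {issues_per_journal:.1f}")
print(f"total issues per year:         {total_issues:,.0f}")
print(f"recomputed TB per year:        {tb_recomputed:.1f}")
print(f"pages per year (24,000 jnls):  {pages_scaled:,}")
print(f"articles at 12 pages each:     {articles_scaled:,.0f}")
print(f"King: pages per article:       {king_pages_per_article:.1f}")
print(f"King: U.S. science articles:   {king_us_articles:,}")
```

Under these assumptions the script lands close to the quoted figures:
about 222,000 issues, about 8.2 issues per journal, 8.3 TB, and
3.4 million articles for the scaled-down case. King's per-title
averages work out to roughly 12.4 pages per article, in line with the
10-to-12-page range Odlyzko recalls.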