I would like to ask a counting question, since all of this is based on
good counting and a great deal of faith is placed in the counters. Even the
US census knows the issues with doing this and resorts to capture/recapture
methods to get things right.
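
To make the capture/recapture idea concrete, here is a minimal sketch in
Python of the simplest textbook form, the Lincoln-Petersen estimator, with
Chapman's small-sample variant. It is an illustration only, not the
census's actual procedure; the function names and the example counts are
invented. Note that deciding which records appear in both lists already
presupposes that the same person can be matched across them.

```python
def lincoln_petersen(n_first: int, n_second: int, n_both: int) -> float:
    """Estimate total population size from two independent samples.

    n_first  -- individuals found in the first list
    n_second -- individuals found in the second list
    n_both   -- individuals found in both lists (the "recaptures")
    """
    if n_both <= 0:
        raise ValueError("need at least one individual seen in both lists")
    return n_first * n_second / n_both


def chapman(n_first: int, n_second: int, n_both: int) -> float:
    """Chapman's variant, less biased when the overlap is small."""
    return (n_first + 1) * (n_second + 1) / (n_both + 1) - 1


if __name__ == "__main__":
    # Hypothetical counts: 200 authors in list A, 150 in list B, 30 in both.
    print(lincoln_petersen(200, 150, 30))
    print(round(chapman(200, 150, 30), 1))
```

With these invented counts the two lists together name only 320 distinct
people, yet the estimate of the full population is about 1,000.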

Counting papers should be rather straightforward, since there are databases
that are populated manually, mostly by the authors themselves.

However, counting unique authors is not so straightforward; clustering
mentions of the same author is nontrivial, and just counting each mention
leads to terribly inaccurate results.

How does the UN, or anyone, disambiguate authors: the John Smith at
Harvard from the John Smith at Berkeley or, more interestingly, the many
H. Chens at many places from one another? Is this a manual or an automatic
method? Manual methods do not scale (look at DBLP) and have many
errors - see PubMed. Do you know how many different "you's" there are in
health record or credit databases? You would be surprised.

As for automatic methods, the literature is rife with results.
Disambiguation algorithms are notoriously temperamental and,
depending on the parameters, can lead to over- or under-counting
by factors of up to an order of magnitude.
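
A toy sketch in Python shows how the choice of matching rule alone moves
the count in both directions. The mentions, affiliations and "true"
identities below are invented for illustration, and the two key functions
are deliberately crude stand-ins, not any published disambiguation
algorithm.

```python
# (true person id, name as printed, affiliation as printed) -- invented data
MENTIONS = [
    (1, "John Smith", "Harvard"),
    (1, "John Smith", "Harvard"),
    (1, "J. Smith", "Harvard"),
    (2, "John Smith", "Berkeley"),
    (3, "H. Chen", "Penn State"),
    (4, "H. Chen", "Arizona"),
    (4, "Hui Chen", "Arizona"),
]


def initial_surname_key(name, affiliation):
    """Aggressive rule: merge everyone sharing first initial and surname."""
    parts = name.replace(".", "").lower().split()
    return (parts[0][0], parts[-1])


def exact_key(name, affiliation):
    """Conservative rule: merge only identical name and affiliation strings."""
    return (name.lower(), affiliation.lower())


def count_authors(mentions, key):
    """Number of distinct clusters produced by a given key function."""
    return len({key(name, aff) for _, name, aff in mentions})


if __name__ == "__main__":
    print("mentions:          ", len(MENTIONS))
    print("true authors:      ", len({pid for pid, _, _ in MENTIONS}))
    print("initial + surname: ", count_authors(MENTIONS, initial_surname_key))
    print("exact name + affil:", count_authors(MENTIONS, exact_key))
```

On this toy data there are 7 mentions and 4 real people; the aggressive
rule finds 2 authors (under-counting) and the conservative rule finds 6
(over-counting).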

Best regards,

Lee Giles

On 1/3/12 9:59 AM, Arif Jinha wrote:
> Arthur,
> Great work.  Just trying to save you some time.  Here's what I found after 
> working on it for about 2 years. 
> - # of researchers in the world is reported by UN data in the Science Report. 
> - That figure directly relates to the number of journal titles which relates 
> directly to the number of articles, and the growth rates of articles and 
> researchers are 1:1.  So, even if you're not interested in the number of 
> annual articles published, it's important to note as a check on data and 
> possibly a challenge to the evidence thus far. 
> - There are more researchers than annual articles - about 6 to 7.  Again, a 
> check on data or a challenge.
>
> In the absence of any undertaking of reasonable time and expense to count 
> researchers better than the UN, I've relied on that data not for great 
> precision but because of the logical and empirical support for the internal 
> consistency of the relationships (the self-organizing system of scholarly 
> communication).  
>
> I'm very confident in the precision of some estimates and growth rates for 
> articles and not others, those done by Mabe (1 million annual articles in 
> 2000, 3.4% growth of journals over 3 centuries and variability in Little 
> Science, Big Science and Disillusionment periods) Tenopir and King (similar 
> data in the late 1990s) and Bjork.  The 2.5 million articles frequently cited 
> by Harnad is way off because they failed to take into account the difference 
> in article averages - they used the article average from ISI and the number 
> of titles from Ulrich's. There is no excuse for that.  The other estimates 
> that are way off occur before the tools were available to get the precision 
> needed and those are older estimates. 
>
> In addition, Bjork's work continues to cite the 3.4% average annual growth of 
> active journals, whereas I have noted a spike in article and journal output 
> since 2000 which is important to note.  The variations in article and journal 
> growth are what defines Little Science (before WWII), Big Science (after 
> WWII), Disillusionment (1970s to 2000) periods. Since the current growth rate 
> is minimally 4.5%, we currently see a) a reversal of disillusionment, b) the 
> highest variation in history, and c) the highest annual increase in 
> production.  Moreover, we see a massive 10% drop in the share to the West 
> (NA, Europe, Australia and New Zealand), as a result of globalization.  We 
> can also see from the data minimally 20% of articles being OA now, and the 
> current growth rate (last 5 yrs) pointing towards 50% in the next 20 years. 
> So, I have named 2 new periods after Disillusionment - Global Science (2000 
> to current) and Open Science (current to future).  
>
> Here are my frustrations with this research, it is rooted in the ancient 
> research paradigms of the 20th century, which I myself had to wade through.  
> It lacks REFLEXIVITY, and is hopelessly academic.  Academia is hopelessly 
> unimaginative.
>
> You cannot determine the future of OA by the trend alone, logically if the 
> share of OA is already significant and growing rapidly, this alters the 
> market, and puts pressure on publishers to react.  What will happen is that 
> as the OA share increases, more journals will convert to OA, and more new 
> journals will start OA. A quick look into Ulrich's tells me that the increase 
> in new OA journals is much higher than the current growth rate of Gold OA 
> articles.  Secondly, the growth of mandates is spectacular, but the effect 
> takes 2 years to manifest so we are only going to start to see that in the 
> next decade.  That means an acceleration of the trends that I've pointed out 
> is likely, begging the question as to who in their right mind would publish a 
> Toll Access journal in the year 2030, to a global audience who will see Toll 
> Access as a dinosaur? 
>
> Major publishers, as we've seen, have started OA brands and this will 
> continue until a major publisher converts their entire product line to OA.  
> That publisher will be loved because they will have a useable website, and 
> the others will start to look even more awful.
>
> You cannot determine the future either by the behaviour of current 
> researchers, since we are in the midst of a vast demographic shift from a 
> research world dominated by Western baby boomers who are retiring or will 
> retire in the next 20 years.  You should determine the future of OA by the 
> behavior of future researchers who reflected the boom in the global youth 
> population, and grew up in digital culture.  Pay attention to students.  
>
> The goals of OA create this change, so you have a reflexive effect 
> particularly when researchers are transparently advocates of OA (which is 
> better than attempting a facade of neutrality impossible for the researcher 
> whose choices we are concerned with).  The more you succeed in advocating OA, 
> the more you re-arrange and accelerate the data you're studying.  But no one 
> is talking to the students.
>
> Following my research, I decided to register a new publishing firm and I'm 
> actually much more focused at the moment on creative arts and culture - so 
> I'm using what I call an Open Creative Commercial business model.  I plan to 
> publish scholarly communication going forward, though I'm not satisfied by 
> the 'article' as a format since this was designed for print.  When we publish 
> in print, there will be articles but they will have to justify their 
> production value by being both sound scholarship and nice to read. Most 
> journal articles are awful reads.
>
> My thesis is not that nice to read because my university requires me to 
> confine myself to 20th century conventions.  I apologize for this since you 
> have decided to read it.  What I will do is put together a web presentation 
> of the thesis, when I have recovered from academia.  I'm satisfied that the 
> data shows that the research world is changing, and I can't understand why OA 
> advocates pay no attention to students, particularly grad students since it 
> is their culture and attitudes which will determine their legacy and the 
> future of OA.  OA should pay attention to and encourage students, and 
> students know more about how to use the web than their professors.
>
> There are several reasons why students are Occupying campuses, and tuition is 
> only one problem.  Respect is the main problem.  It is like the 'Bread and 
> Roses' strike, we want the money problem to be solved, but we want respect 
> more than anything. 
>
> There is a lack of respect for students at universities, particularly at 
> the biggest ones in the North/West, which is characterized by a culture of 
> research entitlement.  That is to say, profs generally chase research money 
> and neglect their students, administrators direct funds to a massive, 
> bureaucratic institutional structure and the value and quality of education 
> does not keep pace with social change. Students encounter outdated lessons, 
> teaching which does not observe pedagogical knowledge, grading which 
> discourages innovation, high debt load, and the parochial tradition of 
> bullying students with criticism.  The criticism is largely habitual by now, 
> since profs do not have the time to constructively assist 100s of students in 
> their classes.  
>
> Profs know and understand that the institutions are broken, but they have 
> zero time to address problems.  The system of tenure puts forward the false 
> notions of 'academic freedom' as if it were a carrot, whereas all it is is 
> job security.  Thus, their days are spent chasing this carrot, and carrying 
> the heavy workload of dealing with 20th century administration.  They are 
> time-poor.  They do this for the money. 
>
> Academic freedom cannot be granted, it is inherent.  I learned that in high 
> school when I cut class to learn about the world.  All the best critical 
> young thinkers are fed up with a generation that has led us to crisis, 
> failure, climate change, war, a university climate which does not tolerate 
> the spirituality or mysticism that informs arts and culture, and that used to 
> infuse intellect with brilliance.  In the Muslim world, I think we understand 
> that the great towering intellects of history were all mystics.  Academia and 
> in particular social sciences,  holds giant cultural prejudices about the 
> nature of reality, all of which were rejected by modern physics 50 years ago. 
>
> University today is oppression by debt and drudgery and old white folks who 
> feel guilty about global decline.  This will be the case until it is 
> occupied by the love of wisdom again.  Access to scholarship and Open Science 
> marks the end of exclusivity to scholarship reserved for elites who are 
> members of rich institutions, and ends the cultural hegemony of accredited 
> knowledge. 
>
> Tomorrow's researchers are going to take knowledge into vast new dimensions 
> of integrated understanding together with the need to raise children in a 
> world that one must admit is schizophrenic, bipolar and 
> personality-disordered! It is chaotic and it can only be tolerated by a 
> student who becomes a Master, who grounds themselves in enlightenment - 
> intellectual, spiritual, mystical and devotional.  My child's studies in 
> Sufism will be as important as their studies in maths. There is no university 
> today that understands any of this. 
>
> Because the OA trend is irreversible and people do not require nor reasonably 
> should place trust in peer-review anymore, all of the topics we are now 
> interested in quickly fade, and we become interested in the action of sharing 
> knowledge and being their own filter, doing it rather than letting 
> institutions do it.  It would be unwise for today's university teachers to 
> place great emphasis on publishing in journals anyway, but it would be wise 
> for them to teach their students how to be leaders in contemporary thought, 
> how to navigate truth and reality, and to be informationally wise, and to be 
> fearless about the Openness paradigm - share your work! For me it's like the 
> Blues Brothers - 'I'm on a mission from God'. lol. 
>
> Now that I've finished my MA, I don't have to conform anymore and I can be 
> myself again - mystic-philosopher-entrepreneur-occupier.  There is a lot of 
> bitterness I need to transform into beauty, which is why I Occupy myself with 
> the Creative Arts at the moment. Poetry, literature, music, visual art, 
> photography, creative capitalism and mutual aid.  This was dashed off 
> quickly, and I really ought to be on my zafu doing anapanasati (that's Buddhist 
> for 'sitting around').  Do you think, though, that there will ever be space in 
> the future for the Bohemian at university - the Alan Watts type? I hope so.  
> Otherwise, you'll just say to people 'I got this strange letter from this 
> student who is probably mentally ill or on drugs'.  Ugggh. That is the Brave 
> New World we are in.
>
> If this stuff is less fun and interesting to read than my thesis, I've lost 
> you! I wish you greatness in life, the depth of being human, and a good death.
>
> 'I believe that unconditional love and unarmed truth will have the final say 
> in reality' - MLK.
>
> all the best,
>
> Arif
>
> yo
>
>   ----- Original Message ----- 
>   From: Arthur Sale 
>   To: 'Global Open Access List (Successor of AmSci)' 
>   Sent: Monday, January 02, 2012 11:43 PM
>   Subject: [GOAL] Re: How many researchers are there?
>
>
>   Thank you Arif.  I have read the article this afternoon (3 January) and 
> will download and look through your thesis asap.
>
>    
>
>   However I feel compelled to re-emphasize to the list that I am not looking 
> for an estimate of how many articles are published annually, or ever. The 
> first of those pieces of data is useful for estimating what I really want to 
> know: how many active researchers are employed in year y? Particularly 2011. 
> Of course, it will be useful to have article counts by discipline, however 
> rough, because publication practices differ widely between disciplines. A 
> publication in some disciplines is worth far less than in others, the number 
> of authors/article differs widely, and journal prestige varies at least as 
> much.
>
>    
>
>   There are many other confusing factors in estimates based on article 
> production rates which I touched on in my reply to Stevan Harnad, not least 
> of which is the frequency of publication of equally highly respected 
> researchers. Some publish rarely (say once every three years), others produce 
> multiple articles per year. There are distributions in all these things which 
> we should understand. If I mention just one, the huge disparity between 
> articles/title in ISI and non-ISI journals listed in your article (111 vs 26, 
> from Bjork et al) must give anyone cause to reflect! That's over 4:1, too big 
> to gloss over.
>
>    
>
>   I know of course that I cannot determine exactly the number of researchers 
> in the world, any more than anyone else can determine exactly how many 
> articles were written or published.  As an engineer in a previous career, 
> absolute precision in these matters is not required, rather sufficient 
> confidence that we are in the right ballpark. Anyway, thank you very much for 
> your help and links, which I greatly appreciate.
>
>    
>
>   Arthur Sale
>
>   University of Tasmania
>
>    
>
>    
>
>   From: goal-bounces at eprints.org [mailto:goal-bounces at eprints.org] On 
> Behalf Of Arif Jinha
>   Sent: Tuesday, 3 January 2012 5:26 AM
>   To: Global Open Access List (Successor of AmSci)
>   Subject: [GOAL] Re: How many researchers are there?
>
>    
>
>   Arthur,
>
>    
>
>   You're not going to be able to determine the exact number of researchers in 
> the world and you will have to make good estimates. But there are direct 
> relationships between the number of researchers, the number of articles 
> published annually and the number of active peer-reviewed journals. Good 
> sources for methodology are my thesis - 
> http://arif.jinhabrothers.com/sites/arif.jinhabrothers.com/files/aj.pdf 
> (defended and submitted this fall)
>
>   - Article 50 million - 
> http://www.mendeley.com/research/article-50-million-estimate-number-scholarly-articles-existence-6/
>
>   Methods and data are based chiefly on:
>
>   Bjork et al's studies on OA share growth 2006 to current
>
>   Mabe and Amin, Tenopir and King - works 1990s to early 2000s
>
>   Derek de Solla Price - 1960s - the 'father of scientometrics'.
>
>   - you can get the number of articles from Bjork's methods and data and mine.
>
>   - you can get the number of researchers from UN data but there is a ratio of 
> researchers to publishing researchers, and publishing researchers publish an 
> average of 1 article per year, so if you can determine a good estimate for that 
> ratio you are on your way. You have good data on growth rates of researchers, 
> articles and journals, but growth rates have increased dramatically since 
> 2000 as demonstrated in my thesis.  It got a bit complex and I tried to sort 
> it best I could in my thesis.
>
>    
>
>   all the best,
>
>    
>
>   Arif
>
>    
>
>    
>
>    
>
>   ----- Original Message ----- 
>
>     From: Arthur Sale 
>
>     To: 'Global Open Access List (Successor of AmSci)' 
>
>     Sent: Saturday, December 31, 2011 6:25 PM
>
>     Subject: [GOAL] How many researchers are there?
>
>      
>
>     I am trying to get a rough estimate of the number of active researchers 
> in the world. Unfortunately all the estimates seem to be as rough as the 
> famous Drake equation for calculating the number of technological 
> civilizations in the universe: in other words all the factors are extremely 
> fuzzy.  I seek your help. My interest is that this is the number of people 
> who need to adopt OA for us to have 100% OA. (Actually, we will approach that 
> sooner, as the average publication has more than one author and we need only 
> one to make it OA.)
>
>      
>
>     To share some thinking, let me take Australia. In 2011 it had 35 
> universities and 29,226 academic staff with a PhD. Let me assume that this is 
> the number of research active staff. The average per institution is 835, and 
> this spans big universities down to small ones. Australia produces according 
> to the OECD 2.5% of the world's research, so let's estimate the number of 
> active researchers in the world (taking Australia as 'typical' of 
> researchers) as 29226 / 0.025 = 1,169,040 researchers in universities. Note 
> that I have not counted non-university research organizations (they'll make a 
> small difference) nor PhD students (there is usually a supervisor listed in 
> the author list of any publication they produce).
>
>      
>
>     Let's take another tack. I have read the number of 10,000 research 
> universities in the world bandied about. Let's regard 'research university' 
> as equal to 'PhD-granting university'. If each of them has 1,000 research 
> active staff on average, then that implies 10000 x 1000 = 10,000,000 
> researchers.
>
>      
>
>     That narrows the estimate, rough as it is, to
>
>              1.1M  < no of researchers < 10M
>
>     I can live with this, as it is only one power of ten (order of magnitude) 
> between the two bounds. The upper limit is around 0.2% of the world's 
> population.
>
>      
>
>     Another tactic is to try to estimate the number of people whose name 
> appeared in an author list in the last decade. Disambiguation of names rears 
> its ugly head. This will also include many non-researchers in big labs, some 
> of them will be dead, and there will be new researchers who have just not yet 
> published, but I am looking for ball-park figures, not pinpoint accuracy. I 
> haven't done this work yet.
>
>      
>
>     Can we do better than these estimates, in the face of different national 
> styles?  It is even difficult to get one number for PhD granting universities 
> in the US, and as for India and China @$#!
>
>      
>
>     Arthur Sale
>
>     University of Tasmania, Australia
>
>
>
> ------------------------------------------------------------------------------
>
>
>   _______________________________________________
>   GOAL mailing list
>   GOAL at eprints.org
>   http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal
>
>
>