Dec 9

Baba Ramdev is one of India's many "godmen", but one who has risen to power and prominence in the last 10-12 years. He is politically close to the party and people in power in India now, and who knows, maybe that's helped him rise.
One of his endeavours is Patanjali, which produces all manner of ayurvedic drugs, but also toothpastes and shampoos and so forth. Not long ago, our newspapers carried a large Patanjali ad which made a startling claim ... well, take a look at the attached image.

So I went digging to try to confirm this claim. What I found left me somewhat confused, and perhaps that will apply to you too. But what there's no confusion about is that the "top 2%" is wrong. More like "bottom 5%".

Take a look:

What the number of citations can be made to suggest by tweaking them,
https://www.livemint.com/opinion/columns/what-the-number-of-citations-can-be-made-to-suggest-by-tweaking-them-11669922167402.html

Let me know if you know something about citation indices and this particular use of them. Let me know any thoughts about Patanjali.

cheers,
dilip

----

What citation numbers can be tweaked to suggest

Heard of the Erdös number? It's named for the great Hungarian mathematician Paul Erdös. As I wrote in this space some years ago:

"He worked with other mathematicians ... and that's where the idea for the Erdös number came from. If you collaborated with him on a paper, your Erdös number is 1. (About 500 such mathematicians). If you collaborated with someone who had collaborated with him, it's 2. (Over 9000). And so on. Your number measures what you might call your 'collaborative distance' from the man. (Erdös himself? 0, of course.)"

The number is really a tongue-in-cheek tribute to the man, not a serious measure of achievement. And yet it touches on a serious question: is there a way to measure how good, or effective, a scientist is? A low Erdös number means you have worked with some serious mathematicians, so that does indicate that you have some worth as a mathematician yourself. But still, it really is just in the nature of a tribute to a man who touched so many.
There is, though, a more serious measure that's often used: how many times a paper you have authored has been cited in other papers by other scientists. You can probably tell that this "citation index" carries some weight. For if another scientist cites something you have researched and written up, it means that scientist found your work relevant and useful in his work. And if several scientists cite your paper, it means you produced something of some relatively wide relevance. If your paper continues to be cited long after it is published, maybe even after you're dead and gone, that speaks of the lasting impact of your findings.

Of course, this idea of an index can be tweaked. For example, what's the calibre of the citations of your paper? Should a reference in Mint, for example, count the same as a reference in Nature, or Scientific American? Does a Mint reference mean that a wider audience than just academia is reading your paper? If so, surely that suggests a broader understanding and appeal? Besides, how good are the references you yourself cite - meaning, how many of the important results in your field are you aware of while you do your research?

Considerations like these go into the calculation of various metrics - called h-index, g-index and more. The h-index, for example, is calculated thus: of all a scientist's published papers, if some number "h" of them each have h or more citations, and the rest have h or fewer citations, then her h-index is h. So let's say researcher Sharvari has published 30 papers. 7 of them have been cited at least 7 times each; the other 23 are each cited 7 or fewer times. Sharvari then has an h-index of 7.

To give you a quick idea, Erdös has an h-index of 76, Albert Einstein 92. Einstein's E=mc² paper has been cited nearly 500 times; his special relativity paper over 2000 times.

Like all statistical measures, these are used and interpreted in different ways.
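For readers who like to see such things spelled out, here is a minimal sketch in Python of the h-index rule described above. The individual citation counts for Sharvari are invented purely for illustration; only the totals (30 papers, an h-index of 7) come from the example in the text. The function name h_index is likewise just a label chosen for this sketch, not anything taken from a citation database.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            # The top `position` papers all have at least `position` citations.
            h = position
        else:
            break
    return h


# Hypothetical citation counts (an assumption for illustration):
# 7 papers cited at least 7 times each, and 23 papers cited 7 or fewer times.
sharvari = [40, 22, 15, 12, 9, 8, 7] + [5] * 10 + [2] * 8 + [0] * 5

print(len(sharvari), "papers, h-index", h_index(sharvari))
```

Sorting first means the loop only has to find the point where a paper's rank overtakes its citation count. One heavily cited paper cannot lift the figure by itself, which is part of the measure's appeal.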
Certainly Einstein should figure at or near the top of any ranking of scientists. But considering the sheer number of scientific journals that there are, some no doubt of dubious merit, it should hardly be a surprise to find unknown or unexpected names on such rankings. These might be solid scientists who are just not widely known; they might also be not-so-solid ones who find ways to inflate their citation indices.

And in fact, one recent paper addresses exactly the use and misuse of such citation metrics, apparently seeking a way to rationalize them ("September 2022 data-update for 'Updated science-wide author databases of standardized citation indicators'", John P.A. Ioannidis, Elsevier, 10 October 2022, https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/4). It looks at citation data for about 200,000 scientists around the world and attempts to rank them according to a composite - called the c-score - of various citation indices like the h-index. As the paper explains, "The c-score focuses on impact (citations) rather than productivity (number of publications)."

It's not clear to me how many scientists, whether on this list or not, pay attention to a list like this. I suspect the more serious ones simply go about their work, uninterested in such rankings.

But be that as it may, how did Ioannidis choose those approximately 200,000 scientists? Let me quote his paper: "The selection is based on the top 100,000 scientists by c-score ... or a percentile rank of 2% or above ... 200,409 scientists are included in the single recent year dataset."

Never mind "single recent year". The language seems calculated to obfuscate this process: in particular, how does 100,000 become 200,409? Perhaps it's the percentile rank of 2%? That may be, because that number takes in those who are above the bottom 2% in the rankings (what "2 percentile" means); in other words, the 200,000 make up the top 98% of scientists in the overall list.
To me, this seems the most reasonable interpretation of this paper.

This is all leading somewhere, I promise. I was prompted to do this digging by a recent large newspaper ad for Patanjali products featuring this claim: "Pujya Acharya Balkrishna ji and the Patanjali (PRI) among top 2% scientists in the world."

Patanjali's website sent me to the Ioannidis paper above, and from there to two enormous Excel files which contain data about all these scientists. The paper's only mention of 2% is "percentile rank of 2% or above". So "top 2% scientists" in the ad should actually be "top 98% scientists". Or, equivalently, "scientists ranked higher than the bottom 2%." Neither of which, you'll agree, is quite as eye-catching as "top 2%".

But wait, where exactly in the rankings is Acharya Balkrishna? With a c-score "rank" of 367,268, he appears in the 192,004th row, of a total of 200,409.

That is, 95.8% of these scientists are ranked higher than Acharya Balkrishna.

That is, his percentile rank is 4.2.

That is, 4.2% of the scientists rank below him.

Give that a thought.

--
My book with Joy Ma: "The Deoliwallahs"
Twitter: @DeathEndsFun
Death Ends Fun: http://dcubed.blogspot.com