Hi Scott,


I'm sure that someone with more direct knowledge of the GBIF taxonomy backbone 
will answer more specifically.  But in general, essentially all large taxonomic 
databases have these sorts of duplicate records due to spelling variations, 
etc.  Most such databases began by harvesting lists of (messy) text-string 
names from various sources, with the early emphasis being on quantity rather 
than quality.  In recent years, the emphasis has shifted towards improving 
quality, and to greater or lesser degrees, most large databases and aggregators 
have made tremendous progress in reconciling and correcting these sorts of 
issues.  However, these kinds of lexical variants (i.e., two slightly different 
spellings being mistakenly represented as separate names) continue to exist, 
and probably will continue for quite some time (especially in large taxonomic 
aggregators, such as GBIF).  The Global Names Architecture has current NSF 
funding (PI: Dima Mozzherin) to develop tools to help reconcile these sorts of 
lexical variants, and we have another NSF grant pending that will flesh those 
cleaned/reconciled text-string names out into metadata-rich names and 
name-usages, so there is some additional hope of accelerated clean-up in the 
next few years.  But until then, I'm afraid these kinds of duplicates will 
continue to be discovered and addressed on a case-by-case basis.
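
As a minimal sketch of what flagging lexical variants can look like (an illustration only, not the Global Names tools and not anything GBIF is known to run), the Python below compares name strings with the standard library's difflib. The function name likely_variants, the 0.9 similarity threshold, and the same-genus shortcut are all assumptions made for this example; the names are the ones from the quoted message below.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Name strings copied from the examples in the quoted message.
names = [
    "Macrozamia platyrachis",
    "Macrozamia platyrhachis",
    "Cycas circinalis",
    "Cycas circinnalis",
    "Isolona perrieri",
    "Isolona perrierii",
]


def likely_variants(names, threshold=0.9):
    """Return pairs of names whose string similarity meets the threshold.

    The threshold and the same-genus check are illustrative assumptions,
    not settings taken from any real reconciliation service.
    """
    pairs = []
    for a, b in combinations(names, 2):
        # Only compare names that share a genus (the first word).
        if a.split()[0] != b.split()[0]:
            continue
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


for a, b in likely_variants(names):
    print(a, "<->", b)
```

A check like this only proposes candidates. Deciding which spelling is correct, or whether the two strings really are the same name, still needs a taxonomic source or a person.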



Not sure if that helps... But if you do restrict to a single source (like CoL), 
you're less likely to encounter these kinds of duplicates, and the presumption 
is that linking to either one will eventually get straightened out.
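
To illustrate the single-source idea, here is a small Python sketch that filters a list of records down to one source. The record layout (key, name, source) and the function name restrict_to_source are hypothetical and chosen only for this example; the keys, names, and sources are the ones given in the quoted message below. It does not call the GBIF API.

```python
# Hypothetical record layout; values are copied from the quoted message.
records = [
    {"key": 4928834, "name": "Macrozamia platyrachis", "source": "GRIN"},
    {"key": 2683551, "name": "Macrozamia platyrhachis", "source": "COL"},
    {"key": 2683264, "name": "Cycas circinalis", "source": "COL"},
    {"key": 3594916, "name": "Cycas circinnalis", "source": "IPNI"},
    {"key": 3648546, "name": "Isolona perrieri", "source": "TPL"},
    {"key": 6308376, "name": "Isolona perrierii", "source": "COL"},
]


def restrict_to_source(records, source):
    """Keep only the records that come from the given source."""
    return [r for r in records if r["source"] == source]


col_only = restrict_to_source(records, "COL")
for r in col_only:
    print(r["key"], r["name"])
```

With these six records, restricting to COL leaves one spelling per pair. That holds for this sample only and is not a guarantee about any source as a whole.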



Aloha,

Rich



Richard L. Pyle, PhD
Database Coordinator for Natural Sciences | Associate Zoologist in Ichthyology 
| Dive Safety Officer
Department of Natural Sciences, Bishop Museum, 1525 Bernice St., Honolulu, HI 
96817
Ph: (808)848-4115, Fax: (808)847-8252 email: deepreef at bishopmuseum.org
http://hbs.bishopmuseum.org/staff/pylerichard.html







From: API-users [mailto:api-users-boun...@lists.gbif.org] On Behalf Of Scott 
Chamberlain
Sent: Wednesday, May 11, 2016 11:23 AM
To: api-users at lists.gbif.org
Cc: juli g. pausas
Subject: [API-users] Scientific names questions



Hi all, 



Not sure where is best to ask this... so here goes. Let me know if there's a 
better place.  



The following are examples some users have highlighted for me as leading to 
confusion when searching for taxa.



1. Macrozamia platyrachis (http://www.gbif.org/species/4928834) vs. 
Macrozamia platyrhachis (http://www.gbif.org/species/2683551)



Here, the two spellings (with/without h) are accepted, and exact matches. The 
sci. authority seems to differ with F. M. Bailey vs. F.M.Bailey. The first is 
from GRIN taxonomy and the second from COL. 



Anyway, for users e.g., of the R client, this is a bit confusing. I had thought 
the backbone taxonomy would only have one master taxon key and name for each 
real taxon, but here it seems like there's two?



2. Cycas circinalis (http://www.gbif.org/species/2683264) vs. 
Cycas circinnalis (http://www.gbif.org/species/3594916)


Here, the two spellings (with 1 or 2 "n"'s) are accepted, and exact matches. 
The sci. authorities here are exactly the same. The first is from COL and the 
second from IPNI taxonomy. 



3. Isolona perrieri (http://www.gbif.org/species/3648546) vs. 
Isolona perrierii (http://www.gbif.org/species/6308376)


Here, the two spellings (with 1 or 2 "i"'s) are accepted, and exact matches. 
The sci. authorities here are exactly the same. The first is from TPL and the 
second from COL.



--------



Should I advise users, when searching the backbone taxonomy, to limit to 
COL to avoid any confusion about names?  



Best, 

Scott Chamberlain
