Thanks, Tom.
I'll have to look at this specific case when I'm back at work tomorrow, as
it does seem you found something in error.
As for my process: with WD, I queried out the label, description & country
of citizenship, dob & dod of everyone with occupation: photographer.
After some cleaning, I can get the WD data formatted like my own (Name,
Nationality, Dates). I can then do a simple match, where everything matches
exactly. For the remainder, I then match names and dates, without
Nationality, which is often very "soft" information. For those that pass a
smell test (one is "English" the other is "British") I pass those along,
too. For those with greater discrepancies, I look still closer. For those
with still greater discrepancies, I manually, individually query my
database for anyone with the same last name & same first initial to catch
misspellings or different transliterations. I also occasionally put my
entire database into OpenRefine to catch instances where, for instance, a
Chinese name has been given as FamilyName, GivenName in one source, and
GivenName, FamilyName in another.
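
The tiers described above could be sketched roughly as follows. This is only an illustration, not the actual workflow (which is manual): the Person record, the "Family, Given" name format, the tier labels, and the NATIONALITY_GROUPS table are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    # Hypothetical record shape mirroring "Name, Nationality, Dates".
    name: str                 # assumed to be "Family, Given"
    nationality: str
    born: Optional[int]
    died: Optional[int]


# Assumed equivalence groups for the nationality "smell test".
NATIONALITY_GROUPS = [{"english", "british", "scottish", "welsh"}]


def norm(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def nationalities_compatible(a: str, b: str) -> bool:
    """True if the two labels are equal or fall in the same assumed group."""
    a, b = norm(a), norm(b)
    return a == b or any(a in g and b in g for g in NATIONALITY_GROUPS)


def loose_name_key(name: str) -> tuple:
    """Family name plus first initial, to catch misspelled given names."""
    family, _, given = name.partition(",")
    return (norm(family), norm(given)[:1])


def classify(mine: Person, wd: Person) -> str:
    """Assign a candidate pair to one of the tiers described in the text."""
    same_name = norm(mine.name) == norm(wd.name)
    same_dates = (mine.born, mine.died) == (wd.born, wd.died)
    if same_name and same_dates:
        if norm(mine.nationality) == norm(wd.nationality):
            return "exact"    # everything matches
        if nationalities_compatible(mine.nationality, wd.nationality):
            return "accept"   # passes the smell test
        return "review"       # greater discrepancy: look closer
    if loose_name_key(mine.name) == loose_name_key(wd.name):
        return "manual"       # same last name and first initial: check by hand
    return "no match"
```

For instance, classify(Person("Lowe, David", "English", 1900, 1980), Person("Lowe, David", "British", 1900, 1980)) lands in the "accept" tier, while a pair that differs in a date or in the spelling of the given name falls through to "manual". The name-order problem (FamilyName, GivenName versus GivenName, FamilyName) is not handled here and would still need a clustering pass.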
In short, this is scrupulously, and manually, checked data. I'm not savvy
enough to let an algorithm make my mistakes for me! But let me know if
finding the conflicting data you found seems to be more than bad luck of
the draw.
I should also say that I may suppress the Niepce Museum collection, as it's
from a really crappy list of photographers in their collection which I
found many years ago, and can no longer find. I don't want to blame them
for the discrepancy, but that might be the source. I don't know.
As I start to query out places of birth & death from WD in the next days, I
expect to find more discrepancies. (Just today, I found dozens of folks
whom ULAN gendered one way, and WD another, but who were undeniably the same
photographer.)
Thanks,
David


On Tuesday, December 8, 2015, Tom Morris <tfmor...@gmail.com> wrote:

> Can you explain what "indexing" means in this context?  Is there some type
> of matching process?  How are duplicates resolved, if at all? Was the
> Wikidata info extracted from a dump or one of the APIs?
>
> When I looked at the first person I picked at random, Pierre Berdoy
> (ID:269710), I see that both Wikidata and Wikipedia claim that he was born
> in Biarritz while the NYPL database claims he was born in Nashua, NH.  So,
> it would appear that there are either two different people with the same
> name, born in different places, or the birth place is wrong.
>
>
> http://mgiraldo.github.io/pic/?&biography.TermID=2028247&Location=269710|42.7575,-71.4644
> https://www.wikidata.org/wiki/Q3383941
>
> Tom
>
>
>
>
> On Tue, Dec 8, 2015 at 7:10 PM, David Lowe <davidl...@nypl.org> wrote:
>
>> Hello all,
>> The Photographers' Identities Catalog (PIC) is an ongoing project of
>> visualizing photo history through the lives of photographers and photo
>> studios. I have information on 115,000 photographers and studios as of
>> tonight. It is still under construction, but as I've almost completed an
>> initial indexing of the ~12,000 photographers in WikiData, I thought I'd
>> share it with you. We (the New York Public Library) hope to launch it
>> officially in mid to late January. This represents about 12 years' worth of
>> my work of researching in NYPL's photography collection, censuses and
>> business directories, and scraping or indexing trusted websites, databases,
>> and published biographical dictionaries pertaining to photo history.
>> Again, please bear in mind that our programmer is still hard at work (and
>> I continue to refine and add to the data*), but we welcome your feedback,
>> questions, critiques, etc. To see the WikiData photographers, select
>> WikiData from the Source dropdown. Have fun!
>>
>> *PIC*
>> <http://mgiraldo.github.io/pic/?address.AddressTypeID=*&address.CountryID=*&Nationality=*&gender.TermID=*&process.TermID=*&role.TermID=*&format.TermID=*&biography.TermID=*&collection.TermID=*&Location=*&DisplayName=*&Date=*>
>>
>> Thanks,
>> David
>>
>> *Tomorrow, for instance, I'll start mining Wikidata for birth & death
>> locations.
>>
>> _______________________________________________
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>>
>