Sophie,

I'll admit, I'm not really an OAI-PMH expert. But, at a glance, I don't 
think PubMed's OAI service (which is run by PubMed Central) provides any 
way to "filter" by an institution.

This is based on looking at the OAI examples: 
http://www.ncbi.nlm.nih.gov/pmc/tools/oai_examples/

On that page, it looks like PubMed provides "Sets", but those Sets are 
publication-based (so you can filter by a specific publication, but not 
by an institution).
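
If you want to check the available Sets yourself, here is a minimal 
Python sketch, assuming the standard OAI-PMH "ListSets" verb and the base 
URL from the PubMed page. The SAMPLE response is hand-written for 
illustration (the set name in it is a placeholder, not PubMed's real 
value), and the function names are my own:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base URL as given on the PubMed OAI page.
BASE_URL = "http://www.pubmedcentral.nih.gov/oai/oai.cgi"
# Standard OAI-PMH 2.0 namespace.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written, illustrative ListSets response (NOT real PubMed output).
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set>
      <setSpec>aac</setSpec>
      <setName>Example journal title</setName>
    </set>
  </ListSets>
</OAI-PMH>
"""


def list_sets_url(base_url=BASE_URL):
    """Build the request URL that asks a provider for its Sets."""
    return base_url + "?" + urlencode({"verb": "ListSets"})


def parse_sets(xml_text):
    """Return (setSpec, setName) pairs from a ListSets response."""
    root = ET.fromstring(xml_text.strip())
    return [
        (
            s.findtext("oai:setSpec", namespaces=OAI_NS),
            s.findtext("oai:setName", namespaces=OAI_NS),
        )
        for s in root.iterfind(".//oai:set", OAI_NS)
    ]


print(list_sets_url())
print(parse_sets(SAMPLE))
```

To run it against the real service, you would fetch list_sets_url() 
(for example with urllib.request.urlopen) and pass the response text to 
parse_sets(). A long list may be split across several responses, which 
this sketch does not handle.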

The other issue is that if you look at some sample "oai_dc" records, I 
don't even see an institutional affiliation listed for authors.

http://www.pubmedcentral.gov/oai/oai.cgi?verb=GetRecord&identifier=oai:pubmedcentral.nih.gov:152494&metadataPrefix=oai_dc
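
Here is a rough Python sketch of how you could list which Dublin Core 
fields a record actually carries. The SAMPLE_RECORD below is made up for 
illustration (it is not the real PubMed record), and dc_fields is a 
helper name I invented. Simple Dublin Core only defines 15 elements, and 
none of them is an affiliation field, so I'd expect the same result for 
any "oai_dc" record:

```python
import xml.etree.ElementTree as ET

# Namespace of the 15 simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Made-up oai_dc payload for illustration (NOT copied from PubMed).
SAMPLE_RECORD = """
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example article title</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:creator>Roe, Richard</dc:creator>
  <dc:publisher>Example Publisher</dc:publisher>
  <dc:date>2003-01-01</dc:date>
</oai_dc:dc>
"""


def dc_fields(xml_text):
    """Map each Dublin Core element name to the list of its values."""
    root = ET.fromstring(xml_text.strip())
    prefix = "{" + DC_NS + "}"
    fields = {}
    for el in root.iter():
        if el.tag.startswith(prefix):
            name = el.tag[len(prefix):]
            fields.setdefault(name, []).append((el.text or "").strip())
    return fields


fields = dc_fields(SAMPLE_RECORD)
print(sorted(fields))             # which DC elements are present
print("affiliation" in fields)    # there is no such DC element
```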

So, I'm really not sure that you'd be able to filter by a specific 
institution.
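
For what it's worth, the OAI-PMH protocol itself only defines selective 
harvesting by Set and by date range ("from" / "until"), so a request can 
only be narrowed along those two lines. A small Python sketch of building 
such a request, assuming the base URL and the "aac" set from your test 
(the function name and the dates are just examples of mine):

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     set_spec=None, from_date=None, until_date=None):
    """Build an OAI-PMH ListRecords URL with optional set/date filters."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec        # restrict to one Set
    if from_date:
        params["from"] = from_date      # earliest datestamp (YYYY-MM-DD)
    if until_date:
        params["until"] = until_date    # latest datestamp (YYYY-MM-DD)
    return base_url + "?" + urlencode(params)


url = list_records_url(
    "http://www.pubmedcentral.nih.gov/oai/oai.cgi",
    set_spec="aac",
    from_date="2012-01-01",
    until_date="2012-05-22",
)
print(url)
```

There is no request parameter for an institution, so any filtering by 
affiliation would have to happen on your side after harvesting, and only 
if the metadata format you harvest actually includes that information.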

You could also get in touch with PubMed to ask them directly -- there's 
an email contact at the bottom of this page: 
http://www.ncbi.nlm.nih.gov/pmc/tools/oai/

- Tim


On 5/23/2012 11:02 AM, Deng, Sai wrote:
> Thank you, Tim!!
> There is no way to harvest for a specific institution (publications 
> affiliated with one institution), right? I looked at the PubMed instructions 
> and it looks like their sets are defined by publisher.
> Here is my testing result:
> OAI Provider: http://www.pubmedcentral.nih.gov/oai/oai.cgi
> OAI Set id: aac              (*This is the aac set. There seems to be no way 
> to harvest for an institution.)
> Metadata format: Simple Dublin Core
> Content being harvested: Harvest metadata only.
> Last Harvest Result: Harvest from 
> http://www.pubmedcentral.nih.gov/oai/oai.cgi successful on 2012-05-22 
> 16:18:43.386
>
> I am wondering whether the harvesting interface could include more options. 
> Is it possible to harvest only data from one university?
> Vika from Boston posted these two questions before:
> - Where in the harvesting settings can I specify things like the dates
> from/until which I want to harvest?
>
> - If my collection has an accept/reject step, is there a way to make harvested
> items completely invisible until they are accepted?  Does the answer to this
> depend on whether I'm harvesting only metadata, or also full-text files?
>
> Thank you for any insight!
> Sophie
>
>
> -----Original Message-----
> From: Tim Donohue [mailto:tdono...@duraspace.org]
> Sent: Wednesday, May 23, 2012 10:58 AM
> To: Deng, Sai
> Cc: dspace-tech@lists.sourceforge.net
> Subject: Re: [Dspace-tech] Harvesting PubMed
>
> Sophie,
>
> I forgot to include the link to the DSpace documentation on how to harvest 
> external content via OAI-PMH:
>
> https://wiki.duraspace.org/display/DSDOC18/XMLUI+Configuration+and+Customization#XMLUIConfigurationandCustomization-HarvestingItemsfromXMLUIviaOAIOREorOAIPMH
>
> That may also be helpful!
>
> - Tim
>
> On 5/23/2012 10:55 AM, Tim Donohue wrote:
>> Sophie,
>>
>> I've never tried this before, but it looks like PubMed supports
>> OAI-PMH harvesting, so you should be able to configure DSpace to
>> harvest content from PubMed via OAI-PMH. Here are the details from the 
>> PubMed website:
>>
>> http://www.ncbi.nlm.nih.gov/pmc/tools/oai/
>>
>> It says the base URL you'd want to use is
>> http://www.pubmedcentral.nih.gov/oai/oai.cgi
>>
>> It also has some examples of OAI requests to PubMed:
>> http://www.ncbi.nlm.nih.gov/pmc/tools/oai_examples/
>>
>> Hopefully that will help you out.
>>
>> - Tim
>>
>> On 5/22/2012 3:47 PM, Deng, Sai wrote:
>>> Hi,
>>>
>>> Can anyone give me an example of harvesting PubMed publications from
>>> a specific institution? In other words, could you show me how to
>>> configure the auto harvesting under "Collection-Harvesting-Content
>>> Source":
>>> Content source: This collection harvests its content from an external
>>> source
>>> OAI Provider: ______________________
>>> OAI Set id: Specific sets _____________
>>> Metadata Format: Simple Dublin Core [or] DSpace Intermediate Metadata
>>>
>>> Content being harvested: Harvest metadata and bitstreams (requires ORE
>>> support)
>>>
>>> We've been downloading XML data directly from the PubMed website and
>>> transforming it to DCXML using a local VBScript. Then we export the
>>> DCXML file to Excel and transform the Excel file to SIP packages using
>>> BloomaMohan's program. We add several additional fields to the data
>>> set and do quite a bit of editing in the Excel file. I have been
>>> wondering whether auto harvesting would be a much better option.
>>> Any opinions or suggestions? What's your experience?
>>>
>>> Thank you for your reply!
>>> Sophie
>>>
>>> ---------------------------------------------------------------------
>>> ---------
>>>
>>> Live Security Virtual Conference
>>> Exclusive live event will cover all the ways today's security and
>>> threat landscape has changed and how IT managers can respond.
>>> Discussions will include endpoint security, mobile security and the
>>> latest in malware threats.
>>> http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>>> _______________________________________________
>>> DSpace-tech mailing list
>>> DSpace-tech@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
