[ 
https://jira.duraspace.org/browse/DS-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=27690#comment-27690
 ] 

Tim Donohue commented on DS-1482:
---------------------------------

A brief follow-up to Anurag's points (in the previous comment).

Our wiki already makes recommendations similar to his, at:
https://wiki.duraspace.org/display/DSPACE/Ensuring+your+instance+is+indexed
(We also embed an invisible link to the HTML sitemaps in JSPUI and in our 
various XMLUI themes.)

However, he does make a good point: we currently have no way to enable 
sitemaps by default, because they are generated/refreshed by a recommended 
cron job.  So, even though Google Scholar can index the sitemaps, they often 
are not enabled, and the Scholar crawler cannot really depend on them.

So, there may be a couple of options here:
(1) Look into whether we can auto-update sitemaps (perhaps via a new event 
consumer or similar) so that Google / Google Scholar can use those.
AND/OR
(2) Potentially add a way to browse content by the date it was added (this may 
even be useful / interesting to repo managers as a sort of "report" of recently 
added content).
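
To make the two options a bit more concrete, here is a rough sketch (in Python 
rather than DSpace's Java, purely for illustration) of what a "recently added" 
sitemap could contain: items sorted newest-first by accession date, each with 
a <lastmod> entry as defined by the sitemaps.org protocol.  The function name 
build_sitemap, the limit parameter, and the example.org handle URLs are all 
hypothetical; none of this is an existing DSpace API.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(items, limit=None):
    """Return sitemap XML for (url, accessioned) pairs, newest first.

    `items` is an iterable of (url, datetime) tuples; the datetime stands in
    for dc.date.accessioned.  `limit` optionally caps the number of entries.
    Hypothetical helper for illustration only.
    """
    newest_first = sorted(items, key=lambda item: item[1], reverse=True)
    if limit is not None:
        newest_first = newest_first[:limit]

    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, accessioned in newest_first:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # W3C date format (YYYY-MM-DD) is accepted for <lastmod>.
        ET.SubElement(entry, "lastmod").text = accessioned.strftime("%Y-%m-%d")
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Made-up items; the handles are placeholders.
    sample = [
        ("http://example.org/handle/123456789/1", datetime(2013, 1, 1)),
        ("http://example.org/handle/123456789/2", datetime(2013, 2, 1)),
    ]
    print(build_sitemap(sample))
```

An event consumer (option 1) could in principle regenerate output like this 
whenever an item is added, and the same newest-first ordering is what a 
"browse by date added" page (option 2) would display.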
                
> Add a way for harvesters to find recently added items (request from Google)
> ---------------------------------------------------------------------------
>
>                 Key: DS-1482
>                 URL: https://jira.duraspace.org/browse/DS-1482
>             Project: DSpace
>          Issue Type: New Feature
>            Reporter: Tim Donohue
>
> This request came out of a discussion I had with Anurag Acharya and Darcy 
> Darpa at Google / Google Scholar.
> Anurag mentioned that often the Google harvesters seem to need to do a lot of 
> "paging / clicking" in order to find new items in a DSpace instance.  This 
> can cause both a performance hit in DSpace (as the crawler keeps requesting 
> pages), and also can result in delays where items may not appear in Google 
> for some time (if the crawler gives up or moves on before it ever finds the 
> item).
> Anurag mentioned that it'd be much easier (both on DSpace performance and on 
> the Google crawlers), if DSpace provided some way to easily locate recently 
> added items.  
> This could be something like a "Browse Recently Added Items" (i.e. browse by 
> dc.date.accessioned), or similar.  It was noted that EPrints has such a 
> feature (called "Latest Additions").  For example, see their demo site:
> http://demoprints.eprints.org/cgi/latest
> It's also worth noting this could be as simple as adding a "More..." 
> option to our existing "Recently Added" list (of 5 items), so that you can 
> see other recently added items.

