Wow Ben! Thanks for the great answer!

-- Don

On Thu, Apr 9, 2015 at 8:52 AM, Ben Shum <bs...@biblio.org> wrote:
> Hi Don,
>
> Starting as recently as Evergreen 2.6 (it's noted in the Evergreen 2.6
> release notes under "structured data":
> http://evergreen-ils.org/documentation/release/RELEASE_NOTES_2_6.html),
> efforts were made by developers like Dan Scott to add structured data
> elements to Evergreen's catalog to make its contents more discoverable.
> This work has continued through newer Evergreen releases, and I'd say
> that Dan's work and that of others has been essential to keeping
> Evergreen's catalog friendly to search engines like Google.
>
> Evergreen 2.8's release notes include many more discoverability
> enhancements added with that release:
> http://evergreen-ils.org/documentation/release/RELEASE_NOTES_2_8.html#_opac
>
> Since your site does not include a manually configured robots.txt
> file, I'll point you at an example from Dan's library, Laurentian
> University: https://laurentian.concat.ca/robots.txt (we based many of
> our changes on the example they set).
>
> That robots.txt file guides search engine bots that arrive at the
> catalog toward indexing the appropriate contents and skipping over
> certain undesirable ones.
>
> By default, if you do not have anything set, search engine bots will
> likely attempt to index everything in your catalog that they can
> publicly access.
>
> Doing an example search like
> https://www.google.com/#q=asbury+catalog+Star+Trek (that is, the
> keywords "asbury catalog Star Trek" in Google), I can already see a
> couple of results that come from your Evergreen catalog records. So at
> least Google's search engine bots are already working to grab your
> catalog's contents.
>
> That all said, I suppose one potential "danger" of having bots freely
> scan your site is that if they get too busy indexing its contents,
> they can overwhelm it and interrupt your ability to use Evergreen.
> This happened to us at least once before, when some indexer in China
> scanned our whole catalog and tried to index every page, causing us to
> run out of system resources trying to serve up all the content it was
> requesting.
>
> For myself and Bibliomation's catalog, I've been experimenting with
> modifying our robots.txt file and continually upgrading our Evergreen
> catalog to pick up the latest structured data enhancements, to make
> the most of what's possible in Evergreen. I've also done some small
> experiments in creating Google Custom Search Engines to search against
> our indexed online catalog (and requesting scheduled indexing from
> Google's bots) as an alternative means of discovering the content
> contained in our systems.
>
> Moving forward, I expect this to continue to be an exciting area for
> exploring ways to improve the discoverability of Evergreen's content.
>
> -- Ben
>
> On Thu, Apr 9, 2015 at 8:15 AM, Donald Butterworth
> <don.butterwo...@asburyseminary.edu> wrote:
> > Hi everyone,
> >
> > I was asked to toss these questions out and get some perspectives.
> >
> > "What would it take to make the Evergreen catalog holdings available
> > to generic search engines like Google, Bing, Yahoo and DuckDuckGo?"
> > "Even if it is doable, is it a good idea?"
> >
> > The motivation behind these questions is a perception that the first
> > attempt many students make to do research is through a general web
> > search.
> >
> > Anybody have a comment?
> >
> > Don
> >
> > --
> > Don Butterworth
> > Faculty Associate / Librarian III
> > B.L. Fisher Library
> > Asbury Theological Seminary
> > don.butterwo...@asburyseminary.edu
> > (859) 858-2227
>
> --
> Benjamin Shum
> Evergreen Systems Manager
> Bibliomation, Inc.
> 24 Wooster Ave.
> Waterbury, CT 06708
> 203-577-4070, ext. 113

--
Don Butterworth
Faculty Associate / Librarian III
B.L. Fisher Library
Asbury Theological Seminary
don.butterwo...@asburyseminary.edu
(859) 858-2227
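
For anyone who wants to see how the robots.txt guidance Ben describes plays out, here is a minimal sketch using Python's standard urllib.robotparser module. The rules, the /eg/opac/... paths, the host catalog.example.org, and the bot name "ExampleBot" are all assumptions made up for illustration; they are not copied from Laurentian's or Bibliomation's actual robots.txt. The idea is just to let bots fetch record pages, keep them out of search results and account pages, and ask them to slow down.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, for illustration only. The paths are
# assumptions, not taken from any real Evergreen site's configuration.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /eg/opac/results
Disallow: /eg/opac/myopac
Allow: /eg/opac/record
"""


def build_parser(text):
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
base = "https://catalog.example.org"  # placeholder host

# A record page is allowed, so a well-behaved bot may index it.
record_ok = parser.can_fetch("ExampleBot", base + "/eg/opac/record/12345")

# A search results page is disallowed, so the same bot should skip it.
search_ok = parser.can_fetch("ExampleBot", base + "/eg/opac/results?query=star+trek")

# Seconds the bot is asked to wait between requests.
delay = parser.crawl_delay("ExampleBot")

print(record_ok, search_ok, delay)
```

Note that robots.txt is only advisory. Well-behaved crawlers follow it, but an aggressive indexer like the one Ben mentions may ignore it, and not every crawler honors Crawl-delay, so it is not a substitute for server-side limits.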