Re: [OPEN-ILS-GENERAL] Evergreen access via Google?

2015-04-09 Thread Marc Truitt

On 2015-04-09 1005, Ben Shum wrote:

That all said, I suppose one potential danger of having bots freely
scan over your site is that if they get too busy with indexing your
site's contents, they can overwhelm and cause interruptions in your
ability to use Evergreen.  This happened to us at least once before,
where some indexer in China scanned our whole catalog and tried to
index every page causing us to run out of system resources trying to
serve up all the content it was requesting.


Disclaimer:  I'm a lurker who is very interested in and hopeful for 
Evergreen's success.  Our site currently uses a proprietary ILS.


I don't know whether an Evergreen site has yet done this, but FWIW, 
another approach is to sign up for OCLC's WorldCat Local (to be 
rebranded WorldCat Discovery) service.  Among the (apparently 
less-known) features included is web-scale discovery.  OCLC makes its 
records discoverable via Google and other search services.  These in 
turn -- via WCL -- are linked in real-time to availability information 
in subscribers' ILSes.


cheers,

- mt

--
Marc Truitt
University Librarian        voice  : 506-364-2567
Mount Allison University    e-mail : mtru...@mta.ca
Libraries and Archives      fax    : 506-364-2617
49 York Street              cell   : 506-232-0503
Sackville, NB  E4L 1C6

We wanted flying cars, instead we got 140 characters.
-- Peter Thiel

  Wearing the sensible shoes proudly since 1978!



Re: [OPEN-ILS-GENERAL] Evergreen access via Google?

2015-04-09 Thread Kathy Lussier

Welcome to the list Marc!



On 04/09/2015 09:15 AM, Marc Truitt wrote:

 I don't know whether an Evergreen site has yet done this, but FWIW,
 another approach is to sign up for OCLC's WorldCat Local (to be
 rebranded WorldCat Discovery) service.  Among the (apparently
 less-known) features included is web-scale discovery.  OCLC makes its
 records discoverable via Google and other search services.  These in
 turn -- via WCL -- are linked in real-time to availability information
 in subscribers' ILSes.


That might be true, but I'm perplexed as to why a library would pay for 
this discoverability through WorldCat Local when they already get it out 
of the box with Evergreen?


Of course, there might be other reasons for signing up for WorldCat 
Local, but I think the work Dan has done has put Evergreen, Koha, and 
VuFind ahead of the pack in this area.


Kathy


--
Kathy Lussier
Project Coordinator
Massachusetts Library Network Cooperative
(508) 343-0128
kluss...@masslnc.org
Twitter: http://www.twitter.com/kmlussier



Re: [OPEN-ILS-GENERAL] Evergreen access via Google?

2015-04-09 Thread Rogan Hamby

I can't remember the year offhand, but Dan did a presentation on schema.org
(and related stuff -- highly technical term) at one of the Evergreen
conferences.  Vancouver, I think?  There might be video somewhere.




-- 

Rogan Hamby, MLS, CCNP, MIA
Managers Headquarters Library and Reference Services,
York County Library System

“You can never get a cup of tea large enough or a book long enough to suit
me.”
― C.S. Lewis http://www.goodreads.com/author/show/1069006.C_S_Lewis


Re: [OPEN-ILS-GENERAL] Evergreen access via Google?

2015-04-09 Thread Ben Shum

Hi Don,

Starting with Evergreen 2.6 (see the "structured data" section of the
2.6 release notes -
http://evergreen-ils.org/documentation/release/RELEASE_NOTES_2_6.html),
developers like Dan Scott have been adding structured data elements to
Evergreen's catalog to make its records more discoverable.  This work
has continued through newer Evergreen releases, and I'd say that Dan's
work and that of others has been essential to keeping Evergreen's
catalog friendly to search engines like Google.
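
To make "structured data" a bit more concrete, here is a minimal sketch
in Python.  It assumes the catalog page carries schema.org markup as
RDFa Lite attributes (vocab, typeof, property).  The HTML snippet, its
values, and the PropertyCollector / extract_properties names are made up
for illustration and are not copied from Evergreen's templates.

```python
from html.parser import HTMLParser

# Invented record markup in the style of schema.org RDFa Lite.
# This is an assumption for illustration, not Evergreen's real output.
SAMPLE_RECORD_HTML = """
<div vocab="http://schema.org/" typeof="Book">
  <h1 property="name">Star Trek: An Example Title</h1>
  <span property="author">Example, Author</span>
  <span property="isbn">0000000000</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collects the text of elements that carry a `property` attribute.

    Deliberately simple: it does not handle nested items or properties
    whose value lives in an attribute such as `content` or `href`.
    """

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._current = None  # property name of the element being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "typeof" in attrs:
            self.item_type = attrs["typeof"]
        self._current = attrs.get("property")

    def handle_data(self, data):
        text = data.strip()
        if self._current and text:
            self.properties[self._current] = text

    def handle_endtag(self, tag):
        self._current = None


def extract_properties(html):
    """Return (item type, {property: text}) found in the given HTML."""
    collector = PropertyCollector()
    collector.feed(html)
    return collector.item_type, collector.properties


if __name__ == "__main__":
    print(extract_properties(SAMPLE_RECORD_HTML))
```

A search engine's crawler does something similar in spirit: instead of
guessing which text on the page is the title or the author, it can read
the labelled properties directly.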

Evergreen 2.8's release notes include lots more discoverability
enhancements added with that release too:
http://evergreen-ils.org/documentation/release/RELEASE_NOTES_2_8.html#_opac

Since your site does not include a manually configured robots.txt
file, I'll point you at the example from Dan's library, Laurentian
University:  https://laurentian.concat.ca/robots.txt  (we based many
of our own changes on it).

A robots.txt file like that guides search engine bots that arrive at
the catalog towards indexing the appropriate content and skipping
the parts you don't want indexed.

By default, if you do not have anything set, search engine bots will
likely attempt to index everything in your catalog that they can
publicly access.
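
As a rough illustration of how a well-behaved bot reads those rules,
here is a small Python sketch using the standard library's
urllib.robotparser.  The rules, the paths, the catalog.example.org host
and the "ExampleBot" user agent are all assumptions made up for this
example.  They are not the contents of Laurentian's file or of ours.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, not taken from any real catalog's robots.txt.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /eg/opac/results
Disallow: /eg/opac/myopac
Allow: /eg/opac/record/
"""


def build_parser(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser, path, user_agent="ExampleBot"):
    """Ask whether the given bot may fetch the given path."""
    return parser.can_fetch(user_agent, "https://catalog.example.org" + path)


if __name__ == "__main__":
    parser = build_parser(EXAMPLE_ROBOTS_TXT)
    for path in ("/eg/opac/record/12345", "/eg/opac/results?query=star+trek"):
        print(path, is_allowed(parser, path))
    print("crawl delay:", parser.crawl_delay("ExampleBot"))
```

The idea in this sketch is to let bots reach individual record pages
while keeping them out of search result and account pages.  Keep in mind
that robots.txt is only advisory: polite bots honor it, others may not.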

Doing an example search like
https://www.google.com/#q=asbury+catalog+Star+Trek (i.e., searching
Google for the keywords "asbury catalog Star Trek"), I can already see
a couple of results that come from your Evergreen catalog records.  So
at least Google's search engine bots are already working to grab your
catalog's contents.

That all said, I suppose one potential danger of having bots freely
scan over your site is that if they get too busy with indexing your
site's contents, they can overwhelm and cause interruptions in your
ability to use Evergreen.  This happened to us at least once before,
where some indexer in China scanned our whole catalog and tried to
index every page causing us to run out of system resources trying to
serve up all the content it was requesting.
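
One way to spot that kind of runaway indexer is to count requests per
client in the web server's access log.  The Python sketch below assumes
the server writes the common Apache "combined" log format.  The sample
log lines, addresses and user agent names are invented for illustration,
and busiest_clients is a hypothetical helper, not something that ships
with Evergreen.

```python
import re
from collections import Counter

# Matches the Apache "combined" log format (an assumption about the server).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)

# Invented sample lines, not real log data.
SAMPLE_LOG = [
    '203.0.113.7 - - [09/Apr/2015:08:00:01 -0400] '
    '"GET /eg/opac/record/1 HTTP/1.1" 200 5120 "-" "ExampleSpider/1.0"',
    '203.0.113.7 - - [09/Apr/2015:08:00:01 -0400] '
    '"GET /eg/opac/record/2 HTTP/1.1" 200 5090 "-" "ExampleSpider/1.0"',
    '198.51.100.23 - - [09/Apr/2015:08:00:02 -0400] '
    '"GET /eg/opac/home HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (example browser)"',
    '203.0.113.7 - - [09/Apr/2015:08:00:02 -0400] '
    '"GET /eg/opac/record/3 HTTP/1.1" 200 5101 "-" "ExampleSpider/1.0"',
]


def busiest_clients(lines, top=3):
    """Return the `top` (ip, user agent) pairs with the most requests."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:  # silently skip lines in any other format
            counts[(match["ip"], match["agent"])] += 1
    return counts.most_common(top)


if __name__ == "__main__":
    for (ip, agent), hits in busiest_clients(SAMPLE_LOG):
        print(hits, ip, agent)
```

A client that dominates the counts is a candidate for a Crawl-delay or
Disallow rule in robots.txt, or for blocking at the web server if it
ignores those.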

For Bibliomation's catalog, I've been experimenting with modifying our
robots.txt file and continually upgrading Evergreen to pick up the
latest structured data enhancements, to make the most of what's
possible in Evergreen.  I've also done some small experiments creating
Google Custom Search Engines that search our indexed online catalog
(and requesting scheduled indexing from Google's bots) as an
alternative means of discovering the content contained in our systems.

I expect this to remain an exciting area for exploring ways to improve
the discoverability of Evergreen's content.

-- Ben

On Thu, Apr 9, 2015 at 8:15 AM, Donald Butterworth
don.butterwo...@asburyseminary.edu wrote:
 Hi everyone,

 I was asked to toss these questions out and get some perspectives.

 What would it take to make the Evergreen catalog holdings available to
 generic search engines like Google, Bing, Yahoo and DuckDuckGo? Even if it
 is doable, is it a good idea?

 The motivation behind these questions is a perception that the first attempt
 many students make to do research is through a general web search.

 Anybody have a comment?

 Don

 --
 Don Butterworth
 Faculty Associate / Librarian III
 B.L. Fisher Library
 Asbury Theological Seminary
 don.butterwo...@asburyseminary.edu
 (859) 858-2227



-- 
Benjamin Shum
Evergreen Systems Manager
Bibliomation, Inc.
24 Wooster Ave.
Waterbury, CT 06708
203-577-4070, ext. 113


Re: [OPEN-ILS-GENERAL] Evergreen access via Google?

2015-04-09 Thread Donald Butterworth
Wow Ben! Thanks for the great answer! -- Don


-- 
Don Butterworth
Faculty Associate / Librarian III
B.L. Fisher Library
Asbury Theological Seminary
don.butterwo...@asburyseminary.edu
(859) 858-2227


Re: [OPEN-ILS-GENERAL] Evergreen access via Google?

2015-04-09 Thread Ben Shum
Oh and Dan writes good stuff about his findings with structured data on his
blog. See: https://coffeecode.net/categories/22-Structured-data

I've found that to be a helpful resource in learning about how these things
work and where they might be headed someday.

-- Ben
