Google Scholar

2005-02-16 Thread Thomas Walker

As T.S.Mahadevan recently pointed out on the BOAI Forum, what those who are
searching for open archive and other scholarly literature really want is a
single website where they can search the entire set of such literature.

Google is already accounting for a significant portion of the hits on the
OA journal articles I monitor.  Might Google Scholar be that website?

===

Google Scholar (beta version online at http://scholar.google.com) restricts
Google searches to scholarly literature, including peer-reviewed papers,
theses, books, preprints, abstracts and technical reports from all fields
of research, and finds articles from a wide variety of academic publishers,
professional societies, preprint repositories and universities, as well as
scholarly articles available across the web.  Google Scholar ranks search
results by their relevance to the query, so the most useful references
should appear at the top of the page. The relevance ranking takes into
account the full text of each article as well as the article's author, the
publication in which the article appeared and how often it has been cited
in scholarly literature. Google Scholar also automatically analyzes and
extracts citations and presents them as separate results, even if the
documents they refer to are not online. This means that search results may
include citations of older works and seminal articles that appear only in
books or other offline publications. [Parts of this description taken
directly from http://scholar.google.com/scholar/about.html#about.]

===

Tom Walker




Thomas J. Walker
Department of Entomology & Nematology
PO Box 110620 (or Natural Area Drive)
University of Florida, Gainesville, FL 32611-0620
E-mail: t...@ufl.edu  (or tjwal...@ifas.ufl.edu)
FAX: (352)392-0190
Web: http://tjwalker.ifas.ufl.edu



Re: Google Scholar

2005-02-16 Thread Lee Giles

Google Scholar is outstanding, but I still feel there is a place
for specialized search in topical domains, such as CiteSeer, which
I maintain. Our community still very much likes CiteSeer but
also uses Google Scholar.

Best

Lee Giles




Re: Google Scholar

2005-02-16 Thread Heather Morrison

With all the emphasis on immediate open access, I'm wondering - how up
to date is Google Scholar?

A quick search by publication year yields the following:

2001:  62,000 items
2002:  68,600 items
2003:  63,700 items
2004:  8,060 items

While it is possible that the 2004 statistics are incomplete due to
publishing delays, this does suggest to me that there is a delay in
Google Scholar harvesting - whether of open access or
subscription-based resources, or both, is hard to say of course.  I do
think this data suggests that if there is one place to look for OA
materials at the moment, it is not Google Scholar.
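
As a rough check on the counts above, the 2004 figure can be compared with the average of the three preceding years. This is only a sketch of the arithmetic; treating 2001-2003 as a baseline is an assumption made here for illustration, and says nothing about why the 2004 count is low.

```python
# Item counts per publication year, as reported by the quick search above.
counts = {2001: 62_000, 2002: 68_600, 2003: 63_700, 2004: 8_060}

# Assumption: treat 2001-2003 as the "fully harvested" baseline.
baseline_years = (2001, 2002, 2003)
baseline = sum(counts[y] for y in baseline_years) / len(baseline_years)

# Share of the baseline that the 2004 count represents.
ratio_2004 = counts[2004] / baseline
print(f"baseline: {baseline:,.0f} items/year; 2004 is {ratio_2004:.0%} of that")
```

On these numbers, 2004 comes out at roughly an eighth of a typical earlier year.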

My own searching confirms this suspicion - I am finding that if a
needed item is not found in Google Scholar, then an open access copy
may well still be found through a regular web search.

hope this helps,

Heather Morrison





Heather G. Morrison
Project Coordinator
BC Electronic Library Network

Phone: 604-268-7001
Fax: 604-291-3023
Email:  heath...@eln.bc.ca
Web: http://www.eln.bc.ca


Re: Google Scholar

2005-02-16 Thread Hamaker, Chuck
Heather and others: In my experience with Google Scholar, the date search
can be misleading and is generally inaccurate. Blackwell's journals, for
example, were indexed by GS before they were in CINAHL when I
checked Blackwell's nursing journals and their most recent issues
against Google Scholar content. EBSCO has introduced a "Pre-CINAHL" to
speed up the identification of nursing content. I didn't
compare GS to Pre-CINAHL, but that would be a good way to check the
timeliness of coverage in GS.

In GS, several publishers and aggregators were indexed very quickly in
what I looked at in December. As far as I could tell, 100% of Extenza
content was indexed, for example, and I noted particularly rapid
indexing of a significant percentage, but not all, of Ingenta content.

The other area where currency of coverage seems pretty good is the 35
or so CrossRef publishers working with Google. For Open Access, the
links to "other" versions were particularly useful, as I found that for
some journal titles and in some subject fields, significant portions of
the content were also available on individual or institutional servers.
Bringing the original article together with the archived versions is a
unique service and, for secondary searching (i.e. if your local resources
fail to provide access to the article you need), a powerful tool.
 
Chuck Hamaker
Associate University Librarian Collections and Technical Services
Atkins Library
University of North Carolina Charlotte
Charlotte, NC 28223
phone 704 687-2825




ParaCite and Google Scholar

2004-12-29 Thread Mike Jewell

Google Scholar has now been added as a resource to ParaCite. This uses
the advanced search options, so the year, publication, and author (as
well as the title) are taken into account. It also works with the
existing OpenURL framework. For example:

To search for 'Matan, A. & Carey, S. (2001). Developmental changes
within the core of artifact concepts. Cognition, 78, 1-26'

http://paracite.eprints.org/cgi-bin/paracite.cgi?ref=Matan,%20A.%20%26%20Carey,%20%20S.%20(2001).%20Developmental%20changes%20within%20the%20core%20of%20artifact%20concepts.%20Cognition,%2078,%201-26.

[Reference parsing version]

and

http://paracite.eprints.org/cgi-bin/openurl.cgi?sid=paracite&spage=1&date=2001&aufirst=A&aulast=Matan&volume=78&title=Cognition&pages=1-26&atitle=Developmental%20changes%20within%20the%20core%20of%20artifact%20concepts&epage=26&year=2001

[OpenURL version]

are equivalent.
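
For anyone generating such links from a script, the two URL forms above could be built roughly as follows. This is a minimal sketch, not ParaCite's own code: the endpoint paths and parameter names (ref, sid, aulast, atitle, and so on) are copied from the examples above, while the helper names reference_url, openurl_url and decode_query are hypothetical. Python's encoder also escapes commas and parentheses, so the output will not match the hand-written URLs character for character, though it decodes to the same values.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Endpoint base as it appears in the example URLs above.
PARACITE_BASE = "http://paracite.eprints.org/cgi-bin"


def reference_url(citation: str) -> str:
    """Reference-parsing form: the whole citation goes in one 'ref' parameter."""
    return f"{PARACITE_BASE}/paracite.cgi?" + urlencode({"ref": citation}, quote_via=quote)


def openurl_url(fields: dict) -> str:
    """OpenURL form: one parameter per field, with sid=paracite as in the example."""
    params = {"sid": "paracite", **fields}
    return f"{PARACITE_BASE}/openurl.cgi?" + urlencode(params, quote_via=quote)


def decode_query(url: str) -> dict:
    """Decode a URL's query string back into a flat dict (handy for checking)."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


if __name__ == "__main__":
    print(reference_url(
        "Matan, A. & Carey, S. (2001). Developmental changes within "
        "the core of artifact concepts. Cognition, 78, 1-26."
    ))
    print(openurl_url({
        "aufirst": "A", "aulast": "Matan", "date": "2001", "volume": "78",
        "title": "Cognition", "spage": "1", "epage": "26",
        "atitle": "Developmental changes within the core of artifact concepts",
    }))
```

Using quote rather than the default quote_plus keeps spaces as %20, matching the style of the URLs in the post.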

Mike Jewell
Doctoral Candidate
University of Southampton


Fwd: Google/Google Scholar merge?

2008-10-16 Thread Stevan Harnad
-- Forwarded message --
From: Leslie Carr 
List-Post: goal@eprints.org
Date: Thu, 16 Oct 2008 11:05:14 +0100
Subject: Google/Google Scholar merge?
To: JISC-REPOSITORIES -- jiscmail.ac.uk

I was just using Google to search for items in repositories when I
noticed that some Google results have Google Scholar data associated
with them - author name, year of publication, number of citations and
links to the Google Scholar records.

See the following examples:
(EPrints Soton)
http://www.google.com/search?num=100&hl=en&safe=off&client=safari&rls=en-us&q=site%3Aeprints.soton.ac.uk+%22institutional+repositories%22&btnG=Search

(DSpace MIT)
http://www.google.com/search?num=100&hl=en&safe=off&client=safari&rls=en-us&q=site%3Adspace.mit.edu+%22digital+preservation%22&btnG=Search
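
Both example URLs combine a site: restriction with a quoted phrase. A minimal sketch of building that kind of query string for another repository follows; the helper name site_phrase_search_url is hypothetical, only the q, num and hl parameters from the URLs above are reproduced, and the browser-specific ones (client, rls, safe, btnG) are left out.

```python
from urllib.parse import urlencode


def site_phrase_search_url(site: str, phrase: str, num: int = 100) -> str:
    """Build a Google search URL restricted to one site and an exact phrase."""
    # The query mirrors the examples: site:<host> "<exact phrase>"
    query = f'site:{site} "{phrase}"'
    params = {"num": num, "hl": "en", "q": query}
    # urlencode's default quote_plus encoding turns spaces into '+',
    # ':' into %3A and '"' into %22, as in the URLs above.
    return "http://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    print(site_phrase_search_url("eprints.soton.ac.uk", "institutional repositories"))
    print(site_phrase_search_url("dspace.mit.edu", "digital preservation"))
```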

  I'm not aware of any announcements about this. Does anyone have any
more information?

On closer inspection, it seems that any of the versions of a paper
that Google Scholar has identified will appear with the enhanced
information - whether in a repository or on a publisher's website or
an author's home page. The author names are sometimes somewhat awry -
you will often see authors listed as "Submission R" because the paper
is listed under Recent Submissions or similar.

The vast majority of repository usage comes from Google, not Google
Scholar, and so this development is very welcome because it allows
users to see some kind of scholarly perspective on top of Google's
(and the Web's) model of individual document resources.
--
Les Carr


Re: Google/Google Scholar merge?

2008-10-16 Thread Frank McCown
I haven't seen any formal announcements, but I think this is part of
Google's larger strategy of merging results from multiple sources
(news, images, etc.) into a single results page, what they call
universal search.

http://www.google.com/intl/en/press/pressrel/universalsearch_20070516.html

Regards,
Frank





--
Frank McCown, Ph.D.
Assistant Professor of Computer Science
Harding University
http://www.harding.edu/fmccown/


Re: Google/Google Scholar merge?

2008-10-16 Thread Leslie Carr
This may be a small change in the user interface, but it is a large
step in the convergence between "green" open access resources
(repositories) and publisher resources. Now researchers will be able
to find (together, in one place) the various for-free and for-pay
manifestations of a publication, and then they can make informed
decisions about whether the preprint, author's postprint or published
version will satisfy their requirements.

Of course, they could have done that through Google Scholar, but most
researchers aren't using Google Scholar, and they would have to use
two different services for different types of information.
--
Les Carr





Re: Google/Google Scholar merge?

2008-10-17 Thread Sally Morris (Morris Associates)
Puzzled by Les's posting - Google Scholar already identifies 'green' sources
of documents, doesn't it?

Sally


Sally Morris
Consultant, Morris Associates (Publishing Consultancy)
South House, The Street
Clapham, Worthing, West Sussex BN13 3UU, UK
Tel:  +44(0)1903 871286
Fax:  +44(0)8701 202806
Email:  sa...@morris-assocs.demon.co.uk



Re: Google/Google Scholar merge?

2008-10-17 Thread Garret McMahon
Both Stephen Downes [ http://www.downes.ca/cgi-bin/page.cgi?post=45607
] and Stuart Lewis [
http://blog.stuartlewis.com/2008/08/13/google-bring-scholar-richness-into-normal-search-results/
] posted on this back in August.

Regards,

Garret



Re: Google/Google Scholar merge?

2008-10-17 Thread Tim Gray, Homerton College Library
I am sure this was covered some months ago: Google Scholar results appearing
in 'vanilla' Google (presumably this is about the same thing?). For example,
Peter Suber posted on 14 August 2008, quoting Stuart Lewis's blog
<http://www.earlham.edu/~peters/fos/2008/08/google-scholar-results-starting-to.html>:

[quote:]
Stuart Lewis, Google bring Scholar richness into normal search results,
Stuart Lewis' Blog, August 13, 2008.
"Some good news for open access repository advocates: It seems that the
normal Google search engine has now started bringing the richness of Google
Scholar results into the main Google search results. This extra information
includes:
- The (first) author's name
- Links to papers that have cited it
- Links to related articles
- Links to other versions
For me this is great news. When we go out selling repositories to academics,
one of our arguments is 'your paper will appear in Google Scholar, and other
specialist search engines such as Intute Repository Search and OAIster'.
However, if we are honest, how many people use these, and I'm including
Google Scholar in this, as their first point of call? Not many I
suspect"
[end quote]

Of course, some of the links are to subscription only sources and are mostly
inaccessible to those outside institutions that can afford the
subscriptions.

I am sure I read somewhere, long ago when Google Scholar was launched, that
it was based on a *subset* of 'vanilla' Google's data. Was this true then,
or ever true, or still true?

Tim Gray
Library Assistant
Homerton College Library




Re: Google/Google Scholar merge?

2008-10-17 Thread Leslie Carr
On 17 Oct 2008, at 09:27, Sally Morris (Morris Associates) wrote:

> Puzzled by Les's posting - Google Scholar already identifies 'green'
> sources
> of documents, doesn't it?

What I mean is that
(a) Google Scholar is a service that few people are using (just look
at the stats for repository usage)
(b) Google Scholar does a specific kind of search that returns a
specific kind of resource (a subset of the scholarly literature)
(c) it is possible that (a) and (b) are causally related

By putting the Google Scholar (and Google Books) benefits into vanilla
Google, all the knowledge about a FRBR resource is concentrated
into one place for the benefit of a very much larger audience.
--
Les


can you present google scholar in French

2009-03-06 Thread Bernard Lang
Bonjour,   (intentionally in French)

An economist colleague, Joëlle Farchy, is co-editor of a special issue
of a French sociology journal (publishing in French only). The topic
of this issue is free scientific publication (I am not sure whether
she means something other than open access).

She is interested in finding a contributor for a fairly short piece
analyzing Google Scholar, its uses or misuses, or contrasting it with
other ways of finding scholarly articles.  Also, what (algorithmic)
techniques does it use to query the web, and how does it differ from or
improve (?) upon simple use of a general-purpose search engine
(like standard Google)?

It should be about 10,000 to 20,000 characters (spaces included) and
written in French.

I thought someone in this forum might be interested in contributing,
or might know someone who is and can do it.

She can be contacted at
   Joelle FARCHY 

It seems to me that the issue she is preparing may help make
open access better known.  She has already significantly contributed
to a better understanding of open access in France, though not through
our usual channels.

Do not write back to me : I will be off-line for some time.

Thank you for helping.

Cordialement

Bernard Lang

--
  After the Internet bubble, the financial bubble ...
   And soon the patent bubble
   http://www.strategie.gouv.fr/revue/IMG/pdf/article_HS7RL2.pdf
  http://www.huffingtonpost.com/brian-kahin/the-patent-bubble_b_129232.html
disaster management as a principle of government

bernard.l...@inria.fr  ,_  /\o\o/Tel  +33 1 3963 5644
http://bat8.inria.fr/~lang/   ^  Fax  +33 1 3963 5469
   INRIA / B.P. 105 / 78153 Le Chesnay CEDEX / France
Je n'exprime que mon opinion - I express only my opinion


[GOAL] Fwd: Re: Google Scholar discoverability of repository content

2012-02-17 Thread Stevan Harnad
Important feedback from Tim Brody, one of the developers of EPrints:

Begin forwarded message:

  From: Tim Brody 
List-Post: goal@eprints.org
Date: February 17, 2012 6:33:22 AM EST
To: eprints-t...@ecs.soton.ac.uk
Cc: jisc-repositor...@jiscmail.ac.uk
Subject: [EP-tech] Re: Google Scholar discoverability of repository
content


Hi All,

Here is some specific advice for existing repository administrators from
Google Scholar:
http://roar.eprints.org/help/google_scholar.html

As far as I'm aware there isn't anyone running EPrints 2 now, so
EPrints-based repositories are already (and have been for a long time)
the "best in class" for Google Scholar.


Right, this paper ...

Table 1 is irrelevant and misleading. Scholar links first to the
publisher and, only if there is no publisher link, directly to the IR
version. That's a policy decision on the part of Scholar and nothing to
do with IRs.

Table 2 gives us some useful data. The headline rate for EPrints is 88%
(based on CalTech). Unfortunately the authors haven't provided an
analysis of what happened to the missing records. I've done a quick
random sample of CalTech and I suspect the missing records will consist
of:
1) Non-OA/non-full-text records (I'm sure a query to the CalTech
repository admin could supply the data).
2) A percentage of PDFs that Scholar won't be able to parse. CalTech
contains some old (1950s), scanned PDFs from Journals. Where the article
isn't at the top of the page Scholar will struggle to parse the
title/authors/abstract and therefore won't be able to match it to their
records e.g. http://authors.library.caltech.edu/5815/
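
The two conjectures above can be turned into a simple tally over a hand-checked sample of unindexed records. This is only a sketch: the record fields (`has_oa_fulltext`, `pdf_parseable`) and the sample data are hypothetical and are not taken from the CalTech repository.

```python
from collections import Counter


def classify_missing(record):
    """Assign a likely cause to one record that Google Scholar did not index."""
    # Conjecture 1: no open access full text is attached to the record.
    if not record["has_oa_fulltext"]:
        return "no open access full text"
    # Conjecture 2: a full text exists but its PDF cannot be parsed.
    if not record["pdf_parseable"]:
        return "PDF not parseable"
    return "unexplained"


def tally_missing(sample):
    """Count the likely causes across a sample of unindexed records."""
    return Counter(classify_missing(record) for record in sample)


# Hypothetical hand-checked sample (illustrative values only).
sample = [
    {"id": 1, "has_oa_fulltext": False, "pdf_parseable": True},
    {"id": 2, "has_oa_fulltext": True, "pdf_parseable": False},
    {"id": 3, "has_oa_fulltext": False, "pdf_parseable": False},
    {"id": 4, "has_oa_fulltext": True, "pdf_parseable": True},
]
counts = tally_missing(sample)
for cause, number in counts.items():
    print(f"{cause}: {number}")
```

Records left in the "unexplained" bucket are the ones that would merit a closer look.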


The remainder of the paper describes the authors' process of fixing
their own IR (based on CONTENTdm).


The authors then wrongly conclude:

"Despite GS’s endorsement of three software packages, the surveys
conducted for this paper demonstrates that software is not a deciding
factor for indexing ratio in GS. Each of the three recommended software
packages showed good indexing ratios for some repositories and poor
ratios for others."

The authors looked at one instance of EPrints and, despite being a
relatively old version, found 88% of its records indexed in GS.

It is unfortunate that this paper has suggested that IR software in
general is poorly indexed in GS. On the contrary, some badly implemented
IR software is poorly indexed in GS.


After all that is said, the most critical factor to IR visibility is
having (BOAI definition) open access content. Hiding content behind
search forms, click-throughs and other things that emphasise the IR at
the expense of the content will hurt your visibility.

Lastly, Google will index your metadata-only records while Google
Scholar is looking for full-texts. Your GS/Google ratio will approximate
how many of your records have an attached open access PDF (.doc etc).


Sincerely,
Tim Brody
(EPrints Developer)

On Wed, 2012-02-15 at 11:31 +0000, Stevan Harnad wrote:
  Can we enhance the google-scholar discoverability of EPrints (and
  DSpace) repositories?

  http://linksource.ebsco.com/linking.aspx?sid=google&auinit=K&aulast=Arlitsch&atitle=Invisible+Institutional+Repositories:+Addressing+the+Low+Indexing+Ratios+of+IRs+in+Google+Scholar&title=Library+Hi+Tech&volume=30&issue=1&date=2012&spage=4&issn=0737-8831

  Kenning Arlitsch, Patrick Shawn OBrien, (2012) "Invisible Institutional
  Repositories: Addressing the Low Indexing Ratios of IRs in Google
  Scholar", Library Hi Tech, Vol. 30 Iss: 1

  Purpose - Google Scholar has difficulty indexing the contents of
  institutional repositories, and the authors hypothesize the reason is
  that most repositories use Dublin Core, which cannot express
  bibliographic citation information adequately for academic papers.
  Google Scholar makes specific recommendations for repositories,
  including the use of publishing industry metadata schemas over Dublin
  Core. This paper tests a theory that transforming metadata schemas in
  institutional repositories will lead to increased indexing by Google
  Scholar.

  Design/methodology/approach - The authors conducted two surveys of
  institutional and disciplinary repositories across the United States,
  using different methodologies. They also conducted three pilot projects
  that transformed the metadata of a subset of papers from USpace, the
  University of Utah's institutional repository, and examined the results
  of Google Scholar's explicit harvests.

  Findings - Repositories that use GS recommended metadata schemas and
  express them in HTML meta tags experienced significantly higher indexing
  ratios. The eas

[GOAL] {Disarmed} Re: Google Scholar discoverability of repository content

2012-02-17 Thread Stevan Harnad
Begin forwarded message:

  From: Betsy Coles 
List-Post: goal@eprints.org
Date: February 17, 2012 5:48:42 PM EST
To: jisc-repositor...@jiscmail.ac.uk
Subject: Re: [EP-tech] Re: Google Scholar discoverability of repository
content

I'm the technical manager for the main IR at Caltech, CaltechAUTHORS
(http://authors.library.caltech.edu), currently running EPrints 3.1.3.

Tim's conjecture 1) below seems to account almost exactly for the result
the article authors found: 87.7% of the 25,072 eprints in CaltechAUTHORS
have OA documents attached; the remainder have only documents that are
either restricted to campus or to repository staff.  I don't think there
are very many cases of Tim's conjecture 2), since we have concentrated on
adding current content.

I haven't read the article in question (we don't subscribe), but the
percentage of open access eprints is almost exactly the same as the
authors' report of GS indexed items in Table 2.  I haven't tested
specifically, but it's tempting to conclude that GS is indexing 100% of
our open access content.
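
The arithmetic behind that comparison can be spelled out as a small sketch. The 87.7% and 25,072 figures are from the message above and the 88% is the headline rate quoted from Table 2; the variable names and the rounding to whole records are assumptions made for illustration.

```python
# Figures quoted in the thread.
total_eprints = 25_072      # eprints in CaltechAUTHORS
oa_share = 0.877            # share with open access documents attached
gs_indexed_share = 0.88     # headline GS indexing rate reported for EPrints

# Approximate record counts implied by the OA share (rounded to whole records).
oa_eprints = round(total_eprints * oa_share)
restricted_eprints = total_eprints - oa_eprints

# Gap between the reported indexing rate and the OA share, in percentage points.
gap_points = abs(gs_indexed_share - oa_share) * 100

print(f"open access eprints: about {oa_eprints}")
print(f"restricted eprints:  about {restricted_eprints}")
print(f"gap between GS rate and OA share: {gap_points:.1f} percentage points")
```

A gap this small is consistent with the reading that nearly all open access records are indexed, though it does not prove it record by record.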

  Betsy Coles
  Caltech Library IT Group
  bco...@caltech.edu

  -Original Message-
  From: eprints-tech-boun...@ecs.soton.ac.uk
  [mailto:eprints-tech-boun...@ecs.soton.ac.uk] On Behalf Of Tim Brody
  Sent: Friday, February 17, 2012 3:33 AM
  To: eprints-t...@ecs.soton.ac.uk
  Cc: jisc-repositor...@jiscmail.ac.uk
  Subject: [EP-tech] Re: Google Scholar discoverability of repository
  content

  Hi All,

  Here is some specific advice for existing repository administrators
  from Google Scholar:
  http://roar.eprints.org/help/google_scholar.html

  As far as I'm aware there isn't anyone running EPrints 2 now, so
  EPrints-based repositories are already (and have been for a long time)
  the "best in class" for Google Scholar.


  Right, this paper ...

  Table 1 is irrelevant and misleading. Scholar links first to the
  publisher and, only if there is no publisher link, directly to the
  IR version. That's a policy decision on the part of Scholar and
  nothing to do with IRs.

  Table 2 gives us some useful data. The headline rate for EPrints is
  88% (based on CalTech). Unfortunately the authors haven't provided
  an analysis of what happened to the missing records. I've done a
  quick random sample of CalTech and I suspect the missing records
  will consist
  of:
  1) Non-OA/non-full-text records (I'm sure a query to the CalTech
  repository admin could supply the data).
  2) A percentage of PDFs that Scholar won't be able to parse. CalTech
  contains some old (1950s), scanned PDFs from Journals. Where the
  article isn't at the top of the page Scholar will struggle to parse
  the title/authors/abstract and therefore won't be able to match it
  to their records e.g. http://authors.library.caltech.edu/5815/


  The remainder of the paper describes the authors' process of fixing
  their own IR (based on CONTENTdm).


  The authors then wrongly conclude:

  "Despite GS’s endorsement of three software packages, the surveys
  conducted for this paper demonstrates that software is not a
  deciding factor for indexing ratio in GS. Each of the three
  recommended software packages showed good indexing ratios for some
  repositories and poor ratios for others."

  The authors looked at one instance of EPrints and, despite being a
  relatively old version, found 88% of its records indexed in GS.

  It is unfortunate that this paper has suggested that IR software in
  general is poorly indexed in GS. On the contrary, some badly
  implemented IR software is poorly indexed in GS.


  After all that is said, the most critical factor to IR visibility is
  having (BOAI definition) open access content. Hiding content behind
  search forms, click-throughs and other things that emphasise the IR
  at the expense of the content will hurt your visibility.

  Lastly, Google will index your metadata-only records while Google
  Scholar is looking for full-texts. Your GS/Google ratio will
  approximate how many of your records have an attached open access
  PDF (.doc etc).


  Sincerely,
  Tim Brody
  (EPrints Developer)

  On Wed, 2012-02-15 at 11:31 +0000, Stevan Harnad wrote:
  Can we enhance the google-scholar discoverability of EPrints (and
  DSpace) repositories?

  http://linksource.ebsco.com/li

[GOAL] Re: {Disarmed} Re: Google Scholar discoverability of repository content

2012-02-20 Thread Dirk Pieper
Hi,

There were several articles by Peter Jacso in the past regarding search quality
in Google Scholar (GS), see for example

http://www.libraryjournal.com/article/CA6698580.html

The GS "inclusion guidelines" are about three years old now, so I'm wondering
about the discussion now. In the past, the number of documents from repositories
was even higher in Google than in GS! The experience with our own repositories
shows that providing GS metadata clearly increased the number of documents from
our repositories in GS, but the share is still below 50%. BASE covers repository
content much better than GS; of course GS has other qualities (citation counts,
library links, ...).
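
As an illustration of what "providing GS metadata" can look like: Google Scholar's inclusion guidelines describe bibliographic `<meta>` tags such as `citation_title`, `citation_author`, `citation_publication_date` and `citation_pdf_url`. The following is a minimal sketch that renders such tags for one record. The function name and the PDF URL are hypothetical, and a real repository would generate the tags from its own records.

```python
from html import escape


def scholar_meta_tags(title, authors, publication_date, pdf_url):
    """Render bibliographic metadata as HTML <meta> tags, one tag per line."""
    pairs = [("citation_title", title)]
    # One citation_author tag per author, in order.
    pairs += [("citation_author", author) for author in authors]
    pairs.append(("citation_publication_date", publication_date))
    pairs.append(("citation_pdf_url", pdf_url))
    return "\n".join(
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in pairs
    )


# Example record: the paper discussed in this thread; the PDF URL is made up.
tags = scholar_meta_tags(
    title=("Invisible Institutional Repositories: Addressing the Low "
           "Indexing Ratios of IRs in Google Scholar"),
    authors=["Kenning Arlitsch", "Patrick Shawn OBrien"],
    publication_date="2012",
    pdf_url="https://repository.example.org/123/1/paper.pdf",
)
print(tags)
```

Escaping the values keeps titles that contain quotes or ampersands from breaking the page markup.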

From my point of view the interesting question is whether GS uses the GS metadata
only to get the fulltext more easily from a repository, or whether GS uses the
metadata in addition to the fulltext in order to improve the search quality within
GS. The other question is how much the Google/GS ratio of documents from
repositories has changed in recent years.

Best
Dirk

--
Dirk Pieper
Bielefeld UL - BASE
Universitätsstr. 25, D-33615 Bielefeld
E-mail: dirk.pie...@uni-bielefeld.de | Tel.: +49 521 106-4010
Fax: +49 521 106-4052

www.ub.uni-bielefeld.de
www.base-search.net
--


+++ Welcome to the 10th International Bielefeld Conference,
24. - 26. April 2012,
http://conference.ub.uni-bielefeld.de +++



[GOAL] WoS, SCOPUS, Google Scholar and finding OA papers and their proportion

2015-11-29 Thread Stevan Harnad
In “Web of Science, Scopus, and Open Access: What they are doing right and
what they are doing wrong
<https://awayofhappening.wordpress.com/2015/11/27/web-of-science-scopus-and-open-access-what-they-are-doing-right-and-what-they-are-doing-wrong/>”
Ryan Regier discusses the current capacities and limitations
of WoS, SCOPUS and Google Scholar in finding OA papers and their proportions
(OA/total). Most of the discussion is about Gold OA, but Regier notes that
GS can be used for Green OA, though inefficiently.

I would add that the way to find just about all OA articles, and to
calculate the proportion of a university’s total articles that are OA, is
not to seek (1) them or (2) their proportion in WoS or SCOPUS. That way,
the only OA articles you’ll find are the Gold OA ones, and their
proportion.

Yes, Google Scholar (GS) is the way an individual researcher can find OA
articles on a particular topic, and yes, the search, as well as the
calculation of the proportion, has to be done by hand (to see which hits
have an OA version). This is much more useful than WoS or SCOPUS, because
it covers Green OA too, but it requires a lot of manual work that could be
reduced as soon as GS does a little tweaking of data and metadata it
already has (author name, institution, pub date), even to an approximation.

Already (to a very crude approximation) I can get all the GS articles on
“slender loris” (3200), narrow them down to 2014-2015 (198), or to (“slender
loris” “university of illinois”) (42), or to (“slender loris” “university of
illinois”) 2014-2015 (2).

Combining WoS or SCOPUS data and GS I could also get an approximate
estimate of OA/total output, for an individual university, per year,
without reaching the GS robot limit for an institution.
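
The manual procedure described above can be sketched in two small helpers: one that composes the quoted-phrase query to paste into GS, and one that turns hand-checked hits into an OA proportion. Both function names and the sample of checked hits are hypothetical; nothing here queries GS automatically.

```python
def scholar_query(topic, institution=None):
    """Compose a quoted-phrase query string to paste into the GS search box."""
    parts = [f'"{topic}"']
    if institution:
        parts.append(f'"{institution}"')
    return " ".join(parts)


def oa_proportion(checked_hits):
    """Share of hand-checked hits for which an OA version was found.

    checked_hits is a list of booleans, one per hit inspected by hand.
    """
    if not checked_hits:
        raise ValueError("no hits were checked")
    return sum(checked_hits) / len(checked_hits)


query = scholar_query("slender loris", "university of illinois")
print(query)  # "slender loris" "university of illinois"

# Hypothetical result of checking four hits by hand (True = OA version found).
checked = [True, False, True, True]
print(f"OA proportion in this sample: {oa_proportion(checked):.0%}")
```

The date restriction (2014-2015 in the example) is applied through the GS interface itself and is left out of the query string here.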

Tedious, inefficient, and very approximate, admittedly, but a taste of
what’s to come (and what GS can and will make much easier and more
efficient), once universities and funders do their part, which is to adopt
strong, effective Green OA mandates.

Vincent-Lamarre, Philippe, Boivin, Jade, Gargouri, Yassine, Larivière,
Vincent and Harnad, Stevan (2016) Estimating Open Access Mandate
Effectiveness: The MELIBEA Score <http://eprints.soton.ac.uk/370203/>. Journal
of the Association for Information Science and Technology (JASIST) (in
press)
___
GOAL mailing list
GOAL@eprints.org
http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal


[GOAL] Re: WoS, SCOPUS, Google Scholar and finding OA papers and their proportion

2015-11-30 Thread Dirk Pieper
In order to get as complete a picture as possible we are using WoS (we can't
afford Scopus in addition), our repository and publisher APIs. We know it's
not perfect, but it's the best we think we can do now. The repository becomes
especially useful when you manage to get into it those publications that have
not been indexed in WoS or that appeared with publishers that don't offer APIs.

Best,

Dirk 

Am 29.11.15 14:35 schrieb Stevan Harnad  :
> 
> 
> In “Web of Science, Scopus, and Open Access: What they are doing right and 
> what they are doing 
> wrong (https://awayofhappening.wordpress.com/2015/11/27/web-of-science-scopus-and-open-access-what-they-are-doing-right-and-what-they-are-doing-wrong/)”
>  Ryan Regier discusses the current capacities and limitations of WoS, SCOPUS, 
> and Google Scholar in finding OA papers and their proportions (OA/total). Most 
> of the discussion is about Gold OA, but Regier notes that GS can be used for 
> Green OA, though inefficiently.
> 
> I would add that the way to find just about all OA articles and to calculate 
> the proportion of a university’s total articles that are OA is not to (1) 
> seek them or (2) their proportion in WoS or SCOPUS. That way, the only OA 
> articles you’ll find are the Gold OA ones, and their proportion. 
> 
> Yes, google scholar (GS) is the way an individual researcher can find OA 
> articles on a particular topic, and yes the search, as well as the 
> calculation of the proportion has to be done by hand (to see which hits have 
> an OA version). This is much more useful than WoS or SCOPUS, because it 
> covers Green OA too, but it requires a lot of manual work that could be 
> reduced as soon as GS does a little tweaking of data and metadata it already 
> has (author name, institution, pub date), even to an approximation. 
> 
> Already (to a very crude approximation) I can get all the GS articles on 
> “slender loris” (3200) narrow it down to 2014-2015 (198) or to (“slender 
> loris” “university of illinois”) (42) or to (“slender loris” “university of 
> illinois”) 2014-2015 (2).
> 
> Combining WoS or SCOPUS data and GS I could also get an approximate estimate 
> of OA/total output, for an individual university, per year, without reaching 
> the GS robot limit for an institution.
> 
> Tedious, inefficient, and very approximate, admittedly, but a taste of what’s 
> to come (and what GS can and will make much easier and more efficient) — once 
> universities and funders do their part, which is to adopt strong, effective 
> Green OA mandates.
> 
> Vincent-Lamarre, Philippe, Boivin, Jade, Gargouri, Yassine, Larivière, 
> Vincent and Harnad, Stevan (2016) Estimating Open Access Mandate 
> Effectiveness: The MELIBEA Score (http://eprints.soton.ac.uk/370203/). Journal 
> of the Association for Information Science and Technology (JASIST) (in press)


Google Scholar Can Now Focus Boolean Search on the Articles Citing a Given Article

2010-07-03 Thread Stevan Harnad
In the world of journal articles, each article is both a "citing" item
and a "cited" item. The list of references a given article cites
provides that article's outgoing citations. And all the other articles
in whose reference lists that article is cited provide that article's
incoming citations.

Formerly, with Google Scholar (1) you could do a google-like boolean
(and, or, not, etc.) word search, which ranked the articles that it
retrieved by how highly cited they were. Then, for any individual
article in that ranked list, (2) you could go on to retrieve all the
articles citing that individual cited article, again ranked by how
highly cited they were. But you could not go on to do a boolean word
search within just that set of citing articles; as of July 1 you can:
http://j.mp/c74RQs (Thanks to Joseph Esposito for pointing this out on
liblicense.)
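
A toy sketch of the idea, assuming a tiny in-memory citation graph. The four articles, their texts, and the function names are hypothetical, and this is not how Google Scholar itself is implemented; it only illustrates incoming citations plus a boolean word filter restricted to the citing set.

```python
import re

# Hypothetical articles: each has some text and a list of outgoing citations.
articles = {
    "A": {"text": "open access citation impact", "cites": []},
    "B": {"text": "citation impact of self-archiving mandates", "cites": ["A"]},
    "C": {"text": "boolean search in digital libraries", "cites": ["A"]},
    "D": {"text": "open access mandates and impact", "cites": ["A", "B"]},
}


def citing(article_id):
    """Incoming citations: every article whose reference list names article_id."""
    return {aid for aid, art in articles.items() if article_id in art["cites"]}


def citation_count(article_id):
    return len(citing(article_id))


def search_within_citing(article_id, all_of=(), none_of=()):
    """Boolean AND/NOT word search restricted to the articles citing article_id.

    Results are ranked by how highly cited they are, ties broken by id.
    """
    hits = []
    for aid in citing(article_id):
        words = set(re.findall(r"[\w-]+", articles[aid]["text"].lower()))
        if all(w in words for w in all_of) and not any(w in words for w in none_of):
            hits.append(aid)
    return sorted(hits, key=lambda a: (-citation_count(a), a))


print(sorted(citing("A")))
print(search_within_citing("A", all_of=("impact",), none_of=("boolean",)))
```

Step (2) above corresponds to `citing`, and the new feature corresponds to `search_within_citing`, which applies the word query only inside that citing set.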

Of course Google Scholar is a potential scientometric killer-app that
is just waiting to design and display powers far, far greater and
richer than even these. Only two things are holding it back: (a) the
sparse Open Access content of the web to date (only about 20% of
articles published annually) and (b) the sleepiness of google, in not
realizing what a potentially rich scientometric resource and tool
they have in their hands.

Citebase http://www.citebase.org/search gives a foretaste of some more
of the latent power of an Open Access impact and influence engine (so
does citeseerx http://citeseerx.ist.psu.edu/ ), but even that is pale
by comparison with what is still to come -- if only Green OA
self-archiving mandates by the world's universities, the providers of
all the missing content, hurry up and get adopted so they can be
implemented and hence *all* the target content for these impending
marvels (not just 20% of it) can begin being reliably provided at long
last.

(SCOPUS and Thomson-Reuters Web of Science are of course likewise
standing by, ready to upgrade their services so as to point also to
the OA versions of the content they index -- if only we hurry up and
make it OA!)

Harnad, S. (2001) Research access, impact and assessment. Times Higher
Education Supplement 1487: p. 16.  http://cogprints.org/1683/

Brody, T., Kampa, S., Harnad, S., Carr, L. and Hitchcock, S. (2003)
Digitometric Services for Open Archives Environments. In Proceedings
of European Conference on Digital Libraries 2003, pp. 207-220,
Trondheim, Norway. http://eprints.ecs.soton.ac.uk/7503/

Hitchcock, Steve; Woukeu, Arouna; Brody, Tim; Carr, Les; Hall, Wendy &
Harnad, Stevan. (2003) Evaluating Citebase, an open access Web-based
citation-ranked search and impact discovery service
http://eprints.ecs.soton.ac.uk/8204/

Harnad, Stevan (2003)  Maximizing Research Impact by Maximizing Online
Access. In: Law, Derek & Judith Andrews, Eds. Digital Libraries:
Policy Planning and Practice. Ashgate Publishing 2003.
http://cogprints.org/1639/

Harnad, S. (2006) Online, Continuous, Metrics-Based Research
Assessment. Technical Report, ECS, University of Southampton.
http://eprints.ecs.soton.ac.uk/12130/

Brody, T., Carr, L., Harnad, S. and Swan, A. (2007) Time to Convert to
Metrics. Research Fortnight pp. 17-18.
http://eprints.ecs.soton.ac.uk/14329/

Brody, T., Carr, L., Gingras, Y., Hajjem, C., Harnad, S. and Swan, A.
(2007) Incentivizing the Open Access Research Web:
Publication-Archiving, Data-Archiving and Scientometrics. CTWatch
Quarterly 3(3). http://eprints.ecs.soton.ac.uk/14418/

Harnad, S. (2008) Validating Research Performance Metrics Against Peer
Rankings. Ethics in Science and Environmental Politics 8 (11)
doi:10.3354/esep00088  The Use And Misuse Of Bibliometric Indices In
Evaluating Scholarly Performance
http://eprints.ecs.soton.ac.uk/15619/

Harnad, S., Carr, L. and Gingras, Y. (2008) Maximizing Research
Progress Through Open Access Mandates and Metrics. Liinc em Revista
4(2). http://eprints.ecs.soton.ac.uk/16617/

Harnad, S. (2009) The PostGutenberg Open Access Journal. In: Cope, B.
& Phillips, A (Eds.) The Future of the Academic Journal. Chandos.
http://eprints.ecs.soton.ac.uk/15617/

Harnad, S. (2009) Open Access Scientometrics and the UK Research
Assessment Exercise. Scientometrics 79 (1)


-- Forwarded message --
From: Joseph Esposito espositoj -- gmail.com
List-Post: goal@eprints.org
Date: Fri, Jul 2, 2010 at 11:14 PM
Subject: New feature in Google Scholar
To:  

Google Scholar now lets you see how an article was cited:

http://j.mp/c74RQs

Joe Esposito


[GOAL] Google Scholar Profiles of the Open Access Movement in India & Digital Library Initiatives in India

2015-09-03 Thread anup kumar das
Glad to inform you that the *Google Scholar Profile* (with citation indices,
H-index and i10-index) of *Open Access India* (about the open access
movement in India and South Asia) is now available online
<http://scholar.google.co.in/citations?user=vsgjnxMJ>.


   - *Google Scholar Profile
   <http://scholar.google.co.in/citations?user=vsgjnxMJ> *of the *Open
   Access India* (about open access movement in India)*. *
   http://scholar.google.co.in/citations?user=vsgjnxMAAAAJ


   - *Google Scholar Profile
   <http://scholar.google.co.in/citations?user=3QssQpYJ> *of the *Digital
   Libraries in India* (about digital libraries and digitization
   initiatives in India)*. *
   http://scholar.google.co.in/citations?user=3QssQpYJ