Re: [Dspace-tech] DSpace and Google Scholar

2014-12-07 Thread karolosoc...@interia.pl
Peter and Hilton, thank you so much! Everything works fine. Special thanks
to Peter for the good step-by-step instructions :)

Regards,

Karol



--
View this message in context: 
http://dspace.2283337.n4.nabble.com/DSpace-and-Google-Scholar-tp4675798p4675808.html
Sent from the DSpace - Tech mailing list archive at Nabble.com.

___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette


Re: [Dspace-tech] DSpace and Google Scholar

2014-12-07 Thread Peter Dietz
Hi Karol,

Once dspace.url in dspace.cfg is set to https://opub.dsw.edu.pl/, you will
probably need to restart your handle server. There's no simple way to stop
the handle service, so I usually do something like:

ps aux | grep handle

Which finds:

dspace   25609  0.0  0.0 3650848 56600 ?   Sl   2012 492:05 java
-Xmx256m -classpath :/home/nmc/lib/activation-1.1.jar:...

And then kill it:

kill 25609

Then start it back up again with:

bin/start-handle-server


The restart is needed because the handle service reads its values from
dspace.cfg but does not reload them after a config change.
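
For anyone who wants to script the "find the PID" step, here is a minimal
sketch in Python. It only parses text in the `ps aux` layout and kills
nothing. The function name find_pids and the sample process lines are made
up for illustration; on a real server you would feed it the actual output
of `ps aux`.

```python
def find_pids(ps_output, needle="handle"):
    """Return PIDs of processes whose `ps aux` line mentions `needle`."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # `ps aux` columns start with USER, PID; skip the header and junk lines.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        # Ignore the grep process itself, as one would when reading by eye.
        if needle in line and "grep" not in line:
            pids.append(int(fields[1]))
    return pids


# Invented sample output, for illustration only.
sample = (
    "USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND\n"
    "dspace   25609  0.0  0.0 3650848 56600 ?   Sl   2012 492:05 java -Xmx256m /dspace/handle-server\n"
    "dspace   31022  0.0  0.0  11744   932 pts/0 S+  10:01   0:00 grep handle\n"
)
print(find_pids(sample))  # [25609]
```

The PID it reports is the one you would pass to kill before running
bin/start-handle-server again.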


Also, one other SEO thing to do once everything is fixed: 301-redirect
requests for /jspui/* to /*, so that people don't hit broken links and
Google doesn't penalize your site for losing all of its links.
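
To make the redirect rule concrete, here is a small sketch of the path
mapping in Python. It is not a web server config, just the logic such a
rule would implement. The function name redirect_for is hypothetical, and
query strings are assumed to be carried over unchanged by whatever does the
actual redirect.

```python
def redirect_for(path, old_prefix="/jspui"):
    """Return (status, new_path) for a legacy path, or None if no redirect applies."""
    if path == old_prefix or path.startswith(old_prefix + "/"):
        # Strip the old context prefix; the bare prefix maps to the site root.
        new_path = path[len(old_prefix):] or "/"
        return 301, new_path
    return None


print(redirect_for("/jspui/handle/11479/112"))  # (301, '/handle/11479/112')
print(redirect_for("/handle/11479/112"))        # None
```

Matching on "/jspui/" rather than just "/jspui" avoids rewriting unrelated
paths that merely begin with the same letters.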


On Dec 7, 2014 6:26 AM, "karolosoc...@interia.pl" 
wrote:

> Hi Hilton,
>
> Thanks for your reply. I removed jspui from the opub.dsw.edu.pl URL in
> dspace.url and reloaded Tomcat. But when I click the URI
> http://hdl.handle.net/11479/112 I am still redirected to
> https://opub.dsw.edu.pl/jspui/handle/11479/112 . Maybe this is a problem
> in the DSpace database?
>
> Thanks,
>
> Karol

Re: [Dspace-tech] DSpace and Google Scholar

2014-12-07 Thread karolosoc...@interia.pl
Hi Hilton,

Thanks for your reply. I removed jspui from the opub.dsw.edu.pl URL in
dspace.url and reloaded Tomcat. But when I click the URI
http://hdl.handle.net/11479/112 I am still redirected to
https://opub.dsw.edu.pl/jspui/handle/11479/112 . Maybe this is a problem
in the DSpace database?

Thanks,

Karol





Re: [Dspace-tech] DSpace and Google Scholar

2014-12-07 Thread Hilton Gibson
Hi

If you are using 4.2, then perhaps this applies:
http://wiki.lib.sun.ac.za/index.php/SUNScholar/Upgrading/DSpace/Release_Notes/4.X#DSpace_base_URl_error

Cheers

hg

*Hilton Gibson*
Ubuntu Linux Systems Administrator
JS Gericke Library
Room 1025C
Stellenbosch University
Private Bag X5036
Stellenbosch
7599
South Africa

Tel: +27 21 808 4100 | Cell: +27 84 646 4758

On 7 December 2014 at 11:54, karolosoc...@interia.pl <
karolosoc...@interia.pl> wrote:

> Hi Peter,
>
> Thank you! I changed it! But I have a problem... because when I click my
> handle link, for example:
>
> https://opub.dsw.edu.pl/handle/11479/60
>
> and when I click the handle http://hdl.handle.net/11479/60 then I am
> redirected to:
> https://opub.dsw.edu.pl/jspui/handle/11479/60
>
> Maybe you know what I can do about this? Thanks so much!
>
> Karol

Re: [Dspace-tech] DSpace and Google Scholar

2014-12-07 Thread karolosoc...@interia.pl
Hi Peter,

Thank you! I changed it! But I have a problem... because when I click my
handle link, for example:

https://opub.dsw.edu.pl/handle/11479/60

and when I click the handle http://hdl.handle.net/11479/60 then I am
redirected to:
https://opub.dsw.edu.pl/jspui/handle/11479/60

Maybe you know what I can do about this? Thanks so much!

Karol






Re: [Dspace-tech] DSpace and Google Scholar

2014-12-05 Thread Peter Dietz
Hi Karol,

robots.txt can't be in a subfolder; it has to be at the root of the
subdomain. So instead of
https://opub.dsw.edu.pl/jspui/robots.txt

you need to customize your site to serve it from:
https://opub.dsw.edu.pl/robots.txt
I don't believe it can be a redirect; the file itself has to be served from
that location.
Perhaps you could just set up a Tomcat context so that your jspui is served
from the ROOT context.
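
As a quick way to see where crawlers will look, here is a small Python
sketch that derives the robots.txt location from any page URL on the site.
The helper name robots_txt_url is made up; it relies only on the standard
library's urllib.parse.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(any_page_url):
    """Return the only robots.txt URL a crawler consults for this host."""
    parts = urlsplit(any_page_url)
    # Keep scheme and host; replace path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://opub.dsw.edu.pl/jspui/handle/11479/60"))
# https://opub.dsw.edu.pl/robots.txt
```

Whatever context the webapp runs under, the result never contains /jspui,
which is why a robots.txt served only under that prefix goes unread.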

For reference, you can read the entire Google Scholar Indexing Guidelines.
http://scholar.google.com/intl/en-US/scholar/inclusion.html#overview


Peter Dietz
Longsight
www.longsight.com
pe...@longsight.com
p: 740-599-5005 x809

On Fri, Dec 5, 2014 at 7:35 AM, karolosoc...@interia.pl <
karolosoc...@interia.pl> wrote:

> Hi,
>
> I have a problem with indexing in Google Scholar. My repository is
> https://opub.dsw.edu.pl/
>
> This is my robots.txt: https://opub.dsw.edu.pl/jspui/robots.txt
> This is my sitemap: https://opub.dsw.edu.pl/jspui/htmlmap
>
> The sitemap is generated automatically at 6:00 a.m. What is wrong with my
> repository? Thanks,
>
> Karol

[Dspace-tech] DSpace and Google Scholar

2014-12-05 Thread karolosoc...@interia.pl
Hi,

I have a problem with indexing in Google Scholar. My repository is
https://opub.dsw.edu.pl/

This is my robots.txt: https://opub.dsw.edu.pl/jspui/robots.txt
This is my sitemap: https://opub.dsw.edu.pl/jspui/htmlmap

The sitemap is generated automatically at 6:00 a.m. What is wrong with my
repository? Thanks,

Karol 





[Dspace-tech] DSpace and Google Scholar updates

2014-02-03 Thread Tim Donohue
Developers, DCAT members and others,

Last week, Jonathan Markow (DuraSpace), Bram Luyten (@mire, DSpace 
Committer) and I had a brief call with the Google Scholar team (Anurag & 
Darcy) regarding DSpace. Anurag Acharya (Tech Lead) and Darcy 
Dapra (Product Manager) had emailed DuraSpace to ask for this meeting. 
We invited Bram along because of his Google Scholar coverage analysis 
work and general interest in this area.

Last year (around this same time) Anurag & Darcy had reported several 
indexing issues related to DSpace 3.x and below. All of those issues 
have now been resolved in DSpace 4.0 thanks to the hard work of our 
Committers & Developers!
* https://jira.duraspace.org/browse/DS-1481
* https://jira.duraspace.org/browse/DS-1482
* https://jira.duraspace.org/browse/DS-1483

This year's meeting seemed more positive in many ways. Anurag said he 
felt like DSpace's indexability is improving in recent releases (esp. 
based on the flexibility of the system, and the ability to customize it 
heavily).

There were no new major DSpace issues they had to report. Rather, Anurag 
mostly wanted to see if there were ways to either "make it harder" to 
mis-configure DSpace or "provide better warnings/advice/detection" with 
regards to configuring DSpace properly. Anurag specifically mentioned 
that often DSpace coverage issues in Google Scholar are caused by a 
misconfiguration of DSpace.

So, essentially there were two main ideas/brainstorms that came up in 
the discussion:

(1) Think about building an "Indexing problems detection tool/service" 
for DSpace users. This may help DSpace users detect issues more 
immediately, and hopefully help them get better coverage in GS & similar.
  * Would attempt to programmatically detect configuration issues in
    a DSpace site that would cause indexing problems (especially with GS)
  * Simple examples may be:
    * An improperly configured robots.txt
    * Sitemaps are missing/disabled
    * Missing or incorrect "citation" meta tags in HTML (which is
      what Google Scholar uses)
    * (Anurag will forward on some more specific examples he's seen)
  * Anurag feels it'd be best to make such a tool DSpace-specific, as
    it's easier to "guess" where things should be (e.g. we know the path of
    the DSpace sitemaps, we know what a good "robots.txt" looks like for
    DSpace, etc)
  * Such a tool likely would NOT need to crawl/scan an entire DSpace
    site. Rather it would just check a small sample for possible known issues.
  * Such a tool could be something users run themselves, or even
    perhaps a hosted service (off of http://dspace.org or similar) where
    users could enter their DSpace URL and get a report back.
  * It's unknown as of yet who would build this tool or what it would
    look like exactly. I'll be talking with DuraSpace and DSpace Committers
    about it. But all on the call agreed this sounds like an interesting
    idea to investigate further.
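
To give a feel for what one check in such a tool might look like, here is a
rough Python sketch of the "citation" meta tag check. It is only an
illustration of the idea, not a design for the tool. The names
CitationMetaCollector and missing_citation_tags are invented, and the list
of expected tags is an assumption made for this example; the authoritative
list is in the Google Scholar inclusion guidelines.

```python
from html.parser import HTMLParser

# Assumed set of tags to look for in this sketch (not an official list).
EXPECTED_TAGS = ("citation_title", "citation_author", "citation_pdf_url")


class CitationMetaCollector(HTMLParser):
    """Collect <meta name="citation_*" content="..."> tags from an HTML page."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if name.startswith("citation_"):
            self.found.setdefault(name, []).append(attrs.get("content") or "")


def missing_citation_tags(html, expected=EXPECTED_TAGS):
    """Return the expected tags that are absent or have empty content."""
    collector = CitationMetaCollector()
    collector.feed(html)
    return [
        tag for tag in expected
        if not any(value.strip() for value in collector.found.get(tag, []))
    ]


# Invented item page, for illustration only.
sample_page = """
<html><head>
<meta name="citation_title" content="An example item">
<meta name="citation_author" content="Doe, Jane">
</head><body></body></html>
"""
print(missing_citation_tags(sample_page))  # ['citation_pdf_url']
```

Run against a handful of item pages, a check like this would fit the
"small sample" approach described above rather than a full crawl.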

(2) Possibly improve DSpace default settings (try to enable things most 
people really should have enabled if they want good search engine 
coverage) and/or make it harder to disable some indexing related features.
  * E.g. Could we make Sitemaps enabled by default, and also
    autogenerated? (i.e. no longer require they be enabled & updated via a
    cron job) (https://jira.duraspace.org/browse/DS-1901)
  * E.g. If we make the default DC schema "read-only" (or mostly
    read-only), we could better standardize the crosswalking to the
    "citation" metatags that Google Scholar needs. Currently if someone
    removes/changes dc metadata fields, it may accidentally affect what is
    displayed in "citation" metatags. (This idea is already being suggested
    by https://jira.duraspace.org/browse/DS-1631)
  * Might also be worth reviewing our default "robots.txt" files for
    XMLUI and JSPUI.
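
For reviewing a robots.txt, here is a small Python sketch using the
standard library's urllib.robotparser. It reports which sample URLs a
crawler would be blocked from and which sitemaps the file advertises. The
function name check_robots, the example.org host, and the robots.txt
content are all invented for illustration and are not the DSpace defaults.
site_maps() needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def check_robots(robots_txt, sample_urls, agent="Googlebot"):
    """Report sample URLs blocked for `agent` and sitemaps listed in robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    blocked = [url for url in sample_urls if not parser.can_fetch(agent, url)]
    sitemaps = parser.site_maps() or []  # None when no Sitemap line is present
    return {"blocked": blocked, "sitemaps": sitemaps}


# Made-up robots.txt, for illustration only.
example_robots = """\
User-agent: *
Disallow: /discover
Sitemap: https://repository.example.org/sitemap
"""

report = check_robots(example_robots, [
    "https://repository.example.org/handle/123456789/1",
    "https://repository.example.org/discover?query=x",
])
print(report["blocked"])
print(report["sitemaps"])
```

An item page appearing in "blocked", or an empty "sitemaps" list, would be
the kind of finding worth flagging to a repository manager.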

Overall though, I felt it was good to hear mostly positive feedback from 
the Google Scholar team!

REMINDER: If you are wondering whether your DSpace instance is following 
best practices, we recommend reviewing these guidelines in our 
documentation: 
https://wiki.duraspace.org/display/DSDOC4x/Search+Engine+Optimization

Comments/Questions/Thoughts welcome.

- Tim

-- 
Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org



Re: [Dspace-tech] DSpace and Google Scholar

2013-11-11 Thread Calloni, Rodrigo
Thanks for the article, Bram.

I will share it around. We hope our repository can be as successful as 
Baylor U's, with 89% or more of our content in GS and other available tools.

It will be a challenge, as we just moved to 1.8 and are missing all the nice 
developments of more recent DSpace versions.

Best regards
Rodrigo

From: bluy...@gmail.com [mailto:bluy...@gmail.com] On Behalf Of Bram Luyten
Sent: Thursday, November 07, 2013 8:36 AM
To: Calloni, Rodrigo
Cc: dspace-tech@lists.sourceforge.net
Subject: Re: [Dspace-tech] DSpace and Google Scholar

Hi Rodrigo,

Although it's not super recent anymore, here's an interesting piece of research 
on the subject:
http://www.emeraldinsight.com/journals.htm?articleid=17020806&show=abstract

I'm adding one (poorly taken) screenshot from the paper here: a table 
ranking repositories by their indexing ratio, i.e. the number of items in the 
repository vs. the number of items actually included in Google Scholar.

Although this cannot lead to direct conclusions, note that the top repository, 
with 89% coverage, was actually a DSpace. To me, this just highlights that 
there's much more to it than merely the choice of platform.

rgds

Bram

[Inline image 1]


--
Bram Luyten @mire
2888 Loker Avenue East, Suite 315, Carlsbad, CA. 92010
Esperantolaan 4, Heverlee 3001, Belgium
www.atmire.com <http://atmire.com/website/?q=services&utm_source=emailfooter&utm_medium=email&utm_campaign=braml>


On Mon, Nov 4, 2013 at 4:23 PM, Calloni, Rodrigo <rcall...@iadb.org> wrote:
Hello

We are using DSpace 1.8 XMLUI.

I am in contact with someone at Google Scholar who mentioned that EPrints and 
bepress's Digital Commons are better integrated with Scholar than DSpace.

I wonder if you are aware of this, and what these two other IR solutions are 
doing to be more acceptable platforms for Scholar. Is it the UI?

Thanks in advance
Rodrigo


Re: [Dspace-tech] DSpace and Google Scholar

2013-11-11 Thread Calloni, Rodrigo
Thanks again Tim and Peter.

For us it is a huge challenge to keep up to date with DSpace because of all the 
customizations that the implementation team applied to our DSpace. Updating it 
has become a costly and very lengthy process. We updated from 1.6 to 1.8 this 
year, and we already knew this wasn't going to bring us up to date, but we 
were advised to go to 1.8 instead of 3.0.

One of my goals is to bring our DSpace back to a less customized version in 
the next few years. I hope we can achieve this soon so that we can take 
advantage of new developments.

Best regards
Rodrigo



Re: [Dspace-tech] DSpace and Google Scholar

2013-11-06 Thread Tim Donohue
Just to add a few notes to Peter's detailed examples...

On 11/6/2013 10:07 AM, Peter Dietz wrote:
> I would say DSpace is doing a "good" job of producing Scholar tags
> (highwire) for the most part. There are some edge cases, as mentioned
> above by others, where other systems could be doing a better job. I don't
> know enough about (EPrints / BePress) scholar support to weigh in. There
> is a config file,
> https://github.com/DSpace/DSpace/blob/master/dspace/config/crosswalks/google-metadata.properties
> that you will NEED to modify to map your custom metadata profile to
> Google Scholar (highwire) metadata fields.
>
> Citing specific examples, DSpace out of the box only supports mapping
> to the citation_pdf_url when you have only one bitstream, and it is a
> PDF, in the ORIGINAL bundle. In any other circumstance, it will punt
> and not add a citation_pdf_url.
>
> The reason is that if you have multiple PDFs, DSpace doesn't have
> enough information to know which one is the "best" PDF that contains
> your article. Or, in other cases, people use multiple bundles to store
> their content. Or, you have multiple formats available, such as Word,
> text/LaTeX, and again, DSpace can't say which one is the best. So, if
> you are deviating from the simple use case, then you'll need to
> customize the logic for determining the citation_pdf_url, likely
> altering some Java code to do so.

There's a proposed fix/change for this "citation_pdf_url" logic (which 
was requested by Google Scholar folks) for the upcoming DSpace 4.0...see 
comments in:
https://jira.duraspace.org/browse/DS-1483

(So this may be improved in DSpace 4.0...it's a proposed bug fix we are 
looking into right now.)

> Another example of something Scholar doesn't like is dc.date.issued
> being set to the date submitted (i.e. today's date, if you just
> submitted). So, if the article you just submitted was actually
> published elsewhere a few months ago, but the version you submit to your
> IR has today's date, then Scholar has conflicting information about the
> date of that article and doesn't treat them as multiple
> versions/sources of the same content. DSpace 4.0 has some changes
> regarding that, as it tries not to set date.issued to today for
> anything that you mark as previously published.

As Peter mentions, this is already improved in the upcoming DSpace 4.0 
release.  For details see the most recent comments on:
https://jira.duraspace.org/browse/DS-1481

So, these are two great examples of where Google Scholar has talked with 
us about possible improvements to DSpace (based on issues Google has run 
into) and we've taken that feedback and made fixes/improvements in the 
next release of DSpace.

It also is a great example of why it's important to stay up-to-date with 
DSpace releases, if you want the latest and greatest Google Scholar 
improvements. We are constantly tweaking DSpace for better Search Engine 
Optimization (and based on feedback directly from Google and others). 
So, if you are on an older version of DSpace, it often is not as 
"optimized" as more recent versions.

- Tim



Re: [Dspace-tech] DSpace and Google Scholar

2013-11-06 Thread Peter Dietz
I would say DSpace is doing a "good" job of producing Scholar tags
(highwire) for the most part. There are some edge cases, as mentioned above
by others, where other systems could be doing a better job. I don't know
enough about (EPrints / BePress) scholar support to weigh in. There is a
config file,
https://github.com/DSpace/DSpace/blob/master/dspace/config/crosswalks/google-metadata.properties
that you will NEED to modify to map your custom metadata profile to Google
Scholar (highwire) metadata fields.
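
To illustrate what such a mapping does, here is a rough Python sketch that
reads "tag = field | fallback field" lines and picks the first metadata
field an item actually has. This is not DSpace's own code. The "google."
key prefix and the "|" fallback syntax are written from memory of that
file's layout and should be checked against your own copy;
local.date.published is a hypothetical custom field, and the function names
are invented.

```python
def parse_mappings(properties_text):
    """Parse 'key = field | field' lines into {key: [candidate fields]}."""
    mappings = {}
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        mappings[key.strip()] = [f.strip() for f in value.split("|") if f.strip()]
    return mappings


def resolve(mappings, item_metadata):
    """For each mapped tag, use the first candidate field the item has."""
    tags = {}
    for key, candidates in mappings.items():
        for field in candidates:
            if item_metadata.get(field):
                # Drop the assumed 'google.' prefix to get the meta tag name.
                tags[key.split(".", 1)[-1]] = item_metadata[field]
                break
    return tags


# Hypothetical local mapping; verify names against your own file.
example = """
google.citation_title = dc.title
google.citation_date = local.date.published | dc.date.issued
"""
item = {"dc.title": "An example item", "dc.date.issued": "2013-05-01"}
print(resolve(parse_mappings(example), item))
# {'citation_title': 'An example item', 'citation_date': '2013-05-01'}
```

The point of the fallback list is that a custom field, if you have one, is
consulted first, and the stock Dublin Core field is used otherwise.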

Citing specific examples, DSpace out of the box only supports mapping to
the citation_pdf_url when you have only one bitstream, and it is a PDF, in
the ORIGINAL bundle. In any other circumstance, it will punt and not add a
citation_pdf_url.

The reason is that if you have multiple PDFs, DSpace doesn't have
enough information to know which one is the "best" PDF that contains your
article. Or, in other cases, people use multiple bundles to store their
content. Or, you have multiple formats available, such as Word, text/LaTeX,
and again, DSpace can't say which one is the best. So, if you are deviating
from the simple use case, then you'll need to customize the logic for
determining the citation_pdf_url, likely altering some Java code to do so.
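
The out-of-the-box rule described above can be sketched in a few lines of
Python. This is an illustration of the rule as stated here, not the actual
DSpace Java code. The function name and the data shape (bundle name mapped
to a list of filename and MIME type pairs) are assumptions, and the sketch
assumes only the ORIGINAL bundle is consulted.

```python
def choose_citation_pdf(bundles):
    """Return the filename to use for citation_pdf_url, or None to 'punt'.

    bundles: dict mapping bundle name -> list of (filename, mime_type).
    """
    originals = bundles.get("ORIGINAL", [])
    if len(originals) != 1:
        return None  # zero or several candidates: no way to pick the "best"
    name, mime_type = originals[0]
    return name if mime_type == "application/pdf" else None


print(choose_citation_pdf({"ORIGINAL": [("article.pdf", "application/pdf")]}))
# article.pdf
print(choose_citation_pdf({"ORIGINAL": [("a.pdf", "application/pdf"),
                                        ("b.pdf", "application/pdf")]}))
# None
```

A local customization would replace the "return None" branches with
whatever rule identifies the primary file in your repository.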

Another example of something Scholar doesn't like is dc.date.issued
being set to the date submitted (i.e. today's date, if you just submitted).
So, if the article you just submitted was actually published elsewhere a
few months ago, but the version you submit to your IR has today's date,
then Scholar has conflicting information about the date of that article
and doesn't treat them as multiple versions/sources of the same content.
DSpace 4.0 has some changes regarding that, as it tries not to set
date.issued to today for anything that you mark as previously published.
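
As a sketch of the behaviour described for 4.0 (my reading of it, not the
actual submission code), the decision could look like this in Python. The
function name and parameters are invented for illustration.

```python
from datetime import date


def initial_date_issued(previously_published, supplied_date=None, today=None):
    """Pick the dc.date.issued value for a new submission.

    Previously published items keep only the date the submitter supplied
    (possibly none); new works may default to today's date.
    """
    if previously_published:
        return supplied_date
    return supplied_date or today or date.today()


print(initial_date_issued(True, date(2013, 5, 1)))          # 2013-05-01
print(initial_date_issued(True))                            # None
print(initial_date_issued(False, today=date(2013, 11, 6)))  # 2013-11-06
```

Leaving the date empty for a previously published item is the safer
outcome here, since a wrong date is what confuses Scholar's matching.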

Peter Dietz


On Wed, Nov 6, 2013 at 9:50 AM, Calloni, Rodrigo  wrote:

> Thanks a lot, Tim. It is very important to know the differences as we move
> toward the best integration we can have with all search tools,
> especially Scholar.
>
> Rodrigo
>
> From: Tim Donohue [mailto:tdono...@duraspace.org]
> Sent: Tuesday, November 05, 2013 10:50 AM
> To: Calloni, Rodrigo; dspace-tech@lists.sourceforge.net
> Subject: Re: [Dspace-tech] DSpace and Google Scholar
>
> Hi Rodrigo,
>
> DuraSpace has been in contact with the Google Scholar team frequently over
> the past few years with regards to DSpace and Google Scholar. We have been
> providing feedback/requests back to DSpace developers directly from the
> Google Scholar team.
>
> So, we've been in ongoing discussions with Google Scholar around making
> DSpace more easily indexed/searched by Google Scholar.  Nearly every new
> version of DSpace includes some search engine improvements (more are coming
> in the upcoming 4.0).  Google Scholar has changed its own "best practices"
> over time (as they improve their system), and as such DSpace has been
> changing its functionality to better support these new  best practices.
>
> Because of that, it is very important to stay up-to-date with DSpace in
> order to get all of these Google Scholar enhancements.  This is another
> difference between DSpace and EPrints & bepress.  Although it's not always
> the case, EPrints and bepress often are "hosted" solutions -- meaning that
> the hosting provider keeps the software up-to-date on your behalf.
> Therefore, as EPrints and bepress make GS improvements, you'd get them
> "automatically" in your hosted system.  There are also some DSpace hosting
> options (e.g. DSpaceDirect via DuraSpace, Open Repository via BioMed
> Central, others), but most institutions run DSpace on their own servers.
> This means that, in order to see all the GS improvements in DSpace, you
> need to be sure you are upgrading the software at a relatively regular pace
> (or hiring someone to do it on your behalf).
>
> Currently, DSpace supports embedded Google Scholar metadata (in their
> recommended Highwire Press format). It is also editable, so that you can
> enhance the metadata even more based on any local metadata fields you may
> add. As Richard mentioned, another difference here is that DSpace is built
> to store *any* content you want to put into it (it need not even be
> "scholarly" in nature), which is why we have configurable Google Scholar
> metadata to support multiple use cases.  Finally, DSpace also provides
> "sitemaps" which let search engines (in general) more easily locate content
> in DSpace.
>
> Google Scholar Metadata tags:
> https://wiki.duraspace.org/display/DSDOC4x/Google+Scholar+Metadata+Mappings
> SiteMaps / SEO:
> https:

Re: [Dspace-tech] DSpace and Google Scholar

2013-11-06 Thread Calloni, Rodrigo
Thanks a lot, Tim. It is very important to know the differences as we move
forward toward the best integration we can have with all search tools,
Scholar in particular.

Rodrigo

From: Tim Donohue [mailto:tdono...@duraspace.org]
Sent: Tuesday, November 05, 2013 10:50 AM
To: Calloni, Rodrigo; dspace-tech@lists.sourceforge.net
Subject: Re: [Dspace-tech] DSpace and Google Scholar

Hi Rodrigo,

DuraSpace has been in contact with the Google Scholar team frequently over the 
past few years with regard to DSpace and Google Scholar. We have been
providing feedback/requests back to DSpace developers directly from the Google 
Scholar team.

So, we've been in ongoing discussions with Google Scholar around making DSpace 
more easily indexed/searched by Google Scholar.  Nearly every new version of 
DSpace includes some search engine improvements (more are coming in the 
upcoming 4.0).  Google Scholar has changed its own "best practices" over time 
(as they improve their system), and as such DSpace has been changing its 
functionality to better support these new best practices.

Because of that, it is very important to stay up-to-date with DSpace in order 
to get all of these Google Scholar enhancements.  This is another difference 
between DSpace and EPrints & bepress.  Although it's not always the case, 
EPrints and bepress often are "hosted" solutions -- meaning that the hosting 
provider keeps the software up-to-date on your behalf.  Therefore, as EPrints 
and bepress make GS improvements, you'd get them "automatically" in your hosted 
system.  There are also some DSpace hosting options (e.g. DSpaceDirect via 
DuraSpace, Open Repository via BioMed Central, others), but most institutions 
run DSpace on their own servers. This means that, in order to see all the GS 
improvements in DSpace, you need to be sure you are upgrading the software at a 
relatively regular pace (or hiring someone to do it on your behalf).

Currently, DSpace supports embedded Google Scholar metadata (in their
recommended Highwire Press format). It is also editable, so that you can enhance
the metadata even more based on any local metadata fields you may add. As 
Richard mentioned, another difference here is that DSpace is built to store 
*any* content you want to put into it (it need not even be "scholarly" in 
nature), which is why we have configurable Google Scholar metadata to support 
multiple use cases.  Finally, DSpace also provides "sitemaps" which let search 
engines (in general) more easily locate content in DSpace.

Google Scholar Metadata tags: 
https://wiki.duraspace.org/display/DSDOC4x/Google+Scholar+Metadata+Mappings
SiteMaps / SEO: https://wiki.duraspace.org/pages/viewpage.action?pageId=34642415

I hope this gives you a good overview of how DSpace attempts to stay up to date 
with Google Scholar and other search engine best practices.

Feel free to let us know if you have other questions,

- Tim


--
Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org

On 11/4/2013 4:23 PM, Calloni, Rodrigo wrote:
Hello

We are using DSpace 1.8 XMLUI.

I am in contact with someone at Google Scholar who mentioned that EPrints and
bepress's Digital Commons are better integrated with Scholar than DSpace.

I wonder if you are aware of this, and what these two other IR solutions are
doing to be more acceptable platforms for Scholar. Is it the UI?

Thanks in advance
Rodrigo





Re: [Dspace-tech] DSpace and Google Scholar

2013-11-06 Thread Calloni, Rodrigo
Thanks, Richard, for taking the time to respond to this. Very nice indeed.

From: Richard Rodgers [mailto:rrodg...@mit.edu]
Sent: Monday, November 04, 2013 7:43 PM
To: Calloni, Rodrigo
Cc: dspace-tech@lists.sourceforge.net
Subject: Re: [Dspace-tech] DSpace and Google Scholar

Hi Rodrigo:

Both EPrints and Digital Commons were designed (initially, at least) for
scholarly article content specifically; therefore they can make assumptions
that DSpace (where the content might be anything, like a dataset) cannot.
A good example is the Google Scholar tag 'citation_pdf_url' (from memory, might
be different): it makes assumptions about content that don't necessarily
apply, since in DSpace the bitstream can be anything, not just a PDF.

So while those services might map more naturally to what GS expects to index,
I don't think it's a matter of doing a better or worse implementation.

Having said this, we of course are always looking at ways to improve
interoperability with GS (and thus content discoverability generally), and
more work can certainly be done in this area.

My 2 cents,

Richard


On Nov 4, 2013, at 5:23 PM, Calloni, Rodrigo wrote:


Hello

We are using DSpace 1.8 XMLUI.

I am in contact with someone at Google Scholar who mentioned that EPrints and
bepress's Digital Commons are better integrated with Scholar than DSpace.

I wonder if you are aware of this, and what these two other IR solutions are
doing to be more acceptable platforms for Scholar. Is it the UI?

Thanks in advance
Rodrigo

Re: [Dspace-tech] DSpace and Google Scholar

2013-11-05 Thread Tim Donohue

  
  
Hi Rodrigo,

DuraSpace has been in contact with the Google Scholar team
frequently over the past few years with regard to DSpace and Google
Scholar. We have been providing feedback/requests back to DSpace
developers directly from the Google Scholar team.  

So, we've been in ongoing discussions with Google Scholar around
making DSpace more easily indexed/searched by Google Scholar. 
Nearly every new version of DSpace includes some search engine
improvements (more are coming in the upcoming 4.0).  Google Scholar
has changed its own "best practices" over time (as they improve
their system), and as such DSpace has been changing its
functionality to better support these new best practices.

Because of that, it is very important to stay up-to-date with DSpace
in order to get all of these Google Scholar enhancements.  This is
another difference between DSpace and EPrints & bepress. 
Although it's not always the case, EPrints and bepress often are
"hosted" solutions -- meaning that the hosting provider keeps the
software up-to-date on your behalf.  Therefore, as EPrints and
bepress make GS improvements, you'd get them "automatically" in your
hosted system.  There are also some DSpace hosting options (e.g.
DSpaceDirect via DuraSpace, Open Repository via BioMed Central,
others), but most institutions run DSpace on their own servers. This
means that, in order to see all the GS improvements in DSpace, you
need to be sure you are upgrading the software at a relatively
regular pace (or hiring someone to do it on your behalf).

Currently, DSpace supports embedded Google Scholar metadata (in
their recommended Highwire Press format). It is also editable, so that
you can enhance the metadata even more based on any local metadata
fields you may add. As Richard mentioned, another difference here is
that DSpace is built to store *any* content you want to put into it
(it need not even be "scholarly" in nature), which is why we have
configurable Google Scholar metadata to support multiple use cases. 
Finally, DSpace also provides "sitemaps" which let search engines
(in general) more easily locate content in DSpace.   

Google Scholar Metadata tags:
https://wiki.duraspace.org/display/DSDOC4x/Google+Scholar+Metadata+Mappings
SiteMaps / SEO:
https://wiki.duraspace.org/pages/viewpage.action?pageId=34642415

I hope this gives you a good overview of how DSpace attempts to stay
up to date with Google Scholar and other search engine best
practices.

Feel free to let us know if you have other questions,

- Tim
-- 
Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org
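
To make the embedded-metadata idea above concrete, here is a minimal sketch of how repository fields could be rendered as Highwire Press style meta tags in an item page. It is not DSpace's actual implementation: the mapping table, the function name render_meta_tags, and the sample item are all assumptions made up for illustration, and a real installation would use whatever mapping its own configuration defines.

```python
from html import escape

# Hypothetical mapping from DSpace-style metadata fields to Highwire Press
# tag names. This table is an assumption for illustration only.
FIELD_TO_TAG = {
    "dc.title": "citation_title",
    "dc.contributor.author": "citation_author",
    "dc.date.issued": "citation_publication_date",
}


def render_meta_tags(item):
    """Return one <meta> line per metadata value, in mapping order."""
    lines = []
    for field, tag in FIELD_TO_TAG.items():
        for value in item.get(field, []):
            # Escape the value so quotes and ampersands stay valid HTML.
            content = escape(value, quote=True)
            lines.append('<meta name="%s" content="%s">' % (tag, content))
    return lines


# Made-up sample item; repeated fields (authors) yield one tag per value.
item = {
    "dc.title": ["A Study of Repositories & Indexing"],
    "dc.contributor.author": ["Doe, Jane", "Roe, Richard"],
    "dc.date.issued": ["2013/11/05"],
}

tags = render_meta_tags(item)
for line in tags:
    print(line)
```

Because the mapping is just a table, adding a local metadata field would only mean adding one entry, which is the kind of editability described in the message.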



On 11/4/2013 4:23 PM, Calloni, Rodrigo wrote:

Hello

We are using DSpace 1.8 XMLUI.

I am in contact with someone at Google Scholar who mentioned that EPrints and
bepress's Digital Commons are better integrated with Scholar than DSpace.

I wonder if you are aware of this, and what these two other IR solutions are
doing to be more acceptable platforms for Scholar. Is it the UI?

Thanks in advance
Rodrigo

Re: [Dspace-tech] DSpace and Google Scholar

2013-11-04 Thread Richard Rodgers
Hi Rodrigo:

Both EPrints and Digital Commons were designed (initially, at least) for
scholarly article content specifically; therefore they can make assumptions
that DSpace (where the content might be anything, like a dataset) cannot.
A good example is the Google Scholar tag 'citation_pdf_url' (from memory, might
be different): it makes assumptions about content that don't necessarily
apply, since in DSpace the bitstream can be anything, not just a PDF.

So while those services might map more naturally to what GS expects to index,
I don't think it's a matter of doing a better or worse implementation.

Having said this, we of course are always looking at ways to improve
interoperability with GS (and thus content discoverability generally), and
more work can certainly be done in this area.

My 2 cents,

Richard
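
As a rough illustration of the 'citation_pdf_url' point above, a sketch of the decision a general-purpose repository has to make: emit the tag only when an item actually has a PDF among its files. The dictionary keys ("mimetype", "path"), the function name, and the example.org URLs are hypothetical and do not reflect DSpace's real data model.

```python
def citation_pdf_url(bitstreams, base_url):
    """Return the URL of the first PDF bitstream, or None if there is none."""
    for bitstream in bitstreams:
        if bitstream.get("mimetype") == "application/pdf":
            # Join base URL and path with exactly one slash between them.
            return base_url.rstrip("/") + "/" + bitstream["path"].lstrip("/")
    # No PDF present (e.g. a dataset-only item): the tag should be omitted.
    return None


# Made-up item holding a dataset and a paper.
files = [
    {"mimetype": "text/csv", "path": "bitstream/1/data.csv"},
    {"mimetype": "application/pdf", "path": "bitstream/1/paper.pdf"},
]

url = citation_pdf_url(files, "https://repo.example.org/")
if url is not None:
    print('<meta name="citation_pdf_url" content="%s">' % url)
```

A platform that only ever stores articles can skip the check and always emit the tag, which is the kind of assumption the message says DSpace cannot make.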


On Nov 4, 2013, at 5:23 PM, Calloni, Rodrigo wrote:

Hello

We are using DSpace 1.8 XMLUI.

I am in contact with someone at Google Scholar who mentioned that EPrints and
bepress's Digital Commons are better integrated with Scholar than DSpace.

I wonder if you are aware of this, and what these two other IR solutions are
doing to be more acceptable platforms for Scholar. Is it the UI?

Thanks in advance
Rodrigo

[Dspace-tech] DSpace and Google Scholar

2013-11-04 Thread Calloni, Rodrigo
Hello

We are using DSpace 1.8 XMLUI.

I am in contact with someone at Google Scholar who mentioned that EPrints and
bepress's Digital Commons are better integrated with Scholar than DSpace.

I wonder if you are aware of this, and what these two other IR solutions are
doing to be more acceptable platforms for Scholar. Is it the UI?

Thanks in advance
Rodrigo