Re: [Dspace-tech] Federating DSpace instances

2013-07-30 Thread helix84
On Tue, Jul 30, 2013 at 1:25 AM, Gary Browne gary.bro...@sydney.edu.au wrote:
 Would you mind if I sent my requirements and thoughts on how to proceed back 
 to the list to get your further input?

Please do.

 One thing I do know is that yes, Ben, I am planning to build an interface to 
 search over the aggregate collection of instances - I'd want some sort of 
 indicator for each item as to which source instance it belongs to.

One of the things you should clarify up front is whether you want to
build your own interface or reuse and adapt an existing one (another
one you can look at is DSpace/SkylightUI). Consider the guys at
Villanova University: they decided to build a thin UI on top of Solr
(VuFind) and ended up with a project used by dozens of other
institutions. Your project would have a similar initial goal. Do you
have the resources to build and maintain it?

I may have some good news regarding MARC export in a few days. I'll
let you know.


Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette


Re: [Dspace-tech] Federating DSpace instances

2013-07-29 Thread Hilton Gibson
Check out VuFind; see: http://wiki.lib.sun.ac.za/index.php/OpenCampus/Library



On 29 July 2013 07:29, Gary Browne gary.bro...@sydney.edu.au wrote:

 Hello all,

 I can't seem to find much info on this topic. I'm interested in federating
 a couple of our DSpace instances and am not sure what/who/where to ask
 about it.

 I'm looking at having one interface for a couple of DSpace instances where
 people can search and even go to items and bitstreams using this single
 interface (perhaps using REST API). I probably would not want to replicate
 objects but just harvest all metadata into one instance. How would
 authentication and authorisation work with a federated interface? And what
 would be the best method to achieve the federation itself? I probably have
 lots more questions, but just want to get an idea of whether I'm on the
 right track and where some resources for this might be available at this
 stage.

 Thanks,
 Gary


-- 
*Hilton Gibson*
Linux Systems Administrator
JS Gericke Library
Room 1025C
Stellenbosch University
Private Bag X5036
Stellenbosch
7599
South Africa

Tel: +27 21 808 4100 | Cell: +27 84 646 4758
http://library.sun.ac.za
http://scholar.sun.ac.za
http://www.journals.ac.za

Re: [Dspace-tech] Federating DSpace instances

2013-07-29 Thread helix84
Hi Gary,

This is a really interesting use case.

There are several ways to approach this, but first you should clarify
your requirements. Do you require real-time search or is harvesting
enough? By real-time search I mean that after an item is updated in
the source repository, any search on it immediately after that will
reflect the updated metadata. By harvesting I mean that the changes
will be reflected only after the next harvest.

The most straightforward solution would seem to be to set up a separate
DSpace repository that harvests the other repositories. Unfortunately,
it seems to me that the built-in OAI-PMH harvester in DSpace supports
only harvesting individual collections, i.e. not the whole repository.
I guess that's because OAI-PMH doesn't recognize the notion of
community/collection hierarchy. If you choose to go this way, you
would have to write your own harvester and make it import items into
DSpace (there are several ways to do that).
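A minimal sketch of the harvesting half of such a custom harvester, in
Python with only the standard library. It relies on the fact that the
OAI-PMH "set" argument is optional, so a ListRecords request without it
covers the whole repository. The function names and the example URL are
made up for illustration, and the import into DSpace is left out.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 XML namespace


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI + "record"):
        header = rec.find(OAI + "header")
        if header is None or header.get("status") == "deleted":
            continue  # deleted records carry no metadata
        records.append({
            "identifier": header.findtext(OAI + "identifier"),
            "datestamp": header.findtext(OAI + "datestamp"),
            "metadata": rec.find(OAI + "metadata"),
        })
    token = root.findtext(".//" + OAI + "resumptionToken")
    # A missing or empty token means this was the last page.
    return records, (token.strip() if token and token.strip() else None)


def fetch_url(url):
    """Default fetcher: plain HTTP GET, response decoded as UTF-8."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def harvest(base_url, metadata_prefix="oai_dc", fetch=fetch_url):
    """Yield every non-deleted record, following resumption tokens."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        page = fetch(base_url + "?" + urllib.parse.urlencode(params))
        records, token = parse_list_records(page)
        yield from records
        if token is None:
            break
        # A resumed request carries only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

Usage would look like `for rec in harvest("http://repo.example.org/oai/request"): ...`
(hypothetical URL). For incremental harvests, the standard `from`
argument could be added to the first request.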

Hilton suggested VuFind. I have some experience with that and at this
time I wouldn't recommend that solution, because importing content via
XSL is slow (it took me about 30 minutes per 10 000 items). The
bottleneck is not the harvesting itself (which is blazingly fast in
DSpace 3.x) but the importer in VuFind, which chops the results into
individual items and applies the XSL transformation to each item
individually. OTOH, import of MARC21 to VuFind is very fast and we
might get a MARC21 exporter in DSpace 4.0 - it's still an open
question.

Those solutions leveraged harvesting. Now for your real-time search options:

You could use Solr in DSpace directly and either use a federated
solution like MetaLib (commercial) or write your own (shouldn't be
really difficult, but you will have to take care of ranking the merged
result set). Searching speed will depend on the speed of your slowest
DSpace source.
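A rough sketch of the write-your-own option, assuming each DSpace
instance exposes its Solr search core over HTTP to the federating host.
The URLs and the source_instance / norm_score field names are made up
for illustration. Raw Solr scores from independent indexes are not
directly comparable, so this sketch divides each score by the best
score from the same source; that is only one simple ranking choice, not
the only one. The source_instance field also gives each hit the
"which instance does it come from" indicator.

```python
import json
import urllib.parse
import urllib.request


def query_solr(select_url, query, rows=10):
    """Run one query against a Solr select handler and return its docs."""
    params = urllib.parse.urlencode(
        {"q": query, "rows": rows, "fl": "*,score", "wt": "json"})
    with urllib.request.urlopen(select_url + "?" + params) as resp:
        return json.load(resp)["response"]["docs"]


def merge_results(docs_by_source):
    """Merge per-source result lists into one ranked list.

    Each score is normalised by the top score of its own source, and
    every document is tagged with the source it came from.
    """
    merged = []
    for source, docs in docs_by_source.items():
        top = max((d.get("score", 0.0) for d in docs), default=0.0)
        for d in docs:
            norm = d.get("score", 0.0) / top if top > 0 else 0.0
            merged.append(dict(d, source_instance=source, norm_score=norm))
    merged.sort(key=lambda d: d["norm_score"], reverse=True)
    return merged


def federated_search(sources, query, rows=10, search=query_solr):
    """Query every source in turn and return the top merged hits.

    `sources` maps a label to the select URL of that instance.
    """
    docs_by_source = {name: search(url, query, rows)
                      for name, url in sources.items()}
    return merge_results(docs_by_source)[:rows]
```

The sources are queried one after another here, so the total time is
the sum of all response times; querying them in parallel would bring
that down to roughly the time of the slowest source.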

REST API - I would discourage you from going this route at this time
for two reasons:
* It's not officially part of DSpace yet and may change in the future.
* It would be slower than searching Solr directly (at best, it would
be a wrapper on top of Solr)

Please clarify your requirements and feel free to ask for any details.

Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette



Re: [Dspace-tech] Federating DSpace instances

2013-07-29 Thread helix84
One more real-time option. I'm not sure if this would be easy, because
my Solr-fu is not that strong.

You could somehow channel the content of the Solr indices of all the
DSpace instances into a single Solr index and do searches on top of
that. The open question here is how this would handle updates. I
didn't do any research on that, but you could start here [1].

You might notice that this is similar to the VuFind solution, because
VuFind is just a thin UI on top of Solr. The difference is in the
index schema. The DSpace Solr (search core) and VuFind Solr (biblio
core) schemas differ, which is why VuFind needs an import step. You
could skip that step here and maybe even use a DSpace instance as the
UI (with modifications).

[1] http://wiki.apache.org/solr/MergingSolrIndexes
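
One possible way to feed several indices into a single one, sketched in
Python under several assumptions: the documents are read from each
source index by ordinary queries (so only stored fields survive), the
unique key of the DSpace search core is "search.uniqueid", the combined
schema has a "source_instance" field (a made-up name), and the target
Solr version has a JSON update handler that accepts a plain array of
documents. This is a different technique from the index merging
described in [1], and it does not propagate deletions by itself.

```python
import json
import urllib.request


def tag_documents(docs, source, id_field="search.uniqueid"):
    """Prepare documents from one instance for a shared index.

    The unique key is prefixed with the source label so that items from
    different instances cannot overwrite each other, and a
    source_instance field records where each item came from.
    """
    tagged = []
    for doc in docs:
        # Drop fields that Solr generates itself.
        copy = {k: v for k, v in doc.items()
                if k not in ("_version_", "score")}
        copy[id_field] = f"{source}:{doc[id_field]}"
        copy["source_instance"] = source
        tagged.append(copy)
    return tagged


def post_to_solr(update_url, docs):
    """POST documents to a Solr JSON update handler and commit."""
    req = urllib.request.Request(
        update_url + "?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Re-running the copy periodically would pick up additions and changes,
which makes this closer to harvesting than to real-time search.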


Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette



[Dspace-tech] Federating DSpace instances

2013-07-28 Thread Gary Browne
Hello all,

I can't seem to find much info on this topic. I'm interested in federating a 
couple of our DSpace instances and am not sure what/who/where to ask about it.

I'm looking at having one interface for a couple of DSpace instances where 
people can search and even go to items and bitstreams using this single 
interface (perhaps using REST API). I probably would not want to replicate 
objects but just harvest all metadata into one instance. How would 
authentication and authorisation work with a federated interface? And what 
would be the best method to achieve the federation itself? I probably have lots 
more questions, but just want to get an idea of whether I'm on the right track 
and where some resources for this might be available at this stage.

Thanks,
Gary

GARY BROWNE | Development Programmer 
Library IT Services | Fisher Library F03    

THE UNIVERSITY OF SYDNEY

T +61 2 9351 5946  | M +61 405 647 868  
E gary.bro...@sydney.edu.au  | W http://sydney.edu.au 
Sent from my plain old desktop computer.

CRICOS 00026A
This email plus any attachments to it are confidential. Any unauthorised use is 
strictly prohibited. If you receive this email in error, please delete it and 
any attachments.
Please think of our environment and only print this e-mail if necessary.


