From my investigations into DSpace, the key element that I would like to
de-couple from DSpace is Solr.

Say you were going to build a new frontend to DSpace that relied heavily on
the DSpace REST API. You could have multiple servers, each running Tomcat
with the DSpace REST API deployed, and nginx in front of them proxying /
load balancing. No problems, especially since Postgres is an external
service (RDS) and the assetstore is located outside of DSpace (S3). However,
I don't see how you can run multiple instances of DSpace Solr. Solr stores
data, so it wouldn't be as simple as just adding another server running
the webapp. Instead, you would need to coordinate a Solr cluster using
SolrCloud / ZooKeeper. Maybe it's not as complicated as I think. But I
thought I read at one point that DSpace had some custom Solr code
present, or that the Solr configs would have to be managed, and I'm not sure
how much work it would be to build up a Solr cluster with that config.
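
To illustrate why the stateless REST tier is the easy part, here is a rough Python sketch of the round-robin selection a proxy such as nginx performs. It is only a toy model, not nginx's actual implementation; the class name RoundRobinPool and the hostnames dspace-1 to dspace-3 are made up for the example.

```python
from itertools import cycle


class RoundRobinPool:
    """Toy model of a load balancer in front of stateless REST API nodes."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        # A failed node is simply skipped; no data has to be moved.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, in rotation order.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")


# Hypothetical hostnames, for illustration only.
pool = RoundRobinPool(["dspace-1", "dspace-2", "dspace-3"])
first_three = [pool.pick() for _ in range(3)]
```

Because the nodes hold no state of their own, taking one out of rotation only changes which node answers. The same trick does not work for Solr, since each Solr instance holds its own copy of the data.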

It could be possible to ensure that DSpace can use stock Solr, or to write
another implementation of the storage/search/index engine that might be more
cloud friendly than Solr, such as DynamoDB/CloudSearch. A normal use case
for Solr is to use it only as an index: your important data lives in a
persistent data store, such as the database, and you could wipe out your
search index and reindex your source data to repopulate it. However,
DSpace relies on Solr as the source of data for some elements (authority,
statistics).
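
A small Python sketch of that distinction, using plain dictionaries as stand-ins for the database and for Solr. The names database, rebuild_index and view_counts are hypothetical and not taken from DSpace; the point is only that a derived index can be rebuilt, while data that lives only in the index layer cannot.

```python
def rebuild_index(database):
    """Rebuild a toy inverted index (word -> set of item ids) from the database."""
    index = {}
    for item_id, title in database.items():
        for word in title.lower().split():
            index.setdefault(word, set()).add(item_id)
    return index


# Source of truth: lives outside the index and survives an index wipe.
database = {1: "Open Access Thesis", 2: "Thesis on Solr"}

search_index = rebuild_index(database)

# Stand-in for data stored only in the index layer (e.g. statistics).
view_counts = {1: 42}

# Simulate wiping the whole index layer.
search_index, view_counts = {}, {}

# Search is fully recoverable from the database ...
search_index = rebuild_index(database)
# ... but view_counts stays empty, because nothing else holds that data.
```

This is why the search cores could be treated as disposable, while the authority and statistics data would need backups or replication of their own.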

________________
Peter Dietz
Longsight
www.longsight.com
pe...@longsight.com
p: 740-599-5005 x809

On Thu, Jul 21, 2016 at 2:34 PM, Luiz dos Santos <luiz...@gmail.com> wrote:

> Hi Tim,
>
> It seems interesting, but I have one point: relying on hardware and on a
> big monolithic piece of software for high availability seems to mitigate
> the problem rather than solve it. You could have Solr and PostgreSQL in
> clusters, since they have their own clustering options, but you will end up
> with one DSpace in one Tomcat that can fail and take your repository down,
> right?
>
> Maybe for high availability DSpace needs something more, something with
> microservices, something in agreement with the Reactive Manifesto (
> http://www.reactivemanifesto.org/). In DSpace 7, the new GUI model will
> make it possible to run the GUI on another server, which is great, but
> DSpace will still rely on a single DSpace backend, right? Do you see a way
> to have two or more DSpace backends running simultaneously?
>
> One last point, as a volunteer, I would like to take part in the
> clustering group.
>
> Best regards
> Luiz
>
> On Thu, Jul 21, 2016 at 1:11 PM, Hernán Lagos <hernanla...@gmail.com>
> wrote:
>
>> Hi Tim
>>
>> Thanks, your feedback has been very useful.
>> If we set up a DSpace cluster, we will announce it.
>>
>> Regards
>>
>> On Thursday, July 21, 2016 at 12:27:17 (UTC-4), Tim Donohue wrote:
>>>
>>> Hi Hernán,
>>>
>>> The simple answer here is that, currently, there is no "standard" high
>>> availability setup for DSpace, and DSpace has no inherent ability to do
>>> load balancing or clustering on its own.
>>>
>>> That said, DSpace is essentially just a web application that runs on
>>> Tomcat (or similar), uses a PostgreSQL database (or similar) to store
>>> metadata/relationships, and uses Apache Solr for searching/browsing.  Each
>>> of these three tools (Tomcat, PostgreSQL and Solr) *does* provide clustering
>>> options.  So, it may be plausible to rely on the clustering options at
>>> those levels to create a DSpace cluster.
>>>
>>> However, I'll admit that I'm not aware of anyone who has done that
>>> before. If someone has, I'm hoping they will speak up here to provide us
>>> all a few more clues/hints.  There is an older (outdated now) wiki page
>>> where such discussions started a long time ago, but they never came to any
>>> final decision/proposal:
>>>
>>> https://wiki.duraspace.org/display/DSPACE/Clustering
>>>
>>> All that said, I suspect there are others who would be interested in
>>> more easily enabling clustering within DSpace itself.   That seems like
>>> it'd make a wonderful addition to the software platform, but it'd take one
>>> (or more) institutions who could help us to better define the gaps, what is
>>> missing/needed, and then start to figure out a way forward.  DSpace has no
>>> centralized development team (developers are volunteers or allowed to work
>>> on the project by their institutions). So we are entirely reliant on the
>>> institutions using DSpace to help us make such improvements (see how to
>>> contribute [1]).  If we can find a few interested users, we also could
>>> establish a formal DSpace Clustering Interest Group [2] that could begin to
>>> define the use cases, needs, etc for the benefit of us all.
>>>
>>> The topic of clustering is one that comes up every once in a while on
>>> this mailing list. If others are interested in helping to move this idea
>>> forward, I'd encourage you to voice your opinions/experience here. All we'd
>>> need to establish a Clustering Interest Group would be some interested
>>> individuals and one or more willing to chair / co-chair those group
>>> meetings.
>>>
>>> Sincerely,
>>>
>>> Tim
>>>
>>> [1]
>>> https://wiki.duraspace.org/display/DSPACE/How+to+Contribute+to+DSpace
>>> [2] https://wiki.duraspace.org/display/DSPACE/DSpace+Interest+Groups
>>>
>>> --
>>> Tim Donohue
>>> Technical Lead for DSpace & DSpaceDirect
>>> DuraSpace.org | DSpace.org | DSpaceDirect.org
>>>
>>>
>>> On 7/21/2016 9:51 AM, Hernán Lagos wrote:
>>>
>>> Dear
>>>
>>> I want to ask if any of you has experience creating a high
>>> availability cluster for DSpace.
>>>
>>> Best regards
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "DSpace Technical Support" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to dspace-tech...@googlegroups.com.
>>> To post to this group, send email to dspac...@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/dspace-tech.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>>
