Re: [dspace-tech] HA Clustering Dspace

2016-07-22 Thread Mark Wood
On Thursday, July 21, 2016 at 3:13:35 PM UTC-4, Peter Dietz wrote:
>
> From my investigations into DSpace, the key element that I would like to 
> de-couple from DSpace is SOLR. 
>
> Say you were going to build a new frontend to DSpace that heavily used the 
> DSpace REST API. You could have multiple servers, each running tomcat and 
> the DSpace REST API deployed. With nginx outside of that proxying / load 
> balancing. No problems. Especially as you have postgres as an external 
> service (rds), the assetstore is located outside of DSpace (s3). However, I 
> don't see how you can run multiple instances of DSpace SOLR. SOLR stores 
> data, and it wouldn't be as simple as just adding another server running 
> the webapp. But you would need to coordinate the SOLR cluster, using 
> SolrCloud / ZooKeeper. Maybe it's not as complicated as I think. But, I 
> thought that I read at one point that DSpace had some custom solr code 
> present, or the solr configs would have to be managed, and I'm not sure how 
> much work it would be to build up a solr cluster with that config.
>

DSpace does include a tiny dab of custom code for Solr, which I think is 
not essential.  LocalHostRestrictionFilter can be replaced with fairly 
simple filtering by the Servlet container.  ConfigureLog4jListener only 
exists because for some reason we insist that Solr's logging configuration 
live with DSpace's instead of with the rest of Solr's configuration.  There 
is nothing else.  It should be simple to use DSpace with a stock Solr 4 
instance.  If and when DSpace moves to Solr 5+ this all will have to be 
revamped anyway due to significant changes in the way Solr must be deployed.
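
To illustrate the kind of simple filtering that could stand in for
LocalHostRestrictionFilter, here is a minimal sketch in Python (not DSpace
or Solr code). It assumes, going by the class name, that the filter's job is
to reject requests that do not originate from the local host; the function
name is_local_request is hypothetical. In a real deployment the same rule
would be written in the servlet container's or proxy's own access-control
configuration.

```python
import ipaddress


def is_local_request(remote_addr: str) -> bool:
    """Return True only when the caller's address is a loopback address.

    Hypothetical helper: it models the rule "only localhost may talk to
    Solr", which is an assumption about what the filter enforces.
    """
    try:
        return ipaddress.ip_address(remote_addr).is_loopback
    except ValueError:
        # Anything that is not a valid IP literal is rejected outright.
        return False


if __name__ == "__main__":
    for addr in ("127.0.0.1", "::1", "192.168.1.20", "not-an-ip"):
        print(addr, "->", "allow" if is_local_request(addr) else "deny")
```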

-- 
You received this message because you are subscribed to the Google Groups 
"DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to dspace-tech+unsubscr...@googlegroups.com.
To post to this group, send email to dspace-tech@googlegroups.com.
Visit this group at https://groups.google.com/group/dspace-tech.
For more options, visit https://groups.google.com/d/optout.


Re: [dspace-tech] HA Clustering Dspace

2016-07-21 Thread Andrea Bollini
Another thing about SOLR Cloud: at the time I looked into it, SOLR joins
between different cores were not supported. SOLR joins are used by
DSpace-CRIS to provide aggregated statistics of items at the author and
department level.
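
For readers unfamiliar with cross-core joins, the sketch below builds the
request parameters for one, using Solr's join query parser syntax
({!join from=... to=... fromIndex=...}). The core names ("statistics",
"search"), the field names ("item_id", "id") and the inner query are
hypothetical placeholders, not the fields DSpace-CRIS actually uses.

```python
from urllib.parse import urlencode


def cross_core_join_params(from_core, from_field, to_field, inner_query):
    """Build request parameters for a Solr join across two cores.

    The inner query runs against `from_core`; matching `from_field` values
    are then used to select documents by `to_field` in the queried core.
    """
    local_params = "{!join from=%s to=%s fromIndex=%s}" % (
        from_field, to_field, from_core)
    return {"q": local_params + inner_query, "wt": "json"}


# All names below are illustrative assumptions.
params = cross_core_join_params(
    from_core="statistics",
    from_field="item_id",
    to_field="id",
    inner_query="type:view",
)
print(params["q"])
print("/solr/search/select?" + urlencode(params))
```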
Andrea

Re: [dspace-tech] HA Clustering Dspace

2016-07-21 Thread Andrea Bollini
I have extensive experience with DSpace in a clustered environment. I
was responsible for a product based on DSpace 4, more precisely on
DSpace-CRIS 4.4, used by more than 60 institutions. Depending on the size
of the institution and the expected usage, 2 or 4 servers were dedicated to
running Tomcat with the JSPUI. The DBMS was Oracle in a centralized cluster
environment, and the front end was a load-balanced Apache HTTP 2. The key
point here was to share part of the filesystem (config and assetstore)
between all nodes using NFS. DSpace works fine in a cluster for
dissemination as long as you don't change things that are cached, such as
the metadata field and bitstream format registries. If you change these, or
other customizations that use a local cache, you need some mechanism to
propagate the changes to (or notify) all the other nodes.

We tried to use SOLR Cloud but ran into some limitations: after a while we
started to receive randomly corrupted responses from slave nodes. We were
not able to fix the issue, and since a single server with 8 vCPUs and 8 GB
of RAM was able to handle heavy load for 4-8 customers (using multiple
cores), we decided in the end to stay with a standalone SOLR solution.

DSpace can very easily use a standard SOLR server; the additions that we
have in DSpace and the configurations fit within the normal SOLR
configuration and extension points. The issue here is that we need to
update the client side to be able to use the latest version of SOLR. Right
now, I'm starting to investigate the feasibility of upgrading to SOLR 6.

The new configuration system to be introduced in DSpace 6, which can
automatically reload changes by monitoring the filesystem, could also make
it easier to achieve better clustering support for DSpace.
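
One way to picture the propagation mechanism mentioned above is a shared
version stamp that every node checks before trusting its local cache. The
sketch below is an assumption made for illustration, not how DSpace or
DSpace-CRIS works: SharedVersion stands in for something all nodes can see
(a database row, or a file on the shared NFS mount), and NodeCache is a
hypothetical per-node registry cache.

```python
class SharedVersion:
    """Stand-in for state visible to every node (database row, NFS file)."""

    def __init__(self):
        self.value = 0

    def bump(self):
        """Announce that the authoritative data has changed."""
        self.value += 1


class NodeCache:
    """Per-node cache that reloads when the shared version has moved on."""

    def __init__(self, shared, loader):
        self._shared = shared
        self._loader = loader   # callable that reads the authoritative data
        self._seen = None       # no version seen yet, so first get() loads
        self._data = None

    def get(self):
        current = self._shared.value
        if self._seen != current:
            self._data = self._loader()
            self._seen = current
        return self._data


registry = ["dc.title", "dc.creator"]      # authoritative copy
shared = SharedVersion()
node_a = NodeCache(shared, lambda: list(registry))
node_b = NodeCache(shared, lambda: list(registry))

node_a.get()                               # both nodes warm their caches
node_b.get()
registry.append("dc.subject")              # an admin edits the registry...
stale = node_b.get()                       # ...node B still serves old data
shared.bump()                              # the change is announced
fresh = node_b.get()                       # node B reloads on next access
print(stale, fresh)
```

The design choice here is polling a cheap shared value on each access
instead of pushing notifications; a push-based scheme would avoid the extra
read but needs every node to be reachable when the change happens.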

Hope this helps,
Andrea

Re: [dspace-tech] HA Clustering Dspace

2016-07-21 Thread Luiz dos Santos
Hi Peter,

Thanks, nice explanation! Why don't you coordinate the DSpace
Clustering group?

Best regards
Luiz

Re: [dspace-tech] HA Clustering Dspace

2016-07-21 Thread Peter Dietz
From my investigations into DSpace, the key element that I would like to
de-couple from DSpace is SOLR.

Say you were going to build a new frontend to DSpace that heavily used the
DSpace REST API. You could have multiple servers, each running Tomcat with
the DSpace REST API deployed, and nginx in front of them proxying / load
balancing. No problems, especially as you have Postgres as an external
service (RDS) and the assetstore is located outside of DSpace (S3).
However, I don't see how you can run multiple instances of DSpace SOLR.
SOLR stores data, so it wouldn't be as simple as just adding another server
running the webapp; you would need to coordinate the SOLR cluster, using
SolrCloud / ZooKeeper. Maybe it's not as complicated as I think. But I
thought that I read at one point that DSpace had some custom Solr code
present, or that the Solr configs would have to be managed, and I'm not
sure how much work it would be to build up a Solr cluster with that config.

It could be possible to ensure that DSpace can use stock SOLR, or to write
another implementation for storage/search/index/engine that might be more
cloud friendly than Solr, such as DynamoDB/CloudSearch. A normal use case
for SOLR is to use it only as an index: your important data lives in a
persistent data store, such as the database, and you could wipe out your
search index and reindex your source data to repopulate it. However,
DSpace's use of Solr relies on it as a source of data for some
elements (authority, statistics).
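
The distinction drawn above, between an index that can be thrown away and a
store that holds the only copy of some data, can be sketched as follows.
This is a toy in-memory model with hypothetical names (build_index,
records, usage_events); it is not Solr or DSpace code.

```python
from collections import defaultdict


def build_index(records):
    """Derive an inverted index (word -> set of record ids) from records."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in text.lower().split():
            index[word].add(record_id)
    return dict(index)


# Source of truth, e.g. rows in the database.
records = {1: "High availability cluster", 2: "Solr cluster setup"}

index = build_index(records)
index.clear()                   # losing a pure index is harmless...
index = build_index(records)    # ...because it can be regenerated.
print(sorted(index["cluster"]))

# Data that lives only in the index layer (the situation described for
# authority and statistics) has no upstream source to rebuild from.
usage_events = [{"record": 1, "action": "view"}]
usage_events.clear()            # nothing can restore these events
print(len(usage_events))
```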


Peter Dietz
Longsight
www.longsight.com
pe...@longsight.com
p: 740-599-5005 x809

Re: [dspace-tech] HA Clustering Dspace

2016-07-21 Thread Luiz dos Santos
Hi Tim,

It seems interesting, but I have a point: relying on high availability in
the hardware and in big monolithic software seems more like mitigating the
problem than solving it. You could have Solr and PostgreSQL in clusters,
since they have their own clustering options, but you would end up with one
DSpace in one Tomcat that can fail and take your repository down, right?

Maybe a highly available DSpace needs something more, something with
microservices, something in agreement with the Reactive Manifesto (
http://www.reactivemanifesto.org/). In DSpace 7, the new GUI model will
make it possible to run the GUI on another server, which is great, but
DSpace will still rely on a single DSpace backend, right? Do you see a way
to have two or more DSpace backends running simultaneously?

One last point: as a volunteer, I would like to take part in the clustering
group.

Best regards
Luiz

Re: [dspace-tech] HA Clustering Dspace

2016-07-21 Thread Hernán Lagos
Hi Tim

Thanks, your feedback has been very useful.
If we set up a DSpace cluster, we will announce it.

Regards

Re: [dspace-tech] HA Clustering Dspace

2016-07-21 Thread Tim Donohue

Hi Hernán,

The simple answer here is that, currently, there is no "standard" high 
availability setup for DSpace, and DSpace has no inherent ability to do 
load balancing or clustering on its own.


That said, DSpace is essentially just a web application that runs on 
Tomcat (or similar), uses a PostgreSQL database (or similar) to store 
metadata/relationships, and uses Apache Solr for searching/browsing.  
Each of these three tools (Tomcat, PostgreSQL and Solr) *does* provide 
clustering options.  So, it may be plausible to rely on the clustering 
options at those levels to create a DSpace cluster.


However, I'll admit that I'm not aware of anyone who has done that 
before. If someone has, I'm hoping they will speak up here to provide us 
all a bit more clues/hints.  There is an older (outdated now) wiki page 
where such discussions started a long time ago, but they never came to 
any final decision/proposal:


https://wiki.duraspace.org/display/DSPACE/Clustering

All that said, I suspect there are others who would be interested in 
more easily enabling clustering within DSpace itself.   That seems like 
it'd make a wonderful addition to the software platform, but it'd take 
one (or more) institutions who could help us to better define the gaps, 
what is missing/needed, and then start to figure out a way forward.  
DSpace has no centralized development team (developers are volunteers or 
allowed to work on the project by their institutions). So we are 
entirely reliant on the institutions using DSpace to help us make such 
improvements (see how to contribute [1]).  If we can find a few 
interested users, we also could establish a formal DSpace Clustering 
Interest Group [2] that could begin to define the use cases, needs, etc. 
for the benefit of us all.


The topic of clustering is one that comes up every once in a while on 
this mailing list. If others are interested in helping to move this idea 
forward, I'd encourage you to voice your opinions/experience here. All 
we'd need to establish a Clustering Interest Group would be some 
interested individuals and one or more willing to chair / co-chair those 
group meetings.


Sincerely,

Tim

[1] https://wiki.duraspace.org/display/DSPACE/How+to+Contribute+to+DSpace
[2] https://wiki.duraspace.org/display/DSPACE/DSpace+Interest+Groups

--
Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org


On 7/21/2016 9:51 AM, Hernán Lagos wrote:

Dear

I want to ask whether any of you have experience creating a
high-availability cluster for DSpace.

Best regards

Re: [dspace-tech] HA Clustering Dspace

2016-07-21 Thread Hilton Gibson
Hi All,

For 100% infrastructure uptime, it is better to use tried-and-tested
cloud services.
That's my 2c ;-)

Cheers

hg

*Hilton Gibson*
Stellenbosch University Library
*http://orcid.org/-0002-2992-208X*


On 21 July 2016 at 18:22, Hernán Lagos wrote:

> Thanks
>
> In this case, the goal is for a DSpace installation to remain online.
>
> Regards
>
> On Thursday, July 21, 2016 at 11:52:25 (UTC-4), Luiz dos Santos wrote:
>>
>> It is a nice question, but if you want to use a cluster, is it not better
>> to use Fedora instead? Please note that this is a question... I'm curious
>> to know what the DSpace specialists think about it.
>>
>> Best
>> Luiz
>>
>> On Thu, Jul 21, 2016 at 10:51 AM, Hernán Lagos 
>> wrote:
>>
>>> Dear
>>>
>>> I want to ask whether any of you have experience creating a
>>> high-availability cluster for DSpace.
>>>
>>> Best regards

Re: [dspace-tech] HA Clustering Dspace

2016-07-21 Thread Hernán Lagos
Thanks

In this case, the goal is for a DSpace installation to remain online.

Regards

On Thursday, July 21, 2016 at 11:52:25 (UTC-4), Luiz dos Santos wrote:
>
> It is a nice question, but if you want to use a cluster, is it not better 
> to use Fedora instead? Please note that this is a question... I'm curious 
> to know what the DSpace specialists think about it.
>
> Best
> Luiz
>
> On Thu, Jul 21, 2016 at 10:51 AM, Hernán Lagos wrote:
>
>> Dear
>>
>> I want to ask whether any of you have experience creating a
>> high-availability cluster for DSpace.
>>
>> Best regards

Re: [dspace-tech] HA Clustering Dspace

2016-07-21 Thread Luiz dos Santos
It is a nice question, but if you want to use a cluster, is it not better
to use Fedora instead? Please note that this is a question... I'm curious
to know what the DSpace specialists think about it.

Best
Luiz

On Thu, Jul 21, 2016 at 10:51 AM, Hernán Lagos 
wrote:

> Dear
>
> I want to ask whether any of you have experience creating a
> high-availability cluster for DSpace.
>
> Best regards

[dspace-tech] HA Clustering Dspace

2016-07-21 Thread Hernán Lagos
Dear

I want to ask whether any of you have experience creating a
high-availability cluster for DSpace.

Best regards
