Another thing about SOLR Cloud: at the time I looked into it, SOLR joins
between different cores were not supported. DSpace-CRIS uses SOLR joins
to provide aggregated statistics of items at the author and department
level.
Andrea
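
For readers unfamiliar with cross-core joins, here is a minimal sketch of
what such a query looks like. The `{!join from=... to=... fromIndex=...}`
query parser is standard Solr syntax; the core name ("search") and the field
names ("item_id", "id", "author_id", "type") are hypothetical and chosen
only for illustration, they are not taken from the DSpace-CRIS schema.

```python
from urllib.parse import urlencode


def cross_core_join(from_field, to_field, from_index, inner_query):
    """Build a Solr join expression that selects documents via another core."""
    return "{!join from=%s to=%s fromIndex=%s}%s" % (
        from_field, to_field, from_index, inner_query)


def statistics_params(author_id):
    """Query parameters for counting usage events of one author's items.

    All core and field names here are assumptions, not the real schema.
    """
    # Find the author's items in the (hypothetical) "search" core, then keep
    # only the usage events whose "id" matches one of those items.
    join = cross_core_join("item_id", "id", "search", "author_id:%s" % author_id)
    return {"q": "type:view", "fq": join, "rows": 0}


if __name__ == "__main__":
    # The encoded string would be appended to the statistics core's /select URL.
    print(urlencode(statistics_params("rp00001")))
```

The join is evaluated by the node that receives the request, which is why
the core named in `fromIndex` has to be reachable there; that is the part
that becomes difficult once cores are distributed across a cluster.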

2016-07-21 23:40 GMT+02:00 Andrea Bollini <bollin...@gmail.com>:

> I have extensive experience with DSpace in a cluster environment. I
> was responsible for a product based on DSpace 4, more precisely on
> DSpace-CRIS 4.4, used by more than 60 institutions. Depending on the size
> of the institution and the expected usage, 2 or 4 servers were dedicated to
> running Tomcat with the JSPUI. The DBMS was Oracle in a centralized cluster
> environment, and the frontend was Apache HTTP 2, load balanced. The key
> point here was to share part of the filesystem (config and assetstore)
> between all nodes using NFS. DSpace works fine in a cluster for
> dissemination as long as you don't change things that are cached, like the
> metadata field and bitstream format registries. If you change those, or
> make other customizations that use a local cache, it is necessary to
> introduce some mechanism to propagate / notify the changes to all the
> other nodes.
> We tried to use SOLR Cloud but ran into some limitations: after a
> while we started to receive randomly corrupted responses from slave nodes.
> We were not able to fix the issue, and since a single server with 8 vCPUs
> and 8 GB of RAM was able to handle heavy load for 4-8 customers (using
> multiple cores), we decided in the end to stay with a standalone SOLR
> solution.
> DSpace can very easily use a standard SOLR server: the additions that we
> have in DSpace and the configurations fit within the normal SOLR
> configuration and extension points. The issue here is that we need to
> update the client side to be able to use the latest version of SOLR. Right
> now, I'm starting to investigate the feasibility of upgrading to SOLR 6.
> With the new configuration system to be introduced in DSpace 6, which can
> automatically reload changes by monitoring the filesystem, it could also
> become easier to achieve better clustering support for DSpace.
>
> Hope this helps,
> Andrea
>
> 2016-07-21 21:13 GMT+02:00 Peter Dietz <pe...@longsight.com>:
>
>> From my investigations into DSpace, the key element that I would like to
>> de-couple from DSpace is SOLR.
>>
>> Say you were going to build a new frontend to DSpace that heavily used
>> the DSpace REST API. You could have multiple servers, each running tomcat
>> and the DSpace REST API deployed. With nginx outside of that proxying /
>> load balancing. No problems. Especially as you have postgres as an external
>> service (rds), the assetstore is located outside of DSpace (s3). However, I
>> don't see how you can run multiple instances of DSpace SOLR. SOLR stores
>> data, so it wouldn't be as simple as just adding another server running
>> the webapp. Instead, you would need to coordinate the SOLR cluster using
>> SolrCloud / ZooKeeper. Maybe it's not as complicated as I think. But I
>> thought I read at one point that DSpace had some custom Solr code
>> present, or that the Solr configs would have to be managed, and I'm not
>> sure how much work it would be to build a Solr cluster with that config.
>>
>> It could be possible to ensure that DSpace can use stock SOLR, or to
>> write another implementation for storage/search/index/engine that might be
>> more cloud friendly than Solr, such as DynamoDB/CloudSearch. A normal
>> use case of SOLR is to use it only as an index, so that your important
>> data lives in a persistent data store, such as the database, and you can
>> wipe out your search index and reindex your source data to repopulate it.
>> However, DSpace's use of Solr relies on it being the source of data for
>> some elements (authority, statistics).
>>
>> ________________
>> Peter Dietz
>> Longsight
>> www.longsight.com
>> pe...@longsight.com
>> p: 740-599-5005 x809
>>
>> On Thu, Jul 21, 2016 at 2:34 PM, Luiz dos Santos <luiz...@gmail.com>
>> wrote:
>>
>>> Hi Tim,
>>>
>>> It seems interesting, but I have a point: relying on hardware and on a
>>> big monolithic piece of software for high availability seems to mitigate
>>> the problem rather than solve it. You could have Solr and PostgreSQL in
>>> clusters, since they have their own clustering options, but you will end
>>> up with a single DSpace in a single Tomcat that can fail and take your
>>> repository down, right?
>>>
>>> Maybe for high availability DSpace needs something more, something with
>>> microservices, something in agreement with the Reactive Manifesto (
>>> http://www.reactivemanifesto.org/). In DSpace 7, the new GUI model will
>>> make it possible to run the GUI on another server, which is great, but
>>> DSpace will still rely on a single DSpace backend, right? Do you see a
>>> way to have two or more DSpace backends running simultaneously?
>>>
>>> One last point, as a volunteer, I would like to take part in the
>>> clustering group.
>>>
>>> Best regards
>>> Luiz
>>>
>>> On Thu, Jul 21, 2016 at 1:11 PM, Hernán Lagos <hernanla...@gmail.com>
>>> wrote:
>>>
>>>> Hi Tim
>>>>
>>>> Thanks, your feedback has been very useful.
>>>> If we set up a DSpace cluster, we will announce it here.
>>>>
>>>> Regards
>>>>
>>>> On Thursday, July 21, 2016 at 12:27:17 (UTC-4), Tim Donohue wrote:
>>>>>
>>>>> Hi Hernán,
>>>>>
>>>>> The simple answer here is that, currently, there is no "standard" high
>>>>> availability setup for DSpace, and DSpace has no inherent ability to do
>>>>> load balancing or clustering on its own.
>>>>>
>>>>> That said, DSpace is essentially just a web application that runs on
>>>>> Tomcat (or similar), uses a PostgreSQL database (or similar) to store
>>>>> metadata/relationships, and uses Apache Solr for searching/browsing.  Each
>>>>> of these three tools (Tomcat, PostgreSQL and Solr) *does* provide clustering
>>>>> options.  So, it may be plausible to rely on the clustering options at
>>>>> those levels to create a DSpace cluster.
>>>>>
>>>>> However, I'll admit that I'm not aware of anyone who has done that
>>>>> before. If someone has, I'm hoping they will speak up here to give us
>>>>> all a few more clues/hints.  There is an older (outdated now) wiki page
>>>>> where such discussions started a long time ago, but they never came to any
>>>>> final decision/proposal:
>>>>>
>>>>> https://wiki.duraspace.org/display/DSPACE/Clustering
>>>>>
>>>>> All that said, I suspect there are others who would be interested
>>>>> in more easily enabling clustering within DSpace itself.   That seems like
>>>>> it'd make a wonderful addition to the software platform, but it'd take one
>>>>> (or more) institutions who could help us to better define the gaps, what
>>>>> is missing/needed, and then start to figure out a way forward.  DSpace has no
>>>>> centralized development team (developers are volunteers or allowed to work
>>>>> on the project by their institutions). So we are entirely reliant on the
>>>>> institutions using DSpace to help us make such improvements (see how to
>>>>> contribute [1]).  If we can find a few interested users, we also could
>>>>> establish a formal DSpace Clustering Interest Group [2] that could begin
>>>>> to define the use cases, needs, etc for the benefit of us all.
>>>>>
>>>>> The topic of clustering is one that comes up every once in a while on
>>>>> this mailing list. If others are interested in helping to move this idea
>>>>> forward, I'd encourage you to voice your opinions/experience here. All we'd
>>>>> need to establish a Clustering Interest Group would be some interested
>>>>> individuals and one or more willing to chair / co-chair those group
>>>>> meetings.
>>>>>
>>>>> Sincerely,
>>>>>
>>>>> Tim
>>>>>
>>>>> [1]
>>>>> https://wiki.duraspace.org/display/DSPACE/How+to+Contribute+to+DSpace
>>>>> [2] https://wiki.duraspace.org/display/DSPACE/DSpace+Interest+Groups
>>>>>
>>>>> --
>>>>> Tim Donohue
>>>>> Technical Lead for DSpace & DSpaceDirect
>>>>> DuraSpace.org | DSpace.org | DSpaceDirect.org
>>>>>
>>>>>
>>>>> On 7/21/2016 9:51 AM, Hernán Lagos wrote:
>>>>>
>>>>> Dear
>>>>>
>>>>> I want to ask if any of you has experience creating a high
>>>>> availability cluster for DSpace.
>>>>>
>>>>> Best regards
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "DSpace Technical Support" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to dspace-tech...@googlegroups.com.
>>>>> To post to this group, send email to dspac...@googlegroups.com.
>>>>> Visit this group at https://groups.google.com/group/dspace-tech.
>>>>> For more options, visit https://groups.google.com/d/optout.