Re: [xwiki-devs] [Idea] Rewrite Activity Stream + Stats using Elastic Search

Paul Libbrecht Sat, 21 Nov 2015 13:40:07 -0800

Hello Vincent,

While I strongly believe that NoSQL-type storage is a fundamentally
good idea for storing activity streams, I believe you may be attracted
to ElasticSearch over Solr mostly for superficial reasons.


Most analytics systems are indeed based on NoSQL stores; ElasticSearch
and Solr are examples of such. Other analytics solutions use bigger
systems such as CouchDB and MongoDB. Almost all are optimized for the
chosen views.

My impression is that many people are excited by ElasticSearch because
it has fancy UIs, whereas Solr may be better optimized thanks to its
very effective caching. In both cases, creating an analytics system
will involve designing a storage layout that makes the queries expected
by the views of the analytics system efficient, e.g. the row of
page-view counts over recent time periods. I would expect a Solr-based
and an ElasticSearch-based Stats module to differ very little.
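
To illustrate what "designing the storage for the view" can mean, here
is a minimal Python sketch, independent of Solr or ElasticSearch. It
assumes hourly buckets and keeps one counter per (page, bucket), so the
row of recent page-view counts becomes a handful of key lookups. The
names PageViewCounts, record and recent_row are made up for this
example and are not XWiki APIs.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

BUCKET_SECONDS = 3600  # assumption: the view shows hourly counts


def bucket_start(ts: datetime) -> int:
    """Return the epoch second at which the bucket containing ts begins."""
    epoch = int(ts.timestamp())
    return epoch - epoch % BUCKET_SECONDS


class PageViewCounts:
    """Hypothetical store keyed by (page, bucket start), shaped for one view."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, page: str, ts: datetime) -> None:
        # One increment per page view; no per-event row is kept.
        self._counts[(page, bucket_start(ts))] += 1

    def recent_row(self, page: str, now: datetime, buckets: int) -> list:
        """Counts for the last `buckets` buckets, oldest first."""
        last = bucket_start(now)
        return [
            self._counts[(page, last - i * BUCKET_SECONDS)]
            for i in range(buckets - 1, -1, -1)
        ]


now = datetime(2015, 11, 21, 13, 40, tzinfo=timezone.utc)
stats = PageViewCounts()
stats.record("Main.WebHome", now)
stats.record("Main.WebHome", now - timedelta(hours=2))
stats.record("Main.WebHome", now - timedelta(hours=2, minutes=10))
row = stats.recent_row("Main.WebHome", now, 3)
```

The same shape (one document per page and time bucket) could be mapped
onto either search engine; the design work is in choosing the buckets,
not in choosing the engine.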

One thing that is crucial when using a stats system (and, I believe,
even when trying to tune the SQL-stored activity stream by doing fewer
writes) is that viewers should not expect a perfectly real-time view.
ElasticSearch and Solr behave the same way here: real time is only
"near real time". Alternatively, the real-time aspect (as done by
Google Analytics, for example) should be a completely separate view,
probably based on in-memory values.
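
A rough Python sketch of that separation, under my own assumptions:
events are counted in memory and only pushed to the slower store in
batches, so the stored view lags while a "live" view can read the
pending values. BufferedCounter and the flush_to_store callback are
hypothetical names, not part of XWiki, Solr or ElasticSearch.

```python
import threading
from collections import Counter


class BufferedCounter:
    """Counts events in memory and hands them to a slower store in batches."""

    def __init__(self, flush_to_store) -> None:
        self._pending: Counter = Counter()
        self._lock = threading.Lock()
        # Hypothetical callback taking a dict of {key: count}.
        self._flush_to_store = flush_to_store

    def increment(self, key: str) -> None:
        with self._lock:
            self._pending[key] += 1

    def pending(self, key: str) -> int:
        """Value for the real-time view: counts not yet written out."""
        with self._lock:
            return self._pending[key]

    def flush(self) -> None:
        """Swap out the pending batch, then write it outside the lock."""
        with self._lock:
            batch, self._pending = self._pending, Counter()
        if batch:
            self._flush_to_store(dict(batch))


store = Counter()  # stands in for the persistent index
views = BufferedCounter(store.update)
for _ in range(3):
    views.increment("Main.WebHome")
before_flush = (store["Main.WebHome"], views.pending("Main.WebHome"))
views.flush()
after_flush = (store["Main.WebHome"], views.pending("Main.WebHome"))
```

In a real setup flush() would be called from a timer; the interval is
then exactly the "near" in "near real time".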

paul

PS: did you consider using HSQLDB for a part of this?
  It is in-memory, and locks certainly hurt much less.
Persistence should be somewhat decoupled...

PPS: schema evolution is never painless, even in a NoSQL system. If a
field needs to be merged or split, there is a price to it, whatever the
storage system.


> Vincent wrote:
> 21 November 2015 12:01
> Hi devs,
>
> I think that for data that are both not critical and high volume we
> should use ElasticSearch instead of saving them in our RDBMS.
>
> So the idea would be to have an embedded ES in XWiki by default (using
> the permanent directory to store its data) and admins could configure
> XWiki to use a separate ES instance (very similar to what we do with
> SOLR).
>
> Whenever a user modifies/creates/deletes/does operations on
> XObjects/etc, this is sent to ES.
>
> The AS UI queries ES to display the data.
>
> The Stats UI does the same.
>
> Pros:
> - scalability
> - performance
> - extensibility. It’s easy to evolve the schema in ES, and we can
> easily have several formats (as was proven by the Active Installs code)
>
> I’d like to start a POC in my “free” time.
>
> WDYT?
>
> Thanks
> -Vincent
>
> _______________________________________________
> devs mailing list
> [email protected]
> http://lists.xwiki.org/mailman/listinfo/devs

