Re: [DISCUSS] Persisting user data

Simon Elliston Ball Thu, 03 Aug 2017 03:26:55 -0700

Anything Spring-based is likely multi-db by definition, as long as we pick a
good, friendly ORM (not Hibernate, because of its licensing problems with
Apache; EclipseLink, perhaps?). But I suspect we should pick a good default,
and that default should be Postgres.
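
To make the "agnostic, with a good default" idea concrete, here is a minimal
sketch of a storage-neutral contract the REST layer could code against. Every
name in it (UserDataStore, InMemoryUserDataStore, the "saved-searches" key) is
hypothetical and not taken from the Metron code base; the in-memory class is
only a stand-in for whatever JDBC- or ORM-backed implementation is chosen.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical contract: callers never see which database sits behind it.
interface UserDataStore {
    void save(String user, String key, String json);
    Optional<String> load(String user, String key);
    void delete(String user, String key);
}

// In-memory stand-in for illustration only. A JDBC- or ORM-backed class
// (Postgres by default, any other database by configuration) could
// implement the same interface without changes to the callers.
final class InMemoryUserDataStore implements UserDataStore {
    // user -> (preference key -> JSON document)
    private final Map<String, Map<String, String>> data = new ConcurrentHashMap<>();

    @Override
    public void save(String user, String key, String json) {
        data.computeIfAbsent(user, u -> new ConcurrentHashMap<>()).put(key, json);
    }

    @Override
    public Optional<String> load(String user, String key) {
        Map<String, String> prefs = data.get(user);
        return prefs == null ? Optional.empty() : Optional.ofNullable(prefs.get(key));
    }

    @Override
    public void delete(String user, String key) {
        Map<String, String> prefs = data.get(user);
        if (prefs != null) {
            prefs.remove(key);
        }
    }
}

final class UserDataStoreDemo {
    public static void main(String[] args) {
        UserDataStore store = new InMemoryUserDataStore();
        store.save("ryan", "saved-searches", "[\"source.type:bro\"]");
        // Prints the stored JSON for ryan, then "none" for a user with no data.
        System.out.println(store.load("ryan", "saved-searches").orElse("none"));
        System.out.println(store.load("casey", "saved-searches").orElse("none"));
    }
}
```

With a seam like this, the choice of default database and ORM becomes a
deployment and wiring decision rather than something the UI-facing code
depends on.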


> On 3 Aug 2017, at 10:24, Casey Stella <[email protected]> wrote:
> 
> I'd vote for a DB-based solution, but I'd argue that any solution shouldn't
> be database specific (i.e. postgres), but JDBC-generic.  People and
> organizations have very strong views regarding databases and I'd prefer to
> side-step those holy wars by being agnostic.
> 
> On Wed, Aug 2, 2017 at 9:36 PM, Ryan Merriman <[email protected]> wrote:
> 
>> Spring supports a variety of databases including Postgres.  I have no
>> problem with using Postgres instead of MySQL.
>> 
>> On Wed, Aug 2, 2017 at 3:32 PM, Simon Elliston Ball <
>> [email protected]> wrote:
>> 
>>> Agreed on Postgres. It's a lot easier to work with license-wise in apache
>>> projects, and has a lot of the capability we need here, especially if we
>>> can find a sensible ORM. Anyone got any thoughts on what would work there?
>>> 
>>> Simon
>>> 
>>>> On 2 Aug 2017, at 21:21, Matt Foley <[email protected]> wrote:
>>>> 
>>>> Hi Ryan,
>>>> Zookeeper has a default (and seldom changed) max znode size of 1MB, but
>>>> it is “designed to store data on the order of kilobytes in size.”[1]  And
>>>> it’s not really intended for frequently-changing data, which is okay here.
>>>> But I just included it for completeness, I’m not advocating for its use
>>>> here.
>>>> 
>>>> I agree with you that the problem, especially because it includes shared
>>>> config, would fit well in a db.  I’d suggest you consider PostgreSQL rather
>>>> than MySQL, as postgres is built into Redhat 6 and 7, and Ambari now uses
>>>> it by default, so an available server might be conveniently at hand in most
>>>> deployments.  Definitely assume the user will want to use an external db
>>>> instance, rather than one dedicated to this use.  Conveniently Postgres
>>>> also has a native REST interface, with the usual authorization options.
>>>> 
>>>> Never mind about Ambari Views for now.  It’s just a way to get GUI
>>>> dashboards without writing all the infrastructure for it, which as you say
>>>> is somewhat water under the bridge.
>>>> Cheers,
>>>> --Matt
>>>> 
>>>> [1] https://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html
>>>> 
>>>> 
>>>> 
>>>> On 8/2/17, 12:34 PM, "Ryan Merriman" <[email protected]> wrote:
>>>> 
>>>>   Matt,
>>>> 
>>>>   Thank you for the suggestions.  I forgot to include Zookeeper.  Are there
>>>>   any tradeoffs we should be aware of if we decide to use Zookeeper?  Are
>>>>   there guidelines for how much data can be stored in Zookeeper?
>>>> 
>>>>   To answer your questions:
>>>> 
>>>>   1.  I think both use cases make sense so a combination of shared and
>>>>   personal.
>>>>   2.  I was planning on managing authorization in the REST layer.  For now
>>>>   viewer login auth (which is really REST auth) will suffice but we might
>>>>   consider other methods since authentication is pluggable here.
>>>>   3.  I had not considered Ambari Views since this will support an existing
>>>>   UI.  How would Ambari Views help us here?
>>>> 
>>>>   I will proceed initially with a saved search POC using a relational
>>>>   database unless you think that is a bad idea or there are other better
>>>>   options.  Hopefully an example will further the discussion.
>>>> 
>>>>   Ryan
>>>> 
>>>>>   On Wed, Jul 26, 2017 at 6:31 PM, Matt Foley <[email protected]> wrote:
>>>>> 
>>>>> There’s a couple other places you could put config info (but maybe not
>>>>> saved searches):
>>>>> -  Zookeeper
>>>>> -  metron-alerts-ui/config.xml or config.json  file
>>>>> -  the Ambari database, whichever it happens to be
>>>>> 
>>>>> Questions that influence the decision include:
>>>>> 1. Should there be one configuration shared among users, or strictly
>>>>> per-user config?  Or a combination of shared and personal?
>>>>> 2. What security do you wish to maintain on changing those settings, both
>>>>> shared and personal?  What authentication/authorization scheme will you
>>>>> use?  Is viewer login auth sufficient for this?
>>>>> 3. Will you assume Ambari exists?  Did you consider using Ambari Views as
>>>>> the basis? (https://cwiki.apache.org/confluence/display/AMBARI/Views)
>>>>> 
>>>>> On 7/26/17, 2:54 PM, "Ryan Merriman" <[email protected]> wrote:
>>>>> 
>>>>>   In anticipation of METRON-988 being merged into master, there will be a
>>>>>   need to persist user preferences such as UI layout, saved searches,
>>>>>   search history, etc.  I think where and how we persist this data should
>>>>>   be discussed in order to facilitate a design.  This data won't be large
>>>>>   in scale and may or may not be relational.  The initial features I am
>>>>>   aware of don't require a relational model but I'm sure there will be
>>>>>   some that do in the future.  I'm also assuming this code will live in
>>>>>   the REST application but someone correct me if there is a reason to keep
>>>>>   it somewhere else.
>>>>> 
>>>>>   I think it would be preferable to leverage something that is already in
>>>>>   our stack and available as a dependency.  However I would not be against
>>>>>   adding something if it really were the right tool for the job.  Assuming
>>>>>   others agree we should stick with our current stack, I see these options:
>>>>> 
>>>>>      - MySQL (or other relational database)
>>>>>         - good fit for the size of data
>>>>>         - relational capabilities
>>>>>         - an ORM framework will be necessary which will increase our
>>>>>         dependencies and complexity
>>>>>      - HBase
>>>>>         - client setup and code will likely be simpler and less complex
>>>>>         - limited data model
>>>>>      - Elasticsearch
>>>>>         - json is a convenient data model
>>>>>         - we already store user preferences here (Kibana dashboards)
>>>>>         - we have abstracted our search engine interactions in several
>>>>>         places and would have to here too
>>>>> 
>>>>>   Elasticsearch is out for me because we view search engines as pluggable.
>>>>>   I think HBase would be the easiest to implement and get working but I'm
>>>>>   worried we'll have similar use cases that won't be a good fit for HBase.
>>>>>   In that case we would need to come up with an alternative persistence
>>>>>   solution anyways.  I think MySQL is a good fit long term but I'm
>>>>>   concerned about adding a heavy ORM framework.  Also, we can't use
>>>>>   Hibernate because it is not license friendly.
>>>>> 
>>>>>   Does anyone have any thoughts on these options or other ideas?
>>>>> 
>>>>>   This requirement also brings up another topic that is outside of this
>>>>>   discussion.  Should we reevaluate our authentication strategy?  Currently
>>>>>   the REST application uses JDBC for this but if we decide a different
>>>>>   mechanism is better then we no longer need a relational database.  This
>>>>>   might affect our decision to use MySQL for this kind of data persistence.
>>>>> 
>>>>>   Ryan
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>
