Re: Load all data from DB on Cache Start

John Blum Mon, 26 Dec 2016 23:48:07 -0800

Hi Amit-

Yeah, that is true.  It really depends on how you code your CacheWriters I
suppose.


For instance, you could "flip a switch" in your CacheWriter(s), so to speak,
after the "application" (not just the cache) has fully initialized
itself, so that the CacheWriter skips the database write until then.  To flip
the switch, you could write a *Spring* ApplicationListener that
listens for the ContextRefreshedEvent, which is fired after the *Spring*
container initializes all beans (Geode and application beans alike)
declared in the context.
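
As a rough sketch of the "switch" idea, using only the JDK so it runs on its
own: FakeDatabase, GatedWriter and FakeRegion are made-up stand-ins, not Geode
or Spring types.  In a real application the gate would presumably sit inside
your CacheWriter's before* callbacks, and enable() would be called from the
ApplicationListener once the ContextRefreshedEvent arrives.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

public class GatedWriterSketch {

    /** Hypothetical stand-in for the external database; counts writes. */
    static final class FakeDatabase {
        final Map<String, String> rows = new LinkedHashMap<>();
        int writeCount = 0;

        void upsert(String key, String value) {
            rows.put(key, value);
            writeCount++;
        }
    }

    /** Hypothetical stand-in for a CacheWriter with an on/off switch. */
    static final class GatedWriter {
        private final AtomicBoolean enabled = new AtomicBoolean(false);
        private final FakeDatabase database;

        GatedWriter(FakeDatabase database) {
            this.database = database;
        }

        /** Flip the switch; assumed to be called after the application has started. */
        void enable() {
            enabled.set(true);
        }

        /** Writes through only when the switch is on; otherwise does nothing. */
        void beforeWrite(String key, String value) {
            if (enabled.get()) {
                database.upsert(key, value);
            }
        }
    }

    /** Hypothetical stand-in for a Region that consults its writer before each put. */
    static final class FakeRegion {
        private final Map<String, String> entries = new LinkedHashMap<>();
        private final GatedWriter writer;

        FakeRegion(GatedWriter writer) {
            this.writer = writer;
        }

        void put(String key, String value) {
            writer.beforeWrite(key, value);
            entries.put(key, value);
        }

        int size() {
            return entries.size();
        }
    }

    public static void main(String[] args) {
        FakeDatabase database = new FakeDatabase();
        GatedWriter writer = new GatedWriter(database);
        FakeRegion region = new FakeRegion(writer);

        // Preload phase: the switch is off, so nothing is written back.
        region.put("1", "alpha");
        region.put("2", "beta");
        region.put("3", "gamma");
        System.out.println("writes during preload: " + database.writeCount); // 0

        // Application fully started: flip the switch.
        writer.enable();
        region.put("4", "delta");
        System.out.println("writes after enable: " + database.writeCount); // 1
    }
}
```

The AtomicBoolean is just one way to make the flag safely visible across
threads; how you share it between the listener and the writer is up to you.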

There are other lifecycle/post-processing mechanisms (both in *Spring* and
in Geode) you can play with as well.

Food for thought.

Cheers,
John


On Mon, Dec 26, 2016 at 10:53 PM, Amit Pandey <[email protected]>
wrote:

> John,
>
> Thanks for the awesome answer; this is really helpful.
>
> I have only one concern. My regions are read-write through enabled.
> So if I use a BeanPostProcessor to load after the cache is initialized,
> will it try to write through to the database again? It's something that will
> be a no-op but can result in millions of writes being sent to the database.
>
> Thanks for the cloud foundry tip I will look into it.
>
> Regards
>
> On Tue, Dec 27, 2016 at 1:58 AM, John Blum <[email protected]> wrote:
>
>> Amit-
>>
>> Regarding...
>>
>> *> I want to load all data on cache startup at a go.*
>>
>> Since you are using "*Spring*", you could easily implement a *Spring*
>> BeanPostProcessor [1] (BPP) for each (or all the) *Region(s)* in which
>> you need to load data.  I do this frequently in *Spring Data
>> GemFire/Geode's* test suite when testing *Region* data access operations
>> using the GemfireTemplate, *Repositories* or things of that nature.  Clearly
>> your BPP could use a DataSource to load the data from an external data
>> store (e.g. RDBMS).
>>
>> Another way to load data on startup is to use a Geode *Initializer*.
>> However, this would require you to specify a snippet of cache.xml and
>> does not work if you specify your *Regions* in *Spring* (XML/Java)
>> config as you should when using *Spring*.  I also don't recommend using
>> cache.xml, but it is the pure, non-*Spring* way to invoke logic after the
>> cache has been "fully" initialized (i.e. where the *Regions* have been
>> defined in cache.xml).
>>
>> See here [2] for more details.  Note, the documentation talks of
>> "launching an application" on startup, after cache initialization, but
>> technically, you can do whatever you want, like load data.
>>
>> I recommend the BPP.
>>
>>
>> *> How should I set it up in config to allow it to join other nodes in
>> cluster?*
>>
>> Regardless of whether your server data node is "embedded" or not, you can
>> still use a Locator, or mcast to have the node join the cluster.  The
>> "embedded" scenario, where the "application" is a GemFire Server data node
>> will be part of the cluster as Udo said.
>>
>> This is easily achievable with...
>>
>> <util:properties id="gemfireProperties">
>>   <prop key="name">Example</prop>
>>   <!-- Set to non-zero value to use Multicast; comment out "locators" -->
>>   <prop key="mcast-port">0</prop>
>>   <prop key="log-level">${gemfire.log-level:config}</prop>
>>   <prop key="locators">someHost[10334]</prop>
>>   <prop key="start-locator">localhost[1034]</prop>
>> </util:properties>
>>
>> <gfe:cache properties-ref="gemfireProperties"/>
>>
>> ...
>>
>>
>> As you can see from the snippet of *Spring* XML config above, this
>> application is a Geode "peer" cache (i.e. embeds a Geode data node/server).
>>
>> The "*locators*" Geode/GemFire property enables this node to connect to
>> a cluster.  Likewise, you can use the "*mcast-port*" property instead,
>> however, I would recommend *Locators* over mcast.
>>
>> Additionally, you can see that I specified the "start-locator"
>> Geode/GemFire property, which enables me to start an embedded Locator.
>> Useful for testing purposes and connecting Geode data nodes together in a
>> cluster without a dedicated Locator, though, this approach is less
>> resilient if the applications/servers go down (as may be the case in a
>> micro-services scenario)!
>>
>>
>> *> if I start with embedded server is it required to use client pool or
>> is it not required?*
>>
>> A "client pool" is only applicable to cache clients (i.e. ClientCaches)
>> on the "client-side" of the equation.  "peers" find (Locator, mcast) and
>> communicate (TCP/UDP, JGroups) with each other through other means once a
>> cluster is formed.
>>
>> In fact, it is more common to position your
>> microservices-based applications as Geode cache clients (i.e.
>> <gfe:client-cache ...>) and have them connect to a dedicated Geode service
>> (i.e. a cluster of Geode servers/data nodes where 1 or more of those nodes
>> are also running a "CacheServer", listening for cache clients to connect).
>> These dedicated Geode server nodes in a cluster constituting the service can
>> still be configured with *Spring*, but they typically will not contain
>> any application-specific components other than CacheListeners, Loaders,
>> Writers, AEQ *Listeners*, etc.
>>
>> ClientCache applications use 1 or more Pools configured to talk to the
>> servers in the cluster (either by way of Locator or direct server
>> communication). Pools can be configured with groups to target specific
>> members (in that group) in the cluster.  Typically, members in 1 group host
>> a different set of Regions than members in another group; this is a way to
>> separate data traffic from 1 client to another, with each group dedicated to
>> a specific resource/purpose (usually based on business function, etc).
>>
>> On a side note, some of what you are wanting to do "scale-wise" seems
>> like a perfect fit for Pivotal CloudFoundry, which can auto-scale up or
>> down nodes in your cluster based on load and other factors.
>>
>> Anyway, hope this helps!
>>
>> -John
>>
>>
>>
>>
>>
>> [1] http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-extension-bpp
>> [2] http://geode.apache.org/docs/guide/basic_config/the_cache/setting_cache_initializer.html
>>
>>
>> On Sun, Dec 25, 2016 at 11:12 PM, Amit Pandey <[email protected]>
>> wrote:
>>
>>> Hey,
>>>
>>> Thanks.
>>>
>>> I have lots of reference data which will be loaded at start of day. This
>>> data is not likely to change much and as such I want to keep it loaded from
>>> the start of day. Read-through would make it slow when it is actually
>>> accessed, so I want to keep it loaded in memory.
>>>
>>> Also I want to have functions which will be called by clients to do some
>>> compute and return results. Using functions should allow me to add nodes
>>> and speed up the compute.
>>>
>>> I have some micro services each of which will start a gemfire node, and
>>> I want to connect, so yes I can set it up with locator.
>>>
>>> However I have one doubt, if I start with embedded server is it required
>>> to use client pool or is it not required?
>>>
>>> Regards
>>>
>>> On Mon, Dec 26, 2016 at 1:18 AM, Udo Kohlmeyer <[email protected]>
>>> wrote:
>>>
>>>> Hi there Amit,
>>>>
>>>> At this stage the only way you could load all data in one go is to
>>>> write a client to connect to the db and load it all in. Another approach
>>>> could be to write the same code into a function and invoke the function at
>>>> start up. But both approaches are manual.
>>>>
>>>> To have geode servers join a cluster, you have 2 ways.
>>>>
>>>>    1. Connecting them up via a locator
>>>>    2. Connecting them up via mcast.
>>>>
>>>> Please be aware that once you connect a server to a cluster, that server
>>>> becomes an integral part of the cluster, so adding/removing servers from a
>>>> cluster is not something you'd want to do in a load-based scaling model,
>>>> i.e. if the load is high, add a server and if load is low, shut down a
>>>> server.
>>>>
>>>> Just for interest's sake, what is your use case?
>>>>
>>>> --Udo
>>>>
>>>> On 12/24/16 05:57, Amit Pandey wrote:
>>>>
>>>> Hi Guys,
>>>>
>>>> I am using Spring Data Geode. I have been able to use read and write
>>>> through/ write behind. I want to load all data on cache startup at a go.
>>>>
>>>> Secondly my geode server is embedded but I want to allow it to join
>>>> other nodes.  How should I set it up in config to allow it to join other
>>>> nodes in cluster?
>>>>
>>>> Regards
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> -John
>> john.blum10101 (skype)
>>
>
>


-- 
-John
john.blum10101 (skype)
