Yes, the persistence feature could also be configurable on a per-topic
basis.  If desired, persistence could then be limited only to certain
critical topics, allowing less critical data to avoid the overhead.
There could also be configuration options specifying when and how to
persist the data.  Options might include calling fsync() immediately
after each write, fsyncing at specified intervals, or simply letting the
kernel decide when to write the dirty pages to disk from its page cache.
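As a rough illustration of the per-topic sync policies described above, a
sketch might look like the following (all names here are hypothetical and
not part of Dory's actual configuration):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-topic persistence policy (illustrative only).
enum class SyncPolicy {
  Immediate,   // fsync() after every write
  Interval,    // fsync() at most once per interval_ms
  KernelOnly   // never fsync(); let the page cache flush dirty pages
};

struct TopicPersistenceConfig {
  bool persist = false;                      // persistence off by default
  SyncPolicy sync = SyncPolicy::KernelOnly;  // cheapest option by default
  uint64_t interval_ms = 1000;               // used only for Interval
};

// Decide whether a write occurring at 'now_ms' should trigger an fsync().
// 'last_sync_ms' tracks the time of the most recent fsync for the topic.
bool ShouldSync(const TopicPersistenceConfig &cfg, uint64_t now_ms,
                uint64_t &last_sync_ms) {
  if (!cfg.persist || cfg.sync == SyncPolicy::KernelOnly) {
    return false;  // nothing to do, or kernel handles writeback
  }
  if (cfg.sync == SyncPolicy::Immediate) {
    last_sync_ms = now_ms;
    return true;
  }
  // SyncPolicy::Interval: sync only if the interval has elapsed.
  if (now_ms - last_sync_ms >= cfg.interval_ms) {
    last_sync_ms = now_ms;
    return true;
  }
  return false;
}
```

The point of the sketch is that the sync decision is cheap per write, so
the cost of a policy is dominated by how often fsync() actually runs.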


Dave


-----Original Message-----
From: "Pete Wright" <pwri...@rubiconproject.com>
Sent: Monday, June 13, 2016 11:22am
To: users@kafka.apache.org
Subject: Re: Introducing Dory

On Mon, Jun 13, 2016 at 11:45:13AM -0600, Jason J. W. Williams wrote:
> Hi Dave,
> 
> Dory sounds very exciting. Without persistence it's less useful for clients
> connected over a WAN, since if the WAN goes wonky you could build up quite
> a queue until it comes back.
> 
I was thinking the same thing.  My first thought was to see what the
level of effort (LOE) would be to implement some sort of spill-over
process, where if the preallocated memory segment is exhausted it could
spool data to disk.  When connectivity to the brokers is back, it could
then de-spool the data and produce it back to the cluster.  I could see
this being something worth pursuing if you are handling data that is
critical for financial reasons (as opposed to data used for
non-financial reporting or metrics).
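The spill-over idea above could be sketched roughly like this (class and
method names are hypothetical; a real implementation would spool to a file
rather than a second in-memory queue, and would need care around ordering
between fresh and de-spooled messages):

```cpp
#include <cassert>
#include <deque>
#include <string>

// Sketch of a bounded in-memory queue that spills overflow to a
// secondary store and de-spools when the brokers are reachable again.
class SpillOverQueue {
 public:
  explicit SpillOverQueue(size_t mem_capacity)
      : mem_capacity_(mem_capacity) {}

  void Enqueue(const std::string &msg) {
    if (memory_.size() < mem_capacity_) {
      memory_.push_back(msg);
    } else {
      disk_.push_back(msg);  // real code would append to a spool file
    }
  }

  // Remove and return the oldest in-memory message (caller must check
  // MemCount() first; a real implementation would handle empty queues).
  std::string Pop() {
    std::string msg = memory_.front();
    memory_.pop_front();
    return msg;
  }

  // Called when connectivity to the brokers is restored: move spooled
  // messages back into memory as space allows, oldest first.
  void Despool() {
    while (!disk_.empty() && memory_.size() < mem_capacity_) {
      memory_.push_back(disk_.front());
      disk_.pop_front();
    }
  }

  size_t MemCount() const { return memory_.size(); }
  size_t DiskCount() const { return disk_.size(); }

 private:
  size_t mem_capacity_;
  std::deque<std::string> memory_;
  std::deque<std::string> disk_;  // stand-in for an on-disk spool file
};
```

One design point worth noting: de-spooling only as memory frees up keeps
the memory bound intact even while draining a large on-disk backlog.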

-pete


-- 
Pete Wright
Lead Systems Architect
Rubicon Project
pwri...@rubiconproject.com
310.309.9298
