Hi Ted - if the data is keyed you can use a key-compacted topic and
essentially keep the data 'forever', i.e., you'll always have the latest
version of the data for a given key. However, you'd still want to back up
the data someplace else just in case.
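
For illustration, something like this rough sketch (using the Java
AdminClient from newer client releases - the topic name, partition and
replication counts, and broker address below are just placeholders) would
create such a topic; the important bit is cleanup.policy=compact:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class CompactedTopicSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder broker address - point this at your own cluster.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // cleanup.policy=compact keeps the latest record per key
            // instead of deleting data purely by age or size.
            NewTopic topic = new NewTopic("customer-master", 3, (short) 3)
                    .configs(Collections.singletonMap(
                            TopicConfig.CLEANUP_POLICY_CONFIG,
                            TopicConfig.CLEANUP_POLICY_COMPACT));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}

You can get the same effect from the command line by passing
--config cleanup.policy=compact to the kafka-topics tool when creating
the topic.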

On 16 February 2016 at 21:25, Ted Swerve <ted.swe...@gmail.com> wrote:

> I guess I was just drawn in by the elegance of having everything available
> in one well-defined Kafka topic should I start up some new code.
>
> If instead the Kafka topics were on a retention period of say 7 days, that
> would involve firing up a topic to load the warehoused data from HDFS (or a
> more traditional load), and then switching over to the live topic?
>
> On Tue, Feb 16, 2016 at 8:32 AM, Ben Stopford <b...@confluent.io> wrote:
>
> > Ted - it depends on your domain. More conservative approaches to
> > long-lived data protect against data corruption, which generally means
> > snapshots and cold storage.
> >
> >
> > > On 15 Feb 2016, at 21:31, Ted Swerve <ted.swe...@gmail.com> wrote:
> > >
> > > Hi Ben, Sharninder,
> > >
> > > Thanks for your responses, I appreciate it.
> > >
> > > Ben - thanks for the tips on settings. A backup could certainly be a
> > > possibility, although if it only came with similar durability
> > > guarantees, I'm not sure what the purpose would be?
> > >
> > > Sharninder - yes, we would only be using the logs as forward-only
> > > streams - i.e. picking an offset to read from and moving forwards -
> > > and would be setting retention time to essentially infinite.
> > >
> > > Regards,
> > > Ted.
> > >
> > > On Tue, Feb 16, 2016 at 5:05 AM, Sharninder Khera <sharnin...@gmail.com>
> > > wrote:
> > >
> > >> This topic comes up often on this list. Kafka can be used as a
> > >> datastore if that’s what your application wants, with the caveat that
> > >> Kafka isn’t designed to keep data around forever. There is a default
> > >> retention time after which older data gets deleted. The high-level
> > >> consumer essentially reads data as a stream, and while you can do a
> > >> sort of random access with the low-level consumer, it’s not ideal.
> > >>
> > >>
> > >>
> > >>> On 15-Feb-2016, at 10:26 PM, Ted Swerve <ted.swe...@gmail.com> wrote:
> > >>>
> > >>> Hello,
> > >>>
> > >>> Is it viable to use infinite-retention Kafka topics as a master data
> > >>> store?  I'm not talking massive volumes of data here, but still
> > >>> potentially extending into tens of terabytes.
> > >>>
> > >>> Are there any drawbacks or pitfalls to such an approach?  It seems
> > >>> like a compelling design, but there seem to be mixed messages about
> > >>> its suitability for this kind of role.
> > >>>
> > >>> Regards,
> > >>> Ted
> > >>
> > >>
> >
> >
>
