(moving future discussion on listener hooks to [email protected])

We should take a look at HBase. They have an Observer API that runs server side which might serve as a good starting point. IIRC, it's designed for implementing this kind of functionality.

-------- Original Message --------
Subject:        Re: Trigger for Accumulo table
Date:   Tue, 8 Dec 2015 18:28:17 -0500
From:   Adam Fuchs <[email protected]>
Reply-To:       [email protected]
To:     [email protected]



I totally agree, Christopher. I have also run into a few situations
where it would have been nice to have something like a mutation listener
hook. Particularly in generating indexing and stats records.

Adam


On Tue, Dec 8, 2015 at 5:59 PM, Christopher <[email protected]
<mailto:[email protected]>> wrote:

    In the future, it might be useful to provide a supported API hook
    here. It certainly would've made implementing replication easier,
    but could also be useful as a notification system.

    On Tue, Dec 8, 2015 at 4:51 PM Keith Turner <[email protected]
    <mailto:[email protected]>> wrote:

        Constraints are checked before data is written.  In the case of
        failures, a constraint may see data that's never successfully written.

        On Tue, Dec 8, 2015 at 4:18 PM, Christopher <[email protected]
        <mailto:[email protected]>> wrote:

            Look at org.apache.accumulo.core.constraints.Constraint for
            a description and
            org.apache.accumulo.core.constraints.DefaultKeySizeConstraint as
            an example.

            In short, Mutations which are live-ingested into a tablet
            server are validated against constraints you specify on the
            table. That means that all Mutations written to a table go
            through this bit of user-provided code at least once. You
            could use that fact to your advantage. However, this would
            be highly experimental and might have some caveats to consider.

            You can configure a constraint on a table with
            connector.tableOperations().addConstraint(...)
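
A minimal sketch of the constraint-as-notifier idea described above, not a supported Accumulo feature. To stay self-contained it uses stand-in types (SketchMutation, SketchConstraint) that only mirror the shape of org.apache.accumulo.core.data.Mutation and org.apache.accumulo.core.constraints.Constraint; RowUpdateListener and NotifyingConstraint are hypothetical names. The real check() method also receives an Environment argument, and a real constraint is loaded by class name on the tablet server, so the listener could not be passed through a constructor as it is here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in for org.apache.accumulo.core.data.Mutation (simplified assumption).
final class SketchMutation {
    private final String row;
    SketchMutation(String row) { this.row = row; }
    String getRow() { return row; }
}

// Stand-in mirroring the shape of the Accumulo Constraint interface.
// The real check() also takes an Environment parameter, omitted here.
interface SketchConstraint {
    String getViolationDescription(short violationCode);
    List<Short> check(SketchMutation mutation);
}

// Hypothetical notification sink (a queue, an HTTP call, a log, ...).
interface RowUpdateListener {
    void rowUpdated(String row);
}

// A "constraint" that never rejects anything and only reports what it sees.
final class NotifyingConstraint implements SketchConstraint {
    private final RowUpdateListener listener;

    NotifyingConstraint(RowUpdateListener listener) {
        this.listener = listener;
    }

    @Override
    public String getViolationDescription(short violationCode) {
        return null; // this sketch defines no violation codes
    }

    @Override
    public List<Short> check(SketchMutation mutation) {
        try {
            listener.rowUpdated(mutation.getRow());
        } catch (RuntimeException e) {
            // Swallow the failure so a broken listener does not reject the write.
        }
        return Collections.emptyList(); // no violations: the mutation is accepted
    }
}

final class NotifyingConstraintDemo {
    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        SketchConstraint constraint = new NotifyingConstraint(seen::add);
        constraint.check(new SketchMutation("row1"));
        System.out.println(seen);
    }
}
```

Because constraints run before the data is written, a listener like this can be told about a mutation that is never successfully persisted, so a consumer would likely need to re-read the row before acting on a notification.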


            On Sun, Dec 6, 2015 at 10:49 PM Thai Ngo
            <[email protected] <mailto:[email protected]>> wrote:

                Christopher,

                This is interesting! Could you please give me more
                details about this?

                Thanks,
                Thai

                On Thu, Dec 3, 2015 at 12:17 PM, Christopher
                <[email protected] <mailto:[email protected]>> wrote:

                    You could also implement a constraint to notify an
                    external system when a row is updated.


                    On Wed, Dec 2, 2015, 22:54 Josh Elser
                    <[email protected] <mailto:[email protected]>>
                    wrote:

                        oops :)

                        [1] http://fluo.io/

                        Josh Elser wrote:
                         > Hi Thai,
                         >
                         > There is no out-of-the-box feature provided with Accumulo that does what
                         > you're asking for. Accumulo doesn't provide any functionality to push
                         > notifications to other systems. You could potentially maintain other
                         > tables/columns in which you record the last time a row was updated,
                         > but the onus is on your "other services" to read the table to find out
                         > when a change occurred (which is probably not scalable at "real time").
                         >
                         > There are other systems you could likely leverage to solve this,
                         > depending on the durability and scalability that your application needs.
                         >
                         > For a system "close" to Accumulo, you could take a look at Fluo [1]
                         > which is an implementation of Google's "Percolator" system. This is a
                         > system optimized for throughput rather than low latency, so it may not be a
                         > good fit for your needs. There are probably other systems in the Apache
                         > ecosystem (Kafka, Storm, Flink or Spark Streaming maybe?) that would be
                         > helpful to your problem. I'm not an expert on these to recommend one (nor
                         > do I think I understand your entire architecture well enough).
                         >
                         > Thai Ngo wrote:
                         >> Hi list,
                         >>
                         >> I have a use case where existing rows in a table will be updated by an
                         >> internal service. Data in a row of this table is composed of 2 parts:
                         >> the 1st part is immutable and the 2nd will be updated (filled in) a
                         >> little later.
                         >>
                         >> Currently, I need to know when and which rows are updated in the table
                         >> so that other services can start consuming the data at the right time.
                         >> This matters most when I need to consume the data in near real time.
                         >> So developing a notification function or, more simply, a trigger
                         >> is what I really want to do now.
                         >>
                         >> I am curious to know if someone has done a similar job or whether there
                         >> are features, APIs, or best practices available for Accumulo so far. I'm
                         >> thinking of letting the internal service which updates the data notify
                         >> us whenever it updates the data.
                         >>
                         >> What do you think?
                         >>
                         >> Thanks,
                         >> Thai



