(moving future discussion on listener hooks to [email protected])
We should take a look at HBase. They have an Observer API that runs
server-side, which might serve as a good starting point. IIRC, it's
designed for implementing this kind of functionality.
-------- Original Message --------
Subject: Re: Trigger for Accumulo table
Date: Tue, 8 Dec 2015 18:28:17 -0500
From: Adam Fuchs <[email protected]>
Reply-To: [email protected]
To: [email protected]
I totally agree, Christopher. I have also run into a few situations
where it would have been nice to have something like a mutation listener
hook, particularly for generating indexing and stats records.
Adam
On Tue, Dec 8, 2015 at 5:59 PM, Christopher <[email protected]> wrote:
In the future, it might be useful to provide a supported API hook
here. It certainly would've made implementing replication easier,
but could also be useful as a notification system.
On Tue, Dec 8, 2015 at 4:51 PM Keith Turner <[email protected]> wrote:
Constraints are checked before data is written. In the case of
failures, a constraint may see data that's never successfully
written.
On Tue, Dec 8, 2015 at 4:18 PM, Christopher <[email protected]> wrote:
Look at org.apache.accumulo.core.constraints.Constraint for
a description and
org.apache.accumulo.core.constraints.DefaultKeySizeConstraint as
an example.
In short, Mutations which are live-ingested into a tablet
server are validated against constraints you specify on the
table. That means that all Mutations written to a table go
through this bit of user-provided code at least once. You
could use that fact to your advantage. However, this would
be highly experimental and might have some caveats to consider.
You can configure a constraint on a table with
connector.tableOperations().addConstraint(...)
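
To make the idea above a bit more concrete, here is a minimal, self-contained Java sketch of a constraint used purely as a notification hook. It deliberately does not depend on the Accumulo jars: RowUpdate, ConstraintLike, Notifier and NotifyingConstraint are hypothetical stand-ins invented for illustration. A real version would implement org.apache.accumulo.core.constraints.Constraint and receive a Mutation instead. The check never reports a violation, so it cannot reject a write; it only passes the row on. Because constraints are checked before data is written, a notification sent this way may refer to a write that later fails.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NotifyingConstraintSketch {

    /** Hypothetical stand-in for a mutation: only the row it touches. */
    static final class RowUpdate {
        final String row;

        RowUpdate(String row) {
            this.row = row;
        }
    }

    /** Hypothetical stand-in mirroring the general shape of a constraint check. */
    interface ConstraintLike {
        /** Returns violation codes; an empty list means the update is accepted. */
        List<Short> check(RowUpdate update);
    }

    /** Hypothetical sink for notifications (a queue, an HTTP call, etc.). */
    interface Notifier {
        void rowUpdated(String row);
    }

    /** Accepts every update and reports the touched row to the notifier. */
    static final class NotifyingConstraint implements ConstraintLike {
        private final Notifier notifier;

        NotifyingConstraint(Notifier notifier) {
            this.notifier = notifier;
        }

        @Override
        public List<Short> check(RowUpdate update) {
            try {
                notifier.rowUpdated(update.row);
            } catch (RuntimeException e) {
                // A failed notification must not cause the write to be rejected.
            }
            // No violations: this constraint is only a hook, not a validator.
            return Collections.emptyList();
        }
    }

    /** Runs two updates through the constraint and returns the rows that were reported. */
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        ConstraintLike constraint = new NotifyingConstraint(seen::add);
        constraint.check(new RowUpdate("row1"));
        constraint.check(new RowUpdate("row2"));
        return seen;
    }

    /** Shows that a notifier that throws still yields no violations. */
    static List<Short> demoFailingNotifier() {
        ConstraintLike constraint = new NotifyingConstraint(row -> {
            throw new IllegalStateException("notification endpoint unavailable");
        });
        return constraint.check(new RowUpdate("row3"));
    }

    public static void main(String[] args) {
        System.out.println(demo());
        System.out.println(demoFailingNotifier().isEmpty());
    }
}
```

Swallowing the notifier's exception is a design choice in this sketch: it keeps ingest working when the external system is down, at the cost of silently dropped notifications.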
On Sun, Dec 6, 2015 at 10:49 PM Thai Ngo <[email protected]> wrote:
Christopher,
This is interesting! Could you please give me more
details about this?
Thanks,
Thai
On Thu, Dec 3, 2015 at 12:17 PM, Christopher <[email protected]> wrote:
You could also implement a constraint to notify an
external system when a row is updated.
On Wed, Dec 2, 2015, 22:54 Josh Elser <[email protected]> wrote:
oops :)
[1] http://fluo.io/
Josh Elser wrote:
> Hi Thai,
>
> There is no out-of-the-box feature provided with Accumulo that does what
> you're asking for. Accumulo doesn't provide any functionality to push
> notifications to other systems. You could potentially maintain other
> tables/columns in which you maintain the last time a row was updated,
> but the onus is on your "other services" to read the table to find out
> when a change occurred (which is probably not scalable at "real time").
>
> There are other systems you could likely leverage to solve this,
> depending on the durability and scalability that your application needs.
>
> For a system "close" to Accumulo, you could take a look at Fluo [1]
> which is an implementation of Google's "Percolator" system. This is a
> system optimized for throughput rather than low latency, so it may not
> be a good fit for your needs. There are probably other systems in the
> Apache ecosystem (Kafka, Storm, Flink or Spark Streaming maybe?) that
> may be helpful to your problem. I'm not an expert on these to recommend
> one (nor do I think I understand your entire architecture well enough).
>
> Thai Ngo wrote:
>> Hi list,
>>
>> I have a use case where existing rows in a table will be updated by an
>> internal service. Data in a row of this table is composed of 2 parts:
>> the 1st part is immutable and the 2nd will be updated (filled in) a
>> little later.
>>
>> Currently, I need to know when and which rows are updated in the table
>> so that other services can start consuming the data at the right time.
>> This makes even more sense when I need to consume the data in near
>> real time. So developing a notification function or, more simply, a
>> trigger is what I really want to do now.
>>
>> I am curious to know if someone has done a similar job or if there are
>> features, APIs, or best practices available for Accumulo so far. I'm
>> thinking of letting the internal service which updates the data notify
>> us whenever it updates the data.
>>
>> What do you think?
>>
>> Thanks,
>> Thai