Re: Is anyone using serialized iterators to provide provenance data?

Josh Elser Wed, 15 May 2013 17:58:25 -0700

Oh, I see what you mean. Table B was created from table A with a function F (where F is some collection of iterators like you said).

It could be a neat application of the clone command. Storing that information on table B is some exercise in where to put that immutable information (that's me ignoring that problem :P).

You say git: do you actually intend to have a cheap replay ability? Or merely be able to view the history and be able to work through the transformations again?


Seems reasonable for a 1.6 wish to me.

On 05/15/2013 08:44 PM, David Medinets wrote:

I don't see those as covering the same ground. Let's say I have an Accumulo table for a given human's genome. As a scientist, I want to apply a set of filters to create a subset of the genome. This provides a transform from data-set A to data-set B. Since iterators were used for the transform, we could serialize the set of iterators used by the transformation. Both data-sets are immutable. Think git for data-sets.
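
A minimal sketch of the "git for data-sets" idea, assuming nothing about Accumulo's own API: the `IteratorSetting` dataclass below is a hypothetical stand-in for a configured iterator, the filter class names and table names are made up, and the catalog is just an in-memory dict. It serializes the iterator stack in a canonical form, hashes the record so that it is content-addressed, and links each record to its parent so the history can be walked.

```python
import hashlib
import json
from dataclasses import dataclass, asdict, field


@dataclass(frozen=True)
class IteratorSetting:
    # Hypothetical stand-in for one configured iterator; not the Accumulo class.
    priority: int
    name: str
    iterator_class: str
    options: dict = field(default_factory=dict)


def serialize_stack(stack):
    """Return canonical JSON for an iterator stack, ordered by priority."""
    ordered = sorted(stack, key=lambda s: s.priority)
    return json.dumps([asdict(s) for s in ordered],
                      sort_keys=True, separators=(",", ":"))


def provenance_record(source_table, derived_table, stack, parent_id=None):
    """Build a content-addressed record: (sha1 of canonical body, body)."""
    body = {
        "source": source_table,
        "derived": derived_table,
        "iterators": json.loads(serialize_stack(stack)),
        "parent": parent_id,  # id of the record that produced the source table
    }
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    record_id = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return record_id, canonical


def lineage(catalog, record_id):
    """Walk parent links back to the root; return (source, derived) pairs, oldest first."""
    chain = []
    while record_id is not None:
        body = json.loads(catalog[record_id])
        chain.append((body["source"], body["derived"]))
        record_id = body["parent"]
    return list(reversed(chain))


# Illustrative usage with made-up table and filter names.
catalog = {}
filters = [
    IteratorSetting(20, "chromFilter", "example.ChromosomeFilter", {"chromosome": "7"}),
    IteratorSetting(10, "qualityFilter", "example.QualityFilter", {"minScore": "30"}),
]
id_b, rec_b = provenance_record("genome_A", "genome_B", filters)
catalog[id_b] = rec_b

id_c, rec_c = provenance_record(
    "genome_B", "genome_C",
    [IteratorSetting(10, "geneFilter", "example.GeneFilter", {"gene": "CFTR"})],
    parent_id=id_b,
)
catalog[id_c] = rec_c

print(lineage(catalog, id_c))
```

Because the record id is a hash of the canonical body, the same transform recorded twice gets the same id, and since both data-sets are immutable the record never needs updating. Whether this supports cheap replay or only lets you view the history depends on whether the source table is still around to re-run the iterators against.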

On Wed, May 15, 2013 at 4:25 PM, Christopher <[email protected]> wrote:
    I think this might relate to ACCUMULO-1397, in the form of providing a
    mechanism to specify iterator profiles, or ACCUMULO-415.

    --
    Christopher L Tubbs II
    http://gravatar.com/ctubbsii


    On Wed, May 15, 2013 at 2:51 PM, David Medinets <[email protected]> wrote:
    > If you apply a set of iterators to one table to produce another, it seems
    > possible to serialize the iterator stack alongside the new table in some
    > catalog to provide provenance. The assumption is that the tables are
    > immutable, I think. Is anyone doing this or has anyone thought about doing
    > so? Just curious and wanted to ask before I forgot about the idea.
