NICE!

On Thu, Jul 12, 2012 at 11:47 AM, Billie J Rinaldi <billie.j.rina...@ugov.gov> wrote:

> On Thursday, July 12, 2012 8:47:41 AM, "David Medinets" <david.medin...@gmail.com> wrote:
> > I'd like to track field level changes for a given record (say,
> > author). So I create a table without a VersioningIterator. And I
> > insert a few records:
> >
> > insert "JOHN" "ATTRIBUTE" "AGE" "34"
> > insert "JOHN" "ATTRIBUTE" "HEIGHT" "67"
> > insert "JOHN" "BOOKS" "TITLE" "THE RISE OF ACCUMULO"
> >
> > The next action is that some ingest process happens and does this:
> >
> > insert "JOHN" "ATTRIBUTE" "AGE" "34"
> >
> > Since there is no VersioningIterator, there are two AGES both with
> > "34" as the value.
> >
> > I would like a DropUnchangedValueIterator which removes the last
> > inserted record. Removing the last record lets me use the n-1
> > timestamp as a LastUpdated value for the key-value pair. But as soon
> > as a record is deleted, the previous records are not available
> > anymore? What if the timestamp is set to MAX-timestamp so the records
> > are sorted backwards? Does that avoid the blocking tombstones? I'd
> > look at the source code before asking but I don't have that luxury for
> > the next week or two and the question is rattling around my head.
>
> This is mixing the idea of a deletion entry, which removes all earlier
> entries, and the idea that iterators can arbitrarily filter out
> entries. I don't think reversing the timestamp will help you much in this
> case; what you want is an iterator that does pairwise comparisons of
> entries, and if the values are the same keep one entry with the earlier
> timestamp (then keep comparing entries for that record), and if the values
> are different keep one entry with the later timestamp (then skip to the
> next record). I think you'll have to write a custom iterator for that.
>
> Billie
>
> > Naturally, I could query the database before the ingest insert. But,
> > referring to slide 19 in Adam's presentation at
> > http://people.apache.org/~afuchs/slides/accumulo_table_design.pdf, the
> > read-modify-write design is not optimal.

-- 
Corey Nolet
Senior Software Engineer
TexelTek, inc.
[Office] 301.880.7123
[Cell] 410-903-2110
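
For anyone who wants to see the pairwise comparison Billie describes spelled out, here is a rough sketch in plain Java. It is not an Accumulo iterator and uses no Accumulo classes: the Entry class, the collapse method, the flattened key string, the timestamps, and the older "33" value are all made up for illustration. It assumes entries arrive sorted by key and then by timestamp descending (newest first), which is how I understand Accumulo to order versions of a key.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the pairwise comparison; no Accumulo classes are used.
class LastUpdatedSketch {

    // Hypothetical stand-in for one Accumulo key-value pair.
    static final class Entry {
        final String key;      // row, column family, and qualifier flattened into one string
        final long timestamp;
        final String value;

        Entry(String key, long timestamp, String value) {
            this.key = key;
            this.timestamp = timestamp;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + " @" + timestamp + " = " + value;
        }
    }

    // Assumes 'sorted' is ordered by key, then by timestamp descending.
    // Emits one entry per key: the current value, stamped with the earliest
    // timestamp in the unbroken run of identical values at the newest end.
    static List<Entry> collapse(List<Entry> sorted) {
        List<Entry> out = new ArrayList<>();
        int i = 0;
        while (i < sorted.size()) {
            Entry kept = sorted.get(i);   // newest version of this key
            boolean changed = false;      // becomes true once an older value differs
            int j = i + 1;
            while (j < sorted.size() && sorted.get(j).key.equals(kept.key)) {
                Entry older = sorted.get(j);
                if (!changed && older.value.equals(kept.value)) {
                    kept = older;         // same value: keep the earlier timestamp
                } else {
                    changed = true;       // value differs: keep what we have, skip the rest
                }
                j++;
            }
            out.add(kept);
            i = j;                        // move on to the next key
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("JOHN ATTRIBUTE AGE", 30, "34"));
        entries.add(new Entry("JOHN ATTRIBUTE AGE", 20, "34"));
        entries.add(new Entry("JOHN ATTRIBUTE AGE", 10, "33")); // hypothetical older value
        entries.add(new Entry("JOHN ATTRIBUTE HEIGHT", 10, "67"));
        for (Entry e : collapse(entries)) {
            System.out.println(e);
        }
    }
}
```

With the sample data in main, AGE comes back as "34" with timestamp 20, the point where the value last changed, and HEIGHT comes back untouched. A real version would presumably do the same comparison inside a custom iterator on the scan or compaction path, as Billie suggests.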