Also, since it hasn't been stated otherwise: using the TransformingIterator is on the fringes of "normal". Your life may be much simpler if you write a MapReduce job to rewrite your data. Implementing the iterator correctly is a little obtuse (as you're noticing) and is not at all straightforward to debug. If it's reasonable to rewrite your data, that may be the easier solution IMO.
madhvi wrote:
Hi All,

If anyone has worked on the TransformingIterator, can you tell me whether the iterator also makes the transformed changes in the Accumulo table, or whether it only returns the result at scan time? Can you provide details on how to implement its abstract methods, their use, and the workflow of the iterator?

Thanks
Madhvi

On Wednesday 27 May 2015 05:38 PM, Andrew Wells wrote:

to implement that iterator. Looks like you will only need to override replaceColumnFamily, and this looks to return the new ColumnFamily via the argument. So manipulate the Text object provided.

On Wed, May 27, 2015 at 8:06 AM, Andrew Wells <awe...@clearedgeit.com> wrote:

Looks like you want to override these methods <http://accumulo.apache.org/1.6/apidocs/org/apache/accumulo/core/iterators/user/TransformingIterator.html>:

protected Key replaceColumnFamily(Key originalKey, org.apache.hadoop.io.Text newColFam)
    Make a new key with all parts (including delete flag) coming from originalKey but use newColFam as the column family.

protected Key replaceColumnQualifier(Key originalKey, org.apache.hadoop.io.Text newColQual)
    Make a new key with all parts (including delete flag) coming from originalKey but use newColQual as the column qualifier.
protected Key replaceColumnVisibility(Key originalKey, org.apache.hadoop.io.Text newColVis)
    Make a new key with all parts (including delete flag) coming from originalKey but use newColVis as the column visibility.

protected Key replaceKeyParts(Key originalKey, org.apache.hadoop.io.Text newColQual, org.apache.hadoop.io.Text newColVis)
    Make a new key with a column qualifier, and column visibility.

protected Key replaceKeyParts(Key originalKey, org.apache.hadoop.io.Text newColFam, org.apache.hadoop.io.Text newColQual, org.apache.hadoop.io.Text newColVis)
    Make a new key with a column family, column qualifier, and column visibility.

On Wed, May 27, 2015 at 7:40 AM, shweta.agrawal <shweta.agra...@orkash.com> wrote:

Thanks for all the suggestions.
I read about TransformingIterator and started implementing it. I extended the class and tried to override its abstract method, but I can't work out where and what to write to change the column family. So please provide your suggestions.

Thanks
Shweta

On Tuesday 26 May 2015 08:33 PM, Adam Fuchs wrote:

This can also be done with a row-doesn't-fit-into-memory constraint. You won't need to hold the second column in memory if your iterator tree deep copies, filters, transforms, and merges.

Exhibit A:

          [HeapIterator-derivative]
              |_________________________
              |                         \
    [transform-graph1-to-graph2]         \
              |                           \
    [column-family-graph1]     [all-but-column-family-graph1]

With this design, you can subclass the HeapIterator, deep copy the source in the init method, wrap one copy in a custom transform iterator, and create an appropriate seek method. This is probably more on the advanced side of Accumulo programming, but it can be done.

Adam

On Tue, May 26, 2015 at 8:59 AM, Eric Newton <eric.new...@gmail.com> wrote:

Short answer: no.

Long answer: maybe. You can write an iterator which will transform:

    row, cf1, cq, vis -> value

into:

    row, cf2, cq, vis -> value

And if you can do this while maintaining sort order, you can get your new ColumnFamily transformed during scans and compactions. But this bit about maintaining the sort order is more complex than it sounds. If you have the following:

    row, a, cq, vis -> value
    row, aa, cq, vis -> value

And you want to transform cf "a" into cf "b":

    row, aa, cq, vis -> value
    row, b, cq, vis -> value

Your iterator needs to hold the second column in memory, after transforming the first column. Tablet server memory for holding Key/Values is not infinite.

-Eric

On Tue, May 26, 2015 at 8:44 AM, shweta.agrawal <shweta.agra...@orkash.com> wrote:

Hi,

I want to ask: is it possible in Accumulo to change the column family without changing the whole data?
Suppose my column family is graph1, and now I want to rename this column family to graph2. Is it possible?

Thanks
Shweta

--
Andrew George Wells
Software Engineer
awe...@clearedgeit.com
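
Eric's sort-order example and Adam's merge design can be tried out without a cluster. What follows is a minimal plain-Java sketch, not Accumulo code. SimpleKey, Cursor, renameInPlace, and renameAndMerge are hypothetical names made up for illustration; keys carry only a row and a column family; and Java string ordering is assumed to stand in for Accumulo's byte ordering. The two streams are materialized as lists only to keep the sketch short. In a real iterator tree they would presumably be two deep copies of the source, so nothing would need to be buffered.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class RenameFamilySketch {

    /** Hypothetical stand-in for an Accumulo Key: only row and column family. */
    static final class SimpleKey implements Comparable<SimpleKey> {
        final String row;
        final String family;

        SimpleKey(String row, String family) {
            this.row = row;
            this.family = family;
        }

        @Override
        public int compareTo(SimpleKey o) {
            int c = row.compareTo(o.row);
            return c != 0 ? c : family.compareTo(o.family);
        }

        @Override
        public String toString() {
            return row + "/" + family;
        }
    }

    /** True if the keys are in non-decreasing order. */
    static boolean isSorted(List<SimpleKey> keys) {
        for (int i = 1; i < keys.size(); i++) {
            if (keys.get(i - 1).compareTo(keys.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    /** Naive rename: rewrite each key where it stands. The output may be out of order. */
    static List<SimpleKey> renameInPlace(List<SimpleKey> sorted, String from, String to) {
        List<SimpleKey> out = new ArrayList<>();
        for (SimpleKey k : sorted) {
            out.add(k.family.equals(from) ? new SimpleKey(k.row, to) : k);
        }
        return out;
    }

    /** A sorted stream plus its current head, so that streams can be ordered in a heap. */
    static final class Cursor implements Comparable<Cursor> {
        final Iterator<SimpleKey> rest;
        SimpleKey head;

        Cursor(Iterator<SimpleKey> it) { // caller guarantees the stream is non-empty
            rest = it;
            head = it.next();
        }

        boolean advance() {
            if (!rest.hasNext()) {
                return false;
            }
            head = rest.next();
            return true;
        }

        @Override
        public int compareTo(Cursor o) {
            return head.compareTo(o.head);
        }
    }

    /** Split into "renamed family" and "everything else", then heap-merge the two sorted streams. */
    static List<SimpleKey> renameAndMerge(List<SimpleKey> sorted, String from, String to) {
        List<SimpleKey> renamed = new ArrayList<>();
        List<SimpleKey> others = new ArrayList<>();
        for (SimpleKey k : sorted) {
            if (k.family.equals(from)) {
                renamed.add(new SimpleKey(k.row, to));
            } else {
                others.add(k);
            }
        }
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        if (!renamed.isEmpty()) heap.add(new Cursor(renamed.iterator()));
        if (!others.isEmpty()) heap.add(new Cursor(others.iterator()));

        List<SimpleKey> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();       // stream whose head is smallest
            out.add(c.head);
            if (c.advance()) heap.add(c); // re-insert with its new head
        }
        return out;
    }

    public static void main(String[] args) {
        List<SimpleKey> table = List.of(new SimpleKey("row", "a"), new SimpleKey("row", "aa"));
        List<SimpleKey> naive = renameInPlace(table, "a", "b");
        List<SimpleKey> merged = renameAndMerge(table, "a", "b");
        System.out.println(naive + " sorted=" + isSorted(naive));   // [row/b, row/aa] sorted=false
        System.out.println(merged + " sorted=" + isSorted(merged)); // [row/aa, row/b] sorted=true
    }
}
```

Each of the two streams stays sorted on its own (every renamed key shares the same new family, and the remainder is a subsequence of sorted input), which is why a heap merge is enough to restore the overall order.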