Thanks for all the suggestions.
I read about TransformingIterator and started implementing it. I
extended the class and tried to override its abstract methods, but I
cannot work out where and what to write to change the column family.
So please provide your suggestions.
Thanks
Shweta
On
On the surface it adds an additional level of specification/grouping.
The potential benefit in Accumulo is that, along with the fact that
identical rowIDs are guaranteed to be in the same file, you can use
Locality Groups to place specific column families into the same file as
well.
to implement that iterator.
Looks like you will only need to override replaceColumnFamily,
and this looks to return the new column family via the argument. So
manipulate the Text object provided.
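
To make the idea in the reply above concrete, here is a minimal plain-Java sketch of what "changing the column family" means for sorted keys. It is a toy model, not the Accumulo API: SimpleKey, FamilyRewriteSketch, and this replaceColumnFamily helper are hypothetical names for illustration, and plain Strings stand in for Text. The point it shows is that rewriting a family can change where a key sorts, so the transformed keys have to be put back in order before they are returned.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy stand-in for a key (assumption: only row, family, qualifier matter here).
final class SimpleKey {
    final String row;
    final String family;
    final String qualifier;

    SimpleKey(String row, String family, String qualifier) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
    }

    // Sort by row, then column family, then column qualifier.
    static final Comparator<SimpleKey> ORDER = Comparator
            .comparing((SimpleKey k) -> k.row)
            .thenComparing(k -> k.family)
            .thenComparing(k -> k.qualifier);

    @Override
    public String toString() {
        return row + " " + family + ":" + qualifier;
    }
}

final class FamilyRewriteSketch {
    // Hypothetical helper: copy the key, swapping in a new column family.
    static SimpleKey replaceColumnFamily(SimpleKey original, String newFamily) {
        return new SimpleKey(original.row, newFamily, original.qualifier);
    }

    // Rewrite every key whose family equals oldFamily, then restore sort order.
    static List<SimpleKey> transform(List<SimpleKey> sortedInput,
                                     String oldFamily, String newFamily) {
        List<SimpleKey> out = new ArrayList<>();
        for (SimpleKey k : sortedInput) {
            out.add(k.family.equals(oldFamily) ? replaceColumnFamily(k, newFamily) : k);
        }
        // The rewritten family may sort differently, so re-sort the result.
        out.sort(SimpleKey.ORDER);
        return out;
    }

    public static void main(String[] args) {
        List<SimpleKey> input = new ArrayList<>();
        input.add(new SimpleKey("row1", "a", "q1"));
        input.add(new SimpleKey("row1", "m", "q1"));
        input.add(new SimpleKey("row2", "a", "q1"));
        for (SimpleKey k : transform(input, "a", "z")) {
            System.out.println(k);
        }
    }
}
```

In this run the key that started first in row1 ends up after the untouched "m" key, which is why the re-sort step is part of the sketch.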
On Wed, May 27, 2015 at 8:06 AM, Andrew Wells awe...@clearedgeit.com
wrote:
Looks like you want to
I've been trying to understand the difference between the two column name
parts -- column family and column qualifier. I don't understand the value
of using the columnFamily for the column name and an empty text (new
Text(new byte[0])) field for the column qualifier vs. a non-unique column
name
Eric,
Thanks. I assume managing something like 280GB per tablet server is feasible
given the various knobs available to tune performance.
Regards,
Mike Fagan
From: Eric Newton eric.new...@gmail.com
Reply-To: user@accumulo.apache.org
Thanks everyone for their input.
I estimate I can use 20 tablet servers to support 1m lookups a day.
Are there any good rules of thumb regarding the amount of data/tablets
managed by a tablet server?
Regards,
Mike Fagan
On 5/22/15, 1:33 PM, Kepner, Jeremy - 0553 - MITLL kep...@ll.mit.edu
Thank you to all responders. This clears it up greatly.
Dave P
On Wed, May 27, 2015 at 10:52 AM, Christopher ctubb...@apache.org wrote:
David-
Both the column family (CF) and column qualifier (CQ) could be thought of
as arbitrary dimensions in the key. If you only need one dimension to
specify your data, the other can be empty. You could also store these in
separate tables, as you suggest, but part of the power of Accumulo is
You can get decent ingest concurrency when the number of tablets per server
is between 20 and 80.
There are so many knobs to adjust this performance, it's hard to give a
simple answer. 0-1 tablets/server is bad. 1000+/server is bad. Usually.
It will take time to tune your system.
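
As a rough back-of-the-envelope check, the figures quoted in this thread (20 tablet servers, 1m lookups a day, about 280GB per tablet server, 20 to 80 tablets per server) can be combined as below. This is only a sketch under loud assumptions: "1m" is read as one million, load and data are assumed to spread evenly, and the 20 to 80 range was given for ingest concurrency, not as a sizing rule. The class and method names are made up for illustration.

```java
final class TabletSizingSketch {
    private static final double SECONDS_PER_DAY = 86_400.0;

    // Average lookups per second that each server sees, assuming an even spread.
    static double lookupsPerSecondPerServer(long lookupsPerDay, int servers) {
        return lookupsPerDay / SECONDS_PER_DAY / servers;
    }

    // Average tablet size if a server's data is split evenly across its tablets.
    static double gbPerTablet(double gbPerServer, int tabletsPerServer) {
        return gbPerServer / tabletsPerServer;
    }

    public static void main(String[] args) {
        System.out.printf("lookups/s per server: %.2f%n",
                lookupsPerSecondPerServer(1_000_000L, 20));
        System.out.printf("GB per tablet at 20 tablets/server: %.1f%n",
                gbPerTablet(280, 20));
        System.out.printf("GB per tablet at 80 tablets/server: %.1f%n",
                gbPerTablet(280, 80));
    }
}
```

Under those assumptions the average lookup rate per server is well under one per second, and the tablets would average somewhere between 3.5GB and 14GB each. Real workloads are bursty and skewed, so treat these numbers as a starting point for tuning, not a prediction.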
On Wed, May
Couple of clarifications:
* Identical rowIDs will colocate data in the same tablet, but not
necessarily the same file. Tablets can have multiple files.
* Locality groups will colocate data within a file, not necessarily in
its own file. The RFile format supports multiple regions within the file
I believe the typical case would be to set it at the scan and major
compaction scopes for the table. This would ensure that queries for data
would see the transformed result and, eventually, all of the data would
be rewritten to the new schema (or you could force a major compaction
and know
Hi All,
If anyone has worked with the transforming iterator, can you tell me whether
it also writes the transformed changes to the Accumulo table, or only returns
the transformed result at scan time? Can you give me details on how to
implement its abstract methods, their use, and the workflow of the iterator?