Chiming in on one of Josh's comments
Since you're passing in what are likely multiple, disjoint ranges, I'm not
sure you're going to get much of a performance optimization out of a custom
iterator in this case. After each seek, your iterator would need to return
the entries that it summed in
Thanks Josh. It really worked for me.
On Wednesday 17 June 2015 08:43 PM, Josh Elser wrote:
Madhvi,
Understood. A few more questions..
How are you passing these IDs to the batch scanner? Are you providing
individual Ranges for each ID (e.g. `new Range(new Key(row1, "",
id1), true, new
Also, apparently I wrote something similar to your problem a long time ago:
https://github.com/joshelser/accumulo-column-summing
The above implementation does assume large contiguous ranges. Thought it
might be helpful anyways.
Josh Elser wrote:
Good, I'm glad you found it useful.
The
Hi,
Thanks for the blog you shared. I found it quite useful for my requirement.
How are you passing these IDs to the batch scanner?
I am passing row ids received as a previous query result from another
table as 'new Range(entry.getKey().getRow())' in a Range type list and
passing that list to
Hi Josh,
Sorry, my company policy doesn't allow me to share the full source. What we
are trying to do is sum over a unique field stored in the column
qualifier for the IDs passed to the batch scanner. Can you suggest how it
can be done in Accumulo?
Thanks
Madhvi
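One way to picture the request above is a plain client-side sum over the entries that come back for the requested IDs. This is only a minimal sketch under stated assumptions: `QualifierSum`, `Cell`, and `sumByRow` are hypothetical names, the entries are simulated with an in-memory list rather than a real batch scanner, and the column qualifier is assumed to hold a decimal integer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class QualifierSum {

    /** Hypothetical stand-in for one scanned entry: a row ID and its column qualifier. */
    public static final class Cell {
        final String row;
        final String qualifier;

        public Cell(String row, String qualifier) {
            this.row = row;
            this.qualifier = qualifier;
        }
    }

    /**
     * Sums the numeric qualifier of every cell whose row is one of the wanted IDs.
     * Assumption: qualifiers are decimal integers; anything else is skipped.
     */
    public static Map<String, Long> sumByRow(Iterable<Cell> cells, Set<String> wantedIds) {
        Map<String, Long> sums = new TreeMap<>();
        for (Cell cell : cells) {
            if (!wantedIds.contains(cell.row)) {
                continue; // not one of the IDs that were asked for
            }
            long value;
            try {
                value = Long.parseLong(cell.qualifier);
            } catch (NumberFormatException e) {
                continue; // qualifier is not numeric, ignore it
            }
            sums.merge(cell.row, value, Long::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("id1", "3"));
        cells.add(new Cell("id1", "4"));
        cells.add(new Cell("id2", "5"));
        cells.add(new Cell("id3", "9")); // not requested, so it is ignored

        Set<String> wanted = new HashSet<>(Arrays.asList("id1", "id2"));
        System.out.println(sumByRow(cells, wanted));
    }
}
```

Whether this work belongs on the client or in a server-side iterator depends on how many entries each ID returns; the sketch only shows the aggregation step itself.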
On Wednesday 17 June 2015 10:32 AM, Josh
Madhvi,
Understood. A few more questions..
How are you passing these IDs to the batch scanner? Are you providing
individual Ranges for each ID (e.g. `new Range(new Key(row1, "",
id1), true, new Key(row1, "", id1\x00), false)`)? Or are you
providing an entire row (or set of rows) and using the
You put random values in the family and qualifier? Do I misunderstand you?
Also, if you can put up the full source for the iterator, that will be
much easier if you need help debugging it. It's hard for us to guess at
why your code might not be working as you expect.
madhvi wrote:
Hi Josh,
It's hard to remotely debug an iterator, especially when we don't know
what it's doing. If you can post the code, that would help tremendously.
Instead of dumping values to a text file, you may fare better by
attaching a remote debugger to the TabletServer and setting a breakpoint
on your
Thanks Josh.
Outline of my code is:
public class TestIterator extends WrappingIterator {
HashMap<String, Integer> holder = new HashMap<>();
private Iterator<Map.Entry<String, Integer>> entries = null;
private Entry<String, Integer> entry = null;
private Key emitKey;
private Value emitValue;
@Override
To enable remote debugging, in ACCUMULO_TSERVER_OPTS in accumulo-env.sh,
add the following -Xdebug
-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=
In this case, you would then use the port in Eclipse to do a Remote
Java Application debugging session. Your TServer would need
What do you mean by multiple entries? Are you doing something similar to
the WholeRowIterator, which encodes all the entries for a given row into a
single key value?
Are you using any other iterators?
In general, calls to `hasTop()`, `getTopKey()` and `getTopValue()` should
not change the state
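The point about `hasTop()`, `getTopKey()` and `getTopValue()` can be illustrated with a small stand-alone sketch. It does not use Accumulo's actual iterator interface: `StableTopIterator` is a hypothetical class over plain strings, written only to show the pattern where `next()` is the one method that advances, so the read-only methods can be called any number of times (or skipped) without losing entries.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class StableTopIterator {

    private final Iterator<String> source;
    private String top; // current entry; null means there is no top

    public StableTopIterator(List<String> entries) {
        this.source = entries.iterator();
        advance(); // position on the first entry, if any
    }

    /** The only place where the iterator state changes. */
    private void advance() {
        top = source.hasNext() ? source.next() : null;
    }

    /** Read-only: safe to call repeatedly. */
    public boolean hasTop() {
        return top != null;
    }

    /** Read-only: returns the same value until next() is called. */
    public String getTopValue() {
        if (top == null) {
            throw new IllegalStateException("no top entry");
        }
        return top;
    }

    /** Moves to the following entry. */
    public void next() {
        if (top == null) {
            throw new IllegalStateException("called next() without a top entry");
        }
        advance();
    }

    public static void main(String[] args) {
        StableTopIterator it = new StableTopIterator(Arrays.asList("a", "b", "c"));
        while (it.hasTop()) {
            // Calling the getter twice must not skip an entry.
            System.out.println(it.getTopValue() + " " + it.getTopValue());
            it.next();
        }
    }
}
```

If an iterator instead advances inside `getTopValue()`, any caller that checks `hasTop()` and moves on without reading the value will see entries go missing, which matches the kind of symptom described in this thread.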
Possible explanation inline
shweta.agrawal wrote:
Hi,
I am making a custom iterator which returns multiple entries. For some
entries the getTopValue function is called; for others it is skipped. Due to
this behaviour I am not getting all the entries that should be returned at
scan time.
I had written