Hello,

I am trying to bulk upload content to hbase using the instructions provided at 
http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_description:
I have a mapper that reads input and emits KeyValue objects to be fed to the 
KeyValueSortReducer. The mapper emits a number of KeyValue objects for each 
row. For the same rowid, the KeyValue objects have different columnids.
The problem is the following: when these KeyValue objects (same rowid, 
different colids, same column family) reach the reducer, the TreeSet used to 
sort the KeyValues keeps only the one that arrives last; every earlier entry 
is replaced by it. It seems that KeyValue.COMPARATOR compares only the rowid.
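
To illustrate the kind of behaviour I mean, here is a minimal sketch in plain 
Java. It does not use HBase at all: Cell, ROW_ONLY and ROW_THEN_COLUMN are 
hypothetical stand-ins, not the real KeyValue or KeyValue.COMPARATOR. It only 
shows that a TreeSet treats elements its comparator considers equal as 
duplicates and keeps just one of them.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class TreeSetDedupSketch {
    // Hypothetical stand-in for a KeyValue: just a row id and a column id.
    static final class Cell {
        final String row;
        final String column;

        Cell(String row, String column) {
            this.row = row;
            this.column = column;
        }
    }

    // Assumption: a comparator that looks at the row id only.
    static final Comparator<Cell> ROW_ONLY =
            Comparator.comparing((Cell c) -> c.row);

    // A comparator that also distinguishes columns within a row.
    static final Comparator<Cell> ROW_THEN_COLUMN =
            Comparator.comparing((Cell c) -> c.row)
                      .thenComparing((Cell c) -> c.column);

    // Three cells sharing one row id but with different column ids.
    static List<Cell> sampleCells() {
        return Arrays.asList(
                new Cell("row1", "colA"),
                new Cell("row1", "colB"),
                new Cell("row1", "colC"));
    }

    // Puts the cells into a TreeSet ordered by cmp and reports how many survive.
    static int distinctCount(Comparator<Cell> cmp, List<Cell> cells) {
        TreeSet<Cell> sorted = new TreeSet<>(cmp);
        sorted.addAll(cells);
        return sorted.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount(ROW_ONLY, sampleCells()));        // prints 1
        System.out.println(distinctCount(ROW_THEN_COLUMN, sampleCells())); // prints 3
    }
}
```

With the row-only comparator a single entry survives; with the row-and-column 
comparator all three are kept, in sorted order.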

Can I use a different Comparator? Must KeyValue objects of the same rowid be 
sorted before they are written to the HFile, or does the order not matter?

Thank you in advance for your time.


--
Ioannis Konstantinou
Research Associate, Computing Systems Laboratory
National Technical University of Athens
Web: http://www.cslab.ntua.gr/~ikons
