+1 to all that. In addition, FastMap, BitVector and other classes in taste.common are being used (or should be used) by other packages. Could we start our own collections package, say ...mahout.collections?
About Cache: the two implementations differ for a reason. In one place, the cache class extends the datastore/retriever code. In the other, the datastore is independent of the cache, and the application or algorithm uses the cache explicitly, either to get data or to fetch data and put it into the cache. Also, it didn't make sense to give the HBase retriever the form <K,V>, because HBase is accessed as getCell(row, columnfamily, column), getFamily(row, columnfamily) or getRow(row).

About Pair: I really didn't see that. Again, let's move all these helper classes out of taste and make sure they get reused by other algorithms as well. That will also make it easier to add more Trove/Colt-like collections classes.
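
To make the two cache styles concrete, here is a rough sketch. Every name in it (Retriever, SlowStore, CachedStore, SimpleCache, CellStore, cachedCell) is made up for illustration and is not the actual taste or HBase code. In particular, CellStore only imitates the getCell(row, columnfamily, column) access pattern and does not use the real HBase client API.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheStylesSketch {

    /** Hypothetical <K,V> retriever; stands in for a key-value datastore. */
    interface Retriever<K, V> {
        V get(K key);
    }

    /** Hypothetical datastore that counts how often it is really hit. */
    static class SlowStore implements Retriever<String, String> {
        int loads = 0;

        @Override
        public String get(String key) {
            loads++;
            return "value-of-" + key;
        }
    }

    /** Style 1: the cache class extends the datastore/retriever code. */
    static class CachedStore extends SlowStore {
        private final Map<String, String> cache = new HashMap<>();

        @Override
        public String get(String key) {
            String value = cache.get(key);
            if (value == null) {
                value = super.get(key); // fall through to the datastore
                cache.put(key, value);
            }
            return value;
        }
    }

    /** Style 2: a standalone cache that knows nothing about the datastore. */
    static class SimpleCache<K, V> {
        private final Map<K, V> entries = new HashMap<>();

        V get(K key) {
            return entries.get(key);
        }

        void put(K key, V value) {
            entries.put(key, value);
        }
    }

    /** Hypothetical store with cell-style access instead of <K,V>. */
    static class CellStore {
        int loads = 0;

        String getCell(String row, String family, String column) {
            loads++;
            return row + ":" + family + ":" + column;
        }
    }

    /** The caller uses the cache explicitly: get, else fetch and put. */
    static String cachedCell(SimpleCache<String, String> cache, CellStore store,
                             String row, String family, String column) {
        String key = row + "/" + family + "/" + column; // assumed key scheme
        String value = cache.get(key);
        if (value == null) {
            value = store.getCell(row, family, column);
            cache.put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        CachedStore cached = new CachedStore();
        cached.get("a");
        cached.get("a");
        System.out.println("style 1 datastore hits: " + cached.loads);

        SimpleCache<String, String> cache = new SimpleCache<>();
        CellStore cells = new CellStore();
        cachedCell(cache, cells, "r1", "cf", "c1");
        cachedCell(cache, cells, "r1", "cf", "c1");
        System.out.println("style 2 datastore hits: " + cells.loads);
    }
}
```

In the second style the store keeps its natural (row, columnfamily, column) signature, and only the caller decides how to turn that into a cache key, which is roughly why a <K,V> retriever interface felt forced there.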