Hi, I have a Spark job that needs to access HBase inside a mapToPair function. The problem is that I do not want to open an HBase connection and close it again each time the function is called.
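To make the lifecycle I am after concrete, here is a minimal sketch in plain Java with no Spark or HBase dependencies. The names PerPartitionSketch, FakeConnection, lookupMetric and processPartition are hypothetical stand-ins I made up for illustration, not real Spark or HBase APIs. The idea is that the connection is opened once, reused for a whole batch of records, and closed once at the end, while the metric values are still returned to the caller.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class PerPartitionSketch {

    // Hypothetical stand-in for an HBase connection; NOT a real HBase class.
    static class FakeConnection implements AutoCloseable {
        static int opens = 0;   // counts how many connections were opened
        static int closes = 0;  // counts how many connections were closed

        FakeConnection() { opens++; }

        // Pretend lookup: returns the key length as the "metric" value.
        int lookupMetric(String key) { return key.length(); }

        @Override
        public void close() { closes++; }
    }

    // Processes a whole batch of records with a single connection:
    // the setup happens once before the loop, the close once after it.
    static List<Map.Entry<String, Integer>> processPartition(Iterator<String> records) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        try (FakeConnection conn = new FakeConnection()) {
            while (records.hasNext()) {
                String key = records.next();
                out.add(new SimpleEntry<String, Integer>(key, conn.lookupMetric(key)));
            }
        }
        return out;
    }
}
```

Calling processPartition on an iterator over three keys would open and close the fake connection once, instead of three times as a per-record function would.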
As I understand it, PairFunction is not designed to manage resources with setup() and close() methods the way Hadoop's reader and writer are. Does Spark support this kind of resource management? Your help is appreciated!

By the way, the reason I do not want to use a writer is that I need to return some metric values after processing, and those values will be processed further downstream. I would rather not use HDFS as an intermediate transfer location.

Thanks,
Huiliang