It depends on the size of the datasets and on the HBase workload. The usual approach is to do the join in Pig, store the result, and then load it with the HBase bulk load tool. That is only a general recommendation, since I don't know the details of your task.
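
For what it's worth, a rough sketch of that flow could look like the script below. All paths, field names, the table name `final_table`, and the column family `d` are made up for illustration, since I don't know your schema. It assumes the CSV is small enough to fit in memory, which is what makes the `replicated` (map-side) join applicable; if it is not, drop the `USING 'replicated'` clause and Pig will fall back to a regular reduce-side join.

```pig
-- Hypothetical inputs: adjust paths and schemas to your data.
events = LOAD '/data/events' USING PigStorage('\t')
         AS (id:chararray, name:chararray);
lookup = LOAD '/data/lookup.csv' USING PigStorage(',')
         AS (id:chararray, category:chararray);

-- Map-side join: the small relation must be listed last and fit in memory.
joined = JOIN events BY id, lookup BY id USING 'replicated';

-- Keep the row key first, then the columns to load.
result = FOREACH joined GENERATE events::id, events::name, lookup::category;

-- Tab-separated output, which is also ImportTsv's default separator.
STORE result INTO '/data/joined' USING PigStorage('\t');

-- Afterwards, outside Pig (table and column family are assumptions):
--   hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
--     -Dimporttsv.columns=HBASE_ROW_KEY,d:name,d:category \
--     -Dimporttsv.bulk.output=/tmp/hfiles final_table /data/joined
--   hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
--     /tmp/hfiles final_table
```

Going through HFiles this way keeps the load off the normal HBase write path, which is why it tends to be gentler on a cluster that is also serving other traffic. Writing directly from Pig with HBaseStorage is simpler, but it issues regular puts.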
2014-09-27 7:32 GMT+04:00 Krishna Kalyan <[email protected]>:
> Hi,
> We have a use case that involves ETL on data coming from several different
> sources using pig.
> We plan to store the final output table in HBase.
> What will be the performance impact if we do a join with an external CSV
> table using pig?.
>
> Regards,
> Krishna
