Hi all,

I am running the code below and it takes a very long time. The input to
flatMapToPair is an RDD of 50K records, so I am calling HBase 50K times,
but each call is just a range scan and should not take long. Can anybody
tell me what is wrong here?

JavaPairRDD<VendorRecord, Iterable<VendorRecord>> pairvendorData =
    matchRdd.flatMapToPair(
        new PairFlatMapFunction<VendorRecord, VendorRecord, VendorRecord>() {

          @Override
          public Iterable<Tuple2<VendorRecord, VendorRecord>> call(VendorRecord t)
              throws Exception {
            List<Tuple2<VendorRecord, VendorRecord>> pairs =
                new LinkedList<Tuple2<VendorRecord, VendorRecord>>();
            MatcherKeys matchkeys = CompanyMatcherHelper.getBlockinkeys(t);
            List<VendorRecord> Matchedrecords =
                ckdao.getMatchingRecordsWithscan(matchkeys);
            for (int i = 0; i < Matchedrecords.size(); i++) {
              pairs.add(
                  new Tuple2<VendorRecord, VendorRecord>(t, Matchedrecords.get(i)));
            }
            return pairs;
          }
        })
    .groupByKey(200)
    .persist(StorageLevel.DISK_ONLY_2());
