The DAG for a template happens to schedule two tasks that each do something
like this:

val fieldsRDD: RDD[(ItemID, PropertyMap)] = PEventStore.aggregateProperties(
  appName = dsp.appName,
  entityType = "item")(sc)

to execute in parallel.

The PEventStore calls from the two separate closures then hit HBase at the
same time and the job fails, no matter how high I set the RPC and scanner
timeouts.

This has only come up recently, after some restructuring, which I assume caused
the two tasks to end up at the same point in the DAG. Is there a way to force
one HBase-related task to complete before the other is started? They both
return RDDs, which are lazily evaluated, like promises, until the data is
needed. Can I force the promise to be kept?
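
One common way to force an RDD to be computed is to persist it and then run an
action on it, since actions block the calling driver thread until the job
finishes. The sketch below is only an illustration of that idea, not something
taken from the template: ForceRdd and materialize are hypothetical names, and
it assumes both reads are issued one after the other from the same driver
thread and that the result fits in the chosen storage level.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Hypothetical helper, not part of PredictionIO or Spark.
object ForceRdd {
  /** Persist the RDD and run an action so that it is computed now.
    * Returns the same RDD, whose partitions are then served from the cache. */
  def materialize[T](rdd: RDD[T]): RDD[T] = {
    // Spark refuses to change an already assigned storage level,
    // so only persist when none has been set yet.
    if (rdd.getStorageLevel == StorageLevel.NONE) {
      rdd.persist(StorageLevel.MEMORY_AND_DISK)
    }
    // count() is an action: it blocks until every partition is computed,
    // which is when the underlying store is actually read.
    rdd.count()
    rdd
  }
}
```

Wrapping the first aggregateProperties call in ForceRdd.materialize(...) before
making the second call should mean the first scan has finished before the
second one starts, at the cost of holding the first result in memory or on
disk.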
