Hello.

I am trying to write a storage function in Pig and I'd like to know what
guarantees there are on StoreFunc's prepareToWrite, cleanupOnFailure and
cleanupOnSuccess methods.

In particular, when are these functions called?  Is it once per task or once 
per tuple?

The store that I am writing to expects a flow like

Open connection.
Many, many writes.
Close connection.
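
To make that flow concrete, here is a rough sketch in plain Java. It is only
an illustration of the lifecycle the store expects: FakeConnection and
StoreFlowSketch are hypothetical names invented for this example, not Pig or
Hadoop classes, and nothing here says how Pig itself calls StoreFunc.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the external store's client (not a Pig class).
// It records each lifecycle event so the call order can be inspected.
class FakeConnection implements AutoCloseable {
    final List<String> events = new ArrayList<>();
    private boolean open = false;

    void open() {
        open = true;
        events.add("open");
    }

    void write(String record) {
        if (!open) {
            throw new IllegalStateException("connection is not open");
        }
        events.add("write:" + record);
    }

    @Override
    public void close() {
        open = false;
        events.add("close");
    }
}

class StoreFlowSketch {
    // Opens the connection once, writes every record, then closes once,
    // even if a write throws.
    static List<String> storeAll(List<String> records) {
        FakeConnection conn = new FakeConnection();
        conn.open();
        try {
            for (String record : records) {
                conn.write(record);
            }
        } finally {
            conn.close();
        }
        return conn.events;
    }

    public static void main(String[] args) {
        System.out.println(storeAll(List.of("a", "b", "c")));
    }
}
```

The open and close steps are assumed to be expensive, which is why they
should bracket many writes rather than surround each one.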

If it turns out that prepareToWrite and cleanupOnSuccess get called for every
tuple, it would be very problematic on large datasets. But once per task (or
so) would be reasonable.

Pointers to the Pig code controlling the invocation of these functions would be
especially appreciated.

Cheers,
Nate Segerlind
