When a job fails to start for any reason, Condor puts the job
in the "Held" state and emits a ULOG_JOB_HELD event. Handle this
event by setting the instance state to failed in the database.

Signed-off-by: Chris Lalancette <[email protected]>
---
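
Note (not part of the commit): the dispatch this patch extends can be
sketched in isolation as below. This is only an illustration under
assumptions: InstanceStates and state_for_trigger are hypothetical names,
and the value of STATE_FAILED is invented for the example. Only the
trigger type strings and the STATE_FAILED constant name come from the
patch itself.

```ruby
# Hypothetical stand-in for the real Instance model. Only the constant
# name STATE_FAILED appears in the patch; its value here is an assumption.
module InstanceStates
  STATE_FAILED = "failed"
end

# Hypothetical helper: returns the new instance state for a trigger type,
# :ignore for events that are deliberately skipped, and nil for trigger
# types the handler does not expect.
def state_for_trigger(trigger_type)
  case trigger_type
  when "ULOG_SUBMIT"
    # Submission is neither an error nor a user-visible state.
    :ignore
  when "ULOG_JOB_HELD"
    # Condor held the job, so the instance failed to start.
    InstanceStates::STATE_FAILED
  else
    nil
  end
end
```

A caller would log and return early on nil, skip the update on :ignore,
and otherwise assign the returned state to the instance.
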
 src/dbomatic/dbomatic |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/src/dbomatic/dbomatic b/src/dbomatic/dbomatic
index f6647ba..92a27ef 100755
--- a/src/dbomatic/dbomatic
+++ b/src/dbomatic/dbomatic
@@ -143,6 +143,15 @@ class CondorEventLog < Nokogiri::XML::SAX::Document
       # ULOG_SUBMIT happens when the job is first submitted to condor.
       # However, it's not a state that we care to export to users, but it's
       # also not an error, so we just silently ignore it.
+    elsif @trigger_type == "ULOG_JOB_HELD"
+      # we failed to start the instance
+      # FIXME: if this happens, we probably want to add the HoldReason field
+      # to the database so we can display it to the user
+      # FIXME: we also may want to delete this job from condor, depending
+      # on the error.  For instance, if you are trying to start an instance
+      # with a mismatched image and hardwareprofile architecture, the only
+      # reasonable way out is to create a new instance.  Needs thought
+      inst.state = Instance::STATE_FAILED
     else
       @logger.info "Unexpected trigger type #{@trigger_type}, not updating instance state"
       return
-- 
1.7.2.3
