[jira] [Created] (GIRAPH-1132) Giraph jobs don't end if zookeeper dies before job starts

2017-03-01 Thread Sergey Edunov (JIRA)
Sergey Edunov created GIRAPH-1132:
-

 Summary: Giraph jobs don't end if zookeeper dies before job starts
 Key: GIRAPH-1132
 URL: https://issues.apache.org/jira/browse/GIRAPH-1132
 Project: Giraph
  Issue Type: Bug
Reporter: Sergey Edunov


There are multiple places in the Giraph code where we waitForever() on some 
event (e.g. for all workers to finish or for ZooKeeper to come up). This is bad 
in general, because a failure on the other side can go undetected and leave the 
job running forever. We need to add timeouts to these waits.
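
As a rough illustration of the idea (not the code from the actual patch), a 
wait with an upper bound could look like the sketch below. The class name 
BoundedWaitEvent and its methods are hypothetical; it assumes the waiter should 
fail loudly once the timeout expires instead of blocking indefinitely.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical event that can only be waited on for a bounded time. */
final class BoundedWaitEvent {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition signaled = lock.newCondition();
  private boolean eventOccurred = false;

  /** Mark the event as having occurred and wake up all waiters. */
  void signal() {
    lock.lock();
    try {
      eventOccurred = true;
      signaled.signalAll();
    } finally {
      lock.unlock();
    }
  }

  /**
   * Wait up to timeoutMsecs for the event.
   *
   * @return true if the event occurred, false if the timeout expired first
   */
  boolean waitMsecs(long timeoutMsecs) {
    long remainingNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMsecs);
    lock.lock();
    try {
      // Loop to guard against spurious wakeups; awaitNanos returns the
      // time still left, so the total wait never exceeds the timeout.
      while (!eventOccurred) {
        if (remainingNanos <= 0) {
          return false;
        }
        remainingNanos = signaled.awaitNanos(remainingNanos);
      }
      return true;
    } catch (InterruptedException e) {
      // Restore the interrupt flag and surface the problem to the caller.
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting", e);
    } finally {
      lock.unlock();
    }
  }

  /** Wait for the event, failing instead of hanging when the timeout expires. */
  void waitOrFail(long timeoutMsecs) {
    if (!waitMsecs(timeoutMsecs)) {
      throw new IllegalStateException(
          "Timed out after " + timeoutMsecs + " ms waiting for event");
    }
  }
}
```

With a wait like this, a worker waiting on a ZooKeeper that never comes up 
would eventually throw and let the job fail, rather than run forever.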



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (GIRAPH-1132) Giraph jobs don't end if zookeeper dies before job starts

2017-03-01 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/GIRAPH-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891109#comment-15891109 ]

ASF GitHub Bot commented on GIRAPH-1132:


GitHub user edunov opened a pull request:

https://github.com/apache/giraph/pull/21

GIRAPH-1132 Giraph jobs don't end if zookeeper dies before job starts

I'm not sure I set all the timeouts right, and there is no way to test all of 
them either. 
The idea is that we shouldn't have infinite wait loops anywhere, and that's 
exactly what this diff does.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/edunov/giraph timeout

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/giraph/pull/21.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21


commit cdbe7d4a46d80611fb5846eeeab37b94e66781a1
Author: Sergey Edunov 
Date:   2017-03-01T21:41:37Z

GIRAPH-1132 Giraph jobs don't end if zookeeper dies before job starts






[jira] [Commented] (GIRAPH-1132) Giraph jobs don't end if zookeeper dies before job starts

2017-03-01 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/GIRAPH-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891144#comment-15891144 ]

ASF GitHub Bot commented on GIRAPH-1132:


Github user majakabiljo commented on a diff in the pull request:

https://github.com/apache/giraph/pull/21#discussion_r103799656
  
--- Diff: giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java ---
@@ -1238,5 +1239,27 @@
   BooleanConfOption PREFER_IP_ADDRESSES =
       new BooleanConfOption("giraph.preferIP", false,
           "Prefer IP addresses instead of host names");
+
+  /**
+   * Timeout for "waitForever", when we need to wait for zookeeper.
+   * Since we should never really have to wait forever.
+   * We should only wait some reasonable but large amount of time.
+   */
+  LongConfOption WAIT_FOREVER_ZOOKEEPER_TIMEOUT_MSEC =
--- End diff --

Nit: since we are not waiting forever anymore, I'd drop the word "forever" 
everywhere ("forever" and "timeout" have opposite meanings :-))
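
To illustrate the naming point, here is a minimal sketch of a timeout option 
whose name says what it bounds and no longer mentions "forever". It uses plain 
java.util.Properties rather than Giraph's configuration classes; the class 
TimeoutOptions, the key string, and the 15-minute default are all hypothetical 
and are not taken from the patch.

```java
import java.util.Properties;
import java.util.concurrent.TimeUnit;

/** Hypothetical holder for a ZooKeeper wait timeout setting. */
final class TimeoutOptions {
  /** Hypothetical key; named after what is bounded, without "forever". */
  static final String WAIT_ZOOKEEPER_TIMEOUT_KEY =
      "example.waitZookeeperTimeoutMsec";

  /** Assumed default: large but finite (15 minutes, chosen arbitrarily). */
  static final long DEFAULT_WAIT_ZOOKEEPER_TIMEOUT_MSEC =
      TimeUnit.MINUTES.toMillis(15);

  private TimeoutOptions() { }

  /** Read the timeout in milliseconds, falling back to the default if unset. */
  static long waitZookeeperTimeoutMsec(Properties conf) {
    String raw = conf.getProperty(WAIT_ZOOKEEPER_TIMEOUT_KEY);
    if (raw == null) {
      return DEFAULT_WAIT_ZOOKEEPER_TIMEOUT_MSEC;
    }
    long value = Long.parseLong(raw.trim());
    // A non-positive timeout would reintroduce an unbounded or useless wait.
    if (value <= 0) {
      throw new IllegalArgumentException(
          WAIT_ZOOKEEPER_TIMEOUT_KEY + " must be positive, got " + value);
    }
    return value;
  }
}
```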




[jira] [Commented] (GIRAPH-1132) Giraph jobs don't end if zookeeper dies before job starts

2017-03-01 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/GIRAPH-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891157#comment-15891157 ]

ASF GitHub Bot commented on GIRAPH-1132:


Github user asfgit closed the pull request at:

https://github.com/apache/giraph/pull/21




[jira] [Commented] (GIRAPH-1132) Giraph jobs don't end if zookeeper dies before job starts

2017-03-01 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/GIRAPH-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891158#comment-15891158 ]

Hudson commented on GIRAPH-1132:


FAILURE: Integrated in Jenkins build Giraph-trunk-Commit #1686 (See [https://builds.apache.org/job/Giraph-trunk-Commit/1686/])
GIRAPH-1132 (edunov: [http://git-wip-us.apache.org/repos/asf?p=giraph.git&a=commit&h=18c67ca3c221cf2961d59ac19a245762f4dd8f7d])
* (edit) giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java
* (edit) giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java
* (edit) giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java
* (edit) giraph-core/src/main/java/org/apache/giraph/zk/BspEvent.java
* (edit) giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
* (edit) giraph-core/src/main/java/org/apache/giraph/zk/PredicateLock.java
* (edit) giraph-core/src/test/java/org/apache/giraph/zk/TestPredicateLock.java

