[kudu-CR] Kudu-2208 Avoid system call interruption in Subprocess

2018-01-12 Thread Jeffrey F. Lukman (Code Review)
Jeffrey F. Lukman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9011 )

Change subject: Kudu-2208 Avoid system call interruption in Subprocess
..


Patch Set 1:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc
File src/kudu/util/subprocess-test.cc:

http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@301
PS1, Line 301: thd
> nit: either abbreviate to just 't' or 'thr' would be more in line with what
Done


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@302
PS1, Line 302: 0.5
> nit: why not just '1'?  Using non-integer numbers for /bin/sleep is conside
Done


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@303
PS1, Line 303: subprocessThread
> nit: we use snake_case style in C++
Done


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@305
PS1, Line 305: ASSERT_OK(p.Start());
> I think adding a short sleep here (eg 50ms) is a good idea to ensure that t
Done


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@310
PS1, Line 310:   // create signal
> you aren't creating a signal here, but rather setting up a signal handler
Done


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@317
PS1, Line 317:   // send at most 10 kill signals to subprocess
> why at most 10? Why not loop until we see that the subprocess has exited? I
Done


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@319
PS1, Line 319:   int signalCounter = 0;
 :   bool hasError = false;
 :
> nit: snake_case
Done


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@322
PS1, Line 322: if (signalCounter > 10 || hasError) break;
> why not just put this into the while condition above?
Done


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@325
PS1, Line 325:   hasError = pthread_kill(thd, SIGUSR2) == 0;
> do we expect any errors from this? if not, could we just ASSERT this?
Done


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@330
PS1, Line 330: SleepFor(MonoDelta::FromMilliseconds(10));
> why sleep in between? if our goal is to provoke races we want to send as ma
I added sleep time here to avoid undefined signal 2 exception.


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess.cc
File src/kudu/util/subprocess.cc:

http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess.cc@67
PS1, Line 67: // Retry on EINTR for functions like read() that return -1 on 
error.
> can we move this function to a utility header since it's used in a few plac
Done


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess.cc@69
PS1, Line 69:  == true
> nit: can we drop this extra?
Done



--
To view, visit http://gerrit.cloudera.org:8080/9011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I66ab2ec391eced5ea1c37d4294d151fe03c7d586
Gerrit-Change-Number: 9011
Gerrit-PatchSet: 1
Gerrit-Owner: Jeffrey F. Lukman 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 12 Jan 2018 19:30:59 +
Gerrit-HasComments: Yes


[kudu-CR] Kudu-2208 Avoid system call interruption in Subprocess

2018-01-12 Thread Jeffrey F. Lukman (Code Review)
Jeffrey F. Lukman has abandoned this change. ( 
http://gerrit.cloudera.org:8080/9011 )

Change subject: Kudu-2208 Avoid system call interruption in Subprocess
..


Abandoned

replaced with a new patch.
--
To view, visit http://gerrit.cloudera.org:8080/9011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I66ab2ec391eced5ea1c37d4294d151fe03c7d586
Gerrit-Change-Number: 9011
Gerrit-PatchSet: 1
Gerrit-Owner: Jeffrey F. Lukman 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] Kudu-2208 Avoid system call interruption in Subprocess

2018-01-11 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9011 )

Change subject: Kudu-2208 Avoid system call interruption in Subprocess
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc
File src/kudu/util/subprocess-test.cc:

http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@26
PS1, Line 26: #include 
nit: move this header into the C-headers-group along with the  header.


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@302
PS1, Line 302: 0.5
nit: why not just '1'?  Using non-integer numbers for /bin/sleep is considered 
as a non-portable extension.


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@314
PS1, Line 314: sigaction
It would be nice to assert that sigaction() returns success.


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess.cc
File src/kudu/util/subprocess.cc:

http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess.cc@69
PS1, Line 69:  == true
nit: can we drop this extra?



--
To view, visit http://gerrit.cloudera.org:8080/9011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I66ab2ec391eced5ea1c37d4294d151fe03c7d586
Gerrit-Change-Number: 9011
Gerrit-PatchSet: 1
Gerrit-Owner: Jeffrey F. Lukman 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 12 Jan 2018 07:49:12 +
Gerrit-HasComments: Yes


[kudu-CR] Kudu-2208 Avoid system call interruption in Subprocess

2018-01-11 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9011 )

Change subject: Kudu-2208 Avoid system call interruption in Subprocess
..


Patch Set 1:

(10 comments)

I think it's also worth testing Subprocess::Call which seems to have some bugs 
as well

http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc
File src/kudu/util/subprocess-test.cc:

http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@301
PS1, Line 301: thd
nit: either abbreviate to just 't' or 'thr' would be more in line with what we 
do elsewhere.


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@303
PS1, Line 303: subprocessThread
nit: we use snake_case style in C++


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@305
PS1, Line 305: ASSERT_OK(p.Start());
I think adding a short sleep here (eg 50ms) is a good idea to ensure that the 
signals being sent from the other thread have a chance to race against Start() 
and not just Wait()


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@310
PS1, Line 310:   // create signal
you aren't creating a signal here, but rather setting up a signal handler


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@317
PS1, Line 317:   // send at most 10 kill signals to subprocess
why at most 10? Why not loop until we see that the subprocess has exited? I 
think there is a getter like 'joinable()' that you could use in a while loop. 
Sending only 10 leaves open a possibility that we dont' cover all the potential 
races


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@319
PS1, Line 319:   int signalCounter = 0;
 :   bool hasError = false;
 :
nit: snake_case


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@322
PS1, Line 322: if (signalCounter > 10 || hasError) break;
why not just put this into the while condition above?


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@325
PS1, Line 325:   hasError = pthread_kill(thd, SIGUSR2) == 0;
do we expect any errors from this? if not, could we just ASSERT this?


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@330
PS1, Line 330: SleepFor(MonoDelta::FromMilliseconds(10));
why sleep in between? if our goal is to provoke races we want to send as many 
signals as we can as fast as possible.


http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess.cc
File src/kudu/util/subprocess.cc:

http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess.cc@67
PS1, Line 67: // Retry on EINTR for functions like read() that return -1 on 
error.
can we move this function to a utility header since it's used in a few places 
now? maybe os-util.h



--
To view, visit http://gerrit.cloudera.org:8080/9011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I66ab2ec391eced5ea1c37d4294d151fe03c7d586
Gerrit-Change-Number: 9011
Gerrit-PatchSet: 1
Gerrit-Owner: Jeffrey F. Lukman 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 12 Jan 2018 00:03:45 +
Gerrit-HasComments: Yes


[kudu-CR] Kudu-2208 Avoid system call interruption in Subprocess

2018-01-11 Thread Jeffrey F. Lukman (Code Review)
Jeffrey F. Lukman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9011


Change subject: Kudu-2208 Avoid system call interruption in Subprocess
..

Kudu-2208 Avoid system call interruption in Subprocess

This patch includes additional unit test to detect this bug.
To fix the issue, this patch added RETRY_ON_EINTR in Subprocess::DoWait()
if the waitpid() return -1.

Change-Id: I66ab2ec391eced5ea1c37d4294d151fe03c7d586
---
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
2 files changed, 52 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/9011/1
--
To view, visit http://gerrit.cloudera.org:8080/9011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I66ab2ec391eced5ea1c37d4294d151fe03c7d586
Gerrit-Change-Number: 9011
Gerrit-PatchSet: 1
Gerrit-Owner: Jeffrey F. Lukman