nathant27 commented on issue #4726:
URL: https://github.com/apache/texera/issues/4726#issuecomment-4426251167

   Hello. I plan on fixing this bug by replacing the old try except block with 
two separate try except blocks, and returning True as long as the connection 
successfully establishes. This would change the behavior a bit
   ```# in _check_heartbeat(self) new proposed version
   try:
               temp_socket = socket.create_connection(
                   (self._parsed_server_host, self._parsed_server_port), 
timeout=1
               )
           except Exception as e:
               logger.warning(f"Server is down with exception: {e}")
               return False
   
           try:
               temp_socket.close()
           except Exception as e:
               logger.warning(f"Failed to close socket: {e}")
           
           return True
   ```
   In my proposal, I changed the behavior of the function so that on successful 
connection it returns True(even if close raises exception), and returns False 
if socket.create_connection(...) raises. If close raises, we still return True 
but log something different in the logger.
   
   I changed it to this behavior because as stated 
   "If close() raises after a successful connect, the method logs "Server is 
down" and returns False even though the JVM is reachable. The double-check in 
run() masks transient cases but the misclassification can still trigger 
unwarranted shutdowns.", 
   which makes it seem like we want to avoid unwarranted shutdowns when 
.close() raises an exception, and the easiest way I thought of would be 
changing that behavior so that .close() raised exceptions still return 
True(assuming connection was established), to keep the run loop going.
   
   Was wondering if this was an appropriate way to handle the bug, or if we 
still want to keep the current behavior where an exception on close still 
returns false, which is the current behavior in the pinned test in 
amber/src/test/python/core/runnables/test_heartbeat.py, line 82
   test_returns_false_when_socket_close_raises


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to