** Description changed:

  We're trying to use cinder-netapp, and the driver fails to start with the 
following error.
  https://pastebin.ubuntu.com/p/dwMdrVwdtf/
  
  After enabling debugging, I noticed that one node doesn't report its model.
  Checking on the NetApp side, the node is down due to a hardware issue, and
  another one is taking over for it.
  
  If I add `or ''` to get_cluster_nodes_info, it works fine.
  
  In this current setup, if any node goes down and the driver restarts,
  Cinder won't be able to recover.
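The failure mode can be sketched as follows. This is not the driver's code: `is_model_match`, `is_model_match_safe`, and the node dicts are hypothetical stand-ins, assuming only that the driver runs a membership test (`in`) against the reported model, which is what the TypeError suggests.

```python
# Hypothetical stand-ins for illustration; not the actual cinder code.

def is_model_match(node, needle):
    # Membership test against the reported model.
    # Raises TypeError when node["model"] is None.
    return needle in node["model"]


def is_model_match_safe(node, needle):
    # `or ''` turns a missing (None) model into an empty string,
    # so the membership test simply returns False.
    return needle in (node.get("model") or "")


# Made-up sample data: one healthy node, one node reporting no model.
healthy = {"name": "node-01", "model": "FAKE-MODEL-1"}
down = {"name": "node-02", "model": None}
```

With these definitions, `is_model_match(down, "FAKE")` raises a TypeError, while `is_model_match_safe(down, "FAKE")` returns False.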
- 
  
  ***************************
  
  [SRU]
  
  [Impact]
  The cinder-volume service fails to start when the cinder-netapp driver is
  used and some NetApp nodes are in maintenance state.
  
  During driver initialization, the service queries the NetApp server for
  information about all nodes. Typically the node name, model, and certain
  other attributes are returned. However, if a node is in maintenance mode,
  the model information is missing. The cinder-volume service does not
  handle a model value of None, so the service goes into a failed state.
  
  The fix checks whether the model value is None and assigns an empty string
  as the default. In addition, a warning message is logged for missing model
  values.
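A minimal sketch of the behaviour described above, under stated assumptions: `normalize_node_model` and the node dict layout are hypothetical and are not taken from the actual patch. It only illustrates "default to an empty string and log a warning".

```python
import logging

LOG = logging.getLogger(__name__)


def normalize_node_model(node):
    """Return a copy of `node` whose 'model' value is always a string.

    Hypothetical helper for illustration; the real fix lives in the
    cinder NetApp client and may be structured differently.
    """
    model = node.get("model")
    if model is None:
        # Warn so operators can see which node reported no model.
        LOG.warning("Node %s reported no model; defaulting to ''",
                    node.get("name"))
        model = ""
    # Leave the caller's dict untouched.
    return {**node, "model": model}
```

A node that does report a model passes through unchanged, which matches the note that the default logic is not altered in that case.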
  
  [Test Case]
- To test the fix, netapp storage nodes are required.
- However the effect of the fix can be easily tested using unit testing by 
simple changes.
  
- In the file tests/unit/volume/drivers/netapp/dataontap/fakes.py
- Change the model value in dict NO_MODEL_NODE to None instead of ''
- Run `tox -e py3 -- 
cinder.tests.unit.volume.drivers.netapp.dataontap.client.test_client_cmode_rest.NetAppRestCmodeClientTestCase`
+ Testing the bug requires NetApp storage nodes. Instead, we developed a
+ small Python script that responds to the couple of NetApp requests
+ required to reproduce the bug.
  
- All the test cases failed in driver initialization with following error:
- TypeError: argument of type 'NoneType' is not iterable
+ Here are the reproducer steps:
+ 
+ 1. Deploy regress-stack (https://github.com/canonical/regress-stack)
+ 2. Install the packages required to set up the cinder service
+    
+    sudo snap install astral-uv --classic
+    sudo apt-get update
+    uvx pre-commit install
+    sudo apt install dpkg-dev python3-dev python-apt-dev -y
+    uv sync
+    sudo apt install ceph mysql-server rabbitmq-server keystone cinder-api cinder-scheduler cinder-volume -y
+ 
+ 3. Run regress-stack setup step
+ 
+    uv run regress-stack setup
+ 
+ 4. Simulate netapp server
+ 
+    The simulation code responds to a couple of initial NetApp requests; as
+    you can see at L#33, one of the nodes has no model information.
+    Python code: https://pastebin.ubuntu.com/p/7pBSXzFSGY/
+    python <netapp.py>
+ 
+ 5. Update cinder.conf to add netapp configuration
+ 
+    https://pastebin.ubuntu.com/p/xK52rT5f7V/ (modified cinder configs)
+    systemctl restart cinder-volume.service
+ 
+ 6. Check the cinder-volume logs
+ 
+    Non-working case:
+    Should fail with the error `TypeError: argument of type 'NoneType' is not iterable`
+ 
+    Working case:
+    Should see a printout in the logs: `Reported ONTAPI Version: 1.261`
+    (At this point, code execution has passed the bug.)
+ 
+    Note: Even in the working case the service won't start; it fails with
+    the following error because the simulation does not support all the
+    NetApp API calls.
+ 
+    cinder.volume.drivers.netapp.dataontap.client.api.NaApiError: NetApp API failed. Reason - 400:BAD REQUEST
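The log check in step 6 can be scripted. The sketch below is an assumption-laden helper, not part of the reproducer: `classify_log` is a hypothetical name, and the two marker strings are simply the ones quoted in the steps above.

```python
# Marker strings quoted from the reproducer steps.
FAILURE_MARKER = "TypeError: argument of type 'NoneType' is not iterable"
SUCCESS_MARKER = "Reported ONTAPI Version"


def classify_log(text):
    """Classify cinder-volume log text for this reproducer.

    Hypothetical helper: returns "bug reproduced" if the TypeError is
    present, "fix working" if the ONTAPI version line is present,
    and "inconclusive" otherwise.
    """
    if FAILURE_MARKER in text:
        return "bug reproduced"
    if SUCCESS_MARKER in text:
        return "fix working"
    return "inconclusive"
```

To use it, read the cinder-volume log (its location depends on the deployment) and pass the contents to `classify_log`. Only log lines written after the service restart in step 5 are relevant.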
+ 
  
  [Regression Potential]
- In order to mitigate any regression potential, the fix has been tested with 
real hardware for jammy caracal
- Also the default logic is not changed when model is provided by netapp server.
+ To mitigate regression potential, the fix has been tested on real hardware
+ for jammy caracal. The unit test cases have also been updated to cover the
+ bug. The existing logic is unchanged when the NetApp server provides a
+ model.
  
  [Discussion]
  n/a

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2121812

Title:
  [SRU] cinder-netapp driver fails to start when a node is down

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/2121812/+subscriptions

