Alon Bar-Lev has posted comments on this change.

Change subject: fklsnr: Introduce standalone fence_kdump listener
......................................................................


Patch Set 7:

(15 comments)

OK, code is nice!

now, let's discuss improvements :)

1. move all configuration to config file, update vdc_options on startup.

this will allow starting service when database is down, and have single place 
for configuration.

2. periodic update sessions within database instead of when packet is accepted, 
this will reduce load and make service more predictable.

3. survive database down/up by reading all vdsm ids addresses and session 
status when database connection established. then when database connection is 
established we sync missing entries within map and in next periodic update (2) 
we update the database with these that new or expired.

http://gerrit.ovirt.org/#/c/27201/7/packaging/services/ovirt-fence-kdump-listener/db.py
File packaging/services/ovirt-fence-kdump-listener/db.py:

Line 222:                     address,
Line 223:                     status
Line 224:                 ) VALUES (
Line 225:                     %(vds_id)s,
Line 226:                     timezone('UTC', %(received)s),
again, need to understand... is there a strong reason not to use the current 
timestamp of the insert?

use database now() or similar instead of providing timestamp?
Line 227:                     %(address)s,
Line 228:                     %(status)s
Line 229:                 )
Line 230:             """,


Line 252:         result = self._db_mgr.execute(
Line 253:             statement="""
Line 254:                 SELECT
Line 255:                     fkm1.vds_id,
Line 256:                     timezone('UTC', fkm1.received) as received,
here also, you can assume it is now() and not fetch it... wait for next timeout 
to or next packet.

the timestamp is important to engine not to us.
Line 257:                     fkm1.address,
Line 258:                     fkm1.status
Line 259:                 FROM fence_kdump_messages fkm1
Line 260:                 JOIN (


Line 264:                     FROM fence_kdump_messages
Line 265:                     GROUP BY vds_id
Line 266:                 ) fkm2
Line 267:                 ON fkm1.vds_id = fkm2.vds_id
Line 268:                    AND fkm1.received = fkm2.latest
:)... some indentation.... to make it clearer...

 JOIN (
     SELECT
          x
          y
      FRO...
 ) ON
     fkm1.vds_id = fkm2.vds_id and
     fkm1.received = fkm2.latest
 WHERE
     fkm1.status <> 'finished' and
     fkm1.vds_id <> %(vds_id)s
Line 269:                 WHERE fkm1.status <> 'finished'
Line 270:                       AND fkm1.vds_id <> %(vds_id)s
Line 271:             """,
Line 272:             args={


Line 275:         )
Line 276: 
Line 277:         for record in result:
Line 278:             # received needs to be converted from aware to naive 
datetime
Line 279:             record['received'] = 
record['received'].replace(tzinfo=None)
what is None in this context? why not use explicit utc?
Line 280:             # address needs to be converted from json to tuple
Line 281:             record['address'] = tuple(json.loads(record['address']))
Line 282:         return result
Line 283: 


Line 278:             # received needs to be converted from aware to naive 
datetime
Line 279:             record['received'] = 
record['received'].replace(tzinfo=None)
Line 280:             # address needs to be converted from json to tuple
Line 281:             record['address'] = tuple(json.loads(record['address']))
Line 282:         return result
oh! you modify result returned by database layer directly to the calling logic? 
this is not wise... please have clear separation between the layer.
Line 283: 
Line 284:     def get_host_with_address(self, address):
Line 285:         result = self._db_mgr.execute(
Line 286:             statement="""


http://gerrit.ovirt.org/#/c/27201/7/packaging/services/ovirt-fence-kdump-listener/listener.py
File packaging/services/ovirt-fence-kdump-listener/listener.py:

Line 70:     def _total_seconds(self, dt1, dt2):
Line 71:         return round(
Line 72:             abs(
Line 73:                 time.mktime(dt1.timetuple()) - 
time.mktime(dt2.timetuple())
Line 74:             )
why abs? we do expect negative if d1 < d2
Line 75:         )
Line 76: 
Line 77:     def _interval_finished(self, interval, last):
Line 78:         return (


Line 114:             else:
Line 115:                 entry = self._sessions.get(address)
Line 116:                 if entry is None:
Line 117:                     entry = {
Line 118:                         'vds_id': None,
no need to put here vds_id, set it when session is at handle message, think of 
this part as reusable code for other implementations.
Line 119:                         'status': self.SESSION_STATE_INITIAL,
Line 120:                         'address': address,
Line 121:                         'received': None,
Line 122:                     }


Line 144:                 ).format(
Line 145:                     msg=binascii.hexlify(message),
Line 146:                     address=entry['address'][0],
Line 147:                 )
Line 148:             )
this is warning not error... I also do not think it is warning... as you be 
flooded with these... better to have this as debug.

also, please move the session to CLOSED state so house keeping will clear it 
out.
Line 149: 
Line 150:         if entry['vds_id'] is None:
Line 151:             entry['vds_id'] = self._dao.get_host_with_address(
Line 152:                 entry['address'][0]


Line 146:                     address=entry['address'][0],
Line 147:                 )
Line 148:             )
Line 149: 
Line 150:         if entry['vds_id'] is None:
can be written as:

 if entry.get('vds_id') is None:

this way even if it is missing you get None
Line 151:             entry['vds_id'] = self._dao.get_host_with_address(
Line 152:                 entry['address'][0]
Line 153:             )
Line 154: 


Line 149: 
Line 150:         if entry['vds_id'] is None:
Line 151:             entry['vds_id'] = self._dao.get_host_with_address(
Line 152:                 entry['address'][0]
Line 153:             )
oh... please raise exception, do not mix procedural and exceptional pattern, 
then you will also avoid return at middle of logic
Line 154: 
Line 155:         if not entry['vds_id']:
Line 156:             self.logger.error(
Line 157:                 _(


Line 213:                 interval=self._sessionExpirationTime,
Line 214:                 last=session['received']
Line 215:             ):
Line 216:                 new_state = self.SESSION_STATE_CLOSED
Line 217:                 try:
add:

 if session.get['vds_id'] is not None:
Line 218:                     self._dao.create_fence_kdump_message(
Line 219:                         vds_id=session['vds_id'],
Line 220:                         received=received,
Line 221:                         address=session['address'],


Line 249:                     self.logger.debug('Exception',  exc_info=True)
Line 250: 
Line 251:         self._clear_finished_sessions()
Line 252: 
Line 253:     def _clear_finished_sessions(self):
this is why I do not like one time code be split... it should be called only 
from the above function... I do not see any reason to split. not that important.
Line 254:         for address in [
Line 255:             entry['address']
Line 256:             for entry in self._sessions.values()
Line 257:             if entry['status'] == self.SESSION_STATE_CLOSED


Line 250: 
Line 251:         self._clear_finished_sessions()
Line 252: 
Line 253:     def _clear_finished_sessions(self):
Line 254:         for address in [
saggi suggested to use in (...) instead of [...], give this a try.
Line 255:             entry['address']
Line 256:             for entry in self._sessions.values()
Line 257:             if entry['status'] == self.SESSION_STATE_CLOSED
Line 258:         ]:


Line 267:                 new_heartbeat = datetime.datetime.utcnow()
Line 268: 
Line 269:                 self._dao.update_heartbeat()
Line 270: 
Line 271:                 self._lastHeartbeat = new_heartbeat
I suggest you update this to now() to avoid the time took database
Line 272:                 self.logger.debug('heartbeat')
Line 273:             except StandardError as e:
Line 274:                 self.logger.error(
Line 275:                     _('Error updating heartbeat: {error}.').format(


Line 282:         for record in self._dao.get_unfinished_flows():
Line 283:             # if record is saved to db, in memory cache status
Line 284:             # is always 'dumping'
Line 285:             record['status'] = self.SESSION_STATE_DUMPING
Line 286:             self._sessions[record['address']] = record
hmmm... better to have a create_session_entry function that both this and run 
will call, to create dictionary with basic data.

also I do not like that this entry format is exposed to dao... better return a 
tuple with sessions and vdsm id and add this to map here.
Line 287:         self.logger.debug("Populated sessions: '%s'", self._sessions)
Line 288: 
Line 289: 


-- 
To view, visit http://gerrit.ovirt.org/27201
To unsubscribe, visit http://gerrit.ovirt.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieec3bad47bbba860a52a9ff4e2eb7f61277f4e36
Gerrit-PatchSet: 7
Gerrit-Project: ovirt-engine
Gerrit-Branch: master
Gerrit-Owner: Martin Peřina <[email protected]>
Gerrit-Reviewer: Alon Bar-Lev <[email protected]>
Gerrit-Reviewer: Barak Azulay <[email protected]>
Gerrit-Reviewer: Eli Mesika <[email protected]>
Gerrit-Reviewer: Martin Peřina <[email protected]>
Gerrit-Reviewer: Oved Ourfali <[email protected]>
Gerrit-Reviewer: Saggi Mizrahi <[email protected]>
Gerrit-Reviewer: [email protected]
Gerrit-Reviewer: oVirt Jenkins CI Server
Gerrit-HasComments: Yes
_______________________________________________
Engine-patches mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/engine-patches

Reply via email to