Hello, VCL dev --
I've seen three instances of "relaim" errors on my VCL instance today --
machines that vcld couldn't seem to reach, although they appeared running
on the ESXi console.
It started when I began creating block allocations (which did not appear to
work). In both cases there seemed to be text in the log regarding
incorrect SQL syntax.
After the errors, the affected machines showed as "failed" in the VCL front
end, but eventually reloaded and appear to be working fine.
Regular, individual reservations appear to be working fine. I've attached
both snippets of log and would appreciate any feedback.
Many thanks,
Mike
------------------------------------------------------------------------
RECENT LOG ENTRIES FOR THIS PROCESS:
|28851|193:193|deleted| request state: deleted
|28851|193:193|deleted| request laststate: reserved
|28851|193:193|deleted| computer state: available
|28851|193:193|deleted| computer type: virtualmachine
2012-01-24 09:37:20|28851|193:193|deleted|reclaim.pm:process(124)|request
laststate is reserved, checking if computer table current image matches image
currently loaded on richmond41
|28851|193:193|deleted| ---- WARNING ----
|28851|193:193|deleted| 2012-01-24
11:38:35|28851|193:193|deleted|OS.pm:get_currentimage_txt_contents(161)|failed
to retrieve currentimage.txt from richmond41, exit status: 255, output:
|28851|193:193|deleted| ssh_exchange_identification: read: Connection reset by
peer
|28851|193:193|deleted| ( 0) OS.pm, get_currentimage_txt_contents (line: 161)
|28851|193:193|deleted| (-1) OS.pm, get_current_image_name (line: 197)
|28851|193:193|deleted| (-2) reclaim.pm, process (line: 133)
|28851|193:193|deleted| (-3) vcld, make_new_child (line: 568)
|28851|193:193|deleted| (-4) vcld, main (line: 346)
|28851|193:193|deleted| ---- WARNING ----
|28851|193:193|deleted| 2012-01-24
11:38:35|28851|193:193|deleted|OS.pm:get_current_image_name(201)|failed to
retrieve currentimage.txt contents from richmond41
|28851|193:193|deleted| ( 0) OS.pm, get_current_image_name (line: 201)
|28851|193:193|deleted| (-1) reclaim.pm, process (line: 133)
|28851|193:193|deleted| (-2) vcld, make_new_child (line: 568)
|28851|193:193|deleted| (-3) vcld, main (line: 346)
|28851|193:193|deleted| ---- WARNING ----
|28851|193:193|deleted| 2012-01-24
11:38:35|28851|193:193|deleted|reclaim.pm:process(135)|failed to retrieve name
of image currently loaded on richmond41, computer will be reloaded
|28851|193:193|deleted| ( 0) reclaim.pm, process (line: 135)
|28851|193:193|deleted| (-1) vcld, make_new_child (line: 568)
|28851|193:193|deleted| (-2) vcld, main (line: 346)
2012-01-24
11:38:35|28851|193:193|deleted|utils.pm:get_management_predictive_info(5849)|management
node info retrieved from database
2012-01-24
11:38:35|28851|193:193|deleted|DataStructure.pm:get_next_image_dataStructure(1187)|attempting
to load predictive loading module: VCL::Module::Predictive::Level_1
2012-01-24
11:38:35|28851|193:193|deleted|Module.pm:create_mn_os_object(335)|management
node OS object has already been created, address: 1ac5a880, returning 1
2012-01-24
11:38:35|28851|193:193|deleted|Module.pm:new(203)|VCL::Module::Predictive::Level_1
object created, address: 1b199aa0
2012-01-24
11:38:35|28851|193:193|deleted|DataStructure.pm:get_next_image_dataStructure(1197)|VCL::Module::Predictive::Level_1
predictive loading object successfully created
2012-01-24
11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(91)|predictive_reload_Level_1
: starting predictive_reload_level_1 for 45
2012-01-24 11:38:35|28851|193:193|deleted|utils.pm:is_inblockrequest(6163)|zero
rows were returned from database select
2012-01-24
11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(247)|predictive_reload_Level_1
: computer_short_name= richmond41 type= virtualmachine
2012-01-24
11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(248)|predictive_reload_Level_1
: avail= 25 notavail= 25 online= 50 timeframe= 1 DAY
2012-01-24
11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(357)|predictive_reload_Level_1
: 1 available images can go on richmond41
2012-01-24
11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(379)|predictive_reload_Level_1
: 1 of 1 available images loaded at least once
2012-01-24
11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(391)|predictive_reload_Level_1
: 0 of 1 total images available for selection
|28851|193:193|deleted| ---- WARNING ----
|28851|193:193|deleted| 2012-01-24
11:38:35|28851|193:193|deleted|utils.pm:database_select(4008)|could not execute
statement,
|28851|193:193|deleted| SELECT
|28851|193:193|deleted| COUNT(imageid) AS cnt,
|28851|193:193|deleted| imageid
|28851|193:193|deleted| FROM
|28851|193:193|deleted| log
|28851|193:193|deleted| WHERE
|28851|193:193|deleted| imageid IN ()
|28851|193:193|deleted| AND start > (NOW() - INTERVAL 1 DAY)
|28851|193:193|deleted| GROUP BY imageid
|28851|193:193|deleted| ORDER BY cnt DESC
|28851|193:193|deleted| LIMIT 1
|28851|193:193|deleted| , You have an error in your SQL syntax; check the
manual that corresponds to your MySQL server version for the right syntax to
use near ')
|28851|193:193|deleted| AND start > (NOW() - INTERVAL 1 DAY)
|28851|193:193|deleted| GROUP BY imageid
|28851|193:193|deleted| ' at line 7
|28851|193:193|deleted| ( 0) utils.pm, database_select (line: 4008)
|28851|193:193|deleted| (-1) Level_1.pm, get_next_image (line: 408)
|28851|193:193|deleted| (-2) DataStructure.pm, get_next_image_dataStructure
(line: 1198)
|28851|193:193|deleted| (-3) reclaim.pm, insert_reload_and_exit (line: 191)
|28851|193:193|deleted| (-4) reclaim.pm, process (line: 136)
|28851|193:193|deleted| (-5) vcld, make_new_child (line: 568)
|28851|193:193|deleted| ---- WARNING ----
|28851|193:193|deleted| 2012-01-24
11:38:35|28851|193:193|deleted|Level_1.pm:get_next_image(410)|predictive_reload_Level_1
: failed to fetch preferred image for computer_id 45
|28851|193:193|deleted| ( 0) Level_1.pm, get_next_image (line: 410)
|28851|193:193|deleted| (-1) DataStructure.pm, get_next_image_dataStructure
(line: 1198)
|28851|193:193|deleted| (-2) reclaim.pm, insert_reload_and_exit (line: 191)
|28851|193:193|deleted| (-3) reclaim.pm, process (line: 136)
|28851|193:193|deleted| (-4) vcld, make_new_child (line: 568)
|28851|193:193|deleted| (-5) vcld, main (line: 346)
|28851|193:193|deleted| ---- WARNING ----
|28851|193:193|deleted| 2012-01-24
11:38:35|28851|193:193|deleted|DataStructure.pm:get_next_image_dataStructure(1204)|predictive
loading module failed to retrieve image information, returning current
reservation image information
|28851|193:193|deleted| ( 0) DataStructure.pm, get_next_image_dataStructure
(line: 1204)
|28851|193:193|deleted| (-1) reclaim.pm, insert_reload_and_exit (line: 191)
|28851|193:193|deleted| (-2) reclaim.pm, process (line: 136)
|28851|193:193|deleted| (-3) vcld, make_new_child (line: 568)
|28851|193:193|deleted| (-4) vcld, main (line: 346)
2012-01-24
11:38:35|28851|193:193|deleted|DataStructure.pm:_automethod(834)|data structure
updated: $self->request_data->{reservation}{193}{imagerevision}{imagename}
|28851|193:193|deleted| image_name = vmwarewinxp-longsighttest037-v0
2012-01-24
11:38:35|28851|193:193|deleted|DataStructure.pm:_automethod(834)|data structure
updated: $self->request_data->{reservation}{193}{image}{id}
|28851|193:193|deleted| image_id = 7
2012-01-24
11:38:35|28851|193:193|deleted|DataStructure.pm:_automethod(834)|data structure
updated: $self->request_data->{reservation}{193}{imagerevision}{id}
|28851|193:193|deleted| imagerevision_id = 7
2012-01-24
11:38:35|28851|193:193|deleted|reclaim.pm:insert_reload_and_exit(203)|next
image: vmwarewinxp-longsighttest037-v0, image id=7, imagerevision id=7
2012-01-24
11:38:35|28851|193:193|deleted|utils.pm:insertloadlog(3848)|computerloadlog
entry not inserted, reservation has been deleted
|28851|193:193|deleted| ---- WARNING ----
|28851|193:193|deleted| 2012-01-24
11:38:35|28851|193:193|deleted|utils.pm:insert_reload_request(8450)|req=193:
current computer state is reloading, reload request will not be inserted
|28851|193:193|deleted| ( 0) utils.pm, insert_reload_request (line: 8450)
|28851|193:193|deleted| (-1) reclaim.pm, insert_reload_and_exit (line: 206)
|28851|193:193|deleted| (-2) reclaim.pm, process (line: 136)
|28851|193:193|deleted| (-3) vcld, make_new_child (line: 568)
|28851|193:193|deleted| (-4) vcld, main (line: 346)
2012-01-24
11:38:35|28851|193:193|deleted|DataStructure.pm:get_computer_private_ip_address(1581)|attempting
to retrieve private IP address for computer: richmond41
2012-01-24
11:38:35|28851|193:193|deleted|DataStructure.pm:get_computer_private_ip_address(1585)|retrieved
contents of /etc/hosts on this management node, contains 60 lines
2012-01-24
11:38:35|28851|193:193|deleted|DataStructure.pm:get_computer_private_ip_address(1645)|returning
IP address from /etc/hosts file: 10.100.0.67
2012-01-24 11:38:35|28851|193:193|deleted|utils.pm:is_inblockrequest(6163)|zero
rows were returned from database select
2012-01-24
11:38:35|28851|193:193|deleted|DataStructure.pm:get_image_affiliation_name(2035)|image
owner id: 1
2012-01-24 11:38:35|28851|193:193|deleted|utils.pm:getnewdbh(2709)|database
requested (information_schema) does not match handle stored in $ENV{dbh}
(vcl:127.0.0.1)
2012-01-24 11:38:35|28851|193:193|deleted|utils.pm:getnewdbh(2760)|database
handle stored in $ENV{dbh}
2012-01-24
11:38:36|28851|193:193|deleted|DataStructure.pm:retrieve_user_data(1352)|attempting
to retrieve and store data for user: user.id = '1'
2012-01-24 11:38:36|28851|193:193|deleted|utils.pm:getnewdbh(2709)|database
requested (vcl) does not match handle stored in $ENV{dbh}
(information_schema:127.0.0.1)
2012-01-24 11:38:36|28851|193:193|deleted|utils.pm:getnewdbh(2760)|database
handle stored in $ENV{dbh}
2012-01-24
11:38:36|28851|193:193|deleted|DataStructure.pm:retrieve_user_data(1415)|data
has been retrieved for user: admin (id: 1)
Use of uninitialized value in concatenation (.) or string at
/usr/local/vcl/bin/../lib/VCL/blockrequest.pm line 167 (#1)
(W uninitialized) An undefined value was used as if it were already
defined. It was interpreted as a "" or a 0, but maybe it was a mistake.
To suppress this warning assign a defined value to your variables.
To help you figure out what was undefined, perl tells you what operation
you used the undefined value in. Note, however, that perl optimizes your
program and the operation displayed in the warning may not necessarily
appear literally in your program. For example, "that $foo" is
usually optimized into "that " . $foo, and the warning will refer to
the concatenation (.) operator, even though there is no . in your
program.
|28413|blockrequest| ---- WARNING ----
|28413|blockrequest| 2012-01-24
12:01:04|28413|blockrequest|vcld:warning_handler(610)|Use of uninitialized
value in concatenation (.) or string at
/usr/local/vcl/bin/../lib/VCL/blockrequest.pm line 167.
|28413|blockrequest| ( 0) vcld, warning_handler (line: 610)
|28413|blockrequest| (-1) blockrequest.pm, process (line: 167)
|28413|blockrequest| (-2) vcld, make_new_child (line: 568)
|28413|blockrequest| (-3) vcld, main (line: 448)
2012-01-24 12:01:04|28413|blockrequest|blockrequest.pm:process(167)|help
address:
2012-01-24 12:01:04|28413|blockrequest|blockrequest.pm:process(312)|Removed
computers from blockComputers table for blocktime_id=5
2012-01-24 12:01:04|28413|blockrequest|blockrequest.pm:process(315)|Removed
blocktime_id=5 from blockTimes table
2012-01-24
12:01:04|28413|blockrequest|utils.pm:check_blockrequest_time(1033)|block
request end time has been reached (-1 minutes from now), returning 'end'
2012-01-24 12:01:04|4488|vcld:REAPER(718)|VCL process exited for reservation
<unknown>, PID: 28409, signal: CHLD
2012-01-24 12:01:04|4488|vcld:main(167)|lastcheckin time updated for management
node 1: 2012-01-24 12:01:04
2012-01-24 12:01:04|28413|blockrequest|blockrequest.pm:process(343)|Removed
processing flag on blockrequest_id 5
2012-01-24 12:01:04|28413|blockrequest|State.pm:DESTROY(829)|VCL::blockrequest
destructor called, address: 1aa2f010
2012-01-24 12:01:04|28413|blockrequest|State.pm:DESTROY(848)|number of database
handles state process created: 1
2012-01-24 12:01:04|4488|vcld:REAPER(718)|VCL process exited for reservation
<unknown>, PID: 28413, signal: CHLD