I've just tried this...

> * Make sure time on management node is correct.

Time is correct.

> * double check the blockRequest table for the given allocation, make
> sure processing is set to 0

It was set to 1, but setting it back to 0 triggered lots of traffic in
the log, see below.

> * make sure the related BlockTimes processing entry in the BlockTimes
> table is set to 0

That was 0.

On the way I uncovered a few more error conditions, involving Perl
libraries that I'd overlooked before, which are now adjusted to call our
purpose-build perl.  But I'm back to the error I was getting before:

 [...]
2012-10-04 
13:21:53|21007|blockrequest|utils.pm:xmlrpc_call(9105)|argument_string= 
XMLRPCprocessBlockTime 19 1
Can't locate object method "type" via package "RPC::XML::Client::send_request:
        HTTP server error: Can't connect to vlab.uchicago.edu:443 (certificate 
verify failed)" (perhaps you forgot to load "RPC::XML::Client::send_request: 
HTTP server error: Can't connect to vlab.uchicago.edu:443 (certificate verify 
failed)"?) at /usr/local/vcl-2.2.1/bin/../lib/VCL/utils.pm line 9121 (#1)
Uncaught exception from user code:
        Can't locate object method "type" via package 
"RPC::XML::Client::send_request: HTTP server error: Can't connect to 
vlab.uchicago.edu:443 (certificate verify failed)" (perhaps you forgot to load 
"RPC::XML::Client::send_request: HTTP server error: Can't connect to 
vlab.uchicago.edu:443 (certificate verify failed)"?) at 
/usr/local/vcl-2.2.1/bin/../lib/VCL/utils.pm line 9121.
 at /usr/local/vcl-2.2.1/bin/../lib/VCL/utils.pm line 9121
        VCL::utils::xmlrpc_call('XMLRPCprocessBlockTime', 19, 1) called at 
/usr/local/vcl-2.2.1/bin/../lib/VCL/blockrequest.pm line 373
        VCL::blockrequest::process_block_time(19) called at 
/usr/local/vcl-2.2.1/bin/../lib/VCL/blockrequest.pm line 193
        VCL::blockrequest::process('VCL::blockrequest=HASH(0x102e528)') called 
at /usr/local/vcl/bin/vcld line 568
        VCL::vcld::make_new_child('HASH(0x208fe70)') called at 
/usr/local/vcl/bin/vcld line 448
        VCL::vcld::main() called at /usr/local/vcl/bin/vcld line 98
2012-10-04 13:21:53|21007|blockrequest|State.pm:DESTROY(829)|VCL::blockrequest 
destructor called, address: 102e528
2012-10-04 13:21:53|21007|blockrequest|State.pm:DESTROY(848)|number of database 
handles state process created: 1
2012-10-04 13:21:53|14005|vcld:REAPER(718)|VCL process exited for reservation 
<unknown>, PID: 21007, signal: CHLD


So maybe we do have a cert issue.  Could it be that the CA bundle that
VCL uses (wherever that is) doesn't like Comodo/AddTrust?




On Thu, Oct 04, 2012 at 11:21:57AM -0400, Aaron Peeler wrote:
> You shouldn't need to wait 45 minutes, the start time only needs to be
> between 6hrs and 30 minutes from now. So as long as the start time is
> between this window, you should see something immediately.
> 
> Some things to check:
> * Make sure time on management node is correct.
> * double check the blockRequest table for the given allocation, make
> sure processing is set to 0
> * make sure the related BlockTimes processing entry in the BlockTimes
> table is set to 0
> 
> Note, resetting these to 0 will re-processing the blockallocation to
> help debugging a particular block.
> 
> -A
> 
> On Thu, Oct 4, 2012 at 11:04 AM, Michael Jinks <[email protected]> wrote:
> > Hrm.  Well, I made my block reservation, waited my 45 minutes, and still
> > no mention of it in the log.  I'll try again with a longer lead time.
> >
> >
> > On Thu, Oct 04, 2012 at 09:01:34AM -0500, Michael Jinks wrote:
> >> Thanks Aaron.
> >>
> >> It looks like we're in good shape as far as this stuff is concerned...
> >>
> >>  $ /usr/local/vcl/bin/perl -MRPC::XML -e 'print "success\n";'
> >>  success
> >>
> >>  $ /usr/local/vcl/bin/perl -MLWP -e 'print LWP->VERSION ."\n";'
> >>  6.04
> >>
> >> ...but I wonder if it's possible that under some circumstances vcld is
> >> calling a perl other than the one we purpose-installed for it.  vcld
> >> itself was adjusted to call that instance but a couple of other perl
> >> scripts in vcl/bin weren't, so I've fixed that and scheduled another
> >> reservation 45 minutes from now, will see if that makes any difference.
> >>
> >>
> >>
> >> On Thu, Oct 04, 2012 at 01:42:47PM +0000, Aaron Coburn wrote:
> >> >    I have seen something like this before.
> >> >
> >> >    I would recommend ensuring that the RPC::XML perl module is properly
> >> >    installed on the management node. Try this command:
> >> >
> >> >    perl -MRPC::XML -e 'print "success\n";'
> >> >
> >> >    If you get errors from this, try manually installing the RPC::XML
> >> >    module from here:
> >> >
> >> >    [1]http://search.cpan.org/~rjray/RPC-XML-0.77/
> >> >
> >> >    The only thing to note is that the current version of RPC::XML 
> >> > requires
> >> >    at least version 5.834 of LWP. To check which version of LWP you
> >> >    currently have installed, run this command:
> >> >
> >> >    perl -MLWP -e 'print LWP->VERSION ."\n";'
> >> >
> >> >    If this reports any version before 5.834 (as my system does), then you
> >> >    may need to use version 0.72 of the RPC::XML module, which requires
> >> >    only version 5.801 of LWP.
> >> >
> >> >    Also, when I say 'manually install', I do not mean using the cpan
> >> >    interactive interface; rather, download the package with wget, and 
> >> > then
> >> >    use the standard perl installation sequence:
> >> >
> >> >    perl Makefile.PL
> >> >
> >> >    make
> >> >
> >> >    make test
> >> >
> >> >    make install
> >> >
> >> >    Aaron
> >> >
> >> >    --
> >> >    Aaron Coburn
> >> >    Systems Administrator and Programmer
> >> >    Academic Technology Services, Amherst College
> >> >    [2][email protected]
> >> >    On Oct 4, 2012, at 9:05 AM, Michael Jinks <[3][email protected]>
> >> >    wrote:
> >> >
> >> >      Okay, this makes sense.  Thanks for clearing that up; I'd wondered
> >> >      if
> >> >      vcld had logic for ramping up to a reservation.
> >> >      I made a reservation last night, scheduled for 08:00 this morning.
> >> >      At
> >> >      02:00, I have errors in my log, which I'll paste below.  I haven't
> >> >      started trying to investigate what's wrong; the error suggests a bad
> >> >      cert.  Our cert is signed by Comodo, maybe that's the problem?  I
> >> >      don't
> >> >      know what vcld uses as a trust chain.
> >> >      There could be several other things going on too.  The reference to
> >> >      'Can't locate object method "type"' looks ugly, but I wonder if it's
> >> >      just a result of the SSL error further up?
> >> >      Log excerpt follows.
> >> >      [...]
> >> >      2012-10-04
> >> >      02:00:02|3322|blockrequest|blockrequest.pm:process(192)|processing
> >> >      blocktime_id= 15 pass 1
> >> >      2012-10-04
> >> >      02:00:02|3322|blockrequest|utils.pm:xmlrpc_call(9105)|argument_strin
> >> >      g= XMLRPCprocessBlockTime 15 1
> >> >      Can't locate object method "type" via package
> >> >      "RPC::XML::Client::send_request:
> >> >             HTTP server error: Can't connect to [4]vlab.uchicago.edu:443
> >> >      (certificate verify failed)" (perhaps you forgot to load
> >> >      "RPC::XML::Client::send_request: HTTP server error: Can't connect to
> >> >      [5]vlab.uchicago.edu:443 (certificate verify failed)"?) at
> >> >      /usr/local/vcl-2.2.1/bin/../lib/VCL/utils.pm line 9121 (#1)
> >> >      Uncaught exception from user code:
> >> >             Can't locate object method "type" via package
> >> >      "RPC::XML::Client::send_request: HTTP server error: Can't connect to
> >> >      [6]vlab.uchicago.edu:443 (certificate verify failed)" (perhaps you
> >> >      forgot to load "RPC::XML::Client::send_request: HTTP server error:
> >> >      Can't connect to [7]vlab.uchicago.edu:443 (certificate verify
> >> >      failed)"?) at /usr/local/vcl-2.2.1/bin/../lib/VCL/utils.pm line
> >> >      9121.
> >> >      at /usr/local/vcl-2.2.1/bin/../lib/VCL/utils.pm line 9121
> >> >             VCL::utils::xmlrpc_call('XMLRPCprocessBlockTime', 15, 1)
> >> >      called at /usr/local/vcl-2.2.1/bin/../lib/VCL/blockrequest.pm line
> >> >      373
> >> >             VCL::blockrequest::process_block_time(15) called at
> >> >      /usr/local/vcl-2.2.1/bin/../lib/VCL/blockrequest.pm line 193
> >> >             VCL::blockrequest::process('VCL::blockrequest=HASH(0x358bb68)
> >> >      ') called at /usr/local/vcl/bin/vcld line 568
> >> >             VCL::vcld::make_new_child('HASH(0x34740f0)') called at
> >> >      /usr/local/vcl/bin/vcld line 448
> >> >             VCL::vcld::main() called at /usr/local/vcl/bin/vcld line 98
> >> >      2012-10-04
> >> >      02:00:02|3322|blockrequest|State.pm:DESTROY(829)|VCL::blockrequest
> >> >      destructor called, address: 358bb68
> >> >      2012-10-04 02:00:02|3322|blockrequest|State.pm:DESTROY(848)|number
> >> >      of database handles state process created: 1
> >> >      2012-10-04 02:00:02|20597|vcld:REAPER(718)|VCL process exited for
> >> >      reservation <unknown>, PID: 3322, signal: CHLD
> >> >      On Thu, Oct 04, 2012 at 08:48:34AM -0400, Aaron Peeler wrote:
> >> >
> >> >      Hello Michael,
> >> >      The start time for block allocations need to be requested > 30
> >> >      minutes
> >> >      in advance.
> >> >      If folks think this should be adjusted let us know. The original
> >> >      thinking was to allow the enough time for the machines to get
> >> >      prepared
> >> >      for 30+ machines, which is what we've experienced for the average
> >> >      number of machines for Blocks allocations.
> >> >      Aaron
> >> >      On Wed, Oct 3, 2012 at 6:03 PM, Michael Jinks
> >> >      <[8][email protected]> wrote:
> >> >
> >> >      Thanks, Aaron and Josh.
> >> >      I did have some problems in my vcld.conf; the vclsystem account
> >> >      hadn't
> >> >      been properly configured.  So that was clearly part of my trouble.
> >> >      Having fixed that, though, I'm still not getting a successful block
> >> >      reservation.  Based on Josh's remarks about vcld reserving the slots
> >> >      "several hours" in advance, I wonder if that's an issue.  In order
> >> >      to
> >> >      test this in a reasonable amount of time, I've been creating a time
> >> >      slot
> >> >      a few minutes in the future, which then comes and goes with no
> >> >      activity
> >> >      in the vcld log except 'lastcheckin' lines.  I also find, in the
> >> >      database:
> >> >      mysql> select processed from blockTimes;
> >> >      +-----------+
> >> >      | processed |
> >> >      +-----------+
> >> >      |         0 |
> >> >      |         0 |
> >> >      +-----------+
> >> >      2 rows in set (0.00 sec)
> >> >      How far advance do I need to do a block allocation in order for it
> >> >      to
> >> >      take effect?
> >> >      On Wed, Oct 03, 2012 at 03:03:46PM -0400, Josh Thompson wrote:
> >> >
> >> >      -----BEGIN PGP SIGNED MESSAGE-----
> >> >      Hash: SHA1
> >> >      Michael,
> >> >      Aaron gave some good info.
> >> >      One thing I'll point out is that when you view the status of a block
> >> >      allocation through the web interface as you've described, the number
> >> >      listed for "Failed" comes from
> >> >      (number of computers for block allocation) - (number of computers in
> >> >      blockComputers table)
> >> >      It doesn't necessarily reflect that VCL attempted to load the
> >> >      machines
> >> >      and that those loads failed.  If vcld never processes the block
> >> >      allocation, it will always show the full count in the Failed row.
> >> >      When the block allocation is created, an entry is created in the
> >> >      blockTimes table for each time slot in which the block allocation
> >> >      will
> >> >      occur.  However, no computers are actually allocated for each time
> >> >      slot until several hours before a time slot starts.  Several hours
> >> >      before any block time, vcld processes that block time to allocate
> >> >      the
> >> >      computers and populate the blockComputers table for that blockTimes
> >> >      entry.  It does this by making an XMLRPC call to the web code.  You
> >> >      can check that vcld has actually processed a block time entry by
> >> >      looking at the blockTimes.processed field in the database for a
> >> >      given
> >> >      time slot.
> >> >      Josh
> >> >      On 10/03/12 13:59, Aaron Coburn wrote:
> >> >
> >> >      Michael,
> >> >      You should make sure that /etc/vcl/vcld.conf has the proper values
> >> >      set for:
> >> >      xmlrpc_username xmlrpc_pass xmlrpc_url
> >> >      xmlrpc_username should be set as vclsystem@Local xmlrpc_url should
> >> >      be something like:
> >> >      [9]https://vcl.myinstitution.edu/index.php?mode=xmlrpccall
> >> >      This is all described in the vcld.conf file as well as on this
> >> >      page:
> >> >      https://cwiki.apache.org/confluence/display/VCL/VCL+2.3+Management+N
> >> >      ode+Installation#VCL2.3ManagementNodeInstallation-Configurevcld.conf
> >> >      Then be sure to set the password for the user identified above
> >> >      (e.g. vclsystem@Local). Instructions are here:
> >> >      https://cwiki.apache.org/confluence/display/VCL/VCL+2.3+Management+N
> >> >      ode+Installation#VCL2.3ManagementNodeInstallation-Setthevclsystemacc
> >> >      ountpasswordforxmlrpcapi
> >> >      (The instructions are for version 2.3, but they will work for
> >> >      2.2.1 as well)
> >> >      Finally, restart vcld so that it can pick up the new information.
> >> >      If that doesn't help, then try inspecting the vcld logfile -- it
> >> >      will tell you more about why the reservations fail.
> >> >      Aaron
> >> >      -- Aaron Coburn Systems Administrator and Programmer Academic
> >> >      Technology Services, Amherst College [email protected]
> >> >      On Oct 3, 2012, at 1:30 PM, Michael Jinks <[email protected]>
> >> >      wrote:
> >> >
> >> >      Hi List.
> >> >      We have a VCL 2.2 instance running, most things work, but one
> >> >      glaring problem is still thwarting me.  When I try to create a
> >> >      block allocation, the computers always come up "Failed"; this in
> >> >      spite of the fact that I can reserve the same image individually
> >> >      with no issues.
> >> >      I've tried this so far with the admin@Local account, as well as
> >> >      my user account which has full admin privileges (or at least as
> >> >      close as we can get by assigning rights in the Privileges page).
> >> >      I go to the "Block Allocations" page, click "Create New Block
> >> >      Allocation", fill in the form, choosing a known-working image in
> >> >      the "Environment" dropdown, and a "List of Dates/Times" with one
> >> >      time span in the near future.  "Submit New Block Allocation" pops
> >> >      up a confirmation window, that goes fine, and I have an entry
> >> >      "Block Allocations" page under "Manage Block Allocations".  Looks
> >> >      fine.
> >> >      But, if I click an entry under "Your Active Block Allocations",
> >> >      under "Current status of computers," the number by "Failed" is
> >> >      equal to the number of seats I reserved, and as far as I can tell
> >> >      the system never tries to deploy a VM.
> >> >      I'm guessing that there is some permission setting or some other
> >> >      item within the management node that I've overlooked.  Any hints
> >> >      on where to look?
> >> >      Thanks, -j
> >> >      -- Michael Jinks :: [email protected] University of Chicago IT
> >> >      Services
> >> >
> >> >      - --
> >> >      - -------------------------------
> >> >      Josh Thompson
> >> >      VCL Developer
> >> >      North Carolina State University
> >> >      my GPG/PGP key can be found at [10]pgp.mit.edu
> >> >      All electronic mail messages in connection with State business which
> >> >      are sent to or received by this account are subject to the NC Public
> >> >      Records Law and may be disclosed to third parties.
> >> >      -----BEGIN PGP SIGNATURE-----
> >> >      Version: GnuPG v2.0.17 (GNU/Linux)
> >> >      iEYEARECAAYFAlBsjBIACgkQV/LQcNdtPQOcyQCfTQE1TzGXVFsIqitpPFrzJrM/
> >> >      SoIAn3p/PMlH+mjI0jWpHdwOC12/hwyu
> >> >      =gnM2
> >> >      -----END PGP SIGNATURE-----
> >> >
> >> >      --
> >> >      Michael Jinks :: [11][email protected] :: 773-469-9688
> >> >      University of Chicago IT Services
> >> >
> >> >      --
> >> >      Aaron Peeler
> >> >      Program Manager
> >> >      Virtual Computing Lab
> >> >      NC State University
> >> >      All electronic mail messages in connection with State business which
> >> >      are sent to or received by this account are subject to the NC Public
> >> >      Records Law and may be disclosed to third parties.
> >> >
> >> >      --
> >> >      Michael Jinks :: [12][email protected] :: 773-469-9688
> >> >      University of Chicago IT Services
> >> >
> >> > References
> >> >
> >> >    1. http://search.cpan.org/~rjray/RPC-XML-0.77/
> >> >    2. mailto:[email protected]
> >> >    3. mailto:[email protected]
> >> >    4. http://vlab.uchicago.edu/
> >> >    5. http://vlab.uchicago.edu/
> >> >    6. http://vlab.uchicago.edu/
> >> >    7. http://vlab.uchicago.edu/
> >> >    8. mailto:[email protected]
> >> >    9. https://vcl.myinstitution.edu/index.php?mode=xmlrpccall
> >> >   10. http://pgp.mit.edu/
> >> >   11. mailto:[email protected]
> >> >   12. mailto:[email protected]
> >>
> >> --
> >> Michael Jinks :: [email protected] :: 773-469-9688
> >> University of Chicago IT Services
> >
> > --
> > Michael Jinks :: [email protected] :: 773-469-9688
> > University of Chicago IT Services
> 
> 
> 
> -- 
> Aaron Peeler
> Program Manager
> Virtual Computing Lab
> NC State University
> 
> All electronic mail messages in connection with State business which
> are sent to or received by this account are subject to the NC Public
> Records Law and may be disclosed to third parties.

-- 
Michael Jinks :: [email protected] :: 773-469-9688
University of Chicago IT Services

Reply via email to