On Thu, Apr 15, 2010 at 09:37:45PM +0200, Michael Hanselmann wrote:
> Hi Iustin
> 
> Unless noted otherwise, comments have been addressed.
> 
> 2010/4/15 Iustin Pop <[email protected]>:
> > On Wed, Apr 07, 2010 at 06:44:29PM +0200, Michael Hanselmann wrote:
> >> --- a/Makefile.am
> >> +++ b/Makefile.am
> >> +pkglib_python_scripts = \
> >> +     daemons/import-export
> >
> > Is this a daemon that will run all the time? Then it needs to be called
> > "ganeti-*".
> 
> No, it's run once for each of the reading and writing sides per disk.
> For an instance with 3 disks, it's run 6 times. That's also why it's
> not installed in PATH.
> 
> The daemon starts, initializes the transport (dd, gzip, socat, etc.),
> watches their output and progress, and upon the child's termination,
> exits itself.
> 
> >> --- /dev/null
> >> +++ b/daemons/import-export
> >> +def SetupLogging():
> >> +  formatter = logging.Formatter("%(asctime)s: %(message)s")
> >> +
> >> +  stderr_handler = logging.StreamHandler()
> >> +  stderr_handler.setFormatter(formatter)
> >> +  stderr_handler.setLevel(logging.NOTSET)
> >> +
> >> +  root_logger = logging.getLogger("")
> >> +  root_logger.addHandler(stderr_handler)
> >> +
> >> +  if options.debug:
> >> +    root_logger.setLevel(logging.NOTSET)
> >> +  elif options.verbose:
> >> +    root_logger.setLevel(logging.INFO)
> >> +  else:
> >> +    root_logger.setLevel(logging.CRITICAL)
> >> +
> >> +  # Create special logger for child process output
> >> +  child_logger = logging.Logger("child output")
> >> +  child_logger.addHandler(stderr_handler)
> >> +  child_logger.setLevel(logging.NOTSET)
> >> +
> >> +  return child_logger
> >
> > Why is the standard ganeti setuplogging for daemons not enough?
> 
> The import-export daemon can be considered a standalone program,
> similar to cfgupgrade, when it comes to logging. Its stdout/stderr is
> redirected to a per-transfer logfile (via utils.StartDaemon).
> 
> Independent of that, the logging setup here is a bit special because
> it's used to “forward” messages from socat and the other commands run
> by the daemon via “child_logger”.
> 
> I don't think it makes sense to clutter utils.SetupLogging with
> another parameter just for this special case.

Indeed, the only reason I asked was that I believed it to be a
long-running daemon.
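
For reference, the dual-logger setup described above boils down to
something like the following sketch (not the actual Ganeti code; names
are illustrative):

```python
import logging
import sys

def setup_child_logger():
    # The root logger filters by verbosity, but the child logger is
    # created directly (bypassing the logger hierarchy) with level
    # NOTSET, so output from child processes is never suppressed.
    formatter = logging.Formatter("%(asctime)s: %(message)s")

    stderr_handler = logging.StreamHandler(sys.stderr)
    stderr_handler.setFormatter(formatter)

    child_logger = logging.Logger("child output")
    child_logger.addHandler(stderr_handler)
    child_logger.setLevel(logging.NOTSET)
    return child_logger

def forward_child_output(child_logger, line):
    # Each stderr line read from socat/dd/gzip is re-logged verbatim
    child_logger.info("%s", line.rstrip("\n"))
```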

> >> +def _VerifyListening(family, address, port):
> >> +  """Verify address given as listening address by socat.
> >> +
> >
> > What does this actually verify? That someone listens on a port (it
> > doesn't!), or that the address is valid? (In which case VerifyAddress
> > would be a better name.)
> 
> It verifies the values extracted using LISTENING_RE: address family
> (think future support for IPv6) and the IP address. The port is not
> checked.

(I still think the name is wrong, but anyway)
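
To illustrate what is being checked, a rough sketch (the real
LISTENING_RE and checks live in daemons/import-export; the socat
message format and the names below are assumptions):

```python
import re
import socket

# Hypothetical reconstruction of the pattern that extracts the address
# family and IP address from socat's "listening" notice on stderr.
LISTENING_RE = re.compile(r"^listening on\s+"
                          r"AF=(?P<family>\d+)\s+"
                          r"(?P<address>[^:]+):(?P<port>\d+)$")

def verify_listening(family, address, port):
    # Only the address family and IP address are verified;
    # the port is deliberately not checked.
    if family != socket.AF_INET:
        raise ValueError("Unsupported address family %s" % family)
    try:
        socket.inet_aton(address)
    except socket.error:
        raise ValueError("Invalid IP address %r" % address)
    return address
```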

> >> +class StatusFile:
> >> +  def AddRecentOutput(self, line):
> >> +    self._data.recent_output.append(line)
> >> +
> >> +    # Remove old lines
> >> +    del self._data.recent_output[:-MAX_RECENT_OUTPUT_LINES]
> >
> > is this a plain list? if so, do we care about speed of deletion from
> > head?
> 
> Yes, it is a plain list. After socat is set up (which generates a few
> lines), not many are added in normal cases. Errors can generate a few
> more. I don't expect performance to be an issue here. Since this list
> is directly serialized to JSON for the status file read by
> ganeti-noded, I wanted to use a simple structure.

Sure, if it's small, no issue.
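
For reference, the truncation idiom keeps only the newest lines of the
plain list; a minimal stand-alone illustration (the constant's value
here is made up):

```python
MAX_RECENT_OUTPUT_LINES = 20  # illustrative; the real constant lives in the daemon

recent_output = []

def add_recent_output(line):
    recent_output.append(line)
    # Delete everything except the last MAX_RECENT_OUTPUT_LINES
    # entries; the negative slice selects the head of the list.
    del recent_output[:-MAX_RECENT_OUTPUT_LINES]
```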

> >> +def GetSocatCommand(mode):
> > […]
> >> +  return [
> >> +    constants.SOCAT_PATH,
> >> +
> >> +    # Log to stderr
> >> +    "-ls",
> >> +
> >> +    # Log level
> >> +    "-d", "-d",
> >
> > how much noise will socat display in this mode? (just curious)
> 
> This is both the minimum level required to get the information we need
> and the maximum that makes sense. Going up one level (by adding
> another "-d") generates at least one line per read()/write() call. I
> tried to use that to measure the amount of transferred data, but the
> CPU usage it caused was simply too high. That's why I added a comment
> suggesting that such measurement be implemented using “dd” instead.
> 
> The output of a very short connection with this log level looks like
> this (plus timestamps):
> 
> socat: N using stdin for reading
> socat: N accepting connection from AF=2 127.0.0.1:52442 on AF=2 127.0.0.1:37190
> socat: N opening connection to AF=2 127.0.0.1:37190
> socat: N successfully connected from local address AF=2 127.0.0.1:52442
> socat: N SSL connection using AES256-SHA
> socat: N using stdout for writing
> socat: N starting data transfer loop with FDs [6,6] and [1,1]
> socat: N SSL connection using AES256-SHA
> socat: N starting data transfer loop with FDs [0,0] and [5,5]
> socat: N socket 1 (fd 0) is at EOF
> socat: N socket 1 (fd 6) is at EOF
> socat: N exiting with status 0
> socat: N exiting with status 0

Ack, thanks.

> >> +def GetTransportCommand(mode, socat_stderr_fd):
> >> +  socat_cmd = ("%s 2>&%d" %
> >> +               (utils.ShellQuoteArgs(GetSocatCommand(mode)),
> >> +                socat_stderr_fd))
> >> +
> >> +  if mode == constants.IEM_IMPORT:
> >> +    transport_cmd = "%s | gunzip -c" % socat_cmd
> >> +  elif mode == constants.IEM_EXPORT:
> >> +    transport_cmd = "gzip -c | %s" % socat_cmd
> >
> > Hmm, yeah, we use gzip today, but it does eat lots of CPU. I wonder if
> > that could be an issue for intra-cluster export.
> 
> It'll be a trade-off between traffic and CPU usage. Making this
> customizable will be easy, though I'd prefer to do that in a separate
> step. Similarly, it doesn't make sense to use gzip when exporting and
> importing on the same machine (which uses SSH even before this
> series).

OK, please add a todo, this definitely needs revisiting.
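
Making the compressor optional could be as simple as the following
sketch (the "compress" parameter and the plain mode strings are
hypothetical; the real code uses constants.IEM_IMPORT/IEM_EXPORT):

```python
def get_transport_command(mode, socat_cmd, socat_stderr_fd, compress=True):
    # Redirect socat's stderr to a dedicated fd so the daemon can
    # watch its log output separately from the data stream.
    cmd = "%s 2>&%d" % (socat_cmd, socat_stderr_fd)
    if not compress:
        return cmd
    if mode == "import":
        return "%s | gunzip -c" % cmd
    elif mode == "export":
        return "gzip -c | %s" % cmd
    raise ValueError("Unknown mode %r" % mode)
```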

> > […]
> > Argh, huge function.
> > […]
> > this function again fails the size checks. Please split it.
> 
> As discussed offline: A lot of state is needed in these functions. I
> moved some parts around, but they're still very large. Unfortunately I
> don't see ways to optimize it at this time without making the code
> significantly more complex (e.g. passing around many parameters).
> 
> >> --- a/lib/objects.py
> >> +++ b/lib/objects.py
> >> +class ImportExportStatus(ConfigObject):
> >> +  """Config object representing the status of an import or export."""
> >
> > config object? /me confused.
> 
> The status file is an instance of this config object serialized to
> JSON. The config objects provide us with easy
> serialization/deserialization. Compare with objects.BlockDevStatus,
> which is only used for RPC calls, but for the same purpose.
> 
> I'll send the updated patch as a reply. It'll include a script testing
> the most important parts of the import/export daemon (like the test
> for daemon-util, it doesn't test absolutely everything, but it's
> better than nothing).
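
For reference, the pattern (a simple object round-tripped through JSON
for the status file) is roughly this simplified stand-in (field names
are illustrative; the real class inherits serialization from
objects.ConfigObject):

```python
import json

class ImportExportStatus(object):
    # Simplified stand-in for the ConfigObject-based class
    __slots__ = ["recent_output", "exit_status", "connected"]

    def __init__(self, recent_output=None, exit_status=None, connected=False):
        self.recent_output = recent_output or []
        self.exit_status = exit_status
        self.connected = connected

    def ToDict(self):
        return dict((name, getattr(self, name)) for name in self.__slots__)

    @classmethod
    def FromDict(cls, data):
        return cls(**data)

def write_status_file(path, status):
    # ganeti-noded later reads and deserializes this file
    with open(path, "w") as fh:
        fh.write(json.dumps(status.ToDict()))
```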

Thanks,
iustin

