Re: [OpenStack-Infra] Log storage/serving

Joshua Hesketh Mon, 16 Sep 2013 17:24:43 -0700

So if zuul dictates where a log goes and we place the objects in swift with that path (change / patchset / pipeline / job / run), then zuul could also handle placing indexes, as it should know which objects to expect.

That said, if the path is deterministic (such as that) and the workers provide the index for a run, then I'm not sure how useful an index for patchsets would be. I'd be interested to know if anybody uses the link http://logs.openstack.org/34/45334/ without having come from gerrit or another source where it is published. Because of its deterministic nature, perhaps the use case where it is needed could be served some other way?


Cheers,
Josh

--
Rackspace Australia

On 9/13/13 2:49 AM, James E. Blair wrote:

Joshua Hesketh <joshua.hesk...@rackspace.com> writes:

We could then use either pseudo folders[0] or have the worker generate
an index. For example, why not create an index object with links to
the other objects (using the known serving application URL prepended)?
In fact, the reporter can choose whether to generate an index file or
just send the pseudo folder to be served up.

This is one of the main reasons we don't use swift today.  Consider this
directory:

http://logs.openstack.org/34/45334/

It contains all of the runs of all of the jobs for all of the patchsets
for change 45334.  That's very useful for discoverability; the
alternative is to read the comments in gerrit and examine the links
one-by-one.  A full-depth example:

http://logs.openstack.org/34/45334/7/check/gate-zuul-python27/7c48ee3/

(That's change / patchset / pipeline / job / run.)
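
As a rough illustration of how deterministic that layout is, the path for a
run can be rebuilt from its five components alone. This is only a sketch:
the function name log_path is hypothetical, and the rule that the leading
"34" is the last two digits of the change number is an assumption read off
the example URL above, not something stated in this thread.

```python
def log_path(change, patchset, pipeline, job, run):
    """Build the change / patchset / pipeline / job / run path for one run."""
    change = str(change)
    # Assumption: the leading component is the last two digits of the
    # change number, as in 34/45334 (zero-padded for one-digit changes).
    shard = change[-2:].zfill(2)
    parts = [shard, change, str(patchset), pipeline, job, run]
    return "/".join(parts) + "/"


if __name__ == "__main__":
    # Components taken from the full-depth example URL above.
    print(log_path(45334, 7, "check", "gate-zuul-python27", "7c48ee3"))
```

Anything that knows those five values (zuul, for instance) can compute the
location without consulting an index.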

Each individual job is concerned with only the last component of that
hierarchy, and has no knowledge of what other related jobs may have run
before or will run after, so creating an index under those circumstances
is difficult.  Moreover, if you consider that in the new system, we
won't be able to trust a job with access to any pseudo-directory level
higher than its individual run, there is actually no way for it to
create any of the higher-level indexes.

If we want to maintain that level of discoverability, then I think we
need something outside of the job to create indexes (in my earlier
email, the artifact-serving component does this).  If we are okay losing
that, then yes, we can just sort of shove everything related to a run
into a certain arbitrary location whose path won't be important anymore.
Within the area written to by a single run, however, we may still have
subdirectories.  Whether and how to create swift directory markers for
those is still an issue (see my other email).  But perhaps they are not
necessary, and yes, certainly _within the directory for a run_, we could
create index files as needed.
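
To make the "something outside of the job" idea concrete, here is a minimal
sketch of how a component that can see every object name might derive a
higher-level index from a flat listing. It is not the design proposed in
this thread; list_children and render_index are hypothetical names, and in
the sample data only the first object path follows the example URL above
(the other job name and run ids are made up).

```python
import html


def list_children(object_names, prefix):
    """Return the immediate children under prefix, sorted.

    Children that contain further levels are returned with a trailing
    "/" so they read like pseudo-directories.
    """
    children = set()
    for name in object_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if not rest:
            continue  # the prefix itself, e.g. a directory marker
        head, sep, _ = rest.partition("/")
        children.add(head + sep)
    return sorted(children)


def render_index(object_names, prefix):
    """Render a very plain HTML index page for one pseudo-directory."""
    items = "\n".join(
        '<li><a href="{0}">{0}</a></li>'.format(html.escape(child, quote=True))
        for child in list_children(object_names, prefix)
    )
    return "<h1>Index of /{}</h1>\n<ul>\n{}\n</ul>".format(
        html.escape(prefix), items
    )


if __name__ == "__main__":
    # Hypothetical flat object listing for change 45334.
    objects = [
        "34/45334/7/check/gate-zuul-python27/7c48ee3/console.html",
        "34/45334/7/check/gate-zuul-pep8/abc1234/console.html",
        "34/45334/8/check/gate-zuul-python27/def5678/console.html",
    ]
    print(render_index(objects, "34/45334/"))
```

The point of the sketch is that the grouping needs the full listing, which
an individual job restricted to its own run would not have.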

Note the following implementation quirks we have observed:

  * Rackspace does not perform autoindex-like functionality for directory
    markers unless you are using the CDN (which has its own complications
    related to cache timeouts, dns hostnames, etc).

  * HPCloud does not recognize directory markers when generating index
    pages for the public view of containers.

We may want and indeed be able to use the staticweb feature, along with
the CDN -- but there's enough complication here that we'll need to get
fairly detailed in the design and validate our assumptions.

-Jim

_______________________________________________
OpenStack-Infra mailing list
OpenStack-Infra@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra


