Hi All -

I just created this LogStore SPI PR [1] and, as mentioned in the meeting last week, I would like to solicit feedback on the dev list.

A couple of things to note:
- Our approach uses Splunk for log storage; we use fluentd in front of the Splunk forwarder, so with minor additions you can replace Splunk with anything that fluentd can talk to.
- There is also a generic provision for specifying your docker log driver choices, in case you don’t want to use fluentd at all (the example config is fluentd, but no code is specific to fluentd).
- As mentioned in the PR, there is some assumption about the format of stdout/stderr from the action containers. We are working on a separate PR for this, but the approach is to allow a configuration to be passed to the container that indicates a preferred log output. Of course people can also deploy their own action containers, but I think this flexibility should be exposed in the OOTB containers.

One issue that comes from decoupling log collection from activation execution is the delay between when logs are generated and when they are available to developers. We haven’t become attached to a specific approach for this, but some options (besides tuning the log forwarders to lower latency, or just polling until logs are available) are:
- Use the existing approach of adding a sentinel log to indicate the end of the activation. This makes it possible to distinguish between the states “logs not collected yet” and “logs collected but none were generated”; a developer can then be given a message like “logs not available yet” in case the collection has not made any progress yet.
- Don’t use controller APIs at all, just use the log store (Splunk, ELK, etc.); this has some effect on the usefulness of the CLI for debugging.

Someone mentioned using syslog on the call, but I didn’t quite follow the entire workflow, so please chime in here on whether this SPI interface would meet your needs.

Finally, the changes in this PR depend on the ContainerFactory PR [2], since in our testing, using an alternative container provider (e.g. Mesos) is a real test case for delegating container creation (which implies log collection as well) to an external system.

Thanks
Tyson

[1] https://github.com/apache/incubator-openwhisk/pull/2695
[2] https://github.com/apache/incubator-openwhisk/pull/2659
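
P.S. To make the sentinel option concrete, here is a rough sketch (in Python, for brevity) of how a client could tell the states apart from whatever the log store has returned so far. This is not code from the PR: the marker string, the `LogState` names and the `classify` function are all hypothetical, and the extra "partial" state is my own assumption about what to report when some lines have arrived but the sentinel has not.

```python
from enum import Enum

# Hypothetical marker; the real sentinel is whatever the action container emits.
SENTINEL = "XXX_END_OF_ACTIVATION_XXX"


class LogState(Enum):
    NOT_COLLECTED = "logs not available yet"
    PARTIAL = "logs still being collected"       # assumption: not in the original proposal
    COMPLETE_EMPTY = "logs collected but none were generated"
    COMPLETE = "logs collected"


def classify(lines):
    """Classify the lines returned so far by the log store for one activation.

    Returns (state, user_lines), where user_lines excludes the sentinel
    and anything that follows it.
    """
    user_lines = []
    for line in lines:
        if line.strip() == SENTINEL:
            # Sentinel seen: collection has caught up with the end of the activation.
            state = LogState.COMPLETE if user_lines else LogState.COMPLETE_EMPTY
            return state, user_lines
        user_lines.append(line)
    # No sentinel yet: either nothing has arrived, or collection is still in progress.
    state = LogState.PARTIAL if user_lines else LogState.NOT_COLLECTED
    return state, user_lines
```

A CLI could poll the log store, call `classify` on each result, and print the state's message until it sees one of the two complete states.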