[ 
https://issues.apache.org/jira/browse/TS-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752429#comment-13752429
 ] 

Leif Hedstrom commented on TS-2145:
-----------------------------------

Nice! Couple of minor thought:

1) Why not add an additional metric for "disk full" events? E.g. in
{code}
+    case Log::FAIL:
+    case Log::FULL:
{code]
would it not make sense to treat "failures" as different events than disk full? 
I can see taking different actions on e.g. LOG::FULL. Unless of course the only 
"failure" is disk full, but then we really ought to clean up that :).


2) I'm not a huge fan of the existing #define's like 
{code}
#define LOG_INCREMENT_DYN_STAT(x) \
        RecIncrRawStat(log_rsb, mutex->thread_holding, (int) x, 1);
{code}

Since you are working in this code already, could we just eliminate e.g. 
LOG_INCREMENT_DYN_STAT (and the other similar ones) and just use 
RecIncrRawStat() directly? The issue I have is that these defines depend on 
certain things in the current scope, such as the mutex, making the code less 
readable IMO.

3) Use < 120 characters per line, e.g. this can easily fit on one line, and not 
two:
{code}
+    Note("logging space exhausted, can't write to:%s, drop this entry",
+         m_logFile->m_name);
{code}


4) In ::log() why use an if statement and not a switch ? E.g. why
{code}
+  if (ret & (Log::FAIL | Log::FULL)) {
+    // At least one of object logging failed.
+    LOG_INCREMENT_DYN_STAT(log_stat_event_log_access_fail_stat);
+  } else if (ret & Log::LOG_OK) {
{code}
Considering that LOG_OK is probably the predominant case, why sacrifice and do 
two logical operations in that case? This seems unnecessary, and inconsistent 
with how you use the return values elsewhere. Also, assuming the return value 
is only one of ReturnCodeFlags, why did you make that a bit-field instead of 
just a sequence? If you switch the above to a switch statement, maybe change 
ReturnCodeFlags back as well.

5) This is very nitpicky, but I find the naming of corresponding stats slightly 
confusing. We have log_stat_event_log_error_skip_stat and 
log_stat_event_log_access_skip_stat. The latter presumably is for log collation 
receiver side of log collation? Would it make sense to call them e.g. 
log_stat_event_log_produce_skip_stat and log_stat_event_log_consume_skip_stat ? 
Or perhaps _write_ and _read_ (respectively) ?
                
> Add metrics for log collation, e.g. messages sent and messages received
> -----------------------------------------------------------------------
>
>                 Key: TS-2145
>                 URL: https://issues.apache.org/jira/browse/TS-2145
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Logging
>            Reporter: Yunkai Zhang
>            Assignee: Yunkai Zhang
>         Attachments: 0001-TS-2145-Add-metrics-for-log-collation.patch
>
>
> Without accurate stats, we don't known whether the log is lost.
> This coming patch will be used to verify:
> 1) At log client side, how many logs genterated from HttpSM, how many logs 
> sent to LogServer.
> 2) At log server side, how many logs received from LogClient, how many logs 
> written to LogServers's disk.
> ==
> I have attached a patch. Let's show an example for the metrics collected by 
> this patch:
> In this example, there are two machines: one log-client and one log-server.
> {code}
> 1) Clear old stats data in both client and server machines:
> $ /etc/init.d/trafficserver stop
> $ rm -rf /var/run/trafficserver/*.snap
> 2) Make logs_xml.config in log server empty.
> 3) Make logs_xml.config in log client looks like:
> <LogObject>
>   <Format = "Cdn_access_log"/>
>   <CollationHosts = "<log-server-ip>:8085"/>
>   <Filename = "cdn_access"/>
>   <Protocols = "http"/>
>   <Filters = "only_get"/>
> </LogObject>
> 2) Start log server (with collation mode == 1, logging_enable = 3, 
> squid_log_enabled = 0)
> [log-server]$ /etc/init.d/trafficserver start
> 3) Start log client (with collation mode = 0, logging_enable = 3, 
> squid_log_enabled = 0)
> [log-server]$ /etc/init.d/trafficserver start
> 4) Use jtest do some request for log client.
> 5) Stop log client and for a while (so that all data in network have been 
> sent to log-server).
> 6) Log client's metrics for logging:
> [log-client]# lynx -dump http://localhost:8080/stat/ | grep process.log
> proxy.process.log.event_log_error=0
> proxy.process.log.event_log_error_skip=0
> proxy.process.log.event_log_error_aggr=0
> proxy.process.log.event_log_error_fail=0
> proxy.process.log.event_log_access=4990275
> proxy.process.log.event_log_access_skip=0
> proxy.process.log.event_log_access_aggr=0
> proxy.process.log.event_log_access_fail=0
> proxy.process.log.num_sent_to_network=4990275
> proxy.process.log.num_received_from_network=0
> proxy.process.log.num_flush_to_disk=0
> proxy.process.log.bytes_sent_to_network=719149632
> proxy.process.log.bytes_received_from_network=0
> proxy.process.log.bytes_flush_to_disk=0
> proxy.process.log.bytes_written_to_disk=0
> proxy.process.log.log_files_open=0
> proxy.process.log.log_files_space_used=5330
> 7) Log server's metrics for logging:
> [log-server]# lynx -dump http://localhost:8080/stat/ | grep process.log
> proxy.process.log.event_log_error=0
> proxy.process.log.event_log_error_skip=0
> proxy.process.log.event_log_error_aggr=0
> proxy.process.log.event_log_error_fail=0
> proxy.process.log.event_log_access=0
> proxy.process.log.event_log_access_skip=20
> proxy.process.log.event_log_access_aggr=0
> proxy.process.log.event_log_access_fail=0
> proxy.process.log.num_sent_to_network=0
> proxy.process.log.num_received_from_network=4990275
> proxy.process.log.num_flush_to_disk=4990275
> proxy.process.log.bytes_sent_to_network=0
> proxy.process.log.bytes_received_from_network=719149632
> proxy.process.log.bytes_flush_to_disk=485256906
> proxy.process.log.bytes_written_to_disk=485256906
> proxy.process.log.log_files_open=1
> proxy.process.log.log_files_space_used=85215225070
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to