#22983: add a descriptor interface and implementation for web-logs
---------------------------------+-----------------------------------
 Reporter:  iwakeh               |          Owner:  metrics-team
     Type:  enhancement          |         Status:  needs_review
 Priority:  Medium               |      Milestone:  metrics-lib 2.1.0
Component:  Metrics/metrics-lib  |        Version:
 Severity:  Normal               |     Resolution:
 Keywords:                       |  Actual Points:
Parent ID:                       |         Points:
 Reviewer:                       |        Sponsor:
---------------------------------+-----------------------------------

Comment (by iwakeh):

Replying to [comment:10 karsten]:
> I need more time for this review. But here's a first question:
>
> Should we really put the sanitizing code into metrics-lib rather than
> CollecTor? That's an important design decision and a change to what we
> have been doing in the past. Where would this code be used other than in
> CollecTor? So far, metrics-lib has primarily been the client-side library
> for applications using CollecTor data. But this change would turn it into
> a library that both the CollecTor server and its clients depend on. I'm
> yet undecided whether this is a good idea or not. In any case, we should
> discuss this first.

The sanitizing code **is not** part of metrics-lib, so we agree here.

In the proposed patch, metrics-lib allows sanitizing code to be added from
the 'outside' using the method
{{{
public void setSanitizer(LogDescriptor.Sanitizer sani);
}}}
and `Sanitizer` is just a functional interface (i.e., it has a single
method) that can be implemented by a lambda expression once we move to
Java 8, but that's an aside.

The given `Sanitizer` is applied when `sanitize()` is called, and the
resulting lines are sorted by metrics-lib. One choice I made is to fall
back to a default identity sanitizer if none was set, instead of raising
an exception.

With this approach metrics-lib is agnostic of the sanitizing code but
provides everything else (compression, decompression, etc.). This avoids
duplicating code and lets us implement performance and space optimizations
'under the hood' once they are needed.

I hope this explains my reasoning.

CollecTor already depends on metrics-lib, as it uses `Descriptor`s of all
kinds as well as the parser and reader from metrics-lib.

(I noticed that I missed adding a comment to
`LogDescriptor.setSanitizer()`. I'll add a fixup commit, but that
shouldn't hinder the review.)
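
To make the extension point described in this comment more concrete, here is a minimal sketch, not the code from the patch. Only `setSanitizer`, `sanitize()`, the `Sanitizer` name, the identity default, and the sorting step come from the comment. The class name `LogDescriptorSketch`, the per-line `String sanitize(String)` signature, the constructor, and sorting by natural `String` order are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SanitizerSketch {

  /** Single-method interface, so a Java 8 lambda can implement it. */
  public interface Sanitizer {
    /** Assumed signature: maps one log line to its sanitized form. */
    String sanitize(String line);
  }

  /** Hypothetical stand-in for the log descriptor; not the real class. */
  public static class LogDescriptorSketch {
    private final List<String> lines;

    // Identity default: if no sanitizer is set, lines pass through unchanged
    // rather than sanitize() raising an exception.
    private Sanitizer sanitizer = line -> line;

    public LogDescriptorSketch(List<String> lines) {
      this.lines = new ArrayList<>(lines);
    }

    /** Lets the caller (e.g., CollecTor) plug in its own sanitizing code. */
    public void setSanitizer(Sanitizer sani) {
      this.sanitizer = sani;
    }

    /** Applies the current sanitizer to every line, then sorts the result. */
    public List<String> sanitize() {
      List<String> result = new ArrayList<>();
      for (String line : lines) {
        result.add(sanitizer.sanitize(line));
      }
      // Assumption: natural String order stands in for the library's sorting.
      Collections.sort(result);
      return result;
    }
  }

  public static void main(String[] args) {
    LogDescriptorSketch desc = new LogDescriptorSketch(
        Arrays.asList("b 10.0.0.2 GET /", "a 10.0.0.1 GET /"));

    // No sanitizer set: the identity default only leaves the sorting visible.
    System.out.println(desc.sanitize());

    // Caller-supplied sanitizer as a lambda: mask anything shaped like IPv4.
    desc.setSanitizer(
        line -> line.replaceAll("\\d+\\.\\d+\\.\\d+\\.\\d+", "0.0.0.0"));
    System.out.println(desc.sanitize());
  }
}
```

The point of the design is that the library side never needs to know what the lambda does: the sanitizing rules stay with the caller, while iteration, sorting, and (in the real patch) compression remain in one place.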

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/22983#comment:11>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs