#22196: Configure descriptor sources using method chaining
---------------------------------+-----------------------------------
 Reporter:  karsten              |          Owner:  metrics-team
     Type:  enhancement          |         Status:  new
 Priority:  Medium               |      Milestone:  metrics-lib 1.9.0
Component:  Metrics/metrics-lib  |        Version:
 Severity:  Normal               |     Resolution:
 Keywords:                       |  Actual Points:
Parent ID:                       |         Points:
 Reviewer:                       |        Sponsor:
---------------------------------+-----------------------------------

Comment (by karsten):

 I had the following idea when working on #22141: Maybe we can avoid
 configuring descriptor sources entirely and go with overloaded, stateless
 methods like the one we have in `DescriptorCollector`.

 The main obstacle here is `DescriptorReader`'s parse history: the
 application needs a way to save the parse history to disk once it's done
 processing descriptors. Maybe we can work around that obstacle by using
 `minLastModified` in `DescriptorReader`, as we already do in
 `DescriptorCollector`.

 Here are possible method signatures in the three non-deprecated descriptor
 sources:

 {{{
 DescriptorParser:

   Iterable<Descriptor> parse(byte[] rawDescriptorBytes);

 DescriptorCollector:

   void collect(
       File localDirectory,
       String collecTorBaseUrl,
       String... remoteDirectories);

   void collect(
       long minLastModified,
       File localDirectory,
       String collecTorBaseUrl,
       String... remoteDirectories);

   void collect(
       boolean deleteExtraneousLocalFiles,
       long minLastModified,
       File localDirectory,
       String collecTorBaseUrl,
       String... remoteDirectories);

 DescriptorReader:

   Iterable<Descriptor> read(
       File... descriptorFiles);

   Iterable<Descriptor> read(
       long minLastModified,
       File... descriptorFiles);

   Iterable<Descriptor> read(
       int maxDescriptorsInQueue,
       long minLastModified,
       File... descriptorFiles);
 }}}

 An application that wants to process only newly added descriptors would
 start by retrieving the current timestamp to mark the beginning of its
 current execution. It would then load the last execution timestamp from
 disk (rather than let metrics-lib load the last parse history) and pass
 that timestamp as `minLastModified` to both `collect()` and `read()`. In
 fact, it might want to subtract 15 or 30 minutes from that timestamp to
 account for clock skew with the CollecTor server. When it's done
 processing descriptors, it saves the timestamp it retrieved at the start
 of this execution for use by the next execution (rather than let
 metrics-lib save the parse history to disk).
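
 The timestamp bookkeeping described above could be sketched roughly as
 follows. This is only a sketch under assumptions: the class name
 `LastExecutionTimestamp`, the state file name `last-execution`, and the
 30-minute skew constant are made up for illustration and are not part of
 metrics-lib. It also assumes that a `minLastModified` of 0 means "no lower
 bound". The `collect()` and `read()` calls appear only as comments because
 those signatures are still proposals.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical application-side helper; not part of metrics-lib. */
final class LastExecutionTimestamp {

  /** Assumed safety margin for clock skew with the CollecTor server. */
  static final long CLOCK_SKEW_MILLIS = 30L * 60L * 1000L;

  /**
   * Loads the previous execution's start time and subtracts the skew margin.
   * Returns 0 if there is no state file yet (assumed to mean "no lower bound").
   */
  static long loadMinLastModified(Path stateFile) {
    if (!Files.exists(stateFile)) {
      return 0L;
    }
    try {
      String content = new String(
          Files.readAllBytes(stateFile), StandardCharsets.UTF_8).trim();
      // Never return a negative value, even for very small timestamps.
      return Math.max(0L, Long.parseLong(content) - CLOCK_SKEW_MILLIS);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Saves the start time of the execution that just finished. */
  static void save(Path stateFile, long executionStartMillis) {
    try {
      Files.write(stateFile,
          Long.toString(executionStartMillis).getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path stateFile = Paths.get("last-execution"); // hypothetical file name
    // 1. Mark the beginning of this execution.
    long executionStart = System.currentTimeMillis();
    // 2. Load the previous execution's timestamp, minus the skew margin.
    long minLastModified = loadMinLastModified(stateFile);
    System.out.println("minLastModified = " + minLastModified);
    // 3. Proposed calls from this ticket (not an existing API):
    //    collector.collect(minLastModified, localDirectory,
    //        collecTorBaseUrl, remoteDirectories);
    //    for (Descriptor d : reader.read(minLastModified, descriptorFiles)) {
    //      ... process d ...
    //    }
    // 4. Only after processing succeeded, remember when this run started.
    save(stateFile, executionStart);
  }
}
```

 Saving the start time rather than the end time means that files modified
 while the application was running are picked up again next time, at the
 cost of possibly processing a few descriptors twice.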
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/22196#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs