#18910: distributing descriptors accross CollecTor instances -------------------------------+--------------------------------- Reporter: iwakeh | Owner: iwakeh Type: enhancement | Status: needs_review Priority: High | Milestone: CollecTor 1.1.0 Component: Metrics/CollecTor | Version: Severity: Normal | Resolution: Keywords: ctip | Actual Points: Parent ID: | Points: Reviewer: | Sponsor: -------------------------------+---------------------------------
Comment (by karsten): Alright! Please find [https://gitweb.torproject.org/karsten/metrics- db.git/log/?h=task-18910-3 branch task-18910-3 in my repository]. I created fixup/squash commits for most things I found. Please take a close look and challenge anything you don't like to be squashed into your commit. In particular, please take a look why the unit tests don't pass anymore. ;) More comments below: - I believe that index.json would include the `.tmp` files written during sync, because we're only skipping files starting with `.`. And that's not even a new issue, because the `.tmp` files written during normal updating would also be included. The issue will only become more prevalent. I'd say we adapt the indexer, which can happen in a separate commit. Or we could change temporary files to start with `.`, but that seems a bit more error-prone. Suggestion: '''fix''' - As mentioned before somewhere, we're appending server/extra-info descriptors to different files in the synchronization run than in the earlier update run. The reason is that `SyncManager` creates its own `collectionDate` in the constructor. However, we cannot just pass a `Date` instance to both `SyncManager` and `ArchiveWriter`, because then `SyncManager` would append to a file that we're already serving, in which case the file size would change once we're done synchronizing, and we should try to avoid that. Let's fix this as soon as the current updaters are modules just like the synchronization module, and let's keep in mind not to close and rename temporary files to their final names between modules. Suggestion: '''postpone''' - As said earlier, let's release metrics-lib and use that new release as basis for this patch. Then we won't have to import and instantiate `org.torproject.descriptor.index.DescriptorIndexCollector` anymore, and we can change defaults for the `*SyncOrigins` properties in `collector.properties` back to `https://collector.torproject.org`. Suggestion: '''change''' - Regarding comment 25 above, I'm still a fan of using `printf` as opposed to creating `SimpleDateFormat` instances and concatenating strings. I also believe it would be more efficient. But I'm happy to postpone that change. Suggestion: '''postpone''' -- Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18910#comment:56> Tor Bug Tracker & Wiki <https://trac.torproject.org/> The Tor Project: anonymity online _______________________________________________ tor-bugs mailing list tor-bugs@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs