Hi Salvatore,

On Sat, Jun 08, 2019 at 06:29:24PM +0200, Salvatore Bonaccorso wrote:
> Hi,
>
> On Thu, Jun 06, 2019 at 06:11:53PM +0200, Salvatore Bonaccorso wrote:
> > Hi Daniel,
> >
> > On Thu, Jun 06, 2019 at 08:35:47AM +0200, Daniel Lange wrote:
> > > On 06.06.19 at 07:31, Salvatore Bonaccorso wrote:
> > > > Could you again point me to your split-up variant mirror?
> > >
> > > https://git.faster-it.de/debian_security_security-tracker_split_files/
> >
> > Thanks!
> >
> > While starting to look at it, could you change the splitting to
> > $year.list instead of list.$year? I know this comes from the initial
> > script which was committed. It is, though, more intuitive to work with
> > $year.something than something.$year in this context.
>
> Thanks to Daniel for providing the converted repository (with the lists
> named the other way around, as $year.list, which is more intuitive and
> looks saner to me), which gets updated regularly; this serves as an
> extremely good basis.
>
> Below are some thoughts which I started collecting during the last few
> days; please note they might not yet be complete. Please also try not
> to push/force us too much -- whilst we understand the issue, and see
> that something needs to happen whatever the solution is (split, move
> somewhere else), we regularly have more serious issues popping up, and
> we want and need to look at those. But we acknowledge and see the salsa
> admin point of view as well.
>
> That said, here is what I have at the moment; some items are easy, some
> will/might be more involved.
>
> Notes on possible CVE/list splits
> ---------------------------------
>
> - workflows on the files themselves by the most active users. The list
>   is often kept open for cross-checking all issues in one file. But
>   this will "just" need other ways to deal with the situation by the
>   persons working most on it.
> - Code of the security-tracker service and the python modules
>   themselves, which currently rely on the data/*/list formats (DSA,
>   DLA, CVE, ...). This could probably be split up and use
>   data/*/*.list.
> - Externally called but included in code: update script which fetches
>   the MITRE list and integrates all needed changes (see further below).
> - bin/bts-update (called from scripts/update-CVE-assignments in cron of
>   the security-tracker-services) operates based on data/CVE/list and
>   keeps track of the already tagged bugs by comparing with an 'oldlist'.
>   The oldlist is copied on a run on soriano.debian.org as a 'state'
>   file, similar to logrotate's statefile (cron).
> - bin/check-new-issues: parsing of TODO and checks for the new issues
>   are likewise based on the existence and parsing of 'data/CVE/list'.
>   After a split-up the interactive commands should still be able to
>   navigate through the items.
> - bin/check-syntax: checks the syntax of the various lists based on the
>   security-tracker parser for the lists. make check-syntax from the
>   Makefile, the pre-commit hook and the C/I tests all use this script
>   for the syntax check. It depends on CVEFile from python/bugs.py as
>   well. Relevant here is the check-syntax target from the Makefile. At
>   SVN times this was actually only testing the syntax of the changed
>   files, but now it just runs make check-syntax.
> - bin/compare-nvd-cve reads from data/CVE/list and is probably easier
>   to adapt; it is used basically in an "experimental" Makefile target,
>   update-compare-nvd. AFAICS it just reads the information, so it
>   should be easy to adapt to any split-up setup.
> - bin/gen-{DSA,DLA}: use data/CVE/list as a sanity check for the
>   presence of the CVE.
> - bin/get-todo-items (this script is currently not working correctly
>   and it's implemented already via the webview, so we need to consider
>   whether we actually still need it).
> - bin/inject-embedded-code-copies (experimental script, not actively
>   used)
> - bin/rejected-with-info relies on data/CVE/list directly, but will
>   potentially be easily adaptable in a split setup.
> - bin/setup-repo: checks for data/CVE/list just to make sure it's the
>   right repo.
> - bin/report-vuln uses CVEFile (from python/bugs.py).
> - bin/update and bin/updatelist: parse the DSA/DTSA/DLA lists and
>   data/CVE/list, adding new entries from the MITRE feed and
>   cross-references for the DSA/DLAs to a new data/CVE/list, which is
>   then committed by the cronjob on soriano. This is the one processing
>   those files; in a split setup it will need to continue to work.
> - bin/update-db (triggered by a Makefile target to update the
>   security.db sqlite database).
> - bin/update-nvd (possibly depends on the CVE lists via the modules
>   used, but not directly).
> - data/config.json contains the sources for CVE, DSA, DLA and extended
>   lists. Currently the path is a path component starting from data,
>   e.g. for CVE files the path is '/CVE/list'. See as well "Setting up
>   an extended instance" in the documentation.
> - lib/python/bugs.py contains the classes CVEFile, DSAFile,
>   CVEExtendFile.
> - lib/python/debian_support.py: defines the getconfig function reading
>   data/config.json.
> - lib/python/security_db.py, via getSources, gets the configuration
>   from where to read the CVE, DSA, DLA and Extends information defined
>   in config.json.

Maybe this helps to cut down on the list of things to tackle:

For things needing the whole history and only requiring r/o access, we
could just add a Makefile target that creates data/CVE/list from the
split files and have that in .gitignore.

For tools that write, usually only the latest file is needed, so we
could have a symlink data/CVE/latest -> data/CVE/2019.list committed to
git that gets moved once a year.

Let me know if that makes sense and I can help with that.

Cheers,
 -- Guido

>
> Regards,
> Salvatore
>
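
To make the two ideas above a bit more concrete, here is a minimal sketch in Python, not a finished tool. It assumes the split files live in one directory and are named <year>.list, and that the combined file should list the newest year first. The function names (split_files, build_combined_list, point_latest) are hypothetical and do not exist in the tracker repository.

```python
import os
import re
import tempfile
from pathlib import Path

# Assumption: split files are named like 2019.list, 2018.list, ...
YEAR_LIST = re.compile(r"^(\d{4})\.list$")


def split_files(cve_dir):
    """Return the <year>.list files in cve_dir, newest year first."""
    found = []
    for entry in Path(cve_dir).iterdir():
        match = YEAR_LIST.match(entry.name)
        if match and not entry.is_symlink():
            found.append((int(match.group(1)), entry))
    return [path for _, path in sorted(found, reverse=True)]


def build_combined_list(cve_dir, target_name="list"):
    """Concatenate the split files into cve_dir/<target_name>.

    Meant for read-only consumers; the result would be listed in
    .gitignore rather than committed.
    """
    cve_dir = Path(cve_dir)
    target = cve_dir / target_name
    # Write to a temporary file in the same directory, then rename, so
    # readers never see a half-written list.
    fd, tmp_name = tempfile.mkstemp(dir=cve_dir, prefix=".list.")
    with os.fdopen(fd, "wb") as out:
        for path in split_files(cve_dir):
            data = path.read_bytes()
            out.write(data)
            # Keep entries from adjacent files on separate lines.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    os.chmod(tmp_name, 0o644)  # mkstemp creates the file with mode 0600
    os.replace(tmp_name, target)
    return target


def point_latest(cve_dir, link_name="latest"):
    """Make cve_dir/<link_name> a relative symlink to the newest split file."""
    cve_dir = Path(cve_dir)
    newest = split_files(cve_dir)[0]
    link = cve_dir / link_name
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to(newest.name)  # relative target, e.g. "2019.list"
    return link
```

A Makefile target could then simply run something equivalent to build_combined_list("data/CVE"), and point_latest("data/CVE") would only need to be run (and its result committed) when a new year's file is added.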