> On 24 Nov 2018, at 03:35, Benjamin Kaduk <ka...@mit.edu> wrote:
> 
> On Wed, Nov 21, 2018 at 01:53:09PM +0000, Sara Dickinson wrote:
>> 
>> 
>>> Begin forwarded message:
>>> 
>>> From: Benjamin Kaduk <ka...@mit.edu>
>>> Subject: Benjamin Kaduk's Discuss on 
>>> draft-ietf-dnsop-dns-capture-format-08: (with DISCUSS and COMMENT)
>>> Date: 19 November 2018 at 00:28:19 GMT
>>> To: "The IESG" <i...@ietf.org>
>>> Cc: draft-ietf-dnsop-dns-capture-for...@ietf.org, Tim Wicinski 
>>> <tjw.i...@gmail.com>, dnsop-cha...@ietf.org, tjw.i...@gmail.com, 
>>> dnsop@ietf.org
>>> Resent-From: <alias-boun...@ietf.org>
>>> Resent-To: j...@sinodun.com, j...@sinodun.com, s...@sinodun.com, 
>>> terry.mander...@icann.org, john.b...@icann.org
>> 
>> Many thanks for the detailed review. 
>> 
>>> 
>>> ----------------------------------------------------------------------
>>> DISCUSS:
>>> ----------------------------------------------------------------------
>>> 
>>> It is pretty shocking to not see any discussion of the privacy
>>> considerations of storing data including client addresses (and ports)
>>> alongside DNS transactions, given how central DNS resolution is to user
>>> behavior on the web.  (Note that there are mentions of potentially
>>> anonymized data in Sections 6.2 and 6.2.3 which would presumably
>>> forward-reference the privacy considerations.)  Data normalization would
>>> probably also be mentioned in this section, since (e.g.) the case used for
>>> a query/response could be used in fingerprinting an implementation.
>> 
>> There has been extensive discussion of data storage risks and practices in 
>> two DPRIVE documents, so I’d suggest the following changes in the first 
>> instance to address this:
> 
> This is exactly the sort of thing I was hoping to see, thank you!  I have
> just a couple tweaks to suggest, inline.
> 
>> New Privacy Considerations section:
>> “Storage of DNS traffic by operators in PCAP and other formats is a 
>> long-standing and widespread practice. Section 2.5 of 
>> draft-bortzmeyer-dprive-rfc7626-bis is an analysis of the risks to Internet 
>> users of the storage of DNS traffic data in servers (recursive resolvers, 
>> authoritative and rogue servers). 
>> 
>> Section 5.2 of draft-dickinson-dprive-bcp-op describes mitigations for those 
>> risks for data stored on recursive resolvers (but which could by extension 
>> apply to authoritative servers). These include data handling practices and 
>> methods for data minimisation, IP address pseudonymization and 
>> anonymization. Appendix B of that document presents an analysis of 7 
>> published anonymization processes. In addition, RSSAC have recently 
>> published RSSAC040: "Recommendations on Anonymization Processes for Source 
>> IP Addresses Submitted for Future Analysis" [1].
>> 
>> The above analyses consider full data capture (e.g. using PCAP) as a
>> baseline for privacy considerations and therefore this format
>> specification introduces no new user privacy issues beyond those of full
>> data capture. It does provide mechanisms to selectively record only
> 
> I would say "beyond those of full data capture (which are quite severe)".
> That is, while the current state of affairs is a valid baseline for
> comparison, that does not absolve us of responsibility for analyzing the
> current state of affairs.  (To be clear,
> draft-bortzmeyer-dprive-rfc7626-bis is a fine place for the bulk of that
> analysis to live, but in this document we should not pretend that the
> current state of affairs is a good situation to be in.)
> 
>> certain fields at the time of data capture to improve user privacy and to
>> explicitly indicate that data is sampled and/or anonymised. It also
>> provides flags to indicate if data normalisation has been performed; data
>> normalisation increases user privacy by reducing the potential for
>> fingerprinting individuals however a trade-off is potentially reducing
> 
> I think "however" would be offset by commas on both sides.

Both these WFM - thanks.

And thanks for the responses below - will update the draft accordingly.

Sara. 

> 
>> the capacity to identify attack traffic via query name signatures.
>> Operators should carefully consider their operational requirements and
>> privacy policies and SHOULD capture at source the minimum user data
>> required to meet their needs.”
>> 
>> [1] https://www.icann.org/en/system/files/files/rssac-040-07aug18-en.pdf
>> 
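As an illustration of the data normalisation mentioned in the proposed text, here is a minimal Python sketch of one possible step: folding the case of a query name before it is stored. The function name `normalise_qname` and the idea of returning a "was changed" indicator are assumptions made for this example; they are not taken from the draft, which only says that flags indicate whether normalisation has been performed.

```python
def normalise_qname(qname: bytes) -> tuple[bytes, bool]:
    """Fold ASCII letters in a query name to lower case.

    Returns the normalised name plus a boolean saying whether the input
    was altered. A capture tool could use that boolean when deciding how
    to mark the data as normalised (hypothetical; not the draft's layout).
    """
    # bytes.lower() only changes ASCII A-Z, which matches the ASCII-only
    # case-insensitivity of DNS names; other octets are left untouched.
    lowered = qname.lower()
    return lowered, lowered != qname
```

Once the original mixed-case form is discarded it cannot be recovered, which is the trade-off described above: less fingerprinting potential, but also less information for spotting attack traffic by query name signature.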
>> 
>> As noted, there are a few other places we can also highlight the privacy 
>> aspects:
>> 
>> Introduction:
>> OLD: “The PCAP [pcap] or PCAP-NG [pcapng] formats are typically used in 
>> practice for packet captures, but these file formats can contain a great 
>> deal of additional  information that is not directly pertinent to DNS 
>> traffic analysis  and thus unnecessarily increases the capture file size.”
>> 
>> NEW: “The PCAP [pcap] or PCAP-NG [pcapng] formats are typically used in 
>> practice for packet captures, but these file formats can contain a great 
>> deal of additional  information that is not directly pertinent to DNS 
>> traffic analysis  and thus unnecessarily increases the capture file size. 
>> Additionally, these tools and formats typically have no filter mechanism to 
>> selectively record only certain fields at capture time, requiring 
>> post-processing for anonymisation or pseudonymisation of data to protect 
>> user privacy.”
>> 
>> Section 4, bullet point 2:
>> 
>> OLD: “Different users will have different requirements
>>          for data to be available for analysis.  Users with minimal
>>          requirements should not have to pay the cost of recording full
>>          data, though this will limit the ability to perform certain
>>          kinds of data analysis and also to reconstruct packet
>>          captures.  For example, omitting the resource records from a
>>          Response will reduce the C-DNS file size; in principle
>>          responses can be synthesized if there is enough context.”
>> 
>> NEW: “Different operators will have different requirements
>>          for data to be available for analysis.  Operators with minimal
>>          requirements should not have to pay the cost of recording full
>>          data, though this will limit the ability to perform certain
>>          kinds of data analysis and also to reconstruct packet
>>          captures.  For example, omitting the resource records from a
>>          Response will reduce the C-DNS file size; in principle
>>          responses can be synthesized if there is enough context.
>>          Operators may have different policies for collecting user data
>>          and can choose to omit or anonymise certain fields at
>>          capture time, e.g. client address.”
>> 
>> And yes, in both sections 6.2 and 6.2.3 add forward references to the 
>> Privacy Considerations section
>> 
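To show what anonymising a client address at capture time might look like, here is a minimal Python sketch that truncates an address to a prefix and keeps only the bytes needed to hold that prefix. The function name `truncate_client_address` and the default prefix lengths (/24 for IPv4, /48 for IPv6) are assumptions chosen for illustration; they are not recommendations from the draft or from the documents cited above.

```python
import ipaddress


def truncate_client_address(addr: str, v4_prefix: int = 24, v6_prefix: int = 48) -> bytes:
    """Return only the leading bytes of the address that cover the prefix.

    The prefix lengths are illustrative defaults, not values from the draft.
    """
    ip = ipaddress.ip_address(addr)
    prefix_len = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False masks off the host bits instead of raising an error.
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    # Round up to whole bytes, since only byte strings can be stored.
    nbytes = (prefix_len + 7) // 8
    return network.network_address.packed[:nbytes]
```

With these defaults an IPv4 address is stored as 3 bytes and an IPv6 address as 6 bytes, so the host part never reaches the capture file.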
>> 
>>> 
>>> I'm also concerned about the policy/procedure for allocating/extending the
>>> various bitfields and similar potential extension points in the data
>>> structures.  Section 8 covers the major/minor versioning semantics with
>>> respect to new map keys and new maps, but not addition of new bits within
>>> existing (uint) bitmaps.  Given the usage of the CDDL .bits constraint,
>>> it's not really clear that an IANA registry is the right tool to use, but I
>>> think some indication of the expected way to allocate new bits is in order,
>>> whether it's "a future standards-track document that updates this document"
>>> or otherwise.  (I've noted many, but not all, instances of such bitmaps in
>>> my COMMENT section.)
>> 
>> We are inclined to follow the lead of existing RFCs making use of CBOR, 
>> namely
>> * RFC8152 'CBOR Object Signing and Encryption' (July 2017)
>> * RFC8392 'CBOR Web Token (CWT)' (May 2018) and 
>> * RFC8428 'Sensor Measurement Lists (SenML)' (Aug 2018) 
>> and request IANA create a C-DNS registry with
>> subregistries with keys for each of the different maps used in C-DNS.
>> New entries in these subregistries would follow Expert Review as defined
>> in RFC8126. This appears to be the emerging usual way of dealing with
>> CBOR map key values, particularly integer.
> 
> That sounds like a fine path forward, thanks.
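
To illustrate why the allocation policy for new bits matters to implementers, here is a minimal Python sketch of a reader that separates the bits it knows about from bits allocated after it was written. The class `ExampleFlags` and its three bit names are hypothetical placeholders, not the draft's actual bit assignments.

```python
from enum import IntFlag


class ExampleFlags(IntFlag):
    # Hypothetical bit assignments, for illustration only.
    FLAG_A = 1 << 0
    FLAG_B = 1 << 1
    FLAG_C = 1 << 2


# Mask covering every bit this (hypothetical) reader understands.
KNOWN_MASK = sum(flag.value for flag in ExampleFlags)


def split_flags(raw: int) -> tuple[ExampleFlags, int]:
    """Split a raw bitmap into known flags and leftover unknown bits."""
    known = ExampleFlags(raw & KNOWN_MASK)
    unknown = raw & ~KNOWN_MASK
    return known, unknown
```

A reader that finds a non-zero `unknown` value has to decide whether to ignore those bits or reject the file, and that choice is only safe if the specification says how new bits get allocated.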
> 
>>> 
>>> There are also a couple of fields whose semantics don't seem to be
>>> sufficiently well specified for a proposed-standard document, such as
>>> vlan-ids, generator-id, name-rdata, and ae-code.  (I understand that some
>>> of them are probably only going to have locally relevant semantics, but we
>>> should be explicit about when that's the case.)
>> 
>> Acknowledged, we’ll add references or clarifications for these (will put 
>> details in a follow up mail that will also address your comments below).
> 
> Sounds good.
> 
>>> 
>>> If I'm reading things correctly that the IP address type is inferred from
>>> the bytestring length, then I think we need to enforce a restriction on the
>>> address prefix length(s) to allow for that inference to be unambiguous
>>> (noting that we only have the *byte* length of the address fields at our
>>> disposal for disambiguation, and not the more precise bit-length).
>> 
>> Ah, the first bit of the qr-transport-flags contains an IPv4/IPv6 flag, so 
>> the address type can be determined explicitly from that if it is set. But of 
>> course there is a corner case we hadn’t considered, where that field isn’t 
>> present, so we’ll have to address that. Making that field mandatory if 
>> prefixes are used would be simplest. 
> 
> I guess I had forgotten about that bit in the qr-transport-flags on my
> first read.  Making it mandatory if prefix lengths are present ought to
> work.
> 
> -Benjamin
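
The ambiguity discussed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the draft's decoding rule: the function name `infer_ip_version` is hypothetical, and so is the convention that bit 0 set means IPv6 (the thread only says that the first bit of qr-transport-flags is an IPv4/IPv6 flag).

```python
from typing import Optional

# Assumption for this sketch: bit 0 set means IPv6, clear means IPv4.
IPV6_FLAG = 1 << 0


def infer_ip_version(addr: bytes, transport_flags: Optional[int] = None) -> Optional[int]:
    """Return 4 or 6, or None when the address family cannot be determined.

    Assumes addresses may have been truncated to a prefix before storage.
    """
    if transport_flags is not None:
        # The flag settles the question regardless of the stored length.
        return 6 if transport_flags & IPV6_FLAG else 4
    if len(addr) > 4:
        # More than 4 bytes cannot be an IPv4 address.
        return 6
    # Up to 4 bytes could be an IPv4 address or a short IPv6 prefix.
    return None
```

Without the flag, a 4-byte value could be a full IPv4 address or an IPv6 address truncated to a /32, which is why making the flags field mandatory whenever prefixes are used removes the ambiguity.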
