--- Begin Message ---
Hello tcpdump workers,

We've been working on adding a new feature to tcpdump that will allow IP
address anonymization via the Crypto-PAn (Cryptography-based
Prefix-preserving Anonymization) approach. The feature we’re adding to
tcpdump is motivated by the importance of preserving user privacy and
complying with data processing security regulations. Crypto-PAn maps
addresses so that any two addresses sharing a k-bit prefix also share a
k-bit prefix after anonymization, which hides the real addresses while
preserving the network structure.

The goal of this email is to poll the interest of the tcpdump community in
merging this feature once it’s complete, and to get in touch with potential
reviewers of our patch.

We are aware that there are external tools that seek a similar goal, as
discussed in PR #615 (https://github.com/the-tcpdump-group/tcpdump/pull/615).
However, the anonymization methods used by these tools often fail to
balance privacy against data utility. For example, Black Marker sets the
IP address to all zeros, resulting in a complete loss of utility.
Permutation can distort the original data distribution, resulting in
skewed results and lower analytical value. Similarly, traditional IP
randomization methods frequently treat each octet independently, which
ignores the hierarchical structure of IP addresses and compromises the
integrity of network analysis and management.

For this reason, we believe that the best approach is to use
prefix-preserving anonymization techniques, which are similar to
permutation techniques but preserve shared prefixes. The mapping is
determined by a cryptographic key, so the same key always yields the same
anonymized address, which helps balance privacy and utility.
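
To make the prefix-preserving property concrete, here is a minimal sketch
in Python. It is not the cryptopANT implementation and not the code in our
patch: Crypto-PAn derives its per-bit decisions from AES, whereas this toy
version uses HMAC-SHA256 as a stand-in pseudorandom function. The names
anonymize_ipv4 and shared_prefix_len are hypothetical helpers introduced
only for this illustration.

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(addr: str, key: bytes) -> str:
    """Toy prefix-preserving mapping (HMAC-SHA256 stands in for AES)."""
    original = int(ipaddress.IPv4Address(addr))
    flips = 0
    for i in range(32):
        # The first i bits of the original address (0 when i == 0).
        prefix = original >> (32 - i)
        # Include the prefix length so that prefixes "0" and "00" differ.
        msg = bytes([i]) + prefix.to_bytes(4, "big")
        digest = hmac.new(key, msg, hashlib.sha256).digest()
        # One pseudorandom bit decides whether bit i of the address flips.
        flip_bit = digest[0] >> 7
        flips = (flips << 1) | flip_bit
    return str(ipaddress.IPv4Address(original ^ flips))


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


if __name__ == "__main__":
    key = b"demo key, not a real cryptopANT key"
    a, b = "192.168.1.10", "192.168.1.77"
    print(anonymize_ipv4(a, key), anonymize_ipv4(b, key))
    print(shared_prefix_len(a, b),
          shared_prefix_len(anonymize_ipv4(a, key), anonymize_ipv4(b, key)))
```

Because the decision to flip bit i depends only on the i bits before it,
two addresses that agree on their first k bits receive the same flips on
those bits and therefore still agree on them afterwards. The same key
always produces the same mapping.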

We believe that this functionality is well suited for tcpdump because much
of the logic used to print an IP address for a specific packet can be
reused to access that IP and anonymize it. The logic for dissecting packet
headers can be slightly adapted to implement this feature, including
anonymization of application headers. For example, much of the code written
to print an IP address offered by DHCP can be used to access that address
and anonymize it.

We have an early prototype of this patch. The feature we’re adding uses the
cryptopANT library. This library provides a comprehensive set of
anonymization functions designed for IPv4 and IPv6 addresses. With the
addition of a new flag, "--anon", users enable IP address anonymization in
tcpdump by providing a key file that will be used by the Crypto-PAn
anonymization algorithm.

Here's a brief overview of how the implementation works:

1. Activation Flag: Users activate the anonymization feature by passing
the "--anon" flag on the tcpdump command line.

2. Key File: A key file containing the encryption key required by the
Crypto-PAn algorithm must be provided as the argument to the "--anon"
flag.

3. Callback Invocation: When "pcap_loop" delivers a packet, it invokes a
callback that anonymizes the IP addresses in the packet headers.

4. Execution of Real Callback: Following anonymization, the "real
callback" is triggered. This callback performs the currently implemented
actions, such as dumping packet contents or writing them to a pcap file.
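
The callback chaining in steps 3 and 4 can be sketched as follows. This is
an illustration in Python rather than the C code of the prototype, and it
makes several simplifying assumptions: the packet starts directly at the
IPv4 header (no link-layer header), only the source and destination fields
of that header are rewritten, and transport-layer checksums are left
untouched. The names make_anon_callback and ipv4_checksum are hypothetical,
and the anonymize argument stands for any function that maps 4 address
bytes to 4 address bytes.

```python
import struct
from typing import Callable


def ipv4_checksum(header: bytes) -> int:
    """Internet checksum (RFC 1071) over an IPv4 header."""
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def make_anon_callback(
    anonymize: Callable[[bytes], bytes],
    real_callback: Callable[[bytes], None],
) -> Callable[[bytes], None]:
    """Wrap real_callback so it only ever sees anonymized addresses."""

    def callback(packet: bytes) -> None:
        buf = bytearray(packet)
        # Only touch packets that look like IPv4 with a complete header.
        if len(buf) >= 20 and buf[0] >> 4 == 4:
            ihl = (buf[0] & 0x0F) * 4  # header length in bytes
            if 20 <= ihl <= len(buf):
                buf[12:16] = anonymize(bytes(buf[12:16]))  # source
                buf[16:20] = anonymize(bytes(buf[16:20]))  # destination
                # Recompute the header checksum over the modified header.
                buf[10:12] = b"\x00\x00"
                buf[10:12] = struct.pack("!H", ipv4_checksum(bytes(buf[:ihl])))
        # Hand the (possibly modified) packet to the original callback.
        real_callback(bytes(buf))

    return callback
```

The wrapper keeps the printing and dumping code unaware of anonymization:
it receives a packet exactly as before, only with rewritten addresses.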

An example of how to use this flag is "./tcpdump --anon keyfile.txt -n",
where keyfile.txt is a file containing the key produced by cryptopANT
using "scramble_ips --newkey keyfile.txt".

Currently, we have implemented support for anonymizing IPv4 addresses. Our
roadmap includes extending support to accommodate additional anonymization
methods, and enabling users to specify anonymization parameters dynamically.

I am sharing my GitHub project (https://github.com/aperezb21/tcpdump),
which is forked from commit bb704ed32d770e84fdc340de8276c261bb6e9ee1,
containing the current prototype. We welcome any discussion or feedback,
either on or off-list.

Thank you,

Alberto.

--- End Message ---
_______________________________________________
tcpdump-workers mailing list -- tcpdump-workers@lists.tcpdump.org
To unsubscribe send an email to tcpdump-workers-le...@lists.tcpdump.org