[secu-share] Social networking over Tor and GNUnet

carlo von lynX Thu, 15 Jan 2015 07:13:18 -0800

First of all, let's eliminate the obvious
assumption. No, it is not safe to simply use Tor,
then log into a social network. Even if you are
tidy about maintaining pseudonyms, you will be
adding much the same people as on other social tools.

The 2009 paper "De-anonymizing Social Networks"
by Arvind Narayanan and Vitaly Shmatikov has
shown how the similarity of social graphs is
enough to de-anonymize their members.
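
As a toy illustration of why graph similarity alone can be enough, here
is a minimal sketch of seed-based matching. It is not the algorithm from
the paper, only a simplified greedy variant; the function name, the
example graphs and the `min_score` threshold are all assumptions made up
for this example.

```python
from collections import defaultdict

def propagate_matches(graph_a, graph_b, seeds, min_score=2):
    """Greedily map nodes of graph_a onto nodes of graph_b.

    graph_a, graph_b: dict mapping a node to the set of its neighbours.
    seeds: dict of already known graph_a -> graph_b identifications.
    """
    mapping = dict(seeds)
    changed = True
    while changed:
        changed = False
        for a_node, a_neighbours in graph_a.items():
            if a_node in mapping:
                continue
            # Score each unclaimed graph_b node by how many already
            # mapped neighbours of a_node are adjacent to it.
            scores = defaultdict(int)
            for neighbour in a_neighbours:
                if neighbour in mapping:
                    for candidate in graph_b[mapping[neighbour]]:
                        if candidate not in mapping.values():
                            scores[candidate] += 1
            if not scores:
                continue
            ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
            best, best_score = ranked[0]
            runner_up = ranked[1][1] if len(ranked) > 1 else 0
            # Accept only a clear, sufficiently supported winner.
            if best_score >= min_score and best_score > runner_up:
                mapping[a_node] = best
                changed = True
    return mapping

# Hypothetical data: a pseudonymous XMPP roster graph and a public graph
# with the same friendship structure.
xmpp = {
    "p1": {"p2", "p3"},
    "p2": {"p1", "p3", "p4"},
    "p3": {"p1", "p2", "p4"},
    "p4": {"p2", "p3", "p5"},
    "p5": {"p4"},
}
public = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol", "dave"},
    "carol": {"alice", "bob", "dave"},
    "dave": {"bob", "carol", "erin"},
    "erin": {"dave"},
}
# The attacker already knows two identities and recovers the rest.
recovered = propagate_matches(xmpp, public, {"p2": "bob", "p5": "erin"})
```

In this made-up example two known identities suffice to label the three
remaining pseudonyms, purely from who is connected to whom.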

This is also why the current privacy practice
of accessing an XMPP account via Tor (with or
without OTR) is not sufficient for maintaining
anonymity. The friendship subscriptions of XMPP
form a social graph, de-anonymizable when
compared with Twitter and Facebook. You may
be safe as long as you use a server whose
database is kept in a very safe place, but
you're fried as soon as you seriously make use
of federation - that is, when your friends are
on remote XMPP servers - because that introduces
many additional ways of de-anonymizing you.

So, closing this little parenthesis on the
hopelessness of the traditional server and
federation model, let's talk about using
distributed technologies such as the Tor
hidden services and secushare's multicast
for GNUnet: these approaches are about making
social networking happen inside Tor/GNUnet
rather than on some hidden server. In particular
we'll discuss deploying GNUnet as an extension of
Tor running on the same relay nodes, mostly for
the purpose of resolving one-to-many scalability,
but also hinting at offering an alternative to
Tor's current hidden services mechanism.

By way of introduction: an anonymous proponent of
a project called "sharebook" started this
interesting thread on the tor-talk mailing list,
inviting people to review their design for a
modification of the Tor hidden service mechanism
that would allow for the delivery of lightweight
notifications to hundreds of recipients:

https://lists.torproject.org/pipermail/tor-talk/2014-December/036251.html

It has been an interesting debate, mostly between
Sharebook and me, somewhat hampered at first by
our difficulty in understanding each other's
architectures, so you may want to skip that
part.

Sharebook proposes a social network whereby
each event would be fanned out by the Tor network
to all subscribers, then the subscribers would
fetch the actual encrypted data from a cloud
system running at a non-hidden service. A non-
hidden service := a service that uses no
intermediate hops to anonymize itself, but
responds immediately after the .onion lookup
(to use simplified terms). It's also called the
TOR2WEB mode since the tor2web.org folks added
this feature to Tor, now being prominently used
by the Facebook onion service.

The debate becomes intense and interesting as
I propose to use anonymous multicast for the
delivery not just of notifications, but of the
entire content. Essentially, this means plugging
the secushare/GNUnet multicast infrastructure
between the third hop of each user's outgoing
onion circuit and the rendez-vous points of
the subscribers of her channels, which happen
to be the first hop of their respective incoming
onion circuits.

https://lists.torproject.org/pipermail/tor-talk/2015-January/036467.html
is my latest contribution in this thread, providing
also a list of scientific papers in this area of
research. I'll cite some hopefully accessible parts of 
that mail right here:

---------------------------------------------------

[...]

On Tue, Jan 13, 2015 at 09:48:21PM +0000, contact at sharebook.com wrote:
> I think it's better spell the question of choice of trade-off like this:
> do we want forward secrecy for sending each Notification to each friend
> when we only use Mceliece cryptosystem for asymmetrical encryption? or
> do we want forget about group PQ forward secrecy by encrypting the
> Notification using a common secret (or using Attribute-Based Encryption)
> that is same for all friends to be able multicast the cipher-text value

Don't forget that there is link level encryption between each multicast
node, so an attacker would have to take over the network of relay nodes
to gather significant knowledge.

Also we should devise a multicast ratcheting method by which each
branch of the tree re-encrypts the content with a different ratchet,
thus making it difficult for somebody who p0wns a certain number of
relay nodes to recognize which subtrees belong to the same root.
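
No such ratcheting method has been specified yet, so the following is
only a rough sketch of the general idea, under assumptions of my own:
one hash ratchet per tree edge, a relay that strips the incoming edge
layer and applies a fresh one per child, and a toy XOR stream cipher
that must not be used for real traffic. All names (`BranchRatchet`,
`relay_forward`, the edge secrets) are hypothetical. Note that in this
simplified form only observers on the wire see unrelated ciphertexts; a
compromised relay still sees the same inner bytes, so a real design
would need more than this.

```python
import hashlib
import hmac

def kdf(key: bytes, label: bytes) -> bytes:
    """Derive a new 32-byte key from key and a label using HMAC-SHA256."""
    return hmac.new(key, label, hashlib.sha256).digest()

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). Illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)

class BranchRatchet:
    """Hash ratchet for one tree edge; both ends hold a synchronized copy."""

    def __init__(self, edge_secret: bytes):
        self.chain_key = edge_secret

    def next_message_key(self) -> bytes:
        message_key = kdf(self.chain_key, b"message")
        self.chain_key = kdf(self.chain_key, b"chain")  # step forward
        return message_key

def relay_forward(incoming: BranchRatchet, outgoing: dict, wire: bytes) -> dict:
    """Strip the incoming edge layer, then re-encrypt once per child edge."""
    inner = keystream_xor(incoming.next_message_key(), wire)
    return {child: keystream_xor(ratchet.next_message_key(), inner)
            for child, ratchet in outgoing.items()}

# Hypothetical tree: root -> relay -> {a, b}. Edge secrets are assumed to
# have been agreed on beforehand; here they are fixed demo values.
s_root_relay = hashlib.sha256(b"edge root-relay").digest()
s_relay_a = hashlib.sha256(b"edge relay-a").digest()
s_relay_b = hashlib.sha256(b"edge relay-b").digest()

# The payload is assumed to be end-to-end encrypted already.
payload = b"opaque end-to-end encrypted channel update"

root_out = BranchRatchet(s_root_relay)
relay_in = BranchRatchet(s_root_relay)
relay_out = {"a": BranchRatchet(s_relay_a), "b": BranchRatchet(s_relay_b)}
leaf_a = BranchRatchet(s_relay_a)

wire_root = keystream_xor(root_out.next_message_key(), payload)
wires = relay_forward(relay_in, relay_out, wire_root)
received_a = keystream_xor(leaf_a.next_message_key(), wires["a"])

# Sending the same payload again yields a different ciphertext on the
# same edge, because the ratchet has advanced.
wire_root_again = keystream_xor(root_out.next_message_key(), payload)
```

The point of the sketch is only that the bytes seen on the root edge and
on each child edge differ from one another and from message to message,
so matching ciphertexts is not enough to link subtrees to a common root.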

[...]

> But the real problem is that multicasting is not metadata friendly. 

That is a bold claim.

> it's not feasible to protect metadata secrecy on multicasting because
> you fundamentally can't send a random packet to each recipient and when
> you multicast same value then you enter one-to-many pseudonyms paradigm
> which means some social graphs between pseudonymous vertices become
> visible to observers in that zone (search social network
> de-anonymization papers for more info). 

2009, "De-anonymizing Social Networks" by Arvind Narayanan and Vitaly
Shmatikov is about correlating Twitter and Flickr users.
Is this really what you mean? Sounds pretty off-topic to me.

Other papers on the topic are these:

- 2000, "Xor-trees for efficient anonymous multicast and reception"
- 2002, "Hordes — A Multicast Based Protocol for Anonymity"
- 2004, "AP3: Cooperative, decentralized anonymous communication"
- 2006, "M2: Multicasting Mixes for Efficient and Anonymous Communication"
- 2006, "Packet coding for strong anonymity in ad hoc networks"
- 2007, "Secure asynchronous change notifications for a distributed file system"
- 2011, "Scalability & Paranoia in a Decentralized Social Network."
- 2013, "Design of a Social Messaging System Using Stateful Multicast."

The last two are our own. I'm afraid I can't find a paper that supports
your bold assertion there. You will have to help me.

Other papers on the topic of distributed social multicast, but without 
anonymity:

- 2003, "Bullet: High Bandwidth Data Dissemination Using an Overlay Mesh"
- 2003, "SplitStream: high-bandwidth multicast in cooperative environments"
- 2005, "The Feasibility of DHT-based Streaming Multicast"
- 2006, "Minimizing churn in distributed systems"
- 2007, "SpoVNet: An Architecture for Supporting Future Internet Applications"
- 2008, "TRIBLER: a Social-based Peer-to-Peer System." 

Papers about anonymous social networking, without the scalability bit:

- 2010, "The Gossple Anonymous Social Network."
- 2010, "Pisces: Anonymous Communication Using Social Networks"
    by Prateek Mittal, Matthew Wright, and Nikita Borisov isn't about
    multicast, but it describes how social networks can in theory
    improve onion routing.
- 2011, "X-Vine: Secure and Pseudonymous Routing Using Social Networks."

Papers suggesting the use of social graph data to protect against
sybil attacks:

- 2006, "SybilGuard: defending against sybil attacks via social networks"
- 2013, "Persea: A Sybil-resistant Social DHT"

I think there are more on this specific topic.. ah yes, X-Vine also
proposes social protection against sybil attacks.

Since I'm not a paid researcher I have not read all of these papers,
but it does so far look like there is a majority in favor of our
architecture rather than yours.

> in pubsub, constant connections even between pseudonyms might reveal
> some parts of social graph, what I proposed as Hybrid hidden service is

I believe seeing little pieces of branches will not get you far.
The disadvantages of requiring a storage cloud weigh more heavily.

> discontinuous and packets traveling from SC's third hope to RPs don't
> look relevant to each other, there is no way to draw a social graph
> between one sender and several RPs because when an OR sends 167 packet
> to 167 RP, an observer in between can't separate these packets are from
> same person who sent them to all those RPs, or 167 different person at
> that OR sent those packets to RPs in a linear paradigm as each packet
> looks random without any connection information. everything changes when

I challenge that, at least in the current Tor network. Suppose the
attacker applies traffic shaping to the outgoing notification. Only if
the notification has a fixed size can the third hop avoid replicating
the shaped traffic; otherwise an observer gets to see which rendez-vous
points are being addressed - possibly de-anonymizing many of the
hidden services behind them. There is probably a chance of
de-anonymization even with fixed-size notifications, since the third hop
will suddenly be busy sending similarly shaped packets to all 167 RPs.
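
To make "fixed size" concrete, here is a minimal padding sketch. The
512-byte `CELL_SIZE`, the two-byte length prefix and the function names
are assumptions for illustration, not taken from Tor or from Sharebook's
design. Padding would have to happen before encryption, and it only
hides size patterns; the burst of 167 simultaneous packets remains
visible.

```python
import struct

CELL_SIZE = 512  # hypothetical fixed notification size in bytes

def pad_notification(body: bytes, size: int = CELL_SIZE) -> bytes:
    """Prefix the body with its length and zero-pad it to exactly size bytes."""
    if len(body) > size - 2:
        raise ValueError("notification too large for one fixed-size cell")
    return struct.pack(">H", len(body)) + body + bytes(size - 2 - len(body))

def unpad_notification(cell: bytes) -> bytes:
    """Recover the original body from a padded cell."""
    (length,) = struct.unpack(">H", cell[:2])
    return cell[2:2 + length]

short_cell = pad_notification(b"new post")
long_cell = pad_notification(b"new photo album with a much longer description")
```

Both cells have the same length once padded, so after encryption an
observer can no longer tell notifications apart by size.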

> there is a constant identical connection between SC's third hope and 167
> RP that makes entire relations between pseudonyms visible to an
> observers between them without hacking ORs. 

I challenge that as well. Given a high latency packet-oriented multicast
system being fed from the third hop, distributing the content to a network
of reception points, the maximum de-anonymization that can be achieved
is by p0wning some nodes, seeing some fragments of somebody's trees,
still not being able to tell where the stuff came from and where it
will end up.

[...]

> We can use twitter's distribution strategy on PseudonymousServer, you
> can consider blocks as tweets, how twitter sends a plaintext tweet to
> 167 different person from different IP addresses who ask it? I guess we
> can use same method to deliver blocks to 167 different person who
> request it. 

Twitter uses a multicast-like replication system, like all cloud
systems. The question is whether it makes sense to access that via
a TOR2WEB gateway, or whether it is better to have it built into the
anonymization network. Cloud systems are easier to set up because they
are a well understood thing, but the disadvantages are significant.

> And in our app we limit numbers of friends to ~250 friends, if someone
> shares something to millions then probably it's not private. 

Yes, but the fact that I am interested in ioerror's tweets says
something about me. That's why I believe anonymization should
happen at any scale. That's why I would rather opt for a system
that can scale with the number of people adopting it, rather
than having to say: Sorry, twitter.com or livestream.com use
cases are unwelcome - you have to give up anonymity for those.

> >That can be achieved by creating suitable motivation. If the social
> >distance can be computed even for anonymous data, people can sponsor relays
> >that offer services to first or second degree friends without knowing 
> >what exactly and who exactly they are working for. The space for ideas
> >in this field is still vast methinks.
> 
> There might be a lots of volunteers who are willing to donate their
> storage for incentives but they are finite not infinite, someday we

Yes, they grow at the same speed as the number of people wanting
to use them - so the principles of scalability are respected.

[...] Should we employ
GNUnet as a distribution infrastructure plugged between the third
outgoing hop and the rendez-vous points, it probably makes sense to
also use GNUnet's sybil attack resistant DHT instead of Tor's, possibly
introducing better look-up privacy. But that is something Christian
and Georges should be working out.

-- [email protected]
   https://lists.secushare.org/mailman/listinfo.cgi/secu-share
