[freenet-dev] Question about an important design decision of the WoT plugin

Evan Daniel Tue, 26 May 2009 19:04:01 -0400

On Tue, May 26, 2009 at 5:38 PM, xor <xor at gmx.li> wrote:
> On Tuesday 26 May 2009 23:19:53 Evan Daniel wrote:
>> 2009/5/26 xor <xor at gmx.li>:
>> > On Tuesday 26 May 2009 22:02:37 xor wrote:
>> >> On Thursday 07 May 2009 11:23:51 Evan Daniel wrote:
>> >> > > Why exactly? Your post is nice but I do not see how it answers my
>> >> > > question. The general problem my post is about: New identities are
>> >> > > obtained by taking them from trust lists of known identities. An
>> >> > > attacker therefore could put 1000000 identities in his trust list to
>> >> > > fill up your database and slow down WoT. Therefore, a decision has
>> >> > > to be made when to NOT import new identities from someone's trust
>> >> > > list. In the current implementation, it is when he has a negative
>> >> > > score.
>> >>
>> >> [...]
>> >>
>> >> > I have not examined the WoT code.  However, the Advogato metric has
>> >> > two attributes that I don't think the current WoT method has: no
>> >> > negative trust behavior (if there is a trust rating Bob can assign to
>> >> > Carol such that Alice will trust Carol less than if Bob had not
>> >> > assigned a rating, that's a negative trust behavior), and a
>> >> > mathematical proof as to the upper limit on the quantity of spammer
>> >> > nodes that get trusted.
>> >> >
>> >> > The Advogato metric is *specifically* designed to handle the case of
>> >> > the attacker creating millions of accounts.  In that case, his success
>> >> > is bounded (linear with modest constant) by the number of confused
>> >> > nodes -- that is, legitimate nodes that have (incorrectly) marked his
>> >> > accounts as legitimate.  If you look at the flow computation, it
>> >> > follows that for nodes for which the computed trust value is zero, you
>> >> > don't have to bother downloading their trust lists, so the number of
>> >> > such lists you download is similarly well controlled.
>> >>
>> >> I have read your messages again and all your new messages and you are so
>> >> convinced about advogato that I'd like to ask you more questions about
>> >> how it would work, I don't want you to feel like everyone is ignoring
>> >> you :) (- I am more of a programmer right now than a designer of
>> >> algorithms, I concentrate on spending most available time on
>> >> *implementing* WoT/FT because nobody else is doing it and it needs to
>> >> get done... so I have not talked much in this discussion)
>> >>
>> >> Consider the following case, using advogato and not the current FMS/WoT
>> >> alchemy:
>> >>
>> >> 1. Identity X is an occasional and trustworthy poster. X has received
>> >> many positive trust values from hundreds of identities because it has
>> >> posted hundreds of messages over the months, so it has a high score and
>> >> capacity to give trust values, and all newbies will know about the
>> >> identity and its high score because it is well-integrated into the
>> >> trust graph.
>> >>
>> >> 2. Now a spammer gets a single identity Y onto the trust list of X by
>> >> solving a captcha, his score is very low because he has only solved a
>> >> captcha but the score is there. Therefore, any newbie will see Y because
>> >> X is well-integrated into the WoT
>> >>
>> >> 3. X is gone for quite some time due to inactivity, during that time Y
>> >> creates 500 spam identities on his trust list and starts to spam all
>> >> boards. X will not remove Y from his trust list because he is *away* for
>> >> weeks.
>> >
>> > Also consider the case that instead of 500 new identities he just posts
>> > 5000000 messages with his single identity Y. How do we get rid of Y?
>>
>> First, you rate limit messages.  I'm having trouble coming up with a
>> case where I ever want my node downloading that many messages from one
>> identity.
>
> And how to find a practical rate limit?
> Consider SVN/GIT/etc. log-bots: They post a single message for each commit to
> the repository.


FMS, WoT, and Advogato all mark identities, not messages.  Why is this
scenario relevant to the question at hand -- that is, which algorithm
to run on the trust graph?  If you want to discuss intelligent rate
limiting, and how to make that usable and useful to users, that is
basically a UI problem.  I have ideas and suggestions, and would be
happy to discuss them.  However, that would be completely unrelated to
the current subject, so I suggest starting a new thread.

>
>>
>> Second, after I read a few, I'll mark some as spam and the rest will
>> go away.  From a practical standpoint, I don't really care about the
>> difference between 5 messages, 500, or 500000 -- I'll read one, or a
>> few, and then mark Y as a spammer.  I'll never see the rest.
>
> Can a messaging system survive which will appear as "full of spam" to every
> newbie?
>
> Isn't it the core goal of the WoT to prevent *newbies* from seeing spam, to
> let the community design a set of ratings which prevents EVERYONE from having
> to manually mark spam/non spam, letting only a subset of the community doing
> the work and others can benefit from it?
>
> I think that's what any algorithm needs to be able to do: Provide a nice first
> usage experience.
>
> First usage = empty trust list. So this also applies to people who are too lazy
> to mark everything as spam which is spam. Which probably applies to > 50% of
> the users. So advogato would annoy >50% if I have not misunderstood it?

Your question about the survival of the messaging system leads me to
think that we are not discussing the same (hypothetical) messaging
system.  Why would it be full of spam in the first place?  The number
of spammers is linearly bounded by the number of confused nodes in the
web.  The proof of this is not complicated.  If you have found a flaw
in the proof, please explain in more detail where it is.  This precise
feature -- a provable bound on the number of spammers -- is the major
thing that distinguishes Advogato from FMS or WoT.  Your complaint
seems to be that Advogato might allow too many spammers, yet your
proposed solution appears to be to instead use a system that does not
have any such bound.  I do not understand why this is.
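
A rough sketch of why the bound holds, written as a simplified
Advogato-style flow computation. This is not the Advogato or WoT code:
the function name `advogato_accept`, the per-level capacities
`[8, 3, 2, 1]`, and the example identities are all assumptions chosen
for illustration. Each identity gets a capacity from its distance to
the seed, keeps one unit for itself, and can pass on at most capacity
minus one. Everything that reaches the spammer's accounts has to pass
through the confused node, so the number of accepted spam accounts
cannot exceed that node's capacity minus one, however many accounts
the spammer lists.

```python
from collections import defaultdict, deque


def advogato_accept(certs, seed, level_caps):
    """Return the identities accepted by a simplified Advogato-style flow.

    certs maps an identity to the identities it certifies (hypothetical
    input format); level_caps[d] is the capacity at distance d from the seed.
    """
    # Breadth-first distance from the seed decides each node's capacity.
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        for v in certs.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    # Residual graph. Each identity is split into an "in" and an "out" half:
    # one unit goes from "in" to the sink (the identity itself is accepted),
    # and capacity - 1 units may continue on to the identities it certifies.
    sink = ("sink",)
    res = defaultdict(lambda: defaultdict(int))
    for node, d in dist.items():
        cap = level_caps[d] if d < len(level_caps) else 0
        if cap <= 0:
            continue
        res[(node, "in")][sink] += 1
        res[(node, "in")][(node, "out")] += cap - 1
        for v in certs.get(node, ()):
            res[(node, "out")][(v, "in")] = float("inf")

    # Maximum flow by repeatedly augmenting along a shortest path (Edmonds-Karp).
    source = (seed, "in")
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(res[a][b] for a, b in path)
        for a, b in path:
            res[a][b] -= flow
            res[b][a] += flow

    # An identity is accepted if its unit edge to the sink carries flow.
    return {n for n in dist if res[sink][(n, "in")] > 0}


# Example: bob is the one confused node; the spammer lists 1000 more accounts.
certs = {
    "seed": ["alice", "bob"],
    "bob": ["spam0"],
    "spam0": [f"spam{i}" for i in range(1, 1001)],
}
accepted = advogato_accept(certs, "seed", [8, 3, 2, 1])
spam_accepted = sorted(n for n in accepted if n.startswith("spam"))
```

In this example bob has capacity 3, so at most 2 of the 1001 spam
identities are accepted. The rest receive no flow, which is also why
their trust lists never need to be downloaded.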

I do not think I could disagree more strongly about the core goal of
such a plugin.  It should not be to let the community decide what I
can or cannot see.  The goal should be to have a unified way for the
community to provide me with information that helps *me* decide (or
rather my proxy, in the form of software running on *my* computer with
*my* configuration) what I can or cannot see.  As a secondary goal,
the software should provide a reasonable set of defaults to get
newbies started, and make it clear that those defaults are just that
and nothing more.

An empty trust list means you see no messages.  Therefore I recommend
not doing that by default.  Again: how is this different from WoT
or FMS?  How else would you find out what identities exist and are
posting messages?  (If by default the trust list is empty, and I am
still seeing messages, I think that means that the default trust list
is simply trusting everyone.  I don't think "empty" is an accurate
description of such a trust list, regardless of whether or not those
entries are displayed.)

Evan Daniel
