Hi Sid,

Please find the answers inline.

thanks

Moheeb


On Tue, Jun 19, 2012 at 3:03 PM, Sid Stamm <s...@mozilla.com> wrote:
> Thanks for the info, Moheeb!
>
> On 06/15/2012 11:35 AM, moh...@google.com wrote:
>> Regarding the TLS bouncing idea.
>> As the reputation system derives features in part from the submitted
>> pings, it's important for us to be able to detect abusive reputation
>> requests.  The source IP is a very meaningful feature for detecting
>> spammy requests.  Furthermore, if we get requests from a sufficient
>> number of users for the same url, we may also attempt to fetch it to
>> feed the binary into our analysis system. I would like to emphasize that
>> this data is only kept for two weeks and is subject to strict access 
>> controls.
>>
>> A trusted proxy run by Mozilla might be an option if it did its own
>> meaningful spam filtering and additionally provided us at least with
>> the /24 of the source IP address from the original requester.
>
> So it sounds like a proxy is possible if we send along the first three
> octets.  If we deploy this feature opt-in, I'm not convinced we need to
> proxy.  If we deploy the feature on by default, we might want to
> consider this.
>
> There are a few more open questions, and I know this sounds a bit like
> an inquisition, but given the lack of a public API or any other feature
> documentation, I just want to get all the facts on the table:
>
> 1.  To be explicit, some folks here are curious why we can't just submit
> a URL hash prefix like for the rest of the safe browsing stuff?  I'm
> assuming it is because of what you say above (if a lot of pings have the
> same URL, you grab it and do analysis), but can you explain why the
> whole URL/hash/size are needed?

Yes, analyzing the binary is one of the reasons why we send the full URL in
the ping. The other reason is that this information is important for our
reputation model. Our reputation scheme keeps track of sites and IP
addresses hosting malware binaries and uses this information to predict
whether a binary hosted on a particular site is malicious, even if we have
not seen the content before.
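
A minimal sketch of the idea, not the actual server-side model: it assumes
the server keeps simple per-host and per-IP counts of past verdicts and
flags an unseen binary when its host or IP has a bad history. The class
name, the method names, and the 0.5 threshold are all hypothetical.

```python
from collections import defaultdict


class HostReputation:
    """Toy reputation store keyed by hosting site and hosting IP (hypothetical)."""

    def __init__(self):
        self.bad = defaultdict(int)    # malicious binaries seen per key
        self.total = defaultdict(int)  # all binaries seen per key

    def record(self, host, ip, malicious):
        """Record one analyzed binary served from `host` at address `ip`."""
        for key in (("host", host), ("ip", ip)):
            self.total[key] += 1
            if malicious:
                self.bad[key] += 1

    def bad_fraction(self, kind, value):
        """Fraction of past binaries from this host or IP that were malicious."""
        key = (kind, value)
        if self.total[key] == 0:
            return 0.0  # no history: this toy model treats it as neutral
        return self.bad[key] / self.total[key]

    def suspicious(self, host, ip, threshold=0.5):
        """Judge a binary by where it is hosted, without looking at its content."""
        score = max(self.bad_fraction("host", host),
                    self.bad_fraction("ip", ip))
        return score >= threshold
```

In this sketch a hash prefix alone would not be enough: the lookup is keyed
on the hosting site and IP, which is why the ping carries the full URL.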

>
> 2.  What types of downloads are subject to this filter?  Just .exe?
> (And how do you determine the filetype, extension)?
>

Currently, we support Windows executable content (e.g. .exe, .msi,
.scr). The determination of the content type is up to the client
implementation. Since the safety check with the reputation server
happens after the content is downloaded, I think content-based
identification might be feasible; working with the file extension is
also a feasible option.
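
The two options could be combined on the client roughly as follows. This is
a sketch under assumptions, not a description of any existing client: the
function names are hypothetical, and the extension set contains only the
three examples given above. The magic numbers are the standard "MZ" header
of PE files (.exe, .scr) and the OLE compound-file header used by .msi.

```python
import os

# Only the extensions named in this thread; a real list would be longer.
EXECUTABLE_EXTENSIONS = {".exe", ".msi", ".scr"}

PE_MAGIC = b"MZ"  # DOS/PE header at the start of .exe and .scr files
# OLE compound-file header; .msi uses it, but so do other formats
# (e.g. legacy Office documents), so this check can over-match.
OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def looks_executable_by_name(filename):
    """Extension-based check: cheap, but trusts the file name."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in EXECUTABLE_EXTENSIONS


def looks_executable_by_content(header):
    """Content-based check on the first bytes of the downloaded file."""
    return header.startswith(PE_MAGIC) or header.startswith(OLE_MAGIC)


def should_check_reputation(filename, header):
    """Ask the reputation server if either check matches."""
    return (looks_executable_by_name(filename)
            or looks_executable_by_content(header))
```

Because the check runs after the download completes, the first bytes of
the file are already available, so the content-based test catches
executables served under a misleading name.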

> 3.  How aggressively are you purging logs of the pings?  Some users are
> concerned about your service being compelled by some legal proceedings
> to turn over data about their IP address's download habits.
>

We keep the pings only for two weeks.

> 4.  Is there any chance we could offer users who want it a "no-ping"
> option?  (This would be a hypothetical client-side whitelist updates but
> no pings with URL/hash/size.)
>

We have a list of known malware domains that we export via the
SafeBrowsing API, but this is completely different from the reputation
solution, which is done on the server side. Based on our experience,
relying on this list will offer only modest protection to users.
_______________________________________________
dev-security mailing list
dev-security@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security
