On Thu, Apr 8, 2010 at 11:33 AM, Evan Daniel <evanbd at gmail.com> wrote:

> On Thu, Apr 8, 2010 at 12:29 AM, Spencer Jackson
> <spencerandrewjackson at gmail.com> wrote:
> > Hi
> >
> > I've been hanging out on IRC as of late under the nick 'sajack'. I'm
> going
> > to be submitting an application to work with Freenet during Google Summer
> of
> > Code. Anyway, here's the proposal I'm going to be uploading, if anyone
> has
> > any thoughts. Thanks for looking at it.
> >
> >
> >
> > Proposal: Improve Implementation and Functionality of Content Filtration
> and
> > Add Support for Additional Formats
> > Proposer: Spencer Jackson (sajack)
> >
> > Introduction
> > Content is an important part of the Freenet experience. Good, plentiful
> > content attracts users, which attracts donations and creates more nodes,
> > both of which, directly or indirectly, improve performance and security
> of
> > the network. As such, to make Freenet better, we must make the process of
> > getting information from the network to the user quick, easy, and safe. I
> am
> > proposing a series of changes to the ContentFilter and adjacent systems
> so
> > as to realize this. Below are the general steps I will take.
> >
> >
> > Modify content filters to act as streams
> >
> > Presently, Freenet's data filters are passed Bucket objects containing
> all
> > of the data they need to process. This is suboptimal. Ideally, the data
> > filters should have a stream interface. This will reduce duplication of
> data
> > and increase performance by removing the need for vast amounts of disk
> I/O,
> > as less will be needed to be cached on the disk. This will be very easy
> to
> > implement, as most of the filters deal with streams internally.
> >
> > Right now, filters are a part of FProxy, and are invoked by it when a
> file
> > is downloaded. Really though, most clients probably desire filtered data,
> so
> > filtration should be done earlier, with FProxy simply using general
> > functionality. I will therefore move filters into the client layer, and
> > invoke them there.
> >
> > Now while here, it would be useful to add some new functionality. First
> off,
> > I'll add the ability to filter files being saved to the hard drive. Right
> > now, this doesn't happen, and it's something of a weak spot in our armor.
> > Later on especially, when there are Ogg filters, users may be downloading
> > large files directly to their hard drive. We will want them to be
> filtered.
> >
> > Another thing while I'm working with filters in the client layer: I will
> > implement filtration of inserts. This will help prevent metadata in a
> file
> > uploaded by the user from breaking his or her anonymity. For example,
> EXIF
> > data in jpegs may reveal the serial number of the camera which took the
> > picture, or even the GPS coordinates from where the picture was taken.
> >
> > Of course, there are some usage cases, such as during debugging, where it
> > may be undesirable for a request to be filtered. It must therefore be
> > possible to disable filtering. To accomplish this, I will prevent the
> data
> > from being filtered when a configuration setting in the request's context
> > has been set. Support for disabling filters will need to be added to FCP.
> > All of this will then need to be supported in the web interface. I will
> add
> > support for complementary GET and POST variables in FProxy which would be
> > used to trigger this setting. Next, I'll add UI elements to the download
> and
> > insert queue pages and any other pertinent locations, such as the
> > 'Downloading a page' page, which would enable these variables. These
> > elements should only be visible when the user is in 'Advanced mode,' and,
> > even then, should be tagged with a Big Fat Warning about the risks of
> > turning off filtering.
> >
> > Another feature I will implement is the ability to run data through a
> filter
> > without placing it on the network. This would be useful for debugging
> > content filters, and for freesite writers, who want to see what their
> site
> > will look like after its been parsed. This should be pretty easy to
> > implement. I'll create an FCP message which will take data, filter it,
> and
> > return it. I will also create a way to do this through FProxy, by
> uploading
> > a file, and receiving the filtered version.
> >
> > The next thing I will implement are stream friendly Compressors.
> > Essentially, we should be able to have a filter and a decompressor
> running
> > on separate threads, and have data be passable between them transparently
> > using piped streams.
> >
> >
> > Implementation of Ogg container formatRight now, Freenet has filters for
> > HTML and some forms of image files. More filters means more types of
> content
> > which may be safely viewed by the user. This will allow the network to be
> > used in ways which are currently not safe. After I have implemented the
> new
> > stream based content filters, I shall implement more of them.
> >
> > The first type of filter which I will implement is for the Ogg container
> > format. This is technically interesting, as it encapsulates other types
> of
> > data. A generic Ogg parser will be written, which will need to validate
> the
> > Ogg container, identify the bitstreams it contains, identify the codec
> used
> > inside these bitstreams, and process the streams using a second(or nth,
> > really, depending on how many bitstreams are in the container) codec
> > specific filter. It should be possible to use this filter to either
> filter
> > the just beginning of the file, or the whole thing. This will make it
> > possible to preview a partially downloaded file, at some point in the
> > future. Some things which will need to be taken into consideration are
> the
> > possibility of Ogg pages being concealed inside of other Ogg pages. This
> > will be checked for, and a fatal error will be raised if it occurs.
> >
> > The Ogg codecs which I will initially add support for are, in order,
> Vorbis,
> > Theora, and FLAC.
> >
> >
> > More content filters
> >
> > The more filters the better. In the time remaining, I will implement as
> many
> > different possible content filters. While this step is very important,
> these
> > codecs individually are of a lower priority than previous steps. I will
> > implement ATOM/RSS, mp3, and the rudiments of pdf.
> >
> >
> >
> > Milestones
> > Here are clear milestones which may be used to evaluate my performance.
> The
> > following are a list of these goals which should be met to signify
> > completion, along with very rough estimates as to how long each step
> should
> > take:
> >
> > *Stream based filters (3 days)
> > *Filters are moved to the client layer, with support for (disableable)
> > support filtering files going to the hard drive, and inserts (9 days)
> > *Filters can be tested on data, without inserting it into the network (3
> > days)
> > *Compressors can be interacted with through streams (4 days)
> > *An Ogg content filter is implemented, supporting the following codecs:
> (3
> > days)
> >  -The Vorbis codec (2 days)
> >  -The Theora codec (2 days)
> >  -The FLAC codec (2 days)
> > *Content filters for ATOM/RSS are implemented (5 days)
> > *A content filter for MP3 is implemented (6 days)
> > *A basic content filter for pdf is implemented (Remaining time)
> >
> >
> >
> > Biography
> > I initially became interested in Freenet because I am something of a
> > cypherpunk, in that I believe the ability to hold pseudonymous discourse
> to
> > be a major cornerstone of free speech and the free flow of information.
> I've
> > skulked around Freenet occasionally, even helping pre-alpha test version
> > 0.7. But I'd like to do more. I want to put my time and energy where my
> > mouth is and spend my summer making the world, in some small way, safer
> for
> > freedom.
> > Starry-eyed idealism aside, I am an 18 year old American high school
> senior,
> > who will be studying Computer Science after I graduate. While C/C++ is my
> > 'first language', so to speak, I am also fluent in Java and Python. Last
> > year, I personally rewrote my high school's web page in Python and
> Django.
> > This year, I've been working on an editor for Model United Nations
> > resolutions, as time permits. This project is licensed under the GPLv3,
> and
> > is available on GitHub, at http://github.com/spencerjackson/resolute.
> It's
> > written in C++, and uses GTKmm for the GUI.
> >
> >
> > _______________________________________________
> > Devl mailing list
> > Devl at freenetproject.org
> > http://osprey.vm.bytemark.co.uk/cgi-bin/mailman/listinfo/devl
> >
>
> IMHO this looks good.
>
> My one concern is that your suggested timeline looks aggressive.  It
> looks to me more like a timeline for writing the code, as opposed to a
> timeline for writing the code, documenting it, writing unit tests, and
> debugging it.  I know that writing copious documentation and unit
> tests as we go isn't how Freenet normally does things, but it would be
> nice to improve on that standard :)  I think adding 1 day worth of
> documentation and unit tests after each of your listed steps would
> make a meaningful improvement to the resultant body of work.  Of
> course, others might disagree, and it's not a big concern.  Like I
> said, this looks good.
>
> Evan Daniel
> _______________________________________________
> Devl mailing list
> Devl at freenetproject.org
> http://osprey.vm.bytemark.co.uk/cgi-bin/mailman/listinfo/devl
>


Okay. I removed the syndication feeds, and used that time to add another day
to the other steps. So yeah, that's all submitted to GSoC. I have the second
proposal done, which is very similar, but with more multimedia stuff, and am
just about to click the submit button. So I'll copy that into another post.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<https://emu.freenetproject.org/pipermail/devl/attachments/20100408/3795e28c/attachment.html>

Reply via email to