On Tuesday 24 April 2007 20:47, Jonathan Powell wrote:
> This whole XML pipelining thing has captured my interest more than
> anything else technological in the past year. Your talk looks
> fascinating - I wish I could be there
>
> Do you use much of this technology in the BBC?
It's a natural fit for the BBC really - any production chain tends to be
a pipeline of hardware boxes. Having pipelines of software boxes is
natural & logical. It's something that makes sense to grow. Getting
information out from R&D that there's something they can just use can
be tricky...
(he says sneakily using a public mailing list for that purpose ;)
> I haven't yet seen a good
> example of it being used for anything particularly substantial - does it
> have excessive overhead?
In the spirit of Backstage being "use our stuff to build your stuff" it
makes sense for me to mention Kamaelia here again.
What do you mean by substantive? I'm not talking about XML pipelining or
mashups below but the more general aspect of pipelining (which Kamaelia
uses as its default system creation approach). Sure we can pipe XML around
and you can probably make a mashup using it (I've never bothered, despite
webserving and RSS capabilities :-), but they're relatively trivial
applications in my mind.
eg XML parsing is used by our simple presentation tool (Kamaelia: Show):
* http://svn.sourceforge.net/viewvc/kamaelia/trunk/Code/Python/Kamaelia/Tools/Show.py?revision=2301&view=markup
There was a box made ~18 months ago by Radio & Music Interactive (using
Kamaelia for prototyping since Kamaelia wasn't optimised then) which used
a pipelining approach for grabbing all BBC radio output, transcoding it
for a record of transmission and then the data was used for generating
podcasts. As I understand it this led directly to the podcast trials.
The code to do that as a pipeline is relatively trivial really. *NOT* doing
it as a pipeline can make your life a lot harder.
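To give a flavour of why the pipeline form is trivial, here is a minimal sketch using plain Python generators. It is not the code that box ran and it doesn't use Kamaelia; the stage names (capture, transcode, archive) and the fake data are made up purely for illustration.

```python
def capture(frames):
    # Stand-in for a radio capture source.
    for frame in frames:
        yield frame


def transcode(chunks):
    # Stand-in for a transcoder; here it just tags each chunk.
    for chunk in chunks:
        yield "mp3:" + chunk


def archive(chunks, store):
    # Sink: keep a record of transmission.
    for chunk in chunks:
        store.append(chunk)


record = []
# Each stage only knows about the stream it is handed, not its neighbours.
archive(transcode(capture(["r1-0001", "r1-0002"])), record)
print(record)
```

Swapping a stage (say, a different transcoder) means changing one name in the last line, which is the property that makes the pipeline version easy to live with.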
We (in R&D) then heard that Kamaelia had been used for this and decided
to optimise it (At that time it did have excessive overheads). We've
since built a similar system for TV:
* http://kamaelia.sourceforge.net/KamaeliaMacro
Which has been up and running for around 12 months now. You can see
the front end here, but unfortunately we can't give anyone a login for
the data (obvious reasons!). It is useful as a record of transmission
however:
* http://bbc.kamaelia.org/cgi-bin/blog/blog.cgi
Due to space concerns etc it purges data. (We missed that off initially
and running out of diskspace twice has caused the only 2 crashes for
that system :-) This was intended to cover all channels, but availability
of hardware has been problematic.
You CAN build one for yourself of course, since it's just timeshifting
(so long as you don't redistribute). Pretty simple too, just need a
Linux box and a DVB-T stick and however much storage you want.
If anyone's curious about this, following up here on the Kamaelia
mailing lists is welcome. The code for it is here:
* http://svn.sourceforge.net/viewvc/kamaelia/trunk/Code/Python/Kamaelia/Examples/DVB_Systems/Macro.py?revision=2257&view=markup
In terms of code, rather than a simple pipeline something like this needs
to be a graphline. At some point later today probably I'll add in the
diagrams from my talk on Kamaelia Macro's internals at EuroOSCON. The
structure is outlined OK on the page noted above.
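For anyone unsure what separates a graphline from a pipeline: a pipeline is a straight chain, whereas a graphline names its components and wires specific outboxes to specific destinations. The sketch below shows that idea with the standard library only. It is not Kamaelia's Graphline API and not how Macro is built; the component names, the linkages table and the run() helper are all hypothetical.

```python
from collections import deque


# Hypothetical components: each takes one message and returns a list of
# (outbox_name, message) pairs.
def tuner(msg):
    channel, payload = msg
    return [(channel, payload)]          # one outbox per channel


def make_recorder(store):
    def recorder(msg):
        store.append(msg)
        return []                        # sink: nothing sent onwards
    return recorder


bbc_one, cbbc = [], []
components = {
    "TUNER": tuner,
    "REC_ONE": make_recorder(bbc_one),
    "REC_CBBC": make_recorder(cbbc),
}
# Linkages: (source component, outbox) -> destination component.
linkages = {
    ("TUNER", "BBCONE"): "REC_ONE",
    ("TUNER", "CBBC"): "REC_CBBC",
}


def run(start, messages):
    pending = deque((start, m) for m in messages)
    while pending:
        name, msg = pending.popleft()
        for outbox, out in components[name](msg):
            dest = linkages.get((name, outbox))
            if dest is not None:         # unlinked outboxes are dropped
                pending.append((dest, out))


run("TUNER", [("BBCONE", "news"), ("CBBC", "cartoon"), ("BBCONE", "weather")])
print(bbc_one)
print(cbbc)
```

The point is that one source fans out to several sinks, which a strictly linear pipeline cannot express.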
A pipelining system is only as good as the components available inside
it, and for a flavour of the sorts of components our reference is here:
* http://kamaelia.sourceforge.net/Components
In terms of overhead, the entire system has been optimised now meaning
we can do realtime transcoding tasks, video playback, collaborative
whiteboarding, sketching over running video, use Open GL, distribution
via bit torrent (since that's been integrated), simple game based
interfaces. There's extensive DVB support due to the work on Kamaelia
Macro.
Pipelining can be a much simpler way of writing code. For example writing
a splitting server is as simple as writing something like this:
Backplane("DataToServe").activate()
Pipeline(
    TCPClient( someserver, someport ), # grab source stream to split
    publishTo("DataToServe"),
).activate()
def ServeTalk(): # Protocol handler created to handle each connected client
    return subscribeTo("DataToServe")
SimpleServer( ServeTalk, 1602).run()
That's a simple scalable splitting server. It's also the core of a basic
P2P streaming system with no QoS, since it is also lightweight enough to
run on a client's system. The only thing missing from it is mesh setup &
resiliency.
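For anyone without Kamaelia installed, the publish/subscribe idea behind the backplane can be sketched with nothing but the standard library. This is not Kamaelia's implementation: the class MiniBackplane and its methods are made up for illustration, and it assumes a single thread with in-memory queues rather than sockets.

```python
from collections import deque


class MiniBackplane:
    """Hypothetical stand-in for a named backplane: one source, many subscribers."""

    def __init__(self, name):
        self.name = name
        self.subscribers = []

    def subscribe(self):
        # Each subscriber gets its own inbox, much as each connected client
        # gets its own protocol handler in the server above.
        inbox = deque()
        self.subscribers.append(inbox)
        return inbox

    def publish(self, chunk):
        # Splitting: every subscriber receives every chunk.
        for inbox in self.subscribers:
            inbox.append(chunk)


backplane = MiniBackplane("DataToServe")
client_a = backplane.subscribe()
client_b = backplane.subscribe()

for chunk in [b"frame1", b"frame2"]:   # stands in for data read from the source
    backplane.publish(chunk)

print(list(client_a))
print(list(client_b))
```

The publisher never needs to know how many clients there are, which is why adding clients doesn't change the source side of the code.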
IMO, XML pipelining and mashups are simple subsets of the more general
principle that Kamaelia works on. We do have a webserver component as
well (written last year by a Google Summer of Code student as an unexpected
extra, being fleshed out this year by another), which means we can bridge
into the web world that way. As well as that there's also a webclient,
which when combined with feedparser, bridges the other way:
* http://kamaelia.sourceforge.net/Cookbook/HTTPClient shows how to bridge
with feedparser.
* http://kamaelia.sourceforge.net/Cookbook/HTTPServer shows how to use the
webserver at present.
One of the problems we've seen with this approach is that, due to the
dataflow & naturally concurrent nature of pipelines, some people can find it
harder to integrate with traditional style code. As a result this year our interaction
with Google's Summer Of Code[1] is with a number of projects which are aimed
at making it simpler for people to integrate Kamaelia facilities with
non-Kamaelia systems.
An overview of GSOC 2007 projects:
* Web Server Consolidation - This task will extend the web-server component
in Kamaelia to make it useful as a general purpose web-server component.
(Why? Being able to have a scalable targeted lightweight webserver anywhere
that is Kamaelia based opens up huge options. Eg Desktop Django/Pylons,
adding a web interface to your PVR trivially, local rather than remote
mashups, offline web applications, etc. Simpler mashing of web and non-web
apps.)
* Compose: Shard Extensions - This task extends our graphical builder tool
to allow creation of linear components graphically. Compose is currently
a line in the sand to say "it should be simple for expert users, not just
programmers to build systems". Being able to originate new components
rather than just use existing ones (where we are now), is an aim for this
project.
* Filehandle Like API - This task allows us to treat components like you
would a filehandle. File reading and writing & flushing etc normally
happens concurrently to your normal code, but you don't think of it that
way, so that's the idea here. In practice this would allow the ability to
embed Kamaelia systems in "normal" code.
* AIM/IRC Bridge - This may seem bizarre, but having textual input and
output to programs is useful for testing. AIM & IRC components extend
this IO to more interesting user level areas. For example combining AIM
clients with Kamaelia Macro would mean that you could (for example) - if
you wrote a parser - be able to say to your PVR:
recordForMe BBCONE "Neighbours" /data/neighbours.ts
recordForMe CBBC "Class TV" /data/schoolsprogrammes.ts
The latter would record every educational programme broadcast by CBBC on
weekday mornings. The former would record every episode of Neighbours.
(Parsing this is trivial, the interesting bit is the AIM & IRC components
:-)
Or you could have it watching the DVB-EIT stream to look for programmes
you may want to watch and have it (literally) tell you. (eg hook up the
output from the AIM component to a text to speech component. Creating that
on Mac OS is as simple as UnixProcess("/usr/bin/say").)
Building your own twitter style server would be relatively trivial once
these components are written. You could also however share the server
as something lightweight and have a distributed distribution network
relatively easily.
The recordForMe component can be found here BTW:
* http://svn.sourceforge.net/viewvc/kamaelia/trunk/Code/Python/Kamaelia/Examples/DVB_Systems/PersonalVideoRecorder.py?revision=2715&view=markup
How that works:
* http://kamaelia.sourceforge.net/Cookbook/DVB/PersonalVideoRecorder
* Test framework - whilst we use unit tests, _testing_ data flow or
pipelining _systems_ is a harder problem for which there's no easy
solution at present. Obviously we have a number of test harnesses,
and the aim of this project is to integrate them and make them
reusable as a starting point.
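To make the "Filehandle Like API" item above a bit more concrete, here is a rough sketch of the idea using only the standard library. It is not the GSOC project's design or any existing Kamaelia API: the ComponentHandle class and its write/read/close methods are hypothetical, and a plain function running in a thread stands in for a real component.

```python
import queue
import threading


class ComponentHandle:
    """Hypothetical wrapper: drive a background component via write()/read()/close()."""

    _STOP = object()

    def __init__(self, transform):
        self._inbox = queue.Queue()
        self._outbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, args=(transform,))
        self._thread.start()

    def _run(self, transform):
        # Runs concurrently with the caller, like buffered file IO does.
        while True:
            item = self._inbox.get()
            if item is self._STOP:
                break
            self._outbox.put(transform(item))

    def write(self, item):
        self._inbox.put(item)

    def read(self, timeout=5):
        # Blocks until the component has produced a result.
        return self._outbox.get(timeout=timeout)

    def close(self):
        self._inbox.put(self._STOP)
        self._thread.join()


handle = ComponentHandle(str.upper)   # str.upper stands in for a real component
handle.write("hello")
handle.write("world")
results = [handle.read(), handle.read()]
handle.close()
print(results)
```

The calling code stays ordinary sequential Python; the concurrency is hidden behind the handle, which is the whole attraction.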
More information via Google's SOC pages here:
* http://code.google.com/soc/bbc/about.html
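As for the recordForMe commands in the AIM/IRC Bridge item above, parsing really is the easy part. A minimal sketch, assuming the three-argument form shown there (channel, quoted programme name, destination file); the function name parse_record_command is made up and this is not the code in PersonalVideoRecorder.py.

```python
import shlex


def parse_record_command(line):
    """Split a recordForMe line into (channel, programme, destination)."""
    parts = shlex.split(line)          # honours the double quotes around the title
    if len(parts) != 4 or parts[0] != "recordForMe":
        raise ValueError('expected: recordForMe CHANNEL "PROGRAMME" FILE')
    _, channel, programme, destination = parts
    return channel, programme, destination


print(parse_record_command('recordForMe CBBC "Class TV" /data/schoolsprogrammes.ts'))
```

The resulting tuple is what you would hand on to whatever does the actual recording.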
Things already done though:
* Kamaelia Macro -
http://kamaelia.sourceforge.net/Developers/Projects/KamaeliaMacro
* Mobile Reframer -
http://kamaelia.sourceforge.net/Developers/Projects/MobileReframer
* Multicast RTP MPEG Remultiplexer -
http://kamaelia.sourceforge.net/Developers/Projects/MulticastRtpMpegRemultiplexer
* Multicast Proxy Tools -
http://kamaelia.sourceforge.net/Developers/Projects/MulticastProxyTools
* Whiteboard - http://kamaelia.sourceforge.net/Developers/Projects/Whiteboard
- This is a simple looking app, but has audio support, multiple pages,
ability for remote control and can build ad-hoc P2P style meshes for
sharing whiteboard sessions and audio. Also has recording & playback
capabilities.
* Compose - http://kamaelia.sourceforge.net/Developers/Projects/Compose
* Video Cut Detector -
http://kamaelia.sourceforge.net/Developers/Projects/VideoCutDetector
* Last year's GSOC overview:
http://kamaelia.sourceforge.net/Developers/Projects/GoogleSummerOfCode2006
If you want to know more, the places to start are:
Introductions:
* http://kamaelia.sourceforge.net/Introduction - short and simple overview.
* http://kamaelia.sourceforge.net/t/TN-LinuxFormat-Kamaelia.pdf
- An article I wrote for Linux Format, which describes how to install
and use the Kamaelia whiteboarding application (works best using a
tablet PC or some form of tablet interface)
* http://kamaelia.sourceforge.net/t/TN-LightTechnicalIntroToKamaelia.pdf
- An article I wrote for Linux Magazin (German version of Linux Magazine,
text above is English :). In terms of installing from a developer's
perspective, personally I think my instructions here are simpler/better
than above. (Though that's targeted at getting a specific app running :)
Next steps:
* http://kamaelia.sourceforge.net/MiniAxon (internals tutorial)
* http://tinyurl.com/2p9zku - overview of how to go from a stub program
to reusable components.
* http://kamaelia.sourceforge.net/Cookbook - Cookbook - contains *lots*
of examples. Some trivial, some not.
We tend to use IRC for collaboration (hence the desire above for IRC
integration) so if people do get started, popping by the IRC channel is
probably not a bad idea :)
Oh and the point of Kamaelia? To make concurrency simple, safe and natural
to work with. <rhetorical>You *do* want to be able to use your multicore
system's capabilities as they grow?</rhetorical> :-)
Anyone not using pipelining in some form in 5 years' time (unix pipelines,
IPC, mashups, XML pipeline, Kamaelia style, Erlang style) will probably
only be using trivial systems IMO, and certainly not scalable ones (esp
if they're CPU intensive)
Regards,
Michael.
--
Michael Sparks, Senior Research Engineer, BBC Research, Technology Group
[EMAIL PROTECTED], Kamaelia Project Lead, http://kamaelia.sf.net/
-
Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please
visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html.
Unofficial list archive: http://www.mail-archive.com/[email protected]/