Jakob Spörk wrote:
Hello,

I just want to give my thoughts on the unified pipeline and data conversion
topic. In my opinion, the pipeline can't do the data conversion, because it
has no information about how to do this. Let's take a simple example: we
have a pipeline processing XML documents that describe images. The first
components process this XML data while the rest of the components do
operations on the actual image. Now the question is: who will transform the
XML data to image data in the middle of the pipeline?
I believe the pipeline cannot do this, because it simply does not know how to
transform; that's a custom operation. You would need a component
that is, on the one hand, an XML consumer and, on the other hand, an image
producer. Providing some automatic data conversions directly in the pipeline
may help developers who need exactly these default cases, but I believe it
would make things harder for people requiring custom data conversions (and
those are most of the cases).

Absolutely. The discussion was about having the pipeline automate the connection of components that deal with the same data, but with different representations of it. Think XML data represented as SAX, StAX, DOM or even text, and binary data represented as byte[], InputStream, OutputStream or NIO buffers.
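To make the "same data, different representation" idea concrete, here is a minimal sketch using only standard JAXP: the identity transform acts as a bridge from a DOM representation to a text (stream) representation of the same XML data. The class name is mine; the APIs (`TransformerFactory`, `DOMSource`, `StreamResult`) are standard.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class RepresentationBridge {
    // Bridge a DOM tree to a text representation with the JAXP identity
    // transform: the same XML data, just a different representation.
    public static String domToText(Document doc) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty("omit-xml-declaration", "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader("<image width='10'/>")));
        System.out.println(domToText(doc));
    }
}
```

The same factory can just as well target a `SAXResult` or a `StAXResult`, which is exactly the kind of transcoding a pipeline could insert automatically between two components.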

Let's consider your example. We can have:
- an XML producer that outputs SAX events
- an XML transformer that pulls StAX events and writes SVG as StAX events to an XMLStreamWriter
- an SVG serializer that takes a DOM and renders it as a JPEG image on an output stream
- and finally an image transformer that adds a watermark to the image, reading from an input stream and writing to an output stream.

The pipeline must not have the responsibility of transforming data from one paradigm to another (i.e. an XML document to a JPEG image) because the way to do that highly depends on the application. But the pipeline should allow the component developers to use whatever representation of that data best fits their needs, and allow the user not to care about the actual data representation as long as the components that are added to the pipeline are "compatible" (i.e. StAX, SAX and DOM are compatible). This can be achieved by adding the necessary transcoding bridges between components. And if such a bridge does not exist, then we can throw an exception because the pipeline is obviously incorrect.
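A tiny illustration of that "bridge or fail fast" behavior; the class and method names here are purely hypothetical, not an existing pipeline API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the pipeline keeps a registry of transcoding
// bridges between compatible representations (e.g. SAX -> DOM). When two
// adjacent components disagree on representation, it inserts a bridge;
// when no bridge exists, the pipeline is incorrect and we fail fast.
public class BridgeRegistry {
    private final Map<String, String> bridges = new HashMap<>();

    // Declare that a bridge exists from one representation to another.
    public void register(String from, String to) {
        bridges.put(from + "->" + to, "bridge:" + from + "->" + to);
    }

    // Resolve the connection between a producer's output representation
    // and a consumer's input representation.
    public String connect(String producerOutput, String consumerInput) {
        if (producerOutput.equals(consumerInput)) {
            return "direct";
        }
        String bridge = bridges.get(producerOutput + "->" + consumerInput);
        if (bridge == null) {
            throw new IllegalStateException(
                "No transcoding bridge from " + producerOutput
                + " to " + consumerInput);
        }
        return bridge;
    }
}
```

Note the deliberate asymmetry: SAX/StAX/DOM bridges can be registered once and shipped with the pipeline, while an XML-to-image conversion never appears here, because that remains an application component.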

Note that XML is a quite unique area where components can allow data to flow in a single direction through them (i.e. a SAX consumer producing SAX events). Most components that deal with binary data pull their input and push their output, which is actually exactly what Unix pipes do (read from stdin, write to stdout). So wanting a universal pipeline API that also works with binary data requires addressing the push/pull conversion problem.
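One classic way to solve that push/pull mismatch in Java is a piped stream pair plus a thread: the push-style producer writes into one end while the pull-style consumer reads from the other. This is only a sketch of the general technique (the interface name is mine), and a real pipeline would need proper error propagation:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class PushPullAdapter {
    // Hypothetical interface for a push-style component: it is handed an
    // OutputStream and pushes its whole output into it.
    public interface PushProducer {
        void produce(OutputStream out) throws IOException;
    }

    // Adapt a push-style producer so a pull-style consumer can read from
    // an InputStream. The piped pair decouples the two control flows;
    // the extra thread drives the push side.
    public static InputStream pullFrom(PushProducer producer) throws IOException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);
        Thread pusher = new Thread(() -> {
            try (out) {
                producer.produce(out);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        pusher.start();
        return in;
    }

    public static void main(String[] args) throws Exception {
        // The consumer pulls exactly the bytes the producer pushed.
        InputStream in = pullFrom(out -> out.write("binary data".getBytes()));
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        in.transferTo(buf);
        System.out.println(buf);
    }
}
```

The cost of this adaptation (a thread and a buffer per conversion) is one reason the push/pull question deserves a place in the pipeline API design rather than being left to each component author.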

Sylvain

--
Sylvain Wallez - http://bluxte.net
