On Thursday 10 February 2005 at 19:49 +0100, Erik Bruchez wrote:
> Eric van der Vlist wrote:
>
> > * To detect the media-type (without relying on the manifest since
> >   I'd like this converter to be generic) of each part I currently
> >   rely on sun.net.www.MimeTable, which is probably not very portable.
> >   IMO, it would be useful to have a Presentation Server class to
> >   handle that globally.
>
> I think that until the JDK provides some real integration with the OS
> (which apparently doesn't exist), I suggest using mime-types.xml or
> something like it, which maps from extension to media-type, like the
> Resource server does. This will make it easily configurable and, more
> important, will make sure it works the same on every platform.
>
> So your processor can have an entry called mime-types like the
> Resource server. Maybe a better name would be media-types.

Yes, that's an option.

> > * For text formats I don't know how to detect the encoding and
> >   serialise them as base64.
>
> You mean that zip does not provide the encoding information?

Exactly. In the best case, I know that it's a text document but the ZIP
doesn't mention its encoding.

> > * I am currently treating external entities (or DTD references) as
> >   pointing to empty documents.
>
> Zip has DTDs? (I guess not.) Or do you mean in the manifest?

No, but OOo documents have the bad idea of referencing a DTD that isn't
included in the ZIP archive.

> > * As already mentioned on that list, I don't find it kosher to use
> >   xsi:type to specify the encoding. I have preferred to use a
> >   "content-type" attribute but am not very happy with that name.
>
> Argh. Anything but content-type ;-) There is too much history behind
> content-type, which usually contains what you call media type below.

Hmmm... encoding would be equally bad... what about "serialization"?

> > The output of this converter gives something such as:
> >
> > <archive>
> >   <entry name="mimetype" content-type="base64Binary"
> >       time="2005-02-08T22:03:36" size="30">
> >     YXBwbGljYXRpb24vdm5kLnN1bi54bWwud3JpdGVy
> >   </entry>
> >   <entry name="Pictures/10000000000001DA0000005BF70F3350.png"
> >       media-type="image/png" content-type="base64Binary"
> >       time="2005-02-08T22:03:36" size="29006">
> >     ...
> >   </entry>
> >   ...
> >   <entry name="META-INF/manifest.xml" media-type="application/xml"
> >       content-type="xml" time="2005-02-08T22:03:36">
> >     <manifest:manifest
> >         xmlns:manifest="http://openoffice.org/2001/manifest">
> >       <manifest:file-entry
> >           manifest:media-type="application/vnd.sun.xml.writer"
> >           manifest:full-path="/"/>
> >       <manifest:file-entry manifest:media-type="image/png"
> >           manifest:full-path="Pictures/10000000000001DA0000005BF70F3350.png"/>
> >       <manifest:file-entry manifest:media-type="image/png"
> >           manifest:full-path="Pictures/1000020000000055000000255789B5EB.png"/>
> >       <manifest:file-entry manifest:media-type=""
> >           manifest:full-path="Pictures/"/>
> >       <manifest:file-entry
> >           manifest:media-type="appication/binary"
> >           manifest:full-path="layout-cache"/>
> >       <manifest:file-entry manifest:media-type="text/xml"
> >           manifest:full-path="content.xml"/>
> >       <manifest:file-entry manifest:media-type="text/xml"
> >           manifest:full-path="styles.xml"/>
> >       <manifest:file-entry manifest:media-type="text/xml"
> >           manifest:full-path="meta.xml"/>
> >       <manifest:file-entry manifest:media-type="text/xml"
> >           manifest:full-path="settings.xml"/>
> >     </manifest:manifest>
> >   </entry>
> > </archive>
>
> Looks pretty good, except I would say for some attribute naming.
>
> Now there is the whole question of: what do you do with the entries,
> i.e. when do you consider them:
>
>   o binary (and so encode them to Base64)
>   o text (and so produce embedded character content)
>   o XML (and so parse the content)
>
> The URL generator has a logic that makes this decision, see the doc.

I have followed the same logic...

> Also, an option for the Zip extractor could be to not embed content,
> but produce URLs, like the Request generator does for uploaded files
> and request body. What you would do in this case is extract the
> content to temporary files. They would live for the duration of the
> request. The benefit: you don't end up with a really huge XML document
> going through the pipeline.

I don't find that very elegant but that may be useful for huge
archives... What I am implementing right now is the possibility to
extract only one document if you know its name.

Anyway, that's only a very early draft!

Eric

> -Erik
>
>
>
> -------------------------------------------------------
> SF email is sponsored by - The IT Product Guide
> Read honest & candid reviews on hundreds of IT Products from real users.
> Discover which products truly live up to the hype. Start reading now.
> http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
> _______________________________________________
> orbeon-user mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/orbeon-user
>
>
-- 
Freelance consulting and training.
http://dyomedea.com/english/
------------------------------------------------------------------------
Eric van der Vlist       http://xmlfr.org            http://dyomedea.com
(ISO) RELAX NG   ISBN:0-596-00421-4 http://oreilly.com/catalog/relax
(W3C) XML Schema ISBN:0-596-00252-1 http://oreilly.com/catalog/xmlschema
------------------------------------------------------------------------
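
For readers who want to experiment with the conversion discussed in this thread, here is a minimal sketch. It is written in Python, not as the Java processor under discussion, and it is not the actual implementation. The names `MEDIA_TYPES`, `XML_TYPES`, `media_type_for` and `archive_to_xml` are hypothetical, the extension table merely stands in for a configurable mime-types.xml, and the attribute names follow the draft output quoted above (the thread had not settled on `content-type`). Text entries are not given special treatment: since the ZIP does not record their encoding, anything that is not parsed as XML is emitted as base64. The optional `only` argument illustrates extracting a single entry when its name is known.

```python
import base64
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical extension-to-media-type table, standing in for a
# configurable mime-types.xml; the two entries are illustrative only.
MEDIA_TYPES = {".png": "image/png", ".xml": "application/xml"}
# Media types whose content this sketch tries to parse as XML (an assumption).
XML_TYPES = {"application/xml", "text/xml"}


def media_type_for(name):
    """Return the media type for an entry name, or None if the extension is unknown."""
    extension = posixpath.splitext(name)[1].lower()
    return MEDIA_TYPES.get(extension)


def archive_to_xml(zip_bytes, only=None):
    """Build an <archive> element from ZIP bytes; `only` restricts output to one entry name."""
    root = ET.Element("archive")
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir() or (only is not None and info.filename != only):
                continue
            data = archive.read(info)
            entry = ET.SubElement(root, "entry", name=info.filename)
            media_type = media_type_for(info.filename)
            if media_type is not None:
                entry.set("media-type", media_type)
            # Try to parse XML entries; fall back to base64 if parsing fails.
            parsed = None
            if media_type in XML_TYPES:
                try:
                    parsed = ET.fromstring(data)
                except ET.ParseError:
                    parsed = None
            timestamp = datetime(*info.date_time).isoformat()
            if parsed is not None:
                entry.set("content-type", "xml")
                entry.set("time", timestamp)
                entry.append(parsed)
            else:
                entry.set("content-type", "base64Binary")
                entry.set("time", timestamp)
                entry.set("size", str(info.file_size))
                entry.text = base64.b64encode(data).decode("ascii")
    return root
```

Calling `ET.tostring(archive_to_xml(data, only="content.xml"), encoding="unicode")` on the bytes of an archive would serialise just that one entry. Embedding everything in one document is the simple option; for very large archives the temporary-file approach suggested above avoids pushing a huge XML document through the pipeline.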
