On 11/3/06, Simon Laws <[EMAIL PROTECTED]> wrote:



On 11/3/06, Frank Budinsky <[EMAIL PROTECTED]> wrote:
>
> Simon,
>
> The Sequence entry returns a special property:
>
> if (sequence.getProperty(i) == specialCDataProperty) {
>   String cDataValue = (String) sequence.getValue(i);
> }
>
> The problem is that this special property is currently inherited from the
> underlying EMF, so it's really not something we want clients to use. Right
> now, this works fine in terms of reading in CDATA and not losing it when
> you reserialize, but there's really no proper way for an SDO client to
> actually access it. Without using EMF APIs, the only way a client can
> decide that an entry is CDATA is by looking at the property name (and hope
> there's no real property with that name). Longer term, I'm not sure that
> even handling it this way is right. Maybe CDATA and other special XML
> things should look like mixed text in the sequence (property == null), and
> some XMLHelper method (or some Tuscany-specific API for now) could be used
> to check if it's actually something special like CDATA. Maybe you should
> try to do something like that in the C++ impl, and if it looks promising,
> we'll switch the Java impl to do the same.
>
> Frank
>
> "Simon Laws" <[EMAIL PROTECTED]> wrote on 11/03/2006 10:18:04 AM:
>
> > On 11/3/06, Frank Budinsky <[EMAIL PROTECTED] > wrote:
> > >
> > > In the Tuscany Java implementation we expose CDATA as sequence entries
> > > (like mixed text) with a special "CDATA" property (we handle comments in a
> > > similar way). SDO doesn't define a special property for CDATA, so this is
> > > an implementation-specific feature. I'm not sure, long term, what should
> > > be the best (proper) way to do this.
> > >
> > > Frank.
> > >
> > > "Simon Laws" <[EMAIL PROTECTED]> wrote on 11/03/2006 09:41:25 AM:
> > >
> > > > On 10/26/06, Simon Laws <[EMAIL PROTECTED] > wrote:
> > > > >
> > > > > This is primarily a C++ question but I guess could apply to Java also. I'm
> > > > > trying to read a document into C++ SDO that contains a CDATA section. The
> > > > > corresponding CDATA doesn't make its way into the resulting SDO. I put the
> > > > > C++ SDO implementation in the debugger and found the reason why:
> > > > >
> > > > > sax2parser.cpp
> > > > >
> > > > > void sdo_cdataBlock(void *ctx, const xmlChar *value, int len)
> > > > > {
> > > > > }
> > > > >
> > > > > So the callback exists, gets called with the correct data during the
> > > > > parse, i.e. LibXML2 is doing the right thing, but the callback is ignored.
> > > > > Is there a good reason for this? I did a quick search of the C++ and Java
> > > > > specs and they don't appear to discuss CDATA specifically. Can someone
> > > > > comment on whether the Java implementation handles CDATA successfully?
> > > > >
> > > > > Logically, from an SDO point of view, there is probably no need to treat
> > > > > CDATA specially as the SDO model dictates precisely the difference between
> > > > > data and structure. We may find that to make the XML DAS function work we
> > > > > have to know that a property potentially contains markup but I'd have to
> > > > > look closely at how the C++ SDO implementation of the XML DAS function
> > > > > streams out SDOs to XML when requested to do so.
> > > > >
> > > > > If CDATA hasn't been omitted for a good reason I'll come up with a
> > > > > proposal for C++ SDO.
> > > > >
> > > > > Regards
> > > > >
> > > > > Simon
> > > >
> > > >
> > > >
> > > > I didn't get any response to this. Here are my further thoughts..
> > > >
> > > > There are a number of options for representing CDATA in SDO, for example:
> > > >
> > > > 1) Duplicate the CDATA string as is, including the "<![CDATA[" and "]]>"
> > > > markers, to the appropriate property in the data object hierarchy.
> > > > 2) Duplicate the CDATA string excluding the "<![CDATA[" and "]]>" markers
> > > > and instigate a special flag to indicate that CDATA is present.
> > > >
> > > > CDATA is the specific concern of XML, i.e. the character entities that CDATA
> > > > protects an XML parser from are of no concern to SDO because SDO is not
> > > > intended to be tied directly to XML. So given the example options above we
> > > > either expose the specifics of XML to the SDO core 2) or to the SDO user 1).
> > > >
> > > > Neither are particularly attractive.
> > > >
> > > > 1) appears to be the simplest approach to implement because it provides a
> > > > mechanism for the user to read and create CDATA without having to provide
> > > > much special support in SDO. 2) is more involved, particularly because
> > > > CDATA can appear mixed in with other text strings and so a sequence may need
> > > > to be used to represent properties that have a mixture of text and CDATA,
> > > > marking those sequence entries that are CDATA.
> > > >
> > > > 1) does require changes (at least in C++ SDO) because XML parsers tend to be
> > > > too helpful in this case for processing CDATA. XML parsers, libxml2 in
> > > > particular, recognize the "<![CDATA[" and "]]>" sequence as a special
> > > > indicator and throw it away returning just the text it includes. We would
> > > > have to reintroduce it and store it in the parameter value in question. The
> > > > C++ SDO implementation uses a lot of XML string handling before the parameter
> > > > value is actually stored which URL encodes parts of the CDATA markers so
> > > > this would have to be fixed. When writing out the CDATA strings any string
> > > > typed properties would have to be scanned for the markers so that the
> > > > appropriate libxml2 functions can be called to get the CDATA sections in the
> > > > right place.
> > > >
> > > > I have a test implementation of 1). If this is the way we want to go I would
> > > > have to do more work to thread CDATA handling through the xml strings that
> > > > are used to set parameters. Happy to do this but would like to discuss
> > > > first.
> > > >
> > > > Thoughts (particularly on what Java SDO does with CDATA)?
> > > >
> > > > Simon
> > >
> > >
> > >
> ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > >
> > Ok, thanks for that Frank. It would be nice for us to do it in the same
> > way so even though it is not in the SDO spec the implementations are
> > similar. A question: how do you set a sequence entry to indicate that it
> > holds CDATA? Is this a flag that is exposed on the sequence API? I'm not
> > really familiar with the Java SDO code base. Can you give me a class name to
> > go look at?
> >
> > Simon
>
>
>
Ok, thanks Frank. That explains why I couldn't really track it down in
the Java code. I'll have a bash at extending the C++ implementation I have
now and I'll report back.

Regards

Simon


OK, so I've taken my option 1 approach (see previous mail) and implemented a
solution in C++ SDO which allows CDATA sections and their strange markers to
appear in SDO properties. In this way the API for reading and creating CDATA
sections is the normal SDO string API. In a way this is just a preliminary
stab to allow us to play with CDATA and see whether this simple approach is
satisfactory.
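
To make the write-side scanning concrete, here is a minimal sketch of how a
string property value that still carries its markers could be split into plain
text and CDATA runs before serialization. The names Segment and splitCData are
hypothetical and are not taken from the C++ SDO code base; the sketch also
assumes that an unterminated "<![CDATA[" should simply be treated as ordinary
text.

```cpp
#include <string>
#include <vector>

// One run of a property value: plain text, or the body of a CDATA section.
// (Hypothetical type, for illustration only.)
struct Segment {
    std::string text;
    bool isCData;
};

// Split a value that contains literal "<![CDATA[" ... "]]>" markers into
// plain-text and CDATA runs. A writer could then hand each CDATA run to the
// XML library's CDATA call and escape the plain-text runs as usual.
std::vector<Segment> splitCData(const std::string& value)
{
    static const std::string open = "<![CDATA[";
    static const std::string close = "]]>";
    std::vector<Segment> segments;
    std::string::size_type pos = 0;

    while (pos < value.size()) {
        std::string::size_type start = value.find(open, pos);
        if (start == std::string::npos) {
            // No further markers: the rest is plain text.
            segments.push_back({value.substr(pos), false});
            break;
        }
        std::string::size_type end = value.find(close, start + open.size());
        if (end == std::string::npos) {
            // Assumption: an unterminated marker is kept as ordinary text.
            segments.push_back({value.substr(pos), false});
            break;
        }
        if (start > pos) {
            // Plain text that precedes the CDATA section.
            segments.push_back({value.substr(pos, start - pos), false});
        }
        // The CDATA body, without its markers.
        segments.push_back(
            {value.substr(start + open.size(), end - start - open.size()), true});
        pos = end + close.size();
    }
    return segments;
}
```

Given the value "a<![CDATA[<b>]]>c", this yields three runs: the text "a", the
CDATA body "<b>", and the text "c".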

From the previous mails there was discussion of alternative approaches where
special markers are introduced to indicate where CDATA appears hence
removing the need to maintain the CDATA markers in text. However there are
some tricky cases, particularly where one or more CDATA sections appear
within a primitive text string. As schema gives us no help in locating CDATA
sections this leaves the model at a bit of a loss in terms of representing
them. We would potentially end up adding scaffolding around primitive string
types, or preferably creating a new type, that is able to represent accurately
the combination of text and CDATA sections.

Anyhow something more complex may be appropriate in the future but this
simple solution allows us to offer something to our PHP SDO users quickly
that I don't think causes us big problems for the future. If in the future
we have special flags we can always reproduce the CDATA markers if required.
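
As a rough illustration of that last point, the sketch below rebuilds a
marker-carrying string from a list of flagged text runs. FlaggedText and
withCDataMarkers are hypothetical names, not an existing SDO API, and the
sketch assumes that no CDATA body itself contains "]]>".

```cpp
#include <string>
#include <vector>

// A text run plus a flag saying whether it came from a CDATA section.
// (Hypothetical type, for illustration only.)
struct FlaggedText {
    std::string text;
    bool isCData;
};

// Rebuild the option 1 representation: wrap every flagged run in
// "<![CDATA[" ... "]]>" and copy the unflagged runs through unchanged.
// Assumption: no CDATA body contains the terminator "]]>".
std::string withCDataMarkers(const std::vector<FlaggedText>& entries)
{
    std::string out;
    for (const FlaggedText& entry : entries) {
        if (entry.isCData) {
            out += "<![CDATA[";
            out += entry.text;
            out += "]]>";
        } else {
            out += entry.text;
        }
    }
    return out;
}
```

So a flag-based model would not lose anything relative to the current approach:
the marker form can be regenerated on demand for users who rely on it.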


I created a JIRA to record progress on this issue (
http://issues.apache.org/jira/browse/TUSCANY-908)

Simon
