Re: DTD Reading Urgent

chandru 8 Aug 2001 05:21:06 -0000

Hi,
Do you mean that I can get a solution for this with Xerces2? If so, what is the
release date for Xerces2? I need this
desperately. Is there any other workaround? Please advise.


from
chandra sekhar

Glenn Marcy wrote:

> John,
>
> Actually, it is not as bad as you make out...
>
> Here is a note that I sent to the dev list several months ago:
>
> Subject: Re: [xerces2] Note to current XNI state
>
> I have been looking into another approach.  The documentation of the
> SAX2 property "http://xml.org/sax/properties/xml-string" says:
>
>      data type: java.lang.String
>      description: The literal string of characters that was the source
>                   for the current event.
>      access: read-only
>
> I have been looking at the current Xerces2 APIs and how one might add
> support for this property during DTD parsing.
>
> So, using your two examples, I get (from a DTD writer prototype):
>
> Input:
> <!ENTITY % ent "ANY">
> <!ELEMENT e %ent;>
>
> Output:
> <!ENTITY % ent "ANY">
> <!ELEMENT e %ent;>
>
> Input:
> <!ENTITY % ent " ">
> %ent;
> <!ELEMENT e ANY>
>
> Output:
> <!ENTITY % ent " ">
> %ent;
> <!ELEMENT e ANY>
>
> So this matches one of my goals of being able to take any DTD and parse
> it and be able to spit back out exactly what was parsed.  In this case
> all I do is get the value of the "xml-string" property during every event
> and write it out.  Now to get a feel for the event/xml-string relationship,
> I can run the program with debugging turned on, which just adds an
> "[event-name]" to the output stream.  Note that I have also added a new
> "startOfMarkupDecl" event because it made my application (the simple DTD
> writer) easier to code, but it is not "strictly" necessary.  The debug
> code looks like:
>
> String propString = "http://xml.org/sax/properties/xml-string";
>
> void printEvent(String eventName) {
>     String xmlstring = (String) parser.getProperty(propString);
>     if (DEBUG)
>         System.out.println("[" + eventName + "]" + xmlstring);
>     else
>         System.out.print(xmlstring);
> }
>
> So again for the same two cases:
>
> Output (with DEBUG printing added):
> [startDTD]
> [startEntity "[dtd]"]
> [startOfMarkupDecl]<!ENTITY
> [internalEntityDecl] % ent "ANY">
> [startOfMarkupDecl]
> <!ELEMENT
> [startEntity "%ent"] e %ent;
> [endEntity "%ent"]
> [elementDecl]>
> [endEntity "[dtd]"]
> [endDTD]
>
> Output (with DEBUG printing added):
> [startDTD]
> [startEntity "[dtd]"]
> [startOfMarkupDecl]<!ENTITY
> [internalEntityDecl] % ent " ">
> [startEntity "%ent"]
> %ent;
> [endEntity "%ent"]
> [startOfMarkupDecl]
> <!ELEMENT
> [elementDecl] e ANY>
> [endEntity "[dtd]"]
> [endDTD]
>
> Obviously this requires some additional "reparsing" of some of the
> simple constructs in the declarations, but it isn't too hard to keep
> straight.  The nice advantage is that you do not need to add lots of
> methods to the handler APIs and you can avoid doing the work of
> creating the String until you get the getProperty call from the
> application within the handler callback.  Since the parser only needs
> to be able to return the unparsed stream back as far as the last event,
> and not since the last call to getProperty, the amount of information
> that needs to be available is small.  There is a little overhead on the
> edge cases at low-level reader I/O buffer boundaries, but since most
> events will occur within the same buffer as the previous event a simple
> lastXMLStringOffset variable handles the common case.
>
> Regards,
> Glenn
>
> <<<end enclosure>>>
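
[Inline note on the enclosure above.] The bookkeeping Glenn describes (hand back only the raw text consumed since the previous event, and build the String only when the application asks for it) can be sketched roughly as below. This is a toy model under stated assumptions, not Xerces code: the class XmlStringTracker, its methods, and the hand-picked event offsets are all hypothetical, and a plain String stands in for the reader's I/O buffer, so the buffer-boundary edge cases he mentions are ignored.

```java
// Toy model of the "xml-string since the last event" idea; all names are hypothetical.
class XmlStringTracker {
    private final String source;      // stands in for the low-level reader buffer
    private int lastEventOffset = 0;  // where the previous event's text ended
    private int currentOffset = 0;    // where the current event's text ends

    XmlStringTracker(String source) {
        this.source = source;
    }

    // Called by an imagined scanner when it fires an event after consuming
    // input up to 'offset'. Only two ints are updated; no String is built here.
    void event(int offset) {
        if (offset < currentOffset || offset > source.length()) {
            throw new IllegalArgumentException("offset out of order: " + offset);
        }
        lastEventOffset = currentOffset;
        currentOffset = offset;
    }

    // Called by the application from inside a handler callback. The String is
    // created lazily, and covers only the text since the previous event.
    String xmlString() {
        return source.substring(lastEventOffset, currentOffset);
    }

    // Writes out xmlString() at every event, as the DTD writer prototype does.
    static String roundTrip(String source, int... eventOffsets) {
        XmlStringTracker tracker = new XmlStringTracker(source);
        StringBuilder out = new StringBuilder();
        for (int offset : eventOffsets) {
            tracker.event(offset);
            out.append(tracker.xmlString());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String dtd = "<!ENTITY % ent \"ANY\">\n<!ELEMENT e %ent;>";
        // Offsets chosen by hand to imitate event boundaries; the last one is end of input.
        System.out.print(roundTrip(dtd, 8, 21, 31, 39, dtd.length()));
    }
}
```

As long as the final event lands at the end of the input, concatenating the per-event strings gives back the original text, which is the round-trip property the prototype relies on.
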
>
> John wrote:
>
> For example, what would you like reported for this:
> <!ENTITY % someOtherFile SYSTEM "aardvark.mod">
> <!ENTITY % prefix "foo:">
> <!ENTITY % elementName "%prefix;bar">
> <!ENTITY % cmBit "a,b,c">
> <!ENTITY % fullDecl "<!ELEMENT %elementName; (%cmBit;)>" >
> %fullDecl;
> %someOtherFile;
>
> This is what I get with my DTDWriter:
>
> [startDocument]
> [startDTD]
> [startEntity "[dtd]"]
> [startOfEntityDecl]<!ENTITY
> [externalEntityDecl] % someOtherFile SYSTEM "aardvark.mod">
> [startOfEntityDecl]
> <!ENTITY
> [internalEntityDecl] % prefix "foo:">
> [startOfEntityDecl]
> <!ENTITY
> [startEntity "%prefix"] % elementName "%prefix;
> [endEntity "%prefix"]
> [internalEntityDecl]bar">
> [startOfEntityDecl]
> <!ENTITY
> [internalEntityDecl] % cmBit "a,b,c">
> [startOfEntityDecl]
> <!ENTITY
> [startEntity "%elementName"] % fullDecl "<!ELEMENT %elementName;
> [endEntity "%elementName"]
> [startEntity "%cmBit"] (%cmBit;
> [endEntity "%cmBit"]
> [internalEntityDecl])>" >
> [startEntity "%fullDecl"]
> %fullDecl;
> [startOfElementDecl]
> [elementDecl]
> [endEntity "%fullDecl"]
> [startEntity "%someOtherFile"]
> %someOtherFile;
> [endEntity "%someOtherFile"]
> [endEntity "[dtd]"]
> [endDTD]
>
> Regards,
> Glenn
>
>
> From: "Anderson, John" <[EMAIL PROTECTED]oft.com>
> To: "'[EMAIL PROTECTED]'" <[EMAIL PROTECTED]>, "'[email protected]'" <[email protected]>
> cc:
> Subject: RE: DTD Reading Urgent
> Date: 08/06/2001 01:09 PM
> Please respond to xerces-j-dev
>
>
>
> Ho ho!
>
> Found someone who wants this as well!
>
> The XML spec says that resolving is what a validating parser should do, and
> neither SAX nor DOM APIs expose this information. I would also like to be
> able to access this information, but most people have put it in the too
> hard basket. The problem is complicated (to put it mildly) by the various
> ways in which PEs can be used, including as element names, content models
> (or parts thereof), entire declarations, includes, bits of attribute
> declarations, etc etc. Plus some local PEs to override those in the
> external DTD. I am no kind of parser expert, but I guess it would be pretty
> difficult for a parser to not resolve this and still guarantee validity
> without doing a double pass.
>
> For example, what would you like reported for this:
>
> <!ENTITY % someOtherFile SYSTEM "aardvark.mod">
> <!ENTITY % prefix "foo:">
> <!ENTITY % elementName "%prefix;bar">
> <!ENTITY % cmBit "a,b,c">
> <!ENTITY % fullDecl "<!ELEMENT %elementName; (%cmBit;)>" >
> %fullDecl;
> %someOtherFile;
>
> Now add in a few general entities just to make it interesting, and perhaps
> some attributes and then remember it all when you create and modify a DOM3
> AS Model of it.
>
> As far as I know, the only way to do it would be to do some hacking in the
> source code after determining which entities you want resolved and which
> ones you'd like passed through. Unfortunately I haven't yet had the time to
> work out exactly how I might do this. If I do, I'll let you know.
>
> If anyone else has done it, I'd also be very grateful to know where and how
> I should begin.
>
> John
>
> -----Original Message-----
> From: chandru [mailto:[EMAIL PROTECTED]
> Sent: 25 July 2001 05:40
> To: [EMAIL PROTECTED]
> Subject: DTD Reading Urgent
>
> Hi friends,
>   While reading a DTD with the DTDReader (using the parser's
> declaration-handler property), the element content model comes back as a
> normalised definition. The model is normalized so that all parameter
> entities are fully resolved and all whitespace is removed, and it includes
> the enclosing parentheses. How can I stop this? I want the un-normalised
> definition of the element.
> e.g.:
>  <!ELEMENT NOE (%_NOE_;)>
> <!ENTITY % _NOE_ (Msgfun,getInfo,(Reader))>
>
> What I am expecting is:
>   element name : NOE
>   content model: (%_NOE_;)
>
> What the parser reports in DeclHandler is:
>   element name : NOE
>   content model: (Msgfun,getInfo,(Reader))
>
> How can I recover the original declarations, i.e. without the normalisation?
>
> Expecting your mail soon....
>
> from
> chandra sekhar
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
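
For anyone who wants to reproduce the behaviour described in the original question, here is a minimal sketch that uses only the standard SAX2 DeclHandler and whatever JAXP parser ships with the JDK. Several things here are assumptions made for illustration: the class name DeclHandlerDemo, the system identifier noe.dtd, and serving the DTD from memory through an EntityResolver. The entity is declared before it is used and its value is quoted, and the DTD is an external subset because a parameter-entity reference inside a declaration is not allowed in the internal subset.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DeclHandler;

class DeclHandlerDemo {
    // Hypothetical external DTD, adapted from the example in the question.
    static final String DTD =
        "<!ENTITY % _NOE_ \"Msgfun,getInfo,(Reader)\">\n" +
        "<!ELEMENT NOE (%_NOE_;)>\n";

    // Parses a tiny document and returns one "name -> model" string per elementDecl event.
    static List<String> elementDecls() {
        final List<String> seen = new ArrayList<>();
        DeclHandler decls = new DeclHandler() {
            public void elementDecl(String name, String model) {
                seen.add(name + " -> " + model);
            }
            public void attributeDecl(String eName, String aName, String type,
                                      String mode, String value) { }
            public void internalEntityDecl(String name, String value) { }
            public void externalEntityDecl(String name, String publicId, String systemId) { }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // Standard SAX2 property id for registering a DeclHandler.
            reader.setProperty("http://xml.org/sax/properties/declaration-handler", decls);
            // Serve the external subset from memory instead of the file system.
            reader.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader(DTD)));
            reader.parse(new InputSource(
                new StringReader("<!DOCTYPE NOE SYSTEM \"noe.dtd\"><NOE/>")));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        // The reported model has the parameter entity already expanded.
        for (String decl : elementDecls()) {
            System.out.println(decl);
        }
    }
}
```

The handler only ever sees the expanded model, never the %_NOE_; reference itself. The SAX2 LexicalHandler does not fill this gap either, since its documentation notes that parameter-entity boundaries inside declarations cannot be reported.
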


