Hi,

Do you mean that I can get a solution for this with Xerces2? If so, what is the release date for Xerces2? I need this desperately. Is there any other workaround? Please advise.

from chandra sekhar

Glenn Marcy wrote:
> John,
>
> Actually, it is not as bad as you make out...
>
> Here is a note that I sent to the dev list several months ago:
>
> Subject: Re: [xerces2] Note to current XNI state
>
> I have been looking into another approach. The documentation of the
> SAX2 property "http://xml.org/sax/properties/xml-string" says:
>
> data type: java.lang.String
> description: The literal string of characters that was the source
> for the current event.
> access: read-only
>
> I have been looking at the current Xerces2 APIs and how one might add
> support for this property during DTD parsing.
>
> So, using your two examples, I get (from a DTD writer prototype):
>
> Input:
> <!ENTITY % ent "ANY">
> <!ELEMENT e %ent;>
>
> Output:
> <!ENTITY % ent "ANY">
> <!ELEMENT e %ent;>
>
> Input:
> <!ENTITY % ent " ">
> %ent;
> <!ELEMENT e ANY>
>
> Output:
> <!ENTITY % ent " ">
> %ent;
> <!ELEMENT e ANY>
>
> So this matches one of my goals of being able to take any DTD and parse
> it and be able to spit back out exactly what was parsed. In this case
> all I do is get the value of the "xml-string" property during every event
> and write it out. Now to get a feel for the event/xml-string relationship,
> I can run the program with debugging turned on, which just adds an
> "[event-name]" to the output stream. Note that I have also added a new
> "startOfMarkupDecl" event because it made my application (the simple DTD
> writer) easier to code, but it is not "strictly" necessary.
> The debug code looks like:
>
>     String propString = "http://xml.org/sax/properties/xml-string";
>
>     void printEvent(String eventName) {
>         // Literal source text behind the current event
>         String xmlstring = (String) parser.getProperty(propString);
>         if (DEBUG)
>             System.out.println("[" + eventName + "]" + xmlstring);
>         else
>             System.out.print(xmlstring);
>     }
>
> So again for the same two cases:
>
> Output (with DEBUG printing added):
> [startDTD]
> [startEntity "[dtd]"]
> [startOfMarkupDecl]<!ENTITY
> [internalEntityDecl] % ent "ANY">
> [startOfMarkupDecl]
> <!ELEMENT
> [startEntity "%ent"] e %ent;
> [endEntity "%ent"]
> [elementDecl]>
> [endEntity "[dtd]"]
> [endDTD]
>
> Output (with DEBUG printing added):
> [startDTD]
> [startEntity "[dtd]"]
> [startOfMarkupDecl]<!ENTITY
> [internalEntityDecl] % ent " ">
> [startEntity "%ent"]
> %ent;
> [endEntity "%ent"]
> [startOfMarkupDecl]
> <!ELEMENT
> [elementDecl] e ANY>
> [endEntity "[dtd]"]
> [endDTD]
>
> Obviously this requires some additional "reparsing" of some of the
> simple constructs in the declarations, but it isn't too hard to keep
> straight. The nice advantage is that you do not need to add lots of
> methods to the handler APIs and you can avoid doing the work of
> creating the String until you get the getProperty call from the
> application within the handler callback. Since the parser only needs
> to be able to return the unparsed stream back as far as the last event,
> and not since the last call to getProperty, the amount of information
> that needs to be available is small. There is a little overhead on the
> edge cases at low-level reader I/O buffer boundaries, but since most
> events will occur within the same buffer as the previous event a simple
> lastXMLStringOffset variable handles the common case.
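
The offset bookkeeping described above can be illustrated with a small sketch. This is not Xerces code: the class `XmlStringTracker`, its methods, and the hand-picked event boundaries are all hypothetical, and the sketch assumes the whole DTD sits in one buffer, so the I/O buffer boundary cases mentioned above are ignored. It only shows how remembering where the previous event ended lets the raw text be sliced out lazily, when it is asked for.

```java
/**
 * Hypothetical sketch (not Xerces code): remembers where the previous
 * event ended so the literal source text of the current event can be
 * sliced out of the buffer only when somebody asks for it.
 */
public class XmlStringTracker {
    private final CharSequence buffer;   // all characters read so far (assumed to fit in one buffer)
    private int lastXMLStringOffset = 0; // where the previous event ended
    private int eventEnd = 0;            // where the current event ends

    public XmlStringTracker(CharSequence buffer) {
        this.buffer = buffer;
    }

    /** Called by the (imaginary) scanner just before it fires an event. */
    public void markEvent(int scannerOffset) {
        lastXMLStringOffset = eventEnd;
        eventEnd = scannerOffset;
    }

    /** Builds the String lazily, covering only the text since the last event. */
    public String getXmlString() {
        return buffer.subSequence(lastXMLStringOffset, eventEnd).toString();
    }

    /** Replays a list of event boundaries and concatenates each event's text. */
    public static String roundTrip(String source, int[] boundaries) {
        XmlStringTracker tracker = new XmlStringTracker(source);
        StringBuilder out = new StringBuilder();
        for (int boundary : boundaries) {
            tracker.markEvent(boundary);
            out.append(tracker.getXmlString());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String dtd = "<!ENTITY % ent \"ANY\">\n<!ELEMENT e %ent;>";
        // Boundaries chosen by hand to mirror the six events of the first
        // debug trace (startOfMarkupDecl ... elementDecl); illustration only.
        int[] boundaries = {8, 21, 31, 39, 39, 40};
        System.out.println(roundTrip(dtd, boundaries).equals(dtd));
    }
}
```

Because every character falls between exactly two consecutive boundaries, writing out the text of every event reproduces the input, which is the property the DTD writer relies on.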
>
> Regards,
> Glenn
>
> <<<end enclosure>>>
>
> John wrote:
>
> For example, what would you like reported for this:
> <!ENTITY % someOtherFile SYSTEM "aardvark.mod">
> <!ENTITY % prefix "foo:">
> <!ENTITY % elementName "%prefix;bar">
> <!ENTITY % cmBit "a,b,c">
> <!ENTITY % fullDecl "<!ELEMENT %elementName; (%cmBit;)>" >
> %fullDecl;
> %someOtherFile;
>
> This is what I get with my DTDWriter:
>
> [startDocument]
> [startDTD]
> [startEntity "[dtd]"]
> [startOfEntityDecl]<!ENTITY
> [externalEntityDecl] % someOtherFile SYSTEM "aardvark.mod">
> [startOfEntityDecl]
> <!ENTITY
> [internalEntityDecl] % prefix "foo:">
> [startOfEntityDecl]
> <!ENTITY
> [startEntity "%prefix"] % elementName "%prefix;
> [endEntity "%prefix"]
> [internalEntityDecl]bar">
> [startOfEntityDecl]
> <!ENTITY
> [internalEntityDecl] % cmBit "a,b,c">
> [startOfEntityDecl]
> <!ENTITY
> [startEntity "%elementName"] % fullDecl "<!ELEMENT %elementName;
> [endEntity "%elementName"]
> [startEntity "%cmBit"] (%cmBit;
> [endEntity "%cmBit"]
> [internalEntityDecl])>" >
> [startEntity "%fullDecl"]
> %fullDecl;
> [startOfElementDecl]
> [elementDecl]
> [endEntity "%fullDecl"]
> [startEntity "%someOtherFile"]
> %someOtherFile;
> [endEntity "%someOtherFile"]
> [endEntity "[dtd]"]
> [endDTD]
>
> Regards,
> Glenn
>
>
> "Anderson, John" <[EMAIL PROTECTED]>
> 08/06/2001 01:09 PM
> Please respond to xerces-j-dev
>
> To: "'[EMAIL PROTECTED]'" <[EMAIL PROTECTED]>, "'[email protected]'" <[email protected]>
> cc:
> Subject: RE: DTD Reading Urgent
>
>
> Ho ho!
>
> Found someone who wants this as well!
>
> The XML spec says that resolving is what a validating parser should do, and
> neither SAX nor DOM APIs expose this information. I would also like to be
> able to access this information, but most people have put it in the too
> hard basket.
> The problem is complicated (to put it mildly) by the various
> ways in which PEs can be used, including as element names, content models
> (or parts thereof), entire declarations, includes, bits of attribute
> declarations, etc. etc. Plus some local PEs to override those in the
> external DTD. I am no kind of parser expert, but I guess it would be pretty
> difficult for a parser to not resolve this and still guarantee validity
> without doing a double pass.
>
> For example, what would you like reported for this:
>
> <!ENTITY % someOtherFile SYSTEM "aardvark.mod">
> <!ENTITY % prefix "foo:">
> <!ENTITY % elementName "%prefix;bar">
> <!ENTITY % cmBit "a,b,c">
> <!ENTITY % fullDecl "<!ELEMENT %elementName; (%cmBit;)>" >
> %fullDecl;
> %someOtherFile;
>
> Now add in a few general entities just to make it interesting, and perhaps
> some attributes and then remember it all when you create and modify a DOM3
> AS Model of it.
>
> As far as I know, the only way to do it would be to do some hacking in the
> source code after determining which entities you want resolved and which
> ones you'd like passed through. Unfortunately I haven't yet had the time to
> work out exactly how I might do this. If I do, I'll let you know.
>
> If anyone else has done it, I'd also be very grateful to know where and how
> I should begin.
>
> John
>
> -----Original Message-----
> From: chandru [mailto:[EMAIL PROTECTED]
> Sent: 25 July 2001 05:40
> To: [EMAIL PROTECTED]
> Subject: DTD Reading Urgent
>
> Hi friends,
> While reading the DTD using the DTDReader (using the decl-handler
> feature of the parser), the element's content model is reported as a
> normalised definition. The model will be normalized so that all parameter
> entities are fully resolved and all whitespace is removed, and will include
> the enclosing parentheses. How can I stop this? I want the un-normalised
> definition of the element.
> e.g.:
> <!ELEMENT NOE (%_NOE_;)>
> <!ENTITY % _NOE_ (Msgfun,getInfo,(Reader))>
>
> What I am expecting is:
> element name: NOE
> content model: (%_NOE_;)
>
> What the parser reports in DeclHandler is:
> element name: NOE
> content model: (Msgfun,getInfo,(Reader))
>
> How can I regain the original declarations, i.e. without the normalisation?
>
> Expecting your mail soon....
>
> from
> chandra sekhar
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
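
For anyone wanting to reproduce the normalisation discussed in this thread, here is a minimal sketch using the SAX2 `DeclHandler` with the parser bundled in the JDK. Several things in it are assumptions made for illustration: the class name `ContentModelDemo`, the DTD name `noe.dtd`, and the way the DTD text is handed to the parser through an `EntityResolver` instead of a file. The parameter entity is quoted and declared before use so that the DTD is well-formed, and it lives in the external subset because parameter-entity references inside a declaration are not allowed in the internal subset.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DeclHandler;

public class ContentModelDemo {

    /** Returns the content model that DeclHandler.elementDecl reports for elementName. */
    static String reportedModel(String dtd, String doc, String elementName) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        final String[] model = new String[1];

        // Standard SAX2 property for registering a DeclHandler.
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", new DeclHandler() {
            public void elementDecl(String name, String reported) {
                if (name.equals(elementName)) {
                    model[0] = reported;
                }
            }
            public void attributeDecl(String eName, String aName, String type, String mode, String value) { }
            public void internalEntityDecl(String name, String value) { }
            public void externalEntityDecl(String name, String publicId, String systemId) { }
        });

        // Serve the external DTD from memory instead of the file system.
        reader.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader(dtd)));
        reader.parse(new InputSource(new StringReader(doc)));
        return model[0];
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!ENTITY % _NOE_ \"Msgfun,getInfo,(Reader)\">\n"
                   + "<!ELEMENT NOE (%_NOE_;)>\n";
        String doc = "<!DOCTYPE NOE SYSTEM \"noe.dtd\"><NOE/>";
        // The reported model has the parameter entity already resolved;
        // the original "(%_NOE_;)" text is not available through this API.
        System.out.println("content model: " + reportedModel(dtd, doc, "NOE"));
    }
}
```

The callback only ever receives the resolved model, which is why the approaches discussed above look at the literal source text (the `xml-string` property) and not at `DeclHandler`.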
