[jira] Closed: (STDCXX-1053) Xerces is popping up an exception while parsing a Unicode file, but the same works fine for an ANSI file

Jojo Jose (JIRA) Fri, 21 Jan 2011 01:25:13 -0800

     [ https://issues.apache.org/jira/browse/STDCXX-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Jojo Jose closed STDCXX-1053.
-----------------------------

    Resolution: Duplicate

New location at https://issues.apache.org/jira/browse/XERCESC-1955

> Xerces is popping up an exception while parsing a Unicode file, but the same
> works fine for an ANSI file
> -----------------------------------------------------------------------------------------------------
>
>                 Key: STDCXX-1053
>                 URL: https://issues.apache.org/jira/browse/STDCXX-1053
>             Project: C++ Standard Library
>          Issue Type: Bug
>          Components: 20. General Utilities
>         Environment: Windows XP
>            Reporter: Jojo Jose
>
> Hi All,
> Please let me know if anybody can provide a clue on this.
> I have been using Xerces as the XML parser in my C++ application, and I
> recently migrated from Xerces version 1.3 (very old) to 3.1.
> After that, when I call AbstractDOMParser::parse(const
> xercesc_3_1::InputSource & source={...}) with a Unicode file as input,
> it raises an exception. However, the same call works fine for an ANSI file.
> The call stack is shown below.
> xerces-c_3_1.dll!xercesc_3_1::XMLScanner::scanProlog()  Line 1227 + 0x25 bytes
> xerces-c_3_1.dll!xercesc_3_1::IGXMLScanner::scanDocument(const xercesc_3_1::InputSource & src={...})  Line 210
> xerces-c_3_1.dll!xercesc_3_1::AbstractDOMParser::parse(const xercesc_3_1::InputSource & source={...})  Line 549
> EPConfigTool.dll!XCfgXMLParser::parse()  Line 66 - // My application code
> In the code, execution reaches the following branch:
> else
> {
>  emitError(XMLErrs::InvalidDocumentStructure);
> ...
> }
> The function in which the parse fails is shown below:
> void XMLScanner::scanProlog()
> {
>     bool sawDocTypeDecl = false;
>     // Get a buffer for whitespace processing
>     XMLBufBid bbCData(&fBufMgr);
>     //  Loop through the prolog. If there is no content, this could go all
>     //  the way to the end of the file.
>     try
>     {
>         while (true)
>         {
>             const XMLCh nextCh = fReaderMgr.peekNextChar();
>             if (nextCh == chOpenAngle)
>             {
>                 //  Ok, it could be the xml decl, a comment, the doc type
>                 //  line, or the start of the root element.
>                 if (checkXMLDecl(true))
>                 {
>                     // There shall be at lease --ONE-- space in between
>                     // the tag '<?xml' and the VersionInfo.
>                     //
>                     //  If we are not at line 1, col 6, then the decl was not
>                     //  the first text, so its invalid.
>                     const XMLReader* curReader =
>                         fReaderMgr.getCurrentReader();
>                     if ((curReader->getLineNumber() != 1)
>                     ||  (curReader->getColumnNumber() != 7))
>                     {
>                         emitError(XMLErrs::XMLDeclMustBeFirst);
>                     }
>                     scanXMLDecl(Decl_XML);
>                 }
>                 else if (fReaderMgr.skippedString(XMLUni::fgPIString))
>                 {
>                     scanPI();
>                 }
>                 else if (fReaderMgr.skippedString(XMLUni::fgCommentString))
>                 {
>                     scanComment();
>                 }
>                 else if (fReaderMgr.skippedString(XMLUni::fgDocTypeString))
>                 {
>                     if (sawDocTypeDecl) {
>                         emitError(XMLErrs::DuplicateDocTypeDecl);
>                     }
>                     scanDocTypeDecl();
>                     sawDocTypeDecl = true;
>                     // if reusing grammar, this has been validated already
>                     // in first scan; skip for performance
>                     if (fValidate && fGrammar && !fGrammar->getValidated()) {
>                         //  validate the DTD scan so far
>                         fValidator->preContentValidation(fUseCachedGrammar, true);
>                     }
>                 }
>                 else
>                 {
>                     // Assume its the start of the root element
>                     return;
>                 }
>             }
>             else if (fReaderMgr.getCurrentReader()->isWhitespace(nextCh))
>             {
>                 //  If we have a document handler then gather up the
>                 //  whitespace and call back. Otherwise just skip over spaces.
>                 if (fDocHandler)
>                 {
>                     fReaderMgr.getSpaces(bbCData.getBuffer());
>                     fDocHandler->ignorableWhitespace
>                     (
>                         bbCData.getRawBuffer()
>                         , bbCData.getLen()
>                         , false
>                     );
>                 }
>                 else
>                 {
>                     fReaderMgr.skipPastSpaces();
>                 }
>             }
>             else
>             {
>                 emitError(XMLErrs::InvalidDocumentStructure);
>                 // Watch for end of file and break out
>                 if (!nextCh)
>                     break;
>                 else
>                     fReaderMgr.skipPastChar(chCloseAngle);
>             }
>         }
>     }
>     catch(const EndOfEntityException&)
>     {
>         //  We should never get an end of entity here. They should only
>         //  occur within the doc type scanning method, and not leak out to
>         //  here.
>         emitError
>         (
>             XMLErrs::UnexpectedEOE
>             , "in prolog"
>         );
>     }
> }
> Everything works fine when I move back to version 1.3, but due to various
> other requirements I have to use the new version 3.1 in my application.
> Thanks in advance,
> Jojo
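
One thing that may be worth checking for a report like this is how the failing
file is actually encoded. On Windows, a file saved as "Unicode" is typically
UTF-16LE with a leading byte-order mark (BOM), while "ANSI" means a single-byte
code page. The sketch below is not taken from the report or from Xerces-C++:
detectBom and readPrefix are hypothetical helper names, and the code only
inspects the leading bytes of a file so they can be compared with the encoding
named in the XML declaration.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not part of Xerces-C++): names the encoding suggested
// by a leading byte-order mark, or "none" when no BOM is present.
std::string detectBom(const std::vector<unsigned char>& b)
{
    if (b.size() >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF)
        return "UTF-8";
    // UTF-32 must be tested before UTF-16: the UTF-32LE BOM starts with FF FE.
    if (b.size() >= 4 && b[0] == 0xFF && b[1] == 0xFE && b[2] == 0x00 && b[3] == 0x00)
        return "UTF-32LE";
    if (b.size() >= 4 && b[0] == 0x00 && b[1] == 0x00 && b[2] == 0xFE && b[3] == 0xFF)
        return "UTF-32BE";
    if (b.size() >= 2 && b[0] == 0xFF && b[1] == 0xFE)
        return "UTF-16LE";
    if (b.size() >= 2 && b[0] == 0xFE && b[1] == 0xFF)
        return "UTF-16BE";
    return "none";
}

// Hypothetical helper: reads up to n leading bytes of a file in binary mode.
// Returns an empty vector if the file cannot be opened.
std::vector<unsigned char> readPrefix(const char* path, std::size_t n = 4)
{
    std::FILE* f = std::fopen(path, "rb");
    if (!f)
        return {};
    std::vector<unsigned char> buf(n);
    buf.resize(std::fread(buf.data(), 1, n, f));
    std::fclose(f);
    return buf;
}
```

Calling detectBom(readPrefix(path)) on the failing file and on the working one
would show whether the two differ in their leading bytes. If the detected
encoding disagrees with the XML declaration, or with an encoding the
application sets explicitly, the scanner may see a first character that is
neither '<' nor whitespace, which is the condition that leads to the
InvalidDocumentStructure branch in the quoted scanProlog(). This is only one
possible explanation, not a confirmed diagnosis.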

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
