null termination of DOMStrings

2002-04-23 Thread Marcus Ackermann
Hi, the documentation of DOMString::rawBuffer() says that the returned buffer is not always null terminated. This implies that the buffer has to be copied and a null character has to be appended to that copy of the buffer for further use. I would like to skip this copying for performance reasons
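The copy-and-terminate step described above could look roughly like the following. This is only a sketch with no dependency on Xerces: `CodeUnit` is a stand-in for the parser's UTF-16 character type, `makeNullTerminated` is a hypothetical helper name, and the length is assumed to come from the string's own length accessor rather than from scanning for a terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the parser's UTF-16 code unit type (an assumption for this sketch).
using CodeUnit = char16_t;

// Copy `len` code units from a buffer that may lack a terminator and append
// one, so the result can be handed to code that expects a C-style string.
std::vector<CodeUnit> makeNullTerminated(const CodeUnit* buf, std::size_t len)
{
    assert(buf != nullptr || len == 0);
    std::vector<CodeUnit> out;
    out.reserve(len + 1);              // one allocation: contents plus terminator
    if (len != 0) {
        out.insert(out.end(), buf, buf + len);
    }
    out.push_back(CodeUnit(0));        // the appended null character
    return out;
}
```

The cost the poster wants to avoid is the single allocation plus copy per string; APIs that accept a pointer and an explicit length avoid it entirely.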

Re: I cannot find the answer after reading FAQ.

2002-04-23 Thread David N Bertoni/Cambridge/IBM
This won't work. An illegal character is an illegal character, even if it's represented as a numeric character reference. Attempting to parse a file with results in the following error: Fatal Error at (file test1.xml, line 4, column 16): Invalid character reference The only way to make
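Since a character reference must also resolve to a legal character, one option is to filter the data before it is written. The sketch below checks code points against the XML 1.0 "Char" production; the helper names and the choice of '?' as a replacement are assumptions for illustration, not something proposed in the thread.

```cpp
#include <cassert>
#include <string>

// True if `cp` matches the XML 1.0 "Char" production:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
bool isXml10Char(char32_t cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}

// Replace every code point the parser would reject before writing the text.
std::u32string replaceIllegalChars(const std::u32string& text,
                                   char32_t replacement = U'?')
{
    assert(isXml10Char(replacement));  // the substitute itself must be legal
    std::u32string out;
    out.reserve(text.size());
    for (char32_t cp : text) {
        out.push_back(isXml10Char(cp) ? cp : replacement);
    }
    return out;
}
```

If the control characters carry meaning, replacing them loses information; encoding the payload (for example as Base64) is the usual alternative.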

Re: I cannot find the answer after reading FAQ.

2002-04-23 Thread Dean Roddey
They are inherently illegal in XML, so it doesn't matter what encoding you use. It's not that they can't be represented in the source encoding, but that the parser won't accept them. You will have to escape them using character refs, e.g.: F; Unless my memory is failing me, this will work beca

I cannot find the answer after reading FAQ.

2002-04-23 Thread Nathan Pitzer
We are attempting to store data in an XML file. This data is encoded ASCII text, and because of this, some of the characters end up falling outside the legal limits for XML characters. Specifically, I am getting this error: Fatal Error at file "C:\natemail.xml", line 2, column 3275(4/23/2002 22

Re: how to access the raw text that generated a sax event

2002-04-23 Thread Dean Roddey
The getSrcOffset() method of XMLScanner should return you the information you want. However, it can only do that if the source offset stuff is supported by the transcoding system being used. For ICU and the internal transcoders that is true. I just looked and in the latest repository files, the Wi

Re: how to access the raw text that generated a sax event

2002-04-23 Thread Jason E. Stewart
"Jason E. Stewart" <[EMAIL PROTECTED]> writes: > Any ideas what to do? I finally broke down and read the source code for XMLScanner and XMLReader and I'm convinced that without a major re-writing, this is not possible. Basically, the XMLReader calls readBytes() on the stream to fill up a buffer

Re: how to access the raw text that generated a sax event

2002-04-23 Thread Jason E. Stewart
"Murphy, James" <[EMAIL PROTECTED]> writes: > Yuck? WTF? It's beautiful! Maybe I should explain some more. > > I followed the same links you described and realized that without some > hacking in and around XMLScanner getting the BinInputStream from the > ReaderMgr is a no go. BTW, getting at

RE: RE: how to access the raw text that generated a sax event

2002-04-23 Thread Murphy, James
Hmm...you're right. We get some value in SAX parsing the initial part of the document before the glut of repeated record structures. That is where we do some "document level" sanity checking and hang onto some other higher level data. Thanks Jim > -Original Message- > From: Dean Rod

Re: RE: how to access the raw text that generated a sax event

2002-04-23 Thread Dean Roddey
If you can impose certain restrictions, don't even use the XML parser. Just do a fast and dirty scan, based on known limitations of the format and break it up yourself at maximum speed. -- Dean Roddey The Charmed Quark Controller Charmed Quark Software [EMAIL PROTECTED] ht
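A "fast and dirty scan" of this kind might be as simple as the sketch below. It assumes the restrictions are known up front: records are not nested, the opening tag carries no attributes, and neither tag appears inside comments, CDATA sections, or attribute values. The function name and the tag names used with it are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split `doc` into the substrings spanning each openTag ... closeTag pair.
// This is plain substring search, not XML parsing.
std::vector<std::string> splitRecords(const std::string& doc,
                                      const std::string& openTag,
                                      const std::string& closeTag)
{
    assert(!openTag.empty() && !closeTag.empty());
    std::vector<std::string> records;
    std::string::size_type pos = 0;
    while ((pos = doc.find(openTag, pos)) != std::string::npos) {
        const std::string::size_type end =
            doc.find(closeTag, pos + openTag.size());
        if (end == std::string::npos) {
            break;  // unterminated record: stop rather than guess
        }
        records.push_back(doc.substr(pos, end + closeTag.size() - pos));
        pos = end + closeTag.size();
    }
    return records;
}
```

For example, calling it with "<rec>" and "</rec>" on a document of repeated rec elements yields one string per record, each of which could then be given to a parser on its own.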

RE: RE: how to access the raw text that generated a sax event

2002-04-23 Thread Murphy, James
You're right of course, that's a very sensible approach. But my client has an XML-based product to handle communication between trading partners. The benefits of XML are significant since it is an integration product and honestly the instance sizes are usually very manageable. But 5% of the tim

RE: RE: how to access the raw text that generated a sax event

2002-04-23 Thread Murphy, James
My XML doesn't get within 100 miles of a DTD. If I care to validate I use schema. The chunks that I find are very well formed XML due to a priori knowledge of the xml structure I'm parsing. They look like: ... ... ... ... ... .

Re: RE: how to access the raw text that generated a sax event

2002-04-23 Thread Dean Roddey
Of course, the counter argument to that is: Use a format that's designed to handle that reasonably. XML isn't, so why use it if it's not an optimal (or even reasonable) format to use for this kind of thing? -- Dean Roddey The Charmed Quark Controller Charmed Quark Software

Re: RE: how to access the raw text that generated a sax event

2002-04-23 Thread Dean Roddey
> I am working on a system that will be responsible for > splitting large XML files into record sized chunks. > These chunks will be handed off to end-users who > want the option of parsing them with whatever parser > they choose. No XML compliant parser should parse such chunks, because they ar

RE: RE: how to access the raw text that generated a sax event

2002-04-23 Thread Murphy, James
Fair enough Dean - I'm sympathetic to your point that Xerces was designed from an InfoSet perspective. That's cool - but when you are writing for performance we are willing to make some Faustian bargains. Especially since, like Jason our environment stipulates single entities anyway. Jim > ---

RE: how to access the raw text that generated a sax event

2002-04-23 Thread Murphy, James
Yuck? WTF? It's beautiful! Maybe I should explain some more. I followed the same links you described and realized that without some hacking in and around XMLScanner getting the BinInputStream from the ReaderMgr is a no go. BTW, getting at XMLScanner from a parser would be real handy for lots o

Re: RE: how to access the raw text that generated a sax event

2002-04-23 Thread tedsandler
> > From: Dean Roddey <[EMAIL PROTECTED]> > Date: 2002/04/23 Tue PM 03:33:45 EDT > To: [EMAIL PROTECTED] > Subject: Re: RE: how to access the raw text that generated a sax event > > The source offset stuff is always relative to the entity, so if you have > internal or external entity references

Re: how to access the raw text that generated a sax event

2002-04-23 Thread Jason E. Stewart
"Jason E. Stewart" <[EMAIL PROTECTED]> writes: > "Murphy, James" <[EMAIL PROTECTED]> writes: > > > BinInputStream::curPos() const; looks promising since the built in > > input sources actually implement it! So you should be able to call > > this in your SAX event handler methods if you provide

Re: how to access the raw text that generated a sax event

2002-04-23 Thread Jason E. Stewart
"Murphy, James" <[EMAIL PROTECTED]> writes: > BinInputStream::curPos() const; looks promising since the built in > input sources actually implement it! So you should be able to call > this in your SAX event handler methods if you provide your event > handler class with the InputSource you use to

Re: how to access the raw text that generated a sax event

2002-04-23 Thread Jason E. Stewart
"Dean Roddey" <[EMAIL PROTECTED]> writes: > Anyway, the whole concept of getting back to the original raw XML > text is counter to what an XML parser is supposed to do, so it's > never going to be easy because it wasn't designed to make that easy > or useful to do. I always argued that we never ev

Re: RE: how to access the raw text that generated a sax event

2002-04-23 Thread Dean Roddey
The source offset stuff is always relative to the entity, so if you have internal or external entity references and such, you are going to have to keep up with that fact. So if an entity reference to an internal general entity contains elements (and it pretty much has to contain whole elements), th

Re: RE: how to access the raw text that generated a sax event

2002-04-23 Thread tedsandler
The other potential solution I've found is the XMLScanner's "getSrcOffset" method. My only fear in using it is that it will give weird results if an XML document is comprised of more than 1 entity. Does "getSrcOffset" treat the document as a continuous sequence of bytes, or is it more low-le

RE: how to access the raw text that generated a sax event

2002-04-23 Thread Murphy, James
Looking through the source... BinInputStream::curPos() const; looks promising since the built in input sources actually implement it! So you should be able to call this in your SAX event handler methods if you provide your event handler class with the InputSource you use to parse. I haven't tri

DOMParser: how to verify if the schema validation was successful

2002-04-23 Thread Carlo Agopian
I have intentionally added a violation (against the external schema) inside the XML document but no error is being reported during: parser->getErrorCount(); and cerr << "Errors from errReporter ->" << errReporter->getSawErrors() << "<-\n";

Re: how to access the raw text that generated a sax event

2002-04-23 Thread Jason E. Stewart
"Murphy, James" <[EMAIL PROTECTED]> writes: > I thought this would be really handy when parsing from a continuous buffer > like a MemBufInputSource or a LocalFileInputSource. I have a situation > where I SAX parse _very_ large XML instances looking for small repeating > fragments. These fragmen

RE: Re: how to access the raw text that generated a sax event

2002-04-23 Thread Murphy, James
I thought this would be really handy when parsing from a continuous buffer like a MemBufInputSource or a LocalFileInputSource. I have a situation where I SAX parse _very_ large XML instances looking for small repeating fragments. These fragments are operated on individually by making a DOM to op

Re: memory allocate in DOM_Node

2002-04-23 Thread Jorge Pozo Ramirez
Both of them return a newly allocated DOMString, hence, newly allocated memory for it. Jorge - Original Message - From: Felipe Micaroni Lalli To: [EMAIL PROTECTED] Sent: Tuesday, April 23, 2002 1:24 AM Subject: memory allocate in DOM_Node Hel

memory allocate in DOM_Node

2002-04-23 Thread Felipe Micaroni Lalli
Hello people, I made a program using DOM Xerces for C++ and the functions: DOM_Node::getNodeName() or DOM_Node::getNodeValue() allocate memory. Any ideas? Thanks, hugs, Felipe.

Incorrect Xerces C tar-ball

2002-04-23 Thread Murty Dasari
--- Murty Dasari <[EMAIL PROTECTED]> wrote: > Hi, > > I was trying to get source-code for latest stable Xerces C parser for > Unix. > > I've downloaded a tar-ball, xerces-c-src1_7.0.tar.gz (Latest Stable > source package for Unix's) from the following download site. > http://xml.apache.org/dist

Re: Re: how to access the raw text that generated a sax event

2002-04-23 Thread tedsandler
The problem with using the "locator" is that it only reports line+column info. Byte offsets into the file would be more helpful for my purposes. -ted > > From: "Joseph Kesselman/CAM/Lotus" <[EMAIL PROTECTED]> > Date: 2002/04/23 Tue AM 08:39:09 EDT > To: [EMAIL PROTECTED] > Subject: Re: how t

Re: how to access the raw text that generated a sax event

2002-04-23 Thread Joseph Kesselman/CAM/Lotus
Best suggestion I've got is to use the SAX "locator" to find the relevant area of the document, then perform your own primitive parsing to extract a moderately meaningful chunk thereof ... but I suspect that's more work than simply using a single parser and routing its SAX events to the app
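To go from a locator's line and column to a position in a buffer you already hold, something like the following could serve as a starting point. It is a sketch under strong assumptions: the whole document is in memory as a single entity, lines end in '\n', every character is one byte, and line and column are 1-based. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the 0-based offset of (line, column), both 1-based, or
// std::string::npos if the position lies outside `text`.
// A column past the end of its line is not detected.
std::size_t offsetOf(const std::string& text, std::size_t line, std::size_t column)
{
    assert(line >= 1 && column >= 1);
    std::size_t lineStart = 0;
    for (std::size_t i = 1; i < line; ++i) {
        const std::size_t nl = text.find('\n', lineStart);
        if (nl == std::string::npos) {
            return std::string::npos;  // fewer lines than requested
        }
        lineStart = nl + 1;
    }
    const std::size_t offset = lineStart + (column - 1);
    return offset <= text.size() ? offset : std::string::npos;
}
```

With multi-byte encodings the column no longer equals a byte count, which is why a byte offset reported by the parser itself would be more convenient here.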

Setting the default validator.

2002-04-23 Thread Peter A. Volchek
I need a way to tell the XMLScanner to use the default validator. The one is actually created during the XMLScanner creation (fDTDValidator) when valToAdopt=NULL is passed to its constructor. But for my needs I need to change the validator dynamically. The XMLScanner::setValidator(XMLValida