My suggestion is to make your class "fake" a root tag, which it prepends to the
actual data. Here bytesRead is a counter you keep in your class:
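XMLSize_t MyFileInputStream::readBytes(XMLByte* const toFill,
                                       const XMLSize_t maxToRead) {
    // Assumes bytesRead is an XMLSize_t member; std::min needs <algorithm>.
    static const char fakeRoot[] = "<root>";
    const XMLSize_t fakeRootLen = 6;
    if (bytesRead == 0) {
        fseek(...fp...); // skip front of file
    }
    // toFill and maxToRead are const, so advance local copies instead
    XMLByte* out = toFill;
    XMLSize_t remaining = maxToRead;
    XMLSize_t bytesReturned = 0;
    if (bytesRead < fakeRootLen) {
        // copy only the part of the fake tag not yet delivered
        const XMLSize_t bytesToCopy = std::min(remaining, fakeRootLen - bytesRead);
        memcpy(out, fakeRoot + bytesRead, bytesToCopy);
        bytesRead += bytesToCopy;
        remaining -= bytesToCopy;
        out += bytesToCopy;
        bytesReturned += bytesToCopy;
    }
    // normal file I/O
    bytesReturned += fread(out, 1, remaining, fp);
    return bytesReturned;
}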
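For trying the idea out without Xerces, here is a minimal self-contained sketch of the same technique. FakeRootReader and readAll are hypothetical names, the two typedefs are stand-ins for the Xerces ones, and the skip offset is passed in by the caller. It is only an illustration of the prepend logic, not a drop-in BinInputStream:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Stand-ins for the Xerces typedefs so this compiles without Xerces.
typedef unsigned char XMLByte;
typedef std::size_t   XMLSize_t;

// Hypothetical reader: skips `skip` bytes of the file, then serves "<root>"
// followed by the remaining file content.
class FakeRootReader {
public:
    FakeRootReader(std::FILE* fp, long skip)
        : fp_(fp), skip_(skip), bytesRead_(0) {}

    XMLSize_t readBytes(XMLByte* const toFill, const XMLSize_t maxToRead) {
        static const char kRoot[] = "<root>";
        const XMLSize_t kRootLen = sizeof(kRoot) - 1;  // exclude the '\0'

        if (bytesRead_ == 0) {
            std::fseek(fp_, skip_, SEEK_SET);  // skip front of file
        }
        XMLByte*  out = toFill;
        XMLSize_t remaining = maxToRead;
        XMLSize_t returned = 0;
        if (bytesRead_ < kRootLen) {
            // Serve the part of the fake tag that has not been delivered yet.
            const XMLSize_t n = std::min(remaining, kRootLen - bytesRead_);
            std::memcpy(out, kRoot + bytesRead_, n);
            bytesRead_ += n;
            remaining -= n;
            out += n;
            returned += n;
        }
        returned += std::fread(out, 1, remaining, fp_);  // normal file I/O
        return returned;
    }

private:
    std::FILE* fp_;
    long       skip_;
    XMLSize_t  bytesRead_;  // counts only the fake-tag bytes served so far
};

// Drains the reader in chunks of `chunk` bytes, the way a parser might.
std::string readAll(FakeRootReader& reader, XMLSize_t chunk) {
    std::vector<XMLByte> buf(chunk);
    std::string result;
    for (;;) {
        const XMLSize_t n = reader.readBytes(buf.data(), chunk);
        if (n == 0) break;  // zero bytes means end of input
        result.append(reinterpret_cast<const char*>(buf.data()), n);
    }
    return result;
}
```

Reading in chunks smaller than the tag exercises the case where "<root>" is delivered across several calls. Note that the sketch does not append a matching end tag, so a parser would still see an unclosed root element at end of input.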
The other issue you will have is that the SAX parser is not a state machine.
It won't return from parsing and let you pick up later where it left off.
There are two ways of dealing with this that I can think of. The simpler is to
just make a new parser and fake a new root tag every time you see more input.
The harder is to make your file reader run in a separate thread, and
communicate to your inputsource object in the foreground thread that more data
is available via a semaphore or other thread-synchronization primitive. It
depends on the requirements of your processing model.
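For the threaded variant, the hand-off between the two threads could look roughly like the sketch below. It uses a mutex and condition variable rather than a semaphore, and BlockingByteQueue and drain are hypothetical names, not anything from Xerces. The idea is that your readBytes() would call read(), which blocks the parser thread until the file-reader thread pushes more data or signals the end:

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical byte queue shared by the file-reader thread (producer)
// and the parser thread (consumer).
class BlockingByteQueue {
public:
    // Producer: append bytes and wake the waiting consumer.
    void push(const std::string& data) {
        {
            std::lock_guard<std::mutex> lock(m_);
            buf_.insert(buf_.end(), data.begin(), data.end());
        }
        cv_.notify_one();
    }

    // Producer: signal that no more data will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_one();
    }

    // Consumer: block until data is available or the queue is closed.
    // Returns 0 only once the queue is closed and empty (or maxToRead is 0).
    std::size_t read(unsigned char* toFill, std::size_t maxToRead) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !buf_.empty() || closed_; });
        const std::size_t n = std::min(maxToRead, buf_.size());
        const std::ptrdiff_t count = static_cast<std::ptrdiff_t>(n);
        std::copy(buf_.begin(), buf_.begin() + count, toFill);
        buf_.erase(buf_.begin(), buf_.begin() + count);
        return n;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<unsigned char> buf_;
    bool closed_ = false;
};

// Reads until end of input, the way a parser would pull from readBytes().
std::string drain(BlockingByteQueue& q) {
    std::string out;
    unsigned char chunk[8];
    while (std::size_t n = q.read(chunk, sizeof chunk)) {
        out.append(reinterpret_cast<const char*>(chunk), n);
    }
    return out;
}
```

The trade-off is that the parser thread stays blocked inside parse() between updates, so the handler callbacks fire on that thread as data trickles in.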
john
-----Original Message-----
From: Galande, Manish [mailto:[email protected]]
Sent: Thursday, October 22, 2009 1:28 AM
To: [email protected]
Subject: RE: Reading portions of XML file.
I'm able to reposition the file pointer and start reading from there. But
now the problem is that I can read only one tag at a time. The first tag
encountered during the read is treated as the root tag, and the
startDocument/endDocument events are generated accordingly, whereas I want to
continue reading. Here is a sample XML file:
<root>
<rec>rec1</rec>
<rec>rec2</rec>
<rec>rec3</rec>
</root>
Initially the file contains only two lines, and more lines are added at
regular intervals. My intent is to parse whatever the file currently contains
and not to parse the same content again the next time I read the file.
The approach I'm thinking of is to read up to the root tag in the XML file,
then reposition the file pointer and start reading from there. But I'm not
sure whether this can be done. Is there any way to do this?
Thanks.
Manish
-----Original Message-----
From: Galande, Manish
Sent: Friday, October 09, 2009 6:58 PM
To: '[email protected]'
Subject: RE: Reading portions of XML file.
True. I realized it after I sent the earlier reply. Until then I was thinking
only of the Xerces APIs, but Xerces too implements its platform-dependent code
using these native calls.
Manish
-----Original Message-----
From: John Lilley [mailto:[email protected]]
Sent: Friday, October 09, 2009 6:33 PM
To: [email protected]
Subject: RE: Reading portions of XML file.
You have to implement that in your class. Presumably your class will contain a
file handle or object. Call fseek() or lseek() as appropriate.
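As a rough illustration of what that could look like, here is a minimal sketch using plain C stdio. OffsetFileReader is a hypothetical name and the class deliberately does not derive from BinInputStream, so that it compiles on its own. It seeks once in the constructor and reports the file position through curPos(), which you can save and pass in as the start offset next time:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical reader that starts at a caller-supplied byte offset.
class OffsetFileReader {
public:
    OffsetFileReader(std::FILE* fp, long startOffset) : fp_(fp) {
        // Move to the offset of interest before the first read.
        std::fseek(fp_, startOffset, SEEK_SET);
    }

    // Plain file read; returns the number of bytes actually read.
    std::size_t readBytes(unsigned char* toFill, std::size_t maxToRead) {
        return std::fread(toFill, 1, maxToRead, fp_);
    }

    // Current absolute position in the file.
    long curPos() const { return std::ftell(fp_); }

private:
    std::FILE* fp_;
};
```

A later pass can then construct a new reader with the saved curPos() value and continue from where the previous one stopped.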
john
-----Original Message-----
From: Galande, Manish [mailto:[email protected]]
Sent: Friday, October 09, 2009 5:52 AM
To: [email protected]
Subject: RE: Reading portions of XML file.
Thanks John.
But how would I move to the offset of interest once I already have it?
I'm not aware of any lseek-like call in Xerces.
Manish
-----Original Message-----
From: John Lilley [mailto:[email protected]]
Sent: Thursday, October 08, 2009 7:04 PM
To: [email protected]
Subject: RE: Reading portions of XML file.
Er, a bit cleaner snippet:
You can subclass InputSource and BinInputStream, then implement them however
you like. In this snippet, XN:: is defined as the Xerces namespace.
class MyFileInputStream : public XN::BinInputStream {
public:
    MyFileInputStream(...) {}
    virtual XMLFilePos curPos() const { return ... }
    virtual XMLSize_t readBytes(XMLByte* const toFill,
                                const XMLSize_t maxToRead) {
        return ...;
    }
    // No "out-of-band" content type
    virtual const XMLCh* getContentType() const { return 0; }
private:
    ...
};
class MyFileInputSource : public XN::InputSource {
public:
    MyFileInputSource(...) {}
    virtual XN::BinInputStream* makeStream() const {
        // Caller is owner of stream
        return new MyFileInputStream(...);
    }
private:
    ...
};
john
-----Original Message-----
From: Galande, Manish [mailto:[email protected]]
Sent: Thursday, October 08, 2009 7:12 AM
To: [email protected]
Subject: Reading portions of XML file.
Hi,
I want to read portions of an XML file based on byte position, using
getSrcOffset(). Is there a way I can directly start reading/parsing from a
pre-defined offset in the source, similar to lseek and read?
Thanks.
Manish