Try this path: SAXParser().getScanner().getSrcOffset()

The catch is that getScanner() is protected, so you would have to
derive your own class from SAXParser to get access to it.

It should then give you the number of characters consumed by the
XMLReader so far, which is the current file position.
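
Once you have a start and an end offset for <Header>, pulling that
slice out of the file for your DOM code could look roughly like the
sketch below. It is untested against your data, and readByteRange is a
name I made up, not a Xerces function. It also assumes the offsets can
be used as byte positions, which I would only expect to hold for a
single-byte encoding, and that you have already adjusted the start
offset to point at the '<' of the start tag, since the parser has
normally consumed the whole tag by the time startElement is called.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Xerces-C++): read the bytes in
// [begin, end) from a file into a string.
// Assumption: begin/end are byte offsets into the file on disk.
std::string readByteRange(const std::string& path,
                          std::size_t begin, std::size_t end)
{
    assert(begin <= end);

    // Binary mode, so no newline translation shifts the offsets.
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);

    in.seekg(static_cast<std::streamoff>(begin), std::ios::beg);

    std::string buf(end - begin, '\0');
    in.read(&buf[0], static_cast<std::streamsize>(buf.size()));

    // Keep only what was actually read, in case the range runs
    // past the end of the file.
    buf.resize(static_cast<std::size_t>(in.gcount()));
    return buf;
}
```

The returned string could then be wrapped in whatever in-memory input
your DOM parser accepts. The same helper would serve the look-up table
idea: store one (begin, end) pair per <Frame> and read frames on demand.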

/ Erik

> -----Original Message-----
> From: Pete Hodgson [mailto:[EMAIL PROTECTED]
> Sent: den 16 november 2004 16:53
> To: [EMAIL PROTECTED]
> Subject: Accessing file position information during a SAX parse
> 
> Hi everyone,
> 
> I was hoping for some advice regarding a problem my team is facing
> related to SAX parsing in Xerces-C++. I'm new to Xerces, and SAX in
> general, so please forgive any stupidity!
> 
> The application we're developing is processing /very/ large XML files
> that contain time-series data looking something like this:
> 
> <Root>
>    <Header>
>      <SomeMetaData>
>      <SomeMoreMetaData>
>      ...
>      ...
>    </Header>
> 
>    <Frame id="1">
>      <LotsOfData>
>      <LotsMoreData>
>      <YetMoreData>
>      ...
>    </Frame>
>    <Frame id="2">
>      ...
>    </Frame>
>    <Frame id="3">
>      ...
>    </Frame>
>    ...
>    ...
>    ...
> </Root>
> 
> We've been using progressive parsing SAX to read the <Frame> data from
> these XML files, which works great because we can deal with it as a
> stream without having to read the entire file up front.
> 
> We've also been using the MSXML DOM implementation to read <Header>
> data with the same Schema as the <Header> element in the time-series
> files, but from other, small files.
> 
> The problem now is that we wish to access the <Header> data in these
> extremely large files. We don't want to use DOM to parse the entire
> file (for efficiency issues), but we'd like to re-use the existing
> DOM-based implementation that we have for reading the <Header> schema
> (rather than implementing a new SAX parser for the <Header>).
> 
> So, I guess my question is, is there a way to discover the exact file
> location of an Element as it's encountered during a SAX parse? If we
> could get the location we could manually read the entire <Header>
> section into a string and DOM-parse the string. We'd also like to be
> able to access file location information for other reasons, such as to
> pre-parse the files and build a 'look up table' for the XML file, so
> that a particular section of the time series can be read in on demand
> with the help of a custom LocalFileSource.
> 
> The closest thing I've found is Locator, but that doesn't help because
> it gives you a line and column, rather than an absolute location
> within the file. I looked into peeking at the BinInputStream that the
> SAX2XMLReader is using, but that doesn't work because the stream is
> read in chunks, so calling BinInputStream::curPos() when the Header
> element is encountered doesn't supply the exact location either. I
> know that that would have been a kludgy solution anyways, but it
> would have served our purposes.
> 
> Any suggestions on how best to solve this one?
> 
> Many Thanks,
> 
> Pete Hodgson
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

