Re: Derby received an error "ERROR XSDG0: Page Page(1325564,Container(0, 30832)) could not be read from disk."

Bryan Pendleton Thu, 03 Sep 2015 17:34:18 -0700

ERROR XSDG0: Page Page(1325564,Container(0, 30832)) could not be read from disk.
Caused by: java.io.EOFException: Reached end of file while attempting to read a whole page.


Does the derby.log have any more detail about this specific exception?

Note that you can use the system tables (SYSCONGLOMERATES, I believe)
to figure out which table corresponds to conglomerate 30832, and you
can also multiply 1325564 by the page size of your table to figure out
roughly what the file size was at the instant that this happened.

Assuming your page size was 4096, 1325564 * 4096 is 5,429,510,144, so
that conglomerate should be about 5.4 GB in size.
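
To make that arithmetic concrete, here is a minimal Java sketch. It assumes a
4096-byte page size (check the actual page size of your table), that page N
starts at byte offset N * pageSize, and that the container file lives under
seg0 with a name of the form c<hex conglomerate number>.dat. The class and
method names are made up for illustration.

```java
import java.io.File;

public class ContainerSizeCheck {

    /** Byte offset at which a page starts, assuming fixed-size pages. */
    public static long pageOffset(long pageNumber, int pageSize) {
        return pageNumber * pageSize;
    }

    /** Smallest file length that still contains the whole page. */
    public static long minLengthForPage(long pageNumber, int pageSize) {
        return (pageNumber + 1) * pageSize;
    }

    /** Assumed naming convention: "c" + hex conglomerate number + ".dat". */
    public static String containerFileName(long conglomerateNumber) {
        return "c" + Long.toHexString(conglomerateNumber) + ".dat";
    }

    public static void main(String[] args) {
        long page = 1325564L;
        long conglomerate = 30832L;
        int pageSize = 4096; // assumption, not taken from the failing system

        System.out.println("page offset       = " + pageOffset(page, pageSize));
        System.out.println("min file length   = " + minLengthForPage(page, pageSize));
        System.out.println("container file    = " + containerFileName(conglomerate));

        // Optional: pass the database directory to compare with the real file.
        if (args.length > 0) {
            File f = new File(new File(args[0], "seg0"),
                              containerFileName(conglomerate));
            System.out.println(f + " length = " + f.length());
        }
    }
}
```

If the actual file turns out to be shorter than the minimum length printed
here, that would be consistent with the "Reached end of file" message.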


Derby then reported errors like:
org.apache.derby.iapi.error.ShutdownException:


This is normal, I believe.

java.lang.NullPointerException
         at org.apache.derby.impl.drda.DRDAConnThread.writePBSD(Unknown Source)
         at org.apache.derby.impl.drda.DRDAConnThread.processCommands(Unknown Source)
         at org.apache.derby.impl.drda.DRDAConnThread.run(Unknown Source)


This is scary, but it appears to have happened AFTER the shutdown, and hence
may be some secondary, unrelated bug in the network server code related to
not handling a shutdown correctly. It seems worth investigating separately.

The system is an Oracle M5000 Enterprise server with what I believe is a 15TB
Sun ZFS Storage 7320 external ZFS storage array connected by Fibre Channel.
This is the first time in over 8 years we have seen an I/O error like this.

What I am trying to confirm is that this is really low-level Derby code, and
that if it reports a “java.io.EOFException” like it did, it really did hit an
I/O error somewhere while reading the page from the container file. Things
like performance, Java heap space, etc., can pretty much be ruled out as
causing such an error. My gut feeling is that something in the connection to
this storage array had a hiccup. This setup is at the customer site and I
cannot directly access the system logs, nor do I know how this storage array
works or how to inspect it, but just having confirmation that an I/O error
really did occur would help.


This is good information to have.

My feeling is that you should do a more thorough investigation of the
specific conglomerate in question, to check for errors that might
not be showing up using your regular application access patterns.
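
One way to do that, sketched here in JDBC, is to look up the table that owns
the conglomerate and then run Derby's consistency check on it. The JDBC URL
below is a placeholder, the class name is made up, and I have not run this
against your database, so treat it as a starting point only. I am also
assuming that SYSCS_UTIL.SYSCS_CHECK_TABLE accepts parameter markers for its
two arguments; if it does not, substitute the literal names.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class CheckConglomerate {

    /** Maps a conglomerate number to its schema, table and conglomerate name. */
    public static final String LOOKUP_SQL =
        "SELECT s.SCHEMANAME, t.TABLENAME, c.CONGLOMERATENAME, c.ISINDEX "
      + "FROM SYS.SYSCONGLOMERATES c "
      + "JOIN SYS.SYSTABLES t ON c.TABLEID = t.TABLEID "
      + "JOIN SYS.SYSSCHEMAS s ON t.SCHEMAID = s.SCHEMAID "
      + "WHERE c.CONGLOMERATENUMBER = ?";

    /** Consistency check of a table and its indexes; throws on inconsistency. */
    public static final String CHECK_SQL =
        "VALUES SYSCS_UTIL.SYSCS_CHECK_TABLE(?, ?)";

    public static void main(String[] args) throws SQLException {
        // Placeholder URL: pass the real one as the first argument.
        String url = args.length > 0 ? args[0] : "jdbc:derby:/path/to/db";
        long conglomerate = 30832L;

        try (Connection conn = DriverManager.getConnection(url)) {
            // Collect the owning table(s) first, then run the checks.
            List<String[]> owners = new ArrayList<>();
            try (PreparedStatement ps = conn.prepareStatement(LOOKUP_SQL)) {
                ps.setLong(1, conglomerate);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println("conglomerate " + rs.getString(3)
                            + " (index=" + rs.getBoolean(4) + ") belongs to "
                            + rs.getString(1) + "." + rs.getString(2));
                        owners.add(new String[] {rs.getString(1), rs.getString(2)});
                    }
                }
            }
            for (String[] owner : owners) {
                try (PreparedStatement ps = conn.prepareStatement(CHECK_SQL)) {
                    ps.setString(1, owner[0]);
                    ps.setString(2, owner[1]);
                    try (ResultSet rs = ps.executeQuery()) {
                        if (rs.next()) {
                            System.out.println("check result: " + rs.getInt(1));
                        }
                    }
                }
            }
        }
    }
}
```

The check reads through the whole table and its indexes, so on a conglomerate
of this size it may take a while, and it is best run against a copy or during
a quiet period.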

Also, if you can find any more information in the derby log, it would
be nice to know.

Thanks for sharing the information that you do have; it is quite
interesting to know what your experience is!

bryan

P.S. I believe this is the code that threw the java.io.EOFException:

    /**
     * Attempts to fill buf completely from start until it's full.
     * <p/>
     * FileChannel has no readFull() method, so we roll our own.
     * <p/>
     * @param dstBuffer buffer to read into
     * @param srcChannel channel to read from
     * @param position file position from where to read
     *
     * @throws IOException if an I/O error occurs while reading
     * @throws StandardException If thread is interrupted.
     */
    private void readFull(ByteBuffer dstBuffer,
                          FileChannel srcChannel,
                          long position)
            throws IOException, StandardException
    {
        while(dstBuffer.remaining() > 0) {
            if (srcChannel.read(dstBuffer,
                                    position + dstBuffer.position()) == -1) {
                throw new EOFException(
                    "Reached end of file while attempting to read a "
                    + "whole page.");
            }

            // (**) Sun Java NIO is weird: it can close the channel due to an
            // interrupt without throwing if bytes got transferred. Compensate,
            // so we can clean up.  Bug 6979009,
            // http://bugs.sun.com/view_bug.do?bug_id=6979009
            if (Thread.currentThread().isInterrupted() &&
                    !srcChannel.isOpen()) {
                throw new ClosedByInterruptException();
            }
        }
    }
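
For what it's worth, the EOFException in that method is thrown when
FileChannel.read returns -1, which the JDK documents as meaning the requested
position is at or beyond the current end of the file. The standalone sketch
below reproduces that behaviour with a temporary file that is one and a half
pages long. It is a simplified stand-in for the method above (no interrupt
handling), and the class and helper names are made up.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ShortFileDemo {

    /** Fill dst completely starting at position, or throw EOFException. */
    public static void readFull(ByteBuffer dst, FileChannel ch, long position)
            throws IOException {
        while (dst.remaining() > 0) {
            // read() returns -1 once the position is at or past end of file.
            if (ch.read(dst, position + dst.position()) == -1) {
                throw new EOFException(
                    "Reached end of file while attempting to read a whole page.");
            }
        }
    }

    /** Returns true if reading the given page runs into end of file. */
    public static boolean hitsEof(Path file, long pageNumber, int pageSize)
            throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            readFull(ByteBuffer.allocate(pageSize), ch, pageNumber * pageSize);
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        int pageSize = 4096;
        Path tmp = Files.createTempFile("container", ".dat");
        try {
            // One full page followed by half a page.
            Files.write(tmp, new byte[pageSize + pageSize / 2]);
            System.out.println("page 0 hits EOF: " + hitsEof(tmp, 0, pageSize));
            System.out.println("page 1 hits EOF: " + hitsEof(tmp, 1, pageSize));
            System.out.println("page 2 hits EOF: " + hitsEof(tmp, 2, pageSize));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Page 0 reads cleanly, while pages 1 and 2 both end in the EOFException: one
because the page is only partly present, the other because it lies entirely
past the end of the file. So the message by itself says the file looked too
short to the JVM at that moment. It does not say why.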
