[
https://issues.apache.org/jira/browse/DERBY-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013873#comment-17013873
]
Richard N. Hillegas commented on DERBY-7007:
--------------------------------------------
Did you get to the bottom of this problem? Is there any further advice which
you can give the community in case someone else trips across this behavior?
Thanks.
> Random IOException: Bad file descriptor on new server platform
> --------------------------------------------------------------
>
> Key: DERBY-7007
> URL: https://issues.apache.org/jira/browse/DERBY-7007
> Project: Derby
> Issue Type: Bug
> Components: Miscellaneous
> Affects Versions: 10.12.1.1
> Environment: Linux: SUSE Linux Enterprise Server for SAP Applications
> 12 SP3 (x86_64)
> Kernel: 4.4.126-94.22-default #1 SMP Wed Apr 11 07:45:03 UTC 2018 (9649989)
> x86_64 x86_64 x86_64 GNU/Linux
> Filesystem: /dev/mapper/appsvg-lvapps on /opt/apps type ext3
> (rw,relatime,data=ordered)
> Java: IBM 7.1-4.15
> Tomcat: 7.0.85
> Reporter: Ralf Schubert
> Priority: Blocker
>
> Our customer is migrating to a new server platform. We currently run several
> applications on their old server platform, and they run well there. On the
> new platform, however, random Derby errors occur reproducibly. We and the
> customer have been analysing them for several months. The deeper we dig, the
> more clueless we are, and it looks more and more like a Derby bug.
> We would be grateful if somebody could look into this and tell us whether
> this is a bug in Derby, or whether you have other ideas about what could
> cause Derby to behave like this.
> h2. Situation
> We have one application which includes several embedded Derby databases.
> After the server starts, the application behaves normally for a few
> minutes. Then one of the Derby DBs (accessed through Java Hibernate in
> Derby embedded mode) first shows an error like this on a random Derby
> file (the file varies each time):
> {code:java}
> Local derby log (/home/tomcat_i36/derby.log):
>
> ------------ Begin Shutdown Error Stack -------------
> ERROR XSDG3: Meta-data for Container(0, 33904) could not be accessed to clean /opt/apps/tomcat/i36/webapps/XXXXX/database/XX/seg0/c8470.dat
> at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer.clean(Unknown Source)
> at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(Unknown Source)
> at org.apache.derby.impl.services.cache.ConcurrentCache.cleanEntry(Unknown Source)
> at org.apache.derby.impl.services.cache.BackgroundCleaner.performWork(Unknown Source)
> at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(Unknown Source)
> at org.apache.derby.impl.services.daemon.BasicDaemon.work(Unknown Source)
> at org.apache.derby.impl.services.daemon.BasicDaemon.run(Unknown Source)
> at java.lang.Thread.run(Thread.java:809)
> Caused by: java.io.IOException: Bad file descriptor
> at sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
> at sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:65)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
> at sun.nio.ch.IOUtil.read(IOUtil.java:210)
> at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:754)
> at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:739)
> at org.apache.derby.impl.store.raw.data.RAFContainer4.readFull(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage0(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer4.getEmbryonicPage(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer.writeRAFHeader(Unknown Source)
> ... 8 more
> ============= begin nested exception, level (1) ===========
> java.io.IOException: Bad file descriptor
> at sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
> at sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:65)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
> at sun.nio.ch.IOUtil.read(IOUtil.java:210)
> at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:754)
> at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:739)
> at org.apache.derby.impl.store.raw.data.RAFContainer4.readFull(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage0(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer4.getEmbryonicPage(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer.writeRAFHeader(Unknown Source)
> at org.apache.derby.impl.store.raw.data.RAFContainer.clean(Unknown Source)
> at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(Unknown Source)
> at org.apache.derby.impl.services.cache.ConcurrentCache.cleanEntry(Unknown Source)
> at org.apache.derby.impl.services.cache.BackgroundCleaner.performWork(Unknown Source)
> at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(Unknown Source)
> at org.apache.derby.impl.services.daemon.BasicDaemon.work(Unknown Source)
> at org.apache.derby.impl.services.daemon.BasicDaemon.run(Unknown Source)
> at java.lang.Thread.run(Thread.java:809)
> ============= end nested exception, level (1) ===========
> ------------ End Shutdown Error Stack ------------{code}
> After this happens, the DB behaves erratically, throwing random errors (e.g.
> reporting that a column is missing from a table although it is there, or
> reporting that the DB is corrupt).
> Hint: The application only has READ access to those databases. We do not
> write any data to them.
> This only happens to one single DB, but it is the most complex one in the
> application. Restarting the server makes it WORK for a few minutes again!
> We deploy the exact same WAR file to the old and new platforms for testing.
> h2. Already analysed
> We have already tried several things and performed several analysis steps:
> # Turning off the antivirus solution (Trend Micro Deep Security Agent) did
> not help.
> # Exchanging the servers of the new server platform for another set of
> servers with the same setup did not help.
> # Comparing a SHA1 hash of the "corrupt" files with the original files
> showed that the files are IDENTICAL.
> # Copying the "corrupt" DB to another system and testing it there works as
> expected, without issues.
> # Running an integrity check on the DB shows no problems.
> # Checking the file permissions on the problematic servers shows no problems:
> {code:java}
> # ls -l /opt/apps/tomcat/i36/webapps/XXXX/database/XX/seg0/c8470.dat
> -rw-r--r-- 1 tomcat_i36 tomcat 16384 Aug 21 09:32 /opt/apps/tomcat/i36/webapps/XXXXX/database/XX/seg0/c8470.dat
>
> # file /opt/apps/tomcat/i36/webapps/XXXXX/database/XX/seg0/c8470.dat
> /opt/apps/tomcat/i36/webapps/XXXXX/database/XX/seg0/c8470.dat: data{code}
> # Checking whether any Linux limits (e.g. the open files limit) were reached:
> nothing found.
> # Checking for a corrupt file system: ext3 is used on the old and new
> platforms, and no indication of corrupt files was found.
> # Upgrading Derby from 10.11.1.1 to 10.12.1.1 did not fix the issue.
> h2. The server environments
> h3. OLD environment (working well)
> {code:java}
> Linux: SUSE Linux Enterprise Server 11 SP4 (s390x)
> Kernel: 3.0.101-91-default #1 SMP Mon Dec 12 13:06:13 UTC 2016 (544b9d1)
> s390x s390x s390x GNU/Linux
> Filesystem: /dev/mapper/appsvg-lvapps on /opt/apps type ext3
> (rw,acl,user_xattr)
> Java: IBM 7.1-4.1
> Tomcat: 7.0.70{code}
> h3. NEW environment (not working)
> {code:java}
> Linux: SUSE Linux Enterprise Server for SAP Applications 12 SP3 (x86_64)
> Kernel: 4.4.126-94.22-default #1 SMP Wed Apr 11 07:45:03 UTC 2018 (9649989)
> x86_64 x86_64 x86_64 GNU/Linux
> Filesystem: /dev/mapper/appsvg-lvapps on /opt/apps type ext3
> (rw,relatime,data=ordered)
> Java: IBM 7.1-4.15
> Tomcat: 7.0.85{code}
> Our customer has to migrate the server platforms very soon, so we would be
> very glad if someone could assist us in checking and resolving this.
> Thanks!
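
For anyone who wants to repeat the hash comparison from the "Already analysed" list above (comparing a SHA1 hash of a suspect file such as seg0/c8470.dat against the original copy), here is a minimal sketch in plain Java. It uses only JDK classes and does not touch Derby itself. The class name DerbyFileDigest and the methods sha1Hex and sameContent are hypothetical names chosen for illustration; the two file paths are assumed to be passed on the command line.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Derby: compares two files by SHA-1 digest.
final class DerbyFileDigest {

    /** Returns the SHA-1 digest of the file's bytes as lowercase hex. */
    public static String sha1Hex(Path file)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] buf = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            // Stream the file in chunks so large containers need little memory.
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** True when both files have the same SHA-1 digest. */
    public static boolean sameContent(Path a, Path b)
            throws IOException, NoSuchAlgorithmException {
        return sha1Hex(a).equals(sha1Hex(b));
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: DerbyFileDigest <original> <suspect>");
            return;
        }
        Path original = Paths.get(args[0]);
        Path suspect = Paths.get(args[1]);
        System.out.println(sha1Hex(original) + "  " + original);
        System.out.println(sha1Hex(suspect) + "  " + suspect);
        System.out.println(sameContent(original, suspect) ? "IDENTICAL" : "DIFFERENT");
    }
}
```

Identical digests, as the reporter observed, suggest the on-disk bytes are intact, which is consistent with the failure arising while the file is being read rather than from damaged data. The comparison cannot by itself identify the cause of the "Bad file descriptor" error.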