[jira] [Updated] (DERBY-7007) Random IOException: Bad file descriptor on new server platform

Rick Hillegas (JIRA) Tue, 21 Aug 2018 18:42:36 -0700


     [ 
https://issues.apache.org/jira/browse/DERBY-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Rick Hillegas updated DERBY-7007:
---------------------------------
    Urgency: Normal  (was: Blocker)

> Random IOException: Bad file descriptor on new server platform
> --------------------------------------------------------------
>
>                 Key: DERBY-7007
>                 URL: https://issues.apache.org/jira/browse/DERBY-7007
>             Project: Derby
>          Issue Type: Bug
>          Components: Miscellaneous
>    Affects Versions: 10.12.1.1
>         Environment: Linux: SUSE Linux Enterprise Server for SAP Applications 
> 12 SP3  (x86_64)
> Kernel: 4.4.126-94.22-default #1 SMP Wed Apr 11 07:45:03 UTC 2018 (9649989) 
> x86_64 x86_64 x86_64 GNU/Linux
> Filesystem: /dev/mapper/appsvg-lvapps on /opt/apps type ext3 
> (rw,relatime,data=ordered)
> Java: IBM 7.1-4.15
> Tomcat: 7.0.85
>            Reporter: Ralf Schubert
>            Priority: Blocker
>
> Our customer is migrating to a new server platform. Several of our 
> applications currently run on their old server platform without problems. 
> On the new platform, however, seemingly random Derby errors occur 
> reproducibly. We and the customer have been analysing them for several 
> months. The deeper we dig, the more clueless we are, and it looks more and 
> more like a DERBY bug.
> We would be grateful if somebody could look into this and tell us whether 
> this is a bug in Derby, or suggest what else could cause Derby to behave 
> like this.
> h2. Situation
> We have one application which includes several embedded DERBY databases. 
> After the server starts, the application behaves normally for a few 
> minutes. Then one of the Derby DBs (accessed through Java Hibernate using 
> DERBY embedded mode) reports an error like the following on a random Derby 
> file (the file varies each time):
> {code:java}
> Local derby log (/home/tomcat_i36/derby.log):
>  
> ------------  Begin Shutdown Error Stack -------------
> ERROR XSDG3: Meta-data for Container(0, 33904) could not be accessed to clean /opt/apps/tomcat/i36/webapps/XXXXX/database/XX/seg0/c8470.dat
>         at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
>         at org.apache.derby.impl.store.raw.data.RAFContainer.clean(Unknown Source)
>         at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(Unknown Source)
>         at org.apache.derby.impl.services.cache.ConcurrentCache.cleanEntry(Unknown Source)
>         at org.apache.derby.impl.services.cache.BackgroundCleaner.performWork(Unknown Source)
>         at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(Unknown Source)
>         at org.apache.derby.impl.services.daemon.BasicDaemon.work(Unknown Source)
>         at org.apache.derby.impl.services.daemon.BasicDaemon.run(Unknown Source)
>         at java.lang.Thread.run(Thread.java:809)
> Caused by: java.io.IOException: Bad file descriptor
>         at sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
>         at sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:65)
>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
>         at sun.nio.ch.IOUtil.read(IOUtil.java:210)
>         at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:754)
>         at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:739)
>         at org.apache.derby.impl.store.raw.data.RAFContainer4.readFull(Unknown Source)
>         at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage0(Unknown Source)
>         at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(Unknown Source)
>         at org.apache.derby.impl.store.raw.data.RAFContainer4.getEmbryonicPage(Unknown Source)
>         at org.apache.derby.impl.store.raw.data.RAFContainer.writeRAFHeader(Unknown Source)
>         ... 8 more
> ============= begin nested exception, level (1) ===========
> java.io.IOException: Bad file descriptor
>         at sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
>         at sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:65)
>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
>         at sun.nio.ch.IOUtil.read(IOUtil.java:210)
>         at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:754)
>         at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:739)
>         at org.apache.derby.impl.store.raw.data.RAFContainer4.readFull(Unknown Source)
>         at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage0(Unknown Source)
>         at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(Unknown Source)
>         at org.apache.derby.impl.store.raw.data.RAFContainer4.getEmbryonicPage(Unknown Source)
>         at org.apache.derby.impl.store.raw.data.RAFContainer.writeRAFHeader(Unknown Source)
>         at org.apache.derby.impl.store.raw.data.RAFContainer.clean(Unknown Source)
>         at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(Unknown Source)
>         at org.apache.derby.impl.services.cache.ConcurrentCache.cleanEntry(Unknown Source)
>         at org.apache.derby.impl.services.cache.BackgroundCleaner.performWork(Unknown Source)
>         at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(Unknown Source)
>         at org.apache.derby.impl.services.daemon.BasicDaemon.work(Unknown Source)
>         at org.apache.derby.impl.services.daemon.BasicDaemon.run(Unknown Source)
>         at java.lang.Thread.run(Thread.java:809)
> ============= end nested exception, level (1) ===========
> ------------  End Shutdown Error Stack ------------{code}
> After this happens, the DB behaves erratically, throwing random errors (e.g. 
> reporting that a column is missing from a table although it is there, or 
> reporting that the DB is corrupt).
> Hint: The application only has READ access to those databases. 
> We do not write any data to them.
> It only happens to one single DB, but this is the most complex one in the 
> application. Restarting the server makes it WORK again for some minutes!
> We deploy the exact same WAR file to the old and new platforms for testing.
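
The innermost frames of the trace above are a positional `FileChannel` read (`FileChannelImpl.read` ending in `pread`). One way to check whether the platform produces the error independently of Derby might be a small probe that repeats that kind of read against a copy of a container file. The sketch below is not Derby code: the class `PageReadProbe`, its method names, the 4096-byte read size, and opening the file read-only through `RandomAccessFile` are all assumptions made for illustration.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical stand-alone probe; it does not use or imitate Derby internals
// beyond issuing positional FileChannel reads.
final class PageReadProbe {

    /** Reads exactly pageSize bytes starting at offset, using positional reads. */
    public static byte[] readPage(FileChannel channel, long offset, int pageSize)
            throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(pageSize);
        while (buf.hasRemaining()) {
            // Positional read: does not move the channel's own file position.
            int n = channel.read(buf, offset + buf.position());
            if (n < 0) {
                throw new EOFException("file ends before offset " + (offset + pageSize));
            }
        }
        return buf.array();
    }

    /** Opens the file once, read-only, and re-reads its first page repeatedly. */
    public static int probe(File file, int pageSize, int iterations) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            FileChannel channel = raf.getChannel();
            for (int i = 0; i < iterations; i++) {
                readPage(channel, 0L, pageSize); // an IOException here is the signal
            }
        }
        return iterations;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java PageReadProbe <file> [iterations]");
            return;
        }
        int iterations = args.length > 1 ? Integer.parseInt(args[1]) : 100000;
        // 4096 is an assumed read size; it must not exceed the file length.
        int done = probe(new File(args[0]), 4096, iterations);
        System.out.println("completed " + done + " reads");
    }
}
```

A clean run of such a probe would not rule anything out, since the failure in the application may depend on something else happening to the descriptor inside the Tomcat process. A failing run outside Tomcat, on the other hand, would point away from Derby.
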
> h2. Already analysed
> We have already tried several things and carried out several analysis steps:
>  # Turning off the antivirus solution (Trend Micro Deep Security Agent) did 
> not help
>  # Exchanging the servers of the new server platform for another set of 
> servers with the same setup did not help
>  # Comparing a SHA1 hash of the "corrupt" files with the original files 
> showed that the files are IDENTICAL.
>  # Copying the "corrupt" DB to another system and testing it there works as 
> expected, without issues.
>  # Running an integrity check on the DB shows no problems
>  # Checking the file permissions on the problematic servers shows no problems
> {code:java}
> # ls -l /opt/apps/tomcat/i36/webapps/XXXX/database/XX/seg0/c8470.dat
> -rw-r--r-- 1 tomcat_i36 tomcat 16384 Aug 21 09:32 /opt/apps/tomcat/i36/webapps/XXXXX/database/XX/seg0/c8470.dat
>  
> # file /opt/apps/tomcat/i36/webapps/XXXXX/database/XX/seg0/c8470.dat
> /opt/apps/tomcat/i36/webapps/XXXXX/database/XX/seg0/c8470.dat: data{code}
>  # Checking whether any Linux limit (e.g. open files limit) was reached: 
> nothing found
>  # Checking for a corrupt file system: Ext3 is used on the old and new 
> platforms, no indication of corrupt files found
>  # Upgrading DERBY from 10.11.1.1 to 10.12.1.1 did not fix the issue.
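
The hash comparison in step 3 could be reproduced with a few lines of standard-library Java. This is a minimal sketch, not the tool the reporter used: the class `Sha1Compare` and its method names are hypothetical, and the two paths are supplied by the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for comparing a suspect container file with its original.
final class Sha1Compare {

    /** Returns the SHA-1 digest of the file's content as lowercase hex. */
    public static String sha1Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** True when both files hash to the same SHA-1 value. */
    public static boolean sameContent(Path a, Path b)
            throws IOException, NoSuchAlgorithmException {
        return sha1Hex(a).equals(sha1Hex(b));
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: java Sha1Compare <suspect-file> <original-file>");
            return;
        }
        Path suspect = Paths.get(args[0]);
        Path original = Paths.get(args[1]);
        System.out.println(sha1Hex(suspect) + "  " + suspect);
        System.out.println(sha1Hex(original) + "  " + original);
        System.out.println(sameContent(suspect, original) ? "IDENTICAL" : "DIFFERENT");
    }
}
```

Note that a check like this opens each file afresh. Matching hashes therefore indicate that the on-disk content is intact, but say nothing about the state of a descriptor that a long-running process already holds open.
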
> h2. The server environments
> h3. OLD environment (working well)
> {code:java}
> Linux: SUSE Linux Enterprise Server 11 SP4  (s390x)
> Kernel: 3.0.101-91-default #1 SMP Mon Dec 12 13:06:13 UTC 2016 (544b9d1) s390x s390x s390x GNU/Linux
> Filesystem: /dev/mapper/appsvg-lvapps on /opt/apps type ext3 (rw,acl,user_xattr)
> Java: IBM 7.1-4.1
> Tomcat: 7.0.70{code}
> h3. NEW environment (not working)
> {code:java}
> Linux: SUSE Linux Enterprise Server for SAP Applications 12 SP3  (x86_64)
> Kernel: 4.4.126-94.22-default #1 SMP Wed Apr 11 07:45:03 UTC 2018 (9649989) x86_64 x86_64 x86_64 GNU/Linux
> Filesystem: /dev/mapper/appsvg-lvapps on /opt/apps type ext3 (rw,relatime,data=ordered)
> Java: IBM 7.1-4.15
> Tomcat: 7.0.85{code}
> Our customer has to migrate the server platforms very soon, so we would be 
> very grateful if someone could help us investigate and resolve this.
> Thanks!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
