[ https://issues.apache.org/jira/browse/PARQUET-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ryan Blue resolved PARQUET-408.
-------------------------------
    Resolution: Not A Problem

> Shutdown hook in parquet-avro library corrupts data and disables logging
> -------------------------------------------------------------------------
>
>                 Key: PARQUET-408
>                 URL: https://issues.apache.org/jira/browse/PARQUET-408
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-avro
>    Affects Versions: 1.8.1
>            Reporter: Michal Turek
>             Fix For: 1.9.0
>
>         Attachments: parquet-broken-shutdown_2015-12-16.tar.gz
>
>
> Parquet-avro, and probably other Parquet libraries as well, is not well behaved. It registers a shutdown hook that bypasses the application's shutdown sequence, corrupts data written to currently open Parquet file(s), and disables or reconfigures the slf4j/logback logger so that no further log messages are visible.
>
> h3. Scope
>
> Our application is a microservice that handles a stop request in the form of the SIGTERM signal, i.e. a JVM shutdown hook. When it arrives, the application closes all open files (writers), releases all other resources and shuts down gracefully. We are switching from sequence files to Parquet at the moment and use the Maven dependency {{org.apache.parquet:parquet-avro:1.8.1}}, which is the current latest version. We use {{Runtime.getRuntime().addShutdownHook()}} to handle SIGTERM.
>
> h3. Example code
>
> See the archive in the attachment.
> - Optionally update the version of {{hadoop-client}} in {{pom.xml}} to match your Hadoop.
> - Use {{mvn package}} to compile.
> - Copy the Hadoop configuration XMLs to the {{config}} directory.
> - Update the configuration at the top of the {{ParquetBrokenShutdown}} class.
> - Execute the {{ParquetBrokenShutdown}} class.
> - Send SIGTERM to shut down the application ({{kill PID}}).
>
> h3. Initial analysis
>
> The Parquet library tries to take care of application shutdown itself, but this introduces more issues than it solves. If the application is writing to a file and the library asynchronously decides to close the underlying writer, data loss will occur. The handle is simply closed and the remaining records can no longer be written.
> {noformat}
> Writing to HDFS/Parquet failed
> java.io.IOException: can not write PageHeader(type:DICTIONARY_PAGE, uncompressed_page_size:14, compressed_page_size:34, dictionary_page_header:DictionaryPageHeader(num_values:1, encoding:PLAIN))
> 	at org.apache.parquet.format.Util.write(Util.java:224)
> 	at org.apache.parquet.format.Util.writePageHeader(Util.java:61)
> 	at org.apache.parquet.format.converter.ParquetMetadataConverter.writeDictionaryPageHeader(ParquetMetadataConverter.java:760)
> 	at org.apache.parquet.hadoop.ParquetFileWriter.writeDictionaryPage(ParquetFileWriter.java:307)
> 	at org.apache.parquet.hadoop.ColumnChunkPageWriteStore$ColumnChunkPageWriter.writeToFileWriter(ColumnChunkPageWriteStore.java:179)
> 	at org.apache.parquet.hadoop.ColumnChunkPageWriteStore.flushToFileWriter(ColumnChunkPageWriteStore.java:238)
> 	at org.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:165)
> 	at org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:113)
> 	at org.apache.parquet.hadoop.ParquetWriter.close(ParquetWriter.java:297)
> 	at com.avast.bugreport.parquet.ParquetBrokenShutdown.writeParquetFile(ParquetBrokenShutdown.java:86)
> 	at com.avast.bugreport.parquet.ParquetBrokenShutdown.run(ParquetBrokenShutdown.java:53)
> 	at com.avast.bugreport.parquet.ParquetBrokenShutdown.main(ParquetBrokenShutdown.java:153)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:497)
> 	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
> Caused by: parquet.org.apache.thrift.transport.TTransportException: java.nio.channels.ClosedChannelException
> 	at parquet.org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
> 	at parquet.org.apache.thrift.transport.TTransport.write(TTransport.java:105)
> 	at parquet.org.apache.thrift.protocol.TCompactProtocol.writeByteDirect(TCompactProtocol.java:424)
> 	at parquet.org.apache.thrift.protocol.TCompactProtocol.writeByteDirect(TCompactProtocol.java:431)
> 	at parquet.org.apache.thrift.protocol.TCompactProtocol.writeFieldBeginInternal(TCompactProtocol.java:194)
> 	at parquet.org.apache.thrift.protocol.TCompactProtocol.writeFieldBegin(TCompactProtocol.java:176)
> 	at org.apache.parquet.format.InterningProtocol.writeFieldBegin(InterningProtocol.java:74)
> 	at org.apache.parquet.format.PageHeader.write(PageHeader.java:918)
> 	at org.apache.parquet.format.Util.write(Util.java:222)
> 	... 16 more
> Caused by: java.nio.channels.ClosedChannelException
> 	at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1635)
> 	at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:104)
> 	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
> 	at java.io.DataOutputStream.write(DataOutputStream.java:107)
> 	at parquet.org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
> 	... 24 more
> {noformat}
> The stored file is also corrupted in such a case. It contains only the four-byte "PAR1" header, so effectively all data are lost.
> Together with that, the whole application logging via slf4j/logback is reconfigured or completely disabled, so no more of the application's log messages appear in the output.
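>
> For illustration, the application-side write/shutdown pattern described above looks roughly like this. This is only a minimal sketch against parquet-avro 1.8.x; the class name, schema, output path and sleep interval are made up for this description and are not taken from the attached archive.
> {noformat}
> import org.apache.avro.Schema;
> import org.apache.avro.SchemaBuilder;
> import org.apache.avro.generic.GenericData;
> import org.apache.avro.generic.GenericRecord;
> import org.apache.hadoop.fs.Path;
> import org.apache.parquet.avro.AvroParquetWriter;
> import org.apache.parquet.hadoop.ParquetWriter;
>
> public class ParquetShutdownSketch {
>     public static void main(String[] args) throws Exception {
>         Schema schema = SchemaBuilder.record("Record").fields()
>                 .requiredString("value").endRecord();
>
>         // The application owns the writer and expects to close it itself.
>         final ParquetWriter<GenericRecord> writer = AvroParquetWriter
>                 .<GenericRecord>builder(new Path("example.parquet"))
>                 .withSchema(schema)
>                 .build();
>
>         // Application-level SIGTERM handling: close the writer on shutdown.
>         // The library registers its own shutdown hook as well, and when that
>         // hook closes the underlying stream first, this close() fails with
>         // ClosedChannelException and the file is left with only "PAR1".
>         Runtime.getRuntime().addShutdownHook(new Thread(() -> {
>             try {
>                 writer.close();
>             } catch (Exception e) {
>                 e.printStackTrace();
>             }
>         }));
>
>         GenericRecord record = new GenericData.Record(schema);
>         record.put("value", "hello");
>         while (true) {
>             writer.write(record);
>             Thread.sleep(1000);
>         }
>     }
> }
> {noformat}
> The application's hook is the only place where close() is expected to run; it is exactly this close that the library's own hook interferes with.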
> This may be related to PARQUET-369 and will hopefully be fixed in PARQUET-401. No such issue was observed when sequence files were used in the past.
>
> h3. Known workaround
>
> Java executes shutdown hook threads independently and in parallel, but the application can force its own shutdown hook to run before Parquet's one. The Parquet library registers its hook through Hadoop's {{ShutdownHookManager}}, which allows a priority to be specified.
> {noformat}
> if (!USE_WORKAROUND) {
>     Runtime.getRuntime().addShutdownHook(shutdownHook);
> } else {
>     // Execute the application hook before the hook in the Hadoop library
>     int maxPriority = Integer.MAX_VALUE;
>     ShutdownHookManager.get().addShutdownHook(shutdownHook, maxPriority);
> }
> {noformat}
> This workaround would be impossible if there were two or more such misbehaving libraries.
> No workaround has been found so far for the second issue, the disabled logging.
>
> h3. Suggestions for fix
>
> - Never touch the logging configuration in a library.
> -- Use the slf4j-api interfaces and just emit log messages.
> -- The application is responsible for configuring logging.
> - Do not register shutdown hooks in a library that release resources which may still be in use.
> -- Instead, provide an API that lets the application close/free the allocated resources (already present).
> -- The application is responsible for calling close on all Closeables during shutdown, in the correct order and at the correct time.
> -- No library has enough information to do that.
> - Be a well-behaved library; do not try to be smarter than the application, because that isn't possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)