[ https://issues.apache.org/jira/browse/PARQUET-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Blue resolved PARQUET-408.
-------------------------------
    Resolution: Not A Problem

> Shutdown hook in parquet-avro library corrupts data and disables logging
> ------------------------------------------------------------------------
>
>                 Key: PARQUET-408
>                 URL: https://issues.apache.org/jira/browse/PARQUET-408
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-avro
>    Affects Versions: 1.8.1
>            Reporter: Michal Turek
>             Fix For: 1.9.0
>
>         Attachments: parquet-broken-shutdown_2015-12-16.tar.gz
>
>
> Parquet-avro, and probably other Parquet libraries as well, is not well behaved. 
> It registers a shutdown hook that bypasses the application's shutdown sequence, 
> corrupts data written to any currently open Parquet file(s), and disables or 
> reconfigures the slf4j/logback logger so that no further log messages are visible.
> h3. Scope
> Our application is a microservice that handles a stop request in the form of a 
> SIGTERM signal, i.e. a JVM shutdown hook. When the request arrives, the 
> application closes all open files (writers), releases all other resources and 
> shuts down gracefully. We are currently switching from sequence files to Parquet 
> and use the Maven dependency {{org.apache.parquet:parquet-avro:1.8.1}}, which is 
> the latest version at the moment. We use 
> {{Runtime.getRuntime().addShutdownHook()}} to handle SIGTERM.
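> For illustration, a minimal sketch of this pattern, assuming the application owns 
> a single writer (the class and method names below are illustrative, not taken 
> from the attached example):
> {noformat}
> import java.io.IOException;
> import org.apache.avro.generic.GenericRecord;
> import org.apache.parquet.hadoop.ParquetWriter;
>
> /** Sketch only: the application closes its Parquet writer on SIGTERM. */
> class GracefulShutdown {
>     static void register(ParquetWriter<GenericRecord> writer) {
>         Runtime.getRuntime().addShutdownHook(new Thread(() -> {
>             try {
>                 writer.close(); // flush the last row group and write the footer
>             } catch (IOException e) {
>                 e.printStackTrace(); // logging may already be unusable here
>             }
>         }));
>     }
> }
> {noformat}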
> h3. Example code
> See archive in attachment.
> - Optionally update the version of {{hadoop-client}} in {{pom.xml}} to match your 
> Hadoop installation.
> - Use {{mvn package}} to compile.
> - Copy the Hadoop configuration XMLs to the {{config}} directory.
> - Update the configuration at the top of the {{ParquetBrokenShutdown}} class.
> - Execute the {{ParquetBrokenShutdown}} class.
> - Send SIGTERM to shut down the application ({{kill PID}}).
> h3. Initial analysis
> The Parquet library tries to take care of application shutdown itself, but this 
> introduces more issues than it solves. If the application is writing to a file 
> and the library asynchronously decides to close the underlying writer, data loss 
> will occur. The handle is simply closed and none of the remaining records can be 
> written.
> {noformat}
> Writing to HDFS/Parquet failed
> java.io.IOException: can not write PageHeader(type:DICTIONARY_PAGE, uncompressed_page_size:14, compressed_page_size:34, dictionary_page_header:DictionaryPageHeader(num_values:1, encoding:PLAIN))
>       at org.apache.parquet.format.Util.write(Util.java:224)
>       at org.apache.parquet.format.Util.writePageHeader(Util.java:61)
>       at org.apache.parquet.format.converter.ParquetMetadataConverter.writeDictionaryPageHeader(ParquetMetadataConverter.java:760)
>       at org.apache.parquet.hadoop.ParquetFileWriter.writeDictionaryPage(ParquetFileWriter.java:307)
>       at org.apache.parquet.hadoop.ColumnChunkPageWriteStore$ColumnChunkPageWriter.writeToFileWriter(ColumnChunkPageWriteStore.java:179)
>       at org.apache.parquet.hadoop.ColumnChunkPageWriteStore.flushToFileWriter(ColumnChunkPageWriteStore.java:238)
>       at org.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:165)
>       at org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:113)
>       at org.apache.parquet.hadoop.ParquetWriter.close(ParquetWriter.java:297)
>       at com.avast.bugreport.parquet.ParquetBrokenShutdown.writeParquetFile(ParquetBrokenShutdown.java:86)
>       at com.avast.bugreport.parquet.ParquetBrokenShutdown.run(ParquetBrokenShutdown.java:53)
>       at com.avast.bugreport.parquet.ParquetBrokenShutdown.main(ParquetBrokenShutdown.java:153)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:497)
>       at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
> Caused by: parquet.org.apache.thrift.transport.TTransportException: java.nio.channels.ClosedChannelException
>       at parquet.org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
>       at parquet.org.apache.thrift.transport.TTransport.write(TTransport.java:105)
>       at parquet.org.apache.thrift.protocol.TCompactProtocol.writeByteDirect(TCompactProtocol.java:424)
>       at parquet.org.apache.thrift.protocol.TCompactProtocol.writeByteDirect(TCompactProtocol.java:431)
>       at parquet.org.apache.thrift.protocol.TCompactProtocol.writeFieldBeginInternal(TCompactProtocol.java:194)
>       at parquet.org.apache.thrift.protocol.TCompactProtocol.writeFieldBegin(TCompactProtocol.java:176)
>       at org.apache.parquet.format.InterningProtocol.writeFieldBegin(InterningProtocol.java:74)
>       at org.apache.parquet.format.PageHeader.write(PageHeader.java:918)
>       at org.apache.parquet.format.Util.write(Util.java:222)
>       ... 16 more
> Caused by: java.nio.channels.ClosedChannelException
>       at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1635)
>       at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:104)
>       at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
>       at java.io.DataOutputStream.write(DataOutputStream.java:107)
>       at parquet.org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
>       ... 24 more
> {noformat}
> The stored file is also corrupted in this case: it contains only the four-byte 
> "PAR1" header, so effectively all data is lost.
> Along with that, the application's entire slf4j/logback logging is reconfigured 
> or completely disabled, so no further log messages from the application appear in 
> the output. This may be related to PARQUET-369 and will hopefully be fixed by 
> PARQUET-401. No such issue was observed when sequence files were used in the 
> past.
> h3. Known workaround
> Java executes shutdown hook threads independently and in parallel, but the 
> application can force its own shutdown code to run before Parquet's. The Parquet 
> library registers its hook through Hadoop's {{ShutdownHookManager}}, which allows 
> a priority to be specified.
> {noformat}
> if (!USE_WORKAROUND) {
>     Runtime.getRuntime().addShutdownHook(shutdownHook);
> } else {
>     // Execute the application hook before the hook in the Hadoop library
>     int maxPriority = Integer.MAX_VALUE;
>     ShutdownHookManager.get().addShutdownHook(shutdownHook, maxPriority);
> }
> {noformat}
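> Putting the pieces together, a hedged sketch of the complete workaround, assuming 
> the application's cleanup only has to close a single writer (names are 
> illustrative, not taken from the attached reproducer):
> {noformat}
> import java.io.IOException;
> import org.apache.avro.generic.GenericRecord;
> import org.apache.hadoop.util.ShutdownHookManager;
> import org.apache.parquet.hadoop.ParquetWriter;
>
> /** Sketch only: run the application's cleanup before Hadoop/Parquet hooks. */
> class WorkaroundShutdown {
>     static void register(ParquetWriter<GenericRecord> writer) {
>         Runnable shutdownHook = () -> {
>             try {
>                 writer.close(); // footer is written while the stream is still open
>             } catch (IOException e) {
>                 e.printStackTrace();
>             }
>         };
>         // Highest priority: executed before the hook registered by the library
>         ShutdownHookManager.get().addShutdownHook(shutdownHook, Integer.MAX_VALUE);
>     }
> }
> {noformat}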
> This workaround would no longer be possible if two or more such misbehaving 
> libraries were used together. No workaround has been found so far for the second 
> issue, the disabled logging.
> h3. Suggestions for fix
> - Never touch the logging configuration in a library.
> -- Use the slf4j-api interfaces and just emit log messages.
> -- The application is responsible for configuring logging.
> - Do not register shutdown hooks in a library that release resources which may 
> still be in use.
> -- Instead, provide an API that lets the application close/free the allocated 
> resources (already present).
> -- The application is responsible for calling close on all Closeables during 
> shutdown, in the correct order and at the correct time (see the sketch after 
> this list).
> -- No library has enough information to do that on its own.
> - Be a well-behaved library: do not try to be smarter than the application, 
> because that is not possible.
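> A hedged sketch of this division of responsibilities, assuming only the 
> {{ParquetWriter}} close API and plain slf4j-api usage (class and method names 
> are illustrative):
> {noformat}
> import java.io.IOException;
> import org.apache.avro.generic.GenericRecord;
> import org.apache.parquet.hadoop.ParquetWriter;
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
>
> /** Sketch only: the application owns and closes its resources, the library just logs. */
> class ApplicationOwnedLifecycle {
>     // A library should only obtain a logger and emit messages, never configure it.
>     private static final Logger LOG = LoggerFactory.getLogger(ApplicationOwnedLifecycle.class);
>
>     static void shutdown(ParquetWriter<GenericRecord> writer, AutoCloseable... others) {
>         try {
>             writer.close(); // close the writer first so the footer is written
>         } catch (IOException e) {
>             LOG.error("Closing Parquet writer failed", e);
>         }
>         for (AutoCloseable resource : others) {
>             try {
>                 resource.close(); // then release the remaining resources in order
>             } catch (Exception e) {
>                 LOG.error("Releasing resource failed", e);
>             }
>         }
>     }
> }
> {noformat}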



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
