[ https://issues.apache.org/jira/browse/NIFI-9380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt Burgess updated NIFI-9380:
-------------------------------
    Fix Version/s: 1.16.0
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

> PutParquet - Compression Type: SNAPPY (Not Working)
> ---------------------------------------------------
>
>                 Key: NIFI-9380
>                 URL: https://issues.apache.org/jira/browse/NIFI-9380
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.14.0, 1.15.0
>         Environment: CentOS 7.4, RedHat 7.9
>            Reporter: Bilal
>            Assignee: Bryan Bende
>            Priority: Major
>             Fix For: 1.16.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I have tested the compression types offered by the _PutParquet_ and
> _ConvertAvroToParquet_ Processors on different NiFi versions.
>
> Summary information:
> * Compression types (UNCOMPRESSED, GZIP, {*}SNAPPY{*}) of the _PutParquet_
>   Processor work correctly on NiFi 1.12.1 and 1.13.2.
> * Compression types (UNCOMPRESSED, GZIP) of the _PutParquet_ Processor work
>   correctly on NiFi 1.14.0 and 1.15.0; *SNAPPY* gives an error.
>
> * Compression types (UNCOMPRESSED, GZIP, {*}SNAPPY{*}) of the
>   _ConvertAvroToParquet_ Processor work correctly on NiFi 1.12.1, 1.13.2,
>   1.14.0 and 1.15.0.
>
> _PutParquet_ – Properties:
> * Hadoop Configuration Resources: File locations
> * Kerberos Credentials Service: Keytab service
> * Record Reader: AvroReader Service (Embedded Avro Schema)
> * Overwrite Files: True
> * Compression Type: SNAPPY
> * Other Properties: Default
>
> To keep the test setup lean, the default configuration was used wherever
> possible:
> * nifi-env.sh file has the default configuration.
> * bootstrap.conf file has the default configuration.
> * nifi.properties file has the default configuration except security
>   configuration.
> * _PutParquet_ Processor has the default configuration. (But SNAPPY
>   compression is not working)
> * _ConvertAvroToParquet_ Processor has the default configuration.
>   (SNAPPY compression is working correctly)
> * There is no custom processor in our NiFi environment.
> * There is no custom lib location in the NiFi properties.
>
> Error Log (nifi-app.log):
> {noformat}
> ERROR [Timer-Driven Process Thread-12] o.a.nifi.processors.parquet.PutParquet PutParquet[id=6caab337-68e8-3834-b64a-1d2cbd93aba8] Failed to write due to java.lang.IncompatibleClassChangeError: Class org.xerial.snappy.SnappyNative does not implement the requested interface org.xerial.snappy.SnappyApi: java.lang.IncompatibleClassChangeError: Class org.xerial.snappy.SnappyNative does not implement the requested interface org.xerial.snappy.SnappyApi
> java.lang.IncompatibleClassChangeError: Class org.xerial.snappy.SnappyNative does not implement the requested interface org.xerial.snappy.SnappyApi
>     at org.xerial.snappy.Snappy.maxCompressedLength(Snappy.java:380)
>     at org.apache.parquet.hadoop.codec.SnappyCompressor.compress(SnappyCompressor.java:67)
>     at org.apache.hadoop.io.compress.CompressorStream.compress(CompressorStream.java:81)
>     at org.apache.hadoop.io.compress.CompressorStream.finish(CompressorStream.java:92)
>     at org.apache.parquet.hadoop.CodecFactory$HeapBytesCompressor.compress(CodecFactory.java:167)
>     at org.apache.parquet.hadoop.ColumnChunkPageWriteStore$ColumnChunkPageWriter.writePage(ColumnChunkPageWriteStore.java:168)
>     at org.apache.parquet.column.impl.ColumnWriterV1.writePage(ColumnWriterV1.java:59)
>     at org.apache.parquet.column.impl.ColumnWriterBase.writePage(ColumnWriterBase.java:387)
>     at org.apache.parquet.column.impl.ColumnWriteStoreBase.flush(ColumnWriteStoreBase.java:186)
>     at org.apache.parquet.column.impl.ColumnWriteStoreV1.flush(ColumnWriteStoreV1.java:29)
>     at org.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:185)
>     at org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:124)
>     at org.apache.parquet.hadoop.ParquetWriter.close(ParquetWriter.java:319)
>     at org.apache.nifi.parquet.hadoop.AvroParquetHDFSRecordWriter.close(AvroParquetHDFSRecordWriter.java:49)
>     at org.apache.commons.io.IOUtils.closeQuietly(IOUtils.java:534)
>     at org.apache.commons.io.IOUtils.closeQuietly(IOUtils.java:466)
>     at org.apache.nifi.processors.hadoop.AbstractPutHDFSRecord.lambda$null$0(AbstractPutHDFSRecord.java:326)
>     at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2466)
>     at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2434)
>     at org.apache.nifi.processors.hadoop.AbstractPutHDFSRecord.lambda$onTrigger$1(AbstractPutHDFSRecord.java:303)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:360)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1822)
>     at org.apache.nifi.processors.hadoop.AbstractPutHDFSRecord.onTrigger(AbstractPutHDFSRecord.java:271)
>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1202)
>     at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:103)
>     at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
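
A note on reading the error in the log: an IncompatibleClassChangeError of the form "Class X does not implement the requested interface Y" generally means the calling code was compiled against a different version of a class than the one loaded at runtime. One plausible cause, which is an assumption and not stated in the report, is that two different snappy-java versions are visible to the processor's class loader. The sketch below shows one way to check where the classes named in the stack trace are loaded from and whether the loaded SnappyNative is assignable to SnappyApi. ClassOriginProbe and its methods are hypothetical names, not part of NiFi or snappy-java, and the result is only meaningful if the probe runs under the same class loader as the processor.

```java
import java.net.URL;
import java.security.CodeSource;

/** Hypothetical diagnostic helper; not part of NiFi or snappy-java. */
public final class ClassOriginProbe {

    private ClassOriginProbe() {}

    /** Describes the JAR or directory the named class is loaded from by the given loader. */
    public static String describeOrigin(String className, ClassLoader loader) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> type = Class.forName(className, false, loader);
            CodeSource source = type.getProtectionDomain().getCodeSource();
            URL location = (source == null) ? null : source.getLocation();
            return className + " <- " + (location == null ? "bootstrap or unknown location" : location);
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " <- not loadable (" + e.getClass().getSimpleName() + ")";
        }
    }

    /** Returns true if the named class is assignable to the named interface under the given loader. */
    public static boolean implementsInterface(String className, String interfaceName, ClassLoader loader) {
        try {
            Class<?> type = Class.forName(className, false, loader);
            Class<?> iface = Class.forName(interfaceName, false, loader);
            return iface.isAssignableFrom(type);
        } catch (ClassNotFoundException | LinkageError e) {
            return false; // treat "cannot be loaded" as "does not implement"
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassOriginProbe.class.getClassLoader();
        // Class names taken from the stack trace in the report
        String[] names = {
            "org.xerial.snappy.Snappy",
            "org.xerial.snappy.SnappyNative",
            "org.xerial.snappy.SnappyApi"
        };
        for (String name : names) {
            System.out.println(describeOrigin(name, loader));
        }
        System.out.println("SnappyNative implements SnappyApi: "
                + implementsInterface("org.xerial.snappy.SnappyNative", "org.xerial.snappy.SnappyApi", loader));
    }
}
```

If the three classes resolve to different JAR files, or the final check prints false while both classes load, that would be consistent with a version conflict. The sketch does not by itself confirm the root cause of NIFI-9380.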