[jira] [Updated] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HADOOP-15171: -- Priority: Blocker (was: Critical) > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Blocker > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HADOOP-15171: -- Fix Version/s: 3.0.1 3.1.0 > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Critical > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HADOOP-15171: -- Description: While reading some ORC file via direct buffers, Hive gets a 0-sized buffer for a particular compressed segment of the file. We narrowed it down to Hadoop native ZLIB codec; when the data is copied to heap-based buffer and the JDK Inflater is used, it produces correct output. Input is only 127 bytes so I can paste it here. All the other (many) blocks of the file are decompressed without problems by the same code. {noformat} 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing 127 bytes to dest buffer pos 524288, limit 786432 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to JDK decompressor with memcopy; got 155 bytes {noformat} Hadoop version is based on 3.1 snapshot. The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 FWIW. Not sure how to extract versions from those. was: While reading some ORC file via direct buffers, Hive gets a 0-sized buffer for a particular compressed segment of the file. We narrowed it down to Hadoop native ZLIB codec; when the data is copied to heap-based buffer and the JDK Inflater is used, it produces correct output. Input is only 127 bytes so I can paste it here. {noformat} 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing 127 bytes to dest buffer pos 524288, limit 786432 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to JDK decompressor with memcopy; got 155 bytes {noformat} Hadoop version is based on 3.1 snapshot. The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 FWIW. Not sure how to extract versions from those. > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Critical > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderI