noirello created ORC-1288:
-----------------------------

             Summary: [C++] Invalid memory freeing with ZLIB compression
                 Key: ORC-1288
                 URL: https://issues.apache.org/jira/browse/ORC-1288
             Project: ORC
          Issue Type: Bug
    Affects Versions: 1.8.0
            Reporter: noirello


The following simple example crashes with a segfault / "munmap_chunk(): invalid pointer" error:
{code:cpp}
#include "orc/Common.hh"
#include "orc/OrcFile.hh"

using namespace orc;

int main(void) {
    ORC_UNIQUE_PTR<OutputStream> outStream = writeLocalFile("test_file.orc");
    ORC_UNIQUE_PTR<Type> schema(Type::buildTypeFromString("struct<c0:int>"));
    WriterOptions options;
    options.setCompression(orc::CompressionKind_ZLIB);
    options.setStripeSize(4096);
    options.setCompressionBlockSize(4096);
    ORC_UNIQUE_PTR<Writer> writer = createWriter(*schema, outStream.get(), options);
    uint64_t batchSize = 65535, rowCount = 10000000;
    ORC_UNIQUE_PTR<ColumnVectorBatch> batch = writer->createRowBatch(batchSize);
    StructVectorBatch *root = dynamic_cast<StructVectorBatch *>(batch.get());
    LongVectorBatch *c0 = dynamic_cast<LongVectorBatch *>(root->fields[0]);
    uint64_t rows = 0;
    
    for (uint64_t i = 0; i < rowCount; ++i) {
        c0->data[rows] = i;
        rows++;
        if (rows == batchSize) {
            root->numElements = rows;
            c0->numElements = rows;
            writer->add(*batch);
            rows = 0;
        }
    }
    if (rows != 0) {
        root->numElements = rows;
        c0->numElements = rows;
        writer->add(*batch);
        rows = 0;
    }
    writer->close();
    return 0;
}
{code}
The bug depends on the stripe size, the compression block size, and the number 
of records written to the file. I was not able to reproduce the error with any 
compression kind other than ZLIB.
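
Since the crash depends on the record count, it may help to spell out how the reproducer's rows are split into writer->add() calls. The sketch below only does that arithmetic. BatchPlan and planBatches are hypothetical names made up for this illustration and are not part of the ORC API, and the sketch says nothing about where inside the writer the invalid free happens.

```cpp
#include <cstdint>

// Hypothetical helper (not part of ORC): describes how a loop like the one in
// the reproducer splits rowCount rows into batches of at most batchSize rows.
struct BatchPlan {
    uint64_t fullBatches;   // add() calls made inside the for loop
    uint64_t trailingRows;  // rows flushed by the final "if (rows != 0)" block
    uint64_t addCalls;      // total number of add() calls
};

// batchSize must be non-zero.
constexpr BatchPlan planBatches(uint64_t rowCount, uint64_t batchSize) {
    return BatchPlan{
        rowCount / batchSize,
        rowCount % batchSize,
        rowCount / batchSize + (rowCount % batchSize != 0 ? 1 : 0)};
}

// Values taken from the reproducer: rowCount = 10000000, batchSize = 65535.
static_assert(planBatches(10000000, 65535).fullBatches == 152, "full batches");
static_assert(planBatches(10000000, 65535).trailingRows == 38680, "trailing rows");
static_assert(planBatches(10000000, 65535).addCalls == 153, "add() calls");
```

With the values from the example, the writer receives 152 full batches of 65535 rows followed by one partial batch of 38680 rows. Changing rowCount or batchSize changes both the number of add() calls and the size of that last batch.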

It looks to me like this is somehow related to 
[ORC-1130|https://issues.apache.org/jira/projects/ORC/issues/ORC-1130], 
but I could not work out how.


