[ https://issues.apache.org/jira/browse/HIVE-12537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prasanth Jayachandran updated HIVE-12537:
-----------------------------------------
    Attachment: HIVE-12537.2.patch

Rebased .2 patch

> RLEv2 doesn't seem to work
> --------------------------
>
>                 Key: HIVE-12537
>                 URL: https://issues.apache.org/jira/browse/HIVE-12537
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats, ORC
>    Affects Versions: 0.14.0, 1.0.1, 1.1.1, 1.3.0, 1.2.1, 2.0.0
>            Reporter: Bogdan Raducanu
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>              Labels: orc, orcfile
>         Attachments: HIVE-12537.1.patch, HIVE-12537.2.patch, Main.java, orcdump.txt
>
>
> Perhaps I'm doing something wrong, or this is actually working as expected.
> Putting 1 million constant int32 values produces an ORC file of 1MB.
> Surprisingly, 1 million consecutive ints produce a much smaller file.
> Code and FileDump attached.
> {code}
> ObjectInspector inspector = ObjectInspectorFactory.getReflectionObjectInspector(
>     Integer.class, ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
> Writer w = OrcFile.createWriter(new Path("/tmp/my.orc"),
>     OrcFile.writerOptions(new Configuration())
>         .compress(CompressionKind.NONE)
>         .inspector(inspector)
>         .encodingStrategy(OrcFile.EncodingStrategy.COMPRESSION)
>         .version(OrcFile.Version.V_0_12)
> );
> for (int i = 0; i < 1000000; ++i) {
>     w.addRow(123);
> }
> w.close();
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)