[ https://issues.apache.org/jira/browse/HIVE-12537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-12537:
--------------------------------
    Description:
Perhaps I'm doing something wrong or is actually working as expected.
Putting 1 million constant int32 values produces an ORC file of 1MB. Surprisingly, 1 million consecutive ints produces a much smaller file.
Code and FileDump attached.

{code}
ObjectInspector inspector = ObjectInspectorFactory.getReflectionObjectInspector(
    Integer.class,
    ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
Writer w = OrcFile.createWriter(new Path("/tmp/my.orc"),
    OrcFile.writerOptions(new Configuration())
        .compress(CompressionKind.NONE)
        .inspector(inspector)
        .encodingStrategy(OrcFile.EncodingStrategy.COMPRESSION)
        .version(OrcFile.Version.V_0_12)
);

for (int i = 0; i < 1000000; ++i) {
    w.addRow(123);
}
w.close();
{code}

  was:
Perhaps I'm doing something wrong or is actually working as expected.
Putting 1 million constant int32 values produces an ORC file of 1MB. Surprisingly, 1 million consecutive ints produces a much smaller file.
Code and FileDump attached.

> RLEv2 doesn't seem to work
> --------------------------
>
>                 Key: HIVE-12537
>                 URL: https://issues.apache.org/jira/browse/HIVE-12537
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats, ORC
>    Affects Versions: 1.2.1
>            Reporter: Bogdan Raducanu
>              Labels: orc, orcfile
>         Attachments: Main.java, orcdump.txt
>
>
> Perhaps I'm doing something wrong or is actually working as expected.
> Putting 1 million constant int32 values produces an ORC file of 1MB.
> Surprisingly, 1 million consecutive ints produces a much smaller file.
> Code and FileDump attached.
> {code}
> ObjectInspector inspector = ObjectInspectorFactory.getReflectionObjectInspector(
>     Integer.class,
>     ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
> Writer w = OrcFile.createWriter(new Path("/tmp/my.orc"),
>     OrcFile.writerOptions(new Configuration())
>         .compress(CompressionKind.NONE)
>         .inspector(inspector)
>         .encodingStrategy(OrcFile.EncodingStrategy.COMPRESSION)
>         .version(OrcFile.Version.V_0_12)
> );
>
> for (int i = 0; i < 1000000; ++i) {
>     w.addRow(123);
> }
> w.close();
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)