[ https://issues.apache.org/jira/browse/HIVE-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905058#comment-13905058 ]
Puneet Gupta commented on HIVE-5994:
------------------------------------
Hi Prasanth,
This is the code I used to reproduce the issue.
1. I am using the Hive binary from "hive-0.12.0.tar.gz".
2. I am using an old Hadoop version, "hadoop-core-1.0.0.jar":
http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core
3. In the code below, if ROWS_TO_TEST is set to 1 or to more than 10, the
problem does not occur.
---------------------------
package hive;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.CompressionKind;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.OrcFile.WriterOptions;
import org.apache.hadoop.hive.ql.io.orc.Reader;
import org.apache.hadoop.hive.ql.io.orc.RecordReader;
import org.apache.hadoop.hive.ql.io.orc.Writer;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;

public class TestLong {

    /**
     * Writes ROWS_TO_TEST identical rows to an ORC file, then reads them back
     * and prints each row.
     *
     * @param args unused
     * @throws IOException if the file cannot be written or read
     */
    public static void main(String[] args) throws IOException {
        int ROWS_TO_TEST = 10;
        Path path = new Path("E:/Test/file.orc");
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.getLocal(conf);

        // Start from a clean file on every run.
        if (fs.exists(path)) {
            fs.delete(path, true);
        }

        // Derive the ORC schema from the MyData class by reflection.
        ObjectInspector inspector = ObjectInspectorFactory
                .getReflectionObjectInspector(MyData.class,
                        ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
        WriterOptions options = OrcFile.writerOptions(conf)
                .inspector(inspector).compress(CompressionKind.SNAPPY);

        // Write the same large value ROWS_TO_TEST times.
        Writer writer = OrcFile.createWriter(path, options);
        for (int i = 0; i < ROWS_TO_TEST; i++) {
            writer.addRow(new MyData());
        }
        writer.close();

        // Read the rows back; each should print the value stored in MyData.
        Reader reader = OrcFile.createReader(fs, path);
        RecordReader rows = reader.rows(null);
        Object row = null;
        while (rows.hasNext()) {
            row = rows.next(row);
            System.out.println(row);
        }
    }

    private static class MyData {
        long data = 4703275633953830000L;
    }
}
-----------
OUTPUT
{112}
{112}
{112}
{112}
{112}
{112}
{112}
{112}
{112}
{112}
> ORC RLEv2 encodes wrongly for large negative BIGINTs (64 bits )
> ----------------------------------------------------------------
>
> Key: HIVE-5994
> URL: https://issues.apache.org/jira/browse/HIVE-5994
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.13.0
> Reporter: Prasanth J
> Assignee: Prasanth J
> Labels: orcfile
> Fix For: 0.13.0
>
> Attachments: HIVE-5994.1.patch
>
>
> For large negative BIGINTs, zigzag encoding yields a large 64-bit value
> with the MSB set to 1. This value is interpreted as a negative value in the
> SerializationUtils.findClosestNumBits(long value) function. This results in
> a wrong computation of the total number of bits required, which in turn
> causes wrong encoding/decoding of values.
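
The effect described above can be illustrated with a minimal Java sketch. It is not the Hive source: the class name ZigZagBitsSketch and the methods bitsSignedLoop and bitsUnsigned are hypothetical, and the signed loop is only an assumption about how treating the zigzag output as a signed long can under-count the bits. The value is the one from the reproduction in this thread; it is positive, but large enough that its zigzag encoding also has the MSB set.

```java
// Hypothetical sketch for illustration only, not the Hive implementation.
class ZigZagBitsSketch {

    // Standard zigzag encoding: interleaves negative and positive values
    // so that small magnitudes map to small unsigned numbers.
    static long zigZagEncode(long value) {
        return (value << 1) ^ (value >> 63);
    }

    // Inverse of zigZagEncode.
    static long zigZagDecode(long encoded) {
        return (encoded >>> 1) ^ -(encoded & 1);
    }

    // Assumed faulty pattern: the loop condition compares the value as a
    // signed long, so an encoded value with the MSB set is never counted.
    static int bitsSignedLoop(long encoded) {
        int count = 0;
        long num = encoded;
        while (num > 0) {
            count++;
            num >>>= 1;
        }
        return count;
    }

    // Treats the encoded value as unsigned: position of the highest set bit.
    static int bitsUnsigned(long encoded) {
        return 64 - Long.numberOfLeadingZeros(encoded);
    }

    public static void main(String[] args) {
        long value = 4703275633953830000L;
        long encoded = zigZagEncode(value);
        System.out.println("encoded is negative as signed long: " + (encoded < 0));
        System.out.println("bits (signed loop): " + bitsSignedLoop(encoded));
        System.out.println("bits (unsigned):    " + bitsUnsigned(encoded));
        System.out.println("round trip ok:      " + (zigZagDecode(encoded) == value));
    }
}
```

With this value the signed loop reports 0 bits while the unsigned count reports 64, which is the kind of mismatch the description refers to.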