Mike Lewis created HIVE-5273:
--------------------------------
Summary: Subsequent use of Mapper yields 0 results
Key: HIVE-5273
URL: https://issues.apache.org/jira/browse/HIVE-5273
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0, 0.13.0
Reporter: Mike Lewis
I first noticed this when using the local task tracker, which is also the easiest
way to reproduce it.
I created a table with one column (uuid) and ran:
{code}
SELECT uuid FROM test_foo LIMIT 5;
{code}
Results are as expected:
{code}
ace7265d-49bf-4c11-af67-0cd0a33c690e
ace7265d-49bf-4c11-af67-0cd0a33c690e
ace7265d-49bf-4c11-af67-0cd0a33c690e
ace7265d-49bf-4c11-af67-0cd0a33c690e
ace7265d-49bf-4c11-af67-0cd0a33c690e
Time taken: 40.172 seconds, Fetched: 5 row(s)
{code}
Then I ran the same query again.
This time the results are not as expected; no rows are returned:
{code}
Time taken: 55.498 seconds
{code}
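A minimal sketch of the reproduction steps described above. The report does not state how local mode was enabled, so the {{SET mapred.job.tracker=local;}} line is an assumption and may differ from the original setup:
{code}
-- Assumption: force the local job runner; the exact setting used is not given in this report.
SET mapred.job.tracker=local;

-- First run: returns 5 rows, as shown above.
SELECT uuid FROM test_foo LIMIT 5;

-- Second run of the identical query: completes but fetches no rows.
SELECT uuid FROM test_foo LIMIT 5;
{code}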
The table I am querying is:
{code}
hive> describe extended test_foo;
OK
uuid string None
Detailed Table Information Table(tableName:test_foo, dbName:default,
owner:lewis, createTime:1378934838, lastAccessTime:0, retention:0,
sd:StorageDescriptor(cols:[FieldSchema(name:uuid, type:string, comment:null)],
location:hdfs://gun1.sjc1c.square:8020/user/hive/warehouse/test_foo,
inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat,
outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
parameters:{serialization.format=1}), bucketCols:[], sortCols:[],
parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[],
skewedColValueLocationMaps:{}), storedAsSubDirectories:false),
partitionKeys:[], parameters:{numPartitions=0, numFiles=37,
transient_lastDdlTime=1378934838, numRows=0, totalSize=44600654909,
rawDataSize=0}, viewOriginalText:null, viewExpandedText:null,
tableType:MANAGED_TABLE)
{code}
With a non-local tasktracker, subsequent queries work, but when doing a
{{count(*)}} over a large data set, 0.12.0 returns only a subset of the results
that 0.10.0 returns.