[
https://issues.apache.org/jira/browse/HIVE-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thejas M Nair updated HIVE-5273:
--------------------------------
Assignee: Thejas M Nair
> Subsequent use of Mapper yields 0 results
> -----------------------------------------
>
> Key: HIVE-5273
> URL: https://issues.apache.org/jira/browse/HIVE-5273
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.12.0, 0.13.0
> Reporter: Mike Lewis
> Assignee: Thejas M Nair
> Priority: Blocker
>
> I first noticed this when using the local task tracker, which is also the
> easiest way to reproduce it.
> I created a table with one column (uuid) and ran:
> {code}
> SELECT uuid FROM test_foo LIMIT 5;
> {code}
> Results are as expected:
> {code}
> ace7265d-49bf-4c11-af67-0cd0a33c690e
> ace7265d-49bf-4c11-af67-0cd0a33c690e
> ace7265d-49bf-4c11-af67-0cd0a33c690e
> ace7265d-49bf-4c11-af67-0cd0a33c690e
> ace7265d-49bf-4c11-af67-0cd0a33c690e
> Time taken: 40.172 seconds, Fetched: 5 row(s)
> {code}
> Then I ran the same query again.
> This time no rows were returned:
> {code}
> Time taken: 55.498 seconds
> {code}
> The table I am querying is:
> {code}
> hive> describe extended test_foo;
> OK
> uuid string None
>
> Detailed Table Information Table(tableName:test_foo, dbName:default,
> owner:lewis, createTime:1378934838, lastAccessTime:0, retention:0,
> sd:StorageDescriptor(cols:[FieldSchema(name:uuid, type:string,
> comment:null)],
> location:hdfs://gun1.sjc1c.square:8020/user/hive/warehouse/test_foo,
> inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat,
> outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat,
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
> parameters:{serialization.format=1}), bucketCols:[], sortCols:[],
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[],
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false),
> partitionKeys:[], parameters:{numPartitions=0, numFiles=37,
> transient_lastDdlTime=1378934838, numRows=0, totalSize=44600654909,
> rawDataSize=0}, viewOriginalText:null, viewExpandedText:null,
> tableType:MANAGED_TABLE)
> {code}
> With a non-local task tracker, subsequent queries work, but when running
> {{count(*)}} over a large data set, 0.12.0 returns only a subset of the
> results that 0.10.0 returns.