[ https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alina Abramova updated HIVE-12963:
----------------------------------
    Description:
I execute the query:
hive> select age from test1 sort by age.age limit 10;
Total jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
When the table has a large number of rows, the last stage of the job takes a long time. I think we could allow the user to choose the number of reducers for the last job, or avoid the extra MR job.
I observed the same behavior with the query:
hive> create table new_test as select age from test1 group by age.age limit 10;

  was:
I execute the query:
hive> select age from test1 sort by age.age limit 10;
Total jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
When the table has a large number of rows, the last stage of the job takes a long time. I think we could allow the user to choose the number of reducers for the last job, or avoid the extra MR job.
I observed the same behavior with the queries:
hive> create table new_test as select age from test1 group by age.age limit 10;


> LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-12963
>                 URL: https://issues.apache.org/jira/browse/HIVE-12963
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.0.0, 1.2.1, 0.13
>            Reporter: Alina Abramova
>            Assignee: Alina Abramova
>         Attachments: HIVE-12963.1.patch, HIVE-12963.2.patch
>
>
> I execute the query:
> hive> select age from test1 sort by age.age limit 10;
> Total jobs = 2
> Launching Job 1 out of 2
> Number of reduce tasks not specified. Estimated from input data size: 1
> Launching Job 2 out of 2
> Number of reduce tasks determined at compile time: 1
> When the table has a large number of rows, the last stage of the job takes a long time. I think we could allow the user to choose the number of reducers for the last job, or avoid the extra MR job.
> I observed the same behavior with the query:
> hive> create table new_test as select age from test1 group by age.age limit 10;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
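
For anyone reproducing the behavior described in this issue, here is a minimal HiveQL sketch. It assumes the test1 table from the description, with age as a struct column containing a field named age (as the expression age.age implies), and it uses the standard mapreduce.job.reduces property. The reducer count of 4 is an arbitrary illustrative value, not something taken from the report. Going by the log output quoted in the description, such a setting would be expected to influence only the first job, since the second job reports a reducer count fixed at compile time.

```sql
-- Sketch only: the value 4 is an arbitrary, illustrative reducer count.
-- Requests a fixed number of reducers for jobs whose count is not
-- already decided at compile time.
SET mapreduce.job.reduces=4;

-- Inspect the plan first. The report describes two MR jobs for this query.
EXPLAIN
SELECT age FROM test1 SORT BY age.age LIMIT 10;

-- Run the query from the description. Per the report, job 2 still logs
-- "Number of reduce tasks determined at compile time: 1".
SELECT age FROM test1 SORT BY age.age LIMIT 10;
```

Comparing the "Number of reduce tasks" lines for job 1 and job 2 in the console output should show whether the setting reached the final stage on a given Hive version.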