The two statements compile to different operator plans.

First one: TableScanOperator -> SelectOperator -> ReduceSinkOperator -> FileSinkOperator -> MoveOperator
Second one: TableScanOperator -> SelectOperator -> FetchOperator

In the second plan, FetchOperator runs on the client and outputs the rows directly on the local machine. In the first plan, the result is first written to a temporary HDFS location and then moved from there to the local directory, so it runs as a MapReduce job.

You can prefix a statement with explain and look at its operator plan, for example:

explain insert overwrite local directory 'output' select * from test limit 10;
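
To compare the two plans side by side, here is a minimal sketch that assumes the same `test` table and `output` directory as in the question. Run each statement in the Hive CLI or through `hive -e`. The exact plan text depends on the Hive version, so no output is shown.

```sql
-- Plan for the INSERT OVERWRITE LOCAL DIRECTORY form.
-- Look for a map-reduce stage followed by a move stage.
explain insert overwrite local directory 'output' select * from test limit 10;

-- Plan for the plain SELECT form.
-- Look for a fetch stage with no map-reduce stage.
explain select * from test limit 10;
```

If the first plan lists a map-reduce stage and a move stage while the second lists only a fetch stage, that matches the difference described above.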

2014-07-16 11:36 GMT+08:00 Azuryy Yu <azury...@gmail.com>:

> Hi,
>
> I think the following two sql have the same effect.
>
> 1) hive -e "insert overwrite local directory 'output' select * from test limit 10;"
> 2) hive -e "select * from test limit 10;" > output
>
> but the second one read HDFS directly only takes two seconds, but the first
> one submit a MR job, which has one reduce.
>
> why there is such difference? Thanks.

--
thanks
王联辉(Lianhui Wang)
blog: http://blog.csdn.net/lance_123
Interests: databases, distributed systems, data mining, programming languages, Internet technologies, etc.