Re: Does Hive trunk version work with Hadoop release 0.21.0?

2010-09-02 Thread Neil Xu
Not yet; the trunk version only works with Hadoop 0.17.*, 0.18.*, 0.19.*, and 0.20.*.

Neil

2010/9/3 Ping Zhu
> Dear all,
>
> I did not find any related statements regarding this issue either online
> or within trunk documentation. Anyone can confirm about this? Thanks.
>
> Ping
>

Re: How is Union All optimized in Hive

2010-08-26 Thread Neil Xu
in
> Yes, it is optimized by hive. There will be only 1 mr job, even if the
> columns selected were different.
>
>
> -namit
>
> ________
> From: Neil Xu [neil.x...@gmail.com]
> Sent: Wednesday, August 25, 2010 2:40 AM
> To: hive-user@h

Re: Directing output from Hive MR second custom MR job

2010-08-25 Thread Neil Xu
of jobs; for example, you can set the job chain in advance, and the system will call a Hive job first, then a shell job, or another MR job, etc. *Does anyone have any ideas?*

-Chocobo

2010/8/25 Maxim Veksler
> Hi Neil,
>
> On Wed, Aug 25, 2010 at 2:41 PM, Neil Xu wrote:
>

Re: Directing output from Hive MR second custom MR job

2010-08-25 Thread Neil Xu
You can set the input path and output path for each job, and run the jobs in order, e.g. TwoJobs.java:

public class TwoJobs extends Configured implements Tool {
    public static class Job1Mapper extends MapReduceBase implements Mapper {
    }
    public static class Job1Reducer extends MapRedu

How is Union All optimized in Hive

2010-08-25 Thread Neil Xu
I tried a query like the one below: same table, same columns, and different conditions. Only one MR job was generated. Is it optimized by Hive itself? And is 'table_1' scanned only once? Can anyone give some details? Thanks!

select a, b, c from table_1 where ...
union all
select a, b, c from table_1 where ...

Is INSTR faster than LIKE

2010-08-10 Thread Neil Xu
Hi all, I am wondering whether INSTR will run faster than LIKE, in all cases or only in some cases. For example:

INSTR(url, 'http://www.google.com/') = 1 VS url like 'http://www.google.com/%'
INSTR(url, 'google') = 0 VS not url like '%google%'
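
For reference, here is a rough Java sketch of what the two pairs of predicates above compute, assuming non-NULL inputs and patterns that contain no LIKE wildcards (`%` or `_`) other than the ones shown. The class `InstrVsLike` and its helpers `instr`, `likePrefix`, and `likeContains` are hypothetical stand-ins written for illustration, not Hive code. It only shows that each pair selects the same rows under those assumptions; it says nothing about which form runs faster in Hive.

```java
public class InstrVsLike {
    // Hypothetical stand-in for INSTR(str, substr): 1-based position of the
    // first occurrence of substr in str, or 0 when substr does not occur.
    public static int instr(String str, String substr) {
        return str.indexOf(substr) + 1;
    }

    // Hypothetical stand-in for: url LIKE 'prefix%'
    // (assumes the prefix itself contains no LIKE wildcards).
    public static boolean likePrefix(String url, String prefix) {
        return url.startsWith(prefix);
    }

    // Hypothetical stand-in for: url LIKE '%needle%'
    // (assumes the needle itself contains no LIKE wildcards).
    public static boolean likeContains(String url, String needle) {
        return url.contains(needle);
    }

    public static void main(String[] args) {
        // Made-up sample URLs, for illustration only.
        String[] urls = {
            "http://www.google.com/search",
            "http://example.com/?q=google",
            "http://example.com/"
        };
        for (String url : urls) {
            boolean prefixViaInstr = instr(url, "http://www.google.com/") == 1;
            boolean prefixViaLike = likePrefix(url, "http://www.google.com/");
            boolean absentViaInstr = instr(url, "google") == 0;
            boolean absentViaLike = !likeContains(url, "google");
            System.out.println(url
                + " | prefix: " + prefixViaInstr + " / " + prefixViaLike
                + " | absent: " + absentViaInstr + " / " + absentViaLike);
        }
    }
}
```

Under these assumptions each pair agrees on every input, so any difference between the two forms would come down to how each is evaluated, which is best measured on real data.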