Re: running pig on amazon ec2

2011-06-15 Thread Dexin Wang
Local mode (-x local) when I ran it on my laptop, and mapreduce mode when I ran it on ec2 cluster. 2. If mapreduce mode, did you look into the hadoop log to see how much slow do

Re: running pig on amazon ec2

2011-06-15 Thread Dmitriy Ryaboy
3. What kind of query is it? The input is gzipped json files which has one event per line. Then I do some hourly aggregation on the raw events, then do bunch of grouping, joining and s

Re: running pig on amazon ec2

2011-06-15 Thread Dexin Wang
Daniel Someone mentioned it's EC2's I/O performance. But I'm sure there are plenty of people using EC2/EMR running big MR jobs so more likely I have some configuration issues? My

Re: running pig on amazon ec2

2011-06-14 Thread Tomas Svarovsky
median, variance) on some fields. Daniel Someone mentioned it's EC2's I/O performance. But I'm sure there are plenty of people using EC2/EMR running big MR jobs so more likely

Re: running pig on amazon ec2

2011-06-14 Thread Daniel Dai
that running on my laptop is faster tells me this is a separate issue. Thanks! On 06/13/2011 11:54 AM, Dexin Wang wrote: Hi, This is probably not directly a Pig question. Anyone running Pig on amazon EC2 instances? Something's

Re: running pig on amazon ec2

2011-06-14 Thread Dexin Wang
ance. But I'm sure there are plenty of people using EC2/EMR running big MR jobs so more likely I have some configuration issues? My jobs can be optimized a bit but the fact that running on my laptop is faster tells me this is a separate issue. Thanks!

Re: running pig on amazon ec2

2011-06-14 Thread Daniel Dai
ues? My jobs can be optimized a bit but the fact that running on my laptop is faster tells me this is a separate issue. Thanks! On 06/13/2011 11:54 AM, Dexin Wang wrote: Hi, This is probably not directly a Pig question. Anyone running Pig on amazon E

Re: running pig on amazon ec2

2011-06-14 Thread Dexin Wang
ion issues? My jobs can be optimized a bit but the fact that running on my laptop is faster tells me this is a separate issue. Thanks! On 06/13/2011 11:54 AM, Dexin Wang wrote: Hi, This is probably not directly a Pig question. Anyone runn

Re: running pig on amazon ec2

2011-06-14 Thread Daniel Dai
directly a Pig question. Anyone running Pig on amazon EC2 instances? Something's not making sense to me. I ran a Pig script that has about 10 mapred jobs in it on a 16 node cluster using m1.small. It took *13 minutes*. The job reads input from S3 and writes output to S3. But from the log

running pig on amazon ec2

2011-06-13 Thread Dexin Wang
Hi, This is probably not directly a Pig question. Anyone running Pig on amazon EC2 instances? Something's not making sense to me. I ran a Pig script that has about 10 mapred jobs in it on a 16 node cluster using m1.small. It took *13 minutes*. The job reads input from S3 and writes output