Re: No matter what I do, pig is trying to run locally

2013-08-23 Thread Tim Chan
ption for the pig command. On Thu, Aug 22, 2013 at 11:35 PM, Serega Sheypak wrote: > Are you sure that your core-site, hdfs-site, mapred-site are in pig's > classpath? > > > 2013/8/23 Tim Chan > > > Apache Pig version 0.11.0-cdh4.3.0 > > Hadoop 2.0.0-cdh4.3.0

No matter what I do, pig is trying to run locally

2013-08-22 Thread Tim Chan
Apache Pig version 0.11.0-cdh4.3.0 Hadoop 2.0.0-cdh4.3.0 Here is the error I'm getting: 2013-08-22 19:27:50,304 [main] INFO org.apache.hadoop.mapreduce.Cluster - Failed to use org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid "mapreduce.jobtracker.address" configuration

Filtering based on the value of an aggregate function?

2013-07-29 Thread Tim Chan
I would like to know if there is a better way to do the following. GIVEN: (name:chararray, score:float) I would like to filter out all records that are below the average score. This is what I came up with: data = load 'input.dat' using PigStorage('\t') as (name:chararray, score:float); data_
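
For reference, one common way to express this kind of filter in Pig is to compute the aggregate once over the whole relation and then reference it as a scalar inside the FILTER. The sketch below assumes the same tab-separated 'input.dat' and schema as in the question; the alias names (all_rows, stats, mean_score, at_or_above_avg) are made up for illustration and are not taken from the original script.

```pig
-- Load the data with the schema given in the question.
data = LOAD 'input.dat' USING PigStorage('\t') AS (name:chararray, score:float);

-- Group every record into a single bag so AVG sees the whole relation.
all_rows = GROUP data ALL;

-- Single-row relation holding the average score (hypothetical alias names).
stats = FOREACH all_rows GENERATE AVG(data.score) AS mean_score;

-- Reference the single-row relation as a scalar; keeps records at or above the average.
at_or_above_avg = FILTER data BY score >= stats.mean_score;

DUMP at_or_above_avg;
```

The scalar reference (stats.mean_score) only works when stats has exactly one row, which GROUP ... ALL guarantees for non-empty input.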

Re: filter on a date column?

2013-03-21 Thread Tim Chan
/01/01 > > I can do: > > a = LOAD 'input.txt' AS (str:chararray); > b = FOREACH a GENERATE ToDate(str, '/MM/DD'); > DUMP b; > > This gives me: > > (2013-01-01T00:00:00.000-08:00) > > Thanks, > Cheolsoo > > > > On Thu, Mar 21, 2

Re: filter on a date column?

2013-03-21 Thread Tim Chan
is: > > a = LOAD 'input.txt' AS (date:datetime); > b = FILTER a BY date < ToDate('2013-01-01'); > > Also see built-in functions for datetime type: > http://pig.apache.org/docs/r0.11.0/func.html#datetime-functions > > Thanks, > Cheolsoo > > >

filter on a date column?

2013-03-21 Thread Tim Chan
Since there is no date datatype, how do I filter on a date column? I've been setting the date column as a chararray. I would like to do something like: a = filter b by date_col < '2013-01-01';

What is wrong with my input path?

2013-03-20 Thread Tim Chan
I'm using parameter passing to pass an input path to my pig script. This does not seem to work: -param input=/path1/{08,09,10,11,12}/*/data/,/path2/{01,02,03}/*/data/

Re: removing last item in a bag

2013-03-13 Thread Tim Chan
Hi Ruslan, I'm using the trunk version of Pig. For the following script: test = LOAD '$test' USING PigStorage('\t') AS ( visitor:chararray, submodelid:long, record_datetime:chararray ); test_grp = group test by visitor; -- add counts of each bag test_grp_cnt = foreach test_grp