ption for the
pig command.
On Thu, Aug 22, 2013 at 11:35 PM, Serega Sheypak
wrote:
> Are you sure that your core-site, hdfs-site, mapred-site are in pig's
> classpath?
>
>
> 2013/8/23 Tim Chan
>
> > Apache Pig version 0.11.0-cdh4.3.0
> > Hadoop 2.0.0-cdh4.3.0
Apache Pig version 0.11.0-cdh4.3.0
Hadoop 2.0.0-cdh4.3.0
Here is the error I'm getting:
2013-08-22 19:27:50,304 [main] INFO org.apache.hadoop.mapreduce.Cluster -
Failed to use org.apache.hadoop.mapred.LocalClientProtocolProvider due to
error: Invalid "mapreduce.jobtracker.address" configuration
I would like to know if there is a better way to do the following.
GIVEN:
(name:chararray, score:float)
I would like to filter out all records that are below the average score.
This is what I came up with:
data = load 'input.dat' using PigStorage('\t') as (name:chararray,
score:float);
data_
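
For reference, a minimal sketch of one common way to keep only the records at or above the average score. It assumes the same tab-separated 'input.dat' and schema as in the question; the aliases all_rows, score_stats and above_avg are illustrative names, not taken from the original message. It relies on Pig's scalar projection, which lets a single-row relation be referenced inside an expression.

```pig
-- Assumes a tab-separated file 'input.dat' with (name, score), as in the question.
data = LOAD 'input.dat' USING PigStorage('\t') AS (name:chararray, score:float);

-- Collapse the whole relation into one group so AVG sees every score.
all_rows = GROUP data ALL;
score_stats = FOREACH all_rows GENERATE AVG(data.score) AS avg_score;

-- score_stats has exactly one row, so it can be used as a scalar in the filter.
above_avg = FILTER data BY score >= score_stats.avg_score;

DUMP above_avg;
```

The scalar reference only works if score_stats really contains a single row; GROUP ... ALL guarantees that here.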
/01/01
>
> I can do:
>
> a = LOAD 'input.txt' AS (str:chararray);
> b = FOREACH a GENERATE ToDate(str, 'yyyy/MM/dd');
> DUMP b;
>
> This gives me:
>
> (2013-01-01T00:00:00.000-08:00)
>
> Thanks,
> Cheolsoo
>
>
>
> On Thu, Mar 21, 2
is:
>
> a = LOAD 'input.txt' AS (date:datetime);
> b = FILTER a BY date < ToDate('2013-01-01');
>
> Also see built-in functions for datetime type:
> http://pig.apache.org/docs/r0.11.0/func.html#datetime-functions
>
> Thanks,
> Cheolsoo
>
>
>
Since there is no date datatype, how do I filter on a date column?
I've been setting the date column as a chararray.
I would like to do something like:
a = filter b by date_col < '2013-01-01';
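
A minimal sketch of how such a filter might look when the column stays a chararray, assuming Pig 0.11 or later (where the datetime type and ToDate are available) and assuming date_col holds strings in yyyy-MM-dd form. The file name 'input.txt' and the single-column schema are hypothetical.

```pig
-- Hypothetical input: one yyyy-MM-dd string per line, loaded as chararray.
b = LOAD 'input.txt' AS (date_col:chararray);

-- Convert both sides to datetime so the comparison is chronological.
a = FILTER b BY ToDate(date_col, 'yyyy-MM-dd') < ToDate('2013-01-01', 'yyyy-MM-dd');

DUMP a;
```

If every value is zero-padded yyyy-MM-dd, a plain chararray comparison sorts the same way as the dates do, but converting with ToDate avoids depending on that formatting.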
I'm using parameter passing to pass an input path to my pig script.
This does not seem to work:
-param input=/path1/{08,09,10,11,12}/*/data/,/path2/{01,02,03}/*/data/
Hi Ruslan,
I'm using the trunk version of Pig.
For the following script:
test = LOAD '$test' USING PigStorage('\t') AS
( visitor:chararray,
submodelid:long,
record_datetime:chararray );
test_grp = group test by visitor;
-- add counts of each bag
test_grp_cnt = foreach test_grp