Re:Re: Pig UT last nearly 8 hours and TestEvalPipeline2 lasts for 37 minutes

2012-11-13 Thread lulynn_2008
Thanks. Then my environment is normal. Is there any way to shorten the run time? At 2012-11-14 10:37:55,"Johnny Zhang" wrote: >Hi, lulynn: >Yes, whole Pig unit tests run about 8 hours. >TestEvalPipline runs about 26 mins and TestEvalPipline

Re: Pig UT last nearly 8 hours and TestEvalPipeline2 lasts for 37 minutes

2012-11-13 Thread Johnny Zhang
Hi, lulynn: Yes, the whole Pig unit test suite runs for about 8 hours. TestEvalPipeline runs about 26 mins and TestEvalPipelineLocal runs about 3 mins. Hope it is helpful, Johnny On Tue, Nov 13, 2012 at 6:28 PM, lulynn_2008 wrote: > Hi all, > > The whole pig UT last for nearly 8 hours, and TestEvalPipeline2

Pig UT last nearly 8 hours and TestEvalPipeline2 lasts for 37 minutes

2012-11-13 Thread lulynn_2008
Hi all, The whole Pig UT run lasts nearly 8 hours, and TestEvalPipeline2 lasts 37 minutes. My questions are: how long does the Pig UT normally take? Do we have Jenkins for the Pig UT? If yes, please share the link. Thanks

Re: Dynamically generating load/store path

2012-11-13 Thread Prashant Kommireddi
You could write a custom load/storefunc and override setLocation(String location, Job job). The logic you want the UDF for could go in there. On Tue, Nov 13, 2012 at 4:45 PM, Jonathan Coveney wrote: > If it's a parameter, it could just be passed in as a $var > > > 2012/11/13 Miki Tebeka > > >

Re: Dynamically generating load/store path

2012-11-13 Thread Jonathan Coveney
If it's a parameter, it could just be passed in as a $var 2012/11/13 Miki Tebeka > Greetings, > > Is there a way to dynamically generate (maybe via UDF) the path to > load/store data? (something like "A = LOAD InputPath() USING > PigStorage();") > > Currently we calculate the load/store path ou
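A minimal sketch of the parameter approach mentioned above, assuming the path is computed in a wrapper shell script and handed to Pig through its `-param` option. The helper name `input_path_for`, the date-based layout, the parameter name `INPUT_PATH`, and `script.pig` are all hypothetical and only serve as illustration.

```shell
# Hypothetical helper: turn a base directory and a YYYY-MM-DD date into
# a path such as /data/logs/2012/11/13. The layout is an assumption.
input_path_for() {
    base=$1
    day=$2
    # Replace every '-' in the date with '/' to form the sub-directories.
    printf '%s/%s\n' "$base" "$(printf '%s\n' "$day" | sed 's|-|/|g')"
}

# The Pig script (hypothetical script.pig) would then refer to the value as
#   A = LOAD '$INPUT_PATH' USING PigStorage();
# and be launched with Pig's -param option, for example:
#   pig -param INPUT_PATH="$(input_path_for /data/logs 2012-11-13)" script.pig
```

This keeps the path logic outside the script, which is what the original poster already does; it does not replace a UDF or a custom load function.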

Dynamically generating load/store path

2012-11-13 Thread Miki Tebeka
Greetings, Is there a way to dynamically generate (maybe via UDF) the path to load/store data? (something like "A = LOAD InputPath() USING PigStorage();") Currently we calculate the load/store path outside the pig script and pass it as a parameter. However we'd like to have a UDF that does that. T

Re: Help with Script

2012-11-13 Thread ingvay7
Thanks, Prashant and Pablomar. That fixed it! - Original Message - From: Prashant Kommireddi To: "user@pig.apache.org" Cc: Sent: Tuesday, November 13, 2012 1:40 PM Subject: Re: Help with Script SUM function requires that you specify the specific element from the grouping. In this ca

Re: Help with Script

2012-11-13 Thread Prashant Kommireddi
The SUM function requires that you project the specific field out of the grouped bag. In this case, U_tm and U_cnt are both inside the bag produced by the group and need to be accessed as "reqd.U_tm" and "reqd.U_cnt". --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(reqd.U_tm)as time,SUM(reqd.U_cnt)as co

Re: Fw: Help with Script

2012-11-13 Thread pablomar
what about something like this ? G3 = foreach G2 generate group,SUM(reqd.U_tm)as time,SUM(reqd.U_cnt)as count; On Tue, Nov 13, 2012 at 12:57 PM, ingvay7 wrote: > (Apologies for resending but corrected script below) > > > This is the error I got: > > ERROR org.apache.pig.tools.grunt.Grunt - ER

Fw: Help with Script

2012-11-13 Thread ingvay7
(Apologies for resending but corrected script below) This is the error I got: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast. Updated code: a = LOAD 'Report

Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf

2012-11-13 Thread Michał Czerwiński
Yeah, so just to be clear: under pig > 0.10 the issue seems to be exactly as you describe, and it also occurs whenever you specify a directory path instead of a file path in -Dpig.additional.jars. This happens quite often because it's advised on forums to include HIVE_HOME and HADOOP_HOME in t

Re: Help with Script

2012-11-13 Thread Vishwanath
This is the error I got: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast. Updated code: a = LOAD 'Report' AS (         dt:chararray,         Server:chararray,

Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf

2012-11-13 Thread Cheolsoo Park
Hi Michal, Thanks for sharing your workaround. I think that Pig should be able to handle empty file names in -Dpig.additional.jars, so users don't have to spend hours debugging problems like this. So I filed a JIRA: https://issues.apache.org/jira/browse/PIG-3046 We will get this fixed in a future
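Until that is fixed, one way to avoid both empty entries and directory entries is to build the jar list from regular files only. The following is a rough sketch, not the workaround from the thread: the function name `build_jar_list` is hypothetical, and it assumes the property takes a colon-separated list of jar files.

```shell
# Hypothetical helper: print a ':'-separated list of the regular *.jar files
# found in the given directories, with no empty and no directory entries.
build_jar_list() {
    list=""
    for dir in "$@"; do
        for file in "$dir"/*.jar; do
            # Skip unmatched globs, directories, and other non-regular files.
            [ -f "$file" ] || continue
            # Add the ':' separator only when the list already has an entry.
            list="${list:+$list:}$file"
        done
    done
    printf '%s\n' "$list"
}
```

It could then be used along the lines of `pig -Dpig.additional.jars="$(build_jar_list "$HIVE_HOME/lib")" script.pig`, where `script.pig` is a placeholder.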

Re: Help with Script

2012-11-13 Thread Prashant Kommireddi
Hi, Can you paste the error message here? Sent from my iPhone On Nov 13, 2012, at 8:34 AM, "ingv...@yahoo.com" wrote: > hey all, > > Very new Pig user here. I think I'm trying to get something very simple done > but getting a few errors. See my script below. Any guidance will be > appreciated

Re: Help with Script

2012-11-13 Thread pablomar
Just taking a quick look, I see a couple of errors: 1. Your LOAD has one comma too many. You need to delete the last one, after U_avg_tm:float. 2. For the group by, I think you need parentheses: G2 = group reqd by (Server,Type,Ops); By the way, where is your alias serverin? On Tue, Nov 13, 20

Help with Script

2012-11-13 Thread ingv...@yahoo.com
hey all, Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See my script below. Any guidance will be appreciated. Thanks. I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double} I am basically trying to

Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf

2012-11-13 Thread Michał Czerwiński
Oh well I changed PIG_CLASSPATH="$HCAT_HOME/share/hcatalog/hcatalog-0.4.0.jar:$HIVE_HOME/conf:$HADOOP_HOME/conf" into PIG_CLASSPATH="$HCAT_HOME/share/hcatalog/hcatalog-0.4.0.jar" having still hive libraries loaded via for file in $HIVE_HOME/lib/*.jar; do #echo "==> Adding $file" PIG_CLASS

Re: Trying to get pig 0.11/0.12 working to solve 0.10's issues with python udf

2012-11-13 Thread Michał Czerwiński
Right, it looks like that: 2012-11-13 15:13:57,100 [main] DEBUG org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Adding jar to DistributedCache: file:/opt/hcat/share/hcatalog/hcatalog-0.4.0.jar 2012-11-13 15:13:57,428 [main] DEBUG org.apache.pig.backend.hadoop.exec

RE: Intermittent NullPointerException

2012-11-13 Thread Malcolm Tye
Hi Cheolsoo, I tried setting default_parallel to 1 to rule out parallel processing, but the problem still happened. I've recompiled Pig and have put that into the test environment with the debug option set. I don't have reproduction steps that fail every time. When the problem occurs,