Re: Initial Benchmark Results

2010-05-17 Thread Jeff Zhang
Does anyone still have the comparison report? The link seems broken now. On Tue, Jan 19, 2010 at 8:34 AM, Rob Stewart wrote: > Hi folks, > > I have some initial results to run through with you. I have a number of > implementations ready to push onto the Hadoop cluster, but I have finalized > the tes

Re: revision 909116

2010-05-17 Thread Corbin Hoenes
When building rev 909116 we get a pig-0.7.0-dev.jar. How did you apply this patch to Pig 0.6? On May 14, 2010, at 5:08 PM, Dmitriy Ryaboy wrote: > I've applied it to 0.6 before > > D > > On Fri, May 14, 2010 at 3:54 PM, Corbin Hoenes wrote: >> Does anyone know if rev 909116 can be applied to

Re: revision 909116

2010-05-17 Thread Dmitriy Ryaboy
I just downloaded the patch from the corresponding JIRA, https://issues.apache.org/jira/browse/PIG-1217 (note that the test is broken; you need to remove line 72 in TestTop.java -- though that doesn't affect functionality, just the unit test). On Mon, May 17, 2010 at 9:28 AM, Corbin Hoenes wrote:

NVL for pig

2010-05-17 Thread Wasti, Syed
I am trying to reproduce the SQL "NVL(city, 'U') city" in Pig. I am using the bincond operator, "(city is null?'U': city) AS city", where city is of chararray type, but the result file shows '\N' instead of U. Any ideas?

Re: revision 909116

2010-05-17 Thread Corbin Hoenes
doh! okay that works great :) thnx On May 17, 2010, at 10:47 AM, Dmitriy Ryaboy wrote: > I just download the patch from the corresponding jira, > https://issues.apache.org/jira/browse/PIG-1217 (note that the test is > broken, you need to remove line 72 in TestTop.java -- though that doesn't > af

[Travel Assistance] - Applications Open for ApacheCon NA 2010

2010-05-17 Thread Alan Gates
The Travel Assistance Committee is now taking in applications for those wanting to attend ApacheCon North America (NA) 2010, which is taking place between the 1st and 5th November in Atlanta. The Travel Assistance Committee is looking for people who would like to be able to attend ApacheCon,

Re: Nvl function for pig

2010-05-17 Thread Syed Wasti
Well Dmitriy, my bad, I was looking at the data through a Hive query and it shows as NULL, but when I looked into the flat file all the NULL values are seen as \N. Hive is able to understand \N as NULL but Pig is not... How can I resolve this? On 5/16/10 4:33 PM, "Dmitriy Ryaboy" wrote: >

Re: Nvl function for pig

2010-05-17 Thread Dmitriy Ryaboy
Arguably, that's a Hive bug. What does hive do if you *want* to have a \n as a value? For your case, I think it's as simple as foreach rel generate ( foo is null OR foo == '\n' ? 'U' : foo); -D On Mon, May 17, 2010 at 11:42 AM, Syed Wasti wrote: > Well Dmitriy, my bad, I was looking at the dat

Re: Nvl function for pig

2010-05-17 Thread Syed Wasti
I have tried both ways (foo is null OR foo == '\n'); neither works in Pig. Why would null values be saved as \N in a file? Is there a reason? Is this a Hive or Hadoop convention that Pig can't understand? On 5/17/10 11:53 AM, "Dmitriy Ryaboy" wrote: > Arguably, that's a Hive bug. What does hive do if you *

Re: Nvl function for pig

2010-05-17 Thread Dmitriy Ryaboy
There must be some noise in your input that is getting interpreted differently by Hive and Pig. Loading a bunch of newlines does generate nulls, so I am not sure what's happening there. Are you loading using PigStorage? Default delimiters? Can you upload a sample file and script that reproduces the

Re: Nvl function for pig

2010-05-17 Thread Syed Wasti
Attached is the data file. Just in case, below are the data and the script; this should give you all you want. To your last question, I am using a Mac. 7001Test001\N 7002Test101\N 7003Test312\N 7004Test422\N grunt> data = LOAD 'data'

Re: Initial Benchmark Results

2010-05-17 Thread Mads Moeller
Hi Jeff, It seems to have been moved. http://www.macs.hw.ac.uk/~rs46/publications.html On Mon, May 17, 2010 at 12:34 AM, Jeff Zhang wrote: > Anyone still has the comparison report ? The link seems broken now. > > > > On Tue, Jan 19, 2010 at 8:34 AM, Rob Stewart > wrote: >> Hi folks, >> >> I h

Re: Nvl function for pig

2010-05-17 Thread Dmitriy Ryaboy
Double-escape the backslash. grunt> data1 = FOREACH data GENERATE id, name, (gender=='\\N'?'U':gender) AS gender; grunt> dump data1 (7001L,Test0,U) (7002L,Test1,U) (7003L,Test3,U) (7004L,Test4,U) On Mon, May 17, 2010 at 1:44 PM, Syed Wasti wrote: > Attached is the data file, just in case, below is
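Editorial sketch (not part of the original message): the fix in this thread works because the flat file written by Hive stores a missing value as the two characters backslash and N, so the comparison has to match that literal text and not a null or a newline. The same NVL-style substitution can be written in plain Java as below. The class name HiveNulls and the method nvl are hypothetical names chosen for illustration; they are not part of Pig or Hive.

```java
/** Hypothetical helper for illustration only; not a Pig or Hive API. */
final class HiveNulls {
    // Two characters: a backslash followed by 'N'. The backslash is doubled
    // in the Java literal, much like the '\\N' literal used in the thread.
    static final String HIVE_NULL_MARKER = "\\N";

    private HiveNulls() {
        // Static helper; no instances.
    }

    /**
     * NVL-style substitution: returns fallback when value is a real null
     * or the literal \N marker text, and returns value unchanged otherwise.
     */
    static String nvl(String value, String fallback) {
        if (value == null || HIVE_NULL_MARKER.equals(value)) {
            return fallback;
        }
        return value;
    }
}
```

Calling HiveNulls.nvl(gender, "U") on each field mirrors the bincond in the message above. A field holding an actual newline character is left alone, which is consistent with the earlier '\n' comparison not matching the data.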

Exception: Unable to find clone for op Project 4-16 Projections

2010-05-17 Thread Yonggang Qiao
Hi, has anyone seen this error before? Normally our script runs fine, but recently it began to throw this exception. It usually goes away if I rerun it. Caused by: java.lang.RuntimeException: Unable to find clone for op Project 4-16 Projections: [9] Overloaded: false at

Re: Nvl function for pig

2010-05-17 Thread Syed Wasti
Nice, it works, thanks. On 5/17/10 1:49 PM, "Dmitriy Ryaboy" wrote: > double-escape the slash. > > grunt> data1 = FOREACH data GENERATE id, name, (gender=='\\N'?'U':gender) AS > gender; > grunt> dump data1 > (7001L,Test0,U) > (7002L,Test1,U) > (7003L,Test3,U) > (7004L,Test4,U) > > > On Mon,

Re: Exception: Unable to find clone for op Project 4-16 Projections

2010-05-17 Thread Ashutosh Chauhan
Are you using the PigServer Java API to run your Pig queries? If so, are you trying to run multiple queries in different threads against the same PigServer instance? Ashutosh On Mon, May 17, 2010 at 13:57, Yonggang Qiao wrote: > Hi, > > anyone has seen this error before? normally our script runs fine

Re: Exception: Unable to find clone for op Project 4-16 Projections

2010-05-17 Thread Yonggang Qiao
yes. yes. On Mon, May 17, 2010 at 2:03 PM, Ashutosh Chauhan wrote: > Are you using PigServer java api to run your Pig queries ? If so, are > you trying to run multiple queries in different threads against same > Pig server  instance? > > Ashutosh > > On Mon, May 17, 2010 at 13:57, Yonggang Qiao

Re: Exception: Unable to find clone for op Project 4-16 Projections

2010-05-17 Thread Yonggang Qiao
Sorry, actually: yes to the first question, no to the second. We use a new instance for each script. On Mon, May 17, 2010 at 2:13 PM, Yonggang Qiao wrote: > yes. yes. > > On Mon, May 17, 2010 at 2:03 PM, Ashutosh Chauhan > wrote: >> Are you using PigServer java api to run your Pig queries ? If so, are >> you trying to run multiple qu

Re: Exception: Unable to find clone for op Project 4-16 Projections

2010-05-17 Thread Ashutosh Chauhan
If you are creating a new instance for each query, you should be fine. Which Pig version are you using? Can you paste the snippet of Java code where you create a new PigServer instance and then use it for a new query? Ashutosh On Mon, May 17, 2010 at 14:17, Yonggang Qiao wrote: > sorry,

Re: Exception: Unable to find clone for op Project 4-16 Projections

2010-05-17 Thread Yonggang Qiao
Sure, with the irrelevant parts removed: PigServer pigServer = new PigServer("mapreduce"); pigServer.registerQuery(...); ... ExecJob job = pigServer.store("RET", outputPath); We then periodically check job.getStatus(). On Mon, May 17, 2010 at 2:43 PM, Ashutosh Chauhan wrote: > if you are creati

Re: Initial Benchmark Results

2010-05-17 Thread Jeff Zhang
Thanks, Mads. On Tue, May 18, 2010 at 4:48 AM, Mads Moeller wrote: > Hi Jeff, > > It seems to have been moved. > http://www.macs.hw.ac.uk/~rs46/publications.html > > > > On Mon, May 17, 2010 at 12:34 AM, Jeff Zhang wrote: >> Anyone still has the comparison report ? The link seems broken now. >

Re: Exception: Unable to find clone for op Project 4-16 Projections

2010-05-17 Thread Jeff Zhang
Ashutosh, are you sure Pig can now work in a multi-threaded environment? As far as I know, Pig 0.5 cannot work in a multi-threaded environment even if you create a new PigServer for each Pig script. On Tue, May 18, 2010 at 5:43 AM, Ashutosh Chauhan wrote: > if you are creating new instance for each query.. yo

Re: Exception: Unable to find clone for op Project 4-16 Projections

2010-05-17 Thread Ashutosh Chauhan
From Yonggang's description and code snippet, it seems to me he does not have a multithreaded environment. There is only one thread, and he is creating a new PigServer instance in it repeatedly for each query. Since static variables are reset every time, this should work. PigServer still doesn