Re: Unable to access Map within a tuple.

2011-03-22 Thread deepak kumar v
Hi Daniel, I have a bag of tuples inputBag = { (day, age, name, address, ['k1#v1','k2#v2']), (12/2,22,deepak,newyork, ['k1#v1','k2#v2']), (12/3,22,deepak,newjersy, ['k1#v1','k2#v2']) } I need to invoke a UDF for each tuple, so I have to flatten the bag, which I do as flatTuples = foreach input

Re: Incorrect header or version mismatch

2011-03-22 Thread Dmitriy Ryaboy
Cloudera packages a version of Pig that works with their distribution; you are using a different version, which bundles its own Hadoop, and there is a conflict. That is to be expected, I think, and is not a bug. This is why the jar-withouthadoop target is provided. D On Tue, Mar 22, 2011 at 12:07 PM, Dan Hend

RE: Failed to generate logical plan

2011-03-22 Thread Xuefu Zhang
That means your class needs to be a subclass of org.apache.pig.EvalFunc. --Xuefu -Original Message- From: Baraa Mohamad [mailto:baraa.issa.moha...@gmail.com] Sent: Tuesday, March 22, 2011 12:38 PM To: user@pig.apache.org Subject: Re: Failed to generate logical plan Thank you very much f

Re: Failed to generate logical plan

2011-03-22 Thread Baraa Mohamad
Thank you very much for your help. In fact I had some directory structure problems. When I did jar -tf mypigudf I got mypigudf/DicomParser.java mypigudf/DicomParser.class But now I have a different problem: *Failed to generate logical plan. Nested exception: java.lang.ClassCastException: mypigudf

RE: Preserve newlines in field

2011-03-22 Thread Lai Will
I'm using 0.8.0. You're right, now I see that I'm actually doing something weird: I'm using TextInputFormat to read my XML file line by line and construct the object to hold the XML element. Shouldn't the XML element, which spans several lines, be my record rather than a single line?

RE: Failed to generate logical plan

2011-03-22 Thread Xuefu Zhang
I wasn't able to compile your code, but from the steps you listed, I think you should check the directory structure. Once you have a jar, you can do "jar -tf mypigudf" to list the classes in it. You also need to make sure that the directory structure matches your class path. In your case, you
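
The directory check described in this message can be sketched in shell. This is only an illustration: the helper class_to_entry is a hypothetical name, and the class name mypigudf.DicomParser is taken from the other messages in this thread. It relies on the general Java rule that a class in package mypigudf must sit under a mypigudf/ directory inside the jar.

```shell
# Hypothetical helper: turn a fully qualified class name into the
# entry path that the class file should have inside a jar.
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Class name as used elsewhere in this thread.
expected="$(class_to_entry mypigudf.DicomParser)"
echo "$expected"
```

Assuming the jar is named mypigudf.jar, as in the register statement quoted in this thread, the result could then be compared against the listing with something like jar -tf mypigudf.jar | grep -Fx "$expected".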

RE: Incorrect header or version mismatch

2011-03-22 Thread Dan Hendry
Thanks for the info. I have not yet verified this with the Hadoop list, but it looks like the CDH3b4 0.20.2 hadoop-core.jar is incompatible with or different from the hadoop-core.jar that the Pig build script pulls in via ivy. I was able to solve my problem by building Pig without Hadoop (ant jar-withoutha

Re: Failed to generate logical plan

2011-03-22 Thread Baraa Mohamad
Yes, I'm using trunk. This is my class: package mypigudf; import java.io.IOException; import org.apache.pig.data.TupleFactory; import org.apache.pig.data.Tuple; public class DicomParser{ public Tuple exec(String input) throws IOException {} } After that I followed these steps: cd H:/apps/mypi

RE: Failed to generate logical plan

2011-03-22 Thread Xuefu Zhang
Hi Baraa, I'm assuming you're using trunk for your experiment. Nevertheless, this error basically tells you that Pig cannot instantiate your UDF. A common cause is a misspelling. Is mypigudf.DicomParser the fully qualified class name of your UDF? I noticed that mypigudf is the jar file name as well.

Failed to generate logical plan

2011-03-22 Thread Baraa Mohamad
Hi all, I wrote a simple UDF, DicomParser, which reads a line and converts it to a tuple, but when I tried to use it like this: register H:/apps/mypig/mypigudf.jar; A = load 'dicoms/' using org.apache.pig.piggybank.storage.XMLLoader('attr') as (x:chararray); B = Foreach A generate mypigudf.DicomParser(x); s

Re: packages problem with eclipse

2011-03-22 Thread Baraa Mohamad
Thanks, I'll try that :) but maybe I'll need to develop Pig core code :) Best regards Baraa On Tue, Mar 22, 2011 at 6:38 PM, Daniel Dai wrote: > If all you need is to write a UDF, you only need to add pig.jar into > library of your eclipse project. The wiki page is to set up the environment >

Re: logging in pig

2011-03-22 Thread Daniel Dai
Pig output goes to STDOUT, info goes to STDERR. If you want to log both, use pig > filename 2>&1 You can open a file to log inside a UDF, but your logs will be on different worker nodes. For debugging purposes, I usually print some debugging output to STDOUT and check the JobTracker UI. Dani
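
The redirection suggested in this message can be tried without a cluster. The sketch below uses a stand-in shell function, fake_pig, which is purely hypothetical and only imitates a program that writes to both streams; it is not Pig and says nothing about what Pig itself prints.

```shell
# Hypothetical stand-in for a Pig run: one line to STDOUT, one to STDERR.
fake_pig() {
  echo "result row"
  echo "INFO progress message" >&2
}

# Temporary file used as the log destination.
log_file="$(mktemp)"

# Send STDOUT to the file, then point STDERR at the same destination.
fake_pig > "$log_file" 2>&1

# Show what was captured.
cat "$log_file"
```

The order of the two redirections matters: writing 2>&1 before > "$log_file" would leave STDERR attached to the terminal, because it would be duplicated from STDOUT before STDOUT is pointed at the file.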

Re: packages problem with eclipse

2011-03-22 Thread Daniel Dai
If all you need is to write a UDF, you only need to add pig.jar to the libraries of your Eclipse project. The wiki page describes how to set up the environment to develop Pig core code. Daniel On 03/22/2011 06:51 AM, Baraa Mohamad wrote: Hello there, I want to write a UDF in java so I tried to add pig to e

Re: Preserve newlines in field

2011-03-22 Thread Daniel Dai
Which Pig version are you using? If you are using Pig 0.7/0.8, line parsing is handled by Hadoop's TextInputFormat. You need to override the behavior of TextInputFormat in order to do that. You need to derive a new TextInputFormat which preserves newline characters, feed it to your LoadFunc(getInpu

logging in pig

2011-03-22 Thread Lai Will
Hello, This may seem like a dumb question. Is there a way to log the Pig output to a file? I tried pig test.pig > filename However, that does not work. A related question: How can we use logging in a UDF so that it logs to a file? Best, Will

Preserve newlines in field

2011-03-22 Thread Lai Will
Hello, I'm currently encountering the following problem. I have an XML file that gets loaded using a custom LoadFunc. Boiled down, my XML file could look like: 1 This is a sample text that contains newlines, which sho

Re: Incorrect header or version mismatch

2011-03-22 Thread Josh Devins
Hey Dan, This usually means that you have mismatched Hadoop jar versions somewhere. I encountered a similar problem with Oozie trying to talk to HDFS. Maybe try posting to the Hadoop user list as well. In general, you should just need the same hadoop-core.jar as on your cluster when you run Pig. Fr