Re: Re: How to wrie the user custemed Load Funtion

2014-01-26 Thread leiwang...@gmail.com
Hi Serega, I just want to implement a simple LOAD UDF that can take two separators: '\t' and "|~| ". Attached are the Java source code and the Pig stack trace. I can see the failure is because the delimiter must be a single character, not "|~|". How can I fix that? Please help review the code to see
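
For readers following the thread, one possible way around the single-character limit is to split each line with a regular expression instead of a single delimiter character. The sketch below is plain Java and does not use the Pig API. The class name MultiDelimiterSplitter and the method splitFields are hypothetical, and it assumes the second separator is exactly "|~|" with no trailing space. In a LoadFunc, a split like this would typically run inside getNext() on the text of each line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: splits a line on either of two separators.
public class MultiDelimiterSplitter {

    // Matches a tab OR the literal three-character sequence |~|.
    // Pattern.quote escapes the '|' characters, which are otherwise regex operators.
    private static final Pattern SEPARATORS =
            Pattern.compile("\t|" + Pattern.quote("|~|"));

    public static List<String> splitFields(String line) {
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        return new ArrayList<>(Arrays.asList(SEPARATORS.split(line, -1)));
    }

    public static void main(String[] args) {
        // prints [a, b, c, ]
        System.out.println(splitFields("a\tb|~|c|~|"));
    }
}
```

Keeping trailing empty fields is a deliberate choice here, so that a record ending in a separator still yields the same number of columns as its neighbours.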

RE: custom UDF generates SpillableMemoryManager and task is killed

2014-01-26 Thread Yigitbasi, Nezih
Hi Davide, Your UDF is doing a lot of intensive processing without reporting its progress. The EvalFunc class has a reporter field; please use it to report progress in your UDF (call the reporter.progress() method) so that Hadoop doesn't kill your task. Nezih -Original Message- From: Davide Br
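
The advice above can be sketched as a loop that reports progress every so often while it does the heavy work. This is a self-contained illustration and not Pig code: the Reporter interface is a hypothetical stand-in for the reporter object mentioned in the message, and the class name, the expand method, and the interval of 1000 items are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained illustration; Reporter is a hypothetical stand-in for Pig's reporter.
public class ProgressSketch {

    /** Stand-in for the object whose progress() call tells Hadoop the task is alive. */
    public interface Reporter {
        void progress();
    }

    // Assumed interval: report once per 1000 processed items.
    static final int REPORT_EVERY = 1000;

    public static List<Integer> expand(int count, Reporter reporter) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            out.add(i * 2); // placeholder for the expensive per-item work
            // Report periodically rather than on every item to keep overhead low.
            if (reporter != null && (i + 1) % REPORT_EVERY == 0) {
                reporter.progress();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        List<Integer> result = expand(2500, () -> calls[0]++);
        // prints "2500 items, 2 progress calls"
        System.out.println(result.size() + " items, " + calls[0] + " progress calls");
    }
}
```

In a real UDF the same pattern would sit inside exec(), with the stand-in replaced by the reporter field that the message refers to.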

custom UDF generates SpillableMemoryManager and task is killed

2014-01-26 Thread Davide Brambilla
Hi, I'm new to Pig and I wrote a Pig UDF to generate a bag of tuples. It seems to be correct: I've tested it and it works perfectly. What have I missed? Thanks, Davide B. When I apply it to my data (220 million rows) using this script: *1 REGISTER '/mnt5/pig/udf_date.jar';* *2 REGISTER '/opt/c

Re: Re: How to wrie the user custemed Load Funtion

2014-01-26 Thread Serega Sheypak
Can you: 1. provide the source code 2. provide the stack trace? I've seen a similar stack trace with could not instantiate 'my.pig.stuff.SomeClass' with arguments 'null', but the root cause was: 2.1. a missing jar containing a class used in the UDF/load func 2.2. an incorrectly handled exception. 2014-01-26 leiwang...@gmai

Re: Re: How to wrie the user custemed Load Funtion

2014-01-26 Thread leiwang...@gmail.com
I wrote a simple LOAD UDF following the link and packaged it in a jar. register tracking-0.0.1-SNAPSHOT.jar; DEFINE PvDataLoader com.agrantsem.tracking.hadoop.udf.PvDataLoader(); data = LOAD '/user/tracking/pv/log/hourly/trackingpv_2013-10-30...@l-tr9.prod.cn2.log.gz' USING PvDataLoader()

Re: How to wrie the user custemed Load Funtion

2014-01-26 Thread Serega Sheypak
Try using this one as a starting point: https://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/PigStorage.java 2014-01-26 leiwang...@gmail.com > > Hi, >I want to parse some text file data compressed with .gz format. The > data is not neat. The separator is not unique and some recor

How to wrie the user custemed Load Funtion

2014-01-26 Thread leiwang...@gmail.com
Hi, I want to parse some text file data compressed in .gz format. The data is not clean: the separator is not unique and some records are incomplete. Can anyone give an example of how to write a Pig load UDF? Thanks, Lei leiwang...@gmail.com
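
As a rough illustration of the parsing side of this question, the sketch below shows one way to tolerate incomplete records: split each line, then skip any line with fewer fields than expected. It is plain Java rather than a Pig LoadFunc, and the class name TolerantLineParser, the parseAll method, the separator regex, and the expected field count are all hypothetical choices for the example. It does not cover reading the .gz file itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Pig: parses lines and drops incomplete records.
public class TolerantLineParser {

    public static List<String[]> parseAll(List<String> lines,
                                          String separatorRegex,
                                          int expectedFields) {
        List<String[]> records = new ArrayList<>();
        for (String line : lines) {
            // A limit of -1 keeps trailing empty fields so the count is reliable.
            String[] fields = line.split(separatorRegex, -1);
            if (fields.length < expectedFields) {
                continue; // incomplete record: skip it instead of failing the job
            }
            records.add(fields);
        }
        return records;
    }

    public static void main(String[] args) {
        // Assumed sample data: separators are tab or comma, three fields expected.
        List<String> lines = Arrays.asList("a\tb\tc", "x\ty", "1,2,3");
        List<String[]> records = parseAll(lines, "[\t,]", 3);
        // prints 2
        System.out.println(records.size());
    }
}
```

Inside a load function the skip would usually become "read the next line" rather than "continue", and counting the skipped lines is worth considering so that data loss stays visible.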