Re: JSONToTuple for pig UDF

2011-04-19 Thread John Hui
pr 19, 2011 at 3:09 PM, Xavier Stevens wrote:
> Hey John,
>
> If you take a look at mine it looks explicitly for Lists and converts
> them to DataBags. I ran into that issue with our data. That said I won't
> make any claims that it'll work for all data.
>
> Cheers,

Re: JSONToTuple for pig UDF

2011-04-19 Thread John Hui
I'll post my solution in a few hours =)
On Tue, Apr 19, 2011 at 3:02 PM, John Hui wrote:
> I don't think one parser will work for every case. It really depends on
> your data, since there might be a list within a list.
>
> But pick any one as a starting point and cus

Re: JSONToTuple for pig UDF

2011-04-19 Thread John Hui
I don't think one parser will work for every case. It really depends on your data, since there might be a list within a list. But pick any one as a starting point and customize it for your own JSON data format.
On Tue, Apr 19, 2011 at 3:00 PM, Alan Gates wrote:
>
> On Apr 19, 2011, at 11:44 A
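For the list-within-a-list case raised above, a recursive conversion is one way to handle arbitrary nesting. The sketch below is plain Java with no Pig dependency: `Bag` is a hypothetical stand-in for Pig's DataBag, a tuple is modeled as a `List<Object>`, and `convert` is an illustrative name, not any of the UDFs mentioned in this thread. It assumes the JSON parser has already produced nested `java.util.List` values.

```java
import java.util.ArrayList;
import java.util.List;

final class NestedListToBag {
    /** Hypothetical stand-in for Pig's DataBag: an ordered collection of tuples. */
    static final class Bag {
        final List<List<Object>> tuples = new ArrayList<>();

        // Renders in a Pig-like notation: {(a),(b)}
        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder("{");
            for (int i = 0; i < tuples.size(); i++) {
                if (i > 0) sb.append(',');
                sb.append('(');
                List<Object> tuple = tuples.get(i);
                for (int j = 0; j < tuple.size(); j++) {
                    if (j > 0) sb.append(',');
                    sb.append(tuple.get(j));
                }
                sb.append(')');
            }
            return sb.append('}').toString();
        }
    }

    /** Converts every List, at any depth, into a Bag of single-field tuples. */
    static Object convert(Object value) {
        if (value instanceof List) {
            Bag bag = new Bag();
            for (Object element : (List<?>) value) {
                List<Object> tuple = new ArrayList<>();
                tuple.add(convert(element)); // recurse so inner lists become inner bags
                bag.tuples.add(tuple);
            }
            return bag;
        }
        return value; // scalars pass through unchanged
    }
}
```

With this sketch, `NestedListToBag.convert(Arrays.asList(1, Arrays.asList(2, 3)))` renders as `{(1),({(2),(3)})}`. JSON objects (maps) are deliberately left out; how they should map to tuples depends on the data, which is the point made above.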

Re: JSONToTuple for pig UDF

2011-04-19 Thread John Hui
I have a JSON library and Pig script working. Should I just contribute it instead of reinventing the wheel?
John
On Tue, Apr 19, 2011 at 2:44 PM, Daniel Eklund wrote:
> Bill, thanks...
>
> so that is a confirmation... people have rolled their own, and it's not in
> piggybank.
> I would absol

Re: Comparison between long

2010-12-15 Thread John Hui
btw.. you probably want to call that InSeconds, plural)
>
> D
>
> On Wed, Dec 15, 2010 at 2:00 PM, John Hui wrote:
> > To give more context, ISOToUnixInSecond returns Unix time in seconds. The
> > return value of this function is a Long.
> >
> > 75 @
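The source of ISOToUnixInSecond is cut off in this excerpt, so the following is only a guess at the kind of conversion it performs: an ISO-8601 timestamp in, Unix time in seconds out, boxed as a `Long`. The class and method names are hypothetical, and it uses `java.time`, which is newer than this thread. It also assumes input in the `2010-12-15T14:00:00Z` form and returns null for anything unparseable.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;

final class IsoToUnixSeconds {
    /** Returns seconds since the Unix epoch, or null if the input is missing or malformed. */
    static Long toUnixSeconds(String iso) {
        if (iso == null) {
            return null;
        }
        try {
            // Instant.parse expects an ISO-8601 instant such as 2010-12-15T14:00:00Z
            return Instant.parse(iso).getEpochSecond();
        } catch (DateTimeParseException e) {
            return null;
        }
    }
}
```

For example, `IsoToUnixSeconds.toUnixSeconds("2010-12-15T14:00:00Z")` yields `1292421600`.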

Re: Comparison between long

2010-12-15 Thread John Hui
the Long into the proper long type in the Pig script.
ISOToUnixInSecond('$STARTDATETIME') AS startTime:long
Hence, during the comparison, it treats the Long as a string value ...
On Wed, Dec 15, 2010 at 4:28 PM, John Hui wrote:
> This is actually, please ignore the code section belo

Re: Comparison between long

2010-12-15 Thread John Hui
('$STARTDATETIME') AS startTime:long;
eventData = FILTER eventData BY (event == 'adImpression') AND (eventTimestamp <= startTime);
DESCRIBE eventData;
B = GROUP eventData BY (event, publication, deviceType, adID, mcc);
On Wed, Dec 15, 2010 at 4:21 PM, John Hui wrote

Comparison between long

2010-12-15 Thread John Hui
I am having a hard time getting a comparison to work. I am comparing two long values, but I keep getting a Long-to-String cast error. Backend error message: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.String at java.lang.String.compareT
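The same exception can be reproduced in plain Java, which may help in reading the stack trace. This sketch does not reproduce Pig's internals; it only shows that when one operand is held as a String and the comparison goes through the untyped `Comparable` interface, `String.compareTo` tries to cast the other operand to String and fails. `compareUntyped` is a hypothetical helper written for this illustration.

```java
final class LongVsStringComparison {
    // Compares through the raw Comparable interface, as generic code might
    // when it trusts a declared (or defaulted) type instead of the real one.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int compareUntyped(Object left, Object right) {
        return ((Comparable) left).compareTo(right);
    }

    public static void main(String[] args) {
        Object treatedAsString = "1292421600";          // value handled as a string
        Object actualLong = Long.valueOf(1292421600L);   // value that is really a Long

        try {
            compareUntyped(treatedAsString, actualLong);
        } catch (ClassCastException e) {
            // String.compareTo cannot cast the Long argument to String
            System.out.println("type mismatch: " + e.getMessage());
        }

        // With both sides typed as Long the comparison works as expected.
        System.out.println(compareUntyped(Long.valueOf(1L), Long.valueOf(2L)) < 0);
    }
}
```

The takeaway is that both sides of the comparison need to agree on the type before they are compared.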

Re: LOAD data USING to parse data in order to obtain the AS as desired.

2010-11-30 Thread John Hui
hird column into
> multiple columns.
>
> -D
>
> On Tue, Nov 30, 2010 at 9:26 AM, John Hui wrote:
>
> > You can try using a custom storage parser.
> >
> > You can see a bunch of examples here..
> >
> > pig-0.7.0/contrib/piggybank/java/sr

Re: LOAD data USING to parse data in order to obtain the AS as desired.

2010-11-30 Thread John Hui
You can try using a custom storage parser. You can see a bunch of examples here:
pig-0.7.0/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage
I wrote one for JSON.
On Tue, Nov 30, 2010 at 12:16 PM, Yves Roy wrote:
> Hello:
>
> I hope this is not double posting.
>
> I wa

Re: UDF Loader - one line in input result in multiple tuples

2010-10-28 Thread John Hui
Awesome Alan, let me try that out and see if it works.
John
On Thu, Oct 28, 2010 at 11:49 AM, Alan Gates wrote:
>
> On Oct 28, 2010, at 8:36 AM, John Hui wrote:
>
> > I looked into returning a data bag as an option. The problem is that the Loader
> > interface requires me to ret

Re: UDF Loader - one line in input result in multiple tuples

2010-10-28 Thread John Hui
itutes a record your loader gets --
> use a different InputFormat/RecordReader to produce the records as
> needed, instead of feeding you lines.
>
> -D
>
> On Thu, Oct 28, 2010 at 8:36 AM, John Hui wrote:
> > I looked into returning a data bag as an option. The prob

Re: UDF Loader - one line in input result in multiple tuples

2010-10-28 Thread John Hui
loader to return a bag of tuples. Right?
John
On Wed, Oct 27, 2010 at 6:00 PM, John Hui wrote:
> Hi Pig Users,
>
> I am currently writing a UDF loader. In one of my use cases, one line in
> the input stream results in multiple tuples. Has anyone encountered or solved
> this iss

UDF Loader - one line in input result in multiple tuples

2010-10-27 Thread John Hui
Hi Pig Users, I am currently writing a UDF loader. In one of my use cases, one line in the input stream results in multiple tuples. Has anyone encountered or solved this issue on their end? The current getNext method only returns a single tuple, but I want it to return a List. Let me kn
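Besides the options discussed in the replies (returning a bag, or changing the InputFormat/RecordReader), one general pattern for a one-record-per-call interface is to buffer the extra records and hand them out on later calls. This is not something proposed in the thread. The sketch below is plain Java without Pig: `BufferingLineReader` is a hypothetical class, a record is modeled as a `List<String>`, and the `;` and `,` separators are assumptions made only for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

final class BufferingLineReader {
    private final Iterator<String> lines;
    // Records parsed from the current line but not yet handed out.
    private final Deque<List<String>> pending = new ArrayDeque<>();

    BufferingLineReader(Iterator<String> lines) {
        this.lines = lines;
    }

    /** Returns the next record, or null once the input is exhausted. */
    List<String> getNext() {
        // Refill the buffer from the next line(s) only when it has run dry.
        while (pending.isEmpty() && lines.hasNext()) {
            for (String record : lines.next().split(";")) {
                if (!record.isEmpty()) {
                    pending.add(Arrays.asList(record.split(",")));
                }
            }
        }
        return pending.poll(); // poll() gives null when nothing is left
    }
}
```

Given the lines `a,1;b,2` and `c,3`, successive `getNext()` calls return `[a, 1]`, `[b, 2]`, `[c, 3]`, and then null.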

Add constant to pig output

2010-10-20 Thread John Hui
Hi, I have this Pig script.
data = LOAD '$INPUT' USING PigStorage(',') AS (app:chararray, user:chararray, timestamp:int, duration:int);
appUserIn = FOREACH data GENERATE app, user;
distinctAppUserIn = DISTINCT appUserIn;
groupOnApp = GROUP distinctAppUserIn BY app;