Re: How do I load JSON in Pig?

2012-11-17 Thread Dan Young
Not sure if this helps, but in 0.11 I've been using this on EMR for some of our JSON data: raw = load 'hdfs:///cleaned_logs/clicks2/$year_id/$month_id/part-*' USING JsonLoader('a:chararray,at:chararray,c1:(url:chararray,useragent:chararray,referrer:chararray,window:(innerheight:chararray,innerwi

Re: STREAM in foreach block

2012-09-17 Thread Dan Young
I believe these are the ops supported in a nested foreach: CROSS, DISTINCT, FILTER, FOREACH, LIMIT, and ORDER BY. See: http://pig.apache.org/docs/r0.10.0/basic.html#foreach On Sep 17, 2012 1:55 PM, "Kannan Shah" wrote: > I'm trying to group tuples by a key, sort by another key within each gro

Hadoop version question...

2012-08-07 Thread Dan Young
I noticed that Amazon EMR now supports Hadoop 1.0.3. Does Pig 0.10.x work with, or is it certified for, Hadoop 1.0.3? Regards, Dano

MaxMapTaskFailuresPercent setting?

2012-08-01 Thread Dan Young
Can I set MaxMapTaskFailuresPercent in my Pig jobs? If so, how? What would the set command be? Regards, Dano

Re: How to CONCAT multiple expressions

2012-07-11 Thread Dan Young
Try org.apache.pig.builtin.StringConcat. Dano On Tuesday, July 10, 2012, Cdy Chen wrote: > Hi all, > > I am a newcomer here. I encountered a problem today: > > Pig version: 0.10.0 > > temp2 = LOAD '/pig/procedure/tzone' USING PigStorage(';'); > zone = FOREACH temp2 > { >a = STRSPLIT($0,'#',3)

Re: Copying files to Amazon S3 using Pig is slow

2012-06-09 Thread Dan Young
Definitely go down the s3distcp route. I use it to copy a large number of small files from s3 into fewer, larger ones in HDFS, and it's been working great. It also makes the Pig jobs run faster than having Pig try to load the files from s3. Regards, Dan On Fri, Jun 8, 2012 at 5:40

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Dan Young
A quick test would be to scp the 0.10 pig.jar over to your master node, and then run: hadoop -jar pig.jar . Run your script in grunt... Dano On May 17, 2012 5:26 PM, "Nerius Landys" wrote: > > Have you tried 0.10? > > No but I can and will try it. I've been using whatever is on Amazon > becaus

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Dan Young
We ended up using 0.10 on EMR and it's been working fine so far... Dano On May 17, 2012 5:26 PM, "Nerius Landys" wrote: > > Have you tried 0.10? > > No but I can and will try it. I've been using whatever is on Amazon > because that is the system that we'll be using. > I'll report back on my find

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Dan Young
Have you tried 0.10? On May 17, 2012 5:13 PM, "Nerius Landys" wrote: > > What version of pig are you using on EMR? > > hadoop@ip-10-190-83-146:~$ pig --version > Apache Pig version 0.9.2-amzn (rexported) > compiled Apr 06 2012, 23:48:53 >

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Dan Young
What version of pig are you using on EMR? On May 17, 2012 5:02 PM, "Nerius Landys" wrote: > > Did you try to escape the backslash? > > I just tried this: > > POSA = FOREACH TEST GENERATE STRSPLIT(startpos,'\\u002F'); > > ... and still the same result. By the way I'm using a forward slash > for

Re: STRSPLIT problems (or UDF shortcoming?)

2012-05-17 Thread Dan Young
Did you try to escape the backslash? Dano On Thu, May 17, 2012 at 11:57 AM, Nerius Landys wrote: > I'm having problems using Pig's STRSPLIT (on Amazon's cloud computing > environment). > I also noticed that STRSPLIT isn't documented in the Pig Latin > Reference Manual, so I found out about it

Re: Apache Pig and EMR....

2012-05-16 Thread Dan Young
ssell Jurney http://datasyndrome.com > > On May 16, 2012, at 11:39 AM, Dan Young wrote: > > > I'm trying to run 0.10 on EMR and noticed this open issue: > > > > https://issues.apache.org/jira/browse/PIG-2562 > > > > HortonWork's release

Apache Pig and EMR....

2012-05-16 Thread Dan Young
I'm trying to run 0.10 on EMR and noticed this open issue: https://issues.apache.org/jira/browse/PIG-2562 Hortonworks' release notes state that s3 is supported. Is this the case or not? Regards, Dan

Re: Slides from Apache Pig Hackday, Austin edition

2012-05-11 Thread Dan Young
a family event today so he was > unable > > to make it. > > > > On May 11, 2012, at 3:06 PM, Dan Young wrote: > > > > > Sweet! Thank you. > > > > > > Anything going on there with JRuby UDFs? > > > > > > Regards, > > > &

Re: Slides from Apache Pig Hackday, Austin edition

2012-05-11 Thread Dan Young
Sweet! Thank you. Anything going on there with JRuby UDFs? Regards, Dan On May 11, 2012 2:00 PM, "Jeremy Hanna" wrote: > Here in Austin, we've been having a hack day for beginning to intermediate > developers. Just wanted to post some slides that were from presentations > here: > Pig 101 - >

Re: Hackday Skype: apachepig

2012-05-11 Thread Dan Young
Will you be able to post any slides, notes, etc online afterwards? Regards, Dan On Fri, May 11, 2012 at 11:04 AM, Russell Jurney wrote: > Up to 10 people can skype in to the Pig hackday. Call apachepig :) > > -- > Russell Jurney twitter.com/rjurney russell.jur...@gmail.com > datasyndrome.c

Javascript UDF woes...

2012-05-01 Thread Dan Young
rse vs the eval but I still can't get the tuple definition/outputSchema working. Any insight/pointers would be greatly appreciated. Regards, Dano -- Forwarded message ------ From: "Dan Young" Date: Apr 27, 2012 4:00 PM Subject: Javascript UDF woes... To: I'm

Javascript UDF woes...

2012-04-27 Thread Dan Young
I'm trying to output two values from a javascript udf, which takes a JSON document as input, but for the life of me I can't seem to get it to return something where I can reference the device_id or user_device. I've tried this, but when I dump the data I just get nothing => () get_device_id_and_ty

RE: Apache Pig hackday @ Twitter (SF)

2012-04-20 Thread Dan Young
As well from Boulder, CO! Regards, Dan On Apr 20, 2012 6:56 AM, "Doug Daniels" wrote: > If we figure out how to do remote participants, I'd love to join as well > from NY. > > Thanks, > Doug > > From: Jeremy Hanna [jeremy.hanna1...@gmail.com] > Sent: Th

Re: Javascript UDF's

2012-04-12 Thread Dan Young
o make them > better :) You should give feedback on features you'd like, syntax that > would be useful, etc. > > 2012/4/12 Dan Young > > > Can anyone comment on whether or not Javascript UDFs are here to stay? on > > the wiki it states "Note: JavaScript UDFs are an experimental > feature." > > > > Regards, > > Dan > >

Javascript UDF's

2012-04-12 Thread Dan Young
Can anyone comment on whether or not Javascript UDFs are here to stay? On the wiki it states "Note: JavaScript UDFs are an experimental feature." Regards, Dan

Re: Trying to store a bag of tuples using AvroStorage.

2012-04-03 Thread Dan Young
' before the last ']'. > > > > On Tue, Apr 3, 2012 at 10:32 AM, Dan Young wrote: > > > I just updated my pig from svn repo and now am using the latest from > trunk: > > > > pig -i > > Apache Pig version 0.11.0-SNAPSHOT (r1309051) > > comp

Re: Trying to store a bag of tuples using AvroStorage.

2012-04-03 Thread Dan Young
b.com/2293909 Regards, Dan On Tue, Apr 3, 2012 at 11:07 AM, Russell Jurney wrote: > This looks like a bug fixed in 0.10. Mind trying it? > > Russell Jurney http://datasyndrome.com > > On Apr 3, 2012, at 9:13 AM, Dan Young wrote: > > > Hello Stan, > > > > I'm

Re: Trying to store a bag of tuples using AvroStorage.

2012-04-03 Thread Dan Young
. Mind trying it? > > Russell Jurney http://datasyndrome.com > > On Apr 3, 2012, at 9:13 AM, Dan Young wrote: > > > Hello Stan, > > > > I'm back from Mexico now, and here's my GIST with all the information. > > > > https://gist.github.com/22

Re: Trying to store a bag of tuples using AvroStorage.

2012-04-03 Thread Dan Young
ter tonight unless > someone > > else has a more immediate answer. > > > > Best, > > > > stan > > > > On Mar 25, 2012 12:36 AM, "Dan Young" wrote: > >> > >> Hello all, > >> > >> I'm trying

Trying to store a bag of tuples using AvroStorage.

2012-03-24 Thread Dan Young
Hello all, I'm trying to store a bag of tuples using AvroStorage but am not able to figure out what I'm doing wrong (or if it's supported). What I have is the following: grunt>illustrate c; - | c

RE: Selective removal of data from a relation

2012-03-18 Thread Dan Young
Post it on https://gist.github.com/ and email out the gist. Regards, Dan On Mar 18, 2012 12:33 PM, "rakesh sharma" wrote: > > All indentations get removed when message comes back from > user@pig.apache.org. Any idea how I can make it work. > > > From: rakesh_sharm...@hotmail.com > > To: user@pi

Javascript UDF-Illustrate

2012-01-12 Thread Dan Young
Hello All, New to Pig and I'm playing around with using a Javascript UDF to parse some Google Adwords reports and am having some issues with Illustrate. I always get a java.lang.NullPointerException when I issue an Illustrate, although when I do a dump everything seems to work properly. Below is