RE: Can I check if the field is integer or not

2012-06-08 Thread Steve Bernstein
...for a one-digit integer, or '[0-9]+' for an integer of one or more digits. From: Jagat Singh [jagatsi...@gmail.com] Sent: Thursday, June 07, 2012 7:08 PM To: user@pig.apache.org Subject: Re: Can I check if the field is integer or not You can use regular

Re: Can I check if the field is integer or not

2012-06-08 Thread Kris Coward
Both of these cover only unsigned (non-negative) integers; to allow for negatives you'd want '-?[0-9]+'. On Fri, Jun 08, 2012 at 03:20:01PM +, Steve Bernstein wrote: ...for a one digit integer, or '[0-9]+' an integer of one or more digits. From:
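
As a rough illustration of the patterns discussed in this thread, here is a minimal Python sketch (not Pig Latin) that applies '-?[0-9]+' to a whole field. It assumes the pattern must match the entire value, which is how I understand Pig's MATCHES operator to behave; the names `INTEGER_PATTERN` and `looks_like_integer` are hypothetical and do not come from the thread.

```python
import re

# Pattern suggested in the thread: an optional leading minus sign,
# followed by one or more ASCII digits.
INTEGER_PATTERN = re.compile(r"-?[0-9]+")


def looks_like_integer(field):
    """Return True if the whole field is an optionally negative run of digits.

    `field` is a string or None (None stands in for a null field here).
    """
    # fullmatch requires the entire string to match, not just a prefix,
    # so values such as "12a" or "4.2" are rejected.
    return field is not None and INTEGER_PATTERN.fullmatch(field) is not None
```

With this sketch, "42" and "-7" are accepted, while "4.2", "12a", the empty string, and a null field are rejected. A leading plus sign or surrounding whitespace is also rejected, so the pattern would need extending if such values should count as integers.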

Re: pig:-storage problem

2012-06-08 Thread Alan Gates
Bouncing this to pig user, where you're more likely to get an answer. Could you share your Pig Latin script? It will make it easier to diagnose. Alan. On Jun 7, 2012, at 9:54 PM, avnish pundir wrote: hi everyone, I've to generate sequence number for my data in pig. I'm using RANDOM()

NoClassDefFoundError after upgrading to pig 0.10.0 from 0.9.0

2012-06-08 Thread Matthew Hayes
I'm using ivy to download dependencies for pig, but after updating the version to 0.10.0 I am getting errors in my unit tests: [testng] java.lang.NoClassDefFoundError: org/codehaus/jackson/map/util/LRUMap [testng] at org.apache.pig.builtin.JsonMetadata.init(JsonMetadata.java:75)

running pig on remote cluster

2012-06-08 Thread Stan Rosenberg
Hi, I am trying to submit a pig job to a remote cluster by setting mapred.job.tracker and fs.default.name accordingly. The job does get executed on the remote cluster, however all intermediate output is stored on the local cluster from which pig is run. From job configuration I can see that

Re: Replace null with string

2012-06-08 Thread Dragan Nedeljkovic
You can use a UDF like the one below to deal with the NULLs. register 'mypiggybank.jar'; define Nvl piggybank.Nvl(); input_lines = LOAD 'test_Nvl.in' AS (line:chararray); describe input_lines; dump input_lines; new_list = FOREACH input_lines GENERATE Nvl(line, 'n/a'); describe new_list; dump

Re: Copying files to Amazon S3 using Pig is slow

2012-06-08 Thread Aniket Mokashi
http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/UsingEMR_s3distcp.html On Fri, Jun 8, 2012 at 4:40 AM, James Newhaven james.newha...@gmail.com wrote: I want to copy 26,000 HDFS files generated by a pig script to Amazon S3. I am using the copyToLocal command, but I

Re: Copying files to Amazon S3 using Pig is slow

2012-06-08 Thread Mohit Anchlia
Also, use multiple S3 streams to get better throughput. On Fri, Jun 8, 2012 at 3:24 PM, Aniket Mokashi aniket...@gmail.com wrote: http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/UsingEMR_s3distcp.html On Fri, Jun 8, 2012 at 4:40 AM, James Newhaven

Re: Replace null with string

2012-06-08 Thread Russell Jurney
fixed = FOREACH my_relation GENERATE (my_field IS NOT NULL ? my_field : 'other') as my_field, *; Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com On Jun 8, 2012, at 3:01 PM, Dragan Nedeljkovic draga...@yahoo.com wrote: You can use a UDF like the one below to deal
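
The conditional expression in the message above keeps a field when it is not null and substitutes a default otherwise. A minimal Python sketch of the same logic follows, with `None` standing in for a Pig null; `replace_null`, `rows`, and the sample values are hypothetical names and data chosen only for illustration.

```python
def replace_null(value, default="other"):
    """Return `value` unless it is None, in which case return `default`."""
    # Mirrors the shape of: (my_field IS NOT NULL ? my_field : 'other')
    return value if value is not None else default


# Hypothetical one-field rows; None plays the role of a null field.
rows = [("a",), (None,), ("",)]
fixed = [(replace_null(field),) for (field,) in rows]
```

Only the null row is changed in this sketch. The empty string is a real value, not a null, so it passes through untouched; it would need a separate check if it should also be replaced.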