Streaming to PHP

2010-09-29 Thread Rob Wilkerson
I have a Pig script--currently running in local mode--that processes a huge file containing a list of categories: /root/level1/level2/level3 /root/level1/level2/level3/level4 ... I need to insert each of these into an existing database by calling a stored procedure. Because I'm new to

Re: Streaming to PHP

2010-09-29 Thread Dmitriy Ryaboy
Rob, I don't know PHP so can't advise you on the command-line flags, but I just tried it with Perl, using both Pig 0.6 and Pig 0.8, and this works: grunt> cats = load 'tmp/text.txt'; grunt> dump cats; (Art) (Arts/Animation) (Arts/Animation/Anime) (Arts/Animation/Anime/Characters) (Arts/Animation/A
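Since Dmitriy's test shows that any command-line script can sit behind Pig's STREAM ... THROUGH operator, the same idea can be sketched in Python. This is a minimal illustration, not the script from the thread: it assumes Pig's default streaming format (one record per line, fields separated by tabs), and the function name `process_stream` and the placeholder for the stored-procedure call are hypothetical.

```python
import io
import sys


def process_stream(lines, out):
    """Read tab-delimited records, one per line, and echo the first field.

    Returns the number of non-empty records handled.
    """
    count = 0
    for line in lines:
        record = line.rstrip("\n")
        if not record:
            continue  # skip blank lines
        fields = record.split("\t")
        category = fields[0]
        # Hypothetical placeholder: a real script would call the
        # stored procedure for `category` here instead of echoing it.
        out.write(category + "\n")
        count += 1
    return count


# Demo with the sample categories shown in the grunt session above.
# As a streaming script, the call would be process_stream(sys.stdin, sys.stdout).
demo_out = io.StringIO()
handled = process_stream(["Art\n", "Arts/Animation\n", "Arts/Animation/Anime\n"], demo_out)
print(handled)
```

Keeping the record handling in a function that takes any iterable of lines makes the script easy to check locally before wiring it into a Pig job.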

Magic numbers in my pig scripts

2010-09-29 Thread Eric Wadsworth
Hi folks! I'm brand new to this list, so apologies if this is an inappropriate newbie question, or is otherwise incorrect, but here goes. I'm working with a bunch of pig scripts, and we're adding new ones almost daily. They are getting more and more complex. The problem is exacerbated by the

Re: Magic numbers in my pig scripts

2010-09-29 Thread Saurav Datta
Hi Eric, As I understand it, you would like to define the value of the filter at run time, with the value taken from a file. Am I correct? Regards, Saurav On Sep 29, 2010, at 10:00 AM, Eric Wadsworth wrote: Hi folks! I'm brand new to this list, so apologies if this is an inappropri

Re: Magic numbers in my pig scripts

2010-09-29 Thread Eric Wadsworth
Saurav, Not that limited, but yes. Another example is in order. Say I have something like this: projected_data = FOREACH data GENERATE com.example.udfs.foo(7, 37, 'https', fields#'bar') as bat; This sort of thing would be vastly better: projected_data = FOREACH data GENERATE com.example.udfs

RE: Magic numbers in my pig scripts

2010-09-29 Thread Aniket Mokashi
http://wiki.apache.org/pig/ParameterSubstitution http://hadoop.apache.org/pig/docs/r0.3.0/piglatin.html Also, Pig 0.8 can have RECORD_TYPE_ALPHA take runtime values (alias like filtered_stuff_threshold). https://issues.apache.org/jira/browse/PIG-1434 Thanks, Aniket -Original Message-

Re: Magic numbers in my pig scripts

2010-09-29 Thread Saurav Datta
Same here; I was about to suggest parameter substitution, reading the values from a parameter file. Here is how you declare the variables year, month, and date: A = load '/INPUTDIR/$year/$month/$date/input_test.dat' using PigStorage(' ') as (field1, field2, field3); Here is how you invoke the Pig script, i
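To make the effect of the substitution concrete, here is a small Python sketch that expands the $year, $month, and $date placeholders in the load statement above. It only mimics the idea using Python's `string.Template`; it is not Pig's own preprocessor, and the helper name `substitute_params` and the parameter values are assumptions chosen for illustration.

```python
from string import Template


def substitute_params(script_text, params):
    """Replace $name placeholders with values; raises KeyError if one is missing."""
    return Template(script_text).substitute(params)


LOAD_STATEMENT = (
    "A = load '/INPUTDIR/$year/$month/$date/input_test.dat' "
    "using PigStorage(' ') as (field1, field2, field3);"
)

# Hypothetical parameter values, as they might appear in a parameter file.
expanded = substitute_params(
    LOAD_STATEMENT, {"year": "2010", "month": "09", "date": "29"}
)
print(expanded)
```

The statement that Pig would then run has a fully resolved input path, so the same script can be reused for a different day by changing only the parameter values.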

RE: [DISCUSS] Apache Pig bylaws

2010-09-29 Thread Santhosh Srinivasan
I like the bylaws as they stand now. Consensus for removal and 2/3 votes for new code bases and change in bylaws makes sense as these activities are not frequent occurrences. Santhosh -Original Message- From: Alan Gates [mailto:ga...@yahoo-inc.com] Sent: Tuesday, September 28, 2010 10:

RE: Magic numbers in my pig scripts

2010-09-29 Thread Matthew Smith
Maybe this is off topic, but I used it in Java code with a parameter array. In MAIN (or UI, Input, etc.): String[] params = new String[2]; params[0] = "date"; params[1] = "filter_regex"; runScript(params); in runScript(String[] params, PigServer server, String inputPath, String outputPath) PigServ

Re: Problem with LzoTokenizedLoader with elephant-bird branch for Pig 0.7

2010-09-29 Thread ed
Hello, I tested the newest push to the hirohanin elephant-bird branch (for pig 0.7) and had an error when trying to use LzoTokenizedLoader with the following pig script: REGISTER elephant-bird-1.0.jar REGISTER /usr/lib/elephant-bird/lib/google-collect-1.0.jar A = load '/usr/foo/inp

Re: project on pigerry

2010-09-29 Thread Alan Gates
We keep tabs on projects we have worked on, are working on, and are thinking of working on at http://wiki.apache.org/pig/PigJournal This should give you some ideas for projects. Alan. On Sep 28, 2010, at 11:38 AM, yoomeosym...@yahoo.com wrote: Kindly give a set of project on the above, fo

Re: Accessing Nested Json

2010-09-29 Thread Alan Gates
Are you loading them as tuples or maps? If you're loading them as tuples, then you should be able to say x.keyA.pA (which should return "vA"). If you're loading them as maps, then it would be x#'keyA'#'pA' Alan. On Sep 28, 2010, at 12:45 PM, rakesh kothari wrote: Hi, Is there a good way
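The map form x#'keyA'#'pA' is a chain of key lookups, one per nesting level. A rough Python analogue is sketched below; the shape of the record is an assumption inferred from Alan's reply (the original question is truncated here), and `lookup` is a hypothetical helper, not part of Pig.

```python
def lookup(record, *keys):
    """Follow a chain of keys through nested dictionaries."""
    value = record
    for key in keys:
        value = value[key]  # one dereference per level, like each #'key' in Pig
    return value


# Assumed shape of the nested JSON record discussed in the thread.
x = {"keyA": {"pA": "vA"}}

print(lookup(x, "keyA", "pA"))  # prints vA
```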

Re: [DISCUSS] Apache Pig bylaws

2010-09-29 Thread Benjamin Reed
i also like them! excellent work. ben On 09/29/2010 10:57 AM, Santhosh Srinivasan wrote: I like the bylaws as they stand now. Consensus for removal and 2/3 votes for new code bases and change in bylaws makes sense as these activities are not frequent occurrences. Santhosh -Original Mes

Re: [DISCUSS] Apache Pig bylaws

2010-09-29 Thread Benjamin Reed
one small thing that i'm not sure exactly how to address. you don't want voting intervals to be too long because they can slow down the release if issues come up and you need to keep restarting the vote. on the other hand, a short interval means that people may miss it. it might be nice to ha

Re: Magic numbers in my pig scripts

2010-09-29 Thread Eric Wadsworth
Piggers, Parameter substitution isn't really what I'm needing. After some discussion with my co-workers, it looks like the best feature would really be sort of a pre-processor. Basically, insert a line in your pig script that would "include" another pig script, right there. Then that other pi

Re: Accessing Nested Json

2010-09-29 Thread hc busy
I thought map can only take bytearray as value type? On Wed, Sep 29, 2010 at 1:53 PM, Alan Gates wrote: > Are you loading them as tuples or maps? If you're loading them as tuples > then you should be able to say x.keyA.pA (which should return "vA"). If > you're loading them as maps then it wou

Re: Magic numbers in my pig scripts

2010-09-29 Thread Thejas M Nair
Support for functions, as part of the Turing-complete Pig effort, should help (it is in the early design stages): http://wiki.apache.org/pig/TuringCompletePig -Thejas On 9/29/10 3:32 PM, "Eric Wadsworth" wrote: Piggers, Parameter substitution isn't really what I'm needing. After some discussion wi

Re: Accessing Nested Json

2010-09-29 Thread Alan Gates
On Sep 29, 2010, at 3:46 PM, hc busy wrote: I thought map can only take bytearray as value type? No, it can take any type as a value. There are just a number of places where Pig assumes it is a byte array and then does the wrong thing (like if you try to order by it). If the user just d

Re: Accessing Nested Json

2010-09-29 Thread hc busy
hooray! On Wed, Sep 29, 2010 at 4:24 PM, Alan Gates wrote: > > On Sep 29, 2010, at 3:46 PM, hc busy wrote: > > I thought map can only take bytearray as value type? >> > > No, it can take any type as a value. There are just a number of places > where Pig assumes it is a byte array and then does

funny error

2010-09-29 Thread hc busy
Guys, I'm seeing this one 2998 Unexpected internal error. Can we be more specific or dump a stack trace when this happens?

Re: funny error

2010-09-29 Thread Jeff Zhang
No other stack trace? And in what situation does this happen? On Thu, Sep 30, 2010 at 11:09 AM, hc busy wrote: > Guys, I'm seeing this one > > 2998 > > Unexpected internal error. > > > Can we be more specific or dump a stack trace when this happens? > -- Best Regards Jeff Zhang

Re: funny error

2010-09-29 Thread hc busy
"null" was the error. this 60k PigLatin script that I'm running hasn't changed that much, but suddenly started erroring out. I've rebuilt pig release 7 from scratch, checked java version, err... checked PiggyBank and our own libraries, not there changed. You know, some comercial software that has

Re: funny error

2010-09-29 Thread hc busy
But having complained about error, I want to say this seems like a really courteous error message ERROR 1007: Found duplicates in schema. : 2 columns, column_name: 2 columns. *Please* alias the columns with unique names. Where it explains nicely how to fix the error. On Wed, Sep 29, 2010 at

Re: Problem with LzoTokenizedLoader with elephant-bird branch for Pig 0.7

2010-09-29 Thread Rohan Rai
Hi, Which Hadoop/Pig version are you using? Regards Rohan ed wrote: Hello, I tested the newest push to the hirohanin elephant-bird branch (for pig 0.7) and had an error when trying to use LzoTokenizedLoader with the following pig script: REGISTER elephant-bird-1.0.jar REGISTER /u

RE: funny error

2010-09-29 Thread Santhosh Srinivasan
Are you sure that the 2998 and 1007 both popped up? -Original Message- From: hc busy [mailto:hc.b...@gmail.com] Sent: Wednesday, September 29, 2010 8:21 PM To: pig-user@hadoop.apache.org Subject: Re: funny error But having complained about error, I want to say this seems like a really