Re: embedded pig error

2011-05-13 Thread Andrew Wells
I should be clearer: append the path to the root class path, not the classpath. On Sat, May 14, 2011 at 1:05 AM, Andrew Wells wrote: > I had a similar problem; you need to add the path to the rootClasspath, not > just the class path. > > On Fri, May 13, 2011 at 8:49 PM, Daniel Dai wrote: >

Re: embedded pig error

2011-05-13 Thread Andrew Wells
I had a similar problem; you need to add the path to the rootClasspath, not just the class path. On Fri, May 13, 2011 at 8:49 PM, Daniel Dai wrote: > Sounds like a hadoop job setup exception. Go to the job tracker UI; you may > have a chance to locate the job and check what happened in job setup. > >

Re: release notes?

2011-05-13 Thread Dmitriy Ryaboy
Released 0.8.1
INCOMPATIBLE CHANGES
pig-1936: documentation update (chandec via olgan)
PIG-1680: HBaseStorage should work with HBase 0.90 (gstathis, billgraham, dvryaboy, tlipcon via dvryaboy)
IMPROVEMENTS
PIG-1830: Type mismatch error in key from map, when doing GROUP on PigStorageSchema() va

Re: embedded pig error

2011-05-13 Thread Daniel Dai
Sounds like a hadoop job setup exception. Go to the job tracker UI; you may have a chance to locate the job and check what happened in job setup. Daniel On 05/11/2011 05:45 PM, Jianting Cao wrote: I'm trying to embed pig into a java program. I tried two approaches; neither of them works. Approach 1: I fo

Re: java.lang.OutOfMemoryError while running Pig Job

2011-05-13 Thread Thejas M Nair
The stack trace shows that the OOM error is happening when the distinct is being applied. It looks like in some record(s) of the relation group_it, one or more of the following bags is very large: logic.c_users, logic.nc_users or logic.registered_users. Try setting the property pig.cachedbag.memusa

Re: order by throwing exception in cluster

2011-05-13 Thread Thejas M Nair
The exception stack has LocalJobRunner; that is strange. Have you specified the cmd line option "-x mapreduce"? Is the hadoop conf dir in the class path? -Thejas On 5/13/11 12:37 PM, "Irooniam" wrote: Hello, I'm running into a weird problem that I'm hoping you can help me with. I'm basically j

Re: release notes?

2011-05-13 Thread Daniel Dai
You can: 1. Check CHANGE.txt, which has all the issues fixed in 0.8.1. 2. Go to Jira and search for tickets with fix version 0.8.1. Daniel On 05/13/2011 12:36 PM, Corbin Hoenes wrote: Is there a change log for the 0.8.1 release? release notes.txt just mentions "bug fixes"

Re: release notes?

2011-05-13 Thread Mark Laczin
I will note that the notes themselves don't contain much. On Fri, May 13, 2011 at 4:11 PM, Mark Laczin wrote: > http://www.fightrice.com/mirrors/apache/pig/pig-0.8.1/ > > It's odd, and should probably be changed, but you click "Pig 0.8 and > later" on the Releases page, then click the mirror link

Re: release notes?

2011-05-13 Thread Mark Laczin
http://www.fightrice.com/mirrors/apache/pig/pig-0.8.1/ It's odd, and should probably be changed, but you click "Pig 0.8 and later" on the Releases page, then click the mirror link that Apache gives you, then click pig-0.8.x/. Then they're in there. (It took me a while to find too.) On Fri, May 13, 2

order by throwing exception in cluster

2011-05-13 Thread Irooniam
Hello, I'm running into a weird problem that I'm hoping you can help me with. I'm basically just loading an access log, grouping, ordering and then dumping the data. I can load the log, group and order when I'm in local mode, but when I try to do the same in the hadoop cluster I always get an err

release notes?

2011-05-13 Thread Corbin Hoenes
Is there a change log for the 0.8.1 release? release notes.txt just mentions "bug fixes"

Re: input into pig

2011-05-13 Thread Mark Laczin
I'm not sure if Pig can do this. It's designed to follow the MapReduce/Hadoop paradigm, which typically involves data on disk -> MapReduce Jobs -> data on disk. You could try to create a custom InputSplit/RecordReader to read from a program's standard output or something, but this is kind of hacky.

Re: input into pig

2011-05-13 Thread Jianting Cao
Thank you Mark. Sorry that I wasn't clear enough. What I want is this: there are some programs running and generating a lot of data, and instead of putting this data into a relational database, I want to output it directly to Pig and do some analysis along the way or afterwards. So I'm asking if there i

Re: Explain Plan in Pig

2011-05-13 Thread Dmitriy Ryaboy
I didn't know about the -e explain option! That's massively helpful. Thanks Alan. On Thu, May 12, 2011 at 5:43 PM, Alan Gates wrote: > http://ofps.oreilly.com/titles/9781449302641/developing_and_testing.html#dev_tools > > Alan. > > On May 12, 2011, at 4:32 PM, sonia gehlot wrote: > >> Hi Guys, >>

Re: input into pig

2011-05-13 Thread Mark Laczin
Technically speaking, yes, you could store data in memory and keep it there, then have your program present some interface to store data (shared memory or reading from stdin or something), but I'm not sure why you'd want to do this. Maybe I'm misunderstanding your question, but it sounds like yo

input into pig

2011-05-13 Thread Jianting Cao
Hi, Is there only one way to load data into pig, i.e. using the load command to load data from files? Can I load data from memory, for example by creating a table in embedded code and storing data into it? Thanks, Jianting Cao

Re: Tuple to lines conversion in Pig

2011-05-13 Thread Vincent
Thanks Yong and Mridul, I was able to do the trick like this:
A = LOAD 'peoples.txt' USING PigStorage(';') AS (name : chararray, pets_ids : chararray);
B = FOREACH A GENERATE name, TOKENIZE(REPLACE(pets_ids, ',', ' ')) AS products_bag;
DUMP B;
DESCRIBE B;
C = FOREACH B GENERATE name, FLATTEN(p
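
For readers who want to check the expected result outside Pig, here is a minimal Python sketch of the same tuple-to-lines idea. It assumes each input line has the form `name;id1,id2,...`, and it assumes the final (cut-off) statement flattens the tokenized bag so that each name is paired with one id per output row; that is a guess at the intent, not a reconstruction of the original script. The function name `tuple_to_lines` and the sample rows are hypothetical.

```python
def tuple_to_lines(lines, field_sep=";", id_sep=","):
    """Expand 'name;id1,id2' records into one (name, id) row per id."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, ids = line.split(field_sep, 1)
        # Analogous to REPLACE(',', ' ') followed by TOKENIZE: split on whitespace.
        bag = ids.replace(id_sep, " ").split()
        # Analogous to FLATTEN: emit one output row per element of the bag.
        rows.extend((name, item) for item in bag)
    return rows


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the original peoples.txt.
    sample = ["alice;1,2,3", "bob;7"]
    for name, pet_id in tuple_to_lines(sample):
        print(f"{name}\t{pet_id}")
```

A record with an empty id list produces no rows in this sketch, and a line without the field separator raises a ValueError rather than being silently skipped.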

Re: Pig 0.7 download mirror sites not working

2011-05-13 Thread Subhramanian, Deepak
Hi Alan, Thank you for your help. On 12 May 2011 19:22, Alan Gates wrote: > Hadoop has removed the release artifacts of its former subprojects > (including Pig) from the mirrors. You can still find the release in > Apache's archive: http://archive.apache.org/dist/hadoop/pig/pig-0.7.0/ > > Alan. > >