date:20130718

java.lang.RuntimeException: native-lzo library not available

2013-07-18 Thread Bhavesh Shah

Hello, I have written a Pig script and tried to execute it, but after executing part of it, it gives me the error java.io.IOException: Spill failed. I have included the statements below in my script, and I have also set the classpath for the hadoop-LZO jar. 1) set mapred.compress.map.output true; 2) set ma

Re: Compiling PigUnit

2013-07-18 Thread j.barrett Strausser

I'm unable to recreate this by doing the following:

    tar xvzf pig-0.11.1.tar.gz
    cd pig-0.11.1
    ant
    ant pigunit-jar

I'm running on Mint 14:

    % java -version
    java version "1.7.0_25"
    Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
    Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)

Compiling PigUnit

2013-07-18 Thread Siegfried Bilstein

Hi everyone, I'm attempting to compile a pigunit jar so that I can begin unit testing my pig scripts, but I'm running into this issue during compilation: [ivy:resolve] [ivy:resolve] :: problems summary :: [ivy:resolve] WARNINGS [ivy:resolve] ::

Re: python version with Jython/Pig

2013-07-18 Thread Dexin Wang

Thanks. Instead, I found a Python implementation of the erf function, so that'll be good for now. http://stackoverflow.com/questions/457408/is-there-an-easily-available-implementation-of-erf-for-python On Wed, Jul 17, 2013 at 5:08 PM, Cheolsoo Park wrote: > Hi Dexin, > > Unfortunately, Pig is
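For readers facing the same gap, here is a minimal sketch of a pure-Python erf. It assumes the Jython runtime in use lacks `math.erf` (which was only added in CPython 2.7). It uses the Abramowitz and Stegun approximation 7.1.26, which is not necessarily the implementation Dexin settled on; the function name `erf_approx` is made up for illustration.

```python
import math

def erf_approx(x):
    """Approximate erf(x) with Abramowitz and Stegun formula 7.1.26.

    The published maximum absolute error is about 1.5e-7.
    """
    # erf is an odd function, so compute on |x| and restore the sign.
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)

    # Coefficients from formula 7.1.26.
    a1, a2, a3, a4, a5 = (0.254829592, -0.284496736, 1.421413741,
                          -1.453152027, 1.061405429)
    p = 0.3275911

    t = 1.0 / (1.0 + p * x)
    # Horner evaluation of a1*t + a2*t^2 + a3*t^3 + a4*t^4 + a5*t^5.
    poly = ((((a5 * t + a4) * t + a3) * t + a2) * t + a1) * t
    return sign * (1.0 - poly * math.exp(-x * x))
```

The function only needs `math.exp`, so it should also work when registered as a Jython UDF, although that part is untested here.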

Re: Getting dimension values for Facts

2013-07-18 Thread Pradeep Gollakota

Unfortunately I can't think of any good way of doing this (other than what Bertrand suggested with using a different language to generate the script). I'd also recommend Hive... it may be easier to do this in Hive since you have SQL-like syntax. (Haven't used Hive, but it looks like this type of t

Re: Want to add data in same file in Apache PIG?

2013-07-18 Thread Xuefu Zhang

One thing you can do though, is to let pig create new files every time and have a post-pig task/job to combine the old file and the new file. It's a little abnormal to require a single file on HDFS. Normally, MR or other jobs deal with a folder of files, not just a single file. Regards, Xuefu O

Re: DISTINCT and paritioner

2013-07-18 Thread Alan Gates

You're correct. It looks like an optimization was put in to make distinct use a special partitioner which prevents the user from setting the partitioner. Could you file a JIRA against the docs so we can get that fixed? Alan. On Jul 17, 2013, at 11:27 AM, William Oberman wrote: > The docs say

Re: Getting dimension values for Facts

2013-07-18 Thread Something Something

I don't think this is macro-able, Pradeep. Every step of the way a different column gets updated. For example, for FACT_TABLE3 we update 'col1' from DIMENSION1, for FACT_TABLE5 we update 'col2' from DIMENSION2 & so on. Feel free to correct me if I am wrong. Thanks. On Thu, Jul 18, 2013 at

Re: Getting dimension values for Facts

2013-07-18 Thread Pradeep Gollakota

Looks like this might be macroable. Not entirely sure how that can be done yet... but I'd look into that if I were you. On Thu, Jul 18, 2013 at 11:16 AM, Something Something < mailinglist...@gmail.com> wrote: > Wow, Bertrand, on the Pig mailing list you're recommending not to use > Pig... LOL!

Re: Getting dimension values for Facts

2013-07-18 Thread Something Something

Wow, Bertrand, on the Pig mailing list you're recommending not to use Pig... LOL! Jokes apart, I would think this would be a common use case for Pig, no? Generating a Pig script on the fly is a decent idea, but we're hoping to avoid that - unless there's no other way. Thanks for the pointers.

DESCRIBE alias in local mode

2013-07-18 Thread Serega Sheypak

Hi, we've created a simple utility project for testing Pig scripts. At the core we do:

    def pigServer = new PigServer(ExecType.LOCAL)
    pigServer.setBatchOn()
    try {
        pigServer.registerScript(new FileInputStream(scriptFile.absolutePath), params, null)
        pigServer.dumpSchema(

Re: Want to add data in same file in Apache PIG?

2013-07-18 Thread Serega Sheypak

Use ORDER if the set is not too big, or write an MR job with a single reducer. You can even try using the default mapper and reducer if there is no problem with the input format. 2013/7/18 Bhavesh Shah > Thanks Serega and Pradeep for your quick replies. > > > > Serega, As i am new to PIG, I didn't understand "Pi

RE: Want to add data in same file in Apache PIG?

2013-07-18 Thread Bhavesh Shah

Thanks Serega and Pradeep for your quick replies. Serega, as I am new to Pig, I didn't understand "Pig Script with one reduce action". Do you mean to write the reduce action in Pig Latin or in some other language? - Bhavesh. > Date: Thu, 18 Jul 2013 16:03:54 +0400 > Subject: Re: Want to

Re: Want to add data in same file in Apache PIG?

2013-07-18 Thread Serega Sheypak

*merge* and sort them to only one file on *local fs*. <src> is kept. Are you sure that you want to merge several HDFS files into one LOCAL file? The local file would be in your local file system. The simplest way is to use UNION in Pig and union the existing files in HDFS with the new one generated by the Pig script.

RE: Want to add data in same file in Apache PIG?

2013-07-18 Thread Pradeep Gollakota

If you want persistent storage like that, your best bet is to use a database like HBase. On Jul 18, 2013 7:56 AM, "Bhavesh Shah" wrote: > Thanks for reply. :) > > I just came across one command -getmerge > > > > -getmerge <src> <localdst> : Get all the files in the directories that > match the source file pa

RE: Want to add data in same file in Apache PIG?

2013-07-18 Thread Bhavesh Shah

Thanks for the reply. :) I just came across one command, -getmerge. -getmerge <src> <localdst> : Get all the files in the directories that match the source file pattern and merge and sort them to only one file on local fs. <src> is kept. I am thinking if I STORE the data in some other file say TMP_Name and l
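A minimal sketch of driving `hadoop fs -getmerge` from Python after a Pig job has stored its output. The helper names and the example paths are hypothetical, and it assumes the `hadoop` executable is on the PATH. As pointed out elsewhere in the thread, the merged file lands on the local file system, not in HDFS.

```python
import subprocess

def getmerge_command(hdfs_src, local_dst):
    """Build the argv for `hadoop fs -getmerge <src> <localdst>`."""
    return ["hadoop", "fs", "-getmerge", hdfs_src, local_dst]

def run_getmerge(hdfs_src, local_dst):
    """Run the merge; raises CalledProcessError on a non-zero exit code."""
    # Assumption: `hadoop` is installed and configured for the target cluster.
    subprocess.check_call(getmerge_command(hdfs_src, local_dst))
```

For example, `run_getmerge("/user/me/TMP_Name", "/tmp/merged.txt")` would concatenate the part files under the (hypothetical) HDFS directory into one local file.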

Re: Want to add data in same file in Apache PIG?

2013-07-18 Thread Serega Sheypak

It's not possible. It's HDFS. 2013/7/18 Bhavesh Shah > Hello, > > Actually I have a use case in which I will receive the data from some > source and I have to dump it in the same file after every regular interval > and use that file for further operation. I tried to search on it, but I > didn't

Want to add data in same file in Apache PIG?

2013-07-18 Thread Bhavesh Shah

Hello, I have a use case in which I will receive data from some source and have to dump it into the same file at regular intervals, then use that file for further operations. I tried to search on it, but I didn't see anything related to this. I am using STORE function,

Re: Getting dimension values for Facts

2013-07-18 Thread Bertrand Dechoux

I would say either generate the script using another language (e.g. Python) or use a true programming language with an API having the same level of abstraction (e.g. Java and Cascading). Bertrand On Thu, Jul 18, 2013 at 8:44 AM, Something Something < mailinglist...@gmail.com> wrote: > There must be
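A rough sketch of the "generate the script from Python" idea. The table and column pairs (FACT_TABLE3/DIMENSION1/col1, FACT_TABLE5/DIMENSION2/col2) come from the thread; everything else, including the schemas, the `_resolved` output suffix, the comma delimiter, and the function names, is an assumption made up for illustration and would need to match the real data.

```python
# One Pig Latin script per fact table; aliases are reused, so each rendered
# script is meant to run on its own.
STEP_TEMPLATE = (
    "fact = LOAD '{fact}' USING PigStorage(',') "
    "AS (col1:chararray, col2:chararray);\n"  # hypothetical fact schema
    "dim = LOAD '{dim}' USING PigStorage(',') "
    "AS (dim_key:chararray, dim_value:chararray);\n"  # hypothetical dimension schema
    "joined = JOIN fact BY {column} LEFT OUTER, dim BY dim_key;\n"
    "STORE joined INTO '{fact}_resolved' USING PigStorage(',');\n"
)

# (fact table, dimension table, fact column to resolve), as described in the thread.
STEPS = [
    ("FACT_TABLE3", "DIMENSION1", "col1"),
    ("FACT_TABLE5", "DIMENSION2", "col2"),
]

def render_script(fact, dim, column):
    """Return the Pig Latin text for one fact/dimension pair."""
    return STEP_TEMPLATE.format(fact=fact, dim=dim, column=column)

def render_all(steps):
    """Map each fact table name to its generated script."""
    return dict((fact, render_script(fact, dim, column))
                for fact, dim, column in steps)
```

Calling `render_all(STEPS)` yields one script per fact table, which could then be written to `.pig` files and submitted. Whether a LEFT OUTER join is the right way to "update" a column depends on details the thread does not show.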

19 matches
