Hello,
I have written a Pig script and tried to execute it, but partway through
execution it fails with the error java.io.IOException: Spill failed. I have
included the statements below in my script, and I have also set the classpath
for the hadoop-LZO jar.
1) set mapred.compress.map.output true;
2) set ma
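(The second statement above is cut off by the archive. For reference, LZO
compression of map output is typically enabled with the pair of properties
below; the codec class shown is an assumption, not recovered from the
original message.)

set mapred.compress.map.output true;
-- assumed codec class; the original message is truncated at this point
set mapred.map.output.compression.codec com.hadoop.compression.lzo.LzoCodec;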
I'm unable to recreate this by doing the following:
tar xvzf pig-0.11.1.tar.gz
cd pig-0.11.1
ant
ant pigunit-jar
I'm running on Mint 14
% java -version
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
Hi everyone,
I'm attempting to compile a pigunit jar so that I can begin unit testing my
Pig scripts, but I'm running into this issue during compilation:
[ivy:resolve]
[ivy:resolve] :: problems summary ::
[ivy:resolve] WARNINGS
[ivy:resolve] ::
Thanks.
Instead, I found a Python implementation of the erf function, so that'll be
good for now.
http://stackoverflow.com/questions/457408/is-there-an-easily-available-implementation-of-erf-for-python
On Wed, Jul 17, 2013 at 5:08 PM, Cheolsoo Park wrote:
> Hi Dexin,
>
> Unfortunately, Pig is
Unfortunately, I can't think of any good way of doing this (other than what
Bertrand suggested: using a different language to generate the script).
I'd also recommend Hive... it may be easier to do this in Hive since you
have SQL-like syntax. (Haven't used Hive, but it looks like this type of
t
One thing you can do, though, is to let Pig create new files every time and
have a post-pig task/job to combine the old file and the new file.
It's a little abnormal to require a single file on HDFS. Normally, MR or
other jobs deal with a folder of files, not just a single file.
Regards,
Xuefu
You're correct. It looks like an optimization was put in to make DISTINCT use
a special partitioner, which prevents the user from setting the partitioner.
Could you file a JIRA against the docs so we can get that fixed?
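For reference, the syntax the docs advertise looks like this (a sketch; the
partitioner class name here is hypothetical), and it is exactly what the
special DISTINCT partitioner silently overrides:

users = LOAD 'users' AS (id:long, name:chararray);
-- parses fine per the docs, but the custom partitioner is ignored for DISTINCT
unique_users = DISTINCT users PARTITION BY org.example.MyPartitioner PARALLEL 10;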
Alan.
On Jul 17, 2013, at 11:27 AM, William Oberman wrote:
> The docs say
I don't think this is macro-able, Pradeep. Every step of the way a
different column gets updated. For example, for FACT_TABLE3 we update
'col1' from DIMENSION1, for FACT_TABLE5 we update 'col2' from DIMENSION2 &
so on.
Feel free to correct me if I am wrong. Thanks.
On Thu, Jul 18, 2013 at
Looks like this might be macroable. Not entirely sure how that can be done
yet... but I'd look into that if I were you.
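Column names can be passed as Pig macro parameters, which is why this might
work. A rough, untested sketch, with hypothetical schemas and paths:

DEFINE update_col(fact, dim, col) RETURNS updated {
    joined = JOIN $fact BY $col LEFT OUTER, $dim BY key;
    -- projects only the id and the updated column, for brevity
    $updated = FOREACH joined GENERATE $fact::id AS id, $dim::val AS $col;
};

FACT_TABLE3 = LOAD 'fact3' AS (id:long, col1:chararray, col2:chararray);
DIMENSION1 = LOAD 'dim1' AS (key:chararray, val:chararray);
FACT_TABLE4 = update_col(FACT_TABLE3, DIMENSION1, col1);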
On Thu, Jul 18, 2013 at 11:16 AM, Something Something <
mailinglist...@gmail.com> wrote:
> Wow, Bertrand, on the Pig mailing list you're recommending not to use
> Pig... LOL!
Wow, Bertrand, on the Pig mailing list you're recommending not to use
Pig... LOL! Jokes apart, I would think this would be a common use case for
Pig, no? Generating a Pig script on the fly is a decent idea, but we're
hoping to avoid that - unless there's no other way. Thanks for the
pointers.
Hi, we've created a simple utility project for testing Pig scripts.
At the core we do:
import org.apache.pig.ExecType
import org.apache.pig.PigServer

def pigServer = new PigServer(ExecType.LOCAL)
pigServer.setBatchOn()
try
{
    // register the script under test, with parameter substitution
    pigServer.registerScript(new FileInputStream(scriptFile.absolutePath), params, null)
    pigServer.dumpSchema(alias) // 'alias' is a placeholder; the original message is cut off here
}
finally
{
    pigServer.shutdown()
}
Use ORDER if the data set is not too big.
Or write an MR job with a single reducer. You can even try to use the default
mapper and reducer if there is no problem with the input format.
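For example, forcing a single reducer produces a single part file in the
output directory (paths and schema below are hypothetical):

data = LOAD 'input' AS (key:chararray, cnt:long);
-- PARALLEL 1 means one reducer, hence one output part file
sorted = ORDER data BY key PARALLEL 1;
STORE sorted INTO 'output_sorted';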
2013/7/18 Bhavesh Shah
> Thanks Serega and Pradeep for your quick replies.
>
>
>
> Serega, as I am new to Pig, I didn't understand "Pi
Thanks Serega and Pradeep for your quick replies.
Serega, as I am new to Pig, I didn't understand "Pig script with one reduce
action". Do you mean to write the reduce action in Pig Latin or in some other
language?
- Bhavesh.
> Date: Thu, 18 Jul 2013 16:03:54 +0400
> Subject: Re: Want to
*merge* and sort them to only
one file on *local fs*. <src> is kept.
Are you sure that you want to merge several HDFS files into one LOCAL file?
A local file would be in your local file system.
The simplest way is to use UNION in Pig and union the existing files in HDFS
with the new ones generated by the Pig script.
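For example (paths and schema are hypothetical):

old_data = LOAD 'output/existing' AS (f1:chararray, f2:long);
new_data = LOAD 'output/latest' AS (f1:chararray, f2:long);
combined = UNION old_data, new_data;
STORE combined INTO 'output/combined';
-- and if one LOCAL file is really needed, HDFS shell commands work from inside Pig:
fs -getmerge output/combined /tmp/combined.txt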
If you want persistent storage like that, your best bet is to use a
database like HBase.
On Jul 18, 2013 7:56 AM, "Bhavesh Shah" wrote:
> Thanks for reply. :)
>
> I just came across one command -getmerge
>
>
>
> -getmerge <src> <localdst> [addnl]: Get all the files in the directories that
> match the source file pa
Thanks for reply. :)
I just came across one command -getmerge
-getmerge <src> <localdst> [addnl]: Get all the files in the directories
that match the source file pattern and merge and sort them to only
one file on local fs. <src> is kept.
I am thinking that if I STORE the data in some other file, say TMP_Name,
and l
It's not possible. It's HDFS.
2013/7/18 Bhavesh Shah
> Hello,
>
> Actually I have a use case in which I will receive the data from some
> source and I have to dump it in the same file after every regular interval
> and use that file for further operation. I tried to search on it, but I
> didn't
Hello,
Actually I have a use case in which I will receive the data from some source
and I have to dump it in the same file after every regular interval and use
that file for further operation. I tried to search on it, but I didn't see
anything related to this.
I am using the STORE function,
I would say either generate the script using another language (e.g. Python)
or use a true programming language with an API having the same level of
abstraction (e.g. Java and Cascading).
Bertrand
On Thu, Jul 18, 2013 at 8:44 AM, Something Something <
mailinglist...@gmail.com> wrote:
> There must be