Can anyone tell us the answer?
Someone from Yahoo suggested that I use Zebra, although we do not use it.
But I am really interested in it: how it is actually used, and why there is
no documentation for it in Pig > 0.9.2.
2013/1/29 Devon Crouse
> We're looking into using Zebra for merge joins, and noticed that the l
2012/12/25 Kshiva Kps
> sorry to ask you if possible could you pls advice on below points
>
> In general in Real time how we will write PIG scripts.
>
What do you mean by real time? Debugging at runtime?
> 1. PIG scripts alone in Eclipse
>
Maybe this can help you: http://wiki.apache.org/pig/PigPen
> 2.
What is the output when you replace DBStorage with normal text output?
Is it the content you expected?
*李剑 Jameson
Hadoop Engineer*
*MediaV 聚胜万合* Shanghai ・ Beijing ・ Shenzhen ・ Guangzhou ・ Hangzhou ・ Xiamen ・ Nanjing
10/F, Henghui International Building, 580 Hengfeng Road, Zhabei District, Shanghai 200070
MOB:18621578671
TEL:021-60568188*8247
FAX:021-60568
I think if the code is Java code in the UDF, you had better catch the
exception and handle it, rather than just ignore it.
If your Pig code hits a failed cast, you can debug it step by step,
line by line, in interactive mode.
Sometimes after one step (such as streaming) the schema seems not cer
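A minimal sketch of the "catch the exception and do something" advice, written in Python rather than Java for brevity. The function name `safe_to_int` and the choice to log and return None on a failed cast are assumptions for illustration, not code from this thread.

```python
import logging

logger = logging.getLogger("udf_sketch")  # hypothetical logger name

def safe_to_int(value):
    """Convert value to int; log and return None instead of raising."""
    try:
        return int(value)
    except (TypeError, ValueError) as exc:
        # Do something visible with the failure rather than swallowing it.
        logger.warning("cast failed for %r: %s", value, exc)
        return None
```

Logging the bad value keeps the job running while still leaving a trace that can be checked later.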
I am sorry that I forgot to change the default mail address and used my
company mail address to send the previous mail.
Please ignore it.
Focused on MySQL, MSSQL, Oracle, Hadoop
2013/2/1 Jameson Li
> What is the output when you replace DBStorage with normal text output?
> Is it normally the
Yes. I am sure.
2011/7/6 Dmitriy Ryaboy
> Works for me.
>
> Make sure you have grep on the path of all your nodes?
>
> D
>
> On Mon, Jul 4, 2011 at 7:59 PM, Jameson Li wrote:
> > I have a doubt that:
> >
> > sometime when I run the pig code:
http://thedatachef.blogspot.com/2011/01/apache-pig-08-with-cloudera-cdh3.html
with one small difference:
change "for jar in $HADOOP_HOME/hadoop-core-*.jar $HADOOP_HOME/lib/* ; do"
to "for jar in $HADOOP_HOME/hadoop*core*.jar $HADOOP_HOME/lib/* ; do"
2011/7/4 Dmitriy Ryaboy
> Use the jar built with "a
I have a question:
sometimes when I run this Pig code:
c = stream b through `grep "spider"`;
it returns the error message:
Received Error while processing the map plan: 'grep "spider" ' failed with
exit status: 1
But when I use this Pig code:
c = stream b through `awk '{a=index($0,"spider");i
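One possible cause, not confirmed in this thread: grep exits with status 1 when no input line matches, so a map task whose input contains no "spider" line could see the streaming command fail, while awk normally exits with 0. Below is a hypothetical Python stand-in for the filter that reports success even when nothing matches. The function name `run` and its parameters are assumptions.

```python
import sys

def run(in_stream, out_stream, needle="spider"):
    """Copy lines containing needle from in_stream to out_stream.

    Returns 0 even when no line matched, unlike grep, which exits with 1.
    """
    for line in in_stream:
        if needle in line:
            out_stream.write(line)
    return 0

# As a script entry point one would call: sys.exit(run(sys.stdin, sys.stdout))
```

To use it from `stream ... through`, the script would have to be available on every node, just like grep.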
How about the Pig jar lib path?
Sometimes, after building my UDF, I registered the new UDF jar but forgot
that the old UDF jar remained in $PIG_HOME/lib/. When the Pig code used the
UDF class, it used the classes compiled into the old jar rather than
the new one.
Maybe your trouble is
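A quick way to check for this kind of shadowing is to look for the same jar file name in more than one directory. The sketch below is a hypothetical helper, not part of Pig. The function name is an assumption, and the directories to scan are whatever applies to your setup.

```python
from collections import defaultdict
from pathlib import Path

def find_duplicate_jars(directories):
    """Map each jar file name found more than once to all of its paths."""
    seen = defaultdict(list)
    for directory in directories:
        for jar in sorted(Path(directory).glob("*.jar")):
            seen[jar.name].append(str(jar))
    # Keep only names that occur in more than one place.
    return {name: paths for name, paths in seen.items() if len(paths) > 1}
```

Calling it with the Pig lib directory and the project lib directory would list any jar name that exists in both places.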
I have the same question as Thomas Kappler.
It would be kind of you if someone could say something more detailed about
the 'custom partitioner' mentioned by Daniel Dai.
I think the docs at 'piglatin_ref2.html#partitionby' are too brief.
2011/6/17 Daniel Dai
> Try custom partitioner: http://pig.apache.org/
iStorageSwithKey.
Am I right?
Thanks very much.
2011/6/17 Jameson Li
> I am sorry, I made a mistake.
> My newest jar file is /home/user/project/lib/myUDF.jar, but
> there is an old jar file in the Pig lib dir $PIG-HOME/lib (/opt/pig/lib).
> Unfortunately after registe
I am sorry, I made a mistake.
My newest jar file is /home/user/project/lib/myUDF.jar, but there
is an old jar file in the Pig lib dir $PIG-HOME/lib (/opt/pig/lib).
Unfortunately, after registering the jar file
/home/user/project/lib/myUDF.jar, when the Pig code executed, it would
first
t jar file after re-compile?
Thanks very much.
2011/6/15 Daniel Dai
> Check http://wiki.apache.org/pig/PigStorageWithInputPath, also you will
> need to disable split combination: -Dpig.noSplitCombination=true
>
> Daniel
>
>
> On 06/13/2011 04:07 AM, Jameson Li wrote:
>
Hi,
I have some files in hdfs://path/load/ like this:
file_29_1
file_47_1
file_16_1
...
These files are generated by other M/R jobs. Each file contains only one
column, and the number in the file name between 'file_' and '_1' is an
id.
I want to add the id into its input form
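The id extraction itself can be sketched as follows. This is only a Python illustration of pulling the number out of a file name like the ones listed above. The names `extract_id` and `tag_lines` are assumptions, and how the file path reaches the code (for example through a custom loader) is outside this sketch.

```python
import re

# Matches names such as file_29_1 at the end of a path.
_ID_PATTERN = re.compile(r"file_(\d+)_1$")

def extract_id(path):
    """Return the id between 'file_' and '_1', or None if the name differs."""
    match = _ID_PATTERN.search(path)
    return match.group(1) if match else None

def tag_lines(path, lines):
    """Prefix every single-column line with the id taken from path."""
    file_id = extract_id(path)
    return ["%s\t%s" % (file_id, line) for line in lines]
```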
'value'?
>
> Alan.
>
>
> On May 24, 2011, at 3:05 AM, Jameson Li wrote:
>
> OK.OK.I know that just write UDFs.
>> I have to write UDFs, and see you..
>> And I still think there should be grammar support for map operation both
>> static key and dy
you
> may need to put into UDF.
>
> Grammar support for map is based on static key, eg: m#'key1'. Your use case
> is mostly dealing dynamic keys, which you may rely on yourself currently.
>
> Daniel
>
> -----Original Message- From: Jameson Li
> Sent: Monday, Ma
of map values
>
> The script is like:
> b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1) as m;
> c = foreach b generate GetKey(m) as key, m;
> d = group c by key;
> e = foreach d generate group, SUM(GetValues(c.m));
>
>
> Daniel
>
>
> On 05/23/2011
14138427,9#0.1107544,282#0.18699136])
I just want to group by the map key and sum the map values, something like:
c = group b by $0#key;
d = foreach c generate group,SUM(b.$0#value);
How could I write the code?
Thanks,
Jameson Li.
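To pin down the intended result, here is the same computation sketched in plain Python: every record is a map, and values are summed per key across all records. The sample numbers are made up for illustration and the function name is an assumption. This shows the semantics only, not how to express it in Pig Latin.

```python
from collections import defaultdict

def sum_by_map_key(records):
    """Sum the values of all maps in records, grouped by map key."""
    totals = defaultdict(float)
    for record in records:
        for key, value in record.items():
            totals[key] += value
    return dict(totals)

if __name__ == "__main__":
    sample = [{"9": 0.5, "282": 0.25}, {"9": 0.25}]  # made-up data
    print(sum_by_map_key(sample))
```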
Jameson,
> >
> > Do you mind to add something like this:
> >
> > c = order b by $0 parallel n;
> > store c into '20110331-ab';
> >
> > you can order on anything. it will add a reduce and give you less files.
> >
> > Regards,
> > Sh
> --
> Jameson Lopp
> Software Engineer
> Bronto Software, Inc.
>
>
> On 04/01/2011 03:57 AM, Jameson Li wrote:
>
>> Hi,
>>
>> When I run the below pig codes:
>> a = load '/logs/2011-03-31';
>> b = filter a by $1=='a'
have a question about how I could store fewer files when I use Pig to store
files in HDFS.
Thanks,
Jameson Li.