Re: GSoC 2013

2013-04-02 Thread burakkk
So what do you suggest? Is it clear?
On Mon, Apr 1, 2013 at 9:35 PM, burakkk burak.isi...@gmail.com wrote:
I'm using only the WTF graph representation so that it fits in memory. By the way, I haven't seen any explanation of WTF or graph models on the Pig 0.11 release page. I don't wanna use

Re: GSoC 2013

2013-04-02 Thread Gianmarco De Francisci Morales
FYI, Giraph has a Random Walk implementation. Pig does not support iteration natively, so iterative algorithms are not a very good fit for it. Just my 2c.
Cheers,
-- Gianmarco
On Tue, Apr 2, 2013 at 10:04 AM, burakkk burak.isi...@gmail.com wrote:
So what do you suggest? Is it clear? On

Re: GSoC 2013

2013-04-02 Thread burakkk
I know that, but Giraph uses BSP. What I'm proposing is a shared-nothing model, except for the reducers. Besides, I don't want to split the iteration: one phase is still responsible for the whole iteration. Each origin vertex will be processed in parallel.
Thanks
Best regards...
On Tue, Apr 2, 2013

count duplicate entries

2013-04-02 Thread jamal sasha
Hi, I have data in HDFS like:
id1,field1,field2
1,2,3
1,2,3
1,2,4
1,2,5
I want to find the number of unique entries using Pig. Here the number of unique entries is 3 (as 1,2,3 is repeated twice). How do I find this?
Thanks
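One common way to express this in Pig Latin is DISTINCT followed by a GROUP ALL and COUNT. The following is a minimal sketch, not a reply from the thread: the input path is hypothetical, the field types are guessed from the sample rows, and it assumes the `id1,field1,field2` line describes the columns rather than being a row in the file.

```pig
-- Hypothetical path; the int types are assumed from the sample rows.
entries = LOAD '/path/to/data' USING PigStorage(',')
          AS (id1:int, field1:int, field2:int);

-- Collapse identical rows so each unique entry appears once.
uniq = DISTINCT entries;

-- Put all remaining rows into a single group so they can be counted.
grouped = GROUP uniq ALL;

-- Count the rows in that single group.
total = FOREACH grouped GENERATE COUNT(uniq) AS unique_entries;

DUMP total;
```

The aliases `entries`, `uniq`, `grouped`, and `total` are arbitrary. To see how often each entry repeats instead, one could group `entries` by all three fields and apply COUNT per group.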

UDF Complex Pig Object to JsonObject

2013-04-02 Thread Dan DeCapria, CivicScience
From within a Java UDF, I'm looking for an easy way to go from a complex Pig object to a JSON object. The converse operation is also desired.
Use Case 1: DataBag {(a,1.0)} with Schema b1:bag{t1:tuple(t:chararray,s:double)} return JsonObject {[a,1.0]}
Converse Use Case 1: JsonObject {[a,1.0]}

optimization for data cube

2013-04-02 Thread Haitao Yao
Hi all,
I have a tuple like this: (group_a,group_b,group_c,value), and I want to calculate the values in a data cube way, which means I want to generate new tuples from the original one:
(all,all,all,value)
(group_a,all,all,value)
(all,group_b,all,value)
(group_a,group_b,all,value)

viewfs on pig

2013-04-02 Thread 李建伟
Hi, do you know which version of Pig supports viewfs? I tried 0.11.0 and got the following error:
grunt> A = load '/tmp/passwd' using PigStorage(':');
grunt> B = foreach A generate $0 as id;
grunt> dump B;
2013-04-03 01:04:20,489 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features

Re: optimization for data cube

2013-04-02 Thread Prasanth J
From the 0.11 release onwards, Pig natively supports the CUBE operator. Here is the documentation for the CUBE operator:
http://pig.apache.org/docs/r0.11.1/basic.html#cube
For your case the query can be represented as
cubed = CUBE input BY CUBE(group_a,group_b,group_c);
output = FOREACH cubed GENERATE
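For reference, a self-contained query in the shape used by the Pig CUBE documentation might look like the sketch below. It is not the elided remainder of the message above: the input path, the field types, the aliases `data` and `result`, and the choice of SUM as the aggregate are all assumptions made for illustration.

```pig
-- Hypothetical path and types, modeled on the tuple (group_a,group_b,group_c,value).
data = LOAD '/path/to/input' USING PigStorage(',')
       AS (group_a:chararray, group_b:chararray, group_c:chararray, value:long);

-- CUBE produces one group for every combination of the three dimensions.
cubed = CUBE data BY CUBE(group_a, group_b, group_c);

-- The rows of each group are exposed in a bag named "cube";
-- SUM is an assumed aggregate, swap in whatever the use case needs.
result = FOREACH cubed GENERATE FLATTEN(group), SUM(cube.value) AS total;

DUMP result;
```

With three dimensions this should yield up to eight groupings per distinct input combination. Rolled-up dimensions should appear as null in the output rather than as the literal "all" used in the original question.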