[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertexes

2015-07-22 Thread Rohini Palaniswamy (JIRA)

[ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637489#comment-14637489 ]

Rohini Palaniswamy commented on TEZ-1190:
-----------------------------------------

bq. Perhaps what is happening here is that there are 2 vertices inside VG1 
which are receiving the 2 inputs from V1
   Not sure what you mean by 2 vertices inside VG1.  But I believe this is what 
we were doing before.

// groupMembers is V1, V1
VertexGroup vertexGroup = dag.createVertexGroup(groupName, groupMembers);


 Allow multiple edges between two vertexes
 -

 Key: TEZ-1190
 URL: https://issues.apache.org/jira/browse/TEZ-1190
 Project: Apache Tez
  Issue Type: Bug
Reporter: Daniel Dai

 This will be helpful in some scenarios. In one particular example, we can merge 
 two small pipelines together into one pair of vertices. Note that the edge 
 types between the two vertices may differ.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertexes

2015-07-21 Thread Bikas Saha (JIRA)

[ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636084#comment-14636084 ]

Bikas Saha commented on TEZ-1190:
---------------------------------

A VertexGroup can only be the source of a group input edge, not the 
destination. So I don't think Pig could have made VG1 the destination of 2 input 
edges. Perhaps what is happening here is that there are 2 vertices inside VG1 
which are receiving the 2 inputs from V1?


[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertexes

2015-07-21 Thread Rohini Palaniswamy (JIRA)

[ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635661#comment-14635661 ]

Rohini Palaniswamy commented on TEZ-1190:
-----------------------------------------

[~bikassaha],
The relevant classes are UnionOptimizer and TezDAGBuilder, as they were before 
PIG-4495 went in. The code is not straightforward Tez code though. 

For example:
A = LOAD 'data';
SPLIT A into B if $0  5, C if $0  10 and $0  12; // This is just for an 
example. This simple condition can be written with just FILTER instead of split 
and union.
D = UNION B, C;
E = GROUP D by $1;

The Pig plan would look like: V1 (Load) -> VG1, V1 (Load) -> VG1, VG1 -> V2 (Group by)

Vertex group VG1 takes two inputs from the same source vertex V1. There is only 
one output vertex, i.e. V2.


[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertexes

2015-07-20 Thread Bikas Saha (JIRA)

[ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634035#comment-14634035 ]

Bikas Saha commented on TEZ-1190:
---------------------------------

[~rohini] I tried using a vertex group as a source of multiple edges to a 
vertex but it did not work. Not sure what you did in Pig. Do you happen to have 
the code snippet that shows how you used vertex groups for multiple edges?


[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertexes

2015-04-05 Thread Rohini Palaniswamy (JIRA)

[ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14438125#comment-14438125 ]

Rohini Palaniswamy commented on TEZ-1190:
-----------------------------------------

With the changes in PIG-4495, we have handled both of the above scenarios in Pig 
itself, so we do not require this anymore for Pig. But I am leaving it open in 
case it makes life easier for Hive and Cascading.


[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertexes

2015-04-02 Thread Rohini Palaniswamy (JIRA)

[ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393033#comment-14393033 ]

Rohini Palaniswamy commented on TEZ-1190:
-----------------------------------------

I encountered a couple of queries in the past two weeks that suffer performance 
problems because of this. Currently we write the data out to another dummy 
vertex to avoid multiple edges, and this adds overhead. The common patterns are:
1) People split the data, perform some foreach transformations/filters, 
union them and then do some operation like group by or join with other data.
2) People split the data, perform some foreach transformations/filters and 
self join them. No union in this case. 

Vertex groups accept multiple edges from the same vertex, so we can optimize the 
multi-query planning for 1) when we know there is a vertex group. I hope we can 
rely on that behavior and that it will not change?
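
The dummy-vertex workaround mentioned above can be sketched roughly as follows. This is a toy model under stated assumptions, not Pig's or Tez's actual planner code: vertices are plain strings rather than Tez Vertex objects, the names DummyVertexRewrite, EdgeSpec and rewrite are hypothetical, and the "_dummy" naming scheme is assumed not to clash with any existing vertex name. The sketch only shows the shape of the rewrite: whenever a second edge appears between the same pair of vertices, it is routed through an extra pass-through vertex, which is where the added overhead comes from.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Toy model only: vertices are plain names, not Tez Vertex objects,
// and DummyVertexRewrite / EdgeSpec are hypothetical names.
final class DummyVertexRewrite {

    /** A directed edge between two named vertices. */
    static final class EdgeSpec {
        final String source;
        final String target;

        EdgeSpec(String source, String target) {
            this.source = source;
            this.target = target;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof EdgeSpec)) {
                return false;
            }
            EdgeSpec that = (EdgeSpec) other;
            return source.equals(that.source) && target.equals(that.target);
        }

        @Override
        public int hashCode() {
            return Objects.hash(source, target);
        }

        @Override
        public String toString() {
            return source + "->" + target;
        }
    }

    /**
     * Returns a new edge list in which every repeated (source, target) pair
     * is routed through its own pass-through "dummy" vertex, so that no two
     * edges connect the same pair of vertices.
     */
    static List<EdgeSpec> rewrite(List<EdgeSpec> edges) {
        Set<EdgeSpec> seen = new HashSet<>();
        List<EdgeSpec> result = new ArrayList<>();
        int dummyCount = 0;
        for (EdgeSpec edge : edges) {
            if (seen.add(edge)) {
                // First edge between this pair of vertices: keep it as is.
                result.add(edge);
            } else {
                // Assumed naming scheme; a real planner must guarantee uniqueness.
                String dummy = edge.source + "_dummy" + (++dummyCount);
                result.add(new EdgeSpec(edge.source, dummy));
                result.add(new EdgeSpec(dummy, edge.target));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Pattern 2) above: a self join produces two edges from V1 to V2.
        List<EdgeSpec> selfJoin = new ArrayList<>();
        selfJoin.add(new EdgeSpec("V1", "V2"));
        selfJoin.add(new EdgeSpec("V1", "V2"));
        System.out.println(rewrite(selfJoin));
    }
}
```

For the self-join input in main, the rewritten plan has three edges instead of two: the second V1-to-V2 edge becomes a hop through V1_dummy1, which is the extra write the comment describes.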
