RE: 0.5 blockers

2014-07-25 Thread Bikas Saha
Yes. The relevant JIRAs have just been marked as blockers. -----Original Message----- From: Chris K Wensel [mailto:ch...@wensel.net] Sent: Friday, July 25, 2014 2:08 PM To: d...@tez.apache.org Cc: user@tez.apache.org Subject: Re: 0.5 blockers why not, TEZ-684. https://issues.apache.org/jira/bro

Re: 0.5 blockers

2014-07-25 Thread Chris K Wensel
why not, TEZ-684. https://issues.apache.org/jira/browse/TEZ-684 On Jul 24, 2014, at 7:21 PM, Bikas Saha wrote: > Folks, > > > > Here are the blockers for 0.5. > > https://issues.apache.org/jira/browse/TEZ-1311?jql=project%20%3D%20TEZ%20AND%20resolution%20%3D%20Unresolved%20AND%20priority%2

RE: DataMovementType impls

2014-07-25 Thread Bikas Saha
Please feel free to work out your use case and then outline it on this thread. We may be able to help you figure out what exactly you would need to do in order to integrate with Tez. Bikas *From:* David Capwell [mailto:dcapw...@gmail.com] *Sent:* Friday, July 25, 2014 1:31 PM *To:* user@tez.a

Re: DataMovementType impls

2014-07-25 Thread David Capwell
It's more of a persisted service atm. I'll take a look at defining this the way you spoke of. Thanks! On Fri, Jul 25, 2014 at 12:11 PM, Siddharth Seth wrote: > Doing something like that would involve writing new Outputs / Inputs, or > modifying the existing ones to write to a different sink.

Re: DataMovementType impls

2014-07-25 Thread Siddharth Seth
Doing something like that would involve writing new Outputs / Inputs, or modifying the existing ones to write to a different sink. We have prototyped such changes in the past (writing to HDFS, for example), and the changes are not very complicated. This involves changing how the existing Output

Re: DataMovementType impls

2014-07-25 Thread David Capwell
I was looking into letting two vertices that share data choose whether to share it over disk or over our internal system (that is, over the network). In cases where data persistence isn't needed and the vertices can be on the same node, this system would be bypassed. The use-case is

Re: DataMovementType impls

2014-07-25 Thread Siddharth Seth
DataSourceType isn't really used at the moment. Eventually, it would serve more as a scheduling and failure-recovery mechanism than as a way of deciding how data gets persisted between stages. (This property could potentially be used by some of the Inputs/Outputs to alter the way they persist data - but t

Re: DataMovementType impls

2014-07-25 Thread David Capwell
Sorry, copy/paste issue. I was looking at DataSourceType and trying to see how data gets saved and read between tasks. The use-case is that we have an internal service that might be helpful for us, so I wanted to prototype how feasible it would be to share data over a different mechanism. On Fri, J

Re: DataMovementType impls

2014-07-25 Thread Hitesh Shah
DataMovementEvent is a construct defined for an Input/Output pair to communicate with each other. The actual information being passed between the two is not understood by the framework, except that it is a byte payload to be handed off from the source to the destination. Users are not expected
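
The "opaque byte payload" idea described above can be illustrated with a small standalone sketch. This is not Tez code: `Event`, `make_event`, `framework_deliver`, and `input_handle` are hypothetical names, and the JSON encoding is an assumption chosen only to keep the example short. The point is that only the Output and Input sides agree on the payload format, while the delivery step never parses it.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a data movement event."""
    source_task: int
    payload: bytes  # opaque to the framework


def make_event(source_task, host, port, path):
    # Output side: encode whatever the paired Input needs to find the data.
    body = {"host": host, "port": port, "path": path}
    return Event(source_task, json.dumps(body).encode("utf-8"))


def framework_deliver(event, destination_handler):
    # Framework side: hand the bytes to the destination without inspecting them.
    return destination_handler(event.source_task, event.payload)


def input_handle(source_task, payload):
    # Input side: decode the format that the paired Output produced.
    info = json.loads(payload.decode("utf-8"))
    return (source_task, info["host"], info["port"], info["path"])
```

Swapping the payload contents (for example, to point at an external service instead of a local file) would only touch `make_event` and `input_handle` in this model; the delivery step stays unchanged.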

Re: DataMovementType impls

2014-07-25 Thread Jianfeng (Jeff) Zhang
Hi David, DataMovementType is used to create the EdgeManager for each kind of data movement. (See the createEdgeManager() method in Edge.java.) You can define your own custom DataMovementType by defining your own Edge manager. Could you let us know what kind of custom data movement you'd like to implement?
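
As a rough mental model of what an edge manager decides, the sketch below maps a movement type to the destination tasks that receive a given source output. It is a simplified illustration, not the logic in Edge.java: the `Movement` enum, the `route` function, and its parameters are hypothetical, and only the one-to-one, broadcast, and scatter-gather patterns are modeled.

```python
from enum import Enum


class Movement(Enum):
    ONE_TO_ONE = "one_to_one"
    BROADCAST = "broadcast"
    SCATTER_GATHER = "scatter_gather"


def route(movement, source_task, partition, num_dest_tasks):
    """Return the destination task indices for one source output.

    source_task: index of the producing task
    partition: index of the output partition being routed
    num_dest_tasks: number of tasks in the consuming vertex
    """
    if movement is Movement.ONE_TO_ONE:
        # Task i of the producer feeds task i of the consumer.
        return [source_task]
    if movement is Movement.BROADCAST:
        # Every consumer task receives the output.
        return list(range(num_dest_tasks))
    if movement is Movement.SCATTER_GATHER:
        # Partition p of every producer task goes to consumer task p.
        return [partition]
    raise ValueError(f"unsupported movement: {movement}")
```

In this model, a custom movement type amounts to supplying a different routing rule, which is the role a custom edge manager plays.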

DataMovementType impls

2014-07-25 Thread David Capwell
I'm going through the code and am not sure where the real logic of DataMovementType gets used. I see that DagTypeConverters can convert between DataMovementType and PlanEdgeDataMovementType, but once that happens I don't really see a way to implement any of these types. Where is the implementatio

Re: Task count

2014-07-25 Thread Gopal V
On 7/25/14, 3:20 PM, Johannes Zillmann wrote: Ok, will try this, thanks! Can you give the JIRA number so I can track progress on that? Do you know which version of Tez this is planned for? And no, no more use cases currently! That JIRA is a fairly complex scale problem, so I will take a while

RE: Task count

2014-07-25 Thread Bikas Saha
https://issues.apache.org/jira/browse/TEZ-1157 -----Original Message----- From: Johannes Zillmann [mailto:jzillm...@googlemail.com] Sent: Friday, July 25, 2014 2:51 AM To: user@tez.apache.org Cc: Gopal V Subject: Re: Task count Ok, will try this, thanks! Can you give the JIRA number so I can track

Re: Task count

2014-07-25 Thread Johannes Zillmann
Ok, will try this, thanks! Can you give the JIRA number so I can track progress on that? Do you know which version of Tez this is planned for? And no, no more use cases currently! best Johannes On 24 Jul 2014, at 20:38, Bikas Saha wrote: > The patch should work for all types of vertices because