[ 
https://issues.apache.org/jira/browse/TEZ-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119492#comment-14119492
 ] 

Siddharth Seth commented on TEZ-1447:
-------------------------------------

bq. Interested parties should be able to register(ENUM, List<Vertex>)
A registration API with arguments can be added at a later point. For now, there 
aren't a lot of events - and just the vertex name should be sufficient.

bq. InputInitializerContext already allows querying the number of tasks in a 
vertex. There is no need to pass the number of tasks in the event itself.
Parallelism Update is handled differently, since it isn't a state transition. 
Since the event is handled differently, it's convenient to provide the 
parallelism information right there.
On the API for the InputInitializer itself - I'm not particularly happy about 
sending Events, which then need to be type checked and potentially cast. There 
should be a comment in the patch on splitting these up into separate calls. We 
may eventually go with a model where just an Enum is provided. This is still WIP.
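
To illustrate the type check and cast concern above, here is a minimal sketch 
contrasting a single event callback with separate calls. Every type and method 
name in it (UpdateEvent, StateUpdate, ParallelismUpdate, both listener 
interfaces, CastingAdapter) is hypothetical and is not taken from the patch or 
from the Tez API.

```java
// Hypothetical types for illustration only; this is not the Tez API.
abstract class UpdateEvent {
  final String vertexName;
  UpdateEvent(String vertexName) { this.vertexName = vertexName; }
}

class StateUpdate extends UpdateEvent {
  final String state;
  StateUpdate(String vertexName, String state) {
    super(vertexName);
    this.state = state;
  }
}

class ParallelismUpdate extends UpdateEvent {
  final int numTasks;
  ParallelismUpdate(String vertexName, int numTasks) {
    super(vertexName);
    this.numTasks = numTasks;
  }
}

// Style 1: a single callback. The receiver has to type check and cast.
interface SingleCallbackListener {
  void onEvent(UpdateEvent event);
}

// Style 2: separate calls. The receiver never needs to cast.
interface SplitCallbackListener {
  void onStateUpdated(String vertexName, String state);
  void onParallelismUpdated(String vertexName, int numTasks);
}

// Shows the casting that Style 1 pushes onto every receiver.
class CastingAdapter implements SingleCallbackListener {
  private final SplitCallbackListener target;

  CastingAdapter(SplitCallbackListener target) { this.target = target; }

  @Override
  public void onEvent(UpdateEvent event) {
    if (event instanceof StateUpdate) {
      StateUpdate s = (StateUpdate) event;
      target.onStateUpdated(s.vertexName, s.state);
    } else if (event instanceof ParallelismUpdate) {
      ParallelismUpdate p = (ParallelismUpdate) event;
      target.onParallelismUpdated(p.vertexName, p.numTasks);
    } else {
      throw new IllegalArgumentException("Unknown event type: " + event.getClass());
    }
  }
}
```

With separate calls, the instanceof chain lives in one place (or disappears 
entirely) instead of being repeated in each receiver.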

What the patch does at the moment is have a central entity (it has a strange 
name at the moment - StatusNotifier) which tracks registered listeners. All 
vertices (and possibly tasks in the future) push relevant state update 
information to this entity. This entity then notifies the registered listeners 
via an interface.
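
A minimal sketch of that shape, assuming registration by vertex name only. The 
names VertexState, VertexStateListener, StatusNotifierSketch, register and 
stateChanged are placeholders for illustration and are not the ones used in 
the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; none of these are taken from the patch.
enum VertexState { RUNNING, SUCCEEDED, FAILED }

interface VertexStateListener {
  void onStateUpdated(String vertexName, VertexState state);
}

class StatusNotifierSketch {
  private final Map<String, List<VertexStateListener>> listeners = new HashMap<>();

  // Register interest in a single vertex, identified by name only.
  synchronized void register(String vertexName, VertexStateListener listener) {
    listeners.computeIfAbsent(vertexName, k -> new ArrayList<>()).add(listener);
  }

  // Called by a vertex when its state changes.
  void stateChanged(String vertexName, VertexState state) {
    List<VertexStateListener> snapshot;
    synchronized (this) {
      // Copy under the lock so listeners are invoked without holding it.
      snapshot = new ArrayList<>(listeners.getOrDefault(vertexName, new ArrayList<>()));
    }
    for (VertexStateListener listener : snapshot) {
      listener.onStateUpdated(vertexName, state);
    }
  }
}
```

Listeners registered for other vertices are not called, which keeps each 
notification cheap for the caller.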

bq. We can potentially have one state change, trigger another state change and 
create a cascade.
I'm not sure what you mean by this. There's a finite set of state changes that 
will take place, and the set registered for callbacks is even smaller.

In terms of threading issues - there should be a note in the patch. The 
'StatusNotifier' is a very lightweight call - which calls into Tez internal 
components. At the moment the only registered component is the 
RootInputInitializerManager. That needs to change to send notifications via a 
thread, and will end up routing events via a thread as well. Similarly, when 
VertexPlugins / EdgePlugins make use of this - it's their responsibility to 
set up threading to send these events to the user code. At this point, the 
StatusNotifier itself (and in effect the dispatcher thread) would never make 
calls into user code.
StatusNotifier could set up its own queue - but it looks like VertexManager, 
EdgeManager will eventually need to run using separate threads so that calls 
like onVertexStarted / routeEvents don't block the dispatcher thread.
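
One way a registered component could keep user code off the dispatcher thread 
is to hand each update to its own single-threaded executor. This is only a 
sketch: AsyncForwarder is a hypothetical name, and the patch may do this 
differently.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical wrapper: the notifier calls accept(), which only enqueues work.
class AsyncForwarder<T> implements Consumer<T>, AutoCloseable {
  // A single thread preserves the order in which updates were submitted.
  private final ExecutorService executor = Executors.newSingleThreadExecutor();
  private final Consumer<T> userCode;

  AsyncForwarder(Consumer<T> userCode) { this.userCode = userCode; }

  @Override
  public void accept(T update) {
    // Returns immediately; user code runs on the forwarder's own thread.
    executor.execute(() -> userCode.accept(update));
  }

  @Override
  public void close() {
    // Stop accepting new updates and wait briefly for queued ones to finish.
    executor.shutdown();
    try {
      executor.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

The calling thread pays only for the enqueue, so slow user code delays later 
updates to that one component rather than the dispatcher.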

Another item which needs to be looked at is whether updates need to be sent out 
when a component registers. As an example, if registering for updates from 
'vertex1' - it's possible that 'vertex1' completes just before the 
registration. In such cases, IMO, it's better to send in notifications with the 
latest state, rather than expect user code to register and then check the 
initial state (which isn't possible at the moment, other than for parallelism).
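
A sketch of the "send the latest state on registration" option, assuming the 
notifier remembers the last state it saw per vertex. ReplayingNotifier and its 
methods are hypothetical, and states are plain strings to keep the example 
short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical sketch: listeners are BiConsumer<vertexName, state>.
class ReplayingNotifier {
  private final Map<String, String> latestState = new HashMap<>();
  private final Map<String, List<BiConsumer<String, String>>> listeners = new HashMap<>();

  synchronized void register(String vertexName, BiConsumer<String, String> listener) {
    listeners.computeIfAbsent(vertexName, k -> new ArrayList<>()).add(listener);
    String current = latestState.get(vertexName);
    if (current != null) {
      // The vertex already reported a state; deliver it so the listener
      // does not miss an update that happened just before registration.
      listener.accept(vertexName, current);
    }
  }

  // For brevity, listeners are invoked while holding the lock. This makes
  // register-then-replay atomic with respect to concurrent state changes.
  synchronized void stateChanged(String vertexName, String state) {
    latestState.put(vertexName, state);
    for (BiConsumer<String, String> listener
        : listeners.getOrDefault(vertexName, new ArrayList<>())) {
      listener.accept(vertexName, state);
    }
  }
}
```

Because registration and the replay happen under the same lock as state 
changes, a listener cannot miss a completion that lands just before it 
registers.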

I'll update the patch over the next couple of days, with changes based on my 
previous comment to work with a FIRE_ONCE_ON_SUCCESS mode. Using this should 
take care of TEZ-1532 as well.

> Handle parallelism updates and versioning w/ custom InputInitializerEvents
> --------------------------------------------------------------------------
>
>                 Key: TEZ-1447
>                 URL: https://issues.apache.org/jira/browse/TEZ-1447
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.5.0
>            Reporter: Gunther Hagleitner
>            Assignee: Siddharth Seth
>            Priority: Blocker
>         Attachments: TEZ-1447.1.wip.txt
>
>
> I'm trying to do dynamic partition pruning through input initializer events 
> in Hive. That means that the initializer of a table scan vertex has to 
> receive events from all tasks in another vertex (which contain the pruning 
> info) before generating tasks to run.
> The problem with the current API I ran into:
> getNumTasks: I'm currently using a busy loop to wait for the num tasks for a 
> vertex to be decided (-1 -> x). There's no way around it, because it's the 
> only way to find out what number of events to expect (0 is a valid number of 
> tasks - so I can't wait for the first to complete).
> With auto-reducer parallelism I have to employ another busy loop. Because I 
> might be initially expecting 10 events, which later gets knocked down to 5. 
> Since there's no event associated with this, I have to periodically check 
> whether I have enough events.
> Versioning: Events have a version number, but I don't know which task they 
> are coming from. Thus I can't de-dup events.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
