I'd skip the second step. "For DAG, VERTEX i use the #setConf() method to
forward *all properties with the* *corresponding scope* from my main conf
object". This won't help anything at the moment.
Other than that, this should work.

InputInitializers and OutputCommitters (as well as Processors, Inputs,
Outputs) have a user payload field. If using FileInputFormat /
FileOutputFormat based Inputs and Outputs - a payload is set up for the
initializer / committer. That will contain a Configuration instance (and
some more information) serialized to bytes. This Configuration instance
would require some of the properties as well.
Regarding the TezRuntimeConfiguration values - these are used when
configuring the standard Edges, and setAdditionalConfiguration will take
care of propagating the appropriate config parameters for a specific edge.
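
To make the payload part a bit more concrete, here is a minimal sketch of
"a Configuration serialized to bytes". It is not Tez code: it uses
java.util.Properties as a stand-in for a Hadoop Configuration, and
PayloadSketch, toPayload and fromPayload are made-up names for illustration
only. In a real job, I believe TezUtils.createUserPayloadFromConf plays this
role for a Configuration, but please check the javadoc for your Tez version.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, for illustration only. Properties stands in for a
// Hadoop Configuration; none of this is part of the Tez API.
final class PayloadSketch {

    // Client side: serialize the properties a component needs into bytes,
    // which is what a user payload carries.
    static byte[] toPayload(Properties conf) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            conf.store(out, null);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Task / AM side: rebuild the properties from the payload bytes.
    static Properties fromPayload(byte[] payload) {
        try {
            Properties conf = new Properties();
            conf.load(new ByteArrayInputStream(payload));
            return conf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Properties mainConf = new Properties();
        mainConf.setProperty("tez.runtime.compress", "true");

        byte[] payload = toPayload(mainConf);
        Properties restored = fromPayload(payload);
        System.out.println(restored.getProperty("tez.runtime.compress"));
    }
}
```

The point is only that whatever the initializer / committer needs has to be
inside those bytes when the payload is created on the client. Anything left
out of the serialized conf is not visible on the other side.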

On Tue, Sep 15, 2015 at 3:52 AM, Johannes Zillmann <[email protected]
> wrote:

> Alright… once again…
>
> So i saw that all the TezConfiguration fields are annotated with a Scope
> like AM, DAG, VERTEX, etc…
> So here is what i intend to do:
> - The TezConfiguration for TezClient.create() will simply contain *all
> properties *from my main conf object
> - For DAG, VERTEX i use the #setConf() method to forward *all properties
> with the* *corresponding scope* from my main conf object
> - For the edgeBuilder i use the #setAdditionalConfiguration() method to
> forward *all properties *from my main conf object
>
> So does this strategy make sense to you or am i missing something or
> getting it wrong ?
>
> Couple of more questions:
> - Regarding your comment on InputInitializers and OutputCommitters… I
> don’t see any possibility to set properties on that. I’m using the user
> payload to transfer conf values which are needed. Am i missing something here ?
> - What about the TezRuntimeConfiguration values, do i need to do anything
> special with that ?
>
>
> best
> Johannes
>
>
>
> On 14 Sep 2015, at 20:42, Siddharth Seth <[email protected]> wrote:
>
> For Edges, the approach that you took with
> edgeBuilder.setAdditionalConfiguration will work to set relevant Tez
> properties for an edge. You should be able to iterate through properties
> and set the config on the edge - and the relevant ones will be set.
> (Compression has a specific API which you could use, but using
> setAdditionalConfiguration will also work).
> Typically, additional Hadoop properties are also required for Edges -
> things like the list of compression codecs.
> edgeConfigs.setAdditionalConfiguration does take care of allowing these
> properties through.
>
> The TezClient needs to be provided a config - which is then made available
> to the AM. There's not much filtering involved here, and you could set
> tez.* for this configuration instance. An attempt will be made to pick up
> YarnConfiguration to connect to the cluster.
>
> The same applies for InputInitializers and OutputCommitters. Typically
> (and unfortunately), you'll end up setting all configs.
>
> dag.setConf, and vertex.setConf should not be used - I've opened a jira to
> add docs for these.
>
> How do you get the Hadoop configs in this case ? Is that part of the
> Configuration like object ?
>
>
>
> On Mon, Sep 14, 2015 at 9:47 AM, Johannes Zillmann <
> [email protected]> wrote:
>
>> Ok,
>>
>> found it. The
>>
>> edgeBuilder.setAdditionalConfiguration(TezRuntimeConfiguration.TEZ_RUNTIME_COMPRESS,
>>  "true");
>> does work for me!
>>
>> So let me describe my use case a little bit...
>> Basically i have one Configuration-like object on the client side. This
>> is assembled from multiple sources and is the only way a user can set
>> custom Tez properties (we do not use tez-site.xml at all).
>> Then i’m building my DAG with its vertices and edges programmatically.
>> Now, do you have any recommendation for me how to route the right Tez
>> properties effectively to the corresponding Tez components ? (with tez
>> components i mean like vertex properties, dag properties, AM properties,
>> edge properties, etc..)
>>
>> Should i simply set all tez.* properties to any component or is there a
>> smarter way ?
>> And what components/properties might i be missing ?
>>
>> Any help appreciated!
>> Johannes
>>
>>
>> On 14 Sep 2015, at 16:57, Johannes Zillmann <[email protected]>
>> wrote:
>>
>> Hey guys,
>>
>> question. How do i enable tez.runtime.compress programmatically ?
>> When i set this property in the tez-site.xml it is picked up correctly.
>> But all other options i tried:
>> - dag.setConf(TezRuntimeConfiguration.TEZ_RUNTIME_COMPRESS, "true");
>> - mapVertex.setConf(TezRuntimeConfiguration.TEZ_RUNTIME_COMPRESS, "true"
>> );
>> - reduceVertex.setConf(TezRuntimeConfiguration.TEZ_RUNTIME_COMPRESS,
>> "true");
>>
>> do not have any effect! (Checking the log output of the Shuffle class)
>>
>> Johannes
>>
>>
>>
>
>
