TEZ-1379 went in. You should be able to use this properly now.

On Tue, Aug 5, 2014 at 11:27 PM, Johannes Zillmann <[email protected]> wrote:

> Hey Sid,
> On 05 Aug 2014, at 21:05, Siddharth Seth <[email protected]> wrote:
>
> > The last configuration parameter to "
> OrderedPartitionedKVEdgeConfigurer.newBuilder(keyClassName, valueClassName,
> myPartitionerClassName, jobConfForShuffleSort);" is the configuration for
> the partitioner itself. That's only used in the Output - and hence is not
> available in the consuming Input.
> >
> > It looks like we're missing the option to set a Configuration for the
> comparator. There's a couple of other changes required in the
> EdgeConfigurers - I'll create a jira and post a patch later today.
> Cool, thanks!
>
> >
> > One of the big reasons to separate out the Configurations is to limit
> the size of the payload generated. Using a generic conf (which usually ends
> up inheriting from JobConf etc) ends up setting a large number of keys
> (1000+ in some cases), of which very few are actually used.
> setFromConfiguration(...) actually strips out unused keys. The
> partitionerConf parameter is meant to be a very specific Configuration only
> for the Partitioner (should only contain the limited set of keys required
> to run the partitioner). Similarly for the Comparator conf - once it is
> added. Tez has no way of knowing what a valid set of keys for the
> partitioner, comparator and combiner are - since these are all user
> specified classes.
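
To make the payload-size point concrete, here is a minimal sketch in plain Java. It assumes the configuration can be modelled as a string map; `PayloadKeyFilterSketch` and `keepOnly` are hypothetical names for illustration and are not part of Tez or Hadoop. It only shows the idea of handing a component the few keys it needs instead of a full JobConf-style configuration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Illustration only: keeps just the keys a user-specified component needs. */
final class PayloadKeyFilterSketch {

    /** Returns a copy of fullConf restricted to the keys listed in requiredKeys. */
    static Map<String, String> keepOnly(Map<String, String> fullConf, Set<String> requiredKeys) {
        Map<String, String> slim = new LinkedHashMap<>();
        for (String key : requiredKeys) {
            String value = fullConf.get(key);
            if (value != null) {
                // Keys that are requested but absent are simply skipped.
                slim.put(key, value);
            }
        }
        return slim;
    }

    public static void main(String[] args) {
        // Stand-in for a generic configuration that inherited a large number of keys.
        Map<String, String> fullConf = new LinkedHashMap<>();
        for (int i = 0; i < 1000; i++) {
            fullConf.put("inherited.key." + i, "default");
        }
        fullConf.put("myCustomProperty", "value");

        // Only the user knows which keys the comparator or partitioner reads.
        Map<String, String> comparatorConf =
                keepOnly(fullConf, Collections.singleton("myCustomProperty"));
        System.out.println(fullConf.size() + " keys reduced to " + comparatorConf.size());
    }
}
```

The caller supplies the key set because, as noted above, the framework cannot know which keys a user-specified class reads.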
>
> ++++1 yeah, basically i like moving away from configuration!
> Just this time it hit me a bit ;)
>
> >
> > Till I can get a patch going for this, your usage model to get this
> working is likely the only one which will work.
>
> Ok will do!
> Johannes
>
> >
> >
> > On Tue, Aug 5, 2014 at 8:23 AM, Johannes Zillmann <[email protected]> wrote:
> > Hey guys,
> >
> > I just upgraded my application to the most current master code of Tez.
> > Ran into a problem with setting up my custom key comparator.
> > It implements org.apache.hadoop.conf.Configurable and expects a custom
> property in the passed in configuration.
> >
> > So initially i tried:
> >         JobConf jobConfForShuffleSort = new JobConf();
> >         jobConfForShuffleSort.set("myCustomProperty", "value");
> >         Builder edgeConfBuilder =
> OrderedPartitionedKVEdgeConfigurer.newBuilder(keyClassName, valueClassName,
> myPartitionerClassName, jobConfForShuffleSort);
> >
> > But the property does not come through to the instance of
> ‘myPartitionerClassName’.
> > Basically i see the comparator instantiated 2 times:
> >
> > (1) Here the custom property is available:
> >  java.lang.Exception
> >         at myPartitionerClassName.setConf(TezRecordComparator.java:42)
> >         at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> >         at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> >         at
> org.apache.tez.runtime.library.common.ConfigUtils.getIntermediateOutputKeyComparator(ConfigUtils.java:125)
> >         at
> org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.<init>(ExternalSorter.java:158)
> >         at
> org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.<init>(DefaultSorter.java:116)
> >         at
> org.apache.tez.runtime.library.output.OnFileSortedOutput.start(OnFileSortedOutput.java:109)
> >         at
> SimpleVertexProcessor.initializeInputOutputs(SimpleVertexProcessor.java:190)
> >
> > (2) Here it is not:
> >   java.lang.Exception
> >         at myPartitionerClassName.setConf(TezRecordComparator.java:42)
> >         at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> >         at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> >         at
> org.apache.tez.runtime.library.common.ConfigUtils.getIntermediateInputKeyComparator(ConfigUtils.java:135)
> >         at
> org.apache.tez.runtime.library.common.shuffle.impl.MergeManager.finalMerge(MergeManager.java:808)
> >         at
> org.apache.tez.runtime.library.common.shuffle.impl.MergeManager.close(MergeManager.java:465)
> >         at
> org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:344)
> >
> >
> > Found the following workaround:
> >         Configuration payloadConf =
> TezUtils.createConfFromUserPayload(edgeProperty.getEdgeDestination().getUserPayload());
> >         payloadConf.set("myCustomProperty", "value");
> >
> edgeProperty.getEdgeDestination().setUserPayload(TezUtils.createUserPayloadFromConf(payloadConf));
> >
> > I think it boils down to that the property is passed to the edge input
> but not to its destination !?
> > However, is there some smarter way of making that property available to all
> instantiations of the comparator?
> > I tried using
> >         edgeConfBuilder.setAdditionalConfiguration(...)
> >         edgeConfBuilder.configureOutput().setAdditionalConfiguration(…)
> > but that seems to filter out custom properties.
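
For readers following the two stack traces, a small sketch of why the second instantiation misses the property and what the workaround changes. It assumes each end of an edge carries its own independent payload, modelled here as a string map; `EdgePayloadSketch`, `Edge` and `addToInputPayload` are hypothetical stand-ins and not Tez classes.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: an edge whose two ends carry independent payloads. */
final class EdgePayloadSketch {

    /** Hypothetical stand-in for an edge; not a Tez class. */
    static final class Edge {
        final Map<String, String> outputPayload = new HashMap<>(); // producing side (sorter)
        final Map<String, String> inputPayload = new HashMap<>();  // consuming side (merger)
    }

    /** Mirrors the workaround: read the consuming side's payload, add the key, write it back. */
    static void addToInputPayload(Edge edge, String key, String value) {
        Map<String, String> payloadConf = new HashMap<>(edge.inputPayload);
        payloadConf.put(key, value);
        edge.inputPayload.clear();
        edge.inputPayload.putAll(payloadConf);
    }

    public static void main(String[] args) {
        Edge edge = new Edge();
        // The property reaches only the producing side, as in stack trace (1).
        edge.outputPayload.put("myCustomProperty", "value");
        System.out.println("input sees property before workaround: "
                + edge.inputPayload.containsKey("myCustomProperty"));

        addToInputPayload(edge, "myCustomProperty", "value");
        System.out.println("input sees property after workaround: "
                + edge.inputPayload.containsKey("myCustomProperty"));
    }
}
```

A comparator created on the consuming side only ever sees the consuming side's payload, so the key has to be present there as well.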
> >
> > Also do you plan to use a non-configuration based payload mechanism for
> the edge stuff like you did for the input, output, processor ?
> >
> > Any enlightenment appreciated!
> > Johannes
> >
> >
> >
>
>
