My best guess would be that an older version of commons-codec is also on
the classpath for the running task. If you have access to the local-dirs
configured under YARN, you could find the application dir in the
local-dirs and see what exists in the classpath for a container.
Alternately, set
I'm assuming you intend to have one vertex for 'joined'. This vertex
processes 'hash' while waiting for data to come in from 'probe' (which is
doing a shuffle / partition)?
The Processor on the 'joined' vertex can absolutely go ahead and process
hash while the Shuffle from the other side is
Thanks for the input folks. I'll go ahead and get started on making the
necessary changes.
On Mon, May 18, 2015 at 6:33 PM, Hitesh Shah hit...@apache.org wrote:
To clarify, my main point was that running Tez 0.8 with Java 7 features
would need the Hadoop cluster to be running with Java 7 ( or
Thanks for your input. I found two odd things:
- In my cluster (and in the resources being localized for the tasks) there
are two versions: commons-codec-1.4.jar and commons-codec-1.7.jar. I
think commons-codec from version 1.4 onward should be fine, but I still got the
exception
- which means that join stage could process hash and wait for probe making
it a 3 stage DAG. However what you see is a 4 stage DAG, since join will
require shuffle on the ‘hash’.
I guess the 3 stage DAG means the DAG in Spark, and the 4 stage DAG means the
DAG you build in Tez. However, I think you
commons-codec has contained the method since 1.4, so those two versions of the
jar in the classpath should not cause issues.
Are any custom jars being loaded? I'd look at any fat jars (Pig
for example) to see if they're bundling commons-codec within them.
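
One way to check for bundled copies is to scan every jar under a directory
(for instance a container's working directory inside the YARN local-dirs) for
entries in the org/apache/commons/codec/ package. The following is only a
minimal sketch in Python: the function name jars_bundling and the directory
passed on the command line are illustrative assumptions, not part of Tez or
YARN. It relies only on the fact that a jar is a zip archive.

```python
import sys
import zipfile
from pathlib import Path

# Package path used by commons-codec classes inside a jar.
CODEC_PREFIX = "org/apache/commons/codec/"


def jars_bundling(root, prefix=CODEC_PREFIX):
    """Return paths of jars under `root` that contain entries under `prefix`.

    `jars_bundling` is a hypothetical helper written for this illustration.
    """
    hits = []
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if any(name.startswith(prefix) for name in zf.namelist()):
                    hits.append(str(jar))
        except (zipfile.BadZipFile, OSError):
            # Skip unreadable files, broken symlinks and non-zip files.
            continue
    return hits


if __name__ == "__main__":
    # Directory to scan; defaults to the current directory.
    for path in jars_bundling(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(path)
```

Any jar in the output other than the commons-codec jars themselves is a
candidate for carrying an older copy of the classes.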
You could try this
The Apache Tez team is proud to announce the release of Apache Tez version 0.6.1.
The Apache Tez project is aimed at creating a framework to build
efficient and scalable data processing applications that can be
modeled as data flow graphs.
This release contains all the bug fixes and improvement