IIRC, the output files for each vertex have the vertex id encoded in their names
to prevent them from overwriting output files from other vertices. Thus the files
for different union member vertices can be written safely under the same output
dir.
Hive might be doing this to maintain uniformity.
Also, I believe you are comparing the Tez code for IFile (which is intermediate
data) vs code for SequenceFile (which is the final output or initial input from
stable storage like HDFS). So they may not be related.
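To illustrate the point about vertex ids in output file names, here is a minimal sketch. The naming pattern and the helper names (`output_name`, `write_outputs`) are made up for illustration and are not taken from the Tez or Hive source; the only idea shown is that embedding a per-vertex id in each file name lets several vertices share one output directory without clobbering each other.

```python
import os
import tempfile


def output_name(vertex_index: int, task_index: int) -> str:
    # Hypothetical naming scheme (an assumption, not Tez's actual format):
    # the vertex index is part of the file name.
    return f"part-v{vertex_index:03d}-r-{task_index:05d}"


def write_outputs(out_dir: str, vertex_index: int, num_tasks: int) -> list:
    paths = []
    for task in range(num_tasks):
        path = os.path.join(out_dir, output_name(vertex_index, task))
        # Mode "x" raises FileExistsError if the file already exists,
        # so any name collision between vertices would surface here.
        with open(path, "x") as f:
            f.write(f"vertex {vertex_index} task {task}\n")
        paths.append(path)
    return paths


# Two union member vertices write into the same directory.
out_dir = tempfile.mkdtemp()
written = write_outputs(out_dir, 1, 2) + write_outputs(out_dir, 2, 2)
names = sorted(os.listdir(out_dir))
print(names)
```

If the vertex index were dropped from the name, the second `write_outputs` call would fail with `FileExistsError`, which is the collision the vertex id is there to avoid.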
-----Original Message-----
From: Gopal Vijayaraghavan
A full stack trace would help determine if this is a Tez issue or a Hive issue.
From: Jim Green [mailto:openkbi...@gmail.com]
Sent: Tuesday, July 21, 2015 11:12 AM
To: u...@tez.apache.org; user@hive.apache.org
Subject: Hive on Tez query failed with “wrong key class
Hi Team,
Env: Hive 1.0 on Tez
That would be in the Hive documentation because it’s the dependent project and
determines its compatibility with downstream projects like Tez.
From: Jim Green [mailto:openkbi...@gmail.com]
Sent: Tuesday, July 07, 2015 10:38 AM
To: u...@tez.apache.org
Cc: user@hive.apache.org
Subject: Re: Hive
(1)- For every Tez AM it is possible to launch just a single query/DAG at a
time. So within a given AM several DAGs can be executed only in sequential
order (a.k.a. a session), not in parallel. To execute DAGs in parallel we
always need several AMs.
Correct. Today a single AM will accept new
Probably a question for the Hive user/dev mailing lists.
*From:* Grandl Robert [mailto:rgra...@yahoo.com]
*Sent:* Wednesday, September 17, 2014 4:00 PM
*To:* u...@tez.apache.org; user@hive.apache.org; Grandl Robert
*Subject:* Re: run tpcds queries
Hmm. It seems that the problem is some columns