I ran with debug logging and found something interesting: the metastore
client lost its connection RIGHT before the partition error mentioned
above, while data was being moved around. I wonder if the timing on that
is the problem?

14/09/09 12:47:37 [main]: INFO exec.MoveTask: Partition is: {day=null, source=null}

14/09/09 12:47:38 [main]: INFO metadata.Hive: Renaming src:maprfs:/user/hive/scratch/hive-mapr/hive_2014-09-09_12-38-30_860_3555291990145206535-1/-ext-10000/day=2012-11-30/source=20121119_SWAirlines_Spam/000004_0;dest: maprfs:/user/hive/warehouse/intel_flow.db/pcaps/day=2012-11-30/source=20121119_SWAirlines_Spam/000004_0;Status:true

14/09/09 12:48:02 [main]: WARN metastore.RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
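
To make the timing question concrete, here is a toy Python model of one way it could go wrong. This is only a guess at the mechanism, not Hive's actual code: FakeMetastore, add_with_retry and simulate are made-up names, and the partition spec is a placeholder. The assumption is that the metastore commits the new partition, the reply is lost to the read timeout, and the client then retries the same call after reconnecting.

```python
class AlreadyExistsError(Exception):
    """Stand-in for the AlreadyExistsException seen in the job output."""


class FakeMetastore:
    """Hypothetical in-memory metastore; not Hive's real API."""

    def __init__(self, drop_first_reply=False):
        self.partitions = set()
        self.drop_first_reply = drop_first_reply

    def add_partition(self, spec):
        if spec in self.partitions:
            raise AlreadyExistsError("Partition already exists: " + spec)
        # Assumption: the server-side commit happens before the reply is sent.
        self.partitions.add(spec)
        if self.drop_first_reply:
            self.drop_first_reply = False
            # The reply never reaches the client, which sees a read timeout.
            raise TimeoutError("Read timed out")


def add_with_retry(store, spec, retries=1):
    """Retry on timeout, the way a reconnecting client might."""
    for attempt in range(retries + 1):
        try:
            store.add_partition(spec)
            return "created"
        except TimeoutError:
            if attempt == retries:
                raise
            # Reconnect and re-send the same non-idempotent call.


def simulate():
    store = FakeMetastore(drop_first_reply=True)
    try:
        return add_with_retry(store, "day=2012-11-30/source=example")
    except AlreadyExistsError as exc:
        return "failed: " + str(exc)


if __name__ == "__main__":
    print(simulate())
```

If that is what is happening, the failure would not need two reducers colliding at all: a single slow metastore response followed by a retry would be enough.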




On Tue, Sep 9, 2014 at 11:02 AM, John Omernik <j...@omernik.com> wrote:

> I am doing a dynamic partition load in Hive 0.13 using ORC files. This has
> always worked in the past, both with MapReduce V1 and YARN. I am working
> with Mesos now, and trying to troubleshoot this weird error:
>
>
>
> Failed with exception AlreadyExistsException(message:Partition already
> exists
>
>
>
> What's odd is that my insert is a plain insert (without OVERWRITE), so it's
> as if two different reducers have data to go into the same partition, and
> then there is a collision of some sort. Perhaps there is a situation where
> the partition doesn't exist prior to the run, but when two reducers have
> data, they both think they should be the one to create the partition? If a
> partition already exists, shouldn't the reducer just copy its file into the
> partition? I am struggling to see why this would be an issue with Mesos,
> but not on YARN or MRv1.
>
>
> Any thoughts would be welcome.
>
>
> John
>
