One more data point:  I can read data from this partition as long as I don't 
reference the partition explicitly…

E.g., my partition column is "ArrivalDate", and I have several different 
partitions:  "2012-02-01"…, plus a partition holding my test data with 
ArrivalDate="test".

This works:  'select * from table where <some constraint such that I only get 
results from the "test" partition>'.

And this works:  'select * from table where ArrivalDate="2012-02-01"'

But, this fails:  'select * from table where ArrivalDate="test"'

Does this make sense to anybody?



From: Evan Pollan 
<evan.pol...@bazaarvoice.com<mailto:evan.pol...@bazaarvoice.com>>
Reply-To: <user@hive.apache.org<mailto:user@hive.apache.org>>
Date: Tue, 21 Feb 2012 20:56:07 +0000
To: "user@hive.apache.org<mailto:user@hive.apache.org>" 
<user@hive.apache.org<mailto:user@hive.apache.org>>
Subject: Custom SerDe -- tracking down stack trace

I have a custom SerDe that's initializing properly and works on one data set.  
I built it to adapt to a couple of different data formats, though, and it's 
choking on a different data set (different partitions in the same table).

A NullPointerException is being thrown in deserialize and then wrapped 
in an IOException somewhere up the stack.  The exception shows up in the 
hive output ("Failed with exception 
java.io.IOException:java.lang.NullPointerException"), but I can't find the 
stack trace in any logs.
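
One thing I could try in the meantime, as a rough sketch only: catch whatever the parsing code throws, dump the full stack trace to stderr, and only then rethrow it wrapped. The names below (TraceCapture, RowParser, parseOrReport) are hypothetical stand-ins for illustration, not the actual Hive SerDe interface, and I'm assuming stderr from the CLI process is somewhere I can see it.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;

class TraceCapture {
    /** Hypothetical stand-in for the format-specific parsing inside a SerDe. */
    interface RowParser {
        Object parse(String raw) throws Exception;
    }

    /** Renders a throwable and its cause chain as a single string. */
    static String stackTraceOf(Throwable t) {
        StringWriter sw = new StringWriter();
        PrintWriter pw = new PrintWriter(sw);
        t.printStackTrace(pw);
        pw.flush();
        return sw.toString();
    }

    /**
     * Runs the parser; on failure, prints the full trace to stderr before
     * rethrowing, so the trace survives even if a caller only reports the
     * wrapped exception's message.
     */
    static Object parseOrReport(RowParser parser, String raw) throws IOException {
        try {
            return parser.parse(raw);
        } catch (Exception e) {
            System.err.println("deserialize failed for input: " + raw);
            System.err.println(stackTraceOf(e));
            // Keep the original exception as the cause so nothing is lost.
            throw new IOException(e);
        }
    }
}
```

The same try/catch could presumably go directly inside the SerDe's own deserialize method; the wrapper here just keeps the sketch free of Hive dependencies.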

It's worth noting that I'm running hive via the cli on a machine external to 
the cluster, and the query doesn't get far enough to create any M/R tasks.  I 
looked in all log files in /var/log on the hive client machine, and in all 
userlogs on each cluster instance.  I also looked in derby.log (I'm using the 
embedded metastore) and in /var/lib/hive/metastore on the hive client machine.

I'm sure I'm missing something obvious…  Any ideas?
