Re: UDFClassLoader isolation leaking

2018-09-14 Thread Jason Gerlowski
Hi Gopal,

Thanks for taking a look, and for the workaround suggestion.

Your workaround worked for the original SerDe we encountered this
issue with.  With the stub StorageHandler mentioned above it produced
a different error (https://pastebin.com/iu0hh21C).  But I suspect
that's a problem with the StorageHandler being a bit *too* minimal.
I'll see if I can't correct that later today.

Thanks again,

Jason
On Thu, Sep 13, 2018 at 10:13 PM Gopal Vijayaraghavan wrote:
>
> Hi,
>
> > Hopefully someone can tell me if this is a bug, expected behavior, or 
> > something I'm causing myself :)
>
> I don't think this is expected behaviour, but I'm still looking into where 
> the bug is.
>
> >  We have a custom StorageHandler that we're updating from Hive 1.2.1 to 
> > Hive 3.0.0.
>
> Most likely this bug existed in Hive 1.2 as well, but the FetchTask 
> conversion did not happen for these queries.
>
> I'll probably test out your SerDe tomorrow, but I have two target cases to 
> look into right now.
>
> The first one is that this is related to a different issue I noticed with 
> Hadoop-Common code (i.e. a direct leak).
>
> https://issues.apache.org/jira/browse/HADOOP-10513
>
> The second one is that this is only broken with the Local FetchTask (which 
> gets triggered when you run "select ... limit n").
>
> > SELECT * FROM my_ext_table;
>
> To test those theories, I recommend trying out
>
> set hive.fetch.task.conversion=none;
>
> and running the same query so that the old Hive1 codepaths for reading from 
> the SerDe get triggered.
>
> Cheers,
> Gopal
>
>


Re: UDFClassLoader isolation leaking

2018-09-13 Thread Gopal Vijayaraghavan
Hi,

> Hopefully someone can tell me if this is a bug, expected behavior, or 
> something I'm causing myself :)

I don't think this is expected behaviour, but I'm still looking into where the 
bug is.

>  We have a custom StorageHandler that we're updating from Hive 1.2.1 to Hive 
> 3.0.0.  

Most likely this bug existed in Hive 1.2 as well, but the FetchTask conversion 
did not happen for these queries.

I'll probably test out your SerDe tomorrow, but I have two target cases to look 
into right now.

The first one is that this is related to a different issue I noticed with 
Hadoop-Common code (i.e. a direct leak).

https://issues.apache.org/jira/browse/HADOOP-10513

The second one is that this is only broken with the Local FetchTask (which gets 
triggered when you run "select ... limit n").

> SELECT * FROM my_ext_table;

To test those theories, I recommend trying out

set hive.fetch.task.conversion=none;

and running the same query so that the old Hive1 codepaths for reading from the 
SerDe get triggered.

Cheers,
Gopal




UDFClassLoader isolation leaking

2018-09-13 Thread Jason Gerlowski
Hi all,

Wanted to let you know of a potential bug I've run into when loading
custom jars dynamically (i.e. "ADD JAR /path/to/jar").  Hopefully
someone can tell me if this is a bug, expected behavior, or something
I'm causing myself :)

We have a custom StorageHandler that we're updating from Hive 1.2.1 to
Hive 3.0.0.  During testing we found that under some circumstances,
queries to tables backed by our StorageHandler would return result
sets with 'NULL' in each cell.  Digging in, we found that our SerDe's
deserialize() method was returning null after a failed "instanceof"
sanity check on the input Writable.  Debugging a bit, we found that
the "instanceof" operands were the same class/package, but had been
loaded by two different UDFClassLoader instances.  This behavior seems
suspiciously like what was warned against in an early comment on
HIVE-11878 when UDFClassLoader was introduced, so I'm 99% sure it is
unintended. (see:
https://issues.apache.org/jira/browse/HIVE-11878?focusedCommentId=14876858&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14876858)
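
For anyone unfamiliar with why this breaks "instanceof": the JVM treats
a class as the pair (class name, defining classloader), so the same
bytes defined by two loaders yield two unrelated types.  Below is a
minimal standalone sketch of that effect.  It does not use Hive or
UDFClassLoader at all; HelloWritable and IsolatingLoader are
hypothetical stand-ins, and it assumes the compiled .class file is
readable as a resource from the application classloader.

```java
import java.io.IOException;
import java.io.InputStream;

public class ClassLoaderIsolationDemo {

    /** Hypothetical stand-in for a Writable shipped in a dynamically added jar. */
    public static class HelloWritable {}

    /** Defines one target class from its own bytes; delegates everything else to the parent. */
    static final class IsolatingLoader extends ClassLoader {
        private final String target;

        IsolatingLoader(String target) {
            super(ClassLoaderIsolationDemo.class.getClassLoader());
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null) {
                    // Read the class bytes via the parent, but define them in THIS loader.
                    String path = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(path)) {
                        if (in == null) {
                            throw new ClassNotFoundException(name);
                        }
                        byte[] bytes = in.readAllBytes();
                        loaded = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) {
                    resolveClass(loaded);
                }
                return loaded;
            }
        }
    }

    /** Loads HelloWritable through a fresh loader, so each call yields a distinct Class object. */
    public static Class<?> loadIsolated() throws ClassNotFoundException {
        String name = HelloWritable.class.getName();
        return new IsolatingLoader(name).loadClass(name);
    }

    public static void main(String[] args) throws Exception {
        Class<?> fromA = loadIsolated();
        Class<?> fromB = loadIsolated();
        Object value = fromA.getDeclaredConstructor().newInstance();

        System.out.println("same name:         " + fromA.getName().equals(fromB.getName())); // true
        System.out.println("same Class object: " + (fromA == fromB));                        // false
        System.out.println("fromB.isInstance:  " + fromB.isInstance(value));                 // false
        System.out.println("instanceof:        " + (value instanceof HelloWritable));        // false
    }
}
```

The last three lines mirror what our SerDe sees: the names match, but
the type check fails because the defining loaders differ.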

The behavior is reproducible with the following steps:

1. Find a custom StorageHandler to use.  I wrote a stub StorageHandler
here (https://github.com/gerlowskija/hive-bug-serde/) which reproduces
the issue.
2. Create a table using the StorageHandler: hive -n $hive_user -p
$hive_pass -e "ADD JAR /tmp/mycustomserde.jar; CREATE EXTERNAL TABLE
my_ext_table (hello_col STRING, world_col STRING) STORED BY
'com.helloworld.serde.HelloWorldStorageHandler' LOCATION
'/tmp/some_dir';"
3. Put some data in your external table: hive -n $hive_user -p
$hive_pass -e "ADD JAR /tmp/mycustomserde.jar; INSERT INTO
my_ext_table VALUES ('hello', 'world');"
4. Query your external table: hive -n $hive_user -p $hive_pass -e "ADD
JAR /tmp/mycustomserde.jar; SELECT * FROM my_ext_table;"

Depending on the custom SerDe you're using, the bug might manifest
differently.  But most SerDes that cast the "Writable" arg to a
specific Writable implementation in their deserialize method will
print a table full of 'NULL' values.  (The provided stub
StorageHandler shows the bug this way.  It also logs the "instanceof"
operands out to hiveserver2.log, making the behavior clearer:
"Received unexpected Writable class.  Expected
com.helloworld.serde.HelloWorldWritable from classloader
org.apache.hadoop.hive.ql.exec.UDFClassLoader@489d24e9, but actually
was com.helloworld.serde.HelloWorldWritable from classloader
org.apache.hadoop.hive.ql.exec.UDFClassLoader@75517e2b").
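
For reference, a diagnostic of that shape can be produced with plain
Java.  The sketch below is not the stub's actual code; WritableTypeCheck
and its method names are hypothetical, and it only illustrates logging
both the class name and its defining classloader when a type check
fails.

```java
import java.util.Optional;

public final class WritableTypeCheck {

    private WritableTypeCheck() {}

    /** Renders a class together with the loader that defined it (null means bootstrap). */
    static String describe(Class<?> type) {
        return type.getName() + " from classloader " + type.getClassLoader();
    }

    /**
     * Returns a log message when 'actual' is not an instance of 'expected',
     * or an empty Optional when the type check passes.
     */
    public static Optional<String> mismatchMessage(Class<?> expected, Object actual) {
        if (expected.isInstance(actual)) {
            return Optional.empty();
        }
        String actualDesc = (actual == null) ? "null" : describe(actual.getClass());
        return Optional.of("Received unexpected Writable class.  Expected "
                + describe(expected) + ", but actually was " + actualDesc);
    }

    public static void main(String[] args) {
        // Matching type: nothing to report.
        System.out.println(mismatchMessage(String.class, "ok").isPresent()); // false
        // Mismatched type: both sides are reported with their classloaders.
        mismatchMessage(String.class, 42).ifPresent(System.out::println);
    }
}
```

When the two class names in the message are identical and only the
classloader differs, the problem is loader isolation rather than a
wrong input type.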

I've written the behavior and reproduction steps up in more detail
here: https://github.com/gerlowskija/hive-bug-serde/.  Please let me
know if this is a true bug in Hive as I suspect, or if there's
something I can be doing to avoid these Classloader conflicts.

Thanks,

Jason