Hi all,

Hive Version: 3.0.0
Hadoop Version: 3.1.0
Tez Version: 0.9.1

I help maintain a Hive SerDe that connects Hive to Apache Solr. Our custom SerDe had fallen well behind current Hive releases (it targeted 1.2.1), but we recently started updating the code to work with Hive 3.0.0.

Since updating the code to 3.0.0, I'm seeing some odd behavior. I can successfully create a table using our SerDe, and I can INSERT into that table (the records end up in Solr as expected). But when I query the external table using HiveCLI (e.g. SELECT * FROM my_external_table), HiveCLI prints a table with the correct columns and the correct number of rows, but with "NULL" in every single cell. (Example here: https://pastebin.com/rffujvpc)

This behavior persists until I restart HiveServer2. After a HiveServer2 restart, HiveCLI returns the correct data, formatted as expected.

I'm sure the problem is in our SerDe itself, but I'm not sure where to start looking. Can anyone point me towards likely culprits that might save me some debugging time, or at least give me a place to start? Is this a symptom others in the community have run into before?

Thanks for any help you can provide. (In case it helps, our SerDe is openly developed and can be found here: https://github.com/lucidworks/hive-solr)

Best,
Jason
