Re: Predicate pushdown optimisation not working for ORC

2014-04-03 Thread Abhay Bansal
I was able to find the property with some digging around and experimentation. Never knew that ppd had something to do with this property. On Thu, Apr 3, 2014 at 7:23 PM, Stephen Sprague wrote: > wow. good find. i hope these config settings are well documented and that > you didn't have to spend

Re: Predicate pushdown optimisation not working for ORC

2014-04-03 Thread Abhay Bansal
It is not necessarily the client side. The file can be generated by a running MR job, or by any other process for that matter, which can then feed data to the Hadoop cluster to run Hive queries. -Abhay On Thu, Apr 3, 2014 at 7:40 PM, Bogala, Chandra Reddy wrote: > I thought an ORC file could be generated only

Re: Deserializing into multiple records

2014-04-03 Thread David Quigley
Thanks again Petter, the custom input format was exactly what I needed. Here is an example of my code in case anyone is interested: https://github.com/quicklyNotQuigley/nest Basically, it gives you SQL access to arbitrary JSON data. I know there are solutions for dealing with JSON data in Hive fields but

Re: UDF reflect

2014-04-03 Thread Andy Srine
Just to add to my previous question, I see this example in Hive documentation for the "reflect" UDF. What object or string is the "isEmpty" method working on in the example below? What I am trying to do with hashCode() is something similar. SELECT reflect("java.lang.String", "valueOf", 1),

Problem querying deeply nested data with Parquet and ORC File Hive SerDes

2014-04-03 Thread mpeterson2
Hi, I'm new to using Parquet and ORC files and I'm hitting a problem with querying nested data. Can those file formats be used to query deeply nested data? If yes, why am I getting an error with the SerDes for both of them? Here's the background: I'm starting from a JSON data file like this:

Re: UDF reflect

2014-04-03 Thread John Meagher
It's probably not as pretty as the new built-in version, but this allows scripted UDFs in any javax.script language: https://github.com/livingsocial/HiveSwarm/blob/master/src/main/java/com/livingsocial/hive/udf/ScriptedUDF.java On Thu, Apr 3, 2014 at 4:35 PM, Andy Srine wrote: > Thanks Edward. Bu

Re: UDF reflect

2014-04-03 Thread Andy Srine
Thanks Edward. But "inline groovy" is available on Hive 13, right? I am using an older version. Best, Andy On Thu, Apr 3, 2014 at 11:37 AM, Edward Capriolo wrote: > You can write UDFs in Groovy now. That pretty much means you can just > write a quick method inline now. Makes udf reflect much

Re: UDF reflect

2014-04-03 Thread Edward Capriolo
You can write UDFs in Groovy now. That pretty much means you can just write a quick method inline now. Makes udf reflect much less useful. On Thu, Apr 3, 2014 at 2:22 PM, Andy Srine wrote: > Thanks Szehon and Peyman, I want to call hashCode() on the UUID object. > This object is stored in th

Re: UDF reflect

2014-04-03 Thread Andy Srine
Thanks Szehon and Peyman, I want to call hashCode() on the UUID object. This object is stored in the table as a string, but I can convert it to UUID. That's not the problem. Basically the question is, how do we call this reflect UDF on methods that take no arguments? How do I do the following usin
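For readers following this thread, here is a minimal plain-Java sketch of the computation being asked about: turning a UUID stored as text into a `java.util.UUID` and calling its no-argument `hashCode()`. It is not Hive code, and it makes no claim about how the reflect UDF resolves instance methods in any particular Hive version. The class and method names (`UuidHashSketch`, `hashOf`, `hashOfReflective`) are hypothetical and exist only for illustration.

```java
import java.lang.reflect.Method;
import java.util.UUID;

// Hypothetical illustration class; not part of Hive.
final class UuidHashSketch {

    // Direct form: parse the stored text, then call the no-argument instance method.
    static int hashOf(String uuidText) {
        return UUID.fromString(uuidText).hashCode();
    }

    // The same two steps expressed through java.lang.reflect.
    static int hashOfReflective(String uuidText) throws ReflectiveOperationException {
        Class<?> cls = Class.forName("java.util.UUID");

        // Step 1: static factory method, so the receiver passed to invoke() is null.
        Method fromString = cls.getMethod("fromString", String.class);
        Object uuid = fromString.invoke(null, uuidText);

        // Step 2: instance method with no parameters, so no parameter types are
        // listed in getMethod() and the receiver is the UUID object itself.
        Method hashCode = cls.getMethod("hashCode");
        return (Integer) hashCode.invoke(uuid);
    }

    public static void main(String[] args) throws Exception {
        String text = "00000000-0000-0001-0000-000000000000";
        System.out.println(hashOf(text));
        System.out.println(hashOfReflective(text));
    }
}
```

The point of the sketch is that a no-argument instance method still needs a receiver object, which is a separate question from how many arguments the method takes. A custom UDF wrapping something like `hashOf` would be one way to sidestep the issue, under the assumption that writing a UDF is acceptable in the environment at hand.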

RE: Predicate pushdown optimisation not working for ORC

2014-04-03 Thread Bogala, Chandra Reddy
I thought an ORC file could be generated only by running a Hive query on a staging table and inserting into an ORC table. If there is an option to generate an ORC file on the client side using Java code, can you share that code or links related to it? Thanks, Chandra From: Abhay Bansal [mailto:abhaybansal.1.

Re: Predicate pushdown optimisation not working for ORC

2014-04-03 Thread Stephen Sprague
wow. good find. i hope these config settings are well documented and that you didn't have to spend a lot of time searching for that. Interesting that the default isn't true for this one. On Wed, Apr 2, 2014 at 11:00 PM, Abhay Bansal wrote: > I was able to resolve the issue by setting "hive.optimize

Re: UDF reflect

2014-04-03 Thread Peyman Mohajerian
Maybe your intention is the following: reflect("java.util.UUID", "randomUUID") On Thu, Apr 3, 2014 at 2:33 AM, Szehon Ho wrote: > Hi, according to the description of the reflect UDF, you are trying to > call java.util.UUID.hashcode(uidString), which doesn't seem to be an > existing method on eit
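The distinction drawn in this message can be checked in plain Java. The sketch below is not Hive code and does not reproduce the reflect UDF's internals. It only uses `java.lang.reflect` to show which lookups on `java.util.UUID` succeed: `randomUUID` is a public static method taking no arguments, whereas no public method named `hashCode` (or `hashcode`) accepts a `String`. The class and helper names (`ReflectLookupSketch`, `hasPublicMethod`, `callStaticNoArgs`) are hypothetical.

```java
// Hypothetical illustration class; not part of Hive.
final class ReflectLookupSketch {

    // True if the class exposes a public method with exactly this name and parameter list.
    static boolean hasPublicMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class.forName(className).getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    // Invokes a public static method that takes no arguments (receiver is null).
    static Object callStaticNoArgs(String className, String methodName)
            throws ReflectiveOperationException {
        return Class.forName(className).getMethod(methodName).invoke(null);
    }

    public static void main(String[] args) throws Exception {
        // Static, no arguments: the lookup succeeds and the call returns a UUID.
        System.out.println(hasPublicMethod("java.util.UUID", "randomUUID"));
        System.out.println(callStaticNoArgs("java.util.UUID", "randomUUID"));

        // No method of this name accepts a String, so the lookup fails.
        System.out.println(hasPublicMethod("java.util.UUID", "hashCode", String.class));

        // Method names are case-sensitive: "hashcode" is not "hashCode".
        System.out.println(hasPublicMethod("java.util.UUID", "hashcode"));
    }
}
```

Java resolves a method by its exact name and parameter types, so a lookup fails when either differs, which is consistent with the quoted explanation of why the original call could not be matched.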