Re: Best practice for automating jobs

2013-01-11 Thread Tom Brown
>> New partitions will be added regularly > > What type of partitions are you adding? Why frequently? > > Sean > > On 1/10/13

Re: Best practice for automating jobs

2013-01-10 Thread Tom Brown
session for each query through hive libs, so several queries can run concurrently. > > 2013/1/11 Tom Brown tombrow...@gmail.com > >> How is concurrency achieved with this solution? >> >> On Thursday, January 10, 2013, Qiang Wan

Re: Best practice for automating jobs

2013-01-10 Thread Tom Brown
nt can be achieved by creating crontabs using the HWI. > > It's simple and easy to use. Hope it helps. > > Regards, > Qiang > > 2013/1/11 Tom Brown tombrow...@gmail.com > >> All, >> >> I want to automate jobs against Hive (

Re: Hive double-precision question

2012-12-17 Thread Tom Brown
Doubles are not perfect fractional numbers. Because of rounding errors, a set of doubles added in different orders can produce different results (e.g., a+b+c != b+c+a). Because of this, if your computation is happening in a different order locally than on the Hive server, you might end up with diff
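The ordering effect described above is easy to reproduce outside Hive. The following is a minimal Python sketch, not taken from the original thread; the values 0.1, 0.2 and 0.3 are chosen purely for illustration, and the tolerance passed to `math.isclose` is an assumed value that would need tuning for real data.

```python
import math

# Illustrative values only; they are not from the original thread.
a, b, c = 0.1, 0.2, 0.3

# Same three doubles, summed in two different orders.
left_to_right = (a + b) + c   # a+b+c
rotated = (b + c) + a         # b+c+a

# Each intermediate sum is rounded to the nearest double,
# so the two orders can end on different final values.
orders_differ = left_to_right != rotated
print(left_to_right, rotated, orders_differ)

# Comparing with a tolerance instead of exact equality sidesteps the issue.
# rel_tol=1e-9 is an assumed tolerance, not a recommendation.
close_enough = math.isclose(left_to_right, rotated, rel_tol=1e-9)
print(close_enough)
```

When comparing a locally computed total against a query result, a tolerance-based comparison like the last line is usually more robust than testing for exact equality.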

Re: need help on writing hive query

2012-10-31 Thread Tom Brown
It wouldn't retrieve the user's path in a single string, but you could simply select the user id and current page, ordered by the timestamp. It would require a second step to turn it into the single string path, so that might be a deal-breaker. --Tom On Wed, Oct 31, 2012 at 3:32 PM, Philip Troma
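The second step mentioned above could look something like the following Python sketch. It assumes the query output arrives as `(user_id, page)` pairs already ordered by timestamp; the function name `build_paths`, the separator, and the sample rows are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict

def build_paths(rows, sep=" -> "):
    """Collapse (user_id, page) rows into one path string per user.

    Assumes rows are already ordered by timestamp, as the query
    would return them; pages are appended in the order seen.
    """
    pages_by_user = defaultdict(list)
    for user_id, page in rows:
        pages_by_user[user_id].append(page)
    return {user: sep.join(pages) for user, pages in pages_by_user.items()}

# Hypothetical sample rows, ordered by timestamp.
sample_rows = [
    ("u1", "/home"),
    ("u2", "/search"),
    ("u1", "/cart"),
    ("u1", "/checkout"),
]

for user, path in build_paths(sample_rows).items():
    print(user, path)
```

Because the rows for different users may be interleaved, the sketch groups by user id first and relies only on the relative order of each user's rows.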

Re: Convert JSON array to Hive array

2012-08-29 Thread Tom Brown
I believe the "get_json_object" function will be suitable (though I've never personally tried it in the exact context you describe). https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-getjsonobject --Tom On Wed, Aug 29, 2012 at 7:49 AM, Aleksei Udatšnõi wrote:

Question about query result storage

2012-08-09 Thread Tom Brown
Team, I'm a new Hive user and I've just run my first large query (a few hours). Unfortunately, I ran it from the CLI, and the output was longer than my SSH client's scroll buffer allowed, so I can't see the first half of the result. (It also changes tabs to spaces so properly aligning the column

Question about querying JSON data

2012-08-08 Thread Tom Brown
I have a large amount of JSON data that was generated using periods in the key names, e.g., {"category.field": "value"}. I know that's not the best way to do JSON but for better or worse, it's the data I have to deal with. I have tried using get_json_object, but I am concerned that it's JSON
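The general difficulty with periods in key names can be shown with a small Python sketch. It does not reproduce how get_json_object is implemented; `lookup_dotted` is a hypothetical, deliberately naive resolver that treats each period as a nesting separator, which is the usual reading of a dotted path.

```python
import json

# The example record from the message above.
record = json.loads('{"category.field": "value"}')

def lookup_dotted(obj, path):
    """Hypothetical naive resolver: split on '.' and descend one level per part."""
    current = obj
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current

# Read as a nested path, the lookup searches for record["category"]["field"].
nested_result = lookup_dotted(record, "category.field")

# Read as a single literal key, the lookup succeeds.
flat_result = record.get("category.field")

print(nested_result, flat_result)
```

Any path syntax that uses the period as a separator needs some way to quote or escape a literal period before keys like these can be addressed unambiguously.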