Re: Find TOP 10 using HiveQL

2012-07-10 Thread Wouter de Bie
You could use TRANSFORM with a simple awk script:

TRANSFORM(a, b, c, d) USING "/usr/bin/awk ' {if($1!=c){c=$1; a=0}; if(a<20){print $0; a++}}'"

This will create a top 20 for each group.

--Wouter de Bie Team Lead Analytics Infrastructure, Spotify wou...@
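The counter logic in the awk script above can be sketched in Python, which Hive can also invoke through TRANSFORM. This is only an illustration under stated assumptions: rows are tab-separated, the first column is the group key, and rows reach the script clustered by that key and already ordered within each group (for example via DISTRIBUTE BY and SORT BY in the feeding subquery). The function name top_n_per_group and its parameter n are hypothetical and do not come from the original message.

```python
import sys
from typing import Iterable, Iterator


def top_n_per_group(lines: Iterable[str], n: int = 20) -> Iterator[str]:
    """Yield at most n rows per group, mirroring the awk counter logic.

    Assumption: the first tab-separated column is the group key and rows
    for the same key arrive next to each other, best row first.
    """
    current_key = None  # key of the group being processed
    emitted = 0         # rows already emitted for that group
    for line in lines:
        row = line.rstrip("\n")
        key = row.split("\t", 1)[0]
        if key != current_key:
            # A new group starts: reset the per-group counter.
            current_key = key
            emitted = 0
        if emitted < n:
            yield row
            emitted += 1


def main() -> None:
    # Streaming entry point: read rows from stdin, write kept rows to stdout.
    for row in top_n_per_group(sys.stdin):
        print(row)
```

Calling main() at the end of the file would turn the sketch into a streaming script. Like the awk version, it does no sorting of its own, so the "top" rows are simply the first n rows seen for each key.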

Re: Simple Hive Query

2012-03-05 Thread Wouter de Bie
Hi, Your table specifies 3 ints; however, some of your rows contain strings. If you specify your table as strings and then later parse the fields to an int (by doing some replacements and casting it to an int), it should work. // Wouter On Monday, March 5, 2012 at 12:55 PM, hadoop hive wrote:
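The "replace, then cast" idea can be illustrated with a small Python sketch. The message does not say which replacements are needed, so the cleaning rule here (drop everything except digits and a minus sign) is an assumption, and the helper name to_int is hypothetical. In Hive itself the same idea would typically be expressed with regexp_replace and CAST on a string column.

```python
import re
from typing import Optional


def to_int(raw: str) -> Optional[int]:
    """Clean a string field and cast it to int, or return None.

    Assumption: the stray characters are things like quotes, spaces or
    thousands separators, so removing all non-digit characters (except a
    minus sign) is acceptable for the data at hand.
    """
    cleaned = re.sub(r"[^0-9-]", "", raw)
    # Only accept an optional leading minus followed by digits.
    if re.fullmatch(r"-?\d+", cleaned):
        return int(cleaned)
    return None  # comparable to the NULL Hive yields for a failed cast
```

Returning None for values that cannot be parsed keeps bad rows visible instead of silently turning them into zero.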

Re: Asynchronous query execution

2011-11-15 Thread Wouter de Bie
Another way would be using Hive server. This will execute multiple queries in parallel. --Wouter de Bie Team Lead Analytics Infrastructure, Spotify wou...@spotify.com +46 72 018 0777 On Tuesday, November 15, 2011 at 2:48 PM, Sam Wilson wrote: > If you go this ro
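As a rough client-side illustration of submitting several queries at once, the sketch below uses a thread pool. It does not use any real Hive client API: run_query is a hypothetical callable standing in for whatever call sends one query to the server, and fake_run_query is a stub so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_in_parallel(
    queries: List[str],
    run_query: Callable[[str], object],
    max_workers: int = 4,
) -> Dict[str, object]:
    """Submit every query through run_query on a thread pool.

    run_query is an assumption: any function that sends one query to the
    server and blocks until its result is available.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps results in the same order as the input queries.
        results = list(pool.map(run_query, queries))
    return dict(zip(queries, results))


def fake_run_query(query: str) -> str:
    # Stub used only for illustration; a real client call would go here.
    return "done: " + query
```

For example, run_in_parallel(["q1", "q2"], fake_run_query) returns a dictionary mapping each query string to the stub's result. Whether the queries really run concurrently depends on the server and client in use.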

Re: jets3t 0.7.4

2011-07-22 Thread Wouter de Bie
And setting httpclient.max-connections=100 doesn't seem to be picked up. --Wouter de Bie Developer Business Intelligence, Spotify wou...@spotify.com +46 72 018 0777 On Friday, July 22, 2011 at 4:54 PM, Wouter de Bie wrote: > Hi, > > When I use 0.

Re: jets3t 0.7.4

2011-07-22 Thread Wouter de Bie
,155 DEBUG httpclient.MultiThreadedHttpConnectionManager (MultiThreadedHttpConnectionManager.java:doGetConnection(494)) - Unable to get a connection, waiting..., hostConfig=HostConfiguration[host=https://MYBUCKET.s3.amazonaws.com]

Re: jets3t 0.7.4

2011-07-21 Thread Wouter de Bie
hive 0.7.0+27.1-2~maverick-cdh3 and hadoop 0.20.2+923.21-1
On Thursday, July 21, 2011 at 9:05 PM, Florin Diaconeasa wrote: > What hive version are you using?

jets3t 0.7.4

2011-07-21 Thread Wouter de Bie
aused by: org.jets3t.service.impl.rest.HttpException at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:518) ... 33 more

Re: LEFT OUTER JOIN and partitioned tables

2011-07-10 Thread Wouter de Bie
Hi all, I've solved the issue by setting hive.outerjoin.supports.filters=false. What does this setting do? It's completely undocumented.

Re: Hive session locking up after 4 queries using S3

2011-07-10 Thread Wouter de Bie
Hi Aggarwal, I've upgraded to 0.7.4, but I'm experiencing the same problem. EMR is not an option for now :) // Wouter

LEFT OUTER JOIN and partitioned tables

2011-07-08 Thread Wouter de Bie
Hi all, I'm experiencing problems using LEFT OUTER JOIN with partitioned tables. The following example works as expected:

SELECT a.val1, b.val2 FROM a JOIN b ON a.val1 = b.val1 AND a.dt = 20110708 AND b.dt = 20110708

But when I change it to use a LEFT OUTER JOIN like:

SELECT a.val1, b.val2 FR

Re: Hive session locking up after 4 queries using S3

2011-07-06 Thread Wouter de Bie
bug is in this version. // Wouter

Hive session locking up after 4 queries using S3

2011-07-06 Thread Wouter de Bie
Hi all, I'm using Hive with the s3native FS. Today, I noticed that hive locks up after 4 queries that directly access S3 (select * from mytable limit 10). With debug logging on, I get the following output: 2011-07-06 15:54:31,459 DEBUG s3native.NativeS3FileSystem (NativeS3FileSystem.java:getFi