Thanks.

On Mar 22, 2013, at 21:02, Nitin Pawar wrote:
> Instead of >=, can you just try = if you want to limit to the top 100? (b
> being a partition, I guess it will have more than 100 records to fit into
> your limit.)
>
> To improve your query performance, your table file format matters as well.
> Which one are you using?
> How many partitions are there?
> What's the size of the cluster?
> You can set the number of reducers, but if your query has just one key then
> only one reducer will get the data and the rest will run empty.
>
> On Sat, Mar 23, 2013 at 4:32 AM, Keith Wiley <kwi...@keithwiley.com> wrote:
> The following query translates into a many-map-single-reduce job (which is
> common) and also slags through the reduce stage...it's killing the overall
> query:
>
> select * from a where b >= 'c' order by b desc limit 100
>
> Note that b is a partition. What component is making the reducer heavy? Is
> it the order by or the limit (I'm sure it's not the partition-specific where
> clause, right?)? Are there ways to improve its performance?
>
> ________________________________________________________________________________
> Keith Wiley     kwi...@keithwiley.com     keithwiley.com     music.keithwiley.com
>
> "You can scratch an itch, but you can't itch a scratch. Furthermore, an itch can
> itch but a scratch can't scratch. Finally, a scratch can itch, but an itch can't
> scratch. All together this implies: He scratched the itch from the scratch that
> itched but would never itch the scratch from the itch that scratched."
>                                            -- Keith Wiley
> ________________________________________________________________________________
>
> --
> Nitin Pawar

________________________________________________________________________________
Keith Wiley     kwi...@keithwiley.com     keithwiley.com     music.keithwiley.com

"It's a fine line between meticulous and obsessive-compulsive and a slippery
rope between obsessive-compulsive and debilitatingly slow."
                                           -- Keith Wiley
________________________________________________________________________________
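
For anyone reading the thread later, here is a rough Python sketch of the question asked above (is it the order by or the limit that makes the single reducer heavy?). It is not Hive code. It assumes that a global sort is done by one reducer that receives every row passing the where clause, and contrasts that with a variant in which each mapper keeps only its own top N rows before the reducer merges them. The row shape and the function names single_reducer_top_n and per_mapper_top_n are made up for the illustration, and whether a given Hive version applies such a limit pushdown is something to check for your own setup.

```python
import heapq
from typing import List, Tuple

# Hypothetical row shape: (partition value b, some payload).
Row = Tuple[str, int]


def single_reducer_top_n(
    rows_by_mapper: List[List[Row]], bound: str, n: int
) -> Tuple[List[Row], int]:
    """Every row with b >= bound is shuffled to one reducer, which sorts
    all of them by b descending and only then keeps the first n."""
    shuffled = [r for rows in rows_by_mapper for r in rows if r[0] >= bound]
    result = sorted(shuffled, key=lambda r: r[0], reverse=True)[:n]
    return result, len(shuffled)


def per_mapper_top_n(
    rows_by_mapper: List[List[Row]], bound: str, n: int
) -> Tuple[List[Row], int]:
    """Each mapper keeps only its own n largest matching rows, so the
    single reducer receives at most n rows per mapper."""
    shuffled: List[Row] = []
    for rows in rows_by_mapper:
        matching = (r for r in rows if r[0] >= bound)
        shuffled.extend(heapq.nlargest(n, matching, key=lambda r: r[0]))
    result = heapq.nlargest(n, shuffled, key=lambda r: r[0])
    return result, len(shuffled)
```

Both functions return the same top n values of b; the second number they return is how many rows the single reducer had to receive. In this sketch the cost comes from the order by, which funnels every matching row through one reducer, while the limit only helps if it is applied before the shuffle.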