On Thu, Sep 1, 2016 at 7:11 AM, Amit Adhau <amit.ad...@globant.com> wrote:
> Thanks Todd, we will be trying the same, hope that this should not affect
> the performance.
>
> We are using hash partition for our table. Can you please suggest, if
> there would be any other config flags that we should look into to improve
> the scan performance. In the past we had used some of the flags that you
> had suggested in your kudu insert performance blog and that helped us in
> kudu writes.
>

Are you using a single Java client to read large amounts of data? If so,
note that you're getting a single-threaded read, so you are most likely not
limited by the server side.

What you could consider is using the ScanToken API to retrieve a bunch of
scan tokens for your query, and then feed them into a thread pool, starting
a new scanner for each token. That should give you parallelism on the
client side.

-Todd

> Thanks,
> Amit
>
> On Aug 31, 2016 10:36 PM, "Todd Lipcon" <t...@cloudera.com> wrote:
>
>> Hi Amit,
>>
>> That's correct, there is no "order by" support in the Java API, because
>> this is an arbitrarily complex operation. Imagine a table with a trillion
>> rows, and asking for "order by" from a Java client. It would have to either
>> download and sort the entire table on your client node (which is
>> infeasible) or would have to somehow ask the servers to perform a huge
>> shuffle and sort, which isn't something Kudu's designed to do.
>>
>> The recommendation is:
>> - if you're just needing to sort small sets of rows, then grab the whole
>> result set and use a normal Java-based sort (Collections.sort)
>> - if you're needing to sort a large number of rows, use something like
>> Impala or Spark SQL to perform the sort.
>>
>> -Todd
>>
>> On Wed, Aug 31, 2016 at 8:06 AM, Amit Adhau <amit.ad...@globant.com>
>> wrote:
>>
>>> Hi Kudu Team,
>>>
>>> Using Java Kudu API, we want to sort the data on kudu table based on
>>> table column, but we have not found any option in API for the same.
>>> Can you please help us on the same.
>>>
>>> --
>>> Thanks & Regards,
>>>
>>> *Amit Adhau* | Data Architect
>>>
>>> *GLOBANT* | IND:+91 9821518132
>>>
>>> The information contained in this e-mail may be confidential. It has
>>> been sent for the sole use of the intended recipient(s). If the reader of
>>> this message is not an intended recipient, you are hereby notified that any
>>> unauthorized review, use, disclosure, dissemination, distribution or
>>> copying of this communication, or any of its contents,
>>> is strictly prohibited. If you have received it by mistake please let
>>> us know by e-mail immediately and delete it from your system. Many
>>> thanks.
>>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>

--
Todd Lipcon
Software Engineer, Cloudera
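
A minimal sketch of the client-side fan-out described in the reply above, combined with the Collections.sort recommendation from the quoted message. It is not Kudu code: the class name `ParallelScanSketch` and the method `scanAll` are made up for illustration, and the "tokens" are plain Java lists standing in for real scan tokens so the example runs with the JDK alone. With the Kudu Java client, the tokens would, as far as I know, come from `KuduClient.newScanTokenBuilder(table).build()` and each task would open its own scanner via `KuduScanToken.intoScanner(client)`; treat those calls as assumptions to verify against the client version you use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class ParallelScanSketch {

    // Runs scanOne once per token on a fixed-size thread pool and
    // concatenates the per-token results in token order.
    static <T, R> List<R> scanAll(List<T> tokens,
                                  Function<T, List<R>> scanOne,
                                  int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<List<R>>> tasks = new ArrayList<>();
            for (T token : tokens) {
                // Hypothetical: with Kudu, scanOne would open a scanner
                // for this token and drain it into a list of rows.
                tasks.add(() -> scanOne.apply(token));
            }
            List<R> rows = new ArrayList<>();
            // invokeAll blocks until every task has finished and returns
            // the futures in the same order as the submitted tasks.
            for (Future<List<R>> result : pool.invokeAll(tasks)) {
                rows.addAll(result.get());
            }
            return rows;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in tokens: each inner list plays the role of the rows
        // that one scan token would return.
        List<List<Integer>> tokens = Arrays.asList(
                Arrays.asList(42, 7),
                Arrays.asList(19),
                Arrays.asList(3, 88));

        List<Integer> rows = scanAll(tokens, t -> t, 3);

        // Small result set, so sort on the client as recommended.
        Collections.sort(rows);
        System.out.println(rows);
    }
}
```

Sorting after the merge only makes sense while the combined result fits comfortably in client memory; for larger result sets the advice in the thread is to push the sort to Impala or Spark SQL instead.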