Re: Smaller Region Size?

2009-12-24 Thread Dhruba Borthakur
...@apache.org] Sent: Wednesday, December 23, 2009 12:47 PM To: hbase-user@hadoop.apache.org Subject: Re: Smaller Region Size? How do you have clocks set up on your systems Mark? Are you using NTP to keep them sane? Am I correct that they are sometimes running backward? - Andy - Original

Re: Smaller Region Size?

2009-12-24 Thread Jean-Daniel Cryans
. -Original Message- From: Andrew Purtell [mailto:apurt...@apache.org] Sent: Wednesday, December 23, 2009 12:47 PM To: hbase-user@hadoop.apache.org Subject: Re: Smaller Region Size? How do you have clocks set up on your systems Mark? Are you using NTP to keep them sane? Am I correct

RE: Smaller Region Size?

2009-12-23 Thread Mark Vigeant
The biggest legitimate reason to run a smaller region size is if your data set is small (let's say 400 MB) but highly accessed, so you want a good spread of regions across your cluster. That's exactly it, my input dataset was 500 MB total (~1,000,000 rows) and it was getting stored as just one

Re: Smaller Region Size?

2009-12-23 Thread Andrew Purtell
@hadoop.apache.org Sent: Wed, December 23, 2009 9:09:04 AM Subject: RE: Smaller Region Size? The biggest legitimate reason to run a smaller region size is if your data set is small (let's say 400 MB) but highly accessed, so you want a good spread of regions across your cluster. That's exactly it, my input

RE: Smaller Region Size?

2009-12-23 Thread Mark Vigeant
: Wednesday, December 23, 2009 12:47 PM To: hbase-user@hadoop.apache.org Subject: Re: Smaller Region Size? How do you have clocks set up on your systems Mark? Are you using NTP to keep them sane? Am I correct that they are sometimes running backward? - Andy - Original Message From

RE: Smaller Region Size?

2009-12-22 Thread Mark Vigeant
performance? Thanks! -Mark -Original Message- From: Mark Vigeant [mailto:mark.vige...@riskmetrics.com] Sent: Monday, December 21, 2009 4:06 PM To: hbase-user@hadoop.apache.org Subject: RE: Smaller Region Size? Thanks J-D! -Original Message- From: jdcry...@gmail.com [mailto:jdcry

Re: Smaller Region Size?

2009-12-22 Thread stack
On Tue, Dec 22, 2009 at 8:57 AM, Mark Vigeant mark.vige...@riskmetrics.com wrote: J-D, I noticed that performance for uploading data into tables got a lot better as I lowered the max file size -- but only up to a certain point, where the performance began slowing down again. Tell us more.

Re: Smaller Region Size?

2009-12-22 Thread Ryan Rawson
The biggest legitimate reason to run a smaller region size is if your data set is small (let's say 400 MB) but highly accessed, so you want a good spread of regions across your cluster. Another is to run a larger region size if you have a huge table and you want to keep the absolute region count low. I

Smaller Region Size?

2009-12-21 Thread Mark Vigeant
Hey Everyone, I would like to make my HRegion size be smaller so that I can test out how my jobs run when the tables are split up across multiple region servers. Is this something I can set in the hbase-site config, or is this an hdfs thing? Thanks a lot! Mark Vigeant RiskMetrics Group, Inc.

Re: Smaller Region Size?

2009-12-21 Thread Jean-Daniel Cryans
Mark, When you create a table you can set MAX_FILESIZE in the shell or in the code. Set it to something smaller than 256MB. J-D On Mon, Dec 21, 2009 at 12:55 PM, Mark Vigeant mark.vige...@riskmetrics.com wrote: Hey Everyone, I would like to make my HRegion size be smaller so that I can test
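
A rough sketch of the arithmetic behind J-D's suggestion, not taken from the thread: MAX_FILESIZE is given in bytes, so a limit in megabytes has to be converted first, and a lower limit should yield more regions for the same data. The helper names below are hypothetical, and the region count is only a crude estimate that assumes the on-disk size is close to the input size. Actual counts depend on when splits and compactions happen, on compression, and on how the row keys are distributed.

```python
import math

MB = 1024 * 1024  # MAX_FILESIZE is expressed in bytes


def max_filesize_bytes(megabytes):
    """Convert a limit in megabytes to the byte value MAX_FILESIZE expects."""
    return int(megabytes * MB)


def estimated_regions(dataset_mb, max_filesize_mb):
    """Crude estimate (an assumption, not HBase's split logic):
    one region per max-file-size chunk of data, never fewer than one."""
    return max(1, math.ceil(dataset_mb / max_filesize_mb))


if __name__ == "__main__":
    dataset_mb = 500  # the dataset size Mark mentions in the thread
    for limit_mb in (256, 128, 64, 32):
        print(
            f"limit {limit_mb} MB = {max_filesize_bytes(limit_mb)} bytes, "
            f"about {estimated_regions(dataset_mb, limit_mb)} region(s)"
        )
```

Treat the estimate as a starting point for experiments like the one Mark describes; stack's follow-up in this thread suggests that very small limits can slow uploads down again.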

RE: Smaller Region Size?

2009-12-21 Thread Mark Vigeant
Thanks J-D! -Original Message- From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of Jean-Daniel Cryans Sent: Monday, December 21, 2009 3:59 PM To: hbase-user@hadoop.apache.org Subject: Re: Smaller Region Size? Mark, When you create a table you can set MAX_FILESIZE