[jira] [Commented] (HBASE-2375) Make decision to split based on aggregate size of all StoreFiles and revisit related config params

2012-02-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207063#comment-13207063
 ] 

stack commented on HBASE-2375:
--

+1 on patch

Make a new issue to change compactionThreshold to at least 4, I'd say, and then you 
can close out this one?

 Make decision to split based on aggregate size of all StoreFiles and revisit 
 related config params
 --

 Key: HBASE-2375
 URL: https://issues.apache.org/jira/browse/HBASE-2375
 Project: HBase
 Issue Type: Improvement
 Components: regionserver
 Affects Versions: 0.20.3
 Reporter: Jonathan Gray
 Assignee: Jonathan Gray
 Priority: Critical
 Labels: moved_from_0_20_5
 Attachments: HBASE-2375-flush-split.patch, HBASE-2375-v8.patch


 Currently we will make the decision to split a region when a single StoreFile 
 in a single family exceeds the maximum region size.  This issue is about 
 changing the decision to split to be based on the aggregate size of all 
 StoreFiles in a single family (but still not aggregating across families).  
 This would move a check to split after flushes rather than after compactions. 
  This issue should also deal with revisiting our default values for some 
 related configuration parameters.
 The motivating factor for this change comes from watching the behavior of 
 RegionServers during heavy write scenarios.
 Today the default behavior goes like this:
 - We fill up regions, and as long as you are not under global RS heap 
 pressure, you will write out 64MB (hbase.hregion.memstore.flush.size) 
 StoreFiles.
 - After we get 3 StoreFiles (hbase.hstore.compactionThreshold) we trigger a 
 compaction on this region.
 - Compaction queues notwithstanding, this will create a 192MB file, not 
 triggering a split based on max region size (hbase.hregion.max.filesize).
 - You'll then flush two more 64MB MemStores and hit the compactionThreshold 
 and trigger a compaction.
 - You end up with 192 + 64 + 64 in a single compaction.  This will create a 
 single 320MB file and will trigger a split.
 - While you are performing the compaction (which now writes out 64MB more 
 than the split size, so takes about 5X as long as a single flush), you are 
 still taking on additional writes into MemStore.
 - Compaction finishes, decision to split is made, region is closed.  The 
 region now has to flush whichever edits made it to MemStore while the 
 compaction ran.  This flushing, in our tests, is by far the dominating factor 
 in how long data is unavailable during a split.  We measured about 1 second 
 to do the region closing, master assignment, reopening.  Flushing could take 
 5-6 seconds, during which time the region is unavailable.
 - The daughter regions re-open on the same RS.  Immediately when the 
 StoreFiles are opened, a compaction is triggered across all of their 
 StoreFiles because they contain references.  Since we cannot currently split 
 a split, we need to not hang on to these references for long.
 This described behavior is really bad because of how often we have to rewrite 
 data onto HDFS.  Imports are usually just IO bound as the RS waits to flush 
 and compact.  In the above example, the first cell to be inserted into this 
 region ends up being written to HDFS 4 times (initial flush, first compaction 
 w/ no split decision, second compaction w/ split decision, third compaction 
 on daughter region).  In addition, we leave a large window where we take on 
 edits (during the second compaction of 320MB) and then must make the region 
 unavailable as we flush it.
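 To make the arithmetic above concrete, here is a minimal, self-contained 
 simulation of the default behavior just described (illustrative only; the 
 class, the loop, and the simplified trigger rules are assumptions for this 
 example, not HBase code). With 64MB flushes, compactionThreshold 3 and a 
 256MB max file size it reproduces the 4-writes count:

{code}
import java.util.ArrayList;
import java.util.List;

// Back-of-the-envelope simulation of the default flush/compact/split behavior
// described above. All names and trigger rules are illustrative.
public class SplitAmplification {
    public static void main(String[] args) {
        final long flushSizeMb = 64;        // hbase.hregion.memstore.flush.size
        final int compactionThreshold = 3;  // hbase.hstore.compactionThreshold
        final long maxFileSizeMb = 256;     // hbase.hregion.max.filesize

        List<Long> storeFiles = new ArrayList<>(); // StoreFile sizes in MB
        int writesOfFirstCell = 0;                 // times the first cell hits HDFS
        boolean split = false;

        while (!split) {
            storeFiles.add(flushSizeMb);                        // MemStore flush
            if (writesOfFirstCell == 0) writesOfFirstCell = 1;  // first flush holds the first cell
            if (storeFiles.size() >= compactionThreshold) {     // compaction on file count
                long merged = storeFiles.stream().mapToLong(Long::longValue).sum();
                storeFiles.clear();
                storeFiles.add(merged);                         // one compacted file
                writesOfFirstCell++;                            // first cell rewritten
                if (merged > maxFileSizeMb) {                   // split decided after compaction
                    split = true;
                    writesOfFirstCell++;                        // daughter-region compaction of references
                }
            }
        }
        // 3 flushes -> 192MB compaction (no split), 2 more flushes -> 320MB compaction -> split.
        System.out.println("first cell written " + writesOfFirstCell + " times"); // prints 4
    }
}
{code}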
 If we increased the compactionThreshold to be 5 and determined splits based 
 on aggregate size, the behavior becomes:
 - We fill up regions, and as long as you are not under global RS heap 
 pressure, you will write out 64MB (hbase.hregion.memstore.flush.size) 
 StoreFiles.
 - After each MemStore flush, we calculate the aggregate size of all 
 StoreFiles.  We can also check the compactionThreshold.  For the first three 
 flushes, neither limit would be hit.  On the fourth flush, we would see 
 total aggregate size = 256MB and determine to split.
 - Decision to split is made, region is closed.  This time, the region just 
 has to flush out whichever edits made it to the MemStore during the 
 snapshot/flush of the previous MemStore.  So this time window has shrunk by 
 more than 75%, as it is now the time to write 64MB from memory rather than 
 320MB from aggregating 5 HDFS files.  This will greatly reduce the time data 
 is unavailable during splits.
 - The daughter regions re-open on the same RS.  Immediately when the 
 StoreFiles are opened, a compaction is triggered across all of their 
 StoreFiles because they contain references.  This would stay the same.
 In this example, we only write a given cell twice (instead of 4 times) while 
 drastically
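 For comparison, a sketch of the proposed check (a hedged, stand-alone helper, 
 not the actual HBase implementation; the class and method names are made up): 
 after every MemStore flush, sum the StoreFile sizes per family and decide to 
 split once the largest family's aggregate reaches hbase.hregion.max.filesize.

{code}
import java.util.List;

// Illustrative helper for the proposed aggregate-size split check.
public final class AggregateSplitCheck {

    // Aggregate size (bytes) of one family's StoreFiles.
    static long aggregateSize(List<Long> storeFileSizes) {
        return storeFileSizes.stream().mapToLong(Long::longValue).sum();
    }

    // Split when the largest family's aggregate reaches the max file size;
    // still no aggregation across families.
    static boolean shouldSplit(List<List<Long>> sizesPerFamily, long maxFileSize) {
        return sizesPerFamily.stream()
                .mapToLong(AggregateSplitCheck::aggregateSize)
                .max()
                .orElse(0L) >= maxFileSize;
    }

    public static void main(String[] args) {
        long maxFileSize = 256L << 20; // 256MB, matching the example above
        long flush = 64L << 20;        // four 64MB flushes, no compaction yet
        List<List<Long>> oneFamily = List.of(List.of(flush, flush, flush, flush));
        System.out.println(shouldSplit(oneFamily, maxFileSize)); // true on the fourth flush
    }
}
{code}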

[jira] [Commented] (HBASE-2375) Make decision to split based on aggregate size of all StoreFiles and revisit related config params

2012-02-09 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204701#comment-13204701
 ] 

Jean-Daniel Cryans commented on HBASE-2375:
---

A bunch of things changed since this jira was created: 
 - we now split based on the store size 
 - regions split at 1GB 
 - memstores flush at 128MB 
 - there's been a lot of work on tuning the store file selection algorithm 

My understanding of this jira is that it aims at making the out-of-the-box 
mass import experience better. Now that we have bulk loads and pre-splitting, 
this use case is becoming less and less important... although we still see 
people trying to benchmark it (hi hypertable). 

I see three things we could do:
 - Trigger splits after flushes; I hacked a patch and it works awesomely
 - Have a lower split size for newly created tables. Hypertable does this with 
a soft limit that gets doubled every time the table splits until it reaches the 
normal split size (see the sketch at the end of this comment)
 - Have multi-way splits (Todd's idea), so that if you have enough data that 
you know you're going to be splitting after the current split, then just spawn 
as many daughters as you need.

I'm planning on just fixing the first bullet point in the context of this 
jira. Maybe there's other stuff from the patch in this jira that we could fit 
in.
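
For the second bullet, a minimal sketch of the soft-limit idea (names and 
numbers are illustrative, not an existing HBase or Hypertable API): the 
effective split size starts low and doubles on every split until it reaches 
the normal configured split size.

{code}
// Illustrative only: effective split size that doubles per split.
public final class DoublingSplitSize {

    static long effectiveSplitSize(long initialSoftLimit, long configuredSplitSize, int splitsSoFar) {
        long size = initialSoftLimit;
        for (int i = 0; i < splitsSoFar && size < configuredSplitSize; i++) {
            size *= 2; // double on each split until the normal split size is reached
        }
        return Math.min(size, configuredSplitSize);
    }

    public static void main(String[] args) {
        long soft = 128L << 20;    // starting soft limit, e.g. 128MB (illustrative)
        long normal = 1024L << 20; // normal split size, e.g. 1GB
        for (int splits = 0; splits <= 4; splits++) {
            System.out.println(splits + " splits -> "
                    + (effectiveSplitSize(soft, normal, splits) >> 20) + "MB");
        }
        // 0 -> 128MB, 1 -> 256MB, 2 -> 512MB, 3 -> 1024MB, 4 -> 1024MB
    }
}
{code}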


[jira] [Commented] (HBASE-2375) Make decision to split based on aggregate size of all StoreFiles and revisit related config params

2012-02-09 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204779#comment-13204779
 ] 

stack commented on HBASE-2375:
--

Doing the first bullet point only sounds good.  Let's file issues for the 
other split suggestions.

What about the other recommendations made up in the issue regarding 
compactionThreshold?

Upping compactionThreshold from 3 to 5 where 5 is >= the number of flushes 
it would take to make us splittable; i.e. the intent is no compaction before 
the first split.

Should we do this too as part of this issue?  We could make our flush size 256M 
and compactionThreshold 5.  Or perhaps that's too rad (that's a big Map to be 
carrying around)?  Instead, up the compactionThreshold and down the default 
region size from 1G to 512M and keep flush at 128M?
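
Rough arithmetic behind those options, as a tiny illustrative helper (numbers 
only, nothing HBase-specific): the number of flushes it takes to reach the 
split size is the value the compactionThreshold would have to meet or exceed 
for no compaction to happen before the first split.

{code}
// Illustrative: flushes needed to reach the split size (ceiling division).
public final class FlushesToSplit {

    static long flushesToSplit(long splitSizeMb, long flushSizeMb) {
        return (splitSizeMb + flushSizeMb - 1) / flushSizeMb;
    }

    public static void main(String[] args) {
        System.out.println(flushesToSplit(1024, 256)); // 256M flush, 1G split size   -> 4
        System.out.println(flushesToSplit(512, 128));  // 128M flush, 512M split size -> 4
        System.out.println(flushesToSplit(1024, 128)); // 128M flush, 1G split size   -> 8
    }
}
{code}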

I took a look at the patch and it's pretty stale now given the changes that 
have gone in since.




[jira] [Commented] (HBASE-2375) Make decision to split based on aggregate size of all StoreFiles and revisit related config params

2012-02-09 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204797#comment-13204797
 ] 

Jean-Daniel Cryans commented on HBASE-2375:
---

bq. Doing the first bullet point only sounds good. Let's file issues for the 
other split suggestions.

Kewl.

bq. Upping compactionThreshold from 3 to 5 where 5 is >= the number of 
flushes it would take to make us splittable; i.e. the intent is no compaction 
before the first split.

Sounds like a change that can have a bigger impact but that mostly helps this 
specific use case...

bq. Instead, up the compactionThreshold and down the default region size from 
1G to 512M and keep flush at 128M?

I'd rather split earlier for the first regions.


[jira] [Commented] (HBASE-2375) Make decision to split based on aggregate size of all StoreFiles and revisit related config params

2012-02-09 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204820#comment-13204820
 ] 

stack commented on HBASE-2375:
--

bq. Upping compactionThreshold from 3 to 5... Sounds like a change that can 
have a bigger impact but that mostly helps this specific use case...

Dunno.  3 strikes me as one of those decisions that made sense a long time ago 
but a bunch has changed since...  We should test it, I suppose.

On changing flush/region size, you'd rather have us split faster at first, then 
slow down as the count of regions goes up.  Ok.


[jira] Commented: (HBASE-2375) Make decision to split based on aggregate size of all StoreFiles and revisit related config params

2010-08-24 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902178#action_12902178
 ] 

Jean-Daniel Cryans commented on HBASE-2375:
---

Do we have enough time to do this if 0.90.0 is due for HW? Punt?


[jira] Commented: (HBASE-2375) Make decision to split based on aggregate size of all StoreFiles and revisit related config params

2010-06-01 Thread Jeff Whiting (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874199#action_12874199
 ] 

Jeff Whiting commented on HBASE-2375:
-

The optimizations here look great.  It seems like an additional optimization 
could still be made.  Looking at the patch, there doesn't seem to be any 
prioritization of compaction requests.  So if you have a region server that is 
in charge of a large number of regions, the compaction queue can still get quite 
large and prevent more important compactions from happening in a timely manner. 
I implemented a priority queue for compactions that may make a lot of sense 
to include with these optimizations (see HBASE-2646).  
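
As a rough illustration of that idea (this is not the HBASE-2646 patch; the 
Request type and the priorities are stand-ins, not HBase classes), queued 
compaction requests could be ordered by priority instead of FIFO:

{code}
import java.util.concurrent.PriorityBlockingQueue;

// Illustrative priority-ordered compaction queue.
public final class PriorityCompactionQueueExample {

    // Stand-in request type; higher priority values are served first.
    record Request(String region, int priority) implements Comparable<Request> {
        @Override public int compareTo(Request other) {
            return Integer.compare(other.priority(), this.priority());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PriorityBlockingQueue<Request> queue = new PriorityBlockingQueue<>();
        queue.put(new Request("region-a", 1));  // routine compaction
        queue.put(new Request("region-b", 10)); // store far over its file-count threshold
        queue.put(new Request("region-c", 5));

        while (!queue.isEmpty()) {
            System.out.println(queue.take().region()); // region-b, region-c, region-a
        }
    }
}
{code}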
