[jira] [Updated] (HBASE-7508) Fix simple findbugs

2013-01-07 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7508:
---

Status: Open  (was: Patch Available)

> Fix simple findbugs
> ---
>
> Key: HBASE-7508
> URL: https://issues.apache.org/jira/browse/HBASE-7508
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.96.0
>
> Attachments: 7508.v1.patch, 7508.v2.patch, 7508.v2.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7508) Fix simple findbugs

2013-01-07 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7508:
---

Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)

> Fix simple findbugs
> ---
>
> Key: HBASE-7508
> URL: https://issues.apache.org/jira/browse/HBASE-7508
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.96.0
>
> Attachments: 7508.v1.patch, 7508.v2.patch, 7508.v2.patch
>
>




[jira] [Updated] (HBASE-7508) Fix simple findbugs

2013-01-07 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7508:
---

Attachment: 7508.v2.patch

> Fix simple findbugs
> ---
>
> Key: HBASE-7508
> URL: https://issues.apache.org/jira/browse/HBASE-7508
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.96.0
>
> Attachments: 7508.v1.patch, 7508.v2.patch, 7508.v2.patch
>
>




[jira] [Commented] (HBASE-7441) Make ClusterManager in IntegrationTestingUtility pluggable

2013-01-07 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546280#comment-13546280
 ] 

stack commented on HBASE-7441:
--

[~liushaohui] Patch looks good but I would not add the define 
HBASE_CLUSTER_MANAGER_CLASS in hbase-common.  Keep the define in the class 
where it is actually used; i.e. in 
hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestingUtility.java 


> Make ClusterManager in IntegrationTestingUtility pluggable
> --
>
> Key: HBASE-7441
> URL: https://issues.apache.org/jira/browse/HBASE-7441
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.3
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
>  Labels: newbie, patch
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7441-0.94-v1.patch, HBASE-7441-trunk-v1.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> After the HBASE-7009 patch, we can use ChaosMonkey to test the HBase cluster.
> The ClusterManager uses ssh to stop/start the regionserver or master without a password. To 
> support other cluster manager tools, we need to make the clusterManager in 
> IntegrationTestingUtility pluggable.



[jira] [Updated] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)

2013-01-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7055:


Attachment: HBASE-7055-v5.patch

Rebased the patch; some minor fixes based on /r/

> port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice 
> (not configurable by cf or dynamically)
> -
>
> Key: HBASE-7055
> URL: https://issues.apache.org/jira/browse/HBASE-7055
> Project: HBase
>  Issue Type: Task
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
> HBASE-6371-v3-refactor-only-squashed.patch, 
> HBASE-6371-v4-refactor-only-squashed.patch, 
> HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, 
> HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, 
> HBASE-7055-v4.patch, HBASE-7055-v5.patch
>
>
> There's divergence in the code :(
> See HBASE-6371 for details.



[jira] [Commented] (HBASE-7407) TestMasterFailover under tests some cases and over tests some others

2013-01-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546298#comment-13546298
 ] 

nkeywal commented on HBASE-7407:


@[~jxiang]
NP. I will push one with your previous comment taken into account.

> TestMasterFailover under tests some cases and over tests some others
> 
>
> Key: HBASE-7407
> URL: https://issues.apache.org/jira/browse/HBASE-7407
> Project: HBase
>  Issue Type: Bug
>  Components: master, Region Assignment, test
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 7407.v1.patch, 7407.v2.patch, 7407.v3.patch
>
>
> The tests are done with these settings:
> conf.setInt("hbase.master.assignment.timeoutmonitor.period", 2000);
> conf.setInt("hbase.master.assignment.timeoutmonitor.timeout", 4000);
> As a result:
> 1) Some tests seem to work, but in real life the recovery would take 5 
> minutes or more, as the timeouts in production are always higher. So we don't see the 
> real issues.
> 2) The tests include specific cases that should not happen in production. They 
> work because the timeout catches everything, but these scenarios do not need 
> to be optimized, as they cannot happen. 



[jira] [Commented] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546299#comment-13546299
 ] 

Sergey Shelukhin commented on HBASE-7404:
-

+1 on latest patch

> Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
> --
>
> Key: HBASE-7404
> URL: https://issues.apache.org/jira/browse/HBASE-7404
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7404-trunk-v10.patch, 7404-trunk-v11.patch, 
> 7404-trunk-v12.patch, 7404-trunk-v13.patch, 7404-trunk-v13.txt, 
> 7404-trunk-v14.patch, BucketCache.pdf, hbase-7404-94v2.patch, 
> hbase-7404-trunkv2.patch, hbase-7404-trunkv9.patch, Introduction of Bucket 
> Cache.pdf
>
>
> First, thanks to @neil from Fusion-IO for sharing the source code.
> What's Bucket Cache? 
> It can greatly decrease the CMS pauses and heap fragmentation caused by GC.
> It supports a large cache space for high read performance by using high-speed 
> disks like Fusion-io.
> 1. An implementation of block cache, like LruBlockCache
> 2. Manages the blocks' storage positions itself, through the Bucket Allocator
> 3. The cached blocks can be stored in memory or on the file system
> 4. Bucket Cache can be used as the main block cache (see CombinedBlockCache), 
> combined with LruBlockCache, to decrease the CMS and fragmentation caused by GC.
> 5. BucketCache can also be used as a secondary cache (e.g. using Fusion-io to 
> store blocks) to enlarge the cache space.
> How about SlabCache?
> We studied and tested SlabCache first, but the results were bad, because:
> 1. SlabCache uses SingleSizeCache, whose memory utilization is low because of the 
> variety of block sizes, especially when using DataBlockEncoding.
> 2. SlabCache is used in DoubleBlockCache: a block is cached in both SlabCache 
> and LruBlockCache, and is put into LruBlockCache again on a SlabCache hit, 
> so CMS and heap fragmentation don't get any better.
> 3. Direct (off-heap) memory performance is not as good as heap, and it may cause OOM, so we 
> recommend using the "heap" engine.
> See more in the attachment and in the patch.
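The combined L1/L2 idea described above (a small on-heap LRU with evicted blocks demoted to a larger secondary store, rather than re-buffered on-heap as DoubleBlockCache does) can be sketched in self-contained plain Java. Class and method names here are illustrative assumptions, not the actual HBase classes:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: a small on-heap LRU (standing in for LruBlockCache)
// backed by a larger secondary store (standing in for BucketCache). Blocks
// evicted from L1 fall through to L2 instead of being re-buffered on-heap,
// which is what keeps GC pressure and heap fragmentation down.
public class CombinedCacheSketch {
    private final Map<String, byte[]> l2 = new LinkedHashMap<>();
    private final LinkedHashMap<String, byte[]> l1;

    public CombinedCacheSketch(int l1Capacity) {
        // Access-order LinkedHashMap gives LRU eviction; evicted entries
        // are demoted to the L2 store rather than dropped.
        l1 = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> e) {
                if (size() > l1Capacity) {
                    l2.put(e.getKey(), e.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    public void cacheBlock(String key, byte[] block) {
        l1.put(key, block);
    }

    public byte[] getBlock(String key) {
        byte[] b = l1.get(key);
        if (b == null) {
            // L2 hit: serve it without promoting back on-heap (unlike
            // DoubleBlockCache, which re-caches into the LRU on a hit).
            b = l2.get(key);
        }
        return b;
    }
}
```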



[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546301#comment-13546301
 ] 

Sergey Shelukhin commented on HBASE-5416:
-

Btw, the integration test for this is in HBASE-7383. I will run it locally for 
latest patch.

> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch
>
>
> When a scan is performed, the whole row is loaded into the result list, and after that 
> the filter (if one exists) is applied to decide whether the row is needed.
> But when a scan is performed on several CFs and the filter checks only data from 
> a subset of these CFs, the data from the CFs not checked by the filter is not needed 
> at the filter stage; it is needed only once we have decided to include the current row. In such 
> a case we can significantly reduce the amount of IO performed by a scan by loading 
> only the values actually checked by the filter.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch of 
> megabytes) and is used to filter large entries from snap. Snap is very large 
> (10s of GB) and quite costly to scan. If we need only the rows with 
> some flag specified, we use SingleColumnValueFilter to limit the result to a 
> small subset of the region. But the current implementation loads both CFs to 
> perform the scan, when only a small subset is needed.
> The attached patch adds one routine to the Filter interface that allows a filter to 
> specify which CFs it needs for its operation. In HRegion, we separate all 
> scanners into two groups: those needed by the filter, and the rest (joined). When a new 
> row is considered, only the needed data is loaded and the filter is applied; only if the 
> filter accepts the row is the rest of the data loaded. On our data, this speeds up 
> such scans 30-50 times. It also gives us a way to better normalize the data 
> into separate columns by optimizing the scans performed.
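The two-phase evaluation described in that last paragraph can be sketched in self-contained plain Java. The row representation and names below are illustrative assumptions, not the actual HBase scanner internals:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Illustrative sketch of the two-phase scan: evaluate the filter against
// only the small "essential" column family first, and load the large
// joined family only for rows the filter accepts.
public class TwoPhaseScan {
    // Each row is represented as two per-family maps: the small "flags"
    // family the filter reads, and the large "snap" family that is
    // expensive to load. Parallel lists stand in for per-CF scanners.
    public static List<Map<String, String>> scan(
            List<Map<String, String>> flagsFamily,
            List<Map<String, String>> snapFamily,
            Predicate<Map<String, String>> filter) {
        List<Map<String, String>> results = new ArrayList<>();
        for (int i = 0; i < flagsFamily.size(); i++) {
            // Phase 1: check the filter using only the essential family.
            if (filter.test(flagsFamily.get(i))) {
                // Phase 2: only now "load" the expensive joined family.
                results.add(snapFamily.get(i));
            }
        }
        return results;
    }
}
```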



[jira] [Commented] (HBASE-7414) Convert some HFile metadata to PB

2013-01-07 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546303#comment-13546303
 ] 

stack commented on HBASE-7414:
--

On whether this patch breaks our being able to read hfiles written w/ non-pb 
fileinfo: we have an old v1 hfile under src/test/resources. It is used by 
TestHFileReaderV1. In the test we explicitly read the file trailer and at 
least verify we can read its version. So, if this test passes, we can have 
some faith that we have not broken our ability to read hfiles w/ non-pb trailers.  
Just saying.

On the patch, the name change below could be confusing. In the past I have named 
pb generated files the same as their POJO wrapper or user. I thought I was being 
smart keeping stuff together. Then Elliott gave me a dirty look last year 
after spending an hour or more trying to figure out why something wasn't working, 
only to find that he had been confused by the fact that there was a POJO ServerName 
and a pb ServerName, and it took him a while to realize this.

 // Map of name/values
-message FileInfoProto {
+message FileInfo {
   repeated BytesBytesPair mapEntry = 1;
 }
+
+// HFile file trailer
+message FileTrailer {

Why do the version test below? All files would be written w/ pbs going forward?

{code}
+if (majorVersion > 2 || (majorVersion == 2 && minorVersion >= 
PBUF_TRAILER_MINOR_VERSION)) {
+  serializeAsPB(baosDos);
+} else {
+  serializeAsWritable(baosDos);
+}
{code}

Else patch looks great.  This is a nice change.


> Convert some HFile metadata to PB
> -
>
> Key: HBASE-7414
> URL: https://issues.apache.org/jira/browse/HBASE-7414
> Project: HBase
>  Issue Type: Task
>  Components: HFile
>Reporter: stack
>Assignee: Andrew Purtell
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7414.patch, 7414.patch, 7414.patch
>
>
> See HBASE-7201
> The conversion should be done in a manner that does not prevent us from reading 
> old-style hfiles with Writable metadata.



[jira] [Created] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-07 Thread Jean-Daniel Cryans (JIRA)
Jean-Daniel Cryans created HBASE-7513:
-

 Summary: HDFSBlocksDistribution shouldn't send NPEs when something 
goes wrong
 Key: HBASE-7513
 URL: https://issues.apache.org/jira/browse/HBASE-7513
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Daniel Cryans
Priority: Minor
 Fix For: 0.96.0


I saw a pretty weird failure on a cluster with corrupted files and this 
particular exception really threw me off:

{noformat}
2013-01-07 09:58:59,054 ERROR 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of 
region=redacted., starting to roll back the global memstore size.
java.io.IOException: java.io.IOException: java.lang.NullPointerException: empty 
hosts
at 
org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
at 
org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
at 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
at 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
at 
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
at 
org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
at org.apache.hadoop.hbase.regionserver.Store.&lt;init&gt;(Store.java:256)
at 
org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
... 3 more
Caused by: java.lang.NullPointerException: empty hosts
at 
org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
at 
org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
at 
org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
at 
org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
at 
org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
... 8 more
2013-01-07 09:58:59,059 INFO 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
region "redacted" failed, marking as FAILED_OPEN in ZK
{noformat}

This is what the code looks like:

{code}
if (hosts == null || hosts.length == 0) {
 throw new NullPointerException("empty hosts");
}
{code}

So {{hosts}} can exist (just be empty), yet we throw an NPE anyway? And then this is wrapped in 
{{Store}} by:

{code}
} catch (ExecutionException e) {
  throw new IOException(e.getCause());
{code}

FWIW there's another NPE thrown in 
{{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.

We should change the code to just skip computing the locality when the host 
information is missing, rather than throwing big ugly exceptions. In this case the 
region would fail to open later anyway, but at least the error message will be clearer.
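A minimal sketch of that direction in plain Java: skip blocks whose host list is missing instead of throwing, losing only the locality information for that block. The class and method names below are illustrative, not the actual HDFSBlocksDistribution code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed fix: a missing or empty host list means locality
// is simply unknown for that block, so we skip it rather than throw an NPE.
public class BlockWeights {
    private final Map<String, Long> hostWeights = new HashMap<>();

    public void addHostsAndBlockWeight(String[] hosts, long weight) {
        if (hosts == null || hosts.length == 0) {
            return; // skip silently; no locality info for this block
        }
        for (String host : hosts) {
            // Accumulate the block weight per host.
            hostWeights.merge(host, weight, Long::sum);
        }
    }

    public long weightFor(String host) {
        return hostWeights.getOrDefault(host, 0L);
    }
}
```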



[jira] [Commented] (HBASE-7441) Make ClusterManager in IntegrationTestingUtility pluggable

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546313#comment-13546313
 ] 

Sergey Shelukhin commented on HBASE-7441:
-

Sorry, was away... I'll take a look today

> Make ClusterManager in IntegrationTestingUtility pluggable
> --
>
> Key: HBASE-7441
> URL: https://issues.apache.org/jira/browse/HBASE-7441
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.3
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
>  Labels: newbie, patch
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7441-0.94-v1.patch, HBASE-7441-trunk-v1.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> After the HBASE-7009 patch, we can use ChaosMonkey to test the HBase cluster.
> The ClusterManager uses ssh to stop/start the regionserver or master without a password. To 
> support other cluster manager tools, we need to make the clusterManager in 
> IntegrationTestingUtility pluggable.



[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546318#comment-13546318
 ] 

Sergey Shelukhin commented on HBASE-7055:
-

From the description, the scenario that was used as a basis for this feature 
is just compacting mid-range (in time) data preferentially, avoiding hot 
data and old data. That makes sense in the general case; a specific scenario with 
a large improvement that I can think of is spiky data uploads, where a 
relatively large amount of data gets put at once and only recent data is 
accessed (not just from the last spike, but recent as such). Then it doesn't 
make a lot of sense to compact recent data with old data, and triggering a 
compaction after every spike doesn't make sense either. This is pure 
speculation, though.
Based on that, a more flexible tiered scheme was developed which can also be 
applied to other patterns. I am not sure about the applicability of size tiers.

Would you be ok if we put this as an example scenario in documentation and 
javadoc?

[~liyin] [~akashnil07] can you clarify if there are other scenarios for which 
this was

> port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice 
> (not configurable by cf or dynamically)
> -
>
> Key: HBASE-7055
> URL: https://issues.apache.org/jira/browse/HBASE-7055
> Project: HBase
>  Issue Type: Task
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
> HBASE-6371-v3-refactor-only-squashed.patch, 
> HBASE-6371-v4-refactor-only-squashed.patch, 
> HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, 
> HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, 
> HBASE-7055-v4.patch, HBASE-7055-v5.patch
>
>
> There's divergence in the code :(
> See HBASE-6371 for details.



[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546320#comment-13546320
 ] 

Sergey Shelukhin commented on HBASE-7055:
-

...added at FB?

> port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice 
> (not configurable by cf or dynamically)
> -
>
> Key: HBASE-7055
> URL: https://issues.apache.org/jira/browse/HBASE-7055
> Project: HBase
>  Issue Type: Task
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
> HBASE-6371-v3-refactor-only-squashed.patch, 
> HBASE-6371-v4-refactor-only-squashed.patch, 
> HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, 
> HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, 
> HBASE-7055-v4.patch, HBASE-7055-v5.patch
>
>
> There's divergence in the code :(
> See HBASE-6371 for details.



[jira] [Updated] (HBASE-7414) Convert some HFile metadata to PB

2013-01-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-7414:
--

Attachment: 7414.patch

> Convert some HFile metadata to PB
> -
>
> Key: HBASE-7414
> URL: https://issues.apache.org/jira/browse/HBASE-7414
> Project: HBase
>  Issue Type: Task
>  Components: HFile
>Reporter: stack
>Assignee: Andrew Purtell
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7414.patch, 7414.patch, 7414.patch, 7414.patch
>
>
> See HBASE-7201
> The conversion should be done in a manner that does not prevent us from reading 
> old-style hfiles with Writable metadata.



[jira] [Commented] (HBASE-7414) Convert some HFile metadata to PB

2013-01-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546335#comment-13546335
 ] 

Andrew Purtell commented on HBASE-7414:
---

bq. So, if this test passes, we can have some faith we have not broke our being 
able to read hfiles w/ non-pb trailers. Just saying.

Yes this test passes.

bq. Why do the below version test? All files would be written w/ pbs going 
forward?

Yes. Of course, if someone implements HFileV3 some day, this is free to be 
changed.

{quote}
-message FileInfoProto {
+message FileInfo
{quote}

Attached updated patch that appends 'Proto' to the PB message names.

> Convert some HFile metadata to PB
> -
>
> Key: HBASE-7414
> URL: https://issues.apache.org/jira/browse/HBASE-7414
> Project: HBase
>  Issue Type: Task
>  Components: HFile
>Reporter: stack
>Assignee: Andrew Purtell
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7414.patch, 7414.patch, 7414.patch, 7414.patch
>
>
> See HBASE-7201
> The conversion should be done in a manner that does not prevent us from reading 
> old-style hfiles with Writable metadata.



[jira] [Commented] (HBASE-7414) Convert some HFile metadata to PB

2013-01-07 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546339#comment-13546339
 ] 

stack commented on HBASE-7414:
--

+1

> Convert some HFile metadata to PB
> -
>
> Key: HBASE-7414
> URL: https://issues.apache.org/jira/browse/HBASE-7414
> Project: HBase
>  Issue Type: Task
>  Components: HFile
>Reporter: stack
>Assignee: Andrew Purtell
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7414.patch, 7414.patch, 7414.patch, 7414.patch
>
>
> See HBASE-7201
> The conversion should be done in a manner that does not prevent us from reading 
> old-style hfiles with Writable metadata.



[jira] [Commented] (HBASE-7507) Make memstore flush be able to retry after exception

2013-01-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546341#comment-13546341
 ] 

Andrew Purtell commented on HBASE-7507:
---

bq. Should we open a new issue to retry all hdfs operations?  Put the retries 
into our wrapper around our filesystem instance, HFileSystem?

We could have methods that accept parameters for the op and then an optional 
retry count/flag? Some cases won't want to retry?

> Make memstore flush be able to retry after exception
> 
>
> Key: HBASE-7507
> URL: https://issues.apache.org/jira/browse/HBASE-7507
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7507-trunk v1.patch
>
>
> We will abort the regionserver if a memstore flush throws an exception.
> I think we could retry, to make the regionserver more stable, because the file 
> system may be unavailable for a transient period, e.g. when switching namenodes in a 
> NameNode HA environment.
> {code}
> HRegion#internalFlushcache() {
>   ...
>   try {
>     ...
>   } catch (Throwable t) {
>     DroppedSnapshotException dse = new DroppedSnapshotException("region: " +
>         Bytes.toStringBinary(getRegionName()));
>     dse.initCause(t);
>     throw dse;
>   }
>   ...
> }
> MemStoreFlusher#flushRegion() {
>   ...
>   try {
>     region.flushcache();
>   } catch (DroppedSnapshotException ex) {
>     server.abort("Replay of HLog required. Forcing server shutdown", ex);
>   }
>   ...
> }
> {code}
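The retry idea proposed above can be sketched as a generic bounded-retry wrapper in plain Java. The helper name and the retry policy (fixed attempt count, retry only on IOException) are illustrative assumptions, not the actual HBase flush code path:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Sketch: retry an operation a bounded number of times before giving up,
// so a transient failure (e.g. a NameNode failover window) does not
// immediately abort the regionserver.
public class RetryingOp {
    public static <T> T callWithRetries(Callable<T> op, int maxAttempts)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                // Treat I/O failures as potentially transient; remember the
                // last one and try again. Other exceptions propagate as-is.
                last = e;
            }
        }
        throw last; // exhausted all attempts
    }
}
```

A real version would likely also back off between attempts and distinguish retriable from fatal I/O errors.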



[jira] [Commented] (HBASE-7407) TestMasterFailover under tests some cases and over tests some others

2013-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546358#comment-13546358
 ] 

Hadoop QA commented on HBASE-7407:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563588/7407.v3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:green}+1 core tests{color}.  The patch passed unit tests in .

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3913//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3913//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3913//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3913//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3913//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3913//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3913//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3913//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3913//console

This message is automatically generated.

> TestMasterFailover under tests some cases and over tests some others
> 
>
> Key: HBASE-7407
> URL: https://issues.apache.org/jira/browse/HBASE-7407
> Project: HBase
>  Issue Type: Bug
>  Components: master, Region Assignment, test
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 7407.v1.patch, 7407.v2.patch, 7407.v3.patch
>
>
> The tests are run with these settings:
> conf.setInt("hbase.master.assignment.timeoutmonitor.period", 2000);
> conf.setInt("hbase.master.assignment.timeoutmonitor.timeout", 4000);
> As a result:
> 1) Some tests seem to work, but in real life the recovery would take 5 
> minutes or more, as these timeouts are always set higher in production. So we 
> don't see the real issues.
> 2) The tests include specific cases that should not happen in production. They 
> work because the timeout catches everything, but these scenarios do not need 
> to be optimized, as they cannot happen. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-07 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark reassigned HBASE-7513:


Assignee: Elliott Clark

> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>   at org.apache.hadoop.hbase.regionserver.Store.(Store.java:256)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>   at 
> org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>   ... 8 more
> 2013-01-07 09:58:59,059 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can be non-null (just empty) but we throw an NPE anyway? And then 
> this is wrapped in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big ugly exceptions. In this case the region would fail opening 
> later anyways but at least the error message will be clearer.
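The suggested behavior can be sketched as follows. This is a minimal illustration, not the actual HBase code; the class and method names are assumed from the stack trace above, and the real implementation tracks per-host weights in more detail:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the suggested behavior: when host information is missing,
// skip the locality accounting instead of throwing a NullPointerException.
// Names are assumed from the stack trace above; the real HBase class differs.
public class HDFSBlocksDistribution {
  private final Map<String, Long> hostWeights = new HashMap<>();
  private long totalWeight = 0;

  public void addHostsAndBlockWeight(String[] hosts, long weight) {
    if (hosts == null || hosts.length == 0) {
      // Locality info is only an optimization hint; a block with unknown
      // hosts should not fail the store file open, so just skip it.
      return;
    }
    totalWeight += weight;
    for (String host : hosts) {
      hostWeights.merge(host, weight, Long::sum);
    }
  }

  public long getTotalWeight() {
    return totalWeight;
  }
}
```

With this shape, a corrupted or host-less block simply contributes nothing to the distribution, and the region open proceeds until a clearer error surfaces (if at all).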

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-07 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546366#comment-13546366
 ] 

Elliott Clark commented on HBASE-7513:
--

+1 I'll get this.  Creating an HDFS Block Distribution shouldn't fail opening a 
store file.

> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>   at org.apache.hadoop.hbase.regionserver.Store.(Store.java:256)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>   at 
> org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>   ... 8 more
> 2013-01-07 09:58:59,059 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can be non-null (just empty) but we throw an NPE anyway? And then 
> this is wrapped in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big ugly exceptions. In this case the region would fail opening 
> later anyways but at least the error message will be clearer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7419) revisit hfilelink file name format.

2013-01-07 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546390#comment-13546390
 ] 

Matteo Bertozzi commented on HBASE-7419:


Any suggestion on what the separator should be if not '='?
at this point is just matter of picking one symbol that works, and replace it 
in the regex.

> revisit hfilelink file name format.
> ---
>
> Key: HBASE-7419
> URL: https://issues.apache.org/jira/browse/HBASE-7419
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client, master, regionserver, snapshots, Zookeeper
>Reporter: Jonathan Hsieh
>Assignee: Matteo Bertozzi
> Fix For: hbase-6055, 0.96.0
>
> Attachments: HBASE-7419-v0.patch, HBASE-7419-v1.patch, 
> HBASE-7419-v2.patch
>
>
> A valid table name concatenated with a '.' to a valid region name is also a 
> valid table name, which leads to incorrect interpretation.
> {code}
> true hfile name constraints: [0-9]+(?:_SeqID_[0-9]+)?
> region name constraints: [a-f0-9]{16}  (but we currently just use 
> [a-f0-9]+.)
> table name constraints : [a-zA-Z0-9_][a-zA-Z0-9_.-]*
> {code}
> Notice that the table name constraints completely cover both the region name 
> constraints and the true hfile name constraints (a valid hfile name is a valid 
> part of a table name, and a valid encoded region name is a valid part of a 
> table name).
> Currently the hfilelink filename convention is --.  
> Unfortunately, making a ref to this uses the name 
> --. -- the concatenation of 
> . is a valid table name and gets interpreted as such. 
> The fix in HBASE-7339 requires a FileNotFoundException before going down the 
> hfile link resolution path. 
> Regardless of what we do, we need to add some character that is invalid for 
> table names to the hfilelink or reference filename convention.
> Suggestion: if we changed the order of the hfile-link name we could avoid 
> some of the confusion -- @-. (or some 
> separator char other than '@') could be used to avoid handling the initial 
> FileNotFoundException, but I think we'd still need a good chunk of the logic 
> to handle opening a half-storefile reader through an hfilelink.
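The ambiguity can be demonstrated directly from the constraints quoted above. This is an illustrative check with java.util.regex, not HBase code; the sample names are invented:

```java
import java.util.regex.Pattern;

// Demonstrates the ambiguity described above: every valid hfile name and
// every valid encoded region name also matches the table name pattern, so a
// '.'-joined link name can be parsed as a plain table name.
public class NameAmbiguityDemo {
  static final Pattern HFILE  = Pattern.compile("[0-9]+(?:_SeqID_[0-9]+)?");
  static final Pattern REGION = Pattern.compile("[a-f0-9]{16}");
  static final Pattern TABLE  = Pattern.compile("[a-zA-Z0-9_][a-zA-Z0-9_.-]*");

  public static void main(String[] args) {
    String hfile = "1234_SeqID_5678";       // matches the hfile pattern
    String region = "abcdef0123456789";     // matches the region pattern
    // Both also match the table pattern, and so does their '.'-joined form,
    // which is why a link filename is indistinguishable from a table name.
    System.out.println(TABLE.matcher(hfile).matches());
    System.out.println(TABLE.matcher(region + "." + hfile).matches());
  }
}
```

Any separator outside `[a-zA-Z0-9_.-]` (such as '@' or '=') breaks this overlap, which is the whole point of the proposed change.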

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)

2013-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546410#comment-13546410
 ] 

Hadoop QA commented on HBASE-7055:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563629/HBASE-7055-v5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3917//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3917//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3917//console

This message is automatically generated.

> port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice 
> (not configurable by cf or dynamically)
> -
>
> Key: HBASE-7055
> URL: https://issues.apache.org/jira/browse/HBASE-7055
> Project: HBase
>  Issue Type: Task
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
> HBASE-6371-v3-refactor-only-squashed.patch, 
> HBASE-6371-v4-refactor-only-squashed.patch, 
> HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, 
> HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, 
> HBASE-7055-v4.patch, HBASE-7055-v5.patch
>
>
> There's divergence in the code :(
> See HBASE-6371 for details.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7414) Convert some HFile metadata to PB

2013-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546412#comment-13546412
 ] 

Hadoop QA commented on HBASE-7414:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563636/7414.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 2 zombie test(s):   
at 
org.apache.hadoop.hdfs.TestLargeBlock.testLargeBlockSize(TestLargeBlock.java:164)
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.testBalancerWithRackLocality(TestBalancerWithNodeGroup.java:220)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3916//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3916//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3916//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3916//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3916//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3916//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3916//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3916//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3916//console

This message is automatically generated.

> Convert some HFile metadata to PB
> -
>
> Key: HBASE-7414
> URL: https://issues.apache.org/jira/browse/HBASE-7414
> Project: HBase
>  Issue Type: Task
>  Components: HFile
>Reporter: stack
>Assignee: Andrew Purtell
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7414.patch, 7414.patch, 7414.patch, 7414.patch
>
>
> See HBASE-7201
> Convertion should be in a manner that does not prevent our being able to read 
> old style hfiles with Writable metadata.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7476) HBase shell count command doesn't escape binary output

2013-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546419#comment-13546419
 ] 

Hudson commented on HBASE-7476:
---

Integrated in HBase-0.94 #713 (See 
[https://builds.apache.org/job/HBase-0.94/713/])
HBASE-7476 HBase shell count command doesn't escape binary output (Revision 
1430004)

 Result = SUCCESS
stack : 
Files : 
* /hbase/branches/0.94/src/main/ruby/hbase/table.rb


> HBase shell count command doesn't escape binary output
> --
>
> Key: HBASE-7476
> URL: https://issues.apache.org/jira/browse/HBASE-7476
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Reporter: Gabriel Reid
>Priority: Minor
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7476_1.patch, HBASE-7476.patch
>
>
> When running the count command in the HBase shell, the row key is printed 
> each time a count interval is reached. However, the key is printed verbatim, 
> meaning that non-printable characters are directly printed to the terminal. 
> This can cause confusing results, or even leave the terminal in an unusable 
> state.
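The usual fix for this class of problem is to escape non-printable bytes before printing. A minimal sketch of the idea follows; it is illustrative only (the actual patch modifies the shell's Ruby code in table.rb), and the class name is invented:

```java
// Escapes non-printable bytes in a row key so that printing it cannot emit
// raw control characters to the terminal. Illustrative sketch only.
public class KeyEscaper {
  public static String escape(byte[] key) {
    StringBuilder sb = new StringBuilder();
    for (byte b : key) {
      int v = b & 0xFF;
      if (v >= 32 && v < 127 && v != '\\') {
        sb.append((char) v);                    // printable ASCII passes through
      } else {
        sb.append(String.format("\\x%02X", v)); // everything else is hex-escaped
      }
    }
    return sb.toString();
  }
}
```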

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7441) Make ClusterManager in IntegrationTestingUtility pluggable

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546421#comment-13546421
 ] 

Sergey Shelukhin commented on HBASE-7441:
-

Agree that HBASE_CLUSTER_MANAGER_CLASS shouldn't be in common. Otherwise +1.

What is the example usage? There's no alternative class added; does it make 
sense to add it here too?

> Make ClusterManager in IntegrationTestingUtility pluggable
> --
>
> Key: HBASE-7441
> URL: https://issues.apache.org/jira/browse/HBASE-7441
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.3
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
>  Labels: newbie, patch
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7441-0.94-v1.patch, HBASE-7441-trunk-v1.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> After the patch in HBASE-7009, we can use ChaosMonkey to test an HBase 
> cluster. The ClusterManager uses passwordless ssh to stop/start the 
> regionserver or master. To support other cluster management tools, we need to 
> make the ClusterManager in IntegrationTestingUtility pluggable.
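Making such a component pluggable typically comes down to loading an implementation class named in the configuration. A minimal reflection-based sketch follows; the interface, default class, and property name here are invented for illustration and are not the actual HBase names:

```java
import java.util.Properties;

// Sketch of a pluggable cluster manager loaded by class name from
// configuration. All names are invented for illustration.
interface ClusterManager {
  void start(String service, String host) throws Exception;
  void stop(String service, String host) throws Exception;
}

// Default implementation, standing in for the ssh-based manager.
class SshClusterManager implements ClusterManager {
  public void start(String service, String host) { /* ssh-based start elided */ }
  public void stop(String service, String host) { /* ssh-based stop elided */ }
}

class ClusterManagerFactory {
  static ClusterManager create(Properties conf) throws Exception {
    // Fall back to the ssh-based manager when no class is configured.
    String className = conf.getProperty(
        "test.cluster.manager.class", SshClusterManager.class.getName());
    return (ClusterManager) Class.forName(className)
        .getDeclaredConstructor().newInstance();
  }
}
```

An alternative manager (e.g. one driving a cluster-management REST API) would then only need to implement the interface and be named in the test configuration.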

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7476) HBase shell count command doesn't escape binary output

2013-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546422#comment-13546422
 ] 

Hudson commented on HBASE-7476:
---

Integrated in HBase-TRUNK #3707 (See 
[https://builds.apache.org/job/HBase-TRUNK/3707/])
HBASE-7476 HBase shell count command doesn't escape binary output (Revision 
1430003)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/hbase-server/src/main/ruby/hbase/table.rb


> HBase shell count command doesn't escape binary output
> --
>
> Key: HBASE-7476
> URL: https://issues.apache.org/jira/browse/HBASE-7476
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Reporter: Gabriel Reid
>Priority: Minor
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7476_1.patch, HBASE-7476.patch
>
>
> When running the count command in the HBase shell, the row key is printed 
> each time a count interval is reached. However, the key is printed verbatim, 
> meaning that non-printable characters are directly printed to the terminal. 
> This can cause confusing results, or even leave the terminal in an unusable 
> state.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546430#comment-13546430
 ] 

Sergey Shelukhin commented on HBASE-5416:
-

Btw, the test appears to pass. 

> Improve performance of scans with some kind of filters.
> ---
>
> Key: HBASE-5416
> URL: https://issues.apache.org/jira/browse/HBASE-5416
> Project: HBase
>  Issue Type: Improvement
>  Components: Filters, Performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: 5416-0.94-v1.txt, 5416-0.94-v2.txt, 
> 5416-Filtered_scans_v6.patch, 5416-v13.patch, 5416-v14.patch, 5416-v15.patch, 
> 5416-v16.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, 
> Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, 
> Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, 
> HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v12.patch, 
> HBASE-5416-v12.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, 
> HBASE-5416-v9.patch
>
>
> When a scan is performed, the whole row is loaded into the result list, and 
> after that the filter (if one exists) is applied to decide whether the row is 
> needed.
> But when a scan is performed on several CFs and the filter checks only data 
> from a subset of these CFs, the data from CFs not checked by the filter is not 
> needed at the filter stage, only once we have decided to include the current 
> row. In such a case we can significantly reduce the amount of IO performed by 
> a scan by loading only the values actually checked by the filter.
> For example, we have two CFs: flags and snap. Flags is quite small (a bunch of 
> megabytes) and is used to filter large entries from snap. Snap is very large 
> (10s of GB) and it is quite costly to scan it. If we need only rows with some 
> flag specified, we use SingleColumnValueFilter to limit the result to a small 
> subset of the region. But the current implementation loads both CFs to perform 
> the scan, when only a small subset is needed.
> The attached patch adds one routine to the Filter interface to allow a filter 
> to specify which CFs it needs for its operation. In HRegion, we separate all 
> scanners into two groups: those needed for the filter and the rest (joined). 
> When a new row is considered, only the needed data is loaded and the filter 
> applied; only if the filter accepts the row is the rest of the data loaded. On 
> our data, this speeds up such scans 30-50 times. It also gives us a way to 
> better normalize the data into separate columns by optimizing the scans 
> performed.
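The idea above can be sketched as a two-phase row scan: read only the families the filter needs, and load the rest only for accepted rows. This is an illustration of the approach with invented names and an in-memory store, not the actual HRegion scanner code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Two-phase scan sketch: phase 1 loads only the column family the filter
// checks (the "essential" family); phase 2 loads the remaining ("joined")
// family only for rows the filter accepted.
public class TwoPhaseScanDemo {
  // row -> (family -> value)
  static final Map<String, Map<String, String>> STORE = new HashMap<>();
  static int cellsRead = 0; // crude stand-in for IO cost

  static String read(String row, String family) {
    cellsRead++;
    return STORE.get(row).get(family);
  }

  static List<Map<String, String>> scan(String essentialFamily, String wanted) {
    List<Map<String, String>> results = new ArrayList<>();
    for (String row : STORE.keySet()) {
      // Phase 1: load only the family the filter checks.
      String flag = read(row, essentialFamily);
      if (!wanted.equals(flag)) continue; // rejected: big family never read
      // Phase 2: row accepted, now load the expensive family too.
      Map<String, String> full = new HashMap<>();
      full.put("flags", flag);
      full.put("snap", read(row, "snap"));
      results.add(full);
    }
    return results;
  }

  public static void main(String[] args) {
    for (int i = 0; i < 100; i++) {
      Map<String, String> fams = new HashMap<>();
      fams.put("flags", i % 10 == 0 ? "yes" : "no");
      fams.put("snap", "big-value-" + i);
      STORE.put("row" + i, fams);
    }
    List<Map<String, String>> hits = scan("flags", "yes");
    // 100 flag reads + 10 snap reads, versus 200 reads for a naive scan.
    System.out.println(hits.size() + " rows, " + cellsRead + " cells read");
  }
}
```

With a 10% selectivity on the small family, the expensive family is read for only 10 of the 100 rows, which is the source of the reported 30-50x speedup when the joined family dominates IO.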

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-7514) Fix the TestLogSplitOnMasterFailover.testWithDistributedLogSplittingAndErrors.

2013-01-07 Thread Manukranth Kolloju (JIRA)
Manukranth Kolloju created HBASE-7514:
-

 Summary: Fix the 
TestLogSplitOnMasterFailover.testWithDistributedLogSplittingAndErrors.
 Key: HBASE-7514
 URL: https://issues.apache.org/jira/browse/HBASE-7514
 Project: HBase
  Issue Type: Bug
Reporter: Manukranth Kolloju
Priority: Minor




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7329) remove flush-related records from WAL

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546440#comment-13546440
 ] 

Sergey Shelukhin commented on HBASE-7329:
-

Hi. Ping? :)

> remove flush-related records from WAL
> -
>
> Key: HBASE-7329
> URL: https://issues.apache.org/jira/browse/HBASE-7329
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7329-v0.patch, HBASE-7329-v0-tmp.patch, 
> HBASE-7329-v1.patch
>
>
> Comments from many people in HBASE-6466 and HBASE-6980 indicate that flush 
> records in WAL are not useful. If so, they should be removed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten w/stale information from an old server

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546441#comment-13546441
 ] 

Sergey Shelukhin commented on HBASE-7268:
-

[~stack] Any opinion on the last comment? Thanks.

> correct local region location cache information can be overwritten w/stale 
> information from an old server
> -
>
> Key: HBASE-7268
> URL: https://issues.apache.org/jira/browse/HBASE-7268
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
> HBASE-7268-v1.patch, HBASE-7268-v2.patch, HBASE-7268-v2-plus-masterTs.patch, 
> HBASE-7268-v2-plus-masterTs.patch
>
>
> Discovered via HBASE-7250; related to HBASE-5877.
> The test writes from multiple threads.
> Server A has region R; the client knows that.
> R gets moved from A to server B.
> B gets killed.
> R gets moved by the master to server C.
> ~15 seconds later, the client tries to write to it (on A?).
> Multiple client threads report, from the RegionMoved exception processing 
> logic, "R moved from C to B", even though such a transition never happened 
> (neither in nor before the sequence described above). Not quite sure how the 
> client learned of the transition to C; I assume it's from meta via some other 
> thread...
> Then the put fails (it may fail due to accumulated errors that are not logged, 
> which I am investigating... but the bogus cache update is there 
> notwithstanding).
> I have a patch but am not sure if it works; the test still fails locally for a 
> yet unknown reason.
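One way to prevent such bogus overwrites, sketched below, is to attach a monotonically increasing sequence number to each cached location and ignore updates carrying an older one. This illustrates the idea under discussion with invented names; it is not the committed fix:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a staleness guard for a region location cache: an update is
// applied only if its sequence number is at least as new as the cached one,
// so a late-arriving report derived from an old server cannot overwrite
// fresher information. All names are invented for illustration.
public class RegionLocationCache {
  private static final class Entry {
    final String server;
    final long seqNum;
    Entry(String server, long seqNum) { this.server = server; this.seqNum = seqNum; }
  }

  private final Map<String, Entry> cache = new HashMap<>();

  /** Returns true if the update was applied, false if it was stale. */
  public synchronized boolean update(String region, String server, long seqNum) {
    Entry cur = cache.get(region);
    if (cur != null && cur.seqNum > seqNum) {
      return false; // stale update from an old server: keep the current entry
    }
    cache.put(region, new Entry(server, seqNum));
    return true;
  }

  public synchronized String locate(String region) {
    Entry e = cache.get(region);
    return e == null ? null : e.server;
  }
}
```

In the scenario above, the late "R moved to B" report would carry an older sequence number than the cached "R is on C" entry and would be dropped instead of poisoning the cache.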

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (HBASE-7427) Check line lenghts in the test-patch script

2013-01-07 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar reopened HBASE-7427:
--


Reopening, since this breaks on Mac OS. 

> Check line lenghts in the test-patch script
> ---
>
> Key: HBASE-7427
> URL: https://issues.apache.org/jira/browse/HBASE-7427
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.96.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.96.0
>
> Attachments: hbase-7427_v1.patch
>
>
> Checkstyle is disabled in test-patch, and it is not very easy to make it 
> work. We can just add a check for line lengths in the meantime. 



[jira] [Commented] (HBASE-7476) HBase shell count command doesn't escape binary output

2013-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546449#comment-13546449
 ] 

Hudson commented on HBASE-7476:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #334 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/334/])
HBASE-7476 HBase shell count command doesn't escape binary output (Revision 
1430003)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/hbase-server/src/main/ruby/hbase/table.rb


> HBase shell count command doesn't escape binary output
> --
>
> Key: HBASE-7476
> URL: https://issues.apache.org/jira/browse/HBASE-7476
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Reporter: Gabriel Reid
>Priority: Minor
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7476_1.patch, HBASE-7476.patch
>
>
> When running the count command in the HBase shell, the row key is printed 
> each time a count interval is reached. However, the key is printed verbatim, 
> meaning that non-printable characters are directly printed to the terminal. 
> This can cause confusing results, or even leave the terminal in an unusable 
> state.



[jira] [Commented] (HBASE-7427) Check line lenghts in the test-patch script

2013-01-07 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546450#comment-13546450
 ] 

Enis Soztutar commented on HBASE-7427:
--

We can alternatively just disable line checking if wc -L is not available.

> Check line lenghts in the test-patch script
> ---
>
> Key: HBASE-7427
> URL: https://issues.apache.org/jira/browse/HBASE-7427
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.96.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.96.0
>
> Attachments: hbase-7427_v1.patch
>
>
> Checkstyle is disabled in test-patch, and it is not very easy to make it 
> work. We can just add a check for line lengths in the meantime. 



[jira] [Commented] (HBASE-7501) Introduce MetaEditor method that both adds and deletes rows in .META. table

2013-01-07 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546452#comment-13546452
 ] 

Enis Soztutar commented on HBASE-7501:
--

bq. The Delete and Put would be grouped and then written to .META. table in one 
transaction.
To clarify, it is not one transaction, but one batch RPC right? 
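The method being proposed — a single MetaEditor entry point taking a list of Mutations, shipped as one batch (per-row atomicity only, not a cross-row transaction, per the clarification above) — might be sketched like this. Mutation/Put/Delete and the MetaTable interface below are tiny illustrative stand-ins, not the real HBase client classes.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for the HBase client Mutation hierarchy.
abstract class Mutation { final String row; Mutation(String row) { this.row = row; } }
final class Put extends Mutation { Put(String row) { super(row); } }
final class Delete extends Mutation { Delete(String row) { super(row); } }

final class MetaEditorSketch {
    // One batch call to .META.; grouped in a single RPC, atomic per row only.
    interface MetaTable { void batch(List<Mutation> mutations); }

    /** The proposed helper: apply adds and removes to .META. together. */
    static void mutateMetaTable(MetaTable meta, List<Mutation> mutations) {
        meta.batch(mutations);
    }

    /** Example grouping: delete the old region row, put the new one. */
    static List<Mutation> regionReplacement(String oldRegionRow, String newRegionRow) {
        List<Mutation> ms = new ArrayList<>();
        ms.add(new Delete(oldRegionRow));
        ms.add(new Put(newRegionRow));
        return ms;
    }
}
```

A caller such as RestoreSnapshotHandler would then build one list of Deletes and Puts and make a single call instead of two separate MetaEditor round trips.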

> Introduce MetaEditor method that both adds and deletes rows in .META. table
> ---
>
> Key: HBASE-7501
> URL: https://issues.apache.org/jira/browse/HBASE-7501
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>
> In review of HBASE-7365, MetaEditor.deleteRegions() and 
> MetaEditor.addRegionsToMeta() are used in 
> RestoreSnapshotHandler.java.handleTableOperation() to apply changes to .META.
> I made following suggestion:
> Can we introduce new method in MetaEditor which takes List of Mutation's ?
> The Delete and Put would be grouped and then written to .META. table in one 
> transaction.
> Jon responded:
> I like that idea -- then the todo/warning or follow on could refer to that 
> method.  When we fix it, it could get used in other multi row meta 
> modifications like splits and table creation/deletion in general.
> See https://reviews.apache.org/r/8674/



[jira] [Commented] (HBASE-7441) Make ClusterManager in IntegrationTestingUtility pluggable

2013-01-07 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546451#comment-13546451
 ] 

Enis Soztutar commented on HBASE-7441:
--

Looks good. ClusterManager was intended to be pluggable from the start. One 
nit: can you also define the default class as a constant like 
DEFAULT_HBASE_CLUSTER_MANAGER_CLASS, and address Stack's comment. 

> Make ClusterManager in IntegrationTestingUtility pluggable
> --
>
> Key: HBASE-7441
> URL: https://issues.apache.org/jira/browse/HBASE-7441
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.3
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
>  Labels: newbie, patch
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7441-0.94-v1.patch, HBASE-7441-trunk-v1.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> After HBASE-7009, we can use ChaosMonkey to test the HBase cluster.
> ClusterManager uses passwordless ssh to stop/start the region servers or
> master. To support other cluster management tools, we need to make the
> ClusterManager in IntegrationTestingUtility pluggable.



[jira] [Created] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-07 Thread Jean-Daniel Cryans (JIRA)
Jean-Daniel Cryans created HBASE-7515:
-

 Summary: Store.loadStoreFiles should close opened files if there's 
an exception
 Key: HBASE-7515
 URL: https://issues.apache.org/jira/browse/HBASE-7515
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Daniel Cryans
 Fix For: 0.96.0


Related to HBASE-7513. If a RS is able to open a few store files in 
{{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
won't be closed and file descriptors will remain in a CLOSED_WAIT state.

The situation we encountered is that over the weekend one region was bounced 
between >100 region servers and eventually they all started dying on "Too many 
open files".



[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten w/stale information from an old server

2013-01-07 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546460#comment-13546460
 ] 

Enis Soztutar commented on HBASE-7268:
--

Instead of timestamps, which are not guaranteed to be monotonically 
increasing, can we make use of sequenceIds? On region open, we can record the 
seqId in META. On region move, the region is opened again with a greater seqId 
somewhere else, and the client can reason about cache invalidation. wdyt? 
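That suggestion could be sketched as below: a cache entry carries the seqId recorded when the region was opened, and an update is accepted only when it carries a newer seqId, so stale information from an old server cannot overwrite a fresher location. Class and field names are illustrative stand-ins, not the actual HBase client types.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// One cached location: where the region is, and the seqId from its open.
final class CachedLocation {
    final String serverName;
    final long openSeqId; // seqId recorded in META when the region was opened

    CachedLocation(String serverName, long openSeqId) {
        this.serverName = serverName;
        this.openSeqId = openSeqId;
    }
}

final class RegionLocationCache {
    private final Map<String, CachedLocation> cache = new ConcurrentHashMap<>();

    /** Accept an update only if it carries a newer open seqId than the cached one. */
    boolean maybeUpdate(String region, CachedLocation candidate) {
        CachedLocation current = cache.get(region);
        if (current != null && candidate.openSeqId <= current.openSeqId) {
            return false; // stale report from an old server; keep the cache
        }
        cache.put(region, candidate);
        return true;
    }

    CachedLocation get(String region) {
        return cache.get(region);
    }
}
```

In the failure scenario from the description, the bogus "R moved from C to B" report would carry B's older open seqId and be rejected.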

> correct local region location cache information can be overwritten w/stale 
> information from an old server
> -
>
> Key: HBASE-7268
> URL: https://issues.apache.org/jira/browse/HBASE-7268
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
> HBASE-7268-v1.patch, HBASE-7268-v2.patch, HBASE-7268-v2-plus-masterTs.patch, 
> HBASE-7268-v2-plus-masterTs.patch
>
>
> Discovered via HBASE-7250; related to HBASE-5877.
> Test is writing from multiple threads.
> Server A has region R; client knows that.
> R gets moved from A to server B.
> B gets killed.
> R gets moved by master to server C.
> ~15 seconds later, client tries to write to it (on A?).
> Multiple client threads report from RegionMoved exception processing logic "R 
> moved from C to B", even though such transition never happened (neither in 
> nor before the sequence described below). Not quite sure how the client 
> learned of the transition to C, I assume it's from meta from some other 
> thread...
> Then, put fails (it may fail due to accumulated errors that are not logged, 
> which I am investigating... but the bogus cache update is there 
> notwithstanding).
> I have a patch but not sure if it works, test still fails locally for yet 
> unknown reason.



[jira] [Updated] (HBASE-7224) Remove references to Writable in the ipc package

2013-01-07 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-7224:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Marking as resolved.

> Remove references to Writable in the ipc package
> 
>
> Key: HBASE-7224
> URL: https://issues.apache.org/jira/browse/HBASE-7224
> Project: HBase
>  Issue Type: Sub-task
>  Components: IPC/RPC, Protobufs
>Reporter: Devaraj Das
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7224.txt, 7224v2.txt, 7224v3.txt, 7224v4.txt, 
> 7224v4.txt, 7224v4.txt, 7224v5.txt, purge_more_writables.txt
>
>
> I see references to Writable in the ipc package, most notably in the 
> Invocation class. This class is not being used that much in the core ipc 
> package but used in the coprocessor protocol implementations (there are some 
> coprocessor protocols that are Writable based still). This jira is to track 
> removing those references and the Invocation class (once HBASE-6895 is 
> resolved).



[jira] [Commented] (HBASE-7427) Check line lenghts in the test-patch script

2013-01-07 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546463#comment-13546463
 ] 

Jonathan Hsieh commented on HBASE-7427:
---

nit: [~enis], since it was initially closed two weeks ago, mind closing this 
and creating a new jira for mac/bsd?  If the new fix creates a new patch it'll 
be confusing to see two separated by that amount of time!

> Check line lenghts in the test-patch script
> ---
>
> Key: HBASE-7427
> URL: https://issues.apache.org/jira/browse/HBASE-7427
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.96.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.96.0
>
> Attachments: hbase-7427_v1.patch
>
>
> Checkstyle is disabled in test-patch, and it is not very easy to make it 
> work. We can just add a check for line lengths in the meantime. 



[jira] [Updated] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-07 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-7513:
-

Attachment: HBASE-7513-0.patch

Pretty easy patch.

Since the hdfs block locality stuff isn't really used for anything other than 
metrics, just ignoring the case where no hosts are passed in has no adverse 
effects.
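The behavior described — ignoring the no-hosts case instead of throwing an NPE — might look roughly like this. The class below is a simplified stand-in for HDFSBlocksDistribution that keeps only a host-to-weight map; the real class tracks more state.

```java
import java.util.HashMap;
import java.util.Map;

final class HDFSBlocksDistributionSketch {
    private final Map<String, Long> hostWeights = new HashMap<>();

    /** Skip blocks with no host info instead of throwing NullPointerException. */
    void addHostsAndBlockWeight(String[] hosts, long weight) {
        if (hosts == null || hosts.length == 0) {
            return; // locality is metrics-only, so silently skipping is safe
        }
        for (String host : hosts) {
            hostWeights.merge(host, weight, Long::sum); // accumulate per-host weight
        }
    }

    long weight(String host) {
        return hostWeights.getOrDefault(host, 0L);
    }
}
```

With this shape, a corrupted file simply contributes nothing to the locality metric, and the region open proceeds to fail later with a clearer error if the file really is unreadable.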

> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7513-0.patch
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>   at org.apache.hadoop.hbase.regionserver.Store.(Store.java:256)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>   at 
> org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>   ... 8 more
> 2013-01-07 09:58:59,059 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can exist but we send an NPE anyways? And then this is wrapped 
> in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big ugly exceptions. In this case the region would fail opening 
> later anyways but at least the error message will be clearer.



[jira] [Updated] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-07 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-7513:
-

Affects Version/s: 0.94.4
                   0.96.0
           Status: Patch Available  (was: Open)

> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0, 0.94.4
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7513-0.patch
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>   at org.apache.hadoop.hbase.regionserver.Store.(Store.java:256)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>   at 
> org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>   ... 8 more
> 2013-01-07 09:58:59,059 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can exist but we send an NPE anyways? And then this is wrapped 
> in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big ugly exceptions. In this case the region would fail opening 
> later anyways but at least the error message will be clearer.



[jira] [Updated] (HBASE-7236) add per-table/per-cf configuration via metadata

2013-01-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7236:


Attachment: HBASE-7236-v6.patch

Rebased the patch, addressed recent /r/ comments.

> add per-table/per-cf configuration via metadata
> ---
>
> Key: HBASE-7236
> URL: https://issues.apache.org/jira/browse/HBASE-7236
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch, 
> HBASE-7236-PROTOTYPE-v1.patch, HBASE-7236-v0.patch, HBASE-7236-v1.patch, 
> HBASE-7236-v2.patch, HBASE-7236-v3.patch, HBASE-7236-v4.patch, 
> HBASE-7236-v5.patch, HBASE-7236-v6.patch
>
>
> Regardless of the compaction policy, it makes sense to have separate
> compaction configuration for different tables and column families, as their
> access patterns and workloads can be different. In particular, for the
> tiered compactions being ported from the 0.89-fb branch, such configuration
> is necessary to use them properly.
> We might want to add support for compaction configuration via metadata on 
> table/cf.



[jira] [Updated] (HBASE-7414) Convert some HFile metadata to PB

2013-01-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-7414:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks for the reviews Ted and Stack.

> Convert some HFile metadata to PB
> -
>
> Key: HBASE-7414
> URL: https://issues.apache.org/jira/browse/HBASE-7414
> Project: HBase
>  Issue Type: Task
>  Components: HFile
>Reporter: stack
>Assignee: Andrew Purtell
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7414.patch, 7414.patch, 7414.patch, 7414.patch
>
>
> See HBASE-7201
> Conversion should be done in a manner that does not prevent us from reading 
> old-style hfiles with Writable metadata.



[jira] [Updated] (HBASE-6466) Enable multi-thread for memstore flush

2013-01-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-6466:


Attachment: HBASE-6466-v4.patch

Rebasing the patch

> Enable multi-thread for memstore flush
> --
>
> Key: HBASE-6466
> URL: https://issues.apache.org/jira/browse/HBASE-6466
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.96.0
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: HBASE-6466.patch, HBASE-6466v2.patch, 
> HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch, 
> HBASE-6466-v4.patch
>
>
> If the KV is large, or the HLog is closed under heavy write pressure, we 
> found the memstore is often above the high water mark and blocks the puts.
> So should we enable multi-threaded memstore flush?
> Some performance test data for reference,
> 1.test environment : 
> random writting;upper memstore limit 5.6GB;lower memstore limit 4.8GB;400 
> regions per regionserver;row len=50 bytes, value len=1024 bytes;5 
> regionserver, 300 ipc handler per regionserver;5 client, 50 thread handler 
> per client for writing
> 2.test results:
> one cacheFlush handler, tps: 7.8k/s per regionserver, Flush:10.1MB/s per 
> regionserver, appears many aboveGlobalMemstoreLimit blocking
> two cacheFlush handlers, tps: 10.7k/s per regionserver, Flush:12.46MB/s per 
> regionserver,
> 200 thread handler per client & two cacheFlush handlers, tps:16.1k/s per 
> regionserver, Flush:18.6MB/s per regionserver



[jira] [Updated] (HBASE-7383) create integration test for HBASE-5416 (improving scan performance for certain filters)

2013-01-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7383:


Attachment: HBASE-7383-v1.patch

rebasing the patch

> create integration test for HBASE-5416 (improving scan performance for 
> certain filters)
> ---
>
> Key: HBASE-7383
> URL: https://issues.apache.org/jira/browse/HBASE-7383
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7383-v0.patch, HBASE-7383-v1.patch, 
> HBASE-7383-v1.patch
>
>
> HBASE-5416 is risky and needs an integration test.



[jira] [Updated] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7515:
--

Attachment: 7515.txt

First attempt at solving this issue.

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
> Fix For: 0.96.0
>
> Attachments: 7515.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".



[jira] [Assigned] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-7515:
-

Assignee: Ted Yu

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0
>
> Attachments: 7515.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".



[jira] [Updated] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7515:
--

Status: Patch Available  (was: Open)

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0
>
> Attachments: 7515.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".



[jira] [Created] (HBASE-7516) Make compaction policy pluggable and configurable per table

2013-01-07 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HBASE-7516:
--

 Summary: Make compaction policy pluggable and configurable per 
table
 Key: HBASE-7516
 URL: https://issues.apache.org/jira/browse/HBASE-7516
 Project: HBase
  Issue Type: Improvement
Reporter: Jimmy Xiang






[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)

2013-01-07 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546489#comment-13546489
 ] 

Jimmy Xiang commented on HBASE-7055:


I filed HBASE-7516 to make compaction policy pluggable and configurable per 
table.  Can we resolve that one first?

> port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice 
> (not configurable by cf or dynamically)
> -
>
> Key: HBASE-7055
> URL: https://issues.apache.org/jira/browse/HBASE-7055
> Project: HBase
>  Issue Type: Task
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
> HBASE-6371-v3-refactor-only-squashed.patch, 
> HBASE-6371-v4-refactor-only-squashed.patch, 
> HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, 
> HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, 
> HBASE-7055-v4.patch, HBASE-7055-v5.patch
>
>
> There's divergence in the code :(
> See HBASE-6371 for details.



[jira] [Commented] (HBASE-7516) Make compaction policy pluggable and configurable per table

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546492#comment-13546492
 ] 

Sergey Shelukhin commented on HBASE-7516:
-

See HBASE-7055, HBASE-7236

> Make compaction policy pluggable and configurable per table
> ---
>
> Key: HBASE-7516
> URL: https://issues.apache.org/jira/browse/HBASE-7516
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>




[jira] [Commented] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-07 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546494#comment-13546494
 ] 

Elliott Clark commented on HBASE-7515:
--

I don't think that will fix the issue.  If the first store file throws the 
error, the rest still need to be pulled from the futures and closed.
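
The cleanup pattern described in this comment can be sketched as follows. This is a hedged illustration of the general "drain all futures, then close what opened" idea, not the actual HBASE-7515 patch; the class and method names are hypothetical:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

public class OpenAllOrCloseAll {
  // Submit all openers, then pull every future, even after the first
  // failure, so resources that did open can be closed instead of leaking
  // file descriptors.
  public static <T extends Closeable> List<T> openAll(
      List<Callable<T>> openers, ExecutorService pool) throws IOException {
    List<Future<T>> futures = new ArrayList<>();
    for (Callable<T> opener : openers) {
      futures.add(pool.submit(opener));
    }
    List<T> opened = new ArrayList<>();
    IOException failure = null;
    for (Future<T> f : futures) {
      try {
        opened.add(f.get());
      } catch (InterruptedException | ExecutionException e) {
        if (failure == null) {
          Throwable cause = e.getCause() != null ? e.getCause() : e;
          failure = new IOException(cause);
        }
      }
    }
    if (failure != null) {
      // Close everything that opened successfully before rethrowing.
      for (T resource : opened) {
        try {
          resource.close();
        } catch (IOException ignored) {
          // best effort during cleanup
        }
      }
      throw failure;
    }
    return opened;
  }
}
```

The key point, matching the comment, is that the loop keeps draining the remaining futures after the first one fails; returning early would strand already-opened descriptors.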

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0
>
> Attachments: 7515.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".



[jira] [Commented] (HBASE-6031) RegionServer does not go down while aborting

2013-01-07 Thread liang xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546495#comment-13546495
 ] 

liang xie commented on HBASE-6031:
--

We can close this issue now, since HADOOP-9181 has been resolved.

> RegionServer does not go down while aborting
> 
>
> Key: HBASE-6031
> URL: https://issues.apache.org/jira/browse/HBASE-6031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: ramkrishna.s.vasudevan
> Fix For: 0.96.0
>
> Attachments: rs_shutdown_hung20130107.jstack, rsthread.txt
>
>
> Following is the thread dump.
> {code}
> "1997531088@qtp-716941846-5" prio=10 tid=0x7f7c5820c800 nid=0xe1b in 
> Object.wait() [0x7f7c56ae8000]
>java.lang.Thread.State: TIMED_WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at 
> org.mortbay.io.nio.SelectChannelEndPoint.blockWritable(SelectChannelEndPoint.java:279)
>   - locked <0x7f7cfe0616d0> (a 
> org.mortbay.jetty.nio.SelectChannelConnector$ConnectorEndPoint)
>   at 
> org.mortbay.jetty.AbstractGenerator$Output.blockForOutput(AbstractGenerator.java:545)
>   at 
> org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:639)
>   at 
> org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:580)
>   at java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:109)
>   - locked <0x7f7cfe74d758> (a 
> org.mortbay.util.ByteArrayOutputStream2)
>   at 
> org.mortbay.jetty.AbstractGenerator$OutputWriter.write(AbstractGenerator.java:904)
>   at java.io.Writer.write(Writer.java:96)
>   - locked <0x7f7cfca02fc0> (a 
> org.mortbay.jetty.HttpConnection$OutputWriter)
>   at java.io.PrintWriter.write(PrintWriter.java:361)
>   - locked <0x7f7cfca02fc0> (a 
> org.mortbay.jetty.HttpConnection$OutputWriter)
>   at org.jamon.escaping.HtmlEscaping.write(HtmlEscaping.java:43)
>   at 
> org.jamon.escaping.AbstractCharacterEscaping.write(AbstractCharacterEscaping.java:35)
>   at 
> org.apache.hadoop.hbase.tmpl.regionserver.RSStatusTmplImpl.renderNoFlush(RSStatusTmplImpl.java:222)
>   at 
> org.apache.hadoop.hbase.tmpl.regionserver.RSStatusTmpl.renderNoFlush(RSStatusTmpl.java:180)
>   at 
> org.apache.hadoop.hbase.tmpl.regionserver.RSStatusTmpl.render(RSStatusTmpl.java:171)
>   at 
> org.apache.hadoop.hbase.regionserver.RSStatusServlet.doGet(RSStatusServlet.java:48)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>   at 
> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>   at 
> org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at 
> org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:932)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>   at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>   at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>   at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>   at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>   at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>   at org.mortbay.jetty.Server.handle(Server.java:326)
>   at 
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>   at 
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
>   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
>   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>   at 
> org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
>   at 
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> "1374615312@qtp-716941846-3" prio=10 tid=0x7f7c58214800 nid=0xc42 in 
> Object.wait() [0x7f7c55bd9000]
>java.lang.Thread.State: TIMED_WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at 
> org.mortbay.io.nio.SelectChannelEndPoint.blockWritable(SelectChannel

[jira] [Commented] (HBASE-7516) Make compaction policy pluggable and configurable per table

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546503#comment-13546503
 ] 

Sergey Shelukhin commented on HBASE-7516:
-

Oh, I see. Pluggable can be done by extracting part of the HBASE-7055 patch; 
configurable per table would need HBASE-7236.
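
The "pluggable" half of this can be sketched as loading a policy implementation named in configuration. This is an illustrative sketch only; the interface, class, and configuration key names here are hypothetical and are not HBase's actual API:

```java
import java.util.List;
import java.util.Map;

interface CompactionPolicy {
  List<String> selectFilesToCompact(List<String> storeFiles);
}

class DefaultCompactionPolicy implements CompactionPolicy {
  @Override
  public List<String> selectFilesToCompact(List<String> storeFiles) {
    return storeFiles; // trivial default: select everything
  }
}

public class PolicyLoader {
  // In HBase this would read from a Configuration object; a plain map
  // stands in for it here. Unknown keys fall back to the default policy.
  public static CompactionPolicy load(Map<String, String> conf)
      throws ReflectiveOperationException {
    String cls = conf.getOrDefault("hstore.compaction.policy.class",
        DefaultCompactionPolicy.class.getName());
    return (CompactionPolicy) Class.forName(cls)
        .getDeclaredConstructor().newInstance();
  }
}
```

The design choice is the usual one for pluggability: callers depend only on the interface, and the concrete class is chosen by reflection at startup, so new policies need no changes to the calling code.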

> Make compaction policy pluggable and configurable per table
> ---
>
> Key: HBASE-7516
> URL: https://issues.apache.org/jira/browse/HBASE-7516
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>




[jira] [Updated] (HBASE-7329) remove flush-related records from WAL

2013-01-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7329:


Attachment: HBASE-7329-v0.patch

> remove flush-related records from WAL
> -
>
> Key: HBASE-7329
> URL: https://issues.apache.org/jira/browse/HBASE-7329
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7329-v0.patch, HBASE-7329-v0.patch, 
> HBASE-7329-v0-tmp.patch, HBASE-7329-v1.patch
>
>
> Comments from many people in HBASE-6466 and HBASE-6980 indicate that flush 
> records in WAL are not useful. If so, they should be removed.



[jira] [Updated] (HBASE-7329) remove flush-related records from WAL

2013-01-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7329:


Attachment: HBASE-7329-v1.patch

> remove flush-related records from WAL
> -
>
> Key: HBASE-7329
> URL: https://issues.apache.org/jira/browse/HBASE-7329
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7329-v0.patch, HBASE-7329-v0.patch, 
> HBASE-7329-v0-tmp.patch, HBASE-7329-v1.patch, HBASE-7329-v1.patch
>
>
> Comments from many people in HBASE-6466 and HBASE-6980 indicate that flush 
> records in WAL are not useful. If so, they should be removed.



[jira] [Commented] (HBASE-7329) remove flush-related records from WAL

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546508#comment-13546508
 ] 

Sergey Shelukhin commented on HBASE-7329:
-

Rebased the wrong patch first, then attached the correct patch (v1).

> remove flush-related records from WAL
> -
>
> Key: HBASE-7329
> URL: https://issues.apache.org/jira/browse/HBASE-7329
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7329-v0.patch, HBASE-7329-v0.patch, 
> HBASE-7329-v0-tmp.patch, HBASE-7329-v1.patch, HBASE-7329-v1.patch
>
>
> Comments from many people in HBASE-6466 and HBASE-6980 indicate that flush 
> records in WAL are not useful. If so, they should be removed.



[jira] [Commented] (HBASE-7427) Check line lengths in the test-patch script

2013-01-07 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546511#comment-13546511
 ] 

Enis Soztutar commented on HBASE-7427:
--

Sure, np. If it comes to pass that we need a new patch, I'll open another one. 

> Check line lengths in the test-patch script
> ---
>
> Key: HBASE-7427
> URL: https://issues.apache.org/jira/browse/HBASE-7427
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.96.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.96.0
>
> Attachments: hbase-7427_v1.patch
>
>
> Checkstyle is disabled in test-patch, and it is not very easy to make it 
> work. We can just add a check for line lengths in the meantime. 



[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)

2013-01-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546512#comment-13546512
 ] 

Sergey Shelukhin commented on HBASE-7055:
-

Configurable per table is handled separately in HBASE-7236. With regard to 
pluggable, I can try to extract a patch, but it would cover only a minority of 
the functionality in this patch. Is there a particular reason to extract it?

> port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice 
> (not configurable by cf or dynamically)
> -
>
> Key: HBASE-7055
> URL: https://issues.apache.org/jira/browse/HBASE-7055
> Project: HBase
>  Issue Type: Task
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
> HBASE-6371-v3-refactor-only-squashed.patch, 
> HBASE-6371-v4-refactor-only-squashed.patch, 
> HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, 
> HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, 
> HBASE-7055-v4.patch, HBASE-7055-v5.patch
>
>
> There's divergence in the code :(
> See HBASE-6371 for details.



[jira] [Created] (HBASE-7517) Preemptive fast fail exception should not be processed multiple times in the stack

2013-01-07 Thread Amitanand Aiyer (JIRA)
Amitanand Aiyer created HBASE-7517:
--

 Summary: Preemptive fast fail exception should not be processed 
multiple times in the stack
 Key: HBASE-7517
 URL: https://issues.apache.org/jira/browse/HBASE-7517
 Project: HBase
  Issue Type: Bug
Reporter: Amitanand Aiyer
Priority: Minor






[jira] [Commented] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546528#comment-13546528
 ] 

Hadoop QA commented on HBASE-7513:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563656/HBASE-7513-0.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor

 {color:red}-1 core zombie tests{color}.  There are 6 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3918//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3918//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3918//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3918//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3918//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3918//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3918//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3918//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3918//console

This message is automatically generated.

> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0, 0.94.4
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7513-0.patch
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.ja

[jira] [Commented] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-07 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546529#comment-13546529
 ] 

Jean-Daniel Cryans commented on HBASE-7513:
---

Not a fan of the copy-paste, maybe extract into the class javadoc?

> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0, 0.94.4
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7513-0.patch
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>   at org.apache.hadoop.hbase.regionserver.Store.(Store.java:256)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>   at 
> org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>   ... 8 more
> 2013-01-07 09:58:59,059 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can exist but we send an NPE anyways? And then this is wrapped 
> in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big ugly exceptions. In this case the region would fail to open 
> later anyway, but at least the error message will be clearer.
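
The suggested fix, skipping missing host information instead of throwing, can be sketched like this. This is a hedged illustration of the idea, not the actual HBase code; the class name and method shape are simplified for the example:

```java
import java.util.HashMap;
import java.util.Map;

public class BlocksDistributionSketch {
  private final Map<String, Long> hostWeights = new HashMap<>();

  // Locality information is an optimization: if host data is missing
  // (e.g. for a corrupted file), silently skip it rather than throwing
  // a NullPointerException that aborts the region open.
  public void addHostsAndBlockWeight(String[] hosts, long weight) {
    if (hosts == null || hosts.length == 0) {
      return;
    }
    for (String host : hosts) {
      hostWeights.merge(host, weight, Long::sum);
    }
  }

  public long getWeight(String host) {
    return hostWeights.getOrDefault(host, 0L);
  }
}
```

With this shape, a block with no host info simply contributes nothing to the locality calculation, and any real failure surfaces later with its own, clearer error.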



[jira] [Commented] (HBASE-7236) add per-table/per-cf configuration via metadata

2013-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546530#comment-13546530
 ] 

Hadoop QA commented on HBASE-7236:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563659/HBASE-7236-v6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 28 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 3 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestSerialization

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3919//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3919//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3919//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3919//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3919//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3919//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3919//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3919//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3919//console

This message is automatically generated.

> add per-table/per-cf configuration via metadata
> ---
>
> Key: HBASE-7236
> URL: https://issues.apache.org/jira/browse/HBASE-7236
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch, 
> HBASE-7236-PROTOTYPE-v1.patch, HBASE-7236-v0.patch, HBASE-7236-v1.patch, 
> HBASE-7236-v2.patch, HBASE-7236-v3.patch, HBASE-7236-v4.patch, 
> HBASE-7236-v5.patch, HBASE-7236-v6.patch
>
>
> Regardless of the compaction policy, it makes sense to have separate 
> compaction configuration for different tables and column families, as 
> their access patterns and workloads can be different. In particular, the 
> tiered compactions being ported from the 0.89-fb branch need such 
> configuration to be used properly.
> We might want to add support for compaction configuration via metadata on 
> table/cf.
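
The per-table/per-CF idea described above amounts to layered configuration lookup, with the most specific scope winning. The following is a minimal sketch of that resolution order under assumed names; it is not HBase's actual configuration API, and the key shown is only an example:

```java
import java.util.Map;

public class LayeredConf {
  // Resolve a setting with the most specific scope winning:
  // column family metadata > table metadata > global configuration.
  public static String get(String key, Map<String, String> global,
      Map<String, String> table, Map<String, String> cf) {
    if (cf.containsKey(key)) {
      return cf.get(key);
    }
    if (table.containsKey(key)) {
      return table.get(key);
    }
    return global.get(key);
  }
}
```

A real implementation would source the table and CF layers from table/CF metadata, but the precedence logic is the heart of the feature.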



[jira] [Commented] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE

2013-01-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546540#comment-13546540
 ] 

Ted Yu commented on HBASE-7404:
---

+1 from me.

> Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
> --
>
> Key: HBASE-7404
> URL: https://issues.apache.org/jira/browse/HBASE-7404
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7404-trunk-v10.patch, 7404-trunk-v11.patch, 
> 7404-trunk-v12.patch, 7404-trunk-v13.patch, 7404-trunk-v13.txt, 
> 7404-trunk-v14.patch, BucketCache.pdf, hbase-7404-94v2.patch, 
> hbase-7404-trunkv2.patch, hbase-7404-trunkv9.patch, Introduction of Bucket 
> Cache.pdf
>
>
> First, thanks to @neil from Fusion-IO for sharing the source code.
> What's Bucket Cache? 
> It can greatly decrease CMS pauses and heap fragmentation caused by GC, and 
> it supports a large cache space for high read performance by using a 
> high-speed disk like Fusion-io.
> 1. An implementation of block cache similar to LruBlockCache
> 2. Manages blocks' storage positions itself through the Bucket Allocator
> 3. The cached blocks can be stored in memory or in the file system
> 4. BucketCache can be used as the main block cache (see CombinedBlockCache), 
> combined with LruBlockCache, to decrease CMS and fragmentation caused by GC
> 5. BucketCache can also be used as a secondary cache (e.g. using Fusion-io to 
> store blocks) to enlarge the cache space
> How about SlabCache?
> We studied and tested SlabCache first, but the results were bad, because:
> 1. SlabCache uses SingleSizeCache; its memory utilization is low because of 
> the variety of block sizes, especially when using DataBlockEncoding
> 2. SlabCache is used in DoubleBlockCache; a block is cached both in SlabCache 
> and LruBlockCache, and is put into LruBlockCache again on a SlabCache hit, so 
> CMS and heap fragmentation don't get any better
> 3. Direct (off-heap) performance is not as good as heap and may cause OOM, so 
> we recommend using the "heap" engine 
> See more in the attachment and in the patch
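
The "combined" arrangement described in point 4 can be sketched as simple routing by block type: small index/bloom blocks stay in a GC-managed on-heap cache, while the bulk of the data blocks go to the bucket cache. This is a hedged toy illustration of the routing idea only; the class, enum, and method names are hypothetical, and plain maps stand in for the real LRU and bucket-allocated stores:

```java
import java.util.HashMap;
import java.util.Map;

public class CombinedCacheSketch {
  public enum BlockType { DATA, INDEX, BLOOM }

  // In the real design these would be an on-heap LRU cache and a
  // bucket-allocated (memory- or file-backed) cache respectively.
  private final Map<String, byte[]> onHeap = new HashMap<>();
  private final Map<String, byte[]> bucketCache = new HashMap<>();

  // Route data blocks to the bucket cache and everything else on-heap,
  // so the bulk of cached bytes stays out of the GC-managed heap.
  public void cacheBlock(String key, byte[] block, BlockType type) {
    if (type == BlockType.DATA) {
      bucketCache.put(key, block);
    } else {
      onHeap.put(key, block);
    }
  }

  public byte[] getBlock(String key) {
    byte[] b = onHeap.get(key);
    return b != null ? b : bucketCache.get(key);
  }

  public int onHeapCount() { return onHeap.size(); }
  public int bucketCount() { return bucketCache.size(); }
}
```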



[jira] [Updated] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7515:
--

Attachment: 7515-v2.txt

How about this one?

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0
>
> Attachments: 7515.txt, 7515-v2.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".



[jira] [Updated] (HBASE-7236) add per-table/per-cf configuration via metadata

2013-01-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7236:


Attachment: HBASE-7236-v6.patch

broke a test during rebase

> add per-table/per-cf configuration via metadata
> ---
>
> Key: HBASE-7236
> URL: https://issues.apache.org/jira/browse/HBASE-7236
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch, 
> HBASE-7236-PROTOTYPE-v1.patch, HBASE-7236-v0.patch, HBASE-7236-v1.patch, 
> HBASE-7236-v2.patch, HBASE-7236-v3.patch, HBASE-7236-v4.patch, 
> HBASE-7236-v5.patch, HBASE-7236-v6.patch, HBASE-7236-v6.patch
>
>
> Regardless of the compaction policy, it makes sense to have separate 
> configuration for compactions for different tables and column families, as 
> their access patterns and workloads can be different. In particular, the 
> tiered compactions being ported from the 0.89-fb branch need this in order 
> to be used properly.
> We might want to add support for compaction configuration via metadata on 
> table/cf.
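The lookup order this implies is: column-family metadata overrides table metadata, which overrides the cluster-wide configuration. A toy sketch of that chain, with plain maps standing in for HColumnDescriptor/HTableDescriptor values (the actual HBase API differs):

```java
import java.util.Map;

// Toy model of scoped configuration lookup: CF metadata -> table metadata
// -> cluster config -> default. Not the actual HBase implementation.
class ScopedConf {
    private final Map<String, String> cfMeta;
    private final Map<String, String> tableMeta;
    private final Map<String, String> clusterConf;

    ScopedConf(Map<String, String> cfMeta, Map<String, String> tableMeta,
               Map<String, String> clusterConf) {
        this.cfMeta = cfMeta;
        this.tableMeta = tableMeta;
        this.clusterConf = clusterConf;
    }

    // Most specific scope wins; fall back outward, then to the default.
    String get(String key, String defaultValue) {
        if (cfMeta.containsKey(key)) return cfMeta.get(key);
        if (tableMeta.containsKey(key)) return tableMeta.get(key);
        return clusterConf.getOrDefault(key, defaultValue);
    }
}
```

With this shape, compaction code asks one object for a setting and automatically picks up per-CF or per-table overrides when present.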



[jira] [Commented] (HBASE-7517) Preemptive fast fail exception should not be processed multiple times in the stack

2013-01-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546547#comment-13546547
 ] 

Ted Yu commented on HBASE-7517:
---

A little more detail in JIRA description would help.

> Preemptive fast fail exception should not be processed multiple times in the 
> stack
> --
>
> Key: HBASE-7517
> URL: https://issues.apache.org/jira/browse/HBASE-7517
> Project: HBase
>  Issue Type: Bug
>Reporter: Amitanand Aiyer
>Priority: Minor
>




[jira] [Commented] (HBASE-7329) remove flush-related records from WAL

2013-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546557#comment-13546557
 ] 

Hadoop QA commented on HBASE-7329:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563668/HBASE-7329-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestHTableMultiplexer
  org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3922//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3922//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3922//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3922//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3922//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3922//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3922//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3922//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3922//console

This message is automatically generated.

> remove flush-related records from WAL
> -
>
> Key: HBASE-7329
> URL: https://issues.apache.org/jira/browse/HBASE-7329
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7329-v0.patch, HBASE-7329-v0.patch, 
> HBASE-7329-v0-tmp.patch, HBASE-7329-v1.patch, HBASE-7329-v1.patch
>
>
> Comments from many people in HBASE-6466 and HBASE-6980 indicate that flush 
> records in WAL are not useful. If so, they should be removed.



[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush

2013-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546561#comment-13546561
 ] 

Hadoop QA commented on HBASE-6466:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563661/HBASE-6466-v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.testBalancerWithRackLocality(TestBalancerWithNodeGroup.java:220)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3923//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3923//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3923//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3923//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3923//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3923//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3923//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3923//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3923//console


> Enable multi-thread for memstore flush
> --
>
> Key: HBASE-6466
> URL: https://issues.apache.org/jira/browse/HBASE-6466
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.96.0
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: HBASE-6466.patch, HBASE-6466v2.patch, 
> HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch, 
> HBASE-6466-v4.patch
>
>
> If KVs are large, or the HLog is closed under high put pressure, we found the 
> memstore is often above the high water mark and blocks the puts.
> So should we enable multiple threads for memstore flush?
> Some performance test data for reference:
> 1. Test environment: 
> random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400 
> regions per regionserver; row len=50 bytes, value len=1024 bytes; 5 
> regionservers, 300 IPC handlers per regionserver; 5 clients, 50 writer 
> threads per client
> 2. Test results:
> one cacheFlush handler: tps 7.8k/s per regionserver, flush 10.1MB/s per 
> regionserver, with many aboveGlobalMemstoreLimit blocking events
> two cacheFlush handlers: tps 10.7k/s per regionserver, flush 12.46MB/s per 
> regionserver
> 200 handler threads per client & two cacheFlush handlers: tps 16.1k/s per 
> regionserver, flush 18.6MB/s per regionserver
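The proposal above boils down to replacing the single flush thread with a small pool of flush handlers. A rough, illustrative sketch of that structure (FlushPool and its method names are made up for this example; HBase's real MemStoreFlusher is considerably more involved):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Illustrative only: several flush handlers draining flush requests in
// parallel, so one slow or large flush does not stall all other regions.
class FlushPool {
    private final ExecutorService handlers;

    FlushPool(int handlerCount) {
        this.handlers = Executors.newFixedThreadPool(handlerCount);
    }

    // Each request is "flush one region's memstore"; with N handlers,
    // up to N regions can flush concurrently.
    Future<?> requestFlush(Runnable flushOneRegion) {
        return handlers.submit(flushOneRegion);
    }

    void shutdown() throws InterruptedException {
        handlers.shutdown();
        handlers.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

The numbers quoted in the test results (one vs. two cacheFlush handlers) correspond to handlerCount here: more handlers drain the memstore faster, so writers spend less time blocked above the high water mark.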



[jira] [Commented] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546562#comment-13546562
 ] 

Hadoop QA commented on HBASE-7515:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563664/7515.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestLocalHBaseCluster
  
org.apache.hadoop.hbase.io.encoding.TestUpgradeFromHFileV1ToEncoding
  org.apache.hadoop.hbase.master.TestMasterFailover

 {color:red}-1 core zombie tests{color}.  There are 7 zombie test(s):   
at 
org.apache.hadoop.hbase.master.TestMasterFailover.testMasterFailoverWithMockedRITOnDeadRS(TestMasterFailover.java:833)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3920//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3920//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3920//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3920//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3920//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3920//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3920//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3920//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3920//console


> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0
>
> Attachments: 7515.txt, 7515-v2.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".



[jira] [Commented] (HBASE-7414) Convert some HFile metadata to PB

2013-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546567#comment-13546567
 ] 

Hudson commented on HBASE-7414:
---

Integrated in HBase-TRUNK #3708 (See 
[https://builds.apache.org/job/HBase-TRUNK/3708/])
HBASE-7414. Convert some HFile metadata to PB (Revision 1430106)

 Result = FAILURE
apurtell : 
Files : 
* 
/hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HFileProtos.java
* /hbase/trunk/hbase-protocol/src/main/protobuf/HFile.proto
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java


> Convert some HFile metadata to PB
> -
>
> Key: HBASE-7414
> URL: https://issues.apache.org/jira/browse/HBASE-7414
> Project: HBase
>  Issue Type: Task
>  Components: HFile
>Reporter: stack
>Assignee: Andrew Purtell
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 7414.patch, 7414.patch, 7414.patch, 7414.patch
>
>
> See HBASE-7201
> Conversion should be done in a manner that does not prevent us from reading 
> old-style hfiles with Writable metadata.



[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)

2013-01-07 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546584#comment-13546584
 ] 

Jimmy Xiang commented on HBASE-7055:


It will make this patch smaller and easier to review.

> port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice 
> (not configurable by cf or dynamically)
> -
>
> Key: HBASE-7055
> URL: https://issues.apache.org/jira/browse/HBASE-7055
> Project: HBase
>  Issue Type: Task
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
> HBASE-6371-v3-refactor-only-squashed.patch, 
> HBASE-6371-v4-refactor-only-squashed.patch, 
> HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, 
> HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, 
> HBASE-7055-v4.patch, HBASE-7055-v5.patch
>
>
> There's divergence in the code :(
> See HBASE-6371 for details.



[jira] [Commented] (HBASE-6031) RegionServer does not go down while aborting

2013-01-07 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546585#comment-13546585
 ] 

ramkrishna.s.vasudevan commented on HBASE-6031:
---

If you have tested this with the fix for HADOOP-9181 then we can resolve this.

> RegionServer does not go down while aborting
> 
>
> Key: HBASE-6031
> URL: https://issues.apache.org/jira/browse/HBASE-6031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: ramkrishna.s.vasudevan
> Fix For: 0.96.0
>
> Attachments: rs_shutdown_hung20130107.jstack, rsthread.txt
>
>
> Following is the thread dump.
> {code}
> "1997531088@qtp-716941846-5" prio=10 tid=0x7f7c5820c800 nid=0xe1b in 
> Object.wait() [0x7f7c56ae8000]
>java.lang.Thread.State: TIMED_WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at 
> org.mortbay.io.nio.SelectChannelEndPoint.blockWritable(SelectChannelEndPoint.java:279)
>   - locked <0x7f7cfe0616d0> (a 
> org.mortbay.jetty.nio.SelectChannelConnector$ConnectorEndPoint)
>   at 
> org.mortbay.jetty.AbstractGenerator$Output.blockForOutput(AbstractGenerator.java:545)
>   at 
> org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:639)
>   at 
> org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:580)
>   at java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:109)
>   - locked <0x7f7cfe74d758> (a 
> org.mortbay.util.ByteArrayOutputStream2)
>   at 
> org.mortbay.jetty.AbstractGenerator$OutputWriter.write(AbstractGenerator.java:904)
>   at java.io.Writer.write(Writer.java:96)
>   - locked <0x7f7cfca02fc0> (a 
> org.mortbay.jetty.HttpConnection$OutputWriter)
>   at java.io.PrintWriter.write(PrintWriter.java:361)
>   - locked <0x7f7cfca02fc0> (a 
> org.mortbay.jetty.HttpConnection$OutputWriter)
>   at org.jamon.escaping.HtmlEscaping.write(HtmlEscaping.java:43)
>   at 
> org.jamon.escaping.AbstractCharacterEscaping.write(AbstractCharacterEscaping.java:35)
>   at 
> org.apache.hadoop.hbase.tmpl.regionserver.RSStatusTmplImpl.renderNoFlush(RSStatusTmplImpl.java:222)
>   at 
> org.apache.hadoop.hbase.tmpl.regionserver.RSStatusTmpl.renderNoFlush(RSStatusTmpl.java:180)
>   at 
> org.apache.hadoop.hbase.tmpl.regionserver.RSStatusTmpl.render(RSStatusTmpl.java:171)
>   at 
> org.apache.hadoop.hbase.regionserver.RSStatusServlet.doGet(RSStatusServlet.java:48)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>   at 
> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>   at 
> org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at 
> org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:932)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>   at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>   at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>   at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>   at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>   at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>   at org.mortbay.jetty.Server.handle(Server.java:326)
>   at 
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>   at 
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
>   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
>   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>   at 
> org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
>   at 
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> "1374615312@qtp-716941846-3" prio=10 tid=0x7f7c58214800 nid=0xc42 in 
> Object.wait() [0x7f7c55bd9000]
>java.lang.Thread.State: TIMED_WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at 
> org.mortbay.io.nio.SelectChan

[jira] [Updated] (HBASE-7419) revisit hfilelink file name format.

2013-01-07 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7419:
---

Attachment: HBASE-7419-v3.patch

> revisit hfilelink file name format.
> ---
>
> Key: HBASE-7419
> URL: https://issues.apache.org/jira/browse/HBASE-7419
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client, master, regionserver, snapshots, Zookeeper
>Reporter: Jonathan Hsieh
>Assignee: Matteo Bertozzi
> Fix For: hbase-6055, 0.96.0
>
> Attachments: HBASE-7419-v0.patch, HBASE-7419-v1.patch, 
> HBASE-7419-v2.patch, HBASE-7419-v3.patch
>
>
> A valid table name concatenated with a '.' to a valid region name is also a 
> valid table name, which leads to incorrect interpretation.
> {code}
> true hfile name constraints: [0-9]+(?:_SeqID_[0-9]+)?
> region name constraints: [a-f0-9]{16}  (but we currently just use 
> [a-f0-9]+.)
> table name constraints : [a-zA-Z0-9_][a-zA-Z0-9_.-]*
> {code}
> Notice that the table name constraints completely cover all region name 
> constraints and true hfile name constraints.  (A valid hfile name is a valid 
> part of a table name, and a valid encoded region name is a valid part of a 
> table name.)
> Currently the hfilelink filename convention is --.  
> Unfortunately, making a ref to this uses the name 
> --. -- the concatenation of 
> . is a valid table name and used to get interpreted as such. 
> The fix in HBASE-7339 requires a FileNotFoundException before going down the 
> hfile link resolution path. 
> Regardless of what we do, we need to add some char that is invalid in table 
> names to the hfilelink or reference filename convention.
> Suggestion: if we changed the order of the hfile-link name we could avoid 
> some of the confusion -- @-. (or some 
> separator char other than '@') could be used to avoid handling the initial 
> FileNotFoundException, but I think we'd still need a good chunk of the logic 
> to handle opening a half-storefile reader through an hfilelink.
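The overlap between the three name constraints can be checked directly. A minimal sketch using the patterns quoted in the description (the class and field names are illustrative, not HBase's):

```java
import java.util.regex.Pattern;

// Patterns copied from the issue description; names here are illustrative.
class NamePatterns {
    static final Pattern HFILE  = Pattern.compile("[0-9]+(?:_SeqID_[0-9]+)?");
    static final Pattern REGION = Pattern.compile("[a-f0-9]{16}");
    static final Pattern TABLE  = Pattern.compile("[a-zA-Z0-9_][a-zA-Z0-9_.-]*");

    // A valid hfile name, a valid encoded region name, and even their
    // '.'-joined concatenation with a table name all satisfy TABLE,
    // which is the ambiguity described above.
    static boolean looksLikeTable(String s) {
        return TABLE.matcher(s).matches();
    }
}
```

Because TABLE accepts '.' and '-' after the first character, any separator drawn from those characters cannot disambiguate a link name from a table name; hence the suggestion to introduce a character (such as '@') that table names forbid.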



[jira] [Commented] (HBASE-7236) add per-table/per-cf configuration via metadata

2013-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546591#comment-13546591
 ] 

Hadoop QA commented on HBASE-7236:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563675/HBASE-7236-v6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 28 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 3 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3925//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3925//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3925//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3925//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3925//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3925//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3925//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3925//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3925//console


> add per-table/per-cf configuration via metadata
> ---
>
> Key: HBASE-7236
> URL: https://issues.apache.org/jira/browse/HBASE-7236
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch, 
> HBASE-7236-PROTOTYPE-v1.patch, HBASE-7236-v0.patch, HBASE-7236-v1.patch, 
> HBASE-7236-v2.patch, HBASE-7236-v3.patch, HBASE-7236-v4.patch, 
> HBASE-7236-v5.patch, HBASE-7236-v6.patch, HBASE-7236-v6.patch
>
>
> Regardless of the compaction policy, it makes sense to have separate 
> configuration for compactions for different tables and column families, as 
> their access patterns and workloads can be different. In particular, the 
> tiered compactions being ported from the 0.89-fb branch need this in order 
> to be used properly.
> We might want to add support for compaction configuration via metadata on 
> table/cf.



[jira] [Commented] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546592#comment-13546592
 ] 

Hadoop QA commented on HBASE-7515:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563674/7515-v2.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.testBalancerWithRackLocality(TestBalancerWithNodeGroup.java:220)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3924//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3924//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3924//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3924//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3924//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3924//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3924//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3924//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3924//console


> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0
>
> Attachments: 7515.txt, 7515-v2.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSE_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".
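The fix this issue asks for is a standard close-on-failure pattern: if any open fails, release everything opened so far before propagating the exception. A minimal sketch in plain Java, using hypothetical `StoreFile`/`Opener` stand-ins rather than the real HBase classes:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch of the close-on-failure pattern: if opening any store file throws,
// close the files opened so far so their descriptors are released instead of
// lingering in CLOSE_WAIT. StoreFile and Opener are illustrative stand-ins,
// not the actual HBase types.
public class LoadStoreFilesSketch {
  interface StoreFile extends Closeable {}
  interface Opener { StoreFile open(String path) throws IOException; }

  static List<StoreFile> openAll(List<String> paths, Opener opener) throws IOException {
    List<StoreFile> opened = new ArrayList<>();
    boolean ok = false;
    try {
      for (String p : paths) {
        opened.add(opener.open(p));
      }
      ok = true;
      return opened;
    } finally {
      if (!ok) {
        // Best-effort cleanup: close everything we managed to open.
        for (StoreFile f : opened) {
          try { f.close(); } catch (IOException ignored) { }
        }
      }
    }
  }

  public static void main(String[] args) throws IOException {
    final List<String> closed = new ArrayList<>();
    Opener failing = path -> {
      if (path.equals("bad")) throw new IOException("cannot open " + path);
      return () -> closed.add(path); // close() records which file was closed
    };
    try {
      openAll(List.of("a", "b", "bad"), failing);
      throw new AssertionError("expected IOException");
    } catch (IOException expected) {
      // The two successfully opened files must have been closed.
      if (!closed.equals(List.of("a", "b"))) {
        throw new AssertionError("leaked descriptors: closed=" + closed);
      }
    }
    System.out.println("ok");
  }
}
```

The `try`/`finally` with a success flag keeps the happy path cheap while guaranteeing cleanup on any failure.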

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546593#comment-13546593
 ] 

Ted Yu commented on HBASE-7515:
---

I ran TestLocalHBaseCluster locally and it passed.

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0
>
> Attachments: 7515.txt, 7515-v2.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".



[jira] [Commented] (HBASE-7403) Online Merge

2013-01-07 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546611#comment-13546611
 ] 

chunhui shen commented on HBASE-7403:
-

bq.while doing complete_merging if due to some reason we dont get regioninfo
In the state of complete_merging, we will delete the merging regions from 
.META., so the region info is available.

bq.My point was does the failed merge transaction, restart again on master 
restart?
Yes, it will. A failed merge transaction does not mean it will always fail, 
so we redo it.

> Online Merge
> 
>
> Key: HBASE-7403
> URL: https://issues.apache.org/jira/browse/HBASE-7403
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7403-trunkv5.patch, 7403-trunkv6.patch, 7403v5.diff, 
> 7403-v5.txt, 7403v5.txt, hbase-7403-94v1.patch, hbase-7403-trunkv1.patch, 
> hbase-7403-trunkv5.patch, hbase-7403-trunkv6.patch, hbase-7403-trunkv7.patch, 
> merge region.pdf
>
>
> The features of this online merge:
> 1. Online; no need to disable the table
> 2. Few changes to the current code; could be applied in trunk, 0.94, 0.92 
> or 0.90
> 3. Easy to call a merge request; no need to input a long region name, the 
> encoded name is enough
> 4. No limits when operating; you don't need to take care of events like 
> Server Dead, Balance, Split, or Disabling/Enabling a table, nor worry 
> whether you sent a wrong merge request; it is already handled for you
> 5. Only a little offline time for the two merging regions
> We need merge in the following cases:
> 1. A region hole or region overlap that can't be fixed by hbck
> 2. A region becomes empty because of TTL or an unreasonable rowkey design
> 3. A region is always empty or very small because of a presplit at table 
> creation
> 4. Too many empty or small regions reduce system performance (e.g. mslab)
> The current merge tools only work offline and cannot redo if an exception 
> is thrown in the process of merging, leaving dirty data behind.
> For an online system, we need an online merge.
> The implementation logic of this patch for Online Merge is:
> For example, merge regionA and regionB into regionC:
> 1. Offline the two regions A and B
> 2. Merge the two regions in HDFS (create regionC's directory, move 
> regionA's and regionB's files to regionC's directory, delete regionA's and 
> regionB's directories)
> 3. Add the merged regionC to .META.
> 4. Assign the merged regionC
> By design, once we have done the merge work in HDFS, we can redo it until 
> successful if it throws an exception, aborts, or the server restarts, but 
> it cannot be rolled back.
> It depends on:
> Using zookeeper to record the transaction journal state, making redo easier
> Using zookeeper to send/receive merge requests
> The merge transaction being executed on the master
> Support for calling merge requests through the API or shell tool
> About the merge process, please see the attachment and patch
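The redo-until-success behavior described above can be sketched as a journaled state machine: each completed step is recorded (in zookeeper, per the patch's design), so after a crash the transaction resumes from the last recorded step instead of repeating work or rolling back. A simplified, hypothetical sketch (an in-memory journal stands in for zookeeper):

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of an idempotent, journaled merge transaction.
// The journal (zookeeper in the real patch) records completed steps, so a
// redo after a crash skips work that already succeeded.
public class MergeRedoSketch {
  enum Step { OFFLINE_REGIONS, MERGE_IN_FS, UPDATE_META, ASSIGN_MERGED }

  final Set<Step> journal = EnumSet.noneOf(Step.class); // persisted in ZK in reality
  final AtomicInteger fsOps = new AtomicInteger();      // counts real filesystem work

  void runStep(Step step, Runnable work) {
    if (journal.contains(step)) return; // already done in a previous attempt
    work.run();
    journal.add(step);                  // journal only after the step succeeds
  }

  /** Runs the whole transaction; safe to call again after a failure. */
  void execute(boolean failBeforeMeta) {
    runStep(Step.OFFLINE_REGIONS, () -> {});
    runStep(Step.MERGE_IN_FS, fsOps::incrementAndGet);
    if (failBeforeMeta) throw new IllegalStateException("crash before META update");
    runStep(Step.UPDATE_META, () -> {});
    runStep(Step.ASSIGN_MERGED, () -> {});
  }

  public static void main(String[] args) {
    MergeRedoSketch tx = new MergeRedoSketch();
    try {
      tx.execute(true); // first attempt crashes after the HDFS merge
    } catch (IllegalStateException expected) { }
    tx.execute(false);  // redo completes the remaining steps
    // The HDFS merge ran exactly once even though execute() ran twice.
    if (tx.fsOps.get() != 1 || tx.journal.size() != 4) {
      throw new AssertionError("redo was not idempotent");
    }
    System.out.println("ok");
  }
}
```

Because each step is only journaled after it succeeds, a redo re-runs at most the step that was in flight when the crash happened; this is why the design can redo but not roll back.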



[jira] [Commented] (HBASE-7506) Judgment of carrying ROOT/META will become wrong when expiring server

2013-01-07 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546614#comment-13546614
 ] 

chunhui shen commented on HBASE-7506:
-

[~jxiang]
bq.3.in verifyAndAssignRootWithRetries, check if root is RIT
-ROOT- won't be in RIT, because we offline it
{code}
if (isCarryingRoot()) { // -ROOT-
LOG.info("Server " + serverName +
" was carrying ROOT. Trying to assign.");
this.services.getAssignmentManager().
  regionOffline(HRegionInfo.ROOT_REGIONINFO);
{code}

bq.1. remove class MetaServerShutdownHandler;
MetaServerShutdownHandler and ServerShutdownHandler will be submit to different 
ExecutorService, that's why we using MetaServerShutdownHandler 



> Judgment of carrying ROOT/META will become wrong when expiring server
> -
>
> Key: HBASE-7506
> URL: https://issues.apache.org/jira/browse/HBASE-7506
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7506-trunk v1.patch
>
>
> We check whether a server is carrying ROOT/META when expiring the server.
> See ServerManager#expireServer.
> If the dead server is carrying META, we assign meta directly in the process 
> of ServerShutdownHandler.
> If the dead server is carrying ROOT, we will offline ROOT and then call 
> verifyAndAssignRootWithRetries().
> How does the judgment of carrying ROOT/META become wrong?
> If the region is in RIT, isCarryingRegion() returns true after addressing 
> from zk.
> However, once the RIT times out (which could be caused by 
> this.allRegionServersOffline && !noRSAvailable, see 
> AssignmentManager#TimeoutMonitor) and we assign the region elsewhere, this 
> judgment becomes wrong.
> See AssignmentManager#isCarryingRegion for details.
> With the wrong judgment of carrying ROOT/META, we would assign ROOT/META 
> twice.



[jira] [Commented] (HBASE-7506) Judgment of carrying ROOT/META will become wrong when expiring server

2013-01-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546621#comment-13546621
 ] 

Ted Yu commented on HBASE-7506:
---

I think we should keep MetaServerShutdownHandler.
Take a look at HBASE-3809.

> Judgment of carrying ROOT/META will become wrong when expiring server
> -
>
> Key: HBASE-7506
> URL: https://issues.apache.org/jira/browse/HBASE-7506
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7506-trunk v1.patch
>
>
> We check whether a server is carrying ROOT/META when expiring the server.
> See ServerManager#expireServer.
> If the dead server is carrying META, we assign meta directly in the process 
> of ServerShutdownHandler.
> If the dead server is carrying ROOT, we will offline ROOT and then call 
> verifyAndAssignRootWithRetries().
> How does the judgment of carrying ROOT/META become wrong?
> If the region is in RIT, isCarryingRegion() returns true after addressing 
> from zk.
> However, once the RIT times out (which could be caused by 
> this.allRegionServersOffline && !noRSAvailable, see 
> AssignmentManager#TimeoutMonitor) and we assign the region elsewhere, this 
> judgment becomes wrong.
> See AssignmentManager#isCarryingRegion for details.
> With the wrong judgment of carrying ROOT/META, we would assign ROOT/META 
> twice.



[jira] [Commented] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE

2013-01-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546619#comment-13546619
 ] 

Ted Yu commented on HBASE-7404:
---

@Chunhui:
This is an important feature. Please fill out Release Notes so that users know 
how to use it.

> Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
> --
>
> Key: HBASE-7404
> URL: https://issues.apache.org/jira/browse/HBASE-7404
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7404-trunk-v10.patch, 7404-trunk-v11.patch, 
> 7404-trunk-v12.patch, 7404-trunk-v13.patch, 7404-trunk-v13.txt, 
> 7404-trunk-v14.patch, BucketCache.pdf, hbase-7404-94v2.patch, 
> hbase-7404-trunkv2.patch, hbase-7404-trunkv9.patch, Introduction of Bucket 
> Cache.pdf
>
>
> First, thanks to @neil from Fusion-IO for sharing the source code.
> What's Bucket Cache?
> It can greatly decrease CMS and heap fragmentation caused by GC
> It supports a large cache space for high read performance by using 
> high-speed disks like Fusion-io
> 1. An implementation of block cache like LruBlockCache
> 2. Self-manages blocks' storage positions through the Bucket Allocator
> 3. The cached blocks can be stored in memory or in the file system
> 4. Bucket Cache can be used as a main block cache (see CombinedBlockCache), 
> combined with LruBlockCache to decrease CMS and fragmentation caused by GC
> 5. BucketCache can also be used as a secondary cache (e.g. using Fusion-io 
> to store blocks) to enlarge the cache space
> How about SlabCache?
> We studied and tested SlabCache first, but the results were bad, because:
> 1. SlabCache uses SingleSizeCache; its memory utilization is low because of 
> the variety of block sizes, especially when using DataBlockEncoding
> 2. SlabCache is used in DoubleBlockCache; a block is cached both in 
> SlabCache and LruBlockCache, and is put into LruBlockCache again on a 
> SlabCache hit, so CMS and heap fragmentation don't get any better
> 3. Direct (off-heap) performance is not as good as heap and may cause OOM, 
> so we recommend using the "heap" engine
> See more in the attachment and in the patch
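The combined-cache idea above can be illustrated with a two-level lookup: a small on-heap L1 (the LruBlockCache role) and a larger secondary L2 (the BucketCache role). A hypothetical sketch with plain maps standing in for the real caches:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a two-level lookup in the CombinedBlockCache spirit:
// a small on-heap L1 plus a larger secondary L2 (BucketCache in the real
// patch, possibly file-backed). Plain maps stand in for the real caches.
public class CombinedCacheSketch {
  final Map<String, byte[]> l1 = new HashMap<>(); // stands in for LruBlockCache
  final Map<String, byte[]> l2 = new HashMap<>(); // stands in for BucketCache

  byte[] getBlock(String key) {
    byte[] block = l1.get(key);
    if (block != null) return block; // L1 hit: cheapest path
    // Unlike DoubleBlockCache, an L2 hit is NOT copied back into L1 here,
    // which is part of how the combined design avoids extra heap churn.
    return l2.get(key);
  }

  public static void main(String[] args) {
    CombinedCacheSketch cache = new CombinedCacheSketch();
    cache.l1.put("index-block", new byte[]{1});
    cache.l2.put("data-block", new byte[]{2});
    if (cache.getBlock("index-block")[0] != 1) throw new AssertionError();
    if (cache.getBlock("data-block")[0] != 2) throw new AssertionError();
    if (cache.l1.containsKey("data-block")) throw new AssertionError("L2 hit promoted to L1");
    if (cache.getBlock("missing") != null) throw new AssertionError();
    System.out.println("ok");
  }
}
```

The contrast with DoubleBlockCache (which re-inserts SlabCache hits into the LRU, as criticized above) is the point: keeping L2 hits out of the heap cache is what relieves CMS and fragmentation pressure.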



[jira] [Commented] (HBASE-7507) Make memstore flush be able to retry after exception

2013-01-07 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546622#comment-13546622
 ] 

chunhui shen commented on HBASE-7507:
-

bq.Why moving the location of validateStoreFile() call ?
We can't retry in HStore#commitFile, but we can retry if validateStoreFile() 
fails, so its location was moved.

[~stack]
bq.Should we open a new issue to retry all hdfs operations? 
We do hdfs operations for HFile and HLog, and we can already tolerate IO 
errors in HLog.
So I think retrying the flush is enough, since IO errors during compaction 
don't matter much.

I will address the other comments in a new patch.

Thanks
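The retry being discussed is a bounded retry with backoff around a flush-style operation, so a transient filesystem error (e.g. during a NameNode failover) doesn't immediately abort the server. A hypothetical sketch, not the actual patch code:

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of retrying a flush-style operation that may hit a
// transient filesystem error (e.g. during a NameNode failover), instead of
// aborting the server on the first failure.
public class FlushRetrySketch {
  interface FlushOp { void run() throws IOException; }

  static void flushWithRetries(FlushOp op, int maxRetries, long pauseMillis)
      throws IOException, InterruptedException {
    IOException last = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        op.run();
        return; // success
      } catch (IOException e) {
        last = e;
        // Linear backoff between attempts; the real patch may differ.
        if (attempt < maxRetries) Thread.sleep(pauseMillis * (attempt + 1));
      }
    }
    throw last; // retries exhausted: the caller may still abort, as today
  }

  public static void main(String[] args) throws Exception {
    AtomicInteger calls = new AtomicInteger();
    // Fails twice (transiently), then succeeds.
    FlushOp flaky = () -> {
      if (calls.incrementAndGet() <= 2) throw new IOException("fs not ready");
    };
    flushWithRetries(flaky, 3, 1);
    if (calls.get() != 3) throw new AssertionError("expected 3 attempts, got " + calls);
    System.out.println("ok");
  }
}
```

Only when all retries are exhausted does the exception propagate, at which point the existing DroppedSnapshotException/abort path can take over unchanged.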

> Make memstore flush be able to retry after exception
> 
>
> Key: HBASE-7507
> URL: https://issues.apache.org/jira/browse/HBASE-7507
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7507-trunk v1.patch
>
>
> We will abort regionserver if memstore flush throws exception.
> I thinks we could do retry to make regionserver more stable because file 
> system may be not ok in a transient time. e.g. Switching namenode in the 
> NamenodeHA environment
> {code}
> HRegion#internalFlushcache(){
> ...
> try {
> ...
> }catch(Throwable t){
> DroppedSnapshotException dse = new DroppedSnapshotException("region: " +
>   Bytes.toStringBinary(getRegionName()));
> dse.initCause(t);
> throw dse;
> }
> ...
> }
> MemStoreFlusher#flushRegion(){
> ...
> region.flushcache();
> ...
>  try {
> }catch(DroppedSnapshotException ex){
> server.abort("Replay of HLog required. Forcing server shutdown", ex);
> }
> ...
> }
> {code}



[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS

2013-01-07 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-7504:


Description: 
1. A full GC happens on the ROOT regionserver.
2. The ZK session times out; the master expires the regionserver and submits 
it to ServerShutdownHandler
3. The regionserver completes the full GC
4. In the process of ServerShutdownHandler, verifyRootRegionLocation returns 
true
5. ServerShutdownHandler skips assigning the ROOT region
6. The regionserver aborts itself because it receives a YouAreDeadException 
after a regionserver report
7. ROOT is now offline and won't be assigned any more unless we restart the 
master



Master Log:
{code}
2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted shutdown 
handler to be executed, root=true, meta=false
2012-10-31 19:51:39,045 INFO 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
for dw88.kgb.sqa.cm4,60020,1351671478752
2012-10-31 19:51:50,113 INFO 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server 
dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
Server REPORT rejected; currently processing 
dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
2012-10-31 19:52:15,945 INFO 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log 
splitting for dw88.kgb.sqa.cm4,60020,1351671478752
{code}

No log of assigning ROOT

Regionserver log:
{code}
2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
229128ms instead of 10ms, this is likely due to a long garbage collecting 
pause and it's usually bad, see 
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
{code}




  was:
1.FullGC happen on ROOT regionserver.
2.ZK session timeout, master expire the regionserver and submit to 
ServerShutdownHandler
3.Regionserver complete the FullGC
4.In the process of ServerShutdownHandler, verifyRootRegionLocation returns true
5.ServerShutdownHandler skip assigning -ROOT- region
6.Regionserver abort itself because it reveive YouAreDeadException after a 
regionserver report
7.-ROO- is offline now, and won't be assigned any more unless we restart master



Master Log:
{code}
2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted shutdown 
handler to be executed, root=true, meta=false
2012-10-31 19:51:39,045 INFO 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
for dw88.kgb.sqa.cm4,60020,1351671478752
2012-10-31 19:51:50,113 INFO 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server 
dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
Server REPORT rejected; currently processing 
dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
2012-10-31 19:52:15,945 INFO 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log 
splitting for dw88.kgb.sqa.cm4,60020,1351671478752
{code}

No log of assigning -ROOT-

Regionserver log:
{code}
2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
229128ms instead of 10ms, this is likely due to a long garbage collecting 
pause and it's usually bad, see 
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
{code}





> -ROOT- may be offline forever after FullGC of  RS
> -
>
> Key: HBASE-7504
> URL: https://issues.apache.org/jira/browse/HBASE-7504
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: 7504-trunk v1.patch
>
>
> 1. A full GC happens on the ROOT regionserver.
> 2. The ZK session times out; the master expires the regionserver and 
> submits it to ServerShutdownHandler
> 3. The regionserver completes the full GC
> 4. In the process of ServerShutdownHandler, verifyRootRegionLocation 
> returns true
> 5. ServerShutdownHandler skips assigning the ROOT region
> 6. The regionserver aborts itself because it receives a YouAreDeadException 
> after a regionserver report
> 7. ROOT is now offline and won't be assigned any more unless we restart 
> the master
> Master Log:
> {code}
> 2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted 
> shutdown handler to be executed, root=true, meta=false
> 2012-10-31 19:51:39,045 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
> for dw88.kgb.sqa.cm4,60020,1351671478752
> 2012-10-31 19:51:50,113 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server 
> dw88.kgb.sqa.cm4,60020,1

[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS

2013-01-07 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-7504:


Attachment: 7504-trunk v2.patch

> -ROOT- may be offline forever after FullGC of  RS
> -
>
> Key: HBASE-7504
> URL: https://issues.apache.org/jira/browse/HBASE-7504
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: 7504-trunk v1.patch, 7504-trunk v2.patch
>
>
> 1. A full GC happens on the ROOT regionserver.
> 2. The ZK session times out; the master expires the regionserver and 
> submits it to ServerShutdownHandler
> 3. The regionserver completes the full GC
> 4. In the process of ServerShutdownHandler, verifyRootRegionLocation 
> returns true
> 5. ServerShutdownHandler skips assigning the ROOT region
> 6. The regionserver aborts itself because it receives a YouAreDeadException 
> after a regionserver report
> 7. ROOT is now offline and won't be assigned any more unless we restart 
> the master
> Master Log:
> {code}
> 2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted 
> shutdown handler to be executed, root=true, meta=false
> 2012-10-31 19:51:39,045 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
> for dw88.kgb.sqa.cm4,60020,1351671478752
> 2012-10-31 19:51:50,113 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server 
> dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
> 2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Server REPORT rejected; currently processing 
> dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
> 2012-10-31 19:52:15,945 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log 
> splitting for dw88.kgb.sqa.cm4,60020,1351671478752
> {code}
> No log of assigning ROOT
> Regionserver log:
> {code}
> 2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
> 229128ms instead of 10ms, this is likely due to a long garbage collecting 
> pause and it's usually bad, see 
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> {code}



[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS

2013-01-07 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-7504:


Fix Version/s: 0.96.0
   Status: Patch Available  (was: Open)

> -ROOT- may be offline forever after FullGC of  RS
> -
>
> Key: HBASE-7504
> URL: https://issues.apache.org/jira/browse/HBASE-7504
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7504-trunk v1.patch, 7504-trunk v2.patch
>
>
> 1. A full GC happens on the ROOT regionserver.
> 2. The ZK session times out; the master expires the regionserver and 
> submits it to ServerShutdownHandler
> 3. The regionserver completes the full GC
> 4. In the process of ServerShutdownHandler, verifyRootRegionLocation 
> returns true
> 5. ServerShutdownHandler skips assigning the ROOT region
> 6. The regionserver aborts itself because it receives a YouAreDeadException 
> after a regionserver report
> 7. ROOT is now offline and won't be assigned any more unless we restart 
> the master
> Master Log:
> {code}
> 2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted 
> shutdown handler to be executed, root=true, meta=false
> 2012-10-31 19:51:39,045 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
> for dw88.kgb.sqa.cm4,60020,1351671478752
> 2012-10-31 19:51:50,113 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server 
> dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
> 2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Server REPORT rejected; currently processing 
> dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
> 2012-10-31 19:52:15,945 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log 
> splitting for dw88.kgb.sqa.cm4,60020,1351671478752
> {code}
> No log of assigning ROOT
> Regionserver log:
> {code}
> 2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
> 229128ms instead of 10ms, this is likely due to a long garbage collecting 
> pause and it's usually bad, see 
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> {code}



[jira] [Commented] (HBASE-7505) Server will hang when stopping cluster, caused by waiting for split threads

2013-01-07 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546630#comment-13546630
 ] 

chunhui shen commented on HBASE-7505:
-

OK, let's run HadoopQA first.

> Server will hang when stopping cluster, caused by waiting for split threads
> ---
>
> Key: HBASE-7505
> URL: https://issues.apache.org/jira/browse/HBASE-7505
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7505-trunk v1.patch
>
>
> We now retry 100 times (about 3200 minutes) for 
> HRegionServer#postOpenDeployTasks; see 
> HConnectionManager#setServerSideHConnectionRetries.
> However, when we stop the cluster, we wait for the split threads in 
> HRegionServer#join;
> if the META/ROOT server has already been stopped, the split thread won't 
> exit because it is retrying HRegionServer#postOpenDeployTasks



[jira] [Updated] (HBASE-7505) Server will hang when stopping cluster, caused by waiting for split threads

2013-01-07 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-7505:


Status: Patch Available  (was: Open)

> Server will hang when stopping cluster, caused by waiting for split threads
> ---
>
> Key: HBASE-7505
> URL: https://issues.apache.org/jira/browse/HBASE-7505
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7505-trunk v1.patch
>
>
> We now retry 100 times (about 3200 minutes) for 
> HRegionServer#postOpenDeployTasks; see 
> HConnectionManager#setServerSideHConnectionRetries.
> However, when we stop the cluster, we wait for the split threads in 
> HRegionServer#join;
> if the META/ROOT server has already been stopped, the split thread won't 
> exit because it is retrying HRegionServer#postOpenDeployTasks



[jira] [Commented] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE

2013-01-07 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546632#comment-13546632
 ] 

chunhui shen commented on HBASE-7404:
-

OK, I will attach the usage notes.

> Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
> --
>
> Key: HBASE-7404
> URL: https://issues.apache.org/jira/browse/HBASE-7404
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7404-trunk-v10.patch, 7404-trunk-v11.patch, 
> 7404-trunk-v12.patch, 7404-trunk-v13.patch, 7404-trunk-v13.txt, 
> 7404-trunk-v14.patch, BucketCache.pdf, hbase-7404-94v2.patch, 
> hbase-7404-trunkv2.patch, hbase-7404-trunkv9.patch, Introduction of Bucket 
> Cache.pdf
>
>
> First, thanks to @neil from Fusion-IO for sharing the source code.
> What's Bucket Cache?
> It can greatly decrease CMS and heap fragmentation caused by GC
> It supports a large cache space for high read performance by using 
> high-speed disks like Fusion-io
> 1. An implementation of block cache like LruBlockCache
> 2. Self-manages blocks' storage positions through the Bucket Allocator
> 3. The cached blocks can be stored in memory or in the file system
> 4. Bucket Cache can be used as a main block cache (see CombinedBlockCache), 
> combined with LruBlockCache to decrease CMS and fragmentation caused by GC
> 5. BucketCache can also be used as a secondary cache (e.g. using Fusion-io 
> to store blocks) to enlarge the cache space
> How about SlabCache?
> We studied and tested SlabCache first, but the results were bad, because:
> 1. SlabCache uses SingleSizeCache; its memory utilization is low because of 
> the variety of block sizes, especially when using DataBlockEncoding
> 2. SlabCache is used in DoubleBlockCache; a block is cached both in 
> SlabCache and LruBlockCache, and is put into LruBlockCache again on a 
> SlabCache hit, so CMS and heap fragmentation don't get any better
> 3. Direct (off-heap) performance is not as good as heap and may cause OOM, 
> so we recommend using the "heap" engine
> See more in the attachment and in the patch



[jira] [Commented] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta

2013-01-07 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546633#comment-13546633
 ] 

chunhui shen commented on HBASE-3809:
-

I think it won't happen in trunk now, because:
1. We use different ExecutorServices to execute ServerShutdownHandler and 
MetaServerShutdownHandler
2. In the process of MetaServerShutdownHandler:
{code}
if (isCarryingRoot() || isCarryingMeta() // -ROOT- or .META.
  || !services.getAssignmentManager().isFailoverCleanupDone()) {
this.services.getServerManager().processDeadServer(serverName);
return;
  }
{code}

This means MetaServerShutdownHandler can always be executed, so this stuck 
scenario won't happen again.

> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> ---
>
> Key: HBASE-3809
> URL: https://issues.apache.org/jira/browse/HBASE-3809
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Fix For: 0.96.0
>
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster 
> which killed 100 nodes, and .META. had been running on one of those downed 
> nodes, well, you'll have all of your master executors processing 
> ServerShutdowns, and more than likely none of the currently processing 
> executors will be servicing the shutdown of the server that was carrying 
> .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required. So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server 
> carrying .META. because they need .META. to complete.
> We can make the master handlers unbounded so the pool will expand to 
> accommodate all crashed servers -- so it'll have the one .META. handler in 
> its queue -- or we can change it so shutdown handling doesn't require 
> .META. to be online (it's used to figure out the regions the server was 
> carrying); we could use the master's in-memory picture of the cluster (but 
> IIRC, there may be holes, TBD)
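The dedicated-executor fix discussed in the comments above can be demonstrated in miniature: ordinary shutdown handlers saturate their pool while blocked on META, but the handler for the META-carrying server runs on its own pool, so it still makes progress and unblocks them. A hypothetical sketch using standard `java.util.concurrent` primitives:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the two-pool fix: ordinary server-shutdown handlers
// can all block waiting for META, but the handler for the META-carrying
// server runs on its own executor, so it still runs and unblocks them.
public class MetaExecutorSketch {
  public static void main(String[] args) throws Exception {
    ExecutorService ordinaryPool = Executors.newFixedThreadPool(2); // will be saturated
    ExecutorService metaPool = Executors.newSingleThreadExecutor();
    CountDownLatch metaOnline = new CountDownLatch(1);

    // Two ordinary handlers occupy the whole ordinary pool, waiting on META.
    for (int i = 0; i < 2; i++) {
      ordinaryPool.submit(() -> {
        metaOnline.await(); // needs META online to finish
        return null;
      });
    }
    // The META server's shutdown handler is NOT queued behind them:
    metaPool.submit(metaOnline::countDown); // "reassigns META", releasing the latch

    ordinaryPool.shutdown();
    if (!ordinaryPool.awaitTermination(5, TimeUnit.SECONDS)) {
      throw new AssertionError("deadlock: ordinary handlers never finished");
    }
    metaPool.shutdown();
    System.out.println("ok");
  }
}
```

With a single shared bounded pool, the `countDown` task would sit behind the blocked handlers and the program would hang; the separate pool is exactly what breaks that cycle.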



[jira] [Commented] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta

2013-01-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546635#comment-13546635
 ] 

Ted Yu commented on HBASE-3809:
---

What about the scenario J-D described @ 22/Apr/11 21:48?

> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> ---
>
> Key: HBASE-3809
> URL: https://issues.apache.org/jira/browse/HBASE-3809
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Fix For: 0.96.0
>
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster which 
> killed 100 nodes, and .META. had been running on one of those downed nodes, 
> well, you'll have all of your master executors processing ServerShutdowns and 
> more than likely none of the currently processing executors will be servicing 
> the shutdown of the server that was carrying .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required.  So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server carrying 
> .META. because they need .META. to complete.
> We can make the master handlers have no bound so it will expand to accommodate 
> all crashed servers -- so it'll have the one .META. in its queue -- or we can 
> change it so shutdown handling doesn't require .META. to be on-line (it's used 
> to figure the regions the server was carrying); we could use the master's 
> in-memory picture of the cluster (But IIRC, there may be holes TBD)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7403) Online Merge

2013-01-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546640#comment-13546640
 ] 

Ted Yu commented on HBASE-7403:
---

TestMergeTransaction is marked as MediumTests.
I originally expected a large test, due to the various scenarios that should be 
covered.
{code}
+  public void testRedoMergeWhenOfflineRegion() throws Exception {
{code}
Should the above method be called 
testRedoMergeWhenOfflineRegionEncountersException() ?
{code}
+// Throw exception when offline region
+throwExceptionStep = 1;
{code}
Can you use enum or add javadoc explaining what each step does ?
{code}
+  public void testRedoMergeWhenExecuteMerge() throws Exception {
{code}
There should be a better name for the above method - we redo the merge when 
there is an exception from merge execution.

For testRedoMergeWhenCancelMerge(), 
{code}
+// Throw exception when complete merge
+throwExceptionStep = 4;
{code}
Does the comment match method name ?

Overall, testing is done through fault injection. Can we add more test(s) for 
master failover scenario, etc ?

Thanks
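The enum suggestion above could look like the following sketch. The step names are guesses based on the comments quoted in this review, not the patch's actual values; the point is only that a named constant documents each fault-injection point better than a bare integer such as `throwExceptionStep = 1`.

```java
// Illustrative only: replace magic-number fault-injection steps with an enum.
public class FaultStepSketch {
    enum FaultStep { NONE, OFFLINE_REGION, EXECUTE_MERGE, UPDATE_META, COMPLETE_MERGE }

    // Which step of the merge transaction should fail (NONE = no injection).
    static FaultStep throwExceptionStep = FaultStep.NONE;

    static void maybeFail(FaultStep current) {
        if (current == throwExceptionStep) {
            throw new RuntimeException("injected fault at " + current);
        }
    }

    public static void main(String[] args) {
        throwExceptionStep = FaultStep.OFFLINE_REGION;
        try {
            maybeFail(FaultStep.OFFLINE_REGION);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```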

> Online Merge
> 
>
> Key: HBASE-7403
> URL: https://issues.apache.org/jira/browse/HBASE-7403
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0, 0.94.5
>
> Attachments: 7403-trunkv5.patch, 7403-trunkv6.patch, 7403v5.diff, 
> 7403-v5.txt, 7403v5.txt, hbase-7403-94v1.patch, hbase-7403-trunkv1.patch, 
> hbase-7403-trunkv5.patch, hbase-7403-trunkv6.patch, hbase-7403-trunkv7.patch, 
> merge region.pdf
>
>
> The features of this online merge:
> 1. Online: no need to disable the table
> 2. Few changes to the current code; could be applied to trunk, 0.94, 0.92 or 
> 0.90
> 3. Easy to call a merge request: no need to input a long region name, the 
> encoded name is enough
> 4. No limits on when it can be run: you don't need to take care of events 
> like Server Dead, Balance, Split, or Disabling/Enabling a table, nor worry 
> about sending a wrong merge request; that is already handled for you
> 5. Only a little offline time for the two merging regions
> We need merge in the following cases:
> 1. Region hole or region overlap that can't be fixed by hbck
> 2. Region becomes empty because of TTL or an unreasonable rowkey design
> 3. Region is always empty or very small because of a presplit at table 
> creation
> 4. Too many empty or small regions would reduce system performance (e.g. 
> mslab)
> Current merge tools only support offline merge and are not able to redo if an 
> exception is thrown in the process of merging, leaving dirty data behind.
> For an online system, we need an online merge.
> The implementation logic of this patch for Online Merge is:
> For example, merge regionA and regionB into regionC
> 1.Offline the two regions A and B
> 2.Merge the two regions in the HDFS(Create regionC’s directory, move 
> regionA’s and regionB’s file to regionC’s directory, delete regionA’s and 
> regionB’s directory)
> 3.Add the merged regionC to .META.
> 4.Assign the merged regionC
> By design of this patch, once we have done the merge work in HDFS, we can 
> redo it until successful if it throws an exception, aborts, or the server 
> restarts, but it cannot be rolled back. 
> It depends on:
> Using zookeeper to record the transaction journal state, which makes redo 
> easier
> Using zookeeper to send/receive merge requests
> Executing the merge transaction on the master
> Supporting merge requests through an API or shell tool
> About the merge process, please see the attachment and patch
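The four-step flow described above can be sketched as follows. The method and region names here are illustrative stand-ins, not the actual patch's API; the comment notes the redo-but-no-rollback property the description emphasizes.

```java
// Hypothetical sketch of the merge flow: regionA + regionB -> regionC.
public class OnlineMergeSketch {
    static void offline(String region) { System.out.println("offline " + region); }
    static void mergeInHdfs(String a, String b, String c) {
        // Create regionC's directory, move regionA's and regionB's files
        // into it, then delete the old directories.
        System.out.println("hdfs merge " + a + "+" + b + " -> " + c);
    }
    static void addToMeta(String region) { System.out.println(".META. add " + region); }
    static void assign(String region) { System.out.println("assign " + region); }

    // Each step would be journaled (e.g. in ZooKeeper) so the transaction can
    // be redone from the last completed step; once the HDFS merge has
    // happened it cannot be rolled back, only driven forward to completion.
    public static void main(String[] args) {
        offline("regionA");
        offline("regionB");
        mergeInHdfs("regionA", "regionB", "regionC");
        addToMeta("regionC");
        assign("regionC");
    }
}
```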

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7513) HDFSBlocksDistribution shouldn't send NPEs when something goes wrong

2013-01-07 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546641#comment-13546641
 ] 

Elliott Clark commented on HBASE-7513:
--

Sure.

> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> 
>
> Key: HBASE-7513
> URL: https://issues.apache.org/jira/browse/HBASE-7513
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0, 0.94.4
>Reporter: Jean-Daniel Cryans
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7513-0.patch
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open 
> of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: 
> empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>   at org.apache.hadoop.hbase.regionserver.Store.(Store.java:256)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>   at 
> org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>   at 
> org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>   at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>   ... 8 more
> 2013-01-07 09:58:59,059 INFO 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of 
> region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can exist but we send an NPE anyways? And then this is wrapped 
> in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big ugly exceptions. In this case the region would fail opening 
> later anyways but at least the error message will be clearer.
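The "skip computing the locality" suggestion above might look like the following sketch. The class and method names mirror HDFSBlocksDistribution for readability, but the body is illustrative only, not the actual fix: a block with no reported hosts is silently skipped, so locality simply comes out lower instead of the open failing with an NPE.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: tolerate missing host info instead of throwing.
public class BlocksDistributionSketch {
    private final Map<String, Long> hostWeights = new HashMap<>();
    private long uniqueBlocksTotalWeight = 0;

    public void addHostsAndBlockWeight(String[] hosts, long weight) {
        if (hosts == null || hosts.length == 0) {
            return; // missing locality info: skip this block, no exception
        }
        uniqueBlocksTotalWeight += weight;
        for (String host : hosts) {
            hostWeights.merge(host, weight, Long::sum);
        }
    }

    public long getWeight(String host) {
        return hostWeights.getOrDefault(host, 0L);
    }

    public static void main(String[] args) {
        BlocksDistributionSketch d = new BlocksDistributionSketch();
        d.addHostsAndBlockWeight(null, 128);           // tolerated, no NPE
        d.addHostsAndBlockWeight(new String[0], 128);  // tolerated, no NPE
        d.addHostsAndBlockWeight(new String[] {"rs1"}, 64);
        System.out.println(d.getWeight("rs1"));
    }
}
```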

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta

2013-01-07 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546642#comment-13546642
 ] 

chunhui shen commented on HBASE-3809:
-

The scenario J-D described @ 22/Apr/11 21:48 is a common multi-assign case 
from an early version.

We have done a lot of work to fix multi-assign cases, so it shouldn't be a 
problem now.




> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> ---
>
> Key: HBASE-3809
> URL: https://issues.apache.org/jira/browse/HBASE-3809
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Fix For: 0.96.0
>
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster which 
> killed 100 nodes, and .META. had been running on one of those downed nodes, 
> well, you'll have all of your master executors processing ServerShutdowns and 
> more than likely none of the currently processing executors will be servicing 
> the shutdown of the server that was carrying .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required.  So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server carrying 
> .META. because they need .META. to complete.
> We can make the master handlers have no bound so it will expand to accommodate 
> all crashed servers -- so it'll have the one .META. in its queue -- or we can 
> change it so shutdown handling doesn't require .META. to be on-line (it's used 
> to figure the regions the server was carrying); we could use the master's 
> in-memory picture of the cluster (But IIRC, there may be holes TBD)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-07 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7479:
-

Attachment: 7479.txt

Patch that removes VersionedProtocol and ProtocolSignature.

> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-07 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546651#comment-13546651
 ] 

stack commented on HBASE-7479:
--

Commit message:

M hbase-protocol/src/main/protobuf/RPC.proto 
b/hbase-protocol/src/main/protobuf/RPC.proto
  Remove the clientProtocolVersion field.  Unused.
A hbase-server/src/main/java/org/apache/hadoop/hbase/IpcProtocol.java
  Added a marker Interface to use as VersionedProtocol was used: the
  Interface all protocols implement.  Needed for now; otherwise we would
  have to refactor loads of code.  It is up here at the top level rather
  than down in ipc because the server protocols are at this level, and
  it's odd to have super packages implement Interfaces that are in
  subpackages.
M hbase-server/src/main/java/org/apache/hadoop/hbase/MasterAdminProtocol.java
  Remove the repetition of the content of BlockingInterface.  True, there is
  nice javadoc in the repetition here, but the doc belongs in the proto files.
  Remove VersionedProtocol and implement IpcProtocol instead.
M hbase-server/src/main/java/org/apache/hadoop/hbase/MasterProtocol.java
  Ditto
M 
hbase-server/src/main/java/org/apache/hadoop/hbase/RegionServerStatusProtocol.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/client/AdminProtocol.java
M hbase-server/src/main/java/org/apache/hadoop/hbase/client/ClientProtocol.java
  Implement IpcProtocol and remove VersionedProtocol
M 
hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
  Remove VersionedProtocol and implement IpcProtocol instead.
  Don't pass 'version' when getting clients and servers and protocols.
M hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
  Remove unused call method.  Refer to IpcProtocol rather than VP.
M hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClientRPC.java
  Remove taking client version from methods.  Use IpcProtocol instead of VP
M hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
  Ditto
M hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServerRPC.java
  Ditto, and remove proxy; there is no proxying server-side.
  Remove an unused getServer method.
M 
hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/ProtobufRpcClientEngine.java
  Use IpcProtocol instead of VP.  Removed the PROTOCOL_VERSION Map.
  Not needed any more.
M 
hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/ProtobufRpcServerEngine.java
  Remove clientVersion.  Not used.  Use IpcP instead of VP
D hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/ProtocolSignature.java
D hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java
  Remove.  Not used any more.
M  hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RequestContext.java
  Use IpcP instead of VP
M  base-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientEngine.java
  Use IpcP instead of VP
M  hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
  Comments.
M hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServerEngine.java
  Use IpcP instead of VP
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
M 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
M 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
M 
/hbase-server/src/test/java/org/apache/hadoop/hbase/security/token/TestTokenAuthentication.java
  Remove no longer needed methods from VersionedProtocol.
M hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  Fix eclipse warning.
M 
hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/RandomTimeoutRpcEngine.java
M hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java
M hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/TestIPC.java
M hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/TestProtoBufRpc.java
M 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestHMasterRPCException.java
  Use IpcP instead of VP
M hbase-server/src/test/protobuf/test_delayed_rpc.proto
  Turn off flag that asks for generation of Service (Service not defined)

> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

[jira] [Commented] (HBASE-7505) Server will hang when stopping cluster, caused by waiting for split threads

2013-01-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546652#comment-13546652
 ] 

Hadoop QA commented on HBASE-7505:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563541/7505-trunk%20v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.replication.TestReplicationWithCompression
  org.apache.hadoop.hbase.TestLocalHBaseCluster

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3926//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3926//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3926//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3926//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3926//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3926//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3926//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3926//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3926//console

This message is automatically generated.

> Server will hang when stopping cluster, caused by waiting for split threads
> ---
>
> Key: HBASE-7505
> URL: https://issues.apache.org/jira/browse/HBASE-7505
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.3
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: 7505-trunk v1.patch
>
>
> We will retry 100 times (about 3200 minutes) for 
> HRegionServer#postOpenDeployTasks now; see 
> HConnectionManager#setServerSideHConnectionRetries.
> However, when we stop the cluster, we wait for the split threads in 
> HRegionServer#join; if the META/ROOT server has already been stopped, a 
> split thread won't exit because it is still retrying 
> HRegionServer#postOpenDeployTasks

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-07 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546653#comment-13546653
 ] 

stack commented on HBASE-7479:
--

Here is the rb posting: https://reviews.apache.org/r/8880/

Patch is basic.  Removes VP and PS.  Replaces with IpcProtocol, a marker 
Interface.  That's it really.  Removes, in a few places, the methods that were 
in VP.

Review appreciated.  Thanks.

> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7515) Store.loadStoreFiles should close opened files if there's an exception

2013-01-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546654#comment-13546654
 ] 

Lars Hofhansl commented on HBASE-7515:
--

Is this only an 0.96 issue?

> Store.loadStoreFiles should close opened files if there's an exception
> --
>
> Key: HBASE-7515
> URL: https://issues.apache.org/jira/browse/HBASE-7515
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
> Fix For: 0.96.0
>
> Attachments: 7515.txt, 7515-v2.txt
>
>
> Related to HBASE-7513. If a RS is able to open a few store files in 
> {{Store.loadStoreFiles}} but one of them fails like in 7513, the opened files 
> won't be closed and file descriptors will remain in a CLOSED_WAIT state.
> The situation we encountered is that over the weekend one region was bounced 
> between >100 region servers and eventually they all started dying on "Too 
> many open files".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7479) Remove VersionedProtocol and ProtocolSignature from RPC

2013-01-07 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7479:
-

Hadoop Flags: Incompatible change
  Status: Patch Available  (was: Open)

> Remove VersionedProtocol and ProtocolSignature from RPC
> ---
>
> Key: HBASE-7479
> URL: https://issues.apache.org/jira/browse/HBASE-7479
> Project: HBase
>  Issue Type: Task
>  Components: IPC/RPC
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 7479.txt
>
>
> Replace with an innocuous "Protocol" Interface for now.  Will minimize 
> changes doing a replacement.  Implication is that we are no longer going to 
> do special "handling" based off protocol version.  See "Handling protocol 
> versions" - http://search-hadoop.com/m/6k7GUM028E/v=threaded thread and 
> HBASE-6521 for background.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7441) Make ClusterManager in IntegrationTestingUtility pluggable

2013-01-07 Thread Liu Shaohui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Shaohui updated HBASE-7441:
---

Attachment: HBASE-7441-trunk-v2.patch

Thanks for your reviews.

Main changes:
(1) Move the definition of HBASE_CLUSTER_MANAGER_CLASS into 
IntegrationTestingUtility, where it is used

(2) Add a constant, DEFAULT_HBASE_CLUSTER_MANAGER_CLASS, in 
IntegrationTestingUtility

> Make ClusterManager in IntegrationTestingUtility pluggable
> --
>
> Key: HBASE-7441
> URL: https://issues.apache.org/jira/browse/HBASE-7441
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.3
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
>  Labels: newbie, patch
> Fix For: 0.96.0, 0.94.5
>
> Attachments: HBASE-7441-0.94-v1.patch, HBASE-7441-trunk-v1.patch, 
> HBASE-7441-trunk-v2.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> After the patch HBASE-7009, we can use ChaosMonkey to test the HBase cluster.
> The ClusterManager uses passwordless ssh to stop/start the RS or master. To 
> support other cluster management tools, we need to make the clusterManager 
> in IntegrationTestingUtility pluggable.
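The "pluggable via configuration" pattern being discussed can be sketched as below. The config key, interface, and default class here are hypothetical stand-ins for HBASE_CLUSTER_MANAGER_CLASS and friends, not the patch's actual names: a class name is read from configuration and instantiated reflectively, falling back to the ssh-based default.

```java
import java.util.Properties;

// Illustrative sketch of configuration-driven plugin loading.
public class PluggableFactory {
    interface ClusterManager { String name(); }

    // Default implementation, analogous to the existing ssh-based manager.
    static class SshClusterManager implements ClusterManager {
        public String name() { return "ssh"; }
    }

    static ClusterManager create(Properties conf) throws Exception {
        // Fall back to the default class when the key is not configured.
        String cls = conf.getProperty("hbase.it.clustermanager.class",
                SshClusterManager.class.getName());
        return (ClusterManager) Class.forName(cls)
                .getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // No key configured, so the default ssh manager is created.
        System.out.println(create(new Properties()).name());
    }
}
```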

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

