[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232488#comment-13232488
 ] 

Hudson commented on HBASE-5568:
---

Integrated in HBase-0.92 #329 (See 
[https://builds.apache.org/job/HBase-0.92/329/])
HBASE-5568 Multi concurrent flushcache() for one region could cause data 
loss (Chunhui) (Revision 1302270)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java


> Multi concurrent flushcache() for one region could cause data loss
> --
>
> Key: HBASE-5568
> URL: https://issues.apache.org/jira/browse/HBASE-5568
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5568-90.patch, HBASE-5568-92v2.patch, 
> HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> HRegion#flushcache() can currently be called concurrently, through 
> HRegionServer#splitRegion or through HRegionServer#flushRegion invoked by HBaseAdmin.
> However, we found that if HRegion#internalFlushcache() is called concurrently by 
> multiple threads, HRegion.memstoreSize is calculated incorrectly.
> At the end of HRegion#internalFlushcache() we call 
> this.addAndGetGlobalMemstoreSize(-flushsize), but flushsize may not be the 
> actual memsize that was flushed to HDFS. This causes HRegion.memstoreSize to become 
> negative, which prevents the next flush when this region is closed.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569ea3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569ea3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569ea3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569ea3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569ea3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569ea3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a056e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true
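
The accounting error in the quoted log can be replayed with a small sketch. This is not HBase code: the class and method names are hypothetical, the sizes are the log values rounded to tenths of a megabyte (the third flush's 47.4k + 47.8k is rounded to 0.1m), and it assumes nothing else is written to the region in the meantime. Each flush subtracts the size it reported in "Finished memstore flush of ~X" rather than the sum of the "memsize=" values it actually wrote.

```java
import java.util.concurrent.atomic.AtomicLong;

public class FlushAccountingSketch {
    // Sizes in tenths of a megabyte, rounded from the quoted log.
    // Size each flush reported in "Finished memstore flush of ~X".
    static final long[] REPORTED = {1281, 1348, 68};
    // Sum of the per-store "memsize=" values actually written by each flush.
    static final long[] WRITTEN = {1281, 67, 1};

    // Hypothetical stand-in for HRegion.memstoreSize after all flushes finish.
    public static long remaining(long[] subtractedPerFlush) {
        long total = 0;
        for (long w : WRITTEN) {
            total += w; // assumption: the memstore held exactly what was written
        }
        AtomicLong memstoreSize = new AtomicLong(total);
        for (long s : subtractedPerFlush) {
            memstoreSize.addAndGet(-s);
        }
        return memstoreSize.get();
    }

    public static void main(String[] args) {
        System.out.println("subtracting reported sizes: " + remaining(REPORTED));
        System.out.println("subtracting written sizes:  " + remaining(WRITTEN));
    }
}
```

With these rounded inputs the first line prints -1348 (about -134.8m) and the second prints 0, which matches the negative memstoreSize described in the report.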

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdminis

[jira] [Updated] (HBASE-5599) The hbkc tool can not fix the six scenarios, it is NO_VERSION_FILE, NOT_IN_META_OR_DEPLOYED, NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, FIRST_REGION_STARTKEY_NOT_EMPTY, HOLE_IN

2012-03-19 Thread fulin wang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fulin wang updated HBASE-5599:
--

Summary: The hbkc tool can not fix the six scenarios, it is 
NO_VERSION_FILE, NOT_IN_META_OR_DEPLOYED, NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, 
FIRST_REGION_STARTKEY_NOT_EMPTY, HOLE_IN_REGION_CHAIN.  (was: The hbck tool can 
not fix the six scenarios, it is NO_VERSION_FILE, NOT_IN_META_OR_DEPLOYED, 
NOT_IN_META, NOT_IN_HDFS_OR_DEPLOYED, FIRST_REGION_STARTKEY_NOT_EMPTY, 
HOLE_IN_REGION_CHAIN.)

> The hbkc tool can not fix the six scenarios, it is NO_VERSION_FILE, 
> NOT_IN_META_OR_DEPLOYED, NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, 
> FIRST_REGION_STARTKEY_NOT_EMPTY, HOLE_IN_REGION_CHAIN.
> 
>
> Key: HBASE-5599
> URL: https://issues.apache.org/jira/browse/HBASE-5599
> Project: HBase
>  Issue Type: New Feature
>  Components: hbck
>Affects Versions: 0.90.6
>Reporter: fulin wang
> Fix For: 0.90.6
>
> Attachments: hbase-5599-0.90.patch
>
>
> The hbck tool cannot fix the following six scenarios.
> 1. Version file does not exist in root dir.
>    Fix: I try to create a version file with the 'FSUtils.setVersion' method.
>
> 2. [REGIONNAME][KEY] on HDFS, but not listed in META or deployed on any 
> region server.
>    Fix: I read the region info from the HDFS file and write it to the 
> '.META.' table.
>
> 3. [REGIONNAME][KEY] not in META, but deployed on [SERVERNAME]
>    Fix: I read the region info from the HDFS file and write it to the 
> '.META.' table.
>
> 4. [REGIONNAME] should not be deployed according to META, but is deployed on 
> [SERVERNAME]
>    Fix: Close this region.
>
> 5. First region should start with an empty key.  You need to  create a new 
> region and regioninfo in HDFS to plug the hole.
>    Fix: The region info is in neither HDFS nor .META., so an empty 
> region is created for this error.
>
> 6. There is a hole in the region chain between [KEY] and [KEY]. You need to 
> create a new regioninfo and region dir in hdfs to plug the hole.
>    Fix: The region info is in neither HDFS nor .META., so an empty region 
> is created for this hole.
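
Scenarios 5 and 6 above amount to a check over the sorted region chain. The following is only a sketch of that check, not the hbck implementation or the attached patch: the class name is hypothetical, keys are plain strings instead of byte arrays, and an empty string stands for an open-ended key.

```java
import java.util.ArrayList;
import java.util.List;

public class RegionChainSketch {
    // Returns the problems found in a list of {startKey, endKey} pairs
    // that is already sorted by start key. "" means an open-ended key.
    public static List<String> check(List<String[]> regions) {
        List<String> problems = new ArrayList<>();
        if (regions.isEmpty()) {
            return problems;
        }
        if (!regions.get(0)[0].isEmpty()) {
            problems.add("FIRST_REGION_STARTKEY_NOT_EMPTY");
        }
        for (int i = 1; i < regions.size(); i++) {
            String prevEnd = regions.get(i - 1)[1];
            String start = regions.get(i)[0];
            // A hole exists when the previous region ends before this one starts.
            if (!prevEnd.isEmpty() && prevEnd.compareTo(start) < 0) {
                problems.add("HOLE_IN_REGION_CHAIN " + prevEnd + ".." + start);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String[]> regions = new ArrayList<>();
        regions.add(new String[] {"b", "f"});
        regions.add(new String[] {"k", ""});
        System.out.println(check(regions));
    }
}
```

A repair along the lines described in the issue would then create an empty region for each reported gap; that step is not shown here.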

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5569) Do not collect deleted KVs when they are still in use by a scanner.

2012-03-19 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232513#comment-13232513
 ] 

nkeywal commented on HBASE-5569:


I stopped it after 2700 iterations (10 hours) with no error, so the patch seems 
to fix the issue.

> Do not collect deleted KVs when they are still in use by a scanner.
> ---
>
> Key: HBASE-5569
> URL: https://issues.apache.org/jira/browse/HBASE-5569
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5569-v2.txt, 5569-v3.txt, 5569-v4.txt, 5569.txt, 
> TestAtomicOperation-output.trunk_120313.rar
>
>
> I noticed this because TestAtomicOperation.testMultiRowMutationMultiThreads 
> fails rarely.
> The solution is similar to HBASE-2856, where expired KVs are not collected 
> when in use by a scanner.
> ---
> What I pieced together so far is that it is the *scanning* side that has 
> problems sometimes.
> Every time I see an assertion failure in the log I see this before:
> {quote}
> 2012-03-12 21:48:49,523 DEBUG [Thread-211] regionserver.StoreScanner(499): 
> Storescanner.peek() is changed where before = 
> rowB/colfamily11:qual1/75366/Put/vlen=6,and after = 
> rowB/colfamily11:qual1/75203/DeleteColumn/vlen=0
> {quote}
> The order of the Put and Delete is sometimes reversed.
> The test threads should always see exactly one KV. If the "before" was the 
> Put, the threads see 0 KVs; if the "before" was the Delete, the threads see 2 
> KVs.
> This debug message comes from StoreScanner's checkReseek. It seems we still 
> have some consistency issue with scanning sometimes :(





[jira] [Commented] (HBASE-5592) Make it easier to get a table from shell

2012-03-19 Thread Ben West (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232600#comment-13232600
 ] 

Ben West commented on HBASE-5592:
-

I would prefer HBASE-5548 as well. Fine with closing this.

> Make it easier to get a table from shell
> 
>
> Key: HBASE-5592
> URL: https://issues.apache.org/jira/browse/HBASE-5592
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Affects Versions: 0.94.0
>Reporter: Ben West
>Assignee: Ben West
>Priority: Trivial
>  Labels: shell
> Fix For: 0.92.2, 0.94.0
>
> Attachments: publicTable.patch
>
>
> The one argument constructor to HTable was removed at some point, which means 
> that you now have to pass in a Configuration to instantiate an HTable. This 
> is annoying for me when I create quick scripts.
> This JIRA is a tiny patch which lets you get an HTable instance in the shell 
> by doing
> {code}foo_table = @shell.hbase_table('foo').table{code}
> Basically, it is changing table to be a public member rather than a private 
> one.





[jira] [Commented] (HBASE-1841) If multiple of same key in an hfile and they span blocks, may miss the earlier keys on a lookup

2012-03-19 Thread Schubert Zhang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232632#comment-13232632
 ] 

Schubert Zhang commented on HBASE-1841:
---

Thank you, Ryan. I think this is an open technical discussion.
It seems it would really be a lot of work to change [) to (]. 

> If multiple of same key in an hfile and they span blocks, may miss the 
> earlier keys on a lookup
> ---
>
> Key: HBASE-1841
> URL: https://issues.apache.org/jira/browse/HBASE-1841
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Schubert Zhang
> Fix For: 0.90.0
>
> Attachments: HBASE-1841-step1-v2.patch, HBASE-1841-step2-v2.patch
>
>
> See HBASE-818 for description by Schubert Zhang -- discovered by him doing a 
> code review of hfile.





[jira] [Commented] (HBASE-5592) Make it easier to get a table from shell

2012-03-19 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232677#comment-13232677
 ] 

Jesse Yates commented on HBASE-5592:


Thanks for taking a look Ben (and being cool with the reversion). 





[jira] [Commented] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232686#comment-13232686
 ] 

Lars Hofhansl commented on HBASE-3996:
--

I am not opposed to having this in 0.94. Seems quite useful and performance 
related :)

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.96.0
>
> Attachments: 3996-v2.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.





[jira] [Commented] (HBASE-5569) Do not collect deleted KVs when they are still in use by a scanner.

2012-03-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232693#comment-13232693
 ] 

Lars Hofhansl commented on HBASE-5569:
--

Thanks N. Good news! Tests pass too. I'm going to wait for some other folks to 
test on their machines to be extra sure this time.

I need to be extra clear here:
This patch will prevent any deleted KVs from being collected upon flush or 
compaction if there is a scanner open with a readpoint smaller than the KV's 
memstoreTS (HBASE-2856 does the same for expired KVs).
Furthermore, this is only needed for mixed delete and put operations, although 
it will generally prevent a flush/compaction from pulling the rug out from under a 
scanner.

Personally, I think this is an important fix. However, I want to mention that 
the alternative is to remove the mutateRows functionality (obviously not my 
favorite choice), or to document that it only works with KEEP_DELETED_CELLS 
enabled (also not my favorite outcome).
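
The rule stated in this comment can be written as a single predicate. This is a minimal sketch, not the patch itself: the class and method names are hypothetical, and using Long.MAX_VALUE to mean "no scanner is open" is an assumption made here for illustration.

```java
public class DeleteRetentionSketch {
    // True when a deleted KV may be dropped during a flush or compaction.
    // smallestReadPoint is the lowest read point of any open scanner,
    // or Long.MAX_VALUE when no scanner is open (assumed convention).
    public static boolean mayCollect(long kvMemstoreTS, long smallestReadPoint) {
        // Keep the KV while some scanner has a read point below its memstoreTS.
        return kvMemstoreTS <= smallestReadPoint;
    }

    public static void main(String[] args) {
        System.out.println(mayCollect(10, 5));              // scanner is older than the KV
        System.out.println(mayCollect(5, 10));              // every scanner is newer than the KV
        System.out.println(mayCollect(5, Long.MAX_VALUE));  // no scanner open
    }
}
```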






[jira] [Commented] (HBASE-5335) Dynamic Schema Configurations

2012-03-19 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232697#comment-13232697
 ] 

Phabricator commented on HBASE-5335:


nspiegelberg has commented on the revision "[jira] [HBASE-5335] Dynamic Schema 
Config".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:947 Important 
notes:

  1. CompoundConfiguration is a derived class from Configuration
  2. there is a CompoundConfiguration.add(Configuration) function

HRegion.conf = CompoundConfiguration(BaseConf, HTD)

  if you passed HRegion.conf to the daughter region constructor on a split, the 
daughter region would have:

HRegion.conf
  = CompoundConfiguration(CompoundConfiguration(BaseConf, HTD), HTD)
  = CompoundConfiguration(BaseConf, HTD, HTD)

  We need to dedupe the HTD.
  src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:1 I 
guess the question is: do we want to refactor the existing API & move all conf 
classes under a conf directory?  I put this under util because I didn't want to 
clutter the main folder with a class that was only supposed to be used 
internally.  I considered putting it under the regionserver folder.  Maybe 
that's a better fit?
  src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java:169 yeah.  another 
option is making a copy constructor
  src/main/ruby/hbase/admin.rb:187 Basically, your schema would look like:

NAME => 'blah', BLOOMFILTER => ROWCOL,
ADVANCED => {"hbase.hstore.compaction.ratio" => "0.25"}

  I don't want to explain the notion of "ADVANCED" too much beyond HBase 
committers.  Basically, it's only a toggle for people who know what they're 
doing and aren't afraid to be power users and look at code.  If we get really 
common config patterns, we should pull them out to reserved keywords for common 
users and then map.  For example:

COMPACT_RATIO => 'hbase.hstore.compaction.ratio'

  Why make a config option that most people won't play with?  Because, as 
power users, we can iterate on functionality & help users.  There is now a 
workaround for a specific user's problem without modifying code and we don't 
have to have advanced deprecation strategies like we would with a reserved 
keyword.
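
The dedupe concern raised in the first inline comment (the HTD ending up in the compound twice after a split) can be sketched with a layered lookup. This is not the CompoundConfiguration class from the revision: the class name, the identity-based dedupe, and the "later layers override earlier ones" precedence are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LayeredConfSketch {
    private final List<Map<String, String>> layers = new ArrayList<>();

    // Adds a layer unless the same map object is already present, so that
    // re-adding the table descriptor for a daughter region is a no-op.
    public LayeredConfSketch add(Map<String, String> layer) {
        for (Map<String, String> existing : layers) {
            if (existing == layer) {
                return this;
            }
        }
        layers.add(layer);
        return this;
    }

    // Assumed precedence: later layers override earlier ones.
    public String get(String key) {
        for (int i = layers.size() - 1; i >= 0; i--) {
            String value = layers.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public int layerCount() {
        return layers.size();
    }

    public static void main(String[] args) {
        Map<String, String> baseConf = new HashMap<>();
        baseConf.put("hbase.hstore.compaction.ratio", "1.2"); // illustrative value
        Map<String, String> htd = new HashMap<>();
        htd.put("hbase.hstore.compaction.ratio", "0.25");
        LayeredConfSketch conf = new LayeredConfSketch().add(baseConf).add(htd).add(htd);
        System.out.println(conf.layerCount() + " layers, ratio = "
            + conf.get("hbase.hstore.compaction.ratio"));
    }
}
```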

REVISION DETAIL
  https://reviews.facebook.net/D2247


> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.





[jira] [Commented] (HBASE-1841) If multiple of same key in an hfile and they span blocks, may miss the earlier keys on a lookup

2012-03-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232699#comment-13232699
 ] 

Lars Hofhansl commented on HBASE-1841:
--

One more reason to finish HBASE-2600 to rid ourselves of getClosestRowBefore.





[jira] [Commented] (HBASE-5482) In 0.90, balancer algo leading to same region balanced twice and picking same region with Src and Destination as same RS.

2012-03-19 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232757#comment-13232757
 ] 

ramkrishna.s.vasudevan commented on HBASE-5482:
---

Committed to 0.90.  Thanks for the review Ted.

> In 0.90, balancer algo leading to same region balanced twice and picking same 
> region with Src and Destination as same RS.
> -
>
> Key: HBASE-5482
> URL: https://issues.apache.org/jira/browse/HBASE-5482
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.7
>
> Attachments: 5482-v2.txt, HBASE-5482_1.patch, HBASE-5482_2.patch
>
>
> There are two possible problems.
> -> When we populate regionsToMove while iterating over the server info in 
> descending order, there is a chance that the same region is added twice, 
> because in the first loop we randomize the regions, 
> whereas when neededRegions != 0 we just take the region at the 
> index and add it again. This may lead to the same region appearing twice in the 
> regionsToMove list.
> -> The other problem is that 
> when the first problem happens, there is a chance that 
> a regionToMove entry has the same src and destination, and that the same region 
> is picked every 5 mins.
> {code}
> for (Map.Entry<HServerInfo, List<HRegionInfo>> server :
>     serversByLoad.descendingMap().entrySet()) {
>   BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey());
>   int idx =
>     balanceInfo == null ? 0 : balanceInfo.getNextRegionForUnload();
>   if (idx >= server.getValue().size()) break;
>   HRegionInfo region = server.getValue().get(idx);
>   if (region.isMetaRegion()) continue; // Don't move meta regions.
>   regionsToMove.add(new RegionPlan(region, server.getKey(), null));
>   if (--neededRegions == 0) {
>     // No more regions needed, done shedding
>     break;
>   }
> }
> {code}
> If meta and root are on the two most loaded region servers (3 RSs in total), we 
> just skip the regions on those region servers and populate the region from the 
> least loaded RS.
> Then in the next loop we iterate from the least loaded server and also set 
> the destination to that same server.
> This leads to a condition where balancing happens every 5 mins and 
> src and dest are the same server.
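
Both symptoms described above (a region listed twice, and a plan whose source and destination are the same server) can be filtered out of a plan list after the fact. The sketch below only illustrates that idea; it is not the change made in the attached patches, and the Plan class is a hypothetical, simplified stand-in for RegionPlan that uses plain strings.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RegionPlanSketch {
    // Hypothetical, simplified plan: region name plus source and destination server.
    public static final class Plan {
        final String region;
        final String src;
        final String dest;

        public Plan(String region, String src, String dest) {
            this.region = region;
            this.src = src;
            this.dest = dest;
        }
    }

    // Drops plans that move a region onto its current server and keeps only
    // the first remaining plan for each region.
    public static List<Plan> sanitize(List<Plan> plans) {
        Set<String> seen = new HashSet<>();
        List<Plan> result = new ArrayList<>();
        for (Plan p : plans) {
            if (p.src.equals(p.dest)) {
                continue; // src and dest are the same RS: nothing to move
            }
            if (!seen.add(p.region)) {
                continue; // region already has a plan
            }
            result.add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Plan> plans = new ArrayList<>();
        plans.add(new Plan("r1", "rsA", "rsB"));
        plans.add(new Plan("r1", "rsA", "rsC")); // duplicate region
        plans.add(new Plan("r2", "rsC", "rsC")); // same src and dest
        for (Plan p : sanitize(plans)) {
            System.out.println(p.region + ": " + p.src + " -> " + p.dest);
        }
    }
}
```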





[jira] [Commented] (HBASE-5516) GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90.

2012-03-19 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232759#comment-13232759
 ] 

ramkrishna.s.vasudevan commented on HBASE-5516:
---

Please review this patch.  I will commit it tomorrow if it is fine.


> GZip leading to memory leak in 0.90.  Fix similar to HBASE-5387 needed for 
> 0.90.
> 
>
> Key: HBASE-5516
> URL: https://issues.apache.org/jira/browse/HBASE-5516
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.7
>
> Attachments: HBASE-5516_2_0.90.patch, HBASE-5516_3_0.90.patch
>
>
> Usage of GZip is leading to resident memory leak in 0.90.
> We need to have something similar to HBASE-5387 in 0.90. 





[jira] [Commented] (HBASE-5566) [89-fb] Region server can get stuck in getMaster on master failover

2012-03-19 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232762#comment-13232762
 ] 

Mikhail Bautin commented on HBASE-5566:
---

Code reviewed at: https://reviews.facebook.net/D2283

> [89-fb] Region server can get stuck in getMaster on master failover
> ---
>
> Key: HBASE-5566
> URL: https://issues.apache.org/jira/browse/HBASE-5566
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.89-fb
>Reporter: Prakash Khemani
>Assignee: Mikhail Bautin
>
> This is specific to the 89-fb master. We have a retry loop in 
> HRegionServer.getMaster where we do not read the location of the master from 
> ZK, so a region server can get stuck there on master failover. We need to add 
> a unit test to reliably catch this, and fix the bug.
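
The shape of the fix described above, re-reading the master location inside the retry loop, can be sketched as follows. This is not the 89-fb code or the committed patch: the class name is hypothetical, the ZooKeeper lookup and the connection attempt are passed in as plain functions, and sleeping between attempts is left out.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class MasterRetrySketch {
    // Returns the address that was connected to, or null after maxAttempts failures.
    public static String connect(Supplier<String> readMasterFromZk,
                                 Predicate<String> tryConnect,
                                 int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Re-read the location on every attempt so that a failover is noticed.
            String address = readMasterFromZk.get();
            if (address != null && tryConnect.test(address)) {
                return address;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulated failover: the first lookup returns the dead master.
        Iterator<String> lookups = Arrays.asList("old:60000", "new:60000").iterator();
        String master = connect(lookups::next, "new:60000"::equals, 5);
        System.out.println("connected to " + master);
    }
}
```

A loop that read the address once, before the first attempt, would keep retrying "old:60000" in this simulation, which is the stuck state the issue describes.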





[jira] [Resolved] (HBASE-5566) [89-fb] Region server can get stuck in getMaster on master failover

2012-03-19 Thread Mikhail Bautin (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin resolved HBASE-5566.
---

Resolution: Fixed

Patch committed internally, will be synced to 0.89-fb very soon.





[jira] [Commented] (HBASE-5569) Do not collect deleted KVs when they are still in use by a scanner.

2012-03-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232765#comment-13232765
 ] 

Lars Hofhansl commented on HBASE-5569:
--

If some more folks would run the tests in a loop with the patch applied that'd 
be of great help.





[jira] [Created] (HBASE-5600) Make Endpoint Coprocessors Available from Thrift

2012-03-19 Thread Ben West (Created) (JIRA)
Make Endpoint Coprocessors Available from Thrift


 Key: HBASE-5600
 URL: https://issues.apache.org/jira/browse/HBASE-5600
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: Ben West
Priority: Minor


Currently, the only way to access an endpoint coprocessor via thrift is to 
modify the schema and Thrift server for every coprocessor function. This is 
annoying. It should be possible to use your coprocessors without having to 
mangle HBase core code (since that's the point of coprocessors).






[jira] [Commented] (HBASE-5600) Make Endpoint Coprocessors Available from Thrift

2012-03-19 Thread Ben West (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232813#comment-13232813
 ] 

Ben West commented on HBASE-5600:
-

I'm not entirely sure how to accomplish this in full generality. The most 
obvious methods would involve sending Java code through Thrift, which is 
probably a no-go. However, here is one simple coprocessor call of mine that 
could probably be handled:

{code}
Batch.Call call = Batch.forMethod(IMyEndpoint.class, "getRow",
    rowKey.getBytes());
Map results = table.coprocessorExec(IMyEndpoint.class, null,
    null, call);
{code}

We could create a thrift method to take the name of the class, method, and an 
array of params and then call coprocessorExec. If this sounds reasonable, I can 
try to allocate some time for a patch.

Alternatively, we could just require that you write a client for your 
endpoints, and then thrift calls the client.
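
To make the first option above more concrete, here is a minimal sketch of the 
dispatch step only: given an interface name, a method name, and an array of 
params, find the method by reflection and invoke it. Everything in it 
(GenericEndpointDispatch, MyEndpoint, EchoEndpoint) is a hypothetical stand-in 
written for this illustration. It does not touch HBase or Thrift, and a real 
version would presumably hand the resolved call to coprocessorExec rather than 
invoke a local object.

```java
import java.lang.reflect.Method;
import java.nio.charset.StandardCharsets;

final class GenericEndpointDispatch {

    /** Hypothetical stand-in for an endpoint interface such as IMyEndpoint. */
    interface MyEndpoint {
        String getRow(byte[] rowKey);
    }

    /** Toy implementation so the sketch can run without a cluster. */
    static final class EchoEndpoint implements MyEndpoint {
        @Override
        public String getRow(byte[] rowKey) {
            return "row:" + new String(rowKey, StandardCharsets.UTF_8);
        }
    }

    /**
     * Looks up methodName on the named interface by name and parameter count,
     * then invokes it on target with the supplied params.
     */
    static Object invoke(Object target, String interfaceName,
                         String methodName, Object[] params) {
        try {
            Class<?> iface = Class.forName(interfaceName);
            if (!iface.isInstance(target)) {
                throw new IllegalArgumentException(
                    target.getClass().getName() + " does not implement " + interfaceName);
            }
            for (Method m : iface.getMethods()) {
                if (m.getName().equals(methodName)
                        && m.getParameterCount() == params.length) {
                    return m.invoke(target, params);
                }
            }
            throw new IllegalArgumentException(
                "No method " + methodName + " with " + params.length + " parameter(s)");
        } catch (ReflectiveOperationException e) {
            // Wrap checked reflection failures so callers see one exception type.
            throw new IllegalStateException("Dispatch failed", e);
        }
    }
}
```

Matching on name and parameter count alone is a simplification; overloaded 
endpoint methods would need the parameter types passed along as well.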

> Make Endpoint Coprocessors Available from Thrift
> 
>
> Key: HBASE-5600
> URL: https://issues.apache.org/jira/browse/HBASE-5600
> Project: HBase
>  Issue Type: Improvement
>  Components: thrift
>Reporter: Ben West
>Priority: Minor
>  Labels: thrift
>
> Currently, the only way to access an endpoint coprocessor via thrift is to 
> modify the schema and Thrift server for every coprocessor function. This is 
> annoying. It should be possible to use your coprocessors without having to 
> mangle HBase core code (since that's the point of coprocessors).





[jira] [Commented] (HBASE-4348) Add metrics for regions in transition

2012-03-19 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232819#comment-13232819
 ] 

jirapos...@reviews.apache.org commented on HBASE-4348:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4402/#review6076
---


Looks pretty good, just some spacing issues.

Are we sure that 60 seconds is the proper timeout to display "interesting" 
regions in transition?  Perhaps we should make this configurable?  (If yes, I'd 
also create a master msgInterval instead of reusing the regionserver one).


src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


The spacing looks wrong here in all the java code -- everywhere else in the 
code it looks like we use two spaces for an indent level, whereas here you are 
using tabs.

Also, the braces aren't lined up.

I don't see anything about spacing at this page, though:
http://hbase.apache.org/book/submitting.patches.html
Perhaps we should update it.



src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetrics.java


From the submitting patches page:
"Keep lines less than 80 characters."


- Gregory


On 2012-03-19 06:48:19, Himanshu Vashishtha wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4402/
bq.  ---
bq.  
bq.  (Updated 2012-03-19 06:48:19)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This patch is for adding Region in transition metrics to the HMaster 
metrics system. It also adds these metrics in the master UI, in the Region in 
transition section. I have attached the proposed new format in the jira 4348.
bq.  
bq.  
bq.  This addresses bug HBase-4348.
bq.  https://issues.apache.org/jira/browse/HBase-4348
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/jamon/org/apache/hadoop/hbase/tmpl/master/AssignmentManagerStatusTmpl.jamon 0dc0691 
bq.    src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java ae468ca 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java c4b4d30 
bq.    src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetrics.java 83abc52 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java d68ce33 
bq.  
bq.  Diff: https://reviews.apache.org/r/4402/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Ran on a 5-node cluster and killed region servers randomly to observe the 
changes in the RIT metrics as emitted by the Master's mxbean.
bq.  
bq.  mvn test passes without any failure.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Himanshu
bq.  
bq.



> Add metrics for regions in transition
> -
>
> Key: HBASE-4348
> URL: https://issues.apache.org/jira/browse/HBASE-4348
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.92.0
>Reporter: Todd Lipcon
>Assignee: Himanshu Vashishtha
>Priority: Minor
>  Labels: noob
> Attachments: 4348-metrics-v3.patch, 4348-v1.patch, 4348-v2.patch, 
> RITs.png, RegionInTransitions2.png, metrics-v2.patch
>
>
> The following metrics would be useful for monitoring the master:
> - the number of regions in transition
> - the number of regions in transition that have been in transition for more 
> than a minute
> - how many seconds has the oldest region-in-transition been in transition
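
As an illustration of the three metrics listed above, the following sketch 
computes them from a map of region name to the time the region entered 
transition. RitMetricsSketch and its fields are hypothetical names chosen for 
this example and are not taken from the patch. The threshold is a parameter 
here (60 seconds would match "more than a minute").

```java
import java.util.Map;

final class RitMetricsSketch {
    final int regionsInTransition;   // total regions currently in transition
    final int regionsOverThreshold;  // regions in transition longer than the threshold
    final long oldestAgeSeconds;     // age of the oldest region in transition

    private RitMetricsSketch(int total, int over, long oldestAgeSeconds) {
        this.regionsInTransition = total;
        this.regionsOverThreshold = over;
        this.oldestAgeSeconds = oldestAgeSeconds;
    }

    /**
     * @param ritStartMillis  region name mapped to the time it entered transition
     * @param nowMillis       current time, passed in so the result is deterministic
     * @param thresholdMillis age above which a region counts as "over threshold"
     */
    static RitMetricsSketch compute(Map<String, Long> ritStartMillis,
                                    long nowMillis, long thresholdMillis) {
        int over = 0;
        long oldestAgeMillis = 0L;
        for (long startMillis : ritStartMillis.values()) {
            // Clamp at zero in case of clock skew between writers and reader.
            long ageMillis = Math.max(0L, nowMillis - startMillis);
            if (ageMillis > thresholdMillis) {
                over++;
            }
            oldestAgeMillis = Math.max(oldestAgeMillis, ageMillis);
        }
        return new RitMetricsSketch(ritStartMillis.size(), over, oldestAgeMillis / 1000L);
    }
}
```

Passing the current time and the threshold as arguments keeps the computation 
easy to unit test and leaves the choice of threshold to the caller.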





[jira] [Commented] (HBASE-5521) Move compression/decompression to an encoder specific encoding context

2012-03-19 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232824#comment-13232824
 ] 

Phabricator commented on HBASE-5521:


mbautin has accepted the revision "HBASE-5521 [jira] Move 
compression/decompression to an encoder specific encoding context".

REVISION DETAIL
  https://reviews.facebook.net/D2097

BRANCH
  svn


> Move compression/decompression to an encoder specific encoding context
> --
>
> Key: HBASE-5521
> URL: https://issues.apache.org/jira/browse/HBASE-5521
> Project: HBase
>  Issue Type: Improvement
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Fix For: 0.96.0
>
> Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, 
> HBASE-5521.D2097.10.patch, HBASE-5521.D2097.2.patch, 
> HBASE-5521.D2097.3.patch, HBASE-5521.D2097.4.patch, HBASE-5521.D2097.5.patch, 
> HBASE-5521.D2097.6.patch, HBASE-5521.D2097.7.patch, HBASE-5521.D2097.8.patch, 
> HBASE-5521.D2097.9.patch
>
>
> As part of working on HBASE-5313, we want to add a new columnar 
> encoder/decoder. It makes sense to move compression to be part of 
> encoder/decoder:
> 1) a scanner for a columnar encoded block can do lazy decompression to a 
> specific part of a key value object
> 2) avoid an extra bytes copy from encoder to hblock-writer. 
> If there is no encoder specified for a writer, the HBlock.Writer will use a 
> default compression-context to do something very similar to today's code.





[jira] [Updated] (HBASE-5521) Move compression/decompression to an encoder specific encoding context

2012-03-19 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5521:
--

Attachment: 
HBASE-5521-jira-Move-compression-decompression-to-an-2012-03-19_12_12_32.patch

Attaching what has been committed.

> Move compression/decompression to an encoder specific encoding context
> --
>
> Key: HBASE-5521
> URL: https://issues.apache.org/jira/browse/HBASE-5521
> Project: HBase
>  Issue Type: Improvement
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Fix For: 0.96.0
>
> Attachments: 
> HBASE-5521-jira-Move-compression-decompression-to-an-2012-03-19_12_12_32.patch,
>  HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.10.patch, 
> HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch, HBASE-5521.D2097.4.patch, 
> HBASE-5521.D2097.5.patch, HBASE-5521.D2097.6.patch, HBASE-5521.D2097.7.patch, 
> HBASE-5521.D2097.8.patch, HBASE-5521.D2097.9.patch
>
>
> As part of working on HBASE-5313, we want to add a new columnar 
> encoder/decoder. It makes sense to move compression to be part of 
> encoder/decoder:
> 1) a scanner for a columnar encoded block can do lazy decompression to a 
> specific part of a key value object
> 2) avoid an extra bytes copy from encoder to hblock-writer. 
> If there is no encoder specified for a writer, the HBlock.Writer will use a 
> default compression-context to do something very similar to today's code.





[jira] [Commented] (HBASE-5521) Move compression/decompression to an encoder specific encoding context

2012-03-19 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232831#comment-13232831
 ] 

Hadoop QA commented on HBASE-5521:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12518927/HBASE-5521-jira-Move-compression-decompression-to-an-2012-03-19_12_12_32.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 19 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1225//console

This message is automatically generated.

> Move compression/decompression to an encoder specific encoding context
> --
>
> Key: HBASE-5521
> URL: https://issues.apache.org/jira/browse/HBASE-5521
> Project: HBase
>  Issue Type: Improvement
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Fix For: 0.96.0
>
> Attachments: 
> HBASE-5521-jira-Move-compression-decompression-to-an-2012-03-19_12_12_32.patch,
>  HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.10.patch, 
> HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch, HBASE-5521.D2097.4.patch, 
> HBASE-5521.D2097.5.patch, HBASE-5521.D2097.6.patch, HBASE-5521.D2097.7.patch, 
> HBASE-5521.D2097.8.patch, HBASE-5521.D2097.9.patch
>
>
> As part of working on HBASE-5313, we want to add a new columnar 
> encoder/decoder. It makes sense to move compression to be part of 
> encoder/decoder:
> 1) a scanner for a columnar encoded block can do lazy decompression to a 
> specific part of a key value object
> 2) avoid an extra bytes copy from encoder to hblock-writer. 
> If there is no encoder specified for a writer, the HBlock.Writer will use a 
> default compression-context to do something very similar to today's code.





[jira] [Commented] (HBASE-5521) Move compression/decompression to an encoder specific encoding context

2012-03-19 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232832#comment-13232832
 ] 

Phabricator commented on HBASE-5521:


mbautin has committed the revision "HBASE-5521 [jira] Move 
compression/decompression to an encoder specific encoding context".

REVISION DETAIL
  https://reviews.facebook.net/D2097

COMMIT
  https://reviews.facebook.net/rHBASE1302602


> Move compression/decompression to an encoder specific encoding context
> --
>
> Key: HBASE-5521
> URL: https://issues.apache.org/jira/browse/HBASE-5521
> Project: HBase
>  Issue Type: Improvement
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Fix For: 0.96.0
>
> Attachments: 
> HBASE-5521-jira-Move-compression-decompression-to-an-2012-03-19_12_12_32.patch,
>  HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.10.patch, 
> HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch, HBASE-5521.D2097.4.patch, 
> HBASE-5521.D2097.5.patch, HBASE-5521.D2097.6.patch, HBASE-5521.D2097.7.patch, 
> HBASE-5521.D2097.8.patch, HBASE-5521.D2097.9.patch
>
>
> As part of working on HBASE-5313, we want to add a new columnar 
> encoder/decoder. It makes sense to move compression to be part of 
> encoder/decoder:
> 1) a scanner for a columnar encoded block can do lazy decompression to a 
> specific part of a key value object
> 2) avoid an extra bytes copy from encoder to hblock-writer. 
> If there is no encoder specified for a writer, the HBlock.Writer will use a 
> default compression-context to do something very similar to today's code.





[jira] [Updated] (HBASE-5521) Move compression/decompression to an encoder specific encoding context

2012-03-19 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5521:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk.

> Move compression/decompression to an encoder specific encoding context
> --
>
> Key: HBASE-5521
> URL: https://issues.apache.org/jira/browse/HBASE-5521
> Project: HBase
>  Issue Type: Improvement
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Fix For: 0.96.0
>
> Attachments: 
> HBASE-5521-jira-Move-compression-decompression-to-an-2012-03-19_12_12_32.patch,
>  HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.10.patch, 
> HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch, HBASE-5521.D2097.4.patch, 
> HBASE-5521.D2097.5.patch, HBASE-5521.D2097.6.patch, HBASE-5521.D2097.7.patch, 
> HBASE-5521.D2097.8.patch, HBASE-5521.D2097.9.patch
>
>
> As part of working on HBASE-5313, we want to add a new columnar 
> encoder/decoder. It makes sense to move compression to be part of 
> encoder/decoder:
> 1) a scanner for a columnar encoded block can do lazy decompression to a 
> specific part of a key value object
> 2) avoid an extra bytes copy from encoder to hblock-writer. 
> If there is no encoder specified for a writer, the HBlock.Writer will use a 
> default compression-context to do something very similar to today's code.





[jira] [Updated] (HBASE-4337) Update HBase directory structure layout to be aligned with Hadoop

2012-03-19 Thread Giridharan Kesavan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giridharan Kesavan updated HBASE-4337:
--

Attachment: hbase-4337-7.patch

ported patch to work with the latest rc tag

> Update HBase directory structure layout to be aligned with Hadoop
> -
>
> Key: HBASE-4337
> URL: https://issues.apache.org/jira/browse/HBASE-4337
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.92.0
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: HBASE-4337-1.patch, HBASE-4337-2.patch, 
> HBASE-4337-3.patch, HBASE-4337-4.patch, HBASE-4337-5.patch, 
> HBASE-4337-6.patch, HBASE-4337.patch, hbase-4337-7.patch
>
>
> In HADOOP-6255, a proposal was made for common directory layout for Hadoop 
> ecosystem.  This jira is to track the necessary work for making HBase 
> directory structure aligned with Hadoop for better integration.





[jira] [Updated] (HBASE-5335) Dynamic Schema Configurations

2012-03-19 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5335:
---

Attachment: D2247.3.patch

nspiegelberg updated the revision "[jira] [HBASE-5335] Dynamic Schema Config".
Reviewers: JIRA, Kannan, stack, mbautin, Liyin

  (1) Addressed Ted's & Stack's second peer review
  (2) Added integration test
  (3) Added ability to remove an ADVANCED key once created (set value to nil)

REVISION DETAIL
  https://reviews.facebook.net/D2247

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java
  src/main/ruby/hbase/admin.rb
  src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/util/TestCompoundConfiguration.java


> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.
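
A rough sketch of the lookup this description implies: arbitrary key/value 
settings stored at the column-family and table level, consulted before the 
global configuration. LayeredSettings is a hypothetical class written for this 
example, and the precedence order (CF, then table, then global) is an 
assumption, not something stated in the issue or confirmed from the patch.

```java
import java.util.HashMap;
import java.util.Map;

final class LayeredSettings {
    private final Map<String, String> global = new HashMap<>();
    private final Map<String, String> table = new HashMap<>();
    private final Map<String, String> family = new HashMap<>();

    LayeredSettings setGlobal(String key, String value) {
        global.put(key, value);
        return this;
    }

    LayeredSettings setTable(String key, String value) {
        table.put(key, value);
        return this;
    }

    LayeredSettings setFamily(String key, String value) {
        family.put(key, value);
        return this;
    }

    /** Returns the most specific value for key, or defaultValue if unset everywhere. */
    String get(String key, String defaultValue) {
        // Assumed precedence: column family overrides table, table overrides global.
        if (family.containsKey(key)) {
            return family.get(key);
        }
        if (table.containsKey(key)) {
            return table.get(key);
        }
        if (global.containsKey(key)) {
            return global.get(key);
        }
        return defaultValue;
    }
}
```

With this shape, code that reads a setting does not need a reserved keyword 
per setting; it asks for a key and gets whichever layer defines it.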





[jira] [Commented] (HBASE-5335) Dynamic Schema Configurations

2012-03-19 Thread Nicolas Spiegelberg (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232889#comment-13232889
 ] 

Nicolas Spiegelberg commented on HBASE-5335:


Note: alterations to the CF schema can be made online.  Currently, alterations 
to the table-level schema require a disable-modify-enable cycle.  This is 
related to online schema design but should be trivial to alter.

> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.





[jira] [Commented] (HBASE-5569) Do not collect deleted KVs when they are still in use by a scanner.

2012-03-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232896#comment-13232896
 ] 

Lars Hofhansl commented on HBASE-5569:
--

Ran more variations of the test (different number of threads, loops, 
synchronized flushing or not). Each time I see a failure after 2-3 runs without 
the patch, and no failures with the patch after at least 20 iterations.


> Do not collect deleted KVs when they are still in use by a scanner.
> ---
>
> Key: HBASE-5569
> URL: https://issues.apache.org/jira/browse/HBASE-5569
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5569-v2.txt, 5569-v3.txt, 5569-v4.txt, 5569.txt, 
> TestAtomicOperation-output.trunk_120313.rar
>
>
> I noticed this because TestAtomicOperation.testMultiRowMutationMultiThreads 
> fails rarely.
> The solution is similar to HBASE-2856, where expired KVs are not collected 
> when in use by a scanner.
> ---
> What I pieced together so far is that it is the *scanning* side that has 
> problems sometimes.
> Every time I see an assertion failure in the log I see this before:
> {quote}
> 2012-03-12 21:48:49,523 DEBUG [Thread-211] regionserver.StoreScanner(499): 
> Storescanner.peek() is changed where before = 
> rowB/colfamily11:qual1/75366/Put/vlen=6,and after = 
> rowB/colfamily11:qual1/75203/DeleteColumn/vlen=0
> {quote}
> The order of the Put and the Delete is sometimes reversed.
> The test threads should always see exactly one KV. If the "before" was the 
> Put, the thread sees 0 KVs; if the "before" was the Delete, the thread sees 2 
> KVs.
> This debug message comes from StoreScanner.checkReseek. It seems we still 
> have some consistency issue with scanning sometimes :(





[jira] [Commented] (HBASE-4337) Update HBase directory structure layout to be aligned with Hadoop

2012-03-19 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232899#comment-13232899
 ] 

Hadoop QA commented on HBASE-4337:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12518939/hbase-4337-7.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1226//console

This message is automatically generated.

> Update HBase directory structure layout to be aligned with Hadoop
> -
>
> Key: HBASE-4337
> URL: https://issues.apache.org/jira/browse/HBASE-4337
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.92.0
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: HBASE-4337-1.patch, HBASE-4337-2.patch, 
> HBASE-4337-3.patch, HBASE-4337-4.patch, HBASE-4337-5.patch, 
> HBASE-4337-6.patch, HBASE-4337.patch, hbase-4337-7.patch
>
>
> In HADOOP-6255, a proposal was made for common directory layout for Hadoop 
> ecosystem.  This jira is to track the necessary work for making HBase 
> directory structure aligned with Hadoop for better integration.





[jira] [Commented] (HBASE-4348) Add metrics for regions in transition

2012-03-19 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232904#comment-13232904
 ] 

jirapos...@reviews.apache.org commented on HBASE-4348:
--



bq.  On 2012-03-19 19:05:08, Gregory Chanan wrote:
bq.  > Looks pretty good, just some spacing issues.
bq.  > 
bq.  > Are we sure that 60 seconds is the proper timeout to display 
bq.  > "interesting" regions in transition?  Perhaps we should make this configurable? 
bq.  > (If yes, I'd also create a master msgInterval instead of reusing the 
bq.  > regionserver one).

Thanks for the review. I will make these changes and revise the patch.


- Himanshu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4402/#review6076
---


On 2012-03-19 06:48:19, Himanshu Vashishtha wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4402/
bq.  ---
bq.  
bq.  (Updated 2012-03-19 06:48:19)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This patch is for adding Region in transition metrics to the HMaster 
metrics system. It also adds these metrics in the master UI, in the Region in 
transition section. I have attached the proposed new format in the jira 4348.
bq.  
bq.  
bq.  This addresses bug HBase-4348.
bq.  https://issues.apache.org/jira/browse/HBase-4348
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/jamon/org/apache/hadoop/hbase/tmpl/master/AssignmentManagerStatusTmpl.jamon 0dc0691 
bq.    src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java ae468ca 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java c4b4d30 
bq.    src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetrics.java 83abc52 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java d68ce33 
bq.  
bq.  Diff: https://reviews.apache.org/r/4402/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Ran on a 5-node cluster and killed region servers randomly to observe the 
changes in the RIT metrics as emitted by the Master's mxbean.
bq.  
bq.  mvn test passes without any failure.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Himanshu
bq.  
bq.



> Add metrics for regions in transition
> -
>
> Key: HBASE-4348
> URL: https://issues.apache.org/jira/browse/HBASE-4348
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.92.0
>Reporter: Todd Lipcon
>Assignee: Himanshu Vashishtha
>Priority: Minor
>  Labels: noob
> Attachments: 4348-metrics-v3.patch, 4348-v1.patch, 4348-v2.patch, 
> RITs.png, RegionInTransitions2.png, metrics-v2.patch
>
>
> The following metrics would be useful for monitoring the master:
> - the number of regions in transition
> - the number of regions in transition that have been in transition for more 
> than a minute
> - how many seconds has the oldest region-in-transition been in transition





[jira] [Created] (HBASE-5601) Add per-column-family data block cache hit ratios

2012-03-19 Thread Mikhail Bautin (Created) (JIRA)
Add per-column-family data block cache hit ratios
-

 Key: HBASE-5601
 URL: https://issues.apache.org/jira/browse/HBASE-5601
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin


In addition to the overall block cache hit ratio it would be extremely useful 
to have per-column-family data block cache hit ratio metrics.
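
One possible shape for such a metric, as a sketch only: keep a hit counter and 
a miss counter per column family and derive the ratio on read. 
PerFamilyCacheStats and its methods are hypothetical names for this example and 
do not correspond to existing HBase classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class PerFamilyCacheStats {

    /** Hit and miss counters for one column family. */
    private static final class Counts {
        final LongAdder hits = new LongAdder();
        final LongAdder misses = new LongAdder();
    }

    private final Map<String, Counts> byFamily = new ConcurrentHashMap<>();

    /** Records one data block lookup for the given column family. */
    void record(String family, boolean hit) {
        Counts c = byFamily.computeIfAbsent(family, f -> new Counts());
        if (hit) {
            c.hits.increment();
        } else {
            c.misses.increment();
        }
    }

    /** Returns hits / (hits + misses), or 0.0 if the family has no lookups yet. */
    double hitRatio(String family) {
        Counts c = byFamily.get(family);
        if (c == null) {
            return 0.0;
        }
        long hits = c.hits.sum();
        long total = hits + c.misses.sum();
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

LongAdder is used here because block cache lookups happen on many threads at 
once; whether that trade-off is right for the real metrics code is an open 
question.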





[jira] [Commented] (HBASE-50) Snapshot of table

2012-03-19 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232908#comment-13232908
 ] 

Jesse Yates commented on HBASE-50:
--

If no one is working on this, I'd like to pick up shepherding Chongxin's 
original patch (updating to trunk, completion of features, etc) into trunk.

> Snapshot of table
> -
>
> Key: HBASE-50
> URL: https://issues.apache.org/jira/browse/HBASE-50
> Project: HBase
>  Issue Type: New Feature
>Reporter: Billy Pearson
>Assignee: Li Chongxin
>Priority: Minor
>  Labels: gsoc
> Attachments: HBase Snapshot Design Report V2.pdf, HBase Snapshot 
> Design Report V3.pdf, HBase Snapshot Implementation Plan.pdf, Snapshot Class 
> Diagram.png
>
>
> Having an option to take a snapshot of a table would be very useful in 
> production.
> What I would like this option to do is merge all the data into 
> one or more files stored in the same folder on the dfs. This way we could 
> save data in case of a software bug in hadoop or user code. 
> The other advantage would be the ability to export a table to multiple 
> locations. 
> Say I had a read_only table that must be online. I could take a snapshot of 
> it when needed and export it to a separate data center and have it loaded 
> there, and then I would have it online at multiple data centers for load 
> balancing and failover.
> I understand that hadoop removes the need for backups to protect 
> against failed servers, but this does not protect us from software bugs that 
> might delete or alter data in ways we did not plan. We should have a way to 
> roll back a dataset.





[jira] [Commented] (HBASE-4348) Add metrics for regions in transition

2012-03-19 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232911#comment-13232911
 ] 

jirapos...@reviews.apache.org commented on HBASE-4348:
--



bq.  On 2012-03-19 19:05:08, Gregory Chanan wrote:
bq.  > Looks pretty good, just some spacing issues.
bq.  > 
bq.  > Are we sure that 60 seconds is the proper timeout to display 
bq.  > "interesting" regions in transition?  Perhaps we should make this configurable? 
bq.  > (If yes, I'd also create a master msgInterval instead of reusing the 
bq.  > regionserver one).
bq.  
bq.  Himanshu Vashishtha wrote:
bq.  Thanks for the review. I will make these changes and revise the patch.

Note that the 60-second value is taken from the jira description and is 
not read from any property. I can make it configurable, though.
The region server property (hbase.regionserver.msginterval, default 3 sec) 
that I used controls how often the metrics are emitted. Should that be 
different for Master and RS?


- Himanshu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4402/#review6076
---


On 2012-03-19 06:48:19, Himanshu Vashishtha wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4402/
bq.  ---
bq.  
bq.  (Updated 2012-03-19 06:48:19)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This patch is for adding Region in transition metrics to the HMaster 
bq.  metrics system. It also adds these metrics in the master ui, in the Region in 
bq.  transition section. I have attached the proposed new format in the jira 4348.
bq.  
bq.  
bq.  This addresses bug HBase-4348.
bq.  https://issues.apache.org/jira/browse/HBase-4348
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/jamon/org/apache/hadoop/hbase/tmpl/master/AssignmentManagerStatusTmpl.jamon 0dc0691 
bq.    src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java ae468ca 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java c4b4d30 
bq.    src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetrics.java 83abc52 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java d68ce33 
bq.  
bq.  Diff: https://reviews.apache.org/r/4402/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Ran on a 5-node cluster and killed region servers randomly to observe the 
bq.  changes in the RIT metrics as emitted by the Master's mxbean.
bq.  
bq.  mvn test passes without any failure.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Himanshu
bq.  
bq.



> Add metrics for regions in transition
> -
>
> Key: HBASE-4348
> URL: https://issues.apache.org/jira/browse/HBASE-4348
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.92.0
>Reporter: Todd Lipcon
>Assignee: Himanshu Vashishtha
>Priority: Minor
>  Labels: noob
> Attachments: 4348-metrics-v3.patch, 4348-v1.patch, 4348-v2.patch, 
> RITs.png, RegionInTransitions2.png, metrics-v2.patch
>
>
> The following metrics would be useful for monitoring the master:
> - the number of regions in transition
> - the number of regions in transition that have been in transition for more 
> than a minute
> - how many seconds has the oldest region-in-transition been in transition
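
The three metrics listed above could be computed roughly as follows. This is a 
minimal sketch, not the patch under review: the class name RitMetricsSketch, 
the map from region name to transition start time, and the field names are 
assumptions made for illustration. Only the 60-second cutoff comes from the 
issue description ("more than a minute"), and it is defined in exactly one 
place so that it stays easy to change or to make configurable later.

```java
import java.util.Map;

/** Hypothetical holder for the three region-in-transition (RIT) metrics. */
final class RitMetricsSketch {
    /** Cutoff after which a transition counts as "long"; defined once. */
    static final long STUCK_THRESHOLD_MS = 60 * 1000L;

    final int total;              // number of regions in transition
    final int overThreshold;      // regions in transition for more than the cutoff
    final long oldestAgeSeconds;  // age of the oldest transition, in seconds

    private RitMetricsSketch(int total, int overThreshold, long oldestAgeSeconds) {
        this.total = total;
        this.overThreshold = overThreshold;
        this.oldestAgeSeconds = oldestAgeSeconds;
    }

    /**
     * @param transitionStartMs region name mapped to the time (ms) its transition began
     * @param nowMs             current time in ms, passed in so the method is testable
     */
    static RitMetricsSketch compute(Map<String, Long> transitionStartMs, long nowMs) {
        int over = 0;
        long oldestAgeMs = 0L;
        for (long startMs : transitionStartMs.values()) {
            long ageMs = nowMs - startMs;
            if (ageMs > STUCK_THRESHOLD_MS) {
                over++;
            }
            oldestAgeMs = Math.max(oldestAgeMs, ageMs);
        }
        return new RitMetricsSketch(transitionStartMs.size(), over, oldestAgeMs / 1000L);
    }
}
```

Passing the current time as a parameter, rather than reading the clock inside 
compute(), is a design choice of this sketch that keeps the calculation 
deterministic in a unit test.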





[jira] [Updated] (HBASE-5570) Compression tool section is referring to wrong link in HBase Book.

2012-03-19 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5570:
-

Status: Patch Available  (was: Open)

> Compression tool section is referring to wrong link in HBase Book.
> --
>
> Key: HBASE-5570
> URL: https://issues.apache.org/jira/browse/HBASE-5570
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>Reporter: Laxman
>Assignee: Doug Meil
>Priority: Trivial
>  Labels: documentaion
> Attachments: ops_mgt_hbase_5570.xml.patch
>
>
> http://hbase.apache.org/book/ops_mgt.html#compression.tool
> The above section refers to itself (recursively) in the HBase book.
> This needs to be corrected.





[jira] [Updated] (HBASE-5570) Compression tool section is referring to wrong link in HBase Book.

2012-03-19 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5570:
-

Attachment: ops_mgt_hbase_5570.xml.patch

> Compression tool section is referring to wrong link in HBase Book.
> --
>
> Key: HBASE-5570
> URL: https://issues.apache.org/jira/browse/HBASE-5570
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>Reporter: Laxman
>Assignee: Doug Meil
>Priority: Trivial
>  Labels: documentaion
> Attachments: ops_mgt_hbase_5570.xml.patch
>
>
> http://hbase.apache.org/book/ops_mgt.html#compression.tool
> The above section refers to itself (recursively) in the HBase book.
> This needs to be corrected.





[jira] [Updated] (HBASE-5570) Compression tool section is referring to wrong link in HBase Book.

2012-03-19 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5570:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Compression tool section is referring to wrong link in HBase Book.
> --
>
> Key: HBASE-5570
> URL: https://issues.apache.org/jira/browse/HBASE-5570
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>Reporter: Laxman
>Assignee: Doug Meil
>Priority: Trivial
>  Labels: documentaion
> Attachments: ops_mgt_hbase_5570.xml.patch
>
>
> http://hbase.apache.org/book/ops_mgt.html#compression.tool
> The above section refers to itself (recursively) in the HBase book.
> This needs to be corrected.





[jira] [Commented] (HBASE-5335) Dynamic Schema Configurations

2012-03-19 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232915#comment-13232915
 ] 

Phabricator commented on HBASE-5335:


mbautin has commented on the revision "[jira] [HBASE-5335] Dynamic Schema 
Config".

  A few initial comments (unfortunately on an earlier version of the revision).

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:748 Use 
containsKey
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:753 Are we 
trying to make the output jruby-parseable? If so, we need to take care of 
escaping embedded single quotes here.
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:760 
HConstants.ADVANCED sounds confusing. I think we need that constant to have a 
more descriptive name.
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:773-775 See my 
earlier comment about escaping embedded single quotes. The escaping method also 
needs to be shared between all callsites.
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:783 Is it 
possible to create the unmodifiable map once instead of every time this is 
called?
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:41 I might have 
missed some earlier discussion, but why exactly is a Guava dependency a bad 
thing? Guava is licensed under Apache License 2.0 according to 
http://code.google.com/p/guava-libraries/.
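
The request for one escaping method shared between all call sites could look 
something like the sketch below. It assumes the descriptor output is emitted as 
Ruby single-quoted string literals, where only the backslash and the single 
quote need a preceding backslash. The class and method names are hypothetical 
and are not part of HColumnDescriptor.

```java
/** Hypothetical shared helper for emitting Ruby single-quoted literals. */
final class RubyQuoteSketch {
    private RubyQuoteSketch() {
    }

    /**
     * Wraps raw in single quotes, escaping backslashes and embedded single
     * quotes so the result parses as one Ruby single-quoted string.
     */
    static String toRubySingleQuoted(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 2);
        sb.append('\'');
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\\' || c == '\'') {
                sb.append('\\'); // escape character required before \ and '
            }
            sb.append(c);
        }
        sb.append('\'');
        return sb.toString();
    }
}
```

Keeping the escaping in one static method means a later change of quoting 
rules only has to be made in one place.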

REVISION DETAIL
  https://reviews.facebook.net/D2247


> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.
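
One way to picture arbitrary KV settings in the schema is a lookup that 
consults the most specific scope first and falls back to the site-wide 
configuration. The sketch below only illustrates that idea with plain maps. 
The class name, the three-layer order (column family, then table, then site) 
and the keys used with it are assumptions, not the design in the attached 
patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical layered lookup: column family, then table, then site config. */
final class LayeredSettingsSketch {
    /** Layers ordered from most specific to least specific. */
    private final List<Map<String, String>> layers = new ArrayList<Map<String, String>>();

    LayeredSettingsSketch(Map<String, String> columnFamily,
                          Map<String, String> table,
                          Map<String, String> site) {
        layers.add(columnFamily);
        layers.add(table);
        layers.add(site);
    }

    /** Returns the value from the first layer that defines key, else defaultValue. */
    String get(String key, String defaultValue) {
        for (Map<String, String> layer : layers) {
            String value = layer.get(key);
            if (value != null) {
                return value;
            }
        }
        return defaultValue;
    }
}
```

With such a lookup, a setting can be tried on a single table or column family 
without changing the cluster-wide value.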





[jira] [Commented] (HBASE-4348) Add metrics for regions in transition

2012-03-19 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232932#comment-13232932
 ] 

jirapos...@reviews.apache.org commented on HBASE-4348:
--



bq.  On 2012-03-19 19:05:08, Gregory Chanan wrote:
bq.  > Looks pretty good, just some spacing issues.
bq.  > 
bq.  > Are we sure that 60 seconds is the proper timeout to display 
bq.  > "interesting" regions in transition?  Perhaps we should make this configurable? 
bq.  > (If yes, I'd also create a master msgInterval instead of reusing the 
bq.  > regionserver one).
bq.  
bq.  Himanshu Vashishtha wrote:
bq.  Thanks for the review. I will make these changes and revise the patch.
bq.  
bq.  Himanshu Vashishtha wrote:
bq.  Note that the 60-second value is taken from the jira description and 
bq.  is not read from any property. I can make it configurable, though.
bq.  The region server property (hbase.regionserver.msginterval, default 
bq.  3 sec) that I used controls how often the metrics are emitted. Should 
bq.  that be different for Master and RS?

I know, I was just asking a question because I don't have much operational 
experience.  If you or others think 60 seconds is a good cutoff and it doesn't 
need to be configurable, that sounds good to me.
If you are not going to make it a property, you should only have it calculated 
in one place so it is easy to change :).


- Gregory


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4402/#review6076
---


On 2012-03-19 06:48:19, Himanshu Vashishtha wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4402/
bq.  ---
bq.  
bq.  (Updated 2012-03-19 06:48:19)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This patch is for adding Region in transition metrics to the HMaster 
bq.  metrics system. It also adds these metrics in the master ui, in the Region in 
bq.  transition section. I have attached the proposed new format in the jira 4348.
bq.  
bq.  
bq.  This addresses bug HBase-4348.
bq.  https://issues.apache.org/jira/browse/HBase-4348
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/jamon/org/apache/hadoop/hbase/tmpl/master/AssignmentManagerStatusTmpl.jamon 0dc0691 
bq.    src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java ae468ca 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java c4b4d30 
bq.    src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetrics.java 83abc52 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java d68ce33 
bq.  
bq.  Diff: https://reviews.apache.org/r/4402/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Ran on a 5-node cluster and killed region servers randomly to observe the 
bq.  changes in the RIT metrics as emitted by the Master's mxbean.
bq.  
bq.  mvn test passes without any failure.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Himanshu
bq.  
bq.



> Add metrics for regions in transition
> -
>
> Key: HBASE-4348
> URL: https://issues.apache.org/jira/browse/HBASE-4348
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.92.0
>Reporter: Todd Lipcon
>Assignee: Himanshu Vashishtha
>Priority: Minor
>  Labels: noob
> Attachments: 4348-metrics-v3.patch, 4348-v1.patch, 4348-v2.patch, 
> RITs.png, RegionInTransitions2.png, metrics-v2.patch
>
>
> The following metrics would be useful for monitoring the master:
> - the number of regions in transition
> - the number of regions in transition that have been in transition for more 
> than a minute
> - how many seconds has the oldest region-in-transition been in transition





[jira] [Commented] (HBASE-3725) HBase increments from old value after delete and write to disk

2012-03-19 Thread Vaibhav Puranik (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232940#comment-13232940
 ] 

Vaibhav Puranik commented on HBASE-3725:


We encountered the exact same issue in the @GumGum production environment. We are 
running 0.90.5. I hope this gets fixed ASAP.

> HBase increments from old value after delete and write to disk
> --
>
> Key: HBASE-3725
> URL: https://issues.apache.org/jira/browse/HBASE-3725
> Project: HBase
>  Issue Type: Bug
>  Components: io, regionserver
>Affects Versions: 0.90.1
>Reporter: Nathaniel Cook
>Assignee: Jonathan Gray
> Attachments: HBASE-3725-Test-v1.patch, HBASE-3725-v3.patch, 
> HBASE-3725.patch
>
>
> Deleted row values are sometimes used for starting points on new increments.
> To reproduce:
> Create a row "r". Set column "x" to some default value.
> Force hbase to write that value to the file system (such as restarting the 
> cluster).
> Delete the row.
> Call table.incrementColumnValue with "some_value"
> Get the row.
> The returned value in the column was incremented from the old value before 
> the row was deleted instead of being initialized to "some_value".
> Code to reproduce:
> {code}
> import java.io.IOException;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.HColumnDescriptor;
> import org.apache.hadoop.hbase.HTableDescriptor;
> import org.apache.hadoop.hbase.client.Delete;
> import org.apache.hadoop.hbase.client.Get;
> import org.apache.hadoop.hbase.client.HBaseAdmin;
> import org.apache.hadoop.hbase.client.HTableInterface;
> import org.apache.hadoop.hbase.client.HTablePool;
> import org.apache.hadoop.hbase.client.Increment;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.util.Bytes;
> public class HBaseTestIncrement
> {
>   static String tableName  = "testIncrement";
>   static byte[] infoCF = Bytes.toBytes("info");
>   static byte[] rowKey = Bytes.toBytes("test-rowKey");
>   static byte[] newInc = Bytes.toBytes("new");
>   static byte[] oldInc = Bytes.toBytes("old");
>   /**
>    * This code reproduces a bug with increment column values in hbase
>    * Usage: First run part one by passing '1' as the first arg
>    *        Then restart the hbase cluster so it writes everything to disk
>    *        Run part two by passing '2' as the first arg
>    *
>    * This will result in the old deleted data being found and used for 
>    * the increment calls
>    *
>    * @param args
>    * @throws IOException
>    */
>   public static void main(String[] args) throws IOException
>   {
>   if("1".equals(args[0]))
>   partOne();
>   if("2".equals(args[0]))
>   partTwo();
>   if ("both".equals(args[0]))
>   {
>   partOne();
>   partTwo();
>   }
>   }
>   /**
>    * Creates a table and increments a column value 10 times by 10 each 
>    * time.
>    * Results in a value of 100 for the column
>    *
>    * @throws IOException
>    */
>   static void partOne()throws IOException
>   {
>   Configuration conf = HBaseConfiguration.create();
>   HBaseAdmin admin = new HBaseAdmin(conf);
>   HTableDescriptor tableDesc = new HTableDescriptor(tableName);
>   tableDesc.addFamily(new HColumnDescriptor(infoCF));
>   if(admin.tableExists(tableName))
>   {
>   admin.disableTable(tableName);
>   admin.deleteTable(tableName);
>   }
>   admin.createTable(tableDesc);
>   HTablePool pool = new HTablePool(conf, Integer.MAX_VALUE);
>   HTableInterface table = pool.getTable(Bytes.toBytes(tableName));
>   //Increment uninitialized column
>   for (int j = 0; j < 10; j++)
>   {
>   table.incrementColumnValue(rowKey, infoCF, oldInc, 
> (long)10);
>   Increment inc = new Increment(rowKey);
>   inc.addColumn(infoCF, newInc, (long)10);
>   table.increment(inc);
>   }
>   Get get = new Get(rowKey);
>   Result r = table.get(get);
>   System.out.println("initial values: new " + 
> Bytes.toLong(r.getValue(infoCF, newInc)) + " old " + 
> Bytes.toLong(r.getValue(infoCF, oldInc)));
>   }
>   /**
>* First deletes the data then increments the column 10 times by 1 each 
> time
>*
>* Should result in a value of 10 but it doesn't, i

[jira] [Commented] (HBASE-5335) Dynamic Schema Configurations

2012-03-19 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232946#comment-13232946
 ] 

Phabricator commented on HBASE-5335:


nspiegelberg has commented on the revision "[jira] [HBASE-5335] Dynamic Schema 
Config".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:760 actually, 
the purpose was to dissuade people from using this unless they know what 
they're doing.  we don't want people randomly putting keys in here without 
looking at the source code and then wondering why it doesn't work.  make sense? 
do you have another suggestion?
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:753 we don't 
currently have any config variables with single quotes, do we?  it's much more 
developer-controlled than user input.  I guess I should do some basic 
sanitization for the user error case.  the important ability is that the user 
can mistakenly enter a key with a single quote and then delete it
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:783 this is a 
shallow pointer copy, not a deep KV copy
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:41 @mbautin: 
ted's point is that this will be in client code, so every app server & 
MapReduce cluster would need to have the Guava dependency installed versus just 
the HBase server deployment.
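
The exchange about line 783 turns on the fact that 
java.util.Collections.unmodifiableMap returns a read-only view over the 
backing map rather than a copy of its entries. The sketch below shows both 
points: the wrapper is cheap, and it can also be created once and reused. 
The class and method names are hypothetical stand-ins, not the actual 
HColumnDescriptor fields.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical descriptor that exposes its values through one cached view. */
final class DescriptorValuesSketch {
    private final Map<String, String> values = new HashMap<String, String>();

    /** Created once; it is a view, so later setValue() calls show through it. */
    private final Map<String, String> readOnlyView = Collections.unmodifiableMap(values);

    void setValue(String key, String value) {
        values.put(key, value);
    }

    /** Returns the same read-only view on every call; no entries are copied. */
    Map<String, String> getValues() {
        return readOnlyView;
    }
}
```

Whether caching the view is worth the extra field depends on how often 
getValues() is called, which is the judgment the reviewers discuss.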

REVISION DETAIL
  https://reviews.facebook.net/D2247


> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.





[jira] [Commented] (HBASE-5335) Dynamic Schema Configurations

2012-03-19 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232953#comment-13232953
 ] 

Phabricator commented on HBASE-5335:


mbautin has commented on the revision "[jira] [HBASE-5335] Dynamic Schema 
Config".

  Nicolas: Looks good! Some more comments inline.

  Also, we now have lint enabled. Could you please re-run "mvn -Darc 
initialize", then do "arc lint" or "arc diff --preview" and address lint 
comments? Thanks!


INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/HConstants.java:361 This constant needs 
a better name. Probably some of the constants above it do, too.
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:530 Is it 
possible to reduce code duplication between here and HColumnDescriptor?
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:549 single quote 
escaping
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:579-581 single 
quote escaping
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:367 Looks similar 
to the corresponding function in HColumnDescriptor. Is it possible to reuse 
code between the two? (getValues() would be a bigger case for that.)
  src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:42 Add 

  src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:222-229 
Why exactly do we have to copy-and-paste the code instead of using composition 
or inheritance?

  If there is a bug in Configuration and it is fixed there, it will not be 
reflected here, which is indeed somewhat "tragic".
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4246 It 
would be great to avoid adding more tests to TestFromClientSide and create a 
separate test class instead.
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4247 
This probably should not use javadoc syntax.
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4312 Is 
it possible to wait for an event instead of a specific amount of time?
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:753 In that case 
we should probably sanitize user input for single quotes somewhere else, 
wherever the user is supposed to tweak configuration values in real time. 
However, I think it is easier to escape single quotes here. It would also be 
good to parse the output value of this function with jruby in a test.
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4322 
Create an HConstant for this key
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4292 
Sleep times should be according to 
http://hbase.apache.org/book.html#hbase.tests.writing . We should probably have 
a constant for each type of sleep time. Quoting the linked page from HBase book:

  > Whenever possible, tests should not use Thread.sleep, but rather waiting 
for the real event they need. This is faster and clearer for the reader. Tests 
should not do a Thread.sleep without testing an ending condition. This allows 
understanding what the test is waiting for. Moreover, the test will work 
whatever the machine performance is. Sleep should be minimal to be as fast as 
possible. Waiting for a variable should be done in a 40ms sleep loop. Waiting 
for a socket operation should be done in a 200 ms sleep loop.
  src/test/java/org/apache/hadoop/hbase/util/TestCompoundConfiguration.java:33 
Can this be private?
  
src/test/java/org/apache/hadoop/hbase/util/TestCompoundConfiguration.java:43-46 
Is this override necessary?
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:760 I guess we 
need a javadoc saying that ADVANCED is not for general-purpose use, and that it 
is meant for use as an HColumnDescriptor key, then.
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:783 OK, if you 
think we won't be calling this a lot, this is fine with me.
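
The testing guidance quoted above (wait for the real event, always test an 
ending condition, poll a variable in a 40 ms loop) can be captured in a small 
helper. This is a minimal sketch under stated assumptions: WaitForSketch and 
waitFor are hypothetical names, not an existing HBase test utility, and the 
use of java.util.function.BooleanSupplier assumes Java 8 or later.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical test helper: poll a condition instead of sleeping blindly. */
final class WaitForSketch {
    /** Poll interval for waiting on a variable, per the quoted guideline. */
    static final long POLL_INTERVAL_MS = 40L;

    private WaitForSketch() {
    }

    /**
     * Polls condition until it holds or timeoutMs elapses.
     *
     * @return true if the condition became true, false on timeout or interrupt
     */
    static boolean waitFor(long timeoutMs, BooleanSupplier condition) {
        long deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadlineNanos >= 0) {
                return false; // ending condition never observed
            }
            try {
                Thread.sleep(POLL_INTERVAL_MS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

A test would then assert on the return value, so a timeout fails loudly 
instead of letting later assertions run against stale state.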

REVISION DETAIL
  https://reviews.facebook.net/D2247


> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tune

[jira] [Resolved] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

2012-03-19 Thread Ted Yu (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-5568.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

Integrated to 0.92

> Multi concurrent flushcache() for one region could cause data loss
> --
>
> Key: HBASE-5568
> URL: https://issues.apache.org/jira/browse/HBASE-5568
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5568-90.patch, HBASE-5568-92v2.patch, 
> HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> HRegion#flushcache() can now be called concurrently through 
> HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find that if HRegion#internalFlushcache() is called concurrently by 
> multiple threads, HRegion.memstoreSize is calculated incorrectly.
> At the end of HRegion#internalFlushcache(), we do 
> this.addAndGetGlobalMemstoreSize(-flushsize), but flushsize may not be the 
> actual memsize that was flushed to hdfs. This makes HRegion.memstoreSize 
> negative and prevents the next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Started memstore flush for 
> writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Added 
> hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, 
> memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Started memstore flush for 
> writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Added 
> hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, 
> memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Finished memstore flush of ~128.1m for region 
> writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction 
> requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Started memstore flush for 
> writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Added 
> hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, 
> memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Added 
> hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, 
> memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Finished memstore flush of ~134.8m for region 
> writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction 
> requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Added 
> hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, 
> memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Added 
> hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, 
> memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Finished memstore flush of ~6.8m for region 
> writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction 
> requested=true
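
The accounting problem described above can be reproduced in a simplified 
model. The sketch below is not HRegion code: the class, fields and the fixed 
interleaving are assumptions chosen to make the effect deterministic. Two 
flushes both read the memstore size before either has taken its snapshot. If 
each then subtracts the size it read, the counter goes negative. If each 
subtracts only the bytes it really flushed, the counter returns to zero.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Simplified, hypothetical model of memstore size accounting during flushes. */
final class FlushAccountingSketch {
    /** Size counter, analogous in spirit to the memstoreSize in the report. */
    private final AtomicLong memstoreSize = new AtomicLong();
    /** Bytes really buffered and not yet taken by a flush snapshot. */
    private long buffered;

    void write(long bytes) {
        buffered += bytes;
        memstoreSize.addAndGet(bytes);
    }

    /** What a flush observes when it starts. */
    long readSizeAtFlushStart() {
        return memstoreSize.get();
    }

    /** Takes everything currently buffered and reports how much was flushed. */
    long snapshotAndFlush() {
        long flushed = buffered;
        buffered = 0L;
        return flushed;
    }

    /**
     * Replays one fixed interleaving of two flushes, A and B.
     *
     * @param subtractActual true to subtract the bytes really flushed,
     *                       false to subtract the size read at flush start
     * @return the size counter after both flushes finish
     */
    static long run(boolean subtractActual) {
        FlushAccountingSketch region = new FlushAccountingSketch();
        region.write(128L);
        long seenByA = region.readSizeAtFlushStart(); // 128
        long seenByB = region.readSizeAtFlushStart(); // also 128
        long flushedByA = region.snapshotAndFlush();  // 128
        long flushedByB = region.snapshotAndFlush();  // 0, nothing left
        region.memstoreSize.addAndGet(-(subtractActual ? flushedByA : seenByA));
        region.memstoreSize.addAndGet(-(subtractActual ? flushedByB : seenByB));
        return region.memstoreSize.get();
    }
}
```

In this model run(false) ends at a negative size while run(true) ends at 
zero, which matches the symptom in the report. The real fix is in the 
attached patches and may differ from this sketch.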





[jira] [Commented] (HBASE-5335) Dynamic Schema Configurations

2012-03-19 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232968#comment-13232968
 ] 

Phabricator commented on HBASE-5335:


nspiegelberg has commented on the revision "[jira] [HBASE-5335] Dynamic Schema 
Config".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:753 I looked 
into this some more.  We would need to escape more than just single quotes.  
There is a function StringEscapeUtils.escapeJavaScript() in apache.commons.lang 
that uses the same parsing escapes as Ruby.  However, that requires us to 
either ensure client packaging of this dependency or write a custom parser for 
this.

  Practically, single quotes are a mistake.  We offer zero parsing sanitization 
as is and just let the RegionServer throw an IllegalArgumentException if the 
user configures it wrong.  We need to let the user undo a single quote mistake.  
This is currently possible.  Value sanitization is the same as existing 
functionality.
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:760 currently, 
there is no javadoc info & the user has no way of knowing he can do this unless 
he knows about it from the JIRA.  security through obscurity.
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:367 Already 
talked with Ted & Stack about this in previous comments.  A summary:

  Both HTD & HCD have massive redundancies that could be unified beyond this 
function.  They are both basically decorated K,V stores.  I don't think 
unification is very necessary until we have something like locality groups and 
need a third K,V store.  Have you traditionally seen bugs related to HTD & HCD 
inconsistencies?  I'm just worried that there is time spent on thinking of how 
to refactor this code without thinking about whether the denormalized design is 
actually hurting us on a practical feature design/debug sense.
  src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:222-229 
3 problems:

  1) If I don't use Configuration.java, then I need to refactor a LOT of the 
regionserver code to use the new API

  2) we basically need to override Configuration.getProps or we'd need 
protected access to Configuration.properties so we can modify that pointer.  
Basically, there are a bunch of functions in the base Configuration that call 
getProps() and we need to use our derived version instead of the base version.

  3) Trying to make a generic implementation that works across all HDFS 
versions.  I would like to modify Configuration.properties in HDFS 1.0 to be 
protected, but I'd still need to have this code until my patch made it to all 
the versions we support.

  Configuration.java hasn't changed much, so I don't think this is an issue in 
practice.  Do you have another strategy?
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4292 The 
10 sec sleep is found from TestFromClientSide.compactCFRegion.  I used the same 
paradigm here and added poll-waiting to speed up the test.  We need the 
synchronous compaction feature before we can remove this (HBASE-2949).
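
The CompoundConfiguration discussion above comes down to resolving each key 
against several key/value layers, with the more specific layer winning. A 
minimal sketch of that lookup, assuming plain java.util maps instead of 
Hadoop's Configuration; the class name LayeredSettings and its methods are 
hypothetical and are not taken from the patch:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: resolves a key against several key/value layers.
// Layers added later take precedence over layers added earlier.
final class LayeredSettings {
    private final List<Map<String, String>> layers = new ArrayList<>();

    // Adds a layer, e.g. site config first, then table settings, then CF settings.
    LayeredSettings add(Map<String, String> layer) {
        layers.add(new HashMap<>(layer)); // defensive copy of the caller's map
        return this;
    }

    // Returns the value from the most recently added layer that defines the key,
    // or defaultValue when no layer defines it.
    String get(String key, String defaultValue) {
        for (int i = layers.size() - 1; i >= 0; i--) {
            String value = layers.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return defaultValue;
    }
}
```

With a site layer {a=1, b=2} and a table layer {b=3} added in that order, 
get("b", ...) resolves to "3" while get("a", ...) still resolves to "1".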

REVISION DETAIL
  https://reviews.facebook.net/D2247


> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.





[jira] [Updated] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3996:
--

Fix Version/s: 0.94.0

Adding 0.94 according to Lars' feedback.

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 3996-v2.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.





[jira] [Updated] (HBASE-5335) Dynamic Schema Configurations

2012-03-19 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5335:
---

Attachment: D2247.4.patch

nspiegelberg updated the revision "[jira] [HBASE-5335] Dynamic Schema Config".
Reviewers: JIRA, Kannan, stack, mbautin, Liyin

  1) Addressed comments from Mikhail's peer review
  2) Added unit test for key removal

REVISION DETAIL
  https://reviews.facebook.net/D2247

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java
  src/main/ruby/hbase/admin.rb
  src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/util/TestCompoundConfiguration.java


> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch, 
> D2247.4.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.





[jira] [Commented] (HBASE-5335) Dynamic Schema Configurations

2012-03-19 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232978#comment-13232978
 ] 

Phabricator commented on HBASE-5335:


mbautin has commented on the revision "[jira] [HBASE-5335] Dynamic Schema 
Config".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:222-229 
OK, got it. Please note in the comment that these methods are private in some 
HDFS versions HBase has to support.
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java:367 OK, sounds 
good.

REVISION DETAIL
  https://reviews.facebook.net/D2247


> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch, 
> D2247.4.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.





[jira] [Commented] (HBASE-5335) Dynamic Schema Configurations

2012-03-19 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232980#comment-13232980
 ] 

Phabricator commented on HBASE-5335:


nspiegelberg has commented on the revision "[jira] [HBASE-5335] Dynamic Schema 
Config".

INLINE COMMENTS
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4246 we 
need to chop this file into multiple pieces for parallelization, separate from 
this JIRA, correct?  TestFromClientSide seems to be the correct spot since it's 
an HBase java client integration test.
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4326 
will make an HConstant

REVISION DETAIL
  https://reviews.facebook.net/D2247


> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch, 
> D2247.4.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.





[jira] [Commented] (HBASE-5335) Dynamic Schema Configurations

2012-03-19 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232985#comment-13232985
 ] 

Phabricator commented on HBASE-5335:


nspiegelberg has commented on the revision "[jira] [HBASE-5335] Dynamic Schema 
Config".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/util/CompoundConfiguration.java:222-229 
will do.

REVISION DETAIL
  https://reviews.facebook.net/D2247


> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch, 
> D2247.4.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.





[jira] [Commented] (HBASE-5335) Dynamic Schema Configurations

2012-03-19 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232990#comment-13232990
 ] 

Phabricator commented on HBASE-5335:


mbautin has commented on the revision "[jira] [HBASE-5335] Dynamic Schema 
Config".

  A couple more responses inline.

INLINE COMMENTS
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4246 
This would be the right place for a new test like this if it were not as huge 
as it is now. Would it be difficult to add new unit tests into separate test 
classes? I have tried to do so when creating new tests and have not had much 
difficulty.
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4292 It 
would be great to add this explanation to the comment and/or make this delay a 
meaningfully-named constant and reuse it in both places.
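
One way to express the poll-waiting and the named delay together is sketched 
below. PollWait, MAX_WAIT_MS and POLL_INTERVAL_MS are hypothetical names 
chosen for illustration; the actual test code in D2247 may look different.

```java
import java.util.function.BooleanSupplier;

// Hypothetical test helper: polls a condition instead of sleeping a fixed time.
final class PollWait {
    // Upper bound on the wait and the polling step; values are illustrative only.
    static final long MAX_WAIT_MS = 10_000L;
    static final long POLL_INTERVAL_MS = 50L;

    // Returns true as soon as the condition holds, false if the timeout elapses
    // first or the waiting thread is interrupted.
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

A test would then call PollWait.waitFor(condition, PollWait.MAX_WAIT_MS, 
PollWait.POLL_INTERVAL_MS) in both places, so the worst case stays at the 
old fixed delay while the common case returns early.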

REVISION DETAIL
  https://reviews.facebook.net/D2247


> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch, 
> D2247.4.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.





[jira] [Commented] (HBASE-3692) Handle RejectedExecutionException in HTable

2012-03-19 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232995#comment-13232995
 ] 

Jean-Daniel Cryans commented on HBASE-3692:
---

Chinmay,

Could it be that you are re-inserting closed HTables in that pool? At this 
point no one has been able to provide code here that we could test that shows a 
bug. As far as I can tell there's no bug, just API misuse (and poor error 
reporting on HBase's end).
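
For context, the bare exception is what java.util.concurrent produces when 
work is handed to an executor that has already been shut down. The sketch 
below uses only the JDK; treating pool.shutdown() as a stand-in for closing an 
HTable is an assumption made for illustration, and RejectedAfterShutdown is a 
hypothetical class name.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical illustration, JDK only: no HBase classes are involved.
final class RejectedAfterShutdown {
    // Returns true if handing work to an already shut-down executor is rejected.
    static boolean submitAfterShutdownIsRejected() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.shutdown(); // assumption: stands in for closing the owner of the pool
        try {
            pool.execute(() -> { });
            return false; // the task was accepted
        } catch (RejectedExecutionException e) {
            return true; // the default rejection policy aborts with this exception
        }
    }
}
```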

> Handle RejectedExecutionException in HTable
> ---
>
> Key: HBASE-3692
> URL: https://issues.apache.org/jira/browse/HBASE-3692
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.1
>Reporter: Jean-Daniel Cryans
> Attachments: test_datanucleus.zip
>
>
> A user on IRC yesterday had an issue with RejectedExecutionException coming 
> out of HTable sometimes. Apart from being very confusing to the user as it 
> comes with no message at all, it exposes the HTable internals. 
> I think we should handle it and instead throw something like 
> DontUseHTableInMultipleThreadsException or something more clever. In his 
> case, the user had a HTable leak with the pool that he was able to figure out 
> once I told him what to look for.
> It could be an unchecked exception and we could consider adding in 0.90 but 
> marking for 0.92 at the moment.





[jira] [Commented] (HBASE-5594) Unable to stop a master that's waiting on -ROOT- during initialization

2012-03-19 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233000#comment-13233000
 ] 

Jean-Daniel Cryans commented on HBASE-5594:
---

Hey Ram,

Didn't you write that that thread dump on Dec. 9th wasn't for 4951 and that the 
issue wasn't in 0.92?

> Unable to stop a master that's waiting on -ROOT- during initialization
> --
>
> Key: HBASE-5594
> URL: https://issues.apache.org/jira/browse/HBASE-5594
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1
>Reporter: Jean-Daniel Cryans
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
>
> We just had a case where the master (that was just restarted) was having a 
> hard time assigning -ROOT- (all the PRI handlers were full already) so we 
> tried to shutdown the cluster and even though all the RS closed down properly 
> the master kept running being blocked on:
> {noformat}
> "master-sv4r20s12,10302,1331916142866" prio=10 tid=0x7f3708008800 
> nid=0x4b20 in Object.wait() [0x7f370d1d]
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   - waiting on <0x0006030be3f8> (a 
> org.apache.hadoop.hbase.zookeeper.RootRegionTracker)
>   at java.lang.Object.wait(Object.java:485)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:131)
>   - locked <0x0006030be3f8> (a 
> org.apache.hadoop.hbase.zookeeper.RootRegionTracker)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:104)
>   - locked <0x0006030be3f8> (a 
> org.apache.hadoop.hbase.zookeeper.RootRegionTracker)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRoot(CatalogTracker.java:313)
>   at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:571)
>   at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:501)
>   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:336)
>   at java.lang.Thread.run(Thread.java:662)
> {noformat}
> I haven't checked the 0.90 code, we got this on 0.92.1





[jira] [Commented] (HBASE-3692) Handle RejectedExecutionException in HTable

2012-03-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233037#comment-13233037
 ] 

Lars Hofhansl commented on HBASE-3692:
--

In 0.92 you can also use HBASE-4805. It lets you create an HConnection and 
ExecutorService separately and pass them to the HTable constructor.


> Handle RejectedExecutionException in HTable
> ---
>
> Key: HBASE-3692
> URL: https://issues.apache.org/jira/browse/HBASE-3692
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.1
>Reporter: Jean-Daniel Cryans
> Attachments: test_datanucleus.zip
>
>
> A user on IRC yesterday had an issue with RejectedExecutionException coming 
> out of HTable sometimes. Apart from being very confusing to the user as it 
> comes with no message at all, it exposes the HTable internals. 
> I think we should handle it and instead throw something like 
> DontUseHTableInMultipleThreadsException or something more clever. In his 
> case, the user had a HTable leak with the pool that he was able to figure out 
> once I told him what to look for.
> It could be an unchecked exception and we could consider adding in 0.90 but 
> marking for 0.92 at the moment.





[jira] [Updated] (HBASE-5533) Add more metrics to HBase

2012-03-19 Thread Shaneal Manek (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaneal Manek updated HBASE-5533:
-

Attachment: hbase5533-0.92-v5.patch

Thanks for the review Stack, I appreciate all the help!

The latest patch contains the following changes:
- moved the NamedThreadFactory into a static method of the util.Threads class
- fixed license headers
- truncates string output (good catch - I didn't see that)
- added some very basic unit tests on the ExponentiallyDecayingSample

> Add more metrics to HBase
> -
>
> Key: HBASE-5533
> URL: https://issues.apache.org/jira/browse/HBASE-5533
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.2, 0.94.0
>Reporter: Shaneal Manek
>Assignee: Shaneal Manek
>Priority: Minor
> Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
> TimingOverhead.java, hbase-5533-0.92.patch, hbase5533-0.92-v2.patch, 
> hbase5533-0.92-v3.patch, hbase5533-0.92-v5.patch, histogram_web_ui.png
>
>
> To debug/monitor production clusters, there are some more metrics I wish I 
> had available.
> In particular:
> - Although the average FS latencies are useful, a 'histogram' of recent 
> latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
> would be more useful
> - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
> would be useful
> - Counting the number of accesses to each region to detect hotspotting
> - Exposing the current number of HLog files

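A histogram line such as "90% of reads completed in under 100ms" can be 
derived from a set of recent latency samples. The sketch below uses the 
nearest-rank method over a plain array; it is not the 
ExponentiallyDecayingSample from the patch, and LatencyPercentiles is a 
hypothetical name.

```java
import java.util.Arrays;

// Hypothetical illustration of reading percentiles off a set of latency samples.
final class LatencyPercentiles {
    // Nearest-rank percentile: the smallest sample such that at least p percent
    // of all samples are less than or equal to it. Requires 0 < p <= 100.
    static long percentile(long[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samplesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

For the ten samples 10, 20, ..., 100 ms, percentile(samples, 90) is 90 and 
percentile(samples, 99) is 100.
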




[jira] [Updated] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3996:
--

Attachment: 3996-v3.txt

Patch v3 compiles

I reformatted some of the new code.

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 3996-v2.txt, 3996-v3.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.





[jira] [Commented] (HBASE-5600) Make Endpoint Coprocessors Available from Thrift

2012-03-19 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233058#comment-13233058
 ] 

Andrew Purtell commented on HBASE-5600:
---

bq. We could create a thrift method to take the name of the class, method, and 
an array of params and then call coprocessorExec. 

This sounds like a reasonable short term thing to do.

For now with the dynamic behaviors of the current HRPC based stack we can 
mostly get away with using the same Java client tools for flexible remote 
method invocation of Endpoints as with the core interfaces. In the future the 
fact every Endpoint is really its own little protocol may be more exposed. In 
this world, the interface passes a blob. Such blobs could recursively contain 
protobuf (or Thrift) encoding.

If going forward we will support Thrift and protobuf ("new RPC") clients both, 
then maybe we can expect server side code will translate from Thrift and 
protobuf message representations to some common representation, POJO or 
whatever. In other words, rehydrate from message representation into real 
classes (via reflection?). At least for Java, the protobuf documentation recommends 
the objects built by the protobuf unmarshaller not be used directly as 
application classes. I think Thrift has the same practice. So on the server 
side that might not be so bad.

On the client side, given the static nature of Thrift and protobuf message 
schemas (compiled from IDL) we can't dynamically create messages, so there's no 
way to hide behind for example a remote invocation proxy or some message 
builder. It could be different if we used Avro or some other option which can 
create message schemas at runtime and use those dynamically generated schemas 
server side.
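
The short-term approach quoted at the top (class name, method name and an 
array of params) can be sketched with plain Java reflection. Nothing here 
touches Thrift or coprocessorExec; ReflectiveDispatch and SumEndpoint are 
hypothetical stand-ins, and a real server would presumably restrict which 
classes may be named.

```java
import java.lang.reflect.Method;

// Hypothetical illustration: dispatches a call from (class, method, params).
final class ReflectiveDispatch {
    // Stand-in for an endpoint implementation; not an HBase class.
    public static final class SumEndpoint {
        public long sum(long a, long b) {
            return a + b;
        }
    }

    // Instantiates className via its no-arg constructor and invokes the first
    // public method whose name and parameter count match.
    static Object invoke(String className, String methodName, Object... params) {
        try {
            Class<?> clazz = Class.forName(className);
            Object target = clazz.getDeclaredConstructor().newInstance();
            for (Method m : clazz.getMethods()) {
                if (m.getName().equals(methodName)
                        && m.getParameterCount() == params.length) {
                    return m.invoke(target, params);
                }
            }
            throw new IllegalArgumentException(
                    "no such method: " + className + "#" + methodName);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "cannot invoke " + className + "#" + methodName, e);
        }
    }
}
```

Matching on name and arity only is a simplification; overloaded methods 
would need the parameter types to be sent along as well.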

> Make Endpoint Coprocessors Available from Thrift
> 
>
> Key: HBASE-5600
> URL: https://issues.apache.org/jira/browse/HBASE-5600
> Project: HBase
>  Issue Type: Improvement
>  Components: thrift
>Reporter: Ben West
>Priority: Minor
>  Labels: thrift
>
> Currently, the only way to access an endpoint coprocessor via thrift is to 
> modify the schema and Thrift server for every coprocessor function. This is 
> annoying. It should be possible to use your coprocessors without having to 
> mangle HBase core code (since that's the point of coprocessors).





[jira] [Updated] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3996:
--

Attachment: 3996-v3.txt

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 3996-v2.txt, 3996-v3.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.





[jira] [Updated] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3996:
--

Attachment: (was: 3996-v3.txt)

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 3996-v2.txt, 3996-v3.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.





[jira] [Updated] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3996:
--

Status: Patch Available  (was: Open)

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 3996-v2.txt, 3996-v3.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.





[jira] [Created] (HBASE-5602) Add cache access pattern statistics and report hot blocks/keys

2012-03-19 Thread Mikhail Bautin (Created) (JIRA)
Add cache access pattern statistics and report hot blocks/keys
--

 Key: HBASE-5602
 URL: https://issues.apache.org/jira/browse/HBASE-5602
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin


In many practical applications it would be very useful to know how well 
utilized the block cache is, i.e. how many times we actually access a block 
once it gets into the cache. This would also allow us to evaluate 
cache-on-write on flush. In addition, we need to keep track of and report some 
set of the hottest blocks in the cache, and possibly even the hottest keys. 
This would allow us to diagnose "hot-row" problems in real time.





[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-03-19 Thread Schubert Zhang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233092#comment-13233092
 ] 

Schubert Zhang commented on HBASE-2600:
---

Fixing this will also fix HBASE-1978, since I have no ability to complete that.

> Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
> tablename+ENDROW+randomid
> 
>
> Key: HBASE-2600
> URL: https://issues.apache.org/jira/browse/HBASE-2600
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Alex Newman
> Attachments: 
> 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v7.2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8.1, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v9.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, 
> 2600-trunk-01-17.txt, jenkins.pdf
>
>
> This is an idea that Ryan and I have been kicking around on and off for a 
> while now.
> If regionnames were made of tablename+endrow instead of tablename+startrow, 
> then in the metatables, doing a search for the region that contains the 
> wanted row, we'd just have to open a scanner using passed row and the first 
> row found by the scan would be that of the region we need (If offlined 
> parent, we'd have to scan to the next row).
> If we redid the meta tables in this format, we'd be using an access that is 
> natural to hbase, a scan as opposed to the perverse, expensive 
> getClosestRowBefore we currently have that has to walk backward in meta 
> finding a containing region.
> This issue is about changing the way we name regions.
> If we were using scans, prewarming client cache would be near costless (as 
> opposed to what we'll currently have to do which is first a 
> getClosestRowBefore and then a scan from the closestrowbefore forward).
> Converting to the new method, we'd have to run a migration on startup 
> changing the content in meta.
> Up to this, the randomid component of a region name has been the timestamp of 
> region creation.   HBASE-2531 "32-bit encoding of regionnames waaay 
> too susceptible to hash clashes" proposes changing the randomid so that it 
> contains actual name of the directory in the filesystem that hosts the 
> region.  If we had this in place, I think it would help with the migration to 
> this new way of doing the meta because as is, the region name in fs is a hash 
> of regionname... changing the format of the regionname would mean we generate 
> a different hash... so we'd need hbase-2531 to be in place before we could do 
> this change.
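
To illustrate the lookup difference described above, here is a minimal sketch 
that models the meta table as a sorted in-memory map. It is an assumption-laden 
toy: the class and method names are hypothetical, region names are reduced to a 
bare row key (no tablename or randomid), and the last region's open end row is 
modelled with a "\uffff" sentinel. Keyed by start row, the lookup has to find 
the closest key at or before the wanted row; keyed by end row, it is simply the 
first key a forward scan would hit.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not HBase code: a TreeMap stands in for the meta table.
class MetaLookupSketch {

    // Keyed by start row (inclusive): we need the greatest key <= row,
    // which is the "closest row before" style of access.
    static String byStartRow(TreeMap<String, String> metaByStart, String row) {
        Map.Entry<String, String> e = metaByStart.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    // Keyed by end row (exclusive): the first key strictly greater than row,
    // i.e. the first entry a forward scan starting at row would return.
    static String byEndRow(TreeMap<String, String> metaByEnd, String row) {
        Map.Entry<String, String> e = metaByEnd.higherEntry(row);
        return e == null ? null : e.getValue();
    }
}
```

Both methods return the same region for a given row; the point of the sketch is 
only that the end-row form maps onto a forward scan.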





[jira] [Commented] (HBASE-1978) Change the range/block index scheme from [start,end) to (start, end], and index range/block by endKey, specially in HFile

2012-03-19 Thread Schubert Zhang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233094#comment-13233094
 ] 

Schubert Zhang commented on HBASE-1978:
---

refers to HBASE-2600

> Change the range/block index scheme from [start,end) to (start, end], and 
> index range/block by endKey, specially in HFile
> -
>
> Key: HBASE-1978
> URL: https://issues.apache.org/jira/browse/HBASE-1978
> Project: HBase
>  Issue Type: New Feature
>  Components: io, master, regionserver
>Reporter: Schubert Zhang
> Attachments: HBASE-1978-HFile-v1.patch
>
>
> From the code review of HFile (HBASE-1818), we found that HFile allows 
> duplicate keys. But the old implementation could miss duplicate keys during 
> seek and scan when the duplicates span multiple blocks.
> We provide a patch (HBASE-1841 is its step 1) to resolve the above issue. 
> This patch modifies HFile.Writer to avoid generating a problematic HFile with 
> such cross-block duplicate keys. It only starts a new block when the key being 
> appended differs from the last appended key. But it still carries a risk: if 
> the user of HFile.Writer appends the same key many times, the result is a 
> very large block that needs a lot of memory or runs out of memory.
> The current HFile block index uses startKey to index a block, i.e. the 
> range/block index scheme is [startKey,endKey).
> Compare section 5.1 of the Google Bigtable paper:
> "The METADATA table stores the location of a tablet under a row key that is 
> an encoding of the tablet's table identifier and its end row."
> The theory behind Bigtable's METADATA is the same as that of the BlockIndex 
> in an SSTable or HFile, so we should use endKey in HFile's BlockIndex. In my 
> experience with Hypertable, the METADATA key is also "tableID:endRow".
> We would change the index scheme in HFile from [startKey,endKey) to 
> (startKey,endKey], and change the binary search method to match this index 
> scheme.
> This change can resolve the above duplicate-key issue. 
> Note:
> The complete fix needs to modify many modules in HBase, apparently including 
> HFile, the META schema, some internal code, etc.
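
A minimal sketch of the end-key index search proposed above, under stated 
assumptions: the class and method names are hypothetical, keys are plain 
strings, and the index is just an array holding the last key of each block. 
With the (startKey,endKey] scheme, the block to read is the first one whose end 
key is greater than or equal to the wanted key, which is a lower-bound binary 
search. When a duplicate key spans several blocks, this lands on the first 
block that can contain it.

```java
// Hypothetical sketch, not the HFile implementation.
class EndKeyBlockIndex {

    // Returns the index of the first block whose last key is >= key,
    // or -1 if key sorts after the last key of the final block.
    static int findBlock(String[] blockEndKeys, String key) {
        int lo = 0;
        int hi = blockEndKeys.length; // search the half-open range [lo, hi)
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (blockEndKeys[mid].compareTo(key) < 0) {
                lo = mid + 1; // block ends before key, skip it
            } else {
                hi = mid;     // block may contain key, keep it as a candidate
            }
        }
        return lo == blockEndKeys.length ? -1 : lo;
    }
}
```

For example, with end keys {"c", "f", "f", "k"}, where the key "f" fills the 
tail of the second block and all of the third, a search for "f" selects the 
second block, so a scan from there sees every duplicate.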





[jira] [Commented] (HBASE-5602) Add cache access pattern statistics and report hot blocks/keys

2012-03-19 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233101#comment-13233101
 ] 

Todd Lipcon commented on HBASE-5602:


An alternative cache metric that would be interesting to know is the last 
access time of the entries being evicted. If the last-access-time is only a 
minute or so, it's likely that you can benefit from adding more cache. If it's 
substantially older, then adding cache won't really help. 
http://en.wikipedia.org/wiki/Five-minute_rule
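
A rough sketch of this metric, assuming a simple LRU cache built on 
java.util.LinkedHashMap rather than HBase's own block cache. The class name 
EvictionAgeLru, the Slot holder and the injected clock are all hypothetical. 
Each entry remembers its last access time, and on eviction the cache records 
how long the entry had been idle.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical LRU cache that records the idle time of each evicted entry.
class EvictionAgeLru<K, V> {

    private static final class Slot<T> {
        final T value;
        long lastAccessMillis;
        Slot(T value, long now) { this.value = value; this.lastAccessMillis = now; }
    }

    private final LongSupplier clock;
    private final List<Long> evictionAges = new ArrayList<>();
    private final LinkedHashMap<K, Slot<V>> map;

    EvictionAgeLru(final int capacity, LongSupplier clock) {
        this.clock = clock;
        // accessOrder=true keeps iteration order least-recently-used first.
        this.map = new LinkedHashMap<K, Slot<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Slot<V>> eldest) {
                if (size() <= capacity) {
                    return false;
                }
                long now = EvictionAgeLru.this.clock.getAsLong();
                evictionAges.add(now - eldest.getValue().lastAccessMillis);
                return true;
            }
        };
    }

    void put(K key, V value) {
        map.put(key, new Slot<>(value, clock.getAsLong()));
    }

    V get(K key) {
        Slot<V> slot = map.get(key); // also marks the entry most recently used
        if (slot == null) {
            return null;
        }
        slot.lastAccessMillis = clock.getAsLong();
        return slot.value;
    }

    // Idle times, in clock units, of the entries evicted so far.
    List<Long> evictionAges() {
        return evictionAges;
    }
}
```

If the recorded ages cluster around a minute or so, the reasoning above 
suggests more cache would help; if they are much older, it probably would not.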

> Add cache access pattern statistics and report hot blocks/keys
> --
>
> Key: HBASE-5602
> URL: https://issues.apache.org/jira/browse/HBASE-5602
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> In many practical applications it would be very useful to know how well 
> utilized the block cache is, i.e. how many times we actually access a block 
> once it gets into the cache. This would also allow us to evaluate 
> cache-on-write on flush. In addition, we need to keep track of and report 
> some set of the hottest blocks in the cache, and possibly even the hottest 
> keys. This would allow us to diagnose "hot-row" problems in real time.
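
As a sketch of the kind of statistic the description asks for, the following 
counts accesses per cached block and reports the most frequently accessed ones. 
It is only an illustration under assumptions: BlockAccessStats is a 
hypothetical helper, blocks are identified by an opaque string key, and nothing 
here is wired into HBase's block cache.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper, not an HBase class: per-block access counters.
class BlockAccessStats {
    private final Map<String, LongAdder> hits = new ConcurrentHashMap<>();

    // Record one access to a cached block.
    void recordAccess(String blockKey) {
        hits.computeIfAbsent(blockKey, k -> new LongAdder()).increment();
    }

    // Number of recorded accesses for a block, 0 if it was never seen.
    long accessCount(String blockKey) {
        LongAdder counter = hits.get(blockKey);
        return counter == null ? 0L : counter.sum();
    }

    // Up to n block keys, most frequently accessed first.
    List<String> hottest(int n) {
        // Snapshot the counters so sorting sees stable values.
        List<Map.Entry<String, Long>> snapshot = new ArrayList<>();
        for (Map.Entry<String, LongAdder> e : hits.entrySet()) {
            snapshot.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue().sum()));
        }
        snapshot.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, snapshot.size()); i++) {
            result.add(snapshot.get(i).getKey());
        }
        return result;
    }
}
```

A real implementation would also need to bound memory, for example by dropping 
counters when blocks are evicted; that is left out of the sketch.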





[jira] [Commented] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233111#comment-13233111
 ] 

Hadoop QA commented on HBASE-3996:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12518989/3996-v3.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 166 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1227//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1227//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1227//console

This message is automatically generated.

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 3996-v2.txt, 3996-v3.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.





[jira] [Commented] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233120#comment-13233120
 ] 

Ted Yu commented on HBASE-3996:
---

Eran might be busy.

I created https://reviews.apache.org/r/4411/ for people to review.

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 3996-v2.txt, 3996-v3.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.





[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-03-19 Thread Alex Newman (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233123#comment-13233123
 ] 

Alex Newman commented on HBASE-2600:


Oh thanks for the reminder. I think this patch is ready. I just need to rebase 
and retest on my jenkins setup. Expect a patch soon.

> Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
> tablename+ENDROW+randomid
> 
>
> Key: HBASE-2600
> URL: https://issues.apache.org/jira/browse/HBASE-2600
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Alex Newman
> Attachments: 
> 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v7.2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8.1, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v9.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, 
> 2600-trunk-01-17.txt, jenkins.pdf
>
>
> This is an idea that Ryan and I have been kicking around on and off for a 
> while now.
> If regionnames were made of tablename+endrow instead of tablename+startrow, 
> then in the metatables, doing a search for the region that contains the 
> wanted row, we'd just have to open a scanner using passed row and the first 
> row found by the scan would be that of the region we need (If offlined 
> parent, we'd have to scan to the next row).
> If we redid the meta tables in this format, we'd be using an access that is 
> natural to hbase, a scan as opposed to the perverse, expensive 
> getClosestRowBefore we currently have that has to walk backward in meta 
> finding a containing region.
> This issue is about changing the way we name regions.
> If we were using scans, prewarming client cache would be near costless (as 
> opposed to what we'll currently have to do which is first a 
> getClosestRowBefore and then a scan from the closestrowbefore forward).
> Converting to the new method, we'd have to run a migration on startup 
> changing the content in meta.
> Up to this, the randomid component of a region name has been the timestamp of 
> region creation.   HBASE-2531 "32-bit encoding of regionnames waaay 
> too susceptible to hash clashes" proposes changing the randomid so that it 
> contains actual name of the directory in the filesystem that hosts the 
> region.  If we had this in place, I think it would help with the migration to 
> this new way of doing the meta because as is, the region name in fs is a hash 
> of regionname... changing the format of the regionname would mean we generate 
> a different hash... so we'd need hbase-2531 to be in place before we could do 
> this change.





[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-03-19 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233141#comment-13233141
 ] 

jirapos...@reviews.apache.org commented on HBASE-5128:
--



bq.  On 2012-03-11 01:25:43, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java, 
line 2652
bq.  > 
bq.  >
bq.  > Can we deprecate this method in 0.94 and remove it in 0.96 ?

Completed in HBASE-5588.


- jmhsieh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4280/#review5823
---


On 2012-03-10 01:04:58, jmhsieh wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4280/
bq.  ---
bq.  
bq.  (Updated 2012-03-10 01:04:58)
bq.  
bq.  
bq.  Review request for hbase, Todd Lipcon, Ted Yu, and Lars Hofhansl.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This version is similar to the 0.90.x version posted a few months back, 
but has a few new features and some minor differences.
bq.  
bq.  1) No trackHTD method needed since we can read from the file system.
bq.  2) Added safeguards to prevent mega merges, and to isolate repairs to 
particular tables.
bq.  3) Fixed comparator in HRegionInfo
bq.  4) Fixed TestRegionObserverInterface so that it doesn't rely on bug in 
HRegionInfo comparator.
bq.  
bq.  I'll backport to 0.94/0.92 (which should be very similar) and update the 
0.90 versions after this patch has mostly cleared.
bq.  
bq.  This version is not perfect (there are definitely cases not covered) but 
I think it is worth trying to get this in so that future reviews are more 
manageable.
bq.  
bq.  
bq.  This addresses bug HBASE-5128.
bq.  https://issues.apache.org/jira/browse/HBASE-5128
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 98f79fc 
bq.src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 3bcf899 
bq.src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 
ae468ca 
bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java e2bbbd0 
bq.src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 720841c 
bq.src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 5916d9c 
bq.src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java 
d57bb6b 
bq.
src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java 
PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 38eb6a8 
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java
 1b3b6df 
bq.src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java 937781d 
bq.src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckComparator.java 
0599da1 
bq.src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java 
dbb97f8 
bq.
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java 
2b4cac8 
bq.
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java 
ebbeead 
bq.
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java
 b175548 
bq.  
bq.  Diff: https://reviews.apache.org/r/4280/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Unit tests cover many many situations and pass.  Most "live" testing has 
been done on 0.90.x versions.  Many improvements and features added from 
experience.  Not much testing live on the trunk versions.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  jmhsieh
bq.  
bq.



> [uber hbck] Enable hbck to automatically repair table integrity problems as 
> well as region consistency problems while online.
> -
>
> Key: HBASE-5128
> URL: https://issues.apache.org/jira/browse/HBASE-5128
> Project: HBase
>  Issue Type: New Feature
>  Components: hbck
>Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: hbase-5128-trunk.patch
>
>
> The current (0.90.5, 0.92.0rc2) versions of hbck detect most region 
> consistency and table integrity invariant violations.  However with '-fix' it 
> can only automatically repair region consistency cases having to do with 
> deployment problems.  This updated version should be able to handle all cases 
> (including a new orphan regiondir case).  When complete will likely deprecate 

[jira] [Commented] (HBASE-5583) Master restart on create table with splitkeys does not recreate table with all the splitkey regions

2012-03-19 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233150#comment-13233150
 ] 

Jonathan Hsieh commented on HBASE-5583:
---

Maybe we should put plans for master actions into ZK, and then remove each step 
as it is completed.  This would allow a backup master or the restarted 
master to pick up and continue where the previous master had died.

So we may have a ZK structure act like a queue with region infos and then 
let the masters drain from there to create .META. rows.

I was thinking about adding a flag to info:regioninfo or an extra column 
("info:unfinshedPresplitCreate") but this would seem to suffer from a similar 
problem if the master failed while these were being removed after all regions 
were created.
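
The queue idea can be sketched as follows. This is only an illustration under 
assumptions: an in-memory deque stands in for the durable ZK structure (no 
ZooKeeper calls are made), a list stands in for .META., and all class and 
method names are hypothetical. Each region is written to meta before its queue 
entry is removed, so whichever master resumes can replay the head of the queue 
safely.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// In-memory stand-in for a durable, ZK-backed queue of regions still to create.
class PendingRegionQueue {
    private final Deque<String> pending = new ArrayDeque<>();

    PendingRegionQueue(List<String> regions) {
        pending.addAll(regions);
    }

    String peek() { return pending.peekFirst(); }

    void markDone() { pending.pollFirst(); }

    boolean isEmpty() { return pending.isEmpty(); }
}

class CreateTableSketch {

    // Processes up to maxSteps queue entries. A small maxSteps simulates a
    // master that dies part-way through; a later call simulates the backup or
    // restarted master continuing from the same queue.
    static void drain(PendingRegionQueue queue, List<String> meta, int maxSteps) {
        for (int i = 0; i < maxSteps && !queue.isEmpty(); i++) {
            String region = queue.peek();
            if (!meta.contains(region)) {
                meta.add(region); // idempotent: replaying after a crash is harmless
            }
            queue.markDone();     // remove the step only after meta is updated
        }
    }
}
```

Calling drain with a limit of 1 and then again without a practical limit leaves 
meta with every region exactly once, which is the behaviour the comment above 
is after.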

> Master restart on create table with splitkeys does not recreate table with 
> all the splitkey regions
> ---
>
> Key: HBASE-5583
> URL: https://issues.apache.org/jira/browse/HBASE-5583
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.96.0
>
>
> -> Create table using splitkeys
> -> Master goes down before all regions are added to meta
> -> On master restart the table is again enabled but with fewer regions than 
> specified in splitkeys
> Anyway, the client will get an exception if I had called sync create table.  
> But the table-exists check will say the table exists. 
> Is this scenario to be handled by the client only, or can we have some 
> mechanism on the master side for this? Please suggest.





[jira] [Commented] (HBASE-5551) Some functions should not be used by customer code and must be deprecated in 0.94

2012-03-19 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233163#comment-13233163
 ] 

Jonathan Hsieh commented on HBASE-5551:
---

@nkeywal, @LarsH -- I'm catching up with HBASE-5399 -- but if these are 
removed, what is the proper way to get info about which HMaster is active?  
Also, what is the proper way for a client to get a connection to the master if 
it wanted to talk to it directly?

> Some functions should not be used by customer code and must be deprecated in 
> 0.94
> -
>
> Key: HBASE-5551
> URL: https://issues.apache.org/jira/browse/HBASE-5551
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.94.0
>
> Attachments: 5551.092.patch
>
>
> They are:
> HBaseAdmin#getMaster
> HConnection#getZooKeeperWatcher
> HConnection#getMaster





[jira] [Updated] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3996:
--

Attachment: 3996-v4.txt

Latest patch from review board.

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 3996-v2.txt, 3996-v3.txt, 3996-v4.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.





[jira] [Updated] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-19 Thread Laxman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laxman updated HBASE-5564:
--

Status: Patch Available  (was: Open)

> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Attachments: HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the 
> same input file, and more specifically if they exist in the same split.
> Duplicate records are considered if the records are from different 
> splits.
> Version under test: HBase 0.92
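
One plausible mechanism for this behaviour, offered as an assumption and not as 
a confirmed diagnosis: if every record in a split is stamped with the same 
timestamp, two records with the same row and column produce identical cell 
keys, and any sorted set used while building the output keeps only one of 
them. The sketch below uses hypothetical classes (CellKey, 
DuplicateCollapseSketch), not HBase's KeyValue, to show the collapse and how 
distinct timestamps avoid it.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical stand-in for a cell key: row, column and timestamp.
class CellKey {
    final String row;
    final String column;
    final long timestamp;

    CellKey(String row, String column, long timestamp) {
        this.row = row;
        this.column = column;
        this.timestamp = timestamp;
    }

    // Orders by row, then column, then timestamp.
    static final Comparator<CellKey> ORDER =
        Comparator.comparing((CellKey c) -> c.row)
                  .thenComparing(c -> c.column)
                  .thenComparingLong(c -> c.timestamp);
}

class DuplicateCollapseSketch {

    // Number of cells that survive insertion into a sorted set.
    static int distinctCells(CellKey... cells) {
        TreeSet<CellKey> sorted = new TreeSet<>(CellKey.ORDER);
        for (CellKey cell : cells) {
            sorted.add(cell); // a cell comparing equal to an existing one is dropped
        }
        return sorted.size();
    }
}
```

Two cells for the same row and column with the same timestamp count as one, 
while the same pair with different timestamps counts as two.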





[jira] [Updated] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-19 Thread Laxman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laxman updated HBASE-5564:
--

Attachment: HBASE-5564_trunk.patch

Initial patch on trunk for review.

> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Attachments: HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the 
> same input file, and more specifically if they exist in the same split.
> Duplicate records are considered if the records are from different 
> splits.
> Version under test: HBase 0.92





[jira] [Commented] (HBASE-5596) Few minor bugs from HBASE-5209

2012-03-19 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233177#comment-13233177
 ] 

Jonathan Hsieh commented on HBASE-5596:
---

Dave,

Since HBASE-5209 was included in 0.94 and 0.92, it makes sense to include this 
as well.  The current patch seems to apply only to trunk but not to the 0.94 
and 0.92 branches.  Can you update the patch for the other branches?  


> Few minor bugs from HBASE-5209
> --
>
> Key: HBASE-5596
> URL: https://issues.apache.org/jira/browse/HBASE-5596
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.1, 0.94.0
>Reporter: David S. Wang
>Assignee: David S. Wang
>Priority: Minor
> Attachments: HBASE_5596.patch
>
>
> A few leftover bugs from HBASE-5209.  Comments are documented here:
> https://reviews.apache.org/r/3892/





[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-19 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233179#comment-13233179
 ] 

Ted Yu commented on HBASE-5564:
---

{code}
+public int getTimestapKeyColumnIndex() {
{code}
Please fix typo in the above method name.
{code}
+  "  -D" + TIMESTAMP_CONF_KEY + "=currentTimeAsLong - use the specified 
timestamp for the import. This option is ignored if HBASE_TS_KEY is specfied in 
'importtsv.columns'\n" +
{code}
Please wrap the long line above.
{code}
+// Should never get 0.
+ts = conf.getLong(ImportTsv.TIMESTAMP_CONF_KEY, 0);
{code}
Please explain why 0 wouldn't be returned.
{code}
+  if (parser.getTimestapKeyColumnIndex() != -1)
+ts = parsed.getTimestamp();
{code}
Please use curly braces around the assignment.

> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Attachments: HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the 
> same input file, and more specifically if they exist in the same split.
> Duplicate records are considered if the records are from different 
> splits.
> Version under test: HBase 0.92





[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-19 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233204#comment-13233204
 ] 

Hadoop QA commented on HBASE-5564:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12519017/HBASE-5564_trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 165 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1229//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1229//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1229//console

This message is automatically generated.

> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Attachments: HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the same 
> input file, and more specifically if they exist in the same split.
> Duplicate records are considered if the records are from different 
> splits.
> Version under test: HBase 0.92





[jira] [Commented] (HBASE-5551) Some functions should not be used by customer code and must be deprecated in 0.94

2012-03-19 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233209#comment-13233209
 ] 

Jonathan Hsieh commented on HBASE-5551:
---

Hm.. is it to get info ClusterStats from HBASE-5209/HBASE-5596 and instantiate 
new?

> Some functions should not be used by customer code and must be deprecated in 
> 0.94
> -
>
> Key: HBASE-5551
> URL: https://issues.apache.org/jira/browse/HBASE-5551
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.94.0
>
> Attachments: 5551.092.patch
>
>
> They are:
> HBaseAdmin#getMaster
> HConnection#getZooKeeperWatcher
> HConnection#getMaster





[jira] [Commented] (HBASE-5596) Few minor bugs from HBASE-5209

2012-03-19 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233216#comment-13233216
 ] 

Ted Yu commented on HBASE-5596:
---

Is this patch ready for integration ?

From 
https://builds.apache.org/job/PreCommit-HBASE-Build/1212//testReport/org.apache.hadoop.hbase.client/TestInstantSchemaChangeFailover/testInstantSchemaChangeWhileRSCrash/:
{code}
Caused by: java.lang.RuntimeException: Master not initialized after 200 seconds
    at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:208)
    at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:424)
    at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:200)
{code}

> Few minor bugs from HBASE-5209
> --
>
> Key: HBASE-5596
> URL: https://issues.apache.org/jira/browse/HBASE-5596
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.1, 0.94.0
>Reporter: David S. Wang
>Assignee: David S. Wang
>Priority: Minor
> Attachments: HBASE_5596.patch
>
>
> A few leftover bugs from HBASE-5209.  Comments are documented here:
> https://reviews.apache.org/r/3892/





[jira] [Updated] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3996:
--

Attachment: (was: 3996-v4.txt)

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 3996-v2.txt, 3996-v3.txt, 3996-v4.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.





[jira] [Updated] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3996:
--

Attachment: 3996-v4.txt

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 3996-v2.txt, 3996-v3.txt, 3996-v4.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.





[jira] [Commented] (HBASE-5599) The hbkc tool can not fix the six scenarios, it is NO_VERSION_FILE, NOT_IN_META_OR_DEPLOYED, NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, FIRST_REGION_STARTKEY_NOT_EMPTY, HOLE_

2012-03-19 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233220#comment-13233220
 ] 

Jonathan Hsieh commented on HBASE-5599:
---

@ram @fulin

I'm working on getting HBASE-5128 into trunk/0.94/0.92/0.90.  I'm focusing on 
trunk first (0.94/0.92 should fall out) and changes from the trunk versions 
(max merge and specific table safeguards) should make it into the 0.90 version.

I believe NOT_IN_META_OR_DEPLOYED, NOT_IN_META, HOLE_IN_REGION_CHAIN are 
handled by HBASE-5128. I believe the others are shortcomings.

Fulin, give me a few days to get HBASE-5128 in -- I'm testing a prerequisite of 
it currently (HBASE-5589).



> The hbkc tool can not fix the six scenarios, it is NO_VERSION_FILE, 
> NOT_IN_META_OR_DEPLOYED, NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, 
> FIRST_REGION_STARTKEY_NOT_EMPTY, HOLE_IN_REGION_CHAIN.
> 
>
> Key: HBASE-5599
> URL: https://issues.apache.org/jira/browse/HBASE-5599
> Project: HBase
>  Issue Type: New Feature
>  Components: hbck
>Affects Versions: 0.90.6
>Reporter: fulin wang
> Fix For: 0.90.6
>
> Attachments: hbase-5599-0.90.patch
>
>
> The hbck tool cannot fix the following six scenarios.
> 1. Version file does not exist in root dir.
>    Fix: I try to create a version file with the 'FSUtils.setVersion' method.
>
> 2. [REGIONNAME][KEY] on HDFS, but not listed in META or deployed on any 
> region server.
>    Fix: I get the region info from the hdfs file and write this region info to 
> the '.META.' table.
>
> 3. [REGIONNAME][KEY] not in META, but deployed on [SERVERNAME]
>    Fix: I get the region info from the hdfs file and write this region info to 
> the '.META.' table.
>
> 4. [REGIONNAME] should not be deployed according to META, but is deployed on 
> [SERVERNAME]
>    Fix: Close this region.
>
> 5. First region should start with an empty key.  You need to create a new 
> region and regioninfo in HDFS to plug the hole.
>    Fix: The region info is not in hdfs or .META., so it creates an empty 
> region for this error.
>
> 6. There is a hole in the region chain between [KEY] and [KEY]. You need to 
> create a new regioninfo and region dir in hdfs to plug the hole.
>    Fix: The region info is not in hdfs or .META., so it creates an empty 
> region for this hole.
>





[jira] [Updated] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-19 Thread Laxman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laxman updated HBASE-5564:
--

Status: Open  (was: Patch Available)

> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Attachments: HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the same 
> input file, and more specifically if they exist in the same split.
> Duplicate records are considered if the records are from different 
> splits.
> Version under test: HBase 0.92





[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-19 Thread Anoop Sam John (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233238#comment-13233238
 ] 

Anoop Sam John commented on HBASE-5564:
---

@Laxman
ImportTsv
{code}
+// If timestamp option is not specified, use current system time.
+long timstamp = conf.getLong(TIMESTAMP_CONF_KEY, System.currentTimeMillis());
+
+// Set it back to replace invalid timestamp (non-numeric) with current system time
+conf.setLong(TIMESTAMP_CONF_KEY, timstamp);
{code}

Doing this will use the same TS across all the mappers. Is this the intention 
of this change? With it, in TsvImporterMapper, 
conf.getLong(ImportTsv.TIMESTAMP_CONF_KEY, 0) will always find a value in the 
conf, so the default of 0 is never used.
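To make the point about the shared timestamp concrete, here is a minimal sketch. It is not the patch itself: a plain Map stands in for the Hadoop job configuration, the key name is hypothetical, and the lenient getLong helper (falling back to the default on a non-numeric value) is a simplification made for this sketch only.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedTimestampSketch {
    // Hypothetical key name; stands in for ImportTsv.TIMESTAMP_CONF_KEY.
    static final String TIMESTAMP_CONF_KEY = "sketch.importtsv.timestamp";

    // Stand-in for a configuration lookup: returns the default when the key
    // is missing or not numeric (a simplification for this sketch).
    static long getLong(Map<String, String> conf, String key, long defaultValue) {
        String raw = conf.get(key);
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    // Driver side: resolve the timestamp once and write it back into the conf.
    static void resolveAtSubmission(Map<String, String> conf, long now) {
        long ts = getLong(conf, TIMESTAMP_CONF_KEY, now);
        conf.put(TIMESTAMP_CONF_KEY, Long.toString(ts));
    }

    // Mapper side: every mapper reads the same conf, so the default of 0
    // is never used once the driver has written the value back.
    static long mapperTimestamp(Map<String, String> conf) {
        return getLong(conf, TIMESTAMP_CONF_KEY, 0L);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        resolveAtSubmission(conf, System.currentTimeMillis());
        long mapperA = mapperTimestamp(conf);
        long mapperB = mapperTimestamp(conf);
        System.out.println("same timestamp in both mappers: " + (mapperA == mapperB));
    }
}
```

Because the value is fixed at submission time, two identical rows get identical timestamps regardless of which split they come from.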

> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Attachments: HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the same 
> input file, and more specifically if they exist in the same split.
> Duplicate records are considered if the records are from different 
> splits.
> Version under test: HBase 0.92





[jira] [Updated] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-19 Thread Laxman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laxman updated HBASE-5564:
--

Attachment: HBASE-5564_trunk.1.patch

> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Attachments: HBASE-5564_trunk.1.patch, HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the same 
> input file, and more specifically if they exist in the same split.
> Duplicate records are considered if the records are from different 
> splits.
> Version under test: HBase 0.92





[jira] [Commented] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-03-19 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233241#comment-13233241
 ] 

Hadoop QA commented on HBASE-3996:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12519023/3996-v4.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 166 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1230//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1230//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1230//console

This message is automatically generated.

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Eran Kutner
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 3996-v2.txt, 3996-v3.txt, 3996-v4.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.





[jira] [Updated] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-19 Thread Laxman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laxman updated HBASE-5564:
--

Status: Patch Available  (was: Open)

Ted, Thanks for your review. Attached the patch after fixing the review 
comments.

> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Attachments: HBASE-5564_trunk.1.patch, HBASE-5564_trunk.patch
>
>
> Duplicate records are getting discarded when duplicate records exist in the same 
> input file, and more specifically if they exist in the same split.
> Duplicate records are considered if the records are from different 
> splits.
> Version under test: HBase 0.92





[jira] [Commented] (HBASE-5570) Compression tool section is referring to wrong link in HBase Book.

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233243#comment-13233243
 ] 

Hudson commented on HBASE-5570:
---

Integrated in HBase-TRUNK-security #143 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/143/])
hbase-5570 ops_mgt.xml - fixed self-reference with compression section 
(Revision 1302658)

 Result = FAILURE

> Compression tool section is referring to wrong link in HBase Book.
> --
>
> Key: HBASE-5570
> URL: https://issues.apache.org/jira/browse/HBASE-5570
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>Reporter: Laxman
>Assignee: Doug Meil
>Priority: Trivial
>  Labels: documentaion
> Attachments: ops_mgt_hbase_5570.xml.patch
>
>
> http://hbase.apache.org/book/ops_mgt.html#compression.tool
> The above section is referring to itself (recursive) in the HBase book.
> This needs to be corrected.





[jira] [Commented] (HBASE-5313) Restructure hfiles layout for better compression

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233244#comment-13233244
 ] 

Hudson commented on HBASE-5313:
---

Integrated in HBase-TRUNK-security #143 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/143/])
HBASE-5521 [jira] Move compression/decompression to an encoder specific encoding context

Author: Yongqiang He

Summary:
https://issues.apache.org/jira/browse/HBASE-5521

As part of working on HBASE-5313, we want to add a new columnar encoder/decoder.
It makes sense to move compression to be part of encoder/decoder:
1) a scanner for a columnar encoded block can do lazy decompression to a
specific part of a key value object
2) avoid an extra bytes copy from encoder to hblock-writer.

If there is no encoder specified for a writer, the HBlock.Writer will use a
default compression-context to do something very similar to today's code.

Test Plan: existing unit tests verified by mbautin and tedyu. And no new test
added here since this code is just a preparation for columnar encoder. Will add
testcase later in that diff.

Reviewers: dhruba, tedyu, sc, mbautin

Reviewed By: mbautin

Differential Revision: https://reviews.facebook.net/D2097 (Revision 1302602)

 Result = FAILURE
mbautin : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultDecodingContext.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultEncodingContext.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/Compression.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java


> Restructure hfiles layout for better compression
> 
>
> Key: HBASE-5313
> URL: https://issues.apache.org/jira/browse/HBASE-5313
> Project: HBase
>  Issue Type: Improvement
>  Components: io
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> An HFile block contains a stream of key-values. Can we organize these kvs 
> on the disk in a better way so that we get much greater compression ratios?
> One option (thanks Prakash) is to store all the keys in the beginning of the 
> block (let's call this the key-section) and then store all their 
> corresponding values towards the end of the block. This will allow us to 
> not even decompress the values when we are scanning and skipping over rows in 
> the block.
> Any other ideas? 
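
The key-section idea in the description above can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, nothing here is HBase's HFile code, and no real compression is performed. It only shows that a scan can skip over rows by reading keys alone when keys and values are stored in separate sections.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeySectionBlockSketch {
    // All keys are stored together at the front of the block ...
    private final List<byte[]> keySection = new ArrayList<>();
    // ... and the corresponding values are stored together at the end.
    private final List<byte[]> valueSection = new ArrayList<>();
    // Counts how many values were actually read (a stand-in for decompression work).
    private int valuesTouched = 0;

    void add(String key, String value) {
        keySection.add(key.getBytes(StandardCharsets.UTF_8));
        valueSection.add(value.getBytes(StandardCharsets.UTF_8));
    }

    // Scans only the key section; returns the entry index, or -1 if absent.
    int findKey(String key) {
        byte[] wanted = key.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < keySection.size(); i++) {
            if (Arrays.equals(keySection.get(i), wanted)) {
                return i;
            }
        }
        return -1;
    }

    // Reads a single value; only this call touches the value section.
    String valueAt(int index) {
        valuesTouched++;
        return new String(valueSection.get(index), StandardCharsets.UTF_8);
    }

    int valuesTouched() {
        return valuesTouched;
    }

    public static void main(String[] args) {
        KeySectionBlockSketch block = new KeySectionBlockSketch();
        block.add("row1", "v1");
        block.add("row2", "v2");
        block.add("row3", "v3");
        int idx = block.findKey("row3");
        System.out.println("values touched while seeking: " + block.valuesTouched());
        System.out.println("value: " + block.valueAt(idx));
    }
}
```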





[jira] [Commented] (HBASE-5521) Move compression/decompression to an encoder specific encoding context

2012-03-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233245#comment-13233245
 ] 

Hudson commented on HBASE-5521:
---

Integrated in HBase-TRUNK-security #143 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/143/])
HBASE-5521 [jira] Move compression/decompression to an encoder specific encoding context

Author: Yongqiang He

Summary:
https://issues.apache.org/jira/browse/HBASE-5521

As part of working on HBASE-5313, we want to add a new columnar encoder/decoder.
It makes sense to move compression to be part of encoder/decoder:
1) a scanner for a columnar encoded block can do lazy decompression to a
specific part of a key value object
2) avoid an extra bytes copy from encoder to hblock-writer.

If there is no encoder specified for a writer, the HBlock.Writer will use a
default compression-context to do something very similar to today's code.

Test Plan: existing unit tests verified by mbautin and tedyu. And no new test
added here since this code is just a preparation for columnar encoder. Will add
testcase later in that diff.

Reviewers: dhruba, tedyu, sc, mbautin

Reviewed By: mbautin

Differential Revision: https://reviews.facebook.net/D2097 (Revision 1302602)

 Result = FAILURE
mbautin : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultDecodingContext.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultEncodingContext.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/Compression.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java


> Move compression/decompression to an encoder specific encoding context
> --
>
> Key: HBASE-5521
> URL: https://issues.apache.org/jira/browse/HBASE-5521
> Project: HBase
>  Issue Type: Improvement
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Fix For: 0.96.0
>
> Attachments: 
> HBASE-5521-jira-Move-compression-decompression-to-an-2012-03-19_12_12_32.patch,
>  HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.10.patch, 
> HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch, HBASE-5521.D2097.4.patch, 
> HBASE-5521.D2097.5.patch, HBASE-5521.D2097.6.patch, HBASE-5521.D2097.7.patch, 
> HBASE-5521.D2097.8.patch, HBASE-5521.D2097.9.patch
>
>
> As part of working on HBASE-5313, we want to add a new columnar 
> encoder/decoder. It makes sense to move compression to be part of 
> encoder/decoder:
> 1) a scanner for a columnar encoded block can do lazy decompression to a 
> specific part of a key value object
> 2) avoid an extra bytes copy from encoder to hblock-writer. 
> If there is no encoder specified for a writer, the HBlock.Writer will use a 
> default compression-context to do something very similar to today's code.


[jira] [Commented] (HBASE-5599) The hbkc tool can not fix the six scenarios, it is NO_VERSION_FILE, NOT_IN_META_OR_DEPLOYED, NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, FIRST_REGION_STARTKEY_NOT_EMPTY, HOLE_

2012-03-19 Thread fulin wang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233247#comment-13233247
 ] 

fulin wang commented on HBASE-5599:
---

I will update the patch today; I am still testing it.
Please refer to the patch. I think you should consider the 
scenarios covered in my patch.
I hope the hbck tool will become more robust. Thanks.

> The hbkc tool can not fix the six scenarios, it is NO_VERSION_FILE, 
> NOT_IN_META_OR_DEPLOYED, NOT_IN_META, SHOULD_NOT_BE_DEPLOYED, 
> FIRST_REGION_STARTKEY_NOT_EMPTY, HOLE_IN_REGION_CHAIN.
> 
>
> Key: HBASE-5599
> URL: https://issues.apache.org/jira/browse/HBASE-5599
> Project: HBase
>  Issue Type: New Feature
>  Components: hbck
>Affects Versions: 0.90.6
>Reporter: fulin wang
> Fix For: 0.90.6
>
> Attachments: hbase-5599-0.90.patch
>
>
> The hbck tool cannot fix the following six scenarios.
> 1. Version file does not exist in root dir.
>    Fix: I try to create a version file with the 'FSUtils.setVersion' method.
>
> 2. [REGIONNAME][KEY] on HDFS, but not listed in META or deployed on any 
> region server.
>    Fix: I get the region info from the hdfs file and write this region info to 
> the '.META.' table.
>
> 3. [REGIONNAME][KEY] not in META, but deployed on [SERVERNAME]
>    Fix: I get the region info from the hdfs file and write this region info to 
> the '.META.' table.
>
> 4. [REGIONNAME] should not be deployed according to META, but is deployed on 
> [SERVERNAME]
>    Fix: Close this region.
>
> 5. First region should start with an empty key.  You need to create a new 
> region and regioninfo in HDFS to plug the hole.
>    Fix: The region info is not in hdfs or .META., so it creates an empty 
> region for this error.
>
> 6. There is a hole in the region chain between [KEY] and [KEY]. You need to 
> create a new regioninfo and region dir in hdfs to plug the hole.
>    Fix: The region info is not in hdfs or .META., so it creates an empty 
> region for this hole.
>





[jira] [Resolved] (HBASE-5482) In 0.90, balancer algo leading to same region balanced twice and picking same region with Src and Destination as same RS.

2012-03-19 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5482.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

> In 0.90, balancer algo leading to same region balanced twice and picking same 
> region with Src and Destination as same RS.
> -
>
> Key: HBASE-5482
> URL: https://issues.apache.org/jira/browse/HBASE-5482
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.7
>
> Attachments: 5482-v2.txt, HBASE-5482_1.patch, HBASE-5482_2.patch
>
>
> There are 2 possible problems:
> -> When we populate regionsToMove while iterating the serverinfo in 
> descending order, there is a chance that the same region is added twice, 
> because in the first loop we randomize the regions, 
> whereas when we still have neededRegions != 0 we just get the region at the 
> index and add it again. This may lead to the same region appearing twice in the 
> regionsToMove list.
> -> Another problem is that 
> when the problem in the first point happens, there is a chance that
> a regionsToMove entry has the same src and destination, and the same region 
> can be picked every 5 mins.
> {code}
> for (Map.Entry<HServerInfo, List<HRegionInfo>> server :
>     serversByLoad.descendingMap().entrySet()) {
>   BalanceInfo balanceInfo = serverBalanceInfo.get(server.getKey());
>   int idx =
>     balanceInfo == null ? 0 : balanceInfo.getNextRegionForUnload();
>   if (idx >= server.getValue().size()) break;
>   HRegionInfo region = server.getValue().get(idx);
>   if (region.isMetaRegion()) continue; // Don't move meta regions.
>   regionsToMove.add(new RegionPlan(region, server.getKey(), null));
>   if (--neededRegions == 0) {
>     // No more regions needed, done shedding
>     break;
>   }
> }
> {code}
> If meta and root are on the two most loaded region servers (of 3 RS in 
> total), we skip the regions on those servers and pick the region from the 
> least loaded RS.
> Then in the next loop we iterate from the least loaded server and assign 
> that same server as the destination.
> As a result, balancing happens every 5 minutes with the same server as both 
> src and dest.
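
The two failure modes described above can be illustrated with a small, self-contained Java sketch. It is not the change in the attached patches. `BalancerDedupSketch`, its nested `Plan` class, and both helper methods are hypothetical names, and plain strings stand in for HRegionInfo and the server type. The sketch only shows the two guards the description implies: refuse to add a region that is already in regionsToMove, and refuse a destination equal to the source.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not HBase code and not the HBASE-5482 patch.
class BalancerDedupSketch {

    // Simplified stand-in for RegionPlan: region and servers are plain names.
    static final class Plan {
        final String region;
        final String source;
        String destination; // null until a destination is chosen

        Plan(String region, String source) {
            this.region = region;
            this.source = source;
        }
    }

    // Adds a plan only if the region is not already scheduled to move.
    // Returns false when the region was seen before (the "added twice" case).
    static boolean addIfAbsent(List<Plan> regionsToMove, Set<String> seen,
                               String region, String source) {
        if (!seen.add(region)) {
            return false;
        }
        regionsToMove.add(new Plan(region, source));
        return true;
    }

    // Assigns a destination, rejecting one that equals the plan's source.
    static boolean assignDestination(Plan plan, String candidate) {
        if (candidate.equals(plan.source)) {
            return false;
        }
        plan.destination = candidate;
        return true;
    }

    public static void main(String[] args) {
        List<Plan> regionsToMove = new ArrayList<>();
        Set<String> seen = new HashSet<>();

        // The shuffled first loop and the index-based second loop both pick r1.
        addIfAbsent(regionsToMove, seen, "r1", "rs3");
        boolean addedAgain = addIfAbsent(regionsToMove, seen, "r1", "rs3");
        System.out.println("second add accepted: " + addedAgain);

        // The least loaded server must not be chosen as its own destination.
        Plan plan = regionsToMove.get(0);
        System.out.println("rs3 accepted: " + assignDestination(plan, "rs3"));
        System.out.println("rs1 accepted: " + assignDestination(plan, "rs1"));
    }
}
```

Tracking scheduled regions in a separate set keeps the duplicate check constant-time while regionsToMove keeps its insertion order.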





[jira] [Commented] (HBASE-5583) Master restart on create table with splitkeys does not recreate table with all the splitkey regions

2012-03-19 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233252#comment-13233252
 ] 

ramkrishna.s.vasudevan commented on HBASE-5583:
---

@Jon 
Yes we should go with some form of persisting the splitkeys.  ZK should be the 
best place for that.

> Master restart on create table with splitkeys does not recreate table with 
> all the splitkey regions
> ---
>
> Key: HBASE-5583
> URL: https://issues.apache.org/jira/browse/HBASE-5583
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.96.0
>
>
> -> Create a table using splitkeys
> -> Master goes down before all regions are added to meta
> -> On master restart the table is enabled again, but with fewer regions 
> than specified in splitkeys
> The client will get an exception anyway if it called sync create table, but 
> a table-exists check will still say the table exists. 
> Should this scenario be handled by the client only, or can we have some 
> mechanism on the master side for it? Please suggest.
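
Wherever the split keys end up being persisted, a master-side check on restart could compare them against the regions actually present in meta. The Java sketch below is only an illustration under stated assumptions: `SplitKeyCheckSketch` and its methods are hypothetical, keys are modeled as strings rather than byte arrays, and the set of start keys found in meta is passed in rather than read from a real cluster. It relies on one fact: a table created with N split keys has N + 1 regions, the first with an empty start key and the rest starting at each split key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not HBase code. Keys are strings for readability.
class SplitKeyCheckSketch {

    // Start keys expected for a table created with the given split keys:
    // the empty start key of the first region, then each split key.
    static List<String> expectedStartKeys(List<String> splitKeys) {
        List<String> expected = new ArrayList<>();
        expected.add("");
        expected.addAll(splitKeys);
        return expected;
    }

    // Returns the expected start keys that have no region in meta.
    static List<String> missingStartKeys(List<String> splitKeys,
                                         Set<String> startKeysInMeta) {
        List<String> missing = new ArrayList<>();
        for (String key : expectedStartKeys(splitKeys)) {
            if (!startKeysInMeta.contains(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Table requested with three split keys, so four regions are expected.
        List<String> splitKeys = Arrays.asList("b", "c", "d");
        // Master went down after only two regions were added to meta.
        Set<String> inMeta = new HashSet<>(Arrays.asList("", "b"));

        List<String> missing = missingStartKeys(splitKeys, inMeta);
        System.out.println("regions still to create start at: " + missing);
    }
}
```

An empty result would mean the table was fully created; a non-empty one lists the regions a restarted master would still need to add.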
