[jira] [Commented] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...
[ https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070331#comment-13070331 ] gaojinchao commented on HBASE-4064:
---

The master may also crash, because the pool shutdown is asynchronous. The master log shows:

2011-07-22 13:33:27,806 INFO org.apache.hadoop.hbase.master.handler.EnableTableHandler: Table has 2156 regions of which 2156 are online.
2011-07-22 13:34:28,646 INFO org.apache.hadoop.hbase.master.handler.EnableTableHandler: Table has 2156 regions of which 982 are online.
2011-07-22 13:34:31,079 WARN org.apache.hadoop.hbase.master.AssignmentManager: gjc:xxx ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229.
2011-07-22 13:34:31,080 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:6-0x31502ef4f0 Creating (or updating) unassigned node for c9b1c97ac6c00033ceb1890e45e66229 with OFFLINE state
2011-07-22 13:34:31,104 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229. state=OFFLINE, ts=1311312871080
2011-07-22 13:34:31,121 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229. so generated a random one; hri=ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229., src=, dest=C4C2.site,60020,1311310281335; 3 (online=3, exclude=null) available servers
2011-07-22 13:34:31,121 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229. to C4C2.site,60020,1311310281335
2011-07-22 13:34:31,122 WARN org.apache.hadoop.hbase.master.AssignmentManager: gjc:xxx ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229.
2011-07-22 13:34:31,123 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected state trying to OFFLINE; ufdr5,0590386138,1311057525896.c9b1c97ac6c00033ceb1890e45e66229. state=PENDING_OPEN, ts=1311312871121
java.lang.IllegalStateException
    at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1081)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1036)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:864)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:844)
    at java.lang.Thread.run(Thread.java:662)
2011-07-22 13:34:31,125 INFO org.apache.hadoop.hbase.master.HMaster: Aborting

Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...
----------------------------------------------------------------------------------------------------------------------

Key: HBASE-4064
URL: https://issues.apache.org/jira/browse/HBASE-4064
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.90.3
Reporter: Jieshan Bean
Fix For: 0.90.5
Attachments: HBASE-4064-v1.patch, HBASE-4064_branch90V2.patch, HBASE-4064_branch90V2.patch, disableflow.png

1. If there is a stale RegionState object with PENDING_CLOSE in regionsInTransition (the RegionState was left behind by some exception and should have been removed; that is why I call it a stale object), but the region is not currently assigned anywhere, TimeoutMonitor falls into an endless loop:

2011-06-27 10:32:21,326 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301
2011-06-27 10:32:21,326 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
2011-06-27 10:32:21,438 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining)
2011-06-27 10:32:21,441 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere
2011-06-27 10:32:31,207 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301
2011-06-27 10:32:31,207 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
2011-06-27 10:32:31,215 DEBUG
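For illustration, here is a simplified model of why the TimeoutMonitor loops forever on such a stale entry (assumed names and structure; this is not the actual AssignmentManager code):

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TimeoutLoopSketch {
  static final class RegionState {
    final String state; final long ts;
    RegionState(String state, long ts) { this.state = state; this.ts = ts; }
  }

  final Map<String, RegionState> regionsInTransition = new ConcurrentHashMap<>();
  final Map<String, String> assignments = new ConcurrentHashMap<>(); // region -> server

  void timeoutMonitorChore(long now, long timeoutMs) {
    for (Map.Entry<String, RegionState> e : regionsInTransition.entrySet()) {
      RegionState rs = e.getValue();
      if ("PENDING_CLOSE".equals(rs.state) && now - rs.ts > timeoutMs) {
        unassign(e.getKey()); // "running forced unassign again on region=..."
      }
    }
  }

  void unassign(String region) {
    String server = assignments.get(region);
    if (server == null) {
      // "Attempted to unassign region ... but it is not currently assigned
      // anywhere" -- the stale PENDING_CLOSE entry is NOT removed here, so
      // the next chore run times out on the same region again, forever.
      return;
    }
    // ... otherwise send the close RPC and update regionsInTransition ...
  }
}
{code}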
[jira] [Updated] (HBASE-4120) isolation and allocation
[ https://issues.apache.org/jira/browse/HBASE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Jia updated HBASE-4120:
---
Attachment: System Structure.jpg

the relationship between groups and table priority

isolation and allocation
------------------------

Key: HBASE-4120
URL: https://issues.apache.org/jira/browse/HBASE-4120
Project: HBase
Issue Type: New Feature
Components: master, regionserver
Affects Versions: 0.90.2
Reporter: Liu Jia
Attachments: Design_document_for_HBase_isolation_and_allocation.pdf, HBase_isolation_and_allocation_user_guide.pdf, Performance_of_Table_priority.pdf, System Structure.jpg

The HBase isolation and allocation tool is designed to help users manage cluster resources among different applications and tables. When a large HBase cluster hosts many applications, lots of problems follow. At Taobao there is a cluster where many departments test the performance of their HBase-based applications. On this cluster of 12 servers, only one application could run exclusively at a time, and all other applications had to wait until the previous test finished. After we added the allocation management function to the cluster, applications can share the cluster and run concurrently. And if a test engineer wants to make sure there is no interference, he/she can move the other tables out of the group. Within a group we use table priority to allocate resources: when the system is busy, we can make sure high-priority tables are not affected by lower-priority tables. Different groups can have different region server configurations; groups optimized for reading can have a large block cache, and groups optimized for writing can have a large memstore. Tables and region servers can be moved easily between groups, and after changing the configuration, a group can be restarted alone instead of restarting the whole cluster. Git entry: https://github.com/ICT-Ope/HBase_allocation . We hope our work is helpful.
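For illustration only, the per-group tuning described above could be expressed with standard HBase settings such as the following hbase-site.xml fragments (the values are made up; the grouping mechanism itself comes from the attached tool and is not shown):

{code:xml}
<!-- Read-optimized group: enlarge the block cache (fraction of heap). -->
<property>
  <name>hfile.block.cache.size</name>
  <value>0.4</value>
</property>

<!-- Write-optimized group: give memstores a larger share of the heap. -->
<property>
  <name>hbase.regionserver.global.memstore.upperLimit</name>
  <value>0.45</value>
</property>
<property>
  <name>hbase.regionserver.global.memstore.lowerLimit</name>
  <value>0.4</value>
</property>
{code}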
[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region; the new active HM re-assigned it, but the RS warned 'already online on this server'.
[ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070342#comment-13070342 ] fulin wang commented on HBASE-4124:
---

I can't find where getRegionsInTransitionInRS().add() is called, so I do not understand why this function was added. About the 'already online on this server' error, I think the region should be closed or reassigned. I am trying to make a patch.

ZK restarted while assigning a region; the new active HM re-assigned it, but the RS warned 'already online on this server'.
----------------------------------------------------------------------------------------------------------------------------

Key: HBASE-4124
URL: https://issues.apache.org/jira/browse/HBASE-4124
Project: HBase
Issue Type: Bug
Components: master
Reporter: fulin wang
Attachments: log.txt
Original Estimate: 0.4h
Remaining Estimate: 0.4h

ZK restarted while assigning a region; the new active HM re-assigned it, but the RS warned 'already online on this server'.
Issue: The RS fails because of 'already online on this server' and returns; the HM does not receive the message and reports 'Regions in transition timed out'.
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070349#comment-13070349 ] dhruba borthakur commented on HBASE-4132:
---

stack: the LogCleaner API allows archived logs to be deleted according to a configurable policy. One can set hbase.master.logcleaner.plugins to set up one's own policy. In that sense, it is already pluggable. Moreover, this is done by the master, whereas the WALObserver interface is in the RegionServer. Given the above, do you think that this patch needs to touch LogCleaner at all? If so, what is your proposal?

Andrew: the WALObserver API additions should follow the same practice of providing before (pre) and after (post) hooks as everywhere else. In that sense, it already has logRollRequested and logRolled. Similarly, I added logArchiveStart and logArchiveCompleted. The remaining one is visitLogEntryBeforeWrite. Are you suggesting that we add a visitLogEntryAfterWrite as well?

Extend the WALObserver API to accommodate log archival
------------------------------------------------------

Key: HBASE-4132
URL: https://issues.apache.org/jira/browse/HBASE-4132
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Fix For: 0.92.0
Attachments: walArchive.txt

The WALObserver interface exposes the log roll events. It would be nice to extend it to accommodate log archival events as well.
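For context, a policy of the kind described above is a class wired in via hbase.master.logcleaner.plugins. A minimal sketch, assuming the 0.90-era LogCleanerDelegate interface with a single isLogDeletable(Path) callback and timestamp-suffixed HLog file names (both are assumptions about that era's API, not guarantees):

{code}
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.master.LogCleanerDelegate;

/** Sketch: keep archived HLogs for 7 days, then allow the master to delete them. */
public class KeepSevenDaysLogCleaner extends Configured implements LogCleanerDelegate {
  private static final long TTL_MS = 7L * 24 * 60 * 60 * 1000;

  @Override
  public boolean isLogDeletable(Path filePath) {
    // HLog file names of this era end with a timestamp suffix; parse it and
    // only allow deletion once the log is older than the TTL.
    String name = filePath.getName();
    try {
      long ts = Long.parseLong(name.substring(name.lastIndexOf('.') + 1));
      return System.currentTimeMillis() - ts > TTL_MS;
    } catch (NumberFormatException e) {
      return false; // unknown name format: be conservative, keep the file
    }
  }
}
{code}

The class would then be named in hbase.master.logcleaner.plugins so the master picks it up.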
[jira] [Updated] (HBASE-4134) The total number of regions was more than the actual region count while balancing after the hbck fix.
[ https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] feng xu updated HBASE-4134:
---
Description:

1. I found the problem (some regions were multiply assigned) while running hbck to check the cluster's health. Here's the result:
{noformat}
ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020
ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020
ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is listed in META on region server 158-1-91-103:20020 but is multiply assigned to region servers 158-1-91-103:20020, 158-1-91-105:20020
Summary:
-ROOT- is okay. Number of regions: 1 Deployed on: 158-1-91-105:20020
.META. is okay. Number of regions: 1 Deployed on: 158-1-91-103:20020
test1 is okay. Number of regions: 25297 Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020
14829 inconsistencies detected.
Status: INCONSISTENT
{noformat}
2. Then I tried to use hbck -fix to fix the problems, and everything seemed OK. But I found that the total number of regions (35029) was more than the actual region count (25299) while balancing after the fix. Here's the related log snippet:
{noformat}
2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=25299 average=8433.0 mostloaded=8433
2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=35029 average=11676.333 mostloaded=11677 leastloaded=11676
{noformat}
3. I tracked one region's behavior during this time, taking the region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as an example:
(1) It was assigned to 158-1-91-101 at first.
(2) HBCK sent a closing request to the RegionServer, and the RegionServer closed the region silently without notifying the HMaster.
(3) As far as the HMaster knew, the region was still carried by RS 158-1-91-103.
(4) HBCK then triggered a new assignment. The region was assigned again, but the old assignment information still remained in the sets AM#regions and AM#servers. That's why the reported region count became larger than the actual number.
{noformat}
Line 178967: 2011-07-22 02:47:51,247 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., server=HBCKServerName, state=M_ZK_REGION_OFFLINE)
Line 178968: 2011-07-22 02:47:51,247 INFO org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, region=52782c0241a598b3e37ca8729da0
Line 178969: 2011-07-22 02:47:51,248 INFO org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering assignment of region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0.
Line 178970: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) available servers
Line 178971: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 158-1-91-101,20020,1311231878544
Line 178983: 2011-07-22 02:47:51,285 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544, region=52782c0241a598b3e37ca8729da0
Line 179001: 2011-07-22 02:47:51,318 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=158-1-91-101,20020,1311231878544, region=52782c0241a598b3e37ca8729da0
Line 179002: 2011-07-22 02:47:51,319 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED event for 52782c0241a598b3e37ca8729da0; deleting unassigned node
Line 179003: 2011-07-22 02:47:51,319 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:2-0x1314ac5addb0042-0x1314ac5addb0042 Deleting existing unassigned node for 52782c0241a598b3e37ca8729da0 that is in expected state RS_ZK_REGION_OPENED
Line 179007: 2011-07-22 02:47:51,326 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:2-0x1314ac5addb0042-0x1314ac5addb0042 Successfully
{noformat}
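To make step (4) concrete, here is a simplified model of how re-assigning without clearing the old bookkeeping can inflate the region count the balancer sees (an illustration only, not the actual AssignmentManager code):

{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DoubleCountSketch {
  // AM#regions: region -> server currently hosting it
  final Map<String, String> regions = new HashMap<>();
  // AM#servers: server -> set of regions it hosts
  final Map<String, Set<String>> servers = new HashMap<>();

  void assign(String region, String server) {
    // The bug being modeled: the old entry is not removed first, so after an
    // HBCK-triggered re-assignment the region appears under two servers.
    regions.put(region, server);
    servers.computeIfAbsent(server, s -> new HashSet<>()).add(region);
  }

  int balancerRegionCount() {
    // The balancer sums per-server region sets, so a doubly-listed region is
    // counted twice: regions=35029 reported while only 25299 really exist.
    int total = 0;
    for (Set<String> rs : servers.values()) total += rs.size();
    return total;
  }

  public static void main(String[] args) {
    DoubleCountSketch am = new DoubleCountSketch();
    am.assign("test1,282187,...", "158-1-91-103");
    // HBCK closes the region on 103 silently, then triggers a new assignment:
    am.assign("test1,282187,...", "158-1-91-101");
    System.out.println(am.balancerRegionCount()); // prints 2 for one region
  }
}
{code}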
[jira] [Updated] (HBASE-4137) Implementation of Distributed Hash Table(DHT) for lookup service
[ https://issues.apache.org/jira/browse/HBASE-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vamshi updated HBASE-4137:
---
Description: To implement a scalable data structure, i.e. a Distributed Hash Table (DHT), in HBase. To perform fast lookups in HBase we can take the help of a DHT, a scalable data structure. (was: To implement a scalable data structure i.e Distributed hash table in the HBase. )
Summary: Implementation of Distributed Hash Table(DHT) for lookup service (was: Implementation of Distributed Hash Table(DHT) )

Implementation of Distributed Hash Table(DHT) for lookup service
-----------------------------------------------------------------

Key: HBASE-4137
URL: https://issues.apache.org/jira/browse/HBASE-4137
Project: HBase
Issue Type: Improvement
Components: performance
Affects Versions: 0.90.3
Reporter: vamshi

To implement a scalable data structure, i.e. a Distributed Hash Table (DHT), in HBase. To perform fast lookups in HBase we can take the help of a DHT, a scalable data structure.
[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster
[ https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070371#comment-13070371 ] vamshi commented on HBASE-1938:
---

Hi stack, how can we perform lookup/scanning in HBase? Can we use a Distributed Hash Table (DHT) for that? I want to implement a scalable data structure, i.e. a DHT, in HBase. How can I proceed? Please help me. Thank you.

Make in-memory table scanning faster
------------------------------------

Key: HBASE-1938
URL: https://issues.apache.org/jira/browse/HBASE-1938
Project: HBase
Issue Type: Improvement
Components: performance
Reporter: stack
Assignee: stack
Priority: Blocker
Attachments: MemStoreScanPerformance.java, MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch

This issue is about profiling hbase to see if I can make hbase scans run faster when all is up in memory. Talking to some users, they are seeing about 1/4 million rows a second. It should be able to go faster than this (scanning an array of objects, they can do about 4-5x this).
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070374#comment-13070374 ] ramkrishna.s.vasudevan commented on HBASE-3845:
---

Ted, I tried using 'this.cacheFlushLock.isHeldByCurrentThread()'. The problem is that HLog.append() may be called by another thread, whereas HRegion.internalFlushCache() is called by the memstore flusher thread. So if we check this.cacheFlushLock.isHeldByCurrentThread() there, it returns false. So, as per your suggestion, I have inlined the isFlushInProgress check into wal.startCacheFlush() and wal.abortCacheFlush() and am still going with the AtomicBoolean. Is that fine, Ted? I am planning to upload the patch with these changes.

data loss because lastSeqWritten can miss memstore edits
--------------------------------------------------------

Key: HBASE-3845
URL: https://issues.apache.org/jira/browse/HBASE-3845
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.3
Reporter: Prakash Khemani
Assignee: ramkrishna.s.vasudevan
Priority: Critical
Fix For: 0.90.5
Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, HBASE-3845_4.patch, HBASE-3845__trunk.patch

(I don't have a test case to prove this yet, but I have run it by Dhruba and Kannan internally and wanted to put this up for some feedback.)
In this discussion let us assume that the region has only one column family; that way I can use region/memstore interchangeably.
After a memstore flush it is possible for lastSeqWritten to have a log-sequence-id for a region that is not the earliest log-sequence-id for that region's memstore.
HLog.append() does a putIfAbsent into lastSeqWritten. This is to ensure that we only keep track of the earliest log-sequence-number that is present in the memstore. Every time the memstore is flushed we remove the region's entry from lastSeqWritten and wait for the next append to populate this entry again. This is where the problem happens.
step 1: flusher.prepare() snapshots the memstore under HRegion.updatesLock.writeLock().
step 2: as soon as the updatesLock.writeLock() is released, new entries will be added into the memstore.
step 3: wal.completeCacheFlush() is called. This method removes the region's entry from lastSeqWritten.
step 4: the next append will create a new entry for the region in lastSeqWritten. But this will be the log seq id of the current append. All the edits that were added in step 2 are missing.
==
As a temporary measure, instead of removing the region's entry in step 3, I will replace it with the log-seq-id of the region-flush-event.
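To make the four steps concrete, here is a compact model of the race (simplified names and structure; this is not the actual HLog code):

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

class LastSeqWrittenSketch {
  final AtomicLong logSeqNum = new AtomicLong();
  // region -> OLDEST log seq id whose edit is still only in the memstore
  final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();

  long append(String region) {
    long seq = logSeqNum.incrementAndGet();
    // putIfAbsent keeps the EARLIEST un-flushed seq id for the region
    lastSeqWritten.putIfAbsent(region, seq);
    return seq;
  }

  void completeCacheFlush(String region) {
    // Step 3: removes the entry unconditionally. Any edits appended after the
    // snapshot (step 2) but before this call lose their "earliest seq id"
    // marker; step 4's append re-registers a LATER seq id, so log replay
    // after a crash can skip those step-2 edits, i.e. data loss.
    lastSeqWritten.remove(region);
  }
}
{code}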
[jira] [Created] (HBASE-4138) If zookeeper.znode.parent is not specified explicitly in client code then the HTable object loops continuously, waiting for the root region by using /hbase as the base node.
If zookeeper.znode.parent is not specified explicitly in client code then the HTable object loops continuously, waiting for the root region by using /hbase as the base node.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Key: HBASE-4138
URL: https://issues.apache.org/jira/browse/HBASE-4138
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.90.3
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.90.4

Change the zookeeper.znode.parent property (the default is /hbase), but do not specify this change in the client code. Now use an HTable object: the HTable is not able to find the root region and keeps looping continuously. See the stack trace:

Object.wait(long) line: not available [native method]
RootRegionTracker(ZooKeeperNodeTracker).blockUntilAvailable(long) line: 122
RootRegionTracker.waitRootRegionLocation(long) line: 73
HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[], boolean) line: 578
HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[]) line: 558
HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[], byte[], byte[], boolean, Object) line: 687
HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[], boolean) line: 589
HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[]) line: 558
HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[], byte[], byte[], boolean, Object) line: 687
HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[], boolean) line: 593
HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[]) line: 558
HTable.init(Configuration, byte[]) line: 171
HTable.init(Configuration, String) line: 145
HBaseTest.test() line: 45
[jira] [Commented] (HBASE-2827) HBase Client doesn't handle master failover
[ https://issues.apache.org/jira/browse/HBASE-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070383#comment-13070383 ] vamshi commented on HBASE-2827:
---

Hi Jonathan, maybe this question is irrelevant in this place, but please let me know whether we can implement distributed hashing in HBase for fast lookup/scanning purposes. I want to implement a scalable data structure, i.e. a DHT, in HBase; how can I proceed? Thank you.

HBase Client doesn't handle master failover
-------------------------------------------

Key: HBASE-2827
URL: https://issues.apache.org/jira/browse/HBASE-2827
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.90.0
Reporter: Nicolas Spiegelberg
Assignee: Jonathan Gray

A client on our beta tier was stuck in this exception loop when we issued a new HMaster after the old one died:

Exception while trying to connect hBase java.lang.reflect.UndeclaredThrowableException
    at $Proxy1.getClusterStatus(Unknown Source)
    at org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:912)
    at org.apache.hadoop.hbase.client.HTable.getCurrentNrHRS(HTable.java:170)
    at org.apache.hadoop.hbase.client.HTable.init(HTable.java:143)
    ...
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.net.SocketTimeoutException: 2 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.18.34.212:6]
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:406)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:309)
    at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:856)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:724)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:252)
    ... 20 more
12:52:55,863 [pool-4-thread-5182] INFO PersistentUtil:153 - Retry after 1 second...

Looking at the client code, the HConnectionManager does not watch ZK for NodeDeleted/NodeCreated events on /hbase/master.
[jira] [Commented] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.
[ https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070384#comment-13070384 ] vamshi commented on HBASE-2645:
---

Hi Todd, maybe this question is irrelevant in this place, but please let me know whether we can implement distributed hashing in HBase for fast lookup/scanning purposes. I want to implement a scalable data structure, i.e. a DHT, in HBase; how can I proceed? Thank you.

HLog writer can do 1-2 sync operations after lease has been recovered for split process.
-----------------------------------------------------------------------------------------

Key: HBASE-2645
URL: https://issues.apache.org/jira/browse/HBASE-2645
Project: HBase
Issue Type: Bug
Components: filters
Affects Versions: 0.90.4
Reporter: Cosmin Lehene
Assignee: Todd Lipcon
Priority: Blocker
Fix For: 0.94.0

TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing. This test starts a thread that writes one edit to the log, syncs and counts. During this, an HLog.splitLog operation is started. splitLog recovers the log lease before reading the log, so that the original regionserver cannot wake up and write after the split process has started.
The test compares the number of edits reported by the split process and by the writer thread. The writer thread (called zombie in the test) should report a count <= that of splitLog (sync() might raise after the last edit gets written, in which case the edit won't get counted by the zombie thread). However it appears that the zombie counts 1-2 more edits. So it looks like it can sync without a lease. This might be an hdfs-0.20 related issue.
[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster
[ https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070458#comment-13070458 ] nkeywal commented on HBASE-1938:
---

Hello Stack, ... accesses and perhaps to make it go faster. I will have a look at it; I see as well in this test and in the global profiling that a lot of time is spent on it:

scanner.next();

There are two iterators in the class (kvsetIt and snapshotIt), and getLowest compares the two to return the lowest. However, in this test one of the lists is empty, so its value is null, and hence the real comparison on byte[] is not executed.

On this subject, there is a possible optimisation of the function peek, which otherwise repeats the comparison: if peek is called multiple times, or if we often have peek() followed by next(), we can save the redundant comparisons. To me, it makes sense to precalculate the value returned by peek and reuse it in next().

The profiling (method: sampling, java inlining deactivated) says something interesting:

Name; total time spent
org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.next() 100%
org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getNext(Iterator) 88%
org.apache.hadoop.hbase.regionserver.KeyValueSkipListSet$MapEntryIterator.next() 44%
java.util.concurrent.ConcurrentSkipListMap$SubMap$SubMapEntryIterator.next() 36%
org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.getThreadReadPoint() 26%
java.lang.ThreadLocal.get() 21%
org.apache.hadoop.hbase.regionserver.KeyValueSkipListSet$MapEntryIterator.hasNext() 8%
org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getLowest() 7%
java.util.concurrent.ConcurrentSkipListMap$SubMap$SubMapIter.hasNext() 3%
org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getLower(KeyValue, KeyValue) 3%
java.lang.Long.longValue() 2%

So we're spending 26% of the time on this:
org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.getThreadReadPoint() 26%
And in this getThreadReadPoint(), the actual time is spent in:
java.lang.ThreadLocal.get() 21%

It's a TLS, so we can expect a system call to get the thread id. It would be great to save this call in next(). There is at least an improvement for the case when one of the lists is done: don't read getThreadReadPoint() at all. That would not change the behaviour, but would already be interesting (maybe 10% in this test). Another option is to share the getThreadReadPoint() value between the two iterators, i.e. read the value in the next() function and pass it as a parameter to getNext(). In fact, as this value seems to be thread-local, I don't see how it could change during the execution of next(). What do you think?

Last question on this: what is the use case in which getThreadReadPoint() changes during a scan (i.e. between next() calls)? Most of the public methods (except reseek) are synchronized; does that imply the scanner can be shared between threads?

At the end, it seems that there are 3 possible things to do (see the sketch after this list):
1) Replacement of "KeyValue lowest = getLowest();"
2) theNext precalculation for peek() and next()
3) Depending on your feedback, one of the options above on getThreadReadPoint().

This should give a 5 to 15% increase in performance; not a problem-solving change, but it could justify a first patch. I can do it (with the hbase indenting :-)
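A sketch of the theNext precalculation from item 2 above (a simplified stand-in, not the actual MemStoreScanner code; a generic Comparable bound stands in for KeyValue and its comparator):

{code}
import java.util.Iterator;

/** Merge two sorted iterators, caching the lowest element so peek() is free. */
class CachingLowestScanner<KV extends Comparable<KV>> {
  private final Iterator<KV> kvsetIt;
  private final Iterator<KV> snapshotIt;
  private KV kvsetNext, snapshotNext;
  private KV theNext; // precomputed once, shared by peek() and next()

  CachingLowestScanner(Iterator<KV> kvsetIt, Iterator<KV> snapshotIt) {
    this.kvsetIt = kvsetIt;
    this.snapshotIt = snapshotIt;
    kvsetNext = kvsetIt.hasNext() ? kvsetIt.next() : null;
    snapshotNext = snapshotIt.hasNext() ? snapshotIt.next() : null;
    theNext = getLowest();
  }

  /** No comparison here: repeated peek() calls cost nothing. */
  KV peek() { return theNext; }

  KV next() {
    KV ret = theNext;
    if (ret == null) return null;
    // advance whichever iterator supplied the returned value
    if (ret == kvsetNext) {
      kvsetNext = kvsetIt.hasNext() ? kvsetIt.next() : null;
    } else {
      snapshotNext = snapshotIt.hasNext() ? snapshotIt.next() : null;
    }
    theNext = getLowest(); // the single comparison per step
    return ret;
  }

  private KV getLowest() {
    if (kvsetNext == null) return snapshotNext;   // empty-list fast path:
    if (snapshotNext == null) return kvsetNext;   // no byte[] comparison
    return kvsetNext.compareTo(snapshotNext) <= 0 ? kvsetNext : snapshotNext;
  }
}
{code}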
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070470#comment-13070470 ] Ted Yu commented on HBASE-3845:
---

That is fine.
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845:
---
Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845:
---
Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845:
---
Attachment: HBASE-3845_5.patch
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845:
---
Attachment: HBASE-3845_trunk_2.patch
[jira] [Commented] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix
[ https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070498#comment-13070498 ] Ted Yu commented on HBASE-4134:
---

https://issues.apache.org/jira/browse/HBASE-4053 is in 0.90.4 RC1.
Do you want to try out RC1 to see if the situation of double counting has improved?

The total number of regions was more than the actual region count after the hbck fix
-------------------------------------------------------------------------------------

Key: HBASE-4134
URL: https://issues.apache.org/jira/browse/HBASE-4134
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.3
Reporter: feng xu
Fix For: 0.90.4
[jira] [Commented] (HBASE-4138) If zookeeper.znode.parent is not specified explicitly in client code then the HTable object loops continuously, waiting for the root region by using /hbase as the base node.
[ https://issues.apache.org/jira/browse/HBASE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070520#comment-13070520 ] ramkrishna.s.vasudevan commented on HBASE-4138:
---

1. I tried to identify the problem in HBASE-4138 and ended up with the following analysis.
The HMaster creates the base node, along with the unassigned node, RS node and table node, based on the zookeeper.znode.parent property. Currently, when we use HTable(), getConnection() creates a new connection if this value is not configured. Two points to note here:
1) The HTable documentation clearly tells us to use the same Configuration object. But what if that is not done, and in particular someone forgets to set this base node property? It may even be that I have configured the property in my RS instance but not in the master instance.
2) Was the reuse of the getConnection() logic across all levels intended?
The major problem lies in HConnectionManager.setupZookeeperTrackers(), which tries to create the base nodes again. What I feel here is that this should not be done: only the master should have the right to create them, otherwise there is a high possibility that multiple base nodes get created. Currently, as the client creates the node once again with the default value '/hbase', the client keeps waiting indefinitely to learn the root location.
What happens in the Admin case: the same thing happens, but in HBaseAdmin() we call the connection.getMaster() API, which throws an exception: 'ZooKeeper available but no active master location found'.
So we should prevent the Admin or HTable (in general any client, even an RS) from creating the base nodes, and whatever is created by the master should be used by the clients.
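A minimal client-side sketch of the workaround implied by the report: carry the same zookeeper.znode.parent in the client's Configuration ('/hbase-custom' and the table name 't1' are made-up values for illustration):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class ZnodeParentClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Must match the value the cluster was started with; if this is omitted
    // the client falls back to the default /hbase and blocks forever in
    // ZooKeeperNodeTracker.blockUntilAvailable() waiting for the root region.
    conf.set("zookeeper.znode.parent", "/hbase-custom");
    HTable table = new HTable(conf, "t1"); // HTable(Configuration, String)
    // ... use the table ...
    table.close();
  }
}
{code}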
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070542#comment-13070542 ] Prakash Khemani commented on HBASE-3845:
---

In the patch that is deployed internally we have implemented a different approach. We remove the region's entry in startCacheFlush() and save it (as opposed to the current behavior of removing the entry in completeCacheFlush()). If the flush aborts then we restore the saved entry.

The approach taken in the latest patch in this jira might also be OK. I have a few comments.

{noformat}
  this.lastSeqWritten.remove(encodedRegionName);
+ Long seqWhileFlush = this.seqWrittenWhileFlush.get(encodedRegionName);
+ if (null != seqWhileFlush) {
+   this.lastSeqWritten.putIfAbsent(encodedRegionName, seqWhileFlush);
+   this.seqWrittenWhileFlush.remove(encodedRegionName);
+ }
{noformat}

The seqWrittenWhileFlush.get() and the subsequent .remove() can be replaced by a single .remove():

{code}
Long seqWhileFlush = this.seqWrittenWhileFlush.remove(encodedRegionName);
if (null != seqWhileFlush) {
  lSW.put(encodedRegionName, seqWhileFlush);
} else {
  lSW.remove(encodedRegionName);
}
{code}

==

The bigger problem here is that completeCacheFlush() is not called with updateLock acquired. Therefore there might still be correctness issues with the latest patch.

==

{noformat}
public void abortCacheFlush() {
+ this.isFlushInProgress.set(false);
  this.cacheFlushLock.unlock();
}
{noformat}

Shouldn't seqWrittenWhileFlush also be cleaned up in abortCacheFlush()?
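For reference, a minimal sketch of the save-and-restore approach described at the top of this comment (a simplified, single-threaded view with made-up field names; not the internal patch itself):

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class SaveAndRestoreSketch {
  final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();
  // entries removed at flush start, kept so an aborted flush can restore them
  final ConcurrentMap<String, Long> savedAtFlushStart = new ConcurrentHashMap<>();

  void startCacheFlush(String region) {
    // remove the entry up front (instead of in completeCacheFlush), saving it
    Long seq = lastSeqWritten.remove(region);
    if (seq != null) {
      savedAtFlushStart.put(region, seq);
    }
  }

  void completeCacheFlush(String region) {
    savedAtFlushStart.remove(region); // flush succeeded; nothing to restore
  }

  void abortCacheFlush(String region) {
    Long saved = savedAtFlushStart.remove(region);
    if (saved != null) {
      // The saved seq id is necessarily older than any seq id appended since
      // startCacheFlush, so it wins back the "earliest un-flushed" slot.
      lastSeqWritten.put(region, saved);
    }
  }
}
{code}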
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070545#comment-13070545 ] Ted Yu commented on HBASE-3845:
---

@Prakash: Would you be able to share your patch?

"The bigger problem here is that completeCacheFlush() is not called with updateLock acquired."
See line 1154 in HLog:
{code}
synchronized (updateLock) {
{code}
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070588#comment-13070588 ] Andrew Purtell commented on HBASE-4132:
---

@dhruba Thanks. For example, the hooks should be preArchiveStart and postArchiveStart, preArchiveCompleted and postArchiveCompleted. In part it is a naming convention; in part it is a contract: pre hooks allow the introduction of preprocessing and, importantly, the override of default behavior, with the associated short-circuiting of base processing and of any additional coprocessors. Post hooks allow the introduction of postprocessing and the modification of return values.
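As an illustration of the pre/post pairing Andrew describes, a sketch using the proposed method names (this is not a shipped API; the signatures are assumptions):

{code}
import org.apache.hadoop.fs.Path;

/** Sketch of the pre/post contract for the proposed archival hooks. */
interface WALArchivalObserverSketch {
  /**
   * Called before archival of oldPath begins. Returning true short-circuits
   * the default behavior (and any remaining coprocessors), per the pre-hook
   * contract described above.
   */
  boolean preArchiveStart(Path oldPath, Path newPath);

  /** Called after archival of oldPath has begun. */
  void postArchiveStart(Path oldPath, Path newPath);

  /** Called before the archival of oldPath is marked complete. */
  boolean preArchiveCompleted(Path oldPath, Path newPath);

  /** Called after archival completed; may observe or adjust the outcome. */
  void postArchiveCompleted(Path oldPath, Path newPath);
}
{code}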
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845: -- Attachment: HBASE-3845_6.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845: -- Status: Open (was: Patch Available) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845: -- Attachment: HBASE-3845_trunk_3.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-3845: -- Status: Patch Available (was: Open) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4139) [stargate] Update ScannerModel with support for filter package additions
[stargate] Update ScannerModel with support for filter package additions Key: HBASE-4139 URL: https://issues.apache.org/jira/browse/HBASE-4139 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.0 Filters have been added to the o.a.h.h.filter package without updating o.a.h.h.rest.model.ScannerModel. Bring ScannerModel up to date. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070614#comment-13070614 ] Prakash Khemani commented on HBASE-3845: In the method internalFlushcache() I don't see updatesLock.writeLock() being held around the following piece of code. {code} if (wal != null) { wal.completeCacheFlush(this.regionInfo.getEncodedNameAsBytes(), regionInfo.getTableDesc().getName(), completeSequenceId, this.getRegionInfo().isMetaRegion()); } {code} == I will upload the internal patch for reference ... -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
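For clarity, the locking Prakash is asking about would look roughly like the following if the quoted call were wrapped in the region's update lock; this is a sketch around the code above, not the committed change:

{code}
// Sketch: hold updatesLock.writeLock() across completeCacheFlush() so no
// append can slip in between the memstore snapshot and the WAL bookkeeping.
this.updatesLock.writeLock().lock();
try {
  if (wal != null) {
    wal.completeCacheFlush(this.regionInfo.getEncodedNameAsBytes(),
        regionInfo.getTableDesc().getName(), completeSequenceId,
        this.getRegionInfo().isMetaRegion());
  }
} finally {
  this.updatesLock.writeLock().unlock();
}
{code}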
[jira] [Updated] (HBASE-3899) enhance HBase RPC to support freeing up server handler threads even if response is not ready
[ https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Dogaru updated HBASE-3899: --- Attachment: HBASE-3899.patch @stack, follow-up from review board: HBaseServer.Call uses warnResponseSize from parent class. Also, similar code is in production on Facebook clusters. This patch only adds and tests new behavior, but it is not actually used yet. enhance HBase RPC to support freeing up server handler threads even if response is not ready - Key: HBASE-3899 URL: https://issues.apache.org/jira/browse/HBASE-3899 Project: HBase Issue Type: Improvement Components: ipc Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: HBASE-3899.patch, asyncRpc.txt, asyncRpc.txt In the current implementation, the server handler thread picks up an item from the incoming callqueue, processes it and then wraps the response as a Writable and sends it back to the IPC server module. This wastes thread resources when the thread is blocked for disk IO (transaction logging, read into block cache, etc). It would be nice if we can make the RPC Server Handler threads pick up a call from the IPC queue, hand it over to the application (e.g. HRegion); the application can queue it to be processed asynchronously and send a response back to the IPC server module saying that the response is not ready. The RPC Server Handler thread is now ready to pick up another request from the incoming callqueue. When the queued call is processed by the application, it indicates to the IPC module that the response is now ready to be sent back to the client. The RPC client continues to experience the same behaviour as before. An RPC client is synchronous and blocks till the response arrives. This RPC enhancement allows us to do very powerful things with the RegionServer. In the future, we can enhance the RegionServer's threading model to a message-passing model for better performance. We will not be limited by the number of threads in the RegionServer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
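The delayed-response flow the description outlines can be sketched as follows; the Delayable name matches the file in the attached patch, but the method signatures and the RegionOp class are assumptions for illustration only:

{code}
import java.io.IOException;

// Assumed shapes, for illustration only.
interface Delayable {
  void startDelay();                // frees the handler thread for other calls
  void endDelay(Object result) throws IOException;  // response ready: send it
}

class RegionOp {
  // Inside a handler: instead of blocking on disk IO, mark the call delayed,
  // queue the work, and return; a worker thread completes it later.
  Object handle(final Delayable call) {
    call.startDelay();
    // ... enqueue work; when done, the worker invokes call.endDelay(result)
    return null;  // no response yet; the IPC layer keeps the connection open
  }
}
{code}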
[jira] [Commented] (HBASE-3899) enhance HBase RPC to support freeing up server handler threads even if response is not ready
[ https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070727#comment-13070727 ] jirapos...@reviews.apache.org commented on HBASE-3899: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1174/#review1179 --- I think we need some additional metrics for number of outstanding (delayed) calls... how do we debug cases where calls are getting orphaned? src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java https://reviews.apache.org/r/1174/#comment2475 RPC calls can return Writables or any java primitive supported by ObjectWritable. So, this should probably be Object result. src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java https://reviews.apache.org/r/1174/#comment2476 this isn't your code... but this expression is always true! src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java https://reviews.apache.org/r/1174/#comment2477 this is a no-op. need proper error handling src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java https://reviews.apache.org/r/1174/#comment2478 assert this.delayResponse src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java https://reviews.apache.org/r/1174/#comment2479 assert !delayResponse src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java https://reviews.apache.org/r/1174/#comment2480 if !delayResponse, would we ever have response == null? src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java https://reviews.apache.org/r/1174/#comment2481 shouldn't this just be a call to enqueueInSelector now? - Todd On 2011-07-22 00:17:13, Vlad Dogaru wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/1174/ bq. --- bq. bq. (Updated 2011-07-22 00:17:13) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. Free up RPC server Handler thread if the called routine specifies the call should be delayed. The RPC client sees no difference, changes are server-side only. This is based on the previous submitted patch from Dhruba. bq. bq. bq. This addresses bug HBASE-3899. bq. https://issues.apache.org/jira/browse/HBASE-3899 bq. bq. bq. Diffs bq. - bq. bq.src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java 0da7f9e bq.src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 61d3915 bq.src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/1174/diff bq. bq. bq. Testing bq. --- bq. bq. Unit tests run. Also, the patch includes a new unit test. bq. bq. bq. Thanks, bq. bq. Vlad bq. bq.
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070752#comment-13070752 ] stack commented on HBASE-4132: -- @Dhruba bq. Given the above, do you think that this patch needs to touch LogCleaner at all? If so, what is ur proposal? No proposal. Just wanted to point you at some utility we have already that you might not have known about and that might have helped you composing your addition. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4138) If zookeeper.znode.parent is not specified explicitly in Client code then HTable object loops continuously waiting for the root region by using /hbase as the base node.
[ https://issues.apache.org/jira/browse/HBASE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070755#comment-13070755 ] stack commented on HBASE-4138: -- @Ram Your reasoning sounds right to me. I agree ...we should prevent the Admin or HTable (in general any client, even RS) from creating the base nodes, and whatever is created by the master should be used by the clients. Thanks for digging in on this. If zookeeper.znode.parent is not specified explicitly in Client code then HTable object loops continuously waiting for the root region by using /hbase as the base node. --- Key: HBASE-4138 URL: https://issues.apache.org/jira/browse/HBASE-4138 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.3 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.90.4 Change the zookeeper.znode.parent property (default is /hbase). Now do not specify this change in the client code. Use the HTable Object. The HTable is not able to find the root region and keeps continuously looping. Find the stack trace: Object.wait(long) line: not available [native method] RootRegionTracker(ZooKeeperNodeTracker).blockUntilAvailable(long) line: 122 RootRegionTracker.waitRootRegionLocation(long) line: 73 HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[], boolean) line: 578 HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[]) line: 558 HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[], byte[], byte[], boolean, Object) line: 687 HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[], boolean) line: 589 HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[]) line: 558 HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[], byte[], byte[], boolean, Object) line: 687 HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[], boolean) line: 593 HConnectionManager$HConnectionImplementation.locateRegion(byte[], byte[]) line: 558 HTable.<init>(Configuration, byte[]) line: 171 HTable.<init>(Configuration, String) line: 145 HBaseTest.test() line: 45 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
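Until that is fixed, the practical workaround on the client side is to carry the cluster's actual znode parent in the client Configuration rather than relying on the /hbase default; a minimal example, where "/hbase-custom" is a placeholder value:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class ZnodeParentExample {
  public static void main(String[] args) throws Exception {
    // "/hbase-custom" is a placeholder; use the value the cluster was started with.
    Configuration conf = HBaseConfiguration.create();
    conf.set("zookeeper.znode.parent", "/hbase-custom");
    HTable table = new HTable(conf, "test");  // now finds the right base node
  }
}
{code}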
[jira] [Commented] (HBASE-4137) Implementation of Distributed Hash Table(DHT) for lookup service
[ https://issues.apache.org/jira/browse/HBASE-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070804#comment-13070804 ] stack commented on HBASE-4137: -- Can you describe what you are trying to do? What do you mean by 'lookup service'? What are you looking up? And why would we put in place a DHT for lookups when there is already a means of locating data? Thanks. Implementation of Distributed Hash Table(DHT) for lookup service - Key: HBASE-4137 URL: https://issues.apache.org/jira/browse/HBASE-4137 Project: HBase Issue Type: Improvement Components: performance Affects Versions: 0.90.3 Reporter: vamshi To implement a scalable data structure, i.e., a distributed hash table, in HBase. In HBase, to perform fast lookups, we can take the help of a Distributed Hash Table (DHT), a scalable data structure. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster
[ https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070819#comment-13070819 ] stack commented on HBASE-1938: -- bq. To me, it makes sense to precalculate the value returned by peek, and reuse it in next(). If there is no chance of the value changing between the peek and next, it sounds good (I've not looked at this code in a while). bq. It would be great to save this system call in a next(). Yes (I like how you figure there's a system call doing thread local get). bq. In fact, as this value seems to be a TLS, I don't see how it could change during the execution of next(). What do you think? (I'm being lazy. I've not looked at the code). The updates to RWCC happen at well-defined points so should be easy enough to elicit if there is a problem w/ your presumption above. bq. Last question on this: what is the use case when the getThreadReadPoint() will change during the whole scan (i.e.: between next)? IIRC, we want to let the scan see the most up-to-date view on a row though our guarantees are less than this (See http://hbase.apache.org/acid-semantics.html). bq. Most of the public methods (except reseek) are synchronized, it implies that the scanner can be shared between threads? That seems like a valid deduction to make. bq. 1) Replacement of KeyValue lowest = getLowest(); You mean in MemStore#reseek? What would you put in its place (Sorry if I'm not following the bouncing ball properly). bq. ...don't get the data getThreadReadPoint() So, we'd just hold to the current read point for how long? The full scan? That might be possible given our lax guarantees above though it would be nice to not have to give up on up to the millisecond views on rows. bq. Another option is to share getThreadReadPoint() value for the two iterators, i.e. read the value in the next() function, and give it as a parameter to getNext() What are the 'two iterators' here? Sorry N, I don't have my head as deep in this stuff as you do currently so my questions and answers above may be off. Please compensate appropriately. Make in-memory table scanning faster Key: HBASE-1938 URL: https://issues.apache.org/jira/browse/HBASE-1938 Project: HBase Issue Type: Improvement Components: performance Reporter: stack Assignee: stack Priority: Blocker Attachments: MemStoreScanPerformance.java, MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch This issue is about profiling hbase to see if I can make hbase scans run faster when all is up in memory. Talking to some users, they are seeing about 1/4 million rows a second. It should be able to go faster than this (Scanning an array of objects, they can do about 4-5x this). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-4137) Implementation of Distributed Hash Table(DHT) for lookup service
[ https://issues.apache.org/jira/browse/HBASE-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-4137. -- Resolution: Incomplete The description does not seem to apply to hbase and the description has been sprayed across a few random issues which leads me to believe the author is not clear themselves on what is wanted. Resolving as incomplete. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070825#comment-13070825 ] Gary Helmling commented on HBASE-4132: -- Hmm, this seems to be a confusion of {{org.apache.hadoop.hbase.regionserver.wal.WALObserver}} and {{org.apache.hadoop.hbase.coprocessor.WALObserver}}. Not surprising since both classes have the same name. I think the former is the WAL listener used in replication and the latter is the coprocessor interface for WALs. I know the former has been around longer, but maybe we should consider renaming it to WALListener. Or maybe we should bite the bullet and combine these two interfaces to one. (I say that knowing very little about replication and whether it would make sense/be feasible to convert it to a coprocessor implementation). Anyway, I see no problem adding {{pre/postArchiveStart}} and {{pre/postArchiveCompleted}} to {{org.apache.hadoop.hbase.coprocessor.WALObserver}}, as Andy mentions. Would that be sufficient, or should we look at adding the logRoll and logClose events from {{o.a.h.h.regionserver.wal.WALObserver}} as well? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
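The clash is easy to see side by side; both fully qualified names below appear in the comment above, and the wrapper class exists only so the example compiles:

{code}
// Both interfaces share the simple name WALObserver, so code touching both
// must fully qualify at least one of them -- hence the rename proposals.
class WalObserverNameClash {
  org.apache.hadoop.hbase.regionserver.wal.WALObserver listener;  // replication-era listener
  org.apache.hadoop.hbase.coprocessor.WALObserver coprocessor;    // coprocessor hook interface
}
{code}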
[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster
[ https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070829#comment-13070829 ] Ted Yu commented on HBASE-1938: --- bq. 1) Replacement of KeyValue lowest = getLowest(); It is in the seek function bq. Another option is to share getThreadReadPoint() value for the two iterators N was talking about the following code in MemStore.next(): {code} if (theNext == kvsetNextRow) { kvsetNextRow = getNext(kvsetIt); } else { snapshotNextRow = getNext(snapshotIt); } {code} The intent was to save the system call. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
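The single-read variant being discussed might look like the sketch below; the types are simplified stand-ins and only the control flow mirrors the quoted MemStore code, so treat it as an illustration of the idea rather than the actual change:

{code}
import java.util.Iterator;

// Simplified stand-in types; illustration only.
class ReadPointSharingSketch<KV> {
  private final ThreadLocal<Long> threadReadPoint = new ThreadLocal<Long>() {
    @Override protected Long initialValue() { return 0L; }
  };
  private Iterator<KV> kvsetIt, snapshotIt;
  private KV kvsetNextRow, snapshotNextRow, theNext;

  KV next() {
    long readPoint = threadReadPoint.get();  // one thread-local read per next()
    if (theNext == kvsetNextRow) {
      kvsetNextRow = getNext(kvsetIt, readPoint);
    } else {
      snapshotNextRow = getNext(snapshotIt, readPoint);
    }
    return theNext;
  }

  private KV getNext(Iterator<KV> it, long readPoint) {
    // the real code would skip entries newer than readPoint; elided here
    return it.hasNext() ? it.next() : null;
  }
}
{code}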
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070832#comment-13070832 ] stack commented on HBASE-4132: -- +1 on one interface only. J-D! -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Pi updated HBASE-4027: - Attachment: slabcachepatchv4.diff Added tests for eviction, now logs fine-grained stats to file. Added a bunch of documentation. A bunch - this should take care of most of the documentation concerns. Enable direct byte buffers LruBlockCache Key: HBASE-4027 URL: https://issues.apache.org/jira/browse/HBASE-4027 Project: HBase Issue Type: Improvement Reporter: Jason Rutherglen Assignee: Li Pi Priority: Minor Attachments: slabcachepatch.diff, slabcachepatchv2.diff, slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, slabcachepatchv3.diff, slabcachepatchv4.diff Java offers the creation of direct byte buffers which are allocated outside of the heap. They need to be manually free'd, which can be accomplished using an undocumented {{clean}} method. The feature will be optional. After implementing, we can benchmark for differences in speed and garbage collection behavior. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
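For context, allocation and manual freeing of a direct buffer look like this; the sun.* classes are JDK-internal (the undocumented part) and may differ across JVMs, which is exactly why the feature has to stay optional. A sketch, not the patch itself:

{code}
import java.nio.ByteBuffer;

// Sketch of off-heap allocation and the undocumented manual free.
class DirectBufferSketch {
  static ByteBuffer allocate(int size) {
    return ByteBuffer.allocateDirect(size);  // memory lives outside the Java heap
  }

  static void free(ByteBuffer buf) {
    if (buf.isDirect()) {
      // JVM-specific internal API; may change between JVM versions.
      ((sun.nio.ch.DirectBuffer) buf).cleaner().clean();
    }
  }
}
{code}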
[jira] [Commented] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix
[ https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070834#comment-13070834 ] stack commented on HBASE-4134: -- @feng nice debugging The total number of regions was more than the actual region count after the hbck fix Key: HBASE-4134 URL: https://issues.apache.org/jira/browse/HBASE-4134 Project: HBase Issue Type: Bug Affects Versions: 0.90.3 Reporter: feng xu Fix For: 0.90.4 1. I found the problem (some regions were multiply assigned) while running hbck to check the cluster's health. Here's the result: {noformat} ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020 ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020 ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is listed in META on region server 158-1-91-103:20020 but is multiply assigned to region servers 158-1-91-103:20020, 158-1-91-105:20020 Summary: -ROOT- is okay. Number of regions: 1 Deployed on: 158-1-91-105:20020 .META. is okay. Number of regions: 1 Deployed on: 158-1-91-103:20020 test1 is okay. Number of regions: 25297 Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020 14829 inconsistencies detected. Status: INCONSISTENT {noformat} 2. Then I tried to use hbck -fix to fix the problem. Everything seemed ok. But I found that the total number of regions reported by the load balancer (35029) was more than the actual region count (25299) after the fixing. Here's the related log snippet: {noformat} 2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=25299 average=8433.0 mostloaded=8433 2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=35029 average=11676.333 mostloaded=11677 leastloaded=11676 {noformat} 3. I tracked one region's behavior during the time. Taking the region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as an example: (1) It was assigned to 158-1-91-101 at first. (2) HBCK sent a closing request to the RegionServer, and the RegionServer closed it silently without notifying HMaster. (3) The region was still carried by RS 158-1-91-103, which was known to HMaster. (4) HBCK then triggered a new assignment. The region was indeed assigned again, but the old assignment information still remained in AM#regions and AM#servers. That's why the reported region count became larger than the actual number. {noformat} Line 178967: 2011-07-22 02:47:51,247 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., server=HBCKServerName, state=M_ZK_REGION_OFFLINE) Line 178968: 2011-07-22 02:47:51,247 INFO org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, region=52782c0241a598b3e37ca8729da0 Line 178969: 2011-07-22 02:47:51,248 INFO org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering assignment of region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. 
Line 178970: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) available servers Line 178971: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 158-1-91-101,20020,1311231878544 Line 178983: 2011-07-22 02:47:51,285 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544, region=52782c0241a598b3e37ca8729da0 Line 179001: 2011-07-22 02:47:51,318 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=158-1-91-101,20020,1311231878544, region=52782c0241a598b3e37ca8729da0 Line 179002: 2011-07-22 02:47:51,319 DEBUG
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070835#comment-13070835 ] stack commented on HBASE-4132: -- Or, one should inherit from the other rather than repeat. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3909) Add dynamic config
[ https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070836#comment-13070836 ] Gary Helmling commented on HBASE-3909: -- It would be really nice to have this capability, but it seems way out there for 0.92. We can't depend on Hadoop trunk/0.23 classes for 0.92. We could fork the HADOOP-7001 patch or come up with our own approach, but either one is going to be a lot of work. And the server-related changes to support this seem fairly tricky for anything beyond trivial configuration options -- i.e., how to support reconfiguring the number of rpc handler threads, say. All this adds up to: I'd suggest we punt from 0.92. Add dynamic config -- Key: HBASE-3909 URL: https://issues.apache.org/jira/browse/HBASE-3909 Project: HBase Issue Type: Bug Reporter: stack Fix For: 0.92.0 I'm sure this issue exists already, at least as part of the discussion around making online schema edits possible, but no harm in this having its own issue. Ted started a conversation on this topic up on dev and Todd suggested we look at how Hadoop did it over in HADOOP-7001 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3909) Add dynamic config
[ https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070843#comment-13070843 ] Ted Yu commented on HBASE-3909: --- +1 on moving out of 0.92 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070845#comment-13070845 ] jirapos...@reviews.apache.org commented on HBASE-4027: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1191/ --- Review request for hbase and Todd Lipcon. Summary --- Uploading slabcachepatchv4 to review for Li Pi. This addresses bug HBASE-4027. https://issues.apache.org/jira/browse/HBASE-4027 Diffs - conf/hbase-env.sh 2d55d27 pom.xml 729dc37 src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 5963552 src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/SingleSizeCache.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/io/hfile/SlabCache.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java b600020 src/test/java/org/apache/hadoop/hbase/io/hfile/TestSingleSlabCache.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/io/hfile/TestSlabCache.java PRE-CREATION Diff: https://reviews.apache.org/r/1191/diff Testing --- Thanks, Todd -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070859#comment-13070859 ] stack commented on HBASE-4027: -- Doc is great. These could be final: + private LruBlockCache onHeapCache; + private SlabCache offHeapCache; Says 'Metrics are the combined size and hits and misses of both caches' but down in getStats we seem to be getting onheap stats only. Intentional? Same for heapSize. Do you want to leave this line in hfile? + LOG.debug("decompressedSize = " + decompressedSize); What's it mean when you say 'An exception will be thrown if the cached data is larger than the size of the allocated block'? More notes later. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070860#comment-13070860 ] Ted Yu commented on HBASE-3845: --- +1 on HBASE-3845_trunk_3.patch. Ran unit tests and they passed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3909) Add dynamic config
[ https://issues.apache.org/jira/browse/HBASE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070861#comment-13070861 ] stack commented on HBASE-3909: -- My thought on moving issues in and out of releases is just do it with justification, rather than make the justification and then not move it, waiting on others to agree. For example, you make a good case for moving the issue out, Gary, so go for it. If someone objects, let them counter-argue and move it back. If a dispute, we can move it to the dev list to duke it out. Good stuff. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache
[ https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070862#comment-13070862 ] jirapos...@reviews.apache.org commented on HBASE-4027: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1191/#review1182 --- could do with some tests for MetaSlab. also some multi-threaded tests - see MultithreadedTestUtil, example usage in TestMemStoreLAB pom.xml https://reviews.apache.org/r/1191/#comment2484 did you determine that this ConcurrentLinkedHashMap was different than the one in Guava? I thought it got incorporated into Guava, which we already depend on. src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2485 punctuation wise, I think it would be easier to read if you hyphenated on-heap and off-heap. This applies to log messages below as well. src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2486 No need to line-break here src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2487 consider using StringUtils.humanReadableInt for these sizes. src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2488 @Override src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2489 when you're just overriding something from the superclass, no need for javadoc unless it says something new and exciting. If you feel like you want to put something there, you can use /** {@inheritDoc} */ to be explicit that you're inheriting from the superclass. src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2490 I think you should only put-back into the on-heap cache in the case that the 'caching' parameter is true. src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2491 hrm, the class javadoc says that the statistics should be cumulative, but this seems to just forward src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2492 TODOs src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java https://reviews.apache.org/r/1191/#comment2493 is this code used? seems like dead copy-paste code to me. src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java https://reviews.apache.org/r/1191/#comment2497 extraneous debugging left in src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java https://reviews.apache.org/r/1191/#comment2498 I think this is usually called a slab class - I think that name would be less confusing, since Meta is already used in HBase to refer to .META. src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java https://reviews.apache.org/r/1191/#comment2499 unclear what the difference is between the two. Is slabs the list of 2GB buffers, and buffers is the list of actual items that will be allocated? I think the traditional names here are slabs and items. where each slab holds some number of allocatable items Also, rather than // comments, use /** javadoc comments */ before the vars src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java https://reviews.apache.org/r/1191/#comment2500 these vars probably better called maxBlocksPerSlab and maxSlabSize, since they're upper bounds. 
src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java https://reviews.apache.org/r/1191/#comment2501 I think this code would be a little easier to understand if you split it into one loop for the full slabs, and an if statement for the partially full one. Something like: int numFullSlabs = numBlocks / maxBlocksPerSlab; boolean hasPartialSlab = (numBlocks % maxBlocksPerSlab) > 0; for (int i = 0; i < numFullSlabs; i++) { alloc one of maxSlabSize; addBuffersForSlab(slab); } if (hasPartialSlab) { alloc the partial one; addBuffersForSlab(slab); } src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java https://reviews.apache.org/r/1191/#comment2502 should be a LOG.warn src/main/java/org/apache/hadoop/hbase/io/hfile/MetaSlab.java https://reviews.apache.org/r/1191/#comment2503 shouldn't this class have an alloc() and free() method? src/main/java/org/apache/hadoop/hbase/io/hfile/SingleSizeCache.java https://reviews.apache.org/r/1191/#comment2511 shouldn't this implement BlockCache?
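Fleshed out, the suggested restructuring could read like the sketch below; addBuffersForSlab and the surrounding class are hypothetical stand-ins from the review comment, and the slab carving is elided:

{code}
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the suggested split: one loop for full slabs,
// one if-statement for the partially full remainder.
class SlabAllocationSketch {
  private final List<ByteBuffer> slabs = new ArrayList<ByteBuffer>();

  void allocateSlabs(int numBlocks, int maxBlocksPerSlab, int blockSize) {
    int numFullSlabs = numBlocks / maxBlocksPerSlab;
    int remainder = numBlocks % maxBlocksPerSlab;

    for (int i = 0; i < numFullSlabs; i++) {
      ByteBuffer slab = ByteBuffer.allocateDirect(maxBlocksPerSlab * blockSize);
      addBuffersForSlab(slab);
    }
    if (remainder > 0) {
      ByteBuffer slab = ByteBuffer.allocateDirect(remainder * blockSize);
      addBuffersForSlab(slab);
    }
  }

  private void addBuffersForSlab(ByteBuffer slab) {
    slabs.add(slab);
    // carving the slab into fixed-size block buffers would happen here (elided)
  }
}
{code}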
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070863#comment-13070863 ] Jean-Daniel Cryans commented on HBASE-4132: --- It could be weird, if we just merge them then the Replication class (and others implementing wal.WALObserver in the future) would have imports for CP classes since ObserverContext is passed in the cp.WALObserver methods. I'd prefer a rename of either or both. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long...
[ https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaojinchao updated HBASE-4064: -- Attachment: (was: HBASE-4064_branch90V2.patch) Two concurrent unassigning of the same region caused the endless loop of Region has been PENDING_CLOSE for too long... Key: HBASE-4064 URL: https://issues.apache.org/jira/browse/HBASE-4064 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.3 Reporter: Jieshan Bean Fix For: 0.90.5 Attachments: HBASE-4064-v1.patch, HBASE-4064_branch90V2.patch, disableflow.png 1. If there is a rubbish RegionState object with PENDING_CLOSE in regionsInTransition(The RegionState was remained by some exception which should be removed, that's why I called it as rubbish object), but the region is not currently assigned anywhere, TimeoutMonitor will fall into an endless loop: 2011-06-27 10:32:21,326 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301 2011-06-27 10:32:21,326 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 2011-06-27 10:32:21,438 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining) 2011-06-27 10:32:21,441 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere 2011-06-27 10:32:31,207 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301 2011-06-27 10:32:31,207 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 2011-06-27 10:32:31,215 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining) 2011-06-27 10:32:31,215 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere 2011-06-27 10:32:41,164 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. state=PENDING_CLOSE, ts=1309141555301 2011-06-27 10:32:41,164 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 2011-06-27 10:32:41,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. (offlining) 2011-06-27 10:32:41,172 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is not currently assigned anywhere . 
2. In the following scenario, two concurrent unassign calls on the same region can lead to the above problem: the first unassign call sends its RPC successfully; the master watches the RS_ZK_REGION_CLOSED event and, while processing it, creates a ClosedRegionHandler to remove the region's state on the master. E.g. while the ClosedRegionHandler is running in an hbase.master.executor.closeregion.threads thread (A), another unassign call for the same region runs in another thread (B). When thread B evaluates if (!regions.containsKey(region)), this.regions still holds the region info; the CPU then switches to thread A. Thread A removes the region from both this.regions and regionsInTransition, then execution switches back to thread B. Thread B continues and throws an exception with the message Server null returned java.lang.NullPointerException: Passed server is null for 9a6e26d40293663a79523c58315b930f, but without removing the newly added RegionState from regionsInTransition, so it can never be removed.
{code}
public void unassign(HRegionInfo region, boolean force) {
  LOG.debug("Starting unassignment of region " +
    region.getRegionNameAsString() + " (offlining)");
  synchronized (this.regions) {
    //
{code}
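To make the interleaving concrete, here is a minimal, self-contained toy in Java of the check-then-act race described above. The class, fields, and sendCloseRpc helper are hypothetical stand-ins, not the actual AssignmentManager code or the committed patch; real code would also not hold a lock across an RPC. The point is only that the containsKey check, the RegionState insertion, and the cleanup on failure must not be separable by another thread:
{code}
import java.util.HashMap;
import java.util.Map;

// Toy model of the race: if the containsKey check and the later RegionState
// mutation are not covered by one lock, a concurrent ClosedRegionHandler can
// interleave between them and strand a PENDING_CLOSE entry forever.
public class UnassignRaceSketch {
  // region -> hosting server; stands in for AssignmentManager's this.regions
  private final Map<String, String> regions = new HashMap<>();
  // region -> state; stands in for regionsInTransition
  private final Map<String, String> regionsInTransition = new HashMap<>();

  public void unassign(String region) {
    synchronized (regions) {
      if (!regions.containsKey(region)) {
        // Not assigned anywhere: clear any stale entry instead of leaving a
        // PENDING_CLOSE record for TimeoutMonitor to force-unassign forever.
        regionsInTransition.remove(region);
        return;
      }
      regionsInTransition.put(region, "PENDING_CLOSE");
      try {
        sendCloseRpc(region); // hypothetical helper
      } catch (RuntimeException e) {
        // On failure, drop the state we just added so it cannot linger.
        regionsInTransition.remove(region);
        throw e;
      }
    }
  }

  private void sendCloseRpc(String region) { /* elided */ }
}
{code}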
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070877#comment-13070877 ] Andrew Purtell commented on HBASE-4132: --- bq. Hmm, this seems to be a confusion of org.apache.hadoop.hbase.regionserver.wal.WALObserver and org.apache.hadoop.hbase.coprocessor.WALObserver. Aha. I agree with J-D; we should do a rename. Extend the WALObserver API to accommodate log archival - Key: HBASE-4132 URL: https://issues.apache.org/jira/browse/HBASE-4132 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.92.0 Attachments: walArchive.txt The WALObserver interface exposes the log roll events. It would be nice to extend it to accommodate log archival events as well. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070887#comment-13070887 ] Ted Yu commented on HBASE-4132: --- org.apache.hadoop.hbase.regionserver.wal.WALObserver is mostly internal. It is used for LogRoller and replication. Shall we rename it to org.apache.hadoop.hbase.regionserver.wal.WALActionsListener? See HRegionServer.getWALActionListeners():
{code}
// Replication handler is an implementation of WALActionsListener.
listeners.add(this.replicationHandler);
{code}
Extend the WALObserver API to accommodate log archival - Key: HBASE-4132 URL: https://issues.apache.org/jira/browse/HBASE-4132 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.92.0 Attachments: walArchive.txt The WALObserver interface exposes the log roll events. It would be nice to extend it to accommodate log archival events as well. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
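For discussion's sake, a rough sketch of what the renamed internal listener could look like once HBASE-4132's archival hook is added; the method names and signatures below are guesses for illustration, not the committed API:
{code}
import org.apache.hadoop.fs.Path;

// Illustrative only: the renamed internal listener with both the existing
// roll notification and a new archival notification. Names are assumptions.
public interface WALActionsListener {

  /** Called after the WAL has been rolled from oldPath to newPath. */
  void logRolled(Path oldPath, Path newPath);

  /** Called after an old WAL file has been moved to the archive directory. */
  void logArchived(Path oldPath, Path archivedPath);
}
{code}
Replication would then implement this interface without touching any coprocessor classes, which is the import-separation point J-D raised above.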
[jira] [Commented] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready
[ https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070891#comment-13070891 ] jirapos...@reviews.apache.org commented on HBASE-3899: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1174/ --- (Updated 2011-07-26 01:19:52.655737) Review request for hbase.
Changes
---
* Add checking for the number of calls currently delayed. A warning message is issued if too many calls are delayed.
* Unit test to check that the above warning works.
* endDelay() now takes an Object as a parameter, not a Writable. Initially, I thought the method that ended the delay should pack the response (i.e. endDelay(new HbaseObjectWritable(retval))), but it makes more sense to pack it in setResponse.
* Address other feedback from Todd Lipcon. Thanks!
Summary
---
Free up the RPC server Handler thread if the called routine specifies that the call should be delayed. The RPC client sees no difference; changes are server-side only. This is based on the previously submitted patch from Dhruba.
This addresses bug HBASE-3899. https://issues.apache.org/jira/browse/HBASE-3899
Diffs (updated)
-
src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java PRE-CREATION
src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 61d3915
src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java 0da7f9e
src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java PRE-CREATION
Diff: https://reviews.apache.org/r/1174/diff
Testing
---
Unit tests run. Also, the patch includes a new unit test.
Thanks, Vlad
enhance HBase RPC to support free-ing up server handler threads even if response is not ready - Key: HBASE-3899 URL: https://issues.apache.org/jira/browse/HBASE-3899 Project: HBase Issue Type: Improvement Components: ipc Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: HBASE-3899.patch, asyncRpc.txt, asyncRpc.txt In the current implementation, the server handler thread picks up an item from the incoming call queue, processes it, wraps the response as a Writable, and sends it back to the IPC server module. This wastes thread resources when the thread is blocked on disk IO (transaction logging, reads into the block cache, etc). It would be nice if we could make the RPC Server Handler threads pick up a call from the IPC queue and hand it over to the application (e.g. HRegion); the application can queue it to be processed asynchronously and send a response back to the IPC server module saying that the response is not ready. The RPC Server Handler thread is then ready to pick up another request from the incoming call queue. When the queued call is processed by the application, it indicates to the IPC module that the response is now ready to be sent back to the client. The RPC client continues to experience the same behaviour as before: an RPC client is synchronous and blocks until the response arrives. This RPC enhancement allows us to do very powerful things with the RegionServer. In the future, we can enhance the RegionServer's threading model into a message-passing model for better performance; we will not be limited by the number of threads in the RegionServer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
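As a self-contained sketch of the delayed-response pattern the review describes: a handler marks a call as delayed and returns immediately, and whoever finishes the work later supplies the result. The Delayable name appears in the diff listing above, but the method shapes and the DelayedCall/DelayedRpcSketch classes here are illustrative assumptions, not the patch's code:
{code}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch: the handler thread frees itself via startDelay(); the application
// later calls endDelay(result), which lets the responder send the reply.
interface Delayable {
  void startDelay();            // free the handler thread, hold the response
  void endDelay(Object result); // response is ready; release it to the client
}

class DelayedCall implements Delayable {
  private final CompletableFuture<Object> response = new CompletableFuture<>();
  private volatile boolean delayed;

  @Override public void startDelay() { delayed = true; }
  @Override public void endDelay(Object result) { response.complete(result); }

  boolean isDelayed() { return delayed; }
  Object awaitResponse() throws Exception { return response.get(); }
}

public class DelayedRpcSketch {
  public static void main(String[] args) throws Exception {
    ScheduledExecutorService appWork = Executors.newScheduledThreadPool(1);
    DelayedCall call = new DelayedCall();

    // Handler thread: the application asks for a delay; the handler is now
    // free to pick the next call off the queue.
    call.startDelay();
    appWork.schedule(() -> call.endDelay("deferred result"), 100, TimeUnit.MILLISECONDS);

    // Responder side: blocks only here, not in a handler thread.
    System.out.println(call.awaitResponse());
    appWork.shutdown();
  }
}
{code}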
[jira] [Commented] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready
[ https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070896#comment-13070896 ] jirapos...@reviews.apache.org commented on HBASE-3899: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1174/#review1185 --- Ship it! Looks good. Have you run the full test suite with the current iteration of the patch? - Todd
On 2011-07-26 01:19:52, Vlad Dogaru wrote:
bq.
bq. ---
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1174/
bq. ---
bq.
bq. (Updated 2011-07-26 01:19:52)
bq.
bq. Review request for hbase.
bq.
bq. Summary
bq. ---
bq.
bq. Free up RPC server Handler thread if the called routine specifies the call should be delayed. The RPC client sees no difference, changes are server-side only. This is based on the previous submitted patch from Dhruba.
bq.
bq. This addresses bug HBASE-3899.
bq. https://issues.apache.org/jira/browse/HBASE-3899
bq.
bq. Diffs
bq. -
bq.
bq. src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 61d3915
bq. src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java 0da7f9e
bq. src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/1174/diff
bq.
bq. Testing
bq. ---
bq.
bq. Unit tests run. Also, the patch includes a new unit test.
bq.
bq. Thanks,
bq.
bq. Vlad
enhance HBase RPC to support free-ing up server handler threads even if response is not ready - Key: HBASE-3899 URL: https://issues.apache.org/jira/browse/HBASE-3899 Project: HBase Issue Type: Improvement Components: ipc Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: HBASE-3899.patch, asyncRpc.txt, asyncRpc.txt In the current implementation, the server handler thread picks up an item from the incoming call queue, processes it, wraps the response as a Writable, and sends it back to the IPC server module. This wastes thread resources when the thread is blocked on disk IO (transaction logging, reads into the block cache, etc). It would be nice if we could make the RPC Server Handler threads pick up a call from the IPC queue and hand it over to the application (e.g. HRegion); the application can queue it to be processed asynchronously and send a response back to the IPC server module saying that the response is not ready. The RPC Server Handler thread is then ready to pick up another request from the incoming call queue. When the queued call is processed by the application, it indicates to the IPC module that the response is now ready to be sent back to the client. The RPC client continues to experience the same behaviour as before: an RPC client is synchronous and blocks until the response arrives. This RPC enhancement allows us to do very powerful things with the RegionServer. In the future, we can enhance the RegionServer's threading model into a message-passing model for better performance; we will not be limited by the number of threads in the RegionServer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4139) [stargate] Update ScannerModel with support for filter package additions
[ https://issues.apache.org/jira/browse/HBASE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4139: -- Attachment: HBASE-4139.patch [stargate] Update ScannerModel with support for filter package additions Key: HBASE-4139 URL: https://issues.apache.org/jira/browse/HBASE-4139 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.90.4, 0.92.0 Attachments: HBASE-4139.patch Filters have been added to the o.a.h.h.filters package without updating o.a.h.h.rest.model.ScannerModel. Bring ScannerModel up to date. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4139) [stargate] Update ScannerModel with support for filter package additions
[ https://issues.apache.org/jira/browse/HBASE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4139: -- Fix Version/s: 0.90.4 Status: Patch Available (was: Open) [stargate] Update ScannerModel with support for filter package additions Key: HBASE-4139 URL: https://issues.apache.org/jira/browse/HBASE-4139 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.90.4, 0.92.0 Attachments: HBASE-4139.patch Filters have been added to the o.a.h.h.filters package without updating o.a.h.h.rest.model.ScannerModel. Bring ScannerModel up to date. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3937) Region PENDING-OPEN timeout with unexpected ZK node state leads to an endless loop
[ https://issues.apache.org/jira/browse/HBASE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-3937: -- Fix Version/s: (was: 0.92.0) 0.94.0 Region PENDING-OPEN timeout with unexpected ZK node state leads to an endless loop --- Key: HBASE-3937 URL: https://issues.apache.org/jira/browse/HBASE-3937 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.3 Reporter: Jieshan Bean Assignee: Jieshan Bean Fix For: 0.94.0 Here is the scenario in which this problem happens:
1. HMaster assigned region A to RS1, so the RegionState was set to PENDING_OPEN.
2. Because there were too many opening requests, the open process on RS1 was blocked.
3. Some time later, TimeoutMonitor found that the assignment of A had timed out. Since the RegionState was PENDING_OPEN, it went into the following handler branch (which just puts the region into a waiting-for-assignment set):
{code}
case PENDING_OPEN:
  LOG.info("Region has been PENDING_OPEN for too " +
      "long, reassigning region=" + regionInfo.getRegionNameAsString());
  assigns.put(regionState.getRegion(), Boolean.TRUE);
  break;
{code}
So we can see that in this case we assume the ZK node state is OFFLINE. In a normal flow, that is OK.
4. But before the real assignment, RS1's blocked requests were finally processed, and that interfered with the new assignment: RS1 updated the ZK node state from OFFLINE to OPENING.
5. The new assignment started and sent the region to open on RS2. While opening, RS2 must transition the ZK node state from OFFLINE to OPENING; since the current state was already OPENING, the operation failed, and the region could not be opened successfully anymore.
So I think, to avoid this problem, in the PENDING_OPEN case of TimeoutMonitor we should transition the ZK node state to OFFLINE first. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
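A toy model of the proposed fix, under stated assumptions: the znode is reset to OFFLINE inside the PENDING_OPEN timeout branch rather than assumed to already be OFFLINE, so a stale OPENING state left behind by a slow region server cannot block the retry. This is simplified plain Java, not the actual AssignmentManager patch; the real code would go through ZooKeeper rather than a local map:
{code}
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of TimeoutMonitor's PENDING_OPEN handling.
public class PendingOpenTimeoutSketch {
  enum ZkState { OFFLINE, OPENING, OPENED }

  private final Map<String, ZkState> zkNodes = new HashMap<>();  // znode per region
  private final Map<String, Boolean> assigns = new HashMap<>();  // waiting-assign set

  void onPendingOpenTimeout(String regionName) {
    System.out.println("Region has been PENDING_OPEN for too long, reassigning region=" + regionName);
    // The key step of the fix: force the node back to OFFLINE instead of
    // assuming it is OFFLINE, so the retry's OFFLINE->OPENING transition works.
    zkNodes.put(regionName, ZkState.OFFLINE);
    assigns.put(regionName, Boolean.TRUE);
  }

  public static void main(String[] args) {
    PendingOpenTimeoutSketch sketch = new PendingOpenTimeoutSketch();
    // Precondition from step 4: RS1's late processing left the node OPENING.
    sketch.zkNodes.put("region-A", ZkState.OPENING);
    sketch.onPendingOpenTimeout("region-A");
    System.out.println("znode state after timeout handling: " + sketch.zkNodes.get("region-A")); // OFFLINE
  }
}
{code}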
[jira] [Resolved] (HBASE-4121) improve hbck tool to fix .META. hole issue.
[ https://issues.apache.org/jira/browse/HBASE-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-4121. --- Resolution: Fixed Duplicate of HBASE-4122 improve hbck tool to fix .META. hole issue. --- Key: HBASE-4121 URL: https://issues.apache.org/jira/browse/HBASE-4121 Project: HBase Issue Type: Improvement Reporter: feng xu Fix For: 0.92.0 The hbase hbck tool can detect a .META. hole, but it cannot fix the problem via --fix. I plan to improve the tool. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4114) Metrics for HFile HDFS block locality
[ https://issues.apache.org/jira/browse/HBASE-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070922#comment-13070922 ] Ming Ma commented on HBASE-4114: Thanks, Stack, Ted.
1. In the experiment table above, the total number of HDFS blocks that can be retrieved locally by the region server, as well as the total number of HDFS blocks for all HFiles, are defined at the whole-cluster level. The external program also calculates locality information per hfile, per region, and per region server. It uses the HDFS namenode, and the calculation is independent of any map reduce jobs.
2. In terms of how we can calculate this metric inside hbase, we can do it in two steps: the first is to calculate the metric independent of map reduce jobs; the second is to calculate it per map reduce job.
3. Calculating the locality index independent of map reduce jobs:
a. It will first be calculated at the hfile level { total # of HDFS blocks, total # of local HDFS blocks }; the data then gets aggregated at the region level and finally at the region server level. (A per-hfile sketch follows this entry.)
b. Impact on the namenode: there are 2 RPC calls to the NN to get block info for each hfile. If we assume 100 regions per RS, 10 hfiles per region, and 500 RSs, we will have 1M RPC hits to the NN. Most of the time that won't be an issue if we only calculate the hfile locality index when an hfile is created or when a region is loaded by the RS for the first time. Because HDFS can still move HDFS blocks around without hbase knowing it, we still need to refresh the value periodically.
c. The computation can be done in the RS or in HMaster. The RS seems better in terms of design (only the store knows the HDFS path of the hfile's location; HMaster doesn't) and extensibility (to calculate the locality index per map reduce job). The locality index can be part of HServerLoad and RegionLoad for the load balancer to use. The RS will rotate through all regions periodically in its main thread. The calculation interval defined by hbase.regionserver.msginterval might be too short for this scenario if we want to minimize the load on the NN for a large cluster (20 NN RPCs per RS per 3 sec).
d. The locality index can be a new RS metric. We can also put it on table.jsp for each region.
e. HRegionInfo is kind of static; it doesn't change over time, whereas the locality index changes over time for a given region. Maybe ClusterStatus/HServerInfo/HServerLoad/RegionLoad are better?
4. Locality index calculation for scans / map reduce jobs:
a. The original scenario is for full table scans only. If we want to provide an accurate locality index for any scan / map reduce job, this could be tricky given that i) a map reduce job can have start/end keys and filters such as a time range; ii) the block cache can be used, and thus an hfile shouldn't be accounted for if there is a cache hit; iii) the data volume read from an HDFS block is also a factor: reading a smaller buffer is different from reading a bigger buffer.
b. One useful scenario: we want to find out why map jobs sometimes run slower, so it is useful to have the metric as part of the map reduce job status page. We can estimate it by using the ganglia page to get the locality index value for the RSs at the time the map reduce job runs.
c. To provide more accurate data, we can modify TableInputFormat to a) call HBaseAdmin.getClusterStatus to get the locality index info for each region, and b) calculate the intersection between the scan specification and the ClusterStatus based on key range as well as column family. It isn't 100% accurate, but it might be good enough.
d. To be really accurate, the region server needs to provide a locality index for each scan operation back to the client.
Metrics for HFile HDFS block locality - Key: HBASE-4114 URL: https://issues.apache.org/jira/browse/HBASE-4114 Project: HBase Issue Type: Improvement Components: metrics, regionserver Reporter: Ming Ma Assignee: Ming Ma Normally, when we put hbase and HDFS in the same cluster (e.g., the region server runs on the datanode), we have reasonably good data locality, as explained by Lars. Work has also been done by Jonathan to address the startup situation. There are scenarios where regions can be on a different machine from the machines that hold the underlying HFile blocks, at least for some period of time. This will have a performance impact on whole-table scan operations and map reduce jobs during that time. 1. After the load balancer moves a region and before a compaction (which generates HFiles on the new region server) on that region, HDFS blocks can be remote. 2. When a new machine is added or removed, hbase's region assignment policy is different from HDFS's block placement policy. 3. Even if there is not much hbase activity, HDFS can load balance HFile blocks as other non-hbase applications push other
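To make step 3a concrete, here is a minimal per-hfile sketch using the standard Hadoop FileSystem API; the class and its placement are illustrative, not the eventual patch. It also shows where the two NN RPCs per hfile mentioned in 3b come from (getFileStatus plus getFileBlockLocations):
{code}
import java.io.IOException;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative helper: the fraction of an hfile's HDFS blocks that have a
// replica on the given host. Region- and RS-level indexes would aggregate
// the { total blocks, local blocks } pairs rather than averaging the ratios.
public final class HFileLocality {
  private HFileLocality() {}

  public static float localityIndex(FileSystem fs, Path hfile, String localHost)
      throws IOException {
    FileStatus status = fs.getFileStatus(hfile);                 // NN RPC #1
    BlockLocation[] blocks =
        fs.getFileBlockLocations(status, 0, status.getLen());    // NN RPC #2
    if (blocks == null || blocks.length == 0) {
      return 0f;
    }
    int local = 0;
    for (BlockLocation block : blocks) {
      for (String host : block.getHosts()) {
        if (host.equals(localHost)) {
          local++;
          break;
        }
      }
    }
    return (float) local / blocks.length;
  }
}
{code}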
[jira] [Updated] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix
[ https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] feng xu updated HBASE-4134: --- Fix Version/s: (was: 0.90.4) 0.94.0 The total number of regions was more than the actual region count after the hbck fix Key: HBASE-4134 URL: https://issues.apache.org/jira/browse/HBASE-4134 Project: HBase Issue Type: Bug Affects Versions: 0.90.3 Reporter: feng xu Fix For: 0.94.0 1. I found the problem (some regions were multiply assigned) while running hbck to check the cluster's health. Here's the result:
{noformat}
ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020
ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020
ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is listed in META on region server 158-1-91-103:20020 but is multiply assigned to region servers 158-1-91-103:20020, 158-1-91-105:20020
Summary:
-ROOT- is okay. Number of regions: 1 Deployed on: 158-1-91-105:20020
.META. is okay. Number of regions: 1 Deployed on: 158-1-91-103:20020
test1 is okay. Number of regions: 25297 Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020
14829 inconsistencies detected. Status: INCONSISTENT
{noformat}
2. Then I tried hbck -fix to fix the problem. Everything seemed OK, but I found that the total number of regions reported by the load balancer (35029) was larger than the actual region count (25299) after the fix. Here's the related log snippet:
{noformat}
2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=25299 average=8433.0 mostloaded=8433
2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=35029 average=11676.333 mostloaded=11677 leastloaded=11676
{noformat}
3. I tracked one region's behavior during that time, taking the region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as an example:
(1) It was assigned to 158-1-91-101 at first.
(2) HBCK sent a closing request to the RegionServer, and the RegionServer closed the region silently without notifying HMaster.
(3) The region was still carried by RS 158-1-91-103 as far as HMaster knew.
(4) HBCK then triggered a new assignment. The region was indeed assigned again, but the old assignment information still remained in AM#regions and AM#servers. That's why the reported region count ended up larger than the actual number. (A toy model of this bookkeeping follows this entry.)
{noformat}
Line 178967: 2011-07-22 02:47:51,247 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., server=HBCKServerName, state=M_ZK_REGION_OFFLINE)
Line 178968: 2011-07-22 02:47:51,247 INFO org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, region=52782c0241a598b3e37ca8729da0
Line 178969: 2011-07-22 02:47:51,248 INFO org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering assignment of region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0.
Line 178970: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) available servers
Line 178971: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 158-1-91-101,20020,1311231878544
Line 178983: 2011-07-22 02:47:51,285 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544, region=52782c0241a598b3e37ca8729da0
Line 179001: 2011-07-22 02:47:51,318 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=158-1-91-101,20020,1311231878544, region=52782c0241a598b3e37ca8729da0
Line 179002: 2011-07-22 02:47:51,319 DEBUG
{noformat}
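The bookkeeping problem in step (4), reduced to a toy: AM#regions maps region to server and AM#servers maps server to its region set, mirroring the names in the description. The code is hypothetical, not HBase's AssignmentManager; it only shows why a new assignment must evict the old entry before the counts add up:
{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the double counting: if a new assignment does not first evict
// the old entry, the region sits in two servers' sets and the load balancer's
// region total is inflated (25299 real regions reported as 35029 above).
public class HbckReassignSketch {
  private final Map<String, String> regions = new HashMap<>();      // region -> server
  private final Map<String, Set<String>> servers = new HashMap<>(); // server -> regions

  void assign(String region, String newServer) {
    String oldServer = regions.put(region, newServer);
    if (oldServer != null && !oldServer.equals(newServer)) {
      // The step that was missing: drop the region from the old server's set,
      // otherwise it is counted once per server it ever lived on.
      servers.get(oldServer).remove(region);
    }
    servers.computeIfAbsent(newServer, s -> new HashSet<>()).add(region);
  }

  int totalRegionCount() {
    return servers.values().stream().mapToInt(Set::size).sum();
  }

  public static void main(String[] args) {
    HbckReassignSketch am = new HbckReassignSketch();
    am.assign("52782c0241a598b3e37ca8729da0", "158-1-91-103");
    am.assign("52782c0241a598b3e37ca8729da0", "158-1-91-101"); // HBCK-triggered reassign
    System.out.println(am.totalRegionCount()); // prints 1, not 2
  }
}
{code}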
[jira] [Created] (HBASE-4140) book: Update our hadoop vendor section
book: Update our hadoop vendor section -- Key: HBASE-4140 URL: https://issues.apache.org/jira/browse/HBASE-4140 Project: HBase Issue Type: Improvement Reporter: stack -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4140) book: Update our hadoop vendor section
[ https://issues.apache.org/jira/browse/HBASE-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4140: - Attachment: hadoop.txt Updated the Cloudera mention to recommend released CDH and to note the point update. Added a reference to the MapR distribution. book: Update our hadoop vendor section -- Key: HBASE-4140 URL: https://issues.apache.org/jira/browse/HBASE-4140 Project: HBase Issue Type: Improvement Reporter: stack Attachments: hadoop.txt -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4132) Extend the WALObserver API to accommodate log archival
[ https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070946#comment-13070946 ] Jean-Daniel Cryans commented on HBASE-4132: --- +1 Extend the WALObserver API to accommodate log archival - Key: HBASE-4132 URL: https://issues.apache.org/jira/browse/HBASE-4132 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.92.0 Attachments: walArchive.txt The WALObserver interface exposes the log roll events. It would be nice to extend it to accommodate log archival events as well. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix
[ https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070954#comment-13070954 ] feng xu commented on HBASE-4134: To Ted Yu: The HBASE-4053 patch had been integrated before this issue occurred in my test cluster, so I think this issue has no relationship with HBASE-4053. The HBASE-4053 patch ensures that a region is not double-counted on one regionserver, but in this issue the region was carried by two (maybe more) regionservers. The total number of regions was more than the actual region count after the hbck fix Key: HBASE-4134 URL: https://issues.apache.org/jira/browse/HBASE-4134 Project: HBase Issue Type: Bug Affects Versions: 0.90.3 Reporter: feng xu Fix For: 0.94.0 1. I found the problem (some regions were multiply assigned) while running hbck to check the cluster's health. Here's the result:
{noformat}
ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020
ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is listed in META on region server 158-1-91-101:20020 but is multiply assigned to region servers 158-1-91-101:20020, 158-1-91-105:20020
ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is listed in META on region server 158-1-91-103:20020 but is multiply assigned to region servers 158-1-91-103:20020, 158-1-91-105:20020
Summary:
-ROOT- is okay. Number of regions: 1 Deployed on: 158-1-91-105:20020
.META. is okay. Number of regions: 1 Deployed on: 158-1-91-103:20020
test1 is okay. Number of regions: 25297 Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020
14829 inconsistencies detected. Status: INCONSISTENT
{noformat}
2. Then I tried hbck -fix to fix the problem. Everything seemed OK, but I found that the total number of regions reported by the load balancer (35029) was larger than the actual region count (25299) after the fix. Here's the related log snippet:
{noformat}
2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=25299 average=8433.0 mostloaded=8433
2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=3 regions=35029 average=11676.333 mostloaded=11677 leastloaded=11676
{noformat}
3. I tracked one region's behavior during that time, taking the region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. as an example:
(1) It was assigned to 158-1-91-101 at first.
(2) HBCK sent a closing request to the RegionServer, and the RegionServer closed the region silently without notifying HMaster.
(3) The region was still carried by RS 158-1-91-103 as far as HMaster knew.
(4) HBCK then triggered a new assignment. The region was indeed assigned again, but the old assignment information still remained in AM#regions and AM#servers. That's why the reported region count ended up larger than the actual number.
{noformat}
Line 178967: 2011-07-22 02:47:51,247 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., server=HBCKServerName, state=M_ZK_REGION_OFFLINE)
Line 178968: 2011-07-22 02:47:51,247 INFO org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, region=52782c0241a598b3e37ca8729da0
Line 178969: 2011-07-22 02:47:51,248 INFO org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering assignment of region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0.
Line 178970: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) available servers
Line 178971: 2011-07-22 02:47:51,248 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 158-1-91-101,20020,1311231878544
Line 178983: 2011-07-22 02:47:51,285 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544,
{noformat}